Changelog
Release history for MSRBot.io
See docs/buildlog.md for details of v1.0.0 released on Nov 26, 2025.
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased] - yyyy-mm-dd
Added
Changed
Fixed
[v1.4.0] - 2026-02-28
Added
- API Explorer page at /api/: searchable, filterable document browser with URL parameter syncing, pagination, and an inline JSON viewer for inspecting full provenance records.
- Full-provenance JSON API, with static endpoints for machine consumption:
  - /api/documents.json: full registry with all source fields and provenance metadata.
  - /api/doc/{docId}.json: per-document JSON with full record.
  - /api/stats.json: registry statistics and metadata (with meta.repoUrl, meta.changelogUrl).
- JSON Schema publishing at /api/schemas/: existing schemas (documents, groups, portals, projects) are now served as static assets for consumer validation.
- API versioning: all API JSON responses include $schema and apiVersion fields; initial API version is 1.0.0.
- Machine-readable discovery: added <link rel="alternate" type="application/json"> and <link rel="describedby" type="application/schema+json"> to the API Explorer page and all document detail pages.
- OpenSearch JSON template: opensearch.xml now includes a JSON response URL (/api/?q={searchTerms}) alongside the existing HTML template.
- JSON-LD SearchAction: structured data now includes search actions for both /docs/ and /api/ endpoints.
- Source Data (JSON) panel on document detail pages: collapsible card showing the full registry record with a direct link to the per-document API endpoint.
- Internal Changelog page at /changelog/: rendered from CHANGELOG.md as styled cards, replacing external GitHub blob links.
- Added API Explorer and schema links to the Dev Tools & Resources popover and site footer.
- Added API link on the homepage.
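As a rough illustration of how a consumer might use the versioning fields listed above, the sketch below checks a parsed API response for $schema and apiVersion before relying on it. The function name checkApiEnvelope, the "same major version" policy, and the sample payload are assumptions made for illustration; none of them come from the MSRBot.io codebase or its published schemas.

```javascript
// Hypothetical consumer-side guard; not taken from the MSRBot.io source.
// It assumes only what the changelog states: responses carry `$schema` and
// `apiVersion`, and the initial API version is 1.0.0.
function checkApiEnvelope(payload, supportedMajor = 1) {
  if (payload === null || typeof payload !== "object") {
    return { ok: false, reason: "payload is not an object" };
  }
  if (typeof payload.$schema !== "string" || payload.$schema === "") {
    return { ok: false, reason: "missing $schema" };
  }
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(String(payload.apiVersion ?? ""));
  if (!match) {
    return { ok: false, reason: "missing or malformed apiVersion" };
  }
  // Assumed policy: accept any response whose major version matches.
  const major = Number(match[1]);
  if (major !== supportedMajor) {
    return { ok: false, reason: `unsupported major version ${major}` };
  }
  return { ok: true, reason: null };
}

// Illustrative payload; the `$schema` value is a placeholder, not a real URL.
const sample = { $schema: "/api/schemas/example.json", apiVersion: "1.0.0" };
console.log(checkApiEnvelope(sample).ok); // prints: true
```

Checking only the major version is one possible policy; a stricter consumer could pin the full version string instead.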
Changed
- Renamed "Dev Tools" navigation label to "Dev Tools & Resources."
- Updated README badges and Key Artifacts to reference the new API Explorer and internal changelog.
- Updated sitemap to include /api/ and /changelog/ entries.
Fixed
- Fixed suites/collections page document rendering when publisher labels differ by composite forms (for example, ISO/IEC docs under ISO collections); collection matching now normalizes publisher aliases/composites before filtering.
- Fixed JSON-LD SearchAction target URLs missing path separator after canonicalBase.
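The composite-publisher fix above can be pictured with a minimal sketch. It assumes a composite label is a "/"-separated list of publisher names and that a document belongs to a collection when any component matches; the helper names publisherParts and matchesCollectionPublisher are hypothetical, and the project's actual alias handling is not shown in this changelog.

```javascript
// Hypothetical helpers; the real normalization and alias rules may differ.
// Splits a label such as "ISO/IEC" into upper-cased components.
function publisherParts(label) {
  return String(label)
    .split("/")
    .map((part) => part.trim().toUpperCase())
    .filter((part) => part.length > 0);
}

// Assumed rule: a document matches when it shares at least one
// publisher component with the collection.
function matchesCollectionPublisher(docPublisher, collectionPublisher) {
  const docParts = publisherParts(docPublisher);
  return publisherParts(collectionPublisher).some((part) => docParts.includes(part));
}

console.log(matchesCollectionPublisher("ISO/IEC", "ISO")); // prints: true
console.log(matchesCollectionPublisher("SMPTE", "ISO")); // prints: false
```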
[v1.3.0] - 2026-02-26
Added
- Providerized extraction architecture:
  - Added SMPTE discovery provider module at src/main/scripts/providers/smpte.discovery.js.
  - Added SMPTE parser provider module at src/main/scripts/providers/smpte.parse.js.
  - Added IETF discovery provider module at src/main/scripts/providers/ietf.discovery.js.
  - Added IETF parser provider module at src/main/scripts/providers/ietf.parse.js.
  - Added provider-specific metadata configs:
    - src/main/scripts/providers/smpte.meta.js
    - src/main/scripts/providers/ietf.meta.js
  - Added provider registry at src/main/scripts/providers/index.js.
- Added optional document schema fields for citation structure:
  - volume, number, pages, chapter, edition.
- Added explicit npm alias extract:smpte for provider-targeted extraction.
- Added dedicated IETF extraction workflow:
  - .github/workflows/extract-docs-ietf.yml (separate branch/PR path from SMPTE extraction).
- Added keyword governance utilities and config source:
  - Added controlledKeywords list in src/main/config/site.json.
  - Added keywords-sync utility at src/main/scripts/utils/keywords.sync.js (npm run keywords-sync, dry-run by default, --write to apply).
- Added centralized command/flags documentation at docs/commands.md.
- Added and expanded AGENTS.md guidance for branch naming, issue/PR label usage, PR hygiene, validation expectations, repo guardrails, and changelog/documentation/provenance expectations.
Changed
- Refactored extractDocs.js to be provider-agnostic orchestration (merge, metadata, MRI, and logging), with provider-specific discovery/parsing moved out of main script.
- Extraction provider selection is now explicit via --provider; implicit/default provider execution was removed.
- Renamed SMPTE extraction workflow to extract-docs-smpte.yml (Extract Documents - SMPTE) and aligned workflow references/triggers accordingly.
- Updated docs and badges to reference the renamed SMPTE extraction workflow.
- Updated validation architecture for keywords:
  - Removed hard keyword enum enforcement from documents.schema.json.
  - Moved keyword conformance checks to documents.validate.js against src/main/config/site.json#controlledKeywords.
  - Added keyword validation mode controls for npm run validate:
    - default strict mode (--error)
    - optional warn mode (--warn) for unknown keyword drift checks.
  - Extraction workflows now run keyword validation in warn mode; build/local validation remains strict by default.
- Expanded IETF extraction behavior:
  - RFC extraction now uses RFC Index XML (rfc-index.xml) as first-pass canonical metadata for seeded RFCs, with per-document sources used as enrichment/fallback.
  - RFC field source precedence is now explicit; status relations (obsoletes/obsoleted-by/updates/updated-by) are sourced from RFC Index XML + RFC info <dl> merge, eliminating loose relation text fallback.
  - RFC author precedence now prefers Datatracker doc.json authors (richer names) over RFC Index XML, with HTML/info fallbacks.
  - Added RFC Index XML/XSD mapping contract and required-field coverage warnings in IETF parser for schema-backed extraction hygiene.
  - RFC relation fields now derive from RFC info page relation <dl> parsing (no broad relation text fallback injection).
  - Non-RFC extraction now enriches from archive XML (/archive/id/*.xml) for front-matter fields and keywords.
  - Non-RFC keywords are normalized to project keyword style (Title Case with preserved acronyms/common forms such as JSON, URN, B-Chain, DCinema, DCP*, SHA-1).
  - RFC reference parsing now uses RFC HTML section-aware extraction with strict Normative vs Informative/Bibliographic bucketing and overlap guards.
  - RFC fallback reference slicing is now bounded to reference sections, next section heading, and page-break markers to avoid body/header/footer soak-through.
  - IETF reference sightings now write to MRI for both RFC HTML and non-RFC XML paths using final document IDs.
- Expanded shared reference normalization rules in src/main/lib/referencing.js:
  - RFC IDs normalize leading zeros (e.g., RFC0821 → RFC821).
  - W3C REC-* URL forms normalize to canonical W3C shortname IDs (no REC- prefix in docId).
  - Added href-first resolvers for Unicode and Mozilla Bugzilla references.
  - Added improved ISO hyphenated designator parsing (e.g., ISO-8859-1:1987).
- Updated project docs with provider extraction and keyword-governance guidance in README.md and CONTRIBUTING.md.
- Updated docs to link docs/commands.md from README.md and CONTRIBUTING.md.
- Enhanced Portal document listings with additional context fields:
  - Display of docType and publicationDate in document tables.
  - New Doc Type filter, aligned with existing Publisher filtering.
- Extended Portal sorting controls to support:
  - Sorting by Type and Published date.
  - Ascending / descending sort direction for all supported sort keys, consistent with Suites and Collections.
- Updated RefTree unresolved-document UX:
  - Unresolved nodes remain visible and navigable in-tree, but now display muted/italic labels with a NOT IN REGISTRY badge.
  - In the Current Tree Root card, unresolved docs no longer click through to /docs/:docId/; in-registry roots remain clickable.
- Improved docs page reference-list readability:
  - Added explicit spacing between normative/bibliographic reference labels and their status tokens (e.g., [Active], [SUITE]).
- Updated docs/CONTRIBUTING_SHORT.md to align branch prefix guidance and add an explicit Unreleased changelog checklist item for workflow/policy/behavior changes.
- Simplified PR preview check behavior by removing custom check-run/status publication from preview workflow and relying on the single native workflow job check context.
- Added MSI→MRI chain guard in MRI workflow to skip MRI when MSI already opened a PR (artifact marker present), preventing duplicate chained data PRs.
- Hardened MRI missing-ref issue upsert behavior with no-op update skipping and per-run mutation budget (MAX_MUTATIONS), reducing secondary GitHub rate-limit failures.
- Stopped MSI/MRI metadata-only auto-commits to default branch; report timestamp/date-only churn is now ignored unless content-change PR criteria are met.
- Refined home page information architecture and responsive layout:
  - Reduced card density, improved section hierarchy, and rebalanced content columns.
  - Updated portal home rendering to a scalable list layout for growth.
- Refined footer layout/content hierarchy:
  - Improved responsive alignment/spacing, constrained divider width to container, and added explicit developer/issue links.
  - Standardized branding presentation with PrZ3/MSR marks and config-driven copyright year.
- Updated workflow trigger path:
  - Site build (Build MSRBot.io Site and Test) now runs on push to main.
  - URL validation now triggers from MRI completion (plus schedule/manual), not from site build completion.
  - PR gate remains PR Build Preview (MSRBot.io site) on pull_request.
- Added focused documents-registry helper scripts:
  - npm run docs-sort to sort src/main/data/documents.json by docId.
  - npm run docs-validate as explicit docs validation alias.
  - npm run docs-fix to run sort + validation in one step for manual doc edits.
- Updated docIdSort behavior for low-noise editing:
  - Removed legacy .bak sidecar creation.
  - Preserved per-entry object formatting and reordered entries only.
  - Aligned sort comparator with validator ordering (toUpperCase() lexical) to prevent sort/validate mismatch loops.
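The comparator alignment in the last item above can be sketched as follows: when the sorter and the validator share one comparison, a file that has just been sorted cannot fail the ordering check. This is only an illustration under that assumption. The names compareDocIds and isSortedByDocId and the sample docId values are made up here, and the project's real docIdSort and validator code are not shown in this changelog.

```javascript
// Hypothetical comparator; not the project's actual docIdSort implementation.
// Compares docIds lexically after upper-casing both sides.
function compareDocIds(a, b) {
  const left = String(a.docId).toUpperCase();
  const right = String(b.docId).toUpperCase();
  if (left < right) return -1;
  if (left > right) return 1;
  return 0;
}

// Validator-style check that reuses the same comparator, so sorting and
// validation cannot disagree about the expected order.
function isSortedByDocId(entries) {
  for (let i = 1; i < entries.length; i += 1) {
    if (compareDocIds(entries[i - 1], entries[i]) > 0) return false;
  }
  return true;
}

// Made-up docIds, used only to exercise the comparator.
const docs = [{ docId: "rfc821" }, { docId: "ISO.8859-1.1987" }, { docId: "RFC5646" }];
const sorted = [...docs].sort(compareDocIds);
console.log(sorted.map((doc) => doc.docId).join(", "));
// prints: ISO.8859-1.1987, RFC5646, rfc821
console.log(isSortedByDocId(sorted)); // prints: true
```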
Fixed
- Fixed OM remap path in extraction by correcting title variable scope usage, enabling OM ID remapping updates to apply correctly.
- Fixed README weekly schedule Markdown table separator to render correctly with all columns.
- Fixed doc citation “Copy (undated)” behavior on doc pages so undated snippet blocks copy correctly (no blank clipboard payload).
- Fixed undated citation snippet <cite id> generation to strip only terminal date suffixes for undated variants, while leaving dated variants unchanged.
[v1.2.0] - 2026-02-05
Primary changes delivered via https://github.com/PrZ3r/MSRBot.io/pull/695
Added
- Automated extraction of Scope in HTML documents to map to abstract.
- Introduced Portals: curated, first-class landing pages that aggregate documents across suites, collections, publishers, and document types.
- First (3) portals: /dcinema/, /imf/, /accessibility/
- Added a complete Portal build and schema pipeline, supporting:
  - Keyword-based document matching.
  - Explicit pinning and post-resolution filtering.
  - Shared narrative/overview sections.
  - Curated resource collections.
- Delivered a Suites-aligned Portal UX, including:
  - Searchable document tables with abstracts.
  - Expandable previews (shared behavior with Suites).
  - Visual muting of withdrawn and superseded documents.
  - Structured, card-based overview and resource sections.
Portal Behavior & UX Details
- Portals render as dedicated pages with stable URLs (e.g. /dcinema/).
- Portal document listings support:
  - Default sorting by docLabel.
  - Search, publisher filtering, and sortable columns.
  - Abstract previews with More/Less expansion.
- Portal overview sections support shared explanatory content using the same card patterns as Suites.
- Resource sections support:
  - Grouping by category.
  - Independent collapsible sections.
  - Per-resource description expansion for long content.
- Portal navigation dynamically adapts based on available content (Overview / Docs / Resources).
Changed
- Backfilled (automatically and manually) abstract fields for DC and IMF collections.
Fixed
- Fixed rendering of abstract paragraph breaks in suites.
[v1.1.0] - 2026-01-06
Primary changes delivered via https://github.com/PrZ3r/MSRBot.io/pull/678
Added
- Introduced first-class Suites and Collections as distinct core concepts:
  - Suites represent true multipart standards (shared lineage number).
  - Collections represent related documents without formal parts.
  - Suites and collections share UX but retain distinct semantics.
- Added full Suites / Collections build pipeline, emitting:
  - build/suites/_data/suites.json (mixed, with explicit kind: suite | collection).
  - Dedicated pages at /suites/:slug/ for both suites and collections.
  - Index page supporting mixed display with filtering by kind.
- Implemented docSuiteTitle extraction and propagation:
  - HTML: derived directly from pubSuiteTitle.
  - PDF: parsed as text before first em-dash.
  - Integrated across search index, citations, RefTree roots, suite cards, and doc detail pages.
- Enabled full ALLPARTS resolution:
  - Supports ISO and SMPTE ALLPARTS identifiers.
  - Doc detail pages resolve ALLPARTS to suite pages with correct labels and status.
  - RefTree displays suites as non-clickable parents that expand to child documents.
- Added guardrails and explicit metadata to prevent future regressions:
  - Explicit kind: suite | collection.
  - Flags for SUITETITLEMISMATCH.
  - Hard exclusions for unsupported publishers and document types.
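The PDF rule listed above for docSuiteTitle ("parsed as text before first em-dash") can be illustrated with a short sketch. The helper name suiteTitleFromPdfTitle, the choice to return null when no em-dash is present, and the sample title are assumptions made for illustration; the project's extraction code is not reproduced here.

```javascript
// Hypothetical helper illustrating the "text before first em-dash" rule.
const EM_DASH = "\u2014";

function suiteTitleFromPdfTitle(fullTitle) {
  const text = String(fullTitle);
  const index = text.indexOf(EM_DASH);
  // Assumed behavior: without an em-dash, return null rather than guess.
  if (index === -1) return null;
  const candidate = text.slice(0, index).trim();
  return candidate.length > 0 ? candidate : null;
}

// Made-up title, used only to show the split point.
const exampleTitle = `Example Suite ${EM_DASH} Part 1: Example Part`;
console.log(suiteTitleFromPdfTitle(exampleTitle)); // prints: Example Suite
```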
Changed
- Finalized and locked build order to ensure correctness and stability:
  - Documents → MSI → Suites/Collections → Pages.
- Updated suite and collection rendering:
  - Suites show all documents, including withdrawn (visually muted).
  - Collections hide parts column and sort by label.
  - Abstract previews and expand/collapse behavior added.
- Refined RefTree behavior:
  - RefTrees may display suites but never re-center on them.
  - Suite labels replace ALLPARTS identifiers where applicable.
- Normalized publisher handling for edge cases (e.g., ANSI/ASA) so suite and collection lookups resolve correctly.
Fixed
- Fixed ALLPARTS resolution failures where document type previously blocked linking.
- Corrected publisher logo and link resolution on suite and collection pages.
- Resolved reference edge cases for W3C documents.
- Eliminated legacy suite/collection duplication and silent clobbering in the build process.