
Standard version diff

What changed in every version of the standard.

The Lyric Scoring Standard is currently at v1.2.0. This page is the structural diff between every published version: which metrics changed, which anti-inflation rules were added, which RFCs ratified the bumps, and the score-delta band of each transition. The companion timeline view at /scoring/standard/changelog interleaves these with the RFC submission history.

Bump-type contract

Per RFC-0001 (Rubric Versioning Policy):

MAJOR — golden-eval score delta exceeds 5 points. Breaks wire compatibility for tooling pinned to the prior version. Requires public RFC + 30-day comment window.
MINOR — clarifications, new anti-inflation rules, descriptive refactors. Score deltas under 5 points. isCompatibleRubricVersion stays true; pinned tooling keeps working.
PATCH — docs, typos, surface improvements. Zero score delta. No RFC required.
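The contract above can be sketched as a small classifier. This is an illustration only, not the standard's own tooling: `classify_bump` and `is_compatible` are hypothetical names, the same-MAJOR compatibility rule is an assumption about how isCompatibleRubricVersion might behave, and a delta of exactly 5 points (which the contract leaves unspecified) is treated as MINOR here.

```python
from enum import Enum


class Bump(Enum):
    MAJOR = "MAJOR"
    MINOR = "MINOR"
    PATCH = "PATCH"


def classify_bump(score_delta: float, rubric_changed: bool) -> Bump:
    """Classify a bump from its golden-eval score delta.

    MAJOR: delta exceeds 5 points.
    MINOR: rubric content changed (or a nonzero delta) at 5 points or less.
    PATCH: no rubric change and zero delta.
    Assumption: a delta of exactly 5 is treated as MINOR.
    """
    delta = abs(score_delta)
    if delta > 5:
        return Bump.MAJOR
    if rubric_changed or delta > 0:
        return Bump.MINOR
    return Bump.PATCH


def is_compatible(pinned: str, current: str) -> bool:
    """Assumed rule: two versions are compatible when their MAJOR parts match."""
    return pinned.split(".")[0] == current.split(".")[0]
```

Under these assumptions, tooling pinned to 1.1.0 would treat 1.2.0 as compatible, while a hypothetical 2.0.0 would not be.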

1.1.0 → 1.2.0 (MINOR)

B1939 · 2026-05-02

Score delta: <3 points on the golden-eval set.

M8 — Voice & POV Integrity
Changed
Refactored from "one coherent narrator" to "INTENTIONAL POV." Deliberate narrator switches that align with structural beats (K-pop multi-voice, hip-hop featured-verse contrast, gospel call-and-response) no longer score as drift failures. Random narrator switches without semantic function still do.
M11 — Memorability
Changed
Refactored from a single 60-minute recall test to a four-signal cumulative read: hook integration + phonemic distinctiveness + chorus repetition strategy + one-listen recall. Cumulative, oral-tradition, and ritual-repetition forms no longer receive false-positive low scores. Mantric chants that earn their repetition through structural intention now register at the canonical band.
Refactor is descriptive, not prescriptive
Note
Both M8 and M11 changes describe WHEN the metric applies; they don’t change WHAT the metric values. Score deltas on the golden-eval set average under 3 points — within the MINOR threshold per RFC-0001.

1.0.1 → 1.1.0 (MINOR)

B1240 · 2026-04-25

Score delta: <2 points on the golden-eval set.

Anti-Platitude rule
Added
Anti-inflation rules expanded from 4 to 5. New rule: lines that resolve with generic emotional summaries ("all I need is love," "this is my truth") hit the rubric’s lowest Specificity + Voice band regardless of surface polish.
Antagonist Ceiling rule
Changed
Clarified to require evidence. The dedicated critical voice that challenges every score now must produce specific lines / specific failure modes — vague disagreement does not lower the score.
Historical Context Anchor rule
Changed
Anchored explicitly to the published corpus, especially the Hank Williams S-band entry at composite 95. "Professional craft" now points at a citeable artifact, not an abstract claim.
Migration
Note
Existing scored content auto-rescores on next eval. The reproducibility seal’s rubricVersion field now reads "1.1.0".

Ratifying RFC: RFC-0002-anti-platitude-rule

1.0.0 → 1.0.1 (PATCH)

B1211 · 2026-04-25

Reproducibility seal in API
Added
The seal field landed in /api/v1/score (B1199). Every score response now carries the ed25519-signed envelope binding rubricVersion + model + temperature + buildSha + composite + grade.
Public model card
Added
Published at /scoring/standard/model-card (B1197). Pins the runtime contract: which Anthropic model, what temperature, where the prompt lives, what limitations apply.
Versioning policy formalized
Added
RFC-0001 documented the version-bump rules: MAJOR for >5-point golden-eval delta, MINOR for clarifications, PATCH for docs/typos. Every later bump follows this contract.
No rubric content changes
Note
PATCH bumps don’t change the rubric itself — same metrics, same weights, same anti-inflation rules. Score deltas: zero.

Ratifying RFC: RFC-0001-rubric-versioning-policy
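As a rough illustration of the seal described in this entry, the sketch below checks that a seal object carries the six bound fields and that its rubricVersion matches an expected version. The flat-dictionary shape of the seal and both function names are assumptions; the actual response layout of /api/v1/score is not documented on this page. Verifying the ed25519 signature itself is not shown, since it would need a third-party library and the publisher's public key.

```python
# Field names as listed in the 1.0.1 entry; the flat dict layout is an assumption.
SEAL_BOUND_FIELDS = (
    "rubricVersion",
    "model",
    "temperature",
    "buildSha",
    "composite",
    "grade",
)


def missing_seal_fields(seal: dict) -> list:
    """Return the bound field names that are absent from the seal."""
    return [name for name in SEAL_BOUND_FIELDS if name not in seal]


def seal_matches_version(seal: dict, expected_version: str) -> bool:
    """True when all bound fields are present and rubricVersion matches."""
    if missing_seal_fields(seal):
        return False
    return seal["rubricVersion"] == expected_version
```

A client pinned to a specific rubric version could run a check like this before trusting a score, then verify the signature separately.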

(initial) → 1.0.0 (MAJOR)

2026-04-20

12 metrics across 3 tiers
Added
Initial public release. Craft (25%): M1 Prosody, M2 Structure, M3 Rhyme, M4 Economy. Expression (40%): M5 Specificity, M6 Imagery, M7 Emotional Truth, M8 Voice. Impact (35%): M9 Transcendence, M10 Arc, M11 Memorability, M12 Genre.
4 anti-inflation rules
Added
Gravity (default 50), Burden of Proof (claims earned by lines that show), Antagonist Ceiling, Historical Context. Anti-Platitude rule (5th) was added in the 1.1.0 bump.
Grade scale
Added
Locked: S+ (96+), S (91-95), A+ (86-90), A (80-85), B+ (73-79), B (65-72), C+ (55-64), C (45-54), D+ (35-44), D (25-34), F (<25). Same scale used by every later version.
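The tier weights and grade scale above can be sketched in a few lines. The grade floors come straight from the scale; the composite formula is an assumption, since this page gives the tier weights but does not say how metrics combine within a tier (a plain mean is used here). Function and constant names are hypothetical.

```python
# Tier weights and metric membership as listed in the 1.0.0 entry.
TIER_WEIGHTS = {"craft": 0.25, "expression": 0.40, "impact": 0.35}
TIER_METRICS = {
    "craft": ("M1", "M2", "M3", "M4"),
    "expression": ("M5", "M6", "M7", "M8"),
    "impact": ("M9", "M10", "M11", "M12"),
}

# Lower bound of each grade band, highest first. Anything below 25 is F.
GRADE_FLOORS = [
    (96, "S+"), (91, "S"), (86, "A+"), (80, "A"), (73, "B+"),
    (65, "B"), (55, "C+"), (45, "C"), (35, "D+"), (25, "D"),
]


def composite(metric_scores: dict) -> float:
    """Weighted composite on a 0-100 scale.

    Assumption: each tier score is the plain mean of its four metrics.
    """
    total = 0.0
    for tier, weight in TIER_WEIGHTS.items():
        metrics = TIER_METRICS[tier]
        tier_mean = sum(metric_scores[m] for m in metrics) / len(metrics)
        total += weight * tier_mean
    return total


def grade(composite_score: float) -> str:
    """Map a composite score to its grade band.

    Assumption: fractional scores fall into the band whose floor they reach,
    so 95.5 is S, not S+.
    """
    for floor, label in GRADE_FLOORS:
        if composite_score >= floor:
            return label
    return "F"
```

With every metric at the Gravity default of 50, the composite is 50 and the grade is C. The Hank Williams anchor at composite 95 lands in the S band.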

How to verify a transition yourself

Fetch the current rubric JSON at https://songforgeai.com/scoring-standard.json. Compare against the prior version (archived in git at the commit that bumped the version).
Verify the version field matches the bump type rule (MAJOR / MINOR / PATCH).
Cross-reference the golden-eval delta against the published threshold (<5 for MINOR; PATCH should be 0).
Check the ratifying RFC at /rfc. MINOR and MAJOR bumps require an RFC; PATCH bumps ship without one.
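The checks in these steps can be sketched as one function. It assumes you have already read the two version strings and the golden-eval delta out of the rubric JSON and looked up whether an RFC exists; fetching and parsing are left out because the JSON layout is not specified here. `bump_type` and `check_transition` are hypothetical names.

```python
def parse_version(version: str) -> tuple:
    """Split 'MAJOR.MINOR.PATCH' into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump_type(prior: str, current: str) -> str:
    """Derive the bump type from which version component changed."""
    old, new = parse_version(prior), parse_version(current)
    if new <= old:
        raise ValueError(f"{current} is not newer than {prior}")
    if new[0] != old[0]:
        return "MAJOR"
    if new[1] != old[1]:
        return "MINOR"
    return "PATCH"


def check_transition(prior: str, current: str,
                     golden_delta: float, has_rfc: bool) -> list:
    """Return a list of problems; an empty list means the checks pass."""
    problems = []
    kind = bump_type(prior, current)
    # Threshold check: under 5 points for MINOR, zero for PATCH.
    if kind == "MINOR" and abs(golden_delta) >= 5:
        problems.append("MINOR bump with golden-eval delta of 5 points or more")
    if kind == "PATCH" and golden_delta != 0:
        problems.append("PATCH bump with nonzero golden-eval delta")
    # RFC check: MINOR and MAJOR need one; PATCH may ship without.
    if kind in ("MINOR", "MAJOR") and not has_rfc:
        problems.append(f"{kind} bump without a ratifying RFC")
    return problems
```

For example, a 1.1.0 to 1.2.0 transition with a delta under 3 points and a ratifying RFC would come back with no problems.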

Diff data sourced from the changelog field of the published rubric JSON plus the RFC registry. Updated on every published bump. Cite as “Lyric Scoring Standard v1.2.0 — Version Diff (B1963).”
