Versions + RFCs, in one timeline
Every published version of the rubric and every RFC that ratified or proposed a change. Cite entries by URL. The current rubric version is v1.1.0, published 2026-04-25.
- RFC-0007 · 2026-04-26 · in-comment
Reproducibility audit methodology
Pins the methodology for the quarterly reproducibility audit (Section 4 of /reports/calibration-2026-q2). For a sampled batch of 25 published scores, replay the same lyrics + genre through the same model + temp + rubric version and report per-row score delta. Zero or near-zero deltas confirm the seal is honest.
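The audit loop described here can be sketched roughly as follows. This is a minimal illustration, not the RFC's actual tooling: the `PublishedScore` record, its field names, and the `rescore` callback (standing in for a replay through the same model, temperature, and rubric version) are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class PublishedScore:
    """One published row. Field names are assumptions for illustration."""
    lyrics: str
    genre: str
    model: str
    temperature: float
    rubric_version: str
    score: float


def audit_deltas(
    published: Sequence[PublishedScore],
    rescore: Callable[[PublishedScore], float],
    batch_size: int = 25,
    seed: int = 0,
) -> List[Tuple[PublishedScore, float]]:
    """Sample a batch and return (row, replayed score minus published score)."""
    rng = random.Random(seed)  # fixed seed so the sampled batch is repeatable
    batch = rng.sample(list(published), min(batch_size, len(published)))
    return [(row, rescore(row) - row.score) for row in batch]
```

A report would then list each delta per row; under this sketch, a batch whose deltas are all zero or close to it is what the audit treats as confirming the seal.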
Read RFC-0007
- RFC-0006 · 2026-04-26 · in-comment
Per-cohort engagement methodology + divergence-as-calibration
Pins the methodology behind the five-persona listener panel engagement scores (B1281+) and defines when per-cohort divergence from the composite triggers a rubric calibration response. Empirical input cites the B1340 /admin/cohort-divergence dashboard.
Read RFC-0006
- RFC-0004 · 2026-04-26 · in-comment
Voltage Coach behavior change policy
Defines what counts as a "behavior change" to the voltage coach (B1291): kind classification, hint copy, and accept-rate triggers. Pins the publication discipline for each. Empirical input cites the B1346 kind-breakdown surface that landed earlier this session.
Read RFC-0004
- RFC-0005 · 2026-04-26 · in-comment
GPT-4o pre-gauntlet critique as Sonnet gauntlet input
Adds a second GPT-4o call to the pipeline, this time as a literary critic rather than a re-scorer. Its output feeds the Sonnet gauntlet as additional evidence that the gauntlet decision rule either incorporates or rejects. Cost: ~$0.04/forge. Off by default behind SF_GPT4O_PREGAUNTLET. Operator-driven request: "should ChatGPT be used more than we are using it?"
Read RFC-0005
- RFC-0003 · 2026-04-26 · in-comment
Hum Score as the M11 (Memorability) calibration signal
The 24-hour-delayed Hum Test (B1303) is the only longitudinal-recall signal we have. This RFC proposes formally treating the systematic delta between fresh-M11 and Hum-M11 as the calibration ground truth for the Memorability metric, with quarterly rubric adjustments triggered when the corpus-wide median delta falls outside ±5 points.
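As a rough sketch of the trigger described here: compute the per-lyric delta between the two M11 scores, take the corpus-wide median, and flag an adjustment when its magnitude exceeds 5 points. The function name, the paired-list inputs, and the sign convention (Hum-M11 minus fresh-M11) are assumptions made for illustration; the RFC itself may define them differently.

```python
from statistics import median
from typing import Sequence, Tuple


def needs_m11_adjustment(
    fresh_m11: Sequence[float],
    hum_m11: Sequence[float],
    threshold_pts: float = 5.0,
) -> Tuple[bool, float]:
    """Return (trigger, median_delta) for paired fresh and Hum M11 scores."""
    if len(fresh_m11) != len(hum_m11):
        raise ValueError("fresh and Hum scores must be paired one-to-one")
    if not fresh_m11:
        raise ValueError("need at least one scored lyric")
    # Assumed sign convention: delta = Hum-M11 minus fresh-M11.
    deltas = [hum - fresh for fresh, hum in zip(fresh_m11, hum_m11)]
    med = median(deltas)
    # Trigger only when the median lies outside the +/- threshold band.
    return abs(med) > threshold_pts, med
```

Using the median rather than the mean keeps a few extreme lyrics from triggering a rubric adjustment on their own, which matches the "corpus-wide median" wording above.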
Read RFC-0003
- v1.0.1 · 2026-04-25 · version
PATCH (B1211): docs only. Reproducibility seal landed in /api/v1/score (B1199); model card published at /scoring/standard/model-card (B1197). No score deltas. Versioning policy formalized in RFC-0001 (in-comment until 2026-05-02): MAJOR for >5pt golden-eval delta, MINOR for clarifications, PATCH for docs/typos.
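The bump rule quoted from RFC-0001 can be illustrated with a small sketch. The function names and the `rubric_text_changed` input are hypothetical, and the way a "clarification" is detected here is an assumption; only the thresholds (MAJOR above a 5-point golden-eval delta, MINOR for clarifications, PATCH for docs and typos) come from the entry above.

```python
def classify_bump(golden_eval_delta_pts: float, rubric_text_changed: bool) -> str:
    """Map a change to MAJOR, MINOR, or PATCH under the stated policy."""
    if abs(golden_eval_delta_pts) > 5.0:
        return "MAJOR"  # more than 5 points of golden-eval movement
    if rubric_text_changed:
        return "MINOR"  # rule clarified or expanded (assumed detection)
    return "PATCH"      # docs or typos only


def next_version(current: str, bump: str) -> str:
    """Apply a bump to a 'MAJOR.MINOR.PATCH' version string."""
    major, minor, patch = (int(part) for part in current.split("."))
    if bump == "MAJOR":
        return f"{major + 1}.0.0"
    if bump == "MINOR":
        return f"{major}.{minor + 1}.0"
    if bump == "PATCH":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump kind: {bump}")
```

For instance, a rule clarification with a golden-eval delta under 2 points would classify as MINOR under this sketch and take 1.0.1 to 1.1.0.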
- v1.1.0 · 2026-04-25 · version
MINOR (B1240): first MINOR bump shipped through the published cadence. Anti-Inflation rules expanded from 4 to 5 with the addition of the Anti-Platitude rule (lines that resolve with generic emotional summaries hit the lowest Specificity + Voice band regardless of surface polish). Antagonist Ceiling clarified to require evidence; Historical Context anchored to the published corpus. Score deltas on the golden-eval set: <2 points on average (within MINOR threshold). Migration: existing scored content is auto-rescored on next eval; the seal field's rubricVersion now reads '1.1.0'. RFC-0002 (anti-platitude formalization) drafted as the in-comment artifact for this bump.
- RFC-0002 · 2026-04-25 · in-comment
Anti-Platitude rule (5th anti-inflation rule, v1.1.0)
Lines that resolve with generic emotional summaries ("all I need is love", "this is my truth", "love wins") hit the lowest Specificity + Voice band regardless of surface polish. Documented inline so implementers cite a published rule rather than discover it empirically.
Read RFC-0002
- RFC-0001 · 2026-04-25 · in-comment
Rubric versioning policy (v1.x cadence + diff format)
How the Lyric Scoring Standard versions: when a bump happens, what gets published with it, and how third parties verify which rubric scored their lyrics.
Read RFC-0001
- v1.0.0 · 2026-04-20 · version
Initial public release. 12 metrics finalized, anti-inflation rules documented, grade scale locked.