Public RFCs

Major changes ship after public comment.

Every change to the rubric, the scoring pipeline, the public API contract, or pricing opens here as an RFC and stays open for at least 7 days. You read it. You email comments. We resolve it in public.

What counts as “major”

  • Any change to the Lyric Scoring Standard (rubric, weights, anti-inflation rules)
  • Any breaking change to a public API contract
  • Any change to pricing, tier limits, or refund policy
  • Any change to the data we collect or how we use it

RFC-0009 · in-comment · opened 2026-04-26 · comment until 2026-05-03

Multi-language scoring methodology (Latin / Italian / Spanish / Japanese / French)

Pins the methodology for scoring non-English lyrics. Punch List #42. Operator-driven request: "could a user ask for a Gregorian Chant in Latin, an opera in Italian?" Today the rubric runs in English-only mode; the banned-terms dictionary is English; the platitude detector (RFC-0002) is English. Every non-English score therefore depends on the model's implicit translation. This RFC pins what changes in the rubric vs what stays language-agnostic.

Read the RFC
RFC-0008 · accepted · opened 2026-04-26

Open Scoring Corpus contribution policy

Pins the policy for how third parties contribute scored lyrics to the open Lyric Scoring corpus. Punch List #33 entry point. Eventual goal: 1,000+ human-scored lyrics, public, used to version + calibrate the rubric across the industry.

Read the RFC
RFC-0007 · in-comment · opened 2026-04-26 · comment until 2026-05-03

Reproducibility audit methodology

Pins the methodology for the quarterly reproducibility audit (Section 4 of /reports/calibration-2026-q2). For a sampled batch of 25 published scores, replay the same lyrics + genre through the same model + temp + rubric version and report per-row score delta. Zero or near-zero deltas confirm the seal is honest.
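
The replay loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the audit's actual code: `PublishedScore`, `audit_deltas`, the `rescore` callable, and the `tolerance` parameter are all hypothetical names, and the RFC summary does not say how small "near-zero" has to be.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class PublishedScore:
    """One sampled row: the inputs that produced a published score (hypothetical shape)."""
    lyrics: str
    genre: str
    model: str
    temperature: float
    rubric_version: str
    score: float


def audit_deltas(
    rows: List[PublishedScore],
    rescore: Callable[[PublishedScore], float],
    tolerance: float = 0.0,  # assumption: the threshold for "near-zero" is not given here
) -> List[Dict[str, object]]:
    """Replay each row through `rescore` and report the per-row score delta."""
    report = []
    for row in rows:
        replayed = rescore(row)  # same lyrics + genre + model + temp + rubric version
        delta = replayed - row.score
        report.append({
            "published": row.score,
            "replayed": replayed,
            "delta": delta,
            "within_tolerance": abs(delta) <= tolerance,
        })
    return report
```

In a real audit, `rescore` would call the scoring pipeline with the row's pinned model, temperature, and rubric version. Passing a stub that returns the published score is a quick way to check the report shape.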

Read the RFC
RFC-0006 · in-comment · opened 2026-04-26 · comment until 2026-05-03

Per-cohort engagement methodology + divergence-as-calibration

Pins the methodology behind the five-persona listener panel engagement scores (B1281+) and defines when per-cohort divergence from the composite triggers a rubric calibration response. Empirical input cites the B1340 /admin/cohort-divergence dashboard.

Read the RFC
RFC-0004 · accepted · opened 2026-04-26

Voltage Coach behavior change policy

Defines what counts as a "behavior change" to the Voltage Coach (B1291), covering kind classification, hint copy, and accept-rate triggers, and pins the publication discipline for each. Empirical input cites the B1346 kind-breakdown surface that landed earlier this session.

Read the RFC
RFC-0005 · accepted · opened 2026-04-26

GPT-4o pre-gauntlet critique as Sonnet gauntlet input

Adds a second GPT-4o call to the pipeline — this time as a literary critic rather than a re-scorer. Output feeds the Sonnet gauntlet as additional evidence the gauntlet decision-rule incorporates or rejects. Cost: ~$0.04/forge. Off-by-default behind SF_GPT4O_PREGAUNTLET. Operator-driven request: "should ChatGPT be used more than we are using it?"
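
An off-by-default gate like this is usually a one-line check. The sketch below assumes `SF_GPT4O_PREGAUNTLET` is read as an environment variable and that values such as `1` or `true` switch it on; neither detail is stated above, and `pregauntlet_enabled` is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional

# Assumption: which strings count as "on" is not specified by the RFC summary.
_TRUTHY = {"1", "true", "yes", "on"}


def pregauntlet_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when SF_GPT4O_PREGAUNTLET is explicitly switched on."""
    source = os.environ if env is None else env
    value = source.get("SF_GPT4O_PREGAUNTLET", "")  # unset means off (off-by-default)
    return value.strip().lower() in _TRUTHY
```

Under these assumptions, the pipeline would skip the second GPT-4o call, and its ~$0.04/forge cost, whenever the flag is unset or set to anything outside the truthy list.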

Read the RFC
RFC-0003 · accepted · opened 2026-04-25

Hum Score as the M11 (Memorability) calibration signal

The 24-hour-delayed Hum Test (B1303) is the only longitudinal-recall signal we have. This RFC proposes formally treating the systematic delta between fresh-M11 and Hum-M11 as the calibration ground truth for the Memorability metric, with quarterly rubric adjustments triggered when the corpus-wide median delta exceeds ±5pts.
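
The trigger condition reduces to a median over paired scores. A minimal sketch follows, assuming each lyric has one fresh-M11 and one Hum-M11 score on the same point scale and that the delta is taken as Hum minus fresh; the sign convention and the name `memorability_drift` are illustrative choices, not taken from the RFC.

```python
from statistics import median
from typing import Iterable, Tuple


def memorability_drift(
    pairs: Iterable[Tuple[float, float]],
    threshold: float = 5.0,  # the ±5pts trigger quoted above
) -> Tuple[float, bool]:
    """Return (median delta, whether it exceeds the threshold in either direction).

    Each pair is (fresh_m11, hum_m11). Assumption: delta = hum - fresh.
    """
    deltas = [hum - fresh for fresh, hum in pairs]
    if not deltas:
        raise ValueError("need at least one scored pair")
    med = median(deltas)
    return med, abs(med) > threshold
```

A negative median under this convention would mean fresh scores overstate what listeners still recall a day later, which is the case a quarterly adjustment would correct.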

Read the RFC
RFC-0002 · accepted · opened 2026-04-25

Anti-Platitude rule (5th anti-inflation rule, v1.1.0)

Lines that resolve with generic emotional summaries ("all I need is love", "this is my truth", "love wins") hit the lowest Specificity + Voice band regardless of surface polish. Documented inline so implementers cite a published rule rather than discover it empirically.

Read the RFC
RFC-0001 · accepted · opened 2026-04-25

Rubric versioning policy (v1.x cadence + diff format)

How the Lyric Scoring Standard versions: when a bump happens, what gets published with it, and how third parties verify which rubric scored their lyrics.

Read the RFC
RFC-0010 · in-comment · opened 2026-05-19 · comment until 2026-05-26

Fidelity Score v0.1.0 — calibration + composite formula

Pins the seven-component fidelity composite (premise 30% / anchors 25% / structure 15% / style 15% / forbidden 5% / chorus 5% / transcendence 5%), the constraint-mode multipliers (strict 0.95, standard 1.0, loose 1.15), the 8-tier grade calibration A+ through F, the brief-complexity 0–10 formula, and the three UX prominence buckets (hide / secondary / primary). Opens public comment on the entire Phase-2 CAF stack documented at /scoring/standard/fidelity v0.1.0.
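
The weights and multipliers above are enough to sketch the composite. The code below is an illustration under stated assumptions, not the formula the RFC pins: it assumes each component is scored 0–100, that the constraint-mode multiplier scales the weighted sum, and that the result is capped at 100. The summary does not say where the multiplier applies, and `fidelity_composite` is a hypothetical name.

```python
from typing import Mapping

# Component weights as listed in the RFC summary (they sum to 100%).
WEIGHTS = {
    "premise": 0.30,
    "anchors": 0.25,
    "structure": 0.15,
    "style": 0.15,
    "forbidden": 0.05,
    "chorus": 0.05,
    "transcendence": 0.05,
}

# Constraint-mode multipliers as listed in the RFC summary.
MODE_MULTIPLIER = {"strict": 0.95, "standard": 1.0, "loose": 1.15}


def fidelity_composite(components: Mapping[str, float], mode: str = "standard") -> float:
    """Weighted sum of the seven components, scaled by the constraint-mode multiplier.

    Assumptions: components are on a 0-100 scale, the multiplier applies to the
    weighted sum, and the result is capped at 100.
    """
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    if mode not in MODE_MULTIPLIER:
        raise ValueError(f"unknown constraint mode: {mode!r}")
    weighted = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return min(100.0, weighted * MODE_MULTIPLIER[mode])
```

Under these assumptions, a brief that scores 80 on every component comes out at about 80 in standard mode, about 76 in strict mode, and about 92 in loose mode. The grade tiers, brief-complexity formula, and prominence buckets are not sketched here because the summary gives no thresholds for them.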

Read the RFC