What SongForgeAI is building.
Three honest columns: shipped, building, next. Updated with every meaningful deploy.
Currently on Build 3629. Every item here ties to a public commit on GitHub.
The Engineering Punch List · Tier A
Fifteen foundational discipline items, each ratcheting a real CI gate. Tier A = this quarter. The full list lives in the engineering punch list on GitHub: all 50 items plus the Tier B/C/D plans.
#1 Kill explicit-any allowlist · Build 1178
#2 jsdom + first real renderHook test · Build 1174
#3 Design tokens module · Build 1175
#4 Bundle-size CI gate · Build 1176
#5 Golden-evals CI gate · Build 1177
#6 Kill the unchecked-index allowlist *(B1597 CLOSEOUT: cumulative 166 → 0 (100%). B1595 cleaned components/CoverArt.tsx (15: hexToRgb signature widened to absorb 12 + RGBA pixel-stride asserts for 3); B1596 cleaned lib/genre-profile.ts (15: ! on every GENRE_PROFILES static-key lookup since each key is guaranteed in the record); B1597 cleaned admin/lineage/page.tsx (30: narrow-once locals in the Fruchterman-Reingold force-layout loops). Allowlist now empty. Tier-A item closed.)*
#7 Kill the strict-tsc allowlist *(B1600 CLOSEOUT: cumulative 37 → 0 (100%). B1598 cleaned dashboard/page (4 dead-code orphans deleted: EvalDeleteButton + handleDeleteEval + upgrades pipeline). B1599 cleaned dashboard/SongDetail (10 violations: unused destructured hook returns + handleSunoExport orphan). B1600 cleaned forge/page (56 violations: ~40 dead imports from EXTRACT-1 factory pattern + 9 dead destructured hook fields + 2 orphan closures + cascading dead-setter cleanup). Allowlist now empty. Tier-A item closed.)*
#8 Kill the console-log allowlist *(B1586 CLOSEOUT: 10 → 0. Cumulative since B1037: 251 → 0. Tier-A target hit. Final 8 calls migrated to logger.info: forge/criticalSend (2 in create-stream-plumbing), mycelial-pathways, data-intelligence/orchestrator, manhattan/creative-debt, manhattan/skill-frontier, side-effects/cover-art, side-effects/focus-group. Two legitimate exclusions documented in the script: observability/logger.ts (the canonical sink) and developer/page.tsx (a JS code snippet inside a multi-line backtick template). The exclusions are commented inline in scripts/check-console-log.ts.)*
#9 Visual regression with Playwright toHaveScreenshot() (every page, every breakpoint) *(B1185: e2e/visual.spec.ts snapshots 8 public surfaces with dynamic-content masks (SongCounter, ShippingThisWeek, leaderboard, hero canvas). Per-platform baseline path (Linux is the CI source of truth). Tightened threshold 0.20 → 0.15. CI auto-uploads new baselines + diff PNGs as artifact for PR review. Single-breakpoint (1280x800) for now; mobile + tablet breakpoints land as follow-up when the desktop baseline is stable.)*
#10 A11y audit with axe-core in CI, block on serious violations · Build 1179
#11 Storybook for every primitive (Button, Card, Badge, Input, ScoreBadge, etc.) *(B1602 deferred; **B2358 stand-up + B2372 closeout**. B2358 installed Storybook 8.6 + the first 3 stories (Button 12, Chip 10, ScoreBadge 9). B2372 shipped the remaining 7 (Disclosure, Surface, MetadataRow, NextActionPair, TrustBlock, FooterLadder, CrucibleVoiceIcons) in a single Audit 2026-05-14 A16 batch; every design-system primitive now has a story file with realistic args + variant coverage. Item flipped to closed at B2421 (the punch list lagged the actual shipment by ~50 builds; B2421 caught the drift during a punch-list audit).)*
#12 Real component library (replace ad-hoc Tailwind with <Button intent="forge" size="md">) *(B1602 kickoff: Button primitive shipped: typed intent="forge|ai|outline" + size="sm|md|lg", polymorphic <button> / <Link> via href prop, loading state with spinner, leadingIcon/trailingIcon slots. First migration: dashboard BillingTab upgrade CTA (next/link → Button href=...). 11 contract tests assert intent/size class composition + canon-compliance (no off-palette utilities, no rounded-3xl, no inline gradients). Closes when 30+ btn-* call sites are migrated to Button + the design-canon gate adds a forbid-rule for raw <button className="btn-*">.)*
#13 Type-generated Supabase schema (database.types.ts regenerated on migration; no hand-typed table shapes) *(B1186 prep: workflow + npm script + placeholder + freshness check shipped. CLOSEOUT B1761: four-build chain landed the criteria. B1756 typed factory createTypedServiceRoleClient() scaffold + 6 contract tests; B1757 operator ran npm run db:gen-types + types regenerated against live schema (32 tables, 4 RPCs, schema 14.4); B1758 wave 1 migration (admin/regression-check + admin/costs, launderer ceiling 13 → 11); B1761 wave 2 migration (admin/forge-metrics + admin/cost-per-song, ceiling 11 → 8, including FULL/FALLBACK dynamic-select dead-code removal). Five admin routes now use the typed client; row types narrow against the live schema; the placeholder is gone; the freshness check would fail loudly if a migration drifted the types. Item closed.)*
#14 Zod runtime validation on every SSE event (no trusted casts) *(B1180: PipelineEventSchema + CrucibleEventSchema; runtime validation in withSSEStream, forge create-stream-plumbing, and crucible route. 34 schema tests covering valid variants + drift catalog. Catches typo'd type discriminators, missing required fields, mixed envelope drift (text vs message). Critical events that fail validation get replaced with a generic error envelope so the client never sees a corrupt payload.)*
#15 Discriminated-union state for ForgeResult (Draft | Evaluated | Gauntleted, each with required fields) *(B1181: ForgeSongState union + deriveSongState() pure derive function in src/app/forge/song-state.ts. useSongState() hook combines useForgeSessionState + useGauntletState through deriveSongState. 20 unit tests cover every transition path including the partial-Gauntleted guard. Source-of-truth state stays in the existing hooks; consumers migrate to the union as files are touched.)*
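The discriminated-union idea in item #15 can be sketched roughly as follows. Only the three variant names (Draft, Evaluated, Gauntleted) and the deriveSongState() name come from the punch list. The field names, the two snapshot shapes, and the reading of the "partial-Gauntleted guard" as "a gauntlet verdict without a score falls back to draft" are assumptions made for illustration, not the code in src/app/forge/song-state.ts.

```typescript
// Hypothetical field names: only the variant names and deriveSongState()
// appear in the punch-list entry.
type ForgeSongState =
  | { kind: "draft"; lyrics: string }
  | { kind: "evaluated"; lyrics: string; score: number }
  | { kind: "gauntleted"; lyrics: string; score: number; gauntletVerdict: string };

// Loose inputs, standing in for what the two source-of-truth hooks might expose.
interface SessionSnapshot {
  lyrics: string;
  score?: number;
}
interface GauntletSnapshot {
  verdict?: string;
}

// Pure derive function: same inputs always give the same state, no side effects.
function deriveSongState(session: SessionSnapshot, gauntlet: GauntletSnapshot): ForgeSongState {
  const { lyrics, score } = session;
  if (score === undefined) {
    // Assumed partial-Gauntleted guard: a verdict without a score is ignored,
    // so the song is still a draft.
    return { kind: "draft", lyrics };
  }
  if (gauntlet.verdict === undefined) {
    return { kind: "evaluated", lyrics, score };
  }
  return { kind: "gauntleted", lyrics, score, gauntletVerdict: gauntlet.verdict };
}

// Consumers switch on `kind`; the compiler narrows each branch to the
// fields that variant is guaranteed to carry.
function describeState(state: ForgeSongState): string {
  switch (state.kind) {
    case "draft":
      return "draft";
    case "evaluated":
      return `evaluated at ${state.score}`;
    case "gauntleted":
      return `gauntleted (${state.gauntletVerdict}) at ${state.score}`;
  }
}
```

The benefit of this shape is that a consumer cannot read a score from a draft or a verdict from an un-gauntleted song: those fields do not exist on the narrowed type.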
Recent commits (auto-derived from git)
- B3592 · operator request · radio-player cover art: fire burns across the bottom, on top. · 06-04
- B3588 · Deep Audit · retry-once on a flaky gate (premise corrected). · 06-04
- B3587 · Deep Audit · check:telemetry-capture gate. · 06-04
- B3586 · operator-reported · /albums track count varies: stop hardcoding 12. · 06-04
- B3571 · Deep Audit · single source of truth for the scoring-standard version + CI gate. · 06-03
- B3525 · WAR ROOM · the "Tiny Genius Songs" children's-music failure: diagnosis + solution ladder + SA#33 candidate. · 06-01
- B3494 · operator request · dashboard Search now spans the whole library + language filter + Oldest First. · 05-31
- B3492 · operator request · the forge empty-state seed control now serves the 200 RICH examples. · 05-31
- B3491 · operator request · relocate the rich-prompt hint actions. · 05-31
- B3490 · operator request · 200 rich-prompt examples + dice Re-Roll on the forge hint. · 05-31
- B3456 · operator request · ignite the per-song fire on the canonical radio player. · 05-29
- B3455 · operator request · brighten the HeroFireBand on /examples. · 05-29
- B3448 · WAR ROOM · consolidate 15+ TODO/punch-list docs into one prioritized docs/MASTER-PUNCH-LIST.md. · 05-29
- B3447 · Punch List · chorus-mass + named-event bridge audit primitives. · 05-29
- B3426 · Deep Audit · extract primitive failure-mode wiring into fidelity-audit-rb-primitives.ts and bump component-loc ceiling for the legitimate interface surface. · 05-28
- B3425 · Deep Audit · R5-B-wire phase 1: ATL + STDD + IRD now plumbed into runFidelityAudit. Every forged song persists the 3 new signals in its fidelity_audit JSON column. · 05-28
- B3424 · Deep Audit · Trust Decay Audit ritual run: 33/33 claims green at 3-day accelerated cadence. · 05-28
- B3423 · Deep Audit · Publish R5-B WAR Room methodology as /blog/concrete-to-abstract-drift-2026-05-28. · 05-28
- B3422 · Deep Audit · R5-B-lexicon expansion: cross-arc vocabulary unblocks STDD empirical proof. Mean STDD: 5.8 → 7.7 (+33%). · 05-28
- B3421 · Deep Audit · Install pre-push hook + clear 14 days of silently-red test debt, SA#18 cleanup. · 05-28
Auto-pulled from git log. Regenerated on every push (the snapshot is committed). Never goes stale.
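One way such a snapshot could be derived is sketched below. It assumes the log is dumped with `git log --pretty=format:%ad%x09%s --date=format:%m-%d` (date, a tab, then the commit subject) and that subjects start with a build marker such as `B3592`. Both the dump format and the parseGitLog() helper are hypothetical; the page does not say how the real generator works.

```typescript
// One parsed row of the hypothetical "recent commits" snapshot.
interface CommitEntry {
  build: number | null; // null when the subject has no leading B<number> marker
  subject: string;
  date: string; // MM-DD, as produced by --date=format:%m-%d
}

// Parses tab-separated `git log` output (date<TAB>subject per line).
// Lines without a tab are skipped rather than guessed at.
function parseGitLog(raw: string): CommitEntry[] {
  const entries: CommitEntry[] = [];
  for (const line of raw.split("\n")) {
    const tab = line.indexOf("\t");
    if (tab === -1) continue;
    const date = line.slice(0, tab);
    const subject = line.slice(tab + 1).trim();
    const match = /^B(\d+)\b/.exec(subject);
    entries.push({ build: match ? Number(match[1]) : null, subject, date });
  }
  return entries;
}
```

Committing the parsed result alongside the code means the page can render without shelling out to git at request time.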
Highlighted shipped (curated, last 90 days)
Build 3444 · score-stability harness + check:score-stability ratchet.
Build 3443 · generate + attach album art on the audio admin page.
Build 3442 · admin area to attach audio to the nine OSNG showcase songs.
Build 3441 · synthesize 6 external reviews + 3 idea-dumps into one stress-test + repositioning program.
Build 3440 · OG share-image, prominent Suno recipes, and a radio player wired + dormant until audio…
Build 3439 · ship the public /one-story-nine-genres proof page.
Build 3438 · convergent-validity analysis vs the eval's per-metric sub-scores.
Build 3437 · ablation-manufactured calibration: break the no-corpus deadlock.
Build 3436 · publish the Corner Booth case study (AI tool de-named) + commit the OSNG harness / data…
Build 3435 · genre-primitive discrimination gate: the first CI gate that runs the craft detectors o…
Build 3434 · prove the genre-craft gates were never corpus-calibrated: vault calibration evidence.
Build 3433 · credit named details (proper nouns + spelled numbers) in NCD + CID detectors.
Build 3432 · test the Stripe webhook money path.
Build 3431 · fix inverted layering: move singability + lyrics-detail-types into src/lib, gate lib→app.
Actively building
DATA: Make the taste-test the homepage secondary CTA. Move from `/leaderboard` footer to homepage hero slot.
STANDARDS: Ship `npm install @songforgeai/scoring-rubric`: pure data + helper functions, MIT-licensed. Every install is a citation.
DISTRIBUTION: Free tier: 100 calls/month, no card required. Get the SDK in every developer's hands.
VELOCITY: Continue ratchet stack expansion: cyclomatic complexity ceiling, dependency-cycle ceiling, untested-public-API ceiling, prompt-token-budget ceiling.
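A ratchet, as the term is commonly used for CI gates, compares a measured count against a committed ceiling: the build fails if the count rises, and the ceiling is lowered whenever the count drops. The sketch below shows that rule for a single metric. checkRatchet(), its verdict shape, and the metric name are hypothetical and not taken from the repo's scripts.

```typescript
// Result of comparing one measured value against its committed ceiling.
type RatchetVerdict =
  | { outcome: "fail"; message: string }
  | { outcome: "pass"; message: string }
  | { outcome: "tighten"; newCeiling: number; message: string };

// Hypothetical helper: the gate only ever moves in one direction.
function checkRatchet(metric: string, measured: number, ceiling: number): RatchetVerdict {
  if (measured > ceiling) {
    // Regression: CI should go red.
    return { outcome: "fail", message: `${metric}: ${measured} exceeds ceiling ${ceiling}` };
  }
  if (measured < ceiling) {
    // Improvement: lock it in so the count cannot creep back up.
    return {
      outcome: "tighten",
      newCeiling: measured,
      message: `${metric}: lower ceiling ${ceiling} -> ${measured}`,
    };
  }
  return { outcome: "pass", message: `${metric}: at ceiling ${ceiling}` };
}

// Example: a cleanup took the count from 13 to 11, so the ceiling follows.
console.log(checkRatchet("dependency-cycles", 11, 13).message);
```

Each new ceiling in the list above (complexity, dependency cycles, untested public API, prompt-token budget) would only need its own measurement step; the comparison stays the same.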
Dated commitments
Five public commitments with target dates. Slipping a date isn't fatal — but it's logged, and the next Bet Review explains why. No vapor.
First real testimonial on the homepage
Distribution · Due 2026-05-14 · 23d overdue
One named, attributed quote from a real user — not staff, not company-authored. Lands in the TestimonialsSection registry (lib/testimonials.ts); the section currently renders an honest "we don't ship fabricated testimonials" disclosure card while the registry is empty (B2025).
Founder Loom walkthrough (10 min)
Distribution · Due 2026-05-09 · 28d overdue
Solo-operator + AI-pair tour through the forge → score → refine loop. Posted on the homepage above the fold + the about page.
Inngest Phase 2B — audio generation off SSE
Velocity · Due 2026-05-31 · 6d overdue
Multi-step Suno API pattern routed through the queue so the user-facing forge stream is no longer blocked on audio. Phase 1 (cover-art) and 2A (focus group) already shipped in B1429 + B1550.
First external academic citation of the Lyric Scoring Standard
Standards · Due 2026-06-30 · 24d out
A music school, ISMIR paper, or third-party tool that imports @songforgeai/scoring-rubric and credits the standard by name + version. The npm package shipped at v1.1.0 in B1759.
100 npm installs of @songforgeai/scoring-rubric
Distribution · Due 2026-07-31 · 55d out
Adoption-counter milestone. Today: 0 installs. The badge on /scoring/standard updates live; this is the first crossing of the threshold where the standard has measurable real-world reach.
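The "overdue" and "out" labels on the commitments above are plain day arithmetic against the due date. A minimal sketch, assuming a hypothetical commitmentStatus() helper; the reference date 2026-06-06 is inferred from the five labels on this page, and the "due today" wording is an invented placeholder for the zero-day case.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Parses YYYY-MM-DD as UTC midnight so time zones and DST cannot shift the count.
function utcDay(isoDate: string): number {
  const ms = Date.parse(`${isoDate}T00:00:00Z`);
  if (Number.isNaN(ms)) throw new Error(`invalid date: ${isoDate}`);
  return ms;
}

// Hypothetical helper: renders the same "Nd overdue" / "Nd out" labels.
function commitmentStatus(dueIso: string, todayIso: string): string {
  const days = Math.round((utcDay(dueIso) - utcDay(todayIso)) / MS_PER_DAY);
  if (days < 0) return `${-days}d overdue`;
  if (days === 0) return "due today"; // wording assumed, not from the page
  return `${days}d out`;
}

console.log(commitmentStatus("2026-05-14", "2026-06-06")); // "23d overdue"
console.log(commitmentStatus("2026-07-31", "2026-06-06")); // "55d out"
```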
Next up
DATA: Recruit 10 paid songwriter raters at $500/quarter each. (45-day milestone.)
DATA: Recruit 50 paid contributors active by month 12.
DATA: Hum Test corpus (B1303): once 6 months of data lands, publish first longitudinal-recall analysis.
DATA: Quarterly calibration report cadence: "Rubric v1.X was calibrated against N human preferences with Y% agreement at 13+pt score gaps."
STANDARDS: Quarterly RFC cycle, public. RFC-0002 (Anti-Platitude) shipped privately at B1255. Future RFCs go through 7-day public comment via `/rfc`.
STANDARDS: Academic paper route: submit to ICCC 2027 or ISMIR. The rubric + anti-inflation discipline + calibration corpus = a real paper.
STANDARDS: Lyric Scoring Foundation prep: when corpus reaches 10K+ entries and 3+ third-party citations, file the non-profit paperwork.
DISTRIBUTION: Co-branded proposal to Suno + Udio: API tier, integration docs, OG-cards on shared songs that show our seal.
DISTRIBUTION: "Powered by Lyric Scoring Standard" badge: SVG + HTML snippet at `/developer/embed`.
DISTRIBUTION: B2B sales motion: music supervisors, sync licensors, music-rights orgs. Reproducibility seal IS the legal-grade evidence they need.
DISTRIBUTION: Zapier/Make integration on top of the existing webhook outbound (B1274).
VELOCITY: `/engineering/punch-list` public page that updates from `docs/PUNCH-LIST.md`. Recruiting tool.
VELOCITY: Codify the Deep Pass Protocol cadence: 4x/year, one major subsystem per quarter.
VELOCITY: Open-source `@songforgeai/quality-tools`: Forbidden Archive parser, audit scripts, ratchet checkers.
VELOCITY: Quarterly "X builds shipped, Y ratchets tightened, Z punch list items closed" public report.
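The calibration-report line in the list above ("N human preferences with Y% agreement at 13+pt score gaps") can be read as a pairwise agreement rate. The sketch below takes that reading as an assumption: for each pair of songs whose rubric scores differ by at least the gap, it checks whether the higher-scored song is the one the human rater preferred. The PairwiseJudgment shape and agreementAtGap() are hypothetical names, and the real report may define agreement differently.

```typescript
// One human preference between two scored songs (hypothetical shape).
interface PairwiseJudgment {
  scoreA: number;
  scoreB: number;
  humanPreferred: "A" | "B";
}

// Returns N and Y% for pairs whose score gap is at least minGap,
// or null when no pair qualifies (avoids dividing by zero).
function agreementAtGap(
  judgments: PairwiseJudgment[],
  minGap: number,
): { n: number; agreementPct: number } | null {
  let n = 0;
  let agree = 0;
  for (const j of judgments) {
    const gap = j.scoreA - j.scoreB;
    // Ties carry no rubric preference; small gaps are outside the claim.
    if (gap === 0 || Math.abs(gap) < minGap) continue;
    n += 1;
    const rubricPreferred = gap > 0 ? "A" : "B";
    if (rubricPreferred === j.humanPreferred) agree += 1;
  }
  return n === 0 ? null : { n, agreementPct: (100 * agree) / n };
}
```

Restricting the claim to large gaps is what makes it checkable: close scores are expected to be noisy, while a 13-point gap that humans regularly contradict would indicate a miscalibrated rubric.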
Under consideration
Mobile-first forge UX
Current flow is responsive but desktop-first.
Songwriter-craft certification program
Pass X metrics → get a badge. B2B-adjacent and community-building.
Cluster-pillar SEO reorganization
33 songwriting guides regrouped into hub-and-spoke clusters. Current structure is flat.
Why this page exists
Most SaaS roadmaps are marketing artifacts — aspirational lists that never get updated. This one is the opposite. If something is on the shipped list, there is a commit behind it. If it moves between sections, the commit message says so.
A publicly-tracked roadmap is accountability, not a promise. If you want to lobby for something near the top of “Next up,” email me.