Roadmap

What SongForgeAI is building.

Three honest columns: shipped, building, next. Updated with every meaningful deploy.

Currently on Build 3629. Every item here ties to a public commit on GitHub.

The Engineering Punch List · Tier A

Fifteen foundational discipline items, each ratcheting a real CI gate. Tier A = this quarter. The full list lives in the engineering punch list on GitHub: all 50 items plus the Tier B/C/D plans.

  • #1 Kill explicit-any allowlist

    Build 1178
  • #2 jsdom + first real renderHook test

    Build 1174
  • #3 Design tokens module

    Build 1175
  • #4 Bundle-size CI gate

    Build 1176
  • #5 Golden-evals CI gate

    Build 1177
  • #6 Kill the unchecked-index allowlist *(B1597 CLOSEOUT: cumulative 166 → 0 (100%). B1595 cleaned components/CoverArt.tsx (15: hexToRgb signature widened to absorb 12, plus RGBA pixel-stride asserts for 3); B1596 cleaned lib/genre-profile.ts (15: ! on every GENRE_PROFILES static-key lookup, since each key is guaranteed in the record); B1597 cleaned admin/lineage/page.tsx (30: narrow-once locals in the Fruchterman-Reingold force-layout loops). Allowlist now empty. Tier-A item closed.)*

  • #7 Kill the strict-tsc allowlist *(B1600 CLOSEOUT: cumulative 37 → 0 (100%). B1598 cleaned dashboard/page (4 dead-code orphans deleted: EvalDeleteButton + handleDeleteEval + upgrades pipeline). B1599 cleaned dashboard/SongDetail (10 violations: unused destructured hook returns + handleSunoExport orphan). B1600 cleaned forge/page (56 violations: ~40 dead imports from EXTRACT-1 factory pattern + 9 dead destructured hook fields + 2 orphan closures + cascading dead-setter cleanup). Allowlist now empty. Tier-A item closed.)*

  • #8 Kill the console-log allowlist *(B1586 CLOSEOUT: 10 → 0. Cumulative since B1037: 251 → 0. Tier-A target hit. Final 8 calls migrated to logger.info: forge/criticalSend (2 in create-stream-plumbing), mycelial-pathways, data-intelligence/orchestrator, manhattan/creative-debt, manhattan/skill-frontier, side-effects/cover-art, side-effects/focus-group. Two legitimate exclusions documented in the script: observability/logger.ts (the canonical sink) and developer/page.tsx (a JS code snippet inside a multi-line backtick template). The exclusions are commented inline in scripts/check-console-log.ts.)*

  • #9 Visual regression with Playwright toHaveScreenshot() (every page, every breakpoint) *(B1185: e2e/visual.spec.ts snapshots 8 public surfaces with dynamic-content masks (SongCounter, ShippingThisWeek, leaderboard, hero canvas). Per-platform baseline path (Linux is the CI source of truth). Tightened threshold 0.20 → 0.15. CI auto-uploads new baselines + diff PNGs as an artifact for PR review. Single-breakpoint (1280x800) for now; mobile + tablet breakpoints land as a follow-up once the desktop baseline is stable.)*

  • #10 A11y audit with axe-core in CI, block on serious violations

    Build 1179
  • #11 Storybook for every primitive (Button, Card, Badge, Input, ScoreBadge, etc.) *(B1602 deferred; **B2358 stand-up + B2372 closeout**. B2358 installed Storybook 8.6 + the first 3 stories (Button 12, Chip 10, ScoreBadge 9). B2372 shipped the remaining 7 (Disclosure, Surface, MetadataRow, NextActionPair, TrustBlock, FooterLadder, CrucibleVoiceIcons) in a single Audit 2026-05-14 A16 batch, so every design-system primitive now has a story file with realistic args + variant coverage. Item flipped to closed at B2421 (the punch list lagged the actual shipment by ~50 builds; B2421 caught the drift during a punch-list audit).)*

  • #12 Real component library (replace ad-hoc Tailwind with <Button intent="forge" size="md">) *(B1602 kickoff: Button primitive shipped with typed intent="forge|ai|outline" + size="sm|md|lg", polymorphic <button> / <Link> via href prop, loading state with spinner, leadingIcon/trailingIcon slots. First migration: dashboard BillingTab upgrade CTA (next/link → Button href=...). 11 contract tests assert intent/size class composition + canon-compliance (no off-palette utilities, no rounded-3xl, no inline gradients). Closes when 30+ btn-* call sites are migrated to Button + the design-canon gate adds a forbid-rule for raw <button className="btn-*">.)*

  • #13 Type-generated Supabase schema (database.types.ts regenerated on migration; no hand-typed table shapes) *(B1186 prep: workflow + npm script + placeholder + freshness check shipped. CLOSEOUT B1761: a four-build chain landed the criteria. B1756: typed factory createTypedServiceRoleClient() scaffold + 6 contract tests; B1757: operator ran npm run db:gen-types + types regenerated against live schema (32 tables, 4 RPCs, schema 14.4); B1758: wave 1 migration (admin/regression-check + admin/costs, launderer ceiling 13 → 11); B1761: wave 2 migration (admin/forge-metrics + admin/cost-per-song, ceiling 11 → 8, including FULL/FALLBACK dynamic-select dead-code removal). Five admin routes now use the typed client; row types narrow against the live schema; the placeholder is gone; the freshness check would fail loudly if a migration drifted the types. Item closed.)*

  • #14 Zod runtime validation on every SSE event (no trusted casts) *(B1180: PipelineEventSchema + CrucibleEventSchema; runtime validation in withSSEStream, forge create-stream-plumbing, and crucible route. 34 schema tests covering valid variants + drift catalog. Catches typo'd type discriminators, missing required fields, mixed envelope drift (text vs message). Critical events that fail validation get replaced with a generic error envelope so the client never sees a corrupt payload.)*

  • #15 Discriminated-union state for ForgeResult (Draft | Evaluated | Gauntleted, each with required fields) *(B1181: ForgeSongState union + pure deriveSongState() function in src/app/forge/song-state.ts. useSongState() hook combines useForgeSessionState + useGauntletState through deriveSongState. 20 unit tests cover every transition path including the partial-Gauntleted guard. Source-of-truth state stays in the existing hooks; consumers migrate to the union as files are touched.)*
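
The shape described in item #15 can be sketched roughly as follows. This is an illustration only, not the code in src/app/forge/song-state.ts: the field names (lyrics, score, gauntletVerdict), the RawSongInputs shape, and the describe() helper are assumptions. Only the three state names, the idea of a pure derive function, and the partial-Gauntleted guard come from the item above.

```typescript
// Hypothetical fields; only Draft | Evaluated | Gauntleted come from the roadmap item.
type ForgeSongState =
  | { kind: "Draft"; lyrics: string }
  | { kind: "Evaluated"; lyrics: string; score: number }
  | { kind: "Gauntleted"; lyrics: string; score: number; gauntletVerdict: string };

// Loose inputs, standing in for whatever the existing hooks expose (assumed shape).
interface RawSongInputs {
  lyrics: string;
  score?: number;
  gauntletVerdict?: string;
}

// Pure derive function: same inputs always give the same state, no side effects.
function deriveSongState(raw: RawSongInputs): ForgeSongState {
  const { lyrics, score, gauntletVerdict } = raw;
  if (score === undefined) {
    // Guard: a gauntlet verdict without a score is a partial state, so stay in Draft.
    return { kind: "Draft", lyrics };
  }
  if (gauntletVerdict === undefined) {
    return { kind: "Evaluated", lyrics, score };
  }
  return { kind: "Gauntleted", lyrics, score, gauntletVerdict };
}

// Consumers switch on `kind`; the compiler narrows each branch to its required fields.
function describe(state: ForgeSongState): string {
  switch (state.kind) {
    case "Draft":
      return "draft";
    case "Evaluated":
      return `scored ${state.score}`;
    case "Gauntleted":
      return `scored ${state.score}, gauntlet: ${state.gauntletVerdict}`;
  }
}
```

The point of the union is that a consumer cannot read score on a Draft; the type checker rejects it before any test runs.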

15 done · 0 partial · 0 not yet
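
Item #6 mentions "narrow-once locals" as the fix used in the force-layout loops. A minimal sketch of that pattern, assuming the unchecked-index gate corresponds to TypeScript's noUncheckedIndexedAccess option (the roadmap does not say so explicitly). The Point type and totalPairDistance function are hypothetical stand-ins, not code from admin/lineage/page.tsx.

```typescript
// With noUncheckedIndexedAccess on, points[i] has type `Point | undefined`.
interface Point {
  x: number;
  y: number;
}

// Hypothetical stand-in for a force-layout inner loop: sum of pairwise distances.
function totalPairDistance(points: readonly Point[]): number {
  let total = 0;
  for (let i = 0; i < points.length; i++) {
    // Narrow once into a local instead of writing points[i]! at every use.
    const a = points[i];
    if (a === undefined) continue;
    for (let j = i + 1; j < points.length; j++) {
      const b = points[j];
      if (b === undefined) continue;
      total += Math.hypot(a.x - b.x, a.y - b.y);
    }
  }
  return total;
}
```

Reading the element into a local and checking it once keeps the loop body free of non-null assertions, and the code compiles the same way whether or not the stricter option is enabled.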

Recent commits (auto-derived from git)

live · no doc update needed
  • B3592 · operator request · radio-player cover art: fire burns across the bottom, on top. · 06-04
  • B3588 · Deep Audit · retry-once on a flaky gate (premise corrected). · 06-04
  • B3587 · Deep Audit · check:telemetry-capture gate. · 06-04
  • B3586 · operator-reported · /albums track count varies: stop hardcoding 12. · 06-04
  • B3571 · Deep Audit · single source of truth for the scoring-standard version + CI gate. · 06-03
  • B3525 · WAR ROOM · the "Tiny Genius Songs" children's-music failure: diagnosis + solution ladder + SA#33 candidate. · 06-01
  • B3494 · operator request · dashboard Search now spans the whole library + language filter + Oldest First. · 05-31
  • B3492 · operator request · the forge empty-state seed control now serves the 200 RICH examples. · 05-31
  • B3491 · operator request · relocate the rich-prompt hint actions. · 05-31
  • B3490 · operator request · 200 rich-prompt examples + dice Re-Roll on the forge hint. · 05-31
  • B3456 · operator request · ignite the per-song fire on the canonical radio player. · 05-29
  • B3455 · operator request · brighten the HeroFireBand on /examples. · 05-29
  • B3448 · WAR ROOM · consolidate 15+ TODO/punch-list docs into one prioritized docs/MASTER-PUNCH-LIST.md. · 05-29
  • B3447 · Punch List · chorus-mass + named-event bridge audit primitives. · 05-29
  • B3426 · Deep Audit · extract primitive failure-mode wiring into fidelity-audit-rb-primitives.ts and bump component-loc ceiling for the legitimate interface surface. · 05-28
  • B3425 · Deep Audit · R5-B-wire phase 1: ATL + STDD + IRD now plumbed into runFidelityAudit. Every forged song persists the 3 new signals in its fidelity_audit JSON column. · 05-28
  • B3424 · Deep Audit · Trust Decay Audit ritual run: 33/33 claims green at 3-day accelerated cadence. · 05-28
  • B3423 · Deep Audit · Publish R5-B WAR Room methodology as /blog/concrete-to-abstract-drift-2026-05-28. · 05-28
  • B3422 · Deep Audit · R5-B-lexicon expansion: cross-arc vocabulary unblocks STDD empirical proof. Mean STDD: 5.8 → 7.7 (+33%). · 05-28
  • B3421 · Deep Audit · Install pre-push hook + clear 14 days of silently-red test debt (SA#18 cleanup). · 05-28

Auto-pulled from git log. Regenerated on every push (the snapshot is committed). Never goes stale.

Highlighted shipped (curated, last 90 days)

  • score-stability harness + check:score-stability ratchet.

    Build 3444

  • generate + attach album art on the audio admin page.

    Build 3443

  • admin area to attach audio to the nine OSNG showcase songs.

    Build 3442

  • synthesize 6 external reviews + 3 idea-dumps into one stress-test + repositioning program.

    Build 3441

  • OG share-image, prominent Suno recipes, and a radio player wired + dormant until audio…

    Build 3440

  • ship the public /one-story-nine-genres proof page.

    Build 3439

  • convergent-validity analysis vs the eval's per-metric sub-scores.

    Build 3438

  • ablation-manufactured calibration — break the no-corpus deadlock.

    Build 3437

  • publish the Corner Booth case study (AI tool de-named) + commit the OSNG harness / data…

    Build 3436

  • genre-primitive discrimination gate — the first CI gate that runs the craft detectors o…

    Build 3435

  • prove the genre-craft gates were never corpus-calibrated — vault calibration evidence.

    Build 3434

  • credit named details (proper nouns + spelled numbers) in NCD + CID detectors.

    Build 3433

  • test the Stripe webhook money path.

    Build 3432

  • fix inverted layering — move singability + lyrics-detail-types into src/lib, gate lib→app.

    Build 3431

Actively building

  • DATA: Make the taste-test the homepage secondary CTA. Move it from the `/leaderboard` footer to the homepage hero slot.

  • STANDARDS: Ship `npm install @songforgeai/scoring-rubric`: pure data + helper functions, MIT-licensed. Every install is a citation.

  • DISTRIBUTION: Free tier: 100 calls/month, no card required. Get the SDK in every developer's hands.

  • VELOCITY: Continue ratchet stack expansion: cyclomatic complexity ceiling, dependency-cycle ceiling, untested-public-API ceiling, prompt-token-budget ceiling.
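
The "ratchet" and "ceiling" language used throughout this page can be read as a one-way counter: a measured count may drop below its committed ceiling but never rise above it. A minimal sketch of such a check, under that reading. The checkRatchet name, its result shape, and the messages are hypothetical, not the project's actual scripts.

```typescript
// Result of comparing one measured metric against its committed ceiling.
interface RatchetResult {
  ok: boolean; // false means the CI gate should fail
  newCeiling: number; // ceiling to commit after this run
  message: string;
}

// A ratchet only turns one way: the ceiling can be lowered, never raised.
function checkRatchet(name: string, ceiling: number, measured: number): RatchetResult {
  if (measured > ceiling) {
    return {
      ok: false,
      newCeiling: ceiling,
      message: `${name}: ${measured} exceeds ceiling ${ceiling}`,
    };
  }
  if (measured < ceiling) {
    return {
      ok: true,
      newCeiling: measured,
      message: `${name}: tighten ceiling ${ceiling} -> ${measured}`,
    };
  }
  return { ok: true, newCeiling: ceiling, message: `${name}: at ceiling ${ceiling}` };
}
```

For example, checkRatchet("launderer", 11, 8) passes and proposes 8 as the new ceiling, which matches the 11 → 8 style of movement reported in the Tier A notes. An allowlist reaching empty is the same ratchet arriving at zero.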

Dated commitments

Five public commitments with target dates. Slipping a date isn't fatal — but it's logged, and the next Bet Review explains why. No vapor.

  • First real testimonial on the homepage

    Distribution

    Due 2026-05-14

    23d overdue

    One named, attributed quote from a real user — not staff, not company-authored. Lands in the TestimonialsSection registry (lib/testimonials.ts); the section currently renders an honest "we don't ship fabricated testimonials" disclosure card while the registry is empty (B2025).

  • Founder Loom walkthrough (10 min)

    Distribution

    Due 2026-05-09

    28d overdue

    Solo-operator + AI-pair tour through the forge → score → refine loop. Posted on the homepage above the fold + the about page.

  • Inngest Phase 2B — audio generation off SSE

    Velocity

    Due 2026-05-31

    6d overdue

    Multi-step Suno API pattern routed through the queue so the user-facing forge stream is no longer blocked on audio. Phase 1 (cover-art) and 2A (focus group) already shipped in B1429 + B1550.

    Track status →
  • First external academic citation of the Lyric Scoring Standard

    Standards

    Due 2026-06-30

    24d out

    A music school, ISMIR paper, or third-party tool that imports @songforgeai/scoring-rubric and credits the standard by name + version. The npm package shipped at v1.1.0 in B1759.

    Track status →
  • 100 npm installs of @songforgeai/scoring-rubric

    Distribution

    Due 2026-07-31

    55d out

    Adoption-counter milestone. Today: 0 installs. The badge on /scoring/standard updates live; this is the first crossing of the threshold where the standard has measurable real-world reach.

    Track status →
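
The "overdue" and "out" labels above are plain date arithmetic against each due date. A small sketch of how such a label could be computed. The function names and the "due today" wording are assumptions, and the reference date 2026-06-06 is inferred from the labels shown on this page rather than stated anywhere.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Parse "YYYY-MM-DD" as UTC midnight so time zones and DST cannot shift the count.
function parseIsoDate(iso: string): number {
  const [y, m, d] = iso.split("-").map(Number);
  if (
    y === undefined || m === undefined || d === undefined ||
    Number.isNaN(y) || Number.isNaN(m) || Number.isNaN(d)
  ) {
    throw new Error(`invalid date: ${iso}`);
  }
  return Date.UTC(y, m - 1, d);
}

// Label a commitment relative to "today": "23d overdue", "due today", or "24d out".
function dueStatus(dueIso: string, todayIso: string): string {
  const days = Math.round((parseIsoDate(dueIso) - parseIsoDate(todayIso)) / MS_PER_DAY);
  if (days < 0) return `${-days}d overdue`;
  if (days === 0) return "due today";
  return `${days}d out`;
}
```

With 2026-06-06 as the reference date, dueStatus("2026-05-14", "2026-06-06") gives "23d overdue" and dueStatus("2026-07-31", "2026-06-06") gives "55d out", which agrees with the labels listed above.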

Next up

  • DATA: Recruit 10 paid songwriter raters at $500/quarter each. (45-day milestone.)

  • DATA: Recruit 50 paid contributors active by month 12.

  • DATA: Hum Test corpus (B1303): once 6 months of data lands, publish the first longitudinal-recall analysis.

  • DATA: Quarterly calibration report cadence: "Rubric v1.X was calibrated against N human preferences with Y% agreement at 13+pt score gaps."

  • STANDARDS: Quarterly RFC cycle, public. RFC-0002 (Anti-Platitude) shipped privately at B1255. Future RFCs go through 7-day public comment via `/rfc`.

  • STANDARDS: Academic paper route: submit to ICCC 2027 or ISMIR. The rubric + anti-inflation discipline + calibration corpus = a real paper.

  • STANDARDS: Lyric Scoring Foundation prep: when corpus reaches 10K+ entries and 3+ third-party citations, file the non-profit paperwork.

  • DISTRIBUTION: Co-branded proposal to Suno + Udio: API tier, integration docs, OG-cards on shared songs that show our seal.

  • DISTRIBUTION: "Powered by Lyric Scoring Standard" badge: SVG + HTML snippet at `/developer/embed`.

  • DISTRIBUTION: B2B sales motion: music supervisors, sync licensors, music-rights orgs. Reproducibility seal IS the legal-grade evidence they need.

  • DISTRIBUTION: Zapier/Make integration on top of the existing webhook outbound (B1274).

  • VELOCITY: `/engineering/punch-list` public page that updates from `docs/PUNCH-LIST.md`. Recruiting tool.

  • VELOCITY: Codify the Deep Pass Protocol cadence: 4x/year, one major subsystem per quarter.

  • VELOCITY: Open-source `@songforgeai/quality-tools`: Forbidden Archive parser, audit scripts, ratchet checkers.

  • VELOCITY: Quarterly "X builds shipped, Y ratchets tightened, Z punch list items closed" public report.
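
The calibration-report sentence in the list above ("N human preferences with Y% agreement at 13+pt score gaps") implies a simple computation: keep the pairs whose rubric scores differ by at least the gap, then count how often the higher-scored song is the one the human preferred. A sketch under that interpretation. The PairwiseJudgment shape and the agreementAtGap name are hypothetical; the real corpus format is not described on this page.

```typescript
// Hypothetical record: rubric scores for two songs and which one a human rater preferred.
interface PairwiseJudgment {
  scoreA: number;
  scoreB: number;
  humanPreferred: "A" | "B";
}

interface AgreementReport {
  n: number; // pairs that met the minimum score gap
  agreementPct: number; // share of those pairs where rubric and human agree
}

// Returns null when no pair meets the gap, so callers never report a percentage of nothing.
// minGap should be positive; at a gap of 0, tied scores would count as favouring "B".
function agreementAtGap(
  judgments: readonly PairwiseJudgment[],
  minGap: number,
): AgreementReport | null {
  const eligible = judgments.filter((j) => Math.abs(j.scoreA - j.scoreB) >= minGap);
  if (eligible.length === 0) return null;
  const agreed = eligible.filter(
    (j) => (j.scoreA > j.scoreB ? "A" : "B") === j.humanPreferred,
  ).length;
  return { n: eligible.length, agreementPct: (100 * agreed) / eligible.length };
}
```

Calling agreementAtGap(judgments, 13) would yield the N and Y for the quoted sentence. Returning null for an empty sample is a design choice to avoid publishing a misleading 0% or 100%.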

Under consideration

  • Mobile-first forge UX

    Current flow is responsive but desktop-first.

  • Songwriter-craft certification program

    Pass X metrics → get a badge. B2B-adjacent and community-building.

  • Cluster-pillar SEO reorganization

    33 songwriting guides regrouped into hub-and-spoke clusters. Current structure is flat.

Why this page exists

Most SaaS roadmaps are marketing artifacts — aspirational lists that never get updated. This one is the opposite. If something is on the shipped list, there is a commit behind it. If it moves between sections, the commit message says so.

A publicly-tracked roadmap is accountability, not a promise. If you want to lobby for something near the top of “Next up,” email me.