Category — Behind the Scenes

Behind the Scenes posts

30 posts tagged behind the scenes.

How a Concept Album Holds Together: Inside 'The Grief Is Smaller Than the Room'

A track-by-track look at how a 12-song concept album becomes one record instead of a playlist: a palette that withholds then opens, a recurring wordless hum, and an afterlife set in a laundromat.

Behind the Scenes · 1 week ago · 11 min read

How We Caught the Chorus-Thesis-Line Bug in 10 Builds

A multi-AI craft critique of three real SongForgeAI songs surfaced the same failure mode in all of them: the system observes brilliantly in verses, then thesis-summarizes in choruses. Here is how we operationalized the diagnosis into 7 audit primitives, one forge-prompt rule, and a falsifiable empirical baseline — all in 72 hours.

Behind the Scenes · 2 weeks ago · 6 min read

The fix-side evidence rule: don't ship a 'this fixes X' change without measuring X

Three builds in a row, each shipping a fix that depended on the previous build's inference being correct. We did this. We named it. Here is the rule we now live by.

Behind the Scenes · May 10 · 13 min read

Cross-language rubric bias: our Italian songs scored higher than our English ones — here’s why

We ran identical briefs through the SongForgeAI rubric in seven languages. Italian and Spanish songs averaged 3.4 points higher than English. Here is the bias audit, the cause, and what we shipped.

Behind the Scenes · May 6 · 6 min read

We just audited every button on the site. Here’s what we found.

Five parallel agents, eighty pages, every clickable element traced to its destination. Three production 404s, one dead-code conditional, a Stripe portal landing on the wrong tab. Here is the receipt.

Behind the Scenes · May 4 · 5 min read

How a CI ratchet replaces a code review (and why it matters to you)

A “ratchet” is a CI gate that only moves one direction: better. We have 38 of them. This week one of them hit zero. Here is what that means for the tool you use.

Behind the Scenes · May 3 · 7 min read

We found a 6.2-point bias in our own rubric. Here’s how we’re fixing it.

A 200-song operational audit revealed the published Lyric Scoring Standard systematically inflates scores for non-English texts. We’re publishing the finding, the methodology, and the fix plan before patching, because that’s what a published standard owes its users.

Behind the Scenes · May 2 · 14 min read

The State of AI Lyrics, 2026

Annual flagship report. Twelve months of AI-generated lyrics: what improved, what regressed, where the rubric pushed back, and what 2027 looks like if the operator class learns to read scoring rubrics.

Behind the Scenes · May 1 · 7 min read

We claimed signed seals for 390 builds. They weren’t actually signed.

A real engineering postmortem. The cryptographic seal infrastructure shipped at Build 1431. The env var that activated it was set in Build 1817 — 390 builds and six months later. Here is what we found, what it cost, and what we changed so the class of bug can’t hide again.

Behind the Scenes · Apr 30 · 6 min read

The Hank Williams Test — why our scoring rubric is calibrated against a country song from 1949

Most AI scoring is calibrated against other AI output. The Lyric Scoring Standard anchors at 95 against Hank Williams' "I'm So Lonesome I Could Cry." Here is why that choice is load-bearing.

Behind the Scenes · Apr 30 · 6 min read

How we measure chorus compression — and why it matters more than emotion.

External reviewers kept flagging the same gap: choruses that are emotionally correct but musically forgettable. So we built an analyzer that measures the structural compression that makes a chorus chant-able. Here is how it works, what it doesn’t do, and why we shipped instrumentation before scoring.

Behind the Scenes · Apr 27 · 7 min read

How we shipped Italian opera + Gregorian chant in six builds.

One operator question ("I don’t see how to do a Gregorian chant") exposed a bug class: capabilities registered in code with no user click-path. Six builds later: two new ghost voices, six new genres, a pure-mode prompt path, and two CI ratchets that prevent the bug class from shipping again.

Behind the Scenes · Apr 27 · 11 min read

We rewrote the forge as a state machine. The cutover took five days.

A two-pipeline migration that most teams take a quarter to complete shipped in five builds. Not because it was easy — because three years of strangler-fig discipline factored V1 into reusable modules years before V2 needed them. A retrospective on the build sequence, the 3:1 reuse ratio, and the architectural debt we deliberately did not pay.

Behind the Scenes · Apr 26 · 9 min read

How a single missing config flag silently disabled our error tracking for 24 builds

A debug post-mortem. Sentry was installed, configured, and visibly invoked from server code — but no events ever reached the dashboard. The bug was three layered failures, each one masked by the next. Here is the trail and the durable lessons.

Behind the Scenes · Apr 24 · 10 min read

Anatomy of a Forge: one song from prompt to final score

A full walkthrough of what happens between the moment you type a prompt and the moment a finished song shows up on the page. Seven internal phases, two scoring runs, one real example traced from "a heartbreak on a Tuesday" to a 78-composite country ballad.

Behind the Scenes · Apr 24 · 9 min read

How I cut 800 lines from a 2,800-line React component

Twenty-five extraction passes, six hooks, eight helper modules, one monolith that almost ate me. Field notes from a refactor that actually shipped.

Behind the Scenes · Apr 24 · 6 min read

Why the default score is 50, not 75

Most AI evaluators grade like a flattering tutor. Ours starts every song at 50 and makes the lyric earn its way up. The Gravity Rule, explained.

Behind the Scenes · Apr 15 · 7 min read

Case Study: From Pretty Nature Poetry to a Song You Can Feel in Your Lungs

A meditative mountain hymn full of beautiful abstractions went through SongForgeAI. It came back with ice in its beard, shallow breath at 12,000 feet, and a line about the difference between rushing and coming home.

Behind the Scenes · Apr 7 · 7 min read

Case Study: Every Rain Song Sounds the Same. This One Doesn't.

We wrote a rain-as-grief breakup ballad with ChatGPT. It was atmospheric and completely generic. SongForgeAI put a man in a truck behind a diner and made the rain real.

Behind the Scenes · Mar 30 · 7 min read

Case Study: From Generic Christmas Hymn to a Song That Made Us Cry

We wrote a Christmas worship song with ChatGPT. It was doctrinally correct and completely forgettable. Then SongForgeAI turned it into a testimony about a recovering father reading the nativity to his daughter.

Behind the Scenes · Mar 22 · 7 min read

Case Study: How a Purple-Prose Love Ballad Became Something Real

We wrote a romantic ballad with ChatGPT last year — overflowing with velvet shrouds, lighthouse beacons, and shattered petals. Then we ran it through SongForgeAI. The transformation was dramatic.

Behind the Scenes · Mar 14 · 6 min read

From ChatGPT Protest Song to Working-Class Anthem: A Before/After Case Study

We wrote an anti-war protest song with ChatGPT last year. Then we ran it through our own system. The result: every cliché replaced, every character named, and a completely different emotional register.

Behind the Scenes · Feb 18 · 5 min read

I Scored My Own Lyrics and Got a 54. Here Is What I Learned.

A songwriter puts their best work through the 12-metric scoring system. The results are humbling, specific, and immediately useful.

Behind the Scenes · Feb 14 · 7 min read

Case Study: The Punk Song That Refused to Stop Being Polite

A punk prompt produced a Hallmark card. Read the before, the panel critique, and the after — the narrator with teeth, the specific call-center detail that broke the song open, and why aggressive genres need a grammar the AI default sands down.

Behind the Scenes · Feb 2 · 4 min read

The 87 Words We Banned (And Why Your Lyrics Are Better Without Them)

SongForgeAI scans every generated lyric for 87 specific words and phrases. They are not offensive. They are worse — they are boring.

Behind the Scenes · Jan 17 · 5 min read

What Happens When You Score the Same Song Twice

We ran the same lyrics through the scoring engine 20 times. Here is what we learned about consistency, variance, and what the numbers actually mean.

Behind the Scenes · Dec 24, 2025 · 5 min read

What a 90+ Score Actually Looks Like (And Why Most Songs Don't Get There)

We built the scoring system to be hard on purpose. Here is what separates a strong 80 from a rare 90.

Behind the Scenes · Dec 12, 2025 · 8 min read

Meet the Antagonist: The Voice We Built to Drop Your Score

Self-evaluating AI converges on polite consensus. We built an adversarial voice into the scoring panel whose only job is finding what is wrong. Here is why it works, why users hate it at first, and why we kept it anyway.

Behind the Scenes · Nov 30, 2025 · 5 min read

Anatomy of a Forge: What Happens in the 3 Minutes After You Hit the Button

From prompt to finished lyric package — here is every step that runs inside the SongForgeAI pipeline, and why each one matters.

Behind the Scenes · Nov 14, 2025 · 5 min read

How the 12-Metric Scoring System Works

Every song gets evaluated across Craft, Expression, and Impact. Here is what each metric measures and why the scores are deliberately hard.