Why We Publish Our Banned Clichés List
Most AI lyric tools keep a list of words they filter from their output. None that we know of publish it. We publish ours under CC BY 4.0. Here is the case for transparency over secrecy, and why the discipline matters more than the specific terms.
The list is the easy part
SongForgeAI maintains a list of 303 terms across three enforcement tiers: hard-banned (always rewritten), defense-required (must justify in the writing-room notes), and watch-listed (overuse tracked). Eight terms make up the most-recognizable AI-cliché core: neon, echo, shatter, tapestry, kaleidoscope, weave, whisper-of, shadow-of. The other 295 are the rest of the AI-default vocabulary plus genre-specific add-ons.
Building the list took ~40 hours of corpus analysis over the first six months. Updating it is ongoing. None of that is the hard part.
The hard part is the discipline that operationalizes the list — the post-generation scanner that auto-rewrites hits, the cold reader that catches the cliché-adjacent register, the writing-room prompt that frames the list as non-negotiable. The list without the discipline is just a hope; the discipline without a published list is just a black box.
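To make the three tiers concrete, here is a minimal Python sketch of how a tiered list might be modeled. It is not our production code. The names Tier, Entry, SAMPLE, and action_for are hypothetical, the tier assigned to each sample term is invented for illustration, and the "flag" outcome for an unjustified defense-required term is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    HARD_BANNED = "hard-banned"            # always rewritten
    DEFENSE_REQUIRED = "defense-required"  # must be justified in the writing-room notes
    WATCH_LISTED = "watch-listed"          # overuse is tracked


@dataclass(frozen=True)
class Entry:
    term: str
    tier: Tier


# Hypothetical sample. The terms come from the core eight, but the tier
# assignments below are invented for illustration only.
SAMPLE = [
    Entry("tapestry", Tier.HARD_BANNED),
    Entry("neon", Tier.DEFENSE_REQUIRED),
    Entry("echo", Tier.WATCH_LISTED),
]


def action_for(entry: Entry, has_justification: bool = False) -> str:
    """Return what a scanner might do with one hit on this entry."""
    if entry.tier is Tier.HARD_BANNED:
        return "rewrite"
    if entry.tier is Tier.DEFENSE_REQUIRED:
        # Assumption: an unjustified use is flagged rather than silently rewritten.
        return "keep" if has_justification else "flag"
    return "count"  # watch-listed: tally the use, do not block it
```

The point of the sketch is that the data is trivial: a term and a tier. Everything that matters happens in whatever calls action_for and acts on the result.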
Why publish at all?
Most AI tools that filter content keep their lists private. Three usual reasons:
- "Competitive advantage." The list is treated as IP. Publishing erodes the moat.
- "Adversarial robustness." A published list helps anyone trying to bypass the filter. Hidden lists are slightly harder to game.
- "Looks bad to admit we filter anything." The marketing copy says "AI excellence" not "we filter 303 specific phrases out of every output."
We reject all three.
The first reason is empty: the list is not the moat. Discipline is. Two tools can have the same list and produce wildly different output. SongForgeAI's moat is the five-layer enforcement, not the spreadsheet of terms.
The second reason is real but irrelevant to our use case. We're not filtering hostile inputs; we're filtering AI's own default vocabulary, which the model produces unprompted. Adversarial gaming would mean a customer trying to get a banned cliché into their delivered song — for which the answer is "they can write it themselves; we won't write it for them."
The third reason is the one we most strongly reject. "We filter the AI's default cliché vocabulary so your delivered song doesn't read like template output" IS the value proposition. Hiding the discipline that produces the value is upside-down.
What publishing it gets us
Three concrete benefits:
- Trust through transparency. A buyer evaluating us can read the list before paying. They see exactly what we'll filter out of their song. The public claim ("we filter AI clichés") and the internal practice ("here are the 303 specific terms") match. With no gap between the two, there is no broken promise waiting to be discovered.
- Category leadership. Other AI music tools have lists too. Publishing ours under CC BY 4.0 means anyone can fork and extend the discipline. If two more tools adopt the same list, we're a small de facto standard. If a journalist or AI-ethics researcher cites it, we're a citation source. The Lyric Scoring Standard (B884) and the Voice-Reference Discipline (B2672) are companion artifacts under the same license; the three together start to read as a movement.
- Discipline ratchet. Once the list is public, drift is detectable. Anyone can re-fetch the file and diff it against last quarter. The act of publishing forces us to keep the list curated; the act of curating is the discipline.
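The drift check described in the last point can be sketched in a few lines of Python. This assumes each snapshot is plain text with one term per line, which may not match the published file's actual format; diff_snapshots and the sample snapshots are hypothetical.

```python
def diff_snapshots(old_text: str, new_text: str) -> dict[str, list[str]]:
    """Compare two snapshots of the list and report added and removed terms."""

    def parse(text: str) -> set[str]:
        # Assumption: one term per line; blank lines are ignored.
        return {line.strip().lower() for line in text.splitlines() if line.strip()}

    old, new = parse(old_text), parse(new_text)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


# Invented snapshots for illustration.
last_quarter = "neon\necho\ntapestry\n"
today = "neon\ntapestry\nkaleidoscope\n"

print(diff_snapshots(last_quarter, today))
# {'added': ['kaleidoscope'], 'removed': ['echo']}
```

Anyone who saved a copy last quarter can run a check like this against the current file, which is what makes quiet drift detectable.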
What you can do with it
If you build a lyric tool, an AI music product, or a creative-AI tool of any kind, you're welcome to fork the list under CC BY 4.0. A few use cases we've seen or anticipate:
- Embed it in your generation prompt. Drop the 303 terms into your system prompt as "phrases to avoid." The discipline migrates with the list.
- Post-generation scan. Run a word-boundary scan of your delivered output against the list. Flag hits for human review or auto-rewrite.
- Audit your existing corpus. If you've shipped songs, scan them against the list and see how many of your "production" songs have a banned cliché in them. The number is usually surprising.
- Contribute back. If you find a phrase that's missing or should be added, email us. The list is curated, not crowd-sourced, but we read every suggestion.
Attribution: CC BY 4.0 requires credit, and a link back to /standards/banned-cliches is the form we prefer. The license makes it mandatory; the gesture is what we care about.
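The first three use cases above (prompt embedding, post-generation scan, corpus audit) can be sketched together in Python. Treat this as a starting point under stated assumptions, not our scanner: TERMS is a small subset of the core terms, the function names are hypothetical, and the match is exact-form only, so "echoes" does not count as a hit on "echo". How the hyphenated entries (whisper-of, shadow-of) should match is left out because it depends on the published file's conventions.

```python
import re
from collections import Counter

# Subset for illustration; a real run would load all 303 terms.
TERMS = ["neon", "echo", "shatter", "tapestry", "kaleidoscope", "weave"]


def build_avoid_clause(terms):
    """Format the terms as a 'phrases to avoid' line for a system prompt."""
    return "Phrases to avoid: " + ", ".join(terms) + "."


def compile_scanner(terms):
    """Build one case-insensitive pattern with word boundaries around each term."""
    # Longest first, so a longer phrase wins over a shorter term it contains.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def scan(lyric, pattern):
    """Return (line_number, matched_term) for every hit; lines are 1-indexed."""
    hits = []
    for n, line in enumerate(lyric.splitlines(), start=1):
        for m in pattern.finditer(line):
            hits.append((n, m.group(0).lower()))
    return hits


def audit(corpus, pattern):
    """Count how many songs in a {title: lyric} corpus contain each term."""
    counts = Counter()
    for lyric in corpus.values():
        # A set, so each song counts at most once per term.
        counts.update({term for _, term in scan(lyric, pattern)})
    return counts


pattern = compile_scanner(TERMS)
lyric = (
    "Neon lights on the avenue\n"
    "Your voice echoes in the hall\n"
    "A tapestry of neon dreams"
)
print(scan(lyric, pattern))
# [(1, 'neon'), (3, 'tapestry'), (3, 'neon')]
```

Whether a hit goes to human review or to an auto-rewrite is a policy decision layered on top of scan. Catching inflected forms such as "echoes" or "weaving" would need stemming or explicit variants in the list.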
The deeper point
Publishing the list is not a marketing move. It's a discipline statement: we believe the standards we operate under should be inspectable. Three of them are now published:
- The Lyric Scoring Standard v1.2 — the 12-metric rubric we score every lyric against.
- The Banned Clichés List — the 303 terms we filter.
- The Voice-Reference Discipline — how we use named artists as craft references without imitating them.
All three are CC BY 4.0. Fork any of them. Build a lyric tool that hews to the same craft bar. The category gets better and SongForgeAI doesn't lose, because the moat is the discipline that runs downstream of the list, not the list itself.
Most AI tools sell black boxes. We sell discipline. The list is the difference made visible.