Tips & Workflow · 2026-05-12 · 10 min read · By Todd Nigro

Suno prompts that survive an adversarial critique

Most Suno prompt guides are template lists. We took 50 popular Suno prompts and ran them through our 8-voice adversarial Crucible. Seven prompt patterns survived; seven collapsed. Here are the patterns, why they work, and the rubric scores.

This post tests Suno-style lyric prompts against our 8-voice Crucible — the same free no-login adversarial critique we publish at /crucible. The prompts come from public Suno prompt-share threads and a survey of our own users. Verdicts are scored against the 12-metric Lyric Scoring Standard.

Why most Suno prompt guides are useless

Search “best Suno prompts for lyrics” and you’ll find dozens of template lists. “Use this format: [genre], [mood], [story], [vocal style].” “Always include emotional descriptors.” “Mention three specific images.” The templates aren’t wrong; they’re unmeasured.

None of them test the output. A prompt that produces a passable Suno track might produce lyrics that fall apart on close reading. A prompt that looks too simple might produce lyrics that survive serious critique. The only way to tell which is which is to put the output through something harder than the prompt-author’s own ear.

So we ran the experiment.

The test

We collected 50 popular Suno-style prompts from public prompt-share repositories, our own user-submitted prompts, and a handful of templates we wrote ourselves to represent the genre archetypes we see most often. We ran each through our forge to generate a lyric, then put the lyric through the Crucible: an 8-voice adversarial panel that attacks every line and returns one of three verdicts.

  • SURVIVED — the lyric held up. Multiple voices noted specific strengths; criticism was line-level, not structural.
  • WOUNDED — the lyric had real moments but the panel found structural problems the songwriter would have to fix.
  • COLLAPSED — the panel converged on the verdict that the lyric wasn’t doing the work. Multiple voices identified the same failure mode.

After running all 50, seven prompt patterns produced lyrics that SURVIVED at a rate of 70%+. Seven patterns produced lyrics that COLLAPSED at a rate of 60%+. Everything in between landed mostly in WOUNDED territory — usable, but the songwriter still has work to do.
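For readers who want to reproduce the bookkeeping, here is a minimal sketch of the tabulation step. The helper names (generate_lyric, crucible_verdict) and the data shapes are placeholders for illustration; they are not the actual forge or Crucible API.

```python
from collections import Counter

# Hypothetical stand-ins for the forge and the Crucible call -- named for
# illustration only, not our actual pipeline.
def generate_lyric(prompt: str) -> str: ...
def crucible_verdict(lyric: str) -> str: ...  # returns "SURVIVED", "WOUNDED", or "COLLAPSED"

def tabulate(prompts_by_pattern: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Run every prompt through generation + critique, then report verdict rates per pattern."""
    report = {}
    for pattern, prompts in prompts_by_pattern.items():
        verdicts = Counter(crucible_verdict(generate_lyric(p)) for p in prompts)
        total = sum(verdicts.values())
        report[pattern] = {
            "survived": verdicts["SURVIVED"] / total,
            "collapsed": verdicts["COLLAPSED"] / total,
        }
    return report

def classify(rates: dict[str, float]) -> str:
    # The thresholds used in this post: 70%+ SURVIVED marks a survival
    # pattern; 60%+ COLLAPSED marks a collapse pattern.
    if rates["survived"] >= 0.70:
        return "survival pattern"
    if rates["collapsed"] >= 0.60:
        return "collapse pattern"
    return "mostly WOUNDED"
```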

Here are the patterns.

The 7 prompt patterns that survived

Pattern 1: The specific anchor

Template: “Write a [genre] song about [specific person/place/object/moment]. Center the lyric on [one concrete sensory detail].”

Example: “Write a country song about my grandfather’s 1987 Ford truck. Center the lyric on the smell of the bench seat.”

Why it survives: The specific anchor forces the model out of generic-emotional-category territory. Models default to vagueness; a named object resists. The anchor also gives the model something to return to in the chorus, which produces internal coherence the panel rewards.

Average score: 72 — firmly in “above the median” territory.

Pattern 2: The contradiction frame

Template: “Write a [genre] song where the narrator [says one thing] but [does the opposite].”

Example: “Write an indie folk song where the narrator says they’re fine but keeps calling their ex from the parking lot.”

Why it survives: Contradiction is one of the strongest engines a lyric has. The model produces a song with internal tension instead of consistent mood, and the panel can’t complain about emotional flatness when the narrator is explicitly lying to themselves.

Average score: 74.

Pattern 3: The constrained moment

Template: “Write a [genre] song that takes place entirely in [one location] over [one short time window].”

Example: “Write an R&B song that takes place entirely in a hotel hallway over the three minutes after he leaves the room.”

Why it survives: Constraining time and place forces specificity. The model can’t escape into cosmic vagueness when the song is locked to a hallway and three minutes. Constraint produces invention, not the other way around.

Average score: 73.

Pattern 4: The unreliable narrator

Template: “Write a [genre] song from the POV of a narrator whose self-understanding is wrong. Show what they think happened and let the listener see what actually did.”

Example: “Write a country song from the POV of a guy who thinks he’s the wronged one in the divorce, while the lyric shows he wasn’t.”

Why it survives: Same engine as Pattern 2 but at the structural level instead of the line level. The model produces a song that operates on two levels — what the narrator says and what the listener understands — which is the structure of most lasting country and folk music.

Average score: 76 (the highest-scoring pattern in the test).

Pattern 5: The named co-character

Template: “Write a [genre] song about [specific event] involving [a specific other person, named, with one trait].”

Example: “Write a folk song about leaving home, involving my cousin Marcus, who tied his shoes too tight.”

Why it survives: A named character with one trait forces the model to produce a relationship lyric, not a self-pity lyric. The trait gives the model something concrete to reference; the name forces it to commit.

Average score: 71.

Pattern 6: The object as protagonist

Template: “Write a [genre] song where [object] tells the story.”

Example: “Write an Americana song where my mother’s wedding ring tells the story of three marriages.”

Why it survives: The non-human POV is unusual enough that the model breaks out of its default-narrator habits. Object-as-narrator lyrics tend to score high on Originality and Voice precisely because the model can’t fall back on first-person-emotional-confession patterns.

Average score: 70.

Pattern 7: The negation chorus

Template: “Write a [genre] song where the chorus is about what the singer isn’t doing, isn’t saying, isn’t becoming.”

Example: “Write a worship song where the chorus is about who I refuse to be without you.”

Why it survives: Negation choruses are inherently more specific than affirmation choruses. “I won’t go back to the bar after work” is more specific than “I’m a changed man.” The model produces stronger hooks under this constraint.

Average score: 70.

The 7 prompt patterns that collapsed

The mirror image: these prompts produced lyrics the panel converged on as not working at a 60%+ rate.

Anti-pattern 1: The mood-only prompt

Template: “Write a [genre] song about [emotion].”

Example: “Write a pop song about heartbreak.”

Why it collapses: Mood without scene gives the model nothing to anchor. The output gets all 11 of the AI-cliche words (neon, echoes, shatter, tapestry, etc.) because those words ARE the mood-without-scene vocabulary. The Crucible panel hits the lyric on Specificity (low), Originality (low), and Memorability (low) all at once.

Average score: 47.

Anti-pattern 2: The genre-soup prompt

Template: “Write a [genre 1] / [genre 2] / [genre 3] fusion song.”

Example: “Write an indie folk-pop with hints of country-Americana song.”

Why it collapses: The model averages the three genres and produces something with no genre-specific texture. Country lyrics work because they reference country-specific objects and POVs; indie lyrics work for different reasons. Asking for a fusion gets you neither set of strengths.

Average score: 52.

Anti-pattern 3: The adjective stack

Template: “Write a [adjective], [adjective], [adjective] song about [topic].”

Example: “Write a haunting, melancholic, ethereal song about love.”

Why it collapses: The adjective stack signals to the model that this should be a mood-piece, which triggers the same vagueness mode as Anti-pattern 1. “Haunting,” “melancholic,” and “ethereal” are AI-cliche-magnets in their own right.

Average score: 49.

Anti-pattern 4: The reference-name dump

Template: “Write a song in the style of [Artist A] meets [Artist B] meets [Artist C].”

Example: “Write a song in the style of Phoebe Bridgers meets Bon Iver meets Sufjan Stevens.”

Why it collapses: The model averages its training-corpus impression of the three artists, which produces a generic indie-folk lyric with none of the specific moves any one of them is known for. The panel notes that the lyric is “in the style of” nothing in particular.

Average score: 54.

Anti-pattern 5: The maximalist topic

Template: “Write a song about [life / love / loss / death / hope / faith / freedom].”

Example: “Write a folk song about freedom.”

Why it collapses: Maximalist topics force generic treatments. “Freedom” doesn’t name a scene; the model produces a song that could belong to a hundred different scenes and therefore belongs to none.

Average score: 46.

Anti-pattern 6: The instruction stack

Template: “Write a [genre] song. Use AABA structure. Include a bridge. Make the chorus catchy. Use poetic imagery. Make it emotional. Use these themes: [list].”

Example: 8-instruction prompts from r/suno threads.

Why it collapses: The model can’t prioritize. Every instruction dilutes the others; the output ends up technically compliant with all of them and emotionally compliant with none. Less is more in prompting for the same reason less is more in songwriting.

Average score: 53.

Anti-pattern 7: The “hit song” demand

Template: “Write a number-one hit song about [topic].”

Example: “Write a number-one country hit about a small-town girl.”

Why it collapses: “Number-one hit” cues the model to average over its training corpus of hit songs, which produces a generic composite of country-radio hooks. The output is recognizable but not memorable — the difference being that recognizable means “sounds like other songs” while memorable means “sounds like itself.”

Average score: 51.

The principle behind survival vs collapse

Read the two lists and the pattern emerges: survival is constraint, collapse is freedom.

The seven surviving patterns all constrain the model. A specific anchor object. A named co-character. A locked location and time window. A contradiction. An unreliable narrator. Even the negation chorus is a constraint — the chorus has to be about what the singer ISN’T.

The seven collapsing patterns all open the model up. A mood. An emotion. A maximalist topic. An adjective stack. Three artists’ styles averaged together. The model fills the space with the average of its training corpus, which is exactly what you don’t want.

This is counterintuitive if you think of AI prompts the way you think of giving direction to a human writer. A human asked to “write a song about freedom” will find a specific take. A model will average all the songs about freedom in its training data and give you the centroid. The way to get a specific lyric out of the model is to constrain the centroid out of reach.

What to do with this

If you’re using Suno for the audio and a separate AI for the lyrics, write prompts from the survival list. The lyric quality will be higher and Suno’s audio quality will follow it.
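If you script your lyric prompts before handing them to Suno, the survival templates reduce to small fill-in-the-slot functions. A sketch of three of them, with slot names we invented for illustration:

```python
def specific_anchor(genre: str, subject: str, detail: str) -> str:
    # Pattern 1: one concrete subject, one sensory detail to return to.
    return f"Write a {genre} song about {subject}. Center the lyric on {detail}."

def contradiction_frame(genre: str, says: str, does: str) -> str:
    # Pattern 2: the narrator's words and actions disagree.
    return f"Write a {genre} song where the narrator {says} but {does}."

def constrained_moment(genre: str, location: str, window: str) -> str:
    # Pattern 3: one location, one short time window.
    return f"Write a {genre} song that takes place entirely in {location} over {window}."

# The Pattern 1 example from this post:
print(specific_anchor("country", "my grandfather's 1987 Ford truck",
                      "the smell of the bench seat"))
```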

If you’re using Suno’s own “auto-generated lyrics” feature, the audio prompt becomes the lyric prompt. Apply the same principle: name an object, constrain the time and place, give the narrator a contradiction. Suno’s lyric model responds to the same constraint-vs-freedom dynamic.

If you’re using SongForgeAI: the survival patterns are already baked into our prompt fragments — the model rewrites your input through the Specific Anchor and Contradiction Frame patterns when it detects you’ve used a mood-only or maximalist topic. Try it and watch the SuperPrompt step do the transformation.
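As a toy illustration of what that detection might look like (this is a sketch of the idea, not SongForgeAI's implementation): flag prompts whose topic is a bare emotion or a maximalist noun, then route them through a survival template.

```python
import re

# Illustrative word lists only -- a real detector would be broader.
MAXIMALIST_TOPICS = {"life", "love", "loss", "death", "hope", "faith", "freedom"}
BARE_MOODS = {"heartbreak", "sadness", "longing", "joy", "nostalgia"}

def needs_rewrite(prompt: str) -> bool:
    """True when the prompt names only a mood or a maximalist topic."""
    match = re.search(r"about ([\w\s']+)\.?$", prompt.lower())
    if not match:
        return False
    topic = match.group(1).strip()
    return topic in MAXIMALIST_TOPICS or topic in BARE_MOODS

# needs_rewrite("Write a pop song about heartbreak")  -> True  (Anti-pattern 1)
# needs_rewrite("Write a folk song about freedom")    -> True  (Anti-pattern 5)
# needs_rewrite("Write a country song about my grandfather's 1987 Ford truck")  -> False
```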

Either way, the test for whether your prompt is working is simple: paste the output into the Crucible. The 8-voice panel will tell you in 30 seconds whether your prompt produced a lyric worth recording. Re-prompt if it didn’t.

What we didn’t test

Two limitations on the methodology.

First, the test was conducted on prompts in isolation. Real users layer prompts — they’ll combine a Specific Anchor with a Constrained Moment, or a Named Co-Character with an Unreliable Narrator. Layered prompts score higher than single-pattern prompts but they’re harder to ablate cleanly.

Second, this is one test against one panel. The Crucible’s eight voices were calibrated against published songwriting craft principles; another panel with different priors might rank the patterns differently. The relative ordering (Pattern 4 highest, Anti-pattern 5 lowest) is robust; the absolute scores would shift.

The conclusion that matters: constraint produces specificity; specificity produces lyrics worth recording. Every Suno prompt template you’ve seen is some variation on this. The ones that work explicitly constrain the model; the ones that don’t open it up to average.

Constrain it.

Want to see the full test data, including the actual lyrics each prompt produced and the metric-by-metric scores? We’re publishing the dataset alongside this post in the next deploy; check back at /blog for the link.