
Why AI-generated content fails (and the three fixes that work)

Ryan Walker · 7 min read · Updated June 15, 2026

AI-generated content fails in predictable ways. The failures are not random — they are structural. There are three failure modes: the vague brief, no brand voice constraint, and no evaluation step. All three are system problems. The model is not the issue. The system around it is.

Failure mode 1: the vague brief

When you give a model a topic instead of a claim, it produces the average of everything written on that topic. That average is generic, hedged, and forgettable — because the training data is generic, hedged, and forgettable.

“Write about AI for solopreneurs” produces a listicle that could have been written by anyone. “Write a post arguing that solopreneurs have a structural advantage in AI adoption over enterprises” produces a specific argument with a defensible position.

The brief is the input. The specificity of the output is determined by the specificity of the input. This is not a limitation of any particular model; it follows from how language models work. The model interpolates between what it has seen, so a vague prompt lands in the center of a crowded distribution.

Fix 1: the specific claim brief

Every brief must contain four things:

  1. A declarative claim — not a topic, a position. Something that could be wrong.
  2. The audience in one sentence — who they are, what they already know, what they care about.
  3. Two to three specific pieces of evidence or examples — numbers, named cases, mechanisms.
  4. The CTA — what you want the reader to do after reading.

Vague brief: “Write a blog post about AI content tools for marketers.”

Specific brief: “Write a Field Notes post arguing that AI content tools fail marketers not because the models are weak but because the briefs are vague. Audience: growth marketers who have tried AI content and been disappointed. Evidence: (1) a vague prompt produces the statistical average of training data, (2) a claim-first prompt produces a specific argument, (3) without a critic prompt, quality is never enforced. CTA: download our brief template.”

The second brief produces a post that sounds like it was written by someone with a point of view. The first produces filler.
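One way to make the four-part brief hard to skip is to treat it as a data structure rather than a free-text field. The sketch below is a minimal illustration, not our production template: the `Brief` class, its field names, and the `problems` and `to_prompt` helpers are all hypothetical names chosen for this example. The sample values come from the specific brief above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brief:
    """Hypothetical container for the four required parts of a brief."""
    claim: str                 # a position that could be wrong, not a topic
    audience: str              # one sentence
    evidence: tuple[str, ...]  # two to three specific points
    cta: str                   # what the reader should do next

    def problems(self) -> list[str]:
        """Return the reasons this brief is incomplete (empty list = usable)."""
        found = []
        if not self.claim.strip():
            found.append("missing claim")
        if not self.audience.strip():
            found.append("missing audience")
        if not 2 <= len(self.evidence) <= 3:
            found.append("need two to three evidence points")
        if not self.cta.strip():
            found.append("missing CTA")
        return found

    def to_prompt(self) -> str:
        """Render the brief as the text handed to the model."""
        evidence = "\n".join(f"({i}) {e}" for i, e in enumerate(self.evidence, 1))
        return (
            f"Write a post arguing that {self.claim}\n"
            f"Audience: {self.audience}\n"
            f"Evidence:\n{evidence}\n"
            f"CTA: {self.cta}"
        )


brief = Brief(
    claim="AI content tools fail marketers because the briefs are vague, "
          "not because the models are weak.",
    audience="Growth marketers who have tried AI content and been disappointed.",
    evidence=(
        "a vague prompt produces the statistical average of training data",
        "a claim-first prompt produces a specific argument",
        "without a critic prompt, quality is never enforced",
    ),
    cta="Download our brief template.",
)
vague = Brief(claim="", audience="marketers", evidence=(), cta="")

print(brief.problems())  # empty list: all four parts are present
print(vague.problems())  # lists what the vague brief is missing
```

The check only confirms that the four parts exist. It cannot tell whether the claim is actually falsifiable; that judgment still belongs to the person writing the brief.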

Failure mode 2: no brand voice constraint

Without a brand voice block, the model defaults to the statistical center of its training data. That center is corporate, hedged, and vague. The output sounds like a press release because that is what most of the internet sounds like.

The model is not wrong. It is unconstrained. It is doing exactly what it was trained to do: predict the most probable next token given the input. If the input contains no voice signal, the output reflects the average voice of the web.

This is why AI content sounds like AI content. Not because the model is bad at writing — because no one told it how you write.

Fix 2: the brand voice block

A brand voice block is a short block, about 100 words, placed at the top of every prompt. It contains:

  • Voice in three adjectives (e.g., direct, dry, practitioner)
  • Audience in one sentence
  • Five banned phrases (e.g., “unlock”, “game-changing”, “seamless”, “in today’s fast-paced world”, “imagine a world”)
  • Three things you always do (e.g., lead with the claim, use specific numbers, explain the mechanism)
  • Two examples of your best writing — actual sentences or paragraphs

The examples are the most important part. The model pattern-matches to examples more reliably than it follows abstract instructions. “Be direct” is an instruction. A direct sentence is a demonstration. Demonstrations win.

Paste the same block into every prompt. Treat it as infrastructure, not decoration.
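"Infrastructure" can be taken literally: define the block once and prepend it in code, so nobody has to remember to paste it. The following is a rough sketch under that assumption. `VoiceBlock`, `render`, and `with_voice` are hypothetical names, the layout of the rendered text is one option among many, and the two example sentences are placeholders lifted from this post rather than a recommended set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceBlock:
    """Hypothetical container for the five parts of a brand voice block."""
    adjectives: tuple[str, ...]      # three
    audience: str                    # one sentence
    banned_phrases: tuple[str, ...]  # five
    always_do: tuple[str, ...]       # three
    examples: tuple[str, ...]        # two samples of your own writing

    def render(self) -> str:
        """Render the block as plain text for the top of a prompt."""
        lines = [
            "BRAND VOICE",
            "Voice: " + ", ".join(self.adjectives),
            "Audience: " + self.audience,
            "Never use: " + "; ".join(self.banned_phrases),
            "Always: " + "; ".join(self.always_do),
            "Write like these examples:",
        ]
        lines += [f'- "{example}"' for example in self.examples]
        return "\n".join(lines)


def with_voice(voice: VoiceBlock, task: str) -> str:
    """Put the same voice block at the top of every prompt."""
    return voice.render() + "\n\n" + task


VOICE = VoiceBlock(
    adjectives=("direct", "dry", "practitioner"),
    audience="Growth marketers who have tried AI content and been disappointed.",
    banned_phrases=(
        "unlock", "game-changing", "seamless",
        "in today's fast-paced world", "imagine a world",
    ),
    always_do=("lead with the claim", "use specific numbers", "explain the mechanism"),
    examples=(
        "The model is not wrong. It is unconstrained.",
        "Treat it as infrastructure, not decoration.",
    ),
)

prompt = with_voice(VOICE, "Write the post described in the brief below.")
print(prompt)
```

Because every prompt goes through `with_voice`, changing the voice means editing one object instead of hunting through saved prompts.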

Failure mode 3: no evaluation step

You read the output. It seems fine. You ship it. Three weeks later you notice the content sounds nothing like you and is not getting cited by AI engines or linked by anyone.

The evaluation step was missing. Without it, the standard is never enforced. You are relying on your own read — which is fast, optimistic, and subject to the same anchoring bias that makes you think a draft is better than it is because you wrote the brief.

One pass of human review is not an evaluation step. It is a vibe check. Vibe checks do not catch brand voice drift, missing evidence, or GEO (generative engine optimization) structure failures.

Fix 3: the critic prompt

A critic prompt is a second AI call. It takes the generated output and scores it against five criteria:

  1. Brand voice match — does it sound like the voice block examples?
  2. Specific claim present — is there a falsifiable argument in the first two sentences?
  3. Evidence cited — are there numbers, named examples, or mechanisms?
  4. GEO structure — does it answer first, support second? Is there an FAQ block?
  5. No banned phrases — run a literal string match against your banned list.

The critic returns pass or fail with reasons. If it fails any criterion, the output goes back for revision before it reaches a human. Run it before every publish.

This is not optional polish. It is the enforcement mechanism. Without it, your brand voice block and your specific claim brief are aspirational, not operational.
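A sketch of the gate, with two assumptions stated up front. First, the banned-phrase criterion is checked in plain code, since a literal string match does not need a model. Second, the other four criteria are delegated to a `judge` callable that stands in for the second AI call; its signature is hypothetical, and the `stub_judge` below returns canned results instead of calling any real API. `Verdict`, `banned_hits`, and `critique` are likewise names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable

BANNED = (
    "unlock", "game-changing", "seamless",
    "in today's fast-paced world", "imagine a world",
)

# Hypothetical judge signature: draft in, {criterion: (passed, reason)} out.
Judge = Callable[[str], dict[str, tuple[bool, str]]]


@dataclass(frozen=True)
class Verdict:
    passed: bool
    reasons: list[str]


def _normalize(text: str) -> str:
    # Case-insensitive, and treat curly apostrophes like straight ones.
    return text.casefold().replace("\u2019", "'")


def banned_hits(draft: str, banned: tuple[str, ...] = BANNED) -> list[str]:
    """Criterion 5: literal substring match against the banned list."""
    haystack = _normalize(draft)
    return [phrase for phrase in banned if _normalize(phrase) in haystack]


def critique(draft: str, judge: Judge) -> Verdict:
    """Combine the string match with the judge's results into pass/fail."""
    reasons = [f"banned phrase: {phrase!r}" for phrase in banned_hits(draft)]
    for criterion, (ok, reason) in judge(draft).items():
        if not ok:
            reasons.append(f"{criterion}: {reason}")
    return Verdict(passed=not reasons, reasons=reasons)


def stub_judge(draft: str) -> dict[str, tuple[bool, str]]:
    # Stand-in for the second model call; a real judge would prompt a model.
    return {
        "brand voice match": (True, ""),
        "specific claim present": (True, ""),
        "evidence cited": (False, "no numbers or named examples"),
        "GEO structure": (True, ""),
    }


verdict = critique("Unlock seamless growth with AI.", stub_judge)
print(verdict.passed)
for reason in verdict.reasons:
    print(reason)
```

Note that a substring match is deliberately blunt: "unlock" also flags "unlocked". If that produces too many false alarms, match on word boundaries instead.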

The system view

The full pipeline is four steps: brief → generate → evaluate → publish.

All three fixes live in two of those steps — brief and evaluate. The generate step (the model) is not the problem. It never was. The model does what the system tells it to do. If the system has no specific input and no evaluation gate, the output will be generic and unreviewed.
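The four steps can be wired together in a few lines. This is an illustrative sketch, not a definitive implementation: `run_pipeline` is a hypothetical function, `generate` and `evaluate` are injected callables standing in for the model call and the critic, and the `max_revisions` cap is an assumption added here so a failing draft cannot loop forever.

```python
from typing import Callable, Optional


def run_pipeline(
    prompt: str,
    generate: Callable[[str], str],
    evaluate: Callable[[str], list[str]],
    max_revisions: int = 2,
) -> tuple[Optional[str], list[str]]:
    """brief -> generate -> evaluate -> publish, with a bounded revision loop.

    Returns (draft, []) when a draft passes, or (None, reasons) when every
    attempt failed and a human should revisit the brief.
    """
    reasons: list[str] = []
    for _ in range(max_revisions + 1):
        feedback = ""
        if reasons:
            # Feed the critic's reasons back into the next generation attempt.
            feedback = "\n\nFix these problems:\n" + "\n".join(f"- {r}" for r in reasons)
        draft = generate(prompt + feedback)
        reasons = evaluate(draft)
        if not reasons:
            return draft, []
    return None, reasons


def fake_generate(prompt: str) -> str:
    # Stand-in for the model call: improves only once it sees feedback.
    if "Fix these problems" in prompt:
        return "Claim, with evidence: 3 named cases."
    return "Seamless synergy."


def fake_evaluate(draft: str) -> list[str]:
    # Stand-in for the critic: fails any draft containing a banned word.
    return ["banned phrase: 'seamless'"] if "seamless" in draft.casefold() else []


published, problems = run_pipeline("Write the post.", fake_generate, fake_evaluate)
print(published)
print(problems)
```

All the logic here sits around the `generate` call, not inside it. Swapping in a different model changes one argument; the gate stays the same.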

Most teams invest in the generate step: better models, more tokens, fine-tuning. The leverage is in the steps they skip.

Fix the brief. Add the voice block. Add the critic. The model will do the rest.

We send our brief template, brand voice block, and critic prompt to Field Notes subscribers. Get them at avakata.agency/contact.html.

If you want to walk through how this pipeline applies to your content operation, book a discovery call. We will look at your current briefs and tell you exactly where the system is breaking down.

Frequently asked questions

Why does AI-generated content sound generic?

Three reasons: a vague brief (the model produces the average of everything written on the topic), no brand voice constraint (the model defaults to corporate, hedged language), and no evaluation step (the standard is never enforced before publishing). All three are system problems, not model problems. Fix the brief, add a brand voice block, and add a critic prompt.

How do I write a better AI content brief?

Replace the topic with a declarative claim. Instead of “Write about AI for solopreneurs,” write “Write a post arguing that solopreneurs have a structural advantage in AI adoption over enterprises because they deploy faster and govern more tightly.” Add the audience in one sentence, two to three specific evidence points, and the CTA. Specificity in equals specificity out.

What is a critic prompt and how does it fix AI content?

A critic prompt is a second AI call that takes the output of the first call and scores it against defined criteria: brand voice match, specific claim present, evidence cited, GEO structure, no banned phrases. It returns pass or fail with reasons. Running it before every publish enforces your standard consistently, which a human spot-check does not do at scale.
