You changed one line in a prompt and the output got worse. Or you added a rule and Claude ignored it anyway.

Prompt changes that seem minor produce wildly different output. A rule placed at the end of a prompt behaves differently than the same rule placed at the start. A vague instruction like "add some links" produces different output than a specific one. The gap between what you wrote and what Claude produced is usually structural, not random. Which of these matches what happened to you?

Claude processes instructions in order. A constraint placed after the task instruction often arrives too late. By the time Claude reads "keep titles under 55 characters," the title is already written at 70. This is not a Claude failure. It is a sequencing failure. The styleguide must come before the task, not after. But there is a second problem: even well-ordered prompts fail if the role definition is missing. Which situation fits yours?

Research on professional journalists found that source checking, verifying who said something and whether they are credible, accounts for just 0.9% of all research actions. That is about 80 seconds per workday. Human-written content is already poorly verified. AI-generated content without explicit verification constraints is worse. Wire addresses this through source classification and diversity checks, but only if the prompt enforces them. A rule that says "add some links" produces random links. The constraint has to specify what counts as a source and prevent removal during rewrites. Which part of this is the problem?

The most common rewrite failure: Claude treats "update this page" as permission to replace the page. Existing citations disappear. Sections that took hours to write are gone. The fix is not asking Claude to be more careful. It is adding explicit preservation rules before the task instruction. "Do not remove existing content" and "do not remove existing citations" must appear as constraints, not as hopes. The prompt examples in this section show the exact wording that works versus the wording that fails.

Investigative journalism uses an eight-step chain where each stage builds on verified output from the previous one. Skipping stages produces content that looks professional but contains unchecked claims. Wire's pipeline mirrors this structure: data collection, news gathering, independent evaluation, source classification, styleguide enforcement, auto-fixes, audit, and enrichment. The key insight is that quality is sequential. A single generation step with a good prompt is not equivalent to eight sequential steps with verification between them.

A page created in week one is not the same page in week twelve. Wire's update cycle runs create, then news and refine, then audit and deduplicate, then reword, then enrich. Each step improves the page against new data. No single step produces a finished page. But the order matters: enrichment runs last because it requires knowing what keywords the page already ranks for. Running it first produces content optimized for guesses, not evidence. The compounding only works if the sequence is respected.

Wire tracks seven indicators, each mapped to independent research rather than arbitrary thresholds. Title length maps to a study of 81,000 pages and rewrite risk. H1-title alignment maps to entity extraction behavior documented in a Google API leak. Source diversity maps to a finding about journalists converging on the same sources through Google. The indicators are not opinions about quality. They are proxies for specific failure modes that have been measured externally. If your content scores well on all seven and still underperforms, the problem is elsewhere.

Wire generates content through Claude, but the output quality depends entirely on how the prompts are structured. Small changes in prompt order, wording, and constraints produce dramatically different results. This guide explains the mechanics and shows what works versus what fails.

The principles here draw on empirical journalism research, particularly the LfM-Band 60 study (235 journalists observed, 21,145 coded research actions), the netzwerk recherche training handbook, and findings from the NR-Werkstatt Nr. 18 series on online journalism quality.

Why Prompt Order Matters

Wire assembles every Claude prompt from two layers: the styleguide (site-wide rules) prepended to the action prompt (task-specific instructions). The styleguide always comes first. This order is deliberate.

Claude processes instructions sequentially. Rules stated early in the prompt carry more weight than rules stated later. When a styleguide rule conflicts with an action prompt instruction, the styleguide usually wins, because Claude encountered it first.

Good order:

1. Editorial rules (styleguide) — what Claude must NOT do
2. Role definition — who Claude is for this task
3. Context data — current page, news, search terms
4. Task instructions — what to produce
5. Output format — frontmatter + body structure

Bad order:

1. Task instructions — "rewrite this page"
2. Context data — current page, keywords
3. Rules — "by the way, keep titles under 55 characters"

The bad order produces titles that are 60-70 characters. Claude completes the task before processing the constraint. By the time it reads "keep titles under 55 characters," the title is already written.
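The layered assembly can be sketched as simple string concatenation. This is an illustrative sketch with hypothetical names, not Wire's actual implementation; the point is only that the constraint sections are emitted before the task sections:

```python
def assemble_prompt(styleguide: str, role: str, context: str,
                    task: str, output_format: str) -> str:
    """Assemble a Claude prompt in the order that works:
    rules first, task last, so constraints are read before generation."""
    sections = [
        ("Editorial rules", styleguide),   # what Claude must NOT do
        ("Role", role),                    # who Claude is for this task
        ("Context", context),              # current page, news, search terms
        ("Task", task),                    # what to produce
        ("Output format", output_format),  # frontmatter + body structure
    ]
    return "\n\n".join(f"## {name}\n{body}" for name, body in sections)
```

Because the styleguide is always the first section, a conflicting task instruction arrives after Claude has already committed to the rule.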

The Verification Gap

The LfM-Band 60 study found that professional journalists spend only 7.9% of their research time on verification, checking whether information is accurate. Source checking (verifying who said something and whether they are credible) accounts for 0.9% of all research actions. Just 1 minute and 21 seconds per journalist per workday.

This means human-written content is already poorly verified. AI-generated content without verification constraints is worse. Wire addresses this through three mechanisms:

1. Source Requirements in the Styleguide

The styleguide mandates at least one external citation per page. This is not optional. The audit system flags pages with zero outbound links. But the rule goes deeper than quantity.

Good prompt rule:

Every factual claim must cite an external source.
Sources are append-only — you cannot remove existing citations.
Prefer analyst reports and research studies over blog posts.

Bad prompt rule:

Add some links to external sources.

The first version teaches Claude what counts as a source and prevents it from removing citations during rewrites. The second version produces random links to whatever Claude's training data suggests.
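The append-only rule can also be enforced mechanically after generation. A minimal post-check, assuming citations are plain URLs in the page text (illustrative, not Wire's code):

```python
import re

# Match http(s) URLs, stopping at whitespace and common delimiters.
URL_RE = re.compile(r'https?://[^\s)">\]]+')

def citations_preserved(original: str, rewritten: str) -> bool:
    """Return True if every external URL in the original page
    still appears in the rewritten page (sources are append-only)."""
    before = set(URL_RE.findall(original))
    after = set(URL_RE.findall(rewritten))
    return before <= after
```

A rewrite that adds new citations passes; a rewrite that silently drops one fails and can be rejected before saving.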

2. Source Classification

Wire's news pipeline classifies every article as vendor-origin (press releases, company blogs) or third-party (analyst reports, news outlets). This distinction matters because vendor-origin content serves the vendor's interests, not the reader's.

The journalism research calls this "Interessengebundenheit" (interest-boundedness). Every source must be assessed for whose interests it serves. The Trainingshandbuch Recherche lists five evaluation criteria for sources: reliability, credibility, professional competence, status, and interest-boundedness.

Wire cannot assess all five automatically. But it can enforce source diversity (no domain contributing more than 40% of citations) and classify source type (vendor vs. third-party). These two checks catch the most common failure mode: content that only cites the subject's own press releases.

3. Source Diversity Detection

The LfM study found that Google's 90.4% search engine market share among journalists creates a structural problem: all journalists find the same sources, interview the same contacts, and produce converging coverage. The study calls this "Googleisierung" (Googleization).

Wire detects source concentration through analyze_source_diversity(). When a page over-relies on a few external domains, the news pipeline specifically searches for articles from different sources. The refine prompt warns Claude against creating new concentration from the news it integrates.
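A domain-concentration check along these lines could back the 40% rule. This is a sketch of the idea; analyze_source_diversity() itself may be implemented differently:

```python
from collections import Counter
from urllib.parse import urlparse

def over_concentrated(citation_urls, max_share=0.4):
    """Return the domains that contribute more than max_share
    of a page's external citations."""
    if not citation_urls:
        return []
    domains = Counter(urlparse(u).netloc for u in citation_urls)
    total = sum(domains.values())
    return [d for d, n in domains.items() if n / total > max_share]
```

Any domain the check returns is a candidate for the news pipeline's targeted search for alternative sources.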

Good and Bad Prompt Examples

Titles

Bad:

Write a title for this page about invoice processing automation.

Claude produces: "The Complete Guide to Invoice Processing Automation Solutions for Enterprise" (76 characters, keyword-stuffed, likely to be rewritten by Google)

Good:

Title: 51-55 characters. Use dashes, not pipes. No brackets.
Match the H1 exactly. No superlatives (best, leading, top).
Start with the most specific concept, not generic terms.

Claude produces: "Invoice Processing Automation - OCR vs Manual Entry" (51 characters, specific, matches H1, describes content)
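The title rules above are mechanically checkable, which is how a constraint becomes an audit rather than a hope. A hypothetical validator sketch:

```python
import re

# Word-boundary match so "laptop" does not trigger the "top" rule.
SUPERLATIVES = re.compile(r'\b(best|leading|top)\b', re.IGNORECASE)

def title_violations(title: str, h1: str) -> list[str]:
    """Check a title against the styleguide rules (illustrative)."""
    problems = []
    if not 51 <= len(title) <= 55:
        problems.append(f"length {len(title)} outside 51-55")
    if "|" in title or "[" in title or "]" in title:
        problems.append("contains pipe or brackets")
    if title != h1:
        problems.append("does not match H1")
    if SUPERLATIVES.search(title):
        problems.append("contains a superlative")
    return problems
```

An empty list means the title passes every rule; anything else names the specific rule that failed, which can be fed back into a regeneration prompt.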

Anchor Text

Bad:

Add internal links to related pages.

Claude produces: "For more information about our capabilities, click here."

Good:

Link using 2-5 word descriptive phrases. Never "click here" or "learn more."
First mention only. Do not repeat links to the same page.
Only link to pages listed in the site directory below.

Claude produces: "Wire detects these overlaps through keyword cannibalization analysis and resolves them automatically."

The difference: the good version uses the target page's topic as anchor text, links only once, and confirms the target exists.
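Those three anchor rules are also checkable before saving. A sketch, assuming links are extracted as (anchor text, target path) pairs and the site directory is a set of valid paths (hypothetical helper, not Wire's code):

```python
BANNED_ANCHORS = {"click here", "learn more"}

def link_violations(links, site_pages):
    """links: list of (anchor_text, target_path) pairs.
    site_pages: set of valid internal paths."""
    problems = []
    seen_targets = set()
    for anchor, target in links:
        if anchor.lower() in BANNED_ANCHORS:
            problems.append(f"banned anchor: {anchor!r}")
        elif not 2 <= len(anchor.split()) <= 5:
            problems.append(f"anchor not 2-5 words: {anchor!r}")
        if target not in site_pages:
            problems.append(f"target not in site directory: {target}")
        if target in seen_targets:
            problems.append(f"repeated link to {target}")
        seen_targets.add(target)
    return problems
```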

Content Updates

Bad:

Update this page with the latest news.

Claude produces a rewrite that drops half the existing content, removes citations, and adds unsourced claims from training data.

Good:

Integrate the news below into the existing page.
Do NOT remove any existing content.
Do NOT remove any existing external links or citations.
Add new citations from the news sources.
If the news contradicts existing content, keep both versions
and note the discrepancy.

Claude preserves the existing page, adds the news as new sections or updates to existing sections, and keeps all previous citations.
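The preservation rules can be verified after the fact as well. A sketch that flags the two classic rewrite failures, assuming pages use markdown headings (illustrative only):

```python
import re

HEADING_RE = re.compile(r'^#{1,6}\s+(.+)$', re.MULTILINE)

def update_preserves_page(original: str, updated: str) -> list[str]:
    """Flag dropped sections and a page that shrank instead of growing."""
    problems = []
    before = set(HEADING_RE.findall(original))
    after = set(HEADING_RE.findall(updated))
    for heading in before - after:
        problems.append(f"section removed: {heading}")
    if len(updated.split()) < len(original.split()):
        problems.append("updated page is shorter than the original")
    return problems
```

Paired with the append-only citation rule, this turns "do not remove existing content" from an instruction Claude might ignore into a condition the save step can enforce.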

The Editorial Pipeline Parallel

The Trainingshandbuch Recherche documents an eight-step editorial chain used in investigative journalism:

  1. Pre-research and hypothesis formation
  2. Document acquisition and authentication
  3. Multi-source corroboration
  4. Interview verification
  5. Editorial review
  6. Legal review
  7. Pre-publication strategy
  8. Post-publication follow-up

Wire's content pipeline mirrors this structure:

| Journalism phase | Wire equivalent |
| --- | --- |
| Pre-research | wire.chief data: pull GSC metrics, identify opportunities |
| Document acquisition | wire.chief news: gather articles from web search |
| Multi-source corroboration | Junior-senior evaluation: each article assessed independently |
| Verification | Source diversity checks, vendor vs. third-party classification |
| Editorial review | Styleguide rules applied on every Claude call |
| Quality control | _sanitize_content(), 9 auto-fixes on every save |
| Post-publication monitoring | wire.chief audit: detect problems across the site |
| Follow-up research | wire.chief enrich: analyze and improve based on GSC data |

The key insight from journalism research: quality requires sequential stages where each step builds on verified output from the previous step. Skipping stages, like generating content without verification, produces content that looks professional but contains unchecked claims.
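The sequential-stage principle can be expressed as a pipeline where each stage only runs on output the previous stage's verifier accepted. A structural sketch, not Wire's code:

```python
def run_pipeline(stages, data):
    """stages: list of (name, transform, verify) triples.
    Each transform runs only after the previous output passed its gate."""
    for name, transform, verify in stages:
        data = transform(data)
        if not verify(data):
            raise ValueError(f"stage {name!r} produced unverified output")
    return data
```

Collapsing the stages into one generation call removes the gates, which is exactly how professional-looking content with unchecked claims gets produced.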

How Small Changes Compound

The NR-Werkstatt Nr. 18 states: "Online-Beiträge sind niemals endgültig vollendet" (online content is never finished). Content quality is a process, not a deliverable. Wire embodies this through its update cycle:

Week 1: create produces a page with web research and external citations.
Week 3: news finds industry developments; refine integrates them while preserving existing content.
Week 5: audit reveals the page shares keywords with another page; deduplicate differentiates both.
Week 8: reword optimizes the title and headings for the page's actual search performance.
Week 12: enrich adds content for keywords where the page ranks in positions 8-20.

Each step improves the page incrementally. No single step produces a perfect page. But after three months, the page has been verified against search data, enriched with industry news, differentiated from competing pages, and optimized for actual search demand.

Content as Process, Not Product

The LfM study measured that journalists spend 43% of their workday on research, significantly more than the 22% they self-reported. The gap exists because journalists do not consider routine verification (checking a name, confirming a date) as "research." They see it as part of writing.

Wire makes the same assumption explicit: analysis is free, only generation costs money. The analyze.py module performs keyword presence analysis, BM25 scoring, keyword routing, and amendment brief generation at zero API cost. Only when Wire knows exactly what changes to make does it call Claude.

This inverts the typical AI content workflow. Most tools call the LLM first and hope for good output. Wire calls the LLM last, with precise instructions built from data.
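The analyze-first, generate-last flow might look like this in outline. All names here are hypothetical and the analysis is reduced to a bare keyword-presence check; only the final step would cost an API call:

```python
def amendment_brief(page_text: str, target_keywords: list[str]) -> str:
    """Local analysis: find keywords the page should cover but doesn't.
    Runs at zero API cost; empty brief means nothing is missing."""
    text = page_text.lower()
    missing = [kw for kw in target_keywords if kw.lower() not in text]
    if not missing:
        return ""
    return "Add coverage for: " + ", ".join(missing)

def maybe_generate(page_text, target_keywords, call_llm):
    """Only call the LLM when analysis produced concrete instructions."""
    brief = amendment_brief(page_text, target_keywords)
    if not brief:
        return page_text  # nothing to do, no API cost
    return call_llm(brief, page_text)
```

The LLM call, when it happens, receives a brief built from data rather than an open-ended "improve this page."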

Measurable Quality Indicators

Wire tracks quality through metrics, not opinions:

| Indicator | What it measures | Evidence basis |
| --- | --- | --- |
| Title length 51-55 chars | Rewrite risk | Zyppy 81K study |
| H1-title alignment | Entity extraction clarity | Google API leak |
| External citations per page | Source authority | Reboot Online experiment |
| Source diversity | Research breadth | LfM-Band 60 Googleization finding |
| Inbound internal links | Crawl equity distribution | SearchPilot orphan page test |
| Word count above 200 | Content depth floor | Multiple pruning case studies |
| No broken internal links | Link equity preservation | Google API leak badBacklinks signal |

Every indicator maps to independent research, not to arbitrary thresholds. Wire's SEO reference documents the evidence behind each number.