On this page
- The Data Pipeline
- Audit: Read-Only Analysis
- Keyword Cannibalization
- Opportunity Scoring
- Why Wire Ignores What Others Check
- Tiered Reword
- Content Gap Detection
- Dead Page Detection
- Internal Link Health
- How Wire Makes Decisions with Search Data
- Which Page Survives a Merge
- When a Page Is Dead vs Just New
- When Redirects Are Leaking Traffic
- What "Fresh Data" Means
- How Restructures Preserve SEO Value
- The Evidence Hierarchy
- But my boss wants Ahrefs reports
Wire reads live search data into a local SQLite database and uses it to make every content decision. No guessing which pages need work. The database tells you what Google sees, and Wire acts on it.
The Data Pipeline
Everything starts with pulling search metrics.
python -m wire.chief data
This fetches keyword data for every page across all topics and stores it locally. The database tracks impressions, clicks, positions, and CTR for each keyword-page combination. Data older than 28 days is considered stale and re-fetched on the next run.
After fetching, Wire prints a summary:
Database: products
Pages with data: 142
Total keywords: 8,530
Latest snapshot: 2026-03-08
Overlap pairs (3+): 23
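The stored shape can be pictured as one row per keyword-page pair. This is a hypothetical sketch of that table, using sqlite3 from the standard library; the real schema and column names in Wire's database may differ.

```python
import sqlite3

# Hypothetical sketch of the per-keyword snapshot table the data command
# populates; column names are assumptions, not Wire's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE snapshot (
        page        TEXT NOT NULL,   -- page path, e.g. /products/foo
        keyword     TEXT NOT NULL,   -- query from Google Search Console
        impressions INTEGER,
        clicks      INTEGER,
        position    REAL,            -- average ranking position
        ctr         REAL,            -- clicks / impressions
        fetched_at  TEXT,            -- ISO date of the snapshot
        PRIMARY KEY (page, keyword)
    )
""")
conn.execute(
    "INSERT INTO snapshot VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("/products/foo", "invoice automation", 140, 2, 8.3, 2 / 140, "2026-03-08"),
)
row = conn.execute(
    "SELECT impressions, position FROM snapshot WHERE page = '/products/foo'"
).fetchone()
```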
Audit: Read-Only Analysis
The audit command examines your entire site without changing anything.
python -m wire.chief audit products
python -m wire.chief audit # All topics at once
Audit produces four sections:
HEALTH. Pass/fail indicators. A + means clean, a - means problems found. Covers search data freshness, dead pages, cannibalization, duplicate titles, duplicate descriptions, news staleness, orphan pages, broken links, source diversity, H1 tags, thin content, and heading structure.
ACTION. Specific problems with commands to fix them. Dead pages to archive, overlaps to resolve, broken links to fix, pages needing news updates.
SEO. Reword opportunities ranked by score, content gaps where no page owns a keyword cluster.
INFO. Summary statistics: page count, archived pages, untracked pages without search data.
Keyword Cannibalization
When multiple pages rank for the same keywords, Google splits traffic between them. Wire detects this through database analysis and classifies each overlap.
| Scenario | Detection | Resolution |
|---|---|---|
| Hard overlap | Ratio > 0.4, one page gets 70%+ traffic | Merge weak into strong |
| Soft overlap | Ratio > 0.15, balanced traffic | Differentiate both pages |
| Google confused | Ratio > 0.15, 50/50 split | Differentiate both pages |
| Dead page | Below site median threshold | Archive and expand covering page |
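The classification logic in the table can be sketched as a small decision function. This is an illustrative reading of the thresholds above, not Wire's actual implementation; the 50/50 tolerance is an assumption.

```python
def classify_overlap(ratio: float, strong_share: float) -> str:
    """Sketch of the overlap classification table.

    ratio        -- keyword overlap ratio between the two pages
    strong_share -- traffic share of the stronger page (0..1)
    """
    if ratio > 0.4 and strong_share >= 0.7:
        return "hard"            # merge the weak page into the strong one
    if ratio > 0.15:
        if abs(strong_share - 0.5) < 0.05:
            return "confused"    # near 50/50 split: differentiate both
        return "soft"            # balanced traffic: differentiate both
    return "none"
```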
python -m wire.chief deduplicate products
The deduplicate command runs the full resolution: merging, differentiating, and archiving based on the classification. Each operation uses AI to rewrite content, so the output reads naturally rather than being a mechanical splice.
Opportunity Scoring
Traditional CTR benchmarks are obsolete. Position 1 used to mean 28% CTR (First Page Sage, 2023). In 2025-2026, AI Overviews, ChatGPT Search, and Perplexity intercept 34-61% of search traffic before users reach organic results (Seer Interactive, Advanced Web Ranking). For niche B2B sites, real CTR at position 1 is closer to 2-3%.
This changes everything about how content tools should score opportunities. Click-based metrics undercount demand. A keyword with 500 impressions and 0.5% CTR looks weak by old standards but represents genuine search interest that AI Overviews are intercepting.
Wire scores every keyword where your page ranks position 5-30. The formula: impressions * (1-CTR). High impressions with low click-through means search demand exists but your page is not capturing it. This scoring works in the AI era because it prioritizes demand volume, not click volume.
The position window matters too. Position 1-5 pages are already performing well. Position 30+ pages are unreachable through content changes alone (they need authority signals). The 5-30 range is where content quality directly affects rankings.
Keywords scoring above the threshold (configurable, default 15) become candidates for content improvement. The enrich command uses these scores to decide what to add to each page.
Why Wire Ignores What Others Check
Most SEO tools audit factors that have weak or no evidence behind them. Wire skips these deliberately.
Readability scores have zero correlation with rankings. Portent analyzed 750,000 pages and Ahrefs studied 15,000 keywords: Flesch score does not predict position. Dwell time matters (confirmed in Google's leaked API as NavBoost), but reading level is not the same as clarity.
Keyword density is an artifact of early search engines. The 2024 Google API leak confirmed that NavBoost shifted ranking signals away from TF-IDF frequency toward user behavior metrics. Topical coverage matters; word repetition does not.
Schema markup shows no direct ranking change in SearchPilot A/B tests. The leaked API confirms entity signals feed ranking indirectly through CTR, but generating JSON-LD is a template responsibility, not a content pipeline task.
Wire focuses on signals the leaked API confirmed actually matter: click data (NavBoost), internal linking structure, title-H1 alignment, content freshness, and keyword cannibalization.
Tiered Reword
Not every page deserves the same level of SEO attention. Wire's reword command applies three tiers based on opportunity score.
python -m wire.chief reword products
Top 20%. Full SEO rewrite. Wire rewrites headings, body text, title, and description to target the highest-scoring keywords. Uses the SEO prompt template.
Next 30%. Light touch. Only title and description are rewritten. Body stays unchanged. Uses the light SEO prompt template.
Bottom 50%. Skipped entirely. These pages either rank well already or have too little search demand to justify the AI tokens.
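The three-way split can be sketched as a percentile cut over opportunity scores. This is an illustration of the tiering logic, not Wire's code; tie-breaking and exact boundary handling are assumptions.

```python
def assign_tiers(scores: dict[str, float]) -> dict[str, str]:
    """Sketch of the tiered reword split: top 20% full rewrite,
    next 30% light touch, bottom 50% skipped."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    tiers = {}
    for i, page in enumerate(ranked):
        if i < n * 0.2:
            tiers[page] = "full"     # headings, body, title, description
        elif i < n * 0.5:
            tiers[page] = "light"    # title and description only
        else:
            tiers[page] = "skip"     # ranks well already, or too little demand
    return tiers
```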
Content Gap Detection
Wire finds keywords where search demand exists but no page owns the topic. Detection criteria: the keyword appears on 3+ pages, no page ranks in the top 20, and no page slug matches the keyword.
Results are clustered by theme and reported in the audit output:
Content Gaps: products
5 keywords in 3 topic clusters
1. "invoice": 280 impressions, 2 keywords
"invoice processing benchmark" (140 imp, 3 pages)
"invoice automation comparison" (140 imp, 4 pages)
These suggest new pages to create with the content pipeline.
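The three gap criteria can be sketched as a single predicate. Field names and the slug-matching rule (keyword joined with hyphens) are assumptions for illustration.

```python
def is_content_gap(keyword_pages: list[dict], slugs: list[str],
                   keyword: str) -> bool:
    """Sketch of the gap criteria: 3+ pages see the keyword, none ranks
    in the top 20, and no page slug matches it."""
    if len(keyword_pages) < 3:
        return False                                  # not enough demand signal
    if any(p["position"] <= 20 for p in keyword_pages):
        return False                                  # a page already owns it
    token = keyword.replace(" ", "-")
    if any(token in slug for slug in slugs):
        return False                                  # a slug claims the topic
    return True
```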
Dead Page Detection
Ahrefs found that 90.63% of all pages get zero organic traffic from Google. HubSpot pruned 72% of their audited posts and saw a 458% increase in organic views. The evidence is clear: dead content hurts the pages that should rank.
Wire uses a relative threshold, max(10, site_median * 0.05), so the bar adapts to your site's scale. Pages below this threshold for 180+ days are flagged as dead. Wire checks whether other pages already rank for the dead page's keywords. If coverage exists, the dead page is safe to archive.
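The relative threshold and age gate can be sketched directly from the rule above. The coverage check (whether other pages rank for the dead page's keywords) is omitted here for brevity.

```python
def is_dead(impressions: int, age_days: int, site_median: float) -> bool:
    """Sketch of the dead-page rule: below max(10, 5% of the site median
    impressions) for 180+ days."""
    threshold = max(10, site_median * 0.05)
    return age_days >= 180 and impressions < threshold
```

On a site with a median of 400 impressions the bar is 20; on a small site with a median of 50 it floors at 10, so tiny sites are not punished for their scale.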
Internal Link Health
Wire counts inbound internal links for every page. Pages with fewer than 3 inbound links are flagged as underlinked. The crosslink command adds links to fix this.
python -m wire.chief crosslink products
Broken internal links are detected during audit and fixed by the sanitize command, which re-saves affected pages through the auto-fix pipeline.
python -m wire.chief sanitize products
How Wire Makes Decisions with Search Data
The functions above produce recommendations. This section explains the logic behind those recommendations: how Wire decides what to merge, what to kill, and what to keep.
Which Page Survives a Merge
When two pages cannibalize each other, Wire picks a keeper using a composite score:
| Factor | Weight | What it measures |
|---|---|---|
| Impressions | 40% | How much Google shows this page |
| Position | 30% | Where it ranks (inverted: position 1 scores highest) |
| Clicks | 20% | How many users actually click through |
| Keyword breadth | 10% | How many distinct keywords the page ranks for |
Scores are normalized within the topic, so the strongest page scores close to 1.0. The page with the higher keeper score absorbs the other page's content. This means a page with moderate clicks but broad keyword coverage can beat a page with high clicks on a single keyword. Breadth signals topical authority.
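The composite score can be sketched as a weighted sum of topic-normalized factors. Field names and the exact normalization (dividing by the topic maximum, inverting position as best_position / position) are assumptions; the weights are the ones in the table.

```python
def keeper_score(page: dict, topic_max: dict) -> float:
    """Sketch of the keeper score: 40% impressions, 30% position
    (inverted), 20% clicks, 10% keyword breadth, each normalized
    within the topic so the strongest page approaches 1.0."""
    impressions = page["impressions"] / topic_max["impressions"]
    position = topic_max["best_position"] / page["position"]  # rank 1 scores highest
    clicks = page["clicks"] / topic_max["clicks"]
    breadth = page["keywords"] / topic_max["keywords"]
    return 0.4 * impressions + 0.3 * position + 0.2 * clicks + 0.1 * breadth
```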
When a Page Is Dead vs Just New
A page with zero impressions might be dead or might be published yesterday. Wire uses an age fallback chain to tell the difference:
1. createddate from frontmatter (most reliable, set by the author)
2. first_seen in the GSC database (when Wire first saw it in search data)
3. first_checked timestamp (when Wire first registered it, even without data)
A page must be older than 180 days AND below the impression threshold (max(10, site_median * 0.05)) to be flagged as dead. New pages get time to earn their place before Wire recommends archiving them.
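The fallback chain amounts to taking the first non-empty date. A minimal sketch, assuming each source is already parsed into a date (or None when absent):

```python
from datetime import date

def page_age_days(created_date, first_seen, first_checked, today=None):
    """Sketch of the age fallback chain: frontmatter date first, then
    GSC first_seen, then the first_checked timestamp."""
    today = today or date.today()
    for candidate in (created_date, first_seen, first_checked):
        if candidate is not None:
            return (today - candidate).days
    return 0  # no signal at all: treat the page as brand new
```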
When Redirects Are Leaking Traffic
After restructuring a site, old URLs may still receive search traffic. Wire detects this by comparing GSC URL data against your page structure and redirect map. Any URL with impressions that has no page and no redirect is a leak. Real visitors are hitting a 404.
The GSC coverage build guard blocks the build until every URL with impressions is accounted for. This is deliberate: deploying a site with leaking traffic is worse than not deploying. Configure exclude_gsc_data_for_path in wire.yml for paths that are not Wire content (web apps, tools on the same domain).
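The leak check is essentially set arithmetic over three URL lists. A sketch, with signatures and the exclusion handling as assumptions:

```python
def find_leaks(gsc_urls, pages, redirects, excluded_prefixes=()):
    """Sketch of the coverage guard: any GSC URL with impressions that is
    neither a live page nor a redirect source is a leak (a live 404)."""
    covered = set(pages) | set(redirects)  # redirect sources count as covered
    return sorted(
        url for url in gsc_urls
        if url not in covered
        and not any(url.startswith(p) for p in excluded_prefixes)
    )
```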
What "Fresh Data" Means
Wire only uses the most recent snapshot per page for all analysis. Not historical averages, not trends. This keeps decisions grounded in current search performance, not outdated positions.
Bulk URL discovery (the GscUrl table) expires after 28 days. If you haven't run data in a month, Wire treats the URL list as stale and returns nothing. This prevents the coverage build guard from blocking on URLs that Google may have already dropped.
Run data at least monthly. Weekly is better for sites where content changes frequently.
I find having bad data is worse than having no data. Bad data can lead to the wrong conclusions, and those conclusions are confident.
u/mafost-matt on r/SEO, 216 upvotes. The thread "Bye Semrush. After 8 years, cutting the cord" drew 291 comments from SEOs abandoning third-party tools for first-party data.
How Restructures Preserve SEO Value
When you move pages (rename slugs, reorganize into topics), the old URLs have search history in the database. Without migration, the new URLs look like brand-new pages with zero data.
migrate-gsc rekeys the database: it reads your redirects (from both .wire/redirects.yml and wire.yml), finds Content rows matching old paths, and updates them to point at the new paths. If the new path already has a Content row, snapshots are merged so no keyword data is lost.
# After restructuring pages and adding redirects:
python -m wire.chief migrate-gsc
Run this once after any restructure, before running audit or deduplicate on the new structure.
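The rekeying logic reduces to moving rows from old paths to new ones and merging when the target already exists. A sketch of only that logic, using a dict in place of the real SQLite Content rows:

```python
def migrate_paths(content_rows: dict, redirects: dict) -> dict:
    """Sketch of the migrate-gsc rekeying: rows keyed by an old path move
    to the new path; if the new path already has a row, snapshots merge
    so no keyword data is lost. Real storage is SQLite, not a dict."""
    migrated = dict(content_rows)
    for old, new in redirects.items():
        if old not in migrated:
            continue                                    # no data under the old path
        snapshots = migrated.pop(old)
        migrated.setdefault(new, []).extend(snapshots)  # merge, never drop
    return migrated
```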
The Evidence Hierarchy
Wire does not follow Google's public guidance. The 2024 Google API leak proved that Google systematically misrepresents how their ranking system works. They denied using click data while NavBoost, based entirely on click data, was their strongest ranking signal. They denied having domain authority while siteAuthority sat in their API documentation.
Wire uses a strict evidence hierarchy for every SEO decision:
Tier 1: Leak-confirmed signals. NavBoost click data, siteAuthority, titleMatchScore, badBacklinks, ChromeInTotal, bylineDate. These are ground truth: internal API documentation that Google cannot deny without perjury (NavBoost was confirmed under oath during the DOJ antitrust trial).
Tier 2: Controlled A/B tests. SearchPilot runs experiments on live production sites with statistical significance. H1 keyword alignment produced a 28% traffic increase. Orphan page internal linking produced significant uplift. Schema markup produced no measurable change. These results are reproducible and independently verifiable.
Tier 3: Large-scale correlation studies. Zyppy analyzed 80,959 title tags and 23 million links. Backlinko studied 15,000 keywords. Ahrefs analyzed their entire index. These show what correlates with rankings. Not causation, but strong signals when combined with Tier 1 and Tier 2 evidence.
Tier 4: Case studies. HubSpot pruned 3,000 posts and saw 458% more traffic. 201Creative deleted thin ecommerce pages and saw 867% more traffic and 291% more sales. RecipeLion deleted 1,156 thin articles with a neutral result, because those articles had unique queries, proving thin content only hurts when it cannibalizes.
Tier 5: "Google says." Noted but never sufficient. Every Google public statement is filtered through their legal and PR teams. The leak proved this is not cynicism; it is pattern recognition.
This hierarchy is encoded in every Wire threshold, every auto-fix rule, and every audit check. When Wire skips keyword density checking, it is because NavBoost shifted ranking power to user behavior signals. When Wire enforces H1 alignment, it is because a controlled experiment measured a 28% uplift. No rule exists without evidence. No evidence exists without a citation.
See the Capabilities overview for all Wire features.
But my boss wants Ahrefs reports
Keep Ahrefs for competitive research. It is good at that. Wire does not replace Ahrefs. Wire replaces the gap between "Ahrefs told me what is wrong" and "someone actually fixed it." Use both. Your boss gets Ahrefs competitor reports and Wire's deterministic quality metrics for your own site.