Most Wire operators do web research the slow way. They open a browser, copy prices into a spreadsheet, paste headlines into a notes file, alt-tab between four windows. The information reaches their Wire content pipeline hours later than it should, if it gets there at all.

Libretto is a browser automation toolkit designed to give a coding agent a live browser. You describe what you want. The agent navigates, extracts, and saves it. Structured. Repeatable. No alt-tabbing.

Here is how Wire operators can use it.

What libretto is

Libretto is not a scraper. It does not parse HTML blindly. It gives a coding agent (Claude Code, Cursor, whatever you run) a live Chromium session with four tools: a screenshot + HTML snapshot for reading the page, exec for running Playwright expressions against the live DOM, run for executing saved workflows, and save for persisting authenticated sessions.

The architecture is deliberate: a separate vision model handles screenshots so the heavy visual context does not land in your coding agent's context window. The agent stays fast. The vision stays accurate.

4 tools: snapshot, exec, run, save
1 browser: Chromium, headless or headed
0 CAPTCHAs handled (you log in once; libretto saves the session)
All AI backends: Anthropic, OpenAI, Gemini, Vertex

Install in two minutes

npm install libretto
npx libretto setup

Setup installs Chromium and pins a snapshot model. You configure the vision backend once in .libretto/config.json. If you already have ANTHROPIC_API_KEY in your environment, it picks it up automatically.
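A minimal config might look like the following. The exact keys are an assumption for illustration; check the generated file and the libretto docs for the real schema:

```json
{
  "vision": {
    "backend": "anthropic",
    "apiKeyEnv": "ANTHROPIC_API_KEY"
  }
}
```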

Use case 1: Competitor price monitoring

You run a vendor comparison site. Your 400 pages each cover a software product. Prices change. Vendors bury their pricing behind "contact sales" forms, freemium tiers, and enterprise callouts that move every quarter. Manual collection takes a day. You do it every few months. Your pages are stale.

With libretto, you record the workflow once:

# Open the browser to a competitor pricing page
npx libretto open https://example-vendor.com/pricing

# Snapshot: tell the agent what you need
npx libretto snapshot \
  --objective "Extract all pricing tiers: name, price, billing period, included features" \
  --context "Pricing page for SaaS vendor, look for tier cards or a pricing table"

The snapshot returns structured output (tier names, prices, feature bullets) without scraping the raw HTML. The agent then validates each extracted field against the live DOM using exec:

npx libretto exec "document.querySelectorAll('.pricing-card h2').length"

You save the workflow as a TypeScript file:

// workflows/competitor-pricing.ts
// `page` is the live Playwright page libretto provides to saved workflows.
export async function workflow() {
  await page.goto('https://example-vendor.com/pricing');
  const tiers = await page.evaluate(() =>
    [...document.querySelectorAll('.pricing-card')].map(card => ({
      name: card.querySelector('h2')?.textContent?.trim(),
      price: card.querySelector('.price')?.textContent?.trim(),
      period: card.querySelector('.billing-period')?.textContent?.trim(),
    }))
  );
  return tiers;
}

Run it weekly:

npx libretto run workflows/competitor-pricing.ts

The output is JSON. Pipe it into a Wire frontmatter update, a spreadsheet, or a Telegram alert when a price changes. Your vendor pages stay current without opening a browser.
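The change-detection step can be a few lines of TypeScript. A sketch, assuming the tier shape extracted by the workflow above; `diffPrices` is a hypothetical helper, not part of libretto:

```typescript
// Sketch of the price-change check. Tier mirrors the fields extracted by
// workflows/competitor-pricing.ts; diffPrices is a hypothetical helper.
interface Tier {
  name?: string;
  price?: string;
  period?: string;
}

// Return a human-readable line for each tier whose price changed between runs.
export function diffPrices(prev: Tier[], curr: Tier[]): string[] {
  const old = new Map(prev.map(t => [t.name, t.price] as const));
  return curr
    .filter(t => old.has(t.name) && old.get(t.name) !== t.price)
    .map(t => `${t.name}: ${old.get(t.name)} -> ${t.price}`);
}
```

Feed the returned lines into whatever alert channel you already use, such as the Telegram notification mentioned above.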

What you collect manually now

Monday: 40 tabs. Copy-paste prices into a spreadsheet. Update four pages. Wonder if you missed anything. Lose the tab with the freemium footnote.

What libretto handles

Scheduled script. Structured JSON. Wire frontmatter updated via wire.content update. Telegram notification when a price changes. You review the diff.

Use case 2: Industry news for chief news

Wire's chief news command searches the web for recent developments per vendor. It works. But it searches by vendor name and uses a text window. It cannot navigate behind login walls, it cannot distinguish between a product update blog post and a press release template, and it cannot visit a vendor's own changelog.

Libretto fills that gap. Before you run chief news, run a libretto workflow that visits each vendor's blog, changelog, and press room. These pages do not surface cleanly in search:

npx libretto open https://example-vendor.com/changelog
npx libretto snapshot \
  --objective "List all releases in the last 60 days with date and summary" \
  --context "Product changelog, looking for version numbers, release dates, feature descriptions"

Save the output as markdown files in your topic's news/ directory before running chief refine. Wire ingests them as if they came from the standard news pipeline. The difference: you collected them from sources that search engines do not index well.

# After running libretto workflows that output to docs/vendors/news/
cd /opt/wire/sites/your-site
source /opt/wire/venv/bin/activate
python -m wire.chief refine vendors
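The "save the output as markdown" step can be sketched as a small formatter. The frontmatter keys and file naming here are assumptions for illustration, not Wire's documented schema:

```typescript
// Hypothetical formatter: turn changelog entries extracted by libretto into
// markdown files for a topic's news/ directory. The frontmatter keys are
// illustrative, not Wire's documented schema.
import { mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";

interface Release {
  date: string;    // ISO date from the changelog
  summary: string; // one-paragraph release summary
}

// Write one markdown file per release and return the created paths.
export function writeNewsFiles(vendor: string, releases: Release[], dir: string): string[] {
  mkdirSync(dir, { recursive: true });
  return releases.map(r => {
    const file = join(dir, `${vendor}-${r.date}.md`);
    writeFileSync(file, `---\nvendor: ${vendor}\ndate: ${r.date}\n---\n\n${r.summary}\n`);
    return file;
  });
}
```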

Use case 3: Internal knowledge base

Some Wire sites are internal: competitive intelligence dashboards, procurement research portals, partner capability matrices. The content lives across dozens of external sites with no API. You need prices, feature tables, contract terms, SLA commitments. All in one place, updated regularly.

This is where libretto's session persistence matters:

# Log in once, save the session
npx libretto open https://vendor-portal.com/login
# (browser opens headed, you log in manually; libretto cannot handle MFA)
npx libretto save vendor-portal.com

Future runs reuse the saved session with no repeated login:

npx libretto open https://vendor-portal.com/pricing --profile vendor-portal.com
npx libretto snapshot \
  --objective "Extract enterprise contract terms: minimum seats, annual commitment, SLA uptime percentage" \
  --context "Enterprise pricing page behind login, looking for contract detail section"

For a 30-vendor internal knowledge base, this workflow runs nightly. Each vendor gets a JSON file with current terms. Wire ingests them on the next chief refine run. The whole knowledge base refreshes while you sleep.
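The nightly aggregation can be as simple as merging the per-vendor JSON blobs into one index. A sketch; in practice you would read the files from disk, and `buildVendorIndex` is a hypothetical helper:

```typescript
// Hypothetical merge step for the nightly run: combine per-vendor JSON output
// (filename -> raw file contents) into one index keyed by vendor name.
export function buildVendorIndex(files: Record<string, string>): Record<string, unknown> {
  const index: Record<string, unknown> = {};
  for (const [name, body] of Object.entries(files)) {
    if (!name.endsWith(".json")) continue; // skip logs, screenshots, etc.
    index[name.replace(/\.json$/, "")] = JSON.parse(body);
  }
  return index;
}
```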

Use case 4: Visual QA against live competitor pages

Wire includes wire.qa for screenshot capture. It captures your own site. But sometimes you need to compare against a live competitor page. Did they launch a new homepage layout? Did they change their above-the-fold message?

Libretto handles this:

# Snapshot a competitor page with full context
npx libretto open https://competitor.com
npx libretto snapshot \
  --objective "Capture full page structure: hero headline, primary CTA text, main nav items, footer links" \
  --context "Competitor homepage, comparing against last month's snapshot"

Store the snapshot in .libretto/sessions/. Run it monthly. When the structured output changes, you have a diff. When you need screenshots for a QA report, libretto writes the PNG alongside the HTML snapshot automatically.
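The monthly diff can be computed directly from the structured output. A sketch, using field names that mirror the snapshot objective above (they are not a fixed libretto schema); `changedFields` is a hypothetical helper:

```typescript
// Hypothetical diff over two months of structured snapshot output. Field
// names mirror the snapshot objective above, not a fixed libretto schema.
interface PageSnapshot {
  hero?: string;
  cta?: string;
  nav?: string[];
}

// Return the names of fields whose value changed between two snapshots.
export function changedFields(prev: PageSnapshot, curr: PageSnapshot): string[] {
  const keys = Object.keys({ ...prev, ...curr }) as (keyof PageSnapshot)[];
  return keys
    .filter(k => JSON.stringify(prev[k]) !== JSON.stringify(curr[k]))
    .map(String);
}
```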

This does not replace wire.qa for your own site. It complements it: Wire QA covers your output, libretto covers the competitive landscape you're positioning against.

How it fits the Wire workflow

Libretto runs before Wire, not inside it. The sequence:

1. Collect (libretto)

Run libretto workflows against competitor sites, vendor portals, and news sources. Output: JSON files, markdown snippets, structured data. Duration: 5 to 30 minutes depending on how many sources.

2. Ingest (Wire content pipeline)

Run `chief news`, `chief refine`, or `wire.content update` to integrate the collected data into your pages. Wire's AI layer evaluates relevance, integrates facts, and updates frontmatter.

3. Build and verify (Wire build)

Run `python -m wire.build` to validate the updates. Wire's lint rules catch thin content, missing metadata, and broken links before they reach production.

4. Push

Git commit, git push. Changes are live.

The files libretto creates

All libretto state lives in .libretto/ at the root of wherever you run it:

| Path | Contents |
| --- | --- |
| .libretto/config.json | Vision model config (Anthropic, OpenAI, etc.) |
| .libretto/sessions/&lt;name&gt;/ | Network traffic, recorded actions, snapshots from one session |
| .libretto/profiles/&lt;domain&gt;/ | Saved auth sessions for sites that require login |

Sessions are git-ignored by default; profiles are not. For internal tooling, commit profiles to share authenticated sessions across machines.
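That split maps to a one-line ignore rule. Illustrative; check what `npx libretto setup` actually writes to your .gitignore:

```
# Ignore per-run session data; keep profiles under version control
# so internal tooling can share authenticated sessions.
.libretto/sessions/
```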

What libretto cannot do

Libretto is not magic. It navigates pages a browser can navigate. If a site requires clicking through a CAPTCHA, libretto pauses and asks you to do it. If a site fingerprints automated browsers and blocks them, you'll need to use a real browser profile via save. If a site has no structured pricing page and buries terms in a PDF, libretto can extract PDF links but cannot parse the PDF itself.

None of that is a problem for the most common Wire research workflows: vendor pricing pages, product changelogs, press rooms. These pages are built for humans to read. Libretto lets your agent read them instead.

Start here

# Install
npm install libretto
npx libretto setup

# Try your first snapshot
npx libretto open https://example.com
npx libretto snapshot --objective "What is this page about?" --context "Homepage"

The repo is at github.com/saffron-health/libretto. The four-phase workflow documentation at libretto.sh/docs is worth reading before you build your first multi-step workflow.

For the Wire side of this pipeline, start with chief news and chief refine in the workflow guide.