How to actually build an autonomous website in 2026

Eduard Cristea · Founder, Eyepup · 6 min read

An autonomous website is a site whose conversion improvements are written, ranked, and shipped by AI agents — without you opening a dashboard. This is the working blueprint we ship at Eyepup, end to end. Six steps, a few always-on agents, a handful of CLI commands. No vaporware, no "in 2030 maybe" — every piece is something you can install today.

The shape of the loop

Karpathy's autoresearcher loop applied to product growth looks like this:

Watch → Research → Rank → Ship → Measure → Repeat

Translated to runtime processes:

  1. Watch — a tracker on your site (PostHog, Plausible, Eyepup snippet) captures every visitor's session
  2. Research — an LLM agent profiles each session and writes a verdict
  3. Rank — patterns are clustered and sorted by impact
  4. Ship — your coding agent (Claude Code, Cursor, Codex) reads the top pattern and writes the diff
  5. Measure — the analytics layer re-evaluates the next batch of visitors
  6. Repeat — the loop never stops; resolved patterns inform the next ranking pass

Four of those steps run unattended (1, 2, 3, 5), and 6 is just "the loop keeps running." Only step 4 needs a human today, and only because PR review is still valuable; wire it with a hook and auto-merge for low-risk changes (see below) and it runs autonomously too.
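
The steps above can be sketched as one supervising loop. This is a hypothetical shape with stub stages, not Eyepup's actual runtime — in practice each stage is a separate agent or process:

```python
# Hypothetical sketch of the watch -> research -> rank -> ship loop.
# Every function body here is a stub standing in for a real agent.

def watch():
    """1. Tracker captures sessions (stub data)."""
    return [{"id": "s1", "bounced": True}]

def research(sessions):
    """2. LLM writes a friction verdict per session (stubbed)."""
    return [{"session": s["id"], "friction": "pricing_toggle"} for s in sessions]

def rank(profiles):
    """3. Cluster verdicts into patterns, sort by frequency."""
    patterns = {}
    for p in profiles:
        patterns[p["friction"]] = patterns.get(p["friction"], 0) + 1
    return sorted(patterns.items(), key=lambda kv: -kv[1])

def one_pass():
    """One tick of the loop; returns the top pattern for step 4 (ship)."""
    queue = rank(research(watch()))
    return queue[0]

top_pattern = one_pass()  # hand this to the coding agent
```

Steps 5 and 6 are the same pass re-run on the next batch of sessions; nothing in the shape changes.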

Step 1 — Watch

Drop a tracker. The Eyepup CLI auto-injects into Next.js / Astro / Vite / Nuxt:

$ npm i -g eyepup
$ eyepup login
$ cd your-app && eyepup install

That writes the snippet into your layout, ships behavioural events (pageviews, clicks, rage clicks, scroll depth, errors) plus an rrweb session recording to the Eyepup ingest. The recording is what makes the Research step work — without the video, the LLM has stats; with the video, it has ground truth.

If you're already on PostHog or Plausible, keep them — Eyepup runs alongside, doesn't replace.

Step 2 — Research

This is the only step that's not "your code." A scheduled agent (cron-driven on the Eyepup side) reads new sessions and writes a per-visitor profile. Output looks like this:

Sarah — London-bound budget shopper · 70° heat
Persona: Returning researcher
Verdict: Spent 7m 13s on /pricing after 20 prior bounces. Hesitated at the
"Annual / Monthly" toggle for 12s, then exited.
Friction: Pricing toggle ambiguity — three out of five matching visitors
hit the toggle and bounced within 30s.
Recommended action: Replace the toggle with a single annual/monthly card
and a savings badge. ETA: 30 min.

You don't run this. The agent runs on a 2-minute cadence, profiles every session that ended more than five minutes ago, and writes the dossier to your Eyepup account.

Inspect from the CLI:

$ eyepup hottest                    # top 5 hot visitors right now
$ eyepup visitor <distinct_id>      # full dossier for one
$ eyepup visitors --flag high_intent --hours 24

Step 3 — Rank

Profiles are clustered into friction patterns by another agent (pattern-finder + pattern-compactor). Each pattern has:

  • users_count — how many visitors hit it
  • drop_off_rate — fraction who bounced
  • impact_score = users × intent_weight × drop_off_rate
  • confidence — LLM's self-rating of the diagnosis

Top 25 patterns per site stay active. Resolved/dismissed patterns get archived but feed back into the LLM's prompt as "PRIOR PATTERNS — don't re-suggest these." That's the memory loop that prevents the same fix being recommended six weeks running.

$ eyepup todo --site yoursite.com --limit 5

You get a ranked queue, top item first.
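
The ranking arithmetic itself is simple enough to sketch. Field names mirror the bullets above; the pattern names and numbers are made up:

```python
# Rank friction patterns by impact_score = users_count * intent_weight * drop_off_rate.
patterns = [
    {"name": "pricing_toggle_ambiguity", "users_count": 5,  "intent_weight": 0.9, "drop_off_rate": 0.6},
    {"name": "signup_form_too_long",     "users_count": 12, "intent_weight": 0.4, "drop_off_rate": 0.3},
]

for p in patterns:
    p["impact_score"] = p["users_count"] * p["intent_weight"] * p["drop_off_rate"]

queue = sorted(patterns, key=lambda p: -p["impact_score"])
# pricing toggle: 5 * 0.9 * 0.6 = 2.7 beats signup form: 12 * 0.4 * 0.3 = 1.44
```

Note the intent weighting: five high-intent visitors outrank twelve casual ones, which is why the queue isn't just sorted by users_count.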

Step 4 — Ship (the actually-autonomous part)

This is where most "autonomous" tools stop. Eyepup hands the top pattern to your coding agent as a paste-ready prompt:

$ eyepup todo --limit 1 | claude -p

What happens inside Claude Code:

  1. Reads the friction pattern + recommended action + the file path guess
  2. Greps the codebase for the relevant component
  3. Writes a diff
  4. Optionally runs your tests
  5. Commits or opens a PR

If you trust the agent for low-risk changes (copy edits, CTA repositioning), wire a Claude Code post-tool hook to auto-commit and push to an eyepup/<pattern_id> branch:

# .claude/hooks/post-edit.yml — pseudocode shape
match: edit
run: |
  git checkout -b eyepup/$EYEPUP_PATTERN_ID
  git add -A
  git commit -m "fix: $EYEPUP_PATTERN_NAME (Eyepup #$EYEPUP_PATTERN_ID)"
  git push -u origin HEAD
  gh pr create --fill --label eyepup-suggested

For Cursor users: open the integrated terminal, run the same eyepup todo --limit 1 command. Cursor reads the terminal output as context — when you type "ship the top fix" it has the prompt in scope.

Step 5 — Measure

After the deploy, log the change:

$ eyepup log "rewrote /pricing toggle" --kind content_change --paths /pricing

That row hits the dossier agent's prompt. On the next cycle, when new visitors arrive, the LLM knows /pricing was changed and grades the new sessions against the deploy. If the friction pattern shrinks, you'll see it in eyepup todo — the row drops in impact_score or disappears. If it doesn't, the pattern returns with confidence rising — the LLM now flags it as a regression.
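
A naive version of that before/after grading is just a bounce-rate delta around the deploy timestamp. The session rows below are hypothetical, and the real agent grades full recordings, not a boolean:

```python
def friction_delta(sessions, deploy_ts, path="/pricing"):
    """Change in bounce rate on `path` after a deploy; negative means friction shrank."""
    on_path = [s for s in sessions if path in s["paths"]]
    before = [s for s in on_path if s["ts"] < deploy_ts]
    after = [s for s in on_path if s["ts"] >= deploy_ts]
    rate = lambda xs: sum(s["bounced"] for s in xs) / max(len(xs), 1)
    return rate(after) - rate(before)

sessions = [
    {"ts": 10, "paths": ["/pricing"], "bounced": True},
    {"ts": 11, "paths": ["/pricing"], "bounced": True},
    {"ts": 20, "paths": ["/pricing"], "bounced": False},
    {"ts": 21, "paths": ["/pricing"], "bounced": True},
]
delta = friction_delta(sessions, deploy_ts=15)  # 0.5 - 1.0 = -0.5, friction shrank
```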

Mark resolved patterns explicitly:

$ eyepup done <pattern_id>

That archives the row and tells the pattern-finder agent not to re-emit it unless the friction returns.

Step 6 — Repeat

The loop is now running. Every 2 minutes a new batch of sessions gets profiled. Every 4 minutes the friction patterns get reranked. Every time you ship and eyepup log it, the agent re-evaluates. There's nothing to schedule on your side — your only job is the PR review on Step 4.

What's autonomous vs what's not

  • Autonomous today: capture (1), profiling (2), ranking (3), measurement (5), repeat (6). The agent runs these without you.
  • Human-in-the-loop today: ship (4). Most teams keep PR review even for AI-generated diffs. If you trust your test suite + CI, you can flip auto-merge for eyepup-suggested PRs and remove the human entirely from low-risk paths.
  • Optional: anomaly alerts. Eyepup's anomaly agent flags z-score spikes (signup rate dropped 40% this hour) and writes you a one-sentence likely-cause. Wire that to Slack or email and you've got a paged-on-issues system without a PagerDuty config.
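
The z-score check behind those alerts fits in a few lines. This is a generic sketch, not Eyepup's agent; the threshold and the hourly counts are assumptions:

```python
from statistics import mean, pstdev

def zscore_alert(hourly_signups, threshold=-2.0):
    """Flag the latest hour if its signup count is a low outlier vs recent history."""
    *history, latest = hourly_signups
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return None  # flat history, no baseline to compare against
    z = (latest - mu) / sigma
    if z <= threshold:
        return f"signup rate anomaly: z={z:.1f} (latest {latest} vs mean {mu:.0f})"
    return None

alert = zscore_alert([40, 42, 38, 41, 39, 12])  # the 12 is a huge low outlier
```

In this sketch the alert string is what you'd pipe to Slack or email; a normal hour returns None and nothing fires.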

Cost

Real numbers from the Eyepup runtime:

  • Profiling cost (LLM): ~$0.005-0.03 per session at video tier (Gemini 2.5 Flash via OpenRouter), ~$0.001 per session at text-only fallback
  • Ranking cost: $0.05-0.15 per (team, site) per 2h tick — DeepSeek V4 Pro via OpenRouter
  • Total for a 10K-visitor/month site: about $30-80/month in LLM spend + your tracker hosting

That's the actual unit economics. Compare to a CRO contractor at $5-15K/month for 2-4 fixes shipped, and the autonomous version is paying for itself by the third pattern.
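
A back-of-envelope check of that monthly figure. The traffic mix is an assumption — the share of sessions that get video-tier profiling varies by site — and the per-unit costs are the mid-range of the numbers above:

```python
sessions = 10_000                    # monthly visitors
video_share = 0.2                    # assumed: 20% of sessions profiled at video tier
video_cost, text_cost = 0.01, 0.001  # mid-range per-session LLM cost, USD
ranking_ticks = 12 * 30              # one ranking pass per 2h tick, for a month
ranking_tick_cost = 0.10             # mid-range per-tick cost, USD

profiling = sessions * (video_share * video_cost + (1 - video_share) * text_cost)
ranking = ranking_ticks * ranking_tick_cost
total = profiling + ranking          # 28 + 36 = 64 USD/month, inside the $30-80 band
```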

What can go wrong

The bugs we've actually shipped and fixed (so you don't have to):

  1. The LLM hallucinates rule values that don't match dossiers. Symptom: every cohort claims 0 users. Fix: inject a vocabulary block of actual sample tokens into the prompt and constrain the LLM to it.
  2. Cohort counts diverge from the visitor list. Symptom: /improve says "2 users", visitor list shows 1. Cause: forgot to apply the team's excluded_distinct_ids filter. Fix: always apply the same exclusion in every analytics endpoint.
  3. The LLM emits non-canonical JSON. Symptom: parser silently drops every clause. Fix: request structured output with response_format type json_schema and strict: true (set inside the json_schema object) — DeepSeek V4 + most providers on OpenRouter support it natively.

You'll hit at least one of these. They're not loop-killers, just bugs to file.

What's next

The loop closes today. What gets better in 2026 H2:

  • Native MCP server so Claude Code / Cursor / Codex call analytics tools without CLI piping — same data, less wiring
  • Auto-merge for paste-ready prompts with a confidence floor — Claude Code already supports it via post-tool hooks, the analytics layer just needs to flag confidence ≥ 90 patterns as auto-merge-eligible
  • Cross-site pattern transfer — a friction pattern Eyepup learned on one customer's checkout informs the next customer's checkout, faster

Build the loop now. The shape doesn't change; only the components get better.


Wire it up: npm i -g eyepup, eyepup login, eyepup install. First profile in two minutes; first paste-ready fix in an hour.