Multi-Agent Competitive Intel: 3 Sub-Agents Watching Sites, Ads, and Social, Producing a Weekly PDF Brief
A client pinged me on a Wednesday last March with a screenshot of a competitor's homepage. "When did they add this whole new pricing tier?" I scrolled through Wayback Machine for twenty minutes before I had a clean answer: roughly 11 days earlier, on a Tuesday afternoon. Three of us had been on Slack that day. None of us had noticed. By the time the screenshot landed in my inbox, that competitor's new tier had been live for nearly two weeks and our sales team was getting undercut in four deals we didn't know we were losing.
That was the week I built the weekly competitive intel pipeline. It has been running for 14 months, has produced 60 weekly briefs, and has caught every meaningful competitor move on the watchlist since week 3. The whole thing is three sub-agents and a dumb parent. The dumbness is the design.
The architecture in one paragraph
A parent Claude agent runs on a Friday 6:00 AM cron (a scheduled job — a clock-trigger that runs the workflow at a fixed time every week). It receives a list of five competitors from a competitors.json file in a Git repo. It dispatches three sub-agents in parallel — site-watcher, ad-watcher, social-watcher — each given the same list and a strictly different job. Each sub-agent returns a single JSON object conforming to a schema the parent validates. The parent merges the three JSON files into a single Markdown brief, runs it through Pandoc, and produces an 8-page PDF. The PDF is uploaded to a Notion page and a Slack message lands in #competitive-intel at 6:14 AM. The whole pipeline takes 8 to 14 minutes depending on how many competitors' sites have ad-heavy JavaScript. I read the brief on Monday morning over coffee. It is the first marketing artifact I open each week.
The interesting part is not the agents. It is the boundaries between them. None of the three sub-agents knows the others exist. None of them writes prose. None of them formats anything. Each one takes a list of names and a URL list and returns a JSON file. The parent knows nothing about scraping, ad libraries, or social APIs. It only knows how to merge three JSON files into a Markdown template. That is the design, and the design is deliberately dumb.
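The dispatch-and-merge boundary can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code: the three `run_*` functions are hypothetical stubs standing in for the real sub-agent calls, and `REQUIRED_KEYS` is an assumed top-level contract taken from the schemas shown later in this post.

```python
import asyncio

# Hypothetical stand-ins for the three sub-agent calls. In a real pipeline
# each would invoke a Claude sub-agent with its own prompt and tools.
async def run_site_watcher(competitors):
    return {"agent": "site-watcher", "run_id": "2026-03-12", "competitors": []}

async def run_ad_watcher(competitors):
    return {"agent": "ad-watcher", "run_id": "2026-03-12", "competitors": []}

async def run_social_watcher(competitors):
    raise RuntimeError("scraper rate-limited")  # simulate one agent failing

REQUIRED_KEYS = {"agent", "run_id", "competitors"}  # assumed top-level contract

async def dispatch(competitors):
    """Run all three sub-agents concurrently; one failure does not cancel the others."""
    agents = {
        "site": run_site_watcher,
        "ads": run_ad_watcher,
        "social": run_social_watcher,
    }
    results = await asyncio.gather(
        *(fn(competitors) for fn in agents.values()),
        return_exceptions=True,  # failures come back as values, not raised
    )
    merged, gaps = {}, []
    for name, result in zip(agents, results):
        if isinstance(result, Exception):
            gaps.append(f"{name}: {result}")
        elif not REQUIRED_KEYS.issubset(result):
            missing = sorted(REQUIRED_KEYS.difference(result))
            gaps.append(f"{name}: missing keys {missing}")
        else:
            merged[name] = result
    return merged, gaps

merged, gaps = asyncio.run(dispatch(["Acme Cloud"]))
print(sorted(merged), gaps)
```

The parent here never inspects what a sub-agent did, only whether its output arrived and has the expected top-level keys. Anything else becomes a named coverage gap rather than a silent hole.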
Why three sub-agents and not one mega-agent
I tried the "one Claude agent that does everything" version first. It was supposed to be elegant: one prompt, one tool belt (Playwright, Meta Ads Library, the social scraper), one output. It ran for two weeks. The outputs were useless in a specific way: the agent was spending its context window (the model's working memory, the text it can "see" at any one time) on formatting the JSON contract, which meant it was skimping on the actual intel. When the site-watcher sub-agent finds a pricing change, it has to do three things well: render the page, diff the snapshot, classify the change. It does not need to know what a hook angle is. It does not need to know what an exec summary looks like. Putting all three watchers' jobs in one prompt meant every job was being done with roughly 60% of the attention. Splitting them out meant each job got a full context window of its own.
The deeper reason is failure isolation. When the mega-agent failed, it failed opaquely: sometimes the social section was empty, sometimes the ads section was hallucinated, sometimes both. There was no way to tell which tool had broken. With three sub-agents, the parent gets three JSON files. If ads.json is missing the meta_ads field, I know the Meta API token expired and the rest of the brief is fine. If social.json returns an empty notable_posts array, I know the LinkedIn scraper hit a rate limit (a throttle: the platform cuts you off when you make too many requests too fast) and not to trust the empty section. The granularity of failure is the granularity of trust.
The third reason is cost. A long mega-agent prompt that includes all three tool descriptions and all three job descriptions burns 4,000 tokens on the system message alone. Three short sub-agent prompts burn 600 to 800 tokens each. On a weekly run that is not a big deal, but it is the right shape: agents that only do one thing are easier to maintain, easier to swap, and cheaper to call.
Sub-agent 1: Site-watcher
The site-watcher's job is to find what changed on the competitor's owned properties this week. It runs Playwright (a headless browser — a Chrome that runs without a visible window, used to script page interactions) against a fixed URL list per competitor — homepage, pricing page, top three product pages, careers page. It screenshots each one, computes a perceptual hash (a short fingerprint string that captures the visual "look" of an image — two near-identical pages produce near-identical hashes) of the screenshot, compares against last week's hash, and if the hash distance crosses a threshold, it pulls the page's full HTML and runs a structural diff against last week's snapshot using a Python diff library.
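The hash-then-diff gate might look roughly like this. It is a sketch under stated assumptions: the screenshot is assumed to be already downscaled to a tiny grayscale grid, a simple average hash stands in for whichever perceptual hash the pipeline uses, `THRESHOLD` is a made-up value, and the standard library's `difflib` (a line-level diff) stands in for the structural diff library.

```python
import difflib

def average_hash(pixels):
    """Average hash of a small grayscale image given as rows of 0-255 ints.

    Each bit is 1 when the pixel is brighter than the image mean.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(a, b):
    """Number of differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))

def html_diff(old_html, new_html):
    """Line-level diff of two HTML snapshots, keeping only added/removed lines."""
    lines = list(difflib.unified_diff(
        old_html.splitlines(), new_html.splitlines(), lineterm="", n=0
    ))
    # The first two lines are the "---" / "+++" file headers; skip them,
    # then drop the "@@" hunk markers.
    return [line for line in lines[2:] if line.startswith(("+", "-"))]

THRESHOLD = 2  # hypothetical: max differing bits before a page counts as changed

last_week = [[10, 10, 200, 200], [10, 10, 200, 200]]
this_week = [[10, 200, 200, 200], [10, 200, 200, 10]]
distance = hamming(average_hash(last_week), average_hash(this_week))

changes = []
if distance > THRESHOLD:
    # Only pay for the HTML diff when the screenshot actually looks different.
    changes = html_diff(
        "<li>Pro: $99/seat</li>\n<li>Enterprise: Contact sales</li>",
        "<li>Pro: $99/seat</li>\n<li>Team: $49/seat (min 5)</li>\n"
        "<li>Enterprise: Contact sales</li>",
    )
print(distance, changes)
```

The cheap visual check runs on every URL, and the more expensive HTML diff runs only on pages that cross the threshold.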
What it returns is a single JSON file with this schema:
```json
{
  "agent": "site-watcher",
  "run_id": "2026-03-12",
  "competitors": [
    {
      "name": "Acme Cloud",
      "urls_watched": ["https://acme.com", "https://acme.com/pricing"],
      "changes": [
        {
          "url": "https://acme.com/pricing",
          "change_type": "pricing_update",
          "severity": "high",
          "summary": "New 'Team' tier at $49/seat added between Pro and Enterprise",
          "before_excerpt": "Pro: $99/seat ... Enterprise: Contact sales",
          "after_excerpt": "Pro: $99/seat ... Team: $49/seat (min 5) ... Enterprise: Contact sales",
          "detected_at": "2026-03-11T14:22:00Z"
        }
      ],
      "no_change_urls": ["https://acme.com"]
    }
  ]
}
```
The prompt is short. The whole thing is about 600 tokens:
You are a competitive site-watcher. For each competitor in the list, render the provided URLs, compare against last week's snapshot, and return a JSON object conforming exactly to this schema:
{...}. For each change, classify the change_type as one of: pricing_update, new_feature, copy_shift, layout_change, unknown. Set severity to high if the change affects pricing, packaging, or a top-of-funnel page; medium for product page changes; low for footer / nav / careers. Do not invent changes. If the page failed to render, return error for that URL with the failure reason. No prose, no markdown, no commentary outside the JSON.
Two details matter. The first is the explicit error field — when Playwright hits a layout the scraper has never seen, the sub-agent does not pretend the page was unchanged. It returns an error per URL, the parent flags the brief with a coverage gap, and the human reading the brief knows to spot-check that competitor manually. The second is the severity rules. The model is otherwise too liberal with high. Without an explicit rule it flags every footer change as high.
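A parent-side check for these two details could be sketched as follows. The allowed values come from the prompt above; the shape of the error entries (a per-competitor `errors` list of `{"url", "error"}` objects) is an assumption, since the schema excerpt does not show where the error lands.

```python
CHANGE_TYPES = {"pricing_update", "new_feature", "copy_shift", "layout_change", "unknown"}
SEVERITIES = {"high", "medium", "low"}

def check_site_json(doc):
    """Return (problems, coverage_gaps) for one site-watcher result.

    problems: contract violations the parent should refuse to render.
    coverage_gaps: URLs that failed to render and need a manual spot-check.
    """
    problems, gaps = [], []
    for comp in doc.get("competitors", []):
        name = comp.get("name", "<unnamed>")
        for change in comp.get("changes", []):
            if change.get("change_type") not in CHANGE_TYPES:
                problems.append(f"{name}: bad change_type {change.get('change_type')!r}")
            if change.get("severity") not in SEVERITIES:
                problems.append(f"{name}: bad severity {change.get('severity')!r}")
        # Assumed shape: render failures arrive as {"url": ..., "error": ...}.
        for err in comp.get("errors", []):
            gaps.append(f"{name}: {err['url']} ({err['error']})")
    return problems, gaps

sample = {
    "agent": "site-watcher",
    "competitors": [{
        "name": "Acme Cloud",
        "changes": [
            {"change_type": "pricing_update", "severity": "high"},
            {"change_type": "redesign", "severity": "urgent"},
        ],
        "errors": [{"url": "https://acme.com/careers", "error": "render timeout"}],
    }],
}
problems, gaps = check_site_json(sample)
print(problems, gaps)
```

Keeping contract violations separate from coverage gaps matters: the first means the sub-agent output cannot be trusted, the second means it was honest about what it could not see.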
Sub-agent 2: Ad-watcher
The ad-watcher's job is to scan the Meta Ads Library for every active ad each competitor is running. It pulls the list of active ads, classifies the ad copy by hook angle and offer type, and reports the ad count per competitor week-over-week. The ad count is the load-bearing signal — Meta does not publish spend data, but a 40% jump in active ad count is a strong proxy for a spend increase.
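The week-over-week figure is plain percentage change. A small sketch, with the function name and the zero-baseline handling as my own assumptions:

```python
def ad_count_delta_pct(previous, current):
    """Week-over-week change in active ad count, as a percentage.

    Returns None when last week's count was zero, since a percentage
    change from zero is undefined.
    """
    if previous == 0:
        return None
    return round((current - previous) * 100 / previous, 1)

print(ad_count_delta_pct(30, 42))  # 30 ads last week, 42 this week
```

Going from 30 to 42 active ads is exactly the 40% jump the text treats as a spend signal.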
The schema:
```json
{
  "agent": "ad-watcher",
  "run_id": "2026-03-12",
  "competitors": [
    {
      "name": "Acme Cloud",
      "active_ad_count": 47,
      "ad_count_delta_pct": 38.5,
      "new_ads": [
        {
          "ad_id": "abc123",
          "first_seen": "2026-03-09",
          "hook_angle": "fear_of_missing_out",
          "offer_type": "free_trial_extension",
          "copy_excerpt": "Q2 is closing. Extend your trial another 30 days, no card.",
          "platform": "instagram_feed"
        }
      ],
      "hook_distribution": {
        "fear_of_missing_out": 12,
        "social_proof": 8,
        "feature_callout": 15,
        "price_anchor": 7,
        "other": 5
      },
      "low_confidence": false
    }
  ]
}
```
The interesting field is low_confidence. The Meta Ads Library returns very little data for small competitors: sometimes 3 ads total, sometimes only ads that ran more than 30 days ago. The sub-agent is told explicitly: if you have fewer than 10 active ads for a competitor, set low_confidence: true and zero out the hook_distribution. The parent then renders that competitor's ad section as "low confidence, see raw data" instead of pretending the distribution is meaningful. Inventing a hook distribution out of three ads is the most common failure mode of competitive ad analysis. The schema forces honesty.
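The same rule can be enforced a second time on the parent side, so that a sub-agent that forgets the instruction still cannot smuggle a distribution through. A sketch, assuming "zero out" means setting every count to 0; both function names are hypothetical.

```python
MIN_ADS = 10  # threshold from the sub-agent instruction above

def enforce_low_confidence(competitor):
    """Return a copy with thin data flagged and its distribution zeroed."""
    result = dict(competitor)
    if result.get("active_ad_count", 0) < MIN_ADS:
        result["low_confidence"] = True
        result["hook_distribution"] = {
            hook: 0 for hook in result.get("hook_distribution", {})
        }
    return result

def render_ad_section(competitor):
    """Render hook shares, or the fallback line when the data is too thin."""
    dist = competitor.get("hook_distribution", {})
    total = sum(dist.values())
    if competitor.get("low_confidence") or total == 0:
        return "low confidence, see raw data"
    return ", ".join(f"{hook}: {count / total:.0%}" for hook, count in dist.items())

small = {
    "name": "Tiny Co",
    "active_ad_count": 3,
    "hook_distribution": {"social_proof": 2, "other": 1},
    "low_confidence": False,
}
fixed = enforce_low_confidence(small)
print(render_ad_section(fixed))
```

With three ads, the section renders as the fallback line no matter what distribution the model produced.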
Sub-agent 3: Social-watcher
The social-watcher's job is the sloppiest of the three and the one I trust least. It scrapes the last 14 days of LinkedIn posts and tweets for each competitor, classifies the post theme, and reports any shifts in the topic mix.
```json
{
  "agent": "social-watcher",
  "run_id": "2026-03-12",
  "competitors": [
    {
      "name": "Acme Cloud",
      "platforms_scanned": ["linkedin", "twitter"],
      "post_count": 23,
      "theme_distribution": {
        "product_launch": 4,
        "thought_leadership": 11,
        "customer_story": 3,
        "hiring": 5
      },
      "theme_shift_vs_last_run": {
        "thought_leadership": 6.0,
        "product_launch": -2.0
      },
      "notable_posts": [
        {
          "url": "https://linkedin.com/posts/acme-...",
          "date": "2026-03-08",
          "theme": "thought_leadership",
          "excerpt": "We rebuilt our entire billing system. Here's what we learned about event-driven architecture...",
          "engagement": { "likes": 1240, "comments": 87 }
        }
      ],
      "data_quality_note": "LinkedIn full-post scrape throttled at 14 days for 2 of 5 competitors"
    }
  ]
}
```
The data_quality_note is the load-bearing field. The LinkedIn scraper has a 14-day lookback limit on full-post content even with authentication. Posts older than that come back as truncated previews. The sub-agent is told to flag this in the note, not to silently work with truncated data. The parent then renders the social section with a footnote. Better to know the data is partial than to think you have a complete picture built on truncated fragments.
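The post does not say what unit theme_shift_vs_last_run uses. One reading that fits the sample numbers is a raw post-count delta per theme against the previous run, and the sketch below assumes exactly that. The function name and the choice to omit unchanged themes are also assumptions.

```python
def theme_shift(current, previous):
    """Per-theme change in post count versus the previous run.

    Assumes the shift is a raw count delta; themes with no change are omitted.
    """
    themes = set(current) | set(previous)
    deltas = {t: float(current.get(t, 0) - previous.get(t, 0)) for t in themes}
    return {t: d for t, d in deltas.items() if d != 0}

this_run = {"product_launch": 4, "thought_leadership": 11, "customer_story": 3, "hiring": 5}
last_run = {"product_launch": 6, "thought_leadership": 5, "customer_story": 3, "hiring": 5}
shift = theme_shift(this_run, last_run)
print(shift)
```

If the real field is in percentage points instead, the same shape applies with each count divided by its run's post_count first.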
The parent agent and the 8-page PDF
The parent agent is the dumbest piece. It does no scraping, no classification, no synthesis beyond merging the three JSON files into a Markdown template. The full prompt is shorter than the sub-agent prompts:
You are a competitive intel brief writer. You will receive three JSON files: site.json, ads.json, social.json. Each conforms to a strict schema. Your only job is to:
- Read all three.
- Write page 1: a 200-word exec summary naming the most important change per competitor and the one recommended action for our team this week.
- Write pages 2–6: one page per competitor, with a fixed structure: "What changed on their site", "What they are advertising", "What they are posting about", "What this means for us" (3–5 sentences).
- Write page 7: a cross-competitor "this week's themes" page (e.g. "3 of 5 launched a new free-trial extension this week").
- Write page 8: the recommended actions list, ranked by impact.
Use the provided Markdown template. Do not invent data. If a section is empty, write "No data this week"; do not pad. Return only Markdown, no commentary.
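The "do not pad" rule is simple enough that the data-bearing sections could also be assembled outside the model. The sketch below is one way to do that, not how the pipeline works today: the heading markup and both function names are hypothetical, and the "What this means for us" section is left out because it needs the model's judgment.

```python
def section(title, lines):
    """One fixed-structure section; empty input yields the literal fallback."""
    body = "\n".join(f"- {line}" for line in lines) if lines else "No data this week"
    return f"## {title}\n\n{body}"

def competitor_page(name, site, ads, social):
    """Assemble one competitor page from that competitor's three JSON entries.

    Each argument is the matching object from a sub-agent's "competitors"
    array, or None when that sub-agent returned nothing for this competitor.
    """
    site_lines = [c["summary"] for c in (site or {}).get("changes", [])]
    ad_lines = [a["copy_excerpt"] for a in (ads or {}).get("new_ads", [])]
    social_lines = [p["excerpt"] for p in (social or {}).get("notable_posts", [])]
    parts = [
        f"# {name}",
        section("What changed on their site", site_lines),
        section("What they are advertising", ad_lines),
        section("What they are posting about", social_lines),
    ]
    return "\n\n".join(parts)

page = competitor_page(
    "Acme Cloud",
    site={"changes": [{"summary": "New 'Team' tier at $49/seat"}]},
    ads=None,                      # ad-watcher returned nothing
    social={"notable_posts": []},  # social-watcher found no notable posts
)
print(page)
```

A missing file and an empty array both render as the same fallback line, so the page keeps its fixed structure either way.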
The Markdown template is the part that gets Pandoc-fied into a PDF. It is a 110-line file with header sizes, page break hints, and a fixed cover page. The Pandoc command is one line:
```bash
pandoc brief.md \
  --pdf-engine=xelatex \
  --template=template.tex \
  -V geometry:margin=1in \
  -V fontsize=10pt \
  -o brief-2026-03-12.pdf
```
The template.tex is where the visual identity lives: V3-style geometric header, monospace date in the corner, accent color on section rules. I lifted it from the blog's design system and tweaked the type sizes for print.
The 8 pages map exactly: 1 cover + exec summary, 5 per-competitor, 1 themes, 1 recommendations. If we ever go to 7 competitors, the brief becomes 10 pages and the template stretches. The parent is told the page count, not the template details, so the structure is robust to small changes in the watchlist.
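The page arithmetic and the Pandoc invocation are easy to derive from the watchlist size and the run date. A sketch, with both function names hypothetical; the flags are copied from the command above.

```python
def brief_page_count(n_competitors):
    """1 cover + exec summary, one page per competitor, 1 themes, 1 recommendations."""
    return 1 + n_competitors + 1 + 1

def pandoc_args(run_id):
    """Argument list for the Pandoc call, with the output file named by run date."""
    return [
        "pandoc", "brief.md",
        "--pdf-engine=xelatex",
        "--template=template.tex",
        "-V", "geometry:margin=1in",
        "-V", "fontsize=10pt",
        "-o", f"brief-{run_id}.pdf",
    ]

print(brief_page_count(5), brief_page_count(7))
print(" ".join(pandoc_args("2026-03-12")))
```

The list form can be handed to `subprocess.run(args, check=True)` so that a failed PDF build raises instead of shipping a stale file.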
What has actually broken in 14 months
A competitor rebuilt their pricing page as a single-page React app. The Playwright snapshot rendered fine (the perceptual hash was identical, the page looked the same), but the actual pricing data was behind a hydration step (the moment when a server-rendered HTML page is "woken up" by JavaScript and made interactive). The sub-agent returned no_change_urls for the pricing page, the brief said "no pricing changes this week", and the next week's run caught a $30/seat jump on the new "Growth" tier. I lost a week. The fix was a 200ms waitForSelector before the screenshot. Now every site-watcher run has a 200ms wait per URL, which across 5 competitors and 6 URLs each costs about 6 seconds total. Worth it.
Meta's Ads Library API rate-limited the sub-agent in week 11. The ad-watcher returned 5 of 5 competitors with active_ad_count: 0 because the entire run hit the rate limit at minute 8. The schema's low_confidence: true field caught this — the parent rendered the ad section as "low confidence — see raw data" and the brief still shipped. Without the confidence flag, the brief would have claimed "all 5 competitors paused all advertising this week", which is a five-alarm fire and also false. The rate-limit error was the actual story; the schema let us tell it.
The LinkedIn scraper silently changed its auth requirements in week 23. The sub-agent returned an empty notable_posts array for 4 of 5 competitors and a partial array for the fifth. Once again a schema field caught it: the data_quality_note flagged the gap, and the parent rendered the social section with a coverage footnote. The brief still shipped. I fixed the auth the next day. The week's social coverage was a known unknown.
A competitor changed their homepage copy from "AI-powered" to "agentic" in week 31. The site-watcher flagged it as copy_shift with severity: low. The parent's exec summary did not mention it. I read the brief, did not agree with the deprioritization, and added an explicit rule: any copy_shift containing the strings "AI", "agent", "automation", or "GPT" should be severity: high. The next run caught the same competitor's blog pivot to "agentic workflows" and surfaced it on the exec summary page.
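That override is easy to apply deterministically after the sub-agent returns. A sketch under stated assumptions: the keywords are matched as whole words or word prefixes (so "agentic" hits but "said" does not trigger on "AI"), and the rule scans the summary and both excerpts. Both choices are mine, not the post's.

```python
import re

# Keywords from the rule above. \b keeps "AI" from matching inside words
# like "said"; "agent\w*" also catches "agentic" and "agents".
KEYWORDS = re.compile(r"\b(AI|GPT|agent\w*|automation)\b", re.IGNORECASE)

def apply_copy_shift_rule(change):
    """Promote a copy_shift to high severity when it mentions a watched keyword."""
    if change.get("change_type") != "copy_shift":
        return change
    text = " ".join(
        change.get(key, "") for key in ("summary", "before_excerpt", "after_excerpt")
    )
    if KEYWORDS.search(text):
        return {**change, "severity": "high"}
    return change

homepage = {
    "change_type": "copy_shift",
    "severity": "low",
    "summary": "Hero headline reworded",
    "before_excerpt": "AI-powered billing",
    "after_excerpt": "agentic billing",
}
footer = {
    "change_type": "copy_shift",
    "severity": "low",
    "summary": "Footer copyright year updated, said the diff",
}
print(apply_copy_shift_rule(homepage)["severity"], apply_copy_shift_rule(footer)["severity"])
```

Running the rule in code rather than in the prompt means it still applies on a week when the model ignores the instruction.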
The honest accounting
14 months. 60 weekly briefs. 312 sub-agent runs (60 weeks × 3 agents = 180 scheduled runs, plus retries). One postmortem-worthy miss in the first month. Total Anthropic spend: $84. Total Pandoc + Playwright + Apify cost: about $0.20 per brief. Total human time spent: about 20 minutes per week reading the brief, 30 minutes per month maintaining the watchlist and tuning prompts. Time saved vs. the manual scan: roughly 6 hours per week. The week the competitor added the new tier I would have missed? That alone paid for the whole pipeline through 2027.
The thing the brief is best at is not the big moves. Those usually get caught by a human at some point. The brief is best at the small moves (a 7% price cut, a renamed feature, a 40% jump in active ad count) that humans skim past on a manual scan but that compound over a quarter. By month 4, I was making better pricing decisions than I had been before the pipeline existed, and most of that was a function of seeing every competitor move every week, not a function of being smarter.
The thing I would change: I would put the parent prompt in version control with the rest of the pipeline, and I would add a prompts/ directory and store the three sub-agent prompts as separate files. Right now they live in the same n8n workflow JSON as the rest of the configuration. Editing them is a deploy. That is fine for one pipeline. It is wrong for three.
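For anyone who wants the totals behind those figures, the arithmetic works out as below. It uses only the numbers quoted above and assumes the 6 hours per week is a gross saving, before subtracting reading and maintenance time.

```python
WEEKS = 60
MONTHS = 14

anthropic_spend = 84.00   # total, USD
tooling_per_brief = 0.20  # Pandoc + Playwright + Apify, USD per brief

# Average all-in cost of one weekly brief.
cost_per_brief = round(anthropic_spend / WEEKS + tooling_per_brief, 2)

# Hours saved by not scanning manually, and hours spent reading and maintaining.
hours_saved = 6 * WEEKS
hours_spent = (20 * WEEKS + 30 * MONTHS) / 60  # minutes converted to hours
net_hours = hours_saved - hours_spent

print(cost_per_brief, hours_saved, hours_spent, net_hours)
```

That is about $1.60 per brief and roughly 333 net hours over the 14 months.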
If you build this, start with the JSON contracts. Get the schemas right first. The prompts can be mediocre and the pipeline will still work. The schemas are the contract — when they are loose, the parent invents; when they are strict, the parent tells the truth. Spend the afternoon on the schemas. The prompts are a Tuesday.