skills/refurbish-demos/SKILL.md
Upgrade demo agent knowledge — clears old summaries, scrapes real pages via MCP Tavily (advanced mode), falls back to WebFetch, fixes icons, republishes. Pass campaignId or 'all-sent' as parameter.
Upgrade demo agent knowledge from old AI-written summaries to properly scraped page content with real sourceUrls. Each knowledge item becomes a URL-type entry with the actual page URL, so the chatbot can cite specific pages in its answers.
Parameter: campaignId (CRM campaign ID) or all-sent (all demos where outreach was sent)
Announce:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Refurbish Demos — upgrading knowledge ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Read from .env in the working directory:
- NUXT_MCP_DEMO_TOKEN: Bearer token for InboxMate MCP (contains a # character; use curl, not Python urllib)
- PSQUARED_CRM_TOKEN: Bearer token for Twenty CRM
- EMAIL_DRAFT_ONLY_BEARER: Bearer token for notification service (used in Option B: all-sent)

MCP endpoint: https://app.psquared.dev/api/mcp
CRM endpoint: https://crm.psquared.dev/graphql
Demo API: https://app.psquared.dev/api/demo/{demoId}
Supabase project: fevtfywriufbqnvbgyrm (for direct DB queries when needed)
Demo account ID: 8942a6e5-91cb-4c5d-8ef5-98cfe7945620
All MCP calls use this pattern (use curl, NOT Python urllib — the token contains #):
curl -s --max-time 120 -X POST https://app.psquared.dev/api/mcp \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $NUXT_MCP_DEMO_TOKEN" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"TOOL_NAME","arguments":{...}}}'
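The request body in the pattern above is plain JSON-RPC 2.0, so it can be generated rather than hand-escaped. A minimal sketch, assuming a hypothetical helper named build_tool_call (not part of the skill); it only builds the string that would be passed to curl -d and makes no HTTP request itself:

```python
import json


def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 tools/call request body (hypothetical helper)."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(payload)


# Example: the body for a get_agent call; AGENT_ID is a placeholder.
body = build_tool_call("get_agent", {"agentId": "AGENT_ID"})
```

Writing the resulting string to a file and passing it with curl -d @file avoids shell-quoting problems in the payload.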
| Tool | Purpose |
|---|---|
| cleanup_agent | Wipe all knowledge from agent's buckets + delete orphaned buckets in demo account |
| clear_bucket | Remove all items from a specific bucket |
| scrape_and_build_knowledge | Scrape URLs via Tavily (advanced mode), create URL-type knowledge items. Max 10 URLs per call. Returns { created: [], failed: [] } |
| add_to_bucket | Manually add knowledge item. If sourceUrl is provided, creates URL-type entry |
| get_agent | Get agent config (check knowledgeBucketIds, buttonIcon) |
| list_bucket_items | List items in a bucket (verify results) |
| update_widget_style | Fix buttonIcon or other widget config |
| publish_agent | Republish agent (required after knowledge changes) |
Announce:
[1/4] Building work list...
Option A (campaignId): query the CRM for opportunities in the campaign:
curl -s -X POST https://crm.psquared.dev/graphql \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $PSQUARED_CRM_TOKEN" \
-d "{\"query\":\"{ opportunities(filter: { campaignId: { eq: \\\"CAMPAIGN_ID\\\" } }, first: 150) { edges { node { id name demoStatus demoUrl { primaryLinkUrl } company { id name domainName { primaryLinkUrl } } } } } }\"}"
Option B (all-sent): get sent email drafts from the notification service, extract demo IDs, then look up agents:
curl -s "https://notifications.psquared.dev/drafts?status=SENT&draftType=outreach&pageSize=200" \
-H "Authorization: Bearer $EMAIL_DRAFT_ONLY_BEARER"
Extract demoUrl from each draft's variables, parse the ?id= param to get demoId.
curl -s https://app.psquared.dev/api/demo/DEMO_ID
# Returns: { agentId, companyName, companyDomain, ... }
Then get bucket ID:
{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"get_agent","arguments":{"agentId":"AGENT_ID"}}}
Extract knowledgeBucketIds[0].
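For the all-sent path, pulling the demoId out of a draft's demoUrl could look like the sketch below. The function name extract_demo_id and the example URL are hypothetical. Only urllib.parse is used, which parses strings and sends no request, so the curl-only rule for authenticated calls is unaffected.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_demo_id(demo_url: str) -> Optional[str]:
    """Return the value of the ?id= query parameter, or None if it is missing."""
    query = parse_qs(urlparse(demo_url).query)
    values = query.get("id")
    return values[0] if values else None


# Hypothetical demo URL shape, used only for illustration.
demo_id = extract_demo_id("https://example.com/demo?id=abc123")
```

Drafts for which this returns None have no usable demoUrl and can be left out of the work list.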
Filter out: demoStatus = SKIP_*, DISQUALIFIED. Skip demos without companyDomain.
Build: [ { companyName, companyDomain, agentId, bucketId, demoId } ]
Announce: Found N demos to refurbish.
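The filter rules and the work-list shape above can be sketched as follows. This assumes each candidate has already been flattened into a dict carrying the five work-list keys plus demoStatus; the function names and the sample status SENT are hypothetical.

```python
WORK_LIST_KEYS = ("companyName", "companyDomain", "agentId", "bucketId", "demoId")


def should_refurbish(demo: dict) -> bool:
    """Apply the filter: drop SKIP_* and DISQUALIFIED, and demos without a domain."""
    status = demo.get("demoStatus") or ""
    if status.startswith("SKIP_") or status == "DISQUALIFIED":
        return False
    return bool(demo.get("companyDomain"))


def build_work_list(demos: list) -> list:
    """Keep only the work-list fields of the demos that pass the filter."""
    return [{key: demo.get(key) for key in WORK_LIST_KEYS}
            for demo in demos if should_refurbish(demo)]


candidates = [
    {"companyName": "A", "companyDomain": "a.de", "agentId": "1",
     "bucketId": "b1", "demoId": "d1", "demoStatus": "SENT"},
    {"companyName": "B", "companyDomain": "b.de", "agentId": "2",
     "bucketId": "b2", "demoId": "d2", "demoStatus": "SKIP_NO_SITE"},
    {"companyName": "C", "companyDomain": "", "agentId": "3",
     "bucketId": "b3", "demoId": "d3", "demoStatus": "SENT"},
]
work_list = build_work_list(candidates)
print(f"Found {len(work_list)} demos to refurbish.")
```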
Announce:
[2/4] Discovering pages to scrape...
For each company, discover which pages actually exist before scraping. Never guess URLs blindly — generic patterns like /kontakt/, /ueber-uns/, /leistungen/ fail on 60%+ of sites (e-commerce, single-page, non-standard CMS). This produces "Seite nicht gefunden" knowledge items that pollute RAG.
WebFetch https://www.{domain}/ (or https://{domain}/ if www fails). Extract:
From the discovered links, pick URLs that give the chatbot useful knowledge:
Priority order:
Rules:
- Skip technical and asset URLs: /wp-json/, /xmlrpc.php, /feed/, /favicon.ico, apple-touch-icon, .webp, .ico, .css, .js

Tavily advanced mode takes ~5-30s per URL. A single call with 5+ URLs hits server timeouts (120s). Always split into batches of max 3 URLs per MCP call.
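Filtering and batching the discovered links might be sketched like this. It assumes the pattern list above is a skip list (path fragments and file suffixes to exclude); the function names are hypothetical, and the default batch size of 3 follows the rule above.

```python
EXCLUDED_FRAGMENTS = ("/wp-json/", "/xmlrpc.php", "/feed/", "/favicon.ico", "apple-touch-icon")
EXCLUDED_SUFFIXES = (".webp", ".ico", ".css", ".js")


def is_scrapable(url: str) -> bool:
    """Reject technical endpoints and static assets (assumed skip list)."""
    path = url.split("?", 1)[0].split("#", 1)[0].lower()
    if any(fragment in path for fragment in EXCLUDED_FRAGMENTS):
        return False
    return not path.endswith(EXCLUDED_SUFFIXES)


def batch_urls(urls: list, size: int = 3) -> list:
    """Drop excluded URLs, then split the rest into batches of at most `size`."""
    kept = [url for url in urls if is_scrapable(url)]
    return [kept[i:i + size] for i in range(0, len(kept), size)]


discovered = [
    "https://www.example.de/",
    "https://www.example.de/feed/",
    "https://www.example.de/a",
    "https://www.example.de/b",
    "https://www.example.de/style.css",
    "https://www.example.de/c",
]
batches = batch_urls(discovered)
```

Each inner list would become the urls argument of one scrape_and_build_knowledge call.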
If the homepage can't be fetched (timeout, blocking), fall back to these common patterns but expect failures and don't count on them:
https://www.{domain}/
https://www.{domain}/kontakt/
https://www.{domain}/ueber-uns/
https://www.{domain}/impressum/
Announce:
[3/4] Rebuilding knowledge for N demos...
Process one company at a time:
{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"cleanup_agent","arguments":{"agentId":"AGENT_ID"}}}
This clears the agent's buckets AND deletes all orphaned buckets in the demo account (from old demo creation flow). First call handles the global orphan cleanup.
Send discovered URLs to scrape_and_build_knowledge in batches of max 3:
{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"scrape_and_build_knowledge","arguments":{"bucketId":"BUCKET_ID","urls":["URL1","URL2","URL3"]}}}
The MCP tool automatically:
- reports not-found pages in the failed array with reason 404/not-found page detected
- sets sourceUrl on created items (enables chat citations)

Check the created and failed arrays in the response after each batch.
For each failed URL, check the failure reason:
| Reason | Action |
|--------|--------|
| 404/not-found page detected | Skip — page doesn't exist. Don't retry. |
| Content too short | Skip — page has no useful content. |
| Timeout or Failed to scrape | Retry via WebFetch (Tavily couldn't reach the site). |
| Content must not exceed | Shouldn't happen with 50K limit. If it does, WebFetch and trim. |
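The table above can be turned into a small dispatch function. This is a sketch under assumptions: the exact reason strings returned by the tool are not documented here, so the code matches case-insensitively on the substrings from the table, and the action labels are hypothetical.

```python
def action_for_failure(reason: str) -> str:
    """Map a failure reason to the action from the table (substring match is an assumption)."""
    text = reason.lower()
    if "404" in text or "not-found" in text:
        return "skip"  # page does not exist, do not retry
    if "content too short" in text:
        return "skip"  # page has no useful content
    if "content must not exceed" in text:
        return "webfetch_and_trim"
    if "timeout" in text or "failed to scrape" in text:
        return "webfetch_retry"  # Tavily could not reach the site
    return "review"  # unknown reason, inspect manually


# Assumed shape of one entry in the failed array.
failed_entry = {"url": "https://www.example.de/kontakt/", "reason": "404/not-found page detected"}
action = action_for_failure(failed_entry["reason"])
```

Unknown reasons fall through to a manual review label instead of being silently skipped.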
WebFetch fallback for timeouts/scrape failures:
Call add_to_bucket with cleaned content + sourceUrl:
{"jsonrpc":"2.0","id":4,"method":"tools/call","params":{"name":"add_to_bucket","arguments":{"bucketId":"BUCKET_ID","title":"Page Title","content":"[full cleaned page text]","sourceUrl":"https://domain.de/page/"}}}
Some sites block Tavily's scraper. If every URL times out or fails:
- Add the pages manually via add_to_bucket with sourceUrl

Content rules for manual WebFetch entries:
- sourceUrl MUST be the actual page URL

After all scraping + fallbacks, check:
Valid icons: messageCircle, messageSquare, sparkles, support, help, inboxmate, heart, zap, globe, wave, brain, lightbulb, compass, star, shield, robot, mascot
Any other value (e.g. shoppingBag, truck, home, car, music) renders as a broken circle in the widget. Fix to messageCircle:
{"jsonrpc":"2.0","id":5,"method":"tools/call","params":{"name":"update_widget_style","arguments":{"agentId":"AGENT_ID","buttonIcon":"messageCircle"}}}
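The icon check reduces to a set membership test. A minimal sketch, with the set copied from the list of valid icons above; the function name icon_needs_fix is hypothetical:

```python
VALID_ICONS = {
    "messageCircle", "messageSquare", "sparkles", "support", "help", "inboxmate",
    "heart", "zap", "globe", "wave", "brain", "lightbulb", "compass", "star",
    "shield", "robot", "mascot",
}


def icon_needs_fix(button_icon) -> bool:
    """True when the configured buttonIcon would render as a broken circle."""
    return button_icon not in VALID_ICONS


# A value read from get_agent would be checked before calling update_widget_style.
needs_fix = icon_needs_fix("shoppingBag")
```

A missing icon (None) is also reported as needing a fix in this sketch; whether the widget has a working default in that case is not stated above.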
Republish the agent (required after knowledge changes):
{"jsonrpc":"2.0","id":6,"method":"tools/call","params":{"name":"publish_agent","arguments":{"agentId":"AGENT_ID"}}}
Log per company:
✓ {companyName}: {N} items created, {M} failed → republished
Announce:
[4/4] Refurbish complete.
Print summary table:
| Company | Domain | Items | Failed | Icon Fixed | Status |
|---------|--------|-------|--------|------------|--------|
| Schuh Marke GmbH | schuh-marke.de | 5 | 0 | yes | ✓ |
| Mainfilm | mainfilm.tv | 5 | 0 | no | ✓ |
Totals:
For large batches (100+ demos), use the Python batch script instead of processing one-by-one from Claude:
python3 -u scripts/refurbish-all.py < /tmp/worklist.json > /tmp/refurbish.log 2>&1
The script at claude-overlord-folder/scripts/refurbish-all.py:
- handles the www. prefix correctly (no www.www. duplication)
- handles the # in the auth token
- expects a work list of the form [{company_name, company_domain, agent_id, bucket_id}]

Limitation: The script does NOT do WebFetch fallback; it only uses MCP scrape_and_build_knowledge (Tavily). If Tavily fails for a site, the agent will be left with fewer items. After a batch run, check the log for agents with 0-2 items created and fix those manually with WebFetch.
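The www. handling mentioned above might look like the following sketch. It is not taken from refurbish-all.py; the function name homepage_url and the normalization steps are assumptions that illustrate how a www.www. duplication can be avoided.

```python
def homepage_url(domain: str) -> str:
    """Build the https://www.{domain}/ homepage URL without doubling the www. prefix."""
    host = domain.strip().lower()
    for scheme in ("https://", "http://"):
        if host.startswith(scheme):
            host = host[len(scheme):]
    host = host.strip("/")
    if not host.startswith("www."):
        host = "www." + host
    return f"https://{host}/"


# Both spellings of a domain resolve to the same homepage URL.
url = homepage_url("www.mainfilm.tv")
```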
Tavily extract_depth: 'advanced' takes up to 30s per URL. Batching 5 URLs in one MCP call can take 5 × 30s = 150s, which exceeds the 120s server timeout. Always split into batches of 2-3 URLs.
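The batch-size rule follows from simple worst-case arithmetic, sketched below. It assumes the URLs in one call are scraped one after another, which is what the 150s figure implies; the function names are hypothetical.

```python
PER_URL_SECONDS = 30   # worst case per URL in advanced mode
SERVER_TIMEOUT = 120   # server timeout in seconds


def worst_case_seconds(url_count: int) -> int:
    """Worst-case duration of one MCP call, assuming sequential scraping."""
    return url_count * PER_URL_SECONDS


def fits_timeout(url_count: int) -> bool:
    """True only when the worst case stays strictly below the server timeout."""
    return worst_case_seconds(url_count) < SERVER_TIMEOUT


safe_sizes = [n for n in range(1, 11) if fits_timeout(n)]
```

Four URLs land exactly on the 120s limit in the worst case, which is why the guidance stops at 3.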
If the primary Tavily key runs out, set NUXT_TAVILY_FALLBACK_API_KEY in the agenthub .env. The switch to the fallback key happens automatically on 429/402 errors.
If 404 pages end up in knowledge (from fallback URL guessing or site changes), they DO hurt RAG — the chatbot may cite non-existent pages or give confused answers. Phase 2 now discovers real URLs first to prevent this. If you see 404 items after a refurbish, the homepage WebFetch likely failed and fell back to guessed URLs — investigate and re-run with manual URL discovery.
Old demo creation flow sometimes linked wrong-named buckets to agents (e.g. "Autohaus Freier Wissensbasis" on a Hinzmann Elektrotechnik agent). The content is correct after refurbish — the bucket name is cosmetic. Can be fixed via SQL if needed.
Old demo creation left ~160 orphaned "Wissensdatenbank" buckets not linked to any agent. cleanup_agent deletes these. For large batches, clean up via SQL first (faster):
DELETE FROM knowledge_bucket_chunks WHERE bucket_item_id IN (
SELECT kbi.id FROM knowledge_bucket_items kbi
JOIN knowledge_buckets kb ON kb.id = kbi.bucket_id
WHERE kb.account_id = '8942a6e5-91cb-4c5d-8ef5-98cfe7945620'
AND kb.id NOT IN (SELECT unnest(knowledge_bucket_ids) FROM agents WHERE account_id = '8942a6e5-91cb-4c5d-8ef5-98cfe7945620')
);
-- Then delete items, then buckets (same pattern)
Knowledge entries allow up to 50,000 characters (was 20K, bumped 2026-03-28). Tavily advanced mode returns full pages including nav/footer HTML — the extra headroom prevents truncation of actual content. Pages hitting the limit likely have nav pollution but the important content still fits.
Some pages load data via JavaScript (store locators, interactive maps, product configurators). Tavily can't extract this content. If a page returns mostly nav/boilerplate with no real data, WebFetch it manually, extract the data from the page description or other sources, and add via add_to_bucket with sourceUrl.
An audit for wrong-company knowledge found only 1 true case among 152 sent demos (the Dachdeckermeister Hoffmann agent held DBM Metallbau content). Refurbish fixes this by clearing the knowledge and re-scraping based on the correct domain from the demo page.