skills/pi/SKILL.md
Delegate a bulk-work subtask to the local Qwen via a one-shot pi run. Use when the subtask is high-volume but low-complexity (file scans, log parsing, large-text summaries, repetitive transforms), so that it does not burn parent-model tokens.
Run a one-shot pi -p against the local Qwen model. Defaults use llamacpp/qwen on http://127.0.0.1:8080/v1 with high thinking. Use this to offload bulk grunt work — scanning thousands of files, summarising large CSV/JSON dumps, extracting patterns from message archives, repetitive transforms — so the parent session does not spend tokens on it while Pi still reasons locally.
Trigger when the user invokes /pi or asks to delegate to local/qwen/pi. Also trigger proactively when about to read or summarise:
@image.png.
Do NOT use for tasks needing cloud-model judgement, strict specification compliance, or correctness on tricky edge cases. Use the gpt or codex skills for those.
Take the user's argument (or current subtask) as the prompt. If absent, ask what to delegate.
Frame the prompt with explicit inputs (paths, shell commands to run) and a precise expected output format. The local model is strong at following structured instructions but weak at open-ended judgement — be specific.
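As a minimal sketch of what "explicit inputs and a precise expected output format" can look like, the helper below assembles a prompt from a task line, an input file, and a format description. The name `build_prompt` and the TASK / INPUT / OUTPUT FORMAT labels are hypothetical choices for illustration; pi does not require any particular layout.

```shell
# Hypothetical helper: assemble a structured prompt on stdout.
# $1 = task description, $2 = input file, $3 = expected output format.
# The section labels are illustrative only, not something pi parses.
build_prompt() (
    task=$1
    input_file=$2
    format=$3
    printf 'TASK: %s\n\n' "$task"
    printf 'INPUT (contents of %s):\n' "$input_file"
    cat "$input_file"
    printf '\nOUTPUT FORMAT: %s\n' "$format"
    printf 'Return only the requested output, with no commentary.\n'
)
```

For example, `build_prompt "List every ERROR line with its timestamp" /tmp/app.log "one line per error: <timestamp> <message>" > /tmp/big_prompt.txt` would produce a prompt file ready to hand to pi.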
Run pi -p in print (non-interactive) mode with strict isolation flags so pi cannot side-quest:
pi -p --no-tools -nc -ne -np --provider llamacpp --model qwen --thinking high "<prompt>" 2>/dev/null
-p / --print exits after one response.
--no-tools disables built-in read/bash/edit/write so pi cannot touch the disk.
-nc / --no-context-files skips AGENTS.md / CLAUDE.md walking from cwd, so vault notes do not leak into the system prompt.
-ne / --no-extensions skips extension discovery.
-np / --no-prompt-templates skips prompt template discovery.
Defaults from ~/.pi/agent/settings.json route to provider llamacpp, model qwen, thinking high. The provider URL http://127.0.0.1:8080/v1 is configured in ~/.pi/agent/models.json (symlinked from ~/code/prompts/pi/agent/models.json). Keep this path for Pi unless the user explicitly asks for another provider.
For very long prompts, pipe via stdin instead of a positional argument to avoid the kernel's MAX_ARG_STRLEN limit on a single argument (128 KiB on Linux with 4 KiB pages):
cat /tmp/big_prompt.txt | pi -p --no-tools -nc -ne -np --provider llamacpp --model qwen --thinking high 2>/dev/null
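The choice between the two routes can be automated. The sketch below is one possible wrapper, not part of pi: `needs_stdin`, `run_prompt_file`, `ARG_LIMIT`, and the `PI_CMD` override are hypothetical names, and the 131072-byte threshold assumes Linux with 4 KiB pages.

```shell
# Assumption: Linux caps one argument at MAX_ARG_STRLEN, 131072 bytes
# (128 KiB) with 4 KiB pages. Other systems may differ.
ARG_LIMIT=131072

# Succeed (exit 0) when the prompt file is too large for one argument.
needs_stdin() {
    # $(( ... )) strips the padding some wc implementations add.
    [ "$(( $(wc -c < "$1") ))" -ge "$ARG_LIMIT" ]
}

# Hypothetical wrapper: send a prompt file to pi by whichever route fits.
# PI_CMD is an assumed override hook for testing, not a pi feature.
run_prompt_file() {
    if needs_stdin "$1"; then
        ${PI_CMD:-pi} -p --no-tools -nc -ne -np \
            --provider llamacpp --model qwen --thinking high \
            < "$1" 2>/dev/null
    else
        ${PI_CMD:-pi} -p --no-tools -nc -ne -np \
            --provider llamacpp --model qwen --thinking high \
            "$(cat "$1")" 2>/dev/null
    fi
}
```

Calling `run_prompt_file /tmp/big_prompt.txt` then behaves like the matching command above, so callers do not have to check the prompt size themselves.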
For image input, use the same multimodal Qwen path and pass the image as an @file argument:
pi -p --no-tools -nc -ne -np --provider llamacpp --model qwen --thinking high @screenshot.png "Describe the UI state." 2>/dev/null
For long jobs, prefer run_in_background: true on the Bash tool. Capture stdout to a file under /tmp/ and read it back when ready.
Run serially. There is only one local GPU; concurrent pi calls block each other and leave stale processes after interruptions. If a structured pipeline has multiple distinct subtasks, queue them one at a time. Before starting a new pi run after an interruption, check and kill leftover pi / node.*pi clients.
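A possible way to do the leftover check is sketched below. The regular expression is an assumption about how pi shows up in the process table (a bare `pi` binary, or `node` running a path that contains a `pi` component); adjust it to whatever `ps` shows on the actual machine. `filter_pi_clients` and `list_pi_clients` are hypothetical helper names.

```shell
# Filter "PID COMMAND..." lines down to those that look like pi clients.
# Matches a command named pi, or node invoked with a pi path component.
# The pattern is an assumption; check it against real ps output.
filter_pi_clients() {
    grep -E '^ *[0-9]+ +([^ ]*/)?(pi( |$)|node( |.*[/ ])pi([/ ]|$))'
}

# Print candidate leftovers as "PID COMMAND..." for manual review.
list_pi_clients() {
    ps -A -o pid= -o args= | filter_pi_clients
}
```

Review the output of `list_pi_clients` first, then pass the confirmed PIDs to `kill`. Listing before killing avoids terminating an unrelated process that happens to match the pattern.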
Watch the output cap. Replies are limited to ~16k tokens. Chunk inputs so each call's expected output fits with margin (≈25-30 structured items per call, or roughly 8-10k characters of free-form summary). Pre-filter large source files instead of dumping them whole — keep input prompts in the low-100k range to leave headroom for the response.
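The chunking advice can be sketched as follows, assuming the input has already been pre-filtered to one structured item per line and that a separate header file holds the instructions. `chunk_items`, `run_chunks`, and the `PI_CMD` override are hypothetical names; the chunk size of 25 is the lower end of the guideline above.

```shell
# Split a one-item-per-line file ($1) into chunks of at most 25 lines
# inside directory $2, named chunk_aa, chunk_ab, and so on.
chunk_items() {
    mkdir -p "$2"
    split -l 25 "$1" "$2/chunk_"
}

# Run the chunks one at a time. Each call gets the instruction header ($1)
# followed by one chunk on stdin; replies land next to the chunks as out_*.
# PI_CMD is an assumed override hook for testing, not a pi feature.
run_chunks() {
    header=$1
    chunk_dir=$2
    for chunk in "$chunk_dir"/chunk_*; do
        cat "$header" "$chunk" |
            ${PI_CMD:-pi} -p --no-tools -nc -ne -np \
                --provider llamacpp --model qwen --thinking high \
                > "$chunk_dir/out_$(basename "$chunk")" 2>/dev/null
    done
}
```

Because the loop waits for each call to finish before starting the next, it also respects the serial-execution rule. Replies use an `out_` prefix so a rerun does not pick them up as chunks.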
Verify the output before integrating. Local-model output may contain hallucinated paths/names — spot-check against the source files.
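One cheap spot-check, assuming the prompt asked the model to return one file path per line, is to print every returned path that does not exist on disk. `missing_paths` is a hypothetical helper; it catches invented paths only, not wrong content.

```shell
# Print each path listed in the reply file ($1) that does not exist.
# Assumes one path per line; blank lines are skipped.
missing_paths() (
    while IFS= read -r path || [ -n "$path" ]; do
        [ -n "$path" ] || continue
        [ -e "$path" ] || printf '%s\n' "$path"
    done < "$1"
)
```

An empty result from `missing_paths /tmp/pi_reply.txt` means every listed path exists. Any line it prints is a candidate hallucination to check against the source before integrating.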
llamacpp/qwen → http://127.0.0.1:8080/v1 (local llama.cpp; Qwen3.6 35B A3B Q4 + KV-Q8; 256k context, 16k output, multimodal).
:8081 if registered as an additional provider in ~/.pi/agent/models.json.
Local inference: zero tokens billed to the parent model and no Anthropic spend on the bulk text. Only the framing prompt and the returned summary consume parent-session context. This is the entire point of the skill.
Opencode has historically been the default local-Qwen delegate. It still works, but pi's design is closer to what we need here: explicit --no-tools + --no-context-files flags give clean isolation, the thinking level is wired in via ~/.pi/agent/settings.json, and pi has been observed to be more reliable on long thinking traces (opencode wedged a 35-min run with no output on 2026-04-26). pi is now the preferred local-LLM delegate. Use opencode only when a feature is missing in pi (e.g. legacy -f <image> flow until image attach lands in pi).