.claude/skills/cercano-local/SKILL.md
Run prompts against local AI models via Cercano and Ollama. Use this for local inference — faster, private, and zero cost. Handles chat-style queries and agentic code generation with automatic validation. Offload summarization, explanation, code writing, and general LLM tasks to a local model instead of sending them to the cloud.
npx skillsauth add bryancostanich/Cercano cercano-local
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Run prompts against local AI models through Cercano's MCP interface. Cercano routes requests to Ollama for local inference.
Tool name: cercano_local
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| prompt | string | Yes | The prompt to run against local models. |
| file_path | string | No | Target file path for code changes. When provided with work_dir, enables the agentic code generation loop with validation. |
| work_dir | string | No | Working directory for code validation (go build/test). When provided with file_path, enables the agentic code generation loop. |
| context | string | No | Additional context such as existing code or file contents. |
| conversation_id | string | No | Conversation ID for multi-turn support across calls. |
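As a rough sketch of how a client might assemble arguments matching the table above, the Python below wraps them in a JSON-RPC `tools/call` request, the method MCP defines for invoking a tool. The helper name `build_cercano_request` and its client-side checks are illustrative assumptions, not part of Cercano; only the tool name and the parameter names come from the table.

```python
import json

# Parameter names taken from the table above; every value is a string.
ALLOWED_PARAMS = {"prompt", "file_path", "work_dir", "context", "conversation_id"}


def build_cercano_request(request_id, **arguments):
    """Build an MCP tools/call request for cercano_local (hypothetical helper)."""
    unknown = set(arguments) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # prompt is the only required parameter.
    if not arguments.get("prompt"):
        raise ValueError("prompt is required")
    for name, value in arguments.items():
        if not isinstance(value, str):
            raise TypeError(f"{name} must be a string")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "cercano_local", "arguments": arguments},
    }


request = build_cercano_request(
    1,
    prompt="Summarize this diff",
    context="diff --git a/main.go b/main.go",
)
print(json.dumps(request, indent=2))
```

Most MCP clients build this envelope for you, so in practice only the `arguments` object is written by hand.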
Provide only prompt (and optionally context) for a direct LLM call. The response is the model's text output.
Provide prompt, file_path, and work_dir to enable a generate-validate loop: Cercano generates code for the target file and validates it in work_dir (go build/test).
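The two calling patterns can be summarized in a few lines of Python. This is only a sketch of the rule as described here: `select_mode` is a hypothetical helper, and the source does not say how Cercano treats a call that supplies just one of file_path or work_dir, so the code assumes such a call is handled as a direct LLM call.

```python
def select_mode(arguments):
    """Return "agentic" when both file_path and work_dir are set, else "chat"."""
    if not arguments.get("prompt"):
        raise ValueError("prompt is required in both modes")
    has_file = bool(arguments.get("file_path"))
    has_dir = bool(arguments.get("work_dir"))
    # Assumption: the loop needs both values; one alone falls back to chat.
    return "agentic" if has_file and has_dir else "chat"


chat_call = {"prompt": "What are the SOLID principles in software design?"}
agentic_call = {
    "prompt": "Add error handling to this function",
    "file_path": "internal/handler/auth.go",
    "work_dir": "/project",
}
print(select_mode(chat_call))
print(select_mode(agentic_call))
```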
Chat query:
{
  "prompt": "What are the SOLID principles in software design?"
}
Code generation with context:
{
  "prompt": "Add error handling to this function",
  "file_path": "internal/handler/auth.go",
  "work_dir": "/project",
  "context": "func Login(w http.ResponseWriter, r *http.Request) { ... }"
}
Multi-turn conversation:
{
  "prompt": "Now refactor that to use the repository pattern",
  "conversation_id": "conv-abc123"
}
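A follow-up turn only needs the new prompt plus the same conversation_id as the earlier call. The sketch below shows one way a client could carry the ID forward. The `follow_up` helper and the first prompt are hypothetical, and the source does not say whether the ID is chosen by the caller or issued by Cercano; the code assumes the caller already holds it.

```python
def follow_up(previous_arguments, prompt):
    """Build arguments for the next turn, reusing the earlier conversation_id."""
    conversation_id = previous_arguments.get("conversation_id")
    if not conversation_id:
        raise ValueError("previous call has no conversation_id to continue")
    return {"prompt": prompt, "conversation_id": conversation_id}


first_turn = {
    "prompt": "Write a UserStore type with Get and Save methods",
    "conversation_id": "conv-abc123",
}
second_turn = follow_up(first_turn, "Now refactor that to use the repository pattern")
print(second_turn)
```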