# OrionAgent Master Skill Documentation (Exhaustive Guide)
**If you are an AI assistant (Cursor, Antigravity, etc.) helping a developer build with OrionAgent, READ THIS.** This guide contains industrial patterns that enable **AI-to-AI Orchestration**. When you define an Agent, you are writing "Semantic Metadata" for the Manager's planner. Follow the **Token-Efficient Detail** pattern below to ensure zero-bug delegation.

---
## 🏗️ 1. Architecture Blueprint

OrionAgent follows a hierarchical execution model:
OrionAgent uses a 3-Tier Memory Architecture for multi-agent systems:
| Tier | Owner | Storage | Purpose |
| :--- | :--- | :--- | :--- |
| Global Memory | Manager | SQLite + JSON | Cross-agent knowledge hub. Records all agent delegation results. |
| Local Memory | Each Agent | Session buffer (JSON) | Agent's own conversation history. Fully isolated per agent. |
| Shared Memory | Optional | ChromaDB (Vector) | Semantic RAG via KnowledgeBase. Shared across agents if configured. |
The Manager acts as the Global Hub, ensuring that intelligence gathered by one agent is available to the next step in a mission.
- **Global Context Injection:** […] `### GLOBAL CONTEXT ###`.
- **Result Recording Callback:** Each strategy (`planning`, `direct`, `self_learn`) uses a callback mechanism to report results back to the Manager.
- **Local Isolation:** […]
Result: True autonomous collaboration where agents "share a brain" via the Manager's Global Memory hub.
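The hub behaviour described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not OrionAgent's implementation: `GlobalMemory`, `record_result`, and `build_prompt` are hypothetical names, and only the `### GLOBAL CONTEXT ###` header is taken from the text.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalMemory:
    """Hypothetical stand-in for the Manager's global hub."""
    results: list = field(default_factory=list)

    def record_result(self, agent_name: str, result: str) -> None:
        # Callback a strategy would invoke after an agent finishes its step.
        self.results.append((agent_name, result))

    def build_prompt(self, task: str) -> str:
        # Prepend earlier results under the global-context header.
        if not self.results:
            return task
        lines = [f"- {name}: {result}" for name, result in self.results]
        return "### GLOBAL CONTEXT ###\n" + "\n".join(lines) + "\n\n" + task


hub = GlobalMemory()
hub.record_result("Scraper", "Found 3 candidate URLs")
prompt = hub.build_prompt("Summarize the findings")
print(prompt)
```

The next agent in the mission receives `prompt`, so it sees what the previous agent produced without sharing that agent's local history.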
When defining an Agent, every parameter is tunable for specific engineering needs.
| Variable | Type | Description |
| :--- | :--- | :--- |
| name | str | Unique identifier (used in Manager routing). |
| role | str | Selection Trigger. Short identity (e.g. "DataScraper"). Used primarily for word-overlap routing. |
| description | str | Planning Metadata. Detailed capability summary. Includes what it does and with which tools. Essential for Planner models. |
| system_instruction| str | Logic Guard. Cached persistently. Must contain explicit tool-use instructions and deterministic rules. |
| temperature | float | 0.0 (Deterministic) to 1.0 (Creative). Agent-level override. |
| use_default_tools | bool | Auto-loads Browser, File, OS, Terminal, and Python tools. |
| memory | str/cfg| "none", "session", "long_term", or a MemoryConfig object. |
| async_mode | bool | Performance Gate. Enables parallel tool calls (Up to 60% faster). CRITICAL for scrapers/terminal use. |
| thinking| bool | Reasoning Mode. Enables Chain-of-Thought (e.g. DeepSeek R1, Gemini Thinking). |
| show_thinking|bool| Thought Visibility. If False, strips <thought> blocks from the output. |
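To see why `role` and `description` wording matters, here is a rough sketch of word-overlap routing. It is an assumption about how such routing could work, not the library's actual scoring; `tokenize` and `route` are hypothetical helpers.

```python
import re


def tokenize(text: str) -> set:
    # Split CamelCase ("DataScraper" -> "Data Scraper"), then lowercase into words.
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)
    return set(re.findall(r"[a-z0-9]+", spaced.lower()))


def route(task: str, agents: dict) -> str:
    """Return the agent whose role + description shares the most words with the task."""
    task_words = tokenize(task)
    scores = {
        name: len(task_words & tokenize(meta["role"] + " " + meta["description"]))
        for name, meta in agents.items()
    }
    return max(scores, key=scores.get)


agents = {
    "Scraper": {"role": "DataScraper", "description": "Extracts emails and contact info from URLs."},
    "Dev": {"role": "PythonExpert", "description": "Writes and tests Python code."},
}
print(route("Extract contact emails from these URLs", agents))
```

A description that names concrete nouns ("emails", "URLs") gives the router more words to match than a generic one.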
To enable real-time observability in the terminal (natively working with Flask/FastAPI), use these Model Provider flags:
| Flag | Effect |
| :--- | :--- |
| token_count=True | Logs Input/Output tokens for every call immediately. |
| verbose=True | Logs sequence of events (Agent ask, Tool call, etc.). |
| debug=True | Logs deep internals and raw payloads. |
| Tool Status (1/0) | Shows binary status: 1 (Success) or 0 (Failure) in 1 line. |
Note: These logs use flush=True and standardized ASCII symbols (+-) to ensure they stream instantly in active server environments (Flask/FastAPI) across all OS platforms (Windows/Linux). In a Flask/FastAPI terminal, you will see a clean stream of execution for every concurrent request:
```text
[AGENT] Vanguard-Industrial
[MEMORY] Storing user turn
[TOOL] scrape_website : 1
[TOOL] save_to_db : 1
```
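A log line of this shape is simple to reproduce. The sketch below assumes nothing about OrionAgent's internals; `log_event` and `log_tool` are hypothetical helpers that show the `flush=True` pattern and the binary tool status.

```python
import sys


def log_event(tag: str, message: str, stream=sys.stdout) -> str:
    # flush=True pushes the line out immediately, even under a buffered server process.
    line = f"[{tag}] {message}"
    print(line, file=stream, flush=True)
    return line


def log_tool(name: str, ok: bool, stream=sys.stdout) -> str:
    # Binary status: 1 for success, 0 for failure.
    return log_event("TOOL", f"{name} : {1 if ok else 0}", stream)


log_event("AGENT", "Vanguard-Industrial")
log_tool("scrape_website", True)
```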
### `.ask()` vs `.chat()`

Understanding how to trigger an agent is key to preventing redundant loops or lost context.

- `agent.ask(prompt)`: […]
- `agent.chat()` or `chat(agent)`: […] `You:` prompt.
- `manager.chat(greeting)`: […]

The Manager's `.chat()` triggers the Strategy Engine. When `strategy="planning"` is set, every message you send triggers the Strategic Orchestration Loop:
OrionAgent is designed for production environments. To prevent "context leakage" between different people:

- Pass a `user_id` to `.ask()`:

```python
# Isolated session for user_123
manager.ask("Hello", user_id="user_123")
```

- The framework uses `agent_memory/sessions/{user_id}/` to store history.
- A single `Manager` or `Agent` instance can handle thousands of concurrent users safely.

When building a web backend, pass the persistent User ID (JWT subject or Session ID) directly to the framework:
```python
from flask import Flask, request, jsonify
from orionagent import Manager, Gemini

app = Flask(__name__)
llm = Gemini(debug=True)  # Enable terminal telemetry
manager = Manager(model=llm, agents=[...])


@app.route("/chat", methods=["POST"])
def chat():
    data = request.json
    uid = data.get("user_id")  # e.g. "user_789"
    task = data.get("message")

    # The terminal will now stream logs for this user specifically
    response = manager.ask(task, user_id=uid)
    return jsonify({"response": response})
```
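The isolation idea can be illustrated without the framework. The sketch below is an assumption about how per-user storage might be laid out: `SessionStore` and the `history.json` filename are hypothetical, and only the `agent_memory/sessions/{user_id}/` layout comes from the text.

```python
import json
import tempfile
from pathlib import Path


class SessionStore:
    """Hypothetical per-user history store; not OrionAgent's actual class."""

    def __init__(self, root: Path):
        self.root = Path(root)

    def _path(self, user_id: str) -> Path:
        # Mirror the layout named in the text: agent_memory/sessions/{user_id}/
        folder = self.root / "agent_memory" / "sessions" / user_id
        folder.mkdir(parents=True, exist_ok=True)
        return folder / "history.json"

    def load(self, user_id: str) -> list:
        path = self._path(user_id)
        return json.loads(path.read_text()) if path.exists() else []

    def append(self, user_id: str, role: str, text: str) -> None:
        history = self.load(user_id)
        history.append({"role": role, "text": text})
        self._path(user_id).write_text(json.dumps(history))


store = SessionStore(Path(tempfile.mkdtemp()))
store.append("user_123", "user", "Hello")
store.append("user_789", "user", "Draft a report")
print(store.load("user_123"))
```

Because every read and write is keyed by `user_id`, one shared instance never mixes histories. A real backend would also validate `user_id` before using it in a filesystem path.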
OrionAgent uses a Dual-Layer Logic Engine to manage context. This is the #1 way to save tokens in long-running sessions.
Select the "Power Level" of your agent's memory based on your ecosystem needs:
| Level | Mode | Behavior | Power |
| :--- | :--- | :--- | :--- |
| 1 | none | No memory. Static responses only. | Static |
| 2 | session | Default. Fast temporary conversation buffer. | Medium |
| 3 | long_term| Session + Persistent SQLite (Structured Fact Recall). | High |
| 4 | chroma | Session + SQLite + Vector Knowledge (Semantic RAG).| Ultimate |
Once you have long_term or chroma enabled, you can tune how deep the entity extraction goes:
| Tier | Name | Behavior | Token Cost |
| :--- | :--- | :--- | :--- |
| low | Token Saver | 1-sentence minimalist summaries. No entity extraction. | 📉 Minimal |
| medium | Default | Balanced summary + Structured entity extraction. | ⚖️ Moderate |
| high | Deep Knowledge| Detailed summaries + Exhaustive naming/fact extraction. | 📈 High |
Pro Tip: Use chroma mode for industrial knowledge bases (RAG) and long_term for simple user preferences.
### `MemoryConfig` Parameter Registry

When building for an ecosystem, tune these for cost and precision:
| Parameter | Type | Range | Description |
| :--- | :--- | :--- | :--- |
| priority | str | low/medium/high | Depth of Summary. Higher depth means a more detailed context window, but uses more tokens. |
| extract_entities| bool | True/False | Knowledge Extraction. If True, agent identifies Names, Dates, and Facts for the JSON knowledge vault. |
| importance_threshold| int | 1-10 | Sync Filter. Facts with importance below this are kept only in the session. 7+ is recommended for permanent storage. |
| chunk_size | int | 5-50 | How many messages to wait before compressing history into a summary. |
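The effect of `importance_threshold` and `chunk_size` can be shown with a small sketch. It assumes each extracted fact carries an `importance` score from 1 to 10; `split_facts` and `needs_compression` are hypothetical helpers, not library functions.

```python
def split_facts(facts, importance_threshold=7):
    """Separate facts into permanent and session-only groups by importance (1-10)."""
    permanent = [f for f in facts if f["importance"] >= importance_threshold]
    session_only = [f for f in facts if f["importance"] < importance_threshold]
    return permanent, session_only


def needs_compression(message_count, chunk_size=10):
    # Compress history once the buffer holds chunk_size messages.
    return message_count >= chunk_size


facts = [
    {"text": "Budget approved by the CFO", "importance": 9},
    {"text": "User said good morning", "importance": 2},
]
permanent, session_only = split_facts(facts, importance_threshold=8)
print([f["text"] for f in permanent])
```

Raising the threshold shrinks the permanent store, and raising `chunk_size` delays summarization, so both trade recall against token cost.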
OrionAgent uses a Parallel Tool Dispatcher. When an agent calls multiple tools:

- The dispatcher checks `async_mode`.
- If `True`, tools are launched in a `ThreadPoolExecutor`.

When using `strategy=["planning", "self_learn"]`:
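As an illustration of the Parallel Tool Dispatcher idea, the sketch below runs several tool calls through a `ThreadPoolExecutor` when `async_mode` is on and sequentially otherwise. `dispatch` and `fetch` are hypothetical; the real dispatcher's interface is not shown in this guide.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def dispatch(calls, async_mode=False):
    """Run (function, kwargs) pairs and return results in call order."""
    if not async_mode:
        return [fn(**kwargs) for fn, kwargs in calls]
    with ThreadPoolExecutor(max_workers=len(calls) or 1) as pool:
        futures = [pool.submit(fn, **kwargs) for fn, kwargs in calls]
        # Collect in submission order so results line up with the calls.
        return [future.result() for future in futures]


def fetch(url):
    time.sleep(0.2)  # Stand-in for network latency.
    return f"fetched {url}"


calls = [(fetch, {"url": u}) for u in ("a.example", "b.example", "c.example")]
start = time.perf_counter()
results = dispatch(calls, async_mode=True)
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Threads help most when tools spend their time waiting on I/O (HTTP requests, subprocesses); CPU-bound tools gain little from this approach.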
The Manager is the technical "Brain" of your swarm. Its sole purpose is to decompose high-level goals into executable roadmaps and delegate them to specialized personnel.
OrionAgent enforces a strict Behavior Lock on the Manager. It is a Planner, not an Executor.
The strategy parameter determines how the Manager processes your intent.
| Strategy | Mode | Best For | Behavior |
| :--- | :--- | :--- | :--- |
| None | Direct | Simple Routing | Fast, one-step delegation to the best agent. |
| "planning"| Strategic| Complex Goals | Decomposes task into a multi-step JSON plan. |
When using "planning", the Manager maintains an internal Efficiency Gate.
Apply surgical control over the agent pool during execution using manager.ask():
```python
# Force a specific specialist
manager.ask("Draft a legal report", force_agent="Lawyer")

# Exclude specific roles
manager.ask("Debug this script", blocked_agents=["Researcher"])

# Use only a subset
manager.ask("Market analysis", allowed_agents=["Scraper", "Analyst"])
```
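One way to picture how these three filters narrow the agent pool is the sketch below. The precedence (force, then allow-list, then block-list) is an assumption, and `select_pool` is a hypothetical helper rather than part of the OrionAgent API.

```python
def select_pool(agents, force_agent=None, blocked_agents=None, allowed_agents=None):
    """Return the agent names eligible for a request."""
    if force_agent is not None:
        if force_agent not in agents:
            raise ValueError(f"Unknown agent: {force_agent}")
        return [force_agent]
    pool = list(agents)
    if allowed_agents is not None:
        # Keep only agents on the allow-list, preserving pool order.
        pool = [a for a in pool if a in allowed_agents]
    if blocked_agents:
        pool = [a for a in pool if a not in blocked_agents]
    return pool


team = ["Lawyer", "Researcher", "Scraper", "Analyst"]
print(select_pool(team, allowed_agents=["Scraper", "Analyst"]))
```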
### `hitl` (Human-in-the-Loop)

[…] `(y/n)` before proceeding.

**Shortcut Version:**

```python
# Enables the 'Balanced' risk check by default
manager = Manager(agents=[...], hitl=True)
```

**Advanced Configuration (`HitlConfig`):**

```python
from orionagent import Manager, HitlConfig

# Configure safety levels
safety_cfg = HitlConfig(
    permission_level="medium",  # "low" (always), "medium" (risky), "high" (never)
    use_llm=True,               # Use LLM call for intent analysis
    ask_once=True,              # Approve once for the whole session turn
    plan_review=True,           # Show full decomposition before approval
)

manager = Manager(agents=[...], hitl=safety_cfg)
```
| Level | Name | Trigger Logic |
| :--- | :--- | :--- |
| low | Paranoiac | Always asks for approval for every goal or plan. |
| medium | Default | LLM-Based Risk Check (Active if hitl=True). Uses a lightweight model to classify if the task is risky (e.g. system mods) or safe (e.g. math/chat). |
| high | Autonomous | Never asks. Complete trust. |
OrionAgent's HITL now supports Dynamic Risk Intelligence (`use_llm`). Instead of matching static keywords, a lightweight LLM call (~30 tokens) analyzes the intent of the task to determine whether it is destructive.

```python
# Enable Dynamic Risk Check
safety_cfg = HitlConfig(permission_level="medium", use_llm=True)
```
Note: If use_llm is False or the model call fails, the system automatically falls back to the Deterministic Keyword Moat (detecting words like delete, rm, sudo, etc.).
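The fallback behaviour in the note can be sketched as follows. This is an illustration under assumptions, not the library's code: `keyword_is_risky`, `is_risky`, and `needs_approval` are hypothetical, the keyword set holds only the examples named above, and `classify` stands in for the lightweight LLM call.

```python
import re

RISKY_KEYWORDS = {"delete", "rm", "sudo"}  # Examples from the text; a real list would be longer.


def keyword_is_risky(task: str) -> bool:
    # Match whole words so "confirm" does not trigger on "rm".
    words = set(re.findall(r"[a-z]+", task.lower()))
    return bool(words & RISKY_KEYWORDS)


def is_risky(task: str, classify=None) -> bool:
    """Use the classifier if given; fall back to keywords when it is absent or fails."""
    if classify is not None:
        try:
            return bool(classify(task))
        except Exception:
            pass  # Model call failed: fall through to the deterministic check.
    return keyword_is_risky(task)


def needs_approval(task: str, permission_level: str = "medium", classify=None) -> bool:
    if permission_level == "low":
        return True   # Always ask.
    if permission_level == "high":
        return False  # Never ask.
    return is_risky(task, classify)


def broken_classifier(task):
    raise TimeoutError("model unavailable")


print(needs_approval("sudo rm -rf /tmp/cache"))
print(needs_approval("delete all logs", classify=broken_classifier))
```

Keeping the keyword check as the last resort means a model outage degrades the check to a stricter, simpler rule instead of disabling it.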
### `HitlConfig` Parameter Registry

Control the "Safety Valve" of your orchestrator:
| Parameter | Type | Values | Description |
| :--- | :--- | :--- | :--- |
| permission_level | str | low/medium/high | The "Fear Factor". medium is best for general use (checks risk). |
| ask_once | bool | True/False | Session Honor. If True, once you approve a goal, all sub-tasks in that plan execute without interruption. |
| plan_review | bool | True/False | Transparency. If False, only the high-level goal is shown. If True, the full step-by-step plan is displayed. |
A multi-agent setup where one agent plans, one codes, and one audits, with the Manager ensuring quality.
```python
from orionagent import Agent, Manager, Gemini, chat

# 1. High-Performance Model (Configure logging here)
llm = Gemini("gemini-2.5-flash", temperature=0.1, token_count=True, debug=True)

# 2. Specialized Personnel
researcher = Agent(name="Scraper", role="Researcher", use_default_tools=True)
coder = Agent(name="Dev", role="Python Expert", use_default_tools=True)
auditor = Agent(name="Sentry", role="QA Engineer", system_instruction="Find bugs and logic flaws.")

# 3. Master Orchestrator (Behavior Lock)
manager = Manager(
    model=llm,
    agents=[researcher, coder, auditor],
    strategy="planning",  # Enable Structured JSON Orchestration
)

# Efficiency Gate: Simple greetings skip planning automatically
manager.chat("Hello team!")

# Full Orchestration: Complex goals trigger the JSON roadmap
manager.chat("Build a secure file-encryptor with AES-256 and unit tests.")
```
```python
from orionagent import Anthropic

# Claude-backed orchestrator
llm = Anthropic(model_name="claude-3-5-sonnet-20240620")
```
An agent designed for 100% accuracy in data extraction, syncing facts to a permanent database.
```python
from orionagent import Agent, MemoryConfig, Gemini

# Force a 'High' priority knowledge vault
knowledge_cfg = MemoryConfig(
    mode="long_term",
    priority="high",
    importance_threshold=8,  # Only save critical facts
    storage_path="corporate_memory",
)

agent = Agent(
    memory=knowledge_cfg
)

# Facts extracted here will move to SQLite permanently
agent.ask("Extract all decision-makers from the Q1 meeting notes.")

# Later, even in a new session:
agent.ask("Who approved the Q1 budget?")  # Auto-retrieves from SQLite
```
- Use `Gemini` for speed: its native caching makes it the fastest for high-tool use.
- Use `strategy="planning"` for complex goals: it provides a strict, structured roadmap for specialized agents.
- `PlanningStrategy` automatically bypasses the planning phase for simple tasks (detected via a high-speed heuristic), saving ~4s of overhead.
- Enable `token_count=True` during development.
- Keep `description` fields ultra-concise (15-20 words). The model doesn't need a manual; it needs a functional summary.
- In `self_learn` mode, the system automatically skips the quality evaluation step for known-good patterns and very short conversational turns (<15 words), saving ~50 tokens per call.

```python
import os
from orionagent import Agent, Manager, Gemini, chat, tool, MemoryConfig

# 1. High-Performance Model Caching (Centralized Logging)
llm = Gemini(
    model_name="gemini-2.5-flash",
    temperature=0.1,
    token_count=True,
    debug=True,
    verbose=True,
    thinking=True,        # Auto-switches to gemini-2.0-flash-thinking-exp
    show_thinking=False,  # Hide internal reasoning from the user
)

# 2. Deep Memory Config
long_term_memory = MemoryConfig(
    mode="long_term",
    priority="high",
    importance_threshold=9,  # Only store absolute critical facts
)

# 3. Defensive Specialist
auditor = Agent(
    name="Auditor",
    role="Security Specialist",
    model=llm,
    memory=long_term_memory,
)

# 4. Master Engine (Behavior Lock)
manager = Manager(
    model=llm,
    agents=[auditor],
    strategy="planning",  # Enable Structured JSON Orchestration
)

# 5. Launch
chat(manager, greeting="Orion System Online. Deployment Authorized.")
```
OrionAgent features an industrial-grade Retrieval-Augmented Generation (RAG) engine. This allows agents to ingest private documents (PDF, MD, TXT) and retrieve facts with semantic precision.
The KnowledgeBase manages a dedicated vector collection in ChromaDB, separate from conversation memory.
```python
from orionagent import Agent, KnowledgeBase

# 1. Initialize a named knowledge collection
kb = KnowledgeBase(collection_name="project_nebula")

# 2. Assign to an agent
agent = Agent(name="Researcher", knowledge=kb)
```
When an agent is initialized with `knowledge`, it automatically receives these high-performance tools:

- `ingest_file(file_path)`: Automatically reads, chunks, and indexes a local file.
- `ingest_text(text)`: Direct indexing of raw strings into the collection.
- `query_knowledge(query)`: Performs a semantic search across the entire knowledge base and returns relevant snippets.

Documents can be loaded in two ways:

- Call `kb.ingest_file("data.pdf")` before starting the agent to "pre-load" its brain.
- Tell the agent: "Agent, please read the manual at C:/docs/manual.pdf". It will call the tool and learn the contents dynamically.

OrionAgent is engineered to solve the "abstraction tax" of other frameworks.

- […] `==== ACTIVE TASK ====` header. This strictly separates the conversation history (context) from the current goal (action), preventing the agent from getting lost in old dialogue.
- […] `research`, `analyze`, `plan` trigger full orchestration.

To ensure high-performance tool usage and prevent "Tool not found" or "Unavailable" refusals, follow these industrial patterns:
Every custom tool MUST use the @tool decorator and a Google-style docstring.
```python
from orionagent import tool


@tool
def my_custom_tool(param1: str, param2: int = 10):
    """
    Detailed description of what the tool does.
    (This is sent to the LLM to explain the tool's purpose).

    Args:
        param1: Description of the first parameter.
        param2: Description of the second parameter.
    """
    # Logic here
    return f"Result: {param1}"
```
- Type parameters as `str`, `int`, `float`, or `bool`. These are automatically converted to JSON Schema.
- The `Args:` section is mandatory for complex tools. Without it, the LLM will see parameters but won't know what values to pass.

```python
@tool
def search_and_analyze(query: str):
    # web_browser is a @tool, but we call it like a function
    raw_data = web_browser(action="search", query_or_url=query)
    return analyze_text(raw_data)
```
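To make the type-hint and `Args:` requirements concrete, here is a rough sketch of how a function signature and a Google-style docstring can be turned into a JSON Schema-like dictionary. It is not the `@tool` decorator's real implementation; `parse_args_section` and `to_schema` are hypothetical, and the output shape is an assumption.

```python
import inspect

JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def parse_args_section(doc: str) -> dict:
    """Collect 'name: description' pairs from a Google-style Args: section."""
    descriptions, in_args = {}, False
    for raw in (doc or "").splitlines():
        line = raw.strip()
        if line == "Args:":
            in_args = True
        elif in_args and ":" in line:
            name, text = line.split(":", 1)
            descriptions[name.strip()] = text.strip()
        elif in_args and not line:
            break  # A blank line ends the Args: section.
    return descriptions


def to_schema(fn) -> dict:
    doc = inspect.getdoc(fn) or ""
    arg_docs = parse_args_section(doc)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {
            "type": JSON_TYPES.get(param.annotation, "string"),
            "description": arg_docs.get(name, ""),
        }
        if param.default is inspect.Parameter.empty:
            required.append(name)  # No default value means the caller must supply it.
    return {
        "name": fn.__name__,
        "description": doc.split("Args:")[0].strip(),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def my_custom_tool(param1: str, param2: int = 10):
    """
    Detailed description of what the tool does.

    Args:
        param1: Description of the first parameter.
        param2: Description of the second parameter.
    """
    return f"Result: {param1}"


schema = to_schema(my_custom_tool)
print(schema["parameters"]["required"])
```

Without the `Args:` section, every `description` in the schema would be empty, which is the situation the rule above warns about.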
- […] `Error:`. The Agent will see this and attempt to fix its input.

The `python_sandbox` is a first-class tool for industrial agents to verify logic, process data, and solve algorithmic challenges.
The sandbox operates on a Zero-Footprint principle:
- Code is passed via the `-c` flag to a temporary Python subprocess.
- No `.py` files are created or saved to the disk.
- Use `print()`: The agent ONLY sees what is explicitly printed to stdout.
- […] `Error:`.

```python
# Industrial Example: Calculating Fibonacci securely
@tool
def heavy_calculation(n: int):
    # This logic is delegated to the sandbox for 100% accuracy
    code = f"""
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        yield a
        a, b = b, a + b
print(list(fib({n})))
"""
    return python_sandbox(code=code)
```
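The Zero-Footprint principle can be approximated with the standard library. The sketch below is not the `python_sandbox` implementation: `run_in_sandbox` is a hypothetical function, and the `Error:` prefix on failures is an assumption based on the convention mentioned above.

```python
import subprocess
import sys


def run_in_sandbox(code: str, timeout: float = 10.0) -> str:
    """Run code with `python -c` and return stdout, or a string starting with 'Error:'."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],  # No .py file is written to disk.
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return f"Error: timed out after {timeout}s"
    if proc.returncode != 0:
        return "Error: " + proc.stderr.strip()
    return proc.stdout.strip()


print(run_in_sandbox("print(sum(range(5)))"))
print(run_in_sandbox("1 / 0"))
```

Only printed output comes back, so code that computes a value without calling `print()` returns an empty string. A plain subprocess is not a security boundary; untrusted code would need OS-level isolation on top of this.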
To ensure agents execute reliably without asking clarifying questions, their system_instruction must be engineered with explicit tool and agent awareness.
Every agent should know exactly which tools it owns and when to trigger them.
Example:
```python
system_instruction="You are a Researcher. Use 'web_search' to find business names. DO NOT ask for permission; find the data and return it."
```
The Manager must understand its identity as a leader and the capabilities of its swarm.
Example:
```python
system_instruction="""You are the Lead Manager.
You coordinate 'TheResearcher' (for discovery) and 'TheScraper' (for extraction).
Your goal is to ensure a complete data loop: Discover, Extract, and Save."""
```
While OrionAgent provides powerful default tools, high-precision industrial agents should prioritize Custom Tools built for specific tasks. This is the core of "Vibe Coding": human and AI collaborating to build the perfect toolkit.

- `web_browser` returns raw HTML; a custom `scrape_lead_details` tool returns a structured list of emails.
- A `save_leads_to_csv` tool has built-in validation, whereas a generic `execute_command` can be dangerous.
- […] `@tool` that handles the heavy lifting (math, file I/O, API calls).

> [!TIP]
> Industrial Rule: Use default tools for exploration, but build custom tools for execution.
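As an example of what "built-in validation" might look like, here is a sketch of a `save_leads_to_csv` function. The body is hypothetical (only the tool name appears in the text), the email pattern is deliberately loose, and in an OrionAgent project the function would carry the `@tool` decorator.

```python
import csv
import re
import tempfile
from pathlib import Path

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def save_leads_to_csv(leads: list, path: str) -> str:
    """Write leads with a plausible email to a CSV file and report how many were kept.

    Args:
        leads: List of dicts with "name" and "email" keys.
        path: Destination CSV file path.
    """
    valid = [lead for lead in leads if EMAIL_RE.match(lead.get("email", ""))]
    if not valid:
        return "Error: no valid leads to save"
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["name", "email"])
        writer.writeheader()
        for lead in valid:
            writer.writerow({"name": lead.get("name", ""), "email": lead["email"]})
    return f"Saved {len(valid)} of {len(leads)} leads to {path}"


out = Path(tempfile.mkdtemp()) / "leads.csv"
message = save_leads_to_csv(
    [{"name": "Ada", "email": "ada@example.com"}, {"name": "Bad", "email": "not-an-email"}],
    str(out),
)
print(message)
```

The agent can only write rows that pass validation to the path it was given, which is a much narrower capability than a generic shell command.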
In AI-driven development (Vibe Coding), you aren't just writing code; you are building a Swarm Ecosystem. The Manager's planner chooses agents based on the clarity of their metadata.
The description is the primary metadata used by the Manager's PlanningStrategy. If the description is vague, the Manager will fail to select the right agent during complex task decomposition.
Selection Comparison:
| Level | description Pattern | Result |
| :--- | :--- | :--- |
| Generic | "Scrapes websites." | Failure. Manager might skip this agent for "Lead Generation" tasks. |
| Industrial | "Extracts specific emails and contact info from URLs using web_browser. Returns structured contact data." | Success. Planner maps "find leads" directly to this capability. |
A "Hard-Level" agent doesn't just know what it is; it knows how it must interact with the swarm.
Enforce tool-primacy to prevent the LLM from trying to "hallucinate" a summary without actually executing the work.

- "You are a Researcher. You MUST call 'web_browser' for every discovery task. Do not summarize until tool output is received."

The Manager needs to know its role as a coordinator, not just a chatbot.

- "You are the SalesDirector. Coordinate 'LeadFinder' for URLs and 'DataWriter' for file saving. Ensure the loop is: Find -> Scrape -> Save."

To keep the context window clean while maintaining 100% reliability:

- "You are a [Specialist]. You only do [X]."
- "You MUST use 'python_sandbox' for all logic verification."
- "Save every result via 'file_manager'. Do not ask for confirmation."

When you (the AI) are building an agent, choose the tier based on the user's vibe:
| Tier | Config | Use Case |
| :--- | :--- | :--- |
| Speed Runner | Gemini, async_mode=True, memory="none" | Rapid web scraping, data search, fast responses. |
| Data Scientist | Claude, guards=["json"], python_sandbox | Complex data transformation, code execution, logic checks. |
| Knowledge Vault | Gemini, memory="chroma", priority="high" | Long-term RAG, persistent user state, corporate memory. |
- […] `system_instruction` to maximize provider-level prompt caching.
- Use `priority="low"` for conversational agents where deep historical context isn't required.
- Use `temperature=0.0` for agents using complex tools (File, OS, Terminal) to prevent parameter hallucinations.
- Set `debug=True` and `verbose=True` on the Model Provider (Gemini, OpenAI, etc.) instead of the Agent or Manager for consistent observability.

OrionAgent: Build Agents That Actually Work.