skills/agentic-os/SKILL.md
Framework-agnostic persistent memory and self-improvement loops for AI agents. Scaffolds shared state, task queues, and learnings files that can be read/written by Claude, Gemini, and Antigravity. Use this to initialize an Agentic OS layer in any workspace and instruct agents on how to use it.
npx skillsauth add baphomet480/claude-skills agentic-os
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill establishes a framework-agnostic Agentic Operating System layer in a project workspace. It enables Claude, Gemini, and Antigravity to share persistent operational memory, coordinate via task queues, and continuously improve through structured feedback loops.
An Agentic OS solves the "blank slate" problem. Without it, every agent session starts fresh, requiring manual context loading and repeating past mistakes. By centralizing state and learnings into an .agent/ directory, any agent runtime can pick up exactly where the last one left off and apply compounded knowledge.
To set up the Agentic OS scaffolding in a new project, run:
bash ~/.agents/skills/agentic-os/scripts/init-os.sh
This creates the following structure at the root of the project:
.agent/
├── state/
│   ├── last-run.json   # Global log of the last agent actions
│   ├── tasks.json      # Shared queue of pending/completed tasks
│   ├── errors.json     # Log of unresolved task failures
│   └── status.md       # Output of the command-center skill
├── learnings/          # Per-skill feedback loops
│   └── template.json
└── evals/              # Per-skill evaluation criteria
    └── template.json
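To confirm that a workspace has this layout before relying on it, a small check can help. The following is a minimal sketch, not part of the skill's own scripts: the helper name `missing_scaffold_paths` is hypothetical, and the path list is simply copied from the tree above.

```python
from pathlib import Path

# Paths copied from the directory tree above, relative to the project root.
EXPECTED_PATHS = [
    ".agent/state/last-run.json",
    ".agent/state/tasks.json",
    ".agent/state/errors.json",
    ".agent/state/status.md",
    ".agent/learnings/template.json",
    ".agent/evals/template.json",
]


def missing_scaffold_paths(project_root: str) -> list[str]:
    """Return the expected scaffold paths that do not exist under project_root."""
    root = Path(project_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]
```

Calling `missing_scaffold_paths(".")` from the project root returns an empty list when every file in the tree is present.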
Whenever you operate in a project that has an .agent/ directory, you MUST adhere to the following workflow:
Before executing a specialized task or skill:
- Read .agent/state/last-run.json to understand what was recently completed. Check .agent/state/tasks.json if you need to pull a task from the queue.
- If you are about to run a skill (e.g., write-draft), check if .agent/learnings/write-draft.json exists. If it does, read it and explicitly apply its rule_change entries to your approach.
- Check whether .agent/evals/<skill>.json exists to understand the definition of "done".
After generating your output but before finalizing:
- If an eval.json exists for your task, grade your own output against its criteria. (Use python3 ~/.agents/skills/agentic-os/scripts/eval.py list --skill <skill>)
- Record the result with: python3 ~/.agents/skills/agentic-os/scripts/eval.py verify --skill <skill> --task-id <id> --notes "..."
Before you conclude your session or step:
- If you learned something worth keeping, run python3 ~/.agents/skills/agentic-os/scripts/learn.py add --skill <skill> --worked "..." --didnt "..." --rule "..." to record it.
- Update .agent/state/last-run.json with a summary of what you just did, what decisions you made, and what needs to happen next.
- If you pulled a task from tasks.json, mark its status as completed.
- If you performed an external action (for example, a deployment), log a receipt in the external_receipts array in last-run.json and the corresponding task in tasks.json to ensure an audit trail.
When you learn a critical fact about the project, the user's preferences, or system architecture, do not rely solely on your proprietary memory system (e.g., Claude's internal .claude/ memory or Gemini's save_memory / ~/.gemini/tmp/.../MEMORY.md).
Proprietary memory is siloed and invisible to other agents. You MUST record project-wide facts in shared locations so that Claude, Gemini, and Antigravity can all access them:
- .agent/learnings/ for specific skill rules or workflow learnings.
- GEMINI.md or CLAUDE.md for team-shared architectural conventions, repository rules, or broad project guidance.
last-run.json
Used for operational continuity.
{
  "timestamp": "2026-04-25T12:00:00Z",
  "task_id": "task-123",
  "status": "completed",
  "description": "Built the new Hero component.",
  "assigned_skill": "build-component",
  "project_id": "generic-service",
  "agent_id": "antigravity",
  "user_id": "matthias",
  "outcome": {
    "result": "Success"
  },
  "trace_id": "req-12345",
  "decision_log": "Used standard Tailwind utility classes instead of custom CSS for faster rendering",
  "external_receipts": [
    {
      "action": "Vercel Preview Deployment",
      "receipt_hash": "sha256-1a2b3c..."
    }
  ]
}
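The step of updating last-run.json at the end of a session can be sketched in Python. This is a minimal sketch, not the skill's own tooling: the helper name `write_last_run` is hypothetical, it assumes the file holds a single JSON object that is overwritten on each run (as the example above suggests), and it fills only a subset of the fields shown.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_last_run(project_root, task_id, description, assigned_skill,
                   agent_id, decision_log, external_receipts=None,
                   status="completed"):
    """Overwrite .agent/state/last-run.json with a record of the latest action.

    Assumption: one JSON object per file, replaced on every run.
    """
    record = {
        # UTC timestamp in the same format as the example record.
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "task_id": task_id,
        "status": status,
        "description": description,
        "assigned_skill": assigned_skill,
        "agent_id": agent_id,
        "decision_log": decision_log,
        "external_receipts": external_receipts or [],
    }
    path = Path(project_root) / ".agent" / "state" / "last-run.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(record, indent=2) + "\n", encoding="utf-8")
    return record
```

Fields such as project_id, user_id, outcome, and trace_id are left out here for brevity; they could be added to the record in the same way.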
errors.json
Used by the command-center skill to track unresolved failures.
[
  {
    "timestamp": "2026-04-25T12:05:00Z",
    "task_id": "task-124",
    "assigned_skill": "osint",
    "reason": "Target API returned 429 Too Many Requests after 3 retries",
    "trace_id": "req-12346",
    "decision_log": "Attempted to backoff but exceeded max wait time.",
    "resolved": false
  }
]
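Appending to this log and filtering it for open failures could look like the following. It is a sketch under stated assumptions: the helper names `load_errors`, `record_error`, and `unresolved_errors` are hypothetical, the file is assumed to hold a JSON array as in the example, and an empty or missing file is treated as an empty list.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def _errors_path(project_root):
    return Path(project_root) / ".agent" / "state" / "errors.json"


def load_errors(project_root):
    """Return the list stored in errors.json, or [] if the file is missing or empty."""
    path = _errors_path(project_root)
    if not path.exists():
        return []
    text = path.read_text(encoding="utf-8").strip()
    return json.loads(text) if text else []


def record_error(project_root, task_id, assigned_skill, reason, decision_log=""):
    """Append an unresolved failure entry and write the whole list back."""
    errors = load_errors(project_root)
    entry = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "task_id": task_id,
        "assigned_skill": assigned_skill,
        "reason": reason,
        "decision_log": decision_log,
        "resolved": False,
    }
    errors.append(entry)
    path = _errors_path(project_root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(errors, indent=2) + "\n", encoding="utf-8")
    return entry


def unresolved_errors(project_root):
    """Return only the entries whose resolved flag is not true."""
    return [e for e in load_errors(project_root) if not e.get("resolved", False)]
```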
learnings.json (per skill)
Used to permanently correct agent behavior without modifying the base skill prompt.
{
  "skill": "build-component",
  "history": [
    {
      "date": "2026-04-21",
      "what_worked": "Extracting the SVG into a separate file kept the component clean.",
      "what_didnt": "Trying to use Next.js Image for inline SVGs caused layout shifts.",
      "rule_change": "Always put complex SVGs in a separate .tsx file and import as a React component. Do not use next/image for vectors."
    }
  ]
}
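The pre-flight step of reading a skill's learnings and applying its rule_change entries can be sketched as below. This is an illustration only: the helper name `load_rules` is hypothetical, and the code assumes the file lives at .agent/learnings/<skill>.json and follows the schema shown above.

```python
import json
from pathlib import Path


def load_rules(project_root, skill):
    """Return the rule_change strings recorded for a skill, oldest first.

    Returns [] when no learnings file exists for the skill.
    """
    path = Path(project_root) / ".agent" / "learnings" / f"{skill}.json"
    if not path.exists():
        return []
    data = json.loads(path.read_text(encoding="utf-8"))
    # Skip history entries that carry no rule_change text.
    return [
        entry["rule_change"]
        for entry in data.get("history", [])
        if entry.get("rule_change")
    ]
```

An agent could prepend the returned rules to its working instructions before starting the skill, so earlier corrections carry into the new session.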
eval.json (per skill)
Used to define rigid quality gates.
{
  "skill": "build-component",
  "criteria": [
    "Component must be responsive down to 320px",
    "Must not use inline styles",
    "Must be exported as a default export"
  ]
}
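Self-grading against these criteria can be illustrated with a short Python sketch. It does not reproduce the skill's eval.py script; the helper names `load_criteria` and `failing_criteria` are hypothetical, and the code assumes the criteria file lives at .agent/evals/<skill>.json with the schema shown above.

```python
import json
from pathlib import Path


def load_criteria(project_root, skill):
    """Return the criteria list for a skill, or [] if no eval file exists."""
    path = Path(project_root) / ".agent" / "evals" / f"{skill}.json"
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8")).get("criteria", [])


def failing_criteria(criteria, results):
    """Return the criteria that were not marked as passed.

    results maps criterion text to a bool; a criterion missing from
    results counts as failing, so nothing passes by omission.
    """
    return [c for c in criteria if not results.get(c, False)]
```

Treating unlisted criteria as failures keeps the gate strict: output counts as done only when every criterion has been explicitly checked.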