skills/auto-improve/SKILL.md
Self-improvement loop for arkadian. Launches arkadian -d with a GitHub issue, polls until done, evaluates per-agent criteria, reads transcripts on failure, improves prompts, deletes worktree, and relaunches. Loops until passing or max iterations.
You are running the auto-improvement loop for Arkadian. Your job is to make Arkadian's development pipeline reliable by iteratively running real tasks and fixing the prompts/skills/hooks that cause failures.
You receive a GitHub issue URL (e.g. https://github.com/ark-network/ark/issues/909).
- `ARKADIAN_DIR`: path to the arkadian repo (where you're running)
- `ARKADIAN_DATA_DIR`: runtime data directory (session state, logs)

Execute this loop. Do NOT skip steps. Do NOT proceed to improvement without reading artifacts first.
```
FOR iteration = 1 to 8:
  1. LAUNCH:   arkadian -d "<issue_url>"
  2. POLL:     wait for session to complete (2h max)
  3. FIND:     locate session directory
  4. EVALUATE: run eval-engine.ts + check artifacts
  5. IF PASS → DONE (keep worktree, generate report)
  6. ANALYSE:  read hook log + transcript for failing agents
  7. IMPROVE:  edit max 2 files (agents, skills, orchestrator, hooks, templates)
  8. CLEAN:    delete worktree from failed run
  9. CHECK:    if 3 consecutive no-improvement → STOP
  10. LOOP
```
```bash
arkadian -d "<ISSUE_URL>"
```

This command outputs lines like:

```
Arkadian detached (PID 12345)
/path/to/sessions/<session-id>
```
Capture:
- `PID`: the process ID
- `SESSION_DIR`: the session directory path (second line, trimmed)

If only a log path is shown (no session dir), wait 10s and check `${ARKADIAN_DATA_DIR}` for the newest `*_active.txt` file to find the session dir.
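The capture step can be sketched in shell. This is a minimal sketch that assumes the launch output has exactly the two-line shape shown above and is written to stdout; `parse_launch_output` is a hypothetical helper, not part of Arkadian.

```shell
# Hypothetical helper (not part of Arkadian): reads the launch output on
# stdin and prints "PID=<pid>" and "SESSION_DIR=<path>" on two lines.
# Assumes line 1 contains "(PID <digits>)" and line 2 is the session path.
# The body runs in a subshell so its variables do not leak to the caller.
parse_launch_output() (
  output=$(cat)
  # PID: the digits inside "(PID 12345)" on the first line
  pid=$(printf '%s\n' "$output" | sed -n '1s/.*(PID \([0-9][0-9]*\)).*/\1/p')
  # SESSION_DIR: the second line with leading/trailing whitespace trimmed
  session_dir=$(printf '%s\n' "$output" | sed -n '2p' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
  printf 'PID=%s\nSESSION_DIR=%s\n' "$pid" "$session_dir"
)
```

It could be used as `arkadian -d "$ISSUE_URL" | parse_launch_output`. If the first line does not match, `PID=` is printed with an empty value, which is the cue to fall back to the `*_active.txt` lookup.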
Run this polling loop:
```bash
# Find the session ID from the newest active file
ACTIVE_FILE=$(ls -t "${ARKADIAN_DATA_DIR}"/*_active.txt 2>/dev/null | head -1)
SESSION_ID=$(basename "$ACTIVE_FILE" _active.txt)

# Poll every 30 seconds, max 2 hours (240 iterations)
for i in $(seq 1 240); do
  if [ ! -f "${ARKADIAN_DATA_DIR}/${SESSION_ID}_active.txt" ]; then
    echo "Session completed"
    break
  fi
  sleep 30
done
```
The session is done when _active.txt disappears (deleted by session-stop-hook).
If timeout (2h): Report timeout, check if process is still alive (kill -0 $PID). If zombie, kill it. Move to evaluation with whatever artifacts exist.
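The liveness check on timeout might look like the sketch below. `is_alive` and `stop_if_alive` are hypothetical helper names, and the two-second grace period before `kill -9` is an assumption rather than something the skill prescribes.

```shell
# Returns 0 if a process with this PID exists and can be signalled.
is_alive() {
  kill -0 "$1" 2>/dev/null
}

# Hypothetical timeout handler: ask the process to terminate, wait briefly
# (assumed grace period), then force-kill it if it is still there.
stop_if_alive() {
  if is_alive "$1"; then
    kill "$1" 2>/dev/null
    sleep 2
    if is_alive "$1"; then
      kill -9 "$1" 2>/dev/null
    fi
  fi
  return 0
}
```

After `stop_if_alive "$PID"`, evaluation proceeds with whatever artifacts the session left behind.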
After completion, the session folder gets renamed by the summarize worker. Find it:
Read `${ARKADIAN_DIR}/sessions/.index/manifest.json` and take the `session_dir` field.

If the manifest hasn't been updated yet (worker still running), wait 15s and retry. If the session is still not found, search `${ARKADIAN_DIR}/sessions/` for the most recently modified directory.
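The last-resort search for the most recently modified directory can be sketched as follows. `newest_session_dir` is a hypothetical helper; it assumes session folders sit directly under the sessions root and that their names contain no newlines.

```shell
# Hypothetical helper: print the most recently modified directory directly
# under the given root, or nothing if there is none.
# The trailing slash restricts the glob to directories, and the * glob skips
# dot-directories such as .index. ls -t sorts newest first.
newest_session_dir() {
  ls -td "$1"/*/ 2>/dev/null | head -n 1 | sed 's|/$||'
}
```

For example, `SESSION_DIR=$(newest_session_dir "${ARKADIAN_DIR}/sessions")`. An empty result means no session directory exists yet.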
Store SESSION_DIR for subsequent steps.
```bash
bun ${ARKADIAN_DIR}/benchmarks/eval-engine.ts --session-dir "${SESSION_DIR}"
```
This outputs JSON with per-agent scores. Parse the output.
The JSON structure:
```json
{
  "pass": true/false,
  "overall_percentage": 85,
  "agents": [
    {
      "agent": "ark-developer",
      "score": 11,
      "max_score": 15,
      "percentage": 73,
      "criteria": [
        {"id": "e2e_test_written", "passed": false, "detail": "..."},
        ...
      ]
    },
    ...
  ],
  "failures": ["[ark-developer] e2e_test_written: integration_test_written = false", ...]
}
```
Even if eval-engine says pass, manually verify these (the engine can't catch everything):
- `test-evidence.md`: is there REAL test output? Not just "go vet" or "go build"?
- `assessment.yaml`: does the guru's analysis make sense for this issue?
- `changes.yaml`: is the worktree path present? Were files actually changed?

If `pass: true` in eval results AND manual artifact check looks good:
- Read `changes.yaml` to get the `worktree_path`.

```markdown
# Auto-Improve: PASS

**Issue**: <url>
**Iteration**: N of 8
**Session**: <session_dir>
**Worktree**: <worktree_path> (preserved for inspection)

## Per-Agent Scores

| Agent | Score | Status |
|-------|-------|--------|
| ark-guru | 8/8 (100%) | PASS |
| ark-project-manager | 6/6 (100%) | PASS |
| ark-developer | 13/15 (87%) | PASS |
| ark-developer-ci | 7/7 (100%) | PASS |
| ark-pr-reviewer | 2/2 (100%) | PASS |

## Improvements Made (across iterations)

- Iter 1: Edited agents/ark-developer.md: added constraint about Skill invocation
- Iter 2: Edited skills/arkd-dev-loop/SKILL.md: fixed infra health check

## Prompt Changes (review with `git diff agents/ skills/ ORCHESTRATOR.md templates/ hooks/`)

<list accumulated changes>
```
Save report to ${ARKADIAN_DIR}/benchmarks/runs/<issue-slug>/report.md.
STOP HERE. Do not continue the loop.
For each failing agent (percentage below pass threshold or critical criteria failed):
```bash
# Find the log file
LOG_FILE=$(ls -t "${ARKADIAN_DATA_DIR}"/*_log.txt 2>/dev/null | head -1)
```
Read the log file. Look for:
Locate the transcript:
- `transcript_path`
- `~/.claude/projects/-Users-*/${SESSION_UUID}.jsonl`

Read in chunks (use Read with offset and limit). Focus on the failing agent's phase:
| Pattern in transcript | What it means |
|----------------------|---------------|
| Agent never called Skill("arkd-dev-loop") | Skill dispatch not happening |
| Agent said "I'll skip..." or "Since infrastructure isn't..." | Agent found escape hatch |
| Agent looped on infra setup (repeated "docker", "nigiri", "connect" failures) | Skill procedure broken |
| Agent ran out of context ("I need to summarize...") | Prompt too large |
| Agent produced output that contradicts spec | Didn't read/follow PM artifacts |
| Agent didn't read upstream artifacts at all | Artifact passing or spec issue |
| Orchestrator spec was missing fields | Orchestrator template problem |
Combine eval failures + log analysis + transcript reading to identify:
Make at most 2 edits per iteration. Each edit targets the root cause from Step 6.
| Layer | Files |
|-------|-------|
| Agent prompts | agents/ark-guru.md, agents/ark-developer.md, agents/ark-project-manager.md, agents/ark-pr-reviewer.md |
| Skills | skills/arkd-dev-loop/SKILL.md, skills/arkd-gha/SKILL.md, skills/fulmine-dev-loop/SKILL.md, skills/dev-implement/prompt.md, skills/pm-spec/prompt.md, skills/pm-plan/prompt.md, skills/pm-tasks/prompt.md |
| Orchestrator | ORCHESTRATOR.md |
| Templates | templates/sub_agent_input_spec.md, templates/workflows/development_unified.yaml, templates/agent_catalog.md |
| Hooks | hooks/validation-contracts.ts, hooks/post-agent-validator.ts, hooks/pre-agent-validator.ts |
Edits take effect on the next `arkadian -d` launch (skills need `make install-skills` first if in `skills/`, but agent files are loaded directly).

If you edited any file in `skills/`:

```bash
cd "${ARKADIAN_DIR}" && make install-skills
```
This copies the updated skill to ~/.claude/commands/ where Claude Code can find it.
Only on failure (not on pass — see Step 5).
Read changes.yaml from the session to find the worktree path:
```bash
# Extract worktree_path from changes.yaml
WORKTREE=$(grep 'worktree_path:' "${SESSION_DIR}/artifacts/implement/changes.yaml" | awk '{print $2}' | tr -d '"')

if [ -n "$WORKTREE" ] && [ -d "$WORKTREE" ]; then
  rm -rf "$WORKTREE"
  # Get the repo root from the worktree path (parent of .worktrees/)
  REPO_ROOT=$(echo "$WORKTREE" | sed 's|/.worktrees/.*||')
  git -C "$REPO_ROOT" worktree prune 2>/dev/null
fi
```
If changes.yaml doesn't exist (crash before implement phase), skip this step.
Track across iterations:
- `scores`: array of `overall_percentage` per iteration
- `improvements_made`: count of edits per iteration

Stop conditions:

- If `scores[i] <= scores[i-1]` for 3 iterations in a row, stop and report. The prompt changes aren't helping, so human intervention is needed.

On stop:
```markdown
# Auto-Improve: STOPPED

**Reason**: <3 consecutive no-improvement | max iterations>
**Best score**: <N>% (iteration <M>)
**Issue**: <url>

## Score Progression

| Iter | Score | Improvements |
|------|-------|-------------|
| 1 | 45% | agents/ark-developer.md (add_constraint) |
| 2 | 60% | skills/arkd-dev-loop/SKILL.md (fix_procedure) |
| 3 | 60% | agents/ark-guru.md (refine_wording) |
| 4 | 60% | STOPPED: no improvement in 3 iterations |

## Remaining Failures

<list from last eval>

## Recommendation

<what a human should look at>
```
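The no-improvement rule can be sketched as a small shell check. This sketch takes the rule literally: it counts consecutive comparisons where `scores[i] <= scores[i-1]`, so it needs at least four scores before it can fire. `should_stop` is a hypothetical helper name, and it assumes integer percentages.

```shell
# Hypothetical helper: arguments are the overall_percentage of each iteration,
# oldest first. Exits 0 (stop) when the three most recent comparisons all
# satisfy scores[i] <= scores[i-1], otherwise exits non-zero (continue).
# The body runs in a subshell so its variables do not leak to the caller.
should_stop() (
  streak=0
  prev=
  for score in "$@"; do
    if [ -n "$prev" ]; then
      if [ "$score" -le "$prev" ]; then
        streak=$((streak + 1))   # another iteration without improvement
      else
        streak=0                 # improvement resets the count
      fi
    fi
    prev=$score
  done
  [ "$streak" -ge 3 ]
)
```

A caller might write `if should_stop $SCORES; then ...write the STOPPED report...; fi`, where `SCORES` holds the space-separated history. The separate cap of 8 iterations is left to the enclosing loop.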
Maintain state in ${ARKADIAN_DIR}/benchmarks/runs/<issue-slug>/campaign.yaml:
```yaml
issue_url: "https://github.com/ark-network/ark/issues/909"
issue_slug: "ark-909"
started_at: "2026-03-28T10:00:00Z"
status: "in_progress"  # in_progress | passed | stopped
iterations:
  - iteration: 1
    session_dir: "/path/to/session"
    score: 45
    max_score: 100
    improvements:
      - file: "agents/ark-developer.md"
        type: "add_constraint"
        description: "Added explicit Skill invocation requirement"
    worktree_deleted: true
  - iteration: 2
    session_dir: "/path/to/session"
    score: 85
    max_score: 100
    improvements: []
    worktree_deleted: false  # kept on pass
best_score: 85
best_iteration: 2
```
Update this file after each iteration.
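The `<issue-slug>` path component is not defined explicitly. Judging only from the example above (`ark-909` for issue 909 in the `ark` repo), one plausible derivation is repo name plus issue number. The sketch below encodes that assumption; `issue_slug` is a hypothetical helper.

```shell
# Hypothetical helper: derive "<repo>-<number>" from a GitHub issue URL.
# The slug format is an assumption inferred from the single "ark-909" example.
# Prints nothing if the URL does not look like a GitHub issue URL.
issue_slug() {
  printf '%s\n' "$1" \
    | sed -n 's|^https://github.com/[^/]*/\([^/]*\)/issues/\([0-9][0-9]*\).*$|\1-\2|p'
}
```

With it, the campaign file path would be `${ARKADIAN_DIR}/benchmarks/runs/$(issue_slug "$ISSUE_URL")/campaign.yaml`.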