.claude/skills/taiga-api/SKILL.md
Query the hosted Taiga API at taiga.ant.dev for job results, passrates, transcripts, and run evaluations. Use when user asks about Taiga jobs, problem scores, eval results, or needs to submit/check jobs.
Query the hosted Taiga evaluation platform API for job results, transcripts, and problem runs.
Always use Python for Taiga API requests. Shell has env var + pipe bugs that strip cookie values.
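Python helper to load cookie:
```python
def get_cookie():
    # Read the IAP cookie value from the worktree .env file
    with open('/home/atondwal/dmodel/ant/taiga-worktree/.env') as f:
        for line in f:
            if line.startswith('TAIGA_IAP_COOKIE='):
                return line.split('=', 1)[1].strip().strip('"')
    # Fail loudly instead of silently returning None
    raise RuntimeError('TAIGA_IAP_COOKIE not found in .env')
```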
When submitting jobs, ALWAYS use claude-opus-4-5-20251101 as the model. Never use Sonnet or other models unless explicitly requested.
Cookie stored in ~/dmodel/ant/taiga-worktree/.env. Uses __Host- prefix (session-only). If auth fails, ask user to refresh from browser DevTools → Network → copy Cookie header.
```python
import urllib.request, json

def taiga_get(endpoint):
    cookie = get_cookie()  # see helper above
    req = urllib.request.Request(f"https://taiga.ant.dev/api{endpoint}")
    req.add_header('Cookie', cookie)
    return json.loads(urllib.request.urlopen(req).read())

# Example: get job problems
data = taiga_get(f"/jobs/{job_id}/problems")
```
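When auth fails, the request either raises an HTTP error or returns something that is not JSON. A minimal sketch of a wrapper that turns both cases into a hint is below. `describe_taiga_error` and `safe_get` are hypothetical names, and the status-code mapping (401/403 meaning an expired cookie, a non-JSON body meaning a login page) is an assumption, not documented Taiga behavior.

```python
import urllib.error


def describe_taiga_error(err):
    """Map an HTTPError to a short hint (the mapping is an assumption)."""
    if err.code in (401, 403):
        return "Auth failed: refresh TAIGA_IAP_COOKIE from browser DevTools."
    if err.code == 404:
        return "Not found: check the job, problem, or run ID."
    return f"HTTP {err.code}: {err.reason}"


def safe_get(fetch, endpoint):
    """Call fetch(endpoint); return (data, None) on success or (None, hint)."""
    try:
        return fetch(endpoint), None
    except urllib.error.HTTPError as err:
        return None, describe_taiga_error(err)
    except ValueError:
        # json.loads raises a ValueError subclass on non-JSON bodies,
        # which may mean a login page was returned (assumption).
        return None, "Response was not JSON: refresh the cookie and retry."
```

Pass `taiga_get` as `fetch`, e.g. `data, hint = safe_get(taiga_get, "/jobs")`, and show `hint` to the user when it is not `None`.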
Full docs at: https://taiga.ant.dev/api/docs

Jobs:

| Endpoint | Method | Purpose |
|----------|--------|---------|
| /jobs | GET | List all jobs |
| /jobs?environment_id={id} | GET | List jobs for environment |
| /jobs/{job_id} | GET | Get job details |
| /jobs/{job_id}/problems | GET | Get problem results (passrates, scores) |
| /jobs/{job_id}/problems/stream | GET | Stream problem results |
| /jobs/{job_id}/error-summary | GET | Get error summary |
| /jobs | POST | Create job with problems |
| /cancel-job/{job_id} | POST | Cancel running job |
| /resubmit-problem/{job_id}/{problem_id} | POST | Resubmit specific problem |

Transcripts:

| Endpoint | Method | Purpose |
|----------|--------|---------|
| /transcript/{problem_run_id} | GET | Get full transcript |
| /transcript/stream/{problem_run_id} | GET | Stream transcript |

Problem runs:

| Endpoint | Method | Purpose |
|----------|--------|---------|
| /problem_runs/{problem_id} | GET | List runs for problem |
| /problem-runs/{id}/container-logs | GET | Get container logs |
| /problem-runs/{id}/mcp-server-logs | GET | Get MCP server logs |
| /problem-runs/{id}/download-output | GET | Download output directory |

Environments:

| Endpoint | Method | Purpose |
|----------|--------|---------|
| /environments | GET | List environments |
| /environments/{id} | GET | Get environment details |
| /environments?skip=0&limit=100 | GET | Paginated list |

Problems:

| Endpoint | Method | Purpose |
|----------|--------|---------|
| /problems/{problem_id}/attempts | GET | Get problem attempts |
| /problems/versions/{version_id} | GET | Get problem version |
| /problems/versions/{version_id}/run | POST | Run problem version |
| /problem-crud | GET | List all problems |
| /problem-crud/stats/pass-rates | POST | Get pass rate stats |

Docker images:

| Endpoint | Method | Purpose |
|----------|--------|---------|
| /docker-images | GET | List docker images |
| /docker-images/{id}/download | GET | Download image source |
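The `skip`/`limit` parameters on `/environments` suggest offset pagination. A sketch of walking all pages follows. It assumes the endpoint returns a JSON list and that a page shorter than `limit` marks the end; neither is confirmed by the API docs here. `iter_paginated` is a hypothetical helper name.

```python
def iter_paginated(fetch, endpoint, limit=100):
    """Yield items from a skip/limit endpoint until a short or empty page."""
    skip = 0
    while True:
        # Append to an existing query string if the endpoint already has one
        sep = "&" if "?" in endpoint else "?"
        page = fetch(f"{endpoint}{sep}skip={skip}&limit={limit}")
        yield from page
        if len(page) < limit:  # assumption: a short page means no more data
            break
        skip += limit
```

Usage with the helper above: `envs = list(iter_paginated(taiga_get, "/environments"))`.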
```python
# Print the final score of every problem run in a job
job_id = "3c300cca-707a-4e92-ac71-5688165f9ae1"  # from URL ?id= param
data = taiga_get(f"/jobs/{job_id}/problems")
for r in data:
    print(f"{r['problem_id']}: {r['final_score']}")
```
```python
# Compute per-problem and overall pass rates for a job
from collections import defaultdict

job_id = "YOUR_JOB_ID"
data = taiga_get(f"/jobs/{job_id}/problems")

# Group scores by problem; a run counts as passed only when final_score == 1.0
problems = defaultdict(list)
for r in data:
    problems[r['problem_id']].append(r['final_score'])

total_pass = total_runs = 0
for pid, scores in sorted(problems.items()):
    passed = sum(1 for s in scores if s == 1.0)
    total = len(scores)
    total_pass += passed
    total_runs += total
    print(f"{pid}: {passed}/{total} ({100*passed/total:.0f}%)")

if total_runs:  # avoid dividing by zero when the job has no results yet
    print(f"\nOverall: {total_pass}/{total_runs} ({100*total_pass/total_runs:.1f}%)")
```
```python
# Fetch the full transcript for one problem run
problem_run_id = "118ed21a-9864-4c8c-b34b-d92428f1c22a"
transcript = taiga_get(f"/transcript/{problem_run_id}")
```
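To decide which transcripts are worth reading, it can help to pick out the failing runs first. The sketch below filters the `/jobs/{job_id}/problems` results by `final_score`. It assumes each result's `id` field is the `problem_run_id` that `/transcript/{problem_run_id}` expects, and that a missing score counts as a failure. `failing_run_ids` is a hypothetical helper.

```python
def failing_run_ids(results, pass_score=1.0):
    """Return {problem_id: [run ids]} for runs scoring below pass_score."""
    failing = {}
    for r in results:
        score = r.get("final_score")
        # Treat a missing score as a failure (assumption)
        if score is None or score < pass_score:
            failing.setdefault(r["problem_id"], []).append(r["id"])
    return failing
```

Each returned run ID can then be passed to `taiga_get(f"/transcript/{run_id}")`.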
```python
# List jobs for one environment
env_id = "8e646c11-1461-44a4-9e8d-e3800a02ba07"
jobs = taiga_get(f"/jobs?environment_id={env_id}")
for j in jobs:
    print(f"{j['id']}: {j['status']}")
```
```python
# Check the status of a single job (job_id as defined above)
job = taiga_get(f"/jobs/{job_id}")
print(f"Status: {job['status']}, Completed: {job.get('completed_count')}")
```
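The status check above can be wrapped in a polling loop when waiting for a job to finish. This is a sketch only: `"completed"` is the only status value that appears in this document, so `"failed"` and `"cancelled"` in the `terminal` tuple are guesses, and `wait_for_job` is a hypothetical helper.

```python
import time


def wait_for_job(fetch_job, job_id, poll_seconds=30, max_polls=120,
                 terminal=("completed", "failed", "cancelled"),
                 sleep=time.sleep):
    """Poll /jobs/{job_id} until its status is terminal, then return the job."""
    for _ in range(max_polls):
        job = fetch_job(f"/jobs/{job_id}")
        if job.get("status") in terminal:  # terminal names are assumptions
            return job
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")
```

Usage: `job = wait_for_job(taiga_get, job_id)`. The `sleep` parameter exists so the loop can be exercised without real delays.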
```python
# Submit a new job
import urllib.request, json

with open('problems-metadata.json') as f:
    problems = json.load(f)

payload = {
    "name": "my-job-name",
    "problems_metadata": problems,
    "n_attempts_per_problem": 10,
    "api_model_name": "claude-opus-4-5-20251101"  # ALWAYS use Opus 4.5
}

cookie = get_cookie()
# Passing data= makes urllib send this request as a POST
req = urllib.request.Request(
    "https://taiga.ant.dev/api/jobs",
    data=json.dumps(payload).encode(),
    headers={"Cookie": cookie, "Content-Type": "application/json"}
)
resp = json.loads(urllib.request.urlopen(req).read())
print(f"Job ID: {resp.get('job_id')}")
```
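The other POST endpoints in the tables (`/cancel-job/{job_id}`, `/resubmit-problem/{job_id}/{problem_id}`) can reuse the same pattern. The sketch below builds the request without sending it. `build_post` is a hypothetical helper, and whether these endpoints accept an empty body is an assumption. Note that `urllib` only sends a POST without a body when `method="POST"` is set explicitly.

```python
import json
import urllib.request

BASE_URL = "https://taiga.ant.dev/api"


def build_post(endpoint, cookie, payload=None):
    """Build (but do not send) a POST request, with or without a JSON body."""
    headers = {"Cookie": cookie}
    body = None
    if payload is not None:
        body = json.dumps(payload).encode()
        headers["Content-Type"] = "application/json"
    # Without data, urllib defaults to GET, so set the method explicitly
    return urllib.request.Request(
        BASE_URL + endpoint, data=body, headers=headers, method="POST"
    )
```

Usage: `urllib.request.urlopen(build_post(f"/cancel-job/{job_id}", get_cookie()))`.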
Example problem result object:
```json
{
  "id": "118ed21a-...",
  "problem_id": "sort-unique",
  "attempt_number": 1,
  "final_score": 1.0,
  "status": "completed",
  "subscores": {"matched_solution": 1.0},
  "weights": {"matched_solution": 1.0},
  "execution_time_ms": 467000,
  "total_tokens": 34205
}
```
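The `execution_time_ms` and `total_tokens` fields make it easy to total up what a job cost. A small sketch, assuming every result carries these two fields as in the example above and treating missing values as zero; `usage_totals` is a hypothetical helper.

```python
def usage_totals(results):
    """Sum token and wall-clock usage over a list of problem results."""
    tokens = sum(r.get("total_tokens") or 0 for r in results)
    millis = sum(r.get("execution_time_ms") or 0 for r in results)
    return {
        "runs": len(results),
        "total_tokens": tokens,
        "total_seconds": millis / 1000,
    }
```

Usage: `usage_totals(taiga_get(f"/jobs/{job_id}/problems"))`.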
Example job object:
```json
{
  "id": "3c300cca-...",
  "status": "completed",
  "environment_id": "8e646c11-...",
  "api_model_name": "claude-opus-4-5-20251101",
  "created_at": "2025-11-24T17:46:30Z"
}
```
From Taiga web UI URLs:
- https://taiga.ant.dev/job?id={job_id}&environmentId={env_id}
- https://taiga.ant.dev/transcripts?id={job_id}&problemId={problem_id}&...

The `id` parameter in URLs is the job_id.
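When the user pastes a web UI URL, the IDs can be pulled out with the standard library rather than by hand. A minimal sketch using the query parameter names shown above (`id`, `environmentId`, `problemId`); `ids_from_taiga_url` is a hypothetical helper.

```python
from urllib.parse import urlparse, parse_qs


def ids_from_taiga_url(url):
    """Extract job, environment, and problem IDs from a Taiga web UI URL."""
    query = parse_qs(urlparse(url).query)

    def first(key):
        # parse_qs maps each key to a list of values; take the first or None
        return query.get(key, [None])[0]

    return {
        "job_id": first("id"),
        "env_id": first("environmentId"),
        "problem_id": first("problemId"),
    }
```

Parameters missing from the URL come back as `None`.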
Notes:
- Use Python `urllib.request` for requests; avoid the shell due to env var bugs.
- `/jobs/{id}/problems` is the main endpoint for checking pass rates.
- The `/stream` variants of the problem and transcript endpoints stream results.