tool-error-analyzer/SKILL.md
Analyze Codex rollout transcripts for one repository and time window, extract repo-scoped tool calls that failed, cluster repeated error patterns, and recommend the next agent/tooling improvements. Use when the user wants to inspect tool-call failures, operational friction, repeated command mistakes, patch failures, or prompt/tooling improvements based on recent Codex runs.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add grp06/useful-codex-skills tool-error-analyzer
Use this skill when the job is "show me where the agent struggled operationally" rather than "summarize what work happened."
The default path is: run the bundled digest script once, read its clustered failure summary, and only open raw rollout transcripts if the digest reports incomplete coverage or a cluster needs deeper inspection.
This skill is transcript-first. It reads real rollout JSONL files and pairs:
- function_call with function_call_output
- custom_tool_call with custom_tool_call_output

The script focuses on meaningful failures such as:

- apply_patch verification failures

It intentionally filters low-signal cases such as plain rg no-match exits when
the transcript shows no actual error text.
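The pairing and filtering described above can be sketched roughly as follows. This is an illustration only, not the bundled script's code: the record fields (`type`, `call_id`, `command`, `exit_code`, `stderr`) and both helper names are assumptions about the rollout JSONL shape, which may differ in practice.

```python
from typing import Iterable

# Assumed mapping from call record type to its matching output record type.
CALL_TO_OUTPUT = {
    "function_call": "function_call_output",
    "custom_tool_call": "custom_tool_call_output",
}


def pair_calls(records: Iterable[dict]) -> list[tuple[dict, dict]]:
    """Pair each call record with its output record by call_id (hypothetical schema)."""
    pending: dict[str, dict] = {}
    pairs: list[tuple[dict, dict]] = []
    output_types = set(CALL_TO_OUTPUT.values())
    for record in records:
        kind = record.get("type")
        call_id = record.get("call_id")
        if call_id is None:
            continue
        if kind in CALL_TO_OUTPUT:
            pending[call_id] = record
        elif kind in output_types and call_id in pending:
            call = pending.pop(call_id)
            # Only keep pairs whose output type matches the call type.
            if CALL_TO_OUTPUT[call["type"]] == kind:
                pairs.append((call, record))
    return pairs


def is_low_signal(call: dict, output: dict) -> bool:
    """True for a plain rg/grep no-match exit (code 1) with no error text."""
    words = call.get("command", "").split()
    is_search = bool(words) and words[0] in {"rg", "grep"}
    has_error_text = bool(output.get("stderr", "").strip())
    return is_search and output.get("exit_code") == 1 and not has_error_text
```

Keeping pairing separate from filtering means the low-signal rule can change without touching how calls are matched to outputs.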
Run the digest first.
python3 ~/.codex/skills/tool-error-analyzer/scripts/repo_tool_error_digest.py \
--repo ~/TaskRally \
--when yesterday \
--format markdown
If the user gave a specific date, replace --when yesterday with
--date YYYY-MM-DD.
If the user asked for a rolling recent window:
python3 ~/.codex/skills/tool-error-analyzer/scripts/repo_tool_error_digest.py \
--repo ~/TaskRally \
--last-hours 6 \
--format markdown
If you want machine-readable output:
python3 ~/.codex/skills/tool-error-analyzer/scripts/repo_tool_error_digest.py \
--repo ~/TaskRally \
--date 2026-04-01 \
--format json
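If the digest is launched from another Python tool rather than typed by hand, the three invocations above can be assembled by one small helper. This is a sketch: `build_digest_command` is a hypothetical name, and the rule that exactly one of `--when`, `--date`, or `--last-hours` is passed is an assumption drawn from the examples, not a documented constraint.

```python
from pathlib import Path

# Script location as shown in the commands above.
SCRIPT = "~/.codex/skills/tool-error-analyzer/scripts/repo_tool_error_digest.py"


def build_digest_command(repo, *, when=None, date=None, last_hours=None, fmt="markdown"):
    """Build the argv list for one digest run (hypothetical helper)."""
    selectors = [("--when", when), ("--date", date), ("--last-hours", last_hours)]
    chosen = [(flag, value) for flag, value in selectors if value is not None]
    # Assumption: the window selectors are mutually exclusive.
    if len(chosen) != 1:
        raise ValueError("pass exactly one of when, date, or last_hours")
    flag, value = chosen[0]
    return [
        "python3",
        # No shell is involved when argv is passed as a list, so expand ~ here.
        str(Path(SCRIPT).expanduser()),
        "--repo", str(repo),  # left unexpanded; the script resolves ~ itself
        flag, str(value),
        "--format", fmt,
    ]
```

The resulting list can be handed to `subprocess.run(cmd, capture_output=True, text=True)`.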
Important: the script is fail-closed for transcript coverage. If any relevant
repo session for the target window has no readable rollout transcript, the
digest reports INCOMPLETE and exits non-zero unless --allow-incomplete is
set. Do not make confident process recommendations from a partial result.
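A caller that automates the digest can apply the same fail-closed rule on its side. The sketch below assumes the run used `--format json` and that the payload's `status` field is `complete` or `incomplete`; `digest_is_trustworthy` is a hypothetical helper name, not part of the skill.

```python
import json


def digest_is_trustworthy(returncode: int, stdout: str) -> bool:
    """Return True only when the run exited zero and reports complete coverage."""
    if returncode != 0:
        return False
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError:
        return False
    # Checking status as well covers runs made with --allow-incomplete,
    # where a partial digest would not by itself cause a non-zero exit.
    return isinstance(payload, dict) and payload.get("status") == "complete"
```

Checking both the exit code and the `status` field keeps a partial digest from being mistaken for a full one.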
1. Resolve the repo path.
   - If the user gives ~/repo, let the script expand it.
2. Run the digest before reading raw transcripts.
   - Use --format markdown unless you explicitly need JSON.
   - For a rolling window, use --last-hours X.
3. Read the clustered failure summary.
   - Use candidate_improvements as the default starting point.
4. Escalate to raw transcripts only when needed.
5. Answer in terms of agent improvement.

- Treat apply_patch verification failed as a real workflow smell even when a later patch succeeds.
- Ignore rg/grep no-match exits when there is no other error text.
- The Codex home is ~/.codex unless CODEX_HOME is set.
- … session-analyzer.
- ~/.codex/sessions
- ~/.codex/archived_sessions
- ~/.codex/scripts/sessions
- … logs_1.sqlite for tool error extraction.
- … call_id.

The JSON payload has a stable top-level schema_version.
Key top-level fields:
- status: overall digest completeness (complete or incomplete)
- target_mode: calendar_day or rolling_hours
- target_window: normalized window metadata
- relevant_sessions: repo-relevant top-level sessions with per-session failure counts
- tool_error_events: normalized meaningful failures
- error_clusters: grouped recurring failure patterns
- candidate_improvements: ranked next improvements derived from clusters

Each tool_error_events[] entry includes:

- session_id
- timestamp
- tool_name
- tool_family
- input_summary
- error_kind
- error_family
- exit_code
- recovered_later
- recovery_call_id
- target_hints

Minimal example:
{
  "schema_version": 1,
  "status": "complete",
  "target_mode": "rolling_hours",
  "tool_error_events": [
    {
      "session_id": "thread-id",
      "tool_name": "exec_command",
      "tool_family": "function",
      "error_kind": "shell_globbing",
      "error_family": "agent_command_bug",
      "exit_code": 1,
      "recovered_later": true
    }
  ],
  "candidate_improvements": [
    {
      "cluster_key": "shell_globbing",
      "title": "Quote shell paths before opening files with glob characters",
      "score": 12
    }
  ]
}
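A consumer of the JSON output might reduce a payload like the one above to a short summary. The sketch below uses only field names from the example. `summarize_digest` is a hypothetical helper, and two things are assumptions rather than documented behavior: that a higher `score` means higher priority, and that version 1 is the only `schema_version` to accept.

```python
import json
from collections import Counter

SUPPORTED_SCHEMA_VERSION = 1  # taken from the minimal example; an assumption beyond that


def summarize_digest(payload_text: str) -> dict:
    """Reduce a digest payload to counts and ranked improvement titles."""
    payload = json.loads(payload_text)
    if payload.get("schema_version") != SUPPORTED_SCHEMA_VERSION:
        raise ValueError("unsupported schema_version")
    events = payload.get("tool_error_events", [])
    # Count failures per error_kind to see which patterns recur.
    by_kind = Counter(event.get("error_kind", "unknown") for event in events)
    # Failures the agent never recovered from deserve a closer look.
    unrecovered = [e for e in events if not e.get("recovered_later")]
    # Assumption: a higher score means a more valuable improvement.
    ranked = sorted(
        payload.get("candidate_improvements", []),
        key=lambda item: item.get("score", 0),
        reverse=True,
    )
    return {
        "complete": payload.get("status") == "complete",
        "events_by_kind": dict(by_kind),
        "unrecovered_count": len(unrecovered),
        "top_titles": [item.get("title", "") for item in ranked],
    }
```

Rejecting an unknown `schema_version` up front means a future format change fails loudly instead of producing a misleading summary.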