assets/skills/analyze-project/SKILL.md
Forensic root cause analyzer for Antigravity sessions. Classifies scope deltas, rework patterns, root causes, hotspots, and auto-improves prompts/health.
npx skillsauth add aliabbaschadhar/agent-superpowers analyze-project

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Analyze AI-assisted coding sessions in brain/ and produce a diagnostic report that explains not just what happened, but why it happened, who/what caused it, and what should change next time.
This workflow is not a simple metrics dashboard. It is a forensic analysis workflow for AI coding sessions.
For each session, determine:
`.resolved.N` counts as signals of iteration intensity, not proof of failure

- `conversation_id`
- `title`
- `objective`
- `created`
- `last_modified`

Output: indexed list of conversations to analyze.
For each conversation, read all structured artifacts that exist.
- `task.md`
- `implementation_plan.md`
- `walkthrough.md`
- `*.metadata.json`
- `task.md.resolved.0 ... N`
- `implementation_plan.md.resolved.0 ... N`
- `walkthrough.md.resolved.0 ... N`
- `.md` artifacts

- `has_task`
- `has_plan`
- `has_walkthrough`
- `is_completed`
- `is_abandoned_candidate` = has task but no walkthrough
- `task_versions`
- `plan_versions`
- `walkthrough_versions`
- `extra_artifacts`
- `task_items_initial`
- `task_items_final`
- `task_completed_pct`
- `scope_delta_raw`
- `scope_creep_pct_raw`
- `created_at`
- `completed_at`
- `duration_minutes`
- `objective_text`
- `initial_plan_summary`
- `final_plan_summary`
- `initial_task_excerpt`
- `final_task_excerpt`
- `walkthrough_summary`
- `mentioned_files_or_subsystems`
- `validation_requirements_present`
- `acceptance_criteria_present`
- `non_goals_present`
- `scope_boundaries_present`
- `file_targets_present`
- `constraints_present`

For each conversation, score the opening objective/request on a 0–2 scale for each dimension:
Create:
- `prompt_sufficiency_score`
- `prompt_sufficiency_band` = High / Medium / Low

Then note which missing ingredients likely contributed to later friction.
Important: Do not assume a low-detail prompt is bad by default. Short prompts can still be good if the task is narrow and the repo context is obvious.
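The scoring step can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the dimension names are borrowed from the `*_present` fields, and the band cutoffs are hypothetical, since neither is specified here.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed dimensions, borrowed from the *_present fields; the real list may differ.
DIMENSIONS = (
    "validation_requirements",
    "acceptance_criteria",
    "non_goals",
    "scope_boundaries",
    "file_targets",
    "constraints",
)


@dataclass
class PromptSufficiency:
    score: int
    band: str
    missing: list[str]


def score_prompt(scores: dict[str, int]) -> PromptSufficiency:
    """Sum per-dimension 0-2 scores and map the total to a band."""
    for name in DIMENSIONS:
        if scores.get(name, 0) not in (0, 1, 2):
            raise ValueError(f"{name} must be scored 0, 1, or 2")

    total = sum(scores.get(name, 0) for name in DIMENSIONS)
    max_total = 2 * len(DIMENSIONS)

    # Hypothetical cutoffs: at least two thirds of the maximum is High,
    # at least one third is Medium, anything lower is Low.
    if 3 * total >= 2 * max_total:
        band = "High"
    elif 3 * total >= max_total:
        band = "Medium"
    else:
        band = "Low"

    # Dimensions scored 0 are the candidate "missing ingredients".
    missing = [name for name in DIMENSIONS if scores.get(name, 0) == 0]
    return PromptSufficiency(total, band, missing)
```

The band is only a summary. In line with the caution above, a Low band on a narrow, obvious task should not be read as a defect by itself, so the `missing` list is more useful than the band when explaining later friction.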
Do not treat all scope growth as the same.
For each conversation, classify scope delta into:
New items clearly introduced beyond the initial ask. Examples:
Work that was not in the opening ask but appears required to complete it correctly. Examples:
Work that appears not requested and not necessary, likely introduced by agent overreach.
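Classification starts from the raw numbers (`task_items_initial`, `task_items_final`, `scope_delta_raw`, `scope_creep_pct_raw`, `task_completed_pct`). The sketch below shows one way to derive them. It assumes that `task.md` tracks work as Markdown checkboxes and that percentages are taken relative to the initial item count; neither assumption is confirmed by this document.

```python
from __future__ import annotations

import re

# Assumption: task items look like "- [ ] item" or "- [x] item".
ITEM = re.compile(r"^[ \t]*[-*][ \t]+\[(.)\]", re.MULTILINE)


def count_items(task_md: str) -> tuple[int, int]:
    """Return (total checklist items, items marked done)."""
    marks = ITEM.findall(task_md)
    done = sum(1 for mark in marks if mark in "xX")
    return len(marks), done


def raw_scope_metrics(initial_md: str, final_md: str) -> dict:
    """Compare the first and last task.md versions of one conversation."""
    initial_total, _ = count_items(initial_md)
    final_total, final_done = count_items(final_md)
    delta = final_total - initial_total
    return {
        "task_items_initial": initial_total,
        "task_items_final": final_total,
        "scope_delta_raw": delta,
        # Undefined when there was no initial checklist to grow from.
        "scope_creep_pct_raw": (
            round(100 * delta / initial_total, 1) if initial_total else None
        ),
        "task_completed_pct": (
            round(100 * final_done / final_total, 1) if final_total else None
        ),
    }
```

These numbers say only how much the task list grew, not why. Deciding which kind of scope change occurred still requires reading the added items against the opening ask.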
For each conversation record:
- `scope_change_type_primary`
- `scope_change_type_secondary` (optional)
- `scope_change_confidence`

Do not just count revisions. Determine the shape of session rework.
Classify each conversation into one of these patterns:
Record:
- `rework_shape`
- `rework_shape_confidence`

For every non-clean session, assign:
Choose one:
- `SPEC_AMBIGUITY`
- `HUMAN_SCOPE_CHANGE`
- `REPO_FRAGILITY`
- `AGENT_ARCHITECTURAL_ERROR`
- `VERIFICATION_CHURN`
- `LEGITIMATE_TASK_COMPLEXITY`

Optional if a second factor materially contributed.
Every root cause assignment must include:
`SPEC_AMBIGUITY`: Use when the opening ask lacked boundaries, targets, criteria, or constraints, and the plan had to invent them.
`HUMAN_SCOPE_CHANGE`: Use when the task set expanded due to new asks, broadened goals, or post-hoc additions.
`REPO_FRAGILITY`: Use when hidden coupling, unclear architecture, brittle files, or environmental issues forced extra work.
`AGENT_ARCHITECTURAL_ERROR`: Use when the agent chose the wrong approach, wrong files, wrong assumptions, or hallucinated structure.
`VERIFICATION_CHURN`: Use when implementation mostly succeeded but tests, validation, QA, or fixes created repeated loops.
`LEGITIMATE_TASK_COMPLEXITY`: Use when revisions were reasonable given the difficulty and do not strongly indicate avoidable failure.
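One way to keep root cause assignments consistent is to model the six categories as an enumeration and reject assignments that lack supporting evidence. The sketch below is illustrative only; the `RootCauseAssignment` record and its `evidence` field are hypothetical names, not part of the skill.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class RootCause(Enum):
    """The six root cause categories, each with a short usage hint."""

    SPEC_AMBIGUITY = "opening ask lacked boundaries, targets, criteria, or constraints"
    HUMAN_SCOPE_CHANGE = "task set expanded through new asks or post-hoc additions"
    REPO_FRAGILITY = "hidden coupling, brittle files, or environment issues"
    AGENT_ARCHITECTURAL_ERROR = "wrong approach, files, assumptions, or hallucinated structure"
    VERIFICATION_CHURN = "implementation worked but validation caused repeated loops"
    LEGITIMATE_TASK_COMPLEXITY = "revisions were reasonable given the difficulty"


@dataclass
class RootCauseAssignment:
    # Hypothetical record shape: one primary cause, free-text evidence,
    # and an optional secondary cause.
    primary: RootCause
    evidence: str
    secondary: RootCause | None = None

    def __post_init__(self) -> None:
        if not self.evidence.strip():
            raise ValueError("a root cause assignment needs supporting evidence")
        if self.secondary is self.primary:
            raise ValueError("secondary cause must differ from the primary cause")
```

For example, `RootCauseAssignment(RootCause.SPEC_AMBIGUITY, "plan invented acceptance criteria")` is accepted, while an assignment with empty evidence raises `ValueError`.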
Across all conversations, cluster repeated struggle by subsystem, folder, or file mentions.
Examples:
- `frontend/auth/*`
- `db.py`
- `ui.py`
- `video_pipeline/*`

For each cluster, calculate:
Output the top recurring friction zones.
Goal: Identify whether struggle is prompt-driven, agent-driven, or concentrated in specific repo areas.
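A minimal sketch of the clustering step, assuming each struggling conversation carries a list of mentioned paths (the `mentioned_files_or_subsystems` field). Grouping by parent folder and the `min_sessions` threshold are illustrative choices, not rules taken from the skill.

```python
from __future__ import annotations

from collections import defaultdict
from pathlib import PurePosixPath


def cluster_key(mention: str) -> str:
    """Group nested paths by parent folder; keep top-level files as-is."""
    parent = PurePosixPath(mention).parent
    return mention if str(parent) == "." else f"{parent}/*"


def find_hotspots(
    sessions: dict[str, list[str]], min_sessions: int = 2
) -> list[tuple[str, int]]:
    """Rank clusters by how many distinct conversations mention them.

    `sessions` maps a conversation id to the paths it mentions.
    """
    seen: dict[str, set[str]] = defaultdict(set)
    for conversation_id, mentions in sessions.items():
        for mention in mentions:
            seen[cluster_key(mention)].add(conversation_id)

    recurring = [(key, len(ids)) for key, ids in seen.items() if len(ids) >= min_sessions]
    # Most frequently mentioned first; ties broken alphabetically.
    return sorted(recurring, key=lambda item: (-item[1], item[0]))
```

Counting distinct conversations rather than raw mentions keeps one noisy session from creating a hotspot on its own.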
Compare these cohorts:
For each comparison, identify:
Do not merely restate averages. Extract causal-looking patterns cautiously and label them as inference where appropriate.
Generate 3–7 findings that are not simple metric restatements.
Good examples:
Bad examples:
Each finding must include:
Create session_analysis_report.md in the current conversation’s brain folder.
Use this structure:
Generated: [timestamp]
Conversations Analyzed: [N]
Date Range: [earliest] → [latest]
| Metric | Value | Rating |
|:---|:---|:---|
| First-Shot Success Rate | X% | 🟢/🟡/🔴 |
| Completion Rate | X% | 🟢/🟡/🔴 |
| Avg Scope Growth | X% | 🟢/🟡/🔴 |
| Replan Rate | X% | 🟢/🟡/🔴 |
| Median Duration | Xm | — |
| Avg Revision Intensity | X | 🟢/🟡/🔴 |
Then include a short narrative summary:
| Root Cause | Count | % | Notes |
|:---|:---|:---|:---|
| Spec Ambiguity | X | X% | ... |
| Human Scope Change | X | X% | ... |
| Repo Fragility | X | X% | ... |
| Agent Architectural Error | X | X% | ... |
| Verification Churn | X | X% | ... |
| Legitimate Task Complexity | X | X% | ... |
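The root cause breakdown table can be generated from the per-session primary causes. The sketch below is one possible approach; the function name is hypothetical, and the Notes column is left empty because it calls for human-written commentary.

```python
from __future__ import annotations

from collections import Counter

# Display labels in the order the report table lists them.
LABELS = {
    "SPEC_AMBIGUITY": "Spec Ambiguity",
    "HUMAN_SCOPE_CHANGE": "Human Scope Change",
    "REPO_FRAGILITY": "Repo Fragility",
    "AGENT_ARCHITECTURAL_ERROR": "Agent Architectural Error",
    "VERIFICATION_CHURN": "Verification Churn",
    "LEGITIMATE_TASK_COMPLEXITY": "Legitimate Task Complexity",
}


def render_root_cause_table(primary_causes: list[str]) -> str:
    """Build the Markdown breakdown table from one primary cause per session."""
    unknown = sorted(set(primary_causes) - set(LABELS))
    if unknown:
        raise ValueError(f"unknown root causes: {unknown}")

    counts = Counter(primary_causes)
    total = len(primary_causes)
    lines = ["| Root Cause | Count | % | Notes |", "|:---|:---|:---|:---|"]
    for key, label in LABELS.items():
        n = counts[key]
        pct = round(100 * n / total) if total else 0
        lines.append(f"| {label} | {n} | {pct}% | |")
    return "\n".join(lines)
```

Every category gets a row even when its count is zero, so reports from different runs stay comparable.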
Separate:
Show top offenders in each category.
Summarize how sessions tend to fail:
Cluster repeated struggle by subsystem/file/domain. Show which areas correlate with:
List the cleanest sessions and extract what made them work:
List 3–7 high-value findings with evidence and confidence.
Each recommendation must use this format:
Recommendations must be specific, not generic.
| # | Title | Duration | Scope Δ | Plan Revs | Task Revs | Root Cause | Rework Shape | Complete? |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|
Add short notes only where meaningful.
`~/.gemini/antigravity/.agent/skills/project-health-state/SKILL.md`

Update:
Create prompt_improvement_tips.md
Do not give generic advice. Instead extract:
If multiple struggle sessions cluster around the same subsystem or repeated sequence, recommend:
Only recommend workflows when the pattern appears repeatedly.
The workflow must produce:
If evidence is weak, say so. Do not overclaim. Prefer explicit uncertainty over fake precision.
How to invoke this skill
Just say any of these in a new conversation:
The agent will automatically discover and use the skill.