Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

itmegirish/.claude/skills/draft-benchmark

Name: .claude/skills/draft-benchmark
Author: itmegirish

.claude/skills/draft-benchmark/SKILL.md

npx skillsauth add itmegirish/boardingmcp-server .claude/skills/draft-benchmark


SKILL: draft-benchmark

Use when: user asks to run draft pipeline benchmarks, fill Draft_test.xlsx, compare with ChatGPT-5.4, or analyze draft errors.

Instructions for Claude

When this skill is invoked, follow these steps ONE BY ONE:

Step 1: Run Pipeline on Each Scenario (Fill "Draft Agent" Column C)

For each of the 10 scenarios in docs/Draft_test.xlsx column B:

1. Read the scenario text from the Excel file using openpyxl
2. Run the drafting pipeline:

   ```python
   import asyncio, openpyxl, sys, time
   from pathlib import Path

   # Make the backend package importable (this path is specific to the author's machine)
   sys.path.insert(0, str(Path("c:/Girish/Fundamental_Projects/ActionAi/Agent_steer_backend/BoardingMcp-Server")))
   from app.agents.drafting_agents.drafting_graph import get_drafting_graph

   graph = get_drafting_graph()

   async def run_scenario(scenario_text: str) -> dict:
       # ainvoke is a coroutine, so it has to be awaited inside an async function
       return await graph.ainvoke({"user_request": scenario_text})

   # scenario_text is the column B value read in step 1
   result = asyncio.run(run_scenario(scenario_text))
   ```

3. Extract draft text from result:
   - Try result["final_draft"]["draft_artifacts"][0]["text"] first
   - Fallback to result["draft"]["draft_artifacts"][0]["text"]
4. Write the draft text into column C of that row
5. Save Excel after each scenario
6. Print: scenario number, word count, time taken
7. Move to next scenario

Do this ONE scenario at a time. Do NOT batch. Save after each one.
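The Step 1 loop could be sketched roughly as follows. This is a minimal sketch, not the project's runner script: `extract_draft_text` and `fill_draft_column` are hypothetical helper names, `run_pipeline` stands in for the `graph.ainvoke` call, and the layout (scenarios on the active sheet, header in row 1, scenarios in rows 2 to 11) is an assumption the skill does not state.

```python
import time
from typing import Callable


def extract_draft_text(result: dict) -> str:
    """Return the first draft artifact's text, preferring "final_draft" over "draft"."""
    for key in ("final_draft", "draft"):
        artifacts = (result.get(key) or {}).get("draft_artifacts") or []
        if artifacts and artifacts[0].get("text"):
            return artifacts[0]["text"]
    raise KeyError("no draft text under 'final_draft' or 'draft'")


def fill_draft_column(xlsx_path: str, run_pipeline: Callable[[str], dict],
                      first_row: int = 2, count: int = 10) -> None:
    """Fill column C one scenario at a time, saving the workbook after each row."""
    import openpyxl  # imported lazily so extract_draft_text works without it

    wb = openpyxl.load_workbook(xlsx_path)
    ws = wb.active  # assumption: scenarios live on the active sheet
    for n, row in enumerate(range(first_row, first_row + count), start=1):
        scenario_text = ws.cell(row=row, column=2).value  # column B
        started = time.perf_counter()
        draft = extract_draft_text(run_pipeline(scenario_text))
        ws.cell(row=row, column=3, value=draft)           # column C
        wb.save(xlsx_path)                                # save after each scenario
        elapsed = time.perf_counter() - started
        print(f"Scenario {n}: {len(draft.split())} words, {elapsed:.1f}s")
```

Saving inside the loop, rather than once at the end, means a failure on a later scenario does not lose the drafts already written.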

Step 2: Error Analysis + Compare (Fill Columns E and F)

After all 10 drafts are filled, for each scenario:

Read column C (Draft Agent) and column D (ChatGPT-5.4) from the Excel
Run error analysis on BOTH drafts using research/run_draft_benchmark.py --compare OR manually check:

Error Categories to Check:

| Category | Severity | What to Look For |
|----------|----------|------------------|
| Fabrication | CRITICAL (-2.0) | Invented AIR/SCC/ILR citations, fake annexures for documents not in input, invented events/dates |
| Wrong Statute | CRITICAL (-2.0) | Indian Evidence Act 1872 (repealed→BSA 2023), CrPC 1973 (repealed→BNSS 2023), IPC (repealed→BNS 2023), phantom S.27A SRA |
| Missing Section | HIGH (-1.0) | No verification clause, no prayer, no jurisdiction, no cause of action, no valuation, no court fee |
| Legal Error | HIGH/MEDIUM (-1.0/-0.5) | Limitation anchored to notice date, pendente lite cites S.34 CPC (should be Order XX Rule 11), facts-law section mixing, "and/or" usage |
| Placeholder Excess | MEDIUM (-0.5) | More than 15 {{PLACEHOLDER}} in one draft |
| Structural | MEDIUM/LOW (-0.5/-0.25) | Missing paragraph numbers, non-continuous numbering, no continuous numbering through document |

Scoring: Start at 10.0, deduct per severity above. Min 0, max 10.
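As a rough illustration of the scoring rule, the sketch below applies one deduction per finding and clamps the result. The function name `score_draft` and the idea of passing findings as a list of severity labels are assumptions made for the example; the deduction values come from the table above.

```python
# Deduction per severity level, as listed in the error category table.
DEDUCTIONS = {"CRITICAL": 2.0, "HIGH": 1.0, "MEDIUM": 0.5, "LOW": 0.25}


def score_draft(severities: list[str]) -> float:
    """Start at 10.0, subtract one deduction per finding, clamp to [0, 10]."""
    total = 10.0 - sum(DEDUCTIONS[s] for s in severities)
    return max(0.0, min(10.0, total))


print(score_draft(["CRITICAL", "HIGH", "MEDIUM"]))  # 6.5
```

The clamp matters for weak drafts: six CRITICAL findings would otherwise give -2.0 instead of 0.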

Write into column E ("Compare"):

Winner: [pipeline/chatgpt] ([score diff])
Pipeline: [score]/10 ([word_count]w, [placeholder_count] placeholders)
ChatGPT: [score]/10 ([word_count]w, [placeholder_count] placeholders)
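One way to build this cell text is sketched below. The helper names are hypothetical, the score difference is taken as pipeline minus ChatGPT (the skill does not fix a sign convention), and the "tie" label is an assumption because the template only lists pipeline or chatgpt as winners.

```python
import re

PLACEHOLDER = re.compile(r"\{\{[^{}]+\}\}")  # matches {{PLACEHOLDER}}-style tokens


def describe(label: str, score: float, draft: str) -> str:
    """Format one summary line: score, word count, placeholder count."""
    words = len(draft.split())
    placeholders = len(PLACEHOLDER.findall(draft))
    return f"{label}: {score:g}/10 ({words}w, {placeholders} placeholders)"


def compare_cell(pipeline_score: float, pipeline_draft: str,
                 chatgpt_score: float, chatgpt_draft: str) -> str:
    """Build the three-line text for column E."""
    diff = pipeline_score - chatgpt_score
    if diff > 0:
        winner = "pipeline"
    elif diff < 0:
        winner = "chatgpt"
    else:
        winner = "tie"  # assumption: the skill does not define tie handling
    return "\n".join([
        f"Winner: {winner} ({diff:+g})",
        describe("Pipeline", pipeline_score, pipeline_draft),
        describe("ChatGPT", chatgpt_score, chatgpt_draft),
    ])
```

The same placeholder count can feed the Placeholder Excess check, which applies when a draft contains more than 15 placeholders.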

Write into column F ("Improvements") — list every pipeline error:

[CRITICAL] Fabricated citation: AIR 2019 SC 456
[HIGH] Missing: cause of action section
[MEDIUM] Pendente lite cites S.34 CPC instead of Order XX Rule 11
[MISSING] Sections present in ChatGPT but not pipeline: schedule of property
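Some of these lines can be produced mechanically. The sketch below flags the repealed statutes named in the Wrong Statute category and emits lines in the format above. It is an illustrative assumption, not the logic of research/run_draft_benchmark.py: the patterns are simple string matches, the function name is hypothetical, and a hit only marks a candidate for manual review.

```python
import re

# Repealed statute pattern -> replacement named in the Wrong Statute category.
REPEALED = {
    r"Indian Evidence Act": "BSA 2023",
    r"\bCrPC\b": "BNSS 2023",
    r"\bIPC\b": "BNS 2023",
}


def wrong_statute_findings(draft: str) -> list[str]:
    """Return one [CRITICAL] line per repealed statute cited in the draft."""
    findings = []
    for pattern, replacement in REPEALED.items():
        match = re.search(pattern, draft)
        if match:
            findings.append(
                f"[CRITICAL] Wrong statute: {match.group(0)} "
                f"(repealed, use {replacement})"
            )
    return findings
```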

Save Excel after each scenario comparison

Step 3: Print Summary

After all 10 are compared, print:

Average score: Pipeline vs ChatGPT
Win count: Pipeline X | ChatGPT Y
Top 3 error categories hurting pipeline (by frequency)
Which scenarios pipeline loses worst on
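A possible shape for this summary is sketched below. The input format (one dict per scenario with `scenario`, `pipeline`, `chatgpt`, and `categories` keys) and the function name are assumptions for the example; the skill only specifies what must be printed.

```python
from collections import Counter
from statistics import mean


def summarize(results: list[dict]) -> dict:
    """Aggregate per-scenario scores into the Step 3 summary figures."""
    wins = Counter(
        "pipeline" if r["pipeline"] > r["chatgpt"] else
        "chatgpt" if r["pipeline"] < r["chatgpt"] else "tie"
        for r in results
    )
    # Count how often each error category was recorded against the pipeline.
    categories = Counter(c for r in results for c in r["categories"])
    # Scenarios the pipeline lost, largest losing margin first.
    losses = sorted(
        (r for r in results if r["pipeline"] < r["chatgpt"]),
        key=lambda r: r["pipeline"] - r["chatgpt"],
    )
    return {
        "avg_pipeline": mean(r["pipeline"] for r in results),
        "avg_chatgpt": mean(r["chatgpt"] for r in results),
        "wins": dict(wins),
        "top_categories": [name for name, _ in categories.most_common(3)],
        "worst_scenarios": [r["scenario"] for r in losses],
    }
```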

Key Files

Excel: docs/Draft_test.xlsx (columns: s.no, Civil Draft Scenarios, Draft Agent, Chatgpt-5.4, Compare, Improvements)
Pipeline: app/agents/drafting_agents/drafting_graph.py → get_drafting_graph()
Runner script: research/run_draft_benchmark.py (--draft or --compare mode)
Reports saved to: research/output/

Running Via Script (Alternative)

```bash
# Fill Draft Agent column
agent_steer/Scripts/python.exe research/run_draft_benchmark.py --draft

# Compare + fill Compare and Improvements columns
agent_steer/Scripts/python.exe research/run_draft_benchmark.py --compare
```


development

Updated Apr 6, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add itmegirish/boardingmcp-server .claude/skills/draft-benchmark

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy (container and dependency vulnerability scanner): Clean, 95%
- Semgrep (static code analysis for vulnerabilities): Clean, 95%
- mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
- Snyk (dep) (open source security scanning): Skipped, 50%
- Socket.dev (supply chain security analysis): Skipped, 50%
- VirusTotal (multi-engine malware detection): Skipped, 50%
- CrowdStrike (advanced threat intelligence): Skipped, 50%
- OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
- OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 6, 2026, 2:17 AM (6.2s, 1 file scanned)


Related Skills

itmegirish/.claude/skills/v9-architecture

development

Verified, Trusted, Community

# SKILL: v9-architecture Use when: planning, building, or reviewing v11.0 architecture components (LKB 2-layer model, document schemas, structured prompt builder, gates, family migrations). ## v11.0 Architecture — Scalable Context-Driven Pipeline ### Core Principles 1. **Better context to LLM = better draft** — no complex engine needed 2. **Separate law from structure** — cause type (92) × document type (12) = 1,104 combinations 3. **Decide law before drafting, enforce law after drafting** #

SKILL.md, updated Apr 6, 2026

itmegirish/.claude/skills/test-draft-pipeline

development

Verified, Trusted, Community

# SKILL: test-draft-pipeline ## Purpose Run the drafting pipeline, evaluate output quality, and verify all 4 gates + review work correctly. ## When to Use - After modifying any pipeline node, gate, or prompt - After creating or updating an exemplar or LKB entry - For regression testing across multiple scenarios - For debugging pipeline failures ## Test Runners ### Quick Test (single scenario) ```bash agent_steer/Scripts/python.exe research/run_draft_live.py ``` ### Unit Tests ```bash agent_

SKILL.md, updated Apr 6, 2026

itmegirish/.claude/skills/template-builder

development

Verified, Trusted, Community

# SKILL: exemplar-builder ## Purpose Create, validate, and maintain document schemas and LKB Layer 2 data for the v11.0 scalable drafting pipeline. **v11.0 approach:** No exemplar documents in prompts. Instead: LKB 2-layer data + document schema → structured prompt → LLM drafts. ## When to Use - Creating a new document schema (e.g., written_statement, appeal_memo) - Enriching LKB entries with Layer 2 data (available_reliefs, jurisdiction_basis) - Reviewing schema quality against CPC rules - A

SKILL.md, updated Apr 6, 2026

itmegirish/.claude/skills/section-validator

development

Verified, Trusted, Community

# SKILL: section-validator ## Purpose Build and maintain the 4 deterministic verification gates (Stage 3). Gates run on the full draft text with zero LLM calls. They validate, auto-fix formatting, and flag issues for review. ## When to Use - Building or modifying any gate - Adding new entity extraction patterns - Debugging false positives / false negatives - Extending verified provisions coverage ## Architecture Context (v5.1 — what's running) 4 gates run sequentially on `draft.draft_artifac

SKILL.md, updated Apr 6, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.


Install manually


```bash
# Clone the repo
git clone https://github.com/itmegirish/boardingmcp-server.git

# Copy into Claude Code skills folder (global)
cp -r boardingmcp-server/.claude/skills/draft-benchmark ~/.claude/skills/
```

Claude Code Skills — official skills path docs.

Repository

itmegirish/boardingmcp-server

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT