skills/forgewright/skills/production-grade/SKILL.md
Orchestrates software engineering work — build apps, add features, fix bugs, refactor code, review PRs, write tests, deploy services, audit security, design architecture, generate docs, optimize performance, debug issues, or explore ideas. Any coding or development request gets routed to the right specialized skills automatically.
npx skillsauth add ouakar/web-hosting-ubinarys-dental production-grade
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
!git status 2>/dev/null || echo "No git repo detected"
!cat ANTIGRAVITY.md 2>/dev/null || echo "No ANTIGRAVITY.md found"
!ls .forgewright/ 2>/dev/null || echo "No existing workspace"
!cat .production-grade.yaml 2>/dev/null || echo "No config file — defaults apply"
Adaptive meta-skill orchestrator for all software engineering work. Analyzes the user's request, identifies which skills are needed, builds a minimal task graph, and executes — from a single code review to a full 17-skill greenfield build.
17 skills, one orchestrator. The orchestrator routes to the right skills based on what the user actually needs. No forced full-pipeline execution for everyday tasks.
All skills are bundled in this plugin. Single install, everything included.
Every skill invocation is wrapped by an ordered middleware chain. Read skills/_shared/protocols/middleware-chain.md for the full specification.
Pre-Skill: ① SessionData → ② ContextLoader → ③ SkillRegistry → ④ Guardrail → ⑤ Summarization
═══ SKILL EXECUTION ═══
Post-Skill: ⑥ QualityGate → ⑦ BrownfieldSafety → ⑧ TaskTracking → ⑨ Memory → ⑩ GracefulFailure
Skills are loaded on-demand based on classified mode. Read .forgewright/skills-config.json for the mode→skill mapping.
Instead of loading all 50 skill descriptions (~66KB), only load skills relevant to the mode:
Review mode → loads 1 skill (~3KB)
Feature mode → loads 5 skills (~15KB)
Full Build → loads 10 skills (~30KB)
Fallback → load all skills (classification failure)
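The on-demand loading described above could be sketched roughly as follows. The JSON shape (a `modes` object plus an `all_skills` list), the skill identifiers, and the function name `skills_for_mode` are assumptions made for illustration; the actual schema of `.forgewright/skills-config.json` is not shown in this document.

```python
import json
from typing import List, Optional

# Hypothetical config shape; the real skills-config.json schema may differ.
EXAMPLE_CONFIG = json.loads("""
{
  "modes": {
    "review": ["code-reviewer"],
    "test": ["qa-engineer"]
  },
  "all_skills": ["code-reviewer", "qa-engineer", "solution-architect"]
}
""")


def skills_for_mode(config: dict, mode: Optional[str]) -> List[str]:
    """Return only the skills mapped to `mode`.

    Falls back to every skill when classification failed (mode is None)
    or when the mode has no mapping in the config.
    """
    mapped = config.get("modes", {}).get(mode) if mode else None
    if mapped:
        return list(mapped)
    return list(config.get("all_skills", []))
```

With this sketch, `skills_for_mode(EXAMPLE_CONFIG, "review")` loads a single skill, while a `None` mode takes the fallback path and loads everything.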
Before any execution, classify the user's request into a mode. This determines which skills run and how.
Before classifying, check if this session is managed by Paperclip:
Paperclip indicators: ticket reference (#42, CLIP-, [paperclip]),
heartbeat context, budget mention, agent identity
If detected:
skills/_shared/protocols/paperclip-integration.md
If not detected → proceed normally (no changes).
Step 1 — Analyze the request:
Read $ARGUMENTS and the user's message. Classify into one of these modes:
| Mode | Trigger Signals | Skills Involved |
|------|----------------|-----------------|
| Full Build | "build a SaaS", "production grade", "from scratch", "full stack", greenfield intent | All skills, full DEFINE→BUILD→HARDEN→SHIP→SUSTAIN→GROW pipeline |
| Feature | "add [feature]", "implement [feature]", "new endpoint", "new page", "integrate [service]" | BA (if gaps detected) → PM (scoped) → Architect (scoped) → BE/FE → QA |
| Harden | "review", "audit", "secure", "harden", "before launch", "production ready" (on EXISTING code) | Security + QA + Code Review (sequential) → Remediation |
| Ship | "deploy", "CI/CD", "containerize", "infrastructure", "terraform", "docker" | DevOps → SRE |
| Debug | "debug", "fix bug", "broken", "investigate", "not working", "error", "trace", "crashes" | Debugger (→ Software/Frontend Engineer for fix) |
| AI Build | "AI feature", "chatbot", "RAG", "embeddings", "LLM", "agent", "prompt", "AI-powered", "scrape", "crawl website" | AI Engineer + Prompt Engineer + Data Scientist + Web Scraper (if web data) + Architect (scoped) → BE/FE |
| Migrate | "migrate", "upgrade", "migration", "database change", "schema change", "refactor DB", "move to" | Database Engineer + Software Engineer → QA |
| Test | "write tests", "test coverage", "test this", "add tests" | QA |
| Review | "review my code", "code review", "code quality", "check my code" | Code Reviewer |
| Architect | "design", "architecture", "API design", "data model", "tech stack", "how should I structure" | Solution Architect |
| Document | "document", "write docs", "API docs", "README" | Technical Writer |
| Explore | "explain", "understand", "help me think", "what should I", "I'm not sure" | Polymath |
| Research | "research", "deep research", "find sources", "analyze topic", "investigate [domain]" | Polymath (research mode) + NotebookLM MCP (optional) + crawl4ai (optional, for JS-rendered sites) |
| Optimize | "performance", "slow", "optimize", "scale", "reliability" | Performance Engineer + SRE + Code Reviewer |
| Design | "design UI", "wireframes", "design system", "color palette", "UX flow" | UX Researcher → UI Designer |
| Mobile | "mobile app", "React Native", "Flutter", "iOS", "Android" | BA (if gaps detected) → Mobile Engineer (+ PM scoped, Architect scoped if needed) |
| Game Build | "game", "Unity", "Unreal", "Godot", "Roblox", "gameplay", "game design", "build a game" | Game Designer → Engine Engineer (Unity/Unreal/Godot/Roblox) → Level/Narrative/TechArt/Audio |
| XR Build | "VR", "AR", "MR", "XR", "spatial", "Quest", "Vision Pro", "WebXR" | XR Engineer (+ Game Build pipeline if game-like XR) |
| Marketing | "marketing", "SEO", "launch strategy", "copywriting", "content strategy", "go-to-market" | Growth Marketer (+ Conversion Optimizer if CRO mentioned) |
| Grow | "growth", "CRO", "conversion", "funnel", "A/B test", "churn", "retention", "referral" | Conversion Optimizer (+ Growth Marketer if strategy needed) |
| Analyze | "analyze requirements", "evaluate this", "is this feasible", "validate requirements", "check completeness", "client says" | Business Analyst (standalone requirements analysis) |
| Custom | Doesn't fit above patterns | Present skill menu, let user pick |
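To make the routing idea concrete, here is a deliberately naive sketch that matches a request against a small subset of the trigger phrases in the table. The orchestrator itself classifies requests from their intent, not by substring matching, so the `TRIGGERS` dictionary and the `classify` function are purely illustrative assumptions.

```python
# Illustrative subset of the trigger table above; not the full routing logic.
TRIGGERS = {
    "Debug": ["debug", "fix bug", "broken", "not working", "crashes"],
    "Test": ["write tests", "test coverage", "add tests"],
    "Review": ["review my code", "code review", "check my code"],
    "Ship": ["deploy", "ci/cd", "containerize", "terraform", "docker"],
}


def classify(request: str) -> str:
    """Pick the mode with the most trigger phrases found in the request.

    Returns "Custom" when nothing matches, mirroring the table's last row.
    """
    text = request.lower()
    best_mode, best_hits = "Custom", 0
    for mode, phrases in TRIGGERS.items():
        hits = sum(1 for phrase in phrases if phrase in text)
        if hits > best_hits:
            best_mode, best_hits = mode, hits
    return best_mode
```

A request such as "Please deploy this with Docker" hits two Ship phrases and is routed to Ship, while a request with no known phrases falls through to Custom.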
Step 2 — Present or skip the plan:
Single-skill modes (Test, Review, Architect, Document, Explore, Design, Debug, Analyze): Skip plan presentation. Classify → invoke immediately. The intent is obvious — no overhead needed.
Multi-skill modes (Feature, Harden, Ship, Optimize, AI Build, Migrate, Custom): Present the plan for confirmation via notify_user:
Here's my plan:
[numbered list of skills and what each does]
Scope: [light / moderate / heavy]
1. **Looks good — start (Recommended)** — Execute this plan
2. **I want the full production-grade pipeline** — Run all 17 skills, 5 phases, 3 gates
3. **Adjust the plan** — Add or remove skills from the plan
4. **Chat about this** — Free-form input
Full Build mode: Always proceed to the Full Build Pipeline section below.
If the user selects "full pipeline" from any mode, switch to Full Build.
Step 3 — Execute the mode:
For non-Full-Build modes, use the lightweight execution flows below. For Full Build, use the Full Build Pipeline.
Read codingLevel from .production-grade.yaml (default: 8). Adapt ALL skill output accordingly:
# .production-grade.yaml
codingLevel: 8 # 1-10 scale (default: 8 = senior/terse)
| Level | Style | Output Behavior |
|-------|-------|-----------------|
| 1-3 (Junior) | Guided | Detailed explanations for every decision. Inline comments on complex logic. Link to relevant docs/tutorials. Explain WHY, not just WHAT. Step-by-step instructions for manual steps. |
| 4-7 (Mid) | Standard | Balanced output — explain non-obvious decisions, skip the obvious. Standard inline comments. Focus on trade-offs and alternatives. |
| 8-10 (Senior) | Terse | Code-focused, minimal commentary. Only flag unexpected decisions or gotchas. Diff-style output preferred. No tutorials, no hand-holding. Assume deep familiarity with tools and patterns. |
Rules:
If codingLevel is not set, default to Standard (5)
All skills MUST follow the sensitive file protection protocol:
!cat skills/_shared/protocols/sensitive-file-protection.md 2>/dev/null || echo "Protocol not found — apply defaults: never read .env without user approval, redact secrets in output, check .gitignore before commit"
ALL skills MUST run the plan quality loop before doing any work. No exceptions — every skill plans first, scores, improves until ≥ 8.0:
!cat skills/_shared/protocols/plan-quality-loop.md 2>/dev/null || echo "Protocol not found — apply defaults: every skill must plan first, score against 8 criteria, threshold 8.0/10, improve loop with research + skill self-improvement"
All modes share these behaviors:
mkdir -p skills/_shared/protocols/ .forgewright/
.production-grade.yaml for path overrides
.production-grade.yaml (see above)
Add a feature to an existing codebase. Lightweight DEFINE → BUILD → TEST.
✓ Requirements complete — skipping BA or ⧖ Information gaps detected — running BA elicitation
ba-package.md to reduce questions.
1 gate: After PM scoping (step 3), confirm scope before building.
Security + quality audit on existing code. No building, pure analysis + fixes.
1 gate: After findings (step 4), before remediation.
Get existing code deployed. Infrastructure + reliability.
1 gate: After DevOps infra plan, before applying.
Write tests for existing code. Single skill.
0 gates. QA operates autonomously.
Code quality review. Single skill, read-only.
0 gates. Read-only operation.
Design or redesign architecture. Single skill.
1 gate: Architecture approval before scaffold generation.
Generate documentation for existing code. Single skill.
0 gates. Technical Writer operates autonomously.
Thinking partner. Single skill.
0 gates. Polymath manages its own dialogue.
Deep, grounded research on any topic. Polymath + NotebookLM MCP (optional) + crawl4ai (optional).
skills/polymath/SKILL.md and invoke in research mode
search_web sweeps (3-5 parallel) to gather relevant URLs and initial understanding
read_url_content fails on key URLs (JS-rendered, anti-bot), use Polymath's Crawl4AI Deep Research pattern. Security: library-only, URL validation, output sanitization. See skills/web-scraper/SKILL.md.
server_info())
workflows/deep-research.md for detailed steps
0 gates. Polymath manages dialogue. NotebookLM and crawl4ai are enhancement layers, not requirements.
Performance + reliability analysis. Two skills.
1 gate: After analysis, before fixes.
Go-to-market strategy, content, and SEO. Primarily Growth Marketer.
1 gate: After strategy, before content creation.
Conversion optimization, experimentation, and growth engineering. Primarily Conversion Optimizer.
1 gate: After audit, before implementation.
Standalone requirements analysis and validation. Single skill.
skills/business-analyst/SKILL.md and follow its instructions
ba-package.md with validated requirements
Analysis complete. What next?
1. **Hand off to PM — write BRD from this analysis (Recommended)**
2. **Start Feature mode — build what was analyzed**
3. **Start Full Build — full pipeline from this analysis**
4. **Done — I just needed the analysis**
5. **Chat about this** — Free-form input
0 gates. BA operates autonomously. Handoff is optional.
User picks skills from a menu. Present via notify_user:
Which skills do you need? (list the numbers separated by commas)
--- Core Engineering ---
1. **Business Analyst** — Requirements elicitation, feasibility analysis, critical evaluation, information gatekeeping
2. **Product Manager** — Requirements, user stories, BRD
3. **Solution Architect** — System design, API contracts, tech stack
4. **Software Engineer** — Backend implementation
5. **Frontend Engineer** — UI components, pages, design system
6. **QA Engineer** — Tests — unit, integration, e2e, performance
7. **Security Engineer** — OWASP audit, STRIDE, AI security, runtime detection
8. **Code Reviewer** — Architecture conformance, code quality, git workflow
9. **DevOps** — Docker, CI/CD, Terraform, monitoring
10. **SRE** — SLOs, chaos engineering, runbooks
11. **Technical Writer** — API docs, dev guides, architecture docs
12. **Data Scientist** — AI/ML systems, RAG pipelines, agent orchestration
13. **Debugger** — Bug investigation, root cause analysis, regression testing
14. **Prompt Engineer** — Prompt design, evaluation, optimization
15. **API Designer** — REST/GraphQL design, endpoints, error taxonomy
16. **Database Engineer** — Schema design, migrations, query optimization
17. **AI Engineer** — MLOps, model serving, fine-tuning, evaluation
18. **Accessibility Engineer** — WCAG compliance, a11y audit, screen reader
19. **Performance Engineer** — Load testing, profiling, Core Web Vitals
20. **UX Researcher** — User research, usability testing, personas
21. **Data Engineer** — ETL pipelines, data warehouse, dbt, data quality
22. **Project Manager** — Sprint planning, velocity, risk management
23. **XLSX Engineer** — Excel spreadsheet creation, financial models, formula-driven reports, data formatting
--- Game Development ---
24. **Game Designer** — GDD, gameplay loops, economy, mechanic specs
25. **Unity Engineer** — C# game architecture, ScriptableObjects, Editor tools
26. **Unreal Engineer** — C++/Blueprint, GAS, Nanite/Lumen
27. **Godot Engineer** — GDScript, scene tree, signals, cross-platform
28. **Godot Multiplayer** — MultiplayerSpawner, ENet, prediction, dedicated server
29. **Roblox Engineer** — Luau, DataStore, Roblox Studio, experience design
30. **Level Designer** — Spatial design, encounters, pacing, environmental storytelling
31. **Narrative Designer** — Branching dialogue, character voice, lore
32. **Technical Artist** — Shaders, VFX, LOD, performance budgets
33. **Game Audio Engineer** — Spatial audio, adaptive music, SFX, mix
34. **Unity Shader Artist** — Shader Graph, HLSL, VFX Graph, post-processing
35. **Unity Multiplayer** — Netcode for GameObjects, relay, prediction
36. **Unreal Technical Artist** — Niagara, Material Editor, Lumen/Nanite
37. **Unreal Multiplayer** — Replication, dedicated server, GAS networking
38. **XR Engineer** — AR/VR/MR, spatial UI, hand tracking, comfort
--- Growth ---
39. **Growth Marketer** — Launch strategy, content, channels, SEO
40. **Conversion Optimizer** — CRO, funnel analysis, A/B testing, retention
--- Data Acquisition ---
41. **Web Scraper** — Secure web crawling (crawl4ai), URL validation, output sanitization, CSS/LLM extraction
--- Integration ---
42. **Paperclip** (optional) — Multi-agent orchestration, ticket management, budget control, heartbeat scheduling
43. **Chat about this** — Free-form input
Execute selected skills in dependency order. If user picks conflicting skills, resolve via the authority hierarchy.
Systematic bug investigation. Single skill (+ optional fix).
skills/debugger/SKILL.md and follow its instructions
1 gate: After root cause identified (step 3), before applying fix.
Build or integrate AI-powered features. Multi-skill.
2 gates: After AI architecture design (step 3-4), and after prompt evaluation (step 7).
Database migration, framework upgrade, or large-scale code migration.
2 gates: After migration plan (step 2), and after migration scripts generated (before execution).
Build a game from concept to playable build. Full game development pipeline.
.production-grade.yaml for game.engine override, or ask:
Which engine for this game?
1. **Unity** (Recommended for indie-AA, mobile, 2D/3D)
2. **Unreal Engine** (AAA quality, heavy 3D, C++/Blueprint)
3. **Godot** (Open-source, lightweight, rapid iteration)
skills/game-designer/SKILL.md — design pillars, core loop, economy, mechanic specs, player flows
skills/unity-engineer/SKILL.md — SO architecture, gameplay systems, UI, Editor tools
skills/unreal-engineer/SKILL.md — C++ architecture, GAS, AI, Blueprint layer
skills/godot-engineer/SKILL.md — scene tree, signals, Resources, export
skills/level-designer/SKILL.md — level structure, encounters, pacing, blockouts
skills/narrative-designer/SKILL.md — dialogue, characters, lore
skills/technical-artist/SKILL.md — shaders, VFX, LOD, performance budgets
skills/game-audio-engineer/SKILL.md — SFX, adaptive music, mix
skills/unity-multiplayer/SKILL.md or skills/unreal-multiplayer/SKILL.md
skills/unity-shader-artist/SKILL.md or skills/unreal-technical-artist/SKILL.md
3 gates: After Game Designer GDD (step 3), after engine architecture (step 4), and after first playable (step 9).
Build AR/VR/MR applications. XR Engineer + optional game development pipeline.
skills/xr-engineer/SKILL.md — XR setup, spatial interaction, comfort, spatial UI
2 gates: After XR architecture (step 2), and after spatial interaction playable (step 3-4).
Run silently BEFORE any execution (all modes) to ensure project intelligence is fully configured.
Step 0.1 — MCP & GitNexus Check:
.forgewright/mcp-server/mcp-config.json exists in the project root.
npx --yes gitnexus analyze
bash <path-to-forgewright-submodule>/scripts/mcp-generate.sh
ℹ Auto-initialized GitNexus index and MCP server (missing setup).
Run BEFORE any execution (all modes). Silent if current. One prompt max if update exists.
Step 0 — version check:
read_url_content to fetch https://raw.githubusercontent.com/buiphucminhtam/forgewright/main/VERSION → read the version string (this is the remote version)
production-grade v{remote} is available (you have v{local})
1. **Update to v{remote} (Recommended)** — Auto-update and restart pipeline
2. **Skip — continue with v{local}** — Use current version
git clone --depth 1 https://github.com/buiphucminhtam/forgewright.git /tmp/pg-update
rm -rf /tmp/pg-update
✓ Updated to v{remote_version}. Re-invoke /production-grade to use the new version.
If any update step fails, print a warning and continue with the current version. Never let the updater break the pipeline.
Run AFTER update check, BEFORE mode classification. Follows skills/_shared/protocols/session-lifecycle.md.
Step 0.5 — session start:
Load project profile:
.forgewright/project-profile.json exists and is fresh (<24h) → load context, skip re-onboarding
skills/_shared/protocols/project-onboarding.md)
Load last session state:
.forgewright/session-log.json exists with interrupted session → offer resume via notify_user
Load memory context:
.forgewright/code-conventions.md if exists
Detect manual changes:
Display quality trend (if history exists):
.forgewright/quality-history.json → show trend of last 5 sessions
Log: ✓ Session context loaded — [project name], last session: [summary or "first session"]
When mode is Full Build, follow this EXACT sequence:
━━━ Production Grade Pipeline v{local_version} ━━━━━━━━━━━━━━━━━━
Project: [extracted from user's message]
⧖ Bootstrapping workspace...
mkdir -p skills/_shared/protocols/
mkdir -p .forgewright/
skills/_shared/protocols/:

| Protocol File | Content |
|---------------|---------|
| ux-protocol.md | 6 UX rules: never open-ended questions, "Chat about this" last, recommended first, continuous execution, real-time progress, autonomy |
| input-validation.md | 5-step validation: read config → probe inputs in parallel → classify Critical/Degraded/Optional → print gap summary → adapt scope |
| tool-efficiency.md | Parallel tool calls, view_file_outline before view_file, find_by_name not find, grep_search not grep, config-aware paths |
| conflict-resolution.md | Authority hierarchy, dedup by file:line (keep highest severity), HARDEN→BUILD feedback loops (2 cycle max) |
| project-onboarding.md | 5-phase deep project analysis: fingerprint → health check → pattern analysis → risk assessment → profile generation |
| session-lifecycle.md | Cross-session continuity: session start/save/end hooks, resume protocol, drift detection, memory integration |
| quality-gate.md | Universal per-skill validation: 4 levels (build, regression, standards, traceability), quality scoring 0-100, configurable thresholds |
| brownfield-safety.md | Safety net for existing projects: git branching, baseline snapshots, protected paths, change manifest, regression checks, rollback |
| quality-dashboard.md | Quality scoring & reporting: real-time tracking, final dashboard, machine-readable JSON reports, cross-session trending, early warning |
| graceful-failure.md | Retry limits, stuck detection, graceful exit format, failure categories — prevents skills from looping on impossible tasks |
| code-intelligence.md | GitNexus-powered knowledge graph: impact analysis, 360° context, process tracing, pre-commit risk — optional enhancement for deep code awareness |
Read these from the plugin's skills/_shared/protocols/ directory and copy them. If plugin path is unavailable, write from the summaries above.
Codebase discovery — detect greenfield vs brownfield:
If project onboarding already ran (Step 0.5 loaded .forgewright/project-profile.json) → use cached fingerprint data. Otherwise, run scans:
Run these scans in parallel:
find_by_name("package.json"), find_by_name("go.mod"), find_by_name("pyproject.toml"), find_by_name("Cargo.toml"), find_by_name("pom.xml")
find_by_name("*", "src/"), find_by_name("*", "services/"), find_by_name("*", "frontend/"), find_by_name("*", "tests/"), find_by_name("*", "docs/")
find_by_name("Dockerfile*"), find_by_name("*", ".github/workflows/"), find_by_name("*", "infrastructure/"), find_by_name("*", "terraform/")
find_by_name(".production-grade.yaml")
Classify the project:
| Signal | Mode | Behavior |
|--------|------|----------|
| Empty/new directory, no source files | Greenfield | Create everything from scratch |
| Source files exist, no .production-grade.yaml | Brownfield (unmapped) | Deep onboarding, generate config, adapt |
| Source files + .production-grade.yaml exist | Brownfield (mapped) | Use config paths, augment existing code |
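The classification table above reduces to two boolean signals. A minimal sketch, assuming the scan results have already been collapsed into `has_source_files` and `has_config` (both names are hypothetical), and assuming that a directory with a config but no source files is still treated as greenfield, a case the table does not cover:

```python
def classify_project(has_source_files: bool, has_config: bool) -> str:
    """Map the two discovery signals to a project mode.

    has_source_files: any source files were found by the scans.
    has_config: a .production-grade.yaml file exists.
    """
    if not has_source_files:
        # Assumption: no source files means greenfield, even if a config exists.
        return "greenfield"
    if has_config:
        return "brownfield (mapped)"
    return "brownfield (unmapped)"
```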
If Greenfield → log ✓ Greenfield project — creating from scratch. Write minimal .forgewright/project-profile.json (to be populated progressively). Continue to step 5.
If Brownfield → run the enhanced adaptation sequence:
a. Deep project onboarding — run full skills/_shared/protocols/project-onboarding.md if not already done in Step 0.5. This produces:
.forgewright/project-profile.json — full fingerprint, health, patterns, risk
.forgewright/code-conventions.md — coding patterns for all skills to follow
b. Structure report — display from project profile:
⧖ Existing codebase analyzed:
Language: [fingerprint.language] | Framework: [fingerprint.framework]
Architecture: [fingerprint.architecture]
Tests: [health.test_count] ([health.test_coverage_percent]% coverage)
Health: Build [✓/✗] | Tests [✓/✗] | Lint [✓/⚠] | CVEs [count]
Risk Score: [risk.overall_risk_score]/10
Patterns: [patterns.naming_convention], [patterns.component_pattern]
c. Path mapping — if no .production-grade.yaml, generate one from discovered structure. Notify user via notify_user:
I've analyzed your existing codebase. Here's what I found:
[structure summary from project profile]
I'll map the pipeline outputs to your existing structure.
1. **Approve mapping (Recommended)** — Use detected paths, generate .production-grade.yaml
2. **Customize paths** — Review and adjust the path mapping
3. **Treat as greenfield** — Ignore existing code, create fresh structure
4. **Chat about this** — Discuss how the pipeline adapts to your codebase
d. Write .production-grade.yaml from discovered structure — map paths.* to actual directories found.
e. Set brownfield context — write to .forgewright/codebase-context.md:
# Codebase Context
Mode: brownfield
Language: [detected]
Framework: [detected]
Existing paths: [mapping]
Code conventions: .forgewright/code-conventions.md
Project profile: .forgewright/project-profile.json
## Rules for all agents
- Don't overwrite existing files without explicit user approval — blindly replacing files can destroy production-critical configuration or break existing consumers that depend on current signatures
- READ .forgewright/code-conventions.md and MATCH existing code style
- ADD to existing directories, don't replace them
- If a file exists at the target path, create alongside it or extend it
- Existing tests must still pass after changes (verified by quality-gate)
- Check .forgewright/project-profile.json → risk.protected_paths before writing
f. Activate brownfield safety net — follow skills/_shared/protocols/brownfield-safety.md:
forgewright/session-{timestamp}
✓ Safety net active — branch: forgewright/session-{timestamp}, baseline: [N] tests
All skills read codebase-context.md and code-conventions.md before executing.
Engagement mode:
Notify user via notify_user:
How deeply should the pipeline involve you in decisions?
1. **Standard (Recommended)** — 3 gates + moderate architect interview. Best balance of speed and control.
2. **Express** — Minimal interaction. 3 gates only, auto-derive architecture from BRD. Fastest.
3. **Thorough** — Deep interviews at PM and Architect. Full capacity planning. Review phase summaries.
4. **Meticulous** — Maximum depth. Approve each ADR individually. Review every agent output. Full control.
Write the choice to .forgewright/settings.md:
# Pipeline Settings
Engagement: [express|standard|thorough|meticulous]
All skills read this file at startup to adapt their depth. The engagement mode controls:
5b. Execution strategy — Scope Analysis & Recommendation:
Before asking the user, the orchestrator should analyze the project scope and generate a data-driven recommendation — this avoids wasting the user's time with uninformed "how would you like to proceed?" questions. This runs AFTER Gate 2 (architecture approved), when the full scope is known.
Step 5b-1: Scope Metrics Collection
Read the approved architecture and BRD to extract these metrics:
From docs/architecture/ and api/:
service_count = number of backend services/modules
endpoint_count = number of API endpoints
db_model_count = number of database models/entities
From product-manager/BRD/:
page_count = number of frontend pages/screens
user_story_count = number of user stories
From .production-grade.yaml:
has_frontend = features.frontend (true/false)
has_mobile = features.mobile (true/false)
has_ai_ml = features.ai_ml (true/false)
architecture = project.architecture (monolith/microservices)
Derived:
parallel_task_count = count of active BUILD tasks (T3a + T3b? + T3c? + T4)
integration_points = number of cross-service API calls
shared_deps = number of shared libraries/packages
Step 5b-2: Complexity Scoring
Calculate a complexity score (1-10) from the metrics:
| Factor | Weight | Score Formula |
|--------|--------|---------------|
| Service count | 25% | 1-2: score 2, 3-5: score 5, 6+: score 8 |
| Page count | 15% | 1-3: score 2, 4-8: score 5, 9+: score 8 |
| Cross-cutting concerns | 20% | shared_deps × 2 + integration_points |
| Architecture type | 20% | monolith: 2, modular-monolith: 5, microservices: 8 |
| Feature breadth | 20% | +2 per active platform (web, mobile, AI/ML) |
complexity_score = weighted_sum(factors)
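One way to read the weighted sum is sketched below. The table does not say how to bound the cross-cutting and feature-breadth factors, so capping each at 10 is an assumption here, as are the function names and the choice to score a count of zero in the lowest bucket.

```python
ARCH_SCORES = {"monolith": 2, "modular-monolith": 5, "microservices": 8}


def bucket(count: int, low_max: int, mid_max: int) -> int:
    """Score 2, 5, or 8 depending on which range the count falls into."""
    if count <= low_max:
        return 2
    if count <= mid_max:
        return 5
    return 8


def complexity_score(service_count: int, page_count: int, shared_deps: int,
                     integration_points: int, architecture: str,
                     active_platforms: int) -> float:
    """Weighted sum of the five factors from the table above."""
    factors = [
        (0.25, bucket(service_count, 2, 5)),
        (0.15, bucket(page_count, 3, 8)),
        # Assumption: cap the raw formula at 10 so it stays on the same scale.
        (0.20, min(10, shared_deps * 2 + integration_points)),
        (0.20, ARCH_SCORES[architecture]),
        # +2 per active platform (web, mobile, AI/ML), capped at 10 (assumption).
        (0.20, min(10, 2 * active_platforms)),
    ]
    return round(sum(weight * score for weight, score in factors), 2)
```

Under these assumptions, a two-service monolith with three pages, one platform, and no shared dependencies scores far below the threshold of 5 used later, while a six-service microservice system on three platforms scores well above it.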
Step 5b-3: Time Estimation
Estimate wall-clock execution time for both modes:
Base times per task (approximate):
T3a (Backend): ~15-40 min (scales with service_count)
T3b (Frontend): ~10-25 min (scales with page_count)
T3c (Mobile): ~10-20 min (scales with page_count)
T4 (DevOps): ~5-10 min
T5 (QA): ~10-20 min
T6a (Security): ~5-10 min
T6b (Review): ~5-10 min
Sequential time:
total_sequential = sum of all active task times (BUILD + HARDEN)
Parallel time:
build_parallel = max(T3a, T3b, T3c) + T4 # longest worker + sequential T4
harden_parallel = max(T5, T6a, T6b) # longest worker
merge_overhead = 2-5 min per parallel group # validation + merge
total_parallel = build_parallel + merge_overhead + harden_parallel + merge_overhead
Speed gain:
speedup_factor = total_sequential / total_parallel
time_saved = total_sequential - total_parallel
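The formulas above can be checked with a short calculation. The function name, the dictionary inputs, and the 3-minute default for `merge_overhead` (a point inside the stated 2-5 minute range) are assumptions for illustration.

```python
from typing import Dict


def estimate_times(build: Dict[str, float], t4: float,
                   harden: Dict[str, float],
                   merge_overhead: float = 3.0) -> Dict[str, float]:
    """Compare sequential and parallel wall-clock estimates, in minutes.

    build:  parallelizable BUILD tasks, e.g. {"T3a": 30, "T3b": 20}
    t4:     DevOps task, which runs after the parallel BUILD workers
    harden: HARDEN tasks, e.g. {"T5": 15, "T6a": 8, "T6b": 7}
    """
    total_sequential = sum(build.values()) + t4 + sum(harden.values())
    build_parallel = max(build.values()) + t4      # longest worker + sequential T4
    harden_parallel = max(harden.values())         # longest worker
    # One merge step after each of the two parallel groups.
    total_parallel = build_parallel + merge_overhead + harden_parallel + merge_overhead
    return {
        "total_sequential": total_sequential,
        "total_parallel": total_parallel,
        "speedup_factor": total_sequential / total_parallel,
        "time_saved": total_sequential - total_parallel,
    }
```

For example, with a 30-minute backend, a 20-minute frontend, an 8-minute DevOps task, and HARDEN tasks of 15, 8, and 7 minutes, the sequential total is 88 minutes and the parallel total is 59 minutes, a saving of 29 minutes.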
Step 5b-4: Risk Assessment (Parallel Mode)
Evaluate risks specific to parallel execution:
| Risk | Condition | Severity | Mitigation |
|------|-----------|----------|------------|
| Merge conflict | shared_deps > 2 OR services share DB models | Medium-High | Merge Arbiter auto-resolves configs; code conflicts escalate |
| Shared schema divergence | Multiple workers read same schema, one modifies | Medium | Contract locks schema as readonly for all workers |
| Package version mismatch | Workers add conflicting dependency versions | Low | Merge Arbiter unions package.json, runs dedupe |
| Integration failure post-merge | Workers build against stale API contracts | Medium | All workers share same frozen api/ snapshot |
| Resource exhaustion | 4 Gemini CLI processes × large context | Low | MAX_WORKERS cap + timeout per worker |
| Rollback complexity | Post-merge integration fail, hard to isolate | Medium | Per-branch rollback via merge-arbiter protocol |
Risk level:
LOW — service_count <= 2, no shared deps, monolith
MEDIUM — service_count 3-5, some shared deps, modular
HIGH — service_count 6+, heavy integration, microservices
Step 5b-5: Generate Recommendation
Based on analysis, determine the recommended mode:
IF complexity_score >= 5 AND parallel_task_count >= 3 AND risk_level != HIGH:
recommendation = PARALLEL
reason = "Scope large enough to benefit from parallelization"
ELIF complexity_score >= 5 AND risk_level == HIGH:
recommendation = PARALLEL with caution
reason = "Large scope benefits from parallel, but high integration risk"
ELIF complexity_score < 5 OR parallel_task_count < 3:
recommendation = SEQUENTIAL
reason = "Scope too small for parallel overhead to pay off"
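The decision rule above translates directly into a small function. The function name and the tuple return shape are assumptions. The final branch is written as a plain fallthrough because the three conditions in the pseudocode are exhaustive: once the first two fail, either the complexity score is below 5 or the parallel task count is below 3.

```python
from typing import Tuple


def recommend(complexity_score: float, parallel_task_count: int,
              risk_level: str) -> Tuple[str, str]:
    """Return (recommendation, reason) following the rule above.

    risk_level is one of "LOW", "MEDIUM", "HIGH".
    """
    if complexity_score >= 5 and parallel_task_count >= 3 and risk_level != "HIGH":
        return "PARALLEL", "Scope large enough to benefit from parallelization"
    if complexity_score >= 5 and risk_level == "HIGH":
        return ("PARALLEL with caution",
                "Large scope benefits from parallel, but high integration risk")
    # Remaining cases: complexity_score < 5 or parallel_task_count < 3.
    return "SEQUENTIAL", "Scope too small for parallel overhead to pay off"
```

Note that, as written in the rule, a high-risk project with a score of 5 or more gets "PARALLEL with caution" even when fewer than three tasks can run in parallel, because the second condition does not check the task count.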
Step 5b-6: Present to User
Notify user via notify_user with the analysis:
━━━ Execution Strategy Analysis ━━━━━━━━━━━━━━━━━━━━━━━━━━━
📊 Project Scope:
Services: [N] | Pages: [N] | Endpoints: [N]
Platforms: [Web / Mobile / AI]
Architecture: [monolith / modular / microservices]
Complexity Score: [X]/10
⏱ Time Estimates:
Sequential: ~[X] min (all tasks one-by-one)
Parallel: ~[Y] min (independent tasks simultaneous)
⚡ Speedup: ~[Z]x faster ([N] min saved)
⚠️ Parallel Risks:
• Merge conflict risk: [Low/Medium/High] — [detail]
• Integration risk: [Low/Medium/High] — [detail]
• Resource usage: [N] concurrent Gemini CLI workers
📋 Recommendation: [PARALLEL / SEQUENTIAL]
Reason: [explanation]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. **[Recommended mode] (Recommended)** — [brief why]
2. **[Other mode]** — [brief why user might want this]
3. **Chat about this** — Discuss the analysis or ask questions
Step 5b-7: Save Decision
Append to .forgewright/settings.md:
Execution: [parallel|sequential]
Max_Workers: 4
Complexity_Score: [X]
Estimated_Time_Sequential: [N]min
Estimated_Time_Parallel: [N]min
Risk_Level: [LOW|MEDIUM|HIGH]
Write analysis report to .forgewright/scope-analysis.md for future reference.
When Parallel is selected, the BUILD and HARDEN phases use the parallel-dispatch skill (skills/parallel-dispatch/SKILL.md) to spawn git worktrees, distribute Task Contracts, and merge results. When Sequential is selected, the pipeline behaves as before.
Detect existing workspace & load memory — if .forgewright/ has prior state, use session-lifecycle resume protocol. If .forgewright/session-log.json has interrupted state, offer resume. Otherwise offer clean start via notify_user.
python3 scripts/mem0-cli.py search "<project-name> <user-request-keywords>" --limit 5 --format compact to retrieve relevant project context. Inject results into your context for this session.
python3 scripts/mem0-cli.py refresh once to bootstrap memory from project files.
Polymath pre-flight check:
.forgewright/polymath/handoff/context-package.md exists → read it, pass to PM as pre-loaded context. Log: ✓ Polymath context loaded — skipping redundant discovery
skills/polymath/SKILL.md and follow its instructions for pre-flight consultation before proceeding. The polymath will research, clarify with the user, and write a context package when ready.
✓ Request is clear — proceeding to BA/PM
7.5. BA pre-flight check (after Polymath, before PM):
.forgewright/business-analyst/handoff/ba-package.md exists → read it, pass to PM as pre-loaded context. Log: ✓ BA package loaded — requirements pre-validated
skills/business-analyst/SKILL.md and follow its instructions. BA will elicit, evaluate, validate, and produce ba-package.md.
✓ Requirements sufficiently complete — proceeding to PM
health.tests_pass == false → suggest Harden mode first
risk.known_cves > 0 (Critical/High) → warn and suggest Security audit
risk.tech_debt_score > 7 → suggest addressing tech debt before new features
Research the domain — use search_web before asking the user anything (skip if polymath already researched).
Create task tracking:
Create a task.md file in .forgewright/ with all 13 tasks and their statuses. Track dependencies and completion.
- Read phases/define.md and start immediately. Do NOT ask "should I proceed?"
- Run python3 scripts/mem0-cli.py add "Session started: [mode] mode for [brief request]. Engagement: [level]" --category session

Key principle: The user already told you what to build. Research, plan, start building. Pause at the 3 approval gates. In Thorough/Meticulous mode, also show phase summaries between major phases, but never block on them (inform, don't gate).
After EVERY skill completes (in any mode — Full Build, Feature, Harden, etc.), run the Universal Quality Gate Protocol (skills/_shared/protocols/quality-gate.md):
- Score < quality.block_score (default 60) → STOP. Score < quality.minimum_score (default 90) → WARN at next gate.

For brownfield projects: Level 2 (Regression) compares against the baseline snapshot from brownfield-safety.md. Any previously-passing test that now fails = regression = STOP.
For greenfield projects: Level 2 is auto-satisfied (no baseline).
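The two thresholds above can be read as a three-way decision. The following is a minimal sketch, assuming the truncated rule means "a score below quality.block_score stops the pipeline"; the class and function names are hypothetical, and the real scoring procedure is defined in quality-gate.md, which is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityThresholds:
    # Defaults taken from the text: quality.block_score and quality.minimum_score.
    block_score: int = 60
    minimum_score: int = 90


def classify_score(score: int, thresholds: QualityThresholds = QualityThresholds()) -> str:
    """Return 'STOP', 'WARN', or 'PASS' for one skill's quality score.

    Assumption: comparisons are strict, so a score equal to a threshold
    is treated as meeting it.
    """
    if score < thresholds.block_score:
        return "STOP"
    if score < thresholds.minimum_score:
        return "WARN"
    return "PASS"
```

A project that overrides the defaults would pass its own `QualityThresholds`, for example `classify_score(72, QualityThresholds(block_score=70, minimum_score=85))`.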
Call these hooks at the appropriate lifecycle points:
| Event | Hook | Action |
|-------|------|--------|
| Phase completes | PHASE_COMPLETE(name, summary) | Update session-log, save to memory, update quality metrics |
| Task completes | TASK_COMPLETE(id, name, status, summary) | Update session-log |
| Gate decided | GATE_DECISION(gate#, decision, feedback) | Update session-log, save decision to memory |
| Error occurs | ERROR(task_id, type, details) | Update session-log, save blocker to memory |
| Pipeline ends | Session End | Summarize, save to memory, update project profile |
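As an illustration of what "update session-log" could look like, the sketch below appends each hook call to a JSON array on disk. The storage format, the field names, and the helper names are all assumptions made for the example; the actual session-log.json schema is defined by the session-lifecycle protocol and may differ.

```python
import json
from pathlib import Path
from typing import Any


def record_event(log_path: Path, event: str, **fields: Any) -> list[dict[str, Any]]:
    """Append one lifecycle event to a JSON array on disk and return all entries."""
    if log_path.exists():
        entries = json.loads(log_path.read_text(encoding="utf-8"))
    else:
        entries = []
    entries.append({"event": event, **fields})
    log_path.parent.mkdir(parents=True, exist_ok=True)
    log_path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entries


# Thin wrappers mirroring three of the hook signatures in the table (hypothetical names).
def phase_complete(log_path: Path, name: str, summary: str) -> list[dict[str, Any]]:
    return record_event(log_path, "PHASE_COMPLETE", name=name, summary=summary)


def task_complete(log_path: Path, task_id: str, name: str, status: str, summary: str) -> list[dict[str, Any]]:
    return record_event(log_path, "TASK_COMPLETE", id=task_id, name=name, status=status, summary=summary)


def gate_decided(log_path: Path, gate: int, decision: str, feedback: str = "") -> list[dict[str, Any]]:
    return record_event(log_path, "GATE_DECISION", gate=gate, decision=decision, feedback=feedback)
```

The memory and quality-metric updates listed in the Action column are not modeled here; only the log append is shown.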
Follow the shared UX Protocol at skills/_shared/protocols/ux-protocol.md. Key rules:
- (Recommended) suffix

When the user selects "Chat about this" at any gate, invoke the polymath in translate mode:
Read skills/polymath/SKILL.md and follow its instructions in translate mode.
The polymath reads the gate artifacts, explains in plain language,
answers the user's questions via structured options,
then re-presents the original gate options when the user is ready.
This ensures non-technical users can understand what they're approving without the orchestrator needing to be the translator.
Gate 1 — BRD Approval (after T1):
Notify user via notify_user:
BRD complete: [X] user stories, [Y] acceptance criteria. Approve?
1. **Approve — start architecture (Recommended)** — BRD locked, proceed to Solution Architect
2. **Show BRD details** — Display the full BRD before deciding
3. **I have changes** — Request modifications to requirements
4. **Chat about this** — Free-form input about the BRD
Gate 2 — Architecture Approval (after T2):
Notify user via notify_user:
Architecture complete: [tech stack summary]. Approve to start building?
1. **Approve — start building (Recommended)** — Architecture locked, begin autonomous BUILD phase
2. **Show architecture details** — Walk through ADRs, diagrams, and API spec
3. **I have concerns** — Flag issues with architecture decisions
4. **Chat about this** — Free-form input about the architecture
Gate 3 — Production Readiness (after T9):
Notify user via notify_user:
All phases complete. [summary]. Ship it?
1. **Ship it — production ready (Recommended)** — Finalize assembly and deploy
2. **Show full report** — Display complete pipeline summary
3. **Fix issues first** — Address remaining findings before shipping
4. **Chat about this** — Free-form input about production readiness
Task execution with clear dependency tracking. The orchestrator reads the architecture output (number of services, pages, modules) and generates tasks accordingly. Supports both sequential and parallel execution based on settings.md.
T1: product-manager (BRD)
↓ [GATE 1]
T2: solution-architect (Architecture)
↓ [GATE 2]
T3a: software-engineer — implement backend services (1 per service)
T3b: frontend-engineer — implement frontend pages (1 per page group)
T4a: devops — Dockerfiles + CI skeleton
↓ (code written)
T5: qa-engineer — implement tests (unit/integ/e2e/perf)
T6a: security-engineer — STRIDE + code audit + dep scan
T6b: code-reviewer — arch conformance + quality review
↓
T7: devops (IaC + CI/CD)
T8: remediation (HARDEN fixes)
T9: sre (SLOs + chaos + capacity)
T10: data-scientist (conditional on AI/ML)
↓ [GATE 3]
T11: technical-writer (API ref + dev guides)
T12: skill-maker
↓
T13: Compound Learning + Assembly
T1: product-manager (BRD)
↓ [GATE 1]
T2: solution-architect (Architecture)
↓ [GATE 2]
┌────────────────────── Parallel Group A (BUILD) ─────────────────┐
│ T3a: software-engineer ──── worktree: .worktrees/T3a │
│ T3b: frontend-engineer ──── worktree: .worktrees/T3b │
│ T3c: mobile-engineer ──── worktree: .worktrees/T3c [cond.] │
└────────────────── validate → merge → integration test ─────────┘
T4a: devops (depends on merged T3a output)
↓ (code written)
┌────────────────────── Parallel Group B (HARDEN) ────────────────┐
│ T5: qa-engineer ──── worktree: .worktrees/T5 │
│ T6a: security-engineer ──── worktree: .worktrees/T6a │
│ T6b: code-reviewer ──── worktree: .worktrees/T6b │
└────────────────── validate → merge → integration test ─────────┘
↓
T7: devops (IaC + CI/CD)
T8: remediation (HARDEN fixes)
T9: sre (SLOs + chaos + capacity)
T10: data-scientist (conditional on AI/ML)
↓ [GATE 3]
T11: technical-writer (API ref + dev guides)
T12: skill-maker
↓
T13: Compound Learning + Assembly
When parallel mode is active, the orchestrator reads skills/parallel-dispatch/SKILL.md for the dispatch flow.
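The worktree paths in the parallel graph suggest one `git worktree add` call per task in a group. The sketch below only builds the command lines and does not run them; the branch prefix is a hypothetical naming choice, and the authoritative dispatch flow is the one in skills/parallel-dispatch/SKILL.md.

```python
def worktree_add_commands(
    task_ids: list[str],
    branch_prefix: str = "forgewright/task-",  # hypothetical branch naming
) -> list[list[str]]:
    """Build one `git worktree add -b <branch> <path>` command per task."""
    return [
        ["git", "worktree", "add", "-b", f"{branch_prefix}{task_id}", f".worktrees/{task_id}"]
        for task_id in task_ids
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, for example for Group A with `worktree_add_commands(["T3a", "T3b"])`.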
| Task | Blocked By | Notes |
|------|-----------|-------|
| T1 | — | First task, no blockers |
| T2 | T1 | Needs BRD |
| T3a | T2 | Backend — implement services from architecture |
| T3b | T2 | Frontend — implement pages from BRD |
| T4a | T2 | DevOps — Dockerfiles + CI skeleton |
| T5 | T3a, T3b | QA — needs code + test plan |
| T6a | T3a, T3b | Security — needs code + threat model |
| T6b | T3a, T3b | Review — needs code + checklist |
| T7 | T5, T6a, T6b | IaC + CI/CD — needs HARDEN output |
| T8 | T5, T6a, T6b | Remediation — needs HARDEN findings |
| T9 | T7, T8 | SRE — needs infra + fixes |
| T10 | T7, T8 | Conditional on AI/ML usage |
| T11 | T9 | Docs — needs all prior output |
| T12 | T9 | Skills — needs all prior output |
| T13 | T11, T12 | Final step |
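The "Blocked By" column defines a dependency graph, so a valid execution order can be derived mechanically. The sketch below uses the standard-library `graphlib` to group tasks into waves whose blockers are all complete; the wave grouping is an illustration of what the table permits, not a description of how the orchestrator actually schedules work.

```python
from graphlib import TopologicalSorter

# Task -> tasks it is blocked by, copied from the dependency table.
BLOCKED_BY: dict[str, list[str]] = {
    "T1": [],
    "T2": ["T1"],
    "T3a": ["T2"],
    "T3b": ["T2"],
    "T4a": ["T2"],
    "T5": ["T3a", "T3b"],
    "T6a": ["T3a", "T3b"],
    "T6b": ["T3a", "T3b"],
    "T7": ["T5", "T6a", "T6b"],
    "T8": ["T5", "T6a", "T6b"],
    "T9": ["T7", "T8"],
    "T10": ["T7", "T8"],
    "T11": ["T9"],
    "T12": ["T9"],
    "T13": ["T11", "T12"],
}


def execution_waves(blocked_by: dict[str, list[str]]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave has all blockers finished."""
    sorter = TopologicalSorter(blocked_by)
    sorter.prepare()  # raises graphlib.CycleError if the table contains a cycle
    waves: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

In sequential mode the waves can simply be flattened in order; in parallel mode the third and fourth waves correspond roughly to Group A and Group B.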
After Gate 2 (architecture approved), the orchestrator reads the architecture output to determine work units:
- docs/architecture/ service list or api/ specs. For each service, note it for sequential implementation in T3a.
- .production-grade.yaml has features.frontend: false
- openai, anthropic, langchain, transformers, torch, tensorflow imports. If not detected and features.ai_ml: false, mark as completed immediately.

Each phase loads its dispatcher file for task management. In parallel mode, BUILD and HARDEN phases additionally invoke the parallel-dispatch skill.
| Phase | File | Tasks | Parallel Support |
|-------|------|-------|------------------|
| DEFINE | phases/define.md | T1, T2 | No (gate-protected) |
| BUILD | phases/build.md | T3a, T3b, T3c, T4a | Yes (Group A) |
| HARDEN | phases/harden.md | T5, T6a, T6b | Yes (Group B) |
| SHIP | phases/ship.md | T7, T8, T9, T10 | No |
| SUSTAIN | phases/sustain.md | T11, T12, T13 | No |
Read the phase file BEFORE starting that phase. Never load all phase files at once.
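One way to honor the "one phase file at a time" rule is to keep only the phase-to-file mapping in memory and read a file at the moment its phase starts. This is a minimal sketch; `load_phase` and its `skill_root` parameter are hypothetical names, while the file paths come from the table above.

```python
from pathlib import Path

# Phase name -> dispatcher file, copied from the phase table.
PHASE_FILES: dict[str, str] = {
    "DEFINE": "phases/define.md",
    "BUILD": "phases/build.md",
    "HARDEN": "phases/harden.md",
    "SHIP": "phases/ship.md",
    "SUSTAIN": "phases/sustain.md",
}


def load_phase(skill_root: Path, phase: str) -> str:
    """Read a single phase file on demand; unknown phases raise KeyError."""
    relative = PHASE_FILES[phase.upper()]
    return (skill_root / relative).read_text(encoding="utf-8")
```

Calling `load_phase(root, "build")` right before BUILD begins keeps the other four files out of context until they are needed.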
Internal skill architecture — each skill's internal phase structure (executed sequentially in Antigravity):
| Skill | Internal Phases |
|-------|----------------|
| software-engineer | Shared foundations first (Phase 2a), then per-service implementation (Phase 2b). Foundations ensure consistency. |
| frontend-engineer | UI Primitives first (Phase 3a), then Layout + Features (Phase 3b), then Pages (Phase 4). Primitives are foundational atoms. |
| qa-engineer | Unit, integration, e2e, performance tests — sequential by test type |
| security-engineer | Code audit, auth review, data security, supply chain — sequential by domain |
| code-reviewer | Architecture conformance, code quality, performance review — sequential by focus |
| devops | IaC, CI/CD, container orchestration — sequential by layer |
| sre | Chaos engineering, incident management, capacity planning — sequential |
| technical-writer | API reference, developer guides — sequential |
Read the skill's SKILL.md file and follow its instructions directly:
Read skills/<skill-name>/SKILL.md and follow its instructions.
Provide context: architecture files, BRD, workspace paths, etc.
Follow the shared protocol at skills/_shared/protocols/conflict-resolution.md.
| Artifact | Sole Authority | Others Must NOT |
|----------|---------------|-----------------|
| OWASP, STRIDE, PII, encryption | security-engineer | code-reviewer must NOT do security review |
| SLO, error budgets, runbooks | sre | devops must NOT define SLOs |
| Code quality, arch conformance | code-reviewer | — |
| Infrastructure, CI/CD, monitoring setup | devops | sre reviews but doesn't provision |
| Requirements (WHAT) | product-manager | architect flags gaps, doesn't change requirements |
| Architecture (HOW) | solution-architect | — |
When HARDEN skills find Critical/High issues:
- services/, frontend/

| Task | Reads From | Writes To (Project Root) | Writes To (Workspace) |
|------|-----------|--------------------------|----------------------|
| Polymath | User dialogue, web research | — | polymath/context/, polymath/handoff/ |
| T1: PM | User input, polymath context, web research | — | product-manager/BRD/ |
| T2: Architect | product-manager/BRD/ | api/, schemas/, docs/architecture/ | solution-architect/ |
| T3a: Backend | api/, schemas/, docs/architecture/ | services/, libs/shared/ | software-engineer/ |
| T3b: Frontend | api/, product-manager/BRD/ | frontend/ | frontend-engineer/ |
| T4a: DevOps | services/, docs/architecture/ | Dockerfiles at root | devops/containers/ |
| T5: QA | services/, frontend/, api/ | tests/ | qa-engineer/ |
| T6a: Security | All implementation code | — | security-engineer/ |
| T6b: Review | All implementation + architecture | — | code-reviewer/ |
| T7: DevOps IaC | Architecture, implementation | infrastructure/, .github/workflows/ | devops/ |
| T8: Remediation | HARDEN findings | Fixes in services/, frontend/ | — |
| T9: SRE | All prior outputs | docs/runbooks/ | sre/ |
| T10: Data Sci | Implementation (LLM usage) | — | data-scientist/ |
| T11: Tech Writer | ALL workspace + project | docs/ | technical-writer/ |
| T12: Skill Maker | ALL workspace | skills/ | skill-maker/ |
Deliverables go to project root (respecting .production-grade.yaml path overrides). Workspace artifacts go to .forgewright/<skill-name>/.
.forgewright/
├── .protocols/ # Shared protocols (written at bootstrap)
├── .orchestrator/ # Pipeline state via task.md
├── product-manager/ # BRD, research
├── solution-architect/ # Architecture artifacts
├── software-engineer/ # Backend logs/artifacts
├── frontend-engineer/ # Frontend logs/artifacts
├── qa-engineer/ # Test artifacts
├── security-engineer/ # Security findings
├── code-reviewer/ # Quality findings
├── devops/ # Infrastructure artifacts
├── sre/ # Readiness artifacts
├── data-scientist/ # AI/ML artifacts (conditional)
├── technical-writer/ # Documentation artifacts
└── skill-maker/ # Custom skills
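The split between project-root deliverables and workspace artifacts can be expressed as two small path helpers. This sketch assumes path overrides are available as a plain mapping from default directory to replacement directory; the real structure of .production-grade.yaml is not shown in this document, so that mapping and both function names are hypothetical.

```python
from pathlib import Path

WORKSPACE_DIR = ".forgewright"


def workspace_path(project_root: Path, skill_name: str) -> Path:
    """Workspace artifacts live under .forgewright/<skill-name>/."""
    return project_root / WORKSPACE_DIR / skill_name


def deliverable_path(project_root: Path, default_dir: str, overrides: dict[str, str]) -> Path:
    """Deliverables go to the project root, honoring an override when one exists.

    `overrides` is an assumed shape: {"services": "apps/backend", ...}.
    """
    return project_root / overrides.get(default_dir, default_dir)
```

For example, with `overrides = {"services": "apps/backend"}`, backend code would be written under `apps/backend/` while QA artifacts still land in `.forgewright/qa-engineer/`.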
| Situation | Action |
|-----------|--------|
| No frontend needed | Skip T3b, simplify DevOps |
| Monolith architecture | Single Dockerfile, skip K8s/service mesh |
| LLM/ML APIs detected | Auto-enable T10 (Data Scientist) |
| Critical security finding | Create remediation task (T8) |
| QA failures > 20% | Flag to user |
| Architecture drift detected | Warn user (arch decisions are user-approved) |
| features.frontend: false | Skip T3b entirely |
| features.ai_ml: false | Skip T10 unless auto-detected |
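The frontend and AI/ML rows of this table can be combined into a single "which tasks are skipped" check. The sketch below is an approximation under stated assumptions: it detects only Python-style `import`/`from` statements for the package names listed in the work-unit analysis, and it takes the feature flags as an already-parsed dictionary rather than reading .production-grade.yaml. Function names are hypothetical.

```python
import re

# Package names from the work-unit analysis; detection here is Python-only.
AI_ML_PACKAGES = ("openai", "anthropic", "langchain", "transformers", "torch", "tensorflow")
_IMPORT_RE = re.compile(
    r"^\s*(?:import|from)\s+(?:" + "|".join(AI_ML_PACKAGES) + r")\b",
    re.MULTILINE,
)


def uses_ai_ml(source: str) -> bool:
    """True if the source text imports one of the listed AI/ML packages."""
    return _IMPORT_RE.search(source) is not None


def skipped_tasks(features: dict[str, bool], sources: list[str]) -> set[str]:
    """Apply the features.frontend and features.ai_ml rules from the table."""
    skipped: set[str] = set()
    if features.get("frontend") is False:
        skipped.add("T3b")
    detected = any(uses_ai_ml(text) for text in sources)
    # T10 is skipped only when nothing is detected AND the flag is explicitly false.
    if not detected and features.get("ai_ml") is False:
        skipped.add("T10")
    return skipped
```

When the `ai_ml` flag is absent and nothing is detected, this sketch leaves T10 enabled, because the text does not say what should happen in that case.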
Security runs during ALL phases:
- rm -rf /, chmod 777, destructive operations
- .env, .key, .pem, credentials.json from git

Every skill execution follows:
- skills/_shared/protocols/quality-gate.md after each skill output. Score must meet threshold.
- while not valid: fix(errors); validate()
- .forgewright/code-conventions.md (if brownfield) and match existing patterns.

| Command | Tasks Run |
|---------|----------|
| just define | T1, T2 only |
| just build | T3a, T3b, T4a (requires T2 output) |
| just harden | T5, T6a, T6b (requires BUILD output) |
| just ship | T7-T10 (requires HARDEN output) |
| just document | T11 only |
| skip frontend | Omit T3b |
| start from architecture | Skip T1, start at T2 |
| just onboard | Run project-onboarding only (no pipeline) |
At pipeline completion, generate the Quality Dashboard from skills/_shared/protocols/quality-dashboard.md. This replaces the legacy text banner with a comprehensive, machine-readable quality report.
The dashboard includes:
Machine-readable output: .forgewright/quality-report-{session}.json
Quality trending: .forgewright/quality-history.json (appended each session)
Also display the legacy summary for backward compatibility:
╔══════════════════════════════════════════════════════════════╗
║ FORGE17 v{local_version} — COMPLETE ║
╠══════════════════════════════════════════════════════════════╣
║ Project: <name> ║
║ Quality Score: [XX]/100 (Grade [A-F]) ║
║ ║
║ DEFINE: ✓ BRD (<X> stories) ✓ Architecture (<pattern>) ║
║ BUILD: ✓ Backend (<N> services) ✓ Tests (<N> passing) ║
║ HARDEN: ✓ Security (<N> fixed) ✓ Code Review (<N> fixed) ║
║ SHIP: ✓ Docker ✓ CI/CD ✓ Terraform ✓ SRE approved ║
║ SUSTAIN: ✓ Docs ✓ Skills (<N> created) ✓ Learnings captured ║
║ ║
║ Workspace: .forgewright/ ║
║ Config: .production-grade.yaml ║
║ Report: .forgewright/quality-report-{session}.json ║
╚══════════════════════════════════════════════════════════════╝
For ALL brownfield projects (any mode, not just Full Build), activate the safety net from skills/_shared/protocols/brownfield-safety.md:
| Safety Layer | When | Action |
|-------------|------|--------|
| Git branch | Pre-pipeline | Create forgewright/session-{timestamp} branch |
| Baseline snapshot | Pre-pipeline | Run existing tests, record pass count |
| Protected paths | Pre-pipeline | Register paths that must not be modified |
| Regression checks | After T3a, T3b, T5 | Verify existing tests still pass |
| Change manifest | During pipeline | Track every file create/modify/delete |
| Merge readiness | Pre-Gate 3 | Full regression + quality check |
| Rollback | On failure | Revert via session branch |
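The regression check reduces to comparing the baseline snapshot with the current test results. The sketch below assumes both are available as a mapping from test name to pass/fail, which is a hypothetical representation; brownfield-safety.md defines the real snapshot format. It also assumes that a baseline test missing from the current run counts as a regression.

```python
def find_regressions(baseline: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Return tests that passed in the baseline but fail (or are absent) now."""
    return sorted(
        name
        for name, passed in baseline.items()
        if passed and not current.get(name, False)
    )
```

A non-empty result corresponds to the STOP condition: tests that were already failing before the pipeline started are ignored, and newly added tests cannot regress.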
| Mistake | Fix |
|---------|-----|
| Running BUILD without DEFINE | Architecture decisions must exist first |
| Code reviewer doing OWASP review | security-engineer is sole OWASP authority |
| DevOps defining SLOs | sre is sole SLO authority |
| DevOps writing runbooks | sre writes runbooks to docs/runbooks/ |
| Skipping tests | Production grade means tested |
| Not running code after writing | Every skill verifies output compiles and runs |
| Skills working in isolation | Cross-reference via Context Bridging table |
| Over-asking the user | Respect engagement mode. Express: 3 gates only. Standard: 3 gates + moderate interview. Thorough/Meticulous: deeper interviews but always structured options. |
| Ignoring engagement mode | ALL skills must read settings.md and adapt depth. Express architect doesn't ask 15 questions. Meticulous PM doesn't skip to BRD after 2 questions. |
| One-size-fits-all architecture | Architecture is derived from constraints (scale, team, budget, compliance). A 100-user internal tool does NOT need microservices + K8s. |
| Writing stubs | No // TODO: implement in production code |
| Hardcoded paths | Read .production-grade.yaml for path overrides |
| Not leveraging skill architecture | Even though execution is sequential, each skill's internal phase structure ensures quality. Foundations before dependent work. |
| Duplicating security review | code-reviewer references security-engineer findings |
| Skipping quality gate | EVERY skill output must pass quality-gate.md — no exceptions, even in sequential mode |
| Ignoring code conventions in brownfield | Read .forgewright/code-conventions.md BEFORE writing code. Match existing patterns. |
| Modifying protected paths | Check brownfield-safety protected paths before ANY file write |
| No regression check in brownfield | After EACH build skill, verify existing tests still pass against baseline |
| Not saving session state | Call session lifecycle hooks at every phase/task/gate completion |