plugins/yzmir-dynamic-architectures/skills/using-dynamic-architectures/SKILL.md
Use when building networks that grow, prune, or adapt topology during training. Routes to continual learning, gradient isolation, modular composition, and lifecycle orchestration skills.
Install this skill globally with one command: `npx skillsauth add tachyon-beep/skillpacks using-dynamic-architectures`. Works with Claude Code, Cursor, and Windsurf.
Invoke this meta-skill when you encounter:
This is the entry point for dynamic/morphogenetic neural network patterns. It routes to 7 specialized reference sheets.
IMPORTANT: All reference sheets are located in the SAME DIRECTORY as this SKILL.md file.
When this skill is loaded from:
skills/using-dynamic-architectures/SKILL.md
Reference sheets like continual-learning-foundations.md are at:
skills/using-dynamic-architectures/continual-learning-foundations.md
NOT at:
skills/continual-learning-foundations.md (WRONG PATH)
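The path rule above can be sketched in a few lines: a reference sheet is resolved against the directory that contains SKILL.md, not against the parent `skills/` directory. The helper name `reference_sheet_path` is hypothetical and used only for illustration; it is not part of the skill pack.

```python
from pathlib import PurePosixPath


def reference_sheet_path(skill_md: str, sheet_name: str) -> str:
    """Return the path of a reference sheet that sits next to SKILL.md.

    Hypothetical helper: the sheet is a sibling of SKILL.md, so we take
    SKILL.md's own directory and append the sheet's file name.
    """
    return str(PurePosixPath(skill_md).parent / sheet_name)


resolved = reference_sheet_path(
    "skills/using-dynamic-architectures/SKILL.md",
    "continual-learning-foundations.md",
)
print(resolved)
# prints "skills/using-dynamic-architectures/continual-learning-foundations.md"
```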
Dynamic architectures grow capability, not just tune weights.
Static networks are a guess about capacity. Dynamic networks let training signal drive structure. The challenge is growing without forgetting, integrating without destabilizing, and knowing when to act.
Key tensions:
Diagnostic Questions:
Quick Routing:
| Problem | Primary Skill |
|---------|---------------|
| "Model forgets old tasks when I train new ones" | continual-learning-foundations |
| "New module destabilizes existing weights" | gradient-isolation-techniques |
| "Fine-tune LLM efficiently without full training" | peft-adapter-techniques |
| "Pick a modern PEFT variant (VeRA / LoRA+ / PiSSA / LoftQ / rsLoRA)" | peft-adapter-techniques |
| "When should I add more capacity?" | dynamic-architecture-patterns |
| "How do module outputs combine?" | modular-neural-composition |
| "Merge several fine-tuned checkpoints (TIES / DARE / SLERP / MergeKit)" | modular-neural-composition |
| "Production-grade MoE (Switch / Mixtral / DeepSeek-MoE / Expert Choice)" | modular-neural-composition |
| "How do I manage the grow/train/integrate cycle?" | ml-lifecycle-orchestration |
| "How do I warm up new modules safely?" | progressive-training-strategies |
| "Serve many LoRAs in one process (S-LoRA / LoRAX / Punica)" | → yzmir-ml-production |
Symptoms:
Route to: continual-learning-foundations.md
Covers:
When to Use:
Symptoms:
Route to: gradient-isolation-techniques.md
Covers:
detach() vs no_grad() semantics
When to Use:
Symptoms:
Route to: peft-adapter-techniques.md
Covers:
yzmir-ml-production)
When to Use:
Symptoms:
Route to: dynamic-architecture-patterns.md
Covers:
When to Use:
Symptoms:
Route to: modular-neural-composition.md
Covers:
When to Use:
Symptoms:
Route to: ml-lifecycle-orchestration.md
Covers:
When to Use:
Symptoms:
Route to: progressive-training-strategies.md
Covers:
When to Use:
Need: Network that grows seeds, trains them in isolation, and grafts successful ones
Routing sequence:
Need: Train on sequence of tasks without catastrophic forgetting
Routing sequence:
Need: Grow/prune network based on training signal
Routing sequence:
Need: RL agent deciding when to grow, prune, integrate
Routing sequence:
yzmir-morphogenetic-rl (companion pack) - Controller action/observation/reward design, governor and safety gates, rollback-as-RL-signal shaping. This is the canonical home for the RL-controller-decides-mutation loop.

Boundary: yzmir-morphogenetic-rl covers WHEN/HOW the controller decides to grow. This pack covers HOW the growable network trains once a decision is made.
| Rationalization | Reality | Counter-Guidance |
|-----------------|---------|------------------|
| "Just train a bigger model from scratch" | Transfer + growth often beats from-scratch | "Check continual-learning-foundations for why" |
| "I'll freeze everything except the new layer" | Full freeze may be too restrictive | "Check gradient-isolation-techniques for partial strategies" |
| "I'll add capacity whenever loss plateaus" | Need more than loss plateau (contribution check) | "Check ml-lifecycle-orchestration for proper gates" |
| "Modules can just sum their outputs" | Naive summation can cause interference | "Check modular-neural-composition for combination mechanisms" |
| "I'll integrate immediately when training finishes" | Need warmup/holding period | "Check progressive-training-strategies for safe integration" |
| "EWC solves all forgetting problems" | EWC has limitations, may need architectural approach | "Check continual-learning-foundations for trade-offs" |
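The "loss plateau is not enough" row can be illustrated with a small gate that requires two signals before growth is allowed. This is a sketch under stated assumptions, not the gate defined in ml-lifecycle-orchestration: the function names, the window size, the tolerances, and the reading of "contribution" as a single score supplied by the caller are all hypothetical.

```python
from typing import Sequence


def loss_plateaued(losses: Sequence[float], window: int = 3, tol: float = 1e-3) -> bool:
    """True when the loss improved by less than `tol` over the last `window` steps."""
    if len(losses) < window + 1:
        return False  # not enough history to judge
    return (losses[-window - 1] - losses[-1]) < tol


def should_grow(losses: Sequence[float], contribution: float,
                min_contribution: float = 0.01) -> bool:
    """Hypothetical growth gate: plateau AND a sufficient contribution score.

    `contribution` is an assumed, caller-supplied estimate of how much
    extra capacity would help; how to measure it is out of scope here.
    """
    return loss_plateaued(losses) and contribution >= min_contribution


flat = [1.0, 0.5, 0.5, 0.5, 0.5]
print(should_grow(flat, contribution=0.0))   # plateau alone does not pass the gate
print(should_grow(flat, contribution=0.05))  # plateau plus contribution passes
```

Keeping the two checks separate makes it easy to swap in a different plateau test or contribution measure without touching the gate itself.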
Watch for these signs of incorrect approach:
| Request | Primary Pack | Why |
|---------|--------------|-----|
| "Implement PPO for architecture decisions" | yzmir-deep-rl | RL algorithm implementation |
| "Evaluate architecture changes without mutation" | yzmir-deep-rl/counterfactual-reasoning | Counterfactual simulation |
| "Debug PyTorch gradient flow" | yzmir-pytorch-engineering | Low-level PyTorch debugging |
| "Optimize training loop performance" | yzmir-training-optimization | General training optimization |
| "FSDP2 + QLoRA, FP8 training, MoE dispatch kernels" | yzmir-training-optimization | Distributed/low-precision throughput |
| "Apply PEFT recipes to LLMs (instruction tuning, RLHF)" | yzmir-llm-specialist | PEFT applied to LLMs in production |
| "Design transformer architecture" | yzmir-neural-architectures | Static architecture design |
| "Deploy morphogenetic model" | yzmir-ml-production | Production deployment |
| "Serve many LoRAs in one process (S-LoRA / LoRAX / Punica)" | yzmir-ml-production | Multi-tenant adapter serving |
Intersection with deep-rl + morphogenetic-rl: If using RL to control architecture decisions (when to grow/prune), the canonical home for that work is yzmir-morphogenetic-rl (controller, governor, rollback shaping). Compose with yzmir-deep-rl's policy gradient / actor-critic methods for the algorithm side, and this pack's lifecycle orchestration for the network-training side.
Counterfactual evaluation: Before committing to a live mutation (grow/prune), use deep-rl's counterfactual-reasoning.md to simulate the change and evaluate outcomes without risk. This is critical for production morphogenetic systems.
Use these to route users:
START: Dynamic architecture problem
├─ Forgetting old tasks?
│ └─ → continual-learning-foundations
├─ New module destabilizes existing?
│ └─ → gradient-isolation-techniques
├─ Fine-tuning LLM efficiently?
│ └─ → peft-adapter-techniques
├─ When/where to add capacity?
│ └─ → dynamic-architecture-patterns
├─ How modules combine?
│ └─ → modular-neural-composition
├─ Managing grow/train/integrate cycle?
│ └─ → ml-lifecycle-orchestration
├─ Warmup/cooldown for new capacity?
│ └─ → progressive-training-strategies
└─ Building complete morphogenetic system?
  └─ → Start with dynamic-architecture-patterns
     → Then gradient-isolation-techniques
     → Then ml-lifecycle-orchestration
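The single-sheet branches of the tree above can be approximated by a first-match keyword lookup. This is only a rough sketch: the keywords and the `route` function are illustrative assumptions, not part of the skill pack, and the multi-step morphogenetic branch is deliberately left out.

```python
from typing import Optional

# Keyword -> reference sheet, in the same order as the decision tree.
# The keywords are hypothetical; real routing should rely on judgment.
ROUTES = (
    ("forget", "continual-learning-foundations"),
    ("destabiliz", "gradient-isolation-techniques"),
    ("fine-tun", "peft-adapter-techniques"),
    ("capacity", "dynamic-architecture-patterns"),
    ("combine", "modular-neural-composition"),
    ("cycle", "ml-lifecycle-orchestration"),
    ("warm", "progressive-training-strategies"),
)


def route(problem: str) -> Optional[str]:
    """Return the first matching reference sheet file name, or None."""
    text = problem.lower()
    for keyword, sheet in ROUTES:
        if keyword in text:
            return sheet + ".md"
    return None  # no branch matched; fall back to manual routing


print(route("Model forgets old tasks when I train new ones"))
# prints "continual-learning-foundations.md"
```

Because matching stops at the first hit, a question that mentions both capacity and warmup lands on dynamic-architecture-patterns, mirroring the top-to-bottom order of the tree.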
After routing, load the appropriate reference sheet: