skills/chain-simulation-dual-mode-reasoning/SKILL.md
Dual-mode reasoning framework that dynamically routes problems to specialized strategies: computational flow for math, symbolic JSON state tracking for spatial/entity reasoning, and hybrid fact-extraction for multi-hop inference. Use when asked to 'solve this step by step', 'reason through this problem', 'track state changes', 'figure out the answer to this logic puzzle', 'solve this math word problem', or 'chain these facts together'.
npx skillsauth add ndpvt-web/arxiv-claude-skills chain-simulation-dual-mode-reasoning
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill enables Claude to apply the Chain of Simulation framework -- a dynamic problem-routing system that classifies reasoning tasks and dispatches them to one of three specialized modes: computational flow (math with self-consistency), symbolic state tracking (spatial/entity reasoning via JSON), or hybrid fact-extraction (multi-hop logical inference). Instead of applying a single generic chain-of-thought strategy to every problem, CoS matches the reasoning strategy to the problem type, achieving higher accuracy at lower computational cost.
CoS works by first classifying the input problem along four dimensions -- mathematical content, spatial content, multi-hop logical structure, and entity-tracking density -- then routing it to the mode best suited to that problem type. This matters because the wrong mode can fail catastrophically: computational mode scores 81.2% on math problems but 0% on spatial tasks. The routing is deterministic and keyword-driven, not probabilistic.
The three modes each impose a different reasoning structure. Computational flow generates step-by-step arithmetic, extracts a final numeric answer, and optionally samples multiple reasoning paths (self-consistency with k=5), taking the median of numeric answers or the majority vote of categorical ones. Symbolic state tracking initializes a JSON object representing world state ({"locations": {}, "objects": {}}) and updates it event by event, producing a machine-parseable final state from which the answer is read. Hybrid fact-extraction decomposes the problem into fact extraction, relationship identification, logical chaining, and conclusion, which suits yes/no questions requiring external knowledge synthesis.
The efficiency gain comes from targeted application: instead of running expensive self-consistency (k=5+ samples) on every problem, CoS only applies multi-sample generation to math problems where it helps, and uses deterministic single-pass reasoning (temperature=0) for state tracking. This achieves comparable accuracy to blanket self-consistency at 54% lower cost.
Classify the problem by scanning for four indicator types: mathematical content, spatial content, multi-hop logical structure, and entity-tracking density.
Route to the appropriate mode using this priority logic:
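A minimal Python sketch of how the classify-and-route steps might look. The keyword lists, the digit check, and the priority order are all assumptions, chosen only so that the four examples in this document route the way the document says they do; the paper's own algorithms may differ.

```python
import re

# Hypothetical keyword lists: the skill text does not enumerate them.
KEYWORDS = {
    "math": ("how much", "how many", "total", "costs", "each"),
    "spatial": ("put", "moved", "went to", "drives", "drove"),
    "multi_hop": ("same", "both", "before", "after"),
    "entity": ("where is", "who has"),
}


def has_phrase(text, phrases):
    """True if any phrase occurs in text as a whole word or phrase."""
    return any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in phrases)


def classify(problem):
    """Return one boolean flag per indicator type."""
    text = problem.lower()
    flags = {name: has_phrase(text, phrases) for name, phrases in KEYWORDS.items()}
    # Assumption: a problem only counts as mathematical if it also has a digit.
    flags["math"] = flags["math"] and bool(re.search(r"\d", text))
    return flags


def route(flags):
    """Assumed priority order; deterministic, no sampling involved."""
    if flags["math"] and (flags["spatial"] or flags["entity"]):
        return "hybrid"
    if flags["spatial"] or flags["entity"]:
        return "symbolic"
    if flags["math"]:
        return "computational"
    return "hybrid"  # multi-hop problems and anything unclassified


print(route(classify("Mary moved the apple to the garden. Where is the apple?")))
```

Because the rules are plain keyword tests evaluated in a fixed order, the same input always produces the same route, which matches the deterministic routing described above.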
For Computational mode: Structure the solution as explicit step-by-step calculations. Show every intermediate value. Extract the final numeric answer on a clearly marked line. If confidence is low, generate the solution via multiple reasoning paths and take the median of numeric answers or majority vote of categorical answers.
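The aggregation rule in this step (median for numeric answers, majority vote for categorical ones) can be sketched as follows. The function name `aggregate` is hypothetical, and the sampled answers are assumed to have been extracted from the reasoning paths already.

```python
from collections import Counter
from statistics import median


def aggregate(answers):
    """Combine answers from several reasoning paths into one.

    Numeric answers -> median; anything else -> most common value.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    numeric = all(
        isinstance(a, (int, float)) and not isinstance(a, bool) for a in answers
    )
    if numeric:
        return median(answers)
    return Counter(answers).most_common(1)[0][0]


# Five sampled paths (k=5), two of which went wrong.
print(aggregate([27.5, 27.5, 22.5, 27.5, 30.0]))  # 27.5
print(aggregate(["yes", "no", "yes"]))            # yes
```

The median is a reasonable choice for numeric answers because a single wildly wrong path cannot drag the result the way it would with a mean.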
For Symbolic mode: Initialize a JSON state object representing all known entities, locations, and attributes. Process each event/sentence sequentially, updating the JSON state after each step. Show the state transitions explicitly. Extract the answer by querying the final JSON state.
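As an illustration of the update loop, the sketch below replays the apple example from later in this document. The event dictionaries and the `apply_event` helper are hypothetical: they assume each sentence has already been parsed into an actor, a destination, and an optional object, which is the part a model would do in practice.

```python
import copy
import json


def apply_event(state, event):
    """Return a new state with the actor (and any object they carry) moved."""
    new_state = copy.deepcopy(state)
    new_state["locations"][event["actor"]] = event["to"]
    if "object" in event:
        new_state["objects"][event["object"]] = event["to"]
    return new_state


state = {
    "locations": {"John": "unknown", "Mary": "unknown"},
    "objects": {"apple": "unknown"},
}
events = [
    {"actor": "John", "object": "apple", "to": "kitchen"},
    {"actor": "Mary", "object": "apple", "to": "garden"},
    {"actor": "John", "to": "bedroom"},
    {"actor": "Mary", "object": "apple", "to": "bedroom"},
]

trace = [state]
for event in events:
    state = apply_event(state, event)
    trace.append(state)
    print(json.dumps(state))  # show each state transition explicitly

final_answer = state["objects"]["apple"]
print("FINAL_ANSWER:", final_answer)
```

Copying the state on every step keeps the full trace available, so an earlier state can be inspected if the final answer looks wrong.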
For Hybrid mode: First extract all relevant facts as a numbered list. Then identify relationships between facts. Then chain the facts through logical steps. Finally, state the conclusion with a clear final answer.
Extract the answer in a normalized format: strip whitespace, lowercase text answers, apply domain-specific aliases (e.g., "bath" = "bathroom"), and for numeric answers compare with tolerance (epsilon = 1e-9).
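A small sketch of this normalization step, assuming answers arrive as strings or numbers. The alias table holds only the one alias given above, and the function names are hypothetical.

```python
ALIASES = {"bath": "bathroom"}  # only the alias named in the text; extend per domain
EPSILON = 1e-9


def normalize(answer):
    """Strip whitespace, lowercase, then apply domain aliases."""
    text = str(answer).strip().lower()
    return ALIASES.get(text, text)


def answers_match(predicted, expected, eps=EPSILON):
    """Compare numerically when both parse as numbers, else as normalized text."""
    try:
        return abs(float(predicted) - float(expected)) <= eps
    except (TypeError, ValueError):
        return normalize(predicted) == normalize(expected)


print(answers_match("  Bath ", "bathroom"))  # True
print(answers_match("42", 42.0))             # True
```

Note that a value like "$27.50" does not parse as a float, so it falls through to the text comparison; stripping currency symbols first would be a further assumption about the answer format.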
Validate the answer by checking internal consistency: does the final answer match intermediate results? For computational mode, verify the last calculation produces the claimed number. For symbolic mode, confirm the final JSON state is well-formed and the queried field exists. For hybrid mode, confirm the conclusion follows from the extracted facts.
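For symbolic mode, the well-formedness check could look like the sketch below. The helper name `query_final_state` and the choice to return `None` on any failure are assumptions; the repair mechanisms mentioned in the paper reference are not reproduced here.

```python
import json


def query_final_state(raw_state, section, key):
    """Return state[section][key], or None if the JSON is malformed
    or the requested field does not exist."""
    try:
        state = json.loads(raw_state)
    except json.JSONDecodeError:
        return None
    if not isinstance(state, dict):
        return None
    fields = state.get(section)
    if not isinstance(fields, dict):
        return None
    return fields.get(key)


good = '{"locations": {"John": "bedroom"}, "objects": {"apple": "bedroom"}}'
print(query_final_state(good, "objects", "apple"))          # bedroom
print(query_final_state('{"objects": ', "objects", "apple"))  # None
```

A `None` result signals that the answer should not be trusted and the state should be regenerated or repaired before answering.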
Present the result with the reasoning trace visible, the mode used identified, and the final answer clearly marked.
Example 1: Math Word Problem (Computational Mode)
User: "A store sells notebooks for $4 each and pens for $1.50 each. Sarah buys 3 notebooks and 7 pens. She pays with a $50 bill. How much change does she receive?"
Classification: Contains numbers, arithmetic keywords ("buys", "pays", "how much"), no spatial indicators. Route: Computational mode
Approach:
FINAL_ANSWER: $27.50
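The arithmetic behind this answer can be checked step by step. The sketch uses `Decimal` rather than floats to keep currency values exact; the variable names are illustrative.

```python
from decimal import Decimal

notebooks = 3 * Decimal("4.00")    # 3 notebooks at $4 each   -> 12.00
pens = 7 * Decimal("1.50")         # 7 pens at $1.50 each     -> 10.50
total = notebooks + pens           # amount owed              -> 22.50
change = Decimal("50.00") - total  # paid with a $50 bill     -> 27.50

print(f"FINAL_ANSWER: ${change}")  # FINAL_ANSWER: $27.50
```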
Example 2: Entity/Spatial Tracking (Symbolic Mode)
User: "John put the apple in the kitchen. Mary moved the apple to the garden. John went to the bedroom. Mary moved the apple to the bedroom. Where is the apple?"
Classification: Named entities (John, Mary), spatial keywords ("put", "moved", "went to", "kitchen", "garden", "bedroom"), object tracking required. Route: Symbolic mode
Approach:
{"locations": {"John": "unknown", "Mary": "unknown"}, "objects": {"apple": "unknown"}}
{"locations": {"John": "kitchen", "Mary": "unknown"}, "objects": {"apple": "kitchen"}}
{"locations": {"John": "kitchen", "Mary": "garden"}, "objects": {"apple": "garden"}}
{"locations": {"John": "bedroom", "Mary": "garden"}, "objects": {"apple": "garden"}}
{"locations": {"John": "bedroom", "Mary": "bedroom"}, "objects": {"apple": "bedroom"}}
FINAL_ANSWER: bedroom
Example 3: Multi-Hop Inference (Hybrid Mode)
User: "Was the first president of the United States born in the same century as the invention of the steam engine?"
Classification: No arithmetic, no spatial movement, requires chaining multiple facts ("first president" -> birth year, "steam engine" -> invention date, then comparison). Multi-hop indicators present. Route: Hybrid mode
Approach:
FINAL_ANSWER: Yes
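The fact chain can be sketched as below. The dates are assumptions not stated in this document: George Washington's birth year (1732) and Newcomen's engine (1712) as the reference point for "the invention of the steam engine". Choosing a different milestone, such as Savery's 1698 patent, would change the outcome, which is why the extracted facts should be listed explicitly.

```python
def century(year):
    """Century number for a CE year, e.g. 1701-1800 -> 18."""
    return (year - 1) // 100 + 1


# Step 1: extracted facts (assumed values, see note above).
facts = {
    "first_president_birth_year": 1732,  # George Washington
    "steam_engine_year": 1712,           # Newcomen engine (assumed reference)
}

# Steps 2-3: relate each fact to a century, then chain the comparison.
birth_century = century(facts["first_president_birth_year"])
engine_century = century(facts["steam_engine_year"])

# Step 4: conclusion.
final_answer = "Yes" if birth_century == engine_century else "No"
print("FINAL_ANSWER:", final_answer)
```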
Example 4: Hybrid Problem (Math + Spatial)
User: "A delivery driver starts at the warehouse, drives 15 km to Store A, then 8 km to Store B, then 12 km back to the warehouse. Gas costs $1.20 per km. How much did the driver spend on gas?"
Classification: Mathematical (numbers, "costs", "how much") AND spatial ("drives", "to Store A", "back to warehouse"). Route: Hybrid mode (math + spatial)
Approach:
FINAL_ANSWER: $42.00
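This example combines both ideas: the route is tracked as a list of legs (the spatial part) and the cost is computed from the total distance (the math part). A minimal sketch, with illustrative variable names:

```python
from decimal import Decimal

# Spatial part: each leg as (from, to, distance in km).
legs = [
    ("warehouse", "Store A", 15),
    ("Store A", "Store B", 8),
    ("Store B", "warehouse", 12),
]

# Sanity check on the route: each leg starts where the previous one ended.
for (_, end, _), (start, _, _) in zip(legs, legs[1:]):
    assert end == start

# Math part: total distance times cost per km.
total_km = sum(distance for _, _, distance in legs)  # 15 + 8 + 12 = 35
cost = total_km * Decimal("1.20")                    # 35 * 1.20 = 42.00

print(f"FINAL_ANSWER: ${cost}")  # FINAL_ANSWER: $42.00
```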
Paper: Chain of Simulation: A Dual-Mode Reasoning Framework for Large Language Models with Dynamic Problem Routing, by Saeid Sheikhi (2026). See Algorithms 1-4, which detail the classification vector computation, mode selection priority logic, computational flow with self-consistency, and JSON state tracking with repair mechanisms.