skills/canonical-intermediate-representation-llm-based/SKILL.md
Translate natural language optimization problems into executable solver code using a Canonical Intermediate Representation (CIR) schema and multi-agent R2C pipeline. Decomposes operational rules into constraint archetypes and modeling paradigms before generating code. Triggers: "formulate this optimization problem", "write a solver for this scheduling problem", "convert these business rules to constraints", "model this linear program from the description", "generate Gurobi/PuLP code for this OR problem", "help me formulate these operational constraints mathematically".
This skill enables Claude to translate natural language descriptions of optimization problems into executable mathematical programming code by using a structured intermediate representation. Jumping directly from English to solver code tends to make LLMs mishandle composite constraints and choose the wrong modeling paradigm. Instead, Claude first generates a Canonical Intermediate Representation (CIR) that explicitly maps each operational rule to a constraint archetype and modeling paradigm, then instantiates that CIR into solver code. This decouples what a rule means from how it is encoded mathematically, which is intended to improve correctness on complex problems.
The core insight of this approach is that LLMs fail on optimization formulation not because they lack mathematical knowledge, but because they try to do too much in one step: simultaneously understanding the operational rule, choosing a modeling paradigm, and writing code. CIR breaks this into explicit, auditable stages. Each operational rule is first classified into a constraint archetype (assignment, capacity, flow conservation, temporal sequencing, cardinality, indicator/conditional, symmetry-breaking, linking) and paired with a candidate modeling paradigm (binary decision variables, big-M linearization, indicator constraints, piecewise-linear, flow-balance equations, set-covering). This classification is recorded in a structured schema before any code is written.
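As a minimal sketch of what one entry in such a schema could look like, the archetypes and paradigms named above can be written as enumerations and attached to a rule record. The class and field names here (`Archetype`, `Paradigm`, `CIREntry`) are illustrative assumptions, not the schema defined in the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Archetype(Enum):
    """Constraint archetypes listed in the text above."""
    ASSIGNMENT = "assignment"
    CAPACITY = "capacity"
    FLOW_CONSERVATION = "flow conservation"
    TEMPORAL_SEQUENCING = "temporal sequencing"
    CARDINALITY = "cardinality"
    INDICATOR = "indicator/conditional"
    SYMMETRY_BREAKING = "symmetry-breaking"
    LINKING = "linking"


class Paradigm(Enum):
    """Candidate modeling paradigms listed in the text above."""
    BINARY_VARIABLES = "binary decision variables"
    BIG_M = "big-M linearization"
    INDICATOR_CONSTRAINTS = "indicator constraints"
    PIECEWISE_LINEAR = "piecewise-linear"
    FLOW_BALANCE = "flow-balance equations"
    SET_COVERING = "set-covering"


@dataclass(frozen=True)
class CIREntry:
    """One classified rule (hypothetical field names)."""
    rule_id: str
    rule_text: str
    archetype: Archetype
    paradigm: Paradigm
    variables: tuple
    sketch: str  # semi-formal constraint, not yet solver code


entry = CIREntry(
    rule_id="R1",
    rule_text="No night shift followed by a morning shift the next day",
    archetype=Archetype.TEMPORAL_SEQUENCING,
    paradigm=Paradigm.BINARY_VARIABLES,
    variables=("x[n,s,d]",),
    sketch="x[n,N,d] + x[n,M,d+1] <= 1, forall n, d",
)
print(entry.rule_id, entry.archetype.value, "/", entry.paradigm.value)
```

Using enumerations rather than free-text labels means a misspelled or unknown archetype fails when the entry is built, before any code is generated.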
The R2C (Rule-to-Constraint) framework then operates as a multi-agent pipeline with four stages: (1) a Parser Agent extracts sets, parameters, decision variables, objectives, and operational rules from the natural language description; (2) a CIR Synthesizer Agent maps each extracted rule to constraint archetypes and modeling paradigms by retrieving from a knowledge base of known (archetype, paradigm, code-template) triples; (3) a Code Generator Agent instantiates the full optimization model in executable solver code; and (4) a Reflection Agent validates the generated code by checking constraint coverage, dimensional consistency, and solver feasibility, feeding corrections back if needed.
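The control flow of the four stages can be sketched with plain functions standing in for the agents. Everything below is a placeholder under stated assumptions: in a real system each stage would be an LLM call, and the function names (`parse`, `synthesize_cir`, `generate_code`, `reflect`, `r2c`) and the sentence-splitting parser are hypothetical, chosen only to show how the stages hand data to each other and how reflection loops back.

```python
def parse(description):
    """Stage 1 stub: treat each sentence as one candidate operational rule."""
    sentences = [s.strip() for s in description.split(".") if s.strip()]
    return {"rules": {f"R{i}": s for i, s in enumerate(sentences, start=1)}}


def synthesize_cir(parsed):
    """Stage 2 stub: attach placeholder archetype and paradigm labels to each rule."""
    return [
        {"rule_id": rid, "rule": text, "archetype": "unclassified", "paradigm": "unassigned"}
        for rid, text in parsed["rules"].items()
    ]


def generate_code(cir):
    """Stage 3 stub: emit one annotated placeholder line per CIR entry."""
    return "\n".join(f"# {e['rule_id']} ({e['archetype']}): {e['rule']}" for e in cir)


def reflect(cir, code):
    """Stage 4 stub: return the rule IDs that the generated code never mentions."""
    return [e["rule_id"] for e in cir if f"# {e['rule_id']} " not in code]


def r2c(description, max_rounds=3):
    """Run the four stages, regenerating code while reflection reports gaps."""
    parsed = parse(description)
    cir = synthesize_cir(parsed)
    code = generate_code(cir)
    for _ in range(max_rounds):
        missing = reflect(cir, code)
        if not missing:
            break
        # A real pipeline would pass `missing` back to the generator as feedback.
        code = generate_code(cir)
    return cir, code


cir, code = r2c("Each shift needs exactly 5 nurses. No nurse works more than 5 days.")
print(code)
```

The bounded loop reflects the "feeding corrections back if needed" behavior: reflection either accepts the code or triggers another generation round, up to a fixed limit.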
The knowledge base is the backbone: it stores canonical pairings of constraint archetypes with modeling paradigms alongside reusable code templates. When a new rule is encountered, the synthesizer retrieves the closest matching entries and adapts them. This retrieval-augmented approach means the system doesn't reinvent formulations from scratch — it composes proven patterns.
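A toy version of this lookup is sketched below. The source does not say how retrieval is implemented, so the keyword-overlap (Jaccard) scoring, the three knowledge-base entries, and the template strings are all assumptions made for illustration; an actual system might use embeddings or an LLM-based matcher.

```python
# Hypothetical knowledge base of (archetype, paradigm, code-template) triples.
KNOWLEDGE_BASE = [
    {
        "archetype": "capacity",
        "paradigm": "binary decision variables",
        "keywords": {"at", "most", "limit", "capacity", "maximum", "more", "than"},
        "template": "model += pulp.lpSum({terms}) <= {bound}",
    },
    {
        "archetype": "assignment",
        "paradigm": "binary decision variables",
        "keywords": {"exactly", "each", "assigned", "one", "must", "served"},
        "template": "model += pulp.lpSum({terms}) == {count}",
    },
    {
        "archetype": "indicator/conditional",
        "paradigm": "big-M linearization",
        "keywords": {"if", "only", "when", "open", "then", "unless"},
        "template": "model += {expr} <= {big_m} * {binary}",
    },
]


def retrieve(rule_text, k=1):
    """Return the k entries whose keywords overlap most with the rule text."""
    tokens = set(rule_text.lower().replace(",", " ").split())

    def score(entry):
        keywords = entry["keywords"]
        return len(tokens & keywords) / len(tokens | keywords)  # Jaccard similarity

    return sorted(KNOWLEDGE_BASE, key=score, reverse=True)[:k]


best = retrieve("No nurse can work more than 5 days")[0]
print(best["archetype"], "->", best["template"])
```

The retrieved template is a starting point that the synthesizer adapts to the rule's actual variables and bounds, not a finished constraint.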
1. Extract problem components from natural language. Parse the user's description to identify: (a) sets/indices (e.g., nurses, shifts, days), (b) parameters (e.g., demand per shift, max hours), (c) decision variables with their domains (binary, integer, continuous), (d) the objective function (minimize cost, maximize coverage), and (e) each distinct operational rule stated in the problem.
2. Enumerate operational rules explicitly. List every constraint-bearing clause as a separate rule. For example, "Each nurse works at most 5 consecutive days and must have at least 2 weekends off per month" becomes two rules: R1 (consecutive day limit) and R2 (weekend-off minimum).
3. Classify each rule into a constraint archetype. Assign each rule to one of these archetypes: assignment, capacity, flow conservation, temporal sequencing, cardinality, indicator/conditional, symmetry-breaking, or linking.
4. Select a modeling paradigm for each archetype. Choose the mathematical encoding: binary decision variables, big-M linearization, piecewise-linear, flow-balance equations, set-covering, or indicator constraints (for example, Gurobi's addGenConstrIndicator).
5. Construct the CIR table. For each rule, record a structured entry with these columns: Rule ID | Natural Language Rule | Archetype | Paradigm | Variables Involved | Constraint Sketch. The constraint sketch is a semi-formal mathematical expression (not yet code) showing the constraint structure.
6. Validate the CIR for completeness and consistency. Check that every rule from step 2 has a CIR entry, variable domains are consistent across constraints, the objective references only defined variables, and no rule is modeled with a paradigm that contradicts another (e.g., two constraints requiring contradictory variable domains).
7. Generate executable solver code from the CIR. Translate each CIR entry into solver API calls. Use PuLP by default (most accessible), Gurobi for performance-critical problems, or OR-Tools if the user specifies. Structure the code as: (a) import solver, (b) define model, (c) create variables, (d) add constraints grouped by CIR rule ID, (e) set objective, (f) solve, (g) extract and print results.
8. Add constraint annotations. Comment each constraint block with its Rule ID, natural language rule, and archetype so the code is auditable and the user can trace each line back to the original problem statement.
9. Run the reflection check. Verify the generated code by mentally executing it and asking: Are all indices bounded? Do big-M values have justified bounds (not an arbitrary 1e6)? Are binary variables correctly typed? Does the objective direction match the problem (min vs max)? Are there off-by-one errors in temporal constraints?
10. Present the CIR table and code together. Always show the user the intermediate CIR table before or alongside the code, so they can validate the modeling decisions independently of the code.
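The table-building, validation, and presentation steps above can be sketched with plain dictionaries. This is an illustrative outline only: the helper names (`missing_rules`, `incomplete_rows`, `render`) and the work indicator `w[n,d]` are assumptions, and the example deliberately leaves one rule unclassified so the completeness check has something to report.

```python
CIR_COLUMNS = ["Rule ID", "Natural Language Rule", "Archetype", "Paradigm",
               "Variables Involved", "Constraint Sketch"]

# Rules enumerated from the problem statement.
rules = {
    "R1": "Each nurse works at most 5 consecutive days",
    "R2": "Each nurse has at least 2 weekends off per month",
}

# CIR rows built so far; R2 has intentionally not been classified yet.
cir = [
    {
        "Rule ID": "R1",
        "Natural Language Rule": rules["R1"],
        "Archetype": "temporal sequencing",
        "Paradigm": "binary decision variables",
        "Variables Involved": "w[n,d]",  # hypothetical: 1 if nurse n works on day d
        # Any window of 6 consecutive days contains at most 5 worked days.
        "Constraint Sketch": "sum_{t=d..d+5} w[n,t] <= 5, forall n, d",
    },
]


def missing_rules(rules, cir):
    """Completeness: rule IDs that have no CIR row."""
    covered = {row["Rule ID"] for row in cir}
    return sorted(set(rules) - covered)


def incomplete_rows(cir):
    """Consistency: rule IDs whose row has an empty or absent column."""
    return [row.get("Rule ID", "?") for row in cir
            if any(not row.get(col) for col in CIR_COLUMNS)]


def render(cir):
    """Format the CIR as pipe-delimited text to show alongside the code."""
    lines = [" | ".join(CIR_COLUMNS)]
    lines += [" | ".join(row[col] for col in CIR_COLUMNS) for row in cir]
    return "\n".join(lines)


print(render(cir))
print("Rules without a CIR entry:", missing_rules(rules, cir))
```

Code generation should proceed only once both checks come back empty, so that every enumerated rule is traceable to a fully specified row.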
Example 1: Nurse Scheduling Problem
User: "I need to schedule 20 nurses across 3 shifts (morning, evening, night) over 7 days. Each shift needs exactly 5 nurses. No nurse can work more than 5 days. No nurse should work night shift followed by morning shift the next day. Minimize the total number of night shifts assigned."
Approach:
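| Rule ID | Natural Language Rule | Archetype | Paradigm | Variables Involved | Constraint Sketch |
|---|---|---|---|---|---|
| R1 | "Each shift needs exactly 5 nurses" | Assignment | Set-partitioning | x[n,s,d] | sum_n x[n,s,d] = 5, forall s,d |
| R2 | "No nurse works more than 5 days" | Capacity | Binary sum bound | x[n,s,d] | sum_{s,d} x[n,s,d] <= 5, forall n (this counts shifts; it equals days worked only together with the auxiliary bound sum_s x[n,s,d] <= 1, forall n,d) |
| R3 | "No night then morning next day" | Temporal | Binary linking | x[n,s,d] | x[n,N,d] + x[n,M,d+1] <= 1, forall n, d<7 |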
import pulp

nurses = range(20)
shifts = ['M', 'E', 'N']
days = range(7)

model = pulp.LpProblem("NurseScheduling", pulp.LpMinimize)

# Decision variables: x[n][s][d] = 1 if nurse n works shift s on day d
x = pulp.LpVariable.dicts("x", (nurses, shifts, days), cat='Binary')

# Objective: minimize total night shifts
model += pulp.lpSum(x[n]['N'][d] for n in nurses for d in days)

# R1 - Assignment: exactly 5 nurses per shift per day
for s in shifts:
    for d in days:
        model += pulp.lpSum(x[n][s][d] for n in nurses) == 5

# R2 - Capacity: each nurse works at most 5 days
for n in nurses:
    model += pulp.lpSum(x[n][s][d] for s in shifts for d in days) <= 5

# R2 auxiliary - at most one shift per nurse per day
for n in nurses:
    for d in days:
        model += pulp.lpSum(x[n][s][d] for s in shifts) <= 1

# R3 - Temporal: no night shift followed by morning shift next day
for n in nurses:
    for d in range(6):
        model += x[n]['N'][d] + x[n]['M'][d + 1] <= 1

model.solve()

# Report schedules only when the solver found an optimal solution. As stated,
# R1 needs 3 * 7 * 5 = 105 nurse-shifts while R2 allows at most 20 * 5 = 100,
# so expect "Infeasible" until one of those inputs is relaxed.
status = pulp.LpStatus[model.status]
print(f"Status: {status}")
if status == "Optimal":
    for n in nurses:
        schedule = [s for d in days for s in shifts if x[n][s][d].varValue > 0.5]
        print(f"Nurse {n}: {schedule}")
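The reflection check on solver feasibility can start with simple counting before any solve. The sketch below uses only the numbers from the problem statement (the variable names are illustrative) and compares the nurse-shifts that R1 requires with the most that R2 permits. Passing this check is necessary for feasibility but not sufficient.

```python
# Data taken from the problem statement in Example 1.
num_nurses, num_shifts, num_days = 20, 3, 7
nurses_per_shift = 5        # R1: exactly 5 nurses on every shift of every day
max_shifts_per_nurse = 5    # R2: at most 5 working days, one shift per day

required = num_shifts * num_days * nurses_per_shift   # nurse-shifts R1 demands
available = num_nurses * max_shifts_per_nurse          # nurse-shifts R2 allows
feasible = required <= available

print(f"required={required}, available={available}, passes_count_check={feasible}")
```

With the data as stated, the requirement (105) exceeds the limit (100), so the model cannot be feasible until an input is relaxed, for example by adding nurses or raising the per-nurse limit. Which relaxation is appropriate is a question for the user, not something the formulation should decide silently.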
Example 2: Facility Location with Conditional Capacity
User: "I have 10 candidate warehouse locations and 50 customers. Each warehouse has a fixed opening cost and a capacity. Each customer must be served by exactly one open warehouse. If a warehouse is open, it can serve at most its capacity. Minimize total fixed cost plus transportation cost."
Approach:
| Rule ID | Natural Language Rule | Archetype | Paradigm | Variables Involved | Constraint Sketch |
|---|---|---|---|---|---|
| R1 | "Each customer served by exactly one" | Assignment | Set-partitioning | x[w,c] | sum_w x[w,c] = 1, forall c |
| R2 | "Only open warehouses can serve" | Indicator | Binary linking | x[w,c], y[w] | x[w,c] <= y[w], forall w,c |
| R3 | "Capacity limit if open" | Capacity | Binary linking | x[w,c], y[w] | sum_c demand[c]*x[w,c] <= capacity[w]*y[w], forall w |
Note: when every demand[c] is positive, R3 subsumes R2 (if y[w]=0, the right-hand side of R3 is 0, so no customer can be assigned). Keeping R2 explicitly still tightens the LP relaxation, which is good practice.
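import pulp

warehouses = range(10)
customers = range(50)

# Placeholder data for illustration only; replace with values loaded from real inputs.
fixed_cost = {w: 100 + 10 * w for w in warehouses}
capacity = {w: 40 for w in warehouses}
demand = {c: 1 + c % 5 for c in customers}
transport_cost = {w: {c: 1 + abs(5 * w - c) for c in customers} for w in warehouses}

model = pulp.LpProblem("FacilityLocation", pulp.LpMinimize)

# y[w] = 1 if warehouse w is opened; x[w][c] = 1 if warehouse w serves customer c
y = pulp.LpVariable.dicts("open", warehouses, cat='Binary')
x = pulp.LpVariable.dicts("assign", (warehouses, customers), cat='Binary')

# Objective: fixed opening cost plus transportation cost
model += (pulp.lpSum(fixed_cost[w] * y[w] for w in warehouses) +
          pulp.lpSum(transport_cost[w][c] * x[w][c] for w in warehouses for c in customers))

# R1 - Assignment: each customer to exactly one warehouse
for c in customers:
    model += pulp.lpSum(x[w][c] for w in warehouses) == 1

# R2 - Indicator: assignment only if open
for w in warehouses:
    for c in customers:
        model += x[w][c] <= y[w]

# R3 - Capacity: respect warehouse capacity
for w in warehouses:
    model += pulp.lpSum(demand[c] * x[w][c] for c in customers) <= capacity[w] * y[w]

model.solve()

status = pulp.LpStatus[model.status]
print(f"Status: {status}")
if status == "Optimal":
    print("Open warehouses:", [w for w in warehouses if y[w].varValue > 0.5])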
Example 3: Vehicle Routing with Time Windows (Partial)
User: "Each delivery truck starts and ends at the depot. Every customer must be visited exactly once. Each truck has capacity 100. Customers have time windows [a_i, b_i]. Minimize total travel distance."
CIR Table (key constraints only):
| Rule ID | Natural Language Rule | Archetype | Paradigm | Variables Involved | Constraint Sketch |
|---|---|---|---|---|---|
| R1 | "Every customer visited once" | Assignment | Set-partitioning | x[i,j,k] | sum_{j,k} x[i,j,k] = 1, forall customer i |
| R2 | "Truck starts/ends at depot" | Flow conservation | Flow-balance | x[i,j,k] | sum_j x[0,j,k] = 1, sum_j x[j,0,k] = 1, forall k |
| R3 | "Flow conservation at customers" | Flow conservation | Flow-balance | x[i,j,k] | sum_j x[j,i,k] = sum_j x[i,j,k], forall i,k |
| R4 | "Truck capacity" | Capacity | Binary sum bound | x[i,j,k] | sum_i demand[i]*sum_j x[i,j,k] <= 100, forall k |
| R5 | "Time windows" | Temporal | Big-M linking | t[i], x[i,j,k] | t[i] + travel[i,j] - M*(1-x[i,j,k]) <= t[j], a[i] <= t[i] <= b[i] |
The Big-M for R5 should be set per arc as M[i,j] = max(0, b[i] + travel[i,j] - a[j]), the smallest value that makes the constraint redundant when x[i,j,k] = 0. Never use an arbitrary large number.
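A small sketch of that per-arc computation follows. The time windows and travel times are made-up illustrative values, node 0 is assumed to be the depot, and arcs returning to the depot are skipped on the assumption that t[0] models the departure time only.

```python
# Illustrative data (not from the source): node 0 is the depot.
a = {0: 0, 1: 10, 2: 30}      # earliest service start a[i]
b = {0: 100, 1: 40, 2: 60}    # latest service start b[i]
travel = {
    (0, 1): 12, (0, 2): 25,
    (1, 2): 15, (2, 1): 15,
    (1, 0): 12, (2, 0): 25,
}


def big_m(i, j):
    """Smallest M that relaxes t[i] + travel[i,j] - M <= t[j] when the arc is unused.

    The left side is at most b[i] + travel[i,j] and t[j] is at least a[j],
    so any M >= b[i] + travel[i,j] - a[j] suffices; 0 is enough if that is negative.
    """
    return max(0, b[i] + travel[i, j] - a[j])


# One M per arc into a customer node.
M = {(i, j): big_m(i, j) for (i, j) in travel if j != 0}

for arc, value in sorted(M.items()):
    print(arc, value)
```

Arc-specific values keep the LP relaxation tighter than a single global constant and avoid the numerical trouble that very large coefficients can cause in solvers.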
Paper: Lyu et al., "Canonical Intermediate Representation for LLM-based optimization problem formulation and code generation" (arXiv:2602.02029, 2026). Look for: the CIR schema definition (Section 3), constraint archetype taxonomy (Table 2), modeling paradigm catalog (Table 3), and the R2C pipeline architecture (Figure 2).