skills/chart-specification-structural-representations/SKILL.md
Generate high-fidelity plotting code from chart images or descriptions using structured intermediate specifications. Decomposes charts into semantic topology (type, coordinates, domains, series) and runtime numerical facts before producing code, preventing hallucinated data and structural errors.

Trigger phrases:
- "convert this chart image to code"
- "recreate this plot in matplotlib"
- "generate plotting code from this chart"
- "reproduce this visualization programmatically"
- "write code that matches this chart exactly"
- "chart to code"
npx skillsauth add ndpvt-web/arxiv-claude-skills chart-specification-structural-representations

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill enables Claude to generate structurally faithful plotting code from chart images or descriptions by first constructing a Chart Specification -- a structured JSON intermediate representation that decomposes a chart into its semantic topology and numerical primitives. Instead of attempting to produce code directly (which leads to hallucinated values and wrong chart types), Claude first reasons about the chart's structure declaratively, then generates code grounded in that verified specification. This approach, from He et al. (2026), achieves up to 61.7% improvement over direct generation on complex chart benchmarks.
The core problem: Direct chart-to-code generation encourages surface-level token imitation. A VLM sees a chart image and tries to write code that "looks right" token by token, but frequently hallucinates data values, misidentifies chart types, drops series, or produces wrong axis ranges. The code may execute without errors yet render a chart that is structurally wrong.
The solution -- Chart Specification: Before writing any code, decompose the chart into a structured
representation S = <S_sem, S_code>. The semantic component S_sem captures declarative intent:
chart type/family, panel layout, coordinate system (cartesian, polar, 3D), axis domains (ranges and
categories), series/legend entries, and analytic representations (functional forms like curve
equations). The code-level component S_code captures runtime numerical facts that are only
available through execution: computed wedge proportions in pie charts, statistical quartiles in box
plots, node-edge adjacency in graph visualizations, and interpolated curve sample points.
Why it works: By forcing explicit reasoning about structure before code generation, errors become detectable at the specification level. A wrong chart type is caught before any code is written. A missing series is flagged in the specification. Axis range mismatches show up as domain IoU failures. This "specify first, code second" pipeline converts a fuzzy generation problem into a structured verification problem, and is the key insight practitioners can apply even without the paper's full RL training pipeline.
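The two spec-level checks named above (domain IoU for axis ranges, set overlap for series) can be sketched in a few lines. This is a minimal illustration using the standard interval-IoU and Jaccard definitions; the source does not give the paper's exact metric formulas, so the function names `interval_iou` and `label_jaccard` and the sample readings are assumptions.

```python
def interval_iou(a, b):
    """IoU of two closed numeric intervals, each given as (min, max)."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    # For overlapping intervals the hull equals the union; for disjoint
    # intervals the intersection is 0, so the result is 0 either way.
    hull = max(a[1], b[1]) - min(a[0], b[0])
    return intersection / hull if hull > 0 else 0.0


def label_jaccard(spec_labels, legend_labels):
    """Jaccard overlap between two collections of series labels."""
    spec_set, legend_set = set(spec_labels), set(legend_labels)
    if not spec_set and not legend_set:
        return 1.0  # two empty legends agree trivially
    return len(spec_set & legend_set) / len(spec_set | legend_set)


# Hypothetical readings: the spec says y spans [0, 120], the image shows [0, 100].
domain_iou = interval_iou((0, 120), (0, 100))

# Hypothetical legend containing one series that the spec missed.
series_overlap = label_jaccard(
    ["Product A", "Product B"],
    ["Product A", "Product B", "Product C"],
)

print(f"domain IoU = {domain_iou:.3f}, series Jaccard = {series_overlap:.3f}")
```

A score below 1.0 on either check points at a specific spec field to re-examine before any plotting code is written.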
1. Analyze the chart input. Examine the chart image or description. Identify what is being visualized at the highest level: is it a comparison, distribution, composition, relationship, or temporal trend?
2. Extract global topology. Determine the chart family (bar, line, scatter, pie, box, heatmap, etc.), the number of panels/subplots, and the layout grid (e.g., 1x2, 2x2). Record whether panels share axes or are independent.
3. Identify coordinate systems. For each panel, classify the coordinate space: Cartesian (x-y), polar (theta-r), 3D (x-y-z), or geographic (lat-lon). Note any axis inversions or log scales.
4. Extract data domains. For each axis, record the range (min, max) for numerical axes or the complete list of categories for categorical axes. For shared axes across panels, note the unified domain.
5. Enumerate data series and visual encodings. List every distinct data series by its legend label. For each series, record the visual encoding (line, bar, marker, area), the color, and any distinguishing style (dashed, hatched, marker shape). Compute the Jaccard overlap between the listed series labels and the legend entries to verify completeness.
6. Extract numerical data or functional forms. For data-driven charts, extract the actual values (or best estimates from the image). For function-driven charts (curves, density plots), identify the analytic expression (e.g., `y = np.exp(-(x-mu)**2 / (2*sigma**2))`). For statistical charts, extract the summary statistics (median, quartiles, whisker extents).
7. Assemble the Chart Specification as structured JSON. Organize all extracted information into the specification format (see template below). This is the verification checkpoint: review the spec for internal consistency before proceeding.
8. Generate plotting code grounded in the specification. Write matplotlib/plotly/seaborn code that directly implements each field of the specification. Every axis range, series, color, and layout parameter must trace back to a spec field. Do not add visual elements that are not in the spec.
9. Validate structural alignment. Cross-check the generated code against the specification: Does the code create the right number of subplots? Are all series present? Do axis limits match the domain ranges? Are coordinate systems correct?
10. Present the specification and code together. Show the user both the structured spec (for verification) and the executable code. This lets them catch structural errors at the spec level rather than debugging rendered output.
```json
{
  "topology": {
    "chart_family": "grouped_bar",
    "num_panels": 1,
    "layout": [1, 1]
  },
  "panels": [
    {
      "coordinate_system": "cartesian",
      "x_axis": {
        "type": "categorical",
        "categories": ["Q1", "Q2", "Q3", "Q4"],
        "label": "Quarter"
      },
      "y_axis": {
        "type": "numerical",
        "domain": [0, 120],
        "label": "Revenue ($M)"
      },
      "series": [
        {
          "label": "Product A",
          "encoding": "bar",
          "color": "#4C78A8",
          "values": [45, 62, 78, 95]
        },
        {
          "label": "Product B",
          "encoding": "bar",
          "color": "#F58518",
          "values": [38, 55, 70, 110]
        }
      ]
    }
  ],
  "annotations": [],
  "title": "Quarterly Revenue by Product"
}
```
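The internal-consistency review of a spec like the template above can be partly automated. The sketch below is one possible set of checks, not a validator defined by the source: the helper name `check_spec` and the specific rules (panel count vs. layout, unique series labels, value count vs. categories, values inside the y domain) are assumptions chosen for illustration.

```python
import copy


def check_spec(spec):
    """Return a list of consistency problems found in a chart spec (empty if none)."""
    problems = []
    topology = spec["topology"]
    panels = spec["panels"]
    rows, cols = topology["layout"]

    if topology["num_panels"] != len(panels):
        problems.append("num_panels does not match the number of panel entries")
    if len(panels) > rows * cols:
        problems.append("layout grid is too small for the panels")

    for i, panel in enumerate(panels):
        series = panel.get("series", [])
        labels = [s["label"] for s in series]
        if len(labels) != len(set(labels)):
            problems.append(f"panel {i}: duplicate series labels")

        categories = panel.get("x_axis", {}).get("categories")
        domain = panel.get("y_axis", {}).get("domain")
        for s in series:
            values = s.get("values")
            if values is None:
                continue  # e.g. pie slices or fill_between ranges
            if categories is not None and len(values) != len(categories):
                problems.append(
                    f"panel {i}, {s['label']}: {len(values)} values "
                    f"for {len(categories)} categories"
                )
            if domain is not None and any(not domain[0] <= v <= domain[1] for v in values):
                problems.append(f"panel {i}, {s['label']}: value outside y domain {domain}")
    return problems


# The template spec, abbreviated to the fields the checks read.
spec = {
    "topology": {"chart_family": "grouped_bar", "num_panels": 1, "layout": [1, 1]},
    "panels": [{
        "coordinate_system": "cartesian",
        "x_axis": {"type": "categorical", "categories": ["Q1", "Q2", "Q3", "Q4"]},
        "y_axis": {"type": "numerical", "domain": [0, 120]},
        "series": [
            {"label": "Product A", "encoding": "bar", "values": [45, 62, 78, 95]},
            {"label": "Product B", "encoding": "bar", "values": [38, 55, 70, 110]},
        ],
    }],
}
problems_ok = check_spec(spec)

# Deliberately broken copy: one series loses a value, another leaves the domain.
broken = copy.deepcopy(spec)
broken["panels"][0]["series"][0]["values"] = [45, 62, 78]
broken["panels"][0]["series"][1]["values"][3] = 130
problems_broken = check_spec(broken)

print(problems_ok)
print(problems_broken)
```

Returning a list of messages rather than raising on the first failure keeps every structural problem visible in one pass, which fits the "verify the spec, then write code" workflow.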
Example 1: Grouped bar chart from image
User: "Here's a screenshot of a bar chart comparing test scores across three subjects for two classes. Recreate it in matplotlib."
Approach:
Output specification:
```json
{
  "topology": { "chart_family": "grouped_bar", "num_panels": 1, "layout": [1, 1] },
  "panels": [{
    "coordinate_system": "cartesian",
    "x_axis": { "type": "categorical", "categories": ["Math", "Science", "English"], "label": "Subject" },
    "y_axis": { "type": "numerical", "domain": [0, 100], "label": "Average Score" },
    "series": [
      { "label": "Class A", "encoding": "bar", "color": "#1f77b4", "values": [82, 75, 90] },
      { "label": "Class B", "encoding": "bar", "color": "#ff7f0e", "values": [78, 88, 85] }
    ]
  }],
  "title": "Test Scores by Subject"
}
```
Output code:
```python
import matplotlib.pyplot as plt
import numpy as np

# Data taken directly from the specification's series values
categories = ["Math", "Science", "English"]
class_a = [82, 75, 90]
class_b = [78, 88, 85]

x = np.arange(len(categories))  # one group position per category
width = 0.35                    # width of each bar within a group

fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(x - width/2, class_a, width, label="Class A", color="#1f77b4")
ax.bar(x + width/2, class_b, width, label="Class B", color="#ff7f0e")

# Axis labels, title, categories, and y domain all trace back to spec fields
ax.set_xlabel("Subject")
ax.set_ylabel("Average Score")
ax.set_title("Test Scores by Subject")
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.set_ylim(0, 100)
ax.legend()

plt.tight_layout()
plt.savefig("chart.png", dpi=150)
plt.show()
```
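The structural-alignment step can also be run against the rendered figure instead of by reading the code. The sketch below inspects a matplotlib figure for Example 1 and compares it with the spec. The helper `check_alignment` is hypothetical, and it only covers the fields a simple Cartesian bar chart exposes (panel count, y-limits, tick labels, bar heights per labeled series).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Example 1 spec, abbreviated to the fields the check reads.
spec = {
    "topology": {"chart_family": "grouped_bar", "num_panels": 1, "layout": [1, 1]},
    "panels": [{
        "x_axis": {"categories": ["Math", "Science", "English"]},
        "y_axis": {"domain": [0, 100]},
        "series": [
            {"label": "Class A", "values": [82, 75, 90]},
            {"label": "Class B", "values": [78, 88, 85]},
        ],
    }],
}


def check_alignment(fig, spec):
    """Compare a rendered bar-chart figure with its spec; return a list of problems."""
    problems = []
    fig.canvas.draw()  # ensure tick label text is populated
    if len(fig.axes) != spec["topology"]["num_panels"]:
        problems.append("panel count differs from spec")
    for ax, panel in zip(fig.axes, spec["panels"]):
        if not np.allclose(ax.get_ylim(), panel["y_axis"]["domain"]):
            problems.append("y-limits differ from spec domain")
        ticks = [t.get_text() for t in ax.get_xticklabels()]
        if ticks != panel["x_axis"]["categories"]:
            problems.append("x tick labels differ from spec categories")
        # Each ax.bar(..., label=...) call yields one BarContainer with that label.
        heights = {c.get_label(): [p.get_height() for p in c.patches]
                   for c in ax.containers}
        for s in panel["series"]:
            got = heights.get(s["label"])
            if got is None:
                problems.append(f"series {s['label']!r} missing from figure")
            elif len(got) != len(s["values"]) or not np.allclose(got, s["values"]):
                problems.append(f"series {s['label']!r} values differ from spec")
    return problems


# Render Example 1 (styling omitted; only structure matters for the check).
categories = spec["panels"][0]["x_axis"]["categories"]
x = np.arange(len(categories))
width = 0.35
fig, ax = plt.subplots()
ax.bar(x - width / 2, [82, 75, 90], width, label="Class A")
ax.bar(x + width / 2, [78, 88, 85], width, label="Class B")
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.set_ylim(0, 100)
ax.legend()

problems_ok = check_alignment(fig, spec)

ax.set_ylim(0, 120)  # simulate code that drifted away from the spec
problems_drift = check_alignment(fig, spec)

print(problems_ok)
print(problems_drift)
```

Checking the rendered artists rather than the source text catches the case the document warns about: code that executes cleanly but draws something the spec never described.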
Example 2: Multi-panel chart with mixed types
User: "Generate code for a 1x2 subplot layout. Left panel: line chart of temperature over 12 months with min/max shaded range. Right panel: pie chart of energy source breakdown."
Approach: create the 1x2 layout with `plt.subplots(1, 2)`.
Output specification:
```json
{
  "topology": { "chart_family": "composite", "num_panels": 2, "layout": [1, 2] },
  "panels": [
    {
      "coordinate_system": "cartesian",
      "x_axis": { "type": "categorical", "categories": ["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"] },
      "y_axis": { "type": "numerical", "domain": [-5, 35], "label": "Temperature (C)" },
      "series": [
        { "label": "Mean Temp", "encoding": "line", "color": "#e45756", "values": [2,3,8,14,19,24,27,26,21,14,8,3] },
        { "label": "Min-Max Range", "encoding": "fill_between", "color": "#e4575640",
          "min_values": [-3,-1,3,8,13,18,21,20,15,8,3,-1],
          "max_values": [7,8,13,20,25,30,33,32,27,20,13,7] }
      ]
    },
    {
      "coordinate_system": "polar",
      "series": [
        { "label": "Solar", "value": 0.30, "color": "#f58518" },
        { "label": "Wind", "value": 0.25, "color": "#4c78a8" },
        { "label": "Gas", "value": 0.28, "color": "#72b7b2" },
        { "label": "Nuclear", "value": 0.17, "color": "#b279a2" }
      ]
    }
  ],
  "title": "Climate and Energy Overview"
}
```
Output code:
```python
import matplotlib.pyplot as plt
import numpy as np

# Left-panel data (from the spec's first panel)
months = ["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"]
mean_temp = [2,3,8,14,19,24,27,26,21,14,8,3]
min_temp = [-3,-1,3,8,13,18,21,20,15,8,3,-1]
max_temp = [7,8,13,20,25,30,33,32,27,20,13,7]

# Right-panel data (from the spec's second panel)
energy_labels = ["Solar", "Wind", "Gas", "Nuclear"]
energy_values = [0.30, 0.25, 0.28, 0.17]
energy_colors = ["#f58518", "#4c78a8", "#72b7b2", "#b279a2"]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

# Left panel: temperature line with shaded range
x = np.arange(len(months))
ax1.plot(x, mean_temp, color="#e45756", linewidth=2, label="Mean Temp")
ax1.fill_between(x, min_temp, max_temp, color="#e45756", alpha=0.25, label="Min-Max Range")
ax1.set_xticks(x)
ax1.set_xticklabels(months, rotation=45)
ax1.set_ylabel("Temperature (C)")
ax1.set_ylim(-5, 35)
ax1.legend()

# Right panel: pie chart
ax2.pie(energy_values, labels=energy_labels, colors=energy_colors, autopct="%1.0f%%", startangle=90)
ax2.set_aspect("equal")

fig.suptitle("Climate and Energy Overview")
plt.tight_layout()
plt.savefig("chart.png", dpi=150)
plt.show()
```
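Wedge proportions are one of the runtime numerical facts that the code-level component of the spec is meant to capture: they only exist once the pie has been drawn. As a rough sketch, the rendered wedges from the right panel can be read back and compared with the spec values. The tolerance and the variable names here are assumptions, and the wedges are collected from `ax.patches` rather than from the return value of `pie`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.patches import Wedge

# Pie series from the Example 2 spec (colors omitted; only proportions are checked).
spec_series = [
    {"label": "Solar", "value": 0.30},
    {"label": "Wind", "value": 0.25},
    {"label": "Gas", "value": 0.28},
    {"label": "Nuclear", "value": 0.17},
]

fig, ax = plt.subplots()
ax.pie(
    [s["value"] for s in spec_series],
    labels=[s["label"] for s in spec_series],
    startangle=90,
)

# Each wedge spans theta1..theta2 in degrees; its share of the pie is span / 360.
rendered = {
    w.get_label(): (w.theta2 - w.theta1) / 360.0
    for w in ax.patches
    if isinstance(w, Wedge)
}

# Labels whose rendered share deviates from the spec by more than the tolerance.
tolerance = 1e-3  # assumed tolerance for this sketch
mismatches = [
    s["label"]
    for s in spec_series
    if abs(rendered.get(s["label"], -1.0) - s["value"]) > tolerance
]

print(rendered)
print(mismatches)
```

A missing label shows up as a mismatch because the lookup falls back to `-1.0`, so dropped slices and wrong proportions are reported through the same list.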
Example 3: Correcting structurally wrong generated code
User: "I asked another tool to generate a box plot from this image, but it produced a bar chart instead. Can you fix it?"
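Approach: regenerate the plot with `ax.boxplot()` or `ax.bxp()`, then confirm that the chart family in the spec (`box`) and the plotting call in the code (`boxplot`) are aligned.
The specification catches the error at the topology level (`"chart_family": "box"` vs the incorrect `"bar"`), making the fix systematic rather than guesswork.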
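When only summary statistics can be read off the image, `ax.bxp()` draws the boxes directly from them, which suits the Example 3 fix. The sketch below uses made-up statistics for two hypothetical groups (the original image's values are not given in the source) and then confirms that the result contains box artists and no bar containers.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical summary statistics, as they might be recorded in the spec.
box_stats = [
    {"label": "Group A", "whislo": 12, "q1": 20, "med": 27, "q3": 35, "whishi": 48, "fliers": []},
    {"label": "Group B", "whislo": 18, "q1": 25, "med": 33, "q3": 41, "whishi": 55, "fliers": []},
]

fig, ax = plt.subplots()
# bxp draws from precomputed statistics; boxplot would compute them from raw data.
artists = ax.bxp(box_stats, showfliers=False)
ax.set_ylabel("Value")

# Read the medians back from the rendered artists and compare with the spec.
rendered_medians = [float(line.get_ydata()[0]) for line in artists["medians"]]
spec_medians = [float(s["med"]) for s in box_stats]

# A bar chart would leave BarContainer objects on the axes; a box plot does not.
has_bar_containers = len(ax.containers) > 0

print(rendered_medians == spec_medians, has_bar_containers)
```

`ax.boxplot()` is the better choice when the raw samples are available, since it computes the quartiles itself; `ax.bxp()` fits the chart-to-code case where only the drawn statistics can be recovered.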
`plt.pie()`.

| Problem | Detection | Resolution |
|---------|-----------|------------|
| Wrong chart type identified | Topology field contradicts visual evidence | Re-examine image; check for dual-axis or composite types that look like simpler charts |
| Missing data series | Series count in spec < legend entries in image | Enumerate legend items explicitly; compare spec series labels to image legend |
| Axis range mismatch | Domain IoU between spec and image is low | Read axis tick labels carefully; account for padding matplotlib adds beyond data range |
| Code executes but renders wrong | Spec-code cross-check reveals divergence | Walk through spec fields one by one, verify each has a corresponding code statement |
| Cannot read exact values from image | Data values marked as approximate | State the uncertainty; offer to generate code with placeholder data the user can fill in |
| Composite/unusual chart type | No matching chart_family in standard taxonomy | Decompose into constituent primitives (e.g., a lollipop chart = scatter + vlines) |
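The lollipop decomposition mentioned in the last table row can be sketched as two primitives drawn on the same axes: `vlines` for the stems and `scatter` for the heads. The categories and values are placeholder data invented for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data, not taken from any chart in this document.
categories = ["A", "B", "C", "D"]
values = [3.0, 7.5, 5.2, 9.1]
x = np.arange(len(categories))

fig, ax = plt.subplots()

# Primitive 1: one vertical stem per category, from the baseline up to the value.
stems = ax.vlines(x, 0, values, color="#4c78a8", linewidth=2)

# Primitive 2: a marker at the top of each stem, drawn above the stems.
heads = ax.scatter(x, values, color="#4c78a8", s=60, zorder=3)

ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.set_ylim(0, 10)

print(len(stems.get_segments()), heads.get_offsets().shape)
```

In the spec, such a chart could be recorded as two series entries sharing one set of values, one with a stem encoding and one with a marker encoding, so that each primitive still traces back to a spec field.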
`fill_between`) map differently across matplotlib, plotly, and seaborn.

He, M., Dai, M., Zhang, J., Liu, Y., & Tao, S. (2026). Chart Specification: Structural Representations for Incentivizing VLM Reasoning in Chart-to-Code Generation. arXiv:2602.10880
Key takeaway: structured intermediate representations with hierarchical verification (topology before semantics before numerical details) dramatically improve chart-to-code fidelity, even with minimal training data.