skills/image-to-svg/SKILL.md
Convert raster images (photos, illustrations, AI-generated art) into high-quality SVG recreations. Breaks the image into isolated features, builds each as a standalone SVG layer, then composites them. Use when the user wants to recreate an image as SVG, create vector versions of artwork, or extract specific elements from images as scalable graphics.
npx skillsauth add shhac/skills image-to-svg
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Recreate raster images as high-quality SVGs by decomposing, studying, and rebuilding each visual element independently.
Never try to reproduce the whole image at once. The quality comes from isolating each feature, studying it closely against a cropped reference, and building it as a standalone SVG before compositing.
Correctness over speed. Every shortcut in this workflow compounds into visible quality loss in the final output. Batching crop verification, skipping programmatic checks, eyeballing coordinates instead of measuring, settling for "looks about right" instead of running the diff — each saves a minute but costs ten in rework or produces a visibly worse result. The value of this skill is in the output quality. Take the time to verify at every step.
You are converting a raster image into an SVG recreation. Follow the phases below in order.
This skill uses incremental discovery — reference files live in subdirectories adjacent to this skill (analysis/, features/, styles/, workflow/). Read them only when a specific phase or condition calls for them. Do not read all reference files upfront.
Read workflow/workflow-dependencies.md and run the dependency check script. Ensure required tools (magick, rsvg-convert, xmllint) are available. For optional tools (vtracer, svgo), check availability and note which enhancements are possible.
If vtracer is not installed and Python is available, ask the user where they'd like the virtual environment before creating one. See the dependencies file for the three options (project-local, shared skill venv, user-specified).
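The dependency check described above can be sketched as follows. This is a minimal illustration, not the script shipped in workflow/workflow-dependencies.md: the function name `check_tools` and the returned dictionary shape are hypothetical, and it assumes each tool is discoverable as an executable on `PATH` (vtracer may instead be installed as a Python package inside a virtual environment, which this sketch does not detect).

```python
import shutil
from typing import Callable, Dict, Optional

REQUIRED = ("magick", "rsvg-convert", "xmllint")
OPTIONAL = ("vtracer", "svgo")


def check_tools(which: Callable[[str], Optional[str]] = shutil.which) -> Dict[str, object]:
    """Report missing required tools and available optional ones.

    `which` is injectable so the check can be exercised without the tools installed.
    """
    missing_required = [tool for tool in REQUIRED if which(tool) is None]
    available_optional = [tool for tool in OPTIONAL if which(tool) is not None]
    return {
        "ok": not missing_required,
        "missing_required": missing_required,
        "available_optional": available_optional,
    }


if __name__ == "__main__":
    print(check_tools())
```

Stopping when `ok` is false, and only noting the optional list, matches the required/optional split above.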
Initial analysis — these are independent. Delegate to parallel subagents so each can focus fully on its concern:
One subagent reads styles/styles-identification.md, studies the image, and reports back with the style classification and reasoning (line work, shape language, color approach, detail level). The identified style determines which techniques you'll use later.
A second subagent reads analysis/analysis-asking-questions.md, studies the image, and reports back with answers to the key observation questions, especially the Construction and Structural questions for complex objects. This report informs the decomposition step.
Check for transparency: run magick identify -format "%[channels]" original.png. If it reports srgba or a similar alpha channel, note whether the transparent background should be preserved in the final SVG (common for emoji/stickers) or filled with a solid color.
Decompose. This depends on the observation framework above:
Read analysis/analysis-identifying-concepts.md. Break the image into independent visual elements and establish a z-order (layer stack).
Parallel preparation. These all depend on the feature list from step 4 but are independent of each other. Run them in parallel:
Create and verify reference crops. Read analysis/analysis-reference-crops.md. Write feature-locations.yml with bounding boxes for every feature, then crop all features from it in one pass. Run the programmatic edge-margin check on every crop — this is the most common failure point. Fix any failing crops by adjusting the YAML and re-cropping (don't re-estimate from the image). Then visually verify each crop individually (one per Read call, not batched). Do not proceed to the build phase with any clipped crops.
Measure and map coordinates programmatically. Read workflow/workflow-verification.md for the measurement pipeline. Do NOT eyeball feature coordinates — small estimation errors compound across features and ruin proportions.
Write a subject brief. In 2-3 sentences, describe the personality, expression, and overall vibe of the subject ("a cheeky, confident goblin with a big happy grin and a proud crossed-arms stance"). This qualitative description travels with every agent alongside the measurements — it gives agents a target for the feeling of the subject, not just the geometry. Without it, agents produce features that are technically correct but lack the character's personality.
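For the reference-crop step above, a padded crop rectangle and a basic margin check might look like the sketch below. It is an assumption-laden illustration: the `Box` type, the function names, and the rule "every side keeps at least `min_margin` pixels" are hypothetical, and the real edge-margin check in analysis/analysis-reference-crops.md may work differently (for example by inspecting pixels along the crop border). The geometry string uses ImageMagick's `WxH+X+Y` form for `-crop`.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def padded_crop(feature: Box, image_w: int, image_h: int, pad: int) -> Box:
    """Grow the feature box by `pad` pixels per side, clamped to the image."""
    left = max(0, feature.x - pad)
    top = max(0, feature.y - pad)
    right = min(image_w, feature.x + feature.w + pad)
    bottom = min(image_h, feature.y + feature.h + pad)
    return Box(left, top, right - left, bottom - top)


def crop_geometry(crop: Box) -> str:
    """Format the crop as an ImageMagick geometry string (WxH+X+Y)."""
    return f"{crop.w}x{crop.h}+{crop.x}+{crop.y}"


def has_margin(feature: Box, crop: Box, min_margin: int) -> bool:
    """True when the feature keeps at least `min_margin` pixels on every side."""
    return (
        feature.x - crop.x >= min_margin
        and feature.y - crop.y >= min_margin
        and (crop.x + crop.w) - (feature.x + feature.w) >= min_margin
        and (crop.y + crop.h) - (feature.y + feature.h) >= min_margin
    )
```

A feature sitting near the image border fails this check even after padding, which is a signal to review that crop by hand.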
After crops are verified (step 5 must complete first):
Extract trace metadata from crops. If vtracer is available, read workflow/workflow-trace-metadata.md. Auto-trace each feature crop in polygon mode and extract structured metadata: color palettes, sub-element positions/sizes, area percentages, and topology hints. This gives agents precise numeric data (~130 tokens per feature) instead of requiring them to eyeball colors and positions from the raster image. Add the trace metadata to each feature's entry in the feature map.
If vtracer is not available, fall back to ImageMagick color extraction:
magick refs/{feature}.png -resize 200x200 -kmeans 10 -unique-colors txt: | tail -n +2 | tr -s ' ' | cut -d' ' -f3
This gives accurate hex values but no spatial sub-element data.
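If the `cut`-based field extraction proves brittle, the same hex values can be pulled from the `txt:` output with a regular expression. The sketch below assumes 8-bit output lines of the usual form `x,y: (r,g,b)  #RRGGBB  name`; the function name `extract_hex_colors` is hypothetical, and the sample text is illustrative rather than captured from a real run.

```python
import re
from typing import List

# Matches #RRGGBB or #RRGGBBAA; 16-bit-per-channel hex values are skipped.
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}(?:[0-9A-Fa-f]{2})?\b")


def extract_hex_colors(txt_output: str) -> List[str]:
    """Return unique hex colors in order of appearance."""
    colors: List[str] = []
    for line in txt_output.splitlines():
        if line.startswith("#"):  # header line of the pixel enumeration
            continue
        match = HEX_COLOR.search(line)
        if match and match.group(0).upper() not in colors:
            colors.append(match.group(0).upper())
    return colors


sample = (
    "# ImageMagick pixel enumeration: 3,1,0,255,srgb\n"
    "0,0: (34,139,34)  #228B22  srgb(34,139,34)\n"
    "1,0: (255,255,255)  #FFFFFF  white\n"
    "2,0: ( 77, 77, 77)  #4D4D4D  gray30\n"
)
print(extract_hex_colors(sample))
```

Matching on the `#` token rather than a field index keeps the result stable when the tuple column contains padding spaces.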
Once the style is identified, reference crops verified, and the feature map established, agent swarm the feature builds. Each feature is independent — they can be built in parallel by separate agents, each working from its own reference crop.
For character or face images, read the relevant feature reference sheet from features/ before building each element:
| Feature | Reference file |
|---|---|
| Eyes | features/features-eyes.md |
| Mouth | features/features-mouth.md |
| Nose | features/features-nose.md |
| Ears | features/features-ears.md |
| Face shape | features/features-face-shape.md |
| Hair | features/features-hair.md |
| Body | features/features-body.md |
| Accessories | features/features-accessories.md |
| Complex objects (held items, props) | features/features-objects.md |
Only read the reference sheets for features that exist in the image.
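The table above amounts to a lookup from feature type to reference file. A minimal sketch, where the lowercase keys and the function name `sheets_to_read` are hypothetical and features without a sheet are simply ignored:

```python
from typing import Iterable, List

REFERENCE_SHEETS = {
    "eyes": "features/features-eyes.md",
    "mouth": "features/features-mouth.md",
    "nose": "features/features-nose.md",
    "ears": "features/features-ears.md",
    "face shape": "features/features-face-shape.md",
    "hair": "features/features-hair.md",
    "body": "features/features-body.md",
    "accessories": "features/features-accessories.md",
    "objects": "features/features-objects.md",
}


def sheets_to_read(features_present: Iterable[str]) -> List[str]:
    """Return only the reference sheets for features that exist in the image."""
    return sorted(
        {REFERENCE_SHEETS[f] for f in features_present if f in REFERENCE_SHEETS}
    )
```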
The features/ reference sheets are character-specific. For other subjects, decompose by visual layer instead:
<text> elements, since fonts won't match), secondary elements, border/frame
Read features/features-objects.md for detailed guidance on structural decomposition. Objects have internal structure, multiple visible surfaces, and perspective complexity that goes far beyond silhouette + fill. Decompose into structural parts (panels, ribs, joints, handles), not just color regions.
Read features/features-vehicles.md. Vehicles are panel assemblies: decompose by body panels, glass, wheels, lights, and trim. Panel lines and metallic gradients are critical.
Read features/features-food.md. Shape-building approach with glossy highlights, layered construction, and steam/aroma effects.
Read features/features-plants.md. Radial petal symmetry with <use> + rotate, leaf construction with vein clipping.
<pattern> or <use> where possible), accent elements, overlay effects
The same principles apply: one crop per element, one standalone SVG per layer, same composite viewBox. Read analysis/analysis-asking-questions.md for each element; the shape, color, and position questions are universal.
Some features are disproportionately important because they define the character's personality or the object's identity. These get extra comparison rigor (more iteration passes, programmatic diff verification, and side-by-side checks) before moving to composition. The mouth, eyes, and overall proportions are typical examples.
For these features, always run the programmatic diff (see "Render-Compare Loop" below) and iterate until the diff score converges, even if it means exceeding 3 passes.
For each feature:
Read analysis/analysis-asking-questions.md.
Build the feature as a standalone SVG under parts/.
Read styles/styles-line-and-brush.md for illustrated/cartoon styles, or styles/styles-geometric.md for flat/geometric styles, or styles/styles-applying-to-lifelike.md for photographic/realistic images. Read only the style file matching the style identified in Phase 1.
Read styles/styles-curves-and-shapes.md for curve construction techniques. This applies to all styles: it covers how to actually build shapes with the right SVG path commands, when to use filled shapes vs strokes, and how to construct organic curves. This is the bridge between "what should it look like" and "how do I build it in SVG."
Prefer complex construction over simple geometry (except for images identified as geometric/flat style; defer to styles/styles-geometric.md for those). A filled shape built from cubic Beziers with proper width variation, per-panel lighting, and structural detail will always produce a more valuable result than a circle with a stroke. Only use SVG primitives (<circle>, <rect>, <ellipse>) when the reference image genuinely shows a perfect geometric shape or the style is explicitly flat/geometric. When in doubt, build the more complex version; the visual quality difference is substantial.
One feature = one SVG = one agent. No exceptions. Even trivially simple features (a nose that's just two dots, a sparkle, a small badge) get their own file and their own agent. The cost of an extra agent is low; the cost of coupled parts during compositing and future animation is high.
Paired features are ALWAYS separate. Left eye and right eye are separate agents writing separate SVGs. Same for left/right ears, left/right boots, left/right arms. They will be checked for consistency in the alignment phase (Phase 3) — that's where consistency is enforced, not by having one agent build both.
Body parts are independent. Arms are separate from the torso. Each leg is separate. The head is separate from the neck. Think of each part as something that might animate independently later — an arm could move while the body stays still, one ear could wiggle while the other doesn't.
If there are 5 features, spawn 5 agents. If there are 50 features, spawn 50 agents. The parallelism is the point. Each agent receives:
The relevant feature reference sheet (from features/)
The matching style file (styles/styles-line-and-brush.md, styles/styles-geometric.md, or styles/styles-applying-to-lifelike.md)
The curve construction guide (styles/styles-curves-and-shapes.md), always included
The verification pipeline (workflow/workflow-verification.md), always included
Describe features quantitatively, not qualitatively. When briefing agents, text descriptions lose visual nuance: "wide grin" doesn't convey the exact curvature, and "thick brim" is ambiguous. Instead use measurements: "mouth width = 55% of face width", "brim height = 5% of hat height, follows dome curvature". Adjectives fail; ratios survive.
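Ratio-style descriptions like the ones above can be derived directly from the measured bounding boxes. A minimal sketch, assuming boxes are `(x, y, width, height)` tuples in pixels; the function name `ratio_brief` and the example coordinates are hypothetical:

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def ratio_brief(name: str, box: Box, ref_name: str, ref_box: Box) -> str:
    """Describe a feature's size as percentages of a reference feature."""
    _, _, w, h = box
    _, _, ref_w, ref_h = ref_box
    width_pct = round(100 * w / ref_w)
    height_pct = round(100 * h / ref_h)
    return (
        f"{name} width = {width_pct}% of {ref_name} width, "
        f"{name} height = {height_pct}% of {ref_name} height"
    )


print(ratio_brief("mouth", (201, 330, 110, 36), "face", (156, 140, 200, 240)))
# prints: mouth width = 55% of face width, mouth height = 15% of face height
```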
Each agent writes its output to parts/{feature-name}.svg.
All agents must use the same viewBox as the composite canvas (e.g., viewBox="0 0 512 512"). Each agent positions its feature within the full canvas coordinates using the bounding box from the feature map. This ensures parts align without rescaling during composition.
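To illustrate the shared-viewBox rule, the sketch below generates a starter part file whose canvas matches the composite and whose placeholder rectangle marks the feature's bounding box in full-canvas coordinates. The function name `part_skeleton`, the magenta guide rectangle, and the 512 default are illustrative assumptions, not part of the skill's reference files.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

SVG_NS = "http://www.w3.org/2000/svg"


def part_skeleton(feature_id: str, box: Tuple[int, int, int, int], canvas: int = 512) -> str:
    """Return an SVG string on the shared canvas with a guide rect at `box`."""
    ET.register_namespace("", SVG_NS)
    x, y, w, h = box
    root = ET.Element(f"{{{SVG_NS}}}svg", {"viewBox": f"0 0 {canvas} {canvas}"})
    group = ET.SubElement(root, f"{{{SVG_NS}}}g", {"id": feature_id})
    # Guide only: replace with the real paths, still in canvas coordinates.
    ET.SubElement(
        group,
        f"{{{SVG_NS}}}rect",
        {"x": str(x), "y": str(y), "width": str(w), "height": str(h),
         "fill": "none", "stroke": "magenta"},
    )
    return ET.tostring(root, encoding="unicode")


print(part_skeleton("left-eye", (180, 200, 60, 40)))
```

Because the feature is drawn at its real canvas position rather than at the origin, the part can later be layered without any translate or scale.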
Features that interact (e.g., face + ears, hair + hat) should be noted but built independently — interactions are resolved in Phase 4. For tightly coupled features (ears + face contour, hair + hat brim), include the neighboring feature's bounding box so the agent knows where the boundary sits.
Read workflow/workflow-verification.md for the full verification pipeline. The key insight: don't rely on visual comparison alone — the LLM is good at spotting catastrophic errors but bad at catching subtle proportion and curvature differences. Use programmatic diff to find errors precisely, then use the LLM to interpret and fix them.
After every SVG change:
Validate the SVG XML before rendering:
xmllint --noout parts/{feature}.svg
This catches unclosed tags, malformed attributes, and missing namespaces with clear error messages — far more helpful than rsvg-convert's cryptic failures.
Render the SVG to PNG:
rsvg-convert -w 512 -h 512 parts/{feature}.svg -o parts/{feature}.png
If rsvg-convert is not installed, install it (brew install librsvg on macOS, apt install librsvg2-bin on Linux).
Programmatic diff — generate a visual diff image and numerical score comparing the rendered feature against the reference crop. See workflow/workflow-verification.md for ImageMagick commands. The diff image highlights exactly WHERE the SVG diverges — red areas show the biggest differences.
Read the diff image — use the highlighted differences to direct your corrections. This is far more effective than comparing two similar-looking images: ImageMagick finds the errors precisely, you interpret them and know how to fix the SVG.
Visual sanity check — also read both the rendered PNG and reference crop for qualitative assessment (colors, overall feel, details).
Iterate — fix the top issue highlighted by the diff, re-render, re-diff. Repeat.
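The validate-then-render steps above can be wrapped so they always run in the same order after each change. This sketch reuses the exact `xmllint` and `rsvg-convert` invocations shown earlier; the function names and the injectable `run` parameter are hypothetical conveniences, and the diff step is left to workflow/workflow-verification.md.

```python
import subprocess
from typing import Callable, List


def render_commands(feature: str, size: int = 512) -> List[List[str]]:
    """Build the argv lists for XML validation and PNG rendering of one part."""
    svg = f"parts/{feature}.svg"
    png = f"parts/{feature}.png"
    return [
        ["xmllint", "--noout", svg],
        ["rsvg-convert", "-w", str(size), "-h", str(size), svg, "-o", png],
    ]


def validate_and_render(feature: str, run: Callable = subprocess.run) -> None:
    """Run validation first; with check=True a failure raises before rendering."""
    for argv in render_commands(feature):
        run(argv, check=True)
```

Calling `validate_and_render("left-eye")` requires both tools on `PATH`; a malformed SVG raises `subprocess.CalledProcessError` from the `xmllint` step, so the renderer never sees it.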
When to stop iterating: Limit to 3-5 refinement passes for normal features. For expression-critical features (mouth, eyes, overall proportions), continue up to 10 passes — these define the character and are worth the extra iteration.
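One way to make the pass limits concrete is a small stopping rule over the history of diff scores. The sketch below is an assumption: the `min_improvement` threshold of 0.005 and the function name `should_stop` are hypothetical, and the pass limit is passed in so it can be 5 for normal features or 10 for expression-critical ones.

```python
from typing import Sequence


def should_stop(scores: Sequence[float], max_passes: int = 5,
                min_improvement: float = 0.005) -> bool:
    """Stop at the pass limit, or when the latest pass barely improved the score.

    `scores` holds one diff score per completed pass, lower being better.
    """
    if len(scores) >= max_passes:
        return True
    if len(scores) < 2:
        return False
    return scores[-2] - scores[-1] < min_improvement
```

A plateau in the score is a cue to look at the diff image again and change approach, not proof that the feature is finished.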
Convergence targets (RMSE, normalized 0-1):
These are guidelines, not hard gates. A feature at RMSE 0.18 that looks right is done; a feature at 0.12 that looks wrong needs a different approach. Trust the diff image over the number.
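For readers unfamiliar with the metric, a normalized RMSE can be computed as below. This is a pure-Python illustration over flat lists of 8-bit channel values, not the ImageMagick pipeline from workflow/workflow-verification.md; the function name `normalized_rmse` is hypothetical.

```python
import math
from typing import Sequence


def normalized_rmse(rendered: Sequence[int], reference: Sequence[int],
                    max_value: int = 255) -> float:
    """Root-mean-square error between two equal-length value lists, scaled to 0-1."""
    if not rendered or len(rendered) != len(reference):
        raise ValueError("inputs must be non-empty and the same length")
    squared = sum((a - b) ** 2 for a, b in zip(rendered, reference))
    return math.sqrt(squared / len(rendered)) / max_value


print(normalized_rmse([0, 0, 0, 0], [0, 0, 0, 255]))  # prints 0.5
```

The example shows why the number alone can mislead: a single badly wrong value out of four already yields 0.5, while many small errors spread over a feature can produce a low score that still looks wrong.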
When a feature is partially hidden by another layer:
After all features are built individually, check paired and repeated features for consistency. Agent swarm this — one agent per class of similar features.
A "class" is a group of features that should share the same construction style:
For images with multiple subjects, classes are per-subject: "character A eyes" and "character B eyes" are separate classes.
Each alignment agent receives both SVGs, both reference crops, and the full original image, and checks:
The alignment agent normalizes any unintentional inconsistencies — making paired features match while preserving intentional asymmetry from the reference (e.g., if the reference genuinely shows different-sized eyes, keep that).
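A first numeric pass over a left/right pair can mirror one bounding box across the subject's vertical axis and compare it with the other. The sketch below is hypothetical throughout (function names, the `(x, y, width, height)` box convention, and the idea of a single vertical symmetry axis); it only reports deltas, leaving the decision about intentional asymmetry to the alignment agent.

```python
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in canvas units


def mirror_box(box: Box, axis_x: int) -> Box:
    """Reflect a box across the vertical line x = axis_x."""
    x, y, w, h = box
    return (2 * axis_x - (x + w), y, w, h)


def pair_deltas(left: Box, right: Box, axis_x: int) -> Dict[str, int]:
    """Differences between the right box and the mirrored left box."""
    mx, my, mw, mh = mirror_box(left, axis_x)
    rx, ry, rw, rh = right
    return {"dx": rx - mx, "dy": ry - my, "dw": rw - mw, "dh": rh - mh}


print(pair_deltas((150, 200, 60, 40), (305, 198, 64, 40), axis_x=256))
# prints: {'dx': 3, 'dy': -2, 'dw': 4, 'dh': 0}
```

Non-zero deltas should be compared against the same measurement on the reference crops before anything is normalized.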
Read workflow/composition-bringing-layers-together.md.
This phase is not optional. The first assembly is never the final output. Individual features built in isolation always have proportion and alignment issues that only become visible in context.
Read styles/styles-effects.md for <clipPath>, <mask>, and <filter> where needed.
Run the programmatic diff on the composite as described in workflow/workflow-verification.md. This highlights exactly where the composite diverges from the original.
Render the composite at small sizes:
rsvg-convert -w 64 -h 64 final.svg -o /tmp/small-check-64.png
rsvg-convert -w 128 -h 128 final.svg -o /tmp/small-check-128.png
Read both renders: does it still read clearly at icon size? Features that looked fine at 512px may merge or disappear.
Read workflow/workflow-file-structure.md for the expected project layout.
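A first assembly can be sketched as stacking the standalone parts, bottom layer first, into one SVG with a named group per feature. This is a simplified illustration with hypothetical names (`composite`, the example part strings); it assumes every part already uses the composite viewBox and does not handle `id` collisions or shared `<defs>`, which the composition reference file may treat differently.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)


def composite(parts: Iterable[Tuple[str, str]], view_box: str = "0 0 512 512") -> str:
    """Stack (name, svg_text) parts in z-order, first item at the bottom."""
    root = ET.Element(f"{{{SVG_NS}}}svg", {"viewBox": view_box})
    for name, svg_text in parts:
        part = ET.fromstring(svg_text)
        if part.get("viewBox") != view_box:
            raise ValueError(f"{name}: viewBox {part.get('viewBox')!r} != {view_box!r}")
        layer = ET.SubElement(root, f"{{{SVG_NS}}}g", {"id": name})
        layer.extend(list(part))  # move the part's children into its named group
    return ET.tostring(root, encoding="unicode")


face = f'<svg xmlns="{SVG_NS}" viewBox="0 0 512 512"><circle cx="256" cy="256" r="100"/></svg>'
eye = f'<svg xmlns="{SVG_NS}" viewBox="0 0 512 512"><ellipse cx="220" cy="230" rx="14" ry="10"/></svg>'
print(composite([("face", face), ("left-eye", eye)]))
```

Rejecting a mismatched viewBox early surfaces the most common cause of misaligned layers before any visual comparison is needed.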
Optimize the final SVG. If svgo is available, run it with cleanupIds, collapseGroups, and convertShapeToPath disabled so that named groups and shape primitives are preserved:
svgo final.svg -o final.svg \
--config='{"plugins":[{"name":"preset-default","params":{"overrides":{"cleanupIds":false,"collapseGroups":false,"convertShapeToPath":false}}}]}'
This typically reduces file size by 25-40% (numeric precision, default attributes, path command optimization) without changing the visual output. If svgo is not available, skip this step — the SVG is still valid.
Keep the parts/ directory with standalone SVGs for future edits
Provide the final composite SVG
Render a PNG at the target resolution for comparison