agent/skills/tools/duckdb/SKILL.md
Process JSON with DuckDB — schema inference, SQL queries, complex joins, and streaming. Use when Nushell pipelines aren't enough for JSON analytics.
npx skillsauth add knoopx/pi duckdbInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
DuckDB excels at JSON analytics: schema inference, SQL queries, complex joins, and streaming. Use when Nushell pipeline commands aren't enough.
Reads JSON files or URLs, auto-detects structure into typed columns. Arrays and objects become composite types.
-- Load from URL or file
select * from read_json('https://api.example.com/data');
select * from read_json('data.json');
-- In-memory (no .db file)
duckdb :memory:
Date-like columns auto-cast to timestamp. Use explicit schema for control:
-- Explicit column types
select * from read_json('films.json', columns = {
title: varchar, release_date: date, created: timestamp
});
-- Cast specific columns
select title, created::datetime from read_json('films.json');
Returns raw JSON blobs instead of auto-detected columns:
-- Single column of JSON blobs
select json_data from read_json('data.json',
format = 'unstructured', columns = {json_data: 'json[]'});
-- Unfold into rows
select unnest(json_data) as raw from read_json('data.json',
format = 'unstructured', columns = {json_data: 'json[]'});
Returns the inferred schema of a JSON blob:
select json_structure(raw_data) from my_table limit 1;
-- {"title":"VARCHAR","episode_id":"UBIGINT",...}
Use with from_json for type conversion:
set variable json_schema = (select json_structure(raw_data) from my_table limit 1);
select from_json(raw_data, getvariable('json_schema')) as typed from my_table;
select unnest(json_keys(raw_data)) as keys from my_table limit 1;
Converts a struct into actual columns:
select unnest(from_json(raw_data, getvariable('json_schema'))) as row from my_table;
->) and json_extract_string (->>)select
raw_data->>'title' as title,
(raw_data->>'episode_id')::uint64 as episode_id,
(raw_data->'characters')::varchar[] as characters
from my_table;
-- Equivalent function form
select json_extract_string(raw_data, 'title') as title from my_table;
select
json_extract_string(raw_data, 'title') as title,
unnest(cast(json_extract(raw_data, 'characters') as varchar[])) as character_id
from my_table;
Warning: Using unnest on two different columns zips them by position. Normalize one table at a time.
Compose nested structures with { key: value } syntax:
select {
type: 'character',
name: character.name,
homeworld: { type: 'planet', name: planet.name, climate: planet.climate },
species: species.name
} as character
from character
join planet on planet.url = character.homeworld
left join species on species.url = character.species[1];
Inverse of unnest — groups rows back into arrays:
select
{ type: 'film', title: title, characters: array_agg(character_blob) } as film
from film_character
join character_blob on character_blob.url = film_character.character
group by film_character.url, film_character.title;
select to_json(film_struct) as film_json from film_blob;
-- json() and json_string() are equivalent
-- Line-delimited JSON (default)
copy film to 'output.ndjson';
-- JSON array
copy film to 'output.json' (format json, array true);
Process JSON without intermediate files:
# stdin → transform → stdout
echo '[{"a":1,"b":[2,3]}]' | duckdb -json -c \
"select a, unnest(b) as b from read_json('/dev/stdin')"
# Database → stdout
duckdb -c "copy (select title, unnest(characters) from film) to '/dev/stdout' (format json, array true)" db.db
| Function | Purpose |
| ----------------------------- | -------------------------------------- |
| read_json() | Load JSON file or URL with auto-schema |
| json_structure() | Infer schema from JSON blob |
| json_keys() | List top-level keys |
| json_extract (->) | Extract value as JSON |
| json_extract_string (->>) | Extract value as string |
| from_json() | Transform JSON to native type |
| to_json() | Cast native type to JSON |
| unnest() | Unfold array into rows |
| array_agg() | Fold rows into array |
tools
Inform the user what is happening — skip passive lookups
development
Renders markdown to self-contained HTML with a custom dark stylesheet and opens in browser. Use when previewing markdown documents, generating styled HTML from README or report files.
testing
Programmatic hunk selection for Jujutsu — split, commit, or squash specific hunks without interactive prompts. Use when making partial commits or selective squashes.
content-media
Manage version control with Jujutsu (jj) — no staging area, immediate changes, smart rebasing. Use when navigating history, squashing, or pushing to Git remotes.