running-dag-pipelines

A skill for composing directed acyclic graph (DAG) workflows that process arbitrarily long inputs and manage arbitrarily long sessions. It unifies two complementary approaches:

RLM (Recursive Language Models) — slice long inputs, fan out LLM sub-calls, and reduce results, with optional depth-bounded recursion
LCM (Lossless Context Management) — persist all intermediate artifacts in a content-addressed immutable store with a summary DAG, so nothing is ever lost across compaction boundaries

When to use this skill

| Situation | Trigger |
|---|---|
| Input exceeds the model's context window | Use slice → map → reduce |
| Need relevance filtering before expensive LLM calls | Add a filter node |
| Input is much larger than context and may need multi-level processing | Use recurse operator |
| Long-running multi-turn session risks context rot | Use compact operator + artifact store |
| Need a reproducible, logged, artifact-backed processing pipeline | Define a dag_spec |

Quick start

1. Pick a template or write a dag_spec

Templates live in scripts/templates/. Three are provided:

flat_map_reduce.json — slice → map → reduce (simplest RLM pattern)
hierarchical_summarize.json — slice → filter → map → compact → reduce
recursive_deep_dive.json — single recurse node with depth-bounded processing

Copy a template and edit the config fields (model, prompts, thresholds).

2. Run the DAG

cd skills/context-dag/scripts
python dag_runner.py \
  --spec templates/flat_map_reduce.json \
  --input /path/to/large_document.txt \
  --out-dir ./my_run

3. Inspect results

my_run/
  progress.jsonl        # full event log
  run_summary.json      # execution metadata
  result.txt            # final output
  split/                # per-node artifact directories
  process/
  combine/
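
To get a quick overview of a run, the event log can be loaded and tallied. This is a minimal sketch that assumes only that progress.jsonl holds one JSON object per line; the field name you tally by (for example a node or status field) is an assumption, so check a real log for the actual keys.

```python
import json
from collections import Counter
from pathlib import Path


def load_events(path):
    """Parse a JSON Lines file into a list of dicts, skipping blank lines."""
    events = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                events.append(json.loads(line))
    return events


def tally(events, key):
    """Count events by the value stored under `key`.

    Events that lack the key are grouped under None.
    """
    return Counter(event.get(key) for event in events)
```

For example, `tally(load_events("my_run/progress.jsonl"), "node")` would count events per node, provided the log really uses a `node` field.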

Operator reference

`slice`

Splits a single input into N chunks.

| Config key | Type | Default | Description |
|---|---|---|---|
| strategy | "headings" \| "markers" \| "fixed" | "fixed" | Slicing strategy |
| chunk_size | int | 200000 | Characters per chunk |
| overlap | int | 0 | Char overlap for fixed chunks |
| max_slices | int | 20 | Maximum number of slices |
| marker_start | str | (none) | Regex for marker-based slicing |
| marker_end | str | (none) | Regex for marker end |

Inputs: 1 artifact → Outputs: N artifacts (one per slice)
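
The "fixed" strategy can be sketched as below. This is an illustration of how chunk_size, overlap, and max_slices interact, not the runner's actual code. The function name `slice_fixed` is hypothetical, and dropping whatever remains once max_slices is reached is an assumption; the real operator may handle the tail differently.

```python
def slice_fixed(text, chunk_size=200_000, overlap=0, max_slices=20):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters. Assumption: once
    max_slices chunks exist, the rest of the text is dropped.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text) and len(chunks) < max_slices:
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reached the end of the input
        start += step
    return chunks
```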

`map`

Runs an LLM call on each input artifact in parallel (LCM's LLM-Map pattern).

| Config key | Type | Default | Description |
|---|---|---|---|
| model | str | from $meta.default_model | LLM model |
| system | str | "" | System prompt |
| prompt_template | str | "{input}" | Format string with {input} |
| temperature | float | 0.0 | |
| max_tokens | int | 4096 | |
| json_schema | dict | (none) | Optional structured output schema |
| concurrency | int | 8 | Max parallel LLM calls |

Inputs: N artifacts → Outputs: N artifacts (1:1)
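
The 1:1 fan-out can be sketched with a thread pool. This is a rough illustration, not the operator's implementation: `map_artifacts` and the `call` parameter are hypothetical names, `call` stands in for whatever function sends a prompt to the model, and filling the template with `str.format` is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def map_artifacts(inputs, call, prompt_template="{input}", concurrency=8):
    """Apply `call` to each formatted prompt, at most `concurrency` at a time.

    Results come back in input order, so output i corresponds to input i.
    """
    prompts = [prompt_template.format(input=text) for text in inputs]
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(call, prompts))
```

With `str.format`, a template that contains literal braces (for instance an inline JSON example) needs them doubled as `{{` and `}}`.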

`reduce`

Merges N inputs into one. Three modes:

single — concatenate all inputs, one LLM call
tree — pairwise reduction in log(n) rounds (LCM-style tree-reduce)
auto (default) — single if ≤ tree_threshold inputs, tree otherwise

| Config key | Type | Default | Description |
|---|---|---|---|
| model | str | from meta | LLM model |
| system | str | (built-in) | System prompt |
| prompt_template | str | "{input}" (single) or "{left}" / "{right}" (tree) | |
| mode | "single" \| "tree" \| "auto" | "auto" | |
| tree_threshold | int | 6 | Inputs above this trigger tree mode |
| concurrency | int | 4 | Parallel pairs in tree mode |

Inputs: N artifacts → Outputs: 1 artifact
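
The mode selection and the pairwise rounds can be sketched as follows. This is an illustration under assumptions: `tree_reduce`, `reduce_artifacts`, `combine_pair`, and `combine_all` are hypothetical names, the two combine callables stand in for LLM calls, and carrying an unpaired last item into the next round is one plausible way to handle odd counts.

```python
def tree_reduce(items, combine_pair):
    """Reduce items pairwise; returns (result, number_of_rounds)."""
    if not items:
        raise ValueError("nothing to reduce")
    level = list(items)
    rounds = 0
    while len(level) > 1:
        # Combine neighbours (0,1), (2,3), ... within this round.
        nxt = [combine_pair(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd item out waits for the next round
        level = nxt
        rounds += 1
    return level[0], rounds


def reduce_artifacts(items, combine_pair, combine_all, mode="auto", tree_threshold=6):
    """Pick single or tree mode the way the 'auto' rule describes."""
    if mode == "auto":
        mode = "single" if len(items) <= tree_threshold else "tree"
    if mode == "single":
        return combine_all(items)
    return tree_reduce(items, combine_pair)[0]
```

Seven inputs take three rounds (7 → 4 → 2 → 1), which matches the log(n) behaviour described above.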

`recurse`

Depth-bounded recursive processing (RLM's core pattern). At each level: terminal inputs get a leaf LLM call; non-terminal inputs are sliced, mapped recursively, and reduced.

| Config key | Type | Default | Description |
|---|---|---|---|
| model | str | from meta | LLM model |
| system | str | "" | System prompt for leaf calls |
| prompt_template | str | "{input}" | |
| max_depth | int | 3 | Maximum recursion depth |
| size_threshold | int | 50000 | Chars below which input is terminal |
| chunk_size | int | 100000 | Chars per sub-slice |
| max_slices | int | 10 | |
| reduce_system | str | (built-in) | System prompt for intermediate reductions |
| concurrency | int | 4 | |

Inputs: N artifacts → Outputs: N artifacts (each recursively processed)
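
The per-level rule can be sketched as a short recursive function. This is a simplified, sequential illustration, not the operator itself: `recurse`, `leaf`, and `reduce_fn` are hypothetical names standing in for the leaf LLM call and the intermediate reduction, and treating the depth limit as "make a leaf call on whatever is left" is an assumption.

```python
def recurse(text, leaf, reduce_fn, *, max_depth=3, size_threshold=50_000,
            chunk_size=100_000, max_slices=10, depth=0):
    """Process text recursively: leaf call if terminal, else slice and reduce."""
    # Terminal case: small enough, or the depth budget is used up (assumption).
    if len(text) < size_threshold or depth >= max_depth:
        return leaf(text)

    # Non-terminal case: slice, process each piece one level deeper, reduce.
    pieces = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    pieces = pieces[:max_slices]
    partials = [
        recurse(piece, leaf, reduce_fn, max_depth=max_depth,
                size_threshold=size_threshold, chunk_size=chunk_size,
                max_slices=max_slices, depth=depth + 1)
        for piece in pieces
    ]
    return reduce_fn(partials)
```

Note that with the defaults, chunk_size is larger than size_threshold, so a sub-slice can still be non-terminal; the depth bound is what guarantees termination.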

`compact`

LCM-style context compaction with 3-level escalation:

1. LLM summarisation
2. Aggressive LLM summarisation (tighter budget)
3. Deterministic fallback (first/last N chars, no LLM)

Writes compaction_pointers.json mapping summaries → originals for lossless retrieval.

| Config key | Type | Default | Description |
|---|---|---|---|
| model | str | from meta | |
| system | str | (built-in) | |
| level | 1 \| 2 \| 3 \| "auto" | "auto" | |
| soft_threshold | int | 100000 | Chars to trigger level 1 |
| hard_threshold | int | 200000 | Chars to trigger level 2 |
| fallback_chars | int | 2000 | Chars kept in level-3 fallback |
| max_summary_tokens | int | 1024 | |

Inputs: N artifacts → Outputs: N artifacts (compacted where needed)
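
One way to read the "auto" escalation is sketched below. Several details are assumptions rather than documented behaviour: the `summarize(text, aggressive=...)` callable is hypothetical, a level counts as failed when the call raises or the summary is not shorter than the input, and fallback_chars is split evenly between the head and the tail of the text.

```python
def compact(text, summarize, soft_threshold=100_000, hard_threshold=200_000,
            fallback_chars=2_000):
    """Return (compacted_text, level_used); level 0 means left untouched."""
    if len(text) < soft_threshold:
        return text, 0

    # Start at level 2 for very large inputs, otherwise at level 1.
    first_level = 2 if len(text) >= hard_threshold else 1
    for level in range(first_level, 3):
        try:
            summary = summarize(text, aggressive=(level == 2))
        except Exception:
            continue  # treat an LLM failure as a reason to escalate
        if summary and len(summary) < len(text):
            return summary, level

    # Level 3: deterministic truncation, no LLM involved.
    if len(text) <= fallback_chars:
        return text, 3
    half = fallback_chars // 2
    return text[:half] + "\n[...]\n" + text[len(text) - half:], 3
```

Because level 3 never calls a model, the function always returns, even when every summarisation attempt fails.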

`filter`

Relevance gating — scores each input against a query and drops those below threshold.

| Config key | Type | Default | Description |
|---|---|---|---|
| model | str | "gpt-4o-mini" | Scoring model |
| query | str | required | Relevance question |
| threshold | float | 0.3 | Score cutoff (0.0–1.0) |
| concurrency | int | 8 | |

Inputs: N artifacts → Outputs: ≤ N artifacts (relevant ones only)
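
The gating step itself is simple once scores exist. In this sketch, `filter_artifacts` and `toy_score` are hypothetical names: the scorer is a keyword-overlap stand-in for the LLM scoring call, and keeping inputs whose score equals the threshold is an assumption (the text only says that scores below it are dropped).

```python
from concurrent.futures import ThreadPoolExecutor


def toy_score(query, text):
    """Stand-in scorer: fraction of query words that appear in the text."""
    words = query.lower().split()
    if not words:
        return 0.0
    lowered = text.lower()
    return sum(word in lowered for word in words) / len(words)


def filter_artifacts(inputs, score, query, threshold=0.3, concurrency=8):
    """Score every input concurrently and keep those at or above threshold."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        scores = list(pool.map(lambda text: score(query, text), inputs))
    return [text for text, s in zip(inputs, scores) if s >= threshold]
```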

DAG spec format

A JSON (or YAML) file with:

{
  "meta": {
    "name": "workflow-name",
    "description": "...",
    "default_model": "gpt-4o"
  },
  "nodes": [
    {
      "id": "unique_node_id",
      "operator": "slice|map|reduce|recurse|compact|filter",
      "config": { ... },
      "inputs": ["$input", "other_node_id", ...]
    }
  ]
}

$input is the DAG's external input (the file passed via --input)
Node inputs reference other node IDs — the runner topologically sorts and routes artifacts automatically
Config values can reference "$meta.default_model" to inherit from meta
The last node in topological order produces the DAG's final output
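
The routing rules above can be sketched with the standard library's graphlib. This is an illustration, not the runner's code: `execution_order` and `resolve_config` are hypothetical helpers, the example spec reuses the split / process / combine node IDs from the output listing with illustrative config values, and resolving any "$meta.<key>" string (not just default_model) is an assumption.

```python
from graphlib import TopologicalSorter

EXAMPLE_SPEC = {
    "meta": {"name": "flat", "description": "demo", "default_model": "gpt-4o"},
    "nodes": [
        {"id": "combine", "operator": "reduce", "config": {"mode": "auto"},
         "inputs": ["process"]},
        {"id": "split", "operator": "slice", "config": {"strategy": "fixed"},
         "inputs": ["$input"]},
        {"id": "process", "operator": "map",
         "config": {"model": "$meta.default_model"}, "inputs": ["split"]},
    ],
}


def execution_order(spec):
    """Return node IDs so that every node comes after its inputs."""
    ids = [node["id"] for node in spec["nodes"]]
    if len(set(ids)) != len(ids):
        raise ValueError("node ids must be unique")
    graph = {}
    for node in spec["nodes"]:
        deps = [ref for ref in node["inputs"] if ref != "$input"]
        unknown = set(deps) - set(ids)
        if unknown:
            raise ValueError(f"unknown inputs: {sorted(unknown)}")
        graph[node["id"]] = deps
    # static_order raises CycleError (a ValueError) if the graph has a cycle.
    return list(TopologicalSorter(graph).static_order())


def resolve_config(config, meta):
    """Replace "$meta.<key>" strings with the matching value from meta."""
    prefix = "$meta."
    return {
        key: meta[value[len(prefix):]]
        if isinstance(value, str) and value.startswith(prefix) else value
        for key, value in config.items()
    }
```

Nodes may be listed in any order in the spec; here `execution_order(EXAMPLE_SPEC)` puts split first and combine last, so combine would produce the final output.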

Artifact store (Phase 3)

For multi-session persistence, the ArtifactStore class in scripts/artifact_store.py provides:

Content-addressed storage — every artifact gets a sha256 ID
Append-only index — full history in index.jsonl
Summary pointers — put_summary() records which originals a summary compresses
Lossless expand — expand(ref) retrieves the original texts behind any summary node (one DAG layer at a time, matching LCM's lcm_expand pattern)

from pathlib import Path

from artifact_store import ArtifactStore

# Open (or create) a store rooted at ./my_store
store = ArtifactStore(Path("./my_store"))

ref = store.put("full text...", tag="msg_42")
summary_ref = store.put("brief summary", tag="summary_42")
store.put_summary(summary_ref, original_refs=[ref])

# Later: retrieve originals
originals = store.expand(summary_ref)  # → ["full text..."]
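
To make the ideas concrete, here is a toy store built on the same four properties. It is not the ArtifactStore implementation: the `TinyStore` class, its on-disk layout (an objects/ directory plus index.jsonl), and its record format are all assumptions chosen for illustration.

```python
import hashlib
import json
from pathlib import Path


class TinyStore:
    """Toy content-addressed store with an append-only JSONL index."""

    def __init__(self, root):
        self.root = Path(root)
        (self.root / "objects").mkdir(parents=True, exist_ok=True)
        self.index = self.root / "index.jsonl"

    def _log(self, record):
        # Append only: existing index lines are never rewritten.
        with self.index.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")

    def put(self, text, tag=None):
        """Store text under its sha256 hex digest and return that ID."""
        ref = hashlib.sha256(text.encode("utf-8")).hexdigest()
        path = self.root / "objects" / ref
        if not path.exists():  # identical content is stored once
            path.write_text(text, encoding="utf-8")
        self._log({"op": "put", "ref": ref, "tag": tag})
        return ref

    def get(self, ref):
        return (self.root / "objects" / ref).read_text(encoding="utf-8")

    def put_summary(self, summary_ref, original_refs):
        """Record which originals a summary compresses."""
        self._log({"op": "summary", "ref": summary_ref,
                   "originals": list(original_refs)})

    def expand(self, summary_ref):
        """Return the original texts one layer below a summary."""
        originals = []
        if self.index.exists():
            with self.index.open(encoding="utf-8") as fh:
                for line in fh:
                    record = json.loads(line)
                    if record["op"] == "summary" and record["ref"] == summary_ref:
                        originals = record["originals"]  # latest record wins
        return [self.get(ref) for ref in originals]
```

Because the ID is derived from the content, storing the same text twice yields the same reference, and a summary can point at its originals without copying them.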

LLM client

scripts/llm_client.py provides a llm_call() function with two backends:

Primary: Simon Willison's `llm` library

pip install llm                     # core
llm install llm-anthropic           # Claude
llm install llm-gemini              # Gemini
# … any of the 70+ llm plugins

When the llm package is installed, the client uses it for all calls. This gives you:

70+ model providers via plugins — OpenAI, Anthropic, Gemini, local models, OpenAI-compatible endpoints, etc.
Automatic SQLite logging of every prompt/response (built into llm)
Structured output via schema= with Pydantic or dict
Token counting via response.usage()
API key management via llm keys set (no env vars needed)
CLI parity — agents can also shell out to llm prompt ...

The llm_call_usage() variant returns both the response text and a usage dict with input/output token counts, which is useful for compaction threshold decisions.

Fallback: direct SDKs

If llm is not installed, the client falls back to direct OpenAI / Google Gemini SDK calls. In this mode:

Models starting with gemini → Google GenAI SDK
All others → OpenAI SDK
Required env vars: OPENAI_API_KEY, GEMINI_API_KEY (or GOOGLE_API_KEY)
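
The routing rule can be summarised in a few lines. This sketch only mirrors the decision described above; `pick_backend` and `missing_keys` are hypothetical helpers, not functions exported by llm_client, and the backend labels are illustrative.

```python
import importlib.util
import os


def pick_backend(model, llm_available=None):
    """Return "llm", "google-genai", or "openai" for a model name."""
    if llm_available is None:
        # Detect the llm package without importing it.
        llm_available = importlib.util.find_spec("llm") is not None
    if llm_available:
        return "llm"
    return "google-genai" if model.startswith("gemini") else "openai"


def missing_keys(backend, env=None):
    """List the env vars the fallback backend needs but cannot find."""
    env = os.environ if env is None else env
    if backend == "openai":
        return [] if env.get("OPENAI_API_KEY") else ["OPENAI_API_KEY"]
    if backend == "google-genai":
        if env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY"):
            return []
        return ["GEMINI_API_KEY or GOOGLE_API_KEY"]
    return []  # the llm library manages its own keys
```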

Hook points

llm_client exposes two module-level hooks for bolting LCM-style interception onto any tool:

import llm_client

def my_before_hook(model, prompt, system, **kw):
    print(f"About to call {model} with {len(prompt)} chars")

def my_after_hook(model, response_text, usage, **kw):
    # e.g. store in artifact store, check token budget, trigger compaction
    print(f"Got {len(response_text)} chars, used {usage.get('input', '?')} input tokens")

llm_client.before_call_hook = my_before_hook
llm_client.after_call_hook = my_after_hook

This is how you'd bolt LCM patterns onto standalone tools without privileged access: intercept at the llm_call layer, persist to the artifact store, and trigger compaction when token budgets are exceeded.
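
As a rough sketch of that idea, an after-call hook can keep a running token count and fire a callback once a budget is crossed. The `TokenBudget` class and its `on_exceeded` callback are hypothetical; the hook signature follows the example above, and the "output" key in the usage dict is an assumption (only "input" appears in the example).

```python
class TokenBudget:
    """After-call hook that tracks token usage against a limit."""

    def __init__(self, limit, on_exceeded):
        self.limit = limit
        self.used = 0
        self.on_exceeded = on_exceeded  # e.g. a function that runs compaction

    def __call__(self, model, response_text, usage, **kw):
        usage = usage or {}  # tolerate backends that report no usage
        self.used += int(usage.get("input") or 0) + int(usage.get("output") or 0)
        if self.used > self.limit:
            self.on_exceeded(self.used)
            self.used = 0  # start counting again after compaction


# Wiring it up would look like this (requires scripts/llm_client.py):
#   import llm_client
#   llm_client.after_call_hook = TokenBudget(100_000, on_exceeded=run_compaction)
# where run_compaction is your own function.
```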

Provenance

| Component | Inspired by |
|---|---|
| slice, map, reduce, recurse | RLM (Zhang, Kraska, Khattab 2025): recursive language model inference |
| compact, artifact store, expand | LCM (Ehrlich / Voltropy 2025): lossless context management |
| filter | Practical addition for relevance gating |
| DAG engine | Composition layer enabling flexible routing between all patterns |
| map concurrency model | LCM's LLM-Map operator-level recursion |
| Tree-reduce | Discussed in LCM HN thread as planned llm_reduce |
