Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

jmagly/corpus-export

Name: corpus-export
Author: jmagly

agentic/code/frameworks/research-complete/skills/corpus-export/SKILL.md

npx skillsauth add jmagly/aiwg corpus-export

Corpus Export

Package corpus subsets as distribution archives. Selects papers by cluster, topic, REF range, or custom filter and bundles all artifacts (PDF, analysis doc, citation sidecar, web source, BibTeX) into a portable archive with a manifest.

Triggers

"export the corpus"
"package papers for distribution"
"create a distribution archive"
"export agentic canon"
"corpus export"
/corpus-export

Parameters

Selection (at least one required)

`--cluster <name>`

Select all papers in a named cluster (from /research-gap-detect).

/corpus-export --cluster "Agentic Canon"

`--refs <range>`

Explicit REF range or list. Supports ranges, multi-ranges, and individual IDs.

/corpus-export --refs REF-016:REF-024,REF-121
/corpus-export --refs REF-016,REF-018,REF-024
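
The range syntax above could be expanded along the following lines. This is a minimal sketch, not the skill's actual parser: the function names `parse_refs` and `_ref_number` are hypothetical, and it assumes ranges are inclusive at both ends (which the `REF-001:REF-100` "first-100" example elsewhere in this document suggests) and that the zero-padding width of the start ID is reused for the generated IDs.

```python
import re

REF_PATTERN = re.compile(r"^REF-(\d+)$")


def _ref_number(token: str) -> tuple[int, int]:
    """Return (number, zero-padded width) for a token such as 'REF-016'."""
    match = REF_PATTERN.match(token.strip())
    if match is None:
        raise ValueError(f"not a REF identifier: {token!r}")
    digits = match.group(1)
    return int(digits), len(digits)


def parse_refs(expr: str) -> list[str]:
    """Expand 'REF-016:REF-024,REF-121' into an ordered, de-duplicated list."""
    refs: list[str] = []
    for part in expr.split(","):
        if ":" in part:
            start_token, end_token = part.split(":", 1)
            start, width = _ref_number(start_token)
            end, _ = _ref_number(end_token)
            if end < start:
                raise ValueError(f"descending range: {part!r}")
            numbers = range(start, end + 1)  # assumption: ranges are inclusive
        else:
            number, width = _ref_number(part)
            numbers = range(number, number + 1)
        for n in numbers:
            ref = f"REF-{n:0{width}d}"
            if ref not in refs:
                refs.append(ref)
    return refs
```

With this sketch, `parse_refs("REF-016:REF-024,REF-121")` yields the nine IDs from REF-016 through REF-024 followed by REF-121.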

`--topic <name>`

Select all papers tagged with a specific topic.

/corpus-export --topic "GUI Agents"

`--filter <expr>`

Custom filter expression (frontmatter field comparisons).

/corpus-export --filter "year>=2023 AND incoming>=10"
/corpus-export --filter "grade=High AND tag:reproducibility"
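
One way to evaluate such an expression against a paper's frontmatter is sketched below. The grammar is an assumption drawn only from the examples in this document: clauses joined by `AND`, each either a `field OP value` comparison or a `tag:<name>` membership test. The function name `matches` and the `tags` key are hypothetical, and the real skill may support more operators or a different frontmatter shape.

```python
import operator
import re

# Longer operators come first so ">=" is not read as ">" followed by "=".
_CLAUSE = re.compile(r"^\s*(\w+)\s*(>=|<=|>|<|=)\s*(.+?)\s*$")
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}


def matches(frontmatter: dict, expr: str) -> bool:
    """Return True when every AND-joined clause holds for one paper."""
    for clause in re.split(r"\s+AND\s+", expr.strip()):
        if clause.startswith("tag:"):
            # Assumption: tags live in a list under the "tags" key.
            if clause[len("tag:"):].strip() not in frontmatter.get("tags", []):
                return False
            continue
        parsed = _CLAUSE.match(clause)
        if parsed is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        field, op, raw = parsed.groups()
        if field not in frontmatter:
            return False
        actual = frontmatter[field]
        # Compare numerically when the stored value is a number.
        expected = type(actual)(raw) if isinstance(actual, (int, float)) else raw
        if not _OPS[op](actual, expected):
            return False
    return True
```

A paper with `{"year": 2024, "incoming": 12, "grade": "High", "tags": ["reproducibility"]}` would satisfy both example filters above under these assumptions.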

Options

`--output <path>` (optional)

Output archive path. Default: .aiwg/research/exports/corpus-<selector>-<date>.tar.gz.
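
As an illustration of how the default name might be derived, the sketch below turns a selector value into a lowercase, hyphenated slug. The slug rule is an assumption inferred from the sample archive name `corpus-agentic-canon-2026-04-13.tar.gz` shown later in this document, and `default_output_path` is a hypothetical helper, not part of the skill.

```python
import re
from datetime import date
from pathlib import PurePosixPath

EXPORT_DIR = PurePosixPath(".aiwg/research/exports")


def default_output_path(selector: str, on: date, fmt: str = "tar.gz") -> str:
    """Build the default archive path for a selector value and export date."""
    # Assumption: non-alphanumeric runs collapse to a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", selector.lower()).strip("-")
    return str(EXPORT_DIR / f"corpus-{slug}-{on.isoformat()}.{fmt}")
```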

`--format tar.gz|zip` (optional)

Archive format. Default: tar.gz.

`--include` (optional, repeatable)

Artifact types to include, as a comma-separated list. Default: pdf,analysis,citations,bibtex.

Available: pdf, text, web, analysis, citations, bibtex, metadata, provenance.

`--dry-run` (optional)

List what would be included without creating the archive.

Execution Flow

Phase 1: Selection

Resolve the selection criteria to a list of REF-XXX identifiers:

--cluster: look up cluster in citation-network index, return member REFs
--refs: parse range expression
--topic: scan findings frontmatter for matching tags
--filter: evaluate expression against frontmatter

Report resolved selection:

Selection: "Agentic Canon" cluster
Papers: 17 (REF-001, REF-016, REF-018, REF-024, ...)

Phase 2: Artifact Gathering

For each selected REF, gather the configured artifact types from canonical locations:

REF-016:
  ✓ PDF: sources/pdfs/full/REF-016-autogen.pdf (2.4 MB)
  ✓ Analysis: findings/REF-016-autogen.md (287 lines)
  ✓ Citations: documentation/citations/REF-016.md (43 outgoing, 12 incoming)
  ✓ BibTeX: citations/bibtex/REF-016.bib
  ✗ Web: no web source (PDF primary)
  ✓ Metadata: sources/metadata/REF-016.yaml

Flag missing artifacts:

REF-299:
  ✗ PDF: MISSING (acquisition failed)
  ✓ Analysis: findings/REF-299-stub.md (22 lines — STUB)
  ...
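
The gathering step can be sketched as a lookup of one glob pattern per artifact type. The directory layout below is an assumption taken from the sample output above; the `web`, `text`, and `provenance` types are left out because this document does not show where they live. `ARTIFACT_GLOBS`, `gather`, and `missing` are hypothetical names.

```python
from pathlib import Path
from typing import Optional

# Assumed layout, inferred from the sample output; the real skill may differ.
ARTIFACT_GLOBS = {
    "pdf": "sources/pdfs/full/{ref}-*.pdf",
    "analysis": "findings/{ref}-*.md",
    "citations": "documentation/citations/{ref}.md",
    "bibtex": "citations/bibtex/{ref}.bib",
    "metadata": "sources/metadata/{ref}.yaml",
}


def gather(corpus_root: Path, ref: str, kinds: list[str]) -> dict[str, Optional[Path]]:
    """Map each requested artifact type to its file, or None when missing."""
    found: dict[str, Optional[Path]] = {}
    for kind in kinds:
        pattern = ARTIFACT_GLOBS[kind].format(ref=ref)
        hits = sorted(corpus_root.glob(pattern))
        found[kind] = hits[0] if hits else None
    return found


def missing(found: dict[str, Optional[Path]]) -> list[str]:
    """List the artifact types that could not be located."""
    return [kind for kind, path in found.items() if path is None]
```

Returning `None` rather than raising keeps the export going, so missing files can be reported in the manifest instead of aborting the run.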

Phase 3: Manifest Generation

Write a MANIFEST.md to the archive root describing the export:

# Corpus Export Manifest

**Date**: 2026-04-13
**Selector**: --cluster "Agentic Canon"
**Papers**: 17
**Total size**: 48.3 MB

## Contents

| REF | Title | Year | GRADE | PDF | Analysis | Citations |
|-----|-------|------|-------|-----|----------|-----------|
| REF-016 | AutoGen | 2023 | High | ✓ | 287 lines | 43/12 |
| REF-018 | Multi-Agent Debate | 2024 | High | ✓ | 312 lines | 28/17 |
...

## Missing Artifacts

- REF-299: PDF missing (acquisition failed)
- REF-312: Analysis doc is a skeleton (<40 lines)

## Provenance

Generated by `corpus-export` v1.0 from corpus at:
- Fixity manifest: .aiwg/research/fixity-manifest.json (checksum: abc123...)
- Citation graph: indices/citation-network.md (generated 2026-04-13T10:00Z)
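
A manifest of this shape could be rendered from per-paper records roughly as follows. This is a sketch under stated assumptions: the `PaperRow` fields and `render_manifest` are hypothetical, the Citations column is read as outgoing/incoming (matching the "43 outgoing, 12 incoming" sample in Phase 2), and the total size and Provenance section are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class PaperRow:
    ref: str
    title: str
    year: int
    grade: str
    has_pdf: bool
    analysis_lines: int
    outgoing: int
    incoming: int


def render_manifest(date: str, selector: str, rows: list[PaperRow],
                    missing_notes: list[str]) -> str:
    """Render the header, Contents table, and Missing Artifacts list."""
    lines = [
        "# Corpus Export Manifest",
        "",
        f"**Date**: {date}",
        f"**Selector**: {selector}",
        f"**Papers**: {len(rows)}",
        "",
        "## Contents",
        "",
        "| REF | Title | Year | GRADE | PDF | Analysis | Citations |",
        "|-----|-------|------|-------|-----|----------|-----------|",
    ]
    for row in rows:
        pdf_mark = "✓" if row.has_pdf else "✗"
        lines.append(
            f"| {row.ref} | {row.title} | {row.year} | {row.grade} | {pdf_mark} "
            f"| {row.analysis_lines} lines | {row.outgoing}/{row.incoming} |"
        )
    lines += ["", "## Missing Artifacts", ""]
    lines += [f"- {note}" for note in missing_notes]
    return "\n".join(lines) + "\n"
```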

Phase 4: Archive Creation

Create the archive with structure:

corpus-agentic-canon-2026-04-13.tar.gz
├── MANIFEST.md
├── pdfs/
│   ├── REF-016-autogen.pdf
│   ├── REF-018-multi-agent-debate.pdf
│   └── ...
├── findings/
│   ├── REF-016-autogen.md
│   └── ...
├── citations/
│   ├── REF-016.md
│   └── ...
├── bibtex/
│   ├── REF-016.bib
│   └── all.bib                    # concatenated bibliography
└── README.md                      # extraction + usage instructions
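
The layout above could be produced with Python's standard `tarfile` module, as in the sketch below. It covers only the tar.gz format, writes MANIFEST.md from a string, and builds `bibtex/all.bib` by concatenating the individual `.bib` files; README.md generation is left out. `build_archive` and `_add_text` are hypothetical helpers, and the caller is assumed to supply a mapping from archive member names to source files.

```python
import io
import tarfile
from pathlib import Path


def _add_text(tar: tarfile.TarFile, name: str, text: str) -> None:
    """Add an in-memory text file to the archive under the given name."""
    data = text.encode("utf-8")
    info = tarfile.TarInfo(name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))


def build_archive(archive_path: Path, manifest_text: str,
                  files: dict[str, Path]) -> None:
    """Write a tar.gz whose members follow the export layout.

    `files` maps member names such as 'pdfs/REF-016-autogen.pdf'
    to the source file on disk.
    """
    bib_parts: list[str] = []
    with tarfile.open(archive_path, "w:gz") as tar:
        _add_text(tar, "MANIFEST.md", manifest_text)
        for arcname, source in sorted(files.items()):
            tar.add(source, arcname=arcname)
            if arcname.startswith("bibtex/"):
                bib_parts.append(source.read_text(encoding="utf-8").strip())
        if bib_parts:
            # Concatenated bibliography, one blank line between entries.
            _add_text(tar, "bibtex/all.bib", "\n\n".join(bib_parts) + "\n")
```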

Phase 5: Report

Corpus Export Complete
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Selector: --cluster "Agentic Canon"
Papers selected: 17
Artifacts bundled: 68 files
Missing artifacts: 2 (reported in MANIFEST.md)

Archive: .aiwg/research/exports/corpus-agentic-canon-2026-04-13.tar.gz
Size: 48.3 MB
SHA-256: abc123def456...

Contents:
  17 PDFs (45.1 MB)
  17 analysis docs (1.2 MB)
  17 citation sidecars (0.8 MB)
  17 BibTeX entries + all.bib (50 KB)
  1 MANIFEST.md (4 KB)
  1 README.md (2 KB)
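
The SHA-256 line in the report can be computed with the standard `hashlib` module. The sketch below reads the archive in chunks so that a large export does not have to fit in memory; `sha256_of` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recipients can run the same function on the archive they receive and compare the result with the value in the report.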

Archive Use Cases

Research sharing

Share a cluster with collaborators without sharing the entire corpus.

/corpus-export --cluster "Agentic Canon"

Snapshot for publication

Package the corpus state referenced by a paper for reproducibility.

/corpus-export --refs REF-016:REF-024 --include pdf,analysis,citations,provenance

Topic digest

Export everything on a specific topic for a focused review.

/corpus-export --topic "Evaluation" --filter "year>=2023"

Quality subset

Export only high-quality sources.

/corpus-export --filter "grade=High"

Integration Points

| Component | Relationship |
|-----------|-------------|
| research-gap-detect | Provides --cluster names |
| corpus-index-build | Provides topic and metadata for selection |
| research-quality-audit | Flags missing/skeleton artifacts in manifest |
| research-cite | Generates BibTeX entries bundled in export |
| Media curator /acquire | Source of PDF files packaged into export |

Examples

# Export a named cluster
/corpus-export --cluster "Agentic Canon"

# Export a REF range
/corpus-export --refs REF-016:REF-024,REF-121

# Export by topic
/corpus-export --topic "GUI Agents"

# Filter: recent high-grade papers with many citations
/corpus-export --filter "year>=2023 AND grade=High AND incoming>=10"

# Preview without creating archive
/corpus-export --cluster "Agentic Canon" --dry-run

# Minimal export (analysis docs and citation sidecars only)
/corpus-export --topic "Reproducibility" --include analysis,citations

# Custom output path
/corpus-export --refs REF-001:REF-100 --output /tmp/first-100.tar.gz

References

@$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-gap-detect/SKILL.md — Provides cluster names
@$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/corpus-index-build/SKILL.md — Provides topic/metadata indices
@$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-quality-audit/SKILL.md — Flags missing artifacts
@$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-cite/SKILL.md — Generates BibTeX
@$AIWG_ROOT/docs/integrations/media-curator-to-research-handoff.md — Source acquisition contract

128 stars

testing

Updated May 8, 2026

$ install --global

skillsauth

npx skillsauth add jmagly/aiwg corpus-export

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy (container and dependency vulnerability scanner): Clean, 95%
- Semgrep (static code analysis for vulnerabilities): Clean, 95%
- mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
- Snyk (dep) (open source security scanning): Skipped, 50%
- Socket.dev (supply chain security analysis): Skipped, 50%
- VirusTotal (multi-engine malware detection): Skipped, 50%
- CrowdStrike (advanced threat intelligence): Skipped, 50%
- OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
- OWASP Dep-Check: Skipped, 50%

Last scanned: May 8, 2026, 6:09 AM | 206.7s | 1 file scanned

SKILL.md

namespace: aiwg
platforms: [all]
name: corpus-export
description: Package corpus subsets as distribution archives. Filter by cluster, topic, REF range, or custom expression; bundles PDFs, analysis, citations, and BibTeX into tar.gz.
argumentHint: [--cluster <name>] [--refs <range>] [--topic <name>] [--filter <expr>] [--output <path>] [--format tar.gz|zip]
allowedTools: Read, Write, Glob, Grep, Bash
model: sonnet
category: research-distribution

Related Skills

jmagly/radar-status

data-ai

Verified | Trusted | Community

Report which research-corpus radar sidecars are overdue for refresh. Computes staleness (days since last refresh vs the cadence window) for every radar, sorted most-overdue-first. Runs via `aiwg corpus radar-status`.

140 | SKILL.md | Updated May 28, 2026

jmagly/radar-report

data-ai

Verified | Trusted | Community

Aggregate research-corpus radar sidecars into a corpus or per-cluster freshness report — totals, overdue count, per-cluster / per-GRADE / per-trajectory breakdowns, an overdue table, and per-radar rationale snippets. Runs via `aiwg corpus radar-report`.

140 | SKILL.md | Updated May 28, 2026

jmagly/radar-init

testing

Verified | Trusted | Community

Scaffold radar/freshness sidecars for research-corpus REFs. Pulls title/authors from the citation sidecar and GRADE from the analysis doc, defaults the refresh cadence from GRADE and the cluster from a corpus-local map, and stamps documentation/radar/REF-XXX-radar.md. Runs via `aiwg corpus radar-init`.

140 | SKILL.md | Updated May 28, 2026

jmagly/profile-temporal

data-ai

Verified | Trusted | Community

Compute an entity's publication trajectory — per-year paper counts, topic drift, hot-streak detection (≥3 consecutive A-grade years), and career phase. Runs via `aiwg corpus profile-temporal`.

140 | SKILL.md | Updated May 28, 2026

Install manually

# Clone the repo
git clone https://github.com/jmagly/aiwg.git

# Copy into Claude Code skills folder (global)
cp -r aiwg/agentic/code/frameworks/research-complete/skills/corpus-export ~/.claude/skills/

Repository

jmagly/aiwg

128 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
