jmagly/research-acquire

Name: research-acquire
Author: jmagly

agentic/code/frameworks/research-complete/skills/research-acquire/SKILL.md

npx skillsauth add jmagly/aiwg research-acquire

Research Acquire Command

Download research papers from public repositories and extract metadata.

Instructions

When invoked, perform automated paper acquisition:

Identify Source
- Parse DOI, arXiv ID, or URL
- Determine paper hosting location
- Check if paper already exists in .aiwg/research/sources/
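A minimal sketch of the identifier-parsing step, assuming simple regular expressions are enough to tell the three accepted forms apart. The function name `classify_identifier` and the patterns are hypothetical; the skill does not specify its actual parsing rules.

```python
import re

# Hypothetical patterns for illustration only; not taken from the skill's implementation.
ARXIV_DOI_RE = re.compile(r"^10\.48550/arxiv\.(\d{4}\.\d{4,5})(?:v\d+)?$", re.IGNORECASE)
ARXIV_ID_RE = re.compile(r"^(?:arxiv:)?(\d{4}\.\d{4,5})(?:v\d+)?$", re.IGNORECASE)
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")


def classify_identifier(raw: str) -> dict:
    """Classify a user-supplied identifier as a URL, an arXiv ID, or a DOI."""
    text = raw.strip()
    if text.lower().startswith(("http://", "https://")):
        return {"kind": "url", "value": text}
    # arXiv-issued DOIs embed the arXiv ID, so they are treated as arXiv papers.
    for pattern in (ARXIV_DOI_RE, ARXIV_ID_RE):
        match = pattern.match(text)
        if match:
            return {"kind": "arxiv", "value": match.group(1)}
    if DOI_RE.match(text):
        return {"kind": "doi", "value": text}
    raise ValueError(f"Unrecognized identifier: {raw!r}")
```

Checking URLs first keeps a link such as `https://arxiv.org/pdf/2308.08155.pdf` from being misread as a bare arXiv ID.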
Download Paper
- Attempt direct PDF download from source
- Try fallback sources (arXiv mirror, Unpaywall, PMC)
- Save to .aiwg/research/sources/[ref-id].pdf
- Verify download integrity (file size, PDF structure)
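The integrity check on file size and PDF structure could look roughly like the sketch below. It assumes a cheap structural test is sufficient: a `%PDF-` header near the start of the file and an `%%EOF` marker near the end. The function name and the `MIN_PDF_BYTES` threshold are hypothetical.

```python
from pathlib import Path

MIN_PDF_BYTES = 1024  # hypothetical lower bound; tune for the corpus


def looks_like_pdf(path: Path, min_bytes: int = MIN_PDF_BYTES) -> bool:
    """Return True if the file is large enough and has PDF header/trailer markers."""
    size = path.stat().st_size
    if size < min_bytes:
        return False
    with path.open("rb") as handle:
        head = handle.read(1024)
        handle.seek(max(0, size - 1024))
        tail = handle.read()
    # A publisher's HTML error page saved with a .pdf extension fails this check.
    return b"%PDF-" in head and b"%%EOF" in tail
```

This only catches truncated downloads and non-PDF responses; it does not prove the document can be parsed.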
Extract Metadata
- Parse PDF metadata (title, authors, year)
- Query CrossRef/Semantic Scholar for enhanced metadata
- Extract abstract, keywords, citation count
- Determine source type (journal, conference, preprint)
Generate Frontmatter
- Create YAML frontmatter per @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/schemas/research/frontmatter-schema.yaml
- Assign REF-XXX identifier
- Calculate PDF checksum (SHA-256)
- Set initial GRADE baseline from source type
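For the SHA-256 checksum, one possible approach is to hash the PDF in chunks so that large files never have to fit in memory. The helper name `sha256_of_file` is hypothetical; only Python's standard `hashlib` is used.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```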
Extract Full Text (default, unless --no-extract-text)
- Extract full text from PDF to .aiwg/research/sources/text/REF-XXX.txt
- This text is the primary input for downstream analysis — analysis agents must read this file, not just metadata or abstract
- If extraction fails (scanned PDF, encrypted): log warning, set full_text_available: false in frontmatter
Create Finding Document
- Generate .aiwg/research/findings/REF-XXX-[slug].md from template
- Populate frontmatter with extracted metadata
- Add placeholder sections for key findings
- Update fixity manifest
Post-Acquisition
- Log acquisition in .aiwg/research/acquisition-log.yaml
- Update corpus index
- Suggest next steps (quality assessment, documentation)

Arguments

[identifier] - DOI, arXiv ID, or URL (required)
--output [path] - Custom output location (default: auto-generate)
--ref-id [REF-XXX] - Specific REF-XXX identifier (default: auto-assign)
--extract-text - Extract full text to .txt file for analysis (default: enabled; use --no-extract-text to skip)
--no-metadata - Skip metadata enrichment
--force - Re-download even if paper exists
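The argument list above maps naturally onto a standard command-line parser. The sketch below uses Python's `argparse` purely as an illustration of the flag semantics; the skill itself is invoked as a slash command, and `build_parser` is a hypothetical name. `BooleanOptionalAction` requires Python 3.9 or later.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented research-acquire arguments."""
    parser = argparse.ArgumentParser(prog="research-acquire")
    parser.add_argument("identifier", help="DOI, arXiv ID, or URL")
    parser.add_argument("--output", metavar="PATH", default=None,
                        help="custom output location (default: auto-generate)")
    parser.add_argument("--ref-id", metavar="REF-XXX", default=None,
                        help="specific identifier (default: auto-assign)")
    # Registers both --extract-text and --no-extract-text; extraction is on by default.
    parser.add_argument("--extract-text", action=argparse.BooleanOptionalAction,
                        default=True, help="extract full text to a .txt file")
    parser.add_argument("--no-metadata", dest="metadata", action="store_false",
                        help="skip metadata enrichment")
    parser.add_argument("--force", action="store_true",
                        help="re-download even if the paper exists")
    return parser
```

Because extraction defaults to enabled, passing `--extract-text` explicitly changes nothing; only `--no-extract-text` alters behavior.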

Examples

# Acquire by DOI
/research-acquire 10.48550/arXiv.2308.08155

# Acquire by arXiv ID
/research-acquire arXiv:2308.08155

# Acquire with custom identifier
/research-acquire https://arxiv.org/pdf/2308.08155.pdf --ref-id REF-022

# Acquire with full text extraction (already the default; the flag is optional)
/research-acquire 10.1145/3377811.3380330 --extract-text

Expected Output

Acquiring Paper: 10.48550/arXiv.2308.08155
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Step 1: Resolving identifier
  ✓ DOI resolved to arXiv:2308.08155
  ✓ Paper not found in corpus

Step 2: Downloading PDF
  ✓ Downloaded from arxiv.org (2.4 MB)
  ✓ Saved to .aiwg/research/sources/REF-022.pdf
  ✓ Checksum: a1b2c3d4e5f6...

Step 3: Extracting metadata
  ✓ Title: AutoGen: Enabling Next-Gen LLM Applications...
  ✓ Authors: Wu, Q., Bansal, G., Zhang, J., et al. (9 authors)
  ✓ Year: 2023
  ✓ Source: arXiv preprint
  ✓ Citations: 234 (as of 2026-02-03)

Step 4: Creating finding document
  ✓ Generated .aiwg/research/findings/REF-022-autogen.md
  ✓ Frontmatter populated
  ✓ Template sections added

Step 5: Updating corpus
  ✓ Added to fixity manifest
  ✓ Updated INDEX.md
  ✓ Logged acquisition

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Acquisition complete!

REF-ID: REF-022
Title: AutoGen: Enabling Next-Gen LLM Applications...
File: .aiwg/research/sources/REF-022.pdf
Finding: .aiwg/research/findings/REF-022-autogen.md

Next Steps:
1. /research-quality REF-022 - Assess evidence quality
2. /research-document REF-022 - Create detailed summary
3. /research-cite REF-022 - Generate citation

Provenance Tracking

All acquisitions create provenance records:

# .aiwg/research/provenance/records/REF-022-acquisition.yaml
entity:
  id: "urn:aiwg:artifact:.aiwg/research/sources/REF-022.pdf"
  type: "research_paper"

activity:
  id: "urn:aiwg:activity:acquisition:REF-022:001"
  type: "acquisition"
  started_at: "2026-02-03T12:00:00Z"
  ended_at: "2026-02-03T12:00:15Z"

agent:
  id: "urn:aiwg:agent:research-acquisition-agent"
  type: "aiwg_agent"

source:
  identifier: "10.48550/arXiv.2308.08155"
  url: "https://arxiv.org/pdf/2308.08155.pdf"
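A record shaped like the one above could be assembled as in the following sketch. The field names and URN layout are copied from the example record; the function names, the zero-padded sequence number, and the hand-rolled YAML rendering are assumptions made for illustration.

```python
from datetime import datetime, timezone


def _utc_stamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_provenance_record(ref_id: str, identifier: str, url: str,
                            started_at: datetime, ended_at: datetime,
                            sequence: int = 1) -> dict:
    """Assemble an acquisition provenance record matching the example layout."""
    artifact_path = f".aiwg/research/sources/{ref_id}.pdf"
    return {
        "entity": {"id": f"urn:aiwg:artifact:{artifact_path}", "type": "research_paper"},
        "activity": {
            "id": f"urn:aiwg:activity:acquisition:{ref_id}:{sequence:03d}",
            "type": "acquisition",
            "started_at": _utc_stamp(started_at),
            "ended_at": _utc_stamp(ended_at),
        },
        "agent": {"id": "urn:aiwg:agent:research-acquisition-agent", "type": "aiwg_agent"},
        "source": {"identifier": identifier, "url": url},
    }


def to_yaml(record: dict) -> str:
    """Render the two-level record as YAML text (values must not contain quotes)."""
    lines = []
    for section, fields in record.items():
        lines.append(f"{section}:")
        lines.extend(f'  {key}: "{value}"' for key, value in fields.items())
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

A real implementation would more likely use a YAML library so that quoting and escaping are handled correctly.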

References

@$AIWG_ROOT/agentic/code/frameworks/research-complete/agents/research-acquisition-agent.md - Acquisition Agent
@$AIWG_ROOT/src/research/services/acquisition-service.ts - Download implementation
@$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/schemas/research/frontmatter-schema.yaml - Metadata format
@.aiwg/research/fixity-manifest.json - Checksum tracking
@$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/rules/provenance-tracking.md - Provenance requirements

Storage Routing (#934, #968)

This skill reads and writes its files through resolveStorage('research'). On the default fs backend, the research corpus lives at .aiwg/research/. Heavy artifacts (papers, archived sources) can be moved to a secondary drive by setting roots.research in .aiwg/storage.config, one of the headline use cases of #934.

aiwg research-store path                            # resolved root
aiwg research-store list --prefix sources/
aiwg research-store get sources/paper-123.md

jmagly/research-acquire

Download research papers and extract metadata

128 stars

data-ai

Updated May 11, 2026

Install this skill globally with one skillsauth command:

npx skillsauth add jmagly/aiwg research-acquire

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy (container and dependency vulnerability scanner): Clean, 95%
- Semgrep (static code analysis for vulnerabilities): Clean, 95%
- mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
- Snyk (dep) (open source security scanning): Skipped, 50%
- Socket.dev (supply chain security analysis): Skipped, 50%
- VirusTotal (multi-engine malware detection): Skipped, 50%
- CrowdStrike (advanced threat intelligence): Skipped, 50%
- OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
- OWASP Dep-Check: Skipped, 50%

Last scanned: May 11, 2026, 6:22 AM (163.0s, 1 file scanned)

SKILL.md

namespace: aiwg
name: research-acquire
platforms: [all]
description: Download research papers and extract metadata
argumentHint: "[DOI or arXiv ID] [--output path] [--extract-text]"
category: research-acquisition

Related Skills

jmagly/radar-status (data-ai)

Report which research-corpus radar sidecars are overdue for refresh. Computes staleness (days since last refresh vs the cadence window) for every radar, sorted most-overdue-first. Runs via `aiwg corpus radar-status`.

140 · SKILL.md · Updated May 28, 2026

jmagly/radar-report (data-ai)

Aggregate research-corpus radar sidecars into a corpus or per-cluster freshness report: totals, overdue count, per-cluster / per-GRADE / per-trajectory breakdowns, an overdue table, and per-radar rationale snippets. Runs via `aiwg corpus radar-report`.

140 · SKILL.md · Updated May 28, 2026

jmagly/radar-init (testing)

Scaffold radar/freshness sidecars for research-corpus REFs. Pulls title/authors from the citation sidecar and GRADE from the analysis doc, defaults the refresh cadence from GRADE and the cluster from a corpus-local map, and stamps documentation/radar/REF-XXX-radar.md. Runs via `aiwg corpus radar-init`.

140 · SKILL.md · Updated May 28, 2026

jmagly/profile-temporal (data-ai)

Compute an entity's publication trajectory: per-year paper counts, topic drift, hot-streak detection (≥3 consecutive A-grade years), and career phase. Runs via `aiwg corpus profile-temporal`.

140 · SKILL.md · Updated May 28, 2026

Install manually

# Clone the repo
git clone https://github.com/jmagly/aiwg.git

# Copy into Claude Code skills folder (global); create the folder first if it is missing
mkdir -p ~/.claude/skills
cp -r aiwg/agentic/code/frameworks/research-complete/skills/research-acquire ~/.claude/skills/

Repository: jmagly/aiwg

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT