
maedoc/tvb-wiki-ingestion

Name: tvb-wiki-ingestion
Author: maedoc

skills/tvb-wiki-ingestion/SKILL.md

npx skillsauth add maedoc/tvb-wiki tvb-wiki-ingestion

TVB Wiki Ingestion

Skill for fetching new arXiv papers relevant to connectome‑based whole‑brain modeling, extracting software/concept/author mentions, and updating the wiki’s raw paper library and entity counts.

When This Skill Activates

Use this skill when:

You need to fetch the latest arXiv papers for the TVB wiki
You want to update the raw paper library without running the full pipeline
You are setting up a new ingestion source (bioRxiv, GitHub, etc.)
The user asks “fetch new papers for the wiki” or “update the paper database”

Prerequisites

TVB wiki at ~/tvb-wiki (or the path set in the WIKI_PATH environment variable)
Python 3.11+ with the requests package (third-party, installed separately) and xml.etree.ElementTree (standard library)
Network access to the arXiv API (https://export.arxiv.org)

Architecture

~/tvb-wiki/
├── raw/papers/               # Raw paper markdown files (arxiv-*.md)
├── meta/entity_counts.json   # Counts of software/concept/author mentions
└── scripts/hourly_update.py  # Main ingestion script
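Before running anything, it can help to confirm that the wiki root resolves and has the layout shown above. The following is a minimal sketch, not code from the repository: the helper names resolve_wiki_path and missing_layout are hypothetical, and it assumes WIKI_PATH, when set, points at the wiki root.

```python
import os
from pathlib import Path


def resolve_wiki_path(env=None):
    """Return the wiki root: WIKI_PATH if set, otherwise ~/tvb-wiki."""
    env = os.environ if env is None else env
    override = env.get("WIKI_PATH")
    if override:
        return Path(override).expanduser()
    return Path.home() / "tvb-wiki"


def missing_layout(root):
    """List the expected sub-paths (from the tree above) that are absent."""
    expected = [
        "raw/papers",
        "meta/entity_counts.json",
        "scripts/hourly_update.py",
    ]
    return [rel for rel in expected if not (root / rel).exists()]
```

An empty list from missing_layout(resolve_wiki_path()) suggests the checkout is complete enough to run the ingestion script.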

The ingestion script (hourly_update.py) performs:

Query arXiv with domain‑specific search queries
Filter duplicates by arXiv ID
Save new papers as markdown in raw/papers/
Extract entities via keyword matching
Update entity counts in meta/entity_counts.json
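The first two steps can be sketched as follows. This is an illustration under stated assumptions rather than the actual contents of hourly_update.py: parse_entries and filter_new are hypothetical names, and the record fields (id, title, summary) are chosen for the example. The Atom namespace and the form of the id element (http://arxiv.org/abs/ followed by the versioned ID) follow the arXiv API's Atom output.

```python
import re
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def parse_entries(atom_xml):
    """Extract base ID, title, and summary from an arXiv Atom feed string."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall(f"{ATOM}entry"):
        url = entry.findtext(f"{ATOM}id", default="")
        # The <id> element looks like http://arxiv.org/abs/2401.01234v2
        versioned = url.rsplit("/abs/", 1)[-1]
        title = entry.findtext(f"{ATOM}title", default="")
        summary = entry.findtext(f"{ATOM}summary", default="")
        papers.append({
            "id": re.sub(r"v\d+$", "", versioned),  # drop the version suffix
            "title": " ".join(title.split()),       # collapse line wraps
            "summary": " ".join(summary.split()),
        })
    return papers


def filter_new(papers, known_ids):
    """Drop papers already stored, plus repeats within the same batch."""
    seen = set(known_ids)
    fresh = []
    for paper in papers:
        if paper["id"] not in seen:
            seen.add(paper["id"])
            fresh.append(paper)
    return fresh
```

Because several search queries can return the same paper, the within-batch check matters as much as the check against files already on disk.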

Step‑by‑Step Ingestion

1. Run the Ingestion Script

cd ~/tvb-wiki
python3 scripts/hourly_update.py

Output: Prints number of new papers added, e.g., Hourly update complete: 3 new papers.

2. Inspect New Papers

List recently added raw papers:

ls -lt ~/tvb-wiki/raw/papers/arxiv-*.md | head -5

View a raw paper:

cat ~/tvb-wiki/raw/papers/arxiv-<ID>.md

3. Check Entity Counts

The script updates meta/entity_counts.json with mention counts. View current counts (requires jq):

jq . ~/tvb-wiki/meta/entity_counts.json

Example structure:

{
  "software": {
    "TVB": 42,
    "NEST": 18,
    "NEURON": 12
  },
  "concepts": {
    "neural mass model": 56,
    "fMRI": 89,
    "EEG": 67
  },
  "authors": {
    "John Doe": 5,
    "Jane Smith": 3
  }
}
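If jq is not available, the same file can be inspected from Python. This is a small sketch that assumes the structure shown above (category names mapping to name/count pairs); load_counts and top_entities are hypothetical helpers, not functions from the repository.

```python
import json
from pathlib import Path


def load_counts(path):
    """Read entity_counts.json; return an empty dict if the file is missing."""
    file = Path(path)
    if not file.exists():
        return {}
    return json.loads(file.read_text(encoding="utf-8"))


def top_entities(counts, category, n=3):
    """Return the n most-mentioned (name, count) pairs, highest count first."""
    items = counts.get(category, {}).items()
    # Sort by descending count, then by name so ties are deterministic.
    return sorted(items, key=lambda pair: (-pair[1], pair[0]))[:n]
```

For example, top_entities(load_counts(Path.home() / "tvb-wiki" / "meta" / "entity_counts.json"), "software") lists the most frequently mentioned software packages.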

4. Manual Ingestion with Custom Queries

If you need to ingest papers from other arXiv categories or with different keywords, you can modify the search queries inside hourly_update.py (function fetch_arxiv_since). Default queries:

queries = [
    'cat:q-bio.NC+AND+all:connectome',
    'all:neural+mass+model',
    'all:dynamic+causal+modeling',
    'all:The+Virtual+Brain',
    'all:TVB',
    'all:whole+brain+model',
]

Add new queries, adjust max_results, or change the since_date threshold.
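To check how a query string turns into a request, the sketch below builds the request URL for one query. It assumes the queries are already encoded with + for spaces, as in the list above, and uses the arXiv API's documented parameters (search_query, start, max_results, sortBy, sortOrder). The function name build_query_url and the default of 25 results are assumptions for illustration; the script may construct its URLs differently.

```python
ARXIV_API = "https://export.arxiv.org/api/query"


def build_query_url(query, max_results=25, start=0):
    """Build an arXiv API URL for one pre-encoded search query."""
    if " " in query:
        raise ValueError("encode spaces as '+' before building the URL")
    return (
        f"{ARXIV_API}?search_query={query}"
        f"&start={start}&max_results={max_results}"
        "&sortBy=submittedDate&sortOrder=descending"
    )
```

Sorting by submission date in descending order keeps the newest papers at the top of each response, which fits an incremental update that only needs papers newer than since_date.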

5. Add a New Ingestion Source (e.g., bioRxiv)

To extend beyond arXiv:

Create a new function fetch_biorxiv_since() that uses the bioRxiv RSS feed or API.
Merge results with arXiv papers.
Assign a source: bioRxiv field and save with filename prefix biorxiv-.
Update entity extraction accordingly.

See the automated‑research‑wiki skill for a multi‑source pattern.
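One way to keep sources apart while merging them is to carry the source on each record and derive the filename prefix from it. The sketch below is hypothetical: the Paper class, its fields, and merge_sources are illustrative names, and it does not call the bioRxiv API or reproduce the script's own data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    source: str    # "arXiv" or "bioRxiv"
    paper_id: str  # source-specific identifier (arXiv ID, DOI, ...)
    title: str

    @property
    def filename(self):
        # Prefix by source; replace "/" so the ID is a safe file name.
        safe_id = self.paper_id.replace("/", "_")
        return f"{self.source.lower()}-{safe_id}.md"


def merge_sources(*batches):
    """Concatenate batches, keeping the first paper seen per (source, id)."""
    merged, seen = [], set()
    for batch in batches:
        for paper in batch:
            key = (paper.source, paper.paper_id)
            if key not in seen:
                seen.add(key)
                merged.append(paper)
    return merged
```

Keying duplicates on the (source, id) pair means an arXiv ID and a bioRxiv DOI can never collide, at the cost of not detecting a preprint posted to both servers.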

Entity Extraction Details

The current extraction uses simple keyword matching. Keywords are defined in extract_entities:

Software: TVB, The Virtual Brain, NEST, NEURON, Brian, ANTs, SPM, FSL, FreeSurfer
Concepts: neural mass model, dynamic causal modeling, Wilson‑Cowan, Jansen‑Rit, fMRI, EEG, MEG, DTI, functional connectivity, structural connectivity, effective connectivity, resting‑state, whole‑brain, connectomics, brain network

Limitation: Keyword matching may miss synonyms or paraphrases. For more robust extraction, consider using a lightweight NLP library (spaCy) or an LLM‑based extractor.
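A minimal sketch of keyword matching with word boundaries is shown below. It is not the body of extract_entities: find_mentions is a hypothetical helper, the keyword lists are abbreviated from the ones above, and matching software names case-sensitively is an assumption made here so that "NEST" and "NEURON" do not match the ordinary words "nest" and "neuron".

```python
import re

SOFTWARE = ["TVB", "The Virtual Brain", "NEST", "NEURON", "FreeSurfer"]
CONCEPTS = ["neural mass model", "fMRI", "EEG", "resting-state"]


def find_mentions(text, keywords, case_sensitive=False):
    """Return the keywords that occur in text as whole words or phrases."""
    flags = 0 if case_sensitive else re.IGNORECASE
    found = []
    for keyword in keywords:
        # Lookarounds act as word boundaries that also work next to hyphens.
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags):
            found.append(keyword)
    return found
```

Calling find_mentions(abstract, SOFTWARE, case_sensitive=True) and find_mentions(abstract, CONCEPTS) separately lets each category use the matching rule that suits it.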

Integration with Wiki Pages

Entity counts are used by the wiki’s page‑creation logic (in hourly_full.py). When a software/concept/author reaches a threshold (default: 2 mentions), a corresponding page is created in entities/ or concepts/. This happens during the full pipeline, not during ingestion alone.
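The threshold rule can be sketched as a filter over the counts file. This is a hypothetical illustration, not the logic in hourly_full.py: entities_needing_pages and the existing set of (category, name) pairs are assumptions, and "reaches a threshold" is read here as greater than or equal to.

```python
def entities_needing_pages(counts, existing, threshold=2):
    """Return (category, name) pairs at or above threshold that lack a page."""
    pending = []
    for category, names in counts.items():
        for name, mentions in names.items():
            if mentions >= threshold and (category, name) not in existing:
                pending.append((category, name))
    return sorted(pending)
```

Running a check like this after ingestion gives a preview of which pages the next full pipeline run would be expected to create.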

For Other Agents (Codex, Pi)

Agents that discover this skill can:

Locate the script – scripts/hourly_update.py
Run ingestion – execute the Python script as described.
Adapt to their environment – set WIKI_PATH if the repo is elsewhere.
Extend with new sources – follow the pattern to add bioRxiv, GitHub, etc.

Skill Metadata for Discovery

This skill is listed in the repo’s skill‑manifest.json:

{
  "name": "tvb-wiki-ingestion",
  "path": "skills/tvb-wiki-ingestion/SKILL.md",
  "description": "arXiv paper ingestion and entity/concept extraction.",
  "entry_point": "scripts/hourly_update.py",
  "dependencies": ["python3"],
  "schedule": "hourly"
}

Pitfalls

arXiv rate limits – arXiv's API terms of use ask for no more than one request every three seconds. The script includes time.sleep(1) between queries, which is shorter than that; consider raising the delay to 3 seconds.
Duplicate detection – Duplicates are detected by base arXiv ID, with the version suffix stripped. A revised paper (v2, v3) therefore matches the file already stored and is skipped, so revisions are currently ignored.
Network timeouts – If the arXiv API is unreachable, the script catches the exception and fails silently. A run that reports no new papers can therefore mean a failed request, so check network connectivity before concluding that nothing new was published.
Keyword misses – New software or concept terms may not be in the keyword list. Periodically review and update the keyword lists.
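The rate-limit and silent-failure pitfalls can both be addressed by a small wrapper around the per-query fetch. The sketch below is hypothetical and not taken from hourly_update.py: fetch_all, its fetch callback, and the injectable sleep parameter are illustrative, and the 3-second default is a conservative assumption. Catching OSError covers socket timeouts as well as the exception hierarchies of urllib and requests.

```python
import time


def fetch_all(queries, fetch, delay_seconds=3.0, sleep=time.sleep):
    """Run each query through fetch, pausing between calls and recording failures."""
    results, failures = {}, {}
    for index, query in enumerate(queries):
        if index > 0:
            sleep(delay_seconds)  # pause between consecutive requests
        try:
            results[query] = fetch(query)
        except OSError as exc:
            # Keep the error instead of swallowing it, so the caller can report it.
            failures[query] = str(exc)
    return results, failures
```

Returning the failures alongside the results lets the caller print which queries did not complete, instead of reporting zero new papers with no explanation.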

Verification

After ingestion, verify:

New .md files appear in raw/papers/
meta/entity_counts.json has updated counts
No errors in the script output

Run a quick count (prints, for each paper file, the number of lines that start with "# "):

grep -c '^# ' ~/tvb-wiki/raw/papers/arxiv-*.md
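The first two checks can also be scripted. The following is a minimal sketch under stated assumptions: verify_ingestion is a hypothetical helper, and it assumes the counts file contains the three top-level sections shown in the example structure (software, concepts, authors).

```python
import json
from pathlib import Path

EXPECTED_SECTIONS = ("software", "concepts", "authors")


def verify_ingestion(wiki_root):
    """Return (number of arXiv paper files, list of problems found)."""
    root = Path(wiki_root)
    papers_dir = root / "raw" / "papers"
    papers = sorted(papers_dir.glob("arxiv-*.md")) if papers_dir.is_dir() else []
    problems = []
    if not papers:
        problems.append("no arxiv-*.md files in raw/papers/")
    counts_file = root / "meta" / "entity_counts.json"
    try:
        counts = json.loads(counts_file.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        problems.append(f"cannot read entity_counts.json: {exc}")
    else:
        for section in EXPECTED_SECTIONS:
            if section not in counts:
                problems.append(f"entity_counts.json has no '{section}' section")
    return len(papers), problems
```

An empty problem list together with a paper count higher than before the run is a reasonable sign that ingestion worked.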

Related Skills

arxiv – General arXiv search and download skill
llm‑wiki – Core wiki‑building skill
automated‑research‑wiki – Full automation pattern

References

arXiv API user manual: https://arxiv.org/help/api/user-manual
TVB wiki repo: https://github.com/maedoc/tvb-wiki

arXiv paper ingestion for the TVB research wiki: fetch new papers, extract entities/concepts, update raw/ and meta/.

documentation

Updated Apr 21, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners in report (status and percentage as shown in the report):

Trivy – Container and dependency vulnerability scanner – Clean – 95%
Semgrep – Static code analysis for vulnerabilities – Clean – 95%
mcp-scan (Snyk) – Model Context Protocol security validation – Clean – 95%
Snyk (dep) – Open source security scanning – Skipped – 50%
Socket.dev – Supply chain security analysis – Skipped – 50%
VirusTotal – Multi-engine malware detection – Skipped – 50%
CrowdStrike – Advanced threat intelligence – Skipped – 50%
OSV-Scanner – Open Source Vulnerability database check – Skipped – 50%
OWASP Dep-Check – Skipped – 50%

Last scanned: Apr 22, 2026, 3:04 AM (25.5s, 1 file scanned)

SKILL.md

name: tvb-wiki-ingestion
description: "arXiv paper ingestion for the TVB research wiki: fetch new papers, extract entities/concepts, update raw/ and meta/."
version: 1.0.0
author: Hermes Agent
license: MIT
tags: [wiki, ingestion, arxiv, research, entities]
category: research
related_skills: [arxiv, llm-wiki]
slash_commands: [/ingest-arxiv]
agents: [researcher]

Related Skills

maedoc/watch (research) – Set up a recurring research watch on a topic, company, paper area, or product surface. Use when the user asks to monitor a field, track new papers, watch for updates, or set up alerts on a research area. Updated Apr 21, 2026.
maedoc/tvb-wiki-static-site (development) – Build and deploy the TVB research wiki as a static site using MkDocs and GitHub Pages. Updated Apr 21, 2026.
maedoc/tvb-wiki-maintenance (development) – Full pipeline for maintaining the TVB research wiki: hourly arXiv ingestion, static site build, git commit, and GitHub Pages deployment. Updated Apr 21, 2026.
maedoc/source-comparison (tools) – Compare multiple sources on a topic and produce a grounded comparison matrix. Use when the user asks to compare papers, tools, approaches, frameworks, or claims across multiple sources. Updated Apr 21, 2026.

Install manually

# Clone the repo
git clone https://github.com/maedoc/tvb-wiki.git

# Copy into Claude Code skills folder (global)
cp -r tvb-wiki/skills/tvb-wiki-ingestion ~/.claude/skills/

Repository: maedoc/tvb-wiki

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT