Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

sahuno/igv-reports

Name: igv-reports
Author: sahuno

claude/skills/igv-reports/SKILL.md

npx skillsauth add sahuno/llm_configs igv-reports

igv-reports

This skill builds self-contained HTML genomic-region reports with igv-reports (create_report). Each report is a single browsable HTML file containing the igv.js viewer plus embedded data slices for every region. Viewing it requires no server, no internet connection, and no IGV installation.

The skill has three entry points:

- build: one-shot, sites BED + BAM(s) ± VCF → HTML.
- cohort: multi-sample driver from a samplesheet → per-sample HTMLs + index.
- prep-track: utility that converts plain-gzip GFF/GTF/BED.gz into a bgzip + tabix-indexed track that igv-reports can load.

When to use which entry point

| User request | Entry point |
|---|---|
| "Make an HTML for these 5 SV breakpoints in tumor.bam" | build |
| "Give me one HTML per patient for the cohort integration calls" | cohort |
| "create_report fails with 'not BGZF' on this gencode" | prep-track |

Defaults (locked in)

- Tracks always loaded, top-to-bottom in the viewer:
  1. CpG islands (BED, plain or bgzipped)
  2. Gencode full annotation (GFF3.gz, transcripts + exons + CDS + UTRs, NOT a gene-level-only file)
  3. RepeatMasker (BED.gz, bgzipped + tabix-indexed)

  Plus the user's BAM(s), VCF, and any extra tracks they pass.
- --flanking 300 bp on either side of each site (good for SV breakpoints and point variants alike). Override per call if needed.
- --standalone so the HTML is offline-viewable.
- Output filename includes the genome tag (e.g. cohort.hg38.html) to pass enforce-genome-tag.sh.
- Reference FASTA is resolved from databases_config.yaml: /data1/greenbab/users/ahunos/apps/llm_configs/claude/profiles/databases/databases_config.yaml. Supported genome IDs: hg38, mm10, mm39, t2t_CHM13v2_plusY, GRCh37.
- Per-genome default track availability is recorded in references/databases_config_paths.md. Read it before assembling tracks so the skill doesn't try to load a track that doesn't exist for the selected genome (e.g., mm39 has no rmsk in our database).
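
The "skip tracks that don't exist for this genome" rule can be sketched in a few lines of Python. This is only an illustration: the key names (`cpg_islands`, `gencode`, `rmsk`), the `resolve_default_tracks` helper, and the shape of the per-genome mapping are assumptions, not the actual layout of databases_config.yaml or the driver's code.

```python
from pathlib import Path

# Hypothetical keys; the real names and paths live in databases_config.yaml.
DEFAULT_TRACK_ORDER = ("cpg_islands", "gencode", "rmsk")


def resolve_default_tracks(genome_tracks, exists=lambda p: Path(p).exists()):
    """Return (tracks_to_load, skipped_keys) in the fixed top-to-bottom order.

    genome_tracks maps a track key to a file path for one genome.
    A key that is missing, or whose file does not exist, is skipped
    rather than passed on to create_report.
    """
    loaded, skipped = [], []
    for key in DEFAULT_TRACK_ORDER:
        path = genome_tracks.get(key)
        if path and exists(path):
            loaded.append(path)
        else:
            skipped.append(key)
    return loaded, skipped
```

Returning the skipped keys alongside the loaded paths makes it easy to record the omission in the run log instead of dropping a track silently.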

Sites BED format (critical)

igv-reports' BED parser reads fields by position and trips on a header row (ValueError: invalid literal for int() with base 10: 'start'). Always emit a plain, headerless 4-column BED with the columns chr, start, end, name in that order. The file itself must contain only data rows, for example:

chr2   25227855 25342590 DNMT3A_full_gene

Tab-separated. The name becomes the row label in the report's variant table — make it specific enough to identify the site after deduping.

The project's enforce-genome-tag.sh hook requires a genome tag in the BED filename: use sites.hg38.bed, not sites.bed.
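
As a rough sketch of the checks described above (no header row, four tab-separated columns, start < end), a validator could look like the following. The function name `validate_sites_bed` and the message wording are hypothetical. The driver's real validation may differ.

```python
def validate_sites_bed(lines):
    """Return a list of problems found in headerless 4-column BED lines."""
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 4:
            problems.append(
                f"line {lineno}: expected 4 tab-separated columns, got {len(fields)}"
            )
            continue
        _chrom, start, end, _name = fields
        try:
            start_i, end_i = int(start), int(end)
        except ValueError:
            # A header row such as "chr start end name" ends up here.
            problems.append(f"line {lineno}: start/end are not integers (header row?)")
            continue
        if start_i >= end_i:
            problems.append(f"line {lineno}: start must be < end")
    return problems
```

Running this before create_report turns the parser's ValueError into a message that names the offending line.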

Pitfalls (the skill should encode and/or detect these)

| Symptom | Root cause | Fix |
|---|---|---|
| ValueError: invalid literal for int() on first row | Header row in sites BED | Strip header, leaving a plain BED |
| UnicodeDecodeError: byte 0x8b reading a track | igv-reports reading bgzip as text | Filename must end .gff3.gz / .bed.gz AND be true bgzip (check with file <name> for "extra field") |
| tabix: not BGZF | Track was plain-gzipped, not bgzipped | Run prep-track entry point |
| tabix: out of order while indexing | GFF/GTF/BED records not pos-sorted within chr | prep-track does sort -k1,1 -k4,4n before bgzip |
| Annotation track empty in viewer | Tabix returns no rows in displayed window; often correct biology (e.g., CGI-distal site) | Confirm with tabix file region |
| Genome ID lookup fails with --genome hg38 | igv.js bundled IDs require internet at view + render time | Use --fasta /path/to/local.fa instead (always works offline) |

Full pitfalls + create_report flag reference in references/best_practices.md.
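
The plain-gzip versus bgzip distinction in the table can also be checked programmatically. The sketch below reads the gzip header and looks for the BGZF "BC" extra subfield, which is the same "extra field" that `file` reports. The helper name `is_bgzf` is hypothetical and is not part of the skill's scripts.

```python
import struct


def is_bgzf(path):
    """True if the file starts with a BGZF block (gzip header + 'BC' extra subfield)."""
    with open(path, "rb") as fh:
        header = fh.read(12)
        if len(header) < 12:
            return False
        # gzip magic bytes, deflate method, and the FEXTRA flag (0x04) must be present.
        if header[0] != 0x1F or header[1] != 0x8B or header[2] != 8:
            return False
        if not header[3] & 0x04:
            return False  # plain gzip normally has no extra field
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = fh.read(xlen)
    # Walk the extra subfields looking for SI1='B', SI2='C', SLEN=2.
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2 = extra[pos], extra[pos + 1]
        (slen,) = struct.unpack("<H", extra[pos + 2:pos + 4])
        if (si1, si2, slen) == (ord("B"), ord("C"), 2):
            return True
        pos += 4 + slen
    return False
```

A check like this only inspects the first block, so it tells you whether prep-track is needed, not whether the file is sorted or indexed.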

How to run — quick recipe

Activate the snakemake conda env first; create_report lives there:

source /home/ahunos/miniforge3/etc/profile.d/conda.sh
conda activate snakemake

Then call the bundled driver script:

python /data1/greenbab/users/ahunos/apps/llm_configs/claude/skills/igv-reports/scripts/build_igvreports.py \
    --sites results/run/inputs/sites.hg38.bed \
    --bam tumor.bam normal.bam \
    --vcf calls.vcf \
    --genome hg38 \
    --output results/run/reports/cohort.hg38.html

The driver:

1. Resolves the genome's CpG / gencode / rmsk paths from databases_config.yaml (skipping any that don't exist for the chosen genome).
2. Validates the sites BED is headerless and that all rows have start < end.
3. Calls create_report with --flanking 300 --standalone.
4. Writes a logs/ entry capturing the full command, the flanking value, the per-region embedded data sizes, and the resolved track list, as required by the project's analysis-script audit-trail expectations.

For multi-sample cohorts, use --samplesheet samplesheet.tsv instead of --bam/--vcf. Samplesheet format: sample, bam_tumor, bam_normal, vcf, sites_bed. The driver emits one HTML per sample plus a top-level index.html that lists all samples with links. Layout matches the ATLL viral-integration reference implementation:

results/<run>/
├── inputs/<sample>/sites.<genome>.bed
├── reports/<sample>.<genome>.html
├── reports/index.html
└── logs/run_<timestamp>.log
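
To make the samplesheet-to-layout mapping concrete, here is a minimal sketch that reads a tab-separated samplesheet and plans one report path per sample. It assumes the TSV carries a header row with exactly the column names listed above. The function name `plan_cohort_outputs` is hypothetical, and the real driver may parse the sheet differently.

```python
import csv
import io

REQUIRED_COLUMNS = ("sample", "bam_tumor", "bam_normal", "vcf", "sites_bed")


def plan_cohort_outputs(samplesheet_text, run_dir, genome):
    """Map each sample to its report path under <run_dir>/reports/."""
    reader = csv.DictReader(io.StringIO(samplesheet_text), delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"samplesheet is missing columns: {missing}")
    plan = {}
    for row in reader:
        sample = row["sample"]
        # Genome tag goes into the filename, as the defaults require.
        plan[sample] = f"{run_dir}/reports/{sample}.{genome}.html"
    return plan
```

The returned mapping is also enough to generate the links for the top-level index.html.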

prep-track — fixing a non-bgzip track

If a GFF3/GTF/BED.gz is plain-gzip rather than bgzip, igv-reports fails silently or with an obscure error. Convert in place with backup:

bash /data1/greenbab/users/ahunos/apps/llm_configs/claude/skills/igv-reports/scripts/prep_track.sh \
    /path/to/track.gff3.gz

The script:

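1. Backs up the original to <name>.bak.original_gzip.
2. Decompresses the file with gunzip -c.
3. Sorts by chr, then numeric position (sort -k1,1 -k4,4n; column 4 is the start coordinate in GFF/GTF, whereas a BED file keeps its start in column 2). Gencode delivers records interleaved by feature type at the same locus, and tabix requires position-sorted input.
4. Recompresses in place with bgzip.
5. Indexes the result with tabix -p, using the preset that matches the format (gff for GFF3 and GTF, bed for BED; tabix has no separate gtf preset).
6. Verifies that a sample tabix query returns rows.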

Run from the snakemake conda env (bgzip/tabix from htslib).
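
The sort step is the one that most often causes "out of order" failures, so here is a small Python sketch of the same ordering for GFF/GTF records. It is not the script's implementation: `sort_gff_lines` is a hypothetical helper, and keeping `#` header lines at the top is an assumption about what a sensible conversion would do.

```python
def sort_gff_lines(lines):
    """Sort GFF/GTF records by seqname, then by numeric start (column 4).

    Comment/header lines (starting with '#') are kept first, in their
    original order. Seqnames compare by code point, which matches
    `sort` under LC_ALL=C.
    """
    headers = [ln for ln in lines if ln.startswith("#")]
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    def sort_key(line):
        fields = line.split("\t")
        return fields[0], int(fields[3])

    records.sort(key=sort_key)
    return headers + records
```

Python's sort is stable, so records sharing a start position keep their original relative order.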

When generating an answer.md / run.sh for the user

The driver script (build_igvreports.py) deliberately abstracts the underlying create_report flags — it sets --standalone, --fasta, the --flanking 300 default, and the YAML-resolved annotation tracks internally so the user doesn't have to remember them. That abstraction is good for ergonomics but bad for auditability: a reviewer reading the answer.md later can't see what flags are actually being invoked without opening the driver source.

To keep both: when you produce a runnable command for the user, also include a code block titled "Equivalent direct create_report invocation" that shows the fully-expanded command with all flags and resolved track paths inline. The user should see the wrapper command they're going to run AND the underlying command it expands to. Example:

## Run

```bash
python build_igvreports.py --genome mm10 --sites peaks.mm10.bed \
    --bam ./data/ip.bam ./data/input.bam \
    --output reports/peaks_qc.mm10.html
```

### Equivalent direct create_report invocation

```bash
create_report peaks.mm10.bed \
    --fasta /data1/greenbab/database/mm10/mm10.fa \
    --flanking 300 --standalone \
    --tracks ./data/ip.bam ./data/input.bam \
        /data1/greenbab/database/mm10/mm10_CpGIslands.bed \
        /data1/greenbab/database/mm10/annotations/gencode.vM25.annotation.gtf.gz \
        /data1/greenbab/database/RepeatMaskerDB/.../rmsk_all_repeats_mm10.bed.gz \
    --title "ChIP-seq peak QC (mm10) - IP vs Input" \
    --output reports/peaks_qc.mm10.html
```

This costs you ~10 lines and gives the reviewer a full audit trail. For cohort runs, show the expanded form for ONE representative sample only — the others differ only in BAM/VCF paths.
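
One way to keep the wrapper and the expanded form from drifting apart is to generate the expanded command from the same resolved inputs. The sketch below is an assumption about how that could be done, not the driver's actual code. `expand_create_report` is a hypothetical helper, and the file names in the usage lines are placeholders. It uses only the create_report flags already shown in the example above.

```python
import shlex


def expand_create_report(sites, fasta, tracks, output, title=None, flanking=300):
    """Build the fully expanded create_report argv from resolved inputs."""
    argv = [
        "create_report", sites,
        "--fasta", fasta,
        "--flanking", str(flanking),
        "--standalone",
        "--tracks", *tracks,
    ]
    if title:
        argv += ["--title", title]
    argv += ["--output", output]
    return argv


# Usage: render the argv as a copy-pasteable shell command for answer.md.
cmd = expand_create_report(
    "peaks.mm10.bed",
    "mm10.fa",                      # placeholder path
    ["ip.bam", "input.bam"],        # placeholder tracks
    "reports/peaks_qc.mm10.html",
)
print(shlex.join(cmd))
```

Using shlex.join means paths or titles containing spaces are quoted correctly in the printed command.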

Output and workflow logging

Every run logs to logs/run_<YYYYMMDD_HHMMSS>.log next to the reports dir. The log captures:

- Resolved track paths (per genome, after databases_config.yaml lookup).
- The exact create_report command.
- The flanking value used (default 300 bp; this is the value that's baked into all the embedded data slices, so audit trails depend on it).
- Per-region embedded data sizes (extracted post-render so the user can see which regions inflated the HTML).
- Total HTML size.

This satisfies CLAUDE.md §"Logging and Audit Trail" — every run is reproducible from the log alone.
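
For illustration, a log writer following the run_<YYYYMMDD_HHMMSS>.log naming might look like the sketch below. The field labels (`command:`, `flanking_bp:`, `track:`) and the `write_run_log` helper are hypothetical. The driver's real log format is not shown in this document, and the per-region and total-size entries are left out here.

```python
from datetime import datetime
from pathlib import Path


def write_run_log(run_dir, command, tracks, flanking=300, now=None):
    """Write <run_dir>/logs/run_<YYYYMMDD_HHMMSS>.log and return its path."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    log_path = Path(run_dir) / "logs" / f"run_{stamp}.log"
    log_path.parent.mkdir(parents=True, exist_ok=True)
    lines = [f"command: {command}", f"flanking_bp: {flanking}"]
    lines += [f"track: {t}" for t in tracks]  # resolved track paths, in load order
    log_path.write_text("\n".join(lines) + "\n")
    return log_path
```

Passing `now` explicitly keeps the timestamp in the filename consistent with any timestamp recorded elsewhere in the same run.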

Track choice nuances

For gencode on hg38, the default points at gencode.v47.annotation.gff3.gz (full annotation, bgzip + tabix). This gives transcript models with exons / CDS / UTRs. The gene-level-only companion (gencode.v47.genes.annotation.sorted.gff3.gz) renders only solid gene boxes and is fine for high-zoom views, but the full annotation is the right default for read-level inspection at integration / fusion / SV junctions.

For mouse genomes, databases_config.yaml ships .gtf.gz paths instead. GTFs work in igv-reports if bgzip + tabix-indexed; prep-track converts plain-gzip GTFs the same way it does GFF3s.

For T2T-CHM13, only the FASTA + GTF + CGI are indexed in our DB; rmsk is absent and is auto-skipped by the driver. The variant table will load without rmsk; flag this in the run log.

Common-case examples

The examples/ directory has runnable templates:

- single_sample.sh: one BAM + one VCF + a sites BED → one HTML.
- cohort_samplesheet.sh: TSV-driven multi-sample run.
- prep_track_demo.sh: convert a plain-gzip gencode to bgzip+tabix.

These are reference implementations; copy and edit them for new runs rather than starting from scratch.

Build self-contained, offline HTML genomic-region reports with igv-reports (create_report). Each HTML bundles igv.js viewers per region with embedded BAM/VCF data slices and default tracks (CpG islands, gencode, RepeatMasker); a reviewer clicks the variant table to inspect read-level evidence with no internet, no server, no IGV install. USE this skill whenever the user wants an HTML, clickable, or browseable viewer of genomic data — phrases like "HTML IGV report", "offline IGV", "self-contained HTML", "clickable viewer", "create_report", "igv-reports", "email this viewer", or any browseable HTML of reads at variants, fusion breakpoints, SV junctions, viral integrations, ChIP peaks, or ROIs. Trigger even when the user doesn't say "igv-reports" — giveaway is HTML/clickable/offline plus genomic regions. Also fire on /igv-reports. DO NOT use for static PNG/PDF/SVG IGV screenshots — use the igv-screenshots skill. Supports hg38, mm10, mm39, T2T. Defaults: --flanking 300, --standalone, genome-tagged output.

tools

Updated May 7, 2026

npx skillsauth add sahuno/llm_configs igv-reports

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Status | Scanner | Description | Percentage shown |
|---|---|---|---|
| Clean | Trivy | Container and dependency vulnerability scanner | 95% |
| Clean | Semgrep | Static code analysis for vulnerabilities | 95% |
| Clean | mcp-scan (Snyk) | Model Context Protocol security validation | 95% |
| Skipped | Snyk (dep) | Open source security scanning | 50% |
| Skipped | Socket.dev | Supply chain security analysis | 50% |
| Skipped | VirusTotal | Multi-engine malware detection | 50% |
| Skipped | CrowdStrike | Advanced threat intelligence | 50% |
| Skipped | OSV-Scanner | Open Source Vulnerability database check | 50% |
| Skipped | OWASP Dep-Check | | 50% |

Last scanned: May 7, 2026, 5:33 AM · 160.9s · 10 files scanned

Related Skills

sahuno/scatter-gather

development

Verified · Trusted · Community

Decide whether and how to scatter genomics workloads across chromosomes or region tiles, then gather the per-shard outputs back together correctly. Use proactively whenever the user mentions parallelizing per-chromosome, sharding by chrom, tiling the genome, splitting a BAM/VCF/BED by region, merging per-chrom outputs, or has a workflow with obvious per-chromosome parallelism (variant calling, methylation pileup/DMR, coverage, liftover, peak calling, SV calling). Also triggers on /scatter-gather, "scatter X across chromosomes", "shard this", "chunked variant calling", "merge per-chrom VCFs", "gather these bedmethyl files", "concat these bigwigs", or any per-region parallelism question. **Trigger even when the user is also using Snakemake or Nextflow** — those skills handle DAG plumbing while this one defines *what* to scatter, *whether* it's even safe to scatter (some computations like DSS DMLtest pool globally and break under naive sharding), and *how* to gather each output format without silent corruption. Especially trigger on questions about merging per-chromosome BAM / VCF / BED / bedMethyl / bigwig outputs, or whether a scatter-gather is equivalent to running on the whole genome.

SKILL.md · Updated May 7, 2026

sahuno/chimeric-read-validation

development

Verified · Trusted · Community

Verify that structural-variant / breakpoint calls are actually real by checking the chimeric reads that support them. Use whenever the user has caller output (Severus, Manta, Sniffles2, Delly, GRIDSS, MELT, Arriba, SvABA) and wants to validate / audit / QC / double-check their calls — viral integrations (HTLV-1, HBV, HPV, EBV), gene fusions (BCR-ABL, IGH translocations), mobile element insertions (L1, Alu, SVA), translocations. Trigger on phrasings like "is this integration real?", "should I trust this fusion call?", "are these false positives?", "are these PASS calls actually supported by reads?", "QC my SV calls", or any per-call chimeric-read / contamination / bimodality / T-vs-N read overlap question. Also fires on BAM @PG -Y / SA-tag questions on chimeric BAMs, and on /chimeric-read-validation. Output is a per-call TSV with pass / needs_review / fail verdicts. Do not use for calling SVs (use the caller), IGV screenshots (use igv-reports), or RNA-level fusion FDR (use Arriba).

SKILL.md · Updated May 7, 2026

sahuno/runtime-resource-study

tools

Verified · Trusted · Community

Run a stage-gated runtime/resource optimization study for any bioinformatics tool or command-line program on a SLURM HPC cluster. Walks through preflight, OFAT factor scan, 2^k confirmation factorial, build-mode + alternative-implementation comparison, input-size scan, out-of-sample validation, and produces a fitted predictive resource model (wall_s and peak_rss as functions of input size), a machine-readable model.yaml with caveats, a full REPORT.md, and a one-page exec summary PDF. Trigger PROACTIVELY whenever the user asks to "benchmark", "optimize", "tune", "characterize runtime/memory", "find best config", "build a resource model", "how does X scale", or "what should I put in my Snakemake resources directive for tool Y" — for any compute-bound bioinformatics step (sort, dedup, alignment, variant calling, methylation calling, basecalling, indexing, pileup, liftover). Also triggers on /runtime-resource-study or /benchmark-tool. Skip only for one-off quick timing where a single number suffices and no model is needed.

SKILL.md · Updated Apr 30, 2026

sahuno/nfcore-module

tools

Verified · Trusted · Community

End-to-end builder for new nf-core modules. Scaffolds all required files, runs lint and nf-test in a loop until both pass, and produces PR-ready artifacts (description, Slack draft, checklist). Use this skill proactively whenever the user wants to: create a new nf-core module, add a tool to nf-core/modules, write a DORADO_BASECALLER or MODKIT_LOCALIZE style process, wrap a bioinformatics tool in Nextflow for nf-core, or asks "how do I submit a module to nf-core". Also trigger for: adding GPU support to a module, wrapping an R or Python script as an nf-core process, handling licensed/ non-bioconda tools in nf-core, fixing nf-core lint failures on a new module. Do NOT trigger for: editing existing pipelines, writing Snakemake rules, or debugging non-module Nextflow code.

SKILL.md · Updated Apr 30, 2026

Install manually

# Clone the repo
git clone https://github.com/sahuno/llm_configs.git

# Copy into Claude Code skills folder (global)
cp -r llm_configs/claude/skills/igv-reports ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

sahuno/llm_configs

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
