sahuno/cohort-overview

Name: cohort-overview
Author: sahuno

claude/skills/cohort-overview/SKILL.md

npx skillsauth add sahuno/llm_configs cohort-overview

Cohort Overview Heatmap

Recreates a SPECTRUM-style cohort data-availability heatmap showing, for each patient, which assays were run on which samples, alongside patient-level genomic annotations and per-sample site labels.

What you'll produce: a results/{png,pdf,svg}/ directory with a single heatmap. Rows are grouped by assay; within each assay block, samples are stacked by a consistent per-patient slot index (so sample S02 of patient 017 always sits in the same row across every assay block). Columns are patients, ordered by the user's first patient-level annotation (typically a mutational signature).

When to use this skill

Trigger whenever a user with a per-sample table wants a "what have we profiled" overview. Common phrasings: "make a cohort overview", "plot sample availability across assays", "which patients have DLP + WGS + scRNA-seq", "recreate the SPECTRUM data heatmap".

Skip this skill for single-assay plots, per-gene heatmaps, or anything that isn't a patient × assay availability matrix.

The two scripts

This skill ships two scripts. Do not rewrite them from scratch — copy them into the user's project (usually scripts/) and adapt only if the user asks:

scripts/01_generate_mock_cohort.R — generates data/cohort_wide.tsv with 85 patients / ~290 samples / 7 assays. Useful when the user wants a demo run, or to sanity-check their own TSV against the expected schema.
scripts/02_plot_cohort_heatmap.R — the main plotter. Uses optparse, ComplexHeatmap, and writes PNG/PDF/SVG. Fully configurable via CLI flags.

Both live at <skill-dir>/scripts/ (the skill directory Claude Code loaded this SKILL.md from). Copy them with cp; don't regenerate them.

Input TSV schema (wide format, one row per sample)

| Column group | Columns | Notes |
|--------------------|---------------------------------------------------------|-----------------------------------------------------------------|
| sample keys | patient_id, sample_id | sample_id unique per row; patient_id repeats |
| sample attribute | site | Adnexa, Omentum, Blood, Ascites, … |
| patient-level | signature, BRCA1, BRCA2, CCNE1 | must be constant within a patient_id; the script validates this |
| assay availability | DLP, ONT, WGS, mpIF, scATAC-seq, scRNA-seq, scRNAseqVDJ | cell values: Yes, No, or NA |

NA means "not attempted / unknown" — semantically distinct from No ("attempted, failed or negative"). The plot encodes them as different colours by default.

A 3-line example lives at examples/cohort_wide.tsv. Read it only if the user is debugging their own file format.
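The three-state cell rule can be checked before plotting. The sketch below is not part of the skill: check_assay_values is a hypothetical helper that assumes a tab-separated file with a header row, and it reports any assay cell whose value is not exactly Yes, No, or NA.

```shell
# Hypothetical helper (not shipped with the skill): list assay cells
# whose value is anything other than Yes, No, or NA.
# $1: path to the wide TSV, $2: comma-separated assay column names.
check_assay_values() {
  local tsv="$1" assays="$2"
  awk -F'\t' -v assays="$assays" '
    NR == 1 {
      n = split(assays, want, ",")
      for (i = 1; i <= NF; i++) col[$i] = i   # column name -> position
      next
    }
    {
      for (j = 1; j <= n; j++) {
        c = col[want[j]]
        if (c == "") continue                 # absent column: skip here
        v = $c
        if (v != "Yes" && v != "No" && v != "NA") {
          printf "line %d\t%s\t%s\n", NR, want[j], v
          bad = 1
        }
      }
    }
    END { exit (bad ? 1 : 0) }
  ' "$tsv"
}
```

For example, check_assay_values data/cohort_wide.tsv "DLP,ONT,WGS" prints one line per offending cell (file line number, column, value) and returns a non-zero status if any were found.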

Plot semantics

Top annotation: one block per patient, ordered by the first --patient-cols column (default: signature).
Row split: one block per assay, in --assays order.
Within each assay block: samples are stacked by a per-patient slot index (ordered by site, then sample_id). Block height = the largest number of samples any patient has.
Cell colours: Yes → black, No → grey, NA → white (all three overridable via CLI). The default palette is chosen so that ink = attempt: dark cells = success, grey cells = attempted-but-failed, blank cells = never attempted. This lets you see coverage gaps at a glance.
Left strip: modal Site across patients for that row — tells you "most samples in slot 1 are from the primary tumour site".
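The per-patient slot index can be illustrated outside R. This is only a sketch of the idea, not the plotter's implementation: slot_index is a hypothetical helper that assumes columns 1 to 3 are patient_id, sample_id, and site, and it uses a plain byte-wise sort, which may differ from whatever ordering the R script applies to site.

```shell
# Hypothetical helper: assign each sample a slot number within its patient,
# ordering by site and then sample_id.
# Assumes a header row and that columns 1-3 are patient_id, sample_id, site.
slot_index() {
  local tsv="$1"
  tail -n +2 "$tsv" \
    | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k3,3 -k2,2 \
    | awk -F'\t' -v OFS='\t' '
        $1 != prev { slot = 0; prev = $1 }   # new patient: restart numbering
        { slot++; print $1, $2, $3, slot }
      '
}
```

The block height described above would then be the largest slot number in the output, for example: slot_index data/cohort_wide.tsv | awk -F'\t' '$4 > m { m = $4 } END { print m + 0 }'.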

Workflow

1. Inspect the user's data

If the user has their own TSV: read the first ~5 lines and confirm all required columns are present. If assay columns have values other than Yes/No/NA, ask what they mean before proceeding.
If the user wants a demo: run scripts/01_generate_mock_cohort.R from their project directory to produce data/cohort_wide.tsv.
If patient-level columns vary within a patient_id, the plotter will error. Warn the user and ask whether to take the first value, the mode, or fix upstream.
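The inspection step can be done from the shell before running any R. Both functions below are hypothetical helpers, not part of the skill, and assume a tab-separated file with a header row and a patient_id column: missing_columns lists required column names absent from the header, and inconsistent_patients lists patient_ids for which a patient-level column takes more than one value.

```shell
# Hypothetical helper: print each required column that is absent from the header.
# $1: TSV path; remaining arguments: required column names.
missing_columns() {
  local tsv="$1" header name
  shift
  header=$(head -n 1 "$tsv")
  for name in "$@"; do
    printf '%s\n' "$header" | tr '\t' '\n' | grep -qxF -- "$name" \
      || printf '%s\n' "$name"
  done
}

# Hypothetical helper: print patient_ids whose value in a patient-level
# column is not constant across that patient's rows.
# $1: TSV path, $2: patient-level column name (e.g. signature).
inconsistent_patients() {
  local tsv="$1" column="$2"
  awk -F'\t' -v want="$column" '
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "patient_id") p = i
        if ($i == want) c = i
      }
      if (!p || !c) { print "missing column" > "/dev/stderr"; exit 2 }
      next
    }
    {
      if (!($p in seen)) { seen[$p] = $c }       # first value for this patient
      else if (seen[$p] != $c) { bad[$p] = 1 }   # later value disagrees
    }
    END { for (id in bad) print id }
  ' "$tsv" | sort
}
```

An empty result from both functions suggests the file is ready for the plotter; any patient_id printed by inconsistent_patients is one to raise with the user.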

2. Copy scripts into the user's project

mkdir -p <project>/scripts <project>/data <project>/results
cp <skill-dir>/scripts/01_generate_mock_cohort.R <project>/scripts/
cp <skill-dir>/scripts/02_plot_cohort_heatmap.R <project>/scripts/

3. Run the plotter

Rscript scripts/02_plot_cohort_heatmap.R --input data/cohort_wide.tsv
Rscript scripts/02_plot_cohort_heatmap.R --help     # full option list

4. Check the output

Figures land under results/{png,pdf,svg}/. Default stem is cohort_overview_heatmap. Open the PNG and verify: patients ordered by signature, all listed assays present as row blocks, cell colours match --yes-color / --no-color / --na-color.

A reference PNG from the mock data is at examples/cohort_overview_heatmap_reference.png if you want to show the user what "correct" looks like.
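Before opening the figure, it may help to confirm that all three files were written. The function below is a hypothetical sketch that relies only on the default output root and file stem stated above; if a format was skipped with --no-png, --no-pdf, or --no-svg, the corresponding "missing" line is expected.

```shell
# Hypothetical helper: report which of the three expected figures exist
# and are non-empty. Returns 1 if any is missing.
# $1: output root (default: results), $2: file stem (default: cohort_overview_heatmap).
check_outputs() {
  local outdir="${1:-results}" stem="${2:-cohort_overview_heatmap}" missing=0 ext f
  for ext in png pdf svg; do
    f="$outdir/$ext/$stem.$ext"
    if [ -s "$f" ]; then
      echo "ok       $f"
    else
      echo "missing  $f"
      missing=1
    fi
  done
  return "$missing"
}
```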

CLI options (script `02_plot_cohort_heatmap.R`)

| Flag | Default | Purpose |
|--------------------------------|--------------------------------------------|------------------------------------------------------|
| -i, --input | required | path to wide TSV |
| -o, --outdir | results | output root (creates pdf/, png/, svg/) |
| -n, --name | cohort_overview_heatmap | file stem |
| -W, --width | 14 | figure width (inches) |
| -H, --height | 8 | figure height (inches) |
| --assays | 7-assay default | comma-separated assay columns, in plot order |
| --patient-cols | signature,BRCA1,BRCA2,CCNE1 | patient-level annotation columns; 1st orders columns |
| --sig-order | FBI,HRD,HRD-Del,HRD-Dup,TD,Undetermined,NA | factor order for the 1st patient-level col |
| --patient-id-col | patient_id | patient-id column name |
| --sample-id-col | sample_id | sample-id column name |
| --site-col | site | site column name |
| --yes-color | #000000 | cell colour for Yes (attempted, succeeded) |
| --no-color | #BDBDBD | cell colour for No (attempted, failed/negative) |
| --na-color | #FFFFFF | cell colour for NA (not attempted / unknown) |
| --na-as-no | off | render NA as No (collapses legend to Yes/No) |
| --no-png / --no-pdf / --no-svg | off | skip a format |

Common tailoring

Subset of assays: --assays "DLP,WGS,ONT" (order matters — top to bottom).
Custom patient annotations: --patient-cols "signature,TP53,RB1" — the first column still determines patient ordering. Unknown columns get an auto palette (uses RColorBrewer if available).
Two-state cells: --na-as-no collapses NA into No and shortens the legend to Yes/No.
Wider canvas for big cohorts: --width 20 --height 10 for 150+ patients.
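The tailoring flags above can be combined in one call. The wrapper below is a hypothetical convenience, not part of the skill: it uses only flags listed in the options table, assumes the script sits at scripts/02_plot_cohort_heatmap.R, and prints the command instead of running it when DRY_RUN is set.

```shell
# Hypothetical wrapper: plot a subset of assays on a wide canvas with
# two-state cells. Set DRY_RUN=1 to print the command without running it.
# $1: input TSV, $2: comma-separated assays (plot order), $3: output file stem.
plot_subset() {
  set -- Rscript scripts/02_plot_cohort_heatmap.R \
    --input "$1" --assays "$2" --name "$3" \
    --na-as-no --width 20 --height 10
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Giving each variant its own --name keeps it from overwriting the default cohort_overview_heatmap files in results/.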

Required R packages

Required: optparse, dplyr, tidyr, readr, ComplexHeatmap, circlize
Optional: svglite (true SVG), showtext + sysfonts (Arial), RColorBrewer (nicer palettes for custom --patient-cols)

If packages are missing, install with:

install.packages(c("optparse","dplyr","tidyr","readr","circlize","svglite","showtext","sysfonts","RColorBrewer"))
# ComplexHeatmap via Bioconductor:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("ComplexHeatmap")

Expected final layout

<project>/
├── data/cohort_wide.tsv
├── scripts/
│   ├── 01_generate_mock_cohort.R
│   └── 02_plot_cohort_heatmap.R
└── results/
    ├── png/cohort_overview_heatmap.png
    ├── pdf/cohort_overview_heatmap.pdf
    └── svg/cohort_overview_heatmap.svg

Troubleshooting

"Arial font not found in PostScript font database" — harmless; the script falls back to sans. Install showtext + sysfonts for true Arial.
SVG export fails: install svglite. The base svg() device is cairo-based, and on macOS cairo support usually depends on XQuartz (X11), which is often not installed; svglite is self-contained.
Patient-level column not constant within patient_id — the plotter errors with the offending patient_id. Fix upstream or collapse to the mode.
No rows render for an assay — all values for that column are NA and --na-as-no is off. Either drop the assay from --assays or pass --na-as-no.
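The all-NA case can be spotted before plotting. The function below is a hypothetical sketch, not part of the skill; it assumes a tab-separated file with a header row and counts only the literal string NA as missing (empty cells are treated as values here, which may differ from how the R script reads the file).

```shell
# Hypothetical helper: print assay columns in which every cell is NA.
# $1: path to the wide TSV, $2: comma-separated assay column names.
all_na_assays() {
  local tsv="$1" assays="$2"
  awk -F'\t' -v assays="$assays" '
    NR == 1 {
      n = split(assays, want, ",")
      for (i = 1; i <= NF; i++) col[$i] = i   # column name -> position
      next
    }
    {
      for (j = 1; j <= n; j++) {
        name = want[j]
        if ((name in col) && $(col[name]) != "NA") seen[name] = 1
      }
    }
    END {
      for (j = 1; j <= n; j++)
        if ((want[j] in col) && !(want[j] in seen)) print want[j]
    }
  ' "$tsv"
}
```

Any column it prints is a candidate to drop from --assays or to render with --na-as-no.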

Build a per-patient × per-assay sample-availability heatmap (SPECTRUM-style cohort overview) from a wide TSV. Patient-level annotations (signature, BRCA1/2, CCNE1) on top, site strip on left, three-state cells (Yes/No/NA) with customizable colours and optional NA→No collapse. Use proactively when the user asks for a cohort overview, sample availability heatmap, assay coverage plot, data availability figure, or SPECTRUM-style heatmap; has a TSV with one row per sample and wants to visualize which assays (DLP, ONT, WGS, mpIF, scRNA/scATAC-seq, etc.) were run per patient; or types `/cohort-overview`. Keywords: "cohort overview", "sample availability", "assay matrix", "per-patient heatmap", "SPECTRUM heatmap", "coverage overview", "what samples do we have". Uses ComplexHeatmap: row-split by assay, per-patient sample slots, patient annotations on top, site strip on left.

Category: development

Updated Apr 15, 2026

Install globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add sahuno/llm_configs cohort-overview

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Purpose | Status | Listed % |
|-----------------|------------------------------------------------|---------|----------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 15, 2026, 7:12 AM (6.1s, 5 files scanned)

Related Skills

sahuno/scatter-gather

development

Decide whether and how to scatter genomics workloads across chromosomes or region tiles, then gather the per-shard outputs back together correctly. Use proactively whenever the user mentions parallelizing per-chromosome, sharding by chrom, tiling the genome, splitting a BAM/VCF/BED by region, merging per-chrom outputs, or has a workflow with obvious per-chromosome parallelism (variant calling, methylation pileup/DMR, coverage, liftover, peak calling, SV calling). Also triggers on /scatter-gather, "scatter X across chromosomes", "shard this", "chunked variant calling", "merge per-chrom VCFs", "gather these bedmethyl files", "concat these bigwigs", or any per-region parallelism question. **Trigger even when the user is also using Snakemake or Nextflow** — those skills handle DAG plumbing while this one defines *what* to scatter, *whether* it's even safe to scatter (some computations like DSS DMLtest pool globally and break under naive sharding), and *how* to gather each output format without silent corruption. Especially trigger on questions about merging per-chromosome BAM / VCF / BED / bedMethyl / bigwig outputs, or whether a scatter-gather is equivalent to running on the whole genome.

SKILL.md, updated May 7, 2026

sahuno/igv-reports

tools

Build self-contained, offline HTML genomic-region reports with igv-reports (create_report). Each HTML bundles igv.js viewers per region with embedded BAM/VCF data slices and default tracks (CpG islands, gencode, RepeatMasker); a reviewer clicks the variant table to inspect read-level evidence with no internet, no server, no IGV install. USE this skill whenever the user wants an HTML, clickable, or browseable viewer of genomic data — phrases like "HTML IGV report", "offline IGV", "self-contained HTML", "clickable viewer", "create_report", "igv-reports", "email this viewer", or any browseable HTML of reads at variants, fusion breakpoints, SV junctions, viral integrations, ChIP peaks, or ROIs. Trigger even when the user doesn't say "igv-reports" — giveaway is HTML/clickable/offline plus genomic regions. Also fire on /igv-reports. DO NOT use for static PNG/PDF/SVG IGV screenshots — use the igv-screenshots skill. Supports hg38, mm10, mm39, T2T. Defaults: --flanking 300, --standalone, genome-tagged output.

SKILL.md, updated May 7, 2026

sahuno/chimeric-read-validation

development

Verify that structural-variant / breakpoint calls are actually real by checking the chimeric reads that support them. Use whenever the user has caller output (Severus, Manta, Sniffles2, Delly, GRIDSS, MELT, Arriba, SvABA) and wants to validate / audit / QC / double-check their calls — viral integrations (HTLV-1, HBV, HPV, EBV), gene fusions (BCR-ABL, IGH translocations), mobile element insertions (L1, Alu, SVA), translocations. Trigger on phrasings like "is this integration real?", "should I trust this fusion call?", "are these false positives?", "are these PASS calls actually supported by reads?", "QC my SV calls", or any per-call chimeric-read / contamination / bimodality / T-vs-N read overlap question. Also fires on BAM @PG -Y / SA-tag questions on chimeric BAMs, and on /chimeric-read-validation. Output is a per-call TSV with pass / needs_review / fail verdicts. Do not use for calling SVs (use the caller), IGV screenshots (use igv-reports), or RNA-level fusion FDR (use Arriba).

SKILL.md, updated May 7, 2026

sahuno/runtime-resource-study

tools

Run a stage-gated runtime/resource optimization study for any bioinformatics tool or command-line program on a SLURM HPC cluster. Walks through preflight, OFAT factor scan, 2^k confirmation factorial, build-mode + alternative-implementation comparison, input-size scan, out-of-sample validation, and produces a fitted predictive resource model (wall_s and peak_rss as functions of input size), a machine-readable model.yaml with caveats, a full REPORT.md, and a one-page exec summary PDF. Trigger PROACTIVELY whenever the user asks to "benchmark", "optimize", "tune", "characterize runtime/memory", "find best config", "build a resource model", "how does X scale", or "what should I put in my Snakemake resources directive for tool Y" — for any compute-bound bioinformatics step (sort, dedup, alignment, variant calling, methylation calling, basecalling, indexing, pileup, liftover). Also triggers on /runtime-resource-study or /benchmark-tool. Skip only for one-off quick timing where a single number suffices and no model is needed.

SKILL.md, updated Apr 30, 2026

Install manually

# Clone the repo
git clone https://github.com/sahuno/llm_configs.git

# Copy into Claude Code skills folder (global)
cp -r llm_configs/claude/skills/cohort-overview ~/.claude/skills/

Repository: sahuno/llm_configs

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT
