Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

GPTomics/bio-copy-number-cnv-annotation

Name: bio-copy-number-cnv-annotation
Author: GPTomics

copy-number/cnv-annotation/SKILL.md

npx skillsauth add GPTomics/bioSkills bio-copy-number-cnv-annotation

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Version Compatibility

Reference examples tested with: bedtools 2.31+, AnnotSV 3.4+, Python 3.10+ with pybedtools 0.9+, pandas 2.2+, pysam 0.22+; R 4.3+ with clusterProfiler 4.10+.

Before using code patterns, verify installed versions match. If versions differ:

CLI: bedtools --version, AnnotSV --version
Python: pip show pybedtools pandas pysam
R: packageVersion('clusterProfiler')

If code throws an error, introspect the installed package and adapt the example. AnnotSV output column names change between major versions — verify against the installed version.

CNV Annotation

"Annotate my CNV calls with the genes they affect" -> Overlap CNV segments with gene models, dosage-sensitivity maps, and clinical databases. The hard part is not the intersection — it is deciding which genes matter. A focal amplification overlapping 30 genes usually has one driver (the peak gene); a deletion's consequence depends on whether each gene is dosage-sensitive and whether the whole gene or only part is removed.

CLI: bedtools intersect -a cnvs.bed -b genes.bed -wa -wb; AnnotSV for full annotation
Python: pybedtools for interval logic; pysam for VCF database queries

Annotation Strategy — Pick the Database for the Question

| Question | Resource | What it answers | |----------|----------|-----------------| | Which genes does this CNV span? | RefSeq/GENCODE gene BED | Raw overlap (not yet consequence) | | Is loss of this gene damaging? | ClinGen haploinsufficiency (HI) score | Dosage sensitivity to deletion | | Is gain of this gene damaging? | ClinGen triplosensitivity (TS) score | Dosage sensitivity to duplication | | Is this CNV common in the population? | gnomAD-SV, DGV, 1000G CNV | Benign-frequency filtering | | Is this a known recurrent disorder locus? | ClinGen dosage regions, DECIPHER | Genomic-disorder context | | Is this a cancer driver? | COSMIC Cancer Gene Census, OncoKB | Oncogene vs tumor-suppressor role | | Is there pathogenic small-variant content? | ClinVar | Coincident SNV/indel pathogenicity | | One-shot comprehensive annotation + ranking | AnnotSV | Aggregates most of the above |

For constitutional CNV classification (assigning pathogenic/VUS/benign), the annotated output feeds the ACMG/ClinGen points framework — see germline-cnv-interpretation. For cohort-level recurrence and driver-peak identification, see recurrent-cnv.

The Core Distinction: Overlap Is Not Consequence

A CNV overlapping a gene does not necessarily change that gene's dosage in a way that matters. Three refinements separate annotation from interpretation:

Whole-gene vs partial overlap. A deletion spanning an entire gene removes one copy (clean haploinsufficiency test). A deletion removing only the last two exons creates a truncated allele — a different, often more damaging, consequence. Always record the fraction of each gene covered and whether coding exons or only introns/UTRs are hit.
Dosage sensitivity. Most genes tolerate single-copy loss. ClinGen HI/TS scores (3 = sufficient evidence for dosage sensitivity, 0 = no evidence, 30 = gene associated with an autosomal-recessive phenotype, 40 = dosage sensitivity unlikely) indicate which genes' loss/gain is actually consequential.
Driver vs passenger in focal events. A focal amplification carries many genes; the driver is the one under selection, typically at the recurrence peak across a cohort (GISTIC) and a known oncogene. Annotating all 30 genes as "amplified" overstates.

Gene Overlap with bedtools

Goal: Find genes overlapping each CNV segment, recording overlap extent.

Approach: Convert segments to BED, intersect with a gene model, keep both feature sets (-wo reports the overlap length) so partial vs whole-gene overlap is recoverable.

# Segments to BED (CNVkit .cns example; columns chrom/start/end/log2)
awk 'NR>1 {print $1"\t"$2"\t"$3"\t"$5}' sample.cns > sample.cnv.bed

# Intersect; -wo appends the number of overlapping bases
bedtools intersect -a sample.cnv.bed -b gencode.genes.bed -wo > cnv_gene_overlap.txt

Comprehensive Annotation with AnnotSV

Goal: Annotate CNVs against genes, dosage maps, population frequency, and clinical databases in one pass, with a built-in pathogenicity ranking.

Approach: Export CNVs to VCF or BED and run AnnotSV; it returns a "full" line per gene plus a "split" summary, with an ACMG-aligned rank (1-5).

AnnotSV \
    -SVinputFile sample.cnv.vcf \
    -genomeBuild GRCh38 \
    -annotationMode both \
    -outputFile sample_annotated.tsv

# Output includes: overlapped genes, ClinGen HI/TS, gnomAD-SV/DGV frequency, OMIM,
# ClinVar, DECIPHER, and an ACMG-class rank per SV.

AnnotSV's rank is a useful triage signal, not a final classification — confirm against the ClinGen points framework for clinical reporting.

Dosage-Sensitivity and Driver Annotation

Goal: Tag each affected gene with its dosage sensitivity and, for tumors, its driver role, so passengers can be separated from drivers.

Approach: Join the gene-overlap table to the ClinGen dosage map (HI/TS scores) and to the COSMIC Cancer Gene Census; flag CNVs whose direction matches a known mechanism (oncogene amplified, tumor suppressor deleted).

import pandas as pd

def annotate_dosage_and_drivers(overlap_tsv, clingen_dosage, cgc_file):
    '''Tag overlapped genes with ClinGen HI/TS and COSMIC driver role.'''
    cols = ['cnv_chrom', 'cnv_start', 'cnv_end', 'log2',
            'gene_chrom', 'gene_start', 'gene_end', 'gene', 'overlap_bp']
    df = pd.read_csv(overlap_tsv, sep='\t', names=cols)
    df['gene_len'] = df['gene_end'] - df['gene_start']
    df['gene_frac_covered'] = (df['overlap_bp'] / df['gene_len']).clip(upper=1.0)
    df['whole_gene'] = df['gene_frac_covered'] >= 0.99

    dosage = pd.read_csv(clingen_dosage, sep='\t')  # gene, HI_score, TS_score
    df = df.merge(dosage, on='gene', how='left')

    cgc = pd.read_csv(cgc_file, sep='\t')
    role = dict(zip(cgc['Gene Symbol'], cgc['Role in Cancer']))
    df['driver_role'] = df['gene'].map(role)

    # Direction-consistent driver hits: oncogene amplified or TSG deleted.
    df['driver_hit'] = (
        ((df['log2'] > 0.3) & df['driver_role'].fillna('').str.contains('oncogene')) |
        ((df['log2'] < -0.3) & df['driver_role'].fillna('').str.contains('TSG')))
    return df

Population-Frequency Filtering

Goal: Remove common, presumed-benign CNVs before clinical interpretation.

Approach: Reciprocal-overlap match each CNV against a population SV catalog (gnomAD-SV, DGV); a CNV with high reciprocal overlap to a common population CNV of the same type is likely benign.

# 50% reciprocal overlap (-f 0.5 -r): same-type, similar-extent population match.
# Reciprocal overlap, not one-sided, prevents a tiny CNV inside a huge population CNV
# (or vice versa) from being wrongly matched.
bedtools intersect -a sample.cnv.bed -b gnomad_sv.bed -f 0.5 -r -wa -wb \
    > cnv_population_match.txt

Pathway Enrichment of Affected Genes

Goal: Test whether genes in amplified (or deleted) regions are enriched for pathways.

Approach: Extract genes by CNV direction, map to Entrez IDs, run GO/KEGG enrichment. Caveat: CNVs are large and gene-dense, so enrichment is biased toward whatever pathways cluster in CNV-prone genomic regions — interpret as hypothesis-generating.

library(clusterProfiler)
library(org.Hs.eg.db)

amp_genes <- unique(cnv_annot$gene[cnv_annot$log2 > 0.3])
entrez <- na.omit(mapIds(org.Hs.eg.db, keys = amp_genes,
                         keytype = 'SYMBOL', column = 'ENTREZID'))
go_bp <- enrichGO(gene = entrez, OrgDb = org.Hs.eg.db, ont = 'BP',
                  pAdjustMethod = 'BH', qvalueCutoff = 0.05)

Failure Modes

Genome-build mismatch between CNVs and annotation

Trigger: CNV coordinates on GRCh37 intersected with a GRCh38 gene model (or vice versa).

Mechanism: Coordinates silently shift; the intersection succeeds and returns wrong genes.

Symptom: Implausible gene assignments; a known driver locus annotated with the wrong gene; systematic offset.

Fix: Confirm both inputs are the same build. If not, liftOver the CNVs (note that liftOver can split or drop segments across assembly gaps) and verify a known landmark.

Annotating all overlapped genes as the "affected" genes

Trigger: Reporting every gene a focal amplification spans as amplified/driver.

Mechanism: Focal events are megabases wide and gene-dense; only the selected gene is the driver.

Symptom: A 2 Mb amplicon "amplifies" 40 genes; the report cannot distinguish ERBB2 from its passengers.

Fix: For focal events, prioritize the gene at the cohort recurrence peak (GISTIC, see recurrent-cnv) and known drivers (CGC/OncoKB). Report passengers separately or not at all.

ClinVar CLNSIG parsing errors

Trigger: Naive string matching on the ClinVar CLNSIG INFO field.

Mechanism: CLNSIG is multi-valued, mixes terms ("Conflicting_classifications", "Pathogenic/Likely_pathogenic", "Benign/Likely_benign"), and is per-small-variant — not per-CNV. A substring match for "pathogenic" silently captures "Likely_pathogenic" (intended) but a careless match also fires on records that are conflicting or benign once underscores and slashes are involved.

Symptom: Benign or conflicting variants reported as pathogenic; CNV flagged on incidental nearby SNVs.

Fix: Parse CLNSIG against the controlled vocabulary; exclude "Conflicting" and benign terms explicitly. Remember ClinVar SNV/indel pathogenicity does not transfer to a CNV — use it as context, and use ClinVar's own CNV records or ClinGen dosage regions for the CNV itself.

Equating overlap with consequence

Trigger: Treating any gene-overlapping CNV as functionally significant.

Mechanism: Most single-copy losses are tolerated; partial overlaps may hit only introns/UTRs.

Symptom: Long lists of "affected" dosage-insensitive genes; benign CNVs over-called as significant.

Fix: Require dosage evidence (ClinGen HI/TS) and record coding-exon overlap and whole-gene-vs-partial status before calling a gene affected.

Quantitative Thresholds

| Threshold | Value | Source / Rationale | |-----------|-------|--------------------| | Population-CNV reciprocal overlap | >= 50% (-f 0.5 -r) | Standard reciprocal-overlap match for benign filtering (convention traces to gnomAD-SV / DGV workflows; Collins SR et al 2020 Nature 581:444 uses comparable reciprocal-overlap thresholds for benign-population matching) | | Common-CNV benign frequency | > 1% population frequency | ACMG/ClinGen: high frequency supports benign | | ClinGen HI/TS dosage-sensitive | score = 3 | ClinGen: sufficient evidence for dosage sensitivity | | Whole-gene overlap | >= 99% gene length covered | Distinguishes clean haploinsufficiency from partial/truncating | | AnnotSV pathogenic ranks | rank 4-5 | AnnotSV ACMG-aligned ranking (1 benign - 5 pathogenic) |

Common Errors

| Error / symptom | Cause | Solution | |-----------------|-------|----------| | Wrong genes assigned to a CNV | hg19/hg38 build mismatch | Match builds; liftOver and verify a landmark | | 40 genes called "amplified" for one amplicon | All overlapped genes reported as drivers | Prioritize recurrence-peak + known-driver genes | | Benign variants flagged pathogenic | Substring match on ClinVar CLNSIG | Parse the controlled vocabulary explicitly | | Tiny CNV matched to a huge population CNV | One-sided overlap used for frequency filter | Use reciprocal overlap (-f 0.5 -r) | | AnnotSV columns not found | Column names differ across AnnotSV versions | Check head of the output for the installed version | | Enrichment dominated by gene-dense loci | CNVs span gene clusters | Treat CNV-gene enrichment as hypothesis-generating |

References

Geoffroy V et al 2018. AnnotSV: an integrated tool for structural variations annotation. Bioinformatics 34:3572
Riggs ER et al 2020. Technical standards for the interpretation and reporting of constitutional copy-number variants: ACMG and ClinGen. Genet Med 22:245
Collins RL et al 2020. A structural variation reference for medical and population genetics (gnomAD-SV). Nature 581:444
Sondka Z et al 2018. The COSMIC Cancer Gene Census. Nat Rev Cancer 18:696

Related Skills

copy-number/germline-cnv-interpretation - ACMG/ClinGen points-based CNV classification
copy-number/recurrent-cnv - GISTIC2 recurrence peaks and driver-gene identification
copy-number/cnvkit-analysis - Generates the CNV segments to annotate
copy-number/cnv-visualization - Visualizing annotated CNVs
pathway-analysis/go-enrichment - GO/KEGG enrichment methodology and caveats
genome-intervals/bed-file-basics - BED interval operations
clinical-databases/clinvar-lookup - Querying ClinVar for variant pathogenicity

GPTomics/bio-copy-number-cnv-annotation

copy-number/cnv-annotation/SKILL.md

Annotate copy number variant segments with overlapping genes, dosage-sensitivity scores, cancer driver databases, population frequencies, and clinical-variant content. Covers bedtools/pybedtools interval intersection, AnnotSV comprehensive annotation and ranking, ClinGen haploinsufficiency/triplosensitivity scoring, gnomAD-SV/DGV frequency filtering, COSMIC Cancer Gene Census, and ClinVar overlap. Use when interpreting which genes a CNV affects, distinguishing the driver gene of a focal event from passengers, filtering against population CNVs, separating whole-gene from partial-gene overlap, or preparing CNVs for clinical classification.

817 stars

tools

Updated May 31, 2026

$ install --global

skillsauth

npx skillsauth add GPTomics/bioSkills bio-copy-number-cnv-annotation

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: May 31, 2026, 7:08 AM92.1s3 files scanned

SKILL.md

name:: bio-copy-number-cnv-annotation
description:: Annotate copy number variant segments with overlapping genes, dosage-sensitivity scores, cancer driver databases, population frequencies, and clinical-variant content. Covers bedtools/pybedtools interval intersection, AnnotSV comprehensive annotation and ranking, ClinGen haploinsufficiency/triplosensitivity scoring, gnomAD-SV/DGV frequency filtering, COSMIC Cancer Gene Census, and ClinVar overlap. Use when interpreting which genes a CNV affects, distinguishing the driver gene of a focal event from passengers, filtering against population CNVs, separating whole-gene from partial-gene overlap, or preparing CNVs for clinical classification.
tool_type:: mixed
primary_tool:: bedtools

Version Compatibility

Reference examples tested with: bedtools 2.31+, AnnotSV 3.4+, Python 3.10+ with pybedtools 0.9+, pandas 2.2+, pysam 0.22+; R 4.3+ with clusterProfiler 4.10+.

Before using code patterns, verify installed versions match. If versions differ:

CLI: bedtools --version, AnnotSV --version
Python: pip show pybedtools pandas pysam
R: packageVersion('clusterProfiler')

If code throws an error, introspect the installed package and adapt the example. AnnotSV output column names change between major versions — verify against the installed version.

CNV Annotation

CLI: bedtools intersect -a cnvs.bed -b genes.bed -wa -wb; AnnotSV for full annotation
Python: pybedtools for interval logic; pysam for VCF database queries

Annotation Strategy — Pick the Database for the Question

The Core Distinction: Overlap Is Not Consequence

A CNV overlapping a gene does not necessarily change that gene's dosage in a way that matters. Three refinements separate annotation from interpretation:

Whole-gene vs partial overlap. A deletion spanning an entire gene removes one copy (clean haploinsufficiency test). A deletion removing only the last two exons creates a truncated allele — a different, often more damaging, consequence. Always record the fraction of each gene covered and whether coding exons or only introns/UTRs are hit.
Dosage sensitivity. Most genes tolerate single-copy loss. ClinGen HI/TS scores (3 = sufficient evidence for dosage sensitivity, 0 = no evidence, 30 = gene associated with an autosomal-recessive phenotype, 40 = dosage sensitivity unlikely) indicate which genes' loss/gain is actually consequential.
Driver vs passenger in focal events. A focal amplification carries many genes; the driver is the one under selection, typically at the recurrence peak across a cohort (GISTIC) and a known oncogene. Annotating all 30 genes as "amplified" overstates.

Gene Overlap with bedtools

Goal: Find genes overlapping each CNV segment, recording overlap extent.

Approach: Convert segments to BED, intersect with a gene model, keep both feature sets (-wo reports the overlap length) so partial vs whole-gene overlap is recoverable.

# Segments to BED (CNVkit .cns example; columns chrom/start/end/log2)
awk 'NR>1 {print $1"\t"$2"\t"$3"\t"$5}' sample.cns > sample.cnv.bed

# Intersect; -wo appends the number of overlapping bases
bedtools intersect -a sample.cnv.bed -b gencode.genes.bed -wo > cnv_gene_overlap.txt

Comprehensive Annotation with AnnotSV

Goal: Annotate CNVs against genes, dosage maps, population frequency, and clinical databases in one pass, with a built-in pathogenicity ranking.

Approach: Export CNVs to VCF or BED and run AnnotSV; it returns a "full" line per gene plus a "split" summary, with an ACMG-aligned rank (1-5).

AnnotSV \
    -SVinputFile sample.cnv.vcf \
    -genomeBuild GRCh38 \
    -annotationMode both \
    -outputFile sample_annotated.tsv

# Output includes: overlapped genes, ClinGen HI/TS, gnomAD-SV/DGV frequency, OMIM,
# ClinVar, DECIPHER, and an ACMG-class rank per SV.

AnnotSV's rank is a useful triage signal, not a final classification — confirm against the ClinGen points framework for clinical reporting.

Dosage-Sensitivity and Driver Annotation

Goal: Tag each affected gene with its dosage sensitivity and, for tumors, its driver role, so passengers can be separated from drivers.

import pandas as pd

def annotate_dosage_and_drivers(overlap_tsv, clingen_dosage, cgc_file):
    '''Tag overlapped genes with ClinGen HI/TS and COSMIC driver role.'''
    cols = ['cnv_chrom', 'cnv_start', 'cnv_end', 'log2',
            'gene_chrom', 'gene_start', 'gene_end', 'gene', 'overlap_bp']
    df = pd.read_csv(overlap_tsv, sep='\t', names=cols)
    df['gene_len'] = df['gene_end'] - df['gene_start']
    df['gene_frac_covered'] = (df['overlap_bp'] / df['gene_len']).clip(upper=1.0)
    df['whole_gene'] = df['gene_frac_covered'] >= 0.99

    dosage = pd.read_csv(clingen_dosage, sep='\t')  # gene, HI_score, TS_score
    df = df.merge(dosage, on='gene', how='left')

    cgc = pd.read_csv(cgc_file, sep='\t')
    role = dict(zip(cgc['Gene Symbol'], cgc['Role in Cancer']))
    df['driver_role'] = df['gene'].map(role)

    # Direction-consistent driver hits: oncogene amplified or TSG deleted.
    df['driver_hit'] = (
        ((df['log2'] > 0.3) & df['driver_role'].fillna('').str.contains('oncogene')) |
        ((df['log2'] < -0.3) & df['driver_role'].fillna('').str.contains('TSG')))
    return df

Population-Frequency Filtering

Goal: Remove common, presumed-benign CNVs before clinical interpretation.

Approach: Reciprocal-overlap match each CNV against a population SV catalog (gnomAD-SV, DGV); a CNV with high reciprocal overlap to a common population CNV of the same type is likely benign.

# 50% reciprocal overlap (-f 0.5 -r): same-type, similar-extent population match.
# Reciprocal overlap, not one-sided, prevents a tiny CNV inside a huge population CNV
# (or vice versa) from being wrongly matched.
bedtools intersect -a sample.cnv.bed -b gnomad_sv.bed -f 0.5 -r -wa -wb \
    > cnv_population_match.txt

Pathway Enrichment of Affected Genes

Goal: Test whether genes in amplified (or deleted) regions are enriched for pathways.

library(clusterProfiler)
library(org.Hs.eg.db)

amp_genes <- unique(cnv_annot$gene[cnv_annot$log2 > 0.3])
entrez <- na.omit(mapIds(org.Hs.eg.db, keys = amp_genes,
                         keytype = 'SYMBOL', column = 'ENTREZID'))
go_bp <- enrichGO(gene = entrez, OrgDb = org.Hs.eg.db, ont = 'BP',
                  pAdjustMethod = 'BH', qvalueCutoff = 0.05)

Failure Modes

Genome-build mismatch between CNVs and annotation

Trigger: CNV coordinates on GRCh37 intersected with a GRCh38 gene model (or vice versa).

Mechanism: Coordinates silently shift; the intersection succeeds and returns wrong genes.

Symptom: Implausible gene assignments; a known driver locus annotated with the wrong gene; systematic offset.

Fix: Confirm both inputs are the same build. If not, liftOver the CNVs (note that liftOver can split or drop segments across assembly gaps) and verify a known landmark.

Annotating all overlapped genes as the "affected" genes

Trigger: Reporting every gene a focal amplification spans as amplified/driver.

Mechanism: Focal events are megabases wide and gene-dense; only the selected gene is the driver.

Symptom: A 2 Mb amplicon "amplifies" 40 genes; the report cannot distinguish ERBB2 from its passengers.

Fix: For focal events, prioritize the gene at the cohort recurrence peak (GISTIC, see recurrent-cnv) and known drivers (CGC/OncoKB). Report passengers separately or not at all.

ClinVar CLNSIG parsing errors

Trigger: Naive string matching on the ClinVar CLNSIG INFO field.

Symptom: Benign or conflicting variants reported as pathogenic; CNV flagged on incidental nearby SNVs.

Equating overlap with consequence

Trigger: Treating any gene-overlapping CNV as functionally significant.

Mechanism: Most single-copy losses are tolerated; partial overlaps may hit only introns/UTRs.

Symptom: Long lists of "affected" dosage-insensitive genes; benign CNVs over-called as significant.

Fix: Require dosage evidence (ClinGen HI/TS) and record coding-exon overlap and whole-gene-vs-partial status before calling a gene affected.

Quantitative Thresholds

Common Errors

References

Geoffroy V et al 2018. AnnotSV: an integrated tool for structural variations annotation. Bioinformatics 34:3572
Riggs ER et al 2020. Technical standards for the interpretation and reporting of constitutional copy-number variants: ACMG and ClinGen. Genet Med 22:245
Collins RL et al 2020. A structural variation reference for medical and population genetics (gnomAD-SV). Nature 581:444
Sondka Z et al 2018. The COSMIC Cancer Gene Census. Nat Rev Cancer 18:696

Related Skills

copy-number/germline-cnv-interpretation - ACMG/ClinGen points-based CNV classification
copy-number/recurrent-cnv - GISTIC2 recurrence peaks and driver-gene identification
copy-number/cnvkit-analysis - Generates the CNV segments to annotate
copy-number/cnv-visualization - Visualizing annotated CNVs
pathway-analysis/go-enrichment - GO/KEGG enrichment methodology and caveats
genome-intervals/bed-file-basics - BED interval operations
clinical-databases/clinvar-lookup - Querying ClinVar for variant pathogenicity

Related Skills

GPTomics/bio-workflows-clip-pipeline

tools

VerifiedTrustedCommunity

End-to-end CLIP-seq pipeline from FASTQ to ENCODE-compliant binding sites, single-nucleotide crosslink maps, annotation, motifs, and (optionally) differential binding. Use when running the full Yeo lab eCLIP / iCLIP / iCLIP2 / iCLIP3 / irCLIP / PAR-CLIP analysis with SMInput control, protocol-specific UMI extraction, ENCODE STAR parameters, CLIPper or Skipper peak calling with stringent log2 FC and -log10 p thresholds, IDR rescue and self-consistency QC, and downstream motif registration with mCross or PEKA.

1,065SKILL.mdUpdated Jun 10, 2026

GPTomics/bio-workflows-clip-pipeline

GPTomics/bio-comparative-genomics-whole-genome-duplication

development

VerifiedTrustedCommunity

Detect, date, and contextualize whole-genome duplication (WGD / paleopolyploidy) events using wgd v2 (Chen et al 2024), KsRates (Sensalari 2022 substitution-rate-corrected Ks dating), DupGen_finder (Qiao 2019), MAPS (Li 2018 phylogenomic), POInT (Conant 2008 ordered-block), SLEDGe (2024 ML-based), Whale.jl (Bayesian DL+WGD), and synteny-anchored paranome construction. Use when identifying ancient polyploidy from Ks distributions and synteny block analysis, positioning WGD events relative to speciation, distinguishing tandem from segmental from WGD duplications, dating the 2R/3R vertebrate / fish / salmonid WGDs, building paranome and Ks-age mixture models, applying KsRates substitution-rate correction across lineages, or testing alternative biased-fractionation / dosage-balance models post-WGD.

1,065SKILL.mdUpdated May 23, 2026

GPTomics/bio-comparative-genomics-whole-genome-duplication

GPTomics/bio-comparative-genomics-whole-genome-alignment

tools

VerifiedTrustedCommunity

Build whole-genome alignments using Progressive Cactus (Armstrong 2020 reference-free clade-level WGA), Minigraph-Cactus (Hickey 2024 pangenome-aware), LASTZ chain/net (UCSC pipeline), MUMmer4 (Marçais 2018 pairwise), minimap2 -x asm5/10/20 (Li 2018 fast pairwise), AnchorWave (Song 2022 WGD-aware), and Mauve / progressiveMauve (bacterial). Operates the HAL toolkit (Hickey 2013) for downstream extraction including halSynteny, halLiftover, halBranchMutations, and hal2maf. Use when constructing multi-species alignments for comparative-annotation projection (TOGA), synteny detection, conservation analyses (phyloP / PhastCons), or pangenome graph construction; selecting between reference-free (Cactus) and reference-anchored (LASTZ chains/nets) approaches; tuning sensitivity for closely vs distantly related genomes; or producing HAL files for genome-wide downstream tools.

1,065SKILL.mdUpdated May 23, 2026

GPTomics/bio-comparative-genomics-whole-genome-alignment

GPTomics/bio-comparative-genomics-synteny-analysis

development

VerifiedTrustedCommunity

Detect syntenic blocks and structural rearrangements between genomes using MCScanX (Wang 2012), JCVI/MCScan (Tang 2008 Python), GENESPACE (Lovell 2022) for orthology-anchored riparian visualization, SyRI for structural variation, AnchorWave for sequence-level synteny, i-ADHoRe 3.0 for highly diverged species, SynNet for synteny networks, and ntSynt for multi-genome macrosynteny. Use when identifying collinear gene blocks across species, distinguishing macrosynteny from microsynteny, detecting inversions/translocations/duplications, anchoring orthology in WGD lineages, producing publication riparian plots, computing synteny block age via Ks (cross-references whole-genome-duplication), or running synteny-aware ortholog inference in polyploids.

1,065SKILL.mdUpdated May 23, 2026

GPTomics/bio-comparative-genomics-synteny-analysis

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/GPTomics/bioSkills.git

# Copy into Claude Code skills folder (global)
cp -r bioSkills/copy-number/cnv-annotation ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

GPTomics/bioSkills

817 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT