Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

GPTomics/bio-flow-cytometry-differential-analysis

Name: bio-flow-cytometry-differential-analysis
Author: GPTomics

flow-cytometry/differential-analysis/SKILL.md

npx skillsauth add GPTomics/bioSkills bio-flow-cytometry-differential-analysis

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Version Compatibility

Reference examples tested with: diffcyt 1.22+, CATALYST 1.26+, edgeR 4.0+, limma 3.58+.

Before using code patterns, verify installed versions match. If versions differ:

R: packageVersion('<pkg>') then ?function_name to verify parameters

testDA_edgeR/testDS_limma are diffcyt functions operating on count/median objects from calcCounts/calcMedians; the CATALYST-integrated path is the diffcyt() wrapper on the SCE. Confirm the signature with ?diffcyt before relying on it.

Differential Analysis

"Compare cell populations between my conditions" -> Test cluster frequencies (DA) and within-cluster marker expression (DS) between groups, with the sample (not the cell) as the unit.

R: diffcyt::diffcyt(sce, analysis_type='DA', method_DA='diffcyt-DA-edgeR', design, contrast)
R: diffcyt(sce, analysis_type='DS', method_DS='diffcyt-DS-limma', ...)

The Single Most Important Modern Insight -- The Sample Is the Experimental Unit, Not the Cell

Tens of thousands of cells from one donor are technical PSEUDOREPLICATES, not independent observations. A per-cell test (Wilcoxon across all cells) treats them as n = cells and produces astronomically significant p-values from two mice - it is the single most common statistical sin in modern cytometry (Hurlbert 1984 Ecol Monogr 54:187; the cytometry mirror of the scRNA-seq pseudobulk lesson). The correct unit is the SAMPLE/subject: diffcyt aggregates cells to PER-SAMPLE-PER-CLUSTER counts (DA) and PER-SAMPLE-PER-CLUSTER arcsinh-MEDIANS (DS), then tests across samples with edgeR/limma/GLMM (Weber 2019 Commun Biol 2:183). Biological replication is mandatory (>= 2-3 per group); DA from a single sample per condition has no valid test. Paired with this: cluster proportions are COMPOSITIONAL (they sum to 1), so a real increase in one population mechanically forces apparent depletion in others - a source of false DA in "unchanged" clusters.

DA vs DS, and the type/state marker link

DA (differential abundance): does a cluster's FREQUENCY differ? Clusters are defined by TYPE markers.
DS (differential state): within a fixed-identity cluster, does a STATE marker's expression differ? State markers were withheld from clustering for exactly this test.

Method Taxonomy

| Method | Citation | Mechanism | When to use | |--------|----------|-----------|-------------| | diffcyt-DA-edgeR / voom | Weber 2019 Commun Biol 2:183 | edgeR/voom empirical-Bayes on per-sample counts; optional TMM | standard 2+ group with replicates (DEFAULT) | | diffcyt-DA-GLMM / DS-LMM | Weber 2019 | random effects in the formula | paired/repeated-measures/nested (subject random effect) | | cydar | Lun 2017 Nat Methods 14:707 | overlapping hyperspheres + edgeR + spatial FDR | continuum, avoid hard clusters | | CITRUS | Bruggner 2014 PNAS 111:E2770 | hierarchical clustering + LASSO | predictive signature, LARGE n; correlated-not-causal; largely superseded | | sccomp / scCODA / DCATS | Mangiola 2023 PNAS 120:e2203828120 / Buttner 2021 Nat Commun 12:6876 / Lin 2023 Genome Biol 24:151 | simplex-aware compositional models | strong compositional shift (one pop dominates); DCATS for assignment uncertainty |

Run diffcyt DA and DS

Goal: Test abundance and state on a CATALYST-clustered SCE.

Approach: Build design + contrast from ei(sce); the diffcyt() wrapper uses the stored clustering. State markers are tested in DS, type markers define DA clusters.

library(CATALYST); library(diffcyt)

sce <- readRDS('sce_clustered.rds')
design   <- createDesignMatrix(ei(sce), cols_design = 'condition')
contrast <- createContrast(c(0, 1))                    # Treatment vs Control

res_DA <- diffcyt(sce, clustering_to_use = 'meta20',
                  analysis_type = 'DA', method_DA = 'diffcyt-DA-edgeR',
                  design = design, contrast = contrast)
res_DS <- diffcyt(sce, clustering_to_use = 'meta20',
                  analysis_type = 'DS', method_DS = 'diffcyt-DS-limma',
                  design = design, contrast = contrast)

library(SummarizedExperiment)
rowData(res_DA$res)        # cluster_id, logFC, p_val, p_adj (BH across clusters)

Paired / Repeated-Measures (mixed models)

Goal: Account for within-subject correlation (e.g. pre/post on the same donor).

Approach: Use a GLMM/LMM method with a random effect for subject via a formula.

formula <- createFormula(ei(sce), cols_fixed = 'condition', cols_random = 'patient_id')
res_DA  <- diffcyt(sce, clustering_to_use = 'meta20',
                   analysis_type = 'DA', method_DA = 'diffcyt-DA-GLMM',
                   formula = formula, contrast = createContrast(c(0, 1)))

Compositional Re-Check

Goal: Confirm a headline single-population shift is not inducing artifactual reciprocal depletion.

Approach: Re-test with a simplex-aware model when one cluster changes a lot or total yield differs by group.

# If a dominant population expands, the apparent depletion of others may be a simplex artifact.
# Re-test with sccomp / scCODA (reference cell type) / DCATS (assignment uncertainty)
# before reporting reciprocal depletion as independent biology.

Per-Method Failure Modes

Per-cell pseudoreplication

Trigger: Wilcoxon/t-test across all cells. Mechanism: cells aren't independent. Symptom: p ~ 1e-40 from few subjects. Fix: aggregate to per-sample summaries (diffcyt).

Compositional false DA

Trigger: one population expands strongly. Mechanism: proportions sum to 1. Symptom: significant "depletion" of unrelated clusters. Fix: TMM only when total cell abundance is NOT itself the biological signal (else it removes real signal), or a compositional method (sccomp/scCODA/DCATS); report total-yield differences.

Batch cleaned instead of modeled

Trigger: normalizing batch out then testing naively. Mechanism: over-correction removes real signal. Symptom: attenuated effects. Fix: include batch in the design; if batch == condition, no rescue - design it out.

No replicates

Trigger: 1 sample per condition. Mechanism: no error term. Symptom: uninterpretable p. Fix: require >= 2-3 biological replicates per group.

Quantitative Thresholds

| Threshold | Source | Rationale | |-----------|--------|-----------| | >= 2-3 biological replicates per group | Weber 2019 | minimum for a valid DA/DS error term | | BH FDR across clusters (and clusters x markers for DS) | diffcyt | high-resolution grids have many tests | | arcsinh median as DS statistic | Nowicka 2017 | robust per-cluster per-sample summary |

Common Errors

| Error / symptom | Cause | Solution | |-----------------|-------|----------| | testDA_edgeR(sce, ...) fails | wrong signature | use the diffcyt() wrapper on the SCE, or calcCounts first | | results empty | wrong clustering_to_use name | match the stored clustering id (e.g. meta20) | | no DS results | state markers not flagged | set marker_class='state' in the panel | | paired design ignored | used fixed-effect method | use diffcyt-DA-GLMM with a random effect |

References

Weber 2019 Commun Biol 2:183 — diffcyt (DA + DS).
Bruggner 2014 PNAS 111(26):E2770-E2777 — CITRUS.
Lun 2017 Nat Methods 14(7):707-709 — cydar hypersphere DA.
Mangiola 2023 PNAS 120(33):e2203828120 — sccomp compositional analysis.
Buttner 2021 Nat Commun 12:6876 — scCODA.
Lin 2023 Genome Biol 24:151 — DCATS (assignment-uncertainty-aware).
Nowicka 2017 F1000Research 6:748 — CyTOF workflow; arcsinh-median DS statistic.
Hurlbert 1984 Ecol Monogr 54(2):187-211 — pseudoreplication.

Related Skills

clustering-phenotyping - Cluster (type markers) before testing
gating-analysis - Compare manually gated population frequencies
differential-expression/de-results - Shared edgeR/limma output semantics (padj)
differential-expression/edger-basics - The count-model engine diffcyt reuses
experimental-design/multiple-testing - FDR across clusters and clusters x markers
experimental-design/batch-design - Model batch in the design, don't clean it out

GPTomics/bio-flow-cytometry-differential-analysis

flow-cytometry/differential-analysis/SKILL.md

Differential abundance (DA) and differential state (DS) analysis for flow and mass cytometry - tests which cell populations change in frequency or marker expression between conditions using diffcyt (edgeR/voom/GLMM for DA, limma/LMM for DS), with cydar, CITRUS, and compositional methods (sccomp, scCODA, DCATS) as alternatives. Covers the sample-is-the-experimental-unit principle, design/contrast and mixed-model formulas, compositionality of cluster proportions, and FDR across clusters. Use when comparing populations between groups, choosing a DA method, handling paired/batch designs, or deciding whether compositional correction is needed.

817 stars

testing

Updated May 31, 2026

$ install --global

skillsauth

npx skillsauth add GPTomics/bioSkills bio-flow-cytometry-differential-analysis

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: May 31, 2026, 7:16 AM42.3s3 files scanned

SKILL.md

name:: bio-flow-cytometry-differential-analysis
description:: Differential abundance (DA) and differential state (DS) analysis for flow and mass cytometry - tests which cell populations change in frequency or marker expression between conditions using diffcyt (edgeR/voom/GLMM for DA, limma/LMM for DS), with cydar, CITRUS, and compositional methods (sccomp, scCODA, DCATS) as alternatives. Covers the sample-is-the-experimental-unit principle, design/contrast and mixed-model formulas, compositionality of cluster proportions, and FDR across clusters. Use when comparing populations between groups, choosing a DA method, handling paired/batch designs, or deciding whether compositional correction is needed.
tool_type:: r
primary_tool:: diffcyt

Version Compatibility

Reference examples tested with: diffcyt 1.22+, CATALYST 1.26+, edgeR 4.0+, limma 3.58+.

Before using code patterns, verify installed versions match. If versions differ:

R: packageVersion('<pkg>') then ?function_name to verify parameters

Differential Analysis

"Compare cell populations between my conditions" -> Test cluster frequencies (DA) and within-cluster marker expression (DS) between groups, with the sample (not the cell) as the unit.

R: diffcyt::diffcyt(sce, analysis_type='DA', method_DA='diffcyt-DA-edgeR', design, contrast)
R: diffcyt(sce, analysis_type='DS', method_DS='diffcyt-DS-limma', ...)

The Single Most Important Modern Insight -- The Sample Is the Experimental Unit, Not the Cell

DA vs DS, and the type/state marker link

DA (differential abundance): does a cluster's FREQUENCY differ? Clusters are defined by TYPE markers.
DS (differential state): within a fixed-identity cluster, does a STATE marker's expression differ? State markers were withheld from clustering for exactly this test.

Method Taxonomy

Run diffcyt DA and DS

Goal: Test abundance and state on a CATALYST-clustered SCE.

Approach: Build design + contrast from ei(sce); the diffcyt() wrapper uses the stored clustering. State markers are tested in DS, type markers define DA clusters.

library(CATALYST); library(diffcyt)

sce <- readRDS('sce_clustered.rds')
design   <- createDesignMatrix(ei(sce), cols_design = 'condition')
contrast <- createContrast(c(0, 1))                    # Treatment vs Control

res_DA <- diffcyt(sce, clustering_to_use = 'meta20',
                  analysis_type = 'DA', method_DA = 'diffcyt-DA-edgeR',
                  design = design, contrast = contrast)
res_DS <- diffcyt(sce, clustering_to_use = 'meta20',
                  analysis_type = 'DS', method_DS = 'diffcyt-DS-limma',
                  design = design, contrast = contrast)

library(SummarizedExperiment)
rowData(res_DA$res)        # cluster_id, logFC, p_val, p_adj (BH across clusters)

Paired / Repeated-Measures (mixed models)

Goal: Account for within-subject correlation (e.g. pre/post on the same donor).

Approach: Use a GLMM/LMM method with a random effect for subject via a formula.

formula <- createFormula(ei(sce), cols_fixed = 'condition', cols_random = 'patient_id')
res_DA  <- diffcyt(sce, clustering_to_use = 'meta20',
                   analysis_type = 'DA', method_DA = 'diffcyt-DA-GLMM',
                   formula = formula, contrast = createContrast(c(0, 1)))

Compositional Re-Check

Goal: Confirm a headline single-population shift is not inducing artifactual reciprocal depletion.

Approach: Re-test with a simplex-aware model when one cluster changes a lot or total yield differs by group.

# If a dominant population expands, the apparent depletion of others may be a simplex artifact.
# Re-test with sccomp / scCODA (reference cell type) / DCATS (assignment uncertainty)
# before reporting reciprocal depletion as independent biology.

Per-Method Failure Modes

Per-cell pseudoreplication

Trigger: Wilcoxon/t-test across all cells. Mechanism: cells aren't independent. Symptom: p ~ 1e-40 from few subjects. Fix: aggregate to per-sample summaries (diffcyt).

Compositional false DA

Batch cleaned instead of modeled

No replicates

Trigger: 1 sample per condition. Mechanism: no error term. Symptom: uninterpretable p. Fix: require >= 2-3 biological replicates per group.

Quantitative Thresholds

Common Errors

References

Weber 2019 Commun Biol 2:183 — diffcyt (DA + DS).
Bruggner 2014 PNAS 111(26):E2770-E2777 — CITRUS.
Lun 2017 Nat Methods 14(7):707-709 — cydar hypersphere DA.
Mangiola 2023 PNAS 120(33):e2203828120 — sccomp compositional analysis.
Buttner 2021 Nat Commun 12:6876 — scCODA.
Lin 2023 Genome Biol 24:151 — DCATS (assignment-uncertainty-aware).
Nowicka 2017 F1000Research 6:748 — CyTOF workflow; arcsinh-median DS statistic.
Hurlbert 1984 Ecol Monogr 54(2):187-211 — pseudoreplication.

Related Skills

clustering-phenotyping - Cluster (type markers) before testing
gating-analysis - Compare manually gated population frequencies
differential-expression/de-results - Shared edgeR/limma output semantics (padj)
differential-expression/edger-basics - The count-model engine diffcyt reuses
experimental-design/multiple-testing - FDR across clusters and clusters x markers
experimental-design/batch-design - Model batch in the design, don't clean it out

Related Skills

GPTomics/bio-workflows-clip-pipeline

tools

VerifiedTrustedCommunity

End-to-end CLIP-seq pipeline from FASTQ to ENCODE-compliant binding sites, single-nucleotide crosslink maps, annotation, motifs, and (optionally) differential binding. Use when running the full Yeo lab eCLIP / iCLIP / iCLIP2 / iCLIP3 / irCLIP / PAR-CLIP analysis with SMInput control, protocol-specific UMI extraction, ENCODE STAR parameters, CLIPper or Skipper peak calling with stringent log2 FC and -log10 p thresholds, IDR rescue and self-consistency QC, and downstream motif registration with mCross or PEKA.

1,065SKILL.mdUpdated Jun 10, 2026

GPTomics/bio-workflows-clip-pipeline

GPTomics/bio-comparative-genomics-whole-genome-duplication

development

VerifiedTrustedCommunity

Detect, date, and contextualize whole-genome duplication (WGD / paleopolyploidy) events using wgd v2 (Chen et al 2024), KsRates (Sensalari 2022 substitution-rate-corrected Ks dating), DupGen_finder (Qiao 2019), MAPS (Li 2018 phylogenomic), POInT (Conant 2008 ordered-block), SLEDGe (2024 ML-based), Whale.jl (Bayesian DL+WGD), and synteny-anchored paranome construction. Use when identifying ancient polyploidy from Ks distributions and synteny block analysis, positioning WGD events relative to speciation, distinguishing tandem from segmental from WGD duplications, dating the 2R/3R vertebrate / fish / salmonid WGDs, building paranome and Ks-age mixture models, applying KsRates substitution-rate correction across lineages, or testing alternative biased-fractionation / dosage-balance models post-WGD.

1,065SKILL.mdUpdated May 23, 2026

GPTomics/bio-comparative-genomics-whole-genome-duplication

GPTomics/bio-comparative-genomics-whole-genome-alignment

tools

VerifiedTrustedCommunity

Build whole-genome alignments using Progressive Cactus (Armstrong 2020 reference-free clade-level WGA), Minigraph-Cactus (Hickey 2024 pangenome-aware), LASTZ chain/net (UCSC pipeline), MUMmer4 (Marçais 2018 pairwise), minimap2 -x asm5/10/20 (Li 2018 fast pairwise), AnchorWave (Song 2022 WGD-aware), and Mauve / progressiveMauve (bacterial). Operates the HAL toolkit (Hickey 2013) for downstream extraction including halSynteny, halLiftover, halBranchMutations, and hal2maf. Use when constructing multi-species alignments for comparative-annotation projection (TOGA), synteny detection, conservation analyses (phyloP / PhastCons), or pangenome graph construction; selecting between reference-free (Cactus) and reference-anchored (LASTZ chains/nets) approaches; tuning sensitivity for closely vs distantly related genomes; or producing HAL files for genome-wide downstream tools.

1,065SKILL.mdUpdated May 23, 2026

GPTomics/bio-comparative-genomics-whole-genome-alignment

GPTomics/bio-comparative-genomics-synteny-analysis

development

VerifiedTrustedCommunity

Detect syntenic blocks and structural rearrangements between genomes using MCScanX (Wang 2012), JCVI/MCScan (Tang 2008 Python), GENESPACE (Lovell 2022) for orthology-anchored riparian visualization, SyRI for structural variation, AnchorWave for sequence-level synteny, i-ADHoRe 3.0 for highly diverged species, SynNet for synteny networks, and ntSynt for multi-genome macrosynteny. Use when identifying collinear gene blocks across species, distinguishing macrosynteny from microsynteny, detecting inversions/translocations/duplications, anchoring orthology in WGD lineages, producing publication riparian plots, computing synteny block age via Ks (cross-references whole-genome-duplication), or running synteny-aware ortholog inference in polyploids.

1,065SKILL.mdUpdated May 23, 2026

GPTomics/bio-comparative-genomics-synteny-analysis

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/GPTomics/bioSkills.git

# Copy into Claude Code skills folder (global)
cp -r bioSkills/flow-cytometry/differential-analysis ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

GPTomics/bioSkills

817 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT