flow-cytometry/bead-normalization/SKILL.md
Bead-based signal normalization and cross-batch harmonization for CyTOF and high-parameter cytometry - EQ four-element bead normalization of instrument sensitivity drift (CATALYST normCytof, premessa), and reference-anchor cross-batch normalization (CytoNorm, per-cluster quantile splines). Covers the distinction between within-run drift correction and between-batch correction, the mandatory anchor/reference sample, why normalization is per-cluster with many quantiles, and the over-correction risk. Use when correcting CyTOF signal drift, harmonizing multi-batch or multi-site studies, or deciding whether to normalize data versus model batch in the design.
npx skillsauth add GPTomics/bioSkills bio-flow-cytometry-bead-normalizationInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Reference examples tested with: CATALYST 1.26+, CytoNorm 2.0+, flowCore 2.14+.
Before using code patterns, verify installed versions match. If versions differ:
packageVersion('<pkg>') then ?function_name to verify parametersnormCytof() returns a LIST ($data, $beads, $removed, ...), not a flowFrame; beads="dvs" encodes EQ masses 140,151,153,165,175. Confirm with ?normCytof before relying on slot names.
"Normalize my CyTOF data" -> Correct instrument sensitivity drift with EQ beads (within/across runs), then harmonize batches with a reference anchor.
CATALYST::normCytof() (EQ-bead-based) or premessaCytoNorm::CytoNorm.train() + CytoNorm.normalize() (per-cluster quantile splines)Bead normalization and batch normalization correct DIFFERENT things and are NOT interchangeable. (1) EQ-BEAD normalization (Finck 2013 Cytometry A 83:483) corrects within-run and run-to-run instrument SENSITIVITY DRIFT using the four-element beads as a physical internal standard - applied first, on raw counts. (2) CROSS-BATCH normalization (CytoNorm, Van Gassen 2020 Cytometry A 97:268) corrects staining/acquisition batch effects using a shared ANCHOR/reference sample present in EVERY batch, learning per-FlowSOM-cluster quantile-spline transforms. Beads cannot fix staining-batch or reagent-lot effects; CytoNorm cannot fix intra-run detector drift. The anchor control is the load-bearing design element: because it is biologically identical across batches, any cross-batch difference in it is technical BY CONSTRUCTION. Dropping the anchor (CytoNorm 2.0) is convenient but reintroduces the over-correction risk the anchor was designed to eliminate - so the safest stance for inference is to MODEL batch in the diffcyt design and reserve normalization for visualization/clustering display.
Batch effects are cell-type-specific - a marker can drift in monocytes but not in T cells - so a single global channel transform over-corrects one population while under-correcting another and can erase real abundance differences. CytoNorm therefore learns the transform PER FlowSOM cluster. And it uses ~99 quantiles + a spline because the drift is non-linear and intensity-dependent (the negative and positive peaks move by different amounts); a single median shift or linear rescale reintroduces the distortion it is trying to remove.
Goal: Correct sensitivity drift and remove bead events.
Approach: normCytof() gates beads, computes the correction on the linear scale, and returns a list - the cleaned SCE is in $data.
library(CATALYST)
sce <- prepData(fs, panel, md) # no by_time arg - normalization is normCytof's job
res <- normCytof(sce, beads = 'dvs', # EQ masses 140,151,153,165,175
k = 500, remove_beads = TRUE, overwrite = FALSE) # k = smoothing window (default; affects bead-trace viz, not correction magnitude)
sce_norm <- res$data # normalized SCE; res$beads / res$removed available
Goal: Harmonize batches using a shared reference sample.
Approach: Train on the anchor (present in every batch) -> learn per-cluster quantile splines -> apply to the real samples. testCV() first: if cluster CV is high, the FlowSOM model is batch-unstable and per-cluster splines will distort (fall back to nClus=1).
library(CytoNorm)
model <- CytoNorm.train(files = ref_files, labels = batch_labels, channels = marker_channels,
transformList = tl,
FlowSOM.params = list(nCells = 6000, xdim = 10, ydim = 10, nClus = 10),
normMethod.train = QuantileNorm.train,
normParams = list(nQ = 99), seed = 42)
CytoNorm.normalize(model = model, files = sample_files, labels = batch_labels,
transformList = tl, transformList.reverse = tl_rev, # BOTH required
outputDir = 'normalized/')
Trigger: expecting beads to fix staining-batch effects. Mechanism: different layers. Symptom: residual batch structure after bead norm. Fix: bead norm for drift; CytoNorm for batch.
Trigger: a batch lacking the reference sample. Mechanism: nothing biologically-identical to learn from. Symptom: that batch can't be normalized / is over-corrected. Fix: run the anchor in every batch (or model batch instead).
Trigger: CytoNorm with groups confounded with batch, or anchor-free on variable samples. Mechanism: splines absorb real biology. Symptom: attenuated group differences. Fix: testCV() check; model batch in diffcyt for inference; normalize for display only.
Trigger: sce_norm <- normCytof(...). Mechanism: it returns a list. Symptom: downstream type error. Fix: res$data.
| Threshold | Source | Rationale |
|-----------|--------|-----------|
| bead drift reduced ~4.9x -> 1.3x | Finck 2013 Cytometry A 83:483 | EQ-bead correction over a month of runs |
| 99 quantiles, per-cluster | Van Gassen 2020 Cytometry A 97:268 | non-linear intensity-dependent, cell-type-specific drift |
| EQ masses 140,151,153,165,175 (dvs) | CATALYST | DVS/Fluidigm EQ four-element bead set |
| Error / symptom | Cause | Solution |
|-----------------|-------|----------|
| normCytof output not usable | it returns a list | use res$data |
| prepData(by_time=TRUE) errors | no such argument | use normCytof() for bead/drift correction |
| CytoNorm distorts populations | unstable FlowSOM clustering | run testCV(); reduce nClus (or 1) |
| batch effect remains | only bead-normalized | add CytoNorm with anchor samples |
Workflow order (CyTOF): EQ-bead drift normalization (raw counts, FIRST) -> cytometry-qc -> doublet-detection -> clustering -> CytoNorm cross-batch (LAST). The two normalization layers sit at opposite ends.
testing
Analyze multi-modal single-cell data (CITE-seq, Multiome, spatial). Use when working with data that measures multiple modalities per cell like RNA + protein or RNA + ATAC. Use when analyzing CITE-seq, Multiome, or other multi-modal single-cell data.
data-ai
Analyze metabolite-mediated cell-cell communication using MeboCost for metabolic signaling inference between cell types. Predict metabolite secretion and sensing patterns from scRNA-seq data. Use when studying metabolic crosstalk between cell populations or metabolite-receptor interactions.
development
Find marker genes and annotate cell types in single-cell RNA-seq using Seurat (R) and Scanpy (Python). Use for differential expression between clusters, identifying cluster-specific markers, scoring gene sets, and assigning cell type labels. Use when finding marker genes and annotating clusters.
development
Reconstruct cell lineage trees from CRISPR barcode tracing or mitochondrial mutations. Use when studying clonal dynamics, cell fate decisions, or developmental trajectories.