imaging-mass-cytometry/differential-analysis/SKILL.md
Compare cell-type composition and spatial features across conditions in IMC/MIBI cohorts with the patient as the experimental unit, covering pseudoreplication, per-patient aggregation, mixed models, compositional (Dirichlet/scCODA) differential abundance, diffcyt, per-image-to-patient spatial differential testing (SpaceANOVA), batch covariates, and FDR. Use when testing whether a cell type or spatial niche differs between groups, avoiding cell-level pseudoreplication, choosing a differential-abundance method, or correctly powering an IMC cohort comparison.
npx skillsauth add GPTomics/bioSkills bio-imaging-mass-cytometry-differential-analysisInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Reference examples tested with: diffcyt 1.22+ (R), lme4 1.1+ (R), statsmodels 0.14+, scanpy 1.10+, sccoda 0.1.9+, numpy 1.26+, pandas 2.2+
Before using code patterns, verify installed versions match. If versions differ:
pip show <package> then help(module.function) to check signaturespackageVersion('<pkg>') then ?function_name to verify parametersIf code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
Notes specific to this skill: cell-type proportions are compositional (they sum to 1), so a real increase in one type forces apparent decreases in others -- a Dirichlet/CLR-aware method (scCODA) or a reference cell type is needed, not independent per-type tests. statsmodels mixedlm fits patient as a random effect. diffcyt operates on per-sample cluster counts (DA) and per-sample median marker expression (DS).
"Compare cell types and spatial structure between my conditions" -> Aggregate to the patient, then test across patients -- never across cells.
statsmodels.formula.api.mixedlm, sccoda, scanpydiffcyt, lme4::lmer, SpaceANOVA for spatial differential testingAfter phenotyping and spatial analysis, every interesting claim ("disease has more Tregs", "responders have more CD8-tumor contact") is a comparison BETWEEN groups, and the single most common fatal error is testing it at the cell level. Hundreds of thousands of cells from one patient are not independent replicates -- they are correlated reads of one biological sample, and cells within an image are massively spatially autocorrelated. Testing at the cell level inflates n by orders of magnitude and manufactures significance: a per-cell test over 50,000 cells reports p~0 for trivial effects because the effective sample size is the number of PATIENTS (often 10-40), not cells (Squair 2021 Nat Commun 12:5692). In imaging this is worse than in scRNA because slide and ROI add nesting levels and ROIs are not random samples of the tissue. The correct spine is invariant across every differential question: compute a per-image (or per-ROI) summary, aggregate to ONE value per patient, then test across patients with the patient as the unit -- a mixed model with patient as a random effect (image nested within patient), a pseudobulk-style per-patient summary, or a cell-count-weighted average of per-image statistics (Samorodnitsky and Wu 2024 Brief Bioinform 25:bbae522). Two riders complete the picture: cell-type proportions are COMPOSITIONAL (they are constrained to sum to 1, so a real rise in one type mechanically depresses the others, and independent per-type tests double-count this), and acquisition BATCH drifts by day/run and can align with clinical group, so batch must be a covariate and acquisition order randomized against condition. Phenotyping-method choice and statistical-unit choice are orthogonal: getting the cell types right does not excuse testing them wrong.
| Question | Per-image summary | Patient-level test | |----------|-------------------|--------------------| | Cell-type abundance differs between groups | per-image cell-type proportions | mixed model on proportions; scCODA (compositional); diffcyt-DA | | A functional/state marker differs within a type | per-image median marker per type | pseudobulk per patient + limma/edgeR; diffcyt-DS | | A spatial interaction/niche differs between groups | per-image enrichment z / Ripley's K / CN abundance | mixed model / cell-count-weighted; SpaceANOVA (FANOVA on cross-K) |
| Scenario | Recommended | Why | |----------|-------------|-----| | Compare cell-type proportions, several types shift | scCODA (or CLR + mixed model) | handles the compositional constraint and the reference-type problem | | Standard cytometry-style DA on clusters | diffcyt-DA (edgeR on per-sample counts) | designed for per-sample cluster counts; established | | Multiple ROIs per patient | mixed model, patient random effect (image nested) | respects nesting; or cell-count-weighted aggregate | | Few patients (n < ~10) | simple per-patient test; report low power honestly | do NOT rescue power with cell count | | Differential functional-marker expression within a type | pseudobulk per patient + limma/edgeR (diffcyt-DS) | aggregates out cell-level pseudoreplication | | Spatial interaction/niche across groups | per-image spatial stat -> patient aggregate; SpaceANOVA | the spatial statistic is the summary; the unit is still the patient | | Acquisition batch aligns with group | include batch covariate; if confounded, the contrast is unrescuable | randomize acquisition order against condition |
Goal: Collapse millions of cells to one summary per patient before any test.
Approach: Compute per-image cell-type proportions, then aggregate images to their patient. Every downstream test consumes this patient-level table, not the cell table.
import pandas as pd
# obs has one row per cell with image_id, patient, condition, cell_type
counts = obs.groupby(['patient', 'condition', 'image_id', 'cell_type']).size().unstack(fill_value=0)
image_prop = counts.div(counts.sum(axis=1), axis=0) # per-image proportions
patient_prop = image_prop.groupby(['patient', 'condition']).mean() # one row per patient
Goal: Test a cell type's proportion across groups while respecting patient/ROI nesting.
Approach: Fit a mixed model with patient as a random effect when multiple ROIs per patient exist; this absorbs within-patient correlation that a fixed-effect test would treat as independent replication.
import statsmodels.formula.api as smf
# one row per image; proportion of the target type; patient random intercept
df = image_prop.reset_index().rename(columns={'Treg': 'prop'}) # 'Treg' = an actual cell_type column
model = smf.mixedlm('prop ~ condition + batch', df, groups=df['patient']) # batch as covariate
res = model.fit()
print(res.summary()) # the condition coefficient is tested with patient as the unit
Goal: Avoid the false "everything changed" artifact when proportions are constrained to sum to 1.
Approach: Model the counts as compositional against a reference cell type; a change is interpreted relative to that reference rather than as an independent per-type shift.
import sccoda.util.cell_composition_data as dat
from sccoda.util import comp_ana as mod
# patient-level cell-type COUNTS (not proportions); pick a biologically stable reference type
data = dat.from_pandas(patient_counts, covariate_columns=['condition'])
analysis = mod.CompositionalAnalysis(data, formula='condition', reference_cell_type='Epithelial')
result = analysis.sample_hmc()
result.summary()
Goal: Test whether a spatial interaction or niche differs between groups, at the patient unit.
Approach: Treat the per-image spatial statistic (a neighborhood-enrichment z, a Ripley's cross-K curve, a CN abundance) as the summary, aggregate to patient, and test across patients with FDR over cell-type pairs. SpaceANOVA does this as a functional ANOVA on per-image cross-K with subject structure.
import statsmodels.formula.api as smf
from statsmodels.stats.multitest import multipletests
# per_image_pair: rows = (image_id, patient, condition, batch, enrichment_z) for ONE type pair
pvals = {}
for pair, sub in per_image_pair.groupby('pair'):
res = smf.mixedlm('enrichment_z ~ condition + batch', sub, groups=sub['patient']).fit()
pvals[pair] = res.pvalues['condition[T.responder]']
padj = dict(zip(pvals, multipletests(list(pvals.values()), method='fdr_bh')[1])) # FDR across pairs
Goal: Test whether a functional/state marker (Ki67, PD-1) differs within a cell type between groups.
Approach: Pseudobulk to one value per patient per cell type (median marker expression among that type's cells), then test across patients. diffcyt-DS (R) formalizes this on per-sample medians; the Python pseudobulk path is below.
import numpy as np
t = adata[adata.obs['cell_type'] == 'T cell']
ki67 = t[:, 'Ki67'].X
ki67 = ki67.toarray().ravel() if hasattr(ki67, 'toarray') else np.asarray(ki67).ravel()
pb = t.obs.assign(ki67=ki67).groupby(['patient', 'condition'])['ki67'].median().reset_index()
print(smf.ols('ki67 ~ condition', pb).fit().pvalues['condition[T.responder]']) # one value per patient
Trigger: a t-test/Wilcoxon/regression over individual cells. Mechanism: correlated cells from one sample are pseudoreplicates; effective n is the patient count. Symptom: p~0 for trivial effects; "significant" findings that do not replicate. Fix: aggregate to per-patient summaries; test across patients.
Trigger: a separate test per cell type on proportions. Mechanism: proportions sum to 1, so a real rise in one type depresses others mechanically. Symptom: many types appear to change in opposite directions. Fix: compositional model (scCODA) or CLR transform with a reference type.
Trigger: aggressive integration to make clusters patient-agnostic, then testing on corrected data. Mechanism: correction can treat real between-patient biology as batch. Symptom: the disease signal disappears. Fix: correct minimally, validate that invariant types align while variable types stay separate, and keep integration out of the across-patient inference path.
Trigger: claiming significance from n=4 patients because millions of cells were imaged. Mechanism: cell count is not replication. Symptom: confident claims from few patients. Fix: report the patient n and the true power honestly; collect more patients.
| Threshold | Source | Rationale | |-----------|--------|-----------| | Unit of replication = patient count (often 10-40) | Squair 2021 Nat Commun 12:5692 | cells/ROIs are pseudoreplicates | | Cell-count-weighted per-image aggregation | Samorodnitsky and Wu 2024 Brief Bioinform 25:bbae522 | controls type-I error with high power (vs unweighted ROI averaging) | | Reference cell type for compositional DA | Buttner 2021 Nat Commun 12:6876 | proportions are not independent | | BH-FDR across cell-type pairs and radii | multiplicity | ~200 pairs x radii guarantees false positives | | Batch covariate; randomize acquisition order | spatial dossier | run drift can align with clinical group |
| Error / symptom | Cause | Solution | |-----------------|-------|----------| | p~0 with a trivial effect | cell-level test | per-patient aggregation; mixed model | | All cell types "changed" | compositional constraint ignored | scCODA / CLR with reference type | | Disease signal vanished after integration | batch over-correction | minimal correction; keep it out of the test | | Significant with n=4 patients | power rescued by cell count | report patient n; do not over-claim | | Many significant pairs | no FDR across pairs/radii | BH-FDR over the full grid |
testing
Analyze multi-modal single-cell data (CITE-seq, Multiome, spatial). Use when working with data that measures multiple modalities per cell like RNA + protein or RNA + ATAC. Use when analyzing CITE-seq, Multiome, or other multi-modal single-cell data.
data-ai
Analyze metabolite-mediated cell-cell communication using MeboCost for metabolic signaling inference between cell types. Predict metabolite secretion and sensing patterns from scRNA-seq data. Use when studying metabolic crosstalk between cell populations or metabolite-receptor interactions.
development
Find marker genes and annotate cell types in single-cell RNA-seq using Seurat (R) and Scanpy (Python). Use for differential expression between clusters, identifying cluster-specific markers, scoring gene sets, and assigning cell type labels. Use when finding marker genes and annotating clusters.
development
Reconstruct cell lineage trees from CRISPR barcode tracing or mitochondrial mutations. Use when studying clonal dynamics, cell fate decisions, or developmental trajectories.