data-visualization/heatmaps-clustering/SKILL.md
Build clustered heatmaps for expression matrices and other features-by-samples data with rigorous distance/linkage/scaling choices, robust color mapping, optimal leaf ordering, and ComplexHeatmap/pheatmap/seaborn rendering. Covers the ward.D vs ward.D2 trap, the row-vs-column scaling decision, multi-track annotations, oncoPrint, and raster rendering for large matrices. Use when visualizing expression patterns across samples or identifying co-regulated clusters.
npx skillsauth add GPTomics/bioSkills bio-data-visualization-heatmaps-clusteringInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Reference examples tested with: ComplexHeatmap 2.18+, pheatmap 1.0.13 (still maintained as of 2025-06), circlize 0.4.16+, seaborn 0.13+, scipy 1.12+, scanpy 1.10+, ggplot2 3.5+.
Before using code patterns, verify installed versions match. If versions differ:
pip show <package> then help(module.function) to check signaturespackageVersion('<pkg>') then ?function_name to verify parametersIf code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
"Make a clustered heatmap" -> Render an expression / feature matrix as a colored grid with hierarchical-clustering dendrograms, after committing to (a) how to scale the data (row z-score vs raw vs robust), (b) which distance metric (Euclidean vs correlation vs Manhattan), (c) which linkage criterion (ward.D2 vs complete vs average), (d) how to order the leaves (default vs optimal leaf ordering), (e) how to map values to color (sequential vs diverging, robust quantile bounds), and (f) which package can handle the matrix size and annotation complexity.
ComplexHeatmap::Heatmap() (modern default), pheatmap::pheatmap() (still maintained, simpler API)seaborn.clustermap(), scanpy.pl.heatmap() (single-cell-aware)A heatmap's dendrograms are produced by three orthogonal choices, each with material biological consequences:
1 − Pearson correlation; Manhattan tolerates outliers; correlation distance preserves co-regulation patterns regardless of amplitude.The biological story changes depending on these choices. A "module" identified with complete linkage on Euclidean distance of raw counts is not the same module identified with ward.D2 on correlation distance of z-scored data. Both can be defensible; neither is automatic. Pick deliberately, document, and verify the clustering against orthogonal evidence before claiming the modules are biological.
R stats::hclust exposes two methods both labeled "Ward": ward.D and ward.D2. They produce different dendrograms on the same data. Only ward.D2 (squared distances input) implements Ward's actual minimum-variance criterion (Murtagh & Legendre 2014 J Classif 31:274). ward.D is a historical implementation that does not.
hclust(dist(x), method = 'ward.D') # NOT Ward's criterion -- legacy
hclust(dist(x), method = 'ward.D2') # Ward's actual minimum-variance criterion
pheatmap::pheatmap(clustering_method='ward.D2') and ComplexHeatmap::Heatmap(clustering_method_rows='ward.D2') both pass through to hclust. Always specify ward.D2 unless reproducing a paper that used the unlabeled ward (which actually called ward.D pre-R 3.1).
| Scenario | Scaling | Distance | Linkage | Why |
|----------|---------|----------|---------|-----|
| Bulk RNA-seq, expression patterns across samples | row z-score | euclidean | ward.D2 | Standard; z-score removes absolute level so co-regulated genes cluster regardless of magnitude |
| Methylation beta values (already bounded [0,1]) | raw (no scale) | euclidean or manhattan | ward.D2 | Beta values are interpretable on absolute scale; scaling would distort |
| Co-expression module discovery | row z-score | correlation (1 - cor) | average | WGCNA convention; preserves co-regulation pattern |
| ChIP/ATAC peak intensity across samples | raw log-counts | euclidean | ward.D2 | Peaks are interpretable on absolute scale after log |
| Sample QC (correlation of samples) | column-wise raw | correlation | ward.D2 | The correlation IS the data; don't scale before computing it |
| Methylation array with outliers | raw + clip 1-99% | euclidean | ward.D2 | Outliers dominate Euclidean; robust clip preserves signal |
| Single-cell pseudobulk by cell type | row z-score | euclidean | ward.D2 | Same as bulk; downsample to <500 cells per type for rendering |
| Mutation matrix (binary present/absent) | raw | binary (jaccard) | average or complete | Standard distance for binary data; ward inappropriate |
| Drug response across cell lines | row z-score | spearman correlation | ward.D2 | Drug-rank patterns matter more than absolute IC50 |
A heatmap is a color encoding of a matrix. The default linear mapping from data to color is rarely correct:
Diverging data needs symmetric bounds. For z-scores or log-fold changes, the color bar must be symmetric around zero. colorRamp2(c(-2, 0, 2), c('#0072B2', 'white', '#D55E00')) ALWAYS, not colorRamp2(c(min, mean, max), ...).
Robust quantile bounds. Single outliers compress the entire color scale. Clip at 1st/99th percentile before mapping: bounds <- quantile(mat, c(0.01, 0.99)). ComplexHeatmap's colorRamp2(c(bounds[1], 0, bounds[2]), ...) is standard. Without this, one outlier sample turns the entire heatmap pale.
Sequential data uses a perceptually-uniform colormap. viridis, magma, cividis (Nuñez 2018), or batlow (Crameri 2020). NOT jet, NOT rainbow, NOT colorRampPalette(c('blue','red'))(100) which has a non-monotonic luminance.
Diverging palettes from Crameri (vik, roma) or ColorBrewer (RdBu, BrBG) are perceptually uniform. Reverse the default direction for log-fold-change (negative = blue, positive = red, by biological convention).
A dendrogram for n leaves has 2^(n-1) consistent linear orderings — only one is the leaf order shown. Default hclust gives a deterministic but visually arbitrary ordering. Optimal Leaf Ordering (OLO) chooses the consistent ordering that minimizes the sum of distances between adjacent leaves — making visually adjacent rows actually similar, and revealing block structure in the heatmap that the default ordering hides.
library(ComplexHeatmap)
library(seriation)
# OLO via seriation
dist_rows <- dist(mat)
hc_rows <- hclust(dist_rows, method = 'ward.D2')
olo_rows <- seriate(dist_rows, method = 'OLO', control = list(hclust = hc_rows))
Heatmap(mat,
cluster_rows = as.dendrogram(olo_rows[[1]]),
cluster_columns = TRUE,
clustering_method_columns = 'ward.D2')
For matrices >2000 rows OLO becomes slow (O(n^4) in worst case; modern implementations are much faster). The trade-off is worth it for publication figures.
Goal: Render an annotated heatmap with column metadata (condition, batch, age), row metadata (pathway, gene class), and split panels for grouped display.
Approach: Define HeatmapAnnotation (column) and rowAnnotation objects with explicit color lists; render with Heatmap() specifying row_split/column_split for grouped layout; use draw() to commit, not bare Heatmap(), when running non-interactively.
library(ComplexHeatmap)
library(circlize)
# Robust symmetric color mapping
bounds <- quantile(abs(mat[!is.na(mat)]), 0.99)
col_fun <- colorRamp2(c(-bounds, 0, bounds), c('#0072B2', 'white', '#D55E00'))
# Column metadata
ha_col <- HeatmapAnnotation(
Condition = metadata$condition,
Batch = metadata$batch,
Age = anno_barplot(metadata$age),
col = list(
Condition = c(Control = '#56B4E9', Treatment = '#D55E00'),
Batch = c(A = '#009E73', B = '#0072B2', C = '#CC79A7')
),
annotation_name_gp = gpar(fontsize = 8)
)
# Row metadata
ha_row <- rowAnnotation(
Pathway = gene_info$pathway,
LogFC = anno_barplot(gene_info$log2FC, baseline = 0,
gp = gpar(fill = ifelse(gene_info$log2FC > 0,
'#D55E00', '#0072B2'))),
col = list(Pathway = c(Metabolism = '#8491B4', Signaling = '#91D1C2'))
)
ht <- Heatmap(mat,
name = 'Z-score',
col = col_fun,
top_annotation = ha_col,
left_annotation = ha_row,
row_split = gene_info$pathway,
column_split = metadata$condition,
clustering_method_rows = 'ward.D2',
clustering_method_columns = 'ward.D2',
clustering_distance_rows = 'euclidean',
clustering_distance_columns = 'euclidean',
show_row_names = FALSE,
use_raster = TRUE) # rasterize cell layer for >2000 rows
draw(ht, merge_legends = TRUE) # draw() not bare Heatmap()
draw() requirement (silent failure)A bare Heatmap(mat) works at the R console because auto-print invokes draw(). Inside for, lapply, function, Quarto/Rmd chunks, or Rscript, a bare Heatmap() produces no output and no error. Always wrap in draw() non-interactively. Only draw() exposes merge_legends, heatmap_legend_side, ht_gap, and padding.
import seaborn as sns
import numpy as np
import pandas as pd
# Robust symmetric bounds (1-99% quantile)
vmax = np.quantile(np.abs(df.values[~np.isnan(df.values)]), 0.99)
# col_colors / row_colors for categorical annotations
condition_colors = metadata['condition'].map({'Control': '#56B4E9', 'Treatment': '#D55E00'})
batch_colors = metadata['batch'].map({'A': '#009E73', 'B': '#0072B2', 'C': '#CC79A7'})
col_colors = pd.DataFrame({'Condition': condition_colors, 'Batch': batch_colors})
g = sns.clustermap(df,
cmap='RdBu_r', center=0, vmin=-vmax, vmax=vmax,
row_cluster=True, col_cluster=True,
method='ward', # seaborn uses scipy ward, equivalent to R ward.D2
metric='euclidean',
z_score=0, # 0 = rows, 1 = columns
col_colors=col_colors,
dendrogram_ratio=0.15,
cbar_pos=(0.02, 0.8, 0.03, 0.15),
figsize=(10, 12),
rasterized=True) # rasterize the cell layer
standard_scale vs z_score confusion:
z_score=0 standardizes ROWS to mean 0, SD 1 (most common request)z_score=1 standardizes COLUMNS to mean 0, SD 1standard_scale=0 rescales ROWS to [0, 1] via (x − min) / (max − min) — NOT z-scoring, compresses outliers nonlinearlyA heatmap published with standard_scale looks like a z-scored heatmap but the color encoding is not interpretable as standard deviations.
OncoPrint (Cerami 2012 Cancer Discov 2:401; canonical at cBioPortal) is a stylized heatmap for mutation matrices where each cell encodes multiple alteration types via overlapping rectangles. Different from generic heatmaps — see data-visualization/oncoprint-mutation-matrices for the dedicated skill. Mentioned here only to note that ComplexHeatmap::oncoPrint() is the R implementation and inherits all the cluster/annotation machinery of Heatmap().
Trigger: clustering_method = 'ward' or 'ward.D' (with or without the .D).
Mechanism: R hclust ward.D is a legacy implementation that does NOT use squared distances — it does not implement Ward's minimum-variance criterion (Murtagh-Legendre 2014).
Symptom: Different dendrogram than published papers that say "Ward"; clusters look subtly different; reproducibility issues across R versions.
Fix: Always specify ward.D2 explicitly. For pheatmap and ComplexHeatmap, pass clustering_method_rows = 'ward.D2' and clustering_method_columns = 'ward.D2'.
Trigger: Plotting matrix without quantile clipping; one extreme value dominates the color range.
Mechanism: Default colorRamp2(c(min(mat), 0, max(mat)), ...) is dominated by the outlier — the rest of the matrix renders within a narrow band of pale colors.
Symptom: Heatmap looks "washed out" except for one cell or one column; biological pattern invisible.
Fix: bounds <- quantile(abs(mat), 0.99); col_fun <- colorRamp2(c(-bounds, 0, bounds), c('#0072B2', 'white', '#D55E00')). ComplexHeatmap's colorRamp2 does NOT clip values exceeding the range — they render at the extreme color, which is the intended behavior.
Trigger: Bare Heatmap(mat) inside a for loop, lapply, function(), Quarto/Rmd code chunk, or Rscript invocation.
Mechanism: Auto-print only happens at the top-level R prompt; in non-interactive contexts the Heatmap object is created but never rendered.
Symptom: No error, no warning, no PDF output. Looks like the script ran successfully.
Fix: Always wrap in draw(): pdf('out.pdf', ...); draw(Heatmap(mat, ...)); dev.off(). Use the draw() call to set legend layout: draw(ht, merge_legends = TRUE, heatmap_legend_side = 'right').
Trigger: cluster_columns = TRUE when columns are an ordered sequence (time points, dose levels, treatment stages).
Mechanism: Hierarchical clustering re-orders columns to maximize within-cluster similarity, destroying the time/dose axis.
Symptom: Time-course heatmap with time points scrambled; reader cannot follow temporal pattern.
Fix: cluster_columns = FALSE for ordered conditions. To group while preserving order, use column_split or column_order explicitly.
Trigger: Row z-scoring a matrix with many zero values (e.g., single-cell expression, sparse peak counts).
Mechanism: (x − mean) / sd is ill-defined when a row is mostly zeros — sd is dominated by the few non-zero values; z-scores explode for the non-zero entries.
Symptom: A few cells render as extreme colors; most cells are washed-out near-zero.
Fix: For single-cell, work with cluster-summarized pseudobulk matrices, not raw single-cell expression. For sparse peak data, filter rows by minimum non-zero count before scaling.
Trigger: clustering_distance_rows = 'correlation' on a matrix where samples have a strong batch effect.
Mechanism: Correlation preserves co-regulation pattern but is sensitive to global structure. If batch shifts ALL genes up in one batch, correlation distance reads this as "co-regulation."
Symptom: Modules cluster by batch, not by biology.
Fix: Batch-correct (limma::removeBatchEffect or ComBat) before clustering. Confirm with PCA that batch is no longer the dominant axis.
gaps_col interacts with cluster_colsTrigger: Setting gaps_col = c(5, 10) with cluster_cols = TRUE.
Mechanism: When cluster_cols = TRUE, gaps_col is silently ignored; the dendrogram-determined order has no concept of position.
Symptom: No gaps appear; no warning.
Fix: To use gaps_col, set cluster_cols = FALSE and pre-arrange the columns explicitly. Or in ComplexHeatmap use column_split to combine clustering AND visual gaps.
Trigger: use_raster = TRUE (default for large heatmaps in ComplexHeatmap) with default raster_quality = 1.
Mechanism: Default raster quality is set for screen rendering; the bitmap is upscaled for PDF output, producing blocky cells.
Symptom: Heatmap cells look pixelated at print zoom; diagonal "stair-stepping" on cell boundaries.
Fix: Heatmap(..., use_raster = TRUE, raster_quality = 5) increases the raster resolution. Set raster_device = 'CairoPNG' for transparency support.
| Pattern | Likely cause | Action |
|---------|--------------|--------|
| ComplexHeatmap and pheatmap give different dendrograms | pheatmap uses dist() with default method = 'euclidean'; ComplexHeatmap defaults to the same but different clustering distance defaults | Verify both with clustering_distance_rows = 'euclidean', clustering_method_rows = 'ward.D2' explicitly |
| Same code, different dendrogram across R versions | R 3.1 renamed 'ward' to 'ward.D' and added 'ward.D2' | Always specify ward.D2 explicitly; never 'ward' |
| Z-scored heatmap with extreme colors only in a few cells | Sparse matrix with zero-inflation | Filter low-expression rows; OR shift to robust scaling ((x - median) / mad) |
| Modules cluster by batch | Correlation distance picked up batch effect | Batch-correct upstream; verify via PCA |
| seaborn clustermap produces different clusters than R | seaborn method='ward' calls scipy.cluster.hierarchy.linkage which IS ward.D2-equivalent; difference is usually metric default | Set metric='euclidean' explicitly in both |
| OncoPrint mutual-exclusivity panel appears empty | column_order is being computed by clustering instead of preserved | Pass column_order = ... explicitly to oncoPrint |
Operational rule: a clustered heatmap is reproducible only when scaling, distance, linkage, color bounds, and (for OLO) the seriation method are all explicitly stated. Defaults differ across packages and across versions of the same package.
| Threshold | Value | Source |
|-----------|-------|--------|
| Robust color bound | 1st-99th percentile of |matrix| | Standard publication practice; suppresses single-outlier dominance |
| Raster trigger | >2000 rows or >2000 columns | ComplexHeatmap default use_raster = TRUE above 2000 |
| OLO practical limit | ~5000 rows | O(n^4) worst case; modern Bar-Joseph implementations faster |
| z-score symmetry | bounds around 0 | Z-scores are symmetric by construction |
| Minimum non-zero count to z-score | >=3 non-zero values per row | Below this sd is unreliable |
| Single-cell pseudobulk threshold | Downsample to <500 cells/group | Otherwise PDF rendering hangs |
| Error / symptom | Cause | Solution |
|-----------------|-------|----------|
| No PDF output from a script | Bare Heatmap() without draw() | Wrap in draw(); use pdf() / dev.off() explicitly |
| Color scale washed out | One outlier dominates | Clip to 1st-99th percentile; symmetric bounds for diverging |
| Time-course columns scrambled | cluster_columns = TRUE on ordered data | cluster_columns = FALSE; use column_split |
| pheatmap gaps_col ignored | Conflict with cluster_cols = TRUE | Disable clustering OR switch to ComplexHeatmap split |
| Dendrogram differs from a paper | Default clustering_method mismatch | Always specify ward.D2; never ward |
| seaborn standard_scale interpreted as z-score | Different rescaling functions | Use z_score=0 for row z-scoring; standard_scale is min-max not z |
| Heatmap renders pixelated | Default raster_quality = 1 | Set raster_quality = 5 for publication |
| Z-score blows up for some rows | Sparse rows; near-zero sd | Filter low-expression rows OR robust scale |
| Modules cluster by batch not biology | Batch effect not removed | limma::removeBatchEffect or ComBat upstream |
| Pushback | Standard response |
|----------|-------------------|
| "Which clustering method?" | Explicit ward.D2 (Murtagh-Legendre 2014) on row-z-scored Euclidean distance. Alternatives evaluated in supplementary |
| "How were leaves ordered?" | Optimal Leaf Ordering via seriation::seriate (Bar-Joseph 2001); reduces visually-adjacent dissimilarity |
| "Why this color scale?" | Diverging palette symmetric around zero, bounds = 1st-99th percentile of |z|. Robust to outliers per standard practice |
| "Why z-score?" | Removes absolute level so co-regulated genes cluster regardless of magnitude. Raw values clustered separately (supplementary) |
| "Why row split by pathway?" | Pre-specified gene-set annotation (KEGG/Reactome) to verify clustering recovers known biology, not to bias it |
| "Reproducibility across R versions?" | ward.D2 is stable across R 3.1+; clustering_method = 'ward' is not — never used |
tools
--- name: bio-phasing-imputation-foundations description: Frames the phasing/imputation pipeline before any tool runs: phasing and imputation are one Li-Stephens copying HMM (recombination is the transition, mutation the emission, the genetic map and Ne set the rates), imputation's honest output is a dosage with a self-estimated quality (INFO/R2/DR2) not a hard genotype, and the stages are ordered and each fails silently (QC, align build and strand to the panel, phase, impute per chromosome, fil
tools
Chooses the enrichment generation before any tool runs, mapping the input shape to a method class - a pre-selected gene list plus a background to over-representation analysis (ORA, hypergeometric), a ranked statistic for all genes to gene set enrichment (GSEA), a signed signaling topology to pathway-topology (SPIA) - then making the null explicit (competitive vs self-contained, gene vs subject sampling) and running a trustworthiness checklist (testable-gene universe, FDR, redundancy collapse, leading-edge check, version reporting). Covers why every clusterProfiler GSEA is the inter-gene-correlation-uncorrected competitive null, why the background not the gene list decides ORA significance, and why no method is universally best. Use when deciding ORA vs GSEA vs topology, which gene-set DB, whether a result is trustworthy, or which null a tool computes. For ORA see go-enrichment, GSEA see gsea, databases kegg-pathways/reactome-pathways/wikipathways; the ranking comes from differential-expression/de-results.
testing
End-to-end GWAS workflow from VCF to association results. Covers PLINK QC, population structure correction, and association testing for case-control or quantitative traits. Use when running genome-wide association studies.
development
Orchestrates the full path from differential expression results to redundancy-collapsed functional enrichment: choose ORA vs GSEA, convert gene IDs per method, run enrichGO/enrichKEGG/enrichPathway/enrichWP or gseGO/gseKEGG (clusterProfiler, ReactomePA, rWikiPathways), and visualize. Routes the ORA-vs-GSEA generation fork and the null/universe/reproducibility theory to pathway-analysis/enrichment-foundations. Use when a DESeq2/edgeR/limma result must become enriched GO terms, KEGG/Reactome/WikiPathways pathways, or a GSEA leading edge; when deciding whether a ranking exists for all genes (GSEA, named decreasing vector) or only a pre-selected list (ORA plus a defensible background universe); or when assembling DE-to-pathway end to end. The DE list and ranking statistic come from differential-expression/de-results; per-method nuance lives in the pathway-analysis skills.