pathway-analysis/enrichment-visualization/SKILL.md
Visualize enrichment results using enrichplot package functions. Use when creating publication-quality figures from clusterProfiler results. Covers dotplot, barplot, cnetplot, emapplot, gseaplot2, ridgeplot, and treeplot.
Reference examples tested with: ggplot2 3.5+
Before using code patterns, verify installed versions match. If versions differ:
- Run packageVersion('<pkg>'), then ?function_name to verify parameters.

If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
"Create publication-quality plots from my enrichment analysis" -> Generate dotplots, gene-concept networks, enrichment maps, GSEA running score plots, and ridgeplots from clusterProfiler results.
dotplot(), cnetplot(), emapplot(), gseaplot2() (enrichplot)
This skill covers enrichplot package functions designed for clusterProfiler results:
- dotplot(), barplot() - Summary views
- cnetplot(), emapplot(), treeplot() - Network/hierarchical views
- gseaplot2(), ridgeplot() - GSEA-specific
- goplot(), heatplot(), upsetplot() - Specialized views

For custom ggplot2 dotplots and statistical annotation, see data-visualization/distribution-plots and data-visualization/ggplot2-fundamentals.
Goal: Load required packages for visualizing enrichment analysis results.
Approach: Import clusterProfiler, enrichplot, and ggplot2 which provide the plotting functions for enrichment objects.
```r
library(clusterProfiler)
library(enrichplot)
library(ggplot2)
# Assume ego (enrichGO result), kk (enrichKEGG result), or gse (GSEA result) exists
```
Goal: Summarize enrichment results showing gene ratio, count, and significance in a single figure.
Approach: Use enrichplot dotplot which maps gene ratio to x-axis, term to y-axis, dot size to count, and color to p-value.
Most common visualization - shows gene ratio, count, and significance.
```r
dotplot(ego, showCategory = 20)
# Customize
dotplot(ego, showCategory = 15, font.size = 10, title = 'GO Enrichment') +
  scale_color_gradient(low = 'red', high = 'blue')
# Save
pdf('go_dotplot.pdf', width = 10, height = 8)
print(dotplot(ego, showCategory = 20))  # explicit print() also draws inside functions, loops, and source()
dev.off()
```
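
To make the dot plot's x-axis concrete, the sketch below rebuilds the gene ratio values from a toy results table in base R. It assumes the table stores GeneRatio as a "k/n" string, as clusterProfiler ORA output does; the table contents and the helper name parse_ratio are invented for illustration.

```r
# Toy stand-in for as.data.frame(ego); the values are illustrative only.
res <- data.frame(
  Description = c("immune response", "cell cycle", "DNA repair"),
  GeneRatio   = c("30/200", "12/200", "18/200"),
  p.adjust    = c(1e-8, 3e-3, 2e-5),
  Count       = c(30, 12, 18),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn "k/n" strings into numeric ratios.
parse_ratio <- function(x) {
  parts <- strsplit(x, "/", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) / as.numeric(p[2]), numeric(1))
}

res$ratio <- parse_ratio(res$GeneRatio)

# Order terms so the largest gene ratios come first.
res <- res[order(res$ratio, decreasing = TRUE), ]
print(res[, c("Description", "ratio", "Count", "p.adjust")])
```

The three plotted dimensions correspond to the ratio, Count, and p.adjust columns here.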
barplot() shows enrichment count or gene ratio per term.
```r
barplot(ego, showCategory = 20)
# Customize
barplot(ego, showCategory = 15, x = 'GeneRatio', color = 'p.adjust')
```
Goal: Visualize which genes contribute to multiple enriched terms, revealing shared biology.
Approach: Build a bipartite network connecting enriched terms to their member genes, optionally colored by fold change.
Shows relationships between genes and enriched terms.
```r
# Basic cnetplot
cnetplot(ego)
# With fold change colors
cnetplot(ego, foldChange = gene_list)
# Circular layout
cnetplot(ego, circular = TRUE, colorEdge = TRUE)
# Label gene nodes only and shrink the gene label text
cnetplot(ego, node_label = 'gene', cex_label_gene = 0.8)
```
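
The foldChange examples rely on a gene_list object that this section does not define. A minimal sketch of one way to build it, assuming a differential-expression table with hypothetical columns gene_id and log2FoldChange, and assuming the IDs are the same identifiers that were supplied to the enrichment function:

```r
# Hypothetical differential-expression results; column names are assumptions.
de <- data.frame(
  gene_id        = c("7157", "672", "1956", "4609"),
  log2FoldChange = c(-1.8, 2.3, 0.4, 3.1),
  stringsAsFactors = FALSE
)

# Named numeric vector: values are fold changes, names are gene IDs.
gene_list <- setNames(de$log2FoldChange, de$gene_id)

# Sort in decreasing order, the ranking that GSEA-style functions expect.
gene_list <- sort(gene_list, decreasing = TRUE)
print(gene_list)
```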
Goal: Identify clusters of related enriched terms by visualizing shared gene overlap.
Approach: Compute pairwise term similarity, then plot as a network where edges connect terms sharing genes.
Shows term-term relationships based on shared genes.
```r
# Requires pairwise_termsim first
ego_pt <- pairwise_termsim(ego)
emapplot(ego_pt)
# Customize
emapplot(ego_pt, showCategory = 30, cex_label_category = 0.6)
# Cluster by similarity
emapplot(ego_pt, group_category = TRUE, group_legend = TRUE)
```
```r
# Default: Jaccard Coefficient (works with any gene set type)
ego_pt <- pairwise_termsim(ego)
# For GO terms: Wang semantic similarity (more biologically meaningful)
# godata() comes from the GOSemSim package, which is not attached by the setup above
ego_pt <- pairwise_termsim(ego, method = 'Wang',
                           semData = GOSemSim::godata('org.Hs.eg.db', ont = 'BP'))
```

| Method | Type | When to Use |
|--------|------|-------------|
| JC (Jaccard) | Gene overlap | Default; works with KEGG, Reactome, any gene set |
| Wang | Graph-based | Best for GO; captures biological relationships independent of annotation version |
| Resnik/Lin/Jiang | IC-based | GO only; depends on annotation corpus (results change between database releases) |
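
To make the default JC method concrete, here is a small base-R sketch of the Jaccard coefficient between gene sets. The term names and gene IDs are invented, and jaccard() is an illustrative helper, not an enrichplot function.

```r
# Toy gene sets keyed by term; contents are illustrative only.
gene_sets <- list(
  termA = c("g1", "g2", "g3", "g4"),
  termB = c("g3", "g4", "g5"),
  termC = c("g6", "g7")
)

# Jaccard coefficient: shared genes divided by genes in either set.
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# Pairwise similarity matrix across all terms.
sim <- sapply(gene_sets, function(a) {
  sapply(gene_sets, function(b) jaccard(a, b))
})
print(round(sim, 2))
```

Terms with no shared genes (termC here) get a similarity of zero against the others, which is why unrelated terms end up unconnected in a similarity network.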
treeplot() draws a hierarchical clustering of enriched terms.
```r
ego_pt <- pairwise_termsim(ego)
treeplot(ego_pt)
# Show more categories
treeplot(ego_pt, showCategory = 30)
```
upsetplot() shows overlapping genes between terms.
```r
upsetplot(ego)
# Limit to specific number of terms
upsetplot(ego, n = 10)
```
```r
# Single gene set
gseaplot2(gse, geneSetID = 1, title = gse$Description[1])
# Multiple gene sets
gseaplot2(gse, geneSetID = 1:3)
# With subplots
gseaplot2(gse, geneSetID = 1, subplots = 1:3)
# By term ID
gseaplot2(gse, geneSetID = 'GO:0006955')
```
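
The running enrichment score that gseaplot2() draws can be sketched in base R as a weighted running sum over the ranked gene list. This is a simplified illustration of the standard GSEA statistic with weight exponent 1, not the code that clusterProfiler or enrichplot runs; the ranked statistics and gene set are made up, and running_score() is a hypothetical helper.

```r
# Toy ranked statistics (sorted decreasing) and a toy gene set.
stats <- c(g1 = 2.5, g2 = 2.0, g3 = 1.2, g4 = 0.3, g5 = -0.4,
           g6 = -1.1, g7 = -1.9, g8 = -2.6)
gene_set <- c("g1", "g2", "g4")

# Hypothetical helper: running enrichment score along the ranked list.
running_score <- function(stats, gene_set, exponent = 1) {
  in_set <- names(stats) %in% gene_set
  # Hits step up in proportion to |statistic|^exponent.
  hit_weight <- ifelse(in_set, abs(stats)^exponent, 0)
  p_hit <- cumsum(hit_weight) / sum(hit_weight)
  # Misses step down by a constant amount.
  p_miss <- cumsum(!in_set) / sum(!in_set)
  p_hit - p_miss
}

rs <- running_score(stats, gene_set)
# Enrichment score: the running value with the largest absolute deviation from zero.
es <- rs[which.max(abs(rs))]
print(round(rs, 3))
print(es)
```

Because the toy set's genes sit near the top of the ranking, the curve peaks early, which is the pattern to look for when checking where genes cluster in the ranked list.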
ridgeplot() shows the distribution of fold changes in gene sets.
```r
ridgeplot(gse)
# Top n gene sets
ridgeplot(gse, showCategory = 15)
# Shrink y-axis labels when term names are long
ridgeplot(gse, showCategory = 20) + theme(axis.text.y = element_text(size = 8))
```
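Reading ridge plots: check the direction of each distribution (a ridge shifted right of zero means mostly positive fold changes, shifted left means mostly negative) and its shape (narrow vs. broad).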
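
The direction and shape that a ridge plot shows can also be checked numerically. A minimal sketch, assuming core-enrichment genes are stored as "/"-separated strings (as in a gseaResult table) and using invented gene IDs and fold changes:

```r
# Toy ranked fold changes and toy core-enrichment strings; values are illustrative.
gene_list <- c(g1 = 2.4, g2 = 1.9, g3 = 1.5, g4 = -0.2, g5 = -1.7, g6 = -2.2)
core <- c(set_up = "g1/g2/g3", set_down = "g4/g5/g6")

# Split each string into gene IDs and look up their fold changes.
fc_by_set <- lapply(strsplit(core, "/", fixed = TRUE), function(ids) gene_list[ids])

# Median gives the direction of the shift; IQR gives a rough measure of spread.
summary_tbl <- data.frame(
  set    = names(fc_by_set),
  median = vapply(fc_by_set, median, numeric(1)),
  iqr    = vapply(fc_by_set, IQR, numeric(1)),
  row.names = NULL
)
print(summary_tbl)
```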
goplot() draws the DAG structure of GO terms.
```r
# Only for GO enrichment results
goplot(ego)
# Specific ontology
goplot(ego_bp) # where ego_bp is enrichGO with ont='BP'
```
heatplot() draws a gene-concept heatmap.
```r
heatplot(ego, foldChange = gene_list)
# Customize
heatplot(ego, showCategory = 15, foldChange = gene_list)
```
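
The gene-concept matrix behind a heatmap like this can be sketched in base R. The example assumes the result table stores member genes in a "/"-separated geneID column, as enrichResult tables do; the terms and genes are invented.

```r
# Toy stand-in for the Description and geneID columns of an enrichment table.
res <- data.frame(
  Description = c("termA", "termB", "termC"),
  geneID      = c("g1/g2/g3", "g2/g3/g4", "g5"),
  stringsAsFactors = FALSE
)

# One row per term-gene pair.
genes_by_term <- strsplit(res$geneID, "/", fixed = TRUE)
edges <- data.frame(
  term = rep(res$Description, lengths(genes_by_term)),
  gene = unlist(genes_by_term),
  stringsAsFactors = FALSE
)

# Term-by-gene membership matrix (1 = gene belongs to term).
membership <- unclass(table(edges$term, edges$gene))
print(membership)

# Genes that appear in more than one term.
shared <- names(which(colSums(membership) > 1))
print(shared)
```

The edges table is the same term-gene pairing that a gene-concept network is built from, so it is also a quick way to list the genes shared across terms.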
Goal: Visualize enrichment results side by side across multiple gene lists or conditions.
Approach: Use dotplot on compareCluster output, optionally faceting by cluster.
```r
# Compare clusters (from compareCluster)
dotplot(ck, showCategory = 10)
# Facet by cluster
dotplot(ck) + facet_grid(~Cluster)
```
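
A compareCluster result can also be inspected as a term-by-cluster table before plotting. A base-R sketch on an invented data frame that mimics the Cluster, Description, and p.adjust columns of such a result:

```r
# Toy stand-in for as.data.frame(ck); cluster names and values are illustrative.
cc <- data.frame(
  Cluster     = c("treated", "treated", "control", "control"),
  Description = c("termA", "termB", "termA", "termC"),
  p.adjust    = c(1e-6, 2e-3, 4e-2, 5e-4),
  stringsAsFactors = FALSE
)

# Rows are terms, columns are clusters; NA marks terms not enriched in a cluster.
wide <- tapply(cc$p.adjust, list(cc$Description, cc$Cluster), min)
print(wide)

# Terms enriched in every cluster.
common <- rownames(wide)[rowSums(is.na(wide)) == 0]
print(common)
```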
Goal: Fine-tune enrichment plots with custom titles, themes, colors, and text sizes.
Approach: Chain ggplot2 modifiers onto enrichplot output, since most functions return ggplot2 objects.
Multi-panel plots such as gseaplot2() are assembled from several panels, so a layer added with + may not reach every panel.
```r
p <- dotplot(ego, showCategory = 20)
# Add title
p + ggtitle('GO Biological Process Enrichment')
# Change theme
p + theme_minimal()
# Adjust text
p + theme(axis.text.y = element_text(size = 10))
# Change colors
p + scale_color_viridis_c()
```
Goal: Export enrichment plots as publication-quality PDF or PNG files.
Approach: Use base R pdf/png device functions or ggplot2 ggsave to write plots to files.
```r
# PDF (vector, publication quality)
pdf('enrichment_plots.pdf', width = 10, height = 8)
print(dotplot(ego, showCategory = 20))  # explicit print() also draws inside functions, loops, and source()
dev.off()
# PNG (raster)
png('dotplot.png', width = 800, height = 600, res = 100)
print(dotplot(ego, showCategory = 20))
dev.off()
# Using ggsave
p <- dotplot(ego)
ggsave('dotplot.pdf', p, width = 10, height = 8)
```

| Function | Best For | Input Type |
|----------|----------|------------|
| dotplot | Overview of enrichment | ORA, GSEA |
| barplot | Simple counts/ratios | ORA |
| cnetplot | Gene-term relationships | ORA |
| emapplot | Term clustering | ORA |
| treeplot | Hierarchical grouping | ORA |
| upsetplot | Term overlap | ORA |
| gseaplot2 | Running enrichment score | GSEA |
| ridgeplot | Fold change distribution | GSEA |
| goplot | GO DAG structure | GO only |
| heatplot | Gene-concept matrix | ORA |

| Goal | Plot | Key Tip |
|------|------|---------|
| First overview of top enriched terms | dotplot | Best starting point; shows 3 dimensions (ratio, count, p-value) |
| Which genes drive multiple enriched terms | cnetplot | Limit to 5-10 terms; use circular = TRUE for crowded networks |
| Identify functional modules among terms | emapplot | Run pairwise_termsim() first; if everything connects to everything, results are redundant |
| GSEA: detailed single-pathway view | gseaplot2 | Check where genes cluster in the ranked list |
| GSEA: overview of all enriched sets | ridgeplot | Read direction (left/right shift) and shape (narrow vs broad) |
| Compare enrichment across conditions | dotplot on compareCluster | Use facet_grid(~Cluster) for side-by-side panels |

- Limit plots to showCategory = 15-20 terms.
- Run simplify() before plotting to remove redundant GO terms.