pathway-analysis/reactome-pathways/SKILL.md
Reactome pathway enrichment using ReactomePA package. Use when analyzing gene lists against Reactome's curated peer-reviewed pathway database. Performs over-representation analysis and GSEA with visualization and pathway hierarchy exploration.
npx skillsauth add GPTomics/bioSkills bio-pathway-reactomeInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Reference examples tested with: R stats (base), ReactomePA 1.46+, clusterProfiler 4.10+
Before using code patterns, verify installed versions match. If versions differ:
packageVersion('<pkg>') then ?function_name to verify parametersIf code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
| Scenario | Reactome? | Alternative | |----------|-----------|-------------| | Signaling pathway detail (reaction-level) | Yes -- best choice | KEGG (pathway-level only) | | Metabolic pathway focus | Supplement | KEGG has stronger metabolic coverage | | Reproducibility / open license required | Yes (CC0) | WikiPathways (CC0) | | Non-model organism (bacteria, plants) | No (7 species only) | KEGG (8,000+ species) | | Non-human model organism (mouse, rat, fly) | Caution | Annotations are computationally inferred via orthology from human; may contain errors |
Reactome pathways are curated by PhD-level biologists and externally peer-reviewed, making them the highest-quality curated pathway database. Human is the primary species; all others are computationally inferred.
Goal: Identify Reactome pathways over-represented in a gene list from differential expression or other analyses.
Approach: Test for enrichment using the hypergeometric test via ReactomePA enrichPathway against curated peer-reviewed pathways.
"Run pathway enrichment against Reactome" -> Test whether genes in curated Reactome pathways are over-represented among significant genes.
library(ReactomePA)
library(org.Hs.eg.db)
pathway_result <- enrichPathway(
gene = entrez_ids, # Character vector of Entrez IDs
organism = 'human', # human, rat, mouse, celegans, yeast, zebrafish, fly
pvalueCutoff = 0.05,
pAdjustMethod = 'BH',
readable = TRUE # Convert to gene symbols
)
head(as.data.frame(pathway_result))
Goal: Extract significant Entrez gene IDs from differential expression results for Reactome enrichment.
Approach: Filter by significance and fold change, then convert symbols to Entrez IDs using bitr.
library(clusterProfiler)
de_results <- read.csv('de_results.csv')
sig_genes <- de_results[de_results$padj < 0.05 & abs(de_results$log2FoldChange) > 1, 'gene_symbol']
gene_ids <- bitr(sig_genes, fromType = 'SYMBOL', toType = 'ENTREZID', OrgDb = org.Hs.eg.db)
entrez_ids <- gene_ids$ENTREZID
Goal: Detect coordinated expression changes in Reactome pathways using all genes ranked by a statistic.
Approach: Create a sorted named vector from DE results and run gsePathway for rank-based enrichment.
# Create ranked gene list (named vector sorted by statistic)
gene_list <- de_results$log2FoldChange
names(gene_list) <- de_results$entrez_id
gene_list <- sort(gene_list, decreasing = TRUE)
gsea_result <- gsePathway(
geneList = gene_list,
organism = 'human',
pvalueCutoff = 0.05,
pAdjustMethod = 'BH',
verbose = FALSE
)
head(as.data.frame(gsea_result))
Goal: Restrict enrichment testing to only genes that were actually measured in the experiment.
Approach: Pass all tested gene IDs as the universe parameter to enrichPathway.
all_genes <- de_results$entrez_id # All tested genes
pathway_result <- enrichPathway(
gene = entrez_ids,
universe = all_genes, # Background gene set
organism = 'human',
pvalueCutoff = 0.05,
readable = TRUE
)
Goal: Create publication-quality plots of Reactome enrichment results.
Approach: Use enrichplot functions (dotplot, barplot, emapplot, cnetplot, gseaplot2) on enrichment result objects.
library(enrichplot)
# Dot plot
dotplot(pathway_result, showCategory = 15)
# Bar plot
barplot(pathway_result, showCategory = 15)
# Enrichment map (requires pairwise_termsim first)
pathway_result <- pairwise_termsim(pathway_result)
emapplot(pathway_result)
# Gene-concept network
cnetplot(pathway_result, categorySize = 'pvalue')
# GSEA plot
gseaplot2(gsea_result, geneSetID = 1:3)
# Open pathway in Reactome browser
viewPathway('R-HSA-109582', organism = 'human') # Uses pathway ID
# Get pathway ID from results
top_pathway_id <- pathway_result@result$ID[1]
viewPathway(top_pathway_id, organism = 'human')
results_df <- as.data.frame(pathway_result)
write.csv(results_df, 'reactome_enrichment.csv', row.names = FALSE)
# Key columns: ID, Description, GeneRatio, BgRatio, pvalue, p.adjust, geneID, Count
# Mouse
pathway_mouse <- enrichPathway(gene = mouse_entrez, organism = 'mouse', readable = TRUE)
# Rat
pathway_rat <- enrichPathway(gene = rat_entrez, organism = 'rat', readable = TRUE)
# Zebrafish
pathway_zfish <- enrichPathway(gene = zfish_entrez, organism = 'zebrafish', readable = TRUE)
# Supported: human, rat, mouse, celegans, yeast, zebrafish, fly
Goal: Compare Reactome pathway enrichment across multiple gene lists (e.g., upregulated vs downregulated).
Approach: Use compareCluster with enrichPathway to run enrichment per group and visualize side by side.
# Compare pathways across multiple gene lists
gene_clusters <- list(
upregulated = up_genes,
downregulated = down_genes
)
compare_result <- compareCluster(
geneClusters = gene_clusters,
fun = 'enrichPathway',
organism = 'human',
pvalueCutoff = 0.05
)
dotplot(compare_result)
| Parameter | Default | Description | |-----------|---------|-------------| | gene | required | Vector of Entrez IDs | | organism | human | Species name | | pvalueCutoff | 0.05 | P-value threshold | | pAdjustMethod | BH | Adjustment method | | universe | NULL | Background genes | | minGSSize | 10 | Min genes per pathway | | maxGSSize | 500 | Max genes per pathway | | readable | FALSE | Convert to symbols |
| Organism | Name | OrgDb | |----------|------|-------| | Human | human | org.Hs.eg.db | | Mouse | mouse | org.Mm.eg.db | | Rat | rat | org.Rn.eg.db | | Zebrafish | zebrafish | org.Dr.eg.db | | Fly | fly | org.Dm.eg.db | | C. elegans | celegans | org.Ce.eg.db | | Yeast | yeast | org.Sc.sgd.db |
minGSSize = 10 to filter these out.testing
Analyze multi-modal single-cell data (CITE-seq, Multiome, spatial). Use when working with data that measures multiple modalities per cell like RNA + protein or RNA + ATAC. Use when analyzing CITE-seq, Multiome, or other multi-modal single-cell data.
data-ai
Analyze metabolite-mediated cell-cell communication using MeboCost for metabolic signaling inference between cell types. Predict metabolite secretion and sensing patterns from scRNA-seq data. Use when studying metabolic crosstalk between cell populations or metabolite-receptor interactions.
development
Find marker genes and annotate cell types in single-cell RNA-seq using Seurat (R) and Scanpy (Python). Use for differential expression between clusters, identifying cluster-specific markers, scoring gene sets, and assigning cell type labels. Use when finding marker genes and annotating clusters.
development
Reconstruct cell lineage trees from CRISPR barcode tracing or mitochondrial mutations. Use when studying clonal dynamics, cell fate decisions, or developmental trajectories.