pathway-analysis/wikipathways/SKILL.md
WikiPathways enrichment using clusterProfiler and rWikiPathways. Use when analyzing gene lists against community-curated open-source pathways. Performs over-representation analysis and GSEA for 30+ species.
Reference examples tested with: ReactomePA 1.46+, clusterProfiler 4.10+, rWikiPathways 1.24+
Before using code patterns, verify that installed versions match. If versions differ, run `packageVersion('<pkg>')` and then `?function_name` to verify parameters.
If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
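The version check can be scripted. The sketch below is one possible way to do it, assuming the minimum versions from the "tested with" line; `check_versions` is a hypothetical helper written for this example, not a function from any of the packages.

```r
# Minimum versions copied from the "tested with" line; adjust as needed.
min_versions <- c(
  clusterProfiler = '4.10.0',
  rWikiPathways   = '1.24.0',
  ReactomePA      = '1.46.0'
)

# check_versions() is a hypothetical helper, not part of any package.
check_versions <- function(min_versions) {
  status <- vapply(names(min_versions), function(pkg) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      return('not installed')
    }
    if (packageVersion(pkg) >= min_versions[[pkg]]) 'ok' else 'older than tested'
  }, character(1))
  data.frame(package = names(min_versions), status = unname(status),
             stringsAsFactors = FALSE)
}

version_status <- check_versions(min_versions)
print(version_status)
```

Any row not reporting `ok` suggests checking the function help pages before running the examples.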
WikiPathways is community-curated (wiki model), not expert-curated or peer-reviewed like KEGG and Reactome, so check the "Last edited" date and contributor for a specific pathway before relying on it for key conclusions.
Goal: Identify WikiPathways that are over-represented in a gene list.
Approach: Test for enrichment using enrichWP against community-curated open-source pathway definitions.
"Run pathway enrichment against WikiPathways" -> Test whether genes from community-curated WikiPathways are over-represented among significant genes.
```r
library(clusterProfiler)
library(org.Hs.eg.db)

wp_result <- enrichWP(
  gene = entrez_ids,           # Character vector of Entrez IDs
  organism = 'Homo sapiens',   # Full species name
  pvalueCutoff = 0.05,
  pAdjustMethod = 'BH'
)
head(as.data.frame(wp_result))
```
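Over-representation analysis of this kind rests on a hypergeometric test for each pathway. The base R sketch below illustrates that calculation for a single pathway; every count is invented for illustration, and it is not a reproduction of the internals of `enrichWP`.

```r
# All counts below are invented for illustration.
universe_size  <- 20000  # N: genes in the background
pathway_size   <- 100    # M: background genes annotated to the pathway
gene_list_size <- 500    # n: significant genes submitted
overlap        <- 10     # k: significant genes that fall in the pathway

# P(X >= k) under the hypergeometric distribution.
p_value <- phyper(overlap - 1, pathway_size, universe_size - pathway_size,
                  gene_list_size, lower.tail = FALSE)

# The same test expressed as a one-sided Fisher's exact test on the 2x2 table.
contingency <- matrix(
  c(overlap, gene_list_size - overlap,
    pathway_size - overlap,
    universe_size - pathway_size - gene_list_size + overlap),
  nrow = 2
)
p_fisher <- fisher.test(contingency, alternative = 'greater')$p.value

# Overlap expected by chance alone.
expected_overlap <- gene_list_size * pathway_size / universe_size
```

Comparing `overlap` with `expected_overlap` gives an intuitive sense of effect size alongside the p-value.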
Goal: Extract significant Entrez gene IDs from DE results for WikiPathways enrichment.
Approach: Filter by significance thresholds and convert gene symbols to Entrez IDs with bitr.
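```r
library(clusterProfiler)
library(org.Hs.eg.db)
de_results <- read.csv('de_results.csv')
# which() drops rows where padj or log2FoldChange is NA
sig_rows <- which(de_results$padj < 0.05 & abs(de_results$log2FoldChange) > 1)
sig_genes <- de_results$gene_symbol[sig_rows]
gene_ids <- bitr(sig_genes, fromType = 'SYMBOL', toType = 'ENTREZID', OrgDb = org.Hs.eg.db)
entrez_ids <- gene_ids$ENTREZID
```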
Goal: Detect coordinated expression changes in WikiPathways using a ranked gene list.
Approach: Sort genes by fold change and run gseWP for rank-based enrichment testing.
```r
# Create ranked gene list (named numeric vector, sorted in decreasing order)
gene_list <- de_results$log2FoldChange
names(gene_list) <- de_results$entrez_id
gene_list <- sort(gene_list, decreasing = TRUE)

gsea_wp <- gseWP(
  geneList = gene_list,
  organism = 'Homo sapiens',
  pvalueCutoff = 0.05,
  pAdjustMethod = 'BH'
)
head(as.data.frame(gsea_wp))
```
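Ranked lists built directly from a DE table often contain missing IDs, missing statistics, or duplicated Entrez IDs, which can cause problems in rank-based tests. The sketch below shows one possible cleanup in base R on a toy table. The data are invented, and keeping the largest absolute fold change per duplicated ID is an assumption (averaging is another common choice).

```r
# Toy DE table; values are invented. Real input would be de_results.
toy_de <- data.frame(
  entrez_id = c('7157', '1956', NA, '7157', '672'),
  log2FoldChange = c(2.1, -1.4, 0.8, -3.0, NA),
  stringsAsFactors = FALSE
)

# Drop rows lacking an ID or a finite statistic.
keep <- !is.na(toy_de$entrez_id) & is.finite(toy_de$log2FoldChange)
toy_de <- toy_de[keep, ]

# For duplicated IDs keep the row with the largest absolute fold change
# (an assumption made for this sketch).
toy_de <- toy_de[order(abs(toy_de$log2FoldChange), decreasing = TRUE), ]
toy_de <- toy_de[!duplicated(toy_de$entrez_id), ]

# Named numeric vector sorted in decreasing order, as the ranked list requires.
ranked <- setNames(toy_de$log2FoldChange, toy_de$entrez_id)
ranked <- sort(ranked, decreasing = TRUE)
```

The resulting `ranked` vector has the same shape as `gene_list` in the example above.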
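```r
# Use all tested genes as the background; universe should be a character vector
all_genes <- as.character(de_results$entrez_id)

wp_result <- enrichWP(
  gene = entrez_ids,
  universe = all_genes,
  organism = 'Homo sapiens',
  pvalueCutoff = 0.05
)
```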
```r
# Convert Entrez IDs to gene symbols
wp_readable <- setReadable(wp_result, OrgDb = org.Hs.eg.db, keyType = 'ENTREZID')
```
Goal: Create summary plots of WikiPathways enrichment results.
Approach: Use enrichplot functions (dotplot, barplot, cnetplot, emapplot) on the enrichment result object.
```r
library(enrichplot)

# Dot plot
dotplot(wp_result, showCategory = 15)

# Bar plot
barplot(wp_result, showCategory = 15)

# Gene-concept network
cnetplot(wp_readable, categorySize = 'pvalue')

# Enrichment map (requires pairwise term similarity first)
wp_result <- pairwise_termsim(wp_result)
emapplot(wp_result)
```
Goal: Query the WikiPathways database directly for pathway metadata, gene lists, and GMT files.
Approach: Use rWikiPathways API functions to list organisms, retrieve pathway info, and download gene set definitions.
```r
library(rWikiPathways)

# List available organisms
listOrganisms()

# Get all pathways for an organism
human_pathways <- listPathways('Homo sapiens')

# Get pathway info
pathway_info <- getPathwayInfo('WP554')  # ACE Inhibitor Pathway

# Get genes in a pathway
pathway_genes <- getXrefList('WP554', 'H')   # HGNC symbols
pathway_entrez <- getXrefList('WP554', 'L')  # Entrez IDs

# Download pathway as GMT for custom analysis
downloadPathwayArchive(organism = 'Homo sapiens', format = 'gmt')
```
Goal: Run enrichment using a downloaded WikiPathways GMT file for offline or custom analysis.
Approach: Download the GMT archive via rWikiPathways, read it with read.gmt, and run enricher.
```r
library(clusterProfiler)
library(rWikiPathways)

# Download WikiPathways GMT; the archive name includes a release date,
# so reuse the file name returned by downloadPathwayArchive()
gmt_file <- downloadPathwayArchive(organism = 'Homo sapiens', format = 'gmt', destpath = '.')

# Read GMT and run enrichment
wp_gmt <- read.gmt(gmt_file)
wp_custom <- enricher(
  gene = entrez_ids,
  TERM2GENE = wp_gmt,
  pvalueCutoff = 0.05
)
```
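WikiPathways GMT term names usually pack several fields into one string, which makes raw `read.gmt` output hard to read. The sketch below splits such names in base R, assuming the layout `name%version%WPID%species` followed by a description column and then gene IDs. That layout is an assumption to verify against the downloaded file, the two GMT lines are made up, and `parse_wp_gmt` is a hypothetical helper.

```r
# Two made-up GMT lines in the assumed layout; gene IDs are placeholders.
gmt_lines <- c(
  'Example pathway A%WikiPathways_20240101%WP0001%Homo sapiens\thttps://example.org/WP0001\t1\t2\t3',
  'Example pathway B%WikiPathways_20240101%WP0002%Homo sapiens\thttps://example.org/WP0002\t2\t4'
)

# parse_wp_gmt() is a hypothetical helper returning one row per term-gene pair.
parse_wp_gmt <- function(lines) {
  fields <- strsplit(lines, '\t', fixed = TRUE)
  rows <- lapply(fields, function(f) {
    parts <- strsplit(f[1], '%', fixed = TRUE)[[1]]
    data.frame(wpid = parts[3], name = parts[1], gene = f[-(1:2)],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

wp_long <- parse_wp_gmt(gmt_lines)
term2gene <- wp_long[, c('wpid', 'gene')]           # candidate TERM2GENE table
term2name <- unique(wp_long[, c('wpid', 'name')])   # candidate TERM2NAME table
```

Tables shaped like `term2gene` and `term2name` can be passed to `enricher` as `TERM2GENE` and `TERM2NAME`. clusterProfiler also exports `read.gmt.wp` for this purpose; check `?read.gmt.wp` in the installed version before relying on it.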
```r
# Mouse
wp_mouse <- enrichWP(gene = mouse_entrez, organism = 'Mus musculus')

# Rat
wp_rat <- enrichWP(gene = rat_entrez, organism = 'Rattus norvegicus')

# Zebrafish
wp_zfish <- enrichWP(gene = zfish_entrez, organism = 'Danio rerio')

# List all available organisms
library(rWikiPathways)
listOrganisms()
```
Goal: Compare WikiPathways enrichment across multiple gene lists (e.g., upregulated vs downregulated).
Approach: Use compareCluster with enrichWP to run enrichment per group and visualize with dotplot.
```r
gene_clusters <- list(
  upregulated = up_genes,
  downregulated = down_genes
)

compare_wp <- compareCluster(
  geneClusters = gene_clusters,
  fun = 'enrichWP',
  organism = 'Homo sapiens',
  pvalueCutoff = 0.05
)
dotplot(compare_wp)
```
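The `up_genes` and `down_genes` vectors used above are not defined in this document. One possible way to derive them is sketched below on a toy table. The values are invented, the column names follow the earlier examples, and the thresholds (`padj < 0.05`, absolute log2 fold change above 1) are carried over from the gene list preparation step as an assumption.

```r
# Toy DE table with invented values; column names follow the examples above.
toy_de <- data.frame(
  entrez_id = c('1', '2', '3', '4', '5'),
  log2FoldChange = c(2.5, -1.8, 0.3, 1.2, -2.2),
  padj = c(0.001, 0.01, 0.5, NA, 0.03),
  stringsAsFactors = FALSE
)

# which() drops rows where padj is NA instead of propagating NA into the index.
is_sig <- toy_de$padj < 0.05
up_genes <- toy_de$entrez_id[which(is_sig & toy_de$log2FoldChange > 1)]
down_genes <- toy_de$entrez_id[which(is_sig & toy_de$log2FoldChange < -1)]

gene_clusters <- list(upregulated = up_genes, downregulated = down_genes)
```

Testing the two directions separately can reveal pathways that a combined list would dilute.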
```r
results_df <- as.data.frame(wp_result)
write.csv(results_df, 'wikipathways_enrichment.csv', row.names = FALSE)
```
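Before exporting, it can help to add a numeric effect size, since `GeneRatio` and `BgRatio` are stored as `'k/n'` strings in the result table. The sketch below computes a fold enrichment column on a toy data frame that mimics those columns. The rows are invented, `ratio_to_numeric` is a hypothetical helper, and some clusterProfiler versions may already report a similar column.

```r
# Toy rows mimicking columns of as.data.frame(wp_result); values invented.
toy_results <- data.frame(
  ID = c('WP0001', 'WP0002'),
  GeneRatio = c('10/200', '4/200'),
  BgRatio = c('50/10000', '100/10000'),
  p.adjust = c(0.001, 0.2),
  stringsAsFactors = FALSE
)

# ratio_to_numeric() is a hypothetical helper: 'k/n' string -> numeric k / n.
ratio_to_numeric <- function(x) {
  parts <- strsplit(x, '/', fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) / as.numeric(p[2]), numeric(1))
}

# Fold enrichment = fraction in the gene list / fraction in the background.
toy_results$fold_enrichment <- ratio_to_numeric(toy_results$GeneRatio) /
  ratio_to_numeric(toy_results$BgRatio)

# Most significant pathways first.
toy_results <- toy_results[order(toy_results$p.adjust), ]
```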
| Parameter | Default | Description |
|-----------|---------|-------------|
| gene | required | Vector of Entrez IDs |
| organism | required | Full species name |
| pvalueCutoff | 0.05 | P-value threshold |
| pAdjustMethod | BH | Adjustment method |
| universe | NULL | Background genes |
| minGSSize | 10 | Min genes per pathway |
| maxGSSize | 500 | Max genes per pathway |

| Common Name | Scientific Name |
|-------------|-----------------|
| Human | Homo sapiens |
| Mouse | Mus musculus |
| Rat | Rattus norvegicus |
| Zebrafish | Danio rerio |
| Fruit fly | Drosophila melanogaster |
| C. elegans | Caenorhabditis elegans |
| Arabidopsis | Arabidopsis thaliana |
| Yeast | Saccharomyces cerevisiae |

| Feature | WikiPathways | KEGG | Reactome |
|---------|--------------|------|----------|
| Curation | Community | Expert | Peer-reviewed |
| License | Open (CC0) | Commercial | Open |
| Species | 30+ | 4000+ | 7 |
| Focus | Disease, drug | Metabolic | Signaling |
| Updates | Continuous | Ongoing | Quarterly |