scientific-skills/Data Analysis/go-kegg-enrichment/SKILL.md
Performs GO (Gene Ontology) and KEGG pathway enrichment analysis on gene lists.

Trigger when:
- User provides a list of genes (symbols or IDs) and asks for enrichment analysis
- User mentions "GO enrichment", "KEGG enrichment", "pathway analysis"
- User wants to understand biological functions of gene sets
- User provides differentially expressed genes (DEGs) and asks for interpretation

- Input: gene list (file or inline), organism (human/mouse/rat), background gene set (optional)
- Output: enriched terms, statistics, visualizations (barplot, dotplot, enrichment map)
npx skillsauth add aipoch/medical-research-skills go-kegg-enrichment
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Automated pipeline for Gene Ontology and KEGG pathway enrichment analysis with result interpretation and visualization.
| Common Name | Scientific Name | KEGG Code | OrgDB Package |
|-------------|-----------------|-----------|---------------|
| Human | Homo sapiens | hsa | org.Hs.eg.db |
| Mouse | Mus musculus | mmu | org.Mm.eg.db |
| Rat | Rattus norvegicus | rno | org.Rn.eg.db |
| Zebrafish | Danio rerio | dre | org.Dr.eg.db |
| Fly | Drosophila melanogaster | dme | org.Dm.eg.db |
| Yeast | Saccharomyces cerevisiae | sce | org.Sc.sgd.db |
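The organism table above can be expressed as a small lookup, which is roughly what a wrapper script needs before calling into R. This is a minimal sketch: the `Organism` type and `resolve_organism` helper are hypothetical names for illustration, not functions confirmed to exist in `scripts/main.py`.

```python
from typing import NamedTuple


class Organism(NamedTuple):
    scientific_name: str
    kegg_code: str
    orgdb_package: str


# Values copied from the supported-organism table; the dict itself is illustrative.
ORGANISMS = {
    "human": Organism("Homo sapiens", "hsa", "org.Hs.eg.db"),
    "mouse": Organism("Mus musculus", "mmu", "org.Mm.eg.db"),
    "rat": Organism("Rattus norvegicus", "rno", "org.Rn.eg.db"),
    "zebrafish": Organism("Danio rerio", "dre", "org.Dr.eg.db"),
    "fly": Organism("Drosophila melanogaster", "dme", "org.Dm.eg.db"),
    "yeast": Organism("Saccharomyces cerevisiae", "sce", "org.Sc.sgd.db"),
}


def resolve_organism(name: str) -> Organism:
    """Map a common name (case-insensitive) to its KEGG code and OrgDb package."""
    key = name.strip().lower()
    if key not in ORGANISMS:
        supported = ", ".join(sorted(ORGANISMS))
        raise ValueError(f"Unsupported organism {name!r}; expected one of: {supported}")
    return ORGANISMS[key]
```

For example, `resolve_organism("Mouse").kegg_code` yields the code passed to KEGG, and `.orgdb_package` names the annotation package to load.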
# Run enrichment analysis with gene list
python scripts/main.py --genes gene_list.txt --organism human --output results/
| Parameter | Description | Default | Required |
|-----------|-------------|---------|----------|
| --genes | Path to gene list file (one gene per line) | - | Yes |
| --organism | Organism code (human/mouse/rat/zebrafish/fly/yeast) | human | No |
| --id-type | Gene ID type (symbol/entrez/ensembl/refseq) | symbol | No |
| --background | Background gene list file | all genes | No |
| --pvalue-cutoff | P-value cutoff for significance | 0.05 | No |
| --qvalue-cutoff | Adjusted p-value (q-value) cutoff | 0.2 | No |
| --analysis | Analysis type (go/kegg/all) | all | No |
| --output | Output directory | ./enrichment_results | No |
| --format | Output format (csv/tsv/excel/all) | all | No |
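For readers who want to see how the parameter table maps onto a command-line interface, here is a hedged `argparse` sketch. It mirrors the flags and defaults listed above, but it is an assumption about how `scripts/main.py` might be structured, not its actual source. `build_parser` is a hypothetical name, representing the "all genes" background as `None` is a choice made here, and `--go-ontologies` is taken from the usage examples with no documented default.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented parameters (illustrative only)."""
    parser = argparse.ArgumentParser(description="GO and KEGG enrichment analysis")
    parser.add_argument("--genes", required=True,
                        help="Path to gene list file (one gene per line)")
    parser.add_argument("--organism", default="human",
                        choices=["human", "mouse", "rat", "zebrafish", "fly", "yeast"])
    parser.add_argument("--id-type", default="symbol",
                        choices=["symbol", "entrez", "ensembl", "refseq"])
    # None stands in for the documented default of "all genes" (assumption).
    parser.add_argument("--background", default=None,
                        help="Background gene list file")
    parser.add_argument("--pvalue-cutoff", type=float, default=0.05)
    parser.add_argument("--qvalue-cutoff", type=float, default=0.2)
    parser.add_argument("--analysis", default="all", choices=["go", "kegg", "all"])
    # Appears in the usage examples; its default is not documented, so None here.
    parser.add_argument("--go-ontologies", default=None,
                        help="Comma-separated GO ontologies, e.g. BP,MF")
    parser.add_argument("--output", default="./enrichment_results")
    parser.add_argument("--format", default="all",
                        choices=["csv", "tsv", "excel", "all"])
    return parser
```

Calling `build_parser().parse_args(["--genes", "gene_list.txt"])` returns a namespace with every optional value set to the default shown in the table.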
# GO enrichment only, restricted to specific ontologies
python scripts/main.py \
--genes deg_upregulated.txt \
--organism mouse \
--analysis go \
--go-ontologies BP,MF \
--pvalue-cutoff 0.01 \
--output go_results/
# KEGG enrichment with custom background
python scripts/main.py \
--genes treatment_genes.txt \
--background all_expressed_genes.txt \
--organism human \
--analysis kegg \
--qvalue-cutoff 0.05 \
--output kegg_results/
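To see why the background set and the two cutoffs matter, it helps to look at the statistics behind over-representation analysis. The sketch below uses only the standard library and assumes a one-sided hypergeometric test followed by Benjamini-Hochberg adjustment. That is the common approach for this kind of analysis, but the source does not confirm the exact method this skill applies. Both function names are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k).

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Because `N` and `K` are counted within the background, shrinking the background from all annotated genes to only the expressed genes changes every p-value. This is why supplying `--background` can alter which terms pass the cutoffs.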
TP53
BRCA1
EGFR
MYC
KRAS
PTEN
gene,log2FoldChange
TP53,2.5
BRCA1,-1.8
EGFR,3.2
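The two input layouts shown above (a plain list with one gene per line, and a CSV with a `gene,log2FoldChange` header) could be read with a single helper. This is a sketch under the assumption that a comma in the first non-empty line marks the CSV layout. `parse_gene_input` is a hypothetical name and may not match how the skill's own script detects formats.

```python
import csv
import io
from typing import Optional


def parse_gene_input(text: str) -> dict[str, Optional[float]]:
    """Return {gene: log2FoldChange}, with None when no fold change is given."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return {}
    if "," in lines[0]:
        # CSV layout: expects the header names used in the example above.
        reader = csv.DictReader(io.StringIO("\n".join(lines)))
        return {row["gene"]: float(row["log2FoldChange"]) for row in reader}
    # Plain layout: one gene identifier per line.
    return {gene: None for gene in lines}
```

Given the CSV example above, the helper maps `TP53` to `2.5`, `BRCA1` to `-1.8`, and `EGFR` to `3.2`. Given the plain list, every gene maps to `None`.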
output/
├── go_enrichment/
│ ├── GO_BP_results.csv # Biological Process results
│ ├── GO_MF_results.csv # Molecular Function results
│ ├── GO_CC_results.csv # Cellular Component results
│ ├── GO_BP_barplot.pdf # Visualization
│ ├── GO_MF_dotplot.pdf
│ └── GO_summary.txt # Interpretation summary
├── kegg_enrichment/
│ ├── KEGG_results.csv # Pathway results
│ ├── KEGG_barplot.pdf
│ ├── KEGG_dotplot.pdf
│ └── KEGG_pathview/ # Pathway diagrams
└── combined_report.html # Interactive report
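Once the result CSVs exist, a short script can pull out the most significant terms. The sketch below assumes the files carry clusterProfiler-style columns named `Description` and `p.adjust`. The source does not list the column names, so check the header of your own output first. `top_terms` is a hypothetical helper.

```python
import csv
import io


def top_terms(csv_text: str, max_padj: float = 0.05, limit: int = 10):
    """Return up to `limit` (description, adjusted p-value) pairs, most significant first.

    Assumes columns named "Description" and "p.adjust" (an assumption, see above).
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    kept = [row for row in rows if float(row["p.adjust"]) <= max_padj]
    kept.sort(key=lambda row: float(row["p.adjust"]))
    return [(row["Description"], float(row["p.adjust"])) for row in kept[:limit]]
```

In practice the text would come from reading a file such as `go_enrichment/GO_BP_results.csv` with `pathlib.Path(...).read_text()`.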
The tool automatically generates biological interpretation including:
⚠️ AI self-verification status: manual review required
This skill requires:
install.packages(c("BiocManager", "ggplot2", "dplyr", "readr"))
BiocManager::install(c(
"clusterProfiler",
"org.Hs.eg.db", "org.Mm.eg.db", "org.Rn.eg.db",
"enrichplot", "pathview", "DOSE"
))
pip install pandas numpy matplotlib seaborn rpy2
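Before running the pipeline, it can be useful to confirm that the Python packages listed above are importable. The following is a minimal sketch using only the standard library. `missing_packages` is a hypothetical helper, and it checks import names only, not versions or the R-side packages.

```python
from importlib.util import find_spec

# Import names for the Python dependencies listed above.
REQUIRED_PACKAGES = ["pandas", "numpy", "matplotlib", "seaborn", "rpy2"]


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the top-level packages from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing Python packages:", ", ".join(missing))
    else:
        print("All listed Python packages are importable.")
```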
See references/ for:
| Risk Indicator | Assessment | Level |
|----------------|------------|-------|
| Code Execution | Python/R scripts executed locally | Medium |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Medium |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output files saved to workspace | Low |
# Python dependencies
pip install -r requirements.txt