11_toolBased.functional-enrichment/SKILL.md
Perform GO and KEGG functional enrichment with HOMER, starting from genomic regions (BED/narrowPeak/broadPeak) or gene lists, and produce R-based barplot/dotplot visualizations. Use this skill when you need GO/KEGG enrichment for a set of genomic regions or genes, or when you only need to link genomic regions to genes.
- Region-to-gene annotation: annotatePeaks.pl.
- GO enrichment: findGO.pl (or annotatePeaks.pl -go) for BP/MF/CC.
- KEGG enrichment: findGO.pl -kegg (or annotatePeaks.pl -kegg).
- Visualization: ggplot2 from standardized outputs.

Genomic region formats supported: BED, narrowPeak, broadPeak.

Gene list input: gene_list.txt with one official gene symbol per line (no header), plus an optional gene_list_background.txt in the same format.

${sample}_functional_enrichment/
  results/
    ${sample}.anno_genomic_features.txt
    ${sample}.anno_genomic_features_stats.txt
    biological_process.txt
    cellular_component.txt
    molecular_function.txt
    kegg.txt
    biocyc.txt
    chromosome.txt
    cosmic.txt
    interactions.txt
    interpro.txt
    gene3d.txt
    pathwayInteractionDB.txt
    pfam.txt
    prints.txt
    prosite.txt
    reactome.txt
    smpdb.txt
    wikipathways.txt
    gwas.txt
    lipidmaps.txt
    msigdb.txt
    smart.txt
  tables/
    ${sample}.gene_list.txt
    go_bp.tsv
    go_mf.tsv
    go_cc.tsv
    kegg.tsv
  logs/
    ${sample}.anno_genomic_features.log  # only if a genomic region file is provided
    findGO.log
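As a rough illustration of the layout above, the following Python sketch creates the three subdirectories. The function name `make_project_dirs` and the `base_dir` parameter are hypothetical and are not part of the skill's tools; the real project-init tool may behave differently.

```python
from pathlib import Path


def make_project_dirs(sample: str, base_dir: str = ".") -> Path:
    """Create <sample>_functional_enrichment/{results,tables,logs}.

    Hypothetical helper: it only mirrors the directory layout listed above.
    """
    proj_dir = Path(base_dir) / f"{sample}_functional_enrichment"
    for sub in ("results", "tables", "logs"):
        # exist_ok=True makes the call safe to repeat for the same sample.
        (proj_dir / sub).mkdir(parents=True, exist_ok=True)
    return proj_dir
```

Calling `make_project_dirs("H3K27ac_rep1")` would create `H3K27ac_rep1_functional_enrichment/` in the current directory; the sample name here is only an example.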
Before calling any tool, ask the user:
- Sample name (sample): used as the file prefix and to name the output directory ${sample}_functional_enrichment.
- Genome assembly (genome): e.g. hg38, mm10, danRer11.
Call:
mcp__project-init-tools__project_init
with:
- sample: the user-provided sample name
- task: functional_enrichment

The tool will:
- Create the ${sample}_functional_enrichment directory.
- Return the full path of the ${sample}_functional_enrichment directory, which will be used as ${proj_dir}.

Call:
mcp__homer-tools__check_genome_installation
With:
- genome: the user-provided genome assembly, e.g. hg38, mm10, danRer11

The tool will check whether the requested genome is installed for HOMER.
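A genome check of this kind can be approximated outside the tool. The sketch below assumes that HOMER keeps each installed genome in `data/genomes/<genome>/` under its installation directory; both that layout and the helper name `homer_genome_installed` are assumptions, and the actual tool may check differently.

```python
from pathlib import Path


def homer_genome_installed(homer_dir: str, genome: str) -> bool:
    """Return True if <homer_dir>/data/genomes/<genome> exists and is non-empty.

    Assumption: HOMER stores genome packages under data/genomes/.
    """
    genome_dir = Path(homer_dir) / "data" / "genomes" / genome
    return genome_dir.is_dir() and any(genome_dir.iterdir())
```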
This step is optional. Only perform this step if the input file is a BED file. If the input file is a gene list, skip this step.
From 1 format to chr1 format
From MT format to chrM format
Call:
mcp__file-format-tools__standardize_bed_chrom_names
with:
- input_bed: the user-provided BED file
- output_bed: the path to save the standardized BED file

The tool will write the BED file with standardized chromosome names to output_bed.
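The renaming described in this step (1 to chr1, MT to chrM) can be sketched in Python as follows. This is a minimal illustration, not the tool's implementation: the function names are hypothetical, and unplaced scaffolds are simply prefixed with `chr`, which may not match the names HOMER expects.

```python
def standardize_chrom(name: str) -> str:
    """Convert a bare chromosome name to the chr-prefixed style."""
    if name.startswith("chr"):
        return name  # already in the target style
    if name in ("MT", "M"):
        return "chrM"  # the mitochondrial chromosome is a special case
    return "chr" + name


def standardize_bed_chrom_names(input_bed: str, output_bed: str) -> None:
    """Rewrite column 1 of a tab-separated BED-like file."""
    with open(input_bed) as src, open(output_bed, "w") as dst:
        for line in src:
            # Copy blank lines and track/browser/comment lines unchanged.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                dst.write(line)
                continue
            fields = line.rstrip("\n").split("\t")
            fields[0] = standardize_chrom(fields[0])
            dst.write("\t".join(fields) + "\n")
```

Because only the first column is touched, the same function works for narrowPeak and broadPeak files.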
This step is optional. Only perform this step if the input file is a gene list file. If the input file is a BED file, skip this step.
Call:
mcp__mygene-tools__convert_gene_ids_mygene
With:
- input_ids_file: the user-provided gene list file. May end with .txt.
- scopes: the source ID type for mygene (e.g., 'ensembl.gene', 'symbol', 'entrezgene', 'uniprot', or a comma-separated list).
- fields: the comma-separated target fields to retrieve from mygene (e.g., 'symbol,ensembl.gene,uniprot,entrezgene').
- species: the species for mygene (e.g., 'human', 'mouse', 'zebrafish', or an NCBI taxon ID like '9606').
- out_file: the path to save the converted gene list file. In this skill, save it under the ${sample}_functional_enrichment directory returned by mcp__project-init-tools__project_init.
- batch_size: the batch size for mygene.querymany (default 1000).

The tool will write the converted gene list to out_file.
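For orientation, the conversion can be sketched with the `mygene` Python client, whose `MyGeneInfo.querymany` call accepts `scopes`, `fields`, and `species`. The helper names `hits_to_symbols` and `convert_ids` are hypothetical, and the MCP tool may handle unmatched or ambiguous IDs differently.

```python
def hits_to_symbols(hits):
    """Map each query ID to its gene symbol, skipping unmatched queries."""
    mapping = {}
    for hit in hits:
        # mygene marks misses with {"query": ..., "notfound": True}.
        if hit.get("notfound") or "symbol" not in hit:
            continue
        # Keep the first hit when one query matches several genes.
        mapping.setdefault(hit["query"], hit["symbol"])
    return mapping


def convert_ids(ids, scopes="ensembl.gene", species="human"):
    """Query MyGene.info (requires network access) and return {id: symbol}."""
    import mygene  # third-party package: pip install mygene

    mg = mygene.MyGeneInfo()
    hits = mg.querymany(ids, scopes=scopes, fields="symbol", species=species)
    return hits_to_symbols(hits)
```

Keeping the result parsing in `hits_to_symbols` separate from the network call makes it possible to check the mapping logic offline.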
Option 1 applies only if the input file is a BED file. If the input file is a gene list, call the tools in Option 2.

Annotate the regions using annotatePeaks.pl with the -go option. If the user also provides a background genomic region file, such as a control peak file, call this tool for the background file as well, using a different ${sample} name for the background sample.

Call:
mcp__homer-tools__annotate_genomic_features
With:
- sample: the user-provided sample name
- proj_dir: directory to save the genomic feature annotation results. In this skill, it is the full path of the ${sample}_functional_enrichment directory returned by mcp__project-init-tools__project_init
- regions_bed: the user-provided regions file in BED format. May end with .bed, .narrowPeak, .broadPeak, etc.
- genome: the user-provided genome assembly, e.g. hg38, mm10, danRer11
- ann: custom HOMER annotation file created by assignGenomeAnnotation.pl (default: None)
- size_given: keep original region sizes (default: True)
- cpg: include CpG information (default: False)
- go: True to perform GO enrichment analysis

The tool will:
- Run annotatePeaks.pl.
- Return the path to the annotation results under the ${proj_dir}/results/ directory, and the path to the log file under the ${proj_dir}/logs/ directory.
  - ${proj_dir}/results/${sample}.anno_genomic_features.txt
  - ${proj_dir}/results/${sample}.anno_genomic_features_stats.txt
  - ${proj_dir}/logs/${sample}.anno_genomic_features.log

Call:
mcp__file-format-tools__extract_gene_list
With:
- sample: the user-provided sample name
- proj_dir: directory to save the genomic feature annotation results. In this skill, it is the full path of the ${sample}_functional_enrichment directory returned by mcp__project-init-tools__project_init

The tool will:
- Save the extracted gene list under the ${proj_dir}/tables/ directory.
  - ${proj_dir}/tables/${sample}.gene_list.txt

Option 2 applies only if the input file is a gene list file. If the input file is a BED file, call the tools in Option 1.
Call:
mcp__homer-tools__gene_function_enrichment
With:
- sample: the user-provided sample name
- proj_dir: directory to save the GO & KEGG enrichment results. In this skill, it is the full path of the ${sample}_functional_enrichment directory returned by mcp__project-init-tools__project_init
- gene_list_file: the user-provided gene list file. May end with .txt.
- organism: the user-provided organism name, e.g. human, mouse, zebrafish, etc.
- background_gene_list_file: the user-provided background gene list file. May end with .txt. If not provided, set this parameter to None.

The tool will:
- Save the enrichment results under the ${proj_dir}/results/ directory.
  - ${proj_dir}/results/biological_process.txt
  - ${proj_dir}/results/kegg.txt
- Save the log file under the ${proj_dir}/logs/ directory.
  - ${proj_dir}/logs/${sample}.find_go_and_kegg_enrichment.log

Alternative: direct from BED
annotatePeaks.pl peaks.bed hg38 -go results/{run}/tables/go_dir -genomeOntology
annotatePeaks.pl peaks.bed hg38 -kegg results/{run}/tables/kegg_dir
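When the goal is only to link regions to genes, a gene list can be pulled out of an annotatePeaks.pl annotation table. The sketch below assumes the table is tab-separated with a header row containing a `Gene Name` column, as in typical HOMER output; the function name and that column assumption are illustrative, not the skill's own extraction tool.

```python
import csv


def extract_gene_list(annotation_path, gene_list_path, column="Gene Name"):
    """Write the unique, sorted gene names from an annotation table.

    Assumption: the table has a tab-separated header with a 'Gene Name' column.
    """
    genes = set()
    with open(annotation_path, newline="") as fh:
        reader = csv.DictReader(fh, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            gene = (row.get(column) or "").strip()
            if gene:  # regions without an assigned gene are skipped
                genes.add(gene)
    ordered = sorted(genes)
    with open(gene_list_path, "w") as out:
        for gene in ordered:
            out.write(gene + "\n")  # one symbol per line, no header
    return ordered
```

The output follows the gene list format described earlier: one gene symbol per line with no header.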
- Chromosome naming must match between the regions and the genome (chr1 vs 1).
- Background: -bg helps reduce bias; choose a reasonable universe (e.g., all expressed genes, or all accessible regions → genes).
- annotatePeaks.pl -go/-kegg is convenient; the gene-list route yields uniform TSVs for plotting.
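The exact layout of the uniform TSVs is not specified here. As one possible approach, the sketch below reduces a HOMER enrichment table to a small plot-ready TSV. It assumes the table has `Term`, `logP` (natural log of the p-value), and `Target Genes in Term` columns; these column names, the output columns, and the function name `top_terms` are all assumptions to verify against real output.

```python
import csv
import math


def top_terms(homer_table, out_tsv, n=10):
    """Keep the n most significant terms and write them as a TSV.

    Assumption: 'logP' is the natural-log p-value, so -logP / ln(10)
    gives -log10(p), a common axis for barplots and dotplots.
    """
    rows = []
    with open(homer_table, newline="") as fh:
        reader = csv.DictReader(fh, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            rows.append({
                "term": row["Term"],
                "neg_log10_p": -float(row["logP"]) / math.log(10),
                "target_genes": int(row["Target Genes in Term"]),
            })
    # Most significant terms first.
    rows.sort(key=lambda r: r["neg_log10_p"], reverse=True)
    rows = rows[:n]
    with open(out_tsv, "w", newline="") as out:
        writer = csv.DictWriter(
            out,
            fieldnames=["term", "neg_log10_p", "target_genes"],
            delimiter="\t",
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

A table written this way can be read directly by R (for example with `read.delim`) and passed to ggplot2 for the barplot or dotplot.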