skills/tooluniverse-gene-regulatory-networks/SKILL.md
Gene regulatory network analysis — TF-target inference (JASPAR motifs, ChIP-seq), motif scanning, eQTL integration, perturbation evidence (knockout/overexpression). Use for 'which TF regulates gene X', 'which genes does TF Y target', regulatory pathway reconstruction. Distinguishes direct (binding) vs indirect (co-expression) regulatory evidence.
npx skillsauth add mims-harvard/tooluniverse tooluniverse-gene-regulatory-networks
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
GRN inference starts with: which TF regulates which gene? Direct evidence (ChIP-seq binding) is stronger than indirect (co-expression correlation). A TF binding near a gene doesn't prove regulation — check if expression changes when the TF is perturbed. JASPAR provides binding motifs but motif presence in a promoter is only computational evidence (T3); ENCODE ChIP-seq data that places the TF at the locus in the relevant cell type is stronger (T1). eQTLs from GTEx show which variants affect expression but don't identify the upstream regulator — combine with TF motif disruption analysis for mechanistic insight.
LOOK UP DON'T GUESS: never assume JASPAR matrix IDs, Enrichr library names, or GTEx tissue identifiers — always search JASPAR by TF name and verify library names before calling enrichr.
Activate this skill when the user asks about:
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Determine:
Search JASPAR for the TF's position weight matrix (PWM) and binding motif profile.
Tool: jaspar_search_matrices
Parameters:
search string TF name to search (e.g., "TP53")
limit integer Max results (default 10)
collection string JASPAR collection filter (e.g., "CORE")
species string Taxonomy ID filter (e.g., "9606" for human)
Example:
{"search": "TP53", "limit": 5}
Returns {status, data: {count, results: [{matrix_id, name, collection, base_id, version, sequence_logo}]}}.
Tool: jaspar_get_matrix (for detailed motif info)
Parameters:
matrix_id string JASPAR matrix ID (e.g., "MA0106.3")
Returns PFM (position frequency matrix), species, TF class, UniProt IDs.
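A PFM holds raw counts, so scanning a promoter with it usually means converting it to log-odds weights first. The following is a minimal sketch, assuming the PFM has been reshaped into a dict mapping each base to a list of per-position counts (the exact layout of the jaspar_get_matrix response is not specified here), a uniform 0.25 background, and a small pseudocount. The function names and the toy matrix are hypothetical, and only the forward strand is scanned.

```python
import math

BASES = "ACGT"

def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Convert per-position base counts into log2-odds weights."""
    length = len(pfm["A"])
    pwm = {base: [] for base in BASES}
    for i in range(length):
        total = sum(pfm[base][i] for base in BASES)
        for base in BASES:
            # Spread the pseudocount across bases according to the background.
            freq = (pfm[base][i] + pseudocount * background) / (total + pseudocount)
            pwm[base].append(math.log2(freq / background))
    return pwm

def best_hit(sequence, pwm):
    """Return (score, offset) of the best forward-strand window, or None."""
    length = len(pwm["A"])
    seq = sequence.upper()
    best = None
    for start in range(len(seq) - length + 1):
        window = seq[start:start + length]
        if any(ch not in pwm for ch in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[ch][i] for i, ch in enumerate(window))
        if best is None or score > best[0]:
            best = (score, start)
    return best

# Toy 4-bp matrix (made-up counts) whose consensus is ACGT.
toy_pfm = {
    "A": [10, 0, 0, 0],
    "C": [0, 10, 0, 0],
    "G": [0, 0, 10, 0],
    "T": [0, 0, 0, 10],
}
toy_pwm = pfm_to_pwm(toy_pfm)
hit = best_hit("TTTACGTAAA", toy_pwm)
print(hit)
```

A hit found this way is still computational (T3) evidence; it does not replace ChIP-seq support in the relevant cell type.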
Identify target genes from ChIP-seq experiments via Enrichr.
Tool: enrichr_gene_enrichment_analysis
Parameters:
gene_list array List of gene symbols (REQUIRED)
library string Enrichr library name (default "GO_Biological_Process_2023")
top_n integer Top enriched terms to return (default 10)
Key libraries for regulatory network analysis:
"ENCODE_TF_ChIP-seq_2015" -- TF binding from ENCODE ChIP-seq
"ChEA_2022" -- ChIP-seq enrichment analysis (broader coverage)
"TRRUST_Transcription_Factors_2019" -- Literature-curated TF-target relationships
"ARCHS4_TFs_Coexp" -- TF co-expression from RNA-seq
Example (find which TFs bind your gene set):
{
"gene_list": ["CDKN1A", "BAX", "MDM2", "GADD45A", "BBC3"],
"library": "ENCODE_TF_ChIP-seq_2015",
"top_n": 10
}
Returns {status, data: {library, gene_count, enriched_terms: [{rank, term, p_value, combined_score, overlapping_genes, adjusted_p_value}]}}.
IMPORTANT: Enrichr takes a gene list and tells you what TFs are enriched. To find targets OF a TF, use the TRRUST library or look up TF ChIP-seq targets directly.
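Once the enrichment result is back, the usual next step is to keep only terms that survive multiple-testing correction and rank them. A minimal sketch, assuming the response has the shape described above; the helper name, the 0.05 cutoff, the "success" status string, and the toy values are illustrative assumptions, not real Enrichr output.

```python
def significant_terms(response, alpha=0.05):
    """Keep terms with adjusted p-value below alpha, highest combined score first."""
    terms = response.get("data", {}).get("enriched_terms", [])
    kept = [t for t in terms if t["adjusted_p_value"] < alpha]
    return sorted(kept, key=lambda t: t["combined_score"], reverse=True)

# Toy response with made-up numbers and placeholder term names.
toy_response = {
    "status": "success",
    "data": {
        "library": "ENCODE_TF_ChIP-seq_2015",
        "gene_count": 5,
        "enriched_terms": [
            {"rank": 1, "term": "TF_A (placeholder)", "p_value": 1e-6,
             "combined_score": 80.0, "overlapping_genes": ["CDKN1A", "BAX"],
             "adjusted_p_value": 1e-4},
            {"rank": 2, "term": "TF_B (placeholder)", "p_value": 1e-5,
             "combined_score": 120.0, "overlapping_genes": ["MDM2", "BBC3"],
             "adjusted_p_value": 1e-3},
            {"rank": 3, "term": "TF_C (placeholder)", "p_value": 0.04,
             "combined_score": 5.0, "overlapping_genes": ["GADD45A"],
             "adjusted_p_value": 0.30},
        ],
    },
}

ranked = significant_terms(toy_response)
for term in ranked:
    print(term["term"], term["adjusted_p_value"], term["overlapping_genes"])
```

Filtering on the adjusted p-value rather than the raw p-value keeps the false-discovery rate in check when a library contains hundreds of TF terms.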
Tool: ENCODE_search_histone_experiments
Parameters:
target string Histone mark (e.g., "H3K27ac", "H3K4me3", "H3K27me3")
tissue string Tissue/cell type (e.g., "liver", "brain")
limit integer Max results (default 10)
Common histone marks and their meaning:
H3K27ac -- Active enhancers and promoters
H3K4me3 -- Active promoters
H3K4me1 -- Poised/active enhancers
H3K27me3 -- Polycomb-repressed regions
H3K9me3 -- Heterochromatin
Example:
{"target": "H3K27ac", "tissue": "liver", "limit": 5}
Returns {status, data: {total, experiments: [{accession, histone_mark, biosample_summary, status, lab}]}}.
Tool: GTEx_query_eqtl
Parameters:
gene_symbol string Gene symbol (e.g., "TP53"). REQUIRED.
Returns eQTL SNPs across tissues, showing genetic variants that affect gene expression.
Example:
{"gene_symbol": "TP53"}
Returns {status, data: {singleTissueEqtl: [{snpId, variantId, geneSymbol, pValue, tissueSiteDetailId, nes}]}}. nes = normalized effect size; negative = lower expression with alt allele.
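The eQTL list can be long, so it helps to reduce it to the strongest signal per tissue and read the effect direction from nes. A minimal sketch, assuming the response shape described above; the helper name, the 1e-5 p-value cutoff, and the toy rows (placeholder SNP and tissue identifiers, made-up numbers) are assumptions for illustration.

```python
def top_eqtl_per_tissue(response, max_p=1e-5):
    """Return the most significant eQTL per tissue with its effect direction."""
    best = {}
    for row in response.get("data", {}).get("singleTissueEqtl", []):
        if row["pValue"] > max_p:
            continue
        tissue = row["tissueSiteDetailId"]
        if tissue not in best or row["pValue"] < best[tissue]["pValue"]:
            best[tissue] = row

    def direction(nes):
        # Negative nes means lower expression with the alt allele.
        if nes < 0:
            return "lower"
        return "higher" if nes > 0 else "none"

    return {
        tissue: {
            "snpId": row["snpId"],
            "pValue": row["pValue"],
            "direction": direction(row["nes"]),
        }
        for tissue, row in best.items()
    }

# Toy rows: identifiers are placeholders, not real GTEx values.
toy_eqtls = {
    "status": "success",
    "data": {
        "singleTissueEqtl": [
            {"snpId": "rs_toy_1", "variantId": "toy_variant_1", "geneSymbol": "TP53",
             "pValue": 1e-8, "tissueSiteDetailId": "tissue_A", "nes": -0.4},
            {"snpId": "rs_toy_2", "variantId": "toy_variant_2", "geneSymbol": "TP53",
             "pValue": 1e-6, "tissueSiteDetailId": "tissue_A", "nes": 0.2},
            {"snpId": "rs_toy_3", "variantId": "toy_variant_3", "geneSymbol": "TP53",
             "pValue": 0.01, "tissueSiteDetailId": "tissue_B", "nes": 0.3},
        ]
    },
}

summary = top_eqtl_per_tissue(toy_eqtls)
print(summary)
```

The lead SNPs from this summary are natural inputs for RegulomeDB_query_variant, since an eQTL alone does not identify the upstream regulator.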
Tool: RegulomeDB_query_variant
Parameters:
rsid string dbSNP rsID (e.g., "rs7412")
Returns regulatory score (1a-7), tissue-specific scores, and overlapping regulatory features.
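When several variants are annotated, sorting them by rank puts the best-supported ones first. A minimal sketch, assuming ranks come back as strings such as "1a", "2b", or "4" and that lower ranks indicate stronger regulatory support, as in RegulomeDB's published scheme. Both helper names are hypothetical, and the field holding the rank in the actual response is not specified here.

```python
def regulome_rank_key(rank):
    """Sort key for ranks like '1a', '2b', '4': leading digit, then letter."""
    rank = rank.strip().lower()
    return (int(rank[0]), rank[1:])

def normalize_rsid(value):
    """Ensure the 'rs' prefix the tool expects (e.g. '7412' -> 'rs7412')."""
    value = str(value).strip().lower()
    return value if value.startswith("rs") else "rs" + value

ranks = ["4", "1f", "2a", "1a", "7"]
ordered = sorted(ranks, key=regulome_rank_key)
print(ordered)
print(normalize_rsid("7412"))
```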
Tool: STRING_get_interaction_partners
Parameters:
identifiers string Protein/gene name (REQUIRED, e.g., "TP53")
species integer NCBI taxonomy ID (default 9606 for human)
limit integer Max partners to return
required_score integer Min combined score 0-1000 (400=medium, 700=high, 900=highest)
Example:
{"identifiers": "TP53", "species": 9606, "limit": 10}
Returns array of {preferredName_A, preferredName_B, score, escore, dscore, tscore, ascore}. Score components: escore (experimental), dscore (database), tscore (text-mining), ascore (coexpression).
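Because the combined score mixes several evidence channels, it is worth checking which channel actually drives each partner before treating an edge as direct. A minimal sketch, assuming each returned item carries the four sub-scores listed above; the helper names and toy partners are hypothetical. Only relative sizes are compared, so the numeric scale of the sub-scores does not matter.

```python
CHANNELS = {
    "escore": "experimental",
    "dscore": "database",
    "tscore": "text-mining",
    "ascore": "coexpression",
}

def dominant_channel(partner):
    """Name of the evidence channel with the highest sub-score."""
    key = max(CHANNELS, key=lambda k: partner.get(k, 0))
    return CHANNELS[key]

def partners_by_channel(partners):
    """Group partner names by their dominant evidence channel."""
    grouped = {}
    for partner in partners:
        grouped.setdefault(dominant_channel(partner), []).append(
            partner["preferredName_B"]
        )
    return grouped

# Toy partners with made-up scores and placeholder names.
toy_partners = [
    {"preferredName_A": "TP53", "preferredName_B": "PARTNER_1", "score": 0.9,
     "escore": 0.8, "dscore": 0.5, "tscore": 0.3, "ascore": 0.1},
    {"preferredName_A": "TP53", "preferredName_B": "PARTNER_2", "score": 0.7,
     "escore": 0.0, "dscore": 0.0, "tscore": 0.7, "ascore": 0.2},
]

grouped = partners_by_channel(toy_partners)
print(grouped)
```

Partners that land in the text-mining or coexpression groups are indirect evidence and should be reported as such.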
Tool: intact_get_interaction_network
Parameters:
gene_symbol string Gene symbol (REQUIRED)
limit integer Max results
Returns experimentally validated molecular interactions from IntAct.
Tool: BioGRID_get_interactions
Parameters:
gene_symbol string Gene symbol (REQUIRED)
limit integer Max results
Returns physical and genetic interactions with experimental system details.
Tool: EuropePMC_search_articles
Parameters:
query string Search query (REQUIRED)
limit integer Max results (default 10)
Example:
{"query": "TP53 transcription factor regulatory network", "limit": 5}
Tool: PubMed_search_articles
Parameters:
query string Search query (REQUIRED)
limit integer Max results (default 10)
Tool: ols_search_terms
Parameters:
query string Search term (REQUIRED)
ontology string Ontology ID (e.g., "so" for Sequence Ontology, "go" for Gene Ontology)
limit integer Max results
Example for regulatory element types:
{"query": "transcription factor binding site", "ontology": "so", "limit": 5}
Tool: STRING_functional_enrichment
Parameters:
identifiers string Comma-separated gene names (REQUIRED)
species integer NCBI taxonomy ID (default 9606)
Performs GO, KEGG, Reactome enrichment on a gene set from the network.
JASPAR tool name: To look up a TF by name, use jaspar_search_matrices (lowercase, plural), NOT jaspar_get_matrix, which only accepts a known matrix_id.
JASPAR search param: The parameter is search (NOT query or name).
STRING identifiers param: Use identifiers as a string (NOT an array). For multiple proteins, use STRING_get_network with array identifiers.
Enrichr direction: enrichr_gene_enrichment_analysis takes a gene SET and finds enriched TFs/pathways. To find targets of a TF, use "TRRUST_Transcription_Factors_2019" library with known target genes, or consult ENCODE ChIP-seq data directly.
Enrichr gene_list is required: Must be a JSON array of strings, not a single string.
GTEx uses gene_symbol: NOT Ensembl ID. The tool resolves it internally.
ENCODE tissue names: Use lowercase tissue names like "liver", "brain", "heart". Complex queries may fail -- keep tissue names simple.
BioGRID returns interactions as dict: Keys are interaction IDs, values contain OFFICIAL_SYMBOL_A and OFFICIAL_SYMBOL_B.
RegulomeDB rsID format: Must include the "rs" prefix (e.g., "rs7412" not "7412").
No TRRUST direct tool: TRRUST data is accessed via Enrichr library "TRRUST_Transcription_Factors_2019", not a standalone tool.
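The BioGRID note above is the one most likely to trip up downstream code, since the result is a dict rather than a list. A minimal sketch for pulling out partner symbols, assuming each value carries OFFICIAL_SYMBOL_A and OFFICIAL_SYMBOL_B as described; the helper name, interaction IDs, and partner symbols are placeholders.

```python
def biogrid_partners(interactions, gene):
    """Collect unique partner symbols for `gene` from a BioGRID-style dict."""
    gene = gene.upper()
    partners = set()
    for record in interactions.values():
        a = record["OFFICIAL_SYMBOL_A"].upper()
        b = record["OFFICIAL_SYMBOL_B"].upper()
        # The query gene can sit on either side; skip self-interactions.
        if a == gene and b != gene:
            partners.add(b)
        elif b == gene and a != gene:
            partners.add(a)
    return sorted(partners)

# Toy dict keyed by made-up interaction IDs.
toy_interactions = {
    "1001": {"OFFICIAL_SYMBOL_A": "TP53", "OFFICIAL_SYMBOL_B": "GENE_B"},
    "1002": {"OFFICIAL_SYMBOL_A": "GENE_C", "OFFICIAL_SYMBOL_B": "TP53"},
    "1003": {"OFFICIAL_SYMBOL_A": "TP53", "OFFICIAL_SYMBOL_B": "TP53"},
    "1004": {"OFFICIAL_SYMBOL_A": "TP53", "OFFICIAL_SYMBOL_B": "GENE_B"},
}

partners = biogrid_partners(toy_interactions, "tp53")
print(partners)
```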
jaspar_search_matrices -- Get motif info for TF X
enrichr_gene_enrichment_analysis with TRRUST_Transcription_Factors_2019 library -- Use known targets
STRING_get_interaction_partners -- Find interacting proteins
EuropePMC_search_articles -- Literature on TF X targets

enrichr_gene_enrichment_analysis with gene Y's co-regulated genes + ENCODE_TF_ChIP-seq_2015 library
GTEx_query_eqtl -- Find eQTLs affecting gene Y expression
ENCODE_search_histone_experiments -- Chromatin context at gene Y locus
RegulomeDB_query_variant -- Annotate regulatory variants near gene Y

enrichr_gene_enrichment_analysis with gene set Z + multiple TF libraries
STRING_get_interaction_partners for hub genes
STRING_functional_enrichment -- Pathway context
BioGRID_get_interactions -- Experimental validation
EuropePMC_search_articles -- Supporting literature

GTEx_query_eqtl -- Tissue-specific eQTLs for gene X
ENCODE_search_histone_experiments with specific tissue -- Active regulatory marks
RegulomeDB_query_variant -- Tissue-specific regulatory scores for eQTL SNPs
enrichr_gene_enrichment_analysis -- Identify TFs active in that tissue

RegulomeDB_query_variant -- Regulatory score and overlapping features
GTEx_query_eqtl -- Is this variant an eQTL?
ENCODE_search_histone_experiments -- Chromatin context at variant locus
EuropePMC_search_articles -- Literature on the variant