scientific-skills/gtex-database/SKILL.md
Query GTEx (Genotype-Tissue Expression) portal for tissue-specific gene expression, eQTLs (expression quantitative trait loci), and sQTLs. Essential for linking GWAS variants to gene regulation, understanding tissue-specific expression, and interpreting non-coding variant effects.
The Genotype-Tissue Expression (GTEx) project provides a comprehensive resource for studying tissue-specific gene expression and genetic regulation across 54 non-diseased human tissues from nearly 1,000 individuals. GTEx v10 (the latest release) enables researchers to understand how genetic variants regulate gene expression (eQTLs) and splicing (sQTLs) in a tissue-specific manner, which is critical for interpreting GWAS loci and identifying regulatory mechanisms.
Key resources:
Use GTEx when:
Base URL: https://gtexportal.org/api/v2/
The API returns JSON and does not require authentication. All endpoints support pagination.
```python
import requests

BASE_URL = "https://gtexportal.org/api/v2"

def gtex_get(endpoint, params=None):
    """Make a GET request to the GTEx API."""
    url = f"{BASE_URL}/{endpoint}"
    response = requests.get(url, params=params, headers={"Accept": "application/json"})
    response.raise_for_status()
    return response.json()
```
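Since every endpoint is paginated, a small generator can hide the page loop. The sketch below is one way to do it and is not part of the GTEx API: `iter_pages` and `fetch_page` are hypothetical names, and it assumes pages are numbered from 0 and that an exhausted query returns an empty `data` list, as the loops elsewhere in this guide do. A canned list stands in for the network so the example runs offline.

```python
from typing import Callable, Iterator


def iter_pages(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield records page by page until a page comes back with no "data".

    fetch_page is a hypothetical callable: it takes a 0-based page number
    and returns the decoded JSON body (a dict holding a "data" list).
    """
    page = 0
    while True:
        records = fetch_page(page).get("data", [])
        if not records:
            break
        yield from records
        page += 1


# Offline stand-in for the API (illustrative records, not GTEx data).
_fake_pages = [
    {"data": [{"id": 1}, {"id": 2}]},
    {"data": [{"id": 3}]},
    {"data": []},
]
collected = list(iter_pages(lambda p: _fake_pages[p]))
print(collected)
```

With the `gtex_get` helper, `fetch_page` could be something like `lambda p: gtex_get("association/egene", {"tissueSiteDetailId": "Whole_Blood", "datasetId": "gtex_v10", "page": p})`.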
```python
import requests
import pandas as pd

def get_gene_expression_by_tissue(gene_id_or_symbol, dataset_id="gtex_v10"):
    """Get median gene expression across all tissues."""
    url = "https://gtexportal.org/api/v2/expression/medianGeneExpression"
    params = {
        "gencodeId": gene_id_or_symbol,
        "datasetId": dataset_id,
        "itemsPerPage": 100
    }
    response = requests.get(url, params=params)
    response.raise_for_status()  # fail loudly on HTTP errors
    data = response.json()
    records = data.get("data", [])
    df = pd.DataFrame(records)
    if not df.empty:
        # Keep only the requested columns that the response actually contains
        wanted = ["tissueSiteDetailId", "tissueSiteDetail", "median", "unit"]
        df = df[[c for c in wanted if c in df.columns]].sort_values(
            "median", ascending=False
        )
    return df

# Example: get expression of APOE across tissues
df = get_gene_expression_by_tissue("ENSG00000130203.10")  # APOE GENCODE ID
# Or use gene symbol (some endpoints accept both)
print(df.head(10))
# Output: tissue name, median TPM, sorted by highest expression
```
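The GENCODE ID in the example carries a version suffix (`.10`). When joining GTEx results with other annotation sources it can help to separate the stable gene ID from that suffix. The helper below is a minimal sketch; `split_gencode_id` is a hypothetical name, and it assumes IDs of the plain form `ENSG...` or `ENSG....<integer>`.

```python
from typing import Optional, Tuple


def split_gencode_id(gencode_id: str) -> Tuple[str, Optional[int]]:
    """Split a GENCODE gene ID into (stable ID, version or None)."""
    base, sep, version = gencode_id.partition(".")
    if not base.startswith("ENSG"):
        raise ValueError(f"not an Ensembl gene ID: {gencode_id!r}")
    if not sep:
        # No version suffix present
        return base, None
    if not version.isdigit():
        raise ValueError(f"unexpected version suffix: {gencode_id!r}")
    return base, int(version)


stable_id, version = split_gencode_id("ENSG00000130203.10")
print(stable_id, version)
```

Keep the versioned form for API calls and use the stable part only for matching against resources that omit versions.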
```python
import requests
import pandas as pd

def query_eqtl(gene_id, tissue_id=None, dataset_id="gtex_v10"):
    """Query significant eQTLs for a gene, optionally filtered by tissue."""
    url = "https://gtexportal.org/api/v2/association/singleTissueEqtl"
    params = {
        "gencodeId": gene_id,
        "datasetId": dataset_id,
        "itemsPerPage": 250
    }
    if tissue_id:
        params["tissueSiteDetailId"] = tissue_id

    all_results = []
    page = 0
    while True:
        params["page"] = page
        response = requests.get(url, params=params)
        response.raise_for_status()  # fail loudly on HTTP errors
        data = response.json()
        results = data.get("data", [])
        if not results:
            break
        all_results.extend(results)
        # A short page means this was the last one
        if len(results) < params["itemsPerPage"]:
            break
        page += 1

    df = pd.DataFrame(all_results)
    if not df.empty:
        df = df.sort_values("pval", ascending=True)
    return df

# Example: Find eQTLs for PCSK9
df = query_eqtl("ENSG00000169174.14")
if not df.empty:
    print(df[["snpId", "tissueSiteDetailId", "slope", "pval", "gencodeId"]].head(20))
```
```python
import requests

def query_variant_eqtl(variant_id, tissue_id=None, dataset_id="gtex_v10"):
    """Get all eQTL associations for a specific variant."""
    url = "https://gtexportal.org/api/v2/association/singleTissueEqtl"
    params = {
        "variantId": variant_id,  # e.g., "chr1_55516888_G_GA_b38"
        "datasetId": dataset_id,
        "itemsPerPage": 250
    }
    if tissue_id:
        params["tissueSiteDetailId"] = tissue_id
    response = requests.get(url, params=params)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()

# GTEx variant ID format: chr{chrom}_{pos}_{ref}_{alt}_b38
# Example: "chr17_43094692_G_A_b38"
```
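Because the variant ID format is purely positional, it can be built and parsed with a few lines of code. The following is a sketch, not a GTEx utility: `make_variant_id`, `parse_variant_id` and `GtexVariant` are hypothetical names, and the regular expression assumes chromosomes 1-22, X, Y or M and alleles written with the letters A, C, G, T or N.

```python
import re
from typing import NamedTuple


class GtexVariant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    build: str


# Assumed pattern for chr{chrom}_{pos}_{ref}_{alt}_b38-style IDs.
_VARIANT_RE = re.compile(
    r"^chr(?P<chrom>[0-9]{1,2}|X|Y|M)_(?P<pos>\d+)"
    r"_(?P<ref>[ACGTN]+)_(?P<alt>[ACGTN]+)_(?P<build>b\d+)$"
)


def make_variant_id(chrom, pos, ref, alt, build="b38"):
    """Build a GTEx-style variant ID from its components."""
    chrom = str(chrom)
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    return f"chr{chrom}_{int(pos)}_{ref.upper()}_{alt.upper()}_{build}"


def parse_variant_id(variant_id):
    """Parse a GTEx-style variant ID; raise ValueError if it does not match."""
    match = _VARIANT_RE.match(variant_id)
    if match is None:
        raise ValueError(f"not a GTEx-style variant ID: {variant_id!r}")
    return GtexVariant(
        match["chrom"], int(match["pos"]), match["ref"], match["alt"], match["build"]
    )


print(make_variant_id(1, 55516888, "G", "GA"))
print(parse_variant_id("chr17_43094692_G_A_b38"))
```

Validating IDs locally before calling the API makes it easier to tell a malformed ID apart from a variant that simply has no significant associations.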
```python
import requests

def get_egenes(tissue_id, dataset_id="gtex_v10"):
    """Get all eGenes (genes with at least one significant eQTL) in a tissue."""
    url = "https://gtexportal.org/api/v2/association/egene"
    params = {
        "tissueSiteDetailId": tissue_id,
        "datasetId": dataset_id,
        "itemsPerPage": 500
    }

    all_egenes = []
    page = 0
    while True:
        params["page"] = page
        response = requests.get(url, params=params)
        response.raise_for_status()  # fail loudly on HTTP errors
        data = response.json()
        batch = data.get("data", [])
        if not batch:
            break
        all_egenes.extend(batch)
        # A short page means this was the last one
        if len(batch) < params["itemsPerPage"]:
            break
        page += 1
    return all_egenes

# Example: all eGenes in whole blood
egenes = get_egenes("Whole_Blood")
print(f"Found {len(egenes)} eGenes in Whole Blood")
```
```python
import requests

def get_tissues(dataset_id="gtex_v10"):
    """Get all available tissues with metadata."""
    url = "https://gtexportal.org/api/v2/dataset/tissueSiteDetail"
    params = {"datasetId": dataset_id, "itemsPerPage": 100}
    response = requests.get(url, params=params)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()["data"]

tissues = get_tissues()
# Key fields: tissueSiteDetailId, tissueSiteDetail, colorHex, samplingSite
# Common tissue IDs:
# Whole_Blood, Brain_Cortex, Liver, Kidney_Cortex, Heart_Left_Ventricle,
# Lung, Muscle_Skeletal, Adipose_Subcutaneous, Colon_Transverse, ...
```
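API calls take the `tissueSiteDetailId`, while papers and plots usually show the display name. A lookup built from the tissue list bridges the two. This is a minimal sketch: `build_tissue_lookup` is a hypothetical helper, it relies only on the `tissueSiteDetailId` and `tissueSiteDetail` fields listed above, and the two records below are a hand-written stand-in for the output of `get_tissues()`.

```python
def build_tissue_lookup(tissues):
    """Map lower-cased display names to tissueSiteDetailId values."""
    return {
        t["tissueSiteDetail"].strip().lower(): t["tissueSiteDetailId"]
        for t in tissues
    }


# Hand-written stand-in for get_tissues() so the sketch runs offline.
sample_tissues = [
    {"tissueSiteDetailId": "Whole_Blood", "tissueSiteDetail": "Whole Blood"},
    {"tissueSiteDetailId": "Brain_Cortex", "tissueSiteDetail": "Brain - Cortex"},
]

lookup = build_tissue_lookup(sample_tissues)
print(lookup["whole blood"])
```

Lower-casing the key makes the lookup tolerant of capitalization differences in user input; a missing name raises `KeyError` rather than silently querying the wrong tissue.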
```python
import requests

def query_sqtl(gene_id, tissue_id=None, dataset_id="gtex_v10"):
    """Query significant sQTLs for a gene."""
    url = "https://gtexportal.org/api/v2/association/singleTissueSqtl"
    params = {
        "gencodeId": gene_id,
        "datasetId": dataset_id,
        "itemsPerPage": 250
    }
    if tissue_id:
        params["tissueSiteDetailId"] = tissue_id
    response = requests.get(url, params=params)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()
```
- Express the variant in GTEx format (chr{chrom}_{pos}_{ref}_{alt}_b38)
- Use coloc (R package) with full summary statistics

```python
import requests
import pandas as pd

def interpret_gwas_variant(variant_id, dataset_id="gtex_v10"):
    """Find all genes regulated by a GWAS variant."""
    url = "https://gtexportal.org/api/v2/association/singleTissueEqtl"
    params = {"variantId": variant_id, "datasetId": dataset_id, "itemsPerPage": 500}
    response = requests.get(url, params=params)
    response.raise_for_status()  # fail loudly on HTTP errors
    data = response.json()
    df = pd.DataFrame(data.get("data", []))
    if df.empty:
        return df
    return df[["geneSymbol", "tissueSiteDetailId", "slope", "pval", "maf"]].sort_values("pval")

# Example
results = interpret_gwas_variant("chr1_154453788_A_T_b38")
if not results.empty:
    # Number of tissues in which each gene has a significant eQTL for this variant
    print(results.groupby("geneSymbol")["tissueSiteDetailId"].count().sort_values(ascending=False))
```
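Counting tissues per gene is a first pass; it is often useful to also see the strongest association and whether the effect direction is consistent across tissues. The sketch below assumes a table with the `geneSymbol`, `tissueSiteDetailId`, `slope` and `pval` columns used above. `summarize_by_gene` is a hypothetical helper, and the rows are made-up placeholders, not GTEx results.

```python
import pandas as pd


def summarize_by_gene(df):
    """Per gene: tissues with a hit, best p-value, and count of positive slopes."""
    return (
        df.groupby("geneSymbol")
        .agg(
            n_tissues=("tissueSiteDetailId", "nunique"),
            min_pval=("pval", "min"),
            n_up=("slope", lambda s: int((s > 0).sum())),
        )
        .sort_values("min_pval")
    )


# Placeholder rows for illustration only (not real associations).
toy = pd.DataFrame(
    {
        "geneSymbol": ["GENE_A", "GENE_A", "GENE_B"],
        "tissueSiteDetailId": ["Whole_Blood", "Liver", "Lung"],
        "slope": [0.3, 0.2, -0.4],
        "pval": [1e-8, 1e-5, 1e-6],
    }
)

summary = summarize_by_gene(toy)
print(summary)
```

A gene where `n_up` equals either 0 or `n_tissues` has a consistent direction of effect; mixed values suggest tissue-dependent regulation and deserve a closer look.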
| Endpoint | Description |
|----------|-------------|
| /expression/medianGeneExpression | Median TPM by tissue for a gene |
| /expression/geneExpression | Full distribution of expression per tissue |
| /association/singleTissueEqtl | Significant eQTL associations |
| /association/singleTissueSqtl | Significant sQTL associations |
| /association/egene | eGenes in a tissue |
| /dataset/tissueSiteDetail | Available tissues with metadata |
| /reference/gene | Gene metadata (GENCODE IDs, coordinates) |
| /variant/variantPage | Variant lookup by rsID or position |

| ID | Description |
|----|-------------|
| gtex_v10 | GTEx v10 (current; ~960 donors, 54 tissues) |
| gtex_v8 | GTEx v8 (838 donors, 49 tissues) — older but widely cited |
- Use versioned GENCODE IDs (e.g., ENSG00000130203.10) for gene queries; the .version suffix matters for some endpoints
- Variant IDs use the format chr{chrom}_{pos}_{ref}_{alt}_b38 (GRCh38), which is different from rs IDs
- Use tissueSiteDetailId (e.g., Whole_Blood), not display names, for API calls
- The slope field is the effect of the alternative allele; positive = higher expression with the alt allele

For genome-wide analyses, download full summary statistics rather than using the API:
```bash
# All significant eQTLs (v10)
wget https://storage.googleapis.com/adult-gtex/bulk-qtl/v10/single-tissue-cis-qtl/GTEx_Analysis_v10_eQTL.tar

# Normalized expression matrices
wget https://storage.googleapis.com/adult-gtex/bulk-gex/v10/rna-seq/GTEx_Analysis_v10_RNASeQCv2.4.2_gene_reads.gct.gz
```
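Once downloaded, the `.gct.gz` file can be read with pandas. The sketch below assumes the standard GCT 1.2 layout: a version line, a dimensions line, then a tab-separated table whose first two columns are `Name` and `Description`. `read_gct` is a hypothetical helper, and the tiny file written here is a made-up stand-in so the example runs without the real download.

```python
import gzip
import os
import tempfile

import pandas as pd


def read_gct(path):
    """Read a GCT file into a DataFrame indexed by the Name column."""
    # Skip the version line and the dimensions line; pandas infers gzip
    # compression from the .gz extension.
    return pd.read_csv(path, sep="\t", skiprows=2, index_col="Name")


# Made-up miniature GCT file for illustration (not GTEx data).
toy_gct = (
    "#1.2\n"
    "2\t2\n"
    "Name\tDescription\tSAMPLE_1\tSAMPLE_2\n"
    "ENSG_TOY_1.1\tGENE_A\t10\t0\n"
    "ENSG_TOY_2.1\tGENE_B\t5\t7\n"
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "toy_gene_reads.gct.gz")
    with gzip.open(path, "wt") as handle:
        handle.write(toy_gct)
    demo = read_gct(path)

print(demo)
```

The real matrix is large, so for a handful of samples consider passing `usecols` to `pd.read_csv` to limit what is loaded into memory.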