public/SKILLS/Scientific & Research Tools/ena-database/SKILL.md
Access European Nucleotide Archive via API/FTP. Retrieve DNA/RNA sequences, raw reads (FASTQ), genome assemblies by accession, for genomics and bioinformatics pipelines. Supports multiple formats.
The European Nucleotide Archive (ENA) is a comprehensive public repository for nucleotide sequence data and associated metadata. Access and query DNA/RNA sequences, raw reads, genome assemblies, and functional annotations through REST APIs and FTP for genomics and bioinformatics pipelines.
This skill should be used when:
ENA organizes data into hierarchical object types:
Studies/Projects - Group related data and control release dates. Studies are the primary unit for citing archived data.
Samples - Represent units of biomaterial from which sequencing libraries were produced. Samples must be registered before submitting most data types.
Raw Reads - Consist of:
Assemblies - Genome, transcriptome, metagenome, or metatranscriptome assemblies at various completion levels.
Sequences - Assembled and annotated sequences stored in the EMBL Nucleotide Sequence Database, including coding/non-coding regions and functional annotations.
Analyses - Results from computational analyses of sequence data.
Taxonomy Records - Taxonomic information including lineage and rank.
ENA provides multiple REST APIs for data access. Consult references/api_reference.md for detailed endpoint documentation.
Key APIs:
ENA Portal API - Advanced search functionality across all ENA data types
ENA Browser API - Direct retrieval of records and metadata
ENA Taxonomy REST API - Query taxonomic information
ENA Cross Reference Service - Access related records from external databases
CRAM Reference Registry - Retrieve reference sequences
Rate Limiting: All APIs have a rate limit of 50 requests per second. Exceeding this returns HTTP 429 (Too Many Requests).
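One way to cope with the 429 response described above is to retry with exponential backoff. The sketch below is a minimal illustration, not an ENA-provided client: `backoff_delay` and `get_with_retry` are hypothetical helper names, and the base delay, cap, and attempt count are arbitrary assumptions. The request itself is passed in as a callable so the retry logic stays independent of the HTTP library.

```python
import time

MAX_REQUESTS_PER_SECOND = 50  # limit quoted in the text above


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Seconds to wait before retry number `attempt` (0-based), doubling each time."""
    return min(cap, base * (2 ** attempt))


def get_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a response whose status is not 429.

    `send` is any zero-argument callable returning an object with a
    `status_code` attribute (hypothetical interface, matching what
    `requests.get` returns).
    """
    min_interval = 1.0 / MAX_REQUESTS_PER_SECOND
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        # Too Many Requests: wait before retrying, never less than the
        # spacing implied by the stated rate limit.
        sleep(max(min_interval, backoff_delay(attempt)))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

With `requests`, a call might look like `get_with_retry(lambda: requests.get(url, params=params, timeout=30))`.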
Browser-Based Search:
Programmatic Queries:
Example API Query Pattern:
import requests

# Search for samples from a specific study
base_url = "https://www.ebi.ac.uk/ena/portal/api/search"
params = {
    "result": "sample",
    "query": "study_accession=PRJEB1234",
    "format": "json",
    "limit": 100,
}
response = requests.get(base_url, params=params, timeout=30)
response.raise_for_status()  # raise on HTTP errors such as 429
samples = response.json()  # parsed JSON body
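The query pattern above asks for at most 100 rows. For larger result sets, one option is to page through the results. The sketch below assumes the Portal API accepts an `offset` parameter alongside `limit`; check the endpoint documentation before relying on it. `iter_records` and `fetch_page` are hypothetical names used only for illustration.

```python
def iter_records(fetch_page, page_size=100):
    """Yield records page by page until a short or empty page signals the end.

    `fetch_page(limit, offset)` is a hypothetical callable that returns a list
    of records, for example by calling the Portal API search endpoint with
    `limit` and `offset` added to the query parameters (an assumption).
    """
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        yield from page
        if len(page) < page_size:
            # A page shorter than requested means there is nothing left.
            return
        offset += page_size
```

A `fetch_page` built on the example above would copy `params`, set `limit` and `offset`, and return `response.json()`.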
Metadata Formats:
Sequence Data:
Download Methods:
Retrieve a raw-read run record by accession:
# Fetch the run's XML metadata record using the Browser API
accession = "ERR123456"
url = f"https://www.ebi.ac.uk/ena/browser/api/xml/{accession}"
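The Browser API URL above returns the run's XML record rather than the read files themselves. A common way to locate the FASTQ files is the Portal API file report, which is not covered in the text above, so treat the endpoint and field names here (`filereport`, `read_run`, `fastq_ftp`, `fastq_md5`) as assumptions to verify against the API reference. The sketch also assumes `fastq_ftp` holds semicolon-separated paths without a URL scheme. `filereport_url` and `fastq_urls` are hypothetical helper names.

```python
from urllib.parse import urlencode

# Assumed endpoint; not named in the text above.
FILEREPORT_URL = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession, fields=("run_accession", "fastq_ftp", "fastq_md5")):
    """Build a file-report URL for a run accession."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "json",
    }
    return f"{FILEREPORT_URL}?{urlencode(params)}"


def fastq_urls(record, scheme="ftp"):
    """Split a record's `fastq_ftp` value into one full URL per file.

    Assumes the value is a semicolon-separated list of scheme-less paths.
    """
    raw = record.get("fastq_ftp") or ""
    return [f"{scheme}://{path}" for path in raw.split(";") if path]
```

Paired-end runs would be expected to yield two URLs, single-end runs one; a record with no FASTQ files yields an empty list.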
Search for all samples in a study:
# Use Portal API to list samples
study_id = "PRJNA123456"
url = f"https://www.ebi.ac.uk/ena/portal/api/search?result=sample&query=study_accession={study_id}&format=tsv"
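Because the URL above requests `format=tsv`, the response body is tab-separated text with a header row. A minimal sketch of turning that text into dictionaries with the standard library follows; `parse_tsv` is a hypothetical helper, and the column names in the usage note are illustrative rather than a guaranteed set of returned fields.

```python
import csv
import io


def parse_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)
```

For a response fetched with `requests`, `parse_tsv(response.text)` would give one dictionary per sample, keyed by whatever columns the API returned (for example a `sample_accession` column, if present).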
Find assemblies for a specific organism:
# Search assemblies by NCBI taxonomy ID (tax_tree also matches descendant taxa)
taxon_id = 562  # Escherichia coli
url = f"https://www.ebi.ac.uk/ena/portal/api/search?result=assembly&query=tax_tree({taxon_id})&format=json"
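Queries that contain parentheses, quotes, or spaces are safer to send percent-encoded than to paste into an f-string. The sketch below builds a Portal API search URL with `urllib.parse.urlencode`; `portal_search_url` is a hypothetical helper, and the optional `fields` parameter is an assumption not shown in the text above.

```python
from urllib.parse import urlencode

PORTAL_SEARCH = "https://www.ebi.ac.uk/ena/portal/api/search"


def portal_search_url(result, query, fmt="json", fields=None, limit=None):
    """Return a search URL with every parameter percent-encoded."""
    params = {"result": result, "query": query, "format": fmt}
    if fields:
        params["fields"] = ",".join(fields)  # assumed optional parameter
    if limit is not None:
        params["limit"] = limit
    return f"{PORTAL_SEARCH}?{urlencode(params)}"
```

For example, `portal_search_url("assembly", "tax_tree(562)", limit=10)` encodes the parentheses in the query. Passing a `params` dict to `requests.get`, as in the earlier query pattern, achieves the same encoding.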
Get taxonomic lineage:
# Query taxonomy API
taxon_id = "562" # E. coli
url = f"https://www.ebi.ac.uk/ena/taxonomy/rest/tax-id/{taxon_id}"
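Once the taxonomy record is fetched, the lineage can be split into individual names. The sketch below assumes the endpoint returns JSON with a `lineage` string whose entries are separated by semicolons; that response shape is an assumption to confirm against the API reference. `lineage_names` is a hypothetical helper.

```python
def lineage_names(record):
    """Split a taxonomy record's `lineage` string into a list of names.

    Assumes `lineage` is a semicolon-separated string (possibly with a
    trailing separator); returns an empty list when the key is missing.
    """
    lineage = record.get("lineage") or ""
    return [name.strip() for name in lineage.split(";") if name.strip()]
```

With `requests`, usage might look like `lineage_names(requests.get(url, timeout=30).json())`.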
Bulk Download Pattern:
BLAST Integration: Integrate with EBI's NCBI BLAST service (REST/SOAP API) for sequence similarity searches against ENA sequences.
Rate Limiting:
Data Citation:
API Response Handling:
Performance:
This skill includes detailed reference documentation for working with ENA:
api_reference.md - Comprehensive API endpoint documentation including:
Load this reference when constructing complex API queries, debugging API responses, or looking up specific parameter details.