skills/43-wentorai-research-plugins/skills/domains/biomedical/pdb-structure-api/SKILL.md
Search and retrieve 3D protein structures from the RCSB Protein Data Bank
npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research pdb-structure-apiInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
The RCSB Protein Data Bank (PDB) is the single global archive for experimentally determined 3D structures of biological macromolecules. It hosts over 200,000 structures resolved by X-ray crystallography, cryo-EM, NMR spectroscopy, and other methods. Each entry includes atomic coordinates, experimental metadata, polymer sequences, bound ligands, and literature references.
Two complementary APIs are available. The Data API (data.rcsb.org) serves structured entry metadata, polymer entities, and chemical components via RESTful GET endpoints. The Search API (search.rcsb.org) supports full-text, attribute-based, sequence similarity, and structure similarity searches.
No authentication required. Both APIs are freely accessible without API keys, tokens, or registration.
Retrieve metadata for a structure including experimental method, resolution, citations, and bound components.
GET https://data.rcsb.org/rest/v1/core/entry/{pdb_id}curl "https://data.rcsb.org/rest/v1/core/entry/4HHB"
{
"rcsb_id": "4HHB",
"struct": {
"title": "THE CRYSTAL STRUCTURE OF HUMAN DEOXYHAEMOGLOBIN AT 1.74 ANGSTROMS RESOLUTION"
},
"exptl": [{"method": "X-RAY DIFFRACTION"}],
"rcsb_entry_info": {
"deposited_atom_count": 4779,
"molecular_weight": 64.74,
"polymer_composition": "heteromeric protein",
"polymer_entity_count_protein": 2,
"resolution_combined": [1.74],
"nonpolymer_bound_components": ["HEM"]
}
}
Retrieve protein/nucleic acid entity details including sequence, organism, and gene info.
GET https://data.rcsb.org/rest/v1/core/polymer_entity/{pdb_id}/{entity_id}curl "https://data.rcsb.org/rest/v1/core/polymer_entity/4HHB/1"
{
"entity_poly": {
"pdbx_seq_one_letter_code_can": "VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSH...",
"rcsb_entity_polymer_type": "Protein",
"rcsb_sample_sequence_length": 141,
"type": "polypeptide(L)"
},
"entity_src_gen": [{
"gene_src_common_name": "Human",
"pdbx_gene_src_scientific_name": "Homo sapiens",
"pdbx_gene_src_ncbi_taxonomy_id": "9606"
}]
}
Retrieve ligand or small molecule metadata by component ID.
GET https://data.rcsb.org/rest/v1/core/chemcomp/{comp_id}curl "https://data.rcsb.org/rest/v1/core/chemcomp/HEM"
{
"rcsb_id": "HEM",
"chem_comp": {
"formula": "C34 H32 Fe N4 O4",
"formula_weight": 616.487,
"name": "PROTOPORPHYRIN IX CONTAINING FE",
"type": "non-polymer"
}
}
Search across all PDB entries with free-text queries. Returns ranked results by relevance.
POST https://search.rcsb.org/rcsbsearch/v2/queryContent-Type: application/jsonquery.type ("terminal"), query.service ("full_text", "text", "sequence", "structure"), query.parameters.value, return_type ("entry", "polymer_entity", "assembly"), request_options.paginate.start/rowscurl -X POST "https://search.rcsb.org/rcsbsearch/v2/query" \
-H "Content-Type: application/json" \
-d '{
"query": {
"type": "terminal",
"service": "full_text",
"parameters": {"value": "hemoglobin"}
},
"return_type": "entry",
"request_options": {
"results_content_type": ["experimental"],
"paginate": {"start": 0, "rows": 3}
}
}'
{
"query_id": "6f7192a6-d65b-4ff1-9d94-37b9600a8864",
"result_type": "entry",
"total_count": 8960,
"result_set": [
{"identifier": "3GOU", "score": 1.0},
{"identifier": "6IHX", "score": 0.9995},
{"identifier": "2PGH", "score": 0.9985}
]
}
For attribute-based searches, use "service": "text" with "attribute" and "operator" fields. Combine multiple criteria with "type": "group" and "logical_operator": "and".
No formal rate limits or rate-limit headers are published. RCSB recommends reasonable request rates. For bulk data, use FTP downloads at https://files.rcsb.org/pub/pdb/ or ftp://ftp.wwpdb.org/pub/pdb/ instead of iterative API calls.
import requests
# Search for kinase inhibitor structures
search_body = {
"query": {"type": "terminal", "service": "full_text",
"parameters": {"value": "tyrosine kinase inhibitor"}},
"return_type": "entry",
"request_options": {"results_content_type": ["experimental"],
"paginate": {"start": 0, "rows": 5}}
}
results = requests.post("https://search.rcsb.org/rcsbsearch/v2/query",
json=search_body).json()
print(f"Total hits: {results['total_count']}")
# Retrieve metadata for each hit
for hit in results["result_set"]:
pdb_id = hit["identifier"]
entry = requests.get(
f"https://data.rcsb.org/rest/v1/core/entry/{pdb_id}").json()
info = entry["rcsb_entry_info"]
print(f"{pdb_id}: {entry['struct']['title'][:80]}")
print(f" Resolution: {info.get('resolution_combined', ['N/A'])[0]} A, "
f"Method: {info['experimental_method']}")
import requests
pdb_id = "4HHB"
entry = requests.get(
f"https://data.rcsb.org/rest/v1/core/entry/{pdb_id}").json()
for eid in range(1, entry["rcsb_entry_info"]["polymer_entity_count"] + 1):
entity = requests.get(
f"https://data.rcsb.org/rest/v1/core/polymer_entity/{pdb_id}/{eid}"
).json()
poly = entity["entity_poly"]
src = entity.get("rcsb_entity_source_organism", [{}])[0]
print(f"Entity {eid}: {poly['rcsb_entity_polymer_type']} "
f"({src.get('ncbi_scientific_name', 'N/A')})")
print(f" {poly['rcsb_sample_sequence_length']} residues: "
f"{poly['pdbx_seq_one_letter_code_can'][:50]}...")
tools
Show mcp-stata identity, connected tools, and status. Use when the user asks if mcp-stata is available, asks about access to the toolkit, or asks what Stata tools are connected.
tools
Activate when users mention Stata commands, .do files, regressions, econometrics, stored results, graphs, dataset inspection, replication, or Stata errors. Route the task through mcp-stata tools and the specialized research skills instead of treating it as plain text coding.
development
Build and review paper-ready regression, balance, and summary tables from Stata outputs. Use when the user needs a clean table for a draft, appendix, or coauthor share-out.
tools
Install, configure, update, or verify mcp-stata across Claude Code, Codex, Gemini CLI, Cursor, Windsurf, and VS Code. Activate when users ask to set up the Stata toolkit or troubleshoot the installation.