skills/interproscan-domain-analysis/SKILL.md
Analyze protein sequences using InterProScan to identify functional domains, protein families, and Gene Ontology (GO) annotations.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add InternScience/scp interproscan-domain-analysis
Use the same BioInfoToolsClient class as defined in the protein-blast-search skill.
This workflow analyzes protein sequences using InterProScan to identify functional domains, protein families, binding sites, and associated Gene Ontology annotations.
Workflow Steps:
Implementation:
from datetime import timedelta

## Initialize client
client = BioInfoToolsClient(
    "https://scp.intern-ai.org.cn/api/v1/mcp/17/BioInfo-Tools",
    "<your-api-key>"
)
if not await client.connect():
    print("connection failed")
    exit()

## Input: Protein sequence to analyze
protein_sequence = """
MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH
"""

## Step 1 & 2: Run InterProScan analysis
result = await client.session.call_tool(
    "interproscan_analyze",
    arguments={
        "sequence": protein_sequence.strip(),
        "sequence_id": "HBB_HUMAN",  # Optional identifier
        "databases": ["Pfam"],  # Signature databases to use
        "goterms": True  # Include GO term annotations
    },
    read_timeout_seconds=timedelta(seconds=900)  # Allow up to 15 minutes
)

## Step 3: Parse and display results
result_data = client.parse_result(result)
if result_data.get("success"):
    results = result_data.get("results", {})
    domains = results.get("domains", [])
    go_terms = results.get("go_terms", [])
    print(f"✅ InterProScan analysis completed successfully")
    print(f"Execution time: {result_data.get('time_seconds', '?')} seconds")
    print(f"Domains found: {len(domains)}")
    print(f"GO annotations: {len(go_terms)}\n")

    # Display domain information
    if domains:
        print("=== Functional Domains ===\n")
        for i, domain in enumerate(domains, 1):
            print(f"{i}. {domain.get('name', 'N/A')}")
            print(f" Accession: {domain.get('accession', 'N/A')}")
            print(f" Database: {domain.get('database', 'N/A')}")
            if domain.get('description'):
                print(f" Description: {domain.get('description')}")
            # Display domain locations
            locations = domain.get('locations', [])
            if locations:
                print(f" Locations:")
                for loc in locations:
                    print(f" - Position {loc.get('start')}-{loc.get('end')} aa")
                    if loc.get('score'):
                        print(f" Score: {loc.get('score')}")
            print()

    # Display GO annotations
    if go_terms:
        print("=== Gene Ontology Annotations ===\n")
        # Group by category
        by_category = {}
        for go in go_terms:
            category = go.get('category', 'UNKNOWN')
            if category not in by_category:
                by_category[category] = []
            by_category[category].append(go)
        for category, terms in by_category.items():
            print(f"{category}:")
            for go in terms:
                print(f" - {go.get('id', 'N/A')}: {go.get('name', 'N/A')}")
            print()
else:
    print(f"❌ InterProScan analysis failed: {result_data.get('error', 'Unknown error')}")

await client.disconnect()
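The result handling in Step 3 can be tried without a server connection. The following is a minimal sketch, assuming the `results` dictionary has the `domains` / `locations` / `go_terms` shape used in the code above. The helper names `group_go_terms` and `domain_coverage`, and all sample values, are hypothetical placeholders for illustration and are not part of the BioInfo-Tools API.

```python
from collections import defaultdict


def group_go_terms(go_terms):
    """Group GO annotation dicts by their 'category' field."""
    grouped = defaultdict(list)
    for term in go_terms:
        grouped[term.get("category", "UNKNOWN")].append(term)
    return dict(grouped)


def domain_coverage(domains, sequence_length):
    """Fraction of residues covered by at least one domain location.

    Assumes 1-based, inclusive start/end positions (an assumption;
    the source only says "amino acid number").
    """
    if sequence_length <= 0:
        return 0.0
    covered = set()
    for domain in domains:
        for loc in domain.get("locations", []):
            start, end = loc.get("start"), loc.get("end")
            if start is None or end is None:
                continue  # skip incomplete location records
            lo, hi = max(1, start), min(sequence_length, end)
            covered.update(range(lo, hi + 1))
    return len(covered) / sequence_length


# Placeholder data shaped like the parsed results; not real InterProScan output.
sample_results = {
    "domains": [
        {"name": "example_domain_a", "locations": [{"start": 1, "end": 50}]},
        {"name": "example_domain_b", "locations": [{"start": 41, "end": 100}]},
    ],
    "go_terms": [
        {"id": "GO:0000001", "name": "placeholder term 1", "category": "MOLECULAR_FUNCTION"},
        {"id": "GO:0000002", "name": "placeholder term 2", "category": "BIOLOGICAL_PROCESS"},
        {"id": "GO:0000003", "name": "placeholder term 3", "category": "MOLECULAR_FUNCTION"},
    ],
}

grouped = group_go_terms(sample_results["go_terms"])
coverage = domain_coverage(sample_results["domains"], sequence_length=200)
print({category: len(terms) for category, terms in grouped.items()})
print(f"Domain coverage: {coverage:.0%}")
```

Collecting covered positions in a set keeps overlapping domain hits from being counted twice, which is sufficient for single-protein inputs of this size.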
BioInfo-Tools Server:
- interproscan_analyze: Analyze protein sequence using InterProScan
  Parameters:
  - sequence (str): Protein sequence in amino acid single-letter code
  - sequence_id (str, optional): Identifier for the query sequence
  - databases (list, optional): Signature databases to query (default: ["Pfam"])
  - goterms (bool, optional): Include GO term annotations (default: True)
  Returns:
  - success (bool): Whether analysis completed successfully
  - results (dict): Analysis results containing domains and GO terms
  - time_seconds (float): Execution time

Input:
- sequence: Protein sequence (amino acid single-letter code)
- sequence_id: Optional identifier for the query
- databases: List of signature databases (e.g., ["Pfam", "SMART", "PRINTS"])
- goterms: Whether to include Gene Ontology annotations

Output:
- domains: List of identified protein domains, each containing:
  - name: Domain or family name
  - accession: Database accession number
  - database: Source database (e.g., "PFAM", "SMART")
  - description: Functional description
  - locations: List of domain positions in the sequence
    - start: Start position (amino acid number)
    - end: End position (amino acid number)
    - score: Match score (if available)
- go_terms: List of GO annotations, each containing:
  - id: GO identifier (e.g., "GO:0020037")
  - name: GO term name
  - category: GO category (MOLECULAR_FUNCTION, BIOLOGICAL_PROCESS, or CELLULAR_COMPONENT)

InterProScan integrates multiple signature databases:
Default: ["Pfam"] for fastest results
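Because a run can take several minutes, it may help to check the input locally before submitting it. The following is a minimal sketch, assuming the tool expects a bare single-letter sequence and that the parameter defaults are as listed above. `clean_sequence`, `build_arguments`, and the `VALID_RESIDUES` set (the 20 standard residues plus the IUPAC codes B, J, O, U, X, Z) are hypothetical helpers for illustration; which residue codes the server actually accepts is not stated in this document.

```python
# Assumption: 20 standard amino acids plus IUPAC extended/ambiguity codes.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY" "BJOUXZ")


def clean_sequence(raw):
    """Drop FASTA header lines and whitespace, uppercase, and validate residue codes."""
    lines = [line.strip() for line in raw.strip().splitlines()]
    joined = "".join(line for line in lines if line and not line.startswith(">"))
    sequence = "".join(joined.split()).upper()  # remove any inner whitespace
    if not sequence:
        raise ValueError("empty sequence")
    invalid = sorted(set(sequence) - VALID_RESIDUES)
    if invalid:
        raise ValueError(f"invalid residue codes: {''.join(invalid)}")
    return sequence


def build_arguments(raw_sequence, sequence_id=None, databases=None, goterms=True):
    """Build the arguments dict for interproscan_analyze using the documented defaults."""
    arguments = {
        "sequence": clean_sequence(raw_sequence),
        "databases": list(databases) if databases else ["Pfam"],
        "goterms": goterms,
    }
    if sequence_id:
        arguments["sequence_id"] = sequence_id  # optional, so only set when given
    return arguments


example_arguments = build_arguments(">q1\nmvhl tpee\nKSAV\n", sequence_id="q1")
print(example_arguments)
```

The returned dictionary can be passed as the `arguments` value of the `call_tool` request shown in the implementation; passing additional databases such as `["Pfam", "SMART", "PRINTS"]` overrides the default.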