skills/create-tooluniverse-skill/SKILL.md
Create high-quality ToolUniverse skills following test-driven, implementation-agnostic methodology. Integrates tools from ToolUniverse's 1,264+ tool library, creates missing tools when needed using devtu-create-tool, tests thoroughly, and produces skills with Python SDK + MCP support. Use when asked to create new ToolUniverse skills, build research workflows, or develop domain-specific analysis capabilities for biology, chemistry, or medicine.
npx skillsauth add Zaoqu-Liu/ScienceClaw create-tooluniverse-skillInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Systematic workflow for creating production-ready ToolUniverse skills that integrate multiple scientific tools, follow implementation-agnostic standards, and achieve 100% test coverage.
CRITICAL - Read devtu-optimize-skills: This skill builds on principles from devtu-optimize-skills. Invoke that skill first or review its 10 pillars:
operation parameter where neededUse this skill when:
Phase 1: Domain Analysis →
Phase 2: Tool Discovery & Testing →
Phase 3: Tool Creation (if needed) →
Phase 4: Implementation →
Phase 5: Documentation →
Phase 6: Testing & Validation →
Phase 7: Packaging
Time per skill: ~1.5-2 hours (tested and documented)
Objective: Understand the scientific domain and identify required capabilities
Duration: 15 minutes
Gather concrete examples of how the skill will be used:
Example queries to clarify:
List what the skill needs to work with:
Break the workflow into logical phases:
Example - Metabolomics Skill:
Example - Cancer Genomics Skill:
Check existing ToolUniverse skills for relevant patterns:
skills/ directoryUseful related skills:
tooluniverse-systems-biology - Multi-database integrationtooluniverse-target-research - Comprehensive profilingtooluniverse-drug-research - Pharmaceutical workflowsObjective: Find and verify tools needed for each analysis phase
Duration: 30-45 minutes
CRITICAL: Following test-driven development from devtu-optimize-skills
Tool locations: /src/tooluniverse/data/*.json (186 tool files)
Search strategies:
grep -r "keyword" src/tooluniverse/data/*.jsonExample searches:
# Find metabolomics tools
grep -l "metabol" src/tooluniverse/data/*.json
# Find cancer-related tools
grep -l "cancer\|tumor\|oncology" src/tooluniverse/data/*.json
# Find single-cell tools
grep -l "single.cell\|scRNA" src/tooluniverse/data/*.json
For each relevant tool file, read to understand:
Check for:
operation parameter)NEVER skip this step - Testing before documentation is the #1 lesson
Template: See scripts/test_tools_template.py
Test script structure:
#!/usr/bin/env python3
"""
Test script for [Domain] tools
Following TDD: test ALL tools BEFORE creating skill documentation
"""
from tooluniverse import ToolUniverse
import json
def test_database_1():
"""Test Database 1 tools"""
tu = ToolUniverse()
tu.load_tools()
# Test tool 1
result = tu.tools.TOOL_NAME(param="value")
print(f"Status: {result.get('status')}")
# Verify response format
# Check data structure
def test_database_2():
"""Test Database 2 tools"""
# Similar structure
def main():
"""Run all tests"""
tests = [
("Database 1", test_database_1),
("Database 2", test_database_2),
]
results = {}
for name, test_func in tests:
try:
test_func()
results[name] = "✅ PASS"
except Exception as e:
results[name] = f"❌ FAIL: {e}"
# Print summary
for name, result in results.items():
print(f"{name}: {result}")
if __name__ == "__main__":
main()
Execute:
python test_[domain]_tools.py
Document discoveries:
operationCreate parameter corrections table: | Tool | Common Mistake | Correct Parameter | Evidence | |------|----------------|-------------------|----------| | TOOL_NAME | assumed_param | actual_param | Test result |
Objective: Create missing tools using devtu-create-tool
Duration: 30-60 minutes per tool (if needed)
When to create tools:
When NOT to create tools:
If tools are missing, invoke the devtu-create-tool skill:
Example invocation:
"I need to create a tool for [database/API]. The API endpoint is [URL],
it takes parameters [list], and returns [format]. Can you help create this tool?"
After creating tools, add them to test script:
def test_new_tool():
"""Test newly created tool"""
tu = ToolUniverse()
tu.load_tools()
result = tu.tools.NEW_TOOL_NAME(param="test_value")
assert result.get('status') == 'success', "New tool failed"
# Verify data structure matches expectations
If new tools fail tests, invoke devtu-fix-tool:
"The tool [TOOL_NAME] is returning errors: [error message].
Can you help fix it?"
Objective: Create working Python pipeline with tested tools
Duration: 30-45 minutes
CRITICAL: Build from tested tools, not assumptions
mkdir -p skills/tooluniverse-[domain-name]
cd skills/tooluniverse-[domain-name]
Template: See assets/skill_template/python_implementation.py
Structure:
#!/usr/bin/env python3
"""
[Domain Name] - Python SDK Implementation
Tested implementation following TDD principles
"""
from tooluniverse import ToolUniverse
from datetime import datetime
def domain_analysis_pipeline(
input_param_1=None,
input_param_2=None,
output_file=None
):
"""
[Domain] analysis pipeline.
Args:
input_param_1: Description
input_param_2: Description
output_file: Output markdown file path
Returns:
Path to generated report file
"""
tu = ToolUniverse()
tu.load_tools()
# Generate output filename
if output_file is None:
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
output_file = f"domain_analysis_{timestamp}.md"
# Initialize report
report = []
report.append("# [Domain] Analysis Report\n")
report.append(f"**Generated**: {datetime.now()}\n\n")
# Phase 1: [Description]
report.append("## 1. [Phase Name]\n")
try:
result = tu.tools.TOOL_NAME(param=value)
if result.get('status') == 'success':
data = result.get('data', [])
# Process and add to report
else:
report.append("*[Phase] data unavailable.*\n")
except Exception as e:
report.append(f"*Error in [Phase]: {str(e)}*\n")
# Phase 2, 3, 4... (similar structure)
# Write report
with open(output_file, 'w') as f:
f.write(''.join(report))
print(f"\n✅ Report generated: {output_file}")
return output_file
if __name__ == "__main__":
# Example usage
domain_analysis_pipeline(
input_param_1="example",
output_file="example_analysis.md"
)
Key principles:
Template: See assets/skill_template/test_skill.py
Test cases:
#!/usr/bin/env python3
"""Test script for [Domain] skill"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from python_implementation import domain_analysis_pipeline
def test_basic_analysis():
"""Test basic analysis workflow"""
output = domain_analysis_pipeline(
input_param_1="test_value",
output_file="test_basic.md"
)
assert os.path.exists(output), "Output file not created"
def test_combined_analysis():
"""Test with multiple inputs"""
output = domain_analysis_pipeline(
input_param_1="value1",
input_param_2="value2",
output_file="test_combined.md"
)
# Verify report has all sections
with open(output, 'r') as f:
content = f.read()
assert "Phase 1" in content
assert "Phase 2" in content
def main():
tests = [
("Basic Analysis", test_basic_analysis),
("Combined Analysis", test_combined_analysis),
]
results = {}
for name, test_func in tests:
try:
test_func()
results[name] = "✅ PASS"
except Exception as e:
results[name] = f"❌ FAIL: {e}"
for name, result in results.items():
print(f"{name}: {result}")
all_passed = all("PASS" in r for r in results.values())
return 0 if all_passed else 1
if __name__ == "__main__":
sys.exit(main())
python test_skill.py
Fix bugs until 100% pass rate
Objective: Create implementation-agnostic documentation
Duration: 30-45 minutes
CRITICAL: SKILL.md must have ZERO Python/MCP specific code
Template: See assets/skill_template/SKILL.md
Structure:
---
name: tooluniverse-[domain-name]
description: [What it does]. [Capabilities]. [Databases used]. Use when [triggers].
---
# [Domain Name] Analysis
[One paragraph overview]
## When to Use This Skill
**Triggers**:
- "Analyze [domain] for [input]"
- "Find [data type] related to [query]"
- "[Domain-specific action]"
**Use Cases**:
1. [Use case 1 with description]
2. [Use case 2 with description]
## Core Databases Integrated
| Database | Coverage | Strengths |
|----------|----------|-----------|
| **Database 1** | [Scope] | [What it's good for] |
| **Database 2** | [Scope] | [What it's good for] |
## Workflow Overview
Input → Phase 1 → Phase 2 → Phase 3 → Report
---
## Phase 1: [Phase Name]
**When**: [Conditions]
**Objective**: [What this phase achieves]
### Tools Used
**TOOL_NAME**:
- **Input**:
- `parameter1`: Description
- `parameter2`: Description
- **Output**: Description
- **Use**: What it's used for
### Workflow
1. [Step 1]
2. [Step 2]
3. [Step 3]
### Decision Logic
- **Condition 1**: Action to take
- **Empty results**: How to handle
- **Errors**: Fallback strategy
---
## Phase 2, 3, 4... (similar structure)
---
## Output Structure
[Description of report format]
### Report Format
**Required Sections**:
1. Header with parameters
2. Phase 1 results
3. Phase 2 results
...
---
## Tool Parameter Reference
**Critical Parameter Notes** (from testing):
| Tool | Parameter | CORRECT Name | Common Mistake |
|------|-----------|--------------|----------------|
| TOOL_NAME | `param` | ✅ `actual_param` | ❌ `assumed_param` |
**Response Format Notes**:
- **TOOL_1**: Returns [format]
- **TOOL_2**: Returns [format]
---
## Fallback Strategies
[Document Primary → Fallback → Default for critical tools]
---
## Limitations & Known Issues
### Database-Specific
- **Database 1**: [Limitations]
- **Database 2**: [Limitations]
### Technical
- **Response formats**: [Notes]
- **Rate limits**: [If any]
---
## Summary
[Domain] skill provides:
1. ✅ [Capability 1]
2. ✅ [Capability 2]
**Outputs**: [Description]
**Best for**: [Use cases]
Key principles:
Template: See assets/skill_template/QUICK_START.md
Structure:
## Quick Start: [Domain] Analysis
[One paragraph overview]
---
## Choose Your Implementation
### Python SDK
#### Option 1: Complete Pipeline (Recommended)
```python
from skills.tooluniverse_[domain].python_implementation import domain_pipeline
# Example 1
domain_pipeline(
input_param="value",
output_file="analysis.md"
)
from tooluniverse import ToolUniverse
tu = ToolUniverse()
tu.load_tools()
# Tool 1
result = tu.tools.TOOL_NAME(param="value")
# Tool 2
result = tu.tools.TOOL_NAME2(param="value")
"Analyze [domain] for [input]"
"Find [data] related to [query]"
{
"tool": "TOOL_NAME",
"parameters": {
"param": "value"
}
}
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| param1 | string | Yes | [Description] |
| param2 | integer | No | [Description] |
Python SDK:
[Code example]
MCP:
[Conversational example]
[Show example report structure]
Solution: [Fix]
After running this skill:
**Key principles**:
- Equal treatment of Python SDK and MCP
- Concrete examples for both
- Parameter table applies to both
- Clear recipes
---
## Phase 6: Testing & Validation
**Objective**: Verify skill meets all quality standards through comprehensive testing
**Duration**: 30-45 minutes
**CRITICAL**: This phase is MANDATORY - no skill is complete without passing comprehensive tests
### 6.1 Create Comprehensive Test Suite
**File**: `test_skill_comprehensive.py`
**CRITICAL**: Test ALL use cases from documentation + edge cases
**Test structure**:
```python
#!/usr/bin/env python3
"""
Comprehensive Test Suite for [Domain] Skill
Tests all use cases from SKILL.md + edge cases
"""
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__)))
from python_implementation import skill_function
from tooluniverse import ToolUniverse
def test_1_use_case_from_skill_md():
"""Test Case 1: [Use case name from SKILL.md]"""
print("\n" + "="*80)
print("TEST 1: [Use Case Name]")
print("="*80)
print("Expected: [What should happen]")
tu = ToolUniverse()
result = skill_function(tu=tu, input_param="value")
# Validation
assert isinstance(result, ExpectedType), "Should return expected type"
assert result.field_name is not None, "Should have required field"
print(f"\n✅ PASS: [What passed]")
return result
def test_2_documentation_accuracy():
"""Test Case 2: QUICK_START.md example works exactly as documented"""
print("\n" + "="*80)
print("TEST 2: Documentation Accuracy")
print("="*80)
# Exact copy-paste from QUICK_START.md
tu = ToolUniverse()
result = skill_function(tu=tu, param="value") # From docs
# Verify documented attributes exist
assert hasattr(result, 'documented_field'), "Doc says this field exists"
print(f"\n✅ PASS: Documentation examples work")
return result
def test_3_edge_case_invalid_input():
"""Test Case 3: Error handling with invalid inputs"""
print("\n" + "="*80)
print("TEST 3: Error Handling")
print("="*80)
tu = ToolUniverse()
result = skill_function(tu=tu, input_param="INVALID")
# Should handle gracefully, not crash
assert isinstance(result, ExpectedType), "Should still return result"
assert len(result.warnings) > 0, "Should have warnings"
print(f"\n✅ PASS: Handled invalid input gracefully")
return result
def test_4_result_structure():
"""Test Case 4: Result structure matches documentation"""
print("\n" + "="*80)
print("TEST 4: Result Structure")
print("="*80)
tu = ToolUniverse()
result = skill_function(tu=tu, input_param="value")
# Check all documented fields
required_fields = ['field1', 'field2', 'field3']
for field in required_fields:
assert hasattr(result, field), f"Missing field: {field}"
print(f"\n✅ PASS: All documented fields present")
return result
def test_5_parameter_validation():
"""Test Case 5: All documented parameters work"""
print("\n" + "="*80)
print("TEST 5: Parameter Validation")
print("="*80)
tu = ToolUniverse()
result = skill_function(
tu=tu,
param1="value1", # All documented params
param2="value2",
param3=True
)
assert isinstance(result, ExpectedType), "Should work with all params"
print(f"\n✅ PASS: All parameters accepted")
return result
def run_all_tests():
"""Run all tests and generate report"""
print("\n" + "="*80)
print("[DOMAIN] SKILL - COMPREHENSIVE TEST SUITE")
print("="*80)
tests = [
("Use Case 1", test_1_use_case_from_skill_md),
("Documentation Accuracy", test_2_documentation_accuracy),
("Error Handling", test_3_edge_case_invalid_input),
("Result Structure", test_4_result_structure),
("Parameter Validation", test_5_parameter_validation),
]
results = {}
passed = 0
failed = 0
for test_name, test_func in tests:
try:
result = test_func()
results[test_name] = {"status": "PASS", "result": result}
passed += 1
except Exception as e:
results[test_name] = {"status": "FAIL", "error": str(e)}
failed += 1
print(f"\n❌ FAIL: {test_name}")
print(f" Error: {e}")
# Summary
print("\n" + "="*80)
print("TEST SUMMARY")
print("="*80)
print(f"✅ Passed: {passed}/{len(tests)}")
print(f"❌ Failed: {failed}/{len(tests)}")
print(f"📊 Success Rate: {passed/len(tests)*100:.1f}%")
if failed == 0:
print("\n🎉 ALL TESTS PASSED! Skill is production-ready.")
else:
print("\n⚠️ Some tests failed. Review errors above.")
return passed, failed
if __name__ == "__main__":
passed, failed = run_all_tests()
sys.exit(0 if failed == 0 else 1)
Requirements:
cd skills/tooluniverse-[domain]
python test_skill_comprehensive.py > test_output.txt 2>&1
Requirements:
File: SKILL_TESTING_REPORT.md
Content:
# [Domain] Skill - Testing Report
**Date**: [Date]
**Status**: ✅ PASS / ❌ FAIL
**Success Rate**: X/Y tests passed (100%)
## Executive Summary
[Brief summary of testing results]
## Test Results
### Test 1: [Use Case Name] ✅
**Use Case**: [Description from SKILL.md]
**Result**: [What happened]
**Validation**: [What was verified]
### Test 2-5: [Similar format]
## Quality Metrics
- **Code quality**: [Assessment]
- **Documentation accuracy**: [All examples work / Issues found]
- **Robustness**: [Error handling assessment]
- **User experience**: [Assessment]
## Recommendation
✅ PRODUCTION-READY / ❌ NEEDS FIXES
Implementation & Testing:
operation parameter (if applicable)Documentation:
Quality:
Test with fresh environment:
CRITICAL: If documentation example doesn't work, fix EITHER:
NEVER release with documentation that doesn't work
Objective: Create distributable skill and summary
Duration: 15 minutes
File: NEW_SKILL_[DOMAIN].md
Content:
# New Skill Created: [Domain] Analysis
**Date**: [Date]
**Status**: ✅ COMPLETE AND TESTED
## Overview
[Brief description]
### Key Features
- [Feature 1]
- [Feature 2]
## Skill Details
**Location**: `skills/tooluniverse-[domain]/`
**Files Created**:
1. ✅ `python_implementation.py` ([lines]) - Working pipeline
2. ✅ `SKILL.md` ([lines]) - Implementation-agnostic docs
3. ✅ `QUICK_START.md` ([lines]) - Multi-implementation guide
4. ✅ `test_skill.py` ([lines]) - Test suite
**Total**: [total lines]
## Tools Integrated
- **Database 1**: [X tools]
- **Database 2**: [Y tools]
**Total**: [N tools]
## Test Results
✅ Test 1 - PASS ✅ Test 2 - PASS ✅ Test 3 - PASS
100% test pass rate
## Capabilities
[List key capabilities]
## Impact & Value
[Describe impact]
## Next Steps
[Suggest enhancements]
If creating multiple skills in session, update tracking document with:
From devtu-optimize-skills - CRITICAL
| Pattern | Don't Assume | Always Verify |
|---------|--------------|---------------|
| Function name includes param name | drugbank_get_drug_by_name(name=...) | Test reveals uses query |
| Descriptive function name | map_uniprot_to_pathways(uniprot_id=...) | Test reveals uses id |
| Consistent naming | All similar functions use same param | Each tool may differ |
Indicators:
operation in schemaFix: Add operation parameter as first argument with method name
Standard: {status: "success", data: [...]}
Direct list: Returns [...] without wrapper
Direct dict: Returns {field1: ..., field2: ...} without status
Solution: Handle all three in implementation with isinstance() checks
Extended Reference: For detailed tool tables, examples, and templates, read
REFERENCE.mdin this skill directory. The agent can access it via:read skills/create-tooluniverse-skill/REFERENCE.md
testing
Therapeutics Data Commons. AI-ready drug discovery datasets (ADME, toxicity, DTI), benchmarks, scaffold splits, molecular oracles, for therapeutic ML and pharmacological prediction.
tools
Genomic file toolkit. Read/write SAM/BAM/CRAM alignments, VCF/BCF variants, FASTA/FASTQ sequences, extract regions, calculate coverage, for NGS data processing pipelines.
development
Complete mass spectrometry analysis platform. Use for proteomics workflows feature detection, peptide identification, protein quantification, and complex LC-MS/MS pipelines. Supports extensive file formats and algorithms. Best for proteomics, comprehensive MS data processing. For simple spectral comparison and metabolite ID use matchms.
development
Multi-objective optimization framework. NSGA-II, NSGA-III, MOEA/D, Pareto fronts, constraint handling, benchmarks (ZDT, DTLZ), for engineering design and optimization problems.