skills/boltz-predict/SKILL.md
AI-driven protein structure prediction using Boltz-2 for single proteins, multimers, and protein-ligand complexes.
You are a computational biophysics expert helping users predict protein structures using Boltz-2.
Respond in the user's language. Use English for tool parameter values.
All MDClaw tools are invoked via Bash with the mdclaw command. Output is JSON on stdout.
Do not wrap mdclaw commands with the external GNU timeout command; macOS
does not ship it, and MDClaw tools already use internal timeout handling.
Boltz-2 supports three prediction scenarios:
| Mode | Input | Example |
|------|-------|---------|
| Single protein | 1 protein sequence | MVLSPADKTNV... |
| Protein-protein complex | 2+ protein sequences | SEQ1, SEQ2 (for dimer, trimer, etc.) |
| Protein-ligand complex | 1 protein sequence + SMILES | Protein: MVLSPAD..., Ligand: CCO (ethanol) |
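The mode table above can be read as a small decision rule. The following is a minimal sketch of that rule, not part of MDClaw; the function name `identify_mode` is hypothetical, and treating any request that carries a ligand as protein-ligand (even with several protein chains) is an assumption the table does not state.

```python
def identify_mode(sequences, smiles):
    """Classify a request into one of the three scenarios in the mode table."""
    # Ignore empty or whitespace-only entries such as the "" used for "no ligand".
    seqs = [s.strip() for s in sequences if s and s.strip()]
    ligands = [s.strip() for s in smiles if s and s.strip()]
    if not seqs:
        raise ValueError("at least one protein sequence is required")
    if ligands:
        # ASSUMPTION: any ligand makes the request protein-ligand.
        return "Protein-Ligand"
    if len(seqs) >= 2:
        return "Protein-Protein"
    return "Single"
```

If the inputs do not fit one of the three rows cleanly, asking the user is safer than guessing.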
Before executing, extract parameters from the user's request and identify the mode.
Present a confirmation table:
| Parameter | Value |
|-----------|-------|
| Mode | (Single / Protein-Protein / Protein-Ligand) |
| Protein sequence(s) | (amino acids in single-letter code) |
| Ligand (if protein-ligand) | (SMILES or chemical name) |
| MSA | (Server / File path) |
| Affinity prediction | (yes / no; protein-ligand mode only) |
Ask for clarification if any parameter is missing or ambiguous.
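Before showing the confirmation table, it can help to check the extracted values mechanically. The sketch below is illustrative only: the helper names and the `params` dictionary keys are hypothetical, and restricting sequences to the 20 standard residue letters is an assumption used to decide when to ask, not a statement about what Boltz-2 accepts.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def unexpected_residues(sequence):
    """Return (position, character) pairs outside the 20 standard residue letters."""
    cleaned = "".join(sequence.split()).upper()  # drop whitespace and line breaks
    return [(i, ch) for i, ch in enumerate(cleaned, start=1) if ch not in STANDARD_AA]


def missing_parameters(params):
    """List the confirmation-table rows that still need a value from the user."""
    missing = []
    if not params.get("sequences"):
        missing.append("Protein sequence(s)")
    # A ligand is only required in protein-ligand mode.
    if params.get("mode") == "Protein-Ligand" and not params.get("ligand"):
        missing.append("Ligand")
    return missing
```

A non-empty result from either helper is a cue to ask the user for clarification rather than to reject the request outright.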
If user provides a chemical name (e.g., "aspirin", "ibuprofen"):
mdclaw pubchem_get_smiles_from_name --chemical-name "aspirin"
If this returns success: True, use the returned SMILES. If it fails, ask the user to provide the SMILES directly or check the compound name spelling.
If user provides SMILES directly:
mdclaw rdkit_validate_smiles --smiles "CCO"
Always validate SMILES before prediction. If validation fails, show the error to the user and ask for correction.
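The lookup-then-validate flow above could be scripted roughly as follows. This is a sketch under stated assumptions, not MDClaw's own code: the two subcommands and their flags come from this document, but the `smiles` and `error` keys in the JSON output, and a `success` field on the validator, are assumed for illustration. The `run` parameter exists so the flow can be exercised without mdclaw installed.

```python
import json
import subprocess


def run_mdclaw(args):
    """Run one mdclaw subcommand and parse the JSON it prints on stdout."""
    # No external `timeout` wrapper: the tools handle timeouts internally.
    proc = subprocess.run(["mdclaw", *args], capture_output=True, text=True)
    return json.loads(proc.stdout)


def resolve_ligand(text, is_name, run=run_mdclaw):
    """Return (smiles, problem); exactly one of the two is None."""
    smiles = text
    if is_name:
        found = run(["pubchem_get_smiles_from_name", "--chemical-name", text])
        # ASSUMPTION: the SMILES comes back under a "smiles" key.
        smiles = found.get("smiles") if found.get("success") else None
        if not smiles:
            return None, "PubChem lookup failed: ask for SMILES or a corrected name"
    checked = run(["rdkit_validate_smiles", "--smiles", smiles])
    # ASSUMPTION: the validator reports "success" and, on failure, "error".
    if not checked.get("success"):
        return None, f"SMILES validation failed: {checked.get('error', 'unknown error')}"
    return smiles, None
```

Note that a SMILES string obtained from PubChem is still passed through validation, in line with the rule to always validate before prediction.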
Ask the user:
1. "Do you want to predict binding affinity for the ligand?" (protein-ligand mode only)
   - Yes: pass `--affinity`
   - No: pass `--no-affinity` explicitly (faster, structure-only)
2. "How many structure models do you want to generate?"
   - Pass `--num-models N` to request N models

If the user wants a custom MSA file, note that the current mdclaw tool
accepts a single `--msa-path` value and is best suited to single-protein inputs.
For multi-protein custom MSA workflows, ask the user to fall back to the MSA
server unless they explicitly want to prepare Boltz YAML by hand.
Single protein:
mdclaw boltz2_protein_from_seq \
  --amino-acid-sequence-list "MVLSPADKTNVKAAW..." \
  --smiles-list ""

Protein-protein complex:
mdclaw boltz2_protein_from_seq \
  --amino-acid-sequence-list "MVLSPADKTNV..." "MKVLPAD..." \
  --smiles-list ""

Protein-ligand complex with affinity prediction:
mdclaw boltz2_protein_from_seq \
  --amino-acid-sequence-list "MVLSPADKTNVKAAW..." \
  --smiles-list "CCO" \
  --affinity

Protein-ligand complex with a custom MSA file:
mdclaw boltz2_protein_from_seq \
  --amino-acid-sequence-list "MVLSPADKTNVKAAW..." \
  --smiles-list "CCO" \
  --msa-path "/path/to/alignment.a3m" \
  --affinity

Protein-ligand complex with multiple models:
mdclaw boltz2_protein_from_seq \
  --amino-acid-sequence-list "MVLSPADKTNVKAAW..." \
  --smiles-list "CCO" \
  --affinity \
  --num-models 3
Key parameters:
- `--amino-acid-sequence-list`: One or more sequences in single-letter format
- `--smiles-list`: SMILES strings for ligands
  - Pass `""` for protein-only predictions
  - Validate each SMILES with `rdkit_validate_smiles` first
- `--msa-path`: Optional custom MSA file path
  - The mdclaw wrapper supports one shared custom MSA path, so this is best for single-protein inputs
- `--affinity`: Boolean flag (only for protein-ligand mode)
  - `--affinity` to enable, `--no-affinity` to disable
- `--num-models`: Number of structure candidates to generate (default: 1)
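The flags listed above can be assembled programmatically. The following is a minimal sketch, assuming only the subcommand and flags that appear in this document; the function name `build_boltz_command` and its keyword arguments are hypothetical, and the check that rejects `--affinity` without a ligand reflects the "protein-ligand mode only" note rather than any known behavior of the tool.

```python
import shlex


def build_boltz_command(sequences, smiles=(), affinity=None, msa_path=None, num_models=None):
    """Build the argv list for one boltz2_protein_from_seq call."""
    if not sequences:
        raise ValueError("at least one protein sequence is required")
    if affinity and not smiles:
        raise ValueError("--affinity applies to protein-ligand mode only")
    argv = ["mdclaw", "boltz2_protein_from_seq",
            "--amino-acid-sequence-list", *sequences]
    # Protein-only predictions pass a single empty string, as in the examples.
    argv += ["--smiles-list", *(smiles or [""])]
    if affinity is True:
        argv.append("--affinity")
    elif affinity is False:
        argv.append("--no-affinity")
    if msa_path:
        argv += ["--msa-path", msa_path]
    if num_models is not None:
        argv += ["--num-models", str(num_models)]
    return argv


if __name__ == "__main__":
    # shlex.join quotes each argument for display or for pasting into Bash.
    print(shlex.join(build_boltz_command(["MVLSPADKTNV"], ["CCO"], affinity=True, num_models=3)))
```

Leaving `affinity` as `None` emits neither flag, which keeps the tool's own default in effect.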
The tool returns:
- `success` (bool): True if prediction completed
- `job_id` (str): Unique identifier
- `output_dir` (str): Path to results directory
- `predicted_pdb_files` (list): Paths to predicted PDB/mmCIF structures
  - File names carry a `_model_N` suffix when `--num-models` > 1
- `confidence_scores` (dict): Confidence JSON content when Boltz writes it
- `affinity_scores` (dict, if `--affinity`): Contains:
  - `affinity_probability_binary`: Higher = more confident binding
  - `affinity_pred_value`: Lower = stronger predicted binding; reported as log10(IC50) with IC50 in µM
- `warnings` (list): Non-critical warnings

When `job_dir` and `node_id` point to a source node, the Boltz output is
normalized into the standard source bundle:
nodes/source_001/artifacts/source_bundle.json
nodes/source_001/artifacts/candidates/candidate_001.pdb
nodes/source_001/artifacts/candidates/candidate_002.pdb
Per-candidate Boltz information belongs in `source_bundle.json`, not only in
the source node's run-level metadata:
- `origin.boltz_rank`: one-based candidate rank in the returned Boltz order
- `origin.boltz_model_index`: zero-based `_model_N` value when present
- `origin.boltz_output_file`: original Boltz prediction file
- `origin.confidence_file`: matching Boltz confidence JSON when present
- `metrics.confidence_score`: copied from the confidence JSON for quick ranking
- `metrics.confidence`: full confidence JSON content for provenance

Run-level details such as `num_models_requested`, `boltz_output_dir`,
`input_yaml`, `sequences`, `smiles_list`, and optional affinity scores remain
in the source node metadata.
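When affinity scores are present, `affinity_pred_value` is easier to discuss with users after conversion to a concentration. Because the value is log10(IC50) with IC50 in µM, a value of -1 corresponds to 0.1 µM (100 nM) and a value of 0 to 1 µM. The helper below is a small illustrative sketch; its name is hypothetical and it does nothing beyond that arithmetic.

```python
def ic50_from_affinity(affinity_pred_value):
    """Convert log10(IC50 in uM) into (IC50 in uM, IC50 in nM)."""
    ic50_um = 10.0 ** affinity_pred_value
    return ic50_um, ic50_um * 1000.0  # 1 uM = 1000 nM
```

The result is a model prediction, so it is best presented as an estimate alongside `affinity_probability_binary` rather than as a measured value.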
List candidates through the tool instead of asking the user to open JSON:
mdclaw list_source_candidates --job-dir <job_dir> --node-id source_001
Present the candidate IDs and confidence scores to the user. Use the default candidate for a simple first MD setup, or prepare multiple jobs/candidates when the scientific question needs ensemble comparison.
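One way to present candidates is a short table sorted by confidence. The sketch below assumes each candidate is available as a dictionary with an `id` key and a nested `metrics.confidence_score`; the `id` key and the overall record shape are hypothetical, since the exact JSON returned by `list_source_candidates` is not shown here.

```python
def rank_candidates(candidates):
    """Sort candidate records by metrics.confidence_score, highest first."""
    def score(candidate):
        value = candidate.get("metrics", {}).get("confidence_score")
        # Candidates without a score sort last instead of raising.
        return float("-inf") if value is None else value
    return sorted(candidates, key=score, reverse=True)


def candidate_rows(candidates):
    """Render the ranked candidates as Markdown table rows."""
    rows = ["| Candidate | Confidence |", "|-----------|------------|"]
    for candidate in rank_candidates(candidates):
        value = candidate.get("metrics", {}).get("confidence_score")
        shown = "n/a" if value is None else format(value, ".2f")
        rows.append(f"| {candidate['id']} | {shown} |")
    return rows
```

Sorting here is only for presentation; it does not change which candidate the tools treat as the default.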
If they want to continue to MD simulation:
"To set up MD simulation with the predicted structure, create the prep node first (its parent auto-resolves to the source), then run `prepare_complex` with the node id it returns:

mdclaw create_node --job-dir <job_dir> --node-type prep
mdclaw --job-dir <job_dir> --node-id <prep_node_id> prepare_complex \
  --source-structure-id candidate_001

If your harness provides slash commands, `/md-prepare` is the interactive shortcut for the same preparation skill."
| Issue | Action |
|-------|--------|
| SMILES validation fails | Ask user to check chemical name or provide corrected SMILES |
| PubChem lookup fails | Ask user to provide SMILES directly |
| Boltz-2 prediction fails | Check: protein sequences valid, SMILES validated, conda env activated |
| MSA file not found | Ask user to verify file path or use MSA server (default) |