skills/campaign-manager/SKILL.md
Goal-oriented binder design campaign planning and health assessment. Use this skill when: (1) Planning a complete binder design campaign, (2) Converting high-level goals into runnable pipelines, (3) Assessing campaign health and pass rates, (4) Diagnosing why designs are failing QC, (5) Estimating time, cost, and expected yields, (6) Selecting between design tools for a specific target. This skill orchestrates the other protein design tools. For individual tool parameters, use the specific tool skills.
npx skillsauth add adaptyvbio/protein-design-skills campaign-manager
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
When the user says: "I need 10 good binders for EGFR"
Campaign Planning:
Goal: 10 high-quality binders for EGFR
├── Achievable: Yes (standard target)
├── Recommended pipeline: rfdiffusion → proteinmpnn → chai → protein-qc
├── Estimated designs needed: 500 backbones (to get 400-600 candidates passing QC)
├── Estimated time: 8-12 hours total
├── Estimated cost: ~$60 (Modal GPU compute)
└── Expected yield:
    ├── After backbone (500): 500 structures
    ├── After sequence (×8): 4,000 sequences
    ├── After validation: 4,000 predictions
    ├── After QC (~10-15%): 400-600 candidates
    └── After clustering: 10-20 diverse final designs
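The yield funnel above is plain arithmetic, so it can be sketched in a few lines of Python. This is a minimal sketch, not part of the skill itself: `yield_funnel` and its parameter names are hypothetical, and the 10-15% pass rate is the planning figure quoted above rather than a measured value.

```python
def yield_funnel(backbones, seqs_per_backbone=8, pass_rate_pct=(10, 15)):
    """Project design counts at each pipeline stage (hypothetical helper)."""
    sequences = backbones * seqs_per_backbone
    low_pct, high_pct = pass_rate_pct
    return {
        "backbones": backbones,
        "sequences": sequences,
        # Assumes one structure prediction per sequence (no ESM2 pre-filter).
        "predictions": sequences,
        # Integer percentages keep the arithmetic exact.
        "passing_qc": (sequences * low_pct // 100, sequences * high_pct // 100),
    }


funnel = yield_funnel(500)
for stage, count in funnel.items():
    print(f"{stage}: {count}")
```

For 500 backbones this reproduces the 4,000 sequences and 400-600 QC candidates listed in the plan. The clustering step is left out because its retention depends on how diverse the passing designs are.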
# Step 1: Fetch and prepare target (5 min)
curl -o target.pdb "https://files.rcsb.org/download/{PDB_ID}.pdb"
# Trim to binding region if needed
# Step 2: Generate backbones (2-3h, ~$15)
# RFdiffusion runs from the official repo, not biomodals
python run_inference.py \
  inference.input_pdb=target.pdb \
  'contigmap.contigs=[A1-150/0 70-100]' \
  'ppi.hotspot_res=[A45,A67,A89]' \
  inference.num_designs=500
# Checkpoint: ls output/*.pdb | wc -l # Should be 500
# Step 3: Design sequences (1-2h, ~$10)
for f in output/*.pdb; do
  modal run modal_ligandmpnn.py \
    --input-pdb "$f" \
    --params-str "--number_of_batches 8 --temperature 0.1"
done
# Checkpoint: cat output/seqs/*.fa | grep -c "^>" # Should be ~4000
# Step 4: Quick ESM2 filter (30 min, ~$5, optional)
modal run modal_esm2_predict_masked.py --input-faa output/all_seqs.fa
# Drop sequences with PLL < 0.0; save the rest as output/filtered_seqs.fa
# Step 5: Structure validation (3-4h, ~$35)
modal run modal_alphafold.py \
  --input-faa output/filtered_seqs.fa \
  --out-dir predictions/
# Checkpoint: find predictions -name "*rank_001.pdb" | wc -l
# Step 6: Filter and rank (protein-qc skill)
# Apply thresholds: pLDDT > 0.85 (0-1 scale), ipTM > 0.5, scRMSD < 2.0 Å
# Compute composite score
# Cluster at 70% identity, select top from each cluster
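Step 6 can be sketched in plain Python as follows. This is an illustrative sketch under several assumptions: the composite score here is a hypothetical equal-weight average of pLDDT and ipTM (the protein-qc skill defines the real one), `identity` assumes ungapped, position-aligned sequences, and the greedy selection stands in for a real clustering tool. The field names and example designs are made up.

```python
def passes_qc(d):
    """Thresholds from Step 6: pLDDT > 0.85, ipTM > 0.5, scRMSD < 2.0."""
    return d["pLDDT"] > 0.85 and d["ipTM"] > 0.5 and d["scRMSD"] < 2.0


def composite(d):
    """Hypothetical composite score; replace with the protein-qc definition."""
    return 0.5 * d["pLDDT"] + 0.5 * d["ipTM"]


def identity(a, b):
    """Fraction of matching positions, assuming no gaps between a and b."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def select_diverse(designs, max_identity=0.70):
    """Greedy selection: the best-scoring design founds a cluster, and any
    later design at or above max_identity to a picked one is skipped."""
    ranked = sorted(filter(passes_qc, designs), key=composite, reverse=True)
    picked = []
    for d in ranked:
        if all(identity(d["seq"], p["seq"]) < max_identity for p in picked):
            picked.append(d)
    return picked


designs = [
    {"name": "d1", "seq": "MKTAYIAKQR", "pLDDT": 0.91, "ipTM": 0.72, "scRMSD": 1.1},
    {"name": "d2", "seq": "MKTAYIAKQK", "pLDDT": 0.88, "ipTM": 0.65, "scRMSD": 1.4},
    {"name": "d3", "seq": "GSHEELLKKA", "pLDDT": 0.90, "ipTM": 0.60, "scRMSD": 1.8},
    {"name": "d4", "seq": "PLVWDNNRTE", "pLDDT": 0.80, "ipTM": 0.70, "scRMSD": 1.0},
]
selected = [d["name"] for d in select_diverse(designs)]
print(selected)
```

In this toy input, `d4` fails the pLDDT threshold and `d2` is 90% identical to the higher-scoring `d1`, so only `d1` and `d3` are kept. For thousands of sequences the pairwise loop would be slow, and a dedicated clustering tool is the more realistic choice.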
Total estimated time: 8-12 hours
Total estimated cost: ~$60-70
| Goal | Backbones | Sequences/BB | Total Seq | Expected Passing |
|------|-----------|--------------|-----------|------------------|
| 5 binders | 200 | 8 | 1,600 | 160-240 |
| 10 binders | 500 | 8 | 4,000 | 400-600 |
| 20 binders | 1,000 | 8 | 8,000 | 800-1,200 |
| 50 binders | 2,500 | 8 | 20,000 | 2,000-3,000 |
Rule of thumb: Generate about 50 backbones for every binder you need. The margin covers the 10-15% QC pass rate and the candidates lost to clustering.
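Going the other way, from a goal to a backbone count, the rule of thumb reduces to one multiplication. A minimal sketch, where `backbones_needed` is a hypothetical helper and the 50x multiplier is the planning figure from this section:

```python
def backbones_needed(n_binders, multiplier=50):
    """Backbones to generate for a target number of final binders."""
    if n_binders < 1:
        raise ValueError("n_binders must be at least 1")
    return n_binders * multiplier


for goal in (10, 20, 50):
    print(goal, "binders ->", backbones_needed(goal), "backbones")
```

This matches the 10, 20, and 50 binder rows of the table. The 5 binder row uses 200 backbones rather than 250, so treat the multiplier as a starting point and not an exact rule.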
| Scenario | Recommended Tool | Reason |
|----------|------------------|--------|
| Standard miniprotein | RFdiffusion + ProteinMPNN | High diversity, proven |
| Need higher success rate | BindCraft | Integrated design loop |
| All-atom precision needed | BoltzGen | Side-chain aware |
| Difficult target | Mosaic | Gradient-based, multi-model objective |
| Need fast iteration | ESMFold2 + ESM2 | Quick screening |
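If the tool choice needs to be made programmatically, the table can be held as a lookup. This is only a sketch: the scenario keys and the `recommend_tool` function are hypothetical names introduced here, while the tools and reasons are copied from the table.

```python
TOOL_BY_SCENARIO = {
    "standard_miniprotein": ("RFdiffusion + ProteinMPNN", "High diversity, proven"),
    "higher_success_rate": ("BindCraft", "Integrated design loop"),
    "all_atom_precision": ("BoltzGen", "Side-chain aware"),
    "difficult_target": ("Mosaic", "Gradient-based, multi-model objective"),
    "fast_iteration": ("ESMFold2 + ESM2", "Quick screening"),
}


def recommend_tool(scenario):
    """Return (tool, reason) for a scenario key, or raise on unknown keys."""
    try:
        return TOOL_BY_SCENARIO[scenario]
    except KeyError:
        known = ", ".join(sorted(TOOL_BY_SCENARIO))
        raise ValueError(f"unknown scenario {scenario!r}; expected one of: {known}")


tool, reason = recommend_tool("difficult_target")
print(f"{tool} ({reason})")
```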
| Indicator | Easy Target | Difficult Target |
|-----------|-------------|------------------|
| Surface type | Concave pocket | Flat or convex |
| Conservation | High | Low |
| Known binders | Yes | No |
| Flexibility | Rigid | Flexible |
| Expected pass rate | 15-20% | 5-10% |
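One way to turn these indicators into a planning number is to count how many of them point to a difficult target. The sketch below is an assumption, not a rule from this skill: the cutoff of two indicators, the trait keys, and the `assess_target` name are all hypothetical. Only the trait values and pass-rate ranges come from the table.

```python
# Trait values the table lists under "Difficult Target".
DIFFICULT = {
    "surface": {"flat", "convex"},
    "conservation": {"low"},
    "known_binders": {"no"},
    "flexibility": {"flexible"},
}


def assess_target(traits, cutoff=2):
    """Return (label, expected pass-rate range, difficult indicators found).

    The cutoff is an illustrative assumption: two or more difficult
    indicators classify the target as difficult.
    """
    hits = [key for key, value in traits.items() if value in DIFFICULT.get(key, set())]
    if len(hits) >= cutoff:
        return "difficult", (0.05, 0.10), hits
    return "easy", (0.15, 0.20), hits


label, pass_rate, hits = assess_target({
    "surface": "flat",
    "conservation": "high",
    "known_binders": "no",
    "flexibility": "rigid",
})
print(label, pass_rate, hits)
```

The returned pass-rate range can then replace the default 10-15% when sizing the campaign.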
import pandas as pd

def assess_campaign(csv_path):
    """Summarize QC pass rates for a CSV with pLDDT, ipTM, and scRMSD columns."""
    df = pd.read_csv(csv_path)

    # Per-metric pass masks (pLDDT on a 0-1 scale)
    plddt_ok = df['pLDDT'] > 0.85
    iptm_ok = df['ipTM'] > 0.50
    scrmsd_ok = df['scRMSD'] < 2.0

    # Calculate pass rates (fraction of rows passing each filter)
    plddt_pass = plddt_ok.mean()
    iptm_pass = iptm_ok.mean()
    scrmsd_pass = scrmsd_ok.mean()
    all_pass = (plddt_ok & iptm_ok & scrmsd_ok).mean()

    # Determine health from the overall pass rate
    if all_pass > 0.15:
        health = "EXCELLENT"
    elif all_pass > 0.10:
        health = "GOOD"
    elif all_pass > 0.05:
        health = "MARGINAL"
    else:
        health = "POOR"

    # Identify likely causes when a single metric has a low pass rate
    issues = []
    if plddt_pass < 0.20:
        issues.append("Low pLDDT - backbone or sequence issue")
    if iptm_pass < 0.20:
        issues.append("Low ipTM - hotspot or interface issue")
    if scrmsd_pass < 0.50:
        issues.append("High scRMSD - sequence doesn't specify backbone")

    return {
        "health": health,
        "overall_pass_rate": all_pass,
        "plddt_pass_rate": plddt_pass,
        "iptm_pass_rate": iptm_pass,
        "scrmsd_pass_rate": scrmsd_pass,
        "top_issues": issues
    }
| Health | Pass Rate | Action |
|--------|-----------|--------|
| EXCELLENT | > 15% | Proceed to selection |
| GOOD | 10-15% | Proceed, normal yield |
| MARGINAL | 5-10% | Review failure tree |
| POOR | < 5% | Diagnose and restart |
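To attach the recommended action to a measured pass rate, the table can be walked from the strictest band down. A minimal sketch using only the standard library: `recommend_action` is a hypothetical helper, and the strict `>` comparisons are an assumption about how band edges such as exactly 10% are treated.

```python
# (exclusive lower bound, health label, action), strictest band first.
HEALTH_BANDS = [
    (0.15, "EXCELLENT", "Proceed to selection"),
    (0.10, "GOOD", "Proceed, normal yield"),
    (0.05, "MARGINAL", "Review failure tree"),
]


def recommend_action(pass_rate):
    """Map an overall pass rate (0-1) to a (health, action) pair."""
    for lower_bound, health, action in HEALTH_BANDS:
        if pass_rate > lower_bound:
            return health, action
    return "POOR", "Diagnose and restart"


for rate in (0.18, 0.12, 0.07, 0.03):
    print(rate, recommend_action(rate))
```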
| Tool | GPU | $/hour | Typical Job | Cost |
|------|-----|--------|-------------|------|
| RFdiffusion | A10G | ~$1.20 | 500 designs/2h | ~$2.50 |
| ProteinMPNN | T4 | ~$0.60 | 4000 seq/1.5h | ~$1.00 |
| ESM2 (PLL) | A10G | ~$1.20 | 4000 seq/30min | ~$0.60 |
| AlphaFold | A100 | ~$4.50 | 4000 preds/4h | ~$18.00 |
| Chai | A100 | ~$4.50 | 500 preds/1h | ~$4.50 |
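Each cost in the table is the hourly GPU rate multiplied by the job duration, rounded. The sketch below reproduces that arithmetic for one RFdiffusion, ProteinMPNN, ESM2, and AlphaFold job; the rates are the approximate figures from the table, and `job_cost` is a hypothetical helper. It sums only the GPU time listed in the table.

```python
# Approximate hourly rates from the table, in USD.
GPU_RATE = {"A10G": 1.20, "T4": 0.60, "A100": 4.50}

# (tool, GPU, hours) for one typical job each.
JOBS = [
    ("RFdiffusion", "A10G", 2.0),
    ("ProteinMPNN", "T4", 1.5),
    ("ESM2 (PLL)", "A10G", 0.5),
    ("AlphaFold", "A100", 4.0),
]


def job_cost(gpu, hours):
    """GPU cost of one job: hourly rate times wall-clock hours."""
    return GPU_RATE[gpu] * hours


total = 0.0
for tool, gpu, hours in JOBS:
    cost = job_cost(gpu, hours)
    total += cost
    print(f"{tool:<12} {gpu:<5} {hours:>4.1f} h  ${cost:.2f}")
print(f"GPU total: ${total:.2f}")
```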
| Campaign Size | Total Cost | Notes |
|---------------|------------|-------|
| Small (100 bb) | ~$15 | Quick exploration |
| Standard (500 bb) | ~$60 | Most campaigns |
| Large (1000 bb) | ~$120 | Comprehensive |
| XL (5000 bb) | ~$600 | Very thorough |
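For campaign sizes between the rows, a rough figure can be read off by linear interpolation. This is a sketch under the assumption that cost scales linearly between tabulated sizes, which the table suggests but does not state; `estimate_cost` is a hypothetical helper.

```python
from bisect import bisect_left

# (backbones, approximate total cost in USD) from the table.
COST_POINTS = [(100, 15), (500, 60), (1000, 120), (5000, 600)]


def estimate_cost(backbones):
    """Linearly interpolate total cost between tabulated campaign sizes."""
    sizes = [size for size, _ in COST_POINTS]
    if not sizes[0] <= backbones <= sizes[-1]:
        raise ValueError("backbone count is outside the tabulated range")
    # Index of the upper bracketing point; at least 1 so a lower point exists.
    i = max(1, bisect_left(sizes, backbones))
    (x0, y0), (x1, y1) = COST_POINTS[i - 1], COST_POINTS[i]
    return y0 + (y1 - y0) * (backbones - x0) / (x1 - x0)


for n in (100, 500, 2000):
    print(f"{n} backbones: ~${estimate_cost(n):.0f}")
```

From 500 backbones upward the table works out to about $0.12 per backbone, so 2,000 backbones comes to roughly $240.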
# More backbones, fewer sequences each (RFdiffusion from the official repo)
python run_inference.py inference.num_designs=2000
modal run modal_ligandmpnn.py --input-pdb bb.pdb --params-str "--number_of_batches 4 --temperature 0.2"
# Fewer backbones, more sequences each, lower temperature
python run_inference.py inference.num_designs=200
modal run modal_ligandmpnn.py --input-pdb bb.pdb --params-str "--number_of_batches 32 --temperature 0.1"
# Small batch, ESMFold2 for fast single-sequence folding
# RFdiffusion runs from the official repo (not biomodals); see the rfdiffusion skill
modal run modal_ligandmpnn.py --input-pdb bb.pdb --params-str "--number_of_batches 8"
modal run modal_esmfold2.py --input-faa all_seqs.fa
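The first two strategies trade backbone count against sequences per backbone, and the downstream validation load follows from their product. A small sketch comparing them, where the strategy names are hypothetical labels for the two command pairs above and `--number_of_batches` is treated as sequences per backbone, as in the pipeline steps:

```python
STRATEGIES = {
    # More backbones, fewer sequences each, higher sampling temperature.
    "more_backbones": {"backbones": 2000, "seqs_per_backbone": 4, "temperature": 0.2},
    # Fewer backbones, more sequences each, lower sampling temperature.
    "more_sequences": {"backbones": 200, "seqs_per_backbone": 32, "temperature": 0.1},
}


def total_sequences(strategy):
    """Number of sequences that will need structure validation."""
    return strategy["backbones"] * strategy["seqs_per_backbone"]


for name, strategy in STRATEGIES.items():
    print(f"{name}: {total_sequences(strategy)} sequences to validate")
```

The two settings produce 8,000 and 6,400 sequences respectively, so the validation cost is similar and the choice mostly comes down to whether backbone diversity or per-backbone sequence depth matters more for the target.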
rfdiffusion, proteinmpnn, mosaic, chai, boltz, alphafold, protein-qc, binder-design