skills/mosaic/SKILL.md
Multi-objective, gradient-based protein binder design with Mosaic. Use this skill when: (1) Composing several structure or sequence models into one design objective, (2) Optimizing binders against a custom loss rather than a fixed pipeline, (3) Wanting gradient descent over sequence space in the style of ColabDesign, RSO, or BindCraft but with interchangeable predictors, (4) Letting the optimizer choose the epitope instead of fixing hotspots. For an end-to-end binder pipeline with default filters, use bindcraft. For all-atom diffusion design, use boltzgen. For backbone-only generation, use rfdiffusion.
npx skillsauth add adaptyvbio/protein-design-skills mosaic

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Mosaic (Escalante Bio) is a JAX framework for "functional, multi-objective protein design using continuous relaxation." It optimizes a soft sequence by gradient descent over a continuous relaxation of sequence space, in the lineage of ColabDesign, RSO, and BindCraft, with one key difference: it composes multiple learned objectives from different models in a single differentiable loss.
Mosaic is a framework for custom objectives, not a one-click method. The README is
explicit: it "may require substantial hand-holding (tuning learning rates, etc),
often produces proteins that fail simple in-silico tests, [and] should be combined
with standard filtering methods." Reach for it when a fixed pipeline cannot express
the objective you need. For a turnkey binder run, use bindcraft instead.
| Requirement | Minimum | Recommended |
|-------------|---------|-------------|
| Python | 3.11+ | 3.11 |
| Framework | JAX with CUDA or TPU | JAX CUDA 12 |
| GPU VRAM | 24GB | 48GB+ (depends on predictors used) |
JIT compilation makes the first call to any loss slow; later calls are fast.
Mosaic runs locally on a JAX GPU or TPU build. It has no CLI and no Modal integration; you drive it through the marimo notebooks or the Python API.
```bash
git clone https://github.com/escalante-bio/mosaic && cd mosaic
uv sync --group jax-cuda   # or --group jax-tpu / --group jax-cpu
uv add "jax[cuda12]"       # may be needed for a GPU build; quoted so the shell does not expand the brackets
uv run marimo edit examples/example_notebook.py
```
Ready-made examples include esmfold_minibinder.py, esmfold_vhh.py,
boltzgen_pipeline.py, and batched_protenix.py.
A design objective is built from LossTerm objects that you add and scale with plain
Python arithmetic, then hand to an optimizer.
```python
import mosaic.losses.structure_prediction as sp

# Compose a loss from interface, confidence, and inverse-folding terms
design_loss = (
    sp.BinderTargetContact()
    + sp.WithinBinderContact()
    + 0.05 * sp.TargetBinderPAE()
    + 0.05 * sp.BinderTargetPAE()
    + 0.025 * sp.IPTMLoss()
    + 0.1 * sp.PLDDTLoss()
)
```
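To make the "plain Python arithmetic" idea concrete, here is a minimal sketch of how weighted loss terms can compose through operator overloading. `ToyLossTerm`, `mean_entropy`, and `mean_first_letter` are hypothetical names invented for this illustration; this is not Mosaic's `LossTerm` implementation, only a toy with the same add-and-scale shape.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence

SoftSequence = Sequence[Sequence[float]]  # one probability row per position


@dataclass(frozen=True)
class ToyLossTerm:
    """Hypothetical stand-in for a loss term: soft sequence -> scalar loss."""
    fn: Callable[[SoftSequence], float]

    def __call__(self, soft_sequence: SoftSequence) -> float:
        return self.fn(soft_sequence)

    def __add__(self, other: "ToyLossTerm") -> "ToyLossTerm":
        # term_a + term_b evaluates both terms and sums the results
        return ToyLossTerm(lambda s: self(s) + other(s))

    def __rmul__(self, weight: float) -> "ToyLossTerm":
        # 0.5 * term scales the term's value by the weight
        return ToyLossTerm(lambda s: weight * self(s))


def mean_entropy(soft_sequence: SoftSequence) -> float:
    """Average Shannon entropy per position (toy objective)."""
    total = 0.0
    for row in soft_sequence:
        total -= sum(p * math.log(p) for p in row if p > 0.0)
    return total / len(soft_sequence)


def mean_first_letter(soft_sequence: SoftSequence) -> float:
    """Average probability assigned to letter 0 (toy objective)."""
    return sum(row[0] for row in soft_sequence) / len(soft_sequence)


# Compose the two toy terms the same way the design loss above is written
toy_loss = ToyLossTerm(mean_entropy) + 0.5 * ToyLossTerm(mean_first_letter)

uniform = [[0.25, 0.25, 0.25, 0.25] for _ in range(2)]
one_hot = [[1.0, 0.0, 0.0, 0.0] for _ in range(2)]
uniform_value = toy_loss(uniform)
one_hot_value = toy_loss(one_hot)
```

The composed object is itself a loss term, so it can be scaled and added again, which is what lets a long weighted sum read as one expression.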
Loss terms can wrap one model used several ways (for example a structure predictor scoring both the binder-target complex and the binder as a monomer). Composing different architectures also lowers the chance of finding adversarial sequences that fool a single predictor.
| Category | Options |
|----------|---------|
| Structure predictors | AF2, Boltz-1, Boltz-2, Protenix, OpenFold3, ESMFold2 |
| Generative / design | BoltzGen, Proteina-Complexa |
| Inverse folding | ProteinMPNN, SolubleMPNN, AbMPNN |
| Language models | ESM-2, ESM-C, AbLang, trigram |
| Property heads | Stability (megascale-trained) |
| Optimizer | Use |
|-----------|-----|
| simplex_APGM | Default; proximal gradient / mirror descent on the probability simplex |
| batched_simplex_APGM | The same, vmapped over many designs |
| gradient_MCMC | Discrete moves for fine-tuning a sequence |
A reasonable simplex_APGM step size is about 0.1 * sqrt(binder_length).
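As a rough illustration of what optimizing on the probability simplex means, the sketch below applies the step-size heuristic above to a plain projected-gradient loop on a toy 4-letter alphabet. It is not Mosaic's `simplex_APGM` (there is no acceleration here, and the loss is a made-up linear one); `suggested_step_size`, `project_to_simplex`, and `projected_gradient_step` are hypothetical helper names.

```python
import math


def suggested_step_size(binder_length: int) -> float:
    """Heuristic from the text: about 0.1 * sqrt(binder_length)."""
    return 0.1 * math.sqrt(binder_length)


def project_to_simplex(values):
    """Euclidean projection of a vector onto the probability simplex (sort-based)."""
    descending = sorted(values, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(descending, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0.0:
            theta = candidate  # keep the threshold for the largest valid k
    return [max(v - theta, 0.0) for v in values]


def projected_gradient_step(soft_sequence, gradient, step_size):
    """One gradient step per position, then project each row back onto the simplex."""
    return [
        project_to_simplex([p - step_size * g for p, g in zip(row, grad_row)])
        for row, grad_row in zip(soft_sequence, gradient)
    ]


# Toy loss: minus the probability of a preferred letter at each position,
# so the gradient is -1 at the preferred letter and 0 elsewhere.
preferred = [0, 2]
soft_sequence = [[0.25] * 4 for _ in preferred]
step_size = suggested_step_size(len(preferred))

for _ in range(20):
    gradient = [
        [-1.0 if letter == target else 0.0 for letter in range(4)]
        for target in preferred
    ]
    soft_sequence = projected_gradient_step(soft_sequence, gradient, step_size)
```

Each row stays a valid probability distribution after every step, and on this toy loss the rows sharpen toward one-hot vectors at the preferred letters. With a real predictor in the loss the landscape is far less forgiving, which is why the step size usually needs tuning.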
The published Nipah competition recipe optimizes a design loss on Boltz-2, then ranks candidates with a separate multi-sample loss built from ipTM and ipSAE. The multi-sample loss is a method on the Boltz2 model, not a free function:
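```python
import mosaic.losses.structure_prediction as sp
from mosaic.models.boltz2 import Boltz2

boltz2 = Boltz2()

# design_features: the Boltz-2 input features for the binder-target complex,
# assumed to have been built earlier in the design run.
ranking_loss = boltz2.build_multisample_loss(
    loss=1.00 * sp.IPTMLoss()
    + 0.5 * sp.TargetBinderIPSAE()
    + 0.5 * sp.BinderTargetIPSAE(),
    features=design_features,
    num_samples=6,
    recycling_steps=3,
)
```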
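The point of scoring over several samples can be shown with a small, library-free sketch. It assumes the multi-sample loss aggregates by averaging per-sample values, which the source does not state, and the numbers are made-up placeholders rather than real results. `rank_by_mean_loss` is a hypothetical helper, not part of Mosaic.

```python
from statistics import mean


def rank_by_mean_loss(per_sample_losses):
    """Return candidate names sorted from lowest to highest mean loss."""
    mean_loss = {name: mean(values) for name, values in per_sample_losses.items()}
    return sorted(mean_loss, key=mean_loss.get)


# Placeholder per-sample losses (lower is better); purely illustrative values
example_losses = {
    "design_a": [-0.8, -0.6, -0.7],
    "design_b": [-0.9, -0.2, -0.4],
    "design_c": [-0.3, -0.35, -0.25],
}

ranking = rank_by_mean_loss(example_losses)
best_single_sample = min(example_losses, key=lambda name: min(example_losses[name]))
```

Here `design_b` has the best single sample, but `design_a` ranks first on the mean because its score holds up across samples. That is the kind of robustness a multi-sample ranking loss is meant to reward.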
On the Adaptyv Nipah de novo target, this recipe produced 8 binders out of 9 tested designs at nanomolar affinity, the highest hit-rate of any method on that target in the public results. That is a small, expert-tuned sample on one hard target, not a guarantee across targets, so treat Mosaic as a high-ceiling option that rewards careful objective design rather than a turnkey default.
Two practices from that work are worth carrying over:
```
Should I use Mosaic?
│
├─ Need a custom objective across multiple models? → Mosaic
├─ Want one-click binders with default filters? → bindcraft
├─ Want all-atom diffusion design? → boltzgen
└─ Want backbone-only diversity? → rfdiffusion + proteinmpnn
```
Adaptyv's own tests of these models showed Mosaic costing about $0.55 per accepted design, averaged across 7 targets, among the cheapest per design of the methods tested. That is compute only; the setup and tuning effort is the real cost of using Mosaic.
| Issue | Cause | Fix |
|-------|-------|-----|
| Designs fail simple in-silico checks | Under-constrained objective | Add inverse-folding and confidence terms; filter with protein-qc |
| Optimization unstable | Step size too large | Lower the simplex_APGM step size |
| First call very slow | JIT compilation | Expected; reuse the compiled loss across designs |
| OOM with large predictors | Several models in one loss | Use smaller predictors or a larger GPU |
Next: Validate designs with boltz or chai, rank with ipsae, then filter
with protein-qc.