adaptyvbio/esm

Name: esm
Author: adaptyvbio

skills/esm/SKILL.md

npx skillsauth add adaptyvbio/protein-design-skills esm

ESM Protein Language Models

The ESM line is maintained at github.com/Biohub/esm (Chan Zuckerberg Biohub, MIT license; the older evolutionaryscale/esm URL redirects here). The current generation ships three artifacts: ESM C (language model), ESMFold2 (structure prediction), and ESM Atlas (a map of predicted structures). Weights are on huggingface.co/biohub; the hosted API is at biohub.ai.

This skill covers ESM C, ESMFold2, and legacy ESM2. ESM3 is not covered because its open weights are non-commercial.

Which model to use

| Task | Model |
|------|-------|
| Embeddings, PLL, mutation scoring | ESM C (ESMC-6B), or ESM2 for a lighter run |
| Complex structure prediction | ESMFold2 |
| High-throughput single-sequence folding | ESMFold2 fast mode |
| Binder design | ESMFold2 inversion (see below), or the mosaic / bindcraft skills |
| Variant effect / zero-shot scoring | ESM C or ESM2 |

Prerequisites

| Requirement | Minimum | Recommended |
|-------------|---------|-------------|
| Python | 3.10+ | 3.11 |
| PyTorch | 2.0+ | Latest |
| CUDA | 12.0+ | 12.1+ |
| GPU VRAM | 24GB (ESM2 / small ESMC) | 80GB (ESMC-6B, ESMFold2) |

ESM C: embeddings and scoring

ESM C is the successor to ESM2. It improves long-range structural understanding as model scale grows and is the default choice for embeddings, pseudo-log-likelihood, and mutation-effect scoring.

Python (Hugging Face)

from transformers import AutoModelForMaskedLM, AutoTokenizer
import torch

model_id = "biohub/ESMC-6B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(
    model_id, output_hidden_states=True, torch_dtype=torch.bfloat16
).eval().cuda()

batch = tok(["MKTAYIAKQRQISFVK..."], return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model(**batch)

logits = out.logits                      # for PLL / mutation scoring
embeddings = out.hidden_states[-1]       # per-residue representations

Install the package with pip install esm@git+https://github.com/Biohub/esm.git@main.

Hosted API

from esm.sdk import esmc_client
from esm.sdk.api import ESMProtein, LogitsConfig

model = esmc_client(model="esmc-600m-2024-12", url="https://biohub.ai", token="<API token>")
tensor = model.encode(ESMProtein(sequence="MKTAYIAKQRQISFVK..."))
out = model.logits(tensor, LogitsConfig(sequence=True, return_embeddings=True))

ESMC-6B has open weights; esmc-600m is the smaller API model. For mutation scoring and fine-tuning, see the esmc_mutation_scoring and esmc_finetune notebooks under cookbook/tutorials.
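A common zero-shot formulation for mutation scoring is the masked-marginal log-odds: mask the site, take the model's logits at that position, and compare the log-probability of the mutant residue with that of the wild type. The sketch below shows only that arithmetic in plain Python. It is not taken from the esmc_mutation_scoring notebook, and `vocab` and the toy logits are hypothetical stand-ins for the real tokenizer and for something like `out.logits[0, i].float().tolist()`. Check whether the tokenizer prepends special tokens before indexing a residue position.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one position's logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def mutation_score(position_logits, vocab, wt, mut):
    """Log-odds of the mutant vs. the wild-type residue at one masked site.

    position_logits: logits at the masked position, one value per token.
    vocab: hypothetical mapping from residue letter to token index.
    """
    log_probs = log_softmax(position_logits)
    return log_probs[vocab[mut]] - log_probs[vocab[wt]]


# Toy three-letter vocabulary; NOT the real ESM C tokenizer.
vocab = {"A": 0, "G": 1, "W": 2}
logits_at_masked_site = [2.0, 1.0, -1.0]
score = mutation_score(logits_at_masked_site, vocab, wt="A", mut="G")
print(round(score, 3))  # -1.0, i.e. the model prefers the wild type here
```

A negative score means the model finds the mutant less likely than the wild type at that site; scores near zero suggest a tolerated substitution.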

ESMFold2: complex structure prediction

ESMFold2 is built on ESMC-6B with a diffusion structure head. Unlike the original ESMFold, it predicts complexes (protein, DNA, ligand, and modified residues), takes an optional MSA, and has a single-sequence fast mode for high-throughput screening. It is validated for protein-protein interaction design and leads DockQ pass-rate on Foldbench protein-protein and antibody-antigen complexes.

Modal (biomodals)

printf '>protein|A\nMKTAYIAKQRQISFVK...\n' > target.faa
uv run --with modal modal run modal_esmfold2.py --input-faa target.faa

The FASTA header tags protein|, dna|, rna|, and ligand| (SMILES) let you fold complexes. GPU defaults to A100-40GB (set with MODAL_GPU).
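For multi-entity inputs it can help to build the FASTA text programmatically. The sketch below assumes every entity follows the `>tag|CHAIN_ID` layout of the single `>protein|A` example above; that layout for DNA, RNA, and ligand entries is an extrapolation, so check it against the script before relying on it. The helper name `build_complex_fasta` is hypothetical.

```python
# Entity tags named in the text; the chain-ID-after-the-pipe layout for
# non-protein entities is an assumption extrapolated from `>protein|A`.
ALLOWED_TAGS = {"protein", "dna", "rna", "ligand"}


def build_complex_fasta(entities):
    """Return FASTA text for (tag, chain_id, sequence_or_smiles) tuples."""
    lines = []
    for tag, chain_id, payload in entities:
        if tag not in ALLOWED_TAGS:
            raise ValueError(f"unknown entity tag: {tag!r}")
        lines.append(f">{tag}|{chain_id}")
        lines.append(payload)
    return "\n".join(lines) + "\n"


fasta_text = build_complex_fasta([
    ("protein", "A", "MKTAYIAKQRQISFVK"),  # toy sequence
    ("ligand", "B", "CC(=O)O"),            # acetic acid as SMILES
])
print(fasta_text)
```

Write `fasta_text` to a file and pass that path to `--input-faa`.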

Python (local weights)

from transformers.models.esmfold2.modeling_esmfold2 import ESMFold2Model
from esm.models.esmfold2 import ProteinInput, StructurePredictionInput, ESMFold2InputBuilder

model = ESMFold2Model.from_pretrained("biohub/ESMFold2").cuda().eval()
spi = StructurePredictionInput(sequences=[ProteinInput(id="A", sequence="BINDER_SEQ")])
result = ESMFold2InputBuilder().fold(model, spi, num_loops=20, num_sampling_steps=100)
# result.plddt, result.ptm, result.iptm, result.complex.to_mmcif()

For single-sequence high-throughput folding, the fast variant is the SDK model string esmfold2-fast-2026-05 (HF repo biohub/ESMFold2-Fast). ESMFold2 is one option for complex validation alongside boltz and chai; ranking a shortlist across more than one predictor is more reliable than trusting a single model.
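One simple way to combine predictors is to average each design's rank across them. The sketch below is one option among many, not a method prescribed by any of these tools; the design IDs and ipTM values are made up, and it assumes each predictor's scores have already been collected into a dict.

```python
def mean_rank(scores_by_predictor):
    """Average each design's rank across predictors (rank 1 = best score).

    scores_by_predictor: {predictor_name: {design_id: score}}, higher = better.
    Every predictor is expected to score the same set of designs.
    """
    totals = {}
    for scores in scores_by_predictor.values():
        ordered = sorted(scores, key=scores.get, reverse=True)
        for rank, design_id in enumerate(ordered, start=1):
            totals[design_id] = totals.get(design_id, 0) + rank
    n = len(scores_by_predictor)
    # Sort ascending: the lowest mean rank is the consensus best design.
    return sorted((total / n, design_id) for design_id, total in totals.items())


# Made-up ipTM values for three hypothetical designs.
scores = {
    "esmfold2": {"d1": 0.82, "d2": 0.75, "d3": 0.60},
    "boltz":    {"d1": 0.70, "d2": 0.78, "d3": 0.55},
    "chai":     {"d1": 0.80, "d2": 0.72, "d3": 0.65},
}
ranking = mean_rank(scores)
print([design_id for _, design_id in ranking])  # ['d1', 'd2', 'd3']
```

Rank averaging sidesteps the fact that confidence scores from different predictors are not on a shared scale.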

Binder design by inverting ESMFold2

The binder_design cookbook runs gradient optimization through ESMFold2 (a BindCraft-style loop) with an ESMC language-model term for sequence plausibility. The published protocol is validated in the lab to nanomolar affinity across five targets and supports both minibinders and antibody-derived scFvs with framework scaffolds.

biomodals wraps this as modal_esmfold2_binder_design.py:

uv run --with modal modal run modal_esmfold2_binder_design.py \
  --target-name pd-l1 --binder-name minibinder

- Targets: presets cd45, ctla4, egfr, pd-l1, pdgfr, or pass --target-sequence.
- Binders: presets minibinder and antibody frameworks (for example trastuzumab_framework_vhvl), or pass --binder-sequence with # for designable positions. Use --is-antibody for scFv designs.
- Rank candidates by ipTM, filter minibinders to pI below 6, then validate the top shortlist with boltz or chai and rank with ipsae.
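The ranking and pI filter can be sketched as a small post-processing step. The `Candidate` record and its field names are hypothetical; the sketch assumes ipTM and pI have already been collected for each design (pI computed with whatever tool you prefer).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    iptm: float               # interface pTM reported for the design
    pi: float                 # isoelectric point, computed elsewhere
    is_antibody: bool = False


def shortlist(candidates, max_pi=6.0, top_n=10):
    """Apply the pI filter to minibinders only, then rank by ipTM (descending)."""
    kept = [c for c in candidates if c.is_antibody or c.pi < max_pi]
    return sorted(kept, key=lambda c: c.iptm, reverse=True)[:top_n]


designs = [
    Candidate("mb_01", iptm=0.81, pi=5.2),
    Candidate("mb_02", iptm=0.88, pi=7.4),  # dropped: pI is not below 6
    Candidate("mb_03", iptm=0.77, pi=4.9),
]
top = shortlist(designs, top_n=2)
print([c.name for c in top])  # ['mb_01', 'mb_03']
```

The resulting shortlist is what goes on to boltz or chai for validation.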

Adaptyv's own tests of these models showed ESMFold2-inversion binder design costing about $0.85 per accepted design, averaged across 7 targets.

For a framework that composes ESMFold2 with other predictors in one objective, use the mosaic skill.

ESM2 (legacy)

ESM2 still works well for quick embeddings and PLL when ESMC-6B is too large for the available GPU.

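# The esm2_* loaders come from the legacy fair-esm package (pip install fair-esm).
# It also imports as `esm`, so it may clash with the newer esm package;
# a separate environment for legacy ESM2 avoids the name collision.
import torch
import esm

model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
bc = alphabet.get_batch_converter()
model = model.eval().cuda()
_, _, toks = bc([("seq1", "MKTAYIAKQRQISFVK...")])
with torch.no_grad():
    # Layer 33 is the final layer of the 650M model
    rep = model(toks.cuda(), repr_layers=[33])["representations"][33]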

| Model | Parameters | Use |
|-------|------------|-----|
| esm2_t12_35M | 35M | Fast screening |
| esm2_t33_650M | 650M | Standard embeddings/PLL |
| esm2_t36_3B | 3B | Highest-quality ESM2 |

PLL interpretation

PLL (pseudo-log-likelihood) scores how natural a sequence looks to the model. Higher is more natural. Designed sequences often score lower than natural ones, so treat PLL as a soft filter, not a hard cutoff.
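As a minimal sketch of the computation itself: PLL masks one position at a time and sums the log-probability the model assigns to the true residue. The `masked_log_probs` callable is a hypothetical adapter around whichever model you use, and the uniform stand-in below exists only so the example runs without weights. A length-averaged PLL is never positive, so the normalized scale used in this section's table must apply some further normalization that this document does not specify.

```python
import math


def pseudo_log_likelihood(sequence, masked_log_probs):
    """Sum of log p(residue_i | sequence with position i masked).

    masked_log_probs(sequence, i) is a hypothetical adapter: it masks
    position i, runs one forward pass, and returns a dict mapping each
    residue letter to its log-probability at that position.
    """
    return sum(masked_log_probs(sequence, i)[aa] for i, aa in enumerate(sequence))


def mean_pll(sequence, masked_log_probs):
    """Length-averaged PLL, so sequences of different lengths are comparable."""
    return pseudo_log_likelihood(sequence, masked_log_probs) / len(sequence)


def uniform_model(sequence, i):
    """Stand-in 'model': uniform over the 20 canonical residues."""
    return {aa: math.log(1 / 20) for aa in "ACDEFGHIKLMNPQRSTVWY"}


print(round(mean_pll("MKTAYIAK", uniform_model), 3))  # -2.996
```

With a real model this costs one forward pass per residue, which is why PLL is usually run on a shortlist or with a smaller model.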

| Normalized PLL | Interpretation |
|----------------|----------------|
| > 0.2 | Very natural |
| 0.0 to 0.2 | Natural-like |
| -0.5 to 0.0 | Acceptable |
| < -0.5 | May be unnatural |
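If you apply these bands in a pipeline, a small lookup keeps the thresholds in one place. The table does not say which band owns the boundary values, so the inclusive edges below are an assumption, and `interpret_pll` is a hypothetical helper name.

```python
def interpret_pll(normalized_pll):
    """Map a normalized PLL value to the interpretation bands above.

    Assumption: 0.2 and 0.0 fall in "Natural-like", -0.5 in "Acceptable".
    """
    if normalized_pll > 0.2:
        return "Very natural"
    if normalized_pll >= 0.0:
        return "Natural-like"
    if normalized_pll >= -0.5:
        return "Acceptable"
    return "May be unnatural"


for value in (0.35, 0.1, -0.3, -0.8):
    print(value, interpret_pll(value))
```

Because PLL is a soft filter, it is usually better to log the label alongside the design than to drop designs on it alone.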

Troubleshooting

| Issue | Cause | Fix |
|-------|-------|-----|
| CUDA out of memory | ESMC-6B / ESMFold2 too large | Use ESMC-600m API, ESM2, or an 80GB GPU |
| Wrong layer for embeddings | Layer index mismatch | Use the last hidden state (layer 33 for ESM2-650M) |
| Invalid amino acid | Non-standard residue | Check for non-canonical characters |
| Slow ESMFold2 on many designs | Full MSA mode | Use esmfold2-fast-2026-05 single-sequence mode |
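The non-canonical character check can be done before any model call. This is a strict sketch that flags everything outside the 20 standard residues; whether ambiguity codes such as X are tolerated depends on the tokenizer, so relax the set if your model accepts them.

```python
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")


def find_noncanonical(sequence):
    """Return (1-based position, character) for every non-canonical residue."""
    return [
        (i, ch)
        for i, ch in enumerate(sequence.upper(), start=1)
        if ch not in CANONICAL
    ]


print(find_noncanonical("MKTAYXAKQ*"))  # [(6, 'X'), (10, '*')]
```

An empty list means the sequence contains only standard residues.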

Next: Validate structures with boltz or chai, rank with ipsae, then filter with protein-qc.

ESM protein language models for embeddings, sequence scoring, structure prediction, and binder design. Use this skill when: (1) Computing pseudo-log-likelihood (PLL) or mutation-effect scores, (2) Getting protein embeddings for clustering or filtering, (3) Predicting complex structures with ESMFold2, (4) Designing binders by inverting ESMFold2, (5) Filtering designs by sequence plausibility. For diffusion-based structure prediction, use boltz or chai. For QC thresholds, use protein-qc. For gradient-based multi-objective design, use mosaic.

137 stars

data-ai

Updated Jun 12, 2026

Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add adaptyvbio/protein-design-skills esm

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | % |
|---------|-------------|--------|---|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Jun 12, 2026, 3:53 AM (6.5s, 1 file scanned)

SKILL.md

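name: esm
description: >
  ESM protein language models for embeddings, sequence scoring, structure
  prediction, and binder design. Use this skill when: (1) Computing
  pseudo-log-likelihood (PLL) or mutation-effect scores, (2) Getting protein
  embeddings for clustering or filtering, (3) Predicting complex structures
  with ESMFold2, (4) Designing binders by inverting ESMFold2, (5) Filtering
  designs by sequence plausibility. For diffusion-based structure prediction,
  use boltz or chai. For QC thresholds, use protein-qc. For gradient-based
  multi-objective design, use mosaic.
license: MIT
category: design-tools
tags: [sequence-design, embeddings, scoring, structure-prediction, binder]
proteinbase_slug: esm2-optimization
proteinbase_url: https://proteinbase.com/design-methods/esm2-optimization
biomodals_script: modal_esm2_predict_masked.py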

Related Skills

adaptyvbio/protenix

data-ai

Structure prediction with Protenix, an open AlphaFold3 reproduction. Use this skill when: (1) Predicting complex structures with an AF3-class model, (2) Wanting an open alternative to AF3 alongside Boltz and Chai, (3) Validating designed binder-target complexes. For QC thresholds, use protein-qc. For ipSAE ranking, use ipsae.

137 stars · SKILL.md · Updated Jun 12, 2026

adaptyvbio/mosaic

devops

Multi-objective, gradient-based protein binder design with Mosaic. Use this skill when: (1) Composing several structure or sequence models into one design objective, (2) Optimizing binders against a custom loss rather than a fixed pipeline, (3) Wanting gradient descent over sequence space in the style of ColabDesign, RSO, or BindCraft but with interchangeable predictors, (4) Letting the optimizer choose the epitope instead of fixing hotspots. For an end-to-end binder pipeline with default filters, use bindcraft. For all-atom diffusion design, use boltzgen. For backbone-only generation, use rfdiffusion.

137 stars · SKILL.md · Updated Jun 12, 2026

adaptyvbio/germinal

development

De novo antibody and nanobody (VHH) design with Germinal. Use this skill when: (1) Designing epitope-targeted nanobodies or scFvs, (2) Needing CDR design on a fixed framework, (3) Working on antibody-format binders rather than miniproteins. For miniprotein binders, use binder-design (BoltzGen, BindCraft, RFdiffusion, Mosaic). For structure validation, use boltz or chai.

137 stars · SKILL.md · Updated Jun 12, 2026

adaptyvbio/uniprot

testing

Access UniProt for protein sequence and annotation retrieval. Use this skill when: (1) Looking up protein sequences by accession, (2) Finding functional annotations, (3) Getting domain boundaries, (4) Finding homologs and variants, (5) Cross-referencing to PDB structures. For structure retrieval, use pdb. For sequence design, use proteinmpnn.

137 stars · SKILL.md · Updated Mar 27, 2026

Install manually

# Clone the repo
git clone https://github.com/adaptyvbio/protein-design-skills.git

# Create the global Claude Code skills folder if it does not exist yet
mkdir -p ~/.claude/skills

# Copy into Claude Code skills folder (global)
cp -r protein-design-skills/skills/esm ~/.claude/skills/

Repository

adaptyvbio/protein-design-skills

137 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT