chemoinformatics/covalent-design/SKILL.md
Designs covalent inhibitors and warheads targeting cysteine (most common, 98% of covalent drugs), lysine, serine, threonine, tyrosine, and aspartate residues, with explicit handling of warhead reactivity (acrylamide, chloroacetamide, vinyl sulfone, sulfonyl fluoride, fluorosulfate, aldehyde, boronate, nitrile), reversibility (kinact/Ki, t_residence), glutathione (GSH) stability, intrinsic reactivity assays, and covalent docking (DOCKovalent, GOLD, HCovDock). Use when designing covalent inhibitors for targeted covalent inhibition (TCI), KRAS G12C-style approaches, or rationalizing covalent SAR.
npx skillsauth add GPTomics/bioSkills bio-covalent-designInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Reference examples tested with: RDKit 2024.09+, OpenEye / AutoDock Vina 1.2+ (for covalent extensions), GOLD (commercial), DOCKovalent (web service), HCovDock 1.0+.
Before using code patterns, verify installed versions match. If versions differ:
pip show rdkit then help(rdkit.Chem) to check signaturesIf code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
Design molecules that form covalent bonds with target protein residues. The "covalent revolution" (Lonsdale & Ward 2018) made TCIs (Targeted Covalent Inhibitors) clinically validated: KRAS G12C inhibitors (sotorasib, adagrasib), BTK inhibitors (ibrutinib), and EGFR inhibitors (osimertinib) are recent successes. Postdoc-grade covalent design requires balancing intrinsic reactivity (must form bond) vs selectivity (only the intended residue), reversibility (irreversible vs reversible covalent), and drug-likeness (warheads can hurt PK).
For warhead substructure filtering (in non-covalent contexts), see chemoinformatics/substructure-search. For non-covalent docking, see chemoinformatics/virtual-screening. For pose validation, see chemoinformatics/pose-validation.
| Residue | % of covalent drugs | Reactivity | Notes | |---------|---------------------|------------|-------| | Cysteine | ~98% | High (nucleophile thiol) | Most accessible; preferred | | Lysine | ~1% | Moderate (amine) | Less reactive; selective for sulfonyl fluoride | | Serine | <1% | Low (alcohol, requires activation) | β-lactam, boronate | | Threonine | very rare | Low | Boronate, aldehyde | | Tyrosine | very rare | Moderate (phenol) | Sulfonyl fluoride, fluorosulfate | | Aspartate/Glutamate | very rare | Low (carboxylate) | Aldehyde Schiff base |
Cysteine is the dominant target because:
| Warhead | SMARTS pattern | Reactivity | Reversibility | Cys-selective |
|---------|----------------|------------|---------------|----------------|
| Acrylamide | C(=O)C=C | Moderate (Michael acceptor) | Irreversible | Yes |
| Chloroacetamide | C(=O)CCl | High (SN2) | Irreversible | Yes |
| α-haloketone | [CX3](=O)C[F,Cl,Br] | Very high | Irreversible | Yes (but reactive) |
| Vinyl sulfone | S(=O)(=O)C=C | Moderate (Michael) | Irreversible | Yes |
| Sulfonyl fluoride | S(=O)(=O)F | Moderate | Irreversible | Lys/Tyr/Ser |
| Fluorosulfate (SuFEx) | OS(=O)(=O)F | Moderate | Irreversible | Tyr/Lys |
| Aldehyde | C(=O)[H] | Variable | Reversible (covalent equilibrium) | Cys/Lys/Ser |
| Boronate (B-OH or B(OH)2) | B(O)O | Moderate | Reversible | Ser/Thr |
| Nitrile | C#N | Low | Reversible (Cys-S adduct) | Cys |
| Epoxide | C1OC1 | High | Irreversible | Cys/Lys/Asp |
| α,β-unsaturated ketone | [CX3](=O)C=C | Moderate (Michael) | Irreversible | Cys |
| Isothiocyanate | N=C=S | High | Irreversible | Cys/Lys |
| Maleimide | C(=O)N(C(=O))C=C | Very high | Irreversible | Cys |
| Cysteine-selective heterocycle | various | Moderate | Variable | Yes (designed) |
Practical hierarchy: Acrylamide is the modern default for cysteine-selective TCIs (KRAS G12C, EGFR, BTK). Chloroacetamide is more reactive (faster) but less selective.
| Goal | Warhead choice | Reactivity tier | |------|----------------|-----------------| | Cysteine TCI, drug candidate | Acrylamide | Moderate (~kinact/Ki ~10^3-10^5 M^-1 s^-1) | | Cysteine probe (chemical biology) | Chloroacetamide | High (kinact/Ki ~10^4-10^6 M^-1 s^-1) | | Lysine TCI (uncommon) | Sulfonyl fluoride | Moderate | | Tyrosine TCI | Fluorosulfate (SuFEx) | Moderate | | Reversible covalent (KRAS G12C-like) | Acrylamide with α-substitution | Moderate reversibility | | Activity-based protein profiling (ABPP) | Iodoacetamide / chloroacetamide | Very high | | Boronic acid inhibitor (proteasome) | Boronate | Reversible | | Aldehyde inhibitor (calpain) | Aldehyde | Reversible covalent |
Covalent inhibition kinetics:
Modern best practice: report kinact/Ki, not just IC50. Two compounds with same IC50 can have very different kinact/Ki:
| kinact/Ki range (M^-1 s^-1) | Interpretation | Reference | |-----------------------------|----------------|-----------| | > 10^5 | Highly efficient covalent inhibitor | Chloroacetamide probes, fragment-warhead TCIs | | 10^3 - 10^5 | Standard for TCI; clinical candidate | Sotorasib (AMG510) KRAS G12C ~2x10^4 (Hallin 2020); adagrasib ~5x10^3 | | 10^2 - 10^3 | Moderate; clinical possible with high target dwell time | Ibrutinib BTK ~5x10^3 (Pan 2007) | | < 10^2 | Weak; needs warhead optimization | Reversibility likely dominates | | <= 10 | Probably not covalent (or wrong residue) | Background rate vs GSH |
Before committing to a warhead, measure intrinsic reactivity (off-target risk):
# Generic GSH stability assay readout - measure half-life of warhead with 10 mM GSH
# kinact_GSH from time-course of warhead disappearance
| Warhead | GSH t1/2 at 10 mM | Risk | |---------|---------------------|------| | Chloroacetamide | minutes | High (reacts with off-target Cys) | | Acrylamide | hours | Moderate | | Substituted acrylamide (alpha-Me) | days | Low (drug-like) | | Nitrile | days | Low | | Sulfonyl fluoride | hours-days | Variable |
The "GSH-stable" warhead (t1/2 > 4 hours) is the modern target for druglike TCIs.
| Tool | Approach | Strength | Fails when | |------|----------|----------|------------| | DOCKovalent (London et al 2014 Nat Chem Biol 10:1066) | Constraint-based DOCK | Free, well-validated | Browser-based; small library | | GOLD covalent (CCDC) | GOLD with covalent constraint | Commercial; selectivity | License cost | | AutoDock 4 covalent | AD4 with covalent bond | Open source | Slower than Vina | | CovDock (Schrödinger) | Glide-based + covalent | Commercial best | License cost | | MOE covalent | Triposite Discovery | Commercial | License cost | | HCovDock (Wu Q, Huang S-Y 2023 Briefings Bioinform 24:bbac559) | Hierarchical fragment + covalent | Open; supports many residues | Newer, less validated | | ICM-Pro covalent | Active site grid + covalent | Commercial; metal centers | License cost |
For open-source covalent docking, HCovDock (2023) is the modern alternative; DOCKovalent is the longstanding standard.
Goal: Decorate a co-crystal scaffold with a cysteine-targeting warhead and rank candidates by covalent efficiency.
Approach: Load scaffold SMILES, enumerate acrylamide-bearing analogs, filter by reactivity selectivity, dock under covalent constraint, and rank by kinact/Ki surrogates.
from rdkit import Chem
# Step 1: scaffold from co-crystal (4LRW or AMG510)
scaffold_smi = 'c1ccc(C(=O)NC2=Nc3c(...)cnc23)cc1'
scaffold = Chem.MolFromSmiles(scaffold_smi)
# Step 2: enumerate analogs with acrylamide warhead
def add_acrylamide(scaffold, attachment_atom_idx):
"""Append acrylamide (-NC(=O)C=C) at a hydrogen position"""
warhead = Chem.MolFromSmiles('NC(=O)C=C')
# ... combine via Chem.RWMol or fragment combination
pass
# Step 3: filter for reactive group selectivity
# Step 4: dock with DOCKovalent / GOLD covalent / HCovDock
# Step 5: rank by kinact/Ki surrogate (compute reactive Michael acceptor reactivity)
For ranking warheads without wet-lab data:
| Descriptor | Use case |
|------------|----------|
| LUMO energy (DFT) | Michael acceptor reactivity (lower LUMO = more reactive) |
| Electrophile partial charge | SN2 reactivity |
| RDKit rdMolDescriptors.CalcLabuteASA | Steric accessibility |
| AlphaFold3 / Boltz-2 binding pose | Geometric fit to reactive Cys |
Goal: Approximate Michael-acceptor reactivity from 2D structure without running DFT.
Approach: Parse the SMILES, locate the acrylamide substructure, count alpha-carbon substituents (more substitution lowers LUMO and slows reactivity), and return a negative count as a relative reactivity proxy.
def acceptor_lumo_surrogate(smi):
# Crude: count alpha-substituents to acrylamide; more substituents = lower reactivity
mol = Chem.MolFromSmiles(smi)
if mol is None:
return None
acryl_pat = Chem.MolFromSmarts('[CX3](=O)[CX3]=[CX3]')
matches = mol.GetSubstructMatches(acryl_pat)
if not matches:
return None
# Count substituents on alpha-C
alpha_c = mol.GetAtomWithIdx(matches[0][1])
n_subs = len([n for n in alpha_c.GetNeighbors() if n.GetIdx() not in matches[0]])
return -n_subs # crude proxy (lower = more reactive)
For real reactivity prediction, DFT calculations (LUMO energy, HOMO-LUMO gap) are needed.
Trigger: Acrylamide warhead targeted at Tyr.
Mechanism: Acrylamides are Cys-selective; do not form bonds with Tyr at physiological pH.
Symptom: No covalent adduct observed despite docking pose.
Fix: Match warhead to residue: acrylamide/chloroacetamide for Cys; sulfonyl fluoride for Lys/Tyr/Ser.
Trigger: Chloroacetamide in drug-candidate context.
Mechanism: Too reactive; forms adducts with off-target Cys (e.g., GSH t1/2 < 30 min).
Symptom: Toxicity in cell-based assays; non-specific binding signal.
Fix: Replace with acrylamide (more selective); add alpha-substitution to acrylamide for tunable reactivity.
Trigger: Warhead positioned but Cys-Cβ distance > 6 Å.
Mechanism: Even with reactive warhead, geometric reach matters; Cys side chain has limited reach.
Symptom: No covalent labeling in mass spec despite predicted docking.
Fix: Validate by measuring distance from warhead to Cys-Cβ; redock with constrained covalent bond.
Trigger: Designed irreversible TCI but warhead is reversible.
Mechanism: Nitrile, aldehyde, boronate are reversible; equilibrium with non-covalent.
Symptom: Activity wanes after substrate washout in cellular assays.
Fix: Use truly irreversible warhead (acrylamide, chloroacetamide); or design for reversible covalent intentionally.
Trigger: Optimizing for IC50 instead of kinact/Ki.
Mechanism: Compounds with same IC50 differ in covalent efficiency.
Symptom: Apparently identical compounds have different in vivo PK.
Fix: Always measure kinact/Ki (kinetic assay); rank by covalent efficiency.
Trigger: Default DOCKovalent run.
Mechanism: Covalent constraint forces docking; many ligands "succeed" but are unrealistic.
Symptom: Many compounds pass docking; few label in vitro.
Fix: Post-filter by reactivity (chemoinformatics), geometric fit (Cys-Cβ distance), and PoseBusters.
| Aspect | Irreversible | Reversible covalent | |--------|--------------|---------------------| | Examples | KRAS G12C (acrylamide), BTK (ibrutinib) | Boronate (bortezomib), aldehyde (calpain inhibitors) | | Toxicity profile | Off-target Cys labeling potential | Off-target equilibrium | | Resistance mechanism | Mutation of reactive Cys | Mutation reduces affinity | | Patent / IP | Stronger (specific bond) | Standard | | When to choose | If Cys is hot-spot, conserved, druggable | If reversibility critical (e.g., proteasome) |
| Symptom | Cause | Fix | |---------|-------|-----| | Warhead not matching SMARTS | Different stereochemistry or charged | Use canonicalized + neutral SMARTS | | DOCKovalent rejects ligand | No suitable Cys in pocket | Re-check residue accessibility | | GSH adduct dominates | Warhead too reactive | Use less reactive warhead; or alpha-substitute | | Off-target labeling in cells | Promiscuous warhead | Iterate warhead reactivity vs selectivity | | Docking pose but no labeling | Geometric mismatch | Distance check; rotamer search | | Reversible inhibitor not acting irreversibly | Wrong warhead class | Re-check reaction mechanism | | HCovDock fails on PROTAC | Tool optimized for monomer covalent | Use specialized tools for bivalent |
testing
Analyze multi-modal single-cell data (CITE-seq, Multiome, spatial). Use when working with data that measures multiple modalities per cell like RNA + protein or RNA + ATAC. Use when analyzing CITE-seq, Multiome, or other multi-modal single-cell data.
data-ai
Analyze metabolite-mediated cell-cell communication using MeboCost for metabolic signaling inference between cell types. Predict metabolite secretion and sensing patterns from scRNA-seq data. Use when studying metabolic crosstalk between cell populations or metabolite-receptor interactions.
development
Find marker genes and annotate cell types in single-cell RNA-seq using Seurat (R) and Scanpy (Python). Use for differential expression between clusters, identifying cluster-specific markers, scoring gene sets, and assigning cell type labels. Use when finding marker genes and annotating clusters.
development
Reconstruct cell lineage trees from CRISPR barcode tracing or mitochondrial mutations. Use when studying clonal dynamics, cell fate decisions, or developmental trajectories.