scientific-skills/Evidence Insights/antibody-humanizer/SKILL.md
Humanize murine antibody sequences using CDR grafting and framework optimization to reduce immunogenicity while preserving antigen binding. Predicts optimal human germline frameworks and identifies critical back-mutations for therapeutic antibody development.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add aipoch/medical-research-skills antibody-humanizer
Bioinformatics platform for converting murine antibodies into humanized variants by grafting complementarity-determining regions (CDRs) onto human framework templates while preserving antigen-binding affinity and reducing immunogenicity risk.
Key Capabilities:
✅ Use this skill when:
❌ Do NOT use when:
- phage-display-library
- antibody-design-ai
- affinity-maturation-predictor
- fc-engineering-toolkit

Integration:
- antibody-sequencer (VH/VL sequence determination), cdr-grafting-validator (structural assessment)
- protein-struct-viz (3D visualization), immunogenicity-predictor (T-cell epitope analysis)

Parse antibody sequences and identify CDR boundaries:
from scripts.humanizer import AntibodyHumanizer
humanizer = AntibodyHumanizer()
# Analyze antibody sequence
analysis = humanizer.analyze_sequence(
    vh_sequence="QVQLQQSGPELVKPGASVKISCKASGYTFTDYYMHWVKQSHGKSLEWIGYINPSTGYTEYNQKFKDKATLTVDKSSSTAYMQLSSLTSEDSAVYYCAR...",
    vl_sequence="DIQMTQSPSSLSASVGDRVTITCRASQGISSWLAWYQQKPGKAPKLLIYKASSLESGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQQYSSYPYT...",
    scheme="chothia"  # Options: kabat, chothia, imgt
)
# Output CDR locations
print(analysis.cdr_regions)
# {
#     "VH_CDR1": {"start": 26, "end": 32, "seq": "GYTFTDY"},
#     "VH_CDR2": {"start": 52, "end": 58, "seq": "INPSTGY"},
#     ...
# }
Numbering Schemes:

| Scheme | VH CDR1 | VH CDR2 | VH CDR3 | Best For |
|--------|---------|---------|---------|----------|
| Chothia | 26-32 | 52-56 | 95-102 | Structural analysis |
| Kabat | 31-35 | 50-65 | 95-102 | Sequence-based work |
| IMGT | 27-38 | 56-65 | 105-117 | Standardized analysis |
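
The table above can be turned into a small lookup for checking whether a numbered VH position falls inside a CDR. This is a minimal sketch, not the skill's own parser: `VH_CDR_RANGES` and `region_of` are hypothetical names, insertion codes such as 52a are ignored, and position numbers are only meaningful within the scheme that produced them.

```python
from typing import Dict, Tuple

# Nominal VH CDR ranges copied from the table above (inclusive positions).
VH_CDR_RANGES: Dict[str, Dict[str, Tuple[int, int]]] = {
    "chothia": {"CDR1": (26, 32), "CDR2": (52, 56), "CDR3": (95, 102)},
    "kabat": {"CDR1": (31, 35), "CDR2": (50, 65), "CDR3": (95, 102)},
    "imgt": {"CDR1": (27, 38), "CDR2": (56, 65), "CDR3": (105, 117)},
}


def region_of(position: int, scheme: str = "chothia") -> str:
    """Return the CDR name containing `position`, or "framework" otherwise."""
    for name, (start, end) in VH_CDR_RANGES[scheme.lower()].items():
        if start <= position <= end:
            return name
    return "framework"


# The same number can be a CDR position in one scheme and framework in another.
print(region_of(28, "chothia"))  # CDR1
print(region_of(28, "kabat"))    # framework
```

Choosing the wrong scheme shifts these boundaries, which is why the scheme should be fixed before any CDR positions are compared.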
Identify optimal human germline templates:
# Match against human germline database
matches = humanizer.find_human_frameworks(
    vh_framework=analysis.vh_frameworks,
    vl_framework=analysis.vl_frameworks,
    top_n=5,
    criteria=["homology", "canonical_structure", "vernier_similarity"]
)
# Evaluate each candidate
for match in matches:
    print(f"Template: {match.germline_genes}")
    print(f"Homology: {match.homology:.2%}")
    print(f"Vernier Score: {match.vernier_score:.1f}")
    print(f"Risk Level: {match.immunogenicity_risk}")
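
To make the "homology" criterion concrete, the sketch below ranks templates by fractional identity over pre-aligned framework strings. It is an illustration under stated assumptions, not the skill's matcher: `percent_identity` and `rank_templates` are hypothetical helpers, the sequences are toy fragments, and a real search would first align against a germline database.

```python
from typing import Dict, List, Tuple


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Fraction of identical residues between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        return 0.0
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


def rank_templates(
    murine_fr: str, templates: Dict[str, str], top_n: int = 5
) -> List[Tuple[str, float]]:
    """Return the top_n (template name, identity) pairs, highest identity first."""
    scored = [(name, percent_identity(murine_fr, fr)) for name, fr in templates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


# Toy framework fragments, for illustration only.
murine_fr = "QVQLQQSG"
templates = {"toy-A": "QVQLVQSG", "toy-B": "EVQLVESG"}
ranked = rank_templates(murine_fr, templates, top_n=2)
for name, identity in ranked:
    print(f"{name}: {identity:.1%}")
```

Identity alone ignores canonical structure and Vernier similarity, which is why the call above passes all three criteria.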
Matching Criteria:
Assess immunogenicity risk of candidates:
# Score humanization candidates
scores = humanizer.score_candidates(
    murine_antibody=analysis,
    human_templates=matches,
    scoring_methods=["t20", "h_score", "germline_deviation", "paratope_diversity"]
)
# Rank by overall score
ranked = scores.rank_by_composite_score(
    weights={"humanness": 0.4, "binding_retention": 0.4, "developability": 0.2}
)
Scoring Methods:

| Method | Description | Target |
|--------|-------------|--------|
| T20 Score | 20-mer peptide humanization | >80% human |
| H-Score | Hummerblind germline distance | <15 mutations |
| Paratope Diversity | CDR germline gene diversity | Low diversity |
| Developability | Aggregation/pH stability prediction | High score |
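
The composite ranking above weights humanness, binding retention, and developability at 0.4, 0.4, and 0.2. A minimal sketch of such a weighted score follows, assuming each component has already been normalized to the 0-1 range; `composite_score` is a hypothetical helper and may not match how `rank_by_composite_score` is implemented.

```python
from typing import Dict

# Weights taken from the rank_by_composite_score call above.
WEIGHTS: Dict[str, float] = {
    "humanness": 0.4,
    "binding_retention": 0.4,
    "developability": 0.2,
}


def composite_score(scores: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of component scores; every weighted component must be present."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[key] * scores[key] for key in weights) / total_weight


# Illustrative component scores, not real measurements.
example = {"humanness": 0.9, "binding_retention": 0.8, "developability": 0.7}
print(f"{composite_score(example):.2f}")  # 0.82
```

Dividing by the total weight keeps the result on the same 0-1 scale even if the weights are later changed so that they no longer sum to one.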
Identify critical residues to retain from murine framework:
# Predict back-mutations
back_mutations = humanizer.predict_back_mutations(
    murine_vh=analysis.vh_sequence,
    human_vh=matches[0].human_template,
    cdr_regions=analysis.cdr_regions,
    rationale_required=True
)
# Output shows position-specific recommendations
for mutation in back_mutations:
    print(f"Position {mutation.position}: {mutation.human_aa} → {mutation.murine_aa}")
    print(f"Rationale: {mutation.reason}")  # e.g., "Vernier region contact"
    print(f"Priority: {mutation.priority}")  # Critical/Important/Optional
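
The core idea behind back-mutation prediction can be sketched as a comparison of murine and human residues at framework positions of interest. The example below is an assumption-laden illustration: `candidate_back_mutations` is a hypothetical helper, `EXAMPLE_POSITIONS` is an illustrative subset rather than an authoritative Vernier-zone list, and the residues are toy data.

```python
from typing import Dict, Iterable, List, Tuple

# Illustrative positions only; a curated Vernier-zone reference should supply the real set.
EXAMPLE_POSITIONS = {47, 48, 67, 71}


def candidate_back_mutations(
    murine: Dict[int, str], human: Dict[int, str], positions: Iterable[int]
) -> List[Tuple[int, str, str]]:
    """Return (position, human_aa, murine_aa) wherever the two frameworks differ."""
    candidates = []
    for pos in sorted(positions):
        murine_aa, human_aa = murine.get(pos), human.get(pos)
        # Skip positions missing from either numbering map.
        if murine_aa is None or human_aa is None:
            continue
        if murine_aa != human_aa:
            candidates.append((pos, human_aa, murine_aa))
    return candidates


# Toy residue maps keyed by numbered position.
murine_fr = {47: "W", 48: "I", 67: "A", 71: "V"}
human_fr = {47: "W", 48: "M", 67: "V", 71: "R"}
for pos, human_aa, murine_aa in candidate_back_mutations(murine_fr, human_fr, EXAMPLE_POSITIONS):
    print(f"Position {pos}: {human_aa} -> {murine_aa}")
```

A difference at one of these positions is only a candidate; assigning a priority still depends on structural context such as CDR contacts.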
Critical Residue Classes:
Scenario: Convert murine anti-tumor antibody to therapeutic candidate.
# Humanize single antibody
python scripts/main.py \
--vh "QVQLQQSGPELVKPGASVKISCKAS..." \
--vl "DIQMTQSPSSLSASVGDRVTITCRAS..." \
--name "Anti-HER2-Murine-1" \
--scheme chothia \
--top-n 3 \
--output humanization_report.json
# Review top candidates
jq '.candidates[0]' humanization_report.json
Workflow:
Scenario: Screen multiple murine clones from hybridoma campaign.
# Process multiple antibodies
antibodies = [
    {"name": "Clone-A", "vh": "...", "vl": "..."},
    {"name": "Clone-B", "vh": "...", "vl": "..."},
    {"name": "Clone-C", "vh": "...", "vl": "..."}
]
results = humanizer.batch_humanize(
    antibodies=antibodies,
    ranking_criteria="composite_score",
    min_humanness=0.85
)
# Rank by developability
ranked = results.rank_by(criteria=["humanness", "binding_retention", "stability"])
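
The filtering and ranking in this scenario can be pictured as a threshold on humanness followed by a multi-key sort. This sketch assumes each result is a plain dictionary of already-computed scores; `filter_and_rank` is a hypothetical helper and the numbers are invented for illustration.

```python
from typing import Dict, List, Sequence, Union

Result = Dict[str, Union[str, float]]


def filter_and_rank(
    results: List[Result],
    min_humanness: float = 0.85,
    criteria: Sequence[str] = ("humanness", "binding_retention", "stability"),
) -> List[Result]:
    """Drop clones below the humanness cutoff, then sort by the criteria in order."""
    passing = [r for r in results if r["humanness"] >= min_humanness]
    # Earlier criteria dominate; later ones only break ties.
    return sorted(passing, key=lambda r: tuple(r[c] for c in criteria), reverse=True)


# Invented scores for three clones.
clones = [
    {"name": "Clone-A", "humanness": 0.90, "binding_retention": 0.70, "stability": 0.80},
    {"name": "Clone-B", "humanness": 0.80, "binding_retention": 0.95, "stability": 0.90},
    {"name": "Clone-C", "humanness": 0.90, "binding_retention": 0.80, "stability": 0.60},
]
print([r["name"] for r in filter_and_rank(clones)])  # ['Clone-C', 'Clone-A']
```

Clone-B is excluded despite its strong binding score, which shows how a hard humanness cutoff can discard otherwise attractive clones.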
Selection Criteria:
Scenario: Compare different humanization strategies for lead candidate.
# Test multiple framework combinations
strategies = [
    {"vh": "IGHV1-2*02", "vl": "IGKV1-12*01", "name": "Template-A"},
    {"vh": "IGHV3-23*01", "vl": "IGKV3-20*01", "name": "Template-B"},
    {"vh": "IGHV4-34*01", "vl": "IGKV1-5*01", "name": "Template-C"}
]
comparison = humanizer.compare_strategies(
    murine_antibody=analysis,
    strategies=strategies,
    metrics=["homology", "back_mutations", "immunogenicity", "paratope_structure"]
)
comparison.generate_report("framework_comparison.pdf")
Comparison Metrics:
Scenario: Assess humanization for patent landscape analysis.
# Generate humanized variants
python scripts/main.py \
--input murine_lead.json \
--generate-variants 10 \
--include-back-mutations \
--output variants_for_ip.json
# Check novelty against patent databases
python scripts/patent_check.py \
--sequences variants_for_ip.json \
--databases USPTO EPO WIPO \
--output novelty_report.pdf
IP Considerations:
From murine hybridoma to therapeutic candidate:
# Step 1: Sequence analysis and CDR identification
python scripts/main.py \
--vh "$VH_SEQUENCE" \
--vl "$VL_SEQUENCE" \
--scheme chothia \
--output step1_analysis.json
# Step 2: Find best human frameworks
python scripts/main.py \
--input step1_analysis.json \
--find-frameworks \
--top-n 5 \
--output step2_frameworks.json
# Step 3: Score and rank candidates
python scripts/main.py \
--input step2_frameworks.json \
--score-candidates \
--include-immunogenicity \
--output step3_scored.json
# Step 4: Predict back-mutations
python scripts/main.py \
--input step3_scored.json \
--predict-back-mutations \
--rationale \
--output step4_backmutations.json
# Step 5: Generate final humanized sequences
python scripts/main.py \
--input step4_backmutations.json \
--generate-sequences \
--format fasta \
--output humanized_antibody.fasta
Python API:
from scripts.humanizer import AntibodyHumanizer
from scripts.scoring import HumanizationScorer
from scripts.backmutation import BackMutationPredictor
# Initialize pipeline
humanizer = AntibodyHumanizer()
scorer = HumanizationScorer()
bm_predictor = BackMutationPredictor()
# Step 1: Parse and analyze
# (murine_vh and murine_vl are the input VH/VL amino-acid strings)
antibody = humanizer.analyze_sequence(
    vh_sequence=murine_vh,
    vl_sequence=murine_vl,
    scheme="chothia"
)
# Step 2: Find human frameworks
candidates = humanizer.find_human_frameworks(
    antibody,
    top_n=5
)
# Step 3: Score candidates
for candidate in candidates:
    scores = scorer.calculate_scores(
        murine=antibody,
        humanized=candidate
    )
    candidate.composite_score = scores.weighted_score()
# Step 4: Select best and predict back-mutations
best = max(candidates, key=lambda x: x.composite_score)
back_mutations = bm_predictor.predict(
    murine=antibody,
    human_template=best
)
# Step 5: Generate final sequence
final_sequence = humanizer.generate_humanized_sequence(
    template=best,
    back_mutations=back_mutations,
    cdrs=antibody.cdr_regions
)
print("Humanized antibody generated:")
print(f"- Humanness: {best.humanness:.1%}")
print(f"- Back-mutations: {len(back_mutations)}")
print(f"- Risk level: {best.immunogenicity_risk}")
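
Conceptually, the final step interleaves human framework segments with the murine CDRs. The sketch below shows only that assembly and makes no claim about how `generate_humanized_sequence` works internally; `graft_cdrs` is a hypothetical helper and the segments are placeholder strings rather than real antibody sequence.

```python
from typing import Sequence


def graft_cdrs(frameworks: Sequence[str], cdrs: Sequence[str]) -> str:
    """Assemble FR1-CDR1-FR2-CDR2-FR3-CDR3-FR4 from four frameworks and three CDRs."""
    if len(frameworks) != 4 or len(cdrs) != 3:
        raise ValueError("expected 4 framework segments and 3 CDRs")
    parts = []
    # Pad the CDR list with an empty string so FR4 is emitted without a trailing CDR.
    for framework, cdr in zip(frameworks, list(cdrs) + [""]):
        parts.append(framework)
        parts.append(cdr)
    return "".join(parts)


# Placeholder segments: uppercase for human frameworks, lowercase for murine CDRs.
human_frameworks = ["AAA", "BBB", "CCC", "DDD"]
murine_cdrs = ["x", "y", "z"]
print(graft_cdrs(human_frameworks, murine_cdrs))  # AAAxBBByCCCzDDD
```

Back-mutations would then be applied as point substitutions within the framework segments before or after this assembly.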
Input Quality:
Humanization Assessment:
Output Validation:
Before Experimental Work:
Sequence Issues:
❌ Incomplete sequences → Missing framework regions
❌ Wrong numbering scheme → CDR boundaries incorrect
❌ Non-standard residues → Unusual amino acids
Design Issues:
❌ Over-humanization → Losing antigen binding
❌ Ignoring back-mutations → Assuming 100% human framework works
❌ Single candidate only → No backup options
Experimental Issues:
❌ Skipping binding validation → Assuming in silico = in vivo
❌ Ignoring developability → Aggregation or instability
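
Two of the sequence issues listed above, non-standard residues and incomplete sequences, can be caught with a simple pre-check before running the tool. This is a rough sketch: `check_sequence` is a hypothetical helper, and the default `min_length` of 100 is an assumed threshold for a variable domain, not a value taken from the skill.

```python
from typing import List

STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def check_sequence(sequence: str, min_length: int = 100) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    cleaned = sequence.strip().upper()
    # Anything outside the 20 standard one-letter codes (X, B, Z, gaps, stops) is flagged.
    unexpected = sorted(set(cleaned) - STANDARD_AMINO_ACIDS)
    if unexpected:
        problems.append(f"non-standard residues: {', '.join(unexpected)}")
    if len(cleaned) < min_length:
        problems.append(f"sequence has {len(cleaned)} residues, expected at least {min_length}")
    return problems


print(check_sequence("QVQLX"))
print(check_sequence("A" * 110))  # []
```

A check like this cannot detect a wrong numbering scheme; that still requires comparing the reported CDR boundaries against the expected ranges.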
Available in references/ directory:
- imgt_germline_database.md - Human germline gene reference sequences
- cdr_numbering_schemes.md - Kabat, Chothia, IMGT comparison
- humanization_case_studies.md - Successful therapeutic examples
- vernier_positions_guide.md - Critical framework residues
- immunogenicity_assessment.md - T-cell epitope prediction methods
- patent_landscape.md - Humanization IP considerations

Located in scripts/ directory:

- main.py - CLI interface for humanization
- humanizer.py - Core humanization engine
- cdr_parser.py - CDR identification and numbering
- framework_matcher.py - Human germline database search
- scoring.py - Humanization quality assessment
- backmutation.py - Critical residue prediction
- batch_processor.py - Multiple antibody screening
- structure_predictor.py - CDR conformation analysis

| Parameter | Type | Default | Required | Description |
|-----------|------|---------|----------|-------------|
| --vh | string | - | No | Murine VH sequence (amino acids) |
| --vl | string | - | No | Murine VL sequence (amino acids) |
| --input, -i | string | - | No | Input JSON file path |
| --name, -n | string | "" | No | Antibody name |
| --output, -o | string | - | No | Output file path |
| --format, -f | string | json | No | Output format (json, fasta, csv) |
| --scheme, -s | string | chothia | No | Numbering scheme (kabat, chothia, imgt) |
| --top-n | int | 3 | No | Number of best candidates to return |
# Humanize with direct sequence input
python scripts/main.py --vh "QVQLQQSGPELVKPGASVKMSCKAS..." --vl "DIQMTQSPSSLSASVGDRVTITC..." --name "MyAntibody"
# Use JSON input file
python scripts/main.py --input antibody.json --output results.json
# Use IMGT numbering scheme
python scripts/main.py --vh "SEQUENCE" --vl "SEQUENCE" --scheme imgt
{
"vh_sequence": "QVQLQQSGPELVKPGASVKMSCKAS...",
"vl_sequence": "DIQMTQSPSSLSASVGDRVTITC...",
"name": "MyAntibody",
"scheme": "chothia"
}
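
An input file in the JSON shape shown above can be produced from Python with the standard library. The sketch below assumes only the four keys from that example; `write_input_file` is a hypothetical helper, and the sequences are placeholders.

```python
import json
import os
import tempfile
from typing import Dict

VALID_SCHEMES = {"kabat", "chothia", "imgt"}


def write_input_file(
    path: str, vh: str, vl: str, name: str = "", scheme: str = "chothia"
) -> Dict[str, str]:
    """Write an input JSON file with the keys used in the example above."""
    if scheme not in VALID_SCHEMES:
        raise ValueError(f"unknown scheme: {scheme!r}")
    payload = {"vh_sequence": vh, "vl_sequence": vl, "name": name, "scheme": scheme}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
    return payload


# Write to a temporary directory and read the file back to confirm the round trip.
out_path = os.path.join(tempfile.mkdtemp(), "antibody.json")
written = write_input_file(out_path, "SEQUENCE", "SEQUENCE", name="MyAntibody")
with open(out_path, encoding="utf-8") as handle:
    loaded = json.load(handle)
print(loaded == written)  # True
```

The resulting file can then be passed to the CLI through the `--input` option listed in the parameter table.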
| Risk Indicator | Assessment | Level |
|----------------|------------|-------|
| Code Execution | Python script executed locally | Medium |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Low |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output may contain proprietary sequences | Medium |
# Python 3.7+
# No external packages required (uses standard library)
🔬 Critical Note: Computational humanization is a design tool, not a substitute for experimental validation. Always express and test humanized candidates for binding affinity, specificity, stability, and immunogenicity before therapeutic development.