skills/lab-automation/benchling-integration/SKILL.md
Benchling R&D Python SDK: CRUD on registry entities (DNA, RNA, proteins, custom), inventory, ELN, workflow automation. Needs Benchling account and API key. Use biopython for local sequence analysis; pubchem for chemical DBs.
npx skillsauth add jaechang-hits/scicraft benchling-integration
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Benchling is a cloud platform for life sciences R&D. The Python SDK provides programmatic access to registry entities (DNA, proteins), inventory, electronic lab notebooks, and workflows. All operations require a Benchling tenant URL and API key or OAuth credentials.
pip install benchling-sdk
Authentication setup: Obtain an API key from Benchling Profile Settings. Store securely in environment variables — never commit to version control.
import os
from benchling_sdk.benchling import Benchling
from benchling_sdk.auth.api_key_auth import ApiKeyAuth

# Read the API key from the environment rather than hardcoding it
benchling = Benchling(
    url="https://your-tenant.benchling.com",
    auth_method=ApiKeyAuth(os.environ["BENCHLING_API_KEY"]),
)
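If the environment variable is missing, `os.environ[...]` raises a bare `KeyError`, which can be confusing in scheduled jobs. A minimal sketch of a friendlier lookup follows; `require_env` is a hypothetical helper name, not part of the Benchling SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise a readable error.

    Hypothetical helper for illustration; it is not part of benchling_sdk.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; export it before running."
        )
    return value
```

It could then be used as `ApiKeyAuth(require_env("BENCHLING_API_KEY"))` in place of the direct `os.environ` lookup.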
OAuth (for multi-user apps):
import os
from benchling_sdk.benchling import Benchling
from benchling_sdk.auth.client_credentials_oauth2 import ClientCredentialsOAuth2

# Client credentials are read from the environment, like the API key above
benchling = Benchling(
    url="https://your-tenant.benchling.com",
    auth_method=ClientCredentialsOAuth2(
        client_id=os.environ["BENCHLING_CLIENT_ID"],
        client_secret=os.environ["BENCHLING_CLIENT_SECRET"],
    ),
)
API rate limits: Benchling enforces per-tenant rate limits. The SDK automatically retries on 429 responses with exponential backoff (up to 5 retries by default). For bulk operations, add time.sleep(0.5) between batches.
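To make "exponential backoff" concrete, here is a small standard-library sketch of the idea. It is not the SDK's internal implementation; `TransientError`, `backoff_delays`, and `call_with_backoff` are hypothetical names, and the base delay and cap are assumed values chosen for illustration.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a retryable failure such as an HTTP 429 response."""


def backoff_delays(max_tries: int, base: float = 0.5, cap: float = 30.0) -> List[float]:
    """Seconds to wait before each retry: base, 2*base, 4*base, ... up to `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_tries - 1)]


def call_with_backoff(
    func: Callable[[], T],
    max_tries: int = 5,
    base: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func` up to `max_tries` times, sleeping longer after each failure."""
    delays = backoff_delays(max_tries, base)
    for attempt in range(max_tries):
        try:
            return func()
        except TransientError:
            if attempt == max_tries - 1:
                raise  # out of attempts: surface the last error
            sleep(delays[attempt])
    raise RuntimeError("max_tries must be at least 1")
```

The `sleep` parameter is injectable so the schedule can be checked without actually waiting. A wrapper like this is mainly useful for calls that do not go through the SDK client.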
import os
from benchling_sdk.benchling import Benchling
from benchling_sdk.auth.api_key_auth import ApiKeyAuth
from benchling_sdk.models import DnaSequenceCreate

benchling = Benchling(
    url="https://your-tenant.benchling.com",
    auth_method=ApiKeyAuth(os.environ["BENCHLING_API_KEY"]),
)

# Create a DNA sequence
seq = benchling.dna_sequences.create(
    DnaSequenceCreate(
        name="GFP-insert",
        bases="ATGGTGAGCAAGGGC",
        is_circular=False,
        folder_id="fld_abc123",
    )
)
print(f"Created: {seq.name} ({seq.id})")
Registry entities include DNA sequences, RNA sequences, AA sequences, custom entities, and mixtures. All entity types follow the same create/read/update/archive pattern.
from benchling_sdk.models import DnaSequenceCreate, DnaSequenceUpdate

# Create
sequence = benchling.dna_sequences.create(
    DnaSequenceCreate(
        name="My Plasmid",
        bases="ATCGATCG",
        is_circular=True,
        folder_id="fld_abc123",
        schema_id="ts_abc123",
        fields=benchling.models.fields({"gene_name": "GFP"}),
    )
)
print(f"Created: {sequence.id}")

# Read
seq = benchling.dna_sequences.get_by_id(sequence.id)
print(f"Name: {seq.name}, Length: {len(seq.bases)} bp")

# Update (partial: unspecified fields are left unchanged)
updated = benchling.dna_sequences.update(
    sequence_id=sequence.id,
    dna_sequence=DnaSequenceUpdate(
        name="Updated Plasmid",
        fields=benchling.models.fields({"gene_name": "mCherry"}),
    ),
)

# Archive
benchling.dna_sequences.archive(ids=[sequence.id], reason="RETIRED")
from benchling_sdk.models import NamingStrategy

# Register entity in registry (with auto-generated ID)
# registry_id selects the target registry; naming_strategy controls how the ID is generated.
# Do not also pass entity_registry_id: it is mutually exclusive with naming_strategy.
registered = benchling.dna_sequences.create(
    DnaSequenceCreate(
        name="Production Plasmid",
        bases="ATCGATCG",
        is_circular=True,
        folder_id="fld_abc123",
        registry_id="src_abc123",
        naming_strategy=NamingStrategy.NEW_IDS,  # or NamingStrategy.IDS_FROM_NAMES
    )
)
print(f"Registry ID: {registered.entity_registry_id}")

# Entity types available via SDK:
# benchling.dna_sequences, benchling.rna_sequences,
# benchling.aa_sequences, benchling.custom_entities, benchling.mixtures
All list operations return paginated generators for memory efficiency.
# List with pagination
sequences = benchling.dna_sequences.list()
total = sequences.estimated_count()
print(f"Total sequences: {total}")
for page in sequences:
    for seq in page:
        print(f" {seq.name} ({seq.id}): {len(seq.bases)} bp")

# Filter by schema
filtered = benchling.dna_sequences.list(schema_id="ts_abc123")
for page in filtered:
    for seq in page:
        print(f" {seq.name}")
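The nested page/item loops above can be flattened into a single lazy stream without loading every page into memory. The sketch below uses only the standard library; `iter_items` is a hypothetical helper, and `fake_pages` is a stand-in generator that mimics the page-by-page shape of a `list()` result rather than calling the API.

```python
from itertools import chain, islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_items(pages: Iterable[List[T]]) -> Iterator[T]:
    """Yield items one by one from an iterable of pages, fetching pages lazily."""
    return chain.from_iterable(pages)


def fake_pages() -> Iterator[List[str]]:
    """Stand-in for a paginated result: each yielded list is one page."""
    yield ["seq_1", "seq_2"]
    yield ["seq_3"]
    yield []  # empty pages are skipped naturally
    yield ["seq_4"]


# Take only the first three items; later pages are never requested
first_three = list(islice(iter_items(fake_pages()), 3))
print(first_three)
```

With a real client, the same helper would be called as `iter_items(benchling.dna_sequences.list())`, assuming the result iterates page by page as in the examples above.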
Manage physical samples, containers, boxes, and locations.
from benchling_sdk.models import ContainerCreate, BoxCreate

# Create container (sample tube)
container = benchling.containers.create(
    ContainerCreate(
        name="Sample Tube 001",
        schema_id="cont_schema_abc123",
        parent_storage_id="box_abc123",
        fields=benchling.models.fields({"concentration": "100 ng/uL"}),
    )
)
print(f"Container: {container.id}, Barcode: {container.barcode}")

# Create box
box = benchling.boxes.create(
    BoxCreate(
        name="Freezer Box A1",
        schema_id="box_schema_abc123",
        parent_storage_id="loc_abc123",
    )
)

# Transfer container to new location
benchling.containers.transfer(
    container_id=container.id,
    destination_id="box_xyz789",
)
print(f"Transferred {container.name} to new box")
Create and manage electronic lab notebook entries.
from benchling_sdk.models import EntryCreate

# Create notebook entry
entry = benchling.entries.create(
    EntryCreate(
        name="Experiment 2026-02-17",
        folder_id="fld_abc123",
        schema_id="entry_schema_abc123",
        fields=benchling.models.fields({
            "objective": "Test gene expression levels",
            "protocol": "Standard qPCR",
        }),
    )
)
print(f"Entry: {entry.id}")

# Link entity to entry
benchling.entry_links.create(
    entry_id=entry.id,
    entity_id="seq_xyz789",
)
Create and manage workflow tasks for lab process automation.
from benchling_sdk.models import WorkflowTaskCreate, WorkflowTaskUpdate
from benchling_sdk.helpers.tasks import wait_for_task

# Create workflow task
task = benchling.workflow_tasks.create(
    WorkflowTaskCreate(
        name="PCR Amplification",
        workflow_id="wf_abc123",
        assignee_id="user_abc123",
        fields=benchling.models.fields({"template": "seq_abc123"}),
    )
)
print(f"Task: {task.id}, Status: {task.status}")

# Update task status
benchling.workflow_tasks.update(
    task_id=task.id,
    workflow_task=WorkflowTaskUpdate(status_id="status_complete_abc123"),
)

# Wait for async operations
result = wait_for_task(
    benchling, task_id="task_abc123",
    interval_wait_seconds=2, max_wait_seconds=300,
)
print(f"Async task completed: {result}")
from benchling_sdk.helpers.retry_helpers import RetryStrategy
from benchling_sdk.errors import BenchlingError

# Custom retry strategy (the SDK parameter is named max_tries)
benchling = Benchling(
    url="https://your-tenant.benchling.com",
    auth_method=ApiKeyAuth(os.environ["BENCHLING_API_KEY"]),
    retry_strategy=RetryStrategy(max_tries=3),
)
# SDK auto-retries on 429 (rate limit), 502, 503, 504

# Error handling
try:
    seq = benchling.dna_sequences.get_by_id("seq_nonexistent")
except BenchlingError as e:
    # str(e) carries the error details returned by the API
    print(f"API error: {e.status_code}: {e}")
| Benchling Type | SDK Accessor | Use Case |
|---------------|-------------|----------|
| DNA Sequence | benchling.dna_sequences | Plasmids, primers, gene inserts |
| RNA Sequence | benchling.rna_sequences | mRNA, gRNA, siRNA |
| AA Sequence | benchling.aa_sequences | Proteins, antibodies, enzymes |
| Custom Entity | benchling.custom_entities | Cell lines, reagents, samples |
| Mixture | benchling.mixtures | Buffers, media, compound formulations |
| Container | benchling.containers | Tubes, wells, vials |
| Box | benchling.boxes | Storage boxes, racks |
| Entry | benchling.entries | Lab notebook entries |
| Workflow Task | benchling.workflow_tasks | Process steps, assignments |
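Benchling entities use schema-defined custom fields. Always build them with the fields() helper. The SDK ships this helper as benchling_sdk.helpers.serialization_helpers.fields, with each value wrapped as {"value": ...}; if benchling.models.fields is not available in your SDK version, import the helper from there: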
# Correct: use fields() helper
fields = benchling.models.fields({
    "concentration": "100 ng/uL",
    "date_prepared": "2026-02-17",
    "passage_number": 5,
})
# Fields are typed by schema: string, number, date, entity link, dropdown
All list() calls return paginated generators. Never call list() without iterating:
# Correct: iterate through pages
for page in benchling.dna_sequences.list():
    for item in page:
        process(item)

# Get count without loading all data
count = benchling.dna_sequences.list().estimated_count()
import os
import time
from Bio import SeqIO
from benchling_sdk.benchling import Benchling
from benchling_sdk.auth.api_key_auth import ApiKeyAuth
from benchling_sdk.models import DnaSequenceCreate

benchling = Benchling(
    url="https://your-tenant.benchling.com",
    auth_method=ApiKeyAuth(os.environ["BENCHLING_API_KEY"]),
)

created = []
for record in SeqIO.parse("sequences.fasta", "fasta"):
    seq = benchling.dna_sequences.create(
        DnaSequenceCreate(
            name=record.id,
            bases=str(record.seq),
            is_circular=False,
            folder_id="fld_abc123",
            fields=benchling.models.fields({
                "description": record.description,
                "source": "FASTA import",
            }),
        )
    )
    created.append(seq.id)
    time.sleep(0.5)  # Rate limit compliance
    print(f"Created: {record.id} -> {seq.id}")
print(f"Imported {len(created)} sequences")
import csv
import os
from benchling_sdk.benchling import Benchling
from benchling_sdk.auth.api_key_auth import ApiKeyAuth

benchling = Benchling(
    url="https://your-tenant.benchling.com",
    auth_method=ApiKeyAuth(os.environ["BENCHLING_API_KEY"]),
)

FIELDNAMES = ["id", "name", "barcode", "location", "created"]
audit = []
containers = benchling.containers.list(parent_storage_id="loc_freezer01")
for page in containers:
    for c in page:
        audit.append({
            "id": c.id,
            "name": c.name,
            "barcode": c.barcode,
            "location": c.parent_storage_id,
            "created": str(c.created_at),
        })

# Explicit fieldnames keep the header valid even when no containers are found
with open("inventory_audit.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(audit)
print(f"Audit complete: {len(audit)} containers")
benchling.workflow_tasks.list(workflow_id=..., status="pending")
benchling.dna_sequences.get_by_id()
benchling.workflow_tasks.update()
benchling.entries.create()

| Parameter | Function/Endpoint | Default | Options | Effect |
|-----------|------------------|---------|---------|--------|
| folder_id | All create operations | Required | fld_... | Target folder for new entity |
| schema_id | All create operations | Optional | ts_..., cont_... | Schema defining custom fields |
| registry_id | Entity registration | Optional | src_... | Registry to register entity in |
| naming_strategy | Entity registration | — | NEW_IDS, IDS_FROM_NAMES | How registry IDs are generated |
| parent_storage_id | Containers, boxes | Optional | box_..., loc_... | Storage location for inventory |
| max_tries | RetryStrategy | 5 | 0–10 | Number of retry attempts on failure |
| interval_wait_seconds | wait_for_task | 2 | 1–60 | Polling interval for async tasks |
| max_wait_seconds | wait_for_task | 300 | 10–3600 | Maximum wait for async completion |
Always use environment variables for credentials: Never hardcode API keys. Use os.environ["BENCHLING_API_KEY"].
Use the fields() helper for custom schema fields: Raw dicts will not work — the SDK requires typed Fields objects.
Anti-pattern — loading all entities into memory: Use the paginated generator pattern. Never convert list() to a Python list for large datasets.
Add rate limit delays for bulk operations: Insert time.sleep(0.5) between create/update calls when processing >50 entities.
Use OAuth for production apps, API keys for scripts: API keys are user-scoped; OAuth allows app-level permissions and rotation.
Anti-pattern — using both entity_registry_id and naming_strategy: These are mutually exclusive on create. Use one or the other.
Handle BenchlingError explicitly: Catch SDK exceptions and log the status code and message for debugging.
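The batching and rate-limit advice above can be combined in one small helper. This is a sketch using only the standard library; `chunked` and `process_in_batches` are hypothetical names, and the default batch size of 50 and pause of 0.5 seconds simply mirror the numbers quoted in this document.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most `size` items without materializing the input."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


def process_in_batches(
    items: Iterable[T],
    handler: Callable[[List[T]], None],
    size: int = 50,
    pause: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Pass each batch to `handler`, pausing between batches; return the item count."""
    processed = 0
    for index, batch in enumerate(chunked(items, size)):
        if index:
            sleep(pause)  # pause between batches, not before the first one
        handler(batch)
        processed += len(batch)
    return processed
```

Here `handler` would wrap whichever SDK call operates on a batch, for example an archive call that accepts a list of IDs.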
import csv

FIELDNAMES = ["registry_id", "name", "length", "bases"]
export = []
for page in benchling.dna_sequences.list(schema_id="ts_target_schema"):
    for seq in page:
        export.append({
            "registry_id": seq.entity_registry_id,
            "name": seq.name,
            "length": len(seq.bases),
            # Truncate long sequences to a 50-base preview
            "bases": (seq.bases[:50] + "...") if len(seq.bases) > 50 else seq.bases,
        })

# Explicit fieldnames avoid an IndexError when the export is empty
with open("sequences_export.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(export)
print(f"Exported {len(export)} sequences")
# Search entities with specific field values
# Note: SDK list() supports limited filtering; for complex queries use Data Warehouse
results = []
for page in benchling.custom_entities.list(schema_id="ts_cell_lines"):
    for entity in page:
        fields = entity.fields or {}
        if fields.get("organism", {}).get("value") == "Human":
            results.append(entity)
            print(f"Found: {entity.name} ({entity.id})")
print(f"Total human cell lines: {len(results)}")
import time
from datetime import datetime, timedelta, timezone

# Use an aware UTC timestamp: comparing a naive datetime with an aware one raises TypeError
cutoff = datetime.now(timezone.utc) - timedelta(days=365)
to_archive = []
for page in benchling.custom_entities.list():
    for entity in page:
        if entity.modified_at and entity.modified_at < cutoff:
            to_archive.append(entity.id)

# Archive in batches
batch_size = 50
for i in range(0, len(to_archive), batch_size):
    batch = to_archive[i:i + batch_size]
    benchling.custom_entities.archive(ids=batch, reason="RETIRED")
    print(f"Archived batch {i // batch_size + 1}: {len(batch)} entities")
    time.sleep(1)
print(f"Total archived: {len(to_archive)}")
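The age check in the archive example depends on comparing timestamps safely. Python raises `TypeError` when a timezone-naive datetime is compared with a timezone-aware one, so a small guard can help. The sketch below is a hypothetical helper (`is_stale`), and it assumes that any naive timestamp should be interpreted as UTC.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(
    modified_at: Optional[datetime],
    max_age_days: int = 365,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if `modified_at` is older than `max_age_days` before `now`."""
    if modified_at is None:
        return False  # no timestamp: leave the entity alone
    if modified_at.tzinfo is None:
        # Assumption: naive timestamps are treated as UTC
        modified_at = modified_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return modified_at < now - timedelta(days=max_age_days)
```

Passing `now` explicitly makes the check reproducible in tests and in dry runs before anything is archived.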
| Problem | Cause | Solution |
|---------|-------|----------|
| 401 Unauthorized | Invalid or expired API key | Regenerate key in Benchling Profile Settings; check env var is set |
| 403 Forbidden | Insufficient permissions | API key inherits user permissions; check user role in Benchling admin |
| 404 Not Found | Wrong entity ID or tenant URL | Verify ID format (seq_, fld_, etc.); check tenant URL matches |
| 429 Too Many Requests | Rate limit exceeded | SDK auto-retries; add time.sleep() between bulk operations |
| fields ignored on create | Using raw dict instead of fields() helper | Use benchling.models.fields({...}) for custom schema fields |
| naming_strategy error | Used with entity_registry_id | These are mutually exclusive — use one or the other |
| Pagination memory issues | Collecting all items into a list | Iterate page-by-page with for page in .list() pattern |
| OAuth token expired | Client credentials not refreshing | SDK handles refresh automatically; check client_id/secret are valid |
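For the 404 row above, a quick prefix check can catch an ID pasted into the wrong argument before any request is sent. This is a sketch only: `has_prefix` is a hypothetical helper, and the prefixes shown (`seq`, `fld`) are the ones used in this document's examples; the prefixes in a real tenant should be confirmed against actual IDs.

```python
import re

# Assumed ID shape: lowercase prefix, underscore, then a non-empty suffix
ID_PATTERN = re.compile(r"^(?P<prefix>[a-z]+)_(?P<suffix>[A-Za-z0-9_-]+)$")


def has_prefix(entity_id: str, expected_prefix: str) -> bool:
    """Return True if `entity_id` looks like '<expected_prefix>_<suffix>'."""
    match = ID_PATTERN.match(entity_id.strip())
    return bool(match) and match.group("prefix") == expected_prefix


# Example: a folder ID passed where a sequence ID is expected
print(has_prefix("fld_abc123", "seq"))
```

A check like this only validates the shape of the string; it cannot tell whether the ID exists in the tenant.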