plugin/skills/tooluniverse-chemical-sourcing/SKILL.md
Find commercial sources for chemical compounds — PubChem/ChEMBL identity resolution then vendor catalog search across ZINC, Enamine, eMolecules, Mcule. Compares pricing, availability, and identifies purchasable analogs when an exact compound is not in stock. Use for chemical procurement, virtual library curation, and 'where can I buy X' questions for synthesis planning.
Install this skill globally with one command: `npx skillsauth add mims-harvard/tooluniverse tooluniverse-chemical-sourcing`. Works with Claude Code, Cursor, and Windsurf.
Pipeline for identifying, sourcing, and purchasing chemical compounds from commercial vendors. Resolves compound identity through PubChem/ChEMBL, searches multiple vendor databases (ZINC, Enamine, eMolecules, Mcule), compares pricing and availability, and identifies purchasable analogs when exact compounds are unavailable.
Guiding principles:
When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory. A database-verified answer is always more reliable than a guess.
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
Typical triggers:
Not this skill: For ADMET/toxicity assessment, use tooluniverse-admet-prediction. For drug-target interaction analysis, use tooluniverse-drug-target-validation.
| Database | Scope | Best For |
|----------|-------|----------|
| ZINC | 230M+ purchasable compounds; aggregates vendors | Broadest coverage; substructure/similarity search; free |
| Enamine | ~4M in-stock, 30B+ REAL (make-on-demand) | Large in-stock library; fast delivery; building blocks |
| eMolecules | Multi-vendor aggregator; 8M+ compounds | Cross-vendor comparison; pricing transparency |
| Mcule | 40M+ compounds; one-stop purchasing | Integrated ordering; quote generation |
| PubChem | 110M+ compounds; identity resolution | Authoritative compound identification; CID lookup |
| ChEMBL | 2.4M+ bioactive molecules | Bioactivity context for sourced compounds |
Phase 0: Compound Identity Resolution
Name/SMILES/CAS -> PubChem CID -> canonical SMILES
|
Phase 1: Vendor Search
Query ZINC, Enamine, eMolecules, Mcule
|
Phase 2: Price & Availability Comparison
Catalog numbers, pricing, stock status, purity
|
Phase 3: Analog Search (if needed)
Similarity search for purchasable alternatives
|
Phase 4: Bioactivity Context (optional)
ChEMBL activity data for sourced compounds
|
Phase 5: Order Summary
Consolidated vendor comparison table
Objective: Establish unambiguous compound identity before vendor searches.
Tools:
- PubChem_get_CID_by_compound_name -- resolve name to CID
  - Parameters: name (compound name)
  - Returns: {IdentifierList: {CID: [...]}}
- PubChem_get_compound_properties_by_CID -- get SMILES, MW, formula
  - Parameters: cid (PubChem CID), properties (comma-separated list)
  - Returns: {CID, MolecularWeight, ConnectivitySMILES, IUPACName}
- ChEMBL_get_molecule -- get ChEMBL compound details
  - Parameters: molecule_chembl_id (ChEMBL ID) or search by name

Workflow:
Important: PubChem ConnectivitySMILES (not CanonicalSMILES) is the correct property name. Always confirm the SMILES matches the intended compound before proceeding.
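The identity check can be sketched as below, assuming the tool responses arrive as plain Python dicts in the shapes listed under Tools. The helper names `first_cid` and `smiles_from_properties` are hypothetical and not part of ToolUniverse, and the sample responses are hand-written for illustration rather than live output.

```python
from typing import Any


def first_cid(response: dict[str, Any]) -> tuple[int, bool]:
    """Return (cid, ambiguous) from a {IdentifierList: {CID: [...]}} response."""
    cids = response.get("IdentifierList", {}).get("CID", [])
    if not cids:
        raise ValueError("no CID returned; try a synonym, CAS number, or SMILES")
    # More than one CID means the name is ambiguous and needs manual review.
    return cids[0], len(cids) > 1


def smiles_from_properties(props: dict[str, Any]) -> str:
    """Read the SMILES from a properties record via ConnectivitySMILES."""
    smiles = props.get("ConnectivitySMILES")
    if not smiles:
        raise KeyError("ConnectivitySMILES missing; was it requested in `properties`?")
    return smiles


# Hand-written sample responses for aspirin (CID 2244); not live API output.
cid, ambiguous = first_cid({"IdentifierList": {"CID": [2244]}})
smiles = smiles_from_properties(
    {
        "CID": 2244,
        "MolecularWeight": "180.16",
        "ConnectivitySMILES": "CC(=O)OC1=CC=CC=C1C(=O)O",
    }
)
print(cid, ambiguous, smiles)
```

Returning an `ambiguous` flag instead of silently taking the first CID keeps the "confirm the SMILES matches the intended compound" step explicit.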
Objective: Search all available vendor databases for the target compound.
Tools:
- ZINC_search_compounds -- search ZINC by name or SMILES
  - Parameters: query (name or SMILES), optional catalog, limit
- ZINC_get_compound -- get detailed compound info from ZINC
  - Parameters: zinc_id (ZINC identifier)
- Enamine_search_catalog -- search Enamine catalog
  - Parameters: query (name or SMILES), optional catalog_type, limit
- Enamine_get_compound -- get Enamine compound details
  - Parameters: compound_id (Enamine catalog number)
- eMolecules_search -- search across multiple vendors
  - Parameters: query (name or SMILES), optional limit
- eMolecules_get_compound -- get eMolecules compound details
  - Parameters: compound_id (eMolecules ID)
- Mcule_get_compound -- search Mcule database
  - Parameters: query (name or SMILES), optional limit
- Mcule_get_compound -- get Mcule compound details
  - Parameters: compound_id (Mcule ID)

Workflow:
Tip: SMILES-based searches are more precise than name searches. If name search returns too many results, switch to SMILES.
Objective: Create a comparison table across vendors.
Compile from Phase 1 results:
| Field | Description |
|-------|-------------|
| Vendor | Company name |
| Catalog # | Vendor-specific identifier |
| Quantity | Available pack sizes |
| Price | Per unit or per mg |
| Purity | Stated purity grade (>95%, >98%, etc.) |
| Stock | In-stock vs make-on-demand |
| Delivery | Estimated delivery time |
Rank vendors by: (1) in-stock availability, (2) price per mg, (3) purity grade, (4) delivery time.
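One way to apply this ranking is a lexicographic sort over the four criteria. The sketch below assumes each vendor offer has already been normalized into a small record; the `Offer` class, its field names, and the sample offers are all hypothetical and made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    vendor: str
    in_stock: bool
    price_usd: float
    quantity_mg: float
    purity_pct: float
    delivery_days: int

    @property
    def price_per_mg(self) -> float:
        return self.price_usd / self.quantity_mg


def rank_offers(offers: list[Offer]) -> list[Offer]:
    """Sort: in-stock first, then price/mg ascending, purity descending, delivery ascending."""
    return sorted(
        offers,
        key=lambda o: (not o.in_stock, o.price_per_mg, -o.purity_pct, o.delivery_days),
    )


# Made-up offers for illustration only.
offers = [
    Offer("Vendor A", in_stock=False, price_usd=40.0, quantity_mg=100, purity_pct=98, delivery_days=21),
    Offer("Vendor B", in_stock=True, price_usd=90.0, quantity_mg=100, purity_pct=95, delivery_days=3),
    Offer("Vendor C", in_stock=True, price_usd=60.0, quantity_mg=100, purity_pct=98, delivery_days=5),
]
ranked = rank_offers(offers)
print([o.vendor for o in ranked])
```

Because the sort is strictly lexicographic, purity and delivery time only break ties among offers with identical stock status and price per mg. A weighted score would be an alternative if the criteria should trade off against each other.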
Objective: When the exact compound is unavailable, find purchasable structural analogs.
Triggered when:
Approach:
Objective: Provide biological activity data for context when sourcing compounds for research.
Tools:
ChEMBL_get_molecule -- get bioactivity summary
Useful when:
Vendor selection decision matrix — don't just list vendors, recommend one:
| Scenario | Best Vendor Strategy | Why |
|----------|---------------------|-----|
| Need it this week | In-stock vendor with fastest shipping | Make-on-demand takes 2-4 weeks minimum |
| Budget-constrained | Cheapest per mg, accept lower purity (>95%) | Academic budgets are tight; >95% is fine for screening |
| High-throughput screen | ZINC/Enamine for large libraries; mg quantities | Price per compound matters more than purity |
| Assay validation | Highest purity (>98%) from reputable vendor | False positives from impurities waste months |
| Building blocks for synthesis | Enamine (largest building block catalog) | Purpose-built for medicinal chemistry |
| Exact compound unavailable | Analog search → check bioactivity (ChEMBL) → source best analog | Tanimoto > 0.85 likely retains activity; 0.7-0.85 may have different SAR |
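The Tanimoto bands in the last row can be made concrete with a small sketch. It assumes fingerprints are available as sets of on-bit indices from whichever cheminformatics toolkit is in use; the function names are hypothetical, and the bit sets below are toy values rather than real fingerprints. The matrix gives no guidance below 0.7, so the sketch reports that case separately instead of guessing.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def analog_tier(similarity: float) -> str:
    """Map a similarity score to the bands in the decision matrix."""
    if similarity > 0.85:
        return "likely retains activity"
    if similarity >= 0.7:
        return "may have different SAR"
    return "outside the bands in the decision matrix"


# Toy bit sets, not real fingerprints.
target = set(range(1, 11))             # bits 1..10
close_analog = set(range(1, 10))       # bits 1..9   -> 9 shared / 10 total
distant_analog = set(range(1, 10)) | {11}  # swaps bit 10 for 11 -> 9 shared / 11 total

for name, bits in [("close", close_analog), ("distant", distant_analog)]:
    score = tanimoto(target, bits)
    print(name, round(score, 3), analog_tier(score))
```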
Red flags when sourcing:
Generate a final sourcing report:
| Pattern | Description | Key Phases |
|---------|-------------|------------|
| Quick Availability Check | Is this compound purchasable? | 0, 1 |
| Full Vendor Comparison | Compare all sources with pricing | 0, 1, 2, 5 |
| Analog Discovery | Compound unavailable; find alternatives | 0, 1, 3, 5 |
| Building Block Sourcing | Find reagents for synthesis | 0, 1, 2 |
| Hit-to-Lead Sourcing | Source screening hits with bioactivity context | 0, 1, 2, 4, 5 |
| Evidence Grade | Criteria | Action |
|----------------|----------|--------|
| A -- High confidence | In-stock at 2+ vendors, purity >=98%, CoA available | Order directly |
| B -- Moderate confidence | Single vendor or make-on-demand, purity >=95% | Request CoA, verify structure |
| C -- Low confidence | No stock, purity unstated, or price outlier (>5x median) | Custom synthesis or analog search |
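The grading criteria can be expressed as a short function. This is one possible reading of the table, not a definitive rule set: the function name and parameters are hypothetical, the median is taken over whatever comparable prices were collected, and treating a stated purity below 95% as grade C is an assumption the table does not spell out.

```python
from statistics import median
from typing import Optional


def evidence_grade(
    in_stock_vendors: int,
    make_on_demand: bool,
    purity_pct: Optional[float],
    coa_available: bool,
    price: float,
    all_prices: list[float],
) -> str:
    """Assign evidence grade A, B, or C following the table's criteria."""
    no_source = in_stock_vendors == 0 and not make_on_demand
    price_outlier = bool(all_prices) and price > 5 * median(all_prices)
    # Any grade-C condition overrides the others.
    if no_source or purity_pct is None or price_outlier:
        return "C"
    if in_stock_vendors >= 2 and purity_pct >= 98 and coa_available:
        return "A"
    if purity_pct >= 95:
        return "B"
    # Assumption: stated purity below 95% is treated as low confidence.
    return "C"


# Made-up inputs for illustration only.
print(evidence_grade(2, False, 98.0, True, 1.0, [1.0, 1.2]))   # two in-stock vendors, CoA
print(evidence_grade(1, False, 95.0, False, 1.0, [1.0]))        # single vendor
print(evidence_grade(0, False, None, False, 1.0, [1.0]))        # no stock, purity unstated
```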
Interpreting vendor results:
Synthesis questions to address in the final report: