bio-research/skills/scvi-tools/SKILL.md
Deep learning for single-cell analysis using scvi-tools. This skill should be used when users need (1) data integration and batch correction with scVI/scANVI, (2) ATAC-seq analysis with PeakVI, (3) CITE-seq multi-modal analysis with totalVI, (4) multiome RNA+ATAC analysis with MultiVI, (5) spatial transcriptomics deconvolution with DestVI, (6) label transfer and reference mapping with scANVI/scArches, (7) RNA velocity with veloVI, or (8) any deep learning-based single-cell method. Triggers include mentions of scVI, scANVI, totalVI, PeakVI, MultiVI, DestVI, veloVI, sysVI, scArches, variational autoencoder, VAE, batch correction, data integration, multi-modal, CITE-seq, multiome, reference mapping, latent space.
This skill provides guidance for deep learning-based single-cell analysis using scvi-tools, the leading framework for probabilistic models in single-cell genomics.
- scripts/ to avoid rewriting common code
- references/environment_setup.md
- references/troubleshooting.md

| Data Type | Model | Primary Use Case |
|-----------|-------|------------------|
| scRNA-seq | scVI | Unsupervised integration, DE, imputation |
| scRNA-seq + labels | scANVI | Label transfer, semi-supervised integration |
| CITE-seq (RNA+protein) | totalVI | Multi-modal integration, protein denoising |
| scATAC-seq | PeakVI | Chromatin accessibility analysis |
| Multiome (RNA+ATAC) | MultiVI | Joint modality analysis |
| Spatial + scRNA reference | DestVI | Cell type deconvolution |
| RNA velocity | veloVI | Transcriptional dynamics |
| Cross-technology | sysVI | System-level batch correction |

| Workflow | Reference File | Description |
|----------|---------------|-------------|
| Environment Setup | references/environment_setup.md | Installation, GPU, version info |
| Data Preparation | references/data_preparation.md | Formatting data for any model |
| scRNA Integration | references/scrna_integration.md | scVI/scANVI batch correction |
| ATAC-seq Analysis | references/atac_peakvi.md | PeakVI for accessibility |
| CITE-seq Analysis | references/citeseq_totalvi.md | totalVI for protein+RNA |
| Multiome Analysis | references/multiome_multivi.md | MultiVI for RNA+ATAC |
| Spatial Deconvolution | references/spatial_deconvolution.md | DestVI spatial analysis |
| Label Transfer | references/label_transfer.md | scANVI reference mapping |
| scArches Mapping | references/scarches_mapping.md | Query-to-reference mapping |
| Batch Correction | references/batch_correction_sysvi.md | Advanced batch methods |
| RNA Velocity | references/rna_velocity_velovi.md | veloVI dynamics |
| Troubleshooting | references/troubleshooting.md | Common issues and solutions |
Modular scripts for common workflows. Chain together or modify as needed.
| Script | Purpose | Usage |
|--------|---------|-------|
| prepare_data.py | QC, filter, HVG selection | python scripts/prepare_data.py raw.h5ad prepared.h5ad --batch-key batch |
| train_model.py | Train any scvi-tools model | python scripts/train_model.py prepared.h5ad results/ --model scvi |
| cluster_embed.py | Neighbors, UMAP, Leiden | python scripts/cluster_embed.py adata.h5ad results/ |
| differential_expression.py | DE analysis | python scripts/differential_expression.py model/ adata.h5ad de.csv --groupby leiden |
| transfer_labels.py | Label transfer with scANVI | python scripts/transfer_labels.py ref_model/ query.h5ad results/ |
| integrate_datasets.py | Multi-dataset integration | python scripts/integrate_datasets.py results/ data1.h5ad data2.h5ad |
| validate_adata.py | Check data compatibility | python scripts/validate_adata.py data.h5ad --batch-key batch |
```bash
# 1. Validate input data
python scripts/validate_adata.py raw.h5ad --batch-key batch --suggest

# 2. Prepare data (QC, HVG selection)
python scripts/prepare_data.py raw.h5ad prepared.h5ad --batch-key batch --n-hvgs 2000

# 3. Train model
python scripts/train_model.py prepared.h5ad results/ --model scvi --batch-key batch

# 4. Cluster and visualize
python scripts/cluster_embed.py results/adata_trained.h5ad results/ --resolution 0.8

# 5. Differential expression
python scripts/differential_expression.py results/model results/adata_clustered.h5ad results/de.csv --groupby leiden
```
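The five steps above can also be driven from Python when the paths or parameters vary between runs. The following is a minimal sketch, not part of the skill itself: `build_pipeline` and `run_pipeline` are hypothetical helper names, and the script paths, flags, and intermediate file names are copied from the commands above on the assumption that the scripts write `adata_trained.h5ad`, `model`, and `adata_clustered.h5ad` into the output directory as shown there.

```python
import shlex
import subprocess


def build_pipeline(raw, out_dir="results", prepared="prepared.h5ad",
                   batch_key="batch", n_hvgs=2000, resolution=0.8):
    """Return the five steps as argv lists; nothing is executed here."""
    out = out_dir.rstrip("/")
    return [
        # 1. Validate input data
        ["python", "scripts/validate_adata.py", raw,
         "--batch-key", batch_key, "--suggest"],
        # 2. Prepare data (QC, HVG selection)
        ["python", "scripts/prepare_data.py", raw, prepared,
         "--batch-key", batch_key, "--n-hvgs", str(n_hvgs)],
        # 3. Train model
        ["python", "scripts/train_model.py", prepared, f"{out}/",
         "--model", "scvi", "--batch-key", batch_key],
        # 4. Cluster and visualize
        ["python", "scripts/cluster_embed.py", f"{out}/adata_trained.h5ad",
         f"{out}/", "--resolution", str(resolution)],
        # 5. Differential expression
        ["python", "scripts/differential_expression.py", f"{out}/model",
         f"{out}/adata_clustered.h5ad", f"{out}/de.csv", "--groupby", "leiden"],
    ]


def run_pipeline(commands):
    """Run each step in order and stop at the first failure."""
    for cmd in commands:
        print("$", shlex.join(cmd))
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in build_pipeline("raw.h5ad"):
        print(shlex.join(cmd))
```

Keeping command construction separate from execution makes it possible to inspect or log the exact commands before any training starts.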
scripts/model_utils.py provides importable functions for custom workflows:
| Function | Purpose |
|----------|---------|
| prepare_adata() | Data preparation (QC, HVG, layer setup) |
| train_scvi() | Train scVI or scANVI |
| evaluate_integration() | Compute integration metrics |
| get_marker_genes() | Extract DE markers |
| save_results() | Save model, data, plots |
| auto_select_model() | Suggest best model |
| quick_clustering() | Neighbors + UMAP + Leiden |
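The signatures of the functions in the table are not documented here, so the sketch below does not call them. It shows instead what training followed by neighbors, UMAP, and Leiden might look like when written directly against the public scvi-tools and scanpy APIs. The function name `train_and_cluster`, the `X_scVI` key, and the default arguments are illustrative assumptions, and the code assumes `adata.layers["counts"]` already holds raw counts.

```python
def train_and_cluster(adata, batch_key="batch", resolution=0.8):
    """Train scVI on adata and cluster on its latent space (illustrative sketch)."""
    # Imported lazily so the function can be defined without the heavy dependencies.
    import scanpy as sc
    import scvi

    # Register the raw-count layer and the batch column with scvi-tools.
    scvi.model.SCVI.setup_anndata(adata, layer="counts", batch_key=batch_key)

    # Train with library defaults; tune epochs and architecture as needed.
    model = scvi.model.SCVI(adata)
    model.train()

    # Store the batch-corrected latent representation for downstream steps.
    adata.obsm["X_scVI"] = model.get_latent_representation()

    # Build the neighbor graph on the latent space, then embed and cluster.
    sc.pp.neighbors(adata, use_rep="X_scVI")
    sc.tl.umap(adata)
    sc.tl.leiden(adata, resolution=resolution, key_added="leiden")
    return model
```

A call such as `model = train_and_cluster(adata)` would leave the embedding in `adata.obsm["X_umap"]` and the cluster labels in `adata.obs["leiden"]`.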
Raw counts required: scvi-tools models require integer count data

```python
adata.layers["counts"] = adata.X.copy()  # Before normalization
scvi.model.SCVI.setup_anndata(adata, layer="counts")
```

HVG selection: Use 2000-4000 highly variable genes

```python
sc.pp.highly_variable_genes(adata, n_top_genes=2000, batch_key="batch", layer="counts", flavor="seurat_v3")
adata = adata[:, adata.var['highly_variable']].copy()
```

Batch information: Specify batch_key for integration

```python
scvi.model.SCVI.setup_anndata(adata, layer="counts", batch_key="batch")
```
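Because the models expect integer counts, it can help to confirm that a matrix has not already been normalized or log-transformed before copying it into the counts layer. The helper below is a hypothetical, dependency-free sketch (the name `looks_like_raw_counts` and the sampling cap are assumptions, not part of scvi-tools): it inspects a flat sequence of values and reports whether they are all non-negative whole numbers.

```python
from itertools import islice


def looks_like_raw_counts(values, max_checked=100_000):
    """Return True if the inspected values are all non-negative whole numbers.

    Only the first `max_checked` values are examined, so a True result is a
    heuristic on large matrices, not a guarantee.
    """
    checked = islice(values, max_checked)
    return all(v >= 0 and float(v).is_integer() for v in checked)


# Possible usage (assumes adata is an AnnData object):
#   sparse matrix: looks_like_raw_counts(adata.X.data)
#   dense matrix:  looks_like_raw_counts(adata.X.ravel())
```

If the check fails, the matrix has probably been normalized already, and the raw counts need to be recovered from an earlier copy of the data.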
```text
Need to integrate scRNA-seq data?
├── Have cell type labels? → scANVI (references/label_transfer.md)
└── No labels? → scVI (references/scrna_integration.md)

Have multi-modal data?
├── CITE-seq (RNA + protein)? → totalVI (references/citeseq_totalvi.md)
├── Multiome (RNA + ATAC)? → MultiVI (references/multiome_multivi.md)
└── scATAC-seq only? → PeakVI (references/atac_peakvi.md)

Have spatial data?
└── Need cell type deconvolution? → DestVI (references/spatial_deconvolution.md)

Have pre-trained reference model?
└── Map query to reference? → scArches (references/scarches_mapping.md)

Need RNA velocity?
└── veloVI (references/rna_velocity_velovi.md)

Strong cross-technology batch effects?
└── sysVI (references/batch_correction_sysvi.md)
```
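The decision tree can be expressed as a small lookup function. This is a sketch under stated assumptions: `suggest_model` is a hypothetical name (it is not the `auto_select_model()` helper from scripts/model_utils.py), the modality strings are invented for illustration, and the order in which the questions are checked is a choice made here, since the tree lists them independently.

```python
# Modalities that map to exactly one model, with the matching reference file.
SINGLE_MODEL = {
    "citeseq": ("totalVI", "references/citeseq_totalvi.md"),
    "multiome": ("MultiVI", "references/multiome_multivi.md"),
    "atac": ("PeakVI", "references/atac_peakvi.md"),
    "spatial": ("DestVI", "references/spatial_deconvolution.md"),
    "velocity": ("veloVI", "references/rna_velocity_velovi.md"),
}


def suggest_model(modality, has_labels=False, has_reference_model=False,
                  cross_technology=False):
    """Return (model name, reference file) for the described situation."""
    # Assumed priority: mapping onto an existing reference model comes first.
    if has_reference_model:
        return ("scArches", "references/scarches_mapping.md")
    if modality in SINGLE_MODEL:
        return SINGLE_MODEL[modality]
    if modality == "scrna":
        # Assumed priority: strong cross-technology effects outrank labels.
        if cross_technology:
            return ("sysVI", "references/batch_correction_sysvi.md")
        if has_labels:
            return ("scANVI", "references/label_transfer.md")
        return ("scVI", "references/scrna_integration.md")
    raise ValueError(f"Unknown modality: {modality!r}")
```

For example, `suggest_model("scrna", has_labels=True)` returns the scANVI entry, while an unrecognized modality raises `ValueError` instead of silently falling back to a default.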