cy-wali/single-cell-rna-qc

Name: single-cell-rna-qc
Author: cy-wali

bio-research/skills/single-cell-rna-qc/SKILL.md

npx skillsauth add cy-wali/knowledge single-cell-rna-qc

Single-Cell RNA-seq Quality Control

Automated QC workflow for single-cell RNA-seq data following scverse best practices.

When to Use This Skill

Use when users:

Request quality control or QC on single-cell RNA-seq data
Want to filter low-quality cells or assess data quality
Need QC visualizations or metrics
Ask to follow scverse/scanpy best practices
Request MAD-based filtering or outlier detection

Supported input formats:

.h5ad files (AnnData format from scanpy/Python workflows)
.h5 files (10X Genomics Cell Ranger output)

Default recommendation: Use Approach 1 (complete pipeline) unless the user has specific custom requirements or explicitly requests non-standard filtering logic.

Approach 1: Complete QC Pipeline (Recommended for Standard Workflows)

For standard QC following scverse best practices, use the convenience script scripts/qc_analysis.py:

python3 scripts/qc_analysis.py input.h5ad
# or for 10X Genomics .h5 files:
python3 scripts/qc_analysis.py raw_feature_bc_matrix.h5

The script automatically detects the file format and loads it appropriately.
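How the script tells the two formats apart is not documented here. One plausible approach, shown as a minimal sketch, is to dispatch on the file extension. The choose_loader helper and the LOADERS table are hypothetical; scanpy.read_h5ad and scanpy.read_10x_h5 are the scanpy readers for these two formats.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the name of a scanpy reader.
LOADERS = {
    ".h5ad": "read_h5ad",    # AnnData files
    ".h5": "read_10x_h5",    # 10X Genomics Cell Ranger output
}


def choose_loader(path: str) -> str:
    """Return the name of the scanpy reader suited to `path`, based on its suffix."""
    suffix = Path(path).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"Unsupported input format: {suffix or path!r}")
    return LOADERS[suffix]


print(choose_loader("input.h5ad"))
print(choose_loader("raw_feature_bc_matrix.h5"))
```

With scanpy installed, getattr(scanpy, choose_loader(path))(path) would then load the file.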

When to use this approach:

Standard QC workflow with adjustable thresholds (all cells filtered the same way)
Batch processing multiple datasets
Quick exploratory analysis
User wants the "just works" solution

Requirements: anndata, scanpy, scipy, matplotlib, seaborn, numpy

Parameters:

Customize filtering thresholds and gene patterns using command-line parameters:

--output-dir - Output directory
--mad-counts, --mad-genes, --mad-mt - MAD thresholds for counts/genes/MT%
--mt-threshold - Hard mitochondrial % cutoff
--min-cells - Gene filtering threshold
--mt-pattern, --ribo-pattern, --hb-pattern - Gene name patterns for different species

Use --help to see current default values.
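The flags above could be declared with argparse roughly as follows. This is a sketch, not the script's actual parser: the argument types are assumptions, and defaults are left unset because the real values are only documented by --help.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags; types are assumed, defaults are left as None."""
    parser = argparse.ArgumentParser(description="Single-cell RNA-seq QC (sketch)")
    parser.add_argument("input", help="Path to a .h5ad or 10X .h5 file")
    parser.add_argument("--output-dir", help="Output directory")
    for flag, what in [
        ("--mad-counts", "total counts"),
        ("--mad-genes", "detected genes"),
        ("--mad-mt", "mitochondrial percentage"),
    ]:
        parser.add_argument(flag, type=float, help=f"MAD threshold for {what}")
    parser.add_argument("--mt-threshold", type=float,
                        help="Hard mitochondrial percentage cutoff")
    parser.add_argument("--min-cells", type=int, help="Gene filtering threshold")
    for flag in ("--mt-pattern", "--ribo-pattern", "--hb-pattern"):
        parser.add_argument(flag, help="Gene name pattern for the species in use")
    return parser


# Illustrative values only; they are not the script's defaults.
args = build_parser().parse_args(["input.h5ad", "--mad-mt", "3", "--mt-pattern", "mt-"])
print(args.mad_mt, args.mt_pattern, args.output_dir)
```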

Outputs:

All files are saved to the <input_basename>_qc_results/ directory by default (or to the directory specified by --output-dir):

qc_metrics_before_filtering.png - Pre-filtering visualizations
qc_filtering_thresholds.png - MAD-based threshold overlays
qc_metrics_after_filtering.png - Post-filtering quality metrics
<input_basename>_filtered.h5ad - Clean, filtered dataset ready for downstream analysis
<input_basename>_with_qc.h5ad - Original data with QC annotations preserved

If copying outputs for user access, copy individual files (not the entire directory) so users can preview them directly.
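To copy files individually, it helps to know the expected paths up front. The sketch below derives them from the naming scheme listed above. The expected_outputs helper is hypothetical, and it assumes the default results directory is relative to the current working directory, which the text does not state.

```python
from pathlib import Path
from typing import Optional


def expected_outputs(input_path: str, output_dir: Optional[str] = None) -> list[Path]:
    """List the files the pipeline is documented to write for `input_path`."""
    base = Path(input_path).stem  # "input.h5ad" -> "input"
    # Assumption: the default directory is created relative to the working directory.
    out = Path(output_dir) if output_dir else Path(f"{base}_qc_results")
    names = [
        "qc_metrics_before_filtering.png",
        "qc_filtering_thresholds.png",
        "qc_metrics_after_filtering.png",
        f"{base}_filtered.h5ad",
        f"{base}_with_qc.h5ad",
    ]
    return [out / name for name in names]


for path in expected_outputs("input.h5ad"):
    print(path)
```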

Workflow Steps

The script performs the following steps:

Calculate QC metrics - Count depth, gene detection, mitochondrial/ribosomal/hemoglobin content
Apply MAD-based filtering - Permissive outlier detection using MAD thresholds for counts/genes/MT%
Filter genes - Remove genes detected in fewer cells than the --min-cells threshold
Generate visualizations - Comprehensive before/after plots with threshold overlays
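The MAD-based step can be illustrated with a few lines of NumPy. This is a generic sketch of median absolute deviation outlier detection, not the code in qc_core.py: whether the script log-transforms metrics or scales the MAD first is not stated here, and the n_mads value and the counts are made up for the example.

```python
import numpy as np


def mad_outliers(values, n_mads: float) -> np.ndarray:
    """Flag values more than `n_mads` median absolute deviations from the median."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    return np.abs(values - median) > n_mads * mad


# Illustrative total counts per cell; the last cell is an obvious outlier.
counts = np.array([980, 1010, 1000, 1025, 990, 5000])

# Only the last cell is flagged: median is 1005, MAD is 17.5, cutoff is 87.5.
print(mad_outliers(counts, n_mads=5))
```

Because the cutoff is derived from the data's own median and spread, the same n_mads setting adapts to datasets with very different sequencing depths.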

Approach 2: Modular Building Blocks (For Custom Workflows)

For custom analysis workflows or non-standard requirements, use the modular utility functions from scripts/qc_core.py and scripts/qc_plotting.py:

# Run from scripts/ directory, or add scripts/ to sys.path if needed
import anndata as ad
from qc_core import calculate_qc_metrics, detect_outliers_mad, filter_cells
from qc_plotting import plot_qc_distributions  # Only if visualization needed

adata = ad.read_h5ad('input.h5ad')
calculate_qc_metrics(adata, inplace=True)
# ... custom analysis logic here
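If the code is not run from the scripts/ directory, the imports above only resolve once that directory is on sys.path. A minimal sketch, assuming a hypothetical relative location for scripts/:

```python
import sys
from pathlib import Path

# Hypothetical location of the skill's scripts/ directory; adjust to your checkout.
SCRIPTS_DIR = Path("scripts").resolve()

# Put it at the front of the import path, once.
if str(SCRIPTS_DIR) not in sys.path:
    sys.path.insert(0, str(SCRIPTS_DIR))

# After this, `from qc_core import calculate_qc_metrics` is looked up in SCRIPTS_DIR.
```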

When to use this approach:

Different workflow needed (skip steps, change order, apply different thresholds to subsets)
Conditional logic (e.g., filter neurons differently than other cells)
Partial execution (only metrics/visualization, no filtering)
Integration with other analysis steps in a larger pipeline
Custom filtering criteria beyond what command-line params support

Available utility functions:

From qc_core.py (core QC operations):

calculate_qc_metrics(adata, mt_pattern, ribo_pattern, hb_pattern, inplace=True) - Calculate QC metrics and annotate adata
detect_outliers_mad(adata, metric, n_mads, verbose=True) - MAD-based outlier detection, returns boolean mask
apply_hard_threshold(adata, metric, threshold, operator='>', verbose=True) - Apply hard cutoffs, returns boolean mask
filter_cells(adata, mask, inplace=False) - Apply boolean mask to filter cells
filter_genes(adata, min_cells=20, min_counts=None, inplace=True) - Filter genes by detection
print_qc_summary(adata, label='') - Print summary statistics

From qc_plotting.py (visualization):

plot_qc_distributions(adata, output_path, title) - Generate comprehensive QC plots
plot_filtering_thresholds(adata, outlier_masks, thresholds, output_path) - Visualize filtering thresholds
plot_qc_after_filtering(adata, output_path) - Generate post-filtering plots

Example custom workflows:

Example 1: Only calculate metrics and visualize, don't filter yet

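import anndata as ad
from qc_core import calculate_qc_metrics, print_qc_summary
from qc_plotting import plot_qc_distributions

adata = ad.read_h5ad('input.h5ad')
calculate_qc_metrics(adata, inplace=True)
plot_qc_distributions(adata, 'qc_before.png', title='Initial QC')
print_qc_summary(adata, label='Before filtering')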

Example 2: Apply only MT% filtering, keep other metrics permissive

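import anndata as ad
from qc_core import calculate_qc_metrics, apply_hard_threshold, filter_cells

adata = ad.read_h5ad('input.h5ad')
calculate_qc_metrics(adata, inplace=True)

# Only filter high MT% cells
high_mt = apply_hard_threshold(adata, 'pct_counts_mt', 10, operator='>')
# Keep the cells that are NOT above the cutoff
adata_filtered = filter_cells(adata, ~high_mt)
adata_filtered.write('filtered.h5ad')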

Example 3: Different thresholds for different subsets

import anndata as ad
from qc_core import calculate_qc_metrics, apply_hard_threshold

adata = ad.read_h5ad('input.h5ad')
calculate_qc_metrics(adata, inplace=True)

# Apply type-specific QC (assumes adata.obs has a 'cell_type' column)
neurons = adata.obs['cell_type'] == 'neuron'
other_cells = ~neurons

# Neurons tolerate higher MT%, other cells use a stricter threshold.
# Each mask covers only its own subset, not the full dataset.
neuron_qc = apply_hard_threshold(adata[neurons], 'pct_counts_mt', 15, operator='>')
other_qc = apply_hard_threshold(adata[other_cells], 'pct_counts_mt', 8, operator='>')
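Example 3 stops once the two subset masks are computed. One way to arrive at a single mask aligned to every cell is to give each cell its own cutoff and compare once. The sketch below uses plain NumPy arrays standing in for adata.obs columns; the data is illustrative and no qc_core function is involved.

```python
import numpy as np

# Illustrative per-cell values standing in for adata.obs columns.
cell_type = np.array(["neuron", "neuron", "astrocyte", "astrocyte"])
pct_counts_mt = np.array([12.0, 18.0, 6.0, 9.0])

# Per-cell cutoff: 15% for neurons, 8% for everything else (as in Example 3).
cutoff = np.where(cell_type == "neuron", 15.0, 8.0)

high_mt = pct_counts_mt > cutoff   # True where a cell exceeds its own cutoff
keep = ~high_mt                    # one entry per cell in the full dataset
print(keep)
```

A mask like keep could then be passed to filter_cells in the same way as ~high_mt in Example 2, assuming filter_cells accepts a NumPy boolean array.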

Best Practices

Be permissive with filtering - Default thresholds intentionally retain most cells to avoid losing rare populations
Inspect visualizations - Always review before/after plots to ensure filtering makes biological sense
Consider dataset-specific factors - Some tissues naturally have higher mitochondrial content (e.g., neurons, cardiomyocytes)
Check gene annotations - Mitochondrial gene prefixes vary by species (mt- for mouse, MT- for human)
Iterate if needed - QC parameters may need adjustment based on the specific experiment or tissue type
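The point about species-specific prefixes is easy to check on a handful of gene names. The sketch below treats the pattern as a case-sensitive prefix, which is an assumption: the text does not say whether --mt-pattern is matched as a prefix or as a regular expression. The flag_genes helper and the gene lists are illustrative.

```python
def flag_genes(gene_names, prefix):
    """Return one boolean per gene: True if the name starts with `prefix`."""
    return [name.startswith(prefix) for name in gene_names]


human_genes = ["MT-CO1", "MT-ND1", "ACTB", "MTOR"]
mouse_genes = ["mt-Co1", "mt-Nd1", "Actb", "Mtor"]

print(flag_genes(human_genes, "MT-"))  # [True, True, False, False]
print(flag_genes(mouse_genes, "mt-"))  # [True, True, False, False]
print(flag_genes(mouse_genes, "MT-"))  # [False, False, False, False]
```

The last line shows the failure mode: a human prefix applied to mouse data flags nothing, so every cell would appear to have 0% mitochondrial content. Including the hyphen in the prefix also keeps unrelated genes such as MTOR from being counted.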

Reference Materials

For detailed QC methodology, parameter rationale, and troubleshooting guidance, see references/scverse_qc_guidelines.md. This reference provides:

Detailed explanations of each QC metric and why it matters
Rationale for MAD-based thresholds and why they're better than fixed cutoffs
Guidelines for interpreting QC visualizations (histograms, violin plots, scatter plots)
Species-specific considerations for gene annotations
When and how to adjust filtering parameters
Advanced QC considerations (ambient RNA correction, doublet detection)

Load this reference when users need deeper understanding of the methodology or when troubleshooting QC issues.

Next Steps After QC

Typical downstream analysis steps:

Ambient RNA correction (SoupX, CellBender)
Doublet detection (scDblFinder)
Normalization (log-normalize, scran)
Feature selection and dimensionality reduction
Clustering and cell type annotation

Performs quality control on single-cell RNA-seq data (.h5ad or .h5 files) using scverse best practices with MAD-based filtering and comprehensive visualizations. Use when users request QC analysis, filtering low-quality cells, assessing data quality, or following scverse/scanpy best practices for single-cell analysis.

testing

Updated Apr 4, 2026

Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add cy-wali/knowledge single-cell-rna-qc

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy - Container and dependency vulnerability scanner - Clean (95%)
Semgrep - Static code analysis for vulnerabilities - Clean (95%)
mcp-scan (Snyk) - Model Context Protocol security validation - Clean (95%)
VirusTotal - Multi-engine malware detection - Error (70%)
Snyk (dep) - Open source security scanning - Skipped (50%)
Socket.dev - Supply chain security analysis - Skipped (50%)
CrowdStrike - Advanced threat intelligence - Skipped (50%)
OSV-Scanner - Open Source Vulnerability database check - Skipped (50%)
OWASP Dep-Check - Skipped (50%)

Last scanned: Mar 25, 2026, 8:56 PM (43.8s, 6 files scanned)

Install manually

# Clone the repo
git clone https://github.com/cy-wali/knowledge.git

# Copy into Claude Code skills folder (global)
cp -r knowledge/bio-research/skills/single-cell-rna-qc ~/.claude/skills/

See the official Claude Code Skills documentation for the skills path.

Repository: cy-wali/knowledge

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT