8.differential-region-analysis/SKILL.md
The differential-region-analysis pipeline identifies genomic regions exhibiting significant differences in signal intensity between experimental conditions using a count-based framework and DESeq2. It supports detection of both differentially accessible regions (DARs) from open-chromatin assays (e.g., ATAC-seq, DNase-seq) and differential transcription factor (TF) binding regions from TF-centric assays (e.g., ChIP-seq, CUT&RUN, CUT&Tag). The pipeline can start from aligned BAM files or a precomputed count matrix and is suitable whenever genomic signal can be summarized as read counts per region.
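This skill performs differential region analysis between experimental conditions using DESeq2 in a count-based framework. Main steps include:
- Initialize the project directory, referred to as ${proj_dir}, in Step 0.
- Merge replicate peaks into a consensus peak set.
- Count reads per consensus peak across all BAM files.
- Prepare sample metadata and run DESeq2.
- Visualize the results (PCA and volcano plots).
- Filter the results by qvalues and log2foldchange to define significant regions.

Use the differential-region-analysis pipeline when your goal is to identify genomic regions with condition-dependent changes in signal intensity, provided the signal can be represented as raw read counts per region.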
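Recommended scenarios include differential accessibility analysis of open-chromatin data (e.g., ATAC-seq, DNase-seq) and differential TF binding analysis of TF-centric data (e.g., ChIP-seq, CUT&RUN, CUT&Tag).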
The pipeline performs best with datasets containing biological replicates (≥2 per condition) and moderate to high sequencing depth (~20–50 million reads per sample).
${sample}_DAR_analysis/          # or ${tf}_${sample}_DB_analysis for differential TF binding detection
    tables/
        all_peaks.bed
        consensus_peaks.bed      # Unified peak set
        atac_counts.txt          # Count matrix of reads per peak
        samples.csv              # Sample metadata
    DARs/
        DAR_results.csv          # DESeq2 results (log2FC, p-values)
        DAR_sig.bed              # Significantly differentially accessible regions
        DAR_up.bed
        DAR_down.bed
    plots/                       # Visualization outputs
        PCA.pdf
        Volcano.pdf
    logs/                        # Analysis logs
    temp/                        # Other temporary files
Call:
mcp__project-init-tools__project_init
with:
- sample: sample name (e.g. c1_vs_c2)
- task: DAR_analysis

The tool will:
- Create the ${sample}_DAR_analysis (or ${tf}_${sample}_DB_analysis) directory.
- Return the path of the ${sample}_DAR_analysis (or ${tf}_${sample}_DB_analysis) directory, which will be used as ${proj_dir}.

Combine peaks from replicates to define a shared feature space. Call:
with:
- bed_files: List of paths to peak BED files from replicates.
- output_bed: Output path for the merged consensus BED file.
- output_saf: Output path for the SAF file (needed for featureCounts).

Output: consensus_peaks.bed, consensus_peaks.saf
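To make the consensus step more concrete, the following is a minimal Python sketch of one common way to build a shared feature space: take the union of replicate peaks, merge overlapping intervals, and emit SAF lines. It is not the tool's actual implementation, which may apply additional rules (for example, requiring support from several replicates). The function names `merge_peaks` and `to_saf_lines` and the `peak_N` identifiers are hypothetical. The SAF layout (GeneID, Chr, Start, End, Strand with 1-based starts) follows the featureCounts convention.

```python
from collections import defaultdict


def merge_peaks(peak_sets):
    """Union peaks from several replicates and merge overlapping intervals.

    peak_sets: iterable of lists of (chrom, start, end) tuples in BED
    coordinates (0-based start, end exclusive).
    """
    by_chrom = defaultdict(list)
    for peaks in peak_sets:
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlapping or directly adjacent: extend the current interval.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


def to_saf_lines(peaks):
    """Format merged peaks as SAF lines (1-based start, placeholder strand)."""
    lines = ["GeneID\tChr\tStart\tEnd\tStrand"]
    for i, (chrom, start, end) in enumerate(peaks, 1):
        # BED starts are 0-based, SAF starts are 1-based.
        lines.append(f"peak_{i}\t{chrom}\t{start + 1}\t{end}\t.")
    return lines


rep1 = [("chr1", 100, 200), ("chr1", 500, 600)]
rep2 = [("chr1", 150, 250), ("chr2", 10, 50)]
consensus = merge_peaks([rep1, rep2])
print("\n".join(to_saf_lines(consensus)))
```

A plain union is the most permissive choice; stricter consensus definitions reduce noise at the cost of dropping condition-specific peaks.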
Call:
with:
- saf_file: SAF file output from Step 1.
- bam_files: List of paths to BAM files.
- output_counts: Path to output count matrix.
- is_paired_end: Whether the BAM files are paired-end.
- threads: Number of threads.

Output: atac_counts.txt
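As a rough illustration of how these parameters could map onto a featureCounts invocation, the sketch below only assembles the argument list and does not run anything. The helper name `featurecounts_command` is hypothetical, and the actual tool may pass different or additional options. The flags used (`-F SAF`, `-a`, `-o`, `-T`, `-p`) are standard featureCounts options.

```python
def featurecounts_command(saf_file, bam_files, output_counts,
                          is_paired_end=False, threads=4):
    """Build a featureCounts argument list for counting reads per SAF region."""
    cmd = [
        "featureCounts",
        "-F", "SAF",          # annotation is in SAF format rather than GTF
        "-a", saf_file,       # annotation file (consensus peaks)
        "-o", output_counts,  # output count matrix
        "-T", str(threads),   # number of threads
    ]
    if is_paired_end:
        # Marks the input as paired-end. Recent Subread releases also need
        # --countReadPairs to count fragments instead of individual reads.
        cmd.append("-p")
    cmd.extend(bam_files)
    return cmd


cmd = featurecounts_command(
    "consensus_peaks.saf",
    ["sample1.bam", "sample2.bam"],
    "atac_counts.txt",
    is_paired_end=True,
    threads=8,
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once featureCounts is available on the PATH.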
Prepare samples.csv describing condition and replicate information.
sample,condition,replicate
sample1.bam,c1,1
sample2.bam,c1,2
sample3.bam,c2,1
sample4.bam,c2,2
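Since the count matrix columns have to line up with the metadata rows, a quick consistency check before running DESeq2 can save a failed run. The sketch below is one possible check, using only the Python standard library. The function name `check_metadata` and the returned keys are hypothetical, and the check assumes the count matrix columns are named exactly like the `sample` column.

```python
import csv
import io
from collections import Counter


def check_metadata(metadata_csv_text, count_columns):
    """Compare samples.csv against count-matrix column names.

    Returns a dict listing samples missing on either side and conditions
    with fewer than two replicates.
    """
    rows = list(csv.DictReader(io.StringIO(metadata_csv_text)))
    samples = [row["sample"] for row in rows]
    replicates = Counter(row["condition"] for row in rows)
    return {
        "missing_in_counts": [s for s in samples if s not in count_columns],
        "missing_in_metadata": [c for c in count_columns if c not in samples],
        "under_replicated": sorted(c for c, n in replicates.items() if n < 2),
    }


metadata = """sample,condition,replicate
sample1.bam,c1,1
sample2.bam,c1,2
sample3.bam,c2,1
sample4.bam,c2,2
"""
columns = ["sample1.bam", "sample2.bam", "sample3.bam", "sample4.bam"]
print(check_metadata(metadata, columns))
```

All three lists being empty indicates that the metadata and count matrix agree and that every condition has at least two replicates.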
Call:
with:
Output: DAR_results.csv or ${tf}_DB_results.csv
Call:
with:
- results_csv: Path to DESeq2 results CSV.
- counts_file: Path to original counts file (for PCA).
- metadata_file: Path to metadata (for PCA grouping).
- output_dir: Directory to save plots.
- condition_col: (e.g. "condition")

Call:
with:
- results_csv: Path to DESeq2 results CSV.
- output_prefix: Prefix for output BED files.
- padj_cutoff: Provided by user.
- log2fc_cutoff: Provided by user.

Output: DAR_sig.bed, DAR_up.bed, DAR_down.bed or ${tf}_DB_sig.bed, ${tf}_DB_up.bed, ${tf}_DB_down.bed
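The filtering logic behind the significant, up, and down BED files can be sketched as follows. This is an illustration under assumptions, not the tool's code: the `padj` and `log2FoldChange` column names match DESeq2 output, but the `chrom`, `start`, and `end` fields are a hypothetical layout, as are the function names and the choice of a strict `padj <` comparison. Regions with a missing adjusted p-value (`NA`) are skipped.

```python
import math


def parse_number(text):
    """Return a float, or None for missing values such as 'NA' or ''."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(value) else value


def split_regions(rows, padj_cutoff, log2fc_cutoff):
    """Split result rows into significant, up, and down region lists."""
    sig, up, down = [], [], []
    for row in rows:
        padj = parse_number(row["padj"])
        lfc = parse_number(row["log2FoldChange"])
        if padj is None or lfc is None:
            continue  # not testable, e.g. filtered out by DESeq2
        if padj >= padj_cutoff or abs(lfc) < log2fc_cutoff:
            continue  # fails one of the user-provided thresholds
        region = (row["chrom"], int(row["start"]), int(row["end"]))
        sig.append(region)
        (up if lfc > 0 else down).append(region)
    return sig, up, down


rows = [
    {"chrom": "chr1", "start": "100", "end": "250", "log2FoldChange": "2.1", "padj": "0.001"},
    {"chrom": "chr1", "start": "500", "end": "600", "log2FoldChange": "-1.5", "padj": "0.01"},
    {"chrom": "chr2", "start": "10", "end": "50", "log2FoldChange": "3.0", "padj": "NA"},
]
sig, up, down = split_regions(rows, padj_cutoff=0.05, log2fc_cutoff=1.0)
for chrom, start, end in sig:
    print(f"{chrom}\t{start}\t{end}")
```

The direction of "up" depends on how the contrast was specified in the DESeq2 step, so the contrast order should be recorded alongside the BED files.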
- Batch correction: design = ~ batch + condition
- Explicit contrast: contrast=c("condition","A","B")
- Likelihood ratio test: DESeq(dds, test="LRT", reduced=~1)
- Low-count filtering: dds[rowSums(counts(dds)) >= 20, ]

| Issue | Solution |
|-------|-----------|
| Very low counts | Increase threshold (rowSums >= 20) |
| Batch effect | Add batch term to design |
| Non-converging model | Use fitType="local" or betaPrior=FALSE |
| Mismatched sample names | Ensure count column names match metadata rows |
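The low-count threshold from the table (`rowSums >= 20`) is a simple row filter. As a language-neutral illustration, here is the same idea in plain Python on a small in-memory count table. The function name `filter_low_counts` and the dict-based layout are hypothetical; inside the pipeline the filter is applied to the DESeq2 dataset object in R.

```python
def filter_low_counts(counts, min_total=20):
    """Keep regions whose summed raw count across samples is >= min_total.

    counts: dict mapping region ID to a list of per-sample raw counts.
    """
    return {
        region: row
        for region, row in counts.items()
        if sum(row) >= min_total
    }


counts = {
    "peak_1": [0, 1, 2, 3],    # total 6, removed
    "peak_2": [10, 12, 8, 9],  # total 39, kept
    "peak_3": [5, 5, 5, 5],    # total 20, kept (threshold is inclusive)
}
kept = filter_low_counts(counts, min_total=20)
print(sorted(kept))
```

Raising `min_total` removes more low-signal regions, which reduces the multiple-testing burden but can also discard weak yet genuine differences.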