Version Compatibility

Reference examples tested with: DEWSeq 1.18+, htseq-clip 2.0+, DESeq2 1.44+, edgeR 4.2+, limma 3.60+, Flipper (commit 2024.04+), Skipper (commit 2023.05+), pybedtools 0.10+, pyranges 0.0.129+.

Before using code patterns, verify installed versions match. If versions differ:

R: packageVersion('<pkg>') then ?function_name to verify parameters
Python: pip show <package> then help(module.function) to check signatures
CLI: <tool> --version then <tool> --help to confirm flags

If code throws unexpected errors, introspect the installed package and adapt the example to match the actual API rather than retrying.

Differential CLIP-seq Analysis

"Identify regions with changed RBP binding across conditions" -> Test for condition-level differences in IP enrichment relative to SMInput, accounting for replicate variance and (where available) sequencing depth normalization. Three statistical scales are possible: peak-level (test each peak as a unit), window-level (test fixed transcriptome windows; DEWSeq, Flipper), or crosslink-site level (test single-nt positions). The choice depends on the biology (narrow regulatory shift vs broad binding-mode change) and on which upstream peak caller was used (CLIPper -> peak-level; Skipper -> window-level; PureCLIP -> CL-level).

R (window-level, DEWSeq + htseq-clip): library(DEWSeq); dds <- DESeqDataSetFromSlidingWindows(counts, colData, design=~condition); dds <- DESeq(dds); res <- results(dds)
R (Skipper downstream, Flipper): flipper differential -i skipper_out/ --design design.tsv --contrast treatment vs control -o flipper_out/
R (peak-level, edgeR): dge <- DGEList(counts=peak_counts, group=condition); dge <- calcNormFactors(dge); design <- model.matrix(~condition); dge <- estimateDisp(dge, design); fit <- glmQLFit(dge, design); res <- glmQLFTest(fit)
R (peak-level, limma-voom): v <- voom(dge, design); fit <- lmFit(v, design); fit <- eBayes(fit); res <- topTable(fit, coef=2, number=Inf)
CLI (htseq-clip preprocessing): htseq-clip extract -i annotation.gff -o annotation_windows.bed -w 50 -s 20 && htseq-clip count -i sample.bam -w annotation_windows.bed -o sample.counts.txt

DEWSeq is the EMBL/Hentze-group windowed-NB approach for CLIP binding-site discovery, commonly adapted for differential testing via the interaction design. Flipper is the Skipper-companion tool (Flanagan 2026) for the modern Skipper workflow. Peak-level edgeR/limma-voom work when the upstream peak caller produced a comparable peak BED across conditions (e.g., consensus peaks from CLIPper).

Algorithmic Taxonomy

| Tool | Scale | Statistical model | Replicate requirement | Strength | Fails when | |------|-------|-------------------|----------------------|----------|------------| | DEWSeq (Schwarzl 2024) | 50-100 nt sliding window | Negative binomial GLM (DESeq2 internals) | >= 2 reps per condition | Designed specifically for CLIP; integrates SMInput | Slow on dense libraries; output window-resolution | | Flipper (Flanagan 2026) | Skipper window (100 nt) | Negative binomial; designed for Skipper output | >= 2 reps per condition | Modern; pairs with Skipper peak caller | Only useful if upstream is Skipper | | ASpeak | Peak | Negative binomial | >= 2 reps | Peak-level for CLIPper output | Less popular; legacy | | edgeR (general) | Peak or window | Quasi-likelihood F-test on NB | >= 2 reps | Mature, widely cited | Generic; not CLIP-aware; needs careful normalization | | limma-voom | Peak or window | Linear model with mean-variance trend | >= 2 reps | Fast; well-validated; handles small samples | Generic; treats counts as continuous after voom | | DESeq2 (direct) | Peak or window | Negative binomial GLM | >= 2 reps | Mature; same engine as DEWSeq | Same as edgeR caveats | | Single-cell CLIP (specialized) | Cell-resolved (scCLIP) | Cell-mixture / pseudobulk | Cells | Single-cell CLIP differential | Nascent; few published tools | | diffbind (CLIP adaptation) | Peak | DESeq2 or edgeR backend | >= 2 reps | Familiar from ChIP/ATAC | Designed for ChIP; needs CLIP-specific normalization | | MAnorm2 | Peak | Hierarchical empirical Bayes | >= 2 reps | Tested on ChIP; less on CLIP | Less CLIP-specific |

Methodology evolves; verify DEWSeq vignette and Flipper paper for current best practice. The DEWSeq + htseq-clip pipeline (Schwarzl 2024) is the most-published CLIP-specific differential framework; Flipper (Flanagan 2026) is the modern Skipper-coupled alternative.

Critical Decision: The Interaction-Term Design

Use ~ type + condition + type:condition and test the interaction-term coefficient. This is the single most consequential statistical choice in differential CLIP.

type = ip vs sminput (whether the library is IP or size-matched input)
condition = treated vs untreated (or KD vs control)
type:condition interaction = "Does the IP-vs-input ratio shift across conditions?"

A naive ~ condition design tests whether read counts differ regardless of whether they come from IP or SMInput - this confounds binding changes with expression changes. The interaction term explicitly tests for differential binding (the biologically meaningful signal) rather than differential expression at peak loci.

When testing, extract the interaction-term coefficient: results(dds, name = 'typeip.conditiontreat'). The log2FoldChange returned is the change in IP/input ratio in treated vs untreated - this is what "differential binding" means.

Critical Choice: Peak-Level vs Window-Level vs Crosslink-Level

Three scales differ in resolution and statistical power:

Peak-level (edgeR, limma-voom, DESeq2 on CLIPper peaks): Test each consensus peak's IP/SMInput log2 FC across conditions. Pro: peak boundaries are biologically meaningful; output interpretable. Con: peak set changes across conditions (a new peak in treatment but missing in control complicates testing); SMInput normalization must be applied consistently.

Window-level (DEWSeq, Flipper): Tile transcriptome into fixed 50-100 nt windows; test each window. Pro: comparable across conditions (windows are pre-defined); handles binding-mode shifts within a peak; high statistical power. Con: window-resolution; multiple-testing burden (millions of windows); biological meaning of a window needs translation.

Crosslink-level (custom): Test each single-nt CL position. Pro: nucleotide resolution; captures motif-level shifts. Con: very low coverage per position; massive multiple-testing burden; rarely used in published differential CLIP.

| Goal | Scale | Tool | |------|-------|------| | ENCODE-style peak-level differential (CLIPper upstream) | Peak | DESeq2 / edgeR / limma-voom on CLIPper consensus peaks | | Maximum sensitivity windowed differential | Window | DEWSeq (with htseq-clip) or Flipper (with Skipper) | | Modern Skipper-coupled workflow | Window | Flipper | | Single-cell scCLIP differential | Cell | Specialized single-cell CLIP methods (few published tools) | | Compare binding-mode shifts within a peak | Window | DEWSeq | | Allele-specific differential | CL site | BEAPR + custom logistic | | RBP-KD effect on binding profile | Peak/window | DEWSeq (handles KD-effect on RBP itself) |

DEWSeq Workflow (Window-Level Differential)

DEWSeq is the EMBL/Hentze-group windowed-NB framework for CLIP binding-site discovery, adapted here for differential testing via the interaction design. The pipeline is:

Goal: Identify transcriptome windows where the IP-vs-SMInput ratio shifts across conditions, accounting for replicate variance with the negative-binomial GLM.

Approach: Use htseq-clip to extract sliding 50 nt windows across annotated features, count reads per window per sample, build a DESeqDataSetFromSlidingWindows object with the ~ type + condition + type:condition interaction design, extract the typeip.conditiontreat interaction coefficient as the differential-binding effect size, and aggregate adjacent significant windows with bedtools merge -d 100.

# Step 1: htseq-clip generates sliding-window count matrices
htseq-clip extract \
    -i gencode.v38.annotation.gff \
    -o annotation_windows.bed \
    --window-size 50 \
    --window-step 20 \
    --feature-type CDS,UTR

# Step 2: count IP and SMInput reads per window per sample
for sample in ip_rep1 ip_rep2 sminput_rep1 sminput_rep2; do
    htseq-clip count \
        -i ${sample}.dedup.bam \
        -a annotation_windows.bed \
        -o ${sample}.counts.txt \
        --mate 2
done
# (For eCLIP, --mate 2 because R2 5' is the truncation site; for iCLIP single-end use --mate 1)

# Step 3: DEWSeq differential testing
htseq-clip mergeCounts \
    -i ip_rep1.counts.txt ip_rep2.counts.txt sminput_rep1.counts.txt sminput_rep2.counts.txt \
    -o merged_counts.tsv

library(DEWSeq)

counts <- read.table('merged_counts.tsv', sep='\t', header=TRUE, row.names=1)
colData <- data.frame(
    sample = c('ip_rep1','ip_rep2','sminput_rep1','sminput_rep2'),
    type = c('ip','ip','sminput','sminput'),
    condition = c('treated','treated','untreated','untreated')
)

dds <- DESeqDataSetFromSlidingWindows(
    countData = counts,
    colData = colData,
    annotObj = 'annotation_windows.bed',
    design = ~ type + condition
)

dds <- DESeq(dds)
# In a model with `~ type + condition`, the IP-vs-input contrast is the simple condition
# main effect; to test the differential CLIP signal between conditions use the interaction
# term name from the design matrix (matches the skill's interaction-term guidance below):
res <- results(dds, name='typeip.conditiontreated')

# Window-level FDR adjustment
res_filtered <- res[!is.na(res$padj) & res$padj < 0.05 & abs(res$log2FoldChange) > 1, ]

# Aggregate adjacent significant windows into differential regions
sig_windows <- as.data.frame(res_filtered)
sig_windows$chr <- gsub('_.*', '', rownames(sig_windows))
# Custom reduce: combine adjacent windows within 100 nt

Flipper Workflow (Skipper-Coupled)

Flipper (Flanagan 2026) is the differential companion to Skipper, operating on the same 100 nt feature-respecting windows with a negative-binomial (DESeq2-based) differential test.

# Assume Skipper has been run on all samples; Skipper output is at skipper_out/
flipper differential \
    -i skipper_out/ \
    --design design.tsv \
    --contrast treatment vs control \
    -o flipper_out/

# design.tsv format:
# sample_id   condition   replicate   ip_or_input
# ip_treat_r1 treatment   1           ip
# ip_treat_r2 treatment   2           ip
# in_treat_r1 treatment   1           input
# ...

Output: differential window BED with log2 FC, p, padj per window.

Peak-Level Differential (CLIPper Upstream)

library(DESeq2)
library(GenomicRanges)
library(Rsubread)

# Step 1: union of CLIPper peaks across conditions
# (See bedtools merge upstream)
peaks <- read.table('consensus_peaks.bed', sep='\t', col.names=c('chr','start','end','name','score','strand'))

# Step 2: count reads per peak per sample with featureCounts
saf <- data.frame(
    GeneID = peaks$name,
    Chr = peaks$chr,
    Start = peaks$start + 1,  # 1-based for featureCounts
    End = peaks$end,
    Strand = peaks$strand
)
counts_ip <- featureCounts(
    files = c('ip_treat_r1.bam','ip_treat_r2.bam','ip_ctrl_r1.bam','ip_ctrl_r2.bam'),
    annot.ext = saf,
    isGTFAnnotationFile = FALSE,
    strandSpecific = 1,
    isPairedEnd = TRUE
)$counts

counts_in <- featureCounts(
    files = c('in_treat_r1.bam','in_treat_r2.bam','in_ctrl_r1.bam','in_ctrl_r2.bam'),
    annot.ext = saf,
    isGTFAnnotationFile = FALSE,
    strandSpecific = 1,
    isPairedEnd = TRUE
)$counts

# Step 3: DESeq2 with IP vs SMInput interaction
all_counts <- cbind(counts_ip, counts_in)
colData <- data.frame(
    type = rep(c('ip','input'), each=4),
    condition = rep(c('treat','treat','ctrl','ctrl'), 2),
    replicate = rep(c('r1','r2','r1','r2'), 2)
)

dds <- DESeqDataSetFromMatrix(countData = all_counts, colData = colData,
                               design = ~ type + condition + type:condition)
dds <- DESeq(dds)

# The interaction term `typeip.conditiontreat` tests:
# does the IP/input ratio differ in treatment vs control?
res <- results(dds, name = 'typeip.conditiontreat')

# Filter
res_sig <- res[!is.na(res$padj) & res$padj < 0.05 & abs(res$log2FoldChange) > 1, ]

The interaction-term design (type:condition) is the correct statistical model for differential CLIP: it tests whether the IP-vs-input ratio differs across conditions, which is what "differential binding" means. Naive testing of just condition (ignoring SMInput) confounds binding changes with expression changes.

RBP Knockdown Experiment Design

The canonical differential CLIP design is to knock down the RBP and observe what binding sites are lost. Caveats:

| Issue | Implication | Mitigation | |-------|-------------|------------| | RBP KD also depletes the RBP protein in cells | siRNA/shRNA reduces RBP -> reduces IP yield -> reduces unique fragments | Normalize against SMInput WITHIN each condition; the relative IP/SMInput captures binding, not protein level | | RBP KD changes transcript abundance | mRNA stability regulators (HuR, PUM2) when knocked down change target abundance | Both IP and SMInput see the change; ratio still works | | Off-target effects of siRNA | Multiple binding profiles change | Use multiple independent siRNAs; require concordance | | KD efficiency varies | Lower KD -> smaller binding-loss signal | Validate KD by WB on the same IP lysate; > 70% protein loss target | | Rescue requires re-introducing RBP | siRNA-resistant RBP cDNA for rescue | The standard differential validation experiment |

Per-Tool Failure Modes

DEWSeq -- Slow on dense libraries

Trigger: Whole-genome window tiling at 20 nt step; dense library (50M unique fragments); 4+ samples.

Mechanism: DEWSeq runs DESeq2 internals on millions of windows; the dispersion fit on this many features is slow.

Symptom: Runtime > 6 h; out-of-memory; "size of object exceeds vector limit".

Fix: Increase window step size to 50 nt; pre-filter windows with low counts; or restrict to expressed transcripts only. DEWSeq vignette suggests keep <- rowSums(counts(dds)) >= 30; dds <- dds[keep,] before testing.

DEWSeq -- Custom adjacency aggregation needed

Trigger: Windows are 50 nt; biological binding sites are 50-500 nt; user expects DEWSeq to output continuous "differential regions" but gets individual windows.

Mechanism: DEWSeq outputs per-window results; aggregating adjacent significant windows into regions is a separate step.

Symptom: Output has 10,000 individual windows; user expects 1,000 biological regions.

Fix: Use the DEWSeq utility resultsDEWSeq() then bedtools merge -d 100 on the significant-window BED. Or use the top_hits_to_bed.R script from DEWSeq examples.

Peak-level differential -- Peak set differs between conditions

Trigger: CLIPper called peaks separately per condition; treatment has peaks at sites missing in control (and vice versa).

Mechanism: Peak unification requires a consensus peakset; testing on a "treatment-only" peak underestimates evidence in control (zero reads) and produces spurious DE.

Symptom: "Treatment-specific" peaks dominate DE results; biologically implausible.

Fix: Generate consensus peakset across all conditions (bedtools merge of all per-condition peak BEDs); count reads per consensus peak across all samples; THEN run differential. The Yeo lab convention is consensus peakset across all samples.

Interaction term forgotten

Trigger: DESeq2 design ~ condition instead of ~ type + condition + type:condition.

Mechanism: Simple ~ condition tests whether read counts differ between treatment and control regardless of whether reads are from IP or SMInput. A condition-driven expression change in SMInput is detected as DE binding.

Symptom: DE results dominated by transcripts with global expression changes (housekeeping shifts).

Fix: Always use the interaction-term design. The biologically meaningful test is the interaction type:condition p-value.

Normalization assumptions

Trigger: edgeR calcNormFactors(method='TMM') on CLIP-seq data.

Mechanism: TMM assumes most features (genes) are not differentially expressed. CLIP-seq peak counts can be globally shifted if the RBP itself is knocked down; TMM normalization would force the shift to be invisible.

Symptom: Knockdown experiment shows ~0 DE peaks; expected hundreds.

Fix: Use SMInput as the control library; spike-in normalization if available; or skip TMM and use library-size normalization only. Some CLIP-specific tools (DEWSeq, Flipper) handle this internally.

Flipper requires Skipper upstream

Trigger: Flipper called on CLIPper output.

Mechanism: Flipper expects Skipper's window-level output format with beta-binomial estimates.

Symptom: Flipper crashes or produces nonsense.

Fix: Use DEWSeq with htseq-clip for CLIPper-upstream workflows; use Flipper only with Skipper.

Decision Tree by Scenario

| Scenario | Tool + design | Why | |----------|---------------|-----| | KD vs control eCLIP, CLIPper upstream | DEWSeq + htseq-clip | CLIP-specific NB GLM with interaction term | | KD vs control eCLIP, Skipper upstream | Flipper | Skipper companion; matches windowing | | Treatment vs vehicle (small effect) | DEWSeq (window-level higher power) | Sliding windows capture small shifts | | Multiple time points | DEWSeq with time as covariate | Continuous design with time vector | | Allele-specific differential | BEAPR per-allele + custom logistic | See clip-seq/clip-alignment for WASP | | Single-cell CLIP differential | Specialized single-cell CLIP methods | Nascent; few published tools | | Differential motif occupancy | Window-level + DEWSeq + motif overlap | Combine differential windows with motif BED | | RBP overexpression vs control | Same as KD reversed | Same statistical framework | | Compare two RBPs | NOT differential CLIP; use SPIDR or separate CLIPs | Different RBPs need separate IPs | | Spike-in normalization needed | DEWSeq + spike-in size factors | For global occupancy shifts | | chimeric eCLIP differential miRNA targets | Custom; treat each miRNA-target chimera as feature | Specialized; see clip-seq/ago-clip-mirna-targets |

Reconciliation: When Differential Tools Disagree

| Pattern | Likely cause | Action | |---------|--------------|--------| | DEWSeq finds many DE windows; edgeR peak-level finds few | Window-level higher power for narrow shifts | Aggregate DEWSeq windows; cross-check | | edgeR many DE; DEWSeq few | edgeR not accounting for SMInput | Re-run edgeR with interaction term | | DE peaks dominated by expression changes | No interaction term | Use ~ type + condition + type:condition | | KD experiment shows ~0 DE | TMM over-corrects global shift | Switch to library-size norm only; use SMInput | | siRNA replicates discordant | Off-target effects vary | Use multiple independent siRNAs; require concordance | | Treatment-only peaks dominate DE | No consensus peakset | Generate consensus first; then test on unified set | | Significant windows scattered | Window aggregation step skipped | bedtools merge -d 100 on significant-window BED | | Same gene appears in many DE windows | Multiple binding sites per gene differential | Report at gene-level too; not just window-level | | Flipper fails with non-Skipper input | Upstream mismatch | Use DEWSeq for non-Skipper workflows | | DESeq2 dispersion fit fails | Too few replicates (n=2 per condition); too few features after filtering | Increase replicates; or relax filtering |

Operational rule for high-confidence differential reporting: (a) Use SMInput-aware design (~ type + condition + type:condition); (b) generate consensus peakset across conditions; (c) require padj < 0.05 AND |log2FC| > 1; (d) require concordance with at least one orthogonal method (e.g., DEWSeq + edgeR peak-level on same data); (e) for KD experiments, validate KD efficiency by WB and require multiple independent siRNAs.

Common Errors

| Error / symptom | Cause | Solution | |-----------------|-------|----------| | DESeq2 ~ condition finds many DE | No interaction with type | Use ~ type + condition + type:condition and test interaction | | DEWSeq output is gene-level not window-level | keep <- ... filter too aggressive | Loosen pre-filter | | Adjacent significant windows not aggregated | Forgot bedtools merge | bedtools merge -d 100 on sig windows | | edgeR TMM fits all libraries to one value | Most-features-not-DE assumption violated | Use library-size norm; or DEWSeq for CLIP-specific | | Flipper crashes on CLIPper input | Tool mismatch | Switch to DEWSeq | | Few replicates -> unstable estimates | n=2 not enough for dispersion | Increase n; or use limma-voom (more tolerant) | | Peaks differ across conditions | Per-condition peak calls | Unify with consensus peakset | | KD experiment yields 0 DE | Normalization over-corrected; OR KD efficiency too low | Validate KD WB; check normalization | | Global shift in IP relative to SMInput | RBP itself knocked down so IP yield lower | Normalize WITHIN each condition | | Lots of "treatment-only" peaks | Caller stringency higher in one condition | Use consensus peakset for fairness |

References

Schwarzl T et al 2024 Nucleic Acids Res 52:e1 (DEWSeq, windowed NB binding-site discovery; DESeq2-based, adaptable to differential designs)
Sahadevan S et al 2022 Methods Mol Biol 2404:189 (DEWSeq + htseq-clip pipeline)
Flanagan K, Xu S, Yeo GW 2026 bioRxiv 2026.03.13.711628 (Flipper, Skipper-companion differential; preprint)
Boyle EA et al 2023 Cell Genomics 3:100317 (Skipper, parent of Flipper)
Love MI et al 2014 Genome Biol 15:550 (DESeq2)
McCarthy DJ et al 2012 Nucleic Acids Res 40:4288 (edgeR)
Ritchie ME et al 2015 Nucleic Acids Res 43:e47 (limma)
Yang EW et al 2019 Nat Commun 10:1338 (BEAPR allele-specific protein-RNA binding)
Van Nostrand EL et al 2020 Nature 583:711 (ENCODE 150 RBP shRNA + eCLIP comparison)

Related Skills

clip-seq/clip-peak-calling - CLIPper / Skipper outputs feed differential
clip-seq/clip-qc - Replicate QC required for valid differential
clip-seq/binding-site-annotation - Annotate differential regions
clip-seq/clip-motif-analysis - Motif analysis on differential windows
clip-seq/ago-clip-mirna-targets - Differential miRNA targeting from chimeric eCLIP
differential-expression/deseq2-basics - Underlying NB GLM model
differential-expression/de-results - DE results interpretation
differential-expression/edger-basics - edgeR for peak counts
chip-seq/differential-binding - DNA-protein analogue

Version Compatibility

Reference examples tested with: DEWSeq 1.18+, htseq-clip 2.0+, DESeq2 1.44+, edgeR 4.2+, limma 3.60+, Flipper (commit 2024.04+), Skipper (commit 2023.05+), pybedtools 0.10+, pyranges 0.0.129+.

Before using code patterns, verify installed versions match. If versions differ:

R: packageVersion('<pkg>') then ?function_name to verify parameters
Python: pip show <package> then help(module.function) to check signatures
CLI: <tool> --version then <tool> --help to confirm flags

If code throws unexpected errors, introspect the installed package and adapt the example to match the actual API rather than retrying.

Differential CLIP-seq Analysis

R (window-level, DEWSeq + htseq-clip): library(DEWSeq); dds <- DESeqDataSetFromSlidingWindows(counts, colData, design=~condition); dds <- DESeq(dds); res <- results(dds)
R (Skipper downstream, Flipper): flipper differential -i skipper_out/ --design design.tsv --contrast treatment vs control -o flipper_out/
R (peak-level, edgeR): dge <- DGEList(counts=peak_counts, group=condition); dge <- calcNormFactors(dge); design <- model.matrix(~condition); dge <- estimateDisp(dge, design); fit <- glmQLFit(dge, design); res <- glmQLFTest(fit)
R (peak-level, limma-voom): v <- voom(dge, design); fit <- lmFit(v, design); fit <- eBayes(fit); res <- topTable(fit, coef=2, number=Inf)
CLI (htseq-clip preprocessing): htseq-clip extract -i annotation.gff -o annotation_windows.bed -w 50 -s 20 && htseq-clip count -i sample.bam -w annotation_windows.bed -o sample.counts.txt

Algorithmic Taxonomy

Critical Decision: The Interaction-Term Design

Use ~ type + condition + type:condition and test the interaction-term coefficient. This is the single most consequential statistical choice in differential CLIP.

type = ip vs sminput (whether the library is IP or size-matched input)
condition = treated vs untreated (or KD vs control)
type:condition interaction = "Does the IP-vs-input ratio shift across conditions?"

Critical Choice: Peak-Level vs Window-Level vs Crosslink-Level

Three scales differ in resolution and statistical power:

DEWSeq Workflow (Window-Level Differential)

DEWSeq is the EMBL/Hentze-group windowed-NB framework for CLIP binding-site discovery, adapted here for differential testing via the interaction design. The pipeline is:

Goal: Identify transcriptome windows where the IP-vs-SMInput ratio shifts across conditions, accounting for replicate variance with the negative-binomial GLM.

# Step 1: htseq-clip generates sliding-window count matrices
htseq-clip extract \
    -i gencode.v38.annotation.gff \
    -o annotation_windows.bed \
    --window-size 50 \
    --window-step 20 \
    --feature-type CDS,UTR

# Step 2: count IP and SMInput reads per window per sample
for sample in ip_rep1 ip_rep2 sminput_rep1 sminput_rep2; do
    htseq-clip count \
        -i ${sample}.dedup.bam \
        -a annotation_windows.bed \
        -o ${sample}.counts.txt \
        --mate 2
done
# (For eCLIP, --mate 2 because R2 5' is the truncation site; for iCLIP single-end use --mate 1)

# Step 3: DEWSeq differential testing
htseq-clip mergeCounts \
    -i ip_rep1.counts.txt ip_rep2.counts.txt sminput_rep1.counts.txt sminput_rep2.counts.txt \
    -o merged_counts.tsv

library(DEWSeq)

counts <- read.table('merged_counts.tsv', sep='\t', header=TRUE, row.names=1)
colData <- data.frame(
    sample = c('ip_rep1','ip_rep2','sminput_rep1','sminput_rep2'),
    type = c('ip','ip','sminput','sminput'),
    condition = c('treated','treated','untreated','untreated')
)

dds <- DESeqDataSetFromSlidingWindows(
    countData = counts,
    colData = colData,
    annotObj = 'annotation_windows.bed',
    design = ~ type + condition
)

dds <- DESeq(dds)
# In a model with `~ type + condition`, the IP-vs-input contrast is the simple condition
# main effect; to test the differential CLIP signal between conditions use the interaction
# term name from the design matrix (matches the skill's interaction-term guidance below):
res <- results(dds, name='typeip.conditiontreated')

# Window-level FDR adjustment
res_filtered <- res[!is.na(res$padj) & res$padj < 0.05 & abs(res$log2FoldChange) > 1, ]

# Aggregate adjacent significant windows into differential regions
sig_windows <- as.data.frame(res_filtered)
sig_windows$chr <- gsub('_.*', '', rownames(sig_windows))
# Custom reduce: combine adjacent windows within 100 nt

Flipper Workflow (Skipper-Coupled)

Flipper (Flanagan 2026) is the differential companion to Skipper, operating on the same 100 nt feature-respecting windows with a negative-binomial (DESeq2-based) differential test.

# Assume Skipper has been run on all samples; Skipper output is at skipper_out/
flipper differential \
    -i skipper_out/ \
    --design design.tsv \
    --contrast treatment vs control \
    -o flipper_out/

# design.tsv format:
# sample_id   condition   replicate   ip_or_input
# ip_treat_r1 treatment   1           ip
# ip_treat_r2 treatment   2           ip
# in_treat_r1 treatment   1           input
# ...

Output: differential window BED with log2 FC, p, padj per window.

Peak-Level Differential (CLIPper Upstream)

library(DESeq2)
library(GenomicRanges)
library(Rsubread)

# Step 1: union of CLIPper peaks across conditions
# (See bedtools merge upstream)
peaks <- read.table('consensus_peaks.bed', sep='\t', col.names=c('chr','start','end','name','score','strand'))

# Step 2: count reads per peak per sample with featureCounts
saf <- data.frame(
    GeneID = peaks$name,
    Chr = peaks$chr,
    Start = peaks$start + 1,  # 1-based for featureCounts
    End = peaks$end,
    Strand = peaks$strand
)
counts_ip <- featureCounts(
    files = c('ip_treat_r1.bam','ip_treat_r2.bam','ip_ctrl_r1.bam','ip_ctrl_r2.bam'),
    annot.ext = saf,
    isGTFAnnotationFile = FALSE,
    strandSpecific = 1,
    isPairedEnd = TRUE
)$counts

counts_in <- featureCounts(
    files = c('in_treat_r1.bam','in_treat_r2.bam','in_ctrl_r1.bam','in_ctrl_r2.bam'),
    annot.ext = saf,
    isGTFAnnotationFile = FALSE,
    strandSpecific = 1,
    isPairedEnd = TRUE
)$counts

# Step 3: DESeq2 with IP vs SMInput interaction
all_counts <- cbind(counts_ip, counts_in)
colData <- data.frame(
    type = rep(c('ip','input'), each=4),
    condition = rep(c('treat','treat','ctrl','ctrl'), 2),
    replicate = rep(c('r1','r2','r1','r2'), 2)
)

dds <- DESeqDataSetFromMatrix(countData = all_counts, colData = colData,
                               design = ~ type + condition + type:condition)
dds <- DESeq(dds)

# The interaction term `typeip.conditiontreat` tests:
# does the IP/input ratio differ in treatment vs control?
res <- results(dds, name = 'typeip.conditiontreat')

# Filter
res_sig <- res[!is.na(res$padj) & res$padj < 0.05 & abs(res$log2FoldChange) > 1, ]

RBP Knockdown Experiment Design

The canonical differential CLIP design is to knock down the RBP and observe what binding sites are lost. Caveats:

Per-Tool Failure Modes

DEWSeq -- Slow on dense libraries

Trigger: Whole-genome window tiling at 20 nt step; dense library (50M unique fragments); 4+ samples.

Mechanism: DEWSeq runs DESeq2 internals on millions of windows; the dispersion fit on this many features is slow.

Symptom: Runtime > 6 h; out-of-memory; "size of object exceeds vector limit".

DEWSeq -- Custom adjacency aggregation needed

Trigger: Windows are 50 nt; biological binding sites are 50-500 nt; user expects DEWSeq to output continuous "differential regions" but gets individual windows.

Mechanism: DEWSeq outputs per-window results; aggregating adjacent significant windows into regions is a separate step.

Symptom: Output has 10,000 individual windows; user expects 1,000 biological regions.

Fix: Use the DEWSeq utility resultsDEWSeq() then bedtools merge -d 100 on the significant-window BED. Or use the top_hits_to_bed.R script from DEWSeq examples.

Peak-level differential -- Peak set differs between conditions

Trigger: CLIPper called peaks separately per condition; treatment has peaks at sites missing in control (and vice versa).

Mechanism: Peak unification requires a consensus peakset; testing on a "treatment-only" peak underestimates evidence in control (zero reads) and produces spurious DE.

Symptom: "Treatment-specific" peaks dominate DE results; biologically implausible.

Interaction term forgotten

Trigger: DESeq2 design ~ condition instead of ~ type + condition + type:condition.

Symptom: DE results dominated by transcripts with global expression changes (housekeeping shifts).

Fix: Always use the interaction-term design. The biologically meaningful test is the interaction type:condition p-value.

Normalization assumptions

Trigger: edgeR calcNormFactors(method='TMM') on CLIP-seq data.

Symptom: Knockdown experiment shows ~0 DE peaks; expected hundreds.

Flipper requires Skipper upstream

Trigger: Flipper called on CLIPper output.

Mechanism: Flipper expects Skipper's window-level output format with beta-binomial estimates.

Symptom: Flipper crashes or produces nonsense.

Fix: Use DEWSeq with htseq-clip for CLIPper-upstream workflows; use Flipper only with Skipper.

Decision Tree by Scenario

Reconciliation: When Differential Tools Disagree

Common Errors

References

Schwarzl T et al 2024 Nucleic Acids Res 52:e1 (DEWSeq, windowed NB binding-site discovery; DESeq2-based, adaptable to differential designs)
Sahadevan S et al 2022 Methods Mol Biol 2404:189 (DEWSeq + htseq-clip pipeline)
Flanagan K, Xu S, Yeo GW 2026 bioRxiv 2026.03.13.711628 (Flipper, Skipper-companion differential; preprint)
Boyle EA et al 2023 Cell Genomics 3:100317 (Skipper, parent of Flipper)
Love MI et al 2014 Genome Biol 15:550 (DESeq2)
McCarthy DJ et al 2012 Nucleic Acids Res 40:4288 (edgeR)
Ritchie ME et al 2015 Nucleic Acids Res 43:e47 (limma)
Yang EW et al 2019 Nat Commun 10:1338 (BEAPR allele-specific protein-RNA binding)
Van Nostrand EL et al 2020 Nature 583:711 (ENCODE 150 RBP shRNA + eCLIP comparison)

Related Skills

clip-seq/clip-peak-calling - CLIPper / Skipper outputs feed differential
clip-seq/clip-qc - Replicate QC required for valid differential
clip-seq/binding-site-annotation - Annotate differential regions
clip-seq/clip-motif-analysis - Motif analysis on differential windows
clip-seq/ago-clip-mirna-targets - Differential miRNA targeting from chimeric eCLIP
differential-expression/deseq2-basics - Underlying NB GLM model
differential-expression/de-results - DE results interpretation
differential-expression/edger-basics - edgeR for peak counts
chip-seq/differential-binding - DNA-protein analogue

Adoption

GPTomics/bio-clip-seq-differential-clip

$ install --global

Security Scan Results

SKILL.md

Version Compatibility

Differential CLIP-seq Analysis

Algorithmic Taxonomy

Critical Decision: The Interaction-Term Design

Critical Choice: Peak-Level vs Window-Level vs Crosslink-Level

DEWSeq Workflow (Window-Level Differential)

Flipper Workflow (Skipper-Coupled)

Peak-Level Differential (CLIPper Upstream)

RBP Knockdown Experiment Design

Per-Tool Failure Modes

DEWSeq -- Slow on dense libraries

DEWSeq -- Custom adjacency aggregation needed

Peak-level differential -- Peak set differs between conditions

Interaction term forgotten

Normalization assumptions

Flipper requires Skipper upstream

Decision Tree by Scenario

Reconciliation: When Differential Tools Disagree

Common Errors

References

Related Skills

Related Skills

GPTomics/bio-workflows-clip-pipeline

GPTomics/bio-comparative-genomics-whole-genome-duplication

GPTomics/bio-comparative-genomics-whole-genome-alignment

GPTomics/bio-comparative-genomics-synteny-analysis

GPTomics/bio-clip-seq-differential-clip

$ install --global

Security Scan Results

SKILL.md

Version Compatibility

Differential CLIP-seq Analysis

Algorithmic Taxonomy

Critical Decision: The Interaction-Term Design

Critical Choice: Peak-Level vs Window-Level vs Crosslink-Level

DEWSeq Workflow (Window-Level Differential)

Flipper Workflow (Skipper-Coupled)

Peak-Level Differential (CLIPper Upstream)

RBP Knockdown Experiment Design

Per-Tool Failure Modes

DEWSeq -- Slow on dense libraries

DEWSeq -- Custom adjacency aggregation needed

Peak-level differential -- Peak set differs between conditions

Interaction term forgotten

Normalization assumptions

Flipper requires Skipper upstream

Decision Tree by Scenario

Reconciliation: When Differential Tools Disagree

Common Errors

References

Related Skills

Related Skills

GPTomics/bio-workflows-clip-pipeline

GPTomics/bio-comparative-genomics-whole-genome-duplication

GPTomics/bio-comparative-genomics-whole-genome-alignment

GPTomics/bio-comparative-genomics-synteny-analysis