epitranscriptomics/modification-visualization/SKILL.md
Visualises RNA-modification data with transcript-feature metagene plots (Guitar GuitarPlot with 5'UTR / CDS / 3'UTR scaling; MetaPlotR; deepTools `computeMatrix scale-regions`), peak-centred heatmaps (ComplexHeatmap; deepTools plotHeatmap), IP-vs-input paired browser tracks (bigWig of log2 IP/input via deepTools `bamCompare`; ggcoverage; pyGenomeTracks; Gviz; IGV / UCSC track hubs), DRACH sequence-logo plots (ggseqlogo; MEME), 5'UTR / CDS / 3'UTR stacked-bar feature-distribution summaries, and volcano / MA plots for differential modification. Establishes stop-codon enrichment in the metagene plot as the biological QC anchor for any MeRIP dataset (Dominissini 2012 *Nature* 485:201; Meyer 2012 *Cell* 149:1635 — concurrent founding papers from different labs both showed this independently). Use when producing the canonical metagene plot with stop-codon enrichment as a QC anchor, building paired IP/input genome-browser tracks at single-locus resolution, plotting peak-centred heatmaps clustered by condition, summarising peak distribution across transcript features for figure 1, generating DRACH motif logos as sanity checks on the peak set, rendering volcano plots of differential m6A, or reproducing the Dominissini 2012 / Meyer 2012 stop-codon enrichment plot.
npx skillsauth add GPTomics/bioSkills bio-epitranscriptomics-modification-visualizationInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Reference examples tested with: Guitar 2.18+ (Bioconductor), MetaPlotR (GitHub, unversioned), deepTools 3.5+, ggcoverage 1.4+, pyGenomeTracks 3.8+, Gviz 1.46+, ComplexHeatmap 2.18+, ggseqlogo 0.2+, ggplot2 3.5+, rtracklayer 1.62+, GenomicFeatures 1.54+, BSgenome.Hsapiens.UCSC.hg38 1.4+, TxDb.Hsapiens.UCSC.hg38.knownGene 3.18+, ChIPseeker 1.38+.
Before using code patterns, verify installed versions match. If versions differ:
packageVersion('Guitar') then ?GuitarPlot to verify parameters<tool> --help; deepTools <tool> --helpIf R throws unused argument or argument is missing, the API moved between Bioconductor minor releases. Guitar's older API used txdb= while newer uses txTxdb=; verify with ?GuitarPlot. deepTools bamCompare --operation log2 is the modern syntax (older --ratio log2 is being phased out).
"Make the canonical m6A metagene plot for my paper" -> Render the 5'UTR / CDS / 3'UTR transcript-feature metagene with stop-codon enrichment for visual confirmation of the canonical m6A topology (THE smoke test that the antibody-IP captured real m6A), the peak-feature-distribution stacked bar ("where do my peaks land?" Figure 1), peak-centred heatmaps clustered by condition, IP-over-Input paired browser tracks at specific loci, the DRACH sequence logo as a sanity check on antibody specificity, and the volcano / MA plots for differential modification. CRITICAL: the stop-codon enrichment plot is a biological-QC anchor — a MeRIP dataset that does NOT show enrichment at and around the stop codon indicates IP failure, wrong antibody, wrong protocol, or the assay captured a different modification (e.g., m1A is centred at TSS, not stop). Build this plot FIRST as a smoke test BEFORE any downstream visualisation.
Guitar::GuitarPlot(peakBedFiles, txTxdb=...) -- canonical transcript-feature metagene (THE field-standard plot)deeptools computeMatrix scale-regions ... + plotProfile -- genome-coordinate metageneComplexHeatmap::Heatmap() on peak-centred signal matrix -- peak-centred heatmap clustered by conditionpyGenomeTracks --tracks tracks.ini --region chr:start-end -- multi-track browser plotggcoverage::ggcoverage(data=track, mark.region=peaks) + geom_gene() -- ggplot2-based browser tracksggseqlogo::ggseqlogo(seqs) -- DRACH sequence logo from peak-centre 5-mersChIPseeker::annotatePeak() + ggplot2 stacked bar -- 5'UTR / CDS / 3'UTR feature distributionDominissini 2012 Nature 485:201 and Meyer 2012 Cell 149:1635 — concurrent papers from different labs using different protocols (MeRIP-seq and m6A-seq, respectively) — both independently showed m6A enrichment at and around the stop codon (3'UTR-proximal end of the CDS). This is the most robust biological signal in MeRIP-seq, reproduced across cell types, conditions, and decades. A metagene plot from a MeRIP library that does NOT show stop-codon enrichment indicates: (1) IP failure (antibody did not bind), (2) wrong antibody, (3) wrong protocol (e.g., the assay actually captured a different modification — m1A is centred at TSS, not stop), or (4) sample-RNA degradation. Build the Guitar metagene plot FIRST after peak calling, BEFORE any downstream visualisation, as a smoke test. The corollary: the 5'UTR / CDS / 3'UTR feature-distribution stacked bar (the "where do m6A peaks land?" Figure 1) should always be PAIRED with the metagene plot, because the stacked bar can look right (peaks land in 3'UTR / stop area) while the metagene is off (peak DENSITY not concentrated at the codon itself), or vice versa. The two plots are complementary biology-QC anchors, not redundant.
| Tool / plot | Mechanism | Inputs | Output | Strength | Fails when |
|-------------|-----------|--------|--------|----------|------------|
| Guitar GuitarPlot (Cui 2016 Biomed Res Int 2016:8367534) | Per-peak distance computation relative to TSS / start / stop / TES; feature-scaled rendering | BED + TxDb | PDF + per-feature density | THE field-standard m6A metagene; transcript-feature-aware | Older / newer API uses txdb= vs txTxdb= argument |
| MetaPlotR (Olarerin-George 2017 Bioinformatics 33:1563) | Alternative metagene approach with different segment-rescaling | BED + GTF | PDF | Alternative segment rescaling philosophy | GitHub-only; less actively maintained |
| deepTools computeMatrix + plotProfile (Ramírez 2016 NAR 44:W160) | Genome-coordinate signal aggregation over scaled gene regions | bigWig + BED | PDF + matrix | Fast; flexible region/anchor choice; well-tested ChIP-seq lineage | Not transcript-feature-aware; misses 5'UTR / CDS / 3'UTR distinction |
| deepTools plotHeatmap | Same matrix as plotProfile; rendered as heatmap with row clustering | bigWig matrix | PDF | Standard peak-centred heatmap | Single-condition view (use ComplexHeatmap for multi-condition clustering) |
| ComplexHeatmap (Gu 2016 Bioinformatics 32:2847) | General-purpose heatmap with multi-dimensional clustering | numeric matrix | PDF / interactive | Multi-condition cluster + annotation; publication-quality | Requires pre-computed signal matrix |
| pyGenomeTracks (Lopez-Delisle 2021 Bioinformatics 37:422) | Config-file (INI) driven multi-track browser plot | bigWig / bed / GTF + INI config | PDF / PNG | Reproducible browser figures via config; multi-track stacking | INI config syntax error-prone for new users |
| ggcoverage (Song & Wang 2023 BMC Bioinformatics 24:309) | ggplot2-native track plotting | bigWig / BAM + GTF | ggplot2 object | ggplot2 syntax; combinable with annotation layers | Newer tool; smaller user base than Gviz |
| Gviz (Hahne & Ivanek 2016 Methods Mol Biol 1418:335) | R/Bioconductor general genome track | bigWig / BAM / GRanges + TxDb | PDF / R plot | Most flexible R-native; long Bioconductor history | Heavier than ggcoverage; steeper learning curve |
| IGV / IGV.js (Robinson 2011 Nat Biotechnol 29:24) | Interactive browser via Java / JS | bigWig / BAM / bed | interactive | Standard for ad-hoc inspection | Not reproducible for figure generation |
| UCSC Track Hubs | UCSC genome browser display | bigWig + hub.txt + genomes.txt | URL | Public-display standard | Setup overhead for short projects |
| ggseqlogo (Wagih 2017 Bioinformatics 33:3645) | Sequence-logo rendering in ggplot2 | character vector of equal-length sequences | ggplot2 object | Native ggplot2; method='bits' or 'probability' | Requires equal-length sequences |
| MEME-ChIP (Machanick & Bailey 2011 Bioinformatics 27:1696) | Motif discovery + logo | BED + genome | HTML + PWM | Comprehensive motif suite; gold standard | Heavier than ggseqlogo for simple visualisation |
| HOMER findMotifsGenome.pl (Heinz 2010 Mol Cell 38:576) | Motif discovery via cumulative hypergeometric on shuffled background | BED + genome | HTML + motif files | Battle-tested; RNA mode via -rna | Less flexible output formatting than MEME |
| Scenario | Recommended | Why wrong choices fail | |----------|-------------|------------------------| | Canonical Figure 1 metagene plot for m6A paper | Guitar GuitarPlot with TxDb -- transcript-feature scaled | deepTools computeMatrix is genome-coordinate only; misses 5'UTR / CDS / 3'UTR distinction | | QC: does my MeRIP show stop-codon enrichment? | Guitar GuitarPlot FIRST -- the canonical biological-QC anchor | Skipping this and rushing to peak counting is the most common QC failure | | Browser-track Figure for a specific locus | pyGenomeTracks (config-file driven; reproducible) OR ggcoverage (ggplot2-native) | IGV is interactive but not reproducible for figures | | Multi-condition peak-centred heatmap with clustering | ComplexHeatmap; row-cluster k-means; annotate by condition | deepTools plotHeatmap is single-condition view | | DRACH sequence logo from peak centres | ggseqlogo on resized peak ranges (width=5, fix='center') | MEME-ChIP heavier than needed for visualisation only | | Differential m6A volcano plot | ggplot2 native (see m6a-differential) -- modification-visualization should NOT duplicate | Duplicating volcano recipe across skills inflates content | | Genome-coordinate metagene over custom features | deepTools computeMatrix scale-regions; plotProfile | Guitar restricted to transcript features; doesn't scale to arbitrary BED | | Cross-sample paired IP/input track display | bamCompare -> log2 bigWig; pyGenomeTracks multi-track | Single bigWig hides the IP/input pairing structure | | 5'UTR / CDS / 3'UTR stacked bar | ChIPseeker annotatePeak -> ggplot2 stacked bar | Manual annotation error-prone; ChIPseeker handles edge cases | | Reproducing Dominissini 2012 / Meyer 2012 stop-codon plot | Guitar GuitarPlot; the canonical recipe | Other tools produce similar but not identical plots |
Methodology evolves; before high-stakes figure generation, web-search "Guitar Bioconductor release notes" and "pyGenomeTracks vs ggcoverage" for current best practice.
Goal: Render the canonical m6A metagene plot showing peak density along scaled transcript features (5'UTR, CDS, 3'UTR), with the stop-codon-proximal enrichment that is the field's biological QC anchor.
Approach: Load called peaks from BED; pass to GuitarPlot with the matched TxDb; export per-feature density and the PDF.
library(Guitar)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(rtracklayer)
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
# Older Guitar versions: argument is `txdb=`; newer: `txTxdb=`. Verify with ?GuitarPlot.
GuitarPlot(
txTxdb = txdb,
stBedFiles = list('exomepeak2_output/m6a_run1/peaks.bed'),
miscOutFilePrefix = 'figures/m6a_metagene',
enableCI = FALSE,
saveToPDFprefix = 'figures/m6a_metagene'
)
If txTxdb= is rejected, fall back to txdb= and consult ?GuitarPlot. The expected output: stop-codon-proximal enrichment, with peak density rising toward and peaking near the stop codon in the 3'UTR. If the plot does NOT show this pattern, do not proceed to downstream visualisation — investigate IP / antibody / protocol failure in merip-preprocessing.
Goal: Render a genome-coordinate metagene over scaled gene regions using deepTools, useful when transcript-feature scaling is not needed or when the regions of interest are not standard genes (custom BED).
Approach: Build a signal matrix with computeMatrix scale-regions from a bigWig (typically IP-over-Input log2 from merip-preprocessing); render as profile plot + heatmap.
mkdir -p figures
computeMatrix scale-regions \
--regionsFileName refs/protein_coding.bed \
--scoreFileName tracks/IP_rep1_over_Input_rep1.bw \
--regionBodyLength 2000 \
--upstream 500 \
--downstream 500 \
--skipZeros \
--numberOfProcessors 8 \
--outFileName figures/m6a_matrix.gz
plotProfile \
--matrixFile figures/m6a_matrix.gz \
--outFileName figures/m6a_profile.pdf \
--plotTitle 'm6A IP over Input metagene' \
--plotType lines
plotHeatmap \
--matrixFile figures/m6a_matrix.gz \
--outFileName figures/m6a_heatmap.pdf \
--colorMap RdBu_r \
--plotTitle 'm6A IP over Input heatmap'
scale-regions is the right mode for gene-body metagene (5'-end-to-3'-end scaled to common length); reference-point is for peak-centred plots. For peak-centred, see the next section.
Goal: Render a peak-centred heatmap of IP-over-Input signal at +/-window around each peak, clustered by condition or by signal pattern, for cross-condition comparison.
Approach: Use deepTools computeMatrix reference-point --referencePoint center for the matrix; ComplexHeatmap for the clustered render.
computeMatrix reference-point \
--regionsFileName exomepeak2_output/m6a_run1/peaks.bed \
--scoreFileName tracks/IP_rep1_over_Input_rep1.bw tracks/IP_rep2_over_Input_rep2.bw tracks/IP_rep3_over_Input_rep3.bw \
--referencePoint center \
--upstream 500 \
--downstream 500 \
--binSize 25 \
--skipZeros \
--numberOfProcessors 8 \
--outFileName figures/peak_centred_matrix.gz \
--outFileNameMatrix figures/peak_centred_matrix.tab
plotHeatmap \
--matrixFile figures/peak_centred_matrix.gz \
--outFileName figures/peak_centred_heatmap.pdf \
--kmeans 3 \
--colorMap viridis \
--plotTitle 'm6A signal centred at peaks'
For multi-condition heatmap with annotations, parse the matrix into R and use ComplexHeatmap:
library(ComplexHeatmap)
library(circlize)
# deepTools --outFileNameMatrix has a 3-line JSON-style header before per-bin numeric columns.
raw <- read.delim('figures/peak_centred_matrix.tab', skip=3, header=FALSE)
mat <- as.matrix(raw[, -(1:6)])
col_fun <- colorRamp2(c(-2, 0, 2), c('blue', 'white', 'red'))
Heatmap(
mat,
name = 'log2 IP / Input',
col = col_fun,
cluster_columns = FALSE,
cluster_rows = TRUE,
row_km = 3,
show_row_names = FALSE,
show_column_names = FALSE,
column_title = 'Peak-centred (+/- 500 bp)'
)
Goal: Render publication-quality genome-browser tracks at a specific locus showing paired IP / Input bigWig tracks, the IP/input log2 ratio, peak calls, and gene annotation; reproducible via INI config.
Approach: Build a tracks.ini config file listing each track type (bigwig, bed, gtf); invoke pyGenomeTracks --tracks tracks.ini --region chr:start-end.
[x-axis]
where = top
fontsize = 12
[IP rep1]
file = tracks/IP_rep1.bw
title = IP rep1
color = #d62728
height = 2
min_value = 0
[Input rep1]
file = tracks/Input_rep1.bw
title = Input rep1
color = #1f77b4
height = 2
min_value = 0
[IP / Input log2]
file = tracks/IP_rep1_over_Input_rep1.bw
title = log2 IP / Input
color = #2ca02c
height = 2
[spacer]
[m6A peaks]
file = exomepeak2_output/m6a_run1/peaks.bed
title = m6A peaks (exomePeak2)
color = #ff7f0e
height = 1
labels = false
[genes]
file = refs/annotation.gtf
title = GENCODE genes
height = 2
prefered_name = gene_name
merge_transcripts = true
pyGenomeTracks \
--tracks tracks.ini \
--region chr19:54,792,000-54,799,000 \
--outFileName figures/browser_metti3_locus.pdf \
--width 14 \
--plotWidth 12
Goal: Render genome-browser tracks in ggplot2 syntax for combining with other ggplot2 layers (annotations, peak highlights, custom theming).
Approach: LoadTrackFile() parses a bigWig / bigBed / BAM input into the dataframe ggcoverage expects; chain ggcoverage() with geom_gene() for transcript annotation; mark.region requires columns start, end, and label.
library(ggcoverage)
library(rtracklayer)
peaks <- as.data.frame(import('exomepeak2_output/m6a_run1/peaks.bed'))
track.df <- LoadTrackFile(
track.file = 'tracks/IP_rep1_over_Input_rep1.bw',
format = 'bw',
region = 'chr19:54792000-54799000'
)
mark.df <- data.frame(
start = peaks$start,
end = peaks$end,
label = peaks$name
)
ggcoverage(
data = track.df,
color = 'auto',
mark.region = mark.df
) +
geom_gene(gtf.file = 'refs/annotation.gtf') +
ggplot2::theme_classic()
Goal: Render a sequence logo of peak-centre 5-mers as a sanity check that the called peak set is enriched for the DRACH consensus motif.
Approach: Resize peak GRanges to fixed 5-nt centred windows; extract genomic sequences; pass to ggseqlogo with method='probability'.
library(Biostrings)
library(BSgenome.Hsapiens.UCSC.hg38)
library(ggseqlogo)
library(rtracklayer)
peaks <- import('exomepeak2_output/m6a_run1/peaks.bed')
peak_centres <- resize(peaks, width=5, fix='center')
genome <- BSgenome.Hsapiens.UCSC.hg38
seqs <- as.character(getSeq(genome, peak_centres))
ggseqlogo(seqs, method='probability') +
ggplot2::labs(title='Peak-centre 5-mer (DRACH consensus expected)',
subtitle='Sanity check on antibody specificity — NOT a per-peak filter')
The expected output: a logo showing approximately D-R-A-C-H consensus (D=A/G/U, R=A/G, A=methylated, C, H=A/C/U) with A clearly dominating position 3. If the logo does NOT show DRACH-like enrichment, the IP failed OR the wrong antibody was used.
Goal: Render the Figure 1 "where do my peaks land?" stacked bar showing the fraction of peaks in 5'UTR vs CDS vs 3'UTR vs intron, paired with the metagene to confirm canonical m6A topology.
Approach: Use ChIPseeker annotatePeak() with a matched TxDb; aggregate to per-feature counts; render as ggplot2 stacked bar.
library(ChIPseeker)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(rtracklayer)
library(ggplot2)
library(dplyr)
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
peaks <- import('exomepeak2_output/m6a_run1/peaks.bed')
peak_anno <- annotatePeak(peaks, TxDb=txdb, level='transcript', verbose=FALSE)
anno_df <- as.data.frame(peak_anno@anno)
anno_df$feature <- gsub(' \\(.*\\)', '', anno_df$annotation)
feature_counts <- anno_df %>%
group_by(feature) %>%
summarise(n=n()) %>%
mutate(fraction=n/sum(n)) %>%
arrange(desc(fraction))
ggplot(feature_counts, aes(x='m6A peaks', y=fraction, fill=feature)) +
geom_col(width=0.5) +
scale_y_continuous(labels=scales::percent_format()) +
labs(x=NULL, y='Fraction of peaks', title='m6A peak distribution across transcript features') +
theme_minimal()
Trigger: Guitar GuitarPlot of called peaks does NOT show enrichment at and around the stop codon.
Mechanism: Canonical m6A topology (Dominissini 2012 / Meyer 2012) shows stop-codon-proximal enrichment. Absence indicates (1) IP failure, (2) wrong antibody, (3) wrong protocol, (4) sample-RNA degradation, OR (5) the assay captured a different modification (m1A is TSS-centred; m5C distribution differs).
Symptom: Metagene is flat OR peaks at TSS (m1A signature) OR peaks in introns (unusual).
Fix: Do NOT proceed to downstream analysis. Diagnose at the merip-preprocessing layer: plotFingerprint, per-transcript IP/input distribution, antibody-lot QC. Re-do the IP if necessary. The metagene plot is the biological QC anchor; without it, all downstream interpretation is suspect.
Trigger: ggseqlogo of peak-centre 5-mers shows no consensus, OR shows a non-DRACH-like motif.
Mechanism: Antibody specificity failure OR wrong protocol. Anti-m6A antibodies have ~70% DRACH-context enrichment on real m6A peaks; a failed IP captures random sequences with no consensus.
Fix: Re-inspect IP enrichment in merip-preprocessing. If IP is clean but DRACH logo is absent, the assay may have captured a different modification — investigate before claiming m6A.
txdb= vs txTxdb= argument confusionTrigger: GuitarPlot(stBedFiles=..., txdb=txdb) rejected with "unused argument" error.
Mechanism: Guitar changed the argument name between Bioconductor releases — txdb= (older) vs txTxdb= (newer).
Fix: Try alternative; consult ?GuitarPlot for installed version. Pin Guitar version in reproducible analyses.
Trigger: computeMatrix scale-regions over a BED of genes used to show "5'UTR / CDS / 3'UTR enrichment".
Mechanism: deepTools scales by genomic length, NOT by transcript-feature length. A gene with long 5'UTR and short CDS will scale 5'UTR more than CDS; the metagene loses 5'UTR / CDS / 3'UTR semantics.
Fix: Use Guitar (transcript-feature-aware) for 5'UTR / CDS / 3'UTR semantics. Use deepTools for genome-coordinate metagene over arbitrary BED.
Trigger: Figure 4 of paper shows an IGV screenshot of one locus; not reproducible from code.
Mechanism: IGV is interactive; the figure cannot be re-generated from a config file. Reviewers cannot reproduce the figure if track files change.
Fix: Use pyGenomeTracks (INI config) or ggcoverage (ggplot2 script) for figures intended for publication. IGV is for ad-hoc inspection only.
--ratio log2 deprecationTrigger: Older deepTools syntax --ratio log2 used; newer requires --operation log2.
Fix: Switch to --operation log2. Both currently work but --ratio is being phased out.
Trigger: Mixed-length peak sequences passed to ggseqlogo.
Mechanism: ggseqlogo requires equal-length input strings.
Fix: Resize peak ranges to fixed width before extraction: resize(peaks, width=5, fix='center').
Trigger: Track section header missing brackets, OR file = instead of file= (whitespace inconsistency).
Fix: Use the documented INI syntax precisely; bracketed section headers; consistent whitespace around =. Validate with pyGenomeTracks --tracks tracks.ini --region 'chr:start-end' --outFileName out.pdf and iterate on errors.
| Pattern | Likely cause | Action | |---------|--------------|--------| | Metagene shows stop-codon enrichment but stacked bar has many 5'UTR peaks | 5'UTR peaks are present but density is concentrated at stop codon | Both plots are correct; report together | | Guitar metagene and deepTools metagene look very different | Guitar transcript-feature-scaled vs deepTools genome-scaled | Both correct; Guitar for 5'UTR / CDS / 3'UTR semantics; deepTools for genome coordinates | | Peak-centred heatmap shows two clusters; condition annotation crosses cluster boundary | Biological signal not condition-aligned; OR clustering driven by per-peak coverage variance | Inspect per-cluster fold-change; consider removing low-coverage peaks before re-clustering | | pyGenomeTracks shows IP track higher than Input but bamCompare bigWig shows log2 near zero | Track signal magnitudes are CPM-normalised; absolute counts and log2 ratios are different summaries | Report log2 ratio as the primary; per-track CPM as supplement | | DRACH logo enriched but stacked bar shows mostly intronic peaks | Intronic m6A (Louloupi 2018 nascent transcripts) | Genuine biology; report intronic vs exonic separately; consider library prep (poly-A vs ribo-depleted) | | Volcano plot symmetric but most differential peaks in 3'UTR | Differential is feature-restricted | Cross-check with per-feature differential testing; biological interpretation |
| Quantity | Threshold | Source / rationale | |----------|-----------|--------------------| | Guitar metagene -- expected pattern | Stop-codon-proximal enrichment, 3'UTR > CDS > 5'UTR density | Dominissini 2012 Nature 485:201; Meyer 2012 Cell 149:1635 | | deepTools metagene scaled-region length | 2000 bp body + 500 bp flanks | Convention; covers typical mammalian gene span | | Peak-centred heatmap window | +/-500 bp around peak centre | Standard convention; covers MeRIP fragment width | | Peak-centred heatmap k-means clusters | 3-5 typical | Cluster count informed by condition count + signal heterogeneity | | ggseqlogo expected DRACH pattern | A dominant at position 3 (the methylated A); C dominant at position 4 | DRACH consensus | | pyGenomeTracks figure width | 10-14 inches | Standard publication width | | Browser-track region width | 5-50 kb | Locus-context dependent | | 5'UTR / CDS / 3'UTR / intron+other expected distribution | ~10% / ~30% / ~50% / ~10% for canonical m6A | Per published m6A atlases | | DRACH logo "method" parameter | 'probability' for visual; 'bits' for information-theoretic | ggseqlogo convention | | Per-feature stacked bar minimum peaks | >=100 | Below this, fractions are noisy | | Cross-replicate metagene divergence | Should be near-identical within condition | If replicates' metagenes diverge, failed-IP suspect in merip-preprocessing |
| Error / symptom | Cause | Solution |
|-----------------|-------|----------|
| Guitar txTxdb rejected | Older version uses txdb | Switch argument name; consult ?GuitarPlot |
| Guitar PDF blank | BED file empty OR chromosome mismatch with TxDb | Verify peak BED has peaks; reconcile chromosome naming |
| deepTools computeMatrix errors with "no regions" | BED file empty OR file path wrong | Verify BED is non-empty and path correct |
| pyGenomeTracks INI parse error | Bracket / whitespace inconsistency | Match documented INI syntax exactly |
| ggcoverage region not displayed | Region beyond bigWig coverage; OR chromosome naming mismatch | Verify bigWig contains region; reconcile chr1 vs 1 |
| ggseqlogo throws "equal length" error | Mixed-length sequences | resize(peaks, width=5, fix='center') before extracting |
| ComplexHeatmap clustering hangs | Very large peak set (>100k) | Subset to top N peaks; or use deepTools plotHeatmap k-means |
| bamCompare --ratio log2 deprecation warning | Newer syntax | Switch to --operation log2 |
| IGV screenshot not reproducible | Interactive tool | Switch to pyGenomeTracks / ggcoverage |
| ggseqlogo logo blank | Empty seqs input OR all sequences are gaps / Ns | Verify peak BED resolves to valid genomic sequences via getSeq() |
| Stacked bar shows >100% | Peak annotation overlap counted multiple times | Use annotatePeak with explicit hierarchy |
| ComplexHeatmap colour scale wrong | colorRamp2 breaks at outliers | Set scale based on robust quantiles (quantile(x, 0.05), 0, quantile(x, 0.95)) |
| Pushback | Response | |----------|----------| | "Does the MeRIP show the canonical stop-codon enrichment?" | Yes — Guitar metagene plot shows expected stop-codon-proximal enrichment; cited Dominissini 2012 / Meyer 2012 | | "Why Guitar and not deepTools?" | Guitar is transcript-feature-scaled (5'UTR / CDS / 3'UTR semantics); deepTools is genome-coordinate; both reported when needed | | "Is the DRACH motif enriched?" | ggseqlogo of peak-centre 5-mers; OR HOMER findMotifsGenome.pl E-value reported | | "Is the browser figure reproducible?" | Yes — pyGenomeTracks INI config OR ggcoverage R script; not IGV screenshot | | "How were peaks annotated to features?" | ChIPseeker annotatePeak with matched TxDb; hierarchy explicit | | "Why these specific clusters in the heatmap?" | k-means with k=3 chosen via elbow / silhouette; clusters reflect signal heterogeneity | | "Does the metagene differ between conditions?" | Per-condition metagenes plotted alongside; differences quantified at the feature level | | "Why is the colour scheme red-blue?" | Standard convention for log2 ratios (red = up, blue = down); colour-blind-safe palette via viridis available | | "Was a cross-check with published m6A-Atlas peaks done?" | Common-core overlap reported; cited m6A-Atlas v2 | | "Are the browser track signal magnitudes comparable across samples?" | bigWig CPM-normalised in merip-preprocessing; documented |
development
Find restriction enzyme cut sites in DNA sequences using Biopython Bio.Restriction. Search with single enzymes, batches of enzymes, or commercially available enzyme sets. Returns cut positions for linear or circular DNA. Use when finding restriction enzyme cut sites in sequences.
development
Create restriction maps showing enzyme cut positions on DNA sequences using Biopython Bio.Restriction. Visualize cut sites, calculate distances between sites, and generate text or graphical maps. Use when creating or analyzing restriction maps.
development
Analyze restriction digest fragments using Biopython Bio.Restriction. Predict fragment sizes, get fragment sequences, simulate gel electrophoresis patterns, and perform double digests. Use when analyzing restriction digest fragment patterns.
development
Select restriction enzymes by criteria using Biopython Bio.Restriction. Find enzymes that cut once, don't cut, produce specific overhangs, are commercially available, or have compatible ends for cloning. Use when selecting restriction enzymes for cloning or analysis.