methylation-analysis/bismark-alignment/SKILL.md
Bisulfite sequencing read alignment using Bismark with Bowtie2 or HISAT2. Handles genome preparation and produces BAM files with methylation information. Use when aligning WGBS, RRBS, or other bisulfite-converted sequencing reads to a reference genome.
Reference examples tested with: Bowtie2 2.5.3+, HISAT2 2.2.1+, Trim Galore 0.6.10+, samtools 1.19+
Before using code patterns, verify installed versions match. If versions differ:
Run `<tool> --version`, then `<tool> --help`, to confirm the flags.
If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
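The version check described above can be scripted. The sketch below assumes a POSIX shell; `check_tools` is a hypothetical helper name, not part of Bismark, and it only reports whether each tool is on `PATH`. Comparing the reported versions against the tested ones is still a manual step.

```bash
# check_tools: hypothetical helper; returns non-zero if any named tool is missing from PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (run manually): check_tools bismark bowtie2 samtools trim_galore
# Then inspect each tool with: <tool> --version and <tool> --help
```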
"Align my bisulfite sequencing reads" -> Map WGBS/RRBS reads to an in-silico bisulfite-converted reference genome, producing BAM files with methylation context tags.
`bismark_genome_preparation genome/`, then `bismark --genome genome/ reads.fq.gz`.
# One-time genome preparation (creates bisulfite-converted index)
bismark_genome_preparation --bowtie2 /path/to/genome_folder/
# Genome folder should contain FASTA files (e.g., hg38.fa, chr1.fa, etc.)
# Creates Bisulfite_Genome/ subdirectory with CT and GA converted indices
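Because genome preparation is a one-time step, a wrapper script may want to skip it when the `Bisulfite_Genome/` subdirectory already exists. The sketch below assumes that the presence of the directory is a good enough signal; it does not verify that the index is complete or that it was built for the aligner you intend to use. `genome_is_prepared` is a hypothetical helper name.

```bash
# genome_is_prepared: hypothetical helper; succeeds if the genome folder ($1)
# already contains a Bisulfite_Genome/ subdirectory.
genome_is_prepared() {
    [ -d "$1/Bisulfite_Genome" ]
}

# Usage sketch (paths are placeholders):
# genome_is_prepared /path/to/genome_folder/ || \
#     bismark_genome_preparation --bowtie2 /path/to/genome_folder/
```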
# Single-end alignment
bismark --genome /path/to/genome_folder/ reads.fastq.gz -o output_dir/
# Paired-end alignment
bismark --genome /path/to/genome_folder/ \
-1 reads_R1.fastq.gz \
-2 reads_R2.fastq.gz \
-o output_dir/
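# Common options (a comment after a trailing backslash breaks the line continuation, so they are listed here):
#   --bowtie2              use Bowtie2 (the default aligner)
#   --parallel 4           number of parallel Bismark instances
#   --temp_dir /tmp/       directory for temporary files
#   --non_directional      only for non-directional libraries
#   --nucleotide_coverage  generate a nucleotide coverage report
bismark --genome /path/to/genome_folder/ \
--bowtie2 \
--parallel 4 \
--temp_dir /tmp/ \
--non_directional \
--nucleotide_coverage \
-o output_dir/ \
reads.fastq.gz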
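# Reduced Representation Bisulfite Sequencing (RRBS)
# Standard MspI-digested RRBS libraries are directional, so no library flag is needed.
# --pbat applies only to PBAT libraries (see below), not to RRBS.
bismark --genome /path/to/genome_folder/ reads.fastq.gz
# MspI digestion (RRBS standard)
# Bismark aligns MspI-digested reads like any other directional library;
# handle the filled-in MspI ends at the trimming step (e.g. trim_galore --rrbs).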
# Post-Bisulfite Adapter Tagging (e.g., scBS-seq)
bismark --genome /path/to/genome_folder/ --pbat reads.fastq.gz
# For libraries where all 4 strands are present
bismark --genome /path/to/genome_folder/ --non_directional reads.fastq.gz
# Trim adapters first with Trim Galore (recommended)
trim_galore --illumina --paired reads_R1.fastq.gz reads_R2.fastq.gz
# Then align
bismark --genome /path/to/genome_folder/ \
-1 reads_R1_val_1.fq.gz \
-2 reads_R2_val_2.fq.gz
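The trimmed file names passed to Bismark follow from the input names, as in the example above (`reads_R1.fastq.gz` becomes `reads_R1_val_1.fq.gz`). A minimal sketch of that mapping, assuming gzipped paired-end input with a `.fastq.gz` or `.fq.gz` suffix; `trimmed_name` is a hypothetical helper, and other suffixes or Trim Galore options may produce different names:

```bash
# trimmed_name: hypothetical helper; maps a paired-end input file ($1) and its
# mate number ($2: 1 or 2) to the validated output name shown above.
trimmed_name() {
    base=$(basename "$1")
    base=${base%.gz}      # drop compression suffix
    base=${base%.fastq}   # drop either FASTQ suffix
    base=${base%.fq}
    echo "${base}_val_$2.fq.gz"
}

trimmed_name reads_R1.fastq.gz 1   # prints reads_R1_val_1.fq.gz
trimmed_name reads_R2.fastq.gz 2   # prints reads_R2_val_2.fq.gz
```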
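# --parallel sets the number of Bismark instances run concurrently
# Each instance starts 2 aligner processes (directional) or 4 (non-directional),
# so aligner processes = parallel * 2 or parallel * 4
# Bismark itself, samtools, and gzip add further load, and memory use grows with the instance count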
bismark --genome /path/to/genome_folder/ \
--parallel 4 \
reads.fastq.gz
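The rule of thumb above is simple arithmetic, and a small function makes it easy to check a planned `--parallel` value against the cores available. This is a sketch only: `aligner_processes` is a hypothetical helper, and it counts aligner processes alone, not the extra load from Bismark, samtools, or compression.

```bash
# aligner_processes: hypothetical helper implementing the rule of thumb above.
# $1 = value passed to --parallel, $2 = "directional" or "non_directional".
aligner_processes() {
    case "$2" in
        directional)     echo $(( $1 * 2 )) ;;
        non_directional) echo $(( $1 * 4 )) ;;
        *) echo "unknown library type: $2" >&2; return 1 ;;
    esac
}

aligner_processes 4 directional       # prints 8
aligner_processes 4 non_directional   # prints 16
```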
# Bismark produces:
# - reads_bismark_bt2.bam # Aligned reads
# - reads_bismark_bt2_SE_report.txt # Alignment report
# View alignment report
cat output_dir/reads_bismark_bt2_SE_report.txt
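# Bismark output is unsorted (reads stay in input order).
# Coordinate-sort a copy for indexing and viewing; run deduplicate_bismark on the
# original, unsorted BAM (paired-end mates must stay adjacent).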
samtools sort output.bam -o output.sorted.bam
samtools index output.sorted.bam
# Remove PCR duplicates (recommended for WGBS, not RRBS)
deduplicate_bismark --bam output_bismark_bt2.bam
# For paired-end
deduplicate_bismark --paired --bam output_bismark_bt2_pe.bam
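The choice between these commands depends on two things stated above: the library type (deduplicate WGBS, not RRBS) and the read layout. A minimal sketch of that decision, printing the command rather than running it; `dedup_command` and its argument values are hypothetical names chosen for illustration:

```bash
# dedup_command: hypothetical helper; prints the deduplication command to run,
# or prints nothing for RRBS, where deduplication is not recommended.
# $1 = wgbs|rrbs, $2 = se|pe, $3 = Bismark BAM file.
dedup_command() {
    case "$1" in
        rrbs) return 0 ;;
        wgbs) ;;
        *) echo "unknown library type: $1" >&2; return 1 ;;
    esac
    case "$2" in
        se) echo "deduplicate_bismark --bam $3" ;;
        pe) echo "deduplicate_bismark --paired --bam $3" ;;
        *)  echo "unknown layout: $2" >&2; return 1 ;;
    esac
}

dedup_command wgbs pe output_bismark_bt2_pe.bam
# prints: deduplicate_bismark --paired --bam output_bismark_bt2_pe.bam
```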
# Bismark generates a detailed alignment report for each run
cat *_SE_report.txt
# Key metrics:
# - Sequences analyzed
# - Unique alignments
# - Mapping efficiency
# - C methylated in CpG context
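To pull one of these metrics out of a report in a script, something like the following may work. It assumes each metric sits on its own line in the form `Label:<tab>value`; the exact label wording can differ between Bismark versions, so check it against your own report first. `report_value` is a hypothetical helper.

```bash
# report_value: hypothetical helper; prints the value from a "Label:<tab>value" line.
# $1 = report file, $2 = label text without the trailing colon.
report_value() {
    awk -F':' -v key="$2" '$1 == key { gsub(/[ \t]/, "", $2); print $2 }' "$1"
}

# Example (file name is a placeholder; label wording is an assumption):
# report_value output_dir/reads_bismark_bt2_SE_report.txt "Mapping efficiency"
```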
# HISAT2 can be faster and use less memory on large mammalian genomes
# The genome must be prepared with --hisat2 before aligning with --hisat2
bismark_genome_preparation --hisat2 /path/to/genome_folder/
# Align with HISAT2
bismark --genome /path/to/genome_folder/ --hisat2 reads.fastq.gz
# HISAT2 paired-end
bismark --genome /path/to/genome_folder/ --hisat2 \
-1 reads_R1.fastq.gz \
-2 reads_R2.fastq.gz
| Parameter | Description |
|-----------|-------------|
| --genome | Path to genome folder |
| --bowtie2 | Use Bowtie2 aligner (default) |
| --hisat2 | Use HISAT2 aligner |
| --parallel | Parallel alignment instances |
| --non_directional | Non-directional library |
| --pbat | PBAT library protocol |
| -o | Output directory |
| --temp_dir | Temporary file directory |
| --nucleotide_coverage | Generate nucleotide coverage report |
| -N | Mismatches in seed (0 or 1, default 0) |
| -L | Seed length (default 20) |

| Type | Parameter | Description |
|------|-----------|-------------|
| Directional | (default) | Standard WGBS/RRBS |
| Non-directional | --non_directional | All 4 strands |
| PBAT | --pbat | Post-bisulfite adapter tagging |
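The library-type table maps directly onto a lookup in a wrapper script. A minimal sketch, assuming the three type names below as script-level labels (they are illustrative, not Bismark options); `library_flag` is a hypothetical helper:

```bash
# library_flag: hypothetical helper; prints the Bismark flag for a library type.
# Directional libraries need no flag, so an empty line is printed for them.
library_flag() {
    case "$1" in
        directional)     echo "" ;;
        non_directional) echo "--non_directional" ;;
        pbat)            echo "--pbat" ;;
        *) echo "unknown library type: $1" >&2; return 1 ;;
    esac
}

# Usage sketch (paths are placeholders; the substitution is left unquoted
# on purpose so that an empty flag expands to nothing):
# bismark --genome /path/to/genome_folder/ $(library_flag pbat) reads.fastq.gz
```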