Version Compatibility

Reference examples tested with: pysam 0.22+, samtools 1.19+

Before using code patterns, verify installed versions match. If versions differ:

Python: pip show <package> then help(module.function) to check signatures
CLI: <tool> --version then <tool> --help to confirm flags

If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.

Alignment Filtering

"Filter my BAM file to keep only high-quality reads" -> Select reads by FLAG bits, mapping quality, and genomic regions using samtools view or pysam.

CLI: samtools view with -F/-f/-q/-L flags (samtools)
Python: pysam.AlignmentFile iteration with attribute filters (pysam)

Filter alignments by flags, quality, and regions using samtools and pysam.

Filter Flags

| Option | Description | |--------|-------------| | -f FLAG | Include reads with ALL bits set | | -F FLAG | Exclude reads with ANY bits set | | -G FLAG | Exclude reads with ALL bits set | | -q MAPQ | Minimum mapping quality | | -L BED | Include reads overlapping regions |

Common FLAG Values

| Flag | Hex | Meaning | |------|-----|---------| | 1 | 0x1 | Paired | | 2 | 0x2 | Proper pair | | 4 | 0x4 | Unmapped | | 8 | 0x8 | Mate unmapped | | 16 | 0x10 | Reverse strand | | 32 | 0x20 | Mate reverse strand | | 64 | 0x40 | First in pair (read1) | | 128 | 0x80 | Second in pair (read2) | | 256 | 0x100 | Secondary alignment | | 512 | 0x200 | Failed QC | | 1024 | 0x400 | Duplicate | | 2048 | 0x800 | Supplementary |

Filter by FLAG

Keep Only Mapped Reads

samtools view -F 4 -o mapped.bam input.bam

Keep Only Unmapped Reads

samtools view -f 4 -o unmapped.bam input.bam

Keep Only Properly Paired

samtools view -f 2 -o proper.bam input.bam

Remove Duplicates

samtools view -F 1024 -o nodup.bam input.bam

Remove Secondary and Supplementary

samtools view -F 2304 -o primary.bam input.bam

Keep Only Primary Alignments

samtools view -F 256 -F 2048 -o primary.bam input.bam
# Or combined: -F 2304

Keep Read1 Only

samtools view -f 64 -o read1.bam input.bam

Keep Read2 Only

samtools view -f 128 -o read2.bam input.bam

Forward Strand Only

samtools view -F 16 -o forward.bam input.bam

Reverse Strand Only

samtools view -f 16 -o reverse.bam input.bam

Filter by Mapping Quality

Minimum MAPQ

samtools view -q 30 -o highqual.bam input.bam

MAPQ and Mapped

samtools view -F 4 -q 30 -o filtered.bam input.bam

Aligner-Aware MAPQ Thresholds

MAPQ scales differ by aligner; the same -q 30 filter does different things. See sam-bam-basics for the full MAPQ-by-aligner table. Filtering recommendations:

| Aligner | "Drop ambiguous" | "High confidence" | |---------|------------------|-------------------| | BWA-MEM / BWA-MEM2 | -q 1 | -q 30 (or -q 60 for unique only) | | Bowtie2 | -q 1 | -q 23 (Bowtie2 MAPQ maxes at 42 end-to-end; 23 is a community "uniquely mapped" convention derived from analysis of Bowtie2's MAPQ scoring, not stated in the official manual) | | STAR | -q 255 | -q 255 (STAR emits only MAPQ 0/1/3/255, 255 = unique; -q 60 is therefore equivalent to -q 255 -- there are no values between 3 and 255) | | HISAT2 | -q 1 | -q 60 | | minimap2 (DNA, long-read) | -q 1 | -q 60 | | pbmm2 (PacBio) | -q 1 | -q 60 |

For Phred-scaled aligners (BWA, minimap2), MAPQ Q maps to ~10^(-Q/10) probability of wrong mapping. For STAR, the only emitted values are 0/1/3/255 (sentinels, not probabilities).

Drop Ambiguous Across Aligners (Universal)

samtools view -q 1 in.bam   # exclude MAPQ=0; works for all aligners

Filter by Region

Single Region

samtools view -o region.bam input.bam chr1:1000000-2000000

Multiple Regions

samtools view -o regions.bam input.bam chr1:1000-2000 chr2:3000-4000

Regions from BED File

samtools view -L targets.bed -o targets.bam input.bam

Combine Region and Quality

samtools view -q 30 -L targets.bed -o filtered.bam input.bam

Combined Filters

Standard Quality Filter

Goal: Produce a clean BAM containing only primary, mapped, non-duplicate reads with high mapping confidence.

Approach: Combine FLAG exclusion (-F for unmapped + secondary + duplicate + supplementary) with a MAPQ threshold.

Reference (samtools 1.19+):

samtools view -F 3332 -q 30 -o filtered.bam input.bam
# 3332 = 4 (unmapped) + 256 (secondary) + 1024 (duplicate) + 2048 (supplementary)

Variant Calling Prep -- Assay-Aware

Goal: Choose a filter that matches what the downstream caller expects. Stripping supplementary alignments breaks SV callers; requiring proper-pair drops valid spliced RNA-seq reads.

| Assay / caller | Recommended filter | Why | |----------------|-------------------|-----| | Germline WGS short-variant (HaplotypeCaller, DeepVariant) | -f 2 -F 3328 -q 20 | Primary, no dup, proper pair, MAPQ>=20 | | Somatic short-variant (Mutect2, Strelka2) | -F 3328 -q 1 | Drop only MAPQ=0; somatic callers handle low MAPQ; chimeric reads at SVs may carry real somatic SNVs | | Long-read short-variant (clair3, DeepVariant ONT) | -F 3328 -q 5 | Long-read MAPQ scale is lower | | Long-read SV (Sniffles, cuteSV) | -F 1024 only | Keep supplementary -- SA tag is the SV signal | | Short-read SV (Manta, GRIDSS, Delly, SvABA) | -F 1024 only | Same -- supplementary required | | ChIP-seq peak calling | -F 1804 -q 30 | Drop dup + secondary + unmapped + mate-unmapped + QC-fail (supplementary reads are kept -- 1804 has no 2048 bit) | | ATAC-seq | -F 1804 -q 30 -f 2 | Same plus proper pair | | RNA-seq quantification (STAR) | -q 255 | Unique only (STAR sentinel) | | RNA-seq quantification (HISAT2) | -F 256 -q 60 | Different aligner semantics | | RNA-seq variant (after SplitNCigarReads) | -F 3328 -q 20 | Standard germline after split-N-trim | | Panel / amplicon | After samtools ampliconclip; -F 1024 -q 20 | Primer overlap makes proper-pair unreliable | | ctDNA / cfDNA (UMI) | After fgbio consensus; do not pre-filter raw | |

Reference (samtools 1.19+):

# Short-variant germline
samtools view -f 2 -F 3328 -q 20 -o clean.bam input.bam
# 3328 = 256 (secondary) + 1024 (duplicate) + 2048 (supplementary)

# SV calling: KEEP supplementary
samtools view -F 1024 -o sv_input.bam input.bam   # NOT -F 2304 or -F 3328

# ChIP-seq / ATAC-seq common filter
samtools view -F 1804 -q 30 -o filtered.bam input.bam
# 1804 = 4 + 8 + 256 + 512 + 1024 = unmapped + mate-unmapped + secondary + QC-fail + duplicate

Cost of getting this wrong: filtering -F 2304 or -F 3328 before SV calling produces zero SV calls -- a single-flag mistake that silently invalidates the analysis.

Subsample Reads (Deterministic, Pair-Consistent)

samtools view -s SEED.FRAC -- integer is the hash seed; fractional is the keep fraction. The hash is on QNAME, so:

Mate consistency: read1 and read2 are kept or dropped together.
Reproducibility: same seed + same fraction returns the same reads.
Sequential downsampling requires different seeds. With the same seed the keep-sets are nested, so -s 1.5 then -s 1.25 keeps a nested 1/4 (25%) of the original, not 12.5%. Use different integer seeds for independent samples.

# 10% with seed 42 (always the same reads; pair-consistent)
samtools view -s 42.1 -b -o subset.bam input.bam

# Sequential cuts with INDEPENDENT seeds
samtools view -s 1.5 -b in.bam > half1.bam
samtools view -s 2.25 -b half1.bam > quarter.bam   # 12.5% of original

# Coverage-matching to a target read count
total=$(samtools view -c -F 2304 input.bam)
target=10000000
frac=$(awk -v t=$target -v n=$total 'BEGIN{printf "%.6f", t/n}')
samtools view -s "1.${frac#*.}" -b -o matched.bam input.bam

# Tumor-normal coverage matching (pull tumor down to normal)
normal_reads=$(samtools view -c -F 2308 normal.bam)
tumor_reads=$(samtools view -c -F 2308 tumor.bam)
if [ "$tumor_reads" -gt "$normal_reads" ]; then
    frac=$(awk -v n=$normal_reads -v t=$tumor_reads 'BEGIN{printf "%.6f", n/t}')
    samtools view -s "1.${frac#*.}" -b -o tumor_matched.bam tumor.bam
fi

A subsampled BAM without an integer seed (-s 0.1) is non-reproducible -- production pipelines should reject it.

Expression Filtering

samtools view -e EXPR (or --expr, since samtools 1.12; the sclen helper used below and improved null-tag handling arrived in 1.16) supports arbitrary expression filtering on tags, FLAG, MAPQ, RNAME, CIGAR, etc. Powerful for filtering by NM, AS, NH, cs, etc. that the FLAG-based filters cannot reach:

# Reads with >=2 mismatches (NM tag)
samtools view -e '[NM] >= 2' in.bam

# Soft clip on the left, on chr1
samtools view -e 'cigar=~"^[0-9]+S" && rname=="chr1"' in.bam

# Combine with FLAG and MAPQ
samtools view -F 2308 -q 30 -e '[NM] <= 5 && [AS] >= 100' in.bam

# Drop reads with low mapped fraction (samtools-internal helpers)
samtools view -e 'sclen / qlen < 0.2' in.bam

Note: in samtools 1.16+, ![NM] is true only if NM is missing (was buggy in earlier versions); NULL values from missing tags propagate through arithmetic.

Filter by Read Group

samtools view -r library_A in.bam              # single read group
samtools view -R rg_list.txt in.bam            # multiple via file (one ID per line)

pysam Python Alternative

Basic Filtering

import pysam

with pysam.AlignmentFile('input.bam', 'rb') as infile:
    with pysam.AlignmentFile('filtered.bam', 'wb', header=infile.header) as outfile:
        for read in infile:
            if read.is_unmapped:
                continue
            if read.mapping_quality < 30:
                continue
            if read.is_duplicate:
                continue
            outfile.write(read)

Filter with Function

Goal: Apply a multi-criteria quality filter to produce clean alignments for downstream analysis.

Approach: Define a predicate checking mapped status, primary alignment, duplicate flag, and MAPQ; stream reads through it.

Reference (pysam 0.22+):

import pysam

def passes_filter(read):
    if read.is_unmapped:
        return False
    if read.is_secondary or read.is_supplementary:
        return False
    if read.is_duplicate:
        return False
    if read.mapping_quality < 30:
        return False
    return True

with pysam.AlignmentFile('input.bam', 'rb') as infile:
    with pysam.AlignmentFile('filtered.bam', 'wb', header=infile.header) as outfile:
        for read in infile:
            if passes_filter(read):
                outfile.write(read)

Filter by Region

import pysam

with pysam.AlignmentFile('input.bam', 'rb') as infile:
    with pysam.AlignmentFile('region.bam', 'wb', header=infile.header) as outfile:
        for read in infile.fetch('chr1', 1000000, 2000000):
            outfile.write(read)

Filter from BED File

Goal: Extract only reads overlapping target regions defined in a BED file.

Approach: Parse BED into a list of (chrom, start, end) tuples, then fetch reads from each region and write to output.

Reference (pysam 0.22+):

import pysam

def read_bed(bed_path):
    regions = []
    with open(bed_path) as f:
        for line in f:
            if line.startswith('#'):
                continue
            parts = line.strip().split('\t')
            regions.append((parts[0], int(parts[1]), int(parts[2])))
    return regions

regions = read_bed('targets.bed')

with pysam.AlignmentFile('input.bam', 'rb') as infile:
    with pysam.AlignmentFile('targets.bam', 'wb', header=infile.header) as outfile:
        for chrom, start, end in regions:
            for read in infile.fetch(chrom, start, end):
                outfile.write(read)

Subsample (Pair-Consistent)

Hash on QNAME so mates stay together (a fresh random.random() per read drops mates inconsistently and breaks paired-end tools):

import pysam
import zlib

fraction = 0.1
seed = 42
threshold = int(0xffffffff * fraction)

def template_hash(qname, seed):
    return zlib.crc32(qname.encode()) ^ seed

with pysam.AlignmentFile('input.bam', 'rb') as infile:
    with pysam.AlignmentFile('subset.bam', 'wb', header=infile.header) as outfile:
        for read in infile:
            if template_hash(read.query_name, seed) <= threshold:
                outfile.write(read)

Quick Reference

| Task | samtools command | |------|------------------| | Mapped only | view -F 4 | | Unmapped only | view -f 4 | | Properly paired | view -f 2 | | Primary only | view -F 2304 | | No duplicates | view -F 1024 | | High MAPQ | view -q 30 | | Region | view file.bam chr1:1-1000 | | BED regions | view -L file.bed | | Subsample 10% (reproducible) | view -s 42.1 | | Standard filter | view -F 3332 -q 30 |

Common Filter Combinations

| Purpose | Flags | |---------|-------| | Clean reads | -F 3332 -q 30 (mapped, primary, no dups, high qual) | | Variant calling | -f 2 -F 3328 -q 20 (proper pair, primary, no dups) | | Coverage analysis | -F 1284 -q 1 (mapped, no secondary, no dups; supplementary retained) | | Count unique | -F 2304 (primary only) |

Flag breakdowns:

2304 = 256 + 2048 (secondary + supplementary)
3328 = 256 + 1024 + 2048 (secondary + duplicate + supplementary)
3332 = 4 + 256 + 1024 + 2048 (unmapped + secondary + duplicate + supplementary)
1284 = 4 + 256 + 1024 (unmapped + secondary + duplicate)

Related Skills

sam-bam-basics - FLAG semantics, MAPQ-by-aligner, secondary vs supplementary
alignment-sorting - Sort before/after filtering
alignment-indexing - Required for region filtering
alignment-amplicon-clipping - Primer clipping for amplicon panels
duplicate-handling - Mark duplicates before filtering
bam-statistics - Check filter effects

Version Compatibility

Reference examples tested with: pysam 0.22+, samtools 1.19+

Before using code patterns, verify installed versions match. If versions differ:

Python: pip show <package> then help(module.function) to check signatures
CLI: <tool> --version then <tool> --help to confirm flags

If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.

Alignment Filtering

"Filter my BAM file to keep only high-quality reads" -> Select reads by FLAG bits, mapping quality, and genomic regions using samtools view or pysam.

CLI: samtools view with -F/-f/-q/-L flags (samtools)
Python: pysam.AlignmentFile iteration with attribute filters (pysam)

Filter alignments by flags, quality, and regions using samtools and pysam.

Filter Flags

Common FLAG Values

Filter by FLAG

Keep Only Mapped Reads

samtools view -F 4 -o mapped.bam input.bam

Keep Only Unmapped Reads

samtools view -f 4 -o unmapped.bam input.bam

Keep Only Properly Paired

samtools view -f 2 -o proper.bam input.bam

Remove Duplicates

samtools view -F 1024 -o nodup.bam input.bam

Remove Secondary and Supplementary

samtools view -F 2304 -o primary.bam input.bam

Keep Only Primary Alignments

samtools view -F 256 -F 2048 -o primary.bam input.bam
# Or combined: -F 2304

Keep Read1 Only

samtools view -f 64 -o read1.bam input.bam

Keep Read2 Only

samtools view -f 128 -o read2.bam input.bam

Forward Strand Only

samtools view -F 16 -o forward.bam input.bam

Reverse Strand Only

samtools view -f 16 -o reverse.bam input.bam

Filter by Mapping Quality

Minimum MAPQ

samtools view -q 30 -o highqual.bam input.bam

MAPQ and Mapped

samtools view -F 4 -q 30 -o filtered.bam input.bam

Aligner-Aware MAPQ Thresholds

MAPQ scales differ by aligner; the same -q 30 filter does different things. See sam-bam-basics for the full MAPQ-by-aligner table. Filtering recommendations:

For Phred-scaled aligners (BWA, minimap2), MAPQ Q maps to ~10^(-Q/10) probability of wrong mapping. For STAR, the only emitted values are 0/1/3/255 (sentinels, not probabilities).

Drop Ambiguous Across Aligners (Universal)

samtools view -q 1 in.bam   # exclude MAPQ=0; works for all aligners

Filter by Region

Single Region

samtools view -o region.bam input.bam chr1:1000000-2000000

Multiple Regions

samtools view -o regions.bam input.bam chr1:1000-2000 chr2:3000-4000

Regions from BED File

samtools view -L targets.bed -o targets.bam input.bam

Combine Region and Quality

samtools view -q 30 -L targets.bed -o filtered.bam input.bam

Combined Filters

Standard Quality Filter

Goal: Produce a clean BAM containing only primary, mapped, non-duplicate reads with high mapping confidence.

Approach: Combine FLAG exclusion (-F for unmapped + secondary + duplicate + supplementary) with a MAPQ threshold.

Reference (samtools 1.19+):

samtools view -F 3332 -q 30 -o filtered.bam input.bam
# 3332 = 4 (unmapped) + 256 (secondary) + 1024 (duplicate) + 2048 (supplementary)

Variant Calling Prep -- Assay-Aware

Goal: Choose a filter that matches what the downstream caller expects. Stripping supplementary alignments breaks SV callers; requiring proper-pair drops valid spliced RNA-seq reads.

Reference (samtools 1.19+):

# Short-variant germline
samtools view -f 2 -F 3328 -q 20 -o clean.bam input.bam
# 3328 = 256 (secondary) + 1024 (duplicate) + 2048 (supplementary)

# SV calling: KEEP supplementary
samtools view -F 1024 -o sv_input.bam input.bam   # NOT -F 2304 or -F 3328

# ChIP-seq / ATAC-seq common filter
samtools view -F 1804 -q 30 -o filtered.bam input.bam
# 1804 = 4 + 8 + 256 + 512 + 1024 = unmapped + mate-unmapped + secondary + QC-fail + duplicate

Cost of getting this wrong: filtering -F 2304 or -F 3328 before SV calling produces zero SV calls -- a single-flag mistake that silently invalidates the analysis.

Subsample Reads (Deterministic, Pair-Consistent)

samtools view -s SEED.FRAC -- integer is the hash seed; fractional is the keep fraction. The hash is on QNAME, so:

Mate consistency: read1 and read2 are kept or dropped together.
Reproducibility: same seed + same fraction returns the same reads.
Sequential downsampling requires different seeds. With the same seed the keep-sets are nested, so -s 1.5 then -s 1.25 keeps a nested 1/4 (25%) of the original, not 12.5%. Use different integer seeds for independent samples.

# 10% with seed 42 (always the same reads; pair-consistent)
samtools view -s 42.1 -b -o subset.bam input.bam

# Sequential cuts with INDEPENDENT seeds
samtools view -s 1.5 -b in.bam > half1.bam
samtools view -s 2.25 -b half1.bam > quarter.bam   # 12.5% of original

# Coverage-matching to a target read count
total=$(samtools view -c -F 2304 input.bam)
target=10000000
frac=$(awk -v t=$target -v n=$total 'BEGIN{printf "%.6f", t/n}')
samtools view -s "1.${frac#*.}" -b -o matched.bam input.bam

# Tumor-normal coverage matching (pull tumor down to normal)
normal_reads=$(samtools view -c -F 2308 normal.bam)
tumor_reads=$(samtools view -c -F 2308 tumor.bam)
if [ "$tumor_reads" -gt "$normal_reads" ]; then
    frac=$(awk -v n=$normal_reads -v t=$tumor_reads 'BEGIN{printf "%.6f", n/t}')
    samtools view -s "1.${frac#*.}" -b -o tumor_matched.bam tumor.bam
fi

A subsampled BAM without an integer seed (-s 0.1) is non-reproducible -- production pipelines should reject it.

Expression Filtering

# Reads with >=2 mismatches (NM tag)
samtools view -e '[NM] >= 2' in.bam

# Soft clip on the left, on chr1
samtools view -e 'cigar=~"^[0-9]+S" && rname=="chr1"' in.bam

# Combine with FLAG and MAPQ
samtools view -F 2308 -q 30 -e '[NM] <= 5 && [AS] >= 100' in.bam

# Drop reads with low mapped fraction (samtools-internal helpers)
samtools view -e 'sclen / qlen < 0.2' in.bam

Note: in samtools 1.16+, ![NM] is true only if NM is missing (was buggy in earlier versions); NULL values from missing tags propagate through arithmetic.

Filter by Read Group

samtools view -r library_A in.bam              # single read group
samtools view -R rg_list.txt in.bam            # multiple via file (one ID per line)

pysam Python Alternative

Basic Filtering

import pysam

with pysam.AlignmentFile('input.bam', 'rb') as infile:
    with pysam.AlignmentFile('filtered.bam', 'wb', header=infile.header) as outfile:
        for read in infile:
            if read.is_unmapped:
                continue
            if read.mapping_quality < 30:
                continue
            if read.is_duplicate:
                continue
            outfile.write(read)

Filter with Function

Goal: Apply a multi-criteria quality filter to produce clean alignments for downstream analysis.

Approach: Define a predicate checking mapped status, primary alignment, duplicate flag, and MAPQ; stream reads through it.

Reference (pysam 0.22+):

import pysam

def passes_filter(read):
    if read.is_unmapped:
        return False
    if read.is_secondary or read.is_supplementary:
        return False
    if read.is_duplicate:
        return False
    if read.mapping_quality < 30:
        return False
    return True

with pysam.AlignmentFile('input.bam', 'rb') as infile:
    with pysam.AlignmentFile('filtered.bam', 'wb', header=infile.header) as outfile:
        for read in infile:
            if passes_filter(read):
                outfile.write(read)

Filter by Region

import pysam

with pysam.AlignmentFile('input.bam', 'rb') as infile:
    with pysam.AlignmentFile('region.bam', 'wb', header=infile.header) as outfile:
        for read in infile.fetch('chr1', 1000000, 2000000):
            outfile.write(read)

Filter from BED File

Goal: Extract only reads overlapping target regions defined in a BED file.

Approach: Parse BED into a list of (chrom, start, end) tuples, then fetch reads from each region and write to output.

Reference (pysam 0.22+):

import pysam

def read_bed(bed_path):
    regions = []
    with open(bed_path) as f:
        for line in f:
            if line.startswith('#'):
                continue
            parts = line.strip().split('\t')
            regions.append((parts[0], int(parts[1]), int(parts[2])))
    return regions

regions = read_bed('targets.bed')

with pysam.AlignmentFile('input.bam', 'rb') as infile:
    with pysam.AlignmentFile('targets.bam', 'wb', header=infile.header) as outfile:
        for chrom, start, end in regions:
            for read in infile.fetch(chrom, start, end):
                outfile.write(read)

Subsample (Pair-Consistent)

Hash on QNAME so mates stay together (a fresh random.random() per read drops mates inconsistently and breaks paired-end tools):

import pysam
import zlib

fraction = 0.1
seed = 42
threshold = int(0xffffffff * fraction)

def template_hash(qname, seed):
    return zlib.crc32(qname.encode()) ^ seed

with pysam.AlignmentFile('input.bam', 'rb') as infile:
    with pysam.AlignmentFile('subset.bam', 'wb', header=infile.header) as outfile:
        for read in infile:
            if template_hash(read.query_name, seed) <= threshold:
                outfile.write(read)

Quick Reference

Common Filter Combinations

Flag breakdowns:

2304 = 256 + 2048 (secondary + supplementary)
3328 = 256 + 1024 + 2048 (secondary + duplicate + supplementary)
3332 = 4 + 256 + 1024 + 2048 (unmapped + secondary + duplicate + supplementary)
1284 = 4 + 256 + 1024 (unmapped + secondary + duplicate)

Related Skills

sam-bam-basics - FLAG semantics, MAPQ-by-aligner, secondary vs supplementary
alignment-sorting - Sort before/after filtering
alignment-indexing - Required for region filtering
alignment-amplicon-clipping - Primer clipping for amplicon panels
duplicate-handling - Mark duplicates before filtering
bam-statistics - Check filter effects

Adoption

GPTomics/bio-alignment-filtering

$ install --global

Security Scan Results

SKILL.md

Version Compatibility

Alignment Filtering

Filter Flags

Common FLAG Values

Filter by FLAG

Keep Only Mapped Reads

Keep Only Unmapped Reads

Keep Only Properly Paired

Remove Duplicates

Remove Secondary and Supplementary

Keep Only Primary Alignments

Keep Read1 Only

Keep Read2 Only

Forward Strand Only

Reverse Strand Only

Filter by Mapping Quality

Minimum MAPQ

MAPQ and Mapped

Aligner-Aware MAPQ Thresholds

Drop Ambiguous Across Aligners (Universal)

Filter by Region

Single Region

Multiple Regions

Regions from BED File

Combine Region and Quality

Combined Filters

Standard Quality Filter

Variant Calling Prep -- Assay-Aware

Subsample Reads (Deterministic, Pair-Consistent)

Expression Filtering

Filter by Read Group

pysam Python Alternative

Basic Filtering

Filter with Function

Filter by Region

Filter from BED File

Subsample (Pair-Consistent)

Quick Reference

Common Filter Combinations

Related Skills

Related Skills

GPTomics/bio-workflows-clip-pipeline

GPTomics/bio-comparative-genomics-whole-genome-duplication

GPTomics/bio-comparative-genomics-whole-genome-alignment

GPTomics/bio-comparative-genomics-synteny-analysis

GPTomics/bio-alignment-filtering

$ install --global

Security Scan Results

SKILL.md

Version Compatibility

Alignment Filtering

Filter Flags

Common FLAG Values

Filter by FLAG

Keep Only Mapped Reads

Keep Only Unmapped Reads

Keep Only Properly Paired

Remove Duplicates

Remove Secondary and Supplementary

Keep Only Primary Alignments

Keep Read1 Only

Keep Read2 Only

Forward Strand Only

Reverse Strand Only

Filter by Mapping Quality

Minimum MAPQ

MAPQ and Mapped

Aligner-Aware MAPQ Thresholds

Drop Ambiguous Across Aligners (Universal)

Filter by Region

Single Region

Multiple Regions

Regions from BED File

Combine Region and Quality

Combined Filters