claude/skills/singularity-build/SKILL.md
Expert Singularity/Apptainer container builder for bioinformatics tools on MSKCC HPC (RHEL 8, no sudo). Builds containers using --fakeroot with root-mapped namespace, conda-based package management (never apt-get), and SLURM-safe build scripts. Use this skill proactively whenever the user asks to: create/build a Singularity or Apptainer container, write a .def definition file, containerize a bioinformatics tool or software package, build a SIF image, troubleshoot a failed container build (exit status 1/15/141/255), fix a "command gcc failed" or "cannot find libc" or "CUDA_INCLUDE_DIRS" error in a container build, or package any software into a reproducible container image. Also trigger when the user mentions: .def file, .sif file, apptainer build, singularity build, fakeroot build, container for dorado/samtools/modkit or any bioinformatics tool, or asks to install software that requires root. Once triggered, the skill triages pull-vs-build (Step 0): it tries to acquire and verify a pre-built image before building from scratch. Do NOT trigger for: running an existing container (apptainer exec/run) or Docker/Podman build workflows (use the docker-hpc skill for those).
Build reproducible Apptainer/Singularity containers on MSKCC HPC without sudo, using --fakeroot with root-mapped namespace and conda-based package management.
For the full reference guide with all numbered rules, read:
references/build_guide.md (in this skill's directory).
This skill includes three scripts in scripts/ that handle the mechanical, error-prone
parts of container building. Use them instead of generating boilerplate from scratch —
they encode every lesson learned from real build failures on this HPC.
scripts/generate_def.sh
Generates a complete .def file for any tier. Handles all compiler symlinks, sysroot
libc symlinks, CUDA header symlinks, conda environment activation, and cleanup.
# Tier 1 — conda-installable
scripts/generate_def.sh --name samtools --version 1.21 --tier 1 \
--packages "samtools=1.21"
# Tier 2 — Python with compiled extensions
scripts/generate_def.sh --name locusmasterte --version latest --tier 2 \
--python-version 3.6 --env-name locusmasterte \
--packages "cython=0.29.7 numpy=1.16.3 htslib samtools" \
--pip-packages "pysam==0.15.2" \
--git-repo "https://github.com/jasonwong-lab/LocusMasterTE.git"
# Tier 3 — Full C++ with CUDA
scripts/generate_def.sh --name dorado --version v1.4.0 --tier 3 \
--packages "cmake=3.30 make zlib openssl" \
--git-repo "https://github.com/nanoporetech/dorado.git" --git-branch v1.4.0 \
--cmake-args "-DDORADO_DISABLE_TESTS=ON"
# Tier 0 — Pre-built binary
scripts/generate_def.sh --name dorado --version 1.4.0 --tier 0 \
--binary-url "https://cdn.oxfordnanoportal.com/software/analysis/dorado-1.4.0-linux-x64.tar.gz"
scripts/generate_build_script.sh
Generates a SLURM-safe build script with absolute PROJECT_DIR (never SCRIPT_DIR),
unset APPTAINER_BIND, APPTAINER_CACHEDIR, --ignore-fakeroot-command, smoke tests
guarded with || true, and tier-appropriate SLURM resource recommendations.
scripts/generate_build_script.sh --name samtools --version 1.21 --tier 1 \
--project-dir /data1/greenbab/users/ahunos/project
scripts/generate_build_script.sh --name dorado --version v1.4.0 --tier 3 \
--project-dir /data1/greenbab/users/ahunos/project \
--build-args "DORADO_VERSION=v1.4.0 BUILD_THREADS=8" --gpu
scripts/diagnose_build_log.sh
Diagnoses a failed build by grepping for known error patterns and outputting a structured cause and fix for each issue found. Run this first when a build fails.
scripts/diagnose_build_log.sh /home/ahunos/slurm_logs/build_tool_12345.out
scripts/diagnose_build_log.sh build.log stderr.log # can also pass stderr file
Detected patterns: unused build-args, /dev/tty errors, gcc failures (missing compiler vs missing libc), CUDA_INCLUDE_DIRS, permission denied (SLURM spool), apt-get/setgroups, SIGTERM (exit 15), SIGPIPE (exit 141). Also flags harmless warnings (nodev, fakeroot command not found).
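The pattern matching described above can be pictured with a minimal sketch. This is not the contents of scripts/diagnose_build_log.sh: the function name diagnose_log and the exact grep patterns are assumptions, chosen to mirror a few rows of the error table in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of log triage; NOT the real diagnose_build_log.sh.
# Patterns and fixes are copied from the error table in this guide.
diagnose_log() {
    local log="$1" found=0
    if grep -q "unused build args" "$log"; then
        echo "CAUSE: --build-arg without a matching %arguments entry"
        echo "FIX: add it to %arguments or remove it from the build command"
        found=1
    fi
    if grep -q "/dev/tty" "$log"; then
        echo "CAUSE: installer needs a terminal"
        echo "FIX: use a conda package or pre-built binary instead"
        found=1
    fi
    if grep -q "CUDA_INCLUDE_DIRS" "$log"; then
        echo "CAUSE: conda CUDA headers are in a non-standard path"
        echo "FIX: symlink headers to /opt/conda/include/ and pass cmake hints"
        found=1
    fi
    if [ "$found" -eq 0 ]; then
        echo "No known pattern found"
    fi
    return 0
}
```

Each check is independent, so a log with several problems reports all of them rather than stopping at the first match.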
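Use when the user needs to:
- Create or build a Singularity or Apptainer container or SIF image
- Write a .def definition file
- Containerize a bioinformatics tool or software package
- Troubleshoot a failed container build (exit status 1/15/141/255)
- Install software that would otherwise require root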
Do NOT use for: running an existing container (apptainer exec/run) or Docker/Podman build workflows (use the docker-hpc skill). Pulling a pre-built image IS in scope — it is the first thing Step 0 tries.
This HPC has specific constraints that shape every decision:
- No sudo, and users are not listed in /etc/subuid. Apptainer uses the root-mapped namespace, a limited unprivileged mode.
- apt-get is blocked: the setgroups() syscall fails in the root-mapped namespace.
- .run installers are blocked: they require /dev/tty, which doesn't exist in fakeroot.
- /tmp is a host bind mount in fakeroot: rm -rf /tmp/* destroys host files.

These constraints mean: always use condaforge/miniforge3 as the base image, install
everything via mamba, and never touch /tmp broadly.
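To confirm which fakeroot mode applies to an account, a check along these lines may help. It is a minimal sketch: the function name has_subuid_entry is hypothetical, and it assumes /etc/subuid uses the usual user:start:count line format keyed by user name.

```shell
#!/usr/bin/env bash
# Sketch: report whether a user has an /etc/subuid entry.
# The file path is a parameter so the check can be tried on a copy.
has_subuid_entry() {
    local user="$1" subuid_file="${2:-/etc/subuid}"
    [ -r "$subuid_file" ] && grep -q "^${user}:" "$subuid_file"
}

if has_subuid_entry "$(id -un)"; then
    echo "subuid entry found for $(id -un)"
else
    echo "no subuid entry: expect the root-mapped namespace"
fi
```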
The goal is a working SIF, not necessarily a built one. Building from scratch is the fall-through, not the default. Before classifying a build tier, try to acquire a verified pre-built image.
Run the helper to get ranked candidates:
scripts/find_prebuilt.sh --name <tool> --version <ver>
It checks the lab catalog first (exit 3 = already registered, reuse it), then
resolves the exact biocontainers tag and prints ranked apptainer pull
commands. Source priority (hard-won; see rules/severus.md):
1. https://depot.galaxyproject.org/singularity/<tool>:<tag> (direct SIF, no auth, no rate-limit). Preferred.
2. docker://quay.io/biocontainers/<tool>:<tag> (works, but anonymous quay pulls are rate-limited).

Then VERIFY the pulled image on RHEL 8. This is the step that makes pulling safe here, and the reason this skill still owns the easy path:
- Run <tool> --version and exercise one real code/network path (not just --help; see rules/apptainer_env_leak.md).
- glibc mismatch: version 'GLIBC_2.xx' not found. The image assumes a newer glibc than RHEL 8 provides.
- Env leak: SSL_CERT_FILE/SSL_CERT_DIR crash httpx-based tools at first network call. Diagnose with --cleanenv: if apptainer exec --cleanenv works where plain exec fails, it is an env-leak, not a build defect.

Decide:
- Verified: register it in profiles/software_configs/softwares_containers_config.yaml, done. No build.

Every containerization request falls into one of four tiers. Classify first, then follow the tier-specific recipe.
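The classification into the four tiers of this section can be sketched as a small decision function. The helper classify_tier and its four yes/no inputs are hypothetical, and the ordering (prefer the cheapest tier that applies) is an assumption rather than a stated rule of this skill.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick a build tier from four yes/no answers (1 = yes).
# Assumed ordering: prefer the cheapest tier that applies.
classify_tier() {
    local has_binary="$1" in_conda="$2" needs_compile="$3" needs_cuda="$4"
    if [ "$has_binary" -eq 1 ]; then
        echo 0      # pre-built binary, just extract
    elif [ "$in_conda" -eq 1 ]; then
        echo 1      # conda-forge / bioconda package
    elif [ "$needs_cuda" -eq 1 ]; then
        echo 3      # source build that needs CUDA
    elif [ "$needs_compile" -eq 1 ]; then
        echo 2      # compiled Python/C extensions
    else
        echo "unclassified"
        return 1
    fi
}

classify_tier 0 1 0 0   # a samtools-like case: prints 1
```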
Tier 0: pre-built binary
The tool distributes pre-compiled binaries (tarball, zip) that just need to be extracted. Examples: dorado from ONT CDN, IGV, commercial tools.
Recipe:
- Install wget via mamba
- Download and extract the binary to /opt/toolname

Tier 1: conda-installable
The tool exists as a package in conda-forge or bioconda. Examples: samtools, minimap2, bedtools, STAR, modkit.
Recipe:
- mamba install -y -c conda-forge -c bioconda tool=version

Tier 2: Python with compiled extensions
The tool has a setup.py or C/Cython extensions that need compilation.
Examples: LocusMasterTE, Telescope, pysam from source.
Recipe (everything from Tier 1 plus):
- Add gcc_linux-64, gxx_linux-64, sysroot_linux-64 to conda packages
- Symlink the prefixed compilers: x86_64-conda-linux-gnu-gcc -> gcc, g++, cc
- Symlink sysroot libc to /lib64/ and /usr/lib64/ (the linker needs these)
- Create the environment: mamba create -n envname python=X.Y

Tier 3: full C++ with CUDA
The tool must be compiled from source and needs CUDA. Examples: dorado from source, Clair3 with GPU, custom CUDA tools.
Recipe (everything from Tier 2 plus):
- Install the toolkit: mamba install -c nvidia cuda-toolkit=12.8
- Symlink headers and libs from /opt/conda/targets/x86_64-linux/ to /opt/conda/include/ and /opt/conda/lib/
- Set CUDA_HOME, CUDA_PATH, CUDA_INCLUDE_DIRS, CUDA_TOOLKIT_ROOT_DIR
- Pass cmake hints: -DCUDAToolkit_ROOT, -DCUDA_INCLUDE_DIRS, -DCMAKE_CUDA_COMPILER

Every .def file follows this structure. Sections marked (required) are never omitted.
Bootstrap: docker
From: condaforge/miniforge3:latest
%labels (required)
Author Samuel Ahuno
Date <YYYY-MM-DD>
Purpose <one-line description>
Version <tool version>
Source <URL if applicable>
%arguments (if using build-args)
TOOL_VERSION=<default>
%post (required)
<installation commands — see tier-specific recipes>
mamba clean --all --yes
# Do NOT rm -rf /tmp/*
%environment (required)
export PATH="/opt/conda/bin:$PATH"
%runscript (required)
exec <primary_tool> "$@"
%test (required)
<primary_tool> --version
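As an illustration, the template filled in for a Tier 1 tool might look like the following, written from a shell heredoc. The function name write_samtools_def and the Purpose text are hypothetical, and the output of scripts/generate_def.sh may differ from this sketch.

```shell
#!/usr/bin/env bash
# Sketch: write a Tier 1 definition file for samtools 1.21.
# The Date label is left as a placeholder to fill in.
write_samtools_def() {
    local out="${1:-samtools.def}"
    cat > "$out" <<'EOF'
Bootstrap: docker
From: condaforge/miniforge3:latest

%labels
    Author Samuel Ahuno
    Date <YYYY-MM-DD>
    Purpose samtools for alignment file processing
    Version 1.21

%post
    mamba install -y -c conda-forge -c bioconda samtools=1.21
    mamba clean --all --yes
    # Do NOT rm -rf /tmp/* (host bind mount under fakeroot)

%environment
    export PATH="/opt/conda/bin:$PATH"

%runscript
    exec samtools "$@"

%test
    samtools --version
EOF
}

def_path="$(mktemp -d)/samtools.def"
write_samtools_def "$def_path"
echo "wrote ${def_path}"
```

The quoted heredoc delimiter keeps $PATH and "$@" literal, so they are expanded inside the container rather than on the build host.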
- Never rm -rf /tmp/*: /tmp is the host's /tmp under fakeroot.
- Never apt-get: setgroups() is blocked in root-mapped namespace.
- Never use .run installers: no /dev/tty available.
- Pin versions: samtools=1.21 not samtools.
- Run mamba clean --all --yes after installation.
- Every --build-arg must match a %arguments entry: unused args are FATAL.

# Compilers are prefixed; build systems expect plain 'gcc'
ln -sf ${BIN}/x86_64-conda-linux-gnu-gcc ${BIN}/gcc
ln -sf ${BIN}/x86_64-conda-linux-gnu-g++ ${BIN}/g++
ln -sf ${BIN}/x86_64-conda-linux-gnu-cc ${BIN}/cc
ln -sf ${BIN}/x86_64-conda-linux-gnu-ar ${BIN}/ar
ln -sf ${BIN}/x86_64-conda-linux-gnu-ranlib ${BIN}/ranlib
# The conda linker expects these at standard paths
SYSROOT="${ENV_PREFIX}/x86_64-conda-linux-gnu/sysroot"
mkdir -p /lib64 /usr/lib64
ln -sf ${SYSROOT}/lib64/libc.so.6 /lib64/libc.so.6
ln -sf ${SYSROOT}/usr/lib64/libc_nonshared.a /usr/lib64/libc_nonshared.a
ln -sf ${SYSROOT}/usr/lib64/libc.so /usr/lib64/libc.so
# Conda cuda-toolkit puts headers/libs in a non-standard path
# Downstream cmake (PyTorch/Caffe2/libtorch) won't find them without symlinks
CUDA_TARGET="/opt/conda/targets/x86_64-linux"
for f in ${CUDA_TARGET}/include/*.h ${CUDA_TARGET}/include/*.hpp; do
[ -f "$f" ] && ln -sf "$f" /opt/conda/include/
done
for d in ${CUDA_TARGET}/include/*/; do
[ -d "$d" ] && ln -sfn "$d" /opt/conda/include/
done
for f in ${CUDA_TARGET}/lib/*.so* ${CUDA_TARGET}/lib/*.a; do
[ -f "$f" ] && ln -sf "$f" /opt/conda/lib/
done
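The remaining Tier 3 steps (exporting the CUDA variables and passing cmake hints) could follow the symlinks along these lines. This is a sketch: the variable and flag names come from the Tier 3 recipe, but every value is an assumption that the toolkit lives under /opt/conda with nvcc at /opt/conda/bin/nvcc, and cuda_cmake_hints is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch for %post, after the symlinks above.
# ASSUMPTION: the conda CUDA toolkit is rooted at /opt/conda.
CUDA_ROOT="${CUDA_ROOT:-/opt/conda}"
export CUDA_HOME="$CUDA_ROOT"
export CUDA_PATH="$CUDA_ROOT"
export CUDA_INCLUDE_DIRS="$CUDA_ROOT/include"
export CUDA_TOOLKIT_ROOT_DIR="$CUDA_ROOT"

# Print one cmake hint per line so callers can splice them into a command.
cuda_cmake_hints() {
    printf '%s\n' \
        "-DCUDAToolkit_ROOT=${CUDA_ROOT}" \
        "-DCUDA_INCLUDE_DIRS=${CUDA_ROOT}/include" \
        "-DCMAKE_CUDA_COMPILER=${CUDA_ROOT}/bin/nvcc"
}

# Usage inside the source tree (not run here):
#   cmake -S . -B build $(cuda_cmake_hints)
```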
# In %post — activate via sourcing, not conda init
. /opt/conda/etc/profile.d/conda.sh
conda activate envname
# In %environment — prepend PATH, never conda activate
export PATH="/opt/conda/envs/envname/bin:$PATH"
export CONDA_DEFAULT_ENV="envname"
export CONDA_PREFIX="/opt/conda/envs/envname"
Every build script follows this template. The critical detail: use absolute
PROJECT_DIR, never SCRIPT_DIR from BASH_SOURCE[0] — SLURM copies scripts
to /var/spool/slurmd/ which breaks relative path resolution.
#!/usr/bin/env bash
# Author: Samuel Ahuno
# Date: <YYYY-MM-DD>
# Purpose: Build <tool> Apptainer container using --fakeroot
set -euo pipefail
PROJECT_DIR="<absolute path to project directory>"
DEF_FILE="${PROJECT_DIR}/<tool>.def"
SIF_FILE="${PROJECT_DIR}/<tool>_v<version>.sif"
LOG_FILE="${PROJECT_DIR}/build_<tool>_$(date '+%Y%m%d_%H%M%S').log"
# Prevent host bind vars from leaking into %post
unset APPTAINER_BIND SINGULARITY_BIND 2>/dev/null || true
# Set cache dir to avoid home quota issues on compute nodes
export APPTAINER_CACHEDIR=/data1/greenbab/users/ahunos/apptainer_cache
mkdir -p "${APPTAINER_CACHEDIR}"
echo "=== Building <tool> container ===" | tee "$LOG_FILE"
echo "DEF: ${DEF_FILE}" | tee -a "$LOG_FILE"
echo "SIF: ${SIF_FILE}" | tee -a "$LOG_FILE"
echo "Start: $(date)" | tee -a "$LOG_FILE"
# --ignore-fakeroot-command: required for miniforge3 base on RHEL 8
apptainer build --fakeroot --ignore-fakeroot-command \
"${SIF_FILE}" \
"${DEF_FILE}" \
2>&1 | tee -a "$LOG_FILE"
echo "" | tee -a "$LOG_FILE"
echo "=== Build finished: $(date) ===" | tee -a "$LOG_FILE"
# Smoke test
if [[ -f "${SIF_FILE}" ]]; then
echo "=== Smoke test ===" | tee -a "$LOG_FILE"
apptainer exec "${SIF_FILE}" <tool> --version 2>&1 | tee -a "$LOG_FILE" || true
echo "=== DONE: build_<tool>.sh completed successfully ===" | tee -a "$LOG_FILE"
else
echo "ERROR: SIF file not created" | tee -a "$LOG_FILE"
exit 1
fi
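The || true guard on the smoke test matters because of how pipefail treats a pipeline whose producer is cut off. The sketch below provokes that situation on purpose: noisy_tool is a hypothetical stand-in for a tool that keeps writing, and head is used to make the reader exit early (the build script itself pipes to tee).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for '<tool> --version' that never stops writing.
noisy_tool() { yes "tool 1.0"; }

# head exits after one line, so the producer is cut off mid-write and the
# pipeline reports a non-zero status under pipefail. Without '|| true',
# 'set -e' would abort the script here even though the output is fine.
first_line=$(noisy_tool 2>&1 | head -n 1 || true)
echo "smoke test output: ${first_line}"
```

The trade-off is that a guarded smoke test can no longer fail the script, so its output in the log still needs to be read.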
Key build script rules:
- set -euo pipefail at top, but guard piped smoke test commands with || true to avoid SIGPIPE (exit 141)
- unset APPTAINER_BIND SINGULARITY_BIND prevents host bind vars leaking into %post
- APPTAINER_CACHEDIR on shared filesystem avoids home quota issues
- --fakeroot --ignore-fakeroot-command are both always required
- For GPU tools, use apptainer exec --nv in the smoke test
- Log everything with tee -a "$LOG_FILE" and 2>&1

Use the SLURM MCP tools or provide an sbatch command. Resource estimates by tier:
| Tier | CPUs | Memory | Time |
|------|------|--------|------|
| 0 (pre-built binary) | 2 | 8G | 15 min |
| 1 (conda install) | 2 | 8G | 15 min |
| 2 (compiled extensions) | 4 | 16G | 30 min |
| 3 (full C++ + CUDA) | 8 | 64G | 4 hours |
Container builds never need GPU — submit to CPU partitions only.
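The per-tier estimates can be turned into sbatch flags with a small lookup. This is a sketch: sbatch_flags_for_tier and the script name build_dorado.sh are hypothetical, and no partition flag is included because partition names are not given here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a build tier to sbatch resource flags,
# using the resource estimates listed in this section.
sbatch_flags_for_tier() {
    case "$1" in
        0|1) echo "--cpus-per-task=2 --mem=8G --time=00:15:00" ;;
        2)   echo "--cpus-per-task=4 --mem=16G --time=00:30:00" ;;
        3)   echo "--cpus-per-task=8 --mem=64G --time=04:00:00" ;;
        *)   echo "unknown tier: $1" >&2; return 1 ;;
    esac
}

# Print the command instead of submitting it.
echo sbatch $(sbatch_flags_for_tier 3) build_dorado.sh
```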
When a build fails, check these in order:
| Error message | Cause | Fix |
|---------------|-------|-----|
| unused build args: X | --build-arg without matching %arguments | Add to %arguments or remove from build command |
| cannot create /dev/tty | Installer needs terminal (NVIDIA .run) | Use conda package or pre-built binary instead |
| command 'gcc' failed + gcc not found | Missing compiler | Add gcc_linux-64 and create symlinks |
| command 'gcc' failed + cannot find libc.so.6 | Linker can't find sysroot | Symlink sysroot libc to /lib64/ and /usr/lib64/ |
| Could NOT find CUDA (missing: CUDA_INCLUDE_DIRS) | Conda CUDA headers in non-standard path | Symlink headers to /opt/conda/include/ and pass cmake hints |
| Permission denied writing log | SLURM spool dir not writable | Use absolute PROJECT_DIR not SCRIPT_DIR |
| Exit code 141 but SIF exists | SIGPIPE from piped smoke test | Build succeeded; add \|\| true to piped commands |
| Exit code 15 | SIGTERM from installer | Installer killed itself (no TTY); use conda alternative |
Always read the last 5 lines before the FATAL: line — that's where the actual error is.
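One way to pull out those lines is grep with leading context. The function name show_fatal_context is hypothetical, and the demo log is made up purely for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print each FATAL: line together with the five lines before it.
show_fatal_context() {
    grep -B 5 "FATAL:" "$1" || echo "no FATAL: line in $1"
}

# Demo on a made-up log.
log=$(mktemp)
printf 'step %s\n' 1 2 3 4 5 6 7 8 > "$log"
echo "FATAL: While performing build: exit status 1" >> "$log"
show_fatal_context "$log"
rm -f "$log"
```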
Harmless warnings (safe to ignore):
- User not listed in /etc/subuid, trying root-mapped namespace: expected, this is how we build
- fakeroot command not found: expected with --ignore-fakeroot-command
- 'nodev' mount option set on /tmp: normal on compute nodes
- SINGULARITY_DOCKER_PASSWORD is set, but APPTAINER_DOCKER_PASSWORD is preferred: just a naming preference
- gocryptfs not found: for encrypted containers, which we don't use

After a successful build:
- Test it (tool --version or python -c "import module")
- Name it toolname_vX.Y.Z.sif
- Register it in profiles/software_configs/softwares_containers_config.yaml if production
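Because versions appear in this guide both with and without a leading "v" (v1.4.0 and 1.4.0), a naming helper can normalize them before applying the toolname_vX.Y.Z.sif pattern. The function sif_name is hypothetical and is only a sketch of the convention.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the toolname_vX.Y.Z.sif name, accepting
# versions written with or without a leading "v".
sif_name() {
    local tool="$1" version="${2#v}"   # strip one leading "v" if present
    echo "${tool}_v${version}.sif"
}

sif_name dorado v1.4.0    # prints dorado_v1.4.0.sif
sif_name samtools 1.21    # prints samtools_v1.21.sif
```

Stripping the prefix first avoids names like dorado_vv1.4.0.sif when the version string already carries a "v".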