skills/lab-automation/western-blot-quantification/SKILL.md
Protocols and best practices for western blot quantification and analysis including band detection, normalization, and statistical methods.
Short Description: Comprehensive guide for quantifying and analyzing Western blot images with multiple experimental repetitions, including intensity measurement, normalization, statistical analysis, and visualization.
Authors: Ohagent Team
Version: 1.0
Last Updated: December 2025
License: CC BY 4.0
Commercial Use: ✅ Allowed
This guide provides a standardized workflow for analyzing Western blot images, particularly for experiments with multiple repetitions and conditions. The protocol covers band intensity detection, normalization procedures, statistical aggregation, and visualization best practices.
Western blot quantification cannot use raw band intensities, because total protein loaded per lane varies between samples (pipetting error, transfer efficiency, gel artifacts). A loading control is a protein assumed to be expressed at the same level across all samples (commonly GAPDH, β-actin, α-tubulin, or a total-protein stain such as Ponceau S / stain-free imaging). Dividing the target band intensity by the loading control intensity in the same lane yields a normalized value that corrects for these per-lane technical variations. The loading control must itself be unsaturated and within the linear dynamic range of the detection system.
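As a minimal sketch of the per-lane correction described above, the following divides each target intensity by the loading-control intensity from the same lane. The function name, lane labels, and intensity values are hypothetical and chosen only for illustration; the intensities are assumed to be background-subtracted and within the linear range.

```python
def normalize_to_loading_control(target, loading):
    """Divide each lane's target intensity by that lane's loading-control intensity."""
    if target.keys() != loading.keys():
        raise ValueError("target and loading control must cover the same lanes")
    normalized = {}
    for lane, intensity in target.items():
        if loading[lane] <= 0:
            raise ValueError(f"loading control for lane {lane!r} must be positive")
        normalized[lane] = intensity / loading[lane]
    return normalized


# Hypothetical background-subtracted intensities (arbitrary units).
target = {"control": 1200.0, "treated": 1500.0}
gapdh = {"control": 8000.0, "treated": 10000.0}

normalized = normalize_to_loading_control(target, gapdh)
print(normalized)
```

In this made-up case the raw target band is 25% brighter in the treated lane, but so is GAPDH, so both lanes normalize to 0.15 and the apparent difference is explained by loading alone.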
When two related signals are measured in the same blot — for example a total form (SMAD2) and its phosphorylated form (PSMAD2) — a two-step normalization disentangles changes in protein abundance from changes in modification state. Step A normalizes the total protein to a housekeeping control (SMAD2_norm = SMAD2 / GAPDH); Step B normalizes the modified form to that loading-corrected total (PSMAD2_target = PSMAD2 / SMAD2_norm). This isolates the modification-specific signal from changes in expression of the underlying protein.
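The two steps can be sketched as follows. This is an illustration of the formulas as stated above, not a reference implementation; the helper name and all intensity values are hypothetical.

```python
def two_step_normalize(modified, total, loading):
    """Apply Step A and Step B for a single lane."""
    total_norm = total / loading   # Step A: SMAD2_norm = SMAD2 / GAPDH
    return modified / total_norm   # Step B: PSMAD2_target = PSMAD2 / SMAD2_norm


# Hypothetical intensities for one replicate (arbitrary units).
lanes = {
    "control": {"PSMAD2": 400.0, "SMAD2": 2000.0, "GAPDH": 8000.0},
    "treated": {"PSMAD2": 1600.0, "SMAD2": 4000.0, "GAPDH": 8000.0},
}

target_values = {
    name: two_step_normalize(lane["PSMAD2"], lane["SMAD2"], lane["GAPDH"])
    for name, lane in lanes.items()
}

raw_fold = lanes["treated"]["PSMAD2"] / lanes["control"]["PSMAD2"]
normalized_fold = target_values["treated"] / target_values["control"]
print(raw_fold, normalized_fold)
```

With these numbers the raw PSMAD2 band rises 4-fold, but total SMAD2 also doubles, so the two-step value rises only 2-fold. The remaining change is the part attributable to modification state rather than to expression of the underlying protein.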
Each Western blot is one experimental observation; biological conclusions require biological replicates (independent experiments, not just multiple lanes from one gel). Aggregation steps: (1) normalize within each replicate, (2) compute fold-change relative to the within-replicate control (so the control is 1.0 by definition), (3) compute mean and dispersion (SD or SE) across replicates. Normalizing across replicates before computing fold-change inflates apparent effect size and confuses gel-to-gel variation with biological effect.
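The order of operations matters, so a short sketch may help. It assumes each replicate has already been normalized to its loading control (step 1) and is represented as a dictionary of condition to normalized value; the data and names are hypothetical.

```python
from statistics import mean, stdev


def fold_changes(replicate, control="control"):
    """Step 2: express every condition relative to the control of the same replicate."""
    baseline = replicate[control]
    return {condition: value / baseline for condition, value in replicate.items()}


# Hypothetical normalized values from three independent experiments.
# Absolute scales differ between gels, which is why fold change is computed per replicate.
replicates = [
    {"control": 0.50, "treated": 1.00},
    {"control": 1.00, "treated": 2.10},
    {"control": 2.00, "treated": 3.80},
]

per_replicate = [fold_changes(r) for r in replicates]

# Step 3: aggregate across replicates only after the within-replicate fold change.
treated = [fc["treated"] for fc in per_replicate]
summary = {"mean": mean(treated), "sd": stdev(treated), "n": len(treated)}
print(summary)
```

The control fold change is 1.0 in every replicate by construction, and the 4-fold difference in absolute scale between the first and third gel never enters the summary.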
SD describes the spread of the underlying biological response across replicates and is appropriate when the question is "how variable is this effect?". SE (= SD / √n) describes the precision of the estimated mean and is appropriate when the question is "how confident are we in this mean value?". For a typical n = 3 Western blot experiment, SD bars are √3 ≈ 1.7 times longer than SE bars, but they communicate the variability of the underlying biology more honestly. Always state which error measure is plotted in the figure legend.
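To make the difference concrete, here is a small sketch that computes both measures from the same fold-change values using the sample standard deviation (n - 1 denominator). The helper name and the values are hypothetical.

```python
import math
from statistics import stdev


def sd_and_se(values):
    """Return (sample SD, standard error of the mean) for a list of replicate values."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate dispersion")
    sd = stdev(values)            # sample SD, n - 1 in the denominator
    return sd, sd / math.sqrt(n)  # SE = SD / sqrt(n)


# Hypothetical fold changes from three biological replicates.
fold = [1.8, 2.0, 2.2]
sd, se = sd_and_se(fold)
print(f"n={len(fold)}  SD={sd:.3f}  SE={se:.3f}")
```

Both numbers come from the same data; only the question they answer differs, which is why the legend has to name the one that is plotted.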
Western blot quantification decision tree
└── Single target protein measured?
    ├── Yes -> Single-step normalization: Target / LoadingControl (per lane)
    │   └── Compute fold change vs control within each replicate
    │       └── Aggregate mean +/- error across replicates
    └── No, two related signals (e.g., total + modified form)
        └── Two-step normalization
            ├── Step A: TotalForm_norm = TotalForm / LoadingControl (per lane)
            └── Step B: ModifiedForm_target = ModifiedForm / TotalForm_norm

Error bar choice:
├── Reporting biological variability of the effect? -> SD
└── Reporting precision of the mean estimate? -> SE = SD / sqrt(n)

Experimental design choice:
├── Discrete treatments (control vs conditions) -> Multi-condition design + bar graph + ANOVA / t-tests
├── Same treatment over multiple time points -> Time course design + line graph; normalize to t0 control
└── Same treatment at multiple concentrations -> Dose response design + log-x line graph; fit EC50 / IC50
| Situation | Recommended choice | Rationale |
|-----------|--------------------|-----------|
| Quantifying total protein abundance changes | Single-step normalization (Target / LoadingControl) | One measurement per lane; loading control corrects total-protein loading |
| Quantifying post-translational modification (phosphorylation, ubiquitination) | Two-step normalization (Modified / Total_norm) | Isolates modification stoichiometry from changes in total protein expression |
| n = 3 replicates, biology-focused figure | Mean ± SD | Communicates the spread of the biological response |
| n = 3 replicates, statistical-precision figure | Mean ± SE | Communicates the precision of the mean estimate |
| Small fold changes (~1.5×) on noisy blots | Increase n to ≥ 4–6 and report SE with explicit n in legend | Low effect size requires more replicates for adequate statistical power |
| Comparing 4+ discrete conditions | Multi-condition design + ANOVA with post-hoc correction | Pairwise t-tests across many conditions inflate Type I error |
| Tracking the same effect over time | Time-course design, normalize to t = 0 within each replicate | Removes baseline drift between replicates |
| Determining potency (EC50 / IC50) | Dose-response design with log-spaced concentrations | Log spacing samples the sigmoidal response uniformly; nonlinear fit gives EC50 |
| Loading control band saturated | Re-image at lower exposure or dilute the lysate | Saturated bands violate the linear dynamic range and silently bias normalization |
| One outlier replicate with unusually high variability | Document and exclude with justification (e.g., transfer artifact) | Honest exclusion is preferable to a noisy mean; never silently drop data |
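For the saturated loading control row, a quick check on the band's pixel values can flag the problem before normalization. The sketch below is only one possible heuristic and rests on several assumptions that are not from this guide: a 16-bit image in which signal corresponds to high pixel values, a band ROI already extracted as a flat sequence of pixel values, and an arbitrary 1% tolerance. The function names are hypothetical.

```python
def saturated_fraction(pixels, max_value=65535):
    """Fraction of ROI pixels sitting at the detector ceiling (assumed 16-bit)."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("ROI contains no pixels")
    return sum(1 for p in pixels if p >= max_value) / len(pixels)


def is_saturated(pixels, max_value=65535, tolerance=0.01):
    """Flag the band when more than `tolerance` of its pixels are clipped (assumed cutoff)."""
    return saturated_fraction(pixels, max_value) > tolerance


# Hypothetical GAPDH ROIs at two exposures.
long_exposure = [65535] * 30 + [52000] * 70
short_exposure = [41000] * 30 + [33000] * 70

print(saturated_fraction(long_exposure), is_saturated(long_exposure))
print(saturated_fraction(short_exposure), is_saturated(short_exposure))
```

For images stored with dark bands on a light background, the comparison would need to be inverted, or the image inverted first.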
Objective: Identify ROIs and isolate individual bands in the Western blot image.
Key Considerations:
Tools: analyze_pixel_distribution, find_roi_from_image
Objective: Quantify band intensities for all detected bands.
Procedure:
For each lane/repetition, measure the intensity of:
Record measurements in a structured format:
Best Practices:
Objective: Normalize target protein intensities to account for loading variations.
Two-Step Normalization Process:
Step A: Calculate the intensity of the total protein (SMAD2) relative to the housekeeping loading control (GAPDH):
SMAD2_norm = Intensity_SMAD2 / Intensity_GAPDH
This accounts for variations in total protein loading across samples.
Step B: Calculate the final normalized target protein intensity:
Target_value = Intensity_PSMAD2 / SMAD2_norm
This provides the normalized PSMAD2 intensity, which accounts for both per-lane loading and the level of total SMAD2.
Alternative Normalization Methods:
Target_norm = Intensity_Target / Intensity_GAPDH
Target_norm = Intensity_Target / Intensity_TotalProtein
Objective: Express results relative to a control condition.
Procedure:
Fold_Change = Target_value_condition / Target_value_control
Important Notes:
Objective: Combine data from multiple experimental repetitions.
Procedure:
Statistical Considerations:
Objective: Create clear, publication-ready visualizations.
Bar Graph Requirements:
Visualization Best Practices:
Verification Images:
wb_grid_verification.png
Problem: Some bands not detected or incorrectly identified
Solutions:
Problem: Large standard deviations or inconsistent results
Solutions:
Problem: Normalized values don't match expected biological response
Solutions:
Problem: High background affecting intensity measurements
Solutions:
Before finalizing analysis, verify:
Required Outputs:
Quantification results: CSV or Excel file with:
Visualization: Bar graph image (e.g., psmad2_quantification.png)
Verification image (optional but recommended): wb_grid_verification.png
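One possible way to produce the quantification file is sketched below with Python's standard csv module. The column names are an assumption made for illustration, since this guide does not fix a schema, and the rows are hypothetical. The example writes to an in-memory buffer; passing a handle from open("results.csv", "w", newline="") instead would write to disk.

```python
import csv
import io

# Assumed column layout: one row per replicate and condition (long format).
FIELDS = ["replicate", "condition", "target_value", "fold_change"]


def write_results(rows, handle):
    """Write per-replicate quantification rows as CSV to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical results for one replicate.
rows = [
    {"replicate": 1, "condition": "control", "target_value": 1600.0, "fold_change": 1.0},
    {"replicate": 1, "condition": "treated", "target_value": 3200.0, "fold_change": 2.0},
]

buffer = io.StringIO()
write_results(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Keeping one row per replicate and condition, rather than only the aggregated means, leaves the per-replicate values available for later re-analysis.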
For a typical experiment with 3 repetitions and 4 conditions:
Total-Form Normalization (Step A):
TotalForm_norm = Intensity_TotalForm / Intensity_LoadingControl
Target Normalization (Step B):
Target_norm = Intensity_Target / TotalForm_norm
Fold Change:
Fold_Change = Target_norm_condition / Target_norm_control
Statistics:
Mean = Σ(values) / n
SD = √[Σ(value - mean)² / (n-1)]
SE = SD / √n
Pitfall: Reporting raw band intensities without loading-control normalization. Differences seen on the blot can be entirely explained by per-lane loading variation.
Pitfall: Using a saturated loading control. A saturated GAPDH/β-actin band looks "even" across lanes but is outside the linear dynamic range, so real loading differences go uncorrected and the normalized values are silently biased.
Pitfall: Aggregating normalized values across replicates before computing fold change. This conflates gel-to-gel variation with biological signal and inflates the apparent effect size.
Pitfall: Plotting SE bars but labeling them SD (or vice versa). Reviewers and readers cannot interpret the figure correctly.
Pitfall: Drawing conclusions from n = 1 or n = 2 experiments. A single observation cannot distinguish biological effect from technical noise.
Pitfall: Silently excluding outlier replicates without documentation. This biases the reported mean and is irreproducible.
Pitfall: Choosing a loading control that itself responds to the treatment. Some "housekeeping" proteins (e.g., GAPDH) change under metabolic stress, hypoxia, or starvation, breaking the assumption that the loading control is constant.
Pitfall: Using fixed-threshold automatic ROI detection on every image. Different exposures, contrasts, and noise floors require different thresholds; one-size-fits-all detection misses dim bands or splits strong ones.
Adjust lower_threshold and upper_threshold per image; manually verify the grid overlay before extracting intensities, and preserve correctly detected ROIs when adjusting parameters.