data-visualization/interactive-visualization/SKILL.md
Build interactive HTML/web visualizations with plotly (Python/R), bokeh (Python), and gganimate/plotly frames for animation, with awareness of current Kaleido static-export model (post-orca-EOL), HTML file-size bloat, and the limits of interactive-only output for journal submission. Use when producing zoomable/hoverable plots for notebook EDA, supplementary HTML, dashboards, or animated time-course / iteration visualizations.
Reference examples tested with: plotly 5.24+, plotly R 4.10+, bokeh 3.4+, kaleido 1.0+ (note: v1 dropped bundled Chrome), gganimate 1.0.9+, altair 5.4+, htmlwidgets 1.6+.
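A minimal sketch for comparing the installed Python packages against the tested versions listed above, using only the standard library. The `TESTED_MINIMUMS` mapping and the helper names are illustrative, and the version parser is deliberately simple (it reads leading digits only), so treat the result as a hint rather than a strict compatibility check.

```python
from importlib import metadata

# Minimum versions taken from the "tested with" note; Python packages only.
TESTED_MINIMUMS = {
    "plotly": "5.24",
    "bokeh": "3.4",
    "kaleido": "1.0",
    "altair": "5.4",
}


def version_tuple(version):
    """Turn '5.24.1' or '1.0.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        leading = ""
        for ch in piece:
            if not ch.isdigit():
                break
            leading += ch
        if not leading:
            break
        parts.append(int(leading))
    return tuple(parts)


def check_versions(minimums=None):
    """Return {package: (installed_version_or_None, status)}."""
    minimums = TESTED_MINIMUMS if minimums is None else minimums
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, "missing")
            continue
        ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = (installed, "ok" if ok else "older than tested")
    return report


if __name__ == "__main__":
    for name, (installed, status) in check_versions().items():
        print(f"{name}: {installed} ({status})")
```

Packages reported as "missing" or "older than tested" are the ones whose examples are most likely to need adapting.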
Before using code patterns, verify installed versions match. If versions differ:
- Python: pip show <package> then help(module.function)
- R: packageVersion('<pkg>') then ?function_name

If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.

"Build an interactive plot" -> Render a zoomable, hoverable, pannable HTML/web visualization, knowing that interactive output is a SUPPLEMENT to (not a replacement for) the static figure needed for journal submission. Choose plotly for fastest onboarding and ggplot2 conversion (ggplotly); bokeh for streaming/server-side; altair for grammar-of-graphics; D3.js for full custom.
- Python: plotly.graph_objects, plotly.express, bokeh, altair
- R: plotly (via ggplotly), htmlwidgets ecosystem (leaflet, networkD3, DT)

Interactive plots produce HTML, but journals need static PDF/PNG. The plotly static-export pipeline changed materially in 2025:
- fig.write_image(..., engine='orca') removed in plotly 6.2 (post-Sept 2025)
- Kaleido v1 is the default export engine; omit the engine= argument

For static export of plotly figures in 2026: pip install kaleido; verify Chrome is installed; fig.write_image('out.pdf'). Test by writing to a known path and inspecting file size; silent failure on missing Chrome was a 2024-2025 pain point that v1 partially addresses with clearer errors.
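The "verify Chrome is installed" step can be sketched as a preflight check run before any export. This version uses only the standard library. The executable names in `CHROME_CANDIDATES` are an assumption, and a PATH lookup is only a heuristic: Chrome may be installed somewhere PATH does not cover (a macOS app bundle, for example), so a negative result is a prompt to look further, not proof that export will fail.

```python
import importlib.util
import shutil

# Assumed executable names; adjust for the local platform.
CHROME_CANDIDATES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
    "chrome",
)


def static_export_preflight(candidates=CHROME_CANDIDATES):
    """Report whether kaleido is importable and which browser PATH exposes."""
    kaleido_found = importlib.util.find_spec("kaleido") is not None
    chrome_path = None
    for name in candidates:
        chrome_path = shutil.which(name)
        if chrome_path:
            break
    return {"kaleido": kaleido_found, "chrome": chrome_path}


def preflight_problems(status):
    """Translate a preflight status dict into human-readable problems."""
    problems = []
    if not status["kaleido"]:
        problems.append("kaleido not importable: pip install kaleido")
    if status["chrome"] is None:
        problems.append("no Chrome/Chromium executable found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems(static_export_preflight()):
        print("WARNING:", problem)
```

Running the check once at the top of a notebook or script surfaces a missing dependency before a long analysis reaches the export step.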
Interactive HTML has hidden trade-offs:
Use interactive for notebooks (exploration), supplementary HTML (online journal supplement), dashboards (Streamlit/Dash/Shiny). For the journal figure, always also produce static.
Goal: Build an interactive HTML plot with zoom, pan, and hover-tooltip behavior; export both interactive HTML for supplements and static PDF for the journal figure.
Approach: Use plotly.express for declarative high-level plots OR graph_objects for fine control; enable WebGL via render_mode='webgl' or Scattergl for >5000 points; export HTML with write_html() and static with write_image() after installing Kaleido v1+ and Chrome.
```python
import plotly.express as px
import plotly.graph_objects as go

# df is assumed to be a pandas DataFrame with the columns used below

# Express: high-level, declarative
fig = px.scatter(df, x='PC1', y='PC2', color='cluster',
                 hover_data=['gene_count', 'sample_id'],
                 color_discrete_sequence=['#0072B2', '#D55E00', '#009E73'],
                 title='PCA')
fig.update_layout(template='plotly_white', width=600, height=500)

# WebGL acceleration for >5000 points
fig = px.scatter(df, x='PC1', y='PC2', color='cluster', render_mode='webgl')

# Save
fig.write_html('pca.html')
fig.write_image('pca.pdf')  # requires kaleido + Chrome

# Graph_objects: low-level
fig = go.Figure(go.Scattergl(  # Scattergl == WebGL scatter
    x=df['PC1'], y=df['PC2'],
    mode='markers',
    # 'Tab10' is a matplotlib colormap name, not a plotly colorscale
    marker=dict(color=df['cluster_code'], colorscale='Viridis', size=4),
    text=df['sample_id'], hoverinfo='text'))
```
```r
library(plotly)
library(ggplot2)

p <- ggplot(df, aes(x = PC1, y = PC2, color = cluster, text = sample_id)) +
  geom_point() + theme_classic()

# Convert ggplot to interactive plotly
p_int <- ggplotly(p, tooltip = c('text', 'x', 'y', 'colour'))

# Save
htmlwidgets::saveWidget(p_int, 'pca.html', selfcontained = TRUE)
```
ggplotly is the lowest-friction R interactive path — write ggplot, get plotly.
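```python
from bokeh.plotting import figure, output_file, save
from bokeh.models import CategoricalColorMapper, ColumnDataSource, HoverTool
from bokeh.palettes import Category10

output_file('pca_bokeh.html')
source = ColumnDataSource(df)

# Map cluster labels to colors (assumes df['cluster'] holds string labels)
cluster_cmap = CategoricalColorMapper(factors=sorted(df['cluster'].unique()),
                                      palette=Category10[10])

# 'hover' is left out of tools because a configured HoverTool is added below
p = figure(title='PCA', x_axis_label='PC1', y_axis_label='PC2',
           tools='pan,wheel_zoom,box_zoom,reset,save')
p.scatter('PC1', 'PC2', source=source, size=8, alpha=0.7,
          color={'field': 'cluster', 'transform': cluster_cmap})
p.add_tools(HoverTool(tooltips=[('Sample', '@sample_id'), ('Cluster', '@cluster')]))
save(p)
```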
bokeh is stronger than plotly for streaming dashboards and server-side aggregation. Static export via bokeh.io.export_png requires selenium + Chrome.
```r
library(gganimate)

p <- ggplot(df, aes(x, y, color = condition)) +
  geom_point(size = 3) +
  theme_classic() +
  transition_time(time) +  # animate over time
  labs(title = 'Time: {frame_time}')

anim <- animate(p, nframes = 100, fps = 20, width = 600, height = 400,
                renderer = gifski_renderer())
anim_save('time_course.gif', anim)
```
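```python
import plotly.express as px

# Fix the axis ranges so the axes do not rescale between frames
xmin, xmax = df['x'].min(), df['x'].max()
ymin, ymax = df['y'].min(), df['y'].max()

fig = px.scatter(df, x='x', y='y', color='condition',
                 animation_frame='time',
                 animation_group='entity_id',
                 range_x=[xmin, xmax], range_y=[ymin, ymax])
fig.write_html('time_course.html')
```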
Animation suits time-course data, iterative algorithm visualization, before-after comparisons. Limit to ≤100 frames; longer animations bloat file size and tax viewer attention.
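One way to respect the ≤100-frame guideline is to thin the time axis before animating. The sketch below is a plain-Python illustration; `pick_frames` is a hypothetical helper name, and it assumes time values are sortable and that dropping intermediate time points is acceptable for the figure.

```python
def pick_frames(time_values, max_frames=100):
    """Return at most max_frames evenly spaced unique time points.

    The first and last time points are always kept when max_frames >= 2.
    """
    if max_frames < 1:
        raise ValueError("max_frames must be at least 1")
    unique = sorted(set(time_values))
    n = len(unique)
    if n <= max_frames:
        return unique
    if max_frames == 1:
        return [unique[0]]
    # Spread max_frames indices evenly over 0 .. n-1.
    step = (n - 1) / (max_frames - 1)
    indices = sorted({round(i * step) for i in range(max_frames)})
    return [unique[i] for i in indices]


if __name__ == "__main__":
    frames = pick_frames(range(1000), max_frames=100)
    print(len(frames), frames[0], frames[-1])
```

With a pandas DataFrame, the selected time points can then filter the rows before plotting, for example `df[df['time'].isin(pick_frames(df['time']))]`.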
```r
library(DT)  # interactive tables
datatable(df, filter = 'top', extensions = 'Buttons',
          options = list(dom = 'Bfrtip', buttons = c('csv', 'excel')))

library(leaflet)  # interactive maps
leaflet(spatial_df) %>% addTiles() %>% addCircles()

library(networkD3)  # interactive networks
# saveWidget lives in htmlwidgets, so call it with the namespace prefix
sankeyNetwork(...) %>% htmlwidgets::saveWidget('sankey.html')
```
htmlwidgets is the R answer to plotly's JavaScript wrapping — many specialized packages for tables, maps, networks, all producing standalone HTML.
Trigger: fig.write_image('out.pdf') without kaleido installed.
Mechanism: plotly previously fell back to orca (now removed); current versions raise ValueError but older versions silently skipped.
Symptom: No file written; OR file written with default settings.
Fix: pip install kaleido; verify Chrome is installed (kaleido v1+ requires it); test with fig.write_image('test.pdf') after install.
Trigger: Following 2020-2022 plotly tutorials with engine='orca'.
Mechanism: orca is EOL; engine= parameter deprecated in plotly 6.2 (post-Sep 2025).
Symptom: ValueError or DeprecationWarning.
Fix: Remove engine= argument; use Kaleido v1 (default).
Trigger: Journal requires EPS; Kaleido v1 only supports PDF/PNG/SVG/JPG/WebP.
Mechanism: Bundled Chromium in v0 supported EPS; v1 unbundled and dropped it.
Symptom: kaleido error on EPS export.
Fix: Export PDF, then convert via pdf2ps (ghostscript). Complex figures may come out as raster EPS; verify acceptability with the journal.
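The PDF-to-EPS step can be scripted. pdf2ps is a wrapper script around Ghostscript that emits PostScript; the sketch below instead calls Ghostscript directly with its eps2write device, which is a swapped-in variant of the fix rather than the exact command named above. It assumes a `gs` executable on PATH (on Windows the binary usually has a different name) and a single-page PDF; the function names are illustrative.

```python
import shutil
import subprocess


def eps_command(pdf_path, eps_path, gs="gs"):
    """Build a Ghostscript command converting a one-page PDF to EPS."""
    return [
        gs, "-q", "-dNOPAUSE", "-dBATCH", "-dSAFER",
        "-sDEVICE=eps2write",
        f"-sOutputFile={eps_path}",
        pdf_path,
    ]


def pdf_to_eps(pdf_path, eps_path):
    """Run the conversion; raises if Ghostscript is missing or fails."""
    gs = shutil.which("gs")
    if gs is None:
        raise RuntimeError("Ghostscript ('gs') not found on PATH")
    subprocess.run(eps_command(pdf_path, eps_path, gs=gs), check=True)
    return eps_path


if __name__ == "__main__":
    print(" ".join(eps_command("pca.pdf", "pca.eps")))
```

Opening the resulting EPS in a viewer and zooming in is a quick way to see whether elements were rasterized.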
Trigger: Plotly scatter of 50000 points exported as HTML.
Mechanism: Each point + hover data embedded; JS bundle ~3 MB; data scales linearly.
Symptom: Browser hangs opening; reviewer's network throttles upload.
Fix: Use Scattergl (WebGL); OR Datashader pre-aggregation; OR ship static + small HTML supplement.
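The pre-aggregation idea can be illustrated without Datashader: bin the points onto a fixed grid and plot one marker per occupied cell, sized or colored by count. The sketch below is a pure-Python stand-in for that idea, not Datashader's API; `bin_points` and the default grid size are assumptions chosen for illustration.

```python
from collections import Counter


def bin_points(xs, ys, nx=100, ny=100):
    """Aggregate points onto an nx-by-ny grid.

    Returns a list of (x_center, y_center, count) for occupied cells only,
    so the output never exceeds nx * ny rows regardless of input size.
    """
    xs, ys = list(xs), list(ys)
    if not xs:
        return []
    xmin, xmax = min(xs), max(xs)
    ymin, ymax = min(ys), max(ys)
    # Guard against a zero-width range when all values are identical.
    xspan = (xmax - xmin) or 1.0
    yspan = (ymax - ymin) or 1.0
    counts = Counter()
    for x, y in zip(xs, ys):
        i = min(int((x - xmin) / xspan * nx), nx - 1)
        j = min(int((y - ymin) / yspan * ny), ny - 1)
        counts[(i, j)] += 1
    return [
        (xmin + (i + 0.5) * xspan / nx, ymin + (j + 0.5) * yspan / ny, n)
        for (i, j), n in sorted(counts.items())
    ]


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    pts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50000)]
    cells = bin_points([p[0] for p in pts], [p[1] for p in pts])
    print(len(pts), "points ->", len(cells), "cells")
```

The trade-off is that per-point hover data is lost; the binned plot shows density, so a static figure or a small subsample may still be needed for sample-level tooltips.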
Trigger: transition_time with 100+ frames and 10000+ points per frame.
Mechanism: Each frame rendered independently.
Symptom: Animation takes hours.
Fix: Downsample frames; pre-aggregate per-frame data; OR use plotly animation (in-browser interpolation faster).
Trigger: Manuscript references interactive HTML as Figure 2.
Mechanism: Journals require static; interactive HTML is supplement.
Symptom: Submission requires figure resubmission as static.
Fix: Always produce both static (figure) + interactive (supplement) versions.
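Producing both outputs can be made a single step. The sketch below accepts any figure object exposing `write_html(path)` and `write_image(path)`, which plotly figures do, and checks that both files exist and are not suspiciously small. The helper name and the 1 KB floor are assumptions, not a plotly convention.

```python
from pathlib import Path


def export_both(fig, stem, static_ext="pdf", min_bytes=1024):
    """Write <stem>.html and <stem>.<static_ext>, then sanity-check both.

    Raises RuntimeError if either file is missing or smaller than
    min_bytes, which catches exports that failed without an exception.
    """
    html_path = Path(f"{stem}.html")
    static_path = Path(f"{stem}.{static_ext}")
    fig.write_html(str(html_path))
    fig.write_image(str(static_path))
    for path in (html_path, static_path):
        if not path.exists() or path.stat().st_size < min_bytes:
            raise RuntimeError(f"export looks wrong: {path}")
    return html_path, static_path
```

A call such as `export_both(fig, 'figure2')` would then yield `figure2.pdf` for the manuscript and `figure2.html` for the supplement from the same figure object.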
| Pattern | Cause | Action |
|---------|-------|--------|
| Kaleido / orca confusion in plotly | Pipeline changed 2024-2025 | Use Kaleido v1+; no engine= |
| ggplotly drops some custom theme | Conversion loses non-translatable ggplot elements | Manually re-add via plotly::layout() |
| bokeh static export fails | selenium not installed | pip install selenium; Chrome required |
| htmlwidgets self-contained doesn't work offline | CDN-linked resources by default | saveWidget(..., selfcontained = TRUE) |

| Threshold | Value | Source |
|-----------|-------|--------|
| HTML file size warning | >10 MB | Practical |
| Scattergl trigger | >5000 points | plotly performance |
| Animation max frames | ~100 | Viewer attention + file size |
| Selfcontained HTML on | always for portability | htmlwidgets best practice |

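The point-count and file-size thresholds can be encoded as small helpers so they are applied consistently. This is a sketch: the constant and function names are hypothetical, and treating 10 MB as 10 × 1024² bytes is an assumption.

```python
import os

SCATTERGL_MIN_POINTS = 5000        # above this, prefer a WebGL trace
HTML_WARN_BYTES = 10 * 1024 ** 2   # 10 MB, interpreted as MiB here


def choose_trace_type(n_points):
    """Return 'scattergl' above the point threshold, else 'scatter'."""
    return "scattergl" if n_points > SCATTERGL_MIN_POINTS else "scatter"


def size_warning(size_bytes):
    """Return a warning string if size_bytes exceeds the HTML threshold."""
    if size_bytes > HTML_WARN_BYTES:
        return f"HTML is {size_bytes / 1024 ** 2:.1f} MB (> 10 MB)"
    return None


def check_html_size(path):
    """Apply size_warning to a file on disk."""
    return size_warning(os.path.getsize(path))
```

Calling `check_html_size('pca.html')` right after `write_html` gives an early signal that the supplement needs WebGL, pre-aggregation, or a static fallback.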
| Error / symptom | Cause | Solution |
|-----------------|-------|----------|
| Static export silent failure | kaleido / Chrome missing | Install both |
| HTML bloated | Large N points | Scattergl or Datashader |
| orca DeprecationWarning | Following old tutorial | Remove engine=, use Kaleido v1 |
| EPS export fails | Kaleido v1 dropped EPS | PDF + pdf2ps |
| ggplotly tooltips show wrong fields | Default tooltip argument | Specify tooltip = c(...) |
| Animation file too large | Too many frames | Downsample / pre-aggregate |
| Interactive cited as paper figure | Journal requires static | Produce both |