cgg-runtime/skills/caption-semantic-layer/SKILL.md
Two-tier caption architecture — key semantic captions (diegetic + branded) and subtitle fill with no-double enforcement.
You build the text layer of the edit. This is the most nuanced stage in the pipeline because text on video operates at two completely different levels simultaneously, and most content gets this wrong.
There are two layers. They serve different purposes, look different, and must never collide.
These are the high-value text moments. They are not subtitles. They are designed text events that land with the weight of a visual composition choice.
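A key semantic is a word, phrase, or short sentence that functions as one of these (the type values used in the output schema):
- hook
- tension_peak
- resolution
- loop_anchor
- quotable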
These are not the most interesting words. They are the most load-bearing ones. The test: if you removed this phrase, would the segment lose structural integrity?
When a key semantic appears during a b-roll slot, it is rendered as diegetic text — text that feels like it exists inside the world of the image, not on top of it.
What this means in practice:
The creative config's caption_style.key_semantic.diegetic_treatment field governs the specific approach. If the profile specifies "weathered analog", the text looks like aged signage or faded print. If it specifies "luminous digital", the text feels like projected light.
When a key semantic appears over the speaker's footage (not b-roll), it is rendered as differentiated branded text — larger, kinetic typography that draws from the profile's aesthetic invariants.
- caption_style.key_semantic.position
- caption_style.key_semantic.font_personality
- caption_style.key_semantic.size_behavior: how text scales with emphasis
- caption_style.key_semantic.animation: how text enters and exits
- color_anchors

These moments should feel authored. They are editorial emphasis, not transcription.
Full transcription subtitles that fill the timeline around the key semantics.
Accessibility and comprehension. Many viewers watch without audio (platform behavior data consistently shows 60-80% of Reels are viewed silently). Subtitles ensure the content is accessible.
Intentionally subordinate to key semantics:
- caption_style.subtitle.position
- caption_style.subtitle.font_personality

If a phrase is rendered as a key semantic caption in a given time range, it does NOT appear as a subtitle in that same time range.
This is enforced at the time range level, not the phrase level. The logic:
The key semantic layer always takes precedence. Doubling reads as a mistake — as if the system doesn't know it already said that.
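A minimal sketch of time-range suppression, assuming timestamps have already been parsed to float seconds (the schema stores them as strings). The class and function names here are hypothetical; only the suppressed and suppressed_by fields come from the output schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KeySemantic:
    id: str
    start: float  # seconds on the audio spine (assumed numeric)
    end: float


@dataclass
class Subtitle:
    id: str
    text: str
    start: float
    end: float
    suppressed: bool = False
    suppressed_by: Optional[str] = None


def ranges_overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> bool:
    """True when the half-open ranges [a_start, a_end) and [b_start, b_end) share time."""
    return a_start < b_end and b_start < a_end


def apply_no_double(subtitles: List[Subtitle], key_semantics: List[KeySemantic]) -> List[Subtitle]:
    """Mark each subtitle chunk that shares any time with a key semantic as suppressed."""
    for sub in subtitles:
        for ks in key_semantics:
            if ranges_overlap(sub.start, sub.end, ks.start, ks.end):
                sub.suppressed = True
                sub.suppressed_by = ks.id  # mirrors the schema's suppressed_by field
                break
    return subtitles


subs = [
    Subtitle("sub_1", "so here is", 0.0, 0.9),
    Subtitle("sub_2", "the real problem", 0.9, 2.1),
    Subtitle("sub_3", "nobody talks about", 2.1, 3.0),
]
apply_no_double(subs, [KeySemantic("ks_1", 1.0, 2.0)])
print([(s.id, s.suppressed_by) for s in subs])
```

Only sub_2 is suppressed here, because the comparison is on time ranges and never looks at the subtitle text. A subtitle chunk that merely touches the boundary of a key semantic is left alone under the half-open convention, which is a design choice of this sketch.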
You receive:
- Optionally, an Overshoot adjudication verdict: if a draft_review pass was run, it includes a caption_sync field (good | minor_issues | major_issues) and may flag specific timing problems.

Input contract: Captions render AFTER b-roll assembly (Phase 5e). Your input video is the b-roll-assembled clip, not the raw base track. This means morph overlays are already baked in, so your captions go on top of the final visual. The word-level verified transcript (Phase 1c) provides caption timing against the original audio spine, which is unchanged through assembly.
Timestamp Drift Warning: Subtitle timestamps derive from the transcript. If auto-transcribed, they may be 3-5s off from actual audio. The caption layer must consume the verified transcript (post-Phase 1c), not the raw transcript. If verification has not run, flag this in the collision_audit output as a timestamp_drift_warning field.
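One way the drift warning could be attached is sketched below. The source names only the timestamp_drift_warning field, so the free-text value, the transcript_verified argument, and the build_collision_audit helper are all assumptions. Whether the warning should also change status is not specified, so this sketch leaves status untouched.

```python
from typing import Any, Dict, List, Optional


def build_collision_audit(
    transcript_verified: bool,
    flags: Optional[List[Dict[str, str]]] = None,
) -> Dict[str, Any]:
    """Assemble a collision_audit dict, adding the drift warning when needed."""
    flags = list(flags or [])
    audit: Dict[str, Any] = {
        "status": "flagged" if flags else "clean",
        "flags": flags,
    }
    if not transcript_verified:
        # Assumption: the warning is a free-text string; the source only names the field.
        audit["timestamp_drift_warning"] = (
            "Transcript verification (Phase 1c) has not run; "
            "caption timestamps may be 3-5s off from the actual audio."
        )
    return audit


audit = build_collision_audit(transcript_verified=False)
print(audit["timestamp_drift_warning"])
```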
Read through the segment text and identify every key semantic candidate. For each:
For each selected key semantic:
For the full segment:
Walk through the full timeline and verify:
- For b-roll slots with continuity_type: "morphing", during those timestamp ranges the visual is mid-transformation between real footage and abstract energy (or vice versa). Caption animation during a morph creates visual chaos: two things moving independently in the same frame. Static captions (already on screen, not animating) may persist through morph zones. New caption entrances, kinetic text, and diegetic treatments must wait until the morph completes. Subtitle fill (small, static, bottom-positioned) is exempt from this rule because it is designed to be invisible.

To enforce: for each key semantic, check whether its timestamp_start through timestamp_end (including animation_in and animation_out durations) overlaps with any morph b-roll slot's reel window. If it overlaps, either shift the key semantic to land just before or just after the morph, or convert it to a static hold (no animation) for the duration of the overlap.
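The morph check can be sketched as follows, under several assumptions: timestamps are float seconds, animation durations are known numbers (the schema's animation_in and animation_out are descriptive strings), and animations extend the caption's footprint outward from its start and end. The max_shift tolerance and the clear_morph_zone name are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class KeySemantic:
    id: str
    start: float      # timestamp_start in seconds (assumed numeric)
    end: float        # timestamp_end in seconds
    anim_in: float    # assumed duration of the entrance animation, seconds
    anim_out: float   # assumed duration of the exit animation, seconds


def clear_morph_zone(
    ks: KeySemantic, morph_start: float, morph_end: float, max_shift: float = 0.5
) -> Tuple[str, KeySemantic]:
    """Return (resolution, adjusted key semantic) for one morph window."""
    # Assumption: animations extend the footprint outward from start and end.
    foot_start = ks.start - ks.anim_in
    foot_end = ks.end + ks.anim_out
    if not (foot_start < morph_end and morph_start < foot_end):
        return "clear", ks

    shift_earlier = foot_end - morph_start  # move back so the exit finishes at morph start
    shift_later = morph_end - foot_start    # move forward so the entrance begins at morph end
    if min(shift_earlier, shift_later) <= max_shift:
        delta = -shift_earlier if shift_earlier <= shift_later else shift_later
        return "shifted", replace(ks, start=ks.start + delta, end=ks.end + delta)

    # Too far to shift without detaching from the spoken phrase: hold static instead.
    return "converted to static hold", replace(ks, anim_in=0.0, anim_out=0.0)


near_edge = KeySemantic("ks_1", start=16.0, end=16.5, anim_in=0.25, anim_out=0.5)
mid_morph = KeySemantic("ks_2", start=17.0, end=19.0, anim_in=0.25, anim_out=0.25)
print(clear_morph_zone(near_edge, 16.75, 20.5))
print(clear_morph_zone(mid_morph, 16.75, 20.5))
```

The first caption only grazes the morph window, so it is nudged 0.25s earlier. The second sits deep inside the window, so it falls back to a static hold. Preferring a small shift and falling back to a hold is one possible policy; the source permits either resolution.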
If an Overshoot adjudication verdict is available with caption_sync data:
- revision_notes if present.

Record the reconciliation result in the output's collision_audit: add a caption_sync_reconciled field indicating whether adjudication data was consumed and what changed.
```json
{
  "caption_layer_version": "1.0.0",
  "profile_id": "string",
  "creative_id": "string",
  "key_semantics": [
    {
      "id": "ks_1",
      "type": "hook | tension_peak | resolution | loop_anchor | quotable",
      "text": "string: exact text displayed",
      "source_text": "string: full phrase from transcript (may be longer than display text)",
      "timestamp_start": "string",
      "timestamp_end": "string",
      "duration_seconds": "float",
      "context": "broll | raw_footage",
      "styling": {
        "treatment": "diegetic | branded",
        "position": "string",
        "animation_in": "string",
        "animation_out": "string",
        "size": "string",
        "color_note": "string",
        "diegetic_description": "string: if diegetic, how text integrates with scene (null if branded)"
      },
      "emotional_intention": "string: what this text does to the viewer at this moment",
      "edl_beat": "int: which beat in the EDL this aligns with"
    }
  ],
  "subtitles": [
    {
      "id": "sub_1",
      "text": "string: 2-3 words",
      "timestamp_start": "string",
      "timestamp_end": "string",
      "suppressed": false,
      "suppressed_by": "null or key_semantic id that caused suppression"
    }
  ],
  "collision_audit": {
    "status": "clean | flagged",
    "flags": [
      {
        "timestamp": "string",
        "issue": "string: what collision was detected",
        "resolution": "string: how it was resolved"
      }
    ],
    "morph_zone_clearance": [
      {
        "slot": "int: b-roll slot number",
        "morph_window": "string: e.g. '16.8-20.5s'",
        "captions_in_window": ["string: caption IDs active during this window"],
        "conflicts": ["string: caption IDs with animation overlap"],
        "resolution": "string: shifted / converted to static hold / exempt (subtitle fill)"
      }
    ],
    "text_density_assessment": "string: overall assessment of text load"
  },
  "editorial_notes": "string: any notable decisions about what was or wasn't selected as key semantic"
}
```
The key semantic layer is where the pipeline's editorial voice is most visible to the audience. The words you choose to elevate — and the words you don't — define how the show feels when consumed in shortform.
Select too many key semantics: the piece feels desperate, over-emphasized, like it's trying too hard. Select too few: the piece feels like a subtitle reel with no editorial point of view.
The sweet spot for a 60-second piece is typically 3-5 key semantics. One hook, one tension peak, one resolution, plus 1-2 quotables or loop anchors if they exist. Quality over quantity.
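As a rough aid, the 3-5 guideline can be turned into a soft check that feeds text_density_assessment. This is a sketch for a roughly 60-second piece only. The assess_key_semantic_load name and the wording of its notes are hypothetical, and the guideline is a typical range, not a hard rule.

```python
from collections import Counter
from typing import List


def assess_key_semantic_load(types: List[str]) -> str:
    """Rough text-load check for a roughly 60-second piece."""
    counts = Counter(types)
    total = len(types)
    notes = []
    if total < 3:
        notes.append(f"{total} key semantics: risks reading as a subtitle reel")
    elif total > 5:
        notes.append(f"{total} key semantics: risks over-emphasis")
    else:
        notes.append(f"{total} key semantics: within the typical 3-5 range")
    # The guideline expects exactly one of each structural anchor.
    for core in ("hook", "tension_peak", "resolution"):
        if counts[core] != 1:
            notes.append(f"expected one {core}, found {counts[core]}")
    return "; ".join(notes)


print(assess_key_semantic_load(["hook", "tension_peak", "resolution", "quotable"]))
```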
Captions are timed against the audio spine, which must remain untouched during assembly. If b-roll is assembled as overlay-at-timestamp (correct), caption timing holds. If assembled as insert-between-segments (incorrect), cumulative sync drift will invalidate all caption timestamps after the first insertion point.
The caption layer assumes overlay-at-timestamp assembly. If the pipeline's assembly model changes, the caption timing model must be re-validated.
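The drift arithmetic can be illustrated with a small sketch. It assumes each b-roll slot is described by a position on the audio spine and a duration, both in seconds. The function name and the model labels "overlay" and "insert" are hypothetical shorthand for the two assembly approaches described above.

```python
from typing import List, Tuple

Insert = Tuple[float, float]  # (position on the audio spine, b-roll duration), seconds


def audio_position_after_assembly(caption_time: float, inserts: List[Insert], model: str) -> float:
    """Where the audio originally at caption_time lands in the assembled output."""
    if model == "overlay":
        # Overlay-at-timestamp leaves the spine untouched.
        return caption_time
    if model == "insert":
        # Every insertion at or before this moment pushes the audio later.
        pushed = sum(duration for at, duration in inserts if at <= caption_time)
        return caption_time + pushed
    raise ValueError(f"unknown assembly model: {model}")


b_roll = [(10.0, 2.0), (25.0, 3.0)]
for t in (5.0, 12.0, 30.0):
    drift = audio_position_after_assembly(t, b_roll, "insert") - t
    print(t, drift)
```

With these sample slots, a caption at 5.0s is still in sync, one at 12.0s is 2.0s early relative to its audio, and one at 30.0s is 5.0s early. Under the overlay model the drift is zero at every timestamp.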