Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

jmagly/Transcribe Media

Name: Transcribe Media
Author: jmagly

agentic/code/frameworks/media-curator/skills/transcribe-media/SKILL.md

npx skillsauth add jmagly/aiwg Transcribe Media

Transcribe Media

Create a research-grade transcript sidecar for a local acquired audio or video file. This primitive supports the handoff from media-curator to research. It does not claim transcription support unless an actual local STT tool, approved service adapter, human transcript, or diarization sidecar is available.

Inputs

Required:

Local acquired media path.

Optional:

Source URL, title, creator, acquired-at timestamp, acquisition ID, language.
Existing transcript or diarization sidecar.

Output

Write transcript sidecars under .aiwg/media/transcripts/ or beside the acquired media when the collection already stores sidecars locally.

Recommended filename: <media-basename>.transcript.json
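The two output locations can be sketched as a small path helper. This is a minimal illustration, not part of the skill itself: the function name `sidecar_path` and the `beside_media` flag are hypothetical, and it assumes `<media-basename>` means the file name without its final extension.

```python
from pathlib import Path

# Default sidecar directory named in the Output section.
TRANSCRIPT_DIR = Path(".aiwg/media/transcripts")


def sidecar_path(media_path: str, beside_media: bool = False) -> Path:
    """Return where the transcript sidecar for media_path would be written.

    Assumption: <media-basename> is the file name minus its last extension.
    """
    media = Path(media_path)
    name = f"{media.stem}.transcript.json"
    if beside_media:
        # Collections that already store sidecars next to the media file.
        return media.with_name(name)
    return TRANSCRIPT_DIR / name
```

For example, `sidecar_path("media/talk.mp4")` yields `.aiwg/media/transcripts/talk.transcript.json`, while passing `beside_media=True` yields `media/talk.transcript.json`.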

Required fields:

schema: aiwg.media.transcript.v1
source.path, source.url, source.sha256
transcript.sha256, transcript.language, transcript.generated_at, transcript.tool, transcript.quality
segments[] with stable id, start, end, text, and optional speaker
provenance.wasDerivedFrom, provenance.generatedEntity, provenance.activity, provenance.used

Segment IDs MUST be stable. Use zero-padded sequential IDs such as seg-000001 unless the upstream transcript already has durable IDs.
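One way to satisfy the stable-ID rule is sketched below. The helper names are hypothetical, and the choice to keep upstream IDs only when every segment already has one is an assumption made here to avoid mixing two ID schemes in a single transcript.

```python
def segment_id(index: int, width: int = 6) -> str:
    """Return a zero-padded sequential ID such as seg-000001 (index is 1-based)."""
    if index < 1:
        raise ValueError("segment index is 1-based")
    return f"seg-{index:0{width}d}"


def assign_segment_ids(segments):
    """Return copies of the segments, each carrying an ID.

    Assumption: upstream IDs count as durable only if every segment has one;
    otherwise all segments are renumbered sequentially in their given order.
    """
    if segments and all(seg.get("id") for seg in segments):
        return [dict(seg) for seg in segments]
    return [{**seg, "id": segment_id(n)} for n, seg in enumerate(segments, start=1)]
```

Because the IDs depend only on segment order, regenerating the sidecar from the same upstream transcript produces the same IDs.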

Hashing

source.sha256 is the SHA-256 of the exact local media file bytes.
transcript.sha256 is the SHA-256 of the canonical transcript payload used for citation, not the pretty-printed JSON file.
The canonical payload is the UTF-8 join of id, start, end, speaker if present, and text for every segment, separated by tabs and newlines.
Use the same lowercase sha256:<hex> convention as media-curator integrity manifests.
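The hashing rules above might be implemented roughly as follows. Several details are assumptions rather than documented behavior: fields are joined with tabs, segments with newlines, there is no trailing newline, and `start` and `end` are rendered with Python's `str()`. A real producer and verifier would need to agree on all of these, especially the number format.

```python
import hashlib


def sha256_file(path, chunk_size=1 << 20):
    """SHA-256 of the exact file bytes, in the lowercase sha256:<hex> form."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def canonical_payload(segments):
    """Build the canonical payload bytes from the segment list.

    Assumed layout: one line per segment, tab-separated fields
    (id, start, end, speaker if present, text), UTF-8 encoded.
    """
    lines = []
    for seg in segments:
        fields = [str(seg["id"]), str(seg["start"]), str(seg["end"])]
        if seg.get("speaker"):
            fields.append(seg["speaker"])
        fields.append(seg["text"])
        lines.append("\t".join(fields))
    return "\n".join(lines).encode("utf-8")


def sha256_transcript(segments):
    """Hash the canonical payload, not the JSON file on disk."""
    return "sha256:" + hashlib.sha256(canonical_payload(segments)).hexdigest()
```

Hashing the canonical payload rather than the file means that re-indenting or re-ordering keys in the sidecar JSON leaves `transcript.sha256` unchanged, so citations stay valid.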

Speaker Labels

Preserve speaker labels when STT output, a diarization sidecar, or a human transcript provides them. If no diarization is available, emit the documented single-speaker fallback SPEAKER_00 and record the limitation in transcript.quality.limitations.

Do not invent speaker names. Replace SPEAKER_00 with real names only when metadata or human verification proves them.
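The fallback behavior could look like the sketch below. It assumes `transcript.quality` is a dictionary whose `limitations` entry is a list of strings; the function name and the wording of the limitation note are illustrative, not taken from the skill.

```python
FALLBACK_SPEAKER = "SPEAKER_00"
# Hypothetical wording for the recorded limitation.
NO_DIARIZATION_NOTE = (
    "No diarization available; all segments use the single-speaker fallback."
)


def apply_speaker_labels(segments, quality):
    """Preserve existing speaker labels, or apply the single-speaker fallback.

    Returns (segments, quality) as new objects; inputs are not mutated.
    Never maps labels to real names; that requires metadata or human review.
    """
    if any(seg.get("speaker") for seg in segments):
        # Labels came from STT output, a diarization sidecar, or a human transcript.
        return [dict(seg) for seg in segments], quality
    labeled = [{**seg, "speaker": FALLBACK_SPEAKER} for seg in segments]
    limitations = list(quality.get("limitations", []))
    if NO_DIARIZATION_NOTE not in limitations:
        limitations.append(NO_DIARIZATION_NOTE)
    return labeled, {**quality, "limitations": limitations}
```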

Tooling Detection

Check for an available transcription path before generating text:

command -v whisper-cpp || command -v whisper || command -v vosk-transcriber || true
command -v ffmpeg || true

If no STT tool or approved transcript source is available, do not fabricate transcript text. Write or report an actionable plan with:

schema: aiwg.media.transcript-plan.v1
status: blocked-tooling-missing
source path and source hash when the media file can be read
next steps for installing local STT tooling or providing a human transcript
quality limits stating that no transcript hash exists until segment text exists
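The same detection and the blocked plan can be sketched in Python. Only the `schema` and `status` values come from the text above; the other key names (`source`, `next_steps`, `quality`) and the wording of the steps are assumptions for illustration.

```python
import shutil

# Same candidates as the shell check above.
STT_CANDIDATES = ("whisper-cpp", "whisper", "vosk-transcriber")


def detect_stt_tool(which=shutil.which):
    """Return the first STT command found on PATH, or None."""
    for name in STT_CANDIDATES:
        if which(name):
            return name
    return None


def blocked_plan(source_path, source_sha256=None):
    """Build a plan document instead of fabricating transcript text.

    source_sha256 stays None when the media file could not be read.
    """
    return {
        "schema": "aiwg.media.transcript-plan.v1",
        "status": "blocked-tooling-missing",
        "source": {"path": source_path, "sha256": source_sha256},
        "next_steps": [
            "Install a local STT tool (one of: " + ", ".join(STT_CANDIDATES) + ").",
            "Or provide a human transcript or diarization sidecar.",
        ],
        "quality": {
            "limitations": ["No transcript hash exists until segment text exists."]
        },
    }
```

A caller would run `detect_stt_tool()` first and fall back to `blocked_plan(...)` only when it returns `None` and no approved transcript source is supplied.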

Verification Limits

A generated transcript is evidence of tool output, not proof of exact speech content. Handoff notes MUST state:

Machine transcripts can contain word errors, omissions, and hallucinated punctuation.
Speaker labels are provisional unless diarization or human review supports them.
Research induction should cite the transcript hash and source media hash together.
Human verification is required before using quotations in high-stakes or published claims.

Research Handoff

Include the transcript sidecar path, source media hash, transcript hash, source URL, acquisition metadata, quality status, and known limitations.
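A handoff record that carries these items together with the required verification statements might be assembled as below. The output key names and the `handoff_record` function are hypothetical; the sketch assumes the sidecar follows the required fields listed under Output and that `transcript.quality` holds both the quality status and the known limitations.

```python
# The four statements that handoff notes must include (see Verification Limits).
VERIFICATION_LIMITS = [
    "Machine transcripts can contain word errors, omissions, and hallucinated punctuation.",
    "Speaker labels are provisional unless diarization or human review supports them.",
    "Research induction should cite the transcript hash and source media hash together.",
    "Human verification is required before using quotations in high-stakes or published claims.",
]


def handoff_record(sidecar_path, sidecar, acquisition=None):
    """Collect the fields a research handoff needs from a transcript sidecar."""
    source = sidecar["source"]
    transcript = sidecar["transcript"]
    return {
        "transcript_path": str(sidecar_path),
        "source_sha256": source["sha256"],
        "transcript_sha256": transcript["sha256"],
        "source_url": source.get("url"),
        "acquisition": dict(acquisition or {}),
        "quality": transcript.get("quality", {}),
        "verification_limits": list(VERIFICATION_LIMITS),
    }
```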

Fixture Example

See examples/sample.transcript.json for a minimal transcript sidecar with timestamps, speaker fallback, source URL, source hash, transcript hash, and provenance fields.

References

@$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/integrity-verification/SKILL.md — SHA-256 manifest and fixity conventions
@$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/provenance-tracking/SKILL.md — W3C PROV-O derivation model for media artifacts
@$AIWG_ROOT/docs/integrations/media-curator-to-research-handoff.md — Research handoff expectations for media-derived artifacts

Produce timestamped transcript sidecars for acquired audio/video with hashes, source metadata, speaker labels when available, and explicit degraded plans when STT tooling is missing

139 stars

tools

Updated May 26, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add jmagly/aiwg Transcribe Media

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (Container and dependency vulnerability scanner): Clean, 95%
Semgrep (Static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (Open source security scanning): Skipped, 50%
Socket.dev (Supply chain security analysis): Skipped, 50%
VirusTotal (Multi-engine malware detection): Skipped, 50%
CrowdStrike (Advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: May 26, 2026, 7:19 AM; 94.3s; 2 files scanned

SKILL.md

namespace: aiwg
platforms: [all]
name: Transcribe Media
description: Produce timestamped transcript sidecars for acquired audio/video with hashes, source metadata, speaker labels when available, and explicit degraded plans when STT tooling is missing
category: media-curator

Install manually

# Clone the repo
git clone https://github.com/jmagly/aiwg.git

# Copy into Claude Code skills folder (global)
cp -r aiwg/agentic/code/frameworks/media-curator/skills/transcribe-media ~/.claude/skills/

Repository

jmagly/aiwg

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT