skills/gst-caps-negotiation/SKILL.md
Understand and resolve GStreamer caps negotiation issues. Use when a user encounters caps-related errors, format mismatches, or needs to control media formats between pipeline elements.
Resolve caps (capabilities) negotiation failures and control media format flow between GStreamer elements.
Caps describe the media format flowing between elements. Each pad has a set of caps it supports. Two linked pads must agree on a compatible format during negotiation.
# Caps structure
media-type, field1=value1, field2=value2, ...
# Examples
video/x-raw, format=I420, width=1920, height=1080, framerate=30/1
audio/x-raw, format=S16LE, rate=44100, channels=2, layout=interleaved
video/x-h264, stream-format=avc, alignment=au, profile=high, level=(string)4.1
image/jpeg, width=1280, height=720
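Because a caps string is just a media type followed by comma-separated fields, it can be assembled programmatically. A minimal sketch, assuming a hypothetical `make_caps` helper (not part of GStreamer) that joins its arguments without spaces so the result can be dropped into a `gst-launch-1.0` command line:

```shell
# make_caps is a hypothetical helper, not a GStreamer tool.
# Usage: make_caps MEDIA_TYPE [field=value ...]
make_caps() {
    caps=$1
    shift
    for field in "$@"; do
        caps="$caps,$field"      # fields are appended as ",name=value"
    done
    printf '%s\n' "$caps"
}

make_caps video/x-raw format=I420 width=1920 height=1080 framerate=30/1
# prints: video/x-raw,format=I420,width=1920,height=1080,framerate=30/1
```

The helper does no validation; whether the resulting caps are accepted still depends on the pads they are placed between.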
| Video media type | Description | Typical Use |
|--------|-------------|-------------|
| video/x-raw | Uncompressed video | Between processing elements |
| video/x-h264 | H.264 encoded | Streaming, recording |
| video/x-h265 | H.265/HEVC encoded | High-efficiency recording |
| video/x-vp8 | VP8 encoded | WebM, WebRTC |
| video/x-vp9 | VP9 encoded | WebM, YouTube |
| video/x-av1 | AV1 encoded | Next-gen streaming |
| image/jpeg | JPEG frames | MJPEG cameras |
| Audio media type | Description | Typical Use |
|--------|-------------|-------------|
| audio/x-raw | Uncompressed audio | Between processing elements |
| audio/mpeg | MP3 (mpegversion=1, layer=3) or AAC (mpegversion=2 or 4) | Streaming, recording |
| audio/x-opus | Opus encoded | WebRTC, VoIP |
| audio/x-vorbis | Vorbis encoded | Ogg containers |
| audio/x-flac | FLAC lossless | Archival |
format: I420, NV12, YUY2, UYVY, BGRA, RGBA, BGRx, RGBx, BGR, RGB, GRAY8, ...
width: integer (e.g., 1920)
height: integer (e.g., 1080)
framerate: fraction (e.g., 30/1, 25/1, 60/1, 0/1 for variable)
interlace-mode: progressive, interleaved, mixed
pixel-aspect-ratio: fraction (e.g., 1/1)
colorimetry: bt709, bt601, smpte240m
chroma-site: mpeg2, jpeg, none
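# MAY FAIL: x264enc accepts only certain raw formats (I420, YV12, NV12, ...).
# videotestsrc can supply I420, but a source limited to e.g. YUY2 or RGB cannot link directly.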
videotestsrc ! x264enc ! ...
# FIX: Insert videoconvert
videotestsrc ! videoconvert ! x264enc ! ...
# BEST: Explicit caps for predictability
videotestsrc ! videoconvert ! video/x-raw,format=I420 ! x264enc ! ...
# FAILS: Sink expects different size than source
src ! sink expecting 1280x720
# FIX: Insert videoscale + caps filter
src ! videoscale ! video/x-raw,width=1280,height=720 ! sink
# FAILS: Muxer expects constant framerate, source is variable
camera ! muxer
# FIX: Insert videorate
camera ! videorate ! video/x-raw,framerate=30/1 ! muxer
# FAILS: Encoder expects 44100 Hz, source produces 48000 Hz
src ! encoder
# FIX: Insert audioresample
src ! audioconvert ! audioresample ! audio/x-raw,rate=44100 ! encoder
# FAILS: Mono source, stereo sink
src ! audio/x-raw,channels=1 ! stereo_sink
# FIX: Insert audioconvert
src ! audioconvert ! audio/x-raw,channels=2 ! stereo_sink
| Problem | Solution Element | Purpose |
|---------|-----------------|---------|
| Wrong pixel format | videoconvert | Converts between video color formats |
| Wrong resolution | videoscale | Scales video to target resolution |
| Wrong framerate | videorate | Adjusts framerate by duplicating/dropping |
| Wrong audio format | audioconvert | Converts between audio sample formats |
| Wrong sample rate | audioresample | Resamples audio to target rate |
| Need parsed stream | h264parse, mpegaudioparse | Parses encoded bitstream |
# View element pad templates (what formats it accepts/produces)
gst-inspect-1.0 x264enc | grep -A 20 "Pad Templates"
# See negotiated caps at runtime
GST_DEBUG=GST_CAPS:4 gst-launch-1.0 ...
# Use identity to print caps flowing through (messages appear only when gst-launch-1.0 is run with -v)
... ! identity silent=false ! ...
# Use dot graph to see caps on every link
GST_DEBUG_DUMP_DOT_DIR=/tmp gst-launch-1.0 ...
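The dumped `.dot` files are plain Graphviz sources, so they need to be rendered before the caps on each link can be read. A minimal sketch, assuming Graphviz's `dot` is installed; `render_dot_dir` and the `/tmp/gst-dots` directory are hypothetical names chosen for illustration:

```shell
# render_dot_dir is a hypothetical helper: render every .dot file in a
# directory to PNG and report how many were rendered.
render_dot_dir() {
    dir=$1
    count=0
    for f in "$dir"/*.dot; do
        [ -e "$f" ] || continue                 # glob matched nothing
        dot -Tpng "$f" -o "${f%.dot}.png" && count=$((count + 1))
    done
    echo "$count graph(s) rendered"
}

# Usage (assumes gst-launch-1.0 and Graphviz are installed):
#   mkdir -p /tmp/gst-dots
#   GST_DEBUG_DUMP_DOT_DIR=/tmp/gst-dots gst-launch-1.0 videotestsrc num-buffers=30 ! videoconvert ! fakesink
#   render_dot_dir /tmp/gst-dots
```

gst-launch-1.0 writes one graph per state transition, so the PAUSED to PLAYING graph is usually the one that shows fully negotiated caps.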
# Inline caps (shorthand)
... ! video/x-raw,width=1280,height=720 ! ...
# capsfilter element (explicit)
... ! capsfilter caps="video/x-raw,width=1280,height=720" ! ...
# Multiple caps alternatives (element negotiates best match)
... ! "video/x-raw,format=I420; video/x-raw,format=NV12" ! ...
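# Range values (quoted so the shell does not treat [ ] as a glob pattern)
... ! "video/x-raw,width=[640,1920],height=[480,1080]" ! ...
# List values (quoted so bash does not brace-expand { })
... ! "video/x-raw,format={I420,NV12,YV12}" ! ...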
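Braces and square brackets are shell syntax as well as caps syntax, so in bash an unquoted list can be split into several arguments before gst-launch-1.0 ever sees it. A small sketch of the effect, using a hypothetical `count_args` helper (not part of GStreamer) that only reports how many arguments it received:

```shell
# count_args is a hypothetical helper that prints its argument count.
count_args() { echo "$#"; }

# Quoted: the caps string reaches the program as a single argument.
count_args "video/x-raw,format={I420,NV12,YV12}"
# prints: 1

# Unquoted: bash brace-expands the list into three separate words
# (shells without brace expansion, such as dash, still pass one).
#   count_args video/x-raw,format={I420,NV12,YV12}
```

Quoting any caps string that contains `{ }`, `[ ]`, `;`, or parentheses avoids this class of problem.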
When unsure about format compatibility, this chain handles most conversions:
# Video: handles format, size, and rate
... ! videoconvert ! videoscale ! videorate ! capsfilter caps="TARGET_CAPS" ! ...
# Audio: handles format, rate, and channels
... ! audioconvert ! audioresample ! capsfilter caps="TARGET_CAPS" ! ...
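The two chains above can be wrapped in small shell functions so the target caps are the only thing that changes between pipelines. This is a sketch under stated assumptions: `video_chain` and `audio_chain` are hypothetical helper names, and the caps passed in must contain no spaces because the output is meant to be word-split by the shell.

```shell
# Hypothetical helpers: print a conversion chain ending in the given caps.
video_chain() {
    printf 'videoconvert ! videoscale ! videorate ! capsfilter caps=%s\n' "$1"
}

audio_chain() {
    printf 'audioconvert ! audioresample ! capsfilter caps=%s\n' "$1"
}

# Usage (assumes gst-launch-1.0 is installed; $(...) is left unquoted on
# purpose so the shell splits the chain into separate arguments):
#   gst-launch-1.0 videotestsrc num-buffers=60 ! \
#       $(video_chain video/x-raw,format=I420,width=1280,height=720,framerate=30/1) ! fakesink
#   gst-launch-1.0 audiotestsrc num-buffers=100 ! \
#       $(audio_chain audio/x-raw,rate=44100,channels=2) ! fakesink
```

Caps that use ranges or lists are better written directly in the pipeline with quotes, since the unquoted substitution is still subject to glob expansion.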
- videoconvert is nearly zero-cost when input and output formats already match, so it is safe to insert liberally
- Use gst-inspect-1.0 to understand what an element can accept
- GST_DEBUG=GST_CAPS:4 shows the negotiation process step by step
- Parsers (h264parse, aacparse, etc.) are often needed between encoders and muxers (or between demuxers and decoders) to ensure proper stream framing
- decodebin3 auto-inserts parsers and decoders, but for custom pipelines you often need manual conversion elements