plugins/media-curator/skills/find-sources/SKILL.md
Discover and rank media sources across platforms for an artist or specific content
npx skillsauth add jmagly/aiwg find-sources
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Discover and rank media sources across YouTube, Internet Archive, Bandcamp, SoundCloud, and other platforms. Intelligently prioritizes official releases, lossless formats, and verified channels while filtering out low-quality duplicates.
Automate the tedious process of hunting down the best available version of media content across fragmented platforms. Instead of manually checking Bandcamp, YouTube, archive.org, and streaming services, this command orchestrates a comprehensive search and returns a ranked, deduplicated list of sources with quality scoring.
Key Benefits:
<artist> - Artist or creator name to search for
Examples:
- "Tool"
- "Pink Floyd"
- "Grateful Dead"
- "Tame Impala"
--scope <scope> - Limit search scope (default: complete)
- complete - Full artist catalog (all albums, singles, live recordings)
- album:NAME - Specific album only (e.g., album:"Dark Side of the Moon")
- era:NAME - Time period or tour (e.g., era:"1990-1995", era:"Lateralus Tour")
- track:NAME - Single song across all versions (e.g., track:"Stairway to Heaven")
--tier <1-4> - Minimum acceptable quality tier (default: 3)
- 1 - Official/Lossless only (FLAC, verified channels, Qobuz Hi-Res)
- 2 - High Quality+ (256kbps AAC, 1080p video, official sources)
- 3 - Standard+ (128kbps+, 720p+, most YouTube content)
- 4 - Accept all (includes phone recordings, low-quality rips)
--output <file> - Save results to YAML file (default: print to stdout)
--platforms <list> - Comma-separated platform filter (default: all)
- youtube - YouTube and YouTube Music
- bandcamp - Bandcamp artist pages
- archive - Internet Archive (all collections)
- soundcloud - SoundCloud
- qobuz - Qobuz Hi-Res
- vimeo - Vimeo
Example: --platforms youtube,bandcamp,archive
--sort <field> - Sort results by field (default: quality_score)
- quality_score - Tier + format + verification score (0-100)
- tier - Tier 1 first, then tier 2, etc.
- upload_date - Newest first
- platform - Alphabetical by platform name
--limit <n> - Maximum results to return (default: unlimited)
--verified-only - Only include verified/official sources (boolean flag)
--include-unavailable - Include geo-blocked or restricted content (default: exclude)
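The --scope values listed above share a KIND:NAME shape, so they can be split with plain parameter expansion. The following is a minimal sketch, not the skill's actual parser; `parse_scope` is a hypothetical helper name, and the tab-separated output is an assumption chosen for easy downstream handling.

```shell
# Hypothetical helper: split a --scope value into "kind<TAB>name".
# Accepts: complete, album:NAME, era:NAME, track:NAME (as listed above).
parse_scope() {
  case $1 in
    complete)
      printf 'complete\t\n'
      ;;
    album:*|era:*|track:*)
      kind=${1%%:*}        # text before the first colon
      name=${1#*:}         # text after the first colon
      name=${name#\"}      # strip one leading double quote, if present
      name=${name%\"}      # strip one trailing double quote, if present
      printf '%s\t%s\n' "$kind" "$name"
      ;;
    *)
      echo "unknown scope: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `parse_scope 'album:"Dark Side of the Moon"'` yields the kind `album` and the name `Dark Side of the Moon`, while an unrecognized value returns a non-zero status.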
aiwg find-sources "Tool" --tier 2 --output tool-sources.yaml
What it does:
- Saves the results to tool-sources.yaml
Expected output: 30-50 sources including studio albums (FLAC from Bandcamp), official music videos (1080p YouTube), live concerts (archive.org soundboard recordings).
aiwg find-sources "Pink Floyd" --scope 'album:"The Dark Side of the Moon"' --tier 1 --verified-only
What it does:
Expected output: 3-8 sources including Qobuz 24/96 FLAC, Bandcamp official FLAC (if available), YouTube Topic channel official audio.
aiwg find-sources "Grateful Dead" --scope 'era:"1977"' --platforms archive --output gd1977.yaml
What it does:
- Saves the results to gd1977.yaml
Expected output: 50-100 sources (Grateful Dead has extensive archive.org presence with official taper policy).
aiwg find-sources "Led Zeppelin" --scope 'track:"Stairway to Heaven"' --tier 2
What it does:
Expected output: 10-20 sources including studio album version, live at Madison Square Garden, BBC Sessions version, 2014 remaster, etc.
aiwg find-sources "Tame Impala" --platforms youtube,bandcamp --sort upload_date --limit 10
What it does:
Expected output: Recent releases, new music videos, latest Bandcamp uploads.
Priority Search - Highest quality, verified sources
Bandcamp
# Search Bandcamp for artist
curl -s "https://bandcamp.com/search?q=$(echo "$ARTIST" | sed 's/ /+/g')&item_type=b"
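The `sed` substitution above only handles spaces, so an artist name containing `&`, `/`, or quotes would produce a malformed query string. One possible approach is a small percent-encoding helper; `urlencode` is a hypothetical name, and this sketch assumes ASCII input only (multi-byte characters would need byte-wise handling).

```shell
# Hypothetical helper: percent-encode a string for use in a URL query.
# Assumes ASCII input; RFC 3986 unreserved characters pass through unchanged.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    case $c in
      [A-Za-z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;   # "'c" yields the character code
    esac
    s=$rest
  done
  printf '%s\n' "$out"
}
```

With this in place, the Bandcamp query could be written as `q=$(urlencode "$ARTIST")`, and the same helper would apply to the other `curl` calls that interpolate `${ARTIST}`.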
Qobuz (if available)
# Check Qobuz for Hi-Res releases
# Requires API key or web scraping
YouTube Verified Channels
yt-dlp --dump-json "https://www.youtube.com/@${ARTIST_HANDLE}/videos"
YouTube Topic Channels
# --dump-json prints metadata for the hit instead of downloading it
yt-dlp --dump-json "ytsearch:${ARTIST} - Topic"
Historical and Live Content
Internet Archive API
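# -g (--globoff) stops curl from parsing fl[] as a glob range; URL-encode ${ARTIST} if it contains spaces
curl -g "https://archive.org/advancedsearch.php?q=creator:(${ARTIST})&fl[]=identifier,title,date,format,avg_rating&rows=100&output=json"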
Etree Collection (for jam bands)
curl "https://archive.org/advancedsearch.php?q=collection:etree+AND+creator:(${ARTIST})&output=json"
Live Music Archive
curl "https://archive.org/advancedsearch.php?q=collection:etree+AND+subject:(${ARTIST})&output=json"
Broad Search with Quality Filtering
Search Query Construction
# For complete catalog
QUERY="${ARTIST} official HD"
# For album
QUERY="${ARTIST} ${ALBUM_NAME} full album"
# For track
QUERY="${ARTIST} ${TRACK_NAME} official"
Execute Search
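# yt-dlp emits one JSON object per result, so select fields per object rather than iterating with .[]
yt-dlp --dump-json "ytsearch50:${QUERY}" | jq -c '{url: .webpage_url, title, channel, height, abr}'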
Quality Filter
Deduplication
Additional Sources
SoundCloud
# SoundCloud search (requires API or web scraping)
curl -s "https://soundcloud.com/search/sounds?q=${ARTIST}"
Vimeo
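# The Vimeo API requires an access token; VIMEO_TOKEN is a placeholder for your own
curl -H "Authorization: bearer ${VIMEO_TOKEN}" "https://api.vimeo.com/videos?query=${ARTIST}"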
Dailymotion (fallback)
Score Each Source (0-100 scale)
Quality Score = Tier Points + Format Points + Verification Points + Completeness Points
Tier Points (40 max):
- Tier 1: 40
- Tier 2: 30
- Tier 3: 20
- Tier 4: 10
Format Points (30 max):
Audio:
- Lossless (FLAC, WAV): 30
- 320kbps: 25
- 256kbps: 20
- 192kbps: 15
- 128kbps: 10
- <128kbps: 5
Video:
- 4K (2160p): 30
- 1080p: 25
- 720p: 20
- 480p: 10
- <480p: 5
Verification Points (15 max):
- Official channel (✓): 15
- Verified uploader: 10
- High-reputation uploader: 5
- Unknown: 0
Completeness Points (15 max):
- Complete album/concert: 15
- Partial (>50% of content): 10
- Single track: 5
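The rubric above can be sketched as a handful of lookup functions plus a sum. This is an illustration under stated assumptions, not the skill's actual scoring code: the function names and argument labels (`official`, `verified`, `reputable`, `complete`, `partial`, `single`) are invented here, and listed bitrates and resolutions are treated as lower bounds (so 300 kbps scores as 256kbps).

```shell
# Tier Points (40 max)
tier_points() {
  case $1 in
    1) echo 40 ;; 2) echo 30 ;; 3) echo 20 ;; 4) echo 10 ;;
    *) return 1 ;;
  esac
}

# Format Points (30 max)
# $1 = audio|video; $2 = "lossless", a bitrate in kbps, or a video height in pixels
format_points() {
  if [ "$1" = audio ]; then
    if [ "$2" = lossless ]; then echo 30
    elif [ "$2" -ge 320 ]; then echo 25
    elif [ "$2" -ge 256 ]; then echo 20
    elif [ "$2" -ge 192 ]; then echo 15
    elif [ "$2" -ge 128 ]; then echo 10
    else echo 5
    fi
  elif [ "$1" = video ]; then
    if [ "$2" -ge 2160 ]; then echo 30
    elif [ "$2" -ge 1080 ]; then echo 25
    elif [ "$2" -ge 720 ]; then echo 20
    elif [ "$2" -ge 480 ]; then echo 10
    else echo 5
    fi
  else
    return 1
  fi
}

# Verification Points (15 max)
verification_points() {
  case $1 in
    official) echo 15 ;; verified) echo 10 ;; reputable) echo 5 ;; unknown) echo 0 ;;
    *) return 1 ;;
  esac
}

# Completeness Points (15 max)
completeness_points() {
  case $1 in
    complete) echo 15 ;; partial) echo 10 ;; single) echo 5 ;;
    *) return 1 ;;
  esac
}

# quality_score TIER KIND VALUE VERIFICATION COMPLETENESS
quality_score() {
  t=$(tier_points "$1") || return 1
  f=$(format_points "$2" "$3") || return 1
  v=$(verification_points "$4") || return 1
  c=$(completeness_points "$5") || return 1
  echo $((t + f + v + c))
}
```

For instance, `quality_score 1 audio lossless official complete` sums to the 100-point maximum (40 + 30 + 15 + 15), and `quality_score 3 video 720 unknown single` gives 20 + 20 + 0 + 5.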
YAML Format (if --output specified):
search_query:
  artist: "Artist Name"
  scope: "complete"
  tier_minimum: 2
  platforms: ["youtube", "bandcamp", "archive"]
  search_date: "2026-02-14T10:30:00Z"
sources_found:
  - url: "https://bandcamp.com/album/example"
    platform: "bandcamp"
    type: "album"
    tier: 1
    quality_score: 95
    format: "FLAC"
    bitrate: "lossless"
    resolution: "N/A"
    size: "450 MB"
    tracks: 12
    notes: "Official artist upload, includes booklet PDF"
    verified: true
    upload_date: "2023-05-15"
  - url: "https://youtube.com/watch?v=VIDEO_ID"
    platform: "youtube"
    type: "video"
    tier: 2
    quality_score: 85
    format: "mp4"
    bitrate: "256 kbps AAC"
    resolution: "1080p"
    duration: "3:45"
    views: 1250000
    notes: "Official music video"
    verified: true
    upload_date: "2023-05-20"
    channel: "Artist Official"
    channel_verified: true
quality_summary:
  tier_1_sources: 5
  tier_2_sources: 15
  tier_3_sources: 8
  tier_4_sources: 0
  total_sources: 28
  average_quality_score: 78.5
  platforms_searched: ["youtube", "bandcamp", "archive", "soundcloud"]
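The counts and average in `quality_summary` follow directly from the tier and score of each source. As a rough sketch of that arithmetic, assuming the sources have already been reduced to `tier score` pairs (one per line) and using a hypothetical helper name `summarize`:

```shell
# Hypothetical helper: read "tier score" lines on stdin and print
# the per-tier counts, the total, and the average score.
summarize() {
  awk '{ n[$1]++; sum += $2; total++ }
       END {
         for (t = 1; t <= 4; t++) printf "tier_%d_sources: %d\n", t, n[t]
         printf "total_sources: %d\n", total
         if (total) printf "average_quality_score: %.1f\n", sum / total
       }'
}
```

Piping `printf '1 95\n2 85\n3 65\n'` into `summarize` reports one source in each of tiers 1 to 3, none in tier 4, and an average of 81.7.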
Stdout Format (if no --output):
Found 28 sources for "Artist Name" (tier 2+)

Tier 1 Sources (5):
  [95] Bandcamp FLAC - Album Name (450 MB, 12 tracks)
       https://bandcamp.com/album/example
  [92] Qobuz 24/96 - Album Name (1.2 GB)
       https://qobuz.com/album/example
  ...

Tier 2 Sources (15):
  [85] YouTube 1080p - Music Video ✓
       https://youtube.com/watch?v=VIDEO_ID
  [82] Archive.org FLAC - Live 2023-06-10 (Soundboard)
       https://archive.org/details/IDENTIFIER
  ...

Tier 3 Sources (8):
  [65] YouTube 720p - Live Performance
       https://youtube.com/watch?v=ANOTHER_ID
  ...

Quality Summary:
  Average Score: 78.5
  Total Sources: 28
  Platforms: YouTube, Bandcamp, Archive.org, SoundCloud
| Field | Type | Description |
|-------|------|-------------|
| url | string | Direct link to source |
| platform | string | Platform name (youtube, bandcamp, etc.) |
| type | string | Content type (album, video, concert, track) |
| tier | integer | Quality tier (1-4) |
| quality_score | integer | Overall quality score (0-100) |
| format | string | File format (FLAC, mp4, mp3, etc.) |
| bitrate | string | Audio bitrate or "lossless" |
| resolution | string | Video resolution or "N/A" |
| size | string | File/download size (if available) |
| duration | string | Track/video duration (if available) |
| tracks | integer | Number of tracks (for albums) |
| notes | string | Additional context |
| verified | boolean | Official/verified source |
| upload_date | string | ISO 8601 date |
| channel | string | Uploader/channel name (if applicable) |
| taper | string | Taper name (for archive.org concerts) |
| views | integer | View count (if available) |
If a platform is down or unreachable:
- Warning: Could not reach ${platform}
- platforms_unavailable: ["platform"]
If rate-limited by a platform:
- Rate limited by ${platform}, retrying in ${delay}s
If no sources match criteria:
- sources_found: [] with explanation
If content is geo-blocked:
- notes field: "Geo-blocked in current region"
- --include-unavailable flag set
# Find all Tier 1 sources and download them
aiwg find-sources "Artist" --tier 1 --output sources.yaml
# -r prints raw URLs; IFS= and read -r keep each line intact
yq -r '.sources_found[].url' sources.yaml | while IFS= read -r url; do
  yt-dlp "$url"
done
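A download loop like the one above can hit the rate-limit case described under error handling. One way to approximate the "retrying in ${delay}s" behavior is a generic retry wrapper with exponential backoff. This is a sketch only: `retry_with_backoff` is a hypothetical helper, the doubling delay is an assumption, and it treats any non-zero exit as retryable rather than detecting rate limiting specifically.

```shell
# Hypothetical helper: retry_with_backoff PLATFORM MAX_ATTEMPTS DELAY_SECONDS CMD [ARGS...]
# Re-runs CMD until it succeeds or MAX_ATTEMPTS is reached, doubling the delay each time.
retry_with_backoff() {
  platform=$1
  max_attempts=$2
  delay=$3
  shift 3
  attempt=1
  while :; do
    "$@" && return 0
    [ "$attempt" -ge "$max_attempts" ] && return 1
    echo "Rate limited by ${platform}, retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

Inside the loop, the download line would then become something like `retry_with_backoff youtube 4 5 yt-dlp "$url"`, which gives up after four attempts with waits of 5, 10, and 20 seconds.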
Users can override quality scoring by:
- quality_score field
- yq or jq
# Find sources, filter to YouTube, download
aiwg find-sources "Artist" --platforms youtube --output yt.yaml
aiwg download-queue --input yt.yaml --format best
Solution: Use --limit, tighten the --tier threshold (a lower number, e.g. --tier 1), or narrow --scope
Solution: Relax the threshold to --tier 2 or --tier 3, or check whether the artist has an official Bandcamp or Qobuz presence
Solution: Use a more specific artist name (e.g., "Pink Floyd UK" vs "Pink Floyd tribute")
Solution: Use --sort upload_date to prioritize recent uploads
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/agents/source-discoverer.md
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/youtube-acquisition/SKILL.md (read its Prerequisites section before triggering YouTube downloads: EJS / PO-token / pip-vs-zipapp gotchas)
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/agents/queue-manager.md