plugins/media-curator/skills/check-completeness/SKILL.md
Analyze collection completeness against canonical discography and generate prioritized gap report
npx skillsauth add jmagly/aiwg check-completenessInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Analyze collection completeness by comparing actual inventory against canonical discography, identify gaps, score coverage, and generate prioritized acquisition plans.
Answer collection completeness questions:
Produces comprehensive gap reports with actionable acquisition strategies prioritized by importance, availability, and cost.
<artist> - Artist name (quoted if contains spaces)
"Pink Floyd" or "The Beatles"--collection <path> - Collection root directory
/mnt/media/Music--gaps-only - Show only missing items, skip completeness scoring
--priority <level> - Filter gaps by priority level
high, medium, low, all (default: all)--priority high shows only critical gaps--output <file> - Write report to file instead of stdout
.yaml, .json, .md--output gaps-report.yaml--canonical-source <source> - Canonical discography source
musicbrainz (default), discogs, allmusic--canonical-source discogs--include-legendary - Include legendary/bootleg content in analysis
--quality-threshold <level> - Minimum quality for "complete" status
lossless, 320kbps, 256kbps, any (default: any)--quality-threshold lossless treats 320kbps as needing upgrade--create-gap-notes - Generate GAP-NOTE.md files for missing items
--incremental - Incremental update mode
--cost-estimate - Include cost estimates in report
# Use existing scan results if available
SCAN_FILE=".media-curator/scans/${ARTIST_SLUG}/scan-latest.json"
if [[ -f "$SCAN_FILE" ]]; then
INVENTORY=$(cat "$SCAN_FILE")
else
# Trigger fresh scan
/scan-collection "$ARTIST" --format json --output "$SCAN_FILE"
INVENTORY=$(cat "$SCAN_FILE")
fi
Inventory provides:
# Check for cached canonical data
CANONICAL_FILE=".media-curator/completeness/${ARTIST_SLUG}/canonical.yaml"
CACHE_AGE_DAYS=$((( $(date +%s) - $(stat -c %Y "$CANONICAL_FILE" 2>/dev/null || echo 0) ) / 86400))
if [[ $CACHE_AGE_DAYS -gt 7 ]] || [[ ! -f "$CANONICAL_FILE" ]]; then
# Fetch fresh canonical discography
# MusicBrainz API example:
MBID=$(musicbrainz-lookup "$ARTIST" --type artist --format json | jq -r '.artists[0].id')
RELEASES=$(musicbrainz-releases "$MBID" --format json)
# Parse into canonical.yaml structure
echo "$RELEASES" | parse-canonical > "$CANONICAL_FILE"
fi
Canonical discography includes:
MusicBrainz advantages:
Discogs alternative:
If --include-legendary flag set:
# Load legendary performances catalog
LEGENDARY_FILE=".media-curator/completeness/${ARTIST_SLUG}/legendary.yaml"
if [[ ! -f "$LEGENDARY_FILE" ]]; then
# Web search for "artist legendary performances bootleg"
# Or load from community-maintained catalog
# Or use private tracker tags (if integrated)
fetch-legendary-catalog "$ARTIST" > "$LEGENDARY_FILE"
fi
Extended catalog includes:
# Comparison logic pseudocode
for release in canonical_releases:
match = find_in_inventory(release.title, release.year)
if match:
if match.track_count == release.track_count:
release.status = "complete"
else:
release.status = "partial"
release.missing_tracks = release.track_count - match.track_count
release.quality = match.quality
release.source = match.source
else:
release.status = "missing"
Fuzzy matching rules:
# Weighted scoring formula
weights = {
'studio_albums': 0.40,
'eps': 0.15,
'singles': 0.15,
'live_albums': 0.10,
'notable_covers': 0.10,
'legendary': 0.10
}
component_scores = {}
for category, weight in weights.items():
total = len(canonical[category])
complete = sum(1 for item in canonical[category] if item.status == 'complete')
partial = sum(0.5 for item in canonical[category] if item.status == 'partial')
component_scores[category] = ((complete + partial) / total * 100) if total > 0 else 100
overall_score = sum(component_scores[cat] * weights[cat] for cat in weights)
# Quality bonus
lossless_pct = (lossless_count / total_count) * 100
if lossless_pct > 80:
overall_score += 5
# Gap prioritization logic
for release in canonical_releases:
if release.status == "missing" or release.status == "partial":
priority = calculate_priority(release)
availability = check_availability(release)
cost = estimate_cost(release, availability)
gaps.append({
'item': release.title,
'type': release.type,
'year': release.year,
'priority': priority,
'availability': availability,
'estimated_cost': cost,
'reason': priority_reason(release)
})
gaps.sort(key=lambda x: (priority_order[x['priority']], x['year']))
Priority calculation:
def calculate_priority(release):
score = 0
# Type importance
if release.type == 'studio_album':
score += 40
elif release.type in ['ep', 'single']:
score += 15
elif release.type == 'live_album':
score += 10
# Critical acclaim (if available)
if release.rating and release.rating > 8.0:
score += 20
# Classic period (artist-specific)
if release.year in artist_classic_years:
score += 15
# Completes a set
if is_part_of_trilogy(release) and have_other_parts(release):
score += 10
# Historical significance
if release.legendary or release.historically_significant:
score += 25
# Assign tier
if score >= 60:
return "HIGH"
elif score >= 30:
return "MEDIUM"
else:
return "LOW"
def check_availability(release):
sources = []
# Streaming services (easy, legal, lossy)
if available_on_streaming(release):
sources.append({
'type': 'streaming',
'services': ['Apple Music', 'Spotify', 'Tidal'],
'quality': '256kbps AAC',
'cost': 'subscription',
'difficulty': 'easy'
})
# Digital purchase (legal, often lossless)
if available_on_bandcamp(release):
sources.append({
'type': 'digital_purchase',
'platform': 'Bandcamp',
'quality': 'FLAC / 320kbps MP3',
'cost': '$7-12',
'difficulty': 'easy'
})
# Physical (CD/vinyl)
discogs_listings = check_discogs(release)
if discogs_listings:
sources.append({
'type': 'physical',
'format': 'CD',
'availability': f"{len(discogs_listings)} listings",
'price_range': f"${min(discogs_listings)} - ${max(discogs_listings)}",
'difficulty': 'moderate'
})
# Private trackers (torrents, requires ratio)
if user_has_tracker_access():
tracker_results = search_trackers(release)
if tracker_results:
sources.append({
'type': 'private_tracker',
'sites': tracker_results.sites,
'quality': 'FLAC',
'cost': 'ratio',
'difficulty': 'moderate' if tracker_results.seeded else 'hard'
})
return sources
Output format depends on --output extension or defaults to rich terminal display:
Terminal output (default):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Pink Floyd Collection Completeness Report
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📊 Overall Score: 92.3% (Excellent Collection)
🎵 Canonical Releases: 94.1%
✅ Studio Albums: 16/17 (94.1%)
✅ EPs: 4/4 (100%)
✅ Singles: 24/30 (80.0%)
✅ Live Albums: 12/15 (80.0%)
🎸 Extended Content: 78.5%
✅ Notable Covers: 8/12 (66.7%)
✅ Soundtracks: 3/5 (60.0%)
⚠️ Legendary: 12/47 (25.5%)
💿 Quality Distribution:
🟢 Lossless (FLAC): 87% (298 files)
🟡 High (320kbps): 11% (38 files)
🔴 Medium (256kbps): 2% (6 files)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Top 5 Acquisition Priorities
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔴 HIGH: The Endless River (2014)
Missing: Complete studio album
Reason: Only missing album from official discography
Sources:
• Apple Music (stream rip, 256kbps) - FREE with subscription
• Bandcamp (FLAC) - $9.99
• Discogs (used CD) - $12-18
Recommended: Bandcamp FLAC purchase
🔴 HIGH: Live at Pompeii (1972) - Soundboard
Missing: Legendary performance
Reason: Exceptional quality, historically significant
Sources:
• Private tracker (FLAC soundboard) - Ratio cost
• YouTube rip (256kbps, unofficial) - FREE but low quality
Recommended: Private tracker if available
🟡 MEDIUM: Obscured by Clouds B-sides
Missing: 3 tracks from singles
Reason: Completes album era, unique content
Sources:
• Compilation "Works" (1983) - Discogs $15-20
• Individual singles (rare) - eBay ~$30 each
Recommended: Works compilation
🟡 MEDIUM: The Wall (24bit/96kHz remaster)
Upgrade: Current 320kbps → Lossless remaster
Reason: Significant audio improvement, bonus tracks
Sources:
• HDtracks (24bit FLAC) - $19.99
• Qobuz (24bit streaming) - Subscription
Recommended: Wait for HDtracks sale (<$15)
🟢 LOW: Relics (1971 compilation)
Missing: Compilation of earlier singles
Reason: Low priority, all tracks owned on original albums
Sources:
• Streaming (256kbps) - FREE
• CD (out of print) - eBay $20-30
Recommended: Stream rip if desired for convenience
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Acquisition Plan Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Next 5 acquisitions:
1. The Endless River - Bandcamp FLAC ($9.99)
2. Live at Pompeii - Private tracker (ratio)
3. Obscured by Clouds B-sides - Works CD ($18)
4. The Wall remaster - Wait for sale (~$15)
5. (Optional) Relics - Stream rip (FREE)
💰 Estimated cost: $43-60 (excluding sale wait items)
⏱️ Estimated time: 1-2 weeks
🎯 Next score: ~96.5% (Completionist tier)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Report saved to: .media-curator/completeness/pink-floyd/gap-report-2026-02-14.yaml
GAP-NOTE files created: 5 items
YAML output (--output report.yaml):
Uses the full gap report structure defined in the Completeness Tracker agent documentation.
Markdown output (--output report.md):
Human-readable report with sections, tables, and acquisition instructions.
for gap in high_priority_gaps:
NOTE_PATH="${COLLECTION}/${ARTIST}/${gap.year} - ${gap.title}/GAP-NOTE.md"
mkdir -p "$(dirname "$NOTE_PATH")"
cat > "$NOTE_PATH" << EOF
# GAP-NOTE: ${gap.title} (${gap.year})
## Expected Content
- ${gap.type}: ${gap.track_count} tracks
- ${gap.description}
- Rating: ${gap.rating}/10
## Acquisition Strategy
${format_acquisition_steps(gap.sources)}
## Priority
**${gap.priority}** - ${gap.reason}
## Sources
${format_source_links(gap)}
- Last searched: Never
EOF
done
/check-completeness "Pink Floyd" --collection /mnt/media/Music
Output: Full terminal report with scores and top 5 priorities
/check-completeness "The Beatles" --gaps-only --priority high
Output:
High-Priority Gaps for The Beatles:
1. Let It Be Sessions (1970) - Legendary unreleased sessions
2. Live at Hollywood Bowl (original 1977) - Out of print, reissue available
3. Anthology outtakes (various) - 12 tracks not on official release
/check-completeness "Led Zeppelin" \
--include-legendary \
--create-gap-notes \
--output gaps-report.yaml
Creates:
gaps-report.yaml - Full structured report# Weekly cron job
/check-completeness "Grateful Dead" \
--incremental \
--quality-threshold lossless \
--output /reports/weekly-$(date +%F).yaml
Fast execution, only re-fetches canonical if stale.
Before completeness check:
/scan-collection "Artist" --format json # Generates inventory
After completeness check:
/acquisition-plan "Artist" --from-gaps gaps-report.yaml # Plan downloads
/quality-audit "Artist" --upgrades-only # Find quality improvement targets
Parallel gap filling:
/completeness-fill "Artist" --priority high --source streaming &
/completeness-fill "Artist" --priority high --source physical &
wait
/check-completeness "Artist" --incremental # Re-check after acquisition
MusicBrainz API failure:
Artist not found:
No collection inventory:
/scan-collection if not foundExpected execution time:
Rate limiting:
All output written to .media-curator/completeness/{artist-slug}/:
.media-curator/completeness/pink-floyd/
├── canonical.yaml # Cached canonical discography
├── legendary.yaml # Extended content catalog
├── state.json # Completeness tracking state
├── gap-report-2026-02-14.yaml
├── gap-report-2026-01-15.yaml # Historical reports
└── history.json # Completeness score over time
data-ai
Report which research-corpus radar sidecars are overdue for refresh. Computes staleness (days since last refresh vs the cadence window) for every radar, sorted most-overdue-first. Runs via `aiwg corpus radar-status`.
data-ai
Aggregate research-corpus radar sidecars into a corpus or per-cluster freshness report — totals, overdue count, per-cluster / per-GRADE / per-trajectory breakdowns, an overdue table, and per-radar rationale snippets. Runs via `aiwg corpus radar-report`.
testing
Scaffold radar/freshness sidecars for research-corpus REFs. Pulls title/authors from the citation sidecar and GRADE from the analysis doc, defaults the refresh cadence from GRADE and the cluster from a corpus-local map, and stamps documentation/radar/REF-XXX-radar.md. Runs via `aiwg corpus radar-init`.
data-ai
Compute an entity's publication trajectory — per-year paper counts, topic drift, hot-streak detection (≥3 consecutive A-grade years), and career phase. Runs via `aiwg corpus profile-temporal`.