.claude/skills/ingest-reddit/SKILL.md
Extract post and comments from Reddit threads for wiki ingestion. Appends .json to any Reddit URL, no auth required for public subreddits.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add RonanCodes/llm-wiki ingest-reddit
Extract the full post content and comment tree from a Reddit thread.
Parse the URL to extract the thread path:
- https://www.reddit.com/r/{subreddit}/comments/{id}/{slug}/
- Also handle https://old.reddit.com/... and https://reddit.com/...
Fetch via JSON endpoint (no auth needed for public subreddits):
curl -sL -H "User-Agent: llm-wiki-bot/1.0" \
"https://www.reddit.com/r/{subreddit}/comments/{id}/{slug}.json?sort=top&limit=500"
Parse JSON response:
[0], the post:
- [0].data.children[0].data.title: post title
- [0].data.children[0].data.selftext: post body (markdown)
- [0].data.children[0].data.author: poster username
- [0].data.children[0].data.subreddit: subreddit name
- [0].data.children[0].data.score: upvotes
- [0].data.children[0].data.created_utc: timestamp (Unix epoch seconds, UTC)
- [0].data.children[0].data.url: link URL (for link posts)
- [0].data.children[0].data.is_self: true if text post, false if link post
- [0].data.children[0].data.num_comments: comment count
[1], the comment tree:
- [1].data.children[]: top-level comments
- Each comment carries .data.author, .data.body, .data.score, .data.created_utc
- .data.replies: a nested object with the same structure (an empty string when the comment has no replies)
- Recurse into .data.replies.data.children[] to build nested markdown
Build the comment tree as nested markdown:
- Skip children where kind === "more" (these are placeholders for additional comments in large threads; acceptable to skip for v1)
## Comments
> **u/username** (42 points)
> Comment text here, which may span
> multiple lines.
> > **u/replier** (15 points)
> > This is a reply to the above comment.
> > > **u/deep-replier** (8 points)
> > > And this is a nested reply.
> **u/another-top-level** (30 points)
> Another top-level comment.
Cap rendering at 4 levels of nesting depth to keep output readable. For deeper replies, flatten them at the 4th level.
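The rendering rules above (skip `more` placeholders, nest replies as blockquotes, flatten anything deeper than 4 levels) can be sketched as a short recursive function. This is an illustrative sketch, not the skill's actual implementation: `render_comments` and `MAX_DEPTH` are hypothetical names, and the input is assumed to be the `children` list from the parsed JSON.

```python
MAX_DEPTH = 4  # nesting cap stated in the text above


def render_comments(children, depth=1):
    """Render a list of comment children as nested blockquote lines."""
    lines = []
    # Replies deeper than MAX_DEPTH reuse the level-4 prefix (flattened).
    prefix = "> " * min(depth, MAX_DEPTH)
    for child in children:
        if child.get("kind") == "more":
            continue  # placeholder for unloaded comments; skipped for v1
        data = child["data"]
        lines.append(f"{prefix}**u/{data['author']}** ({data['score']} points)")
        for text_line in data["body"].splitlines():
            lines.append(f"{prefix}{text_line}".rstrip())
        replies = data.get("replies")
        # Reddit sends an empty string instead of an object when there are no replies.
        if isinstance(replies, dict):
            lines.extend(render_comments(replies["data"]["children"], depth + 1))
    return lines
```

Assuming `listing` holds the parsed response, `"\n".join(["## Comments", *render_comments(listing[1]["data"]["children"])])` would produce a section in the format shown above.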
Save the result to raw/<subreddit>-<slug>.md with YAML header:
---
source-url: <reddit-thread-url>
title: "<post-title>"
author: "u/<username>"
subreddit: "r/<subreddit>"
date-fetched: <today>
source-type: discussion
score: <post-score>
num-comments: <comment-count>
---
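One possible way to fill this header from the post object at `[0].data.children[0].data` is sketched below. The function name `build_frontmatter` and the `today` parameter are hypothetical. The title is quoted with `json.dumps` on the assumption that a JSON string is also a valid YAML double-quoted scalar, which keeps embedded quotes from breaking the header.

```python
import json
from datetime import date


def build_frontmatter(listing, source_url, today=None):
    """Build the YAML header from the parsed .json response (a two-item list)."""
    post = listing[0]["data"]["children"][0]["data"]
    today = today or date.today().isoformat()
    return "\n".join([
        "---",
        f"source-url: {source_url}",
        # json.dumps escapes quotes and backslashes inside the title.
        f"title: {json.dumps(post['title'], ensure_ascii=False)}",
        f"author: \"u/{post['author']}\"",
        f"subreddit: \"r/{post['subreddit']}\"",
        f"date-fetched: {today}",
        "source-type: discussion",
        f"score: {post['score']}",
        f"num-comments: {post['num_comments']}",
        "---",
    ])
```

The `today` argument exists only so the output is reproducible in tests; in normal use it defaults to the current date.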
If the .json endpoint returns a 429 (rate limit) or error:
Retry with old.reddit.com as the domain:
curl -sL -H "User-Agent: llm-wiki-bot/1.0" \
"https://old.reddit.com/r/{subreddit}/comments/{id}/{slug}.json?sort=top&limit=500"
Dependencies: none; uses only curl.