website-scraper/SKILL.md
Turn a URL into a durable knowledge artifact: searchable markdown summary + local raw text archive, so you can recall it later without revisiting the web.
npx skillsauth add memgrafter/skills-flatagents website-scraper
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Archive web pages into a personal knowledge base with structured summaries and preserved source text.
./run.sh "https://example.com/article"
# Archive a blog post
./run.sh "https://simonwillison.net/2024/Dec/19/one-shot-python-tools/"
# Archive a GitHub README page
./run.sh "https://github.com/SWE-agent/mini-swe-agent"
# Use a custom archive location
DATA_DIR=~/research ./run.sh "https://arxiv.org/abs/2501.09891"
Per URL, the scraper writes:
- {year}/{date}_{slug}.md: summary + YAML frontmatter
- {year}/raw/{date}_{slug}.txt: raw extracted text (for LLM context)
- {year}/README.md: auto-updated index row

Pipeline: fetch page → extract clean text (trafilatura) → generate validated summary/frontmatter → save summary + raw text.
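The output layout above can be sketched in a few lines of Python. This is only an illustration of the `{year}/{date}_{slug}` naming pattern, not the skill's actual code: the `slugify` rule, the ISO date format, the `data` directory, and the function names are all assumptions made for the example.

```python
import re
from datetime import date
from pathlib import Path
from urllib.parse import urlparse


def slugify(url: str, max_len: int = 60) -> str:
    """Hypothetical slug rule: host + path, lowercased, runs of other chars -> '-'."""
    parts = urlparse(url)
    text = f"{parts.netloc}{parts.path}".lower()
    slug = re.sub(r"[^a-z0-9]+", "-", text).strip("-")
    return slug[:max_len].rstrip("-") or "page"


def archive_paths(url: str, day: date, data_dir: Path) -> dict[str, Path]:
    """Build the three per-URL paths described above (assumes an ISO date stem)."""
    year_dir = data_dir / str(day.year)
    stem = f"{day.isoformat()}_{slugify(url)}"
    return {
        "summary": year_dir / f"{stem}.md",       # summary + YAML frontmatter
        "raw": year_dir / "raw" / f"{stem}.txt",  # raw extracted text
        "index": year_dir / "README.md",          # auto-updated index
    }


# "data" is a placeholder directory chosen for this example.
paths = archive_paths("https://example.com/article", date(2025, 1, 15), Path("data"))
for name, path in paths.items():
    print(f"{name}: {path.as_posix()}")
```

Grouping by year keeps each directory small, and sharing one stem between the summary and the raw text makes it easy to find the source for any summary.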
Cost: ~0.5 cents/page. Benefit: permanent, searchable, LLM-ready references.