skills/web-to-markdown/SKILL.md
Convert a web URL into cleaned Markdown with deterministic routing. Use when Codex needs to read article-like content from links and should apply source-aware fetch strategies: default to r.jina.ai for general pages (including X/Twitter), use defuddle.md for YouTube links, and use browser-impersonated extraction for WeChat/Zhihu/Feishu pages with Mozilla Readability cleanup.
npx skillsauth add rookie-ricardo/erduo-skills web-to-markdown
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Convert URLs into usable Markdown by applying domain-aware fetching routes, then return the cleaned content directly.
- `r.jina.ai`: general web + X/Twitter.
- `defuddle.md`: YouTube transcript/content extraction.
- `special-browser-fetch`: WeChat/Zhihu/Feishu.

For generic URLs (non-YouTube, non-WeChat/Zhihu/Feishu), use this fallback chain:

- `r.jina.ai` first,

Run from this skill directory (`skills/web-to-markdown`):
npm install
node scripts/url_to_markdown.mjs <url>
Return metadata with markdown:
node scripts/url_to_markdown.mjs <url> --json
Force special-site browser extraction:
node scripts/fetch_special_sites.mjs <url> --json
- Default: `https://r.jina.ai/<url>`.
- YouTube (`youtube.com`, `youtu.be`): `https://defuddle.md/<url>`.
- X/Twitter (`x.com`, `twitter.com`): `https://r.jina.ai/<url>`.
- WeChat/Zhihu/Feishu: `scripts/fetch_special_sites.mjs`.
- If the input is already a wrapped URL (`https://defuddle.md/https://...` or `https://r.jina.ai/https://...`), normalize back to the original URL and re-apply routing.

Use a two-stage strategy for WeChat/Zhihu/Feishu:
- `cuimp` HTTP/TLS impersonation first, then clean HTML with Mozilla Readability.
- `puppeteer-extra` browser impersonation.
- `cuimp`.
- `sec-ch-ua` headers.
- `CHROME_PATH` first, then system Chrome/Chromium/Edge paths.

If special-site extraction fails due to anti-bot checks, account-only pages, or network limits, report failure clearly and ask for fallback input (for example raw page text).
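The routing rules above can be sketched as a small pure function. This is a minimal illustration, not the code in `scripts/url_to_markdown.mjs`: the names `unwrapUrl`, `hostMatches`, `pickRoute`, and the `fetchUrl` field are hypothetical, and the WeChat/Zhihu/Feishu domain list is an assumption, since the skill text names the sites but not the hostnames it matches.

```javascript
// Sketch only: all names here are hypothetical, not taken from the skill's scripts.
const WRAPPER_PREFIXES = ["https://r.jina.ai/", "https://defuddle.md/"];
const YOUTUBE_DOMAINS = ["youtube.com", "youtu.be"];
// ASSUMPTION: hostnames for WeChat/Zhihu/Feishu; the skill text does not list them.
const SPECIAL_DOMAINS = ["mp.weixin.qq.com", "zhihu.com", "feishu.cn"];

// Strip any leading reader prefixes so routing sees the original URL.
function unwrapUrl(input) {
  let url = input.trim();
  let stripped = true;
  while (stripped) {
    stripped = false;
    for (const prefix of WRAPPER_PREFIXES) {
      const rest = url.slice(prefix.length);
      if (url.startsWith(prefix) && /^https?:\/\//.test(rest)) {
        url = rest;
        stripped = true;
      }
    }
  }
  return url;
}

// True when host equals one of the domains or is a subdomain of one.
function hostMatches(host, domains) {
  return domains.some((d) => host === d || host.endsWith("." + d));
}

// Map an input URL to a strategy name and the URL that should be fetched.
function pickRoute(input) {
  const resolvedUrl = unwrapUrl(input);
  const host = new URL(resolvedUrl).hostname;
  if (hostMatches(host, YOUTUBE_DOMAINS)) {
    return { strategy: "defuddle", resolvedUrl, fetchUrl: "https://defuddle.md/" + resolvedUrl };
  }
  if (hostMatches(host, SPECIAL_DOMAINS)) {
    // Special sites are fetched directly by the two-stage extractor, not via a reader prefix.
    return { strategy: "special-http-fetch", resolvedUrl, fetchUrl: resolvedUrl };
  }
  // Everything else, including x.com and twitter.com, goes through r.jina.ai.
  return { strategy: "r-jina", resolvedUrl, fetchUrl: "https://r.jina.ai/" + resolvedUrl };
}
```

Under these assumptions, `pickRoute("https://r.jina.ai/https://youtu.be/abc")` first unwraps the input to the YouTube URL and then selects the `defuddle` route, which is the "normalize and re-apply routing" behavior described in the last rule.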
For normal usage, output markdown only.
When --json is used, return:
- `source`: backend source (`r.jina.ai`, `defuddle`, `cuimp`, `browser-readability`).
- `strategy`: selected route (`r-jina`, `defuddle`, `special-http-fetch`, `special-browser-fetch-fallback`).
- `requestedUrl`: original input.
- `resolvedUrl`: normalized/final URL.
- `markdown`: extracted markdown body.

Scripts:

- `scripts/url_to_markdown.mjs`: primary entrypoint.
- `scripts/fetch_special_sites_http.mjs`: WeChat/Zhihu/Feishu HTTP impersonation fetcher (cuimp JS).
- `scripts/fetch_special_sites.mjs`: two-stage extractor (HTTP-first, browser-fallback).
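A caller that consumes the `--json` output may want to check the documented fields before using the result. The sketch below is one way to do that. `parseResult` is a hypothetical helper, and it assumes that all five fields are strings and that the `source` and `strategy` values listed above are exhaustive, which the skill text does not state.

```javascript
// Sketch only: parseResult is a hypothetical helper, not part of the skill's scripts.
// ASSUMPTION: the documented source/strategy values are the complete set.
const KNOWN_SOURCES = ["r.jina.ai", "defuddle", "cuimp", "browser-readability"];
const KNOWN_STRATEGIES = [
  "r-jina",
  "defuddle",
  "special-http-fetch",
  "special-browser-fetch-fallback",
];
// ASSUMPTION: every documented field is a string.
const STRING_FIELDS = ["source", "strategy", "requestedUrl", "resolvedUrl", "markdown"];

// Parse the stdout of a --json run and verify it has the documented shape.
function parseResult(stdout) {
  const result = JSON.parse(stdout);
  for (const field of STRING_FIELDS) {
    if (typeof result[field] !== "string") {
      throw new TypeError("missing or non-string field: " + field);
    }
  }
  if (!KNOWN_SOURCES.includes(result.source)) {
    throw new RangeError("unexpected source: " + result.source);
  }
  if (!KNOWN_STRATEGIES.includes(result.strategy)) {
    throw new RangeError("unexpected strategy: " + result.strategy);
  }
  return result;
}
```

Failing early on an unexpected shape keeps a partial or error response from being passed along as if it were article content.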