platforms/hermes/skills/social-media/x-article-canonicalization/SKILL.md
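Save X/Twitter long-form articles to Obsidian or another knowledge base with high fidelity: use `bird --json-full` to rebuild block order, keep links clickable, localize images, and distinguish in-body images that can be placed exactly from attached images whose position is still unknown.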
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

`npx skillsauth add codingsamss/ai-dotfiles x-article-canonicalization`
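Save X/Twitter long-form articles to a local knowledge base with high fidelity, especially to a canonical source-text layer such as a "Twitter Picks" folder in Obsidian.
Applicable scenarios:
Not applicable: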
- Use `bird read <url> --json-full`.
- Do not treat `--plain` as the canonical primary source.
- Rebuild the body from `content_state.blocks`.

Example invocation through a local proxy:

    HTTP_PROXY=http://127.0.0.1:7897 HTTPS_PROXY=http://127.0.0.1:7897 \
    bird --cookie-source chrome --timeout 30000 read <tweet-url> --json-full > /tmp/x-article.json
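Once the export is on disk, the article payload has to be located inside it. The following is a minimal sketch, not Bird's documented schema: it assumes only that an `article_results` object with a `result` member appears somewhere in the JSON, and it searches for it recursively because the exact nesting is not stated here. The helper names `find_key` and `load_article_result` are hypothetical.

```python
import json
from typing import Any, Optional


def find_key(node: Any, key: str) -> Optional[Any]:
    """Depth-first search for the first value stored under `key`."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def load_article_result(path: str) -> dict:
    """Return the `article_results.result` object from a saved export.

    Assumption: the export contains an `article_results` object somewhere;
    its nesting depth is not assumed.
    """
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    article_results = find_key(data, "article_results")
    if not isinstance(article_results, dict) or "result" not in article_results:
        raise ValueError("no article_results.result found in the export")
    return article_results["result"]
```

With the command above, `load_article_result("/tmp/x-article.json")` would return the object that holds `content_state`, `cover_media`, and `media_entities`.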
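Key fields:
- `article_results.result.content_state.blocks`
- `article_results.result.content_state.entityMap`
- `article_results.result.cover_media`
- `article_results.result.media_entities`

Media comes from two sources:
- `cover_media.media_info.original_img_url` → cover image
- `media_entities[].media_info.original_img_url` → candidates for in-body or attached images

Localize them to:
- `assets/<slug>/img-0.ext`
- `assets/<slug>/img-1.ext`

Process `content_state.blocks` in order:
- `header-two` → level-2 heading
- `unstyled` → plain paragraph
- `blockquote` → block quote
- `unordered-list-item` → unordered list item
- `atomic` → check `entityMap`

Only when the `entityMap` entry for an `atomic` block is `MEDIA`, and a definite `mediaId` can be obtained, should the corresponding image be inserted back into the body at that position.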
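The block-by-block mapping can be sketched roughly as follows. This is an illustration under stated assumptions, not Bird's confirmed data model: blocks are assumed to carry `type`, `text`, and `entityRanges` (each range with a `key`), `entity_map` is assumed to be already normalized to a dict keyed by the stringified entity key, and the `data.mediaItems[0].mediaId` location is a guess that should be checked against a real export. `local_paths` is a hypothetical mapping from `mediaId` to an already localized file.

```python
from typing import Dict, List, Optional

# Markdown prefix for each text-bearing block type.
PREFIX = {
    "header-two": "## ",
    "unstyled": "",
    "blockquote": "> ",
    "unordered-list-item": "- ",
}


def media_id_of(block: dict, entity_map: Dict[str, dict]) -> Optional[str]:
    """Return the mediaId of an atomic MEDIA block, or None if unresolvable.

    Assumed shape: entityRanges[0].key -> {type, data.mediaItems[0].mediaId}.
    """
    ranges = block.get("entityRanges") or []
    if not ranges:
        return None
    entity = entity_map.get(str(ranges[0].get("key")))
    if not entity or entity.get("type") != "MEDIA":
        return None
    items = (entity.get("data") or {}).get("mediaItems") or []
    if not items:
        return None
    return items[0].get("mediaId")


def blocks_to_markdown(
    blocks: List[dict],
    entity_map: Dict[str, dict],
    local_paths: Dict[str, str],
) -> List[str]:
    """Convert blocks to markdown lines, preserving block order."""
    lines: List[str] = []
    for block in blocks:
        kind = block.get("type")
        if kind in PREFIX:
            lines.append(PREFIX[kind] + block.get("text", ""))
        elif kind == "atomic":
            media_id = media_id_of(block, entity_map)
            # Insert an image only for MEDIA with a known, localized mediaId.
            if media_id and media_id in local_paths:
                lines.append(f"![]({local_paths[media_id]})")
        else:
            # Unknown block types: keep the text rather than drop content.
            lines.append(block.get("text", ""))
    return lines
```

Returning a list of lines rather than a joined string leaves paragraph spacing to the caller, so consecutive list items can stay tight.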
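That is:
- `atomic` + `MEDIA` = image inserted in place in the body
- `atomic` + `LINK` / `TWEET` / anything else = not a body image; do not insert an image by mistake

Do not write embedded tweets as non-clickable text like this:

`[Embedded tweet: https://x.com/... ]`

Write them like this instead:

`[Embedded tweet | author / short description](https://x.com/...)`

Article and project links in the body should also be kept as markdown hyperlinks wherever possible.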
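A small helper can make the clickable form the default. This is a sketch only: the function name `embedded_tweet_link` and the English label are illustrative assumptions, and the author and description have to come from whatever metadata is actually available for the embedded tweet.

```python
def embedded_tweet_link(url: str, author: str = "", note: str = "") -> str:
    """Render an embedded tweet as a clickable markdown link."""
    label = "Embedded tweet"
    detail = " / ".join(part for part in (author.strip(), note.strip()) if part)
    if detail:
        label = f"{label} | {detail}"
    # Escape square brackets so they cannot end the link text early.
    label = label.replace("[", "\\[").replace("]", "\\]")
    return f"[{label}]({url})"
```

Calling it with only a URL still yields a valid link, so a missing author or description never degrades the output to plain text.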
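Pitfall from experience:
Bird's `media_entities` may list several images, but `content_state.blocks` does not necessarily give an exact insertion position for each of them.
When that happens, do not guess, and do not force every image back into the body.
Correct approach:
## Original attached images (the block structure exported by Bird gives no exact insertion point)
Keep every such image, one by one, under that section.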
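The split between exactly placed images and the attached-images section can be sketched like this. The names are hypothetical, and numbering figures by their position in `media_entities` is an assumption chosen so that numbers stay stable; `positioned` is the set of `mediaId` values that `atomic` + `MEDIA` blocks actually resolved to.

```python
from typing import Dict, List, Sequence, Set, Tuple

Numbered = Tuple[int, str]  # (1-based position in media_entities, mediaId)


def split_media(
    media_ids: Sequence[str], positioned: Set[str]
) -> Tuple[List[Numbered], List[Numbered]]:
    """Partition media into in-place images and appendix images."""
    in_place: List[Numbered] = []
    appendix: List[Numbered] = []
    for number, media_id in enumerate(media_ids, start=1):
        target = in_place if media_id in positioned else appendix
        target.append((number, media_id))
    return in_place, appendix


def render_appendix(
    appendix: List[Numbered], local_paths: Dict[str, str], heading: str
) -> List[str]:
    """Render the attached-images section; empty if nothing is unplaced."""
    if not appendix:
        return []
    lines = [heading, ""]
    for number, media_id in appendix:
        lines += [f"### Figure {number}", "", f"![]({local_paths[media_id]})", ""]
    return lines
```

Nothing is guessed here: an image goes into the body only if a block pointed at it, and every remaining image is still kept, just in the trailing section.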
---
frontmatter...
---

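# Chinese title
Chinese lead / summary
## Section 1
Body...

## Section 2
Body...
[Embedded tweet | description](https://x.com/i/status/...)
## Original attached images (the block structure exported by Bird gives no exact insertion point)
### Figure 2 | description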

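Before delivery, check at least the following:
- `--json-full` was used
- `cover_media` has been downloaded and referenced
- every `atomic` + `MEDIA` image has been inserted in place
- `expected_body_media_count <= downloaded_count == referenced_count_total`

Where:
- `expected_body_media_count` = number of body images that the blocks place exactly
- `downloaded_count` = number of images actually downloaded
- `referenced_count_total` = total image references, counting in-place body images plus images in the attached-images section

Wrong approach:
- treating `--plain` as the canonical source text
- assuming, just because `media_entities` is present, that every image has an exact position in the body

Correct approach:
- `--json-full`

If the article will later be ingested into llm-wiki:
- `raw/articles/*.md` source bridge
- `analysis -> generation` ingest

This keeps the high-fidelity source-text layer separate from the structured knowledge layer and avoids drift in sources paths.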
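Returning to the delivery checklist, the count invariant `expected_body_media_count <= downloaded_count == referenced_count_total` can be checked mechanically. The sketch below is illustrative: the function names are hypothetical, and the reference counter assumes the note uses standard markdown image syntax (`![alt](path)`), not Obsidian's `![[...]]` embeds.

```python
import re
from typing import List

# Matches standard markdown image references such as ![alt](path).
IMAGE_REF = re.compile(r"!\[[^\]]*\]\([^)\s]+\)")


def count_image_refs(markdown: str) -> int:
    """Count markdown image references in the finished note."""
    return len(IMAGE_REF.findall(markdown))


def check_media_counts(
    expected_body_media_count: int,
    downloaded_count: int,
    referenced_count_total: int,
) -> List[str]:
    """Return a list of violated conditions; empty means the check passes."""
    problems: List[str] = []
    if expected_body_media_count > downloaded_count:
        problems.append(
            f"blocks place {expected_body_media_count} images exactly, "
            f"but only {downloaded_count} were downloaded"
        )
    if downloaded_count != referenced_count_total:
        problems.append(
            f"{downloaded_count} images downloaded, "
            f"but {referenced_count_total} referenced in the note"
        )
    return problems
```

Returning messages instead of raising lets a caller report both violations at once before deciding whether to block delivery.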