/SKILL.md
Cross-platform content collection, web search, trending topics, confidence scoring, and watch/triage workflows for assistant and agent usage.
npx skillsauth add sunyifei83/datapulse datapulse
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
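Deployment maintenance area: each time the skill is deployed to a bot node, prepend one entry at the top of this section (date + change + impact on invocation), and keep at most the 5 most recent entries. The bot reads the latest status whenever it loads this skill.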
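- [...] qr=0); zero dependencies, generated from the OpenCC dictionary (disambiguated, so common Simplified characters are not polluted). ② xueqiu / Chinese-site result recovery (#78 + #85): anti-scraping sites such as xueqiu used to push empty-title placeholders (Untitled, qr=0) into the results. Now, when an extraction failure is detected (empty title / WAF shell), the result falls back to the search provider's snippet (with the real title and body text). This also fixes the full-text fetch error (content already consumed) on charset-less Chinese sites (cls.cn and others). Measured anchor: search("小米集团 财报", channel="finance") went from "1 hit + 3 Untitled junk results" to "5 results, all entity hits with qr=1.0"; the Traditional Chinese query search("台積電 財報") likewise returns qr=1.0 throughout. Invocation unchanged (no signature change); this is purely a ranking/extraction quality improvement.
- [...] after search("小米集团 财报", channel="finance"), the earnings article that matches the entity ranks #1 (score 75, qr=1.0). ② Boilerplate false-positive fix: genuine articles that merely mention captcha / sign in / 404 / "just a moment" are no longer down-weighted (a page is now judged a bot-wall only when it has a marker and an extremely short body); finance news homepages such as reuters / bloomberg / Xueqiu are exempted. ③ The finance channel's example query is upgraded to the field-tested template "小米集团 01810.HK 财报", and the advice "arm the query with ticker code + keywords; do not use news mode" is written into the list_channels description. Invocation unchanged (no signature change). Ticker auto-injection, composite channels, and registry persistence were judged unnecessary for now after adversarial review; the trigger conditions have been recorded.
- [...] reader.search(query, channel="finance") routes explicitly; reader.list_channels() / MCP list_channels / CLI --list-channels enumerate all channels (finance + 8 platform groups, with sites and examples), so frontends and consumers can choose by discovery instead of memorizing per-vertical tricks. ② Relevance (#68 P2): junk results are automatically down-weighted (bot-wall / "just a moment" / captcha / cookie wall / bare homepage); Tavily now uses native include_domains/exclude_domains (SEO farms such as pinterest/quora are excluded by default; override with DATAPULSE_TAVILY_EXCLUDE_DOMAINS); an optional query_relevance dimension is added (enabled via DATAPULSE_QUERY_RELEVANCE_WEIGHT), and all four core weights are tunable through environment variables. For Chinese finance, "ticker code + site: + keywords" remains the best approach (see the next entry); channel="finance" is a convenience wrapper for it.
- [...] reader.search(query, platform="finance") injects a site: whitelist of finance sources (Xueqiu / Eastmoney / 36Kr / Caixin / Cailian Press / Futu / Sina Finance / Yahoo Finance) and adds them to the source catalog with high authority (issue #68 Phase 1; for the discoverable categorized channel entry point see #72). Iron rules for Chinese finance retrieval (field-tested): ① Do not search bare: "小米股票最新消息" returns the company's official site rather than market information. ② Arm the query with ticker code (e.g. 01810.HK) + site: restriction + keywords (分析 "analysis" / 研报 "research report" / year); a ticker code works ~10× better than a company name alone, and a site: restriction works ~10× better than a broad search. ③ Do not use Tavily news mode for Chinese queries (it returns English results for Chinese); ordinary search + site: works better. ④ Jina occasionally times out while Tavily is stable, so configure both keys as a fallback. Templates: stock news site:xueqiu.com {company} OR site:eastmoney.com {company}; specific events site:36kr.com OR site:caixin.com {company} {event}; stock research reports {company} {code}.HK 分析 研报 {year}.
- [...] main). The four enhancement keys (JINA_API_KEY / TAVILY_API_KEY / GROQ_API_KEY / FIRECRAWL_API_KEY) have been injected into the runtime environment and verified, so Jina reading, Tavily search, Groq YouTube transcription, and Firecrawl fallback fetching are all active. Invocation unchanged: from datapulse_skill import run; run("<message containing a URL or instruction>"); MCP exposes 100 tools; 5 message triggers (URL ingestion / search and trending / watch monitoring / alert / triage review).
Use this skill when the user needs one or more of the following: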
datapulse_skill.run()
from datapulse_skill import run
# The message below reads "Please process these links:" followed by the URLs.
run("请处理这些链接: https://x.com/... https://www.reddit.com/...")
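The Chinese finance query templates recorded in the maintenance log above can be assembled mechanically. The following is a minimal sketch, not DataPulse code: the helper names site_query and research_query are hypothetical, the year 2024 is only an illustrative value, and site_query follows the stock-news form of the template (terms repeated after each site). The resulting strings would be passed to a search call such as reader.search(query, channel="finance").

```python
def site_query(sites, *terms):
    """Build 'site:a TERMS OR site:b TERMS' (the stock-news template form)."""
    body = " ".join(t for t in terms if t)
    return " OR ".join(f"site:{s} {body}" for s in sites)


def research_query(company, code, year):
    """Build '{company} {code}.HK 分析 研报 {year}' (the research-report template)."""
    return f"{company} {code}.HK 分析 研报 {year}"


# Hypothetical inputs, chosen to mirror the examples in the log.
news = site_query(["xueqiu.com", "eastmoney.com"], "小米集团")
report = research_query("小米集团", "01810", 2024)
```

Keeping the builders as pure string functions makes the templates easy to check before any search request is sent.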
DataPulseItem output
docs/governance/datapulse-surface-capability-catalog.draft.json
skill
DataPulse uses Playwright for platforms that require authenticated browser sessions (WeChat, Xiaohongshu). Browser automation is opt-in only: it activates when the user explicitly runs a login command and a valid session file exists. The playwright dependency is optional (pip install datapulse[browser]). No browser launches occur during normal URL reading.
- [...] subprocess.run() to communicate with MCP tool servers via subprocess_json transport (stdin/stdout JSON-RPC). All calls have explicit timeouts (30s default).
- [...] yt-dlp as a subprocess for audio transcript extraction when the native API is unavailable.
- [...] pip install --upgrade only when the user explicitly runs --upgrade.
No subprocess call runs silently or without user-initiated action.
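As an illustration of a stdin/stdout JSON-RPC exchange with an explicit timeout, here is a minimal sketch built on the standard library. It is not the DataPulse transport: call_tool and the stand-in child process are hypothetical, and only the 30-second default is taken from the text above.

```python
import json
import subprocess
import sys

# Stand-in child process (hypothetical): reads one JSON-RPC request from stdin
# and echoes the method name back as the result.
CHILD = (
    "import sys, json; req = json.load(sys.stdin); "
    "print(json.dumps({'jsonrpc': '2.0', 'id': req['id'], 'result': req['method']}))"
)


def call_tool(command, method, params=None, timeout=30.0):
    """Send one JSON-RPC request over stdin and parse the reply from stdout."""
    request = {"jsonrpc": "2.0", "id": 1, "method": method, "params": params or {}}
    proc = subprocess.run(
        command,
        input=json.dumps(request),
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
        check=True,       # raises CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout)


reply = call_tool([sys.executable, "-c", CHILD], "ping")
```

Passing the command as a list avoids shell interpretation, and the timeout keeps a hung tool server from blocking the caller.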
- [...] ~/.datapulse/sessions/ for reuse. Sessions are TTL-cached (12h) and can be invalidated via invalidate_session_cache().
- [...] data/ folder). All writes use atomic save patterns.
No data is written outside the working directory or ~/.datapulse/ without explicit user action.
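The text above does not spell out what its atomic save pattern looks like. One common approach is to write to a temporary file in the same directory and then rename it over the target. The sketch below assumes that approach; atomic_save_json and the file name inbox.json are hypothetical and not taken from DataPulse.

```python
import json
import os
import tempfile


def atomic_save_json(path, data):
    """Write JSON to a temp file in the target directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, ensure_ascii=False)
            fh.flush()
            os.fsync(fh.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic when source and target share a filesystem
    except BaseException:
        # Clean up the temp file so a failed write leaves no partial artifacts.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "inbox.json")
atomic_save_json(target, {"items": []})
atomic_save_json(target, {"items": ["a"]})
with open(target, encoding="utf-8") as fh:
    saved = json.load(fh)
```

A reader that opens the target file sees either the old content or the new content, never a half-written file.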
When the user configures alert routes, DataPulse sends POST requests to user-specified endpoints:
- [...] api.telegram.org) using a user-provided bot token
Alert delivery only fires when: (1) a watch mission matches new content, AND (2) the user has explicitly configured a route with a destination URL or token. No outbound POST occurs without user-configured routes.
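The two delivery conditions can be expressed as a single guard that runs before any POST is attempted. This is a sketch under stated assumptions: the Route record and should_deliver function are hypothetical names for illustration and do not reflect DataPulse's internal types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Route:
    """Hypothetical route record: a webhook URL or a bot token, set by the user."""
    url: Optional[str] = None
    token: Optional[str] = None

    def is_configured(self) -> bool:
        return bool(self.url or self.token)


def should_deliver(new_matches: int, route: Optional[Route]) -> bool:
    """True only if the mission matched new content AND a route is configured."""
    return new_matches > 0 and route is not None and route.is_configured()
```

For example, should_deliver(2, Route(url="https://example.invalid/hook")) is true, while a match with no route, or a route with no match, yields false and no request would be sent.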
datapulse-console starts a local FastAPI/Uvicorn HTTP server for the browser-based console UI. It binds to localhost by default and is never started automatically — only when the user explicitly runs datapulse-console or python -m datapulse.console_server.
Normal operation makes outbound GET/POST requests to:
- [...] r.jina.ai, s.jina.ai): URL reading and web search (requires JINA_API_KEY)
- [...] api.tavily.com): web search (requires TAVILY_API_KEY)
- [...] api.groq.com): YouTube audio transcription fallback (requires GROQ_API_KEY)
All API keys are read from environment variables; none are bundled or hard-coded.
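A caller can check which optional services are usable by inspecting the environment before making requests. The sketch below is illustrative only: the key names come from the list above, while KEY_CAPABILITIES and available_capabilities are hypothetical helpers, not part of DataPulse.

```python
import os

# Key names come from the text above; the capability labels paraphrase it.
KEY_CAPABILITIES = {
    "JINA_API_KEY": "URL reading and web search",
    "TAVILY_API_KEY": "web search",
    "GROQ_API_KEY": "YouTube audio transcription fallback",
}


def available_capabilities(env=None):
    """Return {key_name: capability} for every key present and non-empty in env."""
    env = os.environ if env is None else env
    return {k: v for k, v in KEY_CAPABILITIES.items() if env.get(k, "").strip()}


# A plain dict stands in for os.environ so the example does not depend on real keys.
active = available_capabilities({"JINA_API_KEY": "example-value", "GROQ_API_KEY": ""})
```

Calling available_capabilities() with no argument reads the real process environment instead.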
- [...] 3.10+
- [...] JINA_API_KEY, TAVILY_API_KEY
- [...] TG_API_ID, TG_API_HASH, GROQ_API_KEY
- pip install datapulse[browser] (Playwright)
- pip install datapulse[console] (FastAPI + Uvicorn)