skills/shumei-user-violation-audit/SKILL.md
Automate Shumei-based user violation-rate audits from MongoDB user and conversation collections, producing a CSV sorted by per-user request violation rate. Use when asked to screen users for forbidden/risky content, compute user-level violation rates, audit newly registered/free/suspicious users, or rerun a similar report with custom user filters, conversation filters, and a Shumei input-event key.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add realroc/skills shumei-user-violation-audit
Use this skill to turn a user cohort plus conversation-selection rule into a per-user CSV:
user -> conversation.query -> Shumei input event -> violating request count / successfully checked request count.
Before running, identify these values from the user or local context:
- user collection.
- conversation collection.
- accessKey for the text risk service.
- mongo_uri + mongo_db, or an app/container environment that already has Mongo access.

Do not hardcode the Shumei key in the skill. Treat it as run input.
Use the bundled script:
python3 scripts/audit_user_violation_rate.py --config config.json
Print an editable config template:
python3 scripts/audit_user_violation_rate.py --print-example-config
The script is read-only for MongoDB. It writes a UTF-8-BOM CSV and a .gz copy unless --no-gzip is passed.
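The output behavior described above can be sketched as follows. This is a minimal illustration, not the bundled script's code: `write_report` and its parameters are hypothetical names, and the `.gz` copy is assumed to sit next to the CSV with `.gz` appended to the file name.

```python
import csv
import gzip
import shutil
from pathlib import Path


def write_report(rows, fieldnames, csv_path, make_gzip=True):
    """Write rows to a UTF-8-BOM CSV and, optionally, a gzip copy (hypothetical helper)."""
    csv_path = Path(csv_path)
    # "utf-8-sig" prepends the BOM, which helps spreadsheet tools detect UTF-8
    # and display Chinese column headers correctly.
    with csv_path.open("w", encoding="utf-8-sig", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    if not make_gzip:  # stands in for the --no-gzip flag
        return csv_path, None
    # Assumed naming: report.csv -> report.csv.gz
    gz_path = csv_path.with_name(csv_path.name + ".gz")
    with csv_path.open("rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return csv_path, gz_path
```

Keeping the compressed copy as a byte-for-byte copy of the CSV means the BOM survives decompression, so either file opens the same way.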
The script accepts valid JSON. Template placeholders are resolved before querying:
- {{now}}
- {{now - 2592000}}
- {{username}}
- {{register_time}}
- {{register_time + 86400}}
- {{user.some_nested_field}}

The default conversation template matches the prior AMA audit:
{
  "user": "{{username}}",
  "request_time": {
    "$gte": "{{register_time}}",
    "$lt": "{{register_time + 86400}}"
  },
  "query": { "$exists": true, "$ne": "" }
}
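To make the placeholder step concrete, here is a minimal sketch of how such templates could be resolved against one user document. It is not the bundled script's implementation. The names `resolve`, `lookup`, and `PLACEHOLDER` are hypothetical. It also assumes that a string consisting of a single placeholder resolves to the native value (for example an integer timestamp), and that `{{user.x.y}}` walks nested fields of the user document.

```python
import re
import time

# Matches {{name}}, {{name + N}}, and {{name - N}}; name may contain dots.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*(?:([+-])\s*(\d+))?\s*\}\}")


def lookup(name, user, now):
    """Return the value for a placeholder name (assumed semantics)."""
    if name == "now":
        return now
    if name.startswith("user."):
        name = name[len("user."):]
    value = user
    for part in name.split("."):  # walk nested fields
        value = value[part]
    return value


def resolve(template, user, now=None):
    """Recursively replace placeholders in a JSON-like template."""
    now = int(time.time()) if now is None else now
    if isinstance(template, dict):
        return {k: resolve(v, user, now) for k, v in template.items()}
    if isinstance(template, list):
        return [resolve(v, user, now) for v in template]
    if not isinstance(template, str):
        return template  # booleans, numbers, and null pass through unchanged

    def value_of(match):
        base = lookup(match.group(1), user, now)
        if match.group(2):
            offset = int(match.group(3))
            base = base + offset if match.group(2) == "+" else base - offset
        return base

    whole = PLACEHOLDER.fullmatch(template)
    if whole:
        return value_of(whole)  # keep the native type, e.g. an int timestamp
    # Placeholders embedded in a longer string are substituted as text.
    return PLACEHOLDER.sub(lambda m: str(value_of(m)), template)
```

Under these assumptions, resolving the default template for a user with `register_time` 1000 yields a `request_time` range of `{"$gte": 1000, "$lt": 87400}`, that is, the first 24 hours after registration.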
Example user filter for still-free users registered in the past 30 days:
{
  "register_time": { "$gte": "{{now - 2592000}}", "$lt": "{{now}}" },
  "$and": [
    {
      "$or": [
        { "pay_history": { "$exists": false } },
        { "pay_history": [] },
        { "pay_history": null }
      ]
    },
    {
      "$or": [
        { "vip_expire_time": { "$exists": false } },
        { "vip_expire_time": { "$lte": "{{now}}" } },
        { "vip_expire_time": -1 }
      ]
    }
  ]
}
- user_filter.
- conversation_filter_template; include query: {"$exists": true, "$ne": ""} unless the user explicitly wants empty queries included.
- RESULT_JSON_START summary: row count, request count, violation count, and failed checks.

If the target MongoDB is only reachable from a server/container, copy or mount the script there and run it inside that environment. For very large CSVs over TAT/stdout, prefer leaving the report file on the server or transferring a compressed .csv.gz; avoid printing CSV contents to stdout.
The default violation rule follows the existing product behavior:
code == 1100 and riskLevel != PASS
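The rule above, together with the per-user rate (violating requests divided by successfully checked requests), might be expressed like this. This is a sketch under assumptions: each Shumei response is assumed to be a dict with `code` and `riskLevel` keys, `code == 1100` is assumed to mark a successful check, and a user with no successful checks is given a rate of 0.0. The function names are hypothetical.

```python
def is_successful_check(resp):
    """Assumption: code 1100 means the risk service processed the request."""
    return resp.get("code") == 1100


def is_violation(resp):
    """Default rule from the text: code == 1100 and riskLevel != PASS."""
    return resp.get("code") == 1100 and resp.get("riskLevel") != "PASS"


def violation_rate(responses):
    """Return (violating requests, successfully checked requests, rate)."""
    checked = [r for r in responses if is_successful_check(r)]
    violations = sum(1 for r in checked if is_violation(r))
    # Failed checks are excluded from the denominator; 0.0 for an empty
    # denominator is an assumed convention, not confirmed by the source.
    rate = violations / len(checked) if checked else 0.0
    return violations, len(checked), rate
```

Excluding failed checks from the denominator keeps service errors from diluting a user's rate, which is why failed checks are worth reporting separately.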
CSV rows are sorted by:
- 请求违禁率 (request violation rate), descending
- 违禁请求数 (violating request count), descending
- 对话请求数 (conversation request count), descending

Default columns include user identity fields, request counts, Shumei risk labels/descriptions, up to three violating query examples, and first/last selected request times.
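The three-level sort can be sketched as a single tuple key. This assumes each row is a dict whose three sort columns hold numeric values. `sort_rows` is a hypothetical name, not a function from the bundled script.

```python
def sort_rows(rows):
    """Sort by violation rate, then violating requests, then total requests, all descending."""
    return sorted(
        rows,
        key=lambda r: (r["请求违禁率"], r["违禁请求数"], r["对话请求数"]),
        reverse=True,  # every key is descending, so one reverse flag suffices
    )
```

Because all three keys sort in the same direction, a tuple key with `reverse=True` avoids negating values or chaining several sorts.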