skills/model-fallback/SKILL.md
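A complete guide to model degradation and failover in OpenClaw. Based on hands-on troubleshooting from 2026-03-31, it covers the cron model inheritance mechanism, session cache pitfalls, the LiveSessionModelSwitchError bug, and an operations SOP.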
⚠️ This document was verified hands-on against OpenClaw 2026.3.28 (f9b1079); it is not a theoretical write-up.
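Model resolution order, highest priority first:
1. cron payload.model → set individually per cron job
2. agent model.primary → agent-level configuration
3. agents.defaults.model.primary → global default
agents.defaults.subagents.model → default model for sessions_spawn subtasks
⚠️ Easy to miss! In the 2026-03-31 incident, the agent primary had already been changed but subagents.model still held the old value, so every subtask was routed to the failing provider.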
| payload.kind | payload.model | Model actually used |
|--------------|--------------|--------------|
| agentTurn | Explicitly set (e.g. zai/glm-5.1) | The specified model ✅ |
| agentTurn | Not set | Inherits the agent's primary model |
| systemEvent | N/A | No LLM involved; plain-text injection |
"payload": {
"kind": "agentTurn",
"model": "zai/glm-5.1",
"fallbacks": ["minimax/MiniMax-M2.7-highspeed", "xingsuancode/claude-opus-4-6"],
"message": "..."
}
If payload.fallbacks is omitted, the cron job inherits the agent's fallbacks configuration.
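The precedence and inheritance rules described above can be sketched as a small resolver. This is only an illustration of the documented behavior, not OpenClaw's actual code: the function name `resolve_cron_models` and the plain-dict shapes are assumptions, modeled on the `payload` fragment and the `agents.defaults` / `agents.list[]` keys that appear in this document.

```python
from __future__ import annotations


def resolve_cron_models(payload: dict, agent: dict, defaults: dict) -> tuple[str, list[str]] | None:
    """Return (model, fallbacks) for a cron payload, or None if no LLM is used.

    Hypothetical helper: mirrors the documented order
    payload.model -> agent model.primary -> agents.defaults.model.primary.
    """
    # systemEvent payloads are plain-text injection and never reach a model.
    if payload.get("kind") == "systemEvent":
        return None

    agent_model = agent.get("model", {})

    # First non-empty value wins, from most specific to least specific.
    model = (
        payload.get("model")
        or agent_model.get("primary")
        or defaults["model"]["primary"]
    )

    # An explicit payload.fallbacks overrides the agent's list;
    # otherwise the agent's fallbacks are inherited.
    fallbacks = payload.get("fallbacks")
    if fallbacks is None:
        fallbacks = agent_model.get("fallbacks", [])

    return model, list(fallbacks)
```

Calling it with an `agentTurn` payload that has no `model` key returns the agent's primary, which is exactly the case where a stale agent-level setting silently reaches every cron job that relies on inheritance.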
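OpenClaw persists sessions to disk as JSONL files:
~/.openclaw/agents/*/sessions/*.jsonl
Each session file caches the name of the model currently in use. Even after primary is changed in the config, the old model recorded in the session file is not updated automatically.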
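To see which session files still carry an old model name, the files can be scanned for `"model"` fields. The sketch below is an assumption-heavy illustration: the exact JSONL schema is not documented here, so it only pattern-matches `"model": "..."` pairs rather than parsing a known structure, and both function names are hypothetical.

```python
from __future__ import annotations

import re
from collections import Counter
from pathlib import Path
from typing import Iterable

# Matches "model":"<name>" with optional whitespace around the colon.
# Assumption: session entries store the model under a key named "model".
MODEL_RE = re.compile(r'"model"\s*:\s*"([^"]*)"')


def count_models(lines: Iterable[str]) -> Counter:
    """Count every model name referenced in the given JSONL lines."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(MODEL_RE.findall(line))
    return counts


def sessions_referencing(session_dir: Path, model: str) -> list[Path]:
    """Return the session files in session_dir that mention the given model."""
    hits = []
    for path in sorted(session_dir.glob("*.jsonl")):
        with path.open(encoding="utf-8", errors="replace") as fh:
            if model in count_models(fh):
                hits.append(path)
    return hits
```

For example, `sessions_referencing(Path.home() / ".openclaw/agents/quant/sessions", "shibacc/claude-opus-4-6")` would list the quant agent's sessions that still mention the failed model and are therefore candidates for a reset.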
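- isolated sessions (the cron default) in theory create a new session on every run, but may still reuse a cached model mapping
- session:xxx (bound to a specific session) reuses the old model 100% of the time
- /new resets the session
- Run /new on the corresponding session, or update the cron's sessionTarget

GitHub Issue: openclaw/openclaw#58406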
During fallback, if OpenClaw detects that the session's model was modified concurrently (live session model switch detected before attempt), it marks every fallback candidate as candidate_failed (reason=unknown) without actually attempting an API call.
[model-fallback/decision] candidate_failed requested=shibacc/claude-opus-4-6 candidate=zai/glm-5.1 reason=unknown
[agent/embedded] live session model switch detected before attempt: zai/glm-5.1 -> shibacc/claude-opus-4-6
The "live session model switch detection" in the fallback flow is too aggressive; it does not distinguish between:
- /model switches
# 1. Identify which provider is down
journalctl _PID=$(pgrep openclaw-gateway) --since "10 min ago" | grep -i "error\|403\|429"
# 2. Show the current primary model of every agent
cat ~/.openclaw/openclaw.json | python3 -c "
import json, sys
c = json.load(sys.stdin)
print('defaults.primary:', c['agents']['defaults']['model']['primary'])
print('subagents.model:', c['agents']['defaults']['subagents']['model'])
for a in c['agents']['list']:
    print(f'  {a[\"id\"]}: {a.get(\"model\",{}).get(\"primary\",\"(inherit)\")}')"
# 3. Switch every agent + subagents + cron payload.model in bulk
# See the "one-click switch script" below
# config.patch approach (recommended): change agent primary + fallbacks + subagents.model
# Done through the gateway tool's config.patch operation
# Bulk update of cron payload.model
# Each cron must be updated individually, setting payload.model + payload.fallbacks
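Before and after a bulk switch, it helps to list every config entry that still points at the failed provider. The following sketch walks only the keys named in this document (`agents.defaults.model.primary`, `agents.defaults.subagents.model`, `agents.list[].model.primary`, `agents.list[].model.fallbacks`); the function name `find_provider_refs` is hypothetical, and it assumes model IDs follow the `provider/model` form seen in the examples here.

```python
from __future__ import annotations


def find_provider_refs(config: dict, provider: str) -> list[str]:
    """List config paths whose model ID starts with '<provider>/'."""
    prefix = provider + "/"
    agents = config.get("agents", {})
    defaults = agents.get("defaults", {})
    hits: list[str] = []

    def check(path: str, value: object) -> None:
        # Only string model IDs of the form provider/model are considered.
        if isinstance(value, str) and value.startswith(prefix):
            hits.append(f"{path} = {value}")

    check("agents.defaults.model.primary", defaults.get("model", {}).get("primary"))
    # The entry that was missed in the 2026-03-31 incident.
    check("agents.defaults.subagents.model", defaults.get("subagents", {}).get("model"))

    for agent in agents.get("list", []):
        name = agent.get("id", "?")
        model = agent.get("model", {})
        check(f"agents.list[{name}].model.primary", model.get("primary"))
        for i, fallback in enumerate(model.get("fallbacks", [])):
            check(f"agents.list[{name}].model.fallbacks[{i}]", fallback)

    return hits
```

Loading `~/.openclaw/openclaw.json` with `json.load` and passing the result with the provider name `"shibacc"` should return an empty list once the switch is complete. Cron `payload.model` values are not covered, since where cron definitions are stored is not stated here.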
# Count the models referenced across all session files
grep -rh '"model"' ~/.openclaw/agents/*/sessions/*.jsonl 2>/dev/null \
| grep -oP '"model":"[^"]*"' | sort | uniq -c | sort -rn | head -20
# List a specific agent's most recently active sessions
ls -lt ~/.openclaw/agents/quant/sessions/*.jsonl | head -5
# Show fallback log entries from the last 30 minutes
journalctl _PID=$(pgrep openclaw-gateway) --since "30 min ago" \
| grep -i "fallback\|candidate_failed\|model.*switch" | tail -30
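When the grep output is long, the `candidate_failed` lines can be tallied per candidate and reason. This is a minimal sketch that assumes the log format matches the `[model-fallback/decision]` sample line quoted in this document; the function name `summarize_failures` is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Pattern derived from the sample line:
# [model-fallback/decision] candidate_failed requested=... candidate=... reason=...
CANDIDATE_FAILED_RE = re.compile(
    r"\[model-fallback/decision\] candidate_failed "
    r"requested=(?P<requested>\S+) candidate=(?P<candidate>\S+) reason=(?P<reason>\S+)"
)


def summarize_failures(lines: Iterable[str]) -> Counter:
    """Count candidate_failed events keyed by (candidate, reason)."""
    counts: Counter = Counter()
    for line in lines:
        match = CANDIDATE_FAILED_RE.search(line)
        if match:
            counts[(match["candidate"], match["reason"])] += 1
    return counts
```

Feeding it `sys.stdin` from the `journalctl` pipeline gives a quick signal: if every candidate fails with `reason=unknown`, the candidates are probably being rejected by the live-session switch check rather than by the providers themselves.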
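- models.providers.<name>: configure baseUrl + apiKey + models
- auth.profiles.<name>:default: configure the authentication mode
- .env: add the corresponding environment variables
- pass show api/<name>: confirm the key is available

- agents.defaults.model.primary: global default
- agents.defaults.subagents.model: ⚠️ easy to miss
- agents.list[].model.primary: per-agent configuration
- agents.list[].model.fallbacks: fallback chain

- payload.model: must be updated one cron at a time
- payload.fallbacks: must be updated one cron at a time

- /new: clears the cache

| Problem | Status | Issue |
|------|------|-------|
| LiveSessionModelSwitchError causes the entire fallback chain to fail | 🟡 Bug reported | #58406 |
| SIGUSR1 hot restart does not clear the session model cache | 🟡 Unconfirmed | - |
| isolated cron sessions reuse the old model mapping | 🟡 Unconfirmed | - |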
Last updated: 2026-03-31. Based on a full postmortem of the shibacc balance-exhaustion incident.