hermes-skills/arifos-llm-pipeline-audit/SKILL.md
Audit LLM integration (SEA-LION, Ollama) across arifOS projects: detect hallucinations, map the real pipeline, and identify 3 critical holes. Triggered when investigating LLM configs or when SEA-LION witness reports claim non-existent architecture.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add ariffazil/openclaw-workspace arifos-llm-pipeline-audit
SEA-LION (AI Singapore) confidently hallucinated an entire non-existent file (memory_plane.py), complete with a fake architecture. This skill documents the audit approach used to separate hallucination from ground truth.
```bash
find /root/arifOS -name "memory_plane.py" 2>/dev/null
# If not found → SEA-LION hallucinated

# Real files that exist:
ls /root/arifOS/arifosmcp/runtime/ | grep -i "memory\|context\|persist\|store"
ls /root/arifOS/arifosmcp/tools/ | grep -i "memory"
```
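The existence check above can also be scripted so that several claimed files are verified in one pass. This is a minimal sketch: the helper name `find_claimed_files` is hypothetical and not part of arifOS; only the root path and `memory_plane.py` come from the audit.

```python
from pathlib import Path


def find_claimed_files(root, claimed_names):
    """Map each claimed file name to the matching paths under root.

    An empty list means nothing by that name exists, so the claim is a
    likely hallucination. (Hypothetical helper, not part of arifOS.)
    """
    root_path = Path(root)
    if not root_path.is_dir():
        # Nothing to search: every claim is unverified
        return {name: [] for name in claimed_names}
    found = {}
    for name in claimed_names:
        # rglob walks the tree recursively, similar to `find -name`
        found[name] = sorted(str(p) for p in root_path.rglob(name) if p.is_file())
    return found


if __name__ == "__main__":
    report = find_claimed_files("/root/arifOS", ["memory_plane.py"])
    for name, paths in report.items():
        print(name, "->", paths if paths else "NOT FOUND (likely hallucinated)")
```

An empty result only shows that the file is absent under the given root; it does not rule out the same name existing in another checkout.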
```bash
# Hole 1: Does judge receive evidence?
grep -n "_evidence\|mind_reason\|heart_critique" /root/arifOS/arifosmcp/tools/judge.py

# Hole 2: Does memory recall actually store/retrieve?
grep -A2 '"memories"\|"results"\|"stored"' /root/arifOS/arifosmcp/tools/memory.py
# If all return [] or fake UUID → STUB

# Hole 3: Does vault auto-seal on SEAL verdict?
grep -n "vault_entry_id\|auto.*seal" /root/arifOS/arifosmcp/tools/judge.py
```
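The Hole 2 check above relies on reading grep output by eye. A minimal sketch of the same idea in Python follows, assuming that a stub shows up as a hard-coded empty list next to one of the keys grepped for above. The function name `find_stub_returns` and the regex are illustrative assumptions and will miss stubs written in any other form (for example, the fake UUID case).

```python
import re

# Keys taken from the grep above; a literal [] after one of them suggests a stub.
STUB_PATTERN = re.compile(r'"(memories|results|stored)"\s*:\s*\[\s*\]')


def find_stub_returns(source):
    """Return (line_number, key) pairs where a key maps to a literal []."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in STUB_PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = 'def recall():\n    return {"memories": []}\n'
    print(find_stub_returns(sample))
```

To scan the real module, pass the file contents, for instance `find_stub_returns(open(path).read())`.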
```python
import json
import os
import subprocess

# Assumes SEA_LION_API_KEY is exported in the environment
api_key = os.environ["SEA_LION_API_KEY"]

cmd = ['curl', '-s', '--max-time', '20', '-X', 'POST',
       'https://api.sea-lion.ai/v1/chat/completions',
       '-H', 'Authorization: Bearer ' + api_key,
       '-H', 'Content-Type: application/json',
       '-d', json.dumps({"model": "aisingapore/Qwen-SEA-LION-v4-32B-IT",
                         "messages": [{"role": "user", "content": "Say only: OK"}],
                         "max_tokens": 10, "temperature": 0.1})]
result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)
# Expect a reply with "content": "OK", not an error
```
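The "expect OK, not error" check above can be made explicit. The sketch below assumes the endpoint returns an OpenAI-style chat-completions body (`choices[0].message.content`); that shape is inferred from the `/v1/chat/completions` path and is not confirmed by this audit. The function name `extract_reply` is hypothetical.

```python
import json


def extract_reply(raw):
    """Return the assistant text from a chat-completions response body.

    Assumes the OpenAI-style shape {"choices": [{"message": {"content": ...}}]}.
    Raises ValueError on anything else so a failure is never read as success.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not JSON: {raw[:80]!r}") from exc
    if not isinstance(payload, dict) or "error" in payload:
        raise ValueError(f"API returned an error payload: {payload!r}")
    try:
        return payload["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError(f"unexpected response shape: {payload!r}") from exc
```

With the curl call above, `extract_reply(result.stdout)` would return the model's text or raise if the key is rejected or the request times out with an empty body.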
File: /root/arifOS/arifosmcp/runtime/llm_client.py

- Tier 1 (PRIMARY): SEA-LION → https://api.sea-lion.ai/v1/chat/completions
  - Model: aisingapore/Qwen-SEA-LION-v4-32B-IT
  - Key: SEA_LION_API_KEY in /root/arifOS/.env (confirmed live 2026-05-06)
- Tier 2 (FALLBACK): Ollama → http://127.0.0.1:11434/api/generate
  - Model: qwen2.5:7b (ollama-engine-prod container)
- Tier 3: LLMUnavailableError → deterministic fallback
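The three-tier order above can be sketched as a simple try-in-order loop. This is an illustration of the fallback pattern only, not the contents of `llm_client.py`: `generate_with_fallback`, `answer`, and the tier callables are hypothetical, and only the `LLMUnavailableError` name comes from the map above.

```python
class LLMUnavailableError(Exception):
    """Raised when every LLM tier has failed (name taken from the map above)."""


def generate_with_fallback(prompt, tiers):
    """Try each (name, callable) tier in order and return (name, text).

    Raises LLMUnavailableError if all tiers fail.
    """
    errors = []
    for name, call in tiers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise LLMUnavailableError("; ".join(errors))


def answer(prompt, tiers, deterministic_fallback):
    """Tier 3: use a deterministic function when no LLM is reachable."""
    try:
        return generate_with_fallback(prompt, tiers)
    except LLMUnavailableError:
        return "deterministic", deterministic_fallback(prompt)
```

Returning the tier name alongside the text makes it visible in logs which tier actually answered, which helps when auditing whether the primary is silently failing over.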
| Tool | File | Output |
|------|------|--------|
| arif_mind_reason | runtime/mind_reason.py | verdict, synthesis, confidence, omega_0, delta_S, scars |
| arif_heart_critique | tools/heart.py | status, risks_found[], risk_tier, empathy_score, verdict |
| arif_reply_compose | runtime/reply_compose.py | composed, tone, delta_S, f02/f04/f07 scores |
1. `tools/judge.py:40`: `_evidence: dict = {}` is always empty. mind_reason/heart_critique outputs are never piped in.
2. `tools/memory.py`: recall → `{"memories":[]}`, store → fake UUID, search → `{"results":[]}`.
3. `tools/vault.py`: only seals if `vault_entry_id` is passed manually. No auto-SEAL.

- P0: Wire mind_reason + heart_critique → arif_judge_deliberate._evidence
- P1: Replace memory.py stubs with real Qdrant vector store
- P2: Add auto-SEAL post-hook in judge on SEAL verdict
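P0 and P2 describe two small wiring changes: pass upstream outputs in as evidence, and seal automatically after a SEAL verdict. The sketch below illustrates that shape only. The function `deliberate` and its `decide` and `seal` parameters are hypothetical stand-ins, and the real signatures in `judge.py` and `vault.py` are not shown in this audit.

```python
def deliberate(mind_output, heart_output, decide, seal=None):
    """Hypothetical judge wrapper: evidence in, optional auto-seal out.

    decide(evidence) returns a verdict string.
    seal(result) returns a vault entry id.
    """
    # P0: pass upstream outputs in as evidence instead of an empty dict
    evidence = {"mind_reason": mind_output, "heart_critique": heart_output}
    verdict = decide(evidence)
    result = {"verdict": verdict, "evidence": evidence}
    # P2: post-hook that seals automatically when the verdict is SEAL
    if verdict == "SEAL" and seal is not None:
        result["vault_entry_id"] = seal(result)
    return result
```

Keeping the seal step as an injected callable means the judge can be tested without a live vault, and the hook stays a no-op wherever sealing is not configured.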
| Project | SEA-LION | Ollama | Notes |
|---------|----------|--------|-------|
| arifOS | ✅ External API | ⚠️ Ref only | Docker network DNS |
| A-FORGE | ✅ providerFactory | ✅ Container | qwen2.5:7b + bge-m3 |
| WEALTH | ❌ | ⚠️ Embeddings only | memory_pipeline.py |
| GEOX | ❌ | ❌ | EIA/SPGlobal data APIs only |
- File name patterns (`*_plane.py`, `*_interpreter.py`)
- Class names (`MemoryPlane`, `MemoryGraph`)

```bash
# Test SEA-LION (cut -f2- keeps the whole value even if the key contains "=")
curl -s --max-time 20 -X POST https://api.sea-lion.ai/v1/chat/completions \
  -H "Authorization: Bearer $(grep SEA_LION_API_KEY /root/arifOS/.env | cut -d= -f2-)" \
  -H "Content-Type: application/json" \
  -d '{"model":"aisingapore/Qwen-SEA-LION-v4-32B-IT","messages":[{"role":"user","content":"Say only: OK"}],"max_tokens":10}'

# Check Ollama models
curl -s http://127.0.0.1:11434/api/tags | python3 -c "import json,sys; [print(m['name']) for m in json.load(sys.stdin).get('models',[])]"

# Hex-dump the key line to inspect a masked key
python3 -c "
with open('/root/arifOS/.env','rb') as f:
    for i,line in enumerate(f.read().split(b'\n'),1):
        if b'SEA_LION_API_KEY' in line: print(f'Line {i}:', line.hex())
"
```