Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

atrislabs/tune

Name: tune
Author: atrislabs

_internal-skills/tune/SKILL.md

npx skillsauth add atrislabs/atris tune

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Tune — Live Skill RL

Turn the user's reaction into a targeted SKILL.md overlay edit. Human-in-the-loop reinforcement learning for skills.

Architecture

Base SKILL.md (read-only, shared)
    +
Overlay.md (mutable, per-workspace)
    =
Effective skill at invocation time

Path A (cron):  /api/improve ticks weekly, objective reward harvester
Path B (live):  THIS SKILL — user reaction = reward, edits overlay in-session
                Same overlay.md, same scorecard — both paths compound.

The reward schema (non-negotiable)

| Signal | Reward | Source | |---|---|---| | Explicit positive | +1 | "good", "perfect", "keep it", "that nailed it" | | Implicit positive | +0.5 | user kept output unchanged, shipped it, moved on | | Neutral | 0 | no reaction, switched topic | | Implicit negative | −0.5 | user edited heavily, reran skill, tweaked params | | Explicit negative | −1 | "shit", "bad", "nope", "revert", "worse than before" |

Do not infer reward beyond these. If you are unsure, classify as neutral and log; do NOT propose an edit on neutral.

The loop (one tick)

1. OBSERVE
   - User just invoked skill <X> at turn T
   - Skill produced output Y
   - User reacted with R

2. CLASSIFY
   - Map R to reward from the schema above
   - If reward in {0}: log and stop. Do not propose.

3. DIAGNOSE
   - If reward < 0: what specifically did the user reject?
     (a phrase? a structural choice? a missing section? tone?)
   - If reward > 0: what specifically did the user approve?
     (same questions, positive framing)
   - Write a one-sentence diagnosis.

4. PROPOSE
   - Read /workspace/atris/skills/<X>/SKILL.md   (BASE, read-only)
   - Read /workspace/atris/skills/<X>/overlay.md (if exists, else empty)
   - Propose ONE targeted edit to overlay.md:
     • Max 5 lines of change
     • Must be additive (append) or a bounded replace of existing overlay lines
     • NEVER edits base SKILL.md

5. SHOW THE DIFF
   - Display a unified diff of the proposed overlay change
   - State the diagnosis and expected outcome in one sentence

6. REQUIRE APPROVAL
   - Wait for an explicit "yes" / "apply" / "go" from the user.
   - On anything else: log as rejected. Do not apply.

7. APPLY + LOG
   - Write the approved change to /workspace/atris/skills/<X>/overlay.md
   - Append one line to /workspace/atris/skills/<X>/scorecards.jsonl:
     {"ts": "<iso>", "skill": "<X>", "reward": <n>, "reaction": "<short>",
      "diagnosis": "<short>", "applied": true, "path": "B-live", "turn": <T>}

Validator rules (hard gates)

Base is read-only. If the skill folder has no overlay.md, create it. Never edit SKILL.md directly.
Diff gate. A diff must be shown to the user BEFORE any write. No silent edits ever.
Approval gate. Explicit "yes/apply/go" required. Silence is not approval.
Size cap. A single tick may add at most 5 lines to overlay.md. Larger changes require the user to re-invoke with a bigger scope confirmed.
Overlay size cap. If overlay.md exceeds 100 lines, refuse new edits and suggest consolidation (point the user at /api/improve or ask them to manually prune).
No ALL CAPS, no em dashes, no "AI" buzzword in overlay content (matches outbound copy rules).
Log every tick to scorecards.jsonl — approved, rejected, and neutral. Every tick.

Reward harvester (Path B side)

scorecards.jsonl (append-only):
  {"ts", "skill", "reward", "reaction", "diagnosis", "applied", "path": "B-live", "turn"}

Path A (/api/improve) also writes to the SAME scorecards.jsonl with:
  {"ts", "skill", "reward", "path": "A-cron", "variant_hash", "kept": bool}

Weekly: /api/improve cron tick reads the last 7 days of scorecards.jsonl
        to propose autonomous variants that address patterns Path B caught.

Examples

Example 1 — explicit negative on aeo

User: [runs aeo draft, gets output]
User: "this is too hedgy, it uses 'may' twice"

Tune skill:
1. Reward: −1 (explicit negative, specific complaint)
2. Diagnosis: AEO rule 9 (no hedging) is violated; overlay should strengthen it
3. Propose overlay.md addition:

   ```diff
   --- overlay.md (before)
   +++ overlay.md (after)
   @@ +1,3 @@
   +## Stricter no-hedging (learned 2026-04-17)
   +Before returning, scan the body for: may, might, could, perhaps, possibly, potentially.
   +If found, rewrite the sentence as declarative. No exceptions.

Diagnosis: "strengthens rule 9 by making the scan explicit post-generation"
Expected: next draft has zero hedging words

"Apply this?" User: "yes" [writes overlay.md + logs scorecard, reward=−1, applied=true]


### Example 2 — explicit positive on cold-email

User: "damn that cold email hook is sharp, keep doing that"

Tune skill:

Reward: +1
Diagnosis: the specific hook pattern worked (e.g., "opened with a stat from their 10-K")
Propose overlay.md addition:

+## Hook that worked (2026-04-17) +Prefer opening with a verifiable, recent stat from the prospect's own +public material (10-K, earnings call, press release) over generic praise.
"Apply this as a preferred pattern?" User: "yes" [writes overlay.md + logs scorecard, reward=+1, applied=true]


### Example 3 — neutral

User: [gets output, says nothing, moves on]

Tune skill: does NOT proactively run. Only triggers on user-expressed reaction. If somehow invoked: log neutral, propose nothing.


## Anti-patterns

- **Don't propose on neutral signal.** Silence is not −1 or +1.
- **Don't rewrite the base.** Base SKILL.md is shared across customers. Only overlay diverges.
- **Don't batch multiple changes into one tick.** One tick = one targeted edit. Larger refactors go through `/api/improve` in review mode.
- **Don't claim improvement you can't verify.** "Expected outcome" should be a testable prediction (e.g., "next draft will have zero 'may/might/could'"), not a vibe.
- **Don't run without the user's explicit approval.** Every application is gated. No "this seems good, I'll apply it."

## File layout (what exists where)

atris-cli/atris/skills/<name>/SKILL.md ← base, shared (genotype) /workspace/atris/skills/<name>/SKILL.md ← customer copy of base (synced) /workspace/atris/skills/<name>/overlay.md ← mutable, RL-tuned (phenotype) /workspace/atris/skills/<name>/scorecards.jsonl ← append-only reward log


**Invocation-time behavior:** skill callers (backend endpoints like `/aeo/draft`, or Claude Code skill system) read `SKILL.md` AND `overlay.md` and concatenate. If overlay.md is missing, treat as empty.

## Related

- Base pattern: `atris/skills/autoresearch/SKILL.md` (Karpathy keep/revert)
- Path A tick runner: `/api/improve` (`backend/routers/improve.py`)
- PII sanitizer for cross-customer diffs: `backend/rl/skill_extractor.py`
- Reward compiler: `backend/rl/rewards.py`
- Judge infra: `backend/rl/judges.py`
- Feature: `atris/features/skill-rl/idea.md`

## Rules (non-negotiable summary)

1. Never edit base SKILL.md. Overlay only.
2. Always show the diff before applying.
3. Require explicit user approval.
4. Cap a single tick at 5 lines of change.
5. Log every tick — approved, rejected, neutral.
6. Do not propose on neutral reactions.
7. Strip PII before any cross-customer record (use `skill_extractor`).

atrislabs/tune

_internal-skills/tune/SKILL.md

Live RL tuner for skills. Watches skill invocations, reads user reaction, proposes targeted SKILL.md overlay edits, requires explicit approval, writes scorecards. The in-session half of the skill-RL loop (Path B). Triggers on: tune, sharpen, skill feedback, that was shit, that was great, make X better.

60 stars

development

Updated Apr 18, 2026

$ install --global

skillsauth

npx skillsauth add atrislabs/atris tune

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Apr 18, 2026, 8:22 AM13.2s1 file scanned

SKILL.md

name:: tune
description:: Live RL tuner for skills. Watches skill invocations, reads user reaction, proposes targeted SKILL.md overlay edits, requires explicit approval, writes scorecards. The in-session half of the skill-RL loop (Path B). Triggers on: tune, sharpen, skill feedback, that was shit, that was great, make X better.
when_to_use:: Use when the user reacts to a skill's output with approval or pushback, or asks to tune/improve/sharpen a skill based on what just happened. Examples: 'that aeo draft was shit, make it better', 'good, keep doing that', 'the cold-email skill keeps hedging', 'tune this'.
version:: 0.1.0
allowed-tools:: Read, Write, Edit, Glob, Grep, Bash

Tune — Live Skill RL

Turn the user's reaction into a targeted SKILL.md overlay edit. Human-in-the-loop reinforcement learning for skills.

Architecture

Base SKILL.md (read-only, shared)
    +
Overlay.md (mutable, per-workspace)
    =
Effective skill at invocation time

Path A (cron):  /api/improve ticks weekly, objective reward harvester
Path B (live):  THIS SKILL — user reaction = reward, edits overlay in-session
                Same overlay.md, same scorecard — both paths compound.

The reward schema (non-negotiable)

Do not infer reward beyond these. If you are unsure, classify as neutral and log; do NOT propose an edit on neutral.

The loop (one tick)

1. OBSERVE
   - User just invoked skill <X> at turn T
   - Skill produced output Y
   - User reacted with R

2. CLASSIFY
   - Map R to reward from the schema above
   - If reward in {0}: log and stop. Do not propose.

3. DIAGNOSE
   - If reward < 0: what specifically did the user reject?
     (a phrase? a structural choice? a missing section? tone?)
   - If reward > 0: what specifically did the user approve?
     (same questions, positive framing)
   - Write a one-sentence diagnosis.

4. PROPOSE
   - Read /workspace/atris/skills/<X>/SKILL.md   (BASE, read-only)
   - Read /workspace/atris/skills/<X>/overlay.md (if exists, else empty)
   - Propose ONE targeted edit to overlay.md:
     • Max 5 lines of change
     • Must be additive (append) or a bounded replace of existing overlay lines
     • NEVER edits base SKILL.md

5. SHOW THE DIFF
   - Display a unified diff of the proposed overlay change
   - State the diagnosis and expected outcome in one sentence

6. REQUIRE APPROVAL
   - Wait for an explicit "yes" / "apply" / "go" from the user.
   - On anything else: log as rejected. Do not apply.

7. APPLY + LOG
   - Write the approved change to /workspace/atris/skills/<X>/overlay.md
   - Append one line to /workspace/atris/skills/<X>/scorecards.jsonl:
     {"ts": "<iso>", "skill": "<X>", "reward": <n>, "reaction": "<short>",
      "diagnosis": "<short>", "applied": true, "path": "B-live", "turn": <T>}

Validator rules (hard gates)

Base is read-only. If the skill folder has no overlay.md, create it. Never edit SKILL.md directly.
Diff gate. A diff must be shown to the user BEFORE any write. No silent edits ever.
Approval gate. Explicit "yes/apply/go" required. Silence is not approval.
Size cap. A single tick may add at most 5 lines to overlay.md. Larger changes require the user to re-invoke with a bigger scope confirmed.
Overlay size cap. If overlay.md exceeds 100 lines, refuse new edits and suggest consolidation (point the user at /api/improve or ask them to manually prune).
No ALL CAPS, no em dashes, no "AI" buzzword in overlay content (matches outbound copy rules).
Log every tick to scorecards.jsonl — approved, rejected, and neutral. Every tick.

Reward harvester (Path B side)

scorecards.jsonl (append-only):
  {"ts", "skill", "reward", "reaction", "diagnosis", "applied", "path": "B-live", "turn"}

Path A (/api/improve) also writes to the SAME scorecards.jsonl with:
  {"ts", "skill", "reward", "path": "A-cron", "variant_hash", "kept": bool}

Weekly: /api/improve cron tick reads the last 7 days of scorecards.jsonl
        to propose autonomous variants that address patterns Path B caught.

Examples

Example 1 — explicit negative on aeo

User: [runs aeo draft, gets output]
User: "this is too hedgy, it uses 'may' twice"

Tune skill:
1. Reward: −1 (explicit negative, specific complaint)
2. Diagnosis: AEO rule 9 (no hedging) is violated; overlay should strengthen it
3. Propose overlay.md addition:

   ```diff
   --- overlay.md (before)
   +++ overlay.md (after)
   @@ +1,3 @@
   +## Stricter no-hedging (learned 2026-04-17)
   +Before returning, scan the body for: may, might, could, perhaps, possibly, potentially.
   +If found, rewrite the sentence as declarative. No exceptions.

Diagnosis: "strengthens rule 9 by making the scan explicit post-generation"
Expected: next draft has zero hedging words

"Apply this?" User: "yes" [writes overlay.md + logs scorecard, reward=−1, applied=true]


### Example 2 — explicit positive on cold-email

User: "damn that cold email hook is sharp, keep doing that"

Tune skill:

Reward: +1
Diagnosis: the specific hook pattern worked (e.g., "opened with a stat from their 10-K")
Propose overlay.md addition:

+## Hook that worked (2026-04-17) +Prefer opening with a verifiable, recent stat from the prospect's own +public material (10-K, earnings call, press release) over generic praise.
"Apply this as a preferred pattern?" User: "yes" [writes overlay.md + logs scorecard, reward=+1, applied=true]


### Example 3 — neutral

User: [gets output, says nothing, moves on]

Tune skill: does NOT proactively run. Only triggers on user-expressed reaction. If somehow invoked: log neutral, propose nothing.


## Anti-patterns

- **Don't propose on neutral signal.** Silence is not −1 or +1.
- **Don't rewrite the base.** Base SKILL.md is shared across customers. Only overlay diverges.
- **Don't batch multiple changes into one tick.** One tick = one targeted edit. Larger refactors go through `/api/improve` in review mode.
- **Don't claim improvement you can't verify.** "Expected outcome" should be a testable prediction (e.g., "next draft will have zero 'may/might/could'"), not a vibe.
- **Don't run without the user's explicit approval.** Every application is gated. No "this seems good, I'll apply it."

## File layout (what exists where)


**Invocation-time behavior:** skill callers (backend endpoints like `/aeo/draft`, or Claude Code skill system) read `SKILL.md` AND `overlay.md` and concatenate. If overlay.md is missing, treat as empty.

## Related

- Base pattern: `atris/skills/autoresearch/SKILL.md` (Karpathy keep/revert)
- Path A tick runner: `/api/improve` (`backend/routers/improve.py`)
- PII sanitizer for cross-customer diffs: `backend/rl/skill_extractor.py`
- Reward compiler: `backend/rl/rewards.py`
- Judge infra: `backend/rl/judges.py`
- Feature: `atris/features/skill-rl/idea.md`

## Rules (non-negotiable summary)

1. Never edit base SKILL.md. Overlay only.
2. Always show the diff before applying.
3. Require explicit user approval.
4. Cap a single tick at 5 lines of change.
5. Log every tick — approved, rejected, neutral.
6. Do not propose on neutral reactions.
7. Strip PII before any cross-customer record (use `skill_extractor`).

Related Skills

atrislabs/copy-editor

testing

VerifiedTrustedCommunity

Detects AI slop and fixes it, especially in memos, docs, READMEs, messages, PRDs, and other written output. Based on Wikipedia's AI Cleanup patterns plus memo-specific anti-slop rules. Triggers on "copy edit", "review writing", "humanize", "deslopper", "ai patterns", "make it sound human", "AI slop", "anti-slop", "memo".

61SKILL.mdUpdated Apr 3, 2026

atrislabs/copy-editor

atrislabs/imessage

tools

VerifiedTrustedCommunity

Use when an agent needs to inspect or send local macOS iMessage through Atris CLI. Triggers on iMessage, Messages.app, local text messages, chat.db, or texting someone from the user's Mac.

60SKILL.mdUpdated May 3, 2026

atrislabs/atris-feedback

databases

VerifiedTrustedCommunity

Submit, list, resolve, close, or delete Atris customer feedback. Use when user types /feedback or asks to triage the feedback queue.

60SKILL.mdUpdated Apr 22, 2026

atrislabs/atris-feedback

atrislabs/research-search

development

VerifiedTrustedCommunity

Fast research sweep — arxiv, semantic scholar, github, web. Finds papers, scores relevance, extracts actionable insights, stores to wiki. Triggers on: research search, find papers, latest research, arxiv, what's new in, sweep papers, research sweep.

60SKILL.mdUpdated Apr 18, 2026

atrislabs/research-search

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/atrislabs/atris.git

# Copy into Claude Code skills folder (global)
cp -r atris/_internal-skills/tune ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

atrislabs/atris

60 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT