langwatch/test-cli-usability

Name: test-cli-usability
Author: langwatch

skills/recipes/test-cli-usability/SKILL.md

npx skillsauth add langwatch/langwatch test-cli-usability

Test Your CLI's Agent Usability

This recipe helps you write scenario tests that verify your CLI tool works well when operated by AI agents (Claude Code, Cursor, Codex, etc.). An agent-friendly CLI has these properties:

All commands can run non-interactively (no stdin prompts that hang)
Output is parseable and informative
Error messages are clear enough for an agent to self-correct
Help text enables discovery (--help works on every subcommand)
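Before writing full scenarios, the first and last properties can be smoke-checked directly. The helper below is a minimal sketch, not part of the Scenario SDK: `runNonInteractive` is a hypothetical name, and the 5-second default timeout is an arbitrary assumption. It runs a command with stdin closed and reports a timeout instead of hanging.

```typescript
import { spawnSync } from "node:child_process";

interface RunResult {
  exitCode: number | null; // null when the process was killed
  stdout: string;
  stderr: string;
  timedOut: boolean;
}

// Runs a command with stdin closed, so a prompt cannot wait for input forever.
// If the process is still running after timeoutMs, it is killed and reported
// as timed out.
function runNonInteractive(
  command: string,
  args: string[],
  timeoutMs = 5000,
): RunResult {
  const result = spawnSync(command, args, {
    stdio: ["ignore", "pipe", "pipe"],
    encoding: "utf8",
    timeout: timeoutMs,
  });
  const error = result.error as NodeJS.ErrnoException | undefined;
  return {
    exitCode: result.status,
    stdout: result.stdout ?? "",
    stderr: result.stderr ?? "",
    timedOut: error?.code === "ETIMEDOUT",
  };
}
```

For example, calling `runNonInteractive("mycli", ["deploy", "--help"])` for each subcommand (with `mycli` standing in for your own binary) and checking for `exitCode === 0`, non-empty `stdout`, and `timedOut === false` gives a quick signal before any agent is involved.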

Prerequisites

Install the Scenario SDK:

npm install @langwatch/scenario vitest @ai-sdk/openai
# or: pip install langwatch-scenario pytest

Step 1: Identify Your CLI Commands

List every command your CLI supports. For each, note:

Does it require interactive input? (MUST have a non-interactive alternative)
What flags/options does it accept?
What does it output on success/failure?
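One way to keep this inventory honest is to record it as data and let a small check list the gaps. The sketch below is illustrative only: the `CommandSpec` shape, its field names, and the example commands are assumptions, not something the Scenario SDK defines.

```typescript
// Hypothetical inventory entry for one CLI command.
interface CommandSpec {
  name: string;
  promptsForInput: boolean; // does the command ever wait on stdin?
  nonInteractiveFlag?: string; // flag that bypasses the prompt, if one exists
  flags: string[]; // flags and options the command accepts
  successOutput: string; // what it prints on success
  failureOutput: string; // what it prints on failure
}

// Returns the names of commands that prompt but offer no way to skip the prompt.
function findInteractivityGaps(commands: CommandSpec[]): string[] {
  return commands
    .filter((c) => c.promptsForInput && !c.nonInteractiveFlag)
    .map((c) => c.name);
}

// Example inventory for an imaginary CLI.
const inventory: CommandSpec[] = [
  {
    name: "deploy",
    promptsForInput: true,
    nonInteractiveFlag: "--yes",
    flags: ["--env", "--yes"],
    successOutput: "Deployed to <env>",
    failureOutput: "Error: unknown environment",
  },
  {
    name: "delete",
    promptsForInput: true,
    flags: ["--id"],
    successOutput: "Deleted <id>",
    failureOutput: "Error: id not found",
  },
  {
    name: "status",
    promptsForInput: false,
    flags: ["--json"],
    successOutput: "Status summary",
    failureOutput: "Error: not logged in",
  },
];

const gaps = findInteractivityGaps(inventory);
console.log(gaps);
```

Every name in `gaps` is a command that needs a non-interactive alternative before the scenario tests in the next step can pass.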

Step 2: Write Scenario Tests

For each command, write a scenario test where an AI agent discovers and uses it:

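import { describe, it, expect } from "vitest";
import scenario, { type AgentAdapter, AgentRole } from "@langwatch/scenario";
import { openai } from "@ai-sdk/openai";

// Adapter that connects the coding agent under test to the scenario.
const myAgent: AgentAdapter = {
  role: AgentRole.AGENT,
  call: async (input) => {
    // Your Claude Code adapter here: run the agent on the conversation in
    // `input` and return its reply.
    throw new Error("Agent adapter not implemented yet");
  },
};

describe("CLI agent usability", () => {
  it("lets an agent discover and use a command", async () => {
    const result = await scenario.run({
      name: "CLI command discovery",
      description: "Agent discovers and uses the CLI to accomplish a task",
      agents: [
        myAgent,
        scenario.userSimulatorAgent({ model: openai("gpt-5-mini") }),
        scenario.judgeAgent({
          model: openai("gpt-5-mini"),
          criteria: [
            "Agent used the CLI command correctly",
            "Agent did not get stuck on interactive prompts",
            "Agent did not need to pipe 'yes' or use 'expect' scripting",
          ],
        }),
      ],
    });

    // Fail the test when the judge's criteria are not met.
    expect(result.success).toBe(true);
  }, 120_000); // agent runs are slow; this timeout value is an arbitrary choice
});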

Step 3: Assert No Interactive Workarounds

Add this assertion to every test:

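import { expect } from "vitest";

// Fails the test if the transcript shows the agent feeding answers to a prompt.
function assertNoInteractiveWorkarounds(state: { messages: Array<{ content: unknown }> }) {
  // Flatten every message to text so the patterns scan the whole conversation.
  const output = state.messages.map(m =>
    typeof m.content === 'string' ? m.content : JSON.stringify(m.content)
  ).join('\n');

  expect(output).not.toMatch(/echo\s+["']?[yY](?:es)?["']?\s*\|/); // echo y | cmd
  expect(output).not.toMatch(/(?<![-\w])yes\s*\|/);                // yes | cmd, but not --yes | tee
  expect(output).not.toMatch(/expect\s+-c/);                       // expect -c '...'
  expect(output).not.toMatch(/printf\s+["']\\n["']\s*\|/);         // printf '\n' | cmd
}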

If this assertion fails, your CLI has an interactivity bug -- add --yes, --force, or --non-interactive flags to the offending commands.
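To see what these patterns catch, it can help to run them against sample command lines outside a test runner. The sketch below is a standalone variant under assumptions: `findInteractiveWorkarounds` and the labels are hypothetical names, and the function returns the matched labels instead of asserting. The `yes` pattern uses a negative lookbehind so that a legitimate `--yes | tee log` is not flagged; a plain `\byes\s*\|` would match it.

```typescript
// Each pattern detects one common way of feeding answers to an interactive prompt.
const WORKAROUND_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "echo y |", pattern: /echo\s+["']?[yY](?:es)?["']?\s*\|/ },
  { label: "yes |", pattern: /(?<![-\w])yes\s*\|/ },
  { label: "expect -c", pattern: /expect\s+-c/ },
  { label: "printf newline |", pattern: /printf\s+["']\\n["']\s*\|/ },
];

// Returns the label of every workaround pattern found in the transcript text.
function findInteractiveWorkarounds(transcript: string): string[] {
  return WORKAROUND_PATTERNS
    .filter(({ pattern }) => pattern.test(transcript))
    .map(({ label }) => label);
}

console.log(findInteractiveWorkarounds("echo y | mycli delete --id 42"));
console.log(findInteractiveWorkarounds("mycli delete --id 42 --yes | tee log.txt"));
```

Returning labels instead of throwing makes the failure message more useful: the test can report which workaround the agent fell back to, which points at the command that still prompts.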

Step 4: Test Error Recovery

Write scenarios where the agent makes a mistake and must recover:

Wrong command name -> agent reads --help and self-corrects
Missing required argument -> agent reads error message and retries
Authentication failure -> agent follows instructions in error output
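On the CLI side, the first recovery path depends on the error text carrying enough information to act on. The following is a minimal sketch of an unknown-command error that suggests the closest known command. The names `unknownCommandError` and `mycli`, the exit code 2, and the distance threshold of 2 are all assumptions made for illustration.

```typescript
// Levenshtein edit distance, computed with a single rolling row.
function editDistance(a: string, b: string): number {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value of D[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value of D[i-1][j]
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

interface CliError {
  exitCode: number;
  message: string;
}

// Builds an actionable error: what went wrong, a likely fix, and where to look next.
function unknownCommandError(input: string, known: string[]): CliError {
  const closest = known
    .map((name) => ({ name, distance: editDistance(input, name) }))
    .sort((x, y) => x.distance - y.distance)[0];
  const hint =
    closest && closest.distance <= 2 ? ` Did you mean "${closest.name}"?` : "";
  return {
    exitCode: 2, // non-zero so the agent can detect the failure
    message: `Unknown command "${input}".${hint} Run "mycli --help" to list available commands.`,
  };
}

console.log(unknownCommandError("delpoy", ["deploy", "login", "status"]).message);
```

A scenario for this path can then check that the agent's next command is either the suggested one or `--help`, rather than a repeat of the same typo.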

Common Mistakes

Do NOT make commands that require stdin for essential operations -- always provide flag alternatives
Do NOT use interactive prompts for confirmation without a --yes or --force flag
Do NOT output errors without actionable guidance (the agent needs to know how to fix it)
DO make --help comprehensive on every subcommand
DO use non-zero exit codes for failures (agents check exit codes)
DO output structured information (the agent can parse it)
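The first two rules can be sketched as a single decision function inside the CLI. This is one possible shape under assumptions, not a prescribed design: `decideConfirmation`, the `--yes` flag name, and exit code 1 are illustrative choices.

```typescript
type ConfirmDecision =
  | { action: "proceed" }
  | { action: "prompt" }
  | { action: "fail"; exitCode: number; message: string };

// Decides how a destructive command should handle confirmation.
// - An explicit --yes always proceeds without prompting.
// - With a terminal attached, prompting a human is fine.
// - Without a terminal (agent, CI, pipe), fail fast with a fix instead of hanging.
function decideConfirmation(opts: {
  yes: boolean;
  stdinIsTTY: boolean;
}): ConfirmDecision {
  if (opts.yes) return { action: "proceed" };
  if (opts.stdinIsTTY) return { action: "prompt" };
  return {
    action: "fail",
    exitCode: 1,
    message:
      "Confirmation required, but stdin is not a terminal. Re-run with --yes to confirm.",
  };
}

console.log(decideConfirmation({ yes: false, stdinIsTTY: false }));
```

In a real CLI the inputs would come from the parsed arguments and from `Boolean(process.stdin.isTTY)`. The failure branch covers the last three rules at once: it states the problem, names the fix, and carries a non-zero exit code.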

Write scenario tests that verify your CLI tool is usable by AI agents. Ensures commands work non-interactively, provide clear output, and don't hang on prompts. Use when you want to prove your CLI is agent-friendly.

3,203 stars

tools

Updated Apr 15, 2026

npx skillsauth add langwatch/langwatch test-cli-usability

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 15, 2026, 1:21 PM (4.1s, 1 file scanned)

SKILL.md

name: test-cli-usability
description: Write scenario tests that verify your CLI tool is usable by AI agents. Ensures commands work non-interactively, provide clear output, and don't hang on prompts. Use when you want to prove your CLI is agent-friendly.
license: MIT
compatibility: Requires @langwatch/scenario. Works with Claude Code and similar coding agents.
category: recipe

Install manually

# Clone the repo
git clone https://github.com/langwatch/langwatch.git

# Create the global Claude Code skills folder if it does not exist yet
mkdir -p ~/.claude/skills

# Copy the skill into it
cp -r langwatch/skills/recipes/test-cli-usability ~/.claude/skills/

See the official Claude Code Skills docs for the skills path.

Repository

langwatch/langwatch

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT