skills/claude-4-6-features/long-context-1m/SKILL.md
Use Claude's 1M-token context window effectively — when to use it, how to structure inputs for recall, how to price it, and how to combine with prompt caching to keep it affordable. Use this skill when building apps that feed large codebases, long documents, or entire conversation histories to Claude, or when weighing 1M context vs RAG. Activate when: 1M context, long context, big context window, context vs RAG, Claude 1 million tokens, context-beta header.
`npx skillsauth add latestaiagents/agent-skills long-context-1m`
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Claude Opus 4.6 and Sonnet 4.6 support 1M token context with the context-1m-2025-08-07 beta header. Use it well or burn money for nothing.
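```typescript
import Anthropic from "@anthropic-ai/sdk";

// The client reads ANTHROPIC_API_KEY from the environment.
const client = new Anthropic();

// giantDocument is your large input string, defined elsewhere.
const response = await client.messages.create(
  {
    model: "claude-sonnet-4-6",
    max_tokens: 4096,
    messages: [{ role: "user", content: giantDocument + "\n\nSummarize." }],
  },
  // Per-request option: opt in to the 1M-token context beta.
  { headers: { "anthropic-beta": "context-1m-2025-08-07" } },
);
```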
Without the beta header, requests over 200K tokens will error.
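Long context is priced differently once a request exceeds 200K input tokens. Check your provider's current rates; as a rule of thumb, input is billed at ~2× the base rate. Do not assume output is unchanged: Anthropic's published long-context rates have also raised output to ~1.5× base. The premium applies to the whole request, not only to the tokens past 200K.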
Rule: if you're only going to use 200K, don't enable 1M. Only pay for long-context pricing when you actually need > 200K.
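To make the threshold effect concrete, here is a minimal sketch of an input-cost estimator. The `InputRates` shape, the `estimateInputCost` helper, and every number in `exampleRates` are hypothetical placeholders rather than quoted prices, and the sketch assumes the multiplier applies to the whole request once input crosses the threshold. Confirm both against your provider's current pricing.

```typescript
// Hypothetical rate card: every number here is a placeholder, not a quoted price.
interface InputRates {
  basePerMTok: number; // base input price in USD per million tokens
  longContextThreshold: number; // input size at which long-context pricing starts
  longContextMultiplier: number; // rule-of-thumb multiplier above the threshold
}

// Estimate the input-side cost of a single request.
// Assumption: once input exceeds the threshold, the multiplier applies to
// the whole request, not only to the tokens past the threshold.
function estimateInputCost(inputTokens: number, rates: InputRates): number {
  const overThreshold = inputTokens > rates.longContextThreshold;
  const multiplier = overThreshold ? rates.longContextMultiplier : 1;
  return (inputTokens / 1_000_000) * rates.basePerMTok * multiplier;
}

const exampleRates: InputRates = {
  basePerMTok: 3, // placeholder value
  longContextThreshold: 200_000,
  longContextMultiplier: 2,
};

const shortRequestCost = estimateInputCost(150_000, exampleRates);
const longRequestCost = estimateInputCost(500_000, exampleRates);
console.log({ shortRequestCost, longRequestCost });
```

Under these assumptions, a request just over the threshold costs roughly twice as much as one just under it, which is why trimming a 210K-token prompt to 200K can be worth the effort.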
| When 1M context wins | When RAG wins |
|---|---|
| Cross-document synthesis | Fresh data that updates hourly |
| Full-codebase refactoring | Unbounded corpus (> 1M tokens) |
| Holistic code review | Per-user personal data (privacy isolation) |
| Single-shot analysis | Many cheap lookups on small queries |
| Exploration where you don't know what's relevant | Known query patterns |
Hybrid: RAG retrieves the top 500K tokens; stuff those into 1M context. Best of both.
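One way to sketch the hybrid approach is a greedy packer that keeps the highest-scoring retrieved chunks until a token budget is used up. The `RetrievedChunk` shape and the `packChunks` helper are hypothetical; the sketch assumes your retriever already provides a relevance score and that each chunk's token count has been computed ahead of time.

```typescript
// Hypothetical shape for a retriever result.
interface RetrievedChunk {
  id: string;
  text: string;
  score: number; // retriever relevance, higher is better
  tokens: number; // precomputed token count for `text`
}

// Greedily keep the highest-scoring chunks that fit within the token budget.
function packChunks(chunks: RetrievedChunk[], tokenBudget: number): RetrievedChunk[] {
  const ranked = [...chunks].sort((a, b) => b.score - a.score);
  const packed: RetrievedChunk[] = [];
  let used = 0;
  for (const chunk of ranked) {
    // Skip a chunk that does not fit, but keep trying smaller ones.
    if (used + chunk.tokens > tokenBudget) continue;
    packed.push(chunk);
    used += chunk.tokens;
  }
  return packed;
}

const exampleChunks: RetrievedChunk[] = [
  { id: "a", text: "...", score: 0.9, tokens: 300_000 },
  { id: "b", text: "...", score: 0.8, tokens: 300_000 },
  { id: "c", text: "...", score: 0.5, tokens: 150_000 },
];

const packedIds = packChunks(exampleChunks, 500_000).map((c) => c.id);
console.log(packedIds);
```

Greedy packing is a design choice for simplicity: it can leave budget unused when a large chunk is skipped, so a production packer might split chunks or restore document order before building the prompt.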
Claude's long-context recall is strong but not uniform. Tips:
- Wrap each document in tags such as `<document index="1" title="...">...</document>`; the model indexes on these.

```typescript
const prompt = `
You will analyze the codebase below, then answer questions.

<codebase>
<file path="src/auth.ts">...</file>
<file path="src/db.ts">...</file>
...
</codebase>

Given the codebase above, answer: <question>How does auth flow work?</question>
`;
```
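The hand-written template above can also be generated from a list of files. The following is a minimal sketch: `SourceFile` and `buildCodebasePrompt` are hypothetical names, and the helper does no escaping, so it assumes paths contain no double quotes and file contents contain no literal `</file>` tag.

```typescript
// Hypothetical shape for one file of the corpus.
interface SourceFile {
  path: string;
  content: string;
}

// Build a prompt with the corpus first and the question last,
// wrapping each file in a <file path="..."> tag.
function buildCodebasePrompt(files: SourceFile[], question: string): string {
  const fileBlocks = files
    .map((f) => `<file path="${f.path}">\n${f.content}\n</file>`)
    .join("\n");
  return [
    "You will analyze the codebase below, then answer questions.",
    "<codebase>",
    fileBlocks,
    "</codebase>",
    `Given the codebase above, answer: <question>${question}</question>`,
  ].join("\n");
}

const examplePromptText = buildCodebasePrompt(
  [
    { path: "src/auth.ts", content: "export function login() {}" },
    { path: "src/db.ts", content: "export function connect() {}" },
  ],
  "How does auth flow work?",
);
console.log(examplePromptText);
```

Keeping the question after the closing `</codebase>` tag mirrors the layout of the template above: corpus first, instruction last.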
1M context is expensive per call. If you're asking multiple questions against the same corpus, cache it:
```typescript
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 4096,
  system: [
    { type: "text", text: "You are a code reviewer." },
    // Cache the large, stable prefix for one hour.
    { type: "text", text: giantCodebase, cache_control: { type: "ephemeral", ttl: "1h" } },
  ],
  messages: [{ role: "user", content: "What's the auth flow?" }],
});
```
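First call: full input cost plus a cache-write premium (writes are billed above the base input rate, and the 1-hour TTL costs more to write than the default 5-minute TTL; check current pricing). Subsequent calls in the TTL window: ~10% of input cost for the cached portion. See the prompt-caching-ttl skill.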
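A rough way to see when caching pays off is to compare total corpus cost with and without it across several questions. This is a sketch under stated assumptions: `CachePricing`, `compareCorpusCost`, and all numbers in `examplePricing` are hypothetical, `writeMultiplier` is a placeholder (set it to 1 to ignore any write premium), and every follow-up call is assumed to land inside the TTL window.

```typescript
// Hypothetical pricing inputs: placeholders, not quoted prices.
interface CachePricing {
  basePerMTok: number; // input price in USD per million tokens
  writeMultiplier: number; // first (cache-writing) call relative to base
  readMultiplier: number; // cache hits relative to base (~0.1 per the text above)
}

// Compare the corpus portion of input cost across `calls` requests.
// Assumption: every call after the first is a cache hit within the TTL window.
function compareCorpusCost(
  corpusTokens: number,
  calls: number,
  pricing: CachePricing,
): { uncached: number; cached: number } {
  if (calls <= 0) return { uncached: 0, cached: 0 };
  const perCall = (corpusTokens / 1_000_000) * pricing.basePerMTok;
  const uncached = perCall * calls;
  const cached =
    perCall * pricing.writeMultiplier +
    perCall * pricing.readMultiplier * (calls - 1);
  return { uncached, cached };
}

const examplePricing: CachePricing = {
  basePerMTok: 6, // placeholder value
  writeMultiplier: 2, // placeholder value
  readMultiplier: 0.1,
};

console.log(compareCorpusCost(1_000_000, 10, examplePricing));
```

With a write premium, a single question is cheaper uncached; the cache only pays for itself once a second or third question reuses the same corpus.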
1M input takes longer to process — TTFT can be 10-30s for a full context. Mitigate:
Use the token counter before sending:
```typescript
const { input_tokens } = await client.messages.countTokens({
  model: "claude-sonnet-4-6",
  messages: [{ role: "user", content: text }],
});

if (input_tokens > 1_000_000) throw new Error("Over context limit");
```
Budget with a 5% safety margin — actual tokenization varies slightly.
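The margin can be folded into the pre-send check. A minimal sketch, assuming a 1M-token window and that the reserved output (`max_tokens`) counts against the same window as the input; `fitsContextBudget` and its constants are hypothetical names. The margin is expressed as a whole percent to keep the arithmetic in integers.

```typescript
const CONTEXT_LIMIT_TOKENS = 1_000_000; // assumed window size
const SAFETY_MARGIN_PERCENT = 5; // headroom for tokenization variance

// Return true when input plus reserved output fits inside the
// context limit after subtracting the safety margin.
function fitsContextBudget(
  inputTokens: number,
  maxOutputTokens: number,
  limit: number = CONTEXT_LIMIT_TOKENS,
  marginPercent: number = SAFETY_MARGIN_PERCENT,
): boolean {
  const budget = Math.floor((limit * (100 - marginPercent)) / 100);
  return inputTokens + maxOutputTokens <= budget;
}

// Example: a counted input of 940,000 tokens with 4,096 reserved for output.
console.log(fitsContextBudget(940_000, 4096));
```

In practice, `inputTokens` would come from the `countTokens` call, and a failed check would trigger trimming or a fallback to retrieval rather than a thrown error at the API boundary.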