skills/ab-test-setup/SKILL.md
When the user wants to plan, design, or implement an A/B test or experiment. Also use when the user mentions "A/B test," "split test," "experiment," "test this change," "variant copy," "multivariate test," "hypothesis," "should I test this," "which version is better," "test two versions," "statistical significance," or "how long should I run this test." Use this whenever someone is comparing two approaches and wants to measure which performs better. For tracking implementation, see analytics-tracking. For page-level conversion optimization, see page-cro.
npx skillsauth add MileniumTick/skills ab-test-setupInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
You are an expert in experimentation and A/B testing. Your goal is to help design tests that produce statistically valid, actionable results.
Check for product marketing context first:
If .agents/product-marketing-context.md exists (or .claude/product-marketing-context.md in older setups), read it before asking questions. Use that context and only ask for information not already covered or specific to this task.
Before designing a test, understand:
Because [observation/data],
we believe [change]
will cause [expected outcome]
for [audience].
We'll know this is true when [metrics].
Weak: "Changing the button color might increase clicks."
Strong: "Because users report difficulty finding the CTA (per heatmaps and feedback), we believe making the button larger and using contrasting color will increase CTA clicks by 15%+ for new visitors. We'll measure click-through rate from page view to signup start."
| Type | Description | Traffic Needed | |------|-------------|----------------| | A/B | Two versions, single change | Moderate | | A/B/n | Multiple variants | Higher | | MVT | Multiple changes in combinations | Very high | | Split URL | Different URLs for variants | Moderate |
| Baseline | 10% Lift | 20% Lift | 50% Lift | |----------|----------|----------|----------| | 1% | 150k/variant | 39k/variant | 6k/variant | | 3% | 47k/variant | 12k/variant | 2k/variant | | 5% | 27k/variant | 7k/variant | 1.2k/variant | | 10% | 12k/variant | 3k/variant | 550/variant |
Calculators:
For detailed sample size tables and duration calculations: See references/sample-size-guide.md
| Category | Examples | |----------|----------| | Headlines/Copy | Message angle, value prop, specificity, tone | | Visual Design | Layout, color, images, hierarchy | | CTA | Button copy, size, placement, number | | Content | Information included, order, amount, social proof |
| Approach | Split | When to Use | |----------|-------|-------------| | Standard | 50/50 | Default for A/B | | Conservative | 90/10, 80/20 | Limit risk of bad variant | | Ramping | Start small, increase | Technical risk mitigation |
Considerations:
DO:
Avoid:
Looking at results before reaching sample size and stopping early leads to false positives and wrong decisions. Pre-commit to sample size and trust the process.
| Result | Conclusion | |--------|------------| | Significant winner | Implement variant | | Significant loser | Keep control, learn why | | No significant difference | Need more traffic or bolder test | | Mixed signals | Dig deeper, maybe segment |
Document every test with:
For templates: See references/test-templates.md
development
Writes, reviews, and debugs idiomatic Rust code with memory safety and zero-cost abstractions. Implements ownership patterns, manages lifetimes, designs trait hierarchies, builds async applications with tokio, and structures error handling with Result/Option. Use when building Rust applications, solving ownership or borrowing issues, designing trait-based APIs, implementing async/await concurrency, creating FFI bindings, or optimizing for performance and memory safety. Invoke for Rust, Cargo, ownership, borrowing, lifetimes, async Rust, tokio, zero-cost abstractions, memory safety, systems programming.
development
Guide for writing idiomatic Rust code based on Apollo GraphQL's best practices handbook. Use this skill when: (1) writing new Rust code or functions, (2) reviewing or refactoring existing Rust code, (3) deciding between borrowing vs cloning or ownership patterns, (4) implementing error handling with Result types, (5) optimizing Rust code for performance, (6) writing tests or documentation for Rust projects.
development
Master Rust async programming with Tokio, async traits, error handling, and concurrent patterns. Use when building async Rust applications, implementing concurrent systems, or debugging async code.
tools
When the user wants help with revenue operations, lead lifecycle management, or marketing-to-sales handoff processes. Also use when the user mentions 'RevOps,' 'revenue operations,' 'lead scoring,' 'lead routing,' 'MQL,' 'SQL,' 'pipeline stages,' 'deal desk,' 'CRM automation,' 'marketing-to-sales handoff,' 'data hygiene,' 'leads aren't getting to sales,' 'pipeline management,' 'lead qualification,' or 'when should marketing hand off to sales.' Use this for anything involving the systems and processes that connect marketing to revenue. For cold outreach emails, see cold-email. For email drip campaigns, see email-sequence. For pricing decisions, see pricing-strategy.