configs/skills/ads-test/SKILL.md
A/B test design and experiment planning for paid advertising. Structured hypothesis framework, statistical significance calculator, test duration estimator, sample size calculator, and platform-specific experiment setup guides (Meta Experiments, Google Experiments, LinkedIn A/B). Use when user says A/B test, split test, experiment design, test hypothesis, statistical significance, sample size, or test duration.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add shenxingy/claude-code-kit ads-test
Every test must start with a structured hypothesis:
IF we [change/action]
THEN [metric] will [increase/decrease] by [estimated %]
BECAUSE [reasoning based on data or insight]
Example:
IF we replace polished product shots with UGC creator videos
THEN Meta CTR will increase by 25-40%
BECAUSE Meta's Andromeda system prioritizes diverse creative formats, and UGC consistently outperforms polished creative in 2025-2026 benchmarks
Required Sample Size (per variant):
n = (Z_alpha + Z_beta)^2 × 2 × p × (1-p) / (p × MDE)^2
Where:
- Z_alpha = 1.96 (for 95% confidence, two-sided)
- Z_beta = 0.84 (for 80% power)
- p = baseline conversion rate
- MDE = minimum detectable effect as a relative lift (e.g. 0.20 for 20%), so p × MDE is the absolute difference to detect
Simplified lookup:
| Baseline CVR | 5% MDE | 10% MDE | 20% MDE | 30% MDE |
|-------------|---------|---------|---------|---------|
| 1% | 612,000 | 153,000 | 38,300 | 17,000 |
| 2% | 302,400 | 75,600 | 18,900 | 8,400 |
| 5% | 116,800 | 29,200 | 7,300 | 3,200 |
| 10% | 55,200 | 13,800 | 3,450 | 1,530 |
| 20% | 24,600 | 6,150 | 1,540 | 680 |
Approximate sample size per variant at 95% confidence and 80% power. MDE columns are relative lifts over the baseline CVR.
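The sample size formula above can be sketched in Python as follows. This is a minimal illustration, not a definitive calculator: the function name `sample_size_per_variant` is hypothetical, and it assumes the MDE is supplied as a relative lift and converted to an absolute difference (baseline × lift) before squaring. Results land within a few percent of the lookup table, whose values appear to be rounded.

```python
import math

Z_ALPHA = 1.96  # 95% confidence (two-sided), as in the formula above
Z_BETA = 0.84   # 80% power


def sample_size_per_variant(baseline_cvr: float, relative_mde: float) -> int:
    """Estimate the required sample size per variant.

    baseline_cvr: baseline conversion rate as a fraction (0.02 for 2%).
    relative_mde: minimum detectable effect as a relative lift (0.20 for 20%).
    """
    if not 0 < baseline_cvr < 1:
        raise ValueError("baseline_cvr must be between 0 and 1")
    if relative_mde <= 0:
        raise ValueError("relative_mde must be positive")

    # Assumption: the formula needs the absolute difference in conversion rate.
    absolute_mde = baseline_cvr * relative_mde
    variance_term = 2 * baseline_cvr * (1 - baseline_cvr)
    n = (Z_ALPHA + Z_BETA) ** 2 * variance_term / absolute_mde ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # 2% baseline CVR, 20% relative MDE: compare with the 18,900 table entry.
    print(sample_size_per_variant(0.02, 0.20))
```

Halving the MDE roughly quadruples the required sample, which is why small lifts on low-CVR campaigns are often impractical to test.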
Duration = Required Sample Size / Daily Traffic per Variant
Minimum duration: 7 days (capture weekly patterns)
Maximum recommended: 28 days (avoid seasonal drift)
Learning phase: Google 7-14 days, Meta 3-7 days, LinkedIn 7-14 days
Inputs needed:
- Daily impressions or clicks
- Number of variants (2 = A/B, 3+ = A/B/n or multivariate)
- Baseline conversion rate
- Minimum detectable effect desired
| Daily Clicks per Variant | 2% CVR, 20% MDE | 5% CVR, 20% MDE | 10% CVR, 20% MDE |
|-------------|-----------------|-----------------|-----------------|
| 100 | 189 days | 73 days | 35 days |
| 500 | 38 days | 15 days | 7 days |
| 1,000 | 19 days | 8 days | 4 days* |
| 5,000 | 4 days* | 2 days* | 1 day* |
*Minimum 7 days recommended regardless of sample sufficiency
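The duration rule (required sample divided by daily traffic per variant, with a 7-day floor and a 28-day recommended ceiling) could be sketched like this. The function name `estimate_duration` and the returned dictionary keys are hypothetical, chosen only for illustration. It assumes traffic is already expressed per variant.

```python
import math

MIN_DAYS = 7   # capture weekly patterns
MAX_DAYS = 28  # recommended ceiling to avoid seasonal drift


def estimate_duration(required_per_variant: int,
                      daily_traffic_per_variant: float) -> dict:
    """Estimate test duration in days from sample size and daily traffic."""
    if daily_traffic_per_variant <= 0:
        raise ValueError("daily_traffic_per_variant must be positive")

    # Round up: a partial day still has to be run to reach the sample.
    raw_days = math.ceil(required_per_variant / daily_traffic_per_variant)
    return {
        "raw_days": raw_days,
        # Never run shorter than the 7-day minimum, even with enough sample.
        "recommended_days": max(raw_days, MIN_DAYS),
        # Flag tests that would run past the recommended maximum.
        "exceeds_max": raw_days > MAX_DAYS,
    }


if __name__ == "__main__":
    # 18,900 per variant (2% CVR, 20% MDE) at 500 clicks/day per variant.
    print(estimate_duration(18_900, 500))
```

When `exceeds_max` is true, the usual options are to accept a larger MDE, test on a higher-traffic campaign, or choose a metric with a higher baseline rate.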
## A/B Test Plan
### Hypothesis
IF [change]
THEN [metric] will [direction] by [amount]
BECAUSE [reasoning]
### Test Design
| Parameter | Value |
|-----------|-------|
| Platform | [platform] |
| Test Type | [A/B / Multivariate] |
| Variable | [what's being changed] |
| Control | [current state] |
| Variant | [proposed change] |
| Primary Metric | [KPI] |
| Traffic Split | [50/50 / other] |
### Sample Size & Duration
| Metric | Value |
|--------|-------|
| Baseline CVR | [X%] |
| MDE | [X%] |
| Required Sample | [N per variant] |
| Daily Traffic | [N clicks/day] |
| Est. Duration | [X days] |
| Min Duration | 7 days |
### Success Criteria
- Winner declared at 95% confidence
- [Primary metric] improvement of [X%]+ sustained over [Y] days
- No negative impact on [secondary metric]
### Setup Instructions
[Platform-specific step-by-step]
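For the "Winner declared at 95% confidence" criterion in the plan above, one common check is a two-proportion z-test. The source does not specify which test to use, so the following is a sketch under that assumption. The function name `two_proportion_z_test` is hypothetical, and it uses only the standard library.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int,
                          conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) comparing variant B against control A."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("sample sizes must be positive")

    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no conversions or all conversions: nothing to test

    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    z, p = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
    verdict = "significant at 95%" if p < 0.05 else "not significant"
    print(f"z={z:.2f}, p={p:.4f}, {verdict}")
```

Run the check only after the planned sample size and minimum duration are reached. Checking repeatedly during the test and stopping at the first significant result inflates the false-positive rate.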