skills/product-manager-toolkit/SKILL.md
Use when the user needs feature prioritization (RICE scoring, backlog ranking, capacity planning), customer interview analysis (transcript parsing, insight extraction, cross-interview synthesis), PRD writing or review, or product discovery frameworks (opportunity mapping, hypothesis testing). NEVER use for marketing strategy or go-to-market campaigns (use marketing-strategy-pmm). NEVER use for project execution, sprint planning, or delivery tracking (use executing-plans). NEVER use for standalone user research methodology beyond interview transcript analysis.
npx skillsauth add sharkitect-solutions/sharkitect-claude-toolkit product-manager-toolkitInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
| File | Purpose | When to use |
|------|---------|-------------|
| scripts/rice_prioritizer.py | RICE scoring, portfolio analysis, roadmap generation | Feature prioritization with CSV input, capacity planning |
| scripts/customer_interview_analyzer.py | NLP-based transcript analysis: pain points, JTBD, sentiment, themes | Single interview analysis or multi-interview aggregation |
| references/prd_templates.md | 4 PRD formats: Standard (11-section), One-Page, Agile Epic, Feature Brief | Any requirements documentation task |
| This skill handles | Defer to | |--------------------|----------| | Feature prioritization (RICE, scoring, ranking) | -- | | Customer interview transcript analysis | -- | | PRD writing, review, and structure | -- | | Product discovery frameworks | -- | | Marketing strategy, GTM campaigns, positioning | marketing-strategy-pmm | | Project execution, sprint management, delivery | executing-plans | | Writing plans and roadmap documents | writing-plans | | User research design (survey creation, study planning) | Out of scope (no skill) |
RICE and similar frameworks produce systematically wrong answers in predictable situations. Detect and correct these before presenting results.
| Trap | Why RICE fails | Correction | |------|---------------|------------| | Infrastructure/platform work | Reach = 0 direct users, so score = 0. But every feature depends on it. | Score Reach as "all users of features this unblocks" with a time-decay factor. Or exempt from RICE entirely and allocate a fixed % (15-25%) of capacity. | | Technical debt reduction | No visible user impact, so Impact = minimal. But velocity degrades without it. | Track "developer hours lost per quarter" as a proxy Reach metric. If velocity dropped >15% quarter-over-quarter, escalate debt items above RICE ranking. | | Defensive features (security, compliance) | Low reach, low impact -- until a breach or audit failure. | Apply a "catastrophic multiplier": if the downside of NOT doing it is existential (data breach, regulatory fine, SOC2 failure), it bypasses RICE. | | Overconfident Confidence scores | Teams rate Confidence: High on features they like, regardless of evidence. | Require evidence tiers: High = quantitative data from 50+ users; Medium = qualitative from 10+ interviews; Low = team opinion only. | | Effort anchoring | Effort estimates cluster around "M" because teams avoid extremes. | Force calibration: pick the single easiest item (XS anchor) and hardest (XL anchor) first, then score everything relative to those anchors. | | Reach inflation for B2B | "All enterprise customers" = 50 accounts, not 50,000 users. Score looks tiny. | Use revenue-weighted reach: Reach = number of accounts x average contract value. Normalize across the portfolio. | | Feature cannibalization | Two features score high individually but compete for the same users. | Run overlap analysis: if >60% of Feature A's reach overlaps Feature B's, only count the incremental reach for the lower-scored one. |
| Confidence level | Required evidence | Typical accuracy | |-----------------|-------------------|-----------------| | High (100%) | A/B test data, 50+ user quantitative validation, or existing usage analytics | 80-90% of projected impact realized | | Medium (80%) | 10+ qualitative interviews, competitor benchmarks, or analogous feature data | 50-70% of projected impact realized | | Low (50%) | Team intuition, stakeholder request, or <5 data points | 20-40% of projected impact realized |
What separates a PRD that gets built from one that gets shelved:
| Situation | Template | From references/prd_templates.md |
|-----------|----------|-------------------------------------|
| Major feature, 6+ weeks, cross-functional | Standard PRD (11-section) | Section 1 |
| Small feature, 2-4 weeks, single team | One-Page PRD | Section 3 |
| Sprint-based delivery, agile team | Agile Epic Template | Section 2 |
| Exploration phase, pre-commitment | Feature Brief | Section 4 |
The scripts/customer_interview_analyzer.py extracts signals, but the PM must catch these systematic biases before acting on the output.
| Bias | How to detect in transcript | What to do | |------|---------------------------|------------| | Confirmation bias | Interviewer follows up enthusiastically on answers that match their hypothesis, drops threads that don't | Re-read dropped threads. Count disconfirming evidence separately. If ratio of confirming:disconfirming > 5:1, suspect bias. | | Leading questions | "Don't you think X would be helpful?" or "Would you say X is a problem?" | Flag any question containing "don't you," "would you say," or "wouldn't it be." Discount answers to leading questions. | | Courtesy bias | Interviewee says everything is "great" or "sounds good" with no specifics | Look for behavioral evidence: "When did you last actually do X?" If they can't give a specific instance, discount the positive. | | Survivorship bias | Only interviewed current users; didn't talk to churned users or non-adopters | Note if sample = current users only. Their pain points differ from why people leave or never sign up. | | Recency bias | Interviewee anchors on last week's experience, not the pattern over months | Ask "Is this typical or was last week unusual?" Weight recurring patterns over one-time events. | | Small sample extrapolation | PM treats 3 interviews as representative of 10,000 users | Minimum viable sample: 5 interviews for theme emergence, 12+ for pattern confidence, 20+ for segmentation. Below 5, label all findings as "hypotheses." |
aggregate_interviews() function to merge findingsWhen a PM thinks they have buy-in but actually don't:
| Failure mode | Warning sign | Prevention | |-------------|-------------|------------| | Silent disagreement | Stakeholder says "looks good" in meeting but never references the decision afterward | Ask explicitly: "What concerns do you have?" Silence is not agreement. Follow up async within 24h. | | Scope creep as sabotage | Engineering lead keeps adding "must-have" requirements after sign-off | Lock scope with a signed-off doc. Any addition after lock requires a trade-off: "What do we cut to add this?" | | Executive override | VP casually mentions a "small change" that invalidates the PRD | Treat any post-approval executive input as a formal change request. Assess impact on timeline and present trade-offs. | | Different success definitions | PM measures adoption, engineering measures performance, sales measures pipeline | Align on ONE north star metric + 2-3 supporting metrics BEFORE kickoff. Document in PRD. | | HIPPO override of data | Highest-Paid Person's Opinion overrides interview data or analytics | Present data first, opinion second. Frame as "the data suggests X -- do we have additional context that changes this?" |
| Vanity metric (avoid as primary) | Actionable alternative | Why | |----------------------------------|----------------------|-----| | Total registered users | Monthly active users (MAU) | Registrations include abandoned accounts | | Page views | Time-to-value (first meaningful action) | Views don't indicate value delivery | | App downloads | Day-7 retention rate | Downloads without retention = waste | | "NPS score" in isolation | NPS segmented by cohort + follow-up action rate | Aggregate NPS hides segment problems | | Feature usage count | Feature adoption rate (% of eligible users) | Raw count is meaningless without a denominator |
"When a measure becomes a target, it ceases to be a good measure." Watch for:
Antidote: Always pair a primary metric with a "health check" counter-metric. If conversion rate goes up, check that absolute conversions also increased (not just smaller denominator). If engagement rises, check that unsubscribe/mute rates haven't spiked.
| Dimension | What this skill adds beyond Claude's base knowledge | |-----------|-----------------------------------------------------| | RICE trap detection | 7 specific scenarios where RICE produces systematically wrong rankings, with correction procedures | | PRD quality signals | Concrete 5-signal checklist separating PRDs that ship from ones that stall, based on organizational behavior patterns | | Interview bias catalog | 6 named biases with detection heuristics and minimum sample sizes for valid findings | | Stakeholder failure modes | 5 alignment failures with early warning signs -- organizational dynamics Claude can't derive from PM textbooks | | Metric traps | Vanity-vs-actionable mapping plus Goodhart's Law detection patterns specific to product contexts | | Companion tooling | 3 scripts/references for deterministic execution of prioritization, analysis, and documentation |
development
When the user wants help with paid advertising campaigns on Google Ads, Meta (Facebook/Instagram), LinkedIn, Twitter/X, or other ad platforms. Also use when the user mentions 'PPC,' 'paid media,' 'ad copy,' 'ad creative,' 'ROAS,' 'CPA,' 'ad campaign,' 'retargeting,' or 'audience targeting.' This skill covers campaign strategy, ad creation, audience targeting, and optimization.
testing
--- name: using-sharkitect-methodology description: Use when starting any conversation in a Sharkitect workspace OR before any task involving NEW pricing, positioning, proposal, strategy, plan-execution, or schema-design work — mandates invocation of Sharkitect-specific methodology skills (pricing-strategy, marketing-strategy-pmm, smb-cfo, hq-revenue-ops, executing-plans, brainstorming) under the same anti-rationalization discipline as using-superpowers. Documentation has failed 4 times across H
testing
Use when user says 'end session', 'wrap up', 'stop for the day', 'done for today', 'close out', 'save session', 'wrapping up', or invokes /end-session. Runs the full 9-step end-of-session protocol: resource audit, MEMORY.md update, lessons capture, plan status, pending items, workspace checklist, .tmp/ audit, git commit+push, Supabase brain sync, session brief, summary. Final step schedules a detached self-kill of the current session ONLY (3s delay) so the window closes cleanly. Other claude.exe processes (active workspaces) are NOT touched -- orphan cleanup is handled separately by Claude-Orphan-Cleanup-Hourly with proper age safeguards. Do NOT use for: mid-session quick saves (use session-checkpoint), skill syncing (use sync-skills.py), brain memory queries (use supabase-sync.py pull), document freshness reviews (use document-lifecycle), resource gap detection (use resource-auditor).
testing
Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comprehensive "Signs of AI writing" guide. Detects and fixes patterns including: inflated symbolism, promotional language, superficial -ing analyses, vague attributions, em dash overuse, rule of three, AI vocabulary words, passive voice, negative parallelisms, and filler phrases.