skills/cto-advisor/SKILL.md
Use when advising on CTO-level decisions: tech debt prioritization, rewrite vs refactor, build vs buy, engineering team scaling, architecture governance, vendor lock-in assessment, or production reliability strategy. NEVER for individual contributor coding tasks, code review, or project management mechanics.
npx skillsauth add sharkitect-solutions/sharkitect-claude-toolkit cto-advisor

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
What type of decision?
Not all tech debt is bad. Deliberate debt (ship faster, fix later with a plan) is a legitimate strategy. Accidental debt (didn't know better) is the real problem. The key insight: tech debt has an INTEREST RATE -- some debt compounds weekly, some barely accrues at all.
|                    | LOW COMPOUND RATE              | HIGH COMPOUND RATE               |
|--------------------|--------------------------------|----------------------------------|
| HIGH IMPACT        | Schedule fix (next quarter)    | FIX NOW -- this is an emergency  |
| (many teams hit)   | e.g., legacy UI nobody loves   | e.g., shared DB coupling         |
| LOW IMPACT         | Ignore safely -- track it      | Fix opportunistically            |
| (one team/service) | e.g., old admin panel          | e.g., one team's test debt       |
Ask: "If we ignore this for 6 months, how much worse does it get?"
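The 2x2 matrix above can be read as a small triage function. The following is a minimal sketch, not part of the original skill: the `DebtItem` type and its two boolean fields are hypothetical names, and deciding whether an item counts as "high" on either axis is still a judgment call informed by the 6-month question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebtItem:
    """Hypothetical record for one tech debt item."""
    name: str
    high_impact: bool    # True if many teams are hit, False if one team/service
    high_compound: bool  # True if it gets much worse when ignored for 6 months


def triage(item: DebtItem) -> str:
    """Map an item to the quadrant action from the matrix."""
    if item.high_impact and item.high_compound:
        return "fix now"
    if item.high_impact:
        return "schedule fix (next quarter)"
    if item.high_compound:
        return "fix opportunistically"
    return "ignore safely, track it"


print(triage(DebtItem("shared DB coupling", True, True)))
print(triage(DebtItem("old admin panel", False, False)))
```

The booleans are deliberately coarse. The matrix is a sorting aid, and finer scoring only matters once several items land in the same quadrant.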
"20% of sprint capacity for tech debt" sounds disciplined but fails in practice. Why: teams treat it as a dumping ground for pet refactors, not strategic debt. Instead, treat debt items like features -- prioritize by impact x compound rate, fund the top items explicitly, and track outcomes.
Big-bang rewrites fail because the old system keeps changing during the rewrite. By the time the new system is "done," the requirements have moved. The team ends up maintaining two systems indefinitely.
Route new traffic to new code. Migrate old functionality piece by piece. The old system shrinks naturally. This works because you're never maintaining two complete systems.
ALL THREE must be true:
If any condition is false, use Strangler Fig instead. "We can rewrite it better" is almost always wrong -- teams rewrite it DIFFERENTLY, not better, and introduce new bugs in previously-stable code.
BEFORE (common mistake): "We have 47 tech debt items. Let's allocate 20% of each sprint to chip away at them. Engineers can pick what they want to work on." --> Result: 6 months later, the 47 debt items have become 52. Engineers fixed what was fun, not what mattered. New debt accumulated faster than old debt was paid down. No measurable impact on velocity.
AFTER (expert approach): "We scored all debt by impact x compound rate. Three items are high-impact, high-compound: shared database coupling, missing integration tests on the payment path, and the hand-rolled auth system. We're funding a 2-person team for 6 weeks to strangle the shared DB. The other two go into next quarter's roadmap as first-class work items with success metrics." --> Result: shared DB migration completes. Deployment frequency for 3 teams doubles. The two remaining items have clear timelines and owners.
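One way to make "scored all debt by impact x compound rate" concrete is a plain product of two scores, sorted in descending order. This is a sketch under stated assumptions: the 1-5 scales, the `Debt` type, and every score below are hypothetical illustrations, not values from the original text.

```python
from typing import NamedTuple


class Debt(NamedTuple):
    name: str
    impact: int         # hypothetical 1-5 scale: how many teams it slows down
    compound_rate: int  # hypothetical 1-5 scale: how fast it worsens if ignored


def priority(item: Debt) -> int:
    """Score an item as impact x compound rate."""
    return item.impact * item.compound_rate


def top_items(items: list[Debt], n: int) -> list[Debt]:
    """Return the n highest-priority items, highest first."""
    return sorted(items, key=priority, reverse=True)[:n]


# Illustrative backlog; the scores are made up for the example.
backlog = [
    Debt("old admin panel", 1, 1),
    Debt("shared database coupling", 5, 5),
    Debt("legacy UI", 4, 1),
    Debt("missing payment-path integration tests", 5, 4),
    Debt("hand-rolled auth system", 4, 4),
]

funded = top_items(backlog, 3)
for item in funded:
    print(priority(item), item.name)
```

The output is only a ranked shortlist. Funding, owners, and success metrics still have to be assigned explicitly, as in the AFTER example.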
Build ONLY if it is a core differentiator AND you have the team to maintain it for years. Buy everything else.
"We can build it better" is almost always wrong. You can build it differently. You cannot build it better than a company whose entire business is that one product, while also building your actual product.
Building is 20% of the cost. Maintaining is 80%. When you build internally:
Build if ALL are true:
Buy if ANY are true:
Hiring too fast is worse than hiring too slow. Every new hire reduces team productivity for 3-6 months (onboarding cost). More than 25% growth per quarter breaks knowledge transfer.
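As a rough illustration of the 25% ceiling, the sketch below turns it into a per-quarter hiring cap and projects headcount over a year. It assumes the percentage applies to headcount at the start of each quarter and rounds down; both choices, and the function names, are assumptions rather than something the text specifies.

```python
import math


def max_hires_this_quarter(headcount: int, cap: float = 0.25) -> int:
    """Largest whole number of hires that keeps growth at or below the cap."""
    return math.floor(headcount * cap)


def project_headcount(start: int, quarters: int, cap: float = 0.25) -> list[int]:
    """Headcount at the start and after each quarter when hiring at the cap."""
    sizes = [start]
    for _ in range(quarters):
        sizes.append(sizes[-1] + max_hires_this_quarter(sizes[-1], cap))
    return sizes


print(max_hires_this_quarter(20))  # 5
print(project_headcount(20, 4))    # [20, 25, 31, 38, 47]
```

Even at the cap, a 20-person team more than doubles within a year, so the ceiling is a limit on growth rate and not a target.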
Adding people to a late project makes it later (Brooks's Law). Communication paths grow as n*(n-1)/2. A team of 5 has 10 paths. A team of 10 has 45. A team of 20 has 190. The coordination cost eventually exceeds the productivity gain.
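The path counts quoted above follow directly from the n*(n-1)/2 formula. A minimal check, with `communication_paths` as an illustrative function name:

```python
def communication_paths(team_size: int) -> int:
    """Number of distinct person-to-person pairs in a team: n*(n-1)/2."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


for n in (5, 10, 20):
    print(n, communication_paths(n))
# 5 10
# 10 45
# 20 190
```

Doubling a team from 10 to 20 roughly quadruples the number of pairs (45 to 190), which is the quadratic growth behind the coordination cost.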
Team topology matters more than headcount. Two well-structured teams of 5 outperform one team of 12 every time.
ONLY when you have 5+ product teams. Before that, platform work is a tax on product delivery -- the overhead of a separate team exceeds the benefit.
Signs you actually need a platform team:
Dedicated architecture teams become disconnected from production reality within 6 months. They produce designs nobody implements. Instead: embed senior architects in product teams. Have them rotate quarterly. Architecture decisions come from people who live with the consequences.
Component teams (frontend team, backend team, database team) are intuitive but create handoff bottlenecks. Every feature requires coordination across 3+ teams.
Stream-aligned teams (each owns a business capability end-to-end) have higher autonomy and faster delivery. They take 6-12 months to become fully effective, but after the first year they consistently outperform component teams.
Switch to stream-aligned when: feature delivery requires more than 2 team handoffs on average.
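The trigger above is an average, so it needs a count of handoffs per delivered feature. A small sketch, assuming you can pull those counts from a tracker; the function name and the sample data are hypothetical:

```python
from statistics import mean


def should_switch_to_stream_aligned(
    handoffs_per_feature: list[int], threshold: float = 2.0
) -> bool:
    """True when the average number of team handoffs per feature exceeds the threshold."""
    if not handoffs_per_feature:
        raise ValueError("need at least one feature to compute an average")
    return mean(handoffs_per_feature) > threshold


recent_features = [3, 2, 4, 1]  # made-up handoff counts for four features
print(should_switch_to_stream_aligned(recent_features))  # True (average 2.5)
```

A handful of features is a noisy sample, so it is probably safer to average over a quarter or more of delivered work before reorganizing around the result.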
When entering a new technology area (first ML project, first mobile app, first distributed system): hire senior engineers first. They build the patterns, tooling, and standards. Juniors hired into a domain with no senior guidance will build something that needs to be rewritten.
When the patterns are set, the CI/CD works, the tests are comprehensive: juniors can extend and contribute effectively. They learn from the codebase and the surrounding seniors.
Technologies change every 2-3 years. Hire for learning ability, systems thinking, and communication. A senior Go developer who has never touched your stack will outperform a junior who already knows it within 3 months.
One engineering hire per month is sustainable with a single recruiter and existing interview capacity. Two per month requires a dedicated recruiting function. Four+ per month requires a recruiting TEAM and will still probably result in a lowered hiring bar.
Most CTO failures come from playing the wrong role for the company stage. Every transition feels like a demotion -- you're giving up what you're best at.
| Stage      | IC Work | Management | Where You Add Value                     |
|------------|---------|------------|-----------------------------------------|
| Seed       | 80%     | 20%        | Writing code, making architecture calls |
| Series A   | 50%     | 50%        | Building team, setting standards        |
| Series B   | 20%     | 80%        | Strategy, hiring, cross-functional work |
| Series C+  | 5%      | 95%        | Board, fundraising tech narrative       |
If these ratios don't shift as the company grows, YOU are the bottleneck. The most common failure: a seed-stage CTO who is still writing 80% code at Series B. The team can't grow because all decisions funnel through one person's PR reviews.
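The stage table can double as a quick self-check on how far actual time allocation has drifted. The sketch below encodes the IC percentages from the table; the 15-point tolerance and the function names are assumptions added for illustration.

```python
# Expected share of time spent on IC work, in percent, taken from the stage table.
EXPECTED_IC_PERCENT = {
    "Seed": 80,
    "Series A": 50,
    "Series B": 20,
    "Series C+": 5,
}


def ic_overshoot(stage: str, actual_ic_percent: int) -> int:
    """Percentage points of IC work above what the table suggests for this stage."""
    return actual_ic_percent - EXPECTED_IC_PERCENT[stage]


def is_bottleneck_risk(stage: str, actual_ic_percent: int, tolerance: int = 15) -> bool:
    """Flag when IC time exceeds the stage guideline by more than the tolerance.

    The tolerance is a hypothetical threshold, not a value from the table.
    """
    return ic_overshoot(stage, actual_ic_percent) > tolerance


print(ic_overshoot("Series B", 80))        # 60
print(is_bottleneck_risk("Series B", 80))  # True
print(is_bottleneck_risk("Seed", 80))      # False
```

The Series B case with 80% IC work is the failure described above: 60 points over the guideline.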
Make irreversible decisions slowly and reversible ones fast.
Irreversible (take weeks, involve many people):
Reversible (decide in a day, change if wrong):
"Let's use microservices" is the new "let's use Java." Most companies under 50 engineers should run a modular monolith. Microservices add: network latency, distributed debugging complexity, deployment orchestration overhead, data consistency challenges.
Start with a monolith. Extract services only when a specific module has different scaling needs OR a separate team needs to deploy independently. The trigger should be operational pain, not architectural aesthetics.
The technical migration is usually 20% of the switching cost. The other 80%:
Multi-cloud sounds like good risk management. In practice: you pay more (no volume discounts), your team needs to know two platforms, your abstractions leak, and you use the lowest common denominator of both clouds instead of the best features of one.
When multi-cloud IS justified:
For everyone else: pick one cloud, use it well, negotiate good pricing.
For any new vendor or service, ask: "What happens when we need to leave?"
If the answer to the first two questions is "no," either negotiate data portability guarantees in the contract or factor 6-12 months of migration work into your long-term cost model.
"Stability vs speed" is a false dichotomy. The data (from Accelerate/DORA research) shows: the fastest teams deploy the most AND break the least. High reliability enables faster deployment because the team trusts the system.
Low reliability --> fear of deploying --> larger, riskier batches --> more failures --> even lower reliability (death spiral)
High reliability --> confidence in deploying --> smaller, safer changes --> fewer failures --> even higher reliability (virtuous cycle)
Most outages are caused by deployments, not infrastructure failures. The single most impactful reliability practice: make rollback instant and automatic. If every deployment can be reverted in under 2 minutes, the cost of a bad deploy drops from "hours of debugging" to "2 minutes of downtime."
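"Instant and automatic" rollback usually means a health check gates every release and triggers the revert without a human in the loop. The following is a minimal sketch of that control flow only. The `deploy`, `rollback`, and `is_healthy` callables are hypothetical hooks you would wire to your own tooling, and the default of 5 checks at 10-second intervals is an assumption chosen to stay inside the 2-minute budget.

```python
import time
from typing import Callable


def deploy_with_auto_rollback(
    deploy: Callable[[], None],
    rollback: Callable[[], None],
    is_healthy: Callable[[], bool],
    checks: int = 5,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Deploy, watch health for checks * interval_s seconds, revert on first failure.

    Returns True if the release stayed healthy, False if it was rolled back.
    """
    deploy()
    for _ in range(checks):
        sleep(interval_s)
        if not is_healthy():
            rollback()  # no human approval step: the revert is automatic
            return False
    return True


# Usage with stand-in hooks and a no-op sleep so the example runs instantly.
events: list[str] = []
kept = deploy_with_auto_rollback(
    deploy=lambda: events.append("deploy"),
    rollback=lambda: events.append("rollback"),
    is_healthy=lambda: False,  # simulate a bad release
    sleep=lambda seconds: None,
)
print(kept, events)  # False ['deploy', 'rollback']
```

The sketch only covers the decision loop. How fast the revert itself completes depends on the deployment mechanism behind the `rollback` hook.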
Root cause analysis often stops at "why #2" because it gets political. "Why did the deploy fail?" -> "Because there were no integration tests" -> "Because the team doesn't have time to write tests" -> "Because leadership prioritizes features over quality" -> uncomfortable silence.
If your postmortems consistently stop at the proximate cause (bad code, missed test) instead of the systemic cause (incentive structure, staffing, process), they won't prevent recurrence.
| Decision Mistake | Sounds Like | Why It Fails | Expert Move |
|------------------|-------------|--------------|-------------|
| "Let's rewrite it properly" | "The codebase is unmaintainable" | Old system keeps changing during rewrite; you end up maintaining two systems | Strangler fig: route new traffic to new code, migrate piece by piece |
| "We need microservices" | "We need to scale" | 90% of startups under 50 engineers don't need distributed systems overhead | Start monolith, extract services only when operational pain demands it |
| "Let's build it ourselves" | "We can build it better and cheaper" | You can build it DIFFERENTLY; maintenance is 80% of cost | Build only core differentiators; buy commodity |
| "We should go multi-cloud" | "We can't depend on one vendor" | Double the cost, half the expertise, lowest common denominator features | Single cloud done well beats multi-cloud done poorly |
| "Let's hire faster" | "We need more engineers to ship faster" | Communication overhead grows quadratically; onboarding consumes existing team capacity | Max 25% headcount growth per quarter; senior-first |
| "20% of sprints for tech debt" | "We'll chip away at it" | Becomes a dumping ground for pet refactors; no strategic prioritization | Fund specific debt items as first-class projects with success metrics |
| "Everyone should be full-stack" | "We need flexibility" | Generalists plateau at intermediate in each specialty; deep problems need deep expertise | T-shaped: deep in one area, functional in adjacent areas |
| "Let's create an architecture team" | "We need architectural consistency" | Disconnects from production reality within 6 months; designs nobody implements | Embed architects in product teams, rotate quarterly |