You just spent two weeks pushing your codebase to 100% test coverage. Every line, every branch, every function — green. The badge is shining on your README. Then you deploy and a regression breaks the most important user flow in your application. Everything was “covered.” Nothing was actually tested.
If that story sounds familiar, you’ve discovered the dirty secret of software metrics: coverage measures how much code your tests touch, not how much behaviour they verify. You can reach 100% by rendering a component with zero assertions. You can hit every branch with snapshot tests that nobody reads. The number goes up. Confidence doesn’t.
> “Write tests. Not too many. Mostly integration.” — Guillermo Rauch
## The 100% Coverage Trap
Chasing full coverage creates three predictable problems, and understanding them is the first step toward a healthier testing strategy.
| Problem | What Happens | Why It Hurts |
|---|---|---|
| False confidence | Tests execute every line but assert on nothing meaningful. Snapshot tests and no-assertion renders inflate the number. | You trust a metric that doesn’t reflect reality. Bugs ship under a green dashboard. |
| Slow test suites | Covering every private function and internal edge case adds hundreds of tests that don’t catch real bugs. | CI takes 15+ minutes. Engineers stop running tests locally and “push and pray.” |
| Brittle tests | Tests coupled to implementation details break on every refactor — even when behaviour is unchanged. | Engineers stop refactoring. The codebase calcifies. Tests become a liability, not a safety net. |
If your test suite breaks during a refactor that doesn’t change any external behaviour, your tests are testing implementation details, not user-facing outcomes. Those tests are working against you.
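The distinction is easier to see in code. Here is a minimal TypeScript sketch — the `cartTotal` module, its names, and its numbers are all invented for illustration, not taken from any real codebase:

```typescript
type Item = { price: number; qty: number };

function cartTotal(items: Item[]): number {
  // How the rounding happens internally is an implementation detail;
  // a behaviour-focused test never asserts on it directly.
  return Math.round(items.reduce((sum, i) => sum + i.price * i.qty, 0) * 100) / 100;
}

// Behaviour-focused check: inputs in, outputs out.
// It survives any refactor that keeps the total the same.
const total = cartTotal([{ price: 19.99, qty: 2 }, { price: 5.0, qty: 1 }]);
```

A brittle version of this test would spy on an internal rounding helper and break the moment you restructure the module. This one only fails if the total a user actually sees changes.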
## The Testing Trophy
Kent C. Dodds’ testing trophy offers a modern alternative to the classic testing pyramid for full-stack applications. Think of it as a guide for where to invest your testing time.
| Layer | Volume | Speed | Confidence | What It Catches |
|---|---|---|---|---|
| Static Analysis | Runs on every keystroke | Instant | Low–Medium | Type errors, lint violations, formatting |
| Unit Tests | Some | Fast | Low–Medium | Pure function logic, utility edge cases |
| Integration Tests | Most test code | Medium | High | Components + hooks + API + DB working together |
| E2E Tests | Few | Slow | Highest | Full user journeys across the real stack |
The key insight is that integration tests give the best confidence-to-cost ratio. They test real behaviour — a component rendering with real hooks, calling a real API, returning real data — without the overhead of spinning up an entire browser. Static analysis is essentially free once configured, so invest there heavily. Unit tests shine for pure logic. And E2E tests are reserved for the journeys where a failure would cost real money or real trust.
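“Essentially free once configured” mostly means turning the strictness up once and letting the compiler work on every keystroke thereafter. A typical starting point for a `tsconfig.json` (these are standard TypeScript compiler flags; which ones suit your project is a judgment call):

```jsonc
{
  "compilerOptions": {
    "strict": true,                    // enables the whole strict-checks family
    "noUncheckedIndexedAccess": true,  // arr[i] is T | undefined, not T
    "noFallthroughCasesInSwitch": true,
    "noUnusedLocals": true
  }
}
```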
## What to Test vs What to Skip
One of the hardest skills to develop is knowing when not to write a test. Here’s a decision framework.
| Test This | Skip This |
|---|---|
| Critical user journeys (signup, checkout, payments) | Third-party library internals — they have their own tests |
| API endpoint responses and status codes | TypeScript types at runtime — the compiler already enforces them |
| Component behaviour from the user’s perspective | Implementation details like which internal function was called |
| Complex business logic with many edge cases | Trivial getters like `const getName = (u) => u.name` |
| Error handling paths (what breaks when a dependency fails?) | One-off migration scripts — manual verification is fine |
A useful litmus test: if you refactored the internals of a module without changing its inputs or outputs, would your test break? If yes, you’re testing implementation. Rewrite it to assert on behaviour — inputs in, outputs out, side effects verified.
## Confidence-Driven Testing
Instead of asking “what’s our coverage percentage?”, ask a better question: “How confident am I that this deploy won’t break anything users care about?”
This reframes your entire approach. You start from the user’s perspective and work backward:
- Identify critical user journeys. What are the 5–10 flows that generate the most business value? Signup, purchase, core feature usage.
- Map what could break them. Auth failures, payment processing errors, data corruption, third-party API outages.
- Write tests that catch those breaks. Not tests that cover lines — tests that simulate real scenarios and assert on real outcomes.
A single integration test that sends a real HTTP request, hits a real database, and verifies the response tells you more about production readiness than fifty unit tests full of mocks.
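In that spirit, here is a framework-free sketch of an integration-style check. A real version would send an actual HTTP request to a running server and hit a real database; below, a plain handler function and an in-memory `Map` stand in for both, and every name is hypothetical:

```typescript
type User = { id: number; email: string };
type HttpResponse = { status: number; body: unknown };

// Hypothetical route handler; the Map stands in for the database.
function createUser(db: Map<number, User>, email: string): HttpResponse {
  if (!email.includes("@")) {
    return { status: 422, body: { error: "invalid email" } };
  }
  const user: User = { id: db.size + 1, email };
  db.set(user.id, user);
  return { status: 201, body: user };
}

// The "test": exercise the endpoint and assert on the response and the
// side effect in the store, never on which internal functions were called.
const db = new Map<number, User>();
const created = createUser(db, "ada@example.com");
const rejected = createUser(db, "not-an-email");
```

Note that the same few lines cover both the happy path and an error path — exactly the scenarios the framework above says to prioritise.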
## Metrics That Actually Matter
I’ve replaced the coverage percentage with four metrics that genuinely predict reliability.
| Metric | What It Measures | Target | How to Track |
|---|---|---|---|
| Bug escape rate | Bugs reaching production per deploy | < 1 per 100 deploys | Error tracking + deploy log |
| Time to detect | How fast you learn about escaped bugs | < 5 min for critical issues | Alerting timestamps (mean time to detect) |
| Deploy confidence | Team’s gut trust in the test suite | > 4.0 / 5.0 | Weekly 1-question survey |
| Suite speed | Whether engineers actually run tests | < 5 min locally | CI metrics dashboard |
Bug escape rate is your north star — it directly measures whether your tests are catching what matters. Time to detect combines your tests with monitoring to give a full safety picture. Deploy confidence is qualitative but powerful: if the team doesn’t trust the suite, the number on the dashboard is irrelevant. And suite speed is the silent killer — a 20-minute suite is a suite nobody runs locally.
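The north-star metric is simple enough to compute in a few lines. A sketch — the function name is mine, and the figures are invented; real inputs would come from your error tracker and deploy log:

```typescript
// Bugs reaching production per 100 deploys.
function bugEscapeRate(escapedBugs: number, deploys: number): number {
  return deploys === 0 ? 0 : (escapedBugs / deploys) * 100;
}

const rate = bugEscapeRate(3, 400); // ≈ 0.75, within the "< 1 per 100" target
```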
## Practical Test Budgeting
Here’s how I allocate testing effort on a typical feature:
| Category | Time Budget | Why |
|---|---|---|
| Integration tests | 50% | API endpoints with real databases, components with real hooks. Highest confidence per hour invested. |
| E2E tests | 20% | 2–3 Playwright tests covering the happy path and the most critical error path. |
| Static analysis | 20% | TypeScript strict mode, ESLint rules, Zod schemas for runtime validation at boundaries. |
| Unit tests | 10% | Only for pure utility functions and genuinely complex business logic calculations. |
This ratio is intentionally inverted from most testing guides. Conventional wisdom says lots of unit tests, few integration tests. In my experience, the opposite produces more confidence per hour of engineering time. Write the integration test first. Add unit tests only when the logic is complex enough to warrant isolating it.
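The “runtime validation at boundaries” row deserves a quick sketch. In practice you would likely reach for Zod; the hand-rolled version below shows the same idea without the library — parse untrusted input once at the edge, then trust the types everywhere inside. The `Signup` shape is hypothetical:

```typescript
type Signup = { email: string; age: number };

// Validate untrusted input at the boundary; throw on anything malformed.
function parseSignup(input: unknown): Signup {
  if (typeof input !== "object" || input === null) {
    throw new Error("signup payload must be an object");
  }
  const o = input as { email?: unknown; age?: unknown };
  if (typeof o.email !== "string" || !o.email.includes("@")) {
    throw new Error("email must be a valid address");
  }
  if (typeof o.age !== "number" || o.age < 0) {
    throw new Error("age must be a non-negative number");
  }
  return { email: o.email, age: o.age };
}

const ok = parseSignup({ email: "ada@example.com", age: 36 });
```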
The goal of testing isn’t a green badge. It’s the confidence to deploy on a Friday afternoon without anxiously checking your error tracker every five minutes. Measure confidence, not coverage, and the right testing strategy follows naturally.