Integration Tests Are the Only Tests That Matter

I have a controversial testing opinion: most unit tests are a waste of time. Not all of them: for pure functions, complex algorithms, and data transformations, absolutely write unit tests. But for the vast majority of application code (API handlers, React components, database queries), unit tests test the wrong thing. They test implementation details. They test mocks. They test that your code calls other code. They don’t test that your software works.

Integration tests do. An integration test for an API endpoint sends a real HTTP request, hits a real database, runs real middleware, and asserts on the real response. When that test passes, you know the endpoint works. When a unit test with five mocks passes, you know the handler called the right mock with the right arguments, which tells you almost nothing about production behaviour.

At Weel, 60% of our test suite is integration tests. Our bug escape rate dropped 3x when we made that shift. Here’s how we do it.

The Case Against Unit Test Obsession

Consider a typical API handler tested with unit tests:

// The handler
async function createPayment(req: Request, res: Response) {
  const validated = paymentSchema.parse(req.body);
  const user = await userService.findById(req.userId);
  const payment = await paymentService.create(validated, user);
  await notificationService.sendPaymentConfirmation(payment);
  res.json({ payment });
}

// The unit test — 15 lines of mocking for 3 lines of logic
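test('createPayment calls services correctly', async () => {
  const mockParse = jest.fn().mockReturnValue(validPayment);
  const mockFindUser = jest.fn().mockResolvedValue(mockUser);
  const mockCreate = jest.fn().mockResolvedValue(mockPayment);
  const mockNotify = jest.fn().mockResolvedValue(undefined);

  // ...wire up all the mocks...

  // Invoke the handler under test. `req` and `res` are stub objects assumed
  // to be built as part of the wiring elided above.
  await createPayment(req, res);
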
  expect(mockFindUser).toHaveBeenCalledWith('user-123');
  expect(mockCreate).toHaveBeenCalledWith(validPayment, mockUser);
  expect(mockNotify).toHaveBeenCalledWith(mockPayment);
});

This test verifies that the handler calls the right functions with the right arguments. It doesn’t verify that the payment is actually created in the database. It doesn’t verify that the validation actually rejects bad input. It doesn’t verify that the notification actually sends. If you refactor the handler to use a different internal structure, the test breaks, even if the API response is identical. Now compare the integration test:

test('POST /api/payments creates a payment', async () => {
  const user = await factories.user.create({ plan: 'business' });
  const recipient = await factories.recipient.create();

  const response = await request(app)
    .post('/api/payments')
    .set('Authorization', `Bearer ${generateToken(user)}`)
    .send({
      amount: 25000,
      currency: 'AUD',
      recipientId: recipient.id,
      description: 'Invoice #1234',
    });

  expect(response.status).toBe(200);
  expect(response.body.payment).toMatchObject({
    amount: 25000,
    currency: 'AUD',
    status: 'processing',
  });

  const dbPayment = await db.payment.findFirst({
    where: { userId: user.id },
  });
  expect(dbPayment).toBeTruthy();
  expect(dbPayment?.amount).toBe(25000);
});

This tests the real thing. Real HTTP, real auth middleware, real validation, real database write. When this passes, you know the API works. Refactor the internals all you want — as long as the API contract holds, the test stays green.
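To make the refactoring argument concrete, here is a minimal, self-contained sketch. Everything in it is hypothetical: the in-memory PaymentStore stands in for the database, and handlerV1 and handlerV2 stand in for the handler before and after an internal refactor. It only illustrates that a check on outcomes survives a refactor that a check on interactions does not.

```typescript
// Hypothetical in-memory stand-ins; not the real services or database.
type Payment = { id: number; amount: number; currency: string };

class PaymentStore {
  private rows: Payment[] = [];
  readonly calls: string[] = []; // records invoked methods, the way a mock would

  insert(amount: number, currency: string): Payment {
    this.calls.push('insert');
    return this.write(amount, currency);
  }

  insertBatch(items: Array<{ amount: number; currency: string }>): Payment[] {
    this.calls.push('insertBatch');
    return items.map((item) => this.write(item.amount, item.currency));
  }

  all(): Payment[] {
    return [...this.rows];
  }

  private write(amount: number, currency: string): Payment {
    const payment: Payment = { id: this.rows.length + 1, amount, currency };
    this.rows.push(payment);
    return payment;
  }
}

// Original handler: writes through insert().
function handlerV1(store: PaymentStore, amount: number, currency: string): { payment: Payment } {
  return { payment: store.insert(amount, currency) };
}

// Refactored handler: identical response, different internal call.
function handlerV2(store: PaymentStore, amount: number, currency: string): { payment: Payment } {
  const payment = store.insertBatch([{ amount, currency }])[0];
  if (!payment) throw new Error('insertBatch returned no rows');
  return { payment };
}

// Interaction-style check: passes only if one specific method was called.
function interactionCheck(store: PaymentStore): boolean {
  return store.calls.indexOf('insert') !== -1;
}

// Outcome-style check: passes if the response and the stored state are right.
function outcomeCheck(store: PaymentStore, response: { payment: Payment }): boolean {
  const stored = store.all();
  return response.payment.amount === 25000 && stored.length === 1 && stored[0]?.amount === 25000;
}
```

Running both handlers against a fresh store, the outcome check passes for each, while the interaction check passes only for handlerV1.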

Making Integration Tests Fast

The common objection: “integration tests are slow.” They don’t have to be. Here are the patterns that keep ours under 5 minutes for 400+ tests.

Test containers for database

We use Testcontainers to spin up a Postgres instance per test run. It starts in ~3 seconds, runs in a throwaway Docker container, and gets destroyed when tests finish.

// test/setup.ts
import { PostgreSqlContainer, StartedPostgreSqlContainer } from '@testcontainers/postgresql';

let container: StartedPostgreSqlContainer;

beforeAll(async () => {
  container = await new PostgreSqlContainer('postgres:16-alpine')
    .withDatabase('test')
    .start();

  process.env.DATABASE_URL = container.getConnectionUri();
  await runMigrations();
}, 30000);

afterAll(async () => {
  await container.stop();
});

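Start the test container once for the entire suite, not per test file. A cold Postgres container start takes ~3 seconds. If you start one per file across 50 test files, that’s 2.5 minutes of pure container overhead. Note that a beforeAll hook registered from Vitest’s setupFiles runs once for every test file, so to get a single container per run the startup code belongs in a globalSetup file instead.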
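The overhead figure is simple arithmetic, sketched below. The helper name is made up for illustration, and the calculation assumes container starts happen one after another. With parallel workers the wall-clock cost would be lower, though the total work is the same.

```typescript
// Total startup overhead in seconds for a given number of container starts.
function containerOverheadSeconds(starts: number, secondsPerStart: number): number {
  return starts * secondsPerStart;
}

const perFile = containerOverheadSeconds(50, 3); // one container per test file
const perSuite = containerOverheadSeconds(1, 3); // one container for the whole run

console.log(`per file: ${perFile / 60} min, per suite: ${perSuite} s`);
// prints "per file: 2.5 min, per suite: 3 s"
```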

Transaction rollback for isolation

Instead of truncating tables between tests (slow) or using separate databases (complex), we wrap each test in a transaction and roll it back:

// test/helpers.ts
import { db } from '@/lib/db';

beforeEach(async () => {
  await db.$executeRaw`BEGIN`;
});

afterEach(async () => {
  await db.$executeRaw`ROLLBACK`;
});

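Every test starts with a clean database state. The rollback is nearly instant compared to truncating and reseeding tables. At Weel, this pattern alone cut our test suite from 8 minutes to 3.5 minutes. One caveat: BEGIN and ROLLBACK only isolate a test if every query in that test, including the ones issued by the app under test, runs on the same database connection. With a pooled client, that typically means limiting the test pool to a single connection.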
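For readers who want to see the begin/rollback idea in isolation, here is a toy model. ToyDb and withRollback are hypothetical names, and the snapshot-based store only imitates the pattern. It says nothing about Postgres or Prisma semantics such as connection handling.

```typescript
// Toy in-memory store with snapshot-based transactions (illustration only).
class ToyDb {
  private rows: string[] = [];
  private snapshot: string[] | null = null;

  begin(): void {
    if (this.snapshot !== null) throw new Error('transaction already open');
    this.snapshot = [...this.rows]; // remember the state before the test
  }

  rollback(): void {
    if (this.snapshot === null) throw new Error('no open transaction');
    this.rows = this.snapshot; // discard everything written since begin()
    this.snapshot = null;
  }

  insert(row: string): void {
    this.rows.push(row);
  }

  count(): number {
    return this.rows.length;
  }
}

// Runs a test body inside a transaction and always rolls back, even if it throws.
function withRollback<T>(db: ToyDb, body: () => T): T {
  db.begin();
  try {
    return body();
  } finally {
    db.rollback();
  }
}

const toyDb = new ToyDb();
toyDb.insert('seed-row'); // state that exists before any test runs

const seenInsideTest = withRollback(toyDb, () => {
  toyDb.insert('test-row');
  return toyDb.count();
});

console.log(seenInsideTest, toyDb.count()); // prints "2 1"
```

The test body sees its own write, and the store is back to its seeded state afterwards. That is the same guarantee the beforeEach and afterEach hooks above aim for.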

Factory functions over fixtures

Static fixtures are brittle and hard to maintain. Factory functions create exactly the data each test needs:

// test/factories/user.ts
import { faker } from '@faker-js/faker';
import { db } from '@/lib/db';

export const userFactory = {
  async create(overrides: Partial<CreateUserInput> = {}) {
    return db.user.create({
      data: {
        email: faker.internet.email(),
        name: faker.person.fullName(),
        plan: 'starter',
        ...overrides,
      },
    });
  },

  async createWithOrg(overrides: Partial<CreateUserInput> = {}) {
    const org = await orgFactory.create();
    return this.create({ ...overrides, orgId: org.id });
  },
};

Factories compose. Need a user with an organization, a payment method, and three past payments? Chain the factories. The test reads like a story, not a data dump.
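As a rough illustration of that composition, the sketch below builds "a business user in an org with three past payments". It is an in-memory stand-in, not the factories shown above. The Org, User, and Payment shapes, the paymentFactory, and the shared id counter are all assumptions made so the example runs without a database.

```typescript
// Hypothetical in-memory factories; real ones would write through the database client.
type Org = { id: number; name: string };
type User = { id: number; email: string; plan: 'starter' | 'business'; orgId?: number };
type Payment = { id: number; userId: number; amount: number };

let nextId = 1; // simple sequence so generated rows never collide

const orgFactory = {
  create(overrides: Partial<Org> = {}): Org {
    const id = nextId++;
    return { id, name: `Org ${id}`, ...overrides };
  },
};

const userFactory = {
  create(overrides: Partial<User> = {}): User {
    const id = nextId++;
    return { id, email: `user-${id}@example.test`, plan: 'starter', ...overrides };
  },
  // Composes orgFactory: creates an org first, then a user that belongs to it.
  createWithOrg(overrides: Partial<User> = {}): User {
    const org = orgFactory.create();
    return userFactory.create({ ...overrides, orgId: org.id });
  },
};

const paymentFactory = {
  createMany(user: User, amounts: number[]): Payment[] {
    return amounts.map((amount) => ({ id: nextId++, userId: user.id, amount }));
  },
};

// The setup reads as a short story: who the user is, then what they have done.
const businessUser = userFactory.createWithOrg({ plan: 'business' });
const pastPayments = paymentFactory.createMany(businessUser, [1000, 2500, 400]);
```

Defaults cover everything a test does not care about, and overrides name only what it does, which keeps the relevant detail visible in the test itself.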

Parallel execution

Vitest runs test files in parallel by default. Because each test runs inside a transaction that is rolled back, its uncommitted writes are never visible to tests running in other workers:

// vitest.config.ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    pool: 'forks',
    poolOptions: {
      forks: {
        minForks: 2,
        maxForks: 4,
      },
    },
    setupFiles: ['./test/setup.ts'],
  },
});

Parallel tests sharing a database can cause flaky failures if tests depend on global state (sequences, unique constraints on generated data). Use factories with randomised data (faker) and transaction isolation to prevent this.

The Testing Diamond

The traditional testing pyramid says: lots of unit tests, fewer integration tests, few E2E tests. For modern full-stack TypeScript applications, I use a diamond shape instead:

Layer	Proportion	What It Tests
E2E (Playwright)	10%	Critical user journeys, 5-10 tests
Integration	60%	API endpoints, components with real data
Unit	20%	Pure functions, utilities, calculations
Static (TypeScript + ESLint)	10% of effort	Types, patterns, formatting

The diamond is widest at integration because that’s where the confidence-to-cost ratio peaks. Integration tests cover real behaviour without the brittleness and slowness of full E2E tests, and without the false confidence of mocked unit tests.

When Unit Tests DO Matter

I’m not a unit test nihilist. There are clear cases where unit tests are the right tool:

Complex business logic. A function that calculates tax across multiple jurisdictions with different rules, thresholds, and exemptions? Unit test every edge case. The logic is pure (input → output), and the combinatorial space is too large for integration tests alone.

describe('calculateGST', () => {
  it('applies standard 10% GST for domestic transactions', () => {
    expect(calculateGST({ amount: 10000, country: 'AU', type: 'standard' }))
      .toEqual({ net: 10000, gst: 1000, total: 11000 });
  });

  it('exempts GST for international transactions', () => {
    expect(calculateGST({ amount: 10000, country: 'US', type: 'standard' }))
      .toEqual({ net: 10000, gst: 0, total: 10000 });
  });

  it('charges no GST for GST-free categories such as education', () => {
    expect(calculateGST({ amount: 10000, country: 'AU', type: 'education' }))
      .toEqual({ net: 10000, gst: 0, total: 10000 });
  });
});

Data transformations. Functions that reshape data structures, parse formats, or transform between representations. Pure input/output, no side effects, lots of edge cases.

Algorithms. Sorting, searching, scheduling, rate limiting logic: anything where correctness depends on specific algorithmic steps.

The pattern: if it’s a pure function with complex logic, unit test it. If it involves I/O, side effects, or system integration, integration test it.
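Going back to the calculateGST tests above, the sketch below shows the kind of pure function they describe. The rules in it (a 10% domestic rate, a GST-free list containing only education, no GST outside AU) are inferred from those three tests and nothing else. Treat them as assumptions for illustration, not as a statement of actual tax rules or of the real implementation.

```typescript
type GstInput = { amount: number; country: string; type: string };
type GstResult = { net: number; gst: number; total: number };

const STANDARD_RATE = 0.1; // assumption: the 10% rate used in the tests
const GST_FREE_TYPES = ['education']; // assumption: only the category the tests mention

// Pure function: same input always gives the same output, no I/O involved.
function calculateGST({ amount, country, type }: GstInput): GstResult {
  const taxable = country === 'AU' && GST_FREE_TYPES.indexOf(type) === -1;
  // Amounts are assumed to be integers in minor units, so round to a whole unit.
  const gst = taxable ? Math.round(amount * STANDARD_RATE) : 0;
  return { net: amount, gst, total: amount + gst };
}
```

Because nothing here touches a database or the network, each new rule costs one more table-style test case rather than another end-to-end scenario.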

Testing Error Paths

The most valuable integration tests aren’t the happy paths — they’re the error paths. What happens when the database is down? When the payment provider returns a 500? When the request body is malformed?

test('returns 422 for invalid payment amount', async () => {
  const user = await factories.user.create();

  const response = await request(app)
    .post('/api/payments')
    .set('Authorization', `Bearer ${generateToken(user)}`)
    .send({ amount: -100, currency: 'AUD' });

  expect(response.status).toBe(422);
  expect(response.body.errors).toContainEqual(
    expect.objectContaining({
      field: 'amount',
      message: expect.stringContaining('positive'),
    })
  );
});

test('returns 503 when payment provider is unavailable', async () => {
  mockPaymentProvider.simulateOutage();
  const user = await factories.user.create();

  const response = await request(app)
    .post('/api/payments')
    .set('Authorization', `Bearer ${generateToken(user)}`)
    .send(validPaymentData);

  expect(response.status).toBe(503);
  expect(response.body.error).toBe('Payment service temporarily unavailable');
});

At Weel, we require every API endpoint to have at least one error path integration test. The pattern is: “what happens when [dependency] fails?” These tests have caught more production bugs than all our unit tests combined, because error handling code is the code that gets the least manual testing.
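The 503 test above relies on a provider fake that can be switched into a failing state. One possible shape for such a fake is sketched below. FakePaymentProvider, its reset method, ProviderUnavailableError, and handleCreatePayment are all hypothetical names, and the handler is reduced to a plain function so the example runs without an HTTP server.

```typescript
// Error the fake throws while an outage is being simulated.
class ProviderUnavailableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'ProviderUnavailableError';
  }
}

// Hypothetical fake provider with a switchable failure mode.
class FakePaymentProvider {
  private down = false;

  simulateOutage(): void {
    this.down = true;
  }

  reset(): void {
    this.down = false;
  }

  charge(amount: number): { providerRef: string } {
    if (this.down) throw new ProviderUnavailableError('provider unreachable');
    return { providerRef: `fake-${amount}` };
  }
}

type HandlerResult = { status: number; body: Record<string, unknown> };

// Stand-in for the route handler: translates a provider outage into a 503.
function handleCreatePayment(provider: FakePaymentProvider, amount: number): HandlerResult {
  try {
    const { providerRef } = provider.charge(amount);
    return { status: 200, body: { payment: { amount, providerRef, status: 'processing' } } };
  } catch (err) {
    if (err instanceof Error && err.name === 'ProviderUnavailableError') {
      return { status: 503, body: { error: 'Payment service temporarily unavailable' } };
    }
    throw err; // unknown failures should surface rather than be masked as a 503
  }
}
```

With a fake like this, calling reset in an afterEach hook keeps a simulated outage from leaking into whichever test runs next.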

The testing philosophy is simple: test the things users and systems interact with. HTTP endpoints. Rendered components. Published events. Database state. Everything else is an implementation detail that you should be free to refactor without breaking tests.
