Every E2E test suite starts with good intentions. Five tests, clean code, runs in 30 seconds. Then it grows. At 20 tests, someone adds a waitForTimeout(5000) because a test was flaky. At 50 tests, the suite takes 8 minutes and fails randomly twice a week. At 100 tests, the team stops trusting the suite and starts skipping it entirely. I’ve watched this cycle play out multiple times, and the root cause is always structural — not the tool. The difference between an E2E suite that collapses at 50 tests and one that hums along at 500 is a handful of patterns applied from the start. None of them are advanced. All of them are essential.
“Simplicity is prerequisite for reliability.” — Edsger W. Dijkstra

Why E2E Suites Become Unmaintainable

Before looking at solutions, it helps to understand the three patterns that kill E2E suites. If any of these sound familiar, you know where to start fixing.
| Anti-Pattern | What It Looks Like | Why It’s Fatal |
|---|---|---|
| No abstraction | Every test contains raw selectors, navigation steps, and inline assertions | UI changes force updates in 40 tests instead of one page object. Engineers dread touching the suite. |
| Implicit waits and timeouts | page.waitForTimeout(3000) scattered throughout tests | Sometimes 3 seconds isn’t enough; sometimes it’s wasteful. It’s always fragile and hides real timing issues. |
| Shared mutable state | Test B depends on data that test A created; test order matters | When test A fails or runs in a different order, test B fails too. Cascading failures make debugging impossible. |
waitForTimeout is a prayer, not a strategy. Replace every static timeout with an explicit wait for a condition — waitForSelector, waitForLoadState, or expect(locator).toBeVisible(). Your tests become faster and more reliable.
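To make the replacement concrete, here is the fragile version next to the condition-based one (the button name is illustrative):

```typescript
// Fragile: always burns 3 seconds, and still fails when the app is slower than that.
await page.waitForTimeout(3000);
await page.getByRole('button', { name: 'Send Payment' }).click();

// Reliable: waits exactly as long as the condition takes, no longer,
// and fails with a clear timeout message if the condition never holds.
await expect(page.getByRole('button', { name: 'Send Payment' })).toBeEnabled();
await page.getByRole('button', { name: 'Send Payment' }).click();
```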

The Page Object Pattern

The page object model is the single most important pattern for maintainable E2E tests. Every page or major UI section gets a class that encapsulates its selectors and interactions. Tests never touch raw selectors directly. Think of it like an API for your UI. Tests describe what the user does (“create a payment”), not how the UI is structured (“fill the input with id amount-field, click the button with class submit-btn”). When the UI changes, you update the page object once. Every test that uses it stays unchanged.
```typescript
import { type Locator, type Page } from '@playwright/test';

export class PaymentPage {
  readonly amount: Locator;
  readonly recipient: Locator;
  readonly submit: Locator;
  readonly success: Locator;

  constructor(page: Page) {
    // Locators are resolved lazily, so creating them up front is safe.
    // Assigning in the constructor (rather than field initializers that
    // reference a parameter property) avoids class-field ordering pitfalls.
    this.amount = page.getByLabel('Amount');
    this.recipient = page.getByLabel('Recipient');
    this.submit = page.getByRole('button', { name: 'Send Payment' });
    this.success = page.getByTestId('payment-success');
  }

  async createPayment(amount: number, recipient: string) {
    await this.amount.fill(String(amount));
    await this.recipient.selectOption({ label: recipient });
    await this.submit.click();
  }
}
```
Now your tests read like user stories: “go to the payment page, create a payment for $250, expect success.” The selector details are hidden behind a clean interface. When the UI restructures — the amount input becomes a custom component, the button label changes — you update one file, not fifty.
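A test written against this page object might look like the following sketch (the import path and the /payments route are assumptions):

```typescript
import { test, expect } from '@playwright/test';
import { PaymentPage } from './pages/payment-page'; // path is an assumption

test('user can send a $250 payment', async ({ page }) => {
  const payments = new PaymentPage(page);
  await page.goto('/payments'); // route is an assumption
  await payments.createPayment(250, 'Acme Corp');
  await expect(payments.success).toBeVisible();
});
```

Notice that the test body contains no selectors at all: if the success banner changes its test ID tomorrow, this test does not change.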
Use Playwright’s built-in locators (getByRole, getByLabel, getByTestId) instead of CSS selectors. They’re more resilient to DOM restructuring and align with how users actually find elements — by role and label, not by class name or element ID.

Test Organization Strategies

At scale, how you organize test files matters as much as how you write individual tests. Here’s a comparison of the most common approaches.
| Strategy | Structure | Best For | Trade-off |
|---|---|---|---|
| By feature | tests/payments/, tests/auth/, tests/settings/ | Product teams — each team owns their feature’s tests | Cross-feature journeys span multiple directories |
| By user role | tests/admin/, tests/member/, tests/guest/ | Permission-heavy apps — ensures role-specific flows are covered | Feature logic is spread across role directories |
| By priority | tests/critical/, tests/regression/, tests/smoke/ | CI optimization — run critical first, regression overnight | Features are split across priority tiers |
| Hybrid (recommended) | Feature directories with priority tags via test.describe | Most teams | Slightly more upfront organization |
The hybrid approach works best in practice: group tests by feature for discoverability, and use Playwright’s tagging to run subsets by priority in different CI stages. Engineers working on payments find all payment tests in one place. CI runs @critical tests on every PR and the full suite nightly.
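In recent Playwright versions (1.42 and later), tags can be attached with the tag option and selected in CI with --grep; the test name and tag below are illustrative:

```typescript
// tests/payments/send-payment.spec.ts
import { test, expect } from '@playwright/test';

test('user can send a payment', { tag: '@critical' }, async ({ page }) => {
  await page.goto('/payments');
  await expect(page.getByRole('heading', { name: 'Payments' })).toBeVisible();
});
```

In CI, npx playwright test --grep @critical runs only the tagged subset on every PR, while the nightly job runs the whole suite with no filter.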

Handling Auth and Flakiness

Authentication and flakiness are the two biggest sources of E2E pain. Handle both with deliberate patterns.

Authentication: log in once in a global setup script and save the browser storage state to a JSON file. Each test loads that saved state instead of logging in again, which avoids repeating the login flow in every test and saves significant time.
```typescript
// global-setup.ts: log in once, save state for the whole suite
import { chromium, type FullConfig } from '@playwright/test';

export default async function globalSetup(config: FullConfig) {
  const { baseURL } = config.projects[0].use;
  const browser = await chromium.launch();
  const page = await browser.newPage({ baseURL });
  await page.goto('/login');
  await page.getByLabel('Email').fill('test@example.com');
  await page.getByLabel('Password').fill(process.env.TEST_PASSWORD!);
  await page.getByRole('button', { name: 'Sign in' }).click();
  await page.waitForURL('/dashboard');
  await page.context().storageState({ path: 'test/.auth/user.json' });
  await browser.close();
}
```
Then tests simply create a browser context with storageState: 'test/.auth/user.json' and they’re instantly authenticated. Login happens once globally, not per test.

Flakiness: most flaky tests share one of two root causes: shared mutable state or insufficient waiting. Address both systematically.
| Flakiness Cause | Fix |
|---|---|
| Tests depend on data from other tests | Each test creates its own data via API seeding — never through the UI |
| Element not yet visible when clicked | Replace waitForTimeout with expect(locator).toBeVisible() before interaction |
| Network requests haven’t completed | Use waitForLoadState('networkidle') or wait for a specific response |
| Animation in progress | Disable animations in test config: reducedMotion: 'reduce' |
| Database state from a previous test | Use transaction rollback or API-based cleanup in afterEach |
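For the first fix in the table, seeding through the API might look like this sketch (the seeding endpoint and payload are hypothetical; substitute your backend’s test API):

```typescript
import { test, expect } from '@playwright/test';

// Hypothetical seeding endpoint: each test creates its own isolated data
// through the API instead of clicking through the UI or reusing another
// test's leftovers.
test.beforeEach(async ({ request }) => {
  const res = await request.post('/api/test/payments/seed', {
    data: { recipient: 'Acme Corp', balance: 1000 },
  });
  expect(res.ok()).toBeTruthy();
});

test('payment appears in history', async ({ page }) => {
  await page.goto('/payments/history');
  await expect(page.getByText('Acme Corp')).toBeVisible();
});
```

Because each test owns its data, the suite can run in any order and in parallel without cascading failures.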

The 80/20 Rule: Test Critical Paths

You don’t need E2E tests for everything. In fact, trying to E2E test everything is how suites become slow, flaky, and abandoned. Apply the 80/20 rule: test the 20% of user journeys that generate 80% of business value. Identify your critical paths — the flows where a failure would cost real money, lose real customers, or trigger real incidents. These typically include:
| Priority | User Journey | Why It’s Critical |
|---|---|---|
| P0 | Sign up / onboarding | Broken signup = zero new users |
| P0 | Core transaction (purchase, payment, booking) | Broken transactions = direct revenue loss |
| P1 | Authentication (login, logout, password reset) | Broken auth = no one can use the product |
| P1 | Key reporting / exports | Broken reports = eroded trust, compliance risk |
| P2 | Admin / team management | Important but lower frequency; integration tests usually suffice |
Give P0 and P1 journeys thorough E2E coverage: happy paths, the most critical error path, and any edge case that has caused a production incident before. Everything else — settings pages, profile updates, notification preferences — gets integration test coverage instead.
E2E tests are expensive to write, expensive to maintain, and slow to run. Spend that budget on the journeys that would cause the most damage if they broke. A broken checkout flow costs revenue. A broken avatar upload costs a support ticket. Test accordingly.
The patterns that make E2E suites scale aren’t clever tricks. They’re structural decisions: abstract with page objects, isolate test data, handle auth once, and focus coverage on what matters most. Build these foundations from the start, and your suite will still be maintainable at 500 tests.