Every E2E test suite starts with good intentions. Five tests, clean code, runs in 30 seconds. Then it grows. At 20 tests, someone adds a waitForTimeout(5000) because a test was flaky. At 50 tests, the suite takes 8 minutes and fails randomly twice a week. At 100 tests, the team stops trusting the suite and starts skipping it entirely. I’ve watched this cycle play out multiple times, and the root cause is always structural — not the tool.
The difference between an E2E suite that collapses at 50 tests and one that hums along at 500 is a handful of patterns applied from the start. None of them are advanced. All of them are essential.
> “Simplicity is prerequisite for reliability.” — Edsger W. Dijkstra
## Why E2E Suites Become Unmaintainable
Before looking at solutions, it helps to understand the three patterns that kill E2E suites. If any of these sound familiar, you know where to start fixing.
| Anti-Pattern | What It Looks Like | Why It’s Fatal |
|---|---|---|
| No abstraction | Every test contains raw selectors, navigation steps, and inline assertions | UI changes force updates in 40 tests instead of one page object. Engineers dread touching the suite. |
| Implicit waits and timeouts | page.waitForTimeout(3000) scattered throughout tests | Sometimes 3 seconds isn’t enough. Sometimes it’s wasteful. It’s always fragile and hides real timing issues. |
| Shared mutable state | Test B depends on data that test A created. Test order matters. | When test A fails or runs in a different order, test B fails too. Cascading failures make debugging impossible. |
waitForTimeout is a prayer, not a strategy. Replace every static timeout with an explicit wait for a condition — waitForSelector, waitForLoadState, or expect(locator).toBeVisible(). Your tests become faster and more reliable.
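As a sketch of that replacement (the route and test id here are hypothetical, not from the original suite):

```typescript
import { test, expect } from '@playwright/test';

test('payment confirmation appears', async ({ page }) => {
  await page.goto('/payments'); // hypothetical route

  // Before: a fixed pause — too short on slow CI, wasted time locally
  // await page.waitForTimeout(3000);

  // After: a web-first assertion that retries until the element is
  // visible or the test timeout is reached
  await expect(page.getByTestId('payment-success')).toBeVisible();
});
```

The assertion polls instead of sleeping, so the test proceeds the instant the condition holds and fails with a clear timeout message when it never does.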
## The Page Object Pattern
The page object model is the single most important pattern for maintainable E2E tests. Every page or major UI section gets a class that encapsulates its selectors and interactions. Tests never touch raw selectors directly.
Think of it like an API for your UI. Tests describe what the user does (“create a payment”), not how the UI is structured (“fill the input with id amount-field, click the button with class submit-btn”). When the UI changes, you update the page object once. Every test that uses it stays unchanged.
```typescript
import { type Locator, type Page } from '@playwright/test';

export class PaymentPage {
  readonly amount: Locator;
  readonly recipient: Locator;
  readonly submit: Locator;
  readonly success: Locator;

  // Assign locators in the constructor rather than in field initializers,
  // which would run before `page` is set under ES2022 class-field semantics
  constructor(private readonly page: Page) {
    this.amount = page.getByLabel('Amount');
    this.recipient = page.getByLabel('Recipient');
    this.submit = page.getByRole('button', { name: 'Send Payment' });
    this.success = page.getByTestId('payment-success');
  }

  async createPayment(amount: number, recipient: string) {
    await this.amount.fill(String(amount));
    await this.recipient.selectOption({ label: recipient });
    await this.submit.click();
  }
}
```
Now your tests read like user stories: “go to the payment page, create a payment for $250, expect success.” The selector details are hidden behind a clean interface. When the UI restructures — the amount input becomes a custom component, the button label changes — you update one file, not fifty.
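A test using this page object might look like the following sketch (the import path and `/payments` route are assumptions):

```typescript
import { test, expect } from '@playwright/test';
import { PaymentPage } from './pages/payment-page'; // hypothetical path

test('sends a payment', async ({ page }) => {
  const payments = new PaymentPage(page);
  await page.goto('/payments'); // hypothetical route

  // The test describes intent; selectors live in the page object
  await payments.createPayment(250, 'Acme Corp');
  await expect(payments.success).toBeVisible();
});
```

Note that the test never mentions a selector: if the submit button's label changes, only `PaymentPage` is touched.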
Use Playwright’s built-in locators (getByRole, getByLabel, getByTestId) instead of CSS selectors. They’re more resilient to DOM restructuring and align with how users actually find elements — by role and label, not by class name or element ID.
## Test Organization Strategies
At scale, how you organize test files matters as much as how you write individual tests. Here’s a comparison of the most common approaches.
| Strategy | Structure | Best For | Trade-off |
|---|---|---|---|
| By feature | tests/payments/, tests/auth/, tests/settings/ | Product teams — each team owns their feature’s tests | Cross-feature journeys span multiple directories |
| By user role | tests/admin/, tests/member/, tests/guest/ | Permission-heavy apps — ensures role-specific flows are covered | Feature logic is spread across role directories |
| By priority | tests/critical/, tests/regression/, tests/smoke/ | CI optimization — run critical first, regression overnight | Features are split across priority tiers |
| Hybrid (recommended) | Feature directories with priority tags via test.describe | Most teams | Slightly more upfront organization |
The hybrid approach works best in practice: group tests by feature for discoverability, and use Playwright’s tagging to run subsets by priority in different CI stages. Engineers working on payments find all payment tests in one place. CI runs @critical tests on every PR and the full suite nightly.
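One way to sketch the hybrid approach is title-based tags, which Playwright's `--grep` flag can filter on (the file path and test names are illustrative):

```typescript
import { test, expect } from '@playwright/test';

// tests/payments/send-payment.spec.ts — grouped by feature
test.describe('payments', () => {
  // Tag in the title; select with --grep in CI
  test('completes a payment @critical', async ({ page }) => {
    // ...core checkout flow and assertions
  });

  test('filters payment history @regression', async ({ page }) => {
    // ...lower-priority coverage, run nightly
  });
});
```

The PR pipeline then runs `npx playwright test --grep @critical`, while the nightly job runs the suite unfiltered.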
## Handling Auth and Flakiness
Authentication and flakiness are the two biggest sources of E2E pain. Handle both with deliberate patterns.
**Authentication:** Log in once in a global setup script and save the browser storage state to a JSON file. Each test loads that saved state instead of logging in again. This avoids repeating the login flow in every test and saves significant time.
```typescript
// global-setup.ts — log in once, save storage state for all tests
import { chromium, type FullConfig } from '@playwright/test';

export default async function globalSetup(config: FullConfig) {
  const { baseURL } = config.projects[0].use;
  const browser = await chromium.launch();
  const page = await browser.newPage({ baseURL });
  await page.goto('/login');
  await page.getByLabel('Email').fill('test@example.com');
  await page.getByLabel('Password').fill(process.env.TEST_PASSWORD!);
  await page.getByRole('button', { name: 'Sign in' }).click();
  await page.waitForURL('/dashboard');
  await page.context().storageState({ path: 'test/.auth/user.json' });
  await browser.close();
}
```
Then tests simply create a browser context with storageState: 'test/.auth/user.json' and they’re instantly authenticated. Login happens once globally, not per test.
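Wiring this up is a small config fragment; the `baseURL` below is a placeholder for your app's address:

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  globalSetup: './global-setup',
  use: {
    baseURL: 'http://localhost:3000', // hypothetical app URL
    // Every test's browser context starts pre-authenticated
    storageState: 'test/.auth/user.json',
  },
});
```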
**Flakiness:** Most flaky tests trace back to shared mutable state or insufficient waiting. Address both systematically.
| Flakiness Cause | Fix |
|---|---|
| Tests depend on data from other tests | Each test creates its own data via API seeding — never through the UI |
| Element not yet visible when clicked | Replace waitForTimeout with expect(locator).toBeVisible() before interaction |
| Network requests haven’t completed | Use waitForLoadState('networkidle') or wait for a specific response |
| Animation in progress | Disable animations in test config: reducedMotion: 'reduce' |
| Database state from a previous test | Use transaction rollback or API-based cleanup in afterEach |
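The first and last rows above can be sketched with Playwright's built-in `request` fixture; the seeding endpoint here is entirely hypothetical:

```typescript
import { test, expect } from '@playwright/test';

test('displays a seeded invoice', async ({ page, request }) => {
  // Seed via API, not the UI — fast, and isolated to this test
  const res = await request.post('/api/test/invoices', {
    data: { amount: 250, recipient: 'Acme Corp' }, // hypothetical endpoint
  });
  const { id } = await res.json();

  await page.goto(`/invoices/${id}`);
  await expect(page.getByText('$250')).toBeVisible();

  // Clean up so no state leaks into other tests
  await request.delete(`/api/test/invoices/${id}`);
});
```

Because every test owns its data end to end, run order stops mattering and tests can shard across workers safely.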
## The 80/20 Rule: Test Critical Paths
You don’t need E2E tests for everything. In fact, trying to E2E test everything is how suites become slow, flaky, and abandoned. Apply the 80/20 rule: test the 20% of user journeys that generate 80% of business value.
Identify your critical paths — the flows where a failure would cost real money, lose real customers, or trigger real incidents. These typically include:
| Priority | User Journey | Why It’s Critical |
|---|---|---|
| P0 | Sign up / onboarding | Broken signup = zero new users |
| P0 | Core transaction (purchase, payment, booking) | Broken transactions = direct revenue loss |
| P1 | Authentication (login, logout, password reset) | Broken auth = no one can use the product |
| P1 | Key reporting / exports | Broken reports = eroded trust, compliance risk |
| P2 | Admin / team management | Important but lower frequency; integration tests usually suffice |
Give P0 and P1 journeys thorough E2E coverage: happy paths, the most critical error path, and any edge case that has caused a production incident before. Everything else — settings pages, profile updates, notification preferences — gets integration test coverage instead.
E2E tests are expensive to write, expensive to maintain, and slow to run. Spend that budget on the journeys that would cause the most damage if they broke. A broken checkout flow costs revenue. A broken avatar upload costs a support ticket. Test accordingly.
The patterns that make E2E suites scale aren’t clever tricks. They’re structural decisions: abstract with page objects, isolate test data, handle auth once, and focus coverage on what matters most. Build these foundations from the start, and your suite will still be maintainable at 500 tests.