Skip to main content

Testflight Lab · Quality Strategy

Testing is the guardrail that lets us ship fast across the Dev stack, the productivity lab, and the AI lab. This guide outlines the layers and tooling we use.

1. Strategy Overview

  • Shift left: contract tests + component stories before API wiring.
  • Synthetic monitoring: Playwright cron jobs hit prod smoke paths hourly.
  • Red teaming: AI features get adversarial prompts + guardrail evals.

2. Layered Stack

LayerToolingNotes
UnitVitest / Jest + Testing LibraryMocks kept in __fixtures__; snapshots minimized.
IntegrationSupertest + MSWAPI + DB interactions with seed data per suite.
UI/E2EPlaywright + CypressPlaywright for core flows, Cypress for visual diff in design system.
StoriesStorybook Interaction TestsEach component story asserts accessibility + state logic.
API ContractsPactflow + OpenAPI testsPR fails if breaking change detected.
AI EvalLangSmith + custom scriptsTrack toxicity, hallucination, guardrail coverage.

3. Recipe: Adding a Productivity Feature

  1. Write MDX spec + stub Storybook story.
  2. Add Vitest unit tests for hooks + utilities.
  3. Add Playwright scenario hitting new workflow.
  4. Register synthetic monitor (Checkly) once live.
  5. Document runbooks + expected SLOs.

4. Reporting + Dashboards

  • Coverage + flake rate inside Buildkite Insights.
  • Playwright video artifacts auto-uploaded to S3 + Slack thread.
  • AI eval scores tracked per agent (precision/recall on guardrail dataset).

5. Incident Feedback Loop

  • Failures create GitHub issues via Actions.
  • Weekly “Testflight Retro” reviews flaky tests, identifies systemic issues.
  • Document lessons learned inside /runbooks/testing/<date>.mdx.
Adopt these layers wholesale or remix them; the key is to keep quality visible and automated.