Testflight Lab · Quality Strategy
Testing is the guardrail that lets us ship fast across the Dev stack, the productivity lab, and the AI lab. This guide outlines the layers and tooling we use.
1. Strategy Overview
- Shift left: contract tests + component stories before API wiring.
- Synthetic monitoring: Playwright cron jobs hit prod smoke paths hourly.
- Red teaming: AI features get adversarial prompts + guardrail evals.
2. Layered Stack
| Layer | Tooling | Notes |
|---|---|---|
| Unit | Vitest / Jest + Testing Library | Mocks kept in __fixtures__; snapshots minimized. |
| Integration | Supertest + MSW | API + DB interactions with seed data per suite. |
| UI/E2E | Playwright + Cypress | Playwright for core flows, Cypress for visual diffs in the design system. |
| Stories | Storybook Interaction Tests | Each component story asserts accessibility + state logic. |
| API Contracts | Pactflow + OpenAPI tests | PR fails if breaking change detected. |
| AI Eval | LangSmith + custom scripts | Track toxicity, hallucination, guardrail coverage. |
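To make "breaking change" in the API Contracts row concrete, here is a minimal sketch of the kind of check a contract gate performs. It is not Pactflow's or OpenAPI's actual API: the `Contract` type, the `findBreakingChanges` helper, and the sample fields are all hypothetical, and real contracts carry far more detail (nesting, optionality, status codes).

```typescript
// Hypothetical, simplified contract: response field name -> JSON type name.
type FieldType = "string" | "number" | "boolean" | "object" | "array";
type Contract = Record<string, FieldType>;

// Lists changes that would break an existing consumer.
// Removing a field or changing its type is breaking; adding a field is not.
function findBreakingChanges(previous: Contract, next: Contract): string[] {
  const problems: string[] = [];
  for (const field of Object.keys(previous)) {
    if (!(field in next)) {
      problems.push("removed field: " + field);
    } else if (next[field] !== previous[field]) {
      problems.push(
        "type changed: " + field + " (" + previous[field] + " -> " + next[field] + ")"
      );
    }
  }
  return problems;
}

// Illustrative data only: `total` changes type, `note` is a harmless addition.
const previousContract: Contract = { id: "string", total: "number" };
const nextContract: Contract = { id: "string", total: "string", note: "string" };
const breaking = findBreakingChanges(previousContract, nextContract);
```

A PR gate built on this idea would fail the build whenever the returned list is non-empty.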
3. Recipe: Adding a Productivity Feature
- Write MDX spec + stub Storybook story.
- Add Vitest unit tests for hooks + utilities.
- Add Playwright scenario hitting new workflow.
- Register synthetic monitor (Checkly) once live.
- Document runbooks + expected SLOs.
4. Reporting + Dashboards
- Coverage + flake rate inside Buildkite Insights.
- Playwright video artifacts auto-uploaded to S3 + Slack thread.
- AI eval scores tracked per agent (precision/recall on guardrail dataset).
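As a rough illustration of the per-agent precision/recall metric above, the sketch below scores one agent against a labeled guardrail dataset. The `GuardrailCase` shape and `scoreGuardrail` helper are assumptions made for this example, not the format LangSmith or our custom scripts actually use, and reporting 0 for an empty denominator is just one possible convention.

```typescript
// Hypothetical record: one labeled prompt from the guardrail dataset.
interface GuardrailCase {
  shouldBlock: boolean; // ground-truth label
  wasBlocked: boolean;  // what the agent's guardrail actually did
}

interface GuardrailScore {
  precision: number; // of everything blocked, how much deserved blocking
  recall: number;    // of everything that deserved blocking, how much was blocked
}

function scoreGuardrail(cases: GuardrailCase[]): GuardrailScore {
  let truePositives = 0;
  let falsePositives = 0;
  let falseNegatives = 0;
  for (const c of cases) {
    if (c.wasBlocked && c.shouldBlock) truePositives++;
    else if (c.wasBlocked && !c.shouldBlock) falsePositives++;
    else if (!c.wasBlocked && c.shouldBlock) falseNegatives++;
  }
  const predicted = truePositives + falsePositives;
  const actual = truePositives + falseNegatives;
  return {
    // Assumed convention: report 0 when there is nothing to divide by.
    precision: predicted === 0 ? 0 : truePositives / predicted,
    recall: actual === 0 ? 0 : truePositives / actual,
  };
}

// Illustrative data: one correct block, one miss, one false alarm, one correct pass.
const sampleScore = scoreGuardrail([
  { shouldBlock: true, wasBlocked: true },
  { shouldBlock: true, wasBlocked: false },
  { shouldBlock: false, wasBlocked: true },
  { shouldBlock: false, wasBlocked: false },
]);
```

Low recall means adversarial prompts are slipping through; low precision means the guardrail is blocking legitimate requests.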
5. Incident Feedback Loop
- Failures create GitHub issues via Actions.
- Weekly “Testflight Retro” reviews flaky tests and identifies systemic issues.
- Document lessons learned inside /runbooks/testing/<date>.mdx.
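One way to prepare the flaky-test list for the weekly retro is sketched below. It assumes a working definition that this guide does not itself state: a test is flaky if it both passed and failed on the same commit. The `TestRun` shape and `findFlakyTests` helper are hypothetical and do not reflect the Buildkite export format.

```typescript
// Hypothetical record of a single test execution in CI.
interface TestRun {
  testName: string;
  commit: string;
  passed: boolean;
}

// Returns the sorted names of tests that both passed and failed on one commit.
function findFlakyTests(runs: TestRun[]): string[] {
  // Per (commit, test) pair, remember whether a pass and a failure were seen.
  const seen: Record<string, { testName: string; passed: boolean; failed: boolean }> = {};
  for (const run of runs) {
    const key = run.commit + "::" + run.testName;
    const entry = seen[key] || { testName: run.testName, passed: false, failed: false };
    if (run.passed) entry.passed = true;
    else entry.failed = true;
    seen[key] = entry;
  }
  const flaky: string[] = [];
  for (const key of Object.keys(seen)) {
    const entry = seen[key];
    if (entry.passed && entry.failed && flaky.indexOf(entry.testName) === -1) {
      flaky.push(entry.testName);
    }
  }
  return flaky.sort();
}

// Illustrative data: `checkout` flips on the same commit, so it is flaky.
// `login` fails on one commit and passes on another, which may be a real fix.
const flakyNames = findFlakyTests([
  { testName: "checkout", commit: "abc", passed: true },
  { testName: "checkout", commit: "abc", passed: false },
  { testName: "login", commit: "abc", passed: false },
  { testName: "login", commit: "def", passed: true },
]);
```

Comparing results only within a commit keeps genuine regressions and fixes out of the flaky list, at the cost of missing flakes that never get retried.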
