Happy path testing verifies the default flow of a feature when nothing goes wrong. It does NOT verify what happens when something goes wrong, which is where bugs actually live. Three other coverage classes exist alongside it: sad path (anticipated failures like auth rejection or a 500 from the payment processor), edge case (boundary inputs like an empty cart or a max-length string in a text field), and corner case (multiple low-probability conditions hitting simultaneously, such as an abandoned cart resumed after a price change in a non-USD locale).
Most engineering teams shipping without a dedicated QA function have at least some happy path coverage, often written by a coding agent given the prompt "write a Playwright test for the checkout flow." Those tests pass. They will keep passing. They will not catch bugs in the three other classes above, and they will not alert you until a user does. The reason is structural: a Playwright test only covers what the author (human or agent) thought to write, and the author is almost always thinking about the happy path.
We built Autonoma so that the coverage is derived from the codebase instead. Autonoma runs Playwright under the hood; the difference is that no one on your team writes or maintains the Playwright code. The intended audience is specific: small engineering teams of three to twelve engineers, with no QA hire, using Cursor or Claude Code to ship fast, and currently learning about production bugs from support tickets rather than from a CI failure. If that is not your situation, this article still works as a taxonomy reference. If it is, the rest will feel familiar.
What "the happy path" actually means
The term originates from usability testing literature in the late 1990s. Thomas Allmer used it in the context of cognitive walkthroughs: the sequence of actions a user takes when every step succeeds as designed. Some teams call it the "golden path." Some use-case theorists call it the "main success scenario." The three terms describe the same thing: the single uninterrupted sequence from feature entry to feature success.
It became the default coverage floor for practical reasons. Writing a test for the happy path is fast, stable (the application was designed to pass this sequence), and gives immediate confidence that the core flow works. A test suite with 100% happy-path coverage looks complete on paper. It is not.

The problem is that "the happy path" is singular. There is one happy path per feature. A checkout flow with five fields has one happy path and dozens of ways to fail: a required field left blank, a card number with the wrong format, a product that went out of stock between page load and checkout, a session that expired, a network timeout during the payment call. Happy path coverage exercises the one success sequence and none of those failures. The rest ship.
"Golden path" is worth a note. Some organizations use it interchangeably with "happy path." Others use it to mean the opinionated recommended path through a multi-step onboarding flow. For this article, treat them as synonyms.
Happy path, sad path, edge case, corner case: the taxonomy
These four terms are used loosely across the industry. Many search results treat "edge case" and "corner case" as synonyms. They are not. Here is the crisp distinction.
Happy path: the single default flow where every step succeeds. User enters valid credentials, valid card, valid address. System responds as designed at each step. No branching.
Sad path: anticipated failure flows. The inputs are wrong in expected ways: auth fails, validation rejects a field, the network returns a 500, the item is out of stock. These are flows the application explicitly handles (there is error-handling code for them). The bugs here are usually in that error-handling code: wrong error message, incorrect redirect, state not reset correctly after failure.
Edge case: boundary inputs that stress the system rather than break it in an anticipated way. Off-by-one quantities, empty lists, max-length strings, zero-value totals, the first and last element of a paginated set. The application may not have explicit handling for these inputs; they may pass through general-purpose code and produce wrong output silently.
Corner case: multiple low-probability conditions hitting simultaneously. A user with an expired card in a non-USD locale on a product that went on sale while their cart was open. No single condition is unusual; the combination is. Corner cases are hard to enumerate in advance because they are combinatorial. They are the category most reliably caught by exploratory testing and most reliably missed by scripted happy-path suites.
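To make the combinatorial point concrete, here is a minimal sketch that enumerates every combination of a few condition dimensions. The dimension names and values are hypothetical, chosen only to mirror the example above; a real checkout has many more.

```typescript
// Hypothetical condition dimensions for a checkout flow (illustrative values only).
type Dimension = { name: string; values: string[] };

const dimensions: Dimension[] = [
  { name: "card", values: ["valid", "expired"] },
  { name: "locale", values: ["en-US", "es-ES", "ar-BH"] },
  { name: "cartPrice", values: ["unchanged", "changed while cart was open"] },
];

// Build the Cartesian product of all dimension values.
function enumerateCombinations(dims: Dimension[]): Array<Record<string, string>> {
  let combos: Array<Record<string, string>> = [{}];
  for (const dim of dims) {
    const next: Array<Record<string, string>> = [];
    for (const partial of combos) {
      for (const value of dim.values) {
        next.push({ ...partial, [dim.name]: value });
      }
    }
    combos = next;
  }
  return combos;
}

const allCombinations = enumerateCombinations(dimensions);
console.log(allCombinations.length); // 2 * 3 * 2 = 12
```

Three small dimensions already yield twelve combinations, and only the first one (valid card, default locale, unchanged price) is the happy path. Each added dimension multiplies the total, which is why listing corner cases by hand stops scaling quickly.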
The shift-left testing principle, which a parallel article in this series covers for small engineering teams, applies across all four classes, not just the happy path. Testing sad paths and edge cases earlier in the cycle is cheaper than finding them in production.
For the edge case taxonomy specifically, the sibling article on edge case testing covers how to find edge cases without manually listing every input permutation. For concrete corner case examples in real web applications, the corner case catalogue article in this series provides a browsable reference.
A happy path Playwright test for a checkout flow
Every other happy path article on the internet uses a login/password example. This one does not. A checkout flow is a better demonstration because it has more states, more failure surfaces, and more direct revenue consequence.
The test below is the kind of happy path coverage a competent Playwright user writes by hand, or that a coding agent emits when prompted for "a Playwright test for checkout." It uses page.getByRole and page.fill, a real test card number, and a real assertion on the order confirmation page. Playwright is doing exactly what Playwright is designed to do. The question this article is building toward is not whether the framework can execute the test, but whether the person (or agent) authoring the test is thinking about the right failure surfaces.
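A minimal sketch of such a test follows. The routes, selectors, button labels, and confirmation heading are all hypothetical, and the card number is assumed to be Stripe's documented test card. To keep the sketch self-contained it is written against a small locally declared subset of Playwright's `Page` interface rather than importing `@playwright/test`.

```typescript
// Structural subset of Playwright's Page and Locator used below, declared
// locally so the sketch type-checks on its own. A real `page` satisfies it.
interface LocatorLike {
  click(): Promise<void>;
  isVisible(): Promise<boolean>;
}
interface PageLike {
  goto(url: string): Promise<unknown>;
  fill(selector: string, value: string): Promise<void>;
  getByRole(role: "button" | "heading", options: { name: string }): LocatorLike;
}

// Card expiring in December of the current year, formatted as MM/YY.
const currentYearExpiry = "12/" + String(new Date().getFullYear()).slice(-2);

// Happy path only: valid product, valid card, valid email, no branching.
// All routes and selectors are hypothetical placeholders.
async function happyPathCheckout(page: PageLike): Promise<boolean> {
  await page.goto("/products/blue-widget");
  await page.getByRole("button", { name: "Add to cart" }).click();

  await page.goto("/checkout");
  await page.fill("#email", "buyer@example.com");
  await page.fill("#card-number", "4242424242424242"); // assumed Stripe test card
  await page.fill("#card-expiry", currentYearExpiry);
  await page.fill("#card-cvc", "123");
  await page.getByRole("button", { name: "Place order" }).click();

  // The single assertion: the confirmation heading is shown.
  return page.getByRole("heading", { name: "Order confirmed" }).isVisible();
}
```

In a real suite the body would sit inside a Playwright `test(...)` callback that receives the `page` fixture, and the final check would typically be a web-first `expect` assertion on the heading locator instead of a returned boolean.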
This passes. It will pass for the next six months. It will pass through every UI redesign that does not touch the core checkout form structure. Stability is exactly what a well-formed Playwright test buys you, and stability is not the same property as coverage. The four bugs in the next section all reach production on top of a green build of exactly this suite.
Four bugs the happy path misses

Zero-quantity submission
The cart's decrement button lets a user reduce quantity from 1 to 0 without removing the item. The "Remove" button is separate. A user clicks decrement once more than intended: the item stays in the cart with quantity 0. The happy-path test starts with a product already added at quantity 1 and never touches the decrement button. The symptom in production: an order is created with a $0.00 line item. The payment processor charges $0.00. The fulfillment system sees a valid order. The product ships. The revenue does not arrive.
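One way to close this gap is a guard that refuses to create an order from a line with a non-positive quantity, paired with a test at the boundary. The sketch below is illustrative: the `CartLine` shape and the function name are assumptions, not part of any real checkout codebase.

```typescript
// Hypothetical cart line shape for illustration.
interface CartLine {
  sku: string;
  quantity: number;
  unitPriceCents: number;
}

// Returns a list of problems that should block order creation.
// An empty list means every line has a whole-number quantity of at least 1.
function findInvalidLines(lines: CartLine[]): string[] {
  const problems: string[] = [];
  for (const line of lines) {
    if (!Number.isInteger(line.quantity) || line.quantity < 1) {
      problems.push(
        `${line.sku}: quantity must be a whole number >= 1, got ${line.quantity}`
      );
    }
  }
  return problems;
}

// Boundary inputs: 0 is rejected, 1 is the smallest accepted quantity.
console.log(findInvalidLines([{ sku: "widget", quantity: 0, unitPriceCents: 5000 }]).length); // 1
console.log(findInvalidLines([{ sku: "widget", quantity: 1, unitPriceCents: 5000 }]).length); // 0
```

The useful part is the pair of inputs on either side of the boundary. A happy-path test only ever supplies the second one.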
Currency rounding on a three-decimal-place locale
The Bahraini dinar (BHD) uses three decimal places. Most checkout implementations format prices using a locale-aware formatter, but many hardcode two decimal places in the order total calculation. The happy-path test runs in USD. The symptom in production: a BHD customer is charged 10x the displayed total because the third decimal place is treated as part of the integer when the formatter is bypassed in the payment call. This is the kind of bug that does not appear in staging, does not appear in QA, and generates a support ticket within minutes of the first BHD transaction.
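The underlying mismatch can be sketched in a few lines. The example below assumes amounts travel as decimal strings and are converted to integer minor units before the payment call; the function names are hypothetical, and the exponent table is a small subset of the ISO 4217 minor-unit definitions. Which direction the factor-of-ten error goes depends on which side of the integration assumes two decimals.

```typescript
// ISO 4217 minor-unit exponents for a few currencies (subset for illustration).
const MINOR_UNIT_EXPONENT: Record<string, number> = { USD: 2, BHD: 3, JPY: 0 };

// Convert a decimal string such as "10.500" to integer minor units,
// using the exponent that belongs to the currency.
function toMinorUnits(amount: string, currency: string): number {
  const exponent = MINOR_UNIT_EXPONENT[currency];
  if (exponent === undefined) {
    throw new Error(`Unknown currency: ${currency}`);
  }
  const [whole, fraction = ""] = amount.split(".");
  if (fraction.length > exponent) {
    throw new Error(`Too many decimal places for ${currency}: ${amount}`);
  }
  const padded = fraction + "0".repeat(exponent - fraction.length);
  return parseInt(whole + padded, 10);
}

// The bug pattern: assuming every currency has two decimal places.
function toMinorUnitsHardcoded(amount: string): number {
  return Math.round(parseFloat(amount) * 100);
}

console.log(toMinorUnits("10.500", "BHD"));   // 10500 (fils)
console.log(toMinorUnitsHardcoded("10.500")); // 1050, off by a factor of 10
console.log(toMinorUnits("19.99", "USD"));    // 1999, where both approaches agree
```

In USD the two functions return the same value, which is why a suite that only runs in USD cannot tell them apart.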
Expired card with a far-future format
A card expiry field accepts year input as two digits. The validation regex is `\d{2}`. A developer deploys a fix for 2025's expiry validation and includes logic to reject cards where the parsed year is less than the current year. The parsing treats "30" as 2030. The fix ships. On January 1, 2030, every valid card expiring in 2030 is rejected as expired because "30" is now less than the current year. The happy-path test uses a card expiring in the current year. The bug is invisible until 2030, or until a test specifically checks the boundary condition at year rollover.
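A rollover check only becomes testable when the clock is an input instead of a hidden call to the system time. The sketch below is one possible shape, not the validator described above: it assumes two-digit years map to 2000-2099 and that a card stays valid through the last day of its expiry month in UTC. The function name is hypothetical.

```typescript
// Returns true when a card with the given expiry month (1-12) and two-digit
// year is expired at `now`.
// Assumptions: two-digit years map to 2000-2099, and a card remains valid
// through the last day of its expiry month (UTC).
function isCardExpired(month: number, twoDigitYear: number, now: Date): boolean {
  const fullYear = 2000 + twoDigitYear;
  // Date.UTC takes a zero-based month, so passing the one-based expiry month
  // yields the first instant of the following month. Month 12 rolls over
  // into January of the next year.
  const firstInvalidInstant = Date.UTC(fullYear, month, 1);
  return now.getTime() >= firstInvalidInstant;
}

// Boundary checks on either side of the 2029 -> 2030 rollover.
const lastSecondOf2029 = new Date(Date.UTC(2029, 11, 31, 23, 59, 59));
const firstInstantOf2030 = new Date(Date.UTC(2030, 0, 1));

console.log(isCardExpired(12, 29, lastSecondOf2029));   // false
console.log(isCardExpired(12, 29, firstInstantOf2030)); // true
console.log(isCardExpired(1, 30, firstInstantOf2030));  // false
```

Because `now` is a parameter, the rollover cases run today rather than waiting for the calendar to reach them.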
Abandoned cart resume after price change
A user adds a product at $50, abandons the cart, returns 24 hours later, and the product is now $40 due to a sale. The cart still displays $50 (the price at cart-add time is cached). The user sees $40 on the product page, $50 in the cart, proceeds through checkout, and is charged $50. The happy-path test does not pause between adding the product and checking out. The symptom in production: a chargeback filed against the higher price the customer was charged after seeing the lower price on the product page. This is not a payment processor bug or an auth bug. It is a state-management bug that the happy-path test is structurally incapable of exposing.
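A check that exposes this kind of staleness compares each cached cart price against the current catalog price at checkout time. The sketch below is a simplified illustration with hypothetical types and names; it only detects the mismatch and leaves the policy (reprice, warn, or block) to the caller.

```typescript
// Hypothetical shapes for illustration.
interface CachedCartLine {
  sku: string;
  cachedPriceCents: number; // price captured when the item was added
}
interface PriceMismatch {
  sku: string;
  cachedPriceCents: number;
  currentPriceCents: number;
}

// Compare cached cart prices with the current catalog and return every line
// whose price has changed since it was added to the cart.
function findStalePrices(
  cart: CachedCartLine[],
  currentPrices: Record<string, number>
): PriceMismatch[] {
  const mismatches: PriceMismatch[] = [];
  for (const line of cart) {
    const current = currentPrices[line.sku];
    if (current !== undefined && current !== line.cachedPriceCents) {
      mismatches.push({
        sku: line.sku,
        cachedPriceCents: line.cachedPriceCents,
        currentPriceCents: current,
      });
    }
  }
  return mismatches;
}

// The scenario above: added at $50, now on sale for $40.
const stale = findStalePrices(
  [{ sku: "widget", cachedPriceCents: 5000 }],
  { widget: 4000 }
);
console.log(stale.length); // 1
```

A test for this has to change the catalog price between the add-to-cart step and the checkout step, which is exactly the pause a single-pass happy-path test never introduces.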
Your coding agent only writes happy paths
This is the unique failure mode of the current moment. Engineers using Cursor, Claude Code, or any other coding agent to generate test coverage are systematically receiving happy-path Playwright suites. The reason is not a model failure and it is not a prompt quality issue. It is structural. Any test plan derived from a prompt inherits the prompt author's mental model, and the prompt author is thinking about the feature working.

The prompt a developer typically gives is this: "Write a Playwright test for the checkout flow."
The agent generates a test for the checkout flow. The singular, default one. It navigates to the product page, adds to cart, fills the form with valid data, submits, and asserts the confirmation page renders. The output is representative of what agents produce for this prompt: one test, one flow, valid data throughout.
Notice what is absent. There is no assertion for what happens when quantity is 0. There is no locale parameter. There is no card-expiry boundary test. There is no state-pause simulating an abandoned cart. The agent wrote a test for "the checkout flow" as prompted. The prompt did not say "write tests for all the ways checkout can fail." The agent did not internally enumerate the branching paths because the prompt did not request enumeration.
The coding agent pattern-matched the prompt: "Write a Playwright test for the checkout flow." It wrote a test for one checkout flow. It was correct given the prompt. A better prompt, such as "write tests for the checkout flow including sad path, boundary, and concurrent-state cases," would produce a marginally broader suite. It would still miss the bugs the prompt author did not think to ask for. The structural ceiling is not prompt quality. It is that someone has to hold the failure surface in their head and translate it into Playwright code. The fix is to derive the test plan from the codebase, where the failure surface is already encoded as branches, validators, and state machines.
The result is a test suite that passes on every green build, gives the team confidence, and completely omits the four bug classes above. When "we don't have any QA" and the coverage is entirely coding-agent-generated, this is the structural outcome. The confidence the suite provides is real. The coverage is not.
When is happy-path-only testing OK? A decision framework
Not every flow requires beyond-happy-path coverage. The answer depends on the blast radius if a non-happy-path bug reaches production.
| Flow type | Happy-path-only OK? | Happy-path-only negligent? | Examples |
|---|---|---|---|
| Internal admin tool (used by 5 or fewer people) | Yes | No | Internal feature flag dashboard, admin user table |
| Early-stage MVP feature flag | Yes | No | New feature visible to 1% of users |
| Payments and checkout | No | Yes | Stripe checkout, subscription upgrade, refund flow |
| Auth and signup | No | Yes | Signup form, password reset, OAuth callback |
| Data import | No | Yes | CSV upload, bulk API ingestion |
| Search and filter | Depends | Depends | Catalog search (OK if internal-only; not OK if revenue-critical) |
The practical rule: any flow that touches money, session state, or user-generated data needs beyond-happy-path coverage. Anything behind a feature flag or limited to a small internal audience can ship with happy-path coverage while the feature matures.
Corner cases: what your customers actually file as bugs
We talk to a lot of small engineering teams. The phrase "we don't have any QA" shows up in nearly every call. So does "we hear about it real quick," which means a user emails or messages support within minutes of a bug shipping. The bugs those messages report are almost never happy-path failures. They are corner cases.
Two examples that come up repeatedly.
File upload with special characters. A customer uploads a file named résumé-final(2)%.pdf. The happy-path test uses test.pdf. The filename is passed directly to an S3 key without URL encoding. The percent sign is interpreted as a URL escape sequence. The key is malformed. The upload returns a 500. The user sees a generic error. The team spends two hours reproducing it before someone tries a filename with a special character.
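A minimal sketch of the missing step, assuming the upload handler builds the object key and request URL by hand rather than letting an SDK encode it. The function name and key layout are hypothetical.

```typescript
// Build an object key from a user-supplied filename.
// Assumption: the key is later embedded in a URL by hand, so each
// user-controlled segment is percent-encoded first.
function toSafeObjectKey(userId: string, filename: string): string {
  return `uploads/${encodeURIComponent(userId)}/${encodeURIComponent(filename)}`;
}

const uploadedName = "résumé-final(2)%.pdf";
const objectKey = toSafeObjectKey("user-123", uploadedName);

// The literal "%" is encoded as "%25", so decoding the key recovers the
// original filename instead of failing on a malformed escape sequence.
console.log(objectKey.indexOf("%25.pdf") !== -1);                             // true
console.log(decodeURIComponent(objectKey) === "uploads/user-123/" + uploadedName); // true
```

Passing the raw filename to `decodeURIComponent` throws a `URIError`, because `%.p` is not a valid escape. That makes this filename a cheap test fixture to keep next to `test.pdf`.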
i18n translation gap. The team ships a Spanish locale. One string in the checkout confirmation email was added after the translation pass. The happy-path test runs in the default locale. The symptom in production: a Spanish-speaking customer files a support ticket reading "the button is in English on the Spanish version." The translation gap is real and the fix is a one-liner, but it shipped because the test suite had no locale coverage.
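A gap like this can be caught without a browser by comparing the key sets of the locale files. The sketch below assumes translations live in flat key-to-string maps; the message keys and the function name are hypothetical.

```typescript
// Assumption: each locale is a flat map from message key to translated string.
type Messages = Record<string, string>;

// Return every key present in the base locale but absent from the translation.
function findMissingKeys(base: Messages, translated: Messages): string[] {
  return Object.keys(base)
    .filter((key) => !Object.prototype.hasOwnProperty.call(translated, key))
    .sort();
}

// Hypothetical message catalogs: the second key was added after the
// translation pass and never reached the Spanish file.
const en: Messages = {
  "checkout.confirm": "Confirm order",
  "email.confirmation.cta": "View your order",
};
const es: Messages = {
  "checkout.confirm": "Confirmar pedido",
};

console.log(findMissingKeys(en, es)); // [ 'email.confirmation.cta' ]
```

Run as a CI check, a non-empty result fails the build before the untranslated string ships.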
Both are "catch bugs before they reach production" scenarios. Both are trivially testable once you know to look for them. Neither shows up in a happy-path suite generated by a coding agent.
The sibling edge case testing article covers finding edge cases systematically. The corner case catalogue article provides a browsable list organized by feature category.
How Autonoma covers happy path AND corner case testing
The structural problem with happy-path-only coverage is that the person writing the tests (or prompting the agent to write them) is thinking about the feature working, not the feature failing. Autonoma changes the input to the test generation process.
Connect your codebase to Autonoma. The Planner agent reads your routes, components, and user flows, not a spec document and not a prompt. For the checkout example: it does not just plan "add to cart, fill form, submit, assert confirmation." It reads the cart component and sees the quantity decrement handler with no lower bound. It reads the price calculation function and sees the locale-conditional formatter. It reads the session management code and sees the cart-expiry check. It plans tests for those branches because the code exposes them, not because a human thought to ask.
The Automator runs the planned tests against a managed preview environment provisioned per PR via our autonomous testing platform. The four-stage pipeline (Plan, Generate, Run, Heal) means the coverage is generated, executed, and maintained without anyone writing a test file. Autonoma uses Playwright under the hood for execution, so you keep the framework, the locator strategy, and the traces your team already understands. What changes is the authoring layer: nobody on your team writes or maintains the Playwright code, and the test plan is derived from the routes, components, and state machines your codebase already exposes. The Maintainer agent keeps tests passing as the code changes, using AI self-healing test automation to recover from selector changes and UI redesigns.
An early-stage YC startup we work with had the exact file-upload corner case described above. They had a four-engineer team shipping a marketplace. Their test suite was coding-agent-generated and passed on every PR. The file-upload bug shipped. Autonoma caught it on the first run after connecting the codebase because the Planner read the S3 upload handler and saw the unencoded filename path.
The honest qualifier. Autonoma generates tests by exploring the running app. We can only find what your app actually exposes in a test environment. If you have a Bahraini-dinar code path that the test environment never enters because no test product is priced in BHD, the Planner will not find it. Autonoma complements a thoughtful product spec; it does not replace one. The complement is what makes "no QA team" viable. Not the absence of thought about coverage, but the absence of manual test writing.
If your happy-path test suite passes, a Sentry exception in production is how you find out it was happy-path-only. Sentry is the post-production safety net; it is not pre-deploy coverage. Autonoma does not replace Sentry. We generate the tests that catch the bugs before they reach Sentry.
FAQ
What is happy path testing?
Happy path testing verifies the default flow of a feature when nothing goes wrong. It confirms that the system works as designed when all inputs are valid and all dependencies respond correctly. Happy path testing is the coverage floor, not the ceiling. Sad path, edge case, and corner case testing are the layers that catch production bugs.
What is the opposite of the happy path?
The opposite is the sad path: the anticipated failure flows where inputs are wrong in expected ways, such as a failed authentication, a rejected card, or a 500 from a downstream service. Beyond the sad path, edge cases (boundary inputs) and corner cases (multiple simultaneous low-probability conditions) represent additional failure surfaces that the sad path does not capture.
Is happy path testing enough?
Only if the blast radius of a non-happy-path failure is acceptable. For internal admin tools used by a small team, happy-path-only coverage is often fine. For any flow touching payments, auth, user-generated data, or session state, happy-path-only testing is negligent. The decision framework table above maps flow types (payments, auth, internal admin tools, MVP feature flags) to the appropriate coverage standard.
Why do agent-written Playwright tests only cover the happy path?
Because any prompt-driven test plan inherits the prompt author's mental model, and the author is thinking about the happy path. Playwright is a framework. It executes whatever the author wrote. A coding agent given "write a Playwright test for the checkout flow" writes one happy-path test, which is correct for the prompt it received. A better prompt produces a marginally broader suite but still depends on the author enumerating failure modes by hand. The fix is not a better agent or a better prompt; it is a different input to the test generation process, one that derives coverage from the codebase, where the failure surface is already encoded, rather than from a human-written prompt.
When is happy-path-only testing acceptable?
For early-stage feature flags visible to a small percentage of users, and for internal tools used by a handful of people, happy-path-only testing is often an acceptable tradeoff. The rule of thumb: if a non-happy-path failure would reach a paying customer or touch financial data, happy-path-only is not sufficient. Payments, auth, signup, data import, and search on revenue-critical surfaces require beyond-happy-path coverage.




