7 real-world failure stories from a QA engineer who ran Selenium E2E test automation into the ground — covering ChromeDriver version hell, Wait design chaos, xpath dependency, knowledge silos, and more. Learn the Selenium-specific patterns behind “tests that worked until they didn’t,” and the recovery strategies for each. Note: “technical debt” refers to the state where maintenance costs compound and grow like a snowball over time.
“It worked fine at first. Somehow, a year later, nobody could touch the code anymore.” — Selenium suite deterioration doesn’t happen suddenly. It creeps in, slowly.
📌 Who This Article Is For
- Engineers running Selenium E2E tests who notice things getting quietly unstable
- QA engineers thinking “why does this test randomly fail?” or “maintenance is falling behind”
- Test leads looking to recover or stabilize a struggling Selenium setup
- Engineers looking for evidence to inform a Selenium → Playwright migration decision
✅ What You Will Learn
- 7 Selenium-specific failure patterns and how each deterioration unfolds
- How to detect “not broken yet, but warning signs are there” before it’s too late
- Recovery strategies and prevention tactics for each pattern
👤 About the Author
Written by Yoshi, a QA engineer and test automation engineer with over 15 years of hands-on experience. The patterns here are drawn directly from the field: Yoshi has witnessed and recovered from Selenium suite breakdowns across multiple real projects, from initial setup through full collapse and recovery.
📖 How This Article Differs from Related Content
- Top 5 Test Automation Failures: Tool-agnostic strategy and design mistakes → Start here if you failed at deciding what to automate
- 7 Playwright Adoption Failures: Playwright-specific mistakes → Start here if you failed after migrating to Playwright
- This article: Selenium-specific patterns where a working suite slowly deteriorates over time
📌 Key Takeaways
- Selenium operational breakdown typically starts from three Selenium-specific landmines: ChromeDriver management, Wait design, and selector design
- Deterioration doesn’t happen suddenly — “things feel a bit flaky” and “maintenance can’t keep up” are the early warning signs. Catching them early is the key
- Selenium isn’t the problem. Scaling without an operational design is the root cause
“The first six months were smooth. But a year later, nobody touched the test code anymore.” — A scene I’ve watched repeat itself across multiple projects.
What these situations have in common is that the deterioration doesn’t hit all at once — it creeps in slowly. This article walks through 7 patterns of Selenium technical debt and the recovery strategy for each, based on direct experience.
🔍 First: Check Your Selenium Early Warning Signs
- CI fails once a week or more due to ChromeDriver version mismatch
- Both implicitly_wait and WebDriverWait coexist in the codebase without a design policy
- Large amounts of full-path xpath or class-dependent selectors exist
- The same login logic is copy-pasted in multiple places
- “You have to ask [person X] — no one else knows this test” situations exist
- Tests pass locally but fail in CI (headless) environments
- Versions are not pinned in requirements.txt
| Items checked | Status | What to do |
|---|---|---|
| 0–2 | ✅ Healthy | Keep checking regularly |
| 3–4 | ⚠️ Caution | Address matching items first |
| 5+ | 🚨 Danger | Start a recovery plan now |
- 7 Selenium Failure Patterns: Quick Reference
- ① What Is ChromeDriver Version Management Hell?
- ② What Is the implicitWait / explicitWait Mix That Creates Flaky Hell?
- ③ What Is the xpath Selector Dependency That Breaks on Every UI Refresh?
- ④ What Is the 1,000-Line Script That Grew With No Design?
- ⑤ What Is the "Key Person Leaves and No One Can Touch the Code" Situation?
- ⑥ What Is "Can't Run Headless — Can't Get Into CI"?
- ⑦ What Is the Selenium 4 Upgrade That Stopped All Tests Cold?
- When Is Keeping Selenium the Right Call?
- FAQ
7 Selenium Failure Patterns: Quick Reference
| # | Pattern | Root Cause | CI Impact | Fix Cost |
|---|---|---|---|---|
| ① | ChromeDriver version management hell | No binary management system | 🔴 High | 🟢 Low |
| ② | implicitWait / explicitWait mixed together | No unified Wait design policy | 🔴 High | 🟡 Medium |
| ③ | xpath selector dependency | Fragile selector design | 🔴 High | 🔴 High |
| ④ | 1,000-line script bloat | “Just make it work” development with no design | 🟡 Medium | 🔴 High |
| ⑤ | Key person leaves, code becomes untouchable | Knowledge silos while scaling | 🟡 Medium | 🔴 High |
| ⑥ | Can’t run headless — can’t get into CI | Local-only implementation design | 🔴 High | 🟡 Medium |
| ⑦ | Selenium 4 upgrade stops all tests cold | No dependency management or migration plan | 🔴 High | 🟡 Medium |
① What Is ChromeDriver Version Management Hell?
What happened
Monday morning: CI is down. The log says SessionNotCreatedException: This version of ChromeDriver only supports Chrome version XX. Chrome auto-updated over the weekend and the ChromeDriver version no longer matches.
“Just download the new ChromeDriver and swap it out.” Fine the first time. Then this happens every single week. Find the download page → check the version → replace the binary → re-run CI. Every Monday morning, eaten by this.
Early warning signs
- The ChromeDriver path is hardcoded in the test code
- No ChromeDriver version in requirements.txt
- CI environment Chrome version is managed as “roughly the latest”
- ChromeDriver download is a manual task
# ❌ Failure pattern: hardcoded ChromeDriver path
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
driver = webdriver.Chrome(service=Service("/usr/local/bin/chromedriver"))
# Breaks every time the version doesn't match
# ✅ Option 1: Selenium Manager (built in since Selenium 4.6)
# Automatic ChromeDriver resolution is a standard feature from Selenium 4.6 onward
# Note: CI/Docker environments may need extra setup due to OS dependencies
from selenium import webdriver
driver = webdriver.Chrome()
# No extra library needed (though environment-specific config may be required)
# ✅ Option 2: webdriver-manager (widely used in existing projects and CI)
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
driver = webdriver.Chrome(
service=Service(ChromeDriverManager().install())
)
# Still valid for existing setups and Selenium < 4.6
Typical reasons automatic driver resolution needs extra setup in CI/Docker:
- Chrome not installed: Minimal Docker images don't include Chrome at all
- PATH issues: Chrome/ChromeDriver path may not be set in container environments
- Permission restrictions: Binary acquisition may be blocked by filesystem permissions
For existing projects or CI environments, webdriver-manager remains widely used. Choose based on your team's setup. Pinning Chrome versions in Docker images adds further stability.
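Whichever option you choose, a small preflight check can turn the Monday-morning mismatch into a clear message before any test runs. The sketch below is one possible approach, not part of Selenium: the function names are hypothetical, and it assumes the version strings follow the usual four-part format printed by the browser and driver `--version` commands.

```python
import re


def major_version(version_output: str) -> int:
    """Extract the major version from a `--version` style string."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in: {version_output!r}")
    return int(match.group(1))


def check_versions(chrome_output: str, driver_output: str) -> str:
    """Return a short status message comparing the two major versions."""
    chrome = major_version(chrome_output)
    driver = major_version(driver_output)
    if chrome == driver:
        return f"OK: Chrome {chrome} matches ChromeDriver {driver}"
    return f"MISMATCH: Chrome {chrome} vs ChromeDriver {driver}"


# Sample strings (hypothetical); in CI they would come from the
# `--version` output of the browser and driver binaries.
print(check_versions("Google Chrome 121.0.6167.85",
                     "ChromeDriver 120.0.6099.109 (abc123)"))
```

Running a check like this as the first CI step makes the failure reason obvious without digging through a SessionNotCreatedException stack trace.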
② What Is the implicitWait / explicitWait Mix That Creates Flaky Hell?
※ Flaky tests are tests that pass sometimes and fail other times with no clear reason — they're unreliable and non-reproducible. When CI results vary from run to run, it becomes impossible to tell whether a failure is a real bug or just noise.
What happened
"Let's be thorough with waits." After reading the docs, I ended up with both implicitly_wait and WebDriverWait throughout the codebase. Result: tests that pass one day and fail the next — flaky tests with no reproducibility. "Did this failure mean a real bug?" became impossible to answer.
- implicitly_wait is a global wait applied to the entire driver. Once set, it affects the whole session
- WebDriverWait is an explicit wait targeting a specific element
- When both coexist, timeout behavior becomes hard to predict, wait behavior varies across tests, and debugging gets much harder. Selenium's official docs recommend avoiding the mix
- Some setups with both may appear stable in the short term, but the risk grows as test count increases. The recommended approach is to unify on explicitWait — this makes wait design clear and reduces flakiness
Note: Wait design is only one contributor to flakiness. Network latency, rendering differences, and shared test state are also major factors. Cleaning up Wait design eliminates one source — not all sources.
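To see why the mix is hard to reason about, it can help to model the timing. The sketch below is a simplified simulation, not Selenium code: it assumes each poll of an explicit wait performs one element lookup that blocks for the full implicit wait when the element is missing, then sleeps for the poll interval, then checks the deadline. The function name and the 0.5-second poll interval are assumptions for illustration.

```python
def simulated_timeout(explicit_timeout: float,
                      implicit_wait: float,
                      poll_interval: float = 0.5) -> float:
    """Seconds that pass before an explicit wait gives up on a missing element.

    Simplified model: each poll performs one element lookup, which blocks
    for the implicit wait, then sleeps for the poll interval, then checks
    the deadline.
    """
    elapsed = 0.0
    while True:
        elapsed += implicit_wait   # lookup blocks until the implicit wait expires
        elapsed += poll_interval   # pause before the next poll
        if elapsed > explicit_timeout:
            return elapsed


for implicit in (0, 3, 10):
    print(f"implicit={implicit}s, explicit=5s -> gives up after "
          f"{simulated_timeout(5, implicit)}s")
```

In this model, the same 5-second explicit wait gives up after 5.5, 7.0, or 10.5 seconds depending on the implicit setting. A "5-second timeout" that actually takes 10 seconds is exactly the kind of surprise that makes failures hard to diagnose.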
# ❌ Failure pattern: mixing implicitWait and explicitWait
driver.implicitly_wait(10) # global setting
# elsewhere in the codebase
wait = WebDriverWait(driver, 5)
element = wait.until(EC.presence_of_element_located((By.ID, "submit")))
# timeout interference → unpredictable flakiness
# ✅ Recovery: unify on explicitWait
# Do NOT use implicitly_wait
# Use WebDriverWait for all waiting
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
wait = WebDriverWait(driver, 10)
element = wait.until(
EC.element_to_be_clickable((By.ID, "submit"))
)
element.click()
If implicitly_wait exists in the codebase, start by finding and removing every instance. Use grep -r "implicitly_wait" ./tests to find them all.
③ What Is the xpath Selector Dependency That Breaks on Every UI Refresh?
What happened
"As long as we can target the element, it's fine." Full-path xpath accumulated throughout the codebase. Six months later, a frontend redesign changed the HTML structure — and 280 of 300 tests failed simultaneously. All the xpath paths were now invalid.
# ❌ Failure pattern: full-path xpath dependency
driver.find_element(
By.XPATH,
"/html/body/div[2]/main/section[1]/form/div[3]/button"
)
# Any minor change to HTML structure causes total collapse
# ⚠️ Relative xpath can also be fragile
driver.find_element(By.XPATH, "//div[@class='btn-wrapper']/button[1]")
# Breaks if the class name changes
# ✅ Most stable: CSS Selector with data-testid
driver.find_element(By.CSS_SELECTOR, "[data-testid='submit-btn']")
# data-testid is a test-only attribute → establish a rule never to change it
# ✅ Second best: ID or name attribute
driver.find_element(By.ID, "submit-btn")
driver.find_element(By.NAME, "submit")
Selector stability comparison
| Selector Type | UI Change Resilience | Recommendation |
|---|---|---|
| data-testid attribute | ✅ Strongest | Agree with dev team and make it a standard |
| ID attribute | ✅ Strong | Use whenever the element has an ID |
| name / aria-label attribute | △ Moderate | Use when available |
| Stable-attribute CSS/XPath e.g. //button[@type='submit'] | △〜○ Conditional | Acceptable when using non-volatile attributes like aria or type |
| Class-dependent CSS/XPath | ⚠️ Weak | Fragile to design changes — last resort only |
| Full-path XPath | ❌ Very low | Avoid as a rule (may be used in low-change admin screens or temporary contexts) |
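When auditing an existing suite, the categories in the table can be approximated with a few string checks. The sketch below is a rough heuristic, not a parser: the function name and category labels are hypothetical, and the rules only look for the attribute names mentioned in the table.

```python
def classify_xpath(xpath: str) -> str:
    """Roughly classify an XPath expression by how fragile it is likely to be."""
    expr = xpath.strip()
    if expr.startswith("/") and not expr.startswith("//"):
        return "full-path"          # absolute path from the document root
    if "@data-testid" in expr:
        return "data-testid"        # test-only attribute
    if "@class" in expr:
        return "class-dependent"    # tied to styling class names
    if any(attr in expr for attr in ("@id", "@name", "@aria-label", "@type")):
        return "stable-attribute"   # attributes that rarely change with design
    return "other"


samples = [
    "/html/body/div[2]/main/section[1]/form/div[3]/button",
    "//div[@class='btn-wrapper']/button[1]",
    "//button[@type='submit']",
]
for sample in samples:
    print(f"{classify_xpath(sample):16} {sample}")
```

Sorting the results so that "full-path" entries come first gives a reasonable order for replacement work.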
Agree with the dev team on data-testid attributes as a development standard. To find existing xpath, use grep -r "By.XPATH" ./tests and replace high-priority tests first.
④ What Is the 1,000-Line Script That Grew With No Design?
What happened
"Ship working tests first." Tests were added with that policy, until a single file exceeded 1,000 lines. Login logic was copy-pasted in 5 places — when the login screen's selector changed, all 5 needed updates. Missed patches started hiding bugs.
# ❌ Failure pattern: same logic scattered throughout the file
def test_purchase():
driver.find_element(By.ID, "email").send_keys("user@example.com")
driver.find_element(By.ID, "password").send_keys("pass")
driver.find_element(By.ID, "login-btn").click()
# ...purchase flow...
def test_profile():
driver.find_element(By.ID, "email").send_keys("user@example.com") # duplicate
driver.find_element(By.ID, "password").send_keys("pass") # duplicate
driver.find_element(By.ID, "login-btn").click() # duplicate
# ...profile flow...
# ✅ Recovery: consolidate with Page Object Model
class LoginPage:
def __init__(self, driver):
self.driver = driver
def login(self, email: str, password: str):
self.driver.find_element(By.ID, "email").send_keys(email)
self.driver.find_element(By.ID, "password").send_keys(password)
self.driver.find_element(By.ID, "login-btn").click()
# Test code only expresses intent
def test_purchase(driver):
LoginPage(driver).login("user@example.com", "pass")
# ...purchase flow only...
Signals that an operation is worth moving into a Page Object:
- UI change frequency: Higher-change screens have more value to consolidate
- Test reusability: Operations called from multiple test scenarios are priority Page Object candidates
- Domain boundaries: Separating by function (Login / Product Search / Cart) makes maintenance much cleaner
Fixing at 1,000 lines is a major undertaking. At 300 lines, it's still manageable.
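One way to spot consolidation candidates before a file reaches that size is to count how often the same locator appears. The sketch below is a minimal example under stated assumptions: the function name is hypothetical, and the regular expression only recognizes locators written in the `By.X, "value"` form used in this article.

```python
import re
from collections import Counter

LOCATOR_PATTERN = re.compile(r"By\.([A-Z_]+),\s*[\"']([^\"']+)[\"']")


def repeated_locators(source: str, min_count: int = 2) -> dict[tuple[str, str], int]:
    """Count (strategy, value) locator pairs that occur at least min_count times."""
    counts = Counter(LOCATOR_PATTERN.findall(source))
    return {locator: n for locator, n in counts.items() if n >= min_count}


sample = '''
def test_purchase():
    driver.find_element(By.ID, "email").send_keys("user@example.com")
    driver.find_element(By.ID, "login-btn").click()

def test_profile():
    driver.find_element(By.ID, "email").send_keys("user@example.com")
    driver.find_element(By.ID, "login-btn").click()
    driver.find_element(By.ID, "avatar").click()
'''

for (strategy, value), count in repeated_locators(sample).items():
    print(f"By.{strategy} {value!r} appears {count} times")
```

Locators that show up repeatedly across test functions, such as the login fields here, are natural first candidates for a Page Object method.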
⑤ What Is the "Key Person Leaves and No One Can Touch the Code" Situation?
What happened
"Only person A understands these Selenium tests." Person A resigned. The month after, tests started failing with nobody able to fix them. No comments in the code. No environment setup documentation. "Tests fail, nobody fixes them" became the culture — and the automation became hollow.
Knowledge silo symptom checklist
※ Rather than just Yes/No, ask "what proportion of the team can handle this?" — one person being the only one who can do it is the essence of a knowledge silo.
| Checklist Item | Your Team? |
|---|---|
| README alone is enough to complete environment setup | ✅ / ❌ |
| Team members other than the main person can add and fix tests | ✅ / ❌ |
| Test code goes through code review | ✅ / ❌ |
| Multiple people can investigate when a test fails | ✅ / ❌ |
| "You have to ask [X] — no one else knows" situations don't exist | ✅ / ❌ |
⑥ What Is "Can't Run Headless — Can't Get Into CI"?
What happened
"Works locally" — but when pushed to the CI server (headless environment), errors everywhere. Investigation revealed test code scattered with operations that depended on browser window size and rendering behavior. Specifically, these problems were intertwined:
- Viewport size difference: Local is 1920×1080, headless default is often 800×600 — elements can fall outside the viewport and become unclickable
- Lazy rendering: Elements that only render after scrolling (lazy-loaded) remain unrendered in headless
- Sticky / fixed UI: Fixed headers or sidebars overlapping elements block clicks
- Responsive breakpoints: Narrower viewport triggers layout changes that hide or rearrange expected elements
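Several of these differences shrink when local and CI runs share the same viewport. The sketch below builds the Chrome arguments in one place so every run gets the same window size. The function name is hypothetical, and switching headless mode on via a `CI` environment variable is an assumption; adjust it to however your pipeline identifies itself.

```python
import os


def chrome_arguments(width: int = 1920, height: int = 1080) -> list[str]:
    """Build Chrome command-line arguments with a fixed window size.

    Headless mode is switched on when the CI environment variable is set
    (an assumption; adjust to however your pipeline identifies itself).
    """
    args = [f"--window-size={width},{height}"]
    if os.environ.get("CI"):
        args.append("--headless")
    return args


print(chrome_arguments())
```

Each returned string would then be passed to `options.add_argument(...)` on a `webdriver.ChromeOptions()` instance.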
# ❌ Failure pattern: implementation that breaks in headless
# Scroll operation based on assumed screen size
driver.execute_script("window.scrollTo(0, 500)")
driver.find_element(By.ID, "submit").click()
# The element may not be at y=500 in headless
# ❌ Clicking a viewport-external element (fails in headless)
element = driver.find_element(By.ID, "footer-btn")
element.click() # Error if off-screen
# ✅ Recovery option 1: scroll element into view, then interact
element = driver.find_element(By.ID, "submit")
driver.execute_script("arguments[0].scrollIntoView(true);", element)
element.click()
# ✅ Recovery option 2: set window size explicitly
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("--headless")
options.add_argument("--window-size=1920,1080") # fix size in CI
driver = webdriver.Chrome(options=options)
In CI, set --window-size=1920,1080 as a permanent option and standardize scrollIntoView-based operations. Note: if a fixed/sticky header overlaps the element after scrolling, scrollIntoView alone can still trigger ElementClickInterceptedException. In that case, use JavaScript arguments[0].click() or add an offset for the header height.
⑦ What Is the Selenium 4 Upgrade That Stopped All Tests Cold?
What happened
"There's a security patch — let's upgrade Selenium." After pip install selenium --upgrade took us from Selenium 3 to 4, the next morning's CI had over 150 of 200 tests failing simultaneously. Multiple deprecated APIs in Selenium 4, combined with no migration plan, caused the disaster.
Main API changes from Selenium 3 → 4
| Selenium 3 Syntax | Change in Selenium 4 |
|---|---|
| driver.find_element_by_id("x") | Unified to driver.find_element(By.ID, "x") |
| driver.find_element_by_xpath("//x") | driver.find_element(By.XPATH, "//x") |
| webdriver.Chrome(executable_path="...") | Recommended via Service object |
| DesiredCapabilities | Merged into the Options class |
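The find_element_by_* renames in the table are mechanical enough to script. The sketch below is one possible rewrite helper, not an official migration tool: the function name is hypothetical, and it only handles call sites where the legacy method name is immediately followed by an opening parenthesis on the same line.

```python
import re

# Suffix of the legacy helper -> attribute name on Selenium's By class.
BY_NAMES = {
    "id": "ID",
    "name": "NAME",
    "xpath": "XPATH",
    "class_name": "CLASS_NAME",
    "css_selector": "CSS_SELECTOR",
    "tag_name": "TAG_NAME",
    "link_text": "LINK_TEXT",
    "partial_link_text": "PARTIAL_LINK_TEXT",
}

LEGACY_CALL = re.compile(r"\bfind_(element|elements)_by_([a-z_]+)\(")


def migrate_line(line: str) -> str:
    """Rewrite find_element(s)_by_* calls to the find_element(s)(By.X, ...) form."""
    def replace(match: re.Match) -> str:
        by_name = BY_NAMES.get(match.group(2))
        if by_name is None:
            return match.group(0)  # unknown suffix: leave untouched
        return f"find_{match.group(1)}(By.{by_name}, "
    return LEGACY_CALL.sub(replace, line)


print(migrate_line('driver.find_element_by_id("submit")'))
print(migrate_line('driver.find_elements_by_class_name("btn")'))
```

Rewritten files also need `from selenium.webdriver.common.by import By` at the top. Review the diff on a dedicated branch rather than trusting the rewrite blindly.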
Not pinning versions in requirements.txt was the direct cause of this disaster.
# ❌ Failure pattern: still using Selenium 3 deprecated APIs
driver.find_element_by_id("submit") # deprecated in Selenium 4.0, removed in 4.3
driver.find_element_by_xpath("//button") # deprecated in Selenium 4.0, removed in 4.3
driver.find_element_by_class_name("btn") # deprecated in Selenium 4.0, removed in 4.3
# ✅ Selenium 4-compatible syntax (use By.XXX)
from selenium.webdriver.common.by import By
driver.find_element(By.ID, "submit")
driver.find_element(By.XPATH, "//button")
driver.find_element(By.CLASS_NAME, "btn")
Use grep -r "find_element_by_" ./tests to find all legacy API usage. Many can be bulk-replaced with sed. Going forward, pin versions in requirements.txt and upgrade via a dedicated branch with CI verification.
When Is Keeping Selenium the Right Call?
This article has focused on operational problems, but migrating to Playwright is not the right answer for every team. The following cases are where staying with Selenium is the more realistic choice.
Cases where Selenium continuation makes sense
- Large-scale Selenium Grid infrastructure: Mature parallel execution setup where migration cost would be enormous
- Java or C# automation stack: Team culture where a shift to Python or TypeScript isn't realistic
- IE or legacy browser testing required: Playwright does not support IE
- Thousands of stable, running Selenium tests: The risk of breaking working tests outweighs migration benefits
Two key decision axes
- Team skill: If the team has no TypeScript/JS culture, Playwright adoption just transplants the knowledge silo problem
- CI maturity: Migrating tools before CI is properly set up leads to underestimated costs and stalled migrations
FAQ
Q. Is Selenium being replaced by Playwright?
Selenium is not obsolete. Active development on Selenium 4.x continued through 2024–2025, with new features like Selenium Grid improvements and Selenium Manager being added. Selenium demand in the job market also remains high. That said, the number of new projects adopting Playwright is increasing. The practical approach is to use each tool where it fits best; "one will replace the other" is not the right framing. Evaluate based on your team's tech stack, existing assets, and project requirements.
Q. What should I do first to prevent Selenium maintenance debt?
In priority order: ① Pin versions in requirements.txt (done today, zero risk), ② Automate ChromeDriver management with Selenium 4.6+ Selenium Manager or webdriver-manager (done in a day), ③ Search for and remove implicitly_wait usage (1–2 days). Just these three significantly reduce debt risk.
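For step ①, a quick way to see what is still unpinned is to scan requirements.txt for lines without an exact `==` version. The sketch below is a minimal check, assuming a plain requirements file. The function name is hypothetical and the package versions in the sample are illustrative only.

```python
def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):        # skip blanks and pip options
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Illustrative sample; the versions shown are not a recommendation.
sample = """
selenium==4.21.0
pytest>=7
webdriver-manager   # no version at all
"""
print(unpinned_requirements(sample))
```

A check like this can run in CI and fail the build when the list is not empty, so an accidental major upgrade cannot slip in unnoticed.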
Q. Should I rewrite a deteriorated Selenium test suite from scratch?
Rewriting everything at once is high risk. A realistic approach is to start with the top 20 most frequently-run tests and work from there. "Improve without breaking working tests" has lower business risk than a full rewrite.
Q. When should I consider migrating from Selenium to Playwright?
First: many ChromeDriver management issues are already resolved by Selenium Manager in Selenium 4.6+. That problem alone is not sufficient migration justification. The core decision should weigh the cost of resolving current technical debt in Selenium against the cost of migrating to Playwright. Issues like "flakiness has destroyed CI trust" and "every new test adds maintenance load" can sometimes be addressed within Selenium by improving test design. Migrating without fixing the underlying design means the same problems reappear in Playwright.
📖 Related Articles
Selenium Implementation
- Selenium + Python + pytest Setup | Auto-managing ChromeDriver with webdriver-manager
- Selenium × pytest Practical Guide | fixture, parametrize, conftest.py, mark
Design and Migration Decisions
- Page Object Model | The Design Pattern That Prevents Maintenance Cost Explosion
- 7 Playwright Adoption Failures | Pitfalls of Migrating from Selenium
- Top 5 Test Automation Failures | Strategy and Design-Level Mistakes
- 7 Test Cases You Should Not Automate
🚀 Start Here: Recovery Priority TOP 3
Begin with actions you can take the same day you read this.
| Priority | Action |
|---|---|
| 1st | Pin versions in requirements.txt. Do today · High impact · Zero risk |
| 2nd | Find all implicitly_wait usage with grep: grep -r "implicitly_wait" ./tests. List every instance and fix in 1–2 days |
| 3rd | Find all full-path xpath usage with grep: grep -r "By.XPATH" ./tests. Audit impact and migrate the highest-risk ones to data-testid first |
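If you prefer a single report over separate grep runs, the searches can be combined in a short script. The sketch below is one possible version: the function names and pattern labels are hypothetical, and the full-path check simply looks for string literals that begin with /html/, which is narrower than grepping for By.XPATH.

```python
import re
from pathlib import Path

PATTERNS = {
    "implicitly_wait": re.compile(r"implicitly_wait\("),
    "find_element_by_": re.compile(r"find_elements?_by_"),
    "full_path_xpath": re.compile(r"[\"']/html/"),
}


def scan_source(source: str) -> dict[str, list[int]]:
    """Map each warning sign to the 1-based line numbers where it appears."""
    hits: dict[str, list[int]] = {name: [] for name in PATTERNS}
    for number, line in enumerate(source.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(number)
    return hits


def scan_tests(root: str) -> dict[str, dict[str, list[int]]]:
    """Run scan_source over every .py file under root; keep files with hits."""
    report = {}
    for path in sorted(Path(root).rglob("*.py")):
        hits = scan_source(path.read_text(encoding="utf-8"))
        if any(hits.values()):
            report[str(path)] = hits
    return report
```

Calling `scan_tests("./tests")` returns a per-file summary that can be pasted into a ticket or tracked over time as the numbers go down.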
Selenium is a proven tool with over 20 years of production history. Every pattern in this article reflects not a problem with Selenium, but the result of scaling without an operational design. Start checking the warning signs today. Address them one by one. A deteriorated setup can always be recovered.
📋 Summary
- ChromeDriver version mismatches are solvable with Selenium 4.6+ Selenium Manager or webdriver-manager — choose based on your environment
- Mixing implicitWait and explicitWait creates unpredictable wait behavior. Unify on explicitWait as the standard
- Ban full-path xpath now. Work with the frontend team to standardize data-testid; use stable-attribute CSS/XPath as a fallback
- Page Object signals: frequency of duplication, UI change rate, and domain boundaries — not just "appears twice"
- Resolve knowledge silos with README documentation and code review applied to test code
- Headless differences (viewport, lazy render, sticky UI) can be verified locally. Pin window size, use scrollIntoView, and know when JS click is needed
- Major version upgrades need a dedicated branch and full CI verification before merging — always pin versions
- There are valid reasons to stay with Selenium. Decide based on current pain points, not trends
