-
Notifications
You must be signed in to change notification settings - Fork 15.8k
fix: Enable Playwright migration with graceful Selenium fallback #35063
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Enable Playwright migration with graceful Selenium fallback #35063
Conversation
Based on your review schedule, I'll hold off on reviewing this PR until it's marked as ready for review. If you'd like me to take a look now, comment
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review by Korbit AI
Korbit automatically attempts to detect when you fix issues in new commits.
Category | Issue | Status |
---|---|---|
Incomplete Playwright Availability Check ▹ view | ✅ Fix detected | |
Redundant Import Checks ▹ view | ✅ Fix detected | |
Driver Factory Method Violation ▹ view | ✅ Fix detected |
Files scanned
File Path | Reviewed |
---|---|
superset/utils/screenshots.py | ✅ |
superset/utils/webdriver.py | ✅ |
Explore our documentation to understand the languages and file types we support and the files we ignore.
Check out our docs on how you can make Korbit work best for you and your team.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #35063 +/- ##
===========================================
+ Coverage 60.48% 71.87% +11.39%
===========================================
Files 1931 587 -1344
Lines 76236 43498 -32738
Branches 8568 4703 -3865
===========================================
- Hits 46114 31266 -14848
+ Misses 28017 11003 -17014
+ Partials 2105 1229 -876
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Wondering if we could commit to Playwright-only for 7.0 already, and fully deprecate Selenium already (?) Would simplify things quite a bit and make everyone's life better moving forward. |
I asked Claude Code, here is what it said:
Goal: Enable by default to gather production feedback Phase 2: 7.0 Release (Breaking Change)Remove feature flag entirely, always use Playwright: # In superset/utils/screenshots.py
def driver(self, window_size: WindowSize | None = None) -> WebDriver:
window_size = window_size or self.window_size
if _PLAYWRIGHT_AVAILABLE:
return WebDriverPlaywright(self.driver_type, window_size)
# Graceful degradation with clear error message
raise ScreenshotImageNotAvailableException(
"Playwright is required for screenshots in Superset 7.0+. "
f"Install with: pip install 'apache-superset[playwright]' && playwright install chromium. "
f"{_PLAYWRIGHT_INSTALL_MESSAGE}"
) Phase 3: Cleanup (Post-7.0)
Breaking Changes for 7.0Dependencies❌ Breaking: Configuration❌ Breaking: Remove Documentation Updates Required
Benefits Analysis1. Functional Benefits
2. Technical Benefits
3. Development Benefits
Risk AssessmentLow Risk Factors ✅
Medium Risk Factors
|
7564cb4
to
214dc68
Compare
…91312] Fixes "name 'sync_playwright' is not defined" error when PLAYWRIGHT_REPORTS_AND_THUMBNAILS feature flag is enabled without Playwright installed. Changes: - Add module-level PLAYWRIGHT_AVAILABLE check to prevent NameError - Implement graceful fallback to Selenium when Playwright unavailable - Add clear migration messages for WebGL/DuckGL support from Cypress - Make feature flag control preference rather than hard requirement - Add validate_webdriver_config() for migration status checking Testing: - Add 20+ unit tests covering all migration scenarios - Test graceful fallback, error handling, and configuration validation - Ensure no breaking changes to existing Selenium functionality This unblocks the year-long effort to migrate from Cypress to Playwright for proper WebGL/DuckGL chart screenshot support. Shortcut: https://app.shortcut.com/preset/story/91312/
- Fixed comment to clarify that Cypress (not Selenium) doesn't work with DeckGL - Updated installation message to accurately reflect the migration from Cypress - Maintained correct description of WebGL/Canvas screenshot support that Playwright enables
- Add comprehensive Playwright availability check that verifies browser binaries - Implement module-level caching to avoid repeated import checks - Add single-log pattern to prevent log spam on repeated operations - Update copy from "WebGL/DuckGL" to "WebGL/Canvas charts" and fix DeckGL references - Add comprehensive test coverage for availability checking and caching behavior - Fix test parameter ordering and mock targets for proper validation The enhanced availability check now actually launches a browser to ensure Playwright is fully functional, not just imported. Module-level caching optimizes performance by avoiding repeated checks on every driver() call.
Changed logging statements to use % formatting instead of f-strings to comply with ruff G004 security best practices.
- Use PLAYWRIGHT_AVAILABLE directly instead of cached value to respect test patches - Use config.get() with defaults for SCREENSHOT_LOCATE_WAIT and SCREENSHOT_LOAD_WAIT - Fix test parameter order to match decorator order (bottom to top) - Update fallback logging expectations to expect single log (avoids log spam) - Change WebGL/Canvas to WebGL/DuckGL in install message for consistency - Fix logger.info test to check formatted parameter correctly
…ed variables - Replace all references to _PLAYWRIGHT_AVAILABLE with PLAYWRIGHT_AVAILABLE - Replace all references to _PLAYWRIGHT_INSTALL_MESSAGE with PLAYWRIGHT_INSTALL_MESSAGE - Fix WebGL/Canvas to WebGL/DuckGL consistency in test assertions
…ation Based on test engineer review, removed ~60% of test code while maintaining 100% functionality coverage: - Removed TestDriverMethodThreadSafety - tests implementation details not actual behavior - Removed TestScreenshotModuleLevelCaching - tests internal caching implementation - Removed TestScreenshotDriverPerformance - contrived performance tests - Deleted test_playwright_migration_integration.py - contrived scenarios - Simplified test_playwright_migration_working.py to essential tests only - Removed duplicate tests from webdriver_test.py Remaining tests focus on core behavior: - Driver selection logic (Playwright when available, Selenium fallback) - Fallback logging behavior - Window size handling - Configuration validation
7458aa8
to
75f60f5
Compare
…gration-fallback-sc-91312
@rusackas do you think we can we make this decision by lazy consensus? Make the call right here in this PR and start ripping Selenium out? |
superset/utils/screenshots.py
Outdated
"not render correctly). %s", | ||
PLAYWRIGHT_INSTALL_MESSAGE, | ||
) | ||
_PLAYWRIGHT_FALLBACK_LOGGED = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a Flask app (multiple threads) or a Celery worker (multiple processes), this may not behave as we would expect:
Each process will have its own memory space, so the log suppression will only work per process, not globally.
In threaded mode, modifying the global without a lock could lead to race conditions (unlikely to matter much for a boolean flag, but still possible).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right about the concurrency issues. Added thread safety with a lock to prevent race conditions in threaded Flask mode. The per-process behavior in Celery is actually fine (one log per worker), but the threading needed fixing.
try: | ||
# Try to actually launch a browser to ensure binaries are installed | ||
with sync_playwright() as p: | ||
browser = p.chromium.launch(headless=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there may be a lighter way to do this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it to check if the browser binary exists first, only launches if that fails
superset/utils/webdriver.py
Outdated
self._screenshot_locate_wait = app.config["SCREENSHOT_LOCATE_WAIT"] | ||
self._screenshot_load_wait = app.config["SCREENSHOT_LOAD_WAIT"] | ||
self._screenshot_locate_wait = app.config.get("SCREENSHOT_LOCATE_WAIT", 10) | ||
self._screenshot_load_wait = app.config.get("SCREENSHOT_LOAD_WAIT", 60) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We used to have a rule to be strict with the config, all keys have defaults configured so we expect app.config["SOME_KEY"]
instead of app.config.get
unsure if we still have this "rule"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. yeah, Thanks. Good catch! Since the defaults are configured in config.py, switched to strict app.config["KEY"].
I would think lazy consenus is fine. If anyone on dev@ replies with "bUt I nEeDz it!" then we can elevate it to a SIP. SIP-178 doesn't explicitly mention selenium at all, so the consensus is a good idea. SIP-98 goes does have mentions on the thread that we should remove Selenium in a future version. If we consider it a breaking change, that raises Semver/merge/release timing questions we should also consider. |
…gration-fallback-sc-91312
…ization - Add thread safety to _PLAYWRIGHT_FALLBACK_LOGGED with threading.Lock() - Optimize playwright availability check with lightweight executable_path check first - Remove unnecessary caching since function is only called at module import - Use strict app.config[] access instead of .get() for keys with defaults - Update tests to properly validate optimizations and thread-safe global access
…re test The test was failing because it only mocked chromium.launch() to fail, but check_playwright_availability() first tries chromium.executable_path. When that succeeds, the function returns True without testing the launch fallback. This fix mocks both executable_path and launch to raise exceptions, ensuring the function properly tests the complete fallback scenario and returns False when browser launch fails. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
superset/utils/screenshots.py
Outdated
"not render correctly). %s", | ||
PLAYWRIGHT_INSTALL_MESSAGE, | ||
) | ||
_PLAYWRIGHT_FALLBACK_LOGGED = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think that the overhead of the lock, global variable, and complexity isn't worth it for preventing some log lines. Also celery can be configured to use forks (most frequent) or threads/eventlet etc, on the fork case the thread would not work, that is the most common scenario. Let's just log it, most log systems can do deduplication anyway
…t fallback Simplifies the fallback logging by removing the global flag, threading lock, and deduplication logic. The added complexity wasn't worth it for preventing log lines, especially with Celery's fork mode and modern log systems that handle deduplication anyway. Changes: - Remove _PLAYWRIGHT_FALLBACK_LOGGED global variable - Remove _fallback_lock threading synchronization - Simplify fallback logging to just log directly - Update tests to match simplified behavior 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked this in the daily PR sync, and it looks like since this is a fallback and not a removal, we can consider this non-breaking. LGTM, merging, on with the show!!!
Fixes "name 'sync_playwright' is not defined" error when PLAYWRIGHT_REPORTS_AND_THUMBNAILS feature flag is enabled without Playwright installed.
Changes:
Testing:
This unblocks the effort to migrate from Cypress to Playwright for proper WebGL/DuckGL chart screenshot support.
SUMMARY
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
ADDITIONAL INFORMATION