test(e2e) + ci: harden UI e2e suite against the double-prefix regression#26047
Draft
yuneng-berri wants to merge 4 commits intolitellm_internal_stagingfrom
Draft
test(e2e) + ci: harden UI e2e suite against the double-prefix regression#26047yuneng-berri wants to merge 4 commits intolitellm_internal_stagingfrom
yuneng-berri wants to merge 4 commits intolitellm_internal_stagingfrom
Conversation
The e2e_ui_testing job previously ran in parallel with no downstream dependencies, so a failure there did not block the pipeline. This let the UI double-prefix regression in 0afffe4 (read of NEXT_PUBLIC_BASE_URL in networking.tsx) ship unnoticed. Make e2e_ui_testing a prerequisite for the two jobs that function as the main-path gate: - build_and_test: the main branch-integration job - db_migration_disable_update_check: the main ops-path job Scoped narrowly to avoid over-serializing the DAG; other workflow branches still run in parallel with e2e_ui_testing. Part of the follow-up hardening for the double-prefix incident. Co-authored-by: yuneng-jiang <yuneng-berri@users.noreply.github.com>
Adds a Playwright fixture that wraps the default page and fails any test whose browser makes a forbidden request. Two classes of violations are flagged: 1. URL path contains /ui/ui/ (the exact double-prefix regression from 0afffe4 + NEXT_PUBLIC_BASE_URL="ui/"). 2. Any xhr/fetch request to a known API endpoint nested under /ui/ (e.g. /ui/key/info, /ui/team/list). The verb list is lifted from networking.tsx. Document/script/asset requests are exempt from rule 2 so legitimate UI routes that collide with API verb names (/ui/guardrails, /ui/usage, /ui/prompts) do not trigger false positives. The helper is unit-tested via vitest in tests/e2e_guard.test.ts (84 cases covering double-prefix, API-verb-under-/ui/, legitimate UI routes, static assets, root-level API paths, cross-origin, malformed URLs, and the resourceType boundary). Also tightens globalSetup.ts to reject post-login redirects that land on /ui/ui/, so the e2e suite fails early if login itself happens against a broken bundle. Next commit will migrate existing specs to use the fixture. Co-authored-by: yuneng-jiang <yuneng-berri@users.noreply.github.com>
Every Playwright spec under e2e_tests/tests/** now imports test/expect
from ../../fixtures/guarded-page instead of @playwright/test. The
guarded fixture wraps the default page with a request URL listener
that fails the test if any browser request matches the double-prefix
or /ui/<api-verb>/... patterns.
No per-test code changes are needed — the fixture overrides the
existing 'page' fixture name, so all existing ({ page }) callbacks
are automatically protected.
Also refactored the fixture to export test-with-wrapped-page (instead
of a separate guardedPage fixture name) so the migration is a
one-line import swap per spec.
Co-authored-by: yuneng-jiang <yuneng-berri@users.noreply.github.com>
Two Playwright meta-specs under e2e_tests/tests/meta/: - guard.spec.ts (positive): spins up an ephemeral HTTP server, navigates the browser at it, and verifies the guarded page's request listener fires. Also re-checks isForbiddenRequestUrl at runtime through the same code path Playwright uses. - guard_triggers.spec.ts (negative): fires /ui/ui/ and /ui/key/info XHRs from the page, with test.fail(true, ...) annotations so each spec *passes* iff the guard fixture trips (and fails iff the guard stops working). Meta-specs run via e2e_tests/playwright.meta.config.ts (no proxy, no DB, no globalSetup) and are excluded from the main Playwright config via a testIgnore entry so they don't slow down the heavy run. CircleCI e2e_ui_testing gains a lightweight 'Run guardedPage fixture meta-tests' step before the full run so fixture breakage is caught before we spend cycles booting the proxy. Co-authored-by: yuneng-jiang <yuneng-berri@users.noreply.github.com>
|
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Relevant issues
Follow-up hardening for the UI double-prefix bug (post-fix from commits
ba24e4a1b3 remove next envandaab3ef8988 chore: update Next.js build artifacts). The source fix — removingNEXT_PUBLIC_BASE_URL="ui/"fromui/litellm-dashboard/.env.productionand rebuilding the bundle underlitellm/proxy/_experimental/out/— already landed onlitellm_internal_staging. This PR addresses the three unresolved "Next Steps" from the investigation:e2e_ui_testing(pipeline 74034 / job 1526026) pass with the broken bundle?e2e_ui_testingan actual merge gate.Changes
CI (
.circleci/config.yml)- e2e_ui_testingtorequires:onbuild_and_testanddb_migration_disable_update_check, so a failing UI e2e run blocks the two jobs that function as the main-path gate. Intentionally scoped narrow so the rest of the DAG still runs in parallel withe2e_ui_testing.Run guardedPage fixture meta-testsstep insidee2e_ui_testingthat runs before booting the proxy, so fixture breakage is caught fast and cheap.Playwright fixture —
ui/litellm-dashboard/e2e_tests/fixtures/guarded-page.tsRe-exports
testfrom@playwright/testwith the defaultpagefixture wrapped in a URL guard. Specs only change their import source; no per-test code changes. The guard installs apage.on('request')listener and fails any test whose browser issues a request matching:/ui/ui/— the exact double-prefix regression. Applies to every resource type (document, script, xhr, fetch, …)./ui/<api-verb>/...for xhr/fetch requests only, where<api-verb>is one of the top-level paths pulled from${proxyBaseUrl}/...calls innetworking.tsx(key,team,user,model,global,spend,customer,health,sso,callbacks,budget,tag, etc.).Document navigations to
/ui/guardrails,/ui/usage,/ui/promptsetc. — real UI routes whose first segment collides with API verb names — are explicitly NOT flagged. Static assets under/ui/_next/*,/ui/assets/*,*.{html,js,css,map,png,…}are allow-listed.Spec migration
Every spec under
e2e_tests/tests/**(including the skipped users specs, for consistency) now importstest/expectfrom../../fixtures/guarded-pageinstead of@playwright/test. Because the fixture overrides thepagefixture name, existing({ page })callbacks are automatically protected with no code changes.globalSetup.tsTighten the post-login
waitForURLpredicate so a redirect to/ui/ui/...fails setup immediately, instead of silently authenticating against a broken bundle.Meta-tests
Two Playwright specs under
e2e_tests/tests/meta/(run viaplaywright.meta.config.ts, excluded from the main run):guard.spec.ts— liveness test. Spins up an ephemeral HTTP server and verifies the guardedpage's request listener fires on a real navigation, plus runtime sanity-checksisForbiddenRequestUrl().guard_triggers.spec.ts— negative test. Intentionally fires/ui/ui/and/ui/key/infoXHRs withtest.fail(true, ...)annotations so each spec passes iff the guard trips. If someone breaks the fixture wiring, these flip red.Unit tests —
ui/litellm-dashboard/tests/e2e_guard.test.ts84 vitest cases covering double-prefix detection, API-verb-under-
/ui/for xhr/fetch, legitimate UI routes (including API-verb-named routes like/ui/guardrails), static asset allow-list, root-level API paths, cross-origin, malformed URLs, and the resource-type boundary. All 84 pass.Phase 1 — Why did the original CircleCI run pass?
I could not retrieve logs for pipeline 74034 / job 1526026 from this sandbox — CircleCI's public API returns "Project not found" (the project is private) and neither the CircleCI MCP tool nor the GitHub Status API exposes the run from here.
Based on static analysis of
ui/litellm-dashboard/e2e_tests/tests/**andnetworking.tsx:waitForLoadState("networkidle")treats 404 responses as "completed". Several pages render from client state even when API calls fail.fetch("ui/project/list")resolves against the current URL. The Next.js export's canonical URL is/ui(no trailing slash), so the browser replaces the last segment →/ui/project/list. In Docker (behind a reverse proxy that serves under/ui/), the current URL ends with/, so the same relative URL resolves to/ui/ui/project/list→ 404.In other words: the source, the build, and the bundle were all identical between Docker and CircleCI — what differed was the current-URL-with-trailing-slash, and whether the proxy's static-asset handler happened to fall through to the API router for
/ui/<verb>/...paths. This PR's guarded fixture catches both variants since the UI still issues raw/ui/<api-verb>/...XHRs from the relative URL resolution, regardless of whether they 404 or get routed.Pre-Submission checklist
ui/litellm-dashboard/tests/e2e_guard.test.ts,ui/litellm-dashboard/e2e_tests/tests/meta/*.spec.ts).test.fail(true, ...)inversion).python3 -c "import yaml; yaml.safe_load(...)").litellm/, no schema changes, no dep changes. Only CI config, Playwright e2e infrastructure, and matching vitest coverage.Type
✅ Test
🚄 Infrastructure
Screenshots / Proof of Fix
Vitest:
Meta-tests (Playwright chromium, standalone):
✘ 1and✘ 2are the negative tests — they expect the guard to trip (viatest.fail(true, ...)), so the run-level result is4 passed.Follow-ups not in this PR
e2e_ui_testingstatus as a required check onlitellm_internal_staging/main. Must be done in the repo settings UI; cannot be changed from code.