tools: add build_frontend.sh for reliable Toolforge frontend builds #534
Open
lgelauff wants to merge 90 commits into
Runs npm install + toolforge:build via `toolforge jobs run` to avoid OOM crashes in the interactive webservice shell. Works for any tool account (montage-beta, montage-dev) after `become <account>`. Usage: bash tools/build_frontend.sh
Without frontend/.env.production, the .env file is gitignored on Toolforge so VITE_API_ENDPOINT is undefined at build time, causing Axios baseURL to resolve to undefined/v1/ and breaking all API calls. Verified working on montage-dev (2026-04-20). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wrap command in bash -c so && is interpreted by a shell
- Use npm ci instead of npm install (strict, never modifies lock file)
- Restore package-lock.json from git before building to prevent binary version drift
- Derive log paths from tool name instead of hardcoding montage-beta
- Replace unreliable toolforge jobs logs polling with tail -f on the output file
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
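The `bash -c` point in the commit above can be illustrated with a small sketch. It assumes only that a job runner which does not go through a shell passes `&&` along as a literal argument; the `echo` commands are stand-ins, not the real build commands.

```shell
#!/usr/bin/env bash
# Without a shell parsing the string, '&&' is just another argument to echo.
no_shell=$(echo one '&&' echo two)

# Wrapped in bash -c, the same text is parsed by a shell and '&&' chains two commands.
with_shell=$(bash -c 'echo one && echo two')

echo "$no_shell"
echo "$with_shell"
```

The first line printed keeps the literal `&&`, while the second call prints the two words on separate lines.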
…nary fix
- Use --wait flag on toolforge jobs run instead of polling
- Derive esbuild version from package-lock.json and install matching @esbuild/linux-x64 binary explicitly (workaround for npm 9 in node20 image not installing correct optional platform dep)
- Show logs with cat after job completes instead of tail -f polling
- Drop rm -rf node_modules (not needed with explicit binary fix)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
esbuild's install.js post-install script validates the platform binary version immediately during npm install. With a stale npm cache serving @esbuild/linux-x64@0.25.5, it fails before we can replace the binary. --ignore-scripts skips the validation; the explicit binary install that follows provides the correct version. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Truncate log files before each run so stale errors from previous attempts don't appear in output
- Filter EBADENGINE/WARN noise from stderr; only show real errors
- Consolidate comments explaining the --ignore-scripts workaround
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
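A minimal sketch of the truncate-then-filter idea from the commit above. The temporary log paths, the sample stderr lines, and the `show_real_errors` helper are assumptions for illustration; the actual script derives its log paths from the tool name and may use a different filter pattern.

```shell
#!/usr/bin/env bash
# Hypothetical log paths; stand-ins for the per-tool .out/.err files.
out_log=$(mktemp)
err_log=$(mktemp)

# Truncate both logs so output from earlier runs cannot leak into this one.
: > "$out_log"
: > "$err_log"

# Stand-in for the build job writing to stderr.
printf '%s\n' \
    'npm WARN EBADENGINE Unsupported engine' \
    'npm WARN deprecated some-package@1.0.0' \
    'Error: build failed' >> "$err_log"

# Print only the lines that are not npm warning noise.
show_real_errors() {
    # grep exits 1 when every line was filtered out; that is not a failure here.
    grep -v -E 'EBADENGINE|WARN' "$1" || true
}

show_real_errors "$err_log"
```

Only the `Error: build failed` line survives the filter in this example.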
Add a KNOWN WORKAROUNDS header distinguishing temporary hacks (esbuild binary mismatch due to npm 9 / cross-platform lock file) from permanent requirements (VITE_API_ENDPOINT in .env.production), with pointers to root causes and what removes each workaround. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the manual webservice shell approach (npm install + build in an interactive node20 shell) with the build script, which handles the esbuild binary workaround and runs as a proper Toolforge job. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…/patch
- vite 6.3.6 → 6.4.2 (fixes 3 high-severity dev-server CVEs)
- vue 3.5.22 → 3.5.33
- axios 1.12.2 → 1.15.2
- dayjs 1.11.18 → 1.11.20
- prettier 3.6.2 → 3.8.3
Remaining 12 vulnerabilities are all in Cypress test deps or build tooling and do not affect the production deployment.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…flicts npm install inside the Toolforge job modifies package-lock.json on the bastion's NFS mount. This causes 'git pull' to abort with a merge conflict on the next deploy. Restore the file from git at the start of the script so it's always clean before the job runs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The build script now restores package-lock.json from git before running, so git pull + bash tools/build_frontend.sh is the single deploy command. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- README: add GitHub Issues link, mark Phabricator as archived, add pointer to dev.md
- dev.md: fix Node.js version (v16→v18), rdb.py line number (113→119), and project tree (filenames, removed non-existent config/ dir)
- .gitignore: add .claude/ and tmp/
- app.py: fix /a/ StaticFileRoute to point to static/index.html (not static/a/index.html which no longer exists)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…static files
- dev.md: document debug:true and userid options for bypassing OAuth locally
- config.default.yaml: add debug:false as an optional documented field
- Remove montage/static/a/index.html and static/dist/ (obsolete pre-Vue frontend files)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…keys Add debug_userid and debug_username config options so developers can set which account they are auto-logged in as when debug: true, instead of always defaulting to Slaporte. Document the /complete_login workaround for bypassing OAuth locally. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- deployment.md: correct db_url format (mysql+pymysql://), document log paths for both production and beta instances separately
- dev.md: remove non-existent config/ directory from project tree, fix Dockerfile -> dockerfile reference in Docker files section
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…evision/filetypes
Closes #425. Relates to #505.
Changes:
- labs.py: new query using file/filerevision/filetypes tables; adds file_id; SELECT DISTINCT to prevent duplicates from multiple linktarget rows
- rdb.py: add file_id column to Entry; deduplicate entries case-insensitively in add_entries() to match MariaDB utf8mb4_unicode_ci collation
- loaders.py: pass file_id through make_entry(); update export dicts
- tests: update fixtures and assertions for new query shape
- tools/migrate_prod_db.sql, revert_prod_db.sql: production schema migration
- requirements.txt: cffi 1.17.1, setuptools pin for Python 3.13
- deployment.md: update python3.9 → python3.13 references
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n debugging section Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Step 6: replace vague bullet list with cp command, chmod, and annotated config template
- Step 7: fix venv creation (python3.13, --without-pip + curl bootstrap)
- Step 8: fix path (tools/create_schema.py, not montage/create_schema.py)
- Fix python3.11 → python3.13 throughout
- Fix log paths (no logs/ subdirectory)
- Fix step numbering in "Deploying new changes" (was 1,2,3,5,8,9)
- Add cd ~ before restart commands (must run from ~, not repo)
- General grammar and clarity pass
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…step
Brings deployment.md up to date from tools/build-frontend-script (PR #534):
- Replace vague bullet instructions with actual commands
- Add config template with openssl cookie secret generation
- Fix venv creation (python3.13, --without-pip + curl bootstrap)
- Fix tools/create_schema.py path
- Fix log paths (no logs/ subdirectory)
- Add step 0 (optional clean slate) with backup of irreplaceable files
- Add venv rebuild procedure to debugging section
- Fix python3.11 → python3.13 throughout
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ge reinstall
- reinstall.sh: backs up config, wipes home dir, clones fresh, restores config, builds frontend, inits schema. Interactive confirmations before destructive steps. Refuses to run on production (montage) account. Detects service running under unexpected Python version and warns.
- reinstall_venv.sh: rebuilds venv inside webservice shell pod using --without-pip + curl bootstrap (avoids subprocess hang in pod).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dump usernames with is_organizer=1 to ~/backup/organizers.txt before the wipe, then re-insert them after schema init so organizer access survives a clean-slate reinstall. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
deploy.sh handles routine updates (git pull + build + restart), with --no-frontend and --pip flags. tools/steps/pip_install.sh and restart_service.sh are reusable building blocks; reinstall_venv.sh now delegates pip install to the step script. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
toolforge webservice stop returns non-zero when no service is running, causing set -e to silently exit the script. Add || true to the command substitution so the exit code is always 0. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
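The failure mode described in the commit above can be reproduced without Toolforge. In this sketch `fake_stop` is a hypothetical stand-in for `toolforge webservice stop` when no service is running; each variant runs in a subshell so that `set -e` does not end the demonstration itself.

```shell
#!/usr/bin/env bash
# Stand-in for a stop command that prints a message and returns non-zero.
fake_stop() {
    echo "no service running"
    return 1
}

# Unguarded: under set -e the failing assignment ends the subshell immediately.
unguarded=$( ( set -e; out=$(fake_stop); echo "reached: $out" ) 2>&1; echo "exit=$?" )

# Guarded: '|| true' inside the substitution makes the assignment always succeed.
guarded=$( ( set -e; out=$(fake_stop || true); echo "reached: $out" ) 2>&1; echo "exit=$?" )

echo "$unguarded"
echo "$guarded"
```

The unguarded variant never reaches its `echo`, while the guarded one still captures the command's output and carries on.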
If a previous run created ~/backup/ and exited early, the directory may have wrong permissions on the next run. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Avoids permission errors when re-running reinstall.sh after a partial run — if the backup already matches the source, no copy is needed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
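One way to express "skip the copy when the backup already matches" is a byte-for-byte comparison with `cmp -s`. This is a sketch under that assumption; `backup_if_changed` and the file names are hypothetical, and the actual script may compare files differently.

```shell
#!/usr/bin/env bash
# Copy src to dst only when dst is missing or differs from src.
backup_if_changed() {
    local src=$1 dst=$2
    # cmp -s is silent and returns 0 only when both files exist and are identical.
    if cmp -s "$src" "$dst" 2>/dev/null; then
        echo "unchanged"
    else
        cp "$src" "$dst" && echo "copied"
    fi
}

work=$(mktemp -d)
echo "secret: abc" > "$work/config.yaml"

first=$(backup_if_changed "$work/config.yaml" "$work/config.bak")
second=$(backup_if_changed "$work/config.yaml" "$work/config.bak")
echo "$first $second"
```

On the second call no write to the backup location is attempted, which is the property that matters when the target has awkward permissions after a partial run.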
Three-pass wipe for NFS-backed directories; clone step explicitly removes any leftover src directory before cloning. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
If running from NFS (/data/project/), copy script to /tmp and re-exec before the wipe so bash holds no open handles on the NFS filesystem. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
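The copy-and-re-exec step might look roughly like the sketch below. Only the `/data/project/` prefix and the `/tmp` destination come from the commit message; the function names and the `mktemp` template are assumptions, and the real script may detect its own location differently.

```shell
#!/usr/bin/env bash
# Returns 0 when the given script path lives under the NFS project mount.
needs_relocation() {
    case $1 in
        /data/project/*) return 0 ;;
        *)               return 1 ;;
    esac
}

# Copy the script to /tmp and replace the current process with the copy,
# so the running shell no longer reads its script from NFS.
relocate_and_reexec() {
    local self=$1
    shift
    if needs_relocation "$self"; then
        local copy
        copy=$(mktemp /tmp/reinstall.XXXXXX) || return 1
        cp "$self" "$copy" || return 1
        exec bash "$copy" "$@"
    fi
}

# Typical call near the top of a script (left commented out in this sketch):
# relocate_and_reexec "$(realpath "$0")" "$@"
```

Because `exec` replaces the process, anything after the call only runs in the relocated copy or when no relocation was needed.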
The webservice pod holds NFS file handles. Without waiting for it to terminate, rm -rf fails silently on the open directories. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Instead of silently failing at git clone, detect when ~/www/python/src survives the wipe (webservice pod still holding NFS handles), try a file-by-file deletion as a second attempt, and if still stuck exit with clear step-by-step recovery instructions.
CLONE_ERR=$(git clone ...) with set -e exits the script before
if [ $? -ne 0 ] runs when clone fails. Switch to $() || { } form.
Also detects 'already exists' in the error and prints the NFS-lock
recovery steps inline rather than showing a generic error.
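The `$() || { }` form from the commit above can be sketched as follows. `fake_clone` is a hypothetical stand-in for the failing `git clone`, and the hint text is illustrative rather than the script's actual recovery message.

```shell
#!/usr/bin/env bash
# Stand-in for a git clone that fails because the target directory survived.
fake_clone() {
    echo "fatal: destination path 'src' already exists" >&2
    return 128
}

# The function body is a subshell so set -e stays local to it.
try_clone() (
    set -e
    # A plain err=$(fake_clone 2>&1) would end the script here under set -e,
    # so a later [ $? -ne 0 ] check would never run. The || branch still
    # sees the captured output and is exempt from set -e.
    err=$(fake_clone 2>&1) || {
        case $err in
            *'already exists'*) echo "hint: leftover directory, see recovery steps" ;;
            *)                  echo "clone failed: $err" ;;
        esac
        exit 1
    }
    echo "cloned"
)

result=$(try_clone)
status=$?
echo "$result (status $status)"
```

The assignment and the error handler form a single list, which is why the captured error text is available to pattern-match on.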
rm -rf removes contents but NFS prevents removing the directory itself. Strategy: delete all files first, then rmdir the empty directory — rmdir succeeds even when rm -rf on the parent tree fails.
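The two-phase order described above (files first, then empty directories) can be sketched with `find`. This runs on a local temporary directory and only shows the ordering; whether it clears a given NFS lock is the commit's observation, not something this example demonstrates. `wipe_tree` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Delete files and symlinks first, then remove the now-empty directories bottom-up.
wipe_tree() {
    local dir=$1
    find "$dir" \( -type f -o -type l \) -exec rm -f {} +
    # -depth visits children before parents, so each rmdir sees an empty directory.
    find "$dir" -depth -type d -exec rmdir {} \; 2>/dev/null
    # Succeed only if the top-level directory is really gone.
    [ ! -e "$dir" ]
}

work=$(mktemp -d)
mkdir -p "$work/src/tools/steps"
touch "$work/src/app.py" "$work/src/tools/steps/pip_install.sh"

if wipe_tree "$work/src"; then
    echo "removed"
else
    echo "still present"
fi
```

Returning the existence check as the function's status lets a caller fall through to another strategy when the directory survives.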
If the src directory survives all deletion attempts (NFS lock on the tools/ subdirectory), fall back to initialising a git repo in-place and doing fetch + reset --hard + clean instead of clone.
- Preserve ~/.kube/ during reinstall wipe so toolforge-jobs keeps its kubeconfig and the frontend build doesn't break.
- Use Python import instead of regex to get ENV_NAME, with 'dev' as fallback. Guard against ENV_NAME='default' so credentials never get written into the committed config.default.yaml template.
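The ENV_NAME guard could be expressed as a small helper like the one below. The `config.<env>.yaml` naming pattern is an assumption extrapolated from `config.default.yaml`, and `config_target` is a hypothetical name; how ENV_NAME is actually obtained (the Python import) is not shown.

```shell
#!/usr/bin/env bash
# Decide which config file credentials may be written to.
config_target() {
    # Fall back to 'dev' when no environment name was detected.
    local env_name=${1:-dev}
    if [ "$env_name" = "default" ]; then
        echo "refusing to write credentials into config.default.yaml" >&2
        return 1
    fi
    echo "config.${env_name}.yaml"
}

config_target beta
config_target
```

Failing loudly for `default` keeps secrets out of the committed template even if detection goes wrong upstream.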
…WMCS operator needed
…t from lockfile The lockfile version and the version npm actually installs can diverge (cross-platform lockfile + npm 9), causing a host/binary mismatch. Reading the version after npm install ensures they always match.
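Reading the version after the install might be sketched like this. The `sed` extraction, the sample `package.json`, the version `1.2.3`, and the `--no-save` flag in the printed command are illustrative assumptions; the actual script may read the version another way, for example through node.

```shell
#!/usr/bin/env bash
# Read the "version" field from an installed package's package.json.
# This relies on npm's one-key-per-line layout; it is not a general JSON parser.
installed_version() {
    sed -n 's/^[[:space:]]*"version":[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Stand-in for node_modules/esbuild/package.json after the install step.
work=$(mktemp -d)
mkdir -p "$work/node_modules/esbuild"
cat > "$work/node_modules/esbuild/package.json" <<'EOF'
{
  "name": "esbuild",
  "version": "1.2.3",
  "optionalDependencies": {
    "@esbuild/linux-x64": "1.2.3"
  }
}
EOF

version=$(installed_version "$work/node_modules/esbuild/package.json")

# The matching platform binary would then be requested at exactly that version.
echo "npm install --no-save @esbuild/linux-x64@${version}"
```

Taking the version from what is on disk, rather than from the lockfile, is what keeps the host package and the platform binary in step.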
Curl-bootstrapping pip is faster than a full venv rebuild when only pip itself is missing. Seen on montage-beta 2026-05-04. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- `tools/build_frontend.sh`: a script for building the Vue frontend on Toolforge using a `jobs run` job (node20, 4Gi) rather than an interactive `webservice shell`
- `frontend/.env.production` from fix(frontend): set empty VITE_API_ENDPOINT for production builds #516, setting `VITE_API_ENDPOINT=` so production builds use a relative API base URL instead of the localhost default
- `deployment.md` (both fresh-install and update sections) updated to use the script instead of the manual `npm install` + `npm run toolforge:build` approach

Known workarounds documented in the script

Temporary: `--ignore-scripts` + explicit `@esbuild/linux-x64` install.
The Toolforge node20 image ships npm 9.2.0, which does not correctly install platform-specific optional binaries when `package-lock.json` was generated on macOS with npm 10+. This causes esbuild's post-install validation to fail. Workaround: skip post-install scripts, then install the correct linux-x64 binary explicitly at the version from the lock file.
Long-term fix: switch to `--image node22` (ships npm 10+), see T393437.

Permanent: `frontend/.env.production`.
Required so production builds use `/v1/` as the API base URL. Must stay in the repo.

Test plan

- `git pull && bash tools/build_frontend.sh` completes with `✓ built in Xs` on montage-beta
- `montage/static/assets/` contains freshly built files
- `--- errors ---` section of script output
- `toolforge webservice python3.11 restart`

🤖 Generated with Claude Code