feat: legacy school type with any-format uploads, PII check, and Edvise Schema (ES) naming (re-added) #216
Merged
kaylawilding merged 1 commit into develop on Mar 11, 2026
feat: legacy school type with any-format uploads, PII check, and Edvise Schema (ES) naming
changes
Institutions API
- Add `legacy_id` to institution model and create/update flows. Enforce mutual exclusivity: at most one of `pdp_id`, `edvise_id`, or `legacy_id` per institution via `has_at_most_one_school_type()`.
- Auto-assign `edvise_id`/`legacy_id` (e.g. `edvise_N`, `legacy_N`) on create when `is_edvise`/`is_legacy` is set and no id is provided. Reject create when more than one school type is requested.

Validation (data upload)
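- For institutions with `legacy_id`, use `schema_namespace = "legacy"`: accept any CSV format (encoding + read only, no schema validation), then run a PII column check before moving to raw/validated. `student_id` is excluded from PII (treated as non-PII).
- Legacy uploads with non-descriptive filenames get `allowed_schemas = ["UNKNOWN"]` instead of 422; non-legacy uploads still receive 422 for non-descriptive filenames.
- Refactor `validation_helper` into smaller helpers, with stricter input validation (empty file name → 422, invalid `inst_id` → 404 with logging), and module-level constants for cache TTLs.

Naming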
Tests
- `legacy_id` on create, PATCH to add `legacy_id`, bucket/Databricks failures, reject both edvise + legacy.
- `file_types: ["UNKNOWN"]`, empty file name → 422, invalid `inst_id` → 404, edvise non-descriptive filename → 422, duplicate validate idempotent.
- `_infer_allowed_schemas_from_filename` and `_ext_models_set`; utilities test for `has_at_most_one_school_type`.

context
… `UNKNOWN`. Downstream steps (EDA, model runs) that require STUDENT/COURSE still behave as before (404/400 when only `UNKNOWN` is present).

deployment
Before or as part of deploying this branch, the database schema must include the `legacy_id` column on the institution table. If it is not already present, run:

(This matches `pdp_id`/`edvise_id` in the schema; `VAR_CHAR_LENGTH` is 36 in the codebase.)

questions
None
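
As an illustration of the Institutions API rule listed under changes (at most one of `pdp_id`, `edvise_id`, `legacy_id`, plus auto-assignment of `legacy_N`), here is a minimal sketch. The PR description names `has_at_most_one_school_type()` but does not show its signature, so the parameters below are an assumption; `assign_legacy_id` and its `next_n` argument are hypothetical and only stand in for however the codebase picks `N`.

```python
from typing import Optional


def has_at_most_one_school_type(
    pdp_id: Optional[str],
    edvise_id: Optional[str],
    legacy_id: Optional[str],
) -> bool:
    """Return True when no more than one of the three ids is set.

    Assumed signature; the real helper may take a model object instead.
    """
    # None and empty strings both count as "not set" (an assumption).
    set_count = sum(1 for value in (pdp_id, edvise_id, legacy_id) if value)
    return set_count <= 1


def assign_legacy_id(
    legacy_id: Optional[str], is_legacy: bool, next_n: int
) -> Optional[str]:
    """Hypothetical helper: use legacy_N when is_legacy is set and no id is given."""
    if is_legacy and not legacy_id:
        return f"legacy_{next_n}"
    # An explicitly provided id (or no legacy flag) is left untouched.
    return legacy_id
```

A create handler could call `has_at_most_one_school_type(...)` first and reject the request when it returns `False`, then fill in the missing id.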
Note
Medium Risk
Medium risk because it changes institution creation/update rules and the core upload validation flow, including a new bypass path that accepts arbitrary CSVs (gated by `legacy_id`) and new PII-based rejections.

Overview
Adds Legacy schools by introducing `InstTable.legacy_id` and exposing it through institutions read/create/update responses, with enforced mutual exclusivity across `pdp_id`, `edvise_id`, and `legacy_id` plus optional auto-assignment of `edvise_id`/`legacy_id` on create.

Refactors upload validation routing (`validation_helper`) into smaller helpers, adds stricter input validation (empty filenames, invalid institution IDs), and introduces a legacy validation path (`institution_id="legacy"`) that skips schema validation, reads the CSV as-is, and blocks uploads whose column names look like PII.

Updates documentation/messages to consistently use “Edvise Schema (ES)”, adjusts PII detection to treat `student_id` as non-PII, and expands tests/fixtures to cover legacy behavior, new error cases, and updated masking expectations.

Written by Cursor Bugbot for commit 9f01339. This will update automatically on new commits.
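
The legacy validation path described above (read the CSV as-is, skip schema validation, block column names that look like PII, treat `student_id` as non-PII) could look roughly like the sketch below. The hint list `PII_HINTS` and the function name `find_pii_columns` are hypothetical; the PR description does not list the real PII patterns or helper names.

```python
import csv
import io
from typing import List

# Illustrative hints only; the actual PII detection rules are not shown in this PR.
PII_HINTS = ("ssn", "email", "phone", "birth", "address")
# Per the PR description, student_id is treated as non-PII.
NON_PII_COLUMNS = {"student_id"}


def find_pii_columns(csv_bytes: bytes, encoding: str = "utf-8-sig") -> List[str]:
    """Return header names that look like PII; no schema validation is done."""
    text = csv_bytes.decode(encoding)
    # Only the header row is inspected; an empty file yields no columns.
    header = next(csv.reader(io.StringIO(text)), [])
    flagged = []
    for column in header:
        name = column.strip().lower()
        if name in NON_PII_COLUMNS:
            continue
        if any(hint in name for hint in PII_HINTS):
            flagged.append(column)
    return flagged
```

An upload would be rejected when the returned list is non-empty, and moved on to raw/validated otherwise.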