Project: ResisTrack · AI-Powered AMR Risk Prediction & Infection Control Platform
Team: Curelytics · Impact-AI-Thon 2026
Version: 1.0
ResisTrack predicts antimicrobial resistance (AMR) risk in hospitalized patients within 6 hours of admission, before culture results are available (which take 48–120 hours). The platform ingests EHR data, lab results, vital signs, prior antibiotic history, and clinical notes to generate real-time AMR risk scores and stewardship recommendations.
Primary Goal: Close the diagnostic gap. Reduce inappropriate antibiotic prescribing. Support — never replace — clinical judgment.
These rules are non-negotiable and override all other instructions:
RULE-SAFETY-01: The model MUST NEVER issue antibiotic prescriptions,
modify medication orders, or make autonomous treatment decisions.
All outputs are decision support — final authority rests with the clinician.
RULE-SAFETY-02: The model MUST NOT surface predictions with a
calibrated confidence score below 0.60 without attaching a
LOW_CONFIDENCE_FLAG = true to the output payload.
RULE-SAFETY-03: The model MUST flag when input data quality is
insufficient (e.g., < 3 lab values in prior 72 hours, missing vitals)
and must communicate data completeness score alongside the risk output.
RULE-SAFETY-04: Model outputs must NEVER be used to deny treatment.
They are risk stratification tools only.
RULE-SAFETY-05: No model version may be promoted to production without
passing clinical validation on >= 1,000 patient records with documented
sensitivity >= 0.80 and specificity >= 0.75.
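As an illustration of RULE-SAFETY-02 and RULE-SAFETY-03, the flags could be derived in one place before the payload is assembled. This is a minimal sketch: the function name, the `SafetyFlags` type, and the choice to report insufficient input data through the data quality flag are assumptions, not part of the rules above.

```python
from dataclasses import dataclass

CONFIDENCE_FLOOR = 0.60   # RULE-SAFETY-02 threshold
MIN_LAB_VALUES_72H = 3    # example threshold quoted in RULE-SAFETY-03


@dataclass(frozen=True)
class SafetyFlags:
    low_confidence_flag: bool
    data_quality_flag: bool


def safety_flags(
    confidence_score: float, lab_values_72h: int, vitals_present: bool
) -> SafetyFlags:
    """Derive the flags that must accompany a risk output (hypothetical helper)."""
    low_confidence = confidence_score < CONFIDENCE_FLOOR
    # Assumption: too few labs or missing vitals is surfaced via the quality flag.
    insufficient_data = lab_values_72h < MIN_LAB_VALUES_72H or not vitals_present
    return SafetyFlags(low_confidence, insufficient_data)
```

A prediction at exactly 0.60 is not flagged, since the rule applies only to scores below 0.60.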
RULE-DATA-01: ALL patient data must be treated as Protected Health
Information (PHI) under HIPAA. No PHI may leave the AWS VPC boundary.
RULE-DATA-02: The model must NEVER receive raw patient identifiers
(name, SSN, DOB, MRN) as direct input features. All patient references
must use tokenized internal IDs only.
RULE-DATA-03: Clinical notes passed to ClinicalBERT/BioBERT must be
processed inside the VPC only. No external API calls (e.g., OpenAI,
Anthropic) with patient note content.
RULE-DATA-04: Model training data must originate only from hospitals
that have signed a Business Associate Agreement (BAA) and data sharing
consent. Training on non-consented data is prohibited.
RULE-DATA-05: All model inputs and outputs must be logged to the
audit trail in RDS with timestamp, user role, hospital_tenant_id,
and de-identified patient_token. Log retention: 7 years minimum.
RULE-DATA-06: Training datasets must be de-identified per HIPAA
Safe Harbor (removing all 18 PHI identifiers) before use in
any non-production environment.
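One possible way to satisfy RULE-DATA-02 is a keyed hash that maps an identifier to a stable internal token and a filter that drops raw identifiers before feature construction. The sketch below uses only the Python standard library; the function names and the field list are hypothetical, and whether a keyed hash meets the project's de-identification requirements is a compliance question this example does not settle.

```python
import hashlib
import hmac

# Hypothetical field names for the raw identifiers listed in RULE-DATA-02.
PROHIBITED_FIELDS = {"patient_name", "ssn", "dob", "mrn"}


def patient_token(mrn: str, tenant_id: str, secret_key: bytes) -> str:
    """Return a stable keyed token for one patient within one hospital tenant."""
    message = f"{tenant_id}:{mrn}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def strip_identifiers(record: dict[str, object]) -> dict[str, object]:
    """Drop raw identifier fields so they can never reach the feature vector."""
    return {k: v for k, v in record.items() if k not in PROHIBITED_FIELDS}
```

In line with the secrets rule later in this document, `secret_key` would be loaded from AWS Secrets Manager rather than hardcoded.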
| Feature Name | Type | Source | Notes |
|---|---|---|---|
| wbc_trend_7d | float | LIS | White blood cell count, 7-day slope |
| crp_latest | float | LIS | C-Reactive Protein, most recent value |
| creatinine_trend | float | LIS | Creatinine 72h delta |
| prior_beta_lactam_count | int | Pharmacy | Count of prior beta-lactam Rx in past 90 days |
| prior_fluoroquinolone_count | int | Pharmacy | Count in past 90 days |
| prior_carbapenem_flag | bool | Pharmacy | Any carbapenem exposure in past 12 months |
| icu_admission_flag | bool | EHR | Whether the current encounter is an ICU admission |
| age_years | int | EHR | Patient age; do NOT use DOB directly |
| charlson_comorbidity_index | int | Calculated | From ICD-10 codes in active problem list |
| admission_ward_code | categorical | EHR | Encoded ward ID (not ward name) |
| days_since_last_hospitalization | int | EHR | 0 if no prior admission in system |
| culture_positive_history_flag | bool | LIS | Any prior positive culture on record |
| isolation_flag_current | bool | EHR | Active contact/droplet isolation order |
| temperature_max_48h | float | Vitals | Max temp (°C) in past 48 hours |
| heart_rate_max_48h | float | Vitals | Max HR in past 48 hours |

All feature values must be validated against acceptable ranges before inference. Values outside physiologically plausible ranges must trigger DATA_QUALITY_FLAG.
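The range check can be sketched as a lookup table plus a small validator. The bounds below are illustrative placeholders chosen for the example, not clinically reviewed limits, and the function names are hypothetical.

```python
import math

# Illustrative bounds only; real limits need clinical review.
PLAUSIBLE_RANGES: dict[str, tuple[float, float]] = {
    "temperature_max_48h": (25.0, 45.0),   # degrees Celsius
    "heart_rate_max_48h": (20.0, 300.0),   # beats per minute
    "age_years": (0.0, 120.0),
    "crp_latest": (0.0, 1000.0),
}


def out_of_range_features(features: dict[str, float]) -> list[str]:
    """Return the names of features that are NaN or outside their range."""
    bad: list[str] = []
    for name, (low, high) in PLAUSIBLE_RANGES.items():
        value = features.get(name)
        if value is None:
            continue  # absent features are a completeness issue, not a range issue
        if math.isnan(value) or not (low <= value <= high):
            bad.append(name)
    return bad


def data_quality_flag(features: dict[str, float]) -> bool:
    """True when at least one feature fails the range check."""
    return bool(out_of_range_features(features))
```

Returning the offending feature names, not only a boolean, lets the caller log which input caused the flag.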
- Input shape: (batch_size, 72, 13): 72 hourly timestamps, 13 channels (8 lab values + 5 vitals)
- Missing time steps: forward-fill with last known value; if >30% of timestamps are missing → set DATA_COMPLETENESS_SCORE < 0.70 and attach warning
- Normalization: z-score per feature using hospital-cohort training statistics (not global statistics)
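The three preprocessing steps above can be sketched for a single channel in plain Python. The sketch assumes the completeness score is the fraction of observed time steps, so more than 30% missing steps yields a score below 0.70; the function names and the `None` encoding of missing values are assumptions.

```python
from typing import Optional, Sequence

COMPLETENESS_WARNING_THRESHOLD = 0.70  # more than 30% missing falls below this


def forward_fill(series: Sequence[Optional[float]]) -> list[Optional[float]]:
    """Carry the last observed value forward; leading gaps stay None."""
    filled: list[Optional[float]] = []
    last: Optional[float] = None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def preprocess_channel(
    series: Sequence[Optional[float]], cohort_mean: float, cohort_std: float
) -> tuple[list[Optional[float]], float, bool]:
    """Return (z-scored series, completeness score, warning flag) for one channel."""
    if not series:
        raise ValueError("series must not be empty")
    if cohort_std <= 0:
        raise ValueError("cohort_std must be positive")
    # Completeness is measured on the raw series, before any filling.
    completeness = sum(v is not None for v in series) / len(series)
    normalized = [
        None if v is None else (v - cohort_mean) / cohort_std
        for v in forward_fill(series)
    ]
    return normalized, completeness, completeness < COMPLETENESS_WARNING_THRESHOLD
```

`cohort_mean` and `cohort_std` stand for the hospital-cohort training statistics; forward-fill cannot populate steps before the first observation, so those remain `None` and need a separate policy.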
- Input: last 3 clinical notes (physician + nursing), max 512 tokens each after truncation
- Truncation strategy: keep first 128 tokens (contains chief complaint / assessment) + last 384 tokens
- Notes older than 72 hours: exclude unless no newer notes exist
- Do NOT pass radiology report image data — text reports only
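A minimal sketch of the note selection and head-plus-tail truncation rules, operating on already-tokenized notes. Tokenization itself is out of scope here, the function names are hypothetical, and whether the 512-token budget includes the model's special tokens is not stated above, so this example ignores them.

```python
from datetime import datetime, timedelta, timezone

HEAD_TOKENS = 128
TAIL_TOKENS = 384
MAX_TOKENS = HEAD_TOKENS + TAIL_TOKENS  # 512
MAX_NOTES = 3
MAX_NOTE_AGE = timedelta(hours=72)


def truncate_tokens(tokens: list[str]) -> list[str]:
    """Keep the first 128 and last 384 tokens when a note exceeds 512 tokens."""
    if len(tokens) <= MAX_TOKENS:
        return list(tokens)
    return tokens[:HEAD_TOKENS] + tokens[-TAIL_TOKENS:]


def select_notes(
    notes: list[tuple[datetime, str]], now: datetime
) -> list[tuple[datetime, str]]:
    """Pick up to 3 most recent (written_at, text) notes, preferring the last 72h."""
    recent = [n for n in notes if now - n[0] <= MAX_NOTE_AGE]
    pool = recent if recent else notes  # fall back to older notes only if none are recent
    return sorted(pool, key=lambda n: n[0], reverse=True)[:MAX_NOTES]
```

For a 1,000-token note, the result contains tokens 0 to 127 followed by tokens 616 to 999.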
Every inference call must return the following structured JSON payload:
{
  "patient_token": "string (de-identified internal token)",
  "hospital_tenant_id": "string",
  "inference_timestamp": "ISO 8601 UTC",
  "amr_risk_score": 0.0,
  "risk_tier": "LOW | MEDIUM | HIGH | CRITICAL",
  "confidence_score": 0.0,
  "low_confidence_flag": false,
  "data_completeness_score": 0.0,
  "data_quality_flag": false,
  "antibiotic_class_risk": {
    "beta_lactam": 0.0,
    "carbapenem": 0.0,
    "fluoroquinolone": 0.0,
    "aminoglycoside": 0.0,
    "vancomycin": 0.0
  },
  "shap_top_features": [
    {
      "feature_name": "string",
      "shap_value": 0.0,
      "direction": "INCREASES_RISK | DECREASES_RISK",
      "human_readable": "string (plain English explanation for clinician)"
    }
  ],
  "recommended_action": "string (stewardship recommendation text)",
  "model_version": "string",
  "explanation_available": true
}

| Score Range | Tier | Required Action |
|---|---|---|
| 0 – 24 | LOW | No immediate action required; monitor |
| 25 – 49 | MEDIUM | Flag for pharmacist review within 24h |
| 50 – 74 | HIGH | Trigger CDS Hook alert to attending physician and pharmacy |
| 75 – 100 | CRITICAL | Immediate CDS alert + infection control notification |
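The tier table can be expressed as a single mapping function. The table lists integer bounds, so this sketch assumes a 0 to 100 score scale and half-open intervals (a fractional score such as 24.5 falls in LOW); both are assumptions, and `risk_tier` is a hypothetical name.

```python
def risk_tier(score: float) -> str:
    """Map a 0-100 AMR risk score to its tier, using half-open intervals."""
    if not 0 <= score <= 100:  # also rejects NaN
        raise ValueError(f"score out of range: {score!r}")
    if score < 25:
        return "LOW"
    if score < 50:
        return "MEDIUM"
    if score < 75:
        return "HIGH"
    return "CRITICAL"
```

The function only labels the score; the required actions in the table remain alerts and reviews for clinicians, never automated treatment.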
RULE-TRAIN-01: Train/validation/test split must be 70/15/15, stratified
on outcome label (resistant/sensitive) and grouped by hospital_tenant_id
so that no hospital contributes records to both the training and test
sets. This avoids site-specific overfitting.
RULE-TRAIN-02: Class imbalance handling — apply SMOTE or class_weight
balancing when positive (resistant) class prevalence < 20%.
Document imbalance ratio in the model card.
RULE-TRAIN-03: XGBoost hyperparameter search must use Bayesian
optimization (not random search) with 50+ trials via SageMaker HPO.
Key parameters to tune: max_depth (3–8), learning_rate (0.01–0.3),
n_estimators (100–1000), subsample (0.6–1.0).
RULE-TRAIN-04: ClinicalBERT fine-tuning must use a clinical-domain
pre-trained checkpoint (e.g., emilyalsentzer/Bio_ClinicalBERT).
Do NOT fine-tune general-domain BERT on clinical notes.
RULE-TRAIN-05: All training runs must be logged to SageMaker
Experiments with: dataset version, feature set version, hyperparameters,
AUC-ROC, AUPRC, sensitivity@80%specificity, and confusion matrix.
RULE-TRAIN-06: Ensemble weights (XGBoost vs LSTM vs NLP) must be
learned via a held-out validation set meta-learner. Do NOT hardcode
equal weights.
RULE-TRAIN-07: Model performance must be disaggregated by
subgroup: age band (< 18, 18–65, > 65), ICU vs non-ICU,
and primary organism if label is available.
Report any subgroup performance gaps >= 10% AUC as a risk item.
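RULE-TRAIN-07 could be checked with a small disaggregation helper. This sketch computes AUC-ROC by pairwise comparison using only the standard library and flags a subgroup when its AUC is at least 0.10 (absolute) below the overall AUC. The rule does not say what the gap is measured against or whether 10% is absolute or relative, so both choices are assumptions, as are the function names.

```python
from collections import defaultdict


def auc_roc(labels: list[int], scores: list[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subgroup_gaps(
    labels: list[int], scores: list[float], groups: list[str], threshold: float = 0.10
) -> dict[str, tuple[float, bool]]:
    """Return {group: (auc, is_risk_item)} using the gap to the overall AUC."""
    by_group: dict[str, tuple[list[int], list[float]]] = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    overall = auc_roc(labels, scores)
    report: dict[str, tuple[float, bool]] = {}
    for g, (ys, ss) in by_group.items():
        auc = auc_roc(ys, ss)
        report[g] = (auc, overall - auc >= threshold)
    return report
```

The pairwise AUC is quadratic in sample size, which is fine for illustration; a production evaluation would more likely use a sorted-rank implementation from an established library.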
The following metrics must ALL be met before a model can be promoted to production:
| Metric | Required Threshold | Rationale (primary model: XGBoost) |
|---|---|---|
| AUC-ROC | ≥ 0.82 | Primary evaluation metric |
| AUPRC | ≥ 0.70 | Required for imbalanced data fairness |
| Sensitivity @ 80% Specificity | ≥ 0.80 | Critical for patient safety — miss rate |
| False Positive Rate | ≤ 0.20 | Alert fatigue prevention |
| Calibration (Brier Score) | ≤ 0.15 | Probability reliability |
| Inference Latency (p95) | ≤ 2,000 ms | Real-time CDS requirement |
Mandatory: All thresholds must be validated on a held-out test set (not validation set) before the model card is signed off.
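A promotion gate along these lines could encode the table directly, so that a missing or failing metric blocks promotion. The metric key names and function names below are hypothetical; the thresholds are copied from the table above.

```python
# (comparison, threshold) per metric, taken from the promotion table.
GATES: dict[str, tuple[str, float]] = {
    "auc_roc": (">=", 0.82),
    "auprc": (">=", 0.70),
    "sensitivity_at_80_specificity": (">=", 0.80),
    "false_positive_rate": ("<=", 0.20),
    "brier_score": ("<=", 0.15),
    "latency_p95_ms": ("<=", 2000.0),
}


def failed_gates(test_set_metrics: dict[str, float]) -> list[str]:
    """Return the metrics that are missing or miss their threshold."""
    failures: list[str] = []
    for name, (op, threshold) in GATES.items():
        value = test_set_metrics.get(name)
        if value is None:
            failures.append(name)  # a missing metric counts as a failure
        elif op == ">=" and not value >= threshold:
            failures.append(name)
        elif op == "<=" and not value <= threshold:
            failures.append(name)
    return failures


def may_promote(test_set_metrics: dict[str, float]) -> bool:
    """True only when every gate passes on held-out test set metrics."""
    return not failed_gates(test_set_metrics)
```

Writing the comparisons as `not value >= threshold` means a NaN metric also fails the gate.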
RULE-CDS-01: CDS Hook responses must be returned within 2 seconds
(p95). If inference endpoint latency exceeds 1.5 seconds, return
a cached score from the last inference run (max 24 hours old)
and flag CACHED_RESULT = true in the response.
RULE-CDS-02: CDS Hook cards must include a "Why this alert?"
link that opens the SHAP explainability panel — mandatory for
High and Critical tier alerts.
RULE-CDS-03: Every CDS Hook alert must provide three response
options to the clinician:
(a) "Acknowledged — will act"
(b) "Override — not applicable" (requires reason code selection)
(c) "Escalate to ID specialist"
All responses must be logged.
RULE-CDS-04: Override rate per clinician must be monitored.
If any clinician's override rate exceeds 60% over a 30-day period,
auto-generate a model feedback report for review by the clinical
informatics team.
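The monitoring in RULE-CDS-04 can be sketched as a windowed aggregation over the logged responses from RULE-CDS-03. The response codes (`"ACKNOWLEDGED"`, `"OVERRIDE"`, `"ESCALATE"`), the tuple layout, and the function name are assumptions made for this example.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

OVERRIDE_RATE_LIMIT = 0.60
WINDOW = timedelta(days=30)


def clinicians_to_review(
    responses: list[tuple[str, str, datetime]], now: datetime
) -> dict[str, float]:
    """Return {clinician_id: override_rate} for rates above 60% in the last 30 days.

    Each response is (clinician_id, response_code, responded_at).
    """
    totals: Counter[str] = Counter()
    overrides: Counter[str] = Counter()
    for clinician_id, code, responded_at in responses:
        if now - responded_at > WINDOW:
            continue  # outside the 30-day window
        totals[clinician_id] += 1
        if code == "OVERRIDE":
            overrides[clinician_id] += 1
    rates = {cid: overrides[cid] / totals[cid] for cid in totals}
    return {cid: rate for cid, rate in rates.items() if rate > OVERRIDE_RATE_LIMIT}
```

The rule sets no minimum alert count, so a clinician with a single overridden alert would be reported at 100%; whether a floor is needed is left open here.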
RULE-MLOPS-01: Model retraining schedule — monthly automated
SageMaker Pipeline run on new hospital data. Emergency retraining
triggered if model drift score (PSI > 0.20) is detected in
production monitoring.
RULE-MLOPS-02: Blue/green deployment required for all model
updates. New model receives 10% traffic initially; auto-promote
to 100% if, over 72 hours, its AUC-ROC on production shadow traffic
is no more than 0.02 below the previous model's AUC-ROC.
RULE-MLOPS-03: Model versioning: semantic versioning (MAJOR.MINOR.PATCH).
MAJOR version bump required for changes to feature set.
MINOR for retrained weights on same feature set.
PATCH for calibration-only updates.
RULE-MLOPS-04: Model rollback capability must be maintained for
the previous 2 production versions. Rollback execution time
target: < 15 minutes.
RULE-MLOPS-05: All production model predictions must be stored
(de-identified) for post-hoc analysis and ground truth
comparison once culture results are available.
Model accuracy against culture ground truth must be reported monthly.
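For the drift trigger in RULE-MLOPS-01, the Population Stability Index over a binned distribution is the sum over bins of (actual fraction - expected fraction) × ln(actual fraction / expected fraction). The sketch below assumes PSI is computed on one numeric distribution (a single feature or the risk score) with fixed bin edges, and it substitutes a small epsilon for empty bins; the rule specifies neither choice, and the function names are hypothetical.

```python
import math

PSI_DRIFT_THRESHOLD = 0.20
EPSILON = 1e-6  # stands in for empty bins so the logarithm stays defined


def bin_fractions(values: list[float], edges: list[float]) -> list[float]:
    """Fraction of values per bin; edges are ascending interior cut points."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[sum(v >= e for e in edges)] += 1
    return [c / len(values) for c in counts]


def psi(expected: list[float], actual: list[float], edges: list[float]) -> float:
    """Population Stability Index between a baseline and a production sample."""
    total = 0.0
    for e, a in zip(bin_fractions(expected, edges), bin_fractions(actual, edges)):
        e, a = max(e, EPSILON), max(a, EPSILON)
        total += (a - e) * math.log(a / e)
    return total


def needs_emergency_retraining(
    expected: list[float], actual: list[float], edges: list[float]
) -> bool:
    """True when PSI exceeds the 0.20 drift threshold."""
    return psi(expected, actual, edges) > PSI_DRIFT_THRESHOLD
```

Here `expected` would be the training-time sample and `actual` a recent production sample; identical samples give a PSI of exactly 0.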
The following patterns are strictly prohibited in any code, model, or pipeline component:
# ❌ NEVER DO: External API calls with patient data
import openai
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": patient_note}]  # PROHIBITED
)
# ❌ NEVER DO: Raw PHI as model features
features["patient_name"] = row["patient_name"] # PROHIBITED
features["date_of_birth"] = row["dob"] # PROHIBITED
features["social_security"] = row["ssn"] # PROHIBITED
# ❌ NEVER DO: Log PHI to CloudWatch or stdout
print(f"Processing patient {patient_mrn}") # PROHIBITED
logger.info(f"Patient name: {patient_name}") # PROHIBITED
# ❌ NEVER DO: Hardcode thresholds for clinical decisions
if amr_score > 50:
    prescribe_vancomycin()  # PROHIBITED: model never prescribes
# ❌ NEVER DO: Deploy model without validation gate
model.deploy(validation_passed=False)  # PROHIBITED

- All Python code must pass mypy --strict type checking
- All ML pipelines must be reproducible: set random_state=42 or equivalent for all stochastic operations
- Feature engineering functions must have unit tests with ≥80% line coverage
- SageMaker Processing scripts must be containerized (Docker) and version-pinned for reproducibility
- Secrets (API keys, DB credentials) must NEVER appear in code — use AWS Secrets Manager exclusively
- All infrastructure must be provisioned via AWS CDK (TypeScript) — no console-created resources in production
| Term | Definition |
|---|---|
| AMR | Antimicrobial Resistance — resistance of microorganisms to antimicrobial medicines |
| MDRO | Multi-Drug Resistant Organism |
| SHAP | SHapley Additive exPlanations — model explainability method |
| CDS Hooks | Clinical Decision Support Hooks — standard for EHR-integrated alerts |
| SMART on FHIR | Substitutable Medical Applications, Reusable Technologies on FHIR |
| HL7 v2 | Health Level 7 version 2 — legacy healthcare messaging standard |
| FHIR R4 | Fast Healthcare Interoperability Resources Release 4 — modern healthcare data standard |
| BAA | Business Associate Agreement — HIPAA-required contract for PHI handling |
| PHI | Protected Health Information |
| PSI | Population Stability Index — metric for detecting model/data drift |
| AUPRC | Area Under the Precision-Recall Curve |
ResisTrack Agent Rules v1.0 — Team Curelytics — Impact-AI-Thon 2026
These rules must be reviewed and updated with each MAJOR model version release.