This repository contains the code for fitting a dynamic Bayesian model of NCAA team performance and simulating potential March Madness outcomes based on the results. The model is written in Stan and the core pipeline is written in R, with a modest amount of Python and JavaScript where needed.
READMEs in each directory provide details on the directory contents. In particular, the `stan/` README walks through the model in detail. A general overview of the model methodology can be found in the "How this works" article, and the full output for the men’s and women’s brackets can be found at the data diary.
- Modified `run_bracket_model()` to pull in the result of the championship game.
- Added explicit anti-join columns when writing model outputs to disk.
- Fixed minor bug in `run_bracket_model()` to enforce correct date filtering when finding last `wid0` used.
- `run_*_model()` functions now de-duplicate results by date/league before writing results out.
- `run_bracket_model()`, `generate_html_bracket()`, and `generate_html_table()` skip processing when no games have been played.
- Modified `extract_teams()` so that partially-filled games return a `0` instead of `NA`.
- Manual override for Men’s BYU Sweet 16 game (no link available in ESPN).
- Modified `images.py` so that Tennessee’s women’s logo appears correctly.
- Updated women’s regional names.
- Initial release!
- Fixed a few minor pre-release bugs to ensure the pipeline runs on 2025 tournament data.
- Initial pre-release