Skip to content

DOC: Validate versions.json before building docs #61573 #61578

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

iabhi4
Copy link
Contributor

@iabhi4 iabhi4 commented Jun 5, 2025

Adds a JSON validity check for versions.json directly inside pandas_web during context generation. This ensures malformed JSON (e.g., trailing commas) is caught early, preventing issues like the broken version dropdown in #61572

@iabhi4 iabhi4 force-pushed the add-sanity-for-versions-json branch 2 times, most recently from 0aa4645 to f1d80fc Compare June 5, 2025 22:05
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @iabhi4

Can you move this check to main, and just open the file and load it as json, with a comment about what's for.

Also, can you add the comma to versions so we can see how this fails, and if we really need to reraise the exception (which I'm not sure we need).

Thanks!

@datapythonista datapythonista added CI Continuous Integration Web pandas website labels Jun 6, 2025
@iabhi4 iabhi4 force-pushed the add-sanity-for-versions-json branch from f1d80fc to 93d9d89 Compare June 6, 2025 20:52
@iabhi4
Copy link
Contributor Author

iabhi4 commented Jun 6, 2025

Can you move this check to main, and just open the file and load it as json, with a comment about what's for.

Thanks for the review @datapythonista, just to confirm: when you say "move this check to main", do you mean the main() function itself, or the __main__ block. My assumption is the __main__ block, so we catch the error before any file system operations, just want to double check!

Added the trailing comma to versions.json to test the failure. Leaving the sys.exit(1) in place for now. Once we see the CI result, happy to adjust that if we want to just warn instead of fail

@iabhi4 iabhi4 requested a review from datapythonista June 6, 2025 20:57
@datapythonista
Copy link
Member

Better inside the main function. The idea is that the if __name__ == "__main__": block is only executed when the script is called. If for whatever reason I want to import the script from another file to generate the website, that block won't execute, but I still want to validate the file.

We don't want to warn. What I'm saying is that when you do json.load() this will already raise an exception, show an error message and crash the script. So, catching the exception to show an error and crash the script is probably not needed, since it'll already happen, and the code is significantly simpler, just two lines.

If the default error message is not great, then I'm happy with your approach. But instead of sys.exit I think it's better to raise a new exception from the previous one (Python can chain exceptions).

Btw, when this is done, if you are interested, it'd be nice to open a new PR to use pathlib instead of os.path in the whole script. For this PR is better to use os.path for consistency.

@iabhi4 iabhi4 force-pushed the add-sanity-for-versions-json branch from 93d9d89 to 89b8493 Compare June 6, 2025 22:18
@iabhi4
Copy link
Contributor Author

iabhi4 commented Jun 6, 2025

Thanks for such a detailed clarification. Have pushed the updated changes.

Btw, when this is done, if you are interested, it'd be nice to open a new PR to use pathlib instead of os.path in the whole script

happy to follow up with a separate PR for the pathlib changes once this one is in

@datapythonista
Copy link
Member

Thanks for the update. I think catching the exception as you did was indeed better, the error message here is not helpful enough. But personally, I would just raise a RuntimeError with the good error message from the original one.

Feel free to remove the comma, after those changes I think we can get this merged. Thanks!

@iabhi4 iabhi4 force-pushed the add-sanity-for-versions-json branch from 89b8493 to 57bbd0f Compare June 6, 2025 22:42
@iabhi4
Copy link
Contributor Author

iabhi4 commented Jun 6, 2025

Raising RuntimeError now as suggested. Thank you so much for your help in this

@iabhi4 iabhi4 force-pushed the add-sanity-for-versions-json branch from 57bbd0f to 6eaa177 Compare June 6, 2025 22:52
@iabhi4 iabhi4 requested a review from datapythonista June 6, 2025 22:53
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @iabhi4, very nice!

@datapythonista datapythonista merged commit 2096b28 into pandas-dev:main Jun 7, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous Integration Web pandas website
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WEB: Test that our versions JSON is valid
2 participants