Testing Tectonic on the arXiv paper dataset #397
Comments
I've wanted to do this for so long, this is amazing!!!
Let me know if you're ever in the Boston area and I'll buy you the beverage(s) of your choice.
It would also be interesting to test the Oxidized fork/branch:
Here are the current results with crlf0710#122 applied.
@Mrmaxmeier Thank you, that's super helpful!
404 on right column.
Where are the sources for the test runner & frontend?
On arxiv.org under
@burrbull not the TeX sources, the code that runs the tests and displays them on https://tt.ente.ninja.
@cormacrelf It might make sense to rewrite this to consume tectonic as a library. It currently just spawns a bunch of tectonic processes.
@Mrmaxmeier can you rerun the tests with crlf0710#123?
Now that these tests are running automatically in the main repo, I am going to say that this issue is closed. Thank you for your work to set up this service @Mrmaxmeier, it is hugely valuable!
I'm experimenting with a few low-level changes in the engine. Tectonic's test suite is a nice early warning, but in my experience passing the tests says little about being able to build real LaTeX documents.
The arXiv archive provides TeX sources for most of the published papers.
I've hacked up a proof of concept of a crater-like regression testing tool using parts of the arXiv dataset.
Here's a quick rundown:
It'll run the engine on each source file and store logs and output files.
This takes about 30 minutes for 2500 papers.
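For readers who want a feel for that loop, here is a minimal sketch of the "run the engine on one source file and keep the log" step. It is not the actual test runner: the function name `run_sample`, the `build.log` file name, and the timeout are all made up for illustration, and the engine invocation is passed in as a list because the exact command line the tool uses isn't stated here.

```python
import subprocess
from pathlib import Path


def run_sample(command, source, out_dir, timeout=300):
    """Run `command + [source]` and store its combined output as a log file.

    `command` is the engine invocation as a list of strings. What the real
    runner passes here is an assumption; this helper is purely illustrative.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    log_path = out_dir / "build.log"  # hypothetical log file name
    try:
        proc = subprocess.run(
            list(command) + [str(source)],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep stdout and stderr in one log
            timeout=timeout,
        )
        status, output = proc.returncode, proc.stdout
    except subprocess.TimeoutExpired as exc:
        # A hung build is recorded as a result instead of aborting the run.
        status, output = "timeout", exc.stdout or b""
    log_path.write_bytes(output)
    return {"source": str(source), "status": status, "log": str(log_path)}
```

Calling this once per paper, each with its own output directory, gives one status and one log per sample; since the samples are independent, the calls could presumably be spread over a process pool.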
It currently relies on libfaketime to get as close to reproducible builds as possible.
Tectonic's output is now reproducible when setting SOURCE_DATE_EPOCH. This means we're able to detect regressions by checking the output hashes. Visual comparisons of output PDFs might still be interesting in the future, though.
There are some issues with deterministic outputs, as some samples trigger undefined behavior.
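The hash check could look roughly like the sketch below. It assumes each engine build writes one sample's outputs into its own directory; `reproducible_env`, `hash_outputs`, and `changed_outputs` are hypothetical helper names, not part of Tectonic or of the test tool. `SOURCE_DATE_EPOCH` is the one real name here, and the value 0 is an arbitrary choice.

```python
import hashlib
import os
from pathlib import Path


def reproducible_env(epoch=0):
    """Copy the current environment and pin SOURCE_DATE_EPOCH (value is arbitrary)."""
    env = dict(os.environ)
    env["SOURCE_DATE_EPOCH"] = str(epoch)
    return env


def hash_outputs(out_dir):
    """Map each file under out_dir (by relative path) to its SHA-256 hex digest."""
    out_dir = Path(out_dir)
    hashes = {}
    for path in sorted(out_dir.rglob("*")):
        if path.is_file():
            name = path.relative_to(out_dir).as_posix()
            hashes[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


def changed_outputs(baseline, candidate):
    """List files whose hash differs, or that exist on only one side."""
    names = sorted(set(baseline) | set(candidate))
    return [name for name in names if baseline.get(name) != candidate.get(name)]
```

Any file that `changed_outputs` reports for a sample is a candidate regression. Samples that hit undefined behavior would show up here as well, even when nothing in the engine changed, which is where the false positives come from.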
I've borrowed some of crater's CSS and whipped up a simple front-end for the test results.
I'll open separate issues for the false positives caused by undefined behavior and will keep this one open as a kind of tracking issue for reproducibility and regression testing.
Ideas / TODOs: