Our test suite has started to crash more and more frequently, and now almost constantly, with the latest Julia nightly and 1.10 updates.
It seems we get OOM crashes, but it's hard to say because there are no stack traces, just messages like the following (if there is a way to get a backtrace here, that would be super helpful):
[2046] signal (15): Terminated
in expression starting at /home/runner/work/_actions/julia-actions/julia-runtest/latest/test_harness.jl:7
Error: The operation was canceled.
We have collected a lot more data in oscar-system/Oscar.jl#2441, but we have no MWE, as it is difficult to trigger this locally; it "helps" that the CI runners on GitHub have only a few GB of RAM.
There is also something weird going on with some of the statistics; note the crazy heap_target in the last "Heap stats" line:
Heap stats: bytes_mapped 1728.42 MB, bytes_resident 1286.89 MB, heap_size 1832.69 MB, heap_target 2357.69 MB, live_bytes 1761.48 MB
, Fragmentation 0.961GC: pause 898.46ms. collected 30.505936MB. incr
Heap stats: bytes_mapped 1728.42 MB, bytes_resident 1286.89 MB, heap_size 1832.31 MB, heap_target 2357.31 MB, live_bytes 1778.76 MB
, Fragmentation 0.971GC: pause 320.62ms. collected 552.570595MB. incr
Heap stats: bytes_mapped 1728.42 MB, bytes_resident 1317.08 MB, heap_size 2221.08 MB, heap_target 869387521.08 MB, live_bytes 1748.25 MB
, Fragmentation 0.787 39156 ms (1847 ms GC) and 392MB allocated for alnuth/polynome.tst
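To make the anomaly easier to spot in longer CI logs, here is a minimal sketch of a filter that scans "Heap stats" lines and flags any heap_target that is wildly out of proportion to heap_size. It is only a log-reading helper, not part of Julia or Oscar: the function name `suspicious_heap_targets` and the `max_ratio` threshold of 100 are assumptions chosen for illustration, and the regular expression assumes the exact line format shown above.

```python
import re

# Matches the heap_size / heap_target / live_bytes fields of a "Heap stats:"
# line in the format shown in the log excerpt above (values in MB).
HEAP_STATS = re.compile(
    r"heap_size (?P<heap_size>[\d.]+) MB, "
    r"heap_target (?P<heap_target>[\d.]+) MB, "
    r"live_bytes (?P<live_bytes>[\d.]+) MB"
)


def suspicious_heap_targets(log_text, max_ratio=100.0):
    """Return (heap_size, heap_target) pairs, in MB, where heap_target
    exceeds max_ratio times heap_size.

    max_ratio is a hypothetical threshold, not a value taken from Julia.
    """
    flagged = []
    for match in HEAP_STATS.finditer(log_text):
        heap_size = float(match["heap_size"])
        heap_target = float(match["heap_target"])
        if heap_size > 0 and heap_target > max_ratio * heap_size:
            flagged.append((heap_size, heap_target))
    return flagged


# The three "Heap stats" lines from the excerpt above.
SAMPLE = """\
Heap stats: bytes_mapped 1728.42 MB, bytes_resident 1286.89 MB, heap_size 1832.69 MB, heap_target 2357.69 MB, live_bytes 1761.48 MB
Heap stats: bytes_mapped 1728.42 MB, bytes_resident 1286.89 MB, heap_size 1832.31 MB, heap_target 2357.31 MB, live_bytes 1778.76 MB
Heap stats: bytes_mapped 1728.42 MB, bytes_resident 1317.08 MB, heap_size 2221.08 MB, heap_target 869387521.08 MB, live_bytes 1748.25 MB
"""

if __name__ == "__main__":
    for heap_size, heap_target in suspicious_heap_targets(SAMPLE):
        print(f"heap_size {heap_size} MB but heap_target {heap_target} MB")
```

On the excerpt above, only the third line is flagged: its heap_target is in the hundreds of terabytes, while heap_size stays around 2 GB.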