Python 3.13.0a6 freethreading on s390x: test.test_io.CBufferedReaderTest.test_constructor crash with Floating point exception #117755

Closed
befeleme opened this issue Apr 11, 2024 · 6 comments
Labels
type-bug An unexpected behavior, bug, or error

Comments

@befeleme
Contributor

befeleme commented Apr 11, 2024

Bug report

Bug description:

Since #114331 was solved, we once again attempted to build Python with freethreading on s390x Fedora Linux.

test.test_io.CBufferedReaderTest.test_constructor fails. The traceback:

test_constructor (test.test_io.CBufferedReaderTest.test_constructor) ... Fatal Python error: Floating point exception
Current thread 0x000003ff8aa77b60 (most recent call first):
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/unittest/case.py", line 238 in handle
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/unittest/case.py", line 795 in assertRaises
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/test/test_io.py", line 1710 in test_constructor
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/unittest/case.py", line 606 in _callTestMethod
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/unittest/case.py", line 651 in run
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/unittest/case.py", line 707 in __call__
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/unittest/suite.py", line 122 in run
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/unittest/suite.py", line 84 in __call__
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/unittest/suite.py", line 122 in run
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/unittest/suite.py", line 84 in __call__
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/unittest/runner.py", line 240 in run
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/test/libregrtest/single.py", line 57 in _run_suite
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/test/libregrtest/single.py", line 37 in run_unittest
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/test/libregrtest/single.py", line 132 in test_func
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/test/libregrtest/single.py", line 88 in regrtest_runner
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/test/libregrtest/single.py", line 135 in _load_run_test
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/test/libregrtest/single.py", line 178 in _runtest_env_changed_exc
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/test/libregrtest/single.py", line 278 in _runtest
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/test/libregrtest/single.py", line 306 in run_single_test
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/test/libregrtest/worker.py", line 77 in worker_process
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/test/libregrtest/worker.py", line 112 in main
  File "/builddir/build/BUILD/Python-3.13.0a6/Lib/test/libregrtest/worker.py", line 116 in <module>
  File "<frozen runpy>", line 88 in _run_code
  File "<frozen runpy>", line 198 in _run_module_as_main
1 test failed again:
    test_io

Is this freethreading related? I don't know. I'm hoping to raise visibility and get pointers as to where the issue comes from. cc @vstinner

CPython versions tested on:

3.13

Operating systems tested on:

Linux

befeleme added the type-bug (An unexpected behavior, bug, or error) label Apr 11, 2024
befeleme changed the title from "3.13.0a6: test.test_io.CBufferedReaderTest.test_constructor ends with Fatal Python error: Floating point exception" to "3.13.0a6: test.test_io.CBufferedReaderTest.test_constructor ends with Fatal Python error: Floating point exception on s390x" Apr 11, 2024
befeleme changed the title from "3.13.0a6: test.test_io.CBufferedReaderTest.test_constructor ends with Fatal Python error: Floating point exception on s390x" to "3.13.0a6: test.test_io.CBufferedReaderTest.test_constructor ends with Fatal Python error: Floating point exception on s390x with freethreading" Apr 11, 2024
vstinner added a commit to vstinner/cpython that referenced this issue Apr 12, 2024
The test allocates 9 223 372 036 854 775 807 bytes
(0x7fffffffffffffff) and mimalloc fails with a division by zero on
s390x.
@vstinner
Member

The root issue is a division by zero in mimalloc when requested memory is huge: 0x7fffffffffffffff bytes.

It can be reproduced on s390x without test_io:

$ ./python
Python 3.13.0a6+ (heads/main-dirty:396b831, Apr 12 2024, 04:17:51) [GCC 8.5.0 20210514 (Red Hat 8.5.0-20)] on linux
>>> import sys
>>> size = 0x7fffffffffffffff
>>> len = (size - sys.getsizeof(b''))
>>> b'x'*len
Floating point exception (core dumped)

$ uname -r
4.18.0-513.18.1.el8_9.s390x
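
Because the reproducer kills the interpreter, it can be handy to run it in a child process and look at the exit status instead. The following is a minimal sketch, assuming a POSIX system; the helper name `run_snippet` and the variable `REPRO` are made up for illustration, and `len` is renamed to `n` to avoid shadowing the builtin.

```python
import subprocess
import sys

# Same statements as the interactive session above.
REPRO = """
import sys
size = 0x7fffffffffffffff
n = size - sys.getsizeof(b'')
b'x' * n
"""


def run_snippet(code: str) -> int:
    """Run `code` in a fresh interpreter and return its exit status.

    On POSIX, a negative status -N means the child was killed by signal N.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode
```

On a build where the allocation is simply refused, `run_snippet(REPRO)` should return 1 (uncaught `MemoryError`); on an affected build it would presumably return `-signal.SIGFPE`.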

@vstinner
Member

I can reproduce the issue with a Python built with:

./configure --disable-gil CFLAGS="-O0 -g -ggdb"
time make -j6 

I disable all compiler optimizations (-O0) to ease debugging in gdb.

gdb logs when the bug occurs:

$ gdb -args ./python bug.py
(gdb) run

Program received signal SIGFPE, Arithmetic exception.
0x000000000112b508 in mi_page_init (tld=<optimized out>, block_size=<optimized out>, page=0x40000000168, heap=0x1595ce8 <_PyRuntime+295592>)
    at Objects/mimalloc/page.c:696
696	  page->reserved = (uint16_t)(page_size / block_size);

(gdb) p page_size
$1 = 0
(gdb) p block_size
$2 = 0

(gdb) up
#1  0x0000000001172d9e in mi_page_fresh_alloc (heap=0x16efaa8 <_PyRuntime+295592>, pq=0x16f0590 <_PyRuntime+298384>, 
    block_size=9223372036854775808, page_alignment=0) at Objects/mimalloc/page.c:295
295	  mi_page_init(heap, page, full_block_size, heap->tld);
(gdb) p full_block_size
$3 = 0
(gdb) p block_size
$4 = 9223372036854775808
(gdb) p /x block_size
$5 = 0x8000000000000000

(gdb) p pq == 0
$6 = 0
(gdb) p mi_page_queue_is_huge(pq)
$7 = true
(gdb) p mi_page_block_size(page)
$8 = 0
(gdb) p page->xblock_size
$9 = 0

@vstinner
Member

(gdb) b PyObject_Malloc
Breakpoint 1 at 0x11806c6: file Objects/obmalloc.c, line 1288.

(gdb) condition 1 size >= 0x7fffffffffffffff

PyObject_Malloc():

  • mi_find_page()
  • mi_large_huge_page_alloc()
  • mi_page_fresh_alloc()
  • _mi_segment_page_alloc()
  • mi_segment_huge_page_alloc(): mi_segment_os_alloc() and _mi_segment_page_start()

mi_segment_huge_page_alloc() is called with size=0x8000000000000000:

  • mi_segment_alloc(size) returns 0x40000000000
  • uint8_t* start = _mi_segment_page_start(segment, page, &psize); with slice->slice_count = 0 sets psize to 0.

The problem is that slice->slice_count ends up as 0 because of an integer overflow (truncation). The type of mi_page_t.slice_count is uint32_t, whereas on s390x we try to store 0x8000_0000_0000 (140737488355328) in it, which doesn't fit:

static mi_page_t* mi_segment_span_allocate(mi_segment_t* segment, size_t slice_index, size_t slice_count, mi_segments_tld_t* tld) {
  ...
  slice->slice_count = (uint32_t)slice_count;
  ...
  return page;
}

gdb:

Breakpoint 5, mi_segment_span_allocate (segment=0x40000000000, slice_index=1, slice_count=140737488355328, tld=0x16f1eb0 <_PyRuntime+304816>)
    at Objects/mimalloc/segment.c:711
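
The arithmetic behind that truncation can be checked with a few lines of Python. This is only a model of the C cast, not mimalloc code: it assumes 64 KiB slices (an assumption, though it matches the `slice_count` value in the gdb output), and the names `SLICE_SIZE`, `UINT32_MASK` and `truncated_slice_count` are invented for the sketch.

```python
SLICE_SIZE = 64 * 1024      # assumption: 64 KiB per slice
UINT32_MASK = 0xFFFF_FFFF   # models the (uint32_t) cast


def truncated_slice_count(size: int) -> int:
    """Return what a uint32_t field would hold for an allocation of `size` bytes."""
    slice_count = size // SLICE_SIZE
    return slice_count & UINT32_MASK


size = 0x8000000000000000                # size passed to mi_segment_huge_page_alloc()
full = size // SLICE_SIZE                # 140737488355328 == 0x8000_0000_0000
stored = truncated_slice_count(size)     # low 32 bits are all zero
psize = stored * SLICE_SIZE              # page size derived from the stored count

print(hex(full), stored, psize)
```

With a stored count of 0, the derived page size is 0 as well, which is consistent with `page_size / block_size` in `mi_page_init()` dividing by zero.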

vstinner added a commit to vstinner/cpython that referenced this issue Apr 12, 2024
Fix mimalloc allocator for huge memory allocation (around
8,589,934,592 GiB) on s390x.

* Abort allocation early in mimalloc if the number of slices doesn't
  fit into uint32_t, to prevent a integer overflow (cast 64-bit
  size_t to uint32_t).
* Add test_large_alloc() to test_bigaddrspace.
* Reenable test_maxcontext_exact_arith() of test_decimal on s390x.
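
The idea of the first bullet, refusing the allocation before the cast can truncate, can be sketched in Python as follows. This illustrates the check only and is not the actual patch: the 64 KiB slice size and the names `slices_needed` and `checked_slice_count` are assumptions made for the example.

```python
from typing import Optional

SLICE_SIZE = 64 * 1024      # assumption: 64 KiB per slice
UINT32_MAX = 0xFFFF_FFFF


def slices_needed(size: int) -> int:
    """Number of slices required to hold `size` bytes (rounded up)."""
    return -(-size // SLICE_SIZE)


def checked_slice_count(size: int) -> Optional[int]:
    """Return the slice count, or None if it would not fit into a uint32_t."""
    count = slices_needed(size)
    if count > UINT32_MAX:
        return None  # caller treats this as an allocation failure
    return count
```

Under these assumptions, a request of 0x8000000000000000 bytes is rejected up front, so the allocator can fail cleanly and Python can raise `MemoryError` instead of crashing.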
@vstinner
Member

On x86-64, we don't reach this bug, since mmap() fails first.

strace on the big mmap() call:

  • x86-64, fail with ENOMEM: mmap(NULL, 9223372036863164416, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM
  • s390x, success: mmap(NULL, 9223372036858970112, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x3ffdcf80000.

On x86-64, gdb traces around mmap():

  • _mi_arena_alloc_aligned(size=0x8000000000410000) returns NULL.
  • _mi_prim_alloc()
  • unix_mmap() with size=0x8000000000800000, try_alignment=0x2000000, protect_flags=PROT_WRITE | PROT_READ, allow_large=true => it fails with errno=ENOMEM and returns NULL.
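
Whether the kernel accepts a given anonymous mapping can also be probed from Python with the `mmap` module. This is a rough sketch for Unix only, and the helper name `can_map_anonymous` is invented here. Python's `mmap` takes the length as a signed size, so the exact lengths from the strace output (slightly above 2**63) cannot be passed; the largest value that can be tried is `sys.maxsize`, and the result for such huge lengths depends on the platform and its overcommit settings.

```python
import mmap


def can_map_anonymous(length: int) -> bool:
    """Try a private anonymous mapping of `length` bytes; report whether it succeeded."""
    # MAP_NORESERVE is not exposed by every Python version, so fall back to 0.
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS | getattr(mmap, "MAP_NORESERVE", 0)
    try:
        mapping = mmap.mmap(-1, length, flags=flags,
                            prot=mmap.PROT_READ | mmap.PROT_WRITE)
    except (OSError, OverflowError, ValueError):
        return False
    # The mapping is never touched, so no physical memory is consumed.
    mapping.close()
    return True
```

For example, `can_map_anonymous(4096)` should succeed on any ordinary system, while `can_map_anonymous(sys.maxsize)` would be expected to fail with ENOMEM on x86-64, in line with the strace output above.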

vstinner changed the title from "3.13.0a6: test.test_io.CBufferedReaderTest.test_constructor ends with Fatal Python error: Floating point exception on s390x with freethreading" to "Python 3.13.0a6 freethreading: test.test_io.CBufferedReaderTest.test_constructor crash with Floating point exception on s390x" Apr 12, 2024
vstinner changed the title from "Python 3.13.0a6 freethreading: test.test_io.CBufferedReaderTest.test_constructor crash with Floating point exception on s390x" to "Python 3.13.0a6 freethreading on s390x: test.test_io.CBufferedReaderTest.test_constructor crash with Floating point exception" Apr 12, 2024
@colesbury
Contributor

One thing I'm curious about: even with overcommit, I think that mmap() still needs to create the page mappings. With 4KB pages, that's trillions of pages. The page table itself is too big to store in memory.

Why does it succeed on s390x? Is mmap able to use much bigger pages on IBM Z? Something else?

@vstinner
Member

> Why does it succeed on s390x? Is mmap able to use much bigger pages on IBM Z? Something else?

Sorry, I have no idea 🤷🏻‍♂️

vstinner added a commit that referenced this issue Apr 15, 2024
The test allocates 9 223 372 036 854 775 807 bytes
(0x7fffffffffffffff) and mimalloc fails with a division by zero on
s390x.
vstinner added a commit to vstinner/cpython that referenced this issue Apr 15, 2024
Fix mimalloc allocator for huge memory allocation (around
8,589,934,592 GiB) on s390x.

* Abort allocation early in mimalloc if the number of slices doesn't
  fit into uint32_t, to prevent a integer overflow (cast 64-bit
  size_t to uint32_t).
* Add test_large_alloc() to test_bigaddrspace (test skipped on 32-bit
  platforms).
* Reenable test_maxcontext_exact_arith() of test_decimal on s390x.
* Reenable test_constructor() tests of test_io on s390x.
vstinner added a commit to vstinner/cpython that referenced this issue Apr 16, 2024
Remove unreliable tests on huge memory allocations:

* Remove test_maxcontext_exact_arith() of test_decimal.
  Stefan Krah, test author, agreed on removing the test:
  python#114331 (comment)
* Remove test_constructor() tests of test_io.
  Sam Gross suggests remove them:
  python#117809 (review)

On Linux, depending how overcommit is configured, especially on Linux
s390x, a huge memory allocation (half or more of the full address
space) can succeed, but then the process will eat the full system
swap and make the system slower and slower until the whole system
becomes unusable.

Moreover, these tests had to be skipped when Python is built with
sanitizers.
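
Since the behaviour described above depends on how overcommit is configured, it can help to check the current mode on the machine running the tests. A small sketch for Linux, where the setting is exposed as `/proc/sys/vm/overcommit_memory` (0 = heuristic, 1 = always overcommit, 2 = strict accounting); the helper name `overcommit_mode` is made up for this example.

```python
from pathlib import Path
from typing import Optional


def overcommit_mode(path: str = "/proc/sys/vm/overcommit_memory") -> Optional[int]:
    """Return the Linux overcommit mode, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Not Linux, file missing, or unexpected content.
        return None


print("vm.overcommit_memory =", overcommit_mode())
```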
vstinner added a commit that referenced this issue Apr 16, 2024
Remove unreliable tests on huge memory allocations:

* Remove test_maxcontext_exact_arith() of test_decimal.
  Stefan Krah, test author, agreed on removing the test:
  #114331 (comment)
* Remove test_constructor() tests of test_io.
  Sam Gross suggests remove them:
  #117809 (review)

On Linux, depending how overcommit is configured, especially on Linux
s390x, a huge memory allocation (half or more of the full address
space) can succeed, but then the process will eat the full system
swap and make the system slower and slower until the whole system
becomes unusable.

Moreover, these tests had to be skipped when Python is built with
sanitizers.
vstinner added a commit that referenced this issue Apr 16, 2024
Fix mimalloc allocator for huge memory allocation (around
8,589,934,592 GiB) on s390x.

Abort allocation early in mimalloc if the number of slices doesn't
fit into uint32_t, to prevent a integer overflow (cast 64-bit
size_t to uint32_t).
@hugovk hugovk closed this as completed Jun 15, 2024