memoryview to freed memory can cause segfault #60198
A memoryview which does not own a reference to its base object can point to freed or reallocated memory. For instance, the following segfaults for me on Windows and Linux.

    import io

    class File(io.RawIOBase):
        def readinto(self, buf):
            global view
            view = buf
        def readable(self):
            return True

    f = io.BufferedReader(File())
    f.read(1)                   # get view of buffer used by BufferedReader
    del f                       # deallocate buffer
    view = view.cast('P')
    L = [None] * len(view)      # create list whose array has same size
                                # (this will probably coincide with view)
    view[0] = 0                 # overwrite first item with NULL
    print(L[0])                 # segfault: dereferencing NULL

I realize there are easier ways to make Python segfault, so maybe this should not be considered a serious issue. But I think there should be some way of guaranteeing that a memoryview will not try to access memory which has already been freed.

In bpo-15903 skrah proposed exposing memory_release() as PyBuffer_Release(). However, I don't think that would necessarily invalidate all exports of the buffer. Alternatively, one could incref the buffered reader object and set mview->mbuf->obj to it.

Maybe we could have PyMemoryView_FromMemoryEx(char *mem, Py_ssize_t size, int flags, PyObject *obj), which guarantees that if obj is non-NULL then it will not be garbage collected before the memoryview. This should *not* expose obj as an attribute of the memoryview.
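For contrast, here is a minimal sketch of the safe case, where the memoryview does own a reference to its exporter. The bytearray and the variable names are chosen only for illustration and are not part of the report. The view keeps the exporter alive through `view.obj`, and the exporter refuses to resize while the view exists:

```python
import gc

buf = bytearray(b"abcdefgh")
view = memoryview(buf)

# The view holds a strong reference to the object that exported the memory.
owns_base = view.obj is buf

# While the view exists, the bytearray cannot be resized (and so cannot
# reallocate its storage underneath the view).
try:
    buf.extend(b"more")
    resize_blocked = False
except BufferError:
    resize_blocked = True

# Dropping our own name does not free the memory: the view still owns it.
del buf
gc.collect()
view[0] = ord("X")
contents = bytes(view)
```

The view returned by PyMemoryView_FromMemory() has no such owner, which is why the reproducer above can outlive the buffer it points at.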
Can someone follow this up, noting that it refers to bpo-15903?
We deal with it when we have time. IMO there is little value in
Saving a reference to the overall buffered reader is probably not good enough. I think the buffer is freed when close() is called, so a reference to this internal buffer would need to be saved instead (or as well). Unless we stop having close() free the buffer.
Unless the "buffered *self" parameter in _buffered_raw_read() is
I recently created bpo-26720 about a similar situation with BufferedWriter. However, I am starting to believe that the problem cannot be solved the way I originally wanted. Instead, the best solution there would be similar to what I would suggest here: we need to prevent the user from accessing the memoryview after readinto() or write() has returned. Some ideas, not all perfect:
I think idea 2 (error if memoryview not released) would not work. It would rely on garbage collection, which is not always immediate. E.g. if the memoryview were kept alive by a reference cycle, it may not immediately be released. I don’t think idea 3 is worthwhile, because the memoryview API does not seem designed for this use case, and it would not be 100% consistent anyway. So that leaves my first idea, to call view.release() when readinto(), etc., return. This should avoid the crash and data loss problems in some common cases. It is implemented in release-view.patch, along with some notes.
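A rough Python-level sketch of that first idea follows. It is not the code from release-view.patch; `guarded_readinto` and `LeakyRaw` are hypothetical names, and a bytearray stands in for the reader's internal buffer. The caller releases the view as soon as readinto() returns, so a view that the raw object stashed away raises ValueError instead of touching memory:

```python
import io

class LeakyRaw(io.RawIOBase):
    """Hypothetical raw stream that keeps the view passed to readinto()."""
    leaked = None

    def readable(self):
        return True

    def readinto(self, buf):
        LeakyRaw.leaked = buf      # misbehaving: stash the caller's view
        buf[:1] = b"x"
        return 1

def guarded_readinto(raw, storage):
    """Call raw.readinto() on a view of storage, then release that view."""
    view = memoryview(storage)
    try:
        return raw.readinto(view)
    finally:
        # Raises BufferError if something still holds an export of the view.
        view.release()

storage = bytearray(8)
count = guarded_readinto(LeakyRaw(), storage)

try:
    LeakyRaw.leaked[0]
    leaked_access = "allowed"
except ValueError:
    leaked_access = "rejected"     # released views refuse all access
```

Simply holding a reference to the view does not stop release() from succeeding; release() only fails while another object holds a buffer export obtained from the view.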
Just stumbled upon https://pwn.win/2022/05/11/python-buffered-reader.html, which is an “exploit” for this. cc @vstinner
I can confirm that this reproduces via the above on 3.11 and on a recent-ish checkout of Python 3.12.0a3+ (built in release config).
"Given that you need to be able to execute arbitary Python code in the first place, this exploit won’t be useful in most settings." is the most important line from that blog post. I don't consider this any more of a security issue than any use of pickle or marshal already is.
As a fourth option to add to @vadmium's three options above, could it also be possible to replace the raw
Idea 1 has the same problem as idea 2. If the memoryview were kept alive by a reference cycle, release() will fail. It will leave us in the same state, except that the user code may have a chance to report an error before the crash. The problem in the buffered reader can be solved by using a bytearray as the buffer (the code will be complicated by the need to support multiple buffers if they are not yet released). But
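To illustrate the bytearray approach and the "multiple buffers" complication, here is a small sketch. It assumes a hypothetical helper class, `ReadBufferPool`, which is not part of the io module or of any patch on this issue. It relies on the fact that a bytearray raises BufferError on resize while any memoryview of it is still exported, and falls back to a fresh buffer in that case:

```python
class ReadBufferPool:
    """Hypothetical helper: hand out a bytearray buffer, never reusing one
    that still has outstanding memoryview exports."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self._size = size
        self._buf = bytearray(size)

    def acquire(self):
        try:
            # clear() resizes the bytearray, which fails with BufferError
            # while any view of it is alive and unreleased.
            self._buf.clear()
            self._buf.extend(bytes(self._size))
        except BufferError:
            # Old buffer is still exported: leave it to its views and
            # allocate a new one instead of freeing or reusing it.
            self._buf = bytearray(self._size)
        return self._buf

pool = ReadBufferPool(8)
first = pool.acquire()
stale_view = memoryview(first)     # simulates a view leaked by readinto()
second = pool.acquire()            # must not alias the leaked buffer
stale_view[0] = 0xFF               # safe: writes to the old, still-alive buffer
```

A leaked view can then read stale data at worst, since the old bytearray stays alive for as long as the view references it.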