-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
CLN validate some Py_* Functions in C Extensions check for NULL #50997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN validate some Py_* Functions in C Extensions check for NULL #50997
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool thanks for trying this. Not easy
@@ -66,6 +66,9 @@ void *buffer_rd_bytes(void *source, size_t nbytes, size_t *bytes_read, | |||
args = Py_BuildValue("(i)", nbytes); | |||
|
|||
func = PyObject_GetAttrString(src->obj, "read"); | |||
if (func == NULL) { | |||
return NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you also need to Py_DECREF(args)
- looks to be Py_XDECREF a few lines down but won't happen if you return here.
As a side note - I think we also too liberally use Py_XDECREF
. But that's another issue for another day
@@ -108,6 +108,9 @@ JSOBJ Object_npyNewArray(void *prv, void *_decoder) { | |||
if (decoder->curdim <= 0) { | |||
// start of array - initialise the context buffer | |||
npyarr = decoder->npyarr = PyObject_Malloc(sizeof(NpyArrContext)); | |||
if (npyarr == NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a bit redundant with L116, though I find the npyarr == NULL
approach more explicit. I also think the position of this is better than L116 but need to combine these pieces of code
@@ -119,6 +122,9 @@ JSOBJ Object_npyNewArray(void *prv, void *_decoder) { | |||
npyarr->labels[0] = npyarr->labels[1] = NULL; | |||
|
|||
npyarr->shape.ptr = PyObject_Malloc(sizeof(npy_intp) * NPY_MAXDIMS); | |||
if (npyarr->shape.ptr == NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point in the code you have malloc'ed decoder->npyarr, so you also need to free it (using PyObject_Free instead of the built-in)
@@ -90,6 +90,10 @@ char *PyDateTimeToIso(PyObject *obj, NPY_DATETIMEUNIT base, | |||
|
|||
*len = (size_t)get_datetime_iso_8601_strlen(0, base); | |||
char *result = PyObject_Malloc(*len); | |||
if (result == NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idiomatic approach here is to just do return PyErr_NoMemory()
https://docs.python.org/3/c-api/exceptions.html#c.PyErr_NoMemory
example from CPython source:
@@ -149,6 +149,9 @@ static TypeContext *createTypeContext(void) { | |||
TypeContext *pc; | |||
|
|||
pc = PyObject_Malloc(sizeof(TypeContext)); | |||
if (pc == NULL) { | |||
return NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here about PyErr_NoMemory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though again this is the same as L155, though I think the pc == NULL
check is a lot more explicit
if (tz != Py_None) { | ||
// Go through object array if we have dt64tz, since tz info will | ||
// be lost if values is used directly. | ||
Py_DECREF(tz); | ||
values = PyObject_CallMethod(obj, "__array__", NULL); | ||
if (values == NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm +/-0 on having to do this here. If values
is NULL the return values
naturally propagates up the caller, who is also responsible for checking. But if it helps enforce patterns then this is harmless
PyObject *repr; | ||
if (PyObject_HasAttrString(obj, "dtype")) { | ||
PyObject *dtype = PyObject_GetAttrString(obj, "dtype"); | ||
if (dtype == NULL) { | ||
return NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to Py_DECREF(typeRepr)
before returning
repr = PyObject_Repr(dtype); | ||
if (repr == NULL) { | ||
return NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment above on Py_DECREF(typeRepr)
; might be a clever way to refactor the function so you don't need to repeat in every error handler, but also OK to repeat for now
PyObject *ret; | ||
|
||
if (tmp == 0) { | ||
return 0; | ||
} | ||
ret = PyObject_GetAttrString(tmp, subAttr); | ||
if (ret == NULL) { | ||
return ret; | ||
} | ||
Py_DECREF(tmp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is a case where it could make sense to have say Py_DECREF
right before the error checking; whether or not things worked you are down with the lifecycle of tmp
For functions that return Would be curious how that gets handled upstream in ultrajson. Seems like there's a much more natural way to serialize the built-in types than what we have here |
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this. |
thanks for your review I probably won't have a chance to update this any time soon due to there being quite a few other tasks to take care of, so closing for now to clear the queue |
xref #49756
@WillAyd is this what you had in mind?
In cases like
pandas/pandas/_libs/src/ujson/python/objToJSON.c
Lines 902 to 906 in 3a1a8b3
which return
void
, how shouldNULL
be handled?