Skip to content

CLN validate some Py_* Functions in C Extensions check for NULL #50997

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

MarcoGorelli
Copy link
Member

xref #49756

@WillAyd is this what you had in mind?

In cases like

void Dir_iterBegin(JSOBJ obj, JSONTypeContext *tc) {
GET_TC(tc)->attrList = PyObject_Dir(obj);
GET_TC(tc)->index = 0;
GET_TC(tc)->size = PyList_GET_SIZE(GET_TC(tc)->attrList);
}

which return void, how should NULL be handled?

@MarcoGorelli MarcoGorelli requested a review from WillAyd January 26, 2023 17:09
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool thanks for trying this. Not easy

@@ -66,6 +66,9 @@ void *buffer_rd_bytes(void *source, size_t nbytes, size_t *bytes_read,
args = Py_BuildValue("(i)", nbytes);

func = PyObject_GetAttrString(src->obj, "read");
if (func == NULL) {
return NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you also need to Py_DECREF(args) - looks to be Py_XDECREF a few lines down but won't happen if you return here.

As a side note - I think we also too liberally use Py_XDECREF. But that's another issue for another day

@@ -108,6 +108,9 @@ JSOBJ Object_npyNewArray(void *prv, void *_decoder) {
if (decoder->curdim <= 0) {
// start of array - initialise the context buffer
npyarr = decoder->npyarr = PyObject_Malloc(sizeof(NpyArrContext));
if (npyarr == NULL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bit redundant with L116, though I find the npyarr == NULL approach more explicit. I also think the position of this is better than L116 but need to combine these pieces of code

@@ -119,6 +122,9 @@ JSOBJ Object_npyNewArray(void *prv, void *_decoder) {
npyarr->labels[0] = npyarr->labels[1] = NULL;

npyarr->shape.ptr = PyObject_Malloc(sizeof(npy_intp) * NPY_MAXDIMS);
if (npyarr->shape.ptr == NULL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point in the code you have malloc'ed decoder->npyarr, so you also need to free it (using PyObject_Free instead of the built-in)

@@ -90,6 +90,10 @@ char *PyDateTimeToIso(PyObject *obj, NPY_DATETIMEUNIT base,

*len = (size_t)get_datetime_iso_8601_strlen(0, base);
char *result = PyObject_Malloc(*len);
if (result == NULL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -149,6 +149,9 @@ static TypeContext *createTypeContext(void) {
TypeContext *pc;

pc = PyObject_Malloc(sizeof(TypeContext));
if (pc == NULL) {
return NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here about PyErr_NoMemory

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though again this is the same as L155, though I think the pc == NULL check is a lot more explicit

if (tz != Py_None) {
// Go through object array if we have dt64tz, since tz info will
// be lost if values is used directly.
Py_DECREF(tz);
values = PyObject_CallMethod(obj, "__array__", NULL);
if (values == NULL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm +/-0 on having to do this here. If values is NULL the return values naturally propagates up the caller, who is also responsible for checking. But if it helps enforce patterns then this is harmless

PyObject *repr;
if (PyObject_HasAttrString(obj, "dtype")) {
PyObject *dtype = PyObject_GetAttrString(obj, "dtype");
if (dtype == NULL) {
return NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to Py_DECREF(typeRepr) before returning

repr = PyObject_Repr(dtype);
if (repr == NULL) {
return NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment above on Py_DECREF(typeRepr); might be a clever way to refactor the function so you don't need to repeat in every error handler, but also OK to repeat for now

PyObject *ret;

if (tmp == 0) {
return 0;
}
ret = PyObject_GetAttrString(tmp, subAttr);
if (ret == NULL) {
return ret;
}
Py_DECREF(tmp);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is a case where it could make sense to have say Py_DECREF right before the error checking; whether or not things worked you are down with the lifecycle of tmp

@WillAyd
Copy link
Member

WillAyd commented Jan 26, 2023

For functions that return void your hands are a bit tied. In that particular case PyObject_Dir should be setting the global python indicator, so somewhere along the lines you could do a PyErr_Occurred() check and see what happened. But that's thorny so I would just ignore for now.

Would be curious how that gets handled upstream in ultrajson. Seems like there's a much more natural way to serialize the built-in types than what we have here

@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Mar 27, 2023
@MarcoGorelli
Copy link
Member Author

thanks for your review

I probably won't have a chance to update this any time soon due to there being quite a few other tasks to take care of, so closing for now to clear the queue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants