Skip to content

Add Py_mod_gil slot to C extension modules #59135

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 1, 2024

Conversation

lysnikolaou
Copy link
Contributor

x-ref #59057.

The Py_mod_gil slot is needed so that the GIL is not automatically enabled when these C extension modules get imported.

  • I checked pandas_datetime for thread-safety and it appears to be okay.
  • pandas_parser is not thread-safe. A discussion is needed on whether we should care about it though, since the c engine to read_csv and friends is already documented as being thread-unsafe and the thread-safe pyarrow alternative is already there.

@WillAyd
Copy link
Member

WillAyd commented Jun 28, 2024

Is this slot only available in 3.13?

  • pandas_parser is not thread-safe. A discussion is needed on whether we should care about it though,

Do you know what makes it not thread unsafe? I'm surprised given how small of a module it is.

Definitely more of a nice-to-have on the parser though. Understood its probably a very large undertaking to make the CSV reader overall multi-threaded, and most likely not even worth it with pyarrow

@@ -245,7 +245,10 @@ static int pandas_datetime_exec(PyObject *Py_UNUSED(module)) {
}

static PyModuleDef_Slot pandas_datetime_slots[] = {
{Py_mod_exec, pandas_datetime_exec}, {0, NULL}};
{Py_mod_exec, pandas_datetime_exec},
{Py_mod_gil, Py_MOD_GIL_NOT_USED},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @WillAyd mentioned, this is new in 3.13:

#if PY_VERSION_HEX >= 0x030D0000
    {Py_mod_gil, Py_MOD_GIL_NOT_USED},
#endif

https://docs.python.org/3.14/howto/free-threading-extensions.html#multi-phase-initialization

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. Friday was not a good day, apparently. 🤕

One question: In the free-threaded repo's README we recomment guarding with #ifdef Py_GIL_DISABLED. Should we change that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't kept tabs on the CPython implementation but seems like what the free-threaded repo suggests might be better than checking the version hex - there will be both GIL and GIL-less Python compilations for a few versions right?

Copy link
Contributor Author

@lysnikolaou lysnikolaou Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillAyd That's correct, however Py_mod_gil is just ignored in 3.13+ non-free-threaded builds. Effectively, both guards lead to correct behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a matter of style I think checking for Py_GIL_DISABLED would make more sense if you were evaluating whether that was true/false, not just checking if the symbol was defined.

So happy with what we have here, but probably worth asking for input from other projects for the free-threaded repo standard.

We can merge this as is for now and follow up with a different approach if a more canonical one becomes clearer

@lysnikolaou
Copy link
Contributor Author

Do you know what makes it not thread unsafe? I'm surprised given how small of a module it is.

I mean that it's thread-unsafe in the sense that there's global state in the parser_t struct and there's unsafe reads/writes to it. As far as I understand, read_csv only ever creates one instance of it though, which means that running read_csv in multiple threads is safe.

@WillAyd
Copy link
Member

WillAyd commented Jul 1, 2024

Sorry for the typo - I mean "not thread safe" but I think you got what I was saying. And cool thanks for the response - agree with you on that

@WillAyd WillAyd added this to the 3.0 milestone Jul 1, 2024
@WillAyd WillAyd merged commit f2eeb4e into pandas-dev:main Jul 1, 2024
45 checks passed
@WillAyd
Copy link
Member

WillAyd commented Jul 1, 2024

Thanks @lysnikolaou

lysnikolaou added a commit to lysnikolaou/pandas that referenced this pull request Jul 15, 2024
This is the Cython equivalent of adding a `Py_mod_gil` slot with
`Py_MOD_GIL_NOT_USED` like we did in pandas-dev#59135.
mroeschke pushed a commit that referenced this pull request Jul 16, 2024
* ENH: Globally enable Cython free-threading directive

This is the Cython equivalent of adding a `Py_mod_gil` slot with
`Py_MOD_GIL_NOT_USED` like we did in #59135.

* Use add_project_arguments

* Mark json with Py_MOD_GIL_NOT_USED & remove PYTHON_GIL env var from ci test job
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants