Skip to content

Block stdlib names #2409

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Sep 18, 2017
Merged

Block stdlib names #2409

merged 5 commits into from
Sep 18, 2017

Conversation

ewdurbin
Copy link
Member

@ewdurbin ewdurbin commented Sep 17, 2017

Prohibits new projects with names which collide on canonicalized name with standard library modules from being created.

Uses stdlib_list from @jackmaney for programatic access to a hopefully up to date list of standard library names.

RE: #2151

Prohibits new projects with names which collide on canonicalized name
with standard library modules from being created.

Uses stdlib_list from https://github.com/jackmaney for programatic access to a hopefully up to date list of standard library names.
@ewdurbin ewdurbin requested a review from dstufft September 17, 2017 21:03
@alex
Copy link
Member

alex commented Sep 17, 2017

There are several packages in the stdlib that have backports on PyPI, or were originally on PyPI. For example:

What does this do with them?

@ewdurbin
Copy link
Member Author

@alex, no impact on existing projects, this just stops any new projects with such a name from being created without admin intervention.

existing projects can upload new releases without any intervention.

@jonemo
Copy link
Contributor

jonemo commented Sep 17, 2017

In the discussion in #2151 the requirement to have an escape hatch came up, for example for packages that backport stdlib modules. How would those be handled?

@ewdurbin
Copy link
Member Author

my opinion is that any new backports should be explicitly named as such. in the rare case that a package name included in the standard library module requires registration with the precise name, an administrative tool in this project should allow that.

Copy link
Member

@dstufft dstufft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally looks good to me, a few comments about some things that I think could make this better.

)
)

STDLIB_PROHIBITTED = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nitpick, but this would probably be better as a set, it will save some duplication for like ConfigParser vs configparser and will also make the in check later on a tiny bit faster.

@@ -46,6 +50,17 @@

PATH_HASHER = "blake2_256"

ALL_STDLIB = set(
chain.from_iterable(
[stdlib_list.stdlib_list(version)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be slightly better as (...) instead of [...] to make this is a generator expression. That will prevent us from creating an intermediate list.

@@ -46,6 +50,17 @@

PATH_HASHER = "blake2_256"

ALL_STDLIB = set(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for having ALL_STDLIB exist instead of just nesting this logic inside the comprehension for STDLIB_PROHIBITED? We could ditch the "inner" set() then and just have STDLIB_PROHIBITED be a set comprehension.

STDLIB_PROHIBITTED):
raise _exc_with_message(
HTTPBadRequest,
"The name {!r} is not allowed.".format(form.name.data),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a distinct message here? Even if it's just including "because it is the name of a Python standard library module" in the existing message?

@dstufft dstufft merged commit 4b05627 into master Sep 18, 2017
@dstufft dstufft deleted the block_stdlib_names branch September 18, 2017 04:45
@ncoghlan
Copy link
Contributor

For future backporting-without-admin-approval, note that the backports namespace package already exists as a self-documenting target for that purpose: https://pypi.python.org/pypi/backports/

(While that backport namespace itself could also be targeted, it isn't vulnerable to the naive "pip install urllib" problem that afflicts stdlib module names, so it's inherently less interesting as an attack vector)

@VanL
Copy link

VanL commented Sep 18, 2017

We should probably add some common "typosquatting" names to the blocklist as well - urrlib, etc.

@dstufft
Copy link
Member

dstufft commented Sep 18, 2017

Well urllib is part of the stdlib so it will get included automatically with this PR. There is another PR that let us add a custom blacklist in the DB that we've pre-filled with previous typosquatting packages that we've taken ownership of.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants