Skip to content

bpo-39028: Performance enhancement in keyword extraction #17576

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 18, 2019

Conversation

seberg
Copy link
Contributor

@seberg seberg commented Dec 12, 2019

All keywords should first be checked for pointer identity. Only
after that failed for all keywords (unlikely) should unicode
equality be used.
The original code would call unicode equality on any non-matching
keyword argument. Meaning calling it often e.g. when a function
has many kwargs but only the last one is provided.


Unless I am missing something big, this seems like it was the original intention, and simple typo. It may make sense to backport?

I have to admit, I have not actually recompiled and timed this, I may do that tomorrow (but I have never had the need to compile python, so need to look up how).

https://bugs.python.org/issue39028

@seberg seberg changed the title ENH: Fix performance issue in keyword extraction bpo-39028: Fix performance issue in keyword extraction Dec 12, 2019
@seberg
Copy link
Contributor Author

seberg commented Dec 12, 2019

Ah, I missed that this code is used for PyArg_* where it is not clear that strings are almost always interned, so the change would likely be useful for argument clinic functions, but probably not in general.

EDIT: Both can make sense I guess, but for the clinic it is clearly interned I think? So another option is to have two versions here...

Python/getargs.c Outdated
if (kwname == key) {
return kwstack[i];
}
}
/* ptr == ptr should normally find a match in since keyword keys
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is related to kwname == key.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change LGTM, but please move the comment before the pointers comparison or before the first loop.

Although this is almost dead code. It is used in just few sites generated by Argument Clinic and soon will not be used at all.

@serhiy-storchaka
Copy link
Member

Ah, sorry, the use in _PyArg_UnpackKeywords() is not dead, so this optimization has long term effect.

@seberg seberg force-pushed the kwarg-extract-performance branch from ba65512 to 17561c2 Compare December 12, 2019 17:31
@seberg
Copy link
Contributor Author

seberg commented Dec 12, 2019

Yeah, although it only will have remotely noticeable effects for >3 kwargs probably, which seems pretty rare in python. Anyway moved and tweaked the comment.

@seberg seberg changed the title bpo-39028: Fix performance issue in keyword extraction bpo-39028: Performance enhancement in keyword extraction Dec 12, 2019
Python/getargs.c Outdated
}
/* This function assumes the strings should be interned, so that this
should only be reached on error, and the loop below will never
find a match */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep the second loop, why we need to have both of find_keyword_interned and find_keyword?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh man, sorry, I do not think we do. I was trying around with it a bit, and apparently forgot to commit before pushing or something :/. fixed.

All keywords should first be checked for pointer identity. Only
after that failed for all keywords (unlikely) should unicode
equality be used.
The original code would call unicode equality on any non-matching
keyword argument. Meaning calling it often e.g. when a function
has many kwargs but only the last one is provided.
@seberg seberg force-pushed the kwarg-extract-performance branch from 17561c2 to 52562b9 Compare December 13, 2019 14:19
@methane
Copy link
Member

methane commented Dec 16, 2019

LGTM. Would you create an NEWS entry?
https://devguide.python.org/committing/#what-s-new-and-news-entries

@methane methane merged commit 75bb07e into python:master Dec 18, 2019
@seberg seberg deleted the kwarg-extract-performance branch January 21, 2020 19:11
shihai1991 pushed a commit to shihai1991/cpython that referenced this pull request Jan 31, 2020
)

All keywords should first be checked for pointer identity. Only
after that failed for all keywords (unlikely) should unicode
equality be used.
The original code would call unicode equality on any non-matching
keyword argument. Meaning calling it often e.g. when a function
has many kwargs but only the last one is provided.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants