ENH: add default value to str.extract #38001
Comments
take |
does this functionality match anything in the standard library? why is this useful? |
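Not that I can think of; it would require three extra steps to achieve this:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['a84', 'abcd', '99string', np.nan]})
mask1 = df['A'].notna()  # rows that had a value before extraction
result = df['A'].str.extract(r'(\d+)', expand=False)
mask2 = result.isna()  # rows that are NaN after extraction
result = np.where(mask1 & mask2, 'missing', result)  # fill only the non-matches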
Because right now it is not possible to distinguish between already |
@erfannariman some string accessor methods have a |
why would it need to be restricted to a string? (for object dtypes) |
Is your feature request related to a problem?
In some cases we can set a default value for non-matches of str.extract with Series/Frame.fillna. But there are cases when the data prior to applying str.extract already has NaN values. So running fillna on the results would fill both, and then we cannot distinguish between the NaN values that were already there and the NaN values that result from str.extract.
Describe the solution you'd like
Allow setting a default value, which has to be a string, to indicate the non-matches of the regex pattern.
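To make the request concrete, here is a minimal sketch of the behaviour being asked for. The helper name `extract_with_default` and its `default` argument are hypothetical and not part of pandas; the sketch only wraps existing calls (`Series.str.extract`, `Series.notna`, `Series.isna`, `Series.mask`) and assumes a pattern with a single capture group. It does not enforce that the default is a string.

```python
import numpy as np
import pandas as pd


def extract_with_default(series, pattern, default):
    """Hypothetical helper (not a pandas API): extract the single capture
    group, using `default` for non-matches and keeping original NaN values."""
    extracted = series.str.extract(pattern, expand=False)
    # A non-match is a row that had a value but came back as NaN.
    no_match = series.notna() & extracted.isna()
    return extracted.mask(no_match, default)


s = pd.Series(['a84', 'abcd', '99string', np.nan])

# fillna cannot tell the two kinds of NaN apart, so both rows get filled.
filled = s.str.extract(r'(\d+)', expand=False).fillna('missing')

# The helper fills only the row where the regex failed to match.
result = extract_with_default(s, r'(\d+)', 'missing')
```

With this input, `filled` marks both 'abcd' and the original NaN row as 'missing', while `result` marks only 'abcd' and leaves the original NaN in place.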
API breaking implications
None, I think.