Skip to content

pcre2grep: better compatibility for \d #222

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 11, 2023

Conversation

carenas
Copy link
Contributor

@carenas carenas commented Apr 6, 2023

As suggested in the discussion[1] from git by the GNU grep developers, and with the hope to prevent further divergence.

The current implementation is mostly compatible with git grep -P and ripgrep, and the behavior in UCP mode is mostly compatible with Perl, hence why adding it under a flag.

[1] https://lore.kernel.org/git/[email protected]/

@carenas carenas changed the title pcre2grep: avoid UCP mode for \d pcre2grep: better compatibility for \d Apr 7, 2023
@PhilipHazel
Copy link
Collaborator

I agree with this in principle, but it probably needs leaving until #223 gets sorted out.

Matching multi-byte digits was released with GNU grep 3.9 and considered a
regression that required a couple more releases.

While implementations differ on their interpretation of `\d` under Unicode,
there is a recommendation[1] for a POSIX compatible mode that and a suggestion
that it will be better (safer and faster) to only consider ASCII digits for
that class (and [:digit]).

Use the recently introduced PCRE2_EXTRA_ASCII_BSD if `--posix-digit` is used
together with `-u` or `-U` for compatibility with GNU grep.

[1] https://unicode.org/reports/tr18/
@carenas carenas marked this pull request as ready for review April 11, 2023 11:58
@PhilipHazel PhilipHazel merged commit 3bbdb6d into PCRE2Project:master Apr 11, 2023
@carenas carenas deleted the noucpd branch April 12, 2023 01:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants