[JIT] Performance regression with some regexs #16

Closed
cmb69 opened this issue Sep 8, 2021 · 7 comments

@cmb69

cmb69 commented Sep 8, 2021

I'm forwarding this from https://bugs.php.net/81424, but it seems that this is actually a PCRE2 issue. Consider the following (bad) regex:

/[^{};\/\n]+\{\}/

When run on a large string (e.g. https://pastebin.com/WVBR4f9T), this was fast with PCRE2 10.34 JIT; with PCRE2 10.35 and later it is more than a hundred times slower.

If the regex is rewritten to use a lookbehind assertion (/(?<![{};\/\n]+)\{\}/), performance with the different PCRE2 versions is on par, so you may not consider this fix-worthy. :)

There is no performance regression without JIT, so I wonder whether this regex isn't jitted anymore as of PCRE2 10.35.

@PhilipHazel
Collaborator

Have you tried with the new Release Candidate (10.38-RC1), which is available on GitHub? Please do so if you can. If there is still a problem, I'll assign this to the JIT maintainer.

@cmb69
Author

cmb69 commented Sep 8, 2021

Yes, I also tried with 10.38-RC1.

@zherczeg
Collaborator

zherczeg commented Sep 9, 2021

It looks like this commit introduced the issue: 21c40e6

@zherczeg
Collaborator

@cmb69 thank you for this report. It looks like I accidentally disabled a very important optimization. @PhilipHazel this patch fixes it, but the release cycle has already started, so please decide whether you want it in this release or the next.

diff --git a/src/pcre2_jit_compile.c b/src/pcre2_jit_compile.c
index a3f7ebe..495920d 100644
--- a/src/pcre2_jit_compile.c
+++ b/src/pcre2_jit_compile.c
@@ -11228,7 +11228,7 @@ early_fail_type = (early_fail_ptr & 0x7);
 early_fail_ptr >>= 3;

 /* During recursion, these optimizations are disabled. */
-if (common->early_fail_start_ptr == 0)
+if (common->early_fail_start_ptr == 0 && common->fast_forward_bc_ptr == NULL)
   {
   early_fail_ptr = 0;
   early_fail_type = type_skip;

Btw, I am happy we moved to git; this bug was easier to track down with it.
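
For readers wondering how losing one optimization can cost a factor of a hundred on this regex, here is a toy model in plain C. It is not PCRE2's JIT code. It assumes the early-fail optimization works roughly like this: remember how far the leading [^{};\/\n]+ run got before the attempt failed, and reject any later start offset at or before that point without rescanning. The names in_class and toy_search are hypothetical, and the step counter only counts character-class tests.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if c can be consumed by the class [^{};/\n]. */
static int in_class(char c)
{
    return c != '{' && c != '}' && c != ';' && c != '/' && c != '\n';
}

/*
 * Toy search for [^{};/\n]+\{\} (hypothetical helper, not a PCRE2 API).
 * Tries each start offset in turn and returns the offset of the first
 * match, or -1 if there is none. *steps receives the number of class tests.
 *
 * Because '{' is excluded from the class, giving characters back can never
 * help, so the run is treated as possessive.
 *
 * With use_early_fail != 0, the end of the last failed run is remembered
 * and every start offset at or before it is rejected immediately. This is
 * an assumed simplification of what an early-fail check achieves.
 */
static long toy_search(const char *s, size_t len, int use_early_fail,
                       unsigned long long *steps)
{
    long reached = -1; /* end of the last failed run; -1 means none yet */

    assert(s != NULL && steps != NULL);
    *steps = 0;

    for (size_t start = 0; start < len; start++) {
        if (use_early_fail && (long)start <= reached)
            continue; /* same run, same failure: skip without scanning */

        size_t pos = start;
        while (pos < len) {
            (*steps)++;
            if (!in_class(s[pos]))
                break;
            pos++;
        }

        /* Need a non-empty run followed by the literal "{}". */
        if (pos > start && pos + 1 < len && s[pos] == '{' && s[pos + 1] == '}')
            return (long)start;

        reached = (long)pos;
    }
    return -1;
}
```

On a subject of 1000 'a' characters, which contains no match, the plain loop performs 500500 class tests while the early-fail variant performs 1000. Both return the same result, so in this model the check only affects how much work a failing search does.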

@PhilipHazel
Collaborator

As it is a very small patch, I think we should include it in this release. Please go ahead and do the merge and update ChangeLog.

PhilipHazel added the bug label on Sep 10, 2021

@zherczeg
Collaborator

Fixed in dc5f966.

@cmb69
Author

cmb69 commented Sep 10, 2021

That was fast, thank you! I can confirm that the patch solves the issue, and I couldn't detect any regression with the PHP PCRE test suite.
