PCRE2 Different Behavior Depending On Optimization Level #147
Comments
Please be more specific. PCRE2 has a test program called pcre2test; you can show the problematic pattern/input pairs there.
@zherczeg i've updated the issue, i'll try pcre2test
the pcre2test works fine but for some reason that same compiled library works weirdly with my code. i'm just using ASCII
I am sorry but we don't really know about the internals of your system, and the description is too generic (a difference between -O0 and -Os). We need some pattern/input pair, and some compile / match flags to work with.
so these are the patterns i'm using:
filePattern = \.([ch](pp|xx)?|C|cc|c\+\+|cu|H|hh|ii?)$ # File Extensions
pattern1 = //.* # Comment //
pattern2 = (^#define*)|(^#include*)|(^#if*)|(^#ifndef*)|(^#ifdef*)|(^#endif*)|(^#elif*)|(^#else*)|(^#elseif*)|(^#warning*)|(^#error*) # Pre-Processor Directives
pattern3 = (auto|bool|char|const|double|enum|extern|float|inline|int|long|restrict|short|signed|sizeof|static|struct|typedef|union|unsigned|void) # Keywords
pattern4 = ([[:lower:]][[:lower:]_]*|(u_?)?int(8|16|32|64))_t # Types Like uint_8
pattern5 = (if|else|for|while|do|switch|case|default) # Keywords
pattern6 = [A-Z_][0-9A-Z_]*
pattern7 = ^[[:blank:]]*[A-Z_a-z][0-9A-Z_a-z]*:[[:blank:]]*$
pattern9 = (class|explicit|friend|mutable|namespace|override|private|protected|public|register|template|this|typename|using|virtual|volatile) # Keywords
pattern10 = (try|throw|catch|operator|new|delete) # Keywords
pattern11 = (break|continue|goto|return) # Keywords
pattern14 = <[^>]+> # For <header.h> in #include <header.h>
i'm compiling these regexes with
after matching is done i check if there are matches, if yes i use this logic:
for (int i = 0; i < rc; i++) {
    long int start = ovector[i], end = ovector[i + 1];
    if (start < 0 || end < 0)
        continue;
    if (callback)
        callback(start < end ? start : end, start < end ? end : start, data); // Basically Pass The Smaller Value As Start & Bigger Value as End
    totalFound++;
    printf(", RC: %d, Start: %ld, End: %ld\n", rc, start, end);
}
where and then while iterating over the values i call a
Thanks, this is more helpful. Can you check which specific pattern fails, and what the ovector content is in that case?
Is this still an issue?
it is indeed
@pegvin I have looked at your code. PCRE2 does not behave differently based on the optimisation level. What you are observing is that your code is behaving differently.
int rc = pcre2_match(p->re, (PCRE2_SPTR)str, strlen(str), 0, PCRE2_NO_JIT, p->md == NULL ? matchData : p->md, NULL);
if (rc < 0) {
#if IS_DEBUG
    if (rc == PCRE2_ERROR_NOMATCH) {
        log_warn("No Matches Found!");
    } else {
        PCRE2_UCHAR buffer[120];
        pcre2_get_error_message(rc, buffer, sizeof(buffer));
        log_error("regex matching error %d: %s in regex: %s", rc, buffer, str);
    }
} else if (rc == 1) {
    log_warn("No Matches Found!");
#endif
} else {
    PCRE2_SIZE* ovector = pcre2_get_ovector_pointer(p->md == NULL ? matchData : p->md);
In a Debug build,
It also looks like you have not quite understood the meaning of the
Your for-loop should be:
for (int i = 0; i < rc; i++) {
    long int start = ovector[2*i], end = ovector[2*i + 1];
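To make the corrected indexing concrete, here is a minimal sketch of walking an offsets vector in (start, end) pairs. It does not link against PCRE2: `OFFSET_UNSET` stands in for `PCRE2_UNSET` (which PCRE2 defines as `~(PCRE2_SIZE)0`), and `visit_pairs`, `span_cb` and `sum_lengths` are hypothetical names introduced only for illustration. The assumption is that `rc` is the positive value returned by `pcre2_match()`, that is, one more than the highest-numbered capture group that was set, so pair `i` lives at `ovector[2*i]` and `ovector[2*i + 1]`.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PCRE2_UNSET; real code would use the PCRE2 constant. */
#define OFFSET_UNSET (~(size_t)0)

/* Hypothetical callback type, loosely modelled on the callback in the thread. */
typedef void (*span_cb)(size_t start, size_t end, void *data);

/* Visit the first rc (start, end) pairs of an offsets vector.
 * Pair 0 is the whole match, pair i (i >= 1) is capture group i.
 * Returns how many pairs were actually set. */
int visit_pairs(const size_t *ovector, int rc, span_cb cb, void *data)
{
    int found = 0;
    assert(ovector != NULL || rc <= 0);
    for (int i = 0; i < rc; i++) {
        size_t start = ovector[2 * i];
        size_t end = ovector[2 * i + 1];
        if (start == OFFSET_UNSET || end == OFFSET_UNSET)
            continue; /* this group did not take part in the match */
        if (cb)
            cb(start, end, data);
        found++;
    }
    return found;
}

/* Example callback: adds the length of each span to a size_t total. */
void sum_lengths(size_t start, size_t end, void *data)
{
    *(size_t *)data += end - start;
}
```

As a usage example, matching a pattern such as `\.(x)?(c)` against `main.c` would give the offsets `{4, 6, OFFSET_UNSET, OFFSET_UNSET, 5, 6}` with `rc == 3`; `visit_pairs` then reports two set pairs and skips the unset group 1. Comparing against the unset constant directly avoids relying on a conversion to a signed type, as the `start < 0` check in the original loop does.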
I'm using PCRE2 for syntax highlighting in my project aru. Everything appears to work fine, except that PCRE2 behaves differently when certain optimizations are enabled.
This is the video demonstrating the problem:
2022-09-23.19-35-59.mp4
Flags For Debug Build:
Flags For Release Build:
I'm not compiling the code with the default CMake setup or anything else provided in the repository; instead, I'm compiling some specific files:
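One way a debug/release difference like this can arise without PCRE2 being involved is a preprocessor conditional that wraps part of an if/else chain. The sketch below mirrors the shape of the snippet quoted in the comments, where an `else if (rc == 1)` branch sits inside `#if IS_DEBUG`. It is an illustration under assumptions, not the project's code: it assumes the debug flags define `IS_DEBUG` as 1 and the release flags leave it at 0, and `classify` and the `outcome` names are hypothetical. For context, a return value of 1 from `pcre2_match()` means the pattern matched with no capture groups set, not that nothing matched.

```c
/* Assumption: debug builds pass -DIS_DEBUG=1; release builds leave it at 0. */
#ifndef IS_DEBUG
#define IS_DEBUG 0
#endif

/* Hypothetical labels for which branch handled the return value. */
enum outcome {
    OUTCOME_ERROR_OR_NOMATCH,   /* rc < 0, e.g. PCRE2_ERROR_NOMATCH */
    OUTCOME_TREATED_AS_NOMATCH, /* rc == 1 caught by the debug-only branch */
    OUTCOME_PROCESSED           /* offsets would be read and reported */
};

/* rc stands for the value returned by pcre2_match(). */
enum outcome classify(int rc)
{
    if (rc < 0) {
        return OUTCOME_ERROR_OR_NOMATCH;
#if IS_DEBUG
    } else if (rc == 1) {
        /* Compiled only into debug builds, so release builds never see it. */
        return OUTCOME_TREATED_AS_NOMATCH;
#endif
    } else {
        return OUTCOME_PROCESSED;
    }
}
```

Built with `-DIS_DEBUG=1`, `classify(1)` takes the debug-only branch; built without it, the same call falls through to the final `else`. Patterns without capture groups, such as pattern1 or pattern14 in the list from the comments, return 1 on every successful match, so they would be the ones affected.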