--enable-jit-sealloc / sljitProtExecAllocator.c is broken across fork(), without exec() #162

Open
berrange opened this issue Nov 21, 2022 · 1 comment
Labels
JIT Relating to the JIT feature

Comments

@berrange

The sljitProtExecAllocator code is the executable memory allocator that avoids triggering execmem denials from SELinux. If an application uses PCRE2 with this allocator across fork(), though, very bad things will happen, because the allocator is not fork safe. The two worst problems are:

  • If a pcre2_code is allocated and JIT compiled in a parent, and pcre2_code_free() is then called in both the parent and a forked child, whichever process runs second will crash with SEGV
  • If a pcre2_code is allocated in the parent, and then a forked child calls pcre2_code_free() and allocates and compiles an entirely new regex, the parent process starts matching the new regex

The root cause of both these problems is that the PCRE2 code is only safe across fork() if the executable memory is MAP_PRIVATE, such that child processes get a copy-on-write mapping. The sljitProtExecAllocator uses a pair of MAP_SHARED mappings, losing the copy-on-write behaviour across fork(). There's no easy way to fix this, AFAICT, because if the pair of mappings used MAP_PRIVATE, then changes in the writable mapping wouldn't be visible in the executable mapping. The only option I see would be to have a separate mmap for every allocated JIT code block: map it MAP_SHARED initially to generate the code, then unmap it and map it MAP_PRIVATE again for execution. That's going to be much more expensive and inefficient if there are lots of regexes being JIT compiled frequently.
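The difference between the two mapping types can be seen without PCRE2 at all. The following is a minimal sketch, not the allocator's actual code. The function names and DEMO_LEN are made up for illustration. It uses an anonymous mapping and a tmpfile()-backed file as stand-ins for the JIT memory, and maps the second view PROT_READ rather than PROT_EXEC so that it runs even where executable mappings are denied.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1  /* exposes MAP_ANONYMOUS and fileno() on glibc */
#endif
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define DEMO_LEN 4096  /* hypothetical size, one page on most systems */

/* Returns 1 if a byte written by a forked child shows up in the parent,
 * 0 if it does not, -1 on error. share_flag is MAP_SHARED or MAP_PRIVATE. */
int child_write_visible_in_parent(int share_flag)
{
  assert(share_flag == MAP_SHARED || share_flag == MAP_PRIVATE);
  unsigned char *p = mmap(NULL, DEMO_LEN, PROT_READ | PROT_WRITE,
                          share_flag | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  p[0] = 'A';

  pid_t child = fork();
  if (child < 0) {
    munmap(p, DEMO_LEN);
    return -1;
  }
  if (child == 0) {
    p[0] = 'B';  /* stands in for the child freeing or recompiling JIT code */
    _exit(0);
  }
  if (waitpid(child, NULL, 0) < 0) {
    munmap(p, DEMO_LEN);
    return -1;
  }
  int visible = (p[0] == 'B');
  munmap(p, DEMO_LEN);
  return visible;
}

/* Maps one temporary file twice, writes through the first mapping, and
 * returns 1 if the write is visible through the second mapping, 0 if not,
 * -1 on error. */
int write_visible_in_second_mapping(int share_flag)
{
  assert(share_flag == MAP_SHARED || share_flag == MAP_PRIVATE);
  FILE *f = tmpfile();
  if (f == NULL)
    return -1;
  int fd = fileno(f);
  int result = -1;
  if (ftruncate(fd, DEMO_LEN) == 0) {
    unsigned char *w = mmap(NULL, DEMO_LEN, PROT_READ | PROT_WRITE,
                            share_flag, fd, 0);
    unsigned char *r = mmap(NULL, DEMO_LEN, PROT_READ, share_flag, fd, 0);
    if (w != MAP_FAILED && r != MAP_FAILED) {
      w[0] = 'A';  /* with MAP_PRIVATE this only touches w's private copy */
      result = (r[0] == 'A');
    }
    if (w != MAP_FAILED)
      munmap(w, DEMO_LEN);
    if (r != MAP_FAILED)
      munmap(r, DEMO_LEN);
  }
  fclose(f);
  return result;
}
```

Under these assumptions, both functions should return 1 for MAP_SHARED and 0 for MAP_PRIVATE. That is the bind described above: the shared pair is what lets generated code reach the second mapping, and it is also what lets a child's changes reach the parent.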

Demonstrating the crash is pretty trivial. This is adapted from the original Perl demonstrator (https://github.com/rurban/re-engine-PCRE2/blob/master/t/1-basic.t):

#define PCRE2_CODE_UNIT_WIDTH 8

#include <stdio.h>
#include <pcre2.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv)
{
  pcre2_code *re;
  pcre2_compile_context *ccontext;
  int err;
  size_t erroff;

  ccontext = pcre2_compile_context_create (NULL);
  re = pcre2_compile((PCRE2_SPTR8)"fish", PCRE2_ZERO_TERMINATED, 0, &err, &erroff, ccontext);
  pcre2_compile_context_free (ccontext);
  pcre2_jit_compile(re, PCRE2_JIT_COMPLETE);

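  /* With --enable-jit-sealloc the JIT code lives in MAP_SHARED mappings,
     so parent and child operate on the same memory after fork(). */
  pid_t child = fork();
  if (child == 0) {
    /* First free of the JIT code, done in the child. */
    pcre2_code_free(re);
    _exit(0);
  }
  waitpid(child, NULL, 0);
  /* Second free of the same JIT code, done in the parent: this crashes. */
  pcre2_code_free(re);
  return 0;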
}

Demonstrating the silent change in behaviour of the parent process after the forked child process creates a new compiled regex is slightly harder:

#define PCRE2_CODE_UNIT_WIDTH 8

#include <stdio.h>
#include <pcre2.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv)
{
  pcre2_code *re;
  pcre2_compile_context *ccontext;
  pcre2_match_data *mdata;
  pcre2_match_context *mcontext;
  int err;
  size_t erroff;
  int rv;

  ccontext = pcre2_compile_context_create (NULL);
  re = pcre2_compile((PCRE2_SPTR8)"fish", PCRE2_ZERO_TERMINATED, 0, &err, &erroff, ccontext);
  pcre2_compile_context_free (ccontext);
  pcre2_jit_compile(re, PCRE2_JIT_COMPLETE);

  mcontext = pcre2_match_context_create(NULL);
  mdata = pcre2_match_data_create_from_pattern(re, NULL);

  rv = pcre2_jit_match (re, (PCRE2_SPTR8)"fish", 4, 0, 0, mdata, mcontext);
  printf("MatchFish=%d\n", rv);
  rv = pcre2_jit_match (re, (PCRE2_SPTR8)"food", 4, 0, 0, mdata, mcontext);
  printf("MatchFood=%d\n", rv);

  pid_t child = fork();
  if (child == 0) {
    pcre2_code_free(re);
    ccontext = pcre2_compile_context_create (NULL);
    re = pcre2_compile((PCRE2_SPTR8)"food", PCRE2_ZERO_TERMINATED, 0, &err, &erroff, ccontext);
    pcre2_compile_context_free (ccontext);
    rv = pcre2_jit_compile(re, PCRE2_JIT_COMPLETE);
    printf("JIT compile=%d\n", rv);
    _exit(0);
  }
  for (int i = 0; i < 1000; i++) {
    rv = pcre2_jit_match (re, (PCRE2_SPTR8)"fish", 4, 0, 0, mdata, mcontext);
    printf("MatchFish=%d\n", rv);
    rv = pcre2_jit_match (re, (PCRE2_SPTR8)"food", 4, 0, 0, mdata, mcontext);
    printf("MatchFood=%d\n", rv);
  }
  waitpid(child, NULL, 0);
  pcre2_code_free(re);
  return 0;
}

The parent will initially print

MatchFish=1
MatchFood=-1

but at some point (coinciding with 'pcre2_jit_compile' in the child process) it will suddenly and silently change to

MatchFish=-1
MatchFood=1

To me, this is probably more worrying behaviour than the crash.

The crashing problem was partially discussed in https://bugs.exim.org/show_bug.cgi?id=1749 (annoyingly marked private; find it via https://web.archive.org/web/20201109025300/https://bugs.exim.org/show_bug.cgi?id=1749), but there was no resolution that I can see in the Wayback Machine archive.

The problem where the behaviour of the regexes can silently change between processes has not been mentioned before, AFAIK, which is what motivated me to file this bug report as a new publicly viewable record for anyone else researching this problem.

I realize this lack of compatibility with 'fork()' is mentioned in the README file; however, that is quite easy to miss. I feel like 'configure' should print out a prominent warning that the --enable-jit-sealloc option is dangerous and should not be used if your process is liable to fork and trigger use of pcre2 before exec. Triggering such use of pcre2 is all too easy now that glib2 has adopted pcre2, with JIT enabled by G_REGEX_OPTIMIZE.

@zherczeg
Collaborator

The seallocator is not intended for general use. The security side of this always confuses me, but as far as I understand, the best way to handle it is to use the normal allocator and grant special rights to certain applications in the SELinux environment. If an application does not have these rights, it cannot allocate executable memory, and it falls back to interpreted execution.

3 participants