mod_security causes CGI scripts to timeout #2101
Comments
@markblackman, I'm wondering whether some other Apache module is interfering. Do you have any modules enabled other than mod_cgi and mod_security? It would be great if you could test with a minimal Apache configuration and a minimal set of modules, to exclude the possibility of another module interfering... |
@victorhora Hi, we certainly have a large number of modules, however, the behaviour we see is very binary. When we enable mod_security, our CGI scripts start failing, although not immediately, and when we disable mod_security, our CGI scripts start working again. This is a very repeatable and reproducible pattern and we have seen it many times in the last year. Unfortunately, what we cannot do is reproduce this outside of our large-scale end-user environments. So far we simply cannot reproduce it in the lab, although we have not tried very hard yet, other than simple high-concurrency loading. I suspect we might have to run continuous loads over an hour or so to see the issue in the lab. I suspect there is a problem in the handling of the output bucket brigade in some cases. cgi_bucket_read just times out on the child script output (stdout) file descriptor. Maybe a mod_security filter is not putting the file descriptor in the right place in the bucket brigade? |
Hi @markblackman, Now that #2091 got merged, can you test it again within the v2/master branch? |
Do you suspect that the changes in #2091 might have influenced the CGI timeouts? We have been running Rainer's patches for about a month now, and we know the CGI problem returns even with Rainer's deltas. |
We have made some partial progress tracking this down. We can see that the CGI child process is experiencing a segmentation fault, presumably after forking, but before exec'ing either suexec or the child. The backtraces always look like this and we can see the pconf pool is always the pool with the damaged child clean-up function pointer.
Core was generated by /apache24/bin/httpd -f /apache24/conf/dynamic/apache24/httpd.conf -Dnick=uki1n2.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000000000 in ?? () |
Did you have a chance to report this against mod_cgi as well? I am asking because, per the backtrace, the crash seems to originate there. Even if the problem is not in mod_cgi, its developers may be able to say more about what could cause such behavior. |
We're working directly with Rainer Jung of the Apache team to track it down. I just wondered if anyone in the ModSecurity team might have any ideas when/why a pconf pool would get corrupted or damaged. |
We have reason to believe the corruption/damage might be happening on a request with a body (the POST case). |
There are some optimizations in pmMatch that may try to update a memory area that was previously allocated during the configuration load. In those tests, are you running with or without rules? Does unloading the rules have any impact? |
Running with OWASP 3 rules plus trivial tweaks. |
@zimmerle By pmMatch you probably mean the @pm operator. By "may try to update a memory area that was previously allocated during the configuration load" you are referring to the use of memcpy(), strcat(), strncat() or strncpy() copying data to a target that was allocated from a pool? So a potentially wrong or missing length check? |
So our current hypothesis is that the pconf pool is corrupted by the ap_varbuf_grow routine (in child pool ptemp) during start-up configuration, and that more than one kind of rule triggers it, but maybe not all rules. We have not yet tested the 'no-rules' case, but using the APR pool debug option points to an issue associated with ap_varbuf_grow. |
This is in the Apache configuration code though, not the mod_security code. Possibly these very long configuration lines trigger some configuration processing bugs. |
The general configuration corruption issue is reproducible with ease as far as I can tell with the APR pool debugging on. I built APR (--enable-pool-debug=all), APR-util, httpd-2.4.39 and mod_security 2.9.3 from scratch on a stock, plain FreeBSD 12 and got a segmentation violation during the apache start-up phase with the core rule set 3.1.1 configured. The segmentation violation points to a particular rule, but removing that rule just seems to trigger it elsewhere. Here's the lldb stack trace from start-up
Exactly the same stack trace we see from gdb on an ancient SLES11 SP4 system |
For some reason the allocator (in the ptemp pool) has become null |
If it helps, here's vb, but I think you can reproduce this with ease.
|
These all seem to be side effects of using pool debugging and are not related to the original issue. |
Based on what we're seeing with cleanup function issues, I would say we have been seeing precisely what was seen in #890 and the corresponding PR #2049. |
Applying the PR in #2049 has solved our problem. |
Although PR #2049 deals with our coredump problem (due to corrupted cleanup functions), we're seeing a lot more CPU utilization than we expected and attaching to the active httpd processes shows a lot of time spent in poll()
|
Describe the bug
After enabling mod_security in our Apache 2.4 configuration, a growing number of CGI scripts (but not all of them) time out. The timeouts usually do not start immediately after a restart but after a few minutes. As soon as mod_security is disabled, the issue goes away and all CGI scripts behave normally.
Logs and dumps
There are no mod_security logs, only Apache error log entries like these:
[Wed May 22 12:40:46.262612 2019] [cgi:warn] [pid 198045:tid 139887210845952] [client 10.235.31.231:0] AH01220: Timeout waiting for output from CGI script /var/www/global-cgi-bin/web-info
[Wed May 22 12:40:46.262653 2019] [cgi:error] [pid 198045:tid 139887210845952] [client 10.235.31.231:0] Script timed out before returning headers: web-info
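Since the timeouts reportedly build up over a few minutes rather than starting at once, it may help to count them per minute and per script. The following is a minimal sketch that assumes the error log uses exactly the AH01220 line format shown above; the function names are hypothetical and not part of Apache or mod_security.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the AH01220 warning format quoted in this report (an assumption:
# a customised ErrorLogFormat would need a different pattern).
TIMEOUT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[cgi:warn\] \[pid (?P<pid>\d+):tid (?P<tid>\d+)\] "
    r"\[client (?P<client>[^\]]+)\] AH01220: Timeout waiting for output "
    r"from CGI script (?P<script>\S+)"
)


def parse_timeouts(lines):
    """Yield (timestamp, script_path) for each AH01220 timeout line."""
    for line in lines:
        m = TIMEOUT_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S.%f %Y")
            yield ts, m["script"]


def timeouts_per_minute(lines):
    """Count timeouts per (minute, script) so growth over time is visible."""
    counts = Counter()
    for ts, script in parse_timeouts(lines):
        minute = ts.replace(second=0, microsecond=0)
        counts[(minute, script)] += 1
    return counts
```

Passing an open error log file to `timeouts_per_minute` gives a rough picture of how quickly the failures spread after a restart; lines that do not match, such as the `cgi:error` follow-up entry, are ignored.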
To Reproduce
Steps to reproduce the behavior:
A curl command line that mimics the original request and reproduces the problem.
curl -v http://somesite.corp.com/cgi-bin/any-shell-script
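A single curl request is unlikely to show the problem, because the timeouts only appear after the server has been running for a while. The sketch below is one possible way to keep a sustained concurrent load on the CGI URL and count failures. It uses only the Python standard library; `fetch_ok` and `run_load` are hypothetical helper names, and the concurrency and duration values are guesses rather than settings known to trigger the bug.

```python
import http.client
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_ok(url, timeout_s=30.0):
    """Return True if the URL answers with HTTP 200 within timeout_s."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Error responses, connection failures, and client-side timeouts
        # all count as a failed request.
        return False


def run_load(fetch, duration_s, concurrency=8):
    """Call fetch() from `concurrency` workers until duration_s elapses.

    Returns a (successes, failures) tuple.
    """
    deadline = time.monotonic() + duration_s

    def worker():
        ok = bad = 0
        while time.monotonic() < deadline:
            if fetch():
                ok += 1
            else:
                bad += 1
        return ok, bad

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(worker) for _ in range(concurrency)]
        results = [f.result() for f in futures]
    return sum(r[0] for r in results), sum(r[1] for r in results)
```

For example, `run_load(lambda: fetch_ok("http://somesite.corp.com/cgi-bin/any-shell-script"), duration_s=3600)` would hold the load for an hour against the placeholder URL from this report. Taking `fetch` as a parameter keeps the loop testable without a live server.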
Expected behavior
I expected the CGI script to return its output.
Server (please complete the following information):
Rule Set (please complete the following information):
Additional context
This may be connected to the filter processing issues seen in #2091 and #2093.
Re-reading the traces and the source code, everything points to the httpd parent never seeing any output from the CGI script, not even headers, in cgi_bucket_read.
https://github.com/apache/httpd/blob/2.4.39/modules/generators/mod_cgi.c#L694
How could mod_security interfere with the mod_cgi buckets in the output bucket brigade?
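To make the symptom concrete: the parent waits on the child's stdout pipe and gives up when nothing arrives within the timeout. The sketch below is only a rough POSIX analogy in Python, not mod_cgi's code; `read_first_output` is a hypothetical name. It shows that a child which never writes anything, for whatever reason, looks to the parent exactly like a script that is too slow.

```python
import os
import select
import subprocess


def read_first_output(cmd, timeout_s):
    """Return the first chunk the child writes to stdout, or None on timeout.

    An empty bytes object means the child closed stdout without writing.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        # Wait until the pipe is readable (data or EOF) or the timeout expires.
        ready, _, _ = select.select([proc.stdout], [], [], timeout_s)
        if not ready:
            return None
        return os.read(proc.stdout.fileno(), 4096)
    finally:
        # Always reap the child so the sketch leaves no stray processes.
        proc.kill()
        proc.wait()
        proc.stdout.close()
```

With a command that prints promptly the function returns its output, and with one that stays silent past `timeout_s` it returns `None`, which corresponds to the "Timeout waiting for output from CGI script" case in the log.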