Skip to content

Add support for new operator rxGlobal #2396

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGES
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
v3.x.y - YYYY-MMM-DD (to be released)
-------------------------------------

- Add support for new operator rxGlobal
[@martinhsv]
- Adds support to lua 5.4
[@zimmerle]
- GeoIP: switch to GEOIP_MEMORY_CACHE from GEOIP_INDEX_CACHE
Expand Down
1 change: 1 addition & 0 deletions Makefile.am
Original file line number Diff line number Diff line change
Expand Up @@ -290,6 +290,7 @@ TESTS+=test/test-cases/secrules-language-tests/operators/noMatch.json
TESTS+=test/test-cases/secrules-language-tests/operators/pmFromFile.json
TESTS+=test/test-cases/secrules-language-tests/operators/pm.json
TESTS+=test/test-cases/secrules-language-tests/operators/rx.json
TESTS+=test/test-cases/secrules-language-tests/operators/rxGlobal.json
TESTS+=test/test-cases/secrules-language-tests/operators/streq.json
TESTS+=test/test-cases/secrules-language-tests/operators/strmatch.json
TESTS+=test/test-cases/secrules-language-tests/operators/unconditionalMatch.json
Expand Down
1 change: 1 addition & 0 deletions src/Makefile.am
Original file line number Diff line number Diff line change
Expand Up @@ -221,6 +221,7 @@ OPERATORS = \
operators/rbl.cc \
operators/rsub.cc \
operators/rx.cc \
operators/rx_global.cc \
operators/str_eq.cc \
operators/str_match.cc \
operators/validate_byte_range.cc \
Expand Down
2 changes: 2 additions & 0 deletions src/operators/operator.cc
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,7 @@
#include "src/operators/rbl.h"
#include "src/operators/rsub.h"
#include "src/operators/rx.h"
#include "src/operators/rx_global.h"
#include "src/operators/str_eq.h"
#include "src/operators/str_match.h"
#include "src/operators/validate_byte_range.h"
Expand Down Expand Up @@ -169,6 +170,7 @@ Operator *Operator::instantiate(std::string op, std::string param_str) {
IF_MATCH(rbl) { return new Rbl(std::move(param)); }
IF_MATCH(rsub) { return new Rsub(std::move(param)); }
IF_MATCH(rx) { return new Rx(std::move(param)); }
IF_MATCH(rxglobal) { return new RxGlobal(std::move(param)); }
IF_MATCH(streq) { return new StrEq(std::move(param)); }
IF_MATCH(strmatch) { return new StrMatch(std::move(param)); }
IF_MATCH(validatebyterange) {
Expand Down
85 changes: 85 additions & 0 deletions src/operators/rx_global.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
/*
* ModSecurity, http://www.modsecurity.org/
* Copyright (c) 2020 Trustwave Holdings, Inc. (http://www.trustwave.com/)
*
* You may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* If any of the files related to licensing are missing or if you have any
* other questions related to licensing please contact Trustwave Holdings, Inc.
* directly using the email address [email protected].
*
*/

#include "src/operators/rx_global.h"

#include <string>
#include <list>
#include <memory>

#include "src/operators/operator.h"
#include "modsecurity/rule.h"
#include "modsecurity/rule_message.h"

namespace modsecurity {
namespace operators {


bool RxGlobal::init(const std::string &arg, std::string *error) {
if (m_string->m_containsMacro == false) {
m_re = new Regex(m_param);
}

return true;
}


bool RxGlobal::evaluate(Transaction *transaction, RuleWithActions *rule,
const std::string& input, std::shared_ptr<RuleMessage> ruleMessage) {
Regex *re;

if (m_param.empty() && !m_string->m_containsMacro) {
return true;
}

if (m_string->m_containsMacro) {
std::string eparam(m_string->evaluate(transaction));
re = new Regex(eparam);
} else {
re = m_re;
}

std::vector<Utils::SMatchCapture> captures;
re->searchGlobal(input, captures);

if (rule && rule->hasCaptureAction() && transaction) {
for (const Utils::SMatchCapture& capture : captures) {
const std::string capture_substring(input.substr(capture.m_offset,capture.m_length));
transaction->m_collections.m_tx_collection->storeOrUpdateFirst(
std::to_string(capture.m_group), capture_substring);
ms_dbg_a(transaction, 7, "Added regex subexpression TX." +
std::to_string(capture.m_group) + ": " + capture_substring);
transaction->m_matched.push_back(capture_substring);
}
}

for (const auto & capture : captures) {
logOffset(ruleMessage, capture.m_offset, capture.m_length);
}

if (m_string->m_containsMacro) {
delete re;
}

if (captures.size() > 0) {
return true;
}

return false;
}


} // namespace operators
} // namespace modsecurity
67 changes: 67 additions & 0 deletions src/operators/rx_global.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
/*
* ModSecurity, http://www.modsecurity.org/
* Copyright (c) 2020 Trustwave Holdings, Inc. (http://www.trustwave.com/)
*
* You may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* If any of the files related to licensing are missing or if you have any
* other questions related to licensing please contact Trustwave Holdings, Inc.
* directly using the email address [email protected].
*
*/

#ifndef SRC_OPERATORS_RX_GLOBAL_H_
#define SRC_OPERATORS_RX_GLOBAL_H_

#include <string>
//#include <list>
#include <memory>
#include <utility>

#include "src/operators/operator.h"
#include "src/utils/regex.h"


namespace modsecurity {
using Utils::SMatch;
using Utils::regex_search;
using Utils::Regex;

namespace operators {


class RxGlobal : public Operator {
public:
/** @ingroup ModSecurity_Operator */
explicit RxGlobal(std::unique_ptr<RunTimeString> param)
: m_re(nullptr),
Operator("RxGlobal", std::move(param)) {
m_couldContainsMacro = true;
}

~RxGlobal() {
if (m_string->m_containsMacro == false && m_re != NULL) {
delete m_re;
m_re = NULL;
}
}

bool evaluate(Transaction *transaction, RuleWithActions *rule,
const std::string& input,
std::shared_ptr<RuleMessage> ruleMessage) override;

bool init(const std::string &arg, std::string *error) override;

private:
Regex *m_re;
};


} // namespace operators
} // namespace modsecurity


#endif // SRC_OPERATORS_RX_GLOBAL_H_
6 changes: 6 additions & 0 deletions src/parser/seclang-parser.yy
Original file line number Diff line number Diff line change
Expand Up @@ -132,6 +132,7 @@ class Driver;
#include "src/operators/rbl.h"
#include "src/operators/rsub.h"
#include "src/operators/rx.h"
#include "src/operators/rx_global.h"
#include "src/operators/str_eq.h"
#include "src/operators/str_match.h"
#include "src/operators/unconditional_match.h"
Expand Down Expand Up @@ -455,6 +456,7 @@ using namespace modsecurity::operators;
OPERATOR_RSUB "OPERATOR_RSUB"
OPERATOR_RX_CONTENT_ONLY "Operator RX (content only)"
OPERATOR_RX "OPERATOR_RX"
OPERATOR_RX_GLOBAL "OPERATOR_RX_GLOBAL"
OPERATOR_STR_EQ "OPERATOR_STR_EQ"
OPERATOR_STR_MATCH "OPERATOR_STR_MATCH"
OPERATOR_UNCONDITIONAL_MATCH "OPERATOR_UNCONDITIONAL_MATCH"
Expand Down Expand Up @@ -1037,6 +1039,10 @@ op_before_init:
{
OPERATOR_CONTAINER($$, new operators::Rx(std::move($2)));
}
| OPERATOR_RX_GLOBAL run_time_string
{
OPERATOR_CONTAINER($$, new operators::RxGlobal(std::move($2)));
}
| OPERATOR_STR_EQ run_time_string
{
OPERATOR_CONTAINER($$, new operators::StrEq(std::move($2)));
Expand Down
2 changes: 2 additions & 0 deletions src/parser/seclang-scanner.ll
Original file line number Diff line number Diff line change
Expand Up @@ -302,6 +302,7 @@ OPERATOR_PM (?i:@pm)
OPERATOR_RBL (?i:@rbl)
OPERATOR_RSUB (?i:@rsub)
OPERATOR_RX (?i:@rx)
OPERATOR_RX_GLOBAL (?i:@rxGlobal)
OPERATOR_STR_EQ (?i:@streq)
OPERATOR_STR_MATCH (?i:@strmatch)
OPERATOR_UNCONDITIONAL_MATCH (?i:@unconditionalMatch)
Expand Down Expand Up @@ -1105,6 +1106,7 @@ EQUALS_MINUS (?i:=\-)
{OPERATOR_PM} { BEGIN_PARAMETER(); return p::make_OPERATOR_PM(*driver.loc.back()); }
{OPERATOR_RBL} { BEGIN_PARAMETER(); return p::make_OPERATOR_RBL( *driver.loc.back()); }
{OPERATOR_RX} { BEGIN_PARAMETER(); return p::make_OPERATOR_RX(*driver.loc.back()); }
{OPERATOR_RX_GLOBAL} { BEGIN_PARAMETER(); return p::make_OPERATOR_RX_GLOBAL(*driver.loc.back()); }
{OPERATOR_STR_EQ} { BEGIN_PARAMETER(); return p::make_OPERATOR_STR_EQ(*driver.loc.back()); }
{OPERATOR_STR_MATCH} { BEGIN_PARAMETER(); return p::make_OPERATOR_STR_MATCH(*driver.loc.back()); }
{OPERATOR_BEGINS_WITH} { BEGIN_PARAMETER(); return p::make_OPERATOR_BEGINS_WITH(*driver.loc.back()); }
Expand Down
78 changes: 78 additions & 0 deletions src/utils/regex.cc
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,24 @@
namespace modsecurity {
namespace Utils {

// Helper function to tell us if the current config indicates CRLF is a valid newline sequence
bool crlfIsNewline() {
int d = 0;
pcre_config(PCRE_CONFIG_NEWLINE, &d);

unsigned int option_bits = (d == 13)? PCRE_NEWLINE_CR :
(d == 10)? PCRE_NEWLINE_LF :
(d == (13<<8 | 10))? PCRE_NEWLINE_CRLF :
(d == -2)? PCRE_NEWLINE_ANYCRLF :
(d == -1)? PCRE_NEWLINE_ANY : 0;

bool crlf_is_newline =
option_bits == PCRE_NEWLINE_ANY ||
option_bits == PCRE_NEWLINE_CRLF ||
option_bits == PCRE_NEWLINE_ANYCRLF;

return crlf_is_newline;
}

Regex::Regex(const std::string& pattern_)
: pattern(pattern_.empty() ? ".*" : pattern_) {
Expand Down Expand Up @@ -115,6 +133,66 @@ bool Regex::searchOneMatch(const std::string& s, std::vector<SMatchCapture>& cap
return (rc > 0);
}

bool Regex::searchGlobal(const std::string& s, std::vector<SMatchCapture>& captures) const {
const char *subject = s.c_str();

bool prev_match_zero_length = false;
int pcre_options = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int startOffset = 0;

while (startOffset <= s.length()) {
int ovector[OVECCOUNT];
if (prev_match_zero_length) {
pcre_options = PCRE_NOTEMPTY_ATSTART | PCRE_ANCHORED;
} else {
pcre_options = 0; // common case
}
int rc = pcre_exec(m_pc, m_pce, subject, s.length(), startOffset, pcre_options, ovector, OVECCOUNT);

if (rc > 0) {
size_t firstGroupForThisFullMatch = captures.size();
for (int i = 0; i < rc; i++) {
size_t start = ovector[2*i];
size_t end = ovector[2*i+1];
size_t len = end - start;
if (end > s.length()) {
continue;
}
SMatchCapture capture(firstGroupForThisFullMatch + i, start, len);
captures.push_back(capture);

if (i == 0) {
if (len > 0) {
// normal case; next call to pcre_exec should start after the end of the last full match string
startOffset = end;
prev_match_zero_length = false;
} else {
// zero-length match; modify next match attempt to avoid infinite loop
prev_match_zero_length = true;
}
}
}
} else {
if (prev_match_zero_length) {
// The n-1 search found a zero-length match, so we did a subsequent search
// with the special flags. That subsequent exec did not find a match, so now advance
// by one character (unless CRLF, then advance by two)
startOffset++;
if (crlfIsNewline() && (startOffset < s.length()) && (s[startOffset-1] == '\r')
&& (s[startOffset] == '\n')) {
startOffset++;
}
prev_match_zero_length = false;
} else {
// normal case; no match on most recent scan (with options=0). We are done.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just my 2 cents...

Here you check the value of rc is greater than 0 or not - if not (eg. if it's definitely 0) you treat it that it's not matched.

Referring to the PCRE API, the value what you're looking for is -1.

I'm sure the chances are very-very-very small to get the value 0 (considering the value of the OVECTOR, which is 900), but may be here would be good to make some error handling. "no match on most recent" is only when the value of rc is -1. If it's 0, then it means the ovector out of space. And there many more different error - which also have a very-very slim chance.

Just a few thoughts...

Copy link
Contributor Author

@martinhsv martinhsv Oct 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @airween ,

Thanks for the observation; it is at least something for us to have considered.

As you note, the OVECTOR is enormous. It seems incredibly unlikely that anyone would compose a regex pattern with 300 capture groups (keeping in mind that repeating capture groups like '(sub-expression)*' only count as one).

It is true that the functionality implication here is that if this use case occurs, it will be treated the same as 'no match'. Note, that this is also the case for the standard @rx operator in v3.x -- both before and after the changes made this past summer, for example.

We may well want to make some adjustments for this edge case going forward. Whether we want to change main behaviour in this case might be debatable, but we should at least produce some kind of debug message. However, I think it would make more sense to adjust all relevant occurrences of this behaviour at the same time (including the much more important 'rx' operator), rather than adjusting this pull request in isolation.

For those reasons, my inclination here is to leave the current pull request as is.

Thanks again.

break;
}
}
}

return (captures.size() > 0);
}

int Regex::search(const std::string& s, SMatch *match) const {
int ovector[OVECCOUNT];
int ret = pcre_exec(m_pc, m_pce, s.c_str(),
Expand Down
1 change: 1 addition & 0 deletions src/utils/regex.h
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,7 @@ class Regex {

std::list<SMatch> searchAll(const std::string& s) const;
bool searchOneMatch(const std::string& s, std::vector<SMatchCapture>& captures) const;
bool searchGlobal(const std::string& s, std::vector<SMatchCapture>& captures) const;
int search(const std::string &s, SMatch *match) const;
int search(const std::string &s) const;

Expand Down
46 changes: 46 additions & 0 deletions test/test-cases/regression/operator-rxGlobal.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
[
{
"enabled":1,
"version_min":300000,
"title":"Testing Operator :: @rxGlobal",
"client":{
"ip":"200.249.12.31",
"port":123
},
"server":{
"ip":"200.249.12.31",
"port":80
},
"request":{
"headers":{
"Host":"localhost",
"User-Agent":"curl/7.38.0",
"Accept":"*/*",
"Content-Length": "27",
"Content-Type": "application/x-www-form-urlencoded"
},
"uri":"/",
"method":"POST",
"body": [
"param1=value1&param2=value2"
]
},
"response":{
"headers":{
"Date":"Mon, 13 Jul 2015 20:02:41 GMT",
"Last-Modified":"Sun, 26 Oct 2014 22:33:37 GMT",
"Content-Type":"text/html"
},
"body":[
"no need."
]
},
"expected":{
"debug_log":"Executing operator \"RxGlobal"
},
"rules":[
"SecRuleEngine On",
"SecRule ARGS \"@rxGlobal (value1)\" \"id:1,phase:2,pass,t:trim\""
]
}
]
Loading