timbray
diff --git a/.github/workflows/benchmarks.yml b/.github/workflows/benchmarks.yml
Lines changed: 1 addition & 1 deletion
diff --git a/.github/workflows/codeql-analysis.yaml b/.github/workflows/codeql-analysis.yaml
Lines changed: 1 addition & 1 deletion
diff --git a/.github/workflows/dep-review.yaml b/.github/workflows/dep-review.yaml
Lines changed: 1 addition & 1 deletion
diff --git a/.github/workflows/go-lint.yaml b/.github/workflows/go-lint.yaml
Lines changed: 1 addition & 1 deletion
diff --git a/.github/workflows/go-unit-tests.yaml b/.github/workflows/go-unit-tests.yaml
Lines changed: 1 addition & 1 deletion
diff --git a/.github/workflows/release.yaml b/.github/workflows/release.yaml
Lines changed: 1 addition & 1 deletion
diff --git a/README.md b/README.md
Lines changed: 25 additions & 19 deletions
diff --git a/anything_but.go b/anything_but.go
Lines changed: 8 additions & 9 deletions
diff --git a/cl2_test.go b/cl2_test.go
Lines changed: 8 additions & 6 deletions
diff --git a/core_matcher.go b/core_matcher.go
Lines changed: 18 additions & 11 deletions
@@ -20,7 +20,7 @@ jobs:
 
     steps:
       - name: Checkout repository
-        uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29
+        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332
 
       - name: Set up Go ${{ matrix.go-version }}
         uses: actions/setup-go@cdcb36043654635271a94b9a6d1392de5bb323a7
 
@@ -39,7 +39,7 @@ jobs:
 
     steps:
     - name: Checkout repository
-      uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29
+      uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332
 
     # Initializes the CodeQL tools for scanning.
     - name: Initialize CodeQL
 
@@ -17,7 +17,7 @@ jobs:
     timeout-minutes: 5
     steps:
       - name: Checkout repository
-        uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29
+        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332
 
       - name: Dependency Review
         uses: actions/dependency-review-action@0659a74c94536054bfa5aeb92241f70d680cc78e
@@ -19,7 +19,7 @@ jobs:
 
     steps:
     - name: Checkout repository
-      uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29
+      uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332
       with:
         fetch-depth: 1
 
 
@@ -31,7 +31,7 @@ jobs:
 
     steps:
       - name: Checkout repository
-        uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29
+        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332
 
       - name: Set up Go ${{ matrix.go-version }}
         uses: actions/setup-go@cdcb36043654635271a94b9a6d1392de5bb323a7
 
@@ -27,7 +27,7 @@ jobs:
 
     steps:
       - name: Checkout repository
-        uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29
+        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332
         with:
           fetch-depth: 0
           ref: "main"
 
@@ -14,7 +14,9 @@
 create an instance and add multiple **Patterns** to it,
 and then query data objects called **Events** to
 discover which of the Patterns match
-the fields in the Event.
+the fields in the Event. In typical cases, Quamina
+can match millions of Events per second, even with
+many Patterns added to the instance.
 
 Quamina has no run-time dependencies beyond built-in Go libraries.
 
@@ -292,33 +294,20 @@ Events through it as is practical.
 
 ### `AddPattern()` Performance
 
-In **most** cases, tens of thousands of Patterns per second can
+Tens of thousands of Patterns per second can
 be added to a Quamina instance; the in-memory data structure will
-become larger, but not unreasonably so. The amount of of
+become larger, but not unreasonably so. The amount of
 available memory is the only significant limit to the
 number of patterns an instance can carry.
 
-The exception is `shellstyle` Patterns. Adding many of these
-can rapidly lead to degradation in elapsed time and memory
-consumption, at a rate which is uneven but at worst
-O(2<sup>N</sup>) in the number of patterns. A fuzz test
-which adds random 5-letter words with a `*` at a random
-location slows to a crawl after 30 or so `AddPattern()`
-calls, with the Quamina instance having many millions of
-states. Note that such instances, once built, can still
-match Events at high speeds.
-
-This is after some optimization. It is possible there is a
-bug such that automaton-building is unduly wasteful but it
-may remain the case that adding this flavor of Pattern is
-simply not something that can be done at large scale.
-
 ### `MatchesForEvent()` Performance
 
 I used to say that the performance of
 `MatchesForEvent` was O(1) in the number of
 Patterns. That’s probably a reasonable way to think
-about it, because it’s *almost* right.
+about it, because it’s *almost* right, except in the
+case where a very large number of `shellstyle` patterns
+have been added; this is discussed in the next section.
 
 To be correct, the performance is a little worse than
 O(N) where N is the average number of unique fields in an
@@ -361,6 +350,23 @@ So, adding a new Pattern that only mentions fields which are
 already mentioned in previous Patterns is effectively free,
 i.e. O(1) in terms of run-time performance.
 
+### Quamina instances with large numbers of `shellstyle` Patterns
+
+A study of the theory of finite automata reveals that processing
+regular-expression constructs such as `*` increases the complexity of
+the automaton necessary to match them. It turns out that when
+a large number of such automata are compiled together, the merged
+output can contain a high degree of nondeterminism, which can result
+in a drastic slowdown.
+
+A fuzz test which adds a pattern for each of 12,959 5-letter words with
+one `*` embedded in each at a random offset slows matching speed down to 
+below 10,000/second, in stark contrast to most Quamina instances, which 
+can achieve millions of matches/second.
+
+This slowdown is under active investigation and it is possible that the
+situation will improve.
+
 ### Further documentation
 
 There is a series of blog posts entitled
 
@@ -73,20 +73,19 @@ func readAnythingButSpecial(pb *patternBuild, valsIn []typedVal) (pathVals []typ
 func makeMultiAnythingButFA(vals [][]byte) (*smallTable, *fieldMatcher) {
 	nextField := newFieldMatcher()
 	successStep := &faState{table: newSmallTable(), fieldTransitions: []*fieldMatcher{nextField}}
-	//DEBUG successStep.table.label = "(success)"
-	success := &faNext{steps: []*faState{successStep}}
+	success := &faNext{states: []*faState{successStep}}
 
-	ret, _ := oneMultiAnythingButStep(vals, 0, success), nextField
+	ret, _ := makeOneMultiAnythingButStep(vals, 0, success), nextField
 	return ret, nextField
 }
 
-// oneMultiAnythingButStep - spookeh. The idea is that there will be N smallTables in this FA, where N is
+// makeOneMultiAnythingButStep - spookeh. The idea is that there will be N smallTables in this FA, where N is
 // the longest among the vals. So for each value from 0 through N, we make a smallTable whose default is
 // success but transfers to the next step on whatever the current byte in each of the vals that have not
 // yet been exhausted. We notice when we get to the end of each val and put in a valueTerminator transition
 // to a step with no nextField entry, i.e. failure because we've exactly matched one of the anything-but
 // strings.
-func oneMultiAnythingButStep(vals [][]byte, index int, success *faNext) *smallTable {
+func makeOneMultiAnythingButStep(vals [][]byte, index int, success *faNext) *smallTable {
 	// this will be the default transition in all the anything-but tables.
 	var u unpackedTable
 	for i := range u {
@@ -115,18 +114,18 @@ func oneMultiAnythingButStep(vals [][]byte, index int, success *faNext) *smallTa
 
 	// for each val that still has bytes to process, recurse to process the next one
 	for utf8Byte, val := range valsWithBytesRemaining {
-		nextTable := oneMultiAnythingButStep(val, index+1, success)
+		nextTable := makeOneMultiAnythingButStep(val, index+1, success)
 		nextStep := &faState{table: nextTable}
-		u[utf8Byte] = &faNext{steps: []*faState{nextStep}}
+		u[utf8Byte] = &faNext{states: []*faState{nextStep}}
 	}
 
 	// for each val that ends at 'index', put a failure-transition for this anything-but
 	// if you hit the valueTerminator, success for everything else
 	for utf8Byte := range valsEndingHere {
 		failState := &faState{table: newSmallTable()} // note no transitions
-		lastStep := &faNext{steps: []*faState{failState}}
+		lastStep := &faNext{states: []*faState{failState}}
 		lastTable := makeSmallTable(success, []byte{valueTerminator}, []*faNext{lastStep})
-		u[utf8Byte] = &faNext{steps: []*faState{{table: lastTable}}}
+		u[utf8Byte] = &faNext{states: []*faState{{table: lastTable}}}
 	}
 
 	table := newSmallTable()
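Review note: the comment on `makeOneMultiAnythingButStep` describes an automaton whose default transition is success and which fails only when a value exactly matches one of the excluded strings. The sketch below illustrates that idea in isolation. It is not the code in this change: it swaps in a map-based trie with a `terminal` flag in place of `smallTable` and the `valueTerminator` transition, and the names `node`, `buildExcluded`, and `matchesAnythingBut` are hypothetical.

```go
package main

import "fmt"

// node is a hypothetical trie node standing in for one step of the automaton.
type node struct {
	children map[byte]*node
	terminal bool // true if one of the excluded values ends exactly here
}

// buildExcluded builds a trie holding every excluded value.
func buildExcluded(vals [][]byte) *node {
	root := &node{children: map[byte]*node{}}
	for _, val := range vals {
		n := root
		for _, b := range val {
			child, ok := n.children[b]
			if !ok {
				child = &node{children: map[byte]*node{}}
				n.children[b] = child
			}
			n = child
		}
		n.terminal = true
	}
	return root
}

// matchesAnythingBut reports whether value differs from every excluded value.
func matchesAnythingBut(root *node, value []byte) bool {
	n := root
	for _, b := range value {
		child, ok := n.children[b]
		if !ok {
			// Default transition: this byte leaves every excluded value, so succeed.
			return true
		}
		n = child
	}
	// The value is exhausted: fail only if an excluded value ends at this node.
	return !n.terminal
}

func main() {
	root := buildExcluded([][]byte{[]byte("foo"), []byte("foot")})
	for _, v := range []string{"foo", "foot", "fo", "food"} {
		fmt.Printf("%q matches anything-but: %v\n", v, matchesAnythingBut(root, []byte(v)))
	}
}
```

The recursion in the change plays the role of the nested loops in `buildExcluded`: one table per byte offset, with the failure state reachable only at the exact end of an excluded value.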
 
@@ -187,20 +187,20 @@ func TestRulerCl2(t *testing.T) {
 
 	// initial run to stabilize memory
 	bm := newBenchmarker()
-	bm.addRules(exactRules, exactMatches)
+	bm.addRules(exactRules, exactMatches, false)
 
 	bm.run(t, lines)
 
 	bm = newBenchmarker()
-	bm.addRules(exactRules, exactMatches)
+	bm.addRules(exactRules, exactMatches, true)
 	fmt.Printf("EXACT events/sec: %.1f\n", bm.run(t, lines))
 
 	bm = newBenchmarker()
-	bm.addRules(prefixRules, prefixMatches)
+	bm.addRules(prefixRules, prefixMatches, true)
 	fmt.Printf("PREFIX events/sec: %.1f\n", bm.run(t, lines))
 
 	bm = newBenchmarker()
-	bm.addRules(anythingButRules, anythingButMatches)
+	bm.addRules(anythingButRules, anythingButMatches, true)
 	fmt.Printf("ANYTHING-BUT events/sec: %.1f\n", bm.run(t, lines))
 }
 
@@ -214,13 +214,15 @@ func newBenchmarker() *benchmarker {
 	return &benchmarker{q: q, wanted: make(map[X]int)}
 }
 
-func (bm *benchmarker) addRules(rules []string, wanted []int) {
+func (bm *benchmarker) addRules(rules []string, wanted []int, report bool) {
 	for i, rule := range rules {
 		rname := fmt.Sprintf("r%d", i)
 		_ = bm.q.AddPattern(rname, rule)
 		bm.wanted[rname] = wanted[i]
 	}
-	fmt.Println(matcherStats(bm.q.matcher.(*coreMatcher)))
+	if report {
+		fmt.Println(matcherStats(bm.q.matcher.(*coreMatcher)))
+	}
 }
 
 func (bm *benchmarker) run(t *testing.T, events [][]byte) float64 {
 
@@ -129,7 +129,7 @@ func (m *coreMatcher) deletePatterns(_ X) error {
 // matchesForJSONEvent calls the flattener to pull the fields out of the event and
 // hands over to MatchesForFields
 // This is a leftover from previous times, is only used by tests, but it's used by a *lot*
-// so removing it would require a lot of tedious work
+// and it's a convenient API for testing.
 func (m *coreMatcher) matchesForJSONEvent(event []byte) ([]X, error) {
 	fields, err := newJSONFlattener().Flatten(event, m.getSegmentsTreeTracker())
 	if err != nil {
@@ -178,20 +178,27 @@ func (m *coreMatcher) matchesForFields(fields []Field) ([]X, error) {
 	}
 	matches := newMatchSet()
 
+	// pre-allocate a pair of buffers that will be used several levels down the call stack for efficiently
+	// traversing NFAs
+	bufs := &bufpair{
+		buf1: make([]*faState, 0),
+		buf2: make([]*faState, 0),
+	}
+
 	// for each of the fields, we'll try to match the automaton start state to that field - the tryToMatch
 	// routine will, in the case that there's a match, call itself to see if subsequent fields after the
 	// first matched will transition through the machine and eventually achieve a match
 	s := m.fields()
 	for i := 0; i < len(fields); i++ {
-		tryToMatch(fields, i, s.state, matches)
+		tryToMatch(fields, i, s.state, matches, bufs)
 	}
 	return matches.matches(), nil
 }
 
 // tryToMatch tries to match the field at fields[index] to the provided state. If it does match and generate
 // 1 or more transitions to other states, it calls itself recursively to see if any of the remaining fields
 // can continue the process by matching that state.
-func tryToMatch(fields []Field, index int, state *fieldMatcher, matches *matchSet) {
+func tryToMatch(fields []Field, index int, state *fieldMatcher, matches *matchSet, bufs *bufpair) {
 	stateFields := state.fields()
 
 	// transition on exists:true?
@@ -200,16 +207,16 @@ func tryToMatch(fields []Field, index int, state *fieldMatcher, matches *matchSe
 		matches = matches.addXSingleThreaded(existsTrans.fields().matches...)
 		for nextIndex := index + 1; nextIndex < len(fields); nextIndex++ {
 			if noArrayTrailConflict(fields[index].ArrayTrail, fields[nextIndex].ArrayTrail) {
-				tryToMatch(fields, nextIndex, existsTrans, matches)
+				tryToMatch(fields, nextIndex, existsTrans, matches, bufs)
 			}
 		}
 	}
 
 	// an exists:false transition is possible if there is no matching field in the event
-	checkExistsFalse(stateFields, fields, index, matches)
+	checkExistsFalse(stateFields, fields, index, matches, bufs)
 
 	// try to transition through the machine
-	nextStates := state.transitionOn(&fields[index])
+	nextStates := state.transitionOn(&fields[index], bufs)
 
 	// for each state in the possibly-empty list of transitions from this state on fields[index]
 	for _, nextState := range nextStates {
@@ -221,17 +228,17 @@ func tryToMatch(fields []Field, index int, state *fieldMatcher, matches *matchSe
 		//  of the same array
 		for nextIndex := index + 1; nextIndex < len(fields); nextIndex++ {
 			if noArrayTrailConflict(fields[index].ArrayTrail, fields[nextIndex].ArrayTrail) {
-				tryToMatch(fields, nextIndex, nextState, matches)
+				tryToMatch(fields, nextIndex, nextState, matches, bufs)
 			}
 		}
 		// now we've run out of fields to match this state against. But suppose it has an exists:false
 		// transition, and it so happens that the exists:false pattern field is lexically larger than the other
 		// fields and that in fact such a field does not exist. That state would be left hanging. So…
-		checkExistsFalse(nextStateFields, fields, index, matches)
+		checkExistsFalse(nextStateFields, fields, index, matches, bufs)
 	}
 }
 
-func checkExistsFalse(stateFields *fmFields, fields []Field, index int, matches *matchSet) {
+func checkExistsFalse(stateFields *fmFields, fields []Field, index int, matches *matchSet, bufs *bufpair) {
 	for existsFalsePath, existsFalseTrans := range stateFields.existsFalse {
 		// it seems like there ought to be a more state-machine-idiomatic way to do this, but
 		// I thought of a few and none of them worked.  Quite likely someone will figure it out eventually.
@@ -250,9 +257,9 @@ func checkExistsFalse(stateFields *fmFields, fields []Field, index int, matches
 		if i == len(fields) {
 			matches = matches.addXSingleThreaded(existsFalseTrans.fields().matches...)
 			if thisFieldIsAnExistsFalse {
-				tryToMatch(fields, index+1, existsFalseTrans, matches)
+				tryToMatch(fields, index+1, existsFalseTrans, matches, bufs)
 			} else {
-				tryToMatch(fields, index, existsFalseTrans, matches)
+				tryToMatch(fields, index, existsFalseTrans, matches, bufs)
 			}
 		}
 	}
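
Review note: the `bufs` argument threaded through `tryToMatch` and `checkExistsFalse` is described as a pair of pre-allocated buffers for NFA traversal. The definition of `bufpair` and its use inside `transitionOn` are not part of the hunks shown, so the following is only a minimal sketch of the general technique, assuming the two slices alternate as the "current" and "next" state sets. The types `state` and `bufPair` and the functions `traverse` and `buildExample` are hypothetical.

```go
package main

import "fmt"

// state is a hypothetical NFA state: each input byte maps to zero or more successors.
type state struct {
	next   map[byte][]*state
	accept bool
}

// bufPair holds two reusable slices so that traversal does not allocate per call.
type bufPair struct {
	buf1, buf2 []*state
}

// traverse runs the NFA over input, swapping the two buffers after every byte.
// Duplicate states are not removed; a production matcher would likely dedupe.
func traverse(start *state, input []byte, bufs *bufPair) bool {
	current := append(bufs.buf1[:0], start)
	next := bufs.buf2[:0]
	for _, b := range input {
		next = next[:0]
		for _, s := range current {
			next = append(next, s.next[b]...)
		}
		current, next = next, current
	}
	accepted := false
	for _, s := range current {
		if s.accept {
			accepted = true
			break
		}
	}
	// Hand the (possibly grown) slices back so the next call reuses their capacity.
	bufs.buf1, bufs.buf2 = current, next
	return accepted
}

// buildExample builds an NFA for the shellstyle-like pattern "a*b" over the alphabet {a, b, c}.
func buildExample() *state {
	end := &state{accept: true}
	loop := &state{next: map[byte][]*state{}}
	loop.next['a'] = []*state{loop}
	loop.next['c'] = []*state{loop}
	loop.next['b'] = []*state{loop, end} // nondeterministic: stay in the loop or finish
	return &state{next: map[byte][]*state{'a': {loop}}}
}

func main() {
	bufs := &bufPair{}
	start := buildExample()
	for _, in := range []string{"ab", "acb", "acbc", "b"} {
		fmt.Printf("%q matches: %v\n", in, traverse(start, []byte(in), bufs))
	}
}
```

Allocating the pair once per `matchesForFields` call, as the change does, lets every recursive `tryToMatch` invocation share the same backing arrays instead of allocating fresh slices at each step.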