Commit 36e31d7

chore: cleanup after NFA-DFA refactor
Closes: #67
Signed-off-by: Tim Bray <[email protected]>
1 parent c5e39ea commit 36e31d7

File tree

10 files changed

+75
-109
lines changed

README.md

Lines changed: 34 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -268,15 +268,37 @@ you’ve created a Quamina instance, whether through
268268
`New()` or `Copy()`, keep it around and run as many
269269
Events through it as is practical.
270270

271-
272-
### Performance
271+
### `AddPattern()` Performance
272+
273+
In **most** cases, tens of thousands of Patterns per second can
274+
be added to a Quamina instance; the in-memory data structure will
275+
become larger, but not unreasonably so. The amount of
276+
available memory is the only significant limit to the
277+
number of patterns an instance can carry.
278+
279+
The exception is `shellstyle` Patterns. Adding many of these
280+
can rapidly lead to degradation in elapsed time and memory
281+
consumption, at a rate which is uneven but at worst
282+
O(2<sup>N</sup>) in the number of patterns. A fuzz test
283+
which adds random 5-letter words with a `*` at a random
284+
location slows to a crawl after 30 or so `AddPattern()`
285+
calls, with the Quamina instance having many millions of
286+
states. Note that such instances, once built, can still
287+
match Events at high speeds.
288+
289+
This is after some optimization. It is possible there is a
290+
bug such that automaton-building is unduly wasteful but it
291+
may remain the case that adding this flavor of Pattern is
292+
simply not something that can be done at large scale.
293+
294+
### `MatchesForEvent()` Performance
273295

274296
I used to say that the performance of
275-
`MatchesForEvent` was `O(1)` in the number of
297+
`MatchesForEvent` was O(1) in the number of
276298
Patterns. That’s probably a reasonable way to think
277299
about it, because it’s *almost* right.
278300

279-
To be correct, the performance is `O(N)` where `N` is
301+
To be correct, the performance is O(N) where N is
280302
the number of unique fields that appear in all the Patterns
281303
that have been added to Quamina.
282304

@@ -315,9 +337,16 @@ is at most N, the number of fields left after discarding.
315337

316338
Thus, adding a new Pattern that only
317339
mentions fields which are already mentioned in previous
318-
Patterns is effectively free i.e. `O(1)` in terms of run-time
340+
Patterns is effectively free i.e. O(1) in terms of run-time
319341
performance.
320342

343+
### Further documentation
344+
345+
There is a series of blog posts entitled
346+
[Quamina Diary](https://www.tbray.org/ongoing/What/Technology/Quamina%20Diary/)
347+
that provides a detailed discussion of the design decisions
348+
at a length unsuitable for in-code comments.
349+
321350
### Name
322351

323352
From Wikipedia: Quamina Gladstone (1778 – 16 September

field_matcher.go

Lines changed: 7 additions & 7 deletions
@@ -3,14 +3,14 @@ package quamina
33
import "sync/atomic"
44

55
// fieldMatcher represents a state in the matching automaton, which matches field names and dispatches to
6-
// valueMatcher to complete matching of field values. fieldMatcher has a map which is keyed by the
7-
// field pathSegments values that can start transitions from this matcher; for each such field, there is a
8-
// valueMatcher which, given the field's value, determines whether the automaton progresses to another fieldMatcher
6+
// valueMatcher to complete matching of field values. fieldMatcher has a map which is keyed by the
7+
// field pathSegments values that can start transitions from this matcher; for each such field, there is a
8+
// valueMatcher which, given the field's value, determines whether the automaton progresses to another fieldMatcher
99
// matches contains the X values that arrival at this state implies have matched
1010
// existsFalseFailures reports the condition that traversal has occurred by matching a field which is named in an
11-
// exists:false pattern, and the named X's should be subtracted from the matches list being built up by a match project
11+
// exists:false pattern, and the named X's should be subtracted from the matches list being built up by a match project
1212
// the fields that hold state are segregated in updateable so they can be replaced atomically and make the matcher
13-
// thread-safe.
13+
// thread-safe.
1414
type fieldMatcher struct {
1515
updateable atomic.Value // always holds an *fmFields
1616
}
@@ -20,8 +20,8 @@ type fmFields struct {
2020
existsFalseFailures *matchSet
2121
}
2222

23-
// fields / update / addExistsFalseFailure / addMatch exist to insuleate callers from dealing with
24-
// the atomic Load/Store business
23+
// fields / update / addExistsFalseFailure / addMatch exist to insulate callers from dealing with
24+
// the atomic Load/Store business
2525
func (m *fieldMatcher) fields() *fmFields {
2626
return m.updateable.Load().(*fmFields)
2727
}
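
The comment above says the state-holding fields are kept in `updateable` so they can be replaced atomically. A minimal sketch of that copy-then-publish idea follows; the `fields` and `matcher` types and the `addMatch` body are hypothetical stand-ins, not the actual `fmFields` code, and the sketch assumes only one writer runs at a time.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// fields is a hypothetical stand-in for fmFields: the state a matcher carries.
type fields struct {
	matches []string
}

// matcher mirrors the shape of fieldMatcher: all state sits behind one atomic.Value.
type matcher struct {
	updateable atomic.Value // always holds a *fields
}

func newMatcher() *matcher {
	m := &matcher{}
	m.updateable.Store(&fields{})
	return m
}

// current returns the snapshot that readers should use; it never blocks.
func (m *matcher) current() *fields {
	return m.updateable.Load().(*fields)
}

// addMatch copies the current snapshot, changes the copy, and publishes it.
// Assumption: a single writer at a time, so no compare-and-swap loop is needed.
func (m *matcher) addMatch(x string) {
	old := m.current()
	fresh := &fields{matches: make([]string, len(old.matches), len(old.matches)+1)}
	copy(fresh.matches, old.matches)
	fresh.matches = append(fresh.matches, x)
	m.updateable.Store(fresh)
}

func main() {
	m := newMatcher()
	before := m.current()
	m.addMatch("pattern-1")
	// The old snapshot is untouched; only the published one has the new match.
	fmt.Println(len(before.matches), len(m.current().matches)) // prints "0 1"
}
```

A reader that loaded a snapshot before the update keeps seeing a consistent, unchanged view, which is why callers are shielded from the Load/Store details by small accessor methods.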

flatten_json.go

Lines changed: 6 additions & 7 deletions
@@ -7,14 +7,13 @@ import (
77
)
88

99
// flattenJSON is a custom non-general-purpose JSON parser whose object is to implement Flattener and produce a []Field
10-
// list from a JSON object. This could be done (and originally was) with the built-in encoding/json, but the
11-
// performance was unsatisfactory (99% of time spent parsing events < 1% matching them). The profiler suggests
12-
// that the performance issue was mostly due to excessive memory allocation.
10+
// list from a JSON object. This could be done (and originally was) with the built-in encoding/json, but the
11+
// performance was unsatisfactory (99% of time spent parsing events < 1% matching them). The profiler suggests
12+
// that the performance issue was mostly due to excessive memory allocation.
1313
// If we assume that the event is immutable while we're working, then all the pieces of it that constitute
14-
// the fields & values can be represented as []byte slices using a couple of offsets into the underlying event.
15-
// There is an exception, namely strings that contain \-prefixed JSON escapes; since we want to work with the
16-
// actual UTF-8 bytes, this requires re-writing such strings into memory we have to allocate.
17-
// TODO: There are gaps in the unit-test coverage, including nearly all the error conditions
14+
// the fields & values can be represented as []byte slices using a couple of offsets into the underlying event.
15+
// There is an exception, namely strings that contain \-prefixed JSON escapes; since we want to work with the
16+
// actual UTF-8 bytes, this requires re-writing such strings into memory we have to allocate.
1817
type flattenJSON struct {
1918
event []byte // event being processed, treated as immutable
2019
eventIndex int // current byte index into the event

flattener.go

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
package quamina
22

3-
// Flattener is interface which provides methods to turn a data structure into a list of path-names and
3+
// Flattener is an interface which provides methods to turn a data structure into a list of path-names and
44
// values. The following example illustrates how it works for a JSON object:
55
// { "a": 1, "b": "two", "c": true, "d": null, "e": { "e1": 2, "e2": 3.02e-5 }, "f": [33, "x"] }
66
// should produce

list_maker.go

Lines changed: 5 additions & 5 deletions
@@ -1,10 +1,10 @@
11
package quamina
22

3-
// this needs to exist so that all all the lists containing a single step to X, or the triple step to X,Y,Z are the
4-
// same list, so that pack/unpack work properly. In a large majority of cases, there's only one step in the list, so
5-
// those are handled straightforwardly with a map. Otherwise, we laboriously look through all the lists for a match.
6-
// In Java I'd implement a hashCode() method and everything would be a hash, but I haven't learned yet what the Go
7-
// equivalent is.
3+
// this needs to exist so that all the lists containing a single step to X are the same list, and similarly all
4+
// those containing the triple step to X,Y,Z are the same list, so that pack/unpack work properly. In a large majority
5+
// of cases, there's only one step in the list, so those are handled straightforwardly with a map. Otherwise, we
6+
// laboriously look through all the lists for a match. In Java I'd implement a hashCode() method and everything
7+
// would be a hash, but I haven't learned yet what the Go equivalent is.
88
type dfaMemory struct {
99
singletons map[*nfaStep]*dfaStep
1010
plurals []perList

match_set.go

Lines changed: 2 additions & 1 deletion
@@ -1,7 +1,7 @@
11
package quamina
22

33
// matchSet is what it says on the tin; implements a set semantic on matches, which are of type X. These could all
4-
// be implemented as match[X]bool but this makes the calling code more readable.
4+
// be implemented as match[X]bool but this makes the calling code more readable.
55
type matchSet struct {
66
set map[X]bool
77
}
@@ -11,6 +11,7 @@ func newMatchSet() *matchSet {
1111
}
1212

1313
func (m *matchSet) addX(exes ...X) *matchSet {
14+
// for concurrency, can't update in place
1415
newSet := make(map[X]bool, len(m.set)+1)
1516
for k := range m.set {
1617
newSet[k] = true
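
The hunk is cut off partway through `addX`, so here is a self-contained sketch of the copy-on-write set that the new comment ("for concurrency, can't update in place") points at. The rest of the function body shown here is an assumption about how such a method would typically finish, not a copy of the repository code.

```go
package main

import "fmt"

// X stands in for the pattern identifier type used in the diff.
type X any

type matchSet struct {
	set map[X]bool
}

func newMatchSet() *matchSet {
	return &matchSet{set: make(map[X]bool)}
}

// addX returns a new matchSet holding the old members plus exes.
// The receiver's map is never written to, so a goroutine that is
// still reading the old set sees a stable view.
func (m *matchSet) addX(exes ...X) *matchSet {
	newSet := make(map[X]bool, len(m.set)+len(exes))
	for k := range m.set {
		newSet[k] = true
	}
	for _, x := range exes {
		newSet[x] = true
	}
	return &matchSet{set: newSet}
}

func main() {
	a := newMatchSet().addX("p1")
	b := a.addX("p2", "p3")
	fmt.Println(len(a.set), len(b.set)) // prints "1 3"
}
```

The cost is one map copy per addition, which is paid on the pattern-adding path rather than on the matching path.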

quamina.go

Lines changed: 1 addition & 1 deletion
@@ -119,7 +119,7 @@ func (q *Quamina) Copy() *Quamina {
119119

120120
// X is used in the AddPattern and MatchesForEvent APIs to identify the patterns that are added to
121121
// a Quamina instance and are reported by that instance as matching an event. Commonly, X is a string
122-
// used to name the event.
122+
// used to name the pattern.
123123
type X any
124124

125125
// AddPattern - adds a pattern, identified by the x argument, to a Quamina instance.

shell_style_test.go

Lines changed: 2 additions & 6 deletions
@@ -2,6 +2,7 @@ package quamina
22

33
import (
44
"fmt"
5+
"math/rand"
56
"strings"
67
"testing"
78
)
@@ -84,8 +85,7 @@ func TestMakeShellStyleAutomaton(t *testing.T) {
8485
}
8586
}
8687

87-
/* To be used in profiling AddPattern for patterns which need NFAs
88-
func xTestShellStyleBuildTime(t *testing.T) {
88+
func TestShellStyleBuildTime(t *testing.T) {
8989
words := readWWords(t)
9090
starWords := make([]string, 0, len(words))
9191
patterns := make([]string, 0, len(words))
@@ -107,7 +107,6 @@ func xTestShellStyleBuildTime(t *testing.T) {
107107
}
108108
fmt.Println(matcherStats(q.matcher.(*coreMatcher)))
109109
}
110-
*/
111110

112111
func TestMixedPatterns(t *testing.T) {
113112
// let's mix up some prefix, infix, suffix, and exact-match searches
@@ -123,9 +122,6 @@ func TestMixedPatterns(t *testing.T) {
123122
`"ZOE"`: 19,
124123
`"CRYSTAL"`: 6,
125124
}
126-
x1, _ := makeShellStyleAutomaton([]byte(`"*ST"`), nil)
127-
x2, _ := makeShellStyleAutomaton([]byte(`"*TH"`), nil)
128-
mergeNfas(x1, x2)
129125

130126
stringTemplate := `{"properties": { "STREET": [ XX ] } }`
131127
shellTemplate := `{"properties": {"STREET":[ {"shellstyle": XX} ] } }`

small_table.go

Lines changed: 3 additions & 66 deletions
@@ -38,7 +38,7 @@ const valueTerminator byte = 0xf5
3838
// but I imagine organizing it this way is a bit more memory-efficient. Suppose we want to model a table where
3939
// byte values 3 and 4 map to ss1 and byte 0x34 maps to ss2. Then the smallTable would look like:
4040
// ceilings:--|3|----|5|-|0x34|--|x35|-|byteCeiling|
41-
// steps:---|nil|-|&ss1|--|ni|--|&ss2|---------|nil|
41+
// steps:---|nil|-|&ss1|--|nil|-|&ss2|---------|nil|
4242
// invariant: The last element of ceilings is always byteCeiling
4343
// The motivation is that we want to build a state machine on byte values to implement things like prefixes and
4444
// ranges of bytes. This could be done simply with an array of size byteCeiling for each state in the machine,
@@ -133,7 +133,7 @@ func mergeOneDfaStep(step1, step2 *dfaStep, memoize map[dfaStepKey]*dfaStep) *df
133133
uComb[i] = stepNew
134134
case stepExisting != nil && stepNew != nil:
135135
// there are considerable runs of the same value
136-
if i > 1 && stepExisting == uExisting[i-1] && stepNew == uNew[i-1] {
136+
if i > 0 && stepExisting == uExisting[i-1] && stepNew == uNew[i-1] {
137137
uComb[i] = uComb[i-1]
138138
} else {
139139
uComb[i] = mergeOneDfaStep(stepExisting, stepNew, memoize)
@@ -148,7 +148,7 @@ func mergeOneDfaStep(step1, step2 *dfaStep, memoize map[dfaStepKey]*dfaStep) *df
148148
// transitions in the NFA because, as of the time of writing, none of the
149149
// pattern-matching required those transitions. It is based on the algorithm
150150
// taught in the TU München course “Automata and Formal Languages”, lecturer
151-
// Prof. Dr.Ernst W. Mayr in 2014-15, in particular the examples appearing in
151+
// Prof. Dr. Ernst W. Mayr in 2014-15, in particular the examples appearing in
152152
// http://wwwmayr.informatik.tu-muenchen.de/lehre/2014WS/afs/2014-10-14.pdf
153153
// especially the slide in Example 11.
154154
//
@@ -208,69 +208,6 @@ func nfaStep2DfaStep(stepList *nfaStepList, memoize *dfaMemory) *dfaStep {
208208
return dStep
209209
}
210210

211-
type nfaStepKey struct {
212-
step1 *nfaStep
213-
step2 *nfaStep
214-
}
215-
216-
func mergeNfas(nfa1, nfa2 *smallTable[*nfaStepList]) *smallTable[*nfaStepList] {
217-
step1 := &nfaStep{table: nfa1}
218-
step2 := &nfaStep{table: nfa2}
219-
return mergeOneNfaStep(step1, step2, make(map[nfaStepKey]*nfaStep), newListMaker(), 0).table
220-
}
221-
222-
func mergeOneNfaStep(step1, step2 *nfaStep, memoize map[nfaStepKey]*nfaStep, lister *listMaker, depth int) *nfaStep {
223-
var combined *nfaStep
224-
mKey := nfaStepKey{step1: step1, step2: step2}
225-
combined, ok := memoize[mKey]
226-
if ok {
227-
return combined
228-
}
229-
230-
newTable := newSmallTable[*nfaStepList]()
231-
switch {
232-
case step1.fieldTransitions == nil && step2.fieldTransitions == nil:
233-
combined = &nfaStep{table: newTable}
234-
case step1.fieldTransitions != nil && step2.fieldTransitions != nil:
235-
transitions := append(step1.fieldTransitions, step2.fieldTransitions...)
236-
combined = &nfaStep{table: newTable, fieldTransitions: transitions}
237-
case step1.fieldTransitions != nil && step2.fieldTransitions == nil:
238-
combined = &nfaStep{table: newTable, fieldTransitions: step1.fieldTransitions}
239-
case step1.fieldTransitions == nil && step2.fieldTransitions != nil:
240-
combined = &nfaStep{table: newTable, fieldTransitions: step2.fieldTransitions}
241-
}
242-
memoize[mKey] = combined
243-
244-
u1 := unpackTable(step1.table)
245-
u2 := unpackTable(step2.table)
246-
var uComb unpackedTable[*nfaStepList]
247-
for i, list1 := range u1 {
248-
list2 := u2[i]
249-
switch {
250-
case list1 == nil && list2 == nil:
251-
uComb[i] = nil
252-
case list1 != nil && list2 == nil:
253-
uComb[i] = u1[i]
254-
case list1 == nil && list2 != nil:
255-
uComb[i] = u2[i]
256-
case list1 != nil && list2 != nil:
257-
var comboList []*nfaStep
258-
for _, nextStep1 := range list1.steps {
259-
for _, nextStep2 := range list2.steps {
260-
merged := mergeOneNfaStep(nextStep1, nextStep2, memoize, lister, depth+1)
261-
comboList = append(comboList, merged)
262-
}
263-
}
264-
uComb[i] = lister.getList(comboList...)
265-
}
266-
}
267-
combined.table.pack(&uComb)
268-
return combined
269-
}
270-
271-
// TODO: Clean up from here on down - too many funcs doing about the same thing, and also it seems that
272-
// we never want to have more than one "range", which is the whole table.
273-
274211
// makeSmallDfaTable creates a pre-loaded small table, with all bytes not otherwise specified having the defaultStep
275212
// value, and then a few other values with their indexes and values specified in the other two arguments. The
276213
// goal is to reduce memory churn
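
The ceilings/steps layout described in the `smallTable` comment earlier in this file (bytes 3 and 4 map to `ss1`, byte 0x34 maps to `ss2`) can be illustrated with a simplified lookup. This is a sketch only: the `table` type uses strings in place of step pointers, the linear scan is one plausible lookup strategy, and the value 0xf6 for `byteCeiling` is an assumption based on `valueTerminator` being 0xf5.

```go
package main

import "fmt"

// byteCeiling is assumed to be 0xf6, one past valueTerminator (0xf5).
const byteCeiling = 0xf6

// table is a simplified, hypothetical smallTable. steps[i] applies to every
// byte b that is below ceilings[i] and not below ceilings[i-1].
type table struct {
	ceilings []int
	steps    []string // "" plays the role of a nil step
}

// step finds the first ceiling above b and returns the step stored with it.
func (t *table) step(b byte) string {
	for i, ceiling := range t.ceilings {
		if int(b) < ceiling {
			return t.steps[i]
		}
	}
	return "" // bytes at or above byteCeiling have no step in this sketch
}

// exampleTable builds the table from the comment: 3 and 4 go to ss1, 0x34 to ss2.
func exampleTable() *table {
	return &table{
		ceilings: []int{3, 5, 0x34, 0x35, byteCeiling},
		steps:    []string{"", "ss1", "", "ss2", ""},
	}
}

func main() {
	tbl := exampleTable()
	fmt.Printf("%q %q %q %q\n", tbl.step(2), tbl.step(3), tbl.step(4), tbl.step(0x34))
	// prints: "" "ss1" "ss1" "ss2"
}
```

Five entries cover the whole byte range here, where a plain array would need one slot per possible byte value in every state.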

value_matcher.go

Lines changed: 14 additions & 10 deletions
@@ -8,16 +8,21 @@ import (
88
// valueMatcher represents a byte-driven automaton. The table needs to be the
99
// equivalent of a map[byte]nextState and is represented by smallTable. Some
1010
// patterns can be represented by a deterministic finite automaton (DFA) but
11-
// others, particularly with a regex failure, need to be represented by a
12-
// nondeterministic finite automaton (NFA). NFAs trump DFAs so if a valueMatcher
13-
// has one, it must be used in preference to other alternatives. In some cases
14-
// there is only one byte sequence forward from a state, i.e. a string-valued
15-
// field with only one string match. In this case, the DFA and NFA will b null
16-
// and the value being matched has to exactly equal the singletonMatch field; if
17-
// so, the singletonTransition is the return value. This is to avoid having a
18-
// long chain of smallTables each with only one entry.
11+
// others, particularly with a regex flavor, need to be represented by a
12+
// nondeterministic finite automaton (NFA). NFAs are converted to DFAs for
13+
// simplicity and efficiency. The basic algorithm is to compute the automaton
14+
// for a pattern, convert it to a DFA if necessary, and merge with any
15+
// existing DFA.
16+
// In some (common) cases there is only one byte sequence forward from a state,
17+
// i.e. a string-valued field with only one string match. In this case, the DFA
18+
// will be null and the value being matched has to exactly equal the singletonMatch
19+
// field; if so, the singletonTransition is the return value. This is to avoid
20+
// having a long chain of smallTables each with only one entry.
21+
// To allow for concurrent access between one thread running AddPattern and many
22+
// others running MatchesForEvent, the valueMatcher payload is stored in an
23+
// atomic.Value
1924
type valueMatcher struct {
20-
updateable atomic.Value
25+
updateable atomic.Value // always contains *vmFields
2126
}
2227
type vmFields struct {
2328
startDfa *smallTable[*dfaStep]
@@ -83,7 +88,6 @@ func transitionDfa(table *smallTable[*dfaStep], val []byte, transitions []*field
8388
}
8489

8590
transitions = append(transitions, step.fieldTransitions...)
86-
8791
table = step.table
8892
}
8993
