Commit c6c035f

fixup! api: reorganize directory layout
1 parent c2f9f92 commit c6c035f

27 files changed

+570
-417
lines changed

README.md

Lines changed: 77 additions & 103 deletions
@@ -10,19 +10,21 @@
1010

1111
### Fast pattern-matching library
1212

13-
Quamina provides APIs to create an interface called
14-
a **Matcher**,
15-
add multiple **Patterns** to it, and then query JSON blobs
16-
called **Events** to discover which of the patterns match
13+
**Quamina** implements a data type that has APIs to
14+
create an instance and add multiple **Patterns** to it,
15+
and then query data objects called **Events** to
16+
discover which of the patterns match
1717
the fields in the event.
1818

19+
Quamina [welcomes contributions](CONTRIBUTING.md).
20+
1921
### Status
2022

2123
As of late May 2022, Quamina has a lot of unit tests and
22-
they're all passing. We are working on getting the
23-
GitHub-based CI/CD nailed down and stable. We have not
24-
pressed the “release” button, so we reserve the right
25-
to change APIs.
24+
they're all passing. It has a reasonable basis of
25+
GitHub-based CI/CD working. We intend to press the
26+
“release” button any day now, but for the moment
27+
we reserve the right to change APIs.
2628

2729
### Patterns
2830

@@ -128,87 +130,79 @@ Number matching is weak - the number has to appear
128130
exactly the same in the pattern and the event. I.e.,
129131
Quamina doesn't know that 35, 35.000, and 3.5e1 are the
130132
same number. There's a fix for this in the code which
131-
is commented out because it causes a
132-
significant performance penalty.
133+
is not yet activated because it causes a
134+
significant performance penalty, so the API needs to
135+
be enhanced to only ask for it when you need it.
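To make the limitation concrete, here is a minimal sketch of one way numeric literals could be canonicalized before comparison. It is not the fix referred to above, whose details this README does not show; `canonicalNumber` is a hypothetical helper, and routing through `float64` is an assumption that loses precision for very large or very precise values.

```go
package main

import (
	"fmt"
	"strconv"
)

// canonicalNumber is a hypothetical helper (not part of Quamina).
// It parses a JSON number literal and re-renders it in one fixed
// textual form, so numerically equal literals compare equal as strings.
// Assumption: float64 precision is acceptable for the values involved.
func canonicalNumber(literal string) (string, error) {
	f, err := strconv.ParseFloat(literal, 64)
	if err != nil {
		return "", err
	}
	// 'g' with precision -1 yields the shortest form that round-trips.
	return strconv.FormatFloat(f, 'g', -1, 64), nil
}

func main() {
	for _, s := range []string{"35", "35.000", "3.5e1"} {
		c, err := canonicalNumber(s)
		if err != nil {
			fmt.Println("bad number:", s, err)
			continue
		}
		fmt.Printf("%s -> %s\n", s, c)
	}
}
```

The extra parse and format on every numeric value hints at why such a step might carry a performance cost and be worth making opt-in.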
133136

134137
## Flattening and Matching
135138

136139
The first step in finding matches for an Event is
137140
“flattening” it, which is to say turning it
138141
into a list of pathname/value pairs called Fields. Quamina
139-
defines a `Flattener` interface type and provides a
140-
JSON-specific implementation in the `FJ` type.
142+
defines a `Flattener` interface type and has a built-in
143+
`Flattener` for JSON.
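As an illustration of what flattening means, the following sketch turns a JSON event into a sorted list of pathname/value pairs using only the standard library. It is not Quamina's built-in `Flattener`: the `pathValue` type, the `flattenJSON` function, and the `.` path separator are all assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// pathValue is a hypothetical stand-in for a flattened field.
type pathValue struct {
	Path  string // dot-joined object keys leading to the leaf
	Value string // leaf value re-encoded as JSON text
}

// walk records one pathValue per leaf under v.
// Array elements share the path of the array itself.
func walk(prefix string, v any, out *[]pathValue) {
	switch t := v.(type) {
	case map[string]any:
		for k, child := range t {
			p := k
			if prefix != "" {
				p = prefix + "." + k
			}
			walk(p, child, out)
		}
	case []any:
		for _, child := range t {
			walk(prefix, child, out)
		}
	default:
		b, _ := json.Marshal(t) // scalars decoded from JSON always re-encode
		*out = append(*out, pathValue{Path: prefix, Value: string(b)})
	}
}

// flattenJSON decodes an event and returns its leaves sorted by path, then value.
func flattenJSON(event []byte) ([]pathValue, error) {
	var root any
	if err := json.Unmarshal(event, &root); err != nil {
		return nil, err
	}
	var out []pathValue
	walk("", root, &out)
	sort.Slice(out, func(i, j int) bool {
		if out[i].Path != out[j].Path {
			return out[i].Path < out[j].Path
		}
		return out[i].Value < out[j].Value
	})
	return out, nil
}

func main() {
	fields, err := flattenJSON([]byte(`{"a": {"b": 1, "c": ["x", "y"]}}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range fields {
		fmt.Printf("%s = %s\n", f.Path, f.Value)
	}
}
```

A real flattener would likely avoid decoding into generic maps, since this sketch allocates heavily and keeps no reusable state.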
141144

142145
`Flattener` implementations in general will have
143146
internal state and thus not be thread-safe.
144147

145-
The `MatchesForJSONEvent` API must create a new
146-
`FJ` instance for each event so that it
147-
can be thread-safe. This works fine, but creating a
148-
new `FJ` instance is expensive enough to slow the
149-
rate at which events can be matched by 15% or so.
150-
151-
For maximum performance in matching JSON events,
152-
you should create your own `FJ` instance with the
153-
`NewFJ(Matcher)` method. You can then use
154-
`FJ.Flatten(event)` API to turn multiple successive
155-
JSON events into `Field` lists and pass them to
156-
`Matcher`'s `MatchesForFields()` API, but `FJ`
157-
includes a convenience method `FlattenAndMatch(event)`
158-
which will call the `Matcher` for you. As long as
159-
each thread has its own `Flattener` instance,
160-
everything will remain thread-safe.
161-
162-
Also note that should you wish to process events
148+
Note that should you wish to process events
163149
in a format other than JSON, you can implement
164-
the `Flattener` interface and use that to process
165-
events in whatever format into Field lists.
150+
the `Flattener` interface yourself.
166151

167152
## APIs
168-
169153
**Note**: In all the APIs below, field names and values in both
170154
Patterns and Events must be valid UTF-8. Unescaped characters
171155
smaller than 0x1F (illegal per JSON), and bytes with value
172156
greater than 0XF4 (can't occur in correctly composed UTF-8)
173157
are rejected by the APIs.
174-
158+
### Control APIs
175159
```go
176-
type Matcher interface {
177-
AddPattern(x X, pat string) error
178-
MatchesForJSONEvent(event []byte) ([]X, error)
179-
MatchesForFields(fields []Field) []X
180-
DeletePattern(x X) error
181-
}
182-
```
160+
func New(...Option) (*Quamina, error)
183161

184-
Above are the operations provided by a Matcher. Quamina
185-
includes an implementation called `CoreMatcher` which
186-
implements `Matcher`. In a forthcoming release it will
187-
provider alternate implementations that offer extra
188-
features.
189-
190-
```go
191-
func NewCoreMatcher() *Matcher
162+
func WithMediaType(mediaType string) Option
163+
func WithFlattener(f Flattener) Option
164+
func WithPatternDeletion(b bool) Option
165+
func WithPatternStorage(ps LivePatternsState) Option
192166
```
167+
For example:
168+
193169
```go
194-
func pruner.NewMatcher() *Matcher
170+
q, err := quamina.New(quamina.WithMediaType("application/json"))
195171
```
196-
197-
Create new Matchers, take no arguments. The difference
198-
is that the `pruner.NewMatcher` version supports the
199-
`DeletePattern()` API. Be careful: It occasionally
200-
rebuilds the Matcher in stop-the-world fashion, so if you
201-
delete lots of Patterns in a large Matcher you may
202-
encounter occasional elevated latencies.
172+
The meanings of the `Option` functions are:
173+
174+
`WithMediaType`: In the future, Quamina will support
175+
Events not just in JSON but in other formats such as
176+
Avro, Protobufs, and so on. This option will make sure
177+
to invoke the correct Flattener. At the moment, the only
178+
supported value is `application/json`, the default.
179+
180+
`WithFlattener`: Requests that Quamina flatten events with
181+
the provided (presumably user-written) Flattener.
182+
183+
`WithPatternDeletion`: If true, arranges that Quamina
184+
allows Patterns to be deleted from an instance. This is
185+
not free; it can incur extra costs in memory and
186+
occasional stop-the-world Quamina rebuilds. (We plan
187+
to improve this.)
188+
189+
`WithPatternStorage`: If you provide an argument that
190+
supports the `LivePatternsState` API, Quamina will
191+
use it to
192+
maintain a list of which patterns have currently been
193+
added but not deleted. This could be useful if you
194+
wanted to rebuild Quamina instances for sharded
195+
processing or after a system failure. ***Note: Not
196+
yet implemented.***
197+
198+
### Data APIs
203199

204200
```go
205-
func (m *Matcher) AddPattern(x X, patternJSON string) error
201+
func (q *Quamina) AddPattern(x X, patternJSON string) error
206202
```
207-
208203
The first argument identifies the Pattern and will be
209-
returned by a Matcher when asked to match against Events.
210-
X is currently `interface{}`. Should it be a generic now
211-
that Go has them?
204+
returned by Quamina when asked to match against Events.
205+
X is defined as `any`.
212206

213207
The Pattern must be provided as a string which is a
214208
JSON object as exemplified above in this document.
@@ -217,67 +211,47 @@ The `error` return is used to signal invalid Pattern
217211
structure, which could be bad UTF-8 or malformed JSON
218212
or leaf values which are not provided as arrays.
219213

220-
As many Patterns as desired can be added to a Matcher.
221-
The `CoreMatcher` type does not support `DeletePattern()`
222-
but `pruner.Matcher` does.
214+
As many Patterns as desired can be added to a Quamina
215+
instance.
223216

224217
The `AddPattern` call is single-threaded; if multiple
225218
threads call it, they will block and execute sequentially.
226219

227220
```go
228-
func (m *Matcher) MatchesForJSONEvent(event []byte) ([]X, error)
221+
func (q *Quamina) MatchesForEvent(event []byte) ([]X, error)
229222
```
230223

231-
The `event` argument must be a JSON object encoded in
232-
correct UTF-8.
233-
234224
The `error` return value is nil unless there was an
235-
error in the Event JSON.
225+
error in the encoding of the Event.
236226

237227
The `[]X` return slice may be empty if none of the Patterns
238228
match the provided Event.
239229

240-
```go
241-
func (m *Matcher) MatchesForFields([]Field) ([]X, error)
242-
```
243-
Performs the functions of `MatchesForJSON` on an
244-
Event which has been flattened into a list of `Field`
245-
instances. At the moment, `CoreMatcher` only returns
246-
an error if the `[]Field` argument is nil. `pruner.Matcher`
247-
can return an error if it suffers a failure in its
248-
Pattern storage.
249-
250-
These matching calls are thread-safe. Many threads may
251-
be executing it concurrently, even while `AddPattern` is
252-
also executing. There is a significant performance
253-
penalty if there is a high rate of `AddPattern` in
254-
combination with matching.
255-
256-
```go
257-
func NewFJ(*Matcher) Flattener
258-
```
259-
Creates a new JSON-specific Flattener.
260-
```go
261-
func (fj *FJ) Flatten([]byte event) []Field
262-
```
263-
Transforms an event, which must be JSON object
264-
encoded in UTF-8, into a list of `Field` instances.
230+
A single Quamina instance is not thread-safe. But
231+
instances can share the underlying data structures
232+
in a safe way.
265233

266-
```go
267-
func (fj *FJ) FlattenAndMatch([]byte event) ([]X, error)
234+
```go
235+
func (q *Quamina) Copy() *Quamina
268236
```
269-
Utility function which combines the functions of the
270-
`FJ.Flatten` and `Matcher.MatchesForFields` APIs.
237+
This generates a copy of the target instance
238+
which may be used in parallel on another thread,
239+
while sharing the underlying data structure. Many
240+
instances can execute `MatchesForEvent()` calls
241+
concurrently, even while one or more of them are
242+
also executing `AddPattern()`. There is a
243+
significant performance penalty if there is a high
244+
rate of `AddPattern` in parallel with matching.
271245

272246
### Performance
273247

274248
I used to say that the performance of
275-
`MatchesForJSONEvent` was `O(1)` in the number of
249+
`MatchesForEvent` was `O(1)` in the number of
276250
Patterns. While that’s probably the right way to think
277251
about it, it’s not *quite* true,
278252
as it varies somewhat as a function of the number of
279253
unique fields that appear in all the patterns that have
280-
been added to the matcher, but still remains sublinear
254+
been added to Quamina, but still remains sublinear
281255
in that number.
282256

283257
A word of explanation: Quamina compiles the
@@ -290,7 +264,7 @@ flattened into a list of pathname/value pairs and
290264
sorted. This process exceeds 50% of execution time,
291265
and is optimized by discarding any fields that
292266
do not appear in one or more of the patterns added
293-
to the matcher. Thus, adding a new pattern that only
267+
to Quamina. Thus, adding a new pattern that only
294268
mentions fields mentioned in previous patterns is
295269
effectively free i.e. `O(1)` in terms of run-time
296270
performance.
@@ -306,7 +280,7 @@ colonies before slavery was abolished.
306280

307281
### Credits
308282

309-
@timbray: v0.1 and patches.
283+
@timbray: v0.0 and patches.
310284

311285
@jsmorph: `Pruner` and concurrency testing.
312286

arrays_test.go

Lines changed: 4 additions & 4 deletions
@@ -75,16 +75,16 @@ func TestArrayCorrectness(t *testing.T) {
7575
pattern1 := `{"bands": { "members": { "given": [ "Mick" ], "surname": [ "Strummer" ] } } }`
7676
pattern2 := `{"bands": { "members": { "given": [ "Wata" ], "role": [ "drums" ] } } }`
7777
pattern3 := `{"bands": { "members": { "given": [ "Wata" ], "role": [ "guitar" ] } } }`
78-
m := NewCoreMatcher()
79-
err := m.AddPattern("Mick strummer", pattern1)
78+
m := newCoreMatcher()
79+
err := m.addPattern("Mick strummer", pattern1)
8080
if err != nil {
8181
t.Error(err.Error())
8282
}
83-
err = m.AddPattern("Wata drums", pattern2)
83+
err = m.addPattern("Wata drums", pattern2)
8484
if err != nil {
8585
t.Error(err.Error())
8686
}
87-
err = m.AddPattern("Wata guitar", pattern3)
87+
err = m.addPattern("Wata guitar", pattern3)
8888
if err != nil {
8989
t.Error(err.Error())
9090
}
