Description
The second index was dropped by #64 (#30) because simplifying the lex function proved quite a bit faster. My reasoning was that we have to validate the input anyway, and domain names (very likely the bulk of the data) are parsed like strings, using a scalar loop to calculate label lengths. However, now that the domain name parser is being optimized (#66, thanks @lemire!), it seems that assumption may no longer hold.
Tests showed that the data dependencies and branches in the lex function reduced scanner performance by about 40%. The branches and dependencies were required because all indexes were written to the same tape (vector). Now that we know most parse functions benefit from having the length available (mentioned by both @lemire and @aqrit on multiple occasions), it may be worth looking into this again. It would also simplify parsing of fields that allow both contiguous and quoted character strings (domain names and text).
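For context, a hypothetical illustration (not the old simdzone code) of why a single shared tape forces branches: with starts and delimiters interleaved, each pop needs a data-dependent branch to classify the next entry, and the cursor update depends on the branch outcome, serializing the loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified delimiter test, for illustration only. */
static inline int is_delimiter(char c)
{
  return c == ' ' || c == '\t' || c == '\n' || c == '\0';
}

typedef struct { const char *data; size_t length; } token_t;

/* Pop one token from a single interleaved tape. The branch on the class of
 * the next tape entry is taken per token and feeds the cursor update. */
static token_t lex_interleaved(
  const char *input, const uint32_t *tape, size_t *cursor)
{
  const uint32_t start = tape[(*cursor)++];
  uint32_t end = start;
  if (is_delimiter(input[tape[*cursor]]))  /* data-dependent branch */
    end = tape[(*cursor)++];
  return (token_t){ input + start, end - start };
}
```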
The new plan is to have two separate tapes: one containing the start indexes and one containing the delimiter indexes. The counts should generally be the same, so we select the bigger count and write that many indexes, simdjson-fashion, to both vectors (the CPU should handle this in parallel). Then, when we encounter a CONTIGUOUS or QUOTED token, the delimiter tape is guaranteed to hold the delimiting index, so we can quickly calculate the length and pop the index off the stack without adding branches. All of this assumes that writing out two indexes does not add too much overhead.
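A minimal sketch of what the branch-free dual-tape write could look like (hypothetical names, not the actual implementation; it assumes the classifier produces 64-bit masks per 64-byte block, as in simdjson):

```c
#include <stddef.h>
#include <stdint.h>
#include <immintrin.h> /* _tzcnt_u64, _mm_popcnt_u64 (haswell has BMI1/POPCNT) */

/* Write the start and delimiter indexes found in one 64-byte block to two
 * separate tapes. start_bits/delim_bits are the classifier's bitmasks for
 * the block at offset base; *start_count/*delim_count track tape sizes. */
static inline void write_indexes(
  uint32_t *start_tape, size_t *start_count,
  uint32_t *delim_tape, size_t *delim_count,
  uint64_t start_bits, uint64_t delim_bits, uint32_t base)
{
  const size_t n_start = (size_t)_mm_popcnt_u64(start_bits);
  const size_t n_delim = (size_t)_mm_popcnt_u64(delim_bits);
  /* Select the bigger count and write that many indexes to both tapes.
   * Surplus slots receive base + 64 (_tzcnt_u64(0) == 64) and are simply
   * overwritten by the next block, so the loop body is branch-free. */
  const size_t count = n_start > n_delim ? n_start : n_delim;

  for (size_t i = 0; i < count; i++) {
    start_tape[*start_count + i] = base + (uint32_t)_tzcnt_u64(start_bits);
    delim_tape[*delim_count + i] = base + (uint32_t)_tzcnt_u64(delim_bits);
    start_bits &= start_bits - 1; /* clear lowest set bit */
    delim_bits &= delim_bits - 1;
  }
  /* Advance each tape only by its true count. */
  *start_count += n_start;
  *delim_count += n_delim;
}
```

The two stores per iteration have no dependency on each other, which is why the CPU should be able to execute them in parallel.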
Initial results look promising:
Current main in release mode:
```
$ time ./zone-bench lex ../../zones/com.zone
Selected target haswell
Lexed 1405612413 tokens

real    0m7.737s
user    0m6.595s
sys     0m1.132s
```
Quick hack in release mode:
```
$ time ./zone-bench lex ../../zones/com.zone
Selected target haswell
Lexed 1405612413 tokens

real    0m8.333s
user    0m7.111s
sys     0m1.212s
```
Note that yes, scanning is slower, but nowhere near 40% (roughly 8% on wall-clock time here). To see whether it's actually viable, I'll need to update some functions to leverage the length. The plan is to update RRTYPE and name parsing; we should see an improvement when actually parsing all the data.
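For illustration, a length-aware RRTYPE lookup could gate candidates on length before comparing bytes. The table and function names below are hypothetical, not simdzone's actual code:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { const char *name; size_t length; uint16_t code; } type_entry_t;

/* Tiny illustrative subset of the RRTYPE table. */
static const type_entry_t types[] = {
  { "A", 1, 1 }, { "NS", 2, 2 }, { "SOA", 3, 6 }, { "TXT", 3, 16 },
};

/* With the token length known from the two tapes, each candidate is rejected
 * by a length compare or confirmed by a single memcmp; no scalar scan for
 * the delimiter is needed. Returns 1 on success, 0 if the type is unknown. */
static int parse_rrtype(const char *token, size_t length, uint16_t *code)
{
  for (size_t i = 0; i < sizeof(types) / sizeof(types[0]); i++) {
    if (types[i].length == length &&
        memcmp(types[i].name, token, length) == 0) {
      *code = types[i].code;
      return 1;
    }
  }
  return 0;
}
```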