Does Meili officially support control characters inside strings? #744
LukasKalbertodt started this conversation in Feedback & Feature Proposal
Replies: 1 comment 5 replies
-
Hello @LukasKalbertodt, yes, control characters are ignored during the tokenization process, meaning that the search never sees any of these characters.
-
I tested this (with Meili v1.4.2):
And it works. I can also retrieve the document again, with the null byte still inside the `desc` field. I can even search for `foobar` and get a match (`start: 0, length: 7`) inside `desc`.

My question is: does this work by accident, or can I rely on Meili working with control characters inside strings?
And a follow-up question: will this slow down Meili? Deserializing JSON with escape codes means that a new string has to be allocated, since one can't simply reference a part of the input file.
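To illustrate the allocation concern, here is a minimal sketch using Python's standard `json` module. The payloads are hypothetical (only the `desc` field name comes from the post), and Python's parser always allocates new strings anyway, so this only shows why an escaped string cannot be a slice of the input. Whether Meilisearch's own deserializer borrows from the input buffer at all is an assumption, not something confirmed in this thread.

```python
import json

# Hypothetical payloads; only the `desc` field name comes from the post.
plain_raw = '{"desc": "foobar"}'
escaped_raw = '{"desc": "foo\\u0000bar"}'  # \u0000 is the JSON escape for a null byte

plain = json.loads(plain_raw)["desc"]
escaped = json.loads(escaped_raw)["desc"]

# Without escapes, the decoded value appears verbatim in the input, so a
# borrowing (zero-copy) parser could in principle return a slice of it.
assert plain in plain_raw

# With an escape, six input characters decode to a single character, so the
# decoded value is not a contiguous slice of the input and needs its own buffer.
assert escaped not in escaped_raw
assert len(escaped) == 7 and escaped[3] == "\x00"
```

The extra cost applies only to strings that actually contain escapes, so it would likely matter only if such strings are common in the indexed documents.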
EDIT: though it seems like Meili treats `\0` as a normal word character. For example, with the query "bar", the document is not found (due to Meili performing prefix search). All control characters should probably be treated the same as `". "`, i.e. a strong separator token. Would such a change be welcome in Meili?
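The difference between the two behaviours can be sketched with a toy tokenizer in Python. This is not Meilisearch's tokenizer: `tokenize`, `prefix_match`, and the two policy names are hypothetical and exist only for illustration. The "ignore" policy drops control characters, and the "separator" policy treats them as token boundaries, as proposed above.

```python
import unicodedata


def is_control(ch: str) -> bool:
    """True for Unicode control characters (category Cc) that are not whitespace."""
    return unicodedata.category(ch) == "Cc" and not ch.isspace()


def tokenize(text: str, control_policy: str) -> list[str]:
    """Toy whitespace tokenizer with two hypothetical control-character policies.

    "ignore":    drop control characters, gluing their neighbours together.
    "separator": treat control characters as token boundaries.
    """
    if control_policy not in ("ignore", "separator"):
        raise ValueError(f"unknown policy: {control_policy!r}")
    replacement = "" if control_policy == "ignore" else " "
    cleaned = "".join(replacement if is_control(ch) else ch for ch in text)
    return cleaned.lower().split()


def prefix_match(tokens: list[str], query: str) -> bool:
    """True if any token starts with the query (a simplified prefix search)."""
    return any(token.startswith(query.lower()) for token in tokens)


doc = "foo\x00bar"  # a string with an embedded null byte

ignored = tokenize(doc, "ignore")        # the null byte is dropped
separated = tokenize(doc, "separator")   # the null byte splits the word

print(ignored, prefix_match(ignored, "bar"))
print(separated, prefix_match(separated, "bar"))
```

Under the "ignore" policy the document yields a single token, so a prefix search for "bar" misses it while "foobar" matches, which is consistent with the behaviour reported above. Under the "separator" policy "bar" becomes its own token and is found.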