ESQL: TO_LOWER process all values #124676
Make `TO_LOWER` and `TO_UPPER` process all values they receive. This PR is quite large because it borrows a lot of code from the regular evaluator generator to generate conversions so we can use the `Locale`. That change propagates to the order of some parameters, to the `toString`, and to a few more places. Closes elastic#124002
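To illustrate the intended behavior, here is a minimal sketch in plain Java. It is not the generated evaluator code from this PR: it assumes a multivalued field can be modeled as a `List<String>`, and the class and method names (`ToLowerSketch`, `toLowerAll`) are hypothetical. It shows the two points the description makes: every value is converted, and the conversion takes a `Locale`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the evaluator generated in this PR.
class ToLowerSketch {
    /** Lowercases every value of a (possibly multivalued) field with the given locale. */
    static List<String> toLowerAll(List<String> values, Locale locale) {
        if (values == null) {
            return null; // a null field stays null
        }
        List<String> result = new ArrayList<>(values.size());
        for (String value : values) {
            // Each value is converted, not just the first one.
            result.add(value.toLowerCase(locale));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(toLowerAll(List.of("Foo", "BAR"), Locale.ROOT));
    }
}
```

The locale matters for some languages: with a Turkish locale, Java lowercases `I` to the dotless `ı` rather than `i`, which is why the conversion cannot simply ignore it.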
Pinging @elastic/es-analytical-engine (Team:Analytics)
Hi @nik9000, I've created a changelog YAML for you.
I believe this one doesn't require anything else around push down. We push down `TO_LOWER`-ed comparisons to Lucene, but all of those comparisons need single-valued fields. I'm going to have a look at how that's done to make sure there's nothing to do.
Looks like the PR that enabled push down is #118870.
LGTM
This doesn't really change the benchmarks:
Well, it does, but the difference is well within the error.
I double-checked push down and it all looks good. The equality still needs single values.
Speed up the `TO_IP` method by converting directly from UTF-8 encoded strings to the IP encoding. Previously we did:

```
utf-8 -> String -> InetAddress -> ip encoding
```

In a step towards solving elastic#125460 this creates three IP parsing functions: one that rejects leading zeros, one that interprets leading zeros as decimal numbers, and one that interprets leading zeros as octal numbers. IPs have historically been parsed in all three of those ways. This plugs the "rejects leading zeros" parser into `TO_IP` because that's the behavior it had before.

Here is the performance:

```
Benchmark                Score    Error  Units
leadingZerosAreDecimal  14.007 ±  0.093  ns/op
leadingZerosAreOctal    15.020 ±  0.373  ns/op
leadingZerosRejected    14.176 ±  3.861  ns/op
original                32.950 ±  1.062  ns/op
```

So the new parsers take roughly 45% of the time the original did, which is a bit more than twice as fast.

This includes a big chunk of elastic#124676, but not the behavior change, just the code that allowed it.
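As a rough picture of what parsing straight from UTF-8 bytes can look like, here is a minimal sketch of a "rejects leading zeros" parser. It is an assumption-laden illustration, not the code from the commit: it handles only dotted-quad IPv4, returns the four raw address bytes rather than the encoding the commit targets, and the names (`Ipv4FromUtf8Sketch`, `parseRejectingLeadingZeros`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; IPv4 dotted-quad input, no IPv6 support.
class Ipv4FromUtf8Sketch {
    /**
     * Parses an IPv4 address from UTF-8 bytes without building a String.
     * Returns the 4 address bytes, or null if the input is invalid.
     * Octets with leading zeros (such as "01") are rejected.
     */
    static byte[] parseRejectingLeadingZeros(byte[] utf8, int offset, int length) {
        byte[] out = new byte[4];
        int octet = 0;  // index of the octet being parsed
        int value = 0;  // numeric value of the current octet so far
        int digits = 0; // digits seen in the current octet
        for (int i = offset; i < offset + length; i++) {
            byte b = utf8[i];
            if (b == '.') {
                if (digits == 0 || octet == 3) {
                    return null; // empty octet or too many octets
                }
                out[octet++] = (byte) value;
                value = 0;
                digits = 0;
            } else if (b >= '0' && b <= '9') {
                if (digits > 0 && value == 0) {
                    return null; // leading zero, e.g. "01" or "00"
                }
                value = value * 10 + (b - '0');
                if (value > 255) {
                    return null; // octet out of range
                }
                digits++;
            } else {
                return null; // not a digit or a dot
            }
        }
        if (octet != 3 || digits == 0) {
            return null; // too few octets or trailing dot
        }
        out[3] = (byte) value;
        return out;
    }

    public static void main(String[] args) {
        byte[] in = "192.168.0.1".getBytes(StandardCharsets.UTF_8);
        System.out.println(parseRejectingLeadingZeros(in, 0, in.length) != null);
    }
}
```

The "decimal" and "octal" variants described above would differ only in the leading-zero branch: instead of returning `null`, they would keep accumulating the octet in base 10 or base 8 respectively.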