-
Notifications
You must be signed in to change notification settings - Fork 24.1k
Implement DIFF, DIFF1, ANDOR and ONE for BITOP #13898
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@minchopaskal should we consider using SIMD to optimize these new commands? |
@sundb thank you for the suggestion. Actually, I though about that too but it seems the code is written in such way so that the compiler specifically can generate SIMD code. I checked the disassembly of arm64 and x86 which both generate simd instructions. The new commands have implementations for the optimized path. |
@minchopaskal In #13558, we found that even in the case of support SIMD, the compiler will not generate the corresponding code, maybe we can try to take a command to verify it. |
Added hand-written SIMD optimization (via AVX2 instruction set - 256bit register) for the BITOP operations including the new ops. BenchmarksThis commit actually includes two optimizations:
Repromachine: AWS m5.large instance (non-dedicated)
Repromachine: AWS m5.large instance (non-dedicated) NotesOne can see that AVX shows no benefit over SSE for the Another observation is the decrease of improvement when command is run with more keys - this happens for keys of big size. As we add more keys - there are more load instruction and I conjure they degrade the performance. Still - we gain a lot of speed over the compiler generated SSE. Final note - both source keys and destination memory are unaligned. SIMD load instructions are more effective when load/store operation is done over memory aligned to the size of the SIMD register. We can't possibly benefit from aligning source keys as it would incur too much copies, but aligned load instructions were tested nevertheless and showed no significant difference. |
Some more benchmarks.. I tried to run them with Some of the tests were the same as in above comment - 32bytes-16keys, 50Kb-16keys . This new benchmark show better results for them - I attribute that to the noise introduced by the fact tests were ran on non-dedicated AWS machines. Generally results look very good, besides 10mb case for small number of keys.. Speeds there are generally slow so I'm sure if the ~12% degradation is really meaningful. Sadly the AVX2 code couldn't optimize these cases I presume because of all the unaligned loads.
Repro |
@minchopaskal I tried XOR, OR, AND on my local, 2 keys, each can be any size, 100kb, 1mb etc. This is what I see performance wise: It feels like this PR changed something about how we loop and it helped with the performance even without AVX. Also, somehow AVX makes it slower on my local (I have i9-9880H). Did you compare it with unstable branch even without AVX? Perhaps, you can even try it on your local |
I did notice degradation with 2 keys, but only for large ones. Will try on local |
@tezc after we tested together and saw no actual degradation I redid the tests yet again on AWS JIC. Then for N=[32, 1000, 10000, 100000, 1000000, 10000000]
There is no degradation between unstable and current branch with |
🎉 Snyk checks have passed. No issues have been found so far.✅ security/snyk check is complete. No issues have been found. (View Details) ✅ license/snyk check is complete. No issues have been found. (View Details) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances the BITOP command by adding four new operators—DIFF, DIFF1, ANDOR, and ONE—to allow more expressive bitmap operations.
- Introduces new evaluation logic in the bit manipulation loop for the new operators.
- Updates the BITOP command configuration in JSON and command definitions to support the additional tokens.
- Adjusts and expands test cases to validate the behavior of all BITOP operations.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
File | Description |
---|---|
tests/unit/bitops.tcl | Updated tests and simulation logic for the new BITOP operators. |
src/commands/bitop.json | Added definitions for the new operators in the commands JSON file. |
src/commands.def | Extended the BITOP command argument configuration to support more tokens. |
Comments suppressed due to low confidence (1)
tests/unit/bitops.tcl:266
- The key name 'no-such_key' is inconsistent with the rest of the tests that use 'no-such-key'. Consider renaming it to 'no-such-key' for consistency.
r bitop one res7{t} no-such_key{t} a{t}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR adds 4 new operators to the
BITOP
command -DIFF
,DIFF1
,ANDOR
andONE
. They enable redis clients to atomically do non-trivial logical operations that are useful for checking membership of a bitmap against a group of bitmaps.DIFF
BITOP DIFF dest srckey1 srckey2 [key...]
Description
DIFF(X, A1, A2, ..., AN) = X ∧ ¬(A1 ∨ A2 ∨ ... ∨ AN), i.e the set bits of X that are not set in any of A1, A2, …, AN
NOTE
Command expects at least 2 source keys.
DIFF1
BITOP DIFF1 dest srckey1 srckey2 [key...]
Description
DIFF1(X, A1, A2, ..., AN) = ¬X ∧ (A1 ∨ A2 ∨ ... ∨ AN), i.e the bits set in one or more of A1, A2, …, AN that are not set in X
NOTE
Command expects at least 2 source keys.
ANDOR
BITOP ANDOR dest srckey1 srckey2 [key...]
Description
ANDOR(X, A1, A2, ..., AN) = X ∧ (A1 ∨ A2 ∨ ... ∨ AN), i.e the set bits of X that are also set in A1, A2, …, AN
NOTE
Command expects at least 2 source keys.
ONE
BITOP ONE dest key [key...]
Description
ONE(A1, A2, ..., AN) = X, where
if X[i] is the i-th bit of X then X[i] = 1 if and only if there is m such that A_m[i] = 1 and An[i] = 0 for all n != m, i.e bit X[i] is set only if it set in exactly one of A1, A2, ..., AN
Return value
As in all other
BITOP
operators return value for all the new ones is the number bytes of the longest key.EDIT:
Besides adding the new commands couple more changes were made: