Description
I would like to work toward creating a carryless multiplication builtin in zig. This is a fast instruction used in simdjson for example to convert binary quote boundaries from json strings into masks. In the following example, Q
is a 64 bit quote boundary marker. The last line is the result of a carryless multiplication between Q
and 0xfffffffffffffff
{ "\\\" Nam[{": [ 116,"\\\\" , 234, "true", false ], "t":"\\\"" }: input data
__1___1_____1________1____1________1____1___________1_1_1___11__ : Q
______1_____________________________________________________1___ : OD
__1_________1________1____1________1____1___________1_1_1____1__ : Q &=~OD
__1111111111_________11111_________11111____________11__11111___ : CLMUL(Q,~0)
In simdjson this is known as prefix_xor
and is implemented here:
Here are some references to this instruction in the zig repo:
The llvm x86 intrinsic is llvm.x86.pclmulqdq
Name ideas:
@mulCarryless()
@mulWithoutCarry()
I hope to use this in my simdjson port to get rid off hacky llvm intrinsic calls such as the following which may not be possible in stage 2:
@"llvm.x86.pclmulqdq"(@bitCast(i64x2, a), @bitCast(i64x2, b), 0)
Related to #903
If accepted, I'm not sure where I would begin. If anyone can suggest a similar builtin which uses different intrinsics per platform (and a custom implementation on arm) , perhaps i can follow its implementation.