Introduce unsigned narrowing with signed inputs #95
Thanks for the PR. Could you add a comment here with mappings to the exact instructions on different architectures? This information is currently split across multiple threads, so having it here for folks to review may be useful.
Instructions corresponding to narrowing of signed values to unsigned results are […]. For more context on the instructions already added, which are not in scope of this change: signed-to-signed narrowing is […].
Given that this has been contentious and the discussion in PR #91, I suggest we put this to vote and resolve this after a reasonable time to vote, possibly in next week's sync meeting. I think the options here are -
If you have opinions about the semantics of narrowing operations, please vote!
FYI I don't see a way to react to your post with 👌. I propose using 👀 instead. I think there's a clear case for the signed->unsigned narrowing instructions, since there's lots of code out there that uses the corresponding SSE instruction, and NEON has a direct equivalent. The case for unsigned->unsigned narrowing is less clear. I can imagine it being useful, and it would be hard to reproduce the optimal SSE code-gen for it without a dedicated instruction. It would be nice to have a kernel that benefits from the unsigned->unsigned narrowing instructions to prove that. I'm leaning toward 👌 (👀) but I'd also be fine with 🚀.
Oops, thanks for pointing that out. Updated to use 👀 instead of 👌.
I prefer the instruction names from this PR, as they make inputs and outputs more explicit; aside from that, I am fine with #91 as well. Also leaning towards this one 👀, but would not oppose removing purely unsigned instructions (with renaming) 🚀
I've been out this past week so haven't had a chance to follow up with the folks from the 2017 meeting where these were discussed. As there has been general consensus that we should include the signed operations, I'd be in favor of merging #91, and I'll take an AI to dig up the slides and follow up about unsigned operations. @penzn, the naming could be better, but if we don't end up merging unsigned conversions, renaming is possibly not needed? Approved #91 for merge as the signed narrowing operations have the most consensus, and there are concerns about use cases for the unsigned versions of these operations.
Use opcodes from unsigned narrowing for `unsigned narrowing with unsigned inputs`, as those were the semantics assumed by the spec and implemented in the runtimes. Unsigned narrowing with signed inputs is intended to be the new instruction introduced in the previous change.
Closing as we merged the alternative in #91
Narrowing signed inputs to unsigned outputs is widely used in image filtering, when output of a transformation with signed integer coefficients is packed into an array of unsigned integers. The operation is supported in hardware on both x86(64) and ARM.
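To make the image-filtering use case concrete, here is a minimal scalar sketch in Python. The function name `sharpen_row_to_u8`, the 3-tap kernel `(-1, 3, -1)`, and the edge-replication policy are all assumptions chosen for illustration; none of them come from this PR. The point is only that the filter's intermediate results are signed and can fall outside 0..255, so packing them back into unsigned pixels needs a signed-to-unsigned saturating narrow.

```python
def sharpen_row_to_u8(row, kernel=(-1, 3, -1)):
    """Apply a 3-tap filter with signed coefficients to a row of 8-bit
    pixels and pack the result back into unsigned 8-bit values.

    The kernel and the edge handling (replicating the border pixel) are
    illustrative assumptions, not anything specified by the proposal.
    """
    n = len(row)
    out = []
    for i in range(n):
        left = row[max(i - 1, 0)]
        right = row[min(i + 1, n - 1)]
        # Signed intermediate: may be negative or larger than 255.
        acc = kernel[0] * left + kernel[1] * row[i] + kernel[2] * right
        # Signed -> unsigned saturating narrow: clamp into 0..255.
        out.append(max(0, min(255, acc)))
    return out


pixels = [10, 200, 10, 250, 250]
print(sharpen_row_to_u8(pixels))  # [0, 255, 0, 255, 250]
```

A SIMD implementation would keep the intermediates in 16-bit signed lanes and perform the final clamp with a single narrowing instruction rather than a per-pixel `max`/`min`.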
This PR renames `i8x16.narrow_i16x8_u` to `i8x16.narrow_u16x8_u` and `i16x8.narrow_i32x4_u` to `i16x8.narrow_u32x4_u`, highlighting the unsigned-to-unsigned semantics; it adds two new opcodes for `i8x16.narrow_i16x8_u` and `i16x8.narrow_i32x4_u` for unsigned narrowing with signed inputs.
Instructions corresponding to narrowing of signed values to unsigned results are `VQMOVUN` on ARM and `PACKUSDW` on x86; signed to signed narrowing is `VQMOVN` on ARM and `PACKSSDW` on x86; unsigned to unsigned is also `VQMOVN` on ARM (using a different data type) without exact match in x86 SIMD.
For issue #94
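The difference between the two unsigned-result variants discussed in this thread can be sketched as a scalar reference model in Python. The function names below are hypothetical stand-ins for the 16-bit to 8-bit forms; this is a sketch of the semantics as described in this PR, not text from the spec. Both take two 8-lane inputs and produce 16 lanes, and they differ only in whether the same 16 input bits are read as signed or unsigned.

```python
def narrow_signed_to_u8(a, b):
    """Lanes are read as signed 16-bit; results saturate to 0..255.
    Models the signed-input, unsigned-output narrow."""
    return [max(0, min(255, x)) for x in list(a) + list(b)]


def narrow_unsigned_to_u8(a, b):
    """Lanes are read as unsigned 16-bit; results saturate to 0..255.
    Models the unsigned-input, unsigned-output narrow."""
    return [min(255, x) for x in list(a) + list(b)]


def as_unsigned16(x):
    """Reinterpret a signed 16-bit value's bit pattern as unsigned."""
    return x & 0xFFFF


signed_lanes = [-1, 0, 100, 255, 256, 300, -32768, 32767]
same_bits_unsigned = [as_unsigned16(x) for x in signed_lanes]

# Identical bit patterns, different results for lanes with the top bit set:
print(narrow_signed_to_u8(signed_lanes, signed_lanes)[:8])
# [0, 0, 100, 255, 255, 255, 0, 255]
print(narrow_unsigned_to_u8(same_bits_unsigned, same_bits_unsigned)[:8])
# [255, 0, 100, 255, 255, 255, 255, 255]
```

The two models disagree only on lanes whose top bit is set: a signed read sees a negative value and clamps to 0, while an unsigned read sees a large value and clamps to 255.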