Toolchain optimization of shuffles

We received a bug report on emscripten (https://github.com/emscripten-core/emscripten/issues/9340) because LLVM was combining shuffles a user had written as intrinsics, and V8 was therefore producing a slow pshufb instead of the pair of fast shuffle instructions the user had expected.
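To make the combining step concrete, here is a minimal sketch of how two back-to-back single-input byte shuffles collapse into one mask. It models a 16-lane vector as a plain Python list. The two masks and the helper names (`shuffle`, `combine`) are hypothetical illustrations, not the masks from the bug report and not LLVM's actual code.

```python
def shuffle(v, mask):
    """Single-input byte shuffle: output lane i takes input lane mask[i]."""
    return [v[m] for m in mask]


def combine(first, second):
    """Mask equivalent to applying `first` and then `second`."""
    return [first[j] for j in second]


# Two hypothetical user-written shuffles on a 16-byte vector.
swap_halves = list(range(8, 16)) + list(range(0, 8))  # swap the two 8-byte halves
swap_pairs = [i ^ 1 for i in range(16)]               # swap each adjacent byte pair

v = list(range(100, 116))

# What the user wrote: two shuffles in sequence.
two_step = shuffle(shuffle(v, swap_halves), swap_pairs)

# What a shuffle combiner produces: one shuffle with a merged mask.
merged = combine(swap_halves, swap_pairs)
one_step = shuffle(v, merged)

assert two_step == one_step
```

The merged mask is semantically equivalent, but it no longer looks like either of the regular patterns the user wrote, so an engine may have to fall back to a general byte shuffle to implement it.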

This raises the question of how the toolchain should reason about WebAssembly shuffles. The reporting user simply wanted the toolchain to not mess with their shuffle intrinsics, but in https://reviews.llvm.org/D66983 I received pushback from members of the LLVM community saying that shuffles should be optimized by the toolchain no matter what users write. The suggested fix is to bake knowledge of which WebAssembly shuffles will be fast into LLVM. The problem with this solution is that it puts engine- and platform-specific information in the WebAssembly backend, which will necessarily favor some platforms over others and generally breaks the WebAssembly abstraction layer. That is not a precedent I would like to set.

In https://github.com/WebAssembly/simd/issues/8#issuecomment-297110939 it is suggested that LLVM prefers not to mess with a user's shuffles unless it knows for sure that it can make them better. This is not inconsistent with the statement from Craig Topper on my LLVM patch that "x86 is aggressive about optimizing shuffles no matter where they came from" because the x86 backend can know for sure that many optimizations would be helpful.

The WebAssembly backend cannot know that a shuffle optimization will be helpful on all possible platforms. I therefore propose that the WebAssembly LLVM backend do neither of two things: perform optimizations on shuffles that we do not know for sure will be useful, or bake in target-specific information that punches through the WebAssembly abstraction and favors some platforms over others. This is consistent with the philosophy behind x86's optimization approach, but it looks very different in practice because it makes our shuffles essentially opaque, so I would like to hear the community's thoughts.
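As a rough sketch of what "only optimizations we know for sure will be useful" could mean, the following policy drops shuffles that are provably no-ops and leaves everything else exactly as the user wrote it. The function name `conservative_combine` and the choice of which rewrites count as always safe are assumptions made for illustration, not a description of the LLVM backend.

```python
IDENTITY = list(range(16))


def combine(first, second):
    """Mask equivalent to applying `first` and then `second`."""
    return [first[j] for j in second]


def conservative_combine(first, second):
    """Return the masks to emit for two back-to-back single-input shuffles.

    Only rewrites that cannot be slower on any engine are applied:
    identity shuffles are dropped, and a pair that cancels out is removed.
    Any other pair is left untouched (hypothetical policy).
    """
    masks = [m for m in (first, second) if m != IDENTITY]
    if len(masks) == 2 and combine(first, second) == IDENTITY:
        return []  # the two shuffles undo each other
    return masks


reverse = list(range(15, -1, -1))
swap_halves = list(range(8, 16)) + list(range(0, 8))
swap_pairs = [i ^ 1 for i in range(16)]

assert conservative_combine(reverse, reverse) == []
assert conservative_combine(IDENTITY, reverse) == [reverse]
assert conservative_combine(swap_halves, swap_pairs) == [swap_halves, swap_pairs]
```

Under this policy a pair of distinct, non-cancelling shuffles always survives as two shuffles, which is the "essentially opaque" behavior described above.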

cc @sunfishcode because this is a continuation of a conversation you were in on the LLVM patch.
