This repository was archived by the owner on Dec 22, 2021. It is now read-only.

Toolchain optimization of shuffles #118

@tlively

We received a bug report on emscripten (emscripten-core/emscripten#9340) because LLVM was combining shuffles a user had written as intrinsics, and V8 was therefore producing a slow pshufb instead of the pair of fast shuffle instructions the user had expected.
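To make the failure mode concrete, here is a minimal Python sketch of how two shuffles that each match a simple pattern can combine into one that matches neither. It models a two-input 16-byte shuffle in plain lists. The helper names (`shuffle`, `combine`, `looks_cheap`) and the two "cheap" patterns are hypothetical and chosen only for illustration; they are not LLVM's combining logic or V8's actual pattern matching.

```python
LANES = 16  # bytes per 128-bit vector


def shuffle(a, b, mask):
    """Two-input byte shuffle: index 0-15 selects from a, 16-31 from b."""
    src = list(a) + list(b)
    return [src[i] for i in mask]


def combine(m1, m2):
    """Single mask equivalent to shuffle(t, t, m2) with t = shuffle(a, b, m1)."""
    return [m1[i % LANES] for i in m2]


def is_interleave_low(mask):
    # Assumed pattern 1: interleave the low 8 bytes of a and b.
    return mask == [x for i in range(8) for x in (i, i + LANES)]


def is_lane32_permute(mask):
    # Assumed pattern 2: a permutation of whole 32-bit lanes of a.
    return all(
        mask[i] % 4 == 0
        and mask[i] < LANES
        and mask[i:i + 4] == list(range(mask[i], mask[i] + 4))
        for i in range(0, LANES, 4)
    )


def looks_cheap(mask):
    # Toy stand-in for an engine's pattern matcher (assumption, not V8's logic).
    return is_interleave_low(mask) or is_lane32_permute(mask)


# What the user wrote: an interleave followed by a 32-bit lane reversal.
m1 = [x for i in range(8) for x in (i, i + LANES)]
m2 = [12, 13, 14, 15, 8, 9, 10, 11, 4, 5, 6, 7, 0, 1, 2, 3]

# What a shuffle combiner would emit instead: one general byte shuffle.
combined = combine(m1, m2)

a = list(range(16))
b = list(range(100, 116))
two_step = shuffle(shuffle(a, b, m1), shuffle(a, b, m1), m2)
one_step = shuffle(a, b, combined)
```

Here `two_step` and `one_step` hold the same bytes, so the combination is semantically valid, but `looks_cheap(m1)` and `looks_cheap(m2)` are both true while `looks_cheap(combined)` is false. Under these assumed patterns, an engine would have to fall back to a general byte shuffle for the combined mask.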

This raises the question of how the toolchain should reason about WebAssembly shuffles. The reporting user simply wanted the toolchain to not mess with their shuffle intrinsics, but in https://reviews.llvm.org/D66983 I received pushback from members of the LLVM community saying that shuffles should be optimized by the toolchain no matter what users write. The suggested fix is to bake knowledge of which WebAssembly shuffles will be fast into LLVM. The problem with this solution is that it puts engine- and platform-specific information in the WebAssembly backend, which will necessarily favor some platforms over others and generally breaks the WebAssembly abstraction layer. That is not a precedent I would like to set.

In #8 (comment) it is suggested that LLVM prefers not to mess with a user's shuffles unless it knows for sure that it can make them better. This is not inconsistent with the statement from Craig Topper on my LLVM patch that "x86 is aggressive about optimizing shuffles no matter where they came from" because the x86 backend can know for sure that many optimizations would be helpful.

The WebAssembly backend cannot know that a shuffle optimization will be helpful on all possible platforms. I therefore propose that the WebAssembly LLVM backend neither perform shuffle optimizations that we do not know for sure will be useful, nor bake in target-specific information that punches through the WebAssembly abstraction and favors some platforms over others. This is consistent with the philosophy behind x86's optimization approach, but it looks very different in practice because it makes our shuffles essentially opaque, so I would like to hear the community's thoughts.

cc @sunfishcode because this is a continuation of a conversation you were in on the LLVM patch.
