Slang LLVM Targets

The LLVM targets are capable of creating LLVM IR and object code for arbitrary target triples (<machine>-<vendor>-<os>, e.g. x86_64-unknown-linux). This allows for highly performant and debuggable Slang code on almost any platform, as long as an LLVM backend exists for it.

The current state is highly experimental and many features are missing. For now, the focus is heavily on CPU targets.

Targets

The HOST / SHADER split from the CPU target applies here as well.

The following targets always use LLVM:

  • llvm-ir / SLANG_HOST_LLVM_IR generates LLVM IR in the text representation, suitable for free-standing functions.
  • llvm-shader-ir / SLANG_SHADER_LLVM_IR generates LLVM IR for compute shader entry points.

The following targets use LLVM when -emit-cpu-via-llvm or EmitCPUMethod=SLANG_EMIT_CPU_VIA_LLVM is specified:

  • host-object-code / SLANG_HOST_OBJECT_CODE generates position-independent object code, which can be linked into an executable or a static or dynamic library.
  • shader-object-code / SLANG_OBJECT_CODE generates object code for compute shader entry points.
  • SLANG_HOST_HOST_CALLABLE and SLANG_SHADER_HOST_CALLABLE JIT compile the module.

Support for exe / SLANG_HOST_EXECUTABLE and sharedlib / SLANG_HOST_SHARED_LIBRARY may be added later, once the LLVM target has stabilized.

For compiling to platforms other than the current CPU running the Slang compiler, the following options are provided:

  • -llvm-target-triple <target-triple> sets the target triple. The default is the host machine's triple.
  • -llvm-cpu <cpu-name> sets the target CPU, similar to Clang's -mcpu=<cpu-name>.
  • -llvm-features <features> sets the available features, similar to llc's -mattr=<features>.

Features

  • Compile stand-alone programs in Slang for platforms supported by LLVM
  • Focus on memory layout correctness: type layouts such as the scalar layout are handled correctly
  • Does not depend on external compilers (although it currently depends on external linkers!)
  • Works well with debuggers!

Standalone programs

You can write functions decorated with export to make them visible from the resulting object code, and __extern_cpp to unmangle their names. So, for a standalone Slang application, the entry point is:

export __extern_cpp int main(int argc, NativeString* argv)
{
    // Do whatever you want here!
    return 0;
}

To cross-compile, you can use -llvm-target-triple <target-triple>. For now, you'll need to compile into an object file and use a compiler or linker to turn that into an executable, e.g. with clang main.o -o main.exe.

Application Binary Interface

This section defines the ABI rules which code generated by the LLVM target follows and expects of external code calling into it.

In the default type layout, vectors are aligned to the next power of two, structures are aligned and padded up to the largest alignment among their fields, and booleans occupy a single byte.
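The default layout rules can be sketched as a small size and alignment calculator. This is an illustrative sketch only, not code from the Slang compiler: it assumes that "next power of two" refers to a vector's total byte size, and that struct fields are placed in declaration order at the next suitably aligned offset. The names `next_pow2`, `vector_alignment`, `align_up`, `FieldLayout` and `struct_size` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next power of two (n >= 1). */
size_t next_pow2(size_t n)
{
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* ASSUMPTION: a vector's alignment is its total byte size rounded up to a
 * power of two, e.g. a 3 x 4-byte vector (12 bytes) is aligned to 16. */
size_t vector_alignment(size_t elem_size, size_t elem_count)
{
    return next_pow2(elem_size * elem_count);
}

/* Round offset up to a multiple of align (align must be a power of two). */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Size and alignment of one struct field (hypothetical helper type). */
typedef struct {
    size_t size;
    size_t align;
} FieldLayout;

/* Place each field at the next aligned offset, then pad the struct up to
 * the largest alignment among its fields. */
size_t struct_size(const FieldLayout *fields, size_t count)
{
    size_t offset = 0;
    size_t max_align = 1;
    for (size_t i = 0; i < count; ++i) {
        offset = align_up(offset, fields[i].align) + fields[i].size;
        if (fields[i].align > max_align)
            max_align = fields[i].align;
    }
    return align_up(offset, max_align);
}
```

Under these assumptions, a struct holding a one-byte boolean followed by a two-component float vector (8 bytes, aligned to 8) occupies 16 bytes: the vector starts at offset 8 and the struct needs no extra tail padding.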

If you specify a different layout with flags like -fvk-use-c-layout or -fvk-use-scalar-layout, all structure and array types on the stack and heap will follow those layout rules.

Types and resources

  • StructuredBuffer and ByteAddressBuffer are stored as { Type* data; intptr_t size; }, where size is the number of elements in data.

  • Vectors are passed as LLVM vector types; there's no direct equivalent in standard C or C++.

  • Matrix types are lowered into arrays of vectors. Column and row major matrices are supported as normal.
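On the C side, the buffer representation described above could be mirrored as follows. This is a minimal sketch for a buffer of `float` elements; the type name `FloatBuffer` and the helper `sum_buffer` are hypothetical and are not part of Slang.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C mirror of a StructuredBuffer<float>, following the
 * { Type* data; intptr_t size; } representation. */
typedef struct {
    float   *data;
    intptr_t size;   /* number of elements in data, not bytes */
} FloatBuffer;

/* Example consumer: sum all elements, staying within [0, size). */
float sum_buffer(const FloatBuffer *buf)
{
    assert(buf->size >= 0);
    float total = 0.0f;
    for (intptr_t i = 0; i < buf->size; ++i)
        total += buf->data[i];
    return total;
}
```

A host program would fill in `data` and `size` from its own storage before handing a pointer to the struct to code that expects a buffer.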

Aggregate parameters

Aggregates (structs and arrays) are always passed by reference in Slang's LLVM emitter. This stems from LLVM not handling aggregates correctly in calling conventions, which would require every frontend to painstakingly reimplement the same per-target logic to get full C compatibility. Other than that, the target platform's C calling conventions are followed.

This means that if you declare a function like this in Slang:

export __extern_cpp MyStruct func(MyStruct val);

It would have the following signature in C:

void func(const MyStruct *val, MyStruct *returnval);

In other words, aggregate parameters are turned into pointers and aggregate return values are turned into an additional pointer-typed parameter at the end of the parameter list.
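From a C caller's point of view, a thin wrapper can restore the by-value signature. The sketch below is only an illustration: the fields of `MyStruct` are made up, and `func` is given a stand-in C body so the example compiles on its own. In a real build, `func` would be resolved from the Slang-generated object file instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fields; the real MyStruct is whatever the Slang code declares. */
typedef struct {
    float x;
    float y;
} MyStruct;

/* Stand-in for the Slang-generated symbol, using the pointer-based signature.
 * This placeholder body just swaps the two fields. */
void func(const MyStruct *val, MyStruct *returnval)
{
    assert(val != NULL && returnval != NULL);
    returnval->x = val->y;
    returnval->y = val->x;
}

/* Hypothetical convenience wrapper that gives C callers the by-value shape
 * of the original Slang declaration. */
MyStruct func_by_value(MyStruct val)
{
    MyStruct result;
    func(&val, &result);   /* parameter by pointer, return value via last pointer */
    return result;
}
```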

C foreign functions

Because of LLVM's aggregate parameter passing limitation, calling arbitrary C functions from Slang is complicated: a hypothetical binding generator would need to generate calling convention adapter functions. Such a generator would be a useful tool to include, but remains future work.
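Until such a generator exists, an adapter can be written by hand on the C side. The following is a sketch of what one adapter might look like, assuming an existing C function that takes and returns a struct by value. `Vec2`, `vec2_add` and `vec2_add_adapter` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical aggregate type used by an existing C library. */
typedef struct {
    float x;
    float y;
} Vec2;

/* Existing C function with by-value aggregate parameters and return value. */
Vec2 vec2_add(Vec2 a, Vec2 b)
{
    Vec2 r = { a.x + b.x, a.y + b.y };
    return r;
}

/* Hand-written adapter: aggregates become pointers and the return value
 * becomes an extra pointer parameter at the end, matching the convention
 * described under "Aggregate parameters". */
void vec2_add_adapter(const Vec2 *a, const Vec2 *b, Vec2 *returnval)
{
    assert(a != NULL && b != NULL && returnval != NULL);
    *returnval = vec2_add(*a, *b);
}
```

The adapter is compiled by a regular C compiler, which applies the platform's by-value struct rules when calling `vec2_add`, so only the pointer-based signature has to be visible to the other side.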

Limitations

The LLVM target support is a work in progress, and there are currently many limitations.

CPU targets only

Currently, support is limited to conventional CPU targets. The emitted LLVM IR is not compatible with LLVM's SPIR-V target, for example. At the very least, resource bindings and pointer address spaces would have to be accounted for to expand support to GPU targets. Slang already has native emitters for GPU targets, so you can use those instead of going through LLVM.

Missing compute shader features

  • No groupshared memory.
  • No barriers.
  • No atomics.
  • No wave operations.

These limitations stem from the fact that the work items / threads of a work group are currently run serially instead of actually running in parallel. This may be improved upon later.
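To see why serial execution rules out barriers, consider the following conceptual model of a work group dispatch. It is not the code Slang actually emits; `UInt3`, `KernelFn`, `run_group_serially` and `count_invocation` are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned x, y, z;
} UInt3;

/* Hypothetical per-thread kernel signature. */
typedef void (*KernelFn)(UInt3 group_id, UInt3 thread_in_group, void *user_data);

/* Run every thread of one work group, one after another. Each invocation
 * runs to completion before the next one starts, so a barrier inside the
 * kernel would have no other running thread to wait for. */
void run_group_serially(KernelFn kernel, UInt3 group_id, UInt3 group_size,
                        void *user_data)
{
    assert(kernel != NULL);
    for (unsigned z = 0; z < group_size.z; ++z)
        for (unsigned y = 0; y < group_size.y; ++y)
            for (unsigned x = 0; x < group_size.x; ++x)
                kernel(group_id, (UInt3){ x, y, z }, user_data);
}

/* Example kernel: count how many times it was invoked. */
void count_invocation(UInt3 group_id, UInt3 thread_in_group, void *user_data)
{
    (void)group_id;
    (void)thread_in_group;
    *(unsigned *)user_data += 1;
}
```

With a group size of 4 x 2 x 1, the kernel is simply called eight times in sequence.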

Limited vectorization

Vector operations are vectorized the way typical CPU math libraries (e.g. GLM) do it, as long as the target CPU has vector instructions. This is less efficient than the GPU model, where each work item / thread gets its own SIMD lane. This aspect may be improved upon later.

Compatibility with prior CPU Slang features

There are limitations regarding features of the existing C++ based CPU target. The following features are not yet supported:

  • String type.
  • new.
  • class.
  • COM interfaces.

The implementations of these rely on C++ features, and are not trivial to implement in LLVM. Support for them may be added later.

Missing types

  • No texture or sampler types.
  • No acceleration structures.

These are missing because of the limited scope of the initial implementation, and may be added later.

Gotchas

Out-of-bounds buffer access

Attempting to index past the end of any buffer type is undefined behaviour. It is not guaranteed to return zero as in HLSL; segmentation faults and memory corruption are more than likely to occur!
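Host code that touches the same buffers can guard its accesses explicitly. The sketch below emulates the HLSL behaviour of returning zero for out-of-range reads. It assumes a `float` buffer laid out as described under "Types and resources", and the names `FloatBuffer` and `load_or_zero` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C mirror of a float buffer: { Type* data; intptr_t size; }. */
typedef struct {
    float   *data;
    intptr_t size;   /* number of elements in data */
} FloatBuffer;

/* Return the element at index, or 0.0f when the index is out of range.
 * The bounds check happens before any memory is touched. */
float load_or_zero(const FloatBuffer *buf, intptr_t index)
{
    assert(buf != NULL && buf->size >= 0);
    if (index < 0 || index >= buf->size)
        return 0.0f;
    return buf->data[index];
}
```

This only protects accesses that go through the helper; indexing inside the Slang code itself remains unchecked.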