Releases: Olympus-HPC/proteus
v2026.03.0
Highlights
Deployment
- Link only needed LLVM/Clang libraries
- Avoid static initialization errors when consumer separately links with LLVM by not exporting LLVM symbols from the Proteus runtime library
- Do not install internal implementation API headers by default, add cmake option PROTEUS_INSTALL_IMPL_HEADERS (default off) to optionally enable
- Update spack package recipe
User interface
- Deprecate proteus::init(), proteus::finalize() (no-ops). Proteus automatically handles initialization and finalization within the runtime library
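As an illustration of what automatic initialization and finalization inside a library can look like, here is a minimal sketch in standard C++. It is not the Proteus implementation: the `sketch` namespace, `RuntimeState`, `isReady()`, and the counters are hypothetical names introduced only for this example, and the technique shown (a namespace-scope object whose constructor and destructor bracket the program's lifetime) is just one common way to get this behavior.

```cpp
#include <cassert>

namespace sketch {

// Lifecycle counters (hypothetical). Plain ints stay valid during
// static teardown, so the destructor below can safely touch them.
static int InitCount = 0;
static int FinalizeCount = 0;

// Hypothetical stand-in for a runtime's internal state.
struct RuntimeState {
  RuntimeState() { ++InitCount; }      // runs before main()
  ~RuntimeState() { ++FinalizeCount; } // runs after main() returns
  bool Ready = true;
};

// Namespace-scope object: constructed at program start-up and destroyed
// at exit, with no explicit call from user code.
static RuntimeState State;

// Legacy entry points kept as no-ops so existing callers still compile.
[[deprecated("initialization is automatic")]] void init() {}
[[deprecated("finalization is automatic")]] void finalize() {}

// Reports whether the runtime state has been set up.
bool isReady() { return State.Ready; }

} // namespace sketch
```

In a sketch like this, user code that still calls `init()` or `finalize()` keeps compiling (with a deprecation warning) and behaves the same as code that never calls them.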
Caching
- Support partially centralized MPI cache with server-client protocol for writing cache entries, distributed reading from networked FS
- Support fully centralized MPI cache with server-client protocol for both reading/writing cache entries
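The difference between the two cache modes comes down to which operations go through the server. The sketch below encodes only that routing decision as described in the two items above; it does not use MPI, and the `cache_sketch` namespace, the enums, and `route()` are hypothetical names that are not part of the Proteus API.

```cpp
#include <cassert>

namespace cache_sketch {

// The two MPI cache modes named in the release notes.
enum class Mode { PartiallyCentralized, FullyCentralized };

// Cache operations a client rank can request.
enum class Op { Read, Write };

// Who services the operation (hypothetical labels for this sketch).
enum class Handler { Server, NetworkedFilesystem };

// Decide who services a cache operation under each mode.
Handler route(Mode M, Op O) {
  // Writes use the server-client protocol in both modes.
  if (O == Op::Write)
    return Handler::Server;
  // Reads go through the server only in the fully centralized mode;
  // otherwise each rank reads directly from the networked filesystem.
  return M == Mode::FullyCentralized ? Handler::Server
                                     : Handler::NetworkedFilesystem;
}

} // namespace cache_sketch
```

Funnelling writes through a single server in both modes avoids concurrent writers on the shared filesystem, while the partially centralized mode keeps reads distributed.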
DSL
- Support launching using Dim3
- Support references
- Support defining/declaring multiple variables
- Support functional constructs
- Introduce LLVMCodeBuilder component for backend code generation
CPP
- Support launching using Dim3
- Propagate ExtraArgs for CPP compilation to runtime template instantiation
Bugfixes
- Fix registration for CUDA RDC linked binaries in runtime library
What's Changed
- DSL/CPP: Kernel Launch Using Dim3 by @ZwFink in #380
- PJ-DSL: Reference Semantics by @ZwFink in #368
- Link only needed from LLVM and privatize by @ggeorgakoudis in #384
- Centralized MPI Storage Cache by @ZwFink in #354
- PJ-DSL: Define/Declare Multiple Vars by @ZwFink in #386
- Fix LLVM static initialization with shared proteus by @koparasy in #374
- Use pdebug for matrix ci by @ggeorgakoudis in #390
- [Refactor][NFC] Rename src/lib to src/runtime by @ggeorgakoudis in #396
- Separate headers by @ggeorgakoudis in #397
- Replace lazy ensureInitialized patterns with explicit initialization by @ZwFink in #394
- Show linter output only in step summary by @ggeorgakoudis in #399
- Add install prefix and hostname to CUDA setup script by @ZwFink in #401
- Sanitize initialization and finalization by @ggeorgakoudis in #402
- [NFC] Separate linked libraries in the cmake for the runtime by @ggeorgakoudis in #404
- Completely Centralized MPI Cache by @ZwFink in #400
- Kernel JIT Tracing by @ZwFink in #403
- Make saveToFileAtomic safe for multiple MPI cache writers by @ZwFink in #406
- PJ-DSL: Functional Constructs by @ZwFink in #405
- Ensure instrumentation of all kernel function registrations by @koparasy in #407
- CPPJitModule: Make Instantiate Propagate ExtraArgs by @ZwFink in #412
- Fix registration for CUDA RDC linked binaries by @ggeorgakoudis in #413
- Remove CUDA runtime dep and fix proteus linkage issues by @koparasy in #385
- MPI Caches: Make Comm Thread Sleep Between Probes, Simplify Finalization by @ZwFink in #414
- Update spack installation by @ggeorgakoudis in #417
- Fix gitlab CI on matrix by @ggeorgakoudis in #420
- [Refactor][DSL] Introduce LLVMCodeBuilder by @ggeorgakoudis in #419
Full Changelog: v2026.01.0...v2026.03.0
v2026.01.0
Proteus v2026.01.0
We are happy to announce Proteus v2026.01.0, the first versioned release of the project utilizing Calendar Versioning (CalVer) in the YYYY.MM.MICRO format.
About Proteus
Proteus is a programmable Just-In-Time (JIT) compiler based on LLVM designed to embed optimizing compilation directly into C/C++ applications. By leveraging runtime context—such as the actual values of variables during execution—Proteus specializes code on the fly for CUDA, HIP, and host CPUs, achieving optimizations that are impossible with static compilation alone.
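To make the idea of specializing on runtime values concrete, here is a conceptual sketch in plain C++. It does not use Proteus or LLVM: a lambda capture stands in for JIT compilation, and `spec_sketch`, `Kernel`, `specialize()`, `getOrSpecialize()`, and `BuildCount` are hypothetical names introduced only for this example. The point is the shape of the technique: build a variant with the runtime value baked in, and reuse it when the same value recurs.

```cpp
#include <cassert>
#include <functional>
#include <map>

namespace spec_sketch {

using Kernel = std::function<double(double)>;

// Counts how many specialized variants were built (hypothetical).
static int BuildCount = 0;

// Stand-in for JIT compilation: bake the runtime value of Scale into
// the returned function, as if it were a compile-time constant.
Kernel specialize(double Scale) {
  ++BuildCount;
  return [Scale](double X) { return Scale * X; };
}

// Reuse a previously specialized kernel when the same runtime value
// recurs, so each distinct value is "compiled" only once.
const Kernel &getOrSpecialize(double Scale) {
  static std::map<double, Kernel> Cache;
  auto It = Cache.find(Scale);
  if (It == Cache.end())
    It = Cache.emplace(Scale, specialize(Scale)).first;
  return It->second;
}

} // namespace spec_sketch
```

A real JIT compiler would hand the baked-in value to the optimizer, enabling transformations such as constant folding or loop unrolling that a static compiler cannot perform without knowing it.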
cgo25
CGO25 Artifact
This release contains the CGO25 artifact version of Proteus, including build scripts, benchmark programs, experiment workflow scripts, and visualization scripts to re-create figures and tables of the associated manuscript.