A framework for comparing multi-agent PyTorch optimization systems, along with multiple optimization strategy implementations.
These components are collectively defined as PyTorch Inference Kernel Evolution (PIKE).
See the paper preprint here: https://arxiv.org/abs/2511.16964
This is a fork of KernelBench by Anne Ouyang, Simon Guo, and Azalia Mirhoseini, and it includes benchmark additions and modifications from KernelBenchFiltered by METR.
This repository contains:
- a refined set of KernelBench benchmarks
- our evaluator setup
- PIKE-B, a multi-agent evolutionary branching strategy for PyTorch optimization
PIKE-O, an OpenEvolve-based PyTorch optimization strategy, is implemented in the pike-openevolve repository; it uses the evaluator in this repository.
Clone this repository, then do the following:
```bash
conda create --name kernel-bench python=3.12
conda activate kernel-bench
pip install -r requirements.txt
pip install -e .

# additional data analysis
pip install matplotlib pandas scipy
```
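As an optional sanity check, the one-liner below just confirms that the installed PyTorch build can see a GPU before any evaluations are run:

```bash
# Optional: verify that PyTorch is installed and CUDA is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```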
Save the following API key environment variables to your ~/.bashrc, then open a new shell (or run source ~/.bashrc) so they take effect:

```bash
export OPENAI_API_KEY=<...>
export GEMINI_API_KEY=<...>
```

Running a PIKE implementation involves three key components. It is recommended to start them in the order listed below; a combined sketch of this startup order follows the list.
- Eval Worker: runs the evaluator in a container and exposes a low-level, filesystem-based communication channel to the containerized worker
- Eval Server: exposes an HTTP server for sending and receiving eval data, managing the low-level communication with the Eval Worker internally
- PIKE implementation (PIKE-B/PIKE-O): implements the LLM-based optimization strategy
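For orientation, the sketch below strings the three components together in this order, using the commands detailed in the rest of this README; the specific --arch value, the backgrounding of the Eval Server, and the output path are assumptions to adapt to your setup.

```bash
# Illustrative startup order only; each step is described in detail below.
# Assumes a Hopper GPU and that each long-running component gets its own
# terminal (or is backgrounded, as sketched for the Eval Server here).

# 1. Eval Worker (containerized evaluator)
python -u sandbox/tools/start_worker_container.py --engine docker --arch Hopper --max_active_tasks 20

# 2. Eval Server (HTTP proxy to the worker)
python scripts/disk_channel_server.py --port 8000 &

# 3. PIKE implementation, e.g. PIKE-B (full flag reference below)
python scripts/parallel_tree_search.py server_type=google model_name=gemini-2.5-pro num_workers=10 \
    level=3-pike task_start=1 task_end=50 num_samples=10 num_phases=30 max_fix_attempts=5 \
    dry_run=False eval_port=8000 run_dir=<path/to/output-dir>
```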
If you are working on a machine where you have root access, install Docker, along with the NVIDIA Container Toolkit (https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
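Once both are installed, a quick way to confirm that containers can see the GPU (a standard sample workload from the NVIDIA Container Toolkit docs; the ubuntu image is just an example) is:

```bash
# Should print the nvidia-smi table from inside a container if the toolkit is configured correctly.
docker run --rm --gpus all ubuntu nvidia-smi
```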
Start the containerized Eval Worker like so, passing in the correct GPU architecture:
```bash
python -u sandbox/tools/start_worker_container.py --engine docker --arch <Ampere/Hopper> --max_active_tasks 20
```

Once the Eval Worker is started, start the Eval Server. The Eval Server is an HTTP server that acts as a proxy between the Eval Worker's low-level communication channel and the PIKE implementation's eval requests.
```bash
python scripts/disk_channel_server.py --port 8000
```
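A quick way to confirm the Eval Server is reachable before launching a PIKE implementation (this only checks that something is listening on the port; the actual routes are defined in scripts/disk_channel_server.py):

```bash
# Assumes the default port 8000 used above.
curl -sS -o /dev/null -w "HTTP %{http_code}\n" http://localhost:8000/ \
  || echo "Eval Server not reachable on port 8000"
```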
To run PIKE-B directly, first try a dry run (a dry run does not require the Eval Worker):

```bash
python scripts/parallel_tree_search.py server_type=google model_name=gemini-2.5-pro num_workers=10 level=3-pike task_start=1 task_end=50 num_samples=10 num_phases=30 max_fix_attempts=5 dry_run=True eval_port=8000 run_dir=<path/to/output-dir>
```

If this works, you can switch to dry_run=False. Run with dry_run=False only after the Eval Worker and Eval Server are running.
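For a first non-dry run, it can help to dial the same flags down to a short end-to-end check before launching the full configuration; the values below are only an example.

```bash
# Example reduced-scope run (illustrative values for the flags documented above).
python scripts/parallel_tree_search.py server_type=google model_name=gemini-2.5-pro num_workers=2 \
    level=3-pike task_start=1 task_end=2 num_samples=2 num_phases=2 max_fix_attempts=5 \
    dry_run=False eval_port=8000 run_dir=<path/to/output-dir>
```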
To run PIKE-O, first clone the following repository: pike-openevolve
In the pike-openevolve directory:
```bash
pip install -e .
```

As with PIKE-B, run the following (from within the pike-openevolve directory) only after the Eval Worker and Eval Server are running:
```bash
python examples/kernelbench/run.py --pike_dir <path/to/this-repo> --level 3-pike --task_start 1 --task_end 50 --max_fix_attempts 5 --eval_port 8000 --run_dir <path/to/output-dir>
```

To further tune the PIKE-O system configuration, edit examples/kernelbench/config.yaml.
To learn more about using PIKE, see docs/README.md
To cite this work:

```bibtex
@misc{nagaitsev2025pike,
  title={Optimizing PyTorch Inference with LLM-Based Multi-Agent Systems},
  author={Kirill Nagaitsev and Luka Grbcic and Samuel Williams and Costin Iancu},
  year={2025},
  eprint={2511.16964},
  archivePrefix={arXiv},
  primaryClass={cs.MA},
  url={https://arxiv.org/abs/2511.16964},
}
```