PIKE

A framework for comparing multi-agent PyTorch optimization systems, along with multiple optimization strategy implementations.

These components are collectively defined as PyTorch Inference Kernel Evolution (PIKE).

See the paper preprint here: https://arxiv.org/abs/2511.16964

About

This is a fork of KernelBench by Anne Ouyang, Simon Guo, and Azalia Mirhoseini. Benchmark additions and modifications are included from KernelBenchFiltered by METR.

This repository contains:

  • a refined set of KernelBench benchmarks
  • our evaluator setup
  • PIKE-B, a multi-agent evolutionary branching strategy for PyTorch optimization

The implementation of PIKE-O, an OpenEvolve-based PyTorch optimization strategy, can be found in the pike-openevolve repository. It uses the evaluator in this repository.

Setup

Clone this repository, then do the following:

conda create --name kernel-bench python=3.12
conda activate kernel-bench
pip install -r requirements.txt
pip install -e .

# optional: packages for additional data analysis
pip install matplotlib pandas scipy
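
Optionally, verify that PyTorch (installed via requirements.txt) can see a CUDA-capable GPU before proceeding; this one-liner is only a sanity check:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"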

Add the following API key environment variables to your ~/.bashrc:

export OPENAI_API_KEY=<...>
export GEMINI_API_KEY=<...>
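
After adding the keys, reload your shell configuration so they are available in the current session:

source ~/.bashrc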

Running PIKE

Running a PIKE implementation involves three key components. Start them in the order listed below.

  • Eval Worker: Runs the evaluator in a container and allows low-level, filesystem-based communication with the containerized worker
  • Eval Server: Exposes an HTTP server for sending and receiving eval data, managing the low-level communication with the Eval Worker internally
  • PIKE Implementation (PIKE-B/PIKE-O): Implements the LLM-based optimization strategy

Start Eval Worker

If you are working on a machine where you have root access, install Docker along with the NVIDIA Container Toolkit (https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
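
Before starting the worker, you can confirm that containers can access the GPUs; the CUDA image tag below is only an example and can be swapped for any image available to you:

docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi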

Start the containerized Eval Worker like so, passing in the correct GPU architecture:

python -u sandbox/tools/start_worker_container.py --engine docker --arch <Ampere/Hopper> --max_active_tasks 20

Start Eval Server

Once the Eval Worker is running, start the Eval Server, an HTTP server that acts as a proxy between the Eval Worker's low-level communication channel and eval requests from the PIKE implementation.

python scripts/disk_channel_server.py --port 8000
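
To check that the server is reachable before launching a PIKE run (this only probes the port; the actual endpoints are defined in scripts/disk_channel_server.py):

curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/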

Run PIKE-B

To run PIKE-B directly, first try a dry run (this does not require the Eval Worker):

python scripts/parallel_tree_search.py server_type=google model_name=gemini-2.5-pro num_workers=10 level=3-pike task_start=1 task_end=50 num_samples=10 num_phases=30 max_fix_attempts=5 dry_run=True eval_port=8000 run_dir=<path/to/output-dir>

If this works, you can switch to dry_run=False, as shown below. Run the full version only after the Eval Worker and Eval Server are running.
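
For reference, a full run is the same command with dry_run=False and everything else unchanged:

python scripts/parallel_tree_search.py server_type=google model_name=gemini-2.5-pro num_workers=10 level=3-pike task_start=1 task_end=50 num_samples=10 num_phases=30 max_fix_attempts=5 dry_run=False eval_port=8000 run_dir=<path/to/output-dir>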

Run PIKE-O

First, clone the pike-openevolve repository.

In the pike-openevolve directory:

pip install -e .

As with PIKE-B, run the following (from within the pike-openevolve directory) only after the Eval Worker and Eval Server are running:

python examples/kernelbench/run.py --pike_dir <path/to/this-repo> --level 3-pike --task_start 1 --task_end 50 --max_fix_attempts 5 --eval_port 8000 --run_dir <path/to/output-dir>

To further tune the PIKE-O system configuration, edit examples/kernelbench/config.yaml.

Documentation

To learn more about using PIKE, see docs/README.md.

Citation

@misc{nagaitsev2025pike,
    title={Optimizing PyTorch Inference with LLM-Based Multi-Agent Systems}, 
    author={Kirill Nagaitsev and Luka Grbcic and Samuel Williams and Costin Iancu},
    year={2025},
    eprint={2511.16964},
    archivePrefix={arXiv},
    primaryClass={cs.MA},
    url={https://arxiv.org/abs/2511.16964}, 
}
