Commit 324d4f5

Bug fix
1 parent 772b5ae commit 324d4f5

2 files changed: 8 additions, 7 deletions


README.md

Lines changed: 7 additions & 3 deletions
@@ -15,12 +15,13 @@

 CodeClash is a benchmark for evaluating AI systems on **goal-oriented software engineering**.

-Today's AI coding evals are *task*-oriented (e.g., HumanEval, SWE-bench).
+Today's AI coding evals are *task*-oriented (e.g.,
+<a href="https://github.com/openai/human-eval">HumanEval</a>, <a href="https://swebench.com">SWE-bench</a>).
 Models are given explicit instructions.
-We then verify implementations with unit tests.
+We then verify correctness with unit tests.

 But building software is fundamentally driven by goals ("improve user retention", "reduce costs", "increase revenue").
-Reaching our goals is a self-directed, iterative, and often competitive process.
+Reaching our goals via code is a self-directed, iterative, and often competitive process.
 To capture this dynamism of real software development, we introduce CodeClash!

 Check out our [arXiv paper](https://arxiv.org/abs/2511.00839) and [website](https://codeclash.ai/) for the full details!
@@ -35,6 +36,9 @@ $ pip install -e '.[dev]'
 $ python main.py configs/test/battlesnake.yaml
 ```

+> [!TIP]
+> CodeClash requires Docker to create execution environments. CodeClash was developed and tested on Ubuntu 22.04.4 LTS.
+
 Once this works, you should be set up to run a real tournament!
 To run *Claude Sonnet 4.5* against *o3* in a *BattleSnake* tournament with *5 rounds* and *1000 competition simulations* per round, run:
 ```bash

codeclash/arenas/arena.py

Lines changed: 1 addition & 4 deletions
@@ -142,10 +142,7 @@ def build_image(self):
         arena_file = Path(inspect.getfile(self.__class__))
         folder_path = arena_file.parent
         result = subprocess.run(
-            (
-                "export $(cat .env | xargs);"
-                f"docker build --no-cache -t {self.image_name} -f {folder_path}/{self.name}.Dockerfile ."
-            ),
+            f"docker build --no-cache -t {self.image_name} -f {folder_path}/{self.name}.Dockerfile .",
             shell=True,
             capture_output=True,
             text=True,
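
With the `.env` export gone, the command is a single `docker build` invocation, so it no longer strictly needs a shell. The following is only a sketch of an alternative that is not part of this commit: it builds the same command as an argument list, which avoids shell quoting problems if the Dockerfile path contains spaces. The helper name `docker_build_command`, the image name, and the example paths are hypothetical; only the `docker build` flags (`--no-cache`, `-t`, `-f`) come from the diff above.

```python
from pathlib import Path


def docker_build_command(image_name: str, folder_path: Path, arena_name: str) -> list[str]:
    """Return the `docker build` invocation as an argument list.

    Hypothetical helper: it mirrors the f-string in build_image, but keeps
    each argument separate so no shell parsing is involved.
    """
    dockerfile = folder_path / f"{arena_name}.Dockerfile"
    return [
        "docker", "build", "--no-cache",
        "-t", image_name,
        "-f", str(dockerfile),
        ".",  # build context: the current working directory, as in the original
    ]


# Example values are assumptions chosen for illustration only.
cmd = docker_build_command(
    "codeclash/battlesnake",
    Path("codeclash/arenas/battlesnake"),
    "BattleSnake",
)
print(cmd)
```

Such a list could be passed as `subprocess.run(cmd, capture_output=True, text=True)` with `shell=True` dropped; whether that suits the project's other call sites is a judgment for the maintainers.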
