Commit 7781eba

Author: Anand Mudgerikar

updated results
1 parent 0396034 commit 7781eba

2 files changed: +1 -1 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -80,7 +80,7 @@ Note: All results from the paper use the questions in `secgym/questions/tests` f
 
 Below is the evaluation results of the LLM agents on the test questions. We set temperature = 0 and max_step = 25. GPT-4o is used for evaluation. The full evaluation logs can be downloaded from [this link](https://pennstateoffice365-my.sharepoint.com/:u:/g/personal/ykw5399_psu_edu/EXOMtXyFSRNGvKsLZGPIAfwBZhkKMr11oROccOydbWyioA?e=XzpQLa). If can also be found under this [branch](https://github.com/microsoft/SecRL/tree/before_cleanup_all_history) under `final_results` folder (along with the original code).
 
-![ExCyTIn-Bench](./secgym/assets/eval_results.png)
+![ExCyTIn-Bench](./secgym/assets/updated_eval_results_09_25.png)
 
 ## 📝 Citation
 