v0.5.8

@yifanmai released this 30 Aug 04:27
9e87c57

Models

  • Add GLM-4.5-AIR-FP8 model (#3785)
  • Add Qwen3 235B A22B Instruct 2507 FP8 (#3788)
  • Add Gemini 2.5 Flash-Lite GA (#3776)
  • Add gpt-oss (#3789, #3794)
  • Add GPT-5 (#3793, #3797)
  • Handle safety and usage guidelines errors from Grok API (#3770)
  • Handle Gemini responses with max tokens reached during thinking (#3804)
  • Add OpenRouterClient (#3811)

Scenarios

  • Fix instructions and prompt formatting for InfiniteBench En.MC (#3790)
  • Add MedQA and MedMCQA to MedHELM (#3781)
  • Add or modify Arabic language scenarios:
  • Add run expander for Arabic language instructions for Arabic MCQA scenarios (#3833)
  • Allow configuration of LLM-as-a-judge models in MedHELM scenarios (#3812)
  • Add user-configurable MedHELM scenario (#3844)

Frontend

  • Display Arabic text in RTL direction in frontend (#3807)
  • Fix regular expression query handling in run predictions (#3826)
  • Fix invalid sort column index error in leaderboard (#3845)

Framework

Contributors

Thank you to the following contributors for your work on this HELM release!