[Feature] Support pass@1 evaluation for multi predictions in MathEvaluator

### Describe the feature

When a Hugging Face model is run with `num_return_sequences` set greater than 1, each entry in the `predictions` output column is a list instead of a string. As a result, `MathEvaluator` always returns an accuracy of 0, regardless of whether any prediction is correct. It would help if the score function accepted list-type inputs and computed pass@1 over the multiple predictions, similar to the approach described in the DeepSeek-R1 technical report.
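A minimal sketch of what I have in mind, not the actual `MathEvaluator` code: the function name `pass_at_1`, its signature, and the string-equality check are all hypothetical placeholders (the real evaluator would use its own math-equivalence logic). It assumes pass@1 for one problem is the fraction of its sampled predictions that are correct, averaged over all problems, and it still accepts plain strings for the single-prediction case.

```python
from typing import Callable, Sequence, Union

# A prediction is either one string or a list of sampled strings
# (the latter is what num_return_sequences > 1 produces).
Prediction = Union[str, Sequence[str]]


def _exact_match(pred: str, ref: str) -> bool:
    """Placeholder equivalence check (assumption); swap in real math equivalence."""
    return pred.strip() == ref.strip()


def pass_at_1(
    predictions: Sequence[Prediction],
    references: Sequence[str],
    is_equiv: Callable[[str, str], bool] = _exact_match,
) -> float:
    """Mean per-problem correctness, averaged over the samples of each problem."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0

    total = 0.0
    for pred, ref in zip(predictions, references):
        # Normalize a single string into a one-element list of samples.
        samples = [pred] if isinstance(pred, str) else list(pred)
        if not samples:
            continue  # a problem with no samples contributes 0
        correct = sum(1 for s in samples if is_equiv(s, ref))
        total += correct / len(samples)
    return total / len(predictions)
```

For example, `pass_at_1([["4", "5"], "7"], ["4", "7"])` gives 0.75: the first problem has one of two samples correct (0.5), the second is a single correct string (1.0).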

### Will you implement it?

- [x] I would like to implement this feature and create a PR!

[Feature] Support pass@1 evaluation for multi predictions in MathEvaluator #2252

Describe the feature

Will you implement it?

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Feature] Support pass@1 evaluation for multi predictions in MathEvaluator #2252

Description

Describe the feature

Will you implement it?

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions