Audio2Text Retrieval AbsTask and Evaluator + Audiocaps Retrieval Dataset #2684

switchpiggy · 2025-05-09T13:08:11Z

Resolves #2068

Code Quality

Code Formatted: Format the code using make lint to maintain consistent style.

Documentation

Updated Documentation: Add or update documentation to reflect the changes introduced in this PR.

Testing

New Tests Added: Write tests to cover new functionality. Validate with make test-with-coverage.
Tests Passed: Run tests locally using make test or make test-with-coverage to ensure no existing functionality is broken.


Copilot

Pull Request Overview

This pull request introduces a new AudioCaps Retrieval task for Audio2Text retrieval and integrates it into the task registry while also removing an obsolete text file.

Added a new task implementation in AudioCapsRetrieval with associated metadata.
Updated module imports to expose the new task.
Removed the legacy "the_ugly_duckling.txt" file.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

File	Description
mteb/tasks/__init__.py	Added import for the Audio2TextRetrieval tasks.
mteb/tasks/Audio/Audio2TextRetrieval/eng/AudioCapsRetrieval.py	New task implementation defining AudioCaps Retrieval metadata.
mteb/tasks/Audio/Audio2TextRetrieval/__init__.py	Exposed AudioCapsRetrieval via module import.
mteb/abstasks/the_ugly_duckling.txt	Removed legacy file.

Samoed

Good start!

Samoed · 2025-05-09T17:16:58Z

mteb/tasks/Audio/Audio2TextRetrieval/eng/AudioCapsRetrieval.py

+        dataset={
+            "path": "TwinkStart/AudioCaps",
+            "revision": "8fc8b151149af779517aedfbf8c536160822bd70",
+            "trust_remote_code": True,


Suggested change

"trust_remote_code": True,

Samoed · 2025-05-09T17:18:00Z

mteb/tasks/Audio/Audio2TextRetrieval/eng/AudioCapsRetrieval.py

+        description="Measuring the ability to retrieve the groundtruth answers to reasoning task queries on ARC-Challenge.",
+        reference="https://allenai.org/data/arc",
+        dataset={
+            "path": "TwinkStart/AudioCaps",


Do we want this dataset? It weighs 153 GB.

isaac-chung · 2025-05-11

We should probably downsample + negative mine this to a reasonable size.
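For illustration, the downsampling half of that suggestion could look roughly like the sketch below. It is not code from this PR: the function name `downsample_retrieval`, its parameters, and the dict shapes are assumptions made for the example. It also swaps in random negatives as a placeholder; real negative mining would instead pick hard negatives using a retrieval model's scores.

```python
import random


def downsample_retrieval(queries, corpus, qrels, n_queries, n_negatives, seed=42):
    """Keep a random subset of queries, all of their relevant documents,
    and a fixed number of randomly chosen non-relevant documents.

    Hypothetical helper: random negatives stand in for mined hard negatives.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible

    # Sample from a sorted list so the result does not depend on dict order.
    kept_qids = sorted(rng.sample(sorted(queries), min(n_queries, len(queries))))

    # Every document judged relevant to a kept query must stay in the corpus.
    relevant = {doc_id for qid in kept_qids for doc_id in qrels.get(qid, {})}

    # Pad the corpus with distractors drawn from the remaining documents.
    candidates = sorted(set(corpus) - relevant)
    negatives = rng.sample(candidates, min(n_negatives, len(candidates)))

    kept_docs = relevant | set(negatives)
    return (
        {qid: queries[qid] for qid in kept_qids},
        {doc_id: corpus[doc_id] for doc_id in sorted(kept_docs)},
        {qid: dict(qrels[qid]) for qid in kept_qids if qid in qrels},
    )
```

Keeping every relevant document for the sampled queries means the reduced split can still be scored with the same relevance judgments.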

Samoed · 2025-05-09T17:21:12Z

mteb/evaluation/evaluators/Audio/Audio2TextRetrievalEvaluator.py

+        elif (
+            hasattr(self.retriever.model.model, "mteb_model_meta")
+            and self.retriever.model.model.mteb_model_meta.name == "bm25s"
+        ):
+            return self.retriever.model.model.search(
+                corpus,
+                queries,
+                self.top_k,
+                score_function="bm25",
+                task_name=self.task_name,  # type: ignore
+            )


Suggested change

        elif (
            hasattr(self.retriever.model.model, "mteb_model_meta")
            and self.retriever.model.model.mteb_model_meta.name == "bm25s"
        ):
            return self.retriever.model.model.search(
                corpus,
                queries,
                self.top_k,
                score_function="bm25",
                task_name=self.task_name,  # type: ignore
            )

Samoed · 2025-05-09T17:25:06Z

mteb/evaluation/evaluators/Audio/Audio2TextRetrievalEvaluator.py

+logger = logging.getLogger(__name__)
+
+
+def corpus_to_str(


Do you need this?

Samoed · 2025-05-09T17:28:45Z

mteb/evaluation/evaluators/Audio/Audio2TextRetrievalEvaluator.py

+        if instructions:
+            queries = [f"{query} {instructions[query]}".strip() for query in queries]
+        if isinstance(queries[0], list):  # type: ignore
+            query_embeddings = self.encode_conversations(


Do you need this?

Samoed · 2025-05-09T17:32:10Z

mteb/abstasks/Audio/AbsTaskAudio2TextRetrieval.py

+        id_column_name: str = '_id',
+        audio_column_name: str = 'audio',
+        text_column_name: str = 'text'


Can you find more datasets you want to add as retrieval tasks? After that we can create a better loader. MIEB and MTEB retrieval tasks have different configurations for corpus and queries with qrels. I don't think we should change this format.
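As a rough illustration of the corpus / queries / qrels split mentioned here, the sketch below turns paired audio-caption rows into three dictionaries. It is an assumption-laden example, not mteb's loader: the helper name `build_retrieval_split` and the ID prefixes are hypothetical, the column defaults mirror the quoted diff (`_id`, `audio`, `text`), and the dict shapes follow the common BEIR-style convention rather than a confirmed mteb format.

```python
def build_retrieval_split(rows, id_key="_id", audio_key="audio", text_key="text"):
    """Convert paired rows into queries, corpus, and qrels dictionaries.

    Hypothetical helper: assumes each row pairs one audio clip (the query)
    with exactly one caption (the relevant corpus document).
    """
    queries, corpus, qrels = {}, {}, {}
    for row in rows:
        query_id = f"q-{row[id_key]}"   # audio clip acts as the query
        doc_id = f"d-{row[id_key]}"     # caption acts as the corpus document
        queries[query_id] = row[audio_key]
        corpus[doc_id] = row[text_key]
        qrels[query_id] = {doc_id: 1}   # relevance 1 marks the matching caption
    return queries, corpus, qrels
```

Separating the three structures lets one caption be relevant to several clips (or the reverse) later on, which a flat paired table cannot express.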

KennethEnevoldsen · 2025-06-15T19:08:33Z

@switchpiggy, it seems like this PR might have gotten stale. Any plans to finish it up?

switchpiggy added 4 commits May 9, 2025 04:21

added retrieval

e41a5ff

added a2t retrieval Abstask and Evaluator + AudioCaps dataset for ret…

3f5bf1c

…rieval

added a2t retrieval Abstask and Evaluator + AudioCaps dataset for ret…

b81bdc1

…rieval

Merge branch 'maeb_retrieval' of github.com:switchpiggy/mteb into mae…

8bcf0ad

…b_retrieval

switchpiggy changed the title ~~Maeb retrieval~~ Audio2Text Retrieval AbsTask and Evaluator + Audiocaps Retrieval Dataset May 9, 2025

switchpiggy mentioned this pull request May 9, 2025

Create audio-text retrieval AbsTask and Evaluator #2068

Closed

isaac-chung linked an issue May 9, 2025 that may be closed by this pull request

Create audio-text retrieval AbsTask and Evaluator #2068

Closed

isaac-chung requested review from Samoed and Copilot May 9, 2025 15:59

Copilot AI reviewed May 9, 2025


Samoed requested changes May 9, 2025


KennethEnevoldsen added the stale label Jun 15, 2025

kkaitlyn111 mentioned this pull request Jul 6, 2025

Implemented Audio Any2AnyRetrieval + 3 Datasets for A2A, A2T, T2A #2882

Merged

kkaitlyn111 closed this Jul 6, 2025
