Added custom model inference. #437
Conversation
Looking very nice, exactly what I had in mind!
No big comments on the main PR, but
- you could add your model class in examples.
- you need to update the doc pages to explain how this works
- it would be good to add a small test to our suite for this feature (a rough sketch follows after this comment)
I'll try to run it this afternoon and if all goes well and you update the doc, we'll be good to go!
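A minimal smoke test along these lines could cover the third point (a sketch only: the test file location, test name, and example file path are assumptions, not part of this PR):

```
# tests/test_custom_model.py (sketch; path and names are assumptions)
import importlib.util


def test_custom_model_file_defines_a_model_class():
    # Load the example definition file the same way a user-provided one would be.
    path = "examples/custom_models/google_translate_model.py"  # assumed location
    spec = importlib.util.spec_from_file_location("google_translate_model", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # The file should expose the model class used by `lighteval custom`.
    assert hasattr(module, "GoogleTranslateClient")
```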
Hahaha please also provide an explicit requirements file :)
The explicit requirements file is only needed for the Google Translate example, right? Where should I add it?
google_translate_model_requirements.txt for now, next to the py file
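For reference, a plausible sketch of that file for the Google Translate example (the version pin is an assumption, not a tested constraint):

```
# google_translate_model_requirements.txt (sketch; the pin is an assumption)
googletrans==3.1.0a0
httpcore
```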
Great, I fixed those things. @clefourrier, ready for review again.
Hi @JoelNiklaus! Great PR, however, I just tried it and it does not seem to work. When running:
deps:
Hmm, would you mind trying an environment with the requirements in examples/custom_models/google-translate-requirements-freeze.txt?
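For anyone reproducing this, that amounts to the usual pip invocation:

```
pip install -r examples/custom_models/google-translate-requirements-freeze.txt
```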
@NathanHB I added another custom model example at examples/custom_models/local_mt_model.py
@clefourrier @NathanHB Would you mind reviewing again? It should work better now.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Enables the evaluation of any system in the user's control. Fixes [Issue 430](#430).

Try with

```
python -m lighteval custom google-translate /path/to/google_translate_model.py "lighteval|wmt20:fr-de|0|0" --max-samples 10
```

google_translate_model.py

```
import logging
from typing import Optional

from tqdm import tqdm
from transformers import AutoTokenizer

from lighteval.data import GenerativeTaskDataset
from lighteval.models.abstract_model import LightevalModel, ModelInfo
from lighteval.models.model_output import (
    GenerativeResponse,
    LoglikelihoodResponse,
    LoglikelihoodSingleTokenResponse,
)
from lighteval.tasks.requests import (
    GreedyUntilRequest,
    LoglikelihoodRequest,
    LoglikelihoodRollingRequest,
    LoglikelihoodSingleTokenRequest,
)

logger = logging.getLogger(__name__)


class GoogleTranslateClient(LightevalModel):
    def __init__(self, config, env_config) -> None:
        self.model = config.model
        self.model_definition_file_path = config.model_definition_file_path

        self.model_info = ModelInfo(
            model_name=config.model,
            model_sha="",
            model_dtype=None,
            model_size="",
        )

        self._tokenizer = AutoTokenizer.from_pretrained("gpt2")  # Use a dummy tokenizer for compatibility

        import httpcore

        # Needed to work around a googletrans bug:
        # https://stackoverflow.com/questions/72796594/attributeerror-module-httpcore-has-no-attribute-synchttptransport#comment136664963_77334618
        setattr(httpcore, 'SyncHTTPTransport', 'AsyncHTTPProxy')
        from googletrans import Translator

        self.translator = Translator()

    def greedy_until(
        self,
        requests: list[GreedyUntilRequest],
        override_bs: Optional[int] = None,
    ) -> list[GenerativeResponse]:
        """
        Generates responses using a greedy decoding strategy until certain ending conditions are met.

        Args:
            requests (list[GreedyUntilRequest]): list of requests containing the context and ending conditions.
            override_bs (int, optional): Override the batch size for generation. Defaults to None.

        Returns:
            list[GenerativeResponse]: list of generated responses.
        """
        for request in requests:
            request.tokenized_context = self.tok_encode(request.context)

        dataset = GenerativeTaskDataset(requests=requests, num_dataset_splits=self.DATASET_SPLITS)
        results = []

        for _ in tqdm(
            dataset.splits_start_end_iterator(),
            total=dataset.num_dataset_splits,
            desc="Splits",
            position=0,
            disable=False,  # self.disable_tqdm,
        ):
            for r in tqdm(dataset, desc="Batch", position=1, disable=False):
                context = r.context.replace("French phrase: ", "")
                # TODO: Get src and dest from request
                translation = self.translator.translate(context, src='fr', dest='de')
                result = translation.text
                cur_response = GenerativeResponse(
                    result=result,
                    logits=None,
                    generated_tokens=[],
                    input_tokens=[],
                )
                results.append(cur_response)

        return dataset.get_original_order(results)

    @property
    def tokenizer(self):
        return self._tokenizer

    def tok_encode(self, text: str):
        return self.tokenizer.encode(text)

    @property
    def add_special_tokens(self) -> bool:
        return False

    @property
    def max_length(self) -> int:
        """Return the maximum sequence length of the model."""
        return 4096

    def loglikelihood(
        self, requests: list[LoglikelihoodRequest], override_bs: Optional[int] = None
    ) -> list[LoglikelihoodResponse]:
        """Tokenize the context and continuation and compute the log likelihood of those tokenized sequences."""
        raise NotImplementedError

    def loglikelihood_rolling(
        self, requests: list[LoglikelihoodRollingRequest], override_bs: Optional[int] = None
    ) -> list[LoglikelihoodResponse]:
        """This function is used to compute the log likelihood of the context for perplexity metrics."""
        raise NotImplementedError

    def loglikelihood_single_token(
        self, requests: list[LoglikelihoodSingleTokenRequest], override_bs: Optional[int] = None
    ) -> list[LoglikelihoodSingleTokenResponse]:
        """Tokenize the context and continuation and compute the log likelihood of those tokenized sequences."""
        raise NotImplementedError
```