Feature Request: support for XLMRobertaModel embeddings #8789

IzzyHibbert · 2024-07-31T07:55:32Z

I am running the latest code. Mention the version if possible as well.
I carefully followed the README.md.
I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
I reviewed the Discussions, and have a new and useful enhancement to share.

The project currently does not support embedding models based on the XLMRobertaModel architecture.
It would be very useful to extend support to them.

I believe adding this support would be valuable. Models like BAAI/bge-m3 show promising results against proprietary models such as OpenAI's; see:
https://towardsdatascience.com/openai-vs-open-source-multilingual-embedding-models-e5ccb7c90f05

They also support longer inputs (bge-m3, for instance, handles up to 8192 tokens).

ExtReMLapin · 2024-07-31T08:59:31Z

IzzyHibbert · 2024-07-31T09:01:42Z

Closed as duplicate of #8658 (apologies, I did not find it when searching).

IzzyHibbert added the enhancement (New feature or request) label Jul 31, 2024

IzzyHibbert closed this as completed Jul 31, 2024
