Hello!
Bug report overview
- I'm unable to run ORTModelForMaskedLM.from_pretrained, as required for the new SparseEncoder class in Sentence Transformers.
Details
Sentence Transformers currently supports three model archetypes: (dense) embedding models (SentenceTransformer), reranker models (CrossEncoder) and, as of the recent v5.0.0 release, sparse embedding models (SparseEncoder).
This last model archetype doesn't yet support ONNX/OpenVINO, and I'd like to add support for that natively in Sentence Transformers by relying on optimum.
As with the SentenceTransformer class, the SparseEncoder class supports several architectures, but most models are based on SPLADE, i.e. a Masked Language Modelling transformer + a pooling layer. The transformer is loaded with AutoModelForMaskedLM.
I would like to add support for the ONNX/OV variants: ORTModelForMaskedLM and OVModelForMaskedLM, but I'm getting errors with the ONNX one at the moment:
from optimum.onnxruntime import ORTModelForMaskedLM
model = ORTModelForMaskedLM.from_pretrained("sparse-encoder-testing/splade-bert-tiny-nq", export=True)
Traceback (most recent call last):
File "c:\code\optimum\demo_export_splade.py", line 5, in <module>
model = ORTModelForMaskedLM.from_pretrained("sparse-encoder-testing/splade-bert-tiny-nq", export=True)#, library_name="transformers")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\code\optimum\optimum\onnxruntime\modeling_ort.py", line 552, in from_pretrained
return super().from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\code\optimum\optimum\modeling_base.py", line 419, in from_pretrained
return from_pretrained_method(
^^^^^^^^^^^^^^^^^^^^^^^
File "c:\code\optimum\optimum\onnxruntime\modeling_ort.py", line 422, in _export
main_export(
File "c:\code\optimum\optimum\exporters\onnx\__main__.py", line 348, in main_export
model = TasksManager.get_model_from_task(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\code\optimum\optimum\exporters\tasks.py", line 2163, in get_model_from_task
model_class = TasksManager.get_model_class_for_task(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\code\optimum\optimum\exporters\tasks.py", line 1457, in get_model_class_for_task
raise KeyError(
KeyError: 'Unknown task: fill-mask. Possible values are: `feature-extraction` for SentenceTransformer, `sentence-similarity` for SentenceTransformer'
In short, optimum recognizes the Sentence Transformers-specific files in the repository and restricts me to the feature-extraction and sentence-similarity tasks, so fill-mask is rejected. The simplest fix is to let me pass library_name to from_pretrained explicitly, rather than always relying on the automatically inferred value. I'll make a PR to set that up.
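To make the failure mode concrete, here is a simplified toy model of the behaviour. It is not optimum's actual code: the marker file names, the task tables, and the helper names infer_library and resolve_task are all illustrative assumptions. It only shows why a library inferred from the repository files rejects fill-mask, and how an explicit library_name override would sidestep that.

```python
# Toy model of the behaviour described above; NOT optimum's actual code.
# The marker file names and the task tables are illustrative assumptions.

SUPPORTED_TASKS = {
    "sentence_transformers": {"feature-extraction", "sentence-similarity"},
    "transformers": {"feature-extraction", "fill-mask", "text-classification"},
}


def infer_library(repo_files):
    """Guess the library from the files in a repository (hypothetical heuristic)."""
    markers = {"modules.json", "config_sentence_transformers.json"}
    if markers & set(repo_files):
        return "sentence_transformers"
    return "transformers"


def resolve_task(task, repo_files, library_name=None):
    """Return (library, task); an explicit library_name overrides the inference."""
    library = library_name or infer_library(repo_files)
    allowed = SUPPORTED_TASKS[library]
    if task not in allowed:
        raise KeyError(f"Unknown task: {task}. Possible values are: {sorted(allowed)}")
    return library, task


# A SPLADE-style repository ships Sentence Transformers files next to config.json.
splade_repo = [
    "config.json",
    "modules.json",
    "config_sentence_transformers.json",
    "model.safetensors",
]

try:
    resolve_task("fill-mask", splade_repo)
except KeyError as exc:
    print("with inferred library:", exc)

# Passing the library explicitly makes fill-mask resolvable again.
print("with explicit library:", resolve_task("fill-mask", splade_repo, library_name="transformers"))
```

In this sketch the override is a plain keyword argument that defaults to None, so existing callers keep the inferred behaviour and only callers who need fill-mask on such a repository have to opt in.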