Custom tokenizer for trf models from hf #13562
Unanswered
K-Grachev-2106756
asked this question in Help: Model Advice
Replies: 0 comments
I have a task to build an NER pipeline. It consists of transformer and ner components, and I used the standard tokenizer for Chinese.
After that, I started to wonder whether I was doing the right thing by feeding the transformer model Docs whose tokens come from a tokenizer that is not the model's native one.
Please tell me how the model's internal training process works. Is it important to write a custom tokenizer, or do the tokens become understandable to the transformer anyway during training?
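For context, here is a minimal sketch of the kind of setup described above, assuming a spaCy pipeline (the "trf"/"ner" naming suggests spacy-transformers) with a Hugging Face Chinese model; the model name and the exact config values are illustrative, not taken from my actual project:

```ini
[nlp]
lang = "zh"
pipeline = ["transformer","ner"]

[components.transformer]
factory = "transformer"

[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v3"
# Hypothetical choice of HF checkpoint; any Chinese model would raise the same question
name = "bert-base-chinese"

[components.ner]
factory = "ner"

[components.ner.model.tok2vec]
# The ner component listens to the transformer's output rather than tokenizing itself
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0
```

As I understand it, in a setup like this spacy-transformers re-tokenizes each Doc's text with the HF model's own tokenizer and aligns the resulting wordpieces back to the Doc's tokens by character offsets, but I would like confirmation of how this interacts with training.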