Japanese model #450
Comments
No, we are not currently working on Japanese. The first requirement for building models for any language is labeled data to train them on, since most of the components in CoreNLP use supervised learning. Traditionally, publicly available Japanese language corpora have been scarce, but now, for example, the Japanese Universal Dependencies corpora could be used to train several components (segmenter, POS, depparse). However, I still don't know of any usable Japanese NER data. The other requirement is somebody willing to do the work. In general, our expansions to other languages have occurred because somebody was interested in having the language available for some reason.
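As a rough illustration of what "labeled data" means here: Universal Dependencies treebanks are distributed in the tab-separated CoNLL-U format, and one annotated sentence carries the supervision for all three components mentioned above. The sketch below is only an illustration, not how CoreNLP itself trains its models. The helper names `read_conllu` and `training_views` are hypothetical, and the sample sentence is hand-written for this example rather than taken from a real treebank.

```python
from typing import Dict, List

# The ten column names defined by the CoNLL-U format.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]

# Hand-written illustrative sentence ("the dog runs"); NOT real treebank data.
SAMPLE = "\n".join([
    "# text = 犬が走る",
    "\t".join(["1", "犬", "犬", "NOUN", "_", "_", "3", "nsubj", "_", "_"]),
    "\t".join(["2", "が", "が", "ADP", "_", "_", "1", "case", "_", "_"]),
    "\t".join(["3", "走る", "走る", "VERB", "_", "_", "0", "root", "_", "_"]),
    "",
])


def read_conllu(text: str) -> List[List[Dict[str, str]]]:
    """Parse CoNLL-U text into sentences, each a list of token dicts."""
    sentences: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as "# text = ..."
        cols = line.split("\t")
        if len(cols) != len(CONLLU_FIELDS):
            continue  # skip malformed rows
        if not cols[0].isdigit():
            continue  # skip multiword ranges ("1-2") and empty nodes ("1.1")
        current.append(dict(zip(CONLLU_FIELDS, cols)))
    if current:
        sentences.append(current)
    return sentences


def training_views(sentence: List[Dict[str, str]]) -> Dict[str, object]:
    """Derive per-component supervision from one annotated sentence."""
    forms = [tok["form"] for tok in sentence]
    return {
        # Segmenter: unsegmented text paired with the gold word boundaries
        # (assumes no spaces between words, as in ordinary Japanese text).
        "segmenter": ("".join(forms), forms),
        # POS tagger: each word with its universal POS tag.
        "pos": [(tok["form"], tok["upos"]) for tok in sentence],
        # Dependency parser: (token index, head index, relation label).
        "depparse": [(int(tok["id"]), int(tok["head"]), tok["deprel"])
                     for tok in sentence],
    }


if __name__ == "__main__":
    for sent in read_conllu(SAMPLE):
        print(training_views(sent))
```

Nothing in this format carries entity labels, which is why NER needs a separately annotated corpus.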
@manning Thank you for your reply. Actually, I am an NLP newbie and still don't fully understand your valuable answer ;) I will search for Japanese NER data, and if I find any I will share it with you. BTW, I enjoy your Natural Language Processing with Deep Learning course very much! Thanks for that too!
I found jigg. It is said to have an interface similar to CoreNLP, and it actually includes CoreNLP as well as Kuromoji, a Japanese tokenizer. The authors were inspired by CoreNLP and once tried to make a Japanese extension to it, but later decided to build jigg for more flexibility. For Japanese NER, they use JUMAN/KNP.
FWIW (mostly for archival reasons at this point), there are now Japanese models for stanfordnlp: https://stanfordnlp.github.io/stanfordnlp/models.html#human-languages-supported-by-stanfordnlp
Thank you @AngledLuffa for your update.
@AngledLuffa @vochicong I just checked the Stanford package and Stanza, but there is still no support for relation extraction.
There is no relation extraction model in Stanza.
Hi, I am interested in using CoreNLP (and DeepDive) with Japanese.
Are you working on Japanese?
And how can I start building a Japanese model?
How did you build Chinese or English models?
Thanks!