Mozc UT dictionaries are additional dictionaries for Mozc.
Merge-ut-dictionaries merges the UT dictionaries into one and modifies it for the latest Mozc.
They need more Stars.
Mozc: 1930 Stars
Fcitx5-mozc: 82 Stars
Merge-ut-dictionaries: 40 Stars
Starring a repository also shows appreciation to the repository maintainer for their work. - GitHub Docs
git clone --depth 1 https://github.com/utuhiro78/merge-ut-dictionaries.git
Comment out unnecessary dictionaries in src/merge/make.sh.
Default settings:
#alt_cannadic="true"
#edict2="true"
jawiki="true"
#neologd="true"
personal_names="true"
place_names="true"
#skk_jisyo="true"
sudachidict="true"
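The defaults above can also be toggled from the command line instead of an editor. The following is a minimal sketch, assuming each setting sits on its own line of make.sh exactly as shown (name="true" or #name="true"). The helper names enable_dict and disable_dict are hypothetical and not part of merge-ut-dictionaries. The demo runs on a scratch file so the real script is untouched.

```shell
# Hypothetical helpers, not part of merge-ut-dictionaries.
# Assumption: each setting is a full line of the form name="true" or #name="true".

# enable_dict FILE NAME: uncomment a disabled dictionary line.
enable_dict() {
    sed -i.bak "s/^#$2=\"true\"/$2=\"true\"/" "$1"
}

# disable_dict FILE NAME: comment out an enabled dictionary line.
disable_dict() {
    sed -i.bak "s/^$2=\"true\"/#$2=\"true\"/" "$1"
}

# Demo on a scratch file that mimics two of the default lines.
demo=$(mktemp)
printf '%s\n' '#neologd="true"' 'jawiki="true"' > "$demo"
enable_dict "$demo" neologd
disable_dict "$demo" jawiki
cat "$demo"
```

To apply this to the real script, pass src/merge/make.sh as the first argument, for example: enable_dict src/merge/make.sh neologd. sed keeps the previous version next to the file with a .bak suffix.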
cd src/merge/
sh make.sh
cat mozcdic-ut.txt >> ../../../mozc-master/src/data/dictionary_oss/dictionary00.txt
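Because the cat command above appends in place, running it twice adds the UT entries twice. The following is a minimal sketch of a more cautious append that keeps a copy of the original file first. The function name append_dict and the .orig suffix are hypothetical, and the demo uses placeholder lines in a scratch directory rather than real Mozc dictionary entries.

```shell
# Hypothetical helper, not part of merge-ut-dictionaries or Mozc.
# append_dict SRC DST: keep a copy of DST as DST.orig, then append SRC to DST.
append_dict() {
    src=$1
    dst=$2
    [ -s "$src" ] || { echo "missing or empty: $src" >&2; return 1; }
    [ -f "$dst" ] || { echo "not found: $dst" >&2; return 1; }
    cp "$dst" "$dst.orig"
    cat "$src" >> "$dst"
}

# Demo with placeholder content in a scratch directory.
work=$(mktemp -d)
printf '%s\n' 'mozc-entry-1' 'mozc-entry-2' > "$work/dictionary00.txt"
printf '%s\n' 'ut-entry-1' > "$work/mozcdic-ut.txt"
append_dict "$work/mozcdic-ut.txt" "$work/dictionary00.txt"
grep -c '' "$work/dictionary00.txt"
```

With the paths from the command above, the call would be: append_dict mozcdic-ut.txt ../../../mozc-master/src/data/dictionary_oss/dictionary00.txt. Restoring dictionary00.txt.orig undoes the append.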
Build Mozc as usual.
Uncomment #generate_latest="true" in src/merge/make.sh.
It downloads the latest "jawiki-latest-pages-articles-multistream.xml.bz2" (over 4.2 GB).
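Given the size of that download, it may be worth checking free disk space before running make.sh. The following is a minimal sketch, assuming about 5 GiB of headroom is enough. That threshold is a guess, since the only figure stated here is the download size of over 4.2 GB, and the variable names are hypothetical.

```shell
# Assumption: 5 GiB of headroom. Only the download size (over 4.2 GB) is documented.
need_kb=$((5 * 1024 * 1024))

# df -Pk prints one POSIX-format line per filesystem; column 4 is free space in KiB.
avail_kb=$(df -Pk . | awk 'NR == 2 { print $4 }')

if [ "$avail_kb" -lt "$need_kb" ]; then
    echo "Only ${avail_kb} KiB free; the jawiki dump may not fit." >&2
else
    echo "Free space looks sufficient: ${avail_kb} KiB."
fi
```

Run it from the directory where make.sh will be executed, since df reports on the filesystem that holds the current directory.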
- mozcdic-ut.txt (generated by merge-ut-dictionaries): Combined
  You can combine these UT dictionaries.
- jawiki-latest-pages-articles-multistream-index.txt: CC BY-SA
  merge-ut-dictionaries uses it to generate the costs for words.
- dictionary*.txt in Mozc: BSD-3-Clause
  merge-ut-dictionaries uses them to remove duplicate words.
- id.def in Mozc: BSD-3-Clause
  merge-ut-dictionaries uses it to update IDs.
- Source code: Apache License, Version 2.0