Update Japanese tokenizer config and add serialization #5562

adrianeboyd · 2020-06-08T13:26:53Z

Description

* Use `config` dict for tokenizer settings
* Add serialization of split mode setting
* Add tests for tokenizer split modes and serialization of split mode setting

Based on #5561
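
For readers unfamiliar with the pattern, here is a minimal sketch of what "config dict plus serialization of the split mode" could look like. It is not the code from this PR: `SketchTokenizer`, `_get_config`, and `_set_config` are hypothetical names chosen for illustration, and JSON is used only to keep the example dependency-free. The only fact assumed about the real tokenizer is that SudachiPy offers split modes A, B, and C.

```python
import json

# SudachiPy split modes; None means "no explicit setting, use the default".
VALID_SPLIT_MODES = (None, "A", "B", "C")


class SketchTokenizer:
    """Hypothetical stand-in for a tokenizer configured via a dict."""

    def __init__(self, config=None):
        # All settings arrive in one dict instead of separate arguments.
        config = config or {}
        self._set_config(config)

    def _get_config(self):
        # Collect every serializable setting in one place.
        return {"split_mode": self.split_mode}

    def _set_config(self, config):
        split_mode = config.get("split_mode", None)
        if split_mode not in VALID_SPLIT_MODES:
            raise ValueError("Unknown split mode: {!r}".format(split_mode))
        self.split_mode = split_mode

    def to_bytes(self):
        # Serialize the settings so they survive a save/load round trip.
        return json.dumps(self._get_config()).encode("utf8")

    def from_bytes(self, data):
        # Restore the settings and return self to allow chaining.
        self._set_config(json.loads(data.decode("utf8")))
        return self


# Round trip: the split mode set on one instance is restored on another.
original = SketchTokenizer({"split_mode": "B"})
restored = SketchTokenizer().from_bytes(original.to_bytes())
```

Keeping the settings behind a single get/set pair means a new option only has to be added in one place to be both configurable and serialized.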

Types of change

Enhancement.

Checklist

I have submitted the spaCy Contributor Agreement.
I ran the tests, and all new and existing tests passed.
My changes don't require a change to the documentation, or if they do, I've added all required information.

hiroshi-matsuda-rit · 2020-06-08T15:29:49Z

I think this works fine. Thank you for adding the test cases.

Update Japanese tokenizer config and add serialization

9be89a0

* Use `config` dict for tokenizer settings
* Add serialization of split mode setting
* Add tests for tokenizer split modes and serialization of split mode setting

Based on explosion#5561

adrianeboyd added the enhancement (Feature requests and improvements) and lang / ja (Japanese language data and models) labels Jun 8, 2020

Merge branch 'master' into feature/ja-config

71c8f57

adrianeboyd mentioned this pull request Jun 8, 2020

set split_mode from meta.json #5561

Closed

3 tasks

adrianeboyd merged commit 3bf1115 into explosion:master Jun 8, 2020

hiroshi-matsuda-rit mentioned this pull request Jun 8, 2020

Japanese Model #3756

Closed
