Closed
Labels
new feature
Description
It would be useful to add a sentence splitter. Possible approaches include:
- the Punkt sentence tokenizer from NLTK (needs a pre-trained model)
- Unicode sentence boundaries, see unicode-rs/unicode-segmentation#24 (doesn't need a pre-trained model)
- the spaCy implementation, still to be investigated (likely needs a pre-trained model)
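
For comparison, here is a minimal sketch of what a model-free baseline might look like. It is not Punkt, not the Unicode sentence boundary rules, and not spaCy's approach: it is only a naive punctuation rule, and `split_sentences` is a hypothetical name used for illustration.

```python
import re

# Hypothetical helper, not part of NLTK, spaCy, or unicode-segmentation.
# Assumed rule: a boundary is sentence-ending punctuation (. ! ?)
# followed by whitespace and then an ASCII uppercase letter.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(text):
    """Split text into sentences using a naive punctuation rule."""
    # Drop empty pieces so that empty input yields an empty list.
    return [s for s in _BOUNDARY.split(text.strip()) if s]


if __name__ == "__main__":
    print(split_sentences("Hello world. How are you? Fine."))
    # Abbreviations are split incorrectly by this rule.
    print(split_sentences("Dr. Smith arrived. He sat."))
```

The second call splits after "Dr.", which is the kind of error that a trained model such as Punkt is meant to avoid, so a rule like this would probably only serve as a baseline to compare the options above against.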