What's Changed
- Reduce icud.dll by @mryzhov in #196
- Split implementation without FastTokenizer by @pavel-esir in #208
- Align Sentencepiece Model Vocab by @apaniukov in #205
- Ops Optimization by @apaniukov in #219
- [TF FE][Tokenizers] Avoid dependency on TF FE in tokenizers by @rkazants in #227
- Add Truncation To Sentencepiece by @apaniukov in #225
- Reimplement BPE tokenizer by @pavel-esir in #220
- [TF FE][Tokenizers] Optimize TF FE extensions by @rkazants in #232
- Enabled build w/o FastTokenizers by @ilya-lavrenov in #237
- Win debug build by @mryzhov in #218
- Switch To BPE Backend by @apaniukov in #235
- Add UTF-8 validation by @pavel-esir in #242
Full Changelog: 2024.3.0.0...2024.4.0.0