Fast and memory-efficient library for WordPiece tokenization as it is used by BERT.
Updated Jan 30, 2025 - C#
NLP code snippets and conference-related material.
WordPiece Tokenizer for BERT models.
Word, image, and audio embedding models; tokenizer models; n-gram language models; MatrixModels; corpus building; vocabulary building; language modelling.