wordpieces

This crate provides a subword tokenizer. A subword tokenizer splits a token into several smaller pieces, so-called word pieces. Word pieces were popularized by the BERT natural language encoder, which uses them to represent its input.
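
To make the idea of splitting a token into word pieces concrete, here is a minimal, self-contained sketch of greedy longest-match-first splitting. It does not use this crate's API: the function name `split_word_pieces`, the `HashSet` vocabulary, and the BERT-style `##` marker on continuation pieces are all assumptions made for illustration, and the crate's own types and piece conventions may differ.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of this crate): splits `token` into word
/// pieces by repeatedly taking the longest vocabulary entry that matches at
/// the current position. Returns `None` if some part of the token cannot be
/// covered by the vocabulary.
fn split_word_pieces(token: &str, vocab: &HashSet<&str>) -> Option<Vec<String>> {
    // Assumption: non-initial pieces are stored in the vocabulary with a
    // "##" prefix, as in BERT vocabularies.
    const CONTINUATION: &str = "##";

    let mut pieces = Vec::new();
    let mut start = 0;

    while start < token.len() {
        // Candidate end offsets, always on UTF-8 character boundaries.
        let ends: Vec<usize> = token[start..]
            .char_indices()
            .map(|(i, c)| start + i + c.len_utf8())
            .collect();

        // Try the longest candidate first and shrink until one matches.
        let mut found = None;
        for &end in ends.iter().rev() {
            let candidate = if start == 0 {
                token[start..end].to_string()
            } else {
                format!("{}{}", CONTINUATION, &token[start..end])
            };
            if vocab.contains(candidate.as_str()) {
                found = Some((candidate, end));
                break;
            }
        }

        // No vocabulary entry matches here: the token cannot be split.
        let (piece, end) = found?;
        pieces.push(piece);
        start = end;
    }

    Some(pieces)
}
```

With a vocabulary containing `un`, `##afford`, and `##able`, calling `split_word_pieces("unaffordable", &vocab)` yields the pieces `un`, `##afford`, and `##able`, while a token with no matching pieces, such as `xyz`, yields `None`. The linear scan over candidate ends keeps the sketch short; a real implementation would more likely use a prefix-based data structure for lookups.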