# TinyStack Tokenizer

ByteLevel BPE tokenizer trained on the fhswf/tiny-stack dataset.

## Usage

```python
from tokenizers.implementations import ByteLevelBPETokenizer
from tokenizers.processors import BertProcessing

tokenizer = ByteLevelBPETokenizer("./vocab.json", "./merges.txt")

# Add RoBERTa-style <s> ... </s> wrapping around each encoded sequence.
tokenizer._tokenizer.post_processor = BertProcessing(
    ("</s>", tokenizer.token_to_id("</s>")),
    ("<s>", tokenizer.token_to_id("<s>")),
)
tokenizer.enable_truncation(max_length=512)
```

Vocab size: 52000
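To illustrate the encode/decode API without the `vocab.json`/`merges.txt` files on disk, here is a minimal self-contained sketch that trains a tiny ByteLevel BPE tokenizer in memory on a toy corpus (the corpus, vocab size, and special tokens here are illustrative assumptions, not this model's training setup):

```python
from tokenizers.implementations import ByteLevelBPETokenizer

# Toy corpus standing in for fhswf/tiny-stack (illustrative only).
corpus = ["the quick brown fox jumps over the lazy dog"] * 10

tokenizer = ByteLevelBPETokenizer()
# 300 is far below the real 52000 vocab; enough for a demo over 256 base bytes.
tokenizer.train_from_iterator(
    corpus, vocab_size=300, min_frequency=1, special_tokens=["<s>", "</s>"]
)

enc = tokenizer.encode("the quick brown fox")
print(enc.tokens)  # subword tokens, with byte-level space markers

# Byte-level BPE is lossless: decoding the ids restores the exact input.
assert tokenizer.decode(enc.ids) == "the quick brown fox"
```

The same `encode`/`decode` calls work on the tokenizer loaded from `./vocab.json` and `./merges.txt` above.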