vllm.tokenizers.fastokens

fastokens tokenizer mode.

Loads a Hugging Face fast tokenizer whose internal Rust tokenizer is replaced by the fastokens shim. fastokens also rebinds tokenizers.decoders.DecodeStream so that the streaming detokenizer accepts the shim. Both patches stay installed for the lifetime of the process. patch_transformers() is idempotent, so calling it more than once has no further effect.
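
The idempotent, process-wide patching described above can be illustrated with a minimal sketch. This is not the fastokens implementation: the `decoders` namespace, the two `DecodeStream` placeholder classes, and `patch_decoders()` are hypothetical stand-ins chosen only to show the general pattern of rebinding a module attribute once behind a guard.

```python
import types

# Hypothetical stand-in for a third-party module such as tokenizers.decoders.
decoders = types.SimpleNamespace()


class OriginalDecodeStream:
    """Placeholder for the class that gets rebound."""

    def step(self, token_id: int) -> str:
        return f"orig:{token_id}"


class ShimDecodeStream(OriginalDecodeStream):
    """Placeholder replacement that would accept the shim tokenizer."""

    def step(self, token_id: int) -> str:
        return f"shim:{token_id}"


decoders.DecodeStream = OriginalDecodeStream

# Guard flag: once set, later calls become no-ops (assumed pattern).
_state = {"patched": False}


def patch_decoders() -> bool:
    """Rebind decoders.DecodeStream once.

    Returns True only when this call performed the rebinding.
    """
    if _state["patched"]:
        return False
    decoders.DecodeStream = ShimDecodeStream
    _state["patched"] = True
    return True


first = patch_decoders()   # performs the rebinding
second = patch_decoders()  # no-op: the patch is already installed
```

Because the rebinding is never undone, every later lookup of `decoders.DecodeStream` in the same process sees the replacement, which is the behavior the description attributes to the real patches.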