
IMPORTANT - possible mistake in tokenizer_config.json

#17 opened by ctranslate2-4you

I noticed that tokenizer_config.json specifies a maximum model length of 2048, but I thought Zephyr 1.6B (and the 3B, for that matter) had a maximum context length of 4096. I might be misunderstanding, since I'm not a programmer by trade; another possibility is that the 2048 limit applies to the tokenizer itself rather than the LLM. In any event, here is the error I receive when I send Zephyr more than 2048 tokens:

Token indices sequence length is longer than the specified maximum sequence length for this model (2761 > 2048). Running this sequence through the model will result in indexing errors

[EDIT]

I manually changed model_max_length to 4096 and it seems to work now.
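
For anyone hitting the same warning before the config is fixed upstream, here is a minimal sketch of overriding the limit at load time instead of editing the file by hand. The repo id below is my assumption of this model's path; substitute whichever checkpoint you actually use:

```python
from transformers import AutoTokenizer

# Assumed repo id for illustration; point this at the checkpoint you downloaded.
tokenizer = AutoTokenizer.from_pretrained(
    "stabilityai/stablelm-2-zephyr-1_6b",
    model_max_length=4096,  # override the 2048 value from tokenizer_config.json
)

# Sequences up to 4096 tokens now encode without the length warning.
input_ids = tokenizer("your long prompt here")["input_ids"]
```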

Stability AI org

@ctranslate2-4you Updated. Thank you for reporting!
