This model outputs <0x0A> instead of a newline
I think this was happening with another model in the last few days, I forget which. Instead of a line break it just outputs the literal <0x0A> token.
The latest llama.cpp still outputs <0x0A> instead of a regular LF.
It looks like it could be a problem with the tokenizer like this model:
https://huggingface.co/TheBloke/Starling-LM-7B-alpha-GGUF/discussions/1#6566495193951c950b3b8c10
Yeah, it's also impacting Notus and is summed up succinctly by @alvarobartt
https://huggingface.co/argilla/notus-7b-v1/discussions/3#656dc9d802a56b531ade7f73
Yes, I always get this <0x0A> token, and sometimes a series of tokens like this: <0x0A><0x0A><0xF0><0x9F><0x9A><0xB8>
Any solution to this <0x0A> issue?
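As a stopgap while the tokenizer is broken, those `<0xNN>` strings are SentencePiece byte-fallback tokens rendered literally, so they can be mapped back to the bytes they stand for and decoded as UTF-8. A minimal sketch (the helper name `decode_byte_fallback` is my own, not part of any library):

```python
import re

def decode_byte_fallback(text: str) -> str:
    """Replace literal byte-fallback tokens like <0x0A> with the raw
    bytes they represent, then decode the whole result as UTF-8."""
    out = bytearray()
    pos = 0
    for m in re.finditer(r"<0x([0-9A-Fa-f]{2})>", text):
        out += text[pos:m.start()].encode("utf-8")  # keep surrounding text
        out.append(int(m.group(1), 16))             # one token -> one byte
        pos = m.end()
    out += text[pos:].encode("utf-8")
    return out.decode("utf-8", errors="replace")

print(repr(decode_byte_fallback("Hello<0x0A>World")))  # 'Hello\nWorld'
# The multi-byte run above is a UTF-8 emoji split into per-byte tokens:
print(decode_byte_fallback("<0x0A><0x0A><0xF0><0x9F><0x9A><0xB8>"))
```

That last sequence decodes to two newlines followed by 🚸 (U+1F6B8), which is why emoji in particular show up as long runs of byte tokens.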
Indeed, it was due to the addition of the `tokenizer.json` file, since GGUF expects the SentencePiece (slow) tokenizer rather than the fast (Rust-based) one, which is the `transformers` default. That's why, when running `python convert.py ...` from llama.cpp to convert the weights into GGUF, the `tokenizer.model` file is not available, so the tokenization has to be inferred from the vocab file and the format (e.g. BPE for the Mistral-based models). Anyway, it seems the `tokenizer.model` was uploaded 9 days ago to berkeley-nest/Starling-LM-7B-alpha (see https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha/blob/main/tokenizer.model), so re-running the GGUF conversion script should do the trick 🤗
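For anyone following along, a rough sketch of what re-running the conversion could look like (paths, output filenames, and the quantization type here are illustrative, not from the original posts):

```shell
# Fetch the model repo now that it includes tokenizer.model
git clone https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha

# From a local llama.cpp checkout, convert to GGUF; with tokenizer.model
# present, convert.py picks up the SentencePiece vocab correctly
cd llama.cpp
python convert.py ../Starling-LM-7B-alpha \
    --outtype f16 \
    --outfile starling-lm-7b-alpha-f16.gguf

# Optionally quantize the f16 GGUF afterwards
./quantize starling-lm-7b-alpha-f16.gguf starling-lm-7b-alpha-Q4_K_M.gguf Q4_K_M
```

Any GGUFs converted before `tokenizer.model` was uploaded would need to be regenerated for the fix to take effect.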