Recent change to the rstrip property on special tokens

#59 opened by xxhansh

Hi, there was recently a breaking change in Phi-3's tokenizer that added rstrip options to the special tokens: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/commit/4eea1a7b25a14b098aab569599563c37443312cb
However, with this change the tokenizer is no longer lossless. For example:

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
tokens = tokenizer.encode("<|user|>\n<|end|>\n<|assistant|>")
print(tokenizer.decode(tokens))

will output <|user|><|end|><|assistant|>, i.e. the newlines after the special tokens are silently dropped. In my opinion this leads to many troubles (for example, drawing ASCII art). Is this change meant to align with how text was actually generated during training? If so, I think it would be better to implement it by changing chat_template to a newline-free format, since intuitively this belongs to the chat serialization process, not the text tokenization process.
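
For what it's worth, the new flags are visible on the loaded tokenizer. A minimal sketch for inspecting them, assuming a recent transformers version where added_tokens_decoder exposes the AddedToken objects:

from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
# added_tokens_decoder maps token ids to AddedToken objects in recent transformers versions
for added in tok.added_tokens_decoder.values():
    if added.content in ("<|user|>", "<|end|>", "<|assistant|>"):
        print(added.content, "rstrip =", added.rstrip)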

Microsoft org

I agree, this is quite problematic for any use case that iteratively walks between text and tokens (e.g. in guidance: https://github.com/guidance-ai/guidance).
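
A quick way to see why: the round-trip invariant those tools rely on no longer holds. A minimal sketch (the checkpoint name and add_special_tokens=False are just for illustration):

from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
text = "<|user|>\n<|end|>\n<|assistant|>"
# tools that walk between text and tokens expect decode(encode(text)) == text
print(tok.decode(tok.encode(text, add_special_tokens=False)) == text)  # False with rstrip enabled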
