Safetensors version?

#2
by matatonic - opened

Any chance you could also upload a safetensors version?

* Use the following script to convert your local pytorch_model .bin files to float16 (you can also choose bfloat16) and safetensors all in one go:

Example to convert WizardLM 70B V1.0 directly to float16 safetensors in 10GB shards:

python convert-to-safetensors.py ~/original/WizardLM-70B-V1.0 --output ~/float16_safetensored/WizardLM-70B-V1.0 --max-shard-size 10GB

Use --bf16 if you'd like to try bfloat16 instead, but note that there are concerns about quantization quality (see https://github.com/turboderp/exllamav2/issues/30#issuecomment-1719009289). A sketch of the same conversion using stock transformers appears after this list.

** Use any one of the following scripts to convert your local pytorch_model .bin files to safetensors:
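If you'd rather not hunt down a dedicated script, here's a minimal sketch of the same conversion using only transformers' built-in save_pretrained (not the script above; paths are placeholders, and it assumes transformers and safetensors are installed):

```python
# Minimal sketch: load a pickle-based checkpoint, cast to float16, and
# re-save it as sharded safetensors using transformers' built-in APIs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

src = "original/WizardLM-70B-V1.0"            # placeholder input path
dst = "float16_safetensored/WizardLM-70B-V1.0"  # placeholder output path

# low_cpu_mem_usage=True reduces peak RAM while loading a large checkpoint
model = AutoModelForCausalLM.from_pretrained(
    src, torch_dtype=torch.float16, low_cpu_mem_usage=True
)
model.save_pretrained(dst, safe_serialization=True, max_shard_size="10GB")

# carry the tokenizer over so the output directory is self-contained
AutoTokenizer.from_pretrained(src).save_pretrained(dst)
```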

Cognitive Computations org

Thanks, but I don't really see why I would want/need to do that.

Hi, creator of safetensors here.

The issue with pickle files is that they are not safe: anyone can write malicious code that's going to be executed on your machine.
You can check https://huggingface.co/Narsil/totallysafe. Weird things will happen as soon as you open this file, and things will continue to be tricky after you have closed your Python session.

I promise this is harmless, since I actually wrote it, but it should give you an impression of how BAD pickle can be.
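To show the mechanism itself (a minimal sketch of my own here, not the contents of that repo): pickle lets any class define __reduce__, and unpickling will call whatever that method returns.

```python
import os
import pickle

class TotallySafe:
    # pickle records the (callable, args) pair returned by __reduce__;
    # unpickling then CALLS that callable with those args
    def __reduce__(self):
        return (os.system, ("echo pwned: this ran at load time",))

payload = pickle.dumps(TotallySafe())
pickle.loads(payload)  # runs the shell command; arbitrary code execution
```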

Users will not know you, and therefore won't necessarily trust what you output. Having a safetensors file means they are at least safe from arbitrary code execution; the worst that could happen is loading a model that isn't fit for what they intend to do.
You can check out more reasons here: https://github.com/huggingface/safetensors#yet-another-format-

For instance, it loads files 2x faster than pickle (actually 10x on CPU).
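As a quick sketch of the loading side (the file name is a placeholder): safe_open memory-maps the file and only materializes the tensors you ask for, with nothing in the file that can execute code.

```python
from safetensors import safe_open

# the file is memory-mapped; tensors are materialized only on request
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    for name in f.keys():
        tensor = f.get_tensor(name)
        print(name, tuple(tensor.shape), tensor.dtype)
```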

Cognitive Computations org

I understand that, but as the author of the model I know what's in it. Also, this is primarily repackaged and quantized by TheBloke, and he and I trust each other.

I just train models, he packages them for distribution.

I don't see how adding an extra step to the publishing process benefits me personally.

The person you should talk to is @winglian . If he updates Axolotl so it uses the safetensors format by default, then that is what I'll publish.

My original reason for asking was to convert to exl2 quants, which TheBloke doesn't do. I had been unable to convert 70B models so far using the scripts I had (resource limits); however, the Panchovix script linked above by @Thireus works perfectly and doesn't use crazy resources. Thanks! With that I was able to convert it myself.
I'd add, though, that the Hugging Face built-in conversion tools also didn't work on this model, so my thinking is that it's a waste for so many people to convert it using various tools of varying quality when it could be converted once at the source and quality-assured. TheBloke used to release HF fp16 formats for some models, which was handy, but he doesn't do that often anymore.
Regardless, as for my original request: I don't need you to provide it anymore.

Cognitive Computations org

Fair enough. I think you might wanna chat with wing though; safetensors would certainly get more adoption if it were the default output format in Axolotl.

ehartford changed discussion status to closed
