Quantization support

#12
by Kernel - opened

Is there any possible way to get 8-bit quantization? BTW, what is the model size? 7B? Can't find this information.

Hi,

There is an ongoing effort to port Kosmos-2 directly into transformers. This repository (remote code) might need some more bug fixes later, including breaking changes.
I would suggest waiting for the official support (where I will try to make it work with quantization, though I can't guarantee that at this moment).
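For intuition while waiting, here is a minimal, self-contained sketch of what symmetric 8-bit weight quantization does: floats are mapped to int8 values plus a per-tensor scale, which is the basic idea behind 8-bit loading in libraries like bitsandbytes. The helper names are illustrative, not Kosmos-2 or transformers code.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: float -> (int8 values, scale)."""
    # scale maps the largest-magnitude weight onto the int8 range [-127, 127]
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]

weights = [0.3, -1.0, 0.25]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# each restored value is within half a quantization step of the original
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(weights, restored))
```

The storage win is that each weight becomes 1 byte instead of 4 (fp32), at the cost of a small rounding error per weight.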

Regarding the model size, the paper says "The total number of trainable parameters amounts to approximately 1.6B", but I didn't verify it myself. The model file (PyTorch bin file) is 6.6 GB, however.
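Those two numbers are actually consistent with each other: a quick sanity check (treating GB as 10^9 bytes and assuming the checkpoint stores plain fp32 tensors) gives roughly 4 bytes per parameter, which is exactly what fp32 storage would predict.

```python
params = 1.6e9       # "approximately 1.6B" trainable parameters (from the paper)
file_bytes = 6.6e9   # checkpoint size, taking 1 GB = 10^9 bytes

bytes_per_param = file_bytes / params
print(round(bytes_per_param, 1))  # ~4.1, matching fp32 (4 bytes per parameter)

# int8 quantization stores 1 byte per weight, so the weights alone
# would shrink to roughly:
print(params * 1 / 1e9, "GB")  # ~1.6 GB
```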

Closing this issue. Feel free to reopen once an official port is merged into transformers. Thank you.

ydshieh changed discussion status to closed
