The int8 speed is very slow

#1
by lucasjin - opened
01-ai org

This may be related to your hardware. You can also try an inference framework such as vLLM to speed things up.
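To check whether a different framework actually helps on your hardware, it is worth measuring throughput the same way for each setup. Below is a minimal sketch, not an official recipe: `tokens_per_second` and `make_vllm_generate` are hypothetical helper names, and the model id is a placeholder you would fill in yourself. It only assumes vLLM's offline `LLM` / `SamplingParams` interface.

```python
import time
from typing import Callable, List


def tokens_per_second(generate: Callable[[str], List[int]],
                      prompt: str, runs: int = 3) -> float:
    """Average number of generated tokens per second over several runs."""
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(runs):
        total_tokens += len(generate(prompt))
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed if elapsed > 0 else float("inf")


def make_vllm_generate(model_id: str, max_tokens: int = 128):
    """Build a generate(prompt) -> token ids function backed by vLLM.

    Hypothetical helper; model_id is whatever checkpoint you are testing.
    """
    # Imported lazily so the timing helper above works without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_id)
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)

    def generate(prompt: str) -> List[int]:
        outputs = llm.generate([prompt], params)
        return list(outputs[0].outputs[0].token_ids)

    return generate
```

You could then call `tokens_per_second(make_vllm_generate("<your-model-id>"), "Hello")` and compare the number against the same measurement taken with your current int8 setup, wrapped in a function with the same `generate(prompt)` shape.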
