- generating extremely slow, compared to 4k length model — #4 opened 23 days ago by CHNtentes
- Run with full 128k context in 24G vram — #3 opened 28 days ago by meigami
- Slow inference speed and high VRAM using huggingface transformers — #2 opened about 1 month ago by Starlento
- Adding `safetensors` variant of this model — #1 opened about 1 month ago by SFconvertbot