Collections
Discover the best community collections!
Collections trending this week
- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
  Paper • 2208.07339 • Published • 4
- GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
  Paper • 2210.17323 • Published • 6
- SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
  Paper • 2211.10438 • Published • 2
- AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
  Paper • 2306.00978 • Published • 5