ISTA-DASLab/Meta-Llama-3-70B-Instruct-AQLM-2Bit-1x16

For this quantization, we used 1 codebook of 16 bits (the 1x16 scheme in the model name).
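As a rough illustration of what "1 codebook of 16 bits" implies, the sketch below works out the codebook size and the weight-group size needed to reach the 2 bits per weight advertised in the model name. The group size is inferred from that arithmetic and is an assumption, not something stated in this card; the calculation also ignores the storage cost of the codebooks, scales, and any layers left unquantized.

```python
# Values taken from this card: 1 codebook, 16 bits per code, "2Bit" in the model name.
NUM_CODEBOOKS = 1
BITS_PER_CODE = 16
TARGET_BITS_PER_WEIGHT = 2

# A 16-bit code can index this many codebook entries.
codebook_entries = 2 ** BITS_PER_CODE

# Each group of weights is stored as one code per codebook.
code_bits_per_group = NUM_CODEBOOKS * BITS_PER_CODE

# Assumption: the group size is whatever makes code bits / weights equal the target.
implied_group_size = code_bits_per_group // TARGET_BITS_PER_WEIGHT

print(f"codebook entries:   {codebook_entries}")
print(f"implied group size: {implied_group_size} weights per code")
```

Under these assumptions each 16-bit code stands in for a small group of weights, which is where the nominal 2 bits per weight comes from.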

Results (measured with lm_eval==0.4.0):

Model	Quantization	MMLU (5-shot)	ArcC	ArcE	Hellaswag	Winogrande	PiQA	Model size, GB
meta-llama/Meta-Llama-3-70B	-	0.7980	0.6160	0.8624	0.6367	0.8183	0.7632	141.2
meta-llama/Meta-Llama-3-70B	1x16	0.7587	0.4863	0.7668	0.6159	0.7481	0.7537	21.9
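To make the trade-off in the table easier to read, the following sketch computes the per-benchmark accuracy drop and the size reduction directly from the numbers reported above. The variable names are illustrative only, and nothing here is re-measured.

```python
# Scores copied from the results table (unquantized vs. 1x16).
baseline = {
    "MMLU (5-shot)": 0.7980, "ArcC": 0.6160, "ArcE": 0.8624,
    "Hellaswag": 0.6367, "Winogrande": 0.8183, "PiQA": 0.7632,
}
quantized = {
    "MMLU (5-shot)": 0.7587, "ArcC": 0.4863, "ArcE": 0.7668,
    "Hellaswag": 0.6159, "Winogrande": 0.7481, "PiQA": 0.7537,
}
baseline_size_gb = 141.2
quantized_size_gb = 21.9

# Absolute drop in score for each benchmark, rounded to table precision.
drops = {name: round(baseline[name] - quantized[name], 4) for name in baseline}
largest_drop = max(drops, key=drops.get)

# How many times smaller the quantized checkpoint is.
compression_ratio = baseline_size_gb / quantized_size_gb

for name, drop in drops.items():
    print(f"{name:14s} -{drop:.4f}")
print(f"largest drop: {largest_drop}")
print(f"size reduction: {compression_ratio:.2f}x")
```

On these figures the checkpoint shrinks by a factor of about 6.4, with ArcC showing the largest absolute drop and PiQA the smallest.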

Safetensors
Model size: 11B params
Tensor type: FP16, I16
