Model Card for Model ID

This is Meta's Llama 2 7B, quantized to 2-bit precision with AutoGPTQ through its integration in Hugging Face Transformers.

Model Details

Model Description

Model Sources

The method and code used to quantize the model are explained here: Quantize and Fine-tune LLMs with GPTQ Using Transformers and TRL
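
As a rough illustration of the kind of workflow that article covers, the sketch below quantizes Llama 2 7B to 2-bit with `GPTQConfig` from Transformers. Only the 2-bit setting comes from this card. The calibration dataset (`c4`), the group size, the base repository name, and the output directory are assumptions, and the settings actually used for this model may differ. Running it requires `optimum`, a GPTQ backend such as `auto-gptq`, a GPU, and access to the gated Llama 2 repository.

```python
# Assumed settings: only bits=2 is stated in the model card.
BASE_MODEL = "meta-llama/Llama-2-7b-hf"   # assumed base checkpoint (gated repo)
QUANT_SETTINGS = {"bits": 2, "dataset": "c4", "group_size": 128}


def quantize_llama2(base_model=BASE_MODEL, output_dir="llama-2-7b-gptq-2bit"):
    """Quantize a causal LM with GPTQ and save the result locally."""
    # Heavy dependencies are imported here so the module loads without them.
    from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    # The tokenizer is needed to tokenize the calibration dataset.
    gptq_config = GPTQConfig(tokenizer=tokenizer, **QUANT_SETTINGS)

    # Passing quantization_config triggers calibration and quantization on load.
    model = AutoModelForCausalLM.from_pretrained(
        base_model,
        quantization_config=gptq_config,
        device_map="auto",
    )

    # Move the model to a single device before saving.
    model.to("cpu")
    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
    return output_dir
```

Calling `quantize_llama2()` would write the quantized checkpoint to the assumed output directory; calibration on a 7B model can take a long time.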

Uses

This model is pre-trained only; it has not been fine-tuned. You can fine-tune it with PEFT by training adapters on top of the quantized weights. Note that 2-bit quantization significantly degrades the output quality of Llama 2 compared with the unquantized model.
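
A minimal sketch of attaching LoRA adapters with PEFT might look like the following. The card does not state the repository ID or any adapter hyperparameters, so `model_id` is a placeholder the caller must supply, and the rank, alpha, dropout, and target modules are illustrative assumptions rather than recommended values.

```python
# Illustrative LoRA hyperparameters; none of these come from the model card.
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",
}


def attach_lora_adapters(model_id):
    """Load a GPTQ-quantized model and wrap it with trainable LoRA adapters.

    model_id is the Hub repository or local path of the quantized model
    (not given in the card, so it must be provided by the caller).
    """
    # Heavy dependencies are imported here so the module loads without them.
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM

    # The quantization settings are read from the checkpoint's config.
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Freeze the quantized base weights and prepare the model for training.
    model = prepare_model_for_kbit_training(model)

    # Only the adapter weights are trained; the 2-bit weights stay frozen.
    model = get_peft_model(model, LoraConfig(**LORA_SETTINGS))
    model.print_trainable_parameters()
    return model
```

The returned model can then be passed to a standard training loop or trainer. Because the base weights are frozen, the adapters have to compensate for the quality lost to 2-bit quantization as well as learn the new task.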

Other versions

Model Card Contact

The Kaitchup

Safetensors
Model size: 723M params
Tensor types: I32 · FP16
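
The 723M figure counts stored tensor elements, not original weights, which is why it is far below 7B. As a back-of-the-envelope check, the sketch below counts elements under two assumptions that the card does not confirm: an AutoGPTQ-style layout (`qweight`, `qzeros`, `scales`, `g_idx` per linear layer, with sixteen 2-bit values packed into each 32-bit integer) and a group size of 128. The layer dimensions are those of Llama 2 7B.

```python
# Llama 2 7B architecture dimensions.
HIDDEN, INTERMEDIATE, LAYERS, VOCAB = 4096, 11008, 32, 32000

# Assumed quantization layout: 2-bit values packed into int32, group size 128.
BITS, GROUP_SIZE = 2, 128
PACK = 32 // BITS  # number of 2-bit values per 32-bit integer


def quantized_linear_elements(in_features, out_features):
    """Element counts for one quantized linear layer, split by dtype."""
    groups = in_features // GROUP_SIZE
    qweight = (in_features // PACK) * out_features   # packed weights (I32)
    qzeros = groups * (out_features // PACK)         # packed zero points (I32)
    g_idx = in_features                              # group index per input row (I32)
    scales = groups * out_features                   # one scale per group and column (FP16)
    return {"I32": qweight + qzeros + g_idx, "FP16": scales}


def estimate_elements():
    """Estimated stored elements for the whole model, split by dtype."""
    # Per decoder layer: q, k, v, o projections, then gate, up, down projections.
    shapes = [(HIDDEN, HIDDEN)] * 4 + [(HIDDEN, INTERMEDIATE)] * 2 + [(INTERMEDIATE, HIDDEN)]
    totals = {"I32": 0, "FP16": 0}
    for in_f, out_f in shapes:
        for dtype, count in quantized_linear_elements(in_f, out_f).items():
            totals[dtype] += count * LAYERS
    # Token embeddings and the output head are assumed to stay in FP16.
    totals["FP16"] += 2 * VOCAB * HIDDEN
    # Two RMSNorm vectors per layer plus the final norm.
    totals["FP16"] += (2 * LAYERS + 1) * HIDDEN
    return totals


counts = estimate_elements()
total = counts["I32"] + counts["FP16"]
print(counts, total)
```

Under these assumptions the estimate comes to about 722M elements (roughly 409M I32 and 313M FP16), close to the 723M reported. The small gap may come from tensors this sketch does not model.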