Based on Meta-Llama-3-8B-Instruct and governed by the Meta Llama 3 License agreement: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
We realized there was a tokenization mistake in the previous DPO model, so this is a new version testing out DPO training on the following dataset:
https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k
The open LLM results are really BAD lol. Something with this dataset is disagreeing with Llama 3?
We are happy for anyone to try it out and give some feedback. We won't have the model up on our LLM API at https://awanllm.com...
Instruct format:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>
{{ user_message_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
{{ model_answer_1 }}<|eot_id|><|start_header_id|>user<|end_header_id|>
{{ user_message_2 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
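The template above can be assembled programmatically. Below is a minimal sketch in plain Python, not the authors' own tooling: the names `build_prompt`, `format_turn`, and `HEADER_SEP` are hypothetical, and the blank line after each header follows Meta's reference Llama 3 chat template (the rendering above shows a single line break, so adjust `HEADER_SEP` if your setup expects that).

```python
# Special tokens copied from the instruct format shown above.
BOS = "<|begin_of_text|>"
EOT = "<|eot_id|>"

# Assumption: Meta's reference template puts a blank line after each header.
HEADER_SEP = "\n\n"


def format_turn(role: str, content: str) -> str:
    """Wrap one message in its role header and end-of-turn token."""
    return f"<|start_header_id|>{role}<|end_header_id|>{HEADER_SEP}{content}{EOT}"


def build_prompt(system_prompt: str, messages: list[tuple[str, str]]) -> str:
    """Build a prompt string from a system prompt and (role, content) pairs.

    `messages` alternates "user" and "assistant" turns and should end with a
    user turn; the result ends with an open assistant header so the model
    generates the next answer.
    """
    prompt = BOS + format_turn("system", system_prompt)
    for role, content in messages:
        prompt += format_turn(role, content)
    # Leave the assistant turn open for generation.
    prompt += f"<|start_header_id|>assistant<|end_header_id|>{HEADER_SEP}"
    return prompt


if __name__ == "__main__":
    demo = build_prompt(
        "You are a helpful assistant.",
        [
            ("user", "Hello!"),
            ("assistant", "Hi, how can I help?"),
            ("user", "Tell me a joke."),
        ],
    )
    print(demo)
```

If you load the model through a library that ships its own chat template, prefer that over hand-built strings; this sketch is mainly useful for raw completion endpoints or for checking what the model actually receives.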
Quants:
FP16: https://huggingface.co/AwanLLM/Awanllm-Llama-3-8B-Instruct-DPO-v0.2
GGUF: https://huggingface.co/AwanLLM/Awanllm-Llama-3-8B-Instruct-DPO-v0.2-GGUF
Model size: 8.03B params
Architecture: llama