# Uploaded model
- Developed by: Tung177
- License: apache-2.0
- Finetuned from model: ura-hcmut/GemSUra-2B

This Gemma model was trained 2x faster with Unsloth and Hugging Face's TRL library.
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 64
- total_train_batch_size: 64
- optimizer: paged_adamw_32bit
- lr_scheduler_type: constant
- num_epochs: 3
- QLoRA config: r=64, lora_alpha=16, lora_dropout=0
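
For reference, here is a minimal sketch of how these hyperparameters might map onto an Unsloth + TRL fine-tuning script. The dataset path, sequence length, target modules, and output directory are illustrative assumptions not stated on this card, and exact `SFTTrainer` keyword arguments vary across TRL versions.

```python
# Hedged sketch of the QLoRA fine-tuning setup implied by the hyperparameters
# above. Dataset, max_seq_length, target_modules, and output_dir are
# illustrative assumptions, not values taken from this card.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model in 4-bit, as QLoRA requires.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="ura-hcmut/GemSUra-2B",
    max_seq_length=2048,  # assumption: not stated on the card
    load_in_4bit=True,
)

# Attach LoRA adapters with the r/alpha/dropout values listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    lora_alpha=16,
    lora_dropout=0,
    # Assumption: typical attention/MLP projections for Gemma-style models.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    random_state=42,
)

# Hypothetical training data; the card does not name the dataset.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

args = TrainingArguments(
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,  # effective batch size: 1 * 64 = 64
    learning_rate=2e-4,
    lr_scheduler_type="constant",
    num_train_epochs=3,
    optim="paged_adamw_32bit",
    seed=42,
    output_dir="outputs",  # assumption
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumption: name of the text column
    max_seq_length=2048,
    args=args,
)
trainer.train()
```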