Terjman-Large

This model is a fine-tuned version of Helsinki-NLP/opus-mt-tc-big-en-ar on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 3.2078
Bleu: 8.3292
Gen Len: 34.4959

Model description

More information needed

Intended uses & limitations

More information needed
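
Until this section is filled in, the following is a minimal loading sketch rather than documented usage. It assumes the checkpoint is published under the hub id atlasia/Terjman-Large, that it loads through the generic sequence-to-sequence classes in Transformers like its base model Helsinki-NLP/opus-mt-tc-big-en-ar, and that it takes English input as the base model does. The helper name `translate` and the `max_new_tokens` value are illustrative choices, not settings taken from this card.

```python
# Assumption: hub id under which this checkpoint is published.
MODEL_ID = "atlasia/Terjman-Large"


def translate(texts, model_id=MODEL_ID, max_new_tokens=128):
    """Translate a list of source sentences with the fine-tuned checkpoint.

    Hypothetical helper for illustration; generation settings are not
    documented in the card, so library defaults are used apart from the
    max_new_tokens cap.
    """
    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Pad to the longest sentence so the batch is a rectangular tensor.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["Hello, how are you?"])` downloads the weights on first use and returns a list with one translated string. With a BLEU of about 8.3 on the evaluation set, outputs should be checked before any downstream use.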

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 22
eval_batch_size: 22
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 88
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.03
num_epochs: 40
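
The relationship between these values can be checked with a few lines of arithmetic. The sketch below assumes single-device training (so the total batch size is just the per-device batch size times the accumulation steps) and takes the total of 16280 optimizer steps from the last row of the training results table. The warmup rounding and the shape of the schedule follow the usual linear-warmup, linear-decay definition; the function name `linear_lr` is illustrative.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 22
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.03

# Final optimizer step reported in the training results table.
TOTAL_STEPS = 16280

# Assumption: one device, so no extra multiplier for data parallelism.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Warmup length in optimizer steps, rounded up.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)
```

Here `effective_batch_size` reproduces the reported total_train_batch_size of 88, and `linear_lr` peaks at the configured 3e-05 once warmup ends before decaying to zero at the final step.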

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
No log	0.9982	407	4.3938	4.6056	22.6033
5.1616	1.9988	815	3.7257	5.8319	30.9201
3.902	2.9994	1223	3.5214	6.7311	32.9091
3.5737	4.0	1631	3.4204	7.3684	32.1433
3.4576	4.9982	2038	3.3562	7.8632	34.5399
3.4576	5.9988	2446	3.3151	7.9739	35.3278
3.3833	6.9994	2854	3.2884	8.0825	35.8292
3.3358	8.0	3262	3.2681	8.2765	34.5427
3.3069	8.9982	3669	3.2517	8.1019	33.584
3.2769	9.9988	4077	3.2404	8.106	33.3802
3.2769	10.9994	4485	3.2342	8.3037	33.303
3.2777	12.0	4893	3.2284	8.0674	33.3967
3.2476	12.9982	5300	3.2226	8.2883	33.8154
3.2611	13.9988	5708	3.2189	8.3537	34.0413
3.2511	14.9994	6116	3.2159	8.1365	34.5014
3.2437	16.0	6524	3.2140	8.3549	34.0606
3.2437	16.9982	6931	3.2131	8.2507	34.303
3.2498	17.9988	7339	3.2116	8.2928	33.9945
3.2341	18.9994	7747	3.2105	8.337	33.7052
3.2403	20.0	8155	3.2098	8.3179	34.3526
3.2229	20.9982	8562	3.2094	8.3848	34.2039
3.2229	21.9988	8970	3.2090	8.2042	34.6529
3.2379	22.9994	9378	3.2086	8.4227	34.0275
3.2257	24.0	9786	3.2082	8.3515	34.3306
3.2526	24.9982	10193	3.2085	8.4089	34.4986
3.2206	25.9988	10601	3.2082	8.476	34.6226
3.2288	26.9994	11009	3.2083	8.4452	33.697
3.2288	28.0	11417	3.2080	8.29	34.0331
3.2251	28.9982	11824	3.2080	8.35	34.2948
3.2302	29.9988	12232	3.2078	8.4408	33.416
3.21	30.9994	12640	3.2079	8.2934	34.0854
3.2271	32.0	13048	3.2079	8.4573	33.3912
3.2271	32.9982	13455	3.2078	8.4055	34.2452
3.2428	33.9988	13863	3.2079	8.5107	34.5152
3.2303	34.9994	14271	3.2080	8.3734	34.2562
3.2129	36.0	14679	3.2079	8.3193	34.4628
3.2119	36.9982	15086	3.2082	8.4122	34.2121
3.2119	37.9988	15494	3.2078	8.3585	33.8843
3.2445	38.9994	15902	3.2079	8.3968	34.6722
3.2356	39.9264	16280	3.2078	8.3292	34.4959
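
Validation loss is flat to the third decimal place over the last epochs while BLEU still moves between roughly 8.3 and 8.5, so the final checkpoint is not necessarily the one with the highest BLEU. The sketch below illustrates the comparison on the last seven evaluations, copied from the table; the tuple layout and variable names are illustrative.

```python
# (epoch, step, validation_loss, bleu) for the last seven evaluations in the table.
EVALS = [
    (33.9988, 13863, 3.2079, 8.5107),
    (34.9994, 14271, 3.2080, 8.3734),
    (36.0, 14679, 3.2079, 8.3193),
    (36.9982, 15086, 3.2082, 8.4122),
    (37.9988, 15494, 3.2078, 8.3585),
    (38.9994, 15902, 3.2079, 8.3968),
    (39.9264, 16280, 3.2078, 8.3292),
]

# Checkpoint with the highest BLEU in this excerpt.
best_bleu = max(EVALS, key=lambda row: row[3])

# Checkpoint with the lowest validation loss (the earliest one wins on ties).
lowest_loss = min(EVALS, key=lambda row: row[2])

# The last evaluation is the one reported at the top of the card.
final = EVALS[-1]

print(f"best BLEU:   step {best_bleu[1]} (BLEU {best_bleu[3]})")
print(f"lowest loss: step {lowest_loss[1]} (loss {lowest_loss[2]})")
print(f"final:       step {final[1]} (BLEU {final[3]})")
```

In the table, the highest BLEU (8.5107) occurs at step 13863, while the reported final model at step 16280 scores 8.3292. Whether the intermediate checkpoints were saved is not stated in this card.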

Framework versions

Transformers 4.40.2
PyTorch 2.2.1+cu121
Datasets 2.19.1
Tokenizers 0.19.1
