2023-10-19 20:02:41,370 ----------------------------------------------------------------------------------------------------
2023-10-19 20:02:41,370 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-19 20:02:41,370 ----------------------------------------------------------------------------------------------------
2023-10-19 20:02:41,370 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-19 20:02:41,370 ----------------------------------------------------------------------------------------------------
2023-10-19 20:02:41,370 Train: 7142 sentences
2023-10-19 20:02:41,370 (train_with_dev=False, train_with_test=False)
2023-10-19 20:02:41,371 ----------------------------------------------------------------------------------------------------
2023-10-19 20:02:41,371 Training Params:
2023-10-19 20:02:41,371 - learning_rate: "3e-05"
2023-10-19 20:02:41,371 - mini_batch_size: "8"
2023-10-19 20:02:41,371 - max_epochs: "10"
2023-10-19 20:02:41,371 - shuffle: "True"
2023-10-19 20:02:41,371 ----------------------------------------------------------------------------------------------------
2023-10-19 20:02:41,371 Plugins:
2023-10-19 20:02:41,371 - TensorboardLogger
2023-10-19 20:02:41,371 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 20:02:41,371 ----------------------------------------------------------------------------------------------------
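Note on the schedule: the per-iteration `lr` values logged below are consistent with a linear warmup to the peak learning rate over the first 10% of optimizer steps, followed by a linear decay to zero. The following is a minimal sketch of such a schedule, not Flair's actual `LinearScheduler` code. The function name `linear_warmup_lr` and the step-counting convention are assumptions for illustration. The step count is derived from the logged values (7142 training sentences, mini-batch size 8, 10 epochs).

```python
import math

# Values taken from the log header above.
TRAIN_SENTENCES = 7142
MINI_BATCH_SIZE = 8
MAX_EPOCHS = 10
PEAK_LR = 3e-5
WARMUP_FRACTION = 0.1

# 7142 / 8 rounds up to 893 iterations per epoch, matching "iter .../893".
steps_per_epoch = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
total_steps = steps_per_epoch * MAX_EPOCHS


def linear_warmup_lr(step, peak_lr=PEAK_LR, total=total_steps,
                     warmup_fraction=WARMUP_FRACTION):
    """Hypothetical helper: learning rate after `step` optimizer steps."""
    warmup_steps = round(total * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Linear decay from the peak down to 0 at the final step.
    return peak_lr * max(0.0, (total - step) / (total - warmup_steps))


if __name__ == "__main__":
    for epoch, it in [(1, 89), (1, 890), (2, 178), (10, 890)]:
        step = (epoch - 1) * steps_per_epoch + it
        print(f"epoch {epoch} iter {it}: lr ~ {linear_warmup_lr(step):.6f}")
```

Under these assumptions the warmup spans exactly the first epoch (893 steps), which agrees with the logged `lr` reaching 0.000030 at the end of epoch 1 and falling to 0.000000 by the end of epoch 10.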
2023-10-19 20:02:41,371 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 20:02:41,371 - metric: "('micro avg', 'f1-score')"
2023-10-19 20:02:41,371 ----------------------------------------------------------------------------------------------------
2023-10-19 20:02:41,371 Computation:
2023-10-19 20:02:41,371 - compute on device: cuda:0
2023-10-19 20:02:41,371 - embedding storage: none
2023-10-19 20:02:41,371 ----------------------------------------------------------------------------------------------------
2023-10-19 20:02:41,371 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-19 20:02:41,371 ----------------------------------------------------------------------------------------------------
2023-10-19 20:02:41,371 ----------------------------------------------------------------------------------------------------
2023-10-19 20:02:41,371 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 20:02:43,725 epoch 1 - iter 89/893 - loss 2.79188490 - time (sec): 2.35 - samples/sec: 11352.71 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:02:46,036 epoch 1 - iter 178/893 - loss 2.61515556 - time (sec): 4.66 - samples/sec: 10906.94 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:02:48,288 epoch 1 - iter 267/893 - loss 2.34054331 - time (sec): 6.92 - samples/sec: 10747.45 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:02:50,551 epoch 1 - iter 356/893 - loss 2.03989423 - time (sec): 9.18 - samples/sec: 10817.11 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:02:52,789 epoch 1 - iter 445/893 - loss 1.80785549 - time (sec): 11.42 - samples/sec: 10771.76 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:02:54,975 epoch 1 - iter 534/893 - loss 1.64636165 - time (sec): 13.60 - samples/sec: 10816.29 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:02:57,307 epoch 1 - iter 623/893 - loss 1.49893188 - time (sec): 15.94 - samples/sec: 10849.48 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:02:59,556 epoch 1 - iter 712/893 - loss 1.39030658 - time (sec): 18.18 - samples/sec: 10811.44 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:03:01,862 epoch 1 - iter 801/893 - loss 1.30158560 - time (sec): 20.49 - samples/sec: 10831.14 - lr: 0.000027 - momentum: 0.000000
2023-10-19 20:03:04,146 epoch 1 - iter 890/893 - loss 1.22337235 - time (sec): 22.77 - samples/sec: 10892.98 - lr: 0.000030 - momentum: 0.000000
2023-10-19 20:03:04,227 ----------------------------------------------------------------------------------------------------
2023-10-19 20:03:04,227 EPOCH 1 done: loss 1.2217 - lr: 0.000030
2023-10-19 20:03:05,672 DEV : loss 0.3446877598762512 - f1-score (micro avg) 0.0378
2023-10-19 20:03:05,686 saving best model
2023-10-19 20:03:05,721 ----------------------------------------------------------------------------------------------------
2023-10-19 20:03:07,724 epoch 2 - iter 89/893 - loss 0.55265620 - time (sec): 2.00 - samples/sec: 11812.70 - lr: 0.000030 - momentum: 0.000000
2023-10-19 20:03:09,989 epoch 2 - iter 178/893 - loss 0.50626466 - time (sec): 4.27 - samples/sec: 11382.89 - lr: 0.000029 - momentum: 0.000000
2023-10-19 20:03:12,173 epoch 2 - iter 267/893 - loss 0.50045584 - time (sec): 6.45 - samples/sec: 11059.55 - lr: 0.000029 - momentum: 0.000000
2023-10-19 20:03:14,417 epoch 2 - iter 356/893 - loss 0.49013333 - time (sec): 8.70 - samples/sec: 11043.35 - lr: 0.000029 - momentum: 0.000000
2023-10-19 20:03:16,658 epoch 2 - iter 445/893 - loss 0.49176159 - time (sec): 10.94 - samples/sec: 11102.34 - lr: 0.000028 - momentum: 0.000000
2023-10-19 20:03:18,956 epoch 2 - iter 534/893 - loss 0.48647694 - time (sec): 13.24 - samples/sec: 11113.66 - lr: 0.000028 - momentum: 0.000000
2023-10-19 20:03:21,023 epoch 2 - iter 623/893 - loss 0.47781561 - time (sec): 15.30 - samples/sec: 11386.80 - lr: 0.000028 - momentum: 0.000000
2023-10-19 20:03:23,292 epoch 2 - iter 712/893 - loss 0.47306629 - time (sec): 17.57 - samples/sec: 11350.70 - lr: 0.000027 - momentum: 0.000000
2023-10-19 20:03:25,506 epoch 2 - iter 801/893 - loss 0.47156156 - time (sec): 19.78 - samples/sec: 11325.69 - lr: 0.000027 - momentum: 0.000000
2023-10-19 20:03:27,763 epoch 2 - iter 890/893 - loss 0.46923210 - time (sec): 22.04 - samples/sec: 11261.08 - lr: 0.000027 - momentum: 0.000000
2023-10-19 20:03:27,837 ----------------------------------------------------------------------------------------------------
2023-10-19 20:03:27,838 EPOCH 2 done: loss 0.4691 - lr: 0.000027
2023-10-19 20:03:30,630 DEV : loss 0.279310941696167 - f1-score (micro avg) 0.2653
2023-10-19 20:03:30,644 saving best model
2023-10-19 20:03:30,676 ----------------------------------------------------------------------------------------------------
2023-10-19 20:03:32,903 epoch 3 - iter 89/893 - loss 0.40506812 - time (sec): 2.23 - samples/sec: 10690.15 - lr: 0.000026 - momentum: 0.000000
2023-10-19 20:03:35,235 epoch 3 - iter 178/893 - loss 0.38010385 - time (sec): 4.56 - samples/sec: 11115.43 - lr: 0.000026 - momentum: 0.000000
2023-10-19 20:03:37,501 epoch 3 - iter 267/893 - loss 0.37662633 - time (sec): 6.82 - samples/sec: 11073.04 - lr: 0.000026 - momentum: 0.000000
2023-10-19 20:03:39,721 epoch 3 - iter 356/893 - loss 0.38495631 - time (sec): 9.04 - samples/sec: 11045.67 - lr: 0.000025 - momentum: 0.000000
2023-10-19 20:03:41,991 epoch 3 - iter 445/893 - loss 0.38828927 - time (sec): 11.31 - samples/sec: 11129.43 - lr: 0.000025 - momentum: 0.000000
2023-10-19 20:03:44,287 epoch 3 - iter 534/893 - loss 0.39211636 - time (sec): 13.61 - samples/sec: 11133.36 - lr: 0.000025 - momentum: 0.000000
2023-10-19 20:03:46,507 epoch 3 - iter 623/893 - loss 0.39046621 - time (sec): 15.83 - samples/sec: 11032.92 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:03:48,760 epoch 3 - iter 712/893 - loss 0.39250884 - time (sec): 18.08 - samples/sec: 11020.44 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:03:51,013 epoch 3 - iter 801/893 - loss 0.39203862 - time (sec): 20.34 - samples/sec: 11038.05 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:03:53,262 epoch 3 - iter 890/893 - loss 0.38941708 - time (sec): 22.59 - samples/sec: 10979.59 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:03:53,342 ----------------------------------------------------------------------------------------------------
2023-10-19 20:03:53,342 EPOCH 3 done: loss 0.3895 - lr: 0.000023
2023-10-19 20:03:55,699 DEV : loss 0.24555543065071106 - f1-score (micro avg) 0.3492
2023-10-19 20:03:55,714 saving best model
2023-10-19 20:03:55,750 ----------------------------------------------------------------------------------------------------
2023-10-19 20:03:58,025 epoch 4 - iter 89/893 - loss 0.36113360 - time (sec): 2.27 - samples/sec: 11397.77 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:04:00,247 epoch 4 - iter 178/893 - loss 0.37535560 - time (sec): 4.50 - samples/sec: 11055.72 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:04:02,497 epoch 4 - iter 267/893 - loss 0.38798410 - time (sec): 6.75 - samples/sec: 10954.42 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:04:04,814 epoch 4 - iter 356/893 - loss 0.36743176 - time (sec): 9.06 - samples/sec: 10987.50 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:04:07,053 epoch 4 - iter 445/893 - loss 0.36072597 - time (sec): 11.30 - samples/sec: 11008.14 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:04:09,282 epoch 4 - iter 534/893 - loss 0.35634201 - time (sec): 13.53 - samples/sec: 10966.00 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:04:11,544 epoch 4 - iter 623/893 - loss 0.35741415 - time (sec): 15.79 - samples/sec: 10900.67 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:04:13,740 epoch 4 - iter 712/893 - loss 0.35551333 - time (sec): 17.99 - samples/sec: 10928.84 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:04:16,072 epoch 4 - iter 801/893 - loss 0.35441138 - time (sec): 20.32 - samples/sec: 10943.74 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:04:18,390 epoch 4 - iter 890/893 - loss 0.35047145 - time (sec): 22.64 - samples/sec: 10963.62 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:04:18,455 ----------------------------------------------------------------------------------------------------
2023-10-19 20:04:18,455 EPOCH 4 done: loss 0.3506 - lr: 0.000020
2023-10-19 20:04:21,307 DEV : loss 0.23053158819675446 - f1-score (micro avg) 0.4164
2023-10-19 20:04:21,320 saving best model
2023-10-19 20:04:21,354 ----------------------------------------------------------------------------------------------------
2023-10-19 20:04:23,584 epoch 5 - iter 89/893 - loss 0.36360822 - time (sec): 2.23 - samples/sec: 11218.46 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:04:25,876 epoch 5 - iter 178/893 - loss 0.34953845 - time (sec): 4.52 - samples/sec: 11330.46 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:04:28,191 epoch 5 - iter 267/893 - loss 0.33640113 - time (sec): 6.84 - samples/sec: 11092.39 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:04:30,437 epoch 5 - iter 356/893 - loss 0.33695372 - time (sec): 9.08 - samples/sec: 10958.27 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:04:32,684 epoch 5 - iter 445/893 - loss 0.33502067 - time (sec): 11.33 - samples/sec: 10796.79 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:04:34,983 epoch 5 - iter 534/893 - loss 0.32633282 - time (sec): 13.63 - samples/sec: 10838.27 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:04:37,208 epoch 5 - iter 623/893 - loss 0.32930044 - time (sec): 15.85 - samples/sec: 10786.98 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:04:39,526 epoch 5 - iter 712/893 - loss 0.32599754 - time (sec): 18.17 - samples/sec: 10806.95 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:04:41,815 epoch 5 - iter 801/893 - loss 0.32412592 - time (sec): 20.46 - samples/sec: 10882.52 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:04:44,085 epoch 5 - iter 890/893 - loss 0.32457590 - time (sec): 22.73 - samples/sec: 10907.25 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:04:44,171 ----------------------------------------------------------------------------------------------------
2023-10-19 20:04:44,171 EPOCH 5 done: loss 0.3245 - lr: 0.000017
2023-10-19 20:04:46,562 DEV : loss 0.22031265497207642 - f1-score (micro avg) 0.4319
2023-10-19 20:04:46,578 saving best model
2023-10-19 20:04:46,612 ----------------------------------------------------------------------------------------------------
2023-10-19 20:04:49,412 epoch 6 - iter 89/893 - loss 0.29917674 - time (sec): 2.80 - samples/sec: 8970.24 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:04:51,611 epoch 6 - iter 178/893 - loss 0.30090357 - time (sec): 5.00 - samples/sec: 9705.99 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:04:53,798 epoch 6 - iter 267/893 - loss 0.30335158 - time (sec): 7.19 - samples/sec: 10058.54 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:04:56,069 epoch 6 - iter 356/893 - loss 0.29977449 - time (sec): 9.46 - samples/sec: 10419.04 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:04:58,350 epoch 6 - iter 445/893 - loss 0.29912670 - time (sec): 11.74 - samples/sec: 10649.72 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:05:00,578 epoch 6 - iter 534/893 - loss 0.29972497 - time (sec): 13.97 - samples/sec: 10682.42 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:05:02,836 epoch 6 - iter 623/893 - loss 0.30160090 - time (sec): 16.22 - samples/sec: 10660.06 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:05:05,114 epoch 6 - iter 712/893 - loss 0.30329819 - time (sec): 18.50 - samples/sec: 10674.37 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:05:07,353 epoch 6 - iter 801/893 - loss 0.30226174 - time (sec): 20.74 - samples/sec: 10770.24 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:05:09,610 epoch 6 - iter 890/893 - loss 0.30273390 - time (sec): 23.00 - samples/sec: 10787.39 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:05:09,681 ----------------------------------------------------------------------------------------------------
2023-10-19 20:05:09,681 EPOCH 6 done: loss 0.3027 - lr: 0.000013
2023-10-19 20:05:12,048 DEV : loss 0.21030142903327942 - f1-score (micro avg) 0.4508
2023-10-19 20:05:12,062 saving best model
2023-10-19 20:05:12,098 ----------------------------------------------------------------------------------------------------
2023-10-19 20:05:14,410 epoch 7 - iter 89/893 - loss 0.27586402 - time (sec): 2.31 - samples/sec: 11425.48 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:05:16,516 epoch 7 - iter 178/893 - loss 0.29185307 - time (sec): 4.42 - samples/sec: 11712.87 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:05:18,743 epoch 7 - iter 267/893 - loss 0.28935607 - time (sec): 6.64 - samples/sec: 11312.02 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:05:21,023 epoch 7 - iter 356/893 - loss 0.29203548 - time (sec): 8.92 - samples/sec: 11033.42 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:05:23,248 epoch 7 - iter 445/893 - loss 0.29379774 - time (sec): 11.15 - samples/sec: 11063.54 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:05:25,531 epoch 7 - iter 534/893 - loss 0.29301504 - time (sec): 13.43 - samples/sec: 11023.25 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:05:27,819 epoch 7 - iter 623/893 - loss 0.29205510 - time (sec): 15.72 - samples/sec: 11044.37 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:05:30,143 epoch 7 - iter 712/893 - loss 0.29058991 - time (sec): 18.04 - samples/sec: 11110.85 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:05:32,363 epoch 7 - iter 801/893 - loss 0.29245337 - time (sec): 20.26 - samples/sec: 11065.63 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:05:34,600 epoch 7 - iter 890/893 - loss 0.29089580 - time (sec): 22.50 - samples/sec: 11036.49 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:05:34,670 ----------------------------------------------------------------------------------------------------
2023-10-19 20:05:34,671 EPOCH 7 done: loss 0.2911 - lr: 0.000010
2023-10-19 20:05:37,489 DEV : loss 0.20720338821411133 - f1-score (micro avg) 0.4672
2023-10-19 20:05:37,502 saving best model
2023-10-19 20:05:37,536 ----------------------------------------------------------------------------------------------------
2023-10-19 20:05:39,794 epoch 8 - iter 89/893 - loss 0.27774907 - time (sec): 2.26 - samples/sec: 11112.16 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:05:42,014 epoch 8 - iter 178/893 - loss 0.26957585 - time (sec): 4.48 - samples/sec: 11162.38 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:05:44,238 epoch 8 - iter 267/893 - loss 0.28147536 - time (sec): 6.70 - samples/sec: 11015.63 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:05:46,501 epoch 8 - iter 356/893 - loss 0.28289854 - time (sec): 8.96 - samples/sec: 11009.76 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:05:48,711 epoch 8 - iter 445/893 - loss 0.28117295 - time (sec): 11.17 - samples/sec: 11071.86 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:05:50,984 epoch 8 - iter 534/893 - loss 0.28021332 - time (sec): 13.45 - samples/sec: 11099.17 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:05:53,200 epoch 8 - iter 623/893 - loss 0.28009769 - time (sec): 15.66 - samples/sec: 11108.29 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:05:55,483 epoch 8 - iter 712/893 - loss 0.27850109 - time (sec): 17.95 - samples/sec: 11093.56 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:05:57,728 epoch 8 - iter 801/893 - loss 0.28167039 - time (sec): 20.19 - samples/sec: 11062.97 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:05:59,979 epoch 8 - iter 890/893 - loss 0.28144169 - time (sec): 22.44 - samples/sec: 11048.66 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:06:00,053 ----------------------------------------------------------------------------------------------------
2023-10-19 20:06:00,053 EPOCH 8 done: loss 0.2812 - lr: 0.000007
2023-10-19 20:06:02,455 DEV : loss 0.20476743578910828 - f1-score (micro avg) 0.4805
2023-10-19 20:06:02,469 saving best model
2023-10-19 20:06:02,504 ----------------------------------------------------------------------------------------------------
2023-10-19 20:06:04,834 epoch 9 - iter 89/893 - loss 0.27170126 - time (sec): 2.33 - samples/sec: 11367.49 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:06:07,090 epoch 9 - iter 178/893 - loss 0.26530562 - time (sec): 4.59 - samples/sec: 11293.74 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:06:09,087 epoch 9 - iter 267/893 - loss 0.26261891 - time (sec): 6.58 - samples/sec: 11573.45 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:06:11,314 epoch 9 - iter 356/893 - loss 0.26838652 - time (sec): 8.81 - samples/sec: 11293.47 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:06:13,535 epoch 9 - iter 445/893 - loss 0.27202370 - time (sec): 11.03 - samples/sec: 11166.55 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:06:15,784 epoch 9 - iter 534/893 - loss 0.27221890 - time (sec): 13.28 - samples/sec: 11209.09 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:06:18,056 epoch 9 - iter 623/893 - loss 0.27510528 - time (sec): 15.55 - samples/sec: 11191.69 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:06:20,253 epoch 9 - iter 712/893 - loss 0.27406065 - time (sec): 17.75 - samples/sec: 11176.26 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:06:22,509 epoch 9 - iter 801/893 - loss 0.27418893 - time (sec): 20.00 - samples/sec: 11185.83 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:06:24,749 epoch 9 - iter 890/893 - loss 0.27283631 - time (sec): 22.24 - samples/sec: 11150.60 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:06:24,822 ----------------------------------------------------------------------------------------------------
2023-10-19 20:06:24,823 EPOCH 9 done: loss 0.2724 - lr: 0.000003
2023-10-19 20:06:27,630 DEV : loss 0.20394523441791534 - f1-score (micro avg) 0.4875
2023-10-19 20:06:27,643 saving best model
2023-10-19 20:06:27,678 ----------------------------------------------------------------------------------------------------
2023-10-19 20:06:29,876 epoch 10 - iter 89/893 - loss 0.27186045 - time (sec): 2.20 - samples/sec: 10565.18 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:06:32,140 epoch 10 - iter 178/893 - loss 0.28325566 - time (sec): 4.46 - samples/sec: 10724.49 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:06:34,426 epoch 10 - iter 267/893 - loss 0.28330112 - time (sec): 6.75 - samples/sec: 10745.16 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:06:36,681 epoch 10 - iter 356/893 - loss 0.28151126 - time (sec): 9.00 - samples/sec: 10845.21 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:06:38,903 epoch 10 - iter 445/893 - loss 0.27886107 - time (sec): 11.22 - samples/sec: 10819.44 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:06:41,070 epoch 10 - iter 534/893 - loss 0.27309122 - time (sec): 13.39 - samples/sec: 10915.88 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:06:43,380 epoch 10 - iter 623/893 - loss 0.26808545 - time (sec): 15.70 - samples/sec: 10969.00 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:06:45,637 epoch 10 - iter 712/893 - loss 0.26343191 - time (sec): 17.96 - samples/sec: 11030.06 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:06:47,913 epoch 10 - iter 801/893 - loss 0.26658732 - time (sec): 20.23 - samples/sec: 11041.20 - lr: 0.000000 - momentum: 0.000000
2023-10-19 20:06:50,173 epoch 10 - iter 890/893 - loss 0.26998091 - time (sec): 22.49 - samples/sec: 11032.32 - lr: 0.000000 - momentum: 0.000000
2023-10-19 20:06:50,245 ----------------------------------------------------------------------------------------------------
2023-10-19 20:06:50,246 EPOCH 10 done: loss 0.2707 - lr: 0.000000
2023-10-19 20:06:53,057 DEV : loss 0.20230604708194733 - f1-score (micro avg) 0.4879
2023-10-19 20:06:53,070 saving best model
2023-10-19 20:06:53,132 ----------------------------------------------------------------------------------------------------
2023-10-19 20:06:53,132 Loading model from best epoch ...
2023-10-19 20:06:53,211 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 20:06:57,772
Results:
- F-score (micro) 0.373
- F-score (macro) 0.2137
- Accuracy 0.2375
By class:
              precision    recall  f1-score   support

         LOC     0.3794    0.4612    0.4163      1095
         PER     0.3723    0.4595    0.4113      1012
         ORG     0.0435    0.0196    0.0270       357
   HumanProd     0.0000    0.0000    0.0000        33

   micro avg     0.3564    0.3913    0.3730      2497
   macro avg     0.1988    0.2351    0.2137      2497
weighted avg     0.3235    0.3913    0.3531      2497
2023-10-19 20:06:57,772 ----------------------------------------------------------------------------------------------------
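Note on the averages: the aggregate rows of the report can be cross-checked from the per-class rows. The sketch below is a plain-Python illustration and does not reproduce Flair's or scikit-learn's evaluation code. The dictionary `report` and the helper `f1` are hypothetical names. All numbers are copied from the table above, so results agree with it only up to the four-decimal rounding of the inputs.

```python
# Per-class rows from the report: (precision, recall, f1, support).
report = {
    "LOC": (0.3794, 0.4612, 0.4163, 1095),
    "PER": (0.3723, 0.4595, 0.4113, 1012),
    "ORG": (0.0435, 0.0196, 0.0270, 357),
    "HumanProd": (0.0000, 0.0000, 0.0000, 33),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(s for _, _, _, s in report.values())

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in report.values()) / len(report)

# Weighted F1: per-class F1 scores weighted by class support.
weighted_f1 = sum(f * s for _, _, f, s in report.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_f1 = f1(0.3564, 0.3913)

if __name__ == "__main__":
    print(f"support={total_support} micro={micro_f1:.4f} "
          f"macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

The gap between the micro (0.3730) and macro (0.2137) scores comes from the ORG and HumanProd classes, which contribute near-zero F1 to the unweighted mean while accounting for only 390 of the 2497 gold entities.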