2023-10-19 19:30:38,316 ----------------------------------------------------------------------------------------------------
2023-10-19 19:30:38,317 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 19:30:38,317 ----------------------------------------------------------------------------------------------------
2023-10-19 19:30:38,317 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-19 19:30:38,317 ----------------------------------------------------------------------------------------------------
2023-10-19 19:30:38,317 Train: 7142 sentences
2023-10-19 19:30:38,317 (train_with_dev=False, train_with_test=False)
2023-10-19 19:30:38,317 ----------------------------------------------------------------------------------------------------
2023-10-19 19:30:38,317 Training Params:
2023-10-19 19:30:38,317 - learning_rate: "3e-05"
2023-10-19 19:30:38,317 - mini_batch_size: "4"
2023-10-19 19:30:38,317 - max_epochs: "10"
2023-10-19 19:30:38,317 - shuffle: "True"
2023-10-19 19:30:38,317 ----------------------------------------------------------------------------------------------------
2023-10-19 19:30:38,317 Plugins:
2023-10-19 19:30:38,317 - TensorboardLogger
2023-10-19 19:30:38,317 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 19:30:38,317 ----------------------------------------------------------------------------------------------------
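Note on the lr column: the values logged during training are consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The sketch below is a minimal reconstruction under that assumption, not Flair's own scheduler code. The function name linear_schedule_lr and its parameters are hypothetical; the step counts (1786 iterations per epoch, 10 epochs) and the peak rate 3e-05 are taken from this log.

```python
def linear_schedule_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    """Learning rate after `step` optimizer steps (hypothetical helper).

    Assumes linear warmup from 0 to peak_lr, then linear decay back to 0.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: ramp up proportionally to the step count.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: fall off linearly over the remaining steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# 1786 iterations per epoch x 10 epochs, as reported in this log.
TOTAL_STEPS = 1786 * 10

for epoch, it in [(1, 178), (1, 1780), (2, 1780), (10, 1780)]:
    step = (epoch - 1) * 1786 + it
    print(f"epoch {epoch} iter {it}: lr {linear_schedule_lr(step, TOTAL_STEPS):.6f}")
```

With these settings the warmup spans exactly the first epoch (1786 of 17860 steps), which is why the logged rate peaks at 0.000030 when epoch 1 ends and declines from epoch 2 onward.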
2023-10-19 19:30:38,317 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 19:30:38,318 - metric: "('micro avg', 'f1-score')"
2023-10-19 19:30:38,318 ----------------------------------------------------------------------------------------------------
2023-10-19 19:30:38,318 Computation:
2023-10-19 19:30:38,318 - compute on device: cuda:0
2023-10-19 19:30:38,318 - embedding storage: none
2023-10-19 19:30:38,318 ----------------------------------------------------------------------------------------------------
2023-10-19 19:30:38,318 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-19 19:30:38,318 ----------------------------------------------------------------------------------------------------
2023-10-19 19:30:38,318 ----------------------------------------------------------------------------------------------------
2023-10-19 19:30:38,318 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 19:30:42,345 epoch 1 - iter 178/1786 - loss 3.35557409 - time (sec): 4.03 - samples/sec: 6649.09 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:30:45,409 epoch 1 - iter 356/1786 - loss 3.04732355 - time (sec): 7.09 - samples/sec: 7258.29 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:30:48,452 epoch 1 - iter 534/1786 - loss 2.57347013 - time (sec): 10.13 - samples/sec: 7605.61 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:30:51,432 epoch 1 - iter 712/1786 - loss 2.19279281 - time (sec): 13.11 - samples/sec: 7628.30 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:30:54,374 epoch 1 - iter 890/1786 - loss 1.90940254 - time (sec): 16.06 - samples/sec: 7780.05 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:30:57,033 epoch 1 - iter 1068/1786 - loss 1.71174595 - time (sec): 18.72 - samples/sec: 7981.47 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:31:00,019 epoch 1 - iter 1246/1786 - loss 1.56052404 - time (sec): 21.70 - samples/sec: 8029.71 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:31:03,200 epoch 1 - iter 1424/1786 - loss 1.44384272 - time (sec): 24.88 - samples/sec: 8064.88 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:31:06,277 epoch 1 - iter 1602/1786 - loss 1.34797295 - time (sec): 27.96 - samples/sec: 8097.63 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:31:09,206 epoch 1 - iter 1780/1786 - loss 1.27247036 - time (sec): 30.89 - samples/sec: 8036.65 - lr: 0.000030 - momentum: 0.000000
2023-10-19 19:31:09,314 ----------------------------------------------------------------------------------------------------
2023-10-19 19:31:09,314 EPOCH 1 done: loss 1.2713 - lr: 0.000030
2023-10-19 19:31:10,267 DEV : loss 0.3359092175960541 - f1-score (micro avg) 0.0487
2023-10-19 19:31:10,280 saving best model
2023-10-19 19:31:10,314 ----------------------------------------------------------------------------------------------------
2023-10-19 19:31:13,387 epoch 2 - iter 178/1786 - loss 0.52150669 - time (sec): 3.07 - samples/sec: 7935.63 - lr: 0.000030 - momentum: 0.000000
2023-10-19 19:31:16,544 epoch 2 - iter 356/1786 - loss 0.47540073 - time (sec): 6.23 - samples/sec: 8020.47 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:31:19,413 epoch 2 - iter 534/1786 - loss 0.48292965 - time (sec): 9.10 - samples/sec: 8281.08 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:31:22,411 epoch 2 - iter 712/1786 - loss 0.47086550 - time (sec): 12.10 - samples/sec: 8348.10 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:31:25,366 epoch 2 - iter 890/1786 - loss 0.46122234 - time (sec): 15.05 - samples/sec: 8197.09 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:31:28,406 epoch 2 - iter 1068/1786 - loss 0.45031419 - time (sec): 18.09 - samples/sec: 8215.56 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:31:31,454 epoch 2 - iter 1246/1786 - loss 0.44704061 - time (sec): 21.14 - samples/sec: 8208.17 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:31:34,397 epoch 2 - iter 1424/1786 - loss 0.44759542 - time (sec): 24.08 - samples/sec: 8290.58 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:31:37,326 epoch 2 - iter 1602/1786 - loss 0.44190235 - time (sec): 27.01 - samples/sec: 8273.98 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:31:40,332 epoch 2 - iter 1780/1786 - loss 0.43665530 - time (sec): 30.02 - samples/sec: 8263.97 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:31:40,430 ----------------------------------------------------------------------------------------------------
2023-10-19 19:31:40,430 EPOCH 2 done: loss 0.4367 - lr: 0.000027
2023-10-19 19:31:43,156 DEV : loss 0.2586509883403778 - f1-score (micro avg) 0.3582
2023-10-19 19:31:43,169 saving best model
2023-10-19 19:31:43,203 ----------------------------------------------------------------------------------------------------
2023-10-19 19:31:46,369 epoch 3 - iter 178/1786 - loss 0.34074405 - time (sec): 3.17 - samples/sec: 7437.29 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:31:49,481 epoch 3 - iter 356/1786 - loss 0.35722086 - time (sec): 6.28 - samples/sec: 7581.36 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:31:52,534 epoch 3 - iter 534/1786 - loss 0.37202217 - time (sec): 9.33 - samples/sec: 7701.47 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:31:55,593 epoch 3 - iter 712/1786 - loss 0.36390494 - time (sec): 12.39 - samples/sec: 7782.00 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:31:58,617 epoch 3 - iter 890/1786 - loss 0.36309047 - time (sec): 15.41 - samples/sec: 7832.20 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:32:01,736 epoch 3 - iter 1068/1786 - loss 0.36131319 - time (sec): 18.53 - samples/sec: 7945.23 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:32:04,829 epoch 3 - iter 1246/1786 - loss 0.35798402 - time (sec): 21.63 - samples/sec: 7915.12 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:32:08,007 epoch 3 - iter 1424/1786 - loss 0.35852174 - time (sec): 24.80 - samples/sec: 7935.10 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:32:11,177 epoch 3 - iter 1602/1786 - loss 0.35173269 - time (sec): 27.97 - samples/sec: 7988.43 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:32:14,249 epoch 3 - iter 1780/1786 - loss 0.35039743 - time (sec): 31.05 - samples/sec: 7985.94 - lr: 0.000023 - momentum: 0.000000
2023-10-19 19:32:14,352 ----------------------------------------------------------------------------------------------------
2023-10-19 19:32:14,352 EPOCH 3 done: loss 0.3509 - lr: 0.000023
2023-10-19 19:32:16,681 DEV : loss 0.23497113585472107 - f1-score (micro avg) 0.4239
2023-10-19 19:32:16,695 saving best model
2023-10-19 19:32:16,729 ----------------------------------------------------------------------------------------------------
2023-10-19 19:32:19,764 epoch 4 - iter 178/1786 - loss 0.34808840 - time (sec): 3.03 - samples/sec: 7736.54 - lr: 0.000023 - momentum: 0.000000
2023-10-19 19:32:22,828 epoch 4 - iter 356/1786 - loss 0.33182945 - time (sec): 6.10 - samples/sec: 7914.57 - lr: 0.000023 - momentum: 0.000000
2023-10-19 19:32:25,874 epoch 4 - iter 534/1786 - loss 0.33144903 - time (sec): 9.14 - samples/sec: 8138.52 - lr: 0.000022 - momentum: 0.000000
2023-10-19 19:32:28,884 epoch 4 - iter 712/1786 - loss 0.32920436 - time (sec): 12.15 - samples/sec: 7962.87 - lr: 0.000022 - momentum: 0.000000
2023-10-19 19:32:31,833 epoch 4 - iter 890/1786 - loss 0.32373319 - time (sec): 15.10 - samples/sec: 8014.61 - lr: 0.000022 - momentum: 0.000000
2023-10-19 19:32:34,962 epoch 4 - iter 1068/1786 - loss 0.32588127 - time (sec): 18.23 - samples/sec: 8015.75 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:32:38,443 epoch 4 - iter 1246/1786 - loss 0.31956784 - time (sec): 21.71 - samples/sec: 7917.14 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:32:41,353 epoch 4 - iter 1424/1786 - loss 0.31923266 - time (sec): 24.62 - samples/sec: 8006.99 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:32:44,409 epoch 4 - iter 1602/1786 - loss 0.31502249 - time (sec): 27.68 - samples/sec: 8044.83 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:32:47,532 epoch 4 - iter 1780/1786 - loss 0.31282089 - time (sec): 30.80 - samples/sec: 8055.24 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:32:47,633 ----------------------------------------------------------------------------------------------------
2023-10-19 19:32:47,633 EPOCH 4 done: loss 0.3130 - lr: 0.000020
2023-10-19 19:32:50,010 DEV : loss 0.2149001508951187 - f1-score (micro avg) 0.4653
2023-10-19 19:32:50,023 saving best model
2023-10-19 19:32:50,057 ----------------------------------------------------------------------------------------------------
2023-10-19 19:32:53,143 epoch 5 - iter 178/1786 - loss 0.28225441 - time (sec): 3.08 - samples/sec: 8251.32 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:32:56,016 epoch 5 - iter 356/1786 - loss 0.29819281 - time (sec): 5.96 - samples/sec: 8472.79 - lr: 0.000019 - momentum: 0.000000
2023-10-19 19:32:59,080 epoch 5 - iter 534/1786 - loss 0.29855410 - time (sec): 9.02 - samples/sec: 8471.64 - lr: 0.000019 - momentum: 0.000000
2023-10-19 19:33:02,083 epoch 5 - iter 712/1786 - loss 0.30330707 - time (sec): 12.02 - samples/sec: 8205.57 - lr: 0.000019 - momentum: 0.000000
2023-10-19 19:33:05,219 epoch 5 - iter 890/1786 - loss 0.29989577 - time (sec): 15.16 - samples/sec: 8154.64 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:33:08,319 epoch 5 - iter 1068/1786 - loss 0.29542870 - time (sec): 18.26 - samples/sec: 8139.95 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:33:11,416 epoch 5 - iter 1246/1786 - loss 0.29344410 - time (sec): 21.36 - samples/sec: 8120.50 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:33:14,522 epoch 5 - iter 1424/1786 - loss 0.28918196 - time (sec): 24.46 - samples/sec: 8166.42 - lr: 0.000017 - momentum: 0.000000
2023-10-19 19:33:17,613 epoch 5 - iter 1602/1786 - loss 0.28833349 - time (sec): 27.55 - samples/sec: 8144.18 - lr: 0.000017 - momentum: 0.000000
2023-10-19 19:33:20,676 epoch 5 - iter 1780/1786 - loss 0.28461780 - time (sec): 30.62 - samples/sec: 8107.88 - lr: 0.000017 - momentum: 0.000000
2023-10-19 19:33:20,760 ----------------------------------------------------------------------------------------------------
2023-10-19 19:33:20,761 EPOCH 5 done: loss 0.2851 - lr: 0.000017
2023-10-19 19:33:23,133 DEV : loss 0.2108898162841797 - f1-score (micro avg) 0.4851
2023-10-19 19:33:23,146 saving best model
2023-10-19 19:33:23,180 ----------------------------------------------------------------------------------------------------
2023-10-19 19:33:26,371 epoch 6 - iter 178/1786 - loss 0.27194464 - time (sec): 3.19 - samples/sec: 7732.08 - lr: 0.000016 - momentum: 0.000000
2023-10-19 19:33:29,509 epoch 6 - iter 356/1786 - loss 0.25982245 - time (sec): 6.33 - samples/sec: 7990.52 - lr: 0.000016 - momentum: 0.000000
2023-10-19 19:33:32,616 epoch 6 - iter 534/1786 - loss 0.26314407 - time (sec): 9.44 - samples/sec: 8062.31 - lr: 0.000016 - momentum: 0.000000
2023-10-19 19:33:35,942 epoch 6 - iter 712/1786 - loss 0.26630751 - time (sec): 12.76 - samples/sec: 7878.24 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:33:38,609 epoch 6 - iter 890/1786 - loss 0.26812327 - time (sec): 15.43 - samples/sec: 8199.69 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:33:41,286 epoch 6 - iter 1068/1786 - loss 0.26573731 - time (sec): 18.11 - samples/sec: 8330.67 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:33:44,460 epoch 6 - iter 1246/1786 - loss 0.26870941 - time (sec): 21.28 - samples/sec: 8187.53 - lr: 0.000014 - momentum: 0.000000
2023-10-19 19:33:47,666 epoch 6 - iter 1424/1786 - loss 0.26843570 - time (sec): 24.49 - samples/sec: 8116.00 - lr: 0.000014 - momentum: 0.000000
2023-10-19 19:33:50,752 epoch 6 - iter 1602/1786 - loss 0.26563717 - time (sec): 27.57 - samples/sec: 8095.59 - lr: 0.000014 - momentum: 0.000000
2023-10-19 19:33:53,883 epoch 6 - iter 1780/1786 - loss 0.26643678 - time (sec): 30.70 - samples/sec: 8085.37 - lr: 0.000013 - momentum: 0.000000
2023-10-19 19:33:53,976 ----------------------------------------------------------------------------------------------------
2023-10-19 19:33:53,977 EPOCH 6 done: loss 0.2667 - lr: 0.000013
2023-10-19 19:33:56,338 DEV : loss 0.20400717854499817 - f1-score (micro avg) 0.5092
2023-10-19 19:33:56,353 saving best model
2023-10-19 19:33:56,384 ----------------------------------------------------------------------------------------------------
2023-10-19 19:33:58,908 epoch 7 - iter 178/1786 - loss 0.25727723 - time (sec): 2.52 - samples/sec: 9164.43 - lr: 0.000013 - momentum: 0.000000
2023-10-19 19:34:01,967 epoch 7 - iter 356/1786 - loss 0.26036317 - time (sec): 5.58 - samples/sec: 8619.91 - lr: 0.000013 - momentum: 0.000000
2023-10-19 19:34:05,035 epoch 7 - iter 534/1786 - loss 0.24886818 - time (sec): 8.65 - samples/sec: 8487.13 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:34:08,134 epoch 7 - iter 712/1786 - loss 0.25428910 - time (sec): 11.75 - samples/sec: 8325.67 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:34:11,331 epoch 7 - iter 890/1786 - loss 0.25487606 - time (sec): 14.95 - samples/sec: 8237.75 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:34:14,669 epoch 7 - iter 1068/1786 - loss 0.25037409 - time (sec): 18.28 - samples/sec: 8144.31 - lr: 0.000011 - momentum: 0.000000
2023-10-19 19:34:17,733 epoch 7 - iter 1246/1786 - loss 0.25246122 - time (sec): 21.35 - samples/sec: 8148.88 - lr: 0.000011 - momentum: 0.000000
2023-10-19 19:34:20,831 epoch 7 - iter 1424/1786 - loss 0.25394057 - time (sec): 24.45 - samples/sec: 8096.66 - lr: 0.000011 - momentum: 0.000000
2023-10-19 19:34:23,879 epoch 7 - iter 1602/1786 - loss 0.25424882 - time (sec): 27.49 - samples/sec: 8090.76 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:34:26,967 epoch 7 - iter 1780/1786 - loss 0.25556992 - time (sec): 30.58 - samples/sec: 8114.01 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:34:27,071 ----------------------------------------------------------------------------------------------------
2023-10-19 19:34:27,071 EPOCH 7 done: loss 0.2550 - lr: 0.000010
2023-10-19 19:34:29,411 DEV : loss 0.19975362718105316 - f1-score (micro avg) 0.5137
2023-10-19 19:34:29,425 saving best model
2023-10-19 19:34:29,460 ----------------------------------------------------------------------------------------------------
2023-10-19 19:34:32,551 epoch 8 - iter 178/1786 - loss 0.25894893 - time (sec): 3.09 - samples/sec: 8193.08 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:34:35,544 epoch 8 - iter 356/1786 - loss 0.25720911 - time (sec): 6.08 - samples/sec: 8254.42 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:34:38,679 epoch 8 - iter 534/1786 - loss 0.25536811 - time (sec): 9.22 - samples/sec: 7975.80 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:34:42,147 epoch 8 - iter 712/1786 - loss 0.24758044 - time (sec): 12.69 - samples/sec: 7802.40 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:34:45,184 epoch 8 - iter 890/1786 - loss 0.24736965 - time (sec): 15.72 - samples/sec: 7829.00 - lr: 0.000008 - momentum: 0.000000
2023-10-19 19:34:48,231 epoch 8 - iter 1068/1786 - loss 0.24333603 - time (sec): 18.77 - samples/sec: 8003.01 - lr: 0.000008 - momentum: 0.000000
2023-10-19 19:34:51,246 epoch 8 - iter 1246/1786 - loss 0.24489911 - time (sec): 21.78 - samples/sec: 7953.46 - lr: 0.000008 - momentum: 0.000000
2023-10-19 19:34:54,308 epoch 8 - iter 1424/1786 - loss 0.24241203 - time (sec): 24.85 - samples/sec: 7988.83 - lr: 0.000007 - momentum: 0.000000
2023-10-19 19:34:57,453 epoch 8 - iter 1602/1786 - loss 0.24403072 - time (sec): 27.99 - samples/sec: 8015.58 - lr: 0.000007 - momentum: 0.000000
2023-10-19 19:35:00,570 epoch 8 - iter 1780/1786 - loss 0.24583775 - time (sec): 31.11 - samples/sec: 7963.02 - lr: 0.000007 - momentum: 0.000000
2023-10-19 19:35:00,674 ----------------------------------------------------------------------------------------------------
2023-10-19 19:35:00,674 EPOCH 8 done: loss 0.2452 - lr: 0.000007
2023-10-19 19:35:03,077 DEV : loss 0.19914978742599487 - f1-score (micro avg) 0.5189
2023-10-19 19:35:03,090 saving best model
2023-10-19 19:35:03,123 ----------------------------------------------------------------------------------------------------
2023-10-19 19:35:06,163 epoch 9 - iter 178/1786 - loss 0.23821530 - time (sec): 3.04 - samples/sec: 8011.02 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:35:09,155 epoch 9 - iter 356/1786 - loss 0.25309649 - time (sec): 6.03 - samples/sec: 8089.52 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:35:12,272 epoch 9 - iter 534/1786 - loss 0.24190772 - time (sec): 9.15 - samples/sec: 7958.63 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:35:15,346 epoch 9 - iter 712/1786 - loss 0.24222367 - time (sec): 12.22 - samples/sec: 8057.06 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:35:18,365 epoch 9 - iter 890/1786 - loss 0.24155227 - time (sec): 15.24 - samples/sec: 8034.84 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:35:21,347 epoch 9 - iter 1068/1786 - loss 0.24208353 - time (sec): 18.22 - samples/sec: 8156.84 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:35:24,351 epoch 9 - iter 1246/1786 - loss 0.24108614 - time (sec): 21.23 - samples/sec: 8109.56 - lr: 0.000004 - momentum: 0.000000
2023-10-19 19:35:27,403 epoch 9 - iter 1424/1786 - loss 0.24019162 - time (sec): 24.28 - samples/sec: 8156.05 - lr: 0.000004 - momentum: 0.000000
2023-10-19 19:35:30,466 epoch 9 - iter 1602/1786 - loss 0.24303458 - time (sec): 27.34 - samples/sec: 8144.81 - lr: 0.000004 - momentum: 0.000000
2023-10-19 19:35:33,616 epoch 9 - iter 1780/1786 - loss 0.23944010 - time (sec): 30.49 - samples/sec: 8129.33 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:35:33,716 ----------------------------------------------------------------------------------------------------
2023-10-19 19:35:33,716 EPOCH 9 done: loss 0.2392 - lr: 0.000003
2023-10-19 19:35:36,496 DEV : loss 0.19566838443279266 - f1-score (micro avg) 0.5251
2023-10-19 19:35:36,509 saving best model
2023-10-19 19:35:36,543 ----------------------------------------------------------------------------------------------------
2023-10-19 19:35:39,683 epoch 10 - iter 178/1786 - loss 0.25183541 - time (sec): 3.14 - samples/sec: 7949.52 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:35:42,858 epoch 10 - iter 356/1786 - loss 0.24654755 - time (sec): 6.31 - samples/sec: 7969.63 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:35:45,912 epoch 10 - iter 534/1786 - loss 0.24620089 - time (sec): 9.37 - samples/sec: 8111.22 - lr: 0.000002 - momentum: 0.000000
2023-10-19 19:35:48,917 epoch 10 - iter 712/1786 - loss 0.23937163 - time (sec): 12.37 - samples/sec: 8029.79 - lr: 0.000002 - momentum: 0.000000
2023-10-19 19:35:52,075 epoch 10 - iter 890/1786 - loss 0.24211773 - time (sec): 15.53 - samples/sec: 7931.61 - lr: 0.000002 - momentum: 0.000000
2023-10-19 19:35:55,143 epoch 10 - iter 1068/1786 - loss 0.24202751 - time (sec): 18.60 - samples/sec: 7953.73 - lr: 0.000001 - momentum: 0.000000
2023-10-19 19:35:58,209 epoch 10 - iter 1246/1786 - loss 0.24174603 - time (sec): 21.67 - samples/sec: 7979.86 - lr: 0.000001 - momentum: 0.000000
2023-10-19 19:36:01,244 epoch 10 - iter 1424/1786 - loss 0.23782131 - time (sec): 24.70 - samples/sec: 8023.65 - lr: 0.000001 - momentum: 0.000000
2023-10-19 19:36:04,337 epoch 10 - iter 1602/1786 - loss 0.23640511 - time (sec): 27.79 - samples/sec: 7996.89 - lr: 0.000000 - momentum: 0.000000
2023-10-19 19:36:07,434 epoch 10 - iter 1780/1786 - loss 0.23538347 - time (sec): 30.89 - samples/sec: 8035.23 - lr: 0.000000 - momentum: 0.000000
2023-10-19 19:36:07,525 ----------------------------------------------------------------------------------------------------
2023-10-19 19:36:07,525 EPOCH 10 done: loss 0.2355 - lr: 0.000000
2023-10-19 19:36:09,907 DEV : loss 0.19619056582450867 - f1-score (micro avg) 0.5165
2023-10-19 19:36:09,947 ----------------------------------------------------------------------------------------------------
2023-10-19 19:36:09,947 Loading model from best epoch ...
2023-10-19 19:36:10,025 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 19:36:14,436
Results:
- F-score (micro) 0.4104
- F-score (macro) 0.2422
- Accuracy 0.2681
By class:
              precision    recall  f1-score   support

         LOC     0.4062    0.5297    0.4598      1095
         PER     0.4160    0.4822    0.4467      1012
         ORG     0.0895    0.0476    0.0622       357
   HumanProd     0.0000    0.0000    0.0000        33

   micro avg     0.3887    0.4345    0.4104      2497
   macro avg     0.2279    0.2649    0.2422      2497
weighted avg     0.3595    0.4345    0.3915      2497
2023-10-19 19:36:14,436 ----------------------------------------------------------------------------------------------------
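Note on the averaged scores: the micro, macro and weighted F-scores in the final report can be cross-checked from the per-class rows. The sketch below is an illustrative recomputation from the rounded values printed in this log, so results can differ from the reported ones in the fourth decimal place. The variable and function names are hypothetical and not part of Flair.

```python
# Per-class (precision, recall, f1, support), copied from the test report.
per_class = {
    "LOC": (0.4062, 0.5297, 0.4598, 1095),
    "PER": (0.4160, 0.4822, 0.4467, 1012),
    "ORG": (0.0895, 0.0476, 0.0622, 357),
    "HumanProd": (0.0000, 0.0000, 0.0000, 33),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(s for _, _, _, s in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by support.
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total_support

# Micro F1: computed from the pooled precision and recall in the report.
micro_f1 = f1(0.3887, 0.4345)

print(f"support={total_support} micro={micro_f1:.4f} "
      f"macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

The macro average is pulled down by ORG and HumanProd, which together account for only 390 of the 2497 gold entities; this explains the gap between the micro score (0.4104) and the macro score (0.2422).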