2023-10-19 19:56:56,385 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,386 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 19:56:56,386 ----------------------------------------------------------------------------------------------------
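Note: the backbone in the printout above is the dbmdz bert-tiny historic multilingual model (2 transformer layers, hidden size 128, 32001-token vocabulary) wrapped in Flair's TransformerWordEmbeddings. A minimal sketch of how such an embedding could be instantiated is given below; the layers and subtoken_pooling values are assumptions read off the base-path name further down ("poolingfirst", "layers-1"), not taken from the actual training script.

    # Hedged sketch: instantiate the transformer embeddings shown in the printout above.
    from flair.embeddings import TransformerWordEmbeddings

    embeddings = TransformerWordEmbeddings(
        model="dbmdz/bert-tiny-historic-multilingual-cased",  # 2 layers, hidden size 128
        layers="-1",               # assumption: "layers-1" in the base path = last layer only
        subtoken_pooling="first",  # assumption: "poolingfirst" in the base path
        fine_tune=True,            # transformer weights are updated during training
    )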
2023-10-19 19:56:56,386 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-19 19:56:56,386 ----------------------------------------------------------------------------------------------------
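Note: the corpus above is the French newseye subset of HIPE-2022 as packaged in Flair. A loading sketch is shown below, assuming the NER_HIPE_2022 dataset class with dataset_name/language arguments; the exact constructor options used for this run are not recorded in the log.

    # Hedged sketch: load the HIPE-2022 newseye/fr corpus and build the label dictionary.
    from flair.datasets import NER_HIPE_2022

    corpus = NER_HIPE_2022(dataset_name="newseye", language="fr")  # assumption: v2.1 is the default version
    print(corpus)  # expected: 7142 train + 698 dev + 2570 test sentences

    label_dict = corpus.make_label_dictionary(label_type="ner")  # 17 tags incl. PER/LOC/ORG/HumanProd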
2023-10-19 19:56:56,386 Train: 7142 sentences
2023-10-19 19:56:56,386 (train_with_dev=False, train_with_test=False)
2023-10-19 19:56:56,386 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,386 Training Params:
2023-10-19 19:56:56,386 - learning_rate: "5e-05"
2023-10-19 19:56:56,386 - mini_batch_size: "4"
2023-10-19 19:56:56,386 - max_epochs: "10"
2023-10-19 19:56:56,386 - shuffle: "True"
2023-10-19 19:56:56,386 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,386 Plugins:
2023-10-19 19:56:56,386 - TensorboardLogger
2023-10-19 19:56:56,386 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 19:56:56,386 ----------------------------------------------------------------------------------------------------
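Note: the lr column in the iteration lines below reflects this plugin: the learning rate ramps linearly from 0 to 5e-05 over the first 10% of the 17860 total steps (1786 iterations x 10 epochs, i.e. roughly epoch 1) and then decays linearly back to 0. A small sketch of that piecewise-linear schedule follows, assuming this is how the warmup fraction is applied; it is not the plugin's actual code.

    # Hedged sketch of the schedule implied by warmup_fraction 0.1.
    def linear_schedule_lr(step, total_steps=1786 * 10, peak_lr=5e-05, warmup_fraction=0.1):
        warmup_steps = int(total_steps * warmup_fraction)  # 1786 steps, roughly one epoch here
        if step < warmup_steps:
            return peak_lr * step / warmup_steps           # warmup: 0 -> 5e-05 during epoch 1
        return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # linear decay to 0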
2023-10-19 19:56:56,386 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 19:56:56,386 - metric: "('micro avg', 'f1-score')"
2023-10-19 19:56:56,387 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,387 Computation:
2023-10-19 19:56:56,387 - compute on device: cuda:0
2023-10-19 19:56:56,387 - embedding storage: none
2023-10-19 19:56:56,387 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,387 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-19 19:56:56,387 ----------------------------------------------------------------------------------------------------
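Note: putting the pieces together, a run with the parameters above could be launched roughly as follows, continuing from the corpus and embeddings sketches earlier. The hidden_size value is an assumption (it is unused with use_rnn=False), and recent Flair versions typically apply a linear warmup/decay schedule inside fine_tune, so no explicit scheduler is wired up here.

    # Hedged sketch of the training call matching the logged parameters (not the original script).
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    tagger = SequenceTagger(
        hidden_size=256,            # assumption; unused with use_rnn=False (linear head 128 -> 17)
        embeddings=embeddings,      # from the embeddings sketch above
        tag_dictionary=label_dict,  # from the corpus sketch above
        tag_type="ner",
        use_crf=False,              # "crfFalse" in the base path
        use_rnn=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2",
        learning_rate=5e-05,
        mini_batch_size=4,
        max_epochs=10,              # shuffle=True is the default
    )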
2023-10-19 19:56:56,387 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,387 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 19:56:59,484 epoch 1 - iter 178/1786 - loss 2.72583835 - time (sec): 3.10 - samples/sec: 8628.89 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:57:02,567 epoch 1 - iter 356/1786 - loss 2.28399604 - time (sec): 6.18 - samples/sec: 8233.47 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:57:05,501 epoch 1 - iter 534/1786 - loss 1.84616762 - time (sec): 9.11 - samples/sec: 8156.39 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:57:08,543 epoch 1 - iter 712/1786 - loss 1.55839783 - time (sec): 12.16 - samples/sec: 8168.51 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:57:11,561 epoch 1 - iter 890/1786 - loss 1.38530571 - time (sec): 15.17 - samples/sec: 8105.25 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:57:14,566 epoch 1 - iter 1068/1786 - loss 1.26744376 - time (sec): 18.18 - samples/sec: 8093.41 - lr: 0.000030 - momentum: 0.000000
2023-10-19 19:57:17,760 epoch 1 - iter 1246/1786 - loss 1.15679850 - time (sec): 21.37 - samples/sec: 8089.28 - lr: 0.000035 - momentum: 0.000000
2023-10-19 19:57:20,780 epoch 1 - iter 1424/1786 - loss 1.07406304 - time (sec): 24.39 - samples/sec: 8059.87 - lr: 0.000040 - momentum: 0.000000
2023-10-19 19:57:23,840 epoch 1 - iter 1602/1786 - loss 1.00605621 - time (sec): 27.45 - samples/sec: 8084.14 - lr: 0.000045 - momentum: 0.000000
2023-10-19 19:57:26,889 epoch 1 - iter 1780/1786 - loss 0.94992355 - time (sec): 30.50 - samples/sec: 8133.35 - lr: 0.000050 - momentum: 0.000000
2023-10-19 19:57:26,989 ----------------------------------------------------------------------------------------------------
2023-10-19 19:57:26,989 EPOCH 1 done: loss 0.9486 - lr: 0.000050
2023-10-19 19:57:28,453 DEV : loss 0.29305753111839294 - f1-score (micro avg) 0.249
2023-10-19 19:57:28,468 saving best model
2023-10-19 19:57:28,503 ----------------------------------------------------------------------------------------------------
2023-10-19 19:57:31,169 epoch 2 - iter 178/1786 - loss 0.44594390 - time (sec): 2.67 - samples/sec: 8876.07 - lr: 0.000049 - momentum: 0.000000
2023-10-19 19:57:34,235 epoch 2 - iter 356/1786 - loss 0.41159547 - time (sec): 5.73 - samples/sec: 8474.76 - lr: 0.000049 - momentum: 0.000000
2023-10-19 19:57:37,249 epoch 2 - iter 534/1786 - loss 0.41144726 - time (sec): 8.75 - samples/sec: 8158.94 - lr: 0.000048 - momentum: 0.000000
2023-10-19 19:57:40,308 epoch 2 - iter 712/1786 - loss 0.39931129 - time (sec): 11.80 - samples/sec: 8135.11 - lr: 0.000048 - momentum: 0.000000
2023-10-19 19:57:43,366 epoch 2 - iter 890/1786 - loss 0.40220612 - time (sec): 14.86 - samples/sec: 8169.71 - lr: 0.000047 - momentum: 0.000000
2023-10-19 19:57:46,440 epoch 2 - iter 1068/1786 - loss 0.39783851 - time (sec): 17.94 - samples/sec: 8200.55 - lr: 0.000047 - momentum: 0.000000
2023-10-19 19:57:49,549 epoch 2 - iter 1246/1786 - loss 0.39066605 - time (sec): 21.05 - samples/sec: 8278.83 - lr: 0.000046 - momentum: 0.000000
2023-10-19 19:57:52,827 epoch 2 - iter 1424/1786 - loss 0.38832131 - time (sec): 24.32 - samples/sec: 8199.66 - lr: 0.000046 - momentum: 0.000000
2023-10-19 19:57:56,026 epoch 2 - iter 1602/1786 - loss 0.38666982 - time (sec): 27.52 - samples/sec: 8141.75 - lr: 0.000045 - momentum: 0.000000
2023-10-19 19:57:59,146 epoch 2 - iter 1780/1786 - loss 0.38480581 - time (sec): 30.64 - samples/sec: 8100.47 - lr: 0.000044 - momentum: 0.000000
2023-10-19 19:57:59,235 ----------------------------------------------------------------------------------------------------
2023-10-19 19:57:59,236 EPOCH 2 done: loss 0.3847 - lr: 0.000044
2023-10-19 19:58:01,589 DEV : loss 0.23073670268058777 - f1-score (micro avg) 0.4494
2023-10-19 19:58:01,602 saving best model
2023-10-19 19:58:01,635 ----------------------------------------------------------------------------------------------------
2023-10-19 19:58:04,884 epoch 3 - iter 178/1786 - loss 0.31405377 - time (sec): 3.25 - samples/sec: 7326.64 - lr: 0.000044 - momentum: 0.000000
2023-10-19 19:58:08,542 epoch 3 - iter 356/1786 - loss 0.29918815 - time (sec): 6.91 - samples/sec: 7336.99 - lr: 0.000043 - momentum: 0.000000
2023-10-19 19:58:11,630 epoch 3 - iter 534/1786 - loss 0.29653379 - time (sec): 9.99 - samples/sec: 7560.52 - lr: 0.000043 - momentum: 0.000000
2023-10-19 19:58:14,695 epoch 3 - iter 712/1786 - loss 0.30267607 - time (sec): 13.06 - samples/sec: 7649.08 - lr: 0.000042 - momentum: 0.000000
2023-10-19 19:58:17,742 epoch 3 - iter 890/1786 - loss 0.30690130 - time (sec): 16.11 - samples/sec: 7818.28 - lr: 0.000042 - momentum: 0.000000
2023-10-19 19:58:20,754 epoch 3 - iter 1068/1786 - loss 0.30838373 - time (sec): 19.12 - samples/sec: 7925.28 - lr: 0.000041 - momentum: 0.000000
2023-10-19 19:58:23,746 epoch 3 - iter 1246/1786 - loss 0.30686090 - time (sec): 22.11 - samples/sec: 7898.97 - lr: 0.000041 - momentum: 0.000000
2023-10-19 19:58:26,798 epoch 3 - iter 1424/1786 - loss 0.30975257 - time (sec): 25.16 - samples/sec: 7919.77 - lr: 0.000040 - momentum: 0.000000
2023-10-19 19:58:29,717 epoch 3 - iter 1602/1786 - loss 0.31078462 - time (sec): 28.08 - samples/sec: 7993.49 - lr: 0.000039 - momentum: 0.000000
2023-10-19 19:58:32,673 epoch 3 - iter 1780/1786 - loss 0.30834283 - time (sec): 31.04 - samples/sec: 7989.82 - lr: 0.000039 - momentum: 0.000000
2023-10-19 19:58:32,767 ----------------------------------------------------------------------------------------------------
2023-10-19 19:58:32,767 EPOCH 3 done: loss 0.3085 - lr: 0.000039
2023-10-19 19:58:35,142 DEV : loss 0.20818448066711426 - f1-score (micro avg) 0.4907
2023-10-19 19:58:35,155 saving best model
2023-10-19 19:58:35,189 ----------------------------------------------------------------------------------------------------
2023-10-19 19:58:38,191 epoch 4 - iter 178/1786 - loss 0.27724988 - time (sec): 3.00 - samples/sec: 8636.87 - lr: 0.000038 - momentum: 0.000000
2023-10-19 19:58:41,275 epoch 4 - iter 356/1786 - loss 0.28611619 - time (sec): 6.09 - samples/sec: 8168.75 - lr: 0.000038 - momentum: 0.000000
2023-10-19 19:58:44,340 epoch 4 - iter 534/1786 - loss 0.29478782 - time (sec): 9.15 - samples/sec: 8075.94 - lr: 0.000037 - momentum: 0.000000
2023-10-19 19:58:47,333 epoch 4 - iter 712/1786 - loss 0.28033530 - time (sec): 12.14 - samples/sec: 8200.91 - lr: 0.000037 - momentum: 0.000000
2023-10-19 19:58:50,386 epoch 4 - iter 890/1786 - loss 0.27617105 - time (sec): 15.20 - samples/sec: 8187.17 - lr: 0.000036 - momentum: 0.000000
2023-10-19 19:58:53,226 epoch 4 - iter 1068/1786 - loss 0.27197505 - time (sec): 18.04 - samples/sec: 8226.99 - lr: 0.000036 - momentum: 0.000000
2023-10-19 19:58:56,053 epoch 4 - iter 1246/1786 - loss 0.27230716 - time (sec): 20.86 - samples/sec: 8251.50 - lr: 0.000035 - momentum: 0.000000
2023-10-19 19:58:59,031 epoch 4 - iter 1424/1786 - loss 0.27153860 - time (sec): 23.84 - samples/sec: 8246.27 - lr: 0.000034 - momentum: 0.000000
2023-10-19 19:59:02,112 epoch 4 - iter 1602/1786 - loss 0.27197949 - time (sec): 26.92 - samples/sec: 8260.76 - lr: 0.000034 - momentum: 0.000000
2023-10-19 19:59:05,192 epoch 4 - iter 1780/1786 - loss 0.26875111 - time (sec): 30.00 - samples/sec: 8272.93 - lr: 0.000033 - momentum: 0.000000
2023-10-19 19:59:05,285 ----------------------------------------------------------------------------------------------------
2023-10-19 19:59:05,285 EPOCH 4 done: loss 0.2689 - lr: 0.000033
2023-10-19 19:59:08,121 DEV : loss 0.191745325922966 - f1-score (micro avg) 0.499
2023-10-19 19:59:08,135 saving best model
2023-10-19 19:59:08,168 ----------------------------------------------------------------------------------------------------
2023-10-19 19:59:11,255 epoch 5 - iter 178/1786 - loss 0.26076343 - time (sec): 3.09 - samples/sec: 8103.56 - lr: 0.000033 - momentum: 0.000000
2023-10-19 19:59:14,449 epoch 5 - iter 356/1786 - loss 0.25312425 - time (sec): 6.28 - samples/sec: 8157.09 - lr: 0.000032 - momentum: 0.000000
2023-10-19 19:59:17,516 epoch 5 - iter 534/1786 - loss 0.24657920 - time (sec): 9.35 - samples/sec: 8112.84 - lr: 0.000032 - momentum: 0.000000
2023-10-19 19:59:20,617 epoch 5 - iter 712/1786 - loss 0.24929645 - time (sec): 12.45 - samples/sec: 7994.69 - lr: 0.000031 - momentum: 0.000000
2023-10-19 19:59:23,609 epoch 5 - iter 890/1786 - loss 0.24850076 - time (sec): 15.44 - samples/sec: 7921.48 - lr: 0.000031 - momentum: 0.000000
2023-10-19 19:59:26,734 epoch 5 - iter 1068/1786 - loss 0.24097893 - time (sec): 18.56 - samples/sec: 7956.00 - lr: 0.000030 - momentum: 0.000000
2023-10-19 19:59:29,745 epoch 5 - iter 1246/1786 - loss 0.24335773 - time (sec): 21.58 - samples/sec: 7925.61 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:59:32,778 epoch 5 - iter 1424/1786 - loss 0.24082207 - time (sec): 24.61 - samples/sec: 7979.59 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:59:35,918 epoch 5 - iter 1602/1786 - loss 0.24015223 - time (sec): 27.75 - samples/sec: 8023.83 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:59:38,964 epoch 5 - iter 1780/1786 - loss 0.24046767 - time (sec): 30.80 - samples/sec: 8050.69 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:59:39,044 ----------------------------------------------------------------------------------------------------
2023-10-19 19:59:39,044 EPOCH 5 done: loss 0.2403 - lr: 0.000028
2023-10-19 19:59:41,414 DEV : loss 0.1918220967054367 - f1-score (micro avg) 0.5135
2023-10-19 19:59:41,429 saving best model
2023-10-19 19:59:41,465 ----------------------------------------------------------------------------------------------------
2023-10-19 19:59:44,427 epoch 6 - iter 178/1786 - loss 0.21180179 - time (sec): 2.96 - samples/sec: 8480.65 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:59:47,394 epoch 6 - iter 356/1786 - loss 0.21702266 - time (sec): 5.93 - samples/sec: 8183.50 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:59:50,443 epoch 6 - iter 534/1786 - loss 0.22223314 - time (sec): 8.98 - samples/sec: 8050.91 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:59:53,514 epoch 6 - iter 712/1786 - loss 0.21924966 - time (sec): 12.05 - samples/sec: 8177.54 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:59:56,555 epoch 6 - iter 890/1786 - loss 0.21767231 - time (sec): 15.09 - samples/sec: 8284.64 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:59:59,654 epoch 6 - iter 1068/1786 - loss 0.21749357 - time (sec): 18.19 - samples/sec: 8202.54 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:00:02,945 epoch 6 - iter 1246/1786 - loss 0.21862146 - time (sec): 21.48 - samples/sec: 8051.91 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:00:06,069 epoch 6 - iter 1424/1786 - loss 0.22007018 - time (sec): 24.60 - samples/sec: 8027.26 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:00:09,135 epoch 6 - iter 1602/1786 - loss 0.21938150 - time (sec): 27.67 - samples/sec: 8073.43 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:00:12,296 epoch 6 - iter 1780/1786 - loss 0.21988434 - time (sec): 30.83 - samples/sec: 8046.76 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:00:12,397 ----------------------------------------------------------------------------------------------------
2023-10-19 20:00:12,397 EPOCH 6 done: loss 0.2200 - lr: 0.000022
2023-10-19 20:00:15,251 DEV : loss 0.18987227976322174 - f1-score (micro avg) 0.5335
2023-10-19 20:00:15,264 saving best model
2023-10-19 20:00:15,297 ----------------------------------------------------------------------------------------------------
2023-10-19 20:00:18,364 epoch 7 - iter 178/1786 - loss 0.19179214 - time (sec): 3.07 - samples/sec: 8611.48 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:00:21,320 epoch 7 - iter 356/1786 - loss 0.20316963 - time (sec): 6.02 - samples/sec: 8591.59 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:00:24,042 epoch 7 - iter 534/1786 - loss 0.20106304 - time (sec): 8.74 - samples/sec: 8595.51 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:00:26,699 epoch 7 - iter 712/1786 - loss 0.20457425 - time (sec): 11.40 - samples/sec: 8635.39 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:00:29,586 epoch 7 - iter 890/1786 - loss 0.20559868 - time (sec): 14.29 - samples/sec: 8632.86 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:00:32,638 epoch 7 - iter 1068/1786 - loss 0.20441274 - time (sec): 17.34 - samples/sec: 8538.94 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:00:35,776 epoch 7 - iter 1246/1786 - loss 0.20404307 - time (sec): 20.48 - samples/sec: 8477.96 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:00:38,866 epoch 7 - iter 1424/1786 - loss 0.20331972 - time (sec): 23.57 - samples/sec: 8506.69 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:00:41,876 epoch 7 - iter 1602/1786 - loss 0.20559511 - time (sec): 26.58 - samples/sec: 8436.74 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:00:44,948 epoch 7 - iter 1780/1786 - loss 0.20567843 - time (sec): 29.65 - samples/sec: 8375.37 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:00:45,045 ----------------------------------------------------------------------------------------------------
2023-10-19 20:00:45,045 EPOCH 7 done: loss 0.2058 - lr: 0.000017
2023-10-19 20:00:47,397 DEV : loss 0.19306714832782745 - f1-score (micro avg) 0.5548
2023-10-19 20:00:47,412 saving best model
2023-10-19 20:00:47,447 ----------------------------------------------------------------------------------------------------
2023-10-19 20:00:50,568 epoch 8 - iter 178/1786 - loss 0.19689046 - time (sec): 3.12 - samples/sec: 8040.04 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:00:53,580 epoch 8 - iter 356/1786 - loss 0.18936570 - time (sec): 6.13 - samples/sec: 8150.05 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:00:56,635 epoch 8 - iter 534/1786 - loss 0.19552394 - time (sec): 9.19 - samples/sec: 8034.75 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:00:59,723 epoch 8 - iter 712/1786 - loss 0.19544811 - time (sec): 12.28 - samples/sec: 8039.73 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:01:02,832 epoch 8 - iter 890/1786 - loss 0.19406914 - time (sec): 15.38 - samples/sec: 8042.01 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:01:05,960 epoch 8 - iter 1068/1786 - loss 0.19371324 - time (sec): 18.51 - samples/sec: 8062.53 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:01:09,034 epoch 8 - iter 1246/1786 - loss 0.19346585 - time (sec): 21.59 - samples/sec: 8060.43 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:01:12,085 epoch 8 - iter 1424/1786 - loss 0.19281594 - time (sec): 24.64 - samples/sec: 8080.93 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:01:14,967 epoch 8 - iter 1602/1786 - loss 0.19551404 - time (sec): 27.52 - samples/sec: 8117.22 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:01:18,146 epoch 8 - iter 1780/1786 - loss 0.19565402 - time (sec): 30.70 - samples/sec: 8077.28 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:01:18,246 ----------------------------------------------------------------------------------------------------
2023-10-19 20:01:18,246 EPOCH 8 done: loss 0.1955 - lr: 0.000011
2023-10-19 20:01:21,157 DEV : loss 0.1908378303050995 - f1-score (micro avg) 0.5511
2023-10-19 20:01:21,172 ----------------------------------------------------------------------------------------------------
2023-10-19 20:01:24,362 epoch 9 - iter 178/1786 - loss 0.18088382 - time (sec): 3.19 - samples/sec: 8303.42 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:01:27,480 epoch 9 - iter 356/1786 - loss 0.17917792 - time (sec): 6.31 - samples/sec: 8210.43 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:01:30,457 epoch 9 - iter 534/1786 - loss 0.17920050 - time (sec): 9.28 - samples/sec: 8204.99 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:01:33,384 epoch 9 - iter 712/1786 - loss 0.18318395 - time (sec): 12.21 - samples/sec: 8147.49 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:01:36,370 epoch 9 - iter 890/1786 - loss 0.18476519 - time (sec): 15.20 - samples/sec: 8104.66 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:01:39,482 epoch 9 - iter 1068/1786 - loss 0.18662129 - time (sec): 18.31 - samples/sec: 8129.74 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:01:42,674 epoch 9 - iter 1246/1786 - loss 0.18931705 - time (sec): 21.50 - samples/sec: 8094.90 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:01:45,671 epoch 9 - iter 1424/1786 - loss 0.18853491 - time (sec): 24.50 - samples/sec: 8096.88 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:01:48,900 epoch 9 - iter 1602/1786 - loss 0.18944300 - time (sec): 27.73 - samples/sec: 8070.14 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:01:51,967 epoch 9 - iter 1780/1786 - loss 0.18742971 - time (sec): 30.79 - samples/sec: 8054.92 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:01:52,069 ----------------------------------------------------------------------------------------------------
2023-10-19 20:01:52,069 EPOCH 9 done: loss 0.1871 - lr: 0.000006
2023-10-19 20:01:54,434 DEV : loss 0.1917891502380371 - f1-score (micro avg) 0.5576
2023-10-19 20:01:54,448 saving best model
2023-10-19 20:01:54,482 ----------------------------------------------------------------------------------------------------
2023-10-19 20:01:57,521 epoch 10 - iter 178/1786 - loss 0.19254167 - time (sec): 3.04 - samples/sec: 7641.91 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:02:00,659 epoch 10 - iter 356/1786 - loss 0.19593691 - time (sec): 6.18 - samples/sec: 7746.14 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:02:03,654 epoch 10 - iter 534/1786 - loss 0.19352313 - time (sec): 9.17 - samples/sec: 7904.24 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:02:06,740 epoch 10 - iter 712/1786 - loss 0.19368257 - time (sec): 12.26 - samples/sec: 7965.43 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:02:09,806 epoch 10 - iter 890/1786 - loss 0.18973332 - time (sec): 15.32 - samples/sec: 7925.22 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:02:12,863 epoch 10 - iter 1068/1786 - loss 0.18637461 - time (sec): 18.38 - samples/sec: 7953.10 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:02:16,003 epoch 10 - iter 1246/1786 - loss 0.18229577 - time (sec): 21.52 - samples/sec: 8003.31 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:02:19,049 epoch 10 - iter 1424/1786 - loss 0.17769788 - time (sec): 24.57 - samples/sec: 8063.36 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:02:22,100 epoch 10 - iter 1602/1786 - loss 0.17934314 - time (sec): 27.62 - samples/sec: 8089.31 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:02:25,155 epoch 10 - iter 1780/1786 - loss 0.18135445 - time (sec): 30.67 - samples/sec: 8090.82 - lr: 0.000000 - momentum: 0.000000
2023-10-19 20:02:25,250 ----------------------------------------------------------------------------------------------------
2023-10-19 20:02:25,250 EPOCH 10 done: loss 0.1819 - lr: 0.000000
2023-10-19 20:02:28,074 DEV : loss 0.19165924191474915 - f1-score (micro avg) 0.5584
2023-10-19 20:02:28,088 saving best model
2023-10-19 20:02:28,154 ----------------------------------------------------------------------------------------------------
2023-10-19 20:02:28,154 Loading model from best epoch ...
2023-10-19 20:02:28,235 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 20:02:32,857
Results:
- F-score (micro) 0.449
- F-score (macro) 0.2899
- Accuracy 0.2994
By class:
              precision    recall  f1-score   support

         LOC     0.4268    0.5562    0.4830      1095
         PER     0.4745    0.5247    0.4984      1012
         ORG     0.1980    0.1625    0.1785       357
   HumanProd     0.0000    0.0000    0.0000        33

   micro avg     0.4220    0.4798    0.4490      2497
   macro avg     0.2748    0.3108    0.2899      2497
weighted avg     0.4078    0.4798    0.4393      2497
2023-10-19 20:02:32,857 ----------------------------------------------------------------------------------------------------
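Note: for completeness, a small usage sketch for the saved checkpoint follows; the example sentence and the "ner" label type are illustrative assumptions.

    # Hedged sketch: load best-model.pt from the base path logged above and tag a sentence.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(
        "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
    )

    sentence = Sentence("Le Figaro a été fondé à Paris en 1826 .")
    tagger.predict(sentence)
    for span in sentence.get_spans("ner"):
        print(span)  # prints detected PER/LOC/ORG/HumanProd spans with scores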