2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,648 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,648 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,648 Train: 3575 sentences
2023-10-18 18:12:26,648 (train_with_dev=False, train_with_test=False)
2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,648 Training Params:
2023-10-18 18:12:26,648 - learning_rate: "3e-05"
2023-10-18 18:12:26,648 - mini_batch_size: "4"
2023-10-18 18:12:26,648 - max_epochs: "10"
2023-10-18 18:12:26,648 - shuffle: "True"
2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Plugins:
2023-10-18 18:12:26,649 - TensorboardLogger
2023-10-18 18:12:26,649 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:12:26,649 - metric: "('micro avg', 'f1-score')"
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Computation:
2023-10-18 18:12:26,649 - compute on device: cuda:0
2023-10-18 18:12:26,649 - embedding storage: none
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 18:12:28,063 epoch 1 - iter 89/894 - loss 4.30065895 - time (sec): 1.41 - samples/sec: 5863.68 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:12:29,477 epoch 1 - iter 178/894 - loss 4.05118813 - time (sec): 2.83 - samples/sec: 5957.89 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:12:30,886 epoch 1 - iter 267/894 - loss 3.74797971 - time (sec): 4.24 - samples/sec: 6281.29 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:12:32,265 epoch 1 - iter 356/894 - loss 3.35481920 - time (sec): 5.62 - samples/sec: 6340.56 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:12:33,666 epoch 1 - iter 445/894 - loss 2.91699777 - time (sec): 7.02 - samples/sec: 6417.64 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:12:35,047 epoch 1 - iter 534/894 - loss 2.58442281 - time (sec): 8.40 - samples/sec: 6383.78 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:12:36,429 epoch 1 - iter 623/894 - loss 2.32823076 - time (sec): 9.78 - samples/sec: 6307.91 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:12:37,798 epoch 1 - iter 712/894 - loss 2.12806775 - time (sec): 11.15 - samples/sec: 6264.38 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:12:39,182 epoch 1 - iter 801/894 - loss 1.97667733 - time (sec): 12.53 - samples/sec: 6194.87 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:12:40,561 epoch 1 - iter 890/894 - loss 1.83834318 - time (sec): 13.91 - samples/sec: 6197.81 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:12:40,624 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:40,625 EPOCH 1 done: loss 1.8338 - lr: 0.000030
2023-10-18 18:12:42,893 DEV : loss 0.4581781327724457 - f1-score (micro avg) 0.0
2023-10-18 18:12:42,916 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:44,292 epoch 2 - iter 89/894 - loss 0.59867823 - time (sec): 1.38 - samples/sec: 6435.72 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:12:45,665 epoch 2 - iter 178/894 - loss 0.57480149 - time (sec): 2.75 - samples/sec: 6269.54 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:12:47,021 epoch 2 - iter 267/894 - loss 0.56770059 - time (sec): 4.10 - samples/sec: 6239.82 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:12:48,401 epoch 2 - iter 356/894 - loss 0.56163379 - time (sec): 5.49 - samples/sec: 6187.91 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:12:49,776 epoch 2 - iter 445/894 - loss 0.54920126 - time (sec): 6.86 - samples/sec: 6175.95 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:12:51,155 epoch 2 - iter 534/894 - loss 0.54336327 - time (sec): 8.24 - samples/sec: 6073.43 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:12:52,540 epoch 2 - iter 623/894 - loss 0.53277504 - time (sec): 9.62 - samples/sec: 6109.68 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:12:53,993 epoch 2 - iter 712/894 - loss 0.51554670 - time (sec): 11.08 - samples/sec: 6207.01 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:12:55,391 epoch 2 - iter 801/894 - loss 0.51151749 - time (sec): 12.47 - samples/sec: 6234.99 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:12:56,802 epoch 2 - iter 890/894 - loss 0.50365857 - time (sec): 13.89 - samples/sec: 6205.94 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:12:56,873 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:56,874 EPOCH 2 done: loss 0.5047 - lr: 0.000027
2023-10-18 18:13:02,051 DEV : loss 0.34852492809295654 - f1-score (micro avg) 0.0967
2023-10-18 18:13:02,074 saving best model
2023-10-18 18:13:02,108 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:03,520 epoch 3 - iter 89/894 - loss 0.42103620 - time (sec): 1.41 - samples/sec: 6298.51 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:13:04,895 epoch 3 - iter 178/894 - loss 0.40181846 - time (sec): 2.79 - samples/sec: 6186.38 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:13:06,296 epoch 3 - iter 267/894 - loss 0.41796152 - time (sec): 4.19 - samples/sec: 6196.39 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:13:07,671 epoch 3 - iter 356/894 - loss 0.42404257 - time (sec): 5.56 - samples/sec: 6124.96 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:13:09,042 epoch 3 - iter 445/894 - loss 0.41948278 - time (sec): 6.93 - samples/sec: 6176.89 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:13:10,487 epoch 3 - iter 534/894 - loss 0.41394295 - time (sec): 8.38 - samples/sec: 6298.19 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:13:11,914 epoch 3 - iter 623/894 - loss 0.42265535 - time (sec): 9.81 - samples/sec: 6275.67 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:13:13,303 epoch 3 - iter 712/894 - loss 0.41519781 - time (sec): 11.20 - samples/sec: 6238.64 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:13:14,686 epoch 3 - iter 801/894 - loss 0.41634385 - time (sec): 12.58 - samples/sec: 6174.11 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:13:16,059 epoch 3 - iter 890/894 - loss 0.41644892 - time (sec): 13.95 - samples/sec: 6174.42 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:13:16,119 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:16,119 EPOCH 3 done: loss 0.4159 - lr: 0.000023
2023-10-18 18:13:21,342 DEV : loss 0.32269880175590515 - f1-score (micro avg) 0.2706
2023-10-18 18:13:21,366 saving best model
2023-10-18 18:13:21,403 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:22,790 epoch 4 - iter 89/894 - loss 0.35650452 - time (sec): 1.39 - samples/sec: 5576.49 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:13:24,169 epoch 4 - iter 178/894 - loss 0.37533141 - time (sec): 2.77 - samples/sec: 5757.37 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:13:25,573 epoch 4 - iter 267/894 - loss 0.40715939 - time (sec): 4.17 - samples/sec: 5835.06 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:13:26,968 epoch 4 - iter 356/894 - loss 0.40776949 - time (sec): 5.56 - samples/sec: 5922.05 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:13:28,396 epoch 4 - iter 445/894 - loss 0.39118533 - time (sec): 6.99 - samples/sec: 6059.11 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:13:29,766 epoch 4 - iter 534/894 - loss 0.38025983 - time (sec): 8.36 - samples/sec: 6110.48 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:13:31,206 epoch 4 - iter 623/894 - loss 0.37519774 - time (sec): 9.80 - samples/sec: 6187.68 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:13:32,610 epoch 4 - iter 712/894 - loss 0.37572856 - time (sec): 11.21 - samples/sec: 6218.96 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:13:33,990 epoch 4 - iter 801/894 - loss 0.37183429 - time (sec): 12.59 - samples/sec: 6197.42 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:13:35,366 epoch 4 - iter 890/894 - loss 0.37518355 - time (sec): 13.96 - samples/sec: 6178.59 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:13:35,422 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:35,422 EPOCH 4 done: loss 0.3762 - lr: 0.000020
2023-10-18 18:13:40,338 DEV : loss 0.32122358679771423 - f1-score (micro avg) 0.2991
2023-10-18 18:13:40,361 saving best model
2023-10-18 18:13:40,395 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:41,655 epoch 5 - iter 89/894 - loss 0.37027438 - time (sec): 1.26 - samples/sec: 7062.84 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:13:42,917 epoch 5 - iter 178/894 - loss 0.34873801 - time (sec): 2.52 - samples/sec: 6657.06 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:13:44,643 epoch 5 - iter 267/894 - loss 0.34681192 - time (sec): 4.25 - samples/sec: 6165.03 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:13:46,029 epoch 5 - iter 356/894 - loss 0.35184385 - time (sec): 5.63 - samples/sec: 6206.58 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:13:47,434 epoch 5 - iter 445/894 - loss 0.34721870 - time (sec): 7.04 - samples/sec: 6219.04 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:13:48,822 epoch 5 - iter 534/894 - loss 0.34945722 - time (sec): 8.43 - samples/sec: 6192.10 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:13:50,180 epoch 5 - iter 623/894 - loss 0.35630689 - time (sec): 9.78 - samples/sec: 6119.76 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:13:51,574 epoch 5 - iter 712/894 - loss 0.35521245 - time (sec): 11.18 - samples/sec: 6089.61 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:13:52,948 epoch 5 - iter 801/894 - loss 0.35160330 - time (sec): 12.55 - samples/sec: 6080.81 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:13:54,243 epoch 5 - iter 890/894 - loss 0.35440492 - time (sec): 13.85 - samples/sec: 6232.31 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:13:54,305 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:54,305 EPOCH 5 done: loss 0.3544 - lr: 0.000017
2023-10-18 18:13:59,306 DEV : loss 0.3080124258995056 - f1-score (micro avg) 0.3158
2023-10-18 18:13:59,331 saving best model
2023-10-18 18:13:59,364 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:00,783 epoch 6 - iter 89/894 - loss 0.33467325 - time (sec): 1.42 - samples/sec: 5601.86 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:14:02,208 epoch 6 - iter 178/894 - loss 0.30810723 - time (sec): 2.84 - samples/sec: 6105.34 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:14:03,625 epoch 6 - iter 267/894 - loss 0.30292506 - time (sec): 4.26 - samples/sec: 5918.09 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:14:05,038 epoch 6 - iter 356/894 - loss 0.32078430 - time (sec): 5.67 - samples/sec: 6105.76 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:14:06,396 epoch 6 - iter 445/894 - loss 0.32374234 - time (sec): 7.03 - samples/sec: 6237.92 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:14:07,777 epoch 6 - iter 534/894 - loss 0.32865529 - time (sec): 8.41 - samples/sec: 6183.14 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:14:09,178 epoch 6 - iter 623/894 - loss 0.32702913 - time (sec): 9.81 - samples/sec: 6127.61 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:14:10,596 epoch 6 - iter 712/894 - loss 0.32725480 - time (sec): 11.23 - samples/sec: 6209.99 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:14:11,856 epoch 6 - iter 801/894 - loss 0.32393042 - time (sec): 12.49 - samples/sec: 6227.81 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:14:13,084 epoch 6 - iter 890/894 - loss 0.33649160 - time (sec): 13.72 - samples/sec: 6284.34 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:14:13,135 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:13,135 EPOCH 6 done: loss 0.3365 - lr: 0.000013
2023-10-18 18:14:18,439 DEV : loss 0.3021206855773926 - f1-score (micro avg) 0.3207
2023-10-18 18:14:18,464 saving best model
2023-10-18 18:14:18,496 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:19,958 epoch 7 - iter 89/894 - loss 0.26960562 - time (sec): 1.46 - samples/sec: 6365.87 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:14:21,346 epoch 7 - iter 178/894 - loss 0.30468972 - time (sec): 2.85 - samples/sec: 6151.31 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:14:22,767 epoch 7 - iter 267/894 - loss 0.33313252 - time (sec): 4.27 - samples/sec: 6407.92 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:14:24,187 epoch 7 - iter 356/894 - loss 0.33431427 - time (sec): 5.69 - samples/sec: 6344.58 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:14:25,577 epoch 7 - iter 445/894 - loss 0.33591565 - time (sec): 7.08 - samples/sec: 6302.37 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:14:26,922 epoch 7 - iter 534/894 - loss 0.33295271 - time (sec): 8.43 - samples/sec: 6245.70 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:14:28,331 epoch 7 - iter 623/894 - loss 0.32496234 - time (sec): 9.83 - samples/sec: 6181.53 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:14:29,748 epoch 7 - iter 712/894 - loss 0.32501740 - time (sec): 11.25 - samples/sec: 6160.77 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:14:31,133 epoch 7 - iter 801/894 - loss 0.31957759 - time (sec): 12.64 - samples/sec: 6170.46 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:14:32,480 epoch 7 - iter 890/894 - loss 0.32210478 - time (sec): 13.98 - samples/sec: 6164.55 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:14:32,542 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:32,543 EPOCH 7 done: loss 0.3228 - lr: 0.000010
2023-10-18 18:14:37,838 DEV : loss 0.30848488211631775 - f1-score (micro avg) 0.3318
2023-10-18 18:14:37,863 saving best model
2023-10-18 18:14:37,897 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:39,315 epoch 8 - iter 89/894 - loss 0.31380512 - time (sec): 1.42 - samples/sec: 6724.19 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:14:40,788 epoch 8 - iter 178/894 - loss 0.30161176 - time (sec): 2.89 - samples/sec: 6165.41 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:14:42,187 epoch 8 - iter 267/894 - loss 0.31408641 - time (sec): 4.29 - samples/sec: 6172.07 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:14:43,636 epoch 8 - iter 356/894 - loss 0.32273847 - time (sec): 5.74 - samples/sec: 6020.27 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:14:45,011 epoch 8 - iter 445/894 - loss 0.32694106 - time (sec): 7.11 - samples/sec: 6034.66 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:14:46,388 epoch 8 - iter 534/894 - loss 0.32254045 - time (sec): 8.49 - samples/sec: 6039.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:14:47,761 epoch 8 - iter 623/894 - loss 0.31843528 - time (sec): 9.86 - samples/sec: 6009.62 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:14:49,180 epoch 8 - iter 712/894 - loss 0.31741782 - time (sec): 11.28 - samples/sec: 6032.12 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:14:50,555 epoch 8 - iter 801/894 - loss 0.31067949 - time (sec): 12.66 - samples/sec: 6049.15 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:14:51,967 epoch 8 - iter 890/894 - loss 0.31317693 - time (sec): 14.07 - samples/sec: 6118.58 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:14:52,032 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:52,032 EPOCH 8 done: loss 0.3121 - lr: 0.000007
2023-10-18 18:14:57,331 DEV : loss 0.304724782705307 - f1-score (micro avg) 0.3341
2023-10-18 18:14:57,355 saving best model
2023-10-18 18:14:57,395 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:58,774 epoch 9 - iter 89/894 - loss 0.28201030 - time (sec): 1.38 - samples/sec: 5974.75 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:15:00,149 epoch 9 - iter 178/894 - loss 0.30076729 - time (sec): 2.75 - samples/sec: 5684.88 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:15:01,567 epoch 9 - iter 267/894 - loss 0.29425291 - time (sec): 4.17 - samples/sec: 5863.62 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:15:03,049 epoch 9 - iter 356/894 - loss 0.30464495 - time (sec): 5.65 - samples/sec: 5799.05 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:15:04,519 epoch 9 - iter 445/894 - loss 0.30458727 - time (sec): 7.12 - samples/sec: 5906.97 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:15:05,921 epoch 9 - iter 534/894 - loss 0.30462622 - time (sec): 8.53 - samples/sec: 6011.64 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:15:07,339 epoch 9 - iter 623/894 - loss 0.29970434 - time (sec): 9.94 - samples/sec: 6014.50 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:15:08,843 epoch 9 - iter 712/894 - loss 0.29994186 - time (sec): 11.45 - samples/sec: 5930.13 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:15:10,270 epoch 9 - iter 801/894 - loss 0.30270378 - time (sec): 12.87 - samples/sec: 6023.83 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:15:11,655 epoch 9 - iter 890/894 - loss 0.30548739 - time (sec): 14.26 - samples/sec: 6053.64 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:15:11,717 ----------------------------------------------------------------------------------------------------
2023-10-18 18:15:11,717 EPOCH 9 done: loss 0.3060 - lr: 0.000003
2023-10-18 18:15:16,672 DEV : loss 0.3093281090259552 - f1-score (micro avg) 0.3296
2023-10-18 18:15:16,697 ----------------------------------------------------------------------------------------------------
2023-10-18 18:15:18,092 epoch 10 - iter 89/894 - loss 0.35217359 - time (sec): 1.39 - samples/sec: 5946.91 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:15:19,482 epoch 10 - iter 178/894 - loss 0.32604035 - time (sec): 2.78 - samples/sec: 5919.61 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:15:20,873 epoch 10 - iter 267/894 - loss 0.30390326 - time (sec): 4.18 - samples/sec: 6002.65 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:15:22,239 epoch 10 - iter 356/894 - loss 0.29975890 - time (sec): 5.54 - samples/sec: 5978.58 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:15:23,593 epoch 10 - iter 445/894 - loss 0.30706171 - time (sec): 6.90 - samples/sec: 5917.41 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:15:25,047 epoch 10 - iter 534/894 - loss 0.29824962 - time (sec): 8.35 - samples/sec: 5945.27 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:15:26,410 epoch 10 - iter 623/894 - loss 0.29881777 - time (sec): 9.71 - samples/sec: 5970.96 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:15:28,198 epoch 10 - iter 712/894 - loss 0.30115841 - time (sec): 11.50 - samples/sec: 5963.76 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:15:29,602 epoch 10 - iter 801/894 - loss 0.30301864 - time (sec): 12.90 - samples/sec: 5967.88 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:15:31,020 epoch 10 - iter 890/894 - loss 0.30007494 - time (sec): 14.32 - samples/sec: 6006.28 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:15:31,082 ----------------------------------------------------------------------------------------------------
2023-10-18 18:15:31,082 EPOCH 10 done: loss 0.3003 - lr: 0.000000
2023-10-18 18:15:36,037 DEV : loss 0.30702558159828186 - f1-score (micro avg) 0.3351
2023-10-18 18:15:36,062 saving best model
2023-10-18 18:15:36,125 ----------------------------------------------------------------------------------------------------
2023-10-18 18:15:36,126 Loading model from best epoch ...
2023-10-18 18:15:36,208 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-18 18:15:38,533
Results:
- F-score (micro) 0.3319
- F-score (macro) 0.1314
- Accuracy 0.2091
By class:
              precision    recall  f1-score   support

         loc     0.4859    0.5503    0.5161       596
        pers     0.1228    0.1652    0.1408       333
         org     0.0000    0.0000    0.0000       132
        prod     0.0000    0.0000    0.0000        66
        time     0.0000    0.0000    0.0000        49

   micro avg     0.3383    0.3257    0.3319      1176
   macro avg     0.1217    0.1431    0.1314      1176
weighted avg     0.2810    0.3257    0.3015      1176
2023-10-18 18:15:38,533 ----------------------------------------------------------------------------------------------------
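
The aggregate rows in the "By class" table can be cross-checked from the per-class rows. The following is a minimal sketch, not part of the original log or of Flair itself: the helper names (`f1_from`, `macro_f1`, `weighted_f1`) are hypothetical, and the inputs are the four-decimal values printed above, so results agree with the log only up to rounding.

```python
# Per-class rows copied from the "By class" table:
# label -> (precision, recall, f1-score, support)
by_class = {
    "loc":  (0.4859, 0.5503, 0.5161, 596),
    "pers": (0.1228, 0.1652, 0.1408, 333),
    "org":  (0.0000, 0.0000, 0.0000, 132),
    "prod": (0.0000, 0.0000, 0.0000, 66),
    "time": (0.0000, 0.0000, 0.0000, 49),
}


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(rows: dict) -> float:
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, _, f1, _ in rows.values()) / len(rows)


def weighted_f1(rows: dict) -> float:
    """Mean of the per-class F1 scores, weighted by class support."""
    total_support = sum(support for _, _, _, support in rows.values())
    return sum(f1 * support for _, _, f1, support in rows.values()) / total_support


# Micro F1 is recomputed from the "micro avg" precision and recall in the log.
micro = f1_from(0.3383, 0.3257)
macro = macro_f1(by_class)
weighted = weighted_f1(by_class)

print(f"micro F1    {micro:.4f}")
print(f"macro F1    {macro:.4f}")
print(f"weighted F1 {weighted:.4f}")
```

The macro score is pulled down to 0.1314 because three of the five classes (org, prod, time) score 0.0 and each class counts equally, whereas the micro score is dominated by the more frequent loc and pers entities.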