2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,648 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,648 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
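The final `Linear(in_features=128, out_features=21)` layer above matches a BIOES tag scheme over the five entity types evaluated later in this log (loc, pers, org, prod, time): four positional tags per type plus the outside tag "O". A minimal sketch of how that 21-tag set arises (illustration only, not Flair code):

```python
# BIOES tagging: Single, Begin, End, Inside markers per entity type, plus "O".
entity_types = ["loc", "pers", "org", "prod", "time"]
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 5 types x 4 markers + "O" = 21, matching out_features=21
```

The ordering (O, S-loc, B-loc, E-loc, I-loc, S-pers, ...) matches the tag dictionary printed when the best model is loaded at the end of the log.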
2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,648 Train:  3575 sentences
2023-10-18 18:12:26,648         (train_with_dev=False, train_with_test=False)
2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,648 Training Params:
2023-10-18 18:12:26,648  - learning_rate: "3e-05"
2023-10-18 18:12:26,648  - mini_batch_size: "4"
2023-10-18 18:12:26,648  - max_epochs: "10"
2023-10-18 18:12:26,648  - shuffle: "True"
2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Plugins:
2023-10-18 18:12:26,649  - TensorboardLogger
2023-10-18 18:12:26,649  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:12:26,649  - metric: "('micro avg', 'f1-score')"
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Computation:
2023-10-18 18:12:26,649  - compute on device: cuda:0
2023-10-18 18:12:26,649  - embedding storage: none
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Logging anything other than scalars to TensorBoard is currently not supported.
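The `LinearScheduler | warmup_fraction: '0.1'` plugin explains the lr column in the epoch logs below: the learning rate ramps linearly toward 3e-05 over the first 10% of the 8,940 total batches (894 batches/epoch × 10 epochs), then decays linearly back to zero. A minimal sketch of that schedule, inferred from the logged lr values rather than taken from Flair's implementation:

```python
def linear_schedule(step, total_steps, base_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to base_lr, then linear decay to zero.
    Sketch of the schedule implied by this log's lr column; not Flair's code."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 894 * 10  # batches per epoch x max_epochs
print(round(linear_schedule(89, total), 6))   # epoch 1, iter 89: logged lr 0.000003
print(round(linear_schedule(890, total), 6))  # epoch 1, iter 890: logged lr 0.000030
```

The decay phase also matches the log (e.g. lr 0.000027 late in epoch 2, reaching 0.000000 at the end of epoch 10).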
2023-10-18 18:12:28,063 epoch 1 - iter 89/894 - loss 4.30065895 - time (sec): 1.41 - samples/sec: 5863.68 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:12:29,477 epoch 1 - iter 178/894 - loss 4.05118813 - time (sec): 2.83 - samples/sec: 5957.89 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:12:30,886 epoch 1 - iter 267/894 - loss 3.74797971 - time (sec): 4.24 - samples/sec: 6281.29 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:12:32,265 epoch 1 - iter 356/894 - loss 3.35481920 - time (sec): 5.62 - samples/sec: 6340.56 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:12:33,666 epoch 1 - iter 445/894 - loss 2.91699777 - time (sec): 7.02 - samples/sec: 6417.64 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:12:35,047 epoch 1 - iter 534/894 - loss 2.58442281 - time (sec): 8.40 - samples/sec: 6383.78 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:12:36,429 epoch 1 - iter 623/894 - loss 2.32823076 - time (sec): 9.78 - samples/sec: 6307.91 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:12:37,798 epoch 1 - iter 712/894 - loss 2.12806775 - time (sec): 11.15 - samples/sec: 6264.38 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:12:39,182 epoch 1 - iter 801/894 - loss 1.97667733 - time (sec): 12.53 - samples/sec: 6194.87 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:12:40,561 epoch 1 - iter 890/894 - loss 1.83834318 - time (sec): 13.91 - samples/sec: 6197.81 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:12:40,624 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:40,625 EPOCH 1 done: loss 1.8338 - lr: 0.000030
2023-10-18 18:12:42,893 DEV : loss 0.4581781327724457 - f1-score (micro avg) 0.0
2023-10-18 18:12:42,916 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:44,292 epoch 2 - iter 89/894 - loss 0.59867823 - time (sec): 1.38 - samples/sec: 6435.72 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:12:45,665 epoch 2 - iter 178/894 - loss 0.57480149 - time (sec): 2.75 - samples/sec: 6269.54 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:12:47,021 epoch 2 - iter 267/894 - loss 0.56770059 - time (sec): 4.10 - samples/sec: 6239.82 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:12:48,401 epoch 2 - iter 356/894 - loss 0.56163379 - time (sec): 5.49 - samples/sec: 6187.91 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:12:49,776 epoch 2 - iter 445/894 - loss 0.54920126 - time (sec): 6.86 - samples/sec: 6175.95 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:12:51,155 epoch 2 - iter 534/894 - loss 0.54336327 - time (sec): 8.24 - samples/sec: 6073.43 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:12:52,540 epoch 2 - iter 623/894 - loss 0.53277504 - time (sec): 9.62 - samples/sec: 6109.68 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:12:53,993 epoch 2 - iter 712/894 - loss 0.51554670 - time (sec): 11.08 - samples/sec: 6207.01 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:12:55,391 epoch 2 - iter 801/894 - loss 0.51151749 - time (sec): 12.47 - samples/sec: 6234.99 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:12:56,802 epoch 2 - iter 890/894 - loss 0.50365857 - time (sec): 13.89 - samples/sec: 6205.94 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:12:56,873 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:56,874 EPOCH 2 done: loss 0.5047 - lr: 0.000027
2023-10-18 18:13:02,051 DEV : loss 0.34852492809295654 - f1-score (micro avg) 0.0967
2023-10-18 18:13:02,074 saving best model
2023-10-18 18:13:02,108 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:03,520 epoch 3 - iter 89/894 - loss 0.42103620 - time (sec): 1.41 - samples/sec: 6298.51 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:13:04,895 epoch 3 - iter 178/894 - loss 0.40181846 - time (sec): 2.79 - samples/sec: 6186.38 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:13:06,296 epoch 3 - iter 267/894 - loss 0.41796152 - time (sec): 4.19 - samples/sec: 6196.39 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:13:07,671 epoch 3 - iter 356/894 - loss 0.42404257 - time (sec): 5.56 - samples/sec: 6124.96 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:13:09,042 epoch 3 - iter 445/894 - loss 0.41948278 - time (sec): 6.93 - samples/sec: 6176.89 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:13:10,487 epoch 3 - iter 534/894 - loss 0.41394295 - time (sec): 8.38 - samples/sec: 6298.19 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:13:11,914 epoch 3 - iter 623/894 - loss 0.42265535 - time (sec): 9.81 - samples/sec: 6275.67 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:13:13,303 epoch 3 - iter 712/894 - loss 0.41519781 - time (sec): 11.20 - samples/sec: 6238.64 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:13:14,686 epoch 3 - iter 801/894 - loss 0.41634385 - time (sec): 12.58 - samples/sec: 6174.11 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:13:16,059 epoch 3 - iter 890/894 - loss 0.41644892 - time (sec): 13.95 - samples/sec: 6174.42 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:13:16,119 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:16,119 EPOCH 3 done: loss 0.4159 - lr: 0.000023
2023-10-18 18:13:21,342 DEV : loss 0.32269880175590515 - f1-score (micro avg) 0.2706
2023-10-18 18:13:21,366 saving best model
2023-10-18 18:13:21,403 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:22,790 epoch 4 - iter 89/894 - loss 0.35650452 - time (sec): 1.39 - samples/sec: 5576.49 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:13:24,169 epoch 4 - iter 178/894 - loss 0.37533141 - time (sec): 2.77 - samples/sec: 5757.37 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:13:25,573 epoch 4 - iter 267/894 - loss 0.40715939 - time (sec): 4.17 - samples/sec: 5835.06 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:13:26,968 epoch 4 - iter 356/894 - loss 0.40776949 - time (sec): 5.56 - samples/sec: 5922.05 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:13:28,396 epoch 4 - iter 445/894 - loss 0.39118533 - time (sec): 6.99 - samples/sec: 6059.11 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:13:29,766 epoch 4 - iter 534/894 - loss 0.38025983 - time (sec): 8.36 - samples/sec: 6110.48 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:13:31,206 epoch 4 - iter 623/894 - loss 0.37519774 - time (sec): 9.80 - samples/sec: 6187.68 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:13:32,610 epoch 4 - iter 712/894 - loss 0.37572856 - time (sec): 11.21 - samples/sec: 6218.96 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:13:33,990 epoch 4 - iter 801/894 - loss 0.37183429 - time (sec): 12.59 - samples/sec: 6197.42 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:13:35,366 epoch 4 - iter 890/894 - loss 0.37518355 - time (sec): 13.96 - samples/sec: 6178.59 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:13:35,422 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:35,422 EPOCH 4 done: loss 0.3762 - lr: 0.000020
2023-10-18 18:13:40,338 DEV : loss 0.32122358679771423 - f1-score (micro avg) 0.2991
2023-10-18 18:13:40,361 saving best model
2023-10-18 18:13:40,395 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:41,655 epoch 5 - iter 89/894 - loss 0.37027438 - time (sec): 1.26 - samples/sec: 7062.84 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:13:42,917 epoch 5 - iter 178/894 - loss 0.34873801 - time (sec): 2.52 - samples/sec: 6657.06 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:13:44,643 epoch 5 - iter 267/894 - loss 0.34681192 - time (sec): 4.25 - samples/sec: 6165.03 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:13:46,029 epoch 5 - iter 356/894 - loss 0.35184385 - time (sec): 5.63 - samples/sec: 6206.58 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:13:47,434 epoch 5 - iter 445/894 - loss 0.34721870 - time (sec): 7.04 - samples/sec: 6219.04 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:13:48,822 epoch 5 - iter 534/894 - loss 0.34945722 - time (sec): 8.43 - samples/sec: 6192.10 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:13:50,180 epoch 5 - iter 623/894 - loss 0.35630689 - time (sec): 9.78 - samples/sec: 6119.76 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:13:51,574 epoch 5 - iter 712/894 - loss 0.35521245 - time (sec): 11.18 - samples/sec: 6089.61 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:13:52,948 epoch 5 - iter 801/894 - loss 0.35160330 - time (sec): 12.55 - samples/sec: 6080.81 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:13:54,243 epoch 5 - iter 890/894 - loss 0.35440492 - time (sec): 13.85 - samples/sec: 6232.31 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:13:54,305 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:54,305 EPOCH 5 done: loss 0.3544 - lr: 0.000017
2023-10-18 18:13:59,306 DEV : loss 0.3080124258995056 - f1-score (micro avg) 0.3158
2023-10-18 18:13:59,331 saving best model
2023-10-18 18:13:59,364 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:00,783 epoch 6 - iter 89/894 - loss 0.33467325 - time (sec): 1.42 - samples/sec: 5601.86 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:14:02,208 epoch 6 - iter 178/894 - loss 0.30810723 - time (sec): 2.84 - samples/sec: 6105.34 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:14:03,625 epoch 6 - iter 267/894 - loss 0.30292506 - time (sec): 4.26 - samples/sec: 5918.09 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:14:05,038 epoch 6 - iter 356/894 - loss 0.32078430 - time (sec): 5.67 - samples/sec: 6105.76 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:14:06,396 epoch 6 - iter 445/894 - loss 0.32374234 - time (sec): 7.03 - samples/sec: 6237.92 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:14:07,777 epoch 6 - iter 534/894 - loss 0.32865529 - time (sec): 8.41 - samples/sec: 6183.14 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:14:09,178 epoch 6 - iter 623/894 - loss 0.32702913 - time (sec): 9.81 - samples/sec: 6127.61 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:14:10,596 epoch 6 - iter 712/894 - loss 0.32725480 - time (sec): 11.23 - samples/sec: 6209.99 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:14:11,856 epoch 6 - iter 801/894 - loss 0.32393042 - time (sec): 12.49 - samples/sec: 6227.81 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:14:13,084 epoch 6 - iter 890/894 - loss 0.33649160 - time (sec): 13.72 - samples/sec: 6284.34 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:14:13,135 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:13,135 EPOCH 6 done: loss 0.3365 - lr: 0.000013
2023-10-18 18:14:18,439 DEV : loss 0.3021206855773926 - f1-score (micro avg) 0.3207
2023-10-18 18:14:18,464 saving best model
2023-10-18 18:14:18,496 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:19,958 epoch 7 - iter 89/894 - loss 0.26960562 - time (sec): 1.46 - samples/sec: 6365.87 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:14:21,346 epoch 7 - iter 178/894 - loss 0.30468972 - time (sec): 2.85 - samples/sec: 6151.31 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:14:22,767 epoch 7 - iter 267/894 - loss 0.33313252 - time (sec): 4.27 - samples/sec: 6407.92 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:14:24,187 epoch 7 - iter 356/894 - loss 0.33431427 - time (sec): 5.69 - samples/sec: 6344.58 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:14:25,577 epoch 7 - iter 445/894 - loss 0.33591565 - time (sec): 7.08 - samples/sec: 6302.37 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:14:26,922 epoch 7 - iter 534/894 - loss 0.33295271 - time (sec): 8.43 - samples/sec: 6245.70 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:14:28,331 epoch 7 - iter 623/894 - loss 0.32496234 - time (sec): 9.83 - samples/sec: 6181.53 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:14:29,748 epoch 7 - iter 712/894 - loss 0.32501740 - time (sec): 11.25 - samples/sec: 6160.77 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:14:31,133 epoch 7 - iter 801/894 - loss 0.31957759 - time (sec): 12.64 - samples/sec: 6170.46 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:14:32,480 epoch 7 - iter 890/894 - loss 0.32210478 - time (sec): 13.98 - samples/sec: 6164.55 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:14:32,542 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:32,543 EPOCH 7 done: loss 0.3228 - lr: 0.000010
2023-10-18 18:14:37,838 DEV : loss 0.30848488211631775 - f1-score (micro avg) 0.3318
2023-10-18 18:14:37,863 saving best model
2023-10-18 18:14:37,897 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:39,315 epoch 8 - iter 89/894 - loss 0.31380512 - time (sec): 1.42 - samples/sec: 6724.19 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:14:40,788 epoch 8 - iter 178/894 - loss 0.30161176 - time (sec): 2.89 - samples/sec: 6165.41 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:14:42,187 epoch 8 - iter 267/894 - loss 0.31408641 - time (sec): 4.29 - samples/sec: 6172.07 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:14:43,636 epoch 8 - iter 356/894 - loss 0.32273847 - time (sec): 5.74 - samples/sec: 6020.27 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:14:45,011 epoch 8 - iter 445/894 - loss 0.32694106 - time (sec): 7.11 - samples/sec: 6034.66 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:14:46,388 epoch 8 - iter 534/894 - loss 0.32254045 - time (sec): 8.49 - samples/sec: 6039.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:14:47,761 epoch 8 - iter 623/894 - loss 0.31843528 - time (sec): 9.86 - samples/sec: 6009.62 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:14:49,180 epoch 8 - iter 712/894 - loss 0.31741782 - time (sec): 11.28 - samples/sec: 6032.12 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:14:50,555 epoch 8 - iter 801/894 - loss 0.31067949 - time (sec): 12.66 - samples/sec: 6049.15 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:14:51,967 epoch 8 - iter 890/894 - loss 0.31317693 - time (sec): 14.07 - samples/sec: 6118.58 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:14:52,032 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:52,032 EPOCH 8 done: loss 0.3121 - lr: 0.000007
2023-10-18 18:14:57,331 DEV : loss 0.304724782705307 - f1-score (micro avg) 0.3341
2023-10-18 18:14:57,355 saving best model
2023-10-18 18:14:57,395 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:58,774 epoch 9 - iter 89/894 - loss 0.28201030 - time (sec): 1.38 - samples/sec: 5974.75 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:15:00,149 epoch 9 - iter 178/894 - loss 0.30076729 - time (sec): 2.75 - samples/sec: 5684.88 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:15:01,567 epoch 9 - iter 267/894 - loss 0.29425291 - time (sec): 4.17 - samples/sec: 5863.62 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:15:03,049 epoch 9 - iter 356/894 - loss 0.30464495 - time (sec): 5.65 - samples/sec: 5799.05 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:15:04,519 epoch 9 - iter 445/894 - loss 0.30458727 - time (sec): 7.12 - samples/sec: 5906.97 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:15:05,921 epoch 9 - iter 534/894 - loss 0.30462622 - time (sec): 8.53 - samples/sec: 6011.64 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:15:07,339 epoch 9 - iter 623/894 - loss 0.29970434 - time (sec): 9.94 - samples/sec: 6014.50 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:15:08,843 epoch 9 - iter 712/894 - loss 0.29994186 - time (sec): 11.45 - samples/sec: 5930.13 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:15:10,270 epoch 9 - iter 801/894 - loss 0.30270378 - time (sec): 12.87 - samples/sec: 6023.83 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:15:11,655 epoch 9 - iter 890/894 - loss 0.30548739 - time (sec): 14.26 - samples/sec: 6053.64 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:15:11,717 ----------------------------------------------------------------------------------------------------
2023-10-18 18:15:11,717 EPOCH 9 done: loss 0.3060 - lr: 0.000003
2023-10-18 18:15:16,672 DEV : loss 0.3093281090259552 - f1-score (micro avg) 0.3296
2023-10-18 18:15:16,697 ----------------------------------------------------------------------------------------------------
2023-10-18 18:15:18,092 epoch 10 - iter 89/894 - loss 0.35217359 - time (sec): 1.39 - samples/sec: 5946.91 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:15:19,482 epoch 10 - iter 178/894 - loss 0.32604035 - time (sec): 2.78 - samples/sec: 5919.61 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:15:20,873 epoch 10 - iter 267/894 - loss 0.30390326 - time (sec): 4.18 - samples/sec: 6002.65 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:15:22,239 epoch 10 - iter 356/894 - loss 0.29975890 - time (sec): 5.54 - samples/sec: 5978.58 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:15:23,593 epoch 10 - iter 445/894 - loss 0.30706171 - time (sec): 6.90 - samples/sec: 5917.41 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:15:25,047 epoch 10 - iter 534/894 - loss 0.29824962 - time (sec): 8.35 - samples/sec: 5945.27 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:15:26,410 epoch 10 - iter 623/894 - loss 0.29881777 - time (sec): 9.71 - samples/sec: 5970.96 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:15:28,198 epoch 10 - iter 712/894 - loss 0.30115841 - time (sec): 11.50 - samples/sec: 5963.76 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:15:29,602 epoch 10 - iter 801/894 - loss 0.30301864 - time (sec): 12.90 - samples/sec: 5967.88 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:15:31,020 epoch 10 - iter 890/894 - loss 0.30007494 - time (sec): 14.32 - samples/sec: 6006.28 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:15:31,082 ----------------------------------------------------------------------------------------------------
2023-10-18 18:15:31,082 EPOCH 10 done: loss 0.3003 - lr: 0.000000
2023-10-18 18:15:36,037 DEV : loss 0.30702558159828186 - f1-score (micro avg) 0.3351
2023-10-18 18:15:36,062 saving best model
2023-10-18 18:15:36,125 ----------------------------------------------------------------------------------------------------
2023-10-18 18:15:36,126 Loading model from best epoch ...
2023-10-18 18:15:36,208 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-18 18:15:38,533 Results:
- F-score (micro) 0.3319
- F-score (macro) 0.1314
- Accuracy 0.2091

By class:
              precision    recall  f1-score   support

         loc     0.4859    0.5503    0.5161       596
        pers     0.1228    0.1652    0.1408       333
         org     0.0000    0.0000    0.0000       132
        prod     0.0000    0.0000    0.0000        66
        time     0.0000    0.0000    0.0000        49

   micro avg     0.3383    0.3257    0.3319      1176
   macro avg     0.1217    0.1431    0.1314      1176
weighted avg     0.2810    0.3257    0.3015      1176

2023-10-18 18:15:38,533 ----------------------------------------------------------------------------------------------------
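As a quick sanity check on the final report (pure arithmetic on the numbers above, not part of the log): the micro F-score is the harmonic mean of the micro-average precision and recall, and the macro F-score is the unweighted mean of the five per-class f1-scores.

```python
# Values taken from the "By class" table above.
precision, recall = 0.3383, 0.3257            # micro avg row
micro_f1 = 2 * precision * recall / (precision + recall)
macro_f1 = (0.5161 + 0.1408 + 0.0 + 0.0 + 0.0) / 5  # mean of per-class f1
print(round(micro_f1, 4))  # 0.3319, as reported
print(round(macro_f1, 4))  # 0.1314, as reported
```

Note that org, prod, and time contribute zero f1 here, which is why the macro average sits far below the micro average dominated by the loc class.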