2023-10-25 21:30:08,203 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,204 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:30:08,204 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,204 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:30:08,204 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,204 Train: 1166 sentences
2023-10-25 21:30:08,204         (train_with_dev=False, train_with_test=False)
2023-10-25 21:30:08,204 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,204 Training Params:
2023-10-25 21:30:08,204 - learning_rate: "3e-05"
2023-10-25 21:30:08,204 - mini_batch_size: "4"
2023-10-25 21:30:08,204 - max_epochs: "10"
2023-10-25 21:30:08,204 - shuffle: "True"
2023-10-25 21:30:08,204 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,204 Plugins:
2023-10-25 21:30:08,205 - TensorboardLogger
2023-10-25 21:30:08,205 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:30:08,205 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,205 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:30:08,205 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:30:08,205 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,205 Computation:
2023-10-25 21:30:08,205 - compute on device: cuda:0
2023-10-25 21:30:08,205 - embedding storage: none
2023-10-25 21:30:08,205 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,205 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 21:30:08,205 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,205 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,205 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:30:09,606 epoch 1 - iter 29/292 - loss 2.59371458 - time (sec): 1.40 - samples/sec: 2896.33 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:30:10,850 epoch 1 - iter 58/292 - loss 1.95964027 - time (sec): 2.64 - samples/sec: 2959.98 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:30:12,193 epoch 1 - iter 87/292 - loss 1.55409749 - time (sec): 3.99 - samples/sec: 3138.14 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:30:13,533 epoch 1 - iter 116/292 - loss 1.29448959 - time (sec): 5.33 - samples/sec: 3211.67 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:30:14,805 epoch 1 - iter 145/292 - loss 1.10125147 - time (sec): 6.60 - samples/sec: 3291.64 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:30:16,127 epoch 1 - iter 174/292 - loss 0.98629607 - time (sec): 7.92 - samples/sec: 3235.22 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:30:17,493 epoch 1 - iter 203/292 - loss 0.86851694 - time (sec): 9.29 - samples/sec: 3310.48 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:30:18,859 epoch 1 - iter 232/292 - loss 0.77881535 - time (sec): 10.65 - samples/sec: 3357.49 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:30:20,169 epoch 1 - iter 261/292 - loss 0.72190147 - time (sec): 11.96 - samples/sec: 3368.38 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:30:21,437 epoch 1 - iter 290/292 - loss 0.68367522 - time (sec): 13.23 - samples/sec: 3341.39 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:30:21,516 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:21,516 EPOCH 1 done: loss 0.6826 - lr: 0.000030
2023-10-25 21:30:22,184 DEV : loss 0.1497330218553543 - f1-score (micro avg) 0.5972
2023-10-25 21:30:22,188 saving best model
2023-10-25 21:30:22,530 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:23,819 epoch 2 - iter 29/292 - loss 0.17252226 - time (sec): 1.29 - samples/sec: 3461.31 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:30:25,115 epoch 2 - iter 58/292 - loss 0.19611204 - time (sec): 2.58 - samples/sec: 3403.61 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:30:26,423 epoch 2 - iter 87/292 - loss 0.18751975 - time (sec): 3.89 - samples/sec: 3340.71 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:30:27,769 epoch 2 - iter 116/292 - loss 0.17497620 - time (sec): 5.24 - samples/sec: 3374.72 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:30:29,058 epoch 2 - iter 145/292 - loss 0.16837296 - time (sec): 6.53 - samples/sec: 3335.07 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:30:30,367 epoch 2 - iter 174/292 - loss 0.16678211 - time (sec): 7.84 - samples/sec: 3347.37 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:30:31,620 epoch 2 - iter 203/292 - loss 0.16469283 - time (sec): 9.09 - samples/sec: 3342.18 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:30:32,877 epoch 2 - iter 232/292 - loss 0.16617154 - time (sec): 10.35 - samples/sec: 3360.39 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:30:34,211 epoch 2 - iter 261/292 - loss 0.16625252 - time (sec): 11.68 - samples/sec: 3367.96 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:30:35,519 epoch 2 - iter 290/292 - loss 0.16032533 - time (sec): 12.99 - samples/sec: 3391.30 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:30:35,604 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:35,604 EPOCH 2 done: loss 0.1595 - lr: 0.000027
2023-10-25 21:30:36,510 DEV : loss 0.12749069929122925 - f1-score (micro avg) 0.6216
2023-10-25 21:30:36,514 saving best model
2023-10-25 21:30:37,133 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:38,576 epoch 3 - iter 29/292 - loss 0.09022188 - time (sec): 1.44 - samples/sec: 4060.31 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:30:39,893 epoch 3 - iter 58/292 - loss 0.10087929 - time (sec): 2.76 - samples/sec: 3781.78 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:30:41,160 epoch 3 - iter 87/292 - loss 0.10208490 - time (sec): 4.03 - samples/sec: 3635.44 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:30:42,449 epoch 3 - iter 116/292 - loss 0.10105352 - time (sec): 5.31 - samples/sec: 3517.62 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:30:43,751 epoch 3 - iter 145/292 - loss 0.10094429 - time (sec): 6.62 - samples/sec: 3495.48 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:30:45,016 epoch 3 - iter 174/292 - loss 0.09645934 - time (sec): 7.88 - samples/sec: 3441.59 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:30:46,488 epoch 3 - iter 203/292 - loss 0.09398874 - time (sec): 9.35 - samples/sec: 3382.73 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:30:47,737 epoch 3 - iter 232/292 - loss 0.09314091 - time (sec): 10.60 - samples/sec: 3308.62 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:30:49,070 epoch 3 - iter 261/292 - loss 0.09176423 - time (sec): 11.93 - samples/sec: 3358.99 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:30:50,363 epoch 3 - iter 290/292 - loss 0.09054539 - time (sec): 13.23 - samples/sec: 3342.94 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:30:50,450 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:50,450 EPOCH 3 done: loss 0.0913 - lr: 0.000023
2023-10-25 21:30:51,361 DEV : loss 0.11841346323490143 - f1-score (micro avg) 0.7118
2023-10-25 21:30:51,365 saving best model
2023-10-25 21:30:52,005 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:53,266 epoch 4 - iter 29/292 - loss 0.06571735 - time (sec): 1.26 - samples/sec: 3370.24 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:30:54,598 epoch 4 - iter 58/292 - loss 0.05780459 - time (sec): 2.59 - samples/sec: 3250.20 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:30:55,965 epoch 4 - iter 87/292 - loss 0.05216866 - time (sec): 3.96 - samples/sec: 3297.12 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:30:57,257 epoch 4 - iter 116/292 - loss 0.04928450 - time (sec): 5.25 - samples/sec: 3238.11 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:30:58,611 epoch 4 - iter 145/292 - loss 0.05174887 - time (sec): 6.60 - samples/sec: 3421.02 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:30:59,914 epoch 4 - iter 174/292 - loss 0.05349774 - time (sec): 7.91 - samples/sec: 3465.27 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:31:01,321 epoch 4 - iter 203/292 - loss 0.05412245 - time (sec): 9.31 - samples/sec: 3411.38 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:31:02,580 epoch 4 - iter 232/292 - loss 0.05878785 - time (sec): 10.57 - samples/sec: 3383.60 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:31:03,967 epoch 4 - iter 261/292 - loss 0.06009869 - time (sec): 11.96 - samples/sec: 3381.49 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:31:05,232 epoch 4 - iter 290/292 - loss 0.05847457 - time (sec): 13.22 - samples/sec: 3342.38 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:31:05,312 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:05,312 EPOCH 4 done: loss 0.0583 - lr: 0.000020
2023-10-25 21:31:06,226 DEV : loss 0.13698235154151917 - f1-score (micro avg) 0.7329
2023-10-25 21:31:06,231 saving best model
2023-10-25 21:31:06,849 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:08,165 epoch 5 - iter 29/292 - loss 0.03972724 - time (sec): 1.31 - samples/sec: 3464.93 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:31:09,448 epoch 5 - iter 58/292 - loss 0.03798626 - time (sec): 2.60 - samples/sec: 3365.39 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:31:10,783 epoch 5 - iter 87/292 - loss 0.03555398 - time (sec): 3.93 - samples/sec: 3314.51 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:31:12,032 epoch 5 - iter 116/292 - loss 0.03214301 - time (sec): 5.18 - samples/sec: 3360.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:31:13,311 epoch 5 - iter 145/292 - loss 0.03292483 - time (sec): 6.46 - samples/sec: 3385.15 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:31:14,600 epoch 5 - iter 174/292 - loss 0.03681121 - time (sec): 7.75 - samples/sec: 3329.04 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:31:15,983 epoch 5 - iter 203/292 - loss 0.03818340 - time (sec): 9.13 - samples/sec: 3327.67 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:31:17,283 epoch 5 - iter 232/292 - loss 0.04048615 - time (sec): 10.43 - samples/sec: 3412.34 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:31:18,581 epoch 5 - iter 261/292 - loss 0.04033068 - time (sec): 11.73 - samples/sec: 3414.90 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:31:19,822 epoch 5 - iter 290/292 - loss 0.03959189 - time (sec): 12.97 - samples/sec: 3415.07 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:31:19,897 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:19,897 EPOCH 5 done: loss 0.0395 - lr: 0.000017
2023-10-25 21:31:20,805 DEV : loss 0.14024661481380463 - f1-score (micro avg) 0.7364
2023-10-25 21:31:20,809 saving best model
2023-10-25 21:31:21,435 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:22,752 epoch 6 - iter 29/292 - loss 0.02545293 - time (sec): 1.31 - samples/sec: 3771.70 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:31:24,050 epoch 6 - iter 58/292 - loss 0.03242435 - time (sec): 2.61 - samples/sec: 3433.27 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:31:25,321 epoch 6 - iter 87/292 - loss 0.02375022 - time (sec): 3.88 - samples/sec: 3488.35 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:31:26,643 epoch 6 - iter 116/292 - loss 0.03110057 - time (sec): 5.20 - samples/sec: 3502.60 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:31:27,915 epoch 6 - iter 145/292 - loss 0.03115742 - time (sec): 6.48 - samples/sec: 3503.77 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:31:29,157 epoch 6 - iter 174/292 - loss 0.03002770 - time (sec): 7.72 - samples/sec: 3510.24 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:31:30,359 epoch 6 - iter 203/292 - loss 0.02977686 - time (sec): 8.92 - samples/sec: 3488.79 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:31:31,596 epoch 6 - iter 232/292 - loss 0.02910083 - time (sec): 10.16 - samples/sec: 3471.29 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:31:32,922 epoch 6 - iter 261/292 - loss 0.02814018 - time (sec): 11.48 - samples/sec: 3472.71 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:31:34,133 epoch 6 - iter 290/292 - loss 0.02750245 - time (sec): 12.69 - samples/sec: 3462.82 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:31:34,216 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:34,216 EPOCH 6 done: loss 0.0275 - lr: 0.000013
2023-10-25 21:31:35,136 DEV : loss 0.1627625823020935 - f1-score (micro avg) 0.7527
2023-10-25 21:31:35,140 saving best model
2023-10-25 21:31:35,751 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:36,998 epoch 7 - iter 29/292 - loss 0.02550245 - time (sec): 1.24 - samples/sec: 3912.23 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:31:38,272 epoch 7 - iter 58/292 - loss 0.03222503 - time (sec): 2.52 - samples/sec: 3887.60 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:31:39,460 epoch 7 - iter 87/292 - loss 0.03168046 - time (sec): 3.70 - samples/sec: 3733.03 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:31:40,677 epoch 7 - iter 116/292 - loss 0.02908034 - time (sec): 4.92 - samples/sec: 3612.53 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:31:41,881 epoch 7 - iter 145/292 - loss 0.02514753 - time (sec): 6.13 - samples/sec: 3542.89 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:31:43,238 epoch 7 - iter 174/292 - loss 0.02468695 - time (sec): 7.48 - samples/sec: 3553.75 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:31:44,537 epoch 7 - iter 203/292 - loss 0.02340193 - time (sec): 8.78 - samples/sec: 3551.40 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:31:45,809 epoch 7 - iter 232/292 - loss 0.02228458 - time (sec): 10.05 - samples/sec: 3510.36 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:31:47,167 epoch 7 - iter 261/292 - loss 0.02091578 - time (sec): 11.41 - samples/sec: 3473.34 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:31:48,521 epoch 7 - iter 290/292 - loss 0.02054549 - time (sec): 12.77 - samples/sec: 3470.06 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:31:48,612 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:48,612 EPOCH 7 done: loss 0.0205 - lr: 0.000010
2023-10-25 21:31:49,536 DEV : loss 0.18406300246715546 - f1-score (micro avg) 0.7638
2023-10-25 21:31:49,540 saving best model
2023-10-25 21:31:50,152 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:51,534 epoch 8 - iter 29/292 - loss 0.01278755 - time (sec): 1.38 - samples/sec: 3174.63 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:31:52,890 epoch 8 - iter 58/292 - loss 0.01502218 - time (sec): 2.73 - samples/sec: 3230.30 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:31:54,205 epoch 8 - iter 87/292 - loss 0.01162352 - time (sec): 4.05 - samples/sec: 3323.36 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:31:55,489 epoch 8 - iter 116/292 - loss 0.01556603 - time (sec): 5.33 - samples/sec: 3302.90 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:31:56,784 epoch 8 - iter 145/292 - loss 0.01579903 - time (sec): 6.63 - samples/sec: 3285.94 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:31:58,212 epoch 8 - iter 174/292 - loss 0.01669043 - time (sec): 8.06 - samples/sec: 3226.30 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:31:59,482 epoch 8 - iter 203/292 - loss 0.01521658 - time (sec): 9.33 - samples/sec: 3184.00 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:32:00,800 epoch 8 - iter 232/292 - loss 0.01518710 - time (sec): 10.64 - samples/sec: 3210.18 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:32:02,092 epoch 8 - iter 261/292 - loss 0.01451772 - time (sec): 11.94 - samples/sec: 3264.25 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:32:03,464 epoch 8 - iter 290/292 - loss 0.01550935 - time (sec): 13.31 - samples/sec: 3323.97 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:32:03,551 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:03,552 EPOCH 8 done: loss 0.0156 - lr: 0.000007
2023-10-25 21:32:04,457 DEV : loss 0.18133316934108734 - f1-score (micro avg) 0.7489
2023-10-25 21:32:04,462 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:05,828 epoch 9 - iter 29/292 - loss 0.00469478 - time (sec): 1.36 - samples/sec: 3658.24 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:32:07,078 epoch 9 - iter 58/292 - loss 0.00939010 - time (sec): 2.61 - samples/sec: 3544.28 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:32:08,353 epoch 9 - iter 87/292 - loss 0.00915593 - time (sec): 3.89 - samples/sec: 3555.78 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:32:09,712 epoch 9 - iter 116/292 - loss 0.01207961 - time (sec): 5.25 - samples/sec: 3529.11 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:32:11,030 epoch 9 - iter 145/292 - loss 0.01222517 - time (sec): 6.57 - samples/sec: 3493.57 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:32:12,307 epoch 9 - iter 174/292 - loss 0.01195187 - time (sec): 7.84 - samples/sec: 3478.22 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:32:13,584 epoch 9 - iter 203/292 - loss 0.01115930 - time (sec): 9.12 - samples/sec: 3479.08 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:32:14,839 epoch 9 - iter 232/292 - loss 0.01057009 - time (sec): 10.38 - samples/sec: 3434.42 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:32:16,163 epoch 9 - iter 261/292 - loss 0.01094290 - time (sec): 11.70 - samples/sec: 3384.80 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:32:17,540 epoch 9 - iter 290/292 - loss 0.01034591 - time (sec): 13.08 - samples/sec: 3376.60 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:32:17,632 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:17,632 EPOCH 9 done: loss 0.0103 - lr: 0.000003
2023-10-25 21:32:18,549 DEV : loss 0.1857309192419052 - f1-score (micro avg) 0.7461
2023-10-25 21:32:18,553 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:19,857 epoch 10 - iter 29/292 - loss 0.00833025 - time (sec): 1.30 - samples/sec: 3353.71 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:32:21,159 epoch 10 - iter 58/292 - loss 0.00620856 - time (sec): 2.60 - samples/sec: 3160.18 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:32:22,448 epoch 10 - iter 87/292 - loss 0.01250990 - time (sec): 3.89 - samples/sec: 3177.17 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:32:23,642 epoch 10 - iter 116/292 - loss 0.01078091 - time (sec): 5.09 - samples/sec: 3278.69 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:32:24,940 epoch 10 - iter 145/292 - loss 0.00880841 - time (sec): 6.39 - samples/sec: 3371.71 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:32:26,129 epoch 10 - iter 174/292 - loss 0.00884157 - time (sec): 7.57 - samples/sec: 3418.33 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:32:27,383 epoch 10 - iter 203/292 - loss 0.00866918 - time (sec): 8.83 - samples/sec: 3498.88 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:32:28,613 epoch 10 - iter 232/292 - loss 0.00812226 - time (sec): 10.06 - samples/sec: 3493.63 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:32:29,859 epoch 10 - iter 261/292 - loss 0.00777060 - time (sec): 11.30 - samples/sec: 3510.22 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:32:31,235 epoch 10 - iter 290/292 - loss 0.00767455 - time (sec): 12.68 - samples/sec: 3490.79 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:32:31,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:31,323 EPOCH 10 done: loss 0.0084 - lr: 0.000000
2023-10-25 21:32:32,251 DEV : loss 0.1949864774942398 - f1-score (micro avg) 0.7571
2023-10-25 21:32:32,724 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:32,725 Loading model from best epoch ...
2023-10-25 21:32:34,339 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 21:32:35,859
Results:
- F-score (micro) 0.7648
- F-score (macro) 0.6948
- Accuracy 0.6424

By class:
              precision    recall  f1-score   support

         PER     0.8187    0.8305    0.8245       348
         LOC     0.6759    0.8391    0.7487       261
         ORG     0.4583    0.4231    0.4400        52
   HumanProd     0.7200    0.8182    0.7660        22

   micro avg     0.7307    0.8023    0.7648       683
   macro avg     0.6682    0.7277    0.6948       683
weighted avg     0.7335    0.8023    0.7644       683

2023-10-25 21:32:35,859 ----------------------------------------------------------------------------------------------------