2023-10-25 21:02:46,321 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,322 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:02:46,322 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,322 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:02:46,322 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,322 Train:  1166 sentences
2023-10-25 21:02:46,322         (train_with_dev=False, train_with_test=False)
2023-10-25 21:02:46,322 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,322 Training Params:
2023-10-25 21:02:46,322  - learning_rate: "5e-05"
2023-10-25 21:02:46,322  - mini_batch_size: "4"
2023-10-25 21:02:46,322  - max_epochs: "10"
2023-10-25 21:02:46,322  - shuffle: "True"
2023-10-25 21:02:46,322 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Plugins:
2023-10-25 21:02:46,323  - TensorboardLogger
2023-10-25 21:02:46,323  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:02:46,323  - metric: "('micro avg', 'f1-score')"
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Computation:
2023-10-25 21:02:46,323  - compute on device: cuda:0
2023-10-25 21:02:46,323  - embedding storage: none
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:02:47,568 epoch 1 - iter 29/292 - loss 2.70493946 - time (sec): 1.24 - samples/sec: 3001.90 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:02:48,960 epoch 1 - iter 58/292 - loss 1.64294689 - time (sec): 2.64 - samples/sec: 3433.28 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:02:50,359 epoch 1 - iter 87/292 - loss 1.24359019 - time (sec): 4.03 - samples/sec: 3589.43 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:02:51,628 epoch 1 - iter 116/292 - loss 1.05004462 - time (sec): 5.30 - samples/sec: 3552.13 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:02:52,874 epoch 1 - iter 145/292 - loss 0.91786982 - time (sec): 6.55 - samples/sec: 3520.91 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:02:54,229 epoch 1 - iter 174/292 - loss 0.80347598 - time (sec): 7.91 - samples/sec: 3546.58 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:02:55,489 epoch 1 - iter 203/292 - loss 0.74188701 - time (sec): 9.17 - samples/sec: 3431.97 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:02:56,762 epoch 1 - iter 232/292 - loss 0.68879598 - time (sec): 10.44 - samples/sec: 3393.36 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:02:57,981 epoch 1 - iter 261/292 - loss 0.65138888 - time (sec): 11.66 - samples/sec: 3344.55 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:02:59,311 epoch 1 - iter 290/292 - loss 0.60276932 - time (sec): 12.99 - samples/sec: 3401.64 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:02:59,388 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:59,388 EPOCH 1 done: loss 0.5995 - lr: 0.000049
2023-10-25 21:03:00,042 DEV : loss 0.15543945133686066 - f1-score (micro avg)  0.5759
2023-10-25 21:03:00,046 saving best model
2023-10-25 21:03:00,575 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:01,799 epoch 2 - iter 29/292 - loss 0.16952144 - time (sec): 1.22 - samples/sec: 3371.46 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:03:03,040 epoch 2 - iter 58/292 - loss 0.18657242 - time (sec): 2.46 - samples/sec: 3294.57 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:03:04,352 epoch 2 - iter 87/292 - loss 0.16106239 - time (sec): 3.78 - samples/sec: 3300.13 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:03:05,608 epoch 2 - iter 116/292 - loss 0.15644988 - time (sec): 5.03 - samples/sec: 3352.82 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:03:06,836 epoch 2 - iter 145/292 - loss 0.15373069 - time (sec): 6.26 - samples/sec: 3411.63 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:03:08,014 epoch 2 - iter 174/292 - loss 0.16590765 - time (sec): 7.44 - samples/sec: 3373.45 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:03:09,249 epoch 2 - iter 203/292 - loss 0.17045348 - time (sec): 8.67 - samples/sec: 3377.62 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:03:10,551 epoch 2 - iter 232/292 - loss 0.17007656 - time (sec): 9.97 - samples/sec: 3343.68 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:03:11,963 epoch 2 - iter 261/292 - loss 0.16415110 - time (sec): 11.39 - samples/sec: 3441.55 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:03:13,297 epoch 2 - iter 290/292 - loss 0.15537644 - time (sec): 12.72 - samples/sec: 3482.88 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:03:13,376 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:13,376 EPOCH 2 done: loss 0.1553 - lr: 0.000045
2023-10-25 21:03:14,282 DEV : loss 0.14280402660369873 - f1-score (micro avg)  0.7113
2023-10-25 21:03:14,287 saving best model
2023-10-25 21:03:14,953 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:16,338 epoch 3 - iter 29/292 - loss 0.12799538 - time (sec): 1.38 - samples/sec: 3483.45 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:03:17,670 epoch 3 - iter 58/292 - loss 0.09862348 - time (sec): 2.71 - samples/sec: 3517.39 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:03:18,972 epoch 3 - iter 87/292 - loss 0.08769634 - time (sec): 4.02 - samples/sec: 3425.70 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:03:20,258 epoch 3 - iter 116/292 - loss 0.08875247 - time (sec): 5.30 - samples/sec: 3388.13 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:03:21,512 epoch 3 - iter 145/292 - loss 0.08886591 - time (sec): 6.56 - samples/sec: 3325.24 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:03:22,754 epoch 3 - iter 174/292 - loss 0.09139678 - time (sec): 7.80 - samples/sec: 3283.17 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:03:24,065 epoch 3 - iter 203/292 - loss 0.08933904 - time (sec): 9.11 - samples/sec: 3332.68 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:03:25,360 epoch 3 - iter 232/292 - loss 0.08802096 - time (sec): 10.40 - samples/sec: 3352.74 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:03:26,654 epoch 3 - iter 261/292 - loss 0.09013801 - time (sec): 11.70 - samples/sec: 3350.70 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:03:28,018 epoch 3 - iter 290/292 - loss 0.08726593 - time (sec): 13.06 - samples/sec: 3372.98 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:03:28,107 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:28,108 EPOCH 3 done: loss 0.0879 - lr: 0.000039
2023-10-25 21:03:29,012 DEV : loss 0.16181543469429016 - f1-score (micro avg)  0.7412
2023-10-25 21:03:29,017 saving best model
2023-10-25 21:03:29,682 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:31,011 epoch 4 - iter 29/292 - loss 0.08251774 - time (sec): 1.33 - samples/sec: 3144.69 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:03:32,296 epoch 4 - iter 58/292 - loss 0.06039563 - time (sec): 2.61 - samples/sec: 3480.66 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:03:33,577 epoch 4 - iter 87/292 - loss 0.05907690 - time (sec): 3.89 - samples/sec: 3336.97 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:03:34,854 epoch 4 - iter 116/292 - loss 0.05470702 - time (sec): 5.17 - samples/sec: 3343.45 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:03:36,253 epoch 4 - iter 145/292 - loss 0.05698655 - time (sec): 6.57 - samples/sec: 3300.17 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:03:37,613 epoch 4 - iter 174/292 - loss 0.05507272 - time (sec): 7.93 - samples/sec: 3315.00 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:03:38,875 epoch 4 - iter 203/292 - loss 0.06136269 - time (sec): 9.19 - samples/sec: 3312.60 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:03:40,241 epoch 4 - iter 232/292 - loss 0.06014642 - time (sec): 10.56 - samples/sec: 3353.87 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:03:41,520 epoch 4 - iter 261/292 - loss 0.05735605 - time (sec): 11.84 - samples/sec: 3372.49 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:03:42,826 epoch 4 - iter 290/292 - loss 0.05824407 - time (sec): 13.14 - samples/sec: 3365.23 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:03:42,909 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:42,909 EPOCH 4 done: loss 0.0580 - lr: 0.000033
2023-10-25 21:03:43,968 DEV : loss 0.1569732278585434 - f1-score (micro avg)  0.7606
2023-10-25 21:03:43,972 saving best model
2023-10-25 21:03:44,658 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:45,907 epoch 5 - iter 29/292 - loss 0.02448988 - time (sec): 1.25 - samples/sec: 2965.12 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:03:47,242 epoch 5 - iter 58/292 - loss 0.03323303 - time (sec): 2.58 - samples/sec: 3230.68 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:03:48,588 epoch 5 - iter 87/292 - loss 0.03405391 - time (sec): 3.93 - samples/sec: 3407.07 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:03:49,819 epoch 5 - iter 116/292 - loss 0.03187030 - time (sec): 5.16 - samples/sec: 3408.51 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:03:51,096 epoch 5 - iter 145/292 - loss 0.03696710 - time (sec): 6.43 - samples/sec: 3345.01 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:03:52,458 epoch 5 - iter 174/292 - loss 0.03833785 - time (sec): 7.80 - samples/sec: 3310.13 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:03:53,820 epoch 5 - iter 203/292 - loss 0.04040647 - time (sec): 9.16 - samples/sec: 3370.90 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:03:55,148 epoch 5 - iter 232/292 - loss 0.03785862 - time (sec): 10.49 - samples/sec: 3380.50 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:03:56,505 epoch 5 - iter 261/292 - loss 0.03665954 - time (sec): 11.84 - samples/sec: 3356.28 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:03:57,830 epoch 5 - iter 290/292 - loss 0.03585809 - time (sec): 13.17 - samples/sec: 3363.05 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:03:57,905 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:57,905 EPOCH 5 done: loss 0.0357 - lr: 0.000028
2023-10-25 21:03:58,815 DEV : loss 0.14395155012607574 - f1-score (micro avg)  0.7265
2023-10-25 21:03:58,819 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:00,208 epoch 6 - iter 29/292 - loss 0.02165138 - time (sec): 1.39 - samples/sec: 3839.74 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:04:01,519 epoch 6 - iter 58/292 - loss 0.01941242 - time (sec): 2.70 - samples/sec: 3485.44 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:04:02,781 epoch 6 - iter 87/292 - loss 0.02557862 - time (sec): 3.96 - samples/sec: 3472.87 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:04:04,147 epoch 6 - iter 116/292 - loss 0.02482614 - time (sec): 5.33 - samples/sec: 3550.68 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:04:05,427 epoch 6 - iter 145/292 - loss 0.02631017 - time (sec): 6.61 - samples/sec: 3457.32 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:04:06,786 epoch 6 - iter 174/292 - loss 0.02314945 - time (sec): 7.97 - samples/sec: 3450.43 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:04:08,068 epoch 6 - iter 203/292 - loss 0.02313133 - time (sec): 9.25 - samples/sec: 3440.85 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:04:09,331 epoch 6 - iter 232/292 - loss 0.02352516 - time (sec): 10.51 - samples/sec: 3447.74 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:04:10,604 epoch 6 - iter 261/292 - loss 0.02453468 - time (sec): 11.78 - samples/sec: 3461.04 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:04:11,847 epoch 6 - iter 290/292 - loss 0.02499737 - time (sec): 13.03 - samples/sec: 3387.70 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:04:11,936 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:11,936 EPOCH 6 done: loss 0.0251 - lr: 0.000022
2023-10-25 21:04:12,842 DEV : loss 0.17060092091560364 - f1-score (micro avg)  0.7447
2023-10-25 21:04:12,846 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:14,184 epoch 7 - iter 29/292 - loss 0.03038679 - time (sec): 1.34 - samples/sec: 3200.51 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:04:15,504 epoch 7 - iter 58/292 - loss 0.02199783 - time (sec): 2.66 - samples/sec: 3269.59 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:04:16,943 epoch 7 - iter 87/292 - loss 0.02107799 - time (sec): 4.10 - samples/sec: 3377.74 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:04:18,274 epoch 7 - iter 116/292 - loss 0.01988430 - time (sec): 5.43 - samples/sec: 3358.94 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:04:19,515 epoch 7 - iter 145/292 - loss 0.02054682 - time (sec): 6.67 - samples/sec: 3387.20 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:04:20,827 epoch 7 - iter 174/292 - loss 0.01902887 - time (sec): 7.98 - samples/sec: 3402.18 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:04:22,089 epoch 7 - iter 203/292 - loss 0.01967890 - time (sec): 9.24 - samples/sec: 3381.86 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:04:23,370 epoch 7 - iter 232/292 - loss 0.02047310 - time (sec): 10.52 - samples/sec: 3353.31 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:04:24,632 epoch 7 - iter 261/292 - loss 0.02022304 - time (sec): 11.78 - samples/sec: 3367.80 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:04:25,851 epoch 7 - iter 290/292 - loss 0.01979598 - time (sec): 13.00 - samples/sec: 3390.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:04:25,946 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:25,946 EPOCH 7 done: loss 0.0196 - lr: 0.000017
2023-10-25 21:04:26,861 DEV : loss 0.1871466189622879 - f1-score (micro avg)  0.7439
2023-10-25 21:04:26,866 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:28,143 epoch 8 - iter 29/292 - loss 0.00718740 - time (sec): 1.28 - samples/sec: 3597.72 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:04:29,467 epoch 8 - iter 58/292 - loss 0.01503456 - time (sec): 2.60 - samples/sec: 3365.84 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:04:30,790 epoch 8 - iter 87/292 - loss 0.01557495 - time (sec): 3.92 - samples/sec: 3190.64 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:04:32,093 epoch 8 - iter 116/292 - loss 0.01500266 - time (sec): 5.23 - samples/sec: 3234.34 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:04:33,336 epoch 8 - iter 145/292 - loss 0.01724417 - time (sec): 6.47 - samples/sec: 3304.72 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:04:34,647 epoch 8 - iter 174/292 - loss 0.01760521 - time (sec): 7.78 - samples/sec: 3357.34 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:04:35,925 epoch 8 - iter 203/292 - loss 0.01743967 - time (sec): 9.06 - samples/sec: 3381.28 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:04:37,315 epoch 8 - iter 232/292 - loss 0.01549544 - time (sec): 10.45 - samples/sec: 3403.73 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:04:38,612 epoch 8 - iter 261/292 - loss 0.01484755 - time (sec): 11.74 - samples/sec: 3405.07 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:04:39,938 epoch 8 - iter 290/292 - loss 0.01565987 - time (sec): 13.07 - samples/sec: 3394.16 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:04:40,013 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:40,013 EPOCH 8 done: loss 0.0159 - lr: 0.000011
2023-10-25 21:04:41,089 DEV : loss 0.2016552984714508 - f1-score (micro avg)  0.7642
2023-10-25 21:04:41,094 saving best model
2023-10-25 21:04:41,658 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:42,938 epoch 9 - iter 29/292 - loss 0.00883147 - time (sec): 1.28 - samples/sec: 3170.38 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:04:44,271 epoch 9 - iter 58/292 - loss 0.01566657 - time (sec): 2.61 - samples/sec: 3276.96 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:04:45,628 epoch 9 - iter 87/292 - loss 0.01114646 - time (sec): 3.97 - samples/sec: 3248.19 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:04:46,926 epoch 9 - iter 116/292 - loss 0.00886227 - time (sec): 5.27 - samples/sec: 3159.36 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:04:48,286 epoch 9 - iter 145/292 - loss 0.00720736 - time (sec): 6.63 - samples/sec: 3205.00 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:04:49,636 epoch 9 - iter 174/292 - loss 0.00633132 - time (sec): 7.98 - samples/sec: 3302.03 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:04:50,954 epoch 9 - iter 203/292 - loss 0.00655342 - time (sec): 9.29 - samples/sec: 3292.40 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:04:52,300 epoch 9 - iter 232/292 - loss 0.00853524 - time (sec): 10.64 - samples/sec: 3304.99 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:04:53,593 epoch 9 - iter 261/292 - loss 0.00803946 - time (sec): 11.93 - samples/sec: 3297.35 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:04:54,911 epoch 9 - iter 290/292 - loss 0.00829946 - time (sec): 13.25 - samples/sec: 3327.92 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:04:54,992 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:54,992 EPOCH 9 done: loss 0.0082 - lr: 0.000006
2023-10-25 21:04:55,903 DEV : loss 0.2061367630958557 - f1-score (micro avg)  0.7543
2023-10-25 21:04:55,908 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:57,187 epoch 10 - iter 29/292 - loss 0.00318513 - time (sec): 1.28 - samples/sec: 3035.16 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:04:58,468 epoch 10 - iter 58/292 - loss 0.00296072 - time (sec): 2.56 - samples/sec: 3051.77 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:04:59,815 epoch 10 - iter 87/292 - loss 0.00649624 - time (sec): 3.91 - samples/sec: 3244.54 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:05:01,138 epoch 10 - iter 116/292 - loss 0.00570956 - time (sec): 5.23 - samples/sec: 3328.03 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:05:02,456 epoch 10 - iter 145/292 - loss 0.00505520 - time (sec): 6.55 - samples/sec: 3425.62 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:05:03,752 epoch 10 - iter 174/292 - loss 0.00507932 - time (sec): 7.84 - samples/sec: 3488.66 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:05:05,062 epoch 10 - iter 203/292 - loss 0.00532013 - time (sec): 9.15 - samples/sec: 3509.91 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:05:06,295 epoch 10 - iter 232/292 - loss 0.00556513 - time (sec): 10.39 - samples/sec: 3461.67 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:05:07,561 epoch 10 - iter 261/292 - loss 0.00596959 - time (sec): 11.65 - samples/sec: 3425.39 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:05:08,803 epoch 10 - iter 290/292 - loss 0.00553911 - time (sec): 12.89 - samples/sec: 3424.82 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:05:08,887 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:08,887 EPOCH 10 done: loss 0.0058 - lr: 0.000000
2023-10-25 21:05:09,796 DEV : loss 0.20963872969150543 - f1-score (micro avg)  0.7366
2023-10-25 21:05:10,330 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:10,332 Loading model from best epoch ...
2023-10-25 21:05:12,043 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 21:05:13,585 Results:
- F-score (micro) 0.7502
- F-score (macro) 0.6592
- Accuracy 0.6229

By class:
              precision    recall  f1-score   support

         PER     0.7754    0.8333    0.8033       348
         LOC     0.6889    0.8314    0.7535       261
         ORG     0.3962    0.4038    0.4000        52
   HumanProd     0.6071    0.7727    0.6800        22

   micro avg     0.7078    0.7980    0.7502       683
   macro avg     0.6169    0.7103    0.6592       683
weighted avg     0.7081    0.7980    0.7496       683

2023-10-25 21:05:13,585 ----------------------------------------------------------------------------------------------------