2023-10-17 17:44:25,631 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
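The block above is Flair's printout of a SequenceTagger whose word embeddings come from an ELECTRA-style discriminator. As a rough sketch (not taken from the actual training script), the embedding layer could be set up as below; the checkpoint name, last-layer selection and first-subtoken pooling are inferred from the base path logged further down and should be treated as assumptions.

```python
from flair.embeddings import TransformerWordEmbeddings

# Checkpoint name inferred from the logged base path
# ("hmteams/teams-base-historic-multilingual-discriminator");
# "layers-1" and "poolingfirst" in that path suggest last-layer
# embeddings with first-subtoken pooling.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)
```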
2023-10-17 17:44:25,632 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
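The training data is the Finnish NewsEye part of HIPE-2022, read with document separators ("with_doc_seperator" in the path above). A minimal loading sketch using Flair's HIPE-2022 loader follows; the exact argument names are assumptions where the log does not spell them out.

```python
from flair.datasets import NER_HIPE_2022

# Finnish NewsEye split of HIPE-2022 (v2.1); the document-separator option
# is assumed from the "with_doc_seperator" directory in the logged path.
corpus = NER_HIPE_2022(
    dataset_name="newseye",
    language="fi",
    add_document_separator=True,
)
print(corpus)  # expected: 1166 train + 165 dev + 415 test sentences

# 17-entry label dictionary: O plus BIOES tags for LOC, PER, ORG, HumanProd
tag_dictionary = corpus.make_label_dictionary(label_type="ner")
```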
2023-10-17 17:44:25,632 Train: 1166 sentences
2023-10-17 17:44:25,632 (train_with_dev=False, train_with_test=False)
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Training Params:
2023-10-17 17:44:25,632 - learning_rate: "5e-05"
2023-10-17 17:44:25,632 - mini_batch_size: "4"
2023-10-17 17:44:25,632 - max_epochs: "10"
2023-10-17 17:44:25,632 - shuffle: "True"
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Plugins:
2023-10-17 17:44:25,632 - TensorboardLogger
2023-10-17 17:44:25,632 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
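The hyperparameters and plugins above map onto Flair's fine-tuning entry point. Continuing the sketches above, a hedged approximation of the trainer call is shown below; the plugins argument and the TensorboardLogger import path may differ between Flair versions, and fine_tune's default linear schedule with warmup_fraction 0.1 corresponds to the LinearScheduler plugin in the log.

```python
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer
from flair.trainers.plugins import TensorboardLogger

# Plain linear tag head over the 17 labels, no CRF and no RNN
# ("crfFalse" in the base path; the model printout shows only a Linear layer).
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2",
    learning_rate=5e-5,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    plugins=[TensorboardLogger()],  # LinearScheduler (warmup 0.1) is fine_tune's default
)
```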
2023-10-17 17:44:25,632 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:44:25,632 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Computation:
2023-10-17 17:44:25,633 - compute on device: cuda:0
2023-10-17 17:44:25,633 - embedding storage: none
2023-10-17 17:44:25,633 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,633 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 17:44:25,633 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,633 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,633 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:44:27,412 epoch 1 - iter 29/292 - loss 3.57048903 - time (sec): 1.78 - samples/sec: 2510.89 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:44:29,115 epoch 1 - iter 58/292 - loss 2.69134226 - time (sec): 3.48 - samples/sec: 2638.12 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:44:30,931 epoch 1 - iter 87/292 - loss 2.00635595 - time (sec): 5.30 - samples/sec: 2531.16 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:44:32,440 epoch 1 - iter 116/292 - loss 1.67505873 - time (sec): 6.81 - samples/sec: 2517.67 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:44:33,997 epoch 1 - iter 145/292 - loss 1.44361519 - time (sec): 8.36 - samples/sec: 2543.88 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:44:35,582 epoch 1 - iter 174/292 - loss 1.26249810 - time (sec): 9.95 - samples/sec: 2555.63 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:44:37,185 epoch 1 - iter 203/292 - loss 1.12316057 - time (sec): 11.55 - samples/sec: 2573.81 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:44:38,686 epoch 1 - iter 232/292 - loss 1.03035540 - time (sec): 13.05 - samples/sec: 2580.37 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:44:40,348 epoch 1 - iter 261/292 - loss 0.93353604 - time (sec): 14.71 - samples/sec: 2610.98 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:44:42,337 epoch 1 - iter 290/292 - loss 0.85173568 - time (sec): 16.70 - samples/sec: 2648.79 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:44:42,438 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:42,438 EPOCH 1 done: loss 0.8497 - lr: 0.000049
2023-10-17 17:44:43,302 DEV : loss 0.15992386639118195 - f1-score (micro avg) 0.525
2023-10-17 17:44:43,310 saving best model
2023-10-17 17:44:43,721 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:45,590 epoch 2 - iter 29/292 - loss 0.25767213 - time (sec): 1.87 - samples/sec: 2697.46 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:44:47,114 epoch 2 - iter 58/292 - loss 0.25456634 - time (sec): 3.39 - samples/sec: 2571.19 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:44:48,626 epoch 2 - iter 87/292 - loss 0.24045766 - time (sec): 4.90 - samples/sec: 2556.28 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:44:50,105 epoch 2 - iter 116/292 - loss 0.22630269 - time (sec): 6.38 - samples/sec: 2537.48 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:44:51,910 epoch 2 - iter 145/292 - loss 0.22581740 - time (sec): 8.19 - samples/sec: 2618.08 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:44:53,578 epoch 2 - iter 174/292 - loss 0.21214049 - time (sec): 9.85 - samples/sec: 2567.67 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:44:55,252 epoch 2 - iter 203/292 - loss 0.20001713 - time (sec): 11.53 - samples/sec: 2563.49 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:44:56,845 epoch 2 - iter 232/292 - loss 0.19935194 - time (sec): 13.12 - samples/sec: 2570.87 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:44:58,631 epoch 2 - iter 261/292 - loss 0.18698302 - time (sec): 14.91 - samples/sec: 2621.32 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:45:00,465 epoch 2 - iter 290/292 - loss 0.18609856 - time (sec): 16.74 - samples/sec: 2644.64 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:45:00,549 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:00,549 EPOCH 2 done: loss 0.1855 - lr: 0.000045
2023-10-17 17:45:02,044 DEV : loss 0.1280839890241623 - f1-score (micro avg) 0.6897
2023-10-17 17:45:02,049 saving best model
2023-10-17 17:45:02,529 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:04,015 epoch 3 - iter 29/292 - loss 0.11027878 - time (sec): 1.48 - samples/sec: 2381.84 - lr: 0.000044 - momentum: 0.000000
2023-10-17 17:45:05,837 epoch 3 - iter 58/292 - loss 0.10728336 - time (sec): 3.31 - samples/sec: 2645.70 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:45:07,539 epoch 3 - iter 87/292 - loss 0.09859684 - time (sec): 5.01 - samples/sec: 2698.14 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:45:09,206 epoch 3 - iter 116/292 - loss 0.09629751 - time (sec): 6.67 - samples/sec: 2670.46 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:45:10,833 epoch 3 - iter 145/292 - loss 0.10399359 - time (sec): 8.30 - samples/sec: 2627.51 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:45:12,436 epoch 3 - iter 174/292 - loss 0.10387591 - time (sec): 9.90 - samples/sec: 2632.99 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:45:14,016 epoch 3 - iter 203/292 - loss 0.10831849 - time (sec): 11.48 - samples/sec: 2604.01 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:45:15,776 epoch 3 - iter 232/292 - loss 0.10540846 - time (sec): 13.24 - samples/sec: 2646.30 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:45:17,546 epoch 3 - iter 261/292 - loss 0.10375178 - time (sec): 15.01 - samples/sec: 2643.50 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:45:19,277 epoch 3 - iter 290/292 - loss 0.10490339 - time (sec): 16.75 - samples/sec: 2645.03 - lr: 0.000039 - momentum: 0.000000
2023-10-17 17:45:19,369 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:19,370 EPOCH 3 done: loss 0.1046 - lr: 0.000039
2023-10-17 17:45:20,624 DEV : loss 0.09797272831201553 - f1-score (micro avg) 0.7626
2023-10-17 17:45:20,630 saving best model
2023-10-17 17:45:21,089 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:22,852 epoch 4 - iter 29/292 - loss 0.07421333 - time (sec): 1.76 - samples/sec: 2811.82 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:45:24,540 epoch 4 - iter 58/292 - loss 0.08033029 - time (sec): 3.44 - samples/sec: 2694.02 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:45:26,333 epoch 4 - iter 87/292 - loss 0.08827135 - time (sec): 5.24 - samples/sec: 2625.30 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:45:27,884 epoch 4 - iter 116/292 - loss 0.08338749 - time (sec): 6.79 - samples/sec: 2539.07 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:45:29,628 epoch 4 - iter 145/292 - loss 0.07896574 - time (sec): 8.53 - samples/sec: 2541.56 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:45:31,449 epoch 4 - iter 174/292 - loss 0.08184651 - time (sec): 10.35 - samples/sec: 2586.75 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:45:33,000 epoch 4 - iter 203/292 - loss 0.07823801 - time (sec): 11.90 - samples/sec: 2566.39 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:45:34,653 epoch 4 - iter 232/292 - loss 0.07894673 - time (sec): 13.56 - samples/sec: 2553.86 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:45:36,427 epoch 4 - iter 261/292 - loss 0.07449010 - time (sec): 15.33 - samples/sec: 2553.21 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:45:38,163 epoch 4 - iter 290/292 - loss 0.07251058 - time (sec): 17.07 - samples/sec: 2597.22 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:45:38,252 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:38,253 EPOCH 4 done: loss 0.0723 - lr: 0.000033
2023-10-17 17:45:39,502 DEV : loss 0.12399590760469437 - f1-score (micro avg) 0.7545
2023-10-17 17:45:39,508 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:41,142 epoch 5 - iter 29/292 - loss 0.06122202 - time (sec): 1.63 - samples/sec: 2384.89 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:45:42,878 epoch 5 - iter 58/292 - loss 0.05684321 - time (sec): 3.37 - samples/sec: 2592.69 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:45:44,550 epoch 5 - iter 87/292 - loss 0.04607849 - time (sec): 5.04 - samples/sec: 2681.49 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:45:46,392 epoch 5 - iter 116/292 - loss 0.04987496 - time (sec): 6.88 - samples/sec: 2649.25 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:45:47,996 epoch 5 - iter 145/292 - loss 0.04808783 - time (sec): 8.49 - samples/sec: 2600.28 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:45:49,559 epoch 5 - iter 174/292 - loss 0.04564148 - time (sec): 10.05 - samples/sec: 2605.18 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:45:51,278 epoch 5 - iter 203/292 - loss 0.04341831 - time (sec): 11.77 - samples/sec: 2614.97 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:45:52,908 epoch 5 - iter 232/292 - loss 0.04291497 - time (sec): 13.40 - samples/sec: 2621.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:45:54,762 epoch 5 - iter 261/292 - loss 0.04403114 - time (sec): 15.25 - samples/sec: 2619.43 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:45:56,349 epoch 5 - iter 290/292 - loss 0.04465672 - time (sec): 16.84 - samples/sec: 2618.02 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:45:56,464 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:56,464 EPOCH 5 done: loss 0.0447 - lr: 0.000028
2023-10-17 17:45:57,721 DEV : loss 0.15154214203357697 - f1-score (micro avg) 0.7635
2023-10-17 17:45:57,727 saving best model
2023-10-17 17:45:58,188 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:59,748 epoch 6 - iter 29/292 - loss 0.03976061 - time (sec): 1.55 - samples/sec: 2589.61 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:46:01,590 epoch 6 - iter 58/292 - loss 0.03082695 - time (sec): 3.39 - samples/sec: 2678.28 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:46:03,128 epoch 6 - iter 87/292 - loss 0.03254628 - time (sec): 4.93 - samples/sec: 2579.17 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:46:04,899 epoch 6 - iter 116/292 - loss 0.03205393 - time (sec): 6.70 - samples/sec: 2590.92 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:46:06,780 epoch 6 - iter 145/292 - loss 0.03458738 - time (sec): 8.58 - samples/sec: 2595.02 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:46:08,466 epoch 6 - iter 174/292 - loss 0.03208665 - time (sec): 10.26 - samples/sec: 2641.93 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:46:10,044 epoch 6 - iter 203/292 - loss 0.03331330 - time (sec): 11.84 - samples/sec: 2639.43 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:46:11,732 epoch 6 - iter 232/292 - loss 0.03066017 - time (sec): 13.53 - samples/sec: 2610.71 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:46:13,409 epoch 6 - iter 261/292 - loss 0.03183791 - time (sec): 15.21 - samples/sec: 2605.18 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:46:14,961 epoch 6 - iter 290/292 - loss 0.03229201 - time (sec): 16.76 - samples/sec: 2646.98 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:46:15,043 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:15,043 EPOCH 6 done: loss 0.0322 - lr: 0.000022
2023-10-17 17:46:16,317 DEV : loss 0.16179697215557098 - f1-score (micro avg) 0.744
2023-10-17 17:46:16,322 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:17,927 epoch 7 - iter 29/292 - loss 0.00901751 - time (sec): 1.60 - samples/sec: 2384.42 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:46:19,649 epoch 7 - iter 58/292 - loss 0.02237209 - time (sec): 3.33 - samples/sec: 2604.07 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:46:21,396 epoch 7 - iter 87/292 - loss 0.02251530 - time (sec): 5.07 - samples/sec: 2655.62 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:46:23,095 epoch 7 - iter 116/292 - loss 0.02465865 - time (sec): 6.77 - samples/sec: 2627.44 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:46:24,758 epoch 7 - iter 145/292 - loss 0.02146014 - time (sec): 8.43 - samples/sec: 2660.21 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:46:26,368 epoch 7 - iter 174/292 - loss 0.02068415 - time (sec): 10.04 - samples/sec: 2597.45 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:46:28,058 epoch 7 - iter 203/292 - loss 0.02188084 - time (sec): 11.73 - samples/sec: 2647.43 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:46:29,721 epoch 7 - iter 232/292 - loss 0.02297817 - time (sec): 13.40 - samples/sec: 2643.17 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:46:31,432 epoch 7 - iter 261/292 - loss 0.02240980 - time (sec): 15.11 - samples/sec: 2651.96 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:46:33,093 epoch 7 - iter 290/292 - loss 0.02259232 - time (sec): 16.77 - samples/sec: 2638.68 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:46:33,193 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:33,193 EPOCH 7 done: loss 0.0225 - lr: 0.000017
2023-10-17 17:46:34,648 DEV : loss 0.1630016714334488 - f1-score (micro avg) 0.7623
2023-10-17 17:46:34,653 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:36,218 epoch 8 - iter 29/292 - loss 0.01934696 - time (sec): 1.56 - samples/sec: 2570.85 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:46:37,938 epoch 8 - iter 58/292 - loss 0.02003869 - time (sec): 3.28 - samples/sec: 2562.21 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:46:39,606 epoch 8 - iter 87/292 - loss 0.01742481 - time (sec): 4.95 - samples/sec: 2539.20 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:46:41,191 epoch 8 - iter 116/292 - loss 0.01679156 - time (sec): 6.54 - samples/sec: 2539.65 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:46:42,848 epoch 8 - iter 145/292 - loss 0.01590234 - time (sec): 8.19 - samples/sec: 2582.42 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:46:44,530 epoch 8 - iter 174/292 - loss 0.01751979 - time (sec): 9.88 - samples/sec: 2615.82 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:46:46,030 epoch 8 - iter 203/292 - loss 0.01821748 - time (sec): 11.38 - samples/sec: 2607.56 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:46:47,894 epoch 8 - iter 232/292 - loss 0.01646602 - time (sec): 13.24 - samples/sec: 2642.17 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:46:49,628 epoch 8 - iter 261/292 - loss 0.01566264 - time (sec): 14.97 - samples/sec: 2617.99 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:46:51,454 epoch 8 - iter 290/292 - loss 0.01638527 - time (sec): 16.80 - samples/sec: 2637.25 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:46:51,546 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:51,547 EPOCH 8 done: loss 0.0163 - lr: 0.000011
2023-10-17 17:46:52,820 DEV : loss 0.16230525076389313 - f1-score (micro avg) 0.7716
2023-10-17 17:46:52,825 saving best model
2023-10-17 17:46:53,310 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:54,984 epoch 9 - iter 29/292 - loss 0.00632684 - time (sec): 1.67 - samples/sec: 2789.68 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:46:56,690 epoch 9 - iter 58/292 - loss 0.01106083 - time (sec): 3.38 - samples/sec: 2618.10 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:46:58,545 epoch 9 - iter 87/292 - loss 0.01162112 - time (sec): 5.23 - samples/sec: 2633.00 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:47:00,497 epoch 9 - iter 116/292 - loss 0.01186606 - time (sec): 7.19 - samples/sec: 2578.57 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:47:02,415 epoch 9 - iter 145/292 - loss 0.01186437 - time (sec): 9.10 - samples/sec: 2567.92 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:47:04,070 epoch 9 - iter 174/292 - loss 0.01099713 - time (sec): 10.76 - samples/sec: 2561.94 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:47:05,657 epoch 9 - iter 203/292 - loss 0.01147828 - time (sec): 12.35 - samples/sec: 2568.19 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:47:07,263 epoch 9 - iter 232/292 - loss 0.01048590 - time (sec): 13.95 - samples/sec: 2572.44 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:47:08,725 epoch 9 - iter 261/292 - loss 0.01103468 - time (sec): 15.41 - samples/sec: 2549.87 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:47:10,422 epoch 9 - iter 290/292 - loss 0.01088504 - time (sec): 17.11 - samples/sec: 2578.89 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:47:10,518 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:10,518 EPOCH 9 done: loss 0.0112 - lr: 0.000006
2023-10-17 17:47:11,779 DEV : loss 0.1698322296142578 - f1-score (micro avg) 0.7462
2023-10-17 17:47:11,784 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:13,358 epoch 10 - iter 29/292 - loss 0.00521108 - time (sec): 1.57 - samples/sec: 2788.97 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:47:14,996 epoch 10 - iter 58/292 - loss 0.00785901 - time (sec): 3.21 - samples/sec: 2645.32 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:47:16,751 epoch 10 - iter 87/292 - loss 0.00903835 - time (sec): 4.97 - samples/sec: 2593.06 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:47:18,520 epoch 10 - iter 116/292 - loss 0.00918701 - time (sec): 6.73 - samples/sec: 2664.33 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:47:20,212 epoch 10 - iter 145/292 - loss 0.00813227 - time (sec): 8.43 - samples/sec: 2681.74 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:47:21,773 epoch 10 - iter 174/292 - loss 0.00709191 - time (sec): 9.99 - samples/sec: 2654.89 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:47:23,671 epoch 10 - iter 203/292 - loss 0.00756203 - time (sec): 11.89 - samples/sec: 2653.04 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:47:25,354 epoch 10 - iter 232/292 - loss 0.00685533 - time (sec): 13.57 - samples/sec: 2671.01 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:47:26,919 epoch 10 - iter 261/292 - loss 0.00729752 - time (sec): 15.13 - samples/sec: 2663.07 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:47:28,459 epoch 10 - iter 290/292 - loss 0.00678873 - time (sec): 16.67 - samples/sec: 2652.99 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:47:28,548 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:28,549 EPOCH 10 done: loss 0.0068 - lr: 0.000000
2023-10-17 17:47:29,786 DEV : loss 0.17408965528011322 - f1-score (micro avg) 0.7623
2023-10-17 17:47:30,144 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:30,145 Loading model from best epoch ...
2023-10-17 17:47:31,483 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 17:47:33,834
Results:
- F-score (micro) 0.7609
- F-score (macro) 0.7061
- Accuracy 0.633
By class:
              precision    recall  f1-score   support

         PER     0.8104    0.8477    0.8287       348
         LOC     0.6337    0.8352    0.7207       261
         ORG     0.5532    0.5000    0.5253        52
   HumanProd     0.6923    0.8182    0.7500        22

   micro avg     0.7132    0.8155    0.7609       683
   macro avg     0.6724    0.7503    0.7061       683
weighted avg     0.7195    0.8155    0.7618       683
2023-10-17 17:47:33,834 ----------------------------------------------------------------------------------------------------
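The best checkpoint (best-model.pt under the base path above) can be reloaded with the standard Flair inference API. A short usage sketch, with an invented Finnish example sentence:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint written at the "saving best model" steps above.
tagger = SequenceTagger.load(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-"
    "bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

# Invented example sentence, purely for illustration.
sentence = Sentence("Helsingin Sanomat kirjoitti Mannerheimista.")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    label = span.get_label("ner")
    print(span.text, label.value, round(label.score, 3))
```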