2023-10-17 17:44:25,631 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Train: 1166 sentences
2023-10-17 17:44:25,632 (train_with_dev=False, train_with_test=False)
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Training Params:
2023-10-17 17:44:25,632 - learning_rate: "5e-05"
2023-10-17 17:44:25,632 - mini_batch_size: "4"
2023-10-17 17:44:25,632 - max_epochs: "10"
2023-10-17 17:44:25,632 - shuffle: "True"
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Plugins:
2023-10-17 17:44:25,632 - TensorboardLogger
2023-10-17 17:44:25,632 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:44:25,632 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Computation:
2023-10-17 17:44:25,633 - compute on device: cuda:0
2023-10-17 17:44:25,633 - embedding storage: none
2023-10-17 17:44:25,633 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,633 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 17:44:25,633 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,633 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,633 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:44:27,412 epoch 1 - iter 29/292 - loss 3.57048903 - time (sec): 1.78 - samples/sec: 2510.89 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:44:29,115 epoch 1 - iter 58/292 - loss 2.69134226 - time (sec): 3.48 - samples/sec: 2638.12 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:44:30,931 epoch 1 - iter 87/292 - loss 2.00635595 - time (sec): 5.30 - samples/sec: 2531.16 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:44:32,440 epoch 1 - iter 116/292 - loss 1.67505873 - time (sec): 6.81 - samples/sec: 2517.67 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:44:33,997 epoch 1 - iter 145/292 - loss 1.44361519 - time (sec): 8.36 - samples/sec: 2543.88 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:44:35,582 epoch 1 - iter 174/292 - loss 1.26249810 - time (sec): 9.95 - samples/sec: 2555.63 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:44:37,185 epoch 1 - iter 203/292 - loss 1.12316057 - time (sec): 11.55 - samples/sec: 2573.81 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:44:38,686 epoch 1 - iter 232/292 - loss 1.03035540 - time (sec): 13.05 - samples/sec: 2580.37 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:44:40,348 epoch 1 - iter 261/292 - loss 0.93353604 - time (sec): 14.71 - samples/sec: 2610.98 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:44:42,337 epoch 1 - iter 290/292 - loss 0.85173568 - time (sec): 16.70 - samples/sec: 2648.79 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:44:42,438 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:42,438 EPOCH 1 done: loss 0.8497 - lr: 0.000049
2023-10-17 17:44:43,302 DEV : loss 0.15992386639118195 - f1-score (micro avg) 0.525
2023-10-17 17:44:43,310 saving best model
2023-10-17 17:44:43,721 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:45,590 epoch 2 - iter 29/292 - loss 0.25767213 - time (sec): 1.87 - samples/sec: 2697.46 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:44:47,114 epoch 2 - iter 58/292 - loss 0.25456634 - time (sec): 3.39 - samples/sec: 2571.19 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:44:48,626 epoch 2 - iter 87/292 - loss 0.24045766 - time (sec): 4.90 - samples/sec: 2556.28 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:44:50,105 epoch 2 - iter 116/292 - loss 0.22630269 - time (sec): 6.38 - samples/sec: 2537.48 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:44:51,910 epoch 2 - iter 145/292 - loss 0.22581740 - time (sec): 8.19 - samples/sec: 2618.08 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:44:53,578 epoch 2 - iter 174/292 - loss 0.21214049 - time (sec): 9.85 - samples/sec: 2567.67 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:44:55,252 epoch 2 - iter 203/292 - loss 0.20001713 - time (sec): 11.53 - samples/sec: 2563.49 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:44:56,845 epoch 2 - iter 232/292 - loss 0.19935194 - time (sec): 13.12 - samples/sec: 2570.87 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:44:58,631 epoch 2 - iter 261/292 - loss 0.18698302 - time (sec): 14.91 - samples/sec: 2621.32 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:45:00,465 epoch 2 - iter 290/292 - loss 0.18609856 - time (sec): 16.74 - samples/sec: 2644.64 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:45:00,549 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:00,549 EPOCH 2 done: loss 0.1855 - lr: 0.000045
2023-10-17 17:45:02,044 DEV : loss 0.1280839890241623 - f1-score (micro avg) 0.6897
2023-10-17 17:45:02,049 saving best model
2023-10-17 17:45:02,529 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:04,015 epoch 3 - iter 29/292 - loss 0.11027878 - time (sec): 1.48 - samples/sec: 2381.84 - lr: 0.000044 - momentum: 0.000000
2023-10-17 17:45:05,837 epoch 3 - iter 58/292 - loss 0.10728336 - time (sec): 3.31 - samples/sec: 2645.70 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:45:07,539 epoch 3 - iter 87/292 - loss 0.09859684 - time (sec): 5.01 - samples/sec: 2698.14 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:45:09,206 epoch 3 - iter 116/292 - loss 0.09629751 - time (sec): 6.67 - samples/sec: 2670.46 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:45:10,833 epoch 3 - iter 145/292 - loss 0.10399359 - time (sec): 8.30 - samples/sec: 2627.51 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:45:12,436 epoch 3 - iter 174/292 - loss 0.10387591 - time (sec): 9.90 - samples/sec: 2632.99 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:45:14,016 epoch 3 - iter 203/292 - loss 0.10831849 - time (sec): 11.48 - samples/sec: 2604.01 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:45:15,776 epoch 3 - iter 232/292 - loss 0.10540846 - time (sec): 13.24 - samples/sec: 2646.30 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:45:17,546 epoch 3 - iter 261/292 - loss 0.10375178 - time (sec): 15.01 - samples/sec: 2643.50 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:45:19,277 epoch 3 - iter 290/292 - loss 0.10490339 - time (sec): 16.75 - samples/sec: 2645.03 - lr: 0.000039 - momentum: 0.000000
2023-10-17 17:45:19,369 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:19,370 EPOCH 3 done: loss 0.1046 - lr: 0.000039
2023-10-17 17:45:20,624 DEV : loss 0.09797272831201553 - f1-score (micro avg) 0.7626
2023-10-17 17:45:20,630 saving best model
2023-10-17 17:45:21,089 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:22,852 epoch 4 - iter 29/292 - loss 0.07421333 - time (sec): 1.76 - samples/sec: 2811.82 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:45:24,540 epoch 4 - iter 58/292 - loss 0.08033029 - time (sec): 3.44 - samples/sec: 2694.02 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:45:26,333 epoch 4 - iter 87/292 - loss 0.08827135 - time (sec): 5.24 - samples/sec: 2625.30 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:45:27,884 epoch 4 - iter 116/292 - loss 0.08338749 - time (sec): 6.79 - samples/sec: 2539.07 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:45:29,628 epoch 4 - iter 145/292 - loss 0.07896574 - time (sec): 8.53 - samples/sec: 2541.56 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:45:31,449 epoch 4 - iter 174/292 - loss 0.08184651 - time (sec): 10.35 - samples/sec: 2586.75 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:45:33,000 epoch 4 - iter 203/292 - loss 0.07823801 - time (sec): 11.90 - samples/sec: 2566.39 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:45:34,653 epoch 4 - iter 232/292 - loss 0.07894673 - time (sec): 13.56 - samples/sec: 2553.86 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:45:36,427 epoch 4 - iter 261/292 - loss 0.07449010 - time (sec): 15.33 - samples/sec: 2553.21 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:45:38,163 epoch 4 - iter 290/292 - loss 0.07251058 - time (sec): 17.07 - samples/sec: 2597.22 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:45:38,252 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:38,253 EPOCH 4 done: loss 0.0723 - lr: 0.000033
2023-10-17 17:45:39,502 DEV : loss 0.12399590760469437 - f1-score (micro avg) 0.7545
2023-10-17 17:45:39,508 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:41,142 epoch 5 - iter 29/292 - loss 0.06122202 - time (sec): 1.63 - samples/sec: 2384.89 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:45:42,878 epoch 5 - iter 58/292 - loss 0.05684321 - time (sec): 3.37 - samples/sec: 2592.69 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:45:44,550 epoch 5 - iter 87/292 - loss 0.04607849 - time (sec): 5.04 - samples/sec: 2681.49 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:45:46,392 epoch 5 - iter 116/292 - loss 0.04987496 - time (sec): 6.88 - samples/sec: 2649.25 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:45:47,996 epoch 5 - iter 145/292 - loss 0.04808783 - time (sec): 8.49 - samples/sec: 2600.28 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:45:49,559 epoch 5 - iter 174/292 - loss 0.04564148 - time (sec): 10.05 - samples/sec: 2605.18 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:45:51,278 epoch 5 - iter 203/292 - loss 0.04341831 - time (sec): 11.77 - samples/sec: 2614.97 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:45:52,908 epoch 5 - iter 232/292 - loss 0.04291497 - time (sec): 13.40 - samples/sec: 2621.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:45:54,762 epoch 5 - iter 261/292 - loss 0.04403114 - time (sec): 15.25 - samples/sec: 2619.43 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:45:56,349 epoch 5 - iter 290/292 - loss 0.04465672 - time (sec): 16.84 - samples/sec: 2618.02 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:45:56,464 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:56,464 EPOCH 5 done: loss 0.0447 - lr: 0.000028
2023-10-17 17:45:57,721 DEV : loss 0.15154214203357697 - f1-score (micro avg) 0.7635
2023-10-17 17:45:57,727 saving best model
2023-10-17 17:45:58,188 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:59,748 epoch 6 - iter 29/292 - loss 0.03976061 - time (sec): 1.55 - samples/sec: 2589.61 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:46:01,590 epoch 6 - iter 58/292 - loss 0.03082695 - time (sec): 3.39 - samples/sec: 2678.28 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:46:03,128 epoch 6 - iter 87/292 - loss 0.03254628 - time (sec): 4.93 - samples/sec: 2579.17 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:46:04,899 epoch 6 - iter 116/292 - loss 0.03205393 - time (sec): 6.70 - samples/sec: 2590.92 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:46:06,780 epoch 6 - iter 145/292 - loss 0.03458738 - time (sec): 8.58 - samples/sec: 2595.02 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:46:08,466 epoch 6 - iter 174/292 - loss 0.03208665 - time (sec): 10.26 - samples/sec: 2641.93 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:46:10,044 epoch 6 - iter 203/292 - loss 0.03331330 - time (sec): 11.84 - samples/sec: 2639.43 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:46:11,732 epoch 6 - iter 232/292 - loss 0.03066017 - time (sec): 13.53 - samples/sec: 2610.71 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:46:13,409 epoch 6 - iter 261/292 - loss 0.03183791 - time (sec): 15.21 - samples/sec: 2605.18 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:46:14,961 epoch 6 - iter 290/292 - loss 0.03229201 - time (sec): 16.76 - samples/sec: 2646.98 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:46:15,043 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:15,043 EPOCH 6 done: loss 0.0322 - lr: 0.000022
2023-10-17 17:46:16,317 DEV : loss 0.16179697215557098 - f1-score (micro avg) 0.744
2023-10-17 17:46:16,322 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:17,927 epoch 7 - iter 29/292 - loss 0.00901751 - time (sec): 1.60 - samples/sec: 2384.42 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:46:19,649 epoch 7 - iter 58/292 - loss 0.02237209 - time (sec): 3.33 - samples/sec: 2604.07 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:46:21,396 epoch 7 - iter 87/292 - loss 0.02251530 - time (sec): 5.07 - samples/sec: 2655.62 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:46:23,095 epoch 7 - iter 116/292 - loss 0.02465865 - time (sec): 6.77 - samples/sec: 2627.44 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:46:24,758 epoch 7 - iter 145/292 - loss 0.02146014 - time (sec): 8.43 - samples/sec: 2660.21 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:46:26,368 epoch 7 - iter 174/292 - loss 0.02068415 - time (sec): 10.04 - samples/sec: 2597.45 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:46:28,058 epoch 7 - iter 203/292 - loss 0.02188084 - time (sec): 11.73 - samples/sec: 2647.43 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:46:29,721 epoch 7 - iter 232/292 - loss 0.02297817 - time (sec): 13.40 - samples/sec: 2643.17 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:46:31,432 epoch 7 - iter 261/292 - loss 0.02240980 - time (sec): 15.11 - samples/sec: 2651.96 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:46:33,093 epoch 7 - iter 290/292 - loss 0.02259232 - time (sec): 16.77 - samples/sec: 2638.68 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:46:33,193 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:33,193 EPOCH 7 done: loss 0.0225 - lr: 0.000017
2023-10-17 17:46:34,648 DEV : loss 0.1630016714334488 - f1-score (micro avg) 0.7623
2023-10-17 17:46:34,653 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:36,218 epoch 8 - iter 29/292 - loss 0.01934696 - time (sec): 1.56 - samples/sec: 2570.85 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:46:37,938 epoch 8 - iter 58/292 - loss 0.02003869 - time (sec): 3.28 - samples/sec: 2562.21 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:46:39,606 epoch 8 - iter 87/292 - loss 0.01742481 - time (sec): 4.95 - samples/sec: 2539.20 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:46:41,191 epoch 8 - iter 116/292 - loss 0.01679156 - time (sec): 6.54 - samples/sec: 2539.65 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:46:42,848 epoch 8 - iter 145/292 - loss 0.01590234 - time (sec): 8.19 - samples/sec: 2582.42 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:46:44,530 epoch 8 - iter 174/292 - loss 0.01751979 - time (sec): 9.88 - samples/sec: 2615.82 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:46:46,030 epoch 8 - iter 203/292 - loss 0.01821748 - time (sec): 11.38 - samples/sec: 2607.56 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:46:47,894 epoch 8 - iter 232/292 - loss 0.01646602 - time (sec): 13.24 - samples/sec: 2642.17 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:46:49,628 epoch 8 - iter 261/292 - loss 0.01566264 - time (sec): 14.97 - samples/sec: 2617.99 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:46:51,454 epoch 8 - iter 290/292 - loss 0.01638527 - time (sec): 16.80 - samples/sec: 2637.25 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:46:51,546 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:51,547 EPOCH 8 done: loss 0.0163 - lr: 0.000011
2023-10-17 17:46:52,820 DEV : loss 0.16230525076389313 - f1-score (micro avg) 0.7716
2023-10-17 17:46:52,825 saving best model
2023-10-17 17:46:53,310 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:54,984 epoch 9 - iter 29/292 - loss 0.00632684 - time (sec): 1.67 - samples/sec: 2789.68 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:46:56,690 epoch 9 - iter 58/292 - loss 0.01106083 - time (sec): 3.38 - samples/sec: 2618.10 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:46:58,545 epoch 9 - iter 87/292 - loss 0.01162112 - time (sec): 5.23 - samples/sec: 2633.00 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:47:00,497 epoch 9 - iter 116/292 - loss 0.01186606 - time (sec): 7.19 - samples/sec: 2578.57 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:47:02,415 epoch 9 - iter 145/292 - loss 0.01186437 - time (sec): 9.10 - samples/sec: 2567.92 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:47:04,070 epoch 9 - iter 174/292 - loss 0.01099713 - time (sec): 10.76 - samples/sec: 2561.94 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:47:05,657 epoch 9 - iter 203/292 - loss 0.01147828 - time (sec): 12.35 - samples/sec: 2568.19 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:47:07,263 epoch 9 - iter 232/292 - loss 0.01048590 - time (sec): 13.95 - samples/sec: 2572.44 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:47:08,725 epoch 9 - iter 261/292 - loss 0.01103468 - time (sec): 15.41 - samples/sec: 2549.87 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:47:10,422 epoch 9 - iter 290/292 - loss 0.01088504 - time (sec): 17.11 - samples/sec: 2578.89 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:47:10,518 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:10,518 EPOCH 9 done: loss 0.0112 - lr: 0.000006
2023-10-17 17:47:11,779 DEV : loss 0.1698322296142578 - f1-score (micro avg) 0.7462
2023-10-17 17:47:11,784 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:13,358 epoch 10 - iter 29/292 - loss 0.00521108 - time (sec): 1.57 - samples/sec: 2788.97 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:47:14,996 epoch 10 - iter 58/292 - loss 0.00785901 - time (sec): 3.21 - samples/sec: 2645.32 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:47:16,751 epoch 10 - iter 87/292 - loss 0.00903835 - time (sec): 4.97 - samples/sec: 2593.06 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:47:18,520 epoch 10 - iter 116/292 - loss 0.00918701 - time (sec): 6.73 - samples/sec: 2664.33 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:47:20,212 epoch 10 - iter 145/292 - loss 0.00813227 - time (sec): 8.43 - samples/sec: 2681.74 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:47:21,773 epoch 10 - iter 174/292 - loss 0.00709191 - time (sec): 9.99 - samples/sec: 2654.89 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:47:23,671 epoch 10 - iter 203/292 - loss 0.00756203 - time (sec): 11.89 - samples/sec: 2653.04 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:47:25,354 epoch 10 - iter 232/292 - loss 0.00685533 - time (sec): 13.57 - samples/sec: 2671.01 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:47:26,919 epoch 10 - iter 261/292 - loss 0.00729752 - time (sec): 15.13 - samples/sec: 2663.07 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:47:28,459 epoch 10 - iter 290/292 - loss 0.00678873 - time (sec): 16.67 - samples/sec: 2652.99 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:47:28,548 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:28,549 EPOCH 10 done: loss 0.0068 - lr: 0.000000
2023-10-17 17:47:29,786 DEV : loss 0.17408965528011322 - f1-score (micro avg) 0.7623
2023-10-17 17:47:30,144 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:30,145 Loading model from best epoch ...
2023-10-17 17:47:31,483 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 17:47:33,834 Results:
- F-score (micro) 0.7609
- F-score (macro) 0.7061
- Accuracy 0.633

By class:
              precision    recall  f1-score   support

         PER     0.8104    0.8477    0.8287       348
         LOC     0.6337    0.8352    0.7207       261
         ORG     0.5532    0.5000    0.5253        52
   HumanProd     0.6923    0.8182    0.7500        22

   micro avg     0.7132    0.8155    0.7609       683
   macro avg     0.6724    0.7503    0.7061       683
weighted avg     0.7195    0.8155    0.7618       683

2023-10-17 17:47:33,834 ----------------------------------------------------------------------------------------------------
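The aggregate scores in the final table can be cross-checked from the per-class rows alone. A minimal sketch in Python (per-class figures copied from the log above; tiny deviations from the logged aggregates are due to rounding of the per-class values, not an error in the run):

```python
# Recompute macro, weighted, and micro F1 from the per-class results table.
# label: (precision, recall, f1-score, support) -- values from the log.
per_class = {
    "PER":       (0.8104, 0.8477, 0.8287, 348),
    "LOC":       (0.6337, 0.8352, 0.7207, 261),
    "ORG":       (0.5532, 0.5000, 0.5253, 52),
    "HumanProd": (0.6923, 0.8182, 0.7500, 22),
}

total_support = sum(s for *_, s in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: support-weighted mean of the per-class F1 scores.
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_p, micro_r = 0.7132, 0.8155
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"macro {macro_f1:.4f}  weighted {weighted_f1:.4f}  micro {micro_f1:.4f}")
```

Within rounding, this reproduces the logged F-score (micro) 0.7609, F-score (macro) 0.7061, and weighted average 0.7618 over 683 test entities.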