2023-10-17 17:54:32,307 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,308 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:54:32,308 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,308 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 17:54:32,308 
----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,308 Train: 1166 sentences
2023-10-17 17:54:32,308 (train_with_dev=False, train_with_test=False)
2023-10-17 17:54:32,308 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,308 Training Params:
2023-10-17 17:54:32,308 - learning_rate: "3e-05"
2023-10-17 17:54:32,308 - mini_batch_size: "4"
2023-10-17 17:54:32,308 - max_epochs: "10"
2023-10-17 17:54:32,308 - shuffle: "True"
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Plugins:
2023-10-17 17:54:32,309 - TensorboardLogger
2023-10-17 17:54:32,309 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:54:32,309 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Computation:
2023-10-17 17:54:32,309 - compute on device: cuda:0
2023-10-17 17:54:32,309 - embedding storage: none
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Logging anything other than scalars to TensorBoard 
is currently not supported.
2023-10-17 17:54:33,870 epoch 1 - iter 29/292 - loss 3.46529745 - time (sec): 1.56 - samples/sec: 2414.70 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:54:35,648 epoch 1 - iter 58/292 - loss 2.85262564 - time (sec): 3.34 - samples/sec: 2695.36 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:54:37,352 epoch 1 - iter 87/292 - loss 2.24144121 - time (sec): 5.04 - samples/sec: 2729.14 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:54:39,256 epoch 1 - iter 116/292 - loss 1.84711241 - time (sec): 6.95 - samples/sec: 2702.68 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:54:40,855 epoch 1 - iter 145/292 - loss 1.60586186 - time (sec): 8.55 - samples/sec: 2668.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:54:42,574 epoch 1 - iter 174/292 - loss 1.40883045 - time (sec): 10.26 - samples/sec: 2662.50 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:54:44,270 epoch 1 - iter 203/292 - loss 1.26133435 - time (sec): 11.96 - samples/sec: 2663.10 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:54:45,832 epoch 1 - iter 232/292 - loss 1.17210562 - time (sec): 13.52 - samples/sec: 2650.12 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:54:47,491 epoch 1 - iter 261/292 - loss 1.07516745 - time (sec): 15.18 - samples/sec: 2632.77 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:54:49,094 epoch 1 - iter 290/292 - loss 1.00482708 - time (sec): 16.78 - samples/sec: 2632.29 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:54:49,197 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:49,198 EPOCH 1 done: loss 1.0011 - lr: 0.000030
2023-10-17 17:54:50,247 DEV : loss 0.18033993244171143 - f1-score (micro avg) 0.5166
2023-10-17 17:54:50,252 saving best model
2023-10-17 17:54:50,598 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:52,163 epoch 2 - iter 29/292 - loss 0.26889643 - time (sec): 1.56 - samples/sec: 
2819.81 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:54:53,751 epoch 2 - iter 58/292 - loss 0.23616332 - time (sec): 3.15 - samples/sec: 2670.49 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:54:55,366 epoch 2 - iter 87/292 - loss 0.23714419 - time (sec): 4.77 - samples/sec: 2706.10 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:54:57,264 epoch 2 - iter 116/292 - loss 0.23528197 - time (sec): 6.66 - samples/sec: 2714.76 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:54:58,855 epoch 2 - iter 145/292 - loss 0.23228705 - time (sec): 8.26 - samples/sec: 2656.03 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:55:00,541 epoch 2 - iter 174/292 - loss 0.21871137 - time (sec): 9.94 - samples/sec: 2656.97 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:55:02,351 epoch 2 - iter 203/292 - loss 0.21324631 - time (sec): 11.75 - samples/sec: 2694.53 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:55:04,110 epoch 2 - iter 232/292 - loss 0.20885632 - time (sec): 13.51 - samples/sec: 2715.99 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:55:05,662 epoch 2 - iter 261/292 - loss 0.20683157 - time (sec): 15.06 - samples/sec: 2667.65 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:55:07,281 epoch 2 - iter 290/292 - loss 0.20615249 - time (sec): 16.68 - samples/sec: 2655.07 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:55:07,370 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:07,370 EPOCH 2 done: loss 0.2057 - lr: 0.000027
2023-10-17 17:55:08,617 DEV : loss 0.13886240124702454 - f1-score (micro avg) 0.6061
2023-10-17 17:55:08,622 saving best model
2023-10-17 17:55:09,062 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:10,780 epoch 3 - iter 29/292 - loss 0.13024211 - time (sec): 1.72 - samples/sec: 2838.53 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:55:12,637 epoch 3 - iter 58/292 - loss 0.12759556 - time (sec): 3.57 
- samples/sec: 2715.59 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:55:14,276 epoch 3 - iter 87/292 - loss 0.14205085 - time (sec): 5.21 - samples/sec: 2682.63 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:55:15,845 epoch 3 - iter 116/292 - loss 0.12865193 - time (sec): 6.78 - samples/sec: 2641.70 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:55:17,502 epoch 3 - iter 145/292 - loss 0.12530962 - time (sec): 8.44 - samples/sec: 2652.51 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:55:19,195 epoch 3 - iter 174/292 - loss 0.12338449 - time (sec): 10.13 - samples/sec: 2671.17 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:55:20,660 epoch 3 - iter 203/292 - loss 0.12001001 - time (sec): 11.60 - samples/sec: 2696.64 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:55:22,357 epoch 3 - iter 232/292 - loss 0.11516561 - time (sec): 13.29 - samples/sec: 2692.81 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:55:23,957 epoch 3 - iter 261/292 - loss 0.11294112 - time (sec): 14.89 - samples/sec: 2688.59 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:55:25,606 epoch 3 - iter 290/292 - loss 0.11431090 - time (sec): 16.54 - samples/sec: 2676.98 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:55:25,693 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:25,693 EPOCH 3 done: loss 0.1145 - lr: 0.000023
2023-10-17 17:55:26,953 DEV : loss 0.12177132815122604 - f1-score (micro avg) 0.7233
2023-10-17 17:55:26,981 saving best model
2023-10-17 17:55:27,438 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:29,139 epoch 4 - iter 29/292 - loss 0.06672050 - time (sec): 1.70 - samples/sec: 2718.35 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:55:30,934 epoch 4 - iter 58/292 - loss 0.06268066 - time (sec): 3.49 - samples/sec: 2683.17 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:55:32,516 epoch 4 - iter 87/292 - loss 0.06732367 - 
time (sec): 5.07 - samples/sec: 2611.61 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:55:34,394 epoch 4 - iter 116/292 - loss 0.06108828 - time (sec): 6.95 - samples/sec: 2647.70 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:55:36,052 epoch 4 - iter 145/292 - loss 0.06815551 - time (sec): 8.61 - samples/sec: 2653.98 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:55:37,835 epoch 4 - iter 174/292 - loss 0.07114621 - time (sec): 10.39 - samples/sec: 2644.31 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:55:39,445 epoch 4 - iter 203/292 - loss 0.07001109 - time (sec): 12.00 - samples/sec: 2622.13 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:55:41,196 epoch 4 - iter 232/292 - loss 0.07466102 - time (sec): 13.75 - samples/sec: 2611.52 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:55:42,800 epoch 4 - iter 261/292 - loss 0.07419526 - time (sec): 15.36 - samples/sec: 2616.12 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:55:44,399 epoch 4 - iter 290/292 - loss 0.07236502 - time (sec): 16.96 - samples/sec: 2612.87 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:55:44,487 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:44,487 EPOCH 4 done: loss 0.0722 - lr: 0.000020
2023-10-17 17:55:45,740 DEV : loss 0.12535437941551208 - f1-score (micro avg) 0.7738
2023-10-17 17:55:45,745 saving best model
2023-10-17 17:55:46,211 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:47,883 epoch 5 - iter 29/292 - loss 0.04075673 - time (sec): 1.67 - samples/sec: 2548.52 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:55:49,424 epoch 5 - iter 58/292 - loss 0.04398143 - time (sec): 3.21 - samples/sec: 2657.55 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:55:51,125 epoch 5 - iter 87/292 - loss 0.05122247 - time (sec): 4.91 - samples/sec: 2767.67 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:55:52,824 epoch 5 - iter 116/292 - 
loss 0.05362585 - time (sec): 6.61 - samples/sec: 2709.69 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:55:54,584 epoch 5 - iter 145/292 - loss 0.06020828 - time (sec): 8.37 - samples/sec: 2671.88 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:55:56,221 epoch 5 - iter 174/292 - loss 0.05705763 - time (sec): 10.01 - samples/sec: 2660.10 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:55:57,804 epoch 5 - iter 203/292 - loss 0.05440471 - time (sec): 11.59 - samples/sec: 2661.69 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:55:59,504 epoch 5 - iter 232/292 - loss 0.05342826 - time (sec): 13.29 - samples/sec: 2650.40 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:56:01,190 epoch 5 - iter 261/292 - loss 0.04996839 - time (sec): 14.98 - samples/sec: 2660.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:56:02,840 epoch 5 - iter 290/292 - loss 0.05076733 - time (sec): 16.63 - samples/sec: 2664.57 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:56:02,942 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:02,942 EPOCH 5 done: loss 0.0507 - lr: 0.000017
2023-10-17 17:56:04,636 DEV : loss 0.1338176429271698 - f1-score (micro avg) 0.7873
2023-10-17 17:56:04,643 saving best model
2023-10-17 17:56:05,220 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:07,011 epoch 6 - iter 29/292 - loss 0.03285633 - time (sec): 1.79 - samples/sec: 2416.37 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:56:08,784 epoch 6 - iter 58/292 - loss 0.04542416 - time (sec): 3.56 - samples/sec: 2541.67 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:56:10,460 epoch 6 - iter 87/292 - loss 0.04632496 - time (sec): 5.24 - samples/sec: 2485.10 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:56:11,985 epoch 6 - iter 116/292 - loss 0.04275741 - time (sec): 6.76 - samples/sec: 2426.11 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:56:13,731 epoch 6 
- iter 145/292 - loss 0.03816003 - time (sec): 8.51 - samples/sec: 2499.75 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:56:15,477 epoch 6 - iter 174/292 - loss 0.04074117 - time (sec): 10.26 - samples/sec: 2561.94 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:56:17,003 epoch 6 - iter 203/292 - loss 0.04022817 - time (sec): 11.78 - samples/sec: 2557.88 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:56:18,654 epoch 6 - iter 232/292 - loss 0.03926639 - time (sec): 13.43 - samples/sec: 2557.04 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:56:20,236 epoch 6 - iter 261/292 - loss 0.04036851 - time (sec): 15.01 - samples/sec: 2580.58 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:56:22,084 epoch 6 - iter 290/292 - loss 0.03754090 - time (sec): 16.86 - samples/sec: 2621.46 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:56:22,182 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:22,182 EPOCH 6 done: loss 0.0377 - lr: 0.000013
2023-10-17 17:56:23,418 DEV : loss 0.13025900721549988 - f1-score (micro avg) 0.7822
2023-10-17 17:56:23,423 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:25,047 epoch 7 - iter 29/292 - loss 0.02527725 - time (sec): 1.62 - samples/sec: 2566.13 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:56:26,530 epoch 7 - iter 58/292 - loss 0.03459254 - time (sec): 3.11 - samples/sec: 2529.66 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:56:28,186 epoch 7 - iter 87/292 - loss 0.02696686 - time (sec): 4.76 - samples/sec: 2593.67 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:56:29,761 epoch 7 - iter 116/292 - loss 0.02843464 - time (sec): 6.34 - samples/sec: 2607.35 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:56:31,440 epoch 7 - iter 145/292 - loss 0.03184250 - time (sec): 8.02 - samples/sec: 2662.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:56:33,110 epoch 7 - iter 174/292 - loss 
0.03157781 - time (sec): 9.69 - samples/sec: 2625.25 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:56:34,856 epoch 7 - iter 203/292 - loss 0.02943012 - time (sec): 11.43 - samples/sec: 2629.65 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:56:36,561 epoch 7 - iter 232/292 - loss 0.02784831 - time (sec): 13.14 - samples/sec: 2603.66 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:56:38,277 epoch 7 - iter 261/292 - loss 0.02771447 - time (sec): 14.85 - samples/sec: 2614.00 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:56:39,989 epoch 7 - iter 290/292 - loss 0.02650986 - time (sec): 16.56 - samples/sec: 2646.73 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:56:40,181 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:40,181 EPOCH 7 done: loss 0.0265 - lr: 0.000010
2023-10-17 17:56:41,471 DEV : loss 0.13627947866916656 - f1-score (micro avg) 0.7758
2023-10-17 17:56:41,477 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:43,311 epoch 8 - iter 29/292 - loss 0.01493153 - time (sec): 1.83 - samples/sec: 2394.48 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:56:45,102 epoch 8 - iter 58/292 - loss 0.02355857 - time (sec): 3.62 - samples/sec: 2428.09 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:56:46,760 epoch 8 - iter 87/292 - loss 0.01909118 - time (sec): 5.28 - samples/sec: 2516.02 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:56:48,389 epoch 8 - iter 116/292 - loss 0.02305759 - time (sec): 6.91 - samples/sec: 2577.21 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:56:50,048 epoch 8 - iter 145/292 - loss 0.02327096 - time (sec): 8.57 - samples/sec: 2611.98 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:56:51,689 epoch 8 - iter 174/292 - loss 0.02166280 - time (sec): 10.21 - samples/sec: 2648.62 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:56:53,498 epoch 8 - iter 203/292 - loss 0.02052288 - time (sec): 
12.02 - samples/sec: 2664.00 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:56:54,998 epoch 8 - iter 232/292 - loss 0.02174233 - time (sec): 13.52 - samples/sec: 2632.40 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:56:56,744 epoch 8 - iter 261/292 - loss 0.02030829 - time (sec): 15.27 - samples/sec: 2641.97 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:56:58,399 epoch 8 - iter 290/292 - loss 0.01930915 - time (sec): 16.92 - samples/sec: 2615.82 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:56:58,493 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:58,493 EPOCH 8 done: loss 0.0192 - lr: 0.000007
2023-10-17 17:56:59,773 DEV : loss 0.14689218997955322 - f1-score (micro avg) 0.783
2023-10-17 17:56:59,779 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:01,356 epoch 9 - iter 29/292 - loss 0.02283188 - time (sec): 1.58 - samples/sec: 2542.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:57:03,241 epoch 9 - iter 58/292 - loss 0.02374641 - time (sec): 3.46 - samples/sec: 2751.78 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:57:05,057 epoch 9 - iter 87/292 - loss 0.02522260 - time (sec): 5.28 - samples/sec: 2785.00 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:57:06,636 epoch 9 - iter 116/292 - loss 0.02221788 - time (sec): 6.86 - samples/sec: 2713.59 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:57:08,138 epoch 9 - iter 145/292 - loss 0.02100944 - time (sec): 8.36 - samples/sec: 2656.06 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:57:09,813 epoch 9 - iter 174/292 - loss 0.01978056 - time (sec): 10.03 - samples/sec: 2673.02 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:57:11,400 epoch 9 - iter 203/292 - loss 0.01857901 - time (sec): 11.62 - samples/sec: 2658.24 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:57:13,088 epoch 9 - iter 232/292 - loss 0.01720288 - time (sec): 13.31 - samples/sec: 
2698.42 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:57:14,615 epoch 9 - iter 261/292 - loss 0.01669609 - time (sec): 14.84 - samples/sec: 2669.73 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:57:16,176 epoch 9 - iter 290/292 - loss 0.01569075 - time (sec): 16.40 - samples/sec: 2681.37 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:57:16,317 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:16,317 EPOCH 9 done: loss 0.0155 - lr: 0.000003
2023-10-17 17:57:17,603 DEV : loss 0.1420283317565918 - f1-score (micro avg) 0.7859
2023-10-17 17:57:17,608 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:19,468 epoch 10 - iter 29/292 - loss 0.02328778 - time (sec): 1.86 - samples/sec: 2880.70 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:57:21,144 epoch 10 - iter 58/292 - loss 0.01581561 - time (sec): 3.54 - samples/sec: 2784.13 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:57:22,748 epoch 10 - iter 87/292 - loss 0.01386311 - time (sec): 5.14 - samples/sec: 2741.10 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:57:24,360 epoch 10 - iter 116/292 - loss 0.01274530 - time (sec): 6.75 - samples/sec: 2712.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:57:25,977 epoch 10 - iter 145/292 - loss 0.01181146 - time (sec): 8.37 - samples/sec: 2640.57 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:57:27,705 epoch 10 - iter 174/292 - loss 0.01493437 - time (sec): 10.10 - samples/sec: 2610.52 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:57:29,461 epoch 10 - iter 203/292 - loss 0.01450790 - time (sec): 11.85 - samples/sec: 2649.17 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:57:30,986 epoch 10 - iter 232/292 - loss 0.01526752 - time (sec): 13.38 - samples/sec: 2667.82 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:57:32,502 epoch 10 - iter 261/292 - loss 0.01469076 - time (sec): 14.89 - samples/sec: 2658.73 - lr: 
0.000000 - momentum: 0.000000
2023-10-17 17:57:34,269 epoch 10 - iter 290/292 - loss 0.01440001 - time (sec): 16.66 - samples/sec: 2647.74 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:57:34,374 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:34,374 EPOCH 10 done: loss 0.0143 - lr: 0.000000
2023-10-17 17:57:35,631 DEV : loss 0.14396768808364868 - f1-score (micro avg) 0.793
2023-10-17 17:57:35,636 saving best model
2023-10-17 17:57:36,631 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:36,632 Loading model from best epoch ...
2023-10-17 17:57:38,019 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 17:57:40,546 Results:
- F-score (micro) 0.7599
- F-score (macro) 0.7052
- Accuracy 0.633

By class:
              precision    recall  f1-score   support

         PER     0.8065    0.8621    0.8333       348
         LOC     0.6494    0.8161    0.7233       261
         ORG     0.4333    0.5000    0.4643        52
   HumanProd     0.7826    0.8182    0.8000        22

   micro avg     0.7114    0.8155    0.7599       683
   macro avg     0.6679    0.7491    0.7052       683
weighted avg     0.7173    0.8155    0.7621       683

2023-10-17 17:57:40,547 ----------------------------------------------------------------------------------------------------
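The per-iteration lr values in the log trace a linear warmup/decay schedule: with warmup_fraction 0.1, 292 iterations per epoch, and 10 epochs, the learning rate ramps from 0 to the peak 3e-05 over the first 292 steps (all of epoch 1), then decays linearly to 0. A minimal sketch of that schedule, assuming one scheduler step per logged iteration (the function name and bookkeeping are illustrative, not Flair's internal API):

```python
# Sketch of the linear warmup/decay schedule implied by the log above.
# Assumptions: peak lr 3e-05, 292 iterations/epoch, 10 epochs, warmup_fraction 0.1.
def linear_warmup_lr(step, peak=3e-05, steps_per_epoch=292, epochs=10,
                     warmup_fraction=0.1):
    total = steps_per_epoch * epochs        # 2920 optimizer steps overall
    warmup = int(total * warmup_fraction)   # 292 warmup steps = all of epoch 1
    if step < warmup:
        return peak * step / warmup                      # linear ramp to peak
    return peak * (total - step) / (total - warmup)      # linear decay to zero
```

At step 29 this gives roughly 0.000003 and at the final step exactly 0, consistent with the lr column logged in epochs 1 and 10.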
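As a sanity check, the aggregate rows of the final classification report can be reproduced from the per-class rows. A short sketch with the numbers copied from the log (the dictionary layout is illustrative, not how Flair stores results):

```python
# Recompute the aggregate rows of the final report from the per-class
# figures logged above: (precision, recall, f1-score, support).
per_class = {
    "PER":       (0.8065, 0.8621, 0.8333, 348),
    "LOC":       (0.6494, 0.8161, 0.7233, 261),
    "ORG":       (0.4333, 0.5000, 0.4643,  52),
    "HumanProd": (0.7826, 0.8182, 0.8000,  22),
}

total_support = sum(s for *_, s in per_class.values())              # 683
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

# Micro F1 is the harmonic mean of the micro-averaged precision and recall.
micro_p, micro_r = 0.7114, 0.8155
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
```

Rounded to four decimals this yields 0.7052 (macro) and 0.7621 (weighted), matching the table; the micro value agrees with the logged 0.7599 up to the rounding already present in the micro precision/recall.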