2023-10-17 17:54:32,307 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,308 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
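The parameter shapes in the model printout above can be tallied by hand. A minimal sketch (pure Python; it assumes the printout is complete and that every Linear layer carries the bias shown):

```python
# Parameter count for the ElectraModel printed above (ELECTRA-base shapes).
EMB = 768          # hidden size
VOCAB = 32001      # word-embedding rows
MAX_POS = 512      # position embeddings
FFN = 3072         # intermediate size
LAYERS = 12        # encoder depth
TAGS = 17          # linear head outputs

def linear(n_in, n_out):
    """Weights plus bias of a Linear(in_features=n_in, out_features=n_out)."""
    return n_in * n_out + n_out

layer_norm = 2 * EMB  # LayerNorm weight + bias

embeddings = (VOCAB + MAX_POS + 2) * EMB + layer_norm

per_layer = (
    4 * linear(EMB, EMB)      # query, key, value, attention output
    + layer_norm              # post-attention LayerNorm
    + linear(EMB, FFN)        # intermediate dense
    + linear(FFN, EMB)        # output dense
    + layer_norm              # post-FFN LayerNorm
)

encoder_total = embeddings + LAYERS * per_layer
head = linear(EMB, TAGS)

print(encoder_total)  # 110027520
print(head)           # 13073
```

So the transformer backbone alone is roughly 110M parameters, and the tagging head on top is tiny by comparison.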
2023-10-17 17:54:32,308 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,308 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 17:54:32,308 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,308 Train: 1166 sentences
2023-10-17 17:54:32,308 (train_with_dev=False, train_with_test=False)
2023-10-17 17:54:32,308 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,308 Training Params:
2023-10-17 17:54:32,308 - learning_rate: "3e-05"
2023-10-17 17:54:32,308 - mini_batch_size: "4"
2023-10-17 17:54:32,308 - max_epochs: "10"
2023-10-17 17:54:32,308 - shuffle: "True"
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Plugins:
2023-10-17 17:54:32,309 - TensorboardLogger
2023-10-17 17:54:32,309 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
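The lr column in the iteration logs below is consistent with a plain linear warmup/decay schedule: ramp from 0 to the peak learning rate over the first 10% of steps (warmup_fraction 0.1, i.e. exactly the first epoch here), then decay linearly to 0. A minimal sketch; the step arithmetic is inferred from the logged values, not Flair's exact LinearScheduler implementation:

```python
# Linear warmup + linear decay, as configured above:
# learning_rate 3e-05, warmup_fraction 0.1, 10 epochs x 292 iterations.
PEAK_LR = 3e-05
STEPS_PER_EPOCH = 292
EPOCHS = 10
TOTAL = STEPS_PER_EPOCH * EPOCHS       # 2920 optimizer steps
WARMUP = int(0.1 * TOTAL)              # 292 steps, i.e. the whole first epoch

def lr_at(step):
    """Learning rate after `step` optimizer steps (1-based)."""
    if step <= WARMUP:
        return PEAK_LR * step / WARMUP                      # linear ramp up
    return PEAK_LR * (TOTAL - step) / (TOTAL - WARMUP)      # linear decay to 0

# Matches the logged values: ~3e-06 at epoch 1 iter 29, the peak of 3e-05 at
# the end of epoch 1, ~2.3e-05 by the end of epoch 3, and 0 at the last step.
print(round(lr_at(29), 6))
print(round(lr_at(292), 6))
print(round(lr_at(2920), 6))
```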
2023-10-17 17:54:32,309 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:54:32,309 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Computation:
2023-10-17 17:54:32,309 - compute on device: cuda:0
2023-10-17 17:54:32,309 - embedding storage: none
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:32,309 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:54:33,870 epoch 1 - iter 29/292 - loss 3.46529745 - time (sec): 1.56 - samples/sec: 2414.70 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:54:35,648 epoch 1 - iter 58/292 - loss 2.85262564 - time (sec): 3.34 - samples/sec: 2695.36 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:54:37,352 epoch 1 - iter 87/292 - loss 2.24144121 - time (sec): 5.04 - samples/sec: 2729.14 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:54:39,256 epoch 1 - iter 116/292 - loss 1.84711241 - time (sec): 6.95 - samples/sec: 2702.68 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:54:40,855 epoch 1 - iter 145/292 - loss 1.60586186 - time (sec): 8.55 - samples/sec: 2668.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:54:42,574 epoch 1 - iter 174/292 - loss 1.40883045 - time (sec): 10.26 - samples/sec: 2662.50 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:54:44,270 epoch 1 - iter 203/292 - loss 1.26133435 - time (sec): 11.96 - samples/sec: 2663.10 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:54:45,832 epoch 1 - iter 232/292 - loss 1.17210562 - time (sec): 13.52 - samples/sec: 2650.12 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:54:47,491 epoch 1 - iter 261/292 - loss 1.07516745 - time (sec): 15.18 - samples/sec: 2632.77 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:54:49,094 epoch 1 - iter 290/292 - loss 1.00482708 - time (sec): 16.78 - samples/sec: 2632.29 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:54:49,197 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:49,198 EPOCH 1 done: loss 1.0011 - lr: 0.000030
2023-10-17 17:54:50,247 DEV : loss 0.18033993244171143 - f1-score (micro avg) 0.5166
2023-10-17 17:54:50,252 saving best model
2023-10-17 17:54:50,598 ----------------------------------------------------------------------------------------------------
2023-10-17 17:54:52,163 epoch 2 - iter 29/292 - loss 0.26889643 - time (sec): 1.56 - samples/sec: 2819.81 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:54:53,751 epoch 2 - iter 58/292 - loss 0.23616332 - time (sec): 3.15 - samples/sec: 2670.49 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:54:55,366 epoch 2 - iter 87/292 - loss 0.23714419 - time (sec): 4.77 - samples/sec: 2706.10 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:54:57,264 epoch 2 - iter 116/292 - loss 0.23528197 - time (sec): 6.66 - samples/sec: 2714.76 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:54:58,855 epoch 2 - iter 145/292 - loss 0.23228705 - time (sec): 8.26 - samples/sec: 2656.03 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:55:00,541 epoch 2 - iter 174/292 - loss 0.21871137 - time (sec): 9.94 - samples/sec: 2656.97 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:55:02,351 epoch 2 - iter 203/292 - loss 0.21324631 - time (sec): 11.75 - samples/sec: 2694.53 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:55:04,110 epoch 2 - iter 232/292 - loss 0.20885632 - time (sec): 13.51 - samples/sec: 2715.99 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:55:05,662 epoch 2 - iter 261/292 - loss 0.20683157 - time (sec): 15.06 - samples/sec: 2667.65 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:55:07,281 epoch 2 - iter 290/292 - loss 0.20615249 - time (sec): 16.68 - samples/sec: 2655.07 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:55:07,370 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:07,370 EPOCH 2 done: loss 0.2057 - lr: 0.000027
2023-10-17 17:55:08,617 DEV : loss 0.13886240124702454 - f1-score (micro avg) 0.6061
2023-10-17 17:55:08,622 saving best model
2023-10-17 17:55:09,062 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:10,780 epoch 3 - iter 29/292 - loss 0.13024211 - time (sec): 1.72 - samples/sec: 2838.53 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:55:12,637 epoch 3 - iter 58/292 - loss 0.12759556 - time (sec): 3.57 - samples/sec: 2715.59 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:55:14,276 epoch 3 - iter 87/292 - loss 0.14205085 - time (sec): 5.21 - samples/sec: 2682.63 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:55:15,845 epoch 3 - iter 116/292 - loss 0.12865193 - time (sec): 6.78 - samples/sec: 2641.70 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:55:17,502 epoch 3 - iter 145/292 - loss 0.12530962 - time (sec): 8.44 - samples/sec: 2652.51 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:55:19,195 epoch 3 - iter 174/292 - loss 0.12338449 - time (sec): 10.13 - samples/sec: 2671.17 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:55:20,660 epoch 3 - iter 203/292 - loss 0.12001001 - time (sec): 11.60 - samples/sec: 2696.64 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:55:22,357 epoch 3 - iter 232/292 - loss 0.11516561 - time (sec): 13.29 - samples/sec: 2692.81 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:55:23,957 epoch 3 - iter 261/292 - loss 0.11294112 - time (sec): 14.89 - samples/sec: 2688.59 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:55:25,606 epoch 3 - iter 290/292 - loss 0.11431090 - time (sec): 16.54 - samples/sec: 2676.98 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:55:25,693 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:25,693 EPOCH 3 done: loss 0.1145 - lr: 0.000023
2023-10-17 17:55:26,953 DEV : loss 0.12177132815122604 - f1-score (micro avg) 0.7233
2023-10-17 17:55:26,981 saving best model
2023-10-17 17:55:27,438 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:29,139 epoch 4 - iter 29/292 - loss 0.06672050 - time (sec): 1.70 - samples/sec: 2718.35 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:55:30,934 epoch 4 - iter 58/292 - loss 0.06268066 - time (sec): 3.49 - samples/sec: 2683.17 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:55:32,516 epoch 4 - iter 87/292 - loss 0.06732367 - time (sec): 5.07 - samples/sec: 2611.61 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:55:34,394 epoch 4 - iter 116/292 - loss 0.06108828 - time (sec): 6.95 - samples/sec: 2647.70 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:55:36,052 epoch 4 - iter 145/292 - loss 0.06815551 - time (sec): 8.61 - samples/sec: 2653.98 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:55:37,835 epoch 4 - iter 174/292 - loss 0.07114621 - time (sec): 10.39 - samples/sec: 2644.31 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:55:39,445 epoch 4 - iter 203/292 - loss 0.07001109 - time (sec): 12.00 - samples/sec: 2622.13 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:55:41,196 epoch 4 - iter 232/292 - loss 0.07466102 - time (sec): 13.75 - samples/sec: 2611.52 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:55:42,800 epoch 4 - iter 261/292 - loss 0.07419526 - time (sec): 15.36 - samples/sec: 2616.12 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:55:44,399 epoch 4 - iter 290/292 - loss 0.07236502 - time (sec): 16.96 - samples/sec: 2612.87 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:55:44,487 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:44,487 EPOCH 4 done: loss 0.0722 - lr: 0.000020
2023-10-17 17:55:45,740 DEV : loss 0.12535437941551208 - f1-score (micro avg) 0.7738
2023-10-17 17:55:45,745 saving best model
2023-10-17 17:55:46,211 ----------------------------------------------------------------------------------------------------
2023-10-17 17:55:47,883 epoch 5 - iter 29/292 - loss 0.04075673 - time (sec): 1.67 - samples/sec: 2548.52 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:55:49,424 epoch 5 - iter 58/292 - loss 0.04398143 - time (sec): 3.21 - samples/sec: 2657.55 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:55:51,125 epoch 5 - iter 87/292 - loss 0.05122247 - time (sec): 4.91 - samples/sec: 2767.67 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:55:52,824 epoch 5 - iter 116/292 - loss 0.05362585 - time (sec): 6.61 - samples/sec: 2709.69 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:55:54,584 epoch 5 - iter 145/292 - loss 0.06020828 - time (sec): 8.37 - samples/sec: 2671.88 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:55:56,221 epoch 5 - iter 174/292 - loss 0.05705763 - time (sec): 10.01 - samples/sec: 2660.10 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:55:57,804 epoch 5 - iter 203/292 - loss 0.05440471 - time (sec): 11.59 - samples/sec: 2661.69 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:55:59,504 epoch 5 - iter 232/292 - loss 0.05342826 - time (sec): 13.29 - samples/sec: 2650.40 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:56:01,190 epoch 5 - iter 261/292 - loss 0.04996839 - time (sec): 14.98 - samples/sec: 2660.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:56:02,840 epoch 5 - iter 290/292 - loss 0.05076733 - time (sec): 16.63 - samples/sec: 2664.57 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:56:02,942 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:02,942 EPOCH 5 done: loss 0.0507 - lr: 0.000017
2023-10-17 17:56:04,636 DEV : loss 0.1338176429271698 - f1-score (micro avg) 0.7873
2023-10-17 17:56:04,643 saving best model
2023-10-17 17:56:05,220 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:07,011 epoch 6 - iter 29/292 - loss 0.03285633 - time (sec): 1.79 - samples/sec: 2416.37 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:56:08,784 epoch 6 - iter 58/292 - loss 0.04542416 - time (sec): 3.56 - samples/sec: 2541.67 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:56:10,460 epoch 6 - iter 87/292 - loss 0.04632496 - time (sec): 5.24 - samples/sec: 2485.10 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:56:11,985 epoch 6 - iter 116/292 - loss 0.04275741 - time (sec): 6.76 - samples/sec: 2426.11 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:56:13,731 epoch 6 - iter 145/292 - loss 0.03816003 - time (sec): 8.51 - samples/sec: 2499.75 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:56:15,477 epoch 6 - iter 174/292 - loss 0.04074117 - time (sec): 10.26 - samples/sec: 2561.94 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:56:17,003 epoch 6 - iter 203/292 - loss 0.04022817 - time (sec): 11.78 - samples/sec: 2557.88 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:56:18,654 epoch 6 - iter 232/292 - loss 0.03926639 - time (sec): 13.43 - samples/sec: 2557.04 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:56:20,236 epoch 6 - iter 261/292 - loss 0.04036851 - time (sec): 15.01 - samples/sec: 2580.58 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:56:22,084 epoch 6 - iter 290/292 - loss 0.03754090 - time (sec): 16.86 - samples/sec: 2621.46 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:56:22,182 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:22,182 EPOCH 6 done: loss 0.0377 - lr: 0.000013
2023-10-17 17:56:23,418 DEV : loss 0.13025900721549988 - f1-score (micro avg) 0.7822
2023-10-17 17:56:23,423 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:25,047 epoch 7 - iter 29/292 - loss 0.02527725 - time (sec): 1.62 - samples/sec: 2566.13 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:56:26,530 epoch 7 - iter 58/292 - loss 0.03459254 - time (sec): 3.11 - samples/sec: 2529.66 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:56:28,186 epoch 7 - iter 87/292 - loss 0.02696686 - time (sec): 4.76 - samples/sec: 2593.67 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:56:29,761 epoch 7 - iter 116/292 - loss 0.02843464 - time (sec): 6.34 - samples/sec: 2607.35 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:56:31,440 epoch 7 - iter 145/292 - loss 0.03184250 - time (sec): 8.02 - samples/sec: 2662.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:56:33,110 epoch 7 - iter 174/292 - loss 0.03157781 - time (sec): 9.69 - samples/sec: 2625.25 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:56:34,856 epoch 7 - iter 203/292 - loss 0.02943012 - time (sec): 11.43 - samples/sec: 2629.65 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:56:36,561 epoch 7 - iter 232/292 - loss 0.02784831 - time (sec): 13.14 - samples/sec: 2603.66 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:56:38,277 epoch 7 - iter 261/292 - loss 0.02771447 - time (sec): 14.85 - samples/sec: 2614.00 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:56:39,989 epoch 7 - iter 290/292 - loss 0.02650986 - time (sec): 16.56 - samples/sec: 2646.73 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:56:40,181 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:40,181 EPOCH 7 done: loss 0.0265 - lr: 0.000010
2023-10-17 17:56:41,471 DEV : loss 0.13627947866916656 - f1-score (micro avg) 0.7758
2023-10-17 17:56:41,477 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:43,311 epoch 8 - iter 29/292 - loss 0.01493153 - time (sec): 1.83 - samples/sec: 2394.48 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:56:45,102 epoch 8 - iter 58/292 - loss 0.02355857 - time (sec): 3.62 - samples/sec: 2428.09 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:56:46,760 epoch 8 - iter 87/292 - loss 0.01909118 - time (sec): 5.28 - samples/sec: 2516.02 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:56:48,389 epoch 8 - iter 116/292 - loss 0.02305759 - time (sec): 6.91 - samples/sec: 2577.21 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:56:50,048 epoch 8 - iter 145/292 - loss 0.02327096 - time (sec): 8.57 - samples/sec: 2611.98 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:56:51,689 epoch 8 - iter 174/292 - loss 0.02166280 - time (sec): 10.21 - samples/sec: 2648.62 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:56:53,498 epoch 8 - iter 203/292 - loss 0.02052288 - time (sec): 12.02 - samples/sec: 2664.00 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:56:54,998 epoch 8 - iter 232/292 - loss 0.02174233 - time (sec): 13.52 - samples/sec: 2632.40 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:56:56,744 epoch 8 - iter 261/292 - loss 0.02030829 - time (sec): 15.27 - samples/sec: 2641.97 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:56:58,399 epoch 8 - iter 290/292 - loss 0.01930915 - time (sec): 16.92 - samples/sec: 2615.82 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:56:58,493 ----------------------------------------------------------------------------------------------------
2023-10-17 17:56:58,493 EPOCH 8 done: loss 0.0192 - lr: 0.000007
2023-10-17 17:56:59,773 DEV : loss 0.14689218997955322 - f1-score (micro avg) 0.783
2023-10-17 17:56:59,779 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:01,356 epoch 9 - iter 29/292 - loss 0.02283188 - time (sec): 1.58 - samples/sec: 2542.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:57:03,241 epoch 9 - iter 58/292 - loss 0.02374641 - time (sec): 3.46 - samples/sec: 2751.78 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:57:05,057 epoch 9 - iter 87/292 - loss 0.02522260 - time (sec): 5.28 - samples/sec: 2785.00 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:57:06,636 epoch 9 - iter 116/292 - loss 0.02221788 - time (sec): 6.86 - samples/sec: 2713.59 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:57:08,138 epoch 9 - iter 145/292 - loss 0.02100944 - time (sec): 8.36 - samples/sec: 2656.06 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:57:09,813 epoch 9 - iter 174/292 - loss 0.01978056 - time (sec): 10.03 - samples/sec: 2673.02 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:57:11,400 epoch 9 - iter 203/292 - loss 0.01857901 - time (sec): 11.62 - samples/sec: 2658.24 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:57:13,088 epoch 9 - iter 232/292 - loss 0.01720288 - time (sec): 13.31 - samples/sec: 2698.42 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:57:14,615 epoch 9 - iter 261/292 - loss 0.01669609 - time (sec): 14.84 - samples/sec: 2669.73 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:57:16,176 epoch 9 - iter 290/292 - loss 0.01569075 - time (sec): 16.40 - samples/sec: 2681.37 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:57:16,317 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:16,317 EPOCH 9 done: loss 0.0155 - lr: 0.000003
2023-10-17 17:57:17,603 DEV : loss 0.1420283317565918 - f1-score (micro avg) 0.7859
2023-10-17 17:57:17,608 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:19,468 epoch 10 - iter 29/292 - loss 0.02328778 - time (sec): 1.86 - samples/sec: 2880.70 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:57:21,144 epoch 10 - iter 58/292 - loss 0.01581561 - time (sec): 3.54 - samples/sec: 2784.13 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:57:22,748 epoch 10 - iter 87/292 - loss 0.01386311 - time (sec): 5.14 - samples/sec: 2741.10 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:57:24,360 epoch 10 - iter 116/292 - loss 0.01274530 - time (sec): 6.75 - samples/sec: 2712.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:57:25,977 epoch 10 - iter 145/292 - loss 0.01181146 - time (sec): 8.37 - samples/sec: 2640.57 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:57:27,705 epoch 10 - iter 174/292 - loss 0.01493437 - time (sec): 10.10 - samples/sec: 2610.52 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:57:29,461 epoch 10 - iter 203/292 - loss 0.01450790 - time (sec): 11.85 - samples/sec: 2649.17 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:57:30,986 epoch 10 - iter 232/292 - loss 0.01526752 - time (sec): 13.38 - samples/sec: 2667.82 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:57:32,502 epoch 10 - iter 261/292 - loss 0.01469076 - time (sec): 14.89 - samples/sec: 2658.73 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:57:34,269 epoch 10 - iter 290/292 - loss 0.01440001 - time (sec): 16.66 - samples/sec: 2647.74 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:57:34,374 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:34,374 EPOCH 10 done: loss 0.0143 - lr: 0.000000
2023-10-17 17:57:35,631 DEV : loss 0.14396768808364868 - f1-score (micro avg) 0.793
2023-10-17 17:57:35,636 saving best model
2023-10-17 17:57:36,631 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:36,632 Loading model from best epoch ...
2023-10-17 17:57:38,019 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
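The 17 tags match the linear head's out_features=17: the four entity types of this corpus in BIOES encoding, plus the outside tag O. A quick sanity check (pure Python):

```python
# Reconstruct the tag dictionary above: BIOES tags for 4 entity types + "O".
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in "SBEI"]  # Single, Begin, End, Inside

print(len(tags))  # 17
print(tags[:5])   # ['O', 'S-LOC', 'B-LOC', 'E-LOC', 'I-LOC']
```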
2023-10-17 17:57:40,546
Results:
- F-score (micro) 0.7599
- F-score (macro) 0.7052
- Accuracy 0.633

By class:
              precision    recall  f1-score   support

         PER     0.8065    0.8621    0.8333       348
         LOC     0.6494    0.8161    0.7233       261
         ORG     0.4333    0.5000    0.4643        52
   HumanProd     0.7826    0.8182    0.8000        22

   micro avg     0.7114    0.8155    0.7599       683
   macro avg     0.6679    0.7491    0.7052       683
weighted avg     0.7173    0.8155    0.7621       683
2023-10-17 17:57:40,547 ----------------------------------------------------------------------------------------------------
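The aggregate rows of the table can be reproduced from the per-class numbers: macro avg is the unweighted mean of the per-class F1 scores, weighted avg weights them by support, and the micro avg F1 is the harmonic mean of the micro precision and recall. A minimal check, using the rounded values exactly as printed in the table:

```python
# Recompute the summary rows of the evaluation table from the per-class values.
# Each entry: (precision, recall, f1, support), as printed above.
by_class = {
    "PER":       (0.8065, 0.8621, 0.8333, 348),
    "LOC":       (0.6494, 0.8161, 0.7233, 261),
    "ORG":       (0.4333, 0.5000, 0.4643,  52),
    "HumanProd": (0.7826, 0.8182, 0.8000,  22),
}

total = sum(s for *_, s in by_class.values())  # 683 gold entities

# macro avg F1: unweighted mean of the per-class F1 scores
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# weighted avg F1: support-weighted mean of the per-class F1 scores
weighted_f1 = sum(f1 * s for _, _, f1, s in by_class.values()) / total

# micro avg F1: harmonic mean of micro precision and recall
p, r = 0.7114, 0.8155
micro_f1 = 2 * p * r / (p + r)

print(round(macro_f1, 4))     # 0.7052
print(round(weighted_f1, 4))  # 0.7621
print(round(micro_f1, 4))     # 0.7599
```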