2023-10-18 18:27:41,081 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:41,081 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 18:27:41,081 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:41,081 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 18:27:41,081 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:41,081 Train: 3575 sentences
2023-10-18 18:27:41,081 (train_with_dev=False, train_with_test=False)
2023-10-18 18:27:41,082 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:41,082 Training Params:
2023-10-18 18:27:41,082 - learning_rate: "5e-05"
2023-10-18 18:27:41,082 - mini_batch_size: "4"
2023-10-18 18:27:41,082 - max_epochs: "10"
2023-10-18 18:27:41,082 - shuffle: "True"
2023-10-18 18:27:41,082 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:41,082 Plugins:
2023-10-18 18:27:41,082 - TensorboardLogger
2023-10-18 18:27:41,082 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:27:41,082 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:41,082 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:27:41,082 - metric: "('micro avg', 'f1-score')"
2023-10-18 18:27:41,082 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:41,082 Computation:
2023-10-18 18:27:41,082 - compute on device: cuda:0
2023-10-18 18:27:41,082 - embedding storage: none
2023-10-18 18:27:41,082 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:41,082 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 18:27:41,082 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:41,082 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:41,082 Logging anything other than scalars to TensorBoard is currently not supported.
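
For reference, the configuration logged above can be approximated with Flair's fine-tuning API. The Python sketch below is a hedged reconstruction rather than the exact training script: the embedding model name and the layer/pooling/CRF settings are inferred from the model training base path, the corpus constructor arguments are inferred from the corpus line, and checkpointing and scheduler defaults may differ between Flair versions.

# Hedged reconstruction of the logged setup (Flair, Python). Anything marked
# "assumed" is inferred from the base path or corpus line, not stated in the log.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus: "NER_HIPE_2022 ... hipe2020/de" above (exact constructor kwargs assumed)
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="de")
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-tiny-historic-multilingual-cased",  # assumed from base path
    layers="-1",               # "layers-1" in the base path
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=128,           # matches the 128-dim linear layer in the model dump
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,             # "crfFalse" in the base path
    use_rnn=False,             # no RNN module appears in the printed architecture
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=5e-05,       # Training Params above
    mini_batch_size=4,
    max_epochs=10,
)
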
2023-10-18 18:27:42,484 epoch 1 - iter 89/894 - loss 3.41107969 - time (sec): 1.40 - samples/sec: 6802.89 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:27:43,913 epoch 1 - iter 178/894 - loss 3.19741479 - time (sec): 2.83 - samples/sec: 6532.72 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:27:45,319 epoch 1 - iter 267/894 - loss 2.75348000 - time (sec): 4.24 - samples/sec: 6457.95 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:27:46,746 epoch 1 - iter 356/894 - loss 2.35673844 - time (sec): 5.66 - samples/sec: 6272.25 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:27:48,216 epoch 1 - iter 445/894 - loss 2.02890086 - time (sec): 7.13 - samples/sec: 6222.55 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:27:49,617 epoch 1 - iter 534/894 - loss 1.80326419 - time (sec): 8.53 - samples/sec: 6177.43 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:27:50,993 epoch 1 - iter 623/894 - loss 1.64019919 - time (sec): 9.91 - samples/sec: 6186.33 - lr: 0.000035 - momentum: 0.000000
2023-10-18 18:27:52,383 epoch 1 - iter 712/894 - loss 1.51273727 - time (sec): 11.30 - samples/sec: 6159.43 - lr: 0.000040 - momentum: 0.000000
2023-10-18 18:27:53,785 epoch 1 - iter 801/894 - loss 1.41315171 - time (sec): 12.70 - samples/sec: 6135.25 - lr: 0.000045 - momentum: 0.000000
2023-10-18 18:27:55,152 epoch 1 - iter 890/894 - loss 1.33192532 - time (sec): 14.07 - samples/sec: 6118.99 - lr: 0.000050 - momentum: 0.000000
2023-10-18 18:27:55,213 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:55,214 EPOCH 1 done: loss 1.3270 - lr: 0.000050
2023-10-18 18:27:57,145 DEV : loss 0.40020185708999634 - f1-score (micro avg) 0.0
2023-10-18 18:27:57,170 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:58,590 epoch 2 - iter 89/894 - loss 0.49527024 - time (sec): 1.42 - samples/sec: 6831.62 - lr: 0.000049 - momentum: 0.000000
2023-10-18 18:28:00,298 epoch 2 - iter 178/894 - loss 0.48637380 - time (sec): 3.13 - samples/sec: 5980.55 - lr: 0.000049 - momentum: 0.000000
2023-10-18 18:28:01,706 epoch 2 - iter 267/894 - loss 0.46255258 - time (sec): 4.54 - samples/sec: 5969.96 - lr: 0.000048 - momentum: 0.000000
2023-10-18 18:28:02,986 epoch 2 - iter 356/894 - loss 0.44116644 - time (sec): 5.82 - samples/sec: 6121.42 - lr: 0.000048 - momentum: 0.000000
2023-10-18 18:28:04,229 epoch 2 - iter 445/894 - loss 0.43615971 - time (sec): 7.06 - samples/sec: 6164.56 - lr: 0.000047 - momentum: 0.000000
2023-10-18 18:28:05,541 epoch 2 - iter 534/894 - loss 0.43240093 - time (sec): 8.37 - samples/sec: 6180.42 - lr: 0.000047 - momentum: 0.000000
2023-10-18 18:28:06,934 epoch 2 - iter 623/894 - loss 0.43284058 - time (sec): 9.76 - samples/sec: 6234.01 - lr: 0.000046 - momentum: 0.000000
2023-10-18 18:28:08,295 epoch 2 - iter 712/894 - loss 0.43590184 - time (sec): 11.12 - samples/sec: 6248.52 - lr: 0.000046 - momentum: 0.000000
2023-10-18 18:28:09,693 epoch 2 - iter 801/894 - loss 0.43638346 - time (sec): 12.52 - samples/sec: 6210.86 - lr: 0.000045 - momentum: 0.000000
2023-10-18 18:28:11,075 epoch 2 - iter 890/894 - loss 0.43549143 - time (sec): 13.90 - samples/sec: 6204.71 - lr: 0.000044 - momentum: 0.000000
2023-10-18 18:28:11,132 ----------------------------------------------------------------------------------------------------
2023-10-18 18:28:11,132 EPOCH 2 done: loss 0.4358 - lr: 0.000044
2023-10-18 18:28:16,010 DEV : loss 0.3361940383911133 - f1-score (micro avg) 0.2145
2023-10-18 18:28:16,036 saving best model
2023-10-18 18:28:16,068 ----------------------------------------------------------------------------------------------------
2023-10-18 18:28:17,467 epoch 3 - iter 89/894 - loss 0.40109413 - time (sec): 1.40 - samples/sec: 6297.77 - lr: 0.000044 - momentum: 0.000000
2023-10-18 18:28:18,833 epoch 3 - iter 178/894 - loss 0.40822413 - time (sec): 2.76 - samples/sec: 6191.74 - lr: 0.000043 - momentum: 0.000000
2023-10-18 18:28:20,178 epoch 3 - iter 267/894 - loss 0.39533388 - time (sec): 4.11 - samples/sec: 6294.03 - lr: 0.000043 - momentum: 0.000000
2023-10-18 18:28:21,538 epoch 3 - iter 356/894 - loss 0.38348937 - time (sec): 5.47 - samples/sec: 6222.82 - lr: 0.000042 - momentum: 0.000000
2023-10-18 18:28:22,957 epoch 3 - iter 445/894 - loss 0.38947787 - time (sec): 6.89 - samples/sec: 6236.91 - lr: 0.000042 - momentum: 0.000000
2023-10-18 18:28:24,314 epoch 3 - iter 534/894 - loss 0.38207063 - time (sec): 8.25 - samples/sec: 6208.34 - lr: 0.000041 - momentum: 0.000000
2023-10-18 18:28:25,753 epoch 3 - iter 623/894 - loss 0.37655257 - time (sec): 9.68 - samples/sec: 6286.71 - lr: 0.000041 - momentum: 0.000000
2023-10-18 18:28:27,147 epoch 3 - iter 712/894 - loss 0.37547493 - time (sec): 11.08 - samples/sec: 6228.02 - lr: 0.000040 - momentum: 0.000000
2023-10-18 18:28:28,544 epoch 3 - iter 801/894 - loss 0.37667688 - time (sec): 12.48 - samples/sec: 6244.41 - lr: 0.000039 - momentum: 0.000000
2023-10-18 18:28:29,912 epoch 3 - iter 890/894 - loss 0.37253884 - time (sec): 13.84 - samples/sec: 6232.41 - lr: 0.000039 - momentum: 0.000000
2023-10-18 18:28:29,973 ----------------------------------------------------------------------------------------------------
2023-10-18 18:28:29,973 EPOCH 3 done: loss 0.3725 - lr: 0.000039
2023-10-18 18:28:35,228 DEV : loss 0.3129556179046631 - f1-score (micro avg) 0.3398
2023-10-18 18:28:35,254 saving best model
2023-10-18 18:28:35,288 ----------------------------------------------------------------------------------------------------
2023-10-18 18:28:36,535 epoch 4 - iter 89/894 - loss 0.31905254 - time (sec): 1.25 - samples/sec: 7257.23 - lr: 0.000038 - momentum: 0.000000
2023-10-18 18:28:37,917 epoch 4 - iter 178/894 - loss 0.31151319 - time (sec): 2.63 - samples/sec: 6761.27 - lr: 0.000038 - momentum: 0.000000
2023-10-18 18:28:39,330 epoch 4 - iter 267/894 - loss 0.31487409 - time (sec): 4.04 - samples/sec: 6779.46 - lr: 0.000037 - momentum: 0.000000
2023-10-18 18:28:40,725 epoch 4 - iter 356/894 - loss 0.32858231 - time (sec): 5.44 - samples/sec: 6680.54 - lr: 0.000037 - momentum: 0.000000
2023-10-18 18:28:42,103 epoch 4 - iter 445/894 - loss 0.32782339 - time (sec): 6.82 - samples/sec: 6476.42 - lr: 0.000036 - momentum: 0.000000
2023-10-18 18:28:43,488 epoch 4 - iter 534/894 - loss 0.33208690 - time (sec): 8.20 - samples/sec: 6503.13 - lr: 0.000036 - momentum: 0.000000
2023-10-18 18:28:44,842 epoch 4 - iter 623/894 - loss 0.33442800 - time (sec): 9.55 - samples/sec: 6431.29 - lr: 0.000035 - momentum: 0.000000
2023-10-18 18:28:46,190 epoch 4 - iter 712/894 - loss 0.33661458 - time (sec): 10.90 - samples/sec: 6393.71 - lr: 0.000034 - momentum: 0.000000
2023-10-18 18:28:47,572 epoch 4 - iter 801/894 - loss 0.33599717 - time (sec): 12.28 - samples/sec: 6338.63 - lr: 0.000034 - momentum: 0.000000
2023-10-18 18:28:48,953 epoch 4 - iter 890/894 - loss 0.33575512 - time (sec): 13.66 - samples/sec: 6306.29 - lr: 0.000033 - momentum: 0.000000
2023-10-18 18:28:49,015 ----------------------------------------------------------------------------------------------------
2023-10-18 18:28:49,015 EPOCH 4 done: loss 0.3357 - lr: 0.000033
2023-10-18 18:28:54,302 DEV : loss 0.3030177354812622 - f1-score (micro avg) 0.3312
2023-10-18 18:28:54,327 ----------------------------------------------------------------------------------------------------
2023-10-18 18:28:55,699 epoch 5 - iter 89/894 - loss 0.35702426 - time (sec): 1.37 - samples/sec: 5665.85 - lr: 0.000033 - momentum: 0.000000
2023-10-18 18:28:57,091 epoch 5 - iter 178/894 - loss 0.32420012 - time (sec): 2.76 - samples/sec: 5731.95 - lr: 0.000032 - momentum: 0.000000
2023-10-18 18:28:58,456 epoch 5 - iter 267/894 - loss 0.32031718 - time (sec): 4.13 - samples/sec: 5872.56 - lr: 0.000032 - momentum: 0.000000
2023-10-18 18:28:59,881 epoch 5 - iter 356/894 - loss 0.30064946 - time (sec): 5.55 - samples/sec: 6092.07 - lr: 0.000031 - momentum: 0.000000
2023-10-18 18:29:01,282 epoch 5 - iter 445/894 - loss 0.29561401 - time (sec): 6.95 - samples/sec: 6226.51 - lr: 0.000031 - momentum: 0.000000
2023-10-18 18:29:02,671 epoch 5 - iter 534/894 - loss 0.29665882 - time (sec): 8.34 - samples/sec: 6312.81 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:29:03,937 epoch 5 - iter 623/894 - loss 0.29657716 - time (sec): 9.61 - samples/sec: 6344.01 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:29:05,320 epoch 5 - iter 712/894 - loss 0.30045153 - time (sec): 10.99 - samples/sec: 6292.58 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:29:06,719 epoch 5 - iter 801/894 - loss 0.30113299 - time (sec): 12.39 - samples/sec: 6267.79 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:29:08,105 epoch 5 - iter 890/894 - loss 0.30518006 - time (sec): 13.78 - samples/sec: 6262.12 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:29:08,167 ----------------------------------------------------------------------------------------------------
2023-10-18 18:29:08,167 EPOCH 5 done: loss 0.3047 - lr: 0.000028
2023-10-18 18:29:13,478 DEV : loss 0.30577316880226135 - f1-score (micro avg) 0.3593
2023-10-18 18:29:13,504 saving best model
2023-10-18 18:29:13,542 ----------------------------------------------------------------------------------------------------
2023-10-18 18:29:14,955 epoch 6 - iter 89/894 - loss 0.26832203 - time (sec): 1.41 - samples/sec: 6075.82 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:29:16,377 epoch 6 - iter 178/894 - loss 0.28611994 - time (sec): 2.83 - samples/sec: 6127.23 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:29:17,827 epoch 6 - iter 267/894 - loss 0.28634822 - time (sec): 4.28 - samples/sec: 6356.47 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:29:19,184 epoch 6 - iter 356/894 - loss 0.28936804 - time (sec): 5.64 - samples/sec: 6304.55 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:29:20,590 epoch 6 - iter 445/894 - loss 0.29315417 - time (sec): 7.05 - samples/sec: 6235.92 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:29:21,923 epoch 6 - iter 534/894 - loss 0.28966787 - time (sec): 8.38 - samples/sec: 6153.24 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:29:23,326 epoch 6 - iter 623/894 - loss 0.28961136 - time (sec): 9.78 - samples/sec: 6163.56 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:29:24,707 epoch 6 - iter 712/894 - loss 0.28874325 - time (sec): 11.16 - samples/sec: 6158.15 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:29:26,073 epoch 6 - iter 801/894 - loss 0.29000085 - time (sec): 12.53 - samples/sec: 6181.58 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:29:27,468 epoch 6 - iter 890/894 - loss 0.28502336 - time (sec): 13.93 - samples/sec: 6190.45 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:29:27,527 ----------------------------------------------------------------------------------------------------
2023-10-18 18:29:27,528 EPOCH 6 done: loss 0.2842 - lr: 0.000022
2023-10-18 18:29:32,498 DEV : loss 0.3079672157764435 - f1-score (micro avg) 0.3617
2023-10-18 18:29:32,523 saving best model
2023-10-18 18:29:32,556 ----------------------------------------------------------------------------------------------------
2023-10-18 18:29:33,944 epoch 7 - iter 89/894 - loss 0.29011633 - time (sec): 1.39 - samples/sec: 6127.79 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:29:35,340 epoch 7 - iter 178/894 - loss 0.26920405 - time (sec): 2.78 - samples/sec: 6027.29 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:29:36,651 epoch 7 - iter 267/894 - loss 0.26706714 - time (sec): 4.10 - samples/sec: 6065.90 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:29:37,901 epoch 7 - iter 356/894 - loss 0.26434876 - time (sec): 5.34 - samples/sec: 6391.15 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:29:39,252 epoch 7 - iter 445/894 - loss 0.26500313 - time (sec): 6.70 - samples/sec: 6422.62 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:29:40,624 epoch 7 - iter 534/894 - loss 0.26000296 - time (sec): 8.07 - samples/sec: 6428.25 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:29:42,006 epoch 7 - iter 623/894 - loss 0.26918289 - time (sec): 9.45 - samples/sec: 6401.06 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:29:43,758 epoch 7 - iter 712/894 - loss 0.26880946 - time (sec): 11.20 - samples/sec: 6224.17 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:29:45,150 epoch 7 - iter 801/894 - loss 0.26969539 - time (sec): 12.59 - samples/sec: 6194.44 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:29:46,539 epoch 7 - iter 890/894 - loss 0.26911572 - time (sec): 13.98 - samples/sec: 6159.93 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:29:46,605 ----------------------------------------------------------------------------------------------------
2023-10-18 18:29:46,605 EPOCH 7 done: loss 0.2683 - lr: 0.000017
2023-10-18 18:29:51,592 DEV : loss 0.30530446767807007 - f1-score (micro avg) 0.3729
2023-10-18 18:29:51,618 saving best model
2023-10-18 18:29:51,658 ----------------------------------------------------------------------------------------------------
2023-10-18 18:29:53,038 epoch 8 - iter 89/894 - loss 0.27926806 - time (sec): 1.38 - samples/sec: 6199.09 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:29:54,406 epoch 8 - iter 178/894 - loss 0.26615970 - time (sec): 2.75 - samples/sec: 6182.68 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:29:55,792 epoch 8 - iter 267/894 - loss 0.27250183 - time (sec): 4.13 - samples/sec: 6266.44 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:29:57,067 epoch 8 - iter 356/894 - loss 0.27294376 - time (sec): 5.41 - samples/sec: 6418.59 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:29:58,310 epoch 8 - iter 445/894 - loss 0.26958943 - time (sec): 6.65 - samples/sec: 6509.10 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:29:59,586 epoch 8 - iter 534/894 - loss 0.26501812 - time (sec): 7.93 - samples/sec: 6698.31 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:30:00,903 epoch 8 - iter 623/894 - loss 0.26337124 - time (sec): 9.24 - samples/sec: 6602.18 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:30:02,303 epoch 8 - iter 712/894 - loss 0.26398454 - time (sec): 10.64 - samples/sec: 6548.88 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:30:03,695 epoch 8 - iter 801/894 - loss 0.25847704 - time (sec): 12.04 - samples/sec: 6556.75 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:30:05,051 epoch 8 - iter 890/894 - loss 0.26097250 - time (sec): 13.39 - samples/sec: 6438.89 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:30:05,107 ----------------------------------------------------------------------------------------------------
2023-10-18 18:30:05,107 EPOCH 8 done: loss 0.2606 - lr: 0.000011
2023-10-18 18:30:10,376 DEV : loss 0.3058369755744934 - f1-score (micro avg) 0.3789
2023-10-18 18:30:10,402 saving best model
2023-10-18 18:30:10,438 ----------------------------------------------------------------------------------------------------
2023-10-18 18:30:11,818 epoch 9 - iter 89/894 - loss 0.22849598 - time (sec): 1.38 - samples/sec: 5809.18 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:30:13,259 epoch 9 - iter 178/894 - loss 0.25523926 - time (sec): 2.82 - samples/sec: 6331.01 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:30:14,625 epoch 9 - iter 267/894 - loss 0.27346974 - time (sec): 4.19 - samples/sec: 6288.83 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:30:16,020 epoch 9 - iter 356/894 - loss 0.26469584 - time (sec): 5.58 - samples/sec: 6259.18 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:30:17,401 epoch 9 - iter 445/894 - loss 0.26281820 - time (sec): 6.96 - samples/sec: 6183.94 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:30:18,773 epoch 9 - iter 534/894 - loss 0.25909388 - time (sec): 8.33 - samples/sec: 6321.42 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:30:20,014 epoch 9 - iter 623/894 - loss 0.25428350 - time (sec): 9.58 - samples/sec: 6435.17 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:30:21,348 epoch 9 - iter 712/894 - loss 0.25478754 - time (sec): 10.91 - samples/sec: 6395.98 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:30:22,731 epoch 9 - iter 801/894 - loss 0.25297570 - time (sec): 12.29 - samples/sec: 6370.94 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:30:24,086 epoch 9 - iter 890/894 - loss 0.25061435 - time (sec): 13.65 - samples/sec: 6326.55 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:30:24,142 ----------------------------------------------------------------------------------------------------
2023-10-18 18:30:24,142 EPOCH 9 done: loss 0.2509 - lr: 0.000006
2023-10-18 18:30:29,469 DEV : loss 0.310301274061203 - f1-score (micro avg) 0.3822
2023-10-18 18:30:29,497 saving best model
2023-10-18 18:30:29,537 ----------------------------------------------------------------------------------------------------
2023-10-18 18:30:30,976 epoch 10 - iter 89/894 - loss 0.20663355 - time (sec): 1.44 - samples/sec: 6219.17 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:30:32,338 epoch 10 - iter 178/894 - loss 0.21406280 - time (sec): 2.80 - samples/sec: 6257.50 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:30:33,664 epoch 10 - iter 267/894 - loss 0.20949232 - time (sec): 4.13 - samples/sec: 6102.03 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:30:35,048 epoch 10 - iter 356/894 - loss 0.22041080 - time (sec): 5.51 - samples/sec: 6045.24 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:30:36,427 epoch 10 - iter 445/894 - loss 0.22589832 - time (sec): 6.89 - samples/sec: 6029.46 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:30:37,802 epoch 10 - iter 534/894 - loss 0.23045096 - time (sec): 8.26 - samples/sec: 6010.35 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:30:39,178 epoch 10 - iter 623/894 - loss 0.23335194 - time (sec): 9.64 - samples/sec: 6024.46 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:30:40,601 epoch 10 - iter 712/894 - loss 0.23654913 - time (sec): 11.06 - samples/sec: 6101.95 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:30:42,027 epoch 10 - iter 801/894 - loss 0.23561267 - time (sec): 12.49 - samples/sec: 6217.60 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:30:43,401 epoch 10 - iter 890/894 - loss 0.24226094 - time (sec): 13.86 - samples/sec: 6220.52 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:30:43,458 ----------------------------------------------------------------------------------------------------
2023-10-18 18:30:43,458 EPOCH 10 done: loss 0.2425 - lr: 0.000000
2023-10-18 18:30:48,847 DEV : loss 0.3040592074394226 - f1-score (micro avg) 0.3858
2023-10-18 18:30:48,871 saving best model
2023-10-18 18:30:48,931 ----------------------------------------------------------------------------------------------------
2023-10-18 18:30:48,931 Loading model from best epoch ...
2023-10-18 18:30:49,004 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
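
Once training has finished, the saved best checkpoint can be loaded and applied to new text with the standard Flair prediction API. A minimal usage sketch (the example sentence is made up):

# Load the best model saved under the base path logged above and tag one sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("Der Kongress fand in Bern statt .")  # hypothetical example sentence
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, round(span.get_label("ner").score, 3))
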
2023-10-18 18:30:50,932
Results:
- F-score (micro) 0.3766
- F-score (macro) 0.1915
- Accuracy 0.2454
By class:
              precision    recall  f1-score   support

         loc     0.5280    0.5688    0.5477       596
        pers     0.1922    0.2943    0.2325       333
         org     0.0000    0.0000    0.0000       132
        time     0.2333    0.1429    0.1772        49
        prod     0.0000    0.0000    0.0000        66

   micro avg     0.3756    0.3776    0.3766      1176
   macro avg     0.1907    0.2012    0.1915      1176
weighted avg     0.3317    0.3776    0.3508      1176
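
As a quick sanity check, the micro-average F-score follows from the micro precision and recall in the table above: F1 = 2 * P * R / (P + R) = 2 * 0.3756 * 0.3776 / (0.3756 + 0.3776) ≈ 0.3766, matching the reported F-score (micro).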
2023-10-18 18:30:50,932 ----------------------------------------------------------------------------------------------------