2023-10-18 18:21:43,173 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,173 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 18:21:43,173 ----------------------------------------------------------------------------------------------------
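Note: the printout above is Flair's SequenceTagger wrapping a tiny BERT encoder (2 layers, hidden size 128, vocabulary 32001) with locked dropout and a plain linear head over 21 tags; there is no CRF and no RNN. A minimal, hedged sketch of how such a tagger could be assembled with the public Flair API follows. The embedding model name and the pooling/layer choices are assumptions read off the training base path logged further below, not stated explicitly in this printout.

# Hedged sketch (Flair API); model name, pooling and layers inferred from the base path
# "...bert-tiny-historic-multilingual-cased-...-poolingfirst-layers-1-crfFalse-...".
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-tiny-historic-multilingual-cased",  # assumed from the base path
    layers="-1",                 # only the last transformer layer
    subtoken_pooling="first",    # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,             # unused when use_rnn=False; kept to satisfy the signature
    embeddings=embeddings,
    tag_dictionary=label_dict,   # 21-entry label dictionary; see the corpus sketch below
    tag_type="ner",
    use_crf=False,               # "crfFalse" in the base path; plain softmax head
    use_rnn=False,               # log shows Linear(128 -> 21) applied directly to the embeddings
    reproject_embeddings=False,
)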
2023-10-18 18:21:43,173 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 18:21:43,173 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,173 Train: 3575 sentences
2023-10-18 18:21:43,173 (train_with_dev=False, train_with_test=False)
2023-10-18 18:21:43,173 ----------------------------------------------------------------------------------------------------
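Note: the corpus line above refers to Flair's HIPE-2022 loader (HIPE-2020 data, German, v2.1), wrapped in a MultiCorpus by the training script. A hedged sketch of loading it and building the label dictionary used in the tagger sketch above; exact keyword names may differ between Flair versions.

# Hedged sketch: load the single corpus listed above and derive the NER label dictionary.
from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(dataset_name="hipe2020", language="de")   # 3575 train / 1235 dev / 1266 test
label_dict = corpus.make_label_dictionary(label_type="ner")      # 21 BIOES tags, listed near the end of the log
print(corpus)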
2023-10-18 18:21:43,173 Training Params:
2023-10-18 18:21:43,173 - learning_rate: "5e-05"
2023-10-18 18:21:43,173 - mini_batch_size: "8"
2023-10-18 18:21:43,173 - max_epochs: "10"
2023-10-18 18:21:43,173 - shuffle: "True"
2023-10-18 18:21:43,173 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,173 Plugins:
2023-10-18 18:21:43,173 - TensorboardLogger
2023-10-18 18:21:43,173 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:21:43,174 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,174 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:21:43,174 - metric: "('micro avg', 'f1-score')"
2023-10-18 18:21:43,174 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,174 Computation:
2023-10-18 18:21:43,174 - compute on device: cuda:0
2023-10-18 18:21:43,174 - embedding storage: none
2023-10-18 18:21:43,174 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,174 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 18:21:43,174 ----------------------------------------------------------------------------------------------------
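Note: the training parameters, linear warmup schedule and best-model selection above correspond to Flair's fine-tuning routine. A hedged sketch of a matching trainer call follows; the TensorBoard logging and the LinearScheduler with warmup_fraction 0.1 are assumed to come from fine_tune defaults or explicitly configured plugins in recent Flair versions, and are not spelled out here.

# Hedged sketch: fine-tuning call matching the logged parameters
# (lr 5e-05, mini-batch size 8, 10 epochs, micro-F1 model selection on dev).
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=5e-05,
    mini_batch_size=8,
    max_epochs=10,
    # The "LinearScheduler | warmup_fraction: '0.1'" plugin in the log corresponds to the
    # linear warmup schedule applied during fine-tuning (assumption about the exact wiring).
)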
2023-10-18 18:21:43,174 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,174 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 18:21:44,063 epoch 1 - iter 44/447 - loss 4.23649821 - time (sec): 0.89 - samples/sec: 9205.03 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:21:45,001 epoch 1 - iter 88/447 - loss 4.09913748 - time (sec): 1.83 - samples/sec: 9161.65 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:21:46,122 epoch 1 - iter 132/447 - loss 3.70752803 - time (sec): 2.95 - samples/sec: 8885.41 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:21:47,135 epoch 1 - iter 176/447 - loss 3.33851307 - time (sec): 3.96 - samples/sec: 8897.63 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:21:48,153 epoch 1 - iter 220/447 - loss 2.91921810 - time (sec): 4.98 - samples/sec: 8929.24 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:21:49,138 epoch 1 - iter 264/447 - loss 2.57172108 - time (sec): 5.96 - samples/sec: 8897.72 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:21:50,118 epoch 1 - iter 308/447 - loss 2.32138092 - time (sec): 6.94 - samples/sec: 8782.67 - lr: 0.000034 - momentum: 0.000000
2023-10-18 18:21:51,110 epoch 1 - iter 352/447 - loss 2.12444880 - time (sec): 7.94 - samples/sec: 8684.37 - lr: 0.000039 - momentum: 0.000000
2023-10-18 18:21:52,115 epoch 1 - iter 396/447 - loss 1.96997781 - time (sec): 8.94 - samples/sec: 8592.80 - lr: 0.000044 - momentum: 0.000000
2023-10-18 18:21:53,135 epoch 1 - iter 440/447 - loss 1.83051644 - time (sec): 9.96 - samples/sec: 8560.53 - lr: 0.000049 - momentum: 0.000000
2023-10-18 18:21:53,279 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:53,280 EPOCH 1 done: loss 1.8109 - lr: 0.000049
2023-10-18 18:21:55,566 DEV : loss 0.44877171516418457 - f1-score (micro avg) 0.0
2023-10-18 18:21:55,589 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:56,612 epoch 2 - iter 44/447 - loss 0.60155493 - time (sec): 1.02 - samples/sec: 8574.38 - lr: 0.000049 - momentum: 0.000000
2023-10-18 18:21:57,597 epoch 2 - iter 88/447 - loss 0.56142538 - time (sec): 2.01 - samples/sec: 8504.11 - lr: 0.000049 - momentum: 0.000000
2023-10-18 18:21:58,565 epoch 2 - iter 132/447 - loss 0.55726995 - time (sec): 2.98 - samples/sec: 8469.91 - lr: 0.000048 - momentum: 0.000000
2023-10-18 18:21:59,547 epoch 2 - iter 176/447 - loss 0.54406916 - time (sec): 3.96 - samples/sec: 8495.52 - lr: 0.000048 - momentum: 0.000000
2023-10-18 18:22:00,547 epoch 2 - iter 220/447 - loss 0.53559205 - time (sec): 4.96 - samples/sec: 8474.13 - lr: 0.000047 - momentum: 0.000000
2023-10-18 18:22:01,542 epoch 2 - iter 264/447 - loss 0.52587561 - time (sec): 5.95 - samples/sec: 8323.02 - lr: 0.000047 - momentum: 0.000000
2023-10-18 18:22:02,554 epoch 2 - iter 308/447 - loss 0.51234790 - time (sec): 6.96 - samples/sec: 8349.78 - lr: 0.000046 - momentum: 0.000000
2023-10-18 18:22:03,602 epoch 2 - iter 352/447 - loss 0.49579329 - time (sec): 8.01 - samples/sec: 8484.55 - lr: 0.000046 - momentum: 0.000000
2023-10-18 18:22:04,613 epoch 2 - iter 396/447 - loss 0.49021091 - time (sec): 9.02 - samples/sec: 8540.08 - lr: 0.000045 - momentum: 0.000000
2023-10-18 18:22:05,637 epoch 2 - iter 440/447 - loss 0.48330812 - time (sec): 10.05 - samples/sec: 8495.76 - lr: 0.000045 - momentum: 0.000000
2023-10-18 18:22:05,784 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:05,784 EPOCH 2 done: loss 0.4835 - lr: 0.000045
2023-10-18 18:22:11,082 DEV : loss 0.32866111397743225 - f1-score (micro avg) 0.2007
2023-10-18 18:22:11,105 saving best model
2023-10-18 18:22:11,142 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:12,115 epoch 3 - iter 44/447 - loss 0.40849086 - time (sec): 0.97 - samples/sec: 9093.83 - lr: 0.000044 - momentum: 0.000000
2023-10-18 18:22:13,082 epoch 3 - iter 88/447 - loss 0.39700437 - time (sec): 1.94 - samples/sec: 8792.81 - lr: 0.000043 - momentum: 0.000000
2023-10-18 18:22:14,073 epoch 3 - iter 132/447 - loss 0.41180397 - time (sec): 2.93 - samples/sec: 8767.84 - lr: 0.000043 - momentum: 0.000000
2023-10-18 18:22:15,057 epoch 3 - iter 176/447 - loss 0.41517633 - time (sec): 3.91 - samples/sec: 8603.19 - lr: 0.000042 - momentum: 0.000000
2023-10-18 18:22:16,077 epoch 3 - iter 220/447 - loss 0.40697778 - time (sec): 4.93 - samples/sec: 8582.36 - lr: 0.000042 - momentum: 0.000000
2023-10-18 18:22:17,133 epoch 3 - iter 264/447 - loss 0.40336023 - time (sec): 5.99 - samples/sec: 8712.21 - lr: 0.000041 - momentum: 0.000000
2023-10-18 18:22:18,123 epoch 3 - iter 308/447 - loss 0.40473996 - time (sec): 6.98 - samples/sec: 8727.05 - lr: 0.000041 - momentum: 0.000000
2023-10-18 18:22:19,105 epoch 3 - iter 352/447 - loss 0.40234530 - time (sec): 7.96 - samples/sec: 8674.70 - lr: 0.000040 - momentum: 0.000000
2023-10-18 18:22:20,067 epoch 3 - iter 396/447 - loss 0.40175952 - time (sec): 8.92 - samples/sec: 8624.41 - lr: 0.000040 - momentum: 0.000000
2023-10-18 18:22:21,055 epoch 3 - iter 440/447 - loss 0.40155202 - time (sec): 9.91 - samples/sec: 8597.01 - lr: 0.000039 - momentum: 0.000000
2023-10-18 18:22:21,214 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:21,214 EPOCH 3 done: loss 0.3996 - lr: 0.000039
2023-10-18 18:22:26,494 DEV : loss 0.31150904297828674 - f1-score (micro avg) 0.2895
2023-10-18 18:22:26,518 saving best model
2023-10-18 18:22:26,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:27,547 epoch 4 - iter 44/447 - loss 0.31938505 - time (sec): 0.99 - samples/sec: 7768.01 - lr: 0.000038 - momentum: 0.000000
2023-10-18 18:22:28,522 epoch 4 - iter 88/447 - loss 0.35629034 - time (sec): 1.97 - samples/sec: 8020.00 - lr: 0.000038 - momentum: 0.000000
2023-10-18 18:22:29,360 epoch 4 - iter 132/447 - loss 0.37841602 - time (sec): 2.81 - samples/sec: 8550.59 - lr: 0.000037 - momentum: 0.000000
2023-10-18 18:22:30,205 epoch 4 - iter 176/447 - loss 0.37884124 - time (sec): 3.65 - samples/sec: 8816.64 - lr: 0.000037 - momentum: 0.000000
2023-10-18 18:22:31,154 epoch 4 - iter 220/447 - loss 0.36800228 - time (sec): 4.60 - samples/sec: 9070.98 - lr: 0.000036 - momentum: 0.000000
2023-10-18 18:22:32,140 epoch 4 - iter 264/447 - loss 0.35905570 - time (sec): 5.59 - samples/sec: 9038.17 - lr: 0.000036 - momentum: 0.000000
2023-10-18 18:22:33,228 epoch 4 - iter 308/447 - loss 0.35834364 - time (sec): 6.68 - samples/sec: 8997.15 - lr: 0.000035 - momentum: 0.000000
2023-10-18 18:22:34,223 epoch 4 - iter 352/447 - loss 0.35753489 - time (sec): 7.67 - samples/sec: 8989.30 - lr: 0.000035 - momentum: 0.000000
2023-10-18 18:22:35,209 epoch 4 - iter 396/447 - loss 0.35399316 - time (sec): 8.66 - samples/sec: 8900.47 - lr: 0.000034 - momentum: 0.000000
2023-10-18 18:22:36,209 epoch 4 - iter 440/447 - loss 0.35960244 - time (sec): 9.66 - samples/sec: 8828.74 - lr: 0.000033 - momentum: 0.000000
2023-10-18 18:22:36,366 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:36,366 EPOCH 4 done: loss 0.3601 - lr: 0.000033
2023-10-18 18:22:41,343 DEV : loss 0.3030697703361511 - f1-score (micro avg) 0.3261
2023-10-18 18:22:41,367 saving best model
2023-10-18 18:22:41,411 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:42,431 epoch 5 - iter 44/447 - loss 0.32935993 - time (sec): 1.02 - samples/sec: 8650.32 - lr: 0.000033 - momentum: 0.000000
2023-10-18 18:22:43,415 epoch 5 - iter 88/447 - loss 0.31978926 - time (sec): 2.00 - samples/sec: 8326.37 - lr: 0.000032 - momentum: 0.000000
2023-10-18 18:22:44,768 epoch 5 - iter 132/447 - loss 0.32434382 - time (sec): 3.36 - samples/sec: 7570.99 - lr: 0.000032 - momentum: 0.000000
2023-10-18 18:22:45,779 epoch 5 - iter 176/447 - loss 0.32626966 - time (sec): 4.37 - samples/sec: 7906.10 - lr: 0.000031 - momentum: 0.000000
2023-10-18 18:22:46,731 epoch 5 - iter 220/447 - loss 0.32079104 - time (sec): 5.32 - samples/sec: 8117.80 - lr: 0.000031 - momentum: 0.000000
2023-10-18 18:22:47,707 epoch 5 - iter 264/447 - loss 0.32647442 - time (sec): 6.30 - samples/sec: 8199.87 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:22:48,633 epoch 5 - iter 308/447 - loss 0.33184709 - time (sec): 7.22 - samples/sec: 8204.55 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:22:49,467 epoch 5 - iter 352/447 - loss 0.33135495 - time (sec): 8.06 - samples/sec: 8350.69 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:22:50,382 epoch 5 - iter 396/447 - loss 0.32921133 - time (sec): 8.97 - samples/sec: 8431.60 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:22:51,434 epoch 5 - iter 440/447 - loss 0.33252622 - time (sec): 10.02 - samples/sec: 8520.37 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:22:51,599 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:51,599 EPOCH 5 done: loss 0.3314 - lr: 0.000028
2023-10-18 18:22:56,580 DEV : loss 0.2966790497303009 - f1-score (micro avg) 0.3432
2023-10-18 18:22:56,605 saving best model
2023-10-18 18:22:56,646 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:57,639 epoch 6 - iter 44/447 - loss 0.32205009 - time (sec): 0.99 - samples/sec: 7882.22 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:22:58,675 epoch 6 - iter 88/447 - loss 0.28851192 - time (sec): 2.03 - samples/sec: 8518.03 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:22:59,657 epoch 6 - iter 132/447 - loss 0.27935474 - time (sec): 3.01 - samples/sec: 8308.57 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:23:00,674 epoch 6 - iter 176/447 - loss 0.29305316 - time (sec): 4.03 - samples/sec: 8517.53 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:23:01,689 epoch 6 - iter 220/447 - loss 0.29344782 - time (sec): 5.04 - samples/sec: 8600.65 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:23:02,693 epoch 6 - iter 264/447 - loss 0.29557160 - time (sec): 6.05 - samples/sec: 8512.17 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:23:03,653 epoch 6 - iter 308/447 - loss 0.29555013 - time (sec): 7.01 - samples/sec: 8482.76 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:23:04,667 epoch 6 - iter 352/447 - loss 0.29649384 - time (sec): 8.02 - samples/sec: 8570.31 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:23:05,648 epoch 6 - iter 396/447 - loss 0.29505917 - time (sec): 9.00 - samples/sec: 8570.39 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:23:06,640 epoch 6 - iter 440/447 - loss 0.30654783 - time (sec): 9.99 - samples/sec: 8556.98 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:23:06,794 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:06,794 EPOCH 6 done: loss 0.3076 - lr: 0.000022
2023-10-18 18:23:12,104 DEV : loss 0.2881721258163452 - f1-score (micro avg) 0.3428
2023-10-18 18:23:12,130 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:13,173 epoch 7 - iter 44/447 - loss 0.24532264 - time (sec): 1.04 - samples/sec: 8882.55 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:23:14,184 epoch 7 - iter 88/447 - loss 0.27568476 - time (sec): 2.05 - samples/sec: 8486.80 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:23:15,283 epoch 7 - iter 132/447 - loss 0.29794749 - time (sec): 3.15 - samples/sec: 8599.07 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:23:16,312 epoch 7 - iter 176/447 - loss 0.29956954 - time (sec): 4.18 - samples/sec: 8565.70 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:23:17,376 epoch 7 - iter 220/447 - loss 0.30213892 - time (sec): 5.25 - samples/sec: 8384.24 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:23:18,349 epoch 7 - iter 264/447 - loss 0.30357337 - time (sec): 6.22 - samples/sec: 8377.41 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:23:19,377 epoch 7 - iter 308/447 - loss 0.29758405 - time (sec): 7.25 - samples/sec: 8318.63 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:23:20,358 epoch 7 - iter 352/447 - loss 0.29898433 - time (sec): 8.23 - samples/sec: 8353.32 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:23:21,384 epoch 7 - iter 396/447 - loss 0.29612803 - time (sec): 9.25 - samples/sec: 8325.21 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:23:22,424 epoch 7 - iter 440/447 - loss 0.29754448 - time (sec): 10.29 - samples/sec: 8276.01 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:23:22,589 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:22,589 EPOCH 7 done: loss 0.2978 - lr: 0.000017
2023-10-18 18:23:27,857 DEV : loss 0.2950444519519806 - f1-score (micro avg) 0.3478
2023-10-18 18:23:27,881 saving best model
2023-10-18 18:23:27,915 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:28,944 epoch 8 - iter 44/447 - loss 0.29022344 - time (sec): 1.03 - samples/sec: 9194.27 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:23:29,939 epoch 8 - iter 88/447 - loss 0.27818831 - time (sec): 2.02 - samples/sec: 8694.64 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:23:30,925 epoch 8 - iter 132/447 - loss 0.29183782 - time (sec): 3.01 - samples/sec: 8683.21 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:23:31,934 epoch 8 - iter 176/447 - loss 0.29768575 - time (sec): 4.02 - samples/sec: 8511.47 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:23:32,961 epoch 8 - iter 220/447 - loss 0.30091732 - time (sec): 5.05 - samples/sec: 8435.25 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:23:33,943 epoch 8 - iter 264/447 - loss 0.29597312 - time (sec): 6.03 - samples/sec: 8442.64 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:23:34,941 epoch 8 - iter 308/447 - loss 0.29271218 - time (sec): 7.03 - samples/sec: 8347.80 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:23:35,941 epoch 8 - iter 352/447 - loss 0.29100478 - time (sec): 8.03 - samples/sec: 8386.27 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:23:36,957 epoch 8 - iter 396/447 - loss 0.28742095 - time (sec): 9.04 - samples/sec: 8390.58 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:23:38,050 epoch 8 - iter 440/447 - loss 0.28930722 - time (sec): 10.13 - samples/sec: 8390.30 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:23:38,225 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:38,225 EPOCH 8 done: loss 0.2878 - lr: 0.000011
2023-10-18 18:23:43,523 DEV : loss 0.29474112391471863 - f1-score (micro avg) 0.3473
2023-10-18 18:23:43,547 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:44,548 epoch 9 - iter 44/447 - loss 0.25740795 - time (sec): 1.00 - samples/sec: 8141.93 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:23:45,525 epoch 9 - iter 88/447 - loss 0.27278712 - time (sec): 1.98 - samples/sec: 7858.76 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:23:46,560 epoch 9 - iter 132/447 - loss 0.26541675 - time (sec): 3.01 - samples/sec: 8036.26 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:23:47,629 epoch 9 - iter 176/447 - loss 0.27555671 - time (sec): 4.08 - samples/sec: 7930.21 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:23:48,717 epoch 9 - iter 220/447 - loss 0.27547894 - time (sec): 5.17 - samples/sec: 8030.67 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:23:49,774 epoch 9 - iter 264/447 - loss 0.27439270 - time (sec): 6.23 - samples/sec: 8141.15 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:23:50,822 epoch 9 - iter 308/447 - loss 0.27018164 - time (sec): 7.27 - samples/sec: 8134.64 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:23:51,870 epoch 9 - iter 352/447 - loss 0.26807567 - time (sec): 8.32 - samples/sec: 8095.54 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:23:52,962 epoch 9 - iter 396/447 - loss 0.27305200 - time (sec): 9.41 - samples/sec: 8151.44 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:23:53,986 epoch 9 - iter 440/447 - loss 0.27534298 - time (sec): 10.44 - samples/sec: 8199.93 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:23:54,132 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:54,133 EPOCH 9 done: loss 0.2753 - lr: 0.000006
2023-10-18 18:23:59,472 DEV : loss 0.2967955470085144 - f1-score (micro avg) 0.3451
2023-10-18 18:23:59,496 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:00,570 epoch 10 - iter 44/447 - loss 0.31147788 - time (sec): 1.07 - samples/sec: 7567.81 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:24:01,581 epoch 10 - iter 88/447 - loss 0.29350164 - time (sec): 2.08 - samples/sec: 7864.46 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:24:02,585 epoch 10 - iter 132/447 - loss 0.27414972 - time (sec): 3.09 - samples/sec: 7968.43 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:24:03,541 epoch 10 - iter 176/447 - loss 0.27134812 - time (sec): 4.05 - samples/sec: 8107.61 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:24:04,542 epoch 10 - iter 220/447 - loss 0.27511632 - time (sec): 5.05 - samples/sec: 8012.55 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:24:05,574 epoch 10 - iter 264/447 - loss 0.26854414 - time (sec): 6.08 - samples/sec: 8070.76 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:24:06,537 epoch 10 - iter 308/447 - loss 0.27296262 - time (sec): 7.04 - samples/sec: 8110.53 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:24:07,641 epoch 10 - iter 352/447 - loss 0.27183740 - time (sec): 8.14 - samples/sec: 8302.87 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:24:08,693 epoch 10 - iter 396/447 - loss 0.27614075 - time (sec): 9.20 - samples/sec: 8270.70 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:24:09,741 epoch 10 - iter 440/447 - loss 0.27640189 - time (sec): 10.24 - samples/sec: 8271.79 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:24:09,932 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:09,932 EPOCH 10 done: loss 0.2761 - lr: 0.000000
2023-10-18 18:24:14,905 DEV : loss 0.2940215766429901 - f1-score (micro avg) 0.3506
2023-10-18 18:24:14,929 saving best model
2023-10-18 18:24:14,992 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:14,992 Loading model from best epoch ...
2023-10-18 18:24:15,067 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
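Note: the 21 tags follow the BIOES scheme: 5 entity types (loc, pers, org, prod, time) x 4 span positions (S-, B-, E-, I-) plus the O tag, i.e. 5 x 4 + 1 = 21, which matches the Linear(128 -> 21) output layer in the model printout.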
2023-10-18 18:24:17,305
Results:
- F-score (micro) 0.3601
- F-score (macro) 0.1639
- Accuracy 0.2303
By class:
              precision    recall  f1-score   support

         loc     0.5196    0.5554    0.5369       596
        pers     0.1712    0.2282    0.1956       333
         org     0.0000    0.0000    0.0000       132
        time     0.1500    0.0612    0.0870        49
        prod     0.0000    0.0000    0.0000        66

   micro avg     0.3724    0.3486    0.3601      1176
   macro avg     0.1682    0.1690    0.1639      1176
weighted avg     0.3181    0.3486    0.3311      1176
2023-10-18 18:24:17,305 ----------------------------------------------------------------------------------------------------
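Note: as a quick sanity check on the table above, the summary rows can be recomputed from the per-class values: macro F1 is the unweighted mean of the five class F1 scores, and micro F1 is the harmonic mean of micro precision and recall. A small Python sketch:

# Hedged sanity check: recompute the summary scores from the per-class table above.
per_class_f1 = [0.5369, 0.1956, 0.0000, 0.0870, 0.0000]   # loc, pers, org, time, prod
macro_f1 = sum(per_class_f1) / len(per_class_f1)          # -> 0.1639
micro_p, micro_r = 0.3724, 0.3486
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)    # -> 0.3601
print(round(macro_f1, 4), round(micro_f1, 4))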