2023-10-25 12:55:17,372 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,373 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 12:55:17,373 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,373 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 12:55:17,373 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,373 Train: 6183 sentences
2023-10-25 12:55:17,373 (train_with_dev=False, train_with_test=False)
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 Training Params:
2023-10-25 12:55:17,374 - learning_rate: "3e-05"
2023-10-25 12:55:17,374 - mini_batch_size: "8"
2023-10-25 12:55:17,374 - max_epochs: "10"
2023-10-25 12:55:17,374 - shuffle: "True"
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 Plugins:
2023-10-25 12:55:17,374 - TensorboardLogger
2023-10-25 12:55:17,374 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
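Note on reading the lr column in the iteration lines below: with the LinearScheduler and warmup_fraction 0.1, the learning rate climbs linearly to the peak of 3e-05 over the first 10% of the 7730 total steps (773 mini-batches x 10 epochs), then decays linearly to zero. A minimal sketch of that schedule (a hypothetical helper for illustration, not Flair's actual implementation):

```python
def linear_schedule_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup phase: ramp up proportionally to the step count
        return peak_lr * step / warmup_steps
    # decay phase: shrink linearly over the remaining steps
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

total = 773 * 10  # 773 mini-batches per epoch, 10 epochs in this run
lr_early = linear_schedule_lr(77, total)   # epoch 1, iter 77: ~3e-06, as logged
lr_peak = linear_schedule_lr(773, total)   # end of warmup: 3e-05
lr_final = linear_schedule_lr(total, total)  # last step: 0.0
```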
2023-10-25 12:55:17,374 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 12:55:17,374 - metric: "('micro avg', 'f1-score')"
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 Computation:
2023-10-25 12:55:17,374 - compute on device: cuda:0
2023-10-25 12:55:17,374 - embedding storage: none
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 ----------------------------------------------------------------------------------------------------
2023-10-25 12:55:17,374 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 12:55:22,325 epoch 1 - iter 77/773 - loss 1.93757916 - time (sec): 4.95 - samples/sec: 2633.95 - lr: 0.000003 - momentum: 0.000000
2023-10-25 12:55:27,147 epoch 1 - iter 154/773 - loss 1.11916091 - time (sec): 9.77 - samples/sec: 2623.41 - lr: 0.000006 - momentum: 0.000000
2023-10-25 12:55:31,976 epoch 1 - iter 231/773 - loss 0.82756714 - time (sec): 14.60 - samples/sec: 2575.99 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:55:36,821 epoch 1 - iter 308/773 - loss 0.66515291 - time (sec): 19.45 - samples/sec: 2548.10 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:55:41,560 epoch 1 - iter 385/773 - loss 0.56018412 - time (sec): 24.18 - samples/sec: 2536.59 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:55:46,415 epoch 1 - iter 462/773 - loss 0.48843856 - time (sec): 29.04 - samples/sec: 2539.21 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:55:51,206 epoch 1 - iter 539/773 - loss 0.43466720 - time (sec): 33.83 - samples/sec: 2528.25 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:55:56,079 epoch 1 - iter 616/773 - loss 0.39209967 - time (sec): 38.70 - samples/sec: 2530.25 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:56:00,886 epoch 1 - iter 693/773 - loss 0.35744237 - time (sec): 43.51 - samples/sec: 2545.35 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:56:05,722 epoch 1 - iter 770/773 - loss 0.32866683 - time (sec): 48.35 - samples/sec: 2563.64 - lr: 0.000030 - momentum: 0.000000
2023-10-25 12:56:05,903 ----------------------------------------------------------------------------------------------------
2023-10-25 12:56:05,903 EPOCH 1 done: loss 0.3279 - lr: 0.000030
2023-10-25 12:56:09,118 DEV : loss 0.04656985402107239 - f1-score (micro avg) 0.7598
2023-10-25 12:56:09,136 saving best model
2023-10-25 12:56:09,633 ----------------------------------------------------------------------------------------------------
2023-10-25 12:56:14,324 epoch 2 - iter 77/773 - loss 0.08563895 - time (sec): 4.69 - samples/sec: 2443.89 - lr: 0.000030 - momentum: 0.000000
2023-10-25 12:56:19,149 epoch 2 - iter 154/773 - loss 0.07537749 - time (sec): 9.51 - samples/sec: 2436.76 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:56:24,014 epoch 2 - iter 231/773 - loss 0.07532105 - time (sec): 14.38 - samples/sec: 2440.29 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:56:28,747 epoch 2 - iter 308/773 - loss 0.07738630 - time (sec): 19.11 - samples/sec: 2509.80 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:56:33,915 epoch 2 - iter 385/773 - loss 0.07528257 - time (sec): 24.28 - samples/sec: 2523.33 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:56:38,709 epoch 2 - iter 462/773 - loss 0.07420890 - time (sec): 29.07 - samples/sec: 2528.91 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:56:43,381 epoch 2 - iter 539/773 - loss 0.07258413 - time (sec): 33.75 - samples/sec: 2581.48 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:56:48,044 epoch 2 - iter 616/773 - loss 0.07168966 - time (sec): 38.41 - samples/sec: 2585.65 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:56:52,727 epoch 2 - iter 693/773 - loss 0.07240517 - time (sec): 43.09 - samples/sec: 2583.59 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:56:57,388 epoch 2 - iter 770/773 - loss 0.07049818 - time (sec): 47.75 - samples/sec: 2593.56 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:56:57,548 ----------------------------------------------------------------------------------------------------
2023-10-25 12:56:57,548 EPOCH 2 done: loss 0.0705 - lr: 0.000027
2023-10-25 12:57:00,100 DEV : loss 0.05241599678993225 - f1-score (micro avg) 0.766
2023-10-25 12:57:00,121 saving best model
2023-10-25 12:57:00,787 ----------------------------------------------------------------------------------------------------
2023-10-25 12:57:05,053 epoch 3 - iter 77/773 - loss 0.03937428 - time (sec): 4.26 - samples/sec: 2830.87 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:57:10,111 epoch 3 - iter 154/773 - loss 0.03820952 - time (sec): 9.32 - samples/sec: 2597.71 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:57:14,565 epoch 3 - iter 231/773 - loss 0.04003269 - time (sec): 13.77 - samples/sec: 2757.09 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:57:18,915 epoch 3 - iter 308/773 - loss 0.04207363 - time (sec): 18.12 - samples/sec: 2736.31 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:57:23,359 epoch 3 - iter 385/773 - loss 0.04520442 - time (sec): 22.57 - samples/sec: 2747.11 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:57:27,600 epoch 3 - iter 462/773 - loss 0.04505890 - time (sec): 26.81 - samples/sec: 2750.23 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:57:31,826 epoch 3 - iter 539/773 - loss 0.04438068 - time (sec): 31.04 - samples/sec: 2778.92 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:57:36,080 epoch 3 - iter 616/773 - loss 0.04412004 - time (sec): 35.29 - samples/sec: 2803.10 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:57:40,438 epoch 3 - iter 693/773 - loss 0.04308429 - time (sec): 39.65 - samples/sec: 2803.93 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:57:44,861 epoch 3 - iter 770/773 - loss 0.04355937 - time (sec): 44.07 - samples/sec: 2811.96 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:57:45,041 ----------------------------------------------------------------------------------------------------
2023-10-25 12:57:45,042 EPOCH 3 done: loss 0.0435 - lr: 0.000023
2023-10-25 12:57:47,526 DEV : loss 0.07553908228874207 - f1-score (micro avg) 0.7581
2023-10-25 12:57:47,545 ----------------------------------------------------------------------------------------------------
2023-10-25 12:57:52,456 epoch 4 - iter 77/773 - loss 0.02787168 - time (sec): 4.91 - samples/sec: 2464.84 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:57:57,409 epoch 4 - iter 154/773 - loss 0.02466669 - time (sec): 9.86 - samples/sec: 2463.70 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:58:01,920 epoch 4 - iter 231/773 - loss 0.02719373 - time (sec): 14.37 - samples/sec: 2567.69 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:58:06,111 epoch 4 - iter 308/773 - loss 0.02590039 - time (sec): 18.56 - samples/sec: 2641.33 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:58:10,335 epoch 4 - iter 385/773 - loss 0.02730470 - time (sec): 22.79 - samples/sec: 2649.13 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:58:14,488 epoch 4 - iter 462/773 - loss 0.02758635 - time (sec): 26.94 - samples/sec: 2667.62 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:58:19,096 epoch 4 - iter 539/773 - loss 0.02908079 - time (sec): 31.55 - samples/sec: 2686.53 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:58:23,572 epoch 4 - iter 616/773 - loss 0.02974028 - time (sec): 36.03 - samples/sec: 2711.79 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:58:27,965 epoch 4 - iter 693/773 - loss 0.02923596 - time (sec): 40.42 - samples/sec: 2751.85 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:58:32,489 epoch 4 - iter 770/773 - loss 0.02839604 - time (sec): 44.94 - samples/sec: 2757.19 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:58:32,667 ----------------------------------------------------------------------------------------------------
2023-10-25 12:58:32,667 EPOCH 4 done: loss 0.0285 - lr: 0.000020
2023-10-25 12:58:35,305 DEV : loss 0.10123448073863983 - f1-score (micro avg) 0.7588
2023-10-25 12:58:35,322 ----------------------------------------------------------------------------------------------------
2023-10-25 12:58:40,130 epoch 5 - iter 77/773 - loss 0.01818218 - time (sec): 4.81 - samples/sec: 2724.06 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:58:44,779 epoch 5 - iter 154/773 - loss 0.01934967 - time (sec): 9.45 - samples/sec: 2642.61 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:58:49,349 epoch 5 - iter 231/773 - loss 0.01860970 - time (sec): 14.02 - samples/sec: 2622.37 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:58:53,878 epoch 5 - iter 308/773 - loss 0.01852623 - time (sec): 18.55 - samples/sec: 2604.00 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:58:58,223 epoch 5 - iter 385/773 - loss 0.02029059 - time (sec): 22.90 - samples/sec: 2677.70 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:59:02,519 epoch 5 - iter 462/773 - loss 0.01982896 - time (sec): 27.19 - samples/sec: 2678.43 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:59:06,948 epoch 5 - iter 539/773 - loss 0.01903206 - time (sec): 31.62 - samples/sec: 2682.60 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:59:11,469 epoch 5 - iter 616/773 - loss 0.01983065 - time (sec): 36.14 - samples/sec: 2715.33 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:59:15,959 epoch 5 - iter 693/773 - loss 0.01906685 - time (sec): 40.63 - samples/sec: 2734.27 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:59:20,380 epoch 5 - iter 770/773 - loss 0.01928792 - time (sec): 45.06 - samples/sec: 2749.87 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:59:20,549 ----------------------------------------------------------------------------------------------------
2023-10-25 12:59:20,549 EPOCH 5 done: loss 0.0193 - lr: 0.000017
2023-10-25 12:59:23,132 DEV : loss 0.09339083731174469 - f1-score (micro avg) 0.7702
2023-10-25 12:59:23,152 saving best model
2023-10-25 12:59:23,843 ----------------------------------------------------------------------------------------------------
2023-10-25 12:59:28,319 epoch 6 - iter 77/773 - loss 0.01168986 - time (sec): 4.47 - samples/sec: 2758.77 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:59:32,805 epoch 6 - iter 154/773 - loss 0.01402440 - time (sec): 8.96 - samples/sec: 2807.18 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:59:37,192 epoch 6 - iter 231/773 - loss 0.01359986 - time (sec): 13.35 - samples/sec: 2773.09 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:59:41,538 epoch 6 - iter 308/773 - loss 0.01576820 - time (sec): 17.69 - samples/sec: 2806.65 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:59:45,952 epoch 6 - iter 385/773 - loss 0.01553924 - time (sec): 22.11 - samples/sec: 2811.99 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:59:50,240 epoch 6 - iter 462/773 - loss 0.01476289 - time (sec): 26.40 - samples/sec: 2838.14 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:59:54,601 epoch 6 - iter 539/773 - loss 0.01422328 - time (sec): 30.76 - samples/sec: 2834.90 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:59:58,868 epoch 6 - iter 616/773 - loss 0.01396188 - time (sec): 35.02 - samples/sec: 2835.21 - lr: 0.000014 - momentum: 0.000000
2023-10-25 13:00:04,003 epoch 6 - iter 693/773 - loss 0.01434232 - time (sec): 40.16 - samples/sec: 2781.23 - lr: 0.000014 - momentum: 0.000000
2023-10-25 13:00:08,377 epoch 6 - iter 770/773 - loss 0.01390229 - time (sec): 44.53 - samples/sec: 2782.03 - lr: 0.000013 - momentum: 0.000000
2023-10-25 13:00:08,548 ----------------------------------------------------------------------------------------------------
2023-10-25 13:00:08,548 EPOCH 6 done: loss 0.0139 - lr: 0.000013
2023-10-25 13:00:11,523 DEV : loss 0.10927439481019974 - f1-score (micro avg) 0.7676
2023-10-25 13:00:11,545 ----------------------------------------------------------------------------------------------------
2023-10-25 13:00:16,317 epoch 7 - iter 77/773 - loss 0.00916644 - time (sec): 4.77 - samples/sec: 2721.74 - lr: 0.000013 - momentum: 0.000000
2023-10-25 13:00:21,088 epoch 7 - iter 154/773 - loss 0.00780367 - time (sec): 9.54 - samples/sec: 2623.17 - lr: 0.000013 - momentum: 0.000000
2023-10-25 13:00:26,000 epoch 7 - iter 231/773 - loss 0.00824754 - time (sec): 14.45 - samples/sec: 2616.04 - lr: 0.000012 - momentum: 0.000000
2023-10-25 13:00:30,653 epoch 7 - iter 308/773 - loss 0.00848303 - time (sec): 19.11 - samples/sec: 2639.15 - lr: 0.000012 - momentum: 0.000000
2023-10-25 13:00:35,271 epoch 7 - iter 385/773 - loss 0.00832052 - time (sec): 23.72 - samples/sec: 2617.13 - lr: 0.000012 - momentum: 0.000000
2023-10-25 13:00:39,832 epoch 7 - iter 462/773 - loss 0.00885479 - time (sec): 28.29 - samples/sec: 2634.34 - lr: 0.000011 - momentum: 0.000000
2023-10-25 13:00:44,324 epoch 7 - iter 539/773 - loss 0.00964508 - time (sec): 32.78 - samples/sec: 2650.42 - lr: 0.000011 - momentum: 0.000000
2023-10-25 13:00:49,078 epoch 7 - iter 616/773 - loss 0.00959482 - time (sec): 37.53 - samples/sec: 2638.35 - lr: 0.000011 - momentum: 0.000000
2023-10-25 13:00:53,873 epoch 7 - iter 693/773 - loss 0.00983066 - time (sec): 42.33 - samples/sec: 2657.55 - lr: 0.000010 - momentum: 0.000000
2023-10-25 13:00:58,356 epoch 7 - iter 770/773 - loss 0.00990440 - time (sec): 46.81 - samples/sec: 2648.46 - lr: 0.000010 - momentum: 0.000000
2023-10-25 13:00:58,529 ----------------------------------------------------------------------------------------------------
2023-10-25 13:00:58,529 EPOCH 7 done: loss 0.0099 - lr: 0.000010
2023-10-25 13:01:01,014 DEV : loss 0.11133752763271332 - f1-score (micro avg) 0.7647
2023-10-25 13:01:01,031 ----------------------------------------------------------------------------------------------------
2023-10-25 13:01:05,488 epoch 8 - iter 77/773 - loss 0.00796583 - time (sec): 4.46 - samples/sec: 2791.97 - lr: 0.000010 - momentum: 0.000000
2023-10-25 13:01:09,862 epoch 8 - iter 154/773 - loss 0.00803247 - time (sec): 8.83 - samples/sec: 2753.71 - lr: 0.000009 - momentum: 0.000000
2023-10-25 13:01:14,395 epoch 8 - iter 231/773 - loss 0.00591916 - time (sec): 13.36 - samples/sec: 2744.01 - lr: 0.000009 - momentum: 0.000000
2023-10-25 13:01:18,853 epoch 8 - iter 308/773 - loss 0.00604710 - time (sec): 17.82 - samples/sec: 2757.60 - lr: 0.000009 - momentum: 0.000000
2023-10-25 13:01:23,366 epoch 8 - iter 385/773 - loss 0.00748894 - time (sec): 22.33 - samples/sec: 2758.83 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:01:27,828 epoch 8 - iter 462/773 - loss 0.00792268 - time (sec): 26.80 - samples/sec: 2747.09 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:01:32,471 epoch 8 - iter 539/773 - loss 0.00729680 - time (sec): 31.44 - samples/sec: 2787.77 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:01:37,083 epoch 8 - iter 616/773 - loss 0.00697704 - time (sec): 36.05 - samples/sec: 2774.52 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:01:41,417 epoch 8 - iter 693/773 - loss 0.00685826 - time (sec): 40.38 - samples/sec: 2764.31 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:01:45,884 epoch 8 - iter 770/773 - loss 0.00645588 - time (sec): 44.85 - samples/sec: 2759.56 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:01:46,056 ----------------------------------------------------------------------------------------------------
2023-10-25 13:01:46,056 EPOCH 8 done: loss 0.0065 - lr: 0.000007
2023-10-25 13:01:48,732 DEV : loss 0.11849800497293472 - f1-score (micro avg) 0.7757
2023-10-25 13:01:48,749 saving best model
2023-10-25 13:01:49,470 ----------------------------------------------------------------------------------------------------
2023-10-25 13:01:53,926 epoch 9 - iter 77/773 - loss 0.00385767 - time (sec): 4.45 - samples/sec: 2952.48 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:01:58,418 epoch 9 - iter 154/773 - loss 0.00336774 - time (sec): 8.95 - samples/sec: 2784.84 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:02:02,980 epoch 9 - iter 231/773 - loss 0.00334359 - time (sec): 13.51 - samples/sec: 2796.57 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:02:07,317 epoch 9 - iter 308/773 - loss 0.00425217 - time (sec): 17.84 - samples/sec: 2778.14 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:02:11,731 epoch 9 - iter 385/773 - loss 0.00409554 - time (sec): 22.26 - samples/sec: 2785.82 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:02:16,194 epoch 9 - iter 462/773 - loss 0.00412034 - time (sec): 26.72 - samples/sec: 2794.35 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:02:20,762 epoch 9 - iter 539/773 - loss 0.00422574 - time (sec): 31.29 - samples/sec: 2794.67 - lr: 0.000004 - momentum: 0.000000
2023-10-25 13:02:25,322 epoch 9 - iter 616/773 - loss 0.00422062 - time (sec): 35.85 - samples/sec: 2797.81 - lr: 0.000004 - momentum: 0.000000
2023-10-25 13:02:29,701 epoch 9 - iter 693/773 - loss 0.00418772 - time (sec): 40.23 - samples/sec: 2793.58 - lr: 0.000004 - momentum: 0.000000
2023-10-25 13:02:33,944 epoch 9 - iter 770/773 - loss 0.00459974 - time (sec): 44.47 - samples/sec: 2787.88 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:02:34,101 ----------------------------------------------------------------------------------------------------
2023-10-25 13:02:34,101 EPOCH 9 done: loss 0.0046 - lr: 0.000003
2023-10-25 13:02:36,714 DEV : loss 0.12025374174118042 - f1-score (micro avg) 0.7708
2023-10-25 13:02:36,733 ----------------------------------------------------------------------------------------------------
2023-10-25 13:02:41,339 epoch 10 - iter 77/773 - loss 0.00335894 - time (sec): 4.60 - samples/sec: 2609.85 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:02:46,249 epoch 10 - iter 154/773 - loss 0.00503572 - time (sec): 9.51 - samples/sec: 2606.07 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:02:50,911 epoch 10 - iter 231/773 - loss 0.00379034 - time (sec): 14.18 - samples/sec: 2543.26 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:02:55,569 epoch 10 - iter 308/773 - loss 0.00329299 - time (sec): 18.83 - samples/sec: 2543.49 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:03:00,225 epoch 10 - iter 385/773 - loss 0.00325492 - time (sec): 23.49 - samples/sec: 2568.46 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:03:04,754 epoch 10 - iter 462/773 - loss 0.00345791 - time (sec): 28.02 - samples/sec: 2594.25 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:03:09,480 epoch 10 - iter 539/773 - loss 0.00316405 - time (sec): 32.75 - samples/sec: 2628.26 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:03:14,068 epoch 10 - iter 616/773 - loss 0.00348066 - time (sec): 37.33 - samples/sec: 2653.42 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:03:18,559 epoch 10 - iter 693/773 - loss 0.00321585 - time (sec): 41.83 - samples/sec: 2669.67 - lr: 0.000000 - momentum: 0.000000
2023-10-25 13:03:22,933 epoch 10 - iter 770/773 - loss 0.00290477 - time (sec): 46.20 - samples/sec: 2682.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 13:03:23,088 ----------------------------------------------------------------------------------------------------
2023-10-25 13:03:23,089 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-25 13:03:26,447 DEV : loss 0.12326761335134506 - f1-score (micro avg) 0.7702
2023-10-25 13:03:26,922 ----------------------------------------------------------------------------------------------------
2023-10-25 13:03:26,923 Loading model from best epoch ...
2023-10-25 13:03:28,638 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 13:03:37,408
Results:
- F-score (micro) 0.7792
- F-score (macro) 0.6761
- Accuracy 0.6601
By class:
              precision    recall  f1-score   support

         LOC     0.8471    0.8256    0.8362       946
    BUILDING     0.5414    0.4595    0.4971       185
      STREET     0.6613    0.7321    0.6949        56

   micro avg     0.7949    0.7641    0.7792      1187
   macro avg     0.6833    0.6724    0.6761      1187
weighted avg     0.7907    0.7641    0.7767      1187
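The aggregate rows above can be cross-checked from the per-class rows: micro avg F1 is the harmonic mean of the pooled precision and recall, while macro avg F1 is the unweighted mean of the per-class F1 scores. A quick sanity check in plain Python (illustrative only, not Flair code):

```python
def f1(precision, recall):
    # harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

# per-class f1-scores from the table above
per_class_f1 = {"LOC": 0.8362, "BUILDING": 0.4971, "STREET": 0.6949}

# micro avg f1 follows from the pooled precision/recall row
micro_f1 = f1(0.7949, 0.7641)

# macro avg f1 is the unweighted mean over classes
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(f"micro f1 ~ {micro_f1:.4f}, macro f1 ~ {macro_f1:.4f}")
# prints: micro f1 ~ 0.7792, macro f1 ~ 0.6761
```

The gap between micro (0.7792) and macro (0.6761) F1 reflects the class imbalance: LOC dominates the support (946 of 1187 entities) and scores much higher than BUILDING.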
2023-10-25 13:03:37,408 ----------------------------------------------------------------------------------------------------