Upload ./training.log with huggingface_hub (commit 7770fbb)
2023-10-25 11:32:17,897 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,898 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
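The printed shapes above are enough to estimate the size of the encoder. The following sketch (pure Python, an illustration derived from the log rather than anything the training run computed) sums the weights and biases per module:

```python
# Rough parameter count for the BertModel printed above, derived from
# the layer shapes in the log. LayerNorm contributes a scale and a
# shift vector per feature (2 * hidden parameters each).
hidden, inter, layers = 768, 3072, 12

def linear(n_in, n_out):
    return n_in * n_out + n_out  # weight matrix + bias vector

embeddings = (64001 * hidden        # word embeddings
              + 512 * hidden        # position embeddings
              + 2 * hidden          # token type embeddings
              + 2 * hidden)         # embedding LayerNorm

per_layer = (4 * linear(hidden, hidden)  # query, key, value, attention output
             + 2 * hidden                # attention LayerNorm
             + linear(hidden, inter)     # intermediate dense
             + linear(inter, hidden)     # output dense
             + 2 * hidden)               # output LayerNorm

pooler = linear(hidden, hidden)
bert_total = embeddings + layers * per_layer + pooler
print(bert_total)  # roughly 135M parameters
```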
2023-10-25 11:32:17,898 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Train: 6183 sentences
2023-10-25 11:32:17,899 (train_with_dev=False, train_with_test=False)
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Training Params:
2023-10-25 11:32:17,899 - learning_rate: "5e-05"
2023-10-25 11:32:17,899 - mini_batch_size: "8"
2023-10-25 11:32:17,899 - max_epochs: "10"
2023-10-25 11:32:17,899 - shuffle: "True"
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Plugins:
2023-10-25 11:32:17,899 - TensorboardLogger
2023-10-25 11:32:17,899 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
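The lr column in the iteration lines below follows directly from the LinearScheduler settings above: the first 10% of all steps warm up linearly to the peak of 5e-05, after which the rate decays linearly to zero. A small sketch (pure Python; the step counts 773 iterations × 10 epochs are taken from this log) reproduces the logged values:

```python
# Linear warmup + linear decay, matching warmup_fraction 0.1 and
# learning_rate 5e-05 from the Training Params / Plugins sections.
PEAK = 5e-05
TOTAL = 773 * 10           # iterations per epoch * max_epochs
WARMUP = int(TOTAL * 0.1)  # 773 warmup steps

def lr_at(step):
    if step <= WARMUP:
        return PEAK * step / WARMUP
    return PEAK * (TOTAL - step) / (TOTAL - WARMUP)

print(f"{lr_at(77):.6f}")         # epoch 1, iter 77  -> 0.000005
print(f"{lr_at(773 + 154):.6f}")  # epoch 2, iter 154 -> 0.000049
```

Both values match the corresponding `lr:` entries in the epoch 1 and epoch 2 iteration lines.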
2023-10-25 11:32:17,899 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 11:32:17,899 - metric: "('micro avg', 'f1-score')"
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Computation:
2023-10-25 11:32:17,899 - compute on device: cuda:0
2023-10-25 11:32:17,899 - embedding storage: none
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,900 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 11:32:22,467 epoch 1 - iter 77/773 - loss 1.55928995 - time (sec): 4.57 - samples/sec: 3117.25 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:32:26,899 epoch 1 - iter 154/773 - loss 0.94444348 - time (sec): 9.00 - samples/sec: 2895.74 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:32:31,544 epoch 1 - iter 231/773 - loss 0.68472071 - time (sec): 13.64 - samples/sec: 2835.90 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:32:36,101 epoch 1 - iter 308/773 - loss 0.54838258 - time (sec): 18.20 - samples/sec: 2792.56 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:32:40,729 epoch 1 - iter 385/773 - loss 0.46334242 - time (sec): 22.83 - samples/sec: 2770.44 - lr: 0.000025 - momentum: 0.000000
2023-10-25 11:32:45,076 epoch 1 - iter 462/773 - loss 0.41308200 - time (sec): 27.18 - samples/sec: 2740.33 - lr: 0.000030 - momentum: 0.000000
2023-10-25 11:32:49,556 epoch 1 - iter 539/773 - loss 0.37076947 - time (sec): 31.66 - samples/sec: 2735.35 - lr: 0.000035 - momentum: 0.000000
2023-10-25 11:32:53,999 epoch 1 - iter 616/773 - loss 0.33516167 - time (sec): 36.10 - samples/sec: 2746.05 - lr: 0.000040 - momentum: 0.000000
2023-10-25 11:32:58,447 epoch 1 - iter 693/773 - loss 0.30900288 - time (sec): 40.55 - samples/sec: 2748.87 - lr: 0.000045 - momentum: 0.000000
2023-10-25 11:33:03,589 epoch 1 - iter 770/773 - loss 0.28804151 - time (sec): 45.69 - samples/sec: 2708.60 - lr: 0.000050 - momentum: 0.000000
2023-10-25 11:33:03,765 ----------------------------------------------------------------------------------------------------
2023-10-25 11:33:03,766 EPOCH 1 done: loss 0.2873 - lr: 0.000050
2023-10-25 11:33:06,404 DEV : loss 0.06803968548774719 - f1-score (micro avg) 0.7209
2023-10-25 11:33:06,427 saving best model
2023-10-25 11:33:06,908 ----------------------------------------------------------------------------------------------------
2023-10-25 11:33:11,327 epoch 2 - iter 77/773 - loss 0.07623849 - time (sec): 4.42 - samples/sec: 2667.47 - lr: 0.000049 - momentum: 0.000000
2023-10-25 11:33:15,659 epoch 2 - iter 154/773 - loss 0.07951236 - time (sec): 8.75 - samples/sec: 2814.00 - lr: 0.000049 - momentum: 0.000000
2023-10-25 11:33:19,994 epoch 2 - iter 231/773 - loss 0.08070316 - time (sec): 13.08 - samples/sec: 2905.85 - lr: 0.000048 - momentum: 0.000000
2023-10-25 11:33:24,314 epoch 2 - iter 308/773 - loss 0.07668770 - time (sec): 17.40 - samples/sec: 2938.42 - lr: 0.000048 - momentum: 0.000000
2023-10-25 11:33:28,562 epoch 2 - iter 385/773 - loss 0.07601583 - time (sec): 21.65 - samples/sec: 2931.54 - lr: 0.000047 - momentum: 0.000000
2023-10-25 11:33:32,947 epoch 2 - iter 462/773 - loss 0.07553467 - time (sec): 26.04 - samples/sec: 2877.34 - lr: 0.000047 - momentum: 0.000000
2023-10-25 11:33:37,394 epoch 2 - iter 539/773 - loss 0.07726320 - time (sec): 30.48 - samples/sec: 2852.33 - lr: 0.000046 - momentum: 0.000000
2023-10-25 11:33:41,685 epoch 2 - iter 616/773 - loss 0.07687578 - time (sec): 34.77 - samples/sec: 2862.30 - lr: 0.000046 - momentum: 0.000000
2023-10-25 11:33:45,897 epoch 2 - iter 693/773 - loss 0.07690697 - time (sec): 38.99 - samples/sec: 2853.35 - lr: 0.000045 - momentum: 0.000000
2023-10-25 11:33:50,273 epoch 2 - iter 770/773 - loss 0.07652112 - time (sec): 43.36 - samples/sec: 2853.67 - lr: 0.000044 - momentum: 0.000000
2023-10-25 11:33:50,458 ----------------------------------------------------------------------------------------------------
2023-10-25 11:33:50,458 EPOCH 2 done: loss 0.0764 - lr: 0.000044
2023-10-25 11:33:54,305 DEV : loss 0.05765606090426445 - f1-score (micro avg) 0.6577
2023-10-25 11:33:54,328 ----------------------------------------------------------------------------------------------------
2023-10-25 11:33:58,799 epoch 3 - iter 77/773 - loss 0.04624094 - time (sec): 4.47 - samples/sec: 2773.78 - lr: 0.000044 - momentum: 0.000000
2023-10-25 11:34:03,308 epoch 3 - iter 154/773 - loss 0.04639348 - time (sec): 8.98 - samples/sec: 2763.86 - lr: 0.000043 - momentum: 0.000000
2023-10-25 11:34:07,837 epoch 3 - iter 231/773 - loss 0.04638161 - time (sec): 13.51 - samples/sec: 2848.46 - lr: 0.000043 - momentum: 0.000000
2023-10-25 11:34:12,399 epoch 3 - iter 308/773 - loss 0.04881426 - time (sec): 18.07 - samples/sec: 2807.68 - lr: 0.000042 - momentum: 0.000000
2023-10-25 11:34:17,014 epoch 3 - iter 385/773 - loss 0.05206470 - time (sec): 22.68 - samples/sec: 2748.41 - lr: 0.000042 - momentum: 0.000000
2023-10-25 11:34:21,488 epoch 3 - iter 462/773 - loss 0.05275003 - time (sec): 27.16 - samples/sec: 2704.18 - lr: 0.000041 - momentum: 0.000000
2023-10-25 11:34:25,906 epoch 3 - iter 539/773 - loss 0.05152024 - time (sec): 31.58 - samples/sec: 2736.60 - lr: 0.000041 - momentum: 0.000000
2023-10-25 11:34:30,464 epoch 3 - iter 616/773 - loss 0.05087172 - time (sec): 36.13 - samples/sec: 2732.63 - lr: 0.000040 - momentum: 0.000000
2023-10-25 11:34:35,078 epoch 3 - iter 693/773 - loss 0.05319417 - time (sec): 40.75 - samples/sec: 2731.67 - lr: 0.000039 - momentum: 0.000000
2023-10-25 11:34:39,720 epoch 3 - iter 770/773 - loss 0.05805865 - time (sec): 45.39 - samples/sec: 2731.10 - lr: 0.000039 - momentum: 0.000000
2023-10-25 11:34:39,889 ----------------------------------------------------------------------------------------------------
2023-10-25 11:34:39,889 EPOCH 3 done: loss 0.0581 - lr: 0.000039
2023-10-25 11:34:42,593 DEV : loss 0.09656655043363571 - f1-score (micro avg) 0.7287
2023-10-25 11:34:42,615 saving best model
2023-10-25 11:34:43,315 ----------------------------------------------------------------------------------------------------
2023-10-25 11:34:47,956 epoch 4 - iter 77/773 - loss 0.04468020 - time (sec): 4.64 - samples/sec: 2770.59 - lr: 0.000038 - momentum: 0.000000
2023-10-25 11:34:52,542 epoch 4 - iter 154/773 - loss 0.05357280 - time (sec): 9.22 - samples/sec: 2790.55 - lr: 0.000038 - momentum: 0.000000
2023-10-25 11:34:56,989 epoch 4 - iter 231/773 - loss 0.05948461 - time (sec): 13.67 - samples/sec: 2812.21 - lr: 0.000037 - momentum: 0.000000
2023-10-25 11:35:01,205 epoch 4 - iter 308/773 - loss 0.06530804 - time (sec): 17.89 - samples/sec: 2838.76 - lr: 0.000037 - momentum: 0.000000
2023-10-25 11:35:05,460 epoch 4 - iter 385/773 - loss 0.06132855 - time (sec): 22.14 - samples/sec: 2860.87 - lr: 0.000036 - momentum: 0.000000
2023-10-25 11:35:09,680 epoch 4 - iter 462/773 - loss 0.05751702 - time (sec): 26.36 - samples/sec: 2876.95 - lr: 0.000036 - momentum: 0.000000
2023-10-25 11:35:13,858 epoch 4 - iter 539/773 - loss 0.06019540 - time (sec): 30.54 - samples/sec: 2865.69 - lr: 0.000035 - momentum: 0.000000
2023-10-25 11:35:17,970 epoch 4 - iter 616/773 - loss 0.06214758 - time (sec): 34.65 - samples/sec: 2860.96 - lr: 0.000034 - momentum: 0.000000
2023-10-25 11:35:22,136 epoch 4 - iter 693/773 - loss 0.06374761 - time (sec): 38.82 - samples/sec: 2850.03 - lr: 0.000034 - momentum: 0.000000
2023-10-25 11:35:26,302 epoch 4 - iter 770/773 - loss 0.06105828 - time (sec): 42.98 - samples/sec: 2878.38 - lr: 0.000033 - momentum: 0.000000
2023-10-25 11:35:26,475 ----------------------------------------------------------------------------------------------------
2023-10-25 11:35:26,476 EPOCH 4 done: loss 0.0609 - lr: 0.000033
2023-10-25 11:35:29,209 DEV : loss 0.10266965627670288 - f1-score (micro avg) 0.7418
2023-10-25 11:35:29,226 saving best model
2023-10-25 11:35:29,827 ----------------------------------------------------------------------------------------------------
2023-10-25 11:35:34,167 epoch 5 - iter 77/773 - loss 0.07354317 - time (sec): 4.34 - samples/sec: 2703.66 - lr: 0.000033 - momentum: 0.000000
2023-10-25 11:35:38,533 epoch 5 - iter 154/773 - loss 0.05656659 - time (sec): 8.70 - samples/sec: 2754.18 - lr: 0.000032 - momentum: 0.000000
2023-10-25 11:35:43,064 epoch 5 - iter 231/773 - loss 0.06643580 - time (sec): 13.24 - samples/sec: 2772.49 - lr: 0.000032 - momentum: 0.000000
2023-10-25 11:35:47,551 epoch 5 - iter 308/773 - loss 0.06782891 - time (sec): 17.72 - samples/sec: 2771.55 - lr: 0.000031 - momentum: 0.000000
2023-10-25 11:35:52,054 epoch 5 - iter 385/773 - loss 0.06184014 - time (sec): 22.23 - samples/sec: 2727.18 - lr: 0.000031 - momentum: 0.000000
2023-10-25 11:35:56,527 epoch 5 - iter 462/773 - loss 0.05866160 - time (sec): 26.70 - samples/sec: 2749.72 - lr: 0.000030 - momentum: 0.000000
2023-10-25 11:36:00,938 epoch 5 - iter 539/773 - loss 0.06250747 - time (sec): 31.11 - samples/sec: 2755.48 - lr: 0.000029 - momentum: 0.000000
2023-10-25 11:36:05,448 epoch 5 - iter 616/773 - loss 0.06026997 - time (sec): 35.62 - samples/sec: 2756.11 - lr: 0.000029 - momentum: 0.000000
2023-10-25 11:36:09,984 epoch 5 - iter 693/773 - loss 0.05814121 - time (sec): 40.16 - samples/sec: 2749.97 - lr: 0.000028 - momentum: 0.000000
2023-10-25 11:36:14,741 epoch 5 - iter 770/773 - loss 0.05494553 - time (sec): 44.91 - samples/sec: 2759.99 - lr: 0.000028 - momentum: 0.000000
2023-10-25 11:36:14,907 ----------------------------------------------------------------------------------------------------
2023-10-25 11:36:14,907 EPOCH 5 done: loss 0.0549 - lr: 0.000028
2023-10-25 11:36:17,604 DEV : loss 0.1113104522228241 - f1-score (micro avg) 0.7619
2023-10-25 11:36:17,624 saving best model
2023-10-25 11:36:18,324 ----------------------------------------------------------------------------------------------------
2023-10-25 11:36:22,987 epoch 6 - iter 77/773 - loss 0.04865046 - time (sec): 4.66 - samples/sec: 2616.38 - lr: 0.000027 - momentum: 0.000000
2023-10-25 11:36:27,557 epoch 6 - iter 154/773 - loss 0.03475110 - time (sec): 9.23 - samples/sec: 2676.91 - lr: 0.000027 - momentum: 0.000000
2023-10-25 11:36:32,097 epoch 6 - iter 231/773 - loss 0.02956246 - time (sec): 13.77 - samples/sec: 2710.98 - lr: 0.000026 - momentum: 0.000000
2023-10-25 11:36:36,609 epoch 6 - iter 308/773 - loss 0.03026117 - time (sec): 18.28 - samples/sec: 2695.78 - lr: 0.000026 - momentum: 0.000000
2023-10-25 11:36:41,206 epoch 6 - iter 385/773 - loss 0.03316285 - time (sec): 22.88 - samples/sec: 2765.88 - lr: 0.000025 - momentum: 0.000000
2023-10-25 11:36:45,639 epoch 6 - iter 462/773 - loss 0.03323264 - time (sec): 27.31 - samples/sec: 2774.22 - lr: 0.000024 - momentum: 0.000000
2023-10-25 11:36:50,098 epoch 6 - iter 539/773 - loss 0.03456321 - time (sec): 31.77 - samples/sec: 2750.32 - lr: 0.000024 - momentum: 0.000000
2023-10-25 11:36:54,524 epoch 6 - iter 616/773 - loss 0.03464651 - time (sec): 36.20 - samples/sec: 2751.08 - lr: 0.000023 - momentum: 0.000000
2023-10-25 11:36:58,914 epoch 6 - iter 693/773 - loss 0.03425099 - time (sec): 40.59 - samples/sec: 2748.59 - lr: 0.000023 - momentum: 0.000000
2023-10-25 11:37:03,254 epoch 6 - iter 770/773 - loss 0.03297911 - time (sec): 44.93 - samples/sec: 2756.29 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:37:03,415 ----------------------------------------------------------------------------------------------------
2023-10-25 11:37:03,416 EPOCH 6 done: loss 0.0329 - lr: 0.000022
2023-10-25 11:37:06,006 DEV : loss 0.09328915923833847 - f1-score (micro avg) 0.7741
2023-10-25 11:37:06,024 saving best model
2023-10-25 11:37:06,667 ----------------------------------------------------------------------------------------------------
2023-10-25 11:37:11,201 epoch 7 - iter 77/773 - loss 0.02826965 - time (sec): 4.53 - samples/sec: 2689.74 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:37:16,301 epoch 7 - iter 154/773 - loss 0.02916549 - time (sec): 9.63 - samples/sec: 2579.06 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:37:20,597 epoch 7 - iter 231/773 - loss 0.02694630 - time (sec): 13.93 - samples/sec: 2688.77 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:37:25,060 epoch 7 - iter 308/773 - loss 0.02382137 - time (sec): 18.39 - samples/sec: 2700.08 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:37:29,623 epoch 7 - iter 385/773 - loss 0.02400061 - time (sec): 22.95 - samples/sec: 2674.46 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:37:34,323 epoch 7 - iter 462/773 - loss 0.02420007 - time (sec): 27.65 - samples/sec: 2692.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:37:39,239 epoch 7 - iter 539/773 - loss 0.02350486 - time (sec): 32.57 - samples/sec: 2653.80 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:37:44,444 epoch 7 - iter 616/773 - loss 0.02263543 - time (sec): 37.77 - samples/sec: 2600.25 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:37:48,997 epoch 7 - iter 693/773 - loss 0.02248745 - time (sec): 42.33 - samples/sec: 2624.42 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:37:53,293 epoch 7 - iter 770/773 - loss 0.02276205 - time (sec): 46.62 - samples/sec: 2656.63 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:37:53,458 ----------------------------------------------------------------------------------------------------
2023-10-25 11:37:53,458 EPOCH 7 done: loss 0.0227 - lr: 0.000017
2023-10-25 11:37:56,498 DEV : loss 0.09706912189722061 - f1-score (micro avg) 0.786
2023-10-25 11:37:56,520 saving best model
2023-10-25 11:37:57,194 ----------------------------------------------------------------------------------------------------
2023-10-25 11:38:01,852 epoch 8 - iter 77/773 - loss 0.00963689 - time (sec): 4.65 - samples/sec: 2569.03 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:38:06,649 epoch 8 - iter 154/773 - loss 0.01048478 - time (sec): 9.45 - samples/sec: 2646.74 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:38:11,176 epoch 8 - iter 231/773 - loss 0.01023261 - time (sec): 13.98 - samples/sec: 2696.04 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:38:15,527 epoch 8 - iter 308/773 - loss 0.01078646 - time (sec): 18.33 - samples/sec: 2741.57 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:38:19,868 epoch 8 - iter 385/773 - loss 0.01293255 - time (sec): 22.67 - samples/sec: 2762.41 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:38:24,519 epoch 8 - iter 462/773 - loss 0.01414220 - time (sec): 27.32 - samples/sec: 2713.20 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:38:29,076 epoch 8 - iter 539/773 - loss 0.01443726 - time (sec): 31.88 - samples/sec: 2720.03 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:38:33,590 epoch 8 - iter 616/773 - loss 0.01493887 - time (sec): 36.39 - samples/sec: 2719.16 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:38:38,269 epoch 8 - iter 693/773 - loss 0.01475114 - time (sec): 41.07 - samples/sec: 2706.28 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:38:42,783 epoch 8 - iter 770/773 - loss 0.01430847 - time (sec): 45.58 - samples/sec: 2712.30 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:38:42,957 ----------------------------------------------------------------------------------------------------
2023-10-25 11:38:42,958 EPOCH 8 done: loss 0.0143 - lr: 0.000011
2023-10-25 11:38:45,598 DEV : loss 0.10801413655281067 - f1-score (micro avg) 0.7515
2023-10-25 11:38:45,615 ----------------------------------------------------------------------------------------------------
2023-10-25 11:38:50,161 epoch 9 - iter 77/773 - loss 0.00466196 - time (sec): 4.54 - samples/sec: 2698.86 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:38:54,650 epoch 9 - iter 154/773 - loss 0.00740543 - time (sec): 9.03 - samples/sec: 2702.12 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:38:59,263 epoch 9 - iter 231/773 - loss 0.00858285 - time (sec): 13.65 - samples/sec: 2678.11 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:39:03,513 epoch 9 - iter 308/773 - loss 0.00944576 - time (sec): 17.90 - samples/sec: 2731.89 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:39:08,048 epoch 9 - iter 385/773 - loss 0.00881598 - time (sec): 22.43 - samples/sec: 2778.28 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:39:12,207 epoch 9 - iter 462/773 - loss 0.00893348 - time (sec): 26.59 - samples/sec: 2804.33 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:39:16,419 epoch 9 - iter 539/773 - loss 0.00944343 - time (sec): 30.80 - samples/sec: 2818.54 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:39:20,930 epoch 9 - iter 616/773 - loss 0.00960352 - time (sec): 35.31 - samples/sec: 2824.99 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:39:25,174 epoch 9 - iter 693/773 - loss 0.00940583 - time (sec): 39.56 - samples/sec: 2841.88 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:39:29,510 epoch 9 - iter 770/773 - loss 0.00955294 - time (sec): 43.89 - samples/sec: 2821.72 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:39:29,686 ----------------------------------------------------------------------------------------------------
2023-10-25 11:39:29,687 EPOCH 9 done: loss 0.0095 - lr: 0.000006
2023-10-25 11:39:32,704 DEV : loss 0.11337698251008987 - f1-score (micro avg) 0.7683
2023-10-25 11:39:32,724 ----------------------------------------------------------------------------------------------------
2023-10-25 11:39:37,694 epoch 10 - iter 77/773 - loss 0.00238185 - time (sec): 4.97 - samples/sec: 2756.02 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:39:42,114 epoch 10 - iter 154/773 - loss 0.00380767 - time (sec): 9.39 - samples/sec: 2716.28 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:39:46,400 epoch 10 - iter 231/773 - loss 0.00374973 - time (sec): 13.68 - samples/sec: 2800.44 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:39:50,769 epoch 10 - iter 308/773 - loss 0.00342233 - time (sec): 18.04 - samples/sec: 2807.72 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:39:54,989 epoch 10 - iter 385/773 - loss 0.00446767 - time (sec): 22.26 - samples/sec: 2856.40 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:39:59,798 epoch 10 - iter 462/773 - loss 0.00465055 - time (sec): 27.07 - samples/sec: 2805.78 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:40:03,966 epoch 10 - iter 539/773 - loss 0.00454917 - time (sec): 31.24 - samples/sec: 2810.46 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:40:08,229 epoch 10 - iter 616/773 - loss 0.00492510 - time (sec): 35.50 - samples/sec: 2808.44 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:40:12,467 epoch 10 - iter 693/773 - loss 0.00477115 - time (sec): 39.74 - samples/sec: 2816.91 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:40:16,634 epoch 10 - iter 770/773 - loss 0.00517975 - time (sec): 43.91 - samples/sec: 2821.68 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:40:16,791 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:16,791 EPOCH 10 done: loss 0.0052 - lr: 0.000000
2023-10-25 11:40:20,311 DEV : loss 0.12045777589082718 - f1-score (micro avg) 0.7648
2023-10-25 11:40:20,827 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:20,828 Loading model from best epoch ...
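Flair overwrites best-model.pt whenever the dev score improves (the "saving best model" lines above); epoch 7's dev micro-F1 of 0.786 was the last improvement, so that checkpoint is the one reloaded here. The selection amounts to an argmax over the per-epoch dev scores (values copied from the DEV lines in this log):

```python
# Dev micro-F1 per epoch, epochs 1-10, copied from the DEV lines above.
dev_f1 = [0.7209, 0.6577, 0.7287, 0.7418, 0.7619,
          0.7741, 0.7860, 0.7515, 0.7683, 0.7648]

best_epoch = max(range(len(dev_f1)), key=dev_f1.__getitem__) + 1
print(best_epoch)  # -> 7: best-model.pt was last written after epoch 7
```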
2023-10-25 11:40:22,671 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
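The 13 tags form a BIOES scheme over the three entity types LOC, BUILDING and STREET. A minimal decoder (a pure-Python sketch, not Flair's actual implementation) shows how such a tag sequence maps to entity spans:

```python
# Decode a BIOES tag sequence into (label, start, end) token spans.
# S- marks a single-token entity; B-/I-/E- mark the begin, inside and
# end tokens of a multi-token entity; O is outside any entity.
def bioes_spans(tags):
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag.startswith("S-"):
            spans.append((tag[2:], i, i))
        elif tag.startswith("B-"):
            start = i
        elif tag.startswith("E-") and start is not None:
            spans.append((tag[2:], start, i))
            start = None
    return spans

print(bioes_spans(["O", "S-LOC", "B-STREET", "I-STREET", "E-STREET", "O"]))
# -> [('LOC', 1, 1), ('STREET', 2, 4)]
```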
2023-10-25 11:40:33,409
Results:
- F-score (micro) 0.7908
- F-score (macro) 0.6692
- Accuracy 0.6765
By class:
              precision    recall  f1-score   support

         LOC     0.8141    0.8562    0.8346       946
    BUILDING     0.6667    0.5622    0.6100       185
      STREET     0.6170    0.5179    0.5631        56

   micro avg     0.7871    0.7944    0.7908      1187
   macro avg     0.6993    0.6454    0.6692      1187
weighted avg     0.7818    0.7944    0.7868      1187
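The micro average scores every predicted entity equally, which is why it tracks the dominant LOC class; the macro average treats the three class scores as equals, and the weighted average reweights them by support. The macro and weighted rows can be reproduced from the per-class rows (a pure-Python check of the table above):

```python
# Per-class (precision, recall, f1, support) from the table above.
classes = {
    "LOC":      (0.8141, 0.8562, 0.8346, 946),
    "BUILDING": (0.6667, 0.5622, 0.6100, 185),
    "STREET":   (0.6170, 0.5179, 0.5631, 56),
}

n = sum(s for *_, s in classes.values())                          # total support
macro_f1 = sum(f1 for _, _, f1, _ in classes.values()) / len(classes)
weighted_f1 = sum(f1 * s for _, _, f1, s in classes.values()) / n

print(round(macro_f1, 4), round(weighted_f1, 4))  # 0.6692 0.7868
```

Both values agree with the macro avg and weighted avg f1-score entries in the table.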
2023-10-25 11:40:33,410 ----------------------------------------------------------------------------------------------------