2023-10-25 10:46:23,096 ----------------------------------------------------------------------------------------------------
2023-10-25 10:46:23,097 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 10:46:23,097 ----------------------------------------------------------------------------------------------------
2023-10-25 10:46:23,098 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 10:46:23,098 ----------------------------------------------------------------------------------------------------
2023-10-25 10:46:23,098 Train: 6183 sentences
2023-10-25 10:46:23,098 (train_with_dev=False, train_with_test=False)
2023-10-25 10:46:23,098 ----------------------------------------------------------------------------------------------------
2023-10-25 10:46:23,098 Training Params:
2023-10-25 10:46:23,098 - learning_rate: "5e-05"
2023-10-25 10:46:23,098 - mini_batch_size: "8"
2023-10-25 10:46:23,098 - max_epochs: "10"
2023-10-25 10:46:23,098 - shuffle: "True"
2023-10-25 10:46:23,098 ----------------------------------------------------------------------------------------------------
2023-10-25 10:46:23,098 Plugins:
2023-10-25 10:46:23,098 - TensorboardLogger
2023-10-25 10:46:23,098 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 10:46:23,098 ----------------------------------------------------------------------------------------------------
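The lr column in the per-iteration lines below follows the LinearScheduler plugin's shape: linear warmup over the first 10% of the 7,730 total optimizer steps (773 iterations/epoch x 10 epochs, so warmup spans exactly epoch 1), then linear decay to zero. A minimal sketch that reproduces the logged values; the function `linear_lr` and its step bookkeeping are illustrative, not Flair's internals:

```python
# Linear warmup + linear decay, per the Training Params / Plugins sections:
# peak lr 5e-05, warmup_fraction 0.1, 773 iterations per epoch, 10 epochs.
PEAK_LR = 5e-05
ITERS_PER_EPOCH = 773
TOTAL_STEPS = ITERS_PER_EPOCH * 10       # 7730
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)    # 773 = exactly the first epoch

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates (illustrative sketch)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Epoch 1, iter 77  -> global step 77:   ~4.98e-06, logged as 0.000005
# Epoch 2, iter 770 -> global step 1543: ~4.45e-05, logged as 0.000044
# Epoch 10, iter 770 -> global step 7727: ~2e-08,   logged as 0.000000
```

This matches the log: lr climbs 0.000005 → 0.000050 across epoch 1, then decays linearly until it reaches 0.000000 at the end of epoch 10.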
2023-10-25 10:46:23,098 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 10:46:23,098 - metric: "('micro avg', 'f1-score')"
2023-10-25 10:46:23,098 ----------------------------------------------------------------------------------------------------
2023-10-25 10:46:23,098 Computation:
2023-10-25 10:46:23,098 - compute on device: cuda:0
2023-10-25 10:46:23,098 - embedding storage: none
2023-10-25 10:46:23,098 ----------------------------------------------------------------------------------------------------
2023-10-25 10:46:23,098 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 10:46:23,098 ----------------------------------------------------------------------------------------------------
2023-10-25 10:46:23,099 ----------------------------------------------------------------------------------------------------
2023-10-25 10:46:23,099 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 10:46:28,075 epoch 1 - iter 77/773 - loss 1.64952430 - time (sec): 4.97 - samples/sec: 2562.70 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:46:33,104 epoch 1 - iter 154/773 - loss 0.93337294 - time (sec): 10.00 - samples/sec: 2490.60 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:46:37,967 epoch 1 - iter 231/773 - loss 0.67782129 - time (sec): 14.87 - samples/sec: 2485.70 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:46:42,883 epoch 1 - iter 308/773 - loss 0.53205948 - time (sec): 19.78 - samples/sec: 2518.33 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:46:47,510 epoch 1 - iter 385/773 - loss 0.44635492 - time (sec): 24.41 - samples/sec: 2548.09 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:46:52,119 epoch 1 - iter 462/773 - loss 0.39431926 - time (sec): 29.02 - samples/sec: 2548.56 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:46:56,856 epoch 1 - iter 539/773 - loss 0.35292625 - time (sec): 33.76 - samples/sec: 2572.32 - lr: 0.000035 - momentum: 0.000000
2023-10-25 10:47:01,538 epoch 1 - iter 616/773 - loss 0.31935913 - time (sec): 38.44 - samples/sec: 2591.35 - lr: 0.000040 - momentum: 0.000000
2023-10-25 10:47:06,067 epoch 1 - iter 693/773 - loss 0.29408255 - time (sec): 42.97 - samples/sec: 2603.74 - lr: 0.000045 - momentum: 0.000000
2023-10-25 10:47:11,234 epoch 1 - iter 770/773 - loss 0.27440561 - time (sec): 48.13 - samples/sec: 2572.37 - lr: 0.000050 - momentum: 0.000000
2023-10-25 10:47:11,419 ----------------------------------------------------------------------------------------------------
2023-10-25 10:47:11,419 EPOCH 1 done: loss 0.2740 - lr: 0.000050
2023-10-25 10:47:14,343 DEV : loss 0.06034430116415024 - f1-score (micro avg) 0.7484
2023-10-25 10:47:14,361 saving best model
2023-10-25 10:47:14,855 ----------------------------------------------------------------------------------------------------
2023-10-25 10:47:19,548 epoch 2 - iter 77/773 - loss 0.07012990 - time (sec): 4.69 - samples/sec: 2619.54 - lr: 0.000049 - momentum: 0.000000
2023-10-25 10:47:24,265 epoch 2 - iter 154/773 - loss 0.07639423 - time (sec): 9.41 - samples/sec: 2633.95 - lr: 0.000049 - momentum: 0.000000
2023-10-25 10:47:28,908 epoch 2 - iter 231/773 - loss 0.07702505 - time (sec): 14.05 - samples/sec: 2566.17 - lr: 0.000048 - momentum: 0.000000
2023-10-25 10:47:33,737 epoch 2 - iter 308/773 - loss 0.07889340 - time (sec): 18.88 - samples/sec: 2592.89 - lr: 0.000048 - momentum: 0.000000
2023-10-25 10:47:38,654 epoch 2 - iter 385/773 - loss 0.07691864 - time (sec): 23.80 - samples/sec: 2589.87 - lr: 0.000047 - momentum: 0.000000
2023-10-25 10:47:43,488 epoch 2 - iter 462/773 - loss 0.07543602 - time (sec): 28.63 - samples/sec: 2594.69 - lr: 0.000047 - momentum: 0.000000
2023-10-25 10:47:48,286 epoch 2 - iter 539/773 - loss 0.07550983 - time (sec): 33.43 - samples/sec: 2604.25 - lr: 0.000046 - momentum: 0.000000
2023-10-25 10:47:52,924 epoch 2 - iter 616/773 - loss 0.07470403 - time (sec): 38.07 - samples/sec: 2593.52 - lr: 0.000046 - momentum: 0.000000
2023-10-25 10:47:57,364 epoch 2 - iter 693/773 - loss 0.07377788 - time (sec): 42.51 - samples/sec: 2598.85 - lr: 0.000045 - momentum: 0.000000
2023-10-25 10:48:02,105 epoch 2 - iter 770/773 - loss 0.07413970 - time (sec): 47.25 - samples/sec: 2619.42 - lr: 0.000044 - momentum: 0.000000
2023-10-25 10:48:02,274 ----------------------------------------------------------------------------------------------------
2023-10-25 10:48:02,274 EPOCH 2 done: loss 0.0739 - lr: 0.000044
2023-10-25 10:48:05,217 DEV : loss 0.05858875438570976 - f1-score (micro avg) 0.7287
2023-10-25 10:48:05,234 ----------------------------------------------------------------------------------------------------
2023-10-25 10:48:09,912 epoch 3 - iter 77/773 - loss 0.04462850 - time (sec): 4.68 - samples/sec: 2610.70 - lr: 0.000044 - momentum: 0.000000
2023-10-25 10:48:14,526 epoch 3 - iter 154/773 - loss 0.04609617 - time (sec): 9.29 - samples/sec: 2568.19 - lr: 0.000043 - momentum: 0.000000
2023-10-25 10:48:19,206 epoch 3 - iter 231/773 - loss 0.04727208 - time (sec): 13.97 - samples/sec: 2560.80 - lr: 0.000043 - momentum: 0.000000
2023-10-25 10:48:23,981 epoch 3 - iter 308/773 - loss 0.04497885 - time (sec): 18.75 - samples/sec: 2579.51 - lr: 0.000042 - momentum: 0.000000
2023-10-25 10:48:28,676 epoch 3 - iter 385/773 - loss 0.04572064 - time (sec): 23.44 - samples/sec: 2582.45 - lr: 0.000042 - momentum: 0.000000
2023-10-25 10:48:33,361 epoch 3 - iter 462/773 - loss 0.04619941 - time (sec): 28.12 - samples/sec: 2615.55 - lr: 0.000041 - momentum: 0.000000
2023-10-25 10:48:38,007 epoch 3 - iter 539/773 - loss 0.04607725 - time (sec): 32.77 - samples/sec: 2627.41 - lr: 0.000041 - momentum: 0.000000
2023-10-25 10:48:42,587 epoch 3 - iter 616/773 - loss 0.04767181 - time (sec): 37.35 - samples/sec: 2642.74 - lr: 0.000040 - momentum: 0.000000
2023-10-25 10:48:47,298 epoch 3 - iter 693/773 - loss 0.04860853 - time (sec): 42.06 - samples/sec: 2652.28 - lr: 0.000039 - momentum: 0.000000
2023-10-25 10:48:51,926 epoch 3 - iter 770/773 - loss 0.04752276 - time (sec): 46.69 - samples/sec: 2653.31 - lr: 0.000039 - momentum: 0.000000
2023-10-25 10:48:52,128 ----------------------------------------------------------------------------------------------------
2023-10-25 10:48:52,128 EPOCH 3 done: loss 0.0474 - lr: 0.000039
2023-10-25 10:48:54,819 DEV : loss 0.07712458819150925 - f1-score (micro avg) 0.7984
2023-10-25 10:48:54,837 saving best model
2023-10-25 10:48:55,493 ----------------------------------------------------------------------------------------------------
2023-10-25 10:49:00,161 epoch 4 - iter 77/773 - loss 0.03050623 - time (sec): 4.67 - samples/sec: 2675.26 - lr: 0.000038 - momentum: 0.000000
2023-10-25 10:49:04,947 epoch 4 - iter 154/773 - loss 0.02946324 - time (sec): 9.45 - samples/sec: 2673.61 - lr: 0.000038 - momentum: 0.000000
2023-10-25 10:49:09,646 epoch 4 - iter 231/773 - loss 0.02910360 - time (sec): 14.15 - samples/sec: 2668.78 - lr: 0.000037 - momentum: 0.000000
2023-10-25 10:49:14,475 epoch 4 - iter 308/773 - loss 0.03117292 - time (sec): 18.98 - samples/sec: 2650.63 - lr: 0.000037 - momentum: 0.000000
2023-10-25 10:49:19,063 epoch 4 - iter 385/773 - loss 0.03199838 - time (sec): 23.57 - samples/sec: 2641.28 - lr: 0.000036 - momentum: 0.000000
2023-10-25 10:49:23,661 epoch 4 - iter 462/773 - loss 0.03362317 - time (sec): 28.16 - samples/sec: 2632.01 - lr: 0.000036 - momentum: 0.000000
2023-10-25 10:49:28,433 epoch 4 - iter 539/773 - loss 0.03421957 - time (sec): 32.94 - samples/sec: 2636.66 - lr: 0.000035 - momentum: 0.000000
2023-10-25 10:49:33,019 epoch 4 - iter 616/773 - loss 0.03426652 - time (sec): 37.52 - samples/sec: 2654.08 - lr: 0.000034 - momentum: 0.000000
2023-10-25 10:49:37,742 epoch 4 - iter 693/773 - loss 0.03423819 - time (sec): 42.25 - samples/sec: 2659.12 - lr: 0.000034 - momentum: 0.000000
2023-10-25 10:49:42,356 epoch 4 - iter 770/773 - loss 0.03434937 - time (sec): 46.86 - samples/sec: 2642.18 - lr: 0.000033 - momentum: 0.000000
2023-10-25 10:49:42,527 ----------------------------------------------------------------------------------------------------
2023-10-25 10:49:42,527 EPOCH 4 done: loss 0.0345 - lr: 0.000033
2023-10-25 10:49:45,313 DEV : loss 0.08568881452083588 - f1-score (micro avg) 0.7335
2023-10-25 10:49:45,330 ----------------------------------------------------------------------------------------------------
2023-10-25 10:49:49,956 epoch 5 - iter 77/773 - loss 0.02456627 - time (sec): 4.62 - samples/sec: 2661.67 - lr: 0.000033 - momentum: 0.000000
2023-10-25 10:49:54,528 epoch 5 - iter 154/773 - loss 0.02402263 - time (sec): 9.20 - samples/sec: 2683.11 - lr: 0.000032 - momentum: 0.000000
2023-10-25 10:49:59,332 epoch 5 - iter 231/773 - loss 0.02135560 - time (sec): 14.00 - samples/sec: 2685.78 - lr: 0.000032 - momentum: 0.000000
2023-10-25 10:50:03,983 epoch 5 - iter 308/773 - loss 0.02453825 - time (sec): 18.65 - samples/sec: 2667.18 - lr: 0.000031 - momentum: 0.000000
2023-10-25 10:50:08,782 epoch 5 - iter 385/773 - loss 0.02274215 - time (sec): 23.45 - samples/sec: 2674.33 - lr: 0.000031 - momentum: 0.000000
2023-10-25 10:50:13,512 epoch 5 - iter 462/773 - loss 0.02118376 - time (sec): 28.18 - samples/sec: 2677.97 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:50:18,343 epoch 5 - iter 539/773 - loss 0.02077515 - time (sec): 33.01 - samples/sec: 2672.12 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:50:22,967 epoch 5 - iter 616/773 - loss 0.02113624 - time (sec): 37.64 - samples/sec: 2649.88 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:50:27,639 epoch 5 - iter 693/773 - loss 0.02125485 - time (sec): 42.31 - samples/sec: 2654.17 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:50:32,191 epoch 5 - iter 770/773 - loss 0.02224251 - time (sec): 46.86 - samples/sec: 2642.63 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:50:32,396 ----------------------------------------------------------------------------------------------------
2023-10-25 10:50:32,397 EPOCH 5 done: loss 0.0222 - lr: 0.000028
2023-10-25 10:50:35,116 DEV : loss 0.0947536900639534 - f1-score (micro avg) 0.7796
2023-10-25 10:50:35,134 ----------------------------------------------------------------------------------------------------
2023-10-25 10:50:39,949 epoch 6 - iter 77/773 - loss 0.01500115 - time (sec): 4.81 - samples/sec: 2626.09 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:50:44,800 epoch 6 - iter 154/773 - loss 0.01845212 - time (sec): 9.66 - samples/sec: 2601.14 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:50:49,370 epoch 6 - iter 231/773 - loss 0.01861329 - time (sec): 14.23 - samples/sec: 2594.22 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:50:54,061 epoch 6 - iter 308/773 - loss 0.01918329 - time (sec): 18.93 - samples/sec: 2630.47 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:50:58,683 epoch 6 - iter 385/773 - loss 0.01825918 - time (sec): 23.55 - samples/sec: 2678.01 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:51:03,288 epoch 6 - iter 462/773 - loss 0.01813405 - time (sec): 28.15 - samples/sec: 2692.93 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:51:07,966 epoch 6 - iter 539/773 - loss 0.01809914 - time (sec): 32.83 - samples/sec: 2676.28 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:51:12,563 epoch 6 - iter 616/773 - loss 0.01698338 - time (sec): 37.43 - samples/sec: 2660.98 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:51:17,016 epoch 6 - iter 693/773 - loss 0.01652935 - time (sec): 41.88 - samples/sec: 2668.09 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:51:21,606 epoch 6 - iter 770/773 - loss 0.01672796 - time (sec): 46.47 - samples/sec: 2666.18 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:51:21,789 ----------------------------------------------------------------------------------------------------
2023-10-25 10:51:21,790 EPOCH 6 done: loss 0.0169 - lr: 0.000022
2023-10-25 10:51:24,542 DEV : loss 0.10434131324291229 - f1-score (micro avg) 0.7475
2023-10-25 10:51:24,560 ----------------------------------------------------------------------------------------------------
2023-10-25 10:51:29,476 epoch 7 - iter 77/773 - loss 0.01488389 - time (sec): 4.91 - samples/sec: 2591.88 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:51:34,023 epoch 7 - iter 154/773 - loss 0.01287761 - time (sec): 9.46 - samples/sec: 2626.81 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:51:39,004 epoch 7 - iter 231/773 - loss 0.01049621 - time (sec): 14.44 - samples/sec: 2683.40 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:51:43,584 epoch 7 - iter 308/773 - loss 0.01009304 - time (sec): 19.02 - samples/sec: 2608.10 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:51:48,267 epoch 7 - iter 385/773 - loss 0.01061720 - time (sec): 23.70 - samples/sec: 2608.95 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:51:52,972 epoch 7 - iter 462/773 - loss 0.01079765 - time (sec): 28.41 - samples/sec: 2622.25 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:51:57,626 epoch 7 - iter 539/773 - loss 0.01022193 - time (sec): 33.06 - samples/sec: 2597.96 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:52:02,645 epoch 7 - iter 616/773 - loss 0.01026500 - time (sec): 38.08 - samples/sec: 2570.01 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:52:07,340 epoch 7 - iter 693/773 - loss 0.01015055 - time (sec): 42.78 - samples/sec: 2593.69 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:52:11,969 epoch 7 - iter 770/773 - loss 0.01047840 - time (sec): 47.41 - samples/sec: 2609.58 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:52:12,148 ----------------------------------------------------------------------------------------------------
2023-10-25 10:52:12,148 EPOCH 7 done: loss 0.0104 - lr: 0.000017
2023-10-25 10:52:14,980 DEV : loss 0.12387851625680923 - f1-score (micro avg) 0.754
2023-10-25 10:52:15,002 ----------------------------------------------------------------------------------------------------
2023-10-25 10:52:19,770 epoch 8 - iter 77/773 - loss 0.00559353 - time (sec): 4.77 - samples/sec: 2587.00 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:52:24,472 epoch 8 - iter 154/773 - loss 0.00716554 - time (sec): 9.47 - samples/sec: 2622.85 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:52:29,209 epoch 8 - iter 231/773 - loss 0.00969964 - time (sec): 14.21 - samples/sec: 2561.88 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:52:33,958 epoch 8 - iter 308/773 - loss 0.00787012 - time (sec): 18.95 - samples/sec: 2562.72 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:52:38,746 epoch 8 - iter 385/773 - loss 0.00822737 - time (sec): 23.74 - samples/sec: 2574.97 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:52:43,366 epoch 8 - iter 462/773 - loss 0.00949259 - time (sec): 28.36 - samples/sec: 2624.79 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:52:48,014 epoch 8 - iter 539/773 - loss 0.00909000 - time (sec): 33.01 - samples/sec: 2634.45 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:52:52,785 epoch 8 - iter 616/773 - loss 0.00910250 - time (sec): 37.78 - samples/sec: 2625.07 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:52:57,572 epoch 8 - iter 693/773 - loss 0.00897170 - time (sec): 42.57 - samples/sec: 2612.95 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:53:02,343 epoch 8 - iter 770/773 - loss 0.00836489 - time (sec): 47.34 - samples/sec: 2613.96 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:53:02,522 ----------------------------------------------------------------------------------------------------
2023-10-25 10:53:02,522 EPOCH 8 done: loss 0.0083 - lr: 0.000011
2023-10-25 10:53:05,400 DEV : loss 0.13534879684448242 - f1-score (micro avg) 0.76
2023-10-25 10:53:05,423 ----------------------------------------------------------------------------------------------------
2023-10-25 10:53:10,212 epoch 9 - iter 77/773 - loss 0.00267671 - time (sec): 4.79 - samples/sec: 2631.30 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:53:14,902 epoch 9 - iter 154/773 - loss 0.00629952 - time (sec): 9.48 - samples/sec: 2640.80 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:53:19,488 epoch 9 - iter 231/773 - loss 0.00639907 - time (sec): 14.06 - samples/sec: 2659.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:53:24,185 epoch 9 - iter 308/773 - loss 0.00529316 - time (sec): 18.76 - samples/sec: 2685.60 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:53:28,965 epoch 9 - iter 385/773 - loss 0.00507245 - time (sec): 23.54 - samples/sec: 2662.13 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:53:33,679 epoch 9 - iter 462/773 - loss 0.00499478 - time (sec): 28.25 - samples/sec: 2652.62 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:53:38,383 epoch 9 - iter 539/773 - loss 0.00530658 - time (sec): 32.96 - samples/sec: 2637.92 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:53:42,943 epoch 9 - iter 616/773 - loss 0.00494346 - time (sec): 37.52 - samples/sec: 2639.57 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:53:47,719 epoch 9 - iter 693/773 - loss 0.00479337 - time (sec): 42.29 - samples/sec: 2652.85 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:53:52,483 epoch 9 - iter 770/773 - loss 0.00475456 - time (sec): 47.06 - samples/sec: 2634.72 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:53:52,652 ----------------------------------------------------------------------------------------------------
2023-10-25 10:53:52,652 EPOCH 9 done: loss 0.0047 - lr: 0.000006
2023-10-25 10:53:55,453 DEV : loss 0.13144543766975403 - f1-score (micro avg) 0.7613
2023-10-25 10:53:55,471 ----------------------------------------------------------------------------------------------------
2023-10-25 10:54:00,257 epoch 10 - iter 77/773 - loss 0.00043461 - time (sec): 4.78 - samples/sec: 2475.21 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:54:04,904 epoch 10 - iter 154/773 - loss 0.00118691 - time (sec): 9.43 - samples/sec: 2492.59 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:54:09,501 epoch 10 - iter 231/773 - loss 0.00206157 - time (sec): 14.03 - samples/sec: 2527.22 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:54:14,081 epoch 10 - iter 308/773 - loss 0.00311004 - time (sec): 18.61 - samples/sec: 2565.03 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:54:18,614 epoch 10 - iter 385/773 - loss 0.00265094 - time (sec): 23.14 - samples/sec: 2560.55 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:54:23,379 epoch 10 - iter 462/773 - loss 0.00274512 - time (sec): 27.91 - samples/sec: 2585.42 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:54:28,186 epoch 10 - iter 539/773 - loss 0.00263932 - time (sec): 32.71 - samples/sec: 2601.23 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:54:33,072 epoch 10 - iter 616/773 - loss 0.00248618 - time (sec): 37.60 - samples/sec: 2613.44 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:54:37,776 epoch 10 - iter 693/773 - loss 0.00242872 - time (sec): 42.30 - samples/sec: 2631.20 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:54:42,474 epoch 10 - iter 770/773 - loss 0.00237640 - time (sec): 47.00 - samples/sec: 2629.04 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:54:42,664 ----------------------------------------------------------------------------------------------------
2023-10-25 10:54:42,664 EPOCH 10 done: loss 0.0025 - lr: 0.000000
2023-10-25 10:54:45,907 DEV : loss 0.13963457942008972 - f1-score (micro avg) 0.7556
2023-10-25 10:54:46,396 ----------------------------------------------------------------------------------------------------
2023-10-25 10:54:46,398 Loading model from best epoch ...
2023-10-25 10:54:48,450 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
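The 13 tags above form a BIOES scheme (Begin/Inside/End/Single per entity type, plus O) over the three types LOC, BUILDING, and STREET. A minimal decoder sketch showing how such a tag sequence maps back to entity spans; `decode_bioes` is an illustrative helper, not Flair's implementation:

```python
def decode_bioes(tags):
    """Turn a BIOES tag sequence into (start, end, label) spans, inclusive.

    Illustrative only: malformed sequences (e.g. a B- tag never closed by
    a matching E-) are silently dropped rather than repaired.
    """
    spans = []
    start, current = None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, current = None, None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                       # single-token entity
            spans.append((i, i, label))
            start, current = None, None
        elif prefix == "B":                     # entity opens
            start, current = i, label
        elif prefix == "E":                     # entity closes
            if start is not None and current == label:
                spans.append((start, i, label))
            start, current = None, None
        # prefix == "I": still inside the open entity, nothing to do
    return spans

# e.g. ["B-BUILDING", "E-BUILDING", "O", "S-LOC"]
#   -> [(0, 1, "BUILDING"), (3, 3, "LOC")]
```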
2023-10-25 10:54:58,129
Results:
- F-score (micro) 0.7752
- F-score (macro) 0.6601
- Accuracy 0.6529
By class:
              precision    recall  f1-score   support

         LOC     0.7859    0.8573    0.8200       946
    BUILDING     0.6454    0.4919    0.5583       185
      STREET     0.6596    0.5536    0.6019        56

   micro avg     0.7648    0.7860    0.7752      1187
   macro avg     0.6969    0.6343    0.6601      1187
weighted avg     0.7580    0.7860    0.7689      1187
2023-10-25 10:54:58,129 ----------------------------------------------------------------------------------------------------
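The micro, macro, and weighted averages in the final table can be recovered from the per-class rows. A quick consistency check; the per-class values are copied from the table (already rounded to 4 decimals, so the recomputed micro average agrees only to about 3 decimal places), and all variable names are illustrative:

```python
# Per-class (precision, recall, f1, support) from the final evaluation.
per_class = {
    "LOC":      (0.7859, 0.8573, 0.8200, 946),
    "BUILDING": (0.6454, 0.4919, 0.5583, 185),
    "STREET":   (0.6596, 0.5536, 0.6019,  56),
}

# Recover approximate TP and prediction counts per class, then pool them.
tp = sum(r * s for _, r, _, s in per_class.values())        # true positives
pred = sum(r * s / p for p, r, _, s in per_class.values())  # predicted spans
support = sum(s for *_, s in per_class.values())            # gold spans: 1187

micro_p = tp / pred                      # ~0.7648 in the table
micro_r = tp / support                   # ~0.7860 in the table
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)      # ~0.7752

# Macro: unweighted mean over classes; weighted: support-weighted mean.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / support
```

The macro F1 ((0.8200 + 0.5583 + 0.6019) / 3 = 0.6601) sits well below the micro F1 because BUILDING and STREET, with far less support than LOC, are scored markedly worse.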