2023-10-25 09:57:58,522 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,523 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
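The module dump above fully determines the model's parameter count. As a quick sanity check (a sketch derived only from the layer shapes printed in the repr, not from the actual checkpoint), the count can be reproduced in pure Python:

```python
# Back-of-the-envelope parameter count from the layer shapes in the module dump.
H, FF, VOCAB, MAXPOS, LAYERS, TAGS = 768, 3072, 64001, 512, 12, 13

def linear(n_in, n_out):
    return n_in * n_out + n_out  # weight + bias

# word/position/token-type embeddings + their LayerNorm (weight + bias)
embeddings = VOCAB * H + MAXPOS * H + 2 * H + 2 * H
per_layer = (
    3 * linear(H, H)          # query/key/value
    + linear(H, H) + 2 * H    # attention output dense + LayerNorm
    + linear(H, FF)           # intermediate
    + linear(FF, H) + 2 * H   # output dense + LayerNorm
)
encoder = LAYERS * per_layer
pooler = linear(H, H)
head = linear(H, TAGS)        # the tagger's final (linear) layer

total = embeddings + encoder + pooler + head
print(per_layer, total)  # 7087872 per BertLayer, 135204109 (~135.2M) overall
```

The 64k subword vocabulary alone accounts for roughly 49M of those parameters, which is why the `-64k` variant is noticeably larger than a standard BERT-base.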
2023-10-25 09:57:58,523 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,524 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 09:57:58,524 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,524 Train: 6183 sentences
2023-10-25 09:57:58,524 (train_with_dev=False, train_with_test=False)
2023-10-25 09:57:58,524 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,524 Training Params:
2023-10-25 09:57:58,524 - learning_rate: "5e-05"
2023-10-25 09:57:58,524 - mini_batch_size: "8"
2023-10-25 09:57:58,524 - max_epochs: "10"
2023-10-25 09:57:58,524 - shuffle: "True"
2023-10-25 09:57:58,524 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,524 Plugins:
2023-10-25 09:57:58,524 - TensorboardLogger
2023-10-25 09:57:58,524 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 09:57:58,524 ----------------------------------------------------------------------------------------------------
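The `lr` column in the iteration logs below follows from these settings: with `warmup_fraction: 0.1` over 773 mini-batches × 10 epochs, the learning rate ramps up linearly during epoch 1 and then decays linearly to zero. A minimal sketch of that schedule (pure Python; Flair's `LinearScheduler` internals may differ in off-by-one details, but this reproduces the logged values to the printed precision):

```python
# Linear warmup + linear decay, matching the "lr" column in the log.
PEAK_LR = 5e-05
STEPS_PER_EPOCH, EPOCHS = 773, 10
total_steps = STEPS_PER_EPOCH * EPOCHS
warmup_steps = int(0.1 * total_steps)  # = 773, i.e. exactly the first epoch

def lr_at(step):
    if step < warmup_steps:  # ramp up during epoch 1
        return PEAK_LR * step / warmup_steps
    # then decay linearly to 0 over the remaining steps
    return PEAK_LR * (total_steps - step) / (total_steps - warmup_steps)

# epoch 1, iter 77 -> ~0.000005 ; epoch 1, iter 770 -> ~0.000050
print(round(lr_at(77), 6), round(lr_at(770), 6))
```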
2023-10-25 09:57:58,524 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 09:57:58,524 - metric: "('micro avg', 'f1-score')"
2023-10-25 09:57:58,524 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,524 Computation:
2023-10-25 09:57:58,524 - compute on device: cuda:0
2023-10-25 09:57:58,524 - embedding storage: none
2023-10-25 09:57:58,524 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,525 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 09:57:58,525 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,525 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,525 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 09:58:03,484 epoch 1 - iter 77/773 - loss 2.03904562 - time (sec): 4.96 - samples/sec: 2546.24 - lr: 0.000005 - momentum: 0.000000
2023-10-25 09:58:08,424 epoch 1 - iter 154/773 - loss 1.12019790 - time (sec): 9.90 - samples/sec: 2562.94 - lr: 0.000010 - momentum: 0.000000
2023-10-25 09:58:13,526 epoch 1 - iter 231/773 - loss 0.79717628 - time (sec): 15.00 - samples/sec: 2507.80 - lr: 0.000015 - momentum: 0.000000
2023-10-25 09:58:18,475 epoch 1 - iter 308/773 - loss 0.62691977 - time (sec): 19.95 - samples/sec: 2515.65 - lr: 0.000020 - momentum: 0.000000
2023-10-25 09:58:23,289 epoch 1 - iter 385/773 - loss 0.53255000 - time (sec): 24.76 - samples/sec: 2487.10 - lr: 0.000025 - momentum: 0.000000
2023-10-25 09:58:28,225 epoch 1 - iter 462/773 - loss 0.46194793 - time (sec): 29.70 - samples/sec: 2488.13 - lr: 0.000030 - momentum: 0.000000
2023-10-25 09:58:33,164 epoch 1 - iter 539/773 - loss 0.41113608 - time (sec): 34.64 - samples/sec: 2494.45 - lr: 0.000035 - momentum: 0.000000
2023-10-25 09:58:38,154 epoch 1 - iter 616/773 - loss 0.37213224 - time (sec): 39.63 - samples/sec: 2506.66 - lr: 0.000040 - momentum: 0.000000
2023-10-25 09:58:43,158 epoch 1 - iter 693/773 - loss 0.34227260 - time (sec): 44.63 - samples/sec: 2495.53 - lr: 0.000045 - momentum: 0.000000
2023-10-25 09:58:48,050 epoch 1 - iter 770/773 - loss 0.31635614 - time (sec): 49.52 - samples/sec: 2504.31 - lr: 0.000050 - momentum: 0.000000
2023-10-25 09:58:48,231 ----------------------------------------------------------------------------------------------------
2023-10-25 09:58:48,232 EPOCH 1 done: loss 0.3160 - lr: 0.000050
2023-10-25 09:58:50,793 DEV : loss 0.07084957510232925 - f1-score (micro avg) 0.6729
2023-10-25 09:58:50,820 saving best model
2023-10-25 09:58:51,326 ----------------------------------------------------------------------------------------------------
2023-10-25 09:58:56,425 epoch 2 - iter 77/773 - loss 0.08383621 - time (sec): 5.10 - samples/sec: 2422.97 - lr: 0.000049 - momentum: 0.000000
2023-10-25 09:59:01,475 epoch 2 - iter 154/773 - loss 0.08060651 - time (sec): 10.15 - samples/sec: 2410.00 - lr: 0.000049 - momentum: 0.000000
2023-10-25 09:59:06,628 epoch 2 - iter 231/773 - loss 0.08038522 - time (sec): 15.30 - samples/sec: 2410.53 - lr: 0.000048 - momentum: 0.000000
2023-10-25 09:59:11,762 epoch 2 - iter 308/773 - loss 0.07891260 - time (sec): 20.43 - samples/sec: 2417.00 - lr: 0.000048 - momentum: 0.000000
2023-10-25 09:59:16,729 epoch 2 - iter 385/773 - loss 0.07927934 - time (sec): 25.40 - samples/sec: 2421.31 - lr: 0.000047 - momentum: 0.000000
2023-10-25 09:59:21,933 epoch 2 - iter 462/773 - loss 0.07826921 - time (sec): 30.60 - samples/sec: 2413.32 - lr: 0.000047 - momentum: 0.000000
2023-10-25 09:59:26,942 epoch 2 - iter 539/773 - loss 0.07567819 - time (sec): 35.61 - samples/sec: 2427.55 - lr: 0.000046 - momentum: 0.000000
2023-10-25 09:59:32,543 epoch 2 - iter 616/773 - loss 0.07482291 - time (sec): 41.21 - samples/sec: 2403.65 - lr: 0.000046 - momentum: 0.000000
2023-10-25 09:59:37,548 epoch 2 - iter 693/773 - loss 0.07633291 - time (sec): 46.22 - samples/sec: 2417.97 - lr: 0.000045 - momentum: 0.000000
2023-10-25 09:59:42,566 epoch 2 - iter 770/773 - loss 0.07631245 - time (sec): 51.24 - samples/sec: 2418.55 - lr: 0.000044 - momentum: 0.000000
2023-10-25 09:59:42,749 ----------------------------------------------------------------------------------------------------
2023-10-25 09:59:42,750 EPOCH 2 done: loss 0.0763 - lr: 0.000044
2023-10-25 09:59:45,492 DEV : loss 0.05928114056587219 - f1-score (micro avg) 0.7621
2023-10-25 09:59:45,523 saving best model
2023-10-25 09:59:46,246 ----------------------------------------------------------------------------------------------------
2023-10-25 09:59:51,253 epoch 3 - iter 77/773 - loss 0.03484718 - time (sec): 5.00 - samples/sec: 2494.02 - lr: 0.000044 - momentum: 0.000000
2023-10-25 09:59:56,296 epoch 3 - iter 154/773 - loss 0.03415565 - time (sec): 10.05 - samples/sec: 2431.13 - lr: 0.000043 - momentum: 0.000000
2023-10-25 10:00:01,430 epoch 3 - iter 231/773 - loss 0.03721366 - time (sec): 15.18 - samples/sec: 2406.07 - lr: 0.000043 - momentum: 0.000000
2023-10-25 10:00:06,559 epoch 3 - iter 308/773 - loss 0.04116588 - time (sec): 20.31 - samples/sec: 2404.77 - lr: 0.000042 - momentum: 0.000000
2023-10-25 10:00:11,520 epoch 3 - iter 385/773 - loss 0.04243653 - time (sec): 25.27 - samples/sec: 2419.43 - lr: 0.000042 - momentum: 0.000000
2023-10-25 10:00:16,488 epoch 3 - iter 462/773 - loss 0.04382048 - time (sec): 30.24 - samples/sec: 2448.99 - lr: 0.000041 - momentum: 0.000000
2023-10-25 10:00:21,593 epoch 3 - iter 539/773 - loss 0.04459189 - time (sec): 35.34 - samples/sec: 2449.95 - lr: 0.000041 - momentum: 0.000000
2023-10-25 10:00:26,549 epoch 3 - iter 616/773 - loss 0.04536472 - time (sec): 40.30 - samples/sec: 2454.64 - lr: 0.000040 - momentum: 0.000000
2023-10-25 10:00:31,453 epoch 3 - iter 693/773 - loss 0.04577056 - time (sec): 45.20 - samples/sec: 2458.03 - lr: 0.000039 - momentum: 0.000000
2023-10-25 10:00:36,478 epoch 3 - iter 770/773 - loss 0.04613942 - time (sec): 50.23 - samples/sec: 2461.36 - lr: 0.000039 - momentum: 0.000000
2023-10-25 10:00:36,682 ----------------------------------------------------------------------------------------------------
2023-10-25 10:00:36,683 EPOCH 3 done: loss 0.0461 - lr: 0.000039
2023-10-25 10:00:39,113 DEV : loss 0.07082913815975189 - f1-score (micro avg) 0.7401
2023-10-25 10:00:39,131 ----------------------------------------------------------------------------------------------------
2023-10-25 10:00:44,166 epoch 4 - iter 77/773 - loss 0.02209400 - time (sec): 5.03 - samples/sec: 2555.77 - lr: 0.000038 - momentum: 0.000000
2023-10-25 10:00:49,060 epoch 4 - iter 154/773 - loss 0.02495574 - time (sec): 9.93 - samples/sec: 2463.86 - lr: 0.000038 - momentum: 0.000000
2023-10-25 10:00:54,178 epoch 4 - iter 231/773 - loss 0.02855728 - time (sec): 15.05 - samples/sec: 2494.99 - lr: 0.000037 - momentum: 0.000000
2023-10-25 10:00:59,390 epoch 4 - iter 308/773 - loss 0.03133128 - time (sec): 20.26 - samples/sec: 2491.55 - lr: 0.000037 - momentum: 0.000000
2023-10-25 10:01:04,471 epoch 4 - iter 385/773 - loss 0.03152342 - time (sec): 25.34 - samples/sec: 2456.61 - lr: 0.000036 - momentum: 0.000000
2023-10-25 10:01:09,557 epoch 4 - iter 462/773 - loss 0.03107184 - time (sec): 30.42 - samples/sec: 2460.15 - lr: 0.000036 - momentum: 0.000000
2023-10-25 10:01:14,638 epoch 4 - iter 539/773 - loss 0.03090760 - time (sec): 35.51 - samples/sec: 2475.10 - lr: 0.000035 - momentum: 0.000000
2023-10-25 10:01:19,778 epoch 4 - iter 616/773 - loss 0.03088891 - time (sec): 40.65 - samples/sec: 2462.11 - lr: 0.000034 - momentum: 0.000000
2023-10-25 10:01:24,787 epoch 4 - iter 693/773 - loss 0.03076526 - time (sec): 45.65 - samples/sec: 2454.39 - lr: 0.000034 - momentum: 0.000000
2023-10-25 10:01:29,768 epoch 4 - iter 770/773 - loss 0.03111074 - time (sec): 50.64 - samples/sec: 2444.60 - lr: 0.000033 - momentum: 0.000000
2023-10-25 10:01:29,960 ----------------------------------------------------------------------------------------------------
2023-10-25 10:01:29,960 EPOCH 4 done: loss 0.0312 - lr: 0.000033
2023-10-25 10:01:32,618 DEV : loss 0.08776696026325226 - f1-score (micro avg) 0.7409
2023-10-25 10:01:32,634 ----------------------------------------------------------------------------------------------------
2023-10-25 10:01:37,624 epoch 5 - iter 77/773 - loss 0.02788476 - time (sec): 4.99 - samples/sec: 2283.51 - lr: 0.000033 - momentum: 0.000000
2023-10-25 10:01:42,764 epoch 5 - iter 154/773 - loss 0.02471748 - time (sec): 10.13 - samples/sec: 2431.01 - lr: 0.000032 - momentum: 0.000000
2023-10-25 10:01:47,922 epoch 5 - iter 231/773 - loss 0.02124558 - time (sec): 15.29 - samples/sec: 2447.42 - lr: 0.000032 - momentum: 0.000000
2023-10-25 10:01:53,049 epoch 5 - iter 308/773 - loss 0.02130901 - time (sec): 20.41 - samples/sec: 2441.56 - lr: 0.000031 - momentum: 0.000000
2023-10-25 10:01:58,037 epoch 5 - iter 385/773 - loss 0.02252948 - time (sec): 25.40 - samples/sec: 2441.86 - lr: 0.000031 - momentum: 0.000000
2023-10-25 10:02:03,283 epoch 5 - iter 462/773 - loss 0.02151139 - time (sec): 30.65 - samples/sec: 2434.62 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:02:08,345 epoch 5 - iter 539/773 - loss 0.02272270 - time (sec): 35.71 - samples/sec: 2429.79 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:02:13,261 epoch 5 - iter 616/773 - loss 0.02261758 - time (sec): 40.63 - samples/sec: 2440.11 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:02:18,235 epoch 5 - iter 693/773 - loss 0.02263034 - time (sec): 45.60 - samples/sec: 2447.06 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:02:23,287 epoch 5 - iter 770/773 - loss 0.02286498 - time (sec): 50.65 - samples/sec: 2446.74 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:02:23,467 ----------------------------------------------------------------------------------------------------
2023-10-25 10:02:23,467 EPOCH 5 done: loss 0.0228 - lr: 0.000028
2023-10-25 10:02:26,021 DEV : loss 0.10517541319131851 - f1-score (micro avg) 0.7738
2023-10-25 10:02:26,040 saving best model
2023-10-25 10:02:26,757 ----------------------------------------------------------------------------------------------------
2023-10-25 10:02:31,936 epoch 6 - iter 77/773 - loss 0.01060825 - time (sec): 5.18 - samples/sec: 2429.49 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:02:37,075 epoch 6 - iter 154/773 - loss 0.01094672 - time (sec): 10.32 - samples/sec: 2419.14 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:02:42,090 epoch 6 - iter 231/773 - loss 0.01297909 - time (sec): 15.33 - samples/sec: 2439.59 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:02:47,138 epoch 6 - iter 308/773 - loss 0.01451729 - time (sec): 20.38 - samples/sec: 2450.65 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:02:52,051 epoch 6 - iter 385/773 - loss 0.01449527 - time (sec): 25.29 - samples/sec: 2412.22 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:02:57,072 epoch 6 - iter 462/773 - loss 0.01536073 - time (sec): 30.31 - samples/sec: 2411.44 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:03:02,074 epoch 6 - iter 539/773 - loss 0.01493132 - time (sec): 35.32 - samples/sec: 2426.18 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:03:07,077 epoch 6 - iter 616/773 - loss 0.01532872 - time (sec): 40.32 - samples/sec: 2455.13 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:03:12,152 epoch 6 - iter 693/773 - loss 0.01546009 - time (sec): 45.39 - samples/sec: 2455.84 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:03:17,621 epoch 6 - iter 770/773 - loss 0.01587373 - time (sec): 50.86 - samples/sec: 2435.45 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:03:17,830 ----------------------------------------------------------------------------------------------------
2023-10-25 10:03:17,831 EPOCH 6 done: loss 0.0159 - lr: 0.000022
2023-10-25 10:03:20,818 DEV : loss 0.10992585122585297 - f1-score (micro avg) 0.7592
2023-10-25 10:03:20,835 ----------------------------------------------------------------------------------------------------
2023-10-25 10:03:25,522 epoch 7 - iter 77/773 - loss 0.00548947 - time (sec): 4.69 - samples/sec: 2601.12 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:03:30,408 epoch 7 - iter 154/773 - loss 0.01141571 - time (sec): 9.57 - samples/sec: 2606.13 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:03:35,507 epoch 7 - iter 231/773 - loss 0.01079487 - time (sec): 14.67 - samples/sec: 2600.40 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:03:40,827 epoch 7 - iter 308/773 - loss 0.01099606 - time (sec): 19.99 - samples/sec: 2516.09 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:03:46,346 epoch 7 - iter 385/773 - loss 0.01110398 - time (sec): 25.51 - samples/sec: 2448.01 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:03:51,677 epoch 7 - iter 462/773 - loss 0.01103215 - time (sec): 30.84 - samples/sec: 2399.15 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:03:57,134 epoch 7 - iter 539/773 - loss 0.01112854 - time (sec): 36.30 - samples/sec: 2394.48 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:04:02,346 epoch 7 - iter 616/773 - loss 0.01041358 - time (sec): 41.51 - samples/sec: 2395.92 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:04:07,437 epoch 7 - iter 693/773 - loss 0.01037195 - time (sec): 46.60 - samples/sec: 2392.68 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:04:12,559 epoch 7 - iter 770/773 - loss 0.01062967 - time (sec): 51.72 - samples/sec: 2390.07 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:04:12,770 ----------------------------------------------------------------------------------------------------
2023-10-25 10:04:12,771 EPOCH 7 done: loss 0.0106 - lr: 0.000017
2023-10-25 10:04:16,220 DEV : loss 0.12441226094961166 - f1-score (micro avg) 0.768
2023-10-25 10:04:16,242 ----------------------------------------------------------------------------------------------------
2023-10-25 10:04:21,552 epoch 8 - iter 77/773 - loss 0.01141740 - time (sec): 5.31 - samples/sec: 2362.59 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:04:26,934 epoch 8 - iter 154/773 - loss 0.00925878 - time (sec): 10.69 - samples/sec: 2411.77 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:04:32,233 epoch 8 - iter 231/773 - loss 0.00818945 - time (sec): 15.99 - samples/sec: 2361.92 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:04:37,479 epoch 8 - iter 308/773 - loss 0.00843259 - time (sec): 21.24 - samples/sec: 2332.50 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:04:42,853 epoch 8 - iter 385/773 - loss 0.00881106 - time (sec): 26.61 - samples/sec: 2302.79 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:04:48,080 epoch 8 - iter 462/773 - loss 0.00919388 - time (sec): 31.84 - samples/sec: 2297.67 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:04:53,452 epoch 8 - iter 539/773 - loss 0.00913278 - time (sec): 37.21 - samples/sec: 2282.50 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:04:58,881 epoch 8 - iter 616/773 - loss 0.00832774 - time (sec): 42.64 - samples/sec: 2310.18 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:05:04,376 epoch 8 - iter 693/773 - loss 0.00817666 - time (sec): 48.13 - samples/sec: 2314.71 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:05:09,725 epoch 8 - iter 770/773 - loss 0.00807240 - time (sec): 53.48 - samples/sec: 2314.18 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:05:09,948 ----------------------------------------------------------------------------------------------------
2023-10-25 10:05:09,948 EPOCH 8 done: loss 0.0081 - lr: 0.000011
2023-10-25 10:05:12,914 DEV : loss 0.11879534274339676 - f1-score (micro avg) 0.7653
2023-10-25 10:05:12,938 ----------------------------------------------------------------------------------------------------
2023-10-25 10:05:18,125 epoch 9 - iter 77/773 - loss 0.00379106 - time (sec): 5.19 - samples/sec: 2229.98 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:05:23,477 epoch 9 - iter 154/773 - loss 0.00536332 - time (sec): 10.54 - samples/sec: 2262.36 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:05:28,825 epoch 9 - iter 231/773 - loss 0.00487666 - time (sec): 15.89 - samples/sec: 2307.17 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:05:34,164 epoch 9 - iter 308/773 - loss 0.00444790 - time (sec): 21.22 - samples/sec: 2326.98 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:05:39,560 epoch 9 - iter 385/773 - loss 0.00409264 - time (sec): 26.62 - samples/sec: 2350.96 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:05:44,934 epoch 9 - iter 462/773 - loss 0.00392048 - time (sec): 31.99 - samples/sec: 2348.07 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:05:50,225 epoch 9 - iter 539/773 - loss 0.00393411 - time (sec): 37.29 - samples/sec: 2367.10 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:05:54,787 epoch 9 - iter 616/773 - loss 0.00373331 - time (sec): 41.85 - samples/sec: 2399.22 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:05:59,572 epoch 9 - iter 693/773 - loss 0.00434126 - time (sec): 46.63 - samples/sec: 2397.01 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:06:04,607 epoch 9 - iter 770/773 - loss 0.00435473 - time (sec): 51.67 - samples/sec: 2396.05 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:06:04,793 ----------------------------------------------------------------------------------------------------
2023-10-25 10:06:04,794 EPOCH 9 done: loss 0.0043 - lr: 0.000006
2023-10-25 10:06:07,519 DEV : loss 0.1255100518465042 - f1-score (micro avg) 0.79
2023-10-25 10:06:07,536 saving best model
2023-10-25 10:06:08,244 ----------------------------------------------------------------------------------------------------
2023-10-25 10:06:13,192 epoch 10 - iter 77/773 - loss 0.00296306 - time (sec): 4.95 - samples/sec: 2493.27 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:06:17,962 epoch 10 - iter 154/773 - loss 0.00363997 - time (sec): 9.72 - samples/sec: 2415.75 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:06:22,715 epoch 10 - iter 231/773 - loss 0.00251245 - time (sec): 14.47 - samples/sec: 2472.97 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:06:27,822 epoch 10 - iter 308/773 - loss 0.00259746 - time (sec): 19.58 - samples/sec: 2456.70 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:06:33,684 epoch 10 - iter 385/773 - loss 0.00245530 - time (sec): 25.44 - samples/sec: 2398.88 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:06:38,992 epoch 10 - iter 462/773 - loss 0.00267112 - time (sec): 30.75 - samples/sec: 2385.70 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:06:44,386 epoch 10 - iter 539/773 - loss 0.00238484 - time (sec): 36.14 - samples/sec: 2366.86 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:06:49,620 epoch 10 - iter 616/773 - loss 0.00268458 - time (sec): 41.37 - samples/sec: 2374.04 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:06:54,913 epoch 10 - iter 693/773 - loss 0.00240025 - time (sec): 46.67 - samples/sec: 2373.18 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:07:00,374 epoch 10 - iter 770/773 - loss 0.00227698 - time (sec): 52.13 - samples/sec: 2374.69 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:07:00,578 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:00,578 EPOCH 10 done: loss 0.0023 - lr: 0.000000
2023-10-25 10:07:03,614 DEV : loss 0.12568846344947815 - f1-score (micro avg) 0.7918
2023-10-25 10:07:03,635 saving best model
2023-10-25 10:07:04,884 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:04,886 Loading model from best epoch ...
2023-10-25 10:07:07,226 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
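The 13 tags form a BIOES scheme (O plus S-/B-/I-/E- prefixes) over the three span types LOC, BUILDING and STREET. A minimal decoder sketch (the function name is illustrative, not Flair's API), turning a tag sequence into labeled token spans:

```python
# Minimal BIOES span decoder for a tag dictionary like the one above.
def bioes_to_spans(tags):
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":          # single-token span
            spans.append((label, i, i))
            start = None
        elif prefix == "B":        # span opens
            start = i
        elif prefix == "E" and start is not None:  # span closes
            spans.append((label, start, i))
            start = None
        # "I" just continues an open span
    return spans

print(bioes_to_spans(["O", "S-LOC", "B-STREET", "I-STREET", "E-STREET"]))
# -> [('LOC', 1, 1), ('STREET', 2, 4)]
```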
2023-10-25 10:07:18,217
Results:
- F-score (micro) 0.7958
- F-score (macro) 0.7034
- Accuracy 0.6827
By class:
              precision    recall  f1-score   support

         LOC     0.8311    0.8478    0.8394       946
    BUILDING     0.6201    0.6000    0.6099       185
      STREET     0.6441    0.6786    0.6609        56

   micro avg     0.7905    0.8012    0.7958      1187
   macro avg     0.6984    0.7088    0.7034      1187
weighted avg     0.7894    0.8012    0.7952      1187
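The macro and weighted averages follow directly from the per-class rows and their supports; a quick consistency check in Python:

```python
# Recompute the macro and weighted F1 averages from the per-class rows.
rows = {            # label: (f1-score, support)
    "LOC":      (0.8394, 946),
    "BUILDING": (0.6099, 185),
    "STREET":   (0.6609, 56),
}
total = sum(n for _, n in rows.values())                       # 1187
macro = sum(f1 for f1, _ in rows.values()) / len(rows)         # unweighted mean
weighted = sum(f1 * n for f1, n in rows.values()) / total      # support-weighted
print(round(macro, 4), round(weighted, 4))  # 0.7034 0.7952
```

The gap between micro (0.7958) and macro (0.7034) F1 reflects the class imbalance: LOC dominates with 946 of the 1187 gold spans, while STREET has only 56.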
2023-10-25 10:07:18,218 ----------------------------------------------------------------------------------------------------