2023-10-25 09:48:20,325 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,326 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 09:48:20,326 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 Train: 6183 sentences
2023-10-25 09:48:20,327 (train_with_dev=False, train_with_test=False)
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 Training Params:
2023-10-25 09:48:20,327 - learning_rate: "3e-05"
2023-10-25 09:48:20,327 - mini_batch_size: "8"
2023-10-25 09:48:20,327 - max_epochs: "10"
2023-10-25 09:48:20,327 - shuffle: "True"
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 Plugins:
2023-10-25 09:48:20,327 - TensorboardLogger
2023-10-25 09:48:20,327 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
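The LinearScheduler plugin warms the learning rate up over the first 10% of steps and then decays it linearly toward zero, which is exactly what the `lr` column in the iteration lines below traces. A minimal sketch of that schedule (step counts derived from this log: 6183 sentences at batch size 8 give 773 iterations per epoch over 10 epochs; `linear_schedule_lr` is a hypothetical helper name for illustration, not Flair API):

```python
import math

# Figures taken from the log: 6183 training sentences, mini_batch_size 8,
# max_epochs 10, learning_rate 3e-05, warmup_fraction 0.1.
iters_per_epoch = math.ceil(6183 / 8)   # 773, matching "iter .../773" below
total_steps = iters_per_epoch * 10      # 7730
warmup_steps = int(total_steps * 0.1)   # 773 -- warmup spans exactly epoch 1

def linear_schedule_lr(step, peak_lr=3e-05):
    """One-cycle schedule: linear warmup to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

Evaluated at step 77 this gives ~0.000003 and at step 1543 (epoch 2, iter 770) ~0.000027, matching the logged `lr` values.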
2023-10-25 09:48:20,327 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 09:48:20,327 - metric: "('micro avg', 'f1-score')"
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 Computation:
2023-10-25 09:48:20,327 - compute on device: cuda:0
2023-10-25 09:48:20,327 - embedding storage: none
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,328 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,328 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 09:48:25,998 epoch 1 - iter 77/773 - loss 2.44068512 - time (sec): 5.67 - samples/sec: 2227.15 - lr: 0.000003 - momentum: 0.000000
2023-10-25 09:48:30,949 epoch 1 - iter 154/773 - loss 1.34447366 - time (sec): 10.62 - samples/sec: 2388.61 - lr: 0.000006 - momentum: 0.000000
2023-10-25 09:48:35,983 epoch 1 - iter 231/773 - loss 0.95565315 - time (sec): 15.65 - samples/sec: 2403.03 - lr: 0.000009 - momentum: 0.000000
2023-10-25 09:48:40,857 epoch 1 - iter 308/773 - loss 0.74863025 - time (sec): 20.53 - samples/sec: 2444.69 - lr: 0.000012 - momentum: 0.000000
2023-10-25 09:48:45,768 epoch 1 - iter 385/773 - loss 0.63300379 - time (sec): 25.44 - samples/sec: 2421.02 - lr: 0.000015 - momentum: 0.000000
2023-10-25 09:48:50,689 epoch 1 - iter 462/773 - loss 0.54698749 - time (sec): 30.36 - samples/sec: 2433.96 - lr: 0.000018 - momentum: 0.000000
2023-10-25 09:48:55,538 epoch 1 - iter 539/773 - loss 0.48462883 - time (sec): 35.21 - samples/sec: 2453.96 - lr: 0.000021 - momentum: 0.000000
2023-10-25 09:49:00,693 epoch 1 - iter 616/773 - loss 0.43581574 - time (sec): 40.36 - samples/sec: 2460.96 - lr: 0.000024 - momentum: 0.000000
2023-10-25 09:49:05,753 epoch 1 - iter 693/773 - loss 0.39859972 - time (sec): 45.42 - samples/sec: 2452.02 - lr: 0.000027 - momentum: 0.000000
2023-10-25 09:49:10,816 epoch 1 - iter 770/773 - loss 0.36632636 - time (sec): 50.49 - samples/sec: 2456.53 - lr: 0.000030 - momentum: 0.000000
2023-10-25 09:49:11,013 ----------------------------------------------------------------------------------------------------
2023-10-25 09:49:11,014 EPOCH 1 done: loss 0.3659 - lr: 0.000030
2023-10-25 09:49:13,651 DEV : loss 0.06680409610271454 - f1-score (micro avg) 0.673
2023-10-25 09:49:13,670 saving best model
2023-10-25 09:49:14,233 ----------------------------------------------------------------------------------------------------
2023-10-25 09:49:19,151 epoch 2 - iter 77/773 - loss 0.08307367 - time (sec): 4.92 - samples/sec: 2511.83 - lr: 0.000030 - momentum: 0.000000
2023-10-25 09:49:24,030 epoch 2 - iter 154/773 - loss 0.07562126 - time (sec): 9.79 - samples/sec: 2496.58 - lr: 0.000029 - momentum: 0.000000
2023-10-25 09:49:29,067 epoch 2 - iter 231/773 - loss 0.07838681 - time (sec): 14.83 - samples/sec: 2486.59 - lr: 0.000029 - momentum: 0.000000
2023-10-25 09:49:34,072 epoch 2 - iter 308/773 - loss 0.07706583 - time (sec): 19.84 - samples/sec: 2489.64 - lr: 0.000029 - momentum: 0.000000
2023-10-25 09:49:39,069 epoch 2 - iter 385/773 - loss 0.07642793 - time (sec): 24.83 - samples/sec: 2476.52 - lr: 0.000028 - momentum: 0.000000
2023-10-25 09:49:43,970 epoch 2 - iter 462/773 - loss 0.07419511 - time (sec): 29.73 - samples/sec: 2483.97 - lr: 0.000028 - momentum: 0.000000
2023-10-25 09:49:49,027 epoch 2 - iter 539/773 - loss 0.07195341 - time (sec): 34.79 - samples/sec: 2484.94 - lr: 0.000028 - momentum: 0.000000
2023-10-25 09:49:54,504 epoch 2 - iter 616/773 - loss 0.07050405 - time (sec): 40.27 - samples/sec: 2460.10 - lr: 0.000027 - momentum: 0.000000
2023-10-25 09:49:59,469 epoch 2 - iter 693/773 - loss 0.07203292 - time (sec): 45.23 - samples/sec: 2470.66 - lr: 0.000027 - momentum: 0.000000
2023-10-25 09:50:04,418 epoch 2 - iter 770/773 - loss 0.07229932 - time (sec): 50.18 - samples/sec: 2469.35 - lr: 0.000027 - momentum: 0.000000
2023-10-25 09:50:04,614 ----------------------------------------------------------------------------------------------------
2023-10-25 09:50:04,615 EPOCH 2 done: loss 0.0723 - lr: 0.000027
2023-10-25 09:50:07,405 DEV : loss 0.05944029986858368 - f1-score (micro avg) 0.7636
2023-10-25 09:50:07,428 saving best model
2023-10-25 09:50:08,212 ----------------------------------------------------------------------------------------------------
2023-10-25 09:50:13,281 epoch 3 - iter 77/773 - loss 0.03624577 - time (sec): 5.07 - samples/sec: 2463.78 - lr: 0.000026 - momentum: 0.000000
2023-10-25 09:50:18,315 epoch 3 - iter 154/773 - loss 0.03645789 - time (sec): 10.10 - samples/sec: 2418.55 - lr: 0.000026 - momentum: 0.000000
2023-10-25 09:50:23,400 epoch 3 - iter 231/773 - loss 0.03657292 - time (sec): 15.19 - samples/sec: 2405.52 - lr: 0.000026 - momentum: 0.000000
2023-10-25 09:50:28,540 epoch 3 - iter 308/773 - loss 0.03934264 - time (sec): 20.32 - samples/sec: 2403.18 - lr: 0.000025 - momentum: 0.000000
2023-10-25 09:50:33,684 epoch 3 - iter 385/773 - loss 0.03930106 - time (sec): 25.47 - samples/sec: 2400.76 - lr: 0.000025 - momentum: 0.000000
2023-10-25 09:50:38,552 epoch 3 - iter 462/773 - loss 0.04052705 - time (sec): 30.34 - samples/sec: 2441.13 - lr: 0.000025 - momentum: 0.000000
2023-10-25 09:50:43,656 epoch 3 - iter 539/773 - loss 0.04200801 - time (sec): 35.44 - samples/sec: 2443.28 - lr: 0.000024 - momentum: 0.000000
2023-10-25 09:50:48,660 epoch 3 - iter 616/773 - loss 0.04251387 - time (sec): 40.45 - samples/sec: 2445.86 - lr: 0.000024 - momentum: 0.000000
2023-10-25 09:50:53,646 epoch 3 - iter 693/773 - loss 0.04265944 - time (sec): 45.43 - samples/sec: 2445.82 - lr: 0.000024 - momentum: 0.000000
2023-10-25 09:50:58,686 epoch 3 - iter 770/773 - loss 0.04239354 - time (sec): 50.47 - samples/sec: 2449.55 - lr: 0.000023 - momentum: 0.000000
2023-10-25 09:50:58,909 ----------------------------------------------------------------------------------------------------
2023-10-25 09:50:58,909 EPOCH 3 done: loss 0.0423 - lr: 0.000023
2023-10-25 09:51:01,639 DEV : loss 0.07079580426216125 - f1-score (micro avg) 0.7485
2023-10-25 09:51:01,657 ----------------------------------------------------------------------------------------------------
2023-10-25 09:51:06,731 epoch 4 - iter 77/773 - loss 0.02225122 - time (sec): 5.07 - samples/sec: 2536.78 - lr: 0.000023 - momentum: 0.000000
2023-10-25 09:51:11,642 epoch 4 - iter 154/773 - loss 0.02524008 - time (sec): 9.98 - samples/sec: 2450.31 - lr: 0.000023 - momentum: 0.000000
2023-10-25 09:51:16,877 epoch 4 - iter 231/773 - loss 0.02602377 - time (sec): 15.22 - samples/sec: 2466.82 - lr: 0.000022 - momentum: 0.000000
2023-10-25 09:51:21,955 epoch 4 - iter 308/773 - loss 0.02739077 - time (sec): 20.30 - samples/sec: 2486.84 - lr: 0.000022 - momentum: 0.000000
2023-10-25 09:51:26,855 epoch 4 - iter 385/773 - loss 0.02692404 - time (sec): 25.20 - samples/sec: 2470.56 - lr: 0.000022 - momentum: 0.000000
2023-10-25 09:51:31,800 epoch 4 - iter 462/773 - loss 0.02708752 - time (sec): 30.14 - samples/sec: 2483.30 - lr: 0.000021 - momentum: 0.000000
2023-10-25 09:51:36,791 epoch 4 - iter 539/773 - loss 0.02609069 - time (sec): 35.13 - samples/sec: 2501.52 - lr: 0.000021 - momentum: 0.000000
2023-10-25 09:51:41,808 epoch 4 - iter 616/773 - loss 0.02645417 - time (sec): 40.15 - samples/sec: 2492.57 - lr: 0.000021 - momentum: 0.000000
2023-10-25 09:51:46,771 epoch 4 - iter 693/773 - loss 0.02725811 - time (sec): 45.11 - samples/sec: 2483.94 - lr: 0.000020 - momentum: 0.000000
2023-10-25 09:51:51,638 epoch 4 - iter 770/773 - loss 0.02781422 - time (sec): 49.98 - samples/sec: 2476.71 - lr: 0.000020 - momentum: 0.000000
2023-10-25 09:51:51,928 ----------------------------------------------------------------------------------------------------
2023-10-25 09:51:51,929 EPOCH 4 done: loss 0.0278 - lr: 0.000020
2023-10-25 09:51:54,482 DEV : loss 0.09247004240751266 - f1-score (micro avg) 0.7868
2023-10-25 09:51:54,510 saving best model
2023-10-25 09:51:55,226 ----------------------------------------------------------------------------------------------------
2023-10-25 09:52:00,327 epoch 5 - iter 77/773 - loss 0.02243221 - time (sec): 5.10 - samples/sec: 2233.83 - lr: 0.000020 - momentum: 0.000000
2023-10-25 09:52:05,427 epoch 5 - iter 154/773 - loss 0.02225000 - time (sec): 10.20 - samples/sec: 2414.07 - lr: 0.000019 - momentum: 0.000000
2023-10-25 09:52:10,533 epoch 5 - iter 231/773 - loss 0.01978642 - time (sec): 15.30 - samples/sec: 2444.43 - lr: 0.000019 - momentum: 0.000000
2023-10-25 09:52:15,595 epoch 5 - iter 308/773 - loss 0.02239947 - time (sec): 20.37 - samples/sec: 2447.16 - lr: 0.000019 - momentum: 0.000000
2023-10-25 09:52:20,586 epoch 5 - iter 385/773 - loss 0.02443414 - time (sec): 25.36 - samples/sec: 2446.04 - lr: 0.000018 - momentum: 0.000000
2023-10-25 09:52:25,754 epoch 5 - iter 462/773 - loss 0.02333707 - time (sec): 30.53 - samples/sec: 2444.34 - lr: 0.000018 - momentum: 0.000000
2023-10-25 09:52:30,804 epoch 5 - iter 539/773 - loss 0.02353120 - time (sec): 35.58 - samples/sec: 2438.86 - lr: 0.000018 - momentum: 0.000000
2023-10-25 09:52:35,821 epoch 5 - iter 616/773 - loss 0.02287409 - time (sec): 40.59 - samples/sec: 2442.07 - lr: 0.000017 - momentum: 0.000000
2023-10-25 09:52:40,961 epoch 5 - iter 693/773 - loss 0.02315495 - time (sec): 45.73 - samples/sec: 2439.91 - lr: 0.000017 - momentum: 0.000000
2023-10-25 09:52:46,106 epoch 5 - iter 770/773 - loss 0.02367241 - time (sec): 50.88 - samples/sec: 2435.85 - lr: 0.000017 - momentum: 0.000000
2023-10-25 09:52:46,279 ----------------------------------------------------------------------------------------------------
2023-10-25 09:52:46,279 EPOCH 5 done: loss 0.0236 - lr: 0.000017
2023-10-25 09:52:48,935 DEV : loss 0.08800413459539413 - f1-score (micro avg) 0.7835
2023-10-25 09:52:48,951 ----------------------------------------------------------------------------------------------------
2023-10-25 09:52:53,960 epoch 6 - iter 77/773 - loss 0.00982011 - time (sec): 5.01 - samples/sec: 2512.18 - lr: 0.000016 - momentum: 0.000000
2023-10-25 09:52:59,003 epoch 6 - iter 154/773 - loss 0.01078328 - time (sec): 10.05 - samples/sec: 2483.04 - lr: 0.000016 - momentum: 0.000000
2023-10-25 09:53:04,477 epoch 6 - iter 231/773 - loss 0.01419181 - time (sec): 15.52 - samples/sec: 2409.37 - lr: 0.000016 - momentum: 0.000000
2023-10-25 09:53:09,531 epoch 6 - iter 308/773 - loss 0.01599070 - time (sec): 20.58 - samples/sec: 2426.99 - lr: 0.000015 - momentum: 0.000000
2023-10-25 09:53:14,405 epoch 6 - iter 385/773 - loss 0.01515093 - time (sec): 25.45 - samples/sec: 2397.06 - lr: 0.000015 - momentum: 0.000000
2023-10-25 09:53:19,406 epoch 6 - iter 462/773 - loss 0.01623985 - time (sec): 30.45 - samples/sec: 2400.37 - lr: 0.000015 - momentum: 0.000000
2023-10-25 09:53:24,366 epoch 6 - iter 539/773 - loss 0.01569609 - time (sec): 35.41 - samples/sec: 2419.50 - lr: 0.000014 - momentum: 0.000000
2023-10-25 09:53:29,359 epoch 6 - iter 616/773 - loss 0.01651881 - time (sec): 40.41 - samples/sec: 2449.80 - lr: 0.000014 - momentum: 0.000000
2023-10-25 09:53:34,430 epoch 6 - iter 693/773 - loss 0.01616537 - time (sec): 45.48 - samples/sec: 2451.34 - lr: 0.000014 - momentum: 0.000000
2023-10-25 09:53:39,468 epoch 6 - iter 770/773 - loss 0.01584310 - time (sec): 50.51 - samples/sec: 2452.20 - lr: 0.000013 - momentum: 0.000000
2023-10-25 09:53:39,658 ----------------------------------------------------------------------------------------------------
2023-10-25 09:53:39,658 EPOCH 6 done: loss 0.0158 - lr: 0.000013
2023-10-25 09:53:42,928 DEV : loss 0.11100301146507263 - f1-score (micro avg) 0.7586
2023-10-25 09:53:42,946 ----------------------------------------------------------------------------------------------------
2023-10-25 09:53:48,033 epoch 7 - iter 77/773 - loss 0.01063580 - time (sec): 5.08 - samples/sec: 2397.23 - lr: 0.000013 - momentum: 0.000000
2023-10-25 09:53:53,188 epoch 7 - iter 154/773 - loss 0.01434576 - time (sec): 10.24 - samples/sec: 2436.04 - lr: 0.000013 - momentum: 0.000000
2023-10-25 09:53:58,266 epoch 7 - iter 231/773 - loss 0.01280713 - time (sec): 15.32 - samples/sec: 2490.55 - lr: 0.000012 - momentum: 0.000000
2023-10-25 09:54:03,465 epoch 7 - iter 308/773 - loss 0.01179169 - time (sec): 20.52 - samples/sec: 2451.55 - lr: 0.000012 - momentum: 0.000000
2023-10-25 09:54:08,512 epoch 7 - iter 385/773 - loss 0.01188043 - time (sec): 25.56 - samples/sec: 2442.80 - lr: 0.000012 - momentum: 0.000000
2023-10-25 09:54:13,621 epoch 7 - iter 462/773 - loss 0.01181791 - time (sec): 30.67 - samples/sec: 2412.34 - lr: 0.000011 - momentum: 0.000000
2023-10-25 09:54:18,982 epoch 7 - iter 539/773 - loss 0.01160376 - time (sec): 36.03 - samples/sec: 2412.04 - lr: 0.000011 - momentum: 0.000000
2023-10-25 09:54:24,039 epoch 7 - iter 616/773 - loss 0.01099026 - time (sec): 41.09 - samples/sec: 2420.38 - lr: 0.000011 - momentum: 0.000000
2023-10-25 09:54:29,120 epoch 7 - iter 693/773 - loss 0.01056242 - time (sec): 46.17 - samples/sec: 2414.93 - lr: 0.000010 - momentum: 0.000000
2023-10-25 09:54:34,123 epoch 7 - iter 770/773 - loss 0.01026174 - time (sec): 51.17 - samples/sec: 2415.69 - lr: 0.000010 - momentum: 0.000000
2023-10-25 09:54:34,322 ----------------------------------------------------------------------------------------------------
2023-10-25 09:54:34,322 EPOCH 7 done: loss 0.0102 - lr: 0.000010
2023-10-25 09:54:36,914 DEV : loss 0.1104028970003128 - f1-score (micro avg) 0.7847
2023-10-25 09:54:36,933 ----------------------------------------------------------------------------------------------------
2023-10-25 09:54:42,059 epoch 8 - iter 77/773 - loss 0.00902845 - time (sec): 5.12 - samples/sec: 2447.37 - lr: 0.000010 - momentum: 0.000000
2023-10-25 09:54:47,039 epoch 8 - iter 154/773 - loss 0.00733489 - time (sec): 10.10 - samples/sec: 2551.79 - lr: 0.000009 - momentum: 0.000000
2023-10-25 09:54:52,056 epoch 8 - iter 231/773 - loss 0.00762665 - time (sec): 15.12 - samples/sec: 2497.49 - lr: 0.000009 - momentum: 0.000000
2023-10-25 09:54:57,067 epoch 8 - iter 308/773 - loss 0.00816844 - time (sec): 20.13 - samples/sec: 2460.35 - lr: 0.000009 - momentum: 0.000000
2023-10-25 09:55:02,176 epoch 8 - iter 385/773 - loss 0.00826861 - time (sec): 25.24 - samples/sec: 2427.67 - lr: 0.000008 - momentum: 0.000000
2023-10-25 09:55:07,189 epoch 8 - iter 462/773 - loss 0.00783738 - time (sec): 30.25 - samples/sec: 2417.84 - lr: 0.000008 - momentum: 0.000000
2023-10-25 09:55:12,188 epoch 8 - iter 539/773 - loss 0.00740576 - time (sec): 35.25 - samples/sec: 2409.14 - lr: 0.000008 - momentum: 0.000000
2023-10-25 09:55:17,334 epoch 8 - iter 616/773 - loss 0.00670498 - time (sec): 40.40 - samples/sec: 2438.22 - lr: 0.000007 - momentum: 0.000000
2023-10-25 09:55:22,409 epoch 8 - iter 693/773 - loss 0.00647742 - time (sec): 45.47 - samples/sec: 2449.99 - lr: 0.000007 - momentum: 0.000000
2023-10-25 09:55:27,686 epoch 8 - iter 770/773 - loss 0.00657971 - time (sec): 50.75 - samples/sec: 2438.64 - lr: 0.000007 - momentum: 0.000000
2023-10-25 09:55:27,910 ----------------------------------------------------------------------------------------------------
2023-10-25 09:55:27,910 EPOCH 8 done: loss 0.0066 - lr: 0.000007
2023-10-25 09:55:30,932 DEV : loss 0.126515731215477 - f1-score (micro avg) 0.7705
2023-10-25 09:55:30,952 ----------------------------------------------------------------------------------------------------
2023-10-25 09:55:35,948 epoch 9 - iter 77/773 - loss 0.00352912 - time (sec): 4.99 - samples/sec: 2314.99 - lr: 0.000006 - momentum: 0.000000
2023-10-25 09:55:41,037 epoch 9 - iter 154/773 - loss 0.00411738 - time (sec): 10.08 - samples/sec: 2364.06 - lr: 0.000006 - momentum: 0.000000
2023-10-25 09:55:45,993 epoch 9 - iter 231/773 - loss 0.00342230 - time (sec): 15.04 - samples/sec: 2436.84 - lr: 0.000006 - momentum: 0.000000
2023-10-25 09:55:51,280 epoch 9 - iter 308/773 - loss 0.00417078 - time (sec): 20.33 - samples/sec: 2429.69 - lr: 0.000005 - momentum: 0.000000
2023-10-25 09:55:56,485 epoch 9 - iter 385/773 - loss 0.00410253 - time (sec): 25.53 - samples/sec: 2451.17 - lr: 0.000005 - momentum: 0.000000
2023-10-25 09:56:01,460 epoch 9 - iter 462/773 - loss 0.00411207 - time (sec): 30.51 - samples/sec: 2462.56 - lr: 0.000005 - momentum: 0.000000
2023-10-25 09:56:06,653 epoch 9 - iter 539/773 - loss 0.00389372 - time (sec): 35.70 - samples/sec: 2472.27 - lr: 0.000004 - momentum: 0.000000
2023-10-25 09:56:11,766 epoch 9 - iter 616/773 - loss 0.00353167 - time (sec): 40.81 - samples/sec: 2460.01 - lr: 0.000004 - momentum: 0.000000
2023-10-25 09:56:16,752 epoch 9 - iter 693/773 - loss 0.00362333 - time (sec): 45.80 - samples/sec: 2440.61 - lr: 0.000004 - momentum: 0.000000
2023-10-25 09:56:21,727 epoch 9 - iter 770/773 - loss 0.00379019 - time (sec): 50.77 - samples/sec: 2438.22 - lr: 0.000003 - momentum: 0.000000
2023-10-25 09:56:21,933 ----------------------------------------------------------------------------------------------------
2023-10-25 09:56:21,934 EPOCH 9 done: loss 0.0038 - lr: 0.000003
2023-10-25 09:56:25,021 DEV : loss 0.12340500205755234 - f1-score (micro avg) 0.7823
2023-10-25 09:56:25,042 ----------------------------------------------------------------------------------------------------
2023-10-25 09:56:30,157 epoch 10 - iter 77/773 - loss 0.00453140 - time (sec): 5.11 - samples/sec: 2412.07 - lr: 0.000003 - momentum: 0.000000
2023-10-25 09:56:35,026 epoch 10 - iter 154/773 - loss 0.00572525 - time (sec): 9.98 - samples/sec: 2351.54 - lr: 0.000003 - momentum: 0.000000
2023-10-25 09:56:39,971 epoch 10 - iter 231/773 - loss 0.00379453 - time (sec): 14.93 - samples/sec: 2397.12 - lr: 0.000002 - momentum: 0.000000
2023-10-25 09:56:45,045 epoch 10 - iter 308/773 - loss 0.00410640 - time (sec): 20.00 - samples/sec: 2404.59 - lr: 0.000002 - momentum: 0.000000
2023-10-25 09:56:50,059 epoch 10 - iter 385/773 - loss 0.00356080 - time (sec): 25.02 - samples/sec: 2439.49 - lr: 0.000002 - momentum: 0.000000
2023-10-25 09:56:55,022 epoch 10 - iter 462/773 - loss 0.00339071 - time (sec): 29.98 - samples/sec: 2446.81 - lr: 0.000001 - momentum: 0.000000
2023-10-25 09:57:00,096 epoch 10 - iter 539/773 - loss 0.00337113 - time (sec): 35.05 - samples/sec: 2440.30 - lr: 0.000001 - momentum: 0.000000
2023-10-25 09:57:05,154 epoch 10 - iter 616/773 - loss 0.00330523 - time (sec): 40.11 - samples/sec: 2448.87 - lr: 0.000001 - momentum: 0.000000
2023-10-25 09:57:10,122 epoch 10 - iter 693/773 - loss 0.00301120 - time (sec): 45.08 - samples/sec: 2456.87 - lr: 0.000000 - momentum: 0.000000
2023-10-25 09:57:15,200 epoch 10 - iter 770/773 - loss 0.00310954 - time (sec): 50.16 - samples/sec: 2468.08 - lr: 0.000000 - momentum: 0.000000
2023-10-25 09:57:15,384 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:15,384 EPOCH 10 done: loss 0.0031 - lr: 0.000000
2023-10-25 09:57:17,906 DEV : loss 0.12285467237234116 - f1-score (micro avg) 0.7886
2023-10-25 09:57:17,925 saving best model
2023-10-25 09:57:19,238 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:19,240 Loading model from best epoch ...
2023-10-25 09:57:21,486 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
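The 13 tags form a BIOES scheme: O plus the S/B/E/I prefixes for each of the three entity types (LOC, BUILDING, STREET). A small decoder sketch for turning such a tag sequence back into entity spans (a hypothetical helper for illustration, not Flair's own decoding code):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end, label) spans, inclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-")
        if prefix == "S":      # single-token entity
            spans.append((i, i, label))
            start = None
        elif prefix == "B":    # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((start, i, label))
            start = None
    return spans
```

For example, `["O", "S-LOC", "B-STREET", "I-STREET", "E-STREET"]` decodes to a one-token LOC span and a three-token STREET span.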
2023-10-25 09:57:30,130
Results:
- F-score (micro) 0.8131
- F-score (macro) 0.7236
- Accuracy 0.7047
By class:
              precision    recall  f1-score   support

         LOC     0.8469    0.8710    0.8588       946
    BUILDING     0.6073    0.6270    0.6170       185
      STREET     0.6613    0.7321    0.6949        56

   micro avg     0.8002    0.8265    0.8131      1187
   macro avg     0.7052    0.7434    0.7236      1187
weighted avg     0.8008    0.8265    0.8134      1187
2023-10-25 09:57:30,130 ----------------------------------------------------------------------------------------------------
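As a sanity check, the macro and weighted averages in the final table follow directly from the per-class f1-scores and supports (values copied from the log above):

```python
# Per-class f1-scores and supports from the test-set report.
per_class = {
    "LOC":      {"f1": 0.8588, "support": 946},
    "BUILDING": {"f1": 0.6170, "support": 185},
    "STREET":   {"f1": 0.6949, "support": 56},
}

# Macro average: unweighted mean over classes.
macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)

# Weighted average: mean weighted by class support.
total = sum(c["support"] for c in per_class.values())
weighted_f1 = sum(c["f1"] * c["support"] for c in per_class.values()) / total
```

This reproduces the reported 0.7236 macro and 0.8134 weighted f1 over 1187 test entities.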