2023-10-25 12:09:01,302 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:01,303 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 12:09:01,303 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:01,303 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
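
Editor's note: a hedged sketch of loading this corpus with Flair's HIPE-2022 loader; the parameter names (dataset_name, language, add_document_separator) are assumptions inferred from the dataset path above.

from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(
    dataset_name="topres19th",
    language="en",
    add_document_separator=True,  # matches ".../with_doc_seperator" in the path above
)
print(corpus)  # should report 6183 train + 680 dev + 2113 test sentences

# Label dictionary used by the tagger sketch above.
label_dict = corpus.make_label_dictionary(label_type="ner")
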
2023-10-25 12:09:01,303 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:01,303 Train: 6183 sentences
2023-10-25 12:09:01,303 (train_with_dev=False, train_with_test=False)
2023-10-25 12:09:01,304 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:01,304 Training Params:
2023-10-25 12:09:01,304 - learning_rate: "3e-05"
2023-10-25 12:09:01,304 - mini_batch_size: "8"
2023-10-25 12:09:01,304 - max_epochs: "10"
2023-10-25 12:09:01,304 - shuffle: "True"
2023-10-25 12:09:01,304 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:01,304 Plugins:
2023-10-25 12:09:01,304 - TensorboardLogger
2023-10-25 12:09:01,304 - LinearScheduler | warmup_fraction: '0.1'
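
Editor's note: a hedged sketch of a fine_tune call matching these parameters, reusing `corpus` and `tagger` from the sketches above. The TensorboardLogger import path and the `plugins` argument are assumptions; the linear warmup schedule (warmup_fraction 0.1) is what Flair's fine_tune attaches by default, which matches the LinearScheduler line above.

from flair.trainers import ModelTrainer
from flair.trainers.plugins import TensorboardLogger  # import path is an assumption

trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=3e-5,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
    plugins=[TensorboardLogger()],  # logs the training scalars to TensorBoard
)
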
2023-10-25 12:09:01,304 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:01,304 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 12:09:01,304 - metric: "('micro avg', 'f1-score')"
2023-10-25 12:09:01,304 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:01,304 Computation:
2023-10-25 12:09:01,304 - compute on device: cuda:0
2023-10-25 12:09:01,304 - embedding storage: none
2023-10-25 12:09:01,304 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:01,304 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 12:09:01,304 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:01,304 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:01,304 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 12:09:05,698 epoch 1 - iter 77/773 - loss 2.04935701 - time (sec): 4.39 - samples/sec: 2982.47 - lr: 0.000003 - momentum: 0.000000
2023-10-25 12:09:10,085 epoch 1 - iter 154/773 - loss 1.15224118 - time (sec): 8.78 - samples/sec: 2971.82 - lr: 0.000006 - momentum: 0.000000
2023-10-25 12:09:14,492 epoch 1 - iter 231/773 - loss 0.83962482 - time (sec): 13.19 - samples/sec: 2899.99 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:09:18,796 epoch 1 - iter 308/773 - loss 0.67002780 - time (sec): 17.49 - samples/sec: 2880.95 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:09:23,015 epoch 1 - iter 385/773 - loss 0.56132674 - time (sec): 21.71 - samples/sec: 2892.66 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:09:27,216 epoch 1 - iter 462/773 - loss 0.48308824 - time (sec): 25.91 - samples/sec: 2915.44 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:09:31,389 epoch 1 - iter 539/773 - loss 0.42824782 - time (sec): 30.08 - samples/sec: 2907.87 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:09:35,594 epoch 1 - iter 616/773 - loss 0.38904871 - time (sec): 34.29 - samples/sec: 2913.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:09:39,872 epoch 1 - iter 693/773 - loss 0.35655845 - time (sec): 38.57 - samples/sec: 2917.22 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:09:43,997 epoch 1 - iter 770/773 - loss 0.33228475 - time (sec): 42.69 - samples/sec: 2897.79 - lr: 0.000030 - momentum: 0.000000
2023-10-25 12:09:44,162 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:44,162 EPOCH 1 done: loss 0.3309 - lr: 0.000030
2023-10-25 12:09:47,293 DEV : loss 0.049993742257356644 - f1-score (micro avg) 0.7523
2023-10-25 12:09:47,319 saving best model
2023-10-25 12:09:47,898 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:52,486 epoch 2 - iter 77/773 - loss 0.05586007 - time (sec): 4.59 - samples/sec: 2762.10 - lr: 0.000030 - momentum: 0.000000
2023-10-25 12:09:57,004 epoch 2 - iter 154/773 - loss 0.06548935 - time (sec): 9.10 - samples/sec: 2828.81 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:10:01,297 epoch 2 - iter 231/773 - loss 0.06383761 - time (sec): 13.40 - samples/sec: 2888.10 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:10:05,751 epoch 2 - iter 308/773 - loss 0.06415157 - time (sec): 17.85 - samples/sec: 2910.03 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:10:10,194 epoch 2 - iter 385/773 - loss 0.07204142 - time (sec): 22.29 - samples/sec: 2908.49 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:10:14,677 epoch 2 - iter 462/773 - loss 0.07062633 - time (sec): 26.78 - samples/sec: 2864.21 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:10:18,899 epoch 2 - iter 539/773 - loss 0.07204287 - time (sec): 31.00 - samples/sec: 2857.66 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:10:23,280 epoch 2 - iter 616/773 - loss 0.07210437 - time (sec): 35.38 - samples/sec: 2818.57 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:10:27,601 epoch 2 - iter 693/773 - loss 0.07212975 - time (sec): 39.70 - samples/sec: 2803.29 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:10:32,010 epoch 2 - iter 770/773 - loss 0.07237257 - time (sec): 44.11 - samples/sec: 2809.61 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:10:32,167 ----------------------------------------------------------------------------------------------------
2023-10-25 12:10:32,168 EPOCH 2 done: loss 0.0723 - lr: 0.000027
2023-10-25 12:10:34,902 DEV : loss 0.06731431186199188 - f1-score (micro avg) 0.6855
2023-10-25 12:10:34,921 ----------------------------------------------------------------------------------------------------
2023-10-25 12:10:40,075 epoch 3 - iter 77/773 - loss 0.04149199 - time (sec): 5.15 - samples/sec: 2307.69 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:10:44,706 epoch 3 - iter 154/773 - loss 0.04585807 - time (sec): 9.78 - samples/sec: 2476.90 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:10:49,149 epoch 3 - iter 231/773 - loss 0.04473604 - time (sec): 14.23 - samples/sec: 2559.25 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:10:53,891 epoch 3 - iter 308/773 - loss 0.04580293 - time (sec): 18.97 - samples/sec: 2566.94 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:10:58,585 epoch 3 - iter 385/773 - loss 0.04415912 - time (sec): 23.66 - samples/sec: 2561.23 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:11:03,129 epoch 3 - iter 462/773 - loss 0.04458377 - time (sec): 28.21 - samples/sec: 2580.06 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:11:07,629 epoch 3 - iter 539/773 - loss 0.04465819 - time (sec): 32.71 - samples/sec: 2625.80 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:11:12,217 epoch 3 - iter 616/773 - loss 0.04392868 - time (sec): 37.29 - samples/sec: 2651.43 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:11:16,638 epoch 3 - iter 693/773 - loss 0.04295948 - time (sec): 41.72 - samples/sec: 2669.75 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:11:20,917 epoch 3 - iter 770/773 - loss 0.04338122 - time (sec): 45.99 - samples/sec: 2694.91 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:11:21,073 ----------------------------------------------------------------------------------------------------
2023-10-25 12:11:21,073 EPOCH 3 done: loss 0.0435 - lr: 0.000023
2023-10-25 12:11:23,599 DEV : loss 0.06992863118648529 - f1-score (micro avg) 0.7865
2023-10-25 12:11:23,617 saving best model
2023-10-25 12:11:24,310 ----------------------------------------------------------------------------------------------------
2023-10-25 12:11:28,634 epoch 4 - iter 77/773 - loss 0.03433423 - time (sec): 4.32 - samples/sec: 2690.37 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:11:33,053 epoch 4 - iter 154/773 - loss 0.02988579 - time (sec): 8.74 - samples/sec: 2771.11 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:11:37,747 epoch 4 - iter 231/773 - loss 0.03026380 - time (sec): 13.43 - samples/sec: 2748.66 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:11:42,346 epoch 4 - iter 308/773 - loss 0.03020876 - time (sec): 18.03 - samples/sec: 2694.06 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:11:47,174 epoch 4 - iter 385/773 - loss 0.03311283 - time (sec): 22.86 - samples/sec: 2724.08 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:11:51,918 epoch 4 - iter 462/773 - loss 0.03238387 - time (sec): 27.61 - samples/sec: 2694.60 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:11:56,468 epoch 4 - iter 539/773 - loss 0.03213652 - time (sec): 32.16 - samples/sec: 2705.37 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:12:00,887 epoch 4 - iter 616/773 - loss 0.03163816 - time (sec): 36.57 - samples/sec: 2678.18 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:12:05,429 epoch 4 - iter 693/773 - loss 0.03249376 - time (sec): 41.12 - samples/sec: 2700.12 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:12:09,862 epoch 4 - iter 770/773 - loss 0.03182826 - time (sec): 45.55 - samples/sec: 2714.95 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:12:10,038 ----------------------------------------------------------------------------------------------------
2023-10-25 12:12:10,039 EPOCH 4 done: loss 0.0318 - lr: 0.000020
2023-10-25 12:12:12,691 DEV : loss 0.0831867977976799 - f1-score (micro avg) 0.7757
2023-10-25 12:12:12,709 ----------------------------------------------------------------------------------------------------
2023-10-25 12:12:17,196 epoch 5 - iter 77/773 - loss 0.01118081 - time (sec): 4.49 - samples/sec: 2845.40 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:12:21,635 epoch 5 - iter 154/773 - loss 0.01225207 - time (sec): 8.92 - samples/sec: 2784.43 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:12:26,012 epoch 5 - iter 231/773 - loss 0.01513811 - time (sec): 13.30 - samples/sec: 2779.75 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:12:30,375 epoch 5 - iter 308/773 - loss 0.01783538 - time (sec): 17.66 - samples/sec: 2778.96 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:12:34,838 epoch 5 - iter 385/773 - loss 0.01923528 - time (sec): 22.13 - samples/sec: 2788.59 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:12:39,253 epoch 5 - iter 462/773 - loss 0.02067768 - time (sec): 26.54 - samples/sec: 2788.11 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:12:43,815 epoch 5 - iter 539/773 - loss 0.02178588 - time (sec): 31.10 - samples/sec: 2768.08 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:12:48,247 epoch 5 - iter 616/773 - loss 0.02202980 - time (sec): 35.54 - samples/sec: 2799.37 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:12:52,671 epoch 5 - iter 693/773 - loss 0.02206390 - time (sec): 39.96 - samples/sec: 2789.53 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:12:57,019 epoch 5 - iter 770/773 - loss 0.02268567 - time (sec): 44.31 - samples/sec: 2794.12 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:12:57,193 ----------------------------------------------------------------------------------------------------
2023-10-25 12:12:57,194 EPOCH 5 done: loss 0.0227 - lr: 0.000017
2023-10-25 12:12:59,923 DEV : loss 0.10409079492092133 - f1-score (micro avg) 0.7599
2023-10-25 12:12:59,941 ----------------------------------------------------------------------------------------------------
2023-10-25 12:13:04,359 epoch 6 - iter 77/773 - loss 0.01348224 - time (sec): 4.42 - samples/sec: 2675.10 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:13:08,929 epoch 6 - iter 154/773 - loss 0.01638656 - time (sec): 8.99 - samples/sec: 2724.77 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:13:13,271 epoch 6 - iter 231/773 - loss 0.01518915 - time (sec): 13.33 - samples/sec: 2699.04 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:13:17,554 epoch 6 - iter 308/773 - loss 0.01423275 - time (sec): 17.61 - samples/sec: 2715.12 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:13:21,948 epoch 6 - iter 385/773 - loss 0.01465336 - time (sec): 22.00 - samples/sec: 2719.42 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:13:26,275 epoch 6 - iter 462/773 - loss 0.01428615 - time (sec): 26.33 - samples/sec: 2755.48 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:13:30,650 epoch 6 - iter 539/773 - loss 0.01633936 - time (sec): 30.71 - samples/sec: 2796.82 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:13:34,925 epoch 6 - iter 616/773 - loss 0.01638676 - time (sec): 34.98 - samples/sec: 2821.52 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:13:39,145 epoch 6 - iter 693/773 - loss 0.01555940 - time (sec): 39.20 - samples/sec: 2844.70 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:13:43,632 epoch 6 - iter 770/773 - loss 0.01534485 - time (sec): 43.69 - samples/sec: 2830.40 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:13:43,817 ----------------------------------------------------------------------------------------------------
2023-10-25 12:13:43,818 EPOCH 6 done: loss 0.0153 - lr: 0.000013
2023-10-25 12:13:47,014 DEV : loss 0.12169384211301804 - f1-score (micro avg) 0.7558
2023-10-25 12:13:47,030 ----------------------------------------------------------------------------------------------------
2023-10-25 12:13:51,329 epoch 7 - iter 77/773 - loss 0.01436857 - time (sec): 4.30 - samples/sec: 2942.53 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:13:55,723 epoch 7 - iter 154/773 - loss 0.01141007 - time (sec): 8.69 - samples/sec: 2978.98 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:14:00,111 epoch 7 - iter 231/773 - loss 0.01256394 - time (sec): 13.08 - samples/sec: 2923.47 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:14:04,412 epoch 7 - iter 308/773 - loss 0.01155963 - time (sec): 17.38 - samples/sec: 2877.60 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:14:08,899 epoch 7 - iter 385/773 - loss 0.01165634 - time (sec): 21.87 - samples/sec: 2848.16 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:14:13,201 epoch 7 - iter 462/773 - loss 0.01214036 - time (sec): 26.17 - samples/sec: 2826.93 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:14:17,593 epoch 7 - iter 539/773 - loss 0.01164323 - time (sec): 30.56 - samples/sec: 2804.21 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:14:22,113 epoch 7 - iter 616/773 - loss 0.01236649 - time (sec): 35.08 - samples/sec: 2810.57 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:14:26,511 epoch 7 - iter 693/773 - loss 0.01338845 - time (sec): 39.48 - samples/sec: 2829.38 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:14:30,788 epoch 7 - iter 770/773 - loss 0.01263676 - time (sec): 43.76 - samples/sec: 2831.85 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:14:30,948 ----------------------------------------------------------------------------------------------------
2023-10-25 12:14:30,949 EPOCH 7 done: loss 0.0126 - lr: 0.000010
2023-10-25 12:14:33,504 DEV : loss 0.12100550532341003 - f1-score (micro avg) 0.7798
2023-10-25 12:14:33,521 ----------------------------------------------------------------------------------------------------
2023-10-25 12:14:38,008 epoch 8 - iter 77/773 - loss 0.00697278 - time (sec): 4.48 - samples/sec: 2710.94 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:14:42,374 epoch 8 - iter 154/773 - loss 0.00942980 - time (sec): 8.85 - samples/sec: 2728.78 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:14:47,348 epoch 8 - iter 231/773 - loss 0.00907831 - time (sec): 13.82 - samples/sec: 2680.93 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:14:52,128 epoch 8 - iter 308/773 - loss 0.00926748 - time (sec): 18.61 - samples/sec: 2648.71 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:14:56,906 epoch 8 - iter 385/773 - loss 0.00942620 - time (sec): 23.38 - samples/sec: 2667.48 - lr: 0.000008 - momentum: 0.000000
2023-10-25 12:15:01,658 epoch 8 - iter 462/773 - loss 0.00856654 - time (sec): 28.13 - samples/sec: 2670.62 - lr: 0.000008 - momentum: 0.000000
2023-10-25 12:15:06,213 epoch 8 - iter 539/773 - loss 0.00915887 - time (sec): 32.69 - samples/sec: 2686.56 - lr: 0.000008 - momentum: 0.000000
2023-10-25 12:15:10,739 epoch 8 - iter 616/773 - loss 0.00945267 - time (sec): 37.22 - samples/sec: 2687.82 - lr: 0.000007 - momentum: 0.000000
2023-10-25 12:15:15,239 epoch 8 - iter 693/773 - loss 0.00899884 - time (sec): 41.72 - samples/sec: 2683.48 - lr: 0.000007 - momentum: 0.000000
2023-10-25 12:15:19,697 epoch 8 - iter 770/773 - loss 0.00865270 - time (sec): 46.17 - samples/sec: 2684.25 - lr: 0.000007 - momentum: 0.000000
2023-10-25 12:15:19,865 ----------------------------------------------------------------------------------------------------
2023-10-25 12:15:19,865 EPOCH 8 done: loss 0.0086 - lr: 0.000007
2023-10-25 12:15:22,716 DEV : loss 0.1208329051733017 - f1-score (micro avg) 0.7718
2023-10-25 12:15:22,735 ----------------------------------------------------------------------------------------------------
2023-10-25 12:15:27,218 epoch 9 - iter 77/773 - loss 0.00207713 - time (sec): 4.48 - samples/sec: 2590.59 - lr: 0.000006 - momentum: 0.000000
2023-10-25 12:15:31,785 epoch 9 - iter 154/773 - loss 0.00299376 - time (sec): 9.05 - samples/sec: 2690.56 - lr: 0.000006 - momentum: 0.000000
2023-10-25 12:15:36,337 epoch 9 - iter 231/773 - loss 0.00439480 - time (sec): 13.60 - samples/sec: 2662.19 - lr: 0.000006 - momentum: 0.000000
2023-10-25 12:15:40,867 epoch 9 - iter 308/773 - loss 0.00427536 - time (sec): 18.13 - samples/sec: 2678.24 - lr: 0.000005 - momentum: 0.000000
2023-10-25 12:15:45,495 epoch 9 - iter 385/773 - loss 0.00426518 - time (sec): 22.76 - samples/sec: 2696.46 - lr: 0.000005 - momentum: 0.000000
2023-10-25 12:15:50,316 epoch 9 - iter 462/773 - loss 0.00496819 - time (sec): 27.58 - samples/sec: 2690.43 - lr: 0.000005 - momentum: 0.000000
2023-10-25 12:15:55,200 epoch 9 - iter 539/773 - loss 0.00520497 - time (sec): 32.46 - samples/sec: 2672.68 - lr: 0.000004 - momentum: 0.000000
2023-10-25 12:15:59,912 epoch 9 - iter 616/773 - loss 0.00510625 - time (sec): 37.17 - samples/sec: 2659.85 - lr: 0.000004 - momentum: 0.000000
2023-10-25 12:16:04,732 epoch 9 - iter 693/773 - loss 0.00494235 - time (sec): 41.99 - samples/sec: 2654.97 - lr: 0.000004 - momentum: 0.000000
2023-10-25 12:16:09,705 epoch 9 - iter 770/773 - loss 0.00530484 - time (sec): 46.97 - samples/sec: 2631.90 - lr: 0.000003 - momentum: 0.000000
2023-10-25 12:16:09,890 ----------------------------------------------------------------------------------------------------
2023-10-25 12:16:09,890 EPOCH 9 done: loss 0.0053 - lr: 0.000003
2023-10-25 12:16:12,699 DEV : loss 0.12893347442150116 - f1-score (micro avg) 0.7771
2023-10-25 12:16:12,715 ----------------------------------------------------------------------------------------------------
2023-10-25 12:16:17,489 epoch 10 - iter 77/773 - loss 0.00071556 - time (sec): 4.77 - samples/sec: 2666.44 - lr: 0.000003 - momentum: 0.000000
2023-10-25 12:16:22,187 epoch 10 - iter 154/773 - loss 0.00252394 - time (sec): 9.47 - samples/sec: 2581.74 - lr: 0.000003 - momentum: 0.000000
2023-10-25 12:16:26,865 epoch 10 - iter 231/773 - loss 0.00200991 - time (sec): 14.15 - samples/sec: 2640.64 - lr: 0.000002 - momentum: 0.000000
2023-10-25 12:16:31,612 epoch 10 - iter 308/773 - loss 0.00177058 - time (sec): 18.90 - samples/sec: 2657.01 - lr: 0.000002 - momentum: 0.000000
2023-10-25 12:16:36,407 epoch 10 - iter 385/773 - loss 0.00273068 - time (sec): 23.69 - samples/sec: 2650.98 - lr: 0.000002 - momentum: 0.000000
2023-10-25 12:16:41,225 epoch 10 - iter 462/773 - loss 0.00291902 - time (sec): 28.51 - samples/sec: 2655.16 - lr: 0.000001 - momentum: 0.000000
2023-10-25 12:16:45,670 epoch 10 - iter 539/773 - loss 0.00304957 - time (sec): 32.95 - samples/sec: 2675.25 - lr: 0.000001 - momentum: 0.000000
2023-10-25 12:16:50,087 epoch 10 - iter 616/773 - loss 0.00324520 - time (sec): 37.37 - samples/sec: 2666.88 - lr: 0.000001 - momentum: 0.000000
2023-10-25 12:16:54,512 epoch 10 - iter 693/773 - loss 0.00312779 - time (sec): 41.79 - samples/sec: 2683.66 - lr: 0.000000 - momentum: 0.000000
2023-10-25 12:16:58,780 epoch 10 - iter 770/773 - loss 0.00321731 - time (sec): 46.06 - samples/sec: 2688.80 - lr: 0.000000 - momentum: 0.000000
2023-10-25 12:16:58,943 ----------------------------------------------------------------------------------------------------
2023-10-25 12:16:58,944 EPOCH 10 done: loss 0.0032 - lr: 0.000000
2023-10-25 12:17:01,521 DEV : loss 0.12955833971500397 - f1-score (micro avg) 0.7787
2023-10-25 12:17:02,420 ----------------------------------------------------------------------------------------------------
2023-10-25 12:17:02,422 Loading model from best epoch ...
2023-10-25 12:17:04,237 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 12:17:12,614
Results:
- F-score (micro) 0.7703
- F-score (macro) 0.606
- Accuracy 0.6435
By class:
              precision    recall  f1-score   support

         LOC     0.8309    0.8414    0.8361       946
    BUILDING     0.5915    0.2270    0.3281       185
      STREET     0.7083    0.6071    0.6538        56

   micro avg     0.8097    0.7346    0.7703      1187
   macro avg     0.7103    0.5585    0.6060      1187
weighted avg     0.7878    0.7346    0.7484      1187
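
Editor's note: as a quick sanity check, the micro-averaged F1 above is the harmonic mean of the reported micro precision and recall.

# Harmonic mean of the reported micro-averaged precision and recall.
precision, recall = 0.8097, 0.7346
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.7703, matching the "F-score (micro)" line above
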
2023-10-25 12:17:12,614 ----------------------------------------------------------------------------------------------------