2023-10-18 17:48:28,457 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 Train: 3575 sentences
2023-10-18 17:48:28,458 (train_with_dev=False, train_with_test=False)
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 Training Params:
2023-10-18 17:48:28,458 - learning_rate: "3e-05"
2023-10-18 17:48:28,458 - mini_batch_size: "4"
2023-10-18 17:48:28,458 - max_epochs: "10"
2023-10-18 17:48:28,458 - shuffle: "True"
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 Plugins:
2023-10-18 17:48:28,458 - TensorboardLogger
2023-10-18 17:48:28,458 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 17:48:28,458 - metric: "('micro avg', 'f1-score')"
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 Computation:
2023-10-18 17:48:28,458 - compute on device: cuda:0
2023-10-18 17:48:28,458 - embedding storage: none
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,459 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 17:48:28,459 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,459 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,459 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 17:48:29,748 epoch 1 - iter 89/894 - loss 3.19630071 - time (sec): 1.29 - samples/sec: 7045.16 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:48:31,082 epoch 1 - iter 178/894 - loss 2.96353199 - time (sec): 2.62 - samples/sec: 7187.38 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:48:32,421 epoch 1 - iter 267/894 - loss 2.74540200 - time (sec): 3.96 - samples/sec: 6702.85 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:48:33,790 epoch 1 - iter 356/894 - loss 2.43894253 - time (sec): 5.33 - samples/sec: 6411.16 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:48:35,196 epoch 1 - iter 445/894 - loss 2.12976788 - time (sec): 6.74 - samples/sec: 6277.64 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:48:36,585 epoch 1 - iter 534/894 - loss 1.88648563 - time (sec): 8.13 - samples/sec: 6263.25 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:48:37,976 epoch 1 - iter 623/894 - loss 1.67326583 - time (sec): 9.52 - samples/sec: 6416.47 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:48:39,326 epoch 1 - iter 712/894 - loss 1.53353448 - time (sec): 10.87 - samples/sec: 6403.24 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:48:40,702 epoch 1 - iter 801/894 - loss 1.42922638 - time (sec): 12.24 - samples/sec: 6366.34 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:48:42,044 epoch 1 - iter 890/894 - loss 1.35456652 - time (sec): 13.58 - samples/sec: 6351.81 - lr: 0.000030 - momentum: 0.000000
2023-10-18 17:48:42,100 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:42,100 EPOCH 1 done: loss 1.3532 - lr: 0.000030
2023-10-18 17:48:44,322 DEV : loss 0.45246344804763794 - f1-score (micro avg) 0.0
2023-10-18 17:48:44,344 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:45,711 epoch 2 - iter 89/894 - loss 0.52574422 - time (sec): 1.37 - samples/sec: 6183.97 - lr: 0.000030 - momentum: 0.000000
2023-10-18 17:48:47,080 epoch 2 - iter 178/894 - loss 0.55545770 - time (sec): 2.74 - samples/sec: 6374.43 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:48:48,520 epoch 2 - iter 267/894 - loss 0.54150980 - time (sec): 4.18 - samples/sec: 6041.74 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:48:49,990 epoch 2 - iter 356/894 - loss 0.53379165 - time (sec): 5.64 - samples/sec: 5904.15 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:48:51,410 epoch 2 - iter 445/894 - loss 0.53062271 - time (sec): 7.07 - samples/sec: 6094.43 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:48:52,758 epoch 2 - iter 534/894 - loss 0.52081983 - time (sec): 8.41 - samples/sec: 6121.72 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:48:54,219 epoch 2 - iter 623/894 - loss 0.51963233 - time (sec): 9.87 - samples/sec: 6246.67 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:48:55,574 epoch 2 - iter 712/894 - loss 0.51540611 - time (sec): 11.23 - samples/sec: 6162.82 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:48:56,953 epoch 2 - iter 801/894 - loss 0.51041159 - time (sec): 12.61 - samples/sec: 6165.15 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:48:58,341 epoch 2 - iter 890/894 - loss 0.50535478 - time (sec): 14.00 - samples/sec: 6164.74 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:48:58,402 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:58,403 EPOCH 2 done: loss 0.5064 - lr: 0.000027
2023-10-18 17:49:03,571 DEV : loss 0.3578655421733856 - f1-score (micro avg) 0.0592
2023-10-18 17:49:03,594 saving best model
2023-10-18 17:49:03,627 ----------------------------------------------------------------------------------------------------
2023-10-18 17:49:05,037 epoch 3 - iter 89/894 - loss 0.48336938 - time (sec): 1.41 - samples/sec: 6484.71 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:49:06,438 epoch 3 - iter 178/894 - loss 0.47658882 - time (sec): 2.81 - samples/sec: 6324.71 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:49:07,814 epoch 3 - iter 267/894 - loss 0.45712463 - time (sec): 4.19 - samples/sec: 6340.69 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:49:09,179 epoch 3 - iter 356/894 - loss 0.46393711 - time (sec): 5.55 - samples/sec: 6172.52 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:49:10,558 epoch 3 - iter 445/894 - loss 0.44830358 - time (sec): 6.93 - samples/sec: 6114.08 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:49:11,950 epoch 3 - iter 534/894 - loss 0.44669476 - time (sec): 8.32 - samples/sec: 6129.35 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:49:13,338 epoch 3 - iter 623/894 - loss 0.43924135 - time (sec): 9.71 - samples/sec: 6133.73 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:49:14,741 epoch 3 - iter 712/894 - loss 0.43899574 - time (sec): 11.11 - samples/sec: 6177.45 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:49:16,164 epoch 3 - iter 801/894 - loss 0.43256443 - time (sec): 12.54 - samples/sec: 6201.00 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:49:17,548 epoch 3 - iter 890/894 - loss 0.42949594 - time (sec): 13.92 - samples/sec: 6194.26 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:49:17,606 ----------------------------------------------------------------------------------------------------
2023-10-18 17:49:17,606 EPOCH 3 done: loss 0.4296 - lr: 0.000023
2023-10-18 17:49:22,809 DEV : loss 0.34198394417762756 - f1-score (micro avg) 0.2455
2023-10-18 17:49:22,832 saving best model
2023-10-18 17:49:22,865 ----------------------------------------------------------------------------------------------------
2023-10-18 17:49:24,232 epoch 4 - iter 89/894 - loss 0.41318889 - time (sec): 1.37 - samples/sec: 5961.29 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:49:25,688 epoch 4 - iter 178/894 - loss 0.38872500 - time (sec): 2.82 - samples/sec: 6480.28 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:49:27,081 epoch 4 - iter 267/894 - loss 0.38659751 - time (sec): 4.22 - samples/sec: 6377.23 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:49:28,484 epoch 4 - iter 356/894 - loss 0.39484288 - time (sec): 5.62 - samples/sec: 6340.75 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:49:29,929 epoch 4 - iter 445/894 - loss 0.38596894 - time (sec): 7.06 - samples/sec: 6296.23 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:49:31,291 epoch 4 - iter 534/894 - loss 0.38762013 - time (sec): 8.43 - samples/sec: 6266.21 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:49:32,694 epoch 4 - iter 623/894 - loss 0.38363860 - time (sec): 9.83 - samples/sec: 6235.32 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:49:34,090 epoch 4 - iter 712/894 - loss 0.38723100 - time (sec): 11.22 - samples/sec: 6214.74 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:49:35,490 epoch 4 - iter 801/894 - loss 0.38665888 - time (sec): 12.62 - samples/sec: 6162.80 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:49:36,891 epoch 4 - iter 890/894 - loss 0.38610173 - time (sec): 14.03 - samples/sec: 6143.24 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:49:36,957 ----------------------------------------------------------------------------------------------------
2023-10-18 17:49:36,957 EPOCH 4 done: loss 0.3846 - lr: 0.000020
2023-10-18 17:49:41,899 DEV : loss 0.3150075674057007 - f1-score (micro avg) 0.3247
2023-10-18 17:49:41,923 saving best model
2023-10-18 17:49:41,956 ----------------------------------------------------------------------------------------------------
2023-10-18 17:49:43,421 epoch 5 - iter 89/894 - loss 0.38144140 - time (sec): 1.46 - samples/sec: 5968.14 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:49:44,878 epoch 5 - iter 178/894 - loss 0.35024717 - time (sec): 2.92 - samples/sec: 6219.91 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:49:46,330 epoch 5 - iter 267/894 - loss 0.35588428 - time (sec): 4.37 - samples/sec: 5912.08 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:49:47,698 epoch 5 - iter 356/894 - loss 0.35467769 - time (sec): 5.74 - samples/sec: 6017.05 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:49:49,081 epoch 5 - iter 445/894 - loss 0.36077938 - time (sec): 7.12 - samples/sec: 5971.89 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:49:50,469 epoch 5 - iter 534/894 - loss 0.36734085 - time (sec): 8.51 - samples/sec: 5956.77 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:49:51,914 epoch 5 - iter 623/894 - loss 0.36702674 - time (sec): 9.96 - samples/sec: 6018.41 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:49:53,392 epoch 5 - iter 712/894 - loss 0.36851786 - time (sec): 11.44 - samples/sec: 6059.43 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:49:54,785 epoch 5 - iter 801/894 - loss 0.36449745 - time (sec): 12.83 - samples/sec: 6069.19 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:49:56,193 epoch 5 - iter 890/894 - loss 0.35757282 - time (sec): 14.24 - samples/sec: 6057.28 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:49:56,259 ----------------------------------------------------------------------------------------------------
2023-10-18 17:49:56,259 EPOCH 5 done: loss 0.3570 - lr: 0.000017
2023-10-18 17:50:01,547 DEV : loss 0.3157925307750702 - f1-score (micro avg) 0.3235
2023-10-18 17:50:01,571 ----------------------------------------------------------------------------------------------------
2023-10-18 17:50:03,005 epoch 6 - iter 89/894 - loss 0.34112212 - time (sec): 1.43 - samples/sec: 5988.42 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:50:04,422 epoch 6 - iter 178/894 - loss 0.33324202 - time (sec): 2.85 - samples/sec: 5812.42 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:50:05,839 epoch 6 - iter 267/894 - loss 0.32141621 - time (sec): 4.27 - samples/sec: 5680.00 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:50:07,210 epoch 6 - iter 356/894 - loss 0.34308998 - time (sec): 5.64 - samples/sec: 5766.94 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:50:08,571 epoch 6 - iter 445/894 - loss 0.34151889 - time (sec): 7.00 - samples/sec: 5841.83 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:50:10,009 epoch 6 - iter 534/894 - loss 0.35052796 - time (sec): 8.44 - samples/sec: 6032.34 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:50:11,412 epoch 6 - iter 623/894 - loss 0.34315959 - time (sec): 9.84 - samples/sec: 6061.10 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:50:12,775 epoch 6 - iter 712/894 - loss 0.33510239 - time (sec): 11.20 - samples/sec: 6112.00 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:50:14,192 epoch 6 - iter 801/894 - loss 0.33637264 - time (sec): 12.62 - samples/sec: 6162.15 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:50:15,601 epoch 6 - iter 890/894 - loss 0.33235115 - time (sec): 14.03 - samples/sec: 6145.37 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:50:15,658 ----------------------------------------------------------------------------------------------------
2023-10-18 17:50:15,658 EPOCH 6 done: loss 0.3326 - lr: 0.000013
2023-10-18 17:50:20,999 DEV : loss 0.30548417568206787 - f1-score (micro avg) 0.3435
2023-10-18 17:50:21,022 saving best model
2023-10-18 17:50:21,056 ----------------------------------------------------------------------------------------------------
2023-10-18 17:50:22,623 epoch 7 - iter 89/894 - loss 0.31222141 - time (sec): 1.57 - samples/sec: 5714.42 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:50:24,005 epoch 7 - iter 178/894 - loss 0.32705010 - time (sec): 2.95 - samples/sec: 5872.12 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:50:25,399 epoch 7 - iter 267/894 - loss 0.32263937 - time (sec): 4.34 - samples/sec: 5931.39 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:50:26,782 epoch 7 - iter 356/894 - loss 0.32544371 - time (sec): 5.73 - samples/sec: 5904.42 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:50:28,201 epoch 7 - iter 445/894 - loss 0.31601726 - time (sec): 7.14 - samples/sec: 5919.08 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:50:29,634 epoch 7 - iter 534/894 - loss 0.31615422 - time (sec): 8.58 - samples/sec: 6012.46 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:50:31,015 epoch 7 - iter 623/894 - loss 0.31916724 - time (sec): 9.96 - samples/sec: 6066.94 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:50:32,413 epoch 7 - iter 712/894 - loss 0.32144582 - time (sec): 11.36 - samples/sec: 6137.21 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:50:33,805 epoch 7 - iter 801/894 - loss 0.32125099 - time (sec): 12.75 - samples/sec: 6150.79 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:50:35,151 epoch 7 - iter 890/894 - loss 0.31897858 - time (sec): 14.09 - samples/sec: 6117.17 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:50:35,210 ----------------------------------------------------------------------------------------------------
2023-10-18 17:50:35,210 EPOCH 7 done: loss 0.3195 - lr: 0.000010
2023-10-18 17:50:40,535 DEV : loss 0.3022187650203705 - f1-score (micro avg) 0.3497
2023-10-18 17:50:40,559 saving best model
2023-10-18 17:50:40,599 ----------------------------------------------------------------------------------------------------
2023-10-18 17:50:42,000 epoch 8 - iter 89/894 - loss 0.34555510 - time (sec): 1.40 - samples/sec: 6326.33 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:50:43,433 epoch 8 - iter 178/894 - loss 0.32944189 - time (sec): 2.83 - samples/sec: 6467.94 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:50:44,836 epoch 8 - iter 267/894 - loss 0.32918232 - time (sec): 4.24 - samples/sec: 6225.29 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:50:46,332 epoch 8 - iter 356/894 - loss 0.31980933 - time (sec): 5.73 - samples/sec: 6146.00 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:50:47,718 epoch 8 - iter 445/894 - loss 0.31177364 - time (sec): 7.12 - samples/sec: 6270.04 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:50:49,196 epoch 8 - iter 534/894 - loss 0.30996565 - time (sec): 8.60 - samples/sec: 6170.04 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:50:50,598 epoch 8 - iter 623/894 - loss 0.31266497 - time (sec): 10.00 - samples/sec: 6092.52 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:50:52,119 epoch 8 - iter 712/894 - loss 0.30729815 - time (sec): 11.52 - samples/sec: 6096.38 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:50:53,590 epoch 8 - iter 801/894 - loss 0.31538057 - time (sec): 12.99 - samples/sec: 6039.31 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:50:54,948 epoch 8 - iter 890/894 - loss 0.31034285 - time (sec): 14.35 - samples/sec: 6002.89 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:50:55,010 ----------------------------------------------------------------------------------------------------
2023-10-18 17:50:55,010 EPOCH 8 done: loss 0.3097 - lr: 0.000007
2023-10-18 17:50:59,964 DEV : loss 0.30434364080429077 - f1-score (micro avg) 0.3522
2023-10-18 17:50:59,988 saving best model
2023-10-18 17:51:00,025 ----------------------------------------------------------------------------------------------------
2023-10-18 17:51:01,426 epoch 9 - iter 89/894 - loss 0.23959273 - time (sec): 1.40 - samples/sec: 6024.01 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:51:02,796 epoch 9 - iter 178/894 - loss 0.27105112 - time (sec): 2.77 - samples/sec: 6003.56 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:51:04,184 epoch 9 - iter 267/894 - loss 0.28143971 - time (sec): 4.16 - samples/sec: 6298.31 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:51:05,606 epoch 9 - iter 356/894 - loss 0.29591837 - time (sec): 5.58 - samples/sec: 6341.19 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:51:07,028 epoch 9 - iter 445/894 - loss 0.29997546 - time (sec): 7.00 - samples/sec: 6173.84 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:51:08,706 epoch 9 - iter 534/894 - loss 0.30091527 - time (sec): 8.68 - samples/sec: 5943.69 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:51:10,103 epoch 9 - iter 623/894 - loss 0.30243132 - time (sec): 10.08 - samples/sec: 5982.17 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:51:11,554 epoch 9 - iter 712/894 - loss 0.29460013 - time (sec): 11.53 - samples/sec: 6064.68 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:51:12,917 epoch 9 - iter 801/894 - loss 0.30229177 - time (sec): 12.89 - samples/sec: 6047.04 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:51:14,303 epoch 9 - iter 890/894 - loss 0.30375418 - time (sec): 14.28 - samples/sec: 6036.73 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:51:14,365 ----------------------------------------------------------------------------------------------------
2023-10-18 17:51:14,365 EPOCH 9 done: loss 0.3029 - lr: 0.000003
2023-10-18 17:51:19,332 DEV : loss 0.30048123002052307 - f1-score (micro avg) 0.3541
2023-10-18 17:51:19,357 saving best model
2023-10-18 17:51:19,393 ----------------------------------------------------------------------------------------------------
2023-10-18 17:51:20,890 epoch 10 - iter 89/894 - loss 0.29449068 - time (sec): 1.50 - samples/sec: 6558.64 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:51:22,283 epoch 10 - iter 178/894 - loss 0.30220126 - time (sec): 2.89 - samples/sec: 6265.58 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:51:23,707 epoch 10 - iter 267/894 - loss 0.28820853 - time (sec): 4.31 - samples/sec: 6138.84 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:51:25,097 epoch 10 - iter 356/894 - loss 0.29937325 - time (sec): 5.70 - samples/sec: 6225.42 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:51:26,464 epoch 10 - iter 445/894 - loss 0.29761073 - time (sec): 7.07 - samples/sec: 6096.08 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:51:27,899 epoch 10 - iter 534/894 - loss 0.29810315 - time (sec): 8.51 - samples/sec: 6153.61 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:51:29,274 epoch 10 - iter 623/894 - loss 0.30838358 - time (sec): 9.88 - samples/sec: 6263.76 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:51:30,631 epoch 10 - iter 712/894 - loss 0.31079931 - time (sec): 11.24 - samples/sec: 6198.45 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:51:31,990 epoch 10 - iter 801/894 - loss 0.30309182 - time (sec): 12.60 - samples/sec: 6192.14 - lr: 0.000000 - momentum: 0.000000
2023-10-18 17:51:33,352 epoch 10 - iter 890/894 - loss 0.30297146 - time (sec): 13.96 - samples/sec: 6148.71 - lr: 0.000000 - momentum: 0.000000
2023-10-18 17:51:33,434 ----------------------------------------------------------------------------------------------------
2023-10-18 17:51:33,434 EPOCH 10 done: loss 0.3025 - lr: 0.000000
2023-10-18 17:51:38,772 DEV : loss 0.300153523683548 - f1-score (micro avg) 0.3579
2023-10-18 17:51:38,797 saving best model
2023-10-18 17:51:38,868 ----------------------------------------------------------------------------------------------------
2023-10-18 17:51:38,868 Loading model from best epoch ...
2023-10-18 17:51:38,945 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-18 17:51:41,291
Results:
- F-score (micro) 0.3715
- F-score (macro) 0.1528
- Accuracy 0.2374
By class:
              precision    recall  f1-score   support

         loc     0.4882    0.5923    0.5353       596
        pers     0.1927    0.2072    0.1997       333
         org     0.0000    0.0000    0.0000       132
        time     0.0500    0.0204    0.0290        49
        prod     0.0000    0.0000    0.0000        66

   micro avg     0.3842    0.3597    0.3715      1176
   macro avg     0.1462    0.1640    0.1528      1176
weighted avg     0.3041    0.3597    0.3290      1176
2023-10-18 17:51:41,291 ----------------------------------------------------------------------------------------------------