2023-10-18 18:07:05,092 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,092 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 18:07:05,092 ----------------------------------------------------------------------------------------------------
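For orientation, the layer shapes in the dump above fully determine the model's parameter count (roughly 4.6M, hence "bert-tiny"). A quick recomputation from the printed sizes, purely as a sanity check on the architecture:

```python
# Recompute the parameter count of the SequenceTagger above from its layer shapes.
def linear(n_in, n_out):           # Linear layer: weight matrix + bias vector
    return n_in * n_out + n_out

def layer_norm(dim):               # LayerNorm: weight + bias
    return 2 * dim

hidden, inter, vocab, max_pos, n_layers, n_tags = 128, 512, 32001, 512, 2, 21

embeddings = (vocab * hidden           # word_embeddings: Embedding(32001, 128)
              + max_pos * hidden       # position_embeddings: Embedding(512, 128)
              + 2 * hidden             # token_type_embeddings: Embedding(2, 128)
              + layer_norm(hidden))

per_layer = (3 * linear(hidden, hidden)   # query / key / value
             + linear(hidden, hidden)     # attention output dense
             + layer_norm(hidden)         # attention output LayerNorm
             + linear(hidden, inter)      # intermediate dense
             + linear(inter, hidden)      # output dense
             + layer_norm(hidden))        # output LayerNorm

pooler = linear(hidden, hidden)
head = linear(hidden, n_tags)             # final (linear) tagging layer

total = embeddings + n_layers * per_layer + pooler + head
print(total)  # 4577941, i.e. ~4.6M parameters
```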
2023-10-18 18:07:05,092 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 18:07:05,092 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,092 Train: 3575 sentences
2023-10-18 18:07:05,092 (train_with_dev=False, train_with_test=False)
2023-10-18 18:07:05,092 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,092 Training Params:
2023-10-18 18:07:05,093 - learning_rate: "3e-05"
2023-10-18 18:07:05,093 - mini_batch_size: "8"
2023-10-18 18:07:05,093 - max_epochs: "10"
2023-10-18 18:07:05,093 - shuffle: "True"
2023-10-18 18:07:05,093 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,093 Plugins:
2023-10-18 18:07:05,093 - TensorboardLogger
2023-10-18 18:07:05,093 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:07:05,093 ----------------------------------------------------------------------------------------------------
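The LinearScheduler plugin warms the learning rate up over the first 10% of all updates and then decays it linearly to zero, which is exactly the pattern the lr column shows below (rising through epoch 1, falling afterwards). A minimal sketch of that schedule, assuming 447 iterations per epoch over 10 epochs as reported in the log:

```python
# Linear learning-rate schedule with warmup (warmup_fraction = 0.1):
# ramp up over the first 10% of steps, then decay linearly to zero.
PEAK_LR = 3e-05
STEPS_PER_EPOCH = 447                    # iterations per epoch, from the log
TOTAL_STEPS = 10 * STEPS_PER_EPOCH
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)    # 447 -> warmup covers exactly epoch 1

def lr_at(step: int) -> float:
    if step <= WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

print(round(lr_at(44), 6))    # 3e-06  (matches epoch 1, iter 44 below)
print(round(lr_at(447), 6))   # 3e-05  (end of warmup / peak)
print(round(lr_at(4470), 6))  # 0.0    (end of training)
```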
2023-10-18 18:07:05,093 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:07:05,093 - metric: "('micro avg', 'f1-score')"
2023-10-18 18:07:05,093 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,093 Computation:
2023-10-18 18:07:05,093 - compute on device: cuda:0
2023-10-18 18:07:05,093 - embedding storage: none
2023-10-18 18:07:05,093 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,093 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 18:07:05,093 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,093 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,093 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 18:07:05,891 epoch 1 - iter 44/447 - loss 3.50461292 - time (sec): 0.80 - samples/sec: 10088.17 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:07:06,695 epoch 1 - iter 88/447 - loss 3.36992254 - time (sec): 1.60 - samples/sec: 10075.15 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:07:07,556 epoch 1 - iter 132/447 - loss 3.17925117 - time (sec): 2.46 - samples/sec: 10196.84 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:07:08,433 epoch 1 - iter 176/447 - loss 2.91552427 - time (sec): 3.34 - samples/sec: 10158.79 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:07:09,438 epoch 1 - iter 220/447 - loss 2.64264091 - time (sec): 4.34 - samples/sec: 9779.21 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:07:10,433 epoch 1 - iter 264/447 - loss 2.37544260 - time (sec): 5.34 - samples/sec: 9466.40 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:07:11,401 epoch 1 - iter 308/447 - loss 2.14809863 - time (sec): 6.31 - samples/sec: 9273.60 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:07:12,414 epoch 1 - iter 352/447 - loss 1.96232764 - time (sec): 7.32 - samples/sec: 9127.34 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:07:13,473 epoch 1 - iter 396/447 - loss 1.78387650 - time (sec): 8.38 - samples/sec: 9196.65 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:07:14,484 epoch 1 - iter 440/447 - loss 1.66986552 - time (sec): 9.39 - samples/sec: 9077.85 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:07:14,643 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:14,643 EPOCH 1 done: loss 1.6561 - lr: 0.000029
2023-10-18 18:07:16,854 DEV : loss 0.47917860746383667 - f1-score (micro avg) 0.0
2023-10-18 18:07:16,878 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:17,885 epoch 2 - iter 44/447 - loss 0.59675206 - time (sec): 1.01 - samples/sec: 8748.78 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:07:18,906 epoch 2 - iter 88/447 - loss 0.57436524 - time (sec): 2.03 - samples/sec: 8496.81 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:07:19,912 epoch 2 - iter 132/447 - loss 0.57218780 - time (sec): 3.03 - samples/sec: 8338.73 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:07:20,901 epoch 2 - iter 176/447 - loss 0.57549297 - time (sec): 4.02 - samples/sec: 8262.18 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:07:21,897 epoch 2 - iter 220/447 - loss 0.57292048 - time (sec): 5.02 - samples/sec: 8299.12 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:07:22,896 epoch 2 - iter 264/447 - loss 0.56116937 - time (sec): 6.02 - samples/sec: 8342.46 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:07:23,874 epoch 2 - iter 308/447 - loss 0.56094107 - time (sec): 7.00 - samples/sec: 8327.98 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:07:24,902 epoch 2 - iter 352/447 - loss 0.55109095 - time (sec): 8.02 - samples/sec: 8331.66 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:07:25,979 epoch 2 - iter 396/447 - loss 0.54595643 - time (sec): 9.10 - samples/sec: 8433.04 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:07:26,986 epoch 2 - iter 440/447 - loss 0.54125168 - time (sec): 10.11 - samples/sec: 8421.53 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:07:27,142 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:27,142 EPOCH 2 done: loss 0.5395 - lr: 0.000027
2023-10-18 18:07:31,982 DEV : loss 0.3830968141555786 - f1-score (micro avg) 0.0046
2023-10-18 18:07:32,007 saving best model
2023-10-18 18:07:32,043 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:33,021 epoch 3 - iter 44/447 - loss 0.49082765 - time (sec): 0.98 - samples/sec: 8339.91 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:07:34,035 epoch 3 - iter 88/447 - loss 0.51096977 - time (sec): 1.99 - samples/sec: 8496.04 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:07:35,097 epoch 3 - iter 132/447 - loss 0.49344824 - time (sec): 3.05 - samples/sec: 8568.51 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:07:36,123 epoch 3 - iter 176/447 - loss 0.49144506 - time (sec): 4.08 - samples/sec: 8540.93 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:07:37,170 epoch 3 - iter 220/447 - loss 0.47976846 - time (sec): 5.13 - samples/sec: 8459.17 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:07:38,209 epoch 3 - iter 264/447 - loss 0.46171274 - time (sec): 6.17 - samples/sec: 8428.96 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:07:39,280 epoch 3 - iter 308/447 - loss 0.45866907 - time (sec): 7.24 - samples/sec: 8351.11 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:07:40,298 epoch 3 - iter 352/447 - loss 0.45353460 - time (sec): 8.25 - samples/sec: 8282.42 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:07:41,314 epoch 3 - iter 396/447 - loss 0.45002822 - time (sec): 9.27 - samples/sec: 8318.39 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:07:42,298 epoch 3 - iter 440/447 - loss 0.45039203 - time (sec): 10.26 - samples/sec: 8305.69 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:07:42,459 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:42,460 EPOCH 3 done: loss 0.4492 - lr: 0.000023
2023-10-18 18:07:47,591 DEV : loss 0.35753190517425537 - f1-score (micro avg) 0.0929
2023-10-18 18:07:47,616 saving best model
2023-10-18 18:07:47,647 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:48,651 epoch 4 - iter 44/447 - loss 0.42225951 - time (sec): 1.00 - samples/sec: 8473.16 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:07:49,655 epoch 4 - iter 88/447 - loss 0.44058185 - time (sec): 2.01 - samples/sec: 8560.44 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:07:50,648 epoch 4 - iter 132/447 - loss 0.43632193 - time (sec): 3.00 - samples/sec: 8600.18 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:07:51,725 epoch 4 - iter 176/447 - loss 0.42490939 - time (sec): 4.08 - samples/sec: 8563.05 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:07:52,835 epoch 4 - iter 220/447 - loss 0.42362563 - time (sec): 5.19 - samples/sec: 8595.40 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:07:53,870 epoch 4 - iter 264/447 - loss 0.42543379 - time (sec): 6.22 - samples/sec: 8444.87 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:07:54,934 epoch 4 - iter 308/447 - loss 0.41859725 - time (sec): 7.29 - samples/sec: 8347.96 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:07:55,958 epoch 4 - iter 352/447 - loss 0.42123722 - time (sec): 8.31 - samples/sec: 8286.21 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:07:57,002 epoch 4 - iter 396/447 - loss 0.42071393 - time (sec): 9.35 - samples/sec: 8201.04 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:07:58,069 epoch 4 - iter 440/447 - loss 0.41441893 - time (sec): 10.42 - samples/sec: 8196.17 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:07:58,228 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:58,229 EPOCH 4 done: loss 0.4146 - lr: 0.000020
2023-10-18 18:08:03,384 DEV : loss 0.3443821668624878 - f1-score (micro avg) 0.2113
2023-10-18 18:08:03,411 saving best model
2023-10-18 18:08:03,442 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:04,547 epoch 5 - iter 44/447 - loss 0.38157833 - time (sec): 1.10 - samples/sec: 7597.16 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:08:05,639 epoch 5 - iter 88/447 - loss 0.38080177 - time (sec): 2.20 - samples/sec: 8217.75 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:08:06,677 epoch 5 - iter 132/447 - loss 0.38920520 - time (sec): 3.23 - samples/sec: 8330.35 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:08:07,682 epoch 5 - iter 176/447 - loss 0.38415499 - time (sec): 4.24 - samples/sec: 8330.93 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:08:08,668 epoch 5 - iter 220/447 - loss 0.39229553 - time (sec): 5.23 - samples/sec: 8266.94 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:08:09,712 epoch 5 - iter 264/447 - loss 0.38988800 - time (sec): 6.27 - samples/sec: 8187.04 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:08:10,751 epoch 5 - iter 308/447 - loss 0.39386531 - time (sec): 7.31 - samples/sec: 8197.79 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:08:11,774 epoch 5 - iter 352/447 - loss 0.39212543 - time (sec): 8.33 - samples/sec: 8281.85 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:08:12,788 epoch 5 - iter 396/447 - loss 0.39073713 - time (sec): 9.35 - samples/sec: 8287.81 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:08:13,784 epoch 5 - iter 440/447 - loss 0.39025883 - time (sec): 10.34 - samples/sec: 8254.89 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:08:13,937 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:13,938 EPOCH 5 done: loss 0.3880 - lr: 0.000017
2023-10-18 18:08:19,214 DEV : loss 0.32710015773773193 - f1-score (micro avg) 0.2681
2023-10-18 18:08:19,239 saving best model
2023-10-18 18:08:19,269 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:20,299 epoch 6 - iter 44/447 - loss 0.32138402 - time (sec): 1.03 - samples/sec: 8948.29 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:08:21,338 epoch 6 - iter 88/447 - loss 0.35939093 - time (sec): 2.07 - samples/sec: 8475.70 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:08:22,396 epoch 6 - iter 132/447 - loss 0.38072210 - time (sec): 3.13 - samples/sec: 8262.49 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:08:23,491 epoch 6 - iter 176/447 - loss 0.38219611 - time (sec): 4.22 - samples/sec: 8121.81 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:08:24,532 epoch 6 - iter 220/447 - loss 0.38590948 - time (sec): 5.26 - samples/sec: 8195.23 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:08:25,609 epoch 6 - iter 264/447 - loss 0.38458419 - time (sec): 6.34 - samples/sec: 8136.28 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:08:26,669 epoch 6 - iter 308/447 - loss 0.37526689 - time (sec): 7.40 - samples/sec: 8061.49 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:08:27,702 epoch 6 - iter 352/447 - loss 0.37111345 - time (sec): 8.43 - samples/sec: 8062.46 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:08:28,769 epoch 6 - iter 396/447 - loss 0.36328385 - time (sec): 9.50 - samples/sec: 8096.07 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:08:29,786 epoch 6 - iter 440/447 - loss 0.36654173 - time (sec): 10.52 - samples/sec: 8084.61 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:08:29,946 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:29,946 EPOCH 6 done: loss 0.3674 - lr: 0.000013
2023-10-18 18:08:35,149 DEV : loss 0.3198098838329315 - f1-score (micro avg) 0.2982
2023-10-18 18:08:35,174 saving best model
2023-10-18 18:08:35,213 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:36,213 epoch 7 - iter 44/447 - loss 0.30880564 - time (sec): 1.00 - samples/sec: 8101.35 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:08:37,227 epoch 7 - iter 88/447 - loss 0.34698433 - time (sec): 2.01 - samples/sec: 8636.29 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:08:38,218 epoch 7 - iter 132/447 - loss 0.34845425 - time (sec): 3.00 - samples/sec: 8564.49 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:08:39,278 epoch 7 - iter 176/447 - loss 0.34775342 - time (sec): 4.06 - samples/sec: 8745.68 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:08:40,278 epoch 7 - iter 220/447 - loss 0.35181495 - time (sec): 5.06 - samples/sec: 8700.53 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:08:41,293 epoch 7 - iter 264/447 - loss 0.35415511 - time (sec): 6.08 - samples/sec: 8740.50 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:08:42,269 epoch 7 - iter 308/447 - loss 0.35519252 - time (sec): 7.06 - samples/sec: 8636.30 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:08:43,309 epoch 7 - iter 352/447 - loss 0.36019247 - time (sec): 8.09 - samples/sec: 8566.63 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:08:44,299 epoch 7 - iter 396/447 - loss 0.36381442 - time (sec): 9.09 - samples/sec: 8521.82 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:08:45,298 epoch 7 - iter 440/447 - loss 0.35927689 - time (sec): 10.08 - samples/sec: 8457.32 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:08:45,469 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:45,469 EPOCH 7 done: loss 0.3616 - lr: 0.000010
2023-10-18 18:08:50,395 DEV : loss 0.319092720746994 - f1-score (micro avg) 0.3018
2023-10-18 18:08:50,420 saving best model
2023-10-18 18:08:50,457 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:51,438 epoch 8 - iter 44/447 - loss 0.33586850 - time (sec): 0.98 - samples/sec: 8029.13 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:08:52,394 epoch 8 - iter 88/447 - loss 0.34939112 - time (sec): 1.94 - samples/sec: 7813.59 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:08:53,388 epoch 8 - iter 132/447 - loss 0.34860416 - time (sec): 2.93 - samples/sec: 8130.83 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:08:54,411 epoch 8 - iter 176/447 - loss 0.36296272 - time (sec): 3.95 - samples/sec: 8116.89 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:08:55,442 epoch 8 - iter 220/447 - loss 0.35372586 - time (sec): 4.98 - samples/sec: 8030.27 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:08:56,441 epoch 8 - iter 264/447 - loss 0.35124092 - time (sec): 5.98 - samples/sec: 8024.65 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:08:57,459 epoch 8 - iter 308/447 - loss 0.34747995 - time (sec): 7.00 - samples/sec: 8214.22 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:08:58,478 epoch 8 - iter 352/447 - loss 0.35373290 - time (sec): 8.02 - samples/sec: 8332.01 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:08:59,461 epoch 8 - iter 396/447 - loss 0.35277945 - time (sec): 9.00 - samples/sec: 8297.12 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:09:00,541 epoch 8 - iter 440/447 - loss 0.35015304 - time (sec): 10.08 - samples/sec: 8324.91 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:09:01,065 ----------------------------------------------------------------------------------------------------
2023-10-18 18:09:01,065 EPOCH 8 done: loss 0.3492 - lr: 0.000007
2023-10-18 18:09:06,001 DEV : loss 0.31162384152412415 - f1-score (micro avg) 0.3008
2023-10-18 18:09:06,028 ----------------------------------------------------------------------------------------------------
2023-10-18 18:09:07,212 epoch 9 - iter 44/447 - loss 0.36624375 - time (sec): 1.18 - samples/sec: 8244.97 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:09:08,241 epoch 9 - iter 88/447 - loss 0.37242831 - time (sec): 2.21 - samples/sec: 8235.91 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:09:09,304 epoch 9 - iter 132/447 - loss 0.35798892 - time (sec): 3.28 - samples/sec: 8329.30 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:09:10,265 epoch 9 - iter 176/447 - loss 0.36040509 - time (sec): 4.24 - samples/sec: 8162.04 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:09:11,281 epoch 9 - iter 220/447 - loss 0.34292561 - time (sec): 5.25 - samples/sec: 8191.00 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:09:12,283 epoch 9 - iter 264/447 - loss 0.35220271 - time (sec): 6.25 - samples/sec: 8200.26 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:09:13,337 epoch 9 - iter 308/447 - loss 0.34817812 - time (sec): 7.31 - samples/sec: 8175.54 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:09:14,342 epoch 9 - iter 352/447 - loss 0.34389750 - time (sec): 8.31 - samples/sec: 8219.01 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:09:15,337 epoch 9 - iter 396/447 - loss 0.34225937 - time (sec): 9.31 - samples/sec: 8251.47 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:09:16,385 epoch 9 - iter 440/447 - loss 0.34245648 - time (sec): 10.36 - samples/sec: 8172.11 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:09:16,574 ----------------------------------------------------------------------------------------------------
2023-10-18 18:09:16,574 EPOCH 9 done: loss 0.3424 - lr: 0.000003
2023-10-18 18:09:21,816 DEV : loss 0.3109774589538574 - f1-score (micro avg) 0.3081
2023-10-18 18:09:21,841 saving best model
2023-10-18 18:09:21,871 ----------------------------------------------------------------------------------------------------
2023-10-18 18:09:22,862 epoch 10 - iter 44/447 - loss 0.36620473 - time (sec): 0.99 - samples/sec: 8918.75 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:09:23,843 epoch 10 - iter 88/447 - loss 0.36388255 - time (sec): 1.97 - samples/sec: 8882.55 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:09:24,809 epoch 10 - iter 132/447 - loss 0.35657952 - time (sec): 2.94 - samples/sec: 8690.34 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:09:25,762 epoch 10 - iter 176/447 - loss 0.35887161 - time (sec): 3.89 - samples/sec: 8547.45 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:09:26,768 epoch 10 - iter 220/447 - loss 0.35777360 - time (sec): 4.90 - samples/sec: 8436.02 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:09:27,835 epoch 10 - iter 264/447 - loss 0.35322405 - time (sec): 5.96 - samples/sec: 8422.42 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:09:28,938 epoch 10 - iter 308/447 - loss 0.34508246 - time (sec): 7.07 - samples/sec: 8386.42 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:09:29,971 epoch 10 - iter 352/447 - loss 0.34032089 - time (sec): 8.10 - samples/sec: 8370.11 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:09:30,970 epoch 10 - iter 396/447 - loss 0.33827017 - time (sec): 9.10 - samples/sec: 8395.25 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:09:31,965 epoch 10 - iter 440/447 - loss 0.33662610 - time (sec): 10.09 - samples/sec: 8439.74 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:09:32,118 ----------------------------------------------------------------------------------------------------
2023-10-18 18:09:32,119 EPOCH 10 done: loss 0.3368 - lr: 0.000000
2023-10-18 18:09:37,350 DEV : loss 0.3095775544643402 - f1-score (micro avg) 0.3138
2023-10-18 18:09:37,375 saving best model
2023-10-18 18:09:37,436 ----------------------------------------------------------------------------------------------------
2023-10-18 18:09:37,436 Loading model from best epoch ...
2023-10-18 18:09:37,514 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
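The 21 tags above follow the BIOES scheme (S- single-token span, B-/I-/E- begin/inside/end of a multi-token span, O outside) over the five entity types loc, pers, org, prod and time. An illustrative sketch of how such a tag sequence decodes into entity spans (not Flair's internal decoder):

```python
# Decode a BIOES tag sequence into (start, end, label) spans; end is exclusive.
# Illustrative sketch only -- Flair implements its own span decoding.
def bioes_to_spans(tags):
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":                      # single-token entity
            spans.append((i, i + 1, label))
        elif prefix == "B":                    # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((start, i + 1, label))
            start = None
        elif prefix == "O":                    # outside any entity
            start = None
    return spans

print(bioes_to_spans(["O", "S-loc", "B-pers", "I-pers", "E-pers", "O"]))
# [(1, 2, 'loc'), (2, 5, 'pers')]
```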
2023-10-18 18:09:39,395
Results:
- F-score (micro) 0.3101
- F-score (macro) 0.1149
- Accuracy 0.1915
By class:
              precision    recall  f1-score   support

         loc     0.4433    0.4983    0.4692       596
        pers     0.1168    0.0961    0.1054       333
         org     0.0000    0.0000    0.0000       132
        prod     0.0000    0.0000    0.0000        66
        time     0.0000    0.0000    0.0000        49

   micro avg     0.3478    0.2798    0.3101      1176
   macro avg     0.1120    0.1189    0.1149      1176
weighted avg     0.2577    0.2798    0.2676      1176
2023-10-18 18:09:39,395 ----------------------------------------------------------------------------------------------------
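As a consistency check on the table above, the reported micro-average F-score follows directly from the micro precision and recall via the harmonic mean F1 = 2PR / (P + R):

```python
# Verify the reported micro-avg F-score from the micro precision and recall.
precision, recall = 0.3478, 0.2798
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.3101, matching the "micro avg" row
```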