stefan-it: Upload ./training.log with huggingface_hub (commit d1cb3ef)
2023-10-25 15:40:49,725 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,726 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 15:40:49,726 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,726 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 15:40:49,726 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,726 Train: 20847 sentences
2023-10-25 15:40:49,726 (train_with_dev=False, train_with_test=False)
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Training Params:
2023-10-25 15:40:49,727 - learning_rate: "5e-05"
2023-10-25 15:40:49,727 - mini_batch_size: "4"
2023-10-25 15:40:49,727 - max_epochs: "10"
2023-10-25 15:40:49,727 - shuffle: "True"
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Plugins:
2023-10-25 15:40:49,727 - TensorboardLogger
2023-10-25 15:40:49,727 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
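The LinearScheduler plugin's warmup_fraction of 0.1 is consistent with the per-iteration learning rates logged below: with 10 epochs of 5212 mini-batches each, the first epoch ramps linearly up to the peak of 5e-05 and the remaining nine decay linearly back to zero. A minimal sketch of that shape (assuming plain linear warmup/decay; `linear_schedule_lr` is an illustrative helper, not a Flair API):

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Illustrative linear warmup/decay schedule (assumed behaviour of
    Flair's LinearScheduler plugin, not its actual code)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear warmup from 0 to peak_lr over the first warmup_steps steps
        return peak_lr * step / warmup_steps
    # Linear decay from peak_lr back to 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 5212  # 10 epochs x 5212 mini-batches
# Matches the log: lr is ~0.000005 at iter 521 and ~0.000050 at the end of epoch 1
assert abs(linear_schedule_lr(521, total, 5e-5) - 0.000005) < 1e-7
assert abs(linear_schedule_lr(5210, total, 5e-5) - 0.000050) < 1e-6
```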
2023-10-25 15:40:49,727 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 15:40:49,727 - metric: "('micro avg', 'f1-score')"
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Computation:
2023-10-25 15:40:49,727 - compute on device: cuda:0
2023-10-25 15:40:49,727 - embedding storage: none
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
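The base path above encodes the run's configuration (batch size 4, 10 epochs, lr 5e-05, first-subtoken pooling, last layer only, no CRF). A hypothetical reconstruction of that configuration as a Flair fine-tuning script follows; this is a config sketch inferred from the logged parameters, not the authors' actual code, and the class/argument names are assumptions based on Flair's published API:

```python
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# German NewsEye subset of HIPE-2022, as named in the corpus line above
corpus = NER_HIPE_2022(dataset_name="newseye", language="de")

embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-64k-td-cased",
    layers="-1",               # "layers-1" in the base path
    subtoken_pooling="first",  # "poolingfirst"
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=corpus.make_label_dictionary(label_type="ner"),
    tag_type="ner",
    use_crf=False,             # "crfFalse"
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/de-dbmdz/...",  # base path as logged (truncated here)
    learning_rate=5e-5,
    mini_batch_size=4,
    max_epochs=10,
)
```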
2023-10-25 15:40:49,727 ----------------------------------------------------------------------------------------------------
2023-10-25 15:40:49,727 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 15:41:12,430 epoch 1 - iter 521/5212 - loss 1.24847613 - time (sec): 22.70 - samples/sec: 1631.49 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:41:35,742 epoch 1 - iter 1042/5212 - loss 0.79409542 - time (sec): 46.01 - samples/sec: 1589.33 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:41:58,495 epoch 1 - iter 1563/5212 - loss 0.62000166 - time (sec): 68.77 - samples/sec: 1620.44 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:42:21,224 epoch 1 - iter 2084/5212 - loss 0.52313149 - time (sec): 91.50 - samples/sec: 1620.63 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:42:45,174 epoch 1 - iter 2605/5212 - loss 0.46787427 - time (sec): 115.45 - samples/sec: 1636.39 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:43:07,482 epoch 1 - iter 3126/5212 - loss 0.42636894 - time (sec): 137.75 - samples/sec: 1634.90 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:43:29,865 epoch 1 - iter 3647/5212 - loss 0.39828423 - time (sec): 160.14 - samples/sec: 1640.03 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:43:52,495 epoch 1 - iter 4168/5212 - loss 0.37970166 - time (sec): 182.77 - samples/sec: 1635.63 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:44:15,495 epoch 1 - iter 4689/5212 - loss 0.37361807 - time (sec): 205.77 - samples/sec: 1614.91 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:44:37,828 epoch 1 - iter 5210/5212 - loss 0.36166651 - time (sec): 228.10 - samples/sec: 1610.66 - lr: 0.000050 - momentum: 0.000000
2023-10-25 15:44:37,911 ----------------------------------------------------------------------------------------------------
2023-10-25 15:44:37,911 EPOCH 1 done: loss 0.3617 - lr: 0.000050
2023-10-25 15:44:41,574 DEV : loss 0.22080442309379578 - f1-score (micro avg) 0.1438
2023-10-25 15:44:41,599 saving best model
2023-10-25 15:44:42,080 ----------------------------------------------------------------------------------------------------
2023-10-25 15:45:05,344 epoch 2 - iter 521/5212 - loss 0.27701476 - time (sec): 23.26 - samples/sec: 1639.01 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:45:28,141 epoch 2 - iter 1042/5212 - loss 0.26774174 - time (sec): 46.06 - samples/sec: 1663.32 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:45:50,898 epoch 2 - iter 1563/5212 - loss 0.25315314 - time (sec): 68.82 - samples/sec: 1650.18 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:46:14,163 epoch 2 - iter 2084/5212 - loss 0.29757522 - time (sec): 92.08 - samples/sec: 1628.91 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:46:36,427 epoch 2 - iter 2605/5212 - loss 0.32230692 - time (sec): 114.35 - samples/sec: 1631.59 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:46:58,692 epoch 2 - iter 3126/5212 - loss 0.31370599 - time (sec): 136.61 - samples/sec: 1625.37 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:47:21,123 epoch 2 - iter 3647/5212 - loss 0.30764574 - time (sec): 159.04 - samples/sec: 1631.06 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:47:43,308 epoch 2 - iter 4168/5212 - loss 0.29897371 - time (sec): 181.23 - samples/sec: 1637.77 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:48:05,566 epoch 2 - iter 4689/5212 - loss 0.29460317 - time (sec): 203.48 - samples/sec: 1623.40 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:48:27,906 epoch 2 - iter 5210/5212 - loss 0.28663517 - time (sec): 225.83 - samples/sec: 1626.67 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:48:27,993 ----------------------------------------------------------------------------------------------------
2023-10-25 15:48:27,993 EPOCH 2 done: loss 0.2867 - lr: 0.000044
2023-10-25 15:48:35,139 DEV : loss 0.15546594560146332 - f1-score (micro avg) 0.2256
2023-10-25 15:48:35,165 saving best model
2023-10-25 15:48:35,768 ----------------------------------------------------------------------------------------------------
2023-10-25 15:48:57,895 epoch 3 - iter 521/5212 - loss 0.30169616 - time (sec): 22.12 - samples/sec: 1661.09 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:49:20,057 epoch 3 - iter 1042/5212 - loss 0.30759449 - time (sec): 44.29 - samples/sec: 1617.71 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:49:42,707 epoch 3 - iter 1563/5212 - loss 0.27595502 - time (sec): 66.94 - samples/sec: 1612.63 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:50:05,056 epoch 3 - iter 2084/5212 - loss 0.24782471 - time (sec): 89.29 - samples/sec: 1632.91 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:50:27,628 epoch 3 - iter 2605/5212 - loss 0.23028822 - time (sec): 111.86 - samples/sec: 1639.48 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:50:48,988 epoch 3 - iter 3126/5212 - loss 0.22726832 - time (sec): 133.22 - samples/sec: 1652.52 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:51:12,671 epoch 3 - iter 3647/5212 - loss 0.22179543 - time (sec): 156.90 - samples/sec: 1629.53 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:51:34,997 epoch 3 - iter 4168/5212 - loss 0.21682299 - time (sec): 179.23 - samples/sec: 1633.25 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:51:57,314 epoch 3 - iter 4689/5212 - loss 0.21234182 - time (sec): 201.54 - samples/sec: 1625.65 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:52:19,667 epoch 3 - iter 5210/5212 - loss 0.20687427 - time (sec): 223.90 - samples/sec: 1640.40 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:52:19,759 ----------------------------------------------------------------------------------------------------
2023-10-25 15:52:19,759 EPOCH 3 done: loss 0.2068 - lr: 0.000039
2023-10-25 15:52:26,616 DEV : loss 0.24288956820964813 - f1-score (micro avg) 0.2767
2023-10-25 15:52:26,640 saving best model
2023-10-25 15:52:27,244 ----------------------------------------------------------------------------------------------------
2023-10-25 15:52:49,976 epoch 4 - iter 521/5212 - loss 0.14541617 - time (sec): 22.73 - samples/sec: 1594.73 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:53:12,334 epoch 4 - iter 1042/5212 - loss 0.14059672 - time (sec): 45.09 - samples/sec: 1612.24 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:53:34,746 epoch 4 - iter 1563/5212 - loss 0.14336781 - time (sec): 67.50 - samples/sec: 1619.62 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:53:56,360 epoch 4 - iter 2084/5212 - loss 0.15113922 - time (sec): 89.11 - samples/sec: 1613.43 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:54:18,268 epoch 4 - iter 2605/5212 - loss 0.15623749 - time (sec): 111.02 - samples/sec: 1607.30 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:54:40,720 epoch 4 - iter 3126/5212 - loss 0.15743031 - time (sec): 133.47 - samples/sec: 1639.68 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:55:02,499 epoch 4 - iter 3647/5212 - loss 0.15737676 - time (sec): 155.25 - samples/sec: 1630.29 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:55:24,814 epoch 4 - iter 4168/5212 - loss 0.16195469 - time (sec): 177.57 - samples/sec: 1635.41 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:55:46,893 epoch 4 - iter 4689/5212 - loss 0.16745988 - time (sec): 199.65 - samples/sec: 1638.43 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:56:09,860 epoch 4 - iter 5210/5212 - loss 0.16610419 - time (sec): 222.61 - samples/sec: 1649.17 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:56:09,961 ----------------------------------------------------------------------------------------------------
2023-10-25 15:56:09,961 EPOCH 4 done: loss 0.1662 - lr: 0.000033
2023-10-25 15:56:16,861 DEV : loss 0.23563335835933685 - f1-score (micro avg) 0.3339
2023-10-25 15:56:16,886 saving best model
2023-10-25 15:56:17,503 ----------------------------------------------------------------------------------------------------
2023-10-25 15:56:40,112 epoch 5 - iter 521/5212 - loss 0.16272851 - time (sec): 22.60 - samples/sec: 1648.41 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:57:02,345 epoch 5 - iter 1042/5212 - loss 0.15512717 - time (sec): 44.84 - samples/sec: 1621.45 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:57:24,778 epoch 5 - iter 1563/5212 - loss 0.15269704 - time (sec): 67.27 - samples/sec: 1649.97 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:57:46,753 epoch 5 - iter 2084/5212 - loss 0.16050403 - time (sec): 89.25 - samples/sec: 1661.90 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:58:08,939 epoch 5 - iter 2605/5212 - loss 0.16890836 - time (sec): 111.43 - samples/sec: 1645.12 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:58:30,542 epoch 5 - iter 3126/5212 - loss 0.16716994 - time (sec): 133.03 - samples/sec: 1650.66 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:58:52,733 epoch 5 - iter 3647/5212 - loss 0.16808642 - time (sec): 155.23 - samples/sec: 1651.60 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:59:14,632 epoch 5 - iter 4168/5212 - loss 0.16952321 - time (sec): 177.12 - samples/sec: 1661.77 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:59:36,952 epoch 5 - iter 4689/5212 - loss 0.17072890 - time (sec): 199.44 - samples/sec: 1670.87 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:59:58,912 epoch 5 - iter 5210/5212 - loss 0.17071298 - time (sec): 221.40 - samples/sec: 1659.36 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:59:58,988 ----------------------------------------------------------------------------------------------------
2023-10-25 15:59:58,989 EPOCH 5 done: loss 0.1707 - lr: 0.000028
2023-10-25 16:00:05,971 DEV : loss 0.2225915640592575 - f1-score (micro avg) 0.2691
2023-10-25 16:00:05,997 ----------------------------------------------------------------------------------------------------
2023-10-25 16:00:28,448 epoch 6 - iter 521/5212 - loss 0.14161427 - time (sec): 22.45 - samples/sec: 1756.92 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:00:50,604 epoch 6 - iter 1042/5212 - loss 0.13835517 - time (sec): 44.61 - samples/sec: 1765.39 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:01:12,353 epoch 6 - iter 1563/5212 - loss 0.14662044 - time (sec): 66.35 - samples/sec: 1706.62 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:01:33,931 epoch 6 - iter 2084/5212 - loss 0.16265685 - time (sec): 87.93 - samples/sec: 1700.00 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:01:55,994 epoch 6 - iter 2605/5212 - loss 0.16371044 - time (sec): 110.00 - samples/sec: 1661.81 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:02:17,993 epoch 6 - iter 3126/5212 - loss 0.16618079 - time (sec): 131.99 - samples/sec: 1661.11 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:02:40,063 epoch 6 - iter 3647/5212 - loss 0.16564198 - time (sec): 154.06 - samples/sec: 1666.52 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:03:02,499 epoch 6 - iter 4168/5212 - loss 0.16634156 - time (sec): 176.50 - samples/sec: 1667.11 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:03:24,979 epoch 6 - iter 4689/5212 - loss 0.16502704 - time (sec): 198.98 - samples/sec: 1667.40 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:03:46,903 epoch 6 - iter 5210/5212 - loss 0.16841155 - time (sec): 220.90 - samples/sec: 1662.80 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:03:46,983 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:46,983 EPOCH 6 done: loss 0.1684 - lr: 0.000022
2023-10-25 16:03:53,850 DEV : loss 0.20808175206184387 - f1-score (micro avg) 0.2423
2023-10-25 16:03:53,875 ----------------------------------------------------------------------------------------------------
2023-10-25 16:04:16,243 epoch 7 - iter 521/5212 - loss 0.16586561 - time (sec): 22.37 - samples/sec: 1558.29 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:04:38,816 epoch 7 - iter 1042/5212 - loss 0.14976359 - time (sec): 44.94 - samples/sec: 1600.71 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:05:00,525 epoch 7 - iter 1563/5212 - loss 0.14294447 - time (sec): 66.65 - samples/sec: 1584.38 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:05:22,503 epoch 7 - iter 2084/5212 - loss 0.14958504 - time (sec): 88.63 - samples/sec: 1605.49 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:05:44,352 epoch 7 - iter 2605/5212 - loss 0.15272797 - time (sec): 110.48 - samples/sec: 1622.73 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:06:07,447 epoch 7 - iter 3126/5212 - loss 0.15256052 - time (sec): 133.57 - samples/sec: 1647.28 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:06:29,677 epoch 7 - iter 3647/5212 - loss 0.15252498 - time (sec): 155.80 - samples/sec: 1675.50 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:06:51,711 epoch 7 - iter 4168/5212 - loss 0.15070024 - time (sec): 177.83 - samples/sec: 1676.15 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:07:13,611 epoch 7 - iter 4689/5212 - loss 0.15136587 - time (sec): 199.73 - samples/sec: 1674.35 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:07:35,780 epoch 7 - iter 5210/5212 - loss 0.15018887 - time (sec): 221.90 - samples/sec: 1654.74 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:07:35,873 ----------------------------------------------------------------------------------------------------
2023-10-25 16:07:35,873 EPOCH 7 done: loss 0.1502 - lr: 0.000017
2023-10-25 16:07:42,032 DEV : loss 0.2270050346851349 - f1-score (micro avg) 0.2385
2023-10-25 16:07:42,058 ----------------------------------------------------------------------------------------------------
2023-10-25 16:08:04,521 epoch 8 - iter 521/5212 - loss 0.14589267 - time (sec): 22.46 - samples/sec: 1749.06 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:08:26,536 epoch 8 - iter 1042/5212 - loss 0.12679147 - time (sec): 44.48 - samples/sec: 1648.21 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:08:49,455 epoch 8 - iter 1563/5212 - loss 0.13027312 - time (sec): 67.40 - samples/sec: 1629.21 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:09:11,507 epoch 8 - iter 2084/5212 - loss 0.13178230 - time (sec): 89.45 - samples/sec: 1632.23 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:09:33,603 epoch 8 - iter 2605/5212 - loss 0.13370436 - time (sec): 111.54 - samples/sec: 1649.83 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:09:55,760 epoch 8 - iter 3126/5212 - loss 0.13098280 - time (sec): 133.70 - samples/sec: 1673.75 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:10:17,942 epoch 8 - iter 3647/5212 - loss 0.13384808 - time (sec): 155.88 - samples/sec: 1659.85 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:10:40,208 epoch 8 - iter 4168/5212 - loss 0.13575297 - time (sec): 178.15 - samples/sec: 1686.01 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:11:01,988 epoch 8 - iter 4689/5212 - loss 0.13594078 - time (sec): 199.93 - samples/sec: 1671.82 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:11:23,871 epoch 8 - iter 5210/5212 - loss 0.13594037 - time (sec): 221.81 - samples/sec: 1656.28 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:11:23,947 ----------------------------------------------------------------------------------------------------
2023-10-25 16:11:23,947 EPOCH 8 done: loss 0.1359 - lr: 0.000011
2023-10-25 16:11:30,260 DEV : loss 0.2554771304130554 - f1-score (micro avg) 0.2288
2023-10-25 16:11:30,286 ----------------------------------------------------------------------------------------------------
2023-10-25 16:11:52,419 epoch 9 - iter 521/5212 - loss 0.11567060 - time (sec): 22.13 - samples/sec: 1548.27 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:12:14,779 epoch 9 - iter 1042/5212 - loss 0.12925166 - time (sec): 44.49 - samples/sec: 1633.21 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:12:36,602 epoch 9 - iter 1563/5212 - loss 0.12394185 - time (sec): 66.31 - samples/sec: 1646.06 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:12:58,308 epoch 9 - iter 2084/5212 - loss 0.12623309 - time (sec): 88.02 - samples/sec: 1642.85 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:13:21,115 epoch 9 - iter 2605/5212 - loss 0.12686418 - time (sec): 110.83 - samples/sec: 1621.52 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:13:43,274 epoch 9 - iter 3126/5212 - loss 0.12617219 - time (sec): 132.99 - samples/sec: 1632.06 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:14:06,024 epoch 9 - iter 3647/5212 - loss 0.12470505 - time (sec): 155.74 - samples/sec: 1630.26 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:14:28,476 epoch 9 - iter 4168/5212 - loss 0.12338268 - time (sec): 178.19 - samples/sec: 1647.55 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:14:50,431 epoch 9 - iter 4689/5212 - loss 0.12303248 - time (sec): 200.14 - samples/sec: 1655.20 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:15:13,282 epoch 9 - iter 5210/5212 - loss 0.12380050 - time (sec): 222.99 - samples/sec: 1647.36 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:15:13,380 ----------------------------------------------------------------------------------------------------
2023-10-25 16:15:13,380 EPOCH 9 done: loss 0.1238 - lr: 0.000006
2023-10-25 16:15:20,024 DEV : loss 0.25904580950737 - f1-score (micro avg) 0.2442
2023-10-25 16:15:20,051 ----------------------------------------------------------------------------------------------------
2023-10-25 16:15:42,601 epoch 10 - iter 521/5212 - loss 0.09950056 - time (sec): 22.55 - samples/sec: 1569.28 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:16:05,408 epoch 10 - iter 1042/5212 - loss 0.09962004 - time (sec): 45.36 - samples/sec: 1621.04 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:16:28,632 epoch 10 - iter 1563/5212 - loss 0.10130806 - time (sec): 68.58 - samples/sec: 1566.23 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:16:50,806 epoch 10 - iter 2084/5212 - loss 0.10676368 - time (sec): 90.75 - samples/sec: 1600.53 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:17:12,875 epoch 10 - iter 2605/5212 - loss 0.11022598 - time (sec): 112.82 - samples/sec: 1629.67 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:17:35,096 epoch 10 - iter 3126/5212 - loss 0.10822074 - time (sec): 135.04 - samples/sec: 1640.39 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:17:57,698 epoch 10 - iter 3647/5212 - loss 0.11044716 - time (sec): 157.65 - samples/sec: 1638.54 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:18:20,171 epoch 10 - iter 4168/5212 - loss 0.11138827 - time (sec): 180.12 - samples/sec: 1634.56 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:18:42,613 epoch 10 - iter 4689/5212 - loss 0.11209184 - time (sec): 202.56 - samples/sec: 1624.26 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:19:04,757 epoch 10 - iter 5210/5212 - loss 0.11119727 - time (sec): 224.70 - samples/sec: 1634.63 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:19:04,836 ----------------------------------------------------------------------------------------------------
2023-10-25 16:19:04,836 EPOCH 10 done: loss 0.1112 - lr: 0.000000
2023-10-25 16:19:11,706 DEV : loss 0.2723826766014099 - f1-score (micro avg) 0.232
2023-10-25 16:19:12,230 ----------------------------------------------------------------------------------------------------
2023-10-25 16:19:12,231 Loading model from best epoch ...
2023-10-25 16:19:13,972 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 16:19:23,755
Results:
- F-score (micro) 0.3408
- F-score (macro) 0.2134
- Accuracy 0.2086
By class:
              precision    recall  f1-score   support

         LOC     0.4309    0.4876    0.4575      1214
         PER     0.2575    0.2970    0.2759       808
         ORG     0.1042    0.1416    0.1200       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3166    0.3690    0.3408      2390
   macro avg     0.1981    0.2316    0.2134      2390
weighted avg     0.3213    0.3690    0.3434      2390
2023-10-25 16:19:23,755 ----------------------------------------------------------------------------------------------------
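The aggregate rows are internally consistent: the micro-average F1 is the harmonic mean of the micro precision and recall, and the macro-average F1 is the unweighted mean of the four per-class F1 scores. A quick sanity check in plain Python, with the values copied from the table above:

```python
def f1(precision, recall):
    """F1: harmonic mean of precision and recall (0.0 when both are zero)."""
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

micro_f1 = f1(0.3166, 0.3690)                       # micro avg row
macro_f1 = (0.4575 + 0.2759 + 0.1200 + 0.0000) / 4  # mean of per-class F1s

assert abs(micro_f1 - 0.3408) < 5e-4  # matches "F-score (micro) 0.3408"
assert abs(macro_f1 - 0.2134) < 5e-4  # matches "F-score (macro) 0.2134"
```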