2023-10-25 11:00:55,891 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:55,892 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 11:00:55,892 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:55,892 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 11:00:55,892 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:55,892 Train: 20847 sentences
2023-10-25 11:00:55,892 (train_with_dev=False, train_with_test=False)
2023-10-25 11:00:55,892 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:55,892 Training Params:
2023-10-25 11:00:55,892 - learning_rate: "5e-05"
2023-10-25 11:00:55,892 - mini_batch_size: "4"
2023-10-25 11:00:55,892 - max_epochs: "10"
2023-10-25 11:00:55,892 - shuffle: "True"
2023-10-25 11:00:55,892 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:55,892 Plugins:
2023-10-25 11:00:55,892 - TensorboardLogger
2023-10-25 11:00:55,892 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 11:00:55,892 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:55,892 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 11:00:55,892 - metric: "('micro avg', 'f1-score')"
2023-10-25 11:00:55,892 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:55,892 Computation:
2023-10-25 11:00:55,892 - compute on device: cuda:0
2023-10-25 11:00:55,892 - embedding storage: none
2023-10-25 11:00:55,893 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:55,893 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 11:00:55,893 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:55,893 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:55,893 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 11:01:18,136 epoch 1 - iter 521/5212 - loss 1.37149992 - time (sec): 22.24 - samples/sec: 1721.34 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:01:40,217 epoch 1 - iter 1042/5212 - loss 0.87935906 - time (sec): 44.32 - samples/sec: 1665.09 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:02:01,925 epoch 1 - iter 1563/5212 - loss 0.68503997 - time (sec): 66.03 - samples/sec: 1643.34 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:02:23,692 epoch 1 - iter 2084/5212 - loss 0.58285105 - time (sec): 87.80 - samples/sec: 1629.58 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:02:45,198 epoch 1 - iter 2605/5212 - loss 0.50956019 - time (sec): 109.30 - samples/sec: 1641.78 - lr: 0.000025 - momentum: 0.000000
2023-10-25 11:03:07,147 epoch 1 - iter 3126/5212 - loss 0.46284092 - time (sec): 131.25 - samples/sec: 1652.16 - lr: 0.000030 - momentum: 0.000000
2023-10-25 11:03:29,327 epoch 1 - iter 3647/5212 - loss 0.42868172 - time (sec): 153.43 - samples/sec: 1660.62 - lr: 0.000035 - momentum: 0.000000
2023-10-25 11:03:51,454 epoch 1 - iter 4168/5212 - loss 0.40148351 - time (sec): 175.56 - samples/sec: 1666.12 - lr: 0.000040 - momentum: 0.000000
2023-10-25 11:04:13,887 epoch 1 - iter 4689/5212 - loss 0.38468469 - time (sec): 197.99 - samples/sec: 1665.16 - lr: 0.000045 - momentum: 0.000000
2023-10-25 11:04:36,107 epoch 1 - iter 5210/5212 - loss 0.36578248 - time (sec): 220.21 - samples/sec: 1668.00 - lr: 0.000050 - momentum: 0.000000
2023-10-25 11:04:36,182 ----------------------------------------------------------------------------------------------------
2023-10-25 11:04:36,183 EPOCH 1 done: loss 0.3660 - lr: 0.000050
2023-10-25 11:04:39,923 DEV : loss 0.13538858294487 - f1-score (micro avg) 0.3034
2023-10-25 11:04:39,948 saving best model
2023-10-25 11:04:40,317 ----------------------------------------------------------------------------------------------------
2023-10-25 11:05:03,229 epoch 2 - iter 521/5212 - loss 0.20469904 - time (sec): 22.91 - samples/sec: 1536.72 - lr: 0.000049 - momentum: 0.000000
2023-10-25 11:05:25,136 epoch 2 - iter 1042/5212 - loss 0.23803288 - time (sec): 44.82 - samples/sec: 1661.73 - lr: 0.000049 - momentum: 0.000000
2023-10-25 11:05:47,474 epoch 2 - iter 1563/5212 - loss 0.23244269 - time (sec): 67.16 - samples/sec: 1670.90 - lr: 0.000048 - momentum: 0.000000
2023-10-25 11:06:09,399 epoch 2 - iter 2084/5212 - loss 0.22959571 - time (sec): 89.08 - samples/sec: 1658.83 - lr: 0.000048 - momentum: 0.000000
2023-10-25 11:06:31,093 epoch 2 - iter 2605/5212 - loss 0.22320080 - time (sec): 110.78 - samples/sec: 1645.05 - lr: 0.000047 - momentum: 0.000000
2023-10-25 11:06:53,231 epoch 2 - iter 3126/5212 - loss 0.22319310 - time (sec): 132.91 - samples/sec: 1642.13 - lr: 0.000047 - momentum: 0.000000
2023-10-25 11:07:15,062 epoch 2 - iter 3647/5212 - loss 0.21867469 - time (sec): 154.74 - samples/sec: 1640.78 - lr: 0.000046 - momentum: 0.000000
2023-10-25 11:07:37,150 epoch 2 - iter 4168/5212 - loss 0.21838122 - time (sec): 176.83 - samples/sec: 1642.21 - lr: 0.000046 - momentum: 0.000000
2023-10-25 11:07:59,564 epoch 2 - iter 4689/5212 - loss 0.22961072 - time (sec): 199.25 - samples/sec: 1652.99 - lr: 0.000045 - momentum: 0.000000
2023-10-25 11:08:21,823 epoch 2 - iter 5210/5212 - loss 0.22659583 - time (sec): 221.50 - samples/sec: 1658.66 - lr: 0.000044 - momentum: 0.000000
2023-10-25 11:08:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 11:08:21,903 EPOCH 2 done: loss 0.2266 - lr: 0.000044
2023-10-25 11:08:28,788 DEV : loss 0.14999593794345856 - f1-score (micro avg) 0.317
2023-10-25 11:08:28,813 saving best model
2023-10-25 11:08:29,311 ----------------------------------------------------------------------------------------------------
2023-10-25 11:08:51,082 epoch 3 - iter 521/5212 - loss 0.39731821 - time (sec): 21.77 - samples/sec: 1643.10 - lr: 0.000044 - momentum: 0.000000
2023-10-25 11:09:13,112 epoch 3 - iter 1042/5212 - loss 0.33525516 - time (sec): 43.80 - samples/sec: 1660.21 - lr: 0.000043 - momentum: 0.000000
2023-10-25 11:09:35,064 epoch 3 - iter 1563/5212 - loss 0.28290022 - time (sec): 65.75 - samples/sec: 1636.73 - lr: 0.000043 - momentum: 0.000000
2023-10-25 11:09:56,744 epoch 3 - iter 2084/5212 - loss 0.25099625 - time (sec): 87.43 - samples/sec: 1654.98 - lr: 0.000042 - momentum: 0.000000
2023-10-25 11:10:19,059 epoch 3 - iter 2605/5212 - loss 0.22758921 - time (sec): 109.74 - samples/sec: 1659.03 - lr: 0.000042 - momentum: 0.000000
2023-10-25 11:10:40,793 epoch 3 - iter 3126/5212 - loss 0.22506705 - time (sec): 131.48 - samples/sec: 1660.35 - lr: 0.000041 - momentum: 0.000000
2023-10-25 11:11:02,558 epoch 3 - iter 3647/5212 - loss 0.21926003 - time (sec): 153.24 - samples/sec: 1684.17 - lr: 0.000041 - momentum: 0.000000
2023-10-25 11:11:24,327 epoch 3 - iter 4168/5212 - loss 0.21800940 - time (sec): 175.01 - samples/sec: 1674.58 - lr: 0.000040 - momentum: 0.000000
2023-10-25 11:11:46,643 epoch 3 - iter 4689/5212 - loss 0.21324164 - time (sec): 197.33 - samples/sec: 1684.30 - lr: 0.000039 - momentum: 0.000000
2023-10-25 11:12:08,682 epoch 3 - iter 5210/5212 - loss 0.21112728 - time (sec): 219.37 - samples/sec: 1674.21 - lr: 0.000039 - momentum: 0.000000
2023-10-25 11:12:08,759 ----------------------------------------------------------------------------------------------------
2023-10-25 11:12:08,759 EPOCH 3 done: loss 0.2111 - lr: 0.000039
2023-10-25 11:12:15,629 DEV : loss 0.1838080883026123 - f1-score (micro avg) 0.2752
2023-10-25 11:12:15,654 ----------------------------------------------------------------------------------------------------
2023-10-25 11:12:37,555 epoch 4 - iter 521/5212 - loss 0.14551985 - time (sec): 21.90 - samples/sec: 1637.84 - lr: 0.000038 - momentum: 0.000000
2023-10-25 11:12:59,454 epoch 4 - iter 1042/5212 - loss 0.14054748 - time (sec): 43.80 - samples/sec: 1676.75 - lr: 0.000038 - momentum: 0.000000
2023-10-25 11:13:21,366 epoch 4 - iter 1563/5212 - loss 0.14476054 - time (sec): 65.71 - samples/sec: 1676.82 - lr: 0.000037 - momentum: 0.000000
2023-10-25 11:13:43,227 epoch 4 - iter 2084/5212 - loss 0.14805005 - time (sec): 87.57 - samples/sec: 1685.25 - lr: 0.000037 - momentum: 0.000000
2023-10-25 11:14:04,835 epoch 4 - iter 2605/5212 - loss 0.15829476 - time (sec): 109.18 - samples/sec: 1676.45 - lr: 0.000036 - momentum: 0.000000
2023-10-25 11:14:26,663 epoch 4 - iter 3126/5212 - loss 0.15472574 - time (sec): 131.01 - samples/sec: 1686.89 - lr: 0.000036 - momentum: 0.000000
2023-10-25 11:14:48,463 epoch 4 - iter 3647/5212 - loss 0.14772175 - time (sec): 152.81 - samples/sec: 1688.63 - lr: 0.000035 - momentum: 0.000000
2023-10-25 11:15:10,375 epoch 4 - iter 4168/5212 - loss 0.14632759 - time (sec): 174.72 - samples/sec: 1676.75 - lr: 0.000034 - momentum: 0.000000
2023-10-25 11:15:32,499 epoch 4 - iter 4689/5212 - loss 0.14404764 - time (sec): 196.84 - samples/sec: 1675.46 - lr: 0.000034 - momentum: 0.000000
2023-10-25 11:15:54,778 epoch 4 - iter 5210/5212 - loss 0.14268477 - time (sec): 219.12 - samples/sec: 1676.29 - lr: 0.000033 - momentum: 0.000000
2023-10-25 11:15:54,863 ----------------------------------------------------------------------------------------------------
2023-10-25 11:15:54,863 EPOCH 4 done: loss 0.1427 - lr: 0.000033
2023-10-25 11:16:01,737 DEV : loss 0.22028455138206482 - f1-score (micro avg) 0.3096
2023-10-25 11:16:01,763 ----------------------------------------------------------------------------------------------------
2023-10-25 11:16:23,704 epoch 5 - iter 521/5212 - loss 0.10320092 - time (sec): 21.94 - samples/sec: 1709.74 - lr: 0.000033 - momentum: 0.000000
2023-10-25 11:16:45,397 epoch 5 - iter 1042/5212 - loss 0.10355668 - time (sec): 43.63 - samples/sec: 1662.39 - lr: 0.000032 - momentum: 0.000000
2023-10-25 11:17:07,635 epoch 5 - iter 1563/5212 - loss 0.10540687 - time (sec): 65.87 - samples/sec: 1685.07 - lr: 0.000032 - momentum: 0.000000
2023-10-25 11:17:29,500 epoch 5 - iter 2084/5212 - loss 0.10136807 - time (sec): 87.74 - samples/sec: 1702.49 - lr: 0.000031 - momentum: 0.000000
2023-10-25 11:17:51,526 epoch 5 - iter 2605/5212 - loss 0.10353459 - time (sec): 109.76 - samples/sec: 1712.76 - lr: 0.000031 - momentum: 0.000000
2023-10-25 11:18:13,450 epoch 5 - iter 3126/5212 - loss 0.10253579 - time (sec): 131.69 - samples/sec: 1700.34 - lr: 0.000030 - momentum: 0.000000
2023-10-25 11:18:35,496 epoch 5 - iter 3647/5212 - loss 0.10477648 - time (sec): 153.73 - samples/sec: 1688.45 - lr: 0.000029 - momentum: 0.000000
2023-10-25 11:18:57,342 epoch 5 - iter 4168/5212 - loss 0.10361175 - time (sec): 175.58 - samples/sec: 1697.83 - lr: 0.000029 - momentum: 0.000000
2023-10-25 11:19:19,350 epoch 5 - iter 4689/5212 - loss 0.10359200 - time (sec): 197.59 - samples/sec: 1680.29 - lr: 0.000028 - momentum: 0.000000
2023-10-25 11:19:41,597 epoch 5 - iter 5210/5212 - loss 0.10503616 - time (sec): 219.83 - samples/sec: 1671.08 - lr: 0.000028 - momentum: 0.000000
2023-10-25 11:19:41,677 ----------------------------------------------------------------------------------------------------
2023-10-25 11:19:41,677 EPOCH 5 done: loss 0.1050 - lr: 0.000028
2023-10-25 11:19:47,864 DEV : loss 0.22237569093704224 - f1-score (micro avg) 0.3813
2023-10-25 11:19:47,891 saving best model
2023-10-25 11:19:48,294 ----------------------------------------------------------------------------------------------------
2023-10-25 11:20:10,627 epoch 6 - iter 521/5212 - loss 0.08012409 - time (sec): 22.33 - samples/sec: 1606.65 - lr: 0.000027 - momentum: 0.000000
2023-10-25 11:20:33,460 epoch 6 - iter 1042/5212 - loss 0.07475817 - time (sec): 45.16 - samples/sec: 1652.26 - lr: 0.000027 - momentum: 0.000000
2023-10-25 11:20:54,638 epoch 6 - iter 1563/5212 - loss 0.07771787 - time (sec): 66.34 - samples/sec: 1656.96 - lr: 0.000026 - momentum: 0.000000
2023-10-25 11:21:16,836 epoch 6 - iter 2084/5212 - loss 0.07455376 - time (sec): 88.54 - samples/sec: 1675.11 - lr: 0.000026 - momentum: 0.000000
2023-10-25 11:21:38,932 epoch 6 - iter 2605/5212 - loss 0.07609439 - time (sec): 110.64 - samples/sec: 1681.84 - lr: 0.000025 - momentum: 0.000000
2023-10-25 11:22:01,190 epoch 6 - iter 3126/5212 - loss 0.07530603 - time (sec): 132.89 - samples/sec: 1673.73 - lr: 0.000024 - momentum: 0.000000
2023-10-25 11:22:23,343 epoch 6 - iter 3647/5212 - loss 0.07426848 - time (sec): 155.05 - samples/sec: 1691.85 - lr: 0.000024 - momentum: 0.000000
2023-10-25 11:22:45,050 epoch 6 - iter 4168/5212 - loss 0.07447383 - time (sec): 176.75 - samples/sec: 1694.20 - lr: 0.000023 - momentum: 0.000000
2023-10-25 11:23:06,759 epoch 6 - iter 4689/5212 - loss 0.07623639 - time (sec): 198.46 - samples/sec: 1675.58 - lr: 0.000023 - momentum: 0.000000
2023-10-25 11:23:28,556 epoch 6 - iter 5210/5212 - loss 0.07631310 - time (sec): 220.26 - samples/sec: 1667.59 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:23:28,640 ----------------------------------------------------------------------------------------------------
2023-10-25 11:23:28,640 EPOCH 6 done: loss 0.0763 - lr: 0.000022
2023-10-25 11:23:34,855 DEV : loss 0.32064488530158997 - f1-score (micro avg) 0.3633
2023-10-25 11:23:34,881 ----------------------------------------------------------------------------------------------------
2023-10-25 11:23:56,935 epoch 7 - iter 521/5212 - loss 0.05039789 - time (sec): 22.05 - samples/sec: 1871.83 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:24:19,042 epoch 7 - iter 1042/5212 - loss 0.05944987 - time (sec): 44.16 - samples/sec: 1698.53 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:24:40,988 epoch 7 - iter 1563/5212 - loss 0.06349842 - time (sec): 66.11 - samples/sec: 1651.93 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:25:03,426 epoch 7 - iter 2084/5212 - loss 0.06688293 - time (sec): 88.54 - samples/sec: 1648.17 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:25:24,892 epoch 7 - iter 2605/5212 - loss 0.06760611 - time (sec): 110.01 - samples/sec: 1677.77 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:25:46,879 epoch 7 - iter 3126/5212 - loss 0.06775305 - time (sec): 132.00 - samples/sec: 1672.32 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:26:08,888 epoch 7 - iter 3647/5212 - loss 0.06912261 - time (sec): 154.01 - samples/sec: 1664.05 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:26:31,107 epoch 7 - iter 4168/5212 - loss 0.06908616 - time (sec): 176.22 - samples/sec: 1655.61 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:26:53,084 epoch 7 - iter 4689/5212 - loss 0.07148833 - time (sec): 198.20 - samples/sec: 1663.87 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:27:15,291 epoch 7 - iter 5210/5212 - loss 0.06952026 - time (sec): 220.41 - samples/sec: 1663.04 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:27:15,436 ----------------------------------------------------------------------------------------------------
2023-10-25 11:27:15,437 EPOCH 7 done: loss 0.0694 - lr: 0.000017
2023-10-25 11:27:21,672 DEV : loss 0.3756372332572937 - f1-score (micro avg) 0.3489
2023-10-25 11:27:21,697 ----------------------------------------------------------------------------------------------------
2023-10-25 11:27:43,233 epoch 8 - iter 521/5212 - loss 0.05879330 - time (sec): 21.53 - samples/sec: 1859.19 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:28:04,959 epoch 8 - iter 1042/5212 - loss 0.05539014 - time (sec): 43.26 - samples/sec: 1815.53 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:28:27,149 epoch 8 - iter 1563/5212 - loss 0.05646244 - time (sec): 65.45 - samples/sec: 1747.62 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:28:49,551 epoch 8 - iter 2084/5212 - loss 0.05287024 - time (sec): 87.85 - samples/sec: 1742.21 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:29:11,741 epoch 8 - iter 2605/5212 - loss 0.05325825 - time (sec): 110.04 - samples/sec: 1728.37 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:29:33,499 epoch 8 - iter 3126/5212 - loss 0.05326024 - time (sec): 131.80 - samples/sec: 1708.19 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:29:55,509 epoch 8 - iter 3647/5212 - loss 0.05476219 - time (sec): 153.81 - samples/sec: 1685.49 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:30:18,302 epoch 8 - iter 4168/5212 - loss 0.05479079 - time (sec): 176.60 - samples/sec: 1679.91 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:30:40,251 epoch 8 - iter 4689/5212 - loss 0.05530734 - time (sec): 198.55 - samples/sec: 1675.48 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:31:02,189 epoch 8 - iter 5210/5212 - loss 0.05608379 - time (sec): 220.49 - samples/sec: 1666.00 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:31:02,275 ----------------------------------------------------------------------------------------------------
2023-10-25 11:31:02,275 EPOCH 8 done: loss 0.0561 - lr: 0.000011
2023-10-25 11:31:08,474 DEV : loss 0.3516038954257965 - f1-score (micro avg) 0.3603
2023-10-25 11:31:08,500 ----------------------------------------------------------------------------------------------------
2023-10-25 11:31:30,381 epoch 9 - iter 521/5212 - loss 0.03414719 - time (sec): 21.88 - samples/sec: 1677.12 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:31:51,575 epoch 9 - iter 1042/5212 - loss 0.03394267 - time (sec): 43.07 - samples/sec: 1687.17 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:32:12,995 epoch 9 - iter 1563/5212 - loss 0.03753554 - time (sec): 64.49 - samples/sec: 1713.27 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:32:34,589 epoch 9 - iter 2084/5212 - loss 0.03899856 - time (sec): 86.09 - samples/sec: 1732.76 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:32:56,245 epoch 9 - iter 2605/5212 - loss 0.03908002 - time (sec): 107.74 - samples/sec: 1716.66 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:33:19,017 epoch 9 - iter 3126/5212 - loss 0.03898247 - time (sec): 130.52 - samples/sec: 1697.62 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:33:40,952 epoch 9 - iter 3647/5212 - loss 0.03853540 - time (sec): 152.45 - samples/sec: 1688.78 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:34:02,644 epoch 9 - iter 4168/5212 - loss 0.03902025 - time (sec): 174.14 - samples/sec: 1686.38 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:34:24,348 epoch 9 - iter 4689/5212 - loss 0.03912183 - time (sec): 195.85 - samples/sec: 1685.83 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:34:46,370 epoch 9 - iter 5210/5212 - loss 0.03907876 - time (sec): 217.87 - samples/sec: 1686.24 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:34:46,460 ----------------------------------------------------------------------------------------------------
2023-10-25 11:34:46,460 EPOCH 9 done: loss 0.0391 - lr: 0.000006
2023-10-25 11:34:53,392 DEV : loss 0.405927836894989 - f1-score (micro avg) 0.3634
2023-10-25 11:34:53,418 ----------------------------------------------------------------------------------------------------
2023-10-25 11:35:14,984 epoch 10 - iter 521/5212 - loss 0.03118015 - time (sec): 21.56 - samples/sec: 1775.07 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:35:37,344 epoch 10 - iter 1042/5212 - loss 0.02868530 - time (sec): 43.93 - samples/sec: 1717.36 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:35:59,555 epoch 10 - iter 1563/5212 - loss 0.02897899 - time (sec): 66.14 - samples/sec: 1698.08 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:36:21,701 epoch 10 - iter 2084/5212 - loss 0.02993813 - time (sec): 88.28 - samples/sec: 1686.02 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:36:43,697 epoch 10 - iter 2605/5212 - loss 0.03083416 - time (sec): 110.28 - samples/sec: 1686.24 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:37:05,594 epoch 10 - iter 3126/5212 - loss 0.03010510 - time (sec): 132.18 - samples/sec: 1688.94 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:37:27,487 epoch 10 - iter 3647/5212 - loss 0.02911948 - time (sec): 154.07 - samples/sec: 1675.51 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:37:49,663 epoch 10 - iter 4168/5212 - loss 0.02885642 - time (sec): 176.24 - samples/sec: 1661.67 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:38:11,733 epoch 10 - iter 4689/5212 - loss 0.02769243 - time (sec): 198.31 - samples/sec: 1660.60 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:38:33,789 epoch 10 - iter 5210/5212 - loss 0.02725114 - time (sec): 220.37 - samples/sec: 1667.00 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:38:33,876 ----------------------------------------------------------------------------------------------------
2023-10-25 11:38:33,876 EPOCH 10 done: loss 0.0272 - lr: 0.000000
2023-10-25 11:38:40,758 DEV : loss 0.41593435406684875 - f1-score (micro avg) 0.368
2023-10-25 11:38:41,135 ----------------------------------------------------------------------------------------------------
2023-10-25 11:38:41,136 Loading model from best epoch ...
2023-10-25 11:38:42,646 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 11:38:52,346
Results:
- F-score (micro) 0.3457
- F-score (macro) 0.225
- Accuracy 0.211
By class:
              precision    recall  f1-score   support

         LOC     0.5061    0.3443    0.4098      1214
         PER     0.3730    0.3106    0.3390       808
         ORG     0.1860    0.1275    0.1513       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4101    0.2987    0.3457      2390
   macro avg     0.2662    0.1956    0.2250      2390
weighted avg     0.4106    0.2987    0.3451      2390
2023-10-25 11:38:52,346 ----------------------------------------------------------------------------------------------------
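
The summary scores in the final report can be rechecked from the per-class rows. The following is a minimal sketch, not part of the original training run: it assumes the standard definitions (F1 as the harmonic mean of precision and recall, macro F1 as the unweighted mean of per-class F1, weighted F1 as the support-weighted mean), and the helper name `f1` is chosen here for illustration. Because the inputs are the four-decimal values printed in the log, the recomputed scores can differ from the logged ones in the last decimal place.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, support) copied from the "By class" table in the log.
per_class = {
    "LOC": (0.5061, 0.3443, 1214),
    "PER": (0.3730, 0.3106, 808),
    "ORG": (0.1860, 0.1275, 353),
    "HumanProd": (0.0000, 0.0000, 15),
}

# Per-class F1 recomputed from the rounded precision and recall.
class_f1 = {label: f1(p, r) for label, (p, r, _) in per_class.items()}

# Macro average: every class counts equally, so the empty HumanProd
# class (F1 = 0) pulls the score down as much as LOC lifts it.
macro_f1 = sum(class_f1.values()) / len(class_f1)

# Weighted average: each class counts in proportion to its support.
total_support = sum(support for _, _, support in per_class.values())
weighted_f1 = sum(
    class_f1[label] * support for label, (_, _, support) in per_class.items()
) / total_support

# Micro average: F1 of the pooled precision and recall ("micro avg" row).
micro_f1 = f1(0.4101, 0.2987)

for name, value in [("micro", micro_f1), ("macro", macro_f1), ("weighted", weighted_f1)]:
    print(f"F1 ({name}): {value:.4f}")
```

The gap between the micro and macro scores comes mostly from the HumanProd class: with only 15 test entities and no correct predictions, it contributes an F1 of zero to the macro mean while barely affecting the micro score.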