Uploaded ./training.log with huggingface_hub (stefan-it, commit f4e3d0d)
2023-10-25 19:56:58,111 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
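The architecture dump above fully determines the model size. As a sanity check, here is a rough parameter tally computed from the printed dimensions (a sketch in plain Python; the grouping follows the module names in the dump, the totals are derived here, not taken from the log):

```python
# Rough parameter tally (weights + biases) for the SequenceTagger printed above.
V, H, P, T, L, FF, TAGS = 64001, 768, 512, 2, 12, 3072, 17

embeddings = V * H + P * H + T * H + 2 * H          # word/pos/type embeddings + LayerNorm
per_layer = (
    3 * (H * H + H)      # query/key/value projections
    + (H * H + H)        # attention output dense
    + 2 * H              # attention LayerNorm
    + (H * FF + FF)      # intermediate dense
    + (FF * H + H)       # output dense
    + 2 * H              # output LayerNorm
)
pooler = H * H + H
tagger_head = H * TAGS + TAGS                       # final linear over the 17 tags

total = embeddings + L * per_layer + pooler + tagger_head
print(f"{total:,}")  # roughly 135M parameters
```

Note that the 64k vocabulary alone accounts for about 49M of those parameters, which is the main difference from a standard 30k-vocab BERT-base.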
2023-10-25 19:56:58,112 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 Train: 20847 sentences
2023-10-25 19:56:58,112 (train_with_dev=False, train_with_test=False)
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 Training Params:
2023-10-25 19:56:58,112 - learning_rate: "5e-05"
2023-10-25 19:56:58,112 - mini_batch_size: "4"
2023-10-25 19:56:58,112 - max_epochs: "10"
2023-10-25 19:56:58,112 - shuffle: "True"
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
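The iteration counts in the log follow directly from these parameters: 20847 training sentences at mini_batch_size 4 give ceil(20847/4) batches per epoch, which is exactly the "iter …/5212" seen below. A quick check (plain Python, numbers taken from the log):

```python
import math

train_sentences = 20847
mini_batch_size = 4
max_epochs = 10

steps_per_epoch = math.ceil(train_sentences / mini_batch_size)
total_steps = steps_per_epoch * max_epochs
print(steps_per_epoch, total_steps)  # 5212 52120
```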
2023-10-25 19:56:58,112 Plugins:
2023-10-25 19:56:58,112 - TensorboardLogger
2023-10-25 19:56:58,112 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
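The lr column in the per-iteration lines below traces the LinearScheduler plugin: linear warmup from 0 to the peak 5e-05 over the first warmup_fraction (0.1) of all steps, i.e. exactly the first epoch here, then linear decay to 0. A minimal re-derivation (a sketch; lr_at is a hypothetical helper written for this check, not Flair API):

```python
peak_lr = 5e-05
total_steps = 52120                     # 5212 batches/epoch * 10 epochs
warmup_steps = int(0.1 * total_steps)   # = 5212, the whole first epoch

def lr_at(step: int) -> float:
    """Linear warmup to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Reproduces the logged values: ~5e-06 at iter 521 of epoch 1,
# the peak 5e-05 at the end of epoch 1, and ~0 at the end of training.
print(lr_at(521), lr_at(5212), lr_at(52120))
```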
2023-10-25 19:56:58,112 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 19:56:58,112 - metric: "('micro avg', 'f1-score')"
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 Computation:
2023-10-25 19:56:58,112 - compute on device: cuda:0
2023-10-25 19:56:58,112 - embedding storage: none
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,112 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 19:56:58,112 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,113 ----------------------------------------------------------------------------------------------------
2023-10-25 19:56:58,113 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 19:57:20,386 epoch 1 - iter 521/5212 - loss 1.08348570 - time (sec): 22.27 - samples/sec: 1626.00 - lr: 0.000005 - momentum: 0.000000
2023-10-25 19:57:42,372 epoch 1 - iter 1042/5212 - loss 0.70758840 - time (sec): 44.26 - samples/sec: 1663.56 - lr: 0.000010 - momentum: 0.000000
2023-10-25 19:58:04,781 epoch 1 - iter 1563/5212 - loss 0.55092863 - time (sec): 66.67 - samples/sec: 1681.34 - lr: 0.000015 - momentum: 0.000000
2023-10-25 19:58:26,646 epoch 1 - iter 2084/5212 - loss 0.47203657 - time (sec): 88.53 - samples/sec: 1679.64 - lr: 0.000020 - momentum: 0.000000
2023-10-25 19:58:48,297 epoch 1 - iter 2605/5212 - loss 0.42902220 - time (sec): 110.18 - samples/sec: 1665.66 - lr: 0.000025 - momentum: 0.000000
2023-10-25 19:59:10,349 epoch 1 - iter 3126/5212 - loss 0.39940587 - time (sec): 132.24 - samples/sec: 1653.17 - lr: 0.000030 - momentum: 0.000000
2023-10-25 19:59:32,713 epoch 1 - iter 3647/5212 - loss 0.37156643 - time (sec): 154.60 - samples/sec: 1673.93 - lr: 0.000035 - momentum: 0.000000
2023-10-25 19:59:54,575 epoch 1 - iter 4168/5212 - loss 0.34705201 - time (sec): 176.46 - samples/sec: 1679.91 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:00:16,906 epoch 1 - iter 4689/5212 - loss 0.33147071 - time (sec): 198.79 - samples/sec: 1670.29 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:00:39,372 epoch 1 - iter 5210/5212 - loss 0.32226811 - time (sec): 221.26 - samples/sec: 1659.98 - lr: 0.000050 - momentum: 0.000000
2023-10-25 20:00:39,451 ----------------------------------------------------------------------------------------------------
2023-10-25 20:00:39,451 EPOCH 1 done: loss 0.3222 - lr: 0.000050
2023-10-25 20:00:43,129 DEV : loss 0.11420618742704391 - f1-score (micro avg) 0.2931
2023-10-25 20:00:43,155 saving best model
2023-10-25 20:00:43,631 ----------------------------------------------------------------------------------------------------
2023-10-25 20:01:05,416 epoch 2 - iter 521/5212 - loss 0.18552954 - time (sec): 21.78 - samples/sec: 1681.53 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:01:27,647 epoch 2 - iter 1042/5212 - loss 0.17954476 - time (sec): 44.01 - samples/sec: 1655.99 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:01:49,727 epoch 2 - iter 1563/5212 - loss 0.18022652 - time (sec): 66.09 - samples/sec: 1672.98 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:02:11,870 epoch 2 - iter 2084/5212 - loss 0.18252575 - time (sec): 88.24 - samples/sec: 1672.56 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:02:34,162 epoch 2 - iter 2605/5212 - loss 0.18537995 - time (sec): 110.53 - samples/sec: 1660.49 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:02:56,495 epoch 2 - iter 3126/5212 - loss 0.18574013 - time (sec): 132.86 - samples/sec: 1670.44 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:03:18,676 epoch 2 - iter 3647/5212 - loss 0.21371016 - time (sec): 155.04 - samples/sec: 1663.40 - lr: 0.000046 - momentum: 0.000000
2023-10-25 20:03:40,512 epoch 2 - iter 4168/5212 - loss 0.22735133 - time (sec): 176.88 - samples/sec: 1654.56 - lr: 0.000046 - momentum: 0.000000
2023-10-25 20:04:02,042 epoch 2 - iter 4689/5212 - loss 0.26232838 - time (sec): 198.41 - samples/sec: 1658.91 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:04:23,713 epoch 2 - iter 5210/5212 - loss 0.29428995 - time (sec): 220.08 - samples/sec: 1669.05 - lr: 0.000044 - momentum: 0.000000
2023-10-25 20:04:23,796 ----------------------------------------------------------------------------------------------------
2023-10-25 20:04:23,796 EPOCH 2 done: loss 0.2942 - lr: 0.000044
2023-10-25 20:04:30,604 DEV : loss 0.21574755012989044 - f1-score (micro avg) 0.0
2023-10-25 20:04:30,629 ----------------------------------------------------------------------------------------------------
2023-10-25 20:04:52,387 epoch 3 - iter 521/5212 - loss 0.61023887 - time (sec): 21.76 - samples/sec: 1676.40 - lr: 0.000044 - momentum: 0.000000
2023-10-25 20:05:14,600 epoch 3 - iter 1042/5212 - loss 0.56119949 - time (sec): 43.97 - samples/sec: 1730.25 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:05:36,372 epoch 3 - iter 1563/5212 - loss 0.56421056 - time (sec): 65.74 - samples/sec: 1717.45 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:05:58,622 epoch 3 - iter 2084/5212 - loss 0.57799536 - time (sec): 87.99 - samples/sec: 1690.37 - lr: 0.000042 - momentum: 0.000000
2023-10-25 20:06:20,339 epoch 3 - iter 2605/5212 - loss 0.57725572 - time (sec): 109.71 - samples/sec: 1690.27 - lr: 0.000042 - momentum: 0.000000
2023-10-25 20:06:42,574 epoch 3 - iter 3126/5212 - loss 0.57924881 - time (sec): 131.94 - samples/sec: 1666.46 - lr: 0.000041 - momentum: 0.000000
2023-10-25 20:07:04,690 epoch 3 - iter 3647/5212 - loss 0.58468851 - time (sec): 154.06 - samples/sec: 1662.21 - lr: 0.000041 - momentum: 0.000000
2023-10-25 20:07:27,168 epoch 3 - iter 4168/5212 - loss 0.58359992 - time (sec): 176.54 - samples/sec: 1672.35 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:07:48,934 epoch 3 - iter 4689/5212 - loss 0.58385727 - time (sec): 198.30 - samples/sec: 1661.25 - lr: 0.000039 - momentum: 0.000000
2023-10-25 20:08:10,931 epoch 3 - iter 5210/5212 - loss 0.57967064 - time (sec): 220.30 - samples/sec: 1667.45 - lr: 0.000039 - momentum: 0.000000
2023-10-25 20:08:11,008 ----------------------------------------------------------------------------------------------------
2023-10-25 20:08:11,008 EPOCH 3 done: loss 0.5798 - lr: 0.000039
2023-10-25 20:08:17,809 DEV : loss 0.22797180712223053 - f1-score (micro avg) 0.0
2023-10-25 20:08:17,834 ----------------------------------------------------------------------------------------------------
2023-10-25 20:08:39,704 epoch 4 - iter 521/5212 - loss 0.53509142 - time (sec): 21.87 - samples/sec: 1670.95 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:09:01,224 epoch 4 - iter 1042/5212 - loss 0.57089598 - time (sec): 43.39 - samples/sec: 1769.02 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:09:23,217 epoch 4 - iter 1563/5212 - loss 0.56098897 - time (sec): 65.38 - samples/sec: 1736.26 - lr: 0.000037 - momentum: 0.000000
2023-10-25 20:09:45,468 epoch 4 - iter 2084/5212 - loss 0.56792280 - time (sec): 87.63 - samples/sec: 1733.55 - lr: 0.000037 - momentum: 0.000000
2023-10-25 20:10:07,654 epoch 4 - iter 2605/5212 - loss 0.55856576 - time (sec): 109.82 - samples/sec: 1732.46 - lr: 0.000036 - momentum: 0.000000
2023-10-25 20:10:29,680 epoch 4 - iter 3126/5212 - loss 0.55882546 - time (sec): 131.84 - samples/sec: 1704.50 - lr: 0.000036 - momentum: 0.000000
2023-10-25 20:10:51,853 epoch 4 - iter 3647/5212 - loss 0.56106127 - time (sec): 154.02 - samples/sec: 1706.01 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:11:13,676 epoch 4 - iter 4168/5212 - loss 0.55614039 - time (sec): 175.84 - samples/sec: 1698.08 - lr: 0.000034 - momentum: 0.000000
2023-10-25 20:11:35,559 epoch 4 - iter 4689/5212 - loss 0.55568875 - time (sec): 197.72 - samples/sec: 1687.66 - lr: 0.000034 - momentum: 0.000000
2023-10-25 20:11:57,314 epoch 4 - iter 5210/5212 - loss 0.56143141 - time (sec): 219.48 - samples/sec: 1673.89 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:11:57,394 ----------------------------------------------------------------------------------------------------
2023-10-25 20:11:57,394 EPOCH 4 done: loss 0.5614 - lr: 0.000033
2023-10-25 20:12:04,213 DEV : loss 0.22103846073150635 - f1-score (micro avg) 0.0
2023-10-25 20:12:04,238 ----------------------------------------------------------------------------------------------------
2023-10-25 20:12:26,199 epoch 5 - iter 521/5212 - loss 0.59690938 - time (sec): 21.96 - samples/sec: 1651.61 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:12:47,893 epoch 5 - iter 1042/5212 - loss 0.57083926 - time (sec): 43.65 - samples/sec: 1634.39 - lr: 0.000032 - momentum: 0.000000
2023-10-25 20:13:09,967 epoch 5 - iter 1563/5212 - loss 0.55181100 - time (sec): 65.73 - samples/sec: 1668.56 - lr: 0.000032 - momentum: 0.000000
2023-10-25 20:13:32,378 epoch 5 - iter 2084/5212 - loss 0.55381084 - time (sec): 88.14 - samples/sec: 1674.65 - lr: 0.000031 - momentum: 0.000000
2023-10-25 20:13:54,700 epoch 5 - iter 2605/5212 - loss 0.55370422 - time (sec): 110.46 - samples/sec: 1669.48 - lr: 0.000031 - momentum: 0.000000
2023-10-25 20:14:16,931 epoch 5 - iter 3126/5212 - loss 0.55580912 - time (sec): 132.69 - samples/sec: 1670.75 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:14:39,010 epoch 5 - iter 3647/5212 - loss 0.55361204 - time (sec): 154.77 - samples/sec: 1648.30 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:15:01,019 epoch 5 - iter 4168/5212 - loss 0.55662199 - time (sec): 176.78 - samples/sec: 1657.09 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:15:23,088 epoch 5 - iter 4689/5212 - loss 0.54917164 - time (sec): 198.85 - samples/sec: 1654.63 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:15:45,019 epoch 5 - iter 5210/5212 - loss 0.54696535 - time (sec): 220.78 - samples/sec: 1664.02 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:15:45,098 ----------------------------------------------------------------------------------------------------
2023-10-25 20:15:45,098 EPOCH 5 done: loss 0.5469 - lr: 0.000028
2023-10-25 20:15:51,857 DEV : loss 0.2455786168575287 - f1-score (micro avg) 0.0
2023-10-25 20:15:51,883 ----------------------------------------------------------------------------------------------------
2023-10-25 20:16:14,379 epoch 6 - iter 521/5212 - loss 0.57254029 - time (sec): 22.49 - samples/sec: 1691.82 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:16:36,220 epoch 6 - iter 1042/5212 - loss 0.52755461 - time (sec): 44.34 - samples/sec: 1655.77 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:16:58,010 epoch 6 - iter 1563/5212 - loss 0.53602606 - time (sec): 66.13 - samples/sec: 1643.95 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:17:20,136 epoch 6 - iter 2084/5212 - loss 0.53959820 - time (sec): 88.25 - samples/sec: 1666.58 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:17:41,835 epoch 6 - iter 2605/5212 - loss 0.54382641 - time (sec): 109.95 - samples/sec: 1658.66 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:18:04,025 epoch 6 - iter 3126/5212 - loss 0.54769962 - time (sec): 132.14 - samples/sec: 1669.15 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:18:25,736 epoch 6 - iter 3647/5212 - loss 0.54874941 - time (sec): 153.85 - samples/sec: 1658.88 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:18:47,563 epoch 6 - iter 4168/5212 - loss 0.54164387 - time (sec): 175.68 - samples/sec: 1663.19 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:19:09,794 epoch 6 - iter 4689/5212 - loss 0.53460429 - time (sec): 197.91 - samples/sec: 1668.66 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:19:32,171 epoch 6 - iter 5210/5212 - loss 0.53273458 - time (sec): 220.29 - samples/sec: 1667.76 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:19:32,256 ----------------------------------------------------------------------------------------------------
2023-10-25 20:19:32,256 EPOCH 6 done: loss 0.5327 - lr: 0.000022
2023-10-25 20:19:39,076 DEV : loss 0.24423913657665253 - f1-score (micro avg) 0.0
2023-10-25 20:19:39,103 ----------------------------------------------------------------------------------------------------
2023-10-25 20:20:01,010 epoch 7 - iter 521/5212 - loss 0.52293661 - time (sec): 21.91 - samples/sec: 1654.87 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:20:22,981 epoch 7 - iter 1042/5212 - loss 0.55864614 - time (sec): 43.88 - samples/sec: 1685.68 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:20:44,977 epoch 7 - iter 1563/5212 - loss 0.54801286 - time (sec): 65.87 - samples/sec: 1687.51 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:21:07,465 epoch 7 - iter 2084/5212 - loss 0.52281712 - time (sec): 88.36 - samples/sec: 1697.53 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:21:29,412 epoch 7 - iter 2605/5212 - loss 0.52541212 - time (sec): 110.31 - samples/sec: 1686.47 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:21:50,941 epoch 7 - iter 3126/5212 - loss 0.52460589 - time (sec): 131.84 - samples/sec: 1702.56 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:22:12,842 epoch 7 - iter 3647/5212 - loss 0.52696389 - time (sec): 153.74 - samples/sec: 1706.10 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:22:34,666 epoch 7 - iter 4168/5212 - loss 0.52978826 - time (sec): 175.56 - samples/sec: 1703.61 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:22:56,241 epoch 7 - iter 4689/5212 - loss 0.53279906 - time (sec): 197.14 - samples/sec: 1687.62 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:23:18,267 epoch 7 - iter 5210/5212 - loss 0.53320841 - time (sec): 219.16 - samples/sec: 1676.23 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:23:18,343 ----------------------------------------------------------------------------------------------------
2023-10-25 20:23:18,343 EPOCH 7 done: loss 0.5332 - lr: 0.000017
2023-10-25 20:23:24,491 DEV : loss 0.24582862854003906 - f1-score (micro avg) 0.0
2023-10-25 20:23:24,518 ----------------------------------------------------------------------------------------------------
2023-10-25 20:23:46,181 epoch 8 - iter 521/5212 - loss 0.51997984 - time (sec): 21.66 - samples/sec: 1675.53 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:24:08,717 epoch 8 - iter 1042/5212 - loss 0.50827131 - time (sec): 44.20 - samples/sec: 1632.62 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:24:30,464 epoch 8 - iter 1563/5212 - loss 0.52442736 - time (sec): 65.94 - samples/sec: 1655.07 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:24:52,677 epoch 8 - iter 2084/5212 - loss 0.53205725 - time (sec): 88.16 - samples/sec: 1627.91 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:25:14,585 epoch 8 - iter 2605/5212 - loss 0.52219100 - time (sec): 110.06 - samples/sec: 1645.61 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:25:36,027 epoch 8 - iter 3126/5212 - loss 0.53222967 - time (sec): 131.51 - samples/sec: 1636.67 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:25:58,145 epoch 8 - iter 3647/5212 - loss 0.53929555 - time (sec): 153.63 - samples/sec: 1646.93 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:26:20,446 epoch 8 - iter 4168/5212 - loss 0.54041489 - time (sec): 175.93 - samples/sec: 1649.90 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:26:42,442 epoch 8 - iter 4689/5212 - loss 0.53637281 - time (sec): 197.92 - samples/sec: 1667.54 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:27:04,221 epoch 8 - iter 5210/5212 - loss 0.53009907 - time (sec): 219.70 - samples/sec: 1671.95 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:27:04,313 ----------------------------------------------------------------------------------------------------
2023-10-25 20:27:04,314 EPOCH 8 done: loss 0.5301 - lr: 0.000011
2023-10-25 20:27:10,440 DEV : loss 0.2523580491542816 - f1-score (micro avg) 0.0
2023-10-25 20:27:10,466 ----------------------------------------------------------------------------------------------------
2023-10-25 20:27:32,371 epoch 9 - iter 521/5212 - loss 0.52508937 - time (sec): 21.90 - samples/sec: 1695.81 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:27:54,597 epoch 9 - iter 1042/5212 - loss 0.54085716 - time (sec): 44.13 - samples/sec: 1626.44 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:28:16,737 epoch 9 - iter 1563/5212 - loss 0.54010812 - time (sec): 66.27 - samples/sec: 1656.38 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:28:39,178 epoch 9 - iter 2084/5212 - loss 0.54472189 - time (sec): 88.71 - samples/sec: 1642.16 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:29:01,052 epoch 9 - iter 2605/5212 - loss 0.54518847 - time (sec): 110.58 - samples/sec: 1633.86 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:29:23,296 epoch 9 - iter 3126/5212 - loss 0.54101988 - time (sec): 132.83 - samples/sec: 1638.13 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:29:45,113 epoch 9 - iter 3647/5212 - loss 0.53936286 - time (sec): 154.65 - samples/sec: 1632.42 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:30:07,429 epoch 9 - iter 4168/5212 - loss 0.53352108 - time (sec): 176.96 - samples/sec: 1649.12 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:30:29,902 epoch 9 - iter 4689/5212 - loss 0.53176607 - time (sec): 199.43 - samples/sec: 1661.49 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:30:52,350 epoch 9 - iter 5210/5212 - loss 0.52897509 - time (sec): 221.88 - samples/sec: 1655.79 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:30:52,435 ----------------------------------------------------------------------------------------------------
2023-10-25 20:30:52,435 EPOCH 9 done: loss 0.5289 - lr: 0.000006
2023-10-25 20:30:58,598 DEV : loss 0.26791492104530334 - f1-score (micro avg) 0.0
2023-10-25 20:30:58,624 ----------------------------------------------------------------------------------------------------
2023-10-25 20:31:20,588 epoch 10 - iter 521/5212 - loss 0.48688069 - time (sec): 21.96 - samples/sec: 1689.43 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:31:42,505 epoch 10 - iter 1042/5212 - loss 0.52394542 - time (sec): 43.88 - samples/sec: 1671.09 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:32:04,555 epoch 10 - iter 1563/5212 - loss 0.54120385 - time (sec): 65.93 - samples/sec: 1662.55 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:32:26,521 epoch 10 - iter 2084/5212 - loss 0.54119260 - time (sec): 87.90 - samples/sec: 1654.95 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:32:48,533 epoch 10 - iter 2605/5212 - loss 0.53504612 - time (sec): 109.91 - samples/sec: 1653.51 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:33:09,946 epoch 10 - iter 3126/5212 - loss 0.54141693 - time (sec): 131.32 - samples/sec: 1638.10 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:33:31,270 epoch 10 - iter 3647/5212 - loss 0.53472820 - time (sec): 152.64 - samples/sec: 1649.26 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:33:53,359 epoch 10 - iter 4168/5212 - loss 0.53331190 - time (sec): 174.73 - samples/sec: 1649.47 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:34:15,455 epoch 10 - iter 4689/5212 - loss 0.52669984 - time (sec): 196.83 - samples/sec: 1667.97 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:34:37,198 epoch 10 - iter 5210/5212 - loss 0.52741026 - time (sec): 218.57 - samples/sec: 1680.12 - lr: 0.000000 - momentum: 0.000000
2023-10-25 20:34:37,280 ----------------------------------------------------------------------------------------------------
2023-10-25 20:34:37,280 EPOCH 10 done: loss 0.5273 - lr: 0.000000
2023-10-25 20:34:44,070 DEV : loss 0.26033899188041687 - f1-score (micro avg) 0.0
2023-10-25 20:34:44,436 ----------------------------------------------------------------------------------------------------
2023-10-25 20:34:44,437 Loading model from best epoch ...
2023-10-25 20:34:46,052 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
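The 17-tag dictionary above is the BIOES encoding of the four entity types (LOC, PER, ORG, HumanProd): one O tag plus S/B/E/I variants per type. Reconstructed as a sketch (the iteration order mirrors the log line):

```python
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in "SBEI"]
print(len(tags), tags)  # 17 tags, in the order listed in the log
```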
2023-10-25 20:34:55,751
Results:
- F-score (micro) 0.3206
- F-score (macro) 0.1588
- Accuracy 0.1914
By class:
              precision    recall  f1-score   support

         LOC     0.4903    0.4580    0.4736      1214
         PER     0.1940    0.1126    0.1425       808
         ORG     0.0588    0.0113    0.0190       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3896    0.2724    0.3206      2390
   macro avg     0.1858    0.1455    0.1588      2390
weighted avg     0.3233    0.2724    0.2916      2390
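The micro F-score in the summary is the harmonic mean of the micro-averaged precision and recall from the table; recomputing it from the reported values (plain Python, numbers copied from the log):

```python
precision, recall = 0.3896, 0.2724  # micro avg row
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.3206, matching "F-score (micro) 0.3206"
```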
2023-10-25 20:34:55,751 ----------------------------------------------------------------------------------------------------