Upload ./training.log with huggingface_hub
2023-10-25 12:30:39,622 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,624 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 12:30:39,624 ----------------------------------------------------------------------------------------------------
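For reference, the layer shapes in the dump above determine the model's parameter count. A rough tally in plain Python (sizes copied from the dump; weight, bias, and LayerNorm terms included; this is a back-of-the-envelope sketch, not an official count):

```python
# Dimensions as printed in the module dump above
vocab, hidden, max_pos, inter, layers, tags = 64001, 768, 512, 3072, 12, 17

# BertEmbeddings: word + position + token-type embeddings, plus LayerNorm (weight + bias)
emb = vocab * hidden + max_pos * hidden + 2 * hidden + 2 * hidden

# One BertLayer: q/k/v, self-output dense + LayerNorm, intermediate, output dense + LayerNorm
per_layer = (3 * (hidden * hidden + hidden)          # query, key, value
             + hidden * hidden + hidden + 2 * hidden  # BertSelfOutput dense + LayerNorm
             + hidden * inter + inter                  # BertIntermediate dense
             + inter * hidden + hidden + 2 * hidden)   # BertOutput dense + LayerNorm

pooler = hidden * hidden + hidden       # BertPooler dense
head = hidden * tags + tags             # SequenceTagger linear head (17 tags)

total = emb + layers * per_layer + pooler + head
print(f"~{total:,} parameters")  # roughly 135M
```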
2023-10-25 12:30:39,624 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 12:30:39,624 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,624 Train: 20847 sentences
2023-10-25 12:30:39,624 (train_with_dev=False, train_with_test=False)
2023-10-25 12:30:39,624 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,624 Training Params:
2023-10-25 12:30:39,624 - learning_rate: "3e-05"
2023-10-25 12:30:39,624 - mini_batch_size: "4"
2023-10-25 12:30:39,624 - max_epochs: "10"
2023-10-25 12:30:39,624 - shuffle: "True"
2023-10-25 12:30:39,624 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,624 Plugins:
2023-10-25 12:30:39,624 - TensorboardLogger
2023-10-25 12:30:39,625 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 12:30:39,625 ----------------------------------------------------------------------------------------------------
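The LinearScheduler with warmup_fraction 0.1 explains the lr column in the iteration lines below: with 5212 iterations per epoch over 10 epochs (52120 total steps), the learning rate ramps linearly up to 3e-05 during epoch 1 (the first 10% of steps) and then decays linearly to zero. A minimal sketch of such a schedule (`linear_lr` is a hypothetical helper, not Flair's implementation):

```python
def linear_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = warmup_fraction * total_steps
    if step < warmup_steps:
        return peak_lr * step / warmup_steps          # ramp up
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay

total = 10 * 5212  # 10 epochs x 5212 iterations, as in this run
print(linear_lr(5212, total))   # peak reached at end of epoch 1: 3e-05
print(linear_lr(total, total))  # decayed to 0.0 at the final step
```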
2023-10-25 12:30:39,625 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 12:30:39,625 - metric: "('micro avg', 'f1-score')"
2023-10-25 12:30:39,625 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,625 Computation:
2023-10-25 12:30:39,625 - compute on device: cuda:0
2023-10-25 12:30:39,625 - embedding storage: none
2023-10-25 12:30:39,625 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,625 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 12:30:39,625 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,625 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,625 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 12:31:02,328 epoch 1 - iter 521/5212 - loss 1.35690721 - time (sec): 22.70 - samples/sec: 1640.18 - lr: 0.000003 - momentum: 0.000000
2023-10-25 12:31:25,335 epoch 1 - iter 1042/5212 - loss 0.84359775 - time (sec): 45.71 - samples/sec: 1683.49 - lr: 0.000006 - momentum: 0.000000
2023-10-25 12:31:47,641 epoch 1 - iter 1563/5212 - loss 0.66469253 - time (sec): 68.02 - samples/sec: 1680.95 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:32:09,930 epoch 1 - iter 2084/5212 - loss 0.57214190 - time (sec): 90.30 - samples/sec: 1647.86 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:32:32,799 epoch 1 - iter 2605/5212 - loss 0.50869684 - time (sec): 113.17 - samples/sec: 1631.54 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:32:57,487 epoch 1 - iter 3126/5212 - loss 0.45814316 - time (sec): 137.86 - samples/sec: 1615.37 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:33:21,714 epoch 1 - iter 3647/5212 - loss 0.42300347 - time (sec): 162.09 - samples/sec: 1584.75 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:33:44,563 epoch 1 - iter 4168/5212 - loss 0.39672953 - time (sec): 184.94 - samples/sec: 1584.20 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:34:07,932 epoch 1 - iter 4689/5212 - loss 0.37637868 - time (sec): 208.31 - samples/sec: 1578.22 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:34:31,644 epoch 1 - iter 5210/5212 - loss 0.35671008 - time (sec): 232.02 - samples/sec: 1583.15 - lr: 0.000030 - momentum: 0.000000
2023-10-25 12:34:31,723 ----------------------------------------------------------------------------------------------------
2023-10-25 12:34:31,724 EPOCH 1 done: loss 0.3566 - lr: 0.000030
2023-10-25 12:34:35,611 DEV : loss 0.15410612523555756 - f1-score (micro avg) 0.3387
2023-10-25 12:34:35,637 saving best model
2023-10-25 12:34:36,029 ----------------------------------------------------------------------------------------------------
2023-10-25 12:34:58,919 epoch 2 - iter 521/5212 - loss 0.18873373 - time (sec): 22.89 - samples/sec: 1688.82 - lr: 0.000030 - momentum: 0.000000
2023-10-25 12:35:22,298 epoch 2 - iter 1042/5212 - loss 0.19284643 - time (sec): 46.27 - samples/sec: 1643.22 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:35:44,803 epoch 2 - iter 1563/5212 - loss 0.18768849 - time (sec): 68.77 - samples/sec: 1658.21 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:36:07,368 epoch 2 - iter 2084/5212 - loss 0.18422999 - time (sec): 91.34 - samples/sec: 1621.08 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:36:29,330 epoch 2 - iter 2605/5212 - loss 0.18427899 - time (sec): 113.30 - samples/sec: 1615.48 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:36:52,426 epoch 2 - iter 3126/5212 - loss 0.18293934 - time (sec): 136.40 - samples/sec: 1606.22 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:37:16,445 epoch 2 - iter 3647/5212 - loss 0.18028597 - time (sec): 160.41 - samples/sec: 1596.49 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:37:38,507 epoch 2 - iter 4168/5212 - loss 0.17741025 - time (sec): 182.48 - samples/sec: 1602.46 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:38:01,300 epoch 2 - iter 4689/5212 - loss 0.17315665 - time (sec): 205.27 - samples/sec: 1608.13 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:38:24,244 epoch 2 - iter 5210/5212 - loss 0.17228676 - time (sec): 228.21 - samples/sec: 1609.54 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:38:24,336 ----------------------------------------------------------------------------------------------------
2023-10-25 12:38:24,336 EPOCH 2 done: loss 0.1723 - lr: 0.000027
2023-10-25 12:38:31,462 DEV : loss 0.16118642687797546 - f1-score (micro avg) 0.3443
2023-10-25 12:38:31,487 saving best model
2023-10-25 12:38:32,153 ----------------------------------------------------------------------------------------------------
2023-10-25 12:38:55,011 epoch 3 - iter 521/5212 - loss 0.11772243 - time (sec): 22.85 - samples/sec: 1474.34 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:39:17,159 epoch 3 - iter 1042/5212 - loss 0.12532719 - time (sec): 45.00 - samples/sec: 1506.77 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:39:40,169 epoch 3 - iter 1563/5212 - loss 0.11670384 - time (sec): 68.01 - samples/sec: 1587.55 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:40:03,561 epoch 3 - iter 2084/5212 - loss 0.11831328 - time (sec): 91.40 - samples/sec: 1568.07 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:40:25,816 epoch 3 - iter 2605/5212 - loss 0.11752038 - time (sec): 113.66 - samples/sec: 1593.91 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:40:47,878 epoch 3 - iter 3126/5212 - loss 0.11849344 - time (sec): 135.72 - samples/sec: 1602.22 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:41:10,977 epoch 3 - iter 3647/5212 - loss 0.11922318 - time (sec): 158.82 - samples/sec: 1620.57 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:41:34,112 epoch 3 - iter 4168/5212 - loss 0.11720478 - time (sec): 181.95 - samples/sec: 1620.11 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:41:58,465 epoch 3 - iter 4689/5212 - loss 0.11833606 - time (sec): 206.31 - samples/sec: 1613.98 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:42:20,774 epoch 3 - iter 5210/5212 - loss 0.11828266 - time (sec): 228.62 - samples/sec: 1606.87 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:42:20,855 ----------------------------------------------------------------------------------------------------
2023-10-25 12:42:20,856 EPOCH 3 done: loss 0.1183 - lr: 0.000023
2023-10-25 12:42:27,971 DEV : loss 0.21911266446113586 - f1-score (micro avg) 0.3492
2023-10-25 12:42:27,995 saving best model
2023-10-25 12:42:28,517 ----------------------------------------------------------------------------------------------------
2023-10-25 12:42:51,944 epoch 4 - iter 521/5212 - loss 0.08106536 - time (sec): 23.42 - samples/sec: 1575.48 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:43:16,950 epoch 4 - iter 1042/5212 - loss 0.08434711 - time (sec): 48.43 - samples/sec: 1515.83 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:43:41,217 epoch 4 - iter 1563/5212 - loss 0.08582429 - time (sec): 72.70 - samples/sec: 1510.30 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:44:03,821 epoch 4 - iter 2084/5212 - loss 0.08399684 - time (sec): 95.30 - samples/sec: 1499.18 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:44:25,673 epoch 4 - iter 2605/5212 - loss 0.08320564 - time (sec): 117.15 - samples/sec: 1544.67 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:44:47,784 epoch 4 - iter 3126/5212 - loss 0.08202671 - time (sec): 139.26 - samples/sec: 1559.23 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:45:09,957 epoch 4 - iter 3647/5212 - loss 0.08016329 - time (sec): 161.44 - samples/sec: 1576.18 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:45:32,033 epoch 4 - iter 4168/5212 - loss 0.08273180 - time (sec): 183.51 - samples/sec: 1601.57 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:45:55,001 epoch 4 - iter 4689/5212 - loss 0.08226071 - time (sec): 206.48 - samples/sec: 1589.66 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:46:17,693 epoch 4 - iter 5210/5212 - loss 0.08209536 - time (sec): 229.17 - samples/sec: 1603.11 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:46:17,780 ----------------------------------------------------------------------------------------------------
2023-10-25 12:46:17,781 EPOCH 4 done: loss 0.0821 - lr: 0.000020
2023-10-25 12:46:24,911 DEV : loss 0.2745342254638672 - f1-score (micro avg) 0.3681
2023-10-25 12:46:24,951 saving best model
2023-10-25 12:46:25,554 ----------------------------------------------------------------------------------------------------
2023-10-25 12:46:47,731 epoch 5 - iter 521/5212 - loss 0.05826864 - time (sec): 22.17 - samples/sec: 1597.07 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:47:10,029 epoch 5 - iter 1042/5212 - loss 0.06149609 - time (sec): 44.47 - samples/sec: 1603.10 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:47:32,836 epoch 5 - iter 1563/5212 - loss 0.06191734 - time (sec): 67.28 - samples/sec: 1637.69 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:47:55,689 epoch 5 - iter 2084/5212 - loss 0.05955556 - time (sec): 90.13 - samples/sec: 1643.10 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:48:18,311 epoch 5 - iter 2605/5212 - loss 0.05898503 - time (sec): 112.75 - samples/sec: 1649.45 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:48:40,933 epoch 5 - iter 3126/5212 - loss 0.06036901 - time (sec): 135.37 - samples/sec: 1660.97 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:49:03,897 epoch 5 - iter 3647/5212 - loss 0.06016401 - time (sec): 158.34 - samples/sec: 1648.50 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:49:26,926 epoch 5 - iter 4168/5212 - loss 0.05974490 - time (sec): 181.37 - samples/sec: 1628.81 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:49:49,005 epoch 5 - iter 4689/5212 - loss 0.05911683 - time (sec): 203.45 - samples/sec: 1621.62 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:50:11,372 epoch 5 - iter 5210/5212 - loss 0.05898527 - time (sec): 225.81 - samples/sec: 1626.95 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:50:11,468 ----------------------------------------------------------------------------------------------------
2023-10-25 12:50:11,468 EPOCH 5 done: loss 0.0590 - lr: 0.000017
2023-10-25 12:50:19,364 DEV : loss 0.4137490689754486 - f1-score (micro avg) 0.3585
2023-10-25 12:50:19,391 ----------------------------------------------------------------------------------------------------
2023-10-25 12:50:43,444 epoch 6 - iter 521/5212 - loss 0.03965013 - time (sec): 24.05 - samples/sec: 1582.76 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:51:06,780 epoch 6 - iter 1042/5212 - loss 0.04173389 - time (sec): 47.39 - samples/sec: 1628.09 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:51:31,606 epoch 6 - iter 1563/5212 - loss 0.04000309 - time (sec): 72.21 - samples/sec: 1587.69 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:51:54,957 epoch 6 - iter 2084/5212 - loss 0.04058525 - time (sec): 95.56 - samples/sec: 1598.59 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:52:17,380 epoch 6 - iter 2605/5212 - loss 0.04049229 - time (sec): 117.99 - samples/sec: 1580.67 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:52:39,712 epoch 6 - iter 3126/5212 - loss 0.04175965 - time (sec): 140.32 - samples/sec: 1587.57 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:53:02,319 epoch 6 - iter 3647/5212 - loss 0.04166030 - time (sec): 162.93 - samples/sec: 1578.04 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:53:25,485 epoch 6 - iter 4168/5212 - loss 0.04117385 - time (sec): 186.09 - samples/sec: 1576.80 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:53:47,873 epoch 6 - iter 4689/5212 - loss 0.04144962 - time (sec): 208.48 - samples/sec: 1582.00 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:54:11,061 epoch 6 - iter 5210/5212 - loss 0.04147789 - time (sec): 231.67 - samples/sec: 1585.25 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:54:11,156 ----------------------------------------------------------------------------------------------------
2023-10-25 12:54:11,157 EPOCH 6 done: loss 0.0416 - lr: 0.000013
2023-10-25 12:54:18,637 DEV : loss 0.3830464780330658 - f1-score (micro avg) 0.3865
2023-10-25 12:54:18,662 saving best model
2023-10-25 12:54:19,323 ----------------------------------------------------------------------------------------------------
2023-10-25 12:54:41,768 epoch 7 - iter 521/5212 - loss 0.03965221 - time (sec): 22.44 - samples/sec: 1653.26 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:55:04,720 epoch 7 - iter 1042/5212 - loss 0.03314505 - time (sec): 45.40 - samples/sec: 1666.32 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:55:27,670 epoch 7 - iter 1563/5212 - loss 0.03234971 - time (sec): 68.35 - samples/sec: 1616.79 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:55:49,778 epoch 7 - iter 2084/5212 - loss 0.03347041 - time (sec): 90.45 - samples/sec: 1635.84 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:56:12,507 epoch 7 - iter 2605/5212 - loss 0.03354482 - time (sec): 113.18 - samples/sec: 1635.45 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:56:34,929 epoch 7 - iter 3126/5212 - loss 0.03288247 - time (sec): 135.60 - samples/sec: 1652.93 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:56:57,939 epoch 7 - iter 3647/5212 - loss 0.03251089 - time (sec): 158.61 - samples/sec: 1644.48 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:57:19,830 epoch 7 - iter 4168/5212 - loss 0.03188274 - time (sec): 180.51 - samples/sec: 1638.75 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:57:41,626 epoch 7 - iter 4689/5212 - loss 0.03190288 - time (sec): 202.30 - samples/sec: 1636.71 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:58:03,906 epoch 7 - iter 5210/5212 - loss 0.03148374 - time (sec): 224.58 - samples/sec: 1633.76 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:58:04,015 ----------------------------------------------------------------------------------------------------
2023-10-25 12:58:04,016 EPOCH 7 done: loss 0.0314 - lr: 0.000010
2023-10-25 12:58:10,396 DEV : loss 0.37830933928489685 - f1-score (micro avg) 0.4183
2023-10-25 12:58:10,422 saving best model
2023-10-25 12:58:10,956 ----------------------------------------------------------------------------------------------------
2023-10-25 12:58:36,539 epoch 8 - iter 521/5212 - loss 0.02379176 - time (sec): 25.58 - samples/sec: 1452.39 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:58:59,230 epoch 8 - iter 1042/5212 - loss 0.02278495 - time (sec): 48.27 - samples/sec: 1565.51 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:59:22,682 epoch 8 - iter 1563/5212 - loss 0.02209281 - time (sec): 71.72 - samples/sec: 1567.66 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:59:45,344 epoch 8 - iter 2084/5212 - loss 0.02158886 - time (sec): 94.38 - samples/sec: 1591.70 - lr: 0.000009 - momentum: 0.000000
2023-10-25 13:00:08,655 epoch 8 - iter 2605/5212 - loss 0.02127616 - time (sec): 117.69 - samples/sec: 1616.77 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:00:31,012 epoch 8 - iter 3126/5212 - loss 0.02227528 - time (sec): 140.05 - samples/sec: 1609.64 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:00:53,123 epoch 8 - iter 3647/5212 - loss 0.02166962 - time (sec): 162.16 - samples/sec: 1597.17 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:01:16,172 epoch 8 - iter 4168/5212 - loss 0.02167663 - time (sec): 185.21 - samples/sec: 1603.61 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:01:38,925 epoch 8 - iter 4689/5212 - loss 0.02183753 - time (sec): 207.97 - samples/sec: 1601.55 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:02:02,358 epoch 8 - iter 5210/5212 - loss 0.02140802 - time (sec): 231.40 - samples/sec: 1587.37 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:02:02,447 ----------------------------------------------------------------------------------------------------
2023-10-25 13:02:02,447 EPOCH 8 done: loss 0.0214 - lr: 0.000007
2023-10-25 13:02:08,780 DEV : loss 0.477633535861969 - f1-score (micro avg) 0.3662
2023-10-25 13:02:08,806 ----------------------------------------------------------------------------------------------------
2023-10-25 13:02:32,294 epoch 9 - iter 521/5212 - loss 0.01864109 - time (sec): 23.49 - samples/sec: 1540.21 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:02:54,705 epoch 9 - iter 1042/5212 - loss 0.01626733 - time (sec): 45.90 - samples/sec: 1589.21 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:03:17,344 epoch 9 - iter 1563/5212 - loss 0.01544773 - time (sec): 68.54 - samples/sec: 1598.24 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:03:40,791 epoch 9 - iter 2084/5212 - loss 0.01503458 - time (sec): 91.98 - samples/sec: 1583.86 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:04:03,675 epoch 9 - iter 2605/5212 - loss 0.01474038 - time (sec): 114.87 - samples/sec: 1601.09 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:04:26,738 epoch 9 - iter 3126/5212 - loss 0.01433621 - time (sec): 137.93 - samples/sec: 1601.10 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:04:49,324 epoch 9 - iter 3647/5212 - loss 0.01428946 - time (sec): 160.52 - samples/sec: 1612.61 - lr: 0.000004 - momentum: 0.000000
2023-10-25 13:05:11,507 epoch 9 - iter 4168/5212 - loss 0.01464623 - time (sec): 182.70 - samples/sec: 1607.65 - lr: 0.000004 - momentum: 0.000000
2023-10-25 13:05:35,042 epoch 9 - iter 4689/5212 - loss 0.01437046 - time (sec): 206.24 - samples/sec: 1609.02 - lr: 0.000004 - momentum: 0.000000
2023-10-25 13:05:57,365 epoch 9 - iter 5210/5212 - loss 0.01431548 - time (sec): 228.56 - samples/sec: 1606.78 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:05:57,456 ----------------------------------------------------------------------------------------------------
2023-10-25 13:05:57,456 EPOCH 9 done: loss 0.0143 - lr: 0.000003
2023-10-25 13:06:04,012 DEV : loss 0.52719646692276 - f1-score (micro avg) 0.3728
2023-10-25 13:06:04,038 ----------------------------------------------------------------------------------------------------
2023-10-25 13:06:26,903 epoch 10 - iter 521/5212 - loss 0.00541859 - time (sec): 22.86 - samples/sec: 1629.64 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:06:50,062 epoch 10 - iter 1042/5212 - loss 0.00660409 - time (sec): 46.02 - samples/sec: 1566.18 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:07:12,889 epoch 10 - iter 1563/5212 - loss 0.00863731 - time (sec): 68.85 - samples/sec: 1624.59 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:07:35,660 epoch 10 - iter 2084/5212 - loss 0.00985019 - time (sec): 91.62 - samples/sec: 1632.99 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:07:57,770 epoch 10 - iter 2605/5212 - loss 0.00938135 - time (sec): 113.73 - samples/sec: 1613.63 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:08:19,806 epoch 10 - iter 3126/5212 - loss 0.00960318 - time (sec): 135.77 - samples/sec: 1620.76 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:08:43,358 epoch 10 - iter 3647/5212 - loss 0.00958806 - time (sec): 159.32 - samples/sec: 1605.72 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:09:05,476 epoch 10 - iter 4168/5212 - loss 0.00967595 - time (sec): 181.44 - samples/sec: 1615.99 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:09:28,042 epoch 10 - iter 4689/5212 - loss 0.00960296 - time (sec): 204.00 - samples/sec: 1621.97 - lr: 0.000000 - momentum: 0.000000
2023-10-25 13:09:51,237 epoch 10 - iter 5210/5212 - loss 0.00964924 - time (sec): 227.20 - samples/sec: 1616.88 - lr: 0.000000 - momentum: 0.000000
2023-10-25 13:09:51,323 ----------------------------------------------------------------------------------------------------
2023-10-25 13:09:51,323 EPOCH 10 done: loss 0.0096 - lr: 0.000000
2023-10-25 13:09:58,242 DEV : loss 0.46924296021461487 - f1-score (micro avg) 0.3859
2023-10-25 13:09:58,800 ----------------------------------------------------------------------------------------------------
2023-10-25 13:09:58,801 Loading model from best epoch ...
2023-10-25 13:10:00,526 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
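The 17-tag dictionary uses the BIOES scheme (Begin, Inside, End, Single, plus O) over four entity types (LOC, PER, ORG, HumanProd). A minimal sketch of how such tag sequences decode into entity spans (`bioes_to_spans` is a hypothetical helper, not Flair's decoder):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end, label) spans (inclusive indices)."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                 # single-token entity
            spans.append((i, i, label))
            start = None
        elif prefix == "B":               # entity begins
            start = (i, label)
        elif prefix == "E" and start and start[1] == label:
            spans.append((start[0], i, label))  # entity ends
            start = None
        # "I-" tokens simply continue an open span
    return spans

print(bioes_to_spans(["S-LOC", "O", "B-PER", "I-PER", "E-PER"]))
# [(0, 0, 'LOC'), (2, 4, 'PER')]
```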
2023-10-25 13:10:10,661
Results:
- F-score (micro) 0.4912
- F-score (macro) 0.33
- Accuracy 0.3282
By class:
              precision    recall  f1-score   support

         LOC     0.5356    0.6260    0.5773      1214
         PER     0.4094    0.4530    0.4301       808
         ORG     0.3536    0.2805    0.3128       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4715    0.5126    0.4912      2390
   macro avg     0.3246    0.3399    0.3300      2390
weighted avg     0.4627    0.5126    0.4848      2390
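The macro and weighted averages follow directly from the per-class rows: macro is the unweighted mean of the four F1 scores, weighted is the support-weighted mean. A quick check in plain Python (values copied from the report above):

```python
# (f1-score, support) per class, as reported in the table above
per_class = {
    "LOC": (0.5773, 1214),
    "PER": (0.4301, 808),
    "ORG": (0.3128, 353),
    "HumanProd": (0.0000, 15),
}

macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)
total_support = sum(n for _, n in per_class.values())
weighted_f1 = sum(f1 * n for f1, n in per_class.values()) / total_support

print(round(macro_f1, 4), round(weighted_f1, 4))  # ~0.33 and ~0.4848
```

Note the low macro average is dragged down by the rare HumanProd class (15 test mentions, F1 0.0), while the micro average reflects the dominant LOC and PER classes.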
2023-10-25 13:10:10,661 ----------------------------------------------------------------------------------------------------