2023-10-25 12:30:39,622 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,624 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
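The module shapes printed above are enough to tally an approximate parameter count for this tagger. The sketch below is illustrative arithmetic over the printed dimensions only (it ignores buffers and anything not shown in the repr):

```python
def linear(n_in, n_out, bias=True):
    # weight matrix plus optional bias vector of an nn.Linear
    return n_in * n_out + (n_out if bias else 0)

def layernorm(dim):
    # scale (gamma) and shift (beta) vectors
    return 2 * dim

H, FF, V, LAYERS = 768, 3072, 64001, 12  # hidden, feed-forward, vocab, depth

embeddings = V * H + 512 * H + 2 * H + layernorm(H)
per_layer = (
    3 * linear(H, H)   # query, key, value projections
    + linear(H, H)     # attention output dense
    + layernorm(H)
    + linear(H, FF)    # intermediate
    + linear(FF, H)    # output
    + layernorm(H)
)
pooler = linear(H, H)
tagger_head = linear(H, 17)  # final linear over the 17 tags

total = embeddings + LAYERS * per_layer + pooler + tagger_head
print(total)  # → 135207185, i.e. roughly 135M parameters
```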
2023-10-25 12:30:39,624 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,624 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 12:30:39,624 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,624 Train:  20847 sentences
2023-10-25 12:30:39,624         (train_with_dev=False, train_with_test=False)
2023-10-25 12:30:39,624 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,624 Training Params:
2023-10-25 12:30:39,624  - learning_rate: "3e-05"
2023-10-25 12:30:39,624  - mini_batch_size: "4"
2023-10-25 12:30:39,624  - max_epochs: "10"
2023-10-25 12:30:39,624  - shuffle: "True"
2023-10-25 12:30:39,624 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,624 Plugins:
2023-10-25 12:30:39,624  - TensorboardLogger
2023-10-25 12:30:39,625  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 12:30:39,625 ----------------------------------------------------------------------------------------------------
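The LinearScheduler with warmup_fraction '0.1' ramps the learning rate from 0 to the peak (3e-05) over the first 10% of optimizer steps, then decays it linearly back to 0. A minimal sketch of that schedule, inferred from the lr values logged below (total step count assumed to be 5212 iterations x 10 epochs; not Flair's internal implementation):

```python
def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero (schedule sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

TOTAL = 5212 * 10  # iterations per epoch x epochs (assumed)
PEAK = 3e-05

print(round(linear_warmup_lr(521, TOTAL, PEAK), 6))    # → 3e-06, the lr logged at iter 521 of epoch 1
print(round(linear_warmup_lr(10424, TOTAL, PEAK), 6))  # → 2.7e-05, the lr logged at the end of epoch 2
```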
2023-10-25 12:30:39,625 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 12:30:39,625  - metric: "('micro avg', 'f1-score')"
2023-10-25 12:30:39,625 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,625 Computation:
2023-10-25 12:30:39,625  - compute on device: cuda:0
2023-10-25 12:30:39,625  - embedding storage: none
2023-10-25 12:30:39,625 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,625 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 12:30:39,625 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,625 ----------------------------------------------------------------------------------------------------
2023-10-25 12:30:39,625 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 12:31:02,328 epoch 1 - iter 521/5212 - loss 1.35690721 - time (sec): 22.70 - samples/sec: 1640.18 - lr: 0.000003 - momentum: 0.000000
2023-10-25 12:31:25,335 epoch 1 - iter 1042/5212 - loss 0.84359775 - time (sec): 45.71 - samples/sec: 1683.49 - lr: 0.000006 - momentum: 0.000000
2023-10-25 12:31:47,641 epoch 1 - iter 1563/5212 - loss 0.66469253 - time (sec): 68.02 - samples/sec: 1680.95 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:32:09,930 epoch 1 - iter 2084/5212 - loss 0.57214190 - time (sec): 90.30 - samples/sec: 1647.86 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:32:32,799 epoch 1 - iter 2605/5212 - loss 0.50869684 - time (sec): 113.17 - samples/sec: 1631.54 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:32:57,487 epoch 1 - iter 3126/5212 - loss 0.45814316 - time (sec): 137.86 - samples/sec: 1615.37 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:33:21,714 epoch 1 - iter 3647/5212 - loss 0.42300347 - time (sec): 162.09 - samples/sec: 1584.75 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:33:44,563 epoch 1 - iter 4168/5212 - loss 0.39672953 - time (sec): 184.94 - samples/sec: 1584.20 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:34:07,932 epoch 1 - iter 4689/5212 - loss 0.37637868 - time (sec): 208.31 - samples/sec: 1578.22 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:34:31,644 epoch 1 - iter 5210/5212 - loss 0.35671008 - time (sec): 232.02 - samples/sec: 1583.15 - lr: 0.000030 - momentum: 0.000000
2023-10-25 12:34:31,723 ----------------------------------------------------------------------------------------------------
2023-10-25 12:34:31,724 EPOCH 1 done: loss 0.3566 - lr: 0.000030
2023-10-25 12:34:35,611 DEV : loss 0.15410612523555756 - f1-score (micro avg)  0.3387
2023-10-25 12:34:35,637 saving best model
2023-10-25 12:34:36,029 ----------------------------------------------------------------------------------------------------
2023-10-25 12:34:58,919 epoch 2 - iter 521/5212 - loss 0.18873373 - time (sec): 22.89 - samples/sec: 1688.82 - lr: 0.000030 - momentum: 0.000000
2023-10-25 12:35:22,298 epoch 2 - iter 1042/5212 - loss 0.19284643 - time (sec): 46.27 - samples/sec: 1643.22 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:35:44,803 epoch 2 - iter 1563/5212 - loss 0.18768849 - time (sec): 68.77 - samples/sec: 1658.21 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:36:07,368 epoch 2 - iter 2084/5212 - loss 0.18422999 - time (sec): 91.34 - samples/sec: 1621.08 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:36:29,330 epoch 2 - iter 2605/5212 - loss 0.18427899 - time (sec): 113.30 - samples/sec: 1615.48 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:36:52,426 epoch 2 - iter 3126/5212 - loss 0.18293934 - time (sec): 136.40 - samples/sec: 1606.22 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:37:16,445 epoch 2 - iter 3647/5212 - loss 0.18028597 - time (sec): 160.41 - samples/sec: 1596.49 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:37:38,507 epoch 2 - iter 4168/5212 - loss 0.17741025 - time (sec): 182.48 - samples/sec: 1602.46 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:38:01,300 epoch 2 - iter 4689/5212 - loss 0.17315665 - time (sec): 205.27 - samples/sec: 1608.13 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:38:24,244 epoch 2 - iter 5210/5212 - loss 0.17228676 - time (sec): 228.21 - samples/sec: 1609.54 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:38:24,336 ----------------------------------------------------------------------------------------------------
2023-10-25 12:38:24,336 EPOCH 2 done: loss 0.1723 - lr: 0.000027
2023-10-25 12:38:31,462 DEV : loss 0.16118642687797546 - f1-score (micro avg)  0.3443
2023-10-25 12:38:31,487 saving best model
2023-10-25 12:38:32,153 ----------------------------------------------------------------------------------------------------
2023-10-25 12:38:55,011 epoch 3 - iter 521/5212 - loss 0.11772243 - time (sec): 22.85 - samples/sec: 1474.34 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:39:17,159 epoch 3 - iter 1042/5212 - loss 0.12532719 - time (sec): 45.00 - samples/sec: 1506.77 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:39:40,169 epoch 3 - iter 1563/5212 - loss 0.11670384 - time (sec): 68.01 - samples/sec: 1587.55 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:40:03,561 epoch 3 - iter 2084/5212 - loss 0.11831328 - time (sec): 91.40 - samples/sec: 1568.07 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:40:25,816 epoch 3 - iter 2605/5212 - loss 0.11752038 - time (sec): 113.66 - samples/sec: 1593.91 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:40:47,878 epoch 3 - iter 3126/5212 - loss 0.11849344 - time (sec): 135.72 - samples/sec: 1602.22 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:41:10,977 epoch 3 - iter 3647/5212 - loss 0.11922318 - time (sec): 158.82 - samples/sec: 1620.57 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:41:34,112 epoch 3 - iter 4168/5212 - loss 0.11720478 - time (sec): 181.95 - samples/sec: 1620.11 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:41:58,465 epoch 3 - iter 4689/5212 - loss 0.11833606 - time (sec): 206.31 - samples/sec: 1613.98 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:42:20,774 epoch 3 - iter 5210/5212 - loss 0.11828266 - time (sec): 228.62 - samples/sec: 1606.87 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:42:20,855 ----------------------------------------------------------------------------------------------------
2023-10-25 12:42:20,856 EPOCH 3 done: loss 0.1183 - lr: 0.000023
2023-10-25 12:42:27,971 DEV : loss 0.21911266446113586 - f1-score (micro avg)  0.3492
2023-10-25 12:42:27,995 saving best model
2023-10-25 12:42:28,517 ----------------------------------------------------------------------------------------------------
2023-10-25 12:42:51,944 epoch 4 - iter 521/5212 - loss 0.08106536 - time (sec): 23.42 - samples/sec: 1575.48 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:43:16,950 epoch 4 - iter 1042/5212 - loss 0.08434711 - time (sec): 48.43 - samples/sec: 1515.83 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:43:41,217 epoch 4 - iter 1563/5212 - loss 0.08582429 - time (sec): 72.70 - samples/sec: 1510.30 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:44:03,821 epoch 4 - iter 2084/5212 - loss 0.08399684 - time (sec): 95.30 - samples/sec: 1499.18 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:44:25,673 epoch 4 - iter 2605/5212 - loss 0.08320564 - time (sec): 117.15 - samples/sec: 1544.67 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:44:47,784 epoch 4 - iter 3126/5212 - loss 0.08202671 - time (sec): 139.26 - samples/sec: 1559.23 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:45:09,957 epoch 4 - iter 3647/5212 - loss 0.08016329 - time (sec): 161.44 - samples/sec: 1576.18 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:45:32,033 epoch 4 - iter 4168/5212 - loss 0.08273180 - time (sec): 183.51 - samples/sec: 1601.57 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:45:55,001 epoch 4 - iter 4689/5212 - loss 0.08226071 - time (sec): 206.48 - samples/sec: 1589.66 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:46:17,693 epoch 4 - iter 5210/5212 - loss 0.08209536 - time (sec): 229.17 - samples/sec: 1603.11 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:46:17,780 ----------------------------------------------------------------------------------------------------
2023-10-25 12:46:17,781 EPOCH 4 done: loss 0.0821 - lr: 0.000020
2023-10-25 12:46:24,911 DEV : loss 0.2745342254638672 - f1-score (micro avg)  0.3681
2023-10-25 12:46:24,951 saving best model
2023-10-25 12:46:25,554 ----------------------------------------------------------------------------------------------------
2023-10-25 12:46:47,731 epoch 5 - iter 521/5212 - loss 0.05826864 - time (sec): 22.17 - samples/sec: 1597.07 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:47:10,029 epoch 5 - iter 1042/5212 - loss 0.06149609 - time (sec): 44.47 - samples/sec: 1603.10 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:47:32,836 epoch 5 - iter 1563/5212 - loss 0.06191734 - time (sec): 67.28 - samples/sec: 1637.69 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:47:55,689 epoch 5 - iter 2084/5212 - loss 0.05955556 - time (sec): 90.13 - samples/sec: 1643.10 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:48:18,311 epoch 5 - iter 2605/5212 - loss 0.05898503 - time (sec): 112.75 - samples/sec: 1649.45 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:48:40,933 epoch 5 - iter 3126/5212 - loss 0.06036901 - time (sec): 135.37 - samples/sec: 1660.97 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:49:03,897 epoch 5 - iter 3647/5212 - loss 0.06016401 - time (sec): 158.34 - samples/sec: 1648.50 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:49:26,926 epoch 5 - iter 4168/5212 - loss 0.05974490 - time (sec): 181.37 - samples/sec: 1628.81 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:49:49,005 epoch 5 - iter 4689/5212 - loss 0.05911683 - time (sec): 203.45 - samples/sec: 1621.62 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:50:11,372 epoch 5 - iter 5210/5212 - loss 0.05898527 - time (sec): 225.81 - samples/sec: 1626.95 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:50:11,468 ----------------------------------------------------------------------------------------------------
2023-10-25 12:50:11,468 EPOCH 5 done: loss 0.0590 - lr: 0.000017
2023-10-25 12:50:19,364 DEV : loss 0.4137490689754486 - f1-score (micro avg)  0.3585
2023-10-25 12:50:19,391 ----------------------------------------------------------------------------------------------------
2023-10-25 12:50:43,444 epoch 6 - iter 521/5212 - loss 0.03965013 - time (sec): 24.05 - samples/sec: 1582.76 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:51:06,780 epoch 6 - iter 1042/5212 - loss 0.04173389 - time (sec): 47.39 - samples/sec: 1628.09 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:51:31,606 epoch 6 - iter 1563/5212 - loss 0.04000309 - time (sec): 72.21 - samples/sec: 1587.69 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:51:54,957 epoch 6 - iter 2084/5212 - loss 0.04058525 - time (sec): 95.56 - samples/sec: 1598.59 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:52:17,380 epoch 6 - iter 2605/5212 - loss 0.04049229 - time (sec): 117.99 - samples/sec: 1580.67 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:52:39,712 epoch 6 - iter 3126/5212 - loss 0.04175965 - time (sec): 140.32 - samples/sec: 1587.57 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:53:02,319 epoch 6 - iter 3647/5212 - loss 0.04166030 - time (sec): 162.93 - samples/sec: 1578.04 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:53:25,485 epoch 6 - iter 4168/5212 - loss 0.04117385 - time (sec): 186.09 - samples/sec: 1576.80 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:53:47,873 epoch 6 - iter 4689/5212 - loss 0.04144962 - time (sec): 208.48 - samples/sec: 1582.00 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:54:11,061 epoch 6 - iter 5210/5212 - loss 0.04147789 - time (sec): 231.67 - samples/sec: 1585.25 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:54:11,156 ----------------------------------------------------------------------------------------------------
2023-10-25 12:54:11,157 EPOCH 6 done: loss 0.0416 - lr: 0.000013
2023-10-25 12:54:18,637 DEV : loss 0.3830464780330658 - f1-score (micro avg)  0.3865
2023-10-25 12:54:18,662 saving best model
2023-10-25 12:54:19,323 ----------------------------------------------------------------------------------------------------
2023-10-25 12:54:41,768 epoch 7 - iter 521/5212 - loss 0.03965221 - time (sec): 22.44 - samples/sec: 1653.26 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:55:04,720 epoch 7 - iter 1042/5212 - loss 0.03314505 - time (sec): 45.40 - samples/sec: 1666.32 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:55:27,670 epoch 7 - iter 1563/5212 - loss 0.03234971 - time (sec): 68.35 - samples/sec: 1616.79 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:55:49,778 epoch 7 - iter 2084/5212 - loss 0.03347041 - time (sec): 90.45 - samples/sec: 1635.84 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:56:12,507 epoch 7 - iter 2605/5212 - loss 0.03354482 - time (sec): 113.18 - samples/sec: 1635.45 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:56:34,929 epoch 7 - iter 3126/5212 - loss 0.03288247 - time (sec): 135.60 - samples/sec: 1652.93 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:56:57,939 epoch 7 - iter 3647/5212 - loss 0.03251089 - time (sec): 158.61 - samples/sec: 1644.48 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:57:19,830 epoch 7 - iter 4168/5212 - loss 0.03188274 - time (sec): 180.51 - samples/sec: 1638.75 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:57:41,626 epoch 7 - iter 4689/5212 - loss 0.03190288 - time (sec): 202.30 - samples/sec: 1636.71 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:58:03,906 epoch 7 - iter 5210/5212 - loss 0.03148374 - time (sec): 224.58 - samples/sec: 1633.76 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:58:04,015 ----------------------------------------------------------------------------------------------------
2023-10-25 12:58:04,016 EPOCH 7 done: loss 0.0314 - lr: 0.000010
2023-10-25 12:58:10,396 DEV : loss 0.37830933928489685 - f1-score (micro avg)  0.4183
2023-10-25 12:58:10,422 saving best model
2023-10-25 12:58:10,956 ----------------------------------------------------------------------------------------------------
2023-10-25 12:58:36,539 epoch 8 - iter 521/5212 - loss 0.02379176 - time (sec): 25.58 - samples/sec: 1452.39 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:58:59,230 epoch 8 - iter 1042/5212 - loss 0.02278495 - time (sec): 48.27 - samples/sec: 1565.51 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:59:22,682 epoch 8 - iter 1563/5212 - loss 0.02209281 - time (sec): 71.72 - samples/sec: 1567.66 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:59:45,344 epoch 8 - iter 2084/5212 - loss 0.02158886 - time (sec): 94.38 - samples/sec: 1591.70 - lr: 0.000009 - momentum: 0.000000
2023-10-25 13:00:08,655 epoch 8 - iter 2605/5212 - loss 0.02127616 - time (sec): 117.69 - samples/sec: 1616.77 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:00:31,012 epoch 8 - iter 3126/5212 - loss 0.02227528 - time (sec): 140.05 - samples/sec: 1609.64 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:00:53,123 epoch 8 - iter 3647/5212 - loss 0.02166962 - time (sec): 162.16 - samples/sec: 1597.17 - lr: 0.000008 - momentum: 0.000000
2023-10-25 13:01:16,172 epoch 8 - iter 4168/5212 - loss 0.02167663 - time (sec): 185.21 - samples/sec: 1603.61 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:01:38,925 epoch 8 - iter 4689/5212 - loss 0.02183753 - time (sec): 207.97 - samples/sec: 1601.55 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:02:02,358 epoch 8 - iter 5210/5212 - loss 0.02140802 - time (sec): 231.40 - samples/sec: 1587.37 - lr: 0.000007 - momentum: 0.000000
2023-10-25 13:02:02,447 ----------------------------------------------------------------------------------------------------
2023-10-25 13:02:02,447 EPOCH 8 done: loss 0.0214 - lr: 0.000007
2023-10-25 13:02:08,780 DEV : loss 0.477633535861969 - f1-score (micro avg)  0.3662
2023-10-25 13:02:08,806 ----------------------------------------------------------------------------------------------------
2023-10-25 13:02:32,294 epoch 9 - iter 521/5212 - loss 0.01864109 - time (sec): 23.49 - samples/sec: 1540.21 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:02:54,705 epoch 9 - iter 1042/5212 - loss 0.01626733 - time (sec): 45.90 - samples/sec: 1589.21 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:03:17,344 epoch 9 - iter 1563/5212 - loss 0.01544773 - time (sec): 68.54 - samples/sec: 1598.24 - lr: 0.000006 - momentum: 0.000000
2023-10-25 13:03:40,791 epoch 9 - iter 2084/5212 - loss 0.01503458 - time (sec): 91.98 - samples/sec: 1583.86 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:04:03,675 epoch 9 - iter 2605/5212 - loss 0.01474038 - time (sec): 114.87 - samples/sec: 1601.09 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:04:26,738 epoch 9 - iter 3126/5212 - loss 0.01433621 - time (sec): 137.93 - samples/sec: 1601.10 - lr: 0.000005 - momentum: 0.000000
2023-10-25 13:04:49,324 epoch 9 - iter 3647/5212 - loss 0.01428946 - time (sec): 160.52 - samples/sec: 1612.61 - lr: 0.000004 - momentum: 0.000000
2023-10-25 13:05:11,507 epoch 9 - iter 4168/5212 - loss 0.01464623 - time (sec): 182.70 - samples/sec: 1607.65 - lr: 0.000004 - momentum: 0.000000
2023-10-25 13:05:35,042 epoch 9 - iter 4689/5212 - loss 0.01437046 - time (sec): 206.24 - samples/sec: 1609.02 - lr: 0.000004 - momentum: 0.000000
2023-10-25 13:05:57,365 epoch 9 - iter 5210/5212 - loss 0.01431548 - time (sec): 228.56 - samples/sec: 1606.78 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:05:57,456 ----------------------------------------------------------------------------------------------------
2023-10-25 13:05:57,456 EPOCH 9 done: loss 0.0143 - lr: 0.000003
2023-10-25 13:06:04,012 DEV : loss 0.52719646692276 - f1-score (micro avg)  0.3728
2023-10-25 13:06:04,038 ----------------------------------------------------------------------------------------------------
2023-10-25 13:06:26,903 epoch 10 - iter 521/5212 - loss 0.00541859 - time (sec): 22.86 - samples/sec: 1629.64 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:06:50,062 epoch 10 - iter 1042/5212 - loss 0.00660409 - time (sec): 46.02 - samples/sec: 1566.18 - lr: 0.000003 - momentum: 0.000000
2023-10-25 13:07:12,889 epoch 10 - iter 1563/5212 - loss 0.00863731 - time (sec): 68.85 - samples/sec: 1624.59 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:07:35,660 epoch 10 - iter 2084/5212 - loss 0.00985019 - time (sec): 91.62 - samples/sec: 1632.99 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:07:57,770 epoch 10 - iter 2605/5212 - loss 0.00938135 - time (sec): 113.73 - samples/sec: 1613.63 - lr: 0.000002 - momentum: 0.000000
2023-10-25 13:08:19,806 epoch 10 - iter 3126/5212 - loss 0.00960318 - time (sec): 135.77 - samples/sec: 1620.76 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:08:43,358 epoch 10 - iter 3647/5212 - loss 0.00958806 - time (sec): 159.32 - samples/sec: 1605.72 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:09:05,476 epoch 10 - iter 4168/5212 - loss 0.00967595 - time (sec): 181.44 - samples/sec: 1615.99 - lr: 0.000001 - momentum: 0.000000
2023-10-25 13:09:28,042 epoch 10 - iter 4689/5212 - loss 0.00960296 - time (sec): 204.00 - samples/sec: 1621.97 - lr: 0.000000 - momentum: 0.000000
2023-10-25 13:09:51,237 epoch 10 - iter 5210/5212 - loss 0.00964924 - time (sec): 227.20 - samples/sec: 1616.88 - lr: 0.000000 - momentum: 0.000000
2023-10-25 13:09:51,323 ----------------------------------------------------------------------------------------------------
2023-10-25 13:09:51,323 EPOCH 10 done: loss 0.0096 - lr: 0.000000
2023-10-25 13:09:58,242 DEV : loss 0.46924296021461487 - f1-score (micro avg)  0.3859
2023-10-25 13:09:58,800 ----------------------------------------------------------------------------------------------------
2023-10-25 13:09:58,801 Loading model from best epoch ...
2023-10-25 13:10:00,526 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
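The tag dictionary above uses the BIOES scheme: S- marks single-token entities, while B-/I-/E- mark the beginning, inside, and end of multi-token spans. A minimal decoder for such tag sequences (an illustrative sketch, not Flair's internal span extraction):

```python
def decode_bioes(tags):
    """Extract (label, start, end) entity spans from a BIOES tag sequence (end exclusive)."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":            # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":          # open a multi-token span
            start = i
        elif prefix == "E" and start is not None:  # close the open span
            spans.append((label, start, i + 1))
            start = None
        # "I" simply continues an open span
    return spans

print(decode_bioes(["S-LOC", "O", "B-PER", "I-PER", "E-PER"]))
# → [('LOC', 0, 1), ('PER', 2, 5)]
```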
2023-10-25 13:10:10,661
Results:
- F-score (micro) 0.4912
- F-score (macro) 0.33
- Accuracy 0.3282

By class:
              precision    recall  f1-score   support

         LOC     0.5356    0.6260    0.5773      1214
         PER     0.4094    0.4530    0.4301       808
         ORG     0.3536    0.2805    0.3128       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4715    0.5126    0.4912      2390
   macro avg     0.3246    0.3399    0.3300      2390
weighted avg     0.4627    0.5126    0.4848      2390

2023-10-25 13:10:10,661 ----------------------------------------------------------------------------------------------------
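The averaged rows can be re-derived from the per-class rows as a consistency check. The sketch below recomputes the macro and weighted F1 from the table; micro precision/recall are taken from the log itself, since they require raw TP/FP/FN counts that are not printed:

```python
# (precision, recall, f1, support) per class, copied from the test-set table
by_class = {
    "LOC":       (0.5356, 0.6260, 0.5773, 1214),
    "PER":       (0.4094, 0.4530, 0.4301, 808),
    "ORG":       (0.3536, 0.2805, 0.3128, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}

total_support = sum(s for *_, s in by_class.values())                    # 2390
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)  # unweighted mean
weighted_f1 = sum(f1 * s for _, _, f1, s in by_class.values()) / total_support

# micro F1 from the logged micro precision/recall
p_micro, r_micro = 0.4715, 0.5126
micro_f1 = 2 * p_micro * r_micro / (p_micro + r_micro)

print(f"macro {macro_f1:.4f}  weighted {weighted_f1:.4f}  micro {micro_f1:.4f}")
```

Up to rounding of the printed per-class values, this reproduces the logged 0.33 macro, 0.4848 weighted, and 0.4912 micro F1.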