Upload ./training.log with huggingface_hub (commit 2409f46)
2023-10-25 10:22:36,342 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,342 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 10:22:36,342 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 Train: 20847 sentences
2023-10-25 10:22:36,343 (train_with_dev=False, train_with_test=False)
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 Training Params:
2023-10-25 10:22:36,343 - learning_rate: "3e-05"
2023-10-25 10:22:36,343 - mini_batch_size: "4"
2023-10-25 10:22:36,343 - max_epochs: "10"
2023-10-25 10:22:36,343 - shuffle: "True"
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 Plugins:
2023-10-25 10:22:36,343 - TensorboardLogger
2023-10-25 10:22:36,343 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
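The LinearScheduler plugin warms the learning rate up over the first 10% of optimizer steps and then decays it linearly to zero, which is what the lr column in the epoch logs below shows (5212 iterations × 10 epochs = 52120 total steps, peak 3e-05 reached at the end of epoch 1). A minimal sketch of that schedule; the function name is illustrative, not Flair's API:

```python
def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr over warmup_fraction of steps, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 5212 * 10  # iterations per epoch x max_epochs
peak = 3e-05       # learning_rate from the training params above

print(round(linear_warmup_lr(521, total, peak), 6))    # ~3e-06, matches lr at epoch 1, iter 521
print(round(linear_warmup_lr(5210, total, peak), 6))   # ~3e-05, end of warmup / epoch 1
print(round(linear_warmup_lr(total, total, peak), 6))  # 0.0 at the final step
```

The rounded values line up with the lr column logged during epoch 1 and the decay toward 0.000000 by epoch 10.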
2023-10-25 10:22:36,343 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 10:22:36,343 - metric: "('micro avg', 'f1-score')"
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 Computation:
2023-10-25 10:22:36,343 - compute on device: cuda:0
2023-10-25 10:22:36,343 - embedding storage: none
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 10:22:59,027 epoch 1 - iter 521/5212 - loss 1.61447866 - time (sec): 22.68 - samples/sec: 1687.92 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:23:20,822 epoch 1 - iter 1042/5212 - loss 1.02069474 - time (sec): 44.48 - samples/sec: 1659.29 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:23:42,683 epoch 1 - iter 1563/5212 - loss 0.78877305 - time (sec): 66.34 - samples/sec: 1635.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:24:04,201 epoch 1 - iter 2084/5212 - loss 0.66357593 - time (sec): 87.86 - samples/sec: 1628.49 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:24:25,621 epoch 1 - iter 2605/5212 - loss 0.57401952 - time (sec): 109.28 - samples/sec: 1642.20 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:24:47,783 epoch 1 - iter 3126/5212 - loss 0.51521620 - time (sec): 131.44 - samples/sec: 1649.82 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:25:09,910 epoch 1 - iter 3647/5212 - loss 0.47176606 - time (sec): 153.57 - samples/sec: 1659.19 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:25:32,185 epoch 1 - iter 4168/5212 - loss 0.43469966 - time (sec): 175.84 - samples/sec: 1663.47 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:25:54,298 epoch 1 - iter 4689/5212 - loss 0.41124414 - time (sec): 197.95 - samples/sec: 1665.49 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:26:15,961 epoch 1 - iter 5210/5212 - loss 0.38840089 - time (sec): 219.62 - samples/sec: 1672.53 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:26:16,050 ----------------------------------------------------------------------------------------------------
2023-10-25 10:26:16,050 EPOCH 1 done: loss 0.3885 - lr: 0.000030
2023-10-25 10:26:19,692 DEV : loss 0.1315269023180008 - f1-score (micro avg) 0.338
2023-10-25 10:26:19,717 saving best model
2023-10-25 10:26:20,058 ----------------------------------------------------------------------------------------------------
2023-10-25 10:26:41,991 epoch 2 - iter 521/5212 - loss 0.19442606 - time (sec): 21.93 - samples/sec: 1605.23 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:27:03,993 epoch 2 - iter 1042/5212 - loss 0.20989508 - time (sec): 43.93 - samples/sec: 1695.15 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:27:26,410 epoch 2 - iter 1563/5212 - loss 0.20130386 - time (sec): 66.35 - samples/sec: 1691.18 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:27:49,049 epoch 2 - iter 2084/5212 - loss 0.20177238 - time (sec): 88.99 - samples/sec: 1660.51 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:28:10,914 epoch 2 - iter 2605/5212 - loss 0.19276261 - time (sec): 110.86 - samples/sec: 1643.86 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:28:32,871 epoch 2 - iter 3126/5212 - loss 0.19211212 - time (sec): 132.81 - samples/sec: 1643.38 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:28:54,039 epoch 2 - iter 3647/5212 - loss 0.18868218 - time (sec): 153.98 - samples/sec: 1648.92 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:29:16,095 epoch 2 - iter 4168/5212 - loss 0.18281211 - time (sec): 176.04 - samples/sec: 1649.63 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:29:38,446 epoch 2 - iter 4689/5212 - loss 0.17632028 - time (sec): 198.39 - samples/sec: 1660.15 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:30:00,578 epoch 2 - iter 5210/5212 - loss 0.17379666 - time (sec): 220.52 - samples/sec: 1666.08 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:30:00,662 ----------------------------------------------------------------------------------------------------
2023-10-25 10:30:00,662 EPOCH 2 done: loss 0.1738 - lr: 0.000027
2023-10-25 10:30:07,487 DEV : loss 0.19129540026187897 - f1-score (micro avg) 0.299
2023-10-25 10:30:07,512 ----------------------------------------------------------------------------------------------------
2023-10-25 10:30:28,749 epoch 3 - iter 521/5212 - loss 0.15139998 - time (sec): 21.24 - samples/sec: 1684.17 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:30:50,722 epoch 3 - iter 1042/5212 - loss 0.12942618 - time (sec): 43.21 - samples/sec: 1682.82 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:31:12,787 epoch 3 - iter 1563/5212 - loss 0.12828827 - time (sec): 65.27 - samples/sec: 1648.63 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:31:34,671 epoch 3 - iter 2084/5212 - loss 0.12529659 - time (sec): 87.16 - samples/sec: 1660.13 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:31:56,677 epoch 3 - iter 2605/5212 - loss 0.12545579 - time (sec): 109.16 - samples/sec: 1667.85 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:32:18,468 epoch 3 - iter 3126/5212 - loss 0.12346099 - time (sec): 130.96 - samples/sec: 1666.98 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:32:40,187 epoch 3 - iter 3647/5212 - loss 0.12100870 - time (sec): 152.67 - samples/sec: 1690.45 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:33:02,121 epoch 3 - iter 4168/5212 - loss 0.12003978 - time (sec): 174.61 - samples/sec: 1678.46 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:33:24,278 epoch 3 - iter 4689/5212 - loss 0.11896548 - time (sec): 196.77 - samples/sec: 1689.13 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:33:46,125 epoch 3 - iter 5210/5212 - loss 0.11808525 - time (sec): 218.61 - samples/sec: 1679.99 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:33:46,212 ----------------------------------------------------------------------------------------------------
2023-10-25 10:33:46,212 EPOCH 3 done: loss 0.1181 - lr: 0.000023
2023-10-25 10:33:53,030 DEV : loss 0.25125402212142944 - f1-score (micro avg) 0.3229
2023-10-25 10:33:53,055 ----------------------------------------------------------------------------------------------------
2023-10-25 10:34:15,097 epoch 4 - iter 521/5212 - loss 0.09570163 - time (sec): 22.04 - samples/sec: 1627.38 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:34:36,749 epoch 4 - iter 1042/5212 - loss 0.08527943 - time (sec): 43.69 - samples/sec: 1680.81 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:34:58,317 epoch 4 - iter 1563/5212 - loss 0.08441491 - time (sec): 65.26 - samples/sec: 1688.40 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:35:20,529 epoch 4 - iter 2084/5212 - loss 0.08605013 - time (sec): 87.47 - samples/sec: 1687.17 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:35:42,353 epoch 4 - iter 2605/5212 - loss 0.08852468 - time (sec): 109.30 - samples/sec: 1674.65 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:36:04,655 epoch 4 - iter 3126/5212 - loss 0.08558254 - time (sec): 131.60 - samples/sec: 1679.32 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:36:26,662 epoch 4 - iter 3647/5212 - loss 0.08392507 - time (sec): 153.61 - samples/sec: 1679.85 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:36:48,353 epoch 4 - iter 4168/5212 - loss 0.08560380 - time (sec): 175.30 - samples/sec: 1671.24 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:37:09,922 epoch 4 - iter 4689/5212 - loss 0.08355064 - time (sec): 196.87 - samples/sec: 1675.28 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:37:31,917 epoch 4 - iter 5210/5212 - loss 0.08496389 - time (sec): 218.86 - samples/sec: 1678.29 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:37:31,996 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:31,996 EPOCH 4 done: loss 0.0851 - lr: 0.000020
2023-10-25 10:37:38,787 DEV : loss 0.21648848056793213 - f1-score (micro avg) 0.0402
2023-10-25 10:37:38,811 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:00,909 epoch 5 - iter 521/5212 - loss 0.08430859 - time (sec): 22.10 - samples/sec: 1697.57 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:38:22,569 epoch 5 - iter 1042/5212 - loss 0.09614661 - time (sec): 43.76 - samples/sec: 1657.69 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:38:44,278 epoch 5 - iter 1563/5212 - loss 0.08803141 - time (sec): 65.47 - samples/sec: 1695.49 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:39:05,741 epoch 5 - iter 2084/5212 - loss 0.09024989 - time (sec): 86.93 - samples/sec: 1718.28 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:39:26,984 epoch 5 - iter 2605/5212 - loss 0.08998709 - time (sec): 108.17 - samples/sec: 1737.92 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:39:48,802 epoch 5 - iter 3126/5212 - loss 0.09511305 - time (sec): 129.99 - samples/sec: 1722.51 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:40:10,223 epoch 5 - iter 3647/5212 - loss 0.10064505 - time (sec): 151.41 - samples/sec: 1714.33 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:40:32,735 epoch 5 - iter 4168/5212 - loss 0.11007703 - time (sec): 173.92 - samples/sec: 1713.99 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:40:54,594 epoch 5 - iter 4689/5212 - loss 0.11742696 - time (sec): 195.78 - samples/sec: 1695.78 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:15,969 epoch 5 - iter 5210/5212 - loss 0.11927343 - time (sec): 217.16 - samples/sec: 1691.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:16,047 ----------------------------------------------------------------------------------------------------
2023-10-25 10:41:16,047 EPOCH 5 done: loss 0.1193 - lr: 0.000017
2023-10-25 10:41:22,258 DEV : loss 0.28464657068252563 - f1-score (micro avg) 0.3233
2023-10-25 10:41:22,284 ----------------------------------------------------------------------------------------------------
2023-10-25 10:41:43,461 epoch 6 - iter 521/5212 - loss 0.15251778 - time (sec): 21.18 - samples/sec: 1694.32 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:42:05,196 epoch 6 - iter 1042/5212 - loss 0.17466293 - time (sec): 42.91 - samples/sec: 1739.03 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:42:26,184 epoch 6 - iter 1563/5212 - loss 0.16768695 - time (sec): 63.90 - samples/sec: 1720.33 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:42:48,915 epoch 6 - iter 2084/5212 - loss 0.15751685 - time (sec): 86.63 - samples/sec: 1712.05 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:43:10,685 epoch 6 - iter 2605/5212 - loss 0.15668536 - time (sec): 108.40 - samples/sec: 1716.53 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:43:32,721 epoch 6 - iter 3126/5212 - loss 0.15637228 - time (sec): 130.44 - samples/sec: 1705.28 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:43:54,767 epoch 6 - iter 3647/5212 - loss 0.16010223 - time (sec): 152.48 - samples/sec: 1720.32 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:44:16,760 epoch 6 - iter 4168/5212 - loss 0.15955165 - time (sec): 174.48 - samples/sec: 1716.33 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:44:38,310 epoch 6 - iter 4689/5212 - loss 0.15711162 - time (sec): 196.03 - samples/sec: 1696.43 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:45:00,282 epoch 6 - iter 5210/5212 - loss 0.15589747 - time (sec): 218.00 - samples/sec: 1684.89 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:45:00,375 ----------------------------------------------------------------------------------------------------
2023-10-25 10:45:00,376 EPOCH 6 done: loss 0.1559 - lr: 0.000013
2023-10-25 10:45:06,567 DEV : loss 0.24046669900417328 - f1-score (micro avg) 0.2108
2023-10-25 10:45:06,593 ----------------------------------------------------------------------------------------------------
2023-10-25 10:45:29,331 epoch 7 - iter 521/5212 - loss 0.17391613 - time (sec): 22.74 - samples/sec: 1815.46 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:45:51,149 epoch 7 - iter 1042/5212 - loss 0.21024836 - time (sec): 44.56 - samples/sec: 1683.46 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:46:12,402 epoch 7 - iter 1563/5212 - loss 0.23113253 - time (sec): 65.81 - samples/sec: 1659.39 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:46:34,469 epoch 7 - iter 2084/5212 - loss 0.24466977 - time (sec): 87.87 - samples/sec: 1660.72 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:46:56,887 epoch 7 - iter 2605/5212 - loss 0.23975383 - time (sec): 110.29 - samples/sec: 1673.45 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:47:19,324 epoch 7 - iter 3126/5212 - loss 0.24372915 - time (sec): 132.73 - samples/sec: 1663.08 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:47:41,279 epoch 7 - iter 3647/5212 - loss 0.24695018 - time (sec): 154.69 - samples/sec: 1656.74 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:48:03,789 epoch 7 - iter 4168/5212 - loss 0.24152012 - time (sec): 177.20 - samples/sec: 1646.54 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:48:25,392 epoch 7 - iter 4689/5212 - loss 0.24270640 - time (sec): 198.80 - samples/sec: 1658.89 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:48:47,664 epoch 7 - iter 5210/5212 - loss 0.24062534 - time (sec): 221.07 - samples/sec: 1658.06 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:48:47,811 ----------------------------------------------------------------------------------------------------
2023-10-25 10:48:47,812 EPOCH 7 done: loss 0.2401 - lr: 0.000010
2023-10-25 10:48:53,966 DEV : loss 0.20865921676158905 - f1-score (micro avg) 0.0308
2023-10-25 10:48:53,990 ----------------------------------------------------------------------------------------------------
2023-10-25 10:49:15,822 epoch 8 - iter 521/5212 - loss 0.23245768 - time (sec): 21.83 - samples/sec: 1834.01 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:49:37,992 epoch 8 - iter 1042/5212 - loss 0.23641302 - time (sec): 44.00 - samples/sec: 1785.00 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:50:00,161 epoch 8 - iter 1563/5212 - loss 0.23574566 - time (sec): 66.17 - samples/sec: 1728.62 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:50:22,411 epoch 8 - iter 2084/5212 - loss 0.22911715 - time (sec): 88.42 - samples/sec: 1731.03 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:50:44,471 epoch 8 - iter 2605/5212 - loss 0.22268025 - time (sec): 110.48 - samples/sec: 1721.53 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:51:06,401 epoch 8 - iter 3126/5212 - loss 0.21603415 - time (sec): 132.41 - samples/sec: 1700.33 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:51:28,177 epoch 8 - iter 3647/5212 - loss 0.20921488 - time (sec): 154.19 - samples/sec: 1681.39 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:51:50,252 epoch 8 - iter 4168/5212 - loss 0.20396150 - time (sec): 176.26 - samples/sec: 1683.18 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:52:12,051 epoch 8 - iter 4689/5212 - loss 0.20213364 - time (sec): 198.06 - samples/sec: 1679.66 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:52:34,201 epoch 8 - iter 5210/5212 - loss 0.20039284 - time (sec): 220.21 - samples/sec: 1668.12 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:52:34,288 ----------------------------------------------------------------------------------------------------
2023-10-25 10:52:34,288 EPOCH 8 done: loss 0.2003 - lr: 0.000007
2023-10-25 10:52:41,093 DEV : loss 0.23395822942256927 - f1-score (micro avg) 0.0138
2023-10-25 10:52:41,119 ----------------------------------------------------------------------------------------------------
2023-10-25 10:53:02,952 epoch 9 - iter 521/5212 - loss 0.16081698 - time (sec): 21.83 - samples/sec: 1680.72 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:53:25,133 epoch 9 - iter 1042/5212 - loss 0.16471685 - time (sec): 44.01 - samples/sec: 1651.18 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:53:47,263 epoch 9 - iter 1563/5212 - loss 0.16730970 - time (sec): 66.14 - samples/sec: 1670.53 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:54:09,172 epoch 9 - iter 2084/5212 - loss 0.16342211 - time (sec): 88.05 - samples/sec: 1694.10 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:54:31,171 epoch 9 - iter 2605/5212 - loss 0.16742827 - time (sec): 110.05 - samples/sec: 1680.66 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:54:53,537 epoch 9 - iter 3126/5212 - loss 0.17674737 - time (sec): 132.42 - samples/sec: 1673.25 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:55:15,319 epoch 9 - iter 3647/5212 - loss 0.17888003 - time (sec): 154.20 - samples/sec: 1669.63 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:55:37,002 epoch 9 - iter 4168/5212 - loss 0.18233322 - time (sec): 175.88 - samples/sec: 1669.70 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:55:58,362 epoch 9 - iter 4689/5212 - loss 0.18579525 - time (sec): 197.24 - samples/sec: 1673.91 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:56:20,521 epoch 9 - iter 5210/5212 - loss 0.18966184 - time (sec): 219.40 - samples/sec: 1674.47 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:56:20,602 ----------------------------------------------------------------------------------------------------
2023-10-25 10:56:20,603 EPOCH 9 done: loss 0.1897 - lr: 0.000003
2023-10-25 10:56:27,387 DEV : loss 0.20925834774971008 - f1-score (micro avg) 0.0222
2023-10-25 10:56:27,413 ----------------------------------------------------------------------------------------------------
2023-10-25 10:56:49,510 epoch 10 - iter 521/5212 - loss 0.19152088 - time (sec): 22.10 - samples/sec: 1732.35 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:57:11,489 epoch 10 - iter 1042/5212 - loss 0.18615729 - time (sec): 44.07 - samples/sec: 1711.53 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:57:33,361 epoch 10 - iter 1563/5212 - loss 0.19997959 - time (sec): 65.95 - samples/sec: 1702.95 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:57:55,293 epoch 10 - iter 2084/5212 - loss 0.20007022 - time (sec): 87.88 - samples/sec: 1693.73 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:58:17,744 epoch 10 - iter 2605/5212 - loss 0.19988210 - time (sec): 110.33 - samples/sec: 1685.43 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:58:40,396 epoch 10 - iter 3126/5212 - loss 0.19721703 - time (sec): 132.98 - samples/sec: 1678.69 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:59:02,066 epoch 10 - iter 3647/5212 - loss 0.19819379 - time (sec): 154.65 - samples/sec: 1669.18 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:59:23,800 epoch 10 - iter 4168/5212 - loss 0.19797311 - time (sec): 176.39 - samples/sec: 1660.32 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:59:45,529 epoch 10 - iter 4689/5212 - loss 0.19805461 - time (sec): 198.12 - samples/sec: 1662.26 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:00:07,238 epoch 10 - iter 5210/5212 - loss 0.19561517 - time (sec): 219.82 - samples/sec: 1671.14 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:00:07,315 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:07,315 EPOCH 10 done: loss 0.1956 - lr: 0.000000
2023-10-25 11:00:14,076 DEV : loss 0.22553929686546326 - f1-score (micro avg) 0.0198
2023-10-25 11:00:14,482 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:14,483 Loading model from best epoch ...
2023-10-25 11:00:16,075 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
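The 17-tag dictionary above uses the BIOES scheme (Single, Begin, End, Inside, plus O) over four entity types. A small decoder sketch, assuming well-formed tag sequences, shows how such tags map back to entity spans; the function is illustrative, not part of Flair:

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                # single-token entity
            spans.append((lab, i, i + 1))
        elif prefix == "B":              # entity opens
            start, label = i, lab
        elif prefix == "E" and label == lab and start is not None:
            spans.append((lab, start, i + 1))  # entity closes
            start, label = None, None
    return spans

print(bioes_to_spans(["S-LOC", "O", "B-PER", "I-PER", "E-PER"]))
# → [('LOC', 0, 1), ('PER', 2, 5)]
```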
2023-10-25 11:00:25,774
Results:
- F-score (micro) 0.4043
- F-score (macro) 0.2454
- Accuracy 0.256
By class:
              precision    recall  f1-score   support

         LOC     0.4951    0.5387    0.5160      1214
         PER     0.3085    0.4097    0.3519       808
         ORG     0.1336    0.0992    0.1138       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3840    0.4268    0.4043      2390
   macro avg     0.2343    0.2619    0.2454      2390
weighted avg     0.3755    0.4268    0.3979      2390
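The summary scores follow directly from the table: micro F1 is the harmonic mean of the micro-averaged precision and recall, while macro F1 is the unweighted mean of the per-class F1 scores. A quick check against the numbers above:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# micro avg row: precision 0.3840, recall 0.4268
print(round(f1(0.3840, 0.4268), 4))  # → 0.4043, the reported micro F-score

# macro F1 = mean of per-class f1-scores (LOC, PER, ORG, HumanProd)
per_class_f1 = [0.5160, 0.3519, 0.1138, 0.0000]
print(round(sum(per_class_f1) / len(per_class_f1), 4))  # → 0.2454
```

Note that macro F1 is not the harmonic mean of the macro-averaged precision and recall; it is the average of per-class F1 values, which is why both computations above are needed.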
2023-10-25 11:00:25,774 ----------------------------------------------------------------------------------------------------