2023-10-25 10:22:36,342 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,342 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 10:22:36,342 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 Train: 20847 sentences
2023-10-25 10:22:36,343 (train_with_dev=False, train_with_test=False)
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 Training Params:
2023-10-25 10:22:36,343 - learning_rate: "3e-05"
2023-10-25 10:22:36,343 - mini_batch_size: "4"
2023-10-25 10:22:36,343 - max_epochs: "10"
2023-10-25 10:22:36,343 - shuffle: "True"
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 Plugins:
2023-10-25 10:22:36,343 - TensorboardLogger
2023-10-25 10:22:36,343 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
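The LinearScheduler plugin warms the learning rate up over the first 10% of training steps and then decays it linearly to zero, which is visible in the lr column of the iteration lines below (3e-06 rising to 3e-05 during epoch 1, then falling to 0). A minimal stdlib sketch of that shape, assuming per-step updates over 10 epochs of 5212 batches (the function name and step granularity are illustrative, not Flair's internals):

```python
def linear_schedule_lr(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay to zero over the remainder."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # decay phase: scale by the fraction of post-warmup steps remaining
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 5212 * 10                             # 10 epochs x 5212 batches
print(linear_schedule_lr(2606, total))        # mid-epoch 1: 1.5e-05
print(linear_schedule_lr(5212, total))        # end of epoch 1: peak 3e-05
print(linear_schedule_lr(total, total))       # end of training: 0.0
```

With warmup_fraction 0.1 and 10 epochs, the warmup ends exactly at the epoch-1 boundary, matching the log's peak lr of 0.000030 in the "EPOCH 1 done" line.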
2023-10-25 10:22:36,343 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 10:22:36,343 - metric: "('micro avg', 'f1-score')"
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 Computation:
2023-10-25 10:22:36,343 - compute on device: cuda:0
2023-10-25 10:22:36,343 - embedding storage: none
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:36,343 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 10:22:59,027 epoch 1 - iter 521/5212 - loss 1.61447866 - time (sec): 22.68 - samples/sec: 1687.92 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:23:20,822 epoch 1 - iter 1042/5212 - loss 1.02069474 - time (sec): 44.48 - samples/sec: 1659.29 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:23:42,683 epoch 1 - iter 1563/5212 - loss 0.78877305 - time (sec): 66.34 - samples/sec: 1635.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:24:04,201 epoch 1 - iter 2084/5212 - loss 0.66357593 - time (sec): 87.86 - samples/sec: 1628.49 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:24:25,621 epoch 1 - iter 2605/5212 - loss 0.57401952 - time (sec): 109.28 - samples/sec: 1642.20 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:24:47,783 epoch 1 - iter 3126/5212 - loss 0.51521620 - time (sec): 131.44 - samples/sec: 1649.82 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:25:09,910 epoch 1 - iter 3647/5212 - loss 0.47176606 - time (sec): 153.57 - samples/sec: 1659.19 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:25:32,185 epoch 1 - iter 4168/5212 - loss 0.43469966 - time (sec): 175.84 - samples/sec: 1663.47 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:25:54,298 epoch 1 - iter 4689/5212 - loss 0.41124414 - time (sec): 197.95 - samples/sec: 1665.49 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:26:15,961 epoch 1 - iter 5210/5212 - loss 0.38840089 - time (sec): 219.62 - samples/sec: 1672.53 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:26:16,050 ----------------------------------------------------------------------------------------------------
2023-10-25 10:26:16,050 EPOCH 1 done: loss 0.3885 - lr: 0.000030
2023-10-25 10:26:19,692 DEV : loss 0.1315269023180008 - f1-score (micro avg) 0.338
2023-10-25 10:26:19,717 saving best model
2023-10-25 10:26:20,058 ----------------------------------------------------------------------------------------------------
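The per-iteration lines follow a fixed format (epoch, iteration count, running loss, timing, lr), so they can be scraped with the stdlib for plotting loss and lr curves. A minimal sketch; the regex is an assumption based on the lines in this log, not an official Flair format guarantee:

```python
import re

# Matches the progress fields in a Flair-style iteration line.
LINE_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+) - "
    r"loss (?P<loss>[\d.]+) .* lr: (?P<lr>[\d.e-]+)"
)

def parse_iter_line(line):
    """Extract (epoch, iteration, loss, lr) from one progress line, or None."""
    m = LINE_RE.search(line)
    if not m:
        return None
    return (int(m["epoch"]), int(m["iter"]), float(m["loss"]), float(m["lr"]))

sample = ("2023-10-25 10:22:59,027 epoch 1 - iter 521/5212 - loss 1.61447866 "
          "- time (sec): 22.68 - samples/sec: 1687.92 - lr: 0.000003 - momentum: 0.000000")
print(parse_iter_line(sample))  # (1, 521, 1.61447866, 3e-06)
```

Applied over the whole file, this yields one (epoch, iter, loss, lr) tuple per progress line, e.g. for feeding a plotting library.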
2023-10-25 10:26:41,991 epoch 2 - iter 521/5212 - loss 0.19442606 - time (sec): 21.93 - samples/sec: 1605.23 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:27:03,993 epoch 2 - iter 1042/5212 - loss 0.20989508 - time (sec): 43.93 - samples/sec: 1695.15 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:27:26,410 epoch 2 - iter 1563/5212 - loss 0.20130386 - time (sec): 66.35 - samples/sec: 1691.18 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:27:49,049 epoch 2 - iter 2084/5212 - loss 0.20177238 - time (sec): 88.99 - samples/sec: 1660.51 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:28:10,914 epoch 2 - iter 2605/5212 - loss 0.19276261 - time (sec): 110.86 - samples/sec: 1643.86 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:28:32,871 epoch 2 - iter 3126/5212 - loss 0.19211212 - time (sec): 132.81 - samples/sec: 1643.38 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:28:54,039 epoch 2 - iter 3647/5212 - loss 0.18868218 - time (sec): 153.98 - samples/sec: 1648.92 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:29:16,095 epoch 2 - iter 4168/5212 - loss 0.18281211 - time (sec): 176.04 - samples/sec: 1649.63 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:29:38,446 epoch 2 - iter 4689/5212 - loss 0.17632028 - time (sec): 198.39 - samples/sec: 1660.15 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:30:00,578 epoch 2 - iter 5210/5212 - loss 0.17379666 - time (sec): 220.52 - samples/sec: 1666.08 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:30:00,662 ----------------------------------------------------------------------------------------------------
2023-10-25 10:30:00,662 EPOCH 2 done: loss 0.1738 - lr: 0.000027
2023-10-25 10:30:07,487 DEV : loss 0.19129540026187897 - f1-score (micro avg) 0.299
2023-10-25 10:30:07,512 ----------------------------------------------------------------------------------------------------
2023-10-25 10:30:28,749 epoch 3 - iter 521/5212 - loss 0.15139998 - time (sec): 21.24 - samples/sec: 1684.17 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:30:50,722 epoch 3 - iter 1042/5212 - loss 0.12942618 - time (sec): 43.21 - samples/sec: 1682.82 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:31:12,787 epoch 3 - iter 1563/5212 - loss 0.12828827 - time (sec): 65.27 - samples/sec: 1648.63 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:31:34,671 epoch 3 - iter 2084/5212 - loss 0.12529659 - time (sec): 87.16 - samples/sec: 1660.13 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:31:56,677 epoch 3 - iter 2605/5212 - loss 0.12545579 - time (sec): 109.16 - samples/sec: 1667.85 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:32:18,468 epoch 3 - iter 3126/5212 - loss 0.12346099 - time (sec): 130.96 - samples/sec: 1666.98 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:32:40,187 epoch 3 - iter 3647/5212 - loss 0.12100870 - time (sec): 152.67 - samples/sec: 1690.45 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:33:02,121 epoch 3 - iter 4168/5212 - loss 0.12003978 - time (sec): 174.61 - samples/sec: 1678.46 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:33:24,278 epoch 3 - iter 4689/5212 - loss 0.11896548 - time (sec): 196.77 - samples/sec: 1689.13 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:33:46,125 epoch 3 - iter 5210/5212 - loss 0.11808525 - time (sec): 218.61 - samples/sec: 1679.99 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:33:46,212 ----------------------------------------------------------------------------------------------------
2023-10-25 10:33:46,212 EPOCH 3 done: loss 0.1181 - lr: 0.000023
2023-10-25 10:33:53,030 DEV : loss 0.25125402212142944 - f1-score (micro avg) 0.3229
2023-10-25 10:33:53,055 ----------------------------------------------------------------------------------------------------
2023-10-25 10:34:15,097 epoch 4 - iter 521/5212 - loss 0.09570163 - time (sec): 22.04 - samples/sec: 1627.38 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:34:36,749 epoch 4 - iter 1042/5212 - loss 0.08527943 - time (sec): 43.69 - samples/sec: 1680.81 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:34:58,317 epoch 4 - iter 1563/5212 - loss 0.08441491 - time (sec): 65.26 - samples/sec: 1688.40 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:35:20,529 epoch 4 - iter 2084/5212 - loss 0.08605013 - time (sec): 87.47 - samples/sec: 1687.17 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:35:42,353 epoch 4 - iter 2605/5212 - loss 0.08852468 - time (sec): 109.30 - samples/sec: 1674.65 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:36:04,655 epoch 4 - iter 3126/5212 - loss 0.08558254 - time (sec): 131.60 - samples/sec: 1679.32 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:36:26,662 epoch 4 - iter 3647/5212 - loss 0.08392507 - time (sec): 153.61 - samples/sec: 1679.85 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:36:48,353 epoch 4 - iter 4168/5212 - loss 0.08560380 - time (sec): 175.30 - samples/sec: 1671.24 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:37:09,922 epoch 4 - iter 4689/5212 - loss 0.08355064 - time (sec): 196.87 - samples/sec: 1675.28 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:37:31,917 epoch 4 - iter 5210/5212 - loss 0.08496389 - time (sec): 218.86 - samples/sec: 1678.29 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:37:31,996 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:31,996 EPOCH 4 done: loss 0.0851 - lr: 0.000020
2023-10-25 10:37:38,787 DEV : loss 0.21648848056793213 - f1-score (micro avg) 0.0402
2023-10-25 10:37:38,811 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:00,909 epoch 5 - iter 521/5212 - loss 0.08430859 - time (sec): 22.10 - samples/sec: 1697.57 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:38:22,569 epoch 5 - iter 1042/5212 - loss 0.09614661 - time (sec): 43.76 - samples/sec: 1657.69 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:38:44,278 epoch 5 - iter 1563/5212 - loss 0.08803141 - time (sec): 65.47 - samples/sec: 1695.49 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:39:05,741 epoch 5 - iter 2084/5212 - loss 0.09024989 - time (sec): 86.93 - samples/sec: 1718.28 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:39:26,984 epoch 5 - iter 2605/5212 - loss 0.08998709 - time (sec): 108.17 - samples/sec: 1737.92 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:39:48,802 epoch 5 - iter 3126/5212 - loss 0.09511305 - time (sec): 129.99 - samples/sec: 1722.51 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:40:10,223 epoch 5 - iter 3647/5212 - loss 0.10064505 - time (sec): 151.41 - samples/sec: 1714.33 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:40:32,735 epoch 5 - iter 4168/5212 - loss 0.11007703 - time (sec): 173.92 - samples/sec: 1713.99 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:40:54,594 epoch 5 - iter 4689/5212 - loss 0.11742696 - time (sec): 195.78 - samples/sec: 1695.78 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:15,969 epoch 5 - iter 5210/5212 - loss 0.11927343 - time (sec): 217.16 - samples/sec: 1691.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:16,047 ----------------------------------------------------------------------------------------------------
2023-10-25 10:41:16,047 EPOCH 5 done: loss 0.1193 - lr: 0.000017
2023-10-25 10:41:22,258 DEV : loss 0.28464657068252563 - f1-score (micro avg) 0.3233
2023-10-25 10:41:22,284 ----------------------------------------------------------------------------------------------------
2023-10-25 10:41:43,461 epoch 6 - iter 521/5212 - loss 0.15251778 - time (sec): 21.18 - samples/sec: 1694.32 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:42:05,196 epoch 6 - iter 1042/5212 - loss 0.17466293 - time (sec): 42.91 - samples/sec: 1739.03 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:42:26,184 epoch 6 - iter 1563/5212 - loss 0.16768695 - time (sec): 63.90 - samples/sec: 1720.33 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:42:48,915 epoch 6 - iter 2084/5212 - loss 0.15751685 - time (sec): 86.63 - samples/sec: 1712.05 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:43:10,685 epoch 6 - iter 2605/5212 - loss 0.15668536 - time (sec): 108.40 - samples/sec: 1716.53 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:43:32,721 epoch 6 - iter 3126/5212 - loss 0.15637228 - time (sec): 130.44 - samples/sec: 1705.28 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:43:54,767 epoch 6 - iter 3647/5212 - loss 0.16010223 - time (sec): 152.48 - samples/sec: 1720.32 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:44:16,760 epoch 6 - iter 4168/5212 - loss 0.15955165 - time (sec): 174.48 - samples/sec: 1716.33 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:44:38,310 epoch 6 - iter 4689/5212 - loss 0.15711162 - time (sec): 196.03 - samples/sec: 1696.43 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:45:00,282 epoch 6 - iter 5210/5212 - loss 0.15589747 - time (sec): 218.00 - samples/sec: 1684.89 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:45:00,375 ----------------------------------------------------------------------------------------------------
2023-10-25 10:45:00,376 EPOCH 6 done: loss 0.1559 - lr: 0.000013
2023-10-25 10:45:06,567 DEV : loss 0.24046669900417328 - f1-score (micro avg) 0.2108
2023-10-25 10:45:06,593 ----------------------------------------------------------------------------------------------------
2023-10-25 10:45:29,331 epoch 7 - iter 521/5212 - loss 0.17391613 - time (sec): 22.74 - samples/sec: 1815.46 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:45:51,149 epoch 7 - iter 1042/5212 - loss 0.21024836 - time (sec): 44.56 - samples/sec: 1683.46 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:46:12,402 epoch 7 - iter 1563/5212 - loss 0.23113253 - time (sec): 65.81 - samples/sec: 1659.39 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:46:34,469 epoch 7 - iter 2084/5212 - loss 0.24466977 - time (sec): 87.87 - samples/sec: 1660.72 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:46:56,887 epoch 7 - iter 2605/5212 - loss 0.23975383 - time (sec): 110.29 - samples/sec: 1673.45 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:47:19,324 epoch 7 - iter 3126/5212 - loss 0.24372915 - time (sec): 132.73 - samples/sec: 1663.08 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:47:41,279 epoch 7 - iter 3647/5212 - loss 0.24695018 - time (sec): 154.69 - samples/sec: 1656.74 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:48:03,789 epoch 7 - iter 4168/5212 - loss 0.24152012 - time (sec): 177.20 - samples/sec: 1646.54 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:48:25,392 epoch 7 - iter 4689/5212 - loss 0.24270640 - time (sec): 198.80 - samples/sec: 1658.89 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:48:47,664 epoch 7 - iter 5210/5212 - loss 0.24062534 - time (sec): 221.07 - samples/sec: 1658.06 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:48:47,811 ----------------------------------------------------------------------------------------------------
2023-10-25 10:48:47,812 EPOCH 7 done: loss 0.2401 - lr: 0.000010
2023-10-25 10:48:53,966 DEV : loss 0.20865921676158905 - f1-score (micro avg) 0.0308
2023-10-25 10:48:53,990 ----------------------------------------------------------------------------------------------------
2023-10-25 10:49:15,822 epoch 8 - iter 521/5212 - loss 0.23245768 - time (sec): 21.83 - samples/sec: 1834.01 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:49:37,992 epoch 8 - iter 1042/5212 - loss 0.23641302 - time (sec): 44.00 - samples/sec: 1785.00 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:50:00,161 epoch 8 - iter 1563/5212 - loss 0.23574566 - time (sec): 66.17 - samples/sec: 1728.62 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:50:22,411 epoch 8 - iter 2084/5212 - loss 0.22911715 - time (sec): 88.42 - samples/sec: 1731.03 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:50:44,471 epoch 8 - iter 2605/5212 - loss 0.22268025 - time (sec): 110.48 - samples/sec: 1721.53 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:51:06,401 epoch 8 - iter 3126/5212 - loss 0.21603415 - time (sec): 132.41 - samples/sec: 1700.33 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:51:28,177 epoch 8 - iter 3647/5212 - loss 0.20921488 - time (sec): 154.19 - samples/sec: 1681.39 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:51:50,252 epoch 8 - iter 4168/5212 - loss 0.20396150 - time (sec): 176.26 - samples/sec: 1683.18 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:52:12,051 epoch 8 - iter 4689/5212 - loss 0.20213364 - time (sec): 198.06 - samples/sec: 1679.66 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:52:34,201 epoch 8 - iter 5210/5212 - loss 0.20039284 - time (sec): 220.21 - samples/sec: 1668.12 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:52:34,288 ----------------------------------------------------------------------------------------------------
2023-10-25 10:52:34,288 EPOCH 8 done: loss 0.2003 - lr: 0.000007
2023-10-25 10:52:41,093 DEV : loss 0.23395822942256927 - f1-score (micro avg) 0.0138
2023-10-25 10:52:41,119 ----------------------------------------------------------------------------------------------------
2023-10-25 10:53:02,952 epoch 9 - iter 521/5212 - loss 0.16081698 - time (sec): 21.83 - samples/sec: 1680.72 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:53:25,133 epoch 9 - iter 1042/5212 - loss 0.16471685 - time (sec): 44.01 - samples/sec: 1651.18 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:53:47,263 epoch 9 - iter 1563/5212 - loss 0.16730970 - time (sec): 66.14 - samples/sec: 1670.53 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:54:09,172 epoch 9 - iter 2084/5212 - loss 0.16342211 - time (sec): 88.05 - samples/sec: 1694.10 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:54:31,171 epoch 9 - iter 2605/5212 - loss 0.16742827 - time (sec): 110.05 - samples/sec: 1680.66 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:54:53,537 epoch 9 - iter 3126/5212 - loss 0.17674737 - time (sec): 132.42 - samples/sec: 1673.25 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:55:15,319 epoch 9 - iter 3647/5212 - loss 0.17888003 - time (sec): 154.20 - samples/sec: 1669.63 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:55:37,002 epoch 9 - iter 4168/5212 - loss 0.18233322 - time (sec): 175.88 - samples/sec: 1669.70 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:55:58,362 epoch 9 - iter 4689/5212 - loss 0.18579525 - time (sec): 197.24 - samples/sec: 1673.91 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:56:20,521 epoch 9 - iter 5210/5212 - loss 0.18966184 - time (sec): 219.40 - samples/sec: 1674.47 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:56:20,602 ----------------------------------------------------------------------------------------------------
2023-10-25 10:56:20,603 EPOCH 9 done: loss 0.1897 - lr: 0.000003
2023-10-25 10:56:27,387 DEV : loss 0.20925834774971008 - f1-score (micro avg) 0.0222
2023-10-25 10:56:27,413 ----------------------------------------------------------------------------------------------------
2023-10-25 10:56:49,510 epoch 10 - iter 521/5212 - loss 0.19152088 - time (sec): 22.10 - samples/sec: 1732.35 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:57:11,489 epoch 10 - iter 1042/5212 - loss 0.18615729 - time (sec): 44.07 - samples/sec: 1711.53 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:57:33,361 epoch 10 - iter 1563/5212 - loss 0.19997959 - time (sec): 65.95 - samples/sec: 1702.95 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:57:55,293 epoch 10 - iter 2084/5212 - loss 0.20007022 - time (sec): 87.88 - samples/sec: 1693.73 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:58:17,744 epoch 10 - iter 2605/5212 - loss 0.19988210 - time (sec): 110.33 - samples/sec: 1685.43 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:58:40,396 epoch 10 - iter 3126/5212 - loss 0.19721703 - time (sec): 132.98 - samples/sec: 1678.69 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:59:02,066 epoch 10 - iter 3647/5212 - loss 0.19819379 - time (sec): 154.65 - samples/sec: 1669.18 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:59:23,800 epoch 10 - iter 4168/5212 - loss 0.19797311 - time (sec): 176.39 - samples/sec: 1660.32 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:59:45,529 epoch 10 - iter 4689/5212 - loss 0.19805461 - time (sec): 198.12 - samples/sec: 1662.26 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:00:07,238 epoch 10 - iter 5210/5212 - loss 0.19561517 - time (sec): 219.82 - samples/sec: 1671.14 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:00:07,315 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:07,315 EPOCH 10 done: loss 0.1956 - lr: 0.000000
2023-10-25 11:00:14,076 DEV : loss 0.22553929686546326 - f1-score (micro avg) 0.0198
2023-10-25 11:00:14,482 ----------------------------------------------------------------------------------------------------
2023-10-25 11:00:14,483 Loading model from best epoch ...
2023-10-25 11:00:16,075 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
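The 17-tag dictionary above is the BIOES encoding of the four entity types (LOC, PER, ORG, HumanProd): one O tag plus S-/B-/E-/I- variants per type, which is also why the model's final linear layer has out_features=17. A quick sketch of that expansion in plain Python (illustrative helper, not the Flair API):

```python
def bioes_tags(entity_types):
    """Expand entity types into a BIOES tag list: O plus S/B/E/I per type."""
    tags = ["O"]
    for etype in entity_types:
        # S = single-token span, B = begin, E = end, I = inside
        tags.extend(f"{prefix}-{etype}" for prefix in ("S", "B", "E", "I"))
    return tags

tags = bioes_tags(["LOC", "PER", "ORG", "HumanProd"])
print(len(tags))  # 17, matching out_features of the linear layer
print(tags[:5])   # ['O', 'S-LOC', 'B-LOC', 'E-LOC', 'I-LOC']
```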
2023-10-25 11:00:25,774
Results:
- F-score (micro) 0.4043
- F-score (macro) 0.2454
- Accuracy 0.256

By class:
              precision    recall  f1-score   support

         LOC     0.4951    0.5387    0.5160      1214
         PER     0.3085    0.4097    0.3519       808
         ORG     0.1336    0.0992    0.1138       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3840    0.4268    0.4043      2390
   macro avg     0.2343    0.2619    0.2454      2390
weighted avg     0.3755    0.4268    0.3979      2390

2023-10-25 11:00:25,774 ----------------------------------------------------------------------------------------------------
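The macro and weighted averages in the final table follow directly from the per-class F1 scores and supports: macro is the unweighted mean over classes, weighted is the support-weighted mean. A small check in plain Python, with the values copied from the table above:

```python
# (f1-score, support) per class, copied from the final evaluation table
per_class = {
    "LOC": (0.5160, 1214),
    "PER": (0.3519, 808),
    "ORG": (0.1138, 353),
    "HumanProd": (0.0000, 15),
}

total_support = sum(s for _, s in per_class.values())
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)
weighted_f1 = sum(f1 * s for f1, s in per_class.values()) / total_support

print(macro_f1)     # ~0.2454, matching the macro avg row
print(weighted_f1)  # ~0.3979, matching the weighted avg row
```

The rare HumanProd class (support 15, F1 0.0) drags the macro average well below the micro average, which is dominated by the frequent LOC and PER classes.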