2023-10-15 17:46:21,046 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,047 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-15 17:46:21,047 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,047 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 17:46:21,047 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,047 Train: 20847 sentences
2023-10-15 17:46:21,047 (train_with_dev=False, train_with_test=False)
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,048 Training Params:
2023-10-15 17:46:21,048 - learning_rate: "3e-05"
2023-10-15 17:46:21,048 - mini_batch_size: "4"
2023-10-15 17:46:21,048 - max_epochs: "10"
2023-10-15 17:46:21,048 - shuffle: "True"
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,048 Plugins:
2023-10-15 17:46:21,048 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
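The LinearScheduler plugin with warmup_fraction 0.1 ramps the learning rate linearly from 0 to the peak 3e-05 over the first 10% of all optimizer steps, then decays it linearly back to 0. With 5212 mini-batches per epoch and 10 epochs, warmup covers exactly epoch 1, which matches the per-iteration lr values logged below (rising to 0.000030 at the end of epoch 1, decaying afterwards). A minimal plain-Python sketch of that shape (a hypothetical helper, not Flair's actual implementation):

```python
def linear_schedule_lr(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0.

    Hypothetical helper mirroring the LinearScheduler plugin's shape;
    not Flair's internal code.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup phase: scale up proportionally to the step count
        return peak_lr * step / max(1, warmup_steps)
    # decay phase: scale down over the remaining steps
    remaining = total_steps - warmup_steps
    return peak_lr * max(0.0, (total_steps - step) / max(1, remaining))

# 5212 mini-batches per epoch x 10 epochs = 52120 total steps
total = 5212 * 10
lr_after_warmup = linear_schedule_lr(5212, total)  # peak lr, end of epoch 1
```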
2023-10-15 17:46:21,048 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 17:46:21,048 - metric: "('micro avg', 'f1-score')"
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,048 Computation:
2023-10-15 17:46:21,048 - compute on device: cuda:0
2023-10-15 17:46:21,048 - embedding storage: none
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,048 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:46,375 epoch 1 - iter 521/5212 - loss 1.50097394 - time (sec): 25.33 - samples/sec: 1383.16 - lr: 0.000003 - momentum: 0.000000
2023-10-15 17:47:11,339 epoch 1 - iter 1042/5212 - loss 1.01650844 - time (sec): 50.29 - samples/sec: 1411.09 - lr: 0.000006 - momentum: 0.000000
2023-10-15 17:47:37,066 epoch 1 - iter 1563/5212 - loss 0.77214666 - time (sec): 76.02 - samples/sec: 1423.69 - lr: 0.000009 - momentum: 0.000000
2023-10-15 17:48:02,847 epoch 1 - iter 2084/5212 - loss 0.64040683 - time (sec): 101.80 - samples/sec: 1431.64 - lr: 0.000012 - momentum: 0.000000
2023-10-15 17:48:28,551 epoch 1 - iter 2605/5212 - loss 0.56192130 - time (sec): 127.50 - samples/sec: 1434.16 - lr: 0.000015 - momentum: 0.000000
2023-10-15 17:48:54,484 epoch 1 - iter 3126/5212 - loss 0.49948839 - time (sec): 153.44 - samples/sec: 1449.35 - lr: 0.000018 - momentum: 0.000000
2023-10-15 17:49:20,094 epoch 1 - iter 3647/5212 - loss 0.46113087 - time (sec): 179.04 - samples/sec: 1446.19 - lr: 0.000021 - momentum: 0.000000
2023-10-15 17:49:45,443 epoch 1 - iter 4168/5212 - loss 0.43061045 - time (sec): 204.39 - samples/sec: 1446.65 - lr: 0.000024 - momentum: 0.000000
2023-10-15 17:50:10,290 epoch 1 - iter 4689/5212 - loss 0.41123436 - time (sec): 229.24 - samples/sec: 1435.64 - lr: 0.000027 - momentum: 0.000000
2023-10-15 17:50:36,050 epoch 1 - iter 5210/5212 - loss 0.38867196 - time (sec): 255.00 - samples/sec: 1440.57 - lr: 0.000030 - momentum: 0.000000
2023-10-15 17:50:36,136 ----------------------------------------------------------------------------------------------------
2023-10-15 17:50:36,137 EPOCH 1 done: loss 0.3887 - lr: 0.000030
2023-10-15 17:50:42,081 DEV : loss 0.20117585361003876 - f1-score (micro avg) 0.2861
2023-10-15 17:50:42,110 saving best model
2023-10-15 17:50:42,473 ----------------------------------------------------------------------------------------------------
2023-10-15 17:51:07,318 epoch 2 - iter 521/5212 - loss 0.22240788 - time (sec): 24.84 - samples/sec: 1352.80 - lr: 0.000030 - momentum: 0.000000
2023-10-15 17:51:32,709 epoch 2 - iter 1042/5212 - loss 0.20503019 - time (sec): 50.24 - samples/sec: 1409.19 - lr: 0.000029 - momentum: 0.000000
2023-10-15 17:51:57,674 epoch 2 - iter 1563/5212 - loss 0.19541278 - time (sec): 75.20 - samples/sec: 1411.83 - lr: 0.000029 - momentum: 0.000000
2023-10-15 17:52:23,171 epoch 2 - iter 2084/5212 - loss 0.18582207 - time (sec): 100.70 - samples/sec: 1423.80 - lr: 0.000029 - momentum: 0.000000
2023-10-15 17:52:48,033 epoch 2 - iter 2605/5212 - loss 0.18632288 - time (sec): 125.56 - samples/sec: 1427.21 - lr: 0.000028 - momentum: 0.000000
2023-10-15 17:53:13,971 epoch 2 - iter 3126/5212 - loss 0.18323477 - time (sec): 151.50 - samples/sec: 1443.16 - lr: 0.000028 - momentum: 0.000000
2023-10-15 17:53:39,204 epoch 2 - iter 3647/5212 - loss 0.18006233 - time (sec): 176.73 - samples/sec: 1439.74 - lr: 0.000028 - momentum: 0.000000
2023-10-15 17:54:04,624 epoch 2 - iter 4168/5212 - loss 0.17990351 - time (sec): 202.15 - samples/sec: 1439.47 - lr: 0.000027 - momentum: 0.000000
2023-10-15 17:54:30,039 epoch 2 - iter 4689/5212 - loss 0.17515115 - time (sec): 227.56 - samples/sec: 1442.75 - lr: 0.000027 - momentum: 0.000000
2023-10-15 17:54:55,742 epoch 2 - iter 5210/5212 - loss 0.17338933 - time (sec): 253.27 - samples/sec: 1448.53 - lr: 0.000027 - momentum: 0.000000
2023-10-15 17:54:55,905 ----------------------------------------------------------------------------------------------------
2023-10-15 17:54:55,905 EPOCH 2 done: loss 0.1735 - lr: 0.000027
2023-10-15 17:55:04,829 DEV : loss 0.17898787558078766 - f1-score (micro avg) 0.3658
2023-10-15 17:55:04,856 saving best model
2023-10-15 17:55:05,299 ----------------------------------------------------------------------------------------------------
2023-10-15 17:55:30,120 epoch 3 - iter 521/5212 - loss 0.12227339 - time (sec): 24.82 - samples/sec: 1351.59 - lr: 0.000026 - momentum: 0.000000
2023-10-15 17:55:55,409 epoch 3 - iter 1042/5212 - loss 0.12395854 - time (sec): 50.11 - samples/sec: 1373.02 - lr: 0.000026 - momentum: 0.000000
2023-10-15 17:56:21,481 epoch 3 - iter 1563/5212 - loss 0.11709392 - time (sec): 76.18 - samples/sec: 1451.09 - lr: 0.000026 - momentum: 0.000000
2023-10-15 17:56:47,235 epoch 3 - iter 2084/5212 - loss 0.11296261 - time (sec): 101.93 - samples/sec: 1450.16 - lr: 0.000025 - momentum: 0.000000
2023-10-15 17:57:12,531 epoch 3 - iter 2605/5212 - loss 0.11412983 - time (sec): 127.23 - samples/sec: 1452.46 - lr: 0.000025 - momentum: 0.000000
2023-10-15 17:57:36,931 epoch 3 - iter 3126/5212 - loss 0.11505627 - time (sec): 151.63 - samples/sec: 1462.07 - lr: 0.000025 - momentum: 0.000000
2023-10-15 17:58:01,267 epoch 3 - iter 3647/5212 - loss 0.11553900 - time (sec): 175.97 - samples/sec: 1465.80 - lr: 0.000024 - momentum: 0.000000
2023-10-15 17:58:26,228 epoch 3 - iter 4168/5212 - loss 0.11759207 - time (sec): 200.93 - samples/sec: 1457.48 - lr: 0.000024 - momentum: 0.000000
2023-10-15 17:58:51,153 epoch 3 - iter 4689/5212 - loss 0.11866732 - time (sec): 225.85 - samples/sec: 1457.93 - lr: 0.000024 - momentum: 0.000000
2023-10-15 17:59:16,200 epoch 3 - iter 5210/5212 - loss 0.11854949 - time (sec): 250.90 - samples/sec: 1464.12 - lr: 0.000023 - momentum: 0.000000
2023-10-15 17:59:16,294 ----------------------------------------------------------------------------------------------------
2023-10-15 17:59:16,294 EPOCH 3 done: loss 0.1185 - lr: 0.000023
2023-10-15 17:59:25,325 DEV : loss 0.24826383590698242 - f1-score (micro avg) 0.3206
2023-10-15 17:59:25,351 ----------------------------------------------------------------------------------------------------
2023-10-15 17:59:50,986 epoch 4 - iter 521/5212 - loss 0.07086836 - time (sec): 25.63 - samples/sec: 1446.52 - lr: 0.000023 - momentum: 0.000000
2023-10-15 18:00:16,603 epoch 4 - iter 1042/5212 - loss 0.07461997 - time (sec): 51.25 - samples/sec: 1467.28 - lr: 0.000023 - momentum: 0.000000
2023-10-15 18:00:42,316 epoch 4 - iter 1563/5212 - loss 0.07589647 - time (sec): 76.96 - samples/sec: 1444.37 - lr: 0.000022 - momentum: 0.000000
2023-10-15 18:01:08,183 epoch 4 - iter 2084/5212 - loss 0.07950376 - time (sec): 102.83 - samples/sec: 1433.87 - lr: 0.000022 - momentum: 0.000000
2023-10-15 18:01:33,328 epoch 4 - iter 2605/5212 - loss 0.07820170 - time (sec): 127.98 - samples/sec: 1437.64 - lr: 0.000022 - momentum: 0.000000
2023-10-15 18:01:59,013 epoch 4 - iter 3126/5212 - loss 0.08102218 - time (sec): 153.66 - samples/sec: 1442.95 - lr: 0.000021 - momentum: 0.000000
2023-10-15 18:02:24,785 epoch 4 - iter 3647/5212 - loss 0.08157332 - time (sec): 179.43 - samples/sec: 1446.50 - lr: 0.000021 - momentum: 0.000000
2023-10-15 18:02:50,143 epoch 4 - iter 4168/5212 - loss 0.08239732 - time (sec): 204.79 - samples/sec: 1440.24 - lr: 0.000021 - momentum: 0.000000
2023-10-15 18:03:15,106 epoch 4 - iter 4689/5212 - loss 0.08200821 - time (sec): 229.75 - samples/sec: 1437.08 - lr: 0.000020 - momentum: 0.000000
2023-10-15 18:03:40,271 epoch 4 - iter 5210/5212 - loss 0.08251544 - time (sec): 254.92 - samples/sec: 1441.21 - lr: 0.000020 - momentum: 0.000000
2023-10-15 18:03:40,369 ----------------------------------------------------------------------------------------------------
2023-10-15 18:03:40,369 EPOCH 4 done: loss 0.0825 - lr: 0.000020
2023-10-15 18:03:48,838 DEV : loss 0.35853323340415955 - f1-score (micro avg) 0.3579
2023-10-15 18:03:48,867 ----------------------------------------------------------------------------------------------------
2023-10-15 18:04:15,944 epoch 5 - iter 521/5212 - loss 0.06074459 - time (sec): 27.07 - samples/sec: 1474.06 - lr: 0.000020 - momentum: 0.000000
2023-10-15 18:04:41,310 epoch 5 - iter 1042/5212 - loss 0.05750324 - time (sec): 52.44 - samples/sec: 1450.08 - lr: 0.000019 - momentum: 0.000000
2023-10-15 18:05:06,545 epoch 5 - iter 1563/5212 - loss 0.06019844 - time (sec): 77.68 - samples/sec: 1440.58 - lr: 0.000019 - momentum: 0.000000
2023-10-15 18:05:31,865 epoch 5 - iter 2084/5212 - loss 0.05797686 - time (sec): 103.00 - samples/sec: 1457.52 - lr: 0.000019 - momentum: 0.000000
2023-10-15 18:05:56,701 epoch 5 - iter 2605/5212 - loss 0.05779824 - time (sec): 127.83 - samples/sec: 1463.83 - lr: 0.000018 - momentum: 0.000000
2023-10-15 18:06:21,368 epoch 5 - iter 3126/5212 - loss 0.05796520 - time (sec): 152.50 - samples/sec: 1458.09 - lr: 0.000018 - momentum: 0.000000
2023-10-15 18:06:46,164 epoch 5 - iter 3647/5212 - loss 0.05749706 - time (sec): 177.30 - samples/sec: 1447.84 - lr: 0.000018 - momentum: 0.000000
2023-10-15 18:07:11,414 epoch 5 - iter 4168/5212 - loss 0.05748009 - time (sec): 202.55 - samples/sec: 1451.08 - lr: 0.000017 - momentum: 0.000000
2023-10-15 18:07:36,488 epoch 5 - iter 4689/5212 - loss 0.05835980 - time (sec): 227.62 - samples/sec: 1453.58 - lr: 0.000017 - momentum: 0.000000
2023-10-15 18:08:02,017 epoch 5 - iter 5210/5212 - loss 0.05917565 - time (sec): 253.15 - samples/sec: 1451.21 - lr: 0.000017 - momentum: 0.000000
2023-10-15 18:08:02,105 ----------------------------------------------------------------------------------------------------
2023-10-15 18:08:02,106 EPOCH 5 done: loss 0.0592 - lr: 0.000017
2023-10-15 18:08:10,370 DEV : loss 0.2954880893230438 - f1-score (micro avg) 0.3915
2023-10-15 18:08:10,397 saving best model
2023-10-15 18:08:10,852 ----------------------------------------------------------------------------------------------------
2023-10-15 18:08:36,370 epoch 6 - iter 521/5212 - loss 0.03566624 - time (sec): 25.51 - samples/sec: 1407.17 - lr: 0.000016 - momentum: 0.000000
2023-10-15 18:09:01,776 epoch 6 - iter 1042/5212 - loss 0.03735819 - time (sec): 50.92 - samples/sec: 1459.40 - lr: 0.000016 - momentum: 0.000000
2023-10-15 18:09:27,672 epoch 6 - iter 1563/5212 - loss 0.04010312 - time (sec): 76.82 - samples/sec: 1461.32 - lr: 0.000016 - momentum: 0.000000
2023-10-15 18:09:53,920 epoch 6 - iter 2084/5212 - loss 0.04272252 - time (sec): 103.06 - samples/sec: 1440.65 - lr: 0.000015 - momentum: 0.000000
2023-10-15 18:10:18,891 epoch 6 - iter 2605/5212 - loss 0.04658724 - time (sec): 128.04 - samples/sec: 1423.31 - lr: 0.000015 - momentum: 0.000000
2023-10-15 18:10:44,769 epoch 6 - iter 3126/5212 - loss 0.04608168 - time (sec): 153.91 - samples/sec: 1436.73 - lr: 0.000015 - momentum: 0.000000
2023-10-15 18:11:09,948 epoch 6 - iter 3647/5212 - loss 0.04595678 - time (sec): 179.09 - samples/sec: 1434.36 - lr: 0.000014 - momentum: 0.000000
2023-10-15 18:11:35,711 epoch 6 - iter 4168/5212 - loss 0.04599344 - time (sec): 204.85 - samples/sec: 1436.63 - lr: 0.000014 - momentum: 0.000000
2023-10-15 18:12:00,745 epoch 6 - iter 4689/5212 - loss 0.04455311 - time (sec): 229.89 - samples/sec: 1439.85 - lr: 0.000014 - momentum: 0.000000
2023-10-15 18:12:26,059 epoch 6 - iter 5210/5212 - loss 0.04410825 - time (sec): 255.20 - samples/sec: 1438.27 - lr: 0.000013 - momentum: 0.000000
2023-10-15 18:12:26,260 ----------------------------------------------------------------------------------------------------
2023-10-15 18:12:26,260 EPOCH 6 done: loss 0.0441 - lr: 0.000013
2023-10-15 18:12:34,571 DEV : loss 0.39353951811790466 - f1-score (micro avg) 0.3728
2023-10-15 18:12:34,601 ----------------------------------------------------------------------------------------------------
2023-10-15 18:12:59,435 epoch 7 - iter 521/5212 - loss 0.03179235 - time (sec): 24.83 - samples/sec: 1364.97 - lr: 0.000013 - momentum: 0.000000
2023-10-15 18:13:24,802 epoch 7 - iter 1042/5212 - loss 0.02701866 - time (sec): 50.20 - samples/sec: 1413.92 - lr: 0.000013 - momentum: 0.000000
2023-10-15 18:13:49,752 epoch 7 - iter 1563/5212 - loss 0.02758616 - time (sec): 75.15 - samples/sec: 1397.11 - lr: 0.000012 - momentum: 0.000000
2023-10-15 18:14:15,112 epoch 7 - iter 2084/5212 - loss 0.02597876 - time (sec): 100.51 - samples/sec: 1423.38 - lr: 0.000012 - momentum: 0.000000
2023-10-15 18:14:40,889 epoch 7 - iter 2605/5212 - loss 0.03022671 - time (sec): 126.29 - samples/sec: 1432.09 - lr: 0.000012 - momentum: 0.000000
2023-10-15 18:15:06,162 epoch 7 - iter 3126/5212 - loss 0.03044416 - time (sec): 151.56 - samples/sec: 1431.88 - lr: 0.000011 - momentum: 0.000000
2023-10-15 18:15:31,480 epoch 7 - iter 3647/5212 - loss 0.03128317 - time (sec): 176.88 - samples/sec: 1442.88 - lr: 0.000011 - momentum: 0.000000
2023-10-15 18:15:57,689 epoch 7 - iter 4168/5212 - loss 0.03231927 - time (sec): 203.09 - samples/sec: 1441.69 - lr: 0.000011 - momentum: 0.000000
2023-10-15 18:16:24,016 epoch 7 - iter 4689/5212 - loss 0.03123865 - time (sec): 229.41 - samples/sec: 1435.67 - lr: 0.000010 - momentum: 0.000000
2023-10-15 18:16:49,654 epoch 7 - iter 5210/5212 - loss 0.03238048 - time (sec): 255.05 - samples/sec: 1440.47 - lr: 0.000010 - momentum: 0.000000
2023-10-15 18:16:49,745 ----------------------------------------------------------------------------------------------------
2023-10-15 18:16:49,745 EPOCH 7 done: loss 0.0324 - lr: 0.000010
2023-10-15 18:16:58,188 DEV : loss 0.4698057472705841 - f1-score (micro avg) 0.3445
2023-10-15 18:16:58,221 ----------------------------------------------------------------------------------------------------
2023-10-15 18:17:24,331 epoch 8 - iter 521/5212 - loss 0.02118882 - time (sec): 26.11 - samples/sec: 1507.33 - lr: 0.000010 - momentum: 0.000000
2023-10-15 18:17:50,035 epoch 8 - iter 1042/5212 - loss 0.02174828 - time (sec): 51.81 - samples/sec: 1501.03 - lr: 0.000009 - momentum: 0.000000
2023-10-15 18:18:15,674 epoch 8 - iter 1563/5212 - loss 0.02240042 - time (sec): 77.45 - samples/sec: 1484.23 - lr: 0.000009 - momentum: 0.000000
2023-10-15 18:18:40,960 epoch 8 - iter 2084/5212 - loss 0.02307800 - time (sec): 102.74 - samples/sec: 1458.93 - lr: 0.000009 - momentum: 0.000000
2023-10-15 18:19:06,094 epoch 8 - iter 2605/5212 - loss 0.02270835 - time (sec): 127.87 - samples/sec: 1448.41 - lr: 0.000008 - momentum: 0.000000
2023-10-15 18:19:31,710 epoch 8 - iter 3126/5212 - loss 0.02207908 - time (sec): 153.49 - samples/sec: 1443.14 - lr: 0.000008 - momentum: 0.000000
2023-10-15 18:19:57,514 epoch 8 - iter 3647/5212 - loss 0.02166312 - time (sec): 179.29 - samples/sec: 1447.94 - lr: 0.000008 - momentum: 0.000000
2023-10-15 18:20:22,251 epoch 8 - iter 4168/5212 - loss 0.02233639 - time (sec): 204.03 - samples/sec: 1441.81 - lr: 0.000007 - momentum: 0.000000
2023-10-15 18:20:47,299 epoch 8 - iter 4689/5212 - loss 0.02246766 - time (sec): 229.08 - samples/sec: 1441.54 - lr: 0.000007 - momentum: 0.000000
2023-10-15 18:21:11,946 epoch 8 - iter 5210/5212 - loss 0.02273107 - time (sec): 253.72 - samples/sec: 1446.48 - lr: 0.000007 - momentum: 0.000000
2023-10-15 18:21:12,056 ----------------------------------------------------------------------------------------------------
2023-10-15 18:21:12,056 EPOCH 8 done: loss 0.0227 - lr: 0.000007
2023-10-15 18:21:21,123 DEV : loss 0.48737385869026184 - f1-score (micro avg) 0.3614
2023-10-15 18:21:21,155 ----------------------------------------------------------------------------------------------------
2023-10-15 18:21:45,925 epoch 9 - iter 521/5212 - loss 0.01657988 - time (sec): 24.77 - samples/sec: 1436.79 - lr: 0.000006 - momentum: 0.000000
2023-10-15 18:22:11,071 epoch 9 - iter 1042/5212 - loss 0.01825306 - time (sec): 49.92 - samples/sec: 1435.80 - lr: 0.000006 - momentum: 0.000000
2023-10-15 18:22:35,983 epoch 9 - iter 1563/5212 - loss 0.01537751 - time (sec): 74.83 - samples/sec: 1446.59 - lr: 0.000006 - momentum: 0.000000
2023-10-15 18:23:01,098 epoch 9 - iter 2084/5212 - loss 0.01556559 - time (sec): 99.94 - samples/sec: 1448.93 - lr: 0.000005 - momentum: 0.000000
2023-10-15 18:23:26,191 epoch 9 - iter 2605/5212 - loss 0.01597811 - time (sec): 125.04 - samples/sec: 1441.42 - lr: 0.000005 - momentum: 0.000000
2023-10-15 18:23:52,046 epoch 9 - iter 3126/5212 - loss 0.01561272 - time (sec): 150.89 - samples/sec: 1451.99 - lr: 0.000005 - momentum: 0.000000
2023-10-15 18:24:16,900 epoch 9 - iter 3647/5212 - loss 0.01574407 - time (sec): 175.74 - samples/sec: 1453.65 - lr: 0.000004 - momentum: 0.000000
2023-10-15 18:24:42,461 epoch 9 - iter 4168/5212 - loss 0.01612832 - time (sec): 201.31 - samples/sec: 1448.57 - lr: 0.000004 - momentum: 0.000000
2023-10-15 18:25:08,164 epoch 9 - iter 4689/5212 - loss 0.01585164 - time (sec): 227.01 - samples/sec: 1452.94 - lr: 0.000004 - momentum: 0.000000
2023-10-15 18:25:33,956 epoch 9 - iter 5210/5212 - loss 0.01523723 - time (sec): 252.80 - samples/sec: 1452.97 - lr: 0.000003 - momentum: 0.000000
2023-10-15 18:25:34,044 ----------------------------------------------------------------------------------------------------
2023-10-15 18:25:34,044 EPOCH 9 done: loss 0.0152 - lr: 0.000003
2023-10-15 18:25:43,065 DEV : loss 0.5062488317489624 - f1-score (micro avg) 0.3655
2023-10-15 18:25:43,093 ----------------------------------------------------------------------------------------------------
2023-10-15 18:26:08,163 epoch 10 - iter 521/5212 - loss 0.00931166 - time (sec): 25.07 - samples/sec: 1432.29 - lr: 0.000003 - momentum: 0.000000
2023-10-15 18:26:33,323 epoch 10 - iter 1042/5212 - loss 0.01081890 - time (sec): 50.23 - samples/sec: 1458.64 - lr: 0.000003 - momentum: 0.000000
2023-10-15 18:26:58,431 epoch 10 - iter 1563/5212 - loss 0.01002542 - time (sec): 75.34 - samples/sec: 1467.77 - lr: 0.000002 - momentum: 0.000000
2023-10-15 18:27:24,060 epoch 10 - iter 2084/5212 - loss 0.00973675 - time (sec): 100.97 - samples/sec: 1456.67 - lr: 0.000002 - momentum: 0.000000
2023-10-15 18:27:49,708 epoch 10 - iter 2605/5212 - loss 0.01027887 - time (sec): 126.61 - samples/sec: 1466.36 - lr: 0.000002 - momentum: 0.000000
2023-10-15 18:28:15,215 epoch 10 - iter 3126/5212 - loss 0.01040973 - time (sec): 152.12 - samples/sec: 1460.95 - lr: 0.000001 - momentum: 0.000000
2023-10-15 18:28:40,464 epoch 10 - iter 3647/5212 - loss 0.01009036 - time (sec): 177.37 - samples/sec: 1459.94 - lr: 0.000001 - momentum: 0.000000
2023-10-15 18:29:05,142 epoch 10 - iter 4168/5212 - loss 0.01032693 - time (sec): 202.05 - samples/sec: 1454.33 - lr: 0.000001 - momentum: 0.000000
2023-10-15 18:29:30,640 epoch 10 - iter 4689/5212 - loss 0.01050964 - time (sec): 227.55 - samples/sec: 1448.72 - lr: 0.000000 - momentum: 0.000000
2023-10-15 18:29:55,882 epoch 10 - iter 5210/5212 - loss 0.01022552 - time (sec): 252.79 - samples/sec: 1452.20 - lr: 0.000000 - momentum: 0.000000
2023-10-15 18:29:56,017 ----------------------------------------------------------------------------------------------------
2023-10-15 18:29:56,017 EPOCH 10 done: loss 0.0102 - lr: 0.000000
2023-10-15 18:30:05,188 DEV : loss 0.5002045631408691 - f1-score (micro avg) 0.374
2023-10-15 18:30:05,584 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:05,585 Loading model from best epoch ...
2023-10-15 18:30:07,219 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
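The 17-tag dictionary printed above is the BIOES encoding of the four entity types: the `O` tag plus S(ingle)/B(egin)/E(nd)/I(nside) variants for each of LOC, PER, ORG, and HumanProd. A quick plain-Python check of that enumeration:

```python
# BIOES label space: O plus S/B/E/I per entity type -> 1 + 4*4 = 17 tags
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]
assert len(tags) == 17  # matches "Dictionary with 17 tags" in the log
```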
2023-10-15 18:30:22,520
Results:
- F-score (micro) 0.4665
- F-score (macro) 0.3023
- Accuracy 0.3084
By class:
              precision    recall  f1-score   support

         LOC     0.5410    0.5865    0.5628      1214
         PER     0.4010    0.4035    0.4022       808
         ORG     0.3038    0.2040    0.2441       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4686    0.4644    0.4665      2390
   macro avg     0.3115    0.2985    0.3023      2390
weighted avg     0.4553    0.4644    0.4579      2390
2023-10-15 18:30:22,520 ----------------------------------------------------------------------------------------------------
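The summary rows follow directly from the per-class numbers: micro F1 is the harmonic mean of the micro-averaged precision and recall, and macro F1 is the unweighted mean of the per-class F1 scores. Recomputing both from the table's (rounded) values in plain Python:

```python
# micro avg: harmonic mean of micro precision and recall
p, r = 0.4686, 0.4644
micro_f1 = 2 * p * r / (p + r)
assert round(micro_f1, 4) == 0.4665  # matches "F-score (micro)"

# macro avg: unweighted mean of the per-class F1 scores
class_f1 = [0.5628, 0.4022, 0.2441, 0.0000]  # LOC, PER, ORG, HumanProd
macro_f1 = sum(class_f1) / len(class_f1)
assert round(macro_f1, 4) == 0.3023  # matches "F-score (macro)"
```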