2023-10-15 17:46:21,046 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,047 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 17:46:21,047 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,047 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 17:46:21,047 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,047 Train: 20847 sentences
2023-10-15 17:46:21,047 (train_with_dev=False, train_with_test=False)
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,048 Training Params:
2023-10-15 17:46:21,048 - learning_rate: "3e-05"
2023-10-15 17:46:21,048 - mini_batch_size: "4"
2023-10-15 17:46:21,048 - max_epochs: "10"
2023-10-15 17:46:21,048 - shuffle: "True"
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,048 Plugins:
2023-10-15 17:46:21,048 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,048 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 17:46:21,048 - metric: "('micro avg', 'f1-score')"
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,048 Computation:
2023-10-15 17:46:21,048 - compute on device: cuda:0
2023-10-15 17:46:21,048 - embedding storage: none
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,048 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:21,048 ----------------------------------------------------------------------------------------------------
2023-10-15 17:46:46,375 epoch 1 - iter 521/5212 - loss 1.50097394 - time (sec): 25.33 - samples/sec: 1383.16 - lr: 0.000003 - momentum: 0.000000
2023-10-15 17:47:11,339 epoch 1 - iter 1042/5212 - loss 1.01650844 - time (sec): 50.29 - samples/sec: 1411.09 - lr: 0.000006 - momentum: 0.000000
2023-10-15 17:47:37,066 epoch 1 - iter 1563/5212 - loss 0.77214666 - time (sec): 76.02 - samples/sec: 1423.69 - lr: 0.000009 - momentum: 0.000000
2023-10-15 17:48:02,847 epoch 1 - iter 2084/5212 - loss 0.64040683 - time (sec): 101.80 - samples/sec: 1431.64 - lr: 0.000012 - momentum: 0.000000
2023-10-15 17:48:28,551 epoch 1 - iter 2605/5212 - loss 0.56192130 - time (sec): 127.50 - samples/sec: 1434.16 - lr: 0.000015 - momentum: 0.000000
2023-10-15 17:48:54,484 epoch 1 - iter 3126/5212 - loss 0.49948839 - time (sec): 153.44 - samples/sec: 1449.35 - lr: 0.000018 - momentum: 0.000000
2023-10-15 17:49:20,094 epoch 1 - iter 3647/5212 - loss 0.46113087 - time (sec): 179.04 - samples/sec: 1446.19 - lr: 0.000021 - momentum: 0.000000
2023-10-15 17:49:45,443 epoch 1 - iter 4168/5212 - loss 0.43061045 - time (sec): 204.39 - samples/sec: 1446.65 - lr: 0.000024 - momentum: 0.000000
2023-10-15 17:50:10,290 epoch 1 - iter 4689/5212 - loss 0.41123436 - time (sec): 229.24 - samples/sec: 1435.64 - lr: 0.000027 - momentum: 0.000000
2023-10-15 17:50:36,050 epoch 1 - iter 5210/5212 - loss 0.38867196 - time (sec): 255.00 - samples/sec: 1440.57 - lr: 0.000030 - momentum: 0.000000
2023-10-15 17:50:36,136 ----------------------------------------------------------------------------------------------------
2023-10-15 17:50:36,137 EPOCH 1 done: loss 0.3887 - lr: 0.000030
2023-10-15 17:50:42,081 DEV : loss 0.20117585361003876 - f1-score (micro avg) 0.2861
2023-10-15 17:50:42,110 saving best model
2023-10-15 17:50:42,473 ----------------------------------------------------------------------------------------------------
2023-10-15 17:51:07,318 epoch 2 - iter 521/5212 - loss 0.22240788 - time (sec): 24.84 - samples/sec: 1352.80 - lr: 0.000030 - momentum: 0.000000
2023-10-15 17:51:32,709 epoch 2 - iter 1042/5212 - loss 0.20503019 - time (sec): 50.24 - samples/sec: 1409.19 - lr: 0.000029 - momentum: 0.000000
2023-10-15 17:51:57,674 epoch 2 - iter 1563/5212 - loss 0.19541278 - time (sec): 75.20 - samples/sec: 1411.83 - lr: 0.000029 - momentum: 0.000000
2023-10-15 17:52:23,171 epoch 2 - iter 2084/5212 - loss 0.18582207 - time (sec): 100.70 - samples/sec: 1423.80 - lr: 0.000029 - momentum: 0.000000
2023-10-15 17:52:48,033 epoch 2 - iter 2605/5212 - loss 0.18632288 - time (sec): 125.56 - samples/sec: 1427.21 - lr: 0.000028 - momentum: 0.000000
2023-10-15 17:53:13,971 epoch 2 - iter 3126/5212 - loss 0.18323477 - time (sec): 151.50 - samples/sec: 1443.16 - lr: 0.000028 - momentum: 0.000000
2023-10-15 17:53:39,204 epoch 2 - iter 3647/5212 - loss 0.18006233 - time (sec): 176.73 - samples/sec: 1439.74 - lr: 0.000028 - momentum: 0.000000
2023-10-15 17:54:04,624 epoch 2 - iter 4168/5212 - loss 0.17990351 - time (sec): 202.15 - samples/sec: 1439.47 - lr: 0.000027 - momentum: 0.000000
2023-10-15 17:54:30,039 epoch 2 - iter 4689/5212 - loss 0.17515115 - time (sec): 227.56 - samples/sec: 1442.75 - lr: 0.000027 - momentum: 0.000000
2023-10-15 17:54:55,742 epoch 2 - iter 5210/5212 - loss 0.17338933 - time (sec): 253.27 - samples/sec: 1448.53 - lr: 0.000027 - momentum: 0.000000
2023-10-15 17:54:55,905 ----------------------------------------------------------------------------------------------------
2023-10-15 17:54:55,905 EPOCH 2 done: loss 0.1735 - lr: 0.000027
2023-10-15 17:55:04,829 DEV : loss 0.17898787558078766 - f1-score (micro avg) 0.3658
2023-10-15 17:55:04,856 saving best model
2023-10-15 17:55:05,299 ----------------------------------------------------------------------------------------------------
2023-10-15 17:55:30,120 epoch 3 - iter 521/5212 - loss 0.12227339 - time (sec): 24.82 - samples/sec: 1351.59 - lr: 0.000026 - momentum: 0.000000
2023-10-15 17:55:55,409 epoch 3 - iter 1042/5212 - loss 0.12395854 - time (sec): 50.11 - samples/sec: 1373.02 - lr: 0.000026 - momentum: 0.000000
2023-10-15 17:56:21,481 epoch 3 - iter 1563/5212 - loss 0.11709392 - time (sec): 76.18 - samples/sec: 1451.09 - lr: 0.000026 - momentum: 0.000000
2023-10-15 17:56:47,235 epoch 3 - iter 2084/5212 - loss 0.11296261 - time (sec): 101.93 - samples/sec: 1450.16 - lr: 0.000025 - momentum: 0.000000
2023-10-15 17:57:12,531 epoch 3 - iter 2605/5212 - loss 0.11412983 - time (sec): 127.23 - samples/sec: 1452.46 - lr: 0.000025 - momentum: 0.000000
2023-10-15 17:57:36,931 epoch 3 - iter 3126/5212 - loss 0.11505627 - time (sec): 151.63 - samples/sec: 1462.07 - lr: 0.000025 - momentum: 0.000000
2023-10-15 17:58:01,267 epoch 3 - iter 3647/5212 - loss 0.11553900 - time (sec): 175.97 - samples/sec: 1465.80 - lr: 0.000024 - momentum: 0.000000
2023-10-15 17:58:26,228 epoch 3 - iter 4168/5212 - loss 0.11759207 - time (sec): 200.93 - samples/sec: 1457.48 - lr: 0.000024 - momentum: 0.000000
2023-10-15 17:58:51,153 epoch 3 - iter 4689/5212 - loss 0.11866732 - time (sec): 225.85 - samples/sec: 1457.93 - lr: 0.000024 - momentum: 0.000000
2023-10-15 17:59:16,200 epoch 3 - iter 5210/5212 - loss 0.11854949 - time (sec): 250.90 - samples/sec: 1464.12 - lr: 0.000023 - momentum: 0.000000
2023-10-15 17:59:16,294 ----------------------------------------------------------------------------------------------------
2023-10-15 17:59:16,294 EPOCH 3 done: loss 0.1185 - lr: 0.000023
2023-10-15 17:59:25,325 DEV : loss 0.24826383590698242 - f1-score (micro avg) 0.3206
2023-10-15 17:59:25,351 ----------------------------------------------------------------------------------------------------
2023-10-15 17:59:50,986 epoch 4 - iter 521/5212 - loss 0.07086836 - time (sec): 25.63 - samples/sec: 1446.52 - lr: 0.000023 - momentum: 0.000000
2023-10-15 18:00:16,603 epoch 4 - iter 1042/5212 - loss 0.07461997 - time (sec): 51.25 - samples/sec: 1467.28 - lr: 0.000023 - momentum: 0.000000
2023-10-15 18:00:42,316 epoch 4 - iter 1563/5212 - loss 0.07589647 - time (sec): 76.96 - samples/sec: 1444.37 - lr: 0.000022 - momentum: 0.000000
2023-10-15 18:01:08,183 epoch 4 - iter 2084/5212 - loss 0.07950376 - time (sec): 102.83 - samples/sec: 1433.87 - lr: 0.000022 - momentum: 0.000000
2023-10-15 18:01:33,328 epoch 4 - iter 2605/5212 - loss 0.07820170 - time (sec): 127.98 - samples/sec: 1437.64 - lr: 0.000022 - momentum: 0.000000
2023-10-15 18:01:59,013 epoch 4 - iter 3126/5212 - loss 0.08102218 - time (sec): 153.66 - samples/sec: 1442.95 - lr: 0.000021 - momentum: 0.000000
2023-10-15 18:02:24,785 epoch 4 - iter 3647/5212 - loss 0.08157332 - time (sec): 179.43 - samples/sec: 1446.50 - lr: 0.000021 - momentum: 0.000000
2023-10-15 18:02:50,143 epoch 4 - iter 4168/5212 - loss 0.08239732 - time (sec): 204.79 - samples/sec: 1440.24 - lr: 0.000021 - momentum: 0.000000
2023-10-15 18:03:15,106 epoch 4 - iter 4689/5212 - loss 0.08200821 - time (sec): 229.75 - samples/sec: 1437.08 - lr: 0.000020 - momentum: 0.000000
2023-10-15 18:03:40,271 epoch 4 - iter 5210/5212 - loss 0.08251544 - time (sec): 254.92 - samples/sec: 1441.21 - lr: 0.000020 - momentum: 0.000000
2023-10-15 18:03:40,369 ----------------------------------------------------------------------------------------------------
2023-10-15 18:03:40,369 EPOCH 4 done: loss 0.0825 - lr: 0.000020
2023-10-15 18:03:48,838 DEV : loss 0.35853323340415955 - f1-score (micro avg) 0.3579
2023-10-15 18:03:48,867 ----------------------------------------------------------------------------------------------------
2023-10-15 18:04:15,944 epoch 5 - iter 521/5212 - loss 0.06074459 - time (sec): 27.07 - samples/sec: 1474.06 - lr: 0.000020 - momentum: 0.000000
2023-10-15 18:04:41,310 epoch 5 - iter 1042/5212 - loss 0.05750324 - time (sec): 52.44 - samples/sec: 1450.08 - lr: 0.000019 - momentum: 0.000000
2023-10-15 18:05:06,545 epoch 5 - iter 1563/5212 - loss 0.06019844 - time (sec): 77.68 - samples/sec: 1440.58 - lr: 0.000019 - momentum: 0.000000
2023-10-15 18:05:31,865 epoch 5 - iter 2084/5212 - loss 0.05797686 - time (sec): 103.00 - samples/sec: 1457.52 - lr: 0.000019 - momentum: 0.000000
2023-10-15 18:05:56,701 epoch 5 - iter 2605/5212 - loss 0.05779824 - time (sec): 127.83 - samples/sec: 1463.83 - lr: 0.000018 - momentum: 0.000000
2023-10-15 18:06:21,368 epoch 5 - iter 3126/5212 - loss 0.05796520 - time (sec): 152.50 - samples/sec: 1458.09 - lr: 0.000018 - momentum: 0.000000
2023-10-15 18:06:46,164 epoch 5 - iter 3647/5212 - loss 0.05749706 - time (sec): 177.30 - samples/sec: 1447.84 - lr: 0.000018 - momentum: 0.000000
2023-10-15 18:07:11,414 epoch 5 - iter 4168/5212 - loss 0.05748009 - time (sec): 202.55 - samples/sec: 1451.08 - lr: 0.000017 - momentum: 0.000000
2023-10-15 18:07:36,488 epoch 5 - iter 4689/5212 - loss 0.05835980 - time (sec): 227.62 - samples/sec: 1453.58 - lr: 0.000017 - momentum: 0.000000
2023-10-15 18:08:02,017 epoch 5 - iter 5210/5212 - loss 0.05917565 - time (sec): 253.15 - samples/sec: 1451.21 - lr: 0.000017 - momentum: 0.000000
2023-10-15 18:08:02,105 ----------------------------------------------------------------------------------------------------
2023-10-15 18:08:02,106 EPOCH 5 done: loss 0.0592 - lr: 0.000017
2023-10-15 18:08:10,370 DEV : loss 0.2954880893230438 - f1-score (micro avg) 0.3915
2023-10-15 18:08:10,397 saving best model
2023-10-15 18:08:10,852 ----------------------------------------------------------------------------------------------------
2023-10-15 18:08:36,370 epoch 6 - iter 521/5212 - loss 0.03566624 - time (sec): 25.51 - samples/sec: 1407.17 - lr: 0.000016 - momentum: 0.000000
2023-10-15 18:09:01,776 epoch 6 - iter 1042/5212 - loss 0.03735819 - time (sec): 50.92 - samples/sec: 1459.40 - lr: 0.000016 - momentum: 0.000000
2023-10-15 18:09:27,672 epoch 6 - iter 1563/5212 - loss 0.04010312 - time (sec): 76.82 - samples/sec: 1461.32 - lr: 0.000016 - momentum: 0.000000
2023-10-15 18:09:53,920 epoch 6 - iter 2084/5212 - loss 0.04272252 - time (sec): 103.06 - samples/sec: 1440.65 - lr: 0.000015 - momentum: 0.000000
2023-10-15 18:10:18,891 epoch 6 - iter 2605/5212 - loss 0.04658724 - time (sec): 128.04 - samples/sec: 1423.31 - lr: 0.000015 - momentum: 0.000000
2023-10-15 18:10:44,769 epoch 6 - iter 3126/5212 - loss 0.04608168 - time (sec): 153.91 - samples/sec: 1436.73 - lr: 0.000015 - momentum: 0.000000
2023-10-15 18:11:09,948 epoch 6 - iter 3647/5212 - loss 0.04595678 - time (sec): 179.09 - samples/sec: 1434.36 - lr: 0.000014 - momentum: 0.000000
2023-10-15 18:11:35,711 epoch 6 - iter 4168/5212 - loss 0.04599344 - time (sec): 204.85 - samples/sec: 1436.63 - lr: 0.000014 - momentum: 0.000000
2023-10-15 18:12:00,745 epoch 6 - iter 4689/5212 - loss 0.04455311 - time (sec): 229.89 - samples/sec: 1439.85 - lr: 0.000014 - momentum: 0.000000
2023-10-15 18:12:26,059 epoch 6 - iter 5210/5212 - loss 0.04410825 - time (sec): 255.20 - samples/sec: 1438.27 - lr: 0.000013 - momentum: 0.000000
2023-10-15 18:12:26,260 ----------------------------------------------------------------------------------------------------
2023-10-15 18:12:26,260 EPOCH 6 done: loss 0.0441 - lr: 0.000013
2023-10-15 18:12:34,571 DEV : loss 0.39353951811790466 - f1-score (micro avg) 0.3728
2023-10-15 18:12:34,601 ----------------------------------------------------------------------------------------------------
2023-10-15 18:12:59,435 epoch 7 - iter 521/5212 - loss 0.03179235 - time (sec): 24.83 - samples/sec: 1364.97 - lr: 0.000013 - momentum: 0.000000
2023-10-15 18:13:24,802 epoch 7 - iter 1042/5212 - loss 0.02701866 - time (sec): 50.20 - samples/sec: 1413.92 - lr: 0.000013 - momentum: 0.000000
2023-10-15 18:13:49,752 epoch 7 - iter 1563/5212 - loss 0.02758616 - time (sec): 75.15 - samples/sec: 1397.11 - lr: 0.000012 - momentum: 0.000000
2023-10-15 18:14:15,112 epoch 7 - iter 2084/5212 - loss 0.02597876 - time (sec): 100.51 - samples/sec: 1423.38 - lr: 0.000012 - momentum: 0.000000
2023-10-15 18:14:40,889 epoch 7 - iter 2605/5212 - loss 0.03022671 - time (sec): 126.29 - samples/sec: 1432.09 - lr: 0.000012 - momentum: 0.000000
2023-10-15 18:15:06,162 epoch 7 - iter 3126/5212 - loss 0.03044416 - time (sec): 151.56 - samples/sec: 1431.88 - lr: 0.000011 - momentum: 0.000000
2023-10-15 18:15:31,480 epoch 7 - iter 3647/5212 - loss 0.03128317 - time (sec): 176.88 - samples/sec: 1442.88 - lr: 0.000011 - momentum: 0.000000
2023-10-15 18:15:57,689 epoch 7 - iter 4168/5212 - loss 0.03231927 - time (sec): 203.09 - samples/sec: 1441.69 - lr: 0.000011 - momentum: 0.000000
2023-10-15 18:16:24,016 epoch 7 - iter 4689/5212 - loss 0.03123865 - time (sec): 229.41 - samples/sec: 1435.67 - lr: 0.000010 - momentum: 0.000000
2023-10-15 18:16:49,654 epoch 7 - iter 5210/5212 - loss 0.03238048 - time (sec): 255.05 - samples/sec: 1440.47 - lr: 0.000010 - momentum: 0.000000
2023-10-15 18:16:49,745 ----------------------------------------------------------------------------------------------------
2023-10-15 18:16:49,745 EPOCH 7 done: loss 0.0324 - lr: 0.000010
2023-10-15 18:16:58,188 DEV : loss 0.4698057472705841 - f1-score (micro avg) 0.3445
2023-10-15 18:16:58,221 ----------------------------------------------------------------------------------------------------
2023-10-15 18:17:24,331 epoch 8 - iter 521/5212 - loss 0.02118882 - time (sec): 26.11 - samples/sec: 1507.33 - lr: 0.000010 - momentum: 0.000000
2023-10-15 18:17:50,035 epoch 8 - iter 1042/5212 - loss 0.02174828 - time (sec): 51.81 - samples/sec: 1501.03 - lr: 0.000009 - momentum: 0.000000
2023-10-15 18:18:15,674 epoch 8 - iter 1563/5212 - loss 0.02240042 - time (sec): 77.45 - samples/sec: 1484.23 - lr: 0.000009 - momentum: 0.000000
2023-10-15 18:18:40,960 epoch 8 - iter 2084/5212 - loss 0.02307800 - time (sec): 102.74 - samples/sec: 1458.93 - lr: 0.000009 - momentum: 0.000000
2023-10-15 18:19:06,094 epoch 8 - iter 2605/5212 - loss 0.02270835 - time (sec): 127.87 - samples/sec: 1448.41 - lr: 0.000008 - momentum: 0.000000
2023-10-15 18:19:31,710 epoch 8 - iter 3126/5212 - loss 0.02207908 - time (sec): 153.49 - samples/sec: 1443.14 - lr: 0.000008 - momentum: 0.000000
2023-10-15 18:19:57,514 epoch 8 - iter 3647/5212 - loss 0.02166312 - time (sec): 179.29 - samples/sec: 1447.94 - lr: 0.000008 - momentum: 0.000000
2023-10-15 18:20:22,251 epoch 8 - iter 4168/5212 - loss 0.02233639 - time (sec): 204.03 - samples/sec: 1441.81 - lr: 0.000007 - momentum: 0.000000
2023-10-15 18:20:47,299 epoch 8 - iter 4689/5212 - loss 0.02246766 - time (sec): 229.08 - samples/sec: 1441.54 - lr: 0.000007 - momentum: 0.000000
2023-10-15 18:21:11,946 epoch 8 - iter 5210/5212 - loss 0.02273107 - time (sec): 253.72 - samples/sec: 1446.48 - lr: 0.000007 - momentum: 0.000000
2023-10-15 18:21:12,056 ----------------------------------------------------------------------------------------------------
2023-10-15 18:21:12,056 EPOCH 8 done: loss 0.0227 - lr: 0.000007
2023-10-15 18:21:21,123 DEV : loss 0.48737385869026184 - f1-score (micro avg) 0.3614
2023-10-15 18:21:21,155 ----------------------------------------------------------------------------------------------------
2023-10-15 18:21:45,925 epoch 9 - iter 521/5212 - loss 0.01657988 - time (sec): 24.77 - samples/sec: 1436.79 - lr: 0.000006 - momentum: 0.000000
2023-10-15 18:22:11,071 epoch 9 - iter 1042/5212 - loss 0.01825306 - time (sec): 49.92 - samples/sec: 1435.80 - lr: 0.000006 - momentum: 0.000000
2023-10-15 18:22:35,983 epoch 9 - iter 1563/5212 - loss 0.01537751 - time (sec): 74.83 - samples/sec: 1446.59 - lr: 0.000006 - momentum: 0.000000
2023-10-15 18:23:01,098 epoch 9 - iter 2084/5212 - loss 0.01556559 - time (sec): 99.94 - samples/sec: 1448.93 - lr: 0.000005 - momentum: 0.000000
2023-10-15 18:23:26,191 epoch 9 - iter 2605/5212 - loss 0.01597811 - time (sec): 125.04 - samples/sec: 1441.42 - lr: 0.000005 - momentum: 0.000000
2023-10-15 18:23:52,046 epoch 9 - iter 3126/5212 - loss 0.01561272 - time (sec): 150.89 - samples/sec: 1451.99 - lr: 0.000005 - momentum: 0.000000
2023-10-15 18:24:16,900 epoch 9 - iter 3647/5212 - loss 0.01574407 - time (sec): 175.74 - samples/sec: 1453.65 - lr: 0.000004 - momentum: 0.000000
2023-10-15 18:24:42,461 epoch 9 - iter 4168/5212 - loss 0.01612832 - time (sec): 201.31 - samples/sec: 1448.57 - lr: 0.000004 - momentum: 0.000000
2023-10-15 18:25:08,164 epoch 9 - iter 4689/5212 - loss 0.01585164 - time (sec): 227.01 - samples/sec: 1452.94 - lr: 0.000004 - momentum: 0.000000
2023-10-15 18:25:33,956 epoch 9 - iter 5210/5212 - loss 0.01523723 - time (sec): 252.80 - samples/sec: 1452.97 - lr: 0.000003 - momentum: 0.000000
2023-10-15 18:25:34,044 ----------------------------------------------------------------------------------------------------
2023-10-15 18:25:34,044 EPOCH 9 done: loss 0.0152 - lr: 0.000003
2023-10-15 18:25:43,065 DEV : loss 0.5062488317489624 - f1-score (micro avg) 0.3655
2023-10-15 18:25:43,093 ----------------------------------------------------------------------------------------------------
2023-10-15 18:26:08,163 epoch 10 - iter 521/5212 - loss 0.00931166 - time (sec): 25.07 - samples/sec: 1432.29 - lr: 0.000003 - momentum: 0.000000
2023-10-15 18:26:33,323 epoch 10 - iter 1042/5212 - loss 0.01081890 - time (sec): 50.23 - samples/sec: 1458.64 - lr: 0.000003 - momentum: 0.000000
2023-10-15 18:26:58,431 epoch 10 - iter 1563/5212 - loss 0.01002542 - time (sec): 75.34 - samples/sec: 1467.77 - lr: 0.000002 - momentum: 0.000000
2023-10-15 18:27:24,060 epoch 10 - iter 2084/5212 - loss 0.00973675 - time (sec): 100.97 - samples/sec: 1456.67 - lr: 0.000002 - momentum: 0.000000
2023-10-15 18:27:49,708 epoch 10 - iter 2605/5212 - loss 0.01027887 - time (sec): 126.61 - samples/sec: 1466.36 - lr: 0.000002 - momentum: 0.000000
2023-10-15 18:28:15,215 epoch 10 - iter 3126/5212 - loss 0.01040973 - time (sec): 152.12 - samples/sec: 1460.95 - lr: 0.000001 - momentum: 0.000000
2023-10-15 18:28:40,464 epoch 10 - iter 3647/5212 - loss 0.01009036 - time (sec): 177.37 - samples/sec: 1459.94 - lr: 0.000001 - momentum: 0.000000
2023-10-15 18:29:05,142 epoch 10 - iter 4168/5212 - loss 0.01032693 - time (sec): 202.05 - samples/sec: 1454.33 - lr: 0.000001 - momentum: 0.000000
2023-10-15 18:29:30,640 epoch 10 - iter 4689/5212 - loss 0.01050964 - time (sec): 227.55 - samples/sec: 1448.72 - lr: 0.000000 - momentum: 0.000000
2023-10-15 18:29:55,882 epoch 10 - iter 5210/5212 - loss 0.01022552 - time (sec): 252.79 - samples/sec: 1452.20 - lr: 0.000000 - momentum: 0.000000
2023-10-15 18:29:56,017 ----------------------------------------------------------------------------------------------------
2023-10-15 18:29:56,017 EPOCH 10 done: loss 0.0102 - lr: 0.000000
2023-10-15 18:30:05,188 DEV : loss 0.5002045631408691 - f1-score (micro avg) 0.374
2023-10-15 18:30:05,584 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:05,585 Loading model from best epoch ...
2023-10-15 18:30:07,219 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 18:30:22,520 Results:
- F-score (micro) 0.4665
- F-score (macro) 0.3023
- Accuracy 0.3084

By class:
              precision    recall  f1-score   support

         LOC     0.5410    0.5865    0.5628      1214
         PER     0.4010    0.4035    0.4022       808
         ORG     0.3038    0.2040    0.2441       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4686    0.4644    0.4665      2390
   macro avg     0.3115    0.2985    0.3023      2390
weighted avg     0.4553    0.4644    0.4579      2390

2023-10-15 18:30:22,520 ----------------------------------------------------------------------------------------------------