2023-10-15 18:30:53,960 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,961 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 18:30:53,961 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,961 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 18:30:53,961 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,961 Train: 20847 sentences
2023-10-15 18:30:53,961 (train_with_dev=False, train_with_test=False)
2023-10-15 18:30:53,961 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,961 Training Params:
2023-10-15 18:30:53,961 - learning_rate: "5e-05"
2023-10-15 18:30:53,961 - mini_batch_size: "4"
2023-10-15 18:30:53,961 - max_epochs: "10"
2023-10-15 18:30:53,961 - shuffle: "True"
2023-10-15 18:30:53,961 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,961 Plugins:
2023-10-15 18:30:53,961 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 18:30:53,961 ----------------------------------------------------------------------------------------------------
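Note on the LinearScheduler plugin: the lr values in the per-iteration lines are consistent with a linear warmup over the first 10% of all steps (here exactly one epoch of 5212 iterations) followed by a linear decay to zero. The sketch below reproduces that shape under this assumption. The function name `linear_schedule_lr` is hypothetical and the code is not taken from Flair's source; it only illustrates the arithmetic.

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Assumed schedule: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: lr grows proportionally to the step count.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: lr shrinks linearly and reaches 0 at the last step.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values from this log: 10 epochs x 5212 iterations, learning_rate 5e-05.
TOTAL_STEPS = 10 * 5212
PEAK_LR = 5e-05
WARMUP_FRACTION = 0.1

# Global step = (epoch - 1) * 5212 + iteration within the epoch.
lr_epoch1_iter521 = linear_schedule_lr(521, TOTAL_STEPS, PEAK_LR, WARMUP_FRACTION)
lr_epoch2_iter521 = linear_schedule_lr(5212 + 521, TOTAL_STEPS, PEAK_LR, WARMUP_FRACTION)
lr_epoch2_iter5210 = linear_schedule_lr(5212 + 5210, TOTAL_STEPS, PEAK_LR, WARMUP_FRACTION)

for name, value in [
    ("epoch 1, iter 521", lr_epoch1_iter521),
    ("epoch 2, iter 521", lr_epoch2_iter521),
    ("epoch 2, iter 5210", lr_epoch2_iter5210),
]:
    print(f"{name}: lr {value:.6f}")
```

Formatted to six decimals, these three values agree with the lr column logged at the same iterations (0.000005, 0.000049, 0.000044).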
2023-10-15 18:30:53,961 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 18:30:53,961 - metric: "('micro avg', 'f1-score')"
2023-10-15 18:30:53,962 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,962 Computation:
2023-10-15 18:30:53,962 - compute on device: cuda:0
2023-10-15 18:30:53,962 - embedding storage: none
2023-10-15 18:30:53,962 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,962 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-15 18:30:53,962 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,962 ----------------------------------------------------------------------------------------------------
2023-10-15 18:31:18,728 epoch 1 - iter 521/5212 - loss 1.28960364 - time (sec): 24.77 - samples/sec: 1414.43 - lr: 0.000005 - momentum: 0.000000
2023-10-15 18:31:44,032 epoch 1 - iter 1042/5212 - loss 0.87638561 - time (sec): 50.07 - samples/sec: 1417.33 - lr: 0.000010 - momentum: 0.000000
2023-10-15 18:32:10,016 epoch 1 - iter 1563/5212 - loss 0.66894164 - time (sec): 76.05 - samples/sec: 1423.01 - lr: 0.000015 - momentum: 0.000000
2023-10-15 18:32:35,647 epoch 1 - iter 2084/5212 - loss 0.55810972 - time (sec): 101.68 - samples/sec: 1433.24 - lr: 0.000020 - momentum: 0.000000
2023-10-15 18:33:00,980 epoch 1 - iter 2605/5212 - loss 0.49511139 - time (sec): 127.02 - samples/sec: 1439.63 - lr: 0.000025 - momentum: 0.000000
2023-10-15 18:33:27,406 epoch 1 - iter 3126/5212 - loss 0.44378646 - time (sec): 153.44 - samples/sec: 1449.27 - lr: 0.000030 - momentum: 0.000000
2023-10-15 18:33:52,795 epoch 1 - iter 3647/5212 - loss 0.41309918 - time (sec): 178.83 - samples/sec: 1447.91 - lr: 0.000035 - momentum: 0.000000
2023-10-15 18:34:17,986 epoch 1 - iter 4168/5212 - loss 0.38982212 - time (sec): 204.02 - samples/sec: 1449.28 - lr: 0.000040 - momentum: 0.000000
2023-10-15 18:34:43,243 epoch 1 - iter 4689/5212 - loss 0.37676637 - time (sec): 229.28 - samples/sec: 1435.39 - lr: 0.000045 - momentum: 0.000000
2023-10-15 18:35:09,023 epoch 1 - iter 5210/5212 - loss 0.35899282 - time (sec): 255.06 - samples/sec: 1440.23 - lr: 0.000050 - momentum: 0.000000
2023-10-15 18:35:09,114 ----------------------------------------------------------------------------------------------------
2023-10-15 18:35:09,114 EPOCH 1 done: loss 0.3590 - lr: 0.000050
2023-10-15 18:35:14,909 DEV : loss 0.1935647428035736 - f1-score (micro avg) 0.2561
2023-10-15 18:35:14,934 saving best model
2023-10-15 18:35:15,282 ----------------------------------------------------------------------------------------------------
2023-10-15 18:35:40,738 epoch 2 - iter 521/5212 - loss 0.24636769 - time (sec): 25.45 - samples/sec: 1320.35 - lr: 0.000049 - momentum: 0.000000
2023-10-15 18:36:05,977 epoch 2 - iter 1042/5212 - loss 0.23231567 - time (sec): 50.69 - samples/sec: 1396.46 - lr: 0.000049 - momentum: 0.000000
2023-10-15 18:36:31,184 epoch 2 - iter 1563/5212 - loss 0.22395732 - time (sec): 75.90 - samples/sec: 1398.81 - lr: 0.000048 - momentum: 0.000000
2023-10-15 18:36:56,597 epoch 2 - iter 2084/5212 - loss 0.21548760 - time (sec): 101.31 - samples/sec: 1415.13 - lr: 0.000048 - momentum: 0.000000
2023-10-15 18:37:21,551 epoch 2 - iter 2605/5212 - loss 0.21176218 - time (sec): 126.27 - samples/sec: 1419.20 - lr: 0.000047 - momentum: 0.000000
2023-10-15 18:37:47,270 epoch 2 - iter 3126/5212 - loss 0.20948142 - time (sec): 151.99 - samples/sec: 1438.50 - lr: 0.000047 - momentum: 0.000000
2023-10-15 18:38:12,325 epoch 2 - iter 3647/5212 - loss 0.20703935 - time (sec): 177.04 - samples/sec: 1437.21 - lr: 0.000046 - momentum: 0.000000
2023-10-15 18:38:38,051 epoch 2 - iter 4168/5212 - loss 0.20284934 - time (sec): 202.77 - samples/sec: 1435.09 - lr: 0.000046 - momentum: 0.000000
2023-10-15 18:39:03,756 epoch 2 - iter 4689/5212 - loss 0.19868831 - time (sec): 228.47 - samples/sec: 1437.02 - lr: 0.000045 - momentum: 0.000000
2023-10-15 18:39:29,272 epoch 2 - iter 5210/5212 - loss 0.19660262 - time (sec): 253.99 - samples/sec: 1444.41 - lr: 0.000044 - momentum: 0.000000
2023-10-15 18:39:29,439 ----------------------------------------------------------------------------------------------------
2023-10-15 18:39:29,439 EPOCH 2 done: loss 0.1966 - lr: 0.000044
2023-10-15 18:39:38,418 DEV : loss 0.26646727323532104 - f1-score (micro avg) 0.3126
2023-10-15 18:39:38,445 saving best model
2023-10-15 18:39:38,908 ----------------------------------------------------------------------------------------------------
2023-10-15 18:40:03,583 epoch 3 - iter 521/5212 - loss 0.15758472 - time (sec): 24.67 - samples/sec: 1359.66 - lr: 0.000044 - momentum: 0.000000
2023-10-15 18:40:28,809 epoch 3 - iter 1042/5212 - loss 0.15326080 - time (sec): 49.90 - samples/sec: 1378.81 - lr: 0.000043 - momentum: 0.000000
2023-10-15 18:40:55,000 epoch 3 - iter 1563/5212 - loss 0.14597497 - time (sec): 76.09 - samples/sec: 1452.84 - lr: 0.000043 - momentum: 0.000000
2023-10-15 18:41:20,559 epoch 3 - iter 2084/5212 - loss 0.14065920 - time (sec): 101.65 - samples/sec: 1454.24 - lr: 0.000042 - momentum: 0.000000
2023-10-15 18:41:45,973 epoch 3 - iter 2605/5212 - loss 0.14592720 - time (sec): 127.06 - samples/sec: 1454.39 - lr: 0.000042 - momentum: 0.000000
2023-10-15 18:42:10,959 epoch 3 - iter 3126/5212 - loss 0.14605254 - time (sec): 152.05 - samples/sec: 1458.06 - lr: 0.000041 - momentum: 0.000000
2023-10-15 18:42:36,131 epoch 3 - iter 3647/5212 - loss 0.14458467 - time (sec): 177.22 - samples/sec: 1455.44 - lr: 0.000041 - momentum: 0.000000
2023-10-15 18:43:01,039 epoch 3 - iter 4168/5212 - loss 0.14831744 - time (sec): 202.13 - samples/sec: 1448.82 - lr: 0.000040 - momentum: 0.000000
2023-10-15 18:43:26,000 epoch 3 - iter 4689/5212 - loss 0.14722580 - time (sec): 227.09 - samples/sec: 1449.99 - lr: 0.000039 - momentum: 0.000000
2023-10-15 18:43:51,506 epoch 3 - iter 5210/5212 - loss 0.14520606 - time (sec): 252.60 - samples/sec: 1454.29 - lr: 0.000039 - momentum: 0.000000
2023-10-15 18:43:51,594 ----------------------------------------------------------------------------------------------------
2023-10-15 18:43:51,594 EPOCH 3 done: loss 0.1452 - lr: 0.000039
2023-10-15 18:43:59,942 DEV : loss 0.21102237701416016 - f1-score (micro avg) 0.3177
2023-10-15 18:43:59,971 saving best model
2023-10-15 18:44:00,404 ----------------------------------------------------------------------------------------------------
2023-10-15 18:44:26,791 epoch 4 - iter 521/5212 - loss 0.09694449 - time (sec): 26.39 - samples/sec: 1405.34 - lr: 0.000038 - momentum: 0.000000
2023-10-15 18:44:51,970 epoch 4 - iter 1042/5212 - loss 0.10160415 - time (sec): 51.56 - samples/sec: 1458.36 - lr: 0.000038 - momentum: 0.000000
2023-10-15 18:45:17,568 epoch 4 - iter 1563/5212 - loss 0.10470997 - time (sec): 77.16 - samples/sec: 1440.66 - lr: 0.000037 - momentum: 0.000000
2023-10-15 18:45:42,943 epoch 4 - iter 2084/5212 - loss 0.10504412 - time (sec): 102.54 - samples/sec: 1437.98 - lr: 0.000037 - momentum: 0.000000
2023-10-15 18:46:08,464 epoch 4 - iter 2605/5212 - loss 0.10205242 - time (sec): 128.06 - samples/sec: 1436.72 - lr: 0.000036 - momentum: 0.000000
2023-10-15 18:46:34,112 epoch 4 - iter 3126/5212 - loss 0.10491793 - time (sec): 153.71 - samples/sec: 1442.53 - lr: 0.000036 - momentum: 0.000000
2023-10-15 18:46:59,262 epoch 4 - iter 3647/5212 - loss 0.10491333 - time (sec): 178.86 - samples/sec: 1451.17 - lr: 0.000035 - momentum: 0.000000
2023-10-15 18:47:24,773 epoch 4 - iter 4168/5212 - loss 0.11104350 - time (sec): 204.37 - samples/sec: 1443.22 - lr: 0.000034 - momentum: 0.000000
2023-10-15 18:47:49,932 epoch 4 - iter 4689/5212 - loss 0.11023634 - time (sec): 229.53 - samples/sec: 1438.50 - lr: 0.000034 - momentum: 0.000000
2023-10-15 18:48:15,080 epoch 4 - iter 5210/5212 - loss 0.11005524 - time (sec): 254.67 - samples/sec: 1442.59 - lr: 0.000033 - momentum: 0.000000
2023-10-15 18:48:15,165 ----------------------------------------------------------------------------------------------------
2023-10-15 18:48:15,165 EPOCH 4 done: loss 0.1100 - lr: 0.000033
2023-10-15 18:48:23,650 DEV : loss 0.3510819673538208 - f1-score (micro avg) 0.3189
2023-10-15 18:48:23,695 saving best model
2023-10-15 18:48:24,183 ----------------------------------------------------------------------------------------------------
2023-10-15 18:48:51,942 epoch 5 - iter 521/5212 - loss 0.08171099 - time (sec): 27.76 - samples/sec: 1437.83 - lr: 0.000033 - momentum: 0.000000
2023-10-15 18:49:17,391 epoch 5 - iter 1042/5212 - loss 0.07891529 - time (sec): 53.21 - samples/sec: 1429.22 - lr: 0.000032 - momentum: 0.000000
2023-10-15 18:49:43,253 epoch 5 - iter 1563/5212 - loss 0.08091614 - time (sec): 79.07 - samples/sec: 1415.21 - lr: 0.000032 - momentum: 0.000000
2023-10-15 18:50:08,706 epoch 5 - iter 2084/5212 - loss 0.07786060 - time (sec): 104.52 - samples/sec: 1436.25 - lr: 0.000031 - momentum: 0.000000
2023-10-15 18:50:33,799 epoch 5 - iter 2605/5212 - loss 0.07747685 - time (sec): 129.61 - samples/sec: 1443.70 - lr: 0.000031 - momentum: 0.000000
2023-10-15 18:50:58,814 epoch 5 - iter 3126/5212 - loss 0.07663910 - time (sec): 154.63 - samples/sec: 1438.00 - lr: 0.000030 - momentum: 0.000000
2023-10-15 18:51:23,868 epoch 5 - iter 3647/5212 - loss 0.07608571 - time (sec): 179.68 - samples/sec: 1428.61 - lr: 0.000029 - momentum: 0.000000
2023-10-15 18:51:49,382 epoch 5 - iter 4168/5212 - loss 0.07640161 - time (sec): 205.20 - samples/sec: 1432.32 - lr: 0.000029 - momentum: 0.000000
2023-10-15 18:52:14,940 epoch 5 - iter 4689/5212 - loss 0.07755494 - time (sec): 230.76 - samples/sec: 1433.82 - lr: 0.000028 - momentum: 0.000000
2023-10-15 18:52:39,947 epoch 5 - iter 5210/5212 - loss 0.07842196 - time (sec): 255.76 - samples/sec: 1436.38 - lr: 0.000028 - momentum: 0.000000
2023-10-15 18:52:40,037 ----------------------------------------------------------------------------------------------------
2023-10-15 18:52:40,037 EPOCH 5 done: loss 0.0784 - lr: 0.000028
2023-10-15 18:52:48,525 DEV : loss 0.2648324966430664 - f1-score (micro avg) 0.3525
2023-10-15 18:52:48,571 saving best model
2023-10-15 18:52:49,098 ----------------------------------------------------------------------------------------------------
2023-10-15 18:53:15,489 epoch 6 - iter 521/5212 - loss 0.05230408 - time (sec): 26.39 - samples/sec: 1360.55 - lr: 0.000027 - momentum: 0.000000
2023-10-15 18:53:40,907 epoch 6 - iter 1042/5212 - loss 0.05845790 - time (sec): 51.81 - samples/sec: 1434.46 - lr: 0.000027 - momentum: 0.000000
2023-10-15 18:54:07,100 epoch 6 - iter 1563/5212 - loss 0.06310501 - time (sec): 78.00 - samples/sec: 1439.16 - lr: 0.000026 - momentum: 0.000000
2023-10-15 18:54:32,649 epoch 6 - iter 2084/5212 - loss 0.06369300 - time (sec): 103.55 - samples/sec: 1433.91 - lr: 0.000026 - momentum: 0.000000
2023-10-15 18:54:57,100 epoch 6 - iter 2605/5212 - loss 0.06667131 - time (sec): 128.00 - samples/sec: 1423.71 - lr: 0.000025 - momentum: 0.000000
2023-10-15 18:55:22,500 epoch 6 - iter 3126/5212 - loss 0.06631037 - time (sec): 153.40 - samples/sec: 1441.54 - lr: 0.000024 - momentum: 0.000000
2023-10-15 18:55:48,175 epoch 6 - iter 3647/5212 - loss 0.06540271 - time (sec): 179.07 - samples/sec: 1434.50 - lr: 0.000024 - momentum: 0.000000
2023-10-15 18:56:13,599 epoch 6 - iter 4168/5212 - loss 0.06511901 - time (sec): 204.50 - samples/sec: 1439.13 - lr: 0.000023 - momentum: 0.000000
2023-10-15 18:56:38,437 epoch 6 - iter 4689/5212 - loss 0.06326195 - time (sec): 229.34 - samples/sec: 1443.32 - lr: 0.000023 - momentum: 0.000000
2023-10-15 18:57:03,252 epoch 6 - iter 5210/5212 - loss 0.06322970 - time (sec): 254.15 - samples/sec: 1444.22 - lr: 0.000022 - momentum: 0.000000
2023-10-15 18:57:03,445 ----------------------------------------------------------------------------------------------------
2023-10-15 18:57:03,446 EPOCH 6 done: loss 0.0632 - lr: 0.000022
2023-10-15 18:57:11,710 DEV : loss 0.3284428119659424 - f1-score (micro avg) 0.3479
2023-10-15 18:57:11,739 ----------------------------------------------------------------------------------------------------
2023-10-15 18:57:36,385 epoch 7 - iter 521/5212 - loss 0.04602036 - time (sec): 24.65 - samples/sec: 1375.30 - lr: 0.000022 - momentum: 0.000000
2023-10-15 18:58:01,491 epoch 7 - iter 1042/5212 - loss 0.04163037 - time (sec): 49.75 - samples/sec: 1426.66 - lr: 0.000021 - momentum: 0.000000
2023-10-15 18:58:26,685 epoch 7 - iter 1563/5212 - loss 0.04302604 - time (sec): 74.94 - samples/sec: 1400.93 - lr: 0.000021 - momentum: 0.000000
2023-10-15 18:58:52,373 epoch 7 - iter 2084/5212 - loss 0.04090613 - time (sec): 100.63 - samples/sec: 1421.62 - lr: 0.000020 - momentum: 0.000000
2023-10-15 18:59:18,332 epoch 7 - iter 2605/5212 - loss 0.04297160 - time (sec): 126.59 - samples/sec: 1428.63 - lr: 0.000019 - momentum: 0.000000
2023-10-15 18:59:43,473 epoch 7 - iter 3126/5212 - loss 0.04222747 - time (sec): 151.73 - samples/sec: 1430.24 - lr: 0.000019 - momentum: 0.000000
2023-10-15 19:00:08,729 epoch 7 - iter 3647/5212 - loss 0.04304694 - time (sec): 176.99 - samples/sec: 1441.97 - lr: 0.000018 - momentum: 0.000000
2023-10-15 19:00:33,766 epoch 7 - iter 4168/5212 - loss 0.04368892 - time (sec): 202.03 - samples/sec: 1449.26 - lr: 0.000018 - momentum: 0.000000
2023-10-15 19:00:59,308 epoch 7 - iter 4689/5212 - loss 0.04297916 - time (sec): 227.57 - samples/sec: 1447.31 - lr: 0.000017 - momentum: 0.000000
2023-10-15 19:01:24,566 epoch 7 - iter 5210/5212 - loss 0.04365791 - time (sec): 252.83 - samples/sec: 1453.14 - lr: 0.000017 - momentum: 0.000000
2023-10-15 19:01:24,659 ----------------------------------------------------------------------------------------------------
2023-10-15 19:01:24,659 EPOCH 7 done: loss 0.0437 - lr: 0.000017
2023-10-15 19:01:33,790 DEV : loss 0.3804240822792053 - f1-score (micro avg) 0.351
2023-10-15 19:01:33,821 ----------------------------------------------------------------------------------------------------
2023-10-15 19:01:59,298 epoch 8 - iter 521/5212 - loss 0.03278593 - time (sec): 25.48 - samples/sec: 1544.77 - lr: 0.000016 - momentum: 0.000000
2023-10-15 19:02:25,536 epoch 8 - iter 1042/5212 - loss 0.03015265 - time (sec): 51.71 - samples/sec: 1503.92 - lr: 0.000016 - momentum: 0.000000
2023-10-15 19:02:52,003 epoch 8 - iter 1563/5212 - loss 0.03141214 - time (sec): 78.18 - samples/sec: 1470.41 - lr: 0.000015 - momentum: 0.000000
2023-10-15 19:03:17,168 epoch 8 - iter 2084/5212 - loss 0.03397435 - time (sec): 103.35 - samples/sec: 1450.36 - lr: 0.000014 - momentum: 0.000000
2023-10-15 19:03:42,358 epoch 8 - iter 2605/5212 - loss 0.03614872 - time (sec): 128.54 - samples/sec: 1440.92 - lr: 0.000014 - momentum: 0.000000
2023-10-15 19:04:08,045 epoch 8 - iter 3126/5212 - loss 0.03527472 - time (sec): 154.22 - samples/sec: 1436.27 - lr: 0.000013 - momentum: 0.000000
2023-10-15 19:04:34,122 epoch 8 - iter 3647/5212 - loss 0.03459124 - time (sec): 180.30 - samples/sec: 1439.85 - lr: 0.000013 - momentum: 0.000000
2023-10-15 19:04:58,912 epoch 8 - iter 4168/5212 - loss 0.03421361 - time (sec): 205.09 - samples/sec: 1434.35 - lr: 0.000012 - momentum: 0.000000
2023-10-15 19:05:23,753 epoch 8 - iter 4689/5212 - loss 0.03359174 - time (sec): 229.93 - samples/sec: 1436.19 - lr: 0.000012 - momentum: 0.000000
2023-10-15 19:05:48,530 epoch 8 - iter 5210/5212 - loss 0.03303597 - time (sec): 254.71 - samples/sec: 1440.90 - lr: 0.000011 - momentum: 0.000000
2023-10-15 19:05:48,643 ----------------------------------------------------------------------------------------------------
2023-10-15 19:05:48,643 EPOCH 8 done: loss 0.0330 - lr: 0.000011
2023-10-15 19:05:58,972 DEV : loss 0.3673393130302429 - f1-score (micro avg) 0.3845
2023-10-15 19:05:59,018 saving best model
2023-10-15 19:05:59,528 ----------------------------------------------------------------------------------------------------
2023-10-15 19:06:28,356 epoch 9 - iter 521/5212 - loss 0.02522660 - time (sec): 28.82 - samples/sec: 1234.63 - lr: 0.000011 - momentum: 0.000000
2023-10-15 19:06:55,424 epoch 9 - iter 1042/5212 - loss 0.02383534 - time (sec): 55.89 - samples/sec: 1282.26 - lr: 0.000010 - momentum: 0.000000
2023-10-15 19:07:20,832 epoch 9 - iter 1563/5212 - loss 0.02195053 - time (sec): 81.30 - samples/sec: 1331.42 - lr: 0.000009 - momentum: 0.000000
2023-10-15 19:07:45,994 epoch 9 - iter 2084/5212 - loss 0.02261221 - time (sec): 106.46 - samples/sec: 1360.19 - lr: 0.000009 - momentum: 0.000000
2023-10-15 19:08:11,005 epoch 9 - iter 2605/5212 - loss 0.02502914 - time (sec): 131.47 - samples/sec: 1370.83 - lr: 0.000008 - momentum: 0.000000
2023-10-15 19:08:37,062 epoch 9 - iter 3126/5212 - loss 0.02521692 - time (sec): 157.53 - samples/sec: 1390.80 - lr: 0.000008 - momentum: 0.000000
2023-10-15 19:09:02,395 epoch 9 - iter 3647/5212 - loss 0.02518572 - time (sec): 182.86 - samples/sec: 1397.05 - lr: 0.000007 - momentum: 0.000000
2023-10-15 19:09:27,771 epoch 9 - iter 4168/5212 - loss 0.02436796 - time (sec): 208.24 - samples/sec: 1400.34 - lr: 0.000007 - momentum: 0.000000
2023-10-15 19:09:53,231 epoch 9 - iter 4689/5212 - loss 0.02383592 - time (sec): 233.70 - samples/sec: 1411.35 - lr: 0.000006 - momentum: 0.000000
2023-10-15 19:10:18,608 epoch 9 - iter 5210/5212 - loss 0.02324373 - time (sec): 259.08 - samples/sec: 1417.78 - lr: 0.000006 - momentum: 0.000000
2023-10-15 19:10:18,697 ----------------------------------------------------------------------------------------------------
2023-10-15 19:10:18,697 EPOCH 9 done: loss 0.0232 - lr: 0.000006
2023-10-15 19:10:28,083 DEV : loss 0.4655354917049408 - f1-score (micro avg) 0.3527
2023-10-15 19:10:28,119 ----------------------------------------------------------------------------------------------------
2023-10-15 19:10:55,276 epoch 10 - iter 521/5212 - loss 0.02072785 - time (sec): 27.16 - samples/sec: 1322.25 - lr: 0.000005 - momentum: 0.000000
2023-10-15 19:11:20,967 epoch 10 - iter 1042/5212 - loss 0.01880289 - time (sec): 52.85 - samples/sec: 1386.39 - lr: 0.000004 - momentum: 0.000000
2023-10-15 19:11:46,669 epoch 10 - iter 1563/5212 - loss 0.01711965 - time (sec): 78.55 - samples/sec: 1407.78 - lr: 0.000004 - momentum: 0.000000
2023-10-15 19:12:11,914 epoch 10 - iter 2084/5212 - loss 0.01667270 - time (sec): 103.79 - samples/sec: 1416.99 - lr: 0.000003 - momentum: 0.000000
2023-10-15 19:12:37,387 epoch 10 - iter 2605/5212 - loss 0.01655129 - time (sec): 129.27 - samples/sec: 1436.27 - lr: 0.000003 - momentum: 0.000000
2023-10-15 19:13:03,491 epoch 10 - iter 3126/5212 - loss 0.01639347 - time (sec): 155.37 - samples/sec: 1430.40 - lr: 0.000002 - momentum: 0.000000
2023-10-15 19:13:29,109 epoch 10 - iter 3647/5212 - loss 0.01559952 - time (sec): 180.99 - samples/sec: 1430.74 - lr: 0.000002 - momentum: 0.000000
2023-10-15 19:13:54,015 epoch 10 - iter 4168/5212 - loss 0.01527630 - time (sec): 205.89 - samples/sec: 1427.16 - lr: 0.000001 - momentum: 0.000000
2023-10-15 19:14:19,322 epoch 10 - iter 4689/5212 - loss 0.01494606 - time (sec): 231.20 - samples/sec: 1425.82 - lr: 0.000001 - momentum: 0.000000
2023-10-15 19:14:44,781 epoch 10 - iter 5210/5212 - loss 0.01478709 - time (sec): 256.66 - samples/sec: 1430.28 - lr: 0.000000 - momentum: 0.000000
2023-10-15 19:14:44,923 ----------------------------------------------------------------------------------------------------
2023-10-15 19:14:44,923 EPOCH 10 done: loss 0.0148 - lr: 0.000000
2023-10-15 19:14:53,969 DEV : loss 0.4225836992263794 - f1-score (micro avg) 0.3731
2023-10-15 19:14:54,376 ----------------------------------------------------------------------------------------------------
2023-10-15 19:14:54,377 Loading model from best epoch ...
2023-10-15 19:14:55,998 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 19:15:12,852
Results:
- F-score (micro) 0.4487
- F-score (macro) 0.2891
- Accuracy 0.2932
By class:
              precision    recall  f1-score   support

         LOC     0.5092    0.5708    0.5383      1214
         PER     0.3792    0.4059    0.3921       808
         ORG     0.2928    0.1841    0.2261       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4431    0.4544    0.4487      2390
   macro avg     0.2953    0.2902    0.2891      2390
weighted avg     0.4301    0.4544    0.4394      2390
2023-10-15 19:15:12,852 ----------------------------------------------------------------------------------------------------
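Note on the averaged scores: the micro, macro and weighted F1 values in the table follow from the rows above them. The sketch below recomputes them from the four-decimal figures printed in this log, using the standard definitions (harmonic mean of precision and recall, unweighted mean of per-class F1, support-weighted mean of per-class F1). Because the inputs are already rounded, a recomputation like this can differ in the last digit in general. The variable names are illustrative only.

```python
# Per-class rows copied from the "By class" table:
# (precision, recall, f1-score, support)
by_class = {
    "LOC": (0.5092, 0.5708, 0.5383, 1214),
    "PER": (0.3792, 0.4059, 0.3921, 808),
    "ORG": (0.2928, 0.1841, 0.2261, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}

# Micro-averaged precision and recall, copied from the "micro avg" row.
micro_precision, micro_recall = 0.4431, 0.4544


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


micro_f1 = f1(micro_precision, micro_recall)

# Macro average: every class counts equally, so HumanProd (F1 = 0)
# pulls the score down as much as LOC pulls it up.
macro_f1 = sum(row[2] for row in by_class.values()) / len(by_class)

# Weighted average: each class F1 is weighted by its support.
total_support = sum(row[3] for row in by_class.values())
weighted_f1 = sum(row[2] * row[3] for row in by_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

The gap between micro F1 (0.4487) and macro F1 (0.2891) comes mostly from the HumanProd class, which has only 15 test entities and an F1 of 0.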