2023-10-15 18:30:53,960 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,961 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 18:30:53,961 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,961 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 18:30:53,961 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,961 Train: 20847 sentences
2023-10-15 18:30:53,961 (train_with_dev=False, train_with_test=False)
2023-10-15 18:30:53,961 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,961 Training Params:
2023-10-15 18:30:53,961  - learning_rate: "5e-05"
2023-10-15 18:30:53,961  - mini_batch_size: "4"
2023-10-15 18:30:53,961  - max_epochs: "10"
2023-10-15 18:30:53,961  - shuffle: "True"
2023-10-15 18:30:53,961 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,961 Plugins:
2023-10-15 18:30:53,961  - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 18:30:53,961 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,961 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 18:30:53,961  - metric: "('micro avg', 'f1-score')"
2023-10-15 18:30:53,962 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,962 Computation:
2023-10-15 18:30:53,962  - compute on device: cuda:0
2023-10-15 18:30:53,962  - embedding storage: none
2023-10-15 18:30:53,962 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,962 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-15 18:30:53,962 ----------------------------------------------------------------------------------------------------
2023-10-15 18:30:53,962 ----------------------------------------------------------------------------------------------------
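For reference, the configuration in the header above can be reproduced with a Flair fine-tuning script along these lines. This is a minimal sketch, not the exact script that produced this log: the NER_HIPE_2022 loader arguments are assumptions, and fine_tune() is assumed to supply the AdamW optimizer and the LinearScheduler plugin seen above (AdamW has no momentum parameter, which is presumably why the momentum column below stays 0.000000).

    # Sketch of a Flair fine-tuning run matching the logged configuration (assumptions noted).
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    corpus = NER_HIPE_2022(dataset_name="newseye", language="de")  # loader args assumed
    label_dict = corpus.make_label_dictionary(label_type="ner")

    embeddings = TransformerWordEmbeddings(
        model="dbmdz/bert-base-historic-multilingual-cased",
        layers="-1",                 # last layer only, per base path "layers-1"
        subtoken_pooling="first",    # per base path "poolingfirst"
        fine_tune=True,
        use_context=False,           # per base path "wsFalse"
    )

    tagger = SequenceTagger(
        hidden_size=256,             # unused here: no RNN/CRF, only the linear head
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,               # per base path "crfFalse"
        use_rnn=False,
        reproject_embeddings=False,  # leaves Linear(768 -> 17) as printed above
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3",
        learning_rate=5e-5,
        mini_batch_size=4,
        max_epochs=10,
    )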
"hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3" 2023-10-15 18:30:53,962 ---------------------------------------------------------------------------------------------------- 2023-10-15 18:30:53,962 ---------------------------------------------------------------------------------------------------- 2023-10-15 18:31:18,728 epoch 1 - iter 521/5212 - loss 1.28960364 - time (sec): 24.77 - samples/sec: 1414.43 - lr: 0.000005 - momentum: 0.000000 2023-10-15 18:31:44,032 epoch 1 - iter 1042/5212 - loss 0.87638561 - time (sec): 50.07 - samples/sec: 1417.33 - lr: 0.000010 - momentum: 0.000000 2023-10-15 18:32:10,016 epoch 1 - iter 1563/5212 - loss 0.66894164 - time (sec): 76.05 - samples/sec: 1423.01 - lr: 0.000015 - momentum: 0.000000 2023-10-15 18:32:35,647 epoch 1 - iter 2084/5212 - loss 0.55810972 - time (sec): 101.68 - samples/sec: 1433.24 - lr: 0.000020 - momentum: 0.000000 2023-10-15 18:33:00,980 epoch 1 - iter 2605/5212 - loss 0.49511139 - time (sec): 127.02 - samples/sec: 1439.63 - lr: 0.000025 - momentum: 0.000000 2023-10-15 18:33:27,406 epoch 1 - iter 3126/5212 - loss 0.44378646 - time (sec): 153.44 - samples/sec: 1449.27 - lr: 0.000030 - momentum: 0.000000 2023-10-15 18:33:52,795 epoch 1 - iter 3647/5212 - loss 0.41309918 - time (sec): 178.83 - samples/sec: 1447.91 - lr: 0.000035 - momentum: 0.000000 2023-10-15 18:34:17,986 epoch 1 - iter 4168/5212 - loss 0.38982212 - time (sec): 204.02 - samples/sec: 1449.28 - lr: 0.000040 - momentum: 0.000000 2023-10-15 18:34:43,243 epoch 1 - iter 4689/5212 - loss 0.37676637 - time (sec): 229.28 - samples/sec: 1435.39 - lr: 0.000045 - momentum: 0.000000 2023-10-15 18:35:09,023 epoch 1 - iter 5210/5212 - loss 0.35899282 - time (sec): 255.06 - samples/sec: 1440.23 - lr: 0.000050 - momentum: 0.000000 2023-10-15 18:35:09,114 ---------------------------------------------------------------------------------------------------- 2023-10-15 18:35:09,114 EPOCH 1 done: loss 0.3590 - lr: 0.000050 2023-10-15 18:35:14,909 DEV : loss 0.1935647428035736 - f1-score (micro avg) 0.2561 2023-10-15 18:35:14,934 saving best model 2023-10-15 18:35:15,282 ---------------------------------------------------------------------------------------------------- 2023-10-15 18:35:40,738 epoch 2 - iter 521/5212 - loss 0.24636769 - time (sec): 25.45 - samples/sec: 1320.35 - lr: 0.000049 - momentum: 0.000000 2023-10-15 18:36:05,977 epoch 2 - iter 1042/5212 - loss 0.23231567 - time (sec): 50.69 - samples/sec: 1396.46 - lr: 0.000049 - momentum: 0.000000 2023-10-15 18:36:31,184 epoch 2 - iter 1563/5212 - loss 0.22395732 - time (sec): 75.90 - samples/sec: 1398.81 - lr: 0.000048 - momentum: 0.000000 2023-10-15 18:36:56,597 epoch 2 - iter 2084/5212 - loss 0.21548760 - time (sec): 101.31 - samples/sec: 1415.13 - lr: 0.000048 - momentum: 0.000000 2023-10-15 18:37:21,551 epoch 2 - iter 2605/5212 - loss 0.21176218 - time (sec): 126.27 - samples/sec: 1419.20 - lr: 0.000047 - momentum: 0.000000 2023-10-15 18:37:47,270 epoch 2 - iter 3126/5212 - loss 0.20948142 - time (sec): 151.99 - samples/sec: 1438.50 - lr: 0.000047 - momentum: 0.000000 2023-10-15 18:38:12,325 epoch 2 - iter 3647/5212 - loss 0.20703935 - time (sec): 177.04 - samples/sec: 1437.21 - lr: 0.000046 - momentum: 0.000000 2023-10-15 18:38:38,051 epoch 2 - iter 4168/5212 - loss 0.20284934 - time (sec): 202.77 - samples/sec: 1435.09 - lr: 0.000046 - momentum: 0.000000 2023-10-15 18:39:03,756 epoch 2 - iter 4689/5212 - loss 0.19868831 - time (sec): 228.47 - samples/sec: 
2023-10-15 18:35:40,738 epoch 2 - iter 521/5212 - loss 0.24636769 - time (sec): 25.45 - samples/sec: 1320.35 - lr: 0.000049 - momentum: 0.000000
2023-10-15 18:36:05,977 epoch 2 - iter 1042/5212 - loss 0.23231567 - time (sec): 50.69 - samples/sec: 1396.46 - lr: 0.000049 - momentum: 0.000000
2023-10-15 18:36:31,184 epoch 2 - iter 1563/5212 - loss 0.22395732 - time (sec): 75.90 - samples/sec: 1398.81 - lr: 0.000048 - momentum: 0.000000
2023-10-15 18:36:56,597 epoch 2 - iter 2084/5212 - loss 0.21548760 - time (sec): 101.31 - samples/sec: 1415.13 - lr: 0.000048 - momentum: 0.000000
2023-10-15 18:37:21,551 epoch 2 - iter 2605/5212 - loss 0.21176218 - time (sec): 126.27 - samples/sec: 1419.20 - lr: 0.000047 - momentum: 0.000000
2023-10-15 18:37:47,270 epoch 2 - iter 3126/5212 - loss 0.20948142 - time (sec): 151.99 - samples/sec: 1438.50 - lr: 0.000047 - momentum: 0.000000
2023-10-15 18:38:12,325 epoch 2 - iter 3647/5212 - loss 0.20703935 - time (sec): 177.04 - samples/sec: 1437.21 - lr: 0.000046 - momentum: 0.000000
2023-10-15 18:38:38,051 epoch 2 - iter 4168/5212 - loss 0.20284934 - time (sec): 202.77 - samples/sec: 1435.09 - lr: 0.000046 - momentum: 0.000000
2023-10-15 18:39:03,756 epoch 2 - iter 4689/5212 - loss 0.19868831 - time (sec): 228.47 - samples/sec: 1437.02 - lr: 0.000045 - momentum: 0.000000
2023-10-15 18:39:29,272 epoch 2 - iter 5210/5212 - loss 0.19660262 - time (sec): 253.99 - samples/sec: 1444.41 - lr: 0.000044 - momentum: 0.000000
2023-10-15 18:39:29,439 ----------------------------------------------------------------------------------------------------
2023-10-15 18:39:29,439 EPOCH 2 done: loss 0.1966 - lr: 0.000044
2023-10-15 18:39:38,418 DEV : loss 0.26646727323532104 - f1-score (micro avg) 0.3126
2023-10-15 18:39:38,445 saving best model
2023-10-15 18:39:38,908 ----------------------------------------------------------------------------------------------------
2023-10-15 18:40:03,583 epoch 3 - iter 521/5212 - loss 0.15758472 - time (sec): 24.67 - samples/sec: 1359.66 - lr: 0.000044 - momentum: 0.000000
2023-10-15 18:40:28,809 epoch 3 - iter 1042/5212 - loss 0.15326080 - time (sec): 49.90 - samples/sec: 1378.81 - lr: 0.000043 - momentum: 0.000000
2023-10-15 18:40:55,000 epoch 3 - iter 1563/5212 - loss 0.14597497 - time (sec): 76.09 - samples/sec: 1452.84 - lr: 0.000043 - momentum: 0.000000
2023-10-15 18:41:20,559 epoch 3 - iter 2084/5212 - loss 0.14065920 - time (sec): 101.65 - samples/sec: 1454.24 - lr: 0.000042 - momentum: 0.000000
2023-10-15 18:41:45,973 epoch 3 - iter 2605/5212 - loss 0.14592720 - time (sec): 127.06 - samples/sec: 1454.39 - lr: 0.000042 - momentum: 0.000000
2023-10-15 18:42:10,959 epoch 3 - iter 3126/5212 - loss 0.14605254 - time (sec): 152.05 - samples/sec: 1458.06 - lr: 0.000041 - momentum: 0.000000
2023-10-15 18:42:36,131 epoch 3 - iter 3647/5212 - loss 0.14458467 - time (sec): 177.22 - samples/sec: 1455.44 - lr: 0.000041 - momentum: 0.000000
2023-10-15 18:43:01,039 epoch 3 - iter 4168/5212 - loss 0.14831744 - time (sec): 202.13 - samples/sec: 1448.82 - lr: 0.000040 - momentum: 0.000000
2023-10-15 18:43:26,000 epoch 3 - iter 4689/5212 - loss 0.14722580 - time (sec): 227.09 - samples/sec: 1449.99 - lr: 0.000039 - momentum: 0.000000
2023-10-15 18:43:51,506 epoch 3 - iter 5210/5212 - loss 0.14520606 - time (sec): 252.60 - samples/sec: 1454.29 - lr: 0.000039 - momentum: 0.000000
2023-10-15 18:43:51,594 ----------------------------------------------------------------------------------------------------
2023-10-15 18:43:51,594 EPOCH 3 done: loss 0.1452 - lr: 0.000039
2023-10-15 18:43:59,942 DEV : loss 0.21102237701416016 - f1-score (micro avg) 0.3177
2023-10-15 18:43:59,971 saving best model
2023-10-15 18:44:00,404 ----------------------------------------------------------------------------------------------------
2023-10-15 18:44:26,791 epoch 4 - iter 521/5212 - loss 0.09694449 - time (sec): 26.39 - samples/sec: 1405.34 - lr: 0.000038 - momentum: 0.000000
2023-10-15 18:44:51,970 epoch 4 - iter 1042/5212 - loss 0.10160415 - time (sec): 51.56 - samples/sec: 1458.36 - lr: 0.000038 - momentum: 0.000000
2023-10-15 18:45:17,568 epoch 4 - iter 1563/5212 - loss 0.10470997 - time (sec): 77.16 - samples/sec: 1440.66 - lr: 0.000037 - momentum: 0.000000
2023-10-15 18:45:42,943 epoch 4 - iter 2084/5212 - loss 0.10504412 - time (sec): 102.54 - samples/sec: 1437.98 - lr: 0.000037 - momentum: 0.000000
2023-10-15 18:46:08,464 epoch 4 - iter 2605/5212 - loss 0.10205242 - time (sec): 128.06 - samples/sec: 1436.72 - lr: 0.000036 - momentum: 0.000000
2023-10-15 18:46:34,112 epoch 4 - iter 3126/5212 - loss 0.10491793 - time (sec): 153.71 - samples/sec: 1442.53 - lr: 0.000036 - momentum: 0.000000
2023-10-15 18:46:59,262 epoch 4 - iter 3647/5212 - loss 0.10491333 - time (sec): 178.86 - samples/sec: 1451.17 - lr: 0.000035 - momentum: 0.000000
2023-10-15 18:47:24,773 epoch 4 - iter 4168/5212 - loss 0.11104350 - time (sec): 204.37 - samples/sec: 1443.22 - lr: 0.000034 - momentum: 0.000000
2023-10-15 18:47:49,932 epoch 4 - iter 4689/5212 - loss 0.11023634 - time (sec): 229.53 - samples/sec: 1438.50 - lr: 0.000034 - momentum: 0.000000
2023-10-15 18:48:15,080 epoch 4 - iter 5210/5212 - loss 0.11005524 - time (sec): 254.67 - samples/sec: 1442.59 - lr: 0.000033 - momentum: 0.000000
2023-10-15 18:48:15,165 ----------------------------------------------------------------------------------------------------
2023-10-15 18:48:15,165 EPOCH 4 done: loss 0.1100 - lr: 0.000033
2023-10-15 18:48:23,650 DEV : loss 0.3510819673538208 - f1-score (micro avg) 0.3189
2023-10-15 18:48:23,695 saving best model
2023-10-15 18:48:24,183 ----------------------------------------------------------------------------------------------------
2023-10-15 18:48:51,942 epoch 5 - iter 521/5212 - loss 0.08171099 - time (sec): 27.76 - samples/sec: 1437.83 - lr: 0.000033 - momentum: 0.000000
2023-10-15 18:49:17,391 epoch 5 - iter 1042/5212 - loss 0.07891529 - time (sec): 53.21 - samples/sec: 1429.22 - lr: 0.000032 - momentum: 0.000000
2023-10-15 18:49:43,253 epoch 5 - iter 1563/5212 - loss 0.08091614 - time (sec): 79.07 - samples/sec: 1415.21 - lr: 0.000032 - momentum: 0.000000
2023-10-15 18:50:08,706 epoch 5 - iter 2084/5212 - loss 0.07786060 - time (sec): 104.52 - samples/sec: 1436.25 - lr: 0.000031 - momentum: 0.000000
2023-10-15 18:50:33,799 epoch 5 - iter 2605/5212 - loss 0.07747685 - time (sec): 129.61 - samples/sec: 1443.70 - lr: 0.000031 - momentum: 0.000000
2023-10-15 18:50:58,814 epoch 5 - iter 3126/5212 - loss 0.07663910 - time (sec): 154.63 - samples/sec: 1438.00 - lr: 0.000030 - momentum: 0.000000
2023-10-15 18:51:23,868 epoch 5 - iter 3647/5212 - loss 0.07608571 - time (sec): 179.68 - samples/sec: 1428.61 - lr: 0.000029 - momentum: 0.000000
2023-10-15 18:51:49,382 epoch 5 - iter 4168/5212 - loss 0.07640161 - time (sec): 205.20 - samples/sec: 1432.32 - lr: 0.000029 - momentum: 0.000000
2023-10-15 18:52:14,940 epoch 5 - iter 4689/5212 - loss 0.07755494 - time (sec): 230.76 - samples/sec: 1433.82 - lr: 0.000028 - momentum: 0.000000
2023-10-15 18:52:39,947 epoch 5 - iter 5210/5212 - loss 0.07842196 - time (sec): 255.76 - samples/sec: 1436.38 - lr: 0.000028 - momentum: 0.000000
2023-10-15 18:52:40,037 ----------------------------------------------------------------------------------------------------
2023-10-15 18:52:40,037 EPOCH 5 done: loss 0.0784 - lr: 0.000028
2023-10-15 18:52:48,525 DEV : loss 0.2648324966430664 - f1-score (micro avg) 0.3525
2023-10-15 18:52:48,571 saving best model
2023-10-15 18:52:49,098 ----------------------------------------------------------------------------------------------------
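A "saving best model" line appears only when the dev micro-F1 (the selection metric named in the header) beats the best score so far; the rising dev loss is ignored. A minimal sketch of that rule, assumed from the log rather than taken from Flair's source, replayed against all ten dev scores of this run (epochs 6 to 10 follow below):

    def epochs_with_checkpoint(dev_f1_by_epoch: list[float]) -> list[int]:
        """Return the 1-based epochs where a new best dev micro-F1 triggers a save (sketch)."""
        best, saved = float("-inf"), []
        for epoch, f1 in enumerate(dev_f1_by_epoch, start=1):
            if f1 > best:
                best = f1
                saved.append(epoch)
        return saved

    # Dev micro-F1 per epoch as printed in this log:
    dev_f1 = [0.2561, 0.3126, 0.3177, 0.3189, 0.3525, 0.3479, 0.351, 0.3845, 0.3527, 0.3731]
    print(epochs_with_checkpoint(dev_f1))  # [1, 2, 3, 4, 5, 8], matching the "saving best model" lines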
2023-10-15 18:53:15,489 epoch 6 - iter 521/5212 - loss 0.05230408 - time (sec): 26.39 - samples/sec: 1360.55 - lr: 0.000027 - momentum: 0.000000
2023-10-15 18:53:40,907 epoch 6 - iter 1042/5212 - loss 0.05845790 - time (sec): 51.81 - samples/sec: 1434.46 - lr: 0.000027 - momentum: 0.000000
2023-10-15 18:54:07,100 epoch 6 - iter 1563/5212 - loss 0.06310501 - time (sec): 78.00 - samples/sec: 1439.16 - lr: 0.000026 - momentum: 0.000000
2023-10-15 18:54:32,649 epoch 6 - iter 2084/5212 - loss 0.06369300 - time (sec): 103.55 - samples/sec: 1433.91 - lr: 0.000026 - momentum: 0.000000
2023-10-15 18:54:57,100 epoch 6 - iter 2605/5212 - loss 0.06667131 - time (sec): 128.00 - samples/sec: 1423.71 - lr: 0.000025 - momentum: 0.000000
2023-10-15 18:55:22,500 epoch 6 - iter 3126/5212 - loss 0.06631037 - time (sec): 153.40 - samples/sec: 1441.54 - lr: 0.000024 - momentum: 0.000000
2023-10-15 18:55:48,175 epoch 6 - iter 3647/5212 - loss 0.06540271 - time (sec): 179.07 - samples/sec: 1434.50 - lr: 0.000024 - momentum: 0.000000
2023-10-15 18:56:13,599 epoch 6 - iter 4168/5212 - loss 0.06511901 - time (sec): 204.50 - samples/sec: 1439.13 - lr: 0.000023 - momentum: 0.000000
2023-10-15 18:56:38,437 epoch 6 - iter 4689/5212 - loss 0.06326195 - time (sec): 229.34 - samples/sec: 1443.32 - lr: 0.000023 - momentum: 0.000000
2023-10-15 18:57:03,252 epoch 6 - iter 5210/5212 - loss 0.06322970 - time (sec): 254.15 - samples/sec: 1444.22 - lr: 0.000022 - momentum: 0.000000
2023-10-15 18:57:03,445 ----------------------------------------------------------------------------------------------------
2023-10-15 18:57:03,446 EPOCH 6 done: loss 0.0632 - lr: 0.000022
2023-10-15 18:57:11,710 DEV : loss 0.3284428119659424 - f1-score (micro avg) 0.3479
2023-10-15 18:57:11,739 ----------------------------------------------------------------------------------------------------
2023-10-15 18:57:36,385 epoch 7 - iter 521/5212 - loss 0.04602036 - time (sec): 24.65 - samples/sec: 1375.30 - lr: 0.000022 - momentum: 0.000000
2023-10-15 18:58:01,491 epoch 7 - iter 1042/5212 - loss 0.04163037 - time (sec): 49.75 - samples/sec: 1426.66 - lr: 0.000021 - momentum: 0.000000
2023-10-15 18:58:26,685 epoch 7 - iter 1563/5212 - loss 0.04302604 - time (sec): 74.94 - samples/sec: 1400.93 - lr: 0.000021 - momentum: 0.000000
2023-10-15 18:58:52,373 epoch 7 - iter 2084/5212 - loss 0.04090613 - time (sec): 100.63 - samples/sec: 1421.62 - lr: 0.000020 - momentum: 0.000000
2023-10-15 18:59:18,332 epoch 7 - iter 2605/5212 - loss 0.04297160 - time (sec): 126.59 - samples/sec: 1428.63 - lr: 0.000019 - momentum: 0.000000
2023-10-15 18:59:43,473 epoch 7 - iter 3126/5212 - loss 0.04222747 - time (sec): 151.73 - samples/sec: 1430.24 - lr: 0.000019 - momentum: 0.000000
2023-10-15 19:00:08,729 epoch 7 - iter 3647/5212 - loss 0.04304694 - time (sec): 176.99 - samples/sec: 1441.97 - lr: 0.000018 - momentum: 0.000000
2023-10-15 19:00:33,766 epoch 7 - iter 4168/5212 - loss 0.04368892 - time (sec): 202.03 - samples/sec: 1449.26 - lr: 0.000018 - momentum: 0.000000
2023-10-15 19:00:59,308 epoch 7 - iter 4689/5212 - loss 0.04297916 - time (sec): 227.57 - samples/sec: 1447.31 - lr: 0.000017 - momentum: 0.000000
2023-10-15 19:01:24,566 epoch 7 - iter 5210/5212 - loss 0.04365791 - time (sec): 252.83 - samples/sec: 1453.14 - lr: 0.000017 - momentum: 0.000000
2023-10-15 19:01:24,659 ----------------------------------------------------------------------------------------------------
2023-10-15 19:01:24,659 EPOCH 7 done: loss 0.0437 - lr: 0.000017
2023-10-15 19:01:33,790 DEV : loss 0.3804240822792053 - f1-score (micro avg) 0.351
2023-10-15 19:01:33,821 ----------------------------------------------------------------------------------------------------
2023-10-15 19:01:59,298 epoch 8 - iter 521/5212 - loss 0.03278593 - time (sec): 25.48 - samples/sec: 1544.77 - lr: 0.000016 - momentum: 0.000000
2023-10-15 19:02:25,536 epoch 8 - iter 1042/5212 - loss 0.03015265 - time (sec): 51.71 - samples/sec: 1503.92 - lr: 0.000016 - momentum: 0.000000
2023-10-15 19:02:52,003 epoch 8 - iter 1563/5212 - loss 0.03141214 - time (sec): 78.18 - samples/sec: 1470.41 - lr: 0.000015 - momentum: 0.000000
2023-10-15 19:03:17,168 epoch 8 - iter 2084/5212 - loss 0.03397435 - time (sec): 103.35 - samples/sec: 1450.36 - lr: 0.000014 - momentum: 0.000000
2023-10-15 19:03:42,358 epoch 8 - iter 2605/5212 - loss 0.03614872 - time (sec): 128.54 - samples/sec: 1440.92 - lr: 0.000014 - momentum: 0.000000
2023-10-15 19:04:08,045 epoch 8 - iter 3126/5212 - loss 0.03527472 - time (sec): 154.22 - samples/sec: 1436.27 - lr: 0.000013 - momentum: 0.000000
2023-10-15 19:04:34,122 epoch 8 - iter 3647/5212 - loss 0.03459124 - time (sec): 180.30 - samples/sec: 1439.85 - lr: 0.000013 - momentum: 0.000000
2023-10-15 19:04:58,912 epoch 8 - iter 4168/5212 - loss 0.03421361 - time (sec): 205.09 - samples/sec: 1434.35 - lr: 0.000012 - momentum: 0.000000
2023-10-15 19:05:23,753 epoch 8 - iter 4689/5212 - loss 0.03359174 - time (sec): 229.93 - samples/sec: 1436.19 - lr: 0.000012 - momentum: 0.000000
2023-10-15 19:05:48,530 epoch 8 - iter 5210/5212 - loss 0.03303597 - time (sec): 254.71 - samples/sec: 1440.90 - lr: 0.000011 - momentum: 0.000000
2023-10-15 19:05:48,643 ----------------------------------------------------------------------------------------------------
2023-10-15 19:05:48,643 EPOCH 8 done: loss 0.0330 - lr: 0.000011
2023-10-15 19:05:58,972 DEV : loss 0.3673393130302429 - f1-score (micro avg) 0.3845
2023-10-15 19:05:59,018 saving best model
2023-10-15 19:05:59,528 ----------------------------------------------------------------------------------------------------
2023-10-15 19:06:28,356 epoch 9 - iter 521/5212 - loss 0.02522660 - time (sec): 28.82 - samples/sec: 1234.63 - lr: 0.000011 - momentum: 0.000000
2023-10-15 19:06:55,424 epoch 9 - iter 1042/5212 - loss 0.02383534 - time (sec): 55.89 - samples/sec: 1282.26 - lr: 0.000010 - momentum: 0.000000
2023-10-15 19:07:20,832 epoch 9 - iter 1563/5212 - loss 0.02195053 - time (sec): 81.30 - samples/sec: 1331.42 - lr: 0.000009 - momentum: 0.000000
2023-10-15 19:07:45,994 epoch 9 - iter 2084/5212 - loss 0.02261221 - time (sec): 106.46 - samples/sec: 1360.19 - lr: 0.000009 - momentum: 0.000000
2023-10-15 19:08:11,005 epoch 9 - iter 2605/5212 - loss 0.02502914 - time (sec): 131.47 - samples/sec: 1370.83 - lr: 0.000008 - momentum: 0.000000
2023-10-15 19:08:37,062 epoch 9 - iter 3126/5212 - loss 0.02521692 - time (sec): 157.53 - samples/sec: 1390.80 - lr: 0.000008 - momentum: 0.000000
2023-10-15 19:09:02,395 epoch 9 - iter 3647/5212 - loss 0.02518572 - time (sec): 182.86 - samples/sec: 1397.05 - lr: 0.000007 - momentum: 0.000000
2023-10-15 19:09:27,771 epoch 9 - iter 4168/5212 - loss 0.02436796 - time (sec): 208.24 - samples/sec: 1400.34 - lr: 0.000007 - momentum: 0.000000
2023-10-15 19:09:53,231 epoch 9 - iter 4689/5212 - loss 0.02383592 - time (sec): 233.70 - samples/sec: 1411.35 - lr: 0.000006 - momentum: 0.000000
2023-10-15 19:10:18,608 epoch 9 - iter 5210/5212 - loss 0.02324373 - time (sec): 259.08 - samples/sec: 1417.78 - lr: 0.000006 - momentum: 0.000000
2023-10-15 19:10:18,697 ----------------------------------------------------------------------------------------------------
2023-10-15 19:10:18,697 EPOCH 9 done: loss 0.0232 - lr: 0.000006
2023-10-15 19:10:28,083 DEV : loss 0.4655354917049408 - f1-score (micro avg) 0.3527
2023-10-15 19:10:28,119 ----------------------------------------------------------------------------------------------------
2023-10-15 19:10:55,276 epoch 10 - iter 521/5212 - loss 0.02072785 - time (sec): 27.16 - samples/sec: 1322.25 - lr: 0.000005 - momentum: 0.000000
2023-10-15 19:11:20,967 epoch 10 - iter 1042/5212 - loss 0.01880289 - time (sec): 52.85 - samples/sec: 1386.39 - lr: 0.000004 - momentum: 0.000000
2023-10-15 19:11:46,669 epoch 10 - iter 1563/5212 - loss 0.01711965 - time (sec): 78.55 - samples/sec: 1407.78 - lr: 0.000004 - momentum: 0.000000
2023-10-15 19:12:11,914 epoch 10 - iter 2084/5212 - loss 0.01667270 - time (sec): 103.79 - samples/sec: 1416.99 - lr: 0.000003 - momentum: 0.000000
2023-10-15 19:12:37,387 epoch 10 - iter 2605/5212 - loss 0.01655129 - time (sec): 129.27 - samples/sec: 1436.27 - lr: 0.000003 - momentum: 0.000000
2023-10-15 19:13:03,491 epoch 10 - iter 3126/5212 - loss 0.01639347 - time (sec): 155.37 - samples/sec: 1430.40 - lr: 0.000002 - momentum: 0.000000
2023-10-15 19:13:29,109 epoch 10 - iter 3647/5212 - loss 0.01559952 - time (sec): 180.99 - samples/sec: 1430.74 - lr: 0.000002 - momentum: 0.000000
2023-10-15 19:13:54,015 epoch 10 - iter 4168/5212 - loss 0.01527630 - time (sec): 205.89 - samples/sec: 1427.16 - lr: 0.000001 - momentum: 0.000000
2023-10-15 19:14:19,322 epoch 10 - iter 4689/5212 - loss 0.01494606 - time (sec): 231.20 - samples/sec: 1425.82 - lr: 0.000001 - momentum: 0.000000
2023-10-15 19:14:44,781 epoch 10 - iter 5210/5212 - loss 0.01478709 - time (sec): 256.66 - samples/sec: 1430.28 - lr: 0.000000 - momentum: 0.000000
2023-10-15 19:14:44,923 ----------------------------------------------------------------------------------------------------
2023-10-15 19:14:44,923 EPOCH 10 done: loss 0.0148 - lr: 0.000000
2023-10-15 19:14:53,969 DEV : loss 0.4225836992263794 - f1-score (micro avg) 0.3731
2023-10-15 19:14:54,376 ----------------------------------------------------------------------------------------------------
2023-10-15 19:14:54,377 Loading model from best epoch ...
2023-10-15 19:14:55,998 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 19:15:12,852 Results:
- F-score (micro) 0.4487
- F-score (macro) 0.2891
- Accuracy 0.2932

By class:
              precision    recall  f1-score   support

         LOC     0.5092    0.5708    0.5383      1214
         PER     0.3792    0.4059    0.3921       808
         ORG     0.2928    0.1841    0.2261       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4431    0.4544    0.4487      2390
   macro avg     0.2953    0.2902    0.2891      2390
weighted avg     0.4301    0.4544    0.4394      2390

2023-10-15 19:15:12,852 ----------------------------------------------------------------------------------------------------
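The macro F-score is the unweighted mean over the four classes and is dragged down by HumanProd (support 15, F1 0.0000): (0.5383 + 0.3921 + 0.2261 + 0.0000) / 4 = 0.2891, whereas the micro average pools all 2390 gold spans. To use the checkpoint, standard Flair loading and prediction looks like the following sketch; predict() decodes the BIOES tags listed above (S-/B-/E-/I- per class, plus O) back into spans. The German example sentence is invented:

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Load the checkpoint selected during training (path relative to the base path above).
    tagger = SequenceTagger.load(
        "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
    )

    sentence = Sentence("Kaiser Wilhelm besuchte gestern Wien .")  # invented example
    tagger.predict(sentence)
    for span in sentence.get_spans("ner"):
        print(span.text, span.tag, f"{span.score:.2f}")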