2023-10-15 20:24:44,673 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,674 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 20:24:44,674 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 Train:  20847 sentences
2023-10-15 20:24:44,675 (train_with_dev=False, train_with_test=False)
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 Training Params:
2023-10-15 20:24:44,675 - learning_rate: "3e-05"
2023-10-15 20:24:44,675 - mini_batch_size: "4"
2023-10-15 20:24:44,675 - max_epochs: "10"
2023-10-15 20:24:44,675 - shuffle: "True"
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 Plugins:
2023-10-15 20:24:44,675 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 20:24:44,675 - metric: "('micro avg', 'f1-score')"
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 Computation:
2023-10-15 20:24:44,675 - compute on device: cuda:0
2023-10-15 20:24:44,675 - embedding storage: none
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:25:09,747 epoch 1 - iter 521/5212 - loss 1.56165169 - time (sec): 25.07 - samples/sec: 1394.68 - lr: 0.000003 - momentum: 0.000000
2023-10-15 20:25:35,194 epoch 1 - iter 1042/5212 - loss 0.99845314 - time (sec): 50.52 - samples/sec: 1454.90 - lr: 0.000006 - momentum: 0.000000
2023-10-15 20:26:00,007 epoch 1 - iter 1563/5212 - loss 0.77898021 - time (sec): 75.33 - samples/sec: 1445.20 - lr: 0.000009 - momentum: 0.000000
2023-10-15 20:26:25,375 epoch 1 - iter 2084/5212 - loss 0.65478772 - time (sec): 100.70 - samples/sec: 1439.43 - lr: 0.000012 - momentum: 0.000000
2023-10-15 20:26:50,728 epoch 1 - iter 2605/5212 - loss 0.57005146 - time (sec): 126.05 - samples/sec: 1457.86 - lr: 0.000015 - momentum: 0.000000
2023-10-15 20:27:15,361 epoch 1 - iter 3126/5212 - loss 0.51705867 - time (sec): 150.68 - samples/sec: 1452.88 - lr: 0.000018 - momentum: 0.000000
2023-10-15 20:27:40,262 epoch 1 - iter 3647/5212 - loss 0.47447550 - time (sec): 175.59 - samples/sec: 1451.31 - lr: 0.000021 - momentum: 0.000000
2023-10-15 20:28:05,433 epoch 1 - iter 4168/5212 - loss 0.43833701 - time (sec): 200.76 - samples/sec: 1455.45 - lr: 0.000024 - momentum: 0.000000
2023-10-15 20:28:30,449 epoch 1 - iter 4689/5212 - loss 0.41315675 - time (sec): 225.77 - samples/sec: 1455.37 - lr: 0.000027 - momentum: 0.000000
2023-10-15 20:28:55,738 epoch 1 - iter 5210/5212 - loss 0.39022797 - time (sec): 251.06 - samples/sec: 1463.30 - lr: 0.000030 - momentum: 0.000000
2023-10-15 20:28:55,827 ----------------------------------------------------------------------------------------------------
2023-10-15 20:28:55,827 EPOCH 1 done: loss 0.3902 - lr: 0.000030
2023-10-15 20:29:01,505 DEV : loss 0.13643652200698853 - f1-score (micro avg) 0.2701
2023-10-15 20:29:01,533 saving best model
2023-10-15 20:29:01,913 ----------------------------------------------------------------------------------------------------
2023-10-15 20:29:27,078 epoch 2 - iter 521/5212 - loss 0.19544052 - time (sec): 25.16 - samples/sec: 1514.57 - lr: 0.000030 - momentum: 0.000000
2023-10-15 20:29:52,579 epoch 2 - iter 1042/5212 - loss 0.17228581 - time (sec): 50.66 - samples/sec: 1505.74 - lr: 0.000029 - momentum: 0.000000
2023-10-15 20:30:18,392 epoch 2 - iter 1563/5212 - loss 0.17467943 - time (sec): 76.48 - samples/sec: 1484.11 - lr: 0.000029 - momentum: 0.000000
2023-10-15 20:30:43,402 epoch 2 - iter 2084/5212 - loss 0.17589538 - time (sec): 101.49 - samples/sec: 1461.85 - lr: 0.000029 - momentum: 0.000000
2023-10-15 20:31:08,492 epoch 2 - iter 2605/5212 - loss 0.17733174 - time (sec): 126.58 - samples/sec: 1466.95 - lr: 0.000028 - momentum: 0.000000
2023-10-15 20:31:33,239 epoch 2 - iter 3126/5212 - loss 0.17324370 - time (sec): 151.32 - samples/sec: 1462.37 - lr: 0.000028 - momentum: 0.000000
2023-10-15 20:31:58,539 epoch 2 - iter 3647/5212 - loss 0.16913298 - time (sec): 176.62 - samples/sec: 1471.29 - lr: 0.000028 - momentum: 0.000000
2023-10-15 20:32:23,133 epoch 2 - iter 4168/5212 - loss 0.17368275 - time (sec): 201.22 - samples/sec: 1467.38 - lr: 0.000027 - momentum: 0.000000
2023-10-15 20:32:48,579 epoch 2 - iter 4689/5212 - loss 0.17139323 - time (sec): 226.66 - samples/sec: 1472.89 - lr: 0.000027 - momentum: 0.000000
2023-10-15 20:33:13,124 epoch 2 - iter 5210/5212 - loss 0.16968841 - time (sec): 251.21 - samples/sec: 1462.47 - lr: 0.000027 - momentum: 0.000000
2023-10-15 20:33:13,213 ----------------------------------------------------------------------------------------------------
2023-10-15 20:33:13,213 EPOCH 2 done: loss 0.1697 - lr: 0.000027
2023-10-15 20:33:21,474 DEV : loss 0.1600484699010849 - f1-score (micro avg) 0.369
2023-10-15 20:33:21,503 saving best model
2023-10-15 20:33:21,991 ----------------------------------------------------------------------------------------------------
2023-10-15 20:33:46,977 epoch 3 - iter 521/5212 - loss 0.11625189 - time (sec): 24.98 - samples/sec: 1460.19 - lr: 0.000026 - momentum: 0.000000
2023-10-15 20:34:12,053 epoch 3 - iter 1042/5212 - loss 0.11731984 - time (sec): 50.06 - samples/sec: 1460.97 - lr: 0.000026 - momentum: 0.000000
2023-10-15 20:34:37,239 epoch 3 - iter 1563/5212 - loss 0.11673671 - time (sec): 75.24 - samples/sec: 1468.35 - lr: 0.000026 - momentum: 0.000000
2023-10-15 20:35:03,000 epoch 3 - iter 2084/5212 - loss 0.12067893 - time (sec): 101.01 - samples/sec: 1469.90 - lr: 0.000025 - momentum: 0.000000
2023-10-15 20:35:28,469 epoch 3 - iter 2605/5212 - loss 0.11697764 - time (sec): 126.47 - samples/sec: 1462.26 - lr: 0.000025 - momentum: 0.000000
2023-10-15 20:35:53,355 epoch 3 - iter 3126/5212 - loss 0.11647880 - time (sec): 151.36 - samples/sec: 1458.11 - lr: 0.000025 - momentum: 0.000000
2023-10-15 20:36:19,263 epoch 3 - iter 3647/5212 - loss 0.11797656 - time (sec): 177.27 - samples/sec: 1448.30 - lr: 0.000024 - momentum: 0.000000
2023-10-15 20:36:44,778 epoch 3 - iter 4168/5212 - loss 0.11532415 - time (sec): 202.78 - samples/sec: 1457.64 - lr: 0.000024 - momentum: 0.000000
2023-10-15 20:37:09,921 epoch 3 - iter 4689/5212 - loss 0.11597611 - time (sec): 227.93 - samples/sec: 1458.52 - lr: 0.000024 - momentum: 0.000000
2023-10-15 20:37:34,894 epoch 3 - iter 5210/5212 - loss 0.11657433 - time (sec): 252.90 - samples/sec: 1452.74 - lr: 0.000023 - momentum: 0.000000
2023-10-15 20:37:34,980 ----------------------------------------------------------------------------------------------------
2023-10-15 20:37:34,980 EPOCH 3 done: loss 0.1166 - lr: 0.000023
2023-10-15 20:37:43,168 DEV : loss 0.22019526362419128 - f1-score (micro avg) 0.3173
2023-10-15 20:37:43,195 ----------------------------------------------------------------------------------------------------
2023-10-15 20:38:08,655 epoch 4 - iter 521/5212 - loss 0.09517878 - time (sec): 25.46 - samples/sec: 1442.09 - lr: 0.000023 - momentum: 0.000000
2023-10-15 20:38:33,547 epoch 4 - iter 1042/5212 - loss 0.08447031 - time (sec): 50.35 - samples/sec: 1416.98 - lr: 0.000023 - momentum: 0.000000
2023-10-15 20:38:58,367 epoch 4 - iter 1563/5212 - loss 0.08700323 - time (sec): 75.17 - samples/sec: 1429.05 - lr: 0.000022 - momentum: 0.000000
2023-10-15 20:39:23,658 epoch 4 - iter 2084/5212 - loss 0.08555122 - time (sec): 100.46 - samples/sec: 1439.08 - lr: 0.000022 - momentum: 0.000000
2023-10-15 20:39:48,375 epoch 4 - iter 2605/5212 - loss 0.08406529 - time (sec): 125.18 - samples/sec: 1438.20 - lr: 0.000022 - momentum: 0.000000
2023-10-15 20:40:13,097 epoch 4 - iter 3126/5212 - loss 0.08735544 - time (sec): 149.90 - samples/sec: 1440.09 - lr: 0.000021 - momentum: 0.000000
2023-10-15 20:40:38,425 epoch 4 - iter 3647/5212 - loss 0.08619736 - time (sec): 175.23 - samples/sec: 1451.28 - lr: 0.000021 - momentum: 0.000000
2023-10-15 20:41:03,491 epoch 4 - iter 4168/5212 - loss 0.08622321 - time (sec): 200.30 - samples/sec: 1447.99 - lr: 0.000021 - momentum: 0.000000
2023-10-15 20:41:28,778 epoch 4 - iter 4689/5212 - loss 0.08588628 - time (sec): 225.58 - samples/sec: 1454.70 - lr: 0.000020 - momentum: 0.000000
2023-10-15 20:41:54,667 epoch 4 - iter 5210/5212 - loss 0.08579389 - time (sec): 251.47 - samples/sec: 1460.90 - lr: 0.000020 - momentum: 0.000000
2023-10-15 20:41:54,753 ----------------------------------------------------------------------------------------------------
2023-10-15 20:41:54,753 EPOCH 4 done: loss 0.0858 - lr: 0.000020
2023-10-15 20:42:03,901 DEV : loss 0.28183305263519287 - f1-score (micro avg) 0.3478
2023-10-15 20:42:03,930 ----------------------------------------------------------------------------------------------------
2023-10-15 20:42:28,675 epoch 5 - iter 521/5212 - loss 0.05241740 - time (sec): 24.74 - samples/sec: 1389.53 - lr: 0.000020 - momentum: 0.000000
2023-10-15 20:42:53,630 epoch 5 - iter 1042/5212 - loss 0.06367030 - time (sec): 49.70 - samples/sec: 1399.06 - lr: 0.000019 - momentum: 0.000000
2023-10-15 20:43:18,460 epoch 5 - iter 1563/5212 - loss 0.06209608 - time (sec): 74.53 - samples/sec: 1410.98 - lr: 0.000019 - momentum: 0.000000
2023-10-15 20:43:43,337 epoch 5 - iter 2084/5212 - loss 0.06304772 - time (sec): 99.41 - samples/sec: 1431.86 - lr: 0.000019 - momentum: 0.000000
2023-10-15 20:44:08,826 epoch 5 - iter 2605/5212 - loss 0.06257189 - time (sec): 124.89 - samples/sec: 1441.74 - lr: 0.000018 - momentum: 0.000000
2023-10-15 20:44:34,138 epoch 5 - iter 3126/5212 - loss 0.06061267 - time (sec): 150.21 - samples/sec: 1435.46 - lr: 0.000018 - momentum: 0.000000
2023-10-15 20:44:59,301 epoch 5 - iter 3647/5212 - loss 0.06083740 - time (sec): 175.37 - samples/sec: 1443.21 - lr: 0.000018 - momentum: 0.000000
2023-10-15 20:45:25,043 epoch 5 - iter 4168/5212 - loss 0.05950603 - time (sec): 201.11 - samples/sec: 1448.76 - lr: 0.000017 - momentum: 0.000000
2023-10-15 20:45:50,728 epoch 5 - iter 4689/5212 - loss 0.05942891 - time (sec): 226.80 - samples/sec: 1451.75 - lr: 0.000017 - momentum: 0.000000
2023-10-15 20:46:16,246 epoch 5 - iter 5210/5212 - loss 0.05999733 - time (sec): 252.31 - samples/sec: 1455.71 - lr: 0.000017 - momentum: 0.000000
2023-10-15 20:46:16,342 ----------------------------------------------------------------------------------------------------
2023-10-15 20:46:16,342 EPOCH 5 done: loss 0.0600 - lr: 0.000017
2023-10-15 20:46:25,496 DEV : loss 0.28720250725746155 - f1-score (micro avg) 0.3655
2023-10-15 20:46:25,523 ----------------------------------------------------------------------------------------------------
2023-10-15 20:46:50,940 epoch 6 - iter 521/5212 - loss 0.04731132 - time (sec): 25.42 - samples/sec: 1475.22 - lr: 0.000016 - momentum: 0.000000
2023-10-15 20:47:16,372 epoch 6 - iter 1042/5212 - loss 0.04736510 - time (sec): 50.85 - samples/sec: 1494.69 - lr: 0.000016 - momentum: 0.000000
2023-10-15 20:47:41,147 epoch 6 - iter 1563/5212 - loss 0.04572425 - time (sec): 75.62 - samples/sec: 1473.37 - lr: 0.000016 - momentum: 0.000000
2023-10-15 20:48:06,986 epoch 6 - iter 2084/5212 - loss 0.04405961 - time (sec): 101.46 - samples/sec: 1485.66 - lr: 0.000015 - momentum: 0.000000
2023-10-15 20:48:32,376 epoch 6 - iter 2605/5212 - loss 0.04285739 - time (sec): 126.85 - samples/sec: 1474.66 - lr: 0.000015 - momentum: 0.000000
2023-10-15 20:48:57,879 epoch 6 - iter 3126/5212 - loss 0.04390113 - time (sec): 152.35 - samples/sec: 1465.47 - lr: 0.000015 - momentum: 0.000000
2023-10-15 20:49:22,624 epoch 6 - iter 3647/5212 - loss 0.04383399 - time (sec): 177.10 - samples/sec: 1456.57 - lr: 0.000014 - momentum: 0.000000
2023-10-15 20:49:47,726 epoch 6 - iter 4168/5212 - loss 0.04360406 - time (sec): 202.20 - samples/sec: 1451.05 - lr: 0.000014 - momentum: 0.000000
2023-10-15 20:50:12,908 epoch 6 - iter 4689/5212 - loss 0.04400334 - time (sec): 227.38 - samples/sec: 1444.97 - lr: 0.000014 - momentum: 0.000000
2023-10-15 20:50:38,364 epoch 6 - iter 5210/5212 - loss 0.04464236 - time (sec): 252.84 - samples/sec: 1451.74 - lr: 0.000013 - momentum: 0.000000
2023-10-15 20:50:38,495 ----------------------------------------------------------------------------------------------------
2023-10-15 20:50:38,495 EPOCH 6 done: loss 0.0446 - lr: 0.000013
2023-10-15 20:50:47,694 DEV : loss 0.3866041302680969 - f1-score (micro avg) 0.3652
2023-10-15 20:50:47,722 ----------------------------------------------------------------------------------------------------
2023-10-15 20:51:13,075 epoch 7 - iter 521/5212 - loss 0.03788773 - time (sec): 25.35 - samples/sec: 1488.17 - lr: 0.000013 - momentum: 0.000000
2023-10-15 20:51:38,952 epoch 7 - iter 1042/5212 - loss 0.03141041 - time (sec): 51.23 - samples/sec: 1481.15 - lr: 0.000013 - momentum: 0.000000
2023-10-15 20:52:04,004 epoch 7 - iter 1563/5212 - loss 0.03509067 - time (sec): 76.28 - samples/sec: 1464.43 - lr: 0.000012 - momentum: 0.000000
2023-10-15 20:52:29,033 epoch 7 - iter 2084/5212 - loss 0.03393019 - time (sec): 101.31 - samples/sec: 1436.78 - lr: 0.000012 - momentum: 0.000000
2023-10-15 20:52:54,318 epoch 7 - iter 2605/5212 - loss 0.03383844 - time (sec): 126.60 - samples/sec: 1448.48 - lr: 0.000012 - momentum: 0.000000
2023-10-15 20:53:19,320 epoch 7 - iter 3126/5212 - loss 0.03348699 - time (sec): 151.60 - samples/sec: 1441.89 - lr: 0.000011 - momentum: 0.000000
2023-10-15 20:53:45,054 epoch 7 - iter 3647/5212 - loss 0.03338128 - time (sec): 177.33 - samples/sec: 1455.10 - lr: 0.000011 - momentum: 0.000000
2023-10-15 20:54:09,864 epoch 7 - iter 4168/5212 - loss 0.03278061 - time (sec): 202.14 - samples/sec: 1446.19 - lr: 0.000011 - momentum: 0.000000
2023-10-15 20:54:35,559 epoch 7 - iter 4689/5212 - loss 0.03275384 - time (sec): 227.84 - samples/sec: 1453.23 - lr: 0.000010 - momentum: 0.000000
2023-10-15 20:55:00,583 epoch 7 - iter 5210/5212 - loss 0.03182261 - time (sec): 252.86 - samples/sec: 1452.89 - lr: 0.000010 - momentum: 0.000000
2023-10-15 20:55:00,671 ----------------------------------------------------------------------------------------------------
2023-10-15 20:55:00,671 EPOCH 7 done: loss 0.0318 - lr: 0.000010
2023-10-15 20:55:09,772 DEV : loss 0.41479361057281494 - f1-score (micro avg) 0.3719
2023-10-15 20:55:09,801 saving best model
2023-10-15 20:55:10,286 ----------------------------------------------------------------------------------------------------
2023-10-15 20:55:34,976 epoch 8 - iter 521/5212 - loss 0.02874987 - time (sec): 24.69 - samples/sec: 1382.35 - lr: 0.000010 - momentum: 0.000000
2023-10-15 20:56:00,292 epoch 8 - iter 1042/5212 - loss 0.02974470 - time (sec): 50.00 - samples/sec: 1456.42 - lr: 0.000009 - momentum: 0.000000
2023-10-15 20:56:25,436 epoch 8 - iter 1563/5212 - loss 0.02516966 - time (sec): 75.15 - samples/sec: 1449.58 - lr: 0.000009 - momentum: 0.000000
2023-10-15 20:56:50,608 epoch 8 - iter 2084/5212 - loss 0.02526379 - time (sec): 100.32 - samples/sec: 1446.32 - lr: 0.000009 - momentum: 0.000000
2023-10-15 20:57:16,033 epoch 8 - iter 2605/5212 - loss 0.02435580 - time (sec): 125.74 - samples/sec: 1454.15 - lr: 0.000008 - momentum: 0.000000
2023-10-15 20:57:41,452 epoch 8 - iter 3126/5212 - loss 0.02388305 - time (sec): 151.16 - samples/sec: 1460.49 - lr: 0.000008 - momentum: 0.000000
2023-10-15 20:58:06,478 epoch 8 - iter 3647/5212 - loss 0.02325843 - time (sec): 176.19 - samples/sec: 1462.53 - lr: 0.000008 - momentum: 0.000000
2023-10-15 20:58:31,384 epoch 8 - iter 4168/5212 - loss 0.02355797 - time (sec): 201.10 - samples/sec: 1462.23 - lr: 0.000007 - momentum: 0.000000
2023-10-15 20:58:56,660 epoch 8 - iter 4689/5212 - loss 0.02339324 - time (sec): 226.37 - samples/sec: 1462.32 - lr: 0.000007 - momentum: 0.000000
2023-10-15 20:59:21,654 epoch 8 - iter 5210/5212 - loss 0.02336518 - time (sec): 251.37 - samples/sec: 1460.88 - lr: 0.000007 - momentum: 0.000000
2023-10-15 20:59:21,754 ----------------------------------------------------------------------------------------------------
2023-10-15 20:59:21,754 EPOCH 8 done: loss 0.0234 - lr: 0.000007
2023-10-15 20:59:30,902 DEV : loss 0.42180541157722473 - f1-score (micro avg) 0.3741
2023-10-15 20:59:30,930 saving best model
2023-10-15 20:59:31,418 ----------------------------------------------------------------------------------------------------
2023-10-15 20:59:57,079 epoch 9 - iter 521/5212 - loss 0.01219894 - time (sec): 25.66 - samples/sec: 1547.72 - lr: 0.000006 - momentum: 0.000000
2023-10-15 21:00:22,613 epoch 9 - iter 1042/5212 - loss 0.01351483 - time (sec): 51.19 - samples/sec: 1525.14 - lr: 0.000006 - momentum: 0.000000
2023-10-15 21:00:47,949 epoch 9 - iter 1563/5212 - loss 0.01411036 - time (sec): 76.53 - samples/sec: 1511.90 - lr: 0.000006 - momentum: 0.000000
2023-10-15 21:01:12,968 epoch 9 - iter 2084/5212 - loss 0.01442847 - time (sec): 101.55 - samples/sec: 1496.00 - lr: 0.000005 - momentum: 0.000000
2023-10-15 21:01:38,021 epoch 9 - iter 2605/5212 - loss 0.01486565 - time (sec): 126.60 - samples/sec: 1486.57 - lr: 0.000005 - momentum: 0.000000
2023-10-15 21:02:02,741 epoch 9 - iter 3126/5212 - loss 0.01496493 - time (sec): 151.32 - samples/sec: 1460.15 - lr: 0.000005 - momentum: 0.000000
2023-10-15 21:02:28,122 epoch 9 - iter 3647/5212 - loss 0.01502140 - time (sec): 176.70 - samples/sec: 1462.06 - lr: 0.000004 - momentum: 0.000000
2023-10-15 21:02:52,991 epoch 9 - iter 4168/5212 - loss 0.01484723 - time (sec): 201.57 - samples/sec: 1461.58 - lr: 0.000004 - momentum: 0.000000
2023-10-15 21:03:18,183 epoch 9 - iter 4689/5212 - loss 0.01495762 - time (sec): 226.76 - samples/sec: 1460.46 - lr: 0.000004 - momentum: 0.000000
2023-10-15 21:03:43,356 epoch 9 - iter 5210/5212 - loss 0.01475329 - time (sec): 251.93 - samples/sec: 1458.16 - lr: 0.000003 - momentum: 0.000000
2023-10-15 21:03:43,445 ----------------------------------------------------------------------------------------------------
2023-10-15 21:03:43,445 EPOCH 9 done: loss 0.0148 - lr: 0.000003
2023-10-15 21:03:52,572 DEV : loss 0.4861517548561096 - f1-score (micro avg) 0.3626
2023-10-15 21:03:52,599 ----------------------------------------------------------------------------------------------------
2023-10-15 21:04:17,540 epoch 10 - iter 521/5212 - loss 0.00895462 - time (sec): 24.94 - samples/sec: 1423.74 - lr: 0.000003 - momentum: 0.000000
2023-10-15 21:04:42,571 epoch 10 - iter 1042/5212 - loss 0.01126073 - time (sec): 49.97 - samples/sec: 1388.03 - lr: 0.000003 - momentum: 0.000000
2023-10-15 21:05:07,649 epoch 10 - iter 1563/5212 - loss 0.01142583 - time (sec): 75.05 - samples/sec: 1420.18 - lr: 0.000002 - momentum: 0.000000
2023-10-15 21:05:32,542 epoch 10 - iter 2084/5212 - loss 0.01092774 - time (sec): 99.94 - samples/sec: 1426.42 - lr: 0.000002 - momentum: 0.000000
2023-10-15 21:05:57,462 epoch 10 - iter 2605/5212 - loss 0.01134639 - time (sec): 124.86 - samples/sec: 1432.03 - lr: 0.000002 - momentum: 0.000000
2023-10-15 21:06:22,580 epoch 10 - iter 3126/5212 - loss 0.01131287 - time (sec): 149.98 - samples/sec: 1436.28 - lr: 0.000001 - momentum: 0.000000
2023-10-15 21:06:47,748 epoch 10 - iter 3647/5212 - loss 0.01122214 - time (sec): 175.15 - samples/sec: 1442.52 - lr: 0.000001 - momentum: 0.000000
2023-10-15 21:07:13,252 epoch 10 - iter 4168/5212 - loss 0.01080290 - time (sec): 200.65 - samples/sec: 1452.51 - lr: 0.000001 - momentum: 0.000000
2023-10-15 21:07:38,300 epoch 10 - iter 4689/5212 - loss 0.01104492 - time (sec): 225.70 - samples/sec: 1450.67 - lr: 0.000000 - momentum: 0.000000
2023-10-15 21:08:03,995 epoch 10 - iter 5210/5212 - loss 0.01096840 - time (sec): 251.39 - samples/sec: 1461.12 - lr: 0.000000 - momentum: 0.000000
2023-10-15 21:08:04,082 ----------------------------------------------------------------------------------------------------
2023-10-15 21:08:04,082 EPOCH 10 done: loss 0.0110 - lr: 0.000000
2023-10-15 21:08:12,303 DEV : loss 0.4902326166629791 - f1-score (micro avg) 0.3638
2023-10-15 21:08:12,728 ----------------------------------------------------------------------------------------------------
2023-10-15 21:08:12,729 Loading model from best epoch ...
2023-10-15 21:08:14,255 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 21:08:30,524 
Results:
- F-score (micro) 0.4907
- F-score (macro) 0.3334
- Accuracy 0.3301

By class:
              precision    recall  f1-score   support

         LOC     0.5226    0.6203    0.5672      1214
         PER     0.4030    0.5371    0.4605       808
         ORG     0.3028    0.3088    0.3058       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4481    0.5423    0.4907      2390
   macro avg     0.3071    0.3665    0.3334      2390
weighted avg     0.4464    0.5423    0.4890      2390

2023-10-15 21:08:30,524 ----------------------------------------------------------------------------------------------------
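The lr column above follows the LinearScheduler plugin (warmup_fraction 0.1 over 10 epochs of 5212 mini-batches: lr ramps from 0 to 3e-05 during epoch 1, then decays linearly to 0). A minimal sketch of that schedule, assuming simple linear warmup/decay; the function name `linear_schedule_lr` is illustrative, not Flair's API:

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero (illustrative)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total_steps = 10 * 5212  # 10 epochs x 5212 mini-batches per epoch

# warmup_fraction 0.1 makes the warmup exactly one epoch long
print(round(linear_schedule_lr(521, total_steps, 3e-05), 6))    # 3e-06, as logged at epoch 1 iter 521
print(round(linear_schedule_lr(5212, total_steps, 3e-05), 6))   # 3e-05, peak at the end of epoch 1
print(round(linear_schedule_lr(52120, total_steps, 3e-05), 6))  # 0.0, end of training
```

This reproduces the lr values in the log to the six decimal places it prints.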
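As a consistency check on the final table, each f1-score column entry is the harmonic mean of the precision and recall on its row; a short sketch verifying this for the micro-averaged row:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro-averaged precision and recall from the final evaluation table
print(round(f1(0.4481, 0.5423), 4))  # 0.4907, the reported micro F-score
```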