2023-10-15 13:16:10,272 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Train: 20847 sentences
2023-10-15 13:16:10,273 (train_with_dev=False, train_with_test=False)
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Training Params:
2023-10-15 13:16:10,273 - learning_rate: "5e-05"
2023-10-15 13:16:10,273 - mini_batch_size: "4"
2023-10-15 13:16:10,273 - max_epochs: "10"
2023-10-15 13:16:10,273 - shuffle: "True"
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Plugins:
2023-10-15 13:16:10,273 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 13:16:10,273 - metric: "('micro avg', 'f1-score')"
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Computation:
2023-10-15 13:16:10,273 - compute on device: cuda:0
2023-10-15 13:16:10,273 - embedding storage: none
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:35,448 epoch 1 - iter 521/5212 - loss 1.40362758 - time (sec): 25.17 - samples/sec: 1455.51 - lr: 0.000005 - momentum: 0.000000
2023-10-15 13:17:00,458 epoch 1 - iter 1042/5212 - loss 0.89883300 - time (sec): 50.18 - samples/sec: 1460.27 - lr: 0.000010 - momentum: 0.000000
2023-10-15 13:17:25,878 epoch 1 - iter 1563/5212 - loss 0.67555628 - time (sec): 75.60 - samples/sec: 1488.02 - lr: 0.000015 - momentum: 0.000000
2023-10-15 13:17:50,812 epoch 1 - iter 2084/5212 - loss 0.57238927 - time (sec): 100.54 - samples/sec: 1477.04 - lr: 0.000020 - momentum: 0.000000
2023-10-15 13:18:16,704 epoch 1 - iter 2605/5212 - loss 0.49502228 - time (sec): 126.43 - samples/sec: 1492.43 - lr: 0.000025 - momentum: 0.000000
2023-10-15 13:18:42,337 epoch 1 - iter 3126/5212 - loss 0.45192361 - time (sec): 152.06 - samples/sec: 1483.23 - lr: 0.000030 - momentum: 0.000000
2023-10-15 13:19:08,050 epoch 1 - iter 3647/5212 - loss 0.42525465 - time (sec): 177.78 - samples/sec: 1458.09 - lr: 0.000035 - momentum: 0.000000
2023-10-15 13:19:37,108 epoch 1 - iter 4168/5212 - loss 0.40135861 - time (sec): 206.83 - samples/sec: 1438.06 - lr: 0.000040 - momentum: 0.000000
2023-10-15 13:20:03,562 epoch 1 - iter 4689/5212 - loss 0.38395729 - time (sec): 233.29 - samples/sec: 1429.03 - lr: 0.000045 - momentum: 0.000000
2023-10-15 13:20:30,449 epoch 1 - iter 5210/5212 - loss 0.37262070 - time (sec): 260.17 - samples/sec: 1411.90 - lr: 0.000050 - momentum: 0.000000
2023-10-15 13:20:30,541 ----------------------------------------------------------------------------------------------------
2023-10-15 13:20:30,542 EPOCH 1 done: loss 0.3725 - lr: 0.000050
2023-10-15 13:20:36,434 DEV : loss 0.11529939621686935 - f1-score (micro avg) 0.2459
2023-10-15 13:20:36,460 saving best model
2023-10-15 13:20:36,927 ----------------------------------------------------------------------------------------------------
2023-10-15 13:21:02,702 epoch 2 - iter 521/5212 - loss 0.19455856 - time (sec): 25.77 - samples/sec: 1478.33 - lr: 0.000049 - momentum: 0.000000
2023-10-15 13:21:28,495 epoch 2 - iter 1042/5212 - loss 0.20837326 - time (sec): 51.57 - samples/sec: 1394.34 - lr: 0.000049 - momentum: 0.000000
2023-10-15 13:21:56,301 epoch 2 - iter 1563/5212 - loss 0.19711389 - time (sec): 79.37 - samples/sec: 1372.97 - lr: 0.000048 - momentum: 0.000000
2023-10-15 13:22:22,909 epoch 2 - iter 2084/5212 - loss 0.19827196 - time (sec): 105.98 - samples/sec: 1387.68 - lr: 0.000048 - momentum: 0.000000
2023-10-15 13:22:50,780 epoch 2 - iter 2605/5212 - loss 0.19554159 - time (sec): 133.85 - samples/sec: 1376.38 - lr: 0.000047 - momentum: 0.000000
2023-10-15 13:23:16,034 epoch 2 - iter 3126/5212 - loss 0.19262159 - time (sec): 159.10 - samples/sec: 1367.76 - lr: 0.000047 - momentum: 0.000000
2023-10-15 13:23:41,047 epoch 2 - iter 3647/5212 - loss 0.19419779 - time (sec): 184.12 - samples/sec: 1389.83 - lr: 0.000046 - momentum: 0.000000
2023-10-15 13:24:06,335 epoch 2 - iter 4168/5212 - loss 0.19161465 - time (sec): 209.41 - samples/sec: 1401.90 - lr: 0.000046 - momentum: 0.000000
2023-10-15 13:24:34,437 epoch 2 - iter 4689/5212 - loss 0.19121284 - time (sec): 237.51 - samples/sec: 1404.82 - lr: 0.000045 - momentum: 0.000000
2023-10-15 13:25:02,829 epoch 2 - iter 5210/5212 - loss 0.19317263 - time (sec): 265.90 - samples/sec: 1381.36 - lr: 0.000044 - momentum: 0.000000
2023-10-15 13:25:02,932 ----------------------------------------------------------------------------------------------------
2023-10-15 13:25:02,932 EPOCH 2 done: loss 0.1931 - lr: 0.000044
2023-10-15 13:25:12,816 DEV : loss 0.17544493079185486 - f1-score (micro avg) 0.3654
2023-10-15 13:25:12,852 saving best model
2023-10-15 13:25:13,376 ----------------------------------------------------------------------------------------------------
2023-10-15 13:25:39,265 epoch 3 - iter 521/5212 - loss 0.15145173 - time (sec): 25.88 - samples/sec: 1458.63 - lr: 0.000044 - momentum: 0.000000
2023-10-15 13:26:04,335 epoch 3 - iter 1042/5212 - loss 0.15678187 - time (sec): 50.95 - samples/sec: 1438.61 - lr: 0.000043 - momentum: 0.000000
2023-10-15 13:26:30,991 epoch 3 - iter 1563/5212 - loss 0.15448299 - time (sec): 77.61 - samples/sec: 1389.38 - lr: 0.000043 - momentum: 0.000000
2023-10-15 13:26:56,639 epoch 3 - iter 2084/5212 - loss 0.15527194 - time (sec): 103.26 - samples/sec: 1392.24 - lr: 0.000042 - momentum: 0.000000
2023-10-15 13:27:25,339 epoch 3 - iter 2605/5212 - loss 0.15240117 - time (sec): 131.96 - samples/sec: 1368.28 - lr: 0.000042 - momentum: 0.000000
2023-10-15 13:27:54,043 epoch 3 - iter 3126/5212 - loss 0.14898541 - time (sec): 160.66 - samples/sec: 1356.81 - lr: 0.000041 - momentum: 0.000000
2023-10-15 13:28:19,990 epoch 3 - iter 3647/5212 - loss 0.14601801 - time (sec): 186.61 - samples/sec: 1379.17 - lr: 0.000041 - momentum: 0.000000
2023-10-15 13:28:47,027 epoch 3 - iter 4168/5212 - loss 0.14444689 - time (sec): 213.65 - samples/sec: 1371.63 - lr: 0.000040 - momentum: 0.000000
2023-10-15 13:29:14,124 epoch 3 - iter 4689/5212 - loss 0.14484556 - time (sec): 240.74 - samples/sec: 1374.81 - lr: 0.000039 - momentum: 0.000000
2023-10-15 13:29:40,292 epoch 3 - iter 5210/5212 - loss 0.14315344 - time (sec): 266.91 - samples/sec: 1376.40 - lr: 0.000039 - momentum: 0.000000
2023-10-15 13:29:40,389 ----------------------------------------------------------------------------------------------------
2023-10-15 13:29:40,389 EPOCH 3 done: loss 0.1431 - lr: 0.000039
2023-10-15 13:29:49,797 DEV : loss 0.23793098330497742 - f1-score (micro avg) 0.3084
2023-10-15 13:29:49,824 ----------------------------------------------------------------------------------------------------
2023-10-15 13:30:14,394 epoch 4 - iter 521/5212 - loss 0.11205687 - time (sec): 24.57 - samples/sec: 1434.64 - lr: 0.000038 - momentum: 0.000000
2023-10-15 13:30:39,652 epoch 4 - iter 1042/5212 - loss 0.10513806 - time (sec): 49.83 - samples/sec: 1419.82 - lr: 0.000038 - momentum: 0.000000
2023-10-15 13:31:05,128 epoch 4 - iter 1563/5212 - loss 0.11011535 - time (sec): 75.30 - samples/sec: 1448.52 - lr: 0.000037 - momentum: 0.000000
2023-10-15 13:31:30,130 epoch 4 - iter 2084/5212 - loss 0.11135520 - time (sec): 100.31 - samples/sec: 1455.61 - lr: 0.000037 - momentum: 0.000000
2023-10-15 13:31:54,730 epoch 4 - iter 2605/5212 - loss 0.11031903 - time (sec): 124.91 - samples/sec: 1449.83 - lr: 0.000036 - momentum: 0.000000
2023-10-15 13:32:19,870 epoch 4 - iter 3126/5212 - loss 0.10935951 - time (sec): 150.05 - samples/sec: 1461.11 - lr: 0.000036 - momentum: 0.000000
2023-10-15 13:32:44,907 epoch 4 - iter 3647/5212 - loss 0.10865845 - time (sec): 175.08 - samples/sec: 1459.09 - lr: 0.000035 - momentum: 0.000000
2023-10-15 13:33:10,118 epoch 4 - iter 4168/5212 - loss 0.10703262 - time (sec): 200.29 - samples/sec: 1465.07 - lr: 0.000034 - momentum: 0.000000
2023-10-15 13:33:35,323 epoch 4 - iter 4689/5212 - loss 0.10572271 - time (sec): 225.50 - samples/sec: 1457.14 - lr: 0.000034 - momentum: 0.000000
2023-10-15 13:34:01,141 epoch 4 - iter 5210/5212 - loss 0.10526184 - time (sec): 251.32 - samples/sec: 1461.60 - lr: 0.000033 - momentum: 0.000000
2023-10-15 13:34:01,232 ----------------------------------------------------------------------------------------------------
2023-10-15 13:34:01,233 EPOCH 4 done: loss 0.1053 - lr: 0.000033
2023-10-15 13:34:09,661 DEV : loss 0.23926250636577606 - f1-score (micro avg) 0.2991
2023-10-15 13:34:09,704 ----------------------------------------------------------------------------------------------------
2023-10-15 13:34:36,513 epoch 5 - iter 521/5212 - loss 0.08552624 - time (sec): 26.81 - samples/sec: 1340.65 - lr: 0.000033 - momentum: 0.000000
2023-10-15 13:35:01,593 epoch 5 - iter 1042/5212 - loss 0.07856810 - time (sec): 51.89 - samples/sec: 1363.29 - lr: 0.000032 - momentum: 0.000000
2023-10-15 13:35:26,794 epoch 5 - iter 1563/5212 - loss 0.08133709 - time (sec): 77.09 - samples/sec: 1389.39 - lr: 0.000032 - momentum: 0.000000
2023-10-15 13:35:52,056 epoch 5 - iter 2084/5212 - loss 0.08333790 - time (sec): 102.35 - samples/sec: 1406.19 - lr: 0.000031 - momentum: 0.000000
2023-10-15 13:36:17,357 epoch 5 - iter 2605/5212 - loss 0.08277969 - time (sec): 127.65 - samples/sec: 1409.26 - lr: 0.000031 - momentum: 0.000000
2023-10-15 13:36:43,034 epoch 5 - iter 3126/5212 - loss 0.08273593 - time (sec): 153.33 - samples/sec: 1420.12 - lr: 0.000030 - momentum: 0.000000
2023-10-15 13:37:08,780 epoch 5 - iter 3647/5212 - loss 0.08084529 - time (sec): 179.07 - samples/sec: 1425.88 - lr: 0.000029 - momentum: 0.000000
2023-10-15 13:37:34,843 epoch 5 - iter 4168/5212 - loss 0.08016018 - time (sec): 205.14 - samples/sec: 1436.62 - lr: 0.000029 - momentum: 0.000000
2023-10-15 13:38:00,799 epoch 5 - iter 4689/5212 - loss 0.07996177 - time (sec): 231.09 - samples/sec: 1442.23 - lr: 0.000028 - momentum: 0.000000
2023-10-15 13:38:25,393 epoch 5 - iter 5210/5212 - loss 0.08003152 - time (sec): 255.69 - samples/sec: 1436.83 - lr: 0.000028 - momentum: 0.000000
2023-10-15 13:38:25,484 ----------------------------------------------------------------------------------------------------
2023-10-15 13:38:25,484 EPOCH 5 done: loss 0.0800 - lr: 0.000028
2023-10-15 13:38:33,739 DEV : loss 0.3471781313419342 - f1-score (micro avg) 0.3072
2023-10-15 13:38:33,768 ----------------------------------------------------------------------------------------------------
2023-10-15 13:38:59,130 epoch 6 - iter 521/5212 - loss 0.04893610 - time (sec): 25.36 - samples/sec: 1550.29 - lr: 0.000027 - momentum: 0.000000
2023-10-15 13:39:24,147 epoch 6 - iter 1042/5212 - loss 0.05092956 - time (sec): 50.38 - samples/sec: 1516.20 - lr: 0.000027 - momentum: 0.000000
2023-10-15 13:39:50,524 epoch 6 - iter 1563/5212 - loss 0.05455800 - time (sec): 76.76 - samples/sec: 1509.47 - lr: 0.000026 - momentum: 0.000000
2023-10-15 13:40:16,811 epoch 6 - iter 2084/5212 - loss 0.05559083 - time (sec): 103.04 - samples/sec: 1460.28 - lr: 0.000026 - momentum: 0.000000
2023-10-15 13:40:41,525 epoch 6 - iter 2605/5212 - loss 0.05732132 - time (sec): 127.76 - samples/sec: 1434.33 - lr: 0.000025 - momentum: 0.000000
2023-10-15 13:41:07,304 epoch 6 - iter 3126/5212 - loss 0.05727400 - time (sec): 153.54 - samples/sec: 1440.88 - lr: 0.000024 - momentum: 0.000000
2023-10-15 13:41:32,019 epoch 6 - iter 3647/5212 - loss 0.05654496 - time (sec): 178.25 - samples/sec: 1428.82 - lr: 0.000024 - momentum: 0.000000
2023-10-15 13:41:57,953 epoch 6 - iter 4168/5212 - loss 0.05631661 - time (sec): 204.18 - samples/sec: 1442.93 - lr: 0.000023 - momentum: 0.000000
2023-10-15 13:42:23,594 epoch 6 - iter 4689/5212 - loss 0.05593561 - time (sec): 229.82 - samples/sec: 1445.76 - lr: 0.000023 - momentum: 0.000000
2023-10-15 13:42:48,376 epoch 6 - iter 5210/5212 - loss 0.05663169 - time (sec): 254.61 - samples/sec: 1442.84 - lr: 0.000022 - momentum: 0.000000
2023-10-15 13:42:48,471 ----------------------------------------------------------------------------------------------------
2023-10-15 13:42:48,471 EPOCH 6 done: loss 0.0566 - lr: 0.000022
2023-10-15 13:42:56,767 DEV : loss 0.3654634654521942 - f1-score (micro avg) 0.3514
2023-10-15 13:42:56,796 ----------------------------------------------------------------------------------------------------
2023-10-15 13:43:21,792 epoch 7 - iter 521/5212 - loss 0.03312481 - time (sec): 25.00 - samples/sec: 1410.93 - lr: 0.000022 - momentum: 0.000000
2023-10-15 13:43:47,410 epoch 7 - iter 1042/5212 - loss 0.03774246 - time (sec): 50.61 - samples/sec: 1425.50 - lr: 0.000021 - momentum: 0.000000
2023-10-15 13:44:12,732 epoch 7 - iter 1563/5212 - loss 0.04380652 - time (sec): 75.94 - samples/sec: 1443.44 - lr: 0.000021 - momentum: 0.000000
2023-10-15 13:44:37,716 epoch 7 - iter 2084/5212 - loss 0.04357735 - time (sec): 100.92 - samples/sec: 1453.08 - lr: 0.000020 - momentum: 0.000000
2023-10-15 13:45:02,712 epoch 7 - iter 2605/5212 - loss 0.04260736 - time (sec): 125.92 - samples/sec: 1453.59 - lr: 0.000019 - momentum: 0.000000
2023-10-15 13:45:27,992 epoch 7 - iter 3126/5212 - loss 0.04472481 - time (sec): 151.20 - samples/sec: 1454.29 - lr: 0.000019 - momentum: 0.000000
2023-10-15 13:45:54,498 epoch 7 - iter 3647/5212 - loss 0.04429013 - time (sec): 177.70 - samples/sec: 1454.84 - lr: 0.000018 - momentum: 0.000000
2023-10-15 13:46:19,450 epoch 7 - iter 4168/5212 - loss 0.04403234 - time (sec): 202.65 - samples/sec: 1452.76 - lr: 0.000018 - momentum: 0.000000
2023-10-15 13:46:44,374 epoch 7 - iter 4689/5212 - loss 0.04448230 - time (sec): 227.58 - samples/sec: 1451.83 - lr: 0.000017 - momentum: 0.000000
2023-10-15 13:47:09,346 epoch 7 - iter 5210/5212 - loss 0.04403781 - time (sec): 252.55 - samples/sec: 1453.77 - lr: 0.000017 - momentum: 0.000000
2023-10-15 13:47:09,455 ----------------------------------------------------------------------------------------------------
2023-10-15 13:47:09,455 EPOCH 7 done: loss 0.0440 - lr: 0.000017
2023-10-15 13:47:17,889 DEV : loss 0.34585681557655334 - f1-score (micro avg) 0.3311
2023-10-15 13:47:17,919 ----------------------------------------------------------------------------------------------------
2023-10-15 13:47:43,267 epoch 8 - iter 521/5212 - loss 0.02689241 - time (sec): 25.35 - samples/sec: 1476.10 - lr: 0.000016 - momentum: 0.000000
2023-10-15 13:48:08,186 epoch 8 - iter 1042/5212 - loss 0.02653419 - time (sec): 50.27 - samples/sec: 1445.14 - lr: 0.000016 - momentum: 0.000000
2023-10-15 13:48:33,822 epoch 8 - iter 1563/5212 - loss 0.02637713 - time (sec): 75.90 - samples/sec: 1441.02 - lr: 0.000015 - momentum: 0.000000
2023-10-15 13:48:59,594 epoch 8 - iter 2084/5212 - loss 0.02733328 - time (sec): 101.67 - samples/sec: 1424.32 - lr: 0.000014 - momentum: 0.000000
2023-10-15 13:49:24,714 epoch 8 - iter 2605/5212 - loss 0.02880069 - time (sec): 126.79 - samples/sec: 1426.83 - lr: 0.000014 - momentum: 0.000000
2023-10-15 13:49:51,571 epoch 8 - iter 3126/5212 - loss 0.02954909 - time (sec): 153.65 - samples/sec: 1435.69 - lr: 0.000013 - momentum: 0.000000
2023-10-15 13:50:16,561 epoch 8 - iter 3647/5212 - loss 0.03040527 - time (sec): 178.64 - samples/sec: 1444.08 - lr: 0.000013 - momentum: 0.000000
2023-10-15 13:50:41,614 epoch 8 - iter 4168/5212 - loss 0.03036981 - time (sec): 203.69 - samples/sec: 1438.53 - lr: 0.000012 - momentum: 0.000000
2023-10-15 13:51:06,615 epoch 8 - iter 4689/5212 - loss 0.02995803 - time (sec): 228.69 - samples/sec: 1442.97 - lr: 0.000012 - momentum: 0.000000
2023-10-15 13:51:33,708 epoch 8 - iter 5210/5212 - loss 0.03001161 - time (sec): 255.79 - samples/sec: 1435.88 - lr: 0.000011 - momentum: 0.000000
2023-10-15 13:51:33,815 ----------------------------------------------------------------------------------------------------
2023-10-15 13:51:33,816 EPOCH 8 done: loss 0.0300 - lr: 0.000011
2023-10-15 13:51:42,835 DEV : loss 0.43034762144088745 - f1-score (micro avg) 0.3493
2023-10-15 13:51:42,865 ----------------------------------------------------------------------------------------------------
2023-10-15 13:52:08,590 epoch 9 - iter 521/5212 - loss 0.02763096 - time (sec): 25.72 - samples/sec: 1534.61 - lr: 0.000011 - momentum: 0.000000
2023-10-15 13:52:33,552 epoch 9 - iter 1042/5212 - loss 0.02207242 - time (sec): 50.69 - samples/sec: 1519.59 - lr: 0.000010 - momentum: 0.000000
2023-10-15 13:52:58,597 epoch 9 - iter 1563/5212 - loss 0.02236283 - time (sec): 75.73 - samples/sec: 1460.68 - lr: 0.000009 - momentum: 0.000000
2023-10-15 13:53:23,759 epoch 9 - iter 2084/5212 - loss 0.02177159 - time (sec): 100.89 - samples/sec: 1469.42 - lr: 0.000009 - momentum: 0.000000
2023-10-15 13:53:49,224 epoch 9 - iter 2605/5212 - loss 0.02114199 - time (sec): 126.36 - samples/sec: 1462.95 - lr: 0.000008 - momentum: 0.000000
2023-10-15 13:54:14,015 epoch 9 - iter 3126/5212 - loss 0.02090215 - time (sec): 151.15 - samples/sec: 1458.83 - lr: 0.000008 - momentum: 0.000000
2023-10-15 13:54:39,333 epoch 9 - iter 3647/5212 - loss 0.02100536 - time (sec): 176.47 - samples/sec: 1454.71 - lr: 0.000007 - momentum: 0.000000
2023-10-15 13:55:04,484 epoch 9 - iter 4168/5212 - loss 0.02104071 - time (sec): 201.62 - samples/sec: 1453.99 - lr: 0.000007 - momentum: 0.000000
2023-10-15 13:55:30,243 epoch 9 - iter 4689/5212 - loss 0.02069058 - time (sec): 227.38 - samples/sec: 1439.28 - lr: 0.000006 - momentum: 0.000000
2023-10-15 13:55:56,291 epoch 9 - iter 5210/5212 - loss 0.02049929 - time (sec): 253.42 - samples/sec: 1449.41 - lr: 0.000006 - momentum: 0.000000
2023-10-15 13:55:56,382 ----------------------------------------------------------------------------------------------------
2023-10-15 13:55:56,383 EPOCH 9 done: loss 0.0205 - lr: 0.000006
2023-10-15 13:56:05,381 DEV : loss 0.5122669339179993 - f1-score (micro avg) 0.3381
2023-10-15 13:56:05,407 ----------------------------------------------------------------------------------------------------
2023-10-15 13:56:30,432 epoch 10 - iter 521/5212 - loss 0.01178875 - time (sec): 25.02 - samples/sec: 1530.92 - lr: 0.000005 - momentum: 0.000000
2023-10-15 13:56:55,856 epoch 10 - iter 1042/5212 - loss 0.01468480 - time (sec): 50.45 - samples/sec: 1507.44 - lr: 0.000004 - momentum: 0.000000
2023-10-15 13:57:21,319 epoch 10 - iter 1563/5212 - loss 0.01459527 - time (sec): 75.91 - samples/sec: 1516.01 - lr: 0.000004 - momentum: 0.000000
2023-10-15 13:57:46,537 epoch 10 - iter 2084/5212 - loss 0.01485247 - time (sec): 101.13 - samples/sec: 1506.02 - lr: 0.000003 - momentum: 0.000000
2023-10-15 13:58:11,468 epoch 10 - iter 2605/5212 - loss 0.01444242 - time (sec): 126.06 - samples/sec: 1494.49 - lr: 0.000003 - momentum: 0.000000
2023-10-15 13:58:36,105 epoch 10 - iter 3126/5212 - loss 0.01442687 - time (sec): 150.70 - samples/sec: 1471.55 - lr: 0.000002 - momentum: 0.000000
2023-10-15 13:59:01,128 epoch 10 - iter 3647/5212 - loss 0.01393897 - time (sec): 175.72 - samples/sec: 1466.96 - lr: 0.000002 - momentum: 0.000000
2023-10-15 13:59:26,149 epoch 10 - iter 4168/5212 - loss 0.01439542 - time (sec): 200.74 - samples/sec: 1466.86 - lr: 0.000001 - momentum: 0.000000
2023-10-15 13:59:51,374 epoch 10 - iter 4689/5212 - loss 0.01427652 - time (sec): 225.97 - samples/sec: 1470.47 - lr: 0.000001 - momentum: 0.000000
2023-10-15 14:00:15,042 epoch 10 - iter 5210/5212 - loss 0.01384805 - time (sec): 249.63 - samples/sec: 1471.63 - lr: 0.000000 - momentum: 0.000000
2023-10-15 14:00:15,126 ----------------------------------------------------------------------------------------------------
2023-10-15 14:00:15,126 EPOCH 10 done: loss 0.0138 - lr: 0.000000
2023-10-15 14:00:24,139 DEV : loss 0.46498292684555054 - f1-score (micro avg) 0.3499
2023-10-15 14:00:24,544 ----------------------------------------------------------------------------------------------------
2023-10-15 14:00:24,545 Loading model from best epoch ...
2023-10-15 14:00:26,145 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 14:00:42,346
Results:
- F-score (micro) 0.2834
- F-score (macro) 0.1787
- Accuracy 0.166

By class:
              precision    recall  f1-score   support

         LOC     0.3432    0.3534    0.3482      1214
         PER     0.3094    0.2030    0.2451       808
         ORG     0.1345    0.1105    0.1213       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3053    0.2644    0.2834      2390
   macro avg     0.1968    0.1667    0.1787      2390
weighted avg     0.2988    0.2644    0.2777      2390

2023-10-15 14:00:42,346 ----------------------------------------------------------------------------------------------------
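
Note on the averaged rows of the final report: the sketch below re-derives the micro, macro, and weighted F1 values from the per-class numbers logged above. It is a plain-Python consistency check, not the evaluation code that produced the log. The names `by_class` and `f1` are hypothetical helpers introduced for illustration, and the micro precision and recall are taken from the log rather than recomputed from raw counts, which the log does not contain.

```python
# Per-class scores copied from the "By class" table of the final evaluation.
# Each value is (precision, recall, f1, support). `by_class` is a name
# chosen for this sketch only.
by_class = {
    "LOC": (0.3432, 0.3534, 0.3482, 1214),
    "PER": (0.3094, 0.2030, 0.2451, 808),
    "ORG": (0.1345, 0.1105, 0.1213, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Micro F1: harmonic mean of the logged micro-averaged precision and recall.
micro_f1 = f1(0.3053, 0.2644)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(scores[2] for scores in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 weighted by support (number of gold entities).
total_support = sum(scores[3] for scores in by_class.values())
weighted_f1 = sum(scores[2] * scores[3] for scores in by_class.values()) / total_support

print(f"micro {micro_f1:.4f}  macro {macro_f1:.4f}  weighted {weighted_f1:.4f}")
```

Because the inputs are already rounded to four decimals, the recomputed values can differ from the logged ones in the last digit.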