2023-10-15 20:24:44,673 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,674 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-15 20:24:44,674 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 Train: 20847 sentences
2023-10-15 20:24:44,675 (train_with_dev=False, train_with_test=False)
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 Training Params:
2023-10-15 20:24:44,675 - learning_rate: "3e-05"
2023-10-15 20:24:44,675 - mini_batch_size: "4"
2023-10-15 20:24:44,675 - max_epochs: "10"
2023-10-15 20:24:44,675 - shuffle: "True"
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 Plugins:
2023-10-15 20:24:44,675 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 20:24:44,675 - metric: "('micro avg', 'f1-score')"
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 Computation:
2023-10-15 20:24:44,675 - compute on device: cuda:0
2023-10-15 20:24:44,675 - embedding storage: none
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:24:44,675 ----------------------------------------------------------------------------------------------------
2023-10-15 20:25:09,747 epoch 1 - iter 521/5212 - loss 1.56165169 - time (sec): 25.07 - samples/sec: 1394.68 - lr: 0.000003 - momentum: 0.000000
2023-10-15 20:25:35,194 epoch 1 - iter 1042/5212 - loss 0.99845314 - time (sec): 50.52 - samples/sec: 1454.90 - lr: 0.000006 - momentum: 0.000000
2023-10-15 20:26:00,007 epoch 1 - iter 1563/5212 - loss 0.77898021 - time (sec): 75.33 - samples/sec: 1445.20 - lr: 0.000009 - momentum: 0.000000
2023-10-15 20:26:25,375 epoch 1 - iter 2084/5212 - loss 0.65478772 - time (sec): 100.70 - samples/sec: 1439.43 - lr: 0.000012 - momentum: 0.000000
2023-10-15 20:26:50,728 epoch 1 - iter 2605/5212 - loss 0.57005146 - time (sec): 126.05 - samples/sec: 1457.86 - lr: 0.000015 - momentum: 0.000000
2023-10-15 20:27:15,361 epoch 1 - iter 3126/5212 - loss 0.51705867 - time (sec): 150.68 - samples/sec: 1452.88 - lr: 0.000018 - momentum: 0.000000
2023-10-15 20:27:40,262 epoch 1 - iter 3647/5212 - loss 0.47447550 - time (sec): 175.59 - samples/sec: 1451.31 - lr: 0.000021 - momentum: 0.000000
2023-10-15 20:28:05,433 epoch 1 - iter 4168/5212 - loss 0.43833701 - time (sec): 200.76 - samples/sec: 1455.45 - lr: 0.000024 - momentum: 0.000000
2023-10-15 20:28:30,449 epoch 1 - iter 4689/5212 - loss 0.41315675 - time (sec): 225.77 - samples/sec: 1455.37 - lr: 0.000027 - momentum: 0.000000
2023-10-15 20:28:55,738 epoch 1 - iter 5210/5212 - loss 0.39022797 - time (sec): 251.06 - samples/sec: 1463.30 - lr: 0.000030 - momentum: 0.000000
2023-10-15 20:28:55,827 ----------------------------------------------------------------------------------------------------
2023-10-15 20:28:55,827 EPOCH 1 done: loss 0.3902 - lr: 0.000030
2023-10-15 20:29:01,505 DEV : loss 0.13643652200698853 - f1-score (micro avg) 0.2701
2023-10-15 20:29:01,533 saving best model
2023-10-15 20:29:01,913 ----------------------------------------------------------------------------------------------------
2023-10-15 20:29:27,078 epoch 2 - iter 521/5212 - loss 0.19544052 - time (sec): 25.16 - samples/sec: 1514.57 - lr: 0.000030 - momentum: 0.000000
2023-10-15 20:29:52,579 epoch 2 - iter 1042/5212 - loss 0.17228581 - time (sec): 50.66 - samples/sec: 1505.74 - lr: 0.000029 - momentum: 0.000000
2023-10-15 20:30:18,392 epoch 2 - iter 1563/5212 - loss 0.17467943 - time (sec): 76.48 - samples/sec: 1484.11 - lr: 0.000029 - momentum: 0.000000
2023-10-15 20:30:43,402 epoch 2 - iter 2084/5212 - loss 0.17589538 - time (sec): 101.49 - samples/sec: 1461.85 - lr: 0.000029 - momentum: 0.000000
2023-10-15 20:31:08,492 epoch 2 - iter 2605/5212 - loss 0.17733174 - time (sec): 126.58 - samples/sec: 1466.95 - lr: 0.000028 - momentum: 0.000000
2023-10-15 20:31:33,239 epoch 2 - iter 3126/5212 - loss 0.17324370 - time (sec): 151.32 - samples/sec: 1462.37 - lr: 0.000028 - momentum: 0.000000
2023-10-15 20:31:58,539 epoch 2 - iter 3647/5212 - loss 0.16913298 - time (sec): 176.62 - samples/sec: 1471.29 - lr: 0.000028 - momentum: 0.000000
2023-10-15 20:32:23,133 epoch 2 - iter 4168/5212 - loss 0.17368275 - time (sec): 201.22 - samples/sec: 1467.38 - lr: 0.000027 - momentum: 0.000000
2023-10-15 20:32:48,579 epoch 2 - iter 4689/5212 - loss 0.17139323 - time (sec): 226.66 - samples/sec: 1472.89 - lr: 0.000027 - momentum: 0.000000
2023-10-15 20:33:13,124 epoch 2 - iter 5210/5212 - loss 0.16968841 - time (sec): 251.21 - samples/sec: 1462.47 - lr: 0.000027 - momentum: 0.000000
2023-10-15 20:33:13,213 ----------------------------------------------------------------------------------------------------
2023-10-15 20:33:13,213 EPOCH 2 done: loss 0.1697 - lr: 0.000027
2023-10-15 20:33:21,474 DEV : loss 0.1600484699010849 - f1-score (micro avg) 0.369
2023-10-15 20:33:21,503 saving best model
2023-10-15 20:33:21,991 ----------------------------------------------------------------------------------------------------
2023-10-15 20:33:46,977 epoch 3 - iter 521/5212 - loss 0.11625189 - time (sec): 24.98 - samples/sec: 1460.19 - lr: 0.000026 - momentum: 0.000000
2023-10-15 20:34:12,053 epoch 3 - iter 1042/5212 - loss 0.11731984 - time (sec): 50.06 - samples/sec: 1460.97 - lr: 0.000026 - momentum: 0.000000
2023-10-15 20:34:37,239 epoch 3 - iter 1563/5212 - loss 0.11673671 - time (sec): 75.24 - samples/sec: 1468.35 - lr: 0.000026 - momentum: 0.000000
2023-10-15 20:35:03,000 epoch 3 - iter 2084/5212 - loss 0.12067893 - time (sec): 101.01 - samples/sec: 1469.90 - lr: 0.000025 - momentum: 0.000000
2023-10-15 20:35:28,469 epoch 3 - iter 2605/5212 - loss 0.11697764 - time (sec): 126.47 - samples/sec: 1462.26 - lr: 0.000025 - momentum: 0.000000
2023-10-15 20:35:53,355 epoch 3 - iter 3126/5212 - loss 0.11647880 - time (sec): 151.36 - samples/sec: 1458.11 - lr: 0.000025 - momentum: 0.000000
2023-10-15 20:36:19,263 epoch 3 - iter 3647/5212 - loss 0.11797656 - time (sec): 177.27 - samples/sec: 1448.30 - lr: 0.000024 - momentum: 0.000000
2023-10-15 20:36:44,778 epoch 3 - iter 4168/5212 - loss 0.11532415 - time (sec): 202.78 - samples/sec: 1457.64 - lr: 0.000024 - momentum: 0.000000
2023-10-15 20:37:09,921 epoch 3 - iter 4689/5212 - loss 0.11597611 - time (sec): 227.93 - samples/sec: 1458.52 - lr: 0.000024 - momentum: 0.000000
2023-10-15 20:37:34,894 epoch 3 - iter 5210/5212 - loss 0.11657433 - time (sec): 252.90 - samples/sec: 1452.74 - lr: 0.000023 - momentum: 0.000000
2023-10-15 20:37:34,980 ----------------------------------------------------------------------------------------------------
2023-10-15 20:37:34,980 EPOCH 3 done: loss 0.1166 - lr: 0.000023
2023-10-15 20:37:43,168 DEV : loss 0.22019526362419128 - f1-score (micro avg) 0.3173
2023-10-15 20:37:43,195 ----------------------------------------------------------------------------------------------------
2023-10-15 20:38:08,655 epoch 4 - iter 521/5212 - loss 0.09517878 - time (sec): 25.46 - samples/sec: 1442.09 - lr: 0.000023 - momentum: 0.000000
2023-10-15 20:38:33,547 epoch 4 - iter 1042/5212 - loss 0.08447031 - time (sec): 50.35 - samples/sec: 1416.98 - lr: 0.000023 - momentum: 0.000000
2023-10-15 20:38:58,367 epoch 4 - iter 1563/5212 - loss 0.08700323 - time (sec): 75.17 - samples/sec: 1429.05 - lr: 0.000022 - momentum: 0.000000
2023-10-15 20:39:23,658 epoch 4 - iter 2084/5212 - loss 0.08555122 - time (sec): 100.46 - samples/sec: 1439.08 - lr: 0.000022 - momentum: 0.000000
2023-10-15 20:39:48,375 epoch 4 - iter 2605/5212 - loss 0.08406529 - time (sec): 125.18 - samples/sec: 1438.20 - lr: 0.000022 - momentum: 0.000000
2023-10-15 20:40:13,097 epoch 4 - iter 3126/5212 - loss 0.08735544 - time (sec): 149.90 - samples/sec: 1440.09 - lr: 0.000021 - momentum: 0.000000
2023-10-15 20:40:38,425 epoch 4 - iter 3647/5212 - loss 0.08619736 - time (sec): 175.23 - samples/sec: 1451.28 - lr: 0.000021 - momentum: 0.000000
2023-10-15 20:41:03,491 epoch 4 - iter 4168/5212 - loss 0.08622321 - time (sec): 200.30 - samples/sec: 1447.99 - lr: 0.000021 - momentum: 0.000000
2023-10-15 20:41:28,778 epoch 4 - iter 4689/5212 - loss 0.08588628 - time (sec): 225.58 - samples/sec: 1454.70 - lr: 0.000020 - momentum: 0.000000
2023-10-15 20:41:54,667 epoch 4 - iter 5210/5212 - loss 0.08579389 - time (sec): 251.47 - samples/sec: 1460.90 - lr: 0.000020 - momentum: 0.000000
2023-10-15 20:41:54,753 ----------------------------------------------------------------------------------------------------
2023-10-15 20:41:54,753 EPOCH 4 done: loss 0.0858 - lr: 0.000020
2023-10-15 20:42:03,901 DEV : loss 0.28183305263519287 - f1-score (micro avg) 0.3478
2023-10-15 20:42:03,930 ----------------------------------------------------------------------------------------------------
2023-10-15 20:42:28,675 epoch 5 - iter 521/5212 - loss 0.05241740 - time (sec): 24.74 - samples/sec: 1389.53 - lr: 0.000020 - momentum: 0.000000
2023-10-15 20:42:53,630 epoch 5 - iter 1042/5212 - loss 0.06367030 - time (sec): 49.70 - samples/sec: 1399.06 - lr: 0.000019 - momentum: 0.000000
2023-10-15 20:43:18,460 epoch 5 - iter 1563/5212 - loss 0.06209608 - time (sec): 74.53 - samples/sec: 1410.98 - lr: 0.000019 - momentum: 0.000000
2023-10-15 20:43:43,337 epoch 5 - iter 2084/5212 - loss 0.06304772 - time (sec): 99.41 - samples/sec: 1431.86 - lr: 0.000019 - momentum: 0.000000
2023-10-15 20:44:08,826 epoch 5 - iter 2605/5212 - loss 0.06257189 - time (sec): 124.89 - samples/sec: 1441.74 - lr: 0.000018 - momentum: 0.000000
2023-10-15 20:44:34,138 epoch 5 - iter 3126/5212 - loss 0.06061267 - time (sec): 150.21 - samples/sec: 1435.46 - lr: 0.000018 - momentum: 0.000000
2023-10-15 20:44:59,301 epoch 5 - iter 3647/5212 - loss 0.06083740 - time (sec): 175.37 - samples/sec: 1443.21 - lr: 0.000018 - momentum: 0.000000
2023-10-15 20:45:25,043 epoch 5 - iter 4168/5212 - loss 0.05950603 - time (sec): 201.11 - samples/sec: 1448.76 - lr: 0.000017 - momentum: 0.000000
2023-10-15 20:45:50,728 epoch 5 - iter 4689/5212 - loss 0.05942891 - time (sec): 226.80 - samples/sec: 1451.75 - lr: 0.000017 - momentum: 0.000000
2023-10-15 20:46:16,246 epoch 5 - iter 5210/5212 - loss 0.05999733 - time (sec): 252.31 - samples/sec: 1455.71 - lr: 0.000017 - momentum: 0.000000
2023-10-15 20:46:16,342 ----------------------------------------------------------------------------------------------------
2023-10-15 20:46:16,342 EPOCH 5 done: loss 0.0600 - lr: 0.000017
2023-10-15 20:46:25,496 DEV : loss 0.28720250725746155 - f1-score (micro avg) 0.3655
2023-10-15 20:46:25,523 ----------------------------------------------------------------------------------------------------
2023-10-15 20:46:50,940 epoch 6 - iter 521/5212 - loss 0.04731132 - time (sec): 25.42 - samples/sec: 1475.22 - lr: 0.000016 - momentum: 0.000000
2023-10-15 20:47:16,372 epoch 6 - iter 1042/5212 - loss 0.04736510 - time (sec): 50.85 - samples/sec: 1494.69 - lr: 0.000016 - momentum: 0.000000
2023-10-15 20:47:41,147 epoch 6 - iter 1563/5212 - loss 0.04572425 - time (sec): 75.62 - samples/sec: 1473.37 - lr: 0.000016 - momentum: 0.000000
2023-10-15 20:48:06,986 epoch 6 - iter 2084/5212 - loss 0.04405961 - time (sec): 101.46 - samples/sec: 1485.66 - lr: 0.000015 - momentum: 0.000000
2023-10-15 20:48:32,376 epoch 6 - iter 2605/5212 - loss 0.04285739 - time (sec): 126.85 - samples/sec: 1474.66 - lr: 0.000015 - momentum: 0.000000
2023-10-15 20:48:57,879 epoch 6 - iter 3126/5212 - loss 0.04390113 - time (sec): 152.35 - samples/sec: 1465.47 - lr: 0.000015 - momentum: 0.000000
2023-10-15 20:49:22,624 epoch 6 - iter 3647/5212 - loss 0.04383399 - time (sec): 177.10 - samples/sec: 1456.57 - lr: 0.000014 - momentum: 0.000000
2023-10-15 20:49:47,726 epoch 6 - iter 4168/5212 - loss 0.04360406 - time (sec): 202.20 - samples/sec: 1451.05 - lr: 0.000014 - momentum: 0.000000
2023-10-15 20:50:12,908 epoch 6 - iter 4689/5212 - loss 0.04400334 - time (sec): 227.38 - samples/sec: 1444.97 - lr: 0.000014 - momentum: 0.000000
2023-10-15 20:50:38,364 epoch 6 - iter 5210/5212 - loss 0.04464236 - time (sec): 252.84 - samples/sec: 1451.74 - lr: 0.000013 - momentum: 0.000000
2023-10-15 20:50:38,495 ----------------------------------------------------------------------------------------------------
2023-10-15 20:50:38,495 EPOCH 6 done: loss 0.0446 - lr: 0.000013
2023-10-15 20:50:47,694 DEV : loss 0.3866041302680969 - f1-score (micro avg) 0.3652
2023-10-15 20:50:47,722 ----------------------------------------------------------------------------------------------------
2023-10-15 20:51:13,075 epoch 7 - iter 521/5212 - loss 0.03788773 - time (sec): 25.35 - samples/sec: 1488.17 - lr: 0.000013 - momentum: 0.000000
2023-10-15 20:51:38,952 epoch 7 - iter 1042/5212 - loss 0.03141041 - time (sec): 51.23 - samples/sec: 1481.15 - lr: 0.000013 - momentum: 0.000000
2023-10-15 20:52:04,004 epoch 7 - iter 1563/5212 - loss 0.03509067 - time (sec): 76.28 - samples/sec: 1464.43 - lr: 0.000012 - momentum: 0.000000
2023-10-15 20:52:29,033 epoch 7 - iter 2084/5212 - loss 0.03393019 - time (sec): 101.31 - samples/sec: 1436.78 - lr: 0.000012 - momentum: 0.000000
2023-10-15 20:52:54,318 epoch 7 - iter 2605/5212 - loss 0.03383844 - time (sec): 126.60 - samples/sec: 1448.48 - lr: 0.000012 - momentum: 0.000000
2023-10-15 20:53:19,320 epoch 7 - iter 3126/5212 - loss 0.03348699 - time (sec): 151.60 - samples/sec: 1441.89 - lr: 0.000011 - momentum: 0.000000
2023-10-15 20:53:45,054 epoch 7 - iter 3647/5212 - loss 0.03338128 - time (sec): 177.33 - samples/sec: 1455.10 - lr: 0.000011 - momentum: 0.000000
2023-10-15 20:54:09,864 epoch 7 - iter 4168/5212 - loss 0.03278061 - time (sec): 202.14 - samples/sec: 1446.19 - lr: 0.000011 - momentum: 0.000000
2023-10-15 20:54:35,559 epoch 7 - iter 4689/5212 - loss 0.03275384 - time (sec): 227.84 - samples/sec: 1453.23 - lr: 0.000010 - momentum: 0.000000
2023-10-15 20:55:00,583 epoch 7 - iter 5210/5212 - loss 0.03182261 - time (sec): 252.86 - samples/sec: 1452.89 - lr: 0.000010 - momentum: 0.000000
2023-10-15 20:55:00,671 ----------------------------------------------------------------------------------------------------
2023-10-15 20:55:00,671 EPOCH 7 done: loss 0.0318 - lr: 0.000010
2023-10-15 20:55:09,772 DEV : loss 0.41479361057281494 - f1-score (micro avg) 0.3719
2023-10-15 20:55:09,801 saving best model
2023-10-15 20:55:10,286 ----------------------------------------------------------------------------------------------------
2023-10-15 20:55:34,976 epoch 8 - iter 521/5212 - loss 0.02874987 - time (sec): 24.69 - samples/sec: 1382.35 - lr: 0.000010 - momentum: 0.000000
2023-10-15 20:56:00,292 epoch 8 - iter 1042/5212 - loss 0.02974470 - time (sec): 50.00 - samples/sec: 1456.42 - lr: 0.000009 - momentum: 0.000000
2023-10-15 20:56:25,436 epoch 8 - iter 1563/5212 - loss 0.02516966 - time (sec): 75.15 - samples/sec: 1449.58 - lr: 0.000009 - momentum: 0.000000
2023-10-15 20:56:50,608 epoch 8 - iter 2084/5212 - loss 0.02526379 - time (sec): 100.32 - samples/sec: 1446.32 - lr: 0.000009 - momentum: 0.000000
2023-10-15 20:57:16,033 epoch 8 - iter 2605/5212 - loss 0.02435580 - time (sec): 125.74 - samples/sec: 1454.15 - lr: 0.000008 - momentum: 0.000000
2023-10-15 20:57:41,452 epoch 8 - iter 3126/5212 - loss 0.02388305 - time (sec): 151.16 - samples/sec: 1460.49 - lr: 0.000008 - momentum: 0.000000
2023-10-15 20:58:06,478 epoch 8 - iter 3647/5212 - loss 0.02325843 - time (sec): 176.19 - samples/sec: 1462.53 - lr: 0.000008 - momentum: 0.000000
2023-10-15 20:58:31,384 epoch 8 - iter 4168/5212 - loss 0.02355797 - time (sec): 201.10 - samples/sec: 1462.23 - lr: 0.000007 - momentum: 0.000000
2023-10-15 20:58:56,660 epoch 8 - iter 4689/5212 - loss 0.02339324 - time (sec): 226.37 - samples/sec: 1462.32 - lr: 0.000007 - momentum: 0.000000
2023-10-15 20:59:21,654 epoch 8 - iter 5210/5212 - loss 0.02336518 - time (sec): 251.37 - samples/sec: 1460.88 - lr: 0.000007 - momentum: 0.000000
2023-10-15 20:59:21,754 ----------------------------------------------------------------------------------------------------
2023-10-15 20:59:21,754 EPOCH 8 done: loss 0.0234 - lr: 0.000007
2023-10-15 20:59:30,902 DEV : loss 0.42180541157722473 - f1-score (micro avg) 0.3741
2023-10-15 20:59:30,930 saving best model
2023-10-15 20:59:31,418 ----------------------------------------------------------------------------------------------------
2023-10-15 20:59:57,079 epoch 9 - iter 521/5212 - loss 0.01219894 - time (sec): 25.66 - samples/sec: 1547.72 - lr: 0.000006 - momentum: 0.000000
2023-10-15 21:00:22,613 epoch 9 - iter 1042/5212 - loss 0.01351483 - time (sec): 51.19 - samples/sec: 1525.14 - lr: 0.000006 - momentum: 0.000000
2023-10-15 21:00:47,949 epoch 9 - iter 1563/5212 - loss 0.01411036 - time (sec): 76.53 - samples/sec: 1511.90 - lr: 0.000006 - momentum: 0.000000
2023-10-15 21:01:12,968 epoch 9 - iter 2084/5212 - loss 0.01442847 - time (sec): 101.55 - samples/sec: 1496.00 - lr: 0.000005 - momentum: 0.000000
2023-10-15 21:01:38,021 epoch 9 - iter 2605/5212 - loss 0.01486565 - time (sec): 126.60 - samples/sec: 1486.57 - lr: 0.000005 - momentum: 0.000000
2023-10-15 21:02:02,741 epoch 9 - iter 3126/5212 - loss 0.01496493 - time (sec): 151.32 - samples/sec: 1460.15 - lr: 0.000005 - momentum: 0.000000
2023-10-15 21:02:28,122 epoch 9 - iter 3647/5212 - loss 0.01502140 - time (sec): 176.70 - samples/sec: 1462.06 - lr: 0.000004 - momentum: 0.000000
2023-10-15 21:02:52,991 epoch 9 - iter 4168/5212 - loss 0.01484723 - time (sec): 201.57 - samples/sec: 1461.58 - lr: 0.000004 - momentum: 0.000000
2023-10-15 21:03:18,183 epoch 9 - iter 4689/5212 - loss 0.01495762 - time (sec): 226.76 - samples/sec: 1460.46 - lr: 0.000004 - momentum: 0.000000
2023-10-15 21:03:43,356 epoch 9 - iter 5210/5212 - loss 0.01475329 - time (sec): 251.93 - samples/sec: 1458.16 - lr: 0.000003 - momentum: 0.000000
2023-10-15 21:03:43,445 ----------------------------------------------------------------------------------------------------
2023-10-15 21:03:43,445 EPOCH 9 done: loss 0.0148 - lr: 0.000003
2023-10-15 21:03:52,572 DEV : loss 0.4861517548561096 - f1-score (micro avg) 0.3626
2023-10-15 21:03:52,599 ----------------------------------------------------------------------------------------------------
2023-10-15 21:04:17,540 epoch 10 - iter 521/5212 - loss 0.00895462 - time (sec): 24.94 - samples/sec: 1423.74 - lr: 0.000003 - momentum: 0.000000
2023-10-15 21:04:42,571 epoch 10 - iter 1042/5212 - loss 0.01126073 - time (sec): 49.97 - samples/sec: 1388.03 - lr: 0.000003 - momentum: 0.000000
2023-10-15 21:05:07,649 epoch 10 - iter 1563/5212 - loss 0.01142583 - time (sec): 75.05 - samples/sec: 1420.18 - lr: 0.000002 - momentum: 0.000000
2023-10-15 21:05:32,542 epoch 10 - iter 2084/5212 - loss 0.01092774 - time (sec): 99.94 - samples/sec: 1426.42 - lr: 0.000002 - momentum: 0.000000
2023-10-15 21:05:57,462 epoch 10 - iter 2605/5212 - loss 0.01134639 - time (sec): 124.86 - samples/sec: 1432.03 - lr: 0.000002 - momentum: 0.000000
2023-10-15 21:06:22,580 epoch 10 - iter 3126/5212 - loss 0.01131287 - time (sec): 149.98 - samples/sec: 1436.28 - lr: 0.000001 - momentum: 0.000000
2023-10-15 21:06:47,748 epoch 10 - iter 3647/5212 - loss 0.01122214 - time (sec): 175.15 - samples/sec: 1442.52 - lr: 0.000001 - momentum: 0.000000
2023-10-15 21:07:13,252 epoch 10 - iter 4168/5212 - loss 0.01080290 - time (sec): 200.65 - samples/sec: 1452.51 - lr: 0.000001 - momentum: 0.000000
2023-10-15 21:07:38,300 epoch 10 - iter 4689/5212 - loss 0.01104492 - time (sec): 225.70 - samples/sec: 1450.67 - lr: 0.000000 - momentum: 0.000000
2023-10-15 21:08:03,995 epoch 10 - iter 5210/5212 - loss 0.01096840 - time (sec): 251.39 - samples/sec: 1461.12 - lr: 0.000000 - momentum: 0.000000
2023-10-15 21:08:04,082 ----------------------------------------------------------------------------------------------------
2023-10-15 21:08:04,082 EPOCH 10 done: loss 0.0110 - lr: 0.000000
2023-10-15 21:08:12,303 DEV : loss 0.4902326166629791 - f1-score (micro avg) 0.3638
2023-10-15 21:08:12,728 ----------------------------------------------------------------------------------------------------
2023-10-15 21:08:12,729 Loading model from best epoch ...
2023-10-15 21:08:14,255 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 21:08:30,524
Results:
- F-score (micro) 0.4907
- F-score (macro) 0.3334
- Accuracy 0.3301
By class:
              precision    recall  f1-score   support

         LOC     0.5226    0.6203    0.5672      1214
         PER     0.4030    0.5371    0.4605       808
         ORG     0.3028    0.3088    0.3058       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4481    0.5423    0.4907      2390
   macro avg     0.3071    0.3665    0.3334      2390
weighted avg     0.4464    0.5423    0.4890      2390
2023-10-15 21:08:30,524 ----------------------------------------------------------------------------------------------------
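The averaged rows of the final test report can be cross-checked from the per-class rows. The following is a minimal sketch, not part of the training run: it assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro F1 as the unweighted mean of per-class F1, weighted F1 as the support-weighted mean). The names `per_class` and `f1` are hypothetical helpers introduced here, and the inputs are the rounded values printed in the log, so results agree only to about four decimal places.

```python
# Per-class test scores copied from the "By class" report in the log:
# (precision, recall, f1-score, support)
per_class = {
    "LOC": (0.5226, 0.6203, 0.5672, 1214),
    "PER": (0.4030, 0.5371, 0.4605, 808),
    "ORG": (0.3028, 0.3088, 0.3058, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1 from the micro-averaged precision and recall reported in the log.
micro_f1 = f1(0.4481, 0.5423)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(scores[2] for scores in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by support (number of gold entities).
total_support = sum(scores[3] for scores in per_class.values())
weighted_f1 = (
    sum(scores[2] * scores[3] for scores in per_class.values()) / total_support
)

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

HumanProd has only 15 test entities and scores 0.0 on every metric, which pulls the macro average (0.3334) well below the micro average (0.4907).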