2023-10-15 12:31:16,764 ----------------------------------------------------------------------------------------------------
2023-10-15 12:31:16,765 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-15 12:31:16,765 ----------------------------------------------------------------------------------------------------
2023-10-15 12:31:16,765 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 12:31:16,765 ----------------------------------------------------------------------------------------------------
2023-10-15 12:31:16,765 Train: 20847 sentences
2023-10-15 12:31:16,765 (train_with_dev=False, train_with_test=False)
2023-10-15 12:31:16,765 ----------------------------------------------------------------------------------------------------
2023-10-15 12:31:16,765 Training Params:
2023-10-15 12:31:16,765 - learning_rate: "3e-05"
2023-10-15 12:31:16,765 - mini_batch_size: "4"
2023-10-15 12:31:16,765 - max_epochs: "10"
2023-10-15 12:31:16,765 - shuffle: "True"
2023-10-15 12:31:16,765 ----------------------------------------------------------------------------------------------------
2023-10-15 12:31:16,765 Plugins:
2023-10-15 12:31:16,765 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 12:31:16,765 ----------------------------------------------------------------------------------------------------
2023-10-15 12:31:16,765 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 12:31:16,765 - metric: "('micro avg', 'f1-score')"
2023-10-15 12:31:16,765 ----------------------------------------------------------------------------------------------------
2023-10-15 12:31:16,765 Computation:
2023-10-15 12:31:16,765 - compute on device: cuda:0
2023-10-15 12:31:16,766 - embedding storage: none
2023-10-15 12:31:16,766 ----------------------------------------------------------------------------------------------------
2023-10-15 12:31:16,766 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-15 12:31:16,766 ----------------------------------------------------------------------------------------------------
2023-10-15 12:31:16,766 ----------------------------------------------------------------------------------------------------
2023-10-15 12:31:42,416 epoch 1 - iter 521/5212 - loss 1.65107690 - time (sec): 25.65 - samples/sec: 1428.48 - lr: 0.000003 - momentum: 0.000000
2023-10-15 12:32:07,452 epoch 1 - iter 1042/5212 - loss 1.04881765 - time (sec): 50.69 - samples/sec: 1445.79 - lr: 0.000006 - momentum: 0.000000
2023-10-15 12:32:33,017 epoch 1 - iter 1563/5212 - loss 0.78409455 - time (sec): 76.25 - samples/sec: 1475.39 - lr: 0.000009 - momentum: 0.000000
2023-10-15 12:32:58,619 epoch 1 - iter 2084/5212 - loss 0.66200413 - time (sec): 101.85 - samples/sec: 1457.97 - lr: 0.000012 - momentum: 0.000000
2023-10-15 12:33:24,941 epoch 1 - iter 2605/5212 - loss 0.56564184 - time (sec): 128.17 - samples/sec: 1472.11 - lr: 0.000015 - momentum: 0.000000
2023-10-15 12:33:50,287 epoch 1 - iter 3126/5212 - loss 0.51031280 - time (sec): 153.52 - samples/sec: 1469.15 - lr: 0.000018 - momentum: 0.000000
2023-10-15 12:34:14,962 epoch 1 - iter 3647/5212 - loss 0.47660046 - time (sec): 178.20 - samples/sec: 1454.65 - lr: 0.000021 - momentum: 0.000000
2023-10-15 12:34:40,269 epoch 1 - iter 4168/5212 - loss 0.44308784 - time (sec): 203.50 - samples/sec: 1461.60 - lr: 0.000024 - momentum: 0.000000
2023-10-15 12:35:05,226 epoch 1 - iter 4689/5212 - loss 0.41975678 - time (sec): 228.46 - samples/sec: 1459.22 - lr: 0.000027 - momentum: 0.000000
2023-10-15 12:35:29,725 epoch 1 - iter 5210/5212 - loss 0.40193518 - time (sec): 252.96 - samples/sec: 1452.18 - lr: 0.000030 - momentum: 0.000000
2023-10-15 12:35:29,815 ----------------------------------------------------------------------------------------------------
2023-10-15 12:35:29,815 EPOCH 1 done: loss 0.4018 - lr: 0.000030
2023-10-15 12:35:35,446 DEV : loss 0.12250665575265884 - f1-score (micro avg) 0.237
2023-10-15 12:35:35,472 saving best model
2023-10-15 12:35:35,862 ----------------------------------------------------------------------------------------------------
2023-10-15 12:36:01,334 epoch 2 - iter 521/5212 - loss 0.17665338 - time (sec): 25.47 - samples/sec: 1495.90 - lr: 0.000030 - momentum: 0.000000
2023-10-15 12:36:26,317 epoch 2 - iter 1042/5212 - loss 0.18260145 - time (sec): 50.45 - samples/sec: 1425.12 - lr: 0.000029 - momentum: 0.000000
2023-10-15 12:36:51,509 epoch 2 - iter 1563/5212 - loss 0.18236373 - time (sec): 75.65 - samples/sec: 1440.60 - lr: 0.000029 - momentum: 0.000000
2023-10-15 12:37:16,813 epoch 2 - iter 2084/5212 - loss 0.18783782 - time (sec): 100.95 - samples/sec: 1456.84 - lr: 0.000029 - momentum: 0.000000
2023-10-15 12:37:42,017 epoch 2 - iter 2605/5212 - loss 0.18436223 - time (sec): 126.15 - samples/sec: 1460.36 - lr: 0.000028 - momentum: 0.000000
2023-10-15 12:38:06,008 epoch 2 - iter 3126/5212 - loss 0.18087201 - time (sec): 150.14 - samples/sec: 1449.38 - lr: 0.000028 - momentum: 0.000000
2023-10-15 12:38:34,760 epoch 2 - iter 3647/5212 - loss 0.18105599 - time (sec): 178.90 - samples/sec: 1430.39 - lr: 0.000028 - momentum: 0.000000
2023-10-15 12:39:00,663 epoch 2 - iter 4168/5212 - loss 0.17564865 - time (sec): 204.80 - samples/sec: 1433.44 - lr: 0.000027 - momentum: 0.000000
2023-10-15 12:39:29,513 epoch 2 - iter 4689/5212 - loss 0.17426936 - time (sec): 233.65 - samples/sec: 1428.02 - lr: 0.000027 - momentum: 0.000000
2023-10-15 12:39:54,646 epoch 2 - iter 5210/5212 - loss 0.17441380 - time (sec): 258.78 - samples/sec: 1419.35 - lr: 0.000027 - momentum: 0.000000
2023-10-15 12:39:54,738 ----------------------------------------------------------------------------------------------------
2023-10-15 12:39:54,738 EPOCH 2 done: loss 0.1744 - lr: 0.000027
2023-10-15 12:40:03,882 DEV : loss 0.15540799498558044 - f1-score (micro avg) 0.3711
2023-10-15 12:40:03,908 saving best model
2023-10-15 12:40:04,385 ----------------------------------------------------------------------------------------------------
2023-10-15 12:40:30,445 epoch 3 - iter 521/5212 - loss 0.12465846 - time (sec): 26.05 - samples/sec: 1449.20 - lr: 0.000026 - momentum: 0.000000
2023-10-15 12:40:57,362 epoch 3 - iter 1042/5212 - loss 0.12448844 - time (sec): 52.97 - samples/sec: 1383.88 - lr: 0.000026 - momentum: 0.000000
2023-10-15 12:41:23,372 epoch 3 - iter 1563/5212 - loss 0.11976469 - time (sec): 78.98 - samples/sec: 1365.31 - lr: 0.000026 - momentum: 0.000000
2023-10-15 12:41:48,900 epoch 3 - iter 2084/5212 - loss 0.12448630 - time (sec): 104.51 - samples/sec: 1375.60 - lr: 0.000025 - momentum: 0.000000
2023-10-15 12:42:14,444 epoch 3 - iter 2605/5212 - loss 0.12482318 - time (sec): 130.05 - samples/sec: 1388.34 - lr: 0.000025 - momentum: 0.000000
2023-10-15 12:42:39,582 epoch 3 - iter 3126/5212 - loss 0.12114157 - time (sec): 155.19 - samples/sec: 1404.67 - lr: 0.000025 - momentum: 0.000000
2023-10-15 12:43:06,537 epoch 3 - iter 3647/5212 - loss 0.11954176 - time (sec): 182.14 - samples/sec: 1412.98 - lr: 0.000024 - momentum: 0.000000
2023-10-15 12:43:31,875 epoch 3 - iter 4168/5212 - loss 0.11725032 - time (sec): 207.48 - samples/sec: 1412.38 - lr: 0.000024 - momentum: 0.000000
2023-10-15 12:43:57,560 epoch 3 - iter 4689/5212 - loss 0.11753155 - time (sec): 233.17 - samples/sec: 1419.48 - lr: 0.000024 - momentum: 0.000000
2023-10-15 12:44:22,998 epoch 3 - iter 5210/5212 - loss 0.11654077 - time (sec): 258.61 - samples/sec: 1420.61 - lr: 0.000023 - momentum: 0.000000
2023-10-15 12:44:23,086 ----------------------------------------------------------------------------------------------------
2023-10-15 12:44:23,086 EPOCH 3 done: loss 0.1165 - lr: 0.000023
2023-10-15 12:44:32,250 DEV : loss 0.30272355675697327 - f1-score (micro avg) 0.3406
2023-10-15 12:44:32,276 ----------------------------------------------------------------------------------------------------
2023-10-15 12:44:57,267 epoch 4 - iter 521/5212 - loss 0.08646191 - time (sec): 24.99 - samples/sec: 1410.48 - lr: 0.000023 - momentum: 0.000000
2023-10-15 12:45:23,235 epoch 4 - iter 1042/5212 - loss 0.07591061 - time (sec): 50.96 - samples/sec: 1388.32 - lr: 0.000023 - momentum: 0.000000
2023-10-15 12:45:49,155 epoch 4 - iter 1563/5212 - loss 0.08070237 - time (sec): 76.88 - samples/sec: 1418.84 - lr: 0.000022 - momentum: 0.000000
2023-10-15 12:46:14,293 epoch 4 - iter 2084/5212 - loss 0.08273905 - time (sec): 102.02 - samples/sec: 1431.21 - lr: 0.000022 - momentum: 0.000000
2023-10-15 12:46:39,323 epoch 4 - iter 2605/5212 - loss 0.08254731 - time (sec): 127.05 - samples/sec: 1425.41 - lr: 0.000022 - momentum: 0.000000
2023-10-15 12:47:05,403 epoch 4 - iter 3126/5212 - loss 0.08232248 - time (sec): 153.13 - samples/sec: 1431.72 - lr: 0.000021 - momentum: 0.000000
2023-10-15 12:47:31,031 epoch 4 - iter 3647/5212 - loss 0.08272838 - time (sec): 178.75 - samples/sec: 1429.12 - lr: 0.000021 - momentum: 0.000000
2023-10-15 12:47:56,348 epoch 4 - iter 4168/5212 - loss 0.08217946 - time (sec): 204.07 - samples/sec: 1437.95 - lr: 0.000021 - momentum: 0.000000
2023-10-15 12:48:21,768 epoch 4 - iter 4689/5212 - loss 0.08130461 - time (sec): 229.49 - samples/sec: 1431.79 - lr: 0.000020 - momentum: 0.000000
2023-10-15 12:48:47,792 epoch 4 - iter 5210/5212 - loss 0.08086252 - time (sec): 255.51 - samples/sec: 1437.58 - lr: 0.000020 - momentum: 0.000000
2023-10-15 12:48:47,885 ----------------------------------------------------------------------------------------------------
2023-10-15 12:48:47,885 EPOCH 4 done: loss 0.0809 - lr: 0.000020
2023-10-15 12:48:57,739 DEV : loss 0.22659707069396973 - f1-score (micro avg) 0.3446
2023-10-15 12:48:57,775 ----------------------------------------------------------------------------------------------------
2023-10-15 12:49:24,168 epoch 5 - iter 521/5212 - loss 0.06166189 - time (sec): 26.39 - samples/sec: 1361.81 - lr: 0.000020 - momentum: 0.000000
2023-10-15 12:49:48,880 epoch 5 - iter 1042/5212 - loss 0.05657890 - time (sec): 51.10 - samples/sec: 1384.21 - lr: 0.000019 - momentum: 0.000000
2023-10-15 12:50:14,462 epoch 5 - iter 1563/5212 - loss 0.06599464 - time (sec): 76.68 - samples/sec: 1396.70 - lr: 0.000019 - momentum: 0.000000
2023-10-15 12:50:40,037 epoch 5 - iter 2084/5212 - loss 0.06735536 - time (sec): 102.26 - samples/sec: 1407.44 - lr: 0.000019 - momentum: 0.000000
2023-10-15 12:51:05,650 epoch 5 - iter 2605/5212 - loss 0.06616069 - time (sec): 127.87 - samples/sec: 1406.81 - lr: 0.000018 - momentum: 0.000000
2023-10-15 12:51:31,087 epoch 5 - iter 3126/5212 - loss 0.06521237 - time (sec): 153.31 - samples/sec: 1420.29 - lr: 0.000018 - momentum: 0.000000
2023-10-15 12:51:56,491 epoch 5 - iter 3647/5212 - loss 0.06386659 - time (sec): 178.71 - samples/sec: 1428.75 - lr: 0.000018 - momentum: 0.000000
2023-10-15 12:52:22,206 epoch 5 - iter 4168/5212 - loss 0.06327760 - time (sec): 204.43 - samples/sec: 1441.60 - lr: 0.000017 - momentum: 0.000000
2023-10-15 12:52:47,551 epoch 5 - iter 4689/5212 - loss 0.06392307 - time (sec): 229.77 - samples/sec: 1450.51 - lr: 0.000017 - momentum: 0.000000
2023-10-15 12:53:12,302 epoch 5 - iter 5210/5212 - loss 0.06327392 - time (sec): 254.52 - samples/sec: 1443.39 - lr: 0.000017 - momentum: 0.000000
2023-10-15 12:53:12,393 ----------------------------------------------------------------------------------------------------
2023-10-15 12:53:12,393 EPOCH 5 done: loss 0.0633 - lr: 0.000017
2023-10-15 12:53:21,387 DEV : loss 0.29836305975914 - f1-score (micro avg) 0.3461
2023-10-15 12:53:21,415 ----------------------------------------------------------------------------------------------------
2023-10-15 12:53:46,882 epoch 6 - iter 521/5212 - loss 0.03648971 - time (sec): 25.47 - samples/sec: 1543.84 - lr: 0.000016 - momentum: 0.000000
2023-10-15 12:54:12,031 epoch 6 - iter 1042/5212 - loss 0.03491365 - time (sec): 50.62 - samples/sec: 1509.08 - lr: 0.000016 - momentum: 0.000000
2023-10-15 12:54:37,381 epoch 6 - iter 1563/5212 - loss 0.03950362 - time (sec): 75.97 - samples/sec: 1525.17 - lr: 0.000016 - momentum: 0.000000
2023-10-15 12:55:01,670 epoch 6 - iter 2084/5212 - loss 0.04166816 - time (sec): 100.25 - samples/sec: 1500.88 - lr: 0.000015 - momentum: 0.000000
2023-10-15 12:55:26,198 epoch 6 - iter 2605/5212 - loss 0.04137428 - time (sec): 124.78 - samples/sec: 1468.52 - lr: 0.000015 - momentum: 0.000000
2023-10-15 12:55:51,598 epoch 6 - iter 3126/5212 - loss 0.04172807 - time (sec): 150.18 - samples/sec: 1473.04 - lr: 0.000015 - momentum: 0.000000
2023-10-15 12:56:18,660 epoch 6 - iter 3647/5212 - loss 0.04148080 - time (sec): 177.24 - samples/sec: 1436.93 - lr: 0.000014 - momentum: 0.000000
2023-10-15 12:56:48,113 epoch 6 - iter 4168/5212 - loss 0.04185072 - time (sec): 206.70 - samples/sec: 1425.39 - lr: 0.000014 - momentum: 0.000000
2023-10-15 12:57:14,874 epoch 6 - iter 4689/5212 - loss 0.04234827 - time (sec): 233.46 - samples/sec: 1423.26 - lr: 0.000014 - momentum: 0.000000
2023-10-15 12:57:39,407 epoch 6 - iter 5210/5212 - loss 0.04248019 - time (sec): 257.99 - samples/sec: 1423.91 - lr: 0.000013 - momentum: 0.000000
2023-10-15 12:57:39,496 ----------------------------------------------------------------------------------------------------
2023-10-15 12:57:39,496 EPOCH 6 done: loss 0.0425 - lr: 0.000013
2023-10-15 12:57:49,189 DEV : loss 0.38996848464012146 - f1-score (micro avg) 0.3697
2023-10-15 12:57:49,225 ----------------------------------------------------------------------------------------------------
2023-10-15 12:58:14,088 epoch 7 - iter 521/5212 - loss 0.03048326 - time (sec): 24.86 - samples/sec: 1418.67 - lr: 0.000013 - momentum: 0.000000
2023-10-15 12:58:41,269 epoch 7 - iter 1042/5212 - loss 0.02866313 - time (sec): 52.04 - samples/sec: 1386.42 - lr: 0.000013 - momentum: 0.000000
2023-10-15 12:59:06,440 epoch 7 - iter 1563/5212 - loss 0.03616746 - time (sec): 77.21 - samples/sec: 1419.60 - lr: 0.000012 - momentum: 0.000000
2023-10-15 12:59:31,369 epoch 7 - iter 2084/5212 - loss 0.03622811 - time (sec): 102.14 - samples/sec: 1435.71 - lr: 0.000012 - momentum: 0.000000
2023-10-15 12:59:56,319 epoch 7 - iter 2605/5212 - loss 0.03386848 - time (sec): 127.09 - samples/sec: 1440.15 - lr: 0.000012 - momentum: 0.000000
2023-10-15 13:00:21,524 epoch 7 - iter 3126/5212 - loss 0.03403511 - time (sec): 152.29 - samples/sec: 1443.79 - lr: 0.000011 - momentum: 0.000000
2023-10-15 13:00:47,278 epoch 7 - iter 3647/5212 - loss 0.03296196 - time (sec): 178.05 - samples/sec: 1452.00 - lr: 0.000011 - momentum: 0.000000
2023-10-15 13:01:12,245 epoch 7 - iter 4168/5212 - loss 0.03210474 - time (sec): 203.02 - samples/sec: 1450.16 - lr: 0.000011 - momentum: 0.000000
2023-10-15 13:01:37,095 epoch 7 - iter 4689/5212 - loss 0.03179437 - time (sec): 227.87 - samples/sec: 1449.99 - lr: 0.000010 - momentum: 0.000000
2023-10-15 13:02:01,984 epoch 7 - iter 5210/5212 - loss 0.03231694 - time (sec): 252.75 - samples/sec: 1452.59 - lr: 0.000010 - momentum: 0.000000
2023-10-15 13:02:02,089 ----------------------------------------------------------------------------------------------------
2023-10-15 13:02:02,089 EPOCH 7 done: loss 0.0323 - lr: 0.000010
2023-10-15 13:02:10,410 DEV : loss 0.4292934238910675 - f1-score (micro avg) 0.3573
2023-10-15 13:02:10,437 ----------------------------------------------------------------------------------------------------
2023-10-15 13:02:36,582 epoch 8 - iter 521/5212 - loss 0.01718034 - time (sec): 26.14 - samples/sec: 1431.13 - lr: 0.000010 - momentum: 0.000000
2023-10-15 13:03:01,497 epoch 8 - iter 1042/5212 - loss 0.02296624 - time (sec): 51.06 - samples/sec: 1422.71 - lr: 0.000009 - momentum: 0.000000
2023-10-15 13:03:26,494 epoch 8 - iter 1563/5212 - loss 0.02050494 - time (sec): 76.06 - samples/sec: 1438.11 - lr: 0.000009 - momentum: 0.000000
2023-10-15 13:03:51,390 epoch 8 - iter 2084/5212 - loss 0.02052023 - time (sec): 100.95 - samples/sec: 1434.50 - lr: 0.000009 - momentum: 0.000000
2023-10-15 13:04:16,470 epoch 8 - iter 2605/5212 - loss 0.02085292 - time (sec): 126.03 - samples/sec: 1435.46 - lr: 0.000008 - momentum: 0.000000
2023-10-15 13:04:42,255 epoch 8 - iter 3126/5212 - loss 0.02097252 - time (sec): 151.82 - samples/sec: 1453.03 - lr: 0.000008 - momentum: 0.000000
2023-10-15 13:05:07,098 epoch 8 - iter 3647/5212 - loss 0.02219826 - time (sec): 176.66 - samples/sec: 1460.28 - lr: 0.000008 - momentum: 0.000000
2023-10-15 13:05:32,019 epoch 8 - iter 4168/5212 - loss 0.02231714 - time (sec): 201.58 - samples/sec: 1453.62 - lr: 0.000007 - momentum: 0.000000
2023-10-15 13:05:57,035 epoch 8 - iter 4689/5212 - loss 0.02169589 - time (sec): 226.60 - samples/sec: 1456.33 - lr: 0.000007 - momentum: 0.000000
2023-10-15 13:06:22,253 epoch 8 - iter 5210/5212 - loss 0.02163248 - time (sec): 251.81 - samples/sec: 1458.53 - lr: 0.000007 - momentum: 0.000000
2023-10-15 13:06:22,352 ----------------------------------------------------------------------------------------------------
2023-10-15 13:06:22,352 EPOCH 8 done: loss 0.0216 - lr: 0.000007
2023-10-15 13:06:30,833 DEV : loss 0.49988579750061035 - f1-score (micro avg) 0.3488
2023-10-15 13:06:30,892 ----------------------------------------------------------------------------------------------------
2023-10-15 13:06:57,558 epoch 9 - iter 521/5212 - loss 0.01767213 - time (sec): 26.66 - samples/sec: 1480.44 - lr: 0.000006 - momentum: 0.000000
2023-10-15 13:07:22,775 epoch 9 - iter 1042/5212 - loss 0.01672506 - time (sec): 51.88 - samples/sec: 1484.59 - lr: 0.000006 - momentum: 0.000000
2023-10-15 13:07:47,441 epoch 9 - iter 1563/5212 - loss 0.01737499 - time (sec): 76.55 - samples/sec: 1445.12 - lr: 0.000006 - momentum: 0.000000
2023-10-15 13:08:13,548 epoch 9 - iter 2084/5212 - loss 0.01745047 - time (sec): 102.65 - samples/sec: 1444.23 - lr: 0.000005 - momentum: 0.000000
2023-10-15 13:08:38,703 epoch 9 - iter 2605/5212 - loss 0.01740294 - time (sec): 127.81 - samples/sec: 1446.33 - lr: 0.000005 - momentum: 0.000000
2023-10-15 13:09:04,305 epoch 9 - iter 3126/5212 - loss 0.01646355 - time (sec): 153.41 - samples/sec: 1437.32 - lr: 0.000005 - momentum: 0.000000
2023-10-15 13:09:30,090 epoch 9 - iter 3647/5212 - loss 0.01604700 - time (sec): 179.20 - samples/sec: 1432.55 - lr: 0.000004 - momentum: 0.000000
2023-10-15 13:09:55,749 epoch 9 - iter 4168/5212 - loss 0.01657625 - time (sec): 204.85 - samples/sec: 1431.01 - lr: 0.000004 - momentum: 0.000000
2023-10-15 13:10:20,975 epoch 9 - iter 4689/5212 - loss 0.01653058 - time (sec): 230.08 - samples/sec: 1422.36 - lr: 0.000004 - momentum: 0.000000
2023-10-15 13:10:48,149 epoch 9 - iter 5210/5212 - loss 0.01636217 - time (sec): 257.25 - samples/sec: 1427.84 - lr: 0.000003 - momentum: 0.000000
2023-10-15 13:10:48,244 ----------------------------------------------------------------------------------------------------
2023-10-15 13:10:48,244 EPOCH 9 done: loss 0.0164 - lr: 0.000003
2023-10-15 13:10:56,680 DEV : loss 0.49910324811935425 - f1-score (micro avg) 0.3656
2023-10-15 13:10:56,713 ----------------------------------------------------------------------------------------------------
2023-10-15 13:11:23,057 epoch 10 - iter 521/5212 - loss 0.00940569 - time (sec): 26.34 - samples/sec: 1454.27 - lr: 0.000003 - momentum: 0.000000
2023-10-15 13:11:48,413 epoch 10 - iter 1042/5212 - loss 0.00987505 - time (sec): 51.70 - samples/sec: 1470.96 - lr: 0.000003 - momentum: 0.000000
2023-10-15 13:12:14,305 epoch 10 - iter 1563/5212 - loss 0.01023767 - time (sec): 77.59 - samples/sec: 1483.17 - lr: 0.000002 - momentum: 0.000000
2023-10-15 13:12:39,460 epoch 10 - iter 2084/5212 - loss 0.01032580 - time (sec): 102.75 - samples/sec: 1482.32 - lr: 0.000002 - momentum: 0.000000
2023-10-15 13:13:04,679 epoch 10 - iter 2605/5212 - loss 0.01108425 - time (sec): 127.96 - samples/sec: 1472.25 - lr: 0.000002 - momentum: 0.000000
2023-10-15 13:13:29,443 epoch 10 - iter 3126/5212 - loss 0.01123512 - time (sec): 152.73 - samples/sec: 1451.98 - lr: 0.000001 - momentum: 0.000000
2023-10-15 13:13:54,441 epoch 10 - iter 3647/5212 - loss 0.01086175 - time (sec): 177.73 - samples/sec: 1450.39 - lr: 0.000001 - momentum: 0.000000
2023-10-15 13:14:19,603 epoch 10 - iter 4168/5212 - loss 0.01095131 - time (sec): 202.89 - samples/sec: 1451.33 - lr: 0.000001 - momentum: 0.000000
2023-10-15 13:14:45,781 epoch 10 - iter 4689/5212 - loss 0.01104708 - time (sec): 229.07 - samples/sec: 1450.56 - lr: 0.000000 - momentum: 0.000000
2023-10-15 13:15:10,852 epoch 10 - iter 5210/5212 - loss 0.01095808 - time (sec): 254.14 - samples/sec: 1445.55 - lr: 0.000000 - momentum: 0.000000
2023-10-15 13:15:10,942 ----------------------------------------------------------------------------------------------------
2023-10-15 13:15:10,942 EPOCH 10 done: loss 0.0110 - lr: 0.000000
2023-10-15 13:15:19,421 DEV : loss 0.5126752257347107 - f1-score (micro avg) 0.3732
2023-10-15 13:15:19,464 saving best model
2023-10-15 13:15:20,314 ----------------------------------------------------------------------------------------------------
2023-10-15 13:15:20,315 Loading model from best epoch ...
2023-10-15 13:15:21,741 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 13:15:38,600
Results:
- F-score (micro) 0.4948
- F-score (macro) 0.3328
- Accuracy 0.3335
By class:
              precision    recall  f1-score   support

         LOC     0.5075    0.6680    0.5768      1214
         PER     0.4158    0.4889    0.4494       808
         ORG     0.2912    0.3201    0.3050       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4485    0.5519    0.4948      2390
   macro avg     0.3036    0.3693    0.3328      2390
weighted avg     0.4414    0.5519    0.4900      2390
2023-10-15 13:15:38,600 ----------------------------------------------------------------------------------------------------