|
2023-10-15 23:45:36,931 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:45:36,932 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-15 23:45:36,932 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:45:36,932 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences |
|
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator |
|
2023-10-15 23:45:36,932 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:45:36,932 Train: 20847 sentences |
|
2023-10-15 23:45:36,932 (train_with_dev=False, train_with_test=False) |
|
2023-10-15 23:45:36,932 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:45:36,932 Training Params: |
|
2023-10-15 23:45:36,932 - learning_rate: "5e-05" |
|
2023-10-15 23:45:36,933 - mini_batch_size: "4" |
|
2023-10-15 23:45:36,933 - max_epochs: "10" |
|
2023-10-15 23:45:36,933 - shuffle: "True" |
|
2023-10-15 23:45:36,933 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:45:36,933 Plugins: |
|
2023-10-15 23:45:36,933 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-15 23:45:36,933 ---------------------------------------------------------------------------------------------------- |
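The `lr` column in the per-iteration lines that follow is consistent with a linear warmup over the first 10% of updates and a linear decay to zero afterwards. The sketch below is not Flair's `LinearScheduler` code; it is a minimal re-derivation under that assumption, using only the values printed in this header (20847 training sentences, `mini_batch_size` 4, `max_epochs` 10, `learning_rate` 5e-05, `warmup_fraction` 0.1). The helper names `linear_schedule_lr` and `global_step` are hypothetical.

```python
import math

# Values copied from the log header. The schedule formula is an assumption:
# linear warmup to the peak rate, then linear decay to zero.
TRAIN_SENTENCES = 20847
MINI_BATCH_SIZE = 4
MAX_EPOCHS = 10
PEAK_LR = 5e-05
WARMUP_FRACTION = 0.1

# 20847 / 4 = 5211.75, so the last batch is partial: 5212 iterations per epoch.
steps_per_epoch = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
total_steps = steps_per_epoch * MAX_EPOCHS
warmup_steps = round(total_steps * WARMUP_FRACTION)


def global_step(epoch: int, iteration: int) -> int:
    """Number of optimizer updates done at `iteration` of a 1-based `epoch`."""
    return (epoch - 1) * steps_per_epoch + iteration


def linear_schedule_lr(step: int) -> float:
    """Learning rate after `step` updates under the assumed schedule."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


# Compare against a few logged (epoch, iter) pairs.
for epoch, iteration in [(1, 521), (1, 5210), (2, 5210), (3, 5210), (10, 5210)]:
    lr = linear_schedule_lr(global_step(epoch, iteration))
    print(f"epoch {epoch} - iter {iteration}/{steps_per_epoch} - lr: {lr:.6f}")
```

Under these assumptions the warmup ends exactly at the end of epoch 1, which is why the logged rate peaks at 0.000050 there and falls by about 0.000006 per epoch afterwards.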
|
2023-10-15 23:45:36,933 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-15 23:45:36,933 - metric: "('micro avg', 'f1-score')" |
|
2023-10-15 23:45:36,933 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:45:36,933 Computation: |
|
2023-10-15 23:45:36,933 - compute on device: cuda:0 |
|
2023-10-15 23:45:36,933 - embedding storage: none |
|
2023-10-15 23:45:36,933 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:45:36,933 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-15 23:45:36,933 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:45:36,933 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:46:01,690 epoch 1 - iter 521/5212 - loss 1.37202838 - time (sec): 24.76 - samples/sec: 1438.81 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-15 23:46:26,941 epoch 1 - iter 1042/5212 - loss 0.87856005 - time (sec): 50.01 - samples/sec: 1460.52 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-15 23:46:51,993 epoch 1 - iter 1563/5212 - loss 0.68736436 - time (sec): 75.06 - samples/sec: 1446.56 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-15 23:47:17,119 epoch 1 - iter 2084/5212 - loss 0.58462865 - time (sec): 100.18 - samples/sec: 1440.83 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-15 23:47:42,651 epoch 1 - iter 2605/5212 - loss 0.51553132 - time (sec): 125.72 - samples/sec: 1437.34 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-15 23:48:08,713 epoch 1 - iter 3126/5212 - loss 0.46300876 - time (sec): 151.78 - samples/sec: 1438.92 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-15 23:48:33,975 epoch 1 - iter 3647/5212 - loss 0.43081269 - time (sec): 177.04 - samples/sec: 1443.11 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-15 23:48:59,438 epoch 1 - iter 4168/5212 - loss 0.40412772 - time (sec): 202.50 - samples/sec: 1440.23 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-15 23:49:24,710 epoch 1 - iter 4689/5212 - loss 0.38575539 - time (sec): 227.78 - samples/sec: 1435.07 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-15 23:49:50,594 epoch 1 - iter 5210/5212 - loss 0.36876867 - time (sec): 253.66 - samples/sec: 1448.37 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-15 23:49:50,688 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:49:50,688 EPOCH 1 done: loss 0.3687 - lr: 0.000050 |
|
2023-10-15 23:49:56,687 DEV : loss 0.12752242386341095 - f1-score (micro avg) 0.3017 |
|
2023-10-15 23:49:56,714 saving best model |
|
2023-10-15 23:49:57,179 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:50:22,567 epoch 2 - iter 521/5212 - loss 0.20030192 - time (sec): 25.39 - samples/sec: 1528.79 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-15 23:50:47,800 epoch 2 - iter 1042/5212 - loss 0.19679440 - time (sec): 50.62 - samples/sec: 1482.50 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-15 23:51:13,032 epoch 2 - iter 1563/5212 - loss 0.18526110 - time (sec): 75.85 - samples/sec: 1470.72 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-15 23:51:38,382 epoch 2 - iter 2084/5212 - loss 0.18140515 - time (sec): 101.20 - samples/sec: 1471.11 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-15 23:52:03,691 epoch 2 - iter 2605/5212 - loss 0.18582562 - time (sec): 126.51 - samples/sec: 1464.80 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-15 23:52:28,906 epoch 2 - iter 3126/5212 - loss 0.18577517 - time (sec): 151.72 - samples/sec: 1469.25 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-15 23:52:53,821 epoch 2 - iter 3647/5212 - loss 0.18828427 - time (sec): 176.64 - samples/sec: 1465.00 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-15 23:53:19,622 epoch 2 - iter 4168/5212 - loss 0.18599284 - time (sec): 202.44 - samples/sec: 1470.56 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-15 23:53:44,440 epoch 2 - iter 4689/5212 - loss 0.18772856 - time (sec): 227.26 - samples/sec: 1455.41 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-15 23:54:09,398 epoch 2 - iter 5210/5212 - loss 0.18732109 - time (sec): 252.22 - samples/sec: 1455.77 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-15 23:54:09,513 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:54:09,513 EPOCH 2 done: loss 0.1873 - lr: 0.000044 |
|
2023-10-15 23:54:18,685 DEV : loss 0.14554405212402344 - f1-score (micro avg) 0.3546 |
|
2023-10-15 23:54:18,713 saving best model |
|
2023-10-15 23:54:19,332 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:54:44,600 epoch 3 - iter 521/5212 - loss 0.16892477 - time (sec): 25.26 - samples/sec: 1431.22 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-15 23:55:09,173 epoch 3 - iter 1042/5212 - loss 0.15480170 - time (sec): 49.84 - samples/sec: 1382.24 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-15 23:55:34,192 epoch 3 - iter 1563/5212 - loss 0.15253235 - time (sec): 74.86 - samples/sec: 1379.25 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-15 23:55:59,000 epoch 3 - iter 2084/5212 - loss 0.14966988 - time (sec): 99.66 - samples/sec: 1394.82 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-15 23:56:23,271 epoch 3 - iter 2605/5212 - loss 0.14248758 - time (sec): 123.94 - samples/sec: 1422.36 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-15 23:56:48,693 epoch 3 - iter 3126/5212 - loss 0.14250745 - time (sec): 149.36 - samples/sec: 1432.87 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-15 23:57:14,127 epoch 3 - iter 3647/5212 - loss 0.14348930 - time (sec): 174.79 - samples/sec: 1440.23 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-15 23:57:39,517 epoch 3 - iter 4168/5212 - loss 0.14496238 - time (sec): 200.18 - samples/sec: 1439.25 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-15 23:58:05,279 epoch 3 - iter 4689/5212 - loss 0.14349134 - time (sec): 225.94 - samples/sec: 1447.61 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-15 23:58:31,311 epoch 3 - iter 5210/5212 - loss 0.14146075 - time (sec): 251.98 - samples/sec: 1458.09 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-15 23:58:31,401 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:58:31,401 EPOCH 3 done: loss 0.1415 - lr: 0.000039 |
|
2023-10-15 23:58:40,504 DEV : loss 0.27121227979660034 - f1-score (micro avg) 0.3217 |
|
2023-10-15 23:58:40,532 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 23:59:05,417 epoch 4 - iter 521/5212 - loss 0.09043585 - time (sec): 24.88 - samples/sec: 1421.95 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-15 23:59:30,195 epoch 4 - iter 1042/5212 - loss 0.10494334 - time (sec): 49.66 - samples/sec: 1447.79 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-15 23:59:55,656 epoch 4 - iter 1563/5212 - loss 0.09724528 - time (sec): 75.12 - samples/sec: 1479.73 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 00:00:21,419 epoch 4 - iter 2084/5212 - loss 0.09872646 - time (sec): 100.89 - samples/sec: 1485.87 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 00:00:46,935 epoch 4 - iter 2605/5212 - loss 0.10353381 - time (sec): 126.40 - samples/sec: 1480.74 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 00:01:12,909 epoch 4 - iter 3126/5212 - loss 0.10532981 - time (sec): 152.38 - samples/sec: 1464.22 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 00:01:38,193 epoch 4 - iter 3647/5212 - loss 0.10412690 - time (sec): 177.66 - samples/sec: 1458.40 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 00:02:03,668 epoch 4 - iter 4168/5212 - loss 0.10176002 - time (sec): 203.14 - samples/sec: 1462.23 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 00:02:28,489 epoch 4 - iter 4689/5212 - loss 0.10219116 - time (sec): 227.96 - samples/sec: 1455.00 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 00:02:53,672 epoch 4 - iter 5210/5212 - loss 0.10091424 - time (sec): 253.14 - samples/sec: 1450.58 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 00:02:53,769 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:02:53,769 EPOCH 4 done: loss 0.1010 - lr: 0.000033 |
|
2023-10-16 00:03:02,079 DEV : loss 0.24610090255737305 - f1-score (micro avg) 0.3722 |
|
2023-10-16 00:03:02,108 saving best model |
|
2023-10-16 00:03:02,652 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:03:28,355 epoch 5 - iter 521/5212 - loss 0.08946468 - time (sec): 25.70 - samples/sec: 1499.17 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 00:03:53,305 epoch 5 - iter 1042/5212 - loss 0.08272511 - time (sec): 50.65 - samples/sec: 1445.65 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 00:04:19,363 epoch 5 - iter 1563/5212 - loss 0.07858447 - time (sec): 76.71 - samples/sec: 1419.47 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 00:04:45,160 epoch 5 - iter 2084/5212 - loss 0.07638850 - time (sec): 102.51 - samples/sec: 1440.40 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 00:05:10,817 epoch 5 - iter 2605/5212 - loss 0.07945967 - time (sec): 128.16 - samples/sec: 1449.44 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 00:05:35,603 epoch 5 - iter 3126/5212 - loss 0.07924286 - time (sec): 152.95 - samples/sec: 1448.28 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 00:06:00,818 epoch 5 - iter 3647/5212 - loss 0.07893342 - time (sec): 178.16 - samples/sec: 1442.27 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 00:06:26,438 epoch 5 - iter 4168/5212 - loss 0.07696755 - time (sec): 203.78 - samples/sec: 1448.05 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 00:06:51,588 epoch 5 - iter 4689/5212 - loss 0.07559505 - time (sec): 228.93 - samples/sec: 1447.40 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 00:07:16,799 epoch 5 - iter 5210/5212 - loss 0.07550940 - time (sec): 254.14 - samples/sec: 1445.61 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 00:07:16,884 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:07:16,885 EPOCH 5 done: loss 0.0756 - lr: 0.000028 |
|
2023-10-16 00:07:25,246 DEV : loss 0.27587252855300903 - f1-score (micro avg) 0.3933 |
|
2023-10-16 00:07:25,276 saving best model |
|
2023-10-16 00:07:25,899 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:07:51,242 epoch 6 - iter 521/5212 - loss 0.04500721 - time (sec): 25.34 - samples/sec: 1438.74 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 00:08:16,452 epoch 6 - iter 1042/5212 - loss 0.05219254 - time (sec): 50.55 - samples/sec: 1458.09 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 00:08:41,812 epoch 6 - iter 1563/5212 - loss 0.05073335 - time (sec): 75.91 - samples/sec: 1454.84 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 00:09:07,218 epoch 6 - iter 2084/5212 - loss 0.05244273 - time (sec): 101.32 - samples/sec: 1458.61 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 00:09:32,274 epoch 6 - iter 2605/5212 - loss 0.05597317 - time (sec): 126.37 - samples/sec: 1450.57 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 00:09:58,365 epoch 6 - iter 3126/5212 - loss 0.05870564 - time (sec): 152.46 - samples/sec: 1447.89 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 00:10:23,762 epoch 6 - iter 3647/5212 - loss 0.05885555 - time (sec): 177.86 - samples/sec: 1458.55 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 00:10:49,124 epoch 6 - iter 4168/5212 - loss 0.05780173 - time (sec): 203.22 - samples/sec: 1461.81 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 00:11:14,399 epoch 6 - iter 4689/5212 - loss 0.05720709 - time (sec): 228.50 - samples/sec: 1459.04 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 00:11:39,181 epoch 6 - iter 5210/5212 - loss 0.05635932 - time (sec): 253.28 - samples/sec: 1448.97 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 00:11:39,333 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:11:39,333 EPOCH 6 done: loss 0.0563 - lr: 0.000022 |
|
2023-10-16 00:11:47,640 DEV : loss 0.3152172863483429 - f1-score (micro avg) 0.3546 |
|
2023-10-16 00:11:47,669 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:12:12,705 epoch 7 - iter 521/5212 - loss 0.05081149 - time (sec): 25.03 - samples/sec: 1394.27 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 00:12:37,682 epoch 7 - iter 1042/5212 - loss 0.04120887 - time (sec): 50.01 - samples/sec: 1432.98 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 00:13:03,240 epoch 7 - iter 1563/5212 - loss 0.04350707 - time (sec): 75.57 - samples/sec: 1433.54 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 00:13:29,103 epoch 7 - iter 2084/5212 - loss 0.04429995 - time (sec): 101.43 - samples/sec: 1456.00 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 00:13:54,490 epoch 7 - iter 2605/5212 - loss 0.04318346 - time (sec): 126.82 - samples/sec: 1447.52 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 00:14:19,664 epoch 7 - iter 3126/5212 - loss 0.04497001 - time (sec): 151.99 - samples/sec: 1451.89 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 00:14:44,996 epoch 7 - iter 3647/5212 - loss 0.04321396 - time (sec): 177.33 - samples/sec: 1459.04 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 00:15:10,347 epoch 7 - iter 4168/5212 - loss 0.04275074 - time (sec): 202.68 - samples/sec: 1460.57 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 00:15:36,189 epoch 7 - iter 4689/5212 - loss 0.04205276 - time (sec): 228.52 - samples/sec: 1443.40 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 00:16:01,971 epoch 7 - iter 5210/5212 - loss 0.04183362 - time (sec): 254.30 - samples/sec: 1444.10 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 00:16:02,089 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:16:02,090 EPOCH 7 done: loss 0.0418 - lr: 0.000017 |
|
2023-10-16 00:16:10,408 DEV : loss 0.3469476103782654 - f1-score (micro avg) 0.3672 |
|
2023-10-16 00:16:10,453 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:16:37,044 epoch 8 - iter 521/5212 - loss 0.02217721 - time (sec): 26.59 - samples/sec: 1468.68 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 00:17:02,282 epoch 8 - iter 1042/5212 - loss 0.02729592 - time (sec): 51.83 - samples/sec: 1486.36 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 00:17:27,560 epoch 8 - iter 1563/5212 - loss 0.02832787 - time (sec): 77.10 - samples/sec: 1473.74 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 00:17:53,339 epoch 8 - iter 2084/5212 - loss 0.02989946 - time (sec): 102.88 - samples/sec: 1460.61 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 00:18:18,756 epoch 8 - iter 2605/5212 - loss 0.02875976 - time (sec): 128.30 - samples/sec: 1453.04 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 00:18:44,249 epoch 8 - iter 3126/5212 - loss 0.02972707 - time (sec): 153.79 - samples/sec: 1457.00 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 00:19:08,428 epoch 8 - iter 3647/5212 - loss 0.02955815 - time (sec): 177.97 - samples/sec: 1459.29 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 00:19:32,685 epoch 8 - iter 4168/5212 - loss 0.02992807 - time (sec): 202.23 - samples/sec: 1463.77 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 00:19:56,993 epoch 8 - iter 4689/5212 - loss 0.02991417 - time (sec): 226.54 - samples/sec: 1458.61 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 00:20:22,203 epoch 8 - iter 5210/5212 - loss 0.02960201 - time (sec): 251.75 - samples/sec: 1459.33 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 00:20:22,290 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:20:22,290 EPOCH 8 done: loss 0.0296 - lr: 0.000011 |
|
2023-10-16 00:20:31,432 DEV : loss 0.4151654839515686 - f1-score (micro avg) 0.3626 |
|
2023-10-16 00:20:31,464 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:20:56,517 epoch 9 - iter 521/5212 - loss 0.01705744 - time (sec): 25.05 - samples/sec: 1507.99 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 00:21:21,288 epoch 9 - iter 1042/5212 - loss 0.02156692 - time (sec): 49.82 - samples/sec: 1460.72 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 00:21:46,082 epoch 9 - iter 1563/5212 - loss 0.02449670 - time (sec): 74.62 - samples/sec: 1440.44 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 00:22:10,930 epoch 9 - iter 2084/5212 - loss 0.02337409 - time (sec): 99.46 - samples/sec: 1440.07 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 00:22:36,386 epoch 9 - iter 2605/5212 - loss 0.02237419 - time (sec): 124.92 - samples/sec: 1453.59 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 00:23:01,612 epoch 9 - iter 3126/5212 - loss 0.02194368 - time (sec): 150.15 - samples/sec: 1459.66 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 00:23:26,731 epoch 9 - iter 3647/5212 - loss 0.02224411 - time (sec): 175.26 - samples/sec: 1458.11 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 00:23:51,921 epoch 9 - iter 4168/5212 - loss 0.02139231 - time (sec): 200.46 - samples/sec: 1462.79 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 00:24:17,364 epoch 9 - iter 4689/5212 - loss 0.02129007 - time (sec): 225.90 - samples/sec: 1465.66 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 00:24:42,558 epoch 9 - iter 5210/5212 - loss 0.02095764 - time (sec): 251.09 - samples/sec: 1463.02 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 00:24:42,651 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:24:42,652 EPOCH 9 done: loss 0.0210 - lr: 0.000006 |
|
2023-10-16 00:24:52,948 DEV : loss 0.40386104583740234 - f1-score (micro avg) 0.3607 |
|
2023-10-16 00:24:52,983 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:25:18,379 epoch 10 - iter 521/5212 - loss 0.01307094 - time (sec): 25.39 - samples/sec: 1403.60 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 00:25:43,265 epoch 10 - iter 1042/5212 - loss 0.01629748 - time (sec): 50.28 - samples/sec: 1423.93 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 00:26:08,138 epoch 10 - iter 1563/5212 - loss 0.01516469 - time (sec): 75.15 - samples/sec: 1421.72 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 00:26:33,158 epoch 10 - iter 2084/5212 - loss 0.01544442 - time (sec): 100.17 - samples/sec: 1426.48 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 00:26:58,058 epoch 10 - iter 2605/5212 - loss 0.01529237 - time (sec): 125.07 - samples/sec: 1425.16 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 00:27:23,342 epoch 10 - iter 3126/5212 - loss 0.01463251 - time (sec): 150.36 - samples/sec: 1442.88 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 00:27:48,516 epoch 10 - iter 3647/5212 - loss 0.01441137 - time (sec): 175.53 - samples/sec: 1454.61 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 00:28:14,171 epoch 10 - iter 4168/5212 - loss 0.01390284 - time (sec): 201.19 - samples/sec: 1462.31 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 00:28:39,493 epoch 10 - iter 4689/5212 - loss 0.01394799 - time (sec): 226.51 - samples/sec: 1464.10 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 00:29:04,370 epoch 10 - iter 5210/5212 - loss 0.01412248 - time (sec): 251.39 - samples/sec: 1461.38 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 00:29:04,465 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:29:04,465 EPOCH 10 done: loss 0.0141 - lr: 0.000000 |
|
2023-10-16 00:29:13,621 DEV : loss 0.41680729389190674 - f1-score (micro avg) 0.3728 |
|
2023-10-16 00:29:14,169 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 00:29:14,171 Loading model from best epoch ... |
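Which epoch counts as "best" can be recovered from the `DEV` lines above. The sketch below parses those lines with a regular expression and picks the epoch with the highest dev micro F1, the metric named in the header. The regular expression and the helper `dev_scores` are illustrative assumptions about this log's line shape, not part of Flair; the sample lines are copied from this log.

```python
import re

# Matches lines such as:
# "2023-10-15 23:49:56,687 DEV : loss 0.1275... - f1-score (micro avg) 0.3017"
DEV_LINE = re.compile(
    r"DEV : loss (?P<loss>\d+\.\d+) - f1-score \(micro avg\)\s+(?P<f1>\d+\.\d+)"
)


def dev_scores(lines):
    """Return (epoch, dev_loss, dev_f1) tuples, numbering epochs by order of appearance."""
    scores = []
    for line in lines:
        match = DEV_LINE.search(line)
        if match:
            scores.append(
                (len(scores) + 1, float(match["loss"]), float(match["f1"]))
            )
    return scores


LOG_LINES = [
    "2023-10-15 23:49:56,687 DEV : loss 0.12752242386341095 - f1-score (micro avg) 0.3017",
    "2023-10-15 23:54:18,685 DEV : loss 0.14554405212402344 - f1-score (micro avg) 0.3546",
    "2023-10-15 23:58:40,504 DEV : loss 0.27121227979660034 - f1-score (micro avg) 0.3217",
    "2023-10-16 00:03:02,079 DEV : loss 0.24610090255737305 - f1-score (micro avg) 0.3722",
    "2023-10-16 00:07:25,246 DEV : loss 0.27587252855300903 - f1-score (micro avg) 0.3933",
    "2023-10-16 00:11:47,640 DEV : loss 0.3152172863483429 - f1-score (micro avg) 0.3546",
    "2023-10-16 00:16:10,408 DEV : loss 0.3469476103782654 - f1-score (micro avg) 0.3672",
    "2023-10-16 00:20:31,432 DEV : loss 0.4151654839515686 - f1-score (micro avg) 0.3626",
    "2023-10-16 00:24:52,948 DEV : loss 0.40386104583740234 - f1-score (micro avg) 0.3607",
    "2023-10-16 00:29:13,621 DEV : loss 0.41680729389190674 - f1-score (micro avg) 0.3728",
]

scores = dev_scores(LOG_LINES)
best_epoch, best_loss, best_f1 = max(scores, key=lambda s: s[2])
print(f"best epoch: {best_epoch} (dev micro F1 {best_f1:.4f}, dev loss {best_loss:.4f})")
```

Note that the dev loss is lowest after epoch 1 and rises from there while the training loss keeps falling, so selecting by loss instead of F1 would pick a different checkpoint.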
|
2023-10-16 00:29:15,700 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
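The 17 tags follow a BIOES-style scheme: one `O` tag plus four position prefixes (`S`, `B`, `E`, `I`) for each of the four entity types, which matches `out_features=17` on the final `Linear` layer in the model dump. A minimal sketch that rebuilds the dictionary in the logged order (the variable names are illustrative, not Flair's):

```python
# Entity types and prefixes, in the order they appear in the logged dictionary.
ENTITY_TYPES = ["LOC", "PER", "ORG", "HumanProd"]
PREFIXES = ["S", "B", "E", "I"]  # Single, Begin, End, Inside

# "O" (outside any entity) comes first, then every prefix for each type.
tags = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in PREFIXES]

# The classification head maps 768-dim token embeddings to one logit per tag.
HIDDEN_SIZE = 768
head_parameters = HIDDEN_SIZE * len(tags) + len(tags)  # weights + biases

print(len(tags), tags)
print("linear head parameters:", head_parameters)
```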
|
2023-10-16 00:29:30,958 |
|
Results: |
|
- F-score (micro) 0.4193 |
|
- F-score (macro) 0.2665 |
|
- Accuracy 0.2701 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.4918    0.5206    0.5058      1214 |
|
         PER     0.3522    0.4084    0.3782       808 |
|
         ORG     0.2314    0.1501    0.1821       353 |
|
   HumanProd     0.0000    0.0000    0.0000        15 |
|
|
|
   micro avg     0.4141    0.4247    0.4193      2390 |
|
   macro avg     0.2689    0.2698    0.2665      2390 |
|
weighted avg     0.4031    0.4247    0.4117      2390 |
|
|
|
2023-10-16 00:29:30,958 ---------------------------------------------------------------------------------------------------- |
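The three average rows in the report can be cross-checked from the per-class rows. The sketch below recomputes the macro and weighted F1 from the four class rows and the micro F1 from the reported micro precision and recall. It works only from the rounded figures printed above (the underlying true/false positive counts are not in the log), so agreement is expected to about four decimal places rather than exactly.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
REPORT = {
    "LOC": (0.4918, 0.5206, 0.5058, 1214),
    "PER": (0.3522, 0.4084, 0.3782, 808),
    "ORG": (0.2314, 0.1501, 0.1821, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}
F1_COLUMN, SUPPORT_COLUMN = 2, 3


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_avg(column: int) -> float:
    """Unweighted mean over classes: every class counts equally."""
    return sum(row[column] for row in REPORT.values()) / len(REPORT)


def weighted_avg(column: int) -> float:
    """Mean over classes weighted by support (number of gold entities)."""
    total = sum(row[SUPPORT_COLUMN] for row in REPORT.values())
    return sum(row[column] * row[SUPPORT_COLUMN] for row in REPORT.values()) / total


macro_f1 = macro_avg(F1_COLUMN)
weighted_f1 = weighted_avg(F1_COLUMN)
# Micro precision/recall are taken from the "micro avg" row of the report.
micro_f1 = f1(0.4141, 0.4247)

print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
print(f"micro F1:    {micro_f1:.4f}")
```

The gap between micro F1 (0.4193) and macro F1 (0.2665) comes from the macro average giving the 15-entity `HumanProd` class, which scores zero, the same weight as `LOC` with 1214 entities.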
|
|