2023-10-15 15:53:45,075 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,076 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-15 15:53:45,076 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,076 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 15:53:45,076 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,076 Train: 20847 sentences
2023-10-15 15:53:45,076 (train_with_dev=False, train_with_test=False)
2023-10-15 15:53:45,076 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,076 Training Params:
2023-10-15 15:53:45,077 - learning_rate: "5e-05"
2023-10-15 15:53:45,077 - mini_batch_size: "4"
2023-10-15 15:53:45,077 - max_epochs: "10"
2023-10-15 15:53:45,077 - shuffle: "True"
2023-10-15 15:53:45,077 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,077 Plugins:
2023-10-15 15:53:45,077 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 15:53:45,077 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,077 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 15:53:45,077 - metric: "('micro avg', 'f1-score')"
2023-10-15 15:53:45,077 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,077 Computation:
2023-10-15 15:53:45,077 - compute on device: cuda:0
2023-10-15 15:53:45,077 - embedding storage: none
2023-10-15 15:53:45,077 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,077 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-15 15:53:45,077 ----------------------------------------------------------------------------------------------------
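The `LinearScheduler` plugin above (warmup_fraction 0.1, peak learning rate 5e-05) explains the lr column in the iteration lines below: the rate ramps up linearly over the first 10% of all steps, then decays linearly to zero. A minimal sketch of that schedule, assuming 5212 mini-batches per epoch for 10 epochs as logged (Flair's exact step handling may differ slightly):

```python
def linear_schedule_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    """Piecewise-linear LR: ramp up over the first warmup_fraction of
    training, then decay linearly to zero. Sketch of the schedule the
    LinearScheduler plugin logs; not Flair's actual implementation."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 5212 * 10  # 5212 mini-batches per epoch, 10 epochs (from the log)
```

With these numbers the schedule reproduces the logged values: about 5e-06 at iteration 521 of epoch 1, the 5e-05 peak at the end of epoch 1, and roughly 4.9e-05 early in epoch 2.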
2023-10-15 15:53:45,077 ----------------------------------------------------------------------------------------------------
2023-10-15 15:54:12,422 epoch 1 - iter 521/5212 - loss 1.42428782 - time (sec): 27.34 - samples/sec: 1383.73 - lr: 0.000005 - momentum: 0.000000
2023-10-15 15:54:37,920 epoch 1 - iter 1042/5212 - loss 0.90590414 - time (sec): 52.84 - samples/sec: 1413.53 - lr: 0.000010 - momentum: 0.000000
2023-10-15 15:55:04,008 epoch 1 - iter 1563/5212 - loss 0.69424645 - time (sec): 78.93 - samples/sec: 1451.87 - lr: 0.000015 - momentum: 0.000000
2023-10-15 15:55:29,047 epoch 1 - iter 2084/5212 - loss 0.59861622 - time (sec): 103.97 - samples/sec: 1445.56 - lr: 0.000020 - momentum: 0.000000
2023-10-15 15:55:54,045 epoch 1 - iter 2605/5212 - loss 0.52723282 - time (sec): 128.97 - samples/sec: 1450.94 - lr: 0.000025 - momentum: 0.000000
2023-10-15 15:56:19,828 epoch 1 - iter 3126/5212 - loss 0.48298112 - time (sec): 154.75 - samples/sec: 1443.68 - lr: 0.000030 - momentum: 0.000000
2023-10-15 15:56:45,333 epoch 1 - iter 3647/5212 - loss 0.44686105 - time (sec): 180.26 - samples/sec: 1437.31 - lr: 0.000035 - momentum: 0.000000
2023-10-15 15:57:10,311 epoch 1 - iter 4168/5212 - loss 0.42409620 - time (sec): 205.23 - samples/sec: 1432.13 - lr: 0.000040 - momentum: 0.000000
2023-10-15 15:57:35,519 epoch 1 - iter 4689/5212 - loss 0.39945750 - time (sec): 230.44 - samples/sec: 1435.36 - lr: 0.000045 - momentum: 0.000000
2023-10-15 15:58:00,701 epoch 1 - iter 5210/5212 - loss 0.38219956 - time (sec): 255.62 - samples/sec: 1437.19 - lr: 0.000050 - momentum: 0.000000
2023-10-15 15:58:00,791 ----------------------------------------------------------------------------------------------------
2023-10-15 15:58:00,792 EPOCH 1 done: loss 0.3822 - lr: 0.000050
2023-10-15 15:58:06,673 DEV : loss 0.21987049281597137 - f1-score (micro avg) 0.2501
2023-10-15 15:58:06,699 saving best model
2023-10-15 15:58:07,180 ----------------------------------------------------------------------------------------------------
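The per-iteration lines share a fixed layout, so loss and learning-rate curves can be recovered from this log with a small parser. A hypothetical helper (not part of Flair) using only the standard library:

```python
import re

# Matches the per-iteration lines in this log, e.g.
# "... epoch 1 - iter 521/5212 - loss 1.42428782 - time (sec): 27.34 -
#  samples/sec: 1383.73 - lr: 0.000005 - momentum: 0.000000"
ITER_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<it>\d+)/(?P<total>\d+) - "
    r"loss (?P<loss>[\d.]+) - time \(sec\): (?P<time>[\d.]+) - "
    r"samples/sec: (?P<sps>[\d.]+) - lr: (?P<lr>[\d.]+)"
)

def parse_iter_line(line):
    """Return epoch, iteration, loss, and lr from one log line,
    or None if the line is not a per-iteration record."""
    m = ITER_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {"epoch": int(d["epoch"]), "iter": int(d["it"]),
            "loss": float(d["loss"]), "lr": float(d["lr"])}
```

Feeding each line of the log through `parse_iter_line` and dropping the `None` results yields a list of records ready for plotting.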
2023-10-15 15:58:31,620 epoch 2 - iter 521/5212 - loss 0.19720857 - time (sec): 24.44 - samples/sec: 1439.02 - lr: 0.000049 - momentum: 0.000000
2023-10-15 15:58:57,131 epoch 2 - iter 1042/5212 - loss 0.19191776 - time (sec): 49.95 - samples/sec: 1427.73 - lr: 0.000049 - momentum: 0.000000
2023-10-15 15:59:22,690 epoch 2 - iter 1563/5212 - loss 0.20354397 - time (sec): 75.51 - samples/sec: 1443.82 - lr: 0.000048 - momentum: 0.000000
2023-10-15 15:59:47,964 epoch 2 - iter 2084/5212 - loss 0.20212058 - time (sec): 100.78 - samples/sec: 1438.76 - lr: 0.000048 - momentum: 0.000000
2023-10-15 16:00:13,268 epoch 2 - iter 2605/5212 - loss 0.19518494 - time (sec): 126.09 - samples/sec: 1446.55 - lr: 0.000047 - momentum: 0.000000
2023-10-15 16:00:39,112 epoch 2 - iter 3126/5212 - loss 0.19373199 - time (sec): 151.93 - samples/sec: 1448.57 - lr: 0.000047 - momentum: 0.000000
2023-10-15 16:01:04,251 epoch 2 - iter 3647/5212 - loss 0.19691764 - time (sec): 177.07 - samples/sec: 1449.02 - lr: 0.000046 - momentum: 0.000000
2023-10-15 16:01:29,973 epoch 2 - iter 4168/5212 - loss 0.19565449 - time (sec): 202.79 - samples/sec: 1453.27 - lr: 0.000046 - momentum: 0.000000
2023-10-15 16:01:55,741 epoch 2 - iter 4689/5212 - loss 0.19648370 - time (sec): 228.56 - samples/sec: 1448.67 - lr: 0.000045 - momentum: 0.000000
2023-10-15 16:02:20,995 epoch 2 - iter 5210/5212 - loss 0.19381268 - time (sec): 253.81 - samples/sec: 1446.91 - lr: 0.000044 - momentum: 0.000000
2023-10-15 16:02:21,091 ----------------------------------------------------------------------------------------------------
2023-10-15 16:02:21,092 EPOCH 2 done: loss 0.1937 - lr: 0.000044
2023-10-15 16:02:30,243 DEV : loss 0.14183852076530457 - f1-score (micro avg) 0.3338
2023-10-15 16:02:30,270 saving best model
2023-10-15 16:02:30,692 ----------------------------------------------------------------------------------------------------
2023-10-15 16:02:56,533 epoch 3 - iter 521/5212 - loss 0.15048813 - time (sec): 25.84 - samples/sec: 1508.68 - lr: 0.000044 - momentum: 0.000000
2023-10-15 16:03:22,016 epoch 3 - iter 1042/5212 - loss 0.14728958 - time (sec): 51.32 - samples/sec: 1511.09 - lr: 0.000043 - momentum: 0.000000
2023-10-15 16:03:47,329 epoch 3 - iter 1563/5212 - loss 0.14397991 - time (sec): 76.64 - samples/sec: 1498.91 - lr: 0.000043 - momentum: 0.000000
2023-10-15 16:04:11,841 epoch 3 - iter 2084/5212 - loss 0.15028463 - time (sec): 101.15 - samples/sec: 1487.12 - lr: 0.000042 - momentum: 0.000000
2023-10-15 16:04:36,934 epoch 3 - iter 2605/5212 - loss 0.14782278 - time (sec): 126.24 - samples/sec: 1494.40 - lr: 0.000042 - momentum: 0.000000
2023-10-15 16:05:01,951 epoch 3 - iter 3126/5212 - loss 0.14679753 - time (sec): 151.26 - samples/sec: 1475.27 - lr: 0.000041 - momentum: 0.000000
2023-10-15 16:05:26,925 epoch 3 - iter 3647/5212 - loss 0.14660018 - time (sec): 176.23 - samples/sec: 1465.35 - lr: 0.000041 - momentum: 0.000000
2023-10-15 16:05:51,750 epoch 3 - iter 4168/5212 - loss 0.14632403 - time (sec): 201.06 - samples/sec: 1458.10 - lr: 0.000040 - momentum: 0.000000
2023-10-15 16:06:17,508 epoch 3 - iter 4689/5212 - loss 0.14395152 - time (sec): 226.81 - samples/sec: 1459.44 - lr: 0.000039 - momentum: 0.000000
2023-10-15 16:06:42,520 epoch 3 - iter 5210/5212 - loss 0.14349579 - time (sec): 251.83 - samples/sec: 1458.62 - lr: 0.000039 - momentum: 0.000000
2023-10-15 16:06:42,612 ----------------------------------------------------------------------------------------------------
2023-10-15 16:06:42,612 EPOCH 3 done: loss 0.1435 - lr: 0.000039
2023-10-15 16:06:51,777 DEV : loss 0.2369118481874466 - f1-score (micro avg) 0.3756
2023-10-15 16:06:51,804 saving best model
2023-10-15 16:06:52,374 ----------------------------------------------------------------------------------------------------
2023-10-15 16:07:18,275 epoch 4 - iter 521/5212 - loss 0.09929994 - time (sec): 25.90 - samples/sec: 1501.45 - lr: 0.000038 - momentum: 0.000000
2023-10-15 16:07:43,152 epoch 4 - iter 1042/5212 - loss 0.10520802 - time (sec): 50.77 - samples/sec: 1429.70 - lr: 0.000038 - momentum: 0.000000
2023-10-15 16:08:08,420 epoch 4 - iter 1563/5212 - loss 0.10690963 - time (sec): 76.04 - samples/sec: 1444.08 - lr: 0.000037 - momentum: 0.000000
2023-10-15 16:08:33,597 epoch 4 - iter 2084/5212 - loss 0.10454649 - time (sec): 101.22 - samples/sec: 1434.91 - lr: 0.000037 - momentum: 0.000000
2023-10-15 16:08:58,727 epoch 4 - iter 2605/5212 - loss 0.10155165 - time (sec): 126.35 - samples/sec: 1441.42 - lr: 0.000036 - momentum: 0.000000
2023-10-15 16:09:23,935 epoch 4 - iter 3126/5212 - loss 0.10263530 - time (sec): 151.56 - samples/sec: 1449.87 - lr: 0.000036 - momentum: 0.000000
2023-10-15 16:09:49,001 epoch 4 - iter 3647/5212 - loss 0.10402987 - time (sec): 176.62 - samples/sec: 1449.15 - lr: 0.000035 - momentum: 0.000000
2023-10-15 16:10:14,571 epoch 4 - iter 4168/5212 - loss 0.10264461 - time (sec): 202.19 - samples/sec: 1444.15 - lr: 0.000034 - momentum: 0.000000
2023-10-15 16:10:39,072 epoch 4 - iter 4689/5212 - loss 0.10543035 - time (sec): 226.69 - samples/sec: 1454.18 - lr: 0.000034 - momentum: 0.000000
2023-10-15 16:11:04,451 epoch 4 - iter 5210/5212 - loss 0.10654412 - time (sec): 252.07 - samples/sec: 1457.38 - lr: 0.000033 - momentum: 0.000000
2023-10-15 16:11:04,540 ----------------------------------------------------------------------------------------------------
2023-10-15 16:11:04,541 EPOCH 4 done: loss 0.1065 - lr: 0.000033
2023-10-15 16:11:12,877 DEV : loss 0.2565504312515259 - f1-score (micro avg) 0.3358
2023-10-15 16:11:12,907 ----------------------------------------------------------------------------------------------------
2023-10-15 16:11:37,905 epoch 5 - iter 521/5212 - loss 0.08039621 - time (sec): 25.00 - samples/sec: 1471.29 - lr: 0.000033 - momentum: 0.000000
2023-10-15 16:12:03,919 epoch 5 - iter 1042/5212 - loss 0.07489001 - time (sec): 51.01 - samples/sec: 1533.08 - lr: 0.000032 - momentum: 0.000000
2023-10-15 16:12:29,068 epoch 5 - iter 1563/5212 - loss 0.07660733 - time (sec): 76.16 - samples/sec: 1493.86 - lr: 0.000032 - momentum: 0.000000
2023-10-15 16:12:55,057 epoch 5 - iter 2084/5212 - loss 0.07802326 - time (sec): 102.15 - samples/sec: 1493.44 - lr: 0.000031 - momentum: 0.000000
2023-10-15 16:13:19,934 epoch 5 - iter 2605/5212 - loss 0.07755748 - time (sec): 127.03 - samples/sec: 1480.12 - lr: 0.000031 - momentum: 0.000000
2023-10-15 16:13:45,236 epoch 5 - iter 3126/5212 - loss 0.07654622 - time (sec): 152.33 - samples/sec: 1485.52 - lr: 0.000030 - momentum: 0.000000
2023-10-15 16:14:10,591 epoch 5 - iter 3647/5212 - loss 0.07603185 - time (sec): 177.68 - samples/sec: 1468.77 - lr: 0.000029 - momentum: 0.000000
2023-10-15 16:14:35,469 epoch 5 - iter 4168/5212 - loss 0.07706494 - time (sec): 202.56 - samples/sec: 1462.78 - lr: 0.000029 - momentum: 0.000000
2023-10-15 16:15:00,775 epoch 5 - iter 4689/5212 - loss 0.07815992 - time (sec): 227.87 - samples/sec: 1466.35 - lr: 0.000028 - momentum: 0.000000
2023-10-15 16:15:25,340 epoch 5 - iter 5210/5212 - loss 0.07889764 - time (sec): 252.43 - samples/sec: 1455.02 - lr: 0.000028 - momentum: 0.000000
2023-10-15 16:15:25,429 ----------------------------------------------------------------------------------------------------
2023-10-15 16:15:25,429 EPOCH 5 done: loss 0.0789 - lr: 0.000028
2023-10-15 16:15:33,628 DEV : loss 0.3160976767539978 - f1-score (micro avg) 0.3371
2023-10-15 16:15:33,657 ----------------------------------------------------------------------------------------------------
2023-10-15 16:15:59,280 epoch 6 - iter 521/5212 - loss 0.04474731 - time (sec): 25.62 - samples/sec: 1505.56 - lr: 0.000027 - momentum: 0.000000
2023-10-15 16:16:24,095 epoch 6 - iter 1042/5212 - loss 0.05493988 - time (sec): 50.44 - samples/sec: 1477.06 - lr: 0.000027 - momentum: 0.000000
2023-10-15 16:16:49,065 epoch 6 - iter 1563/5212 - loss 0.05407312 - time (sec): 75.41 - samples/sec: 1483.28 - lr: 0.000026 - momentum: 0.000000
2023-10-15 16:17:13,915 epoch 6 - iter 2084/5212 - loss 0.05856282 - time (sec): 100.26 - samples/sec: 1461.36 - lr: 0.000026 - momentum: 0.000000
2023-10-15 16:17:39,994 epoch 6 - iter 2605/5212 - loss 0.05768703 - time (sec): 126.34 - samples/sec: 1456.59 - lr: 0.000025 - momentum: 0.000000
2023-10-15 16:18:05,347 epoch 6 - iter 3126/5212 - loss 0.05931224 - time (sec): 151.69 - samples/sec: 1461.36 - lr: 0.000024 - momentum: 0.000000
2023-10-15 16:18:31,277 epoch 6 - iter 3647/5212 - loss 0.05884553 - time (sec): 177.62 - samples/sec: 1474.82 - lr: 0.000024 - momentum: 0.000000
2023-10-15 16:18:55,841 epoch 6 - iter 4168/5212 - loss 0.05923454 - time (sec): 202.18 - samples/sec: 1463.15 - lr: 0.000023 - momentum: 0.000000
2023-10-15 16:19:20,781 epoch 6 - iter 4689/5212 - loss 0.05907592 - time (sec): 227.12 - samples/sec: 1461.79 - lr: 0.000023 - momentum: 0.000000
2023-10-15 16:19:45,618 epoch 6 - iter 5210/5212 - loss 0.05900658 - time (sec): 251.96 - samples/sec: 1458.07 - lr: 0.000022 - momentum: 0.000000
2023-10-15 16:19:45,712 ----------------------------------------------------------------------------------------------------
2023-10-15 16:19:45,713 EPOCH 6 done: loss 0.0590 - lr: 0.000022
2023-10-15 16:19:53,890 DEV : loss 0.3008023798465729 - f1-score (micro avg) 0.368
2023-10-15 16:19:53,917 ----------------------------------------------------------------------------------------------------
2023-10-15 16:20:18,546 epoch 7 - iter 521/5212 - loss 0.03798659 - time (sec): 24.63 - samples/sec: 1370.28 - lr: 0.000022 - momentum: 0.000000
2023-10-15 16:20:43,389 epoch 7 - iter 1042/5212 - loss 0.04416541 - time (sec): 49.47 - samples/sec: 1404.13 - lr: 0.000021 - momentum: 0.000000
2023-10-15 16:21:08,953 epoch 7 - iter 1563/5212 - loss 0.04343458 - time (sec): 75.03 - samples/sec: 1442.14 - lr: 0.000021 - momentum: 0.000000
2023-10-15 16:21:33,746 epoch 7 - iter 2084/5212 - loss 0.04461767 - time (sec): 99.83 - samples/sec: 1433.43 - lr: 0.000020 - momentum: 0.000000
2023-10-15 16:21:58,554 epoch 7 - iter 2605/5212 - loss 0.04703147 - time (sec): 124.64 - samples/sec: 1421.12 - lr: 0.000019 - momentum: 0.000000
2023-10-15 16:22:23,462 epoch 7 - iter 3126/5212 - loss 0.04510966 - time (sec): 149.54 - samples/sec: 1431.70 - lr: 0.000019 - momentum: 0.000000
2023-10-15 16:22:48,842 epoch 7 - iter 3647/5212 - loss 0.04451587 - time (sec): 174.92 - samples/sec: 1436.54 - lr: 0.000018 - momentum: 0.000000
2023-10-15 16:23:15,248 epoch 7 - iter 4168/5212 - loss 0.04275772 - time (sec): 201.33 - samples/sec: 1442.37 - lr: 0.000018 - momentum: 0.000000
2023-10-15 16:23:40,584 epoch 7 - iter 4689/5212 - loss 0.04210492 - time (sec): 226.67 - samples/sec: 1452.47 - lr: 0.000017 - momentum: 0.000000
2023-10-15 16:24:06,364 epoch 7 - iter 5210/5212 - loss 0.04313822 - time (sec): 252.45 - samples/sec: 1454.14 - lr: 0.000017 - momentum: 0.000000
2023-10-15 16:24:06,488 ----------------------------------------------------------------------------------------------------
2023-10-15 16:24:06,488 EPOCH 7 done: loss 0.0438 - lr: 0.000017
2023-10-15 16:24:14,811 DEV : loss 0.3495810329914093 - f1-score (micro avg) 0.3583
2023-10-15 16:24:14,840 ----------------------------------------------------------------------------------------------------
2023-10-15 16:24:40,928 epoch 8 - iter 521/5212 - loss 0.02741041 - time (sec): 26.09 - samples/sec: 1466.39 - lr: 0.000016 - momentum: 0.000000
2023-10-15 16:25:06,388 epoch 8 - iter 1042/5212 - loss 0.02746990 - time (sec): 51.55 - samples/sec: 1456.12 - lr: 0.000016 - momentum: 0.000000
2023-10-15 16:25:31,461 epoch 8 - iter 1563/5212 - loss 0.02860619 - time (sec): 76.62 - samples/sec: 1420.26 - lr: 0.000015 - momentum: 0.000000
2023-10-15 16:25:56,533 epoch 8 - iter 2084/5212 - loss 0.02731631 - time (sec): 101.69 - samples/sec: 1426.33 - lr: 0.000014 - momentum: 0.000000
2023-10-15 16:26:21,417 epoch 8 - iter 2605/5212 - loss 0.02846145 - time (sec): 126.58 - samples/sec: 1414.78 - lr: 0.000014 - momentum: 0.000000
2023-10-15 16:26:46,888 epoch 8 - iter 3126/5212 - loss 0.03072091 - time (sec): 152.05 - samples/sec: 1420.92 - lr: 0.000013 - momentum: 0.000000
2023-10-15 16:27:12,580 epoch 8 - iter 3647/5212 - loss 0.03088493 - time (sec): 177.74 - samples/sec: 1443.38 - lr: 0.000013 - momentum: 0.000000
2023-10-15 16:27:37,843 epoch 8 - iter 4168/5212 - loss 0.03251536 - time (sec): 203.00 - samples/sec: 1450.92 - lr: 0.000012 - momentum: 0.000000
2023-10-15 16:28:02,636 epoch 8 - iter 4689/5212 - loss 0.03281107 - time (sec): 227.80 - samples/sec: 1448.74 - lr: 0.000012 - momentum: 0.000000
2023-10-15 16:28:28,068 epoch 8 - iter 5210/5212 - loss 0.03217309 - time (sec): 253.23 - samples/sec: 1450.23 - lr: 0.000011 - momentum: 0.000000
2023-10-15 16:28:28,165 ----------------------------------------------------------------------------------------------------
2023-10-15 16:28:28,165 EPOCH 8 done: loss 0.0322 - lr: 0.000011
2023-10-15 16:28:37,168 DEV : loss 0.43352392315864563 - f1-score (micro avg) 0.3385
2023-10-15 16:28:37,196 ----------------------------------------------------------------------------------------------------
2023-10-15 16:29:01,956 epoch 9 - iter 521/5212 - loss 0.02250316 - time (sec): 24.76 - samples/sec: 1401.16 - lr: 0.000011 - momentum: 0.000000
2023-10-15 16:29:26,508 epoch 9 - iter 1042/5212 - loss 0.02177090 - time (sec): 49.31 - samples/sec: 1390.23 - lr: 0.000010 - momentum: 0.000000
2023-10-15 16:29:52,152 epoch 9 - iter 1563/5212 - loss 0.02249285 - time (sec): 74.95 - samples/sec: 1445.22 - lr: 0.000009 - momentum: 0.000000
2023-10-15 16:30:17,335 epoch 9 - iter 2084/5212 - loss 0.02163876 - time (sec): 100.14 - samples/sec: 1453.12 - lr: 0.000009 - momentum: 0.000000
2023-10-15 16:30:42,609 epoch 9 - iter 2605/5212 - loss 0.02204627 - time (sec): 125.41 - samples/sec: 1463.78 - lr: 0.000008 - momentum: 0.000000
2023-10-15 16:31:07,865 epoch 9 - iter 3126/5212 - loss 0.02294189 - time (sec): 150.67 - samples/sec: 1460.80 - lr: 0.000008 - momentum: 0.000000
2023-10-15 16:31:33,092 epoch 9 - iter 3647/5212 - loss 0.02268476 - time (sec): 175.89 - samples/sec: 1464.48 - lr: 0.000007 - momentum: 0.000000
2023-10-15 16:31:57,984 epoch 9 - iter 4168/5212 - loss 0.02261921 - time (sec): 200.79 - samples/sec: 1466.59 - lr: 0.000007 - momentum: 0.000000
2023-10-15 16:32:23,394 epoch 9 - iter 4689/5212 - loss 0.02185791 - time (sec): 226.20 - samples/sec: 1464.77 - lr: 0.000006 - momentum: 0.000000
2023-10-15 16:32:48,509 epoch 9 - iter 5210/5212 - loss 0.02137368 - time (sec): 251.31 - samples/sec: 1461.79 - lr: 0.000006 - momentum: 0.000000
2023-10-15 16:32:48,596 ----------------------------------------------------------------------------------------------------
2023-10-15 16:32:48,596 EPOCH 9 done: loss 0.0214 - lr: 0.000006
2023-10-15 16:32:57,649 DEV : loss 0.3968445360660553 - f1-score (micro avg) 0.3705
2023-10-15 16:32:57,681 ----------------------------------------------------------------------------------------------------
2023-10-15 16:33:23,189 epoch 10 - iter 521/5212 - loss 0.01589295 - time (sec): 25.51 - samples/sec: 1458.41 - lr: 0.000005 - momentum: 0.000000
2023-10-15 16:33:48,367 epoch 10 - iter 1042/5212 - loss 0.01651173 - time (sec): 50.68 - samples/sec: 1457.84 - lr: 0.000004 - momentum: 0.000000
2023-10-15 16:34:14,003 epoch 10 - iter 1563/5212 - loss 0.01399251 - time (sec): 76.32 - samples/sec: 1486.45 - lr: 0.000004 - momentum: 0.000000
2023-10-15 16:34:39,214 epoch 10 - iter 2084/5212 - loss 0.01332742 - time (sec): 101.53 - samples/sec: 1479.59 - lr: 0.000003 - momentum: 0.000000
2023-10-15 16:35:03,758 epoch 10 - iter 2605/5212 - loss 0.01404225 - time (sec): 126.08 - samples/sec: 1463.52 - lr: 0.000003 - momentum: 0.000000
2023-10-15 16:35:28,888 epoch 10 - iter 3126/5212 - loss 0.01439832 - time (sec): 151.21 - samples/sec: 1460.59 - lr: 0.000002 - momentum: 0.000000
2023-10-15 16:35:54,359 epoch 10 - iter 3647/5212 - loss 0.01458812 - time (sec): 176.68 - samples/sec: 1463.83 - lr: 0.000002 - momentum: 0.000000
2023-10-15 16:36:19,151 epoch 10 - iter 4168/5212 - loss 0.01415824 - time (sec): 201.47 - samples/sec: 1459.02 - lr: 0.000001 - momentum: 0.000000
2023-10-15 16:36:44,161 epoch 10 - iter 4689/5212 - loss 0.01387390 - time (sec): 226.48 - samples/sec: 1459.48 - lr: 0.000001 - momentum: 0.000000
2023-10-15 16:37:09,000 epoch 10 - iter 5210/5212 - loss 0.01389275 - time (sec): 251.32 - samples/sec: 1461.92 - lr: 0.000000 - momentum: 0.000000
2023-10-15 16:37:09,090 ----------------------------------------------------------------------------------------------------
2023-10-15 16:37:09,090 EPOCH 10 done: loss 0.0139 - lr: 0.000000
2023-10-15 16:37:18,126 DEV : loss 0.4460391700267792 - f1-score (micro avg) 0.3667
2023-10-15 16:37:18,535 ----------------------------------------------------------------------------------------------------
2023-10-15 16:37:18,536 Loading model from best epoch ...
2023-10-15 16:37:20,025 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 16:37:36,732
Results:
- F-score (micro) 0.4563
- F-score (macro) 0.2731
- Accuracy 0.3008
By class:
              precision    recall  f1-score   support

         LOC     0.5006    0.6573    0.5684      1214
         PER     0.3482    0.4356    0.3870       808
         ORG     0.1806    0.1105    0.1371       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4215    0.4975    0.4563      2390
   macro avg     0.2573    0.3009    0.2731      2390
weighted avg     0.3987    0.4975    0.4398      2390
2023-10-15 16:37:36,733 ----------------------------------------------------------------------------------------------------
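The aggregate scores in the final table can be reproduced from the per-class rows: macro F1 is the unweighted mean of the class F1 scores, and micro F1 is the harmonic mean of micro precision and recall. A quick sanity check (values copied from the log; small rounding differences are expected since the log only prints four decimals):

```python
# class: (precision, recall, f1-score, support), from the table above
per_class = {
    "LOC":       (0.5006, 0.6573, 0.5684, 1214),
    "PER":       (0.3482, 0.4356, 0.3870,  808),
    "ORG":       (0.1806, 0.1105, 0.1371,  353),
    "HumanProd": (0.0000, 0.0000, 0.0000,   15),
}

# Macro F1: unweighted mean of per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Micro F1: harmonic mean of the micro-averaged precision and recall.
p, r = 0.4215, 0.4975
micro_f1 = 2 * p * r / (p + r)
```

This recovers the logged macro F1 of 0.2731 exactly and the micro F1 of 0.4563 to within rounding.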