2023-10-15 15:53:45,075 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,076 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-15 15:53:45,076 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,076 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 15:53:45,076 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,076 Train: 20847 sentences
2023-10-15 15:53:45,076 (train_with_dev=False, train_with_test=False)
2023-10-15 15:53:45,076 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,076 Training Params:
2023-10-15 15:53:45,077 - learning_rate: "5e-05"
2023-10-15 15:53:45,077 - mini_batch_size: "4"
2023-10-15 15:53:45,077 - max_epochs: "10"
2023-10-15 15:53:45,077 - shuffle: "True"
2023-10-15 15:53:45,077 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,077 Plugins:
2023-10-15 15:53:45,077 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 15:53:45,077 ----------------------------------------------------------------------------------------------------
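The LinearScheduler plugin with `warmup_fraction: '0.1'` explains the `lr` column in the iteration logs below: the rate ramps up linearly over the first 10% of the 52,120 total steps (10 epochs × 5,212 batches), peaks at the configured 5e-05 at the end of epoch 1, then decays linearly to zero. A minimal sketch of that schedule in plain Python (not Flair's implementation; function name and one-based step indexing are assumptions for illustration):

```python
def linear_schedule_lr(step, total_steps=52120, peak_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (step is 1-based)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 5212 steps = one epoch here
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Tracks the logged lr column:
# epoch 1, iter 521   (step 521)   -> ~0.000005
# epoch 1, iter 5210  (step 5210)  -> ~0.000050
# epoch 2, iter 521   (step 5733)  -> ~0.000049
# epoch 10, iter 5210 (step 52118) -> ~0.000000
```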
|
2023-10-15 15:53:45,077 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 15:53:45,077 - metric: "('micro avg', 'f1-score')"
2023-10-15 15:53:45,077 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,077 Computation:
2023-10-15 15:53:45,077 - compute on device: cuda:0
2023-10-15 15:53:45,077 - embedding storage: none
2023-10-15 15:53:45,077 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,077 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-15 15:53:45,077 ----------------------------------------------------------------------------------------------------
2023-10-15 15:53:45,077 ----------------------------------------------------------------------------------------------------
|
2023-10-15 15:54:12,422 epoch 1 - iter 521/5212 - loss 1.42428782 - time (sec): 27.34 - samples/sec: 1383.73 - lr: 0.000005 - momentum: 0.000000
2023-10-15 15:54:37,920 epoch 1 - iter 1042/5212 - loss 0.90590414 - time (sec): 52.84 - samples/sec: 1413.53 - lr: 0.000010 - momentum: 0.000000
2023-10-15 15:55:04,008 epoch 1 - iter 1563/5212 - loss 0.69424645 - time (sec): 78.93 - samples/sec: 1451.87 - lr: 0.000015 - momentum: 0.000000
2023-10-15 15:55:29,047 epoch 1 - iter 2084/5212 - loss 0.59861622 - time (sec): 103.97 - samples/sec: 1445.56 - lr: 0.000020 - momentum: 0.000000
2023-10-15 15:55:54,045 epoch 1 - iter 2605/5212 - loss 0.52723282 - time (sec): 128.97 - samples/sec: 1450.94 - lr: 0.000025 - momentum: 0.000000
2023-10-15 15:56:19,828 epoch 1 - iter 3126/5212 - loss 0.48298112 - time (sec): 154.75 - samples/sec: 1443.68 - lr: 0.000030 - momentum: 0.000000
2023-10-15 15:56:45,333 epoch 1 - iter 3647/5212 - loss 0.44686105 - time (sec): 180.26 - samples/sec: 1437.31 - lr: 0.000035 - momentum: 0.000000
2023-10-15 15:57:10,311 epoch 1 - iter 4168/5212 - loss 0.42409620 - time (sec): 205.23 - samples/sec: 1432.13 - lr: 0.000040 - momentum: 0.000000
2023-10-15 15:57:35,519 epoch 1 - iter 4689/5212 - loss 0.39945750 - time (sec): 230.44 - samples/sec: 1435.36 - lr: 0.000045 - momentum: 0.000000
2023-10-15 15:58:00,701 epoch 1 - iter 5210/5212 - loss 0.38219956 - time (sec): 255.62 - samples/sec: 1437.19 - lr: 0.000050 - momentum: 0.000000
2023-10-15 15:58:00,791 ----------------------------------------------------------------------------------------------------
2023-10-15 15:58:00,792 EPOCH 1 done: loss 0.3822 - lr: 0.000050
2023-10-15 15:58:06,673 DEV : loss 0.21987049281597137 - f1-score (micro avg) 0.2501
2023-10-15 15:58:06,699 saving best model
2023-10-15 15:58:07,180 ----------------------------------------------------------------------------------------------------
2023-10-15 15:58:31,620 epoch 2 - iter 521/5212 - loss 0.19720857 - time (sec): 24.44 - samples/sec: 1439.02 - lr: 0.000049 - momentum: 0.000000
2023-10-15 15:58:57,131 epoch 2 - iter 1042/5212 - loss 0.19191776 - time (sec): 49.95 - samples/sec: 1427.73 - lr: 0.000049 - momentum: 0.000000
2023-10-15 15:59:22,690 epoch 2 - iter 1563/5212 - loss 0.20354397 - time (sec): 75.51 - samples/sec: 1443.82 - lr: 0.000048 - momentum: 0.000000
2023-10-15 15:59:47,964 epoch 2 - iter 2084/5212 - loss 0.20212058 - time (sec): 100.78 - samples/sec: 1438.76 - lr: 0.000048 - momentum: 0.000000
2023-10-15 16:00:13,268 epoch 2 - iter 2605/5212 - loss 0.19518494 - time (sec): 126.09 - samples/sec: 1446.55 - lr: 0.000047 - momentum: 0.000000
2023-10-15 16:00:39,112 epoch 2 - iter 3126/5212 - loss 0.19373199 - time (sec): 151.93 - samples/sec: 1448.57 - lr: 0.000047 - momentum: 0.000000
2023-10-15 16:01:04,251 epoch 2 - iter 3647/5212 - loss 0.19691764 - time (sec): 177.07 - samples/sec: 1449.02 - lr: 0.000046 - momentum: 0.000000
2023-10-15 16:01:29,973 epoch 2 - iter 4168/5212 - loss 0.19565449 - time (sec): 202.79 - samples/sec: 1453.27 - lr: 0.000046 - momentum: 0.000000
2023-10-15 16:01:55,741 epoch 2 - iter 4689/5212 - loss 0.19648370 - time (sec): 228.56 - samples/sec: 1448.67 - lr: 0.000045 - momentum: 0.000000
2023-10-15 16:02:20,995 epoch 2 - iter 5210/5212 - loss 0.19381268 - time (sec): 253.81 - samples/sec: 1446.91 - lr: 0.000044 - momentum: 0.000000
2023-10-15 16:02:21,091 ----------------------------------------------------------------------------------------------------
2023-10-15 16:02:21,092 EPOCH 2 done: loss 0.1937 - lr: 0.000044
2023-10-15 16:02:30,243 DEV : loss 0.14183852076530457 - f1-score (micro avg) 0.3338
2023-10-15 16:02:30,270 saving best model
2023-10-15 16:02:30,692 ----------------------------------------------------------------------------------------------------
2023-10-15 16:02:56,533 epoch 3 - iter 521/5212 - loss 0.15048813 - time (sec): 25.84 - samples/sec: 1508.68 - lr: 0.000044 - momentum: 0.000000
2023-10-15 16:03:22,016 epoch 3 - iter 1042/5212 - loss 0.14728958 - time (sec): 51.32 - samples/sec: 1511.09 - lr: 0.000043 - momentum: 0.000000
2023-10-15 16:03:47,329 epoch 3 - iter 1563/5212 - loss 0.14397991 - time (sec): 76.64 - samples/sec: 1498.91 - lr: 0.000043 - momentum: 0.000000
2023-10-15 16:04:11,841 epoch 3 - iter 2084/5212 - loss 0.15028463 - time (sec): 101.15 - samples/sec: 1487.12 - lr: 0.000042 - momentum: 0.000000
2023-10-15 16:04:36,934 epoch 3 - iter 2605/5212 - loss 0.14782278 - time (sec): 126.24 - samples/sec: 1494.40 - lr: 0.000042 - momentum: 0.000000
2023-10-15 16:05:01,951 epoch 3 - iter 3126/5212 - loss 0.14679753 - time (sec): 151.26 - samples/sec: 1475.27 - lr: 0.000041 - momentum: 0.000000
2023-10-15 16:05:26,925 epoch 3 - iter 3647/5212 - loss 0.14660018 - time (sec): 176.23 - samples/sec: 1465.35 - lr: 0.000041 - momentum: 0.000000
2023-10-15 16:05:51,750 epoch 3 - iter 4168/5212 - loss 0.14632403 - time (sec): 201.06 - samples/sec: 1458.10 - lr: 0.000040 - momentum: 0.000000
2023-10-15 16:06:17,508 epoch 3 - iter 4689/5212 - loss 0.14395152 - time (sec): 226.81 - samples/sec: 1459.44 - lr: 0.000039 - momentum: 0.000000
2023-10-15 16:06:42,520 epoch 3 - iter 5210/5212 - loss 0.14349579 - time (sec): 251.83 - samples/sec: 1458.62 - lr: 0.000039 - momentum: 0.000000
2023-10-15 16:06:42,612 ----------------------------------------------------------------------------------------------------
2023-10-15 16:06:42,612 EPOCH 3 done: loss 0.1435 - lr: 0.000039
2023-10-15 16:06:51,777 DEV : loss 0.2369118481874466 - f1-score (micro avg) 0.3756
2023-10-15 16:06:51,804 saving best model
2023-10-15 16:06:52,374 ----------------------------------------------------------------------------------------------------
|
2023-10-15 16:07:18,275 epoch 4 - iter 521/5212 - loss 0.09929994 - time (sec): 25.90 - samples/sec: 1501.45 - lr: 0.000038 - momentum: 0.000000
2023-10-15 16:07:43,152 epoch 4 - iter 1042/5212 - loss 0.10520802 - time (sec): 50.77 - samples/sec: 1429.70 - lr: 0.000038 - momentum: 0.000000
2023-10-15 16:08:08,420 epoch 4 - iter 1563/5212 - loss 0.10690963 - time (sec): 76.04 - samples/sec: 1444.08 - lr: 0.000037 - momentum: 0.000000
2023-10-15 16:08:33,597 epoch 4 - iter 2084/5212 - loss 0.10454649 - time (sec): 101.22 - samples/sec: 1434.91 - lr: 0.000037 - momentum: 0.000000
2023-10-15 16:08:58,727 epoch 4 - iter 2605/5212 - loss 0.10155165 - time (sec): 126.35 - samples/sec: 1441.42 - lr: 0.000036 - momentum: 0.000000
2023-10-15 16:09:23,935 epoch 4 - iter 3126/5212 - loss 0.10263530 - time (sec): 151.56 - samples/sec: 1449.87 - lr: 0.000036 - momentum: 0.000000
2023-10-15 16:09:49,001 epoch 4 - iter 3647/5212 - loss 0.10402987 - time (sec): 176.62 - samples/sec: 1449.15 - lr: 0.000035 - momentum: 0.000000
2023-10-15 16:10:14,571 epoch 4 - iter 4168/5212 - loss 0.10264461 - time (sec): 202.19 - samples/sec: 1444.15 - lr: 0.000034 - momentum: 0.000000
2023-10-15 16:10:39,072 epoch 4 - iter 4689/5212 - loss 0.10543035 - time (sec): 226.69 - samples/sec: 1454.18 - lr: 0.000034 - momentum: 0.000000
2023-10-15 16:11:04,451 epoch 4 - iter 5210/5212 - loss 0.10654412 - time (sec): 252.07 - samples/sec: 1457.38 - lr: 0.000033 - momentum: 0.000000
2023-10-15 16:11:04,540 ----------------------------------------------------------------------------------------------------
2023-10-15 16:11:04,541 EPOCH 4 done: loss 0.1065 - lr: 0.000033
2023-10-15 16:11:12,877 DEV : loss 0.2565504312515259 - f1-score (micro avg) 0.3358
2023-10-15 16:11:12,907 ----------------------------------------------------------------------------------------------------
2023-10-15 16:11:37,905 epoch 5 - iter 521/5212 - loss 0.08039621 - time (sec): 25.00 - samples/sec: 1471.29 - lr: 0.000033 - momentum: 0.000000
2023-10-15 16:12:03,919 epoch 5 - iter 1042/5212 - loss 0.07489001 - time (sec): 51.01 - samples/sec: 1533.08 - lr: 0.000032 - momentum: 0.000000
2023-10-15 16:12:29,068 epoch 5 - iter 1563/5212 - loss 0.07660733 - time (sec): 76.16 - samples/sec: 1493.86 - lr: 0.000032 - momentum: 0.000000
2023-10-15 16:12:55,057 epoch 5 - iter 2084/5212 - loss 0.07802326 - time (sec): 102.15 - samples/sec: 1493.44 - lr: 0.000031 - momentum: 0.000000
2023-10-15 16:13:19,934 epoch 5 - iter 2605/5212 - loss 0.07755748 - time (sec): 127.03 - samples/sec: 1480.12 - lr: 0.000031 - momentum: 0.000000
2023-10-15 16:13:45,236 epoch 5 - iter 3126/5212 - loss 0.07654622 - time (sec): 152.33 - samples/sec: 1485.52 - lr: 0.000030 - momentum: 0.000000
2023-10-15 16:14:10,591 epoch 5 - iter 3647/5212 - loss 0.07603185 - time (sec): 177.68 - samples/sec: 1468.77 - lr: 0.000029 - momentum: 0.000000
2023-10-15 16:14:35,469 epoch 5 - iter 4168/5212 - loss 0.07706494 - time (sec): 202.56 - samples/sec: 1462.78 - lr: 0.000029 - momentum: 0.000000
2023-10-15 16:15:00,775 epoch 5 - iter 4689/5212 - loss 0.07815992 - time (sec): 227.87 - samples/sec: 1466.35 - lr: 0.000028 - momentum: 0.000000
2023-10-15 16:15:25,340 epoch 5 - iter 5210/5212 - loss 0.07889764 - time (sec): 252.43 - samples/sec: 1455.02 - lr: 0.000028 - momentum: 0.000000
2023-10-15 16:15:25,429 ----------------------------------------------------------------------------------------------------
2023-10-15 16:15:25,429 EPOCH 5 done: loss 0.0789 - lr: 0.000028
2023-10-15 16:15:33,628 DEV : loss 0.3160976767539978 - f1-score (micro avg) 0.3371
2023-10-15 16:15:33,657 ----------------------------------------------------------------------------------------------------
2023-10-15 16:15:59,280 epoch 6 - iter 521/5212 - loss 0.04474731 - time (sec): 25.62 - samples/sec: 1505.56 - lr: 0.000027 - momentum: 0.000000
2023-10-15 16:16:24,095 epoch 6 - iter 1042/5212 - loss 0.05493988 - time (sec): 50.44 - samples/sec: 1477.06 - lr: 0.000027 - momentum: 0.000000
2023-10-15 16:16:49,065 epoch 6 - iter 1563/5212 - loss 0.05407312 - time (sec): 75.41 - samples/sec: 1483.28 - lr: 0.000026 - momentum: 0.000000
2023-10-15 16:17:13,915 epoch 6 - iter 2084/5212 - loss 0.05856282 - time (sec): 100.26 - samples/sec: 1461.36 - lr: 0.000026 - momentum: 0.000000
2023-10-15 16:17:39,994 epoch 6 - iter 2605/5212 - loss 0.05768703 - time (sec): 126.34 - samples/sec: 1456.59 - lr: 0.000025 - momentum: 0.000000
2023-10-15 16:18:05,347 epoch 6 - iter 3126/5212 - loss 0.05931224 - time (sec): 151.69 - samples/sec: 1461.36 - lr: 0.000024 - momentum: 0.000000
2023-10-15 16:18:31,277 epoch 6 - iter 3647/5212 - loss 0.05884553 - time (sec): 177.62 - samples/sec: 1474.82 - lr: 0.000024 - momentum: 0.000000
2023-10-15 16:18:55,841 epoch 6 - iter 4168/5212 - loss 0.05923454 - time (sec): 202.18 - samples/sec: 1463.15 - lr: 0.000023 - momentum: 0.000000
2023-10-15 16:19:20,781 epoch 6 - iter 4689/5212 - loss 0.05907592 - time (sec): 227.12 - samples/sec: 1461.79 - lr: 0.000023 - momentum: 0.000000
2023-10-15 16:19:45,618 epoch 6 - iter 5210/5212 - loss 0.05900658 - time (sec): 251.96 - samples/sec: 1458.07 - lr: 0.000022 - momentum: 0.000000
2023-10-15 16:19:45,712 ----------------------------------------------------------------------------------------------------
2023-10-15 16:19:45,713 EPOCH 6 done: loss 0.0590 - lr: 0.000022
2023-10-15 16:19:53,890 DEV : loss 0.3008023798465729 - f1-score (micro avg) 0.368
2023-10-15 16:19:53,917 ----------------------------------------------------------------------------------------------------
|
2023-10-15 16:20:18,546 epoch 7 - iter 521/5212 - loss 0.03798659 - time (sec): 24.63 - samples/sec: 1370.28 - lr: 0.000022 - momentum: 0.000000
2023-10-15 16:20:43,389 epoch 7 - iter 1042/5212 - loss 0.04416541 - time (sec): 49.47 - samples/sec: 1404.13 - lr: 0.000021 - momentum: 0.000000
2023-10-15 16:21:08,953 epoch 7 - iter 1563/5212 - loss 0.04343458 - time (sec): 75.03 - samples/sec: 1442.14 - lr: 0.000021 - momentum: 0.000000
2023-10-15 16:21:33,746 epoch 7 - iter 2084/5212 - loss 0.04461767 - time (sec): 99.83 - samples/sec: 1433.43 - lr: 0.000020 - momentum: 0.000000
2023-10-15 16:21:58,554 epoch 7 - iter 2605/5212 - loss 0.04703147 - time (sec): 124.64 - samples/sec: 1421.12 - lr: 0.000019 - momentum: 0.000000
2023-10-15 16:22:23,462 epoch 7 - iter 3126/5212 - loss 0.04510966 - time (sec): 149.54 - samples/sec: 1431.70 - lr: 0.000019 - momentum: 0.000000
2023-10-15 16:22:48,842 epoch 7 - iter 3647/5212 - loss 0.04451587 - time (sec): 174.92 - samples/sec: 1436.54 - lr: 0.000018 - momentum: 0.000000
2023-10-15 16:23:15,248 epoch 7 - iter 4168/5212 - loss 0.04275772 - time (sec): 201.33 - samples/sec: 1442.37 - lr: 0.000018 - momentum: 0.000000
2023-10-15 16:23:40,584 epoch 7 - iter 4689/5212 - loss 0.04210492 - time (sec): 226.67 - samples/sec: 1452.47 - lr: 0.000017 - momentum: 0.000000
2023-10-15 16:24:06,364 epoch 7 - iter 5210/5212 - loss 0.04313822 - time (sec): 252.45 - samples/sec: 1454.14 - lr: 0.000017 - momentum: 0.000000
2023-10-15 16:24:06,488 ----------------------------------------------------------------------------------------------------
2023-10-15 16:24:06,488 EPOCH 7 done: loss 0.0438 - lr: 0.000017
2023-10-15 16:24:14,811 DEV : loss 0.3495810329914093 - f1-score (micro avg) 0.3583
2023-10-15 16:24:14,840 ----------------------------------------------------------------------------------------------------
2023-10-15 16:24:40,928 epoch 8 - iter 521/5212 - loss 0.02741041 - time (sec): 26.09 - samples/sec: 1466.39 - lr: 0.000016 - momentum: 0.000000
2023-10-15 16:25:06,388 epoch 8 - iter 1042/5212 - loss 0.02746990 - time (sec): 51.55 - samples/sec: 1456.12 - lr: 0.000016 - momentum: 0.000000
2023-10-15 16:25:31,461 epoch 8 - iter 1563/5212 - loss 0.02860619 - time (sec): 76.62 - samples/sec: 1420.26 - lr: 0.000015 - momentum: 0.000000
2023-10-15 16:25:56,533 epoch 8 - iter 2084/5212 - loss 0.02731631 - time (sec): 101.69 - samples/sec: 1426.33 - lr: 0.000014 - momentum: 0.000000
2023-10-15 16:26:21,417 epoch 8 - iter 2605/5212 - loss 0.02846145 - time (sec): 126.58 - samples/sec: 1414.78 - lr: 0.000014 - momentum: 0.000000
2023-10-15 16:26:46,888 epoch 8 - iter 3126/5212 - loss 0.03072091 - time (sec): 152.05 - samples/sec: 1420.92 - lr: 0.000013 - momentum: 0.000000
2023-10-15 16:27:12,580 epoch 8 - iter 3647/5212 - loss 0.03088493 - time (sec): 177.74 - samples/sec: 1443.38 - lr: 0.000013 - momentum: 0.000000
2023-10-15 16:27:37,843 epoch 8 - iter 4168/5212 - loss 0.03251536 - time (sec): 203.00 - samples/sec: 1450.92 - lr: 0.000012 - momentum: 0.000000
2023-10-15 16:28:02,636 epoch 8 - iter 4689/5212 - loss 0.03281107 - time (sec): 227.80 - samples/sec: 1448.74 - lr: 0.000012 - momentum: 0.000000
2023-10-15 16:28:28,068 epoch 8 - iter 5210/5212 - loss 0.03217309 - time (sec): 253.23 - samples/sec: 1450.23 - lr: 0.000011 - momentum: 0.000000
2023-10-15 16:28:28,165 ----------------------------------------------------------------------------------------------------
2023-10-15 16:28:28,165 EPOCH 8 done: loss 0.0322 - lr: 0.000011
2023-10-15 16:28:37,168 DEV : loss 0.43352392315864563 - f1-score (micro avg) 0.3385
2023-10-15 16:28:37,196 ----------------------------------------------------------------------------------------------------
2023-10-15 16:29:01,956 epoch 9 - iter 521/5212 - loss 0.02250316 - time (sec): 24.76 - samples/sec: 1401.16 - lr: 0.000011 - momentum: 0.000000
2023-10-15 16:29:26,508 epoch 9 - iter 1042/5212 - loss 0.02177090 - time (sec): 49.31 - samples/sec: 1390.23 - lr: 0.000010 - momentum: 0.000000
2023-10-15 16:29:52,152 epoch 9 - iter 1563/5212 - loss 0.02249285 - time (sec): 74.95 - samples/sec: 1445.22 - lr: 0.000009 - momentum: 0.000000
2023-10-15 16:30:17,335 epoch 9 - iter 2084/5212 - loss 0.02163876 - time (sec): 100.14 - samples/sec: 1453.12 - lr: 0.000009 - momentum: 0.000000
2023-10-15 16:30:42,609 epoch 9 - iter 2605/5212 - loss 0.02204627 - time (sec): 125.41 - samples/sec: 1463.78 - lr: 0.000008 - momentum: 0.000000
2023-10-15 16:31:07,865 epoch 9 - iter 3126/5212 - loss 0.02294189 - time (sec): 150.67 - samples/sec: 1460.80 - lr: 0.000008 - momentum: 0.000000
2023-10-15 16:31:33,092 epoch 9 - iter 3647/5212 - loss 0.02268476 - time (sec): 175.89 - samples/sec: 1464.48 - lr: 0.000007 - momentum: 0.000000
2023-10-15 16:31:57,984 epoch 9 - iter 4168/5212 - loss 0.02261921 - time (sec): 200.79 - samples/sec: 1466.59 - lr: 0.000007 - momentum: 0.000000
2023-10-15 16:32:23,394 epoch 9 - iter 4689/5212 - loss 0.02185791 - time (sec): 226.20 - samples/sec: 1464.77 - lr: 0.000006 - momentum: 0.000000
2023-10-15 16:32:48,509 epoch 9 - iter 5210/5212 - loss 0.02137368 - time (sec): 251.31 - samples/sec: 1461.79 - lr: 0.000006 - momentum: 0.000000
2023-10-15 16:32:48,596 ----------------------------------------------------------------------------------------------------
2023-10-15 16:32:48,596 EPOCH 9 done: loss 0.0214 - lr: 0.000006
2023-10-15 16:32:57,649 DEV : loss 0.3968445360660553 - f1-score (micro avg) 0.3705
2023-10-15 16:32:57,681 ----------------------------------------------------------------------------------------------------
|
2023-10-15 16:33:23,189 epoch 10 - iter 521/5212 - loss 0.01589295 - time (sec): 25.51 - samples/sec: 1458.41 - lr: 0.000005 - momentum: 0.000000
2023-10-15 16:33:48,367 epoch 10 - iter 1042/5212 - loss 0.01651173 - time (sec): 50.68 - samples/sec: 1457.84 - lr: 0.000004 - momentum: 0.000000
2023-10-15 16:34:14,003 epoch 10 - iter 1563/5212 - loss 0.01399251 - time (sec): 76.32 - samples/sec: 1486.45 - lr: 0.000004 - momentum: 0.000000
2023-10-15 16:34:39,214 epoch 10 - iter 2084/5212 - loss 0.01332742 - time (sec): 101.53 - samples/sec: 1479.59 - lr: 0.000003 - momentum: 0.000000
2023-10-15 16:35:03,758 epoch 10 - iter 2605/5212 - loss 0.01404225 - time (sec): 126.08 - samples/sec: 1463.52 - lr: 0.000003 - momentum: 0.000000
2023-10-15 16:35:28,888 epoch 10 - iter 3126/5212 - loss 0.01439832 - time (sec): 151.21 - samples/sec: 1460.59 - lr: 0.000002 - momentum: 0.000000
2023-10-15 16:35:54,359 epoch 10 - iter 3647/5212 - loss 0.01458812 - time (sec): 176.68 - samples/sec: 1463.83 - lr: 0.000002 - momentum: 0.000000
2023-10-15 16:36:19,151 epoch 10 - iter 4168/5212 - loss 0.01415824 - time (sec): 201.47 - samples/sec: 1459.02 - lr: 0.000001 - momentum: 0.000000
2023-10-15 16:36:44,161 epoch 10 - iter 4689/5212 - loss 0.01387390 - time (sec): 226.48 - samples/sec: 1459.48 - lr: 0.000001 - momentum: 0.000000
2023-10-15 16:37:09,000 epoch 10 - iter 5210/5212 - loss 0.01389275 - time (sec): 251.32 - samples/sec: 1461.92 - lr: 0.000000 - momentum: 0.000000
2023-10-15 16:37:09,090 ----------------------------------------------------------------------------------------------------
2023-10-15 16:37:09,090 EPOCH 10 done: loss 0.0139 - lr: 0.000000
2023-10-15 16:37:18,126 DEV : loss 0.4460391700267792 - f1-score (micro avg) 0.3667
2023-10-15 16:37:18,535 ----------------------------------------------------------------------------------------------------
2023-10-15 16:37:18,536 Loading model from best epoch ...
|
2023-10-15 16:37:20,025 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
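The 17-tag dictionary follows the BIOES scheme: a single O (outside) tag plus S-/B-/E-/I- variants (single, begin, end, inside) for each of the four entity types, which also matches the tagger's `out_features=17` linear layer. A small sketch that reconstructs the tag set (helper name is an illustration, not a Flair API):

```python
def bioes_tags(entity_types):
    """Expand entity types into a BIOES tag set: O plus S-/B-/E-/I- per type."""
    tags = ["O"]
    for etype in entity_types:
        tags.extend(f"{prefix}-{etype}" for prefix in ("S", "B", "E", "I"))
    return tags

tags = bioes_tags(["LOC", "PER", "ORG", "HumanProd"])
# 1 + 4 * 4 = 17 tags, in the same order as the logged dictionary
```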
|
2023-10-15 16:37:36,732
Results:
- F-score (micro) 0.4563
- F-score (macro) 0.2731
- Accuracy 0.3008

By class:
              precision    recall  f1-score   support

         LOC     0.5006    0.6573    0.5684      1214
         PER     0.3482    0.4356    0.3870       808
         ORG     0.1806    0.1105    0.1371       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4215    0.4975    0.4563      2390
   macro avg     0.2573    0.3009    0.2731      2390
weighted avg     0.3987    0.4975    0.4398      2390
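The summary rows of the report follow directly from the per-class values. A quick sanity check (numbers copied from the table; the micro F1 is recomputed from the rounded micro precision/recall, so it agrees with the reported 0.4563 only to about three decimals):

```python
# Per-class (precision, recall, f1, support) as reported above.
per_class = {
    "LOC":       (0.5006, 0.6573, 0.5684, 1214),
    "PER":       (0.3482, 0.4356, 0.3870, 808),
    "ORG":       (0.1806, 0.1105, 0.1371, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}

def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

# Micro F1 from the reported micro precision/recall row:
micro_f1 = f1(0.4215, 0.4975)                                       # ~0.4563
# Macro F1 = unweighted mean of per-class F1:
macro_f1 = sum(v[2] for v in per_class.values()) / len(per_class)   # ~0.2731
# Weighted F1 = support-weighted mean of per-class F1:
total = sum(v[3] for v in per_class.values())                       # 2390
weighted_f1 = sum(v[2] * v[3] for v in per_class.values()) / total  # ~0.4398
```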
|
2023-10-15 16:37:36,733 ----------------------------------------------------------------------------------------------------