|
2023-10-25 09:32:23,975 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:32:23,976 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(64001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-25 09:32:23,976 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:32:23,976 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences |
|
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator |
|
2023-10-25 09:32:23,976 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:32:23,976 Train: 20847 sentences |
|
2023-10-25 09:32:23,976 (train_with_dev=False, train_with_test=False) |
|
2023-10-25 09:32:23,976 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:32:23,976 Training Params: |
|
2023-10-25 09:32:23,976 - learning_rate: "3e-05" |
|
2023-10-25 09:32:23,976 - mini_batch_size: "8" |
|
2023-10-25 09:32:23,976 - max_epochs: "10" |
|
2023-10-25 09:32:23,976 - shuffle: "True" |
|
2023-10-25 09:32:23,976 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:32:23,977 Plugins: |
|
2023-10-25 09:32:23,977 - TensorboardLogger |
|
2023-10-25 09:32:23,977 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-25 09:32:23,977 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:32:23,977 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-25 09:32:23,977 - metric: "('micro avg', 'f1-score')" |
|
2023-10-25 09:32:23,977 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:32:23,977 Computation: |
|
2023-10-25 09:32:23,977 - compute on device: cuda:0 |
|
2023-10-25 09:32:23,977 - embedding storage: none |
|
2023-10-25 09:32:23,977 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:32:23,977 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-25 09:32:23,977 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:32:23,977 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:32:23,977 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-25 09:32:38,841 epoch 1 - iter 260/2606 - loss 1.91916530 - time (sec): 14.86 - samples/sec: 2560.14 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 09:32:52,820 epoch 1 - iter 520/2606 - loss 1.18191868 - time (sec): 28.84 - samples/sec: 2556.63 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 09:33:06,211 epoch 1 - iter 780/2606 - loss 0.90108307 - time (sec): 42.23 - samples/sec: 2567.04 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 09:33:19,609 epoch 1 - iter 1040/2606 - loss 0.75074608 - time (sec): 55.63 - samples/sec: 2561.12 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 09:33:33,387 epoch 1 - iter 1300/2606 - loss 0.64068299 - time (sec): 69.41 - samples/sec: 2580.54 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 09:33:47,200 epoch 1 - iter 1560/2606 - loss 0.57005463 - time (sec): 83.22 - samples/sec: 2598.87 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 09:34:01,376 epoch 1 - iter 1820/2606 - loss 0.51493718 - time (sec): 97.40 - samples/sec: 2611.49 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 09:34:15,336 epoch 1 - iter 2080/2606 - loss 0.47099351 - time (sec): 111.36 - samples/sec: 2620.87 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 09:34:29,098 epoch 1 - iter 2340/2606 - loss 0.44065222 - time (sec): 125.12 - samples/sec: 2629.88 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 09:34:42,914 epoch 1 - iter 2600/2606 - loss 0.41298987 - time (sec): 138.94 - samples/sec: 2637.58 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 09:34:43,275 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:34:43,275 EPOCH 1 done: loss 0.4124 - lr: 0.000030 |
|
2023-10-25 09:34:46,985 DEV : loss 0.12660576403141022 - f1-score (micro avg) 0.3178 |
|
2023-10-25 09:34:47,008 saving best model |
|
2023-10-25 09:34:47,466 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:35:00,496 epoch 2 - iter 260/2606 - loss 0.16774953 - time (sec): 13.03 - samples/sec: 2688.86 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 09:35:14,475 epoch 2 - iter 520/2606 - loss 0.16502428 - time (sec): 27.01 - samples/sec: 2756.38 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 09:35:28,549 epoch 2 - iter 780/2606 - loss 0.16036022 - time (sec): 41.08 - samples/sec: 2714.43 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 09:35:42,083 epoch 2 - iter 1040/2606 - loss 0.16316114 - time (sec): 54.62 - samples/sec: 2703.40 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 09:35:55,550 epoch 2 - iter 1300/2606 - loss 0.15781012 - time (sec): 68.08 - samples/sec: 2672.52 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 09:36:09,530 epoch 2 - iter 1560/2606 - loss 0.16024587 - time (sec): 82.06 - samples/sec: 2654.34 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 09:36:23,119 epoch 2 - iter 1820/2606 - loss 0.15728484 - time (sec): 95.65 - samples/sec: 2650.61 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 09:36:37,206 epoch 2 - iter 2080/2606 - loss 0.15411183 - time (sec): 109.74 - samples/sec: 2642.31 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 09:36:51,580 epoch 2 - iter 2340/2606 - loss 0.14959803 - time (sec): 124.11 - samples/sec: 2648.05 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 09:37:05,658 epoch 2 - iter 2600/2606 - loss 0.14612654 - time (sec): 138.19 - samples/sec: 2652.50 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 09:37:05,987 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:37:05,987 EPOCH 2 done: loss 0.1462 - lr: 0.000027 |
|
2023-10-25 09:37:12,813 DEV : loss 0.1923154890537262 - f1-score (micro avg) 0.3215 |
|
2023-10-25 09:37:12,836 saving best model |
|
2023-10-25 09:37:13,426 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:37:26,753 epoch 3 - iter 260/2606 - loss 0.13069904 - time (sec): 13.33 - samples/sec: 2680.57 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 09:37:40,949 epoch 3 - iter 520/2606 - loss 0.10503374 - time (sec): 27.52 - samples/sec: 2634.53 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 09:37:54,680 epoch 3 - iter 780/2606 - loss 0.10110985 - time (sec): 41.25 - samples/sec: 2604.34 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 09:38:08,826 epoch 3 - iter 1040/2606 - loss 0.10167366 - time (sec): 55.40 - samples/sec: 2601.75 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 09:38:23,123 epoch 3 - iter 1300/2606 - loss 0.10033990 - time (sec): 69.70 - samples/sec: 2608.36 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 09:38:36,886 epoch 3 - iter 1560/2606 - loss 0.09969762 - time (sec): 83.46 - samples/sec: 2610.55 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 09:38:51,367 epoch 3 - iter 1820/2606 - loss 0.09635472 - time (sec): 97.94 - samples/sec: 2630.45 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 09:39:05,059 epoch 3 - iter 2080/2606 - loss 0.09529213 - time (sec): 111.63 - samples/sec: 2620.52 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 09:39:19,479 epoch 3 - iter 2340/2606 - loss 0.09619600 - time (sec): 126.05 - samples/sec: 2631.64 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 09:39:33,450 epoch 3 - iter 2600/2606 - loss 0.09533594 - time (sec): 140.02 - samples/sec: 2616.47 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 09:39:33,854 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:39:33,854 EPOCH 3 done: loss 0.0953 - lr: 0.000023 |
|
2023-10-25 09:39:40,571 DEV : loss 0.19747453927993774 - f1-score (micro avg) 0.3648 |
|
2023-10-25 09:39:40,594 saving best model |
|
2023-10-25 09:39:41,189 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:39:55,113 epoch 4 - iter 260/2606 - loss 0.08164508 - time (sec): 13.92 - samples/sec: 2572.15 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 09:40:09,126 epoch 4 - iter 520/2606 - loss 0.06605758 - time (sec): 27.93 - samples/sec: 2624.75 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 09:40:22,785 epoch 4 - iter 780/2606 - loss 0.06712147 - time (sec): 41.59 - samples/sec: 2640.96 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 09:40:36,402 epoch 4 - iter 1040/2606 - loss 0.06724571 - time (sec): 55.21 - samples/sec: 2668.74 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 09:40:49,725 epoch 4 - iter 1300/2606 - loss 0.06921180 - time (sec): 68.53 - samples/sec: 2665.68 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 09:41:03,669 epoch 4 - iter 1560/2606 - loss 0.06732259 - time (sec): 82.48 - samples/sec: 2675.32 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 09:41:17,799 epoch 4 - iter 1820/2606 - loss 0.06691241 - time (sec): 96.61 - samples/sec: 2668.50 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 09:41:31,365 epoch 4 - iter 2080/2606 - loss 0.06706554 - time (sec): 110.17 - samples/sec: 2655.56 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 09:41:45,476 epoch 4 - iter 2340/2606 - loss 0.06642302 - time (sec): 124.29 - samples/sec: 2650.44 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 09:41:59,517 epoch 4 - iter 2600/2606 - loss 0.06749219 - time (sec): 138.33 - samples/sec: 2652.05 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 09:41:59,789 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:41:59,790 EPOCH 4 done: loss 0.0674 - lr: 0.000020 |
|
2023-10-25 09:42:06,523 DEV : loss 0.28529059886932373 - f1-score (micro avg) 0.3759 |
|
2023-10-25 09:42:06,546 saving best model |
|
2023-10-25 09:42:07,141 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:42:21,166 epoch 5 - iter 260/2606 - loss 0.03497114 - time (sec): 14.02 - samples/sec: 2673.58 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 09:42:34,567 epoch 5 - iter 520/2606 - loss 0.04413771 - time (sec): 27.43 - samples/sec: 2643.52 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 09:42:48,727 epoch 5 - iter 780/2606 - loss 0.04501268 - time (sec): 41.59 - samples/sec: 2663.69 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 09:43:02,553 epoch 5 - iter 1040/2606 - loss 0.04407144 - time (sec): 55.41 - samples/sec: 2687.59 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 09:43:16,404 epoch 5 - iter 1300/2606 - loss 0.04455768 - time (sec): 69.26 - samples/sec: 2707.16 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 09:43:29,824 epoch 5 - iter 1560/2606 - loss 0.04491605 - time (sec): 82.68 - samples/sec: 2703.65 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 09:43:43,754 epoch 5 - iter 1820/2606 - loss 0.04688719 - time (sec): 96.61 - samples/sec: 2682.70 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 09:43:58,055 epoch 5 - iter 2080/2606 - loss 0.04755340 - time (sec): 110.91 - samples/sec: 2677.02 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 09:44:11,769 epoch 5 - iter 2340/2606 - loss 0.04896755 - time (sec): 124.63 - samples/sec: 2655.71 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 09:44:25,748 epoch 5 - iter 2600/2606 - loss 0.04920520 - time (sec): 138.61 - samples/sec: 2642.48 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 09:44:26,101 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:44:26,102 EPOCH 5 done: loss 0.0492 - lr: 0.000017 |
|
2023-10-25 09:44:32,900 DEV : loss 0.35076966881752014 - f1-score (micro avg) 0.3618 |
|
2023-10-25 09:44:32,923 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:44:46,744 epoch 6 - iter 260/2606 - loss 0.03560179 - time (sec): 13.82 - samples/sec: 2595.33 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 09:45:00,843 epoch 6 - iter 520/2606 - loss 0.03582430 - time (sec): 27.92 - samples/sec: 2669.35 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 09:45:14,351 epoch 6 - iter 780/2606 - loss 0.03458537 - time (sec): 41.43 - samples/sec: 2652.13 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 09:45:28,753 epoch 6 - iter 1040/2606 - loss 0.03331829 - time (sec): 55.83 - samples/sec: 2650.92 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 09:45:42,857 epoch 6 - iter 1300/2606 - loss 0.03408383 - time (sec): 69.93 - samples/sec: 2655.19 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 09:45:56,833 epoch 6 - iter 1560/2606 - loss 0.03377234 - time (sec): 83.91 - samples/sec: 2644.13 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 09:46:11,281 epoch 6 - iter 1820/2606 - loss 0.03383962 - time (sec): 98.36 - samples/sec: 2661.43 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 09:46:25,546 epoch 6 - iter 2080/2606 - loss 0.03394836 - time (sec): 112.62 - samples/sec: 2655.82 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 09:46:39,253 epoch 6 - iter 2340/2606 - loss 0.03469084 - time (sec): 126.33 - samples/sec: 2629.88 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 09:46:52,830 epoch 6 - iter 2600/2606 - loss 0.03601844 - time (sec): 139.91 - samples/sec: 2620.61 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 09:46:53,130 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:46:53,130 EPOCH 6 done: loss 0.0360 - lr: 0.000013 |
|
2023-10-25 09:46:59,919 DEV : loss 0.35124367475509644 - f1-score (micro avg) 0.3706 |
|
2023-10-25 09:46:59,943 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:47:14,435 epoch 7 - iter 260/2606 - loss 0.02257131 - time (sec): 14.49 - samples/sec: 2847.05 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 09:47:28,207 epoch 7 - iter 520/2606 - loss 0.02441153 - time (sec): 28.26 - samples/sec: 2648.66 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 09:47:41,862 epoch 7 - iter 780/2606 - loss 0.02721138 - time (sec): 41.92 - samples/sec: 2602.98 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 09:47:55,829 epoch 7 - iter 1040/2606 - loss 0.02651296 - time (sec): 55.89 - samples/sec: 2607.82 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 09:48:10,223 epoch 7 - iter 1300/2606 - loss 0.02717455 - time (sec): 70.28 - samples/sec: 2621.10 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 09:48:24,258 epoch 7 - iter 1560/2606 - loss 0.02692439 - time (sec): 84.31 - samples/sec: 2615.76 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 09:48:38,153 epoch 7 - iter 1820/2606 - loss 0.02650464 - time (sec): 98.21 - samples/sec: 2604.38 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 09:48:51,581 epoch 7 - iter 2080/2606 - loss 0.02689021 - time (sec): 111.64 - samples/sec: 2606.39 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 09:49:05,590 epoch 7 - iter 2340/2606 - loss 0.02696018 - time (sec): 125.65 - samples/sec: 2617.72 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 09:49:19,451 epoch 7 - iter 2600/2606 - loss 0.02654389 - time (sec): 139.51 - samples/sec: 2624.77 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 09:49:19,808 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:49:19,808 EPOCH 7 done: loss 0.0265 - lr: 0.000010 |
|
2023-10-25 09:49:26,571 DEV : loss 0.39285099506378174 - f1-score (micro avg) 0.3931 |
|
2023-10-25 09:49:26,595 saving best model |
|
2023-10-25 09:49:27,146 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:49:41,473 epoch 8 - iter 260/2606 - loss 0.02320587 - time (sec): 14.33 - samples/sec: 2788.59 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 09:49:55,818 epoch 8 - iter 520/2606 - loss 0.02224982 - time (sec): 28.67 - samples/sec: 2734.31 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 09:50:09,626 epoch 8 - iter 780/2606 - loss 0.02168589 - time (sec): 42.48 - samples/sec: 2685.33 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 09:50:24,124 epoch 8 - iter 1040/2606 - loss 0.02155340 - time (sec): 56.98 - samples/sec: 2676.06 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 09:50:38,176 epoch 8 - iter 1300/2606 - loss 0.02209628 - time (sec): 71.03 - samples/sec: 2667.70 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 09:50:51,996 epoch 8 - iter 1560/2606 - loss 0.02278193 - time (sec): 84.85 - samples/sec: 2648.56 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 09:51:05,626 epoch 8 - iter 1820/2606 - loss 0.02309913 - time (sec): 98.48 - samples/sec: 2628.42 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 09:51:19,691 epoch 8 - iter 2080/2606 - loss 0.02373236 - time (sec): 112.54 - samples/sec: 2631.71 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 09:51:33,624 epoch 8 - iter 2340/2606 - loss 0.02449169 - time (sec): 126.48 - samples/sec: 2625.76 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 09:51:47,018 epoch 8 - iter 2600/2606 - loss 0.02506631 - time (sec): 139.87 - samples/sec: 2621.51 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 09:51:47,298 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:51:47,298 EPOCH 8 done: loss 0.0251 - lr: 0.000007 |
|
2023-10-25 09:51:54,105 DEV : loss 0.389981210231781 - f1-score (micro avg) 0.3328 |
|
2023-10-25 09:51:54,129 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:52:08,291 epoch 9 - iter 260/2606 - loss 0.02624175 - time (sec): 14.16 - samples/sec: 2584.93 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 09:52:22,424 epoch 9 - iter 520/2606 - loss 0.03865993 - time (sec): 28.29 - samples/sec: 2566.31 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 09:52:36,208 epoch 9 - iter 780/2606 - loss 0.05696753 - time (sec): 42.08 - samples/sec: 2621.62 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 09:52:50,318 epoch 9 - iter 1040/2606 - loss 0.07090173 - time (sec): 56.19 - samples/sec: 2647.40 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 09:53:03,620 epoch 9 - iter 1300/2606 - loss 0.07602929 - time (sec): 69.49 - samples/sec: 2656.13 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 09:53:17,884 epoch 9 - iter 1560/2606 - loss 0.07946920 - time (sec): 83.75 - samples/sec: 2642.05 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 09:53:32,078 epoch 9 - iter 1820/2606 - loss 0.08234525 - time (sec): 97.95 - samples/sec: 2626.20 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 09:53:46,116 epoch 9 - iter 2080/2606 - loss 0.08103766 - time (sec): 111.99 - samples/sec: 2619.57 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 09:53:59,915 epoch 9 - iter 2340/2606 - loss 0.08253137 - time (sec): 125.78 - samples/sec: 2619.26 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 09:54:13,937 epoch 9 - iter 2600/2606 - loss 0.08445543 - time (sec): 139.81 - samples/sec: 2621.19 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 09:54:14,275 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:54:14,276 EPOCH 9 done: loss 0.0845 - lr: 0.000003 |
|
2023-10-25 09:54:20,440 DEV : loss 0.32093799114227295 - f1-score (micro avg) 0.1625 |
|
2023-10-25 09:54:20,464 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:54:35,426 epoch 10 - iter 260/2606 - loss 0.11039614 - time (sec): 14.96 - samples/sec: 2555.33 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 09:54:49,295 epoch 10 - iter 520/2606 - loss 0.11206933 - time (sec): 28.83 - samples/sec: 2602.10 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 09:55:03,220 epoch 10 - iter 780/2606 - loss 0.10809914 - time (sec): 42.75 - samples/sec: 2623.97 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 09:55:16,759 epoch 10 - iter 1040/2606 - loss 0.10634910 - time (sec): 56.29 - samples/sec: 2639.33 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 09:55:30,685 epoch 10 - iter 1300/2606 - loss 0.10445432 - time (sec): 70.22 - samples/sec: 2644.55 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 09:55:44,761 epoch 10 - iter 1560/2606 - loss 0.10588664 - time (sec): 84.30 - samples/sec: 2642.27 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 09:55:58,511 epoch 10 - iter 1820/2606 - loss 0.10746459 - time (sec): 98.05 - samples/sec: 2629.98 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 09:56:11,912 epoch 10 - iter 2080/2606 - loss 0.10861096 - time (sec): 111.45 - samples/sec: 2621.60 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 09:56:26,161 epoch 10 - iter 2340/2606 - loss 0.10863908 - time (sec): 125.70 - samples/sec: 2616.16 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-25 09:56:40,267 epoch 10 - iter 2600/2606 - loss 0.10855224 - time (sec): 139.80 - samples/sec: 2621.94 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-25 09:56:40,569 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:56:40,569 EPOCH 10 done: loss 0.1086 - lr: 0.000000 |
|
2023-10-25 09:56:46,739 DEV : loss 0.3003169000148773 - f1-score (micro avg) 0.1001 |
|
2023-10-25 09:56:47,170 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 09:56:47,171 Loading model from best epoch ... |
|
2023-10-25 09:56:48,915 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-25 09:56:58,445 |
|
Results: |
|
- F-score (micro) 0.4597 |
|
- F-score (macro) 0.3186 |
|
- Accuracy 0.302 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.5000    0.5231    0.5113      1214 |
|
         PER     0.4151    0.4963    0.4521       808 |
|
         ORG     0.3194    0.3031    0.3110       353 |
|
   HumanProd     0.0000    0.0000    0.0000        15 |
|
|
|
   micro avg     0.4425    0.4782    0.4597      2390 |
|
   macro avg     0.3086    0.3306    0.3186      2390 |
|
weighted avg     0.4415    0.4782    0.4585      2390 |
|
|
|
2023-10-25 09:56:58,445 ---------------------------------------------------------------------------------------------------- |
|
|