2023-10-25 17:23:23,518 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,519 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 17:23:23,519 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 Train: 7142 sentences
2023-10-25 17:23:23,520 (train_with_dev=False, train_with_test=False)
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 Training Params:
2023-10-25 17:23:23,520 - learning_rate: "3e-05"
2023-10-25 17:23:23,520 - mini_batch_size: "4"
2023-10-25 17:23:23,520 - max_epochs: "10"
2023-10-25 17:23:23,520 - shuffle: "True"
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
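The iteration counters in the epoch lines below ("iter .../1786") follow directly from the corpus size and batch size above: ceil(7142 / 4) = 1786 mini-batches per epoch. A one-line check using only numbers from this log:

```python
import math

train_sentences = 7142  # from the MultiCorpus line above
mini_batch_size = 4     # from Training Params

iters_per_epoch = math.ceil(train_sentences / mini_batch_size)
print(iters_per_epoch)  # 1786, matching the "iter .../1786" counters
```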
2023-10-25 17:23:23,520 Plugins:
2023-10-25 17:23:23,520 - TensorboardLogger
2023-10-25 17:23:23,520 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
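The LinearScheduler plugin with warmup_fraction 0.1 explains the lr column in the iteration lines below: the learning rate ramps linearly from 0 to the peak 3e-05 over the first 10% of the 17,860 total steps (exactly epoch 1), then decays linearly to 0 by the end of epoch 10. A minimal sketch of such a schedule (a hypothetical helper, not Flair's actual implementation):

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0.
    Hypothetical helper mirroring the LinearScheduler plugin's behavior."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps          # warmup phase
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1786 * 10  # iterations per epoch x max_epochs
peak = 3e-05       # learning_rate from Training Params

# End of warmup (10% of training = end of epoch 1) reaches the peak rate,
# matching "lr: 0.000030" at the end of epoch 1 below:
print(linear_schedule_lr(1786, total, peak))
```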
2023-10-25 17:23:23,520 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 17:23:23,520 - metric: "('micro avg', 'f1-score')"
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 Computation:
2023-10-25 17:23:23,520 - compute on device: cuda:0
2023-10-25 17:23:23,520 - embedding storage: none
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,521 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 17:23:33,478 epoch 1 - iter 178/1786 - loss 1.87528964 - time (sec): 9.96 - samples/sec: 2536.31 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:23:43,037 epoch 1 - iter 356/1786 - loss 1.18982744 - time (sec): 19.52 - samples/sec: 2601.34 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:23:52,403 epoch 1 - iter 534/1786 - loss 0.91799054 - time (sec): 28.88 - samples/sec: 2589.02 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:24:01,795 epoch 1 - iter 712/1786 - loss 0.74578333 - time (sec): 38.27 - samples/sec: 2602.25 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:24:11,181 epoch 1 - iter 890/1786 - loss 0.63832228 - time (sec): 47.66 - samples/sec: 2589.58 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:24:21,091 epoch 1 - iter 1068/1786 - loss 0.56408854 - time (sec): 57.57 - samples/sec: 2572.05 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:24:30,933 epoch 1 - iter 1246/1786 - loss 0.50330520 - time (sec): 67.41 - samples/sec: 2565.42 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:24:40,475 epoch 1 - iter 1424/1786 - loss 0.45667579 - time (sec): 76.95 - samples/sec: 2583.90 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:24:49,930 epoch 1 - iter 1602/1786 - loss 0.42322100 - time (sec): 86.41 - samples/sec: 2591.20 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:24:59,517 epoch 1 - iter 1780/1786 - loss 0.39579531 - time (sec): 96.00 - samples/sec: 2580.11 - lr: 0.000030 - momentum: 0.000000
2023-10-25 17:24:59,829 ----------------------------------------------------------------------------------------------------
2023-10-25 17:24:59,830 EPOCH 1 done: loss 0.3947 - lr: 0.000030
2023-10-25 17:25:04,012 DEV : loss 0.12539885938167572 - f1-score (micro avg) 0.7187
2023-10-25 17:25:04,032 saving best model
2023-10-25 17:25:04,511 ----------------------------------------------------------------------------------------------------
2023-10-25 17:25:14,065 epoch 2 - iter 178/1786 - loss 0.11189743 - time (sec): 9.55 - samples/sec: 2586.45 - lr: 0.000030 - momentum: 0.000000
2023-10-25 17:25:23,584 epoch 2 - iter 356/1786 - loss 0.10293552 - time (sec): 19.07 - samples/sec: 2574.86 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:25:32,838 epoch 2 - iter 534/1786 - loss 0.10659511 - time (sec): 28.33 - samples/sec: 2648.49 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:25:42,662 epoch 2 - iter 712/1786 - loss 0.11165249 - time (sec): 38.15 - samples/sec: 2652.80 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:25:51,915 epoch 2 - iter 890/1786 - loss 0.10776982 - time (sec): 47.40 - samples/sec: 2679.23 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:26:00,968 epoch 2 - iter 1068/1786 - loss 0.10827805 - time (sec): 56.45 - samples/sec: 2669.10 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:26:10,079 epoch 2 - iter 1246/1786 - loss 0.11103149 - time (sec): 65.57 - samples/sec: 2673.89 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:26:19,183 epoch 2 - iter 1424/1786 - loss 0.11107111 - time (sec): 74.67 - samples/sec: 2686.65 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:26:28,186 epoch 2 - iter 1602/1786 - loss 0.11239949 - time (sec): 83.67 - samples/sec: 2664.30 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:26:37,543 epoch 2 - iter 1780/1786 - loss 0.11131036 - time (sec): 93.03 - samples/sec: 2667.48 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:26:37,837 ----------------------------------------------------------------------------------------------------
2023-10-25 17:26:37,838 EPOCH 2 done: loss 0.1112 - lr: 0.000027
2023-10-25 17:26:42,572 DEV : loss 0.12097379565238953 - f1-score (micro avg) 0.775
2023-10-25 17:26:42,593 saving best model
2023-10-25 17:26:43,258 ----------------------------------------------------------------------------------------------------
2023-10-25 17:26:52,661 epoch 3 - iter 178/1786 - loss 0.08101302 - time (sec): 9.40 - samples/sec: 2495.76 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:27:01,957 epoch 3 - iter 356/1786 - loss 0.07838070 - time (sec): 18.70 - samples/sec: 2642.13 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:27:10,802 epoch 3 - iter 534/1786 - loss 0.07477397 - time (sec): 27.54 - samples/sec: 2695.98 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:27:19,571 epoch 3 - iter 712/1786 - loss 0.07540716 - time (sec): 36.31 - samples/sec: 2728.65 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:27:28,370 epoch 3 - iter 890/1786 - loss 0.07449414 - time (sec): 45.11 - samples/sec: 2762.94 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:27:37,759 epoch 3 - iter 1068/1786 - loss 0.07391613 - time (sec): 54.50 - samples/sec: 2746.67 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:27:46,941 epoch 3 - iter 1246/1786 - loss 0.07374199 - time (sec): 63.68 - samples/sec: 2737.91 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:27:56,304 epoch 3 - iter 1424/1786 - loss 0.07439314 - time (sec): 73.04 - samples/sec: 2698.11 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:28:05,514 epoch 3 - iter 1602/1786 - loss 0.07328092 - time (sec): 82.25 - samples/sec: 2717.57 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:28:14,550 epoch 3 - iter 1780/1786 - loss 0.07531818 - time (sec): 91.29 - samples/sec: 2717.15 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:28:14,857 ----------------------------------------------------------------------------------------------------
2023-10-25 17:28:14,857 EPOCH 3 done: loss 0.0756 - lr: 0.000023
2023-10-25 17:28:19,861 DEV : loss 0.13726350665092468 - f1-score (micro avg) 0.7928
2023-10-25 17:28:19,883 saving best model
2023-10-25 17:28:20,564 ----------------------------------------------------------------------------------------------------
2023-10-25 17:28:30,021 epoch 4 - iter 178/1786 - loss 0.04996112 - time (sec): 9.45 - samples/sec: 2744.64 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:28:39,557 epoch 4 - iter 356/1786 - loss 0.05423994 - time (sec): 18.99 - samples/sec: 2715.79 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:28:49,182 epoch 4 - iter 534/1786 - loss 0.05445421 - time (sec): 28.61 - samples/sec: 2621.06 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:28:59,104 epoch 4 - iter 712/1786 - loss 0.05330529 - time (sec): 38.54 - samples/sec: 2568.59 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:29:09,009 epoch 4 - iter 890/1786 - loss 0.05298052 - time (sec): 48.44 - samples/sec: 2568.65 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:29:18,512 epoch 4 - iter 1068/1786 - loss 0.05390792 - time (sec): 57.95 - samples/sec: 2587.39 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:29:27,598 epoch 4 - iter 1246/1786 - loss 0.05456634 - time (sec): 67.03 - samples/sec: 2586.59 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:29:36,978 epoch 4 - iter 1424/1786 - loss 0.05422428 - time (sec): 76.41 - samples/sec: 2598.33 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:29:46,218 epoch 4 - iter 1602/1786 - loss 0.05423185 - time (sec): 85.65 - samples/sec: 2614.31 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:29:54,773 epoch 4 - iter 1780/1786 - loss 0.05389469 - time (sec): 94.21 - samples/sec: 2625.28 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:29:55,096 ----------------------------------------------------------------------------------------------------
2023-10-25 17:29:55,097 EPOCH 4 done: loss 0.0537 - lr: 0.000020
2023-10-25 17:29:59,399 DEV : loss 0.16342049837112427 - f1-score (micro avg) 0.7707
2023-10-25 17:29:59,422 ----------------------------------------------------------------------------------------------------
2023-10-25 17:30:08,905 epoch 5 - iter 178/1786 - loss 0.04464999 - time (sec): 9.48 - samples/sec: 2445.91 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:30:18,589 epoch 5 - iter 356/1786 - loss 0.04124529 - time (sec): 19.16 - samples/sec: 2505.97 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:30:28,145 epoch 5 - iter 534/1786 - loss 0.04150244 - time (sec): 28.72 - samples/sec: 2535.88 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:30:37,781 epoch 5 - iter 712/1786 - loss 0.04088327 - time (sec): 38.36 - samples/sec: 2547.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:30:47,585 epoch 5 - iter 890/1786 - loss 0.04061157 - time (sec): 48.16 - samples/sec: 2553.53 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:30:57,459 epoch 5 - iter 1068/1786 - loss 0.04057435 - time (sec): 58.03 - samples/sec: 2551.95 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:31:06,910 epoch 5 - iter 1246/1786 - loss 0.04003354 - time (sec): 67.49 - samples/sec: 2554.71 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:31:16,488 epoch 5 - iter 1424/1786 - loss 0.04031376 - time (sec): 77.06 - samples/sec: 2553.26 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:31:26,644 epoch 5 - iter 1602/1786 - loss 0.04015972 - time (sec): 87.22 - samples/sec: 2558.40 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:31:36,587 epoch 5 - iter 1780/1786 - loss 0.04049095 - time (sec): 97.16 - samples/sec: 2550.72 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:31:36,969 ----------------------------------------------------------------------------------------------------
2023-10-25 17:31:36,970 EPOCH 5 done: loss 0.0404 - lr: 0.000017
2023-10-25 17:31:42,252 DEV : loss 0.1974058896303177 - f1-score (micro avg) 0.8089
2023-10-25 17:31:42,284 saving best model
2023-10-25 17:31:42,993 ----------------------------------------------------------------------------------------------------
2023-10-25 17:31:53,000 epoch 6 - iter 178/1786 - loss 0.03793523 - time (sec): 10.00 - samples/sec: 2367.43 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:32:03,065 epoch 6 - iter 356/1786 - loss 0.03298399 - time (sec): 20.07 - samples/sec: 2320.04 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:32:12,621 epoch 6 - iter 534/1786 - loss 0.03151283 - time (sec): 29.63 - samples/sec: 2435.85 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:32:21,750 epoch 6 - iter 712/1786 - loss 0.02930631 - time (sec): 38.75 - samples/sec: 2506.82 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:32:30,885 epoch 6 - iter 890/1786 - loss 0.02923823 - time (sec): 47.89 - samples/sec: 2562.27 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:32:40,336 epoch 6 - iter 1068/1786 - loss 0.02878535 - time (sec): 57.34 - samples/sec: 2586.14 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:32:49,610 epoch 6 - iter 1246/1786 - loss 0.02891460 - time (sec): 66.62 - samples/sec: 2597.26 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:32:58,853 epoch 6 - iter 1424/1786 - loss 0.02934355 - time (sec): 75.86 - samples/sec: 2616.11 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:33:08,130 epoch 6 - iter 1602/1786 - loss 0.03023334 - time (sec): 85.14 - samples/sec: 2615.92 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:33:17,113 epoch 6 - iter 1780/1786 - loss 0.03047538 - time (sec): 94.12 - samples/sec: 2638.32 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:33:17,392 ----------------------------------------------------------------------------------------------------
2023-10-25 17:33:17,392 EPOCH 6 done: loss 0.0304 - lr: 0.000013
2023-10-25 17:33:21,674 DEV : loss 0.19189877808094025 - f1-score (micro avg) 0.8043
2023-10-25 17:33:21,697 ----------------------------------------------------------------------------------------------------
2023-10-25 17:33:31,238 epoch 7 - iter 178/1786 - loss 0.01982990 - time (sec): 9.54 - samples/sec: 2512.17 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:33:41,384 epoch 7 - iter 356/1786 - loss 0.01840636 - time (sec): 19.69 - samples/sec: 2469.16 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:33:50,348 epoch 7 - iter 534/1786 - loss 0.01939487 - time (sec): 28.65 - samples/sec: 2606.14 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:33:59,629 epoch 7 - iter 712/1786 - loss 0.02046481 - time (sec): 37.93 - samples/sec: 2621.63 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:34:09,099 epoch 7 - iter 890/1786 - loss 0.02294225 - time (sec): 47.40 - samples/sec: 2642.53 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:34:18,192 epoch 7 - iter 1068/1786 - loss 0.02292824 - time (sec): 56.49 - samples/sec: 2671.35 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:34:27,413 epoch 7 - iter 1246/1786 - loss 0.02364405 - time (sec): 65.71 - samples/sec: 2669.59 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:34:36,232 epoch 7 - iter 1424/1786 - loss 0.02346007 - time (sec): 74.53 - samples/sec: 2658.47 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:34:45,410 epoch 7 - iter 1602/1786 - loss 0.02322992 - time (sec): 83.71 - samples/sec: 2661.35 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:34:54,732 epoch 7 - iter 1780/1786 - loss 0.02305501 - time (sec): 93.03 - samples/sec: 2668.23 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:34:55,055 ----------------------------------------------------------------------------------------------------
2023-10-25 17:34:55,056 EPOCH 7 done: loss 0.0230 - lr: 0.000010
2023-10-25 17:34:59,167 DEV : loss 0.2131357192993164 - f1-score (micro avg) 0.7831
2023-10-25 17:34:59,190 ----------------------------------------------------------------------------------------------------
2023-10-25 17:35:08,902 epoch 8 - iter 178/1786 - loss 0.01504470 - time (sec): 9.71 - samples/sec: 2654.96 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:35:18,633 epoch 8 - iter 356/1786 - loss 0.01739513 - time (sec): 19.44 - samples/sec: 2593.83 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:35:28,289 epoch 8 - iter 534/1786 - loss 0.01711964 - time (sec): 29.10 - samples/sec: 2571.05 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:35:38,087 epoch 8 - iter 712/1786 - loss 0.01732972 - time (sec): 38.90 - samples/sec: 2522.70 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:35:47,755 epoch 8 - iter 890/1786 - loss 0.01614740 - time (sec): 48.56 - samples/sec: 2516.79 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:35:57,597 epoch 8 - iter 1068/1786 - loss 0.01612817 - time (sec): 58.41 - samples/sec: 2526.85 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:36:07,183 epoch 8 - iter 1246/1786 - loss 0.01598880 - time (sec): 67.99 - samples/sec: 2529.07 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:36:16,440 epoch 8 - iter 1424/1786 - loss 0.01547610 - time (sec): 77.25 - samples/sec: 2538.84 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:36:25,664 epoch 8 - iter 1602/1786 - loss 0.01548184 - time (sec): 86.47 - samples/sec: 2564.69 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:36:34,480 epoch 8 - iter 1780/1786 - loss 0.01595884 - time (sec): 95.29 - samples/sec: 2602.70 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:36:34,803 ----------------------------------------------------------------------------------------------------
2023-10-25 17:36:34,803 EPOCH 8 done: loss 0.0160 - lr: 0.000007
2023-10-25 17:36:39,864 DEV : loss 0.2298695296049118 - f1-score (micro avg) 0.7898
2023-10-25 17:36:39,893 ----------------------------------------------------------------------------------------------------
2023-10-25 17:36:49,628 epoch 9 - iter 178/1786 - loss 0.00726213 - time (sec): 9.73 - samples/sec: 2600.72 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:36:59,807 epoch 9 - iter 356/1786 - loss 0.00965484 - time (sec): 19.91 - samples/sec: 2524.95 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:37:09,927 epoch 9 - iter 534/1786 - loss 0.01192364 - time (sec): 30.03 - samples/sec: 2468.77 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:37:19,730 epoch 9 - iter 712/1786 - loss 0.01160538 - time (sec): 39.83 - samples/sec: 2512.61 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:37:29,248 epoch 9 - iter 890/1786 - loss 0.01091175 - time (sec): 49.35 - samples/sec: 2540.27 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:37:38,393 epoch 9 - iter 1068/1786 - loss 0.01020469 - time (sec): 58.50 - samples/sec: 2544.09 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:37:47,923 epoch 9 - iter 1246/1786 - loss 0.01074955 - time (sec): 68.03 - samples/sec: 2573.10 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:37:57,591 epoch 9 - iter 1424/1786 - loss 0.01097906 - time (sec): 77.70 - samples/sec: 2550.93 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:38:07,462 epoch 9 - iter 1602/1786 - loss 0.01111155 - time (sec): 87.57 - samples/sec: 2539.02 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:38:17,221 epoch 9 - iter 1780/1786 - loss 0.01126743 - time (sec): 97.33 - samples/sec: 2546.25 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:38:17,555 ----------------------------------------------------------------------------------------------------
2023-10-25 17:38:17,556 EPOCH 9 done: loss 0.0112 - lr: 0.000003
2023-10-25 17:38:21,700 DEV : loss 0.2397831529378891 - f1-score (micro avg) 0.7839
2023-10-25 17:38:21,725 ----------------------------------------------------------------------------------------------------
2023-10-25 17:38:31,163 epoch 10 - iter 178/1786 - loss 0.01020874 - time (sec): 9.44 - samples/sec: 2571.28 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:38:41,133 epoch 10 - iter 356/1786 - loss 0.00987512 - time (sec): 19.41 - samples/sec: 2440.84 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:38:50,313 epoch 10 - iter 534/1786 - loss 0.00992584 - time (sec): 28.59 - samples/sec: 2567.48 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:38:59,708 epoch 10 - iter 712/1786 - loss 0.00990542 - time (sec): 37.98 - samples/sec: 2596.01 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:39:08,396 epoch 10 - iter 890/1786 - loss 0.01027219 - time (sec): 46.67 - samples/sec: 2613.03 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:39:17,994 epoch 10 - iter 1068/1786 - loss 0.00974899 - time (sec): 56.27 - samples/sec: 2630.15 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:39:27,610 epoch 10 - iter 1246/1786 - loss 0.00964506 - time (sec): 65.88 - samples/sec: 2626.26 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:39:36,684 epoch 10 - iter 1424/1786 - loss 0.00911795 - time (sec): 74.96 - samples/sec: 2618.10 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:39:45,879 epoch 10 - iter 1602/1786 - loss 0.00873331 - time (sec): 84.15 - samples/sec: 2636.68 - lr: 0.000000 - momentum: 0.000000
2023-10-25 17:39:54,763 epoch 10 - iter 1780/1786 - loss 0.00827384 - time (sec): 93.04 - samples/sec: 2665.21 - lr: 0.000000 - momentum: 0.000000
2023-10-25 17:39:55,053 ----------------------------------------------------------------------------------------------------
2023-10-25 17:39:55,053 EPOCH 10 done: loss 0.0082 - lr: 0.000000
2023-10-25 17:39:59,495 DEV : loss 0.23819302022457123 - f1-score (micro avg) 0.7941
2023-10-25 17:40:00,028 ----------------------------------------------------------------------------------------------------
2023-10-25 17:40:00,030 Loading model from best epoch ...
2023-10-25 17:40:01,959 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
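The 17 tags above follow the BIOES scheme: S- marks a single-token entity, B-/I-/E- mark the begin, inside, and end of a multi-token span, and O is outside any entity. A simplified decoder sketch (not Flair's internal one) turning such a tag sequence into entity spans:

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end, label) spans.
    Simplified sketch; end index is exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
        elif tag.startswith("S-"):            # single-token entity
            spans.append((i, i + 1, tag[2:]))
            start, label = None, None
        elif tag.startswith("B-"):            # open a multi-token span
            start, label = i, tag[2:]
        elif tag.startswith("E-") and label == tag[2:]:
            spans.append((start, i + 1, label))  # close the open span
            start, label = None, None
        # I- tags simply continue an open span
    return spans

print(bioes_to_spans(["B-PER", "I-PER", "E-PER", "O", "S-LOC"]))
# [(0, 3, 'PER'), (4, 5, 'LOC')]
```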
2023-10-25 17:40:13,856
Results:
- F-score (micro) 0.687
- F-score (macro) 0.6126
- Accuracy 0.5402

By class:
              precision    recall  f1-score   support

         LOC     0.7286    0.6447    0.6841      1095
         PER     0.7741    0.7787    0.7764      1012
         ORG     0.4390    0.5546    0.4901       357
   HumanProd     0.4000    0.6667    0.5000        33

   micro avg     0.6875    0.6864    0.6870      2497
   macro avg     0.5854    0.6612    0.6126      2497
weighted avg     0.7013    0.6864    0.6913      2497
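As a sanity check, the macro, weighted, and micro averages can be recomputed from the per-class rows of the table above (a small sketch using only numbers from this log):

```python
# Per-class f1-score and support, copied from the "By class" table.
per_class_f1 = {"LOC": 0.6841, "PER": 0.7764, "ORG": 0.4901, "HumanProd": 0.5000}
support      = {"LOC": 1095,   "PER": 1012,   "ORG": 357,    "HumanProd": 33}

# Macro F1: unweighted mean of per-class F1 scores.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)            # ~0.6126

# Weighted F1: per-class F1 weighted by support.
total = sum(support.values())                                        # 2497
weighted_f1 = sum(per_class_f1[c] * support[c] for c in support) / total  # ~0.6913

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_p, micro_r = 0.6875, 0.6864
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)               # ~0.6870

print(macro_f1, weighted_f1, micro_f1)
```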
2023-10-25 17:40:13,857 ----------------------------------------------------------------------------------------------------