2023-10-25 17:23:23,518 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,519 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
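A quick way to sanity-check the size of the architecture printed above is to tally parameters from the layer shapes in the repr. This is a back-of-the-envelope sketch: it counts only the modules shown (embeddings, 12 encoder layers, pooler, and the 17-way tagger head), using `params = in * out + out` for each `Linear` and two vectors per `LayerNorm`:

```python
def dense(n_in, n_out):
    # weight matrix plus bias vector of a Linear layer
    return n_in * n_out + n_out

def layer_norm(dim):
    # scale (gamma) and shift (beta) vectors
    return 2 * dim

H, FF = 768, 3072  # hidden and feed-forward sizes from the repr

embeddings = (
    64001 * H        # word_embeddings
    + 512 * H        # position_embeddings
    + 2 * H          # token_type_embeddings
    + layer_norm(H)
)

bert_layer = (
    3 * dense(H, H)  # query, key, value projections
    + dense(H, H)    # attention output dense
    + layer_norm(H)  # attention output LayerNorm
    + dense(H, FF)   # intermediate dense
    + dense(FF, H)   # output dense
    + layer_norm(H)  # output LayerNorm
)

total = (
    embeddings
    + 12 * bert_layer  # (0-11): 12 x BertLayer
    + dense(H, H)      # pooler
    + dense(H, 17)     # tagger head: Linear(768, 17)
)
print(total)  # → 135207185, i.e. ~135M parameters
```

Most of the budget sits in the 64k-entry word embedding matrix (~49M parameters), which is the price of the large historic-multilingual vocabulary.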
2023-10-25 17:23:23,519 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 Train: 7142 sentences
2023-10-25 17:23:23,520 (train_with_dev=False, train_with_test=False)
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 Training Params:
2023-10-25 17:23:23,520 - learning_rate: "3e-05"
2023-10-25 17:23:23,520 - mini_batch_size: "4"
2023-10-25 17:23:23,520 - max_epochs: "10"
2023-10-25 17:23:23,520 - shuffle: "True"
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 Plugins:
2023-10-25 17:23:23,520 - TensorboardLogger
2023-10-25 17:23:23,520 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 17:23:23,520 - metric: "('micro avg', 'f1-score')"
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 Computation:
2023-10-25 17:23:23,520 - compute on device: cuda:0
2023-10-25 17:23:23,520 - embedding storage: none
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,520 ----------------------------------------------------------------------------------------------------
2023-10-25 17:23:23,521 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 17:23:33,478 epoch 1 - iter 178/1786 - loss 1.87528964 - time (sec): 9.96 - samples/sec: 2536.31 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:23:43,037 epoch 1 - iter 356/1786 - loss 1.18982744 - time (sec): 19.52 - samples/sec: 2601.34 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:23:52,403 epoch 1 - iter 534/1786 - loss 0.91799054 - time (sec): 28.88 - samples/sec: 2589.02 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:24:01,795 epoch 1 - iter 712/1786 - loss 0.74578333 - time (sec): 38.27 - samples/sec: 2602.25 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:24:11,181 epoch 1 - iter 890/1786 - loss 0.63832228 - time (sec): 47.66 - samples/sec: 2589.58 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:24:21,091 epoch 1 - iter 1068/1786 - loss 0.56408854 - time (sec): 57.57 - samples/sec: 2572.05 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:24:30,933 epoch 1 - iter 1246/1786 - loss 0.50330520 - time (sec): 67.41 - samples/sec: 2565.42 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:24:40,475 epoch 1 - iter 1424/1786 - loss 0.45667579 - time (sec): 76.95 - samples/sec: 2583.90 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:24:49,930 epoch 1 - iter 1602/1786 - loss 0.42322100 - time (sec): 86.41 - samples/sec: 2591.20 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:24:59,517 epoch 1 - iter 1780/1786 - loss 0.39579531 - time (sec): 96.00 - samples/sec: 2580.11 - lr: 0.000030 - momentum: 0.000000
2023-10-25 17:24:59,829 ----------------------------------------------------------------------------------------------------
2023-10-25 17:24:59,830 EPOCH 1 done: loss 0.3947 - lr: 0.000030
2023-10-25 17:25:04,012 DEV : loss 0.12539885938167572 - f1-score (micro avg) 0.7187
2023-10-25 17:25:04,032 saving best model
2023-10-25 17:25:04,511 ----------------------------------------------------------------------------------------------------
2023-10-25 17:25:14,065 epoch 2 - iter 178/1786 - loss 0.11189743 - time (sec): 9.55 - samples/sec: 2586.45 - lr: 0.000030 - momentum: 0.000000
2023-10-25 17:25:23,584 epoch 2 - iter 356/1786 - loss 0.10293552 - time (sec): 19.07 - samples/sec: 2574.86 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:25:32,838 epoch 2 - iter 534/1786 - loss 0.10659511 - time (sec): 28.33 - samples/sec: 2648.49 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:25:42,662 epoch 2 - iter 712/1786 - loss 0.11165249 - time (sec): 38.15 - samples/sec: 2652.80 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:25:51,915 epoch 2 - iter 890/1786 - loss 0.10776982 - time (sec): 47.40 - samples/sec: 2679.23 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:26:00,968 epoch 2 - iter 1068/1786 - loss 0.10827805 - time (sec): 56.45 - samples/sec: 2669.10 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:26:10,079 epoch 2 - iter 1246/1786 - loss 0.11103149 - time (sec): 65.57 - samples/sec: 2673.89 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:26:19,183 epoch 2 - iter 1424/1786 - loss 0.11107111 - time (sec): 74.67 - samples/sec: 2686.65 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:26:28,186 epoch 2 - iter 1602/1786 - loss 0.11239949 - time (sec): 83.67 - samples/sec: 2664.30 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:26:37,543 epoch 2 - iter 1780/1786 - loss 0.11131036 - time (sec): 93.03 - samples/sec: 2667.48 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:26:37,837 ----------------------------------------------------------------------------------------------------
2023-10-25 17:26:37,838 EPOCH 2 done: loss 0.1112 - lr: 0.000027
2023-10-25 17:26:42,572 DEV : loss 0.12097379565238953 - f1-score (micro avg) 0.775
2023-10-25 17:26:42,593 saving best model
2023-10-25 17:26:43,258 ----------------------------------------------------------------------------------------------------
2023-10-25 17:26:52,661 epoch 3 - iter 178/1786 - loss 0.08101302 - time (sec): 9.40 - samples/sec: 2495.76 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:27:01,957 epoch 3 - iter 356/1786 - loss 0.07838070 - time (sec): 18.70 - samples/sec: 2642.13 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:27:10,802 epoch 3 - iter 534/1786 - loss 0.07477397 - time (sec): 27.54 - samples/sec: 2695.98 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:27:19,571 epoch 3 - iter 712/1786 - loss 0.07540716 - time (sec): 36.31 - samples/sec: 2728.65 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:27:28,370 epoch 3 - iter 890/1786 - loss 0.07449414 - time (sec): 45.11 - samples/sec: 2762.94 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:27:37,759 epoch 3 - iter 1068/1786 - loss 0.07391613 - time (sec): 54.50 - samples/sec: 2746.67 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:27:46,941 epoch 3 - iter 1246/1786 - loss 0.07374199 - time (sec): 63.68 - samples/sec: 2737.91 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:27:56,304 epoch 3 - iter 1424/1786 - loss 0.07439314 - time (sec): 73.04 - samples/sec: 2698.11 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:28:05,514 epoch 3 - iter 1602/1786 - loss 0.07328092 - time (sec): 82.25 - samples/sec: 2717.57 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:28:14,550 epoch 3 - iter 1780/1786 - loss 0.07531818 - time (sec): 91.29 - samples/sec: 2717.15 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:28:14,857 ----------------------------------------------------------------------------------------------------
2023-10-25 17:28:14,857 EPOCH 3 done: loss 0.0756 - lr: 0.000023
2023-10-25 17:28:19,861 DEV : loss 0.13726350665092468 - f1-score (micro avg) 0.7928
2023-10-25 17:28:19,883 saving best model
2023-10-25 17:28:20,564 ----------------------------------------------------------------------------------------------------
2023-10-25 17:28:30,021 epoch 4 - iter 178/1786 - loss 0.04996112 - time (sec): 9.45 - samples/sec: 2744.64 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:28:39,557 epoch 4 - iter 356/1786 - loss 0.05423994 - time (sec): 18.99 - samples/sec: 2715.79 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:28:49,182 epoch 4 - iter 534/1786 - loss 0.05445421 - time (sec): 28.61 - samples/sec: 2621.06 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:28:59,104 epoch 4 - iter 712/1786 - loss 0.05330529 - time (sec): 38.54 - samples/sec: 2568.59 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:29:09,009 epoch 4 - iter 890/1786 - loss 0.05298052 - time (sec): 48.44 - samples/sec: 2568.65 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:29:18,512 epoch 4 - iter 1068/1786 - loss 0.05390792 - time (sec): 57.95 - samples/sec: 2587.39 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:29:27,598 epoch 4 - iter 1246/1786 - loss 0.05456634 - time (sec): 67.03 - samples/sec: 2586.59 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:29:36,978 epoch 4 - iter 1424/1786 - loss 0.05422428 - time (sec): 76.41 - samples/sec: 2598.33 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:29:46,218 epoch 4 - iter 1602/1786 - loss 0.05423185 - time (sec): 85.65 - samples/sec: 2614.31 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:29:54,773 epoch 4 - iter 1780/1786 - loss 0.05389469 - time (sec): 94.21 - samples/sec: 2625.28 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:29:55,096 ----------------------------------------------------------------------------------------------------
2023-10-25 17:29:55,097 EPOCH 4 done: loss 0.0537 - lr: 0.000020
2023-10-25 17:29:59,399 DEV : loss 0.16342049837112427 - f1-score (micro avg) 0.7707
2023-10-25 17:29:59,422 ----------------------------------------------------------------------------------------------------
2023-10-25 17:30:08,905 epoch 5 - iter 178/1786 - loss 0.04464999 - time (sec): 9.48 - samples/sec: 2445.91 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:30:18,589 epoch 5 - iter 356/1786 - loss 0.04124529 - time (sec): 19.16 - samples/sec: 2505.97 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:30:28,145 epoch 5 - iter 534/1786 - loss 0.04150244 - time (sec): 28.72 - samples/sec: 2535.88 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:30:37,781 epoch 5 - iter 712/1786 - loss 0.04088327 - time (sec): 38.36 - samples/sec: 2547.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:30:47,585 epoch 5 - iter 890/1786 - loss 0.04061157 - time (sec): 48.16 - samples/sec: 2553.53 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:30:57,459 epoch 5 - iter 1068/1786 - loss 0.04057435 - time (sec): 58.03 - samples/sec: 2551.95 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:31:06,910 epoch 5 - iter 1246/1786 - loss 0.04003354 - time (sec): 67.49 - samples/sec: 2554.71 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:31:16,488 epoch 5 - iter 1424/1786 - loss 0.04031376 - time (sec): 77.06 - samples/sec: 2553.26 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:31:26,644 epoch 5 - iter 1602/1786 - loss 0.04015972 - time (sec): 87.22 - samples/sec: 2558.40 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:31:36,587 epoch 5 - iter 1780/1786 - loss 0.04049095 - time (sec): 97.16 - samples/sec: 2550.72 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:31:36,969 ----------------------------------------------------------------------------------------------------
2023-10-25 17:31:36,970 EPOCH 5 done: loss 0.0404 - lr: 0.000017
2023-10-25 17:31:42,252 DEV : loss 0.1974058896303177 - f1-score (micro avg) 0.8089
2023-10-25 17:31:42,284 saving best model
2023-10-25 17:31:42,993 ----------------------------------------------------------------------------------------------------
2023-10-25 17:31:53,000 epoch 6 - iter 178/1786 - loss 0.03793523 - time (sec): 10.00 - samples/sec: 2367.43 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:32:03,065 epoch 6 - iter 356/1786 - loss 0.03298399 - time (sec): 20.07 - samples/sec: 2320.04 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:32:12,621 epoch 6 - iter 534/1786 - loss 0.03151283 - time (sec): 29.63 - samples/sec: 2435.85 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:32:21,750 epoch 6 - iter 712/1786 - loss 0.02930631 - time (sec): 38.75 - samples/sec: 2506.82 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:32:30,885 epoch 6 - iter 890/1786 - loss 0.02923823 - time (sec): 47.89 - samples/sec: 2562.27 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:32:40,336 epoch 6 - iter 1068/1786 - loss 0.02878535 - time (sec): 57.34 - samples/sec: 2586.14 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:32:49,610 epoch 6 - iter 1246/1786 - loss 0.02891460 - time (sec): 66.62 - samples/sec: 2597.26 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:32:58,853 epoch 6 - iter 1424/1786 - loss 0.02934355 - time (sec): 75.86 - samples/sec: 2616.11 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:33:08,130 epoch 6 - iter 1602/1786 - loss 0.03023334 - time (sec): 85.14 - samples/sec: 2615.92 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:33:17,113 epoch 6 - iter 1780/1786 - loss 0.03047538 - time (sec): 94.12 - samples/sec: 2638.32 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:33:17,392 ----------------------------------------------------------------------------------------------------
2023-10-25 17:33:17,392 EPOCH 6 done: loss 0.0304 - lr: 0.000013
2023-10-25 17:33:21,674 DEV : loss 0.19189877808094025 - f1-score (micro avg) 0.8043
2023-10-25 17:33:21,697 ----------------------------------------------------------------------------------------------------
2023-10-25 17:33:31,238 epoch 7 - iter 178/1786 - loss 0.01982990 - time (sec): 9.54 - samples/sec: 2512.17 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:33:41,384 epoch 7 - iter 356/1786 - loss 0.01840636 - time (sec): 19.69 - samples/sec: 2469.16 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:33:50,348 epoch 7 - iter 534/1786 - loss 0.01939487 - time (sec): 28.65 - samples/sec: 2606.14 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:33:59,629 epoch 7 - iter 712/1786 - loss 0.02046481 - time (sec): 37.93 - samples/sec: 2621.63 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:34:09,099 epoch 7 - iter 890/1786 - loss 0.02294225 - time (sec): 47.40 - samples/sec: 2642.53 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:34:18,192 epoch 7 - iter 1068/1786 - loss 0.02292824 - time (sec): 56.49 - samples/sec: 2671.35 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:34:27,413 epoch 7 - iter 1246/1786 - loss 0.02364405 - time (sec): 65.71 - samples/sec: 2669.59 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:34:36,232 epoch 7 - iter 1424/1786 - loss 0.02346007 - time (sec): 74.53 - samples/sec: 2658.47 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:34:45,410 epoch 7 - iter 1602/1786 - loss 0.02322992 - time (sec): 83.71 - samples/sec: 2661.35 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:34:54,732 epoch 7 - iter 1780/1786 - loss 0.02305501 - time (sec): 93.03 - samples/sec: 2668.23 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:34:55,055 ----------------------------------------------------------------------------------------------------
2023-10-25 17:34:55,056 EPOCH 7 done: loss 0.0230 - lr: 0.000010
2023-10-25 17:34:59,167 DEV : loss 0.2131357192993164 - f1-score (micro avg) 0.7831
2023-10-25 17:34:59,190 ----------------------------------------------------------------------------------------------------
2023-10-25 17:35:08,902 epoch 8 - iter 178/1786 - loss 0.01504470 - time (sec): 9.71 - samples/sec: 2654.96 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:35:18,633 epoch 8 - iter 356/1786 - loss 0.01739513 - time (sec): 19.44 - samples/sec: 2593.83 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:35:28,289 epoch 8 - iter 534/1786 - loss 0.01711964 - time (sec): 29.10 - samples/sec: 2571.05 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:35:38,087 epoch 8 - iter 712/1786 - loss 0.01732972 - time (sec): 38.90 - samples/sec: 2522.70 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:35:47,755 epoch 8 - iter 890/1786 - loss 0.01614740 - time (sec): 48.56 - samples/sec: 2516.79 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:35:57,597 epoch 8 - iter 1068/1786 - loss 0.01612817 - time (sec): 58.41 - samples/sec: 2526.85 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:36:07,183 epoch 8 - iter 1246/1786 - loss 0.01598880 - time (sec): 67.99 - samples/sec: 2529.07 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:36:16,440 epoch 8 - iter 1424/1786 - loss 0.01547610 - time (sec): 77.25 - samples/sec: 2538.84 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:36:25,664 epoch 8 - iter 1602/1786 - loss 0.01548184 - time (sec): 86.47 - samples/sec: 2564.69 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:36:34,480 epoch 8 - iter 1780/1786 - loss 0.01595884 - time (sec): 95.29 - samples/sec: 2602.70 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:36:34,803 ----------------------------------------------------------------------------------------------------
2023-10-25 17:36:34,803 EPOCH 8 done: loss 0.0160 - lr: 0.000007
2023-10-25 17:36:39,864 DEV : loss 0.2298695296049118 - f1-score (micro avg) 0.7898
2023-10-25 17:36:39,893 ----------------------------------------------------------------------------------------------------
2023-10-25 17:36:49,628 epoch 9 - iter 178/1786 - loss 0.00726213 - time (sec): 9.73 - samples/sec: 2600.72 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:36:59,807 epoch 9 - iter 356/1786 - loss 0.00965484 - time (sec): 19.91 - samples/sec: 2524.95 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:37:09,927 epoch 9 - iter 534/1786 - loss 0.01192364 - time (sec): 30.03 - samples/sec: 2468.77 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:37:19,730 epoch 9 - iter 712/1786 - loss 0.01160538 - time (sec): 39.83 - samples/sec: 2512.61 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:37:29,248 epoch 9 - iter 890/1786 - loss 0.01091175 - time (sec): 49.35 - samples/sec: 2540.27 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:37:38,393 epoch 9 - iter 1068/1786 - loss 0.01020469 - time (sec): 58.50 - samples/sec: 2544.09 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:37:47,923 epoch 9 - iter 1246/1786 - loss 0.01074955 - time (sec): 68.03 - samples/sec: 2573.10 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:37:57,591 epoch 9 - iter 1424/1786 - loss 0.01097906 - time (sec): 77.70 - samples/sec: 2550.93 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:38:07,462 epoch 9 - iter 1602/1786 - loss 0.01111155 - time (sec): 87.57 - samples/sec: 2539.02 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:38:17,221 epoch 9 - iter 1780/1786 - loss 0.01126743 - time (sec): 97.33 - samples/sec: 2546.25 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:38:17,555 ----------------------------------------------------------------------------------------------------
2023-10-25 17:38:17,556 EPOCH 9 done: loss 0.0112 - lr: 0.000003
2023-10-25 17:38:21,700 DEV : loss 0.2397831529378891 - f1-score (micro avg) 0.7839
2023-10-25 17:38:21,725 ----------------------------------------------------------------------------------------------------
2023-10-25 17:38:31,163 epoch 10 - iter 178/1786 - loss 0.01020874 - time (sec): 9.44 - samples/sec: 2571.28 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:38:41,133 epoch 10 - iter 356/1786 - loss 0.00987512 - time (sec): 19.41 - samples/sec: 2440.84 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:38:50,313 epoch 10 - iter 534/1786 - loss 0.00992584 - time (sec): 28.59 - samples/sec: 2567.48 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:38:59,708 epoch 10 - iter 712/1786 - loss 0.00990542 - time (sec): 37.98 - samples/sec: 2596.01 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:39:08,396 epoch 10 - iter 890/1786 - loss 0.01027219 - time (sec): 46.67 - samples/sec: 2613.03 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:39:17,994 epoch 10 - iter 1068/1786 - loss 0.00974899 - time (sec): 56.27 - samples/sec: 2630.15 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:39:27,610 epoch 10 - iter 1246/1786 - loss 0.00964506 - time (sec): 65.88 - samples/sec: 2626.26 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:39:36,684 epoch 10 - iter 1424/1786 - loss 0.00911795 - time (sec): 74.96 - samples/sec: 2618.10 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:39:45,879 epoch 10 - iter 1602/1786 - loss 0.00873331 - time (sec): 84.15 - samples/sec: 2636.68 - lr: 0.000000 - momentum: 0.000000
2023-10-25 17:39:54,763 epoch 10 - iter 1780/1786 - loss 0.00827384 - time (sec): 93.04 - samples/sec: 2665.21 - lr: 0.000000 - momentum: 0.000000
2023-10-25 17:39:55,053 ----------------------------------------------------------------------------------------------------
2023-10-25 17:39:55,053 EPOCH 10 done: loss 0.0082 - lr: 0.000000
2023-10-25 17:39:59,495 DEV : loss 0.23819302022457123 - f1-score (micro avg) 0.7941
2023-10-25 17:40:00,028 ----------------------------------------------------------------------------------------------------
2023-10-25 17:40:00,030 Loading model from best epoch ...
2023-10-25 17:40:01,959 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
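The 17 tags listed above follow the BIOES scheme: per entity type there are Single, Begin, Inside, and End markers, plus a shared O for non-entity tokens. A minimal decoder that turns such a tag sequence back into entity spans could look like this (an illustrative sketch, not Flair's internal span extraction; malformed sequences are simply dropped):

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (start, end, label) spans (inclusive indices)."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, _, lab = tag.partition("-")
        if prefix == "S":                 # single-token entity
            spans.append((i, i, lab))
            start, label = None, None
        elif prefix == "B":               # open a multi-token entity
            start, label = i, lab
        elif prefix == "E" and start is not None and lab == label:
            spans.append((start, i, lab))  # close the open entity
            start, label = None, None
        # "I" tags just continue an open entity; nothing to record yet
    return spans

print(bioes_to_spans(["B-PER", "E-PER", "O", "S-LOC"]))
# → [(0, 1, 'PER'), (3, 3, 'LOC')]
```

Compared with plain BIO, the explicit E and S markers let the classifier distinguish single-token entities from entity boundaries, which tends to help on span-level micro F1, the selection metric used in this run.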
2023-10-25 17:40:13,856
Results:
- F-score (micro) 0.687
- F-score (macro) 0.6126
- Accuracy 0.5402

By class:
              precision    recall  f1-score   support

         LOC     0.7286    0.6447    0.6841      1095
         PER     0.7741    0.7787    0.7764      1012
         ORG     0.4390    0.5546    0.4901       357
   HumanProd     0.4000    0.6667    0.5000        33

   micro avg     0.6875    0.6864    0.6870      2497
   macro avg     0.5854    0.6612    0.6126      2497
weighted avg     0.7013    0.6864    0.6913      2497
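The aggregate rows can be cross-checked from the per-class numbers: macro F1 is the unweighted mean of the four class F-scores, and the weighted average weights each class by its support (micro F1 would additionally need the raw TP/FP/FN counts, which the log does not include):

```python
# (f1-score, support) per class, copied from the table above
per_class = {
    "LOC": (0.6841, 1095),
    "PER": (0.7764, 1012),
    "ORG": (0.4901, 357),
    "HumanProd": (0.5000, 33),
}

macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)
total_support = sum(n for _, n in per_class.values())
weighted_f1 = sum(f1 * n for f1, n in per_class.values()) / total_support

print(round(macro_f1, 4))     # table reports 0.6126
print(round(weighted_f1, 4))  # table reports 0.6913
```

The gap between the dev score of the best epoch (0.8089) and the test micro F1 (0.6870) is worth noting: the low-support ORG and HumanProd classes drag the macro average down, while the weighted average stays close to the dominant LOC and PER classes.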
2023-10-25 17:40:13,857 ----------------------------------------------------------------------------------------------------