2023-10-17 17:44:25,631 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Train: 1166 sentences
2023-10-17 17:44:25,632 (train_with_dev=False, train_with_test=False)
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Training Params:
2023-10-17 17:44:25,632 - learning_rate: "5e-05"
2023-10-17 17:44:25,632 - mini_batch_size: "4"
2023-10-17 17:44:25,632 - max_epochs: "10"
2023-10-17 17:44:25,632 - shuffle: "True"
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Plugins:
2023-10-17 17:44:25,632 - TensorboardLogger
2023-10-17 17:44:25,632 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
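The LinearScheduler with warmup_fraction '0.1' corresponds to a linear warmup to the peak learning rate over the first 10% of all optimizer steps (here 10 epochs x 292 batches = 2920 steps, so 292 warmup steps), followed by linear decay to zero. This shape matches the lr values in the log below (~0.000005 at iter 29 of epoch 1, ~0.000049 near the end of epoch 1, 0.000000 at the end of training). A minimal sketch of that schedule, assuming the standard linear warmup/decay formula (the helper name is illustrative, not Flair's API):

```python
def linear_lr(step, peak_lr=5e-5, total_steps=2920, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)  # 292 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # warmup phase
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay phase

# Rounded the way the log prints lr, step 29 gives ~0.000005:
print(round(linear_lr(29), 6))
print(round(linear_lr(292), 6))   # peak, reached after epoch 1
print(round(linear_lr(2920), 6))  # fully decayed at the last step
```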
|
2023-10-17 17:44:25,632 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:44:25,632 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:44:25,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,632 Computation:
2023-10-17 17:44:25,633 - compute on device: cuda:0
2023-10-17 17:44:25,633 - embedding storage: none
2023-10-17 17:44:25,633 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,633 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 17:44:25,633 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,633 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:25,633 Logging anything other than scalars to TensorBoard is currently not supported.
|
2023-10-17 17:44:27,412 epoch 1 - iter 29/292 - loss 3.57048903 - time (sec): 1.78 - samples/sec: 2510.89 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:44:29,115 epoch 1 - iter 58/292 - loss 2.69134226 - time (sec): 3.48 - samples/sec: 2638.12 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:44:30,931 epoch 1 - iter 87/292 - loss 2.00635595 - time (sec): 5.30 - samples/sec: 2531.16 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:44:32,440 epoch 1 - iter 116/292 - loss 1.67505873 - time (sec): 6.81 - samples/sec: 2517.67 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:44:33,997 epoch 1 - iter 145/292 - loss 1.44361519 - time (sec): 8.36 - samples/sec: 2543.88 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:44:35,582 epoch 1 - iter 174/292 - loss 1.26249810 - time (sec): 9.95 - samples/sec: 2555.63 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:44:37,185 epoch 1 - iter 203/292 - loss 1.12316057 - time (sec): 11.55 - samples/sec: 2573.81 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:44:38,686 epoch 1 - iter 232/292 - loss 1.03035540 - time (sec): 13.05 - samples/sec: 2580.37 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:44:40,348 epoch 1 - iter 261/292 - loss 0.93353604 - time (sec): 14.71 - samples/sec: 2610.98 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:44:42,337 epoch 1 - iter 290/292 - loss 0.85173568 - time (sec): 16.70 - samples/sec: 2648.79 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:44:42,438 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:42,438 EPOCH 1 done: loss 0.8497 - lr: 0.000049
2023-10-17 17:44:43,302 DEV : loss 0.15992386639118195 - f1-score (micro avg) 0.525
2023-10-17 17:44:43,310 saving best model
2023-10-17 17:44:43,721 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:44:45,590 epoch 2 - iter 29/292 - loss 0.25767213 - time (sec): 1.87 - samples/sec: 2697.46 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:44:47,114 epoch 2 - iter 58/292 - loss 0.25456634 - time (sec): 3.39 - samples/sec: 2571.19 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:44:48,626 epoch 2 - iter 87/292 - loss 0.24045766 - time (sec): 4.90 - samples/sec: 2556.28 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:44:50,105 epoch 2 - iter 116/292 - loss 0.22630269 - time (sec): 6.38 - samples/sec: 2537.48 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:44:51,910 epoch 2 - iter 145/292 - loss 0.22581740 - time (sec): 8.19 - samples/sec: 2618.08 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:44:53,578 epoch 2 - iter 174/292 - loss 0.21214049 - time (sec): 9.85 - samples/sec: 2567.67 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:44:55,252 epoch 2 - iter 203/292 - loss 0.20001713 - time (sec): 11.53 - samples/sec: 2563.49 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:44:56,845 epoch 2 - iter 232/292 - loss 0.19935194 - time (sec): 13.12 - samples/sec: 2570.87 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:44:58,631 epoch 2 - iter 261/292 - loss 0.18698302 - time (sec): 14.91 - samples/sec: 2621.32 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:45:00,465 epoch 2 - iter 290/292 - loss 0.18609856 - time (sec): 16.74 - samples/sec: 2644.64 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:45:00,549 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:00,549 EPOCH 2 done: loss 0.1855 - lr: 0.000045
2023-10-17 17:45:02,044 DEV : loss 0.1280839890241623 - f1-score (micro avg) 0.6897
2023-10-17 17:45:02,049 saving best model
2023-10-17 17:45:02,529 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:45:04,015 epoch 3 - iter 29/292 - loss 0.11027878 - time (sec): 1.48 - samples/sec: 2381.84 - lr: 0.000044 - momentum: 0.000000
2023-10-17 17:45:05,837 epoch 3 - iter 58/292 - loss 0.10728336 - time (sec): 3.31 - samples/sec: 2645.70 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:45:07,539 epoch 3 - iter 87/292 - loss 0.09859684 - time (sec): 5.01 - samples/sec: 2698.14 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:45:09,206 epoch 3 - iter 116/292 - loss 0.09629751 - time (sec): 6.67 - samples/sec: 2670.46 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:45:10,833 epoch 3 - iter 145/292 - loss 0.10399359 - time (sec): 8.30 - samples/sec: 2627.51 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:45:12,436 epoch 3 - iter 174/292 - loss 0.10387591 - time (sec): 9.90 - samples/sec: 2632.99 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:45:14,016 epoch 3 - iter 203/292 - loss 0.10831849 - time (sec): 11.48 - samples/sec: 2604.01 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:45:15,776 epoch 3 - iter 232/292 - loss 0.10540846 - time (sec): 13.24 - samples/sec: 2646.30 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:45:17,546 epoch 3 - iter 261/292 - loss 0.10375178 - time (sec): 15.01 - samples/sec: 2643.50 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:45:19,277 epoch 3 - iter 290/292 - loss 0.10490339 - time (sec): 16.75 - samples/sec: 2645.03 - lr: 0.000039 - momentum: 0.000000
2023-10-17 17:45:19,369 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:19,370 EPOCH 3 done: loss 0.1046 - lr: 0.000039
2023-10-17 17:45:20,624 DEV : loss 0.09797272831201553 - f1-score (micro avg) 0.7626
2023-10-17 17:45:20,630 saving best model
2023-10-17 17:45:21,089 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:45:22,852 epoch 4 - iter 29/292 - loss 0.07421333 - time (sec): 1.76 - samples/sec: 2811.82 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:45:24,540 epoch 4 - iter 58/292 - loss 0.08033029 - time (sec): 3.44 - samples/sec: 2694.02 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:45:26,333 epoch 4 - iter 87/292 - loss 0.08827135 - time (sec): 5.24 - samples/sec: 2625.30 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:45:27,884 epoch 4 - iter 116/292 - loss 0.08338749 - time (sec): 6.79 - samples/sec: 2539.07 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:45:29,628 epoch 4 - iter 145/292 - loss 0.07896574 - time (sec): 8.53 - samples/sec: 2541.56 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:45:31,449 epoch 4 - iter 174/292 - loss 0.08184651 - time (sec): 10.35 - samples/sec: 2586.75 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:45:33,000 epoch 4 - iter 203/292 - loss 0.07823801 - time (sec): 11.90 - samples/sec: 2566.39 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:45:34,653 epoch 4 - iter 232/292 - loss 0.07894673 - time (sec): 13.56 - samples/sec: 2553.86 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:45:36,427 epoch 4 - iter 261/292 - loss 0.07449010 - time (sec): 15.33 - samples/sec: 2553.21 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:45:38,163 epoch 4 - iter 290/292 - loss 0.07251058 - time (sec): 17.07 - samples/sec: 2597.22 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:45:38,252 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:38,253 EPOCH 4 done: loss 0.0723 - lr: 0.000033
2023-10-17 17:45:39,502 DEV : loss 0.12399590760469437 - f1-score (micro avg) 0.7545
2023-10-17 17:45:39,508 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:45:41,142 epoch 5 - iter 29/292 - loss 0.06122202 - time (sec): 1.63 - samples/sec: 2384.89 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:45:42,878 epoch 5 - iter 58/292 - loss 0.05684321 - time (sec): 3.37 - samples/sec: 2592.69 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:45:44,550 epoch 5 - iter 87/292 - loss 0.04607849 - time (sec): 5.04 - samples/sec: 2681.49 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:45:46,392 epoch 5 - iter 116/292 - loss 0.04987496 - time (sec): 6.88 - samples/sec: 2649.25 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:45:47,996 epoch 5 - iter 145/292 - loss 0.04808783 - time (sec): 8.49 - samples/sec: 2600.28 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:45:49,559 epoch 5 - iter 174/292 - loss 0.04564148 - time (sec): 10.05 - samples/sec: 2605.18 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:45:51,278 epoch 5 - iter 203/292 - loss 0.04341831 - time (sec): 11.77 - samples/sec: 2614.97 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:45:52,908 epoch 5 - iter 232/292 - loss 0.04291497 - time (sec): 13.40 - samples/sec: 2621.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:45:54,762 epoch 5 - iter 261/292 - loss 0.04403114 - time (sec): 15.25 - samples/sec: 2619.43 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:45:56,349 epoch 5 - iter 290/292 - loss 0.04465672 - time (sec): 16.84 - samples/sec: 2618.02 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:45:56,464 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:56,464 EPOCH 5 done: loss 0.0447 - lr: 0.000028
2023-10-17 17:45:57,721 DEV : loss 0.15154214203357697 - f1-score (micro avg) 0.7635
2023-10-17 17:45:57,727 saving best model
2023-10-17 17:45:58,188 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:45:59,748 epoch 6 - iter 29/292 - loss 0.03976061 - time (sec): 1.55 - samples/sec: 2589.61 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:46:01,590 epoch 6 - iter 58/292 - loss 0.03082695 - time (sec): 3.39 - samples/sec: 2678.28 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:46:03,128 epoch 6 - iter 87/292 - loss 0.03254628 - time (sec): 4.93 - samples/sec: 2579.17 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:46:04,899 epoch 6 - iter 116/292 - loss 0.03205393 - time (sec): 6.70 - samples/sec: 2590.92 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:46:06,780 epoch 6 - iter 145/292 - loss 0.03458738 - time (sec): 8.58 - samples/sec: 2595.02 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:46:08,466 epoch 6 - iter 174/292 - loss 0.03208665 - time (sec): 10.26 - samples/sec: 2641.93 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:46:10,044 epoch 6 - iter 203/292 - loss 0.03331330 - time (sec): 11.84 - samples/sec: 2639.43 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:46:11,732 epoch 6 - iter 232/292 - loss 0.03066017 - time (sec): 13.53 - samples/sec: 2610.71 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:46:13,409 epoch 6 - iter 261/292 - loss 0.03183791 - time (sec): 15.21 - samples/sec: 2605.18 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:46:14,961 epoch 6 - iter 290/292 - loss 0.03229201 - time (sec): 16.76 - samples/sec: 2646.98 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:46:15,043 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:15,043 EPOCH 6 done: loss 0.0322 - lr: 0.000022
2023-10-17 17:46:16,317 DEV : loss 0.16179697215557098 - f1-score (micro avg) 0.744
2023-10-17 17:46:16,322 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:46:17,927 epoch 7 - iter 29/292 - loss 0.00901751 - time (sec): 1.60 - samples/sec: 2384.42 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:46:19,649 epoch 7 - iter 58/292 - loss 0.02237209 - time (sec): 3.33 - samples/sec: 2604.07 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:46:21,396 epoch 7 - iter 87/292 - loss 0.02251530 - time (sec): 5.07 - samples/sec: 2655.62 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:46:23,095 epoch 7 - iter 116/292 - loss 0.02465865 - time (sec): 6.77 - samples/sec: 2627.44 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:46:24,758 epoch 7 - iter 145/292 - loss 0.02146014 - time (sec): 8.43 - samples/sec: 2660.21 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:46:26,368 epoch 7 - iter 174/292 - loss 0.02068415 - time (sec): 10.04 - samples/sec: 2597.45 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:46:28,058 epoch 7 - iter 203/292 - loss 0.02188084 - time (sec): 11.73 - samples/sec: 2647.43 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:46:29,721 epoch 7 - iter 232/292 - loss 0.02297817 - time (sec): 13.40 - samples/sec: 2643.17 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:46:31,432 epoch 7 - iter 261/292 - loss 0.02240980 - time (sec): 15.11 - samples/sec: 2651.96 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:46:33,093 epoch 7 - iter 290/292 - loss 0.02259232 - time (sec): 16.77 - samples/sec: 2638.68 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:46:33,193 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:33,193 EPOCH 7 done: loss 0.0225 - lr: 0.000017
2023-10-17 17:46:34,648 DEV : loss 0.1630016714334488 - f1-score (micro avg) 0.7623
2023-10-17 17:46:34,653 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:46:36,218 epoch 8 - iter 29/292 - loss 0.01934696 - time (sec): 1.56 - samples/sec: 2570.85 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:46:37,938 epoch 8 - iter 58/292 - loss 0.02003869 - time (sec): 3.28 - samples/sec: 2562.21 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:46:39,606 epoch 8 - iter 87/292 - loss 0.01742481 - time (sec): 4.95 - samples/sec: 2539.20 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:46:41,191 epoch 8 - iter 116/292 - loss 0.01679156 - time (sec): 6.54 - samples/sec: 2539.65 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:46:42,848 epoch 8 - iter 145/292 - loss 0.01590234 - time (sec): 8.19 - samples/sec: 2582.42 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:46:44,530 epoch 8 - iter 174/292 - loss 0.01751979 - time (sec): 9.88 - samples/sec: 2615.82 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:46:46,030 epoch 8 - iter 203/292 - loss 0.01821748 - time (sec): 11.38 - samples/sec: 2607.56 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:46:47,894 epoch 8 - iter 232/292 - loss 0.01646602 - time (sec): 13.24 - samples/sec: 2642.17 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:46:49,628 epoch 8 - iter 261/292 - loss 0.01566264 - time (sec): 14.97 - samples/sec: 2617.99 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:46:51,454 epoch 8 - iter 290/292 - loss 0.01638527 - time (sec): 16.80 - samples/sec: 2637.25 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:46:51,546 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:51,547 EPOCH 8 done: loss 0.0163 - lr: 0.000011
2023-10-17 17:46:52,820 DEV : loss 0.16230525076389313 - f1-score (micro avg) 0.7716
2023-10-17 17:46:52,825 saving best model
2023-10-17 17:46:53,310 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:46:54,984 epoch 9 - iter 29/292 - loss 0.00632684 - time (sec): 1.67 - samples/sec: 2789.68 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:46:56,690 epoch 9 - iter 58/292 - loss 0.01106083 - time (sec): 3.38 - samples/sec: 2618.10 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:46:58,545 epoch 9 - iter 87/292 - loss 0.01162112 - time (sec): 5.23 - samples/sec: 2633.00 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:47:00,497 epoch 9 - iter 116/292 - loss 0.01186606 - time (sec): 7.19 - samples/sec: 2578.57 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:47:02,415 epoch 9 - iter 145/292 - loss 0.01186437 - time (sec): 9.10 - samples/sec: 2567.92 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:47:04,070 epoch 9 - iter 174/292 - loss 0.01099713 - time (sec): 10.76 - samples/sec: 2561.94 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:47:05,657 epoch 9 - iter 203/292 - loss 0.01147828 - time (sec): 12.35 - samples/sec: 2568.19 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:47:07,263 epoch 9 - iter 232/292 - loss 0.01048590 - time (sec): 13.95 - samples/sec: 2572.44 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:47:08,725 epoch 9 - iter 261/292 - loss 0.01103468 - time (sec): 15.41 - samples/sec: 2549.87 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:47:10,422 epoch 9 - iter 290/292 - loss 0.01088504 - time (sec): 17.11 - samples/sec: 2578.89 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:47:10,518 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:10,518 EPOCH 9 done: loss 0.0112 - lr: 0.000006
2023-10-17 17:47:11,779 DEV : loss 0.1698322296142578 - f1-score (micro avg) 0.7462
2023-10-17 17:47:11,784 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:47:13,358 epoch 10 - iter 29/292 - loss 0.00521108 - time (sec): 1.57 - samples/sec: 2788.97 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:47:14,996 epoch 10 - iter 58/292 - loss 0.00785901 - time (sec): 3.21 - samples/sec: 2645.32 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:47:16,751 epoch 10 - iter 87/292 - loss 0.00903835 - time (sec): 4.97 - samples/sec: 2593.06 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:47:18,520 epoch 10 - iter 116/292 - loss 0.00918701 - time (sec): 6.73 - samples/sec: 2664.33 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:47:20,212 epoch 10 - iter 145/292 - loss 0.00813227 - time (sec): 8.43 - samples/sec: 2681.74 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:47:21,773 epoch 10 - iter 174/292 - loss 0.00709191 - time (sec): 9.99 - samples/sec: 2654.89 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:47:23,671 epoch 10 - iter 203/292 - loss 0.00756203 - time (sec): 11.89 - samples/sec: 2653.04 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:47:25,354 epoch 10 - iter 232/292 - loss 0.00685533 - time (sec): 13.57 - samples/sec: 2671.01 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:47:26,919 epoch 10 - iter 261/292 - loss 0.00729752 - time (sec): 15.13 - samples/sec: 2663.07 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:47:28,459 epoch 10 - iter 290/292 - loss 0.00678873 - time (sec): 16.67 - samples/sec: 2652.99 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:47:28,548 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:28,549 EPOCH 10 done: loss 0.0068 - lr: 0.000000
2023-10-17 17:47:29,786 DEV : loss 0.17408965528011322 - f1-score (micro avg) 0.7623
|
2023-10-17 17:47:30,144 ----------------------------------------------------------------------------------------------------
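The "saving best model" lines in the epochs above follow a simple rule: a checkpoint is written whenever the dev f1-score (micro avg) exceeds the best value seen so far, which is why best-model.pt here comes from epoch 8. A sketch of that selection logic using the dev scores from this run (the helper is illustrative, not Flair's actual implementation):

```python
def best_model_epochs(dev_scores):
    """Return the 1-based epochs at which a new best dev score is reached."""
    best, saved = float("-inf"), []
    for epoch, score in enumerate(dev_scores, start=1):
        if score > best:         # strictly better than every earlier epoch
            best = score
            saved.append(epoch)  # the log prints "saving best model" here
    return saved

# Dev micro-F1 per epoch, copied from the log above:
dev_f1 = [0.525, 0.6897, 0.7626, 0.7545, 0.7635, 0.744, 0.7623, 0.7716, 0.7462, 0.7623]
print(best_model_epochs(dev_f1))  # [1, 2, 3, 5, 8] — the epochs that saved a model
```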
|
2023-10-17 17:47:30,145 Loading model from best epoch ...
2023-10-17 17:47:31,483 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
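The 17-entry tag dictionary (and the out_features=17 of the model's final linear layer) follows from the BIOES scheme: one O tag plus four position tags (S, B, E, I) for each of the four entity types. A quick sketch of that expansion (the helper name is illustrative):

```python
def bioes_tags(entity_types):
    """Expand entity types into a BIOES tag set: O plus S-/B-/E-/I- per type."""
    tags = ["O"]
    for etype in entity_types:
        tags.extend(f"{prefix}-{etype}" for prefix in ("S", "B", "E", "I"))
    return tags

tags = bioes_tags(["LOC", "PER", "ORG", "HumanProd"])
print(len(tags))  # 17, matching the tag dictionary and the linear layer width
```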
|
2023-10-17 17:47:33,834
Results:
- F-score (micro) 0.7609
- F-score (macro) 0.7061
- Accuracy 0.633

By class:
              precision    recall  f1-score   support

         PER     0.8104    0.8477    0.8287       348
         LOC     0.6337    0.8352    0.7207       261
         ORG     0.5532    0.5000    0.5253        52
   HumanProd     0.6923    0.8182    0.7500        22

   micro avg     0.7132    0.8155    0.7609       683
   macro avg     0.6724    0.7503    0.7061       683
weighted avg     0.7195    0.8155    0.7618       683

2023-10-17 17:47:33,834 ----------------------------------------------------------------------------------------------------
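The summary scores follow directly from the per-class results: the micro average pools true positives, predicted spans, and gold spans across all classes before computing F1, while the macro average is the unweighted mean of the per-class F1 scores. A sketch using integer counts reconstructed from the table above (tp = recall x support, predicted = tp / precision, rounded; these counts are an inference, not part of the log):

```python
# (true positives, predicted spans, gold spans) per class,
# reconstructed from the precision/recall/support columns above
counts = {
    "PER":       (295, 364, 348),
    "LOC":       (218, 344, 261),
    "ORG":       (26,   47,  52),
    "HumanProd": (18,   26,  22),
}

def f1(tp, pred, gold):
    # 2*tp/(pred+gold) equals the harmonic mean of precision and recall
    return 2 * tp / (pred + gold) if pred + gold else 0.0

tp, pred, gold = (sum(c[i] for c in counts.values()) for i in range(3))
micro_f1 = f1(tp, pred, gold)
macro_f1 = sum(f1(*c) for c in counts.values()) / len(counts)
print(round(micro_f1, 4), round(macro_f1, 4))  # reproduces 0.7609 and 0.7061
```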
|