|
2023-10-17 17:58:07,427 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:07,428 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 17:58:07,428 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:07,428 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-17 17:58:07,428 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:07,428 Train: 1166 sentences |
|
2023-10-17 17:58:07,428 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 17:58:07,429 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:07,429 Training Params: |
|
2023-10-17 17:58:07,429 - learning_rate: "5e-05" |
|
2023-10-17 17:58:07,429 - mini_batch_size: "4" |
|
2023-10-17 17:58:07,429 - max_epochs: "10" |
|
2023-10-17 17:58:07,429 - shuffle: "True" |
|
2023-10-17 17:58:07,429 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:07,429 Plugins: |
|
2023-10-17 17:58:07,429 - TensorboardLogger |
|
2023-10-17 17:58:07,429 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 17:58:07,429 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:07,429 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 17:58:07,429 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 17:58:07,429 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:07,429 Computation: |
|
2023-10-17 17:58:07,429 - compute on device: cuda:0 |
|
2023-10-17 17:58:07,429 - embedding storage: none |
|
2023-10-17 17:58:07,429 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:07,429 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3" |
|
2023-10-17 17:58:07,429 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:07,429 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:07,429 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 17:58:08,948 epoch 1 - iter 29/292 - loss 3.35693242 - time (sec): 1.52 - samples/sec: 2481.07 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 17:58:10,706 epoch 1 - iter 58/292 - loss 2.40158772 - time (sec): 3.28 - samples/sec: 2746.46 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 17:58:12,378 epoch 1 - iter 87/292 - loss 1.81703963 - time (sec): 4.95 - samples/sec: 2780.63 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 17:58:14,242 epoch 1 - iter 116/292 - loss 1.50647114 - time (sec): 6.81 - samples/sec: 2755.93 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 17:58:15,845 epoch 1 - iter 145/292 - loss 1.30749839 - time (sec): 8.41 - samples/sec: 2710.21 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 17:58:17,535 epoch 1 - iter 174/292 - loss 1.14398900 - time (sec): 10.10 - samples/sec: 2704.56 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 17:58:19,202 epoch 1 - iter 203/292 - loss 1.02388352 - time (sec): 11.77 - samples/sec: 2705.79 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 17:58:20,739 epoch 1 - iter 232/292 - loss 0.95215720 - time (sec): 13.31 - samples/sec: 2692.50 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 17:58:22,437 epoch 1 - iter 261/292 - loss 0.87431897 - time (sec): 15.01 - samples/sec: 2663.32 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 17:58:24,018 epoch 1 - iter 290/292 - loss 0.81843563 - time (sec): 16.59 - samples/sec: 2663.35 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 17:58:24,125 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:24,125 EPOCH 1 done: loss 0.8153 - lr: 0.000049 |
|
2023-10-17 17:58:24,983 DEV : loss 0.16816182434558868 - f1-score (micro avg) 0.4943 |
|
2023-10-17 17:58:24,997 saving best model |
|
2023-10-17 17:58:25,403 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:27,070 epoch 2 - iter 29/292 - loss 0.20843227 - time (sec): 1.66 - samples/sec: 2647.59 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 17:58:28,672 epoch 2 - iter 58/292 - loss 0.18562051 - time (sec): 3.27 - samples/sec: 2576.35 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 17:58:30,295 epoch 2 - iter 87/292 - loss 0.18807184 - time (sec): 4.89 - samples/sec: 2638.19 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 17:58:32,156 epoch 2 - iter 116/292 - loss 0.19178363 - time (sec): 6.75 - samples/sec: 2679.90 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 17:58:33,681 epoch 2 - iter 145/292 - loss 0.19660908 - time (sec): 8.28 - samples/sec: 2649.53 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 17:58:35,401 epoch 2 - iter 174/292 - loss 0.18834867 - time (sec): 10.00 - samples/sec: 2642.70 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 17:58:37,236 epoch 2 - iter 203/292 - loss 0.18795088 - time (sec): 11.83 - samples/sec: 2676.70 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 17:58:39,005 epoch 2 - iter 232/292 - loss 0.18476687 - time (sec): 13.60 - samples/sec: 2698.36 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 17:58:40,512 epoch 2 - iter 261/292 - loss 0.18398307 - time (sec): 15.11 - samples/sec: 2659.83 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 17:58:42,146 epoch 2 - iter 290/292 - loss 0.18080703 - time (sec): 16.74 - samples/sec: 2645.78 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 17:58:42,237 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:42,238 EPOCH 2 done: loss 0.1807 - lr: 0.000045 |
|
2023-10-17 17:58:43,941 DEV : loss 0.15552328526973724 - f1-score (micro avg) 0.6562 |
|
2023-10-17 17:58:43,947 saving best model |
|
2023-10-17 17:58:44,403 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:58:46,086 epoch 3 - iter 29/292 - loss 0.10474185 - time (sec): 1.68 - samples/sec: 2898.39 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 17:58:47,922 epoch 3 - iter 58/292 - loss 0.10398762 - time (sec): 3.52 - samples/sec: 2759.49 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 17:58:49,548 epoch 3 - iter 87/292 - loss 0.12767526 - time (sec): 5.14 - samples/sec: 2718.62 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 17:58:51,123 epoch 3 - iter 116/292 - loss 0.12096457 - time (sec): 6.72 - samples/sec: 2666.56 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 17:58:52,826 epoch 3 - iter 145/292 - loss 0.11567654 - time (sec): 8.42 - samples/sec: 2657.83 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 17:58:54,601 epoch 3 - iter 174/292 - loss 0.11268765 - time (sec): 10.20 - samples/sec: 2654.12 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 17:58:56,170 epoch 3 - iter 203/292 - loss 0.10757754 - time (sec): 11.76 - samples/sec: 2657.91 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 17:58:57,967 epoch 3 - iter 232/292 - loss 0.10369889 - time (sec): 13.56 - samples/sec: 2639.60 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 17:58:59,651 epoch 3 - iter 261/292 - loss 0.10296987 - time (sec): 15.25 - samples/sec: 2626.33 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 17:59:01,340 epoch 3 - iter 290/292 - loss 0.10510734 - time (sec): 16.93 - samples/sec: 2614.92 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 17:59:01,427 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:59:01,428 EPOCH 3 done: loss 0.1050 - lr: 0.000039 |
|
2023-10-17 17:59:02,790 DEV : loss 0.10585162043571472 - f1-score (micro avg) 0.7419 |
|
2023-10-17 17:59:02,798 saving best model |
|
2023-10-17 17:59:03,250 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:59:04,919 epoch 4 - iter 29/292 - loss 0.05183600 - time (sec): 1.66 - samples/sec: 2769.70 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 17:59:06,661 epoch 4 - iter 58/292 - loss 0.05491146 - time (sec): 3.41 - samples/sec: 2750.09 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 17:59:08,204 epoch 4 - iter 87/292 - loss 0.05566885 - time (sec): 4.95 - samples/sec: 2676.73 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 17:59:10,045 epoch 4 - iter 116/292 - loss 0.05537419 - time (sec): 6.79 - samples/sec: 2710.16 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 17:59:11,681 epoch 4 - iter 145/292 - loss 0.06564170 - time (sec): 8.43 - samples/sec: 2711.35 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 17:59:13,443 epoch 4 - iter 174/292 - loss 0.07091282 - time (sec): 10.19 - samples/sec: 2697.24 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 17:59:15,062 epoch 4 - iter 203/292 - loss 0.06947365 - time (sec): 11.81 - samples/sec: 2665.38 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 17:59:16,819 epoch 4 - iter 232/292 - loss 0.07485098 - time (sec): 13.56 - samples/sec: 2647.81 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 17:59:18,435 epoch 4 - iter 261/292 - loss 0.07325834 - time (sec): 15.18 - samples/sec: 2646.67 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 17:59:20,007 epoch 4 - iter 290/292 - loss 0.07071229 - time (sec): 16.75 - samples/sec: 2644.73 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 17:59:20,091 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:59:20,092 EPOCH 4 done: loss 0.0705 - lr: 0.000033 |
|
2023-10-17 17:59:21,397 DEV : loss 0.15273110568523407 - f1-score (micro avg) 0.7775 |
|
2023-10-17 17:59:21,403 saving best model |
|
2023-10-17 17:59:21,837 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:59:23,539 epoch 5 - iter 29/292 - loss 0.03385877 - time (sec): 1.70 - samples/sec: 2504.93 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 17:59:25,093 epoch 5 - iter 58/292 - loss 0.04827007 - time (sec): 3.25 - samples/sec: 2622.66 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 17:59:26,808 epoch 5 - iter 87/292 - loss 0.04812479 - time (sec): 4.97 - samples/sec: 2736.37 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 17:59:28,532 epoch 5 - iter 116/292 - loss 0.04989877 - time (sec): 6.69 - samples/sec: 2676.73 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 17:59:30,292 epoch 5 - iter 145/292 - loss 0.05178593 - time (sec): 8.45 - samples/sec: 2646.29 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 17:59:31,921 epoch 5 - iter 174/292 - loss 0.04817322 - time (sec): 10.08 - samples/sec: 2640.85 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 17:59:33,512 epoch 5 - iter 203/292 - loss 0.04781243 - time (sec): 11.67 - samples/sec: 2643.03 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 17:59:35,227 epoch 5 - iter 232/292 - loss 0.04594962 - time (sec): 13.39 - samples/sec: 2631.40 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 17:59:36,905 epoch 5 - iter 261/292 - loss 0.04519214 - time (sec): 15.07 - samples/sec: 2644.65 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 17:59:38,544 epoch 5 - iter 290/292 - loss 0.04462130 - time (sec): 16.71 - samples/sec: 2652.22 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 17:59:38,636 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:59:38,637 EPOCH 5 done: loss 0.0446 - lr: 0.000028 |
|
2023-10-17 17:59:39,971 DEV : loss 0.1602601408958435 - f1-score (micro avg) 0.7296 |
|
2023-10-17 17:59:39,978 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:59:41,745 epoch 6 - iter 29/292 - loss 0.03117945 - time (sec): 1.77 - samples/sec: 2448.91 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 17:59:43,627 epoch 6 - iter 58/292 - loss 0.02884635 - time (sec): 3.65 - samples/sec: 2482.68 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 17:59:45,431 epoch 6 - iter 87/292 - loss 0.02959983 - time (sec): 5.45 - samples/sec: 2387.97 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 17:59:47,014 epoch 6 - iter 116/292 - loss 0.03130694 - time (sec): 7.03 - samples/sec: 2332.69 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 17:59:48,803 epoch 6 - iter 145/292 - loss 0.02949162 - time (sec): 8.82 - samples/sec: 2410.79 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 17:59:50,550 epoch 6 - iter 174/292 - loss 0.03282502 - time (sec): 10.57 - samples/sec: 2485.60 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 17:59:52,093 epoch 6 - iter 203/292 - loss 0.03369730 - time (sec): 12.11 - samples/sec: 2487.82 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 17:59:53,748 epoch 6 - iter 232/292 - loss 0.03273151 - time (sec): 13.77 - samples/sec: 2494.63 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 17:59:55,348 epoch 6 - iter 261/292 - loss 0.03486864 - time (sec): 15.37 - samples/sec: 2521.05 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 17:59:57,157 epoch 6 - iter 290/292 - loss 0.03165586 - time (sec): 17.18 - samples/sec: 2573.35 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 17:59:57,250 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:59:57,250 EPOCH 6 done: loss 0.0319 - lr: 0.000022 |
|
2023-10-17 17:59:58,554 DEV : loss 0.1713484823703766 - f1-score (micro avg) 0.773 |
|
2023-10-17 17:59:58,561 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:00:00,155 epoch 7 - iter 29/292 - loss 0.02059682 - time (sec): 1.59 - samples/sec: 2614.99 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 18:00:01,684 epoch 7 - iter 58/292 - loss 0.02743531 - time (sec): 3.12 - samples/sec: 2516.97 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:00:03,380 epoch 7 - iter 87/292 - loss 0.02412808 - time (sec): 4.82 - samples/sec: 2563.43 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:00:04,976 epoch 7 - iter 116/292 - loss 0.02556567 - time (sec): 6.41 - samples/sec: 2576.34 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:00:06,649 epoch 7 - iter 145/292 - loss 0.03034182 - time (sec): 8.09 - samples/sec: 2638.95 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:00:08,341 epoch 7 - iter 174/292 - loss 0.02823755 - time (sec): 9.78 - samples/sec: 2600.27 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:00:10,116 epoch 7 - iter 203/292 - loss 0.02560592 - time (sec): 11.55 - samples/sec: 2602.05 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:00:11,803 epoch 7 - iter 232/292 - loss 0.02413585 - time (sec): 13.24 - samples/sec: 2583.25 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:00:13,524 epoch 7 - iter 261/292 - loss 0.02391633 - time (sec): 14.96 - samples/sec: 2594.87 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 18:00:15,255 epoch 7 - iter 290/292 - loss 0.02352649 - time (sec): 16.69 - samples/sec: 2626.34 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 18:00:15,455 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:00:15,456 EPOCH 7 done: loss 0.0234 - lr: 0.000017 |
|
2023-10-17 18:00:16,783 DEV : loss 0.1889658421278 - f1-score (micro avg) 0.756 |
|
2023-10-17 18:00:16,790 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:00:18,494 epoch 8 - iter 29/292 - loss 0.01156358 - time (sec): 1.70 - samples/sec: 2576.80 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:00:20,203 epoch 8 - iter 58/292 - loss 0.01363486 - time (sec): 3.41 - samples/sec: 2579.01 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:00:21,863 epoch 8 - iter 87/292 - loss 0.01449766 - time (sec): 5.07 - samples/sec: 2619.98 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:00:23,775 epoch 8 - iter 116/292 - loss 0.01864699 - time (sec): 6.98 - samples/sec: 2550.16 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:00:25,440 epoch 8 - iter 145/292 - loss 0.01972188 - time (sec): 8.65 - samples/sec: 2588.01 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:00:27,084 epoch 8 - iter 174/292 - loss 0.01866095 - time (sec): 10.29 - samples/sec: 2627.52 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 18:00:28,834 epoch 8 - iter 203/292 - loss 0.01727761 - time (sec): 12.04 - samples/sec: 2658.79 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 18:00:30,336 epoch 8 - iter 232/292 - loss 0.01740192 - time (sec): 13.54 - samples/sec: 2627.55 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:00:32,062 epoch 8 - iter 261/292 - loss 0.01627368 - time (sec): 15.27 - samples/sec: 2641.10 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:00:33,679 epoch 8 - iter 290/292 - loss 0.01606708 - time (sec): 16.89 - samples/sec: 2620.90 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:00:33,770 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:00:33,770 EPOCH 8 done: loss 0.0160 - lr: 0.000011 |
|
2023-10-17 18:00:35,012 DEV : loss 0.20527206361293793 - f1-score (micro avg) 0.7478 |
|
2023-10-17 18:00:35,018 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:00:36,610 epoch 9 - iter 29/292 - loss 0.01233241 - time (sec): 1.59 - samples/sec: 2518.86 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:00:38,511 epoch 9 - iter 58/292 - loss 0.01637592 - time (sec): 3.49 - samples/sec: 2727.71 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:00:40,332 epoch 9 - iter 87/292 - loss 0.02791061 - time (sec): 5.31 - samples/sec: 2765.82 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:00:41,916 epoch 9 - iter 116/292 - loss 0.02442203 - time (sec): 6.90 - samples/sec: 2697.63 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:00:43,406 epoch 9 - iter 145/292 - loss 0.02208196 - time (sec): 8.39 - samples/sec: 2646.99 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:00:45,075 epoch 9 - iter 174/292 - loss 0.01999989 - time (sec): 10.06 - samples/sec: 2666.89 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:00:46,681 epoch 9 - iter 203/292 - loss 0.01804618 - time (sec): 11.66 - samples/sec: 2648.58 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 18:00:48,397 epoch 9 - iter 232/292 - loss 0.01685413 - time (sec): 13.38 - samples/sec: 2684.23 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 18:00:49,945 epoch 9 - iter 261/292 - loss 0.01630731 - time (sec): 14.93 - samples/sec: 2653.53 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:00:51,498 epoch 9 - iter 290/292 - loss 0.01472446 - time (sec): 16.48 - samples/sec: 2667.85 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:00:51,635 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:00:51,635 EPOCH 9 done: loss 0.0146 - lr: 0.000006 |
|
2023-10-17 18:00:52,961 DEV : loss 0.19456231594085693 - f1-score (micro avg) 0.7682 |
|
2023-10-17 18:00:52,968 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:00:54,993 epoch 10 - iter 29/292 - loss 0.01092397 - time (sec): 2.02 - samples/sec: 2646.57 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:00:56,816 epoch 10 - iter 58/292 - loss 0.00845707 - time (sec): 3.85 - samples/sec: 2558.88 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:00:58,489 epoch 10 - iter 87/292 - loss 0.00773468 - time (sec): 5.52 - samples/sec: 2551.82 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:01:00,138 epoch 10 - iter 116/292 - loss 0.00801451 - time (sec): 7.17 - samples/sec: 2554.52 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:01:01,812 epoch 10 - iter 145/292 - loss 0.00771078 - time (sec): 8.84 - samples/sec: 2498.95 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:01:03,554 epoch 10 - iter 174/292 - loss 0.00748683 - time (sec): 10.58 - samples/sec: 2490.12 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:01:05,404 epoch 10 - iter 203/292 - loss 0.00738807 - time (sec): 12.43 - samples/sec: 2525.12 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:01:07,056 epoch 10 - iter 232/292 - loss 0.00744920 - time (sec): 14.09 - samples/sec: 2533.47 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 18:01:08,664 epoch 10 - iter 261/292 - loss 0.00703627 - time (sec): 15.69 - samples/sec: 2522.98 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 18:01:10,405 epoch 10 - iter 290/292 - loss 0.00754299 - time (sec): 17.44 - samples/sec: 2529.90 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 18:01:10,507 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:01:10,507 EPOCH 10 done: loss 0.0075 - lr: 0.000000 |
|
2023-10-17 18:01:11,756 DEV : loss 0.20955659449100494 - f1-score (micro avg) 0.7639 |
|
2023-10-17 18:01:12,085 ---------------------------------------------------------------------------------------------------- |
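Note on the lr column: the values logged in the epoch lines above appear consistent with a linear schedule that warms up over the first 10% of all steps to the configured 5e-05 and then decays linearly to zero. The following is a minimal Python sketch under those assumptions, not Flair's LinearScheduler code. The function names are hypothetical, and the sketch further assumes that the log prints the rate used for the batch just processed (one scheduler step fewer than the global iteration count).

```python
import math


def steps_per_epoch(num_sentences: int, mini_batch_size: int) -> int:
    """Number of mini-batches per epoch; the last batch may be partial."""
    return math.ceil(num_sentences / mini_batch_size)


def linear_schedule_lr(steps_done: int, total_steps: int,
                       peak_lr: float, warmup_fraction: float) -> float:
    """Assumed schedule: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if steps_done < warmup_steps:
        return peak_lr * steps_done / warmup_steps
    decay_steps = max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, (total_steps - steps_done) / decay_steps)


def logged_lr(epoch: int, iteration: int, per_epoch: int, total_steps: int,
              peak_lr: float = 5e-05, warmup_fraction: float = 0.1) -> float:
    """Rate for a given log line, assuming it reflects steps_done = global iteration - 1."""
    global_iter = (epoch - 1) * per_epoch + iteration
    return linear_schedule_lr(global_iter - 1, total_steps, peak_lr, warmup_fraction)


# Values taken from the log header: 1166 train sentences, batch size 4, 10 epochs.
per_epoch = steps_per_epoch(1166, 4)
total = per_epoch * 10

for epoch, iteration in [(1, 29), (1, 145), (5, 29), (10, 290)]:
    lr = logged_lr(epoch, iteration, per_epoch, total)
    print(f"epoch {epoch} - iter {iteration}/{per_epoch} - lr: {lr:.6f}")
```

With 1166 sentences and a mini-batch size of 4, the sketch gives 292 iterations per epoch and 2920 steps in total, which matches the `iter .../292` counters in the log.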
|
2023-10-17 18:01:12,086 Loading model from best epoch ... |
|
2023-10-17 18:01:13,492 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-17 18:01:15,926 |
|
Results: |
|
- F-score (micro) 0.7333 |
|
- F-score (macro) 0.6735 |
|
- Accuracy 0.6027 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         PER     0.7995    0.8362    0.8174       348 |
|
         LOC     0.6030    0.7739    0.6779       261 |
|
         ORG     0.4884    0.4038    0.4421        52 |
|
   HumanProd     0.9333    0.6364    0.7568        22 |
|
|
|
   micro avg     0.6975    0.7731    0.7333       683 |
|
   macro avg     0.7060    0.6626    0.6735       683 |
|
weighted avg     0.7050    0.7731    0.7336       683 |
|
|
|
2023-10-17 18:01:15,926 ---------------------------------------------------------------------------------------------------- |
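Note on the averages in the results table: each f1-score is the harmonic mean of precision and recall, the macro average is the unweighted mean over the four classes, and the weighted average weights each class by its support. The sketch below recomputes them from the rounded figures printed in the log, so agreement is only expected to about three decimal places. The helper name `f1` and the dictionary layout are illustrative assumptions, not part of Flair.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# (precision, recall, support) per class, copied from the "By class" table.
by_class = {
    "PER": (0.7995, 0.8362, 348),
    "LOC": (0.6030, 0.7739, 261),
    "ORG": (0.4884, 0.4038, 52),
    "HumanProd": (0.9333, 0.6364, 22),
}

class_f1 = {name: f1(p, r) for name, (p, r, _) in by_class.items()}
total_support = sum(support for _, _, support in by_class.values())

# Macro average: every class counts equally, regardless of size.
macro_f1 = sum(class_f1.values()) / len(class_f1)

# Weighted average: each class counts in proportion to its support.
weighted_f1 = sum(
    class_f1[name] * support for name, (_, _, support) in by_class.items()
) / total_support

# Micro F1 from the micro-averaged precision and recall in the table.
micro_f1 = f1(0.6975, 0.7731)

print(f"macro {macro_f1:.4f}  weighted {weighted_f1:.4f}  micro {micro_f1:.4f}")
```

The gap between the micro F1 (0.7333) and the macro F1 (0.6735) largely reflects the weak ORG class, which has only 52 of the 683 gold entities and therefore pulls the macro average down far more than the micro average.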
|
|