|
2023-10-17 18:29:21,229 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:21,230 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 18:29:21,230 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:21,230 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-17 18:29:21,230 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:21,230 Train: 1166 sentences |
|
2023-10-17 18:29:21,230 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 18:29:21,230 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:21,230 Training Params: |
|
2023-10-17 18:29:21,230 - learning_rate: "3e-05" |
|
2023-10-17 18:29:21,230 - mini_batch_size: "8" |
|
2023-10-17 18:29:21,230 - max_epochs: "10" |
|
2023-10-17 18:29:21,231 - shuffle: "True" |
|
2023-10-17 18:29:21,231 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:21,231 Plugins: |
|
2023-10-17 18:29:21,231 - TensorboardLogger |
|
2023-10-17 18:29:21,231 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 18:29:21,231 ---------------------------------------------------------------------------------------------------- |
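Note on the `lr` column in the iteration lines below: the values are consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The following is a minimal sketch of that schedule, not Flair's own implementation. The function and variable names are hypothetical, and it assumes the logged `lr` is the rate after the given number of optimizer steps, with the last mini-batch of each epoch being a partial one.

```python
import math

# Values copied from the "Train" and "Training Params" blocks of this log.
train_sentences = 1166
mini_batch_size = 8
max_epochs = 10
peak_lr = 3e-05
warmup_fraction = 0.1

# Assumption: a final partial batch counts as one iteration.
iters_per_epoch = math.ceil(train_sentences / mini_batch_size)
total_steps = iters_per_epoch * max_epochs
warmup_steps = round(total_steps * warmup_fraction)


def linear_lr(step: int) -> float:
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


# Compare a few (epoch, iter) pairs against the lr column of the log.
for epoch, it in [(1, 14), (1, 140), (5, 14), (10, 140)]:
    step = (epoch - 1) * iters_per_epoch + it
    print(f"epoch {epoch} - iter {it}/{iters_per_epoch} - lr: {linear_lr(step):.6f}")
```

Under these assumptions the warmup ends exactly at the end of epoch 1, which matches the point where the logged `lr` peaks and starts to fall.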
|
2023-10-17 18:29:21,231 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 18:29:21,231 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 18:29:21,231 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:21,231 Computation: |
|
2023-10-17 18:29:21,231 - compute on device: cuda:0 |
|
2023-10-17 18:29:21,231 - embedding storage: none |
|
2023-10-17 18:29:21,231 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:21,231 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-17 18:29:21,231 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:21,231 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:21,231 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 18:29:22,716 epoch 1 - iter 14/146 - loss 3.50355148 - time (sec): 1.48 - samples/sec: 2833.13 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:29:24,029 epoch 1 - iter 28/146 - loss 3.25324592 - time (sec): 2.80 - samples/sec: 3022.82 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:29:25,560 epoch 1 - iter 42/146 - loss 2.83306395 - time (sec): 4.33 - samples/sec: 2941.17 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:29:26,952 epoch 1 - iter 56/146 - loss 2.34617359 - time (sec): 5.72 - samples/sec: 2903.91 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:29:28,446 epoch 1 - iter 70/146 - loss 1.98648803 - time (sec): 7.21 - samples/sec: 2891.78 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:29:29,805 epoch 1 - iter 84/146 - loss 1.76467730 - time (sec): 8.57 - samples/sec: 2921.92 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 18:29:31,676 epoch 1 - iter 98/146 - loss 1.55755458 - time (sec): 10.44 - samples/sec: 2869.23 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:29:33,154 epoch 1 - iter 112/146 - loss 1.39712517 - time (sec): 11.92 - samples/sec: 2890.02 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 18:29:34,404 epoch 1 - iter 126/146 - loss 1.28447747 - time (sec): 13.17 - samples/sec: 2904.82 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:29:35,692 epoch 1 - iter 140/146 - loss 1.20091206 - time (sec): 14.46 - samples/sec: 2894.54 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:29:36,539 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:36,539 EPOCH 1 done: loss 1.1413 - lr: 0.000029 |
|
2023-10-17 18:29:37,573 DEV : loss 0.2010345607995987 - f1-score (micro avg) 0.4167 |
|
2023-10-17 18:29:37,578 saving best model |
|
2023-10-17 18:29:37,947 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:39,304 epoch 2 - iter 14/146 - loss 0.21316978 - time (sec): 1.36 - samples/sec: 3106.80 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 18:29:40,935 epoch 2 - iter 28/146 - loss 0.31888932 - time (sec): 2.99 - samples/sec: 3057.61 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:29:42,400 epoch 2 - iter 42/146 - loss 0.29320937 - time (sec): 4.45 - samples/sec: 3055.95 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:29:43,927 epoch 2 - iter 56/146 - loss 0.26948633 - time (sec): 5.98 - samples/sec: 3026.62 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:29:45,777 epoch 2 - iter 70/146 - loss 0.25509469 - time (sec): 7.83 - samples/sec: 2887.24 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 18:29:47,099 epoch 2 - iter 84/146 - loss 0.24355715 - time (sec): 9.15 - samples/sec: 2854.39 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 18:29:48,415 epoch 2 - iter 98/146 - loss 0.23666148 - time (sec): 10.47 - samples/sec: 2871.26 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 18:29:49,979 epoch 2 - iter 112/146 - loss 0.22451920 - time (sec): 12.03 - samples/sec: 2868.03 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 18:29:51,508 epoch 2 - iter 126/146 - loss 0.21799436 - time (sec): 13.56 - samples/sec: 2858.93 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 18:29:52,916 epoch 2 - iter 140/146 - loss 0.21515085 - time (sec): 14.97 - samples/sec: 2836.38 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 18:29:53,623 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:53,623 EPOCH 2 done: loss 0.2128 - lr: 0.000027 |
|
2023-10-17 18:29:54,918 DEV : loss 0.12932813167572021 - f1-score (micro avg) 0.6043 |
|
2023-10-17 18:29:54,925 saving best model |
|
2023-10-17 18:29:55,394 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:29:57,051 epoch 3 - iter 14/146 - loss 0.14055569 - time (sec): 1.66 - samples/sec: 3029.42 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:29:58,264 epoch 3 - iter 28/146 - loss 0.13162903 - time (sec): 2.87 - samples/sec: 3054.59 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:29:59,912 epoch 3 - iter 42/146 - loss 0.11361109 - time (sec): 4.52 - samples/sec: 3029.30 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:30:01,662 epoch 3 - iter 56/146 - loss 0.12750318 - time (sec): 6.27 - samples/sec: 2868.35 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:30:03,094 epoch 3 - iter 70/146 - loss 0.12967160 - time (sec): 7.70 - samples/sec: 2920.32 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:30:04,308 epoch 3 - iter 84/146 - loss 0.13277965 - time (sec): 8.91 - samples/sec: 2928.00 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:30:05,767 epoch 3 - iter 98/146 - loss 0.13399034 - time (sec): 10.37 - samples/sec: 2944.70 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:30:07,214 epoch 3 - iter 112/146 - loss 0.13409185 - time (sec): 11.82 - samples/sec: 2917.61 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:30:08,579 epoch 3 - iter 126/146 - loss 0.12926495 - time (sec): 13.18 - samples/sec: 2926.05 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:30:09,889 epoch 3 - iter 140/146 - loss 0.12582582 - time (sec): 14.49 - samples/sec: 2948.69 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:30:10,469 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:30:10,470 EPOCH 3 done: loss 0.1239 - lr: 0.000024 |
|
2023-10-17 18:30:11,759 DEV : loss 0.11435481905937195 - f1-score (micro avg) 0.6991 |
|
2023-10-17 18:30:11,766 saving best model |
|
2023-10-17 18:30:12,263 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:30:13,869 epoch 4 - iter 14/146 - loss 0.09119870 - time (sec): 1.60 - samples/sec: 3135.17 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 18:30:15,299 epoch 4 - iter 28/146 - loss 0.08194537 - time (sec): 3.03 - samples/sec: 3091.34 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 18:30:16,620 epoch 4 - iter 42/146 - loss 0.08062505 - time (sec): 4.36 - samples/sec: 3004.00 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 18:30:17,929 epoch 4 - iter 56/146 - loss 0.08080567 - time (sec): 5.66 - samples/sec: 3013.90 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 18:30:19,776 epoch 4 - iter 70/146 - loss 0.08296659 - time (sec): 7.51 - samples/sec: 2889.81 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 18:30:21,196 epoch 4 - iter 84/146 - loss 0.08505015 - time (sec): 8.93 - samples/sec: 2843.59 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:30:22,632 epoch 4 - iter 98/146 - loss 0.08597811 - time (sec): 10.37 - samples/sec: 2861.21 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:30:24,375 epoch 4 - iter 112/146 - loss 0.08396738 - time (sec): 12.11 - samples/sec: 2831.51 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:30:25,867 epoch 4 - iter 126/146 - loss 0.08482342 - time (sec): 13.60 - samples/sec: 2837.10 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:30:27,449 epoch 4 - iter 140/146 - loss 0.08515369 - time (sec): 15.18 - samples/sec: 2832.57 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:30:27,994 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:30:27,994 EPOCH 4 done: loss 0.0841 - lr: 0.000020 |
|
2023-10-17 18:30:29,328 DEV : loss 0.10569198429584503 - f1-score (micro avg) 0.7539 |
|
2023-10-17 18:30:29,334 saving best model |
|
2023-10-17 18:30:29,848 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:30:31,801 epoch 5 - iter 14/146 - loss 0.05778226 - time (sec): 1.95 - samples/sec: 2693.83 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:30:32,972 epoch 5 - iter 28/146 - loss 0.06230529 - time (sec): 3.12 - samples/sec: 2757.51 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:30:34,248 epoch 5 - iter 42/146 - loss 0.06228412 - time (sec): 4.40 - samples/sec: 2778.13 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:30:35,907 epoch 5 - iter 56/146 - loss 0.05920739 - time (sec): 6.06 - samples/sec: 2743.48 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:30:37,441 epoch 5 - iter 70/146 - loss 0.05930941 - time (sec): 7.59 - samples/sec: 2838.75 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:30:38,805 epoch 5 - iter 84/146 - loss 0.06127520 - time (sec): 8.95 - samples/sec: 2880.65 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:30:40,183 epoch 5 - iter 98/146 - loss 0.05945058 - time (sec): 10.33 - samples/sec: 2873.88 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:30:41,453 epoch 5 - iter 112/146 - loss 0.05860727 - time (sec): 11.60 - samples/sec: 2891.09 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:30:42,816 epoch 5 - iter 126/146 - loss 0.05959648 - time (sec): 12.97 - samples/sec: 2918.13 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 18:30:44,239 epoch 5 - iter 140/146 - loss 0.06104726 - time (sec): 14.39 - samples/sec: 2958.14 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 18:30:44,915 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:30:44,915 EPOCH 5 done: loss 0.0597 - lr: 0.000017 |
|
2023-10-17 18:30:46,210 DEV : loss 0.10582081973552704 - f1-score (micro avg) 0.7652 |
|
2023-10-17 18:30:46,215 saving best model |
|
2023-10-17 18:30:46,672 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:30:48,511 epoch 6 - iter 14/146 - loss 0.05051652 - time (sec): 1.84 - samples/sec: 2994.31 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:30:49,651 epoch 6 - iter 28/146 - loss 0.04479525 - time (sec): 2.98 - samples/sec: 3079.50 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:30:51,160 epoch 6 - iter 42/146 - loss 0.04342063 - time (sec): 4.49 - samples/sec: 3009.15 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:30:52,674 epoch 6 - iter 56/146 - loss 0.03906415 - time (sec): 6.00 - samples/sec: 3005.44 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:30:54,064 epoch 6 - iter 70/146 - loss 0.04181724 - time (sec): 7.39 - samples/sec: 2992.72 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:30:55,684 epoch 6 - iter 84/146 - loss 0.03969524 - time (sec): 9.01 - samples/sec: 2881.81 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:30:57,131 epoch 6 - iter 98/146 - loss 0.03973318 - time (sec): 10.46 - samples/sec: 2891.90 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:30:58,685 epoch 6 - iter 112/146 - loss 0.04223050 - time (sec): 12.01 - samples/sec: 2888.74 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:30:59,993 epoch 6 - iter 126/146 - loss 0.04220367 - time (sec): 13.32 - samples/sec: 2911.54 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:31:01,341 epoch 6 - iter 140/146 - loss 0.04001740 - time (sec): 14.67 - samples/sec: 2917.48 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:31:01,845 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:31:01,845 EPOCH 6 done: loss 0.0400 - lr: 0.000014 |
|
2023-10-17 18:31:03,126 DEV : loss 0.13629357516765594 - f1-score (micro avg) 0.7617 |
|
2023-10-17 18:31:03,131 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:31:04,738 epoch 7 - iter 14/146 - loss 0.02967247 - time (sec): 1.61 - samples/sec: 3051.52 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 18:31:06,216 epoch 7 - iter 28/146 - loss 0.03651477 - time (sec): 3.08 - samples/sec: 2906.74 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 18:31:07,491 epoch 7 - iter 42/146 - loss 0.03086782 - time (sec): 4.36 - samples/sec: 2997.40 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:31:09,037 epoch 7 - iter 56/146 - loss 0.02920746 - time (sec): 5.91 - samples/sec: 3029.61 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:31:10,394 epoch 7 - iter 70/146 - loss 0.02817667 - time (sec): 7.26 - samples/sec: 3058.22 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:31:12,016 epoch 7 - iter 84/146 - loss 0.03114981 - time (sec): 8.88 - samples/sec: 3020.00 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:31:13,347 epoch 7 - iter 98/146 - loss 0.03218262 - time (sec): 10.21 - samples/sec: 2969.90 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:31:14,705 epoch 7 - iter 112/146 - loss 0.03505462 - time (sec): 11.57 - samples/sec: 2968.48 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:31:16,034 epoch 7 - iter 126/146 - loss 0.03492020 - time (sec): 12.90 - samples/sec: 2968.80 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:31:17,535 epoch 7 - iter 140/146 - loss 0.03310629 - time (sec): 14.40 - samples/sec: 2978.57 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:31:18,069 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:31:18,069 EPOCH 7 done: loss 0.0330 - lr: 0.000010 |
|
2023-10-17 18:31:19,369 DEV : loss 0.12761062383651733 - f1-score (micro avg) 0.7659 |
|
2023-10-17 18:31:19,375 saving best model |
|
2023-10-17 18:31:19,894 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:31:21,355 epoch 8 - iter 14/146 - loss 0.03349981 - time (sec): 1.46 - samples/sec: 3024.57 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:31:22,790 epoch 8 - iter 28/146 - loss 0.02887435 - time (sec): 2.89 - samples/sec: 2974.56 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:31:24,403 epoch 8 - iter 42/146 - loss 0.02995794 - time (sec): 4.51 - samples/sec: 3121.61 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:31:26,018 epoch 8 - iter 56/146 - loss 0.02678266 - time (sec): 6.12 - samples/sec: 3077.72 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:31:27,347 epoch 8 - iter 70/146 - loss 0.02314035 - time (sec): 7.45 - samples/sec: 3105.83 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:31:28,770 epoch 8 - iter 84/146 - loss 0.02186221 - time (sec): 8.87 - samples/sec: 3062.90 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:31:30,290 epoch 8 - iter 98/146 - loss 0.02199109 - time (sec): 10.39 - samples/sec: 3008.44 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:31:31,605 epoch 8 - iter 112/146 - loss 0.02235855 - time (sec): 11.71 - samples/sec: 2972.83 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:31:32,957 epoch 8 - iter 126/146 - loss 0.02176303 - time (sec): 13.06 - samples/sec: 2974.26 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 18:31:34,601 epoch 8 - iter 140/146 - loss 0.02565869 - time (sec): 14.71 - samples/sec: 2940.76 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 18:31:35,066 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:31:35,066 EPOCH 8 done: loss 0.0257 - lr: 0.000007 |
|
2023-10-17 18:31:36,345 DEV : loss 0.12473238259553909 - f1-score (micro avg) 0.7709 |
|
2023-10-17 18:31:36,352 saving best model |
|
2023-10-17 18:31:36,901 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:31:38,456 epoch 9 - iter 14/146 - loss 0.02290499 - time (sec): 1.55 - samples/sec: 2919.97 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:31:40,125 epoch 9 - iter 28/146 - loss 0.02113077 - time (sec): 3.22 - samples/sec: 2944.98 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:31:41,743 epoch 9 - iter 42/146 - loss 0.02181818 - time (sec): 4.84 - samples/sec: 2872.92 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:31:43,033 epoch 9 - iter 56/146 - loss 0.02005836 - time (sec): 6.13 - samples/sec: 2904.48 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:31:44,850 epoch 9 - iter 70/146 - loss 0.01847436 - time (sec): 7.95 - samples/sec: 2820.34 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:31:46,338 epoch 9 - iter 84/146 - loss 0.01856837 - time (sec): 9.43 - samples/sec: 2815.95 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:31:47,749 epoch 9 - iter 98/146 - loss 0.02083486 - time (sec): 10.85 - samples/sec: 2816.24 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:31:48,997 epoch 9 - iter 112/146 - loss 0.02077486 - time (sec): 12.09 - samples/sec: 2798.72 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:31:50,885 epoch 9 - iter 126/146 - loss 0.01948000 - time (sec): 13.98 - samples/sec: 2749.61 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:31:52,425 epoch 9 - iter 140/146 - loss 0.01950185 - time (sec): 15.52 - samples/sec: 2758.56 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:31:53,021 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:31:53,022 EPOCH 9 done: loss 0.0196 - lr: 0.000004 |
|
2023-10-17 18:31:54,356 DEV : loss 0.12521956861019135 - f1-score (micro avg) 0.7835 |
|
2023-10-17 18:31:54,362 saving best model |
|
2023-10-17 18:31:54,886 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:31:56,635 epoch 10 - iter 14/146 - loss 0.01159532 - time (sec): 1.74 - samples/sec: 2794.62 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:31:58,076 epoch 10 - iter 28/146 - loss 0.01749210 - time (sec): 3.18 - samples/sec: 2809.62 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:31:59,503 epoch 10 - iter 42/146 - loss 0.02306435 - time (sec): 4.61 - samples/sec: 2729.46 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:32:01,069 epoch 10 - iter 56/146 - loss 0.02221024 - time (sec): 6.17 - samples/sec: 2772.54 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:32:02,694 epoch 10 - iter 70/146 - loss 0.02024334 - time (sec): 7.80 - samples/sec: 2787.96 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:32:04,062 epoch 10 - iter 84/146 - loss 0.01765259 - time (sec): 9.17 - samples/sec: 2861.79 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:32:05,575 epoch 10 - iter 98/146 - loss 0.01690867 - time (sec): 10.68 - samples/sec: 2848.31 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 18:32:06,896 epoch 10 - iter 112/146 - loss 0.01579067 - time (sec): 12.00 - samples/sec: 2862.39 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 18:32:08,298 epoch 10 - iter 126/146 - loss 0.01607607 - time (sec): 13.40 - samples/sec: 2867.29 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 18:32:09,749 epoch 10 - iter 140/146 - loss 0.01649865 - time (sec): 14.85 - samples/sec: 2872.24 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 18:32:10,433 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:32:10,434 EPOCH 10 done: loss 0.0164 - lr: 0.000000 |
|
2023-10-17 18:32:11,735 DEV : loss 0.12415261566638947 - f1-score (micro avg) 0.7756 |
|
2023-10-17 18:32:12,098 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:32:12,100 Loading model from best epoch ... |
|
2023-10-17 18:32:13,828 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-17 18:32:16,785 |
|
Results: |
|
- F-score (micro) 0.7545 |
|
- F-score (macro) 0.665 |
|
- Accuracy 0.6287 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|


         PER    0.8127    0.8477    0.8298       348 |
|
         LOC    0.6505    0.8199    0.7254       261 |
|
         ORG    0.4444    0.3846    0.4124        52 |
|
   HumanProd    0.6000    0.8182    0.6923        22 |
|


   micro avg    0.7132    0.8009    0.7545       683 |
|
   macro avg    0.6269    0.7176    0.6650       683 |
|
weighted avg    0.7158    0.8009    0.7537       683 |
|
|
|
2023-10-17 18:32:16,786 ---------------------------------------------------------------------------------------------------- |
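Note on the averaged rows of the "By class" report above: they can be re-derived from the per-class rows. The sketch below is only a consistency check on the rounded numbers printed in this log, not the evaluation code that produced them. The variable names are hypothetical, and small differences in the last digit are expected because the inputs are already rounded to four decimals.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
report = {
    "PER": (0.8127, 0.8477, 0.8298, 348),
    "LOC": (0.6505, 0.8199, 0.7254, 261),
    "ORG": (0.4444, 0.3846, 0.4124, 52),
    "HumanProd": (0.6000, 0.8182, 0.6923, 22),
}

f1_scores = [row[2] for row in report.values()]
supports = [row[3] for row in report.values()]
total_support = sum(supports)

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1_scores) / len(f1_scores)

# Weighted average: per-class F1 weighted by the number of gold entities.
weighted_f1 = sum(f * s for f, s in zip(f1_scores, supports)) / total_support

# Micro average: harmonic mean of the pooled precision and recall
# (taken from the "micro avg" row of the report).
micro_p, micro_r = 0.7132, 0.8009
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"support total: {total_support}")
print(f"macro F1:      {macro_f1:.4f}")
print(f"weighted F1:   {weighted_f1:.4f}")
print(f"micro F1:      {micro_f1:.4f}")
```

The gap between micro F1 and macro F1 comes mostly from ORG, which has the lowest F1 and only 52 of the 683 gold entities, so it pulls down the unweighted mean far more than the pooled score.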
|
|