|
2023-10-17 17:47:58,832 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:47:58,833 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 17:47:58,833 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:47:58,833 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-17 17:47:58,833 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:47:58,833 Train: 1166 sentences |
|
2023-10-17 17:47:58,833 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 17:47:58,833 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:47:58,833 Training Params: |
|
2023-10-17 17:47:58,833 - learning_rate: "3e-05" |
|
2023-10-17 17:47:58,833 - mini_batch_size: "8" |
|
2023-10-17 17:47:58,833 - max_epochs: "10" |
|
2023-10-17 17:47:58,833 - shuffle: "True" |
|
2023-10-17 17:47:58,834 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:47:58,834 Plugins: |
|
2023-10-17 17:47:58,834 - TensorboardLogger |
|
2023-10-17 17:47:58,834 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 17:47:58,834 ---------------------------------------------------------------------------------------------------- |
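The `lr` column in the iteration lines below is consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The following is a minimal sketch of that arithmetic, not Flair's own scheduler code: the function names `linear_schedule_lr` and `lr_at` are hypothetical, and the exact step indexing is an assumption. The constants (1166 training sentences, batch size 8, 10 epochs, peak rate 3e-05, warmup fraction 0.1) are taken from the log above.

```python
import math


def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Linear warmup to peak_lr, then linear decay to zero (illustrative only)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values read from the log: 1166 sentences, mini_batch_size 8, max_epochs 10.
iters_per_epoch = math.ceil(1166 / 8)    # 146, matching "iter .../146"
total_steps = iters_per_epoch * 10       # 1460 optimizer steps overall


def lr_at(epoch, iteration):
    """Learning rate at a given (1-based) epoch and iteration."""
    step = (epoch - 1) * iters_per_epoch + iteration
    return linear_schedule_lr(step, total_steps, 3e-05, 0.1)


for epoch, iteration in [(1, 14), (1, 140), (2, 14), (5, 14), (10, 140)]:
    print(f"epoch {epoch} - iter {iteration}/146 - lr: {lr_at(epoch, iteration):.6f}")
```

Under these assumptions the warmup ends exactly at the end of epoch 1, which is why the logged rate peaks at 0.000030 at the start of epoch 2 and falls to 0.000000 by the end of epoch 10.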
|
2023-10-17 17:47:58,834 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 17:47:58,834 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 17:47:58,834 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:47:58,834 Computation: |
|
2023-10-17 17:47:58,834 - compute on device: cuda:0 |
|
2023-10-17 17:47:58,834 - embedding storage: none |
|
2023-10-17 17:47:58,834 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:47:58,834 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-17 17:47:58,834 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:47:58,834 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:47:58,834 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 17:48:00,326 epoch 1 - iter 14/146 - loss 3.70831643 - time (sec): 1.49 - samples/sec: 2870.09 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 17:48:01,701 epoch 1 - iter 28/146 - loss 3.48850072 - time (sec): 2.87 - samples/sec: 3072.96 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 17:48:03,431 epoch 1 - iter 42/146 - loss 3.02057847 - time (sec): 4.60 - samples/sec: 2846.31 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 17:48:04,717 epoch 1 - iter 56/146 - loss 2.50123340 - time (sec): 5.88 - samples/sec: 2868.63 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 17:48:05,892 epoch 1 - iter 70/146 - loss 2.20026054 - time (sec): 7.06 - samples/sec: 2894.66 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 17:48:07,176 epoch 1 - iter 84/146 - loss 1.93860951 - time (sec): 8.34 - samples/sec: 2910.26 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 17:48:08,604 epoch 1 - iter 98/146 - loss 1.71290433 - time (sec): 9.77 - samples/sec: 2947.37 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 17:48:09,829 epoch 1 - iter 112/146 - loss 1.56730272 - time (sec): 10.99 - samples/sec: 2961.94 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 17:48:11,184 epoch 1 - iter 126/146 - loss 1.43497338 - time (sec): 12.35 - samples/sec: 2966.86 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 17:48:13,063 epoch 1 - iter 140/146 - loss 1.30380866 - time (sec): 14.23 - samples/sec: 2968.73 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 17:48:13,871 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:48:13,871 EPOCH 1 done: loss 1.2702 - lr: 0.000029 |
|
2023-10-17 17:48:14,887 DEV : loss 0.20914216339588165 - f1-score (micro avg) 0.4434 |
|
2023-10-17 17:48:14,891 saving best model |
|
2023-10-17 17:48:15,219 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:48:16,888 epoch 2 - iter 14/146 - loss 0.34822885 - time (sec): 1.67 - samples/sec: 2862.44 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 17:48:18,130 epoch 2 - iter 28/146 - loss 0.31488826 - time (sec): 2.91 - samples/sec: 2912.01 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 17:48:19,353 epoch 2 - iter 42/146 - loss 0.28349960 - time (sec): 4.13 - samples/sec: 2950.95 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 17:48:20,504 epoch 2 - iter 56/146 - loss 0.27893653 - time (sec): 5.28 - samples/sec: 2993.09 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 17:48:22,128 epoch 2 - iter 70/146 - loss 0.28179222 - time (sec): 6.91 - samples/sec: 2975.41 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 17:48:23,714 epoch 2 - iter 84/146 - loss 0.26094264 - time (sec): 8.49 - samples/sec: 2918.48 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 17:48:25,054 epoch 2 - iter 98/146 - loss 0.24468379 - time (sec): 9.83 - samples/sec: 2907.34 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 17:48:26,279 epoch 2 - iter 112/146 - loss 0.23942792 - time (sec): 11.06 - samples/sec: 2929.29 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 17:48:27,639 epoch 2 - iter 126/146 - loss 0.22920784 - time (sec): 12.42 - samples/sec: 2958.30 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 17:48:29,567 epoch 2 - iter 140/146 - loss 0.22136768 - time (sec): 14.35 - samples/sec: 2968.22 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 17:48:30,113 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:48:30,113 EPOCH 2 done: loss 0.2179 - lr: 0.000027 |
|
2023-10-17 17:48:31,360 DEV : loss 0.1328418254852295 - f1-score (micro avg) 0.6194 |
|
2023-10-17 17:48:31,365 saving best model |
|
2023-10-17 17:48:31,804 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:48:32,991 epoch 3 - iter 14/146 - loss 0.16652025 - time (sec): 1.19 - samples/sec: 2928.21 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 17:48:34,759 epoch 3 - iter 28/146 - loss 0.13123510 - time (sec): 2.95 - samples/sec: 2859.92 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 17:48:36,129 epoch 3 - iter 42/146 - loss 0.12284675 - time (sec): 4.32 - samples/sec: 2937.26 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 17:48:37,838 epoch 3 - iter 56/146 - loss 0.11703501 - time (sec): 6.03 - samples/sec: 2887.88 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 17:48:39,065 epoch 3 - iter 70/146 - loss 0.12806525 - time (sec): 7.26 - samples/sec: 2910.71 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 17:48:40,354 epoch 3 - iter 84/146 - loss 0.12466717 - time (sec): 8.55 - samples/sec: 2959.90 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 17:48:41,475 epoch 3 - iter 98/146 - loss 0.12411716 - time (sec): 9.67 - samples/sec: 2953.35 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 17:48:43,083 epoch 3 - iter 112/146 - loss 0.12366808 - time (sec): 11.28 - samples/sec: 2980.33 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 17:48:44,797 epoch 3 - iter 126/146 - loss 0.12260040 - time (sec): 12.99 - samples/sec: 2924.89 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 17:48:46,433 epoch 3 - iter 140/146 - loss 0.12422938 - time (sec): 14.63 - samples/sec: 2936.40 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 17:48:46,872 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:48:46,872 EPOCH 3 done: loss 0.1216 - lr: 0.000024 |
|
2023-10-17 17:48:48,107 DEV : loss 0.09678074717521667 - f1-score (micro avg) 0.7391 |
|
2023-10-17 17:48:48,113 saving best model |
|
2023-10-17 17:48:48,546 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:48:50,020 epoch 4 - iter 14/146 - loss 0.07210812 - time (sec): 1.47 - samples/sec: 3219.07 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 17:48:51,407 epoch 4 - iter 28/146 - loss 0.08676056 - time (sec): 2.86 - samples/sec: 3182.49 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 17:48:52,963 epoch 4 - iter 42/146 - loss 0.09667761 - time (sec): 4.42 - samples/sec: 3032.38 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 17:48:54,228 epoch 4 - iter 56/146 - loss 0.09045365 - time (sec): 5.68 - samples/sec: 2967.39 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 17:48:55,704 epoch 4 - iter 70/146 - loss 0.08679327 - time (sec): 7.16 - samples/sec: 2956.52 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 17:48:57,298 epoch 4 - iter 84/146 - loss 0.08850398 - time (sec): 8.75 - samples/sec: 2964.11 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 17:48:58,514 epoch 4 - iter 98/146 - loss 0.08454841 - time (sec): 9.97 - samples/sec: 2943.86 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 17:48:59,963 epoch 4 - iter 112/146 - loss 0.08390306 - time (sec): 11.42 - samples/sec: 2920.25 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 17:49:01,517 epoch 4 - iter 126/146 - loss 0.08184749 - time (sec): 12.97 - samples/sec: 2928.00 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 17:49:03,073 epoch 4 - iter 140/146 - loss 0.07829550 - time (sec): 14.53 - samples/sec: 2931.91 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 17:49:03,702 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:49:03,703 EPOCH 4 done: loss 0.0787 - lr: 0.000020 |
|
2023-10-17 17:49:04,965 DEV : loss 0.10548200458288193 - f1-score (micro avg) 0.7217 |
|
2023-10-17 17:49:04,970 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:49:06,352 epoch 5 - iter 14/146 - loss 0.07117211 - time (sec): 1.38 - samples/sec: 2781.96 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 17:49:07,894 epoch 5 - iter 28/146 - loss 0.06763706 - time (sec): 2.92 - samples/sec: 2897.40 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 17:49:09,304 epoch 5 - iter 42/146 - loss 0.05999953 - time (sec): 4.33 - samples/sec: 3006.81 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 17:49:11,007 epoch 5 - iter 56/146 - loss 0.06257175 - time (sec): 6.03 - samples/sec: 2940.26 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 17:49:12,341 epoch 5 - iter 70/146 - loss 0.05800479 - time (sec): 7.37 - samples/sec: 2936.86 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 17:49:13,615 epoch 5 - iter 84/146 - loss 0.05788663 - time (sec): 8.64 - samples/sec: 2906.02 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 17:49:15,184 epoch 5 - iter 98/146 - loss 0.05396484 - time (sec): 10.21 - samples/sec: 2930.36 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 17:49:16,441 epoch 5 - iter 112/146 - loss 0.05252850 - time (sec): 11.47 - samples/sec: 2935.02 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 17:49:18,320 epoch 5 - iter 126/146 - loss 0.05460215 - time (sec): 13.35 - samples/sec: 2899.42 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 17:49:19,809 epoch 5 - iter 140/146 - loss 0.05860187 - time (sec): 14.84 - samples/sec: 2878.39 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 17:49:20,338 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:49:20,338 EPOCH 5 done: loss 0.0577 - lr: 0.000017 |
|
2023-10-17 17:49:21,645 DEV : loss 0.11666107177734375 - f1-score (micro avg) 0.7357 |
|
2023-10-17 17:49:21,653 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:49:23,019 epoch 6 - iter 14/146 - loss 0.03065155 - time (sec): 1.36 - samples/sec: 2882.79 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 17:49:24,773 epoch 6 - iter 28/146 - loss 0.03493187 - time (sec): 3.12 - samples/sec: 2845.14 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 17:49:26,054 epoch 6 - iter 42/146 - loss 0.03689635 - time (sec): 4.40 - samples/sec: 2766.74 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 17:49:27,639 epoch 6 - iter 56/146 - loss 0.03772627 - time (sec): 5.98 - samples/sec: 2815.18 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 17:49:29,447 epoch 6 - iter 70/146 - loss 0.04035301 - time (sec): 7.79 - samples/sec: 2768.42 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 17:49:30,920 epoch 6 - iter 84/146 - loss 0.04015679 - time (sec): 9.27 - samples/sec: 2812.19 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 17:49:32,425 epoch 6 - iter 98/146 - loss 0.04222793 - time (sec): 10.77 - samples/sec: 2813.85 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 17:49:33,802 epoch 6 - iter 112/146 - loss 0.04043294 - time (sec): 12.15 - samples/sec: 2809.62 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 17:49:35,285 epoch 6 - iter 126/146 - loss 0.04103862 - time (sec): 13.63 - samples/sec: 2815.75 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 17:49:36,664 epoch 6 - iter 140/146 - loss 0.04074709 - time (sec): 15.01 - samples/sec: 2855.51 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 17:49:37,163 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:49:37,163 EPOCH 6 done: loss 0.0412 - lr: 0.000014 |
|
2023-10-17 17:49:38,675 DEV : loss 0.117428258061409 - f1-score (micro avg) 0.7669 |
|
2023-10-17 17:49:38,681 saving best model |
|
2023-10-17 17:49:39,120 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:49:40,477 epoch 7 - iter 14/146 - loss 0.01790611 - time (sec): 1.35 - samples/sec: 2742.33 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 17:49:41,979 epoch 7 - iter 28/146 - loss 0.02367119 - time (sec): 2.86 - samples/sec: 2953.13 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 17:49:43,472 epoch 7 - iter 42/146 - loss 0.03528884 - time (sec): 4.35 - samples/sec: 2978.62 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 17:49:45,039 epoch 7 - iter 56/146 - loss 0.03487856 - time (sec): 5.92 - samples/sec: 2939.16 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 17:49:46,384 epoch 7 - iter 70/146 - loss 0.03083187 - time (sec): 7.26 - samples/sec: 2964.58 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 17:49:47,960 epoch 7 - iter 84/146 - loss 0.02860470 - time (sec): 8.84 - samples/sec: 2885.77 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 17:49:49,345 epoch 7 - iter 98/146 - loss 0.03056844 - time (sec): 10.22 - samples/sec: 2915.50 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 17:49:50,648 epoch 7 - iter 112/146 - loss 0.03330335 - time (sec): 11.53 - samples/sec: 2959.55 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 17:49:52,247 epoch 7 - iter 126/146 - loss 0.03165719 - time (sec): 13.12 - samples/sec: 2938.43 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 17:49:53,929 epoch 7 - iter 140/146 - loss 0.03178463 - time (sec): 14.81 - samples/sec: 2905.25 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 17:49:54,424 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:49:54,424 EPOCH 7 done: loss 0.0322 - lr: 0.000010 |
|
2023-10-17 17:49:55,741 DEV : loss 0.12387555837631226 - f1-score (micro avg) 0.7565 |
|
2023-10-17 17:49:55,747 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:49:57,081 epoch 8 - iter 14/146 - loss 0.03722011 - time (sec): 1.33 - samples/sec: 2811.37 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 17:49:58,664 epoch 8 - iter 28/146 - loss 0.03633970 - time (sec): 2.92 - samples/sec: 2805.10 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 17:50:00,042 epoch 8 - iter 42/146 - loss 0.03006267 - time (sec): 4.29 - samples/sec: 2771.47 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 17:50:01,404 epoch 8 - iter 56/146 - loss 0.02940184 - time (sec): 5.66 - samples/sec: 2809.56 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 17:50:02,743 epoch 8 - iter 70/146 - loss 0.02738822 - time (sec): 6.99 - samples/sec: 2896.54 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 17:50:04,234 epoch 8 - iter 84/146 - loss 0.02622802 - time (sec): 8.49 - samples/sec: 2951.33 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 17:50:05,468 epoch 8 - iter 98/146 - loss 0.02554764 - time (sec): 9.72 - samples/sec: 2961.07 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 17:50:07,005 epoch 8 - iter 112/146 - loss 0.02539222 - time (sec): 11.26 - samples/sec: 2977.63 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 17:50:08,351 epoch 8 - iter 126/146 - loss 0.02497153 - time (sec): 12.60 - samples/sec: 2975.95 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 17:50:09,996 epoch 8 - iter 140/146 - loss 0.02398278 - time (sec): 14.25 - samples/sec: 2994.84 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 17:50:10,711 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:50:10,711 EPOCH 8 done: loss 0.0239 - lr: 0.000007 |
|
2023-10-17 17:50:11,971 DEV : loss 0.12609358131885529 - f1-score (micro avg) 0.7699 |
|
2023-10-17 17:50:11,976 saving best model |
|
2023-10-17 17:50:12,414 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:50:13,900 epoch 9 - iter 14/146 - loss 0.01567117 - time (sec): 1.48 - samples/sec: 3033.06 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 17:50:15,389 epoch 9 - iter 28/146 - loss 0.01839902 - time (sec): 2.97 - samples/sec: 2909.24 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 17:50:17,035 epoch 9 - iter 42/146 - loss 0.01793049 - time (sec): 4.61 - samples/sec: 2875.68 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 17:50:18,804 epoch 9 - iter 56/146 - loss 0.02188313 - time (sec): 6.38 - samples/sec: 2798.01 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 17:50:20,408 epoch 9 - iter 70/146 - loss 0.02090357 - time (sec): 7.98 - samples/sec: 2803.78 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 17:50:21,786 epoch 9 - iter 84/146 - loss 0.01866954 - time (sec): 9.36 - samples/sec: 2816.37 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 17:50:23,480 epoch 9 - iter 98/146 - loss 0.01734549 - time (sec): 11.06 - samples/sec: 2784.86 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 17:50:24,712 epoch 9 - iter 112/146 - loss 0.01775145 - time (sec): 12.29 - samples/sec: 2801.24 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 17:50:26,122 epoch 9 - iter 126/146 - loss 0.01676461 - time (sec): 13.70 - samples/sec: 2807.04 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 17:50:27,462 epoch 9 - iter 140/146 - loss 0.01669329 - time (sec): 15.04 - samples/sec: 2848.20 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 17:50:28,138 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:50:28,138 EPOCH 9 done: loss 0.0172 - lr: 0.000004 |
|
2023-10-17 17:50:29,406 DEV : loss 0.13158701360225677 - f1-score (micro avg) 0.7722 |
|
2023-10-17 17:50:29,412 saving best model |
|
2023-10-17 17:50:29,850 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:50:31,171 epoch 10 - iter 14/146 - loss 0.01451792 - time (sec): 1.32 - samples/sec: 3077.53 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 17:50:32,784 epoch 10 - iter 28/146 - loss 0.01705737 - time (sec): 2.93 - samples/sec: 2778.86 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 17:50:34,412 epoch 10 - iter 42/146 - loss 0.01983646 - time (sec): 4.56 - samples/sec: 2737.42 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 17:50:35,780 epoch 10 - iter 56/146 - loss 0.01703870 - time (sec): 5.92 - samples/sec: 2850.19 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 17:50:37,301 epoch 10 - iter 70/146 - loss 0.01574640 - time (sec): 7.45 - samples/sec: 2902.74 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 17:50:38,843 epoch 10 - iter 84/146 - loss 0.01431043 - time (sec): 8.99 - samples/sec: 2883.94 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 17:50:40,568 epoch 10 - iter 98/146 - loss 0.01549304 - time (sec): 10.71 - samples/sec: 2825.21 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 17:50:42,069 epoch 10 - iter 112/146 - loss 0.01473409 - time (sec): 12.21 - samples/sec: 2846.18 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 17:50:43,565 epoch 10 - iter 126/146 - loss 0.01434044 - time (sec): 13.71 - samples/sec: 2843.59 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 17:50:44,833 epoch 10 - iter 140/146 - loss 0.01637173 - time (sec): 14.98 - samples/sec: 2850.68 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 17:50:45,380 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:50:45,380 EPOCH 10 done: loss 0.0159 - lr: 0.000000 |
|
2023-10-17 17:50:46,646 DEV : loss 0.13107363879680634 - f1-score (micro avg) 0.7766 |
|
2023-10-17 17:50:46,651 saving best model |
|
2023-10-17 17:50:47,431 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:50:47,432 Loading model from best epoch ... |
|
2023-10-17 17:50:48,810 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
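The 17 tags listed above match a BIOES-style scheme: one `O` tag plus four prefixes (`S`, `B`, `E`, `I`) for each of the four entity types, which also agrees with `out_features=17` on the final linear layer. A small sketch that rebuilds the list follows; the variable names are illustrative and this is not how Flair constructs its dictionary internally.

```python
# Entity types and prefix order as they appear in the logged tag dictionary.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
prefixes = ["S", "B", "E", "I"]  # single, begin, end, inside

# "O" (outside any entity) comes first, then every prefix for each type.
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in prefixes]

print(len(tags))
print(", ".join(tags))
```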
|
2023-10-17 17:50:51,565 |
|
Results: |
|
- F-score (micro) 0.7693 |
|
- F-score (macro) 0.7008 |
|
- Accuracy 0.6408 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|

         PER     0.8154    0.8506    0.8326       348 |
|
         LOC     0.6588    0.8582    0.7454       261 |
|
         ORG     0.4898    0.4615    0.4752        52 |
|
   HumanProd     0.6923    0.8182    0.7500        22 |
|
|

   micro avg     0.7224    0.8228    0.7693       683 |
|
   macro avg     0.6641    0.7471    0.7008       683 |
|
weighted avg     0.7268    0.8228    0.7694       683 |
|
|
|
2023-10-17 17:50:51,565 ---------------------------------------------------------------------------------------------------- |
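The summary rows of the report can be cross-checked from the per-class rows. The sketch below recomputes the macro and weighted F1 from the per-class F1 scores and supports, and the micro F1 from the reported micro precision and recall. All numbers are copied from the report above; the variable names are illustrative, and the rounding to four decimals mirrors the report's display precision.

```python
# Per-class rows from the report: (precision, recall, f1, support).
rows = {
    "PER": (0.8154, 0.8506, 0.8326, 348),
    "LOC": (0.6588, 0.8582, 0.7454, 261),
    "ORG": (0.4898, 0.4615, 0.4752, 52),
    "HumanProd": (0.6923, 0.8182, 0.7500, 22),
}

total_support = sum(support for _, _, _, support in rows.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)

# Weighted average: per-class F1 weighted by class support.
weighted_f1 = sum(f1 * support for _, _, f1, support in rows.values()) / total_support

# Micro F1: harmonic mean of the reported micro precision and recall.
micro_p, micro_r = 0.7224, 0.8228
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"support      {total_support}")
print(f"micro avg    {micro_f1:.4f}")
print(f"macro avg    {macro_f1:.4f}")
print(f"weighted avg {weighted_f1:.4f}")
```

The low macro average relative to the micro average comes mostly from the ORG class, which has only 52 test mentions and an F1 of 0.4752.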
|
|