2023-10-17 18:05:02,992 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,993 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:05:02,993 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,993 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Train: 1166 sentences
2023-10-17 18:05:02,994 (train_with_dev=False, train_with_test=False)
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Training Params:
2023-10-17 18:05:02,994 - learning_rate: "5e-05"
2023-10-17 18:05:02,994 - mini_batch_size: "8"
2023-10-17 18:05:02,994 - max_epochs: "10"
2023-10-17 18:05:02,994 - shuffle: "True"
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Plugins:
2023-10-17 18:05:02,994 - TensorboardLogger
2023-10-17 18:05:02,994 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:05:02,994 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Computation:
2023-10-17 18:05:02,994 - compute on device: cuda:0
2023-10-17 18:05:02,994 - embedding storage: none
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:05:04,216 epoch 1 - iter 14/146 - loss 3.47286437 - time (sec): 1.22 - samples/sec: 3020.40 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:05:05,861 epoch 1 - iter 28/146 - loss 2.95733369 - time (sec): 2.87 - samples/sec: 3024.74 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:05:07,185 epoch 1 - iter 42/146 - loss 2.39187734 - time (sec): 4.19 - samples/sec: 3083.46 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:05:08,974 epoch 1 - iter 56/146 - loss 1.95449677 - time (sec): 5.98 - samples/sec: 2967.80 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:05:10,441 epoch 1 - iter 70/146 - loss 1.66046494 - time (sec): 7.45 - samples/sec: 2956.40 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:05:12,085 epoch 1 - iter 84/146 - loss 1.44854117 - time (sec): 9.09 - samples/sec: 2915.08 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:05:13,346 epoch 1 - iter 98/146 - loss 1.29761420 - time (sec): 10.35 - samples/sec: 2963.66 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:05:14,662 epoch 1 - iter 112/146 - loss 1.20737029 - time (sec): 11.67 - samples/sec: 2956.59 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:05:16,112 epoch 1 - iter 126/146 - loss 1.10631188 - time (sec): 13.12 - samples/sec: 2951.03 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:05:17,340 epoch 1 - iter 140/146 - loss 1.03133640 - time (sec): 14.34 - samples/sec: 2977.96 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:05:17,983 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:17,983 EPOCH 1 done: loss 1.0020 - lr: 0.000048
2023-10-17 18:05:19,171 DEV : loss 0.19741927087306976 - f1-score (micro avg) 0.4553
2023-10-17 18:05:19,177 saving best model
2023-10-17 18:05:19,608 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:21,033 epoch 2 - iter 14/146 - loss 0.25789668 - time (sec): 1.42 - samples/sec: 3035.78 - lr: 0.000050 - momentum: 0.000000
2023-10-17 18:05:22,205 epoch 2 - iter 28/146 - loss 0.23029793 - time (sec): 2.59 - samples/sec: 2944.73 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:05:23,767 epoch 2 - iter 42/146 - loss 0.21329382 - time (sec): 4.16 - samples/sec: 3004.58 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:05:25,642 epoch 2 - iter 56/146 - loss 0.21601636 - time (sec): 6.03 - samples/sec: 2908.44 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:05:26,915 epoch 2 - iter 70/146 - loss 0.21063728 - time (sec): 7.30 - samples/sec: 2912.94 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:05:28,452 epoch 2 - iter 84/146 - loss 0.20436532 - time (sec): 8.84 - samples/sec: 2886.36 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:05:30,128 epoch 2 - iter 98/146 - loss 0.19733768 - time (sec): 10.52 - samples/sec: 2898.31 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:05:31,546 epoch 2 - iter 112/146 - loss 0.19273671 - time (sec): 11.94 - samples/sec: 2908.88 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:05:33,102 epoch 2 - iter 126/146 - loss 0.19300856 - time (sec): 13.49 - samples/sec: 2911.12 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:05:34,409 epoch 2 - iter 140/146 - loss 0.19113966 - time (sec): 14.80 - samples/sec: 2894.14 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:05:34,939 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:34,939 EPOCH 2 done: loss 0.1881 - lr: 0.000045
2023-10-17 18:05:36,222 DEV : loss 0.13468769192695618 - f1-score (micro avg) 0.6377
2023-10-17 18:05:36,228 saving best model
2023-10-17 18:05:36,706 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:38,233 epoch 3 - iter 14/146 - loss 0.11423988 - time (sec): 1.52 - samples/sec: 3080.79 - lr: 0.000044 - momentum: 0.000000
2023-10-17 18:05:39,736 epoch 3 - iter 28/146 - loss 0.11646607 - time (sec): 3.02 - samples/sec: 3041.37 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:05:41,158 epoch 3 - iter 42/146 - loss 0.12663839 - time (sec): 4.45 - samples/sec: 2980.57 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:05:42,666 epoch 3 - iter 56/146 - loss 0.11742262 - time (sec): 5.96 - samples/sec: 2917.25 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:05:44,139 epoch 3 - iter 70/146 - loss 0.11414842 - time (sec): 7.43 - samples/sec: 2933.62 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:05:45,700 epoch 3 - iter 84/146 - loss 0.11775656 - time (sec): 8.99 - samples/sec: 2925.79 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:05:46,857 epoch 3 - iter 98/146 - loss 0.11622171 - time (sec): 10.15 - samples/sec: 2956.79 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:05:48,433 epoch 3 - iter 112/146 - loss 0.10943556 - time (sec): 11.72 - samples/sec: 2963.12 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:05:49,959 epoch 3 - iter 126/146 - loss 0.10714731 - time (sec): 13.25 - samples/sec: 2946.48 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:05:51,484 epoch 3 - iter 140/146 - loss 0.10632914 - time (sec): 14.77 - samples/sec: 2906.52 - lr: 0.000039 - momentum: 0.000000
2023-10-17 18:05:52,028 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:52,028 EPOCH 3 done: loss 0.1094 - lr: 0.000039
2023-10-17 18:05:53,295 DEV : loss 0.11463181674480438 - f1-score (micro avg) 0.7632
2023-10-17 18:05:53,302 saving best model
2023-10-17 18:05:53,742 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:55,188 epoch 4 - iter 14/146 - loss 0.06282635 - time (sec): 1.44 - samples/sec: 3137.79 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:05:56,757 epoch 4 - iter 28/146 - loss 0.05920855 - time (sec): 3.01 - samples/sec: 3022.23 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:05:57,989 epoch 4 - iter 42/146 - loss 0.06414759 - time (sec): 4.25 - samples/sec: 3033.61 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:05:59,649 epoch 4 - iter 56/146 - loss 0.06138867 - time (sec): 5.91 - samples/sec: 2989.23 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:06:01,189 epoch 4 - iter 70/146 - loss 0.06194662 - time (sec): 7.45 - samples/sec: 2986.56 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:06:02,572 epoch 4 - iter 84/146 - loss 0.06618929 - time (sec): 8.83 - samples/sec: 3000.74 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:06:04,127 epoch 4 - iter 98/146 - loss 0.06718422 - time (sec): 10.38 - samples/sec: 2940.97 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:06:05,598 epoch 4 - iter 112/146 - loss 0.07020686 - time (sec): 11.85 - samples/sec: 2949.15 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:06:07,060 epoch 4 - iter 126/146 - loss 0.07201581 - time (sec): 13.32 - samples/sec: 2939.98 - lr: 0.000034 - momentum: 0.000000
2023-10-17 18:06:08,314 epoch 4 - iter 140/146 - loss 0.07102769 - time (sec): 14.57 - samples/sec: 2918.29 - lr: 0.000034 - momentum: 0.000000
2023-10-17 18:06:08,903 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:08,903 EPOCH 4 done: loss 0.0717 - lr: 0.000034
2023-10-17 18:06:10,317 DEV : loss 0.10693204402923584 - f1-score (micro avg) 0.7439
2023-10-17 18:06:10,322 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:11,781 epoch 5 - iter 14/146 - loss 0.03715351 - time (sec): 1.46 - samples/sec: 2881.34 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:06:13,056 epoch 5 - iter 28/146 - loss 0.04279936 - time (sec): 2.73 - samples/sec: 3043.99 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:06:14,564 epoch 5 - iter 42/146 - loss 0.04029772 - time (sec): 4.24 - samples/sec: 3112.72 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:06:16,028 epoch 5 - iter 56/146 - loss 0.04254317 - time (sec): 5.70 - samples/sec: 3035.67 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:06:17,575 epoch 5 - iter 70/146 - loss 0.05147397 - time (sec): 7.25 - samples/sec: 2907.16 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:06:19,092 epoch 5 - iter 84/146 - loss 0.04970674 - time (sec): 8.77 - samples/sec: 2943.52 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:06:20,386 epoch 5 - iter 98/146 - loss 0.04728730 - time (sec): 10.06 - samples/sec: 2947.61 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:06:21,887 epoch 5 - iter 112/146 - loss 0.04576205 - time (sec): 11.56 - samples/sec: 2916.40 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:06:23,372 epoch 5 - iter 126/146 - loss 0.04431415 - time (sec): 13.05 - samples/sec: 2949.96 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:06:24,968 epoch 5 - iter 140/146 - loss 0.04350621 - time (sec): 14.65 - samples/sec: 2939.01 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:06:25,457 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:25,457 EPOCH 5 done: loss 0.0430 - lr: 0.000028
2023-10-17 18:06:26,744 DEV : loss 0.1116286963224411 - f1-score (micro avg) 0.7702
2023-10-17 18:06:26,749 saving best model
2023-10-17 18:06:27,203 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:28,618 epoch 6 - iter 14/146 - loss 0.04553866 - time (sec): 1.41 - samples/sec: 2913.20 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:06:30,185 epoch 6 - iter 28/146 - loss 0.03490225 - time (sec): 2.98 - samples/sec: 2979.56 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:06:31,516 epoch 6 - iter 42/146 - loss 0.04149229 - time (sec): 4.31 - samples/sec: 2877.42 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:06:32,937 epoch 6 - iter 56/146 - loss 0.03838322 - time (sec): 5.73 - samples/sec: 2794.34 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:06:34,459 epoch 6 - iter 70/146 - loss 0.03566342 - time (sec): 7.25 - samples/sec: 2817.37 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:06:36,022 epoch 6 - iter 84/146 - loss 0.03699880 - time (sec): 8.82 - samples/sec: 2887.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:06:37,123 epoch 6 - iter 98/146 - loss 0.03665218 - time (sec): 9.92 - samples/sec: 2914.67 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:06:38,530 epoch 6 - iter 112/146 - loss 0.03524922 - time (sec): 11.32 - samples/sec: 2914.45 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:06:39,961 epoch 6 - iter 126/146 - loss 0.03441689 - time (sec): 12.76 - samples/sec: 2944.67 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:06:41,501 epoch 6 - iter 140/146 - loss 0.03221802 - time (sec): 14.30 - samples/sec: 2971.92 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:06:42,233 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:42,234 EPOCH 6 done: loss 0.0318 - lr: 0.000023
2023-10-17 18:06:43,498 DEV : loss 0.12657296657562256 - f1-score (micro avg) 0.7738
2023-10-17 18:06:43,503 saving best model
2023-10-17 18:06:43,948 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:45,372 epoch 7 - iter 14/146 - loss 0.02524628 - time (sec): 1.42 - samples/sec: 2856.66 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:06:46,593 epoch 7 - iter 28/146 - loss 0.02680280 - time (sec): 2.64 - samples/sec: 2870.59 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:06:47,949 epoch 7 - iter 42/146 - loss 0.02213969 - time (sec): 4.00 - samples/sec: 2918.32 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:06:49,309 epoch 7 - iter 56/146 - loss 0.02139683 - time (sec): 5.36 - samples/sec: 2981.76 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:06:50,826 epoch 7 - iter 70/146 - loss 0.02696387 - time (sec): 6.87 - samples/sec: 3016.39 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:06:52,353 epoch 7 - iter 84/146 - loss 0.02742539 - time (sec): 8.40 - samples/sec: 2924.97 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:06:53,828 epoch 7 - iter 98/146 - loss 0.02526137 - time (sec): 9.88 - samples/sec: 2923.43 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:06:55,369 epoch 7 - iter 112/146 - loss 0.02426403 - time (sec): 11.42 - samples/sec: 2891.27 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:06:57,093 epoch 7 - iter 126/146 - loss 0.02384645 - time (sec): 13.14 - samples/sec: 2859.96 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:06:58,546 epoch 7 - iter 140/146 - loss 0.02357317 - time (sec): 14.60 - samples/sec: 2899.92 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:06:59,258 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:59,258 EPOCH 7 done: loss 0.0229 - lr: 0.000017
2023-10-17 18:07:00,727 DEV : loss 0.14095118641853333 - f1-score (micro avg) 0.7478
2023-10-17 18:07:00,731 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:02,124 epoch 8 - iter 14/146 - loss 0.01576638 - time (sec): 1.39 - samples/sec: 2988.30 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:07:03,698 epoch 8 - iter 28/146 - loss 0.01550301 - time (sec): 2.97 - samples/sec: 2868.29 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:07:05,126 epoch 8 - iter 42/146 - loss 0.01623872 - time (sec): 4.39 - samples/sec: 2890.47 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:07:06,652 epoch 8 - iter 56/146 - loss 0.02023556 - time (sec): 5.92 - samples/sec: 2938.03 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:07:08,136 epoch 8 - iter 70/146 - loss 0.02022101 - time (sec): 7.40 - samples/sec: 2935.66 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:07:09,637 epoch 8 - iter 84/146 - loss 0.01985932 - time (sec): 8.90 - samples/sec: 2943.32 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:07:11,362 epoch 8 - iter 98/146 - loss 0.01851297 - time (sec): 10.63 - samples/sec: 2911.68 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:07:12,649 epoch 8 - iter 112/146 - loss 0.01853413 - time (sec): 11.92 - samples/sec: 2887.11 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:07:14,247 epoch 8 - iter 126/146 - loss 0.01856705 - time (sec): 13.52 - samples/sec: 2907.94 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:07:15,590 epoch 8 - iter 140/146 - loss 0.01763127 - time (sec): 14.86 - samples/sec: 2898.42 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:07:16,079 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:16,079 EPOCH 8 done: loss 0.0176 - lr: 0.000012
2023-10-17 18:07:17,329 DEV : loss 0.1616245061159134 - f1-score (micro avg) 0.7654
2023-10-17 18:07:17,334 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:18,671 epoch 9 - iter 14/146 - loss 0.01183606 - time (sec): 1.34 - samples/sec: 2853.15 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:07:20,453 epoch 9 - iter 28/146 - loss 0.01570372 - time (sec): 3.12 - samples/sec: 2805.11 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:07:22,240 epoch 9 - iter 42/146 - loss 0.01929006 - time (sec): 4.90 - samples/sec: 2875.05 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:07:23,638 epoch 9 - iter 56/146 - loss 0.01748759 - time (sec): 6.30 - samples/sec: 2857.79 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:07:24,850 epoch 9 - iter 70/146 - loss 0.01629176 - time (sec): 7.52 - samples/sec: 2892.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:07:26,312 epoch 9 - iter 84/146 - loss 0.01521756 - time (sec): 8.98 - samples/sec: 2916.25 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:07:27,669 epoch 9 - iter 98/146 - loss 0.01433705 - time (sec): 10.33 - samples/sec: 2895.91 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:07:29,090 epoch 9 - iter 112/146 - loss 0.01425475 - time (sec): 11.76 - samples/sec: 2951.20 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:07:30,475 epoch 9 - iter 126/146 - loss 0.01361620 - time (sec): 13.14 - samples/sec: 2915.46 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:07:31,749 epoch 9 - iter 140/146 - loss 0.01297694 - time (sec): 14.41 - samples/sec: 2921.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:07:32,406 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:32,406 EPOCH 9 done: loss 0.0124 - lr: 0.000006
2023-10-17 18:07:33,632 DEV : loss 0.1573966145515442 - f1-score (micro avg) 0.773
2023-10-17 18:07:33,637 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:35,439 epoch 10 - iter 14/146 - loss 0.01626036 - time (sec): 1.80 - samples/sec: 2789.02 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:07:36,974 epoch 10 - iter 28/146 - loss 0.01146527 - time (sec): 3.34 - samples/sec: 2861.72 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:07:38,303 epoch 10 - iter 42/146 - loss 0.01070628 - time (sec): 4.67 - samples/sec: 2915.67 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:07:39,553 epoch 10 - iter 56/146 - loss 0.00897110 - time (sec): 5.91 - samples/sec: 2970.08 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:07:40,942 epoch 10 - iter 70/146 - loss 0.00880247 - time (sec): 7.30 - samples/sec: 2932.76 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:07:42,486 epoch 10 - iter 84/146 - loss 0.00928154 - time (sec): 8.85 - samples/sec: 2866.28 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:07:43,982 epoch 10 - iter 98/146 - loss 0.00880370 - time (sec): 10.34 - samples/sec: 2874.41 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:07:45,598 epoch 10 - iter 112/146 - loss 0.00874673 - time (sec): 11.96 - samples/sec: 2885.33 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:07:46,803 epoch 10 - iter 126/146 - loss 0.00921825 - time (sec): 13.16 - samples/sec: 2915.35 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:07:48,198 epoch 10 - iter 140/146 - loss 0.00968192 - time (sec): 14.56 - samples/sec: 2915.90 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:07:49,005 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:49,005 EPOCH 10 done: loss 0.0093 - lr: 0.000000
2023-10-17 18:07:50,352 DEV : loss 0.1613503247499466 - f1-score (micro avg) 0.7775
2023-10-17 18:07:50,357 saving best model
2023-10-17 18:07:51,156 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:51,157 Loading model from best epoch ...
2023-10-17 18:07:52,769 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 18:07:55,187 Results:
- F-score (micro) 0.7562
- F-score (macro) 0.6535
- Accuracy 0.6287

By class:
              precision    recall  f1-score   support

         PER     0.8229    0.8678    0.8448       348
         LOC     0.6350    0.8199    0.7157       261
         ORG     0.4400    0.4231    0.4314        52
   HumanProd     0.6087    0.6364    0.6222        22

   micro avg     0.7104    0.8082    0.7562       683
   macro avg     0.6266    0.6868    0.6535       683
weighted avg     0.7150    0.8082    0.7568       683

2023-10-17 18:07:55,187 ----------------------------------------------------------------------------------------------------