2023-10-17 18:05:02,992 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,993 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:05:02,993 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,993 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
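For reference, this split can be loaded through Flair's HIPE-2022 loader. A minimal sketch follows; NER_HIPE_2022 is the corpus class named in the log, but the keyword arguments shown here are assumptions inferred from the cache path above, not taken from the author's script.

# Sketch only: argument names are assumptions inferred from the cache path above.
from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")  # assumed keyword arguments
print(corpus)  # should report 1166 train / 165 dev / 415 test sentences

# Label dictionary over the "ner" layer (BIOES tags over LOC, PER, ORG, HumanProd)
label_dict = corpus.make_label_dictionary(label_type="ner")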
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Train: 1166 sentences
2023-10-17 18:05:02,994 (train_with_dev=False, train_with_test=False)
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Training Params:
2023-10-17 18:05:02,994 - learning_rate: "5e-05"
2023-10-17 18:05:02,994 - mini_batch_size: "8"
2023-10-17 18:05:02,994 - max_epochs: "10"
2023-10-17 18:05:02,994 - shuffle: "True"
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Plugins:
2023-10-17 18:05:02,994 - TensorboardLogger
2023-10-17 18:05:02,994 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:05:02,994 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Computation:
2023-10-17 18:05:02,994 - compute on device: cuda:0
2023-10-17 18:05:02,994 - embedding storage: none
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
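Putting the logged settings together, a comparable run could be reconstructed roughly as below. This is a sketch, not the author's script: the backbone model id is inferred from the base path, the corpus arguments are assumed, and trainer.fine_tune is used because its AdamW optimizer and linear warm-up schedule are consistent with the LinearScheduler plugin and the zero momentum values logged during training.

# Sketch only: reconstructs the logged configuration under the assumptions noted above.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")        # assumed arguments
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # inferred from the base path
    layers="-1",                # "layers-1" in the base path
    subtoken_pooling="first",   # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,              # "crfFalse": plain linear head with CrossEntropyLoss, as printed above
    use_rnn=False,
    reproject_embeddings=False, # the linear layer maps 768 -> 17 directly
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=5e-5,         # Training Params above
    mini_batch_size=8,
    max_epochs=10,
)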
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:02,994 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:05:04,216 epoch 1 - iter 14/146 - loss 3.47286437 - time (sec): 1.22 - samples/sec: 3020.40 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:05:05,861 epoch 1 - iter 28/146 - loss 2.95733369 - time (sec): 2.87 - samples/sec: 3024.74 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:05:07,185 epoch 1 - iter 42/146 - loss 2.39187734 - time (sec): 4.19 - samples/sec: 3083.46 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:05:08,974 epoch 1 - iter 56/146 - loss 1.95449677 - time (sec): 5.98 - samples/sec: 2967.80 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:05:10,441 epoch 1 - iter 70/146 - loss 1.66046494 - time (sec): 7.45 - samples/sec: 2956.40 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:05:12,085 epoch 1 - iter 84/146 - loss 1.44854117 - time (sec): 9.09 - samples/sec: 2915.08 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:05:13,346 epoch 1 - iter 98/146 - loss 1.29761420 - time (sec): 10.35 - samples/sec: 2963.66 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:05:14,662 epoch 1 - iter 112/146 - loss 1.20737029 - time (sec): 11.67 - samples/sec: 2956.59 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:05:16,112 epoch 1 - iter 126/146 - loss 1.10631188 - time (sec): 13.12 - samples/sec: 2951.03 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:05:17,340 epoch 1 - iter 140/146 - loss 1.03133640 - time (sec): 14.34 - samples/sec: 2977.96 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:05:17,983 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:17,983 EPOCH 1 done: loss 1.0020 - lr: 0.000048
2023-10-17 18:05:19,171 DEV : loss 0.19741927087306976 - f1-score (micro avg) 0.4553
2023-10-17 18:05:19,177 saving best model
2023-10-17 18:05:19,608 ----------------------------------------------------------------------------------------------------
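The lr column in the iteration lines follows the LinearScheduler plugin: the rate ramps up linearly over the first 10% of the 10 x 146 = 1460 optimizer steps (i.e. exactly epoch 1) and then decays linearly to zero. A small illustration, assuming the usual warm-up/decay formula; the step at which the value is logged may differ by one.

# Illustration only: reproduces the logged lr values approximately, assuming a standard
# linear warm-up / linear decay schedule with warmup_fraction = 0.1.
def linear_schedule_lr(step, base_lr=5e-5, total_steps=10 * 146, warmup_fraction=0.1):
    warmup_steps = int(warmup_fraction * total_steps)   # 146 steps = one epoch here
    if step < warmup_steps:
        return base_lr * step / warmup_steps             # ramp up during epoch 1
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay to 0

print(linear_schedule_lr(139))   # ~4.8e-05, cf. "lr: 0.000048" near the end of epoch 1
print(linear_schedule_lr(1459))  # ~0.0,     cf. "lr: 0.000000" at the end of epoch 10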
2023-10-17 18:05:21,033 epoch 2 - iter 14/146 - loss 0.25789668 - time (sec): 1.42 - samples/sec: 3035.78 - lr: 0.000050 - momentum: 0.000000
2023-10-17 18:05:22,205 epoch 2 - iter 28/146 - loss 0.23029793 - time (sec): 2.59 - samples/sec: 2944.73 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:05:23,767 epoch 2 - iter 42/146 - loss 0.21329382 - time (sec): 4.16 - samples/sec: 3004.58 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:05:25,642 epoch 2 - iter 56/146 - loss 0.21601636 - time (sec): 6.03 - samples/sec: 2908.44 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:05:26,915 epoch 2 - iter 70/146 - loss 0.21063728 - time (sec): 7.30 - samples/sec: 2912.94 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:05:28,452 epoch 2 - iter 84/146 - loss 0.20436532 - time (sec): 8.84 - samples/sec: 2886.36 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:05:30,128 epoch 2 - iter 98/146 - loss 0.19733768 - time (sec): 10.52 - samples/sec: 2898.31 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:05:31,546 epoch 2 - iter 112/146 - loss 0.19273671 - time (sec): 11.94 - samples/sec: 2908.88 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:05:33,102 epoch 2 - iter 126/146 - loss 0.19300856 - time (sec): 13.49 - samples/sec: 2911.12 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:05:34,409 epoch 2 - iter 140/146 - loss 0.19113966 - time (sec): 14.80 - samples/sec: 2894.14 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:05:34,939 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:34,939 EPOCH 2 done: loss 0.1881 - lr: 0.000045
2023-10-17 18:05:36,222 DEV : loss 0.13468769192695618 - f1-score (micro avg) 0.6377
2023-10-17 18:05:36,228 saving best model
2023-10-17 18:05:36,706 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:38,233 epoch 3 - iter 14/146 - loss 0.11423988 - time (sec): 1.52 - samples/sec: 3080.79 - lr: 0.000044 - momentum: 0.000000
2023-10-17 18:05:39,736 epoch 3 - iter 28/146 - loss 0.11646607 - time (sec): 3.02 - samples/sec: 3041.37 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:05:41,158 epoch 3 - iter 42/146 - loss 0.12663839 - time (sec): 4.45 - samples/sec: 2980.57 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:05:42,666 epoch 3 - iter 56/146 - loss 0.11742262 - time (sec): 5.96 - samples/sec: 2917.25 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:05:44,139 epoch 3 - iter 70/146 - loss 0.11414842 - time (sec): 7.43 - samples/sec: 2933.62 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:05:45,700 epoch 3 - iter 84/146 - loss 0.11775656 - time (sec): 8.99 - samples/sec: 2925.79 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:05:46,857 epoch 3 - iter 98/146 - loss 0.11622171 - time (sec): 10.15 - samples/sec: 2956.79 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:05:48,433 epoch 3 - iter 112/146 - loss 0.10943556 - time (sec): 11.72 - samples/sec: 2963.12 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:05:49,959 epoch 3 - iter 126/146 - loss 0.10714731 - time (sec): 13.25 - samples/sec: 2946.48 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:05:51,484 epoch 3 - iter 140/146 - loss 0.10632914 - time (sec): 14.77 - samples/sec: 2906.52 - lr: 0.000039 - momentum: 0.000000
2023-10-17 18:05:52,028 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:52,028 EPOCH 3 done: loss 0.1094 - lr: 0.000039
2023-10-17 18:05:53,295 DEV : loss 0.11463181674480438 - f1-score (micro avg) 0.7632
2023-10-17 18:05:53,302 saving best model
2023-10-17 18:05:53,742 ----------------------------------------------------------------------------------------------------
2023-10-17 18:05:55,188 epoch 4 - iter 14/146 - loss 0.06282635 - time (sec): 1.44 - samples/sec: 3137.79 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:05:56,757 epoch 4 - iter 28/146 - loss 0.05920855 - time (sec): 3.01 - samples/sec: 3022.23 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:05:57,989 epoch 4 - iter 42/146 - loss 0.06414759 - time (sec): 4.25 - samples/sec: 3033.61 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:05:59,649 epoch 4 - iter 56/146 - loss 0.06138867 - time (sec): 5.91 - samples/sec: 2989.23 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:06:01,189 epoch 4 - iter 70/146 - loss 0.06194662 - time (sec): 7.45 - samples/sec: 2986.56 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:06:02,572 epoch 4 - iter 84/146 - loss 0.06618929 - time (sec): 8.83 - samples/sec: 3000.74 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:06:04,127 epoch 4 - iter 98/146 - loss 0.06718422 - time (sec): 10.38 - samples/sec: 2940.97 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:06:05,598 epoch 4 - iter 112/146 - loss 0.07020686 - time (sec): 11.85 - samples/sec: 2949.15 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:06:07,060 epoch 4 - iter 126/146 - loss 0.07201581 - time (sec): 13.32 - samples/sec: 2939.98 - lr: 0.000034 - momentum: 0.000000
2023-10-17 18:06:08,314 epoch 4 - iter 140/146 - loss 0.07102769 - time (sec): 14.57 - samples/sec: 2918.29 - lr: 0.000034 - momentum: 0.000000
2023-10-17 18:06:08,903 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:08,903 EPOCH 4 done: loss 0.0717 - lr: 0.000034
2023-10-17 18:06:10,317 DEV : loss 0.10693204402923584 - f1-score (micro avg) 0.7439
2023-10-17 18:06:10,322 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:11,781 epoch 5 - iter 14/146 - loss 0.03715351 - time (sec): 1.46 - samples/sec: 2881.34 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:06:13,056 epoch 5 - iter 28/146 - loss 0.04279936 - time (sec): 2.73 - samples/sec: 3043.99 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:06:14,564 epoch 5 - iter 42/146 - loss 0.04029772 - time (sec): 4.24 - samples/sec: 3112.72 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:06:16,028 epoch 5 - iter 56/146 - loss 0.04254317 - time (sec): 5.70 - samples/sec: 3035.67 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:06:17,575 epoch 5 - iter 70/146 - loss 0.05147397 - time (sec): 7.25 - samples/sec: 2907.16 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:06:19,092 epoch 5 - iter 84/146 - loss 0.04970674 - time (sec): 8.77 - samples/sec: 2943.52 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:06:20,386 epoch 5 - iter 98/146 - loss 0.04728730 - time (sec): 10.06 - samples/sec: 2947.61 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:06:21,887 epoch 5 - iter 112/146 - loss 0.04576205 - time (sec): 11.56 - samples/sec: 2916.40 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:06:23,372 epoch 5 - iter 126/146 - loss 0.04431415 - time (sec): 13.05 - samples/sec: 2949.96 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:06:24,968 epoch 5 - iter 140/146 - loss 0.04350621 - time (sec): 14.65 - samples/sec: 2939.01 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:06:25,457 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:25,457 EPOCH 5 done: loss 0.0430 - lr: 0.000028
2023-10-17 18:06:26,744 DEV : loss 0.1116286963224411 - f1-score (micro avg) 0.7702
2023-10-17 18:06:26,749 saving best model
2023-10-17 18:06:27,203 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:28,618 epoch 6 - iter 14/146 - loss 0.04553866 - time (sec): 1.41 - samples/sec: 2913.20 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:06:30,185 epoch 6 - iter 28/146 - loss 0.03490225 - time (sec): 2.98 - samples/sec: 2979.56 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:06:31,516 epoch 6 - iter 42/146 - loss 0.04149229 - time (sec): 4.31 - samples/sec: 2877.42 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:06:32,937 epoch 6 - iter 56/146 - loss 0.03838322 - time (sec): 5.73 - samples/sec: 2794.34 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:06:34,459 epoch 6 - iter 70/146 - loss 0.03566342 - time (sec): 7.25 - samples/sec: 2817.37 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:06:36,022 epoch 6 - iter 84/146 - loss 0.03699880 - time (sec): 8.82 - samples/sec: 2887.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:06:37,123 epoch 6 - iter 98/146 - loss 0.03665218 - time (sec): 9.92 - samples/sec: 2914.67 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:06:38,530 epoch 6 - iter 112/146 - loss 0.03524922 - time (sec): 11.32 - samples/sec: 2914.45 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:06:39,961 epoch 6 - iter 126/146 - loss 0.03441689 - time (sec): 12.76 - samples/sec: 2944.67 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:06:41,501 epoch 6 - iter 140/146 - loss 0.03221802 - time (sec): 14.30 - samples/sec: 2971.92 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:06:42,233 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:42,234 EPOCH 6 done: loss 0.0318 - lr: 0.000023
2023-10-17 18:06:43,498 DEV : loss 0.12657296657562256 - f1-score (micro avg) 0.7738
2023-10-17 18:06:43,503 saving best model
2023-10-17 18:06:43,948 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:45,372 epoch 7 - iter 14/146 - loss 0.02524628 - time (sec): 1.42 - samples/sec: 2856.66 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:06:46,593 epoch 7 - iter 28/146 - loss 0.02680280 - time (sec): 2.64 - samples/sec: 2870.59 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:06:47,949 epoch 7 - iter 42/146 - loss 0.02213969 - time (sec): 4.00 - samples/sec: 2918.32 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:06:49,309 epoch 7 - iter 56/146 - loss 0.02139683 - time (sec): 5.36 - samples/sec: 2981.76 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:06:50,826 epoch 7 - iter 70/146 - loss 0.02696387 - time (sec): 6.87 - samples/sec: 3016.39 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:06:52,353 epoch 7 - iter 84/146 - loss 0.02742539 - time (sec): 8.40 - samples/sec: 2924.97 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:06:53,828 epoch 7 - iter 98/146 - loss 0.02526137 - time (sec): 9.88 - samples/sec: 2923.43 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:06:55,369 epoch 7 - iter 112/146 - loss 0.02426403 - time (sec): 11.42 - samples/sec: 2891.27 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:06:57,093 epoch 7 - iter 126/146 - loss 0.02384645 - time (sec): 13.14 - samples/sec: 2859.96 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:06:58,546 epoch 7 - iter 140/146 - loss 0.02357317 - time (sec): 14.60 - samples/sec: 2899.92 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:06:59,258 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:59,258 EPOCH 7 done: loss 0.0229 - lr: 0.000017
2023-10-17 18:07:00,727 DEV : loss 0.14095118641853333 - f1-score (micro avg) 0.7478
2023-10-17 18:07:00,731 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:02,124 epoch 8 - iter 14/146 - loss 0.01576638 - time (sec): 1.39 - samples/sec: 2988.30 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:07:03,698 epoch 8 - iter 28/146 - loss 0.01550301 - time (sec): 2.97 - samples/sec: 2868.29 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:07:05,126 epoch 8 - iter 42/146 - loss 0.01623872 - time (sec): 4.39 - samples/sec: 2890.47 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:07:06,652 epoch 8 - iter 56/146 - loss 0.02023556 - time (sec): 5.92 - samples/sec: 2938.03 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:07:08,136 epoch 8 - iter 70/146 - loss 0.02022101 - time (sec): 7.40 - samples/sec: 2935.66 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:07:09,637 epoch 8 - iter 84/146 - loss 0.01985932 - time (sec): 8.90 - samples/sec: 2943.32 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:07:11,362 epoch 8 - iter 98/146 - loss 0.01851297 - time (sec): 10.63 - samples/sec: 2911.68 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:07:12,649 epoch 8 - iter 112/146 - loss 0.01853413 - time (sec): 11.92 - samples/sec: 2887.11 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:07:14,247 epoch 8 - iter 126/146 - loss 0.01856705 - time (sec): 13.52 - samples/sec: 2907.94 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:07:15,590 epoch 8 - iter 140/146 - loss 0.01763127 - time (sec): 14.86 - samples/sec: 2898.42 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:07:16,079 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:16,079 EPOCH 8 done: loss 0.0176 - lr: 0.000012
2023-10-17 18:07:17,329 DEV : loss 0.1616245061159134 - f1-score (micro avg) 0.7654
2023-10-17 18:07:17,334 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:18,671 epoch 9 - iter 14/146 - loss 0.01183606 - time (sec): 1.34 - samples/sec: 2853.15 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:07:20,453 epoch 9 - iter 28/146 - loss 0.01570372 - time (sec): 3.12 - samples/sec: 2805.11 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:07:22,240 epoch 9 - iter 42/146 - loss 0.01929006 - time (sec): 4.90 - samples/sec: 2875.05 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:07:23,638 epoch 9 - iter 56/146 - loss 0.01748759 - time (sec): 6.30 - samples/sec: 2857.79 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:07:24,850 epoch 9 - iter 70/146 - loss 0.01629176 - time (sec): 7.52 - samples/sec: 2892.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:07:26,312 epoch 9 - iter 84/146 - loss 0.01521756 - time (sec): 8.98 - samples/sec: 2916.25 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:07:27,669 epoch 9 - iter 98/146 - loss 0.01433705 - time (sec): 10.33 - samples/sec: 2895.91 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:07:29,090 epoch 9 - iter 112/146 - loss 0.01425475 - time (sec): 11.76 - samples/sec: 2951.20 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:07:30,475 epoch 9 - iter 126/146 - loss 0.01361620 - time (sec): 13.14 - samples/sec: 2915.46 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:07:31,749 epoch 9 - iter 140/146 - loss 0.01297694 - time (sec): 14.41 - samples/sec: 2921.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:07:32,406 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:32,406 EPOCH 9 done: loss 0.0124 - lr: 0.000006
2023-10-17 18:07:33,632 DEV : loss 0.1573966145515442 - f1-score (micro avg) 0.773
2023-10-17 18:07:33,637 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:35,439 epoch 10 - iter 14/146 - loss 0.01626036 - time (sec): 1.80 - samples/sec: 2789.02 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:07:36,974 epoch 10 - iter 28/146 - loss 0.01146527 - time (sec): 3.34 - samples/sec: 2861.72 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:07:38,303 epoch 10 - iter 42/146 - loss 0.01070628 - time (sec): 4.67 - samples/sec: 2915.67 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:07:39,553 epoch 10 - iter 56/146 - loss 0.00897110 - time (sec): 5.91 - samples/sec: 2970.08 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:07:40,942 epoch 10 - iter 70/146 - loss 0.00880247 - time (sec): 7.30 - samples/sec: 2932.76 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:07:42,486 epoch 10 - iter 84/146 - loss 0.00928154 - time (sec): 8.85 - samples/sec: 2866.28 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:07:43,982 epoch 10 - iter 98/146 - loss 0.00880370 - time (sec): 10.34 - samples/sec: 2874.41 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:07:45,598 epoch 10 - iter 112/146 - loss 0.00874673 - time (sec): 11.96 - samples/sec: 2885.33 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:07:46,803 epoch 10 - iter 126/146 - loss 0.00921825 - time (sec): 13.16 - samples/sec: 2915.35 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:07:48,198 epoch 10 - iter 140/146 - loss 0.00968192 - time (sec): 14.56 - samples/sec: 2915.90 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:07:49,005 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:49,005 EPOCH 10 done: loss 0.0093 - lr: 0.000000
2023-10-17 18:07:50,352 DEV : loss 0.1613503247499466 - f1-score (micro avg) 0.7775
2023-10-17 18:07:50,357 saving best model
2023-10-17 18:07:51,156 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:51,157 Loading model from best epoch ...
2023-10-17 18:07:52,769 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
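For reference, the saved checkpoint can be applied with the standard Flair prediction API. A minimal sketch; the example sentence is invented for illustration.

# Sketch only: path taken from the base path logged above; the sentence is an invented example.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

sentence = Sentence("Helsingin Sanomat kertoi eilen Mannerheimin vierailusta Ouluun .")
tagger.predict(sentence)
for entity in sentence.get_spans("ner"):   # predicted LOC/PER/ORG/HumanProd spans
    print(entity)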
2023-10-17 18:07:55,187
Results:
- F-score (micro) 0.7562
- F-score (macro) 0.6535
- Accuracy 0.6287
By class:
                precision    recall  f1-score   support

         PER       0.8229    0.8678    0.8448       348
         LOC       0.6350    0.8199    0.7157       261
         ORG       0.4400    0.4231    0.4314        52
   HumanProd       0.6087    0.6364    0.6222        22

   micro avg       0.7104    0.8082    0.7562       683
   macro avg       0.6266    0.6868    0.6535       683
weighted avg       0.7150    0.8082    0.7568       683
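As a quick consistency check, the micro-averaged F-score is the harmonic mean of the pooled precision and recall: 2 * 0.7104 * 0.8082 / (0.7104 + 0.8082) ≈ 0.7562, matching the headline F-score (micro) reported above.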
2023-10-17 18:07:55,187 ----------------------------------------------------------------------------------------------------