2023-10-17 18:32:42,717 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:42,719 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:32:42,719 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:42,719 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 18:32:42,719 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:42,719 Train: 1166 sentences
2023-10-17 18:32:42,719 (train_with_dev=False, train_with_test=False)
2023-10-17 18:32:42,719 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:42,719 Training Params:
2023-10-17 18:32:42,719 - learning_rate: "5e-05"
2023-10-17 18:32:42,719 - mini_batch_size: "8"
2023-10-17 18:32:42,719 - max_epochs: "10"
2023-10-17 18:32:42,719 - shuffle: "True"
2023-10-17 18:32:42,719 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:42,719 Plugins:
2023-10-17 18:32:42,719 - TensorboardLogger
2023-10-17 18:32:42,719 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:32:42,719 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:42,719 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:32:42,719 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:32:42,719 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:42,719 Computation:
2023-10-17 18:32:42,719 - compute on device: cuda:0
2023-10-17 18:32:42,719 - embedding storage: none
2023-10-17 18:32:42,719 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:42,720 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 18:32:42,720 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:42,720 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:42,720 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:32:44,237 epoch 1 - iter 14/146 - loss 3.44721500 - time (sec): 1.52 - samples/sec: 2772.49 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:32:45,581 epoch 1 - iter 28/146 - loss 3.01929126 - time (sec): 2.86 - samples/sec: 2955.98 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:32:47,142 epoch 1 - iter 42/146 - loss 2.43085235 - time (sec): 4.42 - samples/sec: 2879.15 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:32:48,528 epoch 1 - iter 56/146 - loss 1.99245657 - time (sec): 5.81 - samples/sec: 2860.18 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:32:50,024 epoch 1 - iter 70/146 - loss 1.68208133 - time (sec): 7.30 - samples/sec: 2856.46 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:32:51,369 epoch 1 - iter 84/146 - loss 1.49120249 - time (sec): 8.65 - samples/sec: 2896.45 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:32:53,228 epoch 1 - iter 98/146 - loss 1.31372117 - time (sec): 10.51 - samples/sec: 2851.91 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:32:54,696 epoch 1 - iter 112/146 - loss 1.17868847 - time (sec): 11.98 - samples/sec: 2877.10 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:32:55,929 epoch 1 - iter 126/146 - loss 1.08469627 - time (sec): 13.21 - samples/sec: 2896.82 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:32:57,204 epoch 1 - iter 140/146 - loss 1.01543642 - time (sec): 14.48 - samples/sec: 2889.85 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:32:58,041 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:58,041 EPOCH 1 done: loss 0.9658 - lr: 0.000048
2023-10-17 18:32:59,291 DEV : loss 0.18053634464740753 - f1-score (micro avg) 0.4883
2023-10-17 18:32:59,298 saving best model
2023-10-17 18:32:59,712 ----------------------------------------------------------------------------------------------------
2023-10-17 18:33:01,122 epoch 2 - iter 14/146 - loss 0.19521870 - time (sec): 1.41 - samples/sec: 2989.49 - lr: 0.000050 - momentum: 0.000000
2023-10-17 18:33:02,779 epoch 2 - iter 28/146 - loss 0.26696941 - time (sec): 3.06 - samples/sec: 2979.34 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:33:04,231 epoch 2 - iter 42/146 - loss 0.24556875 - time (sec): 4.52 - samples/sec: 3011.47 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:33:05,768 epoch 2 - iter 56/146 - loss 0.22549833 - time (sec): 6.05 - samples/sec: 2988.86 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:33:07,648 epoch 2 - iter 70/146 - loss 0.21672435 - time (sec): 7.93 - samples/sec: 2849.12 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:33:08,979 epoch 2 - iter 84/146 - loss 0.20594215 - time (sec): 9.26 - samples/sec: 2819.20 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:33:10,328 epoch 2 - iter 98/146 - loss 0.20138047 - time (sec): 10.61 - samples/sec: 2831.38 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:33:11,876 epoch 2 - iter 112/146 - loss 0.19331625 - time (sec): 12.16 - samples/sec: 2836.85 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:33:13,396 epoch 2 - iter 126/146 - loss 0.18895690 - time (sec): 13.68 - samples/sec: 2833.45 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:33:14,789 epoch 2 - iter 140/146 - loss 0.18610808 - time (sec): 15.08 - samples/sec: 2816.05 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:33:15,502 ----------------------------------------------------------------------------------------------------
2023-10-17 18:33:15,502 EPOCH 2 done: loss 0.1847 - lr: 0.000045
2023-10-17 18:33:16,756 DEV : loss 0.11022156476974487 - f1-score (micro avg) 0.6517
2023-10-17 18:33:16,761 saving best model
2023-10-17 18:33:17,198 ----------------------------------------------------------------------------------------------------
2023-10-17 18:33:18,831 epoch 3 - iter 14/146 - loss 0.13107967 - time (sec): 1.63 - samples/sec: 3074.78 - lr: 0.000044 - momentum: 0.000000
2023-10-17 18:33:20,000 epoch 3 - iter 28/146 - loss 0.12321225 - time (sec): 2.80 - samples/sec: 3129.35 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:33:21,626 epoch 3 - iter 42/146 - loss 0.10417679 - time (sec): 4.43 - samples/sec: 3090.88 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:33:23,095 epoch 3 - iter 56/146 - loss 0.11353587 - time (sec): 5.89 - samples/sec: 3049.43 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:33:24,480 epoch 3 - iter 70/146 - loss 0.11292059 - time (sec): 7.28 - samples/sec: 3088.21 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:33:25,660 epoch 3 - iter 84/146 - loss 0.11525650 - time (sec): 8.46 - samples/sec: 3084.38 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:33:27,132 epoch 3 - iter 98/146 - loss 0.11750890 - time (sec): 9.93 - samples/sec: 3075.07 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:33:28,580 epoch 3 - iter 112/146 - loss 0.11798660 - time (sec): 11.38 - samples/sec: 3030.07 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:33:29,970 epoch 3 - iter 126/146 - loss 0.11282492 - time (sec): 12.77 - samples/sec: 3020.79 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:33:31,273 epoch 3 - iter 140/146 - loss 0.10978467 - time (sec): 14.07 - samples/sec: 3036.73 - lr: 0.000039 - momentum: 0.000000
2023-10-17 18:33:31,853 ----------------------------------------------------------------------------------------------------
2023-10-17 18:33:31,854 EPOCH 3 done: loss 0.1075 - lr: 0.000039
2023-10-17 18:33:33,152 DEV : loss 0.1118723452091217 - f1-score (micro avg) 0.7425
2023-10-17 18:33:33,160 saving best model
2023-10-17 18:33:33,735 ----------------------------------------------------------------------------------------------------
2023-10-17 18:33:35,365 epoch 4 - iter 14/146 - loss 0.08288392 - time (sec): 1.63 - samples/sec: 3088.56 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:33:36,817 epoch 4 - iter 28/146 - loss 0.07072301 - time (sec): 3.08 - samples/sec: 3044.62 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:33:38,186 epoch 4 - iter 42/146 - loss 0.06783415 - time (sec): 4.45 - samples/sec: 2941.34 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:33:39,817 epoch 4 - iter 56/146 - loss 0.06713406 - time (sec): 6.08 - samples/sec: 2807.96 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:33:41,638 epoch 4 - iter 70/146 - loss 0.06785787 - time (sec): 7.90 - samples/sec: 2747.44 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:33:43,030 epoch 4 - iter 84/146 - loss 0.07005552 - time (sec): 9.29 - samples/sec: 2732.78 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:33:44,485 epoch 4 - iter 98/146 - loss 0.07304693 - time (sec): 10.75 - samples/sec: 2759.73 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:33:46,178 epoch 4 - iter 112/146 - loss 0.07014498 - time (sec): 12.44 - samples/sec: 2756.18 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:33:47,585 epoch 4 - iter 126/146 - loss 0.07006272 - time (sec): 13.85 - samples/sec: 2786.75 - lr: 0.000034 - momentum: 0.000000
2023-10-17 18:33:49,152 epoch 4 - iter 140/146 - loss 0.07019238 - time (sec): 15.41 - samples/sec: 2790.11 - lr: 0.000034 - momentum: 0.000000
2023-10-17 18:33:49,706 ----------------------------------------------------------------------------------------------------
2023-10-17 18:33:49,706 EPOCH 4 done: loss 0.0690 - lr: 0.000034
2023-10-17 18:33:50,933 DEV : loss 0.10351864993572235 - f1-score (micro avg) 0.7398
2023-10-17 18:33:50,938 ----------------------------------------------------------------------------------------------------
2023-10-17 18:33:52,938 epoch 5 - iter 14/146 - loss 0.04384680 - time (sec): 2.00 - samples/sec: 2628.61 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:33:54,137 epoch 5 - iter 28/146 - loss 0.04534817 - time (sec): 3.20 - samples/sec: 2691.58 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:33:55,438 epoch 5 - iter 42/146 - loss 0.04253725 - time (sec): 4.50 - samples/sec: 2715.69 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:33:57,098 epoch 5 - iter 56/146 - loss 0.04253393 - time (sec): 6.16 - samples/sec: 2697.89 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:33:58,628 epoch 5 - iter 70/146 - loss 0.04450004 - time (sec): 7.69 - samples/sec: 2802.33 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:33:59,988 epoch 5 - iter 84/146 - loss 0.04818641 - time (sec): 9.05 - samples/sec: 2850.62 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:34:01,363 epoch 5 - iter 98/146 - loss 0.04677106 - time (sec): 10.42 - samples/sec: 2848.72 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:34:02,648 epoch 5 - iter 112/146 - loss 0.04521279 - time (sec): 11.71 - samples/sec: 2864.81 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:34:04,034 epoch 5 - iter 126/146 - loss 0.04712952 - time (sec): 13.09 - samples/sec: 2889.35 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:34:05,510 epoch 5 - iter 140/146 - loss 0.04798974 - time (sec): 14.57 - samples/sec: 2921.07 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:34:06,203 ----------------------------------------------------------------------------------------------------
2023-10-17 18:34:06,203 EPOCH 5 done: loss 0.0466 - lr: 0.000028
2023-10-17 18:34:07,478 DEV : loss 0.10968425124883652 - f1-score (micro avg) 0.7361
2023-10-17 18:34:07,485 ----------------------------------------------------------------------------------------------------
2023-10-17 18:34:09,419 epoch 6 - iter 14/146 - loss 0.03514218 - time (sec): 1.93 - samples/sec: 2846.19 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:34:10,591 epoch 6 - iter 28/146 - loss 0.03242432 - time (sec): 3.10 - samples/sec: 2953.00 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:34:12,157 epoch 6 - iter 42/146 - loss 0.03032983 - time (sec): 4.67 - samples/sec: 2890.26 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:34:13,700 epoch 6 - iter 56/146 - loss 0.02748075 - time (sec): 6.21 - samples/sec: 2901.94 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:34:15,088 epoch 6 - iter 70/146 - loss 0.02612346 - time (sec): 7.60 - samples/sec: 2909.63 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:34:16,503 epoch 6 - iter 84/146 - loss 0.02494045 - time (sec): 9.02 - samples/sec: 2879.71 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:34:17,930 epoch 6 - iter 98/146 - loss 0.02559131 - time (sec): 10.44 - samples/sec: 2895.53 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:34:19,504 epoch 6 - iter 112/146 - loss 0.02864853 - time (sec): 12.02 - samples/sec: 2887.29 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:34:20,811 epoch 6 - iter 126/146 - loss 0.02888129 - time (sec): 13.32 - samples/sec: 2910.47 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:34:22,186 epoch 6 - iter 140/146 - loss 0.02827654 - time (sec): 14.70 - samples/sec: 2910.93 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:34:22,693 ----------------------------------------------------------------------------------------------------
2023-10-17 18:34:22,693 EPOCH 6 done: loss 0.0290 - lr: 0.000023
2023-10-17 18:34:24,038 DEV : loss 0.11068177223205566 - f1-score (micro avg) 0.7696
2023-10-17 18:34:24,044 saving best model
2023-10-17 18:34:24,678 ----------------------------------------------------------------------------------------------------
2023-10-17 18:34:26,356 epoch 7 - iter 14/146 - loss 0.02230428 - time (sec): 1.68 - samples/sec: 2924.12 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:34:27,874 epoch 7 - iter 28/146 - loss 0.02787918 - time (sec): 3.19 - samples/sec: 2805.98 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:34:29,163 epoch 7 - iter 42/146 - loss 0.02138936 - time (sec): 4.48 - samples/sec: 2914.64 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:34:30,745 epoch 7 - iter 56/146 - loss 0.02189885 - time (sec): 6.06 - samples/sec: 2950.23 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:34:32,131 epoch 7 - iter 70/146 - loss 0.02030357 - time (sec): 7.45 - samples/sec: 2980.81 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:34:33,791 epoch 7 - iter 84/146 - loss 0.02090295 - time (sec): 9.11 - samples/sec: 2944.88 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:34:35,159 epoch 7 - iter 98/146 - loss 0.02278254 - time (sec): 10.48 - samples/sec: 2894.81 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:34:36,554 epoch 7 - iter 112/146 - loss 0.02456466 - time (sec): 11.87 - samples/sec: 2893.26 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:34:37,907 epoch 7 - iter 126/146 - loss 0.02333651 - time (sec): 13.23 - samples/sec: 2895.98 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:34:39,713 epoch 7 - iter 140/146 - loss 0.02205748 - time (sec): 15.03 - samples/sec: 2853.68 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:34:40,279 ----------------------------------------------------------------------------------------------------
2023-10-17 18:34:40,279 EPOCH 7 done: loss 0.0224 - lr: 0.000017
2023-10-17 18:34:41,555 DEV : loss 0.11937715113162994 - f1-score (micro avg) 0.7716
2023-10-17 18:34:41,560 saving best model
2023-10-17 18:34:42,061 ----------------------------------------------------------------------------------------------------
2023-10-17 18:34:43,547 epoch 8 - iter 14/146 - loss 0.02757955 - time (sec): 1.48 - samples/sec: 2973.09 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:34:45,020 epoch 8 - iter 28/146 - loss 0.02223759 - time (sec): 2.96 - samples/sec: 2910.84 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:34:46,608 epoch 8 - iter 42/146 - loss 0.01936289 - time (sec): 4.54 - samples/sec: 3095.57 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:34:48,219 epoch 8 - iter 56/146 - loss 0.01619699 - time (sec): 6.15 - samples/sec: 3060.94 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:34:49,574 epoch 8 - iter 70/146 - loss 0.01338593 - time (sec): 7.51 - samples/sec: 3081.20 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:34:51,018 epoch 8 - iter 84/146 - loss 0.01288682 - time (sec): 8.95 - samples/sec: 3035.27 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:34:52,559 epoch 8 - iter 98/146 - loss 0.01281177 - time (sec): 10.50 - samples/sec: 2979.26 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:34:53,900 epoch 8 - iter 112/146 - loss 0.01343863 - time (sec): 11.84 - samples/sec: 2940.78 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:34:55,230 epoch 8 - iter 126/146 - loss 0.01305752 - time (sec): 13.17 - samples/sec: 2950.47 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:34:56,904 epoch 8 - iter 140/146 - loss 0.01765526 - time (sec): 14.84 - samples/sec: 2913.93 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:34:57,379 ----------------------------------------------------------------------------------------------------
2023-10-17 18:34:57,380 EPOCH 8 done: loss 0.0178 - lr: 0.000012
2023-10-17 18:34:58,661 DEV : loss 0.12875597178936005 - f1-score (micro avg) 0.777
2023-10-17 18:34:58,666 saving best model
2023-10-17 18:34:59,170 ----------------------------------------------------------------------------------------------------
2023-10-17 18:35:00,731 epoch 9 - iter 14/146 - loss 0.00793277 - time (sec): 1.56 - samples/sec: 2907.10 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:35:02,406 epoch 9 - iter 28/146 - loss 0.00780084 - time (sec): 3.23 - samples/sec: 2933.67 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:35:04,010 epoch 9 - iter 42/146 - loss 0.01119877 - time (sec): 4.84 - samples/sec: 2873.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:35:05,275 epoch 9 - iter 56/146 - loss 0.01053210 - time (sec): 6.10 - samples/sec: 2916.89 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:35:07,113 epoch 9 - iter 70/146 - loss 0.00945729 - time (sec): 7.94 - samples/sec: 2821.79 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:35:08,599 epoch 9 - iter 84/146 - loss 0.01012236 - time (sec): 9.43 - samples/sec: 2818.25 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:35:09,990 epoch 9 - iter 98/146 - loss 0.01351672 - time (sec): 10.82 - samples/sec: 2823.37 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:35:11,228 epoch 9 - iter 112/146 - loss 0.01405738 - time (sec): 12.06 - samples/sec: 2807.34 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:35:12,890 epoch 9 - iter 126/146 - loss 0.01276090 - time (sec): 13.72 - samples/sec: 2802.31 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:35:14,457 epoch 9 - iter 140/146 - loss 0.01273768 - time (sec): 15.29 - samples/sec: 2801.19 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:35:15,060 ----------------------------------------------------------------------------------------------------
2023-10-17 18:35:15,060 EPOCH 9 done: loss 0.0124 - lr: 0.000006
2023-10-17 18:35:16,339 DEV : loss 0.12694209814071655 - f1-score (micro avg) 0.7815
2023-10-17 18:35:16,344 saving best model
2023-10-17 18:35:16,844 ----------------------------------------------------------------------------------------------------
2023-10-17 18:35:18,562 epoch 10 - iter 14/146 - loss 0.00354045 - time (sec): 1.72 - samples/sec: 2833.65 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:35:19,975 epoch 10 - iter 28/146 - loss 0.01183736 - time (sec): 3.13 - samples/sec: 2855.53 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:35:21,397 epoch 10 - iter 42/146 - loss 0.01452345 - time (sec): 4.55 - samples/sec: 2763.23 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:35:22,965 epoch 10 - iter 56/146 - loss 0.01324616 - time (sec): 6.12 - samples/sec: 2797.27 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:35:24,599 epoch 10 - iter 70/146 - loss 0.01119466 - time (sec): 7.75 - samples/sec: 2804.22 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:35:25,980 epoch 10 - iter 84/146 - loss 0.00937811 - time (sec): 9.13 - samples/sec: 2872.00 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:35:27,520 epoch 10 - iter 98/146 - loss 0.00961423 - time (sec): 10.67 - samples/sec: 2849.67 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:35:29,048 epoch 10 - iter 112/146 - loss 0.00861973 - time (sec): 12.20 - samples/sec: 2815.11 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:35:30,426 epoch 10 - iter 126/146 - loss 0.00888169 - time (sec): 13.58 - samples/sec: 2829.81 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:35:31,841 epoch 10 - iter 140/146 - loss 0.00956492 - time (sec): 15.00 - samples/sec: 2845.01 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:35:32,503 ----------------------------------------------------------------------------------------------------
2023-10-17 18:35:32,504 EPOCH 10 done: loss 0.0097 - lr: 0.000000
2023-10-17 18:35:33,785 DEV : loss 0.12970572710037231 - f1-score (micro avg) 0.7652
2023-10-17 18:35:34,173 ----------------------------------------------------------------------------------------------------
2023-10-17 18:35:34,174 Loading model from best epoch ...
2023-10-17 18:35:35,857 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 18:35:38,423
Results:
- F-score (micro) 0.7632
- F-score (macro) 0.6905
- Accuracy 0.6391
By class:
              precision    recall  f1-score   support

         PER     0.8137    0.8534    0.8331       348
         LOC     0.6478    0.8314    0.7282       261
         ORG     0.5333    0.4615    0.4948        52
   HumanProd     0.6207    0.8182    0.7059        22

   micro avg     0.7183    0.8141    0.7632       683
   macro avg     0.6539    0.7411    0.6905       683
weighted avg     0.7227    0.8141    0.7632       683
2023-10-17 18:35:38,423 ----------------------------------------------------------------------------------------------------
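
The summary scores in the final evaluation can be cross-checked from the per-class rows. The sketch below is not part of the training run: it assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro F1 as the unweighted mean of per-class F1, weighted F1 as the support-weighted mean) and uses the rounded values printed in the log, so the results should agree with the reported figures only to about three decimal places. The names `per_class`, `f1`, `class_f1`, `macro_f1`, `weighted_f1`, and `micro_f1` are illustrative.

```python
# Values copied from the "By class" table: (precision, recall, support).
per_class = {
    "PER": (0.8137, 0.8534, 348),
    "LOC": (0.6478, 0.8314, 261),
    "ORG": (0.5333, 0.4615, 52),
    "HumanProd": (0.6207, 0.8182, 22),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total


# Per-class F1 recomputed from the rounded precision and recall.
class_f1 = {label: f1(p, r) for label, (p, r, _) in per_class.items()}

# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(class_f1.values()) / len(class_f1)

# Weighted average: each class is weighted by its number of gold entities.
total_support = sum(n for _, _, n in per_class.values())
weighted_f1 = sum(class_f1[k] * per_class[k][2] for k in per_class) / total_support

# Micro F1 recomputed from the "micro avg" precision and recall in the log.
micro_f1 = f1(0.7183, 0.8141)

for label, score in class_f1.items():
    print(f"{label:>12}  f1={score:.4f}")
print(f"macro={macro_f1:.4f}  weighted={weighted_f1:.4f}  micro={micro_f1:.4f}")
```

The gap between the micro and macro scores comes mostly from ORG: it has the lowest per-class F1 but only 52 of the 683 gold entities, so it pulls the macro average down more than the micro average.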