2023-10-17 17:37:29,787 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,788 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:37:29,788 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,788 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 17:37:29,788 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,788 Train: 1166 sentences
2023-10-17 17:37:29,788 (train_with_dev=False, train_with_test=False)
2023-10-17 17:37:29,788 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,788 Training Params:
2023-10-17 17:37:29,788 - learning_rate: "5e-05"
2023-10-17 17:37:29,789 - mini_batch_size: "8"
2023-10-17 17:37:29,789 - max_epochs: "10"
2023-10-17 17:37:29,789 - shuffle: "True"
2023-10-17 17:37:29,789 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,789 Plugins:
2023-10-17 17:37:29,789 - TensorboardLogger
2023-10-17 17:37:29,789 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:37:29,789 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,789 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:37:29,789 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:37:29,789 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,789 Computation:
2023-10-17 17:37:29,789 - compute on device: cuda:0
2023-10-17 17:37:29,789 - embedding storage: none
2023-10-17 17:37:29,789 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,789 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 17:37:29,789 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,789 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,789 Logging anything other than scalars to TensorBoard is currently not supported.
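
Continuing the sketch above, the Training Params, Plugins, and model training base path map onto Flair's fine-tuning entry point roughly as follows; this is a sketch under those assumptions, not the exact script used for this run.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1",
    learning_rate=5e-05,          # Training Params above
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
    # fine_tune attaches a linear LR schedule with warmup, consistent with the
    # "LinearScheduler | warmup_fraction: '0.1'" plugin logged above.
)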
2023-10-17 17:37:31,187 epoch 1 - iter 14/146 - loss 3.37573558 - time (sec): 1.40 - samples/sec: 2716.09 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:37:32,560 epoch 1 - iter 28/146 - loss 2.95919448 - time (sec): 2.77 - samples/sec: 2752.12 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:37:34,759 epoch 1 - iter 42/146 - loss 2.24791386 - time (sec): 4.97 - samples/sec: 2756.23 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:37:36,522 epoch 1 - iter 56/146 - loss 1.84282911 - time (sec): 6.73 - samples/sec: 2657.88 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:37:37,701 epoch 1 - iter 70/146 - loss 1.61728577 - time (sec): 7.91 - samples/sec: 2728.57 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:37:39,260 epoch 1 - iter 84/146 - loss 1.44152327 - time (sec): 9.47 - samples/sec: 2721.35 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:37:40,590 epoch 1 - iter 98/146 - loss 1.28259256 - time (sec): 10.80 - samples/sec: 2763.18 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:37:41,945 epoch 1 - iter 112/146 - loss 1.15921053 - time (sec): 12.16 - samples/sec: 2807.55 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:37:43,407 epoch 1 - iter 126/146 - loss 1.06002134 - time (sec): 13.62 - samples/sec: 2821.55 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:37:44,867 epoch 1 - iter 140/146 - loss 0.98907932 - time (sec): 15.08 - samples/sec: 2804.11 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:37:45,491 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:45,491 EPOCH 1 done: loss 0.9535 - lr: 0.000048
2023-10-17 17:37:46,312 DEV : loss 0.16792482137680054 - f1-score (micro avg) 0.5135
2023-10-17 17:37:46,317 saving best model
2023-10-17 17:37:46,657 ----------------------------------------------------------------------------------------------------
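(The lr column traces the LinearScheduler configured above: with 146 iterations per epoch and 10 epochs, warmup_fraction 0.1 amounts to 0.1 x 1460 = 146 warmup steps, i.e. exactly epoch 1, so the learning rate reaches its 5e-05 peak at the end of epoch 1 and then decays linearly towards 0. Assuming the standard linear-decay form, at iteration 140 of epoch 2, step 286, this gives 5e-05 x (1460 - 286) / (1460 - 146) ~ 4.5e-05, in line with the logged 0.000045.)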
2023-10-17 17:37:48,271 epoch 2 - iter 14/146 - loss 0.21406715 - time (sec): 1.61 - samples/sec: 2819.73 - lr: 0.000050 - momentum: 0.000000
2023-10-17 17:37:49,772 epoch 2 - iter 28/146 - loss 0.19088152 - time (sec): 3.11 - samples/sec: 2721.56 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:37:51,118 epoch 2 - iter 42/146 - loss 0.18486880 - time (sec): 4.46 - samples/sec: 2842.05 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:37:52,987 epoch 2 - iter 56/146 - loss 0.17632972 - time (sec): 6.33 - samples/sec: 2819.25 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:37:54,311 epoch 2 - iter 70/146 - loss 0.17389479 - time (sec): 7.65 - samples/sec: 2888.93 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:37:55,626 epoch 2 - iter 84/146 - loss 0.17353218 - time (sec): 8.97 - samples/sec: 2995.75 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:37:56,773 epoch 2 - iter 98/146 - loss 0.17824868 - time (sec): 10.12 - samples/sec: 2997.07 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:37:57,986 epoch 2 - iter 112/146 - loss 0.18444188 - time (sec): 11.33 - samples/sec: 2983.30 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:37:59,584 epoch 2 - iter 126/146 - loss 0.18735041 - time (sec): 12.93 - samples/sec: 2976.20 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:38:01,174 epoch 2 - iter 140/146 - loss 0.18194918 - time (sec): 14.52 - samples/sec: 2946.22 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:38:01,804 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:01,804 EPOCH 2 done: loss 0.1799 - lr: 0.000045
2023-10-17 17:38:03,284 DEV : loss 0.1151837706565857 - f1-score (micro avg) 0.668
2023-10-17 17:38:03,289 saving best model
2023-10-17 17:38:03,764 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:05,411 epoch 3 - iter 14/146 - loss 0.11815419 - time (sec): 1.65 - samples/sec: 2724.25 - lr: 0.000044 - momentum: 0.000000
2023-10-17 17:38:06,872 epoch 3 - iter 28/146 - loss 0.11456933 - time (sec): 3.11 - samples/sec: 2816.48 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:38:08,324 epoch 3 - iter 42/146 - loss 0.12498663 - time (sec): 4.56 - samples/sec: 2901.73 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:38:09,615 epoch 3 - iter 56/146 - loss 0.11790534 - time (sec): 5.85 - samples/sec: 2942.75 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:38:10,964 epoch 3 - iter 70/146 - loss 0.11400928 - time (sec): 7.20 - samples/sec: 2934.56 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:38:12,331 epoch 3 - iter 84/146 - loss 0.10974986 - time (sec): 8.57 - samples/sec: 2940.41 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:38:13,931 epoch 3 - iter 98/146 - loss 0.10732321 - time (sec): 10.17 - samples/sec: 2945.09 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:38:15,528 epoch 3 - iter 112/146 - loss 0.10781926 - time (sec): 11.76 - samples/sec: 2947.88 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:38:16,806 epoch 3 - iter 126/146 - loss 0.10821794 - time (sec): 13.04 - samples/sec: 2952.99 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:38:18,148 epoch 3 - iter 140/146 - loss 0.10839025 - time (sec): 14.38 - samples/sec: 2930.90 - lr: 0.000039 - momentum: 0.000000
2023-10-17 17:38:18,954 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:18,954 EPOCH 3 done: loss 0.1088 - lr: 0.000039
2023-10-17 17:38:20,213 DEV : loss 0.10572662949562073 - f1-score (micro avg) 0.6921
2023-10-17 17:38:20,217 saving best model
2023-10-17 17:38:20,697 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:21,822 epoch 4 - iter 14/146 - loss 0.09358790 - time (sec): 1.12 - samples/sec: 3095.06 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:38:23,043 epoch 4 - iter 28/146 - loss 0.07984905 - time (sec): 2.34 - samples/sec: 3154.25 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:38:24,710 epoch 4 - iter 42/146 - loss 0.06872645 - time (sec): 4.00 - samples/sec: 3109.62 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:38:26,305 epoch 4 - iter 56/146 - loss 0.06929502 - time (sec): 5.60 - samples/sec: 2991.71 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:38:27,794 epoch 4 - iter 70/146 - loss 0.06615047 - time (sec): 7.09 - samples/sec: 2967.01 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:38:29,354 epoch 4 - iter 84/146 - loss 0.06341990 - time (sec): 8.65 - samples/sec: 2962.93 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:38:31,194 epoch 4 - iter 98/146 - loss 0.06994356 - time (sec): 10.49 - samples/sec: 2918.65 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:38:32,546 epoch 4 - iter 112/146 - loss 0.06799879 - time (sec): 11.84 - samples/sec: 2904.32 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:38:34,013 epoch 4 - iter 126/146 - loss 0.07029829 - time (sec): 13.31 - samples/sec: 2925.93 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:38:35,487 epoch 4 - iter 140/146 - loss 0.07084618 - time (sec): 14.78 - samples/sec: 2901.54 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:38:36,018 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:36,018 EPOCH 4 done: loss 0.0699 - lr: 0.000034
2023-10-17 17:38:37,313 DEV : loss 0.126682311296463 - f1-score (micro avg) 0.7511
2023-10-17 17:38:37,318 saving best model
2023-10-17 17:38:37,798 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:39,540 epoch 5 - iter 14/146 - loss 0.05401972 - time (sec): 1.74 - samples/sec: 2868.10 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:38:41,233 epoch 5 - iter 28/146 - loss 0.04019088 - time (sec): 3.43 - samples/sec: 2849.95 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:38:42,521 epoch 5 - iter 42/146 - loss 0.04999847 - time (sec): 4.72 - samples/sec: 2983.70 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:38:43,717 epoch 5 - iter 56/146 - loss 0.05432339 - time (sec): 5.92 - samples/sec: 2985.17 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:38:45,230 epoch 5 - iter 70/146 - loss 0.05670146 - time (sec): 7.43 - samples/sec: 2940.20 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:38:46,681 epoch 5 - iter 84/146 - loss 0.05548785 - time (sec): 8.88 - samples/sec: 2943.86 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:38:48,071 epoch 5 - iter 98/146 - loss 0.05219240 - time (sec): 10.27 - samples/sec: 2986.20 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:38:49,402 epoch 5 - iter 112/146 - loss 0.05188564 - time (sec): 11.60 - samples/sec: 2993.44 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:38:50,761 epoch 5 - iter 126/146 - loss 0.04945330 - time (sec): 12.96 - samples/sec: 2977.12 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:38:52,167 epoch 5 - iter 140/146 - loss 0.04807587 - time (sec): 14.37 - samples/sec: 2983.25 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:38:52,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:52,632 EPOCH 5 done: loss 0.0471 - lr: 0.000028
2023-10-17 17:38:53,948 DEV : loss 0.12283353507518768 - f1-score (micro avg) 0.764
2023-10-17 17:38:53,953 saving best model
2023-10-17 17:38:54,398 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:55,858 epoch 6 - iter 14/146 - loss 0.03681802 - time (sec): 1.46 - samples/sec: 2991.47 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:38:57,148 epoch 6 - iter 28/146 - loss 0.03075934 - time (sec): 2.75 - samples/sec: 2831.24 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:38:59,358 epoch 6 - iter 42/146 - loss 0.03043773 - time (sec): 4.96 - samples/sec: 2519.99 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:39:00,913 epoch 6 - iter 56/146 - loss 0.03076017 - time (sec): 6.51 - samples/sec: 2587.83 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:39:02,240 epoch 6 - iter 70/146 - loss 0.02960510 - time (sec): 7.84 - samples/sec: 2593.20 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:39:03,797 epoch 6 - iter 84/146 - loss 0.02886610 - time (sec): 9.40 - samples/sec: 2581.09 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:39:05,279 epoch 6 - iter 98/146 - loss 0.03043924 - time (sec): 10.88 - samples/sec: 2590.89 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:39:06,942 epoch 6 - iter 112/146 - loss 0.03186061 - time (sec): 12.54 - samples/sec: 2628.15 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:39:08,443 epoch 6 - iter 126/146 - loss 0.03215404 - time (sec): 14.04 - samples/sec: 2662.69 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:39:10,324 epoch 6 - iter 140/146 - loss 0.03176934 - time (sec): 15.92 - samples/sec: 2692.28 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:39:10,845 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:10,845 EPOCH 6 done: loss 0.0315 - lr: 0.000023
2023-10-17 17:39:12,121 DEV : loss 0.10742789506912231 - f1-score (micro avg) 0.7627
2023-10-17 17:39:12,125 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:13,750 epoch 7 - iter 14/146 - loss 0.01262368 - time (sec): 1.62 - samples/sec: 2937.75 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:39:15,331 epoch 7 - iter 28/146 - loss 0.02207873 - time (sec): 3.20 - samples/sec: 2714.29 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:39:16,945 epoch 7 - iter 42/146 - loss 0.02262086 - time (sec): 4.82 - samples/sec: 2793.54 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:39:18,482 epoch 7 - iter 56/146 - loss 0.02038830 - time (sec): 6.36 - samples/sec: 2833.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:39:20,148 epoch 7 - iter 70/146 - loss 0.01965031 - time (sec): 8.02 - samples/sec: 2839.38 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:39:21,490 epoch 7 - iter 84/146 - loss 0.01978349 - time (sec): 9.36 - samples/sec: 2883.88 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:39:23,003 epoch 7 - iter 98/146 - loss 0.02074431 - time (sec): 10.88 - samples/sec: 2891.38 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:39:24,323 epoch 7 - iter 112/146 - loss 0.01969168 - time (sec): 12.20 - samples/sec: 2922.56 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:39:25,487 epoch 7 - iter 126/146 - loss 0.01981820 - time (sec): 13.36 - samples/sec: 2922.63 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:39:26,788 epoch 7 - iter 140/146 - loss 0.02159605 - time (sec): 14.66 - samples/sec: 2925.17 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:39:27,363 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:27,363 EPOCH 7 done: loss 0.0213 - lr: 0.000017
2023-10-17 17:39:28,709 DEV : loss 0.1274358183145523 - f1-score (micro avg) 0.7559
2023-10-17 17:39:28,715 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:30,242 epoch 8 - iter 14/146 - loss 0.01472338 - time (sec): 1.53 - samples/sec: 2973.44 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:39:31,880 epoch 8 - iter 28/146 - loss 0.02160793 - time (sec): 3.16 - samples/sec: 2927.18 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:39:33,565 epoch 8 - iter 42/146 - loss 0.01728589 - time (sec): 4.85 - samples/sec: 2815.92 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:39:34,870 epoch 8 - iter 56/146 - loss 0.01931026 - time (sec): 6.15 - samples/sec: 2868.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:39:36,236 epoch 8 - iter 70/146 - loss 0.01779502 - time (sec): 7.52 - samples/sec: 2842.77 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:39:37,477 epoch 8 - iter 84/146 - loss 0.01627756 - time (sec): 8.76 - samples/sec: 2865.82 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:39:38,941 epoch 8 - iter 98/146 - loss 0.01662670 - time (sec): 10.22 - samples/sec: 2873.57 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:39:40,407 epoch 8 - iter 112/146 - loss 0.01534973 - time (sec): 11.69 - samples/sec: 2859.92 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:39:41,805 epoch 8 - iter 126/146 - loss 0.01445702 - time (sec): 13.09 - samples/sec: 2884.04 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:39:43,494 epoch 8 - iter 140/146 - loss 0.01553068 - time (sec): 14.78 - samples/sec: 2900.40 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:39:43,971 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:43,971 EPOCH 8 done: loss 0.0153 - lr: 0.000012
2023-10-17 17:39:45,269 DEV : loss 0.141546830534935 - f1-score (micro avg) 0.7462
2023-10-17 17:39:45,274 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:46,828 epoch 9 - iter 14/146 - loss 0.00978563 - time (sec): 1.55 - samples/sec: 2657.05 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:39:48,213 epoch 9 - iter 28/146 - loss 0.01442725 - time (sec): 2.94 - samples/sec: 2729.65 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:39:49,441 epoch 9 - iter 42/146 - loss 0.01169914 - time (sec): 4.17 - samples/sec: 2734.87 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:39:50,688 epoch 9 - iter 56/146 - loss 0.01092067 - time (sec): 5.41 - samples/sec: 2777.73 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:39:52,381 epoch 9 - iter 70/146 - loss 0.00994777 - time (sec): 7.11 - samples/sec: 2837.20 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:39:53,855 epoch 9 - iter 84/146 - loss 0.01072843 - time (sec): 8.58 - samples/sec: 2879.33 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:39:55,147 epoch 9 - iter 98/146 - loss 0.01072225 - time (sec): 9.87 - samples/sec: 2884.07 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:39:56,641 epoch 9 - iter 112/146 - loss 0.00993920 - time (sec): 11.37 - samples/sec: 2881.71 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:39:58,337 epoch 9 - iter 126/146 - loss 0.01048425 - time (sec): 13.06 - samples/sec: 2909.76 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:39:59,721 epoch 9 - iter 140/146 - loss 0.01035252 - time (sec): 14.45 - samples/sec: 2919.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:40:00,613 ----------------------------------------------------------------------------------------------------
2023-10-17 17:40:00,613 EPOCH 9 done: loss 0.0104 - lr: 0.000006
2023-10-17 17:40:01,845 DEV : loss 0.1482328176498413 - f1-score (micro avg) 0.7843
2023-10-17 17:40:01,849 saving best model
2023-10-17 17:40:02,314 ----------------------------------------------------------------------------------------------------
2023-10-17 17:40:03,886 epoch 10 - iter 14/146 - loss 0.01064113 - time (sec): 1.56 - samples/sec: 2989.33 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:40:05,357 epoch 10 - iter 28/146 - loss 0.00752883 - time (sec): 3.04 - samples/sec: 3075.90 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:40:06,801 epoch 10 - iter 42/146 - loss 0.00810847 - time (sec): 4.48 - samples/sec: 3031.72 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:40:08,183 epoch 10 - iter 56/146 - loss 0.00767424 - time (sec): 5.86 - samples/sec: 2976.27 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:40:09,664 epoch 10 - iter 70/146 - loss 0.00631805 - time (sec): 7.34 - samples/sec: 2947.32 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:40:11,180 epoch 10 - iter 84/146 - loss 0.00616265 - time (sec): 8.86 - samples/sec: 2935.80 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:40:12,703 epoch 10 - iter 98/146 - loss 0.00683143 - time (sec): 10.38 - samples/sec: 2905.56 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:40:14,228 epoch 10 - iter 112/146 - loss 0.00780732 - time (sec): 11.91 - samples/sec: 2914.68 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:40:15,862 epoch 10 - iter 126/146 - loss 0.00805699 - time (sec): 13.54 - samples/sec: 2908.24 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:40:17,212 epoch 10 - iter 140/146 - loss 0.00786973 - time (sec): 14.89 - samples/sec: 2910.88 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:40:17,622 ----------------------------------------------------------------------------------------------------
2023-10-17 17:40:17,623 EPOCH 10 done: loss 0.0077 - lr: 0.000000
2023-10-17 17:40:18,868 DEV : loss 0.14403872191905975 - f1-score (micro avg) 0.7783
2023-10-17 17:40:19,212 ----------------------------------------------------------------------------------------------------
2023-10-17 17:40:19,213 Loading model from best epoch ...
2023-10-17 17:40:20,598 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 17:40:23,115
Results:
- F-score (micro) 0.7612
- F-score (macro) 0.701
- Accuracy 0.6366
By class:
              precision    recall  f1-score   support

         PER     0.8122    0.8448    0.8282       348
         LOC     0.6605    0.8199    0.7316       261
         ORG     0.4615    0.4615    0.4615        52
   HumanProd     0.7500    0.8182    0.7826        22

   micro avg     0.7218    0.8053    0.7612       683
   macro avg     0.6710    0.7361    0.7010       683
weighted avg     0.7255    0.8053    0.7619       683
2023-10-17 17:40:23,116 ----------------------------------------------------------------------------------------------------
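
For reference, a checkpoint saved this way can typically be reloaded for inference with Flair along the following lines; the path is the base path reported above plus best-model.pt, and the example sentence is made up.

from flair.data import Sentence
from flair.models import SequenceTagger

# load the best checkpoint written during training
tagger = SequenceTagger.load(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1/best-model.pt"
)

sentence = Sentence("Helsingin Sanomat kertoi asiasta.")   # made-up Finnish example
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span)   # spans decoded from the BIOES tags listed above (LOC, PER, ORG, HumanProd)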