2023-10-13 13:20:08,359 ----------------------------------------------------------------------------------------------------
2023-10-13 13:20:08,360 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 13:20:08,360 ----------------------------------------------------------------------------------------------------
2023-10-13 13:20:08,360 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-13 13:20:08,360 ----------------------------------------------------------------------------------------------------
2023-10-13 13:20:08,360 Train: 3575 sentences
2023-10-13 13:20:08,360 (train_with_dev=False, train_with_test=False)
2023-10-13 13:20:08,360 ----------------------------------------------------------------------------------------------------
2023-10-13 13:20:08,360 Training Params:
2023-10-13 13:20:08,360 - learning_rate: "5e-05"
2023-10-13 13:20:08,360 - mini_batch_size: "8"
2023-10-13 13:20:08,360 - max_epochs: "10"
2023-10-13 13:20:08,360 - shuffle: "True"
2023-10-13 13:20:08,361 ----------------------------------------------------------------------------------------------------
2023-10-13 13:20:08,361 Plugins:
2023-10-13 13:20:08,361 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 13:20:08,361 ----------------------------------------------------------------------------------------------------
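Note: the `lr` column in the per-iteration lines that follow is consistent with a linear warmup over the first 10% of all optimizer steps followed by a linear decay to zero. The sketch below reproduces that shape from the logged parameters only. It is an illustration, not Flair's own scheduler code: the function name `linear_lr` and the exact step indexing are assumptions, and the logged values are rounded to six decimals, so an off-by-one in the step count would not be visible.

```python
import math

# Values copied from the "Training Params" and "Plugins" sections of this log.
TRAIN_SENTENCES = 3575
MINI_BATCH_SIZE = 8
MAX_EPOCHS = 10
PEAK_LR = 5e-05
WARMUP_FRACTION = 0.1

# 3575 sentences in batches of 8 give the 447 iterations per epoch seen in the log.
steps_per_epoch = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
total_steps = steps_per_epoch * MAX_EPOCHS
warmup_steps = round(WARMUP_FRACTION * total_steps)


def linear_lr(step: int) -> float:
    """Hypothetical helper: linear warmup to PEAK_LR, then linear decay to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR * max(0, total_steps - step) / (total_steps - warmup_steps)


# Compare a few points against the lr values printed in the log.
for epoch, iteration in [(1, 44), (1, 440), (2, 44), (10, 440)]:
    step = (epoch - 1) * steps_per_epoch + iteration
    print(f"epoch {epoch} - iter {iteration}/{steps_per_epoch} - lr: {linear_lr(step):.6f}")
```

With these assumptions the warmup ends after exactly one epoch (447 of 4470 steps), which is why the learning rate peaks at the end of epoch 1 and falls from epoch 2 onward.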
2023-10-13 13:20:08,361 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 13:20:08,361 - metric: "('micro avg', 'f1-score')"
2023-10-13 13:20:08,361 ----------------------------------------------------------------------------------------------------
2023-10-13 13:20:08,361 Computation:
2023-10-13 13:20:08,361 - compute on device: cuda:0
2023-10-13 13:20:08,361 - embedding storage: none
2023-10-13 13:20:08,361 ----------------------------------------------------------------------------------------------------
2023-10-13 13:20:08,361 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 13:20:08,361 ----------------------------------------------------------------------------------------------------
2023-10-13 13:20:08,361 ----------------------------------------------------------------------------------------------------
2023-10-13 13:20:11,185 epoch 1 - iter 44/447 - loss 2.98851820 - time (sec): 2.82 - samples/sec: 3102.17 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:20:14,004 epoch 1 - iter 88/447 - loss 1.95325123 - time (sec): 5.64 - samples/sec: 3148.92 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:20:16,621 epoch 1 - iter 132/447 - loss 1.49794150 - time (sec): 8.26 - samples/sec: 3114.70 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:20:19,630 epoch 1 - iter 176/447 - loss 1.20997650 - time (sec): 11.27 - samples/sec: 3062.86 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:20:22,366 epoch 1 - iter 220/447 - loss 1.03351243 - time (sec): 14.00 - samples/sec: 3050.14 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:20:25,109 epoch 1 - iter 264/447 - loss 0.91449018 - time (sec): 16.75 - samples/sec: 3041.16 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:20:27,854 epoch 1 - iter 308/447 - loss 0.82776650 - time (sec): 19.49 - samples/sec: 3038.53 - lr: 0.000034 - momentum: 0.000000
2023-10-13 13:20:30,594 epoch 1 - iter 352/447 - loss 0.75655567 - time (sec): 22.23 - samples/sec: 3042.01 - lr: 0.000039 - momentum: 0.000000
2023-10-13 13:20:33,213 epoch 1 - iter 396/447 - loss 0.69589232 - time (sec): 24.85 - samples/sec: 3043.96 - lr: 0.000044 - momentum: 0.000000
2023-10-13 13:20:36,387 epoch 1 - iter 440/447 - loss 0.64524193 - time (sec): 28.03 - samples/sec: 3041.44 - lr: 0.000049 - momentum: 0.000000
2023-10-13 13:20:36,800 ----------------------------------------------------------------------------------------------------
2023-10-13 13:20:36,800 EPOCH 1 done: loss 0.6389 - lr: 0.000049
2023-10-13 13:20:41,834 DEV : loss 0.17851579189300537 - f1-score (micro avg) 0.6619
2023-10-13 13:20:41,868 saving best model
2023-10-13 13:20:42,189 ----------------------------------------------------------------------------------------------------
2023-10-13 13:20:45,136 epoch 2 - iter 44/447 - loss 0.18786776 - time (sec): 2.95 - samples/sec: 3040.94 - lr: 0.000049 - momentum: 0.000000
2023-10-13 13:20:48,309 epoch 2 - iter 88/447 - loss 0.18005101 - time (sec): 6.12 - samples/sec: 3024.51 - lr: 0.000049 - momentum: 0.000000
2023-10-13 13:20:50,957 epoch 2 - iter 132/447 - loss 0.17172587 - time (sec): 8.77 - samples/sec: 2987.88 - lr: 0.000048 - momentum: 0.000000
2023-10-13 13:20:53,671 epoch 2 - iter 176/447 - loss 0.17670397 - time (sec): 11.48 - samples/sec: 3002.83 - lr: 0.000048 - momentum: 0.000000
2023-10-13 13:20:56,617 epoch 2 - iter 220/447 - loss 0.17152176 - time (sec): 14.43 - samples/sec: 2986.38 - lr: 0.000047 - momentum: 0.000000
2023-10-13 13:20:59,327 epoch 2 - iter 264/447 - loss 0.16193941 - time (sec): 17.14 - samples/sec: 3019.15 - lr: 0.000047 - momentum: 0.000000
2023-10-13 13:21:01,946 epoch 2 - iter 308/447 - loss 0.15929042 - time (sec): 19.75 - samples/sec: 3022.81 - lr: 0.000046 - momentum: 0.000000
2023-10-13 13:21:04,541 epoch 2 - iter 352/447 - loss 0.15834174 - time (sec): 22.35 - samples/sec: 3031.62 - lr: 0.000046 - momentum: 0.000000
2023-10-13 13:21:07,674 epoch 2 - iter 396/447 - loss 0.15381226 - time (sec): 25.48 - samples/sec: 3014.93 - lr: 0.000045 - momentum: 0.000000
2023-10-13 13:21:10,380 epoch 2 - iter 440/447 - loss 0.15235393 - time (sec): 28.19 - samples/sec: 3024.09 - lr: 0.000045 - momentum: 0.000000
2023-10-13 13:21:10,888 ----------------------------------------------------------------------------------------------------
2023-10-13 13:21:10,889 EPOCH 2 done: loss 0.1514 - lr: 0.000045
2023-10-13 13:21:19,373 DEV : loss 0.12778525054454803 - f1-score (micro avg) 0.6984
2023-10-13 13:21:19,406 saving best model
2023-10-13 13:21:19,820 ----------------------------------------------------------------------------------------------------
2023-10-13 13:21:22,656 epoch 3 - iter 44/447 - loss 0.10065484 - time (sec): 2.83 - samples/sec: 3018.94 - lr: 0.000044 - momentum: 0.000000
2023-10-13 13:21:25,786 epoch 3 - iter 88/447 - loss 0.08666742 - time (sec): 5.96 - samples/sec: 2992.75 - lr: 0.000043 - momentum: 0.000000
2023-10-13 13:21:28,598 epoch 3 - iter 132/447 - loss 0.08572550 - time (sec): 8.78 - samples/sec: 2986.73 - lr: 0.000043 - momentum: 0.000000
2023-10-13 13:21:31,298 epoch 3 - iter 176/447 - loss 0.08491542 - time (sec): 11.48 - samples/sec: 3003.02 - lr: 0.000042 - momentum: 0.000000
2023-10-13 13:21:33,912 epoch 3 - iter 220/447 - loss 0.08321697 - time (sec): 14.09 - samples/sec: 2985.23 - lr: 0.000042 - momentum: 0.000000
2023-10-13 13:21:36,687 epoch 3 - iter 264/447 - loss 0.08312643 - time (sec): 16.87 - samples/sec: 2997.64 - lr: 0.000041 - momentum: 0.000000
2023-10-13 13:21:39,389 epoch 3 - iter 308/447 - loss 0.08303934 - time (sec): 19.57 - samples/sec: 2999.67 - lr: 0.000041 - momentum: 0.000000
2023-10-13 13:21:42,237 epoch 3 - iter 352/447 - loss 0.07934551 - time (sec): 22.41 - samples/sec: 3012.75 - lr: 0.000040 - momentum: 0.000000
2023-10-13 13:21:44,842 epoch 3 - iter 396/447 - loss 0.08105453 - time (sec): 25.02 - samples/sec: 3030.91 - lr: 0.000040 - momentum: 0.000000
2023-10-13 13:21:47,987 epoch 3 - iter 440/447 - loss 0.08108549 - time (sec): 28.16 - samples/sec: 3028.88 - lr: 0.000039 - momentum: 0.000000
2023-10-13 13:21:48,390 ----------------------------------------------------------------------------------------------------
2023-10-13 13:21:48,390 EPOCH 3 done: loss 0.0808 - lr: 0.000039
2023-10-13 13:21:56,816 DEV : loss 0.13502156734466553 - f1-score (micro avg) 0.7328
2023-10-13 13:21:56,851 saving best model
2023-10-13 13:21:57,269 ----------------------------------------------------------------------------------------------------
2023-10-13 13:21:59,999 epoch 4 - iter 44/447 - loss 0.04201711 - time (sec): 2.72 - samples/sec: 3098.25 - lr: 0.000038 - momentum: 0.000000
2023-10-13 13:22:02,677 epoch 4 - iter 88/447 - loss 0.05222326 - time (sec): 5.40 - samples/sec: 3037.66 - lr: 0.000038 - momentum: 0.000000
2023-10-13 13:22:05,495 epoch 4 - iter 132/447 - loss 0.04940786 - time (sec): 8.22 - samples/sec: 3037.73 - lr: 0.000037 - momentum: 0.000000
2023-10-13 13:22:08,212 epoch 4 - iter 176/447 - loss 0.04414691 - time (sec): 10.94 - samples/sec: 3046.88 - lr: 0.000037 - momentum: 0.000000
2023-10-13 13:22:11,658 epoch 4 - iter 220/447 - loss 0.04915488 - time (sec): 14.38 - samples/sec: 3015.37 - lr: 0.000036 - momentum: 0.000000
2023-10-13 13:22:14,498 epoch 4 - iter 264/447 - loss 0.05177181 - time (sec): 17.22 - samples/sec: 3015.62 - lr: 0.000036 - momentum: 0.000000
2023-10-13 13:22:17,164 epoch 4 - iter 308/447 - loss 0.04997384 - time (sec): 19.89 - samples/sec: 3006.17 - lr: 0.000035 - momentum: 0.000000
2023-10-13 13:22:19,895 epoch 4 - iter 352/447 - loss 0.05023042 - time (sec): 22.62 - samples/sec: 2998.23 - lr: 0.000035 - momentum: 0.000000
2023-10-13 13:22:23,206 epoch 4 - iter 396/447 - loss 0.05174515 - time (sec): 25.93 - samples/sec: 2981.06 - lr: 0.000034 - momentum: 0.000000
2023-10-13 13:22:25,978 epoch 4 - iter 440/447 - loss 0.05238596 - time (sec): 28.70 - samples/sec: 2969.31 - lr: 0.000033 - momentum: 0.000000
2023-10-13 13:22:26,430 ----------------------------------------------------------------------------------------------------
2023-10-13 13:22:26,431 EPOCH 4 done: loss 0.0529 - lr: 0.000033
2023-10-13 13:22:35,016 DEV : loss 0.15478116273880005 - f1-score (micro avg) 0.7741
2023-10-13 13:22:35,048 saving best model
2023-10-13 13:22:35,500 ----------------------------------------------------------------------------------------------------
2023-10-13 13:22:38,504 epoch 5 - iter 44/447 - loss 0.03979897 - time (sec): 3.00 - samples/sec: 2988.15 - lr: 0.000033 - momentum: 0.000000
2023-10-13 13:22:41,411 epoch 5 - iter 88/447 - loss 0.03334224 - time (sec): 5.91 - samples/sec: 2922.69 - lr: 0.000032 - momentum: 0.000000
2023-10-13 13:22:44,290 epoch 5 - iter 132/447 - loss 0.03321982 - time (sec): 8.79 - samples/sec: 2971.55 - lr: 0.000032 - momentum: 0.000000
2023-10-13 13:22:47,083 epoch 5 - iter 176/447 - loss 0.03182161 - time (sec): 11.58 - samples/sec: 2994.52 - lr: 0.000031 - momentum: 0.000000
2023-10-13 13:22:49,725 epoch 5 - iter 220/447 - loss 0.03370102 - time (sec): 14.22 - samples/sec: 2998.29 - lr: 0.000031 - momentum: 0.000000
2023-10-13 13:22:52,582 epoch 5 - iter 264/447 - loss 0.03501394 - time (sec): 17.08 - samples/sec: 3001.26 - lr: 0.000030 - momentum: 0.000000
2023-10-13 13:22:55,826 epoch 5 - iter 308/447 - loss 0.03844377 - time (sec): 20.32 - samples/sec: 2980.60 - lr: 0.000030 - momentum: 0.000000
2023-10-13 13:22:58,482 epoch 5 - iter 352/447 - loss 0.03808194 - time (sec): 22.98 - samples/sec: 2984.83 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:23:01,347 epoch 5 - iter 396/447 - loss 0.03629037 - time (sec): 25.85 - samples/sec: 2970.33 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:23:04,266 epoch 5 - iter 440/447 - loss 0.03561767 - time (sec): 28.76 - samples/sec: 2968.14 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:23:04,682 ----------------------------------------------------------------------------------------------------
2023-10-13 13:23:04,682 EPOCH 5 done: loss 0.0353 - lr: 0.000028
2023-10-13 13:23:13,184 DEV : loss 0.17760951817035675 - f1-score (micro avg) 0.7682
2023-10-13 13:23:13,217 ----------------------------------------------------------------------------------------------------
2023-10-13 13:23:16,042 epoch 6 - iter 44/447 - loss 0.01877968 - time (sec): 2.82 - samples/sec: 3048.07 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:23:18,942 epoch 6 - iter 88/447 - loss 0.01932285 - time (sec): 5.72 - samples/sec: 3077.70 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:23:21,625 epoch 6 - iter 132/447 - loss 0.02015927 - time (sec): 8.41 - samples/sec: 3088.96 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:23:24,907 epoch 6 - iter 176/447 - loss 0.01821601 - time (sec): 11.69 - samples/sec: 3065.39 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:23:27,771 epoch 6 - iter 220/447 - loss 0.01881791 - time (sec): 14.55 - samples/sec: 2979.48 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:23:30,488 epoch 6 - iter 264/447 - loss 0.01860310 - time (sec): 17.27 - samples/sec: 2980.78 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:23:33,366 epoch 6 - iter 308/447 - loss 0.01815398 - time (sec): 20.15 - samples/sec: 2976.57 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:23:36,065 epoch 6 - iter 352/447 - loss 0.01915966 - time (sec): 22.85 - samples/sec: 2977.83 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:23:38,865 epoch 6 - iter 396/447 - loss 0.01991320 - time (sec): 25.65 - samples/sec: 2999.07 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:23:41,624 epoch 6 - iter 440/447 - loss 0.02043759 - time (sec): 28.41 - samples/sec: 3003.11 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:23:42,034 ----------------------------------------------------------------------------------------------------
2023-10-13 13:23:42,034 EPOCH 6 done: loss 0.0205 - lr: 0.000022
2023-10-13 13:23:50,494 DEV : loss 0.21282611787319183 - f1-score (micro avg) 0.7583
2023-10-13 13:23:50,525 ----------------------------------------------------------------------------------------------------
2023-10-13 13:23:53,838 epoch 7 - iter 44/447 - loss 0.02198471 - time (sec): 3.31 - samples/sec: 2998.52 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:23:56,555 epoch 7 - iter 88/447 - loss 0.01806647 - time (sec): 6.03 - samples/sec: 2959.07 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:23:59,442 epoch 7 - iter 132/447 - loss 0.01433410 - time (sec): 8.92 - samples/sec: 2982.30 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:24:02,316 epoch 7 - iter 176/447 - loss 0.01416791 - time (sec): 11.79 - samples/sec: 3005.39 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:24:05,100 epoch 7 - iter 220/447 - loss 0.01395297 - time (sec): 14.57 - samples/sec: 3015.13 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:24:07,721 epoch 7 - iter 264/447 - loss 0.01425224 - time (sec): 17.19 - samples/sec: 3003.40 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:24:10,471 epoch 7 - iter 308/447 - loss 0.01599349 - time (sec): 19.95 - samples/sec: 3015.57 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:24:13,263 epoch 7 - iter 352/447 - loss 0.01639863 - time (sec): 22.74 - samples/sec: 3007.11 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:24:15,883 epoch 7 - iter 396/447 - loss 0.01654187 - time (sec): 25.36 - samples/sec: 3012.06 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:24:18,717 epoch 7 - iter 440/447 - loss 0.01632206 - time (sec): 28.19 - samples/sec: 3031.51 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:24:19,099 ----------------------------------------------------------------------------------------------------
2023-10-13 13:24:19,099 EPOCH 7 done: loss 0.0163 - lr: 0.000017
2023-10-13 13:24:27,956 DEV : loss 0.21246877312660217 - f1-score (micro avg) 0.7732
2023-10-13 13:24:27,990 ----------------------------------------------------------------------------------------------------
2023-10-13 13:24:30,849 epoch 8 - iter 44/447 - loss 0.01236522 - time (sec): 2.86 - samples/sec: 3006.90 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:24:33,776 epoch 8 - iter 88/447 - loss 0.01153258 - time (sec): 5.78 - samples/sec: 2962.65 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:24:36,431 epoch 8 - iter 132/447 - loss 0.01262017 - time (sec): 8.44 - samples/sec: 3003.64 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:24:39,104 epoch 8 - iter 176/447 - loss 0.01112667 - time (sec): 11.11 - samples/sec: 3015.10 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:24:41,911 epoch 8 - iter 220/447 - loss 0.00946565 - time (sec): 13.92 - samples/sec: 3002.28 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:24:44,625 epoch 8 - iter 264/447 - loss 0.00948303 - time (sec): 16.63 - samples/sec: 3018.28 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:24:47,483 epoch 8 - iter 308/447 - loss 0.00851173 - time (sec): 19.49 - samples/sec: 2998.77 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:24:50,628 epoch 8 - iter 352/447 - loss 0.00968723 - time (sec): 22.64 - samples/sec: 2987.84 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:24:53,754 epoch 8 - iter 396/447 - loss 0.00996054 - time (sec): 25.76 - samples/sec: 2977.54 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:24:56,483 epoch 8 - iter 440/447 - loss 0.00980498 - time (sec): 28.49 - samples/sec: 2985.89 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:24:56,967 ----------------------------------------------------------------------------------------------------
2023-10-13 13:24:56,967 EPOCH 8 done: loss 0.0096 - lr: 0.000011
2023-10-13 13:25:05,316 DEV : loss 0.21407929062843323 - f1-score (micro avg) 0.7841
2023-10-13 13:25:05,349 saving best model
2023-10-13 13:25:06,114 ----------------------------------------------------------------------------------------------------
2023-10-13 13:25:08,889 epoch 9 - iter 44/447 - loss 0.00542228 - time (sec): 2.77 - samples/sec: 2990.28 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:25:11,925 epoch 9 - iter 88/447 - loss 0.00463397 - time (sec): 5.81 - samples/sec: 2922.50 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:25:14,532 epoch 9 - iter 132/447 - loss 0.00602455 - time (sec): 8.42 - samples/sec: 2961.12 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:25:17,407 epoch 9 - iter 176/447 - loss 0.00746601 - time (sec): 11.29 - samples/sec: 2945.80 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:25:20,398 epoch 9 - iter 220/447 - loss 0.00670432 - time (sec): 14.28 - samples/sec: 2945.21 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:25:23,382 epoch 9 - iter 264/447 - loss 0.00608308 - time (sec): 17.27 - samples/sec: 2921.85 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:25:26,159 epoch 9 - iter 308/447 - loss 0.00653327 - time (sec): 20.04 - samples/sec: 2942.50 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:25:29,535 epoch 9 - iter 352/447 - loss 0.00606227 - time (sec): 23.42 - samples/sec: 2950.37 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:25:32,391 epoch 9 - iter 396/447 - loss 0.00639259 - time (sec): 26.28 - samples/sec: 2943.53 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:25:35,184 epoch 9 - iter 440/447 - loss 0.00614557 - time (sec): 29.07 - samples/sec: 2940.07 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:25:35,554 ----------------------------------------------------------------------------------------------------
2023-10-13 13:25:35,555 EPOCH 9 done: loss 0.0066 - lr: 0.000006
2023-10-13 13:25:43,820 DEV : loss 0.23063816130161285 - f1-score (micro avg) 0.781
2023-10-13 13:25:43,854 ----------------------------------------------------------------------------------------------------
2023-10-13 13:25:47,476 epoch 10 - iter 44/447 - loss 0.00098734 - time (sec): 3.62 - samples/sec: 2730.27 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:25:50,549 epoch 10 - iter 88/447 - loss 0.00181722 - time (sec): 6.69 - samples/sec: 2770.81 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:25:53,340 epoch 10 - iter 132/447 - loss 0.00316493 - time (sec): 9.48 - samples/sec: 2826.86 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:25:56,007 epoch 10 - iter 176/447 - loss 0.00376538 - time (sec): 12.15 - samples/sec: 2864.66 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:25:58,830 epoch 10 - iter 220/447 - loss 0.00318066 - time (sec): 14.97 - samples/sec: 2891.81 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:26:01,539 epoch 10 - iter 264/447 - loss 0.00282318 - time (sec): 17.68 - samples/sec: 2901.48 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:26:04,410 epoch 10 - iter 308/447 - loss 0.00291001 - time (sec): 20.55 - samples/sec: 2894.73 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:26:07,417 epoch 10 - iter 352/447 - loss 0.00276025 - time (sec): 23.56 - samples/sec: 2890.66 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:26:10,104 epoch 10 - iter 396/447 - loss 0.00297373 - time (sec): 26.25 - samples/sec: 2912.16 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:26:12,949 epoch 10 - iter 440/447 - loss 0.00306137 - time (sec): 29.09 - samples/sec: 2939.98 - lr: 0.000000 - momentum: 0.000000
2023-10-13 13:26:13,356 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:13,357 EPOCH 10 done: loss 0.0030 - lr: 0.000000
2023-10-13 13:26:21,613 DEV : loss 0.2350376695394516 - f1-score (micro avg) 0.7849
2023-10-13 13:26:21,645 saving best model
2023-10-13 13:26:22,417 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:22,419 Loading model from best epoch ...
2023-10-13 13:26:23,884 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
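Note: the 21 tags listed above match the `out_features=21` of the final linear layer in the model summary. They follow a BIOES-style scheme: one `O` tag plus four position prefixes for each of the five entity types. A minimal sketch that rebuilds the list in the logged order (the variable names are illustrative and not taken from Flair):

```python
# Entity types and position prefixes, in the order they appear in the log line.
ENTITY_TYPES = ["loc", "pers", "org", "prod", "time"]
PREFIXES = ["S", "B", "E", "I"]  # Single, Begin, End, Inside

# "O" marks tokens outside any entity; every type gets one tag per prefix.
tags = ["O"] + [f"{prefix}-{entity}" for entity in ENTITY_TYPES for prefix in PREFIXES]

print(len(tags))
print(", ".join(tags))
```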
2023-10-13 13:26:28,948
Results:
- F-score (micro) 0.7551
- F-score (macro) 0.6861
- Accuracy 0.6253
By class:
              precision    recall  f1-score   support

         loc     0.8361    0.8473    0.8417       596
        pers     0.6658    0.7778    0.7175       333
         org     0.5620    0.5152    0.5375       132
        prod     0.6333    0.5758    0.6032        66
        time     0.6909    0.7755    0.7308        49

   micro avg     0.7388    0.7721    0.7551      1176
   macro avg     0.6776    0.6983    0.6861      1176
weighted avg     0.7397    0.7721    0.7544      1176
2023-10-13 13:26:28,948 ----------------------------------------------------------------------------------------------------
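Note: the summary rows of the per-class table can be cross-checked from the class rows. The sketch below recomputes the macro and weighted F1 from the per-class values, and the micro F1 as the harmonic mean of the reported micro precision and recall. It works from the rounded four-decimal figures in the log, so agreement is only expected to about four decimal places. The data structure and variable names are illustrative.

```python
# (precision, recall, f1, support) per class, copied from the table in this log.
rows = {
    "loc": (0.8361, 0.8473, 0.8417, 596),
    "pers": (0.6658, 0.7778, 0.7175, 333),
    "org": (0.5620, 0.5152, 0.5375, 132),
    "prod": (0.6333, 0.5758, 0.6032, 66),
    "time": (0.6909, 0.7755, 0.7308, 49),
}

total_support = sum(support for _, _, _, support in rows.values())

# Macro average: unweighted mean over classes.
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)

# Weighted average: mean over classes weighted by support.
weighted_f1 = sum(f1 * support for _, _, f1, support in rows.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall from the log.
micro_precision, micro_recall = 0.7388, 0.7721
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"support {total_support}")
print(f"micro F1 {micro_f1:.4f}  macro F1 {macro_f1:.4f}  weighted F1 {weighted_f1:.4f}")
```

The gap between the micro and macro F1 reflects the class imbalance: `loc` and `pers` account for most of the 1176 test entities and score higher than the smaller `org` and `prod` classes.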