2023-10-17 16:56:44,723 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,725 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 16:56:44,725 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,725 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 16:56:44,725 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,725 Train: 3575 sentences
2023-10-17 16:56:44,725 (train_with_dev=False, train_with_test=False)
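Note: the architecture printed above (transformer word embeddings feeding a single linear tag head, no RNN, no CRF) together with the HIPE-2020 German corpus can be assembled roughly as follows. This is a minimal sketch, assuming the NER_HIPE_2022 loader of recent Flair versions; the checkpoint name and the layer/pooling/CRF settings are read off the base path logged further down, not from this printout.

from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# HIPE-2020 German split of the HIPE-2022 shared-task data
# (3575 train / 1235 dev / 1266 test sentences, as logged above).
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="de")

# 21 output tags: O plus BIOES-encoded loc, pers, org, prod, time.
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed from the base path
    layers="-1",                # last transformer layer only ("layers-1")
    subtoken_pooling="first",   # first-subtoken pooling ("poolingfirst")
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,            # unused without an RNN; default value
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_rnn=False,              # embeddings feed the linear layer directly, as in the printout
    use_crf=False,              # plain CrossEntropyLoss head ("crfFalse")
)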
2023-10-17 16:56:44,726 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,726 Training Params:
2023-10-17 16:56:44,726 - learning_rate: "5e-05"
2023-10-17 16:56:44,726 - mini_batch_size: "8"
2023-10-17 16:56:44,726 - max_epochs: "10"
2023-10-17 16:56:44,726 - shuffle: "True"
2023-10-17 16:56:44,726 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,726 Plugins:
2023-10-17 16:56:44,726 - TensorboardLogger
2023-10-17 16:56:44,726 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:56:44,726 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,726 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:56:44,726 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:56:44,726 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,726 Computation:
2023-10-17 16:56:44,727 - compute on device: cuda:0
2023-10-17 16:56:44,727 - embedding storage: none
2023-10-17 16:56:44,727 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,727 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 16:56:44,727 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,727 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,727 Logging anything other than scalars to TensorBoard is currently not supported.
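Note: the training parameters and the linear warmup schedule logged above correspond, roughly, to a ModelTrainer.fine_tune call continuing the construction sketch further up; the TensorboardLogger plugin is left out, and the base path is copied from the log.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# fine_tune() uses AdamW with a linear warmup/decay schedule; a warmup fraction of 0.1
# over 10 epochs of 447 iterations matches the lr ramp-up visible in epoch 1 below.
trainer.fine_tune(
    "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=5e-05,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
)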
2023-10-17 16:56:48,757 epoch 1 - iter 44/447 - loss 3.30809376 - time (sec): 4.03 - samples/sec: 1917.00 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:56:53,024 epoch 1 - iter 88/447 - loss 2.13499911 - time (sec): 8.30 - samples/sec: 2028.44 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:56:57,150 epoch 1 - iter 132/447 - loss 1.59167597 - time (sec): 12.42 - samples/sec: 2057.26 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:57:01,074 epoch 1 - iter 176/447 - loss 1.29369185 - time (sec): 16.35 - samples/sec: 2065.76 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:57:05,222 epoch 1 - iter 220/447 - loss 1.10953451 - time (sec): 20.49 - samples/sec: 2060.94 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:57:09,610 epoch 1 - iter 264/447 - loss 0.95129997 - time (sec): 24.88 - samples/sec: 2085.41 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:57:13,715 epoch 1 - iter 308/447 - loss 0.86138651 - time (sec): 28.99 - samples/sec: 2081.98 - lr: 0.000034 - momentum: 0.000000
2023-10-17 16:57:17,684 epoch 1 - iter 352/447 - loss 0.78885677 - time (sec): 32.96 - samples/sec: 2073.24 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:57:21,974 epoch 1 - iter 396/447 - loss 0.72376654 - time (sec): 37.25 - samples/sec: 2071.32 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:57:25,932 epoch 1 - iter 440/447 - loss 0.67845000 - time (sec): 41.20 - samples/sec: 2065.52 - lr: 0.000049 - momentum: 0.000000
2023-10-17 16:57:26,563 ----------------------------------------------------------------------------------------------------
2023-10-17 16:57:26,564 EPOCH 1 done: loss 0.6716 - lr: 0.000049
2023-10-17 16:57:33,391 DEV : loss 0.23900346457958221 - f1-score (micro avg) 0.6082
2023-10-17 16:57:33,446 saving best model
2023-10-17 16:57:33,993 ----------------------------------------------------------------------------------------------------
2023-10-17 16:57:38,068 epoch 2 - iter 44/447 - loss 0.16398010 - time (sec): 4.07 - samples/sec: 2093.32 - lr: 0.000049 - momentum: 0.000000
2023-10-17 16:57:42,038 epoch 2 - iter 88/447 - loss 0.15526367 - time (sec): 8.04 - samples/sec: 2099.43 - lr: 0.000049 - momentum: 0.000000
2023-10-17 16:57:46,047 epoch 2 - iter 132/447 - loss 0.14942996 - time (sec): 12.05 - samples/sec: 2041.00 - lr: 0.000048 - momentum: 0.000000
2023-10-17 16:57:49,992 epoch 2 - iter 176/447 - loss 0.15411946 - time (sec): 16.00 - samples/sec: 2007.81 - lr: 0.000048 - momentum: 0.000000
2023-10-17 16:57:54,245 epoch 2 - iter 220/447 - loss 0.15333183 - time (sec): 20.25 - samples/sec: 2048.44 - lr: 0.000047 - momentum: 0.000000
2023-10-17 16:57:58,661 epoch 2 - iter 264/447 - loss 0.15730108 - time (sec): 24.67 - samples/sec: 2054.37 - lr: 0.000047 - momentum: 0.000000
2023-10-17 16:58:02,647 epoch 2 - iter 308/447 - loss 0.15903483 - time (sec): 28.65 - samples/sec: 2059.87 - lr: 0.000046 - momentum: 0.000000
2023-10-17 16:58:06,891 epoch 2 - iter 352/447 - loss 0.15414908 - time (sec): 32.90 - samples/sec: 2065.39 - lr: 0.000046 - momentum: 0.000000
2023-10-17 16:58:11,277 epoch 2 - iter 396/447 - loss 0.14780656 - time (sec): 37.28 - samples/sec: 2074.68 - lr: 0.000045 - momentum: 0.000000
2023-10-17 16:58:15,283 epoch 2 - iter 440/447 - loss 0.14671942 - time (sec): 41.29 - samples/sec: 2065.99 - lr: 0.000045 - momentum: 0.000000
2023-10-17 16:58:15,903 ----------------------------------------------------------------------------------------------------
2023-10-17 16:58:15,903 EPOCH 2 done: loss 0.1463 - lr: 0.000045
2023-10-17 16:58:27,617 DEV : loss 0.13179324567317963 - f1-score (micro avg) 0.7139
2023-10-17 16:58:27,683 saving best model
2023-10-17 16:58:29,135 ----------------------------------------------------------------------------------------------------
2023-10-17 16:58:33,213 epoch 3 - iter 44/447 - loss 0.09154608 - time (sec): 4.07 - samples/sec: 2108.98 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:58:37,346 epoch 3 - iter 88/447 - loss 0.08126586 - time (sec): 8.21 - samples/sec: 2077.58 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:58:41,669 epoch 3 - iter 132/447 - loss 0.07757411 - time (sec): 12.53 - samples/sec: 2078.10 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:58:45,855 epoch 3 - iter 176/447 - loss 0.07876017 - time (sec): 16.72 - samples/sec: 2033.65 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:58:50,377 epoch 3 - iter 220/447 - loss 0.08044112 - time (sec): 21.24 - samples/sec: 2013.71 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:58:54,907 epoch 3 - iter 264/447 - loss 0.08197015 - time (sec): 25.77 - samples/sec: 2008.00 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:58:59,464 epoch 3 - iter 308/447 - loss 0.07876519 - time (sec): 30.33 - samples/sec: 1981.62 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:59:03,614 epoch 3 - iter 352/447 - loss 0.07855959 - time (sec): 34.47 - samples/sec: 1988.29 - lr: 0.000040 - momentum: 0.000000
2023-10-17 16:59:07,970 epoch 3 - iter 396/447 - loss 0.07891854 - time (sec): 38.83 - samples/sec: 1994.34 - lr: 0.000040 - momentum: 0.000000
2023-10-17 16:59:12,029 epoch 3 - iter 440/447 - loss 0.07888207 - time (sec): 42.89 - samples/sec: 1990.43 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:59:12,673 ----------------------------------------------------------------------------------------------------
2023-10-17 16:59:12,673 EPOCH 3 done: loss 0.0788 - lr: 0.000039
2023-10-17 16:59:24,476 DEV : loss 0.1531600058078766 - f1-score (micro avg) 0.7569
2023-10-17 16:59:24,536 saving best model
2023-10-17 16:59:25,929 ----------------------------------------------------------------------------------------------------
2023-10-17 16:59:30,444 epoch 4 - iter 44/447 - loss 0.05990645 - time (sec): 4.51 - samples/sec: 1989.46 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:59:34,512 epoch 4 - iter 88/447 - loss 0.04887083 - time (sec): 8.58 - samples/sec: 2018.74 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:59:38,671 epoch 4 - iter 132/447 - loss 0.04855191 - time (sec): 12.74 - samples/sec: 2033.66 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:59:42,795 epoch 4 - iter 176/447 - loss 0.05102939 - time (sec): 16.86 - samples/sec: 2019.73 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:59:46,890 epoch 4 - iter 220/447 - loss 0.05332854 - time (sec): 20.96 - samples/sec: 2033.26 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:59:51,075 epoch 4 - iter 264/447 - loss 0.05472661 - time (sec): 25.14 - samples/sec: 2039.55 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:59:55,185 epoch 4 - iter 308/447 - loss 0.05514047 - time (sec): 29.25 - samples/sec: 2032.86 - lr: 0.000035 - momentum: 0.000000
2023-10-17 16:59:59,158 epoch 4 - iter 352/447 - loss 0.05271247 - time (sec): 33.22 - samples/sec: 2030.20 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:00:03,723 epoch 4 - iter 396/447 - loss 0.05290971 - time (sec): 37.79 - samples/sec: 2033.66 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:00:07,795 epoch 4 - iter 440/447 - loss 0.05333484 - time (sec): 41.86 - samples/sec: 2033.09 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:00:08,446 ----------------------------------------------------------------------------------------------------
2023-10-17 17:00:08,446 EPOCH 4 done: loss 0.0527 - lr: 0.000033
2023-10-17 17:00:19,058 DEV : loss 0.1661703735589981 - f1-score (micro avg) 0.7689
2023-10-17 17:00:19,121 saving best model
2023-10-17 17:00:19,717 ----------------------------------------------------------------------------------------------------
2023-10-17 17:00:23,641 epoch 5 - iter 44/447 - loss 0.02390067 - time (sec): 3.92 - samples/sec: 2091.95 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:00:27,667 epoch 5 - iter 88/447 - loss 0.03146218 - time (sec): 7.95 - samples/sec: 2154.91 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:00:32,534 epoch 5 - iter 132/447 - loss 0.03162818 - time (sec): 12.81 - samples/sec: 2086.40 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:00:36,393 epoch 5 - iter 176/447 - loss 0.02971463 - time (sec): 16.67 - samples/sec: 2064.86 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:00:40,741 epoch 5 - iter 220/447 - loss 0.02941788 - time (sec): 21.02 - samples/sec: 2073.01 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:00:45,054 epoch 5 - iter 264/447 - loss 0.03197842 - time (sec): 25.34 - samples/sec: 2059.58 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:00:49,175 epoch 5 - iter 308/447 - loss 0.03204083 - time (sec): 29.46 - samples/sec: 2046.51 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:00:53,272 epoch 5 - iter 352/447 - loss 0.03206859 - time (sec): 33.55 - samples/sec: 2034.85 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:00:57,581 epoch 5 - iter 396/447 - loss 0.03511202 - time (sec): 37.86 - samples/sec: 2033.68 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:01:01,561 epoch 5 - iter 440/447 - loss 0.03540433 - time (sec): 41.84 - samples/sec: 2035.72 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:01:02,212 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:02,212 EPOCH 5 done: loss 0.0351 - lr: 0.000028
2023-10-17 17:01:12,880 DEV : loss 0.19611674547195435 - f1-score (micro avg) 0.7828
2023-10-17 17:01:12,937 saving best model
2023-10-17 17:01:14,312 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:18,523 epoch 6 - iter 44/447 - loss 0.01715331 - time (sec): 4.21 - samples/sec: 2090.26 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:01:23,111 epoch 6 - iter 88/447 - loss 0.01976064 - time (sec): 8.79 - samples/sec: 2081.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:01:27,390 epoch 6 - iter 132/447 - loss 0.02660722 - time (sec): 13.07 - samples/sec: 2041.65 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:01:31,491 epoch 6 - iter 176/447 - loss 0.02603617 - time (sec): 17.17 - samples/sec: 1997.37 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:01:35,489 epoch 6 - iter 220/447 - loss 0.02428162 - time (sec): 21.17 - samples/sec: 1953.34 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:01:39,549 epoch 6 - iter 264/447 - loss 0.02414744 - time (sec): 25.23 - samples/sec: 1973.76 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:01:43,891 epoch 6 - iter 308/447 - loss 0.02596565 - time (sec): 29.57 - samples/sec: 1999.62 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:01:48,147 epoch 6 - iter 352/447 - loss 0.02425663 - time (sec): 33.83 - samples/sec: 1995.53 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:01:52,679 epoch 6 - iter 396/447 - loss 0.02469203 - time (sec): 38.36 - samples/sec: 2010.34 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:01:56,788 epoch 6 - iter 440/447 - loss 0.02423347 - time (sec): 42.47 - samples/sec: 2013.25 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:01:57,424 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:57,424 EPOCH 6 done: loss 0.0248 - lr: 0.000022
2023-10-17 17:02:09,220 DEV : loss 0.23869967460632324 - f1-score (micro avg) 0.7879
2023-10-17 17:02:09,276 saving best model
2023-10-17 17:02:10,678 ----------------------------------------------------------------------------------------------------
2023-10-17 17:02:14,967 epoch 7 - iter 44/447 - loss 0.00914659 - time (sec): 4.28 - samples/sec: 2158.31 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:02:19,119 epoch 7 - iter 88/447 - loss 0.01155494 - time (sec): 8.44 - samples/sec: 2041.34 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:02:23,516 epoch 7 - iter 132/447 - loss 0.01063806 - time (sec): 12.83 - samples/sec: 2017.80 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:02:27,519 epoch 7 - iter 176/447 - loss 0.01089153 - time (sec): 16.84 - samples/sec: 2006.83 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:02:31,541 epoch 7 - iter 220/447 - loss 0.01611888 - time (sec): 20.86 - samples/sec: 2003.44 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:02:35,594 epoch 7 - iter 264/447 - loss 0.01739927 - time (sec): 24.91 - samples/sec: 2005.94 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:02:39,858 epoch 7 - iter 308/447 - loss 0.01620851 - time (sec): 29.18 - samples/sec: 2006.53 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:02:43,914 epoch 7 - iter 352/447 - loss 0.01519289 - time (sec): 33.23 - samples/sec: 2023.06 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:02:47,996 epoch 7 - iter 396/447 - loss 0.01501461 - time (sec): 37.31 - samples/sec: 2029.97 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:02:52,301 epoch 7 - iter 440/447 - loss 0.01504242 - time (sec): 41.62 - samples/sec: 2043.92 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:02:53,038 ----------------------------------------------------------------------------------------------------
2023-10-17 17:02:53,038 EPOCH 7 done: loss 0.0150 - lr: 0.000017
2023-10-17 17:03:04,279 DEV : loss 0.235974982380867 - f1-score (micro avg) 0.7969
2023-10-17 17:03:04,335 saving best model
2023-10-17 17:03:05,741 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:09,994 epoch 8 - iter 44/447 - loss 0.00261419 - time (sec): 4.25 - samples/sec: 2028.93 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:03:14,169 epoch 8 - iter 88/447 - loss 0.00424733 - time (sec): 8.42 - samples/sec: 2005.72 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:03:18,279 epoch 8 - iter 132/447 - loss 0.00574360 - time (sec): 12.53 - samples/sec: 2043.17 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:03:22,478 epoch 8 - iter 176/447 - loss 0.00668216 - time (sec): 16.73 - samples/sec: 2019.77 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:03:26,679 epoch 8 - iter 220/447 - loss 0.00675792 - time (sec): 20.93 - samples/sec: 2005.06 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:03:30,956 epoch 8 - iter 264/447 - loss 0.00747790 - time (sec): 25.21 - samples/sec: 2017.30 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:03:35,163 epoch 8 - iter 308/447 - loss 0.00697616 - time (sec): 29.42 - samples/sec: 2020.15 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:03:39,472 epoch 8 - iter 352/447 - loss 0.00788368 - time (sec): 33.73 - samples/sec: 2006.34 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:03:43,711 epoch 8 - iter 396/447 - loss 0.00799914 - time (sec): 37.97 - samples/sec: 2010.70 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:03:48,109 epoch 8 - iter 440/447 - loss 0.00821087 - time (sec): 42.36 - samples/sec: 2011.44 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:03:48,746 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:48,746 EPOCH 8 done: loss 0.0081 - lr: 0.000011
2023-10-17 17:04:00,423 DEV : loss 0.25937923789024353 - f1-score (micro avg) 0.7981
2023-10-17 17:04:00,486 saving best model
2023-10-17 17:04:01,884 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:06,305 epoch 9 - iter 44/447 - loss 0.00723446 - time (sec): 4.42 - samples/sec: 2026.19 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:04:10,864 epoch 9 - iter 88/447 - loss 0.00487931 - time (sec): 8.98 - samples/sec: 2114.36 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:04:14,962 epoch 9 - iter 132/447 - loss 0.00554836 - time (sec): 13.07 - samples/sec: 2079.81 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:04:19,149 epoch 9 - iter 176/447 - loss 0.00594698 - time (sec): 17.26 - samples/sec: 2043.96 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:04:23,564 epoch 9 - iter 220/447 - loss 0.00618049 - time (sec): 21.68 - samples/sec: 2039.55 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:04:27,826 epoch 9 - iter 264/447 - loss 0.00575274 - time (sec): 25.94 - samples/sec: 2044.86 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:04:31,919 epoch 9 - iter 308/447 - loss 0.00511891 - time (sec): 30.03 - samples/sec: 2041.76 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:04:35,941 epoch 9 - iter 352/447 - loss 0.00557735 - time (sec): 34.05 - samples/sec: 2033.36 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:04:39,858 epoch 9 - iter 396/447 - loss 0.00624259 - time (sec): 37.97 - samples/sec: 2034.83 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:04:43,841 epoch 9 - iter 440/447 - loss 0.00613617 - time (sec): 41.95 - samples/sec: 2035.35 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:04:44,441 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:44,441 EPOCH 9 done: loss 0.0061 - lr: 0.000006
2023-10-17 17:04:55,640 DEV : loss 0.25230181217193604 - f1-score (micro avg) 0.8013
2023-10-17 17:04:55,695 saving best model
2023-10-17 17:04:57,150 ----------------------------------------------------------------------------------------------------
2023-10-17 17:05:01,029 epoch 10 - iter 44/447 - loss 0.00272054 - time (sec): 3.88 - samples/sec: 2204.05 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:05:05,159 epoch 10 - iter 88/447 - loss 0.00268195 - time (sec): 8.01 - samples/sec: 2094.47 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:05:08,962 epoch 10 - iter 132/447 - loss 0.00252588 - time (sec): 11.81 - samples/sec: 2083.97 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:05:13,241 epoch 10 - iter 176/447 - loss 0.00294056 - time (sec): 16.09 - samples/sec: 2082.48 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:05:17,614 epoch 10 - iter 220/447 - loss 0.00255976 - time (sec): 20.46 - samples/sec: 2064.85 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:05:22,162 epoch 10 - iter 264/447 - loss 0.00364326 - time (sec): 25.01 - samples/sec: 2055.52 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:05:26,085 epoch 10 - iter 308/447 - loss 0.00413231 - time (sec): 28.93 - samples/sec: 2051.16 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:05:30,151 epoch 10 - iter 352/447 - loss 0.00402208 - time (sec): 33.00 - samples/sec: 2070.54 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:05:34,136 epoch 10 - iter 396/447 - loss 0.00401550 - time (sec): 36.98 - samples/sec: 2073.31 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:05:38,349 epoch 10 - iter 440/447 - loss 0.00394541 - time (sec): 41.20 - samples/sec: 2067.87 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:05:39,001 ----------------------------------------------------------------------------------------------------
2023-10-17 17:05:39,001 EPOCH 10 done: loss 0.0039 - lr: 0.000000
2023-10-17 17:05:50,065 DEV : loss 0.25091075897216797 - f1-score (micro avg) 0.8008
2023-10-17 17:05:50,781 ----------------------------------------------------------------------------------------------------
2023-10-17 17:05:50,783 Loading model from best epoch ...
2023-10-17 17:05:53,439 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-17 17:06:00,784
Results:
- F-score (micro) 0.769
- F-score (macro) 0.691
- Accuracy 0.645
By class:
              precision    recall  f1-score   support

         loc     0.8699    0.8523    0.8610       596
        pers     0.7037    0.7988    0.7482       333
         org     0.5156    0.5000    0.5077       132
        prod     0.5806    0.5455    0.5625        66
        time     0.7755    0.7755    0.7755        49

   micro avg     0.7610    0.7772    0.7690      1176
   macro avg     0.6891    0.6944    0.6910      1176
weighted avg     0.7629    0.7772    0.7691      1176
2023-10-17 17:06:00,784 ----------------------------------------------------------------------------------------------------
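Note: standard Flair usage for the final artifact is to load best-model.pt from the base path above (here saved at epoch 9) and tag a sentence; the example sentence is illustrative only.

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint saved as "best model" during training.
tagger = SequenceTagger.load(
    "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

sentence = Sentence("Der Kongress fand in Zürich statt .")
tagger.predict(sentence)

# Predicted spans use the loc/pers/org/prod/time label set listed above.
for entity in sentence.get_spans("ner"):
    print(entity)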