2023-10-17 16:47:12,055 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,057 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:47:12,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,057 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 16:47:12,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,057 Train: 3575 sentences
2023-10-17 16:47:12,057 (train_with_dev=False, train_with_test=False)
2023-10-17 16:47:12,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,058 Training Params:
2023-10-17 16:47:12,058 - learning_rate: "3e-05"
2023-10-17 16:47:12,058 - mini_batch_size: "8"
2023-10-17 16:47:12,058 - max_epochs: "10"
2023-10-17 16:47:12,058 - shuffle: "True"
2023-10-17 16:47:12,058 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,058 Plugins:
2023-10-17 16:47:12,058 - TensorboardLogger
2023-10-17 16:47:12,058 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:47:12,058 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,058 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:47:12,058 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:47:12,058 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,059 Computation:
2023-10-17 16:47:12,059 - compute on device: cuda:0
2023-10-17 16:47:12,059 - embedding storage: none
2023-10-17 16:47:12,059 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,059 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 16:47:12,059 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,059 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,059 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 16:47:16,186 epoch 1 - iter 44/447 - loss 3.49661378 - time (sec): 4.13 - samples/sec: 1872.19 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:47:20,660 epoch 1 - iter 88/447 - loss 2.61631042 - time (sec): 8.60 - samples/sec: 1956.66 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:47:24,936 epoch 1 - iter 132/447 - loss 1.93352666 - time (sec): 12.88 - samples/sec: 1984.67 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:47:28,939 epoch 1 - iter 176/447 - loss 1.57530896 - time (sec): 16.88 - samples/sec: 2000.58 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:47:33,220 epoch 1 - iter 220/447 - loss 1.34725282 - time (sec): 21.16 - samples/sec: 1996.09 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:47:37,715 epoch 1 - iter 264/447 - loss 1.15393413 - time (sec): 25.65 - samples/sec: 2022.57 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:47:41,873 epoch 1 - iter 308/447 - loss 1.04226944 - time (sec): 29.81 - samples/sec: 2024.30 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:47:45,852 epoch 1 - iter 352/447 - loss 0.95292814 - time (sec): 33.79 - samples/sec: 2021.95 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:47:50,370 epoch 1 - iter 396/447 - loss 0.87278568 - time (sec): 38.31 - samples/sec: 2013.81 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:47:54,446 epoch 1 - iter 440/447 - loss 0.81583916 - time (sec): 42.39 - samples/sec: 2007.91 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:47:55,074 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:55,075 EPOCH 1 done: loss 0.8052 - lr: 0.000029
2023-10-17 16:48:01,442 DEV : loss 0.18609091639518738 - f1-score (micro avg) 0.62
2023-10-17 16:48:01,495 saving best model
2023-10-17 16:48:02,032 ----------------------------------------------------------------------------------------------------
2023-10-17 16:48:06,103 epoch 2 - iter 44/447 - loss 0.17646139 - time (sec): 4.07 - samples/sec: 2094.88 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:48:10,127 epoch 2 - iter 88/447 - loss 0.17558676 - time (sec): 8.09 - samples/sec: 2086.39 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:48:14,170 epoch 2 - iter 132/447 - loss 0.16683960 - time (sec): 12.14 - samples/sec: 2026.77 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:48:18,156 epoch 2 - iter 176/447 - loss 0.16952362 - time (sec): 16.12 - samples/sec: 1992.29 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:48:22,496 epoch 2 - iter 220/447 - loss 0.16761727 - time (sec): 20.46 - samples/sec: 2027.06 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:48:26,893 epoch 2 - iter 264/447 - loss 0.17109456 - time (sec): 24.86 - samples/sec: 2038.35 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:48:30,902 epoch 2 - iter 308/447 - loss 0.17214782 - time (sec): 28.87 - samples/sec: 2044.42 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:48:35,157 epoch 2 - iter 352/447 - loss 0.16542676 - time (sec): 33.12 - samples/sec: 2051.24 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:48:39,571 epoch 2 - iter 396/447 - loss 0.15856207 - time (sec): 37.54 - samples/sec: 2060.57 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:48:43,542 epoch 2 - iter 440/447 - loss 0.15666011 - time (sec): 41.51 - samples/sec: 2055.00 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:48:44,156 ----------------------------------------------------------------------------------------------------
2023-10-17 16:48:44,156 EPOCH 2 done: loss 0.1564 - lr: 0.000027
2023-10-17 16:48:55,101 DEV : loss 0.11850441992282867 - f1-score (micro avg) 0.7004
2023-10-17 16:48:55,153 saving best model
2023-10-17 16:48:56,532 ----------------------------------------------------------------------------------------------------
2023-10-17 16:49:00,603 epoch 3 - iter 44/447 - loss 0.08474878 - time (sec): 4.07 - samples/sec: 2113.03 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:49:04,693 epoch 3 - iter 88/447 - loss 0.08245773 - time (sec): 8.16 - samples/sec: 2090.39 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:49:09,017 epoch 3 - iter 132/447 - loss 0.08517300 - time (sec): 12.48 - samples/sec: 2086.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:49:13,016 epoch 3 - iter 176/447 - loss 0.08575584 - time (sec): 16.48 - samples/sec: 2062.81 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:49:17,520 epoch 3 - iter 220/447 - loss 0.08751252 - time (sec): 20.98 - samples/sec: 2038.05 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:49:22,077 epoch 3 - iter 264/447 - loss 0.08907774 - time (sec): 25.54 - samples/sec: 2025.88 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:49:26,176 epoch 3 - iter 308/447 - loss 0.08626226 - time (sec): 29.64 - samples/sec: 2027.47 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:49:30,200 epoch 3 - iter 352/447 - loss 0.08544823 - time (sec): 33.66 - samples/sec: 2036.17 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:49:34,477 epoch 3 - iter 396/447 - loss 0.08472916 - time (sec): 37.94 - samples/sec: 2041.16 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:49:38,389 epoch 3 - iter 440/447 - loss 0.08337869 - time (sec): 41.85 - samples/sec: 2039.72 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:49:39,011 ----------------------------------------------------------------------------------------------------
2023-10-17 16:49:39,011 EPOCH 3 done: loss 0.0831 - lr: 0.000023
2023-10-17 16:49:50,287 DEV : loss 0.16098077595233917 - f1-score (micro avg) 0.7368
2023-10-17 16:49:50,342 saving best model
2023-10-17 16:49:51,747 ----------------------------------------------------------------------------------------------------
2023-10-17 16:49:56,121 epoch 4 - iter 44/447 - loss 0.06080484 - time (sec): 4.37 - samples/sec: 2053.56 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:50:00,102 epoch 4 - iter 88/447 - loss 0.05047837 - time (sec): 8.35 - samples/sec: 2073.72 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:50:04,214 epoch 4 - iter 132/447 - loss 0.04822910 - time (sec): 12.46 - samples/sec: 2078.43 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:50:08,227 epoch 4 - iter 176/447 - loss 0.05146065 - time (sec): 16.48 - samples/sec: 2067.12 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:50:12,222 epoch 4 - iter 220/447 - loss 0.05457535 - time (sec): 20.47 - samples/sec: 2081.48 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:50:16,289 epoch 4 - iter 264/447 - loss 0.05684864 - time (sec): 24.54 - samples/sec: 2089.74 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:50:20,515 epoch 4 - iter 308/447 - loss 0.05629000 - time (sec): 28.76 - samples/sec: 2067.32 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:50:24,525 epoch 4 - iter 352/447 - loss 0.05515985 - time (sec): 32.77 - samples/sec: 2058.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:50:29,074 epoch 4 - iter 396/447 - loss 0.05406735 - time (sec): 37.32 - samples/sec: 2059.09 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:50:33,156 epoch 4 - iter 440/447 - loss 0.05404876 - time (sec): 41.40 - samples/sec: 2055.53 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:50:33,828 ----------------------------------------------------------------------------------------------------
2023-10-17 16:50:33,829 EPOCH 4 done: loss 0.0537 - lr: 0.000020
2023-10-17 16:50:44,774 DEV : loss 0.17118440568447113 - f1-score (micro avg) 0.7738
2023-10-17 16:50:44,823 saving best model
2023-10-17 16:50:46,179 ----------------------------------------------------------------------------------------------------
2023-10-17 16:50:50,043 epoch 5 - iter 44/447 - loss 0.02823819 - time (sec): 3.86 - samples/sec: 2125.42 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:50:54,046 epoch 5 - iter 88/447 - loss 0.02974272 - time (sec): 7.86 - samples/sec: 2178.10 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:50:58,481 epoch 5 - iter 132/447 - loss 0.03274366 - time (sec): 12.30 - samples/sec: 2173.97 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:51:02,421 epoch 5 - iter 176/447 - loss 0.03126456 - time (sec): 16.24 - samples/sec: 2120.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:51:06,701 epoch 5 - iter 220/447 - loss 0.02862125 - time (sec): 20.52 - samples/sec: 2123.83 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:51:10,991 epoch 5 - iter 264/447 - loss 0.02862398 - time (sec): 24.81 - samples/sec: 2103.31 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:51:15,112 epoch 5 - iter 308/447 - loss 0.02844421 - time (sec): 28.93 - samples/sec: 2083.75 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:51:19,138 epoch 5 - iter 352/447 - loss 0.02956228 - time (sec): 32.96 - samples/sec: 2071.76 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:51:23,488 epoch 5 - iter 396/447 - loss 0.03284724 - time (sec): 37.31 - samples/sec: 2064.00 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:51:27,485 epoch 5 - iter 440/447 - loss 0.03180610 - time (sec): 41.30 - samples/sec: 2062.28 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:51:28,153 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:28,154 EPOCH 5 done: loss 0.0315 - lr: 0.000017
2023-10-17 16:51:39,205 DEV : loss 0.17024928331375122 - f1-score (micro avg) 0.7777
2023-10-17 16:51:39,267 saving best model
2023-10-17 16:51:40,718 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:45,030 epoch 6 - iter 44/447 - loss 0.02203100 - time (sec): 4.31 - samples/sec: 2041.04 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:51:49,781 epoch 6 - iter 88/447 - loss 0.01920241 - time (sec): 9.06 - samples/sec: 2020.90 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:51:54,204 epoch 6 - iter 132/447 - loss 0.02226912 - time (sec): 13.48 - samples/sec: 1979.81 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:51:58,231 epoch 6 - iter 176/447 - loss 0.02378920 - time (sec): 17.51 - samples/sec: 1959.27 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:52:02,125 epoch 6 - iter 220/447 - loss 0.02333195 - time (sec): 21.40 - samples/sec: 1932.28 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:52:06,311 epoch 6 - iter 264/447 - loss 0.02310408 - time (sec): 25.59 - samples/sec: 1946.29 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:52:10,568 epoch 6 - iter 308/447 - loss 0.02234226 - time (sec): 29.85 - samples/sec: 1981.40 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:52:14,740 epoch 6 - iter 352/447 - loss 0.02289841 - time (sec): 34.02 - samples/sec: 1984.53 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:52:19,543 epoch 6 - iter 396/447 - loss 0.02189250 - time (sec): 38.82 - samples/sec: 1986.59 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:52:23,522 epoch 6 - iter 440/447 - loss 0.02187698 - time (sec): 42.80 - samples/sec: 1997.83 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:52:24,166 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:24,167 EPOCH 6 done: loss 0.0230 - lr: 0.000013
2023-10-17 16:52:34,982 DEV : loss 0.19983802735805511 - f1-score (micro avg) 0.803
2023-10-17 16:52:35,059 saving best model
2023-10-17 16:52:36,497 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:40,921 epoch 7 - iter 44/447 - loss 0.00850187 - time (sec): 4.42 - samples/sec: 2092.53 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:52:45,097 epoch 7 - iter 88/447 - loss 0.01184907 - time (sec): 8.60 - samples/sec: 2003.34 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:52:49,559 epoch 7 - iter 132/447 - loss 0.01008429 - time (sec): 13.06 - samples/sec: 1983.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:52:53,577 epoch 7 - iter 176/447 - loss 0.01116324 - time (sec): 17.08 - samples/sec: 1978.81 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:52:57,658 epoch 7 - iter 220/447 - loss 0.01311415 - time (sec): 21.16 - samples/sec: 1975.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:53:01,902 epoch 7 - iter 264/447 - loss 0.01472328 - time (sec): 25.40 - samples/sec: 1967.29 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:53:06,100 epoch 7 - iter 308/447 - loss 0.01385064 - time (sec): 29.60 - samples/sec: 1977.81 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:53:10,128 epoch 7 - iter 352/447 - loss 0.01331558 - time (sec): 33.63 - samples/sec: 1999.23 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:53:14,178 epoch 7 - iter 396/447 - loss 0.01397777 - time (sec): 37.68 - samples/sec: 2010.39 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:53:18,463 epoch 7 - iter 440/447 - loss 0.01368698 - time (sec): 41.96 - samples/sec: 2027.17 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:53:19,188 ----------------------------------------------------------------------------------------------------
2023-10-17 16:53:19,188 EPOCH 7 done: loss 0.0135 - lr: 0.000010
2023-10-17 16:53:30,279 DEV : loss 0.20469656586647034 - f1-score (micro avg) 0.7921
2023-10-17 16:53:30,335 ----------------------------------------------------------------------------------------------------
2023-10-17 16:53:34,480 epoch 8 - iter 44/447 - loss 0.00313666 - time (sec): 4.14 - samples/sec: 2080.39 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:53:38,581 epoch 8 - iter 88/447 - loss 0.00502648 - time (sec): 8.24 - samples/sec: 2049.56 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:53:42,753 epoch 8 - iter 132/447 - loss 0.00740875 - time (sec): 12.42 - samples/sec: 2062.49 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:53:47,062 epoch 8 - iter 176/447 - loss 0.00859745 - time (sec): 16.73 - samples/sec: 2020.65 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:53:51,256 epoch 8 - iter 220/447 - loss 0.00798549 - time (sec): 20.92 - samples/sec: 2006.40 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:53:55,572 epoch 8 - iter 264/447 - loss 0.00781959 - time (sec): 25.23 - samples/sec: 2015.40 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:53:59,928 epoch 8 - iter 308/447 - loss 0.00794785 - time (sec): 29.59 - samples/sec: 2008.36 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:54:04,317 epoch 8 - iter 352/447 - loss 0.00872736 - time (sec): 33.98 - samples/sec: 1991.41 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:54:08,671 epoch 8 - iter 396/447 - loss 0.00928899 - time (sec): 38.33 - samples/sec: 1991.38 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:54:13,050 epoch 8 - iter 440/447 - loss 0.00882367 - time (sec): 42.71 - samples/sec: 1995.01 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:54:13,689 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:13,689 EPOCH 8 done: loss 0.0088 - lr: 0.000007
2023-10-17 16:54:25,266 DEV : loss 0.21882730722427368 - f1-score (micro avg) 0.8018
2023-10-17 16:54:25,341 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:29,957 epoch 9 - iter 44/447 - loss 0.00911349 - time (sec): 4.61 - samples/sec: 1939.98 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:54:34,629 epoch 9 - iter 88/447 - loss 0.00622748 - time (sec): 9.28 - samples/sec: 2044.19 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:54:38,713 epoch 9 - iter 132/447 - loss 0.00477911 - time (sec): 13.37 - samples/sec: 2033.90 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:54:43,037 epoch 9 - iter 176/447 - loss 0.00545796 - time (sec): 17.69 - samples/sec: 1994.09 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:54:47,439 epoch 9 - iter 220/447 - loss 0.00514891 - time (sec): 22.09 - samples/sec: 2000.95 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:54:51,898 epoch 9 - iter 264/447 - loss 0.00530463 - time (sec): 26.55 - samples/sec: 1997.43 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:54:56,171 epoch 9 - iter 308/447 - loss 0.00519452 - time (sec): 30.83 - samples/sec: 1989.09 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:55:00,230 epoch 9 - iter 352/447 - loss 0.00567824 - time (sec): 34.89 - samples/sec: 1984.82 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:55:04,392 epoch 9 - iter 396/447 - loss 0.00625861 - time (sec): 39.05 - samples/sec: 1978.66 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:55:08,577 epoch 9 - iter 440/447 - loss 0.00656770 - time (sec): 43.23 - samples/sec: 1975.09 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:55:09,189 ----------------------------------------------------------------------------------------------------
2023-10-17 16:55:09,190 EPOCH 9 done: loss 0.0065 - lr: 0.000003
2023-10-17 16:55:20,752 DEV : loss 0.22874712944030762 - f1-score (micro avg) 0.8066
2023-10-17 16:55:20,817 saving best model
2023-10-17 16:55:22,230 ----------------------------------------------------------------------------------------------------
2023-10-17 16:55:26,626 epoch 10 - iter 44/447 - loss 0.00173260 - time (sec): 4.39 - samples/sec: 1946.13 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:55:30,957 epoch 10 - iter 88/447 - loss 0.00259927 - time (sec): 8.72 - samples/sec: 1922.78 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:55:34,844 epoch 10 - iter 132/447 - loss 0.00371022 - time (sec): 12.61 - samples/sec: 1951.89 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:55:39,216 epoch 10 - iter 176/447 - loss 0.00343244 - time (sec): 16.98 - samples/sec: 1973.14 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:55:43,550 epoch 10 - iter 220/447 - loss 0.00410654 - time (sec): 21.31 - samples/sec: 1982.22 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:55:48,037 epoch 10 - iter 264/447 - loss 0.00468622 - time (sec): 25.80 - samples/sec: 1992.41 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:55:51,992 epoch 10 - iter 308/447 - loss 0.00492962 - time (sec): 29.76 - samples/sec: 1994.33 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:55:56,019 epoch 10 - iter 352/447 - loss 0.00498152 - time (sec): 33.78 - samples/sec: 2022.40 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:55:59,943 epoch 10 - iter 396/447 - loss 0.00514324 - time (sec): 37.71 - samples/sec: 2033.47 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:56:04,083 epoch 10 - iter 440/447 - loss 0.00486802 - time (sec): 41.85 - samples/sec: 2035.70 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:56:04,730 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:04,731 EPOCH 10 done: loss 0.0048 - lr: 0.000000
2023-10-17 16:56:15,689 DEV : loss 0.23447194695472717 - f1-score (micro avg) 0.805
2023-10-17 16:56:16,295 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:16,298 Loading model from best epoch ...
2023-10-17 16:56:19,028 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-17 16:56:25,007
Results:
- F-score (micro) 0.7627
- F-score (macro) 0.6747
- Accuracy 0.6391
By class:
               precision    recall  f1-score   support

         loc      0.8617    0.8574    0.8595       596
        pers      0.7067    0.7958    0.7486       333
         org      0.4667    0.5833    0.5185       132
        prod      0.5965    0.5152    0.5528        66
        time      0.6939    0.6939    0.6939        49

   micro avg      0.7433    0.7832    0.7627      1176
   macro avg      0.6651    0.6891    0.6747      1176
weighted avg      0.7516    0.7832    0.7657      1176
2023-10-17 16:56:25,007 ----------------------------------------------------------------------------------------------------