2023-10-13 13:26:55,352 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:55,353 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 13:26:55,353 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:55,353 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-13 13:26:55,353 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:55,353 Train: 3575 sentences
2023-10-13 13:26:55,353 (train_with_dev=False, train_with_test=False)
2023-10-13 13:26:55,353 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:55,353 Training Params:
2023-10-13 13:26:55,353 - learning_rate: "3e-05"
2023-10-13 13:26:55,353 - mini_batch_size: "4"
2023-10-13 13:26:55,353 - max_epochs: "10"
2023-10-13 13:26:55,353 - shuffle: "True"
2023-10-13 13:26:55,353 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:55,353 Plugins:
2023-10-13 13:26:55,353 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 13:26:55,353 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:55,353 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 13:26:55,354 - metric: "('micro avg', 'f1-score')"
2023-10-13 13:26:55,354 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:55,354 Computation:
2023-10-13 13:26:55,354 - compute on device: cuda:0
2023-10-13 13:26:55,354 - embedding storage: none
2023-10-13 13:26:55,354 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:55,354 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 13:26:55,354 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:55,354 ----------------------------------------------------------------------------------------------------
2023-10-13 13:26:59,604 epoch 1 - iter 89/894 - loss 2.95558386 - time (sec): 4.25 - samples/sec: 2080.59 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:27:03,798 epoch 1 - iter 178/894 - loss 1.86962681 - time (sec): 8.44 - samples/sec: 2140.01 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:27:07,907 epoch 1 - iter 267/894 - loss 1.43922230 - time (sec): 12.55 - samples/sec: 2070.79 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:27:12,189 epoch 1 - iter 356/894 - loss 1.16332138 - time (sec): 16.83 - samples/sec: 2074.45 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:27:16,305 epoch 1 - iter 445/894 - loss 0.99639681 - time (sec): 20.95 - samples/sec: 2063.30 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:27:20,409 epoch 1 - iter 534/894 - loss 0.88501649 - time (sec): 25.05 - samples/sec: 2062.28 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:27:24,535 epoch 1 - iter 623/894 - loss 0.80459753 - time (sec): 29.18 - samples/sec: 2051.80 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:27:28,784 epoch 1 - iter 712/894 - loss 0.73830399 - time (sec): 33.43 - samples/sec: 2048.10 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:27:32,761 epoch 1 - iter 801/894 - loss 0.68135095 - time (sec): 37.41 - samples/sec: 2044.69 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:27:37,145 epoch 1 - iter 890/894 - loss 0.63356747 - time (sec): 41.79 - samples/sec: 2062.28 - lr: 0.000030 - momentum: 0.000000
2023-10-13 13:27:37,323 ----------------------------------------------------------------------------------------------------
2023-10-13 13:27:37,324 EPOCH 1 done: loss 0.6313 - lr: 0.000030
2023-10-13 13:27:42,761 DEV : loss 0.1941666156053543 - f1-score (micro avg) 0.6106
2023-10-13 13:27:42,790 saving best model
2023-10-13 13:27:43,110 ----------------------------------------------------------------------------------------------------
2023-10-13 13:27:47,503 epoch 2 - iter 89/894 - loss 0.18794288 - time (sec): 4.39 - samples/sec: 2054.48 - lr: 0.000030 - momentum: 0.000000
2023-10-13 13:27:52,262 epoch 2 - iter 178/894 - loss 0.17923060 - time (sec): 9.15 - samples/sec: 2032.58 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:27:56,703 epoch 2 - iter 267/894 - loss 0.16858546 - time (sec): 13.59 - samples/sec: 1950.55 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:28:00,953 epoch 2 - iter 356/894 - loss 0.17427875 - time (sec): 17.84 - samples/sec: 1952.36 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:28:05,270 epoch 2 - iter 445/894 - loss 0.17144978 - time (sec): 22.16 - samples/sec: 1968.14 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:28:09,579 epoch 2 - iter 534/894 - loss 0.16359629 - time (sec): 26.47 - samples/sec: 1976.51 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:28:13,700 epoch 2 - iter 623/894 - loss 0.16276876 - time (sec): 30.59 - samples/sec: 1976.29 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:28:17,828 epoch 2 - iter 712/894 - loss 0.16121921 - time (sec): 34.72 - samples/sec: 1977.28 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:28:22,050 epoch 2 - iter 801/894 - loss 0.15844776 - time (sec): 38.94 - samples/sec: 1990.54 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:28:26,052 epoch 2 - iter 890/894 - loss 0.15768333 - time (sec): 42.94 - samples/sec: 2007.70 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:28:26,230 ----------------------------------------------------------------------------------------------------
2023-10-13 13:28:26,230 EPOCH 2 done: loss 0.1574 - lr: 0.000027
2023-10-13 13:28:35,343 DEV : loss 0.1417360156774521 - f1-score (micro avg) 0.6804
2023-10-13 13:28:35,382 saving best model
2023-10-13 13:28:35,903 ----------------------------------------------------------------------------------------------------
2023-10-13 13:28:40,362 epoch 3 - iter 89/894 - loss 0.10870491 - time (sec): 4.46 - samples/sec: 1945.07 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:28:44,659 epoch 3 - iter 178/894 - loss 0.09511837 - time (sec): 8.75 - samples/sec: 2060.06 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:28:48,835 epoch 3 - iter 267/894 - loss 0.08476480 - time (sec): 12.93 - samples/sec: 2045.85 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:28:52,984 epoch 3 - iter 356/894 - loss 0.08905987 - time (sec): 17.08 - samples/sec: 2036.20 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:28:57,234 epoch 3 - iter 445/894 - loss 0.08587345 - time (sec): 21.33 - samples/sec: 1995.76 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:29:01,699 epoch 3 - iter 534/894 - loss 0.08587494 - time (sec): 25.79 - samples/sec: 1983.40 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:29:06,110 epoch 3 - iter 623/894 - loss 0.08738808 - time (sec): 30.20 - samples/sec: 1965.13 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:29:10,497 epoch 3 - iter 712/894 - loss 0.08364026 - time (sec): 34.59 - samples/sec: 1980.18 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:29:14,555 epoch 3 - iter 801/894 - loss 0.08812239 - time (sec): 38.65 - samples/sec: 1985.64 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:29:18,956 epoch 3 - iter 890/894 - loss 0.08729426 - time (sec): 43.05 - samples/sec: 2001.73 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:29:19,149 ----------------------------------------------------------------------------------------------------
2023-10-13 13:29:19,149 EPOCH 3 done: loss 0.0872 - lr: 0.000023
2023-10-13 13:29:28,104 DEV : loss 0.1514357179403305 - f1-score (micro avg) 0.7256
2023-10-13 13:29:28,139 saving best model
2023-10-13 13:29:28,599 ----------------------------------------------------------------------------------------------------
2023-10-13 13:29:33,101 epoch 4 - iter 89/894 - loss 0.05364078 - time (sec): 4.50 - samples/sec: 1918.44 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:29:37,515 epoch 4 - iter 178/894 - loss 0.05117238 - time (sec): 8.91 - samples/sec: 1863.71 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:29:42,069 epoch 4 - iter 267/894 - loss 0.05442001 - time (sec): 13.46 - samples/sec: 1876.18 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:29:46,745 epoch 4 - iter 356/894 - loss 0.04904272 - time (sec): 18.14 - samples/sec: 1878.12 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:29:51,714 epoch 4 - iter 445/894 - loss 0.05288936 - time (sec): 23.11 - samples/sec: 1892.72 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:29:56,451 epoch 4 - iter 534/894 - loss 0.05337299 - time (sec): 27.84 - samples/sec: 1885.36 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:30:01,001 epoch 4 - iter 623/894 - loss 0.05444078 - time (sec): 32.40 - samples/sec: 1864.41 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:30:05,689 epoch 4 - iter 712/894 - loss 0.05523828 - time (sec): 37.08 - samples/sec: 1872.35 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:30:10,379 epoch 4 - iter 801/894 - loss 0.05634012 - time (sec): 41.77 - samples/sec: 1870.31 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:30:14,915 epoch 4 - iter 890/894 - loss 0.05893335 - time (sec): 46.31 - samples/sec: 1860.70 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:30:15,109 ----------------------------------------------------------------------------------------------------
2023-10-13 13:30:15,109 EPOCH 4 done: loss 0.0589 - lr: 0.000020
2023-10-13 13:30:23,755 DEV : loss 0.17774192988872528 - f1-score (micro avg) 0.7457
2023-10-13 13:30:23,789 saving best model
2023-10-13 13:30:24,260 ----------------------------------------------------------------------------------------------------
2023-10-13 13:30:28,888 epoch 5 - iter 89/894 - loss 0.04958656 - time (sec): 4.63 - samples/sec: 1958.63 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:30:32,984 epoch 5 - iter 178/894 - loss 0.04439640 - time (sec): 8.72 - samples/sec: 1995.20 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:30:37,156 epoch 5 - iter 267/894 - loss 0.04404992 - time (sec): 12.89 - samples/sec: 2038.74 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:30:41,256 epoch 5 - iter 356/894 - loss 0.04148328 - time (sec): 16.99 - samples/sec: 2062.03 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:30:45,339 epoch 5 - iter 445/894 - loss 0.04024971 - time (sec): 21.08 - samples/sec: 2053.98 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:30:49,695 epoch 5 - iter 534/894 - loss 0.04182158 - time (sec): 25.43 - samples/sec: 2035.14 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:30:54,179 epoch 5 - iter 623/894 - loss 0.03942623 - time (sec): 29.92 - samples/sec: 2047.61 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:30:58,471 epoch 5 - iter 712/894 - loss 0.04026267 - time (sec): 34.21 - samples/sec: 2025.62 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:31:02,647 epoch 5 - iter 801/894 - loss 0.03995825 - time (sec): 38.38 - samples/sec: 2024.13 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:31:06,816 epoch 5 - iter 890/894 - loss 0.03960817 - time (sec): 42.55 - samples/sec: 2026.89 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:31:07,001 ----------------------------------------------------------------------------------------------------
2023-10-13 13:31:07,002 EPOCH 5 done: loss 0.0397 - lr: 0.000017
2023-10-13 13:31:15,787 DEV : loss 0.2007289081811905 - f1-score (micro avg) 0.7645
2023-10-13 13:31:15,820 saving best model
2023-10-13 13:31:16,295 ----------------------------------------------------------------------------------------------------
2023-10-13 13:31:20,481 epoch 6 - iter 89/894 - loss 0.02040547 - time (sec): 4.18 - samples/sec: 2093.62 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:31:24,680 epoch 6 - iter 178/894 - loss 0.01964149 - time (sec): 8.38 - samples/sec: 2121.16 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:31:28,716 epoch 6 - iter 267/894 - loss 0.02088570 - time (sec): 12.42 - samples/sec: 2112.13 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:31:33,092 epoch 6 - iter 356/894 - loss 0.02324113 - time (sec): 16.79 - samples/sec: 2156.66 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:31:37,174 epoch 6 - iter 445/894 - loss 0.02270645 - time (sec): 20.88 - samples/sec: 2094.45 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:31:41,187 epoch 6 - iter 534/894 - loss 0.02337935 - time (sec): 24.89 - samples/sec: 2090.81 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:31:45,296 epoch 6 - iter 623/894 - loss 0.02406282 - time (sec): 29.00 - samples/sec: 2087.74 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:31:49,382 epoch 6 - iter 712/894 - loss 0.02361249 - time (sec): 33.08 - samples/sec: 2074.79 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:31:53,449 epoch 6 - iter 801/894 - loss 0.02432036 - time (sec): 37.15 - samples/sec: 2088.20 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:31:57,438 epoch 6 - iter 890/894 - loss 0.02630157 - time (sec): 41.14 - samples/sec: 2095.76 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:31:57,617 ----------------------------------------------------------------------------------------------------
2023-10-13 13:31:57,617 EPOCH 6 done: loss 0.0264 - lr: 0.000013
2023-10-13 13:32:06,141 DEV : loss 0.20405448973178864 - f1-score (micro avg) 0.7684
2023-10-13 13:32:06,171 saving best model
2023-10-13 13:32:06,604 ----------------------------------------------------------------------------------------------------
2023-10-13 13:32:11,208 epoch 7 - iter 89/894 - loss 0.01619462 - time (sec): 4.60 - samples/sec: 2178.62 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:32:15,438 epoch 7 - iter 178/894 - loss 0.01588258 - time (sec): 8.83 - samples/sec: 2050.83 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:32:19,895 epoch 7 - iter 267/894 - loss 0.01351689 - time (sec): 13.28 - samples/sec: 2029.35 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:32:24,203 epoch 7 - iter 356/894 - loss 0.01461563 - time (sec): 17.59 - samples/sec: 2039.65 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:32:28,462 epoch 7 - iter 445/894 - loss 0.01677621 - time (sec): 21.85 - samples/sec: 2033.09 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:32:32,582 epoch 7 - iter 534/894 - loss 0.01693157 - time (sec): 25.97 - samples/sec: 2006.28 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:32:36,783 epoch 7 - iter 623/894 - loss 0.01777874 - time (sec): 30.17 - samples/sec: 2017.12 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:32:40,885 epoch 7 - iter 712/894 - loss 0.01692073 - time (sec): 34.27 - samples/sec: 2018.10 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:32:44,949 epoch 7 - iter 801/894 - loss 0.01780651 - time (sec): 38.34 - samples/sec: 2021.61 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:32:48,987 epoch 7 - iter 890/894 - loss 0.01723398 - time (sec): 42.38 - samples/sec: 2034.07 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:32:49,163 ----------------------------------------------------------------------------------------------------
2023-10-13 13:32:49,163 EPOCH 7 done: loss 0.0174 - lr: 0.000010
2023-10-13 13:32:58,154 DEV : loss 0.22012537717819214 - f1-score (micro avg) 0.7795
2023-10-13 13:32:58,194 saving best model
2023-10-13 13:32:58,701 ----------------------------------------------------------------------------------------------------
2023-10-13 13:33:03,051 epoch 8 - iter 89/894 - loss 0.00832136 - time (sec): 4.34 - samples/sec: 2013.98 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:33:07,564 epoch 8 - iter 178/894 - loss 0.00808894 - time (sec): 8.86 - samples/sec: 1950.24 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:33:11,637 epoch 8 - iter 267/894 - loss 0.01090431 - time (sec): 12.93 - samples/sec: 1983.97 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:33:15,841 epoch 8 - iter 356/894 - loss 0.01087231 - time (sec): 17.13 - samples/sec: 1976.09 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:33:20,134 epoch 8 - iter 445/894 - loss 0.01052967 - time (sec): 21.43 - samples/sec: 1971.66 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:33:24,383 epoch 8 - iter 534/894 - loss 0.01078439 - time (sec): 25.68 - samples/sec: 1970.62 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:33:28,522 epoch 8 - iter 623/894 - loss 0.01109453 - time (sec): 29.82 - samples/sec: 1983.29 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:33:32,944 epoch 8 - iter 712/894 - loss 0.01240084 - time (sec): 34.24 - samples/sec: 1995.51 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:33:37,178 epoch 8 - iter 801/894 - loss 0.01237060 - time (sec): 38.47 - samples/sec: 2016.12 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:33:41,265 epoch 8 - iter 890/894 - loss 0.01219660 - time (sec): 42.56 - samples/sec: 2027.23 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:33:41,440 ----------------------------------------------------------------------------------------------------
2023-10-13 13:33:41,440 EPOCH 8 done: loss 0.0122 - lr: 0.000007
2023-10-13 13:33:50,357 DEV : loss 0.2315651774406433 - f1-score (micro avg) 0.7705
2023-10-13 13:33:50,387 ----------------------------------------------------------------------------------------------------
2023-10-13 13:33:54,612 epoch 9 - iter 89/894 - loss 0.00348902 - time (sec): 4.22 - samples/sec: 1985.17 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:33:58,941 epoch 9 - iter 178/894 - loss 0.00486255 - time (sec): 8.55 - samples/sec: 2008.41 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:34:03,159 epoch 9 - iter 267/894 - loss 0.00642317 - time (sec): 12.77 - samples/sec: 1974.93 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:34:07,390 epoch 9 - iter 356/894 - loss 0.01103944 - time (sec): 17.00 - samples/sec: 1971.80 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:34:11,668 epoch 9 - iter 445/894 - loss 0.00897085 - time (sec): 21.28 - samples/sec: 1999.42 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:34:15,923 epoch 9 - iter 534/894 - loss 0.00808237 - time (sec): 25.53 - samples/sec: 1998.01 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:34:20,374 epoch 9 - iter 623/894 - loss 0.00817370 - time (sec): 29.99 - samples/sec: 2006.05 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:34:24,751 epoch 9 - iter 712/894 - loss 0.00803605 - time (sec): 34.36 - samples/sec: 2035.31 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:34:28,751 epoch 9 - iter 801/894 - loss 0.00806848 - time (sec): 38.36 - samples/sec: 2035.57 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:34:32,862 epoch 9 - iter 890/894 - loss 0.00765097 - time (sec): 42.47 - samples/sec: 2032.01 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:34:33,037 ----------------------------------------------------------------------------------------------------
2023-10-13 13:34:33,038 EPOCH 9 done: loss 0.0076 - lr: 0.000003
2023-10-13 13:34:41,714 DEV : loss 0.23668904602527618 - f1-score (micro avg) 0.7859
2023-10-13 13:34:41,747 saving best model
2023-10-13 13:34:42,200 ----------------------------------------------------------------------------------------------------
2023-10-13 13:34:46,536 epoch 10 - iter 89/894 - loss 0.00697084 - time (sec): 4.33 - samples/sec: 2289.44 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:34:50,883 epoch 10 - iter 178/894 - loss 0.00707544 - time (sec): 8.68 - samples/sec: 2158.02 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:34:55,094 epoch 10 - iter 267/894 - loss 0.00741832 - time (sec): 12.89 - samples/sec: 2097.03 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:34:59,329 epoch 10 - iter 356/894 - loss 0.00805166 - time (sec): 17.13 - samples/sec: 2052.37 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:35:03,696 epoch 10 - iter 445/894 - loss 0.00661446 - time (sec): 21.49 - samples/sec: 2039.46 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:35:07,973 epoch 10 - iter 534/894 - loss 0.00680689 - time (sec): 25.77 - samples/sec: 2014.68 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:35:12,132 epoch 10 - iter 623/894 - loss 0.00641756 - time (sec): 29.93 - samples/sec: 2009.15 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:35:16,236 epoch 10 - iter 712/894 - loss 0.00594203 - time (sec): 34.03 - samples/sec: 2025.05 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:35:20,200 epoch 10 - iter 801/894 - loss 0.00534108 - time (sec): 38.00 - samples/sec: 2040.65 - lr: 0.000000 - momentum: 0.000000
2023-10-13 13:35:24,171 epoch 10 - iter 890/894 - loss 0.00532176 - time (sec): 41.97 - samples/sec: 2054.23 - lr: 0.000000 - momentum: 0.000000
2023-10-13 13:35:24,347 ----------------------------------------------------------------------------------------------------
2023-10-13 13:35:24,348 EPOCH 10 done: loss 0.0053 - lr: 0.000000
2023-10-13 13:35:33,250 DEV : loss 0.23821337521076202 - f1-score (micro avg) 0.7852
2023-10-13 13:35:33,648 ----------------------------------------------------------------------------------------------------
2023-10-13 13:35:33,649 Loading model from best epoch ...
2023-10-13 13:35:35,424 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-13 13:35:40,059
Results:
- F-score (micro) 0.7491
- F-score (macro) 0.6783
- Accuracy 0.6188
By class:
              precision    recall  f1-score   support

         loc     0.8225    0.8473    0.8347       596
        pers     0.6649    0.7447    0.7025       333
         org     0.6018    0.5152    0.5551       132
        prod     0.6346    0.5000    0.5593        66
        time     0.7255    0.7551    0.7400        49

   micro avg     0.7406    0.7577    0.7491      1176
   macro avg     0.6898    0.6725    0.6783      1176
weighted avg     0.7385    0.7577    0.7465      1176
2023-10-13 13:35:40,059 ----------------------------------------------------------------------------------------------------