2023-10-15 11:57:02,477 ----------------------------------------------------------------------------------------------------
2023-10-15 11:57:02,478 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 11:57:02,478 ----------------------------------------------------------------------------------------------------
2023-10-15 11:57:02,479 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 11:57:02,479 ----------------------------------------------------------------------------------------------------
2023-10-15 11:57:02,479 Train: 20847 sentences
2023-10-15 11:57:02,479 (train_with_dev=False, train_with_test=False)
2023-10-15 11:57:02,479 ----------------------------------------------------------------------------------------------------
2023-10-15 11:57:02,479 Training Params:
2023-10-15 11:57:02,479 - learning_rate: "5e-05"
2023-10-15 11:57:02,479 - mini_batch_size: "8"
2023-10-15 11:57:02,479 - max_epochs: "10"
2023-10-15 11:57:02,479 - shuffle: "True"
2023-10-15 11:57:02,479 ----------------------------------------------------------------------------------------------------
2023-10-15 11:57:02,479 Plugins:
2023-10-15 11:57:02,479 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 11:57:02,479 ----------------------------------------------------------------------------------------------------
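Note on the lr column: the values logged per iteration are consistent with a linear warmup over the first 10% of all optimizer steps up to the peak learning_rate of 5e-05, followed by a linear decay to zero. The sketch below is a minimal illustration under that assumption, not Flair's actual LinearScheduler implementation; the function name `linear_schedule_lr` and the constants are hypothetical and only mirror the numbers shown in this log (2606 iterations per epoch, 10 epochs).

```python
def linear_schedule_lr(step, total_steps, peak_lr=5e-05, warmup_fraction=0.1):
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


ITERS_PER_EPOCH = 2606  # taken from the "iter .../2606" entries in this log
MAX_EPOCHS = 10         # taken from max_epochs in the training params
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS

for epoch, it in [(1, 260), (1, 2600), (2, 260), (10, 2600)]:
    step = (epoch - 1) * ITERS_PER_EPOCH + it
    lr = linear_schedule_lr(step, TOTAL_STEPS)
    print(f"epoch {epoch} - iter {it}/{ITERS_PER_EPOCH} - lr: {lr:.6f}")
```

With these assumptions the warmup ends exactly at the close of epoch 1, which matches the logged lr reaching 0.000050 at iter 2600 of epoch 1 and falling afterwards.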
2023-10-15 11:57:02,479 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 11:57:02,479 - metric: "('micro avg', 'f1-score')"
2023-10-15 11:57:02,479 ----------------------------------------------------------------------------------------------------
2023-10-15 11:57:02,479 Computation:
2023-10-15 11:57:02,479 - compute on device: cuda:0
2023-10-15 11:57:02,479 - embedding storage: none
2023-10-15 11:57:02,479 ----------------------------------------------------------------------------------------------------
2023-10-15 11:57:02,479 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-15 11:57:02,479 ----------------------------------------------------------------------------------------------------
2023-10-15 11:57:02,479 ----------------------------------------------------------------------------------------------------
2023-10-15 11:57:21,436 epoch 1 - iter 260/2606 - loss 1.64915807 - time (sec): 18.96 - samples/sec: 1932.05 - lr: 0.000005 - momentum: 0.000000
2023-10-15 11:57:40,112 epoch 1 - iter 520/2606 - loss 1.03092842 - time (sec): 37.63 - samples/sec: 1945.85 - lr: 0.000010 - momentum: 0.000000
2023-10-15 11:57:59,795 epoch 1 - iter 780/2606 - loss 0.76385186 - time (sec): 57.31 - samples/sec: 1959.94 - lr: 0.000015 - momentum: 0.000000
2023-10-15 11:58:18,335 epoch 1 - iter 1040/2606 - loss 0.64168041 - time (sec): 75.85 - samples/sec: 1954.01 - lr: 0.000020 - momentum: 0.000000
2023-10-15 11:58:38,681 epoch 1 - iter 1300/2606 - loss 0.54874820 - time (sec): 96.20 - samples/sec: 1954.88 - lr: 0.000025 - momentum: 0.000000
2023-10-15 11:58:57,821 epoch 1 - iter 1560/2606 - loss 0.49391289 - time (sec): 115.34 - samples/sec: 1952.63 - lr: 0.000030 - momentum: 0.000000
2023-10-15 11:59:15,529 epoch 1 - iter 1820/2606 - loss 0.45687451 - time (sec): 133.05 - samples/sec: 1944.20 - lr: 0.000035 - momentum: 0.000000
2023-10-15 11:59:34,789 epoch 1 - iter 2080/2606 - loss 0.42534550 - time (sec): 152.31 - samples/sec: 1947.69 - lr: 0.000040 - momentum: 0.000000
2023-10-15 11:59:53,865 epoch 1 - iter 2340/2606 - loss 0.40230225 - time (sec): 171.39 - samples/sec: 1941.28 - lr: 0.000045 - momentum: 0.000000
2023-10-15 12:00:12,299 epoch 1 - iter 2600/2606 - loss 0.38436630 - time (sec): 189.82 - samples/sec: 1932.40 - lr: 0.000050 - momentum: 0.000000
2023-10-15 12:00:12,697 ----------------------------------------------------------------------------------------------------
2023-10-15 12:00:12,697 EPOCH 1 done: loss 0.3839 - lr: 0.000050
2023-10-15 12:00:19,023 DEV : loss 0.11487094312906265 - f1-score (micro avg) 0.3452
2023-10-15 12:00:19,047 saving best model
2023-10-15 12:00:19,408 ----------------------------------------------------------------------------------------------------
2023-10-15 12:00:38,662 epoch 2 - iter 260/2606 - loss 0.15682186 - time (sec): 19.25 - samples/sec: 1974.97 - lr: 0.000049 - momentum: 0.000000
2023-10-15 12:00:57,114 epoch 2 - iter 520/2606 - loss 0.16679577 - time (sec): 37.70 - samples/sec: 1893.71 - lr: 0.000049 - momentum: 0.000000
2023-10-15 12:01:16,199 epoch 2 - iter 780/2606 - loss 0.16159520 - time (sec): 56.79 - samples/sec: 1915.43 - lr: 0.000048 - momentum: 0.000000
2023-10-15 12:01:35,382 epoch 2 - iter 1040/2606 - loss 0.16406407 - time (sec): 75.97 - samples/sec: 1932.84 - lr: 0.000048 - momentum: 0.000000
2023-10-15 12:01:54,053 epoch 2 - iter 1300/2606 - loss 0.16300929 - time (sec): 94.64 - samples/sec: 1942.04 - lr: 0.000047 - momentum: 0.000000
2023-10-15 12:02:11,883 epoch 2 - iter 1560/2606 - loss 0.15872115 - time (sec): 112.47 - samples/sec: 1931.89 - lr: 0.000047 - momentum: 0.000000
2023-10-15 12:02:31,185 epoch 2 - iter 1820/2606 - loss 0.15896498 - time (sec): 131.78 - samples/sec: 1934.46 - lr: 0.000046 - momentum: 0.000000
2023-10-15 12:02:50,291 epoch 2 - iter 2080/2606 - loss 0.15531835 - time (sec): 150.88 - samples/sec: 1942.94 - lr: 0.000046 - momentum: 0.000000
2023-10-15 12:03:10,327 epoch 2 - iter 2340/2606 - loss 0.15458671 - time (sec): 170.92 - samples/sec: 1947.88 - lr: 0.000045 - momentum: 0.000000
2023-10-15 12:03:28,176 epoch 2 - iter 2600/2606 - loss 0.15487459 - time (sec): 188.77 - samples/sec: 1942.09 - lr: 0.000044 - momentum: 0.000000
2023-10-15 12:03:28,565 ----------------------------------------------------------------------------------------------------
2023-10-15 12:03:28,565 EPOCH 2 done: loss 0.1548 - lr: 0.000044
2023-10-15 12:03:36,764 DEV : loss 0.1457301676273346 - f1-score (micro avg) 0.3783
2023-10-15 12:03:36,788 saving best model
2023-10-15 12:03:38,006 ----------------------------------------------------------------------------------------------------
2023-10-15 12:03:57,780 epoch 3 - iter 260/2606 - loss 0.10903042 - time (sec): 19.77 - samples/sec: 1908.20 - lr: 0.000044 - momentum: 0.000000
2023-10-15 12:04:16,420 epoch 3 - iter 520/2606 - loss 0.11437700 - time (sec): 38.41 - samples/sec: 1904.63 - lr: 0.000043 - momentum: 0.000000
2023-10-15 12:04:35,156 epoch 3 - iter 780/2606 - loss 0.11275118 - time (sec): 57.15 - samples/sec: 1885.24 - lr: 0.000043 - momentum: 0.000000
2023-10-15 12:04:53,844 epoch 3 - iter 1040/2606 - loss 0.11675125 - time (sec): 75.84 - samples/sec: 1892.01 - lr: 0.000042 - momentum: 0.000000
2023-10-15 12:05:12,544 epoch 3 - iter 1300/2606 - loss 0.11512835 - time (sec): 94.54 - samples/sec: 1906.30 - lr: 0.000042 - momentum: 0.000000
2023-10-15 12:05:31,526 epoch 3 - iter 1560/2606 - loss 0.11310394 - time (sec): 113.52 - samples/sec: 1918.20 - lr: 0.000041 - momentum: 0.000000
2023-10-15 12:05:50,869 epoch 3 - iter 1820/2606 - loss 0.11220806 - time (sec): 132.86 - samples/sec: 1932.70 - lr: 0.000041 - momentum: 0.000000
2023-10-15 12:06:09,318 epoch 3 - iter 2080/2606 - loss 0.11087298 - time (sec): 151.31 - samples/sec: 1934.37 - lr: 0.000040 - momentum: 0.000000
2023-10-15 12:06:29,929 epoch 3 - iter 2340/2606 - loss 0.11008659 - time (sec): 171.92 - samples/sec: 1918.75 - lr: 0.000039 - momentum: 0.000000
2023-10-15 12:06:50,002 epoch 3 - iter 2600/2606 - loss 0.10813718 - time (sec): 191.99 - samples/sec: 1910.97 - lr: 0.000039 - momentum: 0.000000
2023-10-15 12:06:50,397 ----------------------------------------------------------------------------------------------------
2023-10-15 12:06:50,397 EPOCH 3 done: loss 0.1081 - lr: 0.000039
2023-10-15 12:06:58,735 DEV : loss 0.30479252338409424 - f1-score (micro avg) 0.3318
2023-10-15 12:06:58,777 ----------------------------------------------------------------------------------------------------
2023-10-15 12:07:18,000 epoch 4 - iter 260/2606 - loss 0.08238358 - time (sec): 19.22 - samples/sec: 1831.30 - lr: 0.000038 - momentum: 0.000000
2023-10-15 12:07:38,192 epoch 4 - iter 520/2606 - loss 0.08072289 - time (sec): 39.41 - samples/sec: 1790.47 - lr: 0.000038 - momentum: 0.000000
2023-10-15 12:07:59,891 epoch 4 - iter 780/2606 - loss 0.08208316 - time (sec): 61.11 - samples/sec: 1780.28 - lr: 0.000037 - momentum: 0.000000
2023-10-15 12:08:18,418 epoch 4 - iter 1040/2606 - loss 0.08409013 - time (sec): 79.64 - samples/sec: 1829.08 - lr: 0.000037 - momentum: 0.000000
2023-10-15 12:08:36,789 epoch 4 - iter 1300/2606 - loss 0.08241002 - time (sec): 98.01 - samples/sec: 1845.09 - lr: 0.000036 - momentum: 0.000000
2023-10-15 12:08:56,156 epoch 4 - iter 1560/2606 - loss 0.08187116 - time (sec): 117.38 - samples/sec: 1860.51 - lr: 0.000036 - momentum: 0.000000
2023-10-15 12:09:15,526 epoch 4 - iter 1820/2606 - loss 0.08319336 - time (sec): 136.75 - samples/sec: 1865.81 - lr: 0.000035 - momentum: 0.000000
2023-10-15 12:09:34,317 epoch 4 - iter 2080/2606 - loss 0.08131155 - time (sec): 155.54 - samples/sec: 1884.19 - lr: 0.000034 - momentum: 0.000000
2023-10-15 12:09:52,846 epoch 4 - iter 2340/2606 - loss 0.08077706 - time (sec): 174.07 - samples/sec: 1883.36 - lr: 0.000034 - momentum: 0.000000
2023-10-15 12:10:12,717 epoch 4 - iter 2600/2606 - loss 0.07876999 - time (sec): 193.94 - samples/sec: 1888.93 - lr: 0.000033 - momentum: 0.000000
2023-10-15 12:10:13,105 ----------------------------------------------------------------------------------------------------
2023-10-15 12:10:13,105 EPOCH 4 done: loss 0.0789 - lr: 0.000033
2023-10-15 12:10:21,438 DEV : loss 0.3154134452342987 - f1-score (micro avg) 0.3333
2023-10-15 12:10:21,475 ----------------------------------------------------------------------------------------------------
2023-10-15 12:10:40,522 epoch 5 - iter 260/2606 - loss 0.05258224 - time (sec): 19.04 - samples/sec: 1881.49 - lr: 0.000033 - momentum: 0.000000
2023-10-15 12:10:58,880 epoch 5 - iter 520/2606 - loss 0.04989730 - time (sec): 37.40 - samples/sec: 1885.34 - lr: 0.000032 - momentum: 0.000000
2023-10-15 12:11:18,620 epoch 5 - iter 780/2606 - loss 0.05552002 - time (sec): 57.14 - samples/sec: 1871.31 - lr: 0.000032 - momentum: 0.000000
2023-10-15 12:11:38,270 epoch 5 - iter 1040/2606 - loss 0.05584031 - time (sec): 76.79 - samples/sec: 1871.27 - lr: 0.000031 - momentum: 0.000000
2023-10-15 12:11:59,453 epoch 5 - iter 1300/2606 - loss 0.05720019 - time (sec): 97.98 - samples/sec: 1834.07 - lr: 0.000031 - momentum: 0.000000
2023-10-15 12:12:19,751 epoch 5 - iter 1560/2606 - loss 0.05743220 - time (sec): 118.27 - samples/sec: 1838.76 - lr: 0.000030 - momentum: 0.000000
2023-10-15 12:12:40,318 epoch 5 - iter 1820/2606 - loss 0.05650180 - time (sec): 138.84 - samples/sec: 1833.34 - lr: 0.000029 - momentum: 0.000000
2023-10-15 12:13:01,319 epoch 5 - iter 2080/2606 - loss 0.05647238 - time (sec): 159.84 - samples/sec: 1839.62 - lr: 0.000029 - momentum: 0.000000
2023-10-15 12:13:21,100 epoch 5 - iter 2340/2606 - loss 0.05690768 - time (sec): 179.62 - samples/sec: 1852.23 - lr: 0.000028 - momentum: 0.000000
2023-10-15 12:13:40,116 epoch 5 - iter 2600/2606 - loss 0.05691287 - time (sec): 198.64 - samples/sec: 1845.60 - lr: 0.000028 - momentum: 0.000000
2023-10-15 12:13:40,605 ----------------------------------------------------------------------------------------------------
2023-10-15 12:13:40,605 EPOCH 5 done: loss 0.0569 - lr: 0.000028
2023-10-15 12:13:48,784 DEV : loss 0.31486544013023376 - f1-score (micro avg) 0.3517
2023-10-15 12:13:48,815 ----------------------------------------------------------------------------------------------------
2023-10-15 12:14:08,648 epoch 6 - iter 260/2606 - loss 0.03313725 - time (sec): 19.83 - samples/sec: 1979.16 - lr: 0.000027 - momentum: 0.000000
2023-10-15 12:14:27,561 epoch 6 - iter 520/2606 - loss 0.03690228 - time (sec): 38.74 - samples/sec: 1969.87 - lr: 0.000027 - momentum: 0.000000
2023-10-15 12:14:47,369 epoch 6 - iter 780/2606 - loss 0.04112557 - time (sec): 58.55 - samples/sec: 1974.98 - lr: 0.000026 - momentum: 0.000000
2023-10-15 12:15:05,825 epoch 6 - iter 1040/2606 - loss 0.04204714 - time (sec): 77.01 - samples/sec: 1950.19 - lr: 0.000026 - momentum: 0.000000
2023-10-15 12:15:23,370 epoch 6 - iter 1300/2606 - loss 0.04230053 - time (sec): 94.55 - samples/sec: 1935.24 - lr: 0.000025 - momentum: 0.000000
2023-10-15 12:15:42,734 epoch 6 - iter 1560/2606 - loss 0.04315007 - time (sec): 113.92 - samples/sec: 1939.01 - lr: 0.000024 - momentum: 0.000000
2023-10-15 12:16:00,920 epoch 6 - iter 1820/2606 - loss 0.04303887 - time (sec): 132.10 - samples/sec: 1925.16 - lr: 0.000024 - momentum: 0.000000
2023-10-15 12:16:21,944 epoch 6 - iter 2080/2606 - loss 0.04257364 - time (sec): 153.13 - samples/sec: 1917.79 - lr: 0.000023 - momentum: 0.000000
2023-10-15 12:16:41,637 epoch 6 - iter 2340/2606 - loss 0.04413961 - time (sec): 172.82 - samples/sec: 1917.84 - lr: 0.000023 - momentum: 0.000000
2023-10-15 12:16:59,561 epoch 6 - iter 2600/2606 - loss 0.04348856 - time (sec): 190.74 - samples/sec: 1922.13 - lr: 0.000022 - momentum: 0.000000
2023-10-15 12:16:59,994 ----------------------------------------------------------------------------------------------------
2023-10-15 12:16:59,994 EPOCH 6 done: loss 0.0435 - lr: 0.000022
2023-10-15 12:17:08,237 DEV : loss 0.39616382122039795 - f1-score (micro avg) 0.3663
2023-10-15 12:17:08,263 ----------------------------------------------------------------------------------------------------
2023-10-15 12:17:26,344 epoch 7 - iter 260/2606 - loss 0.02748825 - time (sec): 18.08 - samples/sec: 1948.18 - lr: 0.000022 - momentum: 0.000000
2023-10-15 12:17:45,453 epoch 7 - iter 520/2606 - loss 0.02809898 - time (sec): 37.19 - samples/sec: 1935.59 - lr: 0.000021 - momentum: 0.000000
2023-10-15 12:18:04,516 epoch 7 - iter 780/2606 - loss 0.03121037 - time (sec): 56.25 - samples/sec: 1946.76 - lr: 0.000021 - momentum: 0.000000
2023-10-15 12:18:23,273 epoch 7 - iter 1040/2606 - loss 0.03049926 - time (sec): 75.01 - samples/sec: 1948.46 - lr: 0.000020 - momentum: 0.000000
2023-10-15 12:18:42,018 epoch 7 - iter 1300/2606 - loss 0.02965768 - time (sec): 93.75 - samples/sec: 1946.91 - lr: 0.000019 - momentum: 0.000000
2023-10-15 12:19:01,527 epoch 7 - iter 1560/2606 - loss 0.03081344 - time (sec): 113.26 - samples/sec: 1938.31 - lr: 0.000019 - momentum: 0.000000
2023-10-15 12:19:21,614 epoch 7 - iter 1820/2606 - loss 0.03116566 - time (sec): 133.35 - samples/sec: 1935.29 - lr: 0.000018 - momentum: 0.000000
2023-10-15 12:19:40,080 epoch 7 - iter 2080/2606 - loss 0.03093073 - time (sec): 151.82 - samples/sec: 1936.42 - lr: 0.000018 - momentum: 0.000000
2023-10-15 12:19:58,599 epoch 7 - iter 2340/2606 - loss 0.03093953 - time (sec): 170.33 - samples/sec: 1937.08 - lr: 0.000017 - momentum: 0.000000
2023-10-15 12:20:17,845 epoch 7 - iter 2600/2606 - loss 0.03105048 - time (sec): 189.58 - samples/sec: 1931.82 - lr: 0.000017 - momentum: 0.000000
2023-10-15 12:20:18,356 ----------------------------------------------------------------------------------------------------
2023-10-15 12:20:18,356 EPOCH 7 done: loss 0.0310 - lr: 0.000017
2023-10-15 12:20:26,693 DEV : loss 0.36471426486968994 - f1-score (micro avg) 0.3665
2023-10-15 12:20:26,727 ----------------------------------------------------------------------------------------------------
2023-10-15 12:20:46,325 epoch 8 - iter 260/2606 - loss 0.01927091 - time (sec): 19.60 - samples/sec: 1907.34 - lr: 0.000016 - momentum: 0.000000
2023-10-15 12:21:04,809 epoch 8 - iter 520/2606 - loss 0.02292594 - time (sec): 38.08 - samples/sec: 1901.26 - lr: 0.000016 - momentum: 0.000000
2023-10-15 12:21:23,743 epoch 8 - iter 780/2606 - loss 0.02009535 - time (sec): 57.01 - samples/sec: 1910.47 - lr: 0.000015 - momentum: 0.000000
2023-10-15 12:21:42,591 epoch 8 - iter 1040/2606 - loss 0.01979498 - time (sec): 75.86 - samples/sec: 1906.19 - lr: 0.000014 - momentum: 0.000000
2023-10-15 12:22:01,421 epoch 8 - iter 1300/2606 - loss 0.02040310 - time (sec): 94.69 - samples/sec: 1906.76 - lr: 0.000014 - momentum: 0.000000
2023-10-15 12:22:21,402 epoch 8 - iter 1560/2606 - loss 0.02110463 - time (sec): 114.67 - samples/sec: 1922.44 - lr: 0.000013 - momentum: 0.000000
2023-10-15 12:22:40,169 epoch 8 - iter 1820/2606 - loss 0.02252478 - time (sec): 133.44 - samples/sec: 1931.25 - lr: 0.000013 - momentum: 0.000000
2023-10-15 12:22:58,486 epoch 8 - iter 2080/2606 - loss 0.02230129 - time (sec): 151.76 - samples/sec: 1924.93 - lr: 0.000012 - momentum: 0.000000
2023-10-15 12:23:17,451 epoch 8 - iter 2340/2606 - loss 0.02176476 - time (sec): 170.72 - samples/sec: 1930.49 - lr: 0.000012 - momentum: 0.000000
2023-10-15 12:23:36,387 epoch 8 - iter 2600/2606 - loss 0.02155911 - time (sec): 189.66 - samples/sec: 1931.48 - lr: 0.000011 - momentum: 0.000000
2023-10-15 12:23:36,903 ----------------------------------------------------------------------------------------------------
2023-10-15 12:23:36,903 EPOCH 8 done: loss 0.0215 - lr: 0.000011
2023-10-15 12:23:45,945 DEV : loss 0.36754974722862244 - f1-score (micro avg) 0.3974
2023-10-15 12:23:45,976 saving best model
2023-10-15 12:23:46,501 ----------------------------------------------------------------------------------------------------
2023-10-15 12:24:07,094 epoch 9 - iter 260/2606 - loss 0.01881708 - time (sec): 20.59 - samples/sec: 1915.94 - lr: 0.000011 - momentum: 0.000000
2023-10-15 12:24:26,071 epoch 9 - iter 520/2606 - loss 0.01565953 - time (sec): 39.57 - samples/sec: 1945.70 - lr: 0.000010 - momentum: 0.000000
2023-10-15 12:24:43,744 epoch 9 - iter 780/2606 - loss 0.01545121 - time (sec): 57.24 - samples/sec: 1930.42 - lr: 0.000009 - momentum: 0.000000
2023-10-15 12:25:03,140 epoch 9 - iter 1040/2606 - loss 0.01662704 - time (sec): 76.64 - samples/sec: 1933.14 - lr: 0.000009 - momentum: 0.000000
2023-10-15 12:25:22,562 epoch 9 - iter 1300/2606 - loss 0.01629208 - time (sec): 96.06 - samples/sec: 1921.45 - lr: 0.000008 - momentum: 0.000000
2023-10-15 12:25:41,153 epoch 9 - iter 1560/2606 - loss 0.01635871 - time (sec): 114.65 - samples/sec: 1919.58 - lr: 0.000008 - momentum: 0.000000
2023-10-15 12:26:00,050 epoch 9 - iter 1820/2606 - loss 0.01600567 - time (sec): 133.55 - samples/sec: 1914.66 - lr: 0.000007 - momentum: 0.000000
2023-10-15 12:26:19,201 epoch 9 - iter 2080/2606 - loss 0.01600079 - time (sec): 152.70 - samples/sec: 1915.23 - lr: 0.000007 - momentum: 0.000000
2023-10-15 12:26:37,301 epoch 9 - iter 2340/2606 - loss 0.01585391 - time (sec): 170.80 - samples/sec: 1912.05 - lr: 0.000006 - momentum: 0.000000
2023-10-15 12:26:57,026 epoch 9 - iter 2600/2606 - loss 0.01575358 - time (sec): 190.52 - samples/sec: 1921.35 - lr: 0.000006 - momentum: 0.000000
2023-10-15 12:26:57,653 ----------------------------------------------------------------------------------------------------
2023-10-15 12:26:57,654 EPOCH 9 done: loss 0.0157 - lr: 0.000006
2023-10-15 12:27:06,597 DEV : loss 0.4267415404319763 - f1-score (micro avg) 0.3812
2023-10-15 12:27:06,622 ----------------------------------------------------------------------------------------------------
2023-10-15 12:27:26,051 epoch 10 - iter 260/2606 - loss 0.00922264 - time (sec): 19.43 - samples/sec: 1970.34 - lr: 0.000005 - momentum: 0.000000
2023-10-15 12:27:45,707 epoch 10 - iter 520/2606 - loss 0.01108063 - time (sec): 39.08 - samples/sec: 1937.27 - lr: 0.000004 - momentum: 0.000000
2023-10-15 12:28:05,674 epoch 10 - iter 780/2606 - loss 0.00999059 - time (sec): 59.05 - samples/sec: 1947.99 - lr: 0.000004 - momentum: 0.000000
2023-10-15 12:28:25,124 epoch 10 - iter 1040/2606 - loss 0.01000085 - time (sec): 78.50 - samples/sec: 1938.59 - lr: 0.000003 - momentum: 0.000000
2023-10-15 12:28:44,187 epoch 10 - iter 1300/2606 - loss 0.00987250 - time (sec): 97.56 - samples/sec: 1928.75 - lr: 0.000003 - momentum: 0.000000
2023-10-15 12:29:02,183 epoch 10 - iter 1560/2606 - loss 0.00985511 - time (sec): 115.56 - samples/sec: 1916.59 - lr: 0.000002 - momentum: 0.000000
2023-10-15 12:29:20,689 epoch 10 - iter 1820/2606 - loss 0.00987583 - time (sec): 134.07 - samples/sec: 1919.64 - lr: 0.000002 - momentum: 0.000000
2023-10-15 12:29:39,612 epoch 10 - iter 2080/2606 - loss 0.00980796 - time (sec): 152.99 - samples/sec: 1922.22 - lr: 0.000001 - momentum: 0.000000
2023-10-15 12:29:59,077 epoch 10 - iter 2340/2606 - loss 0.01015117 - time (sec): 172.45 - samples/sec: 1922.11 - lr: 0.000001 - momentum: 0.000000
2023-10-15 12:30:17,379 epoch 10 - iter 2600/2606 - loss 0.00999133 - time (sec): 190.76 - samples/sec: 1921.72 - lr: 0.000000 - momentum: 0.000000
2023-10-15 12:30:17,751 ----------------------------------------------------------------------------------------------------
2023-10-15 12:30:17,751 EPOCH 10 done: loss 0.0100 - lr: 0.000000
2023-10-15 12:30:26,946 DEV : loss 0.4642558991909027 - f1-score (micro avg) 0.3722
2023-10-15 12:30:27,450 ----------------------------------------------------------------------------------------------------
2023-10-15 12:30:27,451 Loading model from best epoch ...
2023-10-15 12:30:29,221 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 12:30:45,144
Results:
- F-score (micro) 0.4529
- F-score (macro) 0.3045
- Accuracy 0.2957
By class:
              precision    recall  f1-score   support

         LOC     0.5466    0.5461    0.5464      1214
         PER     0.3978    0.3948    0.3963       808
         ORG     0.2791    0.2720    0.2755       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4549    0.4510    0.4529      2390
   macro avg     0.3059    0.3032    0.3045      2390
weighted avg     0.4533    0.4510    0.4522      2390
2023-10-15 12:30:45,144 ----------------------------------------------------------------------------------------------------
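Note on the averaged scores: the macro, weighted, and micro F1 values reported above can be approximately recomputed from the per-class rows. The sketch below assumes the usual definitions (macro = unweighted mean of class F1, weighted = support-weighted mean, micro F1 = harmonic mean of micro precision and recall). The helper names are hypothetical, and because the inputs are the rounded four-decimal values from the log, results may differ from the logged figures in the last digit.

```python
# Per-class (f1-score, support) pairs copied from the "By class" table.
BY_CLASS = {
    "LOC": (0.5464, 1214),
    "PER": (0.3963, 808),
    "ORG": (0.2755, 353),
    "HumanProd": (0.0000, 15),
}


def macro_f1(rows):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for f1, _ in rows.values()) / len(rows)


def weighted_f1(rows):
    """Mean of the per-class F1 scores, weighted by class support."""
    total_support = sum(support for _, support in rows.values())
    return sum(f1 * support for f1, support in rows.values()) / total_support


def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(f"macro F1    ~ {macro_f1(BY_CLASS):.4f}")
print(f"weighted F1 ~ {weighted_f1(BY_CLASS):.4f}")
# Micro precision and recall are taken from the "micro avg" row.
print(f"micro F1    ~ {f1_from_precision_recall(0.4549, 0.4510):.4f}")
```

The gap between micro F1 (0.4529) and macro F1 (0.3045) comes largely from HumanProd, which has only 15 test entities and an F1 of 0.0000: it counts as one quarter of the macro average but contributes little to the micro average.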