Upload folder using huggingface_hub
4dcdbb2
2023-10-13 13:36:06,213 ----------------------------------------------------------------------------------------------------
2023-10-13 13:36:06,214 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 13:36:06,214 ----------------------------------------------------------------------------------------------------
2023-10-13 13:36:06,214 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-13 13:36:06,214 ----------------------------------------------------------------------------------------------------
2023-10-13 13:36:06,214 Train: 3575 sentences
2023-10-13 13:36:06,214 (train_with_dev=False, train_with_test=False)
2023-10-13 13:36:06,214 ----------------------------------------------------------------------------------------------------
2023-10-13 13:36:06,214 Training Params:
2023-10-13 13:36:06,214 - learning_rate: "5e-05"
2023-10-13 13:36:06,214 - mini_batch_size: "4"
2023-10-13 13:36:06,215 - max_epochs: "10"
2023-10-13 13:36:06,215 - shuffle: "True"
2023-10-13 13:36:06,215 ----------------------------------------------------------------------------------------------------
2023-10-13 13:36:06,215 Plugins:
2023-10-13 13:36:06,215 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 13:36:06,215 ----------------------------------------------------------------------------------------------------
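The LinearScheduler plugin warms the learning rate up over the first 10% of all optimizer updates (894 iterations/epoch × 10 epochs = 8,940 updates, so 894 warmup steps, i.e. exactly the first epoch) and then decays it linearly to zero. A minimal sketch of that schedule (hypothetical helper name; Flair delegates the actual scheduling to the underlying transformers scheduler):

```python
def linear_lr(step, total_steps=8940, warmup_fraction=0.1, peak_lr=5e-5):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)  # 894 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps                              # warmup ramp
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # linear decay
```

This reproduces the lr column of the log: the rate peaks at 5e-05 at the end of epoch 1 ("lr: 0.000050" at iter 890) and reaches 0 by the last update of epoch 10 ("lr: 0.000000").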
2023-10-13 13:36:06,215 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 13:36:06,215 - metric: "('micro avg', 'f1-score')"
2023-10-13 13:36:06,215 ----------------------------------------------------------------------------------------------------
2023-10-13 13:36:06,215 Computation:
2023-10-13 13:36:06,215 - compute on device: cuda:0
2023-10-13 13:36:06,215 - embedding storage: none
2023-10-13 13:36:06,215 ----------------------------------------------------------------------------------------------------
2023-10-13 13:36:06,215 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 13:36:06,215 ----------------------------------------------------------------------------------------------------
2023-10-13 13:36:06,215 ----------------------------------------------------------------------------------------------------
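The parameters above correspond to a Flair fine-tuning configuration. A sketch of how such a run could be set up (settings like subtoken pooling and layer selection are inferred from the base path, so treat the details as assumptions, not the exact script used):

```python
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# hipe2020/de corpus, as listed in the MultiCorpus line above
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="de")
label_dict = corpus.make_label_dictionary(label_type="ner")

# "poolingfirst" and "layers-1" in the base path suggest these settings
embeddings = TransformerWordEmbeddings(
    "dbmdz/bert-base-historic-multilingual-cased",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# "crfFalse" in the base path: plain linear projection, no CRF
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() applies a linear schedule with 10% warmup by default
ModelTrainer(tagger, corpus).fine_tune(
    "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=5e-5,
    mini_batch_size=4,
    max_epochs=10,
)
```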
2023-10-13 13:36:10,938 epoch 1 - iter 89/894 - loss 2.68748942 - time (sec): 4.72 - samples/sec: 1872.42 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:36:15,495 epoch 1 - iter 178/894 - loss 1.59012845 - time (sec): 9.28 - samples/sec: 1947.09 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:36:19,942 epoch 1 - iter 267/894 - loss 1.22746441 - time (sec): 13.73 - samples/sec: 1893.71 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:36:24,240 epoch 1 - iter 356/894 - loss 0.99377093 - time (sec): 18.02 - samples/sec: 1937.51 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:36:28,368 epoch 1 - iter 445/894 - loss 0.85393894 - time (sec): 22.15 - samples/sec: 1951.39 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:36:32,638 epoch 1 - iter 534/894 - loss 0.76142709 - time (sec): 26.42 - samples/sec: 1955.45 - lr: 0.000030 - momentum: 0.000000
2023-10-13 13:36:36,984 epoch 1 - iter 623/894 - loss 0.69553409 - time (sec): 30.77 - samples/sec: 1945.95 - lr: 0.000035 - momentum: 0.000000
2023-10-13 13:36:41,113 epoch 1 - iter 712/894 - loss 0.64037032 - time (sec): 34.90 - samples/sec: 1961.99 - lr: 0.000040 - momentum: 0.000000
2023-10-13 13:36:45,296 epoch 1 - iter 801/894 - loss 0.59367766 - time (sec): 39.08 - samples/sec: 1957.13 - lr: 0.000045 - momentum: 0.000000
2023-10-13 13:36:49,755 epoch 1 - iter 890/894 - loss 0.55614722 - time (sec): 43.54 - samples/sec: 1979.46 - lr: 0.000050 - momentum: 0.000000
2023-10-13 13:36:49,955 ----------------------------------------------------------------------------------------------------
2023-10-13 13:36:49,955 EPOCH 1 done: loss 0.5541 - lr: 0.000050
2023-10-13 13:36:54,994 DEV : loss 0.25372514128685 - f1-score (micro avg) 0.5462
2023-10-13 13:36:55,024 saving best model
2023-10-13 13:36:55,386 ----------------------------------------------------------------------------------------------------
2023-10-13 13:36:59,708 epoch 2 - iter 89/894 - loss 0.17468129 - time (sec): 4.32 - samples/sec: 2088.65 - lr: 0.000049 - momentum: 0.000000
2023-10-13 13:37:04,099 epoch 2 - iter 178/894 - loss 0.17014545 - time (sec): 8.71 - samples/sec: 2135.36 - lr: 0.000049 - momentum: 0.000000
2023-10-13 13:37:08,087 epoch 2 - iter 267/894 - loss 0.16084483 - time (sec): 12.70 - samples/sec: 2087.62 - lr: 0.000048 - momentum: 0.000000
2023-10-13 13:37:12,211 epoch 2 - iter 356/894 - loss 0.16855910 - time (sec): 16.82 - samples/sec: 2070.57 - lr: 0.000048 - momentum: 0.000000
2023-10-13 13:37:16,385 epoch 2 - iter 445/894 - loss 0.16531549 - time (sec): 21.00 - samples/sec: 2077.00 - lr: 0.000047 - momentum: 0.000000
2023-10-13 13:37:20,503 epoch 2 - iter 534/894 - loss 0.15775841 - time (sec): 25.11 - samples/sec: 2082.98 - lr: 0.000047 - momentum: 0.000000
2023-10-13 13:37:24,683 epoch 2 - iter 623/894 - loss 0.15772402 - time (sec): 29.30 - samples/sec: 2063.56 - lr: 0.000046 - momentum: 0.000000
2023-10-13 13:37:28,927 epoch 2 - iter 712/894 - loss 0.15883528 - time (sec): 33.54 - samples/sec: 2046.72 - lr: 0.000046 - momentum: 0.000000
2023-10-13 13:37:33,333 epoch 2 - iter 801/894 - loss 0.15729564 - time (sec): 37.95 - samples/sec: 2042.63 - lr: 0.000045 - momentum: 0.000000
2023-10-13 13:37:37,353 epoch 2 - iter 890/894 - loss 0.15659491 - time (sec): 41.97 - samples/sec: 2054.37 - lr: 0.000044 - momentum: 0.000000
2023-10-13 13:37:37,536 ----------------------------------------------------------------------------------------------------
2023-10-13 13:37:37,536 EPOCH 2 done: loss 0.1565 - lr: 0.000044
2023-10-13 13:37:46,225 DEV : loss 0.146720752120018 - f1-score (micro avg) 0.6869
2023-10-13 13:37:46,253 saving best model
2023-10-13 13:37:46,641 ----------------------------------------------------------------------------------------------------
2023-10-13 13:37:50,828 epoch 3 - iter 89/894 - loss 0.11101622 - time (sec): 4.19 - samples/sec: 2071.00 - lr: 0.000044 - momentum: 0.000000
2023-10-13 13:37:55,412 epoch 3 - iter 178/894 - loss 0.10493274 - time (sec): 8.77 - samples/sec: 2056.15 - lr: 0.000043 - momentum: 0.000000
2023-10-13 13:37:59,662 epoch 3 - iter 267/894 - loss 0.10198121 - time (sec): 13.02 - samples/sec: 2031.68 - lr: 0.000043 - momentum: 0.000000
2023-10-13 13:38:03,763 epoch 3 - iter 356/894 - loss 0.10072889 - time (sec): 17.12 - samples/sec: 2031.21 - lr: 0.000042 - momentum: 0.000000
2023-10-13 13:38:07,891 epoch 3 - iter 445/894 - loss 0.10004708 - time (sec): 21.25 - samples/sec: 2003.26 - lr: 0.000042 - momentum: 0.000000
2023-10-13 13:38:11,984 epoch 3 - iter 534/894 - loss 0.09936209 - time (sec): 25.34 - samples/sec: 2018.71 - lr: 0.000041 - momentum: 0.000000
2023-10-13 13:38:16,023 epoch 3 - iter 623/894 - loss 0.09825369 - time (sec): 29.38 - samples/sec: 2020.23 - lr: 0.000041 - momentum: 0.000000
2023-10-13 13:38:20,434 epoch 3 - iter 712/894 - loss 0.09475486 - time (sec): 33.79 - samples/sec: 2027.08 - lr: 0.000040 - momentum: 0.000000
2023-10-13 13:38:24,656 epoch 3 - iter 801/894 - loss 0.09884765 - time (sec): 38.01 - samples/sec: 2018.81 - lr: 0.000039 - momentum: 0.000000
2023-10-13 13:38:28,994 epoch 3 - iter 890/894 - loss 0.09665535 - time (sec): 42.35 - samples/sec: 2034.77 - lr: 0.000039 - momentum: 0.000000
2023-10-13 13:38:29,183 ----------------------------------------------------------------------------------------------------
2023-10-13 13:38:29,183 EPOCH 3 done: loss 0.0966 - lr: 0.000039
2023-10-13 13:38:38,013 DEV : loss 0.1882152259349823 - f1-score (micro avg) 0.7134
2023-10-13 13:38:38,040 saving best model
2023-10-13 13:38:38,468 ----------------------------------------------------------------------------------------------------
2023-10-13 13:38:42,765 epoch 4 - iter 89/894 - loss 0.05521601 - time (sec): 4.29 - samples/sec: 2010.24 - lr: 0.000038 - momentum: 0.000000
2023-10-13 13:38:46,774 epoch 4 - iter 178/894 - loss 0.06025113 - time (sec): 8.30 - samples/sec: 2000.44 - lr: 0.000038 - momentum: 0.000000
2023-10-13 13:38:50,728 epoch 4 - iter 267/894 - loss 0.06246618 - time (sec): 12.25 - samples/sec: 2061.29 - lr: 0.000037 - momentum: 0.000000
2023-10-13 13:38:54,961 epoch 4 - iter 356/894 - loss 0.05713263 - time (sec): 16.49 - samples/sec: 2066.43 - lr: 0.000037 - momentum: 0.000000
2023-10-13 13:38:59,367 epoch 4 - iter 445/894 - loss 0.05854806 - time (sec): 20.89 - samples/sec: 2093.48 - lr: 0.000036 - momentum: 0.000000
2023-10-13 13:39:03,669 epoch 4 - iter 534/894 - loss 0.05727048 - time (sec): 25.19 - samples/sec: 2083.68 - lr: 0.000036 - momentum: 0.000000
2023-10-13 13:39:07,689 epoch 4 - iter 623/894 - loss 0.05943191 - time (sec): 29.21 - samples/sec: 2067.43 - lr: 0.000035 - momentum: 0.000000
2023-10-13 13:39:11,950 epoch 4 - iter 712/894 - loss 0.05928586 - time (sec): 33.48 - samples/sec: 2074.11 - lr: 0.000034 - momentum: 0.000000
2023-10-13 13:39:16,241 epoch 4 - iter 801/894 - loss 0.06054469 - time (sec): 37.77 - samples/sec: 2068.73 - lr: 0.000034 - momentum: 0.000000
2023-10-13 13:39:20,667 epoch 4 - iter 890/894 - loss 0.05991357 - time (sec): 42.19 - samples/sec: 2042.27 - lr: 0.000033 - momentum: 0.000000
2023-10-13 13:39:20,844 ----------------------------------------------------------------------------------------------------
2023-10-13 13:39:20,844 EPOCH 4 done: loss 0.0601 - lr: 0.000033
2023-10-13 13:39:29,758 DEV : loss 0.19933967292308807 - f1-score (micro avg) 0.7387
2023-10-13 13:39:29,792 saving best model
2023-10-13 13:39:30,297 ----------------------------------------------------------------------------------------------------
2023-10-13 13:39:34,666 epoch 5 - iter 89/894 - loss 0.05581229 - time (sec): 4.37 - samples/sec: 2074.31 - lr: 0.000033 - momentum: 0.000000
2023-10-13 13:39:39,035 epoch 5 - iter 178/894 - loss 0.04722653 - time (sec): 8.74 - samples/sec: 1991.98 - lr: 0.000032 - momentum: 0.000000
2023-10-13 13:39:43,612 epoch 5 - iter 267/894 - loss 0.04585486 - time (sec): 13.31 - samples/sec: 1974.54 - lr: 0.000032 - momentum: 0.000000
2023-10-13 13:39:47,897 epoch 5 - iter 356/894 - loss 0.04522439 - time (sec): 17.60 - samples/sec: 1991.17 - lr: 0.000031 - momentum: 0.000000
2023-10-13 13:39:52,087 epoch 5 - iter 445/894 - loss 0.04771289 - time (sec): 21.79 - samples/sec: 1986.95 - lr: 0.000031 - momentum: 0.000000
2023-10-13 13:39:56,379 epoch 5 - iter 534/894 - loss 0.04810893 - time (sec): 26.08 - samples/sec: 1984.66 - lr: 0.000030 - momentum: 0.000000
2023-10-13 13:40:00,872 epoch 5 - iter 623/894 - loss 0.04557853 - time (sec): 30.57 - samples/sec: 2003.60 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:40:05,100 epoch 5 - iter 712/894 - loss 0.04568690 - time (sec): 34.80 - samples/sec: 1991.16 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:40:09,267 epoch 5 - iter 801/894 - loss 0.04528489 - time (sec): 38.97 - samples/sec: 1993.82 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:40:13,313 epoch 5 - iter 890/894 - loss 0.04320919 - time (sec): 43.01 - samples/sec: 2005.20 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:40:13,482 ----------------------------------------------------------------------------------------------------
2023-10-13 13:40:13,482 EPOCH 5 done: loss 0.0432 - lr: 0.000028
2023-10-13 13:40:22,114 DEV : loss 0.22585448622703552 - f1-score (micro avg) 0.7412
2023-10-13 13:40:22,142 saving best model
2023-10-13 13:40:22,609 ----------------------------------------------------------------------------------------------------
2023-10-13 13:40:27,061 epoch 6 - iter 89/894 - loss 0.02085790 - time (sec): 4.45 - samples/sec: 1968.09 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:40:31,364 epoch 6 - iter 178/894 - loss 0.02817642 - time (sec): 8.75 - samples/sec: 2031.27 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:40:35,396 epoch 6 - iter 267/894 - loss 0.02565474 - time (sec): 12.79 - samples/sec: 2051.52 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:40:39,892 epoch 6 - iter 356/894 - loss 0.02217424 - time (sec): 17.28 - samples/sec: 2095.95 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:40:44,068 epoch 6 - iter 445/894 - loss 0.02178014 - time (sec): 21.46 - samples/sec: 2037.88 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:40:48,065 epoch 6 - iter 534/894 - loss 0.02116497 - time (sec): 25.45 - samples/sec: 2044.48 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:40:52,279 epoch 6 - iter 623/894 - loss 0.02285682 - time (sec): 29.67 - samples/sec: 2040.62 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:40:56,435 epoch 6 - iter 712/894 - loss 0.02393454 - time (sec): 33.82 - samples/sec: 2029.42 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:41:00,740 epoch 6 - iter 801/894 - loss 0.02490852 - time (sec): 38.13 - samples/sec: 2034.66 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:41:04,960 epoch 6 - iter 890/894 - loss 0.02709514 - time (sec): 42.35 - samples/sec: 2036.01 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:41:05,130 ----------------------------------------------------------------------------------------------------
2023-10-13 13:41:05,130 EPOCH 6 done: loss 0.0271 - lr: 0.000022
2023-10-13 13:41:13,883 DEV : loss 0.20660947263240814 - f1-score (micro avg) 0.7692
2023-10-13 13:41:13,912 saving best model
2023-10-13 13:41:14,371 ----------------------------------------------------------------------------------------------------
2023-10-13 13:41:18,988 epoch 7 - iter 89/894 - loss 0.01441146 - time (sec): 4.61 - samples/sec: 2170.14 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:41:23,142 epoch 7 - iter 178/894 - loss 0.01336769 - time (sec): 8.77 - samples/sec: 2064.61 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:41:27,267 epoch 7 - iter 267/894 - loss 0.01184471 - time (sec): 12.89 - samples/sec: 2090.67 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:41:31,390 epoch 7 - iter 356/894 - loss 0.01381643 - time (sec): 17.02 - samples/sec: 2108.61 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:41:35,581 epoch 7 - iter 445/894 - loss 0.01343301 - time (sec): 21.21 - samples/sec: 2094.77 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:41:39,673 epoch 7 - iter 534/894 - loss 0.01496204 - time (sec): 25.30 - samples/sec: 2059.50 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:41:43,879 epoch 7 - iter 623/894 - loss 0.01637548 - time (sec): 29.51 - samples/sec: 2062.67 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:41:48,037 epoch 7 - iter 712/894 - loss 0.01538496 - time (sec): 33.66 - samples/sec: 2054.63 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:41:52,153 epoch 7 - iter 801/894 - loss 0.01717510 - time (sec): 37.78 - samples/sec: 2051.43 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:41:56,385 epoch 7 - iter 890/894 - loss 0.01693132 - time (sec): 42.01 - samples/sec: 2051.71 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:41:56,565 ----------------------------------------------------------------------------------------------------
2023-10-13 13:41:56,565 EPOCH 7 done: loss 0.0169 - lr: 0.000017
2023-10-13 13:42:05,324 DEV : loss 0.2536468803882599 - f1-score (micro avg) 0.7769
2023-10-13 13:42:05,352 saving best model
2023-10-13 13:42:05,779 ----------------------------------------------------------------------------------------------------
2023-10-13 13:42:10,163 epoch 8 - iter 89/894 - loss 0.00786799 - time (sec): 4.38 - samples/sec: 1996.79 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:42:14,442 epoch 8 - iter 178/894 - loss 0.00866875 - time (sec): 8.66 - samples/sec: 1994.51 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:42:18,591 epoch 8 - iter 267/894 - loss 0.01092078 - time (sec): 12.81 - samples/sec: 2002.69 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:42:22,760 epoch 8 - iter 356/894 - loss 0.01198771 - time (sec): 16.98 - samples/sec: 1994.23 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:42:26,964 epoch 8 - iter 445/894 - loss 0.01138557 - time (sec): 21.18 - samples/sec: 1994.40 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:42:31,193 epoch 8 - iter 534/894 - loss 0.01099923 - time (sec): 25.41 - samples/sec: 1991.11 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:42:35,348 epoch 8 - iter 623/894 - loss 0.01067929 - time (sec): 29.57 - samples/sec: 1999.99 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:42:39,827 epoch 8 - iter 712/894 - loss 0.01143117 - time (sec): 34.05 - samples/sec: 2006.75 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:42:44,083 epoch 8 - iter 801/894 - loss 0.01122106 - time (sec): 38.30 - samples/sec: 2025.04 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:42:48,301 epoch 8 - iter 890/894 - loss 0.01097617 - time (sec): 42.52 - samples/sec: 2029.05 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:42:48,491 ----------------------------------------------------------------------------------------------------
2023-10-13 13:42:48,492 EPOCH 8 done: loss 0.0109 - lr: 0.000011
2023-10-13 13:42:57,409 DEV : loss 0.25275343656539917 - f1-score (micro avg) 0.7708
2023-10-13 13:42:57,440 ----------------------------------------------------------------------------------------------------
2023-10-13 13:43:02,157 epoch 9 - iter 89/894 - loss 0.00495732 - time (sec): 4.72 - samples/sec: 1777.88 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:43:06,477 epoch 9 - iter 178/894 - loss 0.00486139 - time (sec): 9.04 - samples/sec: 1901.16 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:43:10,501 epoch 9 - iter 267/894 - loss 0.00630451 - time (sec): 13.06 - samples/sec: 1931.23 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:43:14,571 epoch 9 - iter 356/894 - loss 0.00739423 - time (sec): 17.13 - samples/sec: 1957.10 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:43:18,711 epoch 9 - iter 445/894 - loss 0.00718639 - time (sec): 21.27 - samples/sec: 2000.40 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:43:22,820 epoch 9 - iter 534/894 - loss 0.00649298 - time (sec): 25.38 - samples/sec: 2010.27 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:43:26,935 epoch 9 - iter 623/894 - loss 0.00722465 - time (sec): 29.49 - samples/sec: 2039.50 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:43:31,387 epoch 9 - iter 712/894 - loss 0.00773812 - time (sec): 33.95 - samples/sec: 2060.28 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:43:35,413 epoch 9 - iter 801/894 - loss 0.00726079 - time (sec): 37.97 - samples/sec: 2056.51 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:43:39,402 epoch 9 - iter 890/894 - loss 0.00690011 - time (sec): 41.96 - samples/sec: 2056.85 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:43:39,576 ----------------------------------------------------------------------------------------------------
2023-10-13 13:43:39,576 EPOCH 9 done: loss 0.0069 - lr: 0.000006
2023-10-13 13:43:48,273 DEV : loss 0.2594471275806427 - f1-score (micro avg) 0.7778
2023-10-13 13:43:48,300 saving best model
2023-10-13 13:43:48,726 ----------------------------------------------------------------------------------------------------
2023-10-13 13:43:53,074 epoch 10 - iter 89/894 - loss 0.00215822 - time (sec): 4.35 - samples/sec: 2283.09 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:43:57,281 epoch 10 - iter 178/894 - loss 0.00389851 - time (sec): 8.55 - samples/sec: 2190.38 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:44:01,325 epoch 10 - iter 267/894 - loss 0.00350003 - time (sec): 12.60 - samples/sec: 2146.14 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:44:05,445 epoch 10 - iter 356/894 - loss 0.00399930 - time (sec): 16.72 - samples/sec: 2102.62 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:44:09,795 epoch 10 - iter 445/894 - loss 0.00340014 - time (sec): 21.07 - samples/sec: 2080.76 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:44:14,542 epoch 10 - iter 534/894 - loss 0.00358239 - time (sec): 25.82 - samples/sec: 2011.27 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:44:19,279 epoch 10 - iter 623/894 - loss 0.00371600 - time (sec): 30.55 - samples/sec: 1968.33 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:44:23,736 epoch 10 - iter 712/894 - loss 0.00366171 - time (sec): 35.01 - samples/sec: 1968.68 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:44:28,034 epoch 10 - iter 801/894 - loss 0.00355658 - time (sec): 39.31 - samples/sec: 1972.71 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:44:32,317 epoch 10 - iter 890/894 - loss 0.00415920 - time (sec): 43.59 - samples/sec: 1977.84 - lr: 0.000000 - momentum: 0.000000
2023-10-13 13:44:32,511 ----------------------------------------------------------------------------------------------------
2023-10-13 13:44:32,511 EPOCH 10 done: loss 0.0041 - lr: 0.000000
2023-10-13 13:44:41,095 DEV : loss 0.26341870427131653 - f1-score (micro avg) 0.7824
2023-10-13 13:44:41,123 saving best model
2023-10-13 13:44:41,945 ----------------------------------------------------------------------------------------------------
2023-10-13 13:44:41,946 Loading model from best epoch ...
2023-10-13 13:44:43,435 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
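The 21 tags are the BIOES encoding of 5 entity types (loc, pers, org, prod, time) plus O: S- marks a single-token entity, B-/I-/E- mark the beginning, inside, and end of a multi-token one. A small illustrative decoder (hypothetical helper; Flair performs this span extraction internally):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                       # single-token entity
            spans.append((label, i, i + 1))
        elif prefix == "B":                     # entity opens
            start = i
        elif prefix == "E" and start is not None:  # entity closes
            spans.append((label, start, i + 1))
            start = None
    return spans
```

For example, `["O", "S-loc", "B-pers", "I-pers", "E-pers", "O"]` decodes to one loc span and one three-token pers span.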
2023-10-13 13:44:48,077
Results:
- F-score (micro) 0.7298
- F-score (macro) 0.658
- Accuracy 0.5973
By class:
              precision    recall  f1-score   support

         loc     0.8023    0.8238    0.8129       596
        pers     0.6329    0.7508    0.6868       333
         org     0.5833    0.5303    0.5556       132
        prod     0.6087    0.4242    0.5000        66
        time     0.7347    0.7347    0.7347        49

   micro avg     0.7160    0.7440    0.7298      1176
   macro avg     0.6724    0.6528    0.6580      1176
weighted avg     0.7161    0.7440    0.7275      1176
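As a sanity check, the micro F-score is the harmonic mean of micro precision and recall, and the macro F-score is the plain mean of the per-class F1 values (inputs below are the table's rounded figures, so the micro check is only approximate):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

micro_f1 = f1(0.7160, 0.7440)  # ~0.7297, vs. 0.7298 reported (rounding in inputs)

per_class_f1 = [0.8129, 0.6868, 0.5556, 0.5000, 0.7347]  # loc, pers, org, prod, time
macro_f1 = sum(per_class_f1) / len(per_class_f1)          # 0.658, matching the log
```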
2023-10-13 13:44:48,078 ----------------------------------------------------------------------------------------------------