2023-10-13 12:55:14,652 ----------------------------------------------------------------------------------------------------
2023-10-13 12:55:14,653 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 12:55:14,653 ----------------------------------------------------------------------------------------------------
2023-10-13 12:55:14,653 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-13 12:55:14,653 ----------------------------------------------------------------------------------------------------
2023-10-13 12:55:14,653 Train: 3575 sentences
2023-10-13 12:55:14,653 (train_with_dev=False, train_with_test=False)
2023-10-13 12:55:14,653 ----------------------------------------------------------------------------------------------------
2023-10-13 12:55:14,653 Training Params:
2023-10-13 12:55:14,653 - learning_rate: "3e-05"
2023-10-13 12:55:14,653 - mini_batch_size: "4"
2023-10-13 12:55:14,653 - max_epochs: "10"
2023-10-13 12:55:14,653 - shuffle: "True"
2023-10-13 12:55:14,653 ----------------------------------------------------------------------------------------------------
2023-10-13 12:55:14,653 Plugins:
2023-10-13 12:55:14,654 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 12:55:14,654 ----------------------------------------------------------------------------------------------------
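The `lr` column in the per-iteration lines below follows from the parameters logged above (learning_rate 3e-05, mini_batch_size 4, max_epochs 10, warmup_fraction 0.1). The sketch below reproduces it under the assumption that the LinearScheduler plugin uses the usual linear warmup to the peak rate followed by linear decay to zero. The function names are hypothetical, and the exact step indexing inside Flair may differ by one step.

```python
import math

# Values taken from the log header.
TRAIN_SENTENCES = 3575
MINI_BATCH_SIZE = 4
MAX_EPOCHS = 10
PEAK_LR = 3e-05
WARMUP_FRACTION = 0.1

# 3575 sentences in batches of 4 give the 894 iterations per epoch seen in the log.
steps_per_epoch = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
total_steps = steps_per_epoch * MAX_EPOCHS
warmup_steps = round(total_steps * WARMUP_FRACTION)


def global_step(epoch: int, iteration: int) -> int:
    """Map a 1-based (epoch, iter) pair from the log to a global step count."""
    return (epoch - 1) * steps_per_epoch + iteration


def linear_schedule_lr(step: int) -> float:
    """Assumed schedule: linear warmup to PEAK_LR, then linear decay to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


# Print the rate at a few logged positions, formatted like the log's lr column.
for epoch, iteration in [(1, 89), (1, 890), (2, 178), (5, 890), (10, 890)]:
    lr = linear_schedule_lr(global_step(epoch, iteration))
    print(f"epoch {epoch} - iter {iteration}/{steps_per_epoch} - lr: {lr:.6f}")
```

With a warmup fraction of 0.1 over 10 epochs, the warmup covers exactly the first epoch, which is why the logged rate peaks at the end of epoch 1 and falls from epoch 2 onward.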
2023-10-13 12:55:14,654 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 12:55:14,654 - metric: "('micro avg', 'f1-score')"
2023-10-13 12:55:14,654 ----------------------------------------------------------------------------------------------------
2023-10-13 12:55:14,654 Computation:
2023-10-13 12:55:14,654 - compute on device: cuda:0
2023-10-13 12:55:14,654 - embedding storage: none
2023-10-13 12:55:14,654 ----------------------------------------------------------------------------------------------------
2023-10-13 12:55:14,654 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-13 12:55:14,654 ----------------------------------------------------------------------------------------------------
2023-10-13 12:55:14,654 ----------------------------------------------------------------------------------------------------
2023-10-13 12:55:18,848 epoch 1 - iter 89/894 - loss 2.69593370 - time (sec): 4.19 - samples/sec: 2000.73 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:55:23,146 epoch 1 - iter 178/894 - loss 1.71958865 - time (sec): 8.49 - samples/sec: 2028.76 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:55:27,200 epoch 1 - iter 267/894 - loss 1.32701386 - time (sec): 12.54 - samples/sec: 2010.55 - lr: 0.000009 - momentum: 0.000000
2023-10-13 12:55:31,451 epoch 1 - iter 356/894 - loss 1.05861260 - time (sec): 16.80 - samples/sec: 2078.11 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:55:35,742 epoch 1 - iter 445/894 - loss 0.90380884 - time (sec): 21.09 - samples/sec: 2070.33 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:55:39,944 epoch 1 - iter 534/894 - loss 0.79977481 - time (sec): 25.29 - samples/sec: 2079.85 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:55:44,011 epoch 1 - iter 623/894 - loss 0.73143982 - time (sec): 29.36 - samples/sec: 2066.38 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:55:48,084 epoch 1 - iter 712/894 - loss 0.68005300 - time (sec): 33.43 - samples/sec: 2056.86 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:55:52,305 epoch 1 - iter 801/894 - loss 0.63023691 - time (sec): 37.65 - samples/sec: 2070.97 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:55:56,579 epoch 1 - iter 890/894 - loss 0.59407354 - time (sec): 41.92 - samples/sec: 2057.15 - lr: 0.000030 - momentum: 0.000000
2023-10-13 12:55:56,770 ----------------------------------------------------------------------------------------------------
2023-10-13 12:55:56,770 EPOCH 1 done: loss 0.5935 - lr: 0.000030
2023-10-13 12:56:01,600 DEV : loss 0.19242174923419952 - f1-score (micro avg) 0.581
2023-10-13 12:56:01,629 saving best model
2023-10-13 12:56:01,951 ----------------------------------------------------------------------------------------------------
2023-10-13 12:56:05,928 epoch 2 - iter 89/894 - loss 0.20351765 - time (sec): 3.98 - samples/sec: 2070.12 - lr: 0.000030 - momentum: 0.000000
2023-10-13 12:56:09,978 epoch 2 - iter 178/894 - loss 0.19371181 - time (sec): 8.03 - samples/sec: 2100.43 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:56:14,065 epoch 2 - iter 267/894 - loss 0.17774358 - time (sec): 12.11 - samples/sec: 2072.98 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:56:18,280 epoch 2 - iter 356/894 - loss 0.16603566 - time (sec): 16.33 - samples/sec: 2063.76 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:56:22,329 epoch 2 - iter 445/894 - loss 0.17216562 - time (sec): 20.38 - samples/sec: 2057.61 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:56:26,707 epoch 2 - iter 534/894 - loss 0.16847025 - time (sec): 24.75 - samples/sec: 2027.62 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:56:31,007 epoch 2 - iter 623/894 - loss 0.16459087 - time (sec): 29.06 - samples/sec: 2036.86 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:56:35,095 epoch 2 - iter 712/894 - loss 0.16278123 - time (sec): 33.14 - samples/sec: 2049.87 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:56:39,067 epoch 2 - iter 801/894 - loss 0.16215338 - time (sec): 37.11 - samples/sec: 2088.95 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:56:43,379 epoch 2 - iter 890/894 - loss 0.16148233 - time (sec): 41.43 - samples/sec: 2081.25 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:56:43,566 ----------------------------------------------------------------------------------------------------
2023-10-13 12:56:43,566 EPOCH 2 done: loss 0.1618 - lr: 0.000027
2023-10-13 12:56:52,106 DEV : loss 0.14601190388202667 - f1-score (micro avg) 0.707
2023-10-13 12:56:52,135 saving best model
2023-10-13 12:56:52,581 ----------------------------------------------------------------------------------------------------
2023-10-13 12:56:56,686 epoch 3 - iter 89/894 - loss 0.09364191 - time (sec): 4.10 - samples/sec: 1998.31 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:57:00,659 epoch 3 - iter 178/894 - loss 0.08926448 - time (sec): 8.07 - samples/sec: 1991.65 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:57:04,820 epoch 3 - iter 267/894 - loss 0.08910757 - time (sec): 12.23 - samples/sec: 2046.42 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:57:08,871 epoch 3 - iter 356/894 - loss 0.09229143 - time (sec): 16.29 - samples/sec: 2054.75 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:57:13,084 epoch 3 - iter 445/894 - loss 0.09330711 - time (sec): 20.50 - samples/sec: 2026.30 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:57:17,109 epoch 3 - iter 534/894 - loss 0.08872648 - time (sec): 24.52 - samples/sec: 2052.27 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:57:21,286 epoch 3 - iter 623/894 - loss 0.08943539 - time (sec): 28.70 - samples/sec: 2046.90 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:57:25,411 epoch 3 - iter 712/894 - loss 0.09181165 - time (sec): 32.83 - samples/sec: 2051.35 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:57:29,642 epoch 3 - iter 801/894 - loss 0.09100852 - time (sec): 37.06 - samples/sec: 2060.24 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:57:34,154 epoch 3 - iter 890/894 - loss 0.09219008 - time (sec): 41.57 - samples/sec: 2072.83 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:57:34,340 ----------------------------------------------------------------------------------------------------
2023-10-13 12:57:34,340 EPOCH 3 done: loss 0.0919 - lr: 0.000023
2023-10-13 12:57:42,891 DEV : loss 0.169508159160614 - f1-score (micro avg) 0.7217
2023-10-13 12:57:42,921 saving best model
2023-10-13 12:57:43,338 ----------------------------------------------------------------------------------------------------
2023-10-13 12:57:47,481 epoch 4 - iter 89/894 - loss 0.07066538 - time (sec): 4.14 - samples/sec: 2006.71 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:57:51,485 epoch 4 - iter 178/894 - loss 0.06521536 - time (sec): 8.15 - samples/sec: 2053.11 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:57:55,505 epoch 4 - iter 267/894 - loss 0.06095332 - time (sec): 12.17 - samples/sec: 2061.80 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:57:59,865 epoch 4 - iter 356/894 - loss 0.05999315 - time (sec): 16.53 - samples/sec: 2133.54 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:58:03,915 epoch 4 - iter 445/894 - loss 0.05842531 - time (sec): 20.58 - samples/sec: 2114.71 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:58:08,237 epoch 4 - iter 534/894 - loss 0.05992046 - time (sec): 24.90 - samples/sec: 2095.16 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:58:12,418 epoch 4 - iter 623/894 - loss 0.05852169 - time (sec): 29.08 - samples/sec: 2093.88 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:58:16,496 epoch 4 - iter 712/894 - loss 0.05864176 - time (sec): 33.16 - samples/sec: 2086.38 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:58:20,933 epoch 4 - iter 801/894 - loss 0.05904040 - time (sec): 37.59 - samples/sec: 2083.56 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:58:24,881 epoch 4 - iter 890/894 - loss 0.05946598 - time (sec): 41.54 - samples/sec: 2076.25 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:58:25,052 ----------------------------------------------------------------------------------------------------
2023-10-13 12:58:25,052 EPOCH 4 done: loss 0.0594 - lr: 0.000020
2023-10-13 12:58:33,488 DEV : loss 0.20608599483966827 - f1-score (micro avg) 0.7631
2023-10-13 12:58:33,518 saving best model
2023-10-13 12:58:34,001 ----------------------------------------------------------------------------------------------------
2023-10-13 12:58:38,060 epoch 5 - iter 89/894 - loss 0.05023316 - time (sec): 4.06 - samples/sec: 2106.72 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:58:42,103 epoch 5 - iter 178/894 - loss 0.04072573 - time (sec): 8.10 - samples/sec: 2066.73 - lr: 0.000019 - momentum: 0.000000
2023-10-13 12:58:46,306 epoch 5 - iter 267/894 - loss 0.04046872 - time (sec): 12.30 - samples/sec: 2056.71 - lr: 0.000019 - momentum: 0.000000
2023-10-13 12:58:50,544 epoch 5 - iter 356/894 - loss 0.04285187 - time (sec): 16.54 - samples/sec: 2012.79 - lr: 0.000019 - momentum: 0.000000
2023-10-13 12:58:54,937 epoch 5 - iter 445/894 - loss 0.04118917 - time (sec): 20.93 - samples/sec: 2034.33 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:58:59,181 epoch 5 - iter 534/894 - loss 0.04312736 - time (sec): 25.18 - samples/sec: 2018.89 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:59:03,441 epoch 5 - iter 623/894 - loss 0.04081111 - time (sec): 29.44 - samples/sec: 1999.35 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:59:07,980 epoch 5 - iter 712/894 - loss 0.04391013 - time (sec): 33.98 - samples/sec: 2030.64 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:59:12,804 epoch 5 - iter 801/894 - loss 0.04351538 - time (sec): 38.80 - samples/sec: 2012.42 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:59:17,151 epoch 5 - iter 890/894 - loss 0.04355119 - time (sec): 43.15 - samples/sec: 1999.22 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:59:17,355 ----------------------------------------------------------------------------------------------------
2023-10-13 12:59:17,355 EPOCH 5 done: loss 0.0435 - lr: 0.000017
2023-10-13 12:59:26,229 DEV : loss 0.20556075870990753 - f1-score (micro avg) 0.7782
2023-10-13 12:59:26,260 saving best model
2023-10-13 12:59:26,730 ----------------------------------------------------------------------------------------------------
2023-10-13 12:59:31,174 epoch 6 - iter 89/894 - loss 0.02131180 - time (sec): 4.44 - samples/sec: 2185.91 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:59:35,181 epoch 6 - iter 178/894 - loss 0.01740524 - time (sec): 8.45 - samples/sec: 2134.49 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:59:39,711 epoch 6 - iter 267/894 - loss 0.01734140 - time (sec): 12.98 - samples/sec: 2072.83 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:59:44,017 epoch 6 - iter 356/894 - loss 0.02350570 - time (sec): 17.28 - samples/sec: 2074.33 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:59:48,252 epoch 6 - iter 445/894 - loss 0.02543387 - time (sec): 21.52 - samples/sec: 2080.84 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:59:52,367 epoch 6 - iter 534/894 - loss 0.02567710 - time (sec): 25.63 - samples/sec: 2060.05 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:59:56,761 epoch 6 - iter 623/894 - loss 0.02518657 - time (sec): 30.03 - samples/sec: 2023.69 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:00:00,899 epoch 6 - iter 712/894 - loss 0.02743067 - time (sec): 34.17 - samples/sec: 2015.73 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:00:05,151 epoch 6 - iter 801/894 - loss 0.02804290 - time (sec): 38.42 - samples/sec: 2022.12 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:00:09,498 epoch 6 - iter 890/894 - loss 0.02756516 - time (sec): 42.77 - samples/sec: 2017.73 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:00:09,683 ----------------------------------------------------------------------------------------------------
2023-10-13 13:00:09,684 EPOCH 6 done: loss 0.0277 - lr: 0.000013
2023-10-13 13:00:18,535 DEV : loss 0.20008665323257446 - f1-score (micro avg) 0.7608
2023-10-13 13:00:18,570 ----------------------------------------------------------------------------------------------------
2023-10-13 13:00:22,657 epoch 7 - iter 89/894 - loss 0.02473568 - time (sec): 4.09 - samples/sec: 2283.61 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:00:27,033 epoch 7 - iter 178/894 - loss 0.02214234 - time (sec): 8.46 - samples/sec: 2258.02 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:00:31,190 epoch 7 - iter 267/894 - loss 0.01933197 - time (sec): 12.62 - samples/sec: 2215.83 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:00:35,256 epoch 7 - iter 356/894 - loss 0.01840027 - time (sec): 16.68 - samples/sec: 2236.09 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:00:39,376 epoch 7 - iter 445/894 - loss 0.01736996 - time (sec): 20.81 - samples/sec: 2212.78 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:00:43,341 epoch 7 - iter 534/894 - loss 0.01710256 - time (sec): 24.77 - samples/sec: 2178.14 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:00:47,660 epoch 7 - iter 623/894 - loss 0.01847066 - time (sec): 29.09 - samples/sec: 2116.68 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:00:51,880 epoch 7 - iter 712/894 - loss 0.01768895 - time (sec): 33.31 - samples/sec: 2101.35 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:00:55,976 epoch 7 - iter 801/894 - loss 0.01830601 - time (sec): 37.40 - samples/sec: 2086.22 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:01:00,075 epoch 7 - iter 890/894 - loss 0.01838476 - time (sec): 41.50 - samples/sec: 2074.90 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:01:00,255 ----------------------------------------------------------------------------------------------------
2023-10-13 13:01:00,255 EPOCH 7 done: loss 0.0183 - lr: 0.000010
2023-10-13 13:01:09,025 DEV : loss 0.23889903724193573 - f1-score (micro avg) 0.778
2023-10-13 13:01:09,055 ----------------------------------------------------------------------------------------------------
2023-10-13 13:01:13,200 epoch 8 - iter 89/894 - loss 0.01264123 - time (sec): 4.14 - samples/sec: 2109.44 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:01:17,329 epoch 8 - iter 178/894 - loss 0.01441008 - time (sec): 8.27 - samples/sec: 2044.81 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:01:21,833 epoch 8 - iter 267/894 - loss 0.01359192 - time (sec): 12.78 - samples/sec: 2117.06 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:01:26,166 epoch 8 - iter 356/894 - loss 0.01220036 - time (sec): 17.11 - samples/sec: 2075.45 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:01:30,406 epoch 8 - iter 445/894 - loss 0.01217291 - time (sec): 21.35 - samples/sec: 2063.59 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:01:34,751 epoch 8 - iter 534/894 - loss 0.01395930 - time (sec): 25.69 - samples/sec: 2061.54 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:01:38,837 epoch 8 - iter 623/894 - loss 0.01363106 - time (sec): 29.78 - samples/sec: 2061.57 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:01:43,036 epoch 8 - iter 712/894 - loss 0.01331376 - time (sec): 33.98 - samples/sec: 2049.69 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:01:47,676 epoch 8 - iter 801/894 - loss 0.01313221 - time (sec): 38.62 - samples/sec: 2032.02 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:01:52,010 epoch 8 - iter 890/894 - loss 0.01300296 - time (sec): 42.95 - samples/sec: 2008.08 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:01:52,187 ----------------------------------------------------------------------------------------------------
2023-10-13 13:01:52,188 EPOCH 8 done: loss 0.0131 - lr: 0.000007
2023-10-13 13:02:01,000 DEV : loss 0.23165372014045715 - f1-score (micro avg) 0.7734
2023-10-13 13:02:01,032 ----------------------------------------------------------------------------------------------------
2023-10-13 13:02:05,174 epoch 9 - iter 89/894 - loss 0.00510865 - time (sec): 4.14 - samples/sec: 1989.00 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:02:09,509 epoch 9 - iter 178/894 - loss 0.00451444 - time (sec): 8.48 - samples/sec: 2090.14 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:02:13,747 epoch 9 - iter 267/894 - loss 0.00647433 - time (sec): 12.71 - samples/sec: 2067.70 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:02:18,039 epoch 9 - iter 356/894 - loss 0.00857438 - time (sec): 17.01 - samples/sec: 2043.58 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:02:22,150 epoch 9 - iter 445/894 - loss 0.01024290 - time (sec): 21.12 - samples/sec: 2049.28 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:02:26,250 epoch 9 - iter 534/894 - loss 0.00899714 - time (sec): 25.22 - samples/sec: 2043.77 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:02:30,329 epoch 9 - iter 623/894 - loss 0.00792510 - time (sec): 29.30 - samples/sec: 2035.48 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:02:34,640 epoch 9 - iter 712/894 - loss 0.00742665 - time (sec): 33.61 - samples/sec: 2032.31 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:02:38,893 epoch 9 - iter 801/894 - loss 0.00799065 - time (sec): 37.86 - samples/sec: 2058.77 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:02:42,941 epoch 9 - iter 890/894 - loss 0.00818853 - time (sec): 41.91 - samples/sec: 2056.11 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:02:43,123 ----------------------------------------------------------------------------------------------------
2023-10-13 13:02:43,123 EPOCH 9 done: loss 0.0084 - lr: 0.000003
2023-10-13 13:02:51,867 DEV : loss 0.2373068630695343 - f1-score (micro avg) 0.7803
2023-10-13 13:02:51,898 saving best model
2023-10-13 13:02:52,380 ----------------------------------------------------------------------------------------------------
2023-10-13 13:02:56,667 epoch 10 - iter 89/894 - loss 0.00299024 - time (sec): 4.28 - samples/sec: 2137.65 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:03:00,728 epoch 10 - iter 178/894 - loss 0.00773541 - time (sec): 8.34 - samples/sec: 2091.33 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:03:05,105 epoch 10 - iter 267/894 - loss 0.00678755 - time (sec): 12.72 - samples/sec: 2063.17 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:03:09,449 epoch 10 - iter 356/894 - loss 0.00753006 - time (sec): 17.06 - samples/sec: 2048.86 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:03:13,636 epoch 10 - iter 445/894 - loss 0.00720676 - time (sec): 21.25 - samples/sec: 2025.78 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:03:17,696 epoch 10 - iter 534/894 - loss 0.00687943 - time (sec): 25.31 - samples/sec: 2026.65 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:03:22,040 epoch 10 - iter 623/894 - loss 0.00586593 - time (sec): 29.65 - samples/sec: 2033.97 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:03:26,626 epoch 10 - iter 712/894 - loss 0.00544570 - time (sec): 34.24 - samples/sec: 2055.88 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:03:30,764 epoch 10 - iter 801/894 - loss 0.00536573 - time (sec): 38.38 - samples/sec: 2038.80 - lr: 0.000000 - momentum: 0.000000
2023-10-13 13:03:34,874 epoch 10 - iter 890/894 - loss 0.00540931 - time (sec): 42.49 - samples/sec: 2027.55 - lr: 0.000000 - momentum: 0.000000
2023-10-13 13:03:35,057 ----------------------------------------------------------------------------------------------------
2023-10-13 13:03:35,058 EPOCH 10 done: loss 0.0054 - lr: 0.000000
2023-10-13 13:03:43,593 DEV : loss 0.23833510279655457 - f1-score (micro avg) 0.7798
2023-10-13 13:03:43,975 ----------------------------------------------------------------------------------------------------
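The "saving best model" lines above appear only after epochs whose DEV micro F1 beats every earlier epoch. The sketch below replays that selection from the logged DEV scores. It assumes a strict greater-than comparison, and `checkpoint_epochs` is a hypothetical helper rather than a Flair function.

```python
# DEV f1-score (micro avg) per epoch, copied from the log above.
dev_f1_by_epoch = [
    0.581, 0.707, 0.7217, 0.7631, 0.7782,
    0.7608, 0.778, 0.7734, 0.7803, 0.7798,
]


def checkpoint_epochs(scores):
    """Return the 1-based epochs at which a new best score is reached."""
    best = float("-inf")
    saved = []
    for epoch, score in enumerate(scores, start=1):
        if score > best:  # assumption: only a strict improvement triggers a save
            best = score
            saved.append(epoch)
    return saved


saved_epochs = checkpoint_epochs(dev_f1_by_epoch)
best_epoch = saved_epochs[-1]
print("checkpoints written after epochs:", saved_epochs)
print("best epoch:", best_epoch, "with DEV F1", dev_f1_by_epoch[best_epoch - 1])
```

Under that assumption, the final evaluation on best-model.pt uses the epoch 9 weights (DEV F1 0.7803), not the epoch 10 weights.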
2023-10-13 13:03:43,976 Loading model from best epoch ...
2023-10-13 13:03:45,339 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
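The 21 tags listed above correspond to one `O` tag plus four span-position prefixes (S, B, E, I) for each of the five entity types, which matches `out_features=21` in the model's final linear layer. A minimal sketch that rebuilds the list; the variable names are illustrative only.

```python
# Entity types and span-position prefixes, in the order they appear in the log.
ENTITY_TYPES = ["loc", "pers", "org", "prod", "time"]
PREFIXES = ["S", "B", "E", "I"]  # single, begin, end, inside

# "O" marks tokens outside any entity; every type gets all four prefixes.
tags = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in PREFIXES]

print(len(tags))
print(", ".join(tags))
```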
2023-10-13 13:03:49,655
Results:
- F-score (micro) 0.7373
- F-score (macro) 0.662
- Accuracy 0.6036
By class:
              precision    recall  f1-score   support

         loc     0.8468    0.8255    0.8360       596
        pers     0.6525    0.7387    0.6930       333
         org     0.4965    0.5303    0.5128       132
        prod     0.6531    0.4848    0.5565        66
        time     0.6727    0.7551    0.7115        49

   micro avg     0.7290    0.7457    0.7373      1176
   macro avg     0.6643    0.6669    0.6620      1176
weighted avg     0.7343    0.7457    0.7384      1176
2023-10-13 13:03:49,655 ----------------------------------------------------------------------------------------------------
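The summary rows of the test report can be cross-checked from the per-class precision, recall, and support values. The sketch below recomputes them with the standard F1 definition. Because the inputs are the four-decimal values printed in the log, the results agree with the report only up to rounding.

```python
# (precision, recall, support) per class, copied from the report above.
report = {
    "loc": (0.8468, 0.8255, 596),
    "pers": (0.6525, 0.7387, 333),
    "org": (0.4965, 0.5303, 132),
    "prod": (0.6531, 0.4848, 66),
    "time": (0.6727, 0.7551, 49),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


per_class_f1 = {label: f1(p, r) for label, (p, r, _) in report.items()}
total_support = sum(support for _, _, support in report.values())

# Macro: unweighted mean over classes. Weighted: mean weighted by support.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
weighted_f1 = sum(
    per_class_f1[label] * support for label, (_, _, support) in report.items()
) / total_support

# Micro: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = f1(0.7290, 0.7457)

for label, score in per_class_f1.items():
    print(f"{label:>12} {score:.4f}")
print(f"{'micro avg':>12} {micro_f1:.4f}")
print(f"{'macro avg':>12} {macro_f1:.4f}")
print(f"{'weighted avg':>12} {weighted_f1:.4f}")
```

The gap between micro F1 (0.7373) and macro F1 (0.6620) reflects the class imbalance: `loc` has the most support and the highest score, while the weaker `org` and `prod` classes count equally in the macro average.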