2023-10-13 13:04:16,455 ----------------------------------------------------------------------------------------------------
2023-10-13 13:04:16,455 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 13:04:16,456 ----------------------------------------------------------------------------------------------------
2023-10-13 13:04:16,456 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-13 13:04:16,456 ----------------------------------------------------------------------------------------------------
2023-10-13 13:04:16,456 Train: 3575 sentences
2023-10-13 13:04:16,456 (train_with_dev=False, train_with_test=False)
2023-10-13 13:04:16,456 ----------------------------------------------------------------------------------------------------
2023-10-13 13:04:16,456 Training Params:
2023-10-13 13:04:16,456 - learning_rate: "5e-05"
2023-10-13 13:04:16,456 - mini_batch_size: "4"
2023-10-13 13:04:16,456 - max_epochs: "10"
2023-10-13 13:04:16,456 - shuffle: "True"
2023-10-13 13:04:16,456 ----------------------------------------------------------------------------------------------------
2023-10-13 13:04:16,456 Plugins:
2023-10-13 13:04:16,456 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 13:04:16,456 ----------------------------------------------------------------------------------------------------
2023-10-13 13:04:16,456 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 13:04:16,456 - metric: "('micro avg', 'f1-score')"
2023-10-13 13:04:16,456 ----------------------------------------------------------------------------------------------------
2023-10-13 13:04:16,456 Computation:
2023-10-13 13:04:16,456 - compute on device: cuda:0
2023-10-13 13:04:16,456 - embedding storage: none
2023-10-13 13:04:16,456 ----------------------------------------------------------------------------------------------------
2023-10-13 13:04:16,456 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-13 13:04:16,456 ----------------------------------------------------------------------------------------------------
2023-10-13 13:04:16,456 ----------------------------------------------------------------------------------------------------
2023-10-13 13:04:20,590 epoch 1 - iter 89/894 - loss 2.36634984 - time (sec): 4.13 - samples/sec: 2030.01 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:04:24,739 epoch 1 - iter 178/894 - loss 1.43239919 - time (sec): 8.28 - samples/sec: 2080.21 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:04:28,865 epoch 1 - iter 267/894 - loss 1.11148343 - time (sec): 12.41 - samples/sec: 2032.80 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:04:33,154 epoch 1 - iter 356/894 - loss 0.89126147 - time (sec): 16.70 - samples/sec: 2090.54 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:04:37,385 epoch 1 - iter 445/894 - loss 0.76537662 - time (sec): 20.93 - samples/sec: 2086.15 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:04:41,659 epoch 1 - iter 534/894 - loss 0.68100211 - time (sec): 25.20 - samples/sec: 2087.08 - lr: 0.000030 - momentum: 0.000000
2023-10-13 13:04:45,922 epoch 1 - iter 623/894 - loss 0.62588240 - time (sec): 29.46 - samples/sec: 2058.75 - lr: 0.000035 - momentum: 0.000000
2023-10-13 13:04:50,658 epoch 1 - iter 712/894 - loss 0.58637517 - time (sec): 34.20 - samples/sec: 2010.46 - lr: 0.000040 - momentum: 0.000000
2023-10-13 13:04:55,024 epoch 1 - iter 801/894 - loss 0.54465662 - time (sec): 38.57 - samples/sec: 2021.76 - lr: 0.000045 - momentum: 0.000000
2023-10-13 13:04:59,165 epoch 1 - iter 890/894 - loss 0.51580101 - time (sec): 42.71 - samples/sec: 2019.42 - lr: 0.000050 - momentum: 0.000000
2023-10-13 13:04:59,339 ----------------------------------------------------------------------------------------------------
2023-10-13 13:04:59,339 EPOCH 1 done: loss 0.5152 - lr: 0.000050
2023-10-13 13:05:04,379 DEV : loss 0.1789691001176834 - f1-score (micro avg) 0.6043
2023-10-13 13:05:04,404 saving best model
2023-10-13 13:05:04,708 ----------------------------------------------------------------------------------------------------
2023-10-13 13:05:08,595 epoch 2 - iter 89/894 - loss 0.19123209 - time (sec): 3.89 - samples/sec: 2118.22 - lr: 0.000049 - momentum: 0.000000
2023-10-13 13:05:12,714 epoch 2 - iter 178/894 - loss 0.18653465 - time (sec): 8.00 - samples/sec: 2105.89 - lr: 0.000049 - momentum: 0.000000
2023-10-13 13:05:16,707 epoch 2 - iter 267/894 - loss 0.17320392 - time (sec): 12.00 - samples/sec: 2092.69 - lr: 0.000048 - momentum: 0.000000
2023-10-13 13:05:20,941 epoch 2 - iter 356/894 - loss 0.16508210 - time (sec): 16.23 - samples/sec: 2076.00 - lr: 0.000048 - momentum: 0.000000
2023-10-13 13:05:25,000 epoch 2 - iter 445/894 - loss 0.17103083 - time (sec): 20.29 - samples/sec: 2066.33 - lr: 0.000047 - momentum: 0.000000
2023-10-13 13:05:29,170 epoch 2 - iter 534/894 - loss 0.16503055 - time (sec): 24.46 - samples/sec: 2052.01 - lr: 0.000047 - momentum: 0.000000
2023-10-13 13:05:33,290 epoch 2 - iter 623/894 - loss 0.16223494 - time (sec): 28.58 - samples/sec: 2070.65 - lr: 0.000046 - momentum: 0.000000
2023-10-13 13:05:37,456 epoch 2 - iter 712/894 - loss 0.16356294 - time (sec): 32.75 - samples/sec: 2074.67 - lr: 0.000046 - momentum: 0.000000
2023-10-13 13:05:41,548 epoch 2 - iter 801/894 - loss 0.16281990 - time (sec): 36.84 - samples/sec: 2104.60 - lr: 0.000045 - momentum: 0.000000
2023-10-13 13:05:45,629 epoch 2 - iter 890/894 - loss 0.16303453 - time (sec): 40.92 - samples/sec: 2107.04 - lr: 0.000044 - momentum: 0.000000
2023-10-13 13:05:45,808 ----------------------------------------------------------------------------------------------------
2023-10-13 13:05:45,808 EPOCH 2 done: loss 0.1631 - lr: 0.000044
2023-10-13 13:05:54,534 DEV : loss 0.137100949883461 - f1-score (micro avg) 0.687
2023-10-13 13:05:54,562 saving best model
2023-10-13 13:05:55,032 ----------------------------------------------------------------------------------------------------
2023-10-13 13:05:59,180 epoch 3 - iter 89/894 - loss 0.09699773 - time (sec): 4.15 - samples/sec: 1976.35 - lr: 0.000044 - momentum: 0.000000
2023-10-13 13:06:03,141 epoch 3 - iter 178/894 - loss 0.09307940 - time (sec): 8.11 - samples/sec: 1983.41 - lr: 0.000043 - momentum: 0.000000
2023-10-13 13:06:07,398 epoch 3 - iter 267/894 - loss 0.09537427 - time (sec): 12.36 - samples/sec: 2024.87 - lr: 0.000043 - momentum: 0.000000
2023-10-13 13:06:11,519 epoch 3 - iter 356/894 - loss 0.10190669 - time (sec): 16.49 - samples/sec: 2029.87 - lr: 0.000042 - momentum: 0.000000
2023-10-13 13:06:15,669 epoch 3 - iter 445/894 - loss 0.10548613 - time (sec): 20.64 - samples/sec: 2012.81 - lr: 0.000042 - momentum: 0.000000
2023-10-13 13:06:19,846 epoch 3 - iter 534/894 - loss 0.09892079 - time (sec): 24.81 - samples/sec: 2028.38 - lr: 0.000041 - momentum: 0.000000
2023-10-13 13:06:24,075 epoch 3 - iter 623/894 - loss 0.09652969 - time (sec): 29.04 - samples/sec: 2022.93 - lr: 0.000041 - momentum: 0.000000
2023-10-13 13:06:28,473 epoch 3 - iter 712/894 - loss 0.09893057 - time (sec): 33.44 - samples/sec: 2013.66 - lr: 0.000040 - momentum: 0.000000
2023-10-13 13:06:32,779 epoch 3 - iter 801/894 - loss 0.09689469 - time (sec): 37.74 - samples/sec: 2022.65 - lr: 0.000039 - momentum: 0.000000
2023-10-13 13:06:37,176 epoch 3 - iter 890/894 - loss 0.09758921 - time (sec): 42.14 - samples/sec: 2044.61 - lr: 0.000039 - momentum: 0.000000
2023-10-13 13:06:37,358 ----------------------------------------------------------------------------------------------------
2023-10-13 13:06:37,359 EPOCH 3 done: loss 0.0973 - lr: 0.000039
2023-10-13 13:06:46,038 DEV : loss 0.1663622260093689 - f1-score (micro avg) 0.7477
2023-10-13 13:06:46,074 saving best model
2023-10-13 13:06:46,554 ----------------------------------------------------------------------------------------------------
2023-10-13 13:06:50,999 epoch 4 - iter 89/894 - loss 0.06861891 - time (sec): 4.44 - samples/sec: 1871.87 - lr: 0.000038 - momentum: 0.000000
2023-10-13 13:06:55,159 epoch 4 - iter 178/894 - loss 0.06191697 - time (sec): 8.60 - samples/sec: 1944.64 - lr: 0.000038 - momentum: 0.000000
2023-10-13 13:06:59,111 epoch 4 - iter 267/894 - loss 0.05888640 - time (sec): 12.55 - samples/sec: 1998.40 - lr: 0.000037 - momentum: 0.000000
2023-10-13 13:07:03,348 epoch 4 - iter 356/894 - loss 0.06116699 - time (sec): 16.79 - samples/sec: 2100.08 - lr: 0.000037 - momentum: 0.000000
2023-10-13 13:07:07,307 epoch 4 - iter 445/894 - loss 0.05715439 - time (sec): 20.75 - samples/sec: 2097.09 - lr: 0.000036 - momentum: 0.000000
2023-10-13 13:07:11,513 epoch 4 - iter 534/894 - loss 0.06420373 - time (sec): 24.95 - samples/sec: 2090.38 - lr: 0.000036 - momentum: 0.000000
2023-10-13 13:07:15,549 epoch 4 - iter 623/894 - loss 0.06272790 - time (sec): 28.99 - samples/sec: 2100.24 - lr: 0.000035 - momentum: 0.000000
2023-10-13 13:07:19,635 epoch 4 - iter 712/894 - loss 0.06336972 - time (sec): 33.08 - samples/sec: 2091.46 - lr: 0.000034 - momentum: 0.000000
2023-10-13 13:07:23,854 epoch 4 - iter 801/894 - loss 0.06486021 - time (sec): 37.30 - samples/sec: 2100.20 - lr: 0.000034 - momentum: 0.000000
2023-10-13 13:07:27,851 epoch 4 - iter 890/894 - loss 0.06614027 - time (sec): 41.29 - samples/sec: 2088.79 - lr: 0.000033 - momentum: 0.000000
2023-10-13 13:07:28,035 ----------------------------------------------------------------------------------------------------
2023-10-13 13:07:28,035 EPOCH 4 done: loss 0.0661 - lr: 0.000033
2023-10-13 13:07:36,744 DEV : loss 0.20190493762493134 - f1-score (micro avg) 0.7572
2023-10-13 13:07:36,774 saving best model
2023-10-13 13:07:37,188 ----------------------------------------------------------------------------------------------------
2023-10-13 13:07:41,561 epoch 5 - iter 89/894 - loss 0.04288380 - time (sec): 4.37 - samples/sec: 1956.39 - lr: 0.000033 - momentum: 0.000000
2023-10-13 13:07:45,694 epoch 5 - iter 178/894 - loss 0.03764053 - time (sec): 8.50 - samples/sec: 1968.92 - lr: 0.000032 - momentum: 0.000000
2023-10-13 13:07:49,851 epoch 5 - iter 267/894 - loss 0.03485234 - time (sec): 12.66 - samples/sec: 1998.91 - lr: 0.000032 - momentum: 0.000000
2023-10-13 13:07:53,898 epoch 5 - iter 356/894 - loss 0.03891244 - time (sec): 16.71 - samples/sec: 1992.95 - lr: 0.000031 - momentum: 0.000000
2023-10-13 13:07:58,344 epoch 5 - iter 445/894 - loss 0.04283138 - time (sec): 21.15 - samples/sec: 2013.45 - lr: 0.000031 - momentum: 0.000000
2023-10-13 13:08:02,706 epoch 5 - iter 534/894 - loss 0.04395209 - time (sec): 25.51 - samples/sec: 1992.27 - lr: 0.000030 - momentum: 0.000000
2023-10-13 13:08:07,000 epoch 5 - iter 623/894 - loss 0.04187519 - time (sec): 29.81 - samples/sec: 1974.53 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:08:11,451 epoch 5 - iter 712/894 - loss 0.04371575 - time (sec): 34.26 - samples/sec: 2013.92 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:08:15,665 epoch 5 - iter 801/894 - loss 0.04232226 - time (sec): 38.47 - samples/sec: 2029.58 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:08:19,803 epoch 5 - iter 890/894 - loss 0.04227247 - time (sec): 42.61 - samples/sec: 2024.39 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:08:19,982 ----------------------------------------------------------------------------------------------------
2023-10-13 13:08:19,982 EPOCH 5 done: loss 0.0421 - lr: 0.000028
2023-10-13 13:08:28,483 DEV : loss 0.2170535773038864 - f1-score (micro avg) 0.7413
2023-10-13 13:08:28,510 ----------------------------------------------------------------------------------------------------
2023-10-13 13:08:33,040 epoch 6 - iter 89/894 - loss 0.03105688 - time (sec): 4.53 - samples/sec: 2143.95 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:08:37,152 epoch 6 - iter 178/894 - loss 0.02598800 - time (sec): 8.64 - samples/sec: 2086.86 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:08:41,573 epoch 6 - iter 267/894 - loss 0.02511682 - time (sec): 13.06 - samples/sec: 2059.81 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:08:45,626 epoch 6 - iter 356/894 - loss 0.02507832 - time (sec): 17.11 - samples/sec: 2094.96 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:08:49,625 epoch 6 - iter 445/894 - loss 0.02412952 - time (sec): 21.11 - samples/sec: 2120.88 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:08:54,136 epoch 6 - iter 534/894 - loss 0.02552282 - time (sec): 25.63 - samples/sec: 2060.82 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:08:58,762 epoch 6 - iter 623/894 - loss 0.02567017 - time (sec): 30.25 - samples/sec: 2008.90 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:09:03,242 epoch 6 - iter 712/894 - loss 0.02907428 - time (sec): 34.73 - samples/sec: 1982.97 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:09:07,685 epoch 6 - iter 801/894 - loss 0.03070030 - time (sec): 39.17 - samples/sec: 1983.15 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:09:12,113 epoch 6 - iter 890/894 - loss 0.03031654 - time (sec): 43.60 - samples/sec: 1979.02 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:09:12,288 ----------------------------------------------------------------------------------------------------
2023-10-13 13:09:12,289 EPOCH 6 done: loss 0.0306 - lr: 0.000022
2023-10-13 13:09:20,941 DEV : loss 0.21267157793045044 - f1-score (micro avg) 0.7541
2023-10-13 13:09:20,968 ----------------------------------------------------------------------------------------------------
2023-10-13 13:09:25,137 epoch 7 - iter 89/894 - loss 0.02116415 - time (sec): 4.17 - samples/sec: 2238.75 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:09:29,616 epoch 7 - iter 178/894 - loss 0.02415449 - time (sec): 8.65 - samples/sec: 2209.79 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:09:34,148 epoch 7 - iter 267/894 - loss 0.02137786 - time (sec): 13.18 - samples/sec: 2121.64 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:09:38,226 epoch 7 - iter 356/894 - loss 0.02169246 - time (sec): 17.26 - samples/sec: 2162.02 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:09:42,374 epoch 7 - iter 445/894 - loss 0.01995436 - time (sec): 21.40 - samples/sec: 2150.81 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:09:46,490 epoch 7 - iter 534/894 - loss 0.01839450 - time (sec): 25.52 - samples/sec: 2114.02 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:09:50,578 epoch 7 - iter 623/894 - loss 0.01849719 - time (sec): 29.61 - samples/sec: 2079.51 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:09:54,790 epoch 7 - iter 712/894 - loss 0.01855825 - time (sec): 33.82 - samples/sec: 2069.57 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:09:58,936 epoch 7 - iter 801/894 - loss 0.01877532 - time (sec): 37.97 - samples/sec: 2055.33 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:10:02,992 epoch 7 - iter 890/894 - loss 0.02001437 - time (sec): 42.02 - samples/sec: 2049.29 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:10:03,172 ----------------------------------------------------------------------------------------------------
2023-10-13 13:10:03,172 EPOCH 7 done: loss 0.0199 - lr: 0.000017
2023-10-13 13:10:11,819 DEV : loss 0.24245555698871613 - f1-score (micro avg) 0.7744
2023-10-13 13:10:11,846 saving best model
2023-10-13 13:10:12,251 ----------------------------------------------------------------------------------------------------
2023-10-13 13:10:16,359 epoch 8 - iter 89/894 - loss 0.01603205 - time (sec): 4.10 - samples/sec: 2129.71 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:10:20,346 epoch 8 - iter 178/894 - loss 0.01420650 - time (sec): 8.09 - samples/sec: 2090.79 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:10:24,729 epoch 8 - iter 267/894 - loss 0.01506146 - time (sec): 12.47 - samples/sec: 2168.47 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:10:28,889 epoch 8 - iter 356/894 - loss 0.01227759 - time (sec): 16.63 - samples/sec: 2134.89 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:10:33,378 epoch 8 - iter 445/894 - loss 0.01221774 - time (sec): 21.12 - samples/sec: 2085.76 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:10:37,636 epoch 8 - iter 534/894 - loss 0.01151621 - time (sec): 25.38 - samples/sec: 2087.00 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:10:41,814 epoch 8 - iter 623/894 - loss 0.01137032 - time (sec): 29.56 - samples/sec: 2077.04 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:10:45,868 epoch 8 - iter 712/894 - loss 0.01073686 - time (sec): 33.61 - samples/sec: 2072.07 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:10:50,055 epoch 8 - iter 801/894 - loss 0.01085969 - time (sec): 37.80 - samples/sec: 2076.10 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:10:54,024 epoch 8 - iter 890/894 - loss 0.01108987 - time (sec): 41.77 - samples/sec: 2065.03 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:10:54,198 ----------------------------------------------------------------------------------------------------
2023-10-13 13:10:54,198 EPOCH 8 done: loss 0.0112 - lr: 0.000011
2023-10-13 13:11:02,846 DEV : loss 0.233389750123024 - f1-score (micro avg) 0.7832
2023-10-13 13:11:02,873 saving best model
2023-10-13 13:11:03,298 ----------------------------------------------------------------------------------------------------
2023-10-13 13:11:07,662 epoch 9 - iter 89/894 - loss 0.00502968 - time (sec): 4.36 - samples/sec: 1888.24 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:11:12,220 epoch 9 - iter 178/894 - loss 0.00345860 - time (sec): 8.92 - samples/sec: 1986.19 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:11:16,359 epoch 9 - iter 267/894 - loss 0.00469124 - time (sec): 13.06 - samples/sec: 2013.22 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:11:20,410 epoch 9 - iter 356/894 - loss 0.00482172 - time (sec): 17.11 - samples/sec: 2031.12 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:11:24,331 epoch 9 - iter 445/894 - loss 0.00561556 - time (sec): 21.03 - samples/sec: 2057.73 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:11:28,315 epoch 9 - iter 534/894 - loss 0.00515886 - time (sec): 25.01 - samples/sec: 2060.32 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:11:32,522 epoch 9 - iter 623/894 - loss 0.00534634 - time (sec): 29.22 - samples/sec: 2040.68 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:11:36,786 epoch 9 - iter 712/894 - loss 0.00559174 - time (sec): 33.49 - samples/sec: 2039.71 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:11:41,058 epoch 9 - iter 801/894 - loss 0.00552963 - time (sec): 37.76 - samples/sec: 2064.38 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:11:45,102 epoch 9 - iter 890/894 - loss 0.00582293 - time (sec): 41.80 - samples/sec: 2061.36 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:11:45,305 ----------------------------------------------------------------------------------------------------
2023-10-13 13:11:45,306 EPOCH 9 done: loss 0.0061 - lr: 0.000006
2023-10-13 13:11:54,109 DEV : loss 0.2255241870880127 - f1-score (micro avg) 0.779
2023-10-13 13:11:54,139 ----------------------------------------------------------------------------------------------------
2023-10-13 13:11:58,538 epoch 10 - iter 89/894 - loss 0.00368147 - time (sec): 4.40 - samples/sec: 2080.65 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:12:02,657 epoch 10 - iter 178/894 - loss 0.00436238 - time (sec): 8.52 - samples/sec: 2048.21 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:12:06,746 epoch 10 - iter 267/894 - loss 0.00379776 - time (sec): 12.61 - samples/sec: 2081.50 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:12:10,863 epoch 10 - iter 356/894 - loss 0.00400577 - time (sec): 16.72 - samples/sec: 2090.41 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:12:15,238 epoch 10 - iter 445/894 - loss 0.00371452 - time (sec): 21.10 - samples/sec: 2040.31 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:12:19,338 epoch 10 - iter 534/894 - loss 0.00453871 - time (sec): 25.20 - samples/sec: 2035.60 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:12:23,456 epoch 10 - iter 623/894 - loss 0.00446673 - time (sec): 29.32 - samples/sec: 2057.39 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:12:27,853 epoch 10 - iter 712/894 - loss 0.00466561 - time (sec): 33.71 - samples/sec: 2088.01 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:12:31,952 epoch 10 - iter 801/894 - loss 0.00442535 - time (sec): 37.81 - samples/sec: 2069.29 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:12:36,079 epoch 10 - iter 890/894 - loss 0.00430193 - time (sec): 41.94 - samples/sec: 2054.07 - lr: 0.000000 - momentum: 0.000000
2023-10-13 13:12:36,255 ----------------------------------------------------------------------------------------------------
2023-10-13 13:12:36,256 EPOCH 10 done: loss 0.0043 - lr: 0.000000
2023-10-13 13:12:44,962 DEV : loss 0.23823896050453186 - f1-score (micro avg) 0.7834
2023-10-13 13:12:44,989 saving best model
2023-10-13 13:12:45,734 ----------------------------------------------------------------------------------------------------
2023-10-13 13:12:45,736 Loading model from best epoch ...
2023-10-13 13:12:47,180 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-13 13:12:51,907
Results:
- F-score (micro) 0.7438
- F-score (macro) 0.6448
- Accuracy 0.6112
By class:
              precision    recall  f1-score   support

         loc     0.8333    0.8473    0.8403       596
        pers     0.6614    0.7568    0.7059       333
         org     0.6118    0.3939    0.4793       132
        prod     0.6222    0.4242    0.5045        66
        time     0.6939    0.6939    0.6939        49

   micro avg     0.7470    0.7406    0.7438      1176
   macro avg     0.6845    0.6232    0.6448      1176
weighted avg     0.7421    0.7406    0.7367      1176
2023-10-13 13:12:51,908 ----------------------------------------------------------------------------------------------------