2023-10-13 12:32:19,278 ----------------------------------------------------------------------------------------------------
2023-10-13 12:32:19,279 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 12:32:19,279 ----------------------------------------------------------------------------------------------------
2023-10-13 12:32:19,279 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-13 12:32:19,279 ----------------------------------------------------------------------------------------------------
2023-10-13 12:32:19,279 Train: 3575 sentences
2023-10-13 12:32:19,279 (train_with_dev=False, train_with_test=False)
2023-10-13 12:32:19,279 ----------------------------------------------------------------------------------------------------
2023-10-13 12:32:19,279 Training Params:
2023-10-13 12:32:19,279 - learning_rate: "5e-05"
2023-10-13 12:32:19,279 - mini_batch_size: "4"
2023-10-13 12:32:19,279 - max_epochs: "10"
2023-10-13 12:32:19,279 - shuffle: "True"
2023-10-13 12:32:19,279 ----------------------------------------------------------------------------------------------------
2023-10-13 12:32:19,279 Plugins:
2023-10-13 12:32:19,280 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 12:32:19,280 ----------------------------------------------------------------------------------------------------
2023-10-13 12:32:19,280 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 12:32:19,280 - metric: "('micro avg', 'f1-score')"
2023-10-13 12:32:19,280 ----------------------------------------------------------------------------------------------------
2023-10-13 12:32:19,280 Computation:
2023-10-13 12:32:19,280 - compute on device: cuda:0
2023-10-13 12:32:19,280 - embedding storage: none
2023-10-13 12:32:19,280 ----------------------------------------------------------------------------------------------------
2023-10-13 12:32:19,280 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 12:32:19,280 ----------------------------------------------------------------------------------------------------
2023-10-13 12:32:19,280 ----------------------------------------------------------------------------------------------------
2023-10-13 12:32:23,751 epoch 1 - iter 89/894 - loss 2.85983461 - time (sec): 4.47 - samples/sec: 1792.74 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:32:28,441 epoch 1 - iter 178/894 - loss 1.72346897 - time (sec): 9.16 - samples/sec: 1753.15 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:32:33,078 epoch 1 - iter 267/894 - loss 1.24732200 - time (sec): 13.80 - samples/sec: 1807.21 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:32:37,557 epoch 1 - iter 356/894 - loss 1.03004553 - time (sec): 18.28 - samples/sec: 1807.56 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:32:42,240 epoch 1 - iter 445/894 - loss 0.87776144 - time (sec): 22.96 - samples/sec: 1824.47 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:32:47,145 epoch 1 - iter 534/894 - loss 0.76230451 - time (sec): 27.86 - samples/sec: 1873.14 - lr: 0.000030 - momentum: 0.000000
2023-10-13 12:32:52,025 epoch 1 - iter 623/894 - loss 0.69436195 - time (sec): 32.74 - samples/sec: 1846.87 - lr: 0.000035 - momentum: 0.000000
2023-10-13 12:32:56,787 epoch 1 - iter 712/894 - loss 0.63775811 - time (sec): 37.51 - samples/sec: 1846.05 - lr: 0.000040 - momentum: 0.000000
2023-10-13 12:33:01,489 epoch 1 - iter 801/894 - loss 0.59768895 - time (sec): 42.21 - samples/sec: 1829.53 - lr: 0.000045 - momentum: 0.000000
2023-10-13 12:33:06,315 epoch 1 - iter 890/894 - loss 0.55580460 - time (sec): 47.03 - samples/sec: 1829.34 - lr: 0.000050 - momentum: 0.000000
2023-10-13 12:33:06,492 ----------------------------------------------------------------------------------------------------
2023-10-13 12:33:06,493 EPOCH 1 done: loss 0.5533 - lr: 0.000050
2023-10-13 12:33:11,352 DEV : loss 0.18206389248371124 - f1-score (micro avg) 0.6264
2023-10-13 12:33:11,379 saving best model
2023-10-13 12:33:11,748 ----------------------------------------------------------------------------------------------------
2023-10-13 12:33:15,811 epoch 2 - iter 89/894 - loss 0.20901231 - time (sec): 4.06 - samples/sec: 2117.99 - lr: 0.000049 - momentum: 0.000000
2023-10-13 12:33:19,898 epoch 2 - iter 178/894 - loss 0.19760510 - time (sec): 8.15 - samples/sec: 2090.27 - lr: 0.000049 - momentum: 0.000000
2023-10-13 12:33:23,982 epoch 2 - iter 267/894 - loss 0.19189895 - time (sec): 12.23 - samples/sec: 2062.14 - lr: 0.000048 - momentum: 0.000000
2023-10-13 12:33:28,119 epoch 2 - iter 356/894 - loss 0.18761752 - time (sec): 16.37 - samples/sec: 2080.74 - lr: 0.000048 - momentum: 0.000000
2023-10-13 12:33:32,271 epoch 2 - iter 445/894 - loss 0.18116435 - time (sec): 20.52 - samples/sec: 2055.03 - lr: 0.000047 - momentum: 0.000000
2023-10-13 12:33:36,461 epoch 2 - iter 534/894 - loss 0.17830011 - time (sec): 24.71 - samples/sec: 2083.01 - lr: 0.000047 - momentum: 0.000000
2023-10-13 12:33:40,787 epoch 2 - iter 623/894 - loss 0.17220591 - time (sec): 29.04 - samples/sec: 2059.70 - lr: 0.000046 - momentum: 0.000000
2023-10-13 12:33:45,142 epoch 2 - iter 712/894 - loss 0.16726249 - time (sec): 33.39 - samples/sec: 2070.36 - lr: 0.000046 - momentum: 0.000000
2023-10-13 12:33:49,272 epoch 2 - iter 801/894 - loss 0.16456063 - time (sec): 37.52 - samples/sec: 2075.68 - lr: 0.000045 - momentum: 0.000000
2023-10-13 12:33:53,170 epoch 2 - iter 890/894 - loss 0.16515628 - time (sec): 41.42 - samples/sec: 2079.86 - lr: 0.000044 - momentum: 0.000000
2023-10-13 12:33:53,343 ----------------------------------------------------------------------------------------------------
2023-10-13 12:33:53,343 EPOCH 2 done: loss 0.1651 - lr: 0.000044
2023-10-13 12:34:02,186 DEV : loss 0.14844626188278198 - f1-score (micro avg) 0.6903
2023-10-13 12:34:02,216 saving best model
2023-10-13 12:34:02,669 ----------------------------------------------------------------------------------------------------
2023-10-13 12:34:07,048 epoch 3 - iter 89/894 - loss 0.08584339 - time (sec): 4.38 - samples/sec: 1971.04 - lr: 0.000044 - momentum: 0.000000
2023-10-13 12:34:11,408 epoch 3 - iter 178/894 - loss 0.08458851 - time (sec): 8.74 - samples/sec: 2090.91 - lr: 0.000043 - momentum: 0.000000
2023-10-13 12:34:15,589 epoch 3 - iter 267/894 - loss 0.08982643 - time (sec): 12.92 - samples/sec: 2129.73 - lr: 0.000043 - momentum: 0.000000
2023-10-13 12:34:19,889 epoch 3 - iter 356/894 - loss 0.08300527 - time (sec): 17.22 - samples/sec: 2118.70 - lr: 0.000042 - momentum: 0.000000
2023-10-13 12:34:24,349 epoch 3 - iter 445/894 - loss 0.09461899 - time (sec): 21.68 - samples/sec: 2097.72 - lr: 0.000042 - momentum: 0.000000
2023-10-13 12:34:28,556 epoch 3 - iter 534/894 - loss 0.09779057 - time (sec): 25.89 - samples/sec: 2053.48 - lr: 0.000041 - momentum: 0.000000
2023-10-13 12:34:32,893 epoch 3 - iter 623/894 - loss 0.09502757 - time (sec): 30.22 - samples/sec: 2036.10 - lr: 0.000041 - momentum: 0.000000
2023-10-13 12:34:37,045 epoch 3 - iter 712/894 - loss 0.09421525 - time (sec): 34.37 - samples/sec: 2022.03 - lr: 0.000040 - momentum: 0.000000
2023-10-13 12:34:41,215 epoch 3 - iter 801/894 - loss 0.09728203 - time (sec): 38.54 - samples/sec: 2020.67 - lr: 0.000039 - momentum: 0.000000
2023-10-13 12:34:45,220 epoch 3 - iter 890/894 - loss 0.09737888 - time (sec): 42.55 - samples/sec: 2024.31 - lr: 0.000039 - momentum: 0.000000
2023-10-13 12:34:45,395 ----------------------------------------------------------------------------------------------------
2023-10-13 12:34:45,395 EPOCH 3 done: loss 0.0970 - lr: 0.000039
2023-10-13 12:34:54,092 DEV : loss 0.21671560406684875 - f1-score (micro avg) 0.7401
2023-10-13 12:34:54,121 saving best model
2023-10-13 12:34:54,564 ----------------------------------------------------------------------------------------------------
2023-10-13 12:34:58,507 epoch 4 - iter 89/894 - loss 0.06239486 - time (sec): 3.94 - samples/sec: 1934.14 - lr: 0.000038 - momentum: 0.000000
2023-10-13 12:35:02,740 epoch 4 - iter 178/894 - loss 0.05659001 - time (sec): 8.17 - samples/sec: 2076.86 - lr: 0.000038 - momentum: 0.000000
2023-10-13 12:35:06,896 epoch 4 - iter 267/894 - loss 0.07314064 - time (sec): 12.33 - samples/sec: 2052.34 - lr: 0.000037 - momentum: 0.000000
2023-10-13 12:35:11,114 epoch 4 - iter 356/894 - loss 0.06967169 - time (sec): 16.55 - samples/sec: 2051.60 - lr: 0.000037 - momentum: 0.000000
2023-10-13 12:35:15,115 epoch 4 - iter 445/894 - loss 0.06814429 - time (sec): 20.55 - samples/sec: 2023.19 - lr: 0.000036 - momentum: 0.000000
2023-10-13 12:35:19,600 epoch 4 - iter 534/894 - loss 0.06312653 - time (sec): 25.04 - samples/sec: 2074.53 - lr: 0.000036 - momentum: 0.000000
2023-10-13 12:35:23,863 epoch 4 - iter 623/894 - loss 0.06341301 - time (sec): 29.30 - samples/sec: 2064.67 - lr: 0.000035 - momentum: 0.000000
2023-10-13 12:35:27,776 epoch 4 - iter 712/894 - loss 0.06424281 - time (sec): 33.21 - samples/sec: 2065.66 - lr: 0.000034 - momentum: 0.000000
2023-10-13 12:35:31,743 epoch 4 - iter 801/894 - loss 0.06543736 - time (sec): 37.18 - samples/sec: 2092.22 - lr: 0.000034 - momentum: 0.000000
2023-10-13 12:35:35,761 epoch 4 - iter 890/894 - loss 0.06405722 - time (sec): 41.20 - samples/sec: 2093.77 - lr: 0.000033 - momentum: 0.000000
2023-10-13 12:35:35,945 ----------------------------------------------------------------------------------------------------
2023-10-13 12:35:35,945 EPOCH 4 done: loss 0.0639 - lr: 0.000033
2023-10-13 12:35:44,595 DEV : loss 0.1947321891784668 - f1-score (micro avg) 0.7379
2023-10-13 12:35:44,623 ----------------------------------------------------------------------------------------------------
2023-10-13 12:35:48,550 epoch 5 - iter 89/894 - loss 0.05096803 - time (sec): 3.93 - samples/sec: 2072.36 - lr: 0.000033 - momentum: 0.000000
2023-10-13 12:35:52,426 epoch 5 - iter 178/894 - loss 0.04532651 - time (sec): 7.80 - samples/sec: 2056.55 - lr: 0.000032 - momentum: 0.000000
2023-10-13 12:35:56,464 epoch 5 - iter 267/894 - loss 0.04436703 - time (sec): 11.84 - samples/sec: 2090.11 - lr: 0.000032 - momentum: 0.000000
2023-10-13 12:36:00,970 epoch 5 - iter 356/894 - loss 0.04213125 - time (sec): 16.35 - samples/sec: 2086.23 - lr: 0.000031 - momentum: 0.000000
2023-10-13 12:36:05,425 epoch 5 - iter 445/894 - loss 0.04195800 - time (sec): 20.80 - samples/sec: 2067.23 - lr: 0.000031 - momentum: 0.000000
2023-10-13 12:36:09,548 epoch 5 - iter 534/894 - loss 0.04254711 - time (sec): 24.92 - samples/sec: 2062.82 - lr: 0.000030 - momentum: 0.000000
2023-10-13 12:36:13,943 epoch 5 - iter 623/894 - loss 0.04283399 - time (sec): 29.32 - samples/sec: 2062.46 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:36:18,296 epoch 5 - iter 712/894 - loss 0.04335760 - time (sec): 33.67 - samples/sec: 2073.39 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:36:22,427 epoch 5 - iter 801/894 - loss 0.04107727 - time (sec): 37.80 - samples/sec: 2074.95 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:36:26,403 epoch 5 - iter 890/894 - loss 0.04070854 - time (sec): 41.78 - samples/sec: 2063.32 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:36:26,582 ----------------------------------------------------------------------------------------------------
2023-10-13 12:36:26,582 EPOCH 5 done: loss 0.0406 - lr: 0.000028
2023-10-13 12:36:35,756 DEV : loss 0.2318100780248642 - f1-score (micro avg) 0.7612
2023-10-13 12:36:35,787 saving best model
2023-10-13 12:36:36,173 ----------------------------------------------------------------------------------------------------
2023-10-13 12:36:40,528 epoch 6 - iter 89/894 - loss 0.02285241 - time (sec): 4.35 - samples/sec: 1993.67 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:36:44,740 epoch 6 - iter 178/894 - loss 0.02994531 - time (sec): 8.57 - samples/sec: 1977.19 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:36:48,963 epoch 6 - iter 267/894 - loss 0.03140204 - time (sec): 12.79 - samples/sec: 1961.50 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:36:53,037 epoch 6 - iter 356/894 - loss 0.03042671 - time (sec): 16.86 - samples/sec: 2000.69 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:36:57,485 epoch 6 - iter 445/894 - loss 0.03182242 - time (sec): 21.31 - samples/sec: 1970.17 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:37:01,561 epoch 6 - iter 534/894 - loss 0.03205269 - time (sec): 25.39 - samples/sec: 1986.79 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:37:05,724 epoch 6 - iter 623/894 - loss 0.03194324 - time (sec): 29.55 - samples/sec: 1982.14 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:37:10,061 epoch 6 - iter 712/894 - loss 0.03214632 - time (sec): 33.89 - samples/sec: 1976.66 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:37:14,206 epoch 6 - iter 801/894 - loss 0.03639544 - time (sec): 38.03 - samples/sec: 1997.06 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:37:18,693 epoch 6 - iter 890/894 - loss 0.03436798 - time (sec): 42.52 - samples/sec: 2022.52 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:37:18,891 ----------------------------------------------------------------------------------------------------
2023-10-13 12:37:18,891 EPOCH 6 done: loss 0.0343 - lr: 0.000022
2023-10-13 12:37:27,725 DEV : loss 0.2459443360567093 - f1-score (micro avg) 0.7664
2023-10-13 12:37:27,753 saving best model
2023-10-13 12:37:28,192 ----------------------------------------------------------------------------------------------------
2023-10-13 12:37:32,358 epoch 7 - iter 89/894 - loss 0.02901541 - time (sec): 4.16 - samples/sec: 2086.25 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:37:37,075 epoch 7 - iter 178/894 - loss 0.02057225 - time (sec): 8.88 - samples/sec: 1942.97 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:37:41,718 epoch 7 - iter 267/894 - loss 0.02355731 - time (sec): 13.52 - samples/sec: 1944.79 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:37:45,887 epoch 7 - iter 356/894 - loss 0.02226925 - time (sec): 17.69 - samples/sec: 1983.66 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:37:49,779 epoch 7 - iter 445/894 - loss 0.02176302 - time (sec): 21.58 - samples/sec: 2001.58 - lr: 0.000019 - momentum: 0.000000
2023-10-13 12:37:54,061 epoch 7 - iter 534/894 - loss 0.02088338 - time (sec): 25.87 - samples/sec: 2002.74 - lr: 0.000019 - momentum: 0.000000
2023-10-13 12:37:58,011 epoch 7 - iter 623/894 - loss 0.02136378 - time (sec): 29.82 - samples/sec: 2014.53 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:38:01,910 epoch 7 - iter 712/894 - loss 0.02093973 - time (sec): 33.72 - samples/sec: 2030.45 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:38:05,780 epoch 7 - iter 801/894 - loss 0.02131084 - time (sec): 37.59 - samples/sec: 2027.97 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:38:10,100 epoch 7 - iter 890/894 - loss 0.02072021 - time (sec): 41.91 - samples/sec: 2053.61 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:38:10,297 ----------------------------------------------------------------------------------------------------
2023-10-13 12:38:10,297 EPOCH 7 done: loss 0.0206 - lr: 0.000017
2023-10-13 12:38:19,433 DEV : loss 0.23593628406524658 - f1-score (micro avg) 0.7678
2023-10-13 12:38:19,472 saving best model
2023-10-13 12:38:20,009 ----------------------------------------------------------------------------------------------------
2023-10-13 12:38:24,729 epoch 8 - iter 89/894 - loss 0.01361398 - time (sec): 4.72 - samples/sec: 1779.60 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:38:29,248 epoch 8 - iter 178/894 - loss 0.01742511 - time (sec): 9.24 - samples/sec: 1957.94 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:38:33,322 epoch 8 - iter 267/894 - loss 0.01587200 - time (sec): 13.31 - samples/sec: 1983.09 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:38:37,457 epoch 8 - iter 356/894 - loss 0.01499465 - time (sec): 17.45 - samples/sec: 2007.73 - lr: 0.000014 - momentum: 0.000000
2023-10-13 12:38:41,570 epoch 8 - iter 445/894 - loss 0.01522297 - time (sec): 21.56 - samples/sec: 1985.39 - lr: 0.000014 - momentum: 0.000000
2023-10-13 12:38:45,637 epoch 8 - iter 534/894 - loss 0.01336557 - time (sec): 25.63 - samples/sec: 2019.99 - lr: 0.000013 - momentum: 0.000000
2023-10-13 12:38:49,596 epoch 8 - iter 623/894 - loss 0.01367776 - time (sec): 29.58 - samples/sec: 2048.84 - lr: 0.000013 - momentum: 0.000000
2023-10-13 12:38:53,562 epoch 8 - iter 712/894 - loss 0.01386286 - time (sec): 33.55 - samples/sec: 2053.20 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:38:57,493 epoch 8 - iter 801/894 - loss 0.01362775 - time (sec): 37.48 - samples/sec: 2067.50 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:39:01,513 epoch 8 - iter 890/894 - loss 0.01332056 - time (sec): 41.50 - samples/sec: 2077.00 - lr: 0.000011 - momentum: 0.000000
2023-10-13 12:39:01,685 ----------------------------------------------------------------------------------------------------
2023-10-13 12:39:01,685 EPOCH 8 done: loss 0.0133 - lr: 0.000011
2023-10-13 12:39:10,171 DEV : loss 0.2714207172393799 - f1-score (micro avg) 0.7702
2023-10-13 12:39:10,205 saving best model
2023-10-13 12:39:10,686 ----------------------------------------------------------------------------------------------------
2023-10-13 12:39:15,078 epoch 9 - iter 89/894 - loss 0.00547130 - time (sec): 4.39 - samples/sec: 1971.39 - lr: 0.000011 - momentum: 0.000000
2023-10-13 12:39:19,244 epoch 9 - iter 178/894 - loss 0.00963629 - time (sec): 8.56 - samples/sec: 1993.56 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:39:23,663 epoch 9 - iter 267/894 - loss 0.01066901 - time (sec): 12.97 - samples/sec: 1953.09 - lr: 0.000009 - momentum: 0.000000
2023-10-13 12:39:27,955 epoch 9 - iter 356/894 - loss 0.01008720 - time (sec): 17.27 - samples/sec: 1972.23 - lr: 0.000009 - momentum: 0.000000
2023-10-13 12:39:32,716 epoch 9 - iter 445/894 - loss 0.00899087 - time (sec): 22.03 - samples/sec: 1989.31 - lr: 0.000008 - momentum: 0.000000
2023-10-13 12:39:37,235 epoch 9 - iter 534/894 - loss 0.00832282 - time (sec): 26.55 - samples/sec: 1967.26 - lr: 0.000008 - momentum: 0.000000
2023-10-13 12:39:41,682 epoch 9 - iter 623/894 - loss 0.00840774 - time (sec): 30.99 - samples/sec: 1954.70 - lr: 0.000007 - momentum: 0.000000
2023-10-13 12:39:46,042 epoch 9 - iter 712/894 - loss 0.00907033 - time (sec): 35.35 - samples/sec: 1965.05 - lr: 0.000007 - momentum: 0.000000
2023-10-13 12:39:50,191 epoch 9 - iter 801/894 - loss 0.00943423 - time (sec): 39.50 - samples/sec: 1962.69 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:39:54,425 epoch 9 - iter 890/894 - loss 0.00956840 - time (sec): 43.74 - samples/sec: 1970.62 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:39:54,626 ----------------------------------------------------------------------------------------------------
2023-10-13 12:39:54,627 EPOCH 9 done: loss 0.0098 - lr: 0.000006
2023-10-13 12:40:03,411 DEV : loss 0.25177356600761414 - f1-score (micro avg) 0.7839
2023-10-13 12:40:03,439 saving best model
2023-10-13 12:40:03,879 ----------------------------------------------------------------------------------------------------
2023-10-13 12:40:08,201 epoch 10 - iter 89/894 - loss 0.00028415 - time (sec): 4.32 - samples/sec: 2139.25 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:40:12,369 epoch 10 - iter 178/894 - loss 0.00242940 - time (sec): 8.49 - samples/sec: 2042.29 - lr: 0.000004 - momentum: 0.000000
2023-10-13 12:40:16,627 epoch 10 - iter 267/894 - loss 0.00404990 - time (sec): 12.75 - samples/sec: 2020.69 - lr: 0.000004 - momentum: 0.000000
2023-10-13 12:40:20,968 epoch 10 - iter 356/894 - loss 0.00350026 - time (sec): 17.09 - samples/sec: 2079.62 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:40:25,142 epoch 10 - iter 445/894 - loss 0.00462522 - time (sec): 21.26 - samples/sec: 2064.72 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:40:29,316 epoch 10 - iter 534/894 - loss 0.00559551 - time (sec): 25.44 - samples/sec: 2060.31 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:40:33,584 epoch 10 - iter 623/894 - loss 0.00545572 - time (sec): 29.70 - samples/sec: 2024.32 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:40:38,084 epoch 10 - iter 712/894 - loss 0.00477820 - time (sec): 34.20 - samples/sec: 2010.58 - lr: 0.000001 - momentum: 0.000000
2023-10-13 12:40:42,361 epoch 10 - iter 801/894 - loss 0.00505387 - time (sec): 38.48 - samples/sec: 1999.45 - lr: 0.000001 - momentum: 0.000000
2023-10-13 12:40:46,742 epoch 10 - iter 890/894 - loss 0.00486550 - time (sec): 42.86 - samples/sec: 2012.22 - lr: 0.000000 - momentum: 0.000000
2023-10-13 12:40:46,927 ----------------------------------------------------------------------------------------------------
2023-10-13 12:40:46,927 EPOCH 10 done: loss 0.0048 - lr: 0.000000
2023-10-13 12:40:56,024 DEV : loss 0.2590446472167969 - f1-score (micro avg) 0.786
2023-10-13 12:40:56,051 saving best model
2023-10-13 12:40:56,820 ----------------------------------------------------------------------------------------------------
2023-10-13 12:40:56,822 Loading model from best epoch ...
2023-10-13 12:40:58,285 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-13 12:41:03,540
Results:
- F-score (micro) 0.7304
- F-score (macro) 0.6539
- Accuracy 0.5962
By class:
              precision    recall  f1-score   support

         loc     0.8065    0.8322    0.8192       596
        pers     0.6715    0.6937    0.6824       333
         org     0.5800    0.4394    0.5000       132
        prod     0.6735    0.5000    0.5739        66
        time     0.6939    0.6939    0.6939        49

   micro avg     0.7364    0.7245    0.7304      1176
   macro avg     0.6851    0.6318    0.6539      1176
weighted avg     0.7307    0.7245    0.7256      1176
2023-10-13 12:41:03,540 ----------------------------------------------------------------------------------------------------