2023-10-25 21:07:49,395 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
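The final linear layer's `out_features=17` follows directly from the tag dictionary logged at the end of training: a BIOES scheme (S-, B-, E-, I- prefixes) over the four NewsEye entity types (LOC, PER, ORG, HumanProd) plus the O tag. A quick arithmetic check:

```python
# Why the head is Linear(in_features=768, out_features=17):
# BIOES tagging over 4 entity types, plus the outside tag O.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
prefixes = ["S", "B", "E", "I"]

tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in prefixes]
print(len(tags))  # 17 = 4 types x 4 prefixes + O
```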
2023-10-25 21:07:49,396 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Train: 1166 sentences
2023-10-25 21:07:49,396 (train_with_dev=False, train_with_test=False)
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Training Params:
2023-10-25 21:07:49,396 - learning_rate: "5e-05"
2023-10-25 21:07:49,396 - mini_batch_size: "8"
2023-10-25 21:07:49,396 - max_epochs: "10"
2023-10-25 21:07:49,396 - shuffle: "True"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Plugins:
2023-10-25 21:07:49,396 - TensorboardLogger
2023-10-25 21:07:49,396 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
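The LinearScheduler plugin with `warmup_fraction: 0.1` produces the lr trace visible in the per-iteration lines below: a linear ramp to the 5e-05 peak over the first 10% of steps (146 of the 1460 total, i.e. exactly epoch 1), then linear decay to 0 by the last step. A minimal sketch of that shape (function name and signature are mine, not Flair's implementation):

```python
def linear_schedule_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    """One-cycle linear warmup/decay schedule (sketch, not Flair's code)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # linear warmup from ~0 to peak_lr
        return peak_lr * step / max(1, warmup_steps)
    # linear decay from peak_lr back to 0 over the remaining steps
    return peak_lr * (total_steps - step) / max(1, total_steps - warmup_steps)

total = 146 * 10  # 146 mini-batches per epoch, 10 epochs
print(linear_schedule_lr(146, total))   # peak of 5e-05 at the end of warmup
print(linear_schedule_lr(total, total)) # 0.0 at the final step
```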
2023-10-25 21:07:49,396 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:07:49,396 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Computation:
2023-10-25 21:07:49,396 - compute on device: cuda:0
2023-10-25 21:07:49,396 - embedding storage: none
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
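The base path encodes the run configuration in its directory name. A hedged parsing sketch; the field names below are my reading of the naming scheme (the meaning of the `ws` flag is not stated in this log), not an official format:

```python
import re

# Decode hyperparameters from the run directory name (field names are guesses).
path = ("hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased"
        "-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3")

m = re.search(
    r"-bs(?P<batch_size>\d+)"
    r"-ws(?P<ws>True|False)"
    r"-e(?P<epochs>\d+)"
    r"-lr(?P<lr>[\d.e-]+?)"
    r"-pooling(?P<pooling>\w+?)"
    r"-layers(?P<layers>-?\d+)"
    r"-crf(?P<crf>True|False)"
    r"-(?P<seed>\d+)$",
    path,
)
print(m.groupdict())  # batch_size=8, epochs=10, lr=5e-05, layers=-1, seed=3, ...
```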
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,397 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:07:50,189 epoch 1 - iter 14/146 - loss 2.76093476 - time (sec): 0.79 - samples/sec: 4306.04 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:50,999 epoch 1 - iter 28/146 - loss 2.16927590 - time (sec): 1.60 - samples/sec: 4446.03 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:07:51,943 epoch 1 - iter 42/146 - loss 1.56576289 - time (sec): 2.55 - samples/sec: 4650.49 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:07:52,744 epoch 1 - iter 56/146 - loss 1.31295193 - time (sec): 3.35 - samples/sec: 4680.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:07:53,535 epoch 1 - iter 70/146 - loss 1.14715761 - time (sec): 4.14 - samples/sec: 4718.31 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:07:54,508 epoch 1 - iter 84/146 - loss 1.02143305 - time (sec): 5.11 - samples/sec: 4686.68 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:07:55,436 epoch 1 - iter 98/146 - loss 0.90968592 - time (sec): 6.04 - samples/sec: 4766.54 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:07:56,476 epoch 1 - iter 112/146 - loss 0.82600257 - time (sec): 7.08 - samples/sec: 4773.82 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:07:57,299 epoch 1 - iter 126/146 - loss 0.76327971 - time (sec): 7.90 - samples/sec: 4816.97 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:07:58,227 epoch 1 - iter 140/146 - loss 0.70269213 - time (sec): 8.83 - samples/sec: 4846.77 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:07:58,636 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:58,637 EPOCH 1 done: loss 0.6875 - lr: 0.000048
2023-10-25 21:07:59,149 DEV : loss 0.1478959619998932 - f1-score (micro avg) 0.6147
2023-10-25 21:07:59,153 saving best model
2023-10-25 21:07:59,662 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:00,616 epoch 2 - iter 14/146 - loss 0.16108065 - time (sec): 0.95 - samples/sec: 4811.08 - lr: 0.000050 - momentum: 0.000000
2023-10-25 21:08:01,592 epoch 2 - iter 28/146 - loss 0.15067828 - time (sec): 1.93 - samples/sec: 4899.11 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:08:02,467 epoch 2 - iter 42/146 - loss 0.15492269 - time (sec): 2.80 - samples/sec: 4896.36 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:08:03,367 epoch 2 - iter 56/146 - loss 0.16102186 - time (sec): 3.70 - samples/sec: 4832.43 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:08:04,144 epoch 2 - iter 70/146 - loss 0.15945960 - time (sec): 4.48 - samples/sec: 4838.41 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:08:04,909 epoch 2 - iter 84/146 - loss 0.16413615 - time (sec): 5.25 - samples/sec: 4824.82 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:08:05,687 epoch 2 - iter 98/146 - loss 0.15984296 - time (sec): 6.02 - samples/sec: 4843.16 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:08:06,672 epoch 2 - iter 112/146 - loss 0.15388793 - time (sec): 7.01 - samples/sec: 4844.91 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:08:07,514 epoch 2 - iter 126/146 - loss 0.14973376 - time (sec): 7.85 - samples/sec: 4895.41 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:08:08,387 epoch 2 - iter 140/146 - loss 0.14951333 - time (sec): 8.72 - samples/sec: 4914.47 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:08:08,740 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:08,740 EPOCH 2 done: loss 0.1491 - lr: 0.000045
2023-10-25 21:08:09,802 DEV : loss 0.10244478285312653 - f1-score (micro avg) 0.722
2023-10-25 21:08:09,806 saving best model
2023-10-25 21:08:10,488 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:11,393 epoch 3 - iter 14/146 - loss 0.09051348 - time (sec): 0.90 - samples/sec: 4622.43 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:08:12,169 epoch 3 - iter 28/146 - loss 0.08682021 - time (sec): 1.68 - samples/sec: 4486.31 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:08:13,122 epoch 3 - iter 42/146 - loss 0.08134884 - time (sec): 2.63 - samples/sec: 4551.81 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:08:13,976 epoch 3 - iter 56/146 - loss 0.07868931 - time (sec): 3.49 - samples/sec: 4454.44 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:08:15,095 epoch 3 - iter 70/146 - loss 0.08088185 - time (sec): 4.60 - samples/sec: 4610.30 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:08:15,955 epoch 3 - iter 84/146 - loss 0.08151152 - time (sec): 5.47 - samples/sec: 4731.13 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:08:16,823 epoch 3 - iter 98/146 - loss 0.08031537 - time (sec): 6.33 - samples/sec: 4791.34 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:08:17,535 epoch 3 - iter 112/146 - loss 0.08190989 - time (sec): 7.05 - samples/sec: 4836.59 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:08:18,326 epoch 3 - iter 126/146 - loss 0.08448286 - time (sec): 7.84 - samples/sec: 4838.21 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:08:19,203 epoch 3 - iter 140/146 - loss 0.08269211 - time (sec): 8.71 - samples/sec: 4854.04 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:08:19,638 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:19,639 EPOCH 3 done: loss 0.0836 - lr: 0.000039
2023-10-25 21:08:20,557 DEV : loss 0.10184833407402039 - f1-score (micro avg) 0.7212
2023-10-25 21:08:20,562 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:21,520 epoch 4 - iter 14/146 - loss 0.06152145 - time (sec): 0.96 - samples/sec: 5271.53 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:08:22,334 epoch 4 - iter 28/146 - loss 0.05606755 - time (sec): 1.77 - samples/sec: 4945.36 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:08:23,145 epoch 4 - iter 42/146 - loss 0.05920449 - time (sec): 2.58 - samples/sec: 4896.12 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:08:24,097 epoch 4 - iter 56/146 - loss 0.05389913 - time (sec): 3.53 - samples/sec: 4806.51 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:08:24,831 epoch 4 - iter 70/146 - loss 0.05447504 - time (sec): 4.27 - samples/sec: 4767.50 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:08:25,787 epoch 4 - iter 84/146 - loss 0.05784428 - time (sec): 5.22 - samples/sec: 4714.07 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:08:26,661 epoch 4 - iter 98/146 - loss 0.05700628 - time (sec): 6.10 - samples/sec: 4729.43 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:08:27,499 epoch 4 - iter 112/146 - loss 0.05350858 - time (sec): 6.94 - samples/sec: 4689.34 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:08:28,507 epoch 4 - iter 126/146 - loss 0.05232477 - time (sec): 7.94 - samples/sec: 4696.73 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:08:29,417 epoch 4 - iter 140/146 - loss 0.05225578 - time (sec): 8.85 - samples/sec: 4793.42 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:08:29,757 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:29,757 EPOCH 4 done: loss 0.0523 - lr: 0.000034
2023-10-25 21:08:30,673 DEV : loss 0.10563240945339203 - f1-score (micro avg) 0.7404
2023-10-25 21:08:30,678 saving best model
2023-10-25 21:08:31,335 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:32,246 epoch 5 - iter 14/146 - loss 0.02488614 - time (sec): 0.91 - samples/sec: 5101.79 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:08:33,018 epoch 5 - iter 28/146 - loss 0.02254102 - time (sec): 1.68 - samples/sec: 4976.30 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:08:33,914 epoch 5 - iter 42/146 - loss 0.02974599 - time (sec): 2.58 - samples/sec: 5077.10 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:08:34,838 epoch 5 - iter 56/146 - loss 0.02885809 - time (sec): 3.50 - samples/sec: 4900.07 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:08:35,787 epoch 5 - iter 70/146 - loss 0.02769486 - time (sec): 4.45 - samples/sec: 4744.69 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:08:36,619 epoch 5 - iter 84/146 - loss 0.02843650 - time (sec): 5.28 - samples/sec: 4729.31 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:08:37,565 epoch 5 - iter 98/146 - loss 0.03048721 - time (sec): 6.23 - samples/sec: 4678.06 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:08:38,453 epoch 5 - iter 112/146 - loss 0.03202213 - time (sec): 7.12 - samples/sec: 4695.52 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:08:39,677 epoch 5 - iter 126/146 - loss 0.03216288 - time (sec): 8.34 - samples/sec: 4614.52 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:08:40,447 epoch 5 - iter 140/146 - loss 0.03299174 - time (sec): 9.11 - samples/sec: 4678.91 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:08:40,842 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:40,842 EPOCH 5 done: loss 0.0327 - lr: 0.000028
2023-10-25 21:08:41,755 DEV : loss 0.10233564674854279 - f1-score (micro avg) 0.7706
2023-10-25 21:08:41,760 saving best model
2023-10-25 21:08:42,313 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:43,115 epoch 6 - iter 14/146 - loss 0.01610677 - time (sec): 0.80 - samples/sec: 5199.94 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:08:44,073 epoch 6 - iter 28/146 - loss 0.01771595 - time (sec): 1.76 - samples/sec: 4763.73 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:08:45,030 epoch 6 - iter 42/146 - loss 0.01749347 - time (sec): 2.72 - samples/sec: 4851.45 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:08:45,893 epoch 6 - iter 56/146 - loss 0.01637983 - time (sec): 3.58 - samples/sec: 4827.76 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:08:46,839 epoch 6 - iter 70/146 - loss 0.02013402 - time (sec): 4.52 - samples/sec: 4762.24 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:08:47,690 epoch 6 - iter 84/146 - loss 0.01826125 - time (sec): 5.38 - samples/sec: 4727.79 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:08:48,586 epoch 6 - iter 98/146 - loss 0.01955487 - time (sec): 6.27 - samples/sec: 4801.37 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:08:49,583 epoch 6 - iter 112/146 - loss 0.02207506 - time (sec): 7.27 - samples/sec: 4730.97 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:08:50,408 epoch 6 - iter 126/146 - loss 0.02262062 - time (sec): 8.09 - samples/sec: 4746.73 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:08:51,267 epoch 6 - iter 140/146 - loss 0.02172538 - time (sec): 8.95 - samples/sec: 4777.39 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:08:51,616 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:51,617 EPOCH 6 done: loss 0.0229 - lr: 0.000023
2023-10-25 21:08:52,528 DEV : loss 0.127528578042984 - f1-score (micro avg) 0.7511
2023-10-25 21:08:52,532 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:53,355 epoch 7 - iter 14/146 - loss 0.01150269 - time (sec): 0.82 - samples/sec: 5084.44 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:08:54,542 epoch 7 - iter 28/146 - loss 0.01576881 - time (sec): 2.01 - samples/sec: 5019.70 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:08:55,297 epoch 7 - iter 42/146 - loss 0.01762664 - time (sec): 2.76 - samples/sec: 4869.93 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:08:56,135 epoch 7 - iter 56/146 - loss 0.01668696 - time (sec): 3.60 - samples/sec: 4798.61 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:08:56,965 epoch 7 - iter 70/146 - loss 0.01515176 - time (sec): 4.43 - samples/sec: 4829.07 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:08:57,897 epoch 7 - iter 84/146 - loss 0.01337344 - time (sec): 5.36 - samples/sec: 4908.17 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:08:58,754 epoch 7 - iter 98/146 - loss 0.01419310 - time (sec): 6.22 - samples/sec: 4934.76 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:08:59,525 epoch 7 - iter 112/146 - loss 0.01622088 - time (sec): 6.99 - samples/sec: 4885.45 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:09:00,360 epoch 7 - iter 126/146 - loss 0.01549015 - time (sec): 7.83 - samples/sec: 4890.68 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:09:01,272 epoch 7 - iter 140/146 - loss 0.01516680 - time (sec): 8.74 - samples/sec: 4861.79 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:09:01,676 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:01,676 EPOCH 7 done: loss 0.0146 - lr: 0.000017
2023-10-25 21:09:02,587 DEV : loss 0.13177402317523956 - f1-score (micro avg) 0.7689
2023-10-25 21:09:02,591 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:03,473 epoch 8 - iter 14/146 - loss 0.02475525 - time (sec): 0.88 - samples/sec: 4376.44 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:09:04,487 epoch 8 - iter 28/146 - loss 0.01803622 - time (sec): 1.90 - samples/sec: 4481.02 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:09:05,574 epoch 8 - iter 42/146 - loss 0.01556120 - time (sec): 2.98 - samples/sec: 4529.62 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:09:06,474 epoch 8 - iter 56/146 - loss 0.01480972 - time (sec): 3.88 - samples/sec: 4494.81 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:09:07,396 epoch 8 - iter 70/146 - loss 0.01592848 - time (sec): 4.80 - samples/sec: 4523.32 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:09:08,211 epoch 8 - iter 84/146 - loss 0.01469362 - time (sec): 5.62 - samples/sec: 4618.26 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:09:08,991 epoch 8 - iter 98/146 - loss 0.01471966 - time (sec): 6.40 - samples/sec: 4605.33 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:09:09,977 epoch 8 - iter 112/146 - loss 0.01407256 - time (sec): 7.39 - samples/sec: 4671.32 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:09:10,789 epoch 8 - iter 126/146 - loss 0.01427331 - time (sec): 8.20 - samples/sec: 4669.48 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:09:11,640 epoch 8 - iter 140/146 - loss 0.01285895 - time (sec): 9.05 - samples/sec: 4747.94 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:09:11,945 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:11,946 EPOCH 8 done: loss 0.0125 - lr: 0.000012
2023-10-25 21:09:13,015 DEV : loss 0.14684733748435974 - f1-score (micro avg) 0.7458
2023-10-25 21:09:13,020 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:13,897 epoch 9 - iter 14/146 - loss 0.00374008 - time (sec): 0.88 - samples/sec: 5312.36 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:09:14,741 epoch 9 - iter 28/146 - loss 0.00602499 - time (sec): 1.72 - samples/sec: 5187.02 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:09:15,543 epoch 9 - iter 42/146 - loss 0.00469098 - time (sec): 2.52 - samples/sec: 4978.45 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:09:16,616 epoch 9 - iter 56/146 - loss 0.00530783 - time (sec): 3.59 - samples/sec: 4869.76 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:09:17,612 epoch 9 - iter 70/146 - loss 0.00888826 - time (sec): 4.59 - samples/sec: 4850.68 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:09:18,526 epoch 9 - iter 84/146 - loss 0.00861823 - time (sec): 5.50 - samples/sec: 4792.41 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:09:19,463 epoch 9 - iter 98/146 - loss 0.00801291 - time (sec): 6.44 - samples/sec: 4800.57 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:09:20,244 epoch 9 - iter 112/146 - loss 0.00827974 - time (sec): 7.22 - samples/sec: 4759.68 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:09:21,147 epoch 9 - iter 126/146 - loss 0.00760175 - time (sec): 8.13 - samples/sec: 4737.94 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:09:22,026 epoch 9 - iter 140/146 - loss 0.00722376 - time (sec): 9.00 - samples/sec: 4748.75 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:09:22,355 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:22,355 EPOCH 9 done: loss 0.0073 - lr: 0.000006
2023-10-25 21:09:23,267 DEV : loss 0.1549414098262787 - f1-score (micro avg) 0.7692
2023-10-25 21:09:23,271 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:24,097 epoch 10 - iter 14/146 - loss 0.00128690 - time (sec): 0.82 - samples/sec: 5057.66 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:09:24,940 epoch 10 - iter 28/146 - loss 0.00525616 - time (sec): 1.67 - samples/sec: 5092.22 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:09:25,838 epoch 10 - iter 42/146 - loss 0.00459227 - time (sec): 2.57 - samples/sec: 4870.74 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:09:26,823 epoch 10 - iter 56/146 - loss 0.00708331 - time (sec): 3.55 - samples/sec: 4768.34 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:09:27,709 epoch 10 - iter 70/146 - loss 0.00587957 - time (sec): 4.44 - samples/sec: 4762.88 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:09:28,497 epoch 10 - iter 84/146 - loss 0.00554849 - time (sec): 5.22 - samples/sec: 4713.50 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:09:29,533 epoch 10 - iter 98/146 - loss 0.00596721 - time (sec): 6.26 - samples/sec: 4666.95 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:09:30,455 epoch 10 - iter 112/146 - loss 0.00572547 - time (sec): 7.18 - samples/sec: 4724.35 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:09:31,589 epoch 10 - iter 126/146 - loss 0.00547473 - time (sec): 8.32 - samples/sec: 4612.75 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:09:32,468 epoch 10 - iter 140/146 - loss 0.00498630 - time (sec): 9.20 - samples/sec: 4651.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:09:32,839 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:32,839 EPOCH 10 done: loss 0.0049 - lr: 0.000000
2023-10-25 21:09:33,748 DEV : loss 0.15722544491291046 - f1-score (micro avg) 0.7706
2023-10-25 21:09:34,271 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:34,272 Loading model from best epoch ...
2023-10-25 21:09:35,987 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 21:09:37,504
Results:
- F-score (micro) 0.7628
- F-score (macro) 0.6702
- Accuracy 0.6352
By class:
              precision    recall  f1-score   support

         PER     0.8319    0.8391    0.8355       348
         LOC     0.6656    0.8314    0.7394       261
         ORG     0.4468    0.4038    0.4242        52
   HumanProd     0.6818    0.6818    0.6818        22

   micro avg     0.7306    0.7980    0.7628       683
   macro avg     0.6565    0.6890    0.6702       683
weighted avg     0.7342    0.7980    0.7625       683
2023-10-25 21:09:37,504 ----------------------------------------------------------------------------------------------------
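As a sanity check, the reported aggregate scores are internally consistent: the micro F1 is the harmonic mean of the micro precision and recall, the macro F1 is the unweighted mean of the four per-class F1 scores, and the per-class supports sum to the 683 total. In plain arithmetic:

```python
# Verify the final evaluation table's aggregate rows from its own numbers.
micro_p, micro_r = 0.7306, 0.7980
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_f1, 4))  # 0.7628, matching "F-score (micro)"

macro_f1 = (0.8355 + 0.7394 + 0.4242 + 0.6818) / 4
print(round(macro_f1, 4))  # 0.6702, matching "F-score (macro)"

support = {"PER": 348, "LOC": 261, "ORG": 52, "HumanProd": 22}
print(sum(support.values()))  # 683, the total support
```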