2023-10-18 14:46:31,483 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:31,483 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-18 14:46:31,483 ----------------------------------------------------------------------------------------------------
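The module dimensions printed above are enough to sanity-check the size of this hmBERT-tiny encoder. The following back-of-the-envelope tally (my own arithmetic from the printed shapes, not anything emitted by the log) adds up the trainable parameters, including the tagger's 128-to-25 classification head:

```python
# Parameter count for the printed BertModel
# (hidden=128, intermediate=512, 2 layers, vocab=32001),
# plus the SequenceTagger's (linear): Linear(128, 25) head.

hidden, inter, layers, vocab = 128, 512, 2, 32001

embeddings = (
    vocab * hidden    # word_embeddings
    + 512 * hidden    # position_embeddings
    + 2 * hidden      # token_type_embeddings
    + 2 * hidden      # LayerNorm weight + bias
)

per_layer = (
    4 * (hidden * hidden + hidden)  # query/key/value/self-output dense
    + 2 * hidden                    # attention LayerNorm
    + hidden * inter + inter        # intermediate dense
    + inter * hidden + hidden       # output dense
    + 2 * hidden                    # output LayerNorm
)

pooler = hidden * hidden + hidden
head = hidden * 25 + 25             # classification layer

total = embeddings + layers * per_layer + pooler + head
print(f"{total:,}")  # ~4.6M parameters, consistent with a "bert-tiny" model
```

The locked dropout and loss function contribute no parameters, so the whole tagger stays under 5M weights.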
2023-10-18 14:46:31,483 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:46:31,483 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:31,483 Train: 1100 sentences
2023-10-18 14:46:31,483 (train_with_dev=False, train_with_test=False)
2023-10-18 14:46:31,483 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:31,483 Training Params:
2023-10-18 14:46:31,483 - learning_rate: "5e-05"
2023-10-18 14:46:31,483 - mini_batch_size: "4"
2023-10-18 14:46:31,483 - max_epochs: "10"
2023-10-18 14:46:31,483 - shuffle: "True"
2023-10-18 14:46:31,483 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:31,484 Plugins:
2023-10-18 14:46:31,484 - TensorboardLogger
2023-10-18 14:46:31,484 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:46:31,484 ----------------------------------------------------------------------------------------------------
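The lr column in the iteration lines below is produced by the LinearScheduler plugin: with warmup_fraction 0.1 and 275 batches x 10 epochs = 2750 steps, the rate climbs linearly to 5e-05 over the first 275 steps (exactly epoch 1) and then decays linearly to zero. A minimal plain-Python sketch of such a warmup-then-decay schedule (an approximation for illustration, not Flair's actual implementation):

```python
def linear_schedule(step, base_lr=5e-5, total_steps=2750, warmup_fraction=0.1):
    """Linear warmup to base_lr, then linear decay to 0."""
    warmup = int(total_steps * warmup_fraction)  # 275 steps = one epoch here
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * (total_steps - step) / (total_steps - warmup)

# Matches the logged values: ~0.000005 at epoch 1 iter 27, ~0.000049 at the
# end of warmup, ~0.000045 late in epoch 2, and 0 at the final step.
print(round(linear_schedule(27), 6), round(linear_schedule(545), 6))
```

Note that momentum is logged as 0.000000 throughout, as expected for the AdamW-style fine-tuning setup this schedule is typically paired with.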
2023-10-18 14:46:31,484 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:46:31,484 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:46:31,484 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:31,484 Computation:
2023-10-18 14:46:31,484 - compute on device: cuda:0
2023-10-18 14:46:31,484 - embedding storage: none
2023-10-18 14:46:31,484 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:31,484 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 14:46:31,484 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:31,484 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:31,484 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:46:31,911 epoch 1 - iter 27/275 - loss 3.60427767 - time (sec): 0.43 - samples/sec: 5580.00 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:46:32,309 epoch 1 - iter 54/275 - loss 3.53847157 - time (sec): 0.82 - samples/sec: 5493.79 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:46:32,732 epoch 1 - iter 81/275 - loss 3.44585124 - time (sec): 1.25 - samples/sec: 5500.36 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:46:33,164 epoch 1 - iter 108/275 - loss 3.23727993 - time (sec): 1.68 - samples/sec: 5405.62 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:46:33,557 epoch 1 - iter 135/275 - loss 3.03084574 - time (sec): 2.07 - samples/sec: 5434.32 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:46:33,962 epoch 1 - iter 162/275 - loss 2.77873916 - time (sec): 2.48 - samples/sec: 5494.55 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:46:34,378 epoch 1 - iter 189/275 - loss 2.55578251 - time (sec): 2.89 - samples/sec: 5551.85 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:46:34,785 epoch 1 - iter 216/275 - loss 2.36375618 - time (sec): 3.30 - samples/sec: 5584.46 - lr: 0.000039 - momentum: 0.000000
2023-10-18 14:46:35,184 epoch 1 - iter 243/275 - loss 2.22483879 - time (sec): 3.70 - samples/sec: 5542.49 - lr: 0.000044 - momentum: 0.000000
2023-10-18 14:46:35,583 epoch 1 - iter 270/275 - loss 2.11483675 - time (sec): 4.10 - samples/sec: 5460.18 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:46:35,658 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:35,658 EPOCH 1 done: loss 2.0977 - lr: 0.000049
2023-10-18 14:46:35,907 DEV : loss 0.8249472379684448 - f1-score (micro avg) 0.0
2023-10-18 14:46:35,911 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:36,286 epoch 2 - iter 27/275 - loss 0.81902694 - time (sec): 0.38 - samples/sec: 6657.19 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:46:36,693 epoch 2 - iter 54/275 - loss 0.78271395 - time (sec): 0.78 - samples/sec: 5926.65 - lr: 0.000049 - momentum: 0.000000
2023-10-18 14:46:37,096 epoch 2 - iter 81/275 - loss 0.78122909 - time (sec): 1.18 - samples/sec: 5753.12 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:46:37,508 epoch 2 - iter 108/275 - loss 0.76956193 - time (sec): 1.60 - samples/sec: 5730.75 - lr: 0.000048 - momentum: 0.000000
2023-10-18 14:46:37,878 epoch 2 - iter 135/275 - loss 0.74916310 - time (sec): 1.97 - samples/sec: 5883.06 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:46:38,244 epoch 2 - iter 162/275 - loss 0.73473405 - time (sec): 2.33 - samples/sec: 5937.91 - lr: 0.000047 - momentum: 0.000000
2023-10-18 14:46:38,643 epoch 2 - iter 189/275 - loss 0.72819941 - time (sec): 2.73 - samples/sec: 5817.73 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:46:39,045 epoch 2 - iter 216/275 - loss 0.72751925 - time (sec): 3.13 - samples/sec: 5725.77 - lr: 0.000046 - momentum: 0.000000
2023-10-18 14:46:39,445 epoch 2 - iter 243/275 - loss 0.71409486 - time (sec): 3.53 - samples/sec: 5741.29 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:46:39,850 epoch 2 - iter 270/275 - loss 0.70555254 - time (sec): 3.94 - samples/sec: 5680.54 - lr: 0.000045 - momentum: 0.000000
2023-10-18 14:46:39,924 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:39,924 EPOCH 2 done: loss 0.7062 - lr: 0.000045
2023-10-18 14:46:40,290 DEV : loss 0.4668181538581848 - f1-score (micro avg) 0.3678
2023-10-18 14:46:40,295 saving best model
2023-10-18 14:46:40,329 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:40,694 epoch 3 - iter 27/275 - loss 0.57378026 - time (sec): 0.36 - samples/sec: 5822.53 - lr: 0.000044 - momentum: 0.000000
2023-10-18 14:46:41,070 epoch 3 - iter 54/275 - loss 0.54911028 - time (sec): 0.74 - samples/sec: 5725.31 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:46:41,441 epoch 3 - iter 81/275 - loss 0.56147019 - time (sec): 1.11 - samples/sec: 6078.86 - lr: 0.000043 - momentum: 0.000000
2023-10-18 14:46:41,828 epoch 3 - iter 108/275 - loss 0.53541681 - time (sec): 1.50 - samples/sec: 6101.91 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:46:42,198 epoch 3 - iter 135/275 - loss 0.53139631 - time (sec): 1.87 - samples/sec: 6124.60 - lr: 0.000042 - momentum: 0.000000
2023-10-18 14:46:42,567 epoch 3 - iter 162/275 - loss 0.52948251 - time (sec): 2.24 - samples/sec: 6047.78 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:46:42,956 epoch 3 - iter 189/275 - loss 0.53492904 - time (sec): 2.63 - samples/sec: 5897.15 - lr: 0.000041 - momentum: 0.000000
2023-10-18 14:46:43,356 epoch 3 - iter 216/275 - loss 0.53448777 - time (sec): 3.03 - samples/sec: 5908.36 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:46:43,765 epoch 3 - iter 243/275 - loss 0.53606007 - time (sec): 3.44 - samples/sec: 5812.14 - lr: 0.000040 - momentum: 0.000000
2023-10-18 14:46:44,316 epoch 3 - iter 270/275 - loss 0.52272556 - time (sec): 3.99 - samples/sec: 5628.54 - lr: 0.000039 - momentum: 0.000000
2023-10-18 14:46:44,388 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:44,388 EPOCH 3 done: loss 0.5248 - lr: 0.000039
2023-10-18 14:46:44,752 DEV : loss 0.3869856297969818 - f1-score (micro avg) 0.4924
2023-10-18 14:46:44,756 saving best model
2023-10-18 14:46:44,798 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:45,203 epoch 4 - iter 27/275 - loss 0.40150258 - time (sec): 0.40 - samples/sec: 5905.00 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:46:45,606 epoch 4 - iter 54/275 - loss 0.39228535 - time (sec): 0.81 - samples/sec: 5777.27 - lr: 0.000038 - momentum: 0.000000
2023-10-18 14:46:46,006 epoch 4 - iter 81/275 - loss 0.40830216 - time (sec): 1.21 - samples/sec: 5526.34 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:46:46,434 epoch 4 - iter 108/275 - loss 0.39555135 - time (sec): 1.64 - samples/sec: 5438.34 - lr: 0.000037 - momentum: 0.000000
2023-10-18 14:46:46,839 epoch 4 - iter 135/275 - loss 0.40624830 - time (sec): 2.04 - samples/sec: 5554.51 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:46:47,239 epoch 4 - iter 162/275 - loss 0.40411306 - time (sec): 2.44 - samples/sec: 5486.74 - lr: 0.000036 - momentum: 0.000000
2023-10-18 14:46:47,630 epoch 4 - iter 189/275 - loss 0.41093818 - time (sec): 2.83 - samples/sec: 5458.66 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:46:48,032 epoch 4 - iter 216/275 - loss 0.41566907 - time (sec): 3.23 - samples/sec: 5435.98 - lr: 0.000035 - momentum: 0.000000
2023-10-18 14:46:48,447 epoch 4 - iter 243/275 - loss 0.42718447 - time (sec): 3.65 - samples/sec: 5484.15 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:46:48,865 epoch 4 - iter 270/275 - loss 0.42318262 - time (sec): 4.07 - samples/sec: 5497.34 - lr: 0.000034 - momentum: 0.000000
2023-10-18 14:46:48,939 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:48,939 EPOCH 4 done: loss 0.4242 - lr: 0.000034
2023-10-18 14:46:49,308 DEV : loss 0.33181697130203247 - f1-score (micro avg) 0.5721
2023-10-18 14:46:49,312 saving best model
2023-10-18 14:46:49,347 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:49,746 epoch 5 - iter 27/275 - loss 0.32643325 - time (sec): 0.40 - samples/sec: 5434.98 - lr: 0.000033 - momentum: 0.000000
2023-10-18 14:46:50,140 epoch 5 - iter 54/275 - loss 0.38503466 - time (sec): 0.79 - samples/sec: 5467.38 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:46:50,547 epoch 5 - iter 81/275 - loss 0.37432988 - time (sec): 1.20 - samples/sec: 5553.12 - lr: 0.000032 - momentum: 0.000000
2023-10-18 14:46:50,955 epoch 5 - iter 108/275 - loss 0.36444609 - time (sec): 1.61 - samples/sec: 5532.46 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:46:51,361 epoch 5 - iter 135/275 - loss 0.38091347 - time (sec): 2.01 - samples/sec: 5413.71 - lr: 0.000031 - momentum: 0.000000
2023-10-18 14:46:51,783 epoch 5 - iter 162/275 - loss 0.37677816 - time (sec): 2.44 - samples/sec: 5357.69 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:46:52,203 epoch 5 - iter 189/275 - loss 0.37777729 - time (sec): 2.86 - samples/sec: 5457.15 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:46:52,610 epoch 5 - iter 216/275 - loss 0.36413619 - time (sec): 3.26 - samples/sec: 5376.81 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:46:53,018 epoch 5 - iter 243/275 - loss 0.37879068 - time (sec): 3.67 - samples/sec: 5440.56 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:46:53,426 epoch 5 - iter 270/275 - loss 0.37792066 - time (sec): 4.08 - samples/sec: 5458.85 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:46:53,508 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:53,508 EPOCH 5 done: loss 0.3780 - lr: 0.000028
2023-10-18 14:46:53,876 DEV : loss 0.29136621952056885 - f1-score (micro avg) 0.5917
2023-10-18 14:46:53,880 saving best model
2023-10-18 14:46:53,914 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:54,314 epoch 6 - iter 27/275 - loss 0.33775861 - time (sec): 0.40 - samples/sec: 5751.04 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:46:54,702 epoch 6 - iter 54/275 - loss 0.32505727 - time (sec): 0.79 - samples/sec: 5467.74 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:46:55,088 epoch 6 - iter 81/275 - loss 0.34399252 - time (sec): 1.17 - samples/sec: 5495.05 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:46:55,493 epoch 6 - iter 108/275 - loss 0.35821638 - time (sec): 1.58 - samples/sec: 5535.75 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:46:55,896 epoch 6 - iter 135/275 - loss 0.35511108 - time (sec): 1.98 - samples/sec: 5555.34 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:46:56,295 epoch 6 - iter 162/275 - loss 0.35328830 - time (sec): 2.38 - samples/sec: 5593.88 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:46:56,706 epoch 6 - iter 189/275 - loss 0.35286958 - time (sec): 2.79 - samples/sec: 5609.22 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:46:57,118 epoch 6 - iter 216/275 - loss 0.34996273 - time (sec): 3.20 - samples/sec: 5575.49 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:46:57,530 epoch 6 - iter 243/275 - loss 0.35211615 - time (sec): 3.62 - samples/sec: 5561.76 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:46:57,944 epoch 6 - iter 270/275 - loss 0.35733180 - time (sec): 4.03 - samples/sec: 5563.43 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:46:58,016 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:58,016 EPOCH 6 done: loss 0.3549 - lr: 0.000022
2023-10-18 14:46:58,391 DEV : loss 0.2738904058933258 - f1-score (micro avg) 0.6089
2023-10-18 14:46:58,395 saving best model
2023-10-18 14:46:58,429 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:58,823 epoch 7 - iter 27/275 - loss 0.34506173 - time (sec): 0.39 - samples/sec: 5877.75 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:46:59,231 epoch 7 - iter 54/275 - loss 0.34712269 - time (sec): 0.80 - samples/sec: 5611.24 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:46:59,629 epoch 7 - iter 81/275 - loss 0.34413844 - time (sec): 1.20 - samples/sec: 5623.76 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:47:00,028 epoch 7 - iter 108/275 - loss 0.34104194 - time (sec): 1.60 - samples/sec: 5723.25 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:47:00,428 epoch 7 - iter 135/275 - loss 0.33170648 - time (sec): 2.00 - samples/sec: 5671.22 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:47:00,824 epoch 7 - iter 162/275 - loss 0.31619704 - time (sec): 2.39 - samples/sec: 5684.06 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:47:01,218 epoch 7 - iter 189/275 - loss 0.31263825 - time (sec): 2.79 - samples/sec: 5618.38 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:47:01,614 epoch 7 - iter 216/275 - loss 0.31927826 - time (sec): 3.18 - samples/sec: 5646.02 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:47:02,033 epoch 7 - iter 243/275 - loss 0.31997382 - time (sec): 3.60 - samples/sec: 5622.52 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:47:02,437 epoch 7 - iter 270/275 - loss 0.32151010 - time (sec): 4.01 - samples/sec: 5584.15 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:47:02,508 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:02,508 EPOCH 7 done: loss 0.3197 - lr: 0.000017
2023-10-18 14:47:02,876 DEV : loss 0.269247442483902 - f1-score (micro avg) 0.6094
2023-10-18 14:47:02,880 saving best model
2023-10-18 14:47:02,915 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:03,316 epoch 8 - iter 27/275 - loss 0.31986097 - time (sec): 0.40 - samples/sec: 5020.13 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:47:03,715 epoch 8 - iter 54/275 - loss 0.30295688 - time (sec): 0.80 - samples/sec: 5218.82 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:47:04,123 epoch 8 - iter 81/275 - loss 0.31479421 - time (sec): 1.21 - samples/sec: 5255.28 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:47:04,545 epoch 8 - iter 108/275 - loss 0.32365688 - time (sec): 1.63 - samples/sec: 5335.63 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:47:04,960 epoch 8 - iter 135/275 - loss 0.31327927 - time (sec): 2.04 - samples/sec: 5496.21 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:47:05,366 epoch 8 - iter 162/275 - loss 0.31111245 - time (sec): 2.45 - samples/sec: 5566.97 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:47:05,761 epoch 8 - iter 189/275 - loss 0.31008732 - time (sec): 2.85 - samples/sec: 5582.38 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:47:06,167 epoch 8 - iter 216/275 - loss 0.30335225 - time (sec): 3.25 - samples/sec: 5573.64 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:47:06,586 epoch 8 - iter 243/275 - loss 0.30252895 - time (sec): 3.67 - samples/sec: 5517.90 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:47:07,009 epoch 8 - iter 270/275 - loss 0.30460537 - time (sec): 4.09 - samples/sec: 5473.76 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:47:07,083 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:07,083 EPOCH 8 done: loss 0.3041 - lr: 0.000011
2023-10-18 14:47:07,467 DEV : loss 0.2648944556713104 - f1-score (micro avg) 0.6014
2023-10-18 14:47:07,471 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:07,900 epoch 9 - iter 27/275 - loss 0.37548290 - time (sec): 0.43 - samples/sec: 5106.56 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:47:08,314 epoch 9 - iter 54/275 - loss 0.33433064 - time (sec): 0.84 - samples/sec: 5333.58 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:47:08,741 epoch 9 - iter 81/275 - loss 0.34245202 - time (sec): 1.27 - samples/sec: 5348.00 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:47:09,149 epoch 9 - iter 108/275 - loss 0.32758265 - time (sec): 1.68 - samples/sec: 5337.14 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:47:09,557 epoch 9 - iter 135/275 - loss 0.32232435 - time (sec): 2.09 - samples/sec: 5363.30 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:47:09,960 epoch 9 - iter 162/275 - loss 0.31360330 - time (sec): 2.49 - samples/sec: 5330.50 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:47:10,379 epoch 9 - iter 189/275 - loss 0.29656919 - time (sec): 2.91 - samples/sec: 5346.10 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:47:10,804 epoch 9 - iter 216/275 - loss 0.29675440 - time (sec): 3.33 - samples/sec: 5324.77 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:47:11,211 epoch 9 - iter 243/275 - loss 0.29630007 - time (sec): 3.74 - samples/sec: 5377.20 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:47:11,618 epoch 9 - iter 270/275 - loss 0.29837352 - time (sec): 4.15 - samples/sec: 5397.92 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:47:11,693 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:11,693 EPOCH 9 done: loss 0.3001 - lr: 0.000006
2023-10-18 14:47:12,066 DEV : loss 0.2552837133407593 - f1-score (micro avg) 0.6238
2023-10-18 14:47:12,070 saving best model
2023-10-18 14:47:12,103 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:12,503 epoch 10 - iter 27/275 - loss 0.30442183 - time (sec): 0.40 - samples/sec: 5927.04 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:47:12,909 epoch 10 - iter 54/275 - loss 0.27744080 - time (sec): 0.80 - samples/sec: 5580.39 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:47:13,309 epoch 10 - iter 81/275 - loss 0.28826987 - time (sec): 1.20 - samples/sec: 5563.13 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:47:13,723 epoch 10 - iter 108/275 - loss 0.29295146 - time (sec): 1.62 - samples/sec: 5685.92 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:47:14,128 epoch 10 - iter 135/275 - loss 0.28756797 - time (sec): 2.02 - samples/sec: 5664.55 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:47:14,553 epoch 10 - iter 162/275 - loss 0.29582673 - time (sec): 2.45 - samples/sec: 5629.73 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:47:14,968 epoch 10 - iter 189/275 - loss 0.29264789 - time (sec): 2.86 - samples/sec: 5583.33 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:47:15,378 epoch 10 - iter 216/275 - loss 0.29844534 - time (sec): 3.27 - samples/sec: 5546.32 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:47:15,780 epoch 10 - iter 243/275 - loss 0.29580685 - time (sec): 3.68 - samples/sec: 5493.57 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:47:16,183 epoch 10 - iter 270/275 - loss 0.29296097 - time (sec): 4.08 - samples/sec: 5479.00 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:47:16,254 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:16,254 EPOCH 10 done: loss 0.2940 - lr: 0.000000
2023-10-18 14:47:16,625 DEV : loss 0.25367701053619385 - f1-score (micro avg) 0.6286
2023-10-18 14:47:16,629 saving best model
2023-10-18 14:47:16,692 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:16,692 Loading model from best epoch ...
2023-10-18 14:47:16,777 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
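The 25 tags listed above are the BIOES encoding of the corpus's six entity types (scope, pers, work, loc, object, date): one O tag plus four positional variants per type, which is also why the tagger's final linear layer has out_features=25. A quick check:

```python
entity_types = ["scope", "pers", "work", "loc", "object", "date"]

# BIOES: Single, Begin, End, Inside variants per type, plus the O tag.
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in "SBEI"]
print(len(tags))  # 25, matching the tag dictionary and the 128->25 head
```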
2023-10-18 14:47:17,065
Results:
- F-score (micro) 0.6515
- F-score (macro) 0.3885
- Accuracy 0.4952
By class:
              precision    recall  f1-score   support

       scope     0.5859    0.6591    0.6203       176
        pers     0.8291    0.7578    0.7918       128
        work     0.4783    0.5946    0.5301        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.6314    0.6728    0.6515       382
   macro avg     0.3786    0.4023    0.3885       382
weighted avg     0.6404    0.6728    0.6538       382
2023-10-18 14:47:17,065 ----------------------------------------------------------------------------------------------------
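The aggregate rows of the report can be re-derived from the per-class scores: micro-F1 is the harmonic mean of micro precision and recall, the macro average is the unweighted class mean, and the weighted average weights each class F1 by its support. A quick re-check from the rounded table values (last-digit differences from the log are expected, since Flair computes from unrounded counts):

```python
# Per-class (precision, recall, f1, support) as printed in the table above.
per_class = {
    "scope":  (0.5859, 0.6591, 0.6203, 176),
    "pers":   (0.8291, 0.7578, 0.7918, 128),
    "work":   (0.4783, 0.5946, 0.5301, 74),
    "object": (0.0000, 0.0000, 0.0000, 2),
    "loc":    (0.0000, 0.0000, 0.0000, 2),
}

micro_p, micro_r = 0.6314, 0.6728
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

support = sum(n for _, _, _, n in per_class.values())
weighted_f1 = sum(f1 * n for _, _, f1, n in per_class.values()) / support

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
```

The zero scores for object and loc (2 test mentions each) drag the macro average far below the micro average, which is dominated by the three well-populated classes.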