2023-10-16 20:11:42,385 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:42,386 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 20:11:42,387 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:42,387 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-16 20:11:42,387 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:42,387 Train: 1085 sentences
2023-10-16 20:11:42,387 (train_with_dev=False, train_with_test=False)
2023-10-16 20:11:42,387 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:42,387 Training Params:
2023-10-16 20:11:42,387 - learning_rate: "5e-05"
2023-10-16 20:11:42,387 - mini_batch_size: "4"
2023-10-16 20:11:42,387 - max_epochs: "10"
2023-10-16 20:11:42,387 - shuffle: "True"
2023-10-16 20:11:42,387 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:42,387 Plugins:
2023-10-16 20:11:42,387 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 20:11:42,387 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:42,387 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 20:11:42,387 - metric: "('micro avg', 'f1-score')"
2023-10-16 20:11:42,387 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:42,387 Computation:
2023-10-16 20:11:42,387 - compute on device: cuda:0
2023-10-16 20:11:42,387 - embedding storage: none
2023-10-16 20:11:42,387 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:42,387 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-16 20:11:42,388 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:42,388 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:43,924 epoch 1 - iter 27/272 - loss 2.95922567 - time (sec): 1.54 - samples/sec: 3298.78 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:11:45,701 epoch 1 - iter 54/272 - loss 2.32051200 - time (sec): 3.31 - samples/sec: 3299.49 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:11:47,348 epoch 1 - iter 81/272 - loss 1.73864625 - time (sec): 4.96 - samples/sec: 3285.05 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:11:48,971 epoch 1 - iter 108/272 - loss 1.38585266 - time (sec): 6.58 - samples/sec: 3350.45 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:11:50,588 epoch 1 - iter 135/272 - loss 1.17450241 - time (sec): 8.20 - samples/sec: 3370.04 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:11:52,042 epoch 1 - iter 162/272 - loss 1.06209535 - time (sec): 9.65 - samples/sec: 3333.26 - lr: 0.000030 - momentum: 0.000000
2023-10-16 20:11:53,643 epoch 1 - iter 189/272 - loss 0.95412205 - time (sec): 11.25 - samples/sec: 3293.65 - lr: 0.000035 - momentum: 0.000000
2023-10-16 20:11:55,144 epoch 1 - iter 216/272 - loss 0.87249746 - time (sec): 12.75 - samples/sec: 3282.64 - lr: 0.000040 - momentum: 0.000000
2023-10-16 20:11:56,784 epoch 1 - iter 243/272 - loss 0.79404191 - time (sec): 14.39 - samples/sec: 3282.89 - lr: 0.000044 - momentum: 0.000000
2023-10-16 20:11:58,211 epoch 1 - iter 270/272 - loss 0.74711671 - time (sec): 15.82 - samples/sec: 3271.21 - lr: 0.000049 - momentum: 0.000000
2023-10-16 20:11:58,297 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:58,297 EPOCH 1 done: loss 0.7444 - lr: 0.000049
2023-10-16 20:11:59,192 DEV : loss 0.13936997950077057 - f1-score (micro avg) 0.7209
2023-10-16 20:11:59,200 saving best model
2023-10-16 20:11:59,572 ----------------------------------------------------------------------------------------------------
2023-10-16 20:12:01,066 epoch 2 - iter 27/272 - loss 0.16359337 - time (sec): 1.49 - samples/sec: 3175.52 - lr: 0.000049 - momentum: 0.000000
2023-10-16 20:12:02,642 epoch 2 - iter 54/272 - loss 0.13838335 - time (sec): 3.07 - samples/sec: 3239.45 - lr: 0.000049 - momentum: 0.000000
2023-10-16 20:12:04,264 epoch 2 - iter 81/272 - loss 0.12859783 - time (sec): 4.69 - samples/sec: 3280.03 - lr: 0.000048 - momentum: 0.000000
2023-10-16 20:12:05,734 epoch 2 - iter 108/272 - loss 0.13307366 - time (sec): 6.16 - samples/sec: 3296.41 - lr: 0.000048 - momentum: 0.000000
2023-10-16 20:12:07,240 epoch 2 - iter 135/272 - loss 0.14563773 - time (sec): 7.67 - samples/sec: 3232.49 - lr: 0.000047 - momentum: 0.000000
2023-10-16 20:12:08,846 epoch 2 - iter 162/272 - loss 0.14873637 - time (sec): 9.27 - samples/sec: 3204.70 - lr: 0.000047 - momentum: 0.000000
2023-10-16 20:12:10,597 epoch 2 - iter 189/272 - loss 0.14467753 - time (sec): 11.02 - samples/sec: 3196.66 - lr: 0.000046 - momentum: 0.000000
2023-10-16 20:12:12,254 epoch 2 - iter 216/272 - loss 0.14245295 - time (sec): 12.68 - samples/sec: 3238.99 - lr: 0.000046 - momentum: 0.000000
2023-10-16 20:12:13,895 epoch 2 - iter 243/272 - loss 0.13866363 - time (sec): 14.32 - samples/sec: 3251.61 - lr: 0.000045 - momentum: 0.000000
2023-10-16 20:12:15,469 epoch 2 - iter 270/272 - loss 0.13692941 - time (sec): 15.90 - samples/sec: 3264.82 - lr: 0.000045 - momentum: 0.000000
2023-10-16 20:12:15,550 ----------------------------------------------------------------------------------------------------
2023-10-16 20:12:15,550 EPOCH 2 done: loss 0.1367 - lr: 0.000045
2023-10-16 20:12:17,247 DEV : loss 0.1134832352399826 - f1-score (micro avg) 0.7569
2023-10-16 20:12:17,252 saving best model
2023-10-16 20:12:17,749 ----------------------------------------------------------------------------------------------------
2023-10-16 20:12:19,424 epoch 3 - iter 27/272 - loss 0.10085692 - time (sec): 1.67 - samples/sec: 3479.84 - lr: 0.000044 - momentum: 0.000000
2023-10-16 20:12:21,055 epoch 3 - iter 54/272 - loss 0.09648160 - time (sec): 3.30 - samples/sec: 3521.68 - lr: 0.000043 - momentum: 0.000000
2023-10-16 20:12:22,603 epoch 3 - iter 81/272 - loss 0.08501000 - time (sec): 4.85 - samples/sec: 3374.61 - lr: 0.000043 - momentum: 0.000000
2023-10-16 20:12:24,277 epoch 3 - iter 108/272 - loss 0.09393010 - time (sec): 6.53 - samples/sec: 3289.17 - lr: 0.000042 - momentum: 0.000000
2023-10-16 20:12:25,730 epoch 3 - iter 135/272 - loss 0.09122040 - time (sec): 7.98 - samples/sec: 3250.99 - lr: 0.000042 - momentum: 0.000000
2023-10-16 20:12:27,230 epoch 3 - iter 162/272 - loss 0.08865571 - time (sec): 9.48 - samples/sec: 3247.59 - lr: 0.000041 - momentum: 0.000000
2023-10-16 20:12:28,936 epoch 3 - iter 189/272 - loss 0.08680863 - time (sec): 11.19 - samples/sec: 3292.09 - lr: 0.000041 - momentum: 0.000000
2023-10-16 20:12:30,418 epoch 3 - iter 216/272 - loss 0.08315365 - time (sec): 12.67 - samples/sec: 3273.29 - lr: 0.000040 - momentum: 0.000000
2023-10-16 20:12:31,934 epoch 3 - iter 243/272 - loss 0.08167352 - time (sec): 14.18 - samples/sec: 3281.90 - lr: 0.000040 - momentum: 0.000000
2023-10-16 20:12:33,576 epoch 3 - iter 270/272 - loss 0.08177957 - time (sec): 15.83 - samples/sec: 3270.36 - lr: 0.000039 - momentum: 0.000000
2023-10-16 20:12:33,663 ----------------------------------------------------------------------------------------------------
2023-10-16 20:12:33,663 EPOCH 3 done: loss 0.0815 - lr: 0.000039
2023-10-16 20:12:35,145 DEV : loss 0.12934227287769318 - f1-score (micro avg) 0.7786
2023-10-16 20:12:35,150 saving best model
2023-10-16 20:12:35,614 ----------------------------------------------------------------------------------------------------
2023-10-16 20:12:37,307 epoch 4 - iter 27/272 - loss 0.06994923 - time (sec): 1.69 - samples/sec: 3074.91 - lr: 0.000038 - momentum: 0.000000
2023-10-16 20:12:38,883 epoch 4 - iter 54/272 - loss 0.05718223 - time (sec): 3.27 - samples/sec: 3239.55 - lr: 0.000038 - momentum: 0.000000
2023-10-16 20:12:40,442 epoch 4 - iter 81/272 - loss 0.05004048 - time (sec): 4.83 - samples/sec: 3312.38 - lr: 0.000037 - momentum: 0.000000
2023-10-16 20:12:42,095 epoch 4 - iter 108/272 - loss 0.05133187 - time (sec): 6.48 - samples/sec: 3323.67 - lr: 0.000037 - momentum: 0.000000
2023-10-16 20:12:43,593 epoch 4 - iter 135/272 - loss 0.05180209 - time (sec): 7.98 - samples/sec: 3261.06 - lr: 0.000036 - momentum: 0.000000
2023-10-16 20:12:45,363 epoch 4 - iter 162/272 - loss 0.05254018 - time (sec): 9.75 - samples/sec: 3283.13 - lr: 0.000036 - momentum: 0.000000
2023-10-16 20:12:46,964 epoch 4 - iter 189/272 - loss 0.05384827 - time (sec): 11.35 - samples/sec: 3256.76 - lr: 0.000035 - momentum: 0.000000
2023-10-16 20:12:48,626 epoch 4 - iter 216/272 - loss 0.05136057 - time (sec): 13.01 - samples/sec: 3235.03 - lr: 0.000034 - momentum: 0.000000
2023-10-16 20:12:50,220 epoch 4 - iter 243/272 - loss 0.04939585 - time (sec): 14.60 - samples/sec: 3250.81 - lr: 0.000034 - momentum: 0.000000
2023-10-16 20:12:51,632 epoch 4 - iter 270/272 - loss 0.04895622 - time (sec): 16.02 - samples/sec: 3237.43 - lr: 0.000033 - momentum: 0.000000
2023-10-16 20:12:51,724 ----------------------------------------------------------------------------------------------------
2023-10-16 20:12:51,724 EPOCH 4 done: loss 0.0490 - lr: 0.000033
2023-10-16 20:12:53,185 DEV : loss 0.13081075251102448 - f1-score (micro avg) 0.8286
2023-10-16 20:12:53,192 saving best model
2023-10-16 20:12:53,667 ----------------------------------------------------------------------------------------------------
2023-10-16 20:12:55,298 epoch 5 - iter 27/272 - loss 0.05425176 - time (sec): 1.63 - samples/sec: 3037.57 - lr: 0.000033 - momentum: 0.000000
2023-10-16 20:12:56,770 epoch 5 - iter 54/272 - loss 0.05287029 - time (sec): 3.10 - samples/sec: 3045.97 - lr: 0.000032 - momentum: 0.000000
2023-10-16 20:12:58,299 epoch 5 - iter 81/272 - loss 0.04416411 - time (sec): 4.63 - samples/sec: 3241.53 - lr: 0.000032 - momentum: 0.000000
2023-10-16 20:13:00,025 epoch 5 - iter 108/272 - loss 0.04256412 - time (sec): 6.35 - samples/sec: 3175.71 - lr: 0.000031 - momentum: 0.000000
2023-10-16 20:13:01,566 epoch 5 - iter 135/272 - loss 0.03979846 - time (sec): 7.89 - samples/sec: 3164.58 - lr: 0.000031 - momentum: 0.000000
2023-10-16 20:13:03,123 epoch 5 - iter 162/272 - loss 0.03933371 - time (sec): 9.45 - samples/sec: 3186.42 - lr: 0.000030 - momentum: 0.000000
2023-10-16 20:13:04,803 epoch 5 - iter 189/272 - loss 0.03588228 - time (sec): 11.13 - samples/sec: 3207.14 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:13:06,412 epoch 5 - iter 216/272 - loss 0.03528880 - time (sec): 12.74 - samples/sec: 3224.43 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:13:07,991 epoch 5 - iter 243/272 - loss 0.03455066 - time (sec): 14.32 - samples/sec: 3225.97 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:13:09,530 epoch 5 - iter 270/272 - loss 0.03343776 - time (sec): 15.86 - samples/sec: 3240.11 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:13:09,727 ----------------------------------------------------------------------------------------------------
2023-10-16 20:13:09,727 EPOCH 5 done: loss 0.0331 - lr: 0.000028
2023-10-16 20:13:11,202 DEV : loss 0.1527273952960968 - f1-score (micro avg) 0.7935
2023-10-16 20:13:11,207 ----------------------------------------------------------------------------------------------------
2023-10-16 20:13:12,713 epoch 6 - iter 27/272 - loss 0.01883225 - time (sec): 1.51 - samples/sec: 3001.91 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:13:14,269 epoch 6 - iter 54/272 - loss 0.02182728 - time (sec): 3.06 - samples/sec: 3091.68 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:13:15,971 epoch 6 - iter 81/272 - loss 0.02015877 - time (sec): 4.76 - samples/sec: 3205.41 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:13:17,821 epoch 6 - iter 108/272 - loss 0.02272207 - time (sec): 6.61 - samples/sec: 3260.86 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:13:19,409 epoch 6 - iter 135/272 - loss 0.02407593 - time (sec): 8.20 - samples/sec: 3271.97 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:13:21,087 epoch 6 - iter 162/272 - loss 0.02539970 - time (sec): 9.88 - samples/sec: 3142.44 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:13:22,652 epoch 6 - iter 189/272 - loss 0.02622946 - time (sec): 11.44 - samples/sec: 3134.59 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:13:24,262 epoch 6 - iter 216/272 - loss 0.02509676 - time (sec): 13.05 - samples/sec: 3177.71 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:13:25,864 epoch 6 - iter 243/272 - loss 0.02277234 - time (sec): 14.66 - samples/sec: 3224.19 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:13:27,277 epoch 6 - iter 270/272 - loss 0.02396847 - time (sec): 16.07 - samples/sec: 3224.61 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:13:27,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:13:27,369 EPOCH 6 done: loss 0.0242 - lr: 0.000022
2023-10-16 20:13:28,839 DEV : loss 0.15852421522140503 - f1-score (micro avg) 0.8133
2023-10-16 20:13:28,844 ----------------------------------------------------------------------------------------------------
2023-10-16 20:13:30,407 epoch 7 - iter 27/272 - loss 0.01518661 - time (sec): 1.56 - samples/sec: 3243.62 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:13:32,194 epoch 7 - iter 54/272 - loss 0.01064029 - time (sec): 3.35 - samples/sec: 3364.96 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:13:33,880 epoch 7 - iter 81/272 - loss 0.01487741 - time (sec): 5.04 - samples/sec: 3289.63 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:13:35,587 epoch 7 - iter 108/272 - loss 0.02170189 - time (sec): 6.74 - samples/sec: 3303.03 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:13:37,127 epoch 7 - iter 135/272 - loss 0.02038448 - time (sec): 8.28 - samples/sec: 3261.03 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:13:38,794 epoch 7 - iter 162/272 - loss 0.01901781 - time (sec): 9.95 - samples/sec: 3299.77 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:13:40,322 epoch 7 - iter 189/272 - loss 0.01954576 - time (sec): 11.48 - samples/sec: 3287.24 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:13:41,770 epoch 7 - iter 216/272 - loss 0.01865636 - time (sec): 12.93 - samples/sec: 3273.03 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:13:43,348 epoch 7 - iter 243/272 - loss 0.01877717 - time (sec): 14.50 - samples/sec: 3253.76 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:13:44,842 epoch 7 - iter 270/272 - loss 0.01826347 - time (sec): 16.00 - samples/sec: 3234.50 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:13:44,935 ----------------------------------------------------------------------------------------------------
2023-10-16 20:13:44,936 EPOCH 7 done: loss 0.0182 - lr: 0.000017
2023-10-16 20:13:46,403 DEV : loss 0.1606188863515854 - f1-score (micro avg) 0.8022
2023-10-16 20:13:46,409 ----------------------------------------------------------------------------------------------------
2023-10-16 20:13:47,916 epoch 8 - iter 27/272 - loss 0.01279749 - time (sec): 1.51 - samples/sec: 2910.79 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:13:49,411 epoch 8 - iter 54/272 - loss 0.01405124 - time (sec): 3.00 - samples/sec: 3058.02 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:13:50,897 epoch 8 - iter 81/272 - loss 0.01180266 - time (sec): 4.49 - samples/sec: 3078.08 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:13:52,510 epoch 8 - iter 108/272 - loss 0.01124551 - time (sec): 6.10 - samples/sec: 3145.57 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:13:54,170 epoch 8 - iter 135/272 - loss 0.01015956 - time (sec): 7.76 - samples/sec: 3180.26 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:13:55,748 epoch 8 - iter 162/272 - loss 0.00918604 - time (sec): 9.34 - samples/sec: 3204.25 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:13:57,409 epoch 8 - iter 189/272 - loss 0.00923296 - time (sec): 11.00 - samples/sec: 3250.85 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:13:59,023 epoch 8 - iter 216/272 - loss 0.01026326 - time (sec): 12.61 - samples/sec: 3222.45 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:14:00,747 epoch 8 - iter 243/272 - loss 0.01130313 - time (sec): 14.34 - samples/sec: 3254.11 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:14:02,335 epoch 8 - iter 270/272 - loss 0.01201146 - time (sec): 15.92 - samples/sec: 3246.25 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:14:02,434 ----------------------------------------------------------------------------------------------------
2023-10-16 20:14:02,434 EPOCH 8 done: loss 0.0119 - lr: 0.000011
2023-10-16 20:14:03,922 DEV : loss 0.1735740751028061 - f1-score (micro avg) 0.8111
2023-10-16 20:14:03,928 ----------------------------------------------------------------------------------------------------
2023-10-16 20:14:05,664 epoch 9 - iter 27/272 - loss 0.01240591 - time (sec): 1.73 - samples/sec: 3381.09 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:14:07,209 epoch 9 - iter 54/272 - loss 0.00910613 - time (sec): 3.28 - samples/sec: 3363.52 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:14:08,673 epoch 9 - iter 81/272 - loss 0.01017944 - time (sec): 4.74 - samples/sec: 3238.20 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:14:10,290 epoch 9 - iter 108/272 - loss 0.00815891 - time (sec): 6.36 - samples/sec: 3269.78 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:14:11,895 epoch 9 - iter 135/272 - loss 0.00791519 - time (sec): 7.97 - samples/sec: 3293.98 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:14:13,471 epoch 9 - iter 162/272 - loss 0.00875581 - time (sec): 9.54 - samples/sec: 3304.22 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:14:15,146 epoch 9 - iter 189/272 - loss 0.00801884 - time (sec): 11.22 - samples/sec: 3260.87 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:14:16,833 epoch 9 - iter 216/272 - loss 0.00860070 - time (sec): 12.90 - samples/sec: 3289.81 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:14:18,379 epoch 9 - iter 243/272 - loss 0.00817350 - time (sec): 14.45 - samples/sec: 3288.83 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:14:19,871 epoch 9 - iter 270/272 - loss 0.00901566 - time (sec): 15.94 - samples/sec: 3254.91 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:14:19,958 ----------------------------------------------------------------------------------------------------
2023-10-16 20:14:19,958 EPOCH 9 done: loss 0.0091 - lr: 0.000006
2023-10-16 20:14:21,469 DEV : loss 0.16326886415481567 - f1-score (micro avg) 0.8095
2023-10-16 20:14:21,475 ----------------------------------------------------------------------------------------------------
2023-10-16 20:14:23,346 epoch 10 - iter 27/272 - loss 0.00105081 - time (sec): 1.87 - samples/sec: 2931.62 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:14:24,727 epoch 10 - iter 54/272 - loss 0.00079658 - time (sec): 3.25 - samples/sec: 2951.64 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:14:26,210 epoch 10 - iter 81/272 - loss 0.00416132 - time (sec): 4.73 - samples/sec: 2993.52 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:14:27,787 epoch 10 - iter 108/272 - loss 0.00376567 - time (sec): 6.31 - samples/sec: 3094.50 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:14:29,309 epoch 10 - iter 135/272 - loss 0.00421615 - time (sec): 7.83 - samples/sec: 3150.25 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:14:30,754 epoch 10 - iter 162/272 - loss 0.00373493 - time (sec): 9.28 - samples/sec: 3136.20 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:14:32,440 epoch 10 - iter 189/272 - loss 0.00621518 - time (sec): 10.96 - samples/sec: 3185.75 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:14:34,013 epoch 10 - iter 216/272 - loss 0.00638498 - time (sec): 12.54 - samples/sec: 3209.92 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:14:35,653 epoch 10 - iter 243/272 - loss 0.00582988 - time (sec): 14.18 - samples/sec: 3226.26 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:14:37,311 epoch 10 - iter 270/272 - loss 0.00598226 - time (sec): 15.84 - samples/sec: 3263.04 - lr: 0.000000 - momentum: 0.000000
2023-10-16 20:14:37,397 ----------------------------------------------------------------------------------------------------
2023-10-16 20:14:37,397 EPOCH 10 done: loss 0.0059 - lr: 0.000000
2023-10-16 20:14:38,897 DEV : loss 0.17024299502372742 - f1-score (micro avg) 0.792
2023-10-16 20:14:39,273 ----------------------------------------------------------------------------------------------------
2023-10-16 20:14:39,274 Loading model from best epoch ...
2023-10-16 20:14:40,853 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-16 20:14:42,964
Results:
- F-score (micro) 0.7734
- F-score (macro) 0.7221
- Accuracy 0.647
By class:
              precision    recall  f1-score   support
         LOC     0.8306    0.8173    0.8239       312
         PER     0.7296    0.8173    0.7710       208
         ORG     0.4815    0.4727    0.4771        55
   HumanProd     0.7407    0.9091    0.8163        22
   micro avg     0.7585    0.7889    0.7734       597
   macro avg     0.6956    0.7541    0.7221       597
weighted avg     0.7600    0.7889    0.7732       597
2023-10-16 20:14:42,964 ----------------------------------------------------------------------------------------------------