2023-10-25 21:02:46,321 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,322 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(64001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-25 21:02:46,322 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,322 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:02:46,322 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,322 Train: 1166 sentences
2023-10-25 21:02:46,322 (train_with_dev=False, train_with_test=False)
2023-10-25 21:02:46,322 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,322 Training Params:
2023-10-25 21:02:46,322 - learning_rate: "5e-05"
2023-10-25 21:02:46,322 - mini_batch_size: "4"
2023-10-25 21:02:46,322 - max_epochs: "10"
2023-10-25 21:02:46,322 - shuffle: "True"
2023-10-25 21:02:46,322 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Plugins:
2023-10-25 21:02:46,323 - TensorboardLogger
2023-10-25 21:02:46,323 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
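The LinearScheduler plugin with warmup_fraction 0.1 ramps the learning rate from 0 up to the peak 5e-05 over the first 10% of steps, then decays it linearly back to 0, which is exactly the lr trajectory visible in the per-iteration lines below. A minimal sketch of that schedule (a hypothetical helper, not Flair's actual implementation):

```python
def linear_schedule(step, total_steps, peak_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay back to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # decay phase: linearly from peak_lr down to 0 at the final step
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# this run: 292 batches/epoch * 10 epochs = 2920 steps, so 292 warmup steps
total = 292 * 10
print(linear_schedule(29, total))   # early warmup, matches "lr: 0.000005" at epoch 1 iter 29
print(linear_schedule(292, total))  # peak 5e-05 at the end of warmup
print(linear_schedule(total, total))  # decayed to 0 at the last step
```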
2023-10-25 21:02:46,323 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:02:46,323 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Computation:
2023-10-25 21:02:46,323 - compute on device: cuda:0
2023-10-25 21:02:46,323 - embedding storage: none
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:02:47,568 epoch 1 - iter 29/292 - loss 2.70493946 - time (sec): 1.24 - samples/sec: 3001.90 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:02:48,960 epoch 1 - iter 58/292 - loss 1.64294689 - time (sec): 2.64 - samples/sec: 3433.28 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:02:50,359 epoch 1 - iter 87/292 - loss 1.24359019 - time (sec): 4.03 - samples/sec: 3589.43 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:02:51,628 epoch 1 - iter 116/292 - loss 1.05004462 - time (sec): 5.30 - samples/sec: 3552.13 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:02:52,874 epoch 1 - iter 145/292 - loss 0.91786982 - time (sec): 6.55 - samples/sec: 3520.91 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:02:54,229 epoch 1 - iter 174/292 - loss 0.80347598 - time (sec): 7.91 - samples/sec: 3546.58 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:02:55,489 epoch 1 - iter 203/292 - loss 0.74188701 - time (sec): 9.17 - samples/sec: 3431.97 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:02:56,762 epoch 1 - iter 232/292 - loss 0.68879598 - time (sec): 10.44 - samples/sec: 3393.36 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:02:57,981 epoch 1 - iter 261/292 - loss 0.65138888 - time (sec): 11.66 - samples/sec: 3344.55 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:02:59,311 epoch 1 - iter 290/292 - loss 0.60276932 - time (sec): 12.99 - samples/sec: 3401.64 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:02:59,388 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:59,388 EPOCH 1 done: loss 0.5995 - lr: 0.000049
2023-10-25 21:03:00,042 DEV : loss 0.15543945133686066 - f1-score (micro avg) 0.5759
2023-10-25 21:03:00,046 saving best model
2023-10-25 21:03:00,575 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:01,799 epoch 2 - iter 29/292 - loss 0.16952144 - time (sec): 1.22 - samples/sec: 3371.46 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:03:03,040 epoch 2 - iter 58/292 - loss 0.18657242 - time (sec): 2.46 - samples/sec: 3294.57 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:03:04,352 epoch 2 - iter 87/292 - loss 0.16106239 - time (sec): 3.78 - samples/sec: 3300.13 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:03:05,608 epoch 2 - iter 116/292 - loss 0.15644988 - time (sec): 5.03 - samples/sec: 3352.82 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:03:06,836 epoch 2 - iter 145/292 - loss 0.15373069 - time (sec): 6.26 - samples/sec: 3411.63 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:03:08,014 epoch 2 - iter 174/292 - loss 0.16590765 - time (sec): 7.44 - samples/sec: 3373.45 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:03:09,249 epoch 2 - iter 203/292 - loss 0.17045348 - time (sec): 8.67 - samples/sec: 3377.62 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:03:10,551 epoch 2 - iter 232/292 - loss 0.17007656 - time (sec): 9.97 - samples/sec: 3343.68 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:03:11,963 epoch 2 - iter 261/292 - loss 0.16415110 - time (sec): 11.39 - samples/sec: 3441.55 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:03:13,297 epoch 2 - iter 290/292 - loss 0.15537644 - time (sec): 12.72 - samples/sec: 3482.88 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:03:13,376 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:13,376 EPOCH 2 done: loss 0.1553 - lr: 0.000045
2023-10-25 21:03:14,282 DEV : loss 0.14280402660369873 - f1-score (micro avg) 0.7113
2023-10-25 21:03:14,287 saving best model
2023-10-25 21:03:14,953 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:16,338 epoch 3 - iter 29/292 - loss 0.12799538 - time (sec): 1.38 - samples/sec: 3483.45 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:03:17,670 epoch 3 - iter 58/292 - loss 0.09862348 - time (sec): 2.71 - samples/sec: 3517.39 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:03:18,972 epoch 3 - iter 87/292 - loss 0.08769634 - time (sec): 4.02 - samples/sec: 3425.70 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:03:20,258 epoch 3 - iter 116/292 - loss 0.08875247 - time (sec): 5.30 - samples/sec: 3388.13 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:03:21,512 epoch 3 - iter 145/292 - loss 0.08886591 - time (sec): 6.56 - samples/sec: 3325.24 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:03:22,754 epoch 3 - iter 174/292 - loss 0.09139678 - time (sec): 7.80 - samples/sec: 3283.17 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:03:24,065 epoch 3 - iter 203/292 - loss 0.08933904 - time (sec): 9.11 - samples/sec: 3332.68 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:03:25,360 epoch 3 - iter 232/292 - loss 0.08802096 - time (sec): 10.40 - samples/sec: 3352.74 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:03:26,654 epoch 3 - iter 261/292 - loss 0.09013801 - time (sec): 11.70 - samples/sec: 3350.70 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:03:28,018 epoch 3 - iter 290/292 - loss 0.08726593 - time (sec): 13.06 - samples/sec: 3372.98 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:03:28,107 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:28,108 EPOCH 3 done: loss 0.0879 - lr: 0.000039
2023-10-25 21:03:29,012 DEV : loss 0.16181543469429016 - f1-score (micro avg) 0.7412
2023-10-25 21:03:29,017 saving best model
2023-10-25 21:03:29,682 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:31,011 epoch 4 - iter 29/292 - loss 0.08251774 - time (sec): 1.33 - samples/sec: 3144.69 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:03:32,296 epoch 4 - iter 58/292 - loss 0.06039563 - time (sec): 2.61 - samples/sec: 3480.66 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:03:33,577 epoch 4 - iter 87/292 - loss 0.05907690 - time (sec): 3.89 - samples/sec: 3336.97 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:03:34,854 epoch 4 - iter 116/292 - loss 0.05470702 - time (sec): 5.17 - samples/sec: 3343.45 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:03:36,253 epoch 4 - iter 145/292 - loss 0.05698655 - time (sec): 6.57 - samples/sec: 3300.17 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:03:37,613 epoch 4 - iter 174/292 - loss 0.05507272 - time (sec): 7.93 - samples/sec: 3315.00 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:03:38,875 epoch 4 - iter 203/292 - loss 0.06136269 - time (sec): 9.19 - samples/sec: 3312.60 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:03:40,241 epoch 4 - iter 232/292 - loss 0.06014642 - time (sec): 10.56 - samples/sec: 3353.87 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:03:41,520 epoch 4 - iter 261/292 - loss 0.05735605 - time (sec): 11.84 - samples/sec: 3372.49 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:03:42,826 epoch 4 - iter 290/292 - loss 0.05824407 - time (sec): 13.14 - samples/sec: 3365.23 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:03:42,909 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:42,909 EPOCH 4 done: loss 0.0580 - lr: 0.000033
2023-10-25 21:03:43,968 DEV : loss 0.1569732278585434 - f1-score (micro avg) 0.7606
2023-10-25 21:03:43,972 saving best model
2023-10-25 21:03:44,658 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:45,907 epoch 5 - iter 29/292 - loss 0.02448988 - time (sec): 1.25 - samples/sec: 2965.12 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:03:47,242 epoch 5 - iter 58/292 - loss 0.03323303 - time (sec): 2.58 - samples/sec: 3230.68 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:03:48,588 epoch 5 - iter 87/292 - loss 0.03405391 - time (sec): 3.93 - samples/sec: 3407.07 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:03:49,819 epoch 5 - iter 116/292 - loss 0.03187030 - time (sec): 5.16 - samples/sec: 3408.51 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:03:51,096 epoch 5 - iter 145/292 - loss 0.03696710 - time (sec): 6.43 - samples/sec: 3345.01 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:03:52,458 epoch 5 - iter 174/292 - loss 0.03833785 - time (sec): 7.80 - samples/sec: 3310.13 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:03:53,820 epoch 5 - iter 203/292 - loss 0.04040647 - time (sec): 9.16 - samples/sec: 3370.90 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:03:55,148 epoch 5 - iter 232/292 - loss 0.03785862 - time (sec): 10.49 - samples/sec: 3380.50 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:03:56,505 epoch 5 - iter 261/292 - loss 0.03665954 - time (sec): 11.84 - samples/sec: 3356.28 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:03:57,830 epoch 5 - iter 290/292 - loss 0.03585809 - time (sec): 13.17 - samples/sec: 3363.05 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:03:57,905 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:57,905 EPOCH 5 done: loss 0.0357 - lr: 0.000028
2023-10-25 21:03:58,815 DEV : loss 0.14395155012607574 - f1-score (micro avg) 0.7265
2023-10-25 21:03:58,819 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:00,208 epoch 6 - iter 29/292 - loss 0.02165138 - time (sec): 1.39 - samples/sec: 3839.74 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:04:01,519 epoch 6 - iter 58/292 - loss 0.01941242 - time (sec): 2.70 - samples/sec: 3485.44 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:04:02,781 epoch 6 - iter 87/292 - loss 0.02557862 - time (sec): 3.96 - samples/sec: 3472.87 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:04:04,147 epoch 6 - iter 116/292 - loss 0.02482614 - time (sec): 5.33 - samples/sec: 3550.68 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:04:05,427 epoch 6 - iter 145/292 - loss 0.02631017 - time (sec): 6.61 - samples/sec: 3457.32 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:04:06,786 epoch 6 - iter 174/292 - loss 0.02314945 - time (sec): 7.97 - samples/sec: 3450.43 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:04:08,068 epoch 6 - iter 203/292 - loss 0.02313133 - time (sec): 9.25 - samples/sec: 3440.85 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:04:09,331 epoch 6 - iter 232/292 - loss 0.02352516 - time (sec): 10.51 - samples/sec: 3447.74 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:04:10,604 epoch 6 - iter 261/292 - loss 0.02453468 - time (sec): 11.78 - samples/sec: 3461.04 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:04:11,847 epoch 6 - iter 290/292 - loss 0.02499737 - time (sec): 13.03 - samples/sec: 3387.70 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:04:11,936 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:11,936 EPOCH 6 done: loss 0.0251 - lr: 0.000022
2023-10-25 21:04:12,842 DEV : loss 0.17060092091560364 - f1-score (micro avg) 0.7447
2023-10-25 21:04:12,846 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:14,184 epoch 7 - iter 29/292 - loss 0.03038679 - time (sec): 1.34 - samples/sec: 3200.51 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:04:15,504 epoch 7 - iter 58/292 - loss 0.02199783 - time (sec): 2.66 - samples/sec: 3269.59 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:04:16,943 epoch 7 - iter 87/292 - loss 0.02107799 - time (sec): 4.10 - samples/sec: 3377.74 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:04:18,274 epoch 7 - iter 116/292 - loss 0.01988430 - time (sec): 5.43 - samples/sec: 3358.94 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:04:19,515 epoch 7 - iter 145/292 - loss 0.02054682 - time (sec): 6.67 - samples/sec: 3387.20 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:04:20,827 epoch 7 - iter 174/292 - loss 0.01902887 - time (sec): 7.98 - samples/sec: 3402.18 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:04:22,089 epoch 7 - iter 203/292 - loss 0.01967890 - time (sec): 9.24 - samples/sec: 3381.86 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:04:23,370 epoch 7 - iter 232/292 - loss 0.02047310 - time (sec): 10.52 - samples/sec: 3353.31 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:04:24,632 epoch 7 - iter 261/292 - loss 0.02022304 - time (sec): 11.78 - samples/sec: 3367.80 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:04:25,851 epoch 7 - iter 290/292 - loss 0.01979598 - time (sec): 13.00 - samples/sec: 3390.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:04:25,946 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:25,946 EPOCH 7 done: loss 0.0196 - lr: 0.000017
2023-10-25 21:04:26,861 DEV : loss 0.1871466189622879 - f1-score (micro avg) 0.7439
2023-10-25 21:04:26,866 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:28,143 epoch 8 - iter 29/292 - loss 0.00718740 - time (sec): 1.28 - samples/sec: 3597.72 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:04:29,467 epoch 8 - iter 58/292 - loss 0.01503456 - time (sec): 2.60 - samples/sec: 3365.84 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:04:30,790 epoch 8 - iter 87/292 - loss 0.01557495 - time (sec): 3.92 - samples/sec: 3190.64 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:04:32,093 epoch 8 - iter 116/292 - loss 0.01500266 - time (sec): 5.23 - samples/sec: 3234.34 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:04:33,336 epoch 8 - iter 145/292 - loss 0.01724417 - time (sec): 6.47 - samples/sec: 3304.72 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:04:34,647 epoch 8 - iter 174/292 - loss 0.01760521 - time (sec): 7.78 - samples/sec: 3357.34 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:04:35,925 epoch 8 - iter 203/292 - loss 0.01743967 - time (sec): 9.06 - samples/sec: 3381.28 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:04:37,315 epoch 8 - iter 232/292 - loss 0.01549544 - time (sec): 10.45 - samples/sec: 3403.73 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:04:38,612 epoch 8 - iter 261/292 - loss 0.01484755 - time (sec): 11.74 - samples/sec: 3405.07 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:04:39,938 epoch 8 - iter 290/292 - loss 0.01565987 - time (sec): 13.07 - samples/sec: 3394.16 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:04:40,013 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:40,013 EPOCH 8 done: loss 0.0159 - lr: 0.000011
2023-10-25 21:04:41,089 DEV : loss 0.2016552984714508 - f1-score (micro avg) 0.7642
2023-10-25 21:04:41,094 saving best model
2023-10-25 21:04:41,658 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:42,938 epoch 9 - iter 29/292 - loss 0.00883147 - time (sec): 1.28 - samples/sec: 3170.38 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:04:44,271 epoch 9 - iter 58/292 - loss 0.01566657 - time (sec): 2.61 - samples/sec: 3276.96 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:04:45,628 epoch 9 - iter 87/292 - loss 0.01114646 - time (sec): 3.97 - samples/sec: 3248.19 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:04:46,926 epoch 9 - iter 116/292 - loss 0.00886227 - time (sec): 5.27 - samples/sec: 3159.36 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:04:48,286 epoch 9 - iter 145/292 - loss 0.00720736 - time (sec): 6.63 - samples/sec: 3205.00 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:04:49,636 epoch 9 - iter 174/292 - loss 0.00633132 - time (sec): 7.98 - samples/sec: 3302.03 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:04:50,954 epoch 9 - iter 203/292 - loss 0.00655342 - time (sec): 9.29 - samples/sec: 3292.40 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:04:52,300 epoch 9 - iter 232/292 - loss 0.00853524 - time (sec): 10.64 - samples/sec: 3304.99 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:04:53,593 epoch 9 - iter 261/292 - loss 0.00803946 - time (sec): 11.93 - samples/sec: 3297.35 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:04:54,911 epoch 9 - iter 290/292 - loss 0.00829946 - time (sec): 13.25 - samples/sec: 3327.92 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:04:54,992 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:54,992 EPOCH 9 done: loss 0.0082 - lr: 0.000006
2023-10-25 21:04:55,903 DEV : loss 0.2061367630958557 - f1-score (micro avg) 0.7543
2023-10-25 21:04:55,908 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:57,187 epoch 10 - iter 29/292 - loss 0.00318513 - time (sec): 1.28 - samples/sec: 3035.16 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:04:58,468 epoch 10 - iter 58/292 - loss 0.00296072 - time (sec): 2.56 - samples/sec: 3051.77 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:04:59,815 epoch 10 - iter 87/292 - loss 0.00649624 - time (sec): 3.91 - samples/sec: 3244.54 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:05:01,138 epoch 10 - iter 116/292 - loss 0.00570956 - time (sec): 5.23 - samples/sec: 3328.03 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:05:02,456 epoch 10 - iter 145/292 - loss 0.00505520 - time (sec): 6.55 - samples/sec: 3425.62 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:05:03,752 epoch 10 - iter 174/292 - loss 0.00507932 - time (sec): 7.84 - samples/sec: 3488.66 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:05:05,062 epoch 10 - iter 203/292 - loss 0.00532013 - time (sec): 9.15 - samples/sec: 3509.91 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:05:06,295 epoch 10 - iter 232/292 - loss 0.00556513 - time (sec): 10.39 - samples/sec: 3461.67 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:05:07,561 epoch 10 - iter 261/292 - loss 0.00596959 - time (sec): 11.65 - samples/sec: 3425.39 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:05:08,803 epoch 10 - iter 290/292 - loss 0.00553911 - time (sec): 12.89 - samples/sec: 3424.82 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:05:08,887 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:08,887 EPOCH 10 done: loss 0.0058 - lr: 0.000000
2023-10-25 21:05:09,796 DEV : loss 0.20963872969150543 - f1-score (micro avg) 0.7366
2023-10-25 21:05:10,330 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:10,332 Loading model from best epoch ...
2023-10-25 21:05:12,043 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
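The 17-tag dictionary above is a BIOES scheme over four entity types (S = single-token, B = begin, I = inside, E = end, plus O for outside). A minimal sketch of how such a tag sequence decodes into entity spans (a hypothetical helper, not Flair's own decoder):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":                          # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":                        # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((label, start, i + 1))
            start = None
        elif prefix != "I":                        # "O" or malformed: reset
            start = None
    return spans

tags = ["S-LOC", "O", "B-PER", "I-PER", "E-PER", "O"]
print(bioes_to_spans(tags))  # [('LOC', 0, 1), ('PER', 2, 5)]
```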
2023-10-25 21:05:13,585
Results:
- F-score (micro) 0.7502
- F-score (macro) 0.6592
- Accuracy 0.6229
By class:
              precision    recall  f1-score   support

         PER     0.7754    0.8333    0.8033       348
         LOC     0.6889    0.8314    0.7535       261
         ORG     0.3962    0.4038    0.4000        52
   HumanProd     0.6071    0.7727    0.6800        22

   micro avg     0.7078    0.7980    0.7502       683
   macro avg     0.6169    0.7103    0.6592       683
weighted avg     0.7081    0.7980    0.7496       683
2023-10-25 21:05:13,585 ----------------------------------------------------------------------------------------------------
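The micro-averaged row in the table above can be cross-checked from the per-class rows: each class's true positives and predicted positives are recoverable from its precision, recall, and support, and pooling those counts reproduces the micro scores. A minimal sketch (rounding noise from the four-digit report is absorbed by rounding the counts to integers):

```python
# (precision, recall, support) per class, copied from the table above
per_class = {
    "PER":       (0.7754, 0.8333, 348),
    "LOC":       (0.6889, 0.8314, 261),
    "ORG":       (0.3962, 0.4038, 52),
    "HumanProd": (0.6071, 0.7727, 22),
}

tp = pred = gold = 0
for p, r, support in per_class.values():
    c_tp = round(r * support)   # true positives = recall * support
    tp += c_tp
    pred += round(c_tp / p)     # predicted positives = TP / precision
    gold += support

micro_p = tp / pred
micro_r = tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_p, 4), round(micro_r, 4), round(micro_f1, 4))
# 0.7078 0.798 0.7502 -- matches the "micro avg" row
```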