2023-10-25 21:30:08,203 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,204 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:30:08,204 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,204 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:30:08,204 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,204 Train: 1166 sentences
2023-10-25 21:30:08,204 (train_with_dev=False, train_with_test=False)
2023-10-25 21:30:08,204 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,204 Training Params:
2023-10-25 21:30:08,204 - learning_rate: "3e-05"
2023-10-25 21:30:08,204 - mini_batch_size: "4"
2023-10-25 21:30:08,204 - max_epochs: "10"
2023-10-25 21:30:08,204 - shuffle: "True"
2023-10-25 21:30:08,204 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,204 Plugins:
2023-10-25 21:30:08,205 - TensorboardLogger
2023-10-25 21:30:08,205 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:30:08,205 ----------------------------------------------------------------------------------------------------
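The LinearScheduler plugin with warmup_fraction '0.1' explains the lr column in the iteration lines below: over 292 mini-batches x 10 epochs = 2920 steps, the learning rate ramps linearly from 0 up to 3e-05 during the first 292 steps (exactly epoch 1), then decays linearly back to 0 by the final step. A minimal sketch of that schedule (a hypothetical standalone helper, not the plugin's actual code):

```python
def linear_schedule_lr(step, total_steps, base_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup followed by linear decay, matching the lr values in the log."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup phase: ramp from 0 to base_lr
        return base_lr * step / warmup_steps
    # decay phase: ramp from base_lr back down to 0 at the last step
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 292 * 10  # 292 mini-batches per epoch, 10 epochs

# Step 29 (epoch 1, iter 29/292): lr ~ 3e-06, as logged.
# Step 292 (end of epoch 1) is the end of warmup: lr == 3e-05, as logged.
```

This matches the log's lr column: 0.000003 at iter 29/292 of epoch 1, peaking at 0.000030 when epoch 1 completes, and reaching 0.000000 in the last iterations of epoch 10.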
2023-10-25 21:30:08,205 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:30:08,205 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:30:08,205 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,205 Computation:
2023-10-25 21:30:08,205 - compute on device: cuda:0
2023-10-25 21:30:08,205 - embedding storage: none
2023-10-25 21:30:08,205 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,205 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 21:30:08,205 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,205 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:08,205 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:30:09,606 epoch 1 - iter 29/292 - loss 2.59371458 - time (sec): 1.40 - samples/sec: 2896.33 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:30:10,850 epoch 1 - iter 58/292 - loss 1.95964027 - time (sec): 2.64 - samples/sec: 2959.98 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:30:12,193 epoch 1 - iter 87/292 - loss 1.55409749 - time (sec): 3.99 - samples/sec: 3138.14 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:30:13,533 epoch 1 - iter 116/292 - loss 1.29448959 - time (sec): 5.33 - samples/sec: 3211.67 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:30:14,805 epoch 1 - iter 145/292 - loss 1.10125147 - time (sec): 6.60 - samples/sec: 3291.64 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:30:16,127 epoch 1 - iter 174/292 - loss 0.98629607 - time (sec): 7.92 - samples/sec: 3235.22 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:30:17,493 epoch 1 - iter 203/292 - loss 0.86851694 - time (sec): 9.29 - samples/sec: 3310.48 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:30:18,859 epoch 1 - iter 232/292 - loss 0.77881535 - time (sec): 10.65 - samples/sec: 3357.49 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:30:20,169 epoch 1 - iter 261/292 - loss 0.72190147 - time (sec): 11.96 - samples/sec: 3368.38 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:30:21,437 epoch 1 - iter 290/292 - loss 0.68367522 - time (sec): 13.23 - samples/sec: 3341.39 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:30:21,516 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:21,516 EPOCH 1 done: loss 0.6826 - lr: 0.000030
2023-10-25 21:30:22,184 DEV : loss 0.1497330218553543 - f1-score (micro avg) 0.5972
2023-10-25 21:30:22,188 saving best model
2023-10-25 21:30:22,530 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:23,819 epoch 2 - iter 29/292 - loss 0.17252226 - time (sec): 1.29 - samples/sec: 3461.31 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:30:25,115 epoch 2 - iter 58/292 - loss 0.19611204 - time (sec): 2.58 - samples/sec: 3403.61 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:30:26,423 epoch 2 - iter 87/292 - loss 0.18751975 - time (sec): 3.89 - samples/sec: 3340.71 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:30:27,769 epoch 2 - iter 116/292 - loss 0.17497620 - time (sec): 5.24 - samples/sec: 3374.72 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:30:29,058 epoch 2 - iter 145/292 - loss 0.16837296 - time (sec): 6.53 - samples/sec: 3335.07 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:30:30,367 epoch 2 - iter 174/292 - loss 0.16678211 - time (sec): 7.84 - samples/sec: 3347.37 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:30:31,620 epoch 2 - iter 203/292 - loss 0.16469283 - time (sec): 9.09 - samples/sec: 3342.18 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:30:32,877 epoch 2 - iter 232/292 - loss 0.16617154 - time (sec): 10.35 - samples/sec: 3360.39 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:30:34,211 epoch 2 - iter 261/292 - loss 0.16625252 - time (sec): 11.68 - samples/sec: 3367.96 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:30:35,519 epoch 2 - iter 290/292 - loss 0.16032533 - time (sec): 12.99 - samples/sec: 3391.30 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:30:35,604 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:35,604 EPOCH 2 done: loss 0.1595 - lr: 0.000027
2023-10-25 21:30:36,510 DEV : loss 0.12749069929122925 - f1-score (micro avg) 0.6216
2023-10-25 21:30:36,514 saving best model
2023-10-25 21:30:37,133 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:38,576 epoch 3 - iter 29/292 - loss 0.09022188 - time (sec): 1.44 - samples/sec: 4060.31 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:30:39,893 epoch 3 - iter 58/292 - loss 0.10087929 - time (sec): 2.76 - samples/sec: 3781.78 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:30:41,160 epoch 3 - iter 87/292 - loss 0.10208490 - time (sec): 4.03 - samples/sec: 3635.44 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:30:42,449 epoch 3 - iter 116/292 - loss 0.10105352 - time (sec): 5.31 - samples/sec: 3517.62 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:30:43,751 epoch 3 - iter 145/292 - loss 0.10094429 - time (sec): 6.62 - samples/sec: 3495.48 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:30:45,016 epoch 3 - iter 174/292 - loss 0.09645934 - time (sec): 7.88 - samples/sec: 3441.59 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:30:46,488 epoch 3 - iter 203/292 - loss 0.09398874 - time (sec): 9.35 - samples/sec: 3382.73 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:30:47,737 epoch 3 - iter 232/292 - loss 0.09314091 - time (sec): 10.60 - samples/sec: 3308.62 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:30:49,070 epoch 3 - iter 261/292 - loss 0.09176423 - time (sec): 11.93 - samples/sec: 3358.99 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:30:50,363 epoch 3 - iter 290/292 - loss 0.09054539 - time (sec): 13.23 - samples/sec: 3342.94 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:30:50,450 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:50,450 EPOCH 3 done: loss 0.0913 - lr: 0.000023
2023-10-25 21:30:51,361 DEV : loss 0.11841346323490143 - f1-score (micro avg) 0.7118
2023-10-25 21:30:51,365 saving best model
2023-10-25 21:30:52,005 ----------------------------------------------------------------------------------------------------
2023-10-25 21:30:53,266 epoch 4 - iter 29/292 - loss 0.06571735 - time (sec): 1.26 - samples/sec: 3370.24 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:30:54,598 epoch 4 - iter 58/292 - loss 0.05780459 - time (sec): 2.59 - samples/sec: 3250.20 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:30:55,965 epoch 4 - iter 87/292 - loss 0.05216866 - time (sec): 3.96 - samples/sec: 3297.12 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:30:57,257 epoch 4 - iter 116/292 - loss 0.04928450 - time (sec): 5.25 - samples/sec: 3238.11 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:30:58,611 epoch 4 - iter 145/292 - loss 0.05174887 - time (sec): 6.60 - samples/sec: 3421.02 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:30:59,914 epoch 4 - iter 174/292 - loss 0.05349774 - time (sec): 7.91 - samples/sec: 3465.27 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:31:01,321 epoch 4 - iter 203/292 - loss 0.05412245 - time (sec): 9.31 - samples/sec: 3411.38 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:31:02,580 epoch 4 - iter 232/292 - loss 0.05878785 - time (sec): 10.57 - samples/sec: 3383.60 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:31:03,967 epoch 4 - iter 261/292 - loss 0.06009869 - time (sec): 11.96 - samples/sec: 3381.49 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:31:05,232 epoch 4 - iter 290/292 - loss 0.05847457 - time (sec): 13.22 - samples/sec: 3342.38 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:31:05,312 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:05,312 EPOCH 4 done: loss 0.0583 - lr: 0.000020
2023-10-25 21:31:06,226 DEV : loss 0.13698235154151917 - f1-score (micro avg) 0.7329
2023-10-25 21:31:06,231 saving best model
2023-10-25 21:31:06,849 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:08,165 epoch 5 - iter 29/292 - loss 0.03972724 - time (sec): 1.31 - samples/sec: 3464.93 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:31:09,448 epoch 5 - iter 58/292 - loss 0.03798626 - time (sec): 2.60 - samples/sec: 3365.39 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:31:10,783 epoch 5 - iter 87/292 - loss 0.03555398 - time (sec): 3.93 - samples/sec: 3314.51 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:31:12,032 epoch 5 - iter 116/292 - loss 0.03214301 - time (sec): 5.18 - samples/sec: 3360.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:31:13,311 epoch 5 - iter 145/292 - loss 0.03292483 - time (sec): 6.46 - samples/sec: 3385.15 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:31:14,600 epoch 5 - iter 174/292 - loss 0.03681121 - time (sec): 7.75 - samples/sec: 3329.04 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:31:15,983 epoch 5 - iter 203/292 - loss 0.03818340 - time (sec): 9.13 - samples/sec: 3327.67 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:31:17,283 epoch 5 - iter 232/292 - loss 0.04048615 - time (sec): 10.43 - samples/sec: 3412.34 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:31:18,581 epoch 5 - iter 261/292 - loss 0.04033068 - time (sec): 11.73 - samples/sec: 3414.90 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:31:19,822 epoch 5 - iter 290/292 - loss 0.03959189 - time (sec): 12.97 - samples/sec: 3415.07 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:31:19,897 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:19,897 EPOCH 5 done: loss 0.0395 - lr: 0.000017
2023-10-25 21:31:20,805 DEV : loss 0.14024661481380463 - f1-score (micro avg) 0.7364
2023-10-25 21:31:20,809 saving best model
2023-10-25 21:31:21,435 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:22,752 epoch 6 - iter 29/292 - loss 0.02545293 - time (sec): 1.31 - samples/sec: 3771.70 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:31:24,050 epoch 6 - iter 58/292 - loss 0.03242435 - time (sec): 2.61 - samples/sec: 3433.27 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:31:25,321 epoch 6 - iter 87/292 - loss 0.02375022 - time (sec): 3.88 - samples/sec: 3488.35 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:31:26,643 epoch 6 - iter 116/292 - loss 0.03110057 - time (sec): 5.20 - samples/sec: 3502.60 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:31:27,915 epoch 6 - iter 145/292 - loss 0.03115742 - time (sec): 6.48 - samples/sec: 3503.77 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:31:29,157 epoch 6 - iter 174/292 - loss 0.03002770 - time (sec): 7.72 - samples/sec: 3510.24 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:31:30,359 epoch 6 - iter 203/292 - loss 0.02977686 - time (sec): 8.92 - samples/sec: 3488.79 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:31:31,596 epoch 6 - iter 232/292 - loss 0.02910083 - time (sec): 10.16 - samples/sec: 3471.29 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:31:32,922 epoch 6 - iter 261/292 - loss 0.02814018 - time (sec): 11.48 - samples/sec: 3472.71 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:31:34,133 epoch 6 - iter 290/292 - loss 0.02750245 - time (sec): 12.69 - samples/sec: 3462.82 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:31:34,216 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:34,216 EPOCH 6 done: loss 0.0275 - lr: 0.000013
2023-10-25 21:31:35,136 DEV : loss 0.1627625823020935 - f1-score (micro avg) 0.7527
2023-10-25 21:31:35,140 saving best model
2023-10-25 21:31:35,751 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:36,998 epoch 7 - iter 29/292 - loss 0.02550245 - time (sec): 1.24 - samples/sec: 3912.23 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:31:38,272 epoch 7 - iter 58/292 - loss 0.03222503 - time (sec): 2.52 - samples/sec: 3887.60 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:31:39,460 epoch 7 - iter 87/292 - loss 0.03168046 - time (sec): 3.70 - samples/sec: 3733.03 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:31:40,677 epoch 7 - iter 116/292 - loss 0.02908034 - time (sec): 4.92 - samples/sec: 3612.53 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:31:41,881 epoch 7 - iter 145/292 - loss 0.02514753 - time (sec): 6.13 - samples/sec: 3542.89 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:31:43,238 epoch 7 - iter 174/292 - loss 0.02468695 - time (sec): 7.48 - samples/sec: 3553.75 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:31:44,537 epoch 7 - iter 203/292 - loss 0.02340193 - time (sec): 8.78 - samples/sec: 3551.40 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:31:45,809 epoch 7 - iter 232/292 - loss 0.02228458 - time (sec): 10.05 - samples/sec: 3510.36 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:31:47,167 epoch 7 - iter 261/292 - loss 0.02091578 - time (sec): 11.41 - samples/sec: 3473.34 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:31:48,521 epoch 7 - iter 290/292 - loss 0.02054549 - time (sec): 12.77 - samples/sec: 3470.06 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:31:48,612 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:48,612 EPOCH 7 done: loss 0.0205 - lr: 0.000010
2023-10-25 21:31:49,536 DEV : loss 0.18406300246715546 - f1-score (micro avg) 0.7638
2023-10-25 21:31:49,540 saving best model
2023-10-25 21:31:50,152 ----------------------------------------------------------------------------------------------------
2023-10-25 21:31:51,534 epoch 8 - iter 29/292 - loss 0.01278755 - time (sec): 1.38 - samples/sec: 3174.63 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:31:52,890 epoch 8 - iter 58/292 - loss 0.01502218 - time (sec): 2.73 - samples/sec: 3230.30 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:31:54,205 epoch 8 - iter 87/292 - loss 0.01162352 - time (sec): 4.05 - samples/sec: 3323.36 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:31:55,489 epoch 8 - iter 116/292 - loss 0.01556603 - time (sec): 5.33 - samples/sec: 3302.90 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:31:56,784 epoch 8 - iter 145/292 - loss 0.01579903 - time (sec): 6.63 - samples/sec: 3285.94 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:31:58,212 epoch 8 - iter 174/292 - loss 0.01669043 - time (sec): 8.06 - samples/sec: 3226.30 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:31:59,482 epoch 8 - iter 203/292 - loss 0.01521658 - time (sec): 9.33 - samples/sec: 3184.00 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:32:00,800 epoch 8 - iter 232/292 - loss 0.01518710 - time (sec): 10.64 - samples/sec: 3210.18 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:32:02,092 epoch 8 - iter 261/292 - loss 0.01451772 - time (sec): 11.94 - samples/sec: 3264.25 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:32:03,464 epoch 8 - iter 290/292 - loss 0.01550935 - time (sec): 13.31 - samples/sec: 3323.97 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:32:03,551 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:03,552 EPOCH 8 done: loss 0.0156 - lr: 0.000007
2023-10-25 21:32:04,457 DEV : loss 0.18133316934108734 - f1-score (micro avg) 0.7489
2023-10-25 21:32:04,462 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:05,828 epoch 9 - iter 29/292 - loss 0.00469478 - time (sec): 1.36 - samples/sec: 3658.24 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:32:07,078 epoch 9 - iter 58/292 - loss 0.00939010 - time (sec): 2.61 - samples/sec: 3544.28 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:32:08,353 epoch 9 - iter 87/292 - loss 0.00915593 - time (sec): 3.89 - samples/sec: 3555.78 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:32:09,712 epoch 9 - iter 116/292 - loss 0.01207961 - time (sec): 5.25 - samples/sec: 3529.11 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:32:11,030 epoch 9 - iter 145/292 - loss 0.01222517 - time (sec): 6.57 - samples/sec: 3493.57 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:32:12,307 epoch 9 - iter 174/292 - loss 0.01195187 - time (sec): 7.84 - samples/sec: 3478.22 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:32:13,584 epoch 9 - iter 203/292 - loss 0.01115930 - time (sec): 9.12 - samples/sec: 3479.08 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:32:14,839 epoch 9 - iter 232/292 - loss 0.01057009 - time (sec): 10.38 - samples/sec: 3434.42 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:32:16,163 epoch 9 - iter 261/292 - loss 0.01094290 - time (sec): 11.70 - samples/sec: 3384.80 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:32:17,540 epoch 9 - iter 290/292 - loss 0.01034591 - time (sec): 13.08 - samples/sec: 3376.60 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:32:17,632 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:17,632 EPOCH 9 done: loss 0.0103 - lr: 0.000003
2023-10-25 21:32:18,549 DEV : loss 0.1857309192419052 - f1-score (micro avg) 0.7461
2023-10-25 21:32:18,553 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:19,857 epoch 10 - iter 29/292 - loss 0.00833025 - time (sec): 1.30 - samples/sec: 3353.71 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:32:21,159 epoch 10 - iter 58/292 - loss 0.00620856 - time (sec): 2.60 - samples/sec: 3160.18 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:32:22,448 epoch 10 - iter 87/292 - loss 0.01250990 - time (sec): 3.89 - samples/sec: 3177.17 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:32:23,642 epoch 10 - iter 116/292 - loss 0.01078091 - time (sec): 5.09 - samples/sec: 3278.69 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:32:24,940 epoch 10 - iter 145/292 - loss 0.00880841 - time (sec): 6.39 - samples/sec: 3371.71 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:32:26,129 epoch 10 - iter 174/292 - loss 0.00884157 - time (sec): 7.57 - samples/sec: 3418.33 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:32:27,383 epoch 10 - iter 203/292 - loss 0.00866918 - time (sec): 8.83 - samples/sec: 3498.88 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:32:28,613 epoch 10 - iter 232/292 - loss 0.00812226 - time (sec): 10.06 - samples/sec: 3493.63 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:32:29,859 epoch 10 - iter 261/292 - loss 0.00777060 - time (sec): 11.30 - samples/sec: 3510.22 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:32:31,235 epoch 10 - iter 290/292 - loss 0.00767455 - time (sec): 12.68 - samples/sec: 3490.79 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:32:31,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:31,323 EPOCH 10 done: loss 0.0084 - lr: 0.000000
2023-10-25 21:32:32,251 DEV : loss 0.1949864774942398 - f1-score (micro avg) 0.7571
2023-10-25 21:32:32,724 ----------------------------------------------------------------------------------------------------
2023-10-25 21:32:32,725 Loading model from best epoch ...
2023-10-25 21:32:34,339 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
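The 17 tags above are a BIOES encoding of four entity types (LOC, PER, ORG, HumanProd): one O tag plus S(ingle)/B(egin)/E(nd)/I(nside) variants per type (1 + 4 x 4 = 17). A small decoder sketch (a hypothetical helper, not Flair's implementation) showing how such a tag sequence maps back to entity spans:

```python
def decode_bioes(tags):
    """Turn a BIOES tag sequence into (start, end, type) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, etype = tag.split("-", 1)
        if prefix == "S":                          # single-token entity
            spans.append((i, i + 1, etype))
        elif prefix == "B":                        # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((start, i + 1, etype))
            start = None
    return spans

# e.g. ["B-LOC", "E-LOC", "O", "S-PER"] -> [(0, 2, "LOC"), (3, 4, "PER")]
```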
2023-10-25 21:32:35,859
Results:
- F-score (micro) 0.7648
- F-score (macro) 0.6948
- Accuracy 0.6424
By class:
              precision    recall  f1-score   support

         PER     0.8187    0.8305    0.8245       348
         LOC     0.6759    0.8391    0.7487       261
         ORG     0.4583    0.4231    0.4400        52
   HumanProd     0.7200    0.8182    0.7660        22

   micro avg     0.7307    0.8023    0.7648       683
   macro avg     0.6682    0.7277    0.6948       683
weighted avg     0.7335    0.8023    0.7644       683
2023-10-25 21:32:35,859 ----------------------------------------------------------------------------------------------------
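The averages in the table can be cross-checked from the per-class rows: the macro average is the unweighted mean of the per-class F1 scores, and the weighted average weighs each class by its support (the micro average, by contrast, needs the raw TP/FP/FN counts and cannot be recomputed from the rows alone). A quick check:

```python
# Per-class f1-score and support, copied from the "By class" table above.
f1 = {"PER": 0.8245, "LOC": 0.7487, "ORG": 0.4400, "HumanProd": 0.7660}
support = {"PER": 348, "LOC": 261, "ORG": 52, "HumanProd": 22}

macro_f1 = sum(f1.values()) / len(f1)
weighted_f1 = sum(f1[c] * support[c] for c in f1) / sum(support.values())

print(round(macro_f1, 4))     # 0.6948, matching the "macro avg" row
print(round(weighted_f1, 4))  # 0.7644, matching the "weighted avg" row
```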