Upload ./training.log with huggingface_hub
5235bd4
2023-10-25 20:52:43,281 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Train: 1166 sentences
2023-10-25 20:52:43,282 (train_with_dev=False, train_with_test=False)
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Training Params:
2023-10-25 20:52:43,282 - learning_rate: "5e-05"
2023-10-25 20:52:43,282 - mini_batch_size: "4"
2023-10-25 20:52:43,282 - max_epochs: "10"
2023-10-25 20:52:43,282 - shuffle: "True"
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Plugins:
2023-10-25 20:52:43,282 - TensorboardLogger
2023-10-25 20:52:43,282 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 20:52:43,282 - metric: "('micro avg', 'f1-score')"
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Computation:
2023-10-25 20:52:43,282 - compute on device: cuda:0
2023-10-25 20:52:43,282 - embedding storage: none
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,283 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 20:52:44,591 epoch 1 - iter 29/292 - loss 3.24438831 - time (sec): 1.31 - samples/sec: 3643.15 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:52:45,844 epoch 1 - iter 58/292 - loss 2.35737757 - time (sec): 2.56 - samples/sec: 3396.11 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:52:47,145 epoch 1 - iter 87/292 - loss 1.70602648 - time (sec): 3.86 - samples/sec: 3418.58 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:52:48,484 epoch 1 - iter 116/292 - loss 1.39064461 - time (sec): 5.20 - samples/sec: 3415.94 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:52:49,917 epoch 1 - iter 145/292 - loss 1.17555138 - time (sec): 6.63 - samples/sec: 3402.10 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:52:51,237 epoch 1 - iter 174/292 - loss 1.01899326 - time (sec): 7.95 - samples/sec: 3443.73 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:52:52,538 epoch 1 - iter 203/292 - loss 0.91213345 - time (sec): 9.25 - samples/sec: 3426.03 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:52:53,826 epoch 1 - iter 232/292 - loss 0.83161711 - time (sec): 10.54 - samples/sec: 3403.05 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:52:55,130 epoch 1 - iter 261/292 - loss 0.76267135 - time (sec): 11.85 - samples/sec: 3400.51 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:52:56,405 epoch 1 - iter 290/292 - loss 0.71294027 - time (sec): 13.12 - samples/sec: 3371.82 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:52:56,483 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:56,483 EPOCH 1 done: loss 0.7122 - lr: 0.000049
2023-10-25 20:52:56,997 DEV : loss 0.14957387745380402 - f1-score (micro avg) 0.567
2023-10-25 20:52:57,002 saving best model
2023-10-25 20:52:57,519 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:58,790 epoch 2 - iter 29/292 - loss 0.17704339 - time (sec): 1.27 - samples/sec: 3344.25 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:53:00,044 epoch 2 - iter 58/292 - loss 0.17561646 - time (sec): 2.52 - samples/sec: 3391.82 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:53:01,370 epoch 2 - iter 87/292 - loss 0.17159875 - time (sec): 3.85 - samples/sec: 3503.44 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:53:02,630 epoch 2 - iter 116/292 - loss 0.16582732 - time (sec): 5.11 - samples/sec: 3467.99 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:53:03,858 epoch 2 - iter 145/292 - loss 0.16514765 - time (sec): 6.34 - samples/sec: 3353.34 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:53:05,182 epoch 2 - iter 174/292 - loss 0.15569460 - time (sec): 7.66 - samples/sec: 3412.54 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:53:06,547 epoch 2 - iter 203/292 - loss 0.14640467 - time (sec): 9.03 - samples/sec: 3502.60 - lr: 0.000046 - momentum: 0.000000
2023-10-25 20:53:07,883 epoch 2 - iter 232/292 - loss 0.14581795 - time (sec): 10.36 - samples/sec: 3440.88 - lr: 0.000046 - momentum: 0.000000
2023-10-25 20:53:09,217 epoch 2 - iter 261/292 - loss 0.14877150 - time (sec): 11.70 - samples/sec: 3357.27 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:53:10,602 epoch 2 - iter 290/292 - loss 0.15023947 - time (sec): 13.08 - samples/sec: 3379.62 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:53:10,691 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:10,691 EPOCH 2 done: loss 0.1508 - lr: 0.000045
2023-10-25 20:53:11,772 DEV : loss 0.13169609010219574 - f1-score (micro avg) 0.6407
2023-10-25 20:53:11,777 saving best model
2023-10-25 20:53:12,452 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:13,788 epoch 3 - iter 29/292 - loss 0.08817301 - time (sec): 1.33 - samples/sec: 3800.10 - lr: 0.000044 - momentum: 0.000000
2023-10-25 20:53:15,148 epoch 3 - iter 58/292 - loss 0.07997124 - time (sec): 2.69 - samples/sec: 3792.63 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:53:16,442 epoch 3 - iter 87/292 - loss 0.08332501 - time (sec): 3.99 - samples/sec: 3484.50 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:53:17,735 epoch 3 - iter 116/292 - loss 0.08493381 - time (sec): 5.28 - samples/sec: 3430.17 - lr: 0.000042 - momentum: 0.000000
2023-10-25 20:53:19,013 epoch 3 - iter 145/292 - loss 0.08790732 - time (sec): 6.56 - samples/sec: 3456.09 - lr: 0.000042 - momentum: 0.000000
2023-10-25 20:53:20,255 epoch 3 - iter 174/292 - loss 0.08164116 - time (sec): 7.80 - samples/sec: 3330.52 - lr: 0.000041 - momentum: 0.000000
2023-10-25 20:53:21,582 epoch 3 - iter 203/292 - loss 0.08657696 - time (sec): 9.13 - samples/sec: 3404.26 - lr: 0.000041 - momentum: 0.000000
2023-10-25 20:53:22,907 epoch 3 - iter 232/292 - loss 0.08717220 - time (sec): 10.45 - samples/sec: 3388.03 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:53:24,256 epoch 3 - iter 261/292 - loss 0.08829596 - time (sec): 11.80 - samples/sec: 3397.56 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:53:25,532 epoch 3 - iter 290/292 - loss 0.08553678 - time (sec): 13.08 - samples/sec: 3390.93 - lr: 0.000039 - momentum: 0.000000
2023-10-25 20:53:25,619 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:25,619 EPOCH 3 done: loss 0.0855 - lr: 0.000039
2023-10-25 20:53:26,532 DEV : loss 0.1209733635187149 - f1-score (micro avg) 0.7101
2023-10-25 20:53:26,536 saving best model
2023-10-25 20:53:27,211 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:28,480 epoch 4 - iter 29/292 - loss 0.04193234 - time (sec): 1.27 - samples/sec: 2908.67 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:53:29,729 epoch 4 - iter 58/292 - loss 0.05216175 - time (sec): 2.52 - samples/sec: 3085.61 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:53:31,136 epoch 4 - iter 87/292 - loss 0.04828892 - time (sec): 3.92 - samples/sec: 3187.99 - lr: 0.000037 - momentum: 0.000000
2023-10-25 20:53:32,401 epoch 4 - iter 116/292 - loss 0.05429306 - time (sec): 5.19 - samples/sec: 3318.14 - lr: 0.000037 - momentum: 0.000000
2023-10-25 20:53:33,740 epoch 4 - iter 145/292 - loss 0.04835703 - time (sec): 6.53 - samples/sec: 3520.02 - lr: 0.000036 - momentum: 0.000000
2023-10-25 20:53:34,991 epoch 4 - iter 174/292 - loss 0.05061998 - time (sec): 7.78 - samples/sec: 3422.69 - lr: 0.000036 - momentum: 0.000000
2023-10-25 20:53:36,354 epoch 4 - iter 203/292 - loss 0.05117584 - time (sec): 9.14 - samples/sec: 3450.24 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:53:37,606 epoch 4 - iter 232/292 - loss 0.05313204 - time (sec): 10.39 - samples/sec: 3423.20 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:53:38,852 epoch 4 - iter 261/292 - loss 0.05248535 - time (sec): 11.64 - samples/sec: 3398.13 - lr: 0.000034 - momentum: 0.000000
2023-10-25 20:53:40,096 epoch 4 - iter 290/292 - loss 0.05345773 - time (sec): 12.88 - samples/sec: 3434.57 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:53:40,181 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:40,182 EPOCH 4 done: loss 0.0537 - lr: 0.000033
2023-10-25 20:53:41,097 DEV : loss 0.11280453205108643 - f1-score (micro avg) 0.7522
2023-10-25 20:53:41,102 saving best model
2023-10-25 20:53:41,756 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:43,027 epoch 5 - iter 29/292 - loss 0.03067125 - time (sec): 1.27 - samples/sec: 3116.28 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:53:44,281 epoch 5 - iter 58/292 - loss 0.03432643 - time (sec): 2.52 - samples/sec: 3410.73 - lr: 0.000032 - momentum: 0.000000
2023-10-25 20:53:45,604 epoch 5 - iter 87/292 - loss 0.04094254 - time (sec): 3.85 - samples/sec: 3442.26 - lr: 0.000032 - momentum: 0.000000
2023-10-25 20:53:46,927 epoch 5 - iter 116/292 - loss 0.03856020 - time (sec): 5.17 - samples/sec: 3419.71 - lr: 0.000031 - momentum: 0.000000
2023-10-25 20:53:48,182 epoch 5 - iter 145/292 - loss 0.03449220 - time (sec): 6.42 - samples/sec: 3411.26 - lr: 0.000031 - momentum: 0.000000
2023-10-25 20:53:49,492 epoch 5 - iter 174/292 - loss 0.03424483 - time (sec): 7.73 - samples/sec: 3425.22 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:53:50,794 epoch 5 - iter 203/292 - loss 0.03475940 - time (sec): 9.04 - samples/sec: 3410.84 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:53:52,091 epoch 5 - iter 232/292 - loss 0.03428080 - time (sec): 10.33 - samples/sec: 3432.78 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:53:53,399 epoch 5 - iter 261/292 - loss 0.03246941 - time (sec): 11.64 - samples/sec: 3388.06 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:53:54,768 epoch 5 - iter 290/292 - loss 0.03384146 - time (sec): 13.01 - samples/sec: 3402.63 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:53:54,843 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:54,843 EPOCH 5 done: loss 0.0341 - lr: 0.000028
2023-10-25 20:53:55,776 DEV : loss 0.14307908713817596 - f1-score (micro avg) 0.7343
2023-10-25 20:53:55,781 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:57,014 epoch 6 - iter 29/292 - loss 0.02283523 - time (sec): 1.23 - samples/sec: 3403.33 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:53:58,274 epoch 6 - iter 58/292 - loss 0.02378104 - time (sec): 2.49 - samples/sec: 3439.00 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:53:59,705 epoch 6 - iter 87/292 - loss 0.02477152 - time (sec): 3.92 - samples/sec: 3555.45 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:54:01,001 epoch 6 - iter 116/292 - loss 0.02923244 - time (sec): 5.22 - samples/sec: 3558.00 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:54:02,338 epoch 6 - iter 145/292 - loss 0.02764328 - time (sec): 6.56 - samples/sec: 3408.14 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:54:03,788 epoch 6 - iter 174/292 - loss 0.02758029 - time (sec): 8.01 - samples/sec: 3357.42 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:54:05,097 epoch 6 - iter 203/292 - loss 0.02712844 - time (sec): 9.32 - samples/sec: 3364.89 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:54:06,398 epoch 6 - iter 232/292 - loss 0.02556567 - time (sec): 10.62 - samples/sec: 3381.89 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:54:07,644 epoch 6 - iter 261/292 - loss 0.02621766 - time (sec): 11.86 - samples/sec: 3356.01 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:54:08,936 epoch 6 - iter 290/292 - loss 0.02847300 - time (sec): 13.15 - samples/sec: 3366.52 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:54:09,016 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:09,016 EPOCH 6 done: loss 0.0284 - lr: 0.000022
2023-10-25 20:54:10,097 DEV : loss 0.16628390550613403 - f1-score (micro avg) 0.7522
2023-10-25 20:54:10,102 saving best model
2023-10-25 20:54:10,660 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:11,922 epoch 7 - iter 29/292 - loss 0.01313140 - time (sec): 1.26 - samples/sec: 3320.18 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:54:13,181 epoch 7 - iter 58/292 - loss 0.01062088 - time (sec): 2.52 - samples/sec: 3319.42 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:54:14,445 epoch 7 - iter 87/292 - loss 0.01259981 - time (sec): 3.78 - samples/sec: 3345.63 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:54:15,657 epoch 7 - iter 116/292 - loss 0.01549817 - time (sec): 5.00 - samples/sec: 3351.89 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:54:17,051 epoch 7 - iter 145/292 - loss 0.01629611 - time (sec): 6.39 - samples/sec: 3443.66 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:54:18,411 epoch 7 - iter 174/292 - loss 0.01683554 - time (sec): 7.75 - samples/sec: 3456.79 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:54:19,716 epoch 7 - iter 203/292 - loss 0.02085836 - time (sec): 9.06 - samples/sec: 3467.56 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:54:20,977 epoch 7 - iter 232/292 - loss 0.02012985 - time (sec): 10.32 - samples/sec: 3450.94 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:54:22,272 epoch 7 - iter 261/292 - loss 0.01963116 - time (sec): 11.61 - samples/sec: 3441.95 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:54:23,519 epoch 7 - iter 290/292 - loss 0.01887461 - time (sec): 12.86 - samples/sec: 3434.82 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:54:23,603 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:23,603 EPOCH 7 done: loss 0.0188 - lr: 0.000017
2023-10-25 20:54:24,526 DEV : loss 0.18740706145763397 - f1-score (micro avg) 0.7209
2023-10-25 20:54:24,531 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:25,835 epoch 8 - iter 29/292 - loss 0.01005555 - time (sec): 1.30 - samples/sec: 3400.17 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:54:27,199 epoch 8 - iter 58/292 - loss 0.01177608 - time (sec): 2.67 - samples/sec: 3735.36 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:54:28,454 epoch 8 - iter 87/292 - loss 0.01023423 - time (sec): 3.92 - samples/sec: 3525.50 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:54:29,725 epoch 8 - iter 116/292 - loss 0.01018425 - time (sec): 5.19 - samples/sec: 3503.67 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:54:30,974 epoch 8 - iter 145/292 - loss 0.00987982 - time (sec): 6.44 - samples/sec: 3494.30 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:54:32,348 epoch 8 - iter 174/292 - loss 0.01053102 - time (sec): 7.82 - samples/sec: 3468.90 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:54:33,697 epoch 8 - iter 203/292 - loss 0.01060943 - time (sec): 9.17 - samples/sec: 3480.83 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:54:35,002 epoch 8 - iter 232/292 - loss 0.01150807 - time (sec): 10.47 - samples/sec: 3462.88 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:54:36,267 epoch 8 - iter 261/292 - loss 0.01076409 - time (sec): 11.74 - samples/sec: 3411.72 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:54:37,558 epoch 8 - iter 290/292 - loss 0.01018919 - time (sec): 13.03 - samples/sec: 3390.12 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:54:37,644 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:37,644 EPOCH 8 done: loss 0.0101 - lr: 0.000011
2023-10-25 20:54:38,559 DEV : loss 0.19930434226989746 - f1-score (micro avg) 0.7309
2023-10-25 20:54:38,563 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:39,907 epoch 9 - iter 29/292 - loss 0.01344550 - time (sec): 1.34 - samples/sec: 3432.35 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:54:41,169 epoch 9 - iter 58/292 - loss 0.01055378 - time (sec): 2.60 - samples/sec: 3307.88 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:54:42,418 epoch 9 - iter 87/292 - loss 0.00815295 - time (sec): 3.85 - samples/sec: 3152.34 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:54:43,710 epoch 9 - iter 116/292 - loss 0.00642132 - time (sec): 5.15 - samples/sec: 3294.32 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:54:44,977 epoch 9 - iter 145/292 - loss 0.00692122 - time (sec): 6.41 - samples/sec: 3257.05 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:54:46,255 epoch 9 - iter 174/292 - loss 0.00736428 - time (sec): 7.69 - samples/sec: 3255.49 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:54:47,546 epoch 9 - iter 203/292 - loss 0.00688575 - time (sec): 8.98 - samples/sec: 3341.57 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:54:48,746 epoch 9 - iter 232/292 - loss 0.00605517 - time (sec): 10.18 - samples/sec: 3389.43 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:54:50,084 epoch 9 - iter 261/292 - loss 0.00604344 - time (sec): 11.52 - samples/sec: 3467.54 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:54:51,342 epoch 9 - iter 290/292 - loss 0.00647095 - time (sec): 12.78 - samples/sec: 3460.46 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:54:51,421 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:51,421 EPOCH 9 done: loss 0.0064 - lr: 0.000006
2023-10-25 20:54:52,337 DEV : loss 0.20804047584533691 - f1-score (micro avg) 0.735
2023-10-25 20:54:52,342 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:53,531 epoch 10 - iter 29/292 - loss 0.00167655 - time (sec): 1.19 - samples/sec: 3673.47 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:54:54,800 epoch 10 - iter 58/292 - loss 0.00843115 - time (sec): 2.46 - samples/sec: 3456.32 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:54:56,074 epoch 10 - iter 87/292 - loss 0.00620319 - time (sec): 3.73 - samples/sec: 3501.50 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:54:57,422 epoch 10 - iter 116/292 - loss 0.00516689 - time (sec): 5.08 - samples/sec: 3410.62 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:54:58,751 epoch 10 - iter 145/292 - loss 0.00519706 - time (sec): 6.41 - samples/sec: 3393.23 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:55:00,068 epoch 10 - iter 174/292 - loss 0.00530414 - time (sec): 7.73 - samples/sec: 3331.36 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:55:01,375 epoch 10 - iter 203/292 - loss 0.00462386 - time (sec): 9.03 - samples/sec: 3418.00 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:55:02,678 epoch 10 - iter 232/292 - loss 0.00511570 - time (sec): 10.34 - samples/sec: 3447.71 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:55:03,957 epoch 10 - iter 261/292 - loss 0.00465885 - time (sec): 11.61 - samples/sec: 3411.83 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:55:05,243 epoch 10 - iter 290/292 - loss 0.00423569 - time (sec): 12.90 - samples/sec: 3419.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 20:55:05,325 ----------------------------------------------------------------------------------------------------
2023-10-25 20:55:05,325 EPOCH 10 done: loss 0.0042 - lr: 0.000000
2023-10-25 20:55:06,243 DEV : loss 0.21092882752418518 - f1-score (micro avg) 0.7292
2023-10-25 20:55:06,767 ----------------------------------------------------------------------------------------------------
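The "saving best model" lines above fire whenever the dev micro-F1 matches or beats the previous best; note that epoch 6 re-saves on a tie with epoch 4 (both 0.7522), so the comparison appears to be >= rather than > (an assumption inferred from this log, not from Flair's source). A minimal sketch of that selection rule:

```python
# Per-epoch dev micro-F1 scores copied from the DEV lines of this log.
dev_f1 = [0.567, 0.6407, 0.7101, 0.7522, 0.7343,
          0.7522, 0.7209, 0.7309, 0.735, 0.7292]

best, saved_epochs = float("-inf"), []
for epoch, score in enumerate(dev_f1, start=1):
    if score >= best:  # ties re-save, matching the epoch-6 "saving best model" line
        best = score
        saved_epochs.append(epoch)

# saved_epochs -> [1, 2, 3, 4, 6]: the epochs where the log says "saving best model"
```

Under this rule, best-model.pt at the end of training is the epoch-6 checkpoint, which is what the final evaluation below loads.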
2023-10-25 20:55:06,768 Loading model from best epoch ...
2023-10-25 20:55:08,658 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 20:55:10,238
Results:
- F-score (micro) 0.7654
- F-score (macro) 0.6924
- Accuracy 0.6478
By class:
              precision    recall  f1-score   support

         PER     0.7832    0.8305    0.8061       348
         LOC     0.6959    0.8506    0.7655       261
         ORG     0.5278    0.3654    0.4318        52
   HumanProd     0.7200    0.8182    0.7660        22

   micro avg     0.7316    0.8023    0.7654       683
   macro avg     0.6817    0.7162    0.6924       683
weighted avg     0.7284    0.8023    0.7608       683
2023-10-25 20:55:10,238 ----------------------------------------------------------------------------------------------------
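The micro-averaged row of the table above can be recomputed from the per-class numbers: micro averaging pools true positives and predictions across classes before dividing, unlike the macro row, which averages per-class scores. A sketch, with per-class true-positive and prediction counts recovered from the table (TP = recall × support, predicted = TP / precision, rounded to whole entities):

```python
# name: (true_positives, predicted, support), reconstructed from the per-class rows
classes = {
    "PER": (289, 369, 348),
    "LOC": (222, 319, 261),
    "ORG": (19, 36, 52),
    "HumanProd": (18, 25, 22),
}

tp = sum(v[0] for v in classes.values())    # 548 entities found correctly
pred = sum(v[1] for v in classes.values())  # 749 entities predicted in total
supp = sum(v[2] for v in classes.values())  # 683 gold entities

precision = tp / pred                       # ~0.7316
recall = tp / supp                          # ~0.8023
f1 = 2 * precision * recall / (precision + recall)  # ~0.7654, the reported F-score
```

Pooling counts this way is why the large PER and LOC classes dominate the 0.7654 micro F1 while the weak ORG class (0.4318) drags the macro average down to 0.6924.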