2023-10-25 20:52:43,281 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Train: 1166 sentences
2023-10-25 20:52:43,282 (train_with_dev=False, train_with_test=False)
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Training Params:
2023-10-25 20:52:43,282  - learning_rate: "5e-05"
2023-10-25 20:52:43,282  - mini_batch_size: "4"
2023-10-25 20:52:43,282  - max_epochs: "10"
2023-10-25 20:52:43,282  - shuffle: "True"
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Plugins:
2023-10-25 20:52:43,282  - TensorboardLogger
2023-10-25 20:52:43,282  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 20:52:43,282  - metric: "('micro avg', 'f1-score')"
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Computation:
2023-10-25 20:52:43,282  - compute on device: cuda:0
2023-10-25 20:52:43,282  - embedding storage: none
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,283 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 20:52:44,591 epoch 1 - iter 29/292 - loss 3.24438831 - time (sec): 1.31 - samples/sec: 3643.15 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:52:45,844 epoch 1 - iter 58/292 - loss 2.35737757 - time (sec): 2.56 - samples/sec: 3396.11 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:52:47,145 epoch 1 - iter 87/292 - loss 1.70602648 - time (sec): 3.86 - samples/sec: 3418.58 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:52:48,484 epoch 1 - iter 116/292 - loss 1.39064461 - time (sec): 5.20 - samples/sec: 3415.94 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:52:49,917 epoch 1 - iter 145/292 - loss 1.17555138 - time (sec): 6.63 - samples/sec: 3402.10 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:52:51,237 epoch 1 - iter 174/292 - loss 1.01899326 - time (sec): 7.95 - samples/sec: 3443.73 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:52:52,538 epoch 1 - iter 203/292 - loss 0.91213345 - time (sec): 9.25 - samples/sec: 3426.03 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:52:53,826 epoch 1 - iter 232/292 - loss 0.83161711 - time (sec): 10.54 - samples/sec: 3403.05 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:52:55,130 epoch 1 - iter 261/292 - loss 0.76267135 - time (sec): 11.85 - samples/sec: 3400.51 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:52:56,405 epoch 1 - iter 290/292 - loss 0.71294027 - time (sec): 13.12 - samples/sec: 3371.82 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:52:56,483 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:56,483 EPOCH 1 done: loss 0.7122 - lr: 0.000049
2023-10-25 20:52:56,997 DEV : loss 0.14957387745380402 - f1-score (micro avg)  0.567
2023-10-25 20:52:57,002 saving best model
2023-10-25 20:52:57,519 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:58,790 epoch 2 - iter 29/292 - loss 0.17704339 - time (sec): 1.27 - samples/sec: 3344.25 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:53:00,044 epoch 2 - iter 58/292 - loss 0.17561646 - time (sec): 2.52 - samples/sec: 3391.82 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:53:01,370 epoch 2 - iter 87/292 - loss 0.17159875 - time (sec): 3.85 - samples/sec: 3503.44 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:53:02,630 epoch 2 - iter 116/292 - loss 0.16582732 - time (sec): 5.11 - samples/sec: 3467.99 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:53:03,858 epoch 2 - iter 145/292 - loss 0.16514765 - time (sec): 6.34 - samples/sec: 3353.34 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:53:05,182 epoch 2 - iter 174/292 - loss 0.15569460 - time (sec): 7.66 - samples/sec: 3412.54 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:53:06,547 epoch 2 - iter 203/292 - loss 0.14640467 - time (sec): 9.03 - samples/sec: 3502.60 - lr: 0.000046 - momentum: 0.000000
2023-10-25 20:53:07,883 epoch 2 - iter 232/292 - loss 0.14581795 - time (sec): 10.36 - samples/sec: 3440.88 - lr: 0.000046 - momentum: 0.000000
2023-10-25 20:53:09,217 epoch 2 - iter 261/292 - loss 0.14877150 - time (sec): 11.70 - samples/sec: 3357.27 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:53:10,602 epoch 2 - iter 290/292 - loss 0.15023947 - time (sec): 13.08 - samples/sec: 3379.62 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:53:10,691 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:10,691 EPOCH 2 done: loss 0.1508 - lr: 0.000045
2023-10-25 20:53:11,772 DEV : loss 0.13169609010219574 - f1-score (micro avg)  0.6407
2023-10-25 20:53:11,777 saving best model
2023-10-25 20:53:12,452 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:13,788 epoch 3 - iter 29/292 - loss 0.08817301 - time (sec): 1.33 - samples/sec: 3800.10 - lr: 0.000044 - momentum: 0.000000
2023-10-25 20:53:15,148 epoch 3 - iter 58/292 - loss 0.07997124 - time (sec): 2.69 - samples/sec: 3792.63 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:53:16,442 epoch 3 - iter 87/292 - loss 0.08332501 - time (sec): 3.99 - samples/sec: 3484.50 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:53:17,735 epoch 3 - iter 116/292 - loss 0.08493381 - time (sec): 5.28 - samples/sec: 3430.17 - lr: 0.000042 - momentum: 0.000000
2023-10-25 20:53:19,013 epoch 3 - iter 145/292 - loss 0.08790732 - time (sec): 6.56 - samples/sec: 3456.09 - lr: 0.000042 - momentum: 0.000000
2023-10-25 20:53:20,255 epoch 3 - iter 174/292 - loss 0.08164116 - time (sec): 7.80 - samples/sec: 3330.52 - lr: 0.000041 - momentum: 0.000000
2023-10-25 20:53:21,582 epoch 3 - iter 203/292 - loss 0.08657696 - time (sec): 9.13 - samples/sec: 3404.26 - lr: 0.000041 - momentum: 0.000000
2023-10-25 20:53:22,907 epoch 3 - iter 232/292 - loss 0.08717220 - time (sec): 10.45 - samples/sec: 3388.03 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:53:24,256 epoch 3 - iter 261/292 - loss 0.08829596 - time (sec): 11.80 - samples/sec: 3397.56 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:53:25,532 epoch 3 - iter 290/292 - loss 0.08553678 - time (sec): 13.08 - samples/sec: 3390.93 - lr: 0.000039 - momentum: 0.000000
2023-10-25 20:53:25,619 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:25,619 EPOCH 3 done: loss 0.0855 - lr: 0.000039
2023-10-25 20:53:26,532 DEV : loss 0.1209733635187149 - f1-score (micro avg)  0.7101
2023-10-25 20:53:26,536 saving best model
2023-10-25 20:53:27,211 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:28,480 epoch 4 - iter 29/292 - loss 0.04193234 - time (sec): 1.27 - samples/sec: 2908.67 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:53:29,729 epoch 4 - iter 58/292 - loss 0.05216175 - time (sec): 2.52 - samples/sec: 3085.61 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:53:31,136 epoch 4 - iter 87/292 - loss 0.04828892 - time (sec): 3.92 - samples/sec: 3187.99 - lr: 0.000037 - momentum: 0.000000
2023-10-25 20:53:32,401 epoch 4 - iter 116/292 - loss 0.05429306 - time (sec): 5.19 - samples/sec: 3318.14 - lr: 0.000037 - momentum: 0.000000
2023-10-25 20:53:33,740 epoch 4 - iter 145/292 - loss 0.04835703 - time (sec): 6.53 - samples/sec: 3520.02 - lr: 0.000036 - momentum: 0.000000
2023-10-25 20:53:34,991 epoch 4 - iter 174/292 - loss 0.05061998 - time (sec): 7.78 - samples/sec: 3422.69 - lr: 0.000036 - momentum: 0.000000
2023-10-25 20:53:36,354 epoch 4 - iter 203/292 - loss 0.05117584 - time (sec): 9.14 - samples/sec: 3450.24 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:53:37,606 epoch 4 - iter 232/292 - loss 0.05313204 - time (sec): 10.39 - samples/sec: 3423.20 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:53:38,852 epoch 4 - iter 261/292 - loss 0.05248535 - time (sec): 11.64 - samples/sec: 3398.13 - lr: 0.000034 - momentum: 0.000000
2023-10-25 20:53:40,096 epoch 4 - iter 290/292 - loss 0.05345773 - time (sec): 12.88 - samples/sec: 3434.57 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:53:40,181 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:40,182 EPOCH 4 done: loss 0.0537 - lr: 0.000033
2023-10-25 20:53:41,097 DEV : loss 0.11280453205108643 - f1-score (micro avg)  0.7522
2023-10-25 20:53:41,102 saving best model
2023-10-25 20:53:41,756 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:43,027 epoch 5 - iter 29/292 - loss 0.03067125 - time (sec): 1.27 - samples/sec: 3116.28 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:53:44,281 epoch 5 - iter 58/292 - loss 0.03432643 - time (sec): 2.52 - samples/sec: 3410.73 - lr: 0.000032 - momentum: 0.000000
2023-10-25 20:53:45,604 epoch 5 - iter 87/292 - loss 0.04094254 - time (sec): 3.85 - samples/sec: 3442.26 - lr: 0.000032 - momentum: 0.000000
2023-10-25 20:53:46,927 epoch 5 - iter 116/292 - loss 0.03856020 - time (sec): 5.17 - samples/sec: 3419.71 - lr: 0.000031 - momentum: 0.000000
2023-10-25 20:53:48,182 epoch 5 - iter 145/292 - loss 0.03449220 - time (sec): 6.42 - samples/sec: 3411.26 - lr: 0.000031 - momentum: 0.000000
2023-10-25 20:53:49,492 epoch 5 - iter 174/292 - loss 0.03424483 - time (sec): 7.73 - samples/sec: 3425.22 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:53:50,794 epoch 5 - iter 203/292 - loss 0.03475940 - time (sec): 9.04 - samples/sec: 3410.84 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:53:52,091 epoch 5 - iter 232/292 - loss 0.03428080 - time (sec): 10.33 - samples/sec: 3432.78 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:53:53,399 epoch 5 - iter 261/292 - loss 0.03246941 - time (sec): 11.64 - samples/sec: 3388.06 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:53:54,768 epoch 5 - iter 290/292 - loss 0.03384146 - time (sec): 13.01 - samples/sec: 3402.63 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:53:54,843 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:54,843 EPOCH 5 done: loss 0.0341 - lr: 0.000028
2023-10-25 20:53:55,776 DEV : loss 0.14307908713817596 - f1-score (micro avg)  0.7343
2023-10-25 20:53:55,781 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:57,014 epoch 6 - iter 29/292 - loss 0.02283523 - time (sec): 1.23 - samples/sec: 3403.33 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:53:58,274 epoch 6 - iter 58/292 - loss 0.02378104 - time (sec): 2.49 - samples/sec: 3439.00 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:53:59,705 epoch 6 - iter 87/292 - loss 0.02477152 - time (sec): 3.92 - samples/sec: 3555.45 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:54:01,001 epoch 6 - iter 116/292 - loss 0.02923244 - time (sec): 5.22 - samples/sec: 3558.00 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:54:02,338 epoch 6 - iter 145/292 - loss 0.02764328 - time (sec): 6.56 - samples/sec: 3408.14 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:54:03,788 epoch 6 - iter 174/292 - loss 0.02758029 - time (sec): 8.01 - samples/sec: 3357.42 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:54:05,097 epoch 6 - iter 203/292 - loss 0.02712844 - time (sec): 9.32 - samples/sec: 3364.89 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:54:06,398 epoch 6 - iter 232/292 - loss 0.02556567 - time (sec): 10.62 - samples/sec: 3381.89 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:54:07,644 epoch 6 - iter 261/292 - loss 0.02621766 - time (sec): 11.86 - samples/sec: 3356.01 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:54:08,936 epoch 6 - iter 290/292 - loss 0.02847300 - time (sec): 13.15 - samples/sec: 3366.52 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:54:09,016 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:09,016 EPOCH 6 done: loss 0.0284 - lr: 0.000022
2023-10-25 20:54:10,097 DEV : loss 0.16628390550613403 - f1-score (micro avg)  0.7522
2023-10-25 20:54:10,102 saving best model
2023-10-25 20:54:10,660 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:11,922 epoch 7 - iter 29/292 - loss 0.01313140 - time (sec): 1.26 - samples/sec: 3320.18 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:54:13,181 epoch 7 - iter 58/292 - loss 0.01062088 - time (sec): 2.52 - samples/sec: 3319.42 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:54:14,445 epoch 7 - iter 87/292 - loss 0.01259981 - time (sec): 3.78 - samples/sec: 3345.63 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:54:15,657 epoch 7 - iter 116/292 - loss 0.01549817 - time (sec): 5.00 - samples/sec: 3351.89 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:54:17,051 epoch 7 - iter 145/292 - loss 0.01629611 - time (sec): 6.39 - samples/sec: 3443.66 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:54:18,411 epoch 7 - iter 174/292 - loss 0.01683554 - time (sec): 7.75 - samples/sec: 3456.79 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:54:19,716 epoch 7 - iter 203/292 - loss 0.02085836 - time (sec): 9.06 - samples/sec: 3467.56 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:54:20,977 epoch 7 - iter 232/292 - loss 0.02012985 - time (sec): 10.32 - samples/sec: 3450.94 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:54:22,272 epoch 7 - iter 261/292 - loss 0.01963116 - time (sec): 11.61 - samples/sec: 3441.95 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:54:23,519 epoch 7 - iter 290/292 - loss 0.01887461 - time (sec): 12.86 - samples/sec: 3434.82 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:54:23,603 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:23,603 EPOCH 7 done: loss 0.0188 - lr: 0.000017
2023-10-25 20:54:24,526 DEV : loss 0.18740706145763397 - f1-score (micro avg)  0.7209
2023-10-25 20:54:24,531 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:25,835 epoch 8 - iter 29/292 - loss 0.01005555 - time (sec): 1.30 - samples/sec: 3400.17 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:54:27,199 epoch 8 - iter 58/292 - loss 0.01177608 - time (sec): 2.67 - samples/sec: 3735.36 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:54:28,454 epoch 8 - iter 87/292 - loss 0.01023423 - time (sec): 3.92 - samples/sec: 3525.50 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:54:29,725 epoch 8 - iter 116/292 - loss 0.01018425 - time (sec): 5.19 - samples/sec: 3503.67 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:54:30,974 epoch 8 - iter 145/292 - loss 0.00987982 - time (sec): 6.44 - samples/sec: 3494.30 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:54:32,348 epoch 8 - iter 174/292 - loss 0.01053102 - time (sec): 7.82 - samples/sec: 3468.90 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:54:33,697 epoch 8 - iter 203/292 - loss 0.01060943 - time (sec): 9.17 - samples/sec: 3480.83 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:54:35,002 epoch 8 - iter 232/292 - loss 0.01150807 - time (sec): 10.47 - samples/sec: 3462.88 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:54:36,267 epoch 8 - iter 261/292 - loss 0.01076409 - time (sec): 11.74 - samples/sec: 3411.72 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:54:37,558 epoch 8 - iter 290/292 - loss 0.01018919 - time (sec): 13.03 - samples/sec: 3390.12 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:54:37,644 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:37,644 EPOCH 8 done: loss 0.0101 - lr: 0.000011
2023-10-25 20:54:38,559 DEV : loss 0.19930434226989746 - f1-score (micro avg)  0.7309
2023-10-25 20:54:38,563 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:39,907 epoch 9 - iter 29/292 - loss 0.01344550 - time (sec): 1.34 - samples/sec: 3432.35 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:54:41,169 epoch 9 - iter 58/292 - loss 0.01055378 - time (sec): 2.60 - samples/sec: 3307.88 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:54:42,418 epoch 9 - iter 87/292 - loss 0.00815295 - time (sec): 3.85 - samples/sec: 3152.34 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:54:43,710 epoch 9 - iter 116/292 - loss 0.00642132 - time (sec): 5.15 - samples/sec: 3294.32 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:54:44,977 epoch 9 - iter 145/292 - loss 0.00692122 - time (sec): 6.41 - samples/sec: 3257.05 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:54:46,255 epoch 9 - iter 174/292 - loss 0.00736428 - time (sec): 7.69 - samples/sec: 3255.49 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:54:47,546 epoch 9 - iter 203/292 - loss 0.00688575 - time (sec): 8.98 - samples/sec: 3341.57 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:54:48,746 epoch 9 - iter 232/292 - loss 0.00605517 - time (sec): 10.18 - samples/sec: 3389.43 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:54:50,084 epoch 9 - iter 261/292 - loss 0.00604344 - time (sec): 11.52 - samples/sec: 3467.54 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:54:51,342 epoch 9 - iter 290/292 - loss 0.00647095 - time (sec): 12.78 - samples/sec: 3460.46 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:54:51,421 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:51,421 EPOCH 9 done: loss 0.0064 - lr: 0.000006
2023-10-25 20:54:52,337 DEV : loss 0.20804047584533691 - f1-score (micro avg)  0.735
2023-10-25 20:54:52,342 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:53,531 epoch 10 - iter 29/292 - loss 0.00167655 - time (sec): 1.19 - samples/sec: 3673.47 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:54:54,800 epoch 10 - iter 58/292 - loss 0.00843115 - time (sec): 2.46 - samples/sec: 3456.32 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:54:56,074 epoch 10 - iter 87/292 - loss 0.00620319 - time (sec): 3.73 - samples/sec: 3501.50 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:54:57,422 epoch 10 - iter 116/292 - loss 0.00516689 - time (sec): 5.08 - samples/sec: 3410.62 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:54:58,751 epoch 10 - iter 145/292 - loss 0.00519706 - time (sec): 6.41 - samples/sec: 3393.23 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:55:00,068 epoch 10 - iter 174/292 - loss 0.00530414 - time (sec): 7.73 - samples/sec: 3331.36 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:55:01,375 epoch 10 - iter 203/292 - loss 0.00462386 - time (sec): 9.03 - samples/sec: 3418.00 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:55:02,678 epoch 10 - iter 232/292 - loss 0.00511570 - time (sec): 10.34 - samples/sec: 3447.71 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:55:03,957 epoch 10 - iter 261/292 - loss 0.00465885 - time (sec): 11.61 - samples/sec: 3411.83 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:55:05,243 epoch 10 - iter 290/292 - loss 0.00423569 - time (sec): 12.90 - samples/sec: 3419.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 20:55:05,325 ----------------------------------------------------------------------------------------------------
2023-10-25 20:55:05,325 EPOCH 10 done: loss 0.0042 - lr: 0.000000
2023-10-25 20:55:06,243 DEV : loss 0.21092882752418518 - f1-score (micro avg)  0.7292
2023-10-25 20:55:06,767 ----------------------------------------------------------------------------------------------------
2023-10-25 20:55:06,768 Loading model from best epoch ...
2023-10-25 20:55:08,658 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 20:55:10,238 Results:
- F-score (micro) 0.7654
- F-score (macro) 0.6924
- Accuracy 0.6478

By class:
              precision    recall  f1-score   support

         PER     0.7832    0.8305    0.8061       348
         LOC     0.6959    0.8506    0.7655       261
         ORG     0.5278    0.3654    0.4318        52
   HumanProd     0.7200    0.8182    0.7660        22

   micro avg     0.7316    0.8023    0.7654       683
   macro avg     0.6817    0.7162    0.6924       683
weighted avg     0.7284    0.8023    0.7608       683

2023-10-25 20:55:10,238 ----------------------------------------------------------------------------------------------------
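The per-iteration `lr` values in the log (rising to 0.000049 by the end of epoch 1, then falling to 0.000000 by epoch 10) come from the `LinearScheduler | warmup_fraction: '0.1'` plugin: linear warm-up over the first 10% of the 2920 total steps (292 iterations x 10 epochs), then linear decay to zero. A minimal sketch of that schedule; the function name is illustrative, not Flair's API, and the exact step at which Flair samples the logged lr may differ by an iteration or two:

```python
def linear_schedule_lr(step: int, total_steps: int,
                       peak_lr: float = 5e-5,
                       warmup_fraction: float = 0.1) -> float:
    """Linear warm-up to peak_lr, then linear decay to 0 (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warm-up phase: lr grows proportionally to the step count.
        return peak_lr * step / warmup_steps
    # Decay phase: lr shrinks linearly to 0 at the final step.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# This run: 292 batches/epoch * 10 epochs = 2920 steps; warm-up covers the first 292.
TOTAL_STEPS = 292 * 10
```

The schedule peaks at exactly `peak_lr` when warm-up ends and reaches zero at the last step, matching the lr trajectory logged above.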
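The averages in the final by-class table follow directly from the per-class rows: micro-F1 is the harmonic mean of the pooled precision and recall, macro-F1 is the unweighted mean of the per-class F1 scores, and weighted-F1 weights each class by its support. A quick arithmetic check using the values copied from the table (recomputing from the 4-digit rounded figures reproduces the reported numbers to within about 0.0001):

```python
# Per-class (precision, recall, f1, support) from the final evaluation table.
per_class = {
    "PER":       (0.7832, 0.8305, 0.8061, 348),
    "LOC":       (0.6959, 0.8506, 0.7655, 261),
    "ORG":       (0.5278, 0.3654, 0.4318,  52),
    "HumanProd": (0.7200, 0.8182, 0.7660,  22),
}

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Micro avg: precision/recall pooled over all entities, then F1.
micro_f1 = f1(0.7316, 0.8023)            # ~0.765 (table: 0.7654)

# Macro avg: unweighted mean of per-class F1.
macro_f1 = sum(v[2] for v in per_class.values()) / len(per_class)   # ~0.6924

# Weighted avg: per-class F1 weighted by support.
total_support = sum(v[3] for v in per_class.values())               # 683
weighted_f1 = sum(v[2] * v[3] for v in per_class.values()) / total_support  # ~0.7608
```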