2023-10-25 20:45:27,300 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,301 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 20:45:27,301 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,301 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 Train:  1166 sentences
2023-10-25 20:45:27,302         (train_with_dev=False, train_with_test=False)
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 Training Params:
2023-10-25 20:45:27,302  - learning_rate: "3e-05"
2023-10-25 20:45:27,302  - mini_batch_size: "8"
2023-10-25 20:45:27,302  - max_epochs: "10"
2023-10-25 20:45:27,302  - shuffle: "True"
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 Plugins:
2023-10-25 20:45:27,302  - TensorboardLogger
2023-10-25 20:45:27,302  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 20:45:27,302  - metric: "('micro avg', 'f1-score')"
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 Computation:
2023-10-25 20:45:27,302  - compute on device: cuda:0
2023-10-25 20:45:27,302  - embedding storage: none
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,303 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 20:45:28,780 epoch 1 - iter 14/146 - loss 3.40387893 - time (sec): 1.48 - samples/sec: 3168.95 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:45:29,613 epoch 1 - iter 28/146 - loss 3.06671750 - time (sec): 2.31 - samples/sec: 3664.65 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:45:30,529 epoch 1 - iter 42/146 - loss 2.42857814 - time (sec): 3.23 - samples/sec: 3960.64 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:45:31,464 epoch 1 - iter 56/146 - loss 1.94932103 - time (sec): 4.16 - samples/sec: 4165.72 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:45:32,404 epoch 1 - iter 70/146 - loss 1.65766878 - time (sec): 5.10 - samples/sec: 4292.17 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:45:33,352 epoch 1 - iter 84/146 - loss 1.44606435 - time (sec): 6.05 - samples/sec: 4392.13 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:45:34,263 epoch 1 - iter 98/146 - loss 1.29854117 - time (sec): 6.96 - samples/sec: 4422.42 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:45:35,000 epoch 1 - iter 112/146 - loss 1.19665486 - time (sec): 7.70 - samples/sec: 4452.40 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:45:35,992 epoch 1 - iter 126/146 - loss 1.08010980 - time (sec): 8.69 - samples/sec: 4492.40 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:45:36,820 epoch 1 - iter 140/146 - loss 1.00953467 - time (sec): 9.52 - samples/sec: 4495.08 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:45:37,181 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:37,181 EPOCH 1 done: loss 0.9803 - lr: 0.000029
2023-10-25 20:45:37,681 DEV : loss 0.18878793716430664 - f1-score (micro avg)  0.4692
2023-10-25 20:45:37,685 saving best model
2023-10-25 20:45:38,227 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:39,018 epoch 2 - iter 14/146 - loss 0.22373383 - time (sec): 0.79 - samples/sec: 5025.93 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:45:39,871 epoch 2 - iter 28/146 - loss 0.21574470 - time (sec): 1.64 - samples/sec: 5031.21 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:45:40,819 epoch 2 - iter 42/146 - loss 0.22004443 - time (sec): 2.59 - samples/sec: 5013.80 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:45:41,640 epoch 2 - iter 56/146 - loss 0.20860038 - time (sec): 3.41 - samples/sec: 5027.01 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:45:42,429 epoch 2 - iter 70/146 - loss 0.20976419 - time (sec): 4.20 - samples/sec: 4957.47 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:45:43,421 epoch 2 - iter 84/146 - loss 0.20054745 - time (sec): 5.19 - samples/sec: 4852.10 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:45:44,456 epoch 2 - iter 98/146 - loss 0.19171488 - time (sec): 6.23 - samples/sec: 4836.36 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:45:45,320 epoch 2 - iter 112/146 - loss 0.18425139 - time (sec): 7.09 - samples/sec: 4856.43 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:45:46,213 epoch 2 - iter 126/146 - loss 0.18453654 - time (sec): 7.98 - samples/sec: 4787.91 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:45:46,992 epoch 2 - iter 140/146 - loss 0.18299430 - time (sec): 8.76 - samples/sec: 4783.07 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:45:47,552 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:47,553 EPOCH 2 done: loss 0.1829 - lr: 0.000027
2023-10-25 20:45:48,459 DEV : loss 0.10772357136011124 - f1-score (micro avg)  0.6757
2023-10-25 20:45:48,464 saving best model
2023-10-25 20:45:49,128 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:50,253 epoch 3 - iter 14/146 - loss 0.09397448 - time (sec): 1.12 - samples/sec: 4453.73 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:45:51,244 epoch 3 - iter 28/146 - loss 0.08863819 - time (sec): 2.11 - samples/sec: 4734.46 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:45:52,029 epoch 3 - iter 42/146 - loss 0.09172307 - time (sec): 2.90 - samples/sec: 4638.72 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:45:52,887 epoch 3 - iter 56/146 - loss 0.09536448 - time (sec): 3.76 - samples/sec: 4682.14 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:45:53,773 epoch 3 - iter 70/146 - loss 0.09920949 - time (sec): 4.64 - samples/sec: 4721.91 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:45:54,543 epoch 3 - iter 84/146 - loss 0.09883122 - time (sec): 5.41 - samples/sec: 4656.43 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:45:55,495 epoch 3 - iter 98/146 - loss 0.09485869 - time (sec): 6.37 - samples/sec: 4721.80 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:45:56,456 epoch 3 - iter 112/146 - loss 0.09585353 - time (sec): 7.33 - samples/sec: 4703.34 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:45:57,400 epoch 3 - iter 126/146 - loss 0.10213869 - time (sec): 8.27 - samples/sec: 4707.42 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:45:58,266 epoch 3 - iter 140/146 - loss 0.09946411 - time (sec): 9.14 - samples/sec: 4687.32 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:45:58,615 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:58,615 EPOCH 3 done: loss 0.0993 - lr: 0.000024
2023-10-25 20:45:59,521 DEV : loss 0.09623494744300842 - f1-score (micro avg)  0.7237
2023-10-25 20:45:59,525 saving best model
2023-10-25 20:46:00,194 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:00,959 epoch 4 - iter 14/146 - loss 0.04022952 - time (sec): 0.76 - samples/sec: 4527.81 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:46:01,817 epoch 4 - iter 28/146 - loss 0.05326105 - time (sec): 1.62 - samples/sec: 4707.44 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:46:02,851 epoch 4 - iter 42/146 - loss 0.05115026 - time (sec): 2.65 - samples/sec: 4516.22 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:46:03,802 epoch 4 - iter 56/146 - loss 0.05355703 - time (sec): 3.61 - samples/sec: 4625.29 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:46:04,793 epoch 4 - iter 70/146 - loss 0.04916831 - time (sec): 4.60 - samples/sec: 4842.45 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:46:05,616 epoch 4 - iter 84/146 - loss 0.05190130 - time (sec): 5.42 - samples/sec: 4783.85 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:46:06,680 epoch 4 - iter 98/146 - loss 0.05403563 - time (sec): 6.48 - samples/sec: 4719.13 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:46:07,485 epoch 4 - iter 112/146 - loss 0.05760992 - time (sec): 7.29 - samples/sec: 4729.44 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:46:08,366 epoch 4 - iter 126/146 - loss 0.05858482 - time (sec): 8.17 - samples/sec: 4699.12 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:46:09,193 epoch 4 - iter 140/146 - loss 0.05860587 - time (sec): 9.00 - samples/sec: 4749.76 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:46:09,538 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:09,539 EPOCH 4 done: loss 0.0602 - lr: 0.000020
2023-10-25 20:46:10,450 DEV : loss 0.10259346663951874 - f1-score (micro avg)  0.7352
2023-10-25 20:46:10,454 saving best model
2023-10-25 20:46:11,125 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:11,959 epoch 5 - iter 14/146 - loss 0.04835059 - time (sec): 0.83 - samples/sec: 4538.83 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:46:12,819 epoch 5 - iter 28/146 - loss 0.04414478 - time (sec): 1.69 - samples/sec: 4744.01 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:46:13,786 epoch 5 - iter 42/146 - loss 0.04694686 - time (sec): 2.66 - samples/sec: 4818.30 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:46:14,616 epoch 5 - iter 56/146 - loss 0.04615584 - time (sec): 3.49 - samples/sec: 4796.28 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:46:15,550 epoch 5 - iter 70/146 - loss 0.04037843 - time (sec): 4.42 - samples/sec: 4778.41 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:46:16,368 epoch 5 - iter 84/146 - loss 0.04012595 - time (sec): 5.24 - samples/sec: 4751.28 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:46:17,376 epoch 5 - iter 98/146 - loss 0.04025863 - time (sec): 6.25 - samples/sec: 4720.73 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:46:18,308 epoch 5 - iter 112/146 - loss 0.03949493 - time (sec): 7.18 - samples/sec: 4719.44 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:46:19,261 epoch 5 - iter 126/146 - loss 0.03932549 - time (sec): 8.13 - samples/sec: 4681.60 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:46:20,151 epoch 5 - iter 140/146 - loss 0.03746108 - time (sec): 9.02 - samples/sec: 4702.72 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:46:20,577 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:20,577 EPOCH 5 done: loss 0.0404 - lr: 0.000017
2023-10-25 20:46:21,635 DEV : loss 0.10988734662532806 - f1-score (micro avg)  0.7435
2023-10-25 20:46:21,639 saving best model
2023-10-25 20:46:22,313 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:23,226 epoch 6 - iter 14/146 - loss 0.03584193 - time (sec): 0.91 - samples/sec: 4497.18 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:46:24,123 epoch 6 - iter 28/146 - loss 0.02695874 - time (sec): 1.81 - samples/sec: 4616.48 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:46:25,174 epoch 6 - iter 42/146 - loss 0.02419280 - time (sec): 2.86 - samples/sec: 4545.19 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:46:26,171 epoch 6 - iter 56/146 - loss 0.02537702 - time (sec): 3.86 - samples/sec: 4669.12 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:46:27,060 epoch 6 - iter 70/146 - loss 0.02481698 - time (sec): 4.75 - samples/sec: 4611.66 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:46:27,932 epoch 6 - iter 84/146 - loss 0.02388273 - time (sec): 5.62 - samples/sec: 4591.94 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:46:28,827 epoch 6 - iter 98/146 - loss 0.02589398 - time (sec): 6.51 - samples/sec: 4653.01 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:46:29,715 epoch 6 - iter 112/146 - loss 0.02424820 - time (sec): 7.40 - samples/sec: 4697.26 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:46:30,507 epoch 6 - iter 126/146 - loss 0.02469800 - time (sec): 8.19 - samples/sec: 4696.81 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:46:31,313 epoch 6 - iter 140/146 - loss 0.02532898 - time (sec): 9.00 - samples/sec: 4768.13 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:46:31,721 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:31,721 EPOCH 6 done: loss 0.0264 - lr: 0.000014
2023-10-25 20:46:32,629 DEV : loss 0.1386864334344864 - f1-score (micro avg)  0.7348
2023-10-25 20:46:32,634 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:33,506 epoch 7 - iter 14/146 - loss 0.02298317 - time (sec): 0.87 - samples/sec: 4624.97 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:46:34,390 epoch 7 - iter 28/146 - loss 0.01697781 - time (sec): 1.75 - samples/sec: 4561.88 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:46:35,300 epoch 7 - iter 42/146 - loss 0.01735609 - time (sec): 2.67 - samples/sec: 4561.13 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:46:36,143 epoch 7 - iter 56/146 - loss 0.01627697 - time (sec): 3.51 - samples/sec: 4555.86 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:46:37,224 epoch 7 - iter 70/146 - loss 0.01980291 - time (sec): 4.59 - samples/sec: 4619.86 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:46:38,274 epoch 7 - iter 84/146 - loss 0.01830047 - time (sec): 5.64 - samples/sec: 4600.61 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:46:39,173 epoch 7 - iter 98/146 - loss 0.01978492 - time (sec): 6.54 - samples/sec: 4609.19 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:46:40,095 epoch 7 - iter 112/146 - loss 0.01965229 - time (sec): 7.46 - samples/sec: 4637.54 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:46:40,973 epoch 7 - iter 126/146 - loss 0.01942350 - time (sec): 8.34 - samples/sec: 4592.38 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:46:41,772 epoch 7 - iter 140/146 - loss 0.01866854 - time (sec): 9.14 - samples/sec: 4683.18 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:46:42,111 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:42,111 EPOCH 7 done: loss 0.0185 - lr: 0.000010
2023-10-25 20:46:43,024 DEV : loss 0.12989920377731323 - f1-score (micro avg)  0.7407
2023-10-25 20:46:43,028 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:43,931 epoch 8 - iter 14/146 - loss 0.00678178 - time (sec): 0.90 - samples/sec: 4885.73 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:46:44,781 epoch 8 - iter 28/146 - loss 0.01030138 - time (sec): 1.75 - samples/sec: 5354.80 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:46:45,610 epoch 8 - iter 42/146 - loss 0.01041059 - time (sec): 2.58 - samples/sec: 5152.88 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:46:46,515 epoch 8 - iter 56/146 - loss 0.01122395 - time (sec): 3.49 - samples/sec: 5052.68 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:46:47,386 epoch 8 - iter 70/146 - loss 0.01098060 - time (sec): 4.36 - samples/sec: 4939.20 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:46:48,315 epoch 8 - iter 84/146 - loss 0.01162853 - time (sec): 5.29 - samples/sec: 4912.02 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:46:49,378 epoch 8 - iter 98/146 - loss 0.01200666 - time (sec): 6.35 - samples/sec: 4823.94 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:46:50,289 epoch 8 - iter 112/146 - loss 0.01164341 - time (sec): 7.26 - samples/sec: 4811.61 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:46:51,179 epoch 8 - iter 126/146 - loss 0.01388481 - time (sec): 8.15 - samples/sec: 4728.40 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:46:52,051 epoch 8 - iter 140/146 - loss 0.01385814 - time (sec): 9.02 - samples/sec: 4750.18 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:46:52,378 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:52,378 EPOCH 8 done: loss 0.0141 - lr: 0.000007
2023-10-25 20:46:53,292 DEV : loss 0.1553896963596344 - f1-score (micro avg)  0.7527
2023-10-25 20:46:53,296 saving best model
2023-10-25 20:46:53,826 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:54,838 epoch 9 - iter 14/146 - loss 0.01917424 - time (sec): 1.01 - samples/sec: 4363.15 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:46:55,724 epoch 9 - iter 28/146 - loss 0.01227936 - time (sec): 1.90 - samples/sec: 4423.12 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:46:56,495 epoch 9 - iter 42/146 - loss 0.01288509 - time (sec): 2.67 - samples/sec: 4458.51 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:46:57,559 epoch 9 - iter 56/146 - loss 0.01164916 - time (sec): 3.73 - samples/sec: 4356.87 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:46:58,372 epoch 9 - iter 70/146 - loss 0.01131783 - time (sec): 4.54 - samples/sec: 4431.60 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:46:59,292 epoch 9 - iter 84/146 - loss 0.01232463 - time (sec): 5.46 - samples/sec: 4406.38 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:47:00,189 epoch 9 - iter 98/146 - loss 0.01239644 - time (sec): 6.36 - samples/sec: 4546.61 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:47:01,088 epoch 9 - iter 112/146 - loss 0.01154778 - time (sec): 7.26 - samples/sec: 4592.00 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:47:02,053 epoch 9 - iter 126/146 - loss 0.01104399 - time (sec): 8.22 - samples/sec: 4653.77 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:47:03,090 epoch 9 - iter 140/146 - loss 0.01095344 - time (sec): 9.26 - samples/sec: 4640.05 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:47:03,435 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:03,435 EPOCH 9 done: loss 0.0110 - lr: 0.000004
2023-10-25 20:47:04,346 DEV : loss 0.16054613888263702 - f1-score (micro avg)  0.7446
2023-10-25 20:47:04,350 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:05,164 epoch 10 - iter 14/146 - loss 0.00674318 - time (sec): 0.81 - samples/sec: 5173.53 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:47:06,134 epoch 10 - iter 28/146 - loss 0.01110079 - time (sec): 1.78 - samples/sec: 4683.12 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:47:07,014 epoch 10 - iter 42/146 - loss 0.01282647 - time (sec): 2.66 - samples/sec: 4633.70 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:47:07,950 epoch 10 - iter 56/146 - loss 0.01170567 - time (sec): 3.60 - samples/sec: 4666.99 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:47:08,947 epoch 10 - iter 70/146 - loss 0.01044850 - time (sec): 4.60 - samples/sec: 4570.03 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:47:09,806 epoch 10 - iter 84/146 - loss 0.00899261 - time (sec): 5.46 - samples/sec: 4505.29 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:47:10,843 epoch 10 - iter 98/146 - loss 0.00848585 - time (sec): 6.49 - samples/sec: 4560.37 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:47:11,802 epoch 10 - iter 112/146 - loss 0.00890020 - time (sec): 7.45 - samples/sec: 4616.28 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:47:12,761 epoch 10 - iter 126/146 - loss 0.00868290 - time (sec): 8.41 - samples/sec: 4583.94 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:47:13,584 epoch 10 - iter 140/146 - loss 0.00822908 - time (sec): 9.23 - samples/sec: 4587.58 - lr: 0.000000 - momentum: 0.000000
2023-10-25 20:47:13,961 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:13,961 EPOCH 10 done: loss 0.0079 - lr: 0.000000
2023-10-25 20:47:14,874 DEV : loss 0.16437558829784393 - f1-score (micro avg)  0.7414
2023-10-25 20:47:15,407 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:15,408 Loading model from best epoch ...
2023-10-25 20:47:17,312 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 20:47:18,849 Results:
- F-score (micro) 0.7801
- F-score (macro) 0.7068
- Accuracy 0.6671

By class:
              precision    recall  f1-score   support

         PER     0.7984    0.8649    0.8303       348
         LOC     0.7230    0.8199    0.7684       261
         ORG     0.5319    0.4808    0.5051        52
   HumanProd     0.6800    0.7727    0.7234        22

   micro avg     0.7477    0.8155    0.7801       683
   macro avg     0.6833    0.7346    0.7068       683
weighted avg     0.7455    0.8155    0.7785       683

2023-10-25 20:47:18,849 ----------------------------------------------------------------------------------------------------
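Note on the run above: `best-model.pt` is the checkpoint from the epoch with the highest dev micro F1, which is why "saving best model" appears only after epochs where the score improved and the final evaluation loads the epoch-8 model (dev F1 0.7527). The selection can be reproduced from the logged `DEV :` lines with a short script (a sketch; the regex assumes the exact line format used in this log):

```python
import re

# Per-epoch validation lines copied from the log above (epochs 1-10 in order).
log_excerpt = """\
DEV : loss 0.18878793716430664 - f1-score (micro avg)  0.4692
DEV : loss 0.10772357136011124 - f1-score (micro avg)  0.6757
DEV : loss 0.09623494744300842 - f1-score (micro avg)  0.7237
DEV : loss 0.10259346663951874 - f1-score (micro avg)  0.7352
DEV : loss 0.10988734662532806 - f1-score (micro avg)  0.7435
DEV : loss 0.1386864334344864 - f1-score (micro avg)  0.7348
DEV : loss 0.12989920377731323 - f1-score (micro avg)  0.7407
DEV : loss 0.1553896963596344 - f1-score (micro avg)  0.7527
DEV : loss 0.16054613888263702 - f1-score (micro avg)  0.7446
DEV : loss 0.16437558829784393 - f1-score (micro avg)  0.7414
"""

# Extract the micro-F1 value from each DEV line.
dev_f1 = [float(m.group(1))
          for m in re.finditer(r"f1-score \(micro avg\)\s+([\d.]+)", log_excerpt)]

# best-model.pt corresponds to the argmax of the dev scores
# (ties keep the earlier epoch, matching "saving best model" semantics).
best_idx = max(range(len(dev_f1)), key=dev_f1.__getitem__)
print(f"best epoch: {best_idx + 1}, dev micro F1: {dev_f1[best_idx]}")
```

Running this prints `best epoch: 8, dev micro F1: 0.7527`, consistent with the checkpoint the final evaluation loads.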
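The 17-entry tag dictionary reported by the tagger is the BIOES expansion of the corpus's four entity types (LOC, PER, ORG, HumanProd): one S/B/E/I tag per type plus the outside tag O, i.e. 4 x 4 + 1 = 17 (which also matches the `out_features=17` of the final linear layer). A minimal sketch reconstructing it in the order logged above:

```python
# Four entity types, in the order they appear in the logged dictionary.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]

# BIOES scheme: S(ingle), B(egin), E(nd), I(nside) per type, plus "O".
tags = ["O"] + [f"{pos}-{t}" for t in entity_types for pos in "SBEI"]

print(len(tags))   # 17
print(tags[:5])    # ['O', 'S-LOC', 'B-LOC', 'E-LOC', 'I-LOC']
```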
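The two headline F-scores in the final results follow from the by-class table: macro F1 is the unweighted mean of the four per-class F1 values, while micro F1 pools all predictions and is the harmonic mean of the pooled precision and recall (the "micro avg" row). A quick consistency check of the reported numbers:

```python
# Per-class (precision, recall, f1, support) from the final results table.
by_class = {
    "PER":       (0.7984, 0.8649, 0.8303, 348),
    "LOC":       (0.7230, 0.8199, 0.7684, 261),
    "ORG":       (0.5319, 0.4808, 0.5051,  52),
    "HumanProd": (0.6800, 0.7727, 0.7234,  22),
}

# Macro F1: unweighted mean over classes.
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# Micro F1: harmonic mean of the pooled precision/recall from the "micro avg" row.
micro_p, micro_r = 0.7477, 0.8155
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4))  # 0.7068, matching "F-score (macro)"
print(round(micro_f1, 4))  # 0.7801, matching "F-score (micro)"
```

Micro F1 (0.7801) sits well above macro F1 (0.7068) because the frequent PER and LOC classes score much higher than the 52-mention ORG class, which drags the unweighted average down.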