2023-10-18 17:43:04,080 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,080 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 17:43:04,080 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,080 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 17:43:04,080 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,080 Train: 3575 sentences
2023-10-18 17:43:04,080 (train_with_dev=False, train_with_test=False)
2023-10-18 17:43:04,080 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,080 Training Params:
2023-10-18 17:43:04,080 - learning_rate: "3e-05"
2023-10-18 17:43:04,080 - mini_batch_size: "8"
2023-10-18 17:43:04,080 - max_epochs: "10"
2023-10-18 17:43:04,080 - shuffle: "True"
2023-10-18 17:43:04,080 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,080 Plugins:
2023-10-18 17:43:04,080 - TensorboardLogger
2023-10-18 17:43:04,080 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 17:43:04,081 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,081 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 17:43:04,081 - metric: "('micro avg', 'f1-score')"
2023-10-18 17:43:04,081 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,081 Computation:
2023-10-18 17:43:04,081 - compute on device: cuda:0
2023-10-18 17:43:04,081 - embedding storage: none
2023-10-18 17:43:04,081 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,081 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 17:43:04,081 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,081 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:04,081 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 17:43:05,144 epoch 1 - iter 44/447 - loss 3.65841700 - time (sec): 1.06 - samples/sec: 7713.80 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:43:06,199 epoch 1 - iter 88/447 - loss 3.57458239 - time (sec): 2.12 - samples/sec: 7828.32 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:43:07,234 epoch 1 - iter 132/447 - loss 3.39036802 - time (sec): 3.15 - samples/sec: 7956.93 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:43:08,235 epoch 1 - iter 176/447 - loss 3.17329702 - time (sec): 4.15 - samples/sec: 8053.02 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:43:09,270 epoch 1 - iter 220/447 - loss 2.85941374 - time (sec): 5.19 - samples/sec: 8162.15 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:43:10,278 epoch 1 - iter 264/447 - loss 2.55730823 - time (sec): 6.20 - samples/sec: 8256.04 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:43:11,302 epoch 1 - iter 308/447 - loss 2.29715011 - time (sec): 7.22 - samples/sec: 8268.17 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:43:12,297 epoch 1 - iter 352/447 - loss 2.08270657 - time (sec): 8.22 - samples/sec: 8344.94 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:43:13,298 epoch 1 - iter 396/447 - loss 1.92246771 - time (sec): 9.22 - samples/sec: 8373.56 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:43:14,289 epoch 1 - iter 440/447 - loss 1.79687681 - time (sec): 10.21 - samples/sec: 8344.92 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:43:14,437 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:14,437 EPOCH 1 done: loss 1.7781 - lr: 0.000029
2023-10-18 17:43:16,615 DEV : loss 0.4775766432285309 - f1-score (micro avg) 0.0
2023-10-18 17:43:16,640 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:17,685 epoch 2 - iter 44/447 - loss 0.59589924 - time (sec): 1.04 - samples/sec: 9017.00 - lr: 0.000030 - momentum: 0.000000
2023-10-18 17:43:18,715 epoch 2 - iter 88/447 - loss 0.56401892 - time (sec): 2.07 - samples/sec: 8811.19 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:43:19,789 epoch 2 - iter 132/447 - loss 0.56470757 - time (sec): 3.15 - samples/sec: 8633.50 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:43:20,799 epoch 2 - iter 176/447 - loss 0.56520829 - time (sec): 4.16 - samples/sec: 8345.06 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:43:21,845 epoch 2 - iter 220/447 - loss 0.55623726 - time (sec): 5.20 - samples/sec: 8399.40 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:43:22,847 epoch 2 - iter 264/447 - loss 0.54920417 - time (sec): 6.21 - samples/sec: 8342.39 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:43:23,868 epoch 2 - iter 308/447 - loss 0.55616306 - time (sec): 7.23 - samples/sec: 8312.48 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:43:24,933 epoch 2 - iter 352/447 - loss 0.55121105 - time (sec): 8.29 - samples/sec: 8275.02 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:43:25,962 epoch 2 - iter 396/447 - loss 0.55138490 - time (sec): 9.32 - samples/sec: 8261.89 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:43:26,977 epoch 2 - iter 440/447 - loss 0.54548733 - time (sec): 10.34 - samples/sec: 8232.48 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:43:27,142 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:27,142 EPOCH 2 done: loss 0.5453 - lr: 0.000027
2023-10-18 17:43:32,386 DEV : loss 0.3954547643661499 - f1-score (micro avg) 0.0047
2023-10-18 17:43:32,412 saving best model
2023-10-18 17:43:32,446 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:33,438 epoch 3 - iter 44/447 - loss 0.46568998 - time (sec): 0.99 - samples/sec: 7873.33 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:43:34,448 epoch 3 - iter 88/447 - loss 0.48584638 - time (sec): 2.00 - samples/sec: 8243.38 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:43:35,426 epoch 3 - iter 132/447 - loss 0.48381208 - time (sec): 2.98 - samples/sec: 8146.00 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:43:36,433 epoch 3 - iter 176/447 - loss 0.46960679 - time (sec): 3.99 - samples/sec: 8330.61 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:43:37,453 epoch 3 - iter 220/447 - loss 0.46623827 - time (sec): 5.01 - samples/sec: 8371.67 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:43:38,479 epoch 3 - iter 264/447 - loss 0.46159907 - time (sec): 6.03 - samples/sec: 8435.20 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:43:39,455 epoch 3 - iter 308/447 - loss 0.45514585 - time (sec): 7.01 - samples/sec: 8383.41 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:43:40,425 epoch 3 - iter 352/447 - loss 0.45263927 - time (sec): 7.98 - samples/sec: 8415.36 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:43:41,417 epoch 3 - iter 396/447 - loss 0.45536448 - time (sec): 8.97 - samples/sec: 8428.60 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:43:42,459 epoch 3 - iter 440/447 - loss 0.45158447 - time (sec): 10.01 - samples/sec: 8515.27 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:43:42,605 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:42,606 EPOCH 3 done: loss 0.4501 - lr: 0.000023
2023-10-18 17:43:47,818 DEV : loss 0.34006714820861816 - f1-score (micro avg) 0.1762
2023-10-18 17:43:47,844 saving best model
2023-10-18 17:43:47,878 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:48,930 epoch 4 - iter 44/447 - loss 0.40096435 - time (sec): 1.05 - samples/sec: 8641.77 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:43:49,946 epoch 4 - iter 88/447 - loss 0.42313975 - time (sec): 2.07 - samples/sec: 8518.95 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:43:50,941 epoch 4 - iter 132/447 - loss 0.43615643 - time (sec): 3.06 - samples/sec: 8605.19 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:43:51,951 epoch 4 - iter 176/447 - loss 0.43038311 - time (sec): 4.07 - samples/sec: 8692.14 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:43:52,941 epoch 4 - iter 220/447 - loss 0.42491578 - time (sec): 5.06 - samples/sec: 8589.74 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:43:53,937 epoch 4 - iter 264/447 - loss 0.41934533 - time (sec): 6.06 - samples/sec: 8550.42 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:43:54,941 epoch 4 - iter 308/447 - loss 0.41228277 - time (sec): 7.06 - samples/sec: 8549.29 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:43:56,014 epoch 4 - iter 352/447 - loss 0.40713092 - time (sec): 8.14 - samples/sec: 8457.46 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:43:57,053 epoch 4 - iter 396/447 - loss 0.41039188 - time (sec): 9.17 - samples/sec: 8419.34 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:43:58,086 epoch 4 - iter 440/447 - loss 0.40812616 - time (sec): 10.21 - samples/sec: 8356.67 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:43:58,234 ----------------------------------------------------------------------------------------------------
2023-10-18 17:43:58,234 EPOCH 4 done: loss 0.4086 - lr: 0.000020
2023-10-18 17:44:03,498 DEV : loss 0.33860334753990173 - f1-score (micro avg) 0.2534
2023-10-18 17:44:03,524 saving best model
2023-10-18 17:44:03,556 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:04,590 epoch 5 - iter 44/447 - loss 0.36098289 - time (sec): 1.03 - samples/sec: 7878.70 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:44:05,601 epoch 5 - iter 88/447 - loss 0.39788915 - time (sec): 2.04 - samples/sec: 7724.01 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:44:06,617 epoch 5 - iter 132/447 - loss 0.37581248 - time (sec): 3.06 - samples/sec: 7780.33 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:44:07,695 epoch 5 - iter 176/447 - loss 0.37494525 - time (sec): 4.14 - samples/sec: 8048.39 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:44:08,706 epoch 5 - iter 220/447 - loss 0.37534730 - time (sec): 5.15 - samples/sec: 8161.94 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:44:09,732 epoch 5 - iter 264/447 - loss 0.37572997 - time (sec): 6.18 - samples/sec: 8254.05 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:44:10,768 epoch 5 - iter 308/447 - loss 0.37803576 - time (sec): 7.21 - samples/sec: 8247.46 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:44:11,797 epoch 5 - iter 352/447 - loss 0.38296535 - time (sec): 8.24 - samples/sec: 8275.23 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:44:12,634 epoch 5 - iter 396/447 - loss 0.38360699 - time (sec): 9.08 - samples/sec: 8430.98 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:44:13,491 epoch 5 - iter 440/447 - loss 0.38337258 - time (sec): 9.93 - samples/sec: 8583.20 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:44:13,633 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:13,633 EPOCH 5 done: loss 0.3872 - lr: 0.000017
2023-10-18 17:44:18,644 DEV : loss 0.3213781416416168 - f1-score (micro avg) 0.2904
2023-10-18 17:44:18,669 saving best model
2023-10-18 17:44:18,713 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:19,667 epoch 6 - iter 44/447 - loss 0.39724277 - time (sec): 0.95 - samples/sec: 8724.03 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:44:20,673 epoch 6 - iter 88/447 - loss 0.35175566 - time (sec): 1.96 - samples/sec: 8833.06 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:44:21,765 epoch 6 - iter 132/447 - loss 0.33678911 - time (sec): 3.05 - samples/sec: 8716.95 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:44:22,726 epoch 6 - iter 176/447 - loss 0.35169461 - time (sec): 4.01 - samples/sec: 8618.93 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:44:24,015 epoch 6 - iter 220/447 - loss 0.36233877 - time (sec): 5.30 - samples/sec: 8117.05 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:44:24,995 epoch 6 - iter 264/447 - loss 0.36252351 - time (sec): 6.28 - samples/sec: 8149.49 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:44:25,991 epoch 6 - iter 308/447 - loss 0.36318080 - time (sec): 7.28 - samples/sec: 8180.81 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:44:27,018 epoch 6 - iter 352/447 - loss 0.36149952 - time (sec): 8.31 - samples/sec: 8230.02 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:44:28,069 epoch 6 - iter 396/447 - loss 0.36522263 - time (sec): 9.36 - samples/sec: 8223.75 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:44:29,035 epoch 6 - iter 440/447 - loss 0.36696941 - time (sec): 10.32 - samples/sec: 8242.81 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:44:29,194 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:29,194 EPOCH 6 done: loss 0.3667 - lr: 0.000013
2023-10-18 17:44:34,156 DEV : loss 0.3145124316215515 - f1-score (micro avg) 0.3153
2023-10-18 17:44:34,182 saving best model
2023-10-18 17:44:34,215 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:35,237 epoch 7 - iter 44/447 - loss 0.33254949 - time (sec): 1.02 - samples/sec: 8026.91 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:44:36,289 epoch 7 - iter 88/447 - loss 0.33867551 - time (sec): 2.07 - samples/sec: 8083.21 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:44:37,293 epoch 7 - iter 132/447 - loss 0.34664698 - time (sec): 3.08 - samples/sec: 7937.45 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:44:38,336 epoch 7 - iter 176/447 - loss 0.35008385 - time (sec): 4.12 - samples/sec: 8059.32 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:44:39,361 epoch 7 - iter 220/447 - loss 0.34738790 - time (sec): 5.15 - samples/sec: 8109.70 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:44:40,353 epoch 7 - iter 264/447 - loss 0.34675104 - time (sec): 6.14 - samples/sec: 8102.12 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:44:41,348 epoch 7 - iter 308/447 - loss 0.35204448 - time (sec): 7.13 - samples/sec: 8185.69 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:44:42,408 epoch 7 - iter 352/447 - loss 0.35087997 - time (sec): 8.19 - samples/sec: 8314.56 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:44:43,385 epoch 7 - iter 396/447 - loss 0.34971556 - time (sec): 9.17 - samples/sec: 8296.34 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:44:44,485 epoch 7 - iter 440/447 - loss 0.35367941 - time (sec): 10.27 - samples/sec: 8320.95 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:44:44,643 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:44,644 EPOCH 7 done: loss 0.3538 - lr: 0.000010
2023-10-18 17:44:49,903 DEV : loss 0.30769509077072144 - f1-score (micro avg) 0.3193
2023-10-18 17:44:49,929 saving best model
2023-10-18 17:44:49,962 ----------------------------------------------------------------------------------------------------
2023-10-18 17:44:50,958 epoch 8 - iter 44/447 - loss 0.33422276 - time (sec): 0.99 - samples/sec: 8165.06 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:44:51,968 epoch 8 - iter 88/447 - loss 0.33262216 - time (sec): 2.01 - samples/sec: 8318.97 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:44:52,970 epoch 8 - iter 132/447 - loss 0.33401019 - time (sec): 3.01 - samples/sec: 8188.09 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:44:54,003 epoch 8 - iter 176/447 - loss 0.34047320 - time (sec): 4.04 - samples/sec: 8112.27 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:44:55,056 epoch 8 - iter 220/447 - loss 0.34431160 - time (sec): 5.09 - samples/sec: 8109.95 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:44:56,114 epoch 8 - iter 264/447 - loss 0.35113437 - time (sec): 6.15 - samples/sec: 8116.36 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:44:57,217 epoch 8 - iter 308/447 - loss 0.34790518 - time (sec): 7.25 - samples/sec: 8267.13 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:44:58,227 epoch 8 - iter 352/447 - loss 0.35030628 - time (sec): 8.26 - samples/sec: 8323.70 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:44:59,235 epoch 8 - iter 396/447 - loss 0.35029828 - time (sec): 9.27 - samples/sec: 8291.91 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:45:00,271 epoch 8 - iter 440/447 - loss 0.34699832 - time (sec): 10.31 - samples/sec: 8284.79 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:45:00,431 ----------------------------------------------------------------------------------------------------
2023-10-18 17:45:00,431 EPOCH 8 done: loss 0.3456 - lr: 0.000007
2023-10-18 17:45:05,789 DEV : loss 0.3056319057941437 - f1-score (micro avg) 0.3282
2023-10-18 17:45:05,815 saving best model
2023-10-18 17:45:05,850 ----------------------------------------------------------------------------------------------------
2023-10-18 17:45:06,908 epoch 9 - iter 44/447 - loss 0.33033877 - time (sec): 1.06 - samples/sec: 7764.12 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:45:07,972 epoch 9 - iter 88/447 - loss 0.32438109 - time (sec): 2.12 - samples/sec: 7960.53 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:45:08,980 epoch 9 - iter 132/447 - loss 0.33253674 - time (sec): 3.13 - samples/sec: 8029.34 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:45:10,020 epoch 9 - iter 176/447 - loss 0.32688422 - time (sec): 4.17 - samples/sec: 8005.44 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:45:11,047 epoch 9 - iter 220/447 - loss 0.32819605 - time (sec): 5.20 - samples/sec: 7999.77 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:45:12,072 epoch 9 - iter 264/447 - loss 0.32313458 - time (sec): 6.22 - samples/sec: 8130.90 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:45:13,119 epoch 9 - iter 308/447 - loss 0.32996163 - time (sec): 7.27 - samples/sec: 8219.22 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:45:14,091 epoch 9 - iter 352/447 - loss 0.33131371 - time (sec): 8.24 - samples/sec: 8213.97 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:45:15,118 epoch 9 - iter 396/447 - loss 0.33364538 - time (sec): 9.27 - samples/sec: 8198.10 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:45:16,158 epoch 9 - iter 440/447 - loss 0.33168082 - time (sec): 10.31 - samples/sec: 8290.47 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:45:16,310 ----------------------------------------------------------------------------------------------------
2023-10-18 17:45:16,310 EPOCH 9 done: loss 0.3324 - lr: 0.000003
2023-10-18 17:45:21,581 DEV : loss 0.30642732977867126 - f1-score (micro avg) 0.3224
2023-10-18 17:45:21,606 ----------------------------------------------------------------------------------------------------
2023-10-18 17:45:22,469 epoch 10 - iter 44/447 - loss 0.31571016 - time (sec): 0.86 - samples/sec: 9189.95 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:45:23,521 epoch 10 - iter 88/447 - loss 0.30383316 - time (sec): 1.91 - samples/sec: 8967.01 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:45:24,548 epoch 10 - iter 132/447 - loss 0.31717265 - time (sec): 2.94 - samples/sec: 8708.30 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:45:25,537 epoch 10 - iter 176/447 - loss 0.32673786 - time (sec): 3.93 - samples/sec: 8612.34 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:45:26,610 epoch 10 - iter 220/447 - loss 0.33065968 - time (sec): 5.00 - samples/sec: 8635.05 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:45:27,600 epoch 10 - iter 264/447 - loss 0.32901583 - time (sec): 5.99 - samples/sec: 8607.52 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:45:28,597 epoch 10 - iter 308/447 - loss 0.33533615 - time (sec): 6.99 - samples/sec: 8580.40 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:45:29,593 epoch 10 - iter 352/447 - loss 0.33627064 - time (sec): 7.99 - samples/sec: 8559.46 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:45:30,644 epoch 10 - iter 396/447 - loss 0.33572652 - time (sec): 9.04 - samples/sec: 8512.58 - lr: 0.000000 - momentum: 0.000000
2023-10-18 17:45:31,694 epoch 10 - iter 440/447 - loss 0.33227974 - time (sec): 10.09 - samples/sec: 8452.37 - lr: 0.000000 - momentum: 0.000000
2023-10-18 17:45:31,857 ----------------------------------------------------------------------------------------------------
2023-10-18 17:45:31,857 EPOCH 10 done: loss 0.3314 - lr: 0.000000
2023-10-18 17:45:36,818 DEV : loss 0.3060940206050873 - f1-score (micro avg) 0.3249
2023-10-18 17:45:36,875 ----------------------------------------------------------------------------------------------------
2023-10-18 17:45:36,875 Loading model from best epoch ...
2023-10-18 17:45:36,950 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-18 17:45:39,234 
Results:
- F-score (micro) 0.3356
- F-score (macro) 0.1275
- Accuracy 0.2112

By class:
              precision    recall  f1-score   support

         loc     0.4857    0.5117    0.4984       596
        pers     0.1595    0.1231    0.1390       333
         org     0.0000    0.0000    0.0000       132
        prod     0.0000    0.0000    0.0000        66
        time     0.0000    0.0000    0.0000        49

   micro avg     0.3905    0.2942    0.3356      1176
   macro avg     0.1290    0.1270    0.1275      1176
weighted avg     0.2913    0.2942    0.2919      1176

2023-10-18 17:45:39,234 ----------------------------------------------------------------------------------------------------
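
The summary scores in the final report can be cross-checked against the per-class rows. The following is a minimal sketch, not part of the training run: it assumes the micro F-score is the harmonic mean of the reported micro precision and recall, and that the macro F-score is the unweighted mean of the five per-class F1 values. The helper name `f1` and the dictionary `per_class_f1` are introduced here for illustration only; the numbers are copied from the table above.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class F1 values as reported in the "By class" table of the log.
per_class_f1 = {
    "loc": 0.4984,
    "pers": 0.1390,
    "org": 0.0000,
    "prod": 0.0000,
    "time": 0.0000,
}

# Micro F1 recomputed from the reported micro-average precision and recall.
micro_f1 = f1(0.3905, 0.2942)

# Macro F1 as the unweighted mean over the five entity classes.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(f"micro F1: {micro_f1:.4f}")
print(f"macro F1: {macro_f1:.4f}")
```

Both recomputed values agree with the reported summary to four decimals, which shows how far the macro score is pulled down by the three classes (org, prod, time) that score zero.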