2023-10-18 18:21:43,173 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,173 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 18:21:43,173 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,173 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 18:21:43,173 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,173 Train:  3575 sentences
2023-10-18 18:21:43,173         (train_with_dev=False, train_with_test=False)
2023-10-18 18:21:43,173 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,173 Training Params:
2023-10-18 18:21:43,173  - learning_rate: "5e-05"
2023-10-18 18:21:43,173  - mini_batch_size: "8"
2023-10-18 18:21:43,173  - max_epochs: "10"
2023-10-18 18:21:43,173  - shuffle: "True"
2023-10-18 18:21:43,173 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,173 Plugins:
2023-10-18 18:21:43,173  - TensorboardLogger
2023-10-18 18:21:43,173  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:21:43,174 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,174 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:21:43,174  - metric: "('micro avg', 'f1-score')"
2023-10-18 18:21:43,174 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,174 Computation:
2023-10-18 18:21:43,174  - compute on device: cuda:0
2023-10-18 18:21:43,174  - embedding storage: none
2023-10-18 18:21:43,174 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,174 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 18:21:43,174 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,174 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:43,174 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 18:21:44,063 epoch 1 - iter 44/447 - loss 4.23649821 - time (sec): 0.89 - samples/sec: 9205.03 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:21:45,001 epoch 1 - iter 88/447 - loss 4.09913748 - time (sec): 1.83 - samples/sec: 9161.65 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:21:46,122 epoch 1 - iter 132/447 - loss 3.70752803 - time (sec): 2.95 - samples/sec: 8885.41 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:21:47,135 epoch 1 - iter 176/447 - loss 3.33851307 - time (sec): 3.96 - samples/sec: 8897.63 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:21:48,153 epoch 1 - iter 220/447 - loss 2.91921810 - time (sec): 4.98 - samples/sec: 8929.24 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:21:49,138 epoch 1 - iter 264/447 - loss 2.57172108 - time (sec): 5.96 - samples/sec: 8897.72 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:21:50,118 epoch 1 - iter 308/447 - loss 2.32138092 - time (sec): 6.94 - samples/sec: 8782.67 - lr: 0.000034 - momentum: 0.000000
2023-10-18 18:21:51,110 epoch 1 - iter 352/447 - loss 2.12444880 - time (sec): 7.94 - samples/sec: 8684.37 - lr: 0.000039 - momentum: 0.000000
2023-10-18 18:21:52,115 epoch 1 - iter 396/447 - loss 1.96997781 - time (sec): 8.94 - samples/sec: 8592.80 - lr: 0.000044 - momentum: 0.000000
2023-10-18 18:21:53,135 epoch 1 - iter 440/447 - loss 1.83051644 - time (sec): 9.96 - samples/sec: 8560.53 - lr: 0.000049 - momentum: 0.000000
2023-10-18 18:21:53,279 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:53,280 EPOCH 1 done: loss 1.8109 - lr: 0.000049
2023-10-18 18:21:55,566 DEV : loss 0.44877171516418457 - f1-score (micro avg)  0.0
2023-10-18 18:21:55,589 ----------------------------------------------------------------------------------------------------
2023-10-18 18:21:56,612 epoch 2 - iter 44/447 - loss 0.60155493 - time (sec): 1.02 - samples/sec: 8574.38 - lr: 0.000049 - momentum: 0.000000
2023-10-18 18:21:57,597 epoch 2 - iter 88/447 - loss 0.56142538 - time (sec): 2.01 - samples/sec: 8504.11 - lr: 0.000049 - momentum: 0.000000
2023-10-18 18:21:58,565 epoch 2 - iter 132/447 - loss 0.55726995 - time (sec): 2.98 - samples/sec: 8469.91 - lr: 0.000048 - momentum: 0.000000
2023-10-18 18:21:59,547 epoch 2 - iter 176/447 - loss 0.54406916 - time (sec): 3.96 - samples/sec: 8495.52 - lr: 0.000048 - momentum: 0.000000
2023-10-18 18:22:00,547 epoch 2 - iter 220/447 - loss 0.53559205 - time (sec): 4.96 - samples/sec: 8474.13 - lr: 0.000047 - momentum: 0.000000
2023-10-18 18:22:01,542 epoch 2 - iter 264/447 - loss 0.52587561 - time (sec): 5.95 - samples/sec: 8323.02 - lr: 0.000047 - momentum: 0.000000
2023-10-18 18:22:02,554 epoch 2 - iter 308/447 - loss 0.51234790 - time (sec): 6.96 - samples/sec: 8349.78 - lr: 0.000046 - momentum: 0.000000
2023-10-18 18:22:03,602 epoch 2 - iter 352/447 - loss 0.49579329 - time (sec): 8.01 - samples/sec: 8484.55 - lr: 0.000046 - momentum: 0.000000
2023-10-18 18:22:04,613 epoch 2 - iter 396/447 - loss 0.49021091 - time (sec): 9.02 - samples/sec: 8540.08 - lr: 0.000045 - momentum: 0.000000
2023-10-18 18:22:05,637 epoch 2 - iter 440/447 - loss 0.48330812 - time (sec): 10.05 - samples/sec: 8495.76 - lr: 0.000045 - momentum: 0.000000
2023-10-18 18:22:05,784 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:05,784 EPOCH 2 done: loss 0.4835 - lr: 0.000045
2023-10-18 18:22:11,082 DEV : loss 0.32866111397743225 - f1-score (micro avg)  0.2007
2023-10-18 18:22:11,105 saving best model
2023-10-18 18:22:11,142 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:12,115 epoch 3 - iter 44/447 - loss 0.40849086 - time (sec): 0.97 - samples/sec: 9093.83 - lr: 0.000044 - momentum: 0.000000
2023-10-18 18:22:13,082 epoch 3 - iter 88/447 - loss 0.39700437 - time (sec): 1.94 - samples/sec: 8792.81 - lr: 0.000043 - momentum: 0.000000
2023-10-18 18:22:14,073 epoch 3 - iter 132/447 - loss 0.41180397 - time (sec): 2.93 - samples/sec: 8767.84 - lr: 0.000043 - momentum: 0.000000
2023-10-18 18:22:15,057 epoch 3 - iter 176/447 - loss 0.41517633 - time (sec): 3.91 - samples/sec: 8603.19 - lr: 0.000042 - momentum: 0.000000
2023-10-18 18:22:16,077 epoch 3 - iter 220/447 - loss 0.40697778 - time (sec): 4.93 - samples/sec: 8582.36 - lr: 0.000042 - momentum: 0.000000
2023-10-18 18:22:17,133 epoch 3 - iter 264/447 - loss 0.40336023 - time (sec): 5.99 - samples/sec: 8712.21 - lr: 0.000041 - momentum: 0.000000
2023-10-18 18:22:18,123 epoch 3 - iter 308/447 - loss 0.40473996 - time (sec): 6.98 - samples/sec: 8727.05 - lr: 0.000041 - momentum: 0.000000
2023-10-18 18:22:19,105 epoch 3 - iter 352/447 - loss 0.40234530 - time (sec): 7.96 - samples/sec: 8674.70 - lr: 0.000040 - momentum: 0.000000
2023-10-18 18:22:20,067 epoch 3 - iter 396/447 - loss 0.40175952 - time (sec): 8.92 - samples/sec: 8624.41 - lr: 0.000040 - momentum: 0.000000
2023-10-18 18:22:21,055 epoch 3 - iter 440/447 - loss 0.40155202 - time (sec): 9.91 - samples/sec: 8597.01 - lr: 0.000039 - momentum: 0.000000
2023-10-18 18:22:21,214 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:21,214 EPOCH 3 done: loss 0.3996 - lr: 0.000039
2023-10-18 18:22:26,494 DEV : loss 0.31150904297828674 - f1-score (micro avg)  0.2895
2023-10-18 18:22:26,518 saving best model
2023-10-18 18:22:26,553 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:27,547 epoch 4 - iter 44/447 - loss 0.31938505 - time (sec): 0.99 - samples/sec: 7768.01 - lr: 0.000038 - momentum: 0.000000
2023-10-18 18:22:28,522 epoch 4 - iter 88/447 - loss 0.35629034 - time (sec): 1.97 - samples/sec: 8020.00 - lr: 0.000038 - momentum: 0.000000
2023-10-18 18:22:29,360 epoch 4 - iter 132/447 - loss 0.37841602 - time (sec): 2.81 - samples/sec: 8550.59 - lr: 0.000037 - momentum: 0.000000
2023-10-18 18:22:30,205 epoch 4 - iter 176/447 - loss 0.37884124 - time (sec): 3.65 - samples/sec: 8816.64 - lr: 0.000037 - momentum: 0.000000
2023-10-18 18:22:31,154 epoch 4 - iter 220/447 - loss 0.36800228 - time (sec): 4.60 - samples/sec: 9070.98 - lr: 0.000036 - momentum: 0.000000
2023-10-18 18:22:32,140 epoch 4 - iter 264/447 - loss 0.35905570 - time (sec): 5.59 - samples/sec: 9038.17 - lr: 0.000036 - momentum: 0.000000
2023-10-18 18:22:33,228 epoch 4 - iter 308/447 - loss 0.35834364 - time (sec): 6.68 - samples/sec: 8997.15 - lr: 0.000035 - momentum: 0.000000
2023-10-18 18:22:34,223 epoch 4 - iter 352/447 - loss 0.35753489 - time (sec): 7.67 - samples/sec: 8989.30 - lr: 0.000035 - momentum: 0.000000
2023-10-18 18:22:35,209 epoch 4 - iter 396/447 - loss 0.35399316 - time (sec): 8.66 - samples/sec: 8900.47 - lr: 0.000034 - momentum: 0.000000
2023-10-18 18:22:36,209 epoch 4 - iter 440/447 - loss 0.35960244 - time (sec): 9.66 - samples/sec: 8828.74 - lr: 0.000033 - momentum: 0.000000
2023-10-18 18:22:36,366 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:36,366 EPOCH 4 done: loss 0.3601 - lr: 0.000033
2023-10-18 18:22:41,343 DEV : loss 0.3030697703361511 - f1-score (micro avg)  0.3261
2023-10-18 18:22:41,367 saving best model
2023-10-18 18:22:41,411 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:42,431 epoch 5 - iter 44/447 - loss 0.32935993 - time (sec): 1.02 - samples/sec: 8650.32 - lr: 0.000033 - momentum: 0.000000
2023-10-18 18:22:43,415 epoch 5 - iter 88/447 - loss 0.31978926 - time (sec): 2.00 - samples/sec: 8326.37 - lr: 0.000032 - momentum: 0.000000
2023-10-18 18:22:44,768 epoch 5 - iter 132/447 - loss 0.32434382 - time (sec): 3.36 - samples/sec: 7570.99 - lr: 0.000032 - momentum: 0.000000
2023-10-18 18:22:45,779 epoch 5 - iter 176/447 - loss 0.32626966 - time (sec): 4.37 - samples/sec: 7906.10 - lr: 0.000031 - momentum: 0.000000
2023-10-18 18:22:46,731 epoch 5 - iter 220/447 - loss 0.32079104 - time (sec): 5.32 - samples/sec: 8117.80 - lr: 0.000031 - momentum: 0.000000
2023-10-18 18:22:47,707 epoch 5 - iter 264/447 - loss 0.32647442 - time (sec): 6.30 - samples/sec: 8199.87 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:22:48,633 epoch 5 - iter 308/447 - loss 0.33184709 - time (sec): 7.22 - samples/sec: 8204.55 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:22:49,467 epoch 5 - iter 352/447 - loss 0.33135495 - time (sec): 8.06 - samples/sec: 8350.69 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:22:50,382 epoch 5 - iter 396/447 - loss 0.32921133 - time (sec): 8.97 - samples/sec: 8431.60 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:22:51,434 epoch 5 - iter 440/447 - loss 0.33252622 - time (sec): 10.02 - samples/sec: 8520.37 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:22:51,599 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:51,599 EPOCH 5 done: loss 0.3314 - lr: 0.000028
2023-10-18 18:22:56,580 DEV : loss 0.2966790497303009 - f1-score (micro avg)  0.3432
2023-10-18 18:22:56,605 saving best model
2023-10-18 18:22:56,646 ----------------------------------------------------------------------------------------------------
2023-10-18 18:22:57,639 epoch 6 - iter 44/447 - loss 0.32205009 - time (sec): 0.99 - samples/sec: 7882.22 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:22:58,675 epoch 6 - iter 88/447 - loss 0.28851192 - time (sec): 2.03 - samples/sec: 8518.03 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:22:59,657 epoch 6 - iter 132/447 - loss 0.27935474 - time (sec): 3.01 - samples/sec: 8308.57 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:23:00,674 epoch 6 - iter 176/447 - loss 0.29305316 - time (sec): 4.03 - samples/sec: 8517.53 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:23:01,689 epoch 6 - iter 220/447 - loss 0.29344782 - time (sec): 5.04 - samples/sec: 8600.65 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:23:02,693 epoch 6 - iter 264/447 - loss 0.29557160 - time (sec): 6.05 - samples/sec: 8512.17 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:23:03,653 epoch 6 - iter 308/447 - loss 0.29555013 - time (sec): 7.01 - samples/sec: 8482.76 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:23:04,667 epoch 6 - iter 352/447 - loss 0.29649384 - time (sec): 8.02 - samples/sec: 8570.31 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:23:05,648 epoch 6 - iter 396/447 - loss 0.29505917 - time (sec): 9.00 - samples/sec: 8570.39 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:23:06,640 epoch 6 - iter 440/447 - loss 0.30654783 - time (sec): 9.99 - samples/sec: 8556.98 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:23:06,794 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:06,794 EPOCH 6 done: loss 0.3076 - lr: 0.000022
2023-10-18 18:23:12,104 DEV : loss 0.2881721258163452 - f1-score (micro avg)  0.3428
2023-10-18 18:23:12,130 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:13,173 epoch 7 - iter 44/447 - loss 0.24532264 - time (sec): 1.04 - samples/sec: 8882.55 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:23:14,184 epoch 7 - iter 88/447 - loss 0.27568476 - time (sec): 2.05 - samples/sec: 8486.80 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:23:15,283 epoch 7 - iter 132/447 - loss 0.29794749 - time (sec): 3.15 - samples/sec: 8599.07 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:23:16,312 epoch 7 - iter 176/447 - loss 0.29956954 - time (sec): 4.18 - samples/sec: 8565.70 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:23:17,376 epoch 7 - iter 220/447 - loss 0.30213892 - time (sec): 5.25 - samples/sec: 8384.24 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:23:18,349 epoch 7 - iter 264/447 - loss 0.30357337 - time (sec): 6.22 - samples/sec: 8377.41 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:23:19,377 epoch 7 - iter 308/447 - loss 0.29758405 - time (sec): 7.25 - samples/sec: 8318.63 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:23:20,358 epoch 7 - iter 352/447 - loss 0.29898433 - time (sec): 8.23 - samples/sec: 8353.32 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:23:21,384 epoch 7 - iter 396/447 - loss 0.29612803 - time (sec): 9.25 - samples/sec: 8325.21 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:23:22,424 epoch 7 - iter 440/447 - loss 0.29754448 - time (sec): 10.29 - samples/sec: 8276.01 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:23:22,589 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:22,589 EPOCH 7 done: loss 0.2978 - lr: 0.000017
2023-10-18 18:23:27,857 DEV : loss 0.2950444519519806 - f1-score (micro avg)  0.3478
2023-10-18 18:23:27,881 saving best model
2023-10-18 18:23:27,915 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:28,944 epoch 8 - iter 44/447 - loss 0.29022344 - time (sec): 1.03 - samples/sec: 9194.27 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:23:29,939 epoch 8 - iter 88/447 - loss 0.27818831 - time (sec): 2.02 - samples/sec: 8694.64 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:23:30,925 epoch 8 - iter 132/447 - loss 0.29183782 - time (sec): 3.01 - samples/sec: 8683.21 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:23:31,934 epoch 8 - iter 176/447 - loss 0.29768575 - time (sec): 4.02 - samples/sec: 8511.47 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:23:32,961 epoch 8 - iter 220/447 - loss 0.30091732 - time (sec): 5.05 - samples/sec: 8435.25 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:23:33,943 epoch 8 - iter 264/447 - loss 0.29597312 - time (sec): 6.03 - samples/sec: 8442.64 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:23:34,941 epoch 8 - iter 308/447 - loss 0.29271218 - time (sec): 7.03 - samples/sec: 8347.80 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:23:35,941 epoch 8 - iter 352/447 - loss 0.29100478 - time (sec): 8.03 - samples/sec: 8386.27 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:23:36,957 epoch 8 - iter 396/447 - loss 0.28742095 - time (sec): 9.04 - samples/sec: 8390.58 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:23:38,050 epoch 8 - iter 440/447 - loss 0.28930722 - time (sec): 10.13 - samples/sec: 8390.30 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:23:38,225 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:38,225 EPOCH 8 done: loss 0.2878 - lr: 0.000011
2023-10-18 18:23:43,523 DEV : loss 0.29474112391471863 - f1-score (micro avg)  0.3473
2023-10-18 18:23:43,547 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:44,548 epoch 9 - iter 44/447 - loss 0.25740795 - time (sec): 1.00 - samples/sec: 8141.93 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:23:45,525 epoch 9 - iter 88/447 - loss 0.27278712 - time (sec): 1.98 - samples/sec: 7858.76 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:23:46,560 epoch 9 - iter 132/447 - loss 0.26541675 - time (sec): 3.01 - samples/sec: 8036.26 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:23:47,629 epoch 9 - iter 176/447 - loss 0.27555671 - time (sec): 4.08 - samples/sec: 7930.21 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:23:48,717 epoch 9 - iter 220/447 - loss 0.27547894 - time (sec): 5.17 - samples/sec: 8030.67 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:23:49,774 epoch 9 - iter 264/447 - loss 0.27439270 - time (sec): 6.23 - samples/sec: 8141.15 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:23:50,822 epoch 9 - iter 308/447 - loss 0.27018164 - time (sec): 7.27 - samples/sec: 8134.64 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:23:51,870 epoch 9 - iter 352/447 - loss 0.26807567 - time (sec): 8.32 - samples/sec: 8095.54 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:23:52,962 epoch 9 - iter 396/447 - loss 0.27305200 - time (sec): 9.41 - samples/sec: 8151.44 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:23:53,986 epoch 9 - iter 440/447 - loss 0.27534298 - time (sec): 10.44 - samples/sec: 8199.93 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:23:54,132 ----------------------------------------------------------------------------------------------------
2023-10-18 18:23:54,133 EPOCH 9 done: loss 0.2753 - lr: 0.000006
2023-10-18 18:23:59,472 DEV : loss 0.2967955470085144 - f1-score (micro avg)  0.3451
2023-10-18 18:23:59,496 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:00,570 epoch 10 - iter 44/447 - loss 0.31147788 - time (sec): 1.07 - samples/sec: 7567.81 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:24:01,581 epoch 10 - iter 88/447 - loss 0.29350164 - time (sec): 2.08 - samples/sec: 7864.46 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:24:02,585 epoch 10 - iter 132/447 - loss 0.27414972 - time (sec): 3.09 - samples/sec: 7968.43 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:24:03,541 epoch 10 - iter 176/447 - loss 0.27134812 - time (sec): 4.05 - samples/sec: 8107.61 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:24:04,542 epoch 10 - iter 220/447 - loss 0.27511632 - time (sec): 5.05 - samples/sec: 8012.55 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:24:05,574 epoch 10 - iter 264/447 - loss 0.26854414 - time (sec): 6.08 - samples/sec: 8070.76 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:24:06,537 epoch 10 - iter 308/447 - loss 0.27296262 - time (sec): 7.04 - samples/sec: 8110.53 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:24:07,641 epoch 10 - iter 352/447 - loss 0.27183740 - time (sec): 8.14 - samples/sec: 8302.87 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:24:08,693 epoch 10 - iter 396/447 - loss 0.27614075 - time (sec): 9.20 - samples/sec: 8270.70 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:24:09,741 epoch 10 - iter 440/447 - loss 0.27640189 - time (sec): 10.24 - samples/sec: 8271.79 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:24:09,932 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:09,932 EPOCH 10 done: loss 0.2761 - lr: 0.000000
2023-10-18 18:24:14,905 DEV : loss 0.2940215766429901 - f1-score (micro avg)  0.3506
2023-10-18 18:24:14,929 saving best model
2023-10-18 18:24:14,992 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:14,992 Loading model from best epoch ...
2023-10-18 18:24:15,067 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-18 18:24:17,305 Results:
- F-score (micro) 0.3601
- F-score (macro) 0.1639
- Accuracy 0.2303

By class:
              precision    recall  f1-score   support

         loc     0.5196    0.5554    0.5369       596
        pers     0.1712    0.2282    0.1956       333
         org     0.0000    0.0000    0.0000       132
        time     0.1500    0.0612    0.0870        49
        prod     0.0000    0.0000    0.0000        66

   micro avg     0.3724    0.3486    0.3601      1176
   macro avg     0.1682    0.1690    0.1639      1176
weighted avg     0.3181    0.3486    0.3311      1176
2023-10-18 18:24:17,305 ----------------------------------------------------------------------------------------------------
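Note on the learning-rate column: the log lists the LinearScheduler plugin with warmup_fraction '0.1', and the per-iteration lr values above are consistent with a linear warmup from 0 to the peak 5e-05 over the first 10% of steps, followed by linear decay to 0. The sketch below reproduces that trajectory in plain Python; it is an illustration of the schedule shape inferred from the log, not Flair's actual LinearScheduler code, and the step count assumes 447 batches x 10 epochs = 4470 optimizer steps.

```python
def linear_schedule_lr(step, total_steps=447 * 10, peak_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (schedule shape inferred from the log)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 447 steps for this run
    if step < warmup_steps:
        return peak_lr * step / warmup_steps           # ramp up during warmup
    # decay linearly from peak_lr at end of warmup to 0 at the final step
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

This matches the logged values, e.g. lr 0.000005 at epoch 1 iter 44 (step 44, mid-warmup), lr 0.000049 at epoch 1 iter 440, and lr 0.000000 at the end of epoch 10.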
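The model repr at the top of the log describes a very small BERT (2 layers, hidden size 128, vocabulary 32001) with a 21-tag linear head. Its parameter count can be estimated directly from the printed shapes; the sketch below does this arithmetic, counting only the modules shown in the repr and assuming each LayerNorm has a weight and bias of the hidden size (the total is a derived estimate, not a figure stated in the log).

```python
# Shapes taken from the printed model repr
V, P, T = 32001, 512, 2      # word / position / token-type vocab sizes
H, I = 128, 512              # hidden size, intermediate (FFN) size
L, TAGS = 2, 21              # encoder layers, output tags

def bert_layer_params():
    qkv = 3 * (H * H + H)               # query/key/value projections
    attn_out = H * H + H                # BertSelfOutput dense
    ffn = (H * I + I) + (I * H + H)     # intermediate + output dense
    norms = 2 * 2 * H                   # two LayerNorms, weight + bias each
    return qkv + attn_out + ffn + norms

total = (
    V * H + P * H + T * H + 2 * H       # embeddings + embedding LayerNorm
    + L * bert_layer_params()           # encoder stack
    + (H * H + H)                       # pooler dense
    + (H * TAGS + TAGS)                 # SequenceTagger linear head
)
```

Under these assumptions the model comes to roughly 4.6M parameters, which is why a full epoch over 3575 sentences finishes in about 10 seconds at ~8500 samples/sec on cuda:0.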
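The aggregate rows of the final "By class" report follow mechanically from the per-class rows: macro F1 is the unweighted mean of the class F1 scores, weighted F1 is the support-weighted mean, and micro F1 is the harmonic mean of the micro precision and recall. The sketch below recomputes them from the numbers in the table above (plain Python, no Flair dependency), which is a useful sanity check when comparing runs.

```python
# per-class (precision, recall, f1, support) copied from the final report
by_class = {
    "loc":  (0.5196, 0.5554, 0.5369, 596),
    "pers": (0.1712, 0.2282, 0.1956, 333),
    "org":  (0.0000, 0.0000, 0.0000, 132),
    "time": (0.1500, 0.0612, 0.0870, 49),
    "prod": (0.0000, 0.0000, 0.0000, 66),
}

total_support = sum(s for *_, s in by_class.values())
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)
weighted_f1 = sum(f1 * s for _, _, f1, s in by_class.values()) / total_support

# micro F1 from the reported micro precision/recall
p, r = 0.3724, 0.3486
micro_f1 = 2 * p * r / (p + r)
```

Rounded to four decimals these reproduce the logged 0.1639 (macro), 0.3311 (weighted), and 0.3601 (micro), with all 1176 test entities accounted for.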