2023-10-18 17:48:28,457 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 Train: 3575 sentences
2023-10-18 17:48:28,458 (train_with_dev=False, train_with_test=False)
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 Training Params:
2023-10-18 17:48:28,458 - learning_rate: "3e-05"
2023-10-18 17:48:28,458 - mini_batch_size: "4"
2023-10-18 17:48:28,458 - max_epochs: "10"
2023-10-18 17:48:28,458 - shuffle: "True"
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 Plugins:
2023-10-18 17:48:28,458 - TensorboardLogger
2023-10-18 17:48:28,458 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 17:48:28,458 - metric: "('micro avg', 'f1-score')"
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,458 Computation:
2023-10-18 17:48:28,458 - compute on device: cuda:0
2023-10-18 17:48:28,458 - embedding storage: none
2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,459 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 17:48:28,459 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,459 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:28,459 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 17:48:29,748 epoch 1 - iter 89/894 - loss 3.19630071 - time (sec): 1.29 - samples/sec: 7045.16 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:48:31,082 epoch 1 - iter 178/894 - loss 2.96353199 - time (sec): 2.62 - samples/sec: 7187.38 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:48:32,421 epoch 1 - iter 267/894 - loss 2.74540200 - time (sec): 3.96 - samples/sec: 6702.85 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:48:33,790 epoch 1 - iter 356/894 - loss 2.43894253 - time (sec): 5.33 - samples/sec: 6411.16 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:48:35,196 epoch 1 - iter 445/894 - loss 2.12976788 - time (sec): 6.74 - samples/sec: 6277.64 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:48:36,585 epoch 1 - iter 534/894 - loss 1.88648563 - time (sec): 8.13 - samples/sec: 6263.25 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:48:37,976 epoch 1 - iter 623/894 - loss 1.67326583 - time (sec): 9.52 - samples/sec: 6416.47 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:48:39,326 epoch 1 - iter 712/894 - loss 1.53353448 - time (sec): 10.87 - samples/sec: 6403.24 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:48:40,702 epoch 1 - iter 801/894 - loss 1.42922638 - time (sec): 12.24 - samples/sec: 6366.34 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:48:42,044 epoch 1 - iter 890/894 - loss 1.35456652 - time (sec): 13.58 - samples/sec: 6351.81 - lr: 0.000030 - momentum: 0.000000
2023-10-18 17:48:42,100 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:42,100 EPOCH 1 done: loss 1.3532 - lr: 0.000030
2023-10-18 17:48:44,322 DEV : loss 0.45246344804763794 - f1-score (micro avg) 0.0
2023-10-18 17:48:44,344 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:45,711 epoch 2 - iter 89/894 - loss 0.52574422 - time (sec): 1.37 - samples/sec: 6183.97 - lr: 0.000030 - momentum: 0.000000
2023-10-18 17:48:47,080 epoch 2 - iter 178/894 - loss 0.55545770 - time (sec): 2.74 - samples/sec: 6374.43 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:48:48,520 epoch 2 - iter 267/894 - loss 0.54150980 - time (sec): 4.18 - samples/sec: 6041.74 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:48:49,990 epoch 2 - iter 356/894 - loss 0.53379165 - time (sec): 5.64 - samples/sec: 5904.15 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:48:51,410 epoch 2 - iter 445/894 - loss 0.53062271 - time (sec): 7.07 - samples/sec: 6094.43 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:48:52,758 epoch 2 - iter 534/894 - loss 0.52081983 - time (sec): 8.41 - samples/sec: 6121.72 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:48:54,219 epoch 2 - iter 623/894 - loss 0.51963233 - time (sec): 9.87 - samples/sec: 6246.67 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:48:55,574 epoch 2 - iter 712/894 - loss 0.51540611 - time (sec): 11.23 - samples/sec: 6162.82 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:48:56,953 epoch 2 - iter 801/894 - loss 0.51041159 - time (sec): 12.61 - samples/sec: 6165.15 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:48:58,341 epoch 2 - iter 890/894 - loss 0.50535478 - time (sec): 14.00 - samples/sec: 6164.74 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:48:58,402 ----------------------------------------------------------------------------------------------------
2023-10-18 17:48:58,403 EPOCH 2 done: loss 0.5064 - lr: 0.000027
2023-10-18 17:49:03,571 DEV : loss 0.3578655421733856 - f1-score (micro avg) 0.0592
2023-10-18 17:49:03,594 saving best model
2023-10-18 17:49:03,627 ----------------------------------------------------------------------------------------------------
2023-10-18 17:49:05,037 epoch 3 - iter 89/894 - loss 0.48336938 - time (sec): 1.41 - samples/sec: 6484.71 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:49:06,438 epoch 3 - iter 178/894 - loss 0.47658882 - time (sec): 2.81 - samples/sec: 6324.71 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:49:07,814 epoch 3 - iter 267/894 - loss 0.45712463 - time (sec): 4.19 - samples/sec: 6340.69 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:49:09,179 epoch 3 - iter 356/894 - loss 0.46393711 - time (sec): 5.55 - samples/sec: 6172.52 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:49:10,558 epoch 3 - iter 445/894 - loss 0.44830358 - time (sec): 6.93 - samples/sec: 6114.08 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:49:11,950 epoch 3 - iter 534/894 - loss 0.44669476 - time (sec): 8.32 - samples/sec: 6129.35 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:49:13,338 epoch 3 - iter 623/894 - loss 0.43924135 - time (sec): 9.71 - samples/sec: 6133.73 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:49:14,741 epoch 3 - iter 712/894 - loss 0.43899574 - time (sec): 11.11 - samples/sec: 6177.45 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:49:16,164 epoch 3 - iter 801/894 - loss 0.43256443 - time (sec): 12.54 - samples/sec: 6201.00 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:49:17,548 epoch 3 - iter 890/894 - loss 0.42949594 - time (sec): 13.92 - samples/sec: 6194.26 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:49:17,606 ----------------------------------------------------------------------------------------------------
2023-10-18 17:49:17,606 EPOCH 3 done: loss 0.4296 - lr: 0.000023
2023-10-18 17:49:22,809 DEV : loss 0.34198394417762756 - f1-score (micro avg) 0.2455
2023-10-18 17:49:22,832 saving best model
2023-10-18 17:49:22,865 ----------------------------------------------------------------------------------------------------
2023-10-18 17:49:24,232 epoch 4 - iter 89/894 - loss 0.41318889 - time (sec): 1.37 - samples/sec: 5961.29 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:49:25,688 epoch 4 - iter 178/894 - loss 0.38872500 - time (sec): 2.82 - samples/sec: 6480.28 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:49:27,081 epoch 4 - iter 267/894 - loss 0.38659751 - time (sec): 4.22 - samples/sec: 6377.23 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:49:28,484 epoch 4 - iter 356/894 - loss 0.39484288 - time (sec): 5.62 - samples/sec: 6340.75 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:49:29,929 epoch 4 - iter 445/894 - loss 0.38596894 - time (sec): 7.06 - samples/sec: 6296.23 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:49:31,291 epoch 4 - iter 534/894 - loss 0.38762013 - time (sec): 8.43 - samples/sec: 6266.21 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:49:32,694 epoch 4 - iter 623/894 - loss 0.38363860 - time (sec): 9.83 - samples/sec: 6235.32 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:49:34,090 epoch 4 - iter 712/894 - loss 0.38723100 - time (sec): 11.22 - samples/sec: 6214.74 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:49:35,490 epoch 4 - iter 801/894 - loss 0.38665888 - time (sec): 12.62 - samples/sec: 6162.80 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:49:36,891 epoch 4 - iter 890/894 - loss 0.38610173 - time (sec): 14.03 - samples/sec: 6143.24 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:49:36,957 ----------------------------------------------------------------------------------------------------
2023-10-18 17:49:36,957 EPOCH 4 done: loss 0.3846 - lr: 0.000020
2023-10-18 17:49:41,899 DEV : loss 0.3150075674057007 - f1-score (micro avg) 0.3247
2023-10-18 17:49:41,923 saving best model
2023-10-18 17:49:41,956 ----------------------------------------------------------------------------------------------------
2023-10-18 17:49:43,421 epoch 5 - iter 89/894 - loss 0.38144140 - time (sec): 1.46 - samples/sec: 5968.14 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:49:44,878 epoch 5 - iter 178/894 - loss 0.35024717 - time (sec): 2.92 - samples/sec: 6219.91 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:49:46,330 epoch 5 - iter 267/894 - loss 0.35588428 - time (sec): 4.37 - samples/sec: 5912.08 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:49:47,698 epoch 5 - iter 356/894 - loss 0.35467769 - time (sec): 5.74 - samples/sec: 6017.05 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:49:49,081 epoch 5 - iter 445/894 - loss 0.36077938 - time (sec): 7.12 - samples/sec: 5971.89 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:49:50,469 epoch 5 - iter 534/894 - loss 0.36734085 - time (sec): 8.51 - samples/sec: 5956.77 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:49:51,914 epoch 5 - iter 623/894 - loss 0.36702674 - time (sec): 9.96 - samples/sec: 6018.41 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:49:53,392 epoch 5 - iter 712/894 - loss 0.36851786 - time (sec): 11.44 - samples/sec: 6059.43 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:49:54,785 epoch 5 - iter 801/894 - loss 0.36449745 - time (sec): 12.83 - samples/sec: 6069.19 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:49:56,193 epoch 5 - iter 890/894 - loss 0.35757282 - time (sec): 14.24 - samples/sec: 6057.28 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:49:56,259 ----------------------------------------------------------------------------------------------------
2023-10-18 17:49:56,259 EPOCH 5 done: loss 0.3570 - lr: 0.000017
2023-10-18 17:50:01,547 DEV : loss 0.3157925307750702 - f1-score (micro avg) 0.3235
2023-10-18 17:50:01,571 ----------------------------------------------------------------------------------------------------
2023-10-18 17:50:03,005 epoch 6 - iter 89/894 - loss 0.34112212 - time (sec): 1.43 - samples/sec: 5988.42 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:50:04,422 epoch 6 - iter 178/894 - loss 0.33324202 - time (sec): 2.85 - samples/sec: 5812.42 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:50:05,839 epoch 6 - iter 267/894 - loss 0.32141621 - time (sec): 4.27 - samples/sec: 5680.00 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:50:07,210 epoch 6 - iter 356/894 - loss 0.34308998 - time (sec): 5.64 - samples/sec: 5766.94 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:50:08,571 epoch 6 - iter 445/894 - loss 0.34151889 - time (sec): 7.00 - samples/sec: 5841.83 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:50:10,009 epoch 6 - iter 534/894 - loss 0.35052796 - time (sec): 8.44 - samples/sec: 6032.34 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:50:11,412 epoch 6 - iter 623/894 - loss 0.34315959 - time (sec): 9.84 - samples/sec: 6061.10 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:50:12,775 epoch 6 - iter 712/894 - loss 0.33510239 - time (sec): 11.20 - samples/sec: 6112.00 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:50:14,192 epoch 6 - iter 801/894 - loss 0.33637264 - time (sec): 12.62 - samples/sec: 6162.15 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:50:15,601 epoch 6 - iter 890/894 - loss 0.33235115 - time (sec): 14.03 - samples/sec: 6145.37 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:50:15,658 ----------------------------------------------------------------------------------------------------
2023-10-18 17:50:15,658 EPOCH 6 done: loss 0.3326 - lr: 0.000013
2023-10-18 17:50:20,999 DEV : loss 0.30548417568206787 - f1-score (micro avg) 0.3435
2023-10-18 17:50:21,022 saving best model
2023-10-18 17:50:21,056 ----------------------------------------------------------------------------------------------------
2023-10-18 17:50:22,623 epoch 7 - iter 89/894 - loss 0.31222141 - time (sec): 1.57 - samples/sec: 5714.42 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:50:24,005 epoch 7 - iter 178/894 - loss 0.32705010 - time (sec): 2.95 - samples/sec: 5872.12 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:50:25,399 epoch 7 - iter 267/894 - loss 0.32263937 - time (sec): 4.34 - samples/sec: 5931.39 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:50:26,782 epoch 7 - iter 356/894 - loss 0.32544371 - time (sec): 5.73 - samples/sec: 5904.42 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:50:28,201 epoch 7 - iter 445/894 - loss 0.31601726 - time (sec): 7.14 - samples/sec: 5919.08 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:50:29,634 epoch 7 - iter 534/894 - loss 0.31615422 - time (sec): 8.58 - samples/sec: 6012.46 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:50:31,015 epoch 7 - iter 623/894 - loss 0.31916724 - time (sec): 9.96 - samples/sec: 6066.94 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:50:32,413 epoch 7 - iter 712/894 - loss 0.32144582 - time (sec): 11.36 - samples/sec: 6137.21 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:50:33,805 epoch 7 - iter 801/894 - loss 0.32125099 - time (sec): 12.75 - samples/sec: 6150.79 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:50:35,151 epoch 7 - iter 890/894 - loss 0.31897858 - time (sec): 14.09 - samples/sec: 6117.17 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:50:35,210 ----------------------------------------------------------------------------------------------------
2023-10-18 17:50:35,210 EPOCH 7 done: loss 0.3195 - lr: 0.000010
2023-10-18 17:50:40,535 DEV : loss 0.3022187650203705 - f1-score (micro avg) 0.3497
2023-10-18 17:50:40,559 saving best model
2023-10-18 17:50:40,599 ----------------------------------------------------------------------------------------------------
2023-10-18 17:50:42,000 epoch 8 - iter 89/894 - loss 0.34555510 - time (sec): 1.40 - samples/sec: 6326.33 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:50:43,433 epoch 8 - iter 178/894 - loss 0.32944189 - time (sec): 2.83 - samples/sec: 6467.94 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:50:44,836 epoch 8 - iter 267/894 - loss 0.32918232 - time (sec): 4.24 - samples/sec: 6225.29 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:50:46,332 epoch 8 - iter 356/894 - loss 0.31980933 - time (sec): 5.73 - samples/sec: 6146.00 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:50:47,718 epoch 8 - iter 445/894 - loss 0.31177364 - time (sec): 7.12 - samples/sec: 6270.04 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:50:49,196 epoch 8 - iter 534/894 - loss 0.30996565 - time (sec): 8.60 - samples/sec: 6170.04 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:50:50,598 epoch 8 - iter 623/894 - loss 0.31266497 - time (sec): 10.00 - samples/sec: 6092.52 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:50:52,119 epoch 8 - iter 712/894 - loss 0.30729815 - time (sec): 11.52 - samples/sec: 6096.38 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:50:53,590 epoch 8 - iter 801/894 - loss 0.31538057 - time (sec): 12.99 - samples/sec: 6039.31 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:50:54,948 epoch 8 - iter 890/894 - loss 0.31034285 - time (sec): 14.35 - samples/sec: 6002.89 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:50:55,010 ----------------------------------------------------------------------------------------------------
2023-10-18 17:50:55,010 EPOCH 8 done: loss 0.3097 - lr: 0.000007
2023-10-18 17:50:59,964 DEV : loss 0.30434364080429077 - f1-score (micro avg) 0.3522
2023-10-18 17:50:59,988 saving best model
2023-10-18 17:51:00,025 ----------------------------------------------------------------------------------------------------
2023-10-18 17:51:01,426 epoch 9 - iter 89/894 - loss 0.23959273 - time (sec): 1.40 - samples/sec: 6024.01 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:51:02,796 epoch 9 - iter 178/894 - loss 0.27105112 - time (sec): 2.77 - samples/sec: 6003.56 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:51:04,184 epoch 9 - iter 267/894 - loss 0.28143971 - time (sec): 4.16 - samples/sec: 6298.31 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:51:05,606 epoch 9 - iter 356/894 - loss 0.29591837 - time (sec): 5.58 - samples/sec: 6341.19 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:51:07,028 epoch 9 - iter 445/894 - loss 0.29997546 - time (sec): 7.00 - samples/sec: 6173.84 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:51:08,706 epoch 9 - iter 534/894 - loss 0.30091527 - time (sec): 8.68 - samples/sec: 5943.69 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:51:10,103 epoch 9 - iter 623/894 - loss 0.30243132 - time (sec): 10.08 - samples/sec: 5982.17 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:51:11,554 epoch 9 - iter 712/894 - loss 0.29460013 - time (sec): 11.53 - samples/sec: 6064.68 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:51:12,917 epoch 9 - iter 801/894 - loss 0.30229177 - time (sec): 12.89 - samples/sec: 6047.04 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:51:14,303 epoch 9 - iter 890/894 - loss 0.30375418 - time (sec): 14.28 - samples/sec: 6036.73 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:51:14,365 ----------------------------------------------------------------------------------------------------
2023-10-18 17:51:14,365 EPOCH 9 done: loss 0.3029 - lr: 0.000003
2023-10-18 17:51:19,332 DEV : loss 0.30048123002052307 - f1-score (micro avg) 0.3541
2023-10-18 17:51:19,357 saving best model
2023-10-18 17:51:19,393 ----------------------------------------------------------------------------------------------------
2023-10-18 17:51:20,890 epoch 10 - iter 89/894 - loss 0.29449068 - time (sec): 1.50 - samples/sec: 6558.64 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:51:22,283 epoch 10 - iter 178/894 - loss 0.30220126 - time (sec): 2.89 - samples/sec: 6265.58 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:51:23,707 epoch 10 - iter 267/894 - loss 0.28820853 - time (sec): 4.31 - samples/sec: 6138.84 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:51:25,097 epoch 10 - iter 356/894 - loss 0.29937325 - time (sec): 5.70 - samples/sec: 6225.42 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:51:26,464 epoch 10 - iter 445/894 - loss 0.29761073 - time (sec): 7.07 - samples/sec: 6096.08 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:51:27,899 epoch 10 - iter 534/894 - loss 0.29810315 - time (sec): 8.51 - samples/sec: 6153.61 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:51:29,274 epoch 10 - iter 623/894 - loss 0.30838358 - time (sec): 9.88 - samples/sec: 6263.76 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:51:30,631 epoch 10 - iter 712/894 - loss 0.31079931 - time (sec): 11.24 - samples/sec: 6198.45 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:51:31,990 epoch 10 - iter 801/894 - loss 0.30309182 - time (sec): 12.60 - samples/sec: 6192.14 - lr: 0.000000 - momentum: 0.000000
2023-10-18 17:51:33,352 epoch 10 - iter 890/894 - loss 0.30297146 - time (sec): 13.96 - samples/sec: 6148.71 - lr: 0.000000 - momentum: 0.000000
2023-10-18 17:51:33,434 ----------------------------------------------------------------------------------------------------
2023-10-18 17:51:33,434 EPOCH 10 done: loss 0.3025 - lr: 0.000000
2023-10-18 17:51:38,772 DEV : loss 0.300153523683548 - f1-score (micro avg) 0.3579
2023-10-18 17:51:38,797 saving best model
2023-10-18 17:51:38,868 ----------------------------------------------------------------------------------------------------
2023-10-18 17:51:38,868 Loading model from best epoch ...
2023-10-18 17:51:38,945 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-18 17:51:41,291
Results:
- F-score (micro) 0.3715
- F-score (macro) 0.1528
- Accuracy 0.2374

By class:
              precision    recall  f1-score   support

         loc     0.4882    0.5923    0.5353       596
        pers     0.1927    0.2072    0.1997       333
         org     0.0000    0.0000    0.0000       132
        time     0.0500    0.0204    0.0290        49
        prod     0.0000    0.0000    0.0000        66

   micro avg     0.3842    0.3597    0.3715      1176
   macro avg     0.1462    0.1640    0.1528      1176
weighted avg     0.3041    0.3597    0.3290      1176

2023-10-18 17:51:41,291 ----------------------------------------------------------------------------------------------------