2023-10-25 21:07:49,395 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Train:  1166 sentences
2023-10-25 21:07:49,396         (train_with_dev=False, train_with_test=False)
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Training Params:
2023-10-25 21:07:49,396  - learning_rate: "5e-05"
2023-10-25 21:07:49,396  - mini_batch_size: "8"
2023-10-25 21:07:49,396  - max_epochs: "10"
2023-10-25 21:07:49,396  - shuffle: "True"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Plugins:
2023-10-25 21:07:49,396  - TensorboardLogger
2023-10-25 21:07:49,396  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:07:49,396  - metric: "('micro avg', 'f1-score')"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Computation:
2023-10-25 21:07:49,396  - compute on device: cuda:0
2023-10-25 21:07:49,396  - embedding storage: none
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,397 Logging anything other than scalars to
TensorBoard is currently not supported.
2023-10-25 21:07:50,189 epoch 1 - iter 14/146 - loss 2.76093476 - time (sec): 0.79 - samples/sec: 4306.04 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:50,999 epoch 1 - iter 28/146 - loss 2.16927590 - time (sec): 1.60 - samples/sec: 4446.03 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:07:51,943 epoch 1 - iter 42/146 - loss 1.56576289 - time (sec): 2.55 - samples/sec: 4650.49 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:07:52,744 epoch 1 - iter 56/146 - loss 1.31295193 - time (sec): 3.35 - samples/sec: 4680.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:07:53,535 epoch 1 - iter 70/146 - loss 1.14715761 - time (sec): 4.14 - samples/sec: 4718.31 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:07:54,508 epoch 1 - iter 84/146 - loss 1.02143305 - time (sec): 5.11 - samples/sec: 4686.68 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:07:55,436 epoch 1 - iter 98/146 - loss 0.90968592 - time (sec): 6.04 - samples/sec: 4766.54 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:07:56,476 epoch 1 - iter 112/146 - loss 0.82600257 - time (sec): 7.08 - samples/sec: 4773.82 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:07:57,299 epoch 1 - iter 126/146 - loss 0.76327971 - time (sec): 7.90 - samples/sec: 4816.97 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:07:58,227 epoch 1 - iter 140/146 - loss 0.70269213 - time (sec): 8.83 - samples/sec: 4846.77 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:07:58,636 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:58,637 EPOCH 1 done: loss 0.6875 - lr: 0.000048
2023-10-25 21:07:59,149 DEV : loss 0.1478959619998932 - f1-score (micro avg) 0.6147
2023-10-25 21:07:59,153 saving best model
2023-10-25 21:07:59,662 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:00,616 epoch 2 - iter 14/146 - loss 0.16108065 - time (sec): 0.95 - samples/sec: 4811.08 - lr: 0.000050 - momentum: 0.000000
2023-10-25 21:08:01,592 epoch 2 - iter 28/146 - loss 0.15067828 - time (sec): 1.93 - samples/sec: 4899.11 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:08:02,467 epoch 2 - iter 42/146 - loss 0.15492269 - time (sec): 2.80 - samples/sec: 4896.36 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:08:03,367 epoch 2 - iter 56/146 - loss 0.16102186 - time (sec): 3.70 - samples/sec: 4832.43 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:08:04,144 epoch 2 - iter 70/146 - loss 0.15945960 - time (sec): 4.48 - samples/sec: 4838.41 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:08:04,909 epoch 2 - iter 84/146 - loss 0.16413615 - time (sec): 5.25 - samples/sec: 4824.82 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:08:05,687 epoch 2 - iter 98/146 - loss 0.15984296 - time (sec): 6.02 - samples/sec: 4843.16 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:08:06,672 epoch 2 - iter 112/146 - loss 0.15388793 - time (sec): 7.01 - samples/sec: 4844.91 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:08:07,514 epoch 2 - iter 126/146 - loss 0.14973376 - time (sec): 7.85 - samples/sec: 4895.41 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:08:08,387 epoch 2 - iter 140/146 - loss 0.14951333 - time (sec): 8.72 - samples/sec: 4914.47 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:08:08,740 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:08,740 EPOCH 2 done: loss 0.1491 - lr: 0.000045
2023-10-25 21:08:09,802 DEV : loss 0.10244478285312653 - f1-score (micro avg) 0.722
2023-10-25 21:08:09,806 saving best model
2023-10-25 21:08:10,488 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:11,393 epoch 3 - iter 14/146 - loss 0.09051348 - time (sec): 0.90 - samples/sec: 4622.43 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:08:12,169 epoch 3 - iter 28/146 - loss 0.08682021 - time (sec): 1.68 - samples/sec: 4486.31 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:08:13,122 epoch 3 - iter 42/146 - loss 0.08134884 - time (sec): 2.63 - samples/sec: 4551.81 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:08:13,976 epoch 3 - iter 56/146 - loss 0.07868931 - time (sec): 3.49 - samples/sec: 4454.44 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:08:15,095 epoch 3 - iter 70/146 - loss 0.08088185 - time (sec): 4.60 - samples/sec: 4610.30 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:08:15,955 epoch 3 - iter 84/146 - loss 0.08151152 - time (sec): 5.47 - samples/sec: 4731.13 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:08:16,823 epoch 3 - iter 98/146 - loss 0.08031537 - time (sec): 6.33 - samples/sec: 4791.34 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:08:17,535 epoch 3 - iter 112/146 - loss 0.08190989 - time (sec): 7.05 - samples/sec: 4836.59 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:08:18,326 epoch 3 - iter 126/146 - loss 0.08448286 - time (sec): 7.84 - samples/sec: 4838.21 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:08:19,203 epoch 3 - iter 140/146 - loss 0.08269211 - time (sec): 8.71 - samples/sec: 4854.04 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:08:19,638 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:19,639 EPOCH 3 done: loss 0.0836 - lr: 0.000039
2023-10-25 21:08:20,557 DEV : loss 0.10184833407402039 - f1-score (micro avg) 0.7212
2023-10-25 21:08:20,562 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:21,520 epoch 4 - iter 14/146 - loss 0.06152145 - time (sec): 0.96 - samples/sec: 5271.53 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:08:22,334 epoch 4 - iter 28/146 - loss 0.05606755 - time (sec): 1.77 - samples/sec: 4945.36 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:08:23,145 epoch 4 - iter 42/146 - loss 0.05920449 - time (sec): 2.58 - samples/sec: 4896.12 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:08:24,097 epoch 4 - iter 56/146 - loss 0.05389913 - time (sec): 3.53 - samples/sec: 4806.51 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:08:24,831 epoch 4 - iter 70/146 - loss 0.05447504 - time (sec): 4.27 - samples/sec: 4767.50 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:08:25,787 epoch 4 - iter 84/146 - loss 0.05784428 - time (sec): 5.22 - samples/sec: 4714.07 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:08:26,661 epoch 4 - iter 98/146 - loss 0.05700628 - time (sec): 6.10 - samples/sec: 4729.43 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:08:27,499 epoch 4 - iter 112/146 - loss 0.05350858 - time (sec): 6.94 - samples/sec: 4689.34 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:08:28,507 epoch 4 - iter 126/146 - loss 0.05232477 - time (sec): 7.94 - samples/sec: 4696.73 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:08:29,417 epoch 4 - iter 140/146 - loss 0.05225578 - time (sec): 8.85 - samples/sec: 4793.42 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:08:29,757 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:29,757 EPOCH 4 done: loss 0.0523 - lr: 0.000034
2023-10-25 21:08:30,673 DEV : loss 0.10563240945339203 - f1-score (micro avg) 0.7404
2023-10-25 21:08:30,678 saving best model
2023-10-25 21:08:31,335 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:32,246 epoch 5 - iter 14/146 - loss 0.02488614 - time (sec): 0.91 - samples/sec: 5101.79 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:08:33,018 epoch 5 - iter 28/146 - loss 0.02254102 - time (sec): 1.68 - samples/sec: 4976.30 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:08:33,914 epoch 5 - iter 42/146 - loss 0.02974599 - time (sec): 2.58 - samples/sec: 5077.10 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:08:34,838 epoch 5 - iter 56/146 - loss 0.02885809 - time (sec): 3.50 - samples/sec: 4900.07 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:08:35,787 epoch 5 - iter 70/146 - loss 0.02769486 - time (sec): 4.45 - samples/sec: 4744.69 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:08:36,619 epoch 5 - iter 84/146 - loss 0.02843650 - time (sec): 5.28 - samples/sec: 4729.31 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:08:37,565 epoch 5 - iter 98/146 - loss 0.03048721 - time (sec): 6.23 - samples/sec: 4678.06 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:08:38,453 epoch 5 - iter 112/146 - loss 0.03202213 - time (sec): 7.12 - samples/sec: 4695.52 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:08:39,677 epoch 5 - iter 126/146 - loss 0.03216288 - time (sec): 8.34 - samples/sec: 4614.52 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:08:40,447 epoch 5 - iter 140/146 - loss 0.03299174 - time (sec): 9.11 - samples/sec: 4678.91 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:08:40,842 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:40,842 EPOCH 5 done: loss 0.0327 - lr: 0.000028
2023-10-25 21:08:41,755 DEV : loss 0.10233564674854279 - f1-score (micro avg) 0.7706
2023-10-25 21:08:41,760 saving best model
2023-10-25 21:08:42,313 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:43,115 epoch 6 - iter 14/146 - loss 0.01610677 - time (sec): 0.80 - samples/sec: 5199.94 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:08:44,073 epoch 6 - iter 28/146 - loss 0.01771595 - time (sec): 1.76 - samples/sec: 4763.73 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:08:45,030 epoch 6 - iter 42/146 - loss 0.01749347 - time (sec): 2.72 - samples/sec: 4851.45 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:08:45,893 epoch 6 - iter 56/146 - loss 0.01637983 - time (sec): 3.58 - samples/sec: 4827.76 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:08:46,839 epoch 6 - iter 70/146 - loss 0.02013402 - time (sec): 4.52 - samples/sec: 4762.24 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:08:47,690 epoch 6 - iter 84/146 - loss 0.01826125 - time (sec): 5.38 - samples/sec: 4727.79 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:08:48,586 epoch 6 - iter 98/146 - loss 0.01955487 - time (sec): 6.27 - samples/sec: 4801.37 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:08:49,583 epoch 6 - iter 112/146 - loss 0.02207506 - time (sec): 7.27 - samples/sec: 4730.97 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:08:50,408 epoch 6 - iter 126/146 - loss 0.02262062 - time (sec): 8.09 - samples/sec: 4746.73 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:08:51,267 epoch 6 - iter 140/146 - loss 0.02172538 - time (sec): 8.95 - samples/sec: 4777.39 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:08:51,616 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:51,617 EPOCH 6 done: loss 0.0229 - lr: 0.000023
2023-10-25 21:08:52,528 DEV : loss 0.127528578042984 - f1-score (micro avg) 0.7511
2023-10-25 21:08:52,532 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:53,355 epoch 7 - iter 14/146 - loss 0.01150269 - time (sec): 0.82 - samples/sec: 5084.44 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:08:54,542 epoch 7 - iter 28/146 - loss 0.01576881 - time (sec): 2.01 - samples/sec: 5019.70 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:08:55,297 epoch 7 - iter 42/146 - loss 0.01762664 - time (sec): 2.76 - samples/sec: 4869.93 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:08:56,135 epoch 7 - iter 56/146 - loss 0.01668696 - time (sec): 3.60 - samples/sec: 4798.61 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:08:56,965 epoch 7 - iter 70/146 - loss 0.01515176 - time (sec): 4.43 - samples/sec: 4829.07 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:08:57,897 epoch 7 - iter 84/146 - loss 0.01337344 - time (sec): 5.36 - samples/sec: 4908.17 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:08:58,754 epoch 7 - iter 98/146 - loss 0.01419310 - time (sec): 6.22 - samples/sec: 4934.76 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:08:59,525 epoch 7 - iter 112/146 - loss 0.01622088 - time (sec): 6.99 - samples/sec: 4885.45 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:09:00,360 epoch 7 - iter 126/146 - loss 0.01549015 - time (sec): 7.83 - samples/sec: 4890.68 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:09:01,272 epoch 7 - iter 140/146 - loss 0.01516680 - time (sec): 8.74 - samples/sec: 4861.79 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:09:01,676 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:01,676 EPOCH 7 done: loss 0.0146 - lr: 0.000017
2023-10-25 21:09:02,587 DEV : loss 0.13177402317523956 - f1-score (micro avg) 0.7689
2023-10-25 21:09:02,591 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:03,473 epoch 8 - iter 14/146 - loss 0.02475525 - time (sec): 0.88 - samples/sec: 4376.44 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:09:04,487 epoch 8 - iter 28/146 - loss 0.01803622 - time (sec): 1.90 - samples/sec: 4481.02 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:09:05,574 epoch 8 - iter 42/146 - loss 0.01556120 - time (sec): 2.98 - samples/sec: 4529.62 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:09:06,474 epoch 8 - iter 56/146 - loss 0.01480972 - time (sec): 3.88 - samples/sec: 4494.81 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:09:07,396 epoch 8 - iter 70/146 - loss 0.01592848 - time (sec): 4.80 - samples/sec: 4523.32 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:09:08,211 epoch 8 - iter 84/146 - loss 0.01469362 - time (sec): 5.62 - samples/sec: 4618.26 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:09:08,991 epoch 8 - iter 98/146 - loss 0.01471966 - time (sec): 6.40 - samples/sec: 4605.33 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:09:09,977 epoch 8 - iter 112/146 - loss 0.01407256 - time (sec): 7.39 - samples/sec: 4671.32 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:09:10,789 epoch 8 - iter 126/146 - loss 0.01427331 - time (sec): 8.20 - samples/sec: 4669.48 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:09:11,640 epoch 8 - iter 140/146 - loss 0.01285895 - time (sec): 9.05 - samples/sec: 4747.94 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:09:11,945 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:11,946 EPOCH 8 done: loss 0.0125 - lr: 0.000012
2023-10-25 21:09:13,015 DEV : loss 0.14684733748435974 - f1-score (micro avg) 0.7458
2023-10-25 21:09:13,020 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:13,897 epoch 9 - iter 14/146 - loss 0.00374008 - time (sec): 0.88 - samples/sec: 5312.36 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:09:14,741 epoch 9 - iter 28/146 - loss 0.00602499 - time (sec): 1.72 - samples/sec: 5187.02 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:09:15,543 epoch 9 - iter 42/146 - loss 0.00469098 - time (sec): 2.52 - samples/sec: 4978.45 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:09:16,616 epoch 9 - iter 56/146 - loss 0.00530783 - time (sec): 3.59 - samples/sec: 4869.76 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:09:17,612 epoch 9 - iter 70/146 - loss 0.00888826 - time (sec): 4.59 - samples/sec: 4850.68 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:09:18,526 epoch 9 - iter 84/146 - loss 0.00861823 - time (sec): 5.50 - samples/sec: 4792.41 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:09:19,463 epoch 9 - iter 98/146 - loss 0.00801291 - time (sec): 6.44 - samples/sec: 4800.57 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:09:20,244 epoch 9 - iter 112/146 - loss 0.00827974 - time (sec): 7.22 - samples/sec: 4759.68 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:09:21,147 epoch 9 - iter 126/146 - loss 0.00760175 - time (sec): 8.13 - samples/sec: 4737.94 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:09:22,026 epoch 9 - iter 140/146 - loss 0.00722376 - time (sec): 9.00 - samples/sec: 4748.75 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:09:22,355 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:22,355 EPOCH 9 done: loss 0.0073 - lr: 0.000006
2023-10-25 21:09:23,267 DEV : loss 0.1549414098262787 - f1-score (micro avg) 0.7692
2023-10-25 21:09:23,271 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:24,097 epoch 10 - iter 14/146 - loss 0.00128690 - time (sec): 0.82 - samples/sec: 5057.66 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:09:24,940 epoch 10 - iter 28/146 - loss 0.00525616 - time (sec): 1.67 - samples/sec: 5092.22 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:09:25,838 epoch 10 - iter 42/146 - loss 0.00459227 - time (sec): 2.57 - samples/sec: 4870.74 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:09:26,823 epoch 10 - iter 56/146 - loss 0.00708331 - time (sec): 3.55 - samples/sec: 4768.34 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:09:27,709 epoch 10 - iter 70/146 - loss 0.00587957 - time (sec): 4.44 - samples/sec: 4762.88 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:09:28,497 epoch 10 - iter 84/146 - loss 0.00554849 - time (sec): 5.22 - samples/sec: 4713.50 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:09:29,533 epoch 10 - iter 98/146 - loss 0.00596721 - time (sec): 6.26 - samples/sec: 4666.95 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:09:30,455 epoch 10 - iter 112/146 - loss 0.00572547 - time (sec): 7.18 - samples/sec: 4724.35 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:09:31,589 epoch 10 - iter 126/146 - loss 0.00547473 - time (sec): 8.32 - samples/sec: 4612.75 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:09:32,468 epoch 10 - iter 140/146 - loss 0.00498630 - time (sec): 9.20 - samples/sec: 4651.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:09:32,839 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:32,839 EPOCH 10 done: loss 0.0049 - lr: 0.000000
2023-10-25 21:09:33,748 DEV : loss 0.15722544491291046 - f1-score (micro avg) 0.7706
2023-10-25 21:09:34,271 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:34,272 Loading model from best epoch ...
2023-10-25 21:09:35,987 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 21:09:37,504 Results:
- F-score (micro) 0.7628
- F-score (macro) 0.6702
- Accuracy 0.6352

By class:
              precision    recall  f1-score   support

         PER     0.8319    0.8391    0.8355       348
         LOC     0.6656    0.8314    0.7394       261
         ORG     0.4468    0.4038    0.4242        52
   HumanProd     0.6818    0.6818    0.6818        22

   micro avg     0.7306    0.7980    0.7628       683
   macro avg     0.6565    0.6890    0.6702       683
weighted avg     0.7342    0.7980    0.7625       683

2023-10-25 21:09:37,504 ----------------------------------------------------------------------------------------------------
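The lr column in the log ramps up to the configured 5e-05 over roughly the first epoch (146 iterations of 1460 total, i.e. the `warmup_fraction: '0.1'` of the LinearScheduler plugin) and then decays linearly to zero. A minimal sketch of that schedule, assuming 146 mini-batches per epoch for 10 epochs as logged; the function name `linear_lr` is illustrative, not part of Flair's API:

```python
def linear_lr(step: int, base_lr: float = 5e-5,
              total_steps: int = 146 * 10,
              warmup_fraction: float = 0.1) -> float:
    """Linear warmup to base_lr, then linear decay to zero (sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 146 steps here
    if step < warmup_steps:
        return base_lr * step / warmup_steps          # warmup ramp
    # linear decay over the remaining steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Peak at the end of warmup, zero at the final step:
peak = linear_lr(146)   # 5e-05
final = linear_lr(1460)  # 0.0
```

Exact per-iteration values may differ from the logged column by one step of rounding, but the shape (0.000004 → 0.000048 in epoch 1, then 0.000050 → 0.000000 by epoch 10) is the same.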
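The averaging rows of the final table are internally consistent: macro F1 is the unweighted mean of the per-class F1 scores, the weighted average weights each class by its support, and micro F1 is the harmonic mean of micro precision and recall. A quick check in plain Python, with the numbers copied from the table:

```python
# Per-class (precision, recall, f1, support) rows from the final evaluation.
by_class = {
    "PER":       (0.8319, 0.8391, 0.8355, 348),
    "LOC":       (0.6656, 0.8314, 0.7394, 261),
    "ORG":       (0.4468, 0.4038, 0.4242, 52),
    "HumanProd": (0.6818, 0.6818, 0.6818, 22),
}

# Macro F1: unweighted mean of per-class F1.
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 weighted by support.
total = sum(n for _, _, _, n in by_class.values())
weighted_f1 = sum(f1 * n for _, _, f1, n in by_class.values()) / total

# Micro F1: harmonic mean of the logged micro precision and recall.
micro_p, micro_r = 0.7306, 0.7980
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# → 0.6702 0.7625 0.7628, matching the log
```

The gap between micro (0.7628) and macro (0.6702) F1 reflects the class imbalance: the weak ORG class (52 mentions, F1 0.4242) drags the macro average down while barely moving the micro score.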