2023-10-25 20:47:46,272 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:46,273 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 20:47:46,273 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:46,273 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 20:47:46,273 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:46,273 Train: 1166 sentences
2023-10-25 20:47:46,273 (train_with_dev=False, train_with_test=False)
2023-10-25 20:47:46,273 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:46,273 Training Params:
2023-10-25 20:47:46,273 - learning_rate: "5e-05"
2023-10-25 20:47:46,273 - mini_batch_size: "8"
2023-10-25 20:47:46,273 - max_epochs: "10"
2023-10-25 20:47:46,273 - shuffle: "True"
2023-10-25 20:47:46,273 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:46,273 Plugins:
2023-10-25 20:47:46,273 - TensorboardLogger
2023-10-25 20:47:46,273 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 20:47:46,274 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:46,274 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 20:47:46,274 - metric: "('micro avg', 'f1-score')"
2023-10-25 20:47:46,274 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:46,274 Computation:
2023-10-25 20:47:46,274 - compute on device: cuda:0
2023-10-25 20:47:46,274 - embedding storage: none
2023-10-25 20:47:46,274 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:46,274 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 20:47:46,274 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:46,274 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:46,274 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 20:47:47,228 epoch 1 - iter 14/146 - loss 3.32619155 - time (sec): 0.95 - samples/sec: 4905.02 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:47:48,090 epoch 1 - iter 28/146 - loss 2.77463951 - time (sec): 1.81 - samples/sec: 4662.84 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:47:49,025 epoch 1 - iter 42/146 - loss 2.06209058 - time (sec): 2.75 - samples/sec: 4645.68 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:47:49,928 epoch 1 - iter 56/146 - loss 1.66590766 - time (sec): 3.65 - samples/sec: 4744.10 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:47:50,870 epoch 1 - iter 70/146 - loss 1.41744737 - time (sec): 4.60 - samples/sec: 4763.62 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:47:51,793 epoch 1 - iter 84/146 - loss 1.23294191 - time (sec): 5.52 - samples/sec: 4814.17 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:47:52,708 epoch 1 - iter 98/146 - loss 1.10453887 - time (sec): 6.43 - samples/sec: 4784.49 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:47:53,444 epoch 1 - iter 112/146 - loss 1.01746390 - time (sec): 7.17 - samples/sec: 4779.66 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:47:54,435 epoch 1 - iter 126/146 - loss 0.91742517 - time (sec): 8.16 - samples/sec: 4783.24 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:47:55,261 epoch 1 - iter 140/146 - loss 0.85738814 - time (sec): 8.99 - samples/sec: 4759.96 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:47:55,630 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:55,630 EPOCH 1 done: loss 0.8327 - lr: 0.000048
2023-10-25 20:47:56,280 DEV : loss 0.17366763949394226 - f1-score (micro avg) 0.4951
2023-10-25 20:47:56,285 saving best model
2023-10-25 20:47:56,703 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:57,513 epoch 2 - iter 14/146 - loss 0.18263706 - time (sec): 0.81 - samples/sec: 4908.96 - lr: 0.000050 - momentum: 0.000000
2023-10-25 20:47:58,350 epoch 2 - iter 28/146 - loss 0.18295105 - time (sec): 1.65 - samples/sec: 5022.09 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:47:59,282 epoch 2 - iter 42/146 - loss 0.19154440 - time (sec): 2.58 - samples/sec: 5039.20 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:48:00,078 epoch 2 - iter 56/146 - loss 0.18243784 - time (sec): 3.37 - samples/sec: 5084.29 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:48:00,853 epoch 2 - iter 70/146 - loss 0.18313209 - time (sec): 4.15 - samples/sec: 5020.62 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:48:01,828 epoch 2 - iter 84/146 - loss 0.17440435 - time (sec): 5.12 - samples/sec: 4918.17 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:48:02,875 epoch 2 - iter 98/146 - loss 0.16903145 - time (sec): 6.17 - samples/sec: 4880.88 - lr: 0.000046 - momentum: 0.000000
2023-10-25 20:48:03,747 epoch 2 - iter 112/146 - loss 0.16353225 - time (sec): 7.04 - samples/sec: 4891.08 - lr: 0.000046 - momentum: 0.000000
2023-10-25 20:48:04,623 epoch 2 - iter 126/146 - loss 0.16247055 - time (sec): 7.92 - samples/sec: 4827.66 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:48:05,426 epoch 2 - iter 140/146 - loss 0.16260366 - time (sec): 8.72 - samples/sec: 4806.35 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:48:06,000 ----------------------------------------------------------------------------------------------------
2023-10-25 20:48:06,000 EPOCH 2 done: loss 0.1626 - lr: 0.000045
2023-10-25 20:48:06,908 DEV : loss 0.10925206542015076 - f1-score (micro avg) 0.683
2023-10-25 20:48:06,913 saving best model
2023-10-25 20:48:07,451 ----------------------------------------------------------------------------------------------------
2023-10-25 20:48:08,430 epoch 3 - iter 14/146 - loss 0.08329505 - time (sec): 0.98 - samples/sec: 5121.07 - lr: 0.000044 - momentum: 0.000000
2023-10-25 20:48:09,436 epoch 3 - iter 28/146 - loss 0.07646522 - time (sec): 1.98 - samples/sec: 5046.33 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:48:10,199 epoch 3 - iter 42/146 - loss 0.07857219 - time (sec): 2.75 - samples/sec: 4897.73 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:48:11,036 epoch 3 - iter 56/146 - loss 0.08146631 - time (sec): 3.58 - samples/sec: 4909.24 - lr: 0.000042 - momentum: 0.000000
2023-10-25 20:48:11,890 epoch 3 - iter 70/146 - loss 0.08586188 - time (sec): 4.44 - samples/sec: 4941.40 - lr: 0.000042 - momentum: 0.000000
2023-10-25 20:48:12,633 epoch 3 - iter 84/146 - loss 0.08659125 - time (sec): 5.18 - samples/sec: 4866.47 - lr: 0.000041 - momentum: 0.000000
2023-10-25 20:48:13,560 epoch 3 - iter 98/146 - loss 0.08365767 - time (sec): 6.11 - samples/sec: 4921.59 - lr: 0.000041 - momentum: 0.000000
2023-10-25 20:48:14,480 epoch 3 - iter 112/146 - loss 0.08564022 - time (sec): 7.03 - samples/sec: 4903.22 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:48:15,417 epoch 3 - iter 126/146 - loss 0.08934483 - time (sec): 7.96 - samples/sec: 4888.54 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:48:16,286 epoch 3 - iter 140/146 - loss 0.08642318 - time (sec): 8.83 - samples/sec: 4847.70 - lr: 0.000039 - momentum: 0.000000
2023-10-25 20:48:16,655 ----------------------------------------------------------------------------------------------------
2023-10-25 20:48:16,655 EPOCH 3 done: loss 0.0862 - lr: 0.000039
2023-10-25 20:48:17,571 DEV : loss 0.09392837435007095 - f1-score (micro avg) 0.7484
2023-10-25 20:48:17,576 saving best model
2023-10-25 20:48:18,398 ----------------------------------------------------------------------------------------------------
2023-10-25 20:48:19,170 epoch 4 - iter 14/146 - loss 0.03589195 - time (sec): 0.77 - samples/sec: 4505.55 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:48:20,000 epoch 4 - iter 28/146 - loss 0.04570279 - time (sec): 1.60 - samples/sec: 4779.15 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:48:21,050 epoch 4 - iter 42/146 - loss 0.04554306 - time (sec): 2.65 - samples/sec: 4530.48 - lr: 0.000037 - momentum: 0.000000
2023-10-25 20:48:22,028 epoch 4 - iter 56/146 - loss 0.04561147 - time (sec): 3.62 - samples/sec: 4602.70 - lr: 0.000037 - momentum: 0.000000
2023-10-25 20:48:23,042 epoch 4 - iter 70/146 - loss 0.04012849 - time (sec): 4.64 - samples/sec: 4799.02 - lr: 0.000036 - momentum: 0.000000
2023-10-25 20:48:23,878 epoch 4 - iter 84/146 - loss 0.04312134 - time (sec): 5.47 - samples/sec: 4736.42 - lr: 0.000036 - momentum: 0.000000
2023-10-25 20:48:24,937 epoch 4 - iter 98/146 - loss 0.04547308 - time (sec): 6.53 - samples/sec: 4683.90 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:48:25,734 epoch 4 - iter 112/146 - loss 0.04785940 - time (sec): 7.33 - samples/sec: 4702.37 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:48:26,619 epoch 4 - iter 126/146 - loss 0.04800961 - time (sec): 8.22 - samples/sec: 4673.38 - lr: 0.000034 - momentum: 0.000000
2023-10-25 20:48:27,443 epoch 4 - iter 140/146 - loss 0.04853540 - time (sec): 9.04 - samples/sec: 4727.50 - lr: 0.000034 - momentum: 0.000000
2023-10-25 20:48:27,795 ----------------------------------------------------------------------------------------------------
2023-10-25 20:48:27,795 EPOCH 4 done: loss 0.0492 - lr: 0.000034
2023-10-25 20:48:28,710 DEV : loss 0.14798803627490997 - f1-score (micro avg) 0.7111
2023-10-25 20:48:28,715 ----------------------------------------------------------------------------------------------------
2023-10-25 20:48:29,538 epoch 5 - iter 14/146 - loss 0.02945627 - time (sec): 0.82 - samples/sec: 4595.35 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:48:30,394 epoch 5 - iter 28/146 - loss 0.03208317 - time (sec): 1.68 - samples/sec: 4783.75 - lr: 0.000032 - momentum: 0.000000
2023-10-25 20:48:31,343 epoch 5 - iter 42/146 - loss 0.03529632 - time (sec): 2.63 - samples/sec: 4876.90 - lr: 0.000032 - momentum: 0.000000
2023-10-25 20:48:32,177 epoch 5 - iter 56/146 - loss 0.03385465 - time (sec): 3.46 - samples/sec: 4834.24 - lr: 0.000031 - momentum: 0.000000
2023-10-25 20:48:33,089 epoch 5 - iter 70/146 - loss 0.03310305 - time (sec): 4.37 - samples/sec: 4832.85 - lr: 0.000031 - momentum: 0.000000
2023-10-25 20:48:33,897 epoch 5 - iter 84/146 - loss 0.03279721 - time (sec): 5.18 - samples/sec: 4806.60 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:48:34,879 epoch 5 - iter 98/146 - loss 0.03501715 - time (sec): 6.16 - samples/sec: 4786.64 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:48:35,798 epoch 5 - iter 112/146 - loss 0.03321133 - time (sec): 7.08 - samples/sec: 4785.39 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:48:36,745 epoch 5 - iter 126/146 - loss 0.03182421 - time (sec): 8.03 - samples/sec: 4743.23 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:48:37,639 epoch 5 - iter 140/146 - loss 0.02979870 - time (sec): 8.92 - samples/sec: 4755.87 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:48:38,075 ----------------------------------------------------------------------------------------------------
2023-10-25 20:48:38,076 EPOCH 5 done: loss 0.0316 - lr: 0.000028
2023-10-25 20:48:38,990 DEV : loss 0.14484462141990662 - f1-score (micro avg) 0.7101
2023-10-25 20:48:38,995 ----------------------------------------------------------------------------------------------------
2023-10-25 20:48:39,886 epoch 6 - iter 14/146 - loss 0.03930537 - time (sec): 0.89 - samples/sec: 4602.61 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:48:40,777 epoch 6 - iter 28/146 - loss 0.02555020 - time (sec): 1.78 - samples/sec: 4689.85 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:48:41,846 epoch 6 - iter 42/146 - loss 0.02053346 - time (sec): 2.85 - samples/sec: 4560.50 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:48:42,830 epoch 6 - iter 56/146 - loss 0.02171275 - time (sec): 3.83 - samples/sec: 4695.94 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:48:43,736 epoch 6 - iter 70/146 - loss 0.02182934 - time (sec): 4.74 - samples/sec: 4617.08 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:48:44,587 epoch 6 - iter 84/146 - loss 0.02098662 - time (sec): 5.59 - samples/sec: 4613.56 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:48:45,499 epoch 6 - iter 98/146 - loss 0.02256168 - time (sec): 6.50 - samples/sec: 4659.28 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:48:46,401 epoch 6 - iter 112/146 - loss 0.02105759 - time (sec): 7.41 - samples/sec: 4694.00 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:48:47,177 epoch 6 - iter 126/146 - loss 0.02169497 - time (sec): 8.18 - samples/sec: 4703.57 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:48:47,967 epoch 6 - iter 140/146 - loss 0.02216839 - time (sec): 8.97 - samples/sec: 4782.72 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:48:48,373 ----------------------------------------------------------------------------------------------------
2023-10-25 20:48:48,373 EPOCH 6 done: loss 0.0229 - lr: 0.000023
2023-10-25 20:48:49,285 DEV : loss 0.1573459953069687 - f1-score (micro avg) 0.7339
2023-10-25 20:48:49,290 ----------------------------------------------------------------------------------------------------
2023-10-25 20:48:50,319 epoch 7 - iter 14/146 - loss 0.02023463 - time (sec): 1.03 - samples/sec: 3920.05 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:48:51,207 epoch 7 - iter 28/146 - loss 0.01645952 - time (sec): 1.92 - samples/sec: 4177.15 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:48:52,071 epoch 7 - iter 42/146 - loss 0.01621473 - time (sec): 2.78 - samples/sec: 4372.48 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:48:52,893 epoch 7 - iter 56/146 - loss 0.01498802 - time (sec): 3.60 - samples/sec: 4437.54 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:48:54,004 epoch 7 - iter 70/146 - loss 0.01718484 - time (sec): 4.71 - samples/sec: 4498.40 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:48:55,044 epoch 7 - iter 84/146 - loss 0.01586359 - time (sec): 5.75 - samples/sec: 4509.60 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:48:55,972 epoch 7 - iter 98/146 - loss 0.01684853 - time (sec): 6.68 - samples/sec: 4510.54 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:48:56,896 epoch 7 - iter 112/146 - loss 0.01618138 - time (sec): 7.61 - samples/sec: 4548.68 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:48:57,791 epoch 7 - iter 126/146 - loss 0.01618698 - time (sec): 8.50 - samples/sec: 4505.04 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:48:58,621 epoch 7 - iter 140/146 - loss 0.01703696 - time (sec): 9.33 - samples/sec: 4586.23 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:48:58,968 ----------------------------------------------------------------------------------------------------
2023-10-25 20:48:58,968 EPOCH 7 done: loss 0.0171 - lr: 0.000017
2023-10-25 20:48:59,889 DEV : loss 0.16221855580806732 - f1-score (micro avg) 0.7226
2023-10-25 20:48:59,893 ----------------------------------------------------------------------------------------------------
2023-10-25 20:49:00,821 epoch 8 - iter 14/146 - loss 0.00624075 - time (sec): 0.93 - samples/sec: 4750.92 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:49:01,689 epoch 8 - iter 28/146 - loss 0.00947817 - time (sec): 1.79 - samples/sec: 5225.63 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:49:02,544 epoch 8 - iter 42/146 - loss 0.00932840 - time (sec): 2.65 - samples/sec: 5019.16 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:49:03,436 epoch 8 - iter 56/146 - loss 0.00899173 - time (sec): 3.54 - samples/sec: 4972.49 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:49:04,307 epoch 8 - iter 70/146 - loss 0.00929944 - time (sec): 4.41 - samples/sec: 4876.07 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:49:05,239 epoch 8 - iter 84/146 - loss 0.01037046 - time (sec): 5.34 - samples/sec: 4857.64 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:49:06,385 epoch 8 - iter 98/146 - loss 0.01144120 - time (sec): 6.49 - samples/sec: 4718.27 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:49:07,293 epoch 8 - iter 112/146 - loss 0.01108653 - time (sec): 7.40 - samples/sec: 4721.18 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:49:08,200 epoch 8 - iter 126/146 - loss 0.01315505 - time (sec): 8.31 - samples/sec: 4639.70 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:49:09,092 epoch 8 - iter 140/146 - loss 0.01296013 - time (sec): 9.20 - samples/sec: 4659.37 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:49:09,419 ----------------------------------------------------------------------------------------------------
2023-10-25 20:49:09,419 EPOCH 8 done: loss 0.0131 - lr: 0.000012
2023-10-25 20:49:10,337 DEV : loss 0.17299844324588776 - f1-score (micro avg) 0.7257
2023-10-25 20:49:10,342 ----------------------------------------------------------------------------------------------------
2023-10-25 20:49:11,325 epoch 9 - iter 14/146 - loss 0.01830680 - time (sec): 0.98 - samples/sec: 4488.91 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:49:12,187 epoch 9 - iter 28/146 - loss 0.01109420 - time (sec): 1.84 - samples/sec: 4547.76 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:49:12,956 epoch 9 - iter 42/146 - loss 0.01001611 - time (sec): 2.61 - samples/sec: 4550.41 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:49:13,857 epoch 9 - iter 56/146 - loss 0.00902443 - time (sec): 3.51 - samples/sec: 4624.04 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:49:14,691 epoch 9 - iter 70/146 - loss 0.00932559 - time (sec): 4.35 - samples/sec: 4630.51 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:49:15,596 epoch 9 - iter 84/146 - loss 0.00988126 - time (sec): 5.25 - samples/sec: 4582.74 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:49:16,492 epoch 9 - iter 98/146 - loss 0.00934970 - time (sec): 6.15 - samples/sec: 4703.10 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:49:17,401 epoch 9 - iter 112/146 - loss 0.00834335 - time (sec): 7.06 - samples/sec: 4723.11 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:49:18,388 epoch 9 - iter 126/146 - loss 0.00786094 - time (sec): 8.04 - samples/sec: 4757.90 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:49:19,410 epoch 9 - iter 140/146 - loss 0.00807518 - time (sec): 9.07 - samples/sec: 4739.93 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:49:19,762 ----------------------------------------------------------------------------------------------------
2023-10-25 20:49:19,762 EPOCH 9 done: loss 0.0082 - lr: 0.000006
2023-10-25 20:49:20,679 DEV : loss 0.19021421670913696 - f1-score (micro avg) 0.7032
2023-10-25 20:49:20,684 ----------------------------------------------------------------------------------------------------
2023-10-25 20:49:21,526 epoch 10 - iter 14/146 - loss 0.00264529 - time (sec): 0.84 - samples/sec: 5002.29 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:49:22,516 epoch 10 - iter 28/146 - loss 0.00715687 - time (sec): 1.83 - samples/sec: 4562.72 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:49:23,371 epoch 10 - iter 42/146 - loss 0.00756824 - time (sec): 2.69 - samples/sec: 4594.76 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:49:24,269 epoch 10 - iter 56/146 - loss 0.00627773 - time (sec): 3.58 - samples/sec: 4685.83 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:49:25,223 epoch 10 - iter 70/146 - loss 0.00727587 - time (sec): 4.54 - samples/sec: 4629.20 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:49:26,052 epoch 10 - iter 84/146 - loss 0.00623229 - time (sec): 5.37 - samples/sec: 4579.56 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:49:27,230 epoch 10 - iter 98/146 - loss 0.00645332 - time (sec): 6.54 - samples/sec: 4523.53 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:49:28,185 epoch 10 - iter 112/146 - loss 0.00637249 - time (sec): 7.50 - samples/sec: 4586.34 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:49:29,154 epoch 10 - iter 126/146 - loss 0.00621133 - time (sec): 8.47 - samples/sec: 4552.42 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:49:29,992 epoch 10 - iter 140/146 - loss 0.00573965 - time (sec): 9.31 - samples/sec: 4551.44 - lr: 0.000000 - momentum: 0.000000
2023-10-25 20:49:30,393 ----------------------------------------------------------------------------------------------------
2023-10-25 20:49:30,393 EPOCH 10 done: loss 0.0055 - lr: 0.000000
2023-10-25 20:49:31,304 DEV : loss 0.19148127734661102 - f1-score (micro avg) 0.6975
2023-10-25 20:49:31,833 ----------------------------------------------------------------------------------------------------
2023-10-25 20:49:31,834 Loading model from best epoch ...
2023-10-25 20:49:33,617 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 20:49:35,165
Results:
- F-score (micro) 0.754
- F-score (macro) 0.67
- Accuracy 0.6246

By class:
              precision    recall  f1-score   support

         PER     0.7983    0.8305    0.8141       348
         LOC     0.6616    0.8391    0.7399       261
         ORG     0.4565    0.4038    0.4286        52
   HumanProd     0.7143    0.6818    0.6977        22

   micro avg     0.7158    0.7965    0.7540       683
   macro avg     0.6577    0.6888    0.6700       683
weighted avg     0.7174    0.7965    0.7526       683

2023-10-25 20:49:35,166 ----------------------------------------------------------------------------------------------------