2023-10-25 21:09:57,475 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:57,476 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:09:57,476 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:57,476 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:09:57,477 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:57,477 Train: 1166 sentences
2023-10-25 21:09:57,477 (train_with_dev=False, train_with_test=False)
2023-10-25 21:09:57,477 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:57,477 Training Params:
2023-10-25 21:09:57,477  - learning_rate: "3e-05"
2023-10-25 21:09:57,477  - mini_batch_size: "4"
2023-10-25 21:09:57,477  - max_epochs: "10"
2023-10-25 21:09:57,477  - shuffle: "True"
2023-10-25 21:09:57,477 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:57,477 Plugins:
2023-10-25 21:09:57,477  - TensorboardLogger
2023-10-25 21:09:57,477  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:09:57,477 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:57,477 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:09:57,477  - metric: "('micro avg', 'f1-score')"
2023-10-25 21:09:57,477 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:57,477 Computation:
2023-10-25 21:09:57,477  - compute on device: cuda:0
2023-10-25 21:09:57,477  - embedding storage: none
2023-10-25 21:09:57,477 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:57,477 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 21:09:57,477 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:57,477 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:57,477 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:09:58,743 epoch 1 - iter 29/292 - loss 2.73833700 - time (sec): 1.26 - samples/sec: 2733.50 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:10:00,016 epoch 1 - iter 58/292 - loss 2.06223157 - time (sec): 2.54 - samples/sec: 2941.83 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:10:01,360 epoch 1 - iter 87/292 - loss 1.51849599 - time (sec): 3.88 - samples/sec: 3112.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:10:02,634 epoch 1 - iter 116/292 - loss 1.26889759 - time (sec): 5.16 - samples/sec: 3123.06 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:10:03,888 epoch 1 - iter 145/292 - loss 1.09220417 - time (sec): 6.41 - samples/sec: 3199.90 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:10:05,205 epoch 1 - iter 174/292 - loss 0.98328127 - time (sec): 7.73 - samples/sec: 3233.86 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:10:06,492 epoch 1 - iter 203/292 - loss 0.88448692 - time (sec): 9.01 - samples/sec: 3301.34 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:10:07,957 epoch 1 - iter 232/292 - loss 0.81219091 - time (sec): 10.48 - samples/sec: 3328.88 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:10:09,285 epoch 1 - iter 261/292 - loss 0.73736454 - time (sec): 11.81 - samples/sec: 3386.30 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:10:10,589 epoch 1 - iter 290/292 - loss 0.68903885 - time (sec): 13.11 - samples/sec: 3376.55 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:10:10,669 ----------------------------------------------------------------------------------------------------
2023-10-25 21:10:10,669 EPOCH 1 done: loss 0.6887 - lr: 0.000030
2023-10-25 21:10:11,337 DEV : loss 0.14741386473178864 - f1-score (micro avg) 0.5684
2023-10-25 21:10:11,341 saving best model
2023-10-25 21:10:11,804 ----------------------------------------------------------------------------------------------------
2023-10-25 21:10:13,116 epoch 2 - iter 29/292 - loss 0.19802987 - time (sec): 1.31 - samples/sec: 3532.37 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:10:14,467 epoch 2 - iter 58/292 - loss 0.16963628 - time (sec): 2.66 - samples/sec: 3655.77 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:10:15,747 epoch 2 - iter 87/292 - loss 0.17025510 - time (sec): 3.94 - samples/sec: 3550.44 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:10:17,076 epoch 2 - iter 116/292 - loss 0.16977186 - time (sec): 5.27 - samples/sec: 3463.69 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:10:18,323 epoch 2 - iter 145/292 - loss 0.16943346 - time (sec): 6.52 - samples/sec: 3405.57 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:10:19,568 epoch 2 - iter 174/292 - loss 0.17536090 - time (sec): 7.76 - samples/sec: 3354.69 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:10:20,803 epoch 2 - iter 203/292 - loss 0.17323557 - time (sec): 9.00 - samples/sec: 3361.80 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:10:22,166 epoch 2 - iter 232/292 - loss 0.16456716 - time (sec): 10.36 - samples/sec: 3373.62 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:10:23,436 epoch 2 - iter 261/292 - loss 0.16135994 - time (sec): 11.63 - samples/sec: 3407.22 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:10:24,699 epoch 2 - iter 290/292 - loss 0.16019950 - time (sec): 12.89 - samples/sec: 3421.71 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:10:24,782 ----------------------------------------------------------------------------------------------------
2023-10-25 21:10:24,783 EPOCH 2 done: loss 0.1601 - lr: 0.000027
2023-10-25 21:10:25,689 DEV : loss 0.1006804034113884 - f1-score (micro avg) 0.7293
2023-10-25 21:10:25,694 saving best model
2023-10-25 21:10:26,488 ----------------------------------------------------------------------------------------------------
2023-10-25 21:10:27,806 epoch 3 - iter 29/292 - loss 0.07594065 - time (sec): 1.31 - samples/sec: 3394.59 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:10:29,004 epoch 3 - iter 58/292 - loss 0.07959637 - time (sec): 2.51 - samples/sec: 3064.08 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:10:30,354 epoch 3 - iter 87/292 - loss 0.08406265 - time (sec): 3.86 - samples/sec: 3187.84 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:10:31,639 epoch 3 - iter 116/292 - loss 0.08390541 - time (sec): 5.15 - samples/sec: 3101.09 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:10:33,068 epoch 3 - iter 145/292 - loss 0.08728900 - time (sec): 6.58 - samples/sec: 3334.11 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:10:34,367 epoch 3 - iter 174/292 - loss 0.08901608 - time (sec): 7.88 - samples/sec: 3386.20 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:10:35,652 epoch 3 - iter 203/292 - loss 0.09014633 - time (sec): 9.16 - samples/sec: 3410.32 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:10:36,942 epoch 3 - iter 232/292 - loss 0.08866790 - time (sec): 10.45 - samples/sec: 3369.56 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:10:38,213 epoch 3 - iter 261/292 - loss 0.08888259 - time (sec): 11.72 - samples/sec: 3339.15 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:10:39,580 epoch 3 - iter 290/292 - loss 0.08998420 - time (sec): 13.09 - samples/sec: 3344.03 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:10:39,684 ----------------------------------------------------------------------------------------------------
2023-10-25 21:10:39,685 EPOCH 3 done: loss 0.0907 - lr: 0.000023
2023-10-25 21:10:40,596 DEV : loss 0.10108631104230881 - f1-score (micro avg) 0.7149
2023-10-25 21:10:40,601 ----------------------------------------------------------------------------------------------------
2023-10-25 21:10:41,978 epoch 4 - iter 29/292 - loss 0.06981579 - time (sec): 1.38 - samples/sec: 3683.46 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:10:43,335 epoch 4 - iter 58/292 - loss 0.06503789 - time (sec): 2.73 - samples/sec: 3312.15 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:10:44,674 epoch 4 - iter 87/292 - loss 0.06470976 - time (sec): 4.07 - samples/sec: 3261.96 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:10:45,984 epoch 4 - iter 116/292 - loss 0.06055619 - time (sec): 5.38 - samples/sec: 3217.83 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:10:47,230 epoch 4 - iter 145/292 - loss 0.05764284 - time (sec): 6.63 - samples/sec: 3167.14 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:10:48,543 epoch 4 - iter 174/292 - loss 0.06190221 - time (sec): 7.94 - samples/sec: 3228.80 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:10:49,817 epoch 4 - iter 203/292 - loss 0.06220813 - time (sec): 9.22 - samples/sec: 3221.19 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:10:51,171 epoch 4 - iter 232/292 - loss 0.06042297 - time (sec): 10.57 - samples/sec: 3177.20 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:10:52,526 epoch 4 - iter 261/292 - loss 0.06188990 - time (sec): 11.92 - samples/sec: 3277.30 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:10:53,812 epoch 4 - iter 290/292 - loss 0.06067152 - time (sec): 13.21 - samples/sec: 3353.33 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:10:53,891 ----------------------------------------------------------------------------------------------------
2023-10-25 21:10:53,892 EPOCH 4 done: loss 0.0605 - lr: 0.000020
2023-10-25 21:10:54,800 DEV : loss 0.12220078706741333 - f1-score (micro avg) 0.7566
2023-10-25 21:10:54,805 saving best model
2023-10-25 21:10:55,307 ----------------------------------------------------------------------------------------------------
2023-10-25 21:10:56,618 epoch 5 - iter 29/292 - loss 0.07226843 - time (sec): 1.31 - samples/sec: 3651.83 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:10:57,855 epoch 5 - iter 58/292 - loss 0.05576230 - time (sec): 2.55 - samples/sec: 3360.82 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:10:59,178 epoch 5 - iter 87/292 - loss 0.05222793 - time (sec): 3.87 - samples/sec: 3532.22 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:11:00,461 epoch 5 - iter 116/292 - loss 0.04893205 - time (sec): 5.15 - samples/sec: 3442.37 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:11:01,744 epoch 5 - iter 145/292 - loss 0.04364075 - time (sec): 6.44 - samples/sec: 3410.85 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:11:03,003 epoch 5 - iter 174/292 - loss 0.04155609 - time (sec): 7.70 - samples/sec: 3347.83 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:11:04,289 epoch 5 - iter 203/292 - loss 0.04349718 - time (sec): 8.98 - samples/sec: 3332.55 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:11:05,576 epoch 5 - iter 232/292 - loss 0.04323634 - time (sec): 10.27 - samples/sec: 3396.22 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:11:06,959 epoch 5 - iter 261/292 - loss 0.04192164 - time (sec): 11.65 - samples/sec: 3438.20 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:11:08,205 epoch 5 - iter 290/292 - loss 0.04216845 - time (sec): 12.90 - samples/sec: 3436.94 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:11:08,280 ----------------------------------------------------------------------------------------------------
2023-10-25 21:11:08,280 EPOCH 5 done: loss 0.0421 - lr: 0.000017
2023-10-25 21:11:09,192 DEV : loss 0.14462324976921082 - f1-score (micro avg) 0.7615
2023-10-25 21:11:09,197 saving best model
2023-10-25 21:11:09,811 ----------------------------------------------------------------------------------------------------
2023-10-25 21:11:11,084 epoch 6 - iter 29/292 - loss 0.03071517 - time (sec): 1.27 - samples/sec: 3367.38 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:11:12,392 epoch 6 - iter 58/292 - loss 0.03420363 - time (sec): 2.58 - samples/sec: 3399.63 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:11:13,735 epoch 6 - iter 87/292 - loss 0.02882773 - time (sec): 3.92 - samples/sec: 3494.61 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:11:15,013 epoch 6 - iter 116/292 - loss 0.02863387 - time (sec): 5.20 - samples/sec: 3502.54 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:11:16,288 epoch 6 - iter 145/292 - loss 0.02715294 - time (sec): 6.47 - samples/sec: 3430.36 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:11:17,594 epoch 6 - iter 174/292 - loss 0.02561636 - time (sec): 7.78 - samples/sec: 3368.67 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:11:18,908 epoch 6 - iter 203/292 - loss 0.02415975 - time (sec): 9.09 - samples/sec: 3379.04 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:11:20,254 epoch 6 - iter 232/292 - loss 0.02654283 - time (sec): 10.44 - samples/sec: 3388.04 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:11:21,502 epoch 6 - iter 261/292 - loss 0.02923450 - time (sec): 11.69 - samples/sec: 3425.09 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:11:22,697 epoch 6 - iter 290/292 - loss 0.02964102 - time (sec): 12.88 - samples/sec: 3434.90 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:11:22,773 ----------------------------------------------------------------------------------------------------
2023-10-25 21:11:22,774 EPOCH 6 done: loss 0.0298 - lr: 0.000013
2023-10-25 21:11:23,685 DEV : loss 0.14538371562957764 - f1-score (micro avg) 0.7451
2023-10-25 21:11:23,689 ----------------------------------------------------------------------------------------------------
2023-10-25 21:11:24,920 epoch 7 - iter 29/292 - loss 0.01672833 - time (sec): 1.23 - samples/sec: 3498.10 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:11:26,397 epoch 7 - iter 58/292 - loss 0.02256440 - time (sec): 2.71 - samples/sec: 3800.73 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:11:27,816 epoch 7 - iter 87/292 - loss 0.02204005 - time (sec): 4.13 - samples/sec: 3378.64 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:11:29,078 epoch 7 - iter 116/292 - loss 0.02236449 - time (sec): 5.39 - samples/sec: 3287.29 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:11:30,408 epoch 7 - iter 145/292 - loss 0.02209624 - time (sec): 6.72 - samples/sec: 3341.87 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:11:31,689 epoch 7 - iter 174/292 - loss 0.02162009 - time (sec): 8.00 - samples/sec: 3385.55 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:11:32,976 epoch 7 - iter 203/292 - loss 0.02205763 - time (sec): 9.29 - samples/sec: 3396.11 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:11:34,194 epoch 7 - iter 232/292 - loss 0.02172002 - time (sec): 10.50 - samples/sec: 3344.86 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:11:35,497 epoch 7 - iter 261/292 - loss 0.02105659 - time (sec): 11.81 - samples/sec: 3360.74 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:11:36,791 epoch 7 - iter 290/292 - loss 0.01955965 - time (sec): 13.10 - samples/sec: 3379.17 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:11:36,874 ----------------------------------------------------------------------------------------------------
2023-10-25 21:11:36,874 EPOCH 7 done: loss 0.0195 - lr: 0.000010
2023-10-25 21:11:37,790 DEV : loss 0.13841482996940613 - f1-score (micro avg) 0.7592
2023-10-25 21:11:37,794 ----------------------------------------------------------------------------------------------------
2023-10-25 21:11:39,074 epoch 8 - iter 29/292 - loss 0.02671843 - time (sec): 1.28 - samples/sec: 3108.12 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:11:40,476 epoch 8 - iter 58/292 - loss 0.02178925 - time (sec): 2.68 - samples/sec: 3397.93 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:11:41,813 epoch 8 - iter 87/292 - loss 0.01899837 - time (sec): 4.02 - samples/sec: 3467.31 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:11:43,060 epoch 8 - iter 116/292 - loss 0.02281207 - time (sec): 5.27 - samples/sec: 3471.94 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:11:44,356 epoch 8 - iter 145/292 - loss 0.02038146 - time (sec): 6.56 - samples/sec: 3427.58 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:11:45,606 epoch 8 - iter 174/292 - loss 0.01938016 - time (sec): 7.81 - samples/sec: 3442.67 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:11:46,852 epoch 8 - iter 203/292 - loss 0.01829205 - time (sec): 9.06 - samples/sec: 3394.12 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:11:48,119 epoch 8 - iter 232/292 - loss 0.01759759 - time (sec): 10.32 - samples/sec: 3435.64 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:11:49,392 epoch 8 - iter 261/292 - loss 0.01611240 - time (sec): 11.60 - samples/sec: 3441.24 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:11:50,615 epoch 8 - iter 290/292 - loss 0.01512526 - time (sec): 12.82 - samples/sec: 3457.28 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:11:50,691 ----------------------------------------------------------------------------------------------------
2023-10-25 21:11:50,691 EPOCH 8 done: loss 0.0151 - lr: 0.000007
2023-10-25 21:11:51,601 DEV : loss 0.158660426735878 - f1-score (micro avg) 0.7716
2023-10-25 21:11:51,605 saving best model
2023-10-25 21:11:52,218 ----------------------------------------------------------------------------------------------------
2023-10-25 21:11:53,533 epoch 9 - iter 29/292 - loss 0.00434558 - time (sec): 1.31 - samples/sec: 3611.31 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:11:54,797 epoch 9 - iter 58/292 - loss 0.00498322 - time (sec): 2.58 - samples/sec: 3555.16 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:11:56,049 epoch 9 - iter 87/292 - loss 0.00460247 - time (sec): 3.83 - samples/sec: 3395.57 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:11:57,416 epoch 9 - iter 116/292 - loss 0.00516195 - time (sec): 5.20 - samples/sec: 3445.47 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:11:58,788 epoch 9 - iter 145/292 - loss 0.00698851 - time (sec): 6.57 - samples/sec: 3472.08 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:12:00,089 epoch 9 - iter 174/292 - loss 0.00881053 - time (sec): 7.87 - samples/sec: 3480.23 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:12:01,400 epoch 9 - iter 203/292 - loss 0.00796033 - time (sec): 9.18 - samples/sec: 3444.05 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:12:02,643 epoch 9 - iter 232/292 - loss 0.00873789 - time (sec): 10.42 - samples/sec: 3416.19 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:12:03,950 epoch 9 - iter 261/292 - loss 0.00842511 - time (sec): 11.73 - samples/sec: 3381.93 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:12:05,264 epoch 9 - iter 290/292 - loss 0.00863425 - time (sec): 13.04 - samples/sec: 3387.58 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:12:05,342 ----------------------------------------------------------------------------------------------------
2023-10-25 21:12:05,342 EPOCH 9 done: loss 0.0086 - lr: 0.000003
2023-10-25 21:12:06,262 DEV : loss 0.1723966747522354 - f1-score (micro avg) 0.7479
2023-10-25 21:12:06,266 ----------------------------------------------------------------------------------------------------
2023-10-25 21:12:07,519 epoch 10 - iter 29/292 - loss 0.00123487 - time (sec): 1.25 - samples/sec: 3447.36 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:12:08,777 epoch 10 - iter 58/292 - loss 0.00792751 - time (sec): 2.51 - samples/sec: 3468.05 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:12:10,077 epoch 10 - iter 87/292 - loss 0.01303484 - time (sec): 3.81 - samples/sec: 3461.35 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:12:11,290 epoch 10 - iter 116/292 - loss 0.01034843 - time (sec): 5.02 - samples/sec: 3469.73 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:12:12,506 epoch 10 - iter 145/292 - loss 0.00886202 - time (sec): 6.24 - samples/sec: 3469.77 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:12:13,730 epoch 10 - iter 174/292 - loss 0.00848014 - time (sec): 7.46 - samples/sec: 3435.23 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:12:15,122 epoch 10 - iter 203/292 - loss 0.00876510 - time (sec): 8.86 - samples/sec: 3504.95 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:12:16,413 epoch 10 - iter 232/292 - loss 0.00952739 - time (sec): 10.15 - samples/sec: 3475.76 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:12:17,803 epoch 10 - iter 261/292 - loss 0.00885530 - time (sec): 11.54 - samples/sec: 3462.61 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:12:19,084 epoch 10 - iter 290/292 - loss 0.00831916 - time (sec): 12.82 - samples/sec: 3445.59 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:12:19,171 ----------------------------------------------------------------------------------------------------
2023-10-25 21:12:19,171 EPOCH 10 done: loss 0.0083 - lr: 0.000000
2023-10-25 21:12:20,093 DEV : loss 0.17783689498901367 - f1-score (micro avg) 0.7511
2023-10-25 21:12:20,562 ----------------------------------------------------------------------------------------------------
2023-10-25 21:12:20,563 Loading model from best epoch ...
2023-10-25 21:12:22,175 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 21:12:23,911 Results:
- F-score (micro) 0.7554
- F-score (macro) 0.6669
- Accuracy 0.6324

By class:
              precision    recall  f1-score   support

         PER     0.7733    0.8333    0.8022       348
         LOC     0.7063    0.8199    0.7589       261
         ORG     0.4583    0.4231    0.4400        52
   HumanProd     0.6154    0.7273    0.6667        22

   micro avg     0.7207    0.7936    0.7554       683
   macro avg     0.6383    0.7009    0.6669       683
weighted avg     0.7186    0.7936    0.7537       683

2023-10-25 21:12:23,911 ----------------------------------------------------------------------------------------------------
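The lr column in the log follows the LinearScheduler plugin with warmup_fraction '0.1': with 292 batches per epoch and 10 epochs (2920 steps total), the learning rate ramps linearly from 0 to the peak 3e-05 over the first 292 steps (roughly epoch 1), then decays linearly back to 0. A minimal sketch of that schedule shape (the function name and code are illustrative, not Flair's implementation):

```python
def linear_schedule_lr(step: int, total_steps: int, peak_lr: float, warmup_fraction: float) -> float:
    """Linear warmup to peak_lr over the first warmup_fraction of steps, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # warmup phase
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay phase

TOTAL = 292 * 10  # batches per epoch x epochs, as logged
PEAK = 3e-05      # the logged learning_rate

print(linear_schedule_lr(29, TOTAL, PEAK, 0.1))    # close to the logged "epoch 1 - iter 29/292 - lr: 0.000003"
print(linear_schedule_lr(292, TOTAL, PEAK, 0.1))   # peak reached at the end of warmup, cf. "EPOCH 1 done: lr: 0.000030"
print(linear_schedule_lr(2920, TOTAL, PEAK, 0.1))  # decays to 0 by the last step, cf. "EPOCH 10 done: lr: 0.000000"
```

This reproduces the logged trajectory: lr rises by 3e-06 per reporting interval during epoch 1 and then shrinks toward zero across epochs 2 through 10.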
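The final numbers hang together: the tag dictionary has 17 entries because the four entity types (LOC, PER, ORG, HumanProd) each get B/I/E/S tags plus a single O tag, matching the linear head's out_features=17, and the aggregate F-scores follow from the per-class rows of the table. A small sanity-check sketch over the logged values (not part of the original run):

```python
# 4 entity types x {B, I, E, S} + O = the 17 tags the SequenceTagger predicts.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
n_tags = 4 * len(entity_types) + 1

# Per-class f1-scores copied from the logged "By class" report.
per_class_f1 = {"PER": 0.8022, "LOC": 0.7589, "ORG": 0.4400, "HumanProd": 0.6667}

# Micro-average F1 is the harmonic mean of the pooled precision and recall.
micro_p, micro_r = 0.7207, 0.7936
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro-average F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(n_tags)    # 17, the linear layer's out_features
print(micro_f1)  # agrees with the logged F-score (micro) 0.7554 up to rounding
print(macro_f1)  # agrees with the logged F-score (macro) 0.6669 up to rounding
```

Note that micro averaging pools all 683 test entities, so the dominant PER and LOC classes drive the 0.7554 micro F1, while the weaker ORG class pulls the macro F1 down to 0.6669.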