2023-10-13 12:16:20,073 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
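Note on the output layer: the `Linear(in_features=768, out_features=21)` head corresponds to a BIOES tag inventory over the five HIPE-2020 entity types (loc, pers, org, prod, time) plus the O tag, exactly the 21-tag dictionary listed at the end of this log. A minimal sketch (plain Python, no Flair dependency; `bioes_tag_dictionary` is an illustrative helper, not a Flair API) of how that count arises:

```python
# BIOES scheme: every entity type expands into S- (single), B- (begin),
# E- (end) and I- (inside) tags; one shared O tag covers non-entity tokens.
def bioes_tag_dictionary(entity_types):
    """Build the BIOES tag inventory for the given entity types."""
    tags = ["O"]
    for etype in entity_types:
        tags.extend(f"{prefix}-{etype}" for prefix in ("S", "B", "E", "I"))
    return tags

tags = bioes_tag_dictionary(["loc", "pers", "org", "prod", "time"])
print(len(tags))  # 5 types x 4 positional tags + O = 21, matching out_features=21
```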
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Train:  3575 sentences
2023-10-13 12:16:20,074         (train_with_dev=False, train_with_test=False)
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Training Params:
2023-10-13 12:16:20,074  - learning_rate: "5e-05"
2023-10-13 12:16:20,074  - mini_batch_size: "8"
2023-10-13 12:16:20,074  - max_epochs: "10"
2023-10-13 12:16:20,074  - shuffle: "True"
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Plugins:
2023-10-13 12:16:20,074  - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 12:16:20,074  - metric: "('micro avg', 'f1-score')"
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Computation:
2023-10-13 12:16:20,075  - compute on device: cuda:0
2023-10-13 12:16:20,075  - embedding storage: none
2023-10-13 12:16:20,075 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,075 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 12:16:20,075 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,075 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:22,769 epoch 1 - iter 44/447 - loss 3.14261033 - time (sec): 2.69 - samples/sec: 2960.33 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:16:25,692 epoch 1 - iter 88/447 - loss 2.14441076 - time (sec): 5.62 - samples/sec: 2837.51 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:16:28,648 epoch 1 - iter 132/447 - loss 1.53821589 - time (sec): 8.57 - samples/sec: 2881.13 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:16:31,310 epoch 1 - iter 176/447 - loss 1.26325007 - time (sec): 11.23 - samples/sec: 2904.95 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:16:34,078 epoch 1 - iter 220/447 - loss 1.06860500 - time (sec): 14.00 - samples/sec: 2946.64 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:16:37,553 epoch 1 - iter 264/447 - loss 0.91852698 - time (sec): 17.48 - samples/sec: 2953.39 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:16:40,411 epoch 1 - iter 308/447 - loss 0.83070003 - time (sec): 20.34 - samples/sec: 2931.86 - lr: 0.000034 - momentum: 0.000000
2023-10-13 12:16:43,058 epoch 1 - iter 352/447 - loss 0.75729611 - time (sec): 22.98 - samples/sec: 2960.94 - lr: 0.000039 - momentum: 0.000000
2023-10-13 12:16:46,142 epoch 1 - iter 396/447 - loss 0.70030509 - time (sec): 26.07 - samples/sec: 2932.61 - lr: 0.000044 - momentum: 0.000000
2023-10-13 12:16:49,210 epoch 1 - iter 440/447 - loss 0.65259845 - time (sec): 29.13 - samples/sec: 2905.36 - lr: 0.000049 - momentum: 0.000000
2023-10-13 12:16:49,782 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:49,783 EPOCH 1 done: loss 0.6424 - lr: 0.000049
2023-10-13 12:16:54,809 DEV : loss 0.17999280989170074 - f1-score (micro avg)  0.6119
2023-10-13 12:16:54,839 saving best model
2023-10-13 12:16:55,180 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:58,029 epoch 2 - iter 44/447 - loss 0.21885310 - time (sec): 2.85 - samples/sec: 2992.93 - lr: 0.000049 - momentum: 0.000000
2023-10-13 12:17:00,968 epoch 2 - iter 88/447 - loss 0.20348487 - time (sec): 5.79 - samples/sec: 2915.48 - lr: 0.000049 - momentum: 0.000000
2023-10-13 12:17:03,645 epoch 2 - iter 132/447 - loss 0.18779304 - time (sec): 8.46 - samples/sec: 2955.05 - lr: 0.000048 - momentum: 0.000000
2023-10-13 12:17:06,522 epoch 2 - iter 176/447 - loss 0.18362775 - time (sec): 11.34 - samples/sec: 2978.59 - lr: 0.000048 - momentum: 0.000000
2023-10-13 12:17:09,210 epoch 2 - iter 220/447 - loss 0.17259586 - time (sec): 14.03 - samples/sec: 2963.51 - lr: 0.000047 - momentum: 0.000000
2023-10-13 12:17:12,381 epoch 2 - iter 264/447 - loss 0.16827387 - time (sec): 17.20 - samples/sec: 2960.19 - lr: 0.000047 - momentum: 0.000000
2023-10-13 12:17:15,141 epoch 2 - iter 308/447 - loss 0.16382626 - time (sec): 19.96 - samples/sec: 2969.07 - lr: 0.000046 - momentum: 0.000000
2023-10-13 12:17:18,142 epoch 2 - iter 352/447 - loss 0.15749203 - time (sec): 22.96 - samples/sec: 2987.22 - lr: 0.000046 - momentum: 0.000000
2023-10-13 12:17:21,109 epoch 2 - iter 396/447 - loss 0.15568860 - time (sec): 25.93 - samples/sec: 2968.50 - lr: 0.000045 - momentum: 0.000000
2023-10-13 12:17:23,854 epoch 2 - iter 440/447 - loss 0.15491984 - time (sec): 28.67 - samples/sec: 2972.67 - lr: 0.000045 - momentum: 0.000000
2023-10-13 12:17:24,241 ----------------------------------------------------------------------------------------------------
2023-10-13 12:17:24,241 EPOCH 2 done: loss 0.1546 - lr: 0.000045
2023-10-13 12:17:33,041 DEV : loss 0.11931649595499039 - f1-score (micro avg)  0.7133
2023-10-13 12:17:33,072 saving best model
2023-10-13 12:17:33,543 ----------------------------------------------------------------------------------------------------
2023-10-13 12:17:36,515 epoch 3 - iter 44/447 - loss 0.09401755 - time (sec): 2.97 - samples/sec: 2867.56 - lr: 0.000044 - momentum: 0.000000
2023-10-13 12:17:39,572 epoch 3 - iter 88/447 - loss 0.08053615 - time (sec): 6.03 - samples/sec: 3004.31 - lr: 0.000043 - momentum: 0.000000
2023-10-13 12:17:42,521 epoch 3 - iter 132/447 - loss 0.08329820 - time (sec): 8.98 - samples/sec: 3034.73 - lr: 0.000043 - momentum: 0.000000
2023-10-13 12:17:45,376 epoch 3 - iter 176/447 - loss 0.07875894 - time (sec): 11.83 - samples/sec: 3054.35 - lr: 0.000042 - momentum: 0.000000
2023-10-13 12:17:48,289 epoch 3 - iter 220/447 - loss 0.08444205 - time (sec): 14.74 - samples/sec: 3052.92 - lr: 0.000042 - momentum: 0.000000
2023-10-13 12:17:50,914 epoch 3 - iter 264/447 - loss 0.08644774 - time (sec): 17.37 - samples/sec: 3028.81 - lr: 0.000041 - momentum: 0.000000
2023-10-13 12:17:53,528 epoch 3 - iter 308/447 - loss 0.08382388 - time (sec): 19.98 - samples/sec: 3043.28 - lr: 0.000041 - momentum: 0.000000
2023-10-13 12:17:56,182 epoch 3 - iter 352/447 - loss 0.08593653 - time (sec): 22.64 - samples/sec: 3040.32 - lr: 0.000040 - momentum: 0.000000
2023-10-13 12:17:59,167 epoch 3 - iter 396/447 - loss 0.08618714 - time (sec): 25.62 - samples/sec: 3009.13 - lr: 0.000040 - momentum: 0.000000
2023-10-13 12:18:02,080 epoch 3 - iter 440/447 - loss 0.08664836 - time (sec): 28.54 - samples/sec: 2988.13 - lr: 0.000039 - momentum: 0.000000
2023-10-13 12:18:02,548 ----------------------------------------------------------------------------------------------------
2023-10-13 12:18:02,548 EPOCH 3 done: loss 0.0863 - lr: 0.000039
2023-10-13 12:18:11,118 DEV : loss 0.1396493762731552 - f1-score (micro avg)  0.7502
2023-10-13 12:18:11,149 saving best model
2023-10-13 12:18:11,612 ----------------------------------------------------------------------------------------------------
2023-10-13 12:18:14,188 epoch 4 - iter 44/447 - loss 0.05525879 - time (sec): 2.57 - samples/sec: 2939.76 - lr: 0.000038 - momentum: 0.000000
2023-10-13 12:18:17,182 epoch 4 - iter 88/447 - loss 0.04633208 - time (sec): 5.56 - samples/sec: 3023.46 - lr: 0.000038 - momentum: 0.000000
2023-10-13 12:18:19,914 epoch 4 - iter 132/447 - loss 0.05499492 - time (sec): 8.30 - samples/sec: 3023.56 - lr: 0.000037 - momentum: 0.000000
2023-10-13 12:18:22,639 epoch 4 - iter 176/447 - loss 0.05387788 - time (sec): 11.02 - samples/sec: 3051.89 - lr: 0.000037 - momentum: 0.000000
2023-10-13 12:18:25,154 epoch 4 - iter 220/447 - loss 0.05237504 - time (sec): 13.53 - samples/sec: 3033.31 - lr: 0.000036 - momentum: 0.000000
2023-10-13 12:18:28,279 epoch 4 - iter 264/447 - loss 0.04945493 - time (sec): 16.66 - samples/sec: 3062.16 - lr: 0.000036 - momentum: 0.000000
2023-10-13 12:18:31,380 epoch 4 - iter 308/447 - loss 0.04857546 - time (sec): 19.76 - samples/sec: 3025.12 - lr: 0.000035 - momentum: 0.000000
2023-10-13 12:18:34,060 epoch 4 - iter 352/447 - loss 0.04897486 - time (sec): 22.44 - samples/sec: 3022.95 - lr: 0.000035 - momentum: 0.000000
2023-10-13 12:18:37,004 epoch 4 - iter 396/447 - loss 0.04934248 - time (sec): 25.39 - samples/sec: 3037.22 - lr: 0.000034 - momentum: 0.000000
2023-10-13 12:18:39,766 epoch 4 - iter 440/447 - loss 0.04973534 - time (sec): 28.15 - samples/sec: 3032.90 - lr: 0.000033 - momentum: 0.000000
2023-10-13 12:18:40,175 ----------------------------------------------------------------------------------------------------
2023-10-13 12:18:40,175 EPOCH 4 done: loss 0.0495 - lr: 0.000033
2023-10-13 12:18:48,839 DEV : loss 0.1706864982843399 - f1-score (micro avg)  0.7558
2023-10-13 12:18:48,869 saving best model
2023-10-13 12:18:49,312 ----------------------------------------------------------------------------------------------------
2023-10-13 12:18:52,083 epoch 5 - iter 44/447 - loss 0.05200091 - time (sec): 2.76 - samples/sec: 2928.37 - lr: 0.000033 - momentum: 0.000000
2023-10-13 12:18:54,786 epoch 5 - iter 88/447 - loss 0.03619275 - time (sec): 5.46 - samples/sec: 2908.21 - lr: 0.000032 - momentum: 0.000000
2023-10-13 12:18:57,731 epoch 5 - iter 132/447 - loss 0.03375555 - time (sec): 8.41 - samples/sec: 2915.48 - lr: 0.000032 - momentum: 0.000000
2023-10-13 12:19:00,469 epoch 5 - iter 176/447 - loss 0.03452040 - time (sec): 11.15 - samples/sec: 2929.20 - lr: 0.000031 - momentum: 0.000000
2023-10-13 12:19:03,583 epoch 5 - iter 220/447 - loss 0.03448902 - time (sec): 14.26 - samples/sec: 2980.50 - lr: 0.000031 - momentum: 0.000000
2023-10-13 12:19:06,193 epoch 5 - iter 264/447 - loss 0.03387829 - time (sec): 16.87 - samples/sec: 3013.19 - lr: 0.000030 - momentum: 0.000000
2023-10-13 12:19:09,170 epoch 5 - iter 308/447 - loss 0.03356111 - time (sec): 19.85 - samples/sec: 3011.24 - lr: 0.000030 - momentum: 0.000000
2023-10-13 12:19:12,309 epoch 5 - iter 352/447 - loss 0.03271895 - time (sec): 22.99 - samples/sec: 3003.67 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:19:15,047 epoch 5 - iter 396/447 - loss 0.03239190 - time (sec): 25.72 - samples/sec: 3021.95 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:19:17,519 epoch 5 - iter 440/447 - loss 0.03136978 - time (sec): 28.20 - samples/sec: 3022.80 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:19:17,931 ----------------------------------------------------------------------------------------------------
2023-10-13 12:19:17,931 EPOCH 5 done: loss 0.0310 - lr: 0.000028
2023-10-13 12:19:26,603 DEV : loss 0.20082274079322815 - f1-score (micro avg)  0.7696
2023-10-13 12:19:26,634 saving best model
2023-10-13 12:19:27,018 ----------------------------------------------------------------------------------------------------
2023-10-13 12:19:29,851 epoch 6 - iter 44/447 - loss 0.01622643 - time (sec): 2.83 - samples/sec: 3032.18 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:19:32,640 epoch 6 - iter 88/447 - loss 0.01695102 - time (sec): 5.62 - samples/sec: 2985.73 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:19:35,449 epoch 6 - iter 132/447 - loss 0.01667450 - time (sec): 8.43 - samples/sec: 2944.44 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:19:38,260 epoch 6 - iter 176/447 - loss 0.01827741 - time (sec): 11.24 - samples/sec: 2951.34 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:19:41,153 epoch 6 - iter 220/447 - loss 0.01959407 - time (sec): 14.13 - samples/sec: 2944.73 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:19:43,808 epoch 6 - iter 264/447 - loss 0.02009856 - time (sec): 16.79 - samples/sec: 2975.89 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:19:46,363 epoch 6 - iter 308/447 - loss 0.02044350 - time (sec): 19.34 - samples/sec: 2989.34 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:19:49,232 epoch 6 - iter 352/447 - loss 0.02069867 - time (sec): 22.21 - samples/sec: 2991.02 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:19:52,193 epoch 6 - iter 396/447 - loss 0.02098711 - time (sec): 25.17 - samples/sec: 2980.62 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:19:55,409 epoch 6 - iter 440/447 - loss 0.02136923 - time (sec): 28.39 - samples/sec: 2994.56 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:19:55,894 ----------------------------------------------------------------------------------------------------
2023-10-13 12:19:55,894 EPOCH 6 done: loss 0.0211 - lr: 0.000022
2023-10-13 12:20:04,495 DEV : loss 0.23715780675411224 - f1-score (micro avg)  0.7733
2023-10-13 12:20:04,525 saving best model
2023-10-13 12:20:04,985 ----------------------------------------------------------------------------------------------------
2023-10-13 12:20:07,786 epoch 7 - iter 44/447 - loss 0.01199565 - time (sec): 2.79 - samples/sec: 3082.52 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:20:10,593 epoch 7 - iter 88/447 - loss 0.01126610 - time (sec): 5.60 - samples/sec: 3054.32 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:20:13,242 epoch 7 - iter 132/447 - loss 0.01064061 - time (sec): 8.25 - samples/sec: 3167.04 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:20:16,107 epoch 7 - iter 176/447 - loss 0.01633639 - time (sec): 11.11 - samples/sec: 3129.97 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:20:18,845 epoch 7 - iter 220/447 - loss 0.01466051 - time (sec): 13.85 - samples/sec: 3084.13 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:20:21,664 epoch 7 - iter 264/447 - loss 0.01635494 - time (sec): 16.67 - samples/sec: 3083.72 - lr: 0.000019 - momentum: 0.000000
2023-10-13 12:20:24,361 epoch 7 - iter 308/447 - loss 0.01815937 - time (sec): 19.37 - samples/sec: 3069.17 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:20:27,142 epoch 7 - iter 352/447 - loss 0.01751451 - time (sec): 22.15 - samples/sec: 3064.68 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:20:29,760 epoch 7 - iter 396/447 - loss 0.01791652 - time (sec): 24.77 - samples/sec: 3047.44 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:20:32,796 epoch 7 - iter 440/447 - loss 0.01696968 - time (sec): 27.80 - samples/sec: 3043.76 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:20:33,478 ----------------------------------------------------------------------------------------------------
2023-10-13 12:20:33,479 EPOCH 7 done: loss 0.0169 - lr: 0.000017
2023-10-13 12:20:42,328 DEV : loss 0.24267171323299408 - f1-score (micro avg)  0.7719
2023-10-13 12:20:42,360 ----------------------------------------------------------------------------------------------------
2023-10-13 12:20:45,065 epoch 8 - iter 44/447 - loss 0.00574822 - time (sec): 2.70 - samples/sec: 3087.09 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:20:48,466 epoch 8 - iter 88/447 - loss 0.00743845 - time (sec): 6.11 - samples/sec: 2924.33 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:20:51,212 epoch 8 - iter 132/447 - loss 0.00792900 - time (sec): 8.85 - samples/sec: 2948.74 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:20:53,937 epoch 8 - iter 176/447 - loss 0.00785956 - time (sec): 11.58 - samples/sec: 2982.77 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:20:56,601 epoch 8 - iter 220/447 - loss 0.00788583 - time (sec): 14.24 - samples/sec: 2978.18 - lr: 0.000014 - momentum: 0.000000
2023-10-13 12:20:59,624 epoch 8 - iter 264/447 - loss 0.00853818 - time (sec): 17.26 - samples/sec: 2966.63 - lr: 0.000013 - momentum: 0.000000
2023-10-13 12:21:02,453 epoch 8 - iter 308/447 - loss 0.00916299 - time (sec): 20.09 - samples/sec: 2990.31 - lr: 0.000013 - momentum: 0.000000
2023-10-13 12:21:05,235 epoch 8 - iter 352/447 - loss 0.00932899 - time (sec): 22.87 - samples/sec: 2981.93 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:21:08,099 epoch 8 - iter 396/447 - loss 0.01034094 - time (sec): 25.74 - samples/sec: 2982.67 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:21:11,010 epoch 8 - iter 440/447 - loss 0.01011961 - time (sec): 28.65 - samples/sec: 2975.75 - lr: 0.000011 - momentum: 0.000000
2023-10-13 12:21:11,441 ----------------------------------------------------------------------------------------------------
2023-10-13 12:21:11,441 EPOCH 8 done: loss 0.0101 - lr: 0.000011
2023-10-13 12:21:19,717 DEV : loss 0.2543591260910034 - f1-score (micro avg)  0.7859
2023-10-13 12:21:19,749 saving best model
2023-10-13 12:21:20,261 ----------------------------------------------------------------------------------------------------
2023-10-13 12:21:23,549 epoch 9 - iter 44/447 - loss 0.00528114 - time (sec): 3.28 - samples/sec: 2607.80 - lr: 0.000011 - momentum: 0.000000
2023-10-13 12:21:26,230 epoch 9 - iter 88/447 - loss 0.00993798 - time (sec): 5.96 - samples/sec: 2822.40 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:21:29,082 epoch 9 - iter 132/447 - loss 0.00867724 - time (sec): 8.82 - samples/sec: 2840.89 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:21:31,841 epoch 9 - iter 176/447 - loss 0.00730150 - time (sec): 11.57 - samples/sec: 2909.04 - lr: 0.000009 - momentum: 0.000000
2023-10-13 12:21:35,160 epoch 9 - iter 220/447 - loss 0.00633256 - time (sec): 14.89 - samples/sec: 2909.00 - lr: 0.000008 - momentum: 0.000000
2023-10-13 12:21:37,857 epoch 9 - iter 264/447 - loss 0.00584102 - time (sec): 17.59 - samples/sec: 2941.43 - lr: 0.000008 - momentum: 0.000000
2023-10-13 12:21:40,619 epoch 9 - iter 308/447 - loss 0.00698111 - time (sec): 20.35 - samples/sec: 2937.67 - lr: 0.000007 - momentum: 0.000000
2023-10-13 12:21:43,603 epoch 9 - iter 352/447 - loss 0.00775451 - time (sec): 23.34 - samples/sec: 2946.59 - lr: 0.000007 - momentum: 0.000000
2023-10-13 12:21:46,303 epoch 9 - iter 396/447 - loss 0.00750628 - time (sec): 26.04 - samples/sec: 2943.23 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:21:49,116 epoch 9 - iter 440/447 - loss 0.00746570 - time (sec): 28.85 - samples/sec: 2948.98 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:21:49,770 ----------------------------------------------------------------------------------------------------
2023-10-13 12:21:49,771 EPOCH 9 done: loss 0.0075 - lr: 0.000006
2023-10-13 12:21:58,239 DEV : loss 0.2509065568447113 - f1-score (micro avg)  0.7812
2023-10-13 12:21:58,271 ----------------------------------------------------------------------------------------------------
2023-10-13 12:22:01,228 epoch 10 - iter 44/447 - loss 0.00265283 - time (sec): 2.96 - samples/sec: 3095.76 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:22:03,969 epoch 10 - iter 88/447 - loss 0.00408723 - time (sec): 5.70 - samples/sec: 3012.81 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:22:06,819 epoch 10 - iter 132/447 - loss 0.00409162 - time (sec): 8.55 - samples/sec: 2966.94 - lr: 0.000004 - momentum: 0.000000
2023-10-13 12:22:10,474 epoch 10 - iter 176/447 - loss 0.00341264 - time (sec): 12.20 - samples/sec: 2890.46 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:22:13,149 epoch 10 - iter 220/447 - loss 0.00461848 - time (sec): 14.88 - samples/sec: 2925.23 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:22:15,741 epoch 10 - iter 264/447 - loss 0.00418989 - time (sec): 17.47 - samples/sec: 2963.91 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:22:18,378 epoch 10 - iter 308/447 - loss 0.00461877 - time (sec): 20.11 - samples/sec: 2961.29 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:22:21,374 epoch 10 - iter 352/447 - loss 0.00511943 - time (sec): 23.10 - samples/sec: 2952.59 - lr: 0.000001 - momentum: 0.000000
2023-10-13 12:22:24,099 epoch 10 - iter 396/447 - loss 0.00522285 - time (sec): 25.83 - samples/sec: 2955.50 - lr: 0.000001 - momentum: 0.000000
2023-10-13 12:22:27,111 epoch 10 - iter 440/447 - loss 0.00506098 - time (sec): 28.84 - samples/sec: 2961.27 - lr: 0.000000 - momentum: 0.000000
2023-10-13 12:22:27,543 ----------------------------------------------------------------------------------------------------
2023-10-13 12:22:27,543 EPOCH 10 done: loss 0.0050 - lr: 0.000000
2023-10-13 12:22:35,641 DEV : loss 0.25304141640663147 - f1-score (micro avg)  0.7829
2023-10-13 12:22:36,008 ----------------------------------------------------------------------------------------------------
2023-10-13 12:22:36,010 Loading model from best epoch ...
2023-10-13 12:22:37,653 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-13 12:22:42,814 
Results:
- F-score (micro) 0.7517
- F-score (macro) 0.6707
- Accuracy 0.6231

By class:
              precision    recall  f1-score   support

         loc     0.8654    0.8305    0.8476       596
        pers     0.6525    0.7838    0.7121       333
         org     0.5686    0.4394    0.4957       132
        prod     0.6667    0.4848    0.5614        66
        time     0.7609    0.7143    0.7368        49

   micro avg     0.7543    0.7491    0.7517      1176
   macro avg     0.7028    0.6506    0.6707      1176
weighted avg     0.7563    0.7491    0.7491      1176

2023-10-13 12:22:42,814 ----------------------------------------------------------------------------------------------------
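Sanity check on the final scores: the micro F-score is the harmonic mean of the micro-averaged precision and recall (pooled over all 1176 gold spans), while the macro F-score is the unweighted mean of the five per-class f1-scores. A small sketch reproducing both reported numbers from the table above:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg row of the final evaluation table
micro_p, micro_r = 0.7543, 0.7491
print(round(f1(micro_p, micro_r), 4))  # 0.7517 -- matches "F-score (micro)"

# per-class f1-scores: loc, pers, org, prod, time
class_f1 = [0.8476, 0.7121, 0.4957, 0.5614, 0.7368]
print(round(sum(class_f1) / len(class_f1), 4))  # 0.6707 -- matches "F-score (macro)"
```

Note the gap between micro (0.7517) and macro (0.6707) F1: the frequent loc and pers classes dominate the micro average, while the weaker org and prod classes pull the macro average down.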