2023-10-13 12:16:20,073 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
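A quick sanity check on the shapes printed above: the parameter counts of the individual layers follow directly from the stated dimensions. The helpers below are a hypothetical illustration, not part of the training run.

```python
# Recompute parameter counts for a few of the layers printed above,
# from their stated shapes.

def embedding_params(num_embeddings, dim):
    # An nn.Embedding holds a (num_embeddings x dim) weight matrix.
    return num_embeddings * dim

def linear_params(in_features, out_features, bias=True):
    # An nn.Linear holds a weight matrix plus an optional bias vector.
    return in_features * out_features + (out_features if bias else 0)

word_emb = embedding_params(32001, 768)   # subword vocabulary of 32001 entries
qkv = 3 * linear_params(768, 768)         # Q/K/V projections in one BertLayer
classifier = linear_params(768, 21)       # final tag classifier (21 BIOES tags)

print(word_emb, qkv, classifier)
```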
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Train: 3575 sentences
2023-10-13 12:16:20,074 (train_with_dev=False, train_with_test=False)
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Training Params:
2023-10-13 12:16:20,074 - learning_rate: "5e-05"
2023-10-13 12:16:20,074 - mini_batch_size: "8"
2023-10-13 12:16:20,074 - max_epochs: "10"
2023-10-13 12:16:20,074 - shuffle: "True"
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Plugins:
2023-10-13 12:16:20,074 - LinearScheduler | warmup_fraction: '0.1'
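The `LinearScheduler` plugin with `warmup_fraction: 0.1` explains the `lr:` column in the per-iteration lines below: the learning rate ramps linearly from 0 to the 5e-05 peak over the first 10% of all steps (roughly the first epoch, since 447 of 4470 mini-batches fall in the warmup), then decays linearly back to 0. A minimal sketch of that shape (a hypothetical helper, not Flair's implementation):

```python
# Linear warmup-then-decay schedule implied by the "lr:" column.

def linear_schedule_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Linear decay from the peak back down to 0 at the final step.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 447 * 10  # 447 mini-batches per epoch, 10 epochs
print(linear_schedule_lr(0, total))      # start of warmup
print(linear_schedule_lr(447, total))    # peak at the end of warmup
print(linear_schedule_lr(total, total))  # decayed to zero
```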
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 12:16:20,074 - metric: "('micro avg', 'f1-score')"
2023-10-13 12:16:20,074 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,074 Computation:
2023-10-13 12:16:20,075 - compute on device: cuda:0
2023-10-13 12:16:20,075 - embedding storage: none
2023-10-13 12:16:20,075 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,075 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 12:16:20,075 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:20,075 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:22,769 epoch 1 - iter 44/447 - loss 3.14261033 - time (sec): 2.69 - samples/sec: 2960.33 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:16:25,692 epoch 1 - iter 88/447 - loss 2.14441076 - time (sec): 5.62 - samples/sec: 2837.51 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:16:28,648 epoch 1 - iter 132/447 - loss 1.53821589 - time (sec): 8.57 - samples/sec: 2881.13 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:16:31,310 epoch 1 - iter 176/447 - loss 1.26325007 - time (sec): 11.23 - samples/sec: 2904.95 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:16:34,078 epoch 1 - iter 220/447 - loss 1.06860500 - time (sec): 14.00 - samples/sec: 2946.64 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:16:37,553 epoch 1 - iter 264/447 - loss 0.91852698 - time (sec): 17.48 - samples/sec: 2953.39 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:16:40,411 epoch 1 - iter 308/447 - loss 0.83070003 - time (sec): 20.34 - samples/sec: 2931.86 - lr: 0.000034 - momentum: 0.000000
2023-10-13 12:16:43,058 epoch 1 - iter 352/447 - loss 0.75729611 - time (sec): 22.98 - samples/sec: 2960.94 - lr: 0.000039 - momentum: 0.000000
2023-10-13 12:16:46,142 epoch 1 - iter 396/447 - loss 0.70030509 - time (sec): 26.07 - samples/sec: 2932.61 - lr: 0.000044 - momentum: 0.000000
2023-10-13 12:16:49,210 epoch 1 - iter 440/447 - loss 0.65259845 - time (sec): 29.13 - samples/sec: 2905.36 - lr: 0.000049 - momentum: 0.000000
2023-10-13 12:16:49,782 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:49,783 EPOCH 1 done: loss 0.6424 - lr: 0.000049
2023-10-13 12:16:54,809 DEV : loss 0.17999280989170074 - f1-score (micro avg) 0.6119
2023-10-13 12:16:54,839 saving best model
2023-10-13 12:16:55,180 ----------------------------------------------------------------------------------------------------
2023-10-13 12:16:58,029 epoch 2 - iter 44/447 - loss 0.21885310 - time (sec): 2.85 - samples/sec: 2992.93 - lr: 0.000049 - momentum: 0.000000
2023-10-13 12:17:00,968 epoch 2 - iter 88/447 - loss 0.20348487 - time (sec): 5.79 - samples/sec: 2915.48 - lr: 0.000049 - momentum: 0.000000
2023-10-13 12:17:03,645 epoch 2 - iter 132/447 - loss 0.18779304 - time (sec): 8.46 - samples/sec: 2955.05 - lr: 0.000048 - momentum: 0.000000
2023-10-13 12:17:06,522 epoch 2 - iter 176/447 - loss 0.18362775 - time (sec): 11.34 - samples/sec: 2978.59 - lr: 0.000048 - momentum: 0.000000
2023-10-13 12:17:09,210 epoch 2 - iter 220/447 - loss 0.17259586 - time (sec): 14.03 - samples/sec: 2963.51 - lr: 0.000047 - momentum: 0.000000
2023-10-13 12:17:12,381 epoch 2 - iter 264/447 - loss 0.16827387 - time (sec): 17.20 - samples/sec: 2960.19 - lr: 0.000047 - momentum: 0.000000
2023-10-13 12:17:15,141 epoch 2 - iter 308/447 - loss 0.16382626 - time (sec): 19.96 - samples/sec: 2969.07 - lr: 0.000046 - momentum: 0.000000
2023-10-13 12:17:18,142 epoch 2 - iter 352/447 - loss 0.15749203 - time (sec): 22.96 - samples/sec: 2987.22 - lr: 0.000046 - momentum: 0.000000
2023-10-13 12:17:21,109 epoch 2 - iter 396/447 - loss 0.15568860 - time (sec): 25.93 - samples/sec: 2968.50 - lr: 0.000045 - momentum: 0.000000
2023-10-13 12:17:23,854 epoch 2 - iter 440/447 - loss 0.15491984 - time (sec): 28.67 - samples/sec: 2972.67 - lr: 0.000045 - momentum: 0.000000
2023-10-13 12:17:24,241 ----------------------------------------------------------------------------------------------------
2023-10-13 12:17:24,241 EPOCH 2 done: loss 0.1546 - lr: 0.000045
2023-10-13 12:17:33,041 DEV : loss 0.11931649595499039 - f1-score (micro avg) 0.7133
2023-10-13 12:17:33,072 saving best model
2023-10-13 12:17:33,543 ----------------------------------------------------------------------------------------------------
2023-10-13 12:17:36,515 epoch 3 - iter 44/447 - loss 0.09401755 - time (sec): 2.97 - samples/sec: 2867.56 - lr: 0.000044 - momentum: 0.000000
2023-10-13 12:17:39,572 epoch 3 - iter 88/447 - loss 0.08053615 - time (sec): 6.03 - samples/sec: 3004.31 - lr: 0.000043 - momentum: 0.000000
2023-10-13 12:17:42,521 epoch 3 - iter 132/447 - loss 0.08329820 - time (sec): 8.98 - samples/sec: 3034.73 - lr: 0.000043 - momentum: 0.000000
2023-10-13 12:17:45,376 epoch 3 - iter 176/447 - loss 0.07875894 - time (sec): 11.83 - samples/sec: 3054.35 - lr: 0.000042 - momentum: 0.000000
2023-10-13 12:17:48,289 epoch 3 - iter 220/447 - loss 0.08444205 - time (sec): 14.74 - samples/sec: 3052.92 - lr: 0.000042 - momentum: 0.000000
2023-10-13 12:17:50,914 epoch 3 - iter 264/447 - loss 0.08644774 - time (sec): 17.37 - samples/sec: 3028.81 - lr: 0.000041 - momentum: 0.000000
2023-10-13 12:17:53,528 epoch 3 - iter 308/447 - loss 0.08382388 - time (sec): 19.98 - samples/sec: 3043.28 - lr: 0.000041 - momentum: 0.000000
2023-10-13 12:17:56,182 epoch 3 - iter 352/447 - loss 0.08593653 - time (sec): 22.64 - samples/sec: 3040.32 - lr: 0.000040 - momentum: 0.000000
2023-10-13 12:17:59,167 epoch 3 - iter 396/447 - loss 0.08618714 - time (sec): 25.62 - samples/sec: 3009.13 - lr: 0.000040 - momentum: 0.000000
2023-10-13 12:18:02,080 epoch 3 - iter 440/447 - loss 0.08664836 - time (sec): 28.54 - samples/sec: 2988.13 - lr: 0.000039 - momentum: 0.000000
2023-10-13 12:18:02,548 ----------------------------------------------------------------------------------------------------
2023-10-13 12:18:02,548 EPOCH 3 done: loss 0.0863 - lr: 0.000039
2023-10-13 12:18:11,118 DEV : loss 0.1396493762731552 - f1-score (micro avg) 0.7502
2023-10-13 12:18:11,149 saving best model
2023-10-13 12:18:11,612 ----------------------------------------------------------------------------------------------------
2023-10-13 12:18:14,188 epoch 4 - iter 44/447 - loss 0.05525879 - time (sec): 2.57 - samples/sec: 2939.76 - lr: 0.000038 - momentum: 0.000000
2023-10-13 12:18:17,182 epoch 4 - iter 88/447 - loss 0.04633208 - time (sec): 5.56 - samples/sec: 3023.46 - lr: 0.000038 - momentum: 0.000000
2023-10-13 12:18:19,914 epoch 4 - iter 132/447 - loss 0.05499492 - time (sec): 8.30 - samples/sec: 3023.56 - lr: 0.000037 - momentum: 0.000000
2023-10-13 12:18:22,639 epoch 4 - iter 176/447 - loss 0.05387788 - time (sec): 11.02 - samples/sec: 3051.89 - lr: 0.000037 - momentum: 0.000000
2023-10-13 12:18:25,154 epoch 4 - iter 220/447 - loss 0.05237504 - time (sec): 13.53 - samples/sec: 3033.31 - lr: 0.000036 - momentum: 0.000000
2023-10-13 12:18:28,279 epoch 4 - iter 264/447 - loss 0.04945493 - time (sec): 16.66 - samples/sec: 3062.16 - lr: 0.000036 - momentum: 0.000000
2023-10-13 12:18:31,380 epoch 4 - iter 308/447 - loss 0.04857546 - time (sec): 19.76 - samples/sec: 3025.12 - lr: 0.000035 - momentum: 0.000000
2023-10-13 12:18:34,060 epoch 4 - iter 352/447 - loss 0.04897486 - time (sec): 22.44 - samples/sec: 3022.95 - lr: 0.000035 - momentum: 0.000000
2023-10-13 12:18:37,004 epoch 4 - iter 396/447 - loss 0.04934248 - time (sec): 25.39 - samples/sec: 3037.22 - lr: 0.000034 - momentum: 0.000000
2023-10-13 12:18:39,766 epoch 4 - iter 440/447 - loss 0.04973534 - time (sec): 28.15 - samples/sec: 3032.90 - lr: 0.000033 - momentum: 0.000000
2023-10-13 12:18:40,175 ----------------------------------------------------------------------------------------------------
2023-10-13 12:18:40,175 EPOCH 4 done: loss 0.0495 - lr: 0.000033
2023-10-13 12:18:48,839 DEV : loss 0.1706864982843399 - f1-score (micro avg) 0.7558
2023-10-13 12:18:48,869 saving best model
2023-10-13 12:18:49,312 ----------------------------------------------------------------------------------------------------
2023-10-13 12:18:52,083 epoch 5 - iter 44/447 - loss 0.05200091 - time (sec): 2.76 - samples/sec: 2928.37 - lr: 0.000033 - momentum: 0.000000
2023-10-13 12:18:54,786 epoch 5 - iter 88/447 - loss 0.03619275 - time (sec): 5.46 - samples/sec: 2908.21 - lr: 0.000032 - momentum: 0.000000
2023-10-13 12:18:57,731 epoch 5 - iter 132/447 - loss 0.03375555 - time (sec): 8.41 - samples/sec: 2915.48 - lr: 0.000032 - momentum: 0.000000
2023-10-13 12:19:00,469 epoch 5 - iter 176/447 - loss 0.03452040 - time (sec): 11.15 - samples/sec: 2929.20 - lr: 0.000031 - momentum: 0.000000
2023-10-13 12:19:03,583 epoch 5 - iter 220/447 - loss 0.03448902 - time (sec): 14.26 - samples/sec: 2980.50 - lr: 0.000031 - momentum: 0.000000
2023-10-13 12:19:06,193 epoch 5 - iter 264/447 - loss 0.03387829 - time (sec): 16.87 - samples/sec: 3013.19 - lr: 0.000030 - momentum: 0.000000
2023-10-13 12:19:09,170 epoch 5 - iter 308/447 - loss 0.03356111 - time (sec): 19.85 - samples/sec: 3011.24 - lr: 0.000030 - momentum: 0.000000
2023-10-13 12:19:12,309 epoch 5 - iter 352/447 - loss 0.03271895 - time (sec): 22.99 - samples/sec: 3003.67 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:19:15,047 epoch 5 - iter 396/447 - loss 0.03239190 - time (sec): 25.72 - samples/sec: 3021.95 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:19:17,519 epoch 5 - iter 440/447 - loss 0.03136978 - time (sec): 28.20 - samples/sec: 3022.80 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:19:17,931 ----------------------------------------------------------------------------------------------------
2023-10-13 12:19:17,931 EPOCH 5 done: loss 0.0310 - lr: 0.000028
2023-10-13 12:19:26,603 DEV : loss 0.20082274079322815 - f1-score (micro avg) 0.7696
2023-10-13 12:19:26,634 saving best model
2023-10-13 12:19:27,018 ----------------------------------------------------------------------------------------------------
2023-10-13 12:19:29,851 epoch 6 - iter 44/447 - loss 0.01622643 - time (sec): 2.83 - samples/sec: 3032.18 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:19:32,640 epoch 6 - iter 88/447 - loss 0.01695102 - time (sec): 5.62 - samples/sec: 2985.73 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:19:35,449 epoch 6 - iter 132/447 - loss 0.01667450 - time (sec): 8.43 - samples/sec: 2944.44 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:19:38,260 epoch 6 - iter 176/447 - loss 0.01827741 - time (sec): 11.24 - samples/sec: 2951.34 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:19:41,153 epoch 6 - iter 220/447 - loss 0.01959407 - time (sec): 14.13 - samples/sec: 2944.73 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:19:43,808 epoch 6 - iter 264/447 - loss 0.02009856 - time (sec): 16.79 - samples/sec: 2975.89 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:19:46,363 epoch 6 - iter 308/447 - loss 0.02044350 - time (sec): 19.34 - samples/sec: 2989.34 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:19:49,232 epoch 6 - iter 352/447 - loss 0.02069867 - time (sec): 22.21 - samples/sec: 2991.02 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:19:52,193 epoch 6 - iter 396/447 - loss 0.02098711 - time (sec): 25.17 - samples/sec: 2980.62 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:19:55,409 epoch 6 - iter 440/447 - loss 0.02136923 - time (sec): 28.39 - samples/sec: 2994.56 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:19:55,894 ----------------------------------------------------------------------------------------------------
2023-10-13 12:19:55,894 EPOCH 6 done: loss 0.0211 - lr: 0.000022
2023-10-13 12:20:04,495 DEV : loss 0.23715780675411224 - f1-score (micro avg) 0.7733
2023-10-13 12:20:04,525 saving best model
2023-10-13 12:20:04,985 ----------------------------------------------------------------------------------------------------
2023-10-13 12:20:07,786 epoch 7 - iter 44/447 - loss 0.01199565 - time (sec): 2.79 - samples/sec: 3082.52 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:20:10,593 epoch 7 - iter 88/447 - loss 0.01126610 - time (sec): 5.60 - samples/sec: 3054.32 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:20:13,242 epoch 7 - iter 132/447 - loss 0.01064061 - time (sec): 8.25 - samples/sec: 3167.04 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:20:16,107 epoch 7 - iter 176/447 - loss 0.01633639 - time (sec): 11.11 - samples/sec: 3129.97 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:20:18,845 epoch 7 - iter 220/447 - loss 0.01466051 - time (sec): 13.85 - samples/sec: 3084.13 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:20:21,664 epoch 7 - iter 264/447 - loss 0.01635494 - time (sec): 16.67 - samples/sec: 3083.72 - lr: 0.000019 - momentum: 0.000000
2023-10-13 12:20:24,361 epoch 7 - iter 308/447 - loss 0.01815937 - time (sec): 19.37 - samples/sec: 3069.17 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:20:27,142 epoch 7 - iter 352/447 - loss 0.01751451 - time (sec): 22.15 - samples/sec: 3064.68 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:20:29,760 epoch 7 - iter 396/447 - loss 0.01791652 - time (sec): 24.77 - samples/sec: 3047.44 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:20:32,796 epoch 7 - iter 440/447 - loss 0.01696968 - time (sec): 27.80 - samples/sec: 3043.76 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:20:33,478 ----------------------------------------------------------------------------------------------------
2023-10-13 12:20:33,479 EPOCH 7 done: loss 0.0169 - lr: 0.000017
2023-10-13 12:20:42,328 DEV : loss 0.24267171323299408 - f1-score (micro avg) 0.7719
2023-10-13 12:20:42,360 ----------------------------------------------------------------------------------------------------
2023-10-13 12:20:45,065 epoch 8 - iter 44/447 - loss 0.00574822 - time (sec): 2.70 - samples/sec: 3087.09 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:20:48,466 epoch 8 - iter 88/447 - loss 0.00743845 - time (sec): 6.11 - samples/sec: 2924.33 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:20:51,212 epoch 8 - iter 132/447 - loss 0.00792900 - time (sec): 8.85 - samples/sec: 2948.74 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:20:53,937 epoch 8 - iter 176/447 - loss 0.00785956 - time (sec): 11.58 - samples/sec: 2982.77 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:20:56,601 epoch 8 - iter 220/447 - loss 0.00788583 - time (sec): 14.24 - samples/sec: 2978.18 - lr: 0.000014 - momentum: 0.000000
2023-10-13 12:20:59,624 epoch 8 - iter 264/447 - loss 0.00853818 - time (sec): 17.26 - samples/sec: 2966.63 - lr: 0.000013 - momentum: 0.000000
2023-10-13 12:21:02,453 epoch 8 - iter 308/447 - loss 0.00916299 - time (sec): 20.09 - samples/sec: 2990.31 - lr: 0.000013 - momentum: 0.000000
2023-10-13 12:21:05,235 epoch 8 - iter 352/447 - loss 0.00932899 - time (sec): 22.87 - samples/sec: 2981.93 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:21:08,099 epoch 8 - iter 396/447 - loss 0.01034094 - time (sec): 25.74 - samples/sec: 2982.67 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:21:11,010 epoch 8 - iter 440/447 - loss 0.01011961 - time (sec): 28.65 - samples/sec: 2975.75 - lr: 0.000011 - momentum: 0.000000
2023-10-13 12:21:11,441 ----------------------------------------------------------------------------------------------------
2023-10-13 12:21:11,441 EPOCH 8 done: loss 0.0101 - lr: 0.000011
2023-10-13 12:21:19,717 DEV : loss 0.2543591260910034 - f1-score (micro avg) 0.7859
2023-10-13 12:21:19,749 saving best model
2023-10-13 12:21:20,261 ----------------------------------------------------------------------------------------------------
2023-10-13 12:21:23,549 epoch 9 - iter 44/447 - loss 0.00528114 - time (sec): 3.28 - samples/sec: 2607.80 - lr: 0.000011 - momentum: 0.000000
2023-10-13 12:21:26,230 epoch 9 - iter 88/447 - loss 0.00993798 - time (sec): 5.96 - samples/sec: 2822.40 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:21:29,082 epoch 9 - iter 132/447 - loss 0.00867724 - time (sec): 8.82 - samples/sec: 2840.89 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:21:31,841 epoch 9 - iter 176/447 - loss 0.00730150 - time (sec): 11.57 - samples/sec: 2909.04 - lr: 0.000009 - momentum: 0.000000
2023-10-13 12:21:35,160 epoch 9 - iter 220/447 - loss 0.00633256 - time (sec): 14.89 - samples/sec: 2909.00 - lr: 0.000008 - momentum: 0.000000
2023-10-13 12:21:37,857 epoch 9 - iter 264/447 - loss 0.00584102 - time (sec): 17.59 - samples/sec: 2941.43 - lr: 0.000008 - momentum: 0.000000
2023-10-13 12:21:40,619 epoch 9 - iter 308/447 - loss 0.00698111 - time (sec): 20.35 - samples/sec: 2937.67 - lr: 0.000007 - momentum: 0.000000
2023-10-13 12:21:43,603 epoch 9 - iter 352/447 - loss 0.00775451 - time (sec): 23.34 - samples/sec: 2946.59 - lr: 0.000007 - momentum: 0.000000
2023-10-13 12:21:46,303 epoch 9 - iter 396/447 - loss 0.00750628 - time (sec): 26.04 - samples/sec: 2943.23 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:21:49,116 epoch 9 - iter 440/447 - loss 0.00746570 - time (sec): 28.85 - samples/sec: 2948.98 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:21:49,770 ----------------------------------------------------------------------------------------------------
2023-10-13 12:21:49,771 EPOCH 9 done: loss 0.0075 - lr: 0.000006
2023-10-13 12:21:58,239 DEV : loss 0.2509065568447113 - f1-score (micro avg) 0.7812
2023-10-13 12:21:58,271 ----------------------------------------------------------------------------------------------------
2023-10-13 12:22:01,228 epoch 10 - iter 44/447 - loss 0.00265283 - time (sec): 2.96 - samples/sec: 3095.76 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:22:03,969 epoch 10 - iter 88/447 - loss 0.00408723 - time (sec): 5.70 - samples/sec: 3012.81 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:22:06,819 epoch 10 - iter 132/447 - loss 0.00409162 - time (sec): 8.55 - samples/sec: 2966.94 - lr: 0.000004 - momentum: 0.000000
2023-10-13 12:22:10,474 epoch 10 - iter 176/447 - loss 0.00341264 - time (sec): 12.20 - samples/sec: 2890.46 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:22:13,149 epoch 10 - iter 220/447 - loss 0.00461848 - time (sec): 14.88 - samples/sec: 2925.23 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:22:15,741 epoch 10 - iter 264/447 - loss 0.00418989 - time (sec): 17.47 - samples/sec: 2963.91 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:22:18,378 epoch 10 - iter 308/447 - loss 0.00461877 - time (sec): 20.11 - samples/sec: 2961.29 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:22:21,374 epoch 10 - iter 352/447 - loss 0.00511943 - time (sec): 23.10 - samples/sec: 2952.59 - lr: 0.000001 - momentum: 0.000000
2023-10-13 12:22:24,099 epoch 10 - iter 396/447 - loss 0.00522285 - time (sec): 25.83 - samples/sec: 2955.50 - lr: 0.000001 - momentum: 0.000000
2023-10-13 12:22:27,111 epoch 10 - iter 440/447 - loss 0.00506098 - time (sec): 28.84 - samples/sec: 2961.27 - lr: 0.000000 - momentum: 0.000000
2023-10-13 12:22:27,543 ----------------------------------------------------------------------------------------------------
2023-10-13 12:22:27,543 EPOCH 10 done: loss 0.0050 - lr: 0.000000
2023-10-13 12:22:35,641 DEV : loss 0.25304141640663147 - f1-score (micro avg) 0.7829
2023-10-13 12:22:36,008 ----------------------------------------------------------------------------------------------------
2023-10-13 12:22:36,010 Loading model from best epoch ...
2023-10-13 12:22:37,653 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
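The 21-tag dictionary above is the BIOES encoding of the corpus's 5 entity types plus the outside tag: 5 types × 4 positional prefixes (S/B/E/I) + O = 21. A small sketch reproducing the tag set (hypothetical helper, matching the order printed in the log):

```python
# Generate a BIOES tag dictionary from a list of entity types.

def bioes_tags(entity_types):
    tags = ["O"]  # outside tag first, as in the log
    for t in entity_types:
        # S = single-token span, B = begin, E = end, I = inside
        tags += [f"{p}-{t}" for p in ("S", "B", "E", "I")]
    return tags

tags = bioes_tags(["loc", "pers", "org", "prod", "time"])
print(len(tags), tags)
```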
2023-10-13 12:22:42,814
Results:
- F-score (micro) 0.7517
- F-score (macro) 0.6707
- Accuracy 0.6231
By class:
              precision    recall  f1-score   support

         loc     0.8654    0.8305    0.8476       596
        pers     0.6525    0.7838    0.7121       333
         org     0.5686    0.4394    0.4957       132
        prod     0.6667    0.4848    0.5614        66
        time     0.7609    0.7143    0.7368        49

   micro avg     0.7543    0.7491    0.7517      1176
   macro avg     0.7028    0.6506    0.6707      1176
weighted avg     0.7563    0.7491    0.7491      1176
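The micro-average row is consistent with the per-class rows: recovering each class's true-positive and predicted-span counts from precision, recall, and support, and pooling them, reproduces the reported 0.7517 micro F1. A hypothetical check derived only from the table above:

```python
# Recover per-class TP and predicted counts from precision/recall/support,
# then verify the reported micro average follows from the pooled counts.

classes = {
    # class: (precision, recall, support)
    "loc":  (0.8654, 0.8305, 596),
    "pers": (0.6525, 0.7838, 333),
    "org":  (0.5686, 0.4394, 132),
    "prod": (0.6667, 0.4848, 66),
    "time": (0.7609, 0.7143, 49),
}

tp = pred = supp = 0
for p, r, s in classes.values():
    c_tp = round(r * s)       # true positives = recall * support
    tp += c_tp
    pred += round(c_tp / p)   # predicted spans = TP / precision
    supp += s

micro_p = tp / pred
micro_r = tp / supp
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_p, 4), round(micro_r, 4), round(micro_f1, 4))
```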
2023-10-13 12:22:42,814 ----------------------------------------------------------------------------------------------------