2023-10-13 12:09:35,270 ----------------------------------------------------------------------------------------------------
2023-10-13 12:09:35,271 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 12:09:35,271 ----------------------------------------------------------------------------------------------------
2023-10-13 12:09:35,271 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-13 12:09:35,271 ----------------------------------------------------------------------------------------------------
2023-10-13 12:09:35,271 Train: 3575 sentences
2023-10-13 12:09:35,271 (train_with_dev=False, train_with_test=False)
2023-10-13 12:09:35,271 ----------------------------------------------------------------------------------------------------
2023-10-13 12:09:35,271 Training Params:
2023-10-13 12:09:35,271 - learning_rate: "3e-05"
2023-10-13 12:09:35,271 - mini_batch_size: "8"
2023-10-13 12:09:35,271 - max_epochs: "10"
2023-10-13 12:09:35,271 - shuffle: "True"
2023-10-13 12:09:35,271 ----------------------------------------------------------------------------------------------------
2023-10-13 12:09:35,271 Plugins:
2023-10-13 12:09:35,271 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 12:09:35,271 ----------------------------------------------------------------------------------------------------
2023-10-13 12:09:35,271 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 12:09:35,271 - metric: "('micro avg', 'f1-score')"
2023-10-13 12:09:35,271 ----------------------------------------------------------------------------------------------------
2023-10-13 12:09:35,271 Computation:
2023-10-13 12:09:35,271 - compute on device: cuda:0
2023-10-13 12:09:35,271 - embedding storage: none
2023-10-13 12:09:35,271 ----------------------------------------------------------------------------------------------------
2023-10-13 12:09:35,272 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 12:09:35,272 ----------------------------------------------------------------------------------------------------
2023-10-13 12:09:35,272 ----------------------------------------------------------------------------------------------------
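Note on the lr column: the parameters above (3575 training sentences, mini_batch_size 8, max_epochs 10, learning_rate 3e-05, warmup_fraction 0.1) are consistent with a linear warmup followed by a linear decay to zero. The sketch below is an assumption about the general shape of such a schedule, not Flair's actual LinearScheduler code; the function name `linear_warmup_lr` is hypothetical. Logged values may lag the sketch by about one step.

```python
import math

def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction):
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: scale linearly from peak_lr down to 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / (total_steps - warmup_steps))

# Values taken from the log header.
steps_per_epoch = math.ceil(3575 / 8)      # iterations per epoch
total_steps = steps_per_epoch * 10         # max_epochs = 10
peak_lr = 3e-05
warmup_fraction = 0.1

# Print the approximate lr at the first logged iteration of each epoch.
for epoch in range(1, 11):
    step = (epoch - 1) * steps_per_epoch + 44
    lr = linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction)
    print(f"epoch {epoch:2d} - iter 44/{steps_per_epoch} - lr: {lr:.6f}")
```

Under these assumptions the warmup covers exactly the first epoch (447 of 4470 steps), which is why the lr rises throughout epoch 1 and falls from epoch 2 onward.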
2023-10-13 12:09:37,960 epoch 1 - iter 44/447 - loss 3.26423371 - time (sec): 2.69 - samples/sec: 2966.03 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:09:40,629 epoch 1 - iter 88/447 - loss 2.55251654 - time (sec): 5.36 - samples/sec: 2975.06 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:09:43,351 epoch 1 - iter 132/447 - loss 1.81465913 - time (sec): 8.08 - samples/sec: 3057.25 - lr: 0.000009 - momentum: 0.000000
2023-10-13 12:09:45,966 epoch 1 - iter 176/447 - loss 1.49020630 - time (sec): 10.69 - samples/sec: 3052.05 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:09:48,727 epoch 1 - iter 220/447 - loss 1.25720936 - time (sec): 13.45 - samples/sec: 3066.78 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:09:52,053 epoch 1 - iter 264/447 - loss 1.07641978 - time (sec): 16.78 - samples/sec: 3076.12 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:09:54,913 epoch 1 - iter 308/447 - loss 0.97218598 - time (sec): 19.64 - samples/sec: 3035.64 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:09:57,534 epoch 1 - iter 352/447 - loss 0.88562839 - time (sec): 22.26 - samples/sec: 3056.85 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:10:00,460 epoch 1 - iter 396/447 - loss 0.81842546 - time (sec): 25.19 - samples/sec: 3034.98 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:10:03,271 epoch 1 - iter 440/447 - loss 0.76093176 - time (sec): 28.00 - samples/sec: 3023.23 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:10:03,802 ----------------------------------------------------------------------------------------------------
2023-10-13 12:10:03,803 EPOCH 1 done: loss 0.7490 - lr: 0.000029
2023-10-13 12:10:08,594 DEV : loss 0.1892923265695572 - f1-score (micro avg) 0.6033
2023-10-13 12:10:08,622 saving best model
2023-10-13 12:10:08,990 ----------------------------------------------------------------------------------------------------
2023-10-13 12:10:11,751 epoch 2 - iter 44/447 - loss 0.22503075 - time (sec): 2.76 - samples/sec: 3089.16 - lr: 0.000030 - momentum: 0.000000
2023-10-13 12:10:14,608 epoch 2 - iter 88/447 - loss 0.21611089 - time (sec): 5.62 - samples/sec: 3004.42 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:10:17,239 epoch 2 - iter 132/447 - loss 0.19999503 - time (sec): 8.25 - samples/sec: 3032.70 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:10:20,013 epoch 2 - iter 176/447 - loss 0.19471361 - time (sec): 11.02 - samples/sec: 3064.84 - lr: 0.000029 - momentum: 0.000000
2023-10-13 12:10:22,647 epoch 2 - iter 220/447 - loss 0.18611851 - time (sec): 13.66 - samples/sec: 3044.52 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:10:25,663 epoch 2 - iter 264/447 - loss 0.18149685 - time (sec): 16.67 - samples/sec: 3054.10 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:10:28,344 epoch 2 - iter 308/447 - loss 0.17422604 - time (sec): 19.35 - samples/sec: 3062.40 - lr: 0.000028 - momentum: 0.000000
2023-10-13 12:10:31,324 epoch 2 - iter 352/447 - loss 0.16962325 - time (sec): 22.33 - samples/sec: 3071.16 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:10:34,299 epoch 2 - iter 396/447 - loss 0.16829992 - time (sec): 25.31 - samples/sec: 3041.33 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:10:37,052 epoch 2 - iter 440/447 - loss 0.16687265 - time (sec): 28.06 - samples/sec: 3037.56 - lr: 0.000027 - momentum: 0.000000
2023-10-13 12:10:37,451 ----------------------------------------------------------------------------------------------------
2023-10-13 12:10:37,451 EPOCH 2 done: loss 0.1664 - lr: 0.000027
2023-10-13 12:10:45,769 DEV : loss 0.12273670732975006 - f1-score (micro avg) 0.6851
2023-10-13 12:10:45,796 saving best model
2023-10-13 12:10:46,248 ----------------------------------------------------------------------------------------------------
2023-10-13 12:10:49,187 epoch 3 - iter 44/447 - loss 0.09991546 - time (sec): 2.94 - samples/sec: 2899.24 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:10:52,223 epoch 3 - iter 88/447 - loss 0.09710562 - time (sec): 5.97 - samples/sec: 3031.43 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:10:55,209 epoch 3 - iter 132/447 - loss 0.09199418 - time (sec): 8.96 - samples/sec: 3040.31 - lr: 0.000026 - momentum: 0.000000
2023-10-13 12:10:58,100 epoch 3 - iter 176/447 - loss 0.08528336 - time (sec): 11.85 - samples/sec: 3049.39 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:11:00,998 epoch 3 - iter 220/447 - loss 0.09281222 - time (sec): 14.75 - samples/sec: 3051.97 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:11:03,636 epoch 3 - iter 264/447 - loss 0.09405935 - time (sec): 17.39 - samples/sec: 3025.84 - lr: 0.000025 - momentum: 0.000000
2023-10-13 12:11:06,248 epoch 3 - iter 308/447 - loss 0.09018315 - time (sec): 20.00 - samples/sec: 3041.00 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:11:08,845 epoch 3 - iter 352/447 - loss 0.08995194 - time (sec): 22.60 - samples/sec: 3045.93 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:11:11,702 epoch 3 - iter 396/447 - loss 0.08979467 - time (sec): 25.45 - samples/sec: 3029.20 - lr: 0.000024 - momentum: 0.000000
2023-10-13 12:11:14,330 epoch 3 - iter 440/447 - loss 0.09064393 - time (sec): 28.08 - samples/sec: 3036.51 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:11:14,735 ----------------------------------------------------------------------------------------------------
2023-10-13 12:11:14,735 EPOCH 3 done: loss 0.0906 - lr: 0.000023
2023-10-13 12:11:23,095 DEV : loss 0.1380746215581894 - f1-score (micro avg) 0.7368
2023-10-13 12:11:23,122 saving best model
2023-10-13 12:11:23,561 ----------------------------------------------------------------------------------------------------
2023-10-13 12:11:26,132 epoch 4 - iter 44/447 - loss 0.05167342 - time (sec): 2.57 - samples/sec: 2943.67 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:11:29,209 epoch 4 - iter 88/447 - loss 0.04774725 - time (sec): 5.64 - samples/sec: 2980.68 - lr: 0.000023 - momentum: 0.000000
2023-10-13 12:11:31,981 epoch 4 - iter 132/447 - loss 0.05250541 - time (sec): 8.41 - samples/sec: 2980.78 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:11:34,711 epoch 4 - iter 176/447 - loss 0.05249414 - time (sec): 11.14 - samples/sec: 3017.69 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:11:37,381 epoch 4 - iter 220/447 - loss 0.05143646 - time (sec): 13.81 - samples/sec: 2971.83 - lr: 0.000022 - momentum: 0.000000
2023-10-13 12:11:40,611 epoch 4 - iter 264/447 - loss 0.04851668 - time (sec): 17.04 - samples/sec: 2992.98 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:11:43,801 epoch 4 - iter 308/447 - loss 0.04919683 - time (sec): 20.24 - samples/sec: 2954.25 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:11:46,537 epoch 4 - iter 352/447 - loss 0.04937125 - time (sec): 22.97 - samples/sec: 2953.29 - lr: 0.000021 - momentum: 0.000000
2023-10-13 12:11:49,442 epoch 4 - iter 396/447 - loss 0.04948050 - time (sec): 25.88 - samples/sec: 2979.68 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:11:52,225 epoch 4 - iter 440/447 - loss 0.04960176 - time (sec): 28.66 - samples/sec: 2978.70 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:11:52,628 ----------------------------------------------------------------------------------------------------
2023-10-13 12:11:52,629 EPOCH 4 done: loss 0.0492 - lr: 0.000020
2023-10-13 12:12:01,176 DEV : loss 0.1485665887594223 - f1-score (micro avg) 0.7566
2023-10-13 12:12:01,203 saving best model
2023-10-13 12:12:01,668 ----------------------------------------------------------------------------------------------------
2023-10-13 12:12:04,553 epoch 5 - iter 44/447 - loss 0.06274556 - time (sec): 2.88 - samples/sec: 2805.26 - lr: 0.000020 - momentum: 0.000000
2023-10-13 12:12:07,298 epoch 5 - iter 88/447 - loss 0.04536218 - time (sec): 5.63 - samples/sec: 2823.72 - lr: 0.000019 - momentum: 0.000000
2023-10-13 12:12:10,229 epoch 5 - iter 132/447 - loss 0.03980649 - time (sec): 8.56 - samples/sec: 2864.57 - lr: 0.000019 - momentum: 0.000000
2023-10-13 12:12:12,968 epoch 5 - iter 176/447 - loss 0.03919475 - time (sec): 11.30 - samples/sec: 2890.30 - lr: 0.000019 - momentum: 0.000000
2023-10-13 12:12:16,055 epoch 5 - iter 220/447 - loss 0.03737208 - time (sec): 14.38 - samples/sec: 2955.03 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:12:18,688 epoch 5 - iter 264/447 - loss 0.03726708 - time (sec): 17.02 - samples/sec: 2987.24 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:12:21,705 epoch 5 - iter 308/447 - loss 0.03688963 - time (sec): 20.03 - samples/sec: 2983.27 - lr: 0.000018 - momentum: 0.000000
2023-10-13 12:12:24,853 epoch 5 - iter 352/447 - loss 0.03722901 - time (sec): 23.18 - samples/sec: 2978.37 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:12:27,583 epoch 5 - iter 396/447 - loss 0.03667790 - time (sec): 25.91 - samples/sec: 3000.11 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:12:30,102 epoch 5 - iter 440/447 - loss 0.03624333 - time (sec): 28.43 - samples/sec: 2997.89 - lr: 0.000017 - momentum: 0.000000
2023-10-13 12:12:30,518 ----------------------------------------------------------------------------------------------------
2023-10-13 12:12:30,518 EPOCH 5 done: loss 0.0360 - lr: 0.000017
2023-10-13 12:12:39,052 DEV : loss 0.16211558878421783 - f1-score (micro avg) 0.7683
2023-10-13 12:12:39,080 saving best model
2023-10-13 12:12:39,538 ----------------------------------------------------------------------------------------------------
2023-10-13 12:12:42,323 epoch 6 - iter 44/447 - loss 0.01665545 - time (sec): 2.78 - samples/sec: 3085.13 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:12:45,047 epoch 6 - iter 88/447 - loss 0.01899288 - time (sec): 5.51 - samples/sec: 3046.61 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:12:47,709 epoch 6 - iter 132/447 - loss 0.02204439 - time (sec): 8.17 - samples/sec: 3038.25 - lr: 0.000016 - momentum: 0.000000
2023-10-13 12:12:50,485 epoch 6 - iter 176/447 - loss 0.02141173 - time (sec): 10.94 - samples/sec: 3031.12 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:12:53,395 epoch 6 - iter 220/447 - loss 0.02138525 - time (sec): 13.86 - samples/sec: 3003.83 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:12:56,134 epoch 6 - iter 264/447 - loss 0.02204675 - time (sec): 16.59 - samples/sec: 3010.82 - lr: 0.000015 - momentum: 0.000000
2023-10-13 12:12:58,732 epoch 6 - iter 308/447 - loss 0.02286212 - time (sec): 19.19 - samples/sec: 3012.99 - lr: 0.000014 - momentum: 0.000000
2023-10-13 12:13:01,618 epoch 6 - iter 352/447 - loss 0.02277241 - time (sec): 22.08 - samples/sec: 3009.11 - lr: 0.000014 - momentum: 0.000000
2023-10-13 12:13:04,692 epoch 6 - iter 396/447 - loss 0.02385308 - time (sec): 25.15 - samples/sec: 2983.10 - lr: 0.000014 - momentum: 0.000000
2023-10-13 12:13:07,948 epoch 6 - iter 440/447 - loss 0.02363909 - time (sec): 28.41 - samples/sec: 2992.62 - lr: 0.000013 - momentum: 0.000000
2023-10-13 12:13:08,441 ----------------------------------------------------------------------------------------------------
2023-10-13 12:13:08,441 EPOCH 6 done: loss 0.0233 - lr: 0.000013
2023-10-13 12:13:16,934 DEV : loss 0.17185726761817932 - f1-score (micro avg) 0.7742
2023-10-13 12:13:16,961 saving best model
2023-10-13 12:13:17,402 ----------------------------------------------------------------------------------------------------
2023-10-13 12:13:20,166 epoch 7 - iter 44/447 - loss 0.01916041 - time (sec): 2.76 - samples/sec: 3114.27 - lr: 0.000013 - momentum: 0.000000
2023-10-13 12:13:23,001 epoch 7 - iter 88/447 - loss 0.01510835 - time (sec): 5.60 - samples/sec: 3054.36 - lr: 0.000013 - momentum: 0.000000
2023-10-13 12:13:25,697 epoch 7 - iter 132/447 - loss 0.01730488 - time (sec): 8.29 - samples/sec: 3149.16 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:13:28,564 epoch 7 - iter 176/447 - loss 0.02055155 - time (sec): 11.16 - samples/sec: 3116.43 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:13:31,329 epoch 7 - iter 220/447 - loss 0.01884195 - time (sec): 13.93 - samples/sec: 3067.37 - lr: 0.000012 - momentum: 0.000000
2023-10-13 12:13:34,117 epoch 7 - iter 264/447 - loss 0.01845163 - time (sec): 16.71 - samples/sec: 3075.63 - lr: 0.000011 - momentum: 0.000000
2023-10-13 12:13:36,839 epoch 7 - iter 308/447 - loss 0.01868504 - time (sec): 19.44 - samples/sec: 3058.19 - lr: 0.000011 - momentum: 0.000000
2023-10-13 12:13:39,690 epoch 7 - iter 352/447 - loss 0.01820740 - time (sec): 22.29 - samples/sec: 3045.37 - lr: 0.000011 - momentum: 0.000000
2023-10-13 12:13:42,381 epoch 7 - iter 396/447 - loss 0.01837560 - time (sec): 24.98 - samples/sec: 3021.47 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:13:45,484 epoch 7 - iter 440/447 - loss 0.01746424 - time (sec): 28.08 - samples/sec: 3013.49 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:13:46,176 ----------------------------------------------------------------------------------------------------
2023-10-13 12:13:46,176 EPOCH 7 done: loss 0.0178 - lr: 0.000010
2023-10-13 12:13:54,157 DEV : loss 0.1952558010816574 - f1-score (micro avg) 0.7884
2023-10-13 12:13:54,183 saving best model
2023-10-13 12:13:54,603 ----------------------------------------------------------------------------------------------------
2023-10-13 12:13:57,192 epoch 8 - iter 44/447 - loss 0.00952279 - time (sec): 2.59 - samples/sec: 3226.01 - lr: 0.000010 - momentum: 0.000000
2023-10-13 12:14:00,951 epoch 8 - iter 88/447 - loss 0.01075405 - time (sec): 6.35 - samples/sec: 2813.51 - lr: 0.000009 - momentum: 0.000000
2023-10-13 12:14:03,638 epoch 8 - iter 132/447 - loss 0.01163951 - time (sec): 9.03 - samples/sec: 2889.35 - lr: 0.000009 - momentum: 0.000000
2023-10-13 12:14:06,321 epoch 8 - iter 176/447 - loss 0.01011208 - time (sec): 11.72 - samples/sec: 2947.12 - lr: 0.000009 - momentum: 0.000000
2023-10-13 12:14:08,976 epoch 8 - iter 220/447 - loss 0.01040926 - time (sec): 14.37 - samples/sec: 2951.05 - lr: 0.000008 - momentum: 0.000000
2023-10-13 12:14:11,964 epoch 8 - iter 264/447 - loss 0.01021770 - time (sec): 17.36 - samples/sec: 2950.17 - lr: 0.000008 - momentum: 0.000000
2023-10-13 12:14:14,719 epoch 8 - iter 308/447 - loss 0.01160813 - time (sec): 20.11 - samples/sec: 2987.03 - lr: 0.000008 - momentum: 0.000000
2023-10-13 12:14:17,416 epoch 8 - iter 352/447 - loss 0.01157027 - time (sec): 22.81 - samples/sec: 2990.17 - lr: 0.000007 - momentum: 0.000000
2023-10-13 12:14:20,158 epoch 8 - iter 396/447 - loss 0.01170964 - time (sec): 25.55 - samples/sec: 3004.19 - lr: 0.000007 - momentum: 0.000000
2023-10-13 12:14:22,957 epoch 8 - iter 440/447 - loss 0.01121158 - time (sec): 28.35 - samples/sec: 3006.90 - lr: 0.000007 - momentum: 0.000000
2023-10-13 12:14:23,380 ----------------------------------------------------------------------------------------------------
2023-10-13 12:14:23,380 EPOCH 8 done: loss 0.0112 - lr: 0.000007
2023-10-13 12:14:31,497 DEV : loss 0.2167775183916092 - f1-score (micro avg) 0.7789
2023-10-13 12:14:31,525 ----------------------------------------------------------------------------------------------------
2023-10-13 12:14:34,360 epoch 9 - iter 44/447 - loss 0.00561482 - time (sec): 2.83 - samples/sec: 3020.24 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:14:37,025 epoch 9 - iter 88/447 - loss 0.00613922 - time (sec): 5.50 - samples/sec: 3060.61 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:14:39,869 epoch 9 - iter 132/447 - loss 0.00641510 - time (sec): 8.34 - samples/sec: 3001.76 - lr: 0.000006 - momentum: 0.000000
2023-10-13 12:14:42,645 epoch 9 - iter 176/447 - loss 0.00730428 - time (sec): 11.12 - samples/sec: 3027.86 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:14:46,449 epoch 9 - iter 220/447 - loss 0.00735542 - time (sec): 14.92 - samples/sec: 2903.03 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:14:49,357 epoch 9 - iter 264/447 - loss 0.00755574 - time (sec): 17.83 - samples/sec: 2901.71 - lr: 0.000005 - momentum: 0.000000
2023-10-13 12:14:52,247 epoch 9 - iter 308/447 - loss 0.00820703 - time (sec): 20.72 - samples/sec: 2885.30 - lr: 0.000004 - momentum: 0.000000
2023-10-13 12:14:55,197 epoch 9 - iter 352/447 - loss 0.00811542 - time (sec): 23.67 - samples/sec: 2904.86 - lr: 0.000004 - momentum: 0.000000
2023-10-13 12:14:57,809 epoch 9 - iter 396/447 - loss 0.00836396 - time (sec): 26.28 - samples/sec: 2915.48 - lr: 0.000004 - momentum: 0.000000
2023-10-13 12:15:00,699 epoch 9 - iter 440/447 - loss 0.00820656 - time (sec): 29.17 - samples/sec: 2916.14 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:15:01,337 ----------------------------------------------------------------------------------------------------
2023-10-13 12:15:01,337 EPOCH 9 done: loss 0.0082 - lr: 0.000003
2023-10-13 12:15:09,668 DEV : loss 0.21906858682632446 - f1-score (micro avg) 0.7831
2023-10-13 12:15:09,697 ----------------------------------------------------------------------------------------------------
2023-10-13 12:15:12,660 epoch 10 - iter 44/447 - loss 0.00224005 - time (sec): 2.96 - samples/sec: 3090.02 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:15:15,340 epoch 10 - iter 88/447 - loss 0.00566886 - time (sec): 5.64 - samples/sec: 3042.50 - lr: 0.000003 - momentum: 0.000000
2023-10-13 12:15:18,179 epoch 10 - iter 132/447 - loss 0.00777346 - time (sec): 8.48 - samples/sec: 2990.28 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:15:21,408 epoch 10 - iter 176/447 - loss 0.00673985 - time (sec): 11.71 - samples/sec: 3011.92 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:15:24,050 epoch 10 - iter 220/447 - loss 0.00725669 - time (sec): 14.35 - samples/sec: 3032.21 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:15:26,607 epoch 10 - iter 264/447 - loss 0.00740123 - time (sec): 16.91 - samples/sec: 3062.02 - lr: 0.000001 - momentum: 0.000000
2023-10-13 12:15:29,245 epoch 10 - iter 308/447 - loss 0.00855628 - time (sec): 19.55 - samples/sec: 3045.94 - lr: 0.000001 - momentum: 0.000000
2023-10-13 12:15:32,354 epoch 10 - iter 352/447 - loss 0.00792544 - time (sec): 22.66 - samples/sec: 3010.68 - lr: 0.000001 - momentum: 0.000000
2023-10-13 12:15:35,115 epoch 10 - iter 396/447 - loss 0.00837304 - time (sec): 25.42 - samples/sec: 3003.23 - lr: 0.000000 - momentum: 0.000000
2023-10-13 12:15:38,576 epoch 10 - iter 440/447 - loss 0.00803818 - time (sec): 28.88 - samples/sec: 2957.32 - lr: 0.000000 - momentum: 0.000000
2023-10-13 12:15:39,010 ----------------------------------------------------------------------------------------------------
2023-10-13 12:15:39,010 EPOCH 10 done: loss 0.0079 - lr: 0.000000
2023-10-13 12:15:47,210 DEV : loss 0.21315275132656097 - f1-score (micro avg) 0.7884
2023-10-13 12:15:47,592 ----------------------------------------------------------------------------------------------------
2023-10-13 12:15:47,594 Loading model from best epoch ...
2023-10-13 12:15:49,064 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-13 12:15:54,036
Results:
- F-score (micro) 0.7463
- F-score (macro) 0.6505
- Accuracy 0.6143
By class:
              precision    recall  f1-score   support

         loc     0.8507    0.8607    0.8557       596
        pers     0.6477    0.7508    0.6954       333
         org     0.5182    0.4318    0.4711       132
        prod     0.7045    0.4697    0.5636        66
        time     0.6600    0.6735    0.6667        49

   micro avg     0.7410    0.7517    0.7463      1176
   macro avg     0.6762    0.6373    0.6505      1176
weighted avg     0.7398    0.7517    0.7429      1176
2023-10-13 12:15:54,036 ----------------------------------------------------------------------------------------------------
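Note on the averaged rows: the summary figures can be cross-checked from the per-class rows of the final report. The sketch below copies the reported numbers by hand (the variable names are illustrative) and recomputes the macro F1 as the unweighted mean of class F1 scores, the weighted F1 as the support-weighted mean, and the micro F1 as the harmonic mean of the reported micro precision and recall. Small differences in the fourth decimal are possible because the inputs are already rounded.

```python
# Per-class rows from the final test report: (precision, recall, f1, support).
per_class = {
    "loc":  (0.8507, 0.8607, 0.8557, 596),
    "pers": (0.6477, 0.7508, 0.6954, 333),
    "org":  (0.5182, 0.4318, 0.4711, 132),
    "prod": (0.7045, 0.4697, 0.5636, 66),
    "time": (0.6600, 0.6735, 0.6667, 49),
}

total_support = sum(n for _, _, _, n in per_class.values())

# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: each class F1 is weighted by its support.
weighted_f1 = sum(f1 * n for _, _, f1, n in per_class.values()) / total_support

# Micro F1: harmonic mean of the reported micro precision and recall.
micro_p, micro_r = 0.7410, 0.7517
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"support total: {total_support}")
print(f"macro F1:      {macro_f1:.4f}")
print(f"weighted F1:   {weighted_f1:.4f}")
print(f"micro F1:      {micro_f1:.4f}")
```

The gap between micro F1 (0.7463) and macro F1 (0.6505) reflects the class imbalance: loc has the most support and the highest F1, while org and prod are both rarer and harder.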