2023-10-13 13:13:20,057 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
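The module summary above fully determines the model size. As a rough cross-check, here is a sketch in plain Python that adds up the parameters implied by the printed shapes (names like `hidden` and `per_layer` are just local labels, not Flair API):

```python
# Parameter count implied by the printed module summary (bias terms included).
hidden, ffn, vocab, max_pos, layers, num_tags = 768, 3072, 32001, 512, 12, 21

def linear(n_in, n_out):
    """Parameters of Linear(n_in, n_out, bias=True): weight + bias."""
    return n_in * n_out + n_out

layer_norm = 2 * hidden  # gamma + beta

embeddings = (vocab * hidden      # word_embeddings
              + max_pos * hidden  # position_embeddings
              + 2 * hidden        # token_type_embeddings
              + layer_norm)

per_layer = (3 * linear(hidden, hidden)  # query / key / value
             + linear(hidden, hidden)    # attention output dense
             + layer_norm
             + linear(hidden, ffn)       # intermediate
             + linear(ffn, hidden)       # output
             + layer_norm)

pooler = linear(hidden, hidden)
tag_head = linear(hidden, num_tags)      # the (linear) tagging head

total = embeddings + layers * per_layer + pooler + tag_head
print(total)  # 110634261, i.e. ~110.6M parameters
```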
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Train: 3575 sentences
2023-10-13 13:13:20,058 (train_with_dev=False, train_with_test=False)
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Training Params:
2023-10-13 13:13:20,058 - learning_rate: "3e-05"
2023-10-13 13:13:20,058 - mini_batch_size: "8"
2023-10-13 13:13:20,058 - max_epochs: "10"
2023-10-13 13:13:20,058 - shuffle: "True"
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Plugins:
2023-10-13 13:13:20,058 - LinearScheduler | warmup_fraction: '0.1'
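The LinearScheduler ramps the learning rate up over the first 10% of optimizer updates and then decays it linearly to zero, which is exactly the shape of the `lr` column logged below (447 mini-batches per epoch over 10 epochs). A minimal sketch in plain Python, where `peak_lr` and `lr_at` are illustrative names rather than Flair API:

```python
# Linear warmup over the first 10% of updates, then linear decay to zero.
peak_lr = 3e-05
total_steps = 447 * 10                  # 447 mini-batches/epoch x 10 epochs
warmup_steps = int(0.1 * total_steps)   # warmup_fraction: 0.1 -> 447 steps

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(round(lr_at(44), 6))    # epoch 1, iter 44 -> 3e-06  (logged: 0.000003)
print(round(lr_at(491), 6))   # epoch 2, iter 44 -> 3e-05  (logged: 0.000030)
print(lr_at(total_steps))     # end of training  -> 0.0
```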
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 13:13:20,058 - metric: "('micro avg', 'f1-score')"
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Computation:
2023-10-13 13:13:20,058 - compute on device: cuda:0
2023-10-13 13:13:20,058 - embedding storage: none
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:22,914 epoch 1 - iter 44/447 - loss 3.11777082 - time (sec): 2.85 - samples/sec: 3068.12 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:13:25,858 epoch 1 - iter 88/447 - loss 2.36193790 - time (sec): 5.80 - samples/sec: 3064.20 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:13:28,576 epoch 1 - iter 132/447 - loss 1.79092667 - time (sec): 8.52 - samples/sec: 3020.50 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:13:31,655 epoch 1 - iter 176/447 - loss 1.44848537 - time (sec): 11.60 - samples/sec: 2976.51 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:13:34,436 epoch 1 - iter 220/447 - loss 1.23228332 - time (sec): 14.38 - samples/sec: 2970.99 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:13:37,240 epoch 1 - iter 264/447 - loss 1.08886007 - time (sec): 17.18 - samples/sec: 2964.39 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:13:39,997 epoch 1 - iter 308/447 - loss 0.98213248 - time (sec): 19.94 - samples/sec: 2970.69 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:13:42,747 epoch 1 - iter 352/447 - loss 0.89561881 - time (sec): 22.69 - samples/sec: 2980.97 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:13:45,432 epoch 1 - iter 396/447 - loss 0.82242688 - time (sec): 25.37 - samples/sec: 2981.41 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:13:48,627 epoch 1 - iter 440/447 - loss 0.75951686 - time (sec): 28.57 - samples/sec: 2983.72 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:13:49,039 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:49,040 EPOCH 1 done: loss 0.7518 - lr: 0.000029
2023-10-13 13:13:53,953 DEV : loss 0.18360073864459991 - f1-score (micro avg) 0.6411
2023-10-13 13:13:53,978 saving best model
2023-10-13 13:13:54,320 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:57,283 epoch 2 - iter 44/447 - loss 0.21325634 - time (sec): 2.96 - samples/sec: 3023.79 - lr: 0.000030 - momentum: 0.000000
2023-10-13 13:14:00,394 epoch 2 - iter 88/447 - loss 0.19763757 - time (sec): 6.07 - samples/sec: 3047.06 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:14:02,962 epoch 2 - iter 132/447 - loss 0.18755004 - time (sec): 8.64 - samples/sec: 3031.14 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:14:05,602 epoch 2 - iter 176/447 - loss 0.19041313 - time (sec): 11.28 - samples/sec: 3055.96 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:14:08,514 epoch 2 - iter 220/447 - loss 0.18445879 - time (sec): 14.19 - samples/sec: 3035.46 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:14:11,208 epoch 2 - iter 264/447 - loss 0.17645831 - time (sec): 16.89 - samples/sec: 3063.78 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:14:13,825 epoch 2 - iter 308/447 - loss 0.17200425 - time (sec): 19.50 - samples/sec: 3061.76 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:14:16,387 epoch 2 - iter 352/447 - loss 0.16971408 - time (sec): 22.07 - samples/sec: 3070.68 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:14:19,560 epoch 2 - iter 396/447 - loss 0.16473002 - time (sec): 25.24 - samples/sec: 3044.12 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:14:22,291 epoch 2 - iter 440/447 - loss 0.16259189 - time (sec): 27.97 - samples/sec: 3047.79 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:14:22,795 ----------------------------------------------------------------------------------------------------
2023-10-13 13:14:22,796 EPOCH 2 done: loss 0.1618 - lr: 0.000027
2023-10-13 13:14:31,384 DEV : loss 0.1280011683702469 - f1-score (micro avg) 0.6793
2023-10-13 13:14:31,411 saving best model
2023-10-13 13:14:31,868 ----------------------------------------------------------------------------------------------------
2023-10-13 13:14:34,764 epoch 3 - iter 44/447 - loss 0.10970391 - time (sec): 2.89 - samples/sec: 2957.74 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:14:38,002 epoch 3 - iter 88/447 - loss 0.10095129 - time (sec): 6.13 - samples/sec: 2911.29 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:14:40,901 epoch 3 - iter 132/447 - loss 0.09159053 - time (sec): 9.03 - samples/sec: 2902.88 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:14:43,698 epoch 3 - iter 176/447 - loss 0.09005402 - time (sec): 11.83 - samples/sec: 2913.87 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:14:46,381 epoch 3 - iter 220/447 - loss 0.08834012 - time (sec): 14.51 - samples/sec: 2898.73 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:14:49,199 epoch 3 - iter 264/447 - loss 0.08867495 - time (sec): 17.33 - samples/sec: 2917.60 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:14:51,891 epoch 3 - iter 308/447 - loss 0.08828525 - time (sec): 20.02 - samples/sec: 2931.90 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:14:54,732 epoch 3 - iter 352/447 - loss 0.08402043 - time (sec): 22.86 - samples/sec: 2953.93 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:14:57,347 epoch 3 - iter 396/447 - loss 0.08793302 - time (sec): 25.48 - samples/sec: 2976.67 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:15:00,431 epoch 3 - iter 440/447 - loss 0.08822390 - time (sec): 28.56 - samples/sec: 2987.02 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:15:00,838 ----------------------------------------------------------------------------------------------------
2023-10-13 13:15:00,839 EPOCH 3 done: loss 0.0882 - lr: 0.000023
2023-10-13 13:15:09,537 DEV : loss 0.1200539767742157 - f1-score (micro avg) 0.7483
2023-10-13 13:15:09,567 saving best model
2023-10-13 13:15:09,992 ----------------------------------------------------------------------------------------------------
2023-10-13 13:15:12,750 epoch 4 - iter 44/447 - loss 0.04601992 - time (sec): 2.76 - samples/sec: 3061.98 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:15:15,345 epoch 4 - iter 88/447 - loss 0.05132868 - time (sec): 5.35 - samples/sec: 3066.86 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:15:18,079 epoch 4 - iter 132/447 - loss 0.04850121 - time (sec): 8.09 - samples/sec: 3088.42 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:15:20,701 epoch 4 - iter 176/447 - loss 0.04502986 - time (sec): 10.71 - samples/sec: 3112.42 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:15:24,117 epoch 4 - iter 220/447 - loss 0.04746775 - time (sec): 14.12 - samples/sec: 3070.89 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:15:26,903 epoch 4 - iter 264/447 - loss 0.04791562 - time (sec): 16.91 - samples/sec: 3071.67 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:15:29,531 epoch 4 - iter 308/447 - loss 0.04886274 - time (sec): 19.54 - samples/sec: 3060.31 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:15:32,213 epoch 4 - iter 352/447 - loss 0.04909487 - time (sec): 22.22 - samples/sec: 3052.34 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:15:35,496 epoch 4 - iter 396/447 - loss 0.04995671 - time (sec): 25.50 - samples/sec: 3031.17 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:15:38,187 epoch 4 - iter 440/447 - loss 0.05051059 - time (sec): 28.19 - samples/sec: 3023.06 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:15:38,616 ----------------------------------------------------------------------------------------------------
2023-10-13 13:15:38,617 EPOCH 4 done: loss 0.0506 - lr: 0.000020
2023-10-13 13:15:47,029 DEV : loss 0.14254607260227203 - f1-score (micro avg) 0.7467
2023-10-13 13:15:47,056 ----------------------------------------------------------------------------------------------------
2023-10-13 13:15:49,999 epoch 5 - iter 44/447 - loss 0.03966301 - time (sec): 2.94 - samples/sec: 3049.04 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:15:52,832 epoch 5 - iter 88/447 - loss 0.03669848 - time (sec): 5.78 - samples/sec: 2990.27 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:15:55,761 epoch 5 - iter 132/447 - loss 0.03533218 - time (sec): 8.70 - samples/sec: 3000.27 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:15:58,591 epoch 5 - iter 176/447 - loss 0.03429375 - time (sec): 11.53 - samples/sec: 3006.74 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:16:01,229 epoch 5 - iter 220/447 - loss 0.03499214 - time (sec): 14.17 - samples/sec: 3009.08 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:16:04,099 epoch 5 - iter 264/447 - loss 0.03588282 - time (sec): 17.04 - samples/sec: 3007.92 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:16:07,274 epoch 5 - iter 308/447 - loss 0.03556606 - time (sec): 20.22 - samples/sec: 2996.33 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:16:09,873 epoch 5 - iter 352/447 - loss 0.03896903 - time (sec): 22.82 - samples/sec: 3006.18 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:16:12,685 epoch 5 - iter 396/447 - loss 0.03718731 - time (sec): 25.63 - samples/sec: 2995.43 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:16:15,538 epoch 5 - iter 440/447 - loss 0.03556750 - time (sec): 28.48 - samples/sec: 2997.63 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:16:15,921 ----------------------------------------------------------------------------------------------------
2023-10-13 13:16:15,922 EPOCH 5 done: loss 0.0352 - lr: 0.000017
2023-10-13 13:16:24,461 DEV : loss 0.16648352146148682 - f1-score (micro avg) 0.7495
2023-10-13 13:16:24,487 saving best model
2023-10-13 13:16:24,910 ----------------------------------------------------------------------------------------------------
2023-10-13 13:16:27,776 epoch 6 - iter 44/447 - loss 0.01297918 - time (sec): 2.86 - samples/sec: 3007.92 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:16:30,760 epoch 6 - iter 88/447 - loss 0.01521816 - time (sec): 5.85 - samples/sec: 3013.80 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:16:33,450 epoch 6 - iter 132/447 - loss 0.01783763 - time (sec): 8.53 - samples/sec: 3042.85 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:16:36,642 epoch 6 - iter 176/447 - loss 0.01724718 - time (sec): 11.73 - samples/sec: 3055.21 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:16:39,440 epoch 6 - iter 220/447 - loss 0.01881627 - time (sec): 14.53 - samples/sec: 2985.20 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:16:42,158 epoch 6 - iter 264/447 - loss 0.01817880 - time (sec): 17.24 - samples/sec: 2985.53 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:16:45,022 epoch 6 - iter 308/447 - loss 0.01947892 - time (sec): 20.11 - samples/sec: 2982.50 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:16:47,824 epoch 6 - iter 352/447 - loss 0.02012745 - time (sec): 22.91 - samples/sec: 2969.62 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:16:50,622 epoch 6 - iter 396/447 - loss 0.02016141 - time (sec): 25.71 - samples/sec: 2992.05 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:16:53,327 epoch 6 - iter 440/447 - loss 0.02126196 - time (sec): 28.41 - samples/sec: 3002.50 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:16:53,729 ----------------------------------------------------------------------------------------------------
2023-10-13 13:16:53,730 EPOCH 6 done: loss 0.0212 - lr: 0.000013
2023-10-13 13:17:02,414 DEV : loss 0.173013374209404 - f1-score (micro avg) 0.7741
2023-10-13 13:17:02,440 saving best model
2023-10-13 13:17:02,868 ----------------------------------------------------------------------------------------------------
2023-10-13 13:17:06,304 epoch 7 - iter 44/447 - loss 0.02481076 - time (sec): 3.43 - samples/sec: 2891.86 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:17:09,074 epoch 7 - iter 88/447 - loss 0.01608324 - time (sec): 6.20 - samples/sec: 2875.39 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:17:12,040 epoch 7 - iter 132/447 - loss 0.01410154 - time (sec): 9.17 - samples/sec: 2899.58 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:17:14,929 epoch 7 - iter 176/447 - loss 0.01489213 - time (sec): 12.06 - samples/sec: 2938.16 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:17:17,754 epoch 7 - iter 220/447 - loss 0.01604270 - time (sec): 14.88 - samples/sec: 2952.10 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:17:20,446 epoch 7 - iter 264/447 - loss 0.01618986 - time (sec): 17.58 - samples/sec: 2938.06 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:17:23,179 epoch 7 - iter 308/447 - loss 0.01644546 - time (sec): 20.31 - samples/sec: 2961.42 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:17:25,924 epoch 7 - iter 352/447 - loss 0.01537412 - time (sec): 23.05 - samples/sec: 2965.75 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:17:28,546 epoch 7 - iter 396/447 - loss 0.01579262 - time (sec): 25.68 - samples/sec: 2974.59 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:17:31,395 epoch 7 - iter 440/447 - loss 0.01544474 - time (sec): 28.53 - samples/sec: 2995.90 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:17:31,781 ----------------------------------------------------------------------------------------------------
2023-10-13 13:17:31,782 EPOCH 7 done: loss 0.0155 - lr: 0.000010
2023-10-13 13:17:39,921 DEV : loss 0.1985795646905899 - f1-score (micro avg) 0.783
2023-10-13 13:17:39,950 saving best model
2023-10-13 13:17:40,398 ----------------------------------------------------------------------------------------------------
2023-10-13 13:17:43,232 epoch 8 - iter 44/447 - loss 0.00957408 - time (sec): 2.83 - samples/sec: 3033.73 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:17:46,091 epoch 8 - iter 88/447 - loss 0.00922003 - time (sec): 5.69 - samples/sec: 3010.78 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:17:48,808 epoch 8 - iter 132/447 - loss 0.01038901 - time (sec): 8.41 - samples/sec: 3015.07 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:17:51,553 epoch 8 - iter 176/447 - loss 0.01041656 - time (sec): 11.15 - samples/sec: 3004.15 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:17:54,368 epoch 8 - iter 220/447 - loss 0.00978835 - time (sec): 13.97 - samples/sec: 2991.67 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:17:57,374 epoch 8 - iter 264/447 - loss 0.00983571 - time (sec): 16.97 - samples/sec: 2957.65 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:18:00,184 epoch 8 - iter 308/447 - loss 0.00935734 - time (sec): 19.78 - samples/sec: 2954.35 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:18:03,374 epoch 8 - iter 352/447 - loss 0.01020653 - time (sec): 22.97 - samples/sec: 2943.90 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:18:06,439 epoch 8 - iter 396/447 - loss 0.01083975 - time (sec): 26.04 - samples/sec: 2945.85 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:18:09,151 epoch 8 - iter 440/447 - loss 0.01083258 - time (sec): 28.75 - samples/sec: 2958.89 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:18:09,642 ----------------------------------------------------------------------------------------------------
2023-10-13 13:18:09,643 EPOCH 8 done: loss 0.0108 - lr: 0.000007
2023-10-13 13:18:17,739 DEV : loss 0.2084827721118927 - f1-score (micro avg) 0.7768
2023-10-13 13:18:17,767 ----------------------------------------------------------------------------------------------------
2023-10-13 13:18:20,478 epoch 9 - iter 44/447 - loss 0.00612637 - time (sec): 2.71 - samples/sec: 3061.27 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:18:23,449 epoch 9 - iter 88/447 - loss 0.00545493 - time (sec): 5.68 - samples/sec: 2988.83 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:18:26,013 epoch 9 - iter 132/447 - loss 0.00810978 - time (sec): 8.24 - samples/sec: 3022.80 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:18:28,792 epoch 9 - iter 176/447 - loss 0.00943692 - time (sec): 11.02 - samples/sec: 3017.39 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:18:31,737 epoch 9 - iter 220/447 - loss 0.00827507 - time (sec): 13.97 - samples/sec: 3011.35 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:18:34,594 epoch 9 - iter 264/447 - loss 0.00745688 - time (sec): 16.83 - samples/sec: 2998.27 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:18:37,258 epoch 9 - iter 308/447 - loss 0.00768552 - time (sec): 19.49 - samples/sec: 3026.07 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:18:41,084 epoch 9 - iter 352/447 - loss 0.00760993 - time (sec): 23.32 - samples/sec: 2963.42 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:18:43,947 epoch 9 - iter 396/447 - loss 0.00728736 - time (sec): 26.18 - samples/sec: 2954.39 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:18:46,795 epoch 9 - iter 440/447 - loss 0.00694997 - time (sec): 29.03 - samples/sec: 2944.30 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:18:47,199 ----------------------------------------------------------------------------------------------------
2023-10-13 13:18:47,199 EPOCH 9 done: loss 0.0070 - lr: 0.000003
2023-10-13 13:18:55,583 DEV : loss 0.2126864343881607 - f1-score (micro avg) 0.7776
2023-10-13 13:18:55,611 ----------------------------------------------------------------------------------------------------
2023-10-13 13:18:58,894 epoch 10 - iter 44/447 - loss 0.00635944 - time (sec): 3.28 - samples/sec: 3012.37 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:19:02,008 epoch 10 - iter 88/447 - loss 0.00527678 - time (sec): 6.40 - samples/sec: 2899.98 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:19:04,861 epoch 10 - iter 132/447 - loss 0.00649806 - time (sec): 9.25 - samples/sec: 2899.02 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:19:07,541 epoch 10 - iter 176/447 - loss 0.00621582 - time (sec): 11.93 - samples/sec: 2918.26 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:19:10,370 epoch 10 - iter 220/447 - loss 0.00571437 - time (sec): 14.76 - samples/sec: 2934.41 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:19:13,065 epoch 10 - iter 264/447 - loss 0.00591055 - time (sec): 17.45 - samples/sec: 2939.79 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:19:15,840 epoch 10 - iter 308/447 - loss 0.00566945 - time (sec): 20.23 - samples/sec: 2941.50 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:19:18,771 epoch 10 - iter 352/447 - loss 0.00536572 - time (sec): 23.16 - samples/sec: 2940.96 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:19:21,445 epoch 10 - iter 396/447 - loss 0.00529827 - time (sec): 25.83 - samples/sec: 2959.09 - lr: 0.000000 - momentum: 0.000000
2023-10-13 13:19:24,347 epoch 10 - iter 440/447 - loss 0.00512907 - time (sec): 28.73 - samples/sec: 2976.79 - lr: 0.000000 - momentum: 0.000000
2023-10-13 13:19:24,753 ----------------------------------------------------------------------------------------------------
2023-10-13 13:19:24,753 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-13 13:19:33,426 DEV : loss 0.2154415100812912 - f1-score (micro avg) 0.7754
2023-10-13 13:19:33,796 ----------------------------------------------------------------------------------------------------
2023-10-13 13:19:33,798 Loading model from best epoch ...
2023-10-13 13:19:35,452 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
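The 21 tags follow the BIOES scheme: one `O` tag plus four positional variants (S-ingle, B-egin, E-nd, I-nside) for each of the five HIPE-2020 entity types. This also explains the `Linear(in_features=768, out_features=21)` head in the model summary. A quick sketch in plain Python:

```python
# Reconstruct the BIOES tag dictionary: O, then S-/B-/E-/I- per entity type.
entity_types = ["loc", "pers", "org", "prod", "time"]
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 21, matching the tagger's output layer
print(tags)
```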
2023-10-13 13:19:40,515
Results:
- F-score (micro) 0.7437
- F-score (macro) 0.6536
- Accuracy 0.6094
By class:
              precision    recall  f1-score   support

         loc     0.8596    0.8322    0.8457       596
        pers     0.6605    0.7538    0.7041       333
         org     0.5310    0.4545    0.4898       132
        prod     0.5957    0.4242    0.4956        66
        time     0.7115    0.7551    0.7327        49

   micro avg     0.7459    0.7415    0.7437      1176
   macro avg     0.6717    0.6440    0.6536      1176
weighted avg     0.7454    0.7415    0.7413      1176
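The average rows follow from the per-class rows: micro-averaging pools true-positive, predicted, and gold counts across classes, while macro-averaging takes the unweighted mean of the per-class F1 scores. A sanity check in plain Python (per-class counts are reconstructed from the rounded precision/recall figures, so this is an approximation):

```python
# Re-derive the micro/macro averages from the per-class rows.
# class: (precision, recall, f1, support)
by_class = {
    "loc":  (0.8596, 0.8322, 0.8457, 596),
    "pers": (0.6605, 0.7538, 0.7041, 333),
    "org":  (0.5310, 0.4545, 0.4898, 132),
    "prod": (0.5957, 0.4242, 0.4956,  66),
    "time": (0.7115, 0.7551, 0.7327,  49),
}

tp = pred = gold = 0
for p, r, f1, support in by_class.values():
    c_tp = round(r * support)   # true positives for this class
    tp += c_tp
    pred += round(c_tp / p)     # spans predicted as this class
    gold += support             # gold spans of this class

micro_f1 = 2 * tp / (pred + gold)
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)
print(round(micro_f1, 4))  # 0.7437
print(round(macro_f1, 4))  # 0.6536
```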
2023-10-13 13:19:40,515 ----------------------------------------------------------------------------------------------------