2023-10-13 08:35:49,871 ----------------------------------------------------------------------------------------------------
2023-10-13 08:35:49,872 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 08:35:49,872 ----------------------------------------------------------------------------------------------------
2023-10-13 08:35:49,872 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-13 08:35:49,872 ----------------------------------------------------------------------------------------------------
2023-10-13 08:35:49,872 Train: 1100 sentences
2023-10-13 08:35:49,872 (train_with_dev=False, train_with_test=False)
2023-10-13 08:35:49,872 ----------------------------------------------------------------------------------------------------
2023-10-13 08:35:49,872 Training Params:
2023-10-13 08:35:49,872 - learning_rate: "3e-05"
2023-10-13 08:35:49,872 - mini_batch_size: "8"
2023-10-13 08:35:49,872 - max_epochs: "10"
2023-10-13 08:35:49,872 - shuffle: "True"
2023-10-13 08:35:49,872 ----------------------------------------------------------------------------------------------------
2023-10-13 08:35:49,872 Plugins:
2023-10-13 08:35:49,872 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:35:49,872 ----------------------------------------------------------------------------------------------------
2023-10-13 08:35:49,872 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:35:49,873 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:35:49,873 ----------------------------------------------------------------------------------------------------
2023-10-13 08:35:49,873 Computation:
2023-10-13 08:35:49,873 - compute on device: cuda:0
2023-10-13 08:35:49,873 - embedding storage: none
2023-10-13 08:35:49,873 ----------------------------------------------------------------------------------------------------
2023-10-13 08:35:49,873 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-13 08:35:49,873 ----------------------------------------------------------------------------------------------------
2023-10-13 08:35:49,873 ----------------------------------------------------------------------------------------------------
2023-10-13 08:35:50,616 epoch 1 - iter 13/138 - loss 3.15683476 - time (sec): 0.74 - samples/sec: 3189.10 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:35:51,303 epoch 1 - iter 26/138 - loss 2.97590677 - time (sec): 1.42 - samples/sec: 3138.04 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:35:52,024 epoch 1 - iter 39/138 - loss 2.59052689 - time (sec): 2.14 - samples/sec: 3037.49 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:35:52,769 epoch 1 - iter 52/138 - loss 2.12928205 - time (sec): 2.89 - samples/sec: 3024.07 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:35:53,479 epoch 1 - iter 65/138 - loss 1.89733493 - time (sec): 3.60 - samples/sec: 3011.76 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:35:54,187 epoch 1 - iter 78/138 - loss 1.74800111 - time (sec): 4.31 - samples/sec: 2956.04 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:35:54,980 epoch 1 - iter 91/138 - loss 1.58409426 - time (sec): 5.10 - samples/sec: 2973.59 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:35:55,691 epoch 1 - iter 104/138 - loss 1.44967089 - time (sec): 5.81 - samples/sec: 2955.57 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:35:56,399 epoch 1 - iter 117/138 - loss 1.34104244 - time (sec): 6.52 - samples/sec: 2968.77 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:35:57,123 epoch 1 - iter 130/138 - loss 1.24619147 - time (sec): 7.24 - samples/sec: 2981.49 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:35:57,538 ----------------------------------------------------------------------------------------------------
2023-10-13 08:35:57,539 EPOCH 1 done: loss 1.2061 - lr: 0.000028
2023-10-13 08:35:58,275 DEV : loss 0.2992438077926636 - f1-score (micro avg) 0.6502
2023-10-13 08:35:58,280 saving best model
2023-10-13 08:35:58,652 ----------------------------------------------------------------------------------------------------
2023-10-13 08:35:59,357 epoch 2 - iter 13/138 - loss 0.25983347 - time (sec): 0.70 - samples/sec: 2769.65 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:36:00,083 epoch 2 - iter 26/138 - loss 0.30280514 - time (sec): 1.43 - samples/sec: 2974.06 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:36:00,856 epoch 2 - iter 39/138 - loss 0.28969457 - time (sec): 2.20 - samples/sec: 2978.71 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:36:01,603 epoch 2 - iter 52/138 - loss 0.27768767 - time (sec): 2.95 - samples/sec: 3006.01 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:36:02,420 epoch 2 - iter 65/138 - loss 0.26921623 - time (sec): 3.77 - samples/sec: 2997.75 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:36:03,123 epoch 2 - iter 78/138 - loss 0.25522822 - time (sec): 4.47 - samples/sec: 2970.26 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:36:03,848 epoch 2 - iter 91/138 - loss 0.24593549 - time (sec): 5.19 - samples/sec: 2965.62 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:36:04,589 epoch 2 - iter 104/138 - loss 0.23909381 - time (sec): 5.94 - samples/sec: 2965.65 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:36:05,356 epoch 2 - iter 117/138 - loss 0.23411641 - time (sec): 6.70 - samples/sec: 2952.40 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:36:06,063 epoch 2 - iter 130/138 - loss 0.22805297 - time (sec): 7.41 - samples/sec: 2932.26 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:36:06,517 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:06,517 EPOCH 2 done: loss 0.2288 - lr: 0.000027
2023-10-13 08:36:07,227 DEV : loss 0.14183788001537323 - f1-score (micro avg) 0.7919
2023-10-13 08:36:07,233 saving best model
2023-10-13 08:36:07,718 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:08,438 epoch 3 - iter 13/138 - loss 0.10195422 - time (sec): 0.71 - samples/sec: 2982.10 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:36:09,168 epoch 3 - iter 26/138 - loss 0.13022547 - time (sec): 1.45 - samples/sec: 3036.65 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:36:09,910 epoch 3 - iter 39/138 - loss 0.13367901 - time (sec): 2.19 - samples/sec: 3072.58 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:36:10,607 epoch 3 - iter 52/138 - loss 0.12155712 - time (sec): 2.88 - samples/sec: 3047.37 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:36:11,368 epoch 3 - iter 65/138 - loss 0.12300005 - time (sec): 3.65 - samples/sec: 3049.34 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:36:12,064 epoch 3 - iter 78/138 - loss 0.12568278 - time (sec): 4.34 - samples/sec: 3020.54 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:36:12,783 epoch 3 - iter 91/138 - loss 0.12291916 - time (sec): 5.06 - samples/sec: 2999.08 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:36:13,546 epoch 3 - iter 104/138 - loss 0.11825695 - time (sec): 5.82 - samples/sec: 2982.45 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:36:14,266 epoch 3 - iter 117/138 - loss 0.12228047 - time (sec): 6.54 - samples/sec: 2992.85 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:36:14,985 epoch 3 - iter 130/138 - loss 0.11723604 - time (sec): 7.26 - samples/sec: 2962.60 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:36:15,439 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:15,439 EPOCH 3 done: loss 0.1163 - lr: 0.000024
2023-10-13 08:36:16,078 DEV : loss 0.12795314192771912 - f1-score (micro avg) 0.8404
2023-10-13 08:36:16,084 saving best model
2023-10-13 08:36:16,549 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:17,287 epoch 4 - iter 13/138 - loss 0.06034711 - time (sec): 0.73 - samples/sec: 2955.79 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:36:18,017 epoch 4 - iter 26/138 - loss 0.07457368 - time (sec): 1.46 - samples/sec: 2940.29 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:36:18,759 epoch 4 - iter 39/138 - loss 0.06266461 - time (sec): 2.20 - samples/sec: 2990.35 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:36:19,467 epoch 4 - iter 52/138 - loss 0.07363202 - time (sec): 2.91 - samples/sec: 2966.39 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:36:20,204 epoch 4 - iter 65/138 - loss 0.07604624 - time (sec): 3.65 - samples/sec: 2937.83 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:36:20,923 epoch 4 - iter 78/138 - loss 0.07566573 - time (sec): 4.37 - samples/sec: 2951.17 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:36:21,659 epoch 4 - iter 91/138 - loss 0.07965420 - time (sec): 5.10 - samples/sec: 2941.06 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:36:22,414 epoch 4 - iter 104/138 - loss 0.07795115 - time (sec): 5.86 - samples/sec: 2934.96 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:36:23,100 epoch 4 - iter 117/138 - loss 0.07576140 - time (sec): 6.54 - samples/sec: 2936.89 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:36:23,831 epoch 4 - iter 130/138 - loss 0.07503969 - time (sec): 7.27 - samples/sec: 2935.23 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:36:24,324 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:24,325 EPOCH 4 done: loss 0.0728 - lr: 0.000020
2023-10-13 08:36:24,968 DEV : loss 0.1402752846479416 - f1-score (micro avg) 0.8524
2023-10-13 08:36:24,973 saving best model
2023-10-13 08:36:25,415 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:26,228 epoch 5 - iter 13/138 - loss 0.06027946 - time (sec): 0.81 - samples/sec: 2945.94 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:36:26,946 epoch 5 - iter 26/138 - loss 0.05871368 - time (sec): 1.52 - samples/sec: 2987.70 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:36:27,658 epoch 5 - iter 39/138 - loss 0.06897645 - time (sec): 2.24 - samples/sec: 2972.34 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:36:28,440 epoch 5 - iter 52/138 - loss 0.06086560 - time (sec): 3.02 - samples/sec: 2913.98 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:36:29,169 epoch 5 - iter 65/138 - loss 0.05969031 - time (sec): 3.75 - samples/sec: 2914.98 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:36:29,855 epoch 5 - iter 78/138 - loss 0.05410138 - time (sec): 4.43 - samples/sec: 2886.97 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:36:30,639 epoch 5 - iter 91/138 - loss 0.05821061 - time (sec): 5.22 - samples/sec: 2883.88 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:36:31,389 epoch 5 - iter 104/138 - loss 0.06126496 - time (sec): 5.97 - samples/sec: 2878.10 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:36:32,157 epoch 5 - iter 117/138 - loss 0.05957542 - time (sec): 6.74 - samples/sec: 2882.63 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:36:32,876 epoch 5 - iter 130/138 - loss 0.05797760 - time (sec): 7.45 - samples/sec: 2897.41 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:36:33,308 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:33,309 EPOCH 5 done: loss 0.0593 - lr: 0.000017
2023-10-13 08:36:33,959 DEV : loss 0.14202465116977692 - f1-score (micro avg) 0.844
2023-10-13 08:36:33,964 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:34,759 epoch 6 - iter 13/138 - loss 0.03744477 - time (sec): 0.79 - samples/sec: 2790.32 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:36:35,493 epoch 6 - iter 26/138 - loss 0.04100794 - time (sec): 1.53 - samples/sec: 2990.48 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:36:36,176 epoch 6 - iter 39/138 - loss 0.04319891 - time (sec): 2.21 - samples/sec: 2934.17 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:36:36,888 epoch 6 - iter 52/138 - loss 0.04264249 - time (sec): 2.92 - samples/sec: 2930.78 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:36:37,604 epoch 6 - iter 65/138 - loss 0.03835619 - time (sec): 3.64 - samples/sec: 2923.30 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:36:38,322 epoch 6 - iter 78/138 - loss 0.04033351 - time (sec): 4.36 - samples/sec: 2934.77 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:36:39,029 epoch 6 - iter 91/138 - loss 0.03574248 - time (sec): 5.06 - samples/sec: 2944.83 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:36:39,747 epoch 6 - iter 104/138 - loss 0.03858039 - time (sec): 5.78 - samples/sec: 2942.32 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:36:40,504 epoch 6 - iter 117/138 - loss 0.04052553 - time (sec): 6.54 - samples/sec: 2965.32 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:36:41,237 epoch 6 - iter 130/138 - loss 0.04293068 - time (sec): 7.27 - samples/sec: 2967.77 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:36:41,688 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:41,689 EPOCH 6 done: loss 0.0429 - lr: 0.000014
2023-10-13 08:36:42,396 DEV : loss 0.1494191288948059 - f1-score (micro avg) 0.8473
2023-10-13 08:36:42,402 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:43,147 epoch 7 - iter 13/138 - loss 0.05958199 - time (sec): 0.74 - samples/sec: 2975.32 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:36:43,846 epoch 7 - iter 26/138 - loss 0.04101013 - time (sec): 1.44 - samples/sec: 2865.68 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:36:44,573 epoch 7 - iter 39/138 - loss 0.04307772 - time (sec): 2.17 - samples/sec: 2898.63 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:36:45,302 epoch 7 - iter 52/138 - loss 0.03531247 - time (sec): 2.90 - samples/sec: 2959.93 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:36:46,049 epoch 7 - iter 65/138 - loss 0.04364804 - time (sec): 3.65 - samples/sec: 2950.20 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:36:46,845 epoch 7 - iter 78/138 - loss 0.03840951 - time (sec): 4.44 - samples/sec: 2925.67 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:36:47,597 epoch 7 - iter 91/138 - loss 0.03719766 - time (sec): 5.19 - samples/sec: 2922.63 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:36:48,381 epoch 7 - iter 104/138 - loss 0.03701463 - time (sec): 5.98 - samples/sec: 2879.81 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:36:49,112 epoch 7 - iter 117/138 - loss 0.03419231 - time (sec): 6.71 - samples/sec: 2887.75 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:36:49,811 epoch 7 - iter 130/138 - loss 0.03397131 - time (sec): 7.41 - samples/sec: 2910.57 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:36:50,236 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:50,237 EPOCH 7 done: loss 0.0353 - lr: 0.000010
2023-10-13 08:36:50,972 DEV : loss 0.15942086279392242 - f1-score (micro avg) 0.8605
2023-10-13 08:36:50,978 saving best model
2023-10-13 08:36:51,624 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:52,339 epoch 8 - iter 13/138 - loss 0.02168777 - time (sec): 0.71 - samples/sec: 3139.01 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:36:53,113 epoch 8 - iter 26/138 - loss 0.02661765 - time (sec): 1.49 - samples/sec: 2892.93 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:36:53,827 epoch 8 - iter 39/138 - loss 0.02310758 - time (sec): 2.20 - samples/sec: 2899.57 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:36:54,557 epoch 8 - iter 52/138 - loss 0.02772704 - time (sec): 2.93 - samples/sec: 2967.36 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:36:55,356 epoch 8 - iter 65/138 - loss 0.03020615 - time (sec): 3.73 - samples/sec: 2935.59 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:36:56,086 epoch 8 - iter 78/138 - loss 0.02991360 - time (sec): 4.46 - samples/sec: 2930.56 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:36:56,812 epoch 8 - iter 91/138 - loss 0.03217017 - time (sec): 5.19 - samples/sec: 2953.87 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:36:57,488 epoch 8 - iter 104/138 - loss 0.02921793 - time (sec): 5.86 - samples/sec: 2906.58 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:36:58,245 epoch 8 - iter 117/138 - loss 0.02851128 - time (sec): 6.62 - samples/sec: 2886.50 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:36:58,985 epoch 8 - iter 130/138 - loss 0.02828101 - time (sec): 7.36 - samples/sec: 2922.33 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:36:59,405 ----------------------------------------------------------------------------------------------------
2023-10-13 08:36:59,406 EPOCH 8 done: loss 0.0297 - lr: 0.000007
2023-10-13 08:37:00,121 DEV : loss 0.16467083990573883 - f1-score (micro avg) 0.8701
2023-10-13 08:37:00,127 saving best model
2023-10-13 08:37:00,649 ----------------------------------------------------------------------------------------------------
2023-10-13 08:37:01,403 epoch 9 - iter 13/138 - loss 0.03326079 - time (sec): 0.75 - samples/sec: 2906.43 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:37:02,188 epoch 9 - iter 26/138 - loss 0.01810971 - time (sec): 1.54 - samples/sec: 2752.67 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:37:02,882 epoch 9 - iter 39/138 - loss 0.02285897 - time (sec): 2.23 - samples/sec: 2823.58 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:37:03,592 epoch 9 - iter 52/138 - loss 0.02092099 - time (sec): 2.94 - samples/sec: 2771.75 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:37:04,367 epoch 9 - iter 65/138 - loss 0.02750583 - time (sec): 3.72 - samples/sec: 2823.57 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:37:05,088 epoch 9 - iter 78/138 - loss 0.02769534 - time (sec): 4.44 - samples/sec: 2939.10 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:37:05,823 epoch 9 - iter 91/138 - loss 0.02598387 - time (sec): 5.17 - samples/sec: 2936.80 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:37:06,535 epoch 9 - iter 104/138 - loss 0.02725468 - time (sec): 5.88 - samples/sec: 2960.13 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:37:07,268 epoch 9 - iter 117/138 - loss 0.02514963 - time (sec): 6.62 - samples/sec: 2939.20 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:37:07,996 epoch 9 - iter 130/138 - loss 0.02539927 - time (sec): 7.35 - samples/sec: 2933.72 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:37:08,449 ----------------------------------------------------------------------------------------------------
2023-10-13 08:37:08,449 EPOCH 9 done: loss 0.0270 - lr: 0.000004
2023-10-13 08:37:09,124 DEV : loss 0.16532635688781738 - f1-score (micro avg) 0.8722
2023-10-13 08:37:09,130 saving best model
2023-10-13 08:37:09,633 ----------------------------------------------------------------------------------------------------
2023-10-13 08:37:10,349 epoch 10 - iter 13/138 - loss 0.04651164 - time (sec): 0.71 - samples/sec: 3011.35 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:37:11,101 epoch 10 - iter 26/138 - loss 0.05835617 - time (sec): 1.47 - samples/sec: 3038.51 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:37:11,871 epoch 10 - iter 39/138 - loss 0.03954145 - time (sec): 2.24 - samples/sec: 2999.87 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:37:12,565 epoch 10 - iter 52/138 - loss 0.03736192 - time (sec): 2.93 - samples/sec: 2985.06 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:37:13,259 epoch 10 - iter 65/138 - loss 0.03082446 - time (sec): 3.62 - samples/sec: 2991.62 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:37:14,001 epoch 10 - iter 78/138 - loss 0.03061056 - time (sec): 4.37 - samples/sec: 3023.61 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:37:14,703 epoch 10 - iter 91/138 - loss 0.02816398 - time (sec): 5.07 - samples/sec: 3029.94 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:37:15,417 epoch 10 - iter 104/138 - loss 0.02637905 - time (sec): 5.78 - samples/sec: 3000.59 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:37:16,166 epoch 10 - iter 117/138 - loss 0.02611645 - time (sec): 6.53 - samples/sec: 2974.20 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:37:16,868 epoch 10 - iter 130/138 - loss 0.02470807 - time (sec): 7.23 - samples/sec: 2956.73 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:37:17,323 ----------------------------------------------------------------------------------------------------
2023-10-13 08:37:17,324 EPOCH 10 done: loss 0.0236 - lr: 0.000000
2023-10-13 08:37:18,006 DEV : loss 0.16804882884025574 - f1-score (micro avg) 0.8656
2023-10-13 08:37:18,460 ----------------------------------------------------------------------------------------------------
2023-10-13 08:37:18,461 Loading model from best epoch ...
2023-10-13 08:37:20,141 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 08:37:21,099
Results:
- F-score (micro) 0.915
- F-score (macro) 0.7102
- Accuracy 0.8516
By class:
              precision    recall  f1-score   support

       scope     0.8883    0.9034    0.8958       176
        pers     0.9612    0.9688    0.9650       128
        work     0.9028    0.8784    0.8904        74
         loc     0.6667    1.0000    0.8000         2
      object     0.0000    0.0000    0.0000         2

   micro avg     0.9138    0.9162    0.9150       382
   macro avg     0.6838    0.7501    0.7102       382
weighted avg     0.9097    0.9162    0.9127       382
2023-10-13 08:37:21,099 ----------------------------------------------------------------------------------------------------