2023-10-13 08:47:56,478 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:56,479 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 08:47:56,479 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:56,479 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-13 08:47:56,479 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:56,479 Train: 1100 sentences
2023-10-13 08:47:56,479 (train_with_dev=False, train_with_test=False)
2023-10-13 08:47:56,479 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:56,479 Training Params:
2023-10-13 08:47:56,479 - learning_rate: "5e-05"
2023-10-13 08:47:56,479 - mini_batch_size: "8"
2023-10-13 08:47:56,479 - max_epochs: "10"
2023-10-13 08:47:56,479 - shuffle: "True"
2023-10-13 08:47:56,479 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:56,479 Plugins:
2023-10-13 08:47:56,480 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:47:56,480 ----------------------------------------------------------------------------------------------------
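Note on the schedule: the parameters above (learning_rate 5e-05, mini_batch_size 8, max_epochs 10, 1100 training sentences, warmup_fraction 0.1) imply a learning rate that rises linearly during the first tenth of training and then decays linearly to zero. The sketch below is a minimal illustration of that shape, not the scheduler code used for this run. The function name `linear_warmup_lr`, the 0-based step convention, and the rounding of the warmup length are all assumptions made for the example.

```python
import math


def linear_warmup_lr(step, peak_lr, total_steps, warmup_fraction):
    """Hypothetical helper: learning rate at a 0-based optimizer step,
    with linear warmup to peak_lr followed by linear decay to zero."""
    # Assumption: the warmup length is the rounded fraction of all steps.
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from the log: 1100 sentences, batch size 8, 10 epochs.
steps_per_epoch = math.ceil(1100 / 8)   # matches the "iter .../138" counter
total_steps = steps_per_epoch * 10

for step in (0, 69, 138, 759, 1380):
    lr = linear_warmup_lr(step, 5e-05, total_steps, 0.1)
    print(f"step {step:4d}: lr {lr:.6f}")
```

Under these assumptions the warmup spans exactly one epoch (138 steps), which is consistent with the lr column climbing throughout epoch 1 and falling from epoch 2 onward.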
2023-10-13 08:47:56,480 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:47:56,480 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:47:56,480 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:56,480 Computation:
2023-10-13 08:47:56,480 - compute on device: cuda:0
2023-10-13 08:47:56,480 - embedding storage: none
2023-10-13 08:47:56,480 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:56,480 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 08:47:56,480 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:56,480 ----------------------------------------------------------------------------------------------------
2023-10-13 08:47:57,256 epoch 1 - iter 13/138 - loss 3.39217261 - time (sec): 0.77 - samples/sec: 2618.08 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:47:58,122 epoch 1 - iter 26/138 - loss 3.10450393 - time (sec): 1.64 - samples/sec: 2603.76 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:47:58,992 epoch 1 - iter 39/138 - loss 2.57254833 - time (sec): 2.51 - samples/sec: 2542.76 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:47:59,853 epoch 1 - iter 52/138 - loss 2.14277098 - time (sec): 3.37 - samples/sec: 2516.48 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:48:00,700 epoch 1 - iter 65/138 - loss 1.85893243 - time (sec): 4.22 - samples/sec: 2531.37 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:48:01,518 epoch 1 - iter 78/138 - loss 1.66007819 - time (sec): 5.04 - samples/sec: 2536.02 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:48:02,320 epoch 1 - iter 91/138 - loss 1.48357053 - time (sec): 5.84 - samples/sec: 2593.85 - lr: 0.000033 - momentum: 0.000000
2023-10-13 08:48:03,043 epoch 1 - iter 104/138 - loss 1.34118862 - time (sec): 6.56 - samples/sec: 2646.10 - lr: 0.000037 - momentum: 0.000000
2023-10-13 08:48:03,789 epoch 1 - iter 117/138 - loss 1.23303479 - time (sec): 7.31 - samples/sec: 2657.54 - lr: 0.000042 - momentum: 0.000000
2023-10-13 08:48:04,559 epoch 1 - iter 130/138 - loss 1.13842031 - time (sec): 8.08 - samples/sec: 2668.92 - lr: 0.000047 - momentum: 0.000000
2023-10-13 08:48:05,049 ----------------------------------------------------------------------------------------------------
2023-10-13 08:48:05,049 EPOCH 1 done: loss 1.0922 - lr: 0.000047
2023-10-13 08:48:05,630 DEV : loss 0.25364017486572266 - f1-score (micro avg) 0.6181
2023-10-13 08:48:05,634 saving best model
2023-10-13 08:48:05,975 ----------------------------------------------------------------------------------------------------
2023-10-13 08:48:06,706 epoch 2 - iter 13/138 - loss 0.27716583 - time (sec): 0.73 - samples/sec: 2731.77 - lr: 0.000050 - momentum: 0.000000
2023-10-13 08:48:07,482 epoch 2 - iter 26/138 - loss 0.23002082 - time (sec): 1.51 - samples/sec: 2790.35 - lr: 0.000049 - momentum: 0.000000
2023-10-13 08:48:08,271 epoch 2 - iter 39/138 - loss 0.23319228 - time (sec): 2.29 - samples/sec: 2741.18 - lr: 0.000048 - momentum: 0.000000
2023-10-13 08:48:09,049 epoch 2 - iter 52/138 - loss 0.21708136 - time (sec): 3.07 - samples/sec: 2767.78 - lr: 0.000048 - momentum: 0.000000
2023-10-13 08:48:09,820 epoch 2 - iter 65/138 - loss 0.20135419 - time (sec): 3.84 - samples/sec: 2783.69 - lr: 0.000047 - momentum: 0.000000
2023-10-13 08:48:10,603 epoch 2 - iter 78/138 - loss 0.19981267 - time (sec): 4.63 - samples/sec: 2802.71 - lr: 0.000047 - momentum: 0.000000
2023-10-13 08:48:11,406 epoch 2 - iter 91/138 - loss 0.19271180 - time (sec): 5.43 - samples/sec: 2787.22 - lr: 0.000046 - momentum: 0.000000
2023-10-13 08:48:12,204 epoch 2 - iter 104/138 - loss 0.18664404 - time (sec): 6.23 - samples/sec: 2790.69 - lr: 0.000046 - momentum: 0.000000
2023-10-13 08:48:12,961 epoch 2 - iter 117/138 - loss 0.18264422 - time (sec): 6.98 - samples/sec: 2798.68 - lr: 0.000045 - momentum: 0.000000
2023-10-13 08:48:13,722 epoch 2 - iter 130/138 - loss 0.18092299 - time (sec): 7.75 - samples/sec: 2783.27 - lr: 0.000045 - momentum: 0.000000
2023-10-13 08:48:14,208 ----------------------------------------------------------------------------------------------------
2023-10-13 08:48:14,209 EPOCH 2 done: loss 0.1818 - lr: 0.000045
2023-10-13 08:48:14,918 DEV : loss 0.1298540234565735 - f1-score (micro avg) 0.7991
2023-10-13 08:48:14,923 saving best model
2023-10-13 08:48:15,820 ----------------------------------------------------------------------------------------------------
2023-10-13 08:48:16,503 epoch 3 - iter 13/138 - loss 0.08939003 - time (sec): 0.68 - samples/sec: 3245.29 - lr: 0.000044 - momentum: 0.000000
2023-10-13 08:48:17,331 epoch 3 - iter 26/138 - loss 0.09765393 - time (sec): 1.51 - samples/sec: 2897.19 - lr: 0.000043 - momentum: 0.000000
2023-10-13 08:48:18,091 epoch 3 - iter 39/138 - loss 0.11474174 - time (sec): 2.27 - samples/sec: 2836.80 - lr: 0.000043 - momentum: 0.000000
2023-10-13 08:48:18,906 epoch 3 - iter 52/138 - loss 0.11031411 - time (sec): 3.08 - samples/sec: 2788.14 - lr: 0.000042 - momentum: 0.000000
2023-10-13 08:48:19,680 epoch 3 - iter 65/138 - loss 0.11495807 - time (sec): 3.86 - samples/sec: 2812.14 - lr: 0.000042 - momentum: 0.000000
2023-10-13 08:48:20,417 epoch 3 - iter 78/138 - loss 0.11184132 - time (sec): 4.59 - samples/sec: 2825.09 - lr: 0.000041 - momentum: 0.000000
2023-10-13 08:48:21,238 epoch 3 - iter 91/138 - loss 0.10692215 - time (sec): 5.41 - samples/sec: 2826.92 - lr: 0.000041 - momentum: 0.000000
2023-10-13 08:48:21,977 epoch 3 - iter 104/138 - loss 0.10349019 - time (sec): 6.15 - samples/sec: 2845.23 - lr: 0.000040 - momentum: 0.000000
2023-10-13 08:48:22,776 epoch 3 - iter 117/138 - loss 0.09974043 - time (sec): 6.95 - samples/sec: 2818.45 - lr: 0.000040 - momentum: 0.000000
2023-10-13 08:48:23,575 epoch 3 - iter 130/138 - loss 0.09742995 - time (sec): 7.75 - samples/sec: 2811.59 - lr: 0.000039 - momentum: 0.000000
2023-10-13 08:48:24,013 ----------------------------------------------------------------------------------------------------
2023-10-13 08:48:24,013 EPOCH 3 done: loss 0.1002 - lr: 0.000039
2023-10-13 08:48:24,749 DEV : loss 0.13911183178424835 - f1-score (micro avg) 0.8496
2023-10-13 08:48:24,753 saving best model
2023-10-13 08:48:25,296 ----------------------------------------------------------------------------------------------------
2023-10-13 08:48:26,100 epoch 4 - iter 13/138 - loss 0.05017842 - time (sec): 0.80 - samples/sec: 2674.87 - lr: 0.000038 - momentum: 0.000000
2023-10-13 08:48:26,857 epoch 4 - iter 26/138 - loss 0.05457347 - time (sec): 1.55 - samples/sec: 2670.81 - lr: 0.000038 - momentum: 0.000000
2023-10-13 08:48:27,616 epoch 4 - iter 39/138 - loss 0.06804756 - time (sec): 2.31 - samples/sec: 2680.11 - lr: 0.000037 - momentum: 0.000000
2023-10-13 08:48:28,427 epoch 4 - iter 52/138 - loss 0.07615447 - time (sec): 3.12 - samples/sec: 2611.28 - lr: 0.000037 - momentum: 0.000000
2023-10-13 08:48:29,219 epoch 4 - iter 65/138 - loss 0.07356145 - time (sec): 3.91 - samples/sec: 2600.45 - lr: 0.000036 - momentum: 0.000000
2023-10-13 08:48:30,051 epoch 4 - iter 78/138 - loss 0.06935192 - time (sec): 4.75 - samples/sec: 2637.15 - lr: 0.000036 - momentum: 0.000000
2023-10-13 08:48:30,814 epoch 4 - iter 91/138 - loss 0.06636175 - time (sec): 5.51 - samples/sec: 2603.79 - lr: 0.000035 - momentum: 0.000000
2023-10-13 08:48:31,564 epoch 4 - iter 104/138 - loss 0.06705173 - time (sec): 6.26 - samples/sec: 2639.39 - lr: 0.000035 - momentum: 0.000000
2023-10-13 08:48:32,386 epoch 4 - iter 117/138 - loss 0.07192855 - time (sec): 7.08 - samples/sec: 2668.06 - lr: 0.000034 - momentum: 0.000000
2023-10-13 08:48:33,254 epoch 4 - iter 130/138 - loss 0.07207122 - time (sec): 7.95 - samples/sec: 2699.87 - lr: 0.000034 - momentum: 0.000000
2023-10-13 08:48:33,726 ----------------------------------------------------------------------------------------------------
2023-10-13 08:48:33,727 EPOCH 4 done: loss 0.0708 - lr: 0.000034
2023-10-13 08:48:34,439 DEV : loss 0.14075107872486115 - f1-score (micro avg) 0.8548
2023-10-13 08:48:34,444 saving best model
2023-10-13 08:48:34,921 ----------------------------------------------------------------------------------------------------
2023-10-13 08:48:35,717 epoch 5 - iter 13/138 - loss 0.04871266 - time (sec): 0.79 - samples/sec: 2673.64 - lr: 0.000033 - momentum: 0.000000
2023-10-13 08:48:36,534 epoch 5 - iter 26/138 - loss 0.04761880 - time (sec): 1.61 - samples/sec: 2746.95 - lr: 0.000032 - momentum: 0.000000
2023-10-13 08:48:37,301 epoch 5 - iter 39/138 - loss 0.04327600 - time (sec): 2.38 - samples/sec: 2753.35 - lr: 0.000032 - momentum: 0.000000
2023-10-13 08:48:38,095 epoch 5 - iter 52/138 - loss 0.05450586 - time (sec): 3.17 - samples/sec: 2777.82 - lr: 0.000031 - momentum: 0.000000
2023-10-13 08:48:38,922 epoch 5 - iter 65/138 - loss 0.05206128 - time (sec): 4.00 - samples/sec: 2752.88 - lr: 0.000031 - momentum: 0.000000
2023-10-13 08:48:39,689 epoch 5 - iter 78/138 - loss 0.05203615 - time (sec): 4.77 - samples/sec: 2762.62 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:48:40,449 epoch 5 - iter 91/138 - loss 0.05144350 - time (sec): 5.53 - samples/sec: 2696.45 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:48:41,237 epoch 5 - iter 104/138 - loss 0.05701987 - time (sec): 6.31 - samples/sec: 2707.58 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:48:42,022 epoch 5 - iter 117/138 - loss 0.05817524 - time (sec): 7.10 - samples/sec: 2707.16 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:48:42,813 epoch 5 - iter 130/138 - loss 0.05551517 - time (sec): 7.89 - samples/sec: 2709.14 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:48:43,311 ----------------------------------------------------------------------------------------------------
2023-10-13 08:48:43,311 EPOCH 5 done: loss 0.0556 - lr: 0.000028
2023-10-13 08:48:44,060 DEV : loss 0.13628125190734863 - f1-score (micro avg) 0.8578
2023-10-13 08:48:44,064 saving best model
2023-10-13 08:48:44,538 ----------------------------------------------------------------------------------------------------
2023-10-13 08:48:45,301 epoch 6 - iter 13/138 - loss 0.00432082 - time (sec): 0.76 - samples/sec: 2737.39 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:48:46,117 epoch 6 - iter 26/138 - loss 0.02729222 - time (sec): 1.58 - samples/sec: 2793.19 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:48:46,876 epoch 6 - iter 39/138 - loss 0.04952061 - time (sec): 2.34 - samples/sec: 2856.51 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:48:47,608 epoch 6 - iter 52/138 - loss 0.04392451 - time (sec): 3.07 - samples/sec: 2794.95 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:48:48,423 epoch 6 - iter 65/138 - loss 0.04557450 - time (sec): 3.88 - samples/sec: 2771.20 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:48:49,210 epoch 6 - iter 78/138 - loss 0.03929253 - time (sec): 4.67 - samples/sec: 2772.47 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:48:50,036 epoch 6 - iter 91/138 - loss 0.03886788 - time (sec): 5.50 - samples/sec: 2769.13 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:48:50,766 epoch 6 - iter 104/138 - loss 0.03997157 - time (sec): 6.23 - samples/sec: 2768.67 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:48:51,570 epoch 6 - iter 117/138 - loss 0.04194946 - time (sec): 7.03 - samples/sec: 2787.98 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:48:52,351 epoch 6 - iter 130/138 - loss 0.04087328 - time (sec): 7.81 - samples/sec: 2781.66 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:48:52,811 ----------------------------------------------------------------------------------------------------
2023-10-13 08:48:52,811 EPOCH 6 done: loss 0.0397 - lr: 0.000023
2023-10-13 08:48:53,561 DEV : loss 0.16073298454284668 - f1-score (micro avg) 0.8507
2023-10-13 08:48:53,568 ----------------------------------------------------------------------------------------------------
2023-10-13 08:48:54,377 epoch 7 - iter 13/138 - loss 0.03098511 - time (sec): 0.81 - samples/sec: 2267.87 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:48:55,152 epoch 7 - iter 26/138 - loss 0.03759647 - time (sec): 1.58 - samples/sec: 2478.75 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:48:55,891 epoch 7 - iter 39/138 - loss 0.04766066 - time (sec): 2.32 - samples/sec: 2694.42 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:48:56,645 epoch 7 - iter 52/138 - loss 0.03967438 - time (sec): 3.08 - samples/sec: 2763.62 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:48:57,417 epoch 7 - iter 65/138 - loss 0.03864303 - time (sec): 3.85 - samples/sec: 2773.97 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:48:58,107 epoch 7 - iter 78/138 - loss 0.03911134 - time (sec): 4.54 - samples/sec: 2819.36 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:48:58,847 epoch 7 - iter 91/138 - loss 0.03671785 - time (sec): 5.28 - samples/sec: 2855.22 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:48:59,671 epoch 7 - iter 104/138 - loss 0.03518715 - time (sec): 6.10 - samples/sec: 2856.17 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:49:00,524 epoch 7 - iter 117/138 - loss 0.03295486 - time (sec): 6.95 - samples/sec: 2808.26 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:49:01,333 epoch 7 - iter 130/138 - loss 0.03142400 - time (sec): 7.76 - samples/sec: 2772.16 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:49:01,824 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:01,824 EPOCH 7 done: loss 0.0320 - lr: 0.000017
2023-10-13 08:49:02,571 DEV : loss 0.1471194475889206 - f1-score (micro avg) 0.8886
2023-10-13 08:49:02,576 saving best model
2023-10-13 08:49:03,034 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:03,846 epoch 8 - iter 13/138 - loss 0.01968322 - time (sec): 0.81 - samples/sec: 2511.90 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:49:04,571 epoch 8 - iter 26/138 - loss 0.02518349 - time (sec): 1.54 - samples/sec: 2664.52 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:49:05,369 epoch 8 - iter 39/138 - loss 0.01915542 - time (sec): 2.33 - samples/sec: 2622.33 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:49:06,177 epoch 8 - iter 52/138 - loss 0.02553957 - time (sec): 3.14 - samples/sec: 2686.75 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:49:06,932 epoch 8 - iter 65/138 - loss 0.02721683 - time (sec): 3.90 - samples/sec: 2690.85 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:49:07,697 epoch 8 - iter 78/138 - loss 0.02746147 - time (sec): 4.66 - samples/sec: 2688.01 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:49:08,449 epoch 8 - iter 91/138 - loss 0.02580390 - time (sec): 5.41 - samples/sec: 2727.11 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:49:09,258 epoch 8 - iter 104/138 - loss 0.02565199 - time (sec): 6.22 - samples/sec: 2745.29 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:49:10,054 epoch 8 - iter 117/138 - loss 0.02483025 - time (sec): 7.02 - samples/sec: 2750.63 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:49:10,830 epoch 8 - iter 130/138 - loss 0.02276873 - time (sec): 7.79 - samples/sec: 2749.37 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:49:11,328 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:11,328 EPOCH 8 done: loss 0.0247 - lr: 0.000012
2023-10-13 08:49:12,080 DEV : loss 0.1488998383283615 - f1-score (micro avg) 0.882
2023-10-13 08:49:12,087 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:12,856 epoch 9 - iter 13/138 - loss 0.00085242 - time (sec): 0.77 - samples/sec: 2531.84 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:49:13,592 epoch 9 - iter 26/138 - loss 0.03493382 - time (sec): 1.50 - samples/sec: 2636.59 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:49:14,377 epoch 9 - iter 39/138 - loss 0.02699129 - time (sec): 2.29 - samples/sec: 2688.01 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:49:15,192 epoch 9 - iter 52/138 - loss 0.02545629 - time (sec): 3.10 - samples/sec: 2723.42 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:49:15,942 epoch 9 - iter 65/138 - loss 0.02521135 - time (sec): 3.85 - samples/sec: 2766.88 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:49:16,729 epoch 9 - iter 78/138 - loss 0.02199381 - time (sec): 4.64 - samples/sec: 2774.82 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:49:17,496 epoch 9 - iter 91/138 - loss 0.01960602 - time (sec): 5.41 - samples/sec: 2744.53 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:49:18,298 epoch 9 - iter 104/138 - loss 0.01938647 - time (sec): 6.21 - samples/sec: 2761.89 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:49:19,096 epoch 9 - iter 117/138 - loss 0.01886298 - time (sec): 7.01 - samples/sec: 2773.61 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:49:19,890 epoch 9 - iter 130/138 - loss 0.01961368 - time (sec): 7.80 - samples/sec: 2767.52 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:49:20,382 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:20,382 EPOCH 9 done: loss 0.0200 - lr: 0.000006
2023-10-13 08:49:21,129 DEV : loss 0.1539003998041153 - f1-score (micro avg) 0.8934
2023-10-13 08:49:21,135 saving best model
2023-10-13 08:49:21,706 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:22,463 epoch 10 - iter 13/138 - loss 0.00219720 - time (sec): 0.75 - samples/sec: 2614.76 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:49:23,216 epoch 10 - iter 26/138 - loss 0.00675932 - time (sec): 1.51 - samples/sec: 2794.34 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:49:23,943 epoch 10 - iter 39/138 - loss 0.00619257 - time (sec): 2.23 - samples/sec: 2808.21 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:49:24,678 epoch 10 - iter 52/138 - loss 0.00854341 - time (sec): 2.97 - samples/sec: 2831.43 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:49:25,437 epoch 10 - iter 65/138 - loss 0.01380154 - time (sec): 3.73 - samples/sec: 2850.88 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:49:26,177 epoch 10 - iter 78/138 - loss 0.01603853 - time (sec): 4.47 - samples/sec: 2881.16 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:49:26,917 epoch 10 - iter 91/138 - loss 0.01467614 - time (sec): 5.21 - samples/sec: 2892.33 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:49:27,662 epoch 10 - iter 104/138 - loss 0.01796054 - time (sec): 5.95 - samples/sec: 2886.00 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:49:28,346 epoch 10 - iter 117/138 - loss 0.01712340 - time (sec): 6.64 - samples/sec: 2906.53 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:49:29,119 epoch 10 - iter 130/138 - loss 0.01583261 - time (sec): 7.41 - samples/sec: 2905.74 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:49:29,558 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:29,558 EPOCH 10 done: loss 0.0168 - lr: 0.000000
2023-10-13 08:49:30,282 DEV : loss 0.14939561486244202 - f1-score (micro avg) 0.8838
2023-10-13 08:49:30,710 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:30,711 Loading model from best epoch ...
2023-10-13 08:49:32,427 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 08:49:33,327
Results:
- F-score (micro) 0.9074
- F-score (macro) 0.7442
- Accuracy 0.8447
By class:
              precision    recall  f1-score   support

       scope     0.8895    0.9148    0.9020       176
        pers     0.9677    0.9375    0.9524       128
        work     0.8553    0.8784    0.8667        74
      object     0.5000    0.5000    0.5000         2
         loc     0.5000    0.5000    0.5000         2

   micro avg     0.9039    0.9110    0.9074       382
   macro avg     0.7425    0.7461    0.7442       382
weighted avg     0.9050    0.9110    0.9078       382
2023-10-13 08:49:33,327 ----------------------------------------------------------------------------------------------------
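Note on the averages: the micro and macro F-scores reported above can be reproduced from the per-class table. The sketch below is a sanity check on the rounded values printed in the log, not the evaluation code that produced them. The variable names are chosen for illustration only.

```python
# Per-class f1-scores copied from the "By class" table in the log.
per_class_f1 = {
    "scope": 0.9020,
    "pers": 0.9524,
    "work": 0.8667,
    "object": 0.5000,
    "loc": 0.5000,
}

# Macro F1: unweighted mean over classes, so the two classes with
# support 2 count as much as the class with support 176.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_precision, micro_recall = 0.9039, 0.9110
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"macro F1: {macro_f1:.4f}")
print(f"micro F1: {micro_f1:.4f}")
```

The gap between the two scores comes from the `object` and `loc` classes: with only two test mentions each, they barely affect the micro average but pull the macro average down.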