2023-10-25 11:32:17,897 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,898 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 11:32:17,898 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Train:  6183 sentences
2023-10-25 11:32:17,899         (train_with_dev=False, train_with_test=False)
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Training Params:
2023-10-25 11:32:17,899  - learning_rate: "5e-05"
2023-10-25 11:32:17,899  - mini_batch_size: "8"
2023-10-25 11:32:17,899  - max_epochs: "10"
2023-10-25 11:32:17,899  - shuffle: "True"
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Plugins:
2023-10-25 11:32:17,899  - TensorboardLogger
2023-10-25 11:32:17,899  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
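The learning rates logged below are consistent with the LinearScheduler settings above: linear warmup over the first 10% of steps (773 of the 7730 total batches, i.e. roughly epoch 1), then linear decay to zero. A minimal sketch of that schedule (the function and step arithmetic are my own illustration, not Flair's internals):

```python
def linear_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# 773 batches/epoch x 10 epochs, as in this log
total = 773 * 10
linear_lr(77, total, 5e-5)    # ~0.000005, matching the first epoch-1 line below
linear_lr(770, total, 5e-5)   # ~0.000050, near peak at the end of epoch 1
linear_lr(total, total, 5e-5) # 0.0 at the final step
```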
2023-10-25 11:32:17,899 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 11:32:17,899  - metric: "('micro avg', 'f1-score')"
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Computation:
2023-10-25 11:32:17,899  - compute on device: cuda:0
2023-10-25 11:32:17,899  - embedding storage: none
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,899 ----------------------------------------------------------------------------------------------------
2023-10-25 11:32:17,900 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 11:32:22,467 epoch 1 - iter 77/773 - loss 1.55928995 - time (sec): 4.57 - samples/sec: 3117.25 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:32:26,899 epoch 1 - iter 154/773 - loss 0.94444348 - time (sec): 9.00 - samples/sec: 2895.74 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:32:31,544 epoch 1 - iter 231/773 - loss 0.68472071 - time (sec): 13.64 - samples/sec: 2835.90 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:32:36,101 epoch 1 - iter 308/773 - loss 0.54838258 - time (sec): 18.20 - samples/sec: 2792.56 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:32:40,729 epoch 1 - iter 385/773 - loss 0.46334242 - time (sec): 22.83 - samples/sec: 2770.44 - lr: 0.000025 - momentum: 0.000000
2023-10-25 11:32:45,076 epoch 1 - iter 462/773 - loss 0.41308200 - time (sec): 27.18 - samples/sec: 2740.33 - lr: 0.000030 - momentum: 0.000000
2023-10-25 11:32:49,556 epoch 1 - iter 539/773 - loss 0.37076947 - time (sec): 31.66 - samples/sec: 2735.35 - lr: 0.000035 - momentum: 0.000000
2023-10-25 11:32:53,999 epoch 1 - iter 616/773 - loss 0.33516167 - time (sec): 36.10 - samples/sec: 2746.05 - lr: 0.000040 - momentum: 0.000000
2023-10-25 11:32:58,447 epoch 1 - iter 693/773 - loss 0.30900288 - time (sec): 40.55 - samples/sec: 2748.87 - lr: 0.000045 - momentum: 0.000000
2023-10-25 11:33:03,589 epoch 1 - iter 770/773 - loss 0.28804151 - time (sec): 45.69 - samples/sec: 2708.60 - lr: 0.000050 - momentum: 0.000000
2023-10-25 11:33:03,765 ----------------------------------------------------------------------------------------------------
2023-10-25 11:33:03,766 EPOCH 1 done: loss 0.2873 - lr: 0.000050
2023-10-25 11:33:06,404 DEV : loss 0.06803968548774719 - f1-score (micro avg)  0.7209
2023-10-25 11:33:06,427 saving best model
2023-10-25 11:33:06,908 ----------------------------------------------------------------------------------------------------
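The per-iteration lines above follow a fixed format, which makes them easy to mine for loss/learning-rate curves. A small parser sketch (the regex and field names are my own, built from the line format in this log):

```python
import re

# Pattern mirrors the iteration lines in this log; field names are illustrative.
ITER_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+) - "
    r"loss (?P<loss>[\d.]+) - time \(sec\): (?P<time>[\d.]+) - "
    r"samples/sec: (?P<sps>[\d.]+) - lr: (?P<lr>[\d.]+)"
)

def parse_iter(line):
    """Return the numeric fields of one iteration line, or None if no match."""
    m = ITER_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "epoch": int(d["epoch"]),
        "iter": int(d["iter"]),
        "total": int(d["total"]),
        "loss": float(d["loss"]),
        "lr": float(d["lr"]),
    }

line = ("2023-10-25 11:32:22,467 epoch 1 - iter 77/773 - loss 1.55928995 "
        "- time (sec): 4.57 - samples/sec: 3117.25 - lr: 0.000005 - momentum: 0.000000")
rec = parse_iter(line)
# rec["epoch"] == 1, rec["iter"] == 77, rec["loss"] == 1.55928995
```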
2023-10-25 11:33:11,327 epoch 2 - iter 77/773 - loss 0.07623849 - time (sec): 4.42 - samples/sec: 2667.47 - lr: 0.000049 - momentum: 0.000000
2023-10-25 11:33:15,659 epoch 2 - iter 154/773 - loss 0.07951236 - time (sec): 8.75 - samples/sec: 2814.00 - lr: 0.000049 - momentum: 0.000000
2023-10-25 11:33:19,994 epoch 2 - iter 231/773 - loss 0.08070316 - time (sec): 13.08 - samples/sec: 2905.85 - lr: 0.000048 - momentum: 0.000000
2023-10-25 11:33:24,314 epoch 2 - iter 308/773 - loss 0.07668770 - time (sec): 17.40 - samples/sec: 2938.42 - lr: 0.000048 - momentum: 0.000000
2023-10-25 11:33:28,562 epoch 2 - iter 385/773 - loss 0.07601583 - time (sec): 21.65 - samples/sec: 2931.54 - lr: 0.000047 - momentum: 0.000000
2023-10-25 11:33:32,947 epoch 2 - iter 462/773 - loss 0.07553467 - time (sec): 26.04 - samples/sec: 2877.34 - lr: 0.000047 - momentum: 0.000000
2023-10-25 11:33:37,394 epoch 2 - iter 539/773 - loss 0.07726320 - time (sec): 30.48 - samples/sec: 2852.33 - lr: 0.000046 - momentum: 0.000000
2023-10-25 11:33:41,685 epoch 2 - iter 616/773 - loss 0.07687578 - time (sec): 34.77 - samples/sec: 2862.30 - lr: 0.000046 - momentum: 0.000000
2023-10-25 11:33:45,897 epoch 2 - iter 693/773 - loss 0.07690697 - time (sec): 38.99 - samples/sec: 2853.35 - lr: 0.000045 - momentum: 0.000000
2023-10-25 11:33:50,273 epoch 2 - iter 770/773 - loss 0.07652112 - time (sec): 43.36 - samples/sec: 2853.67 - lr: 0.000044 - momentum: 0.000000
2023-10-25 11:33:50,458 ----------------------------------------------------------------------------------------------------
2023-10-25 11:33:50,458 EPOCH 2 done: loss 0.0764 - lr: 0.000044
2023-10-25 11:33:54,305 DEV : loss 0.05765606090426445 - f1-score (micro avg)  0.6577
2023-10-25 11:33:54,328 ----------------------------------------------------------------------------------------------------
2023-10-25 11:33:58,799 epoch 3 - iter 77/773 - loss 0.04624094 - time (sec): 4.47 - samples/sec: 2773.78 - lr: 0.000044 - momentum: 0.000000
2023-10-25 11:34:03,308 epoch 3 - iter 154/773 - loss 0.04639348 - time (sec): 8.98 - samples/sec: 2763.86 - lr: 0.000043 - momentum: 0.000000
2023-10-25 11:34:07,837 epoch 3 - iter 231/773 - loss 0.04638161 - time (sec): 13.51 - samples/sec: 2848.46 - lr: 0.000043 - momentum: 0.000000
2023-10-25 11:34:12,399 epoch 3 - iter 308/773 - loss 0.04881426 - time (sec): 18.07 - samples/sec: 2807.68 - lr: 0.000042 - momentum: 0.000000
2023-10-25 11:34:17,014 epoch 3 - iter 385/773 - loss 0.05206470 - time (sec): 22.68 - samples/sec: 2748.41 - lr: 0.000042 - momentum: 0.000000
2023-10-25 11:34:21,488 epoch 3 - iter 462/773 - loss 0.05275003 - time (sec): 27.16 - samples/sec: 2704.18 - lr: 0.000041 - momentum: 0.000000
2023-10-25 11:34:25,906 epoch 3 - iter 539/773 - loss 0.05152024 - time (sec): 31.58 - samples/sec: 2736.60 - lr: 0.000041 - momentum: 0.000000
2023-10-25 11:34:30,464 epoch 3 - iter 616/773 - loss 0.05087172 - time (sec): 36.13 - samples/sec: 2732.63 - lr: 0.000040 - momentum: 0.000000
2023-10-25 11:34:35,078 epoch 3 - iter 693/773 - loss 0.05319417 - time (sec): 40.75 - samples/sec: 2731.67 - lr: 0.000039 - momentum: 0.000000
2023-10-25 11:34:39,720 epoch 3 - iter 770/773 - loss 0.05805865 - time (sec): 45.39 - samples/sec: 2731.10 - lr: 0.000039 - momentum: 0.000000
2023-10-25 11:34:39,889 ----------------------------------------------------------------------------------------------------
2023-10-25 11:34:39,889 EPOCH 3 done: loss 0.0581 - lr: 0.000039
2023-10-25 11:34:42,593 DEV : loss 0.09656655043363571 - f1-score (micro avg)  0.7287
2023-10-25 11:34:42,615 saving best model
2023-10-25 11:34:43,315 ----------------------------------------------------------------------------------------------------
2023-10-25 11:34:47,956 epoch 4 - iter 77/773 - loss 0.04468020 - time (sec): 4.64 - samples/sec: 2770.59 - lr: 0.000038 - momentum: 0.000000
2023-10-25 11:34:52,542 epoch 4 - iter 154/773 - loss 0.05357280 - time (sec): 9.22 - samples/sec: 2790.55 - lr: 0.000038 - momentum: 0.000000
2023-10-25 11:34:56,989 epoch 4 - iter 231/773 - loss 0.05948461 - time (sec): 13.67 - samples/sec: 2812.21 - lr: 0.000037 - momentum: 0.000000
2023-10-25 11:35:01,205 epoch 4 - iter 308/773 - loss 0.06530804 - time (sec): 17.89 - samples/sec: 2838.76 - lr: 0.000037 - momentum: 0.000000
2023-10-25 11:35:05,460 epoch 4 - iter 385/773 - loss 0.06132855 - time (sec): 22.14 - samples/sec: 2860.87 - lr: 0.000036 - momentum: 0.000000
2023-10-25 11:35:09,680 epoch 4 - iter 462/773 - loss 0.05751702 - time (sec): 26.36 - samples/sec: 2876.95 - lr: 0.000036 - momentum: 0.000000
2023-10-25 11:35:13,858 epoch 4 - iter 539/773 - loss 0.06019540 - time (sec): 30.54 - samples/sec: 2865.69 - lr: 0.000035 - momentum: 0.000000
2023-10-25 11:35:17,970 epoch 4 - iter 616/773 - loss 0.06214758 - time (sec): 34.65 - samples/sec: 2860.96 - lr: 0.000034 - momentum: 0.000000
2023-10-25 11:35:22,136 epoch 4 - iter 693/773 - loss 0.06374761 - time (sec): 38.82 - samples/sec: 2850.03 - lr: 0.000034 - momentum: 0.000000
2023-10-25 11:35:26,302 epoch 4 - iter 770/773 - loss 0.06105828 - time (sec): 42.98 - samples/sec: 2878.38 - lr: 0.000033 - momentum: 0.000000
2023-10-25 11:35:26,475 ----------------------------------------------------------------------------------------------------
2023-10-25 11:35:26,476 EPOCH 4 done: loss 0.0609 - lr: 0.000033
2023-10-25 11:35:29,209 DEV : loss 0.10266965627670288 - f1-score (micro avg)  0.7418
2023-10-25 11:35:29,226 saving best model
2023-10-25 11:35:29,827 ----------------------------------------------------------------------------------------------------
2023-10-25 11:35:34,167 epoch 5 - iter 77/773 - loss 0.07354317 - time (sec): 4.34 - samples/sec: 2703.66 - lr: 0.000033 - momentum: 0.000000
2023-10-25 11:35:38,533 epoch 5 - iter 154/773 - loss 0.05656659 - time (sec): 8.70 - samples/sec: 2754.18 - lr: 0.000032 - momentum: 0.000000
2023-10-25 11:35:43,064 epoch 5 - iter 231/773 - loss 0.06643580 - time (sec): 13.24 - samples/sec: 2772.49 - lr: 0.000032 - momentum: 0.000000
2023-10-25 11:35:47,551 epoch 5 - iter 308/773 - loss 0.06782891 - time (sec): 17.72 - samples/sec: 2771.55 - lr: 0.000031 - momentum: 0.000000
2023-10-25 11:35:52,054 epoch 5 - iter 385/773 - loss 0.06184014 - time (sec): 22.23 - samples/sec: 2727.18 - lr: 0.000031 - momentum: 0.000000
2023-10-25 11:35:56,527 epoch 5 - iter 462/773 - loss 0.05866160 - time (sec): 26.70 - samples/sec: 2749.72 - lr: 0.000030 - momentum: 0.000000
2023-10-25 11:36:00,938 epoch 5 - iter 539/773 - loss 0.06250747 - time (sec): 31.11 - samples/sec: 2755.48 - lr: 0.000029 - momentum: 0.000000
2023-10-25 11:36:05,448 epoch 5 - iter 616/773 - loss 0.06026997 - time (sec): 35.62 - samples/sec: 2756.11 - lr: 0.000029 - momentum: 0.000000
2023-10-25 11:36:09,984 epoch 5 - iter 693/773 - loss 0.05814121 - time (sec): 40.16 - samples/sec: 2749.97 - lr: 0.000028 - momentum: 0.000000
2023-10-25 11:36:14,741 epoch 5 - iter 770/773 - loss 0.05494553 - time (sec): 44.91 - samples/sec: 2759.99 - lr: 0.000028 - momentum: 0.000000
2023-10-25 11:36:14,907 ----------------------------------------------------------------------------------------------------
2023-10-25 11:36:14,907 EPOCH 5 done: loss 0.0549 - lr: 0.000028
2023-10-25 11:36:17,604 DEV : loss 0.1113104522228241 - f1-score (micro avg)  0.7619
2023-10-25 11:36:17,624 saving best model
2023-10-25 11:36:18,324 ----------------------------------------------------------------------------------------------------
2023-10-25 11:36:22,987 epoch 6 - iter 77/773 - loss 0.04865046 - time (sec): 4.66 - samples/sec: 2616.38 - lr: 0.000027 - momentum: 0.000000
2023-10-25 11:36:27,557 epoch 6 - iter 154/773 - loss 0.03475110 - time (sec): 9.23 - samples/sec: 2676.91 - lr: 0.000027 - momentum: 0.000000
2023-10-25 11:36:32,097 epoch 6 - iter 231/773 - loss 0.02956246 - time (sec): 13.77 - samples/sec: 2710.98 - lr: 0.000026 - momentum: 0.000000
2023-10-25 11:36:36,609 epoch 6 - iter 308/773 - loss 0.03026117 - time (sec): 18.28 - samples/sec: 2695.78 - lr: 0.000026 - momentum: 0.000000
2023-10-25 11:36:41,206 epoch 6 - iter 385/773 - loss 0.03316285 - time (sec): 22.88 - samples/sec: 2765.88 - lr: 0.000025 - momentum: 0.000000
2023-10-25 11:36:45,639 epoch 6 - iter 462/773 - loss 0.03323264 - time (sec): 27.31 - samples/sec: 2774.22 - lr: 0.000024 - momentum: 0.000000
2023-10-25 11:36:50,098 epoch 6 - iter 539/773 - loss 0.03456321 - time (sec): 31.77 - samples/sec: 2750.32 - lr: 0.000024 - momentum: 0.000000
2023-10-25 11:36:54,524 epoch 6 - iter 616/773 - loss 0.03464651 - time (sec): 36.20 - samples/sec: 2751.08 - lr: 0.000023 - momentum: 0.000000
2023-10-25 11:36:58,914 epoch 6 - iter 693/773 - loss 0.03425099 - time (sec): 40.59 - samples/sec: 2748.59 - lr: 0.000023 - momentum: 0.000000
2023-10-25 11:37:03,254 epoch 6 - iter 770/773 - loss 0.03297911 - time (sec): 44.93 - samples/sec: 2756.29 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:37:03,415 ----------------------------------------------------------------------------------------------------
2023-10-25 11:37:03,416 EPOCH 6 done: loss 0.0329 - lr: 0.000022
2023-10-25 11:37:06,006 DEV : loss 0.09328915923833847 - f1-score (micro avg)  0.7741
2023-10-25 11:37:06,024 saving best model
2023-10-25 11:37:06,667 ----------------------------------------------------------------------------------------------------
2023-10-25 11:37:11,201 epoch 7 - iter 77/773 - loss 0.02826965 - time (sec): 4.53 - samples/sec: 2689.74 - lr: 0.000022 - momentum: 0.000000
2023-10-25 11:37:16,301 epoch 7 - iter 154/773 - loss 0.02916549 - time (sec): 9.63 - samples/sec: 2579.06 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:37:20,597 epoch 7 - iter 231/773 - loss 0.02694630 - time (sec): 13.93 - samples/sec: 2688.77 - lr: 0.000021 - momentum: 0.000000
2023-10-25 11:37:25,060 epoch 7 - iter 308/773 - loss 0.02382137 - time (sec): 18.39 - samples/sec: 2700.08 - lr: 0.000020 - momentum: 0.000000
2023-10-25 11:37:29,623 epoch 7 - iter 385/773 - loss 0.02400061 - time (sec): 22.95 - samples/sec: 2674.46 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:37:34,323 epoch 7 - iter 462/773 - loss 0.02420007 - time (sec): 27.65 - samples/sec: 2692.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 11:37:39,239 epoch 7 - iter 539/773 - loss 0.02350486 - time (sec): 32.57 - samples/sec: 2653.80 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:37:44,444 epoch 7 - iter 616/773 - loss 0.02263543 - time (sec): 37.77 - samples/sec: 2600.25 - lr: 0.000018 - momentum: 0.000000
2023-10-25 11:37:48,997 epoch 7 - iter 693/773 - loss 0.02248745 - time (sec): 42.33 - samples/sec: 2624.42 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:37:53,293 epoch 7 - iter 770/773 - loss 0.02276205 - time (sec): 46.62 - samples/sec: 2656.63 - lr: 0.000017 - momentum: 0.000000
2023-10-25 11:37:53,458 ----------------------------------------------------------------------------------------------------
2023-10-25 11:37:53,458 EPOCH 7 done: loss 0.0227 - lr: 0.000017
2023-10-25 11:37:56,498 DEV : loss 0.09706912189722061 - f1-score (micro avg)  0.786
2023-10-25 11:37:56,520 saving best model
2023-10-25 11:37:57,194 ----------------------------------------------------------------------------------------------------
2023-10-25 11:38:01,852 epoch 8 - iter 77/773 - loss 0.00963689 - time (sec): 4.65 - samples/sec: 2569.03 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:38:06,649 epoch 8 - iter 154/773 - loss 0.01048478 - time (sec): 9.45 - samples/sec: 2646.74 - lr: 0.000016 - momentum: 0.000000
2023-10-25 11:38:11,176 epoch 8 - iter 231/773 - loss 0.01023261 - time (sec): 13.98 - samples/sec: 2696.04 - lr: 0.000015 - momentum: 0.000000
2023-10-25 11:38:15,527 epoch 8 - iter 308/773 - loss 0.01078646 - time (sec): 18.33 - samples/sec: 2741.57 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:38:19,868 epoch 8 - iter 385/773 - loss 0.01293255 - time (sec): 22.67 - samples/sec: 2762.41 - lr: 0.000014 - momentum: 0.000000
2023-10-25 11:38:24,519 epoch 8 - iter 462/773 - loss 0.01414220 - time (sec): 27.32 - samples/sec: 2713.20 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:38:29,076 epoch 8 - iter 539/773 - loss 0.01443726 - time (sec): 31.88 - samples/sec: 2720.03 - lr: 0.000013 - momentum: 0.000000
2023-10-25 11:38:33,590 epoch 8 - iter 616/773 - loss 0.01493887 - time (sec): 36.39 - samples/sec: 2719.16 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:38:38,269 epoch 8 - iter 693/773 - loss 0.01475114 - time (sec): 41.07 - samples/sec: 2706.28 - lr: 0.000012 - momentum: 0.000000
2023-10-25 11:38:42,783 epoch 8 - iter 770/773 - loss 0.01430847 - time (sec): 45.58 - samples/sec: 2712.30 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:38:42,957 ----------------------------------------------------------------------------------------------------
2023-10-25 11:38:42,958 EPOCH 8 done: loss 0.0143 - lr: 0.000011
2023-10-25 11:38:45,598 DEV : loss 0.10801413655281067 - f1-score (micro avg)  0.7515
2023-10-25 11:38:45,615 ----------------------------------------------------------------------------------------------------
2023-10-25 11:38:50,161 epoch 9 - iter 77/773 - loss 0.00466196 - time (sec): 4.54 - samples/sec: 2698.86 - lr: 0.000011 - momentum: 0.000000
2023-10-25 11:38:54,650 epoch 9 - iter 154/773 - loss 0.00740543 - time (sec): 9.03 - samples/sec: 2702.12 - lr: 0.000010 - momentum: 0.000000
2023-10-25 11:38:59,263 epoch 9 - iter 231/773 - loss 0.00858285 - time (sec): 13.65 - samples/sec: 2678.11 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:39:03,513 epoch 9 - iter 308/773 - loss 0.00944576 - time (sec): 17.90 - samples/sec: 2731.89 - lr: 0.000009 - momentum: 0.000000
2023-10-25 11:39:08,048 epoch 9 - iter 385/773 - loss 0.00881598 - time (sec): 22.43 - samples/sec: 2778.28 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:39:12,207 epoch 9 - iter 462/773 - loss 0.00893348 - time (sec): 26.59 - samples/sec: 2804.33 - lr: 0.000008 - momentum: 0.000000
2023-10-25 11:39:16,419 epoch 9 - iter 539/773 - loss 0.00944343 - time (sec): 30.80 - samples/sec: 2818.54 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:39:20,930 epoch 9 - iter 616/773 - loss 0.00960352 - time (sec): 35.31 - samples/sec: 2824.99 - lr: 0.000007 - momentum: 0.000000
2023-10-25 11:39:25,174 epoch 9 - iter 693/773 - loss 0.00940583 - time (sec): 39.56 - samples/sec: 2841.88 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:39:29,510 epoch 9 - iter 770/773 - loss 0.00955294 - time (sec): 43.89 - samples/sec: 2821.72 - lr: 0.000006 - momentum: 0.000000
2023-10-25 11:39:29,686 ----------------------------------------------------------------------------------------------------
2023-10-25 11:39:29,687 EPOCH 9 done: loss 0.0095 - lr: 0.000006
2023-10-25 11:39:32,704 DEV : loss 0.11337698251008987 - f1-score (micro avg)  0.7683
2023-10-25 11:39:32,724 ----------------------------------------------------------------------------------------------------
2023-10-25 11:39:37,694 epoch 10 - iter 77/773 - loss 0.00238185 - time (sec): 4.97 - samples/sec: 2756.02 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:39:42,114 epoch 10 - iter 154/773 - loss 0.00380767 - time (sec): 9.39 - samples/sec: 2716.28 - lr: 0.000005 - momentum: 0.000000
2023-10-25 11:39:46,400 epoch 10 - iter 231/773 - loss 0.00374973 - time (sec): 13.68 - samples/sec: 2800.44 - lr: 0.000004 - momentum: 0.000000
2023-10-25 11:39:50,769 epoch 10 - iter 308/773 - loss 0.00342233 - time (sec): 18.04 - samples/sec: 2807.72 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:39:54,989 epoch 10 - iter 385/773 - loss 0.00446767 - time (sec): 22.26 - samples/sec: 2856.40 - lr: 0.000003 - momentum: 0.000000
2023-10-25 11:39:59,798 epoch 10 - iter 462/773 - loss 0.00465055 - time (sec): 27.07 - samples/sec: 2805.78 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:40:03,966 epoch 10 - iter 539/773 - loss 0.00454917 - time (sec): 31.24 - samples/sec: 2810.46 - lr: 0.000002 - momentum: 0.000000
2023-10-25 11:40:08,229 epoch 10 - iter 616/773 - loss 0.00492510 - time (sec): 35.50 - samples/sec: 2808.44 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:40:12,467 epoch 10 - iter 693/773 - loss 0.00477115 - time (sec): 39.74 - samples/sec: 2816.91 - lr: 0.000001 - momentum: 0.000000
2023-10-25 11:40:16,634 epoch 10 - iter 770/773 - loss 0.00517975 - time (sec): 43.91 - samples/sec: 2821.68 - lr: 0.000000 - momentum: 0.000000
2023-10-25 11:40:16,791 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:16,791 EPOCH 10 done: loss 0.0052 - lr: 0.000000
2023-10-25 11:40:20,311 DEV : loss 0.12045777589082718 - f1-score (micro avg)  0.7648
2023-10-25 11:40:20,827 ----------------------------------------------------------------------------------------------------
2023-10-25 11:40:20,828 Loading model from best epoch ...
2023-10-25 11:40:22,671 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 11:40:33,409
Results:
- F-score (micro) 0.7908
- F-score (macro) 0.6692
- Accuracy 0.6765

By class:
              precision    recall  f1-score   support

         LOC     0.8141    0.8562    0.8346       946
    BUILDING     0.6667    0.5622    0.6100       185
      STREET     0.6170    0.5179    0.5631        56

   micro avg     0.7871    0.7944    0.7908      1187
   macro avg     0.6993    0.6454    0.6692      1187
weighted avg     0.7818    0.7944    0.7868      1187
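The macro and weighted averages in the table follow directly from the per-class f1-scores and supports; a quick sanity check, with the values copied from the evaluation table above:

```python
# (f1-score, support) per class, copied from the final evaluation table
scores = {"LOC": (0.8346, 946), "BUILDING": (0.6100, 185), "STREET": (0.5631, 56)}

# macro avg: unweighted mean over classes
macro_f1 = sum(f1 for f1, _ in scores.values()) / len(scores)

# weighted avg: mean weighted by class support
total = sum(n for _, n in scores.values())
weighted_f1 = sum(f1 * n for f1, n in scores.values()) / total

print(round(macro_f1, 4))     # → 0.6692, the reported macro avg
print(round(weighted_f1, 4))  # → 0.7868, the reported weighted avg
```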

2023-10-25 11:40:33,410 ----------------------------------------------------------------------------------------------------