|
2023-10-13 13:45:29,155 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:45:29,155 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-13 13:45:29,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:45:29,156 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-13 13:45:29,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:45:29,156 Train: 3575 sentences |
|
2023-10-13 13:45:29,156 (train_with_dev=False, train_with_test=False) |
|
2023-10-13 13:45:29,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:45:29,156 Training Params: |
|
2023-10-13 13:45:29,156 - learning_rate: "3e-05" |
|
2023-10-13 13:45:29,156 - mini_batch_size: "8" |
|
2023-10-13 13:45:29,156 - max_epochs: "10" |
|
2023-10-13 13:45:29,156 - shuffle: "True" |
|
2023-10-13 13:45:29,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:45:29,156 Plugins: |
|
2023-10-13 13:45:29,156 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-13 13:45:29,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:45:29,156 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-13 13:45:29,156 - metric: "('micro avg', 'f1-score')" |
|
2023-10-13 13:45:29,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:45:29,156 Computation: |
|
2023-10-13 13:45:29,156 - compute on device: cuda:0 |
|
2023-10-13 13:45:29,156 - embedding storage: none |
|
2023-10-13 13:45:29,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:45:29,156 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-13 13:45:29,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:45:29,156 ---------------------------------------------------------------------------------------------------- |
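The parameters logged above (3575 training sentences, mini_batch_size 8, max_epochs 10, learning_rate 3e-05, LinearScheduler with warmup_fraction 0.1) imply the step counts and the general shape of the `lr` column that follows. The sketch below assumes a plain linear warmup followed by linear decay to zero; the function name `linear_schedule_lr` is hypothetical and this is not Flair's own implementation, so the logged values may differ slightly because of rounding or a one-step offset.

```python
import math

def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Linear warmup to peak_lr, then linear decay to zero (assumed shape)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale up proportionally to the step index.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale down proportionally to the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

# Values taken from the log header.
steps_per_epoch = math.ceil(3575 / 8)   # matches the "iter .../447" counter
total_steps = steps_per_epoch * 10      # 10 epochs

for step in (44, 447, 2235, 4470):
    lr = linear_schedule_lr(step, total_steps, 3e-05, 0.1)
    print(f"step {step:4d}: lr {lr:.6f}")
```

Under this assumption the warmup spans exactly the first epoch, which is consistent with the `lr` column rising during epoch 1 and falling from epoch 2 onward.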
|
2023-10-13 13:45:32,664 epoch 1 - iter 44/447 - loss 2.93081395 - time (sec): 3.51 - samples/sec: 2794.79 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 13:45:35,519 epoch 1 - iter 88/447 - loss 2.21221631 - time (sec): 6.36 - samples/sec: 2952.52 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 13:45:38,330 epoch 1 - iter 132/447 - loss 1.70150545 - time (sec): 9.17 - samples/sec: 2955.93 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 13:45:41,390 epoch 1 - iter 176/447 - loss 1.40314630 - time (sec): 12.23 - samples/sec: 2923.74 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 13:45:44,060 epoch 1 - iter 220/447 - loss 1.19922858 - time (sec): 14.90 - samples/sec: 2973.78 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 13:45:46,870 epoch 1 - iter 264/447 - loss 1.05760240 - time (sec): 17.71 - samples/sec: 2976.98 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 13:45:49,718 epoch 1 - iter 308/447 - loss 0.95386364 - time (sec): 20.56 - samples/sec: 2968.22 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 13:45:52,428 epoch 1 - iter 352/447 - loss 0.87272127 - time (sec): 23.27 - samples/sec: 2979.06 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 13:45:55,040 epoch 1 - iter 396/447 - loss 0.81230539 - time (sec): 25.88 - samples/sec: 2980.96 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 13:45:57,785 epoch 1 - iter 440/447 - loss 0.75773565 - time (sec): 28.63 - samples/sec: 2986.90 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 13:45:58,180 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:45:58,181 EPOCH 1 done: loss 0.7513 - lr: 0.000029 |
|
2023-10-13 13:46:03,831 DEV : loss 0.19987231492996216 - f1-score (micro avg) 0.5867 |
|
2023-10-13 13:46:03,858 saving best model |
|
2023-10-13 13:46:04,222 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:46:07,216 epoch 2 - iter 44/447 - loss 0.22838690 - time (sec): 2.99 - samples/sec: 3057.96 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 13:46:09,923 epoch 2 - iter 88/447 - loss 0.21485306 - time (sec): 5.70 - samples/sec: 3078.06 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 13:46:12,699 epoch 2 - iter 132/447 - loss 0.19643660 - time (sec): 8.48 - samples/sec: 3033.96 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 13:46:15,236 epoch 2 - iter 176/447 - loss 0.19333987 - time (sec): 11.01 - samples/sec: 3017.71 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 13:46:18,157 epoch 2 - iter 220/447 - loss 0.18690042 - time (sec): 13.93 - samples/sec: 2971.59 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 13:46:21,020 epoch 2 - iter 264/447 - loss 0.18521400 - time (sec): 16.80 - samples/sec: 2981.18 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 13:46:23,773 epoch 2 - iter 308/447 - loss 0.18183687 - time (sec): 19.55 - samples/sec: 3008.95 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 13:46:26,581 epoch 2 - iter 352/447 - loss 0.17669129 - time (sec): 22.36 - samples/sec: 3011.66 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 13:46:29,613 epoch 2 - iter 396/447 - loss 0.17173028 - time (sec): 25.39 - samples/sec: 3008.50 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 13:46:32,544 epoch 2 - iter 440/447 - loss 0.16743004 - time (sec): 28.32 - samples/sec: 3001.78 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 13:46:33,009 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:46:33,009 EPOCH 2 done: loss 0.1658 - lr: 0.000027 |
|
2023-10-13 13:46:41,867 DEV : loss 0.13758938014507294 - f1-score (micro avg) 0.6982 |
|
2023-10-13 13:46:41,893 saving best model |
|
2023-10-13 13:46:42,353 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:46:45,070 epoch 3 - iter 44/447 - loss 0.09046529 - time (sec): 2.71 - samples/sec: 3343.86 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 13:46:47,780 epoch 3 - iter 88/447 - loss 0.08742309 - time (sec): 5.42 - samples/sec: 3275.41 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 13:46:50,919 epoch 3 - iter 132/447 - loss 0.09040866 - time (sec): 8.56 - samples/sec: 3157.90 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 13:46:53,787 epoch 3 - iter 176/447 - loss 0.09359270 - time (sec): 11.43 - samples/sec: 3147.53 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 13:46:56,667 epoch 3 - iter 220/447 - loss 0.09236157 - time (sec): 14.31 - samples/sec: 3069.82 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 13:46:59,353 epoch 3 - iter 264/447 - loss 0.09220849 - time (sec): 17.00 - samples/sec: 3079.50 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 13:47:02,086 epoch 3 - iter 308/447 - loss 0.09035981 - time (sec): 19.73 - samples/sec: 3045.30 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 13:47:04,963 epoch 3 - iter 352/447 - loss 0.08957548 - time (sec): 22.61 - samples/sec: 3022.73 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 13:47:07,667 epoch 3 - iter 396/447 - loss 0.08950727 - time (sec): 25.31 - samples/sec: 3047.51 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 13:47:10,328 epoch 3 - iter 440/447 - loss 0.08942085 - time (sec): 27.97 - samples/sec: 3047.77 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 13:47:10,806 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:47:10,806 EPOCH 3 done: loss 0.0894 - lr: 0.000023 |
|
2023-10-13 13:47:19,441 DEV : loss 0.13164587318897247 - f1-score (micro avg) 0.7433 |
|
2023-10-13 13:47:19,469 saving best model |
|
2023-10-13 13:47:19,885 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:47:22,845 epoch 4 - iter 44/447 - loss 0.05596531 - time (sec): 2.96 - samples/sec: 3100.82 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 13:47:25,456 epoch 4 - iter 88/447 - loss 0.05366549 - time (sec): 5.57 - samples/sec: 3109.90 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 13:47:28,711 epoch 4 - iter 132/447 - loss 0.05216138 - time (sec): 8.82 - samples/sec: 3101.36 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 13:47:31,433 epoch 4 - iter 176/447 - loss 0.05205191 - time (sec): 11.54 - samples/sec: 3060.90 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 13:47:34,196 epoch 4 - iter 220/447 - loss 0.05390396 - time (sec): 14.31 - samples/sec: 3051.40 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 13:47:37,277 epoch 4 - iter 264/447 - loss 0.05351340 - time (sec): 17.39 - samples/sec: 3050.35 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 13:47:39,967 epoch 4 - iter 308/447 - loss 0.05467544 - time (sec): 20.08 - samples/sec: 3047.85 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 13:47:42,750 epoch 4 - iter 352/447 - loss 0.05475463 - time (sec): 22.86 - samples/sec: 3022.27 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 13:47:45,448 epoch 4 - iter 396/447 - loss 0.05579707 - time (sec): 25.56 - samples/sec: 3012.72 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 13:47:48,323 epoch 4 - iter 440/447 - loss 0.05437413 - time (sec): 28.44 - samples/sec: 2999.50 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 13:47:48,774 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:47:48,774 EPOCH 4 done: loss 0.0544 - lr: 0.000020 |
|
2023-10-13 13:47:57,440 DEV : loss 0.1452960968017578 - f1-score (micro avg) 0.7585 |
|
2023-10-13 13:47:57,475 saving best model |
|
2023-10-13 13:47:57,967 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:48:00,973 epoch 5 - iter 44/447 - loss 0.03517186 - time (sec): 3.00 - samples/sec: 2869.42 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 13:48:03,740 epoch 5 - iter 88/447 - loss 0.03773064 - time (sec): 5.77 - samples/sec: 2919.25 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 13:48:06,570 epoch 5 - iter 132/447 - loss 0.03356338 - time (sec): 8.60 - samples/sec: 2890.74 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 13:48:09,544 epoch 5 - iter 176/447 - loss 0.03500474 - time (sec): 11.57 - samples/sec: 2969.91 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 13:48:12,444 epoch 5 - iter 220/447 - loss 0.03438478 - time (sec): 14.48 - samples/sec: 2998.12 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 13:48:15,455 epoch 5 - iter 264/447 - loss 0.03589972 - time (sec): 17.49 - samples/sec: 2992.34 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 13:48:18,072 epoch 5 - iter 308/447 - loss 0.03587490 - time (sec): 20.10 - samples/sec: 3006.27 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 13:48:20,894 epoch 5 - iter 352/447 - loss 0.03540735 - time (sec): 22.92 - samples/sec: 3014.00 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 13:48:23,570 epoch 5 - iter 396/447 - loss 0.03499206 - time (sec): 25.60 - samples/sec: 3008.37 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 13:48:26,331 epoch 5 - iter 440/447 - loss 0.03401512 - time (sec): 28.36 - samples/sec: 3004.98 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 13:48:26,751 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:48:26,751 EPOCH 5 done: loss 0.0337 - lr: 0.000017 |
|
2023-10-13 13:48:35,451 DEV : loss 0.1682368963956833 - f1-score (micro avg) 0.7564 |
|
2023-10-13 13:48:35,478 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:48:38,149 epoch 6 - iter 44/447 - loss 0.02451337 - time (sec): 2.67 - samples/sec: 3095.65 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 13:48:41,006 epoch 6 - iter 88/447 - loss 0.02301274 - time (sec): 5.53 - samples/sec: 3017.27 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 13:48:44,494 epoch 6 - iter 132/447 - loss 0.01930332 - time (sec): 9.02 - samples/sec: 2980.08 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 13:48:47,421 epoch 6 - iter 176/447 - loss 0.01984188 - time (sec): 11.94 - samples/sec: 2973.45 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 13:48:50,346 epoch 6 - iter 220/447 - loss 0.01856509 - time (sec): 14.87 - samples/sec: 3011.30 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 13:48:53,243 epoch 6 - iter 264/447 - loss 0.01980529 - time (sec): 17.76 - samples/sec: 2984.20 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 13:48:55,997 epoch 6 - iter 308/447 - loss 0.02127022 - time (sec): 20.52 - samples/sec: 2969.36 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 13:48:58,601 epoch 6 - iter 352/447 - loss 0.02325254 - time (sec): 23.12 - samples/sec: 2992.88 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 13:49:01,377 epoch 6 - iter 396/447 - loss 0.02395462 - time (sec): 25.90 - samples/sec: 2998.56 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 13:49:03,917 epoch 6 - iter 440/447 - loss 0.02470748 - time (sec): 28.44 - samples/sec: 2995.43 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 13:49:04,342 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:49:04,342 EPOCH 6 done: loss 0.0244 - lr: 0.000013 |
|
2023-10-13 13:49:12,554 DEV : loss 0.18090558052062988 - f1-score (micro avg) 0.7619 |
|
2023-10-13 13:49:12,582 saving best model |
|
2023-10-13 13:49:13,002 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:49:15,547 epoch 7 - iter 44/447 - loss 0.01799003 - time (sec): 2.54 - samples/sec: 3025.09 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 13:49:18,238 epoch 7 - iter 88/447 - loss 0.01481873 - time (sec): 5.23 - samples/sec: 2955.08 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 13:49:21,888 epoch 7 - iter 132/447 - loss 0.01452826 - time (sec): 8.88 - samples/sec: 2818.32 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 13:49:24,671 epoch 7 - iter 176/447 - loss 0.01521788 - time (sec): 11.67 - samples/sec: 2888.97 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 13:49:27,614 epoch 7 - iter 220/447 - loss 0.01545290 - time (sec): 14.61 - samples/sec: 2908.87 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 13:49:30,558 epoch 7 - iter 264/447 - loss 0.01487671 - time (sec): 17.55 - samples/sec: 2892.61 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 13:49:33,270 epoch 7 - iter 308/447 - loss 0.01548150 - time (sec): 20.27 - samples/sec: 2914.17 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 13:49:36,465 epoch 7 - iter 352/447 - loss 0.01551647 - time (sec): 23.46 - samples/sec: 2901.51 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 13:49:39,150 epoch 7 - iter 396/447 - loss 0.01499973 - time (sec): 26.15 - samples/sec: 2927.73 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 13:49:41,933 epoch 7 - iter 440/447 - loss 0.01549702 - time (sec): 28.93 - samples/sec: 2944.57 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 13:49:42,378 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:49:42,378 EPOCH 7 done: loss 0.0156 - lr: 0.000010 |
|
2023-10-13 13:49:50,741 DEV : loss 0.2054622322320938 - f1-score (micro avg) 0.7834 |
|
2023-10-13 13:49:50,771 saving best model |
|
2023-10-13 13:49:51,167 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:49:53,918 epoch 8 - iter 44/447 - loss 0.02317903 - time (sec): 2.75 - samples/sec: 3031.64 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 13:49:56,645 epoch 8 - iter 88/447 - loss 0.01597198 - time (sec): 5.48 - samples/sec: 3026.22 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 13:49:59,631 epoch 8 - iter 132/447 - loss 0.01364282 - time (sec): 8.46 - samples/sec: 3052.97 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 13:50:03,019 epoch 8 - iter 176/447 - loss 0.01296909 - time (sec): 11.85 - samples/sec: 2963.07 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 13:50:05,926 epoch 8 - iter 220/447 - loss 0.01238780 - time (sec): 14.76 - samples/sec: 2958.16 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 13:50:08,399 epoch 8 - iter 264/447 - loss 0.01314139 - time (sec): 17.23 - samples/sec: 3003.92 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 13:50:11,150 epoch 8 - iter 308/447 - loss 0.01219720 - time (sec): 19.98 - samples/sec: 3006.95 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 13:50:13,836 epoch 8 - iter 352/447 - loss 0.01233694 - time (sec): 22.67 - samples/sec: 3019.24 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 13:50:16,541 epoch 8 - iter 396/447 - loss 0.01193033 - time (sec): 25.37 - samples/sec: 3024.64 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 13:50:19,511 epoch 8 - iter 440/447 - loss 0.01165387 - time (sec): 28.34 - samples/sec: 3012.13 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 13:50:19,893 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:50:19,893 EPOCH 8 done: loss 0.0115 - lr: 0.000007 |
|
2023-10-13 13:50:28,185 DEV : loss 0.2043541818857193 - f1-score (micro avg) 0.7832 |
|
2023-10-13 13:50:28,214 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:50:31,021 epoch 9 - iter 44/447 - loss 0.00821035 - time (sec): 2.81 - samples/sec: 2915.87 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 13:50:34,171 epoch 9 - iter 88/447 - loss 0.00704373 - time (sec): 5.96 - samples/sec: 2937.96 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 13:50:37,236 epoch 9 - iter 132/447 - loss 0.00630963 - time (sec): 9.02 - samples/sec: 2947.79 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 13:50:40,151 epoch 9 - iter 176/447 - loss 0.00673535 - time (sec): 11.94 - samples/sec: 2936.36 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 13:50:43,007 epoch 9 - iter 220/447 - loss 0.00678672 - time (sec): 14.79 - samples/sec: 2919.96 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 13:50:45,713 epoch 9 - iter 264/447 - loss 0.00638557 - time (sec): 17.50 - samples/sec: 2939.90 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 13:50:48,377 epoch 9 - iter 308/447 - loss 0.00699517 - time (sec): 20.16 - samples/sec: 2973.46 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 13:50:51,086 epoch 9 - iter 352/447 - loss 0.00728302 - time (sec): 22.87 - samples/sec: 2999.27 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 13:50:53,984 epoch 9 - iter 396/447 - loss 0.00737181 - time (sec): 25.77 - samples/sec: 2987.05 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 13:50:57,195 epoch 9 - iter 440/447 - loss 0.00763102 - time (sec): 28.98 - samples/sec: 2943.56 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 13:50:57,624 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:50:57,624 EPOCH 9 done: loss 0.0075 - lr: 0.000003 |
|
2023-10-13 13:51:05,821 DEV : loss 0.20693199336528778 - f1-score (micro avg) 0.7894 |
|
2023-10-13 13:51:05,850 saving best model |
|
2023-10-13 13:51:06,241 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:51:09,038 epoch 10 - iter 44/447 - loss 0.00549616 - time (sec): 2.79 - samples/sec: 3056.58 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 13:51:11,718 epoch 10 - iter 88/447 - loss 0.00580800 - time (sec): 5.47 - samples/sec: 2968.75 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 13:51:14,523 epoch 10 - iter 132/447 - loss 0.00521287 - time (sec): 8.28 - samples/sec: 2999.29 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 13:51:17,082 epoch 10 - iter 176/447 - loss 0.00551659 - time (sec): 10.84 - samples/sec: 3018.95 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 13:51:20,311 epoch 10 - iter 220/447 - loss 0.00577789 - time (sec): 14.07 - samples/sec: 3014.52 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 13:51:23,519 epoch 10 - iter 264/447 - loss 0.00524429 - time (sec): 17.27 - samples/sec: 2988.18 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 13:51:26,181 epoch 10 - iter 308/447 - loss 0.00473462 - time (sec): 19.93 - samples/sec: 3002.31 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 13:51:28,755 epoch 10 - iter 352/447 - loss 0.00431197 - time (sec): 22.51 - samples/sec: 2999.77 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 13:51:31,655 epoch 10 - iter 396/447 - loss 0.00461146 - time (sec): 25.41 - samples/sec: 3029.63 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 13:51:34,370 epoch 10 - iter 440/447 - loss 0.00490048 - time (sec): 28.12 - samples/sec: 3034.79 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 13:51:34,805 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:51:34,805 EPOCH 10 done: loss 0.0048 - lr: 0.000000 |
|
2023-10-13 13:51:43,485 DEV : loss 0.21205270290374756 - f1-score (micro avg) 0.784 |
|
2023-10-13 13:51:43,833 ---------------------------------------------------------------------------------------------------- |
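The "saving best model" lines above follow the DEV micro-average F1 logged after each epoch. The sketch below replays those ten scores and assumes a checkpoint is written only on a strict improvement over the best score so far; that rule is an assumption made for illustration, not a statement about Flair's internals.

```python
# DEV f1-score (micro avg) per epoch, copied from the log.
dev_f1 = [0.5867, 0.6982, 0.7433, 0.7585, 0.7564,
          0.7619, 0.7834, 0.7832, 0.7894, 0.784]

best_score = float("-inf")
best_epoch = None
saved_epochs = []  # epochs at which a new best model would be saved

for epoch, score in enumerate(dev_f1, start=1):
    if score > best_score:  # assumed rule: strict improvement only
        best_score = score
        best_epoch = epoch
        saved_epochs.append(epoch)

print("epochs that saved a model:", saved_epochs)
print("best epoch:", best_epoch, "with dev F1", best_score)
```

With the scores above, the replay saves at epochs 1, 2, 3, 4, 6, 7 and 9, matching the "saving best model" lines in the log, so the final evaluation uses the epoch 9 checkpoint.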
|
2023-10-13 13:51:43,834 Loading model from best epoch ... |
|
2023-10-13 13:51:45,288 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-13 13:51:50,821 |
|
Results: |
|
- F-score (micro) 0.7487 |
|
- F-score (macro) 0.667 |
|
- Accuracy 0.6162 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         loc     0.8279    0.8557    0.8416       596 |
|
        pers     0.6684    0.7508    0.7072       333 |
|
         org     0.5575    0.4773    0.5143       132 |
|
        prod     0.6400    0.4848    0.5517        66 |
|
        time     0.7059    0.7347    0.7200        49 |
|
|
|
   micro avg     0.7400    0.7577    0.7487      1176 |
|
   macro avg     0.6800    0.6607    0.6670      1176 |
|
weighted avg     0.7368    0.7577    0.7455      1176 |
|
|
|
2023-10-13 13:51:50,821 ---------------------------------------------------------------------------------------------------- |
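The averaged rows of the test report can be cross-checked against the per-class rows. The sketch below uses the standard definitions (F1 as the harmonic mean of precision and recall, macro as the unweighted mean of class F1, weighted as the support-weighted mean); the variable names are illustrative, and the results can only be expected to agree with the log to about four decimals because the inputs are already rounded.

```python
# Per-class f1-score and support, copied from the "By class" report.
report = {
    "loc":  (0.8416, 596),
    "pers": (0.7072, 333),
    "org":  (0.5143, 132),
    "prod": (0.5517, 66),
    "time": (0.7200, 49),
}
micro_precision, micro_recall = 0.7400, 0.7577

def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

total_support = sum(support for _, support in report.values())
micro_f1 = f1(micro_precision, micro_recall)
macro_f1 = sum(score for score, _ in report.values()) / len(report)
weighted_f1 = sum(score * support for score, support in report.values()) / total_support

print(f"support total: {total_support}")
print(f"micro F1:    {micro_f1:.4f}")
print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
```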
|
|