|
2023-10-18 17:48:28,457 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:48:28,458 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 128) |
|
(position_embeddings): Embedding(512, 128) |
|
(token_type_embeddings): Embedding(2, 128) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-1): 2 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=128, out_features=128, bias=True) |
|
(key): Linear(in_features=128, out_features=128, bias=True) |
|
(value): Linear(in_features=128, out_features=128, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=128, out_features=512, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=512, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=128, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-18 17:48:28,458 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:48:28,458 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-18 17:48:28,458 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:48:28,458 Train: 3575 sentences |
|
2023-10-18 17:48:28,458 (train_with_dev=False, train_with_test=False) |
|
2023-10-18 17:48:28,458 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:48:28,458 Training Params: |
|
2023-10-18 17:48:28,458 - learning_rate: "3e-05" |
|
2023-10-18 17:48:28,458 - mini_batch_size: "4" |
|
2023-10-18 17:48:28,458 - max_epochs: "10" |
|
2023-10-18 17:48:28,458 - shuffle: "True" |
|
2023-10-18 17:48:28,458 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:48:28,458 Plugins: |
|
2023-10-18 17:48:28,458 - TensorboardLogger |
|
2023-10-18 17:48:28,458 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-18 17:48:28,458 ---------------------------------------------------------------------------------------------------- |
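The `LinearScheduler` entry with `warmup_fraction: '0.1'` is consistent with a linear warmup to the peak learning rate followed by a linear decay to zero. The following is a minimal sketch of that shape, not Flair's implementation. It assumes 894 iterations per epoch over 10 epochs (8940 steps in total) and a peak rate of 3e-05, both taken from this log; the function name `linear_warmup_decay` is hypothetical.

```python
def linear_warmup_decay(step, total_steps=8940, peak_lr=3e-05, warmup_fraction=0.1):
    """Learning rate at a given optimizer step (hypothetical helper).

    Assumes a linear ramp from 0 to peak_lr over the warmup steps,
    then a linear decay from peak_lr back to 0 at total_steps.
    """
    warmup_steps = round(total_steps * warmup_fraction)  # 894 with the defaults
    if step < warmup_steps:
        # Warmup phase: the rate grows in proportion to the step count.
        return peak_lr * step / warmup_steps
    # Decay phase: the rate shrinks in proportion to the remaining steps.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


# Steps reported in the log: epoch 1 iter 89, epoch 1 iter 890,
# epoch 2 iter 178 (894 + 178), and epoch 10 iter 890 (9 * 894 + 890).
for step in (89, 890, 1072, 8936):
    print(step, format(linear_warmup_decay(step), ".6f"))
```

Under these assumptions, the values formatted to six decimals agree with the `lr` column of the log at the same steps (for example 0.000003 at step 89 and 0.000029 at step 1072).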
|
2023-10-18 17:48:28,458 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-18 17:48:28,458 - metric: "('micro avg', 'f1-score')" |
|
2023-10-18 17:48:28,458 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:48:28,458 Computation: |
|
2023-10-18 17:48:28,458 - compute on device: cuda:0 |
|
2023-10-18 17:48:28,458 - embedding storage: none |
|
2023-10-18 17:48:28,458 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:48:28,459 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-18 17:48:28,459 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:48:28,459 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:48:28,459 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-18 17:48:29,748 epoch 1 - iter 89/894 - loss 3.19630071 - time (sec): 1.29 - samples/sec: 7045.16 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 17:48:31,082 epoch 1 - iter 178/894 - loss 2.96353199 - time (sec): 2.62 - samples/sec: 7187.38 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 17:48:32,421 epoch 1 - iter 267/894 - loss 2.74540200 - time (sec): 3.96 - samples/sec: 6702.85 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 17:48:33,790 epoch 1 - iter 356/894 - loss 2.43894253 - time (sec): 5.33 - samples/sec: 6411.16 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 17:48:35,196 epoch 1 - iter 445/894 - loss 2.12976788 - time (sec): 6.74 - samples/sec: 6277.64 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 17:48:36,585 epoch 1 - iter 534/894 - loss 1.88648563 - time (sec): 8.13 - samples/sec: 6263.25 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 17:48:37,976 epoch 1 - iter 623/894 - loss 1.67326583 - time (sec): 9.52 - samples/sec: 6416.47 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 17:48:39,326 epoch 1 - iter 712/894 - loss 1.53353448 - time (sec): 10.87 - samples/sec: 6403.24 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 17:48:40,702 epoch 1 - iter 801/894 - loss 1.42922638 - time (sec): 12.24 - samples/sec: 6366.34 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 17:48:42,044 epoch 1 - iter 890/894 - loss 1.35456652 - time (sec): 13.58 - samples/sec: 6351.81 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 17:48:42,100 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:48:42,100 EPOCH 1 done: loss 1.3532 - lr: 0.000030 |
|
2023-10-18 17:48:44,322 DEV : loss 0.45246344804763794 - f1-score (micro avg) 0.0 |
|
2023-10-18 17:48:44,344 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:48:45,711 epoch 2 - iter 89/894 - loss 0.52574422 - time (sec): 1.37 - samples/sec: 6183.97 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 17:48:47,080 epoch 2 - iter 178/894 - loss 0.55545770 - time (sec): 2.74 - samples/sec: 6374.43 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 17:48:48,520 epoch 2 - iter 267/894 - loss 0.54150980 - time (sec): 4.18 - samples/sec: 6041.74 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 17:48:49,990 epoch 2 - iter 356/894 - loss 0.53379165 - time (sec): 5.64 - samples/sec: 5904.15 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 17:48:51,410 epoch 2 - iter 445/894 - loss 0.53062271 - time (sec): 7.07 - samples/sec: 6094.43 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 17:48:52,758 epoch 2 - iter 534/894 - loss 0.52081983 - time (sec): 8.41 - samples/sec: 6121.72 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 17:48:54,219 epoch 2 - iter 623/894 - loss 0.51963233 - time (sec): 9.87 - samples/sec: 6246.67 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 17:48:55,574 epoch 2 - iter 712/894 - loss 0.51540611 - time (sec): 11.23 - samples/sec: 6162.82 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 17:48:56,953 epoch 2 - iter 801/894 - loss 0.51041159 - time (sec): 12.61 - samples/sec: 6165.15 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 17:48:58,341 epoch 2 - iter 890/894 - loss 0.50535478 - time (sec): 14.00 - samples/sec: 6164.74 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 17:48:58,402 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:48:58,403 EPOCH 2 done: loss 0.5064 - lr: 0.000027 |
|
2023-10-18 17:49:03,571 DEV : loss 0.3578655421733856 - f1-score (micro avg) 0.0592 |
|
2023-10-18 17:49:03,594 saving best model |
|
2023-10-18 17:49:03,627 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:49:05,037 epoch 3 - iter 89/894 - loss 0.48336938 - time (sec): 1.41 - samples/sec: 6484.71 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 17:49:06,438 epoch 3 - iter 178/894 - loss 0.47658882 - time (sec): 2.81 - samples/sec: 6324.71 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 17:49:07,814 epoch 3 - iter 267/894 - loss 0.45712463 - time (sec): 4.19 - samples/sec: 6340.69 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 17:49:09,179 epoch 3 - iter 356/894 - loss 0.46393711 - time (sec): 5.55 - samples/sec: 6172.52 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 17:49:10,558 epoch 3 - iter 445/894 - loss 0.44830358 - time (sec): 6.93 - samples/sec: 6114.08 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 17:49:11,950 epoch 3 - iter 534/894 - loss 0.44669476 - time (sec): 8.32 - samples/sec: 6129.35 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 17:49:13,338 epoch 3 - iter 623/894 - loss 0.43924135 - time (sec): 9.71 - samples/sec: 6133.73 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 17:49:14,741 epoch 3 - iter 712/894 - loss 0.43899574 - time (sec): 11.11 - samples/sec: 6177.45 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 17:49:16,164 epoch 3 - iter 801/894 - loss 0.43256443 - time (sec): 12.54 - samples/sec: 6201.00 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 17:49:17,548 epoch 3 - iter 890/894 - loss 0.42949594 - time (sec): 13.92 - samples/sec: 6194.26 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 17:49:17,606 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:49:17,606 EPOCH 3 done: loss 0.4296 - lr: 0.000023 |
|
2023-10-18 17:49:22,809 DEV : loss 0.34198394417762756 - f1-score (micro avg) 0.2455 |
|
2023-10-18 17:49:22,832 saving best model |
|
2023-10-18 17:49:22,865 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:49:24,232 epoch 4 - iter 89/894 - loss 0.41318889 - time (sec): 1.37 - samples/sec: 5961.29 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 17:49:25,688 epoch 4 - iter 178/894 - loss 0.38872500 - time (sec): 2.82 - samples/sec: 6480.28 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 17:49:27,081 epoch 4 - iter 267/894 - loss 0.38659751 - time (sec): 4.22 - samples/sec: 6377.23 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 17:49:28,484 epoch 4 - iter 356/894 - loss 0.39484288 - time (sec): 5.62 - samples/sec: 6340.75 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 17:49:29,929 epoch 4 - iter 445/894 - loss 0.38596894 - time (sec): 7.06 - samples/sec: 6296.23 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 17:49:31,291 epoch 4 - iter 534/894 - loss 0.38762013 - time (sec): 8.43 - samples/sec: 6266.21 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 17:49:32,694 epoch 4 - iter 623/894 - loss 0.38363860 - time (sec): 9.83 - samples/sec: 6235.32 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 17:49:34,090 epoch 4 - iter 712/894 - loss 0.38723100 - time (sec): 11.22 - samples/sec: 6214.74 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 17:49:35,490 epoch 4 - iter 801/894 - loss 0.38665888 - time (sec): 12.62 - samples/sec: 6162.80 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 17:49:36,891 epoch 4 - iter 890/894 - loss 0.38610173 - time (sec): 14.03 - samples/sec: 6143.24 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 17:49:36,957 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:49:36,957 EPOCH 4 done: loss 0.3846 - lr: 0.000020 |
|
2023-10-18 17:49:41,899 DEV : loss 0.3150075674057007 - f1-score (micro avg) 0.3247 |
|
2023-10-18 17:49:41,923 saving best model |
|
2023-10-18 17:49:41,956 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:49:43,421 epoch 5 - iter 89/894 - loss 0.38144140 - time (sec): 1.46 - samples/sec: 5968.14 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 17:49:44,878 epoch 5 - iter 178/894 - loss 0.35024717 - time (sec): 2.92 - samples/sec: 6219.91 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 17:49:46,330 epoch 5 - iter 267/894 - loss 0.35588428 - time (sec): 4.37 - samples/sec: 5912.08 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 17:49:47,698 epoch 5 - iter 356/894 - loss 0.35467769 - time (sec): 5.74 - samples/sec: 6017.05 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 17:49:49,081 epoch 5 - iter 445/894 - loss 0.36077938 - time (sec): 7.12 - samples/sec: 5971.89 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 17:49:50,469 epoch 5 - iter 534/894 - loss 0.36734085 - time (sec): 8.51 - samples/sec: 5956.77 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 17:49:51,914 epoch 5 - iter 623/894 - loss 0.36702674 - time (sec): 9.96 - samples/sec: 6018.41 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 17:49:53,392 epoch 5 - iter 712/894 - loss 0.36851786 - time (sec): 11.44 - samples/sec: 6059.43 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 17:49:54,785 epoch 5 - iter 801/894 - loss 0.36449745 - time (sec): 12.83 - samples/sec: 6069.19 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 17:49:56,193 epoch 5 - iter 890/894 - loss 0.35757282 - time (sec): 14.24 - samples/sec: 6057.28 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 17:49:56,259 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:49:56,259 EPOCH 5 done: loss 0.3570 - lr: 0.000017 |
|
2023-10-18 17:50:01,547 DEV : loss 0.3157925307750702 - f1-score (micro avg) 0.3235 |
|
2023-10-18 17:50:01,571 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:50:03,005 epoch 6 - iter 89/894 - loss 0.34112212 - time (sec): 1.43 - samples/sec: 5988.42 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 17:50:04,422 epoch 6 - iter 178/894 - loss 0.33324202 - time (sec): 2.85 - samples/sec: 5812.42 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 17:50:05,839 epoch 6 - iter 267/894 - loss 0.32141621 - time (sec): 4.27 - samples/sec: 5680.00 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 17:50:07,210 epoch 6 - iter 356/894 - loss 0.34308998 - time (sec): 5.64 - samples/sec: 5766.94 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 17:50:08,571 epoch 6 - iter 445/894 - loss 0.34151889 - time (sec): 7.00 - samples/sec: 5841.83 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 17:50:10,009 epoch 6 - iter 534/894 - loss 0.35052796 - time (sec): 8.44 - samples/sec: 6032.34 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 17:50:11,412 epoch 6 - iter 623/894 - loss 0.34315959 - time (sec): 9.84 - samples/sec: 6061.10 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 17:50:12,775 epoch 6 - iter 712/894 - loss 0.33510239 - time (sec): 11.20 - samples/sec: 6112.00 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 17:50:14,192 epoch 6 - iter 801/894 - loss 0.33637264 - time (sec): 12.62 - samples/sec: 6162.15 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 17:50:15,601 epoch 6 - iter 890/894 - loss 0.33235115 - time (sec): 14.03 - samples/sec: 6145.37 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 17:50:15,658 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:50:15,658 EPOCH 6 done: loss 0.3326 - lr: 0.000013 |
|
2023-10-18 17:50:20,999 DEV : loss 0.30548417568206787 - f1-score (micro avg) 0.3435 |
|
2023-10-18 17:50:21,022 saving best model |
|
2023-10-18 17:50:21,056 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:50:22,623 epoch 7 - iter 89/894 - loss 0.31222141 - time (sec): 1.57 - samples/sec: 5714.42 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 17:50:24,005 epoch 7 - iter 178/894 - loss 0.32705010 - time (sec): 2.95 - samples/sec: 5872.12 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 17:50:25,399 epoch 7 - iter 267/894 - loss 0.32263937 - time (sec): 4.34 - samples/sec: 5931.39 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 17:50:26,782 epoch 7 - iter 356/894 - loss 0.32544371 - time (sec): 5.73 - samples/sec: 5904.42 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 17:50:28,201 epoch 7 - iter 445/894 - loss 0.31601726 - time (sec): 7.14 - samples/sec: 5919.08 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 17:50:29,634 epoch 7 - iter 534/894 - loss 0.31615422 - time (sec): 8.58 - samples/sec: 6012.46 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 17:50:31,015 epoch 7 - iter 623/894 - loss 0.31916724 - time (sec): 9.96 - samples/sec: 6066.94 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 17:50:32,413 epoch 7 - iter 712/894 - loss 0.32144582 - time (sec): 11.36 - samples/sec: 6137.21 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 17:50:33,805 epoch 7 - iter 801/894 - loss 0.32125099 - time (sec): 12.75 - samples/sec: 6150.79 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 17:50:35,151 epoch 7 - iter 890/894 - loss 0.31897858 - time (sec): 14.09 - samples/sec: 6117.17 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 17:50:35,210 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:50:35,210 EPOCH 7 done: loss 0.3195 - lr: 0.000010 |
|
2023-10-18 17:50:40,535 DEV : loss 0.3022187650203705 - f1-score (micro avg) 0.3497 |
|
2023-10-18 17:50:40,559 saving best model |
|
2023-10-18 17:50:40,599 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:50:42,000 epoch 8 - iter 89/894 - loss 0.34555510 - time (sec): 1.40 - samples/sec: 6326.33 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 17:50:43,433 epoch 8 - iter 178/894 - loss 0.32944189 - time (sec): 2.83 - samples/sec: 6467.94 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 17:50:44,836 epoch 8 - iter 267/894 - loss 0.32918232 - time (sec): 4.24 - samples/sec: 6225.29 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 17:50:46,332 epoch 8 - iter 356/894 - loss 0.31980933 - time (sec): 5.73 - samples/sec: 6146.00 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 17:50:47,718 epoch 8 - iter 445/894 - loss 0.31177364 - time (sec): 7.12 - samples/sec: 6270.04 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 17:50:49,196 epoch 8 - iter 534/894 - loss 0.30996565 - time (sec): 8.60 - samples/sec: 6170.04 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 17:50:50,598 epoch 8 - iter 623/894 - loss 0.31266497 - time (sec): 10.00 - samples/sec: 6092.52 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 17:50:52,119 epoch 8 - iter 712/894 - loss 0.30729815 - time (sec): 11.52 - samples/sec: 6096.38 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 17:50:53,590 epoch 8 - iter 801/894 - loss 0.31538057 - time (sec): 12.99 - samples/sec: 6039.31 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 17:50:54,948 epoch 8 - iter 890/894 - loss 0.31034285 - time (sec): 14.35 - samples/sec: 6002.89 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 17:50:55,010 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:50:55,010 EPOCH 8 done: loss 0.3097 - lr: 0.000007 |
|
2023-10-18 17:50:59,964 DEV : loss 0.30434364080429077 - f1-score (micro avg) 0.3522 |
|
2023-10-18 17:50:59,988 saving best model |
|
2023-10-18 17:51:00,025 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:51:01,426 epoch 9 - iter 89/894 - loss 0.23959273 - time (sec): 1.40 - samples/sec: 6024.01 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 17:51:02,796 epoch 9 - iter 178/894 - loss 0.27105112 - time (sec): 2.77 - samples/sec: 6003.56 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 17:51:04,184 epoch 9 - iter 267/894 - loss 0.28143971 - time (sec): 4.16 - samples/sec: 6298.31 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 17:51:05,606 epoch 9 - iter 356/894 - loss 0.29591837 - time (sec): 5.58 - samples/sec: 6341.19 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 17:51:07,028 epoch 9 - iter 445/894 - loss 0.29997546 - time (sec): 7.00 - samples/sec: 6173.84 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 17:51:08,706 epoch 9 - iter 534/894 - loss 0.30091527 - time (sec): 8.68 - samples/sec: 5943.69 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 17:51:10,103 epoch 9 - iter 623/894 - loss 0.30243132 - time (sec): 10.08 - samples/sec: 5982.17 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 17:51:11,554 epoch 9 - iter 712/894 - loss 0.29460013 - time (sec): 11.53 - samples/sec: 6064.68 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 17:51:12,917 epoch 9 - iter 801/894 - loss 0.30229177 - time (sec): 12.89 - samples/sec: 6047.04 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 17:51:14,303 epoch 9 - iter 890/894 - loss 0.30375418 - time (sec): 14.28 - samples/sec: 6036.73 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 17:51:14,365 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:51:14,365 EPOCH 9 done: loss 0.3029 - lr: 0.000003 |
|
2023-10-18 17:51:19,332 DEV : loss 0.30048123002052307 - f1-score (micro avg) 0.3541 |
|
2023-10-18 17:51:19,357 saving best model |
|
2023-10-18 17:51:19,393 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:51:20,890 epoch 10 - iter 89/894 - loss 0.29449068 - time (sec): 1.50 - samples/sec: 6558.64 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 17:51:22,283 epoch 10 - iter 178/894 - loss 0.30220126 - time (sec): 2.89 - samples/sec: 6265.58 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 17:51:23,707 epoch 10 - iter 267/894 - loss 0.28820853 - time (sec): 4.31 - samples/sec: 6138.84 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 17:51:25,097 epoch 10 - iter 356/894 - loss 0.29937325 - time (sec): 5.70 - samples/sec: 6225.42 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 17:51:26,464 epoch 10 - iter 445/894 - loss 0.29761073 - time (sec): 7.07 - samples/sec: 6096.08 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 17:51:27,899 epoch 10 - iter 534/894 - loss 0.29810315 - time (sec): 8.51 - samples/sec: 6153.61 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 17:51:29,274 epoch 10 - iter 623/894 - loss 0.30838358 - time (sec): 9.88 - samples/sec: 6263.76 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 17:51:30,631 epoch 10 - iter 712/894 - loss 0.31079931 - time (sec): 11.24 - samples/sec: 6198.45 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 17:51:31,990 epoch 10 - iter 801/894 - loss 0.30309182 - time (sec): 12.60 - samples/sec: 6192.14 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-18 17:51:33,352 epoch 10 - iter 890/894 - loss 0.30297146 - time (sec): 13.96 - samples/sec: 6148.71 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-18 17:51:33,434 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:51:33,434 EPOCH 10 done: loss 0.3025 - lr: 0.000000 |
|
2023-10-18 17:51:38,772 DEV : loss 0.300153523683548 - f1-score (micro avg) 0.3579 |
|
2023-10-18 17:51:38,797 saving best model |
|
2023-10-18 17:51:38,868 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:51:38,868 Loading model from best epoch ... |
|
2023-10-18 17:51:38,945 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-18 17:51:41,291 |
|
Results: |
|
- F-score (micro) 0.3715 |
|
- F-score (macro) 0.1528 |
|
- Accuracy 0.2374 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         loc     0.4882    0.5923    0.5353       596 |
|
        pers     0.1927    0.2072    0.1997       333 |
|
         org     0.0000    0.0000    0.0000       132 |
|
        time     0.0500    0.0204    0.0290        49 |
|
        prod     0.0000    0.0000    0.0000        66 |
|
|
|
   micro avg     0.3842    0.3597    0.3715      1176 |
|
   macro avg     0.1462    0.1640    0.1528      1176 |
|
weighted avg     0.3041    0.3597    0.3290      1176 |
|
|
|
2023-10-18 17:51:41,291 ---------------------------------------------------------------------------------------------------- |
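The aggregate rows of the report can be cross-checked from the per-class rows. The sketch below recomputes the macro, micro, and weighted F-scores from the rounded values printed in the log. The variable names are illustrative, and because the inputs are already rounded to four decimals, agreement is only expected to about that precision.

```python
# Per-class (precision, recall, f1, support), copied from the report in the log.
report = {
    "loc":  (0.4882, 0.5923, 0.5353, 596),
    "pers": (0.1927, 0.2072, 0.1997, 333),
    "org":  (0.0000, 0.0000, 0.0000, 132),
    "time": (0.0500, 0.0204, 0.0290, 49),
    "prod": (0.0000, 0.0000, 0.0000, 66),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[3] for row in report.values())

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in report.values()) / len(report)

# Weighted F1: per-class F1 scores weighted by class support.
weighted_f1 = sum(row[2] * row[3] for row in report.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall rows.
micro_f1 = f1(0.3842, 0.3597)

print(total_support, round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
```

The macro average is pulled down by the two classes with no correct predictions (`org` and `prod`), which is why it sits far below the micro average.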
|
|