2023-10-25 20:45:27,300 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,301 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 20:45:27,301 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,301 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 Train:  1166 sentences
2023-10-25 20:45:27,302 (train_with_dev=False, train_with_test=False)
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 Training Params:
2023-10-25 20:45:27,302 - learning_rate: "3e-05"
2023-10-25 20:45:27,302 - mini_batch_size: "8"
2023-10-25 20:45:27,302 - max_epochs: "10"
2023-10-25 20:45:27,302 - shuffle: "True"
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 Plugins:
2023-10-25 20:45:27,302 - TensorboardLogger
2023-10-25 20:45:27,302 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
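The LinearScheduler plugin above (warmup_fraction '0.1', peak learning_rate 3e-05) matches the per-iteration lr values in the epoch logs below: with 146 iterations per epoch over 10 epochs (1460 steps), the lr ramps up linearly for the first ~146 steps and then decays linearly to zero. A minimal sketch of that schedule; `lr_at_step` is an illustrative helper reconstructed from the logged values, not Flair's actual LinearScheduler API:

```python
def lr_at_step(step, base_lr=3e-05, total_steps=1460, warmup_fraction=0.1):
    """Linear warmup to base_lr, then linear decay to zero.

    A reconstruction consistent with the logged lr values; Flair's real
    LinearScheduler may differ in off-by-one details.
    """
    warmup_steps = round(total_steps * warmup_fraction)  # 146 steps here
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Spot-checks against the log below:
print(f"{lr_at_step(14):.6f}")    # epoch 1, iter 14   -> 0.000003
print(f"{lr_at_step(140):.6f}")   # epoch 1, iter 140  -> 0.000029
print(f"{lr_at_step(1454):.6f}")  # epoch 10, iter 140 -> 0.000000
```

This also explains the epoch-2 value of 0.000030 at iter 14 (global step 160): the schedule is already in its decay phase there, but still within rounding distance of the 3e-05 peak.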
2023-10-25 20:45:27,302 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 20:45:27,302 - metric: "('micro avg', 'f1-score')"
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 Computation:
2023-10-25 20:45:27,302 - compute on device: cuda:0
2023-10-25 20:45:27,302 - embedding storage: none
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,302 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:27,303 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 20:45:28,780 epoch 1 - iter 14/146 - loss 3.40387893 - time (sec): 1.48 - samples/sec: 3168.95 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:45:29,613 epoch 1 - iter 28/146 - loss 3.06671750 - time (sec): 2.31 - samples/sec: 3664.65 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:45:30,529 epoch 1 - iter 42/146 - loss 2.42857814 - time (sec): 3.23 - samples/sec: 3960.64 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:45:31,464 epoch 1 - iter 56/146 - loss 1.94932103 - time (sec): 4.16 - samples/sec: 4165.72 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:45:32,404 epoch 1 - iter 70/146 - loss 1.65766878 - time (sec): 5.10 - samples/sec: 4292.17 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:45:33,352 epoch 1 - iter 84/146 - loss 1.44606435 - time (sec): 6.05 - samples/sec: 4392.13 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:45:34,263 epoch 1 - iter 98/146 - loss 1.29854117 - time (sec): 6.96 - samples/sec: 4422.42 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:45:35,000 epoch 1 - iter 112/146 - loss 1.19665486 - time (sec): 7.70 - samples/sec: 4452.40 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:45:35,992 epoch 1 - iter 126/146 - loss 1.08010980 - time (sec): 8.69 - samples/sec: 4492.40 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:45:36,820 epoch 1 - iter 140/146 - loss 1.00953467 - time (sec): 9.52 - samples/sec: 4495.08 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:45:37,181 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:37,181 EPOCH 1 done: loss 0.9803 - lr: 0.000029
2023-10-25 20:45:37,681 DEV : loss 0.18878793716430664 - f1-score (micro avg) 0.4692
2023-10-25 20:45:37,685 saving best model
2023-10-25 20:45:38,227 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:39,018 epoch 2 - iter 14/146 - loss 0.22373383 - time (sec): 0.79 - samples/sec: 5025.93 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:45:39,871 epoch 2 - iter 28/146 - loss 0.21574470 - time (sec): 1.64 - samples/sec: 5031.21 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:45:40,819 epoch 2 - iter 42/146 - loss 0.22004443 - time (sec): 2.59 - samples/sec: 5013.80 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:45:41,640 epoch 2 - iter 56/146 - loss 0.20860038 - time (sec): 3.41 - samples/sec: 5027.01 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:45:42,429 epoch 2 - iter 70/146 - loss 0.20976419 - time (sec): 4.20 - samples/sec: 4957.47 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:45:43,421 epoch 2 - iter 84/146 - loss 0.20054745 - time (sec): 5.19 - samples/sec: 4852.10 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:45:44,456 epoch 2 - iter 98/146 - loss 0.19171488 - time (sec): 6.23 - samples/sec: 4836.36 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:45:45,320 epoch 2 - iter 112/146 - loss 0.18425139 - time (sec): 7.09 - samples/sec: 4856.43 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:45:46,213 epoch 2 - iter 126/146 - loss 0.18453654 - time (sec): 7.98 - samples/sec: 4787.91 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:45:46,992 epoch 2 - iter 140/146 - loss 0.18299430 - time (sec): 8.76 - samples/sec: 4783.07 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:45:47,552 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:47,553 EPOCH 2 done: loss 0.1829 - lr: 0.000027
2023-10-25 20:45:48,459 DEV : loss 0.10772357136011124 - f1-score (micro avg) 0.6757
2023-10-25 20:45:48,464 saving best model
2023-10-25 20:45:49,128 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:50,253 epoch 3 - iter 14/146 - loss 0.09397448 - time (sec): 1.12 - samples/sec: 4453.73 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:45:51,244 epoch 3 - iter 28/146 - loss 0.08863819 - time (sec): 2.11 - samples/sec: 4734.46 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:45:52,029 epoch 3 - iter 42/146 - loss 0.09172307 - time (sec): 2.90 - samples/sec: 4638.72 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:45:52,887 epoch 3 - iter 56/146 - loss 0.09536448 - time (sec): 3.76 - samples/sec: 4682.14 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:45:53,773 epoch 3 - iter 70/146 - loss 0.09920949 - time (sec): 4.64 - samples/sec: 4721.91 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:45:54,543 epoch 3 - iter 84/146 - loss 0.09883122 - time (sec): 5.41 - samples/sec: 4656.43 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:45:55,495 epoch 3 - iter 98/146 - loss 0.09485869 - time (sec): 6.37 - samples/sec: 4721.80 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:45:56,456 epoch 3 - iter 112/146 - loss 0.09585353 - time (sec): 7.33 - samples/sec: 4703.34 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:45:57,400 epoch 3 - iter 126/146 - loss 0.10213869 - time (sec): 8.27 - samples/sec: 4707.42 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:45:58,266 epoch 3 - iter 140/146 - loss 0.09946411 - time (sec): 9.14 - samples/sec: 4687.32 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:45:58,615 ----------------------------------------------------------------------------------------------------
2023-10-25 20:45:58,615 EPOCH 3 done: loss 0.0993 - lr: 0.000024
2023-10-25 20:45:59,521 DEV : loss 0.09623494744300842 - f1-score (micro avg) 0.7237
2023-10-25 20:45:59,525 saving best model
2023-10-25 20:46:00,194 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:00,959 epoch 4 - iter 14/146 - loss 0.04022952 - time (sec): 0.76 - samples/sec: 4527.81 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:46:01,817 epoch 4 - iter 28/146 - loss 0.05326105 - time (sec): 1.62 - samples/sec: 4707.44 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:46:02,851 epoch 4 - iter 42/146 - loss 0.05115026 - time (sec): 2.65 - samples/sec: 4516.22 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:46:03,802 epoch 4 - iter 56/146 - loss 0.05355703 - time (sec): 3.61 - samples/sec: 4625.29 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:46:04,793 epoch 4 - iter 70/146 - loss 0.04916831 - time (sec): 4.60 - samples/sec: 4842.45 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:46:05,616 epoch 4 - iter 84/146 - loss 0.05190130 - time (sec): 5.42 - samples/sec: 4783.85 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:46:06,680 epoch 4 - iter 98/146 - loss 0.05403563 - time (sec): 6.48 - samples/sec: 4719.13 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:46:07,485 epoch 4 - iter 112/146 - loss 0.05760992 - time (sec): 7.29 - samples/sec: 4729.44 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:46:08,366 epoch 4 - iter 126/146 - loss 0.05858482 - time (sec): 8.17 - samples/sec: 4699.12 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:46:09,193 epoch 4 - iter 140/146 - loss 0.05860587 - time (sec): 9.00 - samples/sec: 4749.76 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:46:09,538 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:09,539 EPOCH 4 done: loss 0.0602 - lr: 0.000020
2023-10-25 20:46:10,450 DEV : loss 0.10259346663951874 - f1-score (micro avg) 0.7352
2023-10-25 20:46:10,454 saving best model
2023-10-25 20:46:11,125 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:11,959 epoch 5 - iter 14/146 - loss 0.04835059 - time (sec): 0.83 - samples/sec: 4538.83 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:46:12,819 epoch 5 - iter 28/146 - loss 0.04414478 - time (sec): 1.69 - samples/sec: 4744.01 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:46:13,786 epoch 5 - iter 42/146 - loss 0.04694686 - time (sec): 2.66 - samples/sec: 4818.30 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:46:14,616 epoch 5 - iter 56/146 - loss 0.04615584 - time (sec): 3.49 - samples/sec: 4796.28 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:46:15,550 epoch 5 - iter 70/146 - loss 0.04037843 - time (sec): 4.42 - samples/sec: 4778.41 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:46:16,368 epoch 5 - iter 84/146 - loss 0.04012595 - time (sec): 5.24 - samples/sec: 4751.28 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:46:17,376 epoch 5 - iter 98/146 - loss 0.04025863 - time (sec): 6.25 - samples/sec: 4720.73 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:46:18,308 epoch 5 - iter 112/146 - loss 0.03949493 - time (sec): 7.18 - samples/sec: 4719.44 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:46:19,261 epoch 5 - iter 126/146 - loss 0.03932549 - time (sec): 8.13 - samples/sec: 4681.60 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:46:20,151 epoch 5 - iter 140/146 - loss 0.03746108 - time (sec): 9.02 - samples/sec: 4702.72 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:46:20,577 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:20,577 EPOCH 5 done: loss 0.0404 - lr: 0.000017
2023-10-25 20:46:21,635 DEV : loss 0.10988734662532806 - f1-score (micro avg) 0.7435
2023-10-25 20:46:21,639 saving best model
2023-10-25 20:46:22,313 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:23,226 epoch 6 - iter 14/146 - loss 0.03584193 - time (sec): 0.91 - samples/sec: 4497.18 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:46:24,123 epoch 6 - iter 28/146 - loss 0.02695874 - time (sec): 1.81 - samples/sec: 4616.48 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:46:25,174 epoch 6 - iter 42/146 - loss 0.02419280 - time (sec): 2.86 - samples/sec: 4545.19 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:46:26,171 epoch 6 - iter 56/146 - loss 0.02537702 - time (sec): 3.86 - samples/sec: 4669.12 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:46:27,060 epoch 6 - iter 70/146 - loss 0.02481698 - time (sec): 4.75 - samples/sec: 4611.66 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:46:27,932 epoch 6 - iter 84/146 - loss 0.02388273 - time (sec): 5.62 - samples/sec: 4591.94 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:46:28,827 epoch 6 - iter 98/146 - loss 0.02589398 - time (sec): 6.51 - samples/sec: 4653.01 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:46:29,715 epoch 6 - iter 112/146 - loss 0.02424820 - time (sec): 7.40 - samples/sec: 4697.26 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:46:30,507 epoch 6 - iter 126/146 - loss 0.02469800 - time (sec): 8.19 - samples/sec: 4696.81 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:46:31,313 epoch 6 - iter 140/146 - loss 0.02532898 - time (sec): 9.00 - samples/sec: 4768.13 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:46:31,721 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:31,721 EPOCH 6 done: loss 0.0264 - lr: 0.000014
2023-10-25 20:46:32,629 DEV : loss 0.1386864334344864 - f1-score (micro avg) 0.7348
2023-10-25 20:46:32,634 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:33,506 epoch 7 - iter 14/146 - loss 0.02298317 - time (sec): 0.87 - samples/sec: 4624.97 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:46:34,390 epoch 7 - iter 28/146 - loss 0.01697781 - time (sec): 1.75 - samples/sec: 4561.88 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:46:35,300 epoch 7 - iter 42/146 - loss 0.01735609 - time (sec): 2.67 - samples/sec: 4561.13 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:46:36,143 epoch 7 - iter 56/146 - loss 0.01627697 - time (sec): 3.51 - samples/sec: 4555.86 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:46:37,224 epoch 7 - iter 70/146 - loss 0.01980291 - time (sec): 4.59 - samples/sec: 4619.86 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:46:38,274 epoch 7 - iter 84/146 - loss 0.01830047 - time (sec): 5.64 - samples/sec: 4600.61 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:46:39,173 epoch 7 - iter 98/146 - loss 0.01978492 - time (sec): 6.54 - samples/sec: 4609.19 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:46:40,095 epoch 7 - iter 112/146 - loss 0.01965229 - time (sec): 7.46 - samples/sec: 4637.54 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:46:40,973 epoch 7 - iter 126/146 - loss 0.01942350 - time (sec): 8.34 - samples/sec: 4592.38 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:46:41,772 epoch 7 - iter 140/146 - loss 0.01866854 - time (sec): 9.14 - samples/sec: 4683.18 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:46:42,111 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:42,111 EPOCH 7 done: loss 0.0185 - lr: 0.000010
2023-10-25 20:46:43,024 DEV : loss 0.12989920377731323 - f1-score (micro avg) 0.7407
2023-10-25 20:46:43,028 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:43,931 epoch 8 - iter 14/146 - loss 0.00678178 - time (sec): 0.90 - samples/sec: 4885.73 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:46:44,781 epoch 8 - iter 28/146 - loss 0.01030138 - time (sec): 1.75 - samples/sec: 5354.80 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:46:45,610 epoch 8 - iter 42/146 - loss 0.01041059 - time (sec): 2.58 - samples/sec: 5152.88 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:46:46,515 epoch 8 - iter 56/146 - loss 0.01122395 - time (sec): 3.49 - samples/sec: 5052.68 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:46:47,386 epoch 8 - iter 70/146 - loss 0.01098060 - time (sec): 4.36 - samples/sec: 4939.20 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:46:48,315 epoch 8 - iter 84/146 - loss 0.01162853 - time (sec): 5.29 - samples/sec: 4912.02 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:46:49,378 epoch 8 - iter 98/146 - loss 0.01200666 - time (sec): 6.35 - samples/sec: 4823.94 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:46:50,289 epoch 8 - iter 112/146 - loss 0.01164341 - time (sec): 7.26 - samples/sec: 4811.61 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:46:51,179 epoch 8 - iter 126/146 - loss 0.01388481 - time (sec): 8.15 - samples/sec: 4728.40 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:46:52,051 epoch 8 - iter 140/146 - loss 0.01385814 - time (sec): 9.02 - samples/sec: 4750.18 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:46:52,378 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:52,378 EPOCH 8 done: loss 0.0141 - lr: 0.000007
2023-10-25 20:46:53,292 DEV : loss 0.1553896963596344 - f1-score (micro avg) 0.7527
2023-10-25 20:46:53,296 saving best model
2023-10-25 20:46:53,826 ----------------------------------------------------------------------------------------------------
2023-10-25 20:46:54,838 epoch 9 - iter 14/146 - loss 0.01917424 - time (sec): 1.01 - samples/sec: 4363.15 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:46:55,724 epoch 9 - iter 28/146 - loss 0.01227936 - time (sec): 1.90 - samples/sec: 4423.12 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:46:56,495 epoch 9 - iter 42/146 - loss 0.01288509 - time (sec): 2.67 - samples/sec: 4458.51 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:46:57,559 epoch 9 - iter 56/146 - loss 0.01164916 - time (sec): 3.73 - samples/sec: 4356.87 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:46:58,372 epoch 9 - iter 70/146 - loss 0.01131783 - time (sec): 4.54 - samples/sec: 4431.60 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:46:59,292 epoch 9 - iter 84/146 - loss 0.01232463 - time (sec): 5.46 - samples/sec: 4406.38 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:47:00,189 epoch 9 - iter 98/146 - loss 0.01239644 - time (sec): 6.36 - samples/sec: 4546.61 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:47:01,088 epoch 9 - iter 112/146 - loss 0.01154778 - time (sec): 7.26 - samples/sec: 4592.00 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:47:02,053 epoch 9 - iter 126/146 - loss 0.01104399 - time (sec): 8.22 - samples/sec: 4653.77 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:47:03,090 epoch 9 - iter 140/146 - loss 0.01095344 - time (sec): 9.26 - samples/sec: 4640.05 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:47:03,435 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:03,435 EPOCH 9 done: loss 0.0110 - lr: 0.000004
2023-10-25 20:47:04,346 DEV : loss 0.16054613888263702 - f1-score (micro avg) 0.7446
2023-10-25 20:47:04,350 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:05,164 epoch 10 - iter 14/146 - loss 0.00674318 - time (sec): 0.81 - samples/sec: 5173.53 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:47:06,134 epoch 10 - iter 28/146 - loss 0.01110079 - time (sec): 1.78 - samples/sec: 4683.12 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:47:07,014 epoch 10 - iter 42/146 - loss 0.01282647 - time (sec): 2.66 - samples/sec: 4633.70 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:47:07,950 epoch 10 - iter 56/146 - loss 0.01170567 - time (sec): 3.60 - samples/sec: 4666.99 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:47:08,947 epoch 10 - iter 70/146 - loss 0.01044850 - time (sec): 4.60 - samples/sec: 4570.03 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:47:09,806 epoch 10 - iter 84/146 - loss 0.00899261 - time (sec): 5.46 - samples/sec: 4505.29 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:47:10,843 epoch 10 - iter 98/146 - loss 0.00848585 - time (sec): 6.49 - samples/sec: 4560.37 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:47:11,802 epoch 10 - iter 112/146 - loss 0.00890020 - time (sec): 7.45 - samples/sec: 4616.28 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:47:12,761 epoch 10 - iter 126/146 - loss 0.00868290 - time (sec): 8.41 - samples/sec: 4583.94 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:47:13,584 epoch 10 - iter 140/146 - loss 0.00822908 - time (sec): 9.23 - samples/sec: 4587.58 - lr: 0.000000 - momentum: 0.000000
2023-10-25 20:47:13,961 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:13,961 EPOCH 10 done: loss 0.0079 - lr: 0.000000
2023-10-25 20:47:14,874 DEV : loss 0.16437558829784393 - f1-score (micro avg) 0.7414
2023-10-25 20:47:15,407 ----------------------------------------------------------------------------------------------------
2023-10-25 20:47:15,408 Loading model from best epoch ...
2023-10-25 20:47:17,312 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
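The 17 tags in the dictionary above are the BIOES encoding of the corpus's four entity types plus the outside tag O, which is also why the model's output head is Linear(in_features=768, out_features=17). A quick sketch enumerating that tag set (illustrative, not Flair's internal tag-dictionary code):

```python
# The four NewsEye entity types, in the order they appear in the log line above
entity_types = ["LOC", "PER", "ORG", "HumanProd"]

# BIOES: Single, Begin, End, Inside spans for each type, plus "O" for outside
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]

print(len(tags))  # 17
print(tags[:5])   # ['O', 'S-LOC', 'B-LOC', 'E-LOC', 'I-LOC']
```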
2023-10-25 20:47:18,849
Results:
- F-score (micro) 0.7801
- F-score (macro) 0.7068
- Accuracy 0.6671

By class:
              precision    recall  f1-score   support

         PER     0.7984    0.8649    0.8303       348
         LOC     0.7230    0.8199    0.7684       261
         ORG     0.5319    0.4808    0.5051        52
   HumanProd     0.6800    0.7727    0.7234        22

   micro avg     0.7477    0.8155    0.7801       683
   macro avg     0.6833    0.7346    0.7068       683
weighted avg     0.7455    0.8155    0.7785       683
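The aggregate rows above follow directly from the per-class rows; a small sanity check recomputing the macro, weighted, and micro F1 from the logged numbers (the last digit of the weighted average shifts slightly because the per-class scores are rounded to four decimals):

```python
# (f1-score, support) per class, copied from the table above
per_class = {
    "PER": (0.8303, 348),
    "LOC": (0.7684, 261),
    "ORG": (0.5051, 52),
    "HumanProd": (0.7234, 22),
}

macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)
total = sum(n for _, n in per_class.values())
weighted_f1 = sum(f1 * n for f1, n in per_class.values()) / total

print(round(macro_f1, 4))     # 0.7068, as logged
print(round(weighted_f1, 4))  # ~0.7784 vs. the logged 0.7785 (rounding)

# Micro F1 is the harmonic mean of micro precision and recall
micro_p, micro_r = 0.7477, 0.8155
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_f1, 4))     # 0.7801, as logged

assert total == 683  # matches the support column
```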
2023-10-25 20:47:18,849 ----------------------------------------------------------------------------------------------------
|