2023-10-18 22:03:52,771 ----------------------------------------------------------------------------------------------------
2023-10-18 22:03:52,771 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
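For scale, the module shapes printed above imply a rough parameter budget. The following is a back-of-the-envelope sketch computed from the printed dimensions only (the `linear` helper and the grouping are illustrative; the exact count reported by the library may differ slightly):

```python
# Rough parameter count for the printed bert-tiny architecture:
# hidden 128, intermediate 512, vocab 32001, 512 positions, 2 layers, 13 tags.
H, I, V, P, T, L, TAGS = 128, 512, 32001, 512, 2, 2, 13

def linear(n_in, n_out):
    """Parameters of a Linear layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out

# word/position/token-type embeddings + LayerNorm (weight + bias)
embeddings = V * H + P * H + T * H + 2 * H

per_layer = (3 * linear(H, H)          # query, key, value projections
             + linear(H, H) + 2 * H    # self-output dense + LayerNorm
             + linear(H, I)            # intermediate (feed-forward up)
             + linear(I, H) + 2 * H)   # output dense + LayerNorm

pooler = linear(H, H)
tagger_head = linear(H, TAGS)          # the final (linear) layer of the tagger

total = embeddings + L * per_layer + pooler + tagger_head
print(f"{total:,}")  # prints 4,576,909 -- roughly 4.6M parameters
```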
2023-10-18 22:03:52,771 ----------------------------------------------------------------------------------------------------
2023-10-18 22:03:52,771 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:03:52,771 ----------------------------------------------------------------------------------------------------
2023-10-18 22:03:52,771 Train: 5777 sentences
2023-10-18 22:03:52,771 (train_with_dev=False, train_with_test=False)
2023-10-18 22:03:52,771 ----------------------------------------------------------------------------------------------------
2023-10-18 22:03:52,771 Training Params:
2023-10-18 22:03:52,771  - learning_rate: "5e-05"
2023-10-18 22:03:52,771  - mini_batch_size: "4"
2023-10-18 22:03:52,771  - max_epochs: "10"
2023-10-18 22:03:52,771  - shuffle: "True"
2023-10-18 22:03:52,771 ----------------------------------------------------------------------------------------------------
2023-10-18 22:03:52,771 Plugins:
2023-10-18 22:03:52,772  - TensorboardLogger
2023-10-18 22:03:52,772  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:03:52,772 ----------------------------------------------------------------------------------------------------
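The LinearScheduler plugin with warmup_fraction '0.1' accounts for the lr column in the iteration lines: with 1445 iterations per epoch over 10 epochs, the learning rate ramps linearly to 5e-05 during the first ~1445 steps and then decays linearly to zero. A minimal sketch of that schedule (the exact step bookkeeping inside Flair's LinearScheduler is an assumption here):

```python
# Linear warmup-then-decay schedule implied by the logged lr values.
PEAK_LR = 5e-5
STEPS_PER_EPOCH = 1445
TOTAL_STEPS = STEPS_PER_EPOCH * 10
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # 1445, i.e. the whole first epoch

def lr_at(step):
    if step < WARMUP_STEPS:
        # linear ramp from 0 up to the peak learning rate
        return PEAK_LR * step / WARMUP_STEPS
    # linear decay from the peak down to 0 at the final step
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

print(lr_at(144))          # ~5e-06, matching the logged lr at epoch 1, iter 144
print(lr_at(WARMUP_STEPS)) # peak 5e-05 at the end of epoch 1
print(lr_at(TOTAL_STEPS))  # 0.0 at the last step, as in epoch 10
```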
2023-10-18 22:03:52,772 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:03:52,772  - metric: "('micro avg', 'f1-score')"
2023-10-18 22:03:52,772 ----------------------------------------------------------------------------------------------------
2023-10-18 22:03:52,772 Computation:
2023-10-18 22:03:52,772  - compute on device: cuda:0
2023-10-18 22:03:52,772  - embedding storage: none
2023-10-18 22:03:52,772 ----------------------------------------------------------------------------------------------------
2023-10-18 22:03:52,772 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 22:03:52,772 ----------------------------------------------------------------------------------------------------
2023-10-18 22:03:52,772 ----------------------------------------------------------------------------------------------------
2023-10-18 22:03:52,772 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 22:03:55,057 epoch 1 - iter 144/1445 - loss 3.04939705 - time (sec): 2.28 - samples/sec: 7554.12 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:03:57,542 epoch 1 - iter 288/1445 - loss 2.46990346 - time (sec): 4.77 - samples/sec: 7491.31 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:03:59,890 epoch 1 - iter 432/1445 - loss 1.90952042 - time (sec): 7.12 - samples/sec: 7505.30 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:04:02,309 epoch 1 - iter 576/1445 - loss 1.53392065 - time (sec): 9.54 - samples/sec: 7435.82 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:04:04,785 epoch 1 - iter 720/1445 - loss 1.28251065 - time (sec): 12.01 - samples/sec: 7413.29 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:04:07,177 epoch 1 - iter 864/1445 - loss 1.11100035 - time (sec): 14.40 - samples/sec: 7439.56 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:04:09,551 epoch 1 - iter 1008/1445 - loss 0.99718788 - time (sec): 16.78 - samples/sec: 7440.32 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:04:11,926 epoch 1 - iter 1152/1445 - loss 0.91067996 - time (sec): 19.15 - samples/sec: 7430.49 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:04:14,262 epoch 1 - iter 1296/1445 - loss 0.84307628 - time (sec): 21.49 - samples/sec: 7380.56 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:04:16,641 epoch 1 - iter 1440/1445 - loss 0.78429039 - time (sec): 23.87 - samples/sec: 7352.74 - lr: 0.000050 - momentum: 0.000000
2023-10-18 22:04:16,738 ----------------------------------------------------------------------------------------------------
2023-10-18 22:04:16,738 EPOCH 1 done: loss 0.7825 - lr: 0.000050
2023-10-18 22:04:17,981 DEV : loss 0.28100287914276123 - f1-score (micro avg) 0.008
2023-10-18 22:04:17,995 saving best model
2023-10-18 22:04:18,025 ----------------------------------------------------------------------------------------------------
2023-10-18 22:04:20,063 epoch 2 - iter 144/1445 - loss 0.21043464 - time (sec): 2.04 - samples/sec: 9133.22 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:04:22,300 epoch 2 - iter 288/1445 - loss 0.22026683 - time (sec): 4.27 - samples/sec: 8371.87 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:04:24,777 epoch 2 - iter 432/1445 - loss 0.22617469 - time (sec): 6.75 - samples/sec: 7975.78 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:04:27,171 epoch 2 - iter 576/1445 - loss 0.21836761 - time (sec): 9.15 - samples/sec: 7834.18 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:04:29,583 epoch 2 - iter 720/1445 - loss 0.20927546 - time (sec): 11.56 - samples/sec: 7685.03 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:04:31,977 epoch 2 - iter 864/1445 - loss 0.20472373 - time (sec): 13.95 - samples/sec: 7706.06 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:04:34,309 epoch 2 - iter 1008/1445 - loss 0.20498386 - time (sec): 16.28 - samples/sec: 7594.15 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:04:36,633 epoch 2 - iter 1152/1445 - loss 0.20281506 - time (sec): 18.61 - samples/sec: 7534.71 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:04:39,002 epoch 2 - iter 1296/1445 - loss 0.20137163 - time (sec): 20.98 - samples/sec: 7520.35 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:04:41,453 epoch 2 - iter 1440/1445 - loss 0.19758146 - time (sec): 23.43 - samples/sec: 7503.41 - lr: 0.000044 - momentum: 0.000000
2023-10-18 22:04:41,525 ----------------------------------------------------------------------------------------------------
2023-10-18 22:04:41,525 EPOCH 2 done: loss 0.1978 - lr: 0.000044
2023-10-18 22:04:43,598 DEV : loss 0.2224675416946411 - f1-score (micro avg) 0.3516
2023-10-18 22:04:43,613 saving best model
2023-10-18 22:04:43,646 ----------------------------------------------------------------------------------------------------
2023-10-18 22:04:46,072 epoch 3 - iter 144/1445 - loss 0.17380532 - time (sec): 2.43 - samples/sec: 7398.74 - lr: 0.000044 - momentum: 0.000000
2023-10-18 22:04:48,415 epoch 3 - iter 288/1445 - loss 0.17717372 - time (sec): 4.77 - samples/sec: 7288.23 - lr: 0.000043 - momentum: 0.000000
2023-10-18 22:04:50,812 epoch 3 - iter 432/1445 - loss 0.17447919 - time (sec): 7.17 - samples/sec: 7288.73 - lr: 0.000043 - momentum: 0.000000
2023-10-18 22:04:53,314 epoch 3 - iter 576/1445 - loss 0.16390738 - time (sec): 9.67 - samples/sec: 7323.27 - lr: 0.000042 - momentum: 0.000000
2023-10-18 22:04:55,740 epoch 3 - iter 720/1445 - loss 0.16608139 - time (sec): 12.09 - samples/sec: 7273.05 - lr: 0.000042 - momentum: 0.000000
2023-10-18 22:04:58,146 epoch 3 - iter 864/1445 - loss 0.16568500 - time (sec): 14.50 - samples/sec: 7274.68 - lr: 0.000041 - momentum: 0.000000
2023-10-18 22:05:00,448 epoch 3 - iter 1008/1445 - loss 0.16698127 - time (sec): 16.80 - samples/sec: 7269.41 - lr: 0.000041 - momentum: 0.000000
2023-10-18 22:05:02,901 epoch 3 - iter 1152/1445 - loss 0.16944510 - time (sec): 19.25 - samples/sec: 7266.35 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:05:05,325 epoch 3 - iter 1296/1445 - loss 0.16690178 - time (sec): 21.68 - samples/sec: 7286.48 - lr: 0.000039 - momentum: 0.000000
2023-10-18 22:05:07,744 epoch 3 - iter 1440/1445 - loss 0.16852499 - time (sec): 24.10 - samples/sec: 7293.93 - lr: 0.000039 - momentum: 0.000000
2023-10-18 22:05:07,820 ----------------------------------------------------------------------------------------------------
2023-10-18 22:05:07,820 EPOCH 3 done: loss 0.1684 - lr: 0.000039
2023-10-18 22:05:09,583 DEV : loss 0.2110542505979538 - f1-score (micro avg) 0.4304
2023-10-18 22:05:09,598 saving best model
2023-10-18 22:05:09,635 ----------------------------------------------------------------------------------------------------
2023-10-18 22:05:12,027 epoch 4 - iter 144/1445 - loss 0.14271915 - time (sec): 2.39 - samples/sec: 7375.33 - lr: 0.000038 - momentum: 0.000000
2023-10-18 22:05:14,394 epoch 4 - iter 288/1445 - loss 0.14150049 - time (sec): 4.76 - samples/sec: 7247.02 - lr: 0.000038 - momentum: 0.000000
2023-10-18 22:05:16,815 epoch 4 - iter 432/1445 - loss 0.15054176 - time (sec): 7.18 - samples/sec: 7296.89 - lr: 0.000037 - momentum: 0.000000
2023-10-18 22:05:19,296 epoch 4 - iter 576/1445 - loss 0.14754641 - time (sec): 9.66 - samples/sec: 7395.45 - lr: 0.000037 - momentum: 0.000000
2023-10-18 22:05:21,741 epoch 4 - iter 720/1445 - loss 0.14682795 - time (sec): 12.11 - samples/sec: 7318.74 - lr: 0.000036 - momentum: 0.000000
2023-10-18 22:05:24,234 epoch 4 - iter 864/1445 - loss 0.15041203 - time (sec): 14.60 - samples/sec: 7319.96 - lr: 0.000036 - momentum: 0.000000
2023-10-18 22:05:26,655 epoch 4 - iter 1008/1445 - loss 0.15028435 - time (sec): 17.02 - samples/sec: 7345.06 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:05:29,107 epoch 4 - iter 1152/1445 - loss 0.14960320 - time (sec): 19.47 - samples/sec: 7311.67 - lr: 0.000034 - momentum: 0.000000
2023-10-18 22:05:31,509 epoch 4 - iter 1296/1445 - loss 0.14906818 - time (sec): 21.87 - samples/sec: 7257.61 - lr: 0.000034 - momentum: 0.000000
2023-10-18 22:05:33,926 epoch 4 - iter 1440/1445 - loss 0.15225222 - time (sec): 24.29 - samples/sec: 7232.75 - lr: 0.000033 - momentum: 0.000000
2023-10-18 22:05:34,002 ----------------------------------------------------------------------------------------------------
2023-10-18 22:05:34,002 EPOCH 4 done: loss 0.1522 - lr: 0.000033
2023-10-18 22:05:35,756 DEV : loss 0.1968742311000824 - f1-score (micro avg) 0.4964
2023-10-18 22:05:35,770 saving best model
2023-10-18 22:05:35,804 ----------------------------------------------------------------------------------------------------
2023-10-18 22:05:38,269 epoch 5 - iter 144/1445 - loss 0.14958451 - time (sec): 2.46 - samples/sec: 7382.50 - lr: 0.000033 - momentum: 0.000000
2023-10-18 22:05:40,651 epoch 5 - iter 288/1445 - loss 0.14618727 - time (sec): 4.85 - samples/sec: 7416.12 - lr: 0.000032 - momentum: 0.000000
2023-10-18 22:05:43,002 epoch 5 - iter 432/1445 - loss 0.14296263 - time (sec): 7.20 - samples/sec: 7182.72 - lr: 0.000032 - momentum: 0.000000
2023-10-18 22:05:45,363 epoch 5 - iter 576/1445 - loss 0.14379031 - time (sec): 9.56 - samples/sec: 7143.56 - lr: 0.000031 - momentum: 0.000000
2023-10-18 22:05:47,696 epoch 5 - iter 720/1445 - loss 0.14137190 - time (sec): 11.89 - samples/sec: 7147.85 - lr: 0.000031 - momentum: 0.000000
2023-10-18 22:05:49,856 epoch 5 - iter 864/1445 - loss 0.13978700 - time (sec): 14.05 - samples/sec: 7383.74 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:05:52,257 epoch 5 - iter 1008/1445 - loss 0.13831696 - time (sec): 16.45 - samples/sec: 7388.86 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:05:54,628 epoch 5 - iter 1152/1445 - loss 0.13684637 - time (sec): 18.82 - samples/sec: 7394.68 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:05:57,044 epoch 5 - iter 1296/1445 - loss 0.13881762 - time (sec): 21.24 - samples/sec: 7387.41 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:05:59,456 epoch 5 - iter 1440/1445 - loss 0.13776441 - time (sec): 23.65 - samples/sec: 7424.32 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:05:59,530 ----------------------------------------------------------------------------------------------------
2023-10-18 22:05:59,530 EPOCH 5 done: loss 0.1379 - lr: 0.000028
2023-10-18 22:06:01,634 DEV : loss 0.19495084881782532 - f1-score (micro avg) 0.4916
2023-10-18 22:06:01,648 ----------------------------------------------------------------------------------------------------
2023-10-18 22:06:04,062 epoch 6 - iter 144/1445 - loss 0.12594247 - time (sec): 2.41 - samples/sec: 7059.98 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:06:06,473 epoch 6 - iter 288/1445 - loss 0.12652964 - time (sec): 4.82 - samples/sec: 7119.27 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:06:08,883 epoch 6 - iter 432/1445 - loss 0.13827789 - time (sec): 7.23 - samples/sec: 7205.93 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:06:11,250 epoch 6 - iter 576/1445 - loss 0.13760486 - time (sec): 9.60 - samples/sec: 7125.14 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:06:13,698 epoch 6 - iter 720/1445 - loss 0.13357783 - time (sec): 12.05 - samples/sec: 7181.94 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:06:15,879 epoch 6 - iter 864/1445 - loss 0.13088966 - time (sec): 14.23 - samples/sec: 7278.22 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:06:18,408 epoch 6 - iter 1008/1445 - loss 0.13165646 - time (sec): 16.76 - samples/sec: 7320.61 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:06:20,872 epoch 6 - iter 1152/1445 - loss 0.13027607 - time (sec): 19.22 - samples/sec: 7321.77 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:06:23,276 epoch 6 - iter 1296/1445 - loss 0.13119916 - time (sec): 21.63 - samples/sec: 7356.37 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:06:25,537 epoch 6 - iter 1440/1445 - loss 0.12967888 - time (sec): 23.89 - samples/sec: 7346.84 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:06:25,615 ----------------------------------------------------------------------------------------------------
2023-10-18 22:06:25,615 EPOCH 6 done: loss 0.1292 - lr: 0.000022
2023-10-18 22:06:27,398 DEV : loss 0.18579889833927155 - f1-score (micro avg) 0.5156
2023-10-18 22:06:27,413 saving best model
2023-10-18 22:06:27,451 ----------------------------------------------------------------------------------------------------
2023-10-18 22:06:29,928 epoch 7 - iter 144/1445 - loss 0.12681691 - time (sec): 2.48 - samples/sec: 6824.01 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:06:32,363 epoch 7 - iter 288/1445 - loss 0.12846218 - time (sec): 4.91 - samples/sec: 7176.17 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:06:34,781 epoch 7 - iter 432/1445 - loss 0.12642328 - time (sec): 7.33 - samples/sec: 7181.57 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:06:37,197 epoch 7 - iter 576/1445 - loss 0.12739278 - time (sec): 9.75 - samples/sec: 7151.91 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:06:39,615 epoch 7 - iter 720/1445 - loss 0.12540456 - time (sec): 12.16 - samples/sec: 7118.98 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:06:41,962 epoch 7 - iter 864/1445 - loss 0.12415945 - time (sec): 14.51 - samples/sec: 7255.64 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:06:44,292 epoch 7 - iter 1008/1445 - loss 0.12210391 - time (sec): 16.84 - samples/sec: 7280.44 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:06:46,633 epoch 7 - iter 1152/1445 - loss 0.12369396 - time (sec): 19.18 - samples/sec: 7241.90 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:06:49,126 epoch 7 - iter 1296/1445 - loss 0.12424925 - time (sec): 21.67 - samples/sec: 7250.06 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:06:51,604 epoch 7 - iter 1440/1445 - loss 0.12294260 - time (sec): 24.15 - samples/sec: 7269.76 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:06:51,682 ----------------------------------------------------------------------------------------------------
2023-10-18 22:06:51,682 EPOCH 7 done: loss 0.1227 - lr: 0.000017
2023-10-18 22:06:53,446 DEV : loss 0.20053215324878693 - f1-score (micro avg) 0.5252
2023-10-18 22:06:53,460 saving best model
2023-10-18 22:06:53,497 ----------------------------------------------------------------------------------------------------
2023-10-18 22:06:55,846 epoch 8 - iter 144/1445 - loss 0.11750770 - time (sec): 2.35 - samples/sec: 6864.78 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:06:58,242 epoch 8 - iter 288/1445 - loss 0.13673625 - time (sec): 4.74 - samples/sec: 7200.46 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:07:00,683 epoch 8 - iter 432/1445 - loss 0.12507159 - time (sec): 7.19 - samples/sec: 7375.94 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:07:02,957 epoch 8 - iter 576/1445 - loss 0.11956339 - time (sec): 9.46 - samples/sec: 7447.88 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:07:05,102 epoch 8 - iter 720/1445 - loss 0.11902581 - time (sec): 11.60 - samples/sec: 7600.71 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:07:07,541 epoch 8 - iter 864/1445 - loss 0.11620292 - time (sec): 14.04 - samples/sec: 7594.28 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:07:09,913 epoch 8 - iter 1008/1445 - loss 0.11605621 - time (sec): 16.41 - samples/sec: 7505.75 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:07:12,348 epoch 8 - iter 1152/1445 - loss 0.11566555 - time (sec): 18.85 - samples/sec: 7491.91 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:07:14,713 epoch 8 - iter 1296/1445 - loss 0.11583183 - time (sec): 21.21 - samples/sec: 7453.11 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:07:17,186 epoch 8 - iter 1440/1445 - loss 0.11792634 - time (sec): 23.69 - samples/sec: 7421.99 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:07:17,260 ----------------------------------------------------------------------------------------------------
2023-10-18 22:07:17,260 EPOCH 8 done: loss 0.1178 - lr: 0.000011
2023-10-18 22:07:19,373 DEV : loss 0.18764129281044006 - f1-score (micro avg) 0.5458
2023-10-18 22:07:19,389 saving best model
2023-10-18 22:07:19,428 ----------------------------------------------------------------------------------------------------
2023-10-18 22:07:21,897 epoch 9 - iter 144/1445 - loss 0.10329269 - time (sec): 2.47 - samples/sec: 7899.55 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:07:24,301 epoch 9 - iter 288/1445 - loss 0.09813997 - time (sec): 4.87 - samples/sec: 7546.16 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:07:26,690 epoch 9 - iter 432/1445 - loss 0.10157801 - time (sec): 7.26 - samples/sec: 7367.73 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:07:29,085 epoch 9 - iter 576/1445 - loss 0.10788251 - time (sec): 9.66 - samples/sec: 7326.64 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:07:31,532 epoch 9 - iter 720/1445 - loss 0.11093889 - time (sec): 12.10 - samples/sec: 7305.87 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:07:33,924 epoch 9 - iter 864/1445 - loss 0.11257784 - time (sec): 14.49 - samples/sec: 7226.37 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:07:36,286 epoch 9 - iter 1008/1445 - loss 0.11285488 - time (sec): 16.86 - samples/sec: 7180.55 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:07:38,790 epoch 9 - iter 1152/1445 - loss 0.11212460 - time (sec): 19.36 - samples/sec: 7272.26 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:07:41,302 epoch 9 - iter 1296/1445 - loss 0.11292936 - time (sec): 21.87 - samples/sec: 7243.80 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:07:43,658 epoch 9 - iter 1440/1445 - loss 0.11176452 - time (sec): 24.23 - samples/sec: 7247.72 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:07:43,733 ----------------------------------------------------------------------------------------------------
2023-10-18 22:07:43,733 EPOCH 9 done: loss 0.1117 - lr: 0.000006
2023-10-18 22:07:45,510 DEV : loss 0.20176102221012115 - f1-score (micro avg) 0.5462
2023-10-18 22:07:45,524 saving best model
2023-10-18 22:07:45,559 ----------------------------------------------------------------------------------------------------
2023-10-18 22:07:47,849 epoch 10 - iter 144/1445 - loss 0.09108433 - time (sec): 2.29 - samples/sec: 7462.73 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:07:50,252 epoch 10 - iter 288/1445 - loss 0.10968716 - time (sec): 4.69 - samples/sec: 7228.07 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:07:52,748 epoch 10 - iter 432/1445 - loss 0.10851551 - time (sec): 7.19 - samples/sec: 7260.53 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:07:55,252 epoch 10 - iter 576/1445 - loss 0.10952495 - time (sec): 9.69 - samples/sec: 7164.42 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:07:57,710 epoch 10 - iter 720/1445 - loss 0.11436161 - time (sec): 12.15 - samples/sec: 7271.10 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:08:00,175 epoch 10 - iter 864/1445 - loss 0.11388696 - time (sec): 14.62 - samples/sec: 7213.24 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:08:02,562 epoch 10 - iter 1008/1445 - loss 0.11149954 - time (sec): 17.00 - samples/sec: 7271.00 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:08:05,074 epoch 10 - iter 1152/1445 - loss 0.10999310 - time (sec): 19.51 - samples/sec: 7223.57 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:08:07,510 epoch 10 - iter 1296/1445 - loss 0.10865152 - time (sec): 21.95 - samples/sec: 7186.58 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:08:09,894 epoch 10 - iter 1440/1445 - loss 0.10931842 - time (sec): 24.33 - samples/sec: 7216.25 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:08:09,974 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:09,974 EPOCH 10 done: loss 0.1095 - lr: 0.000000
2023-10-18 22:08:11,764 DEV : loss 0.19709280133247375 - f1-score (micro avg) 0.5572
2023-10-18 22:08:11,780 saving best model
2023-10-18 22:08:11,847 ----------------------------------------------------------------------------------------------------
2023-10-18 22:08:11,847 Loading model from best epoch ...
2023-10-18 22:08:11,928 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
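The 13 tags follow the BIOES scheme: S marks a single-token entity, B/I/E mark the begin, inside, and end of multi-token spans, and O is outside any entity. A simplified decoder from tag sequences to spans, for illustration only (not Flair's own span extraction):

```python
# Decode a BIOES tag sequence into (start, end, label) entity spans.
def bioes_spans(tags):
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None          # outside any entity; drop unfinished spans
            continue
        prefix, label = tag.split("-")
        if prefix == "S":         # single-token entity
            spans.append((i, i, label))
            start = None
        elif prefix == "B":       # begin a multi-token entity
            start = i
        elif prefix == "E" and start is not None:  # close an open entity
            spans.append((start, i, label))
            start = None
    return spans

print(bioes_spans(["B-LOC", "E-LOC", "O", "S-PER"]))  # [(0, 1, 'LOC'), (3, 3, 'PER')]
```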
2023-10-18 22:08:13,268

Results:
- F-score (micro) 0.5566
- F-score (macro) 0.3924
- Accuracy 0.3956

By class:
              precision    recall  f1-score   support

         LOC     0.5992    0.6725    0.6337       458
         PER     0.5654    0.4751    0.5163       482
         ORG     0.2000    0.0145    0.0270        69

   micro avg     0.5823    0.5332    0.5566      1009
   macro avg     0.4549    0.3874    0.3924      1009
weighted avg     0.5558    0.5332    0.5362      1009
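The micro and macro averages follow directly from the per-class rows. A quick cross-check (the true-positive and predicted counts below are reconstructed from the reported precision, recall, and support; that reconstruction is an assumption, since the log prints only the ratios):

```python
# Per-class counts reconstructed from the table:
# name: (true_positives, predicted_spans, support)
classes = {
    "LOC": (308, 514, 458),
    "PER": (229, 405, 482),
    "ORG": (1, 5, 69),
}

def prf(tp, pred, sup):
    """Precision, recall, and F1 from raw span counts."""
    p = tp / pred if pred else 0.0
    r = tp / sup if sup else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Micro average: pool the counts across classes, then compute P/R/F1.
tp = sum(c[0] for c in classes.values())
pred = sum(c[1] for c in classes.values())
sup = sum(c[2] for c in classes.values())
micro_p, micro_r, micro_f = prf(tp, pred, sup)

# Macro average: unweighted mean of the per-class F1 scores.
macro_f = sum(prf(*c)[2] for c in classes.values()) / len(classes)

print(round(micro_p, 4), round(micro_r, 4), round(micro_f, 4))  # 0.5823 0.5332 0.5566
print(round(macro_f, 4))                                        # 0.3924
```

Micro averaging is dominated by the frequent LOC and PER classes, while the macro F-score is pulled down by the near-zero ORG row, which is why the two headline numbers differ so sharply.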
2023-10-18 22:08:13,268 ----------------------------------------------------------------------------------------------------