|
2023-10-19 20:43:33,494 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:43:33,494 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 128) |
|
(position_embeddings): Embedding(512, 128) |
|
(token_type_embeddings): Embedding(2, 128) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-1): 2 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=128, out_features=128, bias=True) |
|
(key): Linear(in_features=128, out_features=128, bias=True) |
|
(value): Linear(in_features=128, out_features=128, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=128, out_features=512, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=512, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=128, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-19 20:43:33,494 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:43:33,494 MultiCorpus: 7142 train + 698 dev + 2570 test sentences |
|
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator |
|
2023-10-19 20:43:33,494 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:43:33,494 Train: 7142 sentences |
|
2023-10-19 20:43:33,494 (train_with_dev=False, train_with_test=False) |
|
2023-10-19 20:43:33,494 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:43:33,494 Training Params: |
|
2023-10-19 20:43:33,494 - learning_rate: "3e-05" |
|
2023-10-19 20:43:33,494 - mini_batch_size: "8" |
|
2023-10-19 20:43:33,494 - max_epochs: "10" |
|
2023-10-19 20:43:33,494 - shuffle: "True" |
|
2023-10-19 20:43:33,494 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:43:33,494 Plugins: |
|
2023-10-19 20:43:33,494 - TensorboardLogger |
|
2023-10-19 20:43:33,495 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-19 20:43:33,495 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:43:33,495 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-19 20:43:33,495 - metric: "('micro avg', 'f1-score')" |
|
2023-10-19 20:43:33,495 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:43:33,495 Computation: |
|
2023-10-19 20:43:33,495 - compute on device: cuda:0 |
|
2023-10-19 20:43:33,495 - embedding storage: none |
|
2023-10-19 20:43:33,495 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:43:33,495 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-19 20:43:33,495 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:43:33,495 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:43:33,495 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-19 20:43:35,897 epoch 1 - iter 89/893 - loss 3.36474458 - time (sec): 2.40 - samples/sec: 10175.30 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 20:43:38,077 epoch 1 - iter 178/893 - loss 3.17615433 - time (sec): 4.58 - samples/sec: 10898.86 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 20:43:40,451 epoch 1 - iter 267/893 - loss 2.85791311 - time (sec): 6.96 - samples/sec: 10838.18 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 20:43:42,902 epoch 1 - iter 356/893 - loss 2.46419436 - time (sec): 9.41 - samples/sec: 10877.89 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 20:43:45,279 epoch 1 - iter 445/893 - loss 2.16650043 - time (sec): 11.78 - samples/sec: 10735.41 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 20:43:47,679 epoch 1 - iter 534/893 - loss 1.94859051 - time (sec): 14.18 - samples/sec: 10578.07 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 20:43:49,914 epoch 1 - iter 623/893 - loss 1.78202357 - time (sec): 16.42 - samples/sec: 10559.06 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 20:43:52,180 epoch 1 - iter 712/893 - loss 1.63765987 - time (sec): 18.68 - samples/sec: 10606.38 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 20:43:54,521 epoch 1 - iter 801/893 - loss 1.52172449 - time (sec): 21.03 - samples/sec: 10633.93 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 20:43:56,809 epoch 1 - iter 890/893 - loss 1.42986163 - time (sec): 23.31 - samples/sec: 10651.98 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-19 20:43:56,876 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:43:56,876 EPOCH 1 done: loss 1.4288 - lr: 0.000030 |
|
2023-10-19 20:43:57,836 DEV : loss 0.3583933711051941 - f1-score (micro avg) 0.0051 |
|
2023-10-19 20:43:57,851 saving best model |
|
2023-10-19 20:43:57,889 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:44:00,779 epoch 2 - iter 89/893 - loss 0.50752212 - time (sec): 2.89 - samples/sec: 9054.44 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-19 20:44:03,108 epoch 2 - iter 178/893 - loss 0.51365527 - time (sec): 5.22 - samples/sec: 9674.08 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 20:44:05,332 epoch 2 - iter 267/893 - loss 0.49925370 - time (sec): 7.44 - samples/sec: 9957.91 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 20:44:07,612 epoch 2 - iter 356/893 - loss 0.50135216 - time (sec): 9.72 - samples/sec: 10323.96 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 20:44:09,914 epoch 2 - iter 445/893 - loss 0.49100548 - time (sec): 12.02 - samples/sec: 10421.90 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 20:44:12,264 epoch 2 - iter 534/893 - loss 0.49304291 - time (sec): 14.37 - samples/sec: 10461.92 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 20:44:14,568 epoch 2 - iter 623/893 - loss 0.48690305 - time (sec): 16.68 - samples/sec: 10441.80 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 20:44:16,832 epoch 2 - iter 712/893 - loss 0.48141983 - time (sec): 18.94 - samples/sec: 10511.26 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 20:44:19,075 epoch 2 - iter 801/893 - loss 0.47901259 - time (sec): 21.19 - samples/sec: 10580.49 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 20:44:21,449 epoch 2 - iter 890/893 - loss 0.47294479 - time (sec): 23.56 - samples/sec: 10535.56 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 20:44:21,521 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:44:21,521 EPOCH 2 done: loss 0.4730 - lr: 0.000027 |
|
2023-10-19 20:44:23,836 DEV : loss 0.2727677524089813 - f1-score (micro avg) 0.2829 |
|
2023-10-19 20:44:23,851 saving best model |
|
2023-10-19 20:44:23,886 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:44:26,009 epoch 3 - iter 89/893 - loss 0.40319399 - time (sec): 2.12 - samples/sec: 10892.85 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 20:44:28,265 epoch 3 - iter 178/893 - loss 0.40080107 - time (sec): 4.38 - samples/sec: 10990.83 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 20:44:30,597 epoch 3 - iter 267/893 - loss 0.40130315 - time (sec): 6.71 - samples/sec: 10909.01 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 20:44:32,841 epoch 3 - iter 356/893 - loss 0.41361765 - time (sec): 8.95 - samples/sec: 11068.60 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 20:44:35,076 epoch 3 - iter 445/893 - loss 0.41501688 - time (sec): 11.19 - samples/sec: 11039.33 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 20:44:37,352 epoch 3 - iter 534/893 - loss 0.40577513 - time (sec): 13.47 - samples/sec: 11045.65 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 20:44:39,618 epoch 3 - iter 623/893 - loss 0.40181355 - time (sec): 15.73 - samples/sec: 11042.70 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 20:44:41,883 epoch 3 - iter 712/893 - loss 0.39672146 - time (sec): 18.00 - samples/sec: 11058.98 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 20:44:44,232 epoch 3 - iter 801/893 - loss 0.39109279 - time (sec): 20.34 - samples/sec: 11006.14 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 20:44:46,551 epoch 3 - iter 890/893 - loss 0.38856290 - time (sec): 22.66 - samples/sec: 10920.50 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 20:44:46,631 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:44:46,631 EPOCH 3 done: loss 0.3883 - lr: 0.000023 |
|
2023-10-19 20:44:49,500 DEV : loss 0.24324576556682587 - f1-score (micro avg) 0.3648 |
|
2023-10-19 20:44:49,514 saving best model |
|
2023-10-19 20:44:49,549 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:44:51,761 epoch 4 - iter 89/893 - loss 0.35462626 - time (sec): 2.21 - samples/sec: 11137.81 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 20:44:54,078 epoch 4 - iter 178/893 - loss 0.36122181 - time (sec): 4.53 - samples/sec: 10919.07 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 20:44:56,454 epoch 4 - iter 267/893 - loss 0.35177194 - time (sec): 6.90 - samples/sec: 10739.01 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 20:44:58,697 epoch 4 - iter 356/893 - loss 0.34922900 - time (sec): 9.15 - samples/sec: 10815.60 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 20:45:00,986 epoch 4 - iter 445/893 - loss 0.34769550 - time (sec): 11.44 - samples/sec: 10824.67 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 20:45:03,332 epoch 4 - iter 534/893 - loss 0.34695275 - time (sec): 13.78 - samples/sec: 10848.88 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 20:45:05,557 epoch 4 - iter 623/893 - loss 0.34866634 - time (sec): 16.01 - samples/sec: 10808.10 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 20:45:07,776 epoch 4 - iter 712/893 - loss 0.34969867 - time (sec): 18.23 - samples/sec: 10890.59 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 20:45:10,037 epoch 4 - iter 801/893 - loss 0.34854804 - time (sec): 20.49 - samples/sec: 10819.44 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 20:45:12,308 epoch 4 - iter 890/893 - loss 0.34644979 - time (sec): 22.76 - samples/sec: 10892.76 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 20:45:12,389 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:45:12,389 EPOCH 4 done: loss 0.3462 - lr: 0.000020 |
|
2023-10-19 20:45:14,748 DEV : loss 0.22785429656505585 - f1-score (micro avg) 0.4258 |
|
2023-10-19 20:45:14,761 saving best model |
|
2023-10-19 20:45:14,795 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:45:17,692 epoch 5 - iter 89/893 - loss 0.30613893 - time (sec): 2.90 - samples/sec: 8417.95 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 20:45:20,016 epoch 5 - iter 178/893 - loss 0.31878599 - time (sec): 5.22 - samples/sec: 9318.62 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 20:45:22,287 epoch 5 - iter 267/893 - loss 0.32562746 - time (sec): 7.49 - samples/sec: 9741.67 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 20:45:24,696 epoch 5 - iter 356/893 - loss 0.32506788 - time (sec): 9.90 - samples/sec: 9933.23 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 20:45:27,024 epoch 5 - iter 445/893 - loss 0.32782798 - time (sec): 12.23 - samples/sec: 10132.41 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 20:45:29,348 epoch 5 - iter 534/893 - loss 0.32504534 - time (sec): 14.55 - samples/sec: 10226.49 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 20:45:31,591 epoch 5 - iter 623/893 - loss 0.32566922 - time (sec): 16.80 - samples/sec: 10324.80 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 20:45:33,942 epoch 5 - iter 712/893 - loss 0.32624930 - time (sec): 19.15 - samples/sec: 10378.50 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 20:45:36,177 epoch 5 - iter 801/893 - loss 0.32188357 - time (sec): 21.38 - samples/sec: 10437.64 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 20:45:38,426 epoch 5 - iter 890/893 - loss 0.31978773 - time (sec): 23.63 - samples/sec: 10491.34 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 20:45:38,494 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:45:38,494 EPOCH 5 done: loss 0.3198 - lr: 0.000017 |
|
2023-10-19 20:45:40,829 DEV : loss 0.2165321707725525 - f1-score (micro avg) 0.4348 |
|
2023-10-19 20:45:40,842 saving best model |
|
2023-10-19 20:45:40,877 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:45:43,162 epoch 6 - iter 89/893 - loss 0.29571136 - time (sec): 2.28 - samples/sec: 10903.85 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 20:45:45,372 epoch 6 - iter 178/893 - loss 0.29524236 - time (sec): 4.49 - samples/sec: 10715.11 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 20:45:47,635 epoch 6 - iter 267/893 - loss 0.29256224 - time (sec): 6.76 - samples/sec: 10645.61 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 20:45:49,766 epoch 6 - iter 356/893 - loss 0.29383457 - time (sec): 8.89 - samples/sec: 10853.24 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 20:45:51,624 epoch 6 - iter 445/893 - loss 0.29565709 - time (sec): 10.75 - samples/sec: 11227.12 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 20:45:53,510 epoch 6 - iter 534/893 - loss 0.29858375 - time (sec): 12.63 - samples/sec: 11517.79 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 20:45:55,391 epoch 6 - iter 623/893 - loss 0.30036360 - time (sec): 14.51 - samples/sec: 11780.03 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 20:45:57,312 epoch 6 - iter 712/893 - loss 0.30287927 - time (sec): 16.43 - samples/sec: 11993.36 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 20:45:59,506 epoch 6 - iter 801/893 - loss 0.30185126 - time (sec): 18.63 - samples/sec: 11950.41 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 20:46:01,882 epoch 6 - iter 890/893 - loss 0.30165075 - time (sec): 21.00 - samples/sec: 11795.69 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 20:46:01,962 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:46:01,963 EPOCH 6 done: loss 0.3011 - lr: 0.000013 |
|
2023-10-19 20:46:04,835 DEV : loss 0.21037089824676514 - f1-score (micro avg) 0.4511 |
|
2023-10-19 20:46:04,849 saving best model |
|
2023-10-19 20:46:04,885 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:46:07,175 epoch 7 - iter 89/893 - loss 0.26471019 - time (sec): 2.29 - samples/sec: 9971.18 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 20:46:09,410 epoch 7 - iter 178/893 - loss 0.28780247 - time (sec): 4.52 - samples/sec: 10412.84 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-19 20:46:11,643 epoch 7 - iter 267/893 - loss 0.28545169 - time (sec): 6.76 - samples/sec: 10719.23 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 20:46:13,994 epoch 7 - iter 356/893 - loss 0.28151839 - time (sec): 9.11 - samples/sec: 10572.64 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 20:46:16,294 epoch 7 - iter 445/893 - loss 0.27983309 - time (sec): 11.41 - samples/sec: 10691.66 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-19 20:46:18,512 epoch 7 - iter 534/893 - loss 0.28086071 - time (sec): 13.63 - samples/sec: 10673.73 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 20:46:20,781 epoch 7 - iter 623/893 - loss 0.28374160 - time (sec): 15.90 - samples/sec: 10588.13 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 20:46:23,091 epoch 7 - iter 712/893 - loss 0.28437410 - time (sec): 18.21 - samples/sec: 10822.49 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 20:46:25,381 epoch 7 - iter 801/893 - loss 0.28490813 - time (sec): 20.50 - samples/sec: 10925.14 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 20:46:27,628 epoch 7 - iter 890/893 - loss 0.28586651 - time (sec): 22.74 - samples/sec: 10905.66 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 20:46:27,702 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:46:27,703 EPOCH 7 done: loss 0.2855 - lr: 0.000010 |
|
2023-10-19 20:46:30,575 DEV : loss 0.20654717087745667 - f1-score (micro avg) 0.464 |
|
2023-10-19 20:46:30,589 saving best model |
|
2023-10-19 20:46:30,625 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:46:32,913 epoch 8 - iter 89/893 - loss 0.25506781 - time (sec): 2.29 - samples/sec: 10313.18 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-19 20:46:35,306 epoch 8 - iter 178/893 - loss 0.26639228 - time (sec): 4.68 - samples/sec: 10626.13 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 20:46:37,561 epoch 8 - iter 267/893 - loss 0.27578597 - time (sec): 6.93 - samples/sec: 10701.28 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 20:46:39,905 epoch 8 - iter 356/893 - loss 0.27223326 - time (sec): 9.28 - samples/sec: 10679.88 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-19 20:46:42,211 epoch 8 - iter 445/893 - loss 0.28095154 - time (sec): 11.59 - samples/sec: 10590.91 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 20:46:44,495 epoch 8 - iter 534/893 - loss 0.28153055 - time (sec): 13.87 - samples/sec: 10632.32 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 20:46:46,785 epoch 8 - iter 623/893 - loss 0.27966004 - time (sec): 16.16 - samples/sec: 10602.39 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 20:46:49,018 epoch 8 - iter 712/893 - loss 0.27728927 - time (sec): 18.39 - samples/sec: 10633.48 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 20:46:51,332 epoch 8 - iter 801/893 - loss 0.27825347 - time (sec): 20.71 - samples/sec: 10759.87 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 20:46:53,591 epoch 8 - iter 890/893 - loss 0.27741145 - time (sec): 22.97 - samples/sec: 10801.42 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-19 20:46:53,670 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:46:53,670 EPOCH 8 done: loss 0.2780 - lr: 0.000007 |
|
2023-10-19 20:46:56,021 DEV : loss 0.20388463139533997 - f1-score (micro avg) 0.4581 |
|
2023-10-19 20:46:56,036 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:46:58,269 epoch 9 - iter 89/893 - loss 0.29014224 - time (sec): 2.23 - samples/sec: 10907.47 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 20:47:00,619 epoch 9 - iter 178/893 - loss 0.27479581 - time (sec): 4.58 - samples/sec: 10701.62 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 20:47:02,913 epoch 9 - iter 267/893 - loss 0.27971361 - time (sec): 6.88 - samples/sec: 10646.54 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 20:47:05,244 epoch 9 - iter 356/893 - loss 0.28509490 - time (sec): 9.21 - samples/sec: 10723.60 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 20:47:07,540 epoch 9 - iter 445/893 - loss 0.28257099 - time (sec): 11.50 - samples/sec: 10926.65 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 20:47:09,753 epoch 9 - iter 534/893 - loss 0.27725015 - time (sec): 13.72 - samples/sec: 10893.55 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-19 20:47:11,990 epoch 9 - iter 623/893 - loss 0.27539239 - time (sec): 15.95 - samples/sec: 10942.87 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-19 20:47:14,219 epoch 9 - iter 712/893 - loss 0.27405143 - time (sec): 18.18 - samples/sec: 10918.96 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-19 20:47:16,474 epoch 9 - iter 801/893 - loss 0.27273670 - time (sec): 20.44 - samples/sec: 10943.50 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-19 20:47:18,688 epoch 9 - iter 890/893 - loss 0.27246462 - time (sec): 22.65 - samples/sec: 10943.08 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 20:47:18,765 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:47:18,765 EPOCH 9 done: loss 0.2723 - lr: 0.000003 |
|
2023-10-19 20:47:21,657 DEV : loss 0.20137684047222137 - f1-score (micro avg) 0.4607 |
|
2023-10-19 20:47:21,672 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:47:23,925 epoch 10 - iter 89/893 - loss 0.25566250 - time (sec): 2.25 - samples/sec: 11641.70 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 20:47:26,170 epoch 10 - iter 178/893 - loss 0.25859874 - time (sec): 4.50 - samples/sec: 11516.21 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 20:47:28,406 epoch 10 - iter 267/893 - loss 0.26290661 - time (sec): 6.73 - samples/sec: 11582.63 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 20:47:30,601 epoch 10 - iter 356/893 - loss 0.26625566 - time (sec): 8.93 - samples/sec: 11474.83 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 20:47:32,957 epoch 10 - iter 445/893 - loss 0.26570550 - time (sec): 11.28 - samples/sec: 11277.42 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-19 20:47:35,213 epoch 10 - iter 534/893 - loss 0.26688781 - time (sec): 13.54 - samples/sec: 11193.53 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 20:47:37,553 epoch 10 - iter 623/893 - loss 0.26448392 - time (sec): 15.88 - samples/sec: 11081.52 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 20:47:39,817 epoch 10 - iter 712/893 - loss 0.26489187 - time (sec): 18.14 - samples/sec: 11013.74 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-19 20:47:42,115 epoch 10 - iter 801/893 - loss 0.26576664 - time (sec): 20.44 - samples/sec: 10938.03 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-19 20:47:44,475 epoch 10 - iter 890/893 - loss 0.26664035 - time (sec): 22.80 - samples/sec: 10860.27 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-19 20:47:44,559 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:47:44,559 EPOCH 10 done: loss 0.2669 - lr: 0.000000 |
|
2023-10-19 20:47:47,422 DEV : loss 0.2013109028339386 - f1-score (micro avg) 0.4648 |
|
2023-10-19 20:47:47,436 saving best model |
|
2023-10-19 20:47:47,496 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 20:47:47,497 Loading model from best epoch ... |
|
2023-10-19 20:47:47,573 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-19 20:47:52,296 |
|
Results: |
|
- F-score (micro) 0.3701 |
|
- F-score (macro) 0.2049 |
|
- Accuracy 0.2348 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.3853    0.4630    0.4206      1095 |
|
         PER     0.3546    0.4447    0.3946      1012 |
|
         ORG     0.0105    0.0028    0.0044       357 |
|
   HumanProd     0.0000    0.0000    0.0000        33 |
|
|
|
   micro avg     0.3575    0.3837    0.3701      2497 |
|
   macro avg     0.1876    0.2276    0.2049      2497 |
|
weighted avg     0.3142    0.3837    0.3450      2497 |
|
|
|
2023-10-19 20:47:52,296 ---------------------------------------------------------------------------------------------------- |
|
|