|
2023-10-15 03:04:59,576 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:04:59,577 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-15 03:04:59,577 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:04:59,578 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-15 03:04:59,578 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:04:59,578 Train: 3575 sentences |
|
2023-10-15 03:04:59,578 (train_with_dev=False, train_with_test=False) |
|
2023-10-15 03:04:59,578 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:04:59,578 Training Params: |
|
2023-10-15 03:04:59,578 - learning_rate: "0.00015" |
|
2023-10-15 03:04:59,578 - mini_batch_size: "8" |
|
2023-10-15 03:04:59,578 - max_epochs: "10" |
|
2023-10-15 03:04:59,578 - shuffle: "True" |
|
2023-10-15 03:04:59,578 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:04:59,578 Plugins: |
|
2023-10-15 03:04:59,578 - TensorboardLogger |
|
2023-10-15 03:04:59,578 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-15 03:04:59,578 ---------------------------------------------------------------------------------------------------- |
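The `LinearScheduler` plugin with `warmup_fraction: '0.1'` explains the `lr` column in the per-iteration lines that follow: the rate climbs toward the configured `learning_rate` during roughly the first epoch and then decays toward zero. A minimal sketch of that shape, assuming the common linear warmup followed by linear decay and assuming 447 iterations per epoch over 10 epochs as the total step count (the helper name `linear_warmup_decay` and the exact step accounting are illustrative, not taken from Flair's source):

```python
def linear_warmup_decay(step: int, total_steps: int, peak_lr: float,
                        warmup_fraction: float) -> float:
    """Linear ramp from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale the peak rate by the fraction of warmup completed.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale the peak rate by the fraction of post-warmup steps left.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values read from the log header; TOTAL_STEPS is an assumption (447 iters x 10 epochs).
ITERS_PER_EPOCH = 447
MAX_EPOCHS = 10
PEAK_LR = 0.00015
WARMUP_FRACTION = 0.1
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS

for epoch in range(1, MAX_EPOCHS + 1):
    step = epoch * ITERS_PER_EPOCH
    lr = linear_warmup_decay(step, TOTAL_STEPS, PEAK_LR, WARMUP_FRACTION)
    print(f"end of epoch {epoch:2d}: lr ~ {lr:.6f}")
```

Under these assumptions the warmup ends after the first epoch, which is consistent with the logged rate peaking near 0.00015 around the epoch 1/2 boundary and falling to roughly zero by epoch 10. The logged values may differ slightly in the last digit, since the trainer's exact step count is not shown here.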
|
2023-10-15 03:04:59,578 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-15 03:04:59,578 - metric: "('micro avg', 'f1-score')" |
|
2023-10-15 03:04:59,578 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:04:59,578 Computation: |
|
2023-10-15 03:04:59,578 - compute on device: cuda:0 |
|
2023-10-15 03:04:59,578 - embedding storage: none |
|
2023-10-15 03:04:59,578 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:04:59,578 Model training base path: "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-15 03:04:59,578 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:04:59,578 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:04:59,579 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-15 03:05:15,993 epoch 1 - iter 44/447 - loss 3.01978883 - time (sec): 16.41 - samples/sec: 527.97 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-15 03:05:31,936 epoch 1 - iter 88/447 - loss 3.00133266 - time (sec): 32.36 - samples/sec: 511.64 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-15 03:05:47,512 epoch 1 - iter 132/447 - loss 2.92259371 - time (sec): 47.93 - samples/sec: 521.08 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-15 03:06:03,335 epoch 1 - iter 176/447 - loss 2.78179957 - time (sec): 63.76 - samples/sec: 534.87 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-15 03:06:19,358 epoch 1 - iter 220/447 - loss 2.62597041 - time (sec): 79.78 - samples/sec: 538.56 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-15 03:06:35,950 epoch 1 - iter 264/447 - loss 2.45030473 - time (sec): 96.37 - samples/sec: 541.07 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-15 03:06:51,275 epoch 1 - iter 308/447 - loss 2.29870021 - time (sec): 111.70 - samples/sec: 532.50 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-15 03:07:08,861 epoch 1 - iter 352/447 - loss 2.09892316 - time (sec): 129.28 - samples/sec: 530.92 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-15 03:07:24,037 epoch 1 - iter 396/447 - loss 1.94680522 - time (sec): 144.46 - samples/sec: 530.10 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-15 03:07:39,742 epoch 1 - iter 440/447 - loss 1.79459938 - time (sec): 160.16 - samples/sec: 532.17 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-15 03:07:42,159 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:07:42,159 EPOCH 1 done: loss 1.7747 - lr: 0.000147 |
|
2023-10-15 03:08:06,055 DEV : loss 0.46788281202316284 - f1-score (micro avg) 0.0 |
|
2023-10-15 03:08:06,081 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:08:21,806 epoch 2 - iter 44/447 - loss 0.52250284 - time (sec): 15.72 - samples/sec: 565.95 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-15 03:08:37,461 epoch 2 - iter 88/447 - loss 0.46455257 - time (sec): 31.38 - samples/sec: 551.24 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-15 03:08:52,852 epoch 2 - iter 132/447 - loss 0.45965555 - time (sec): 46.77 - samples/sec: 536.18 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-15 03:09:08,719 epoch 2 - iter 176/447 - loss 0.43668872 - time (sec): 62.64 - samples/sec: 536.73 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-15 03:09:24,008 epoch 2 - iter 220/447 - loss 0.40776519 - time (sec): 77.93 - samples/sec: 538.54 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-15 03:09:39,895 epoch 2 - iter 264/447 - loss 0.39044184 - time (sec): 93.81 - samples/sec: 538.32 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-15 03:09:57,207 epoch 2 - iter 308/447 - loss 0.37910806 - time (sec): 111.13 - samples/sec: 539.10 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-15 03:10:12,303 epoch 2 - iter 352/447 - loss 0.36499181 - time (sec): 126.22 - samples/sec: 537.08 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-15 03:10:27,876 epoch 2 - iter 396/447 - loss 0.34942105 - time (sec): 141.79 - samples/sec: 538.09 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-15 03:10:43,586 epoch 2 - iter 440/447 - loss 0.33846353 - time (sec): 157.50 - samples/sec: 540.65 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-15 03:10:46,019 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:10:46,019 EPOCH 2 done: loss 0.3358 - lr: 0.000134 |
|
2023-10-15 03:11:11,866 DEV : loss 0.221779927611351 - f1-score (micro avg) 0.5672 |
|
2023-10-15 03:11:11,892 saving best model |
|
2023-10-15 03:11:12,500 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:11:28,182 epoch 3 - iter 44/447 - loss 0.21697222 - time (sec): 15.68 - samples/sec: 549.97 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-15 03:11:43,508 epoch 3 - iter 88/447 - loss 0.20839941 - time (sec): 31.01 - samples/sec: 533.33 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-15 03:12:02,109 epoch 3 - iter 132/447 - loss 0.21174967 - time (sec): 49.61 - samples/sec: 546.53 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-15 03:12:18,292 epoch 3 - iter 176/447 - loss 0.20384305 - time (sec): 65.79 - samples/sec: 550.72 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-15 03:12:33,767 epoch 3 - iter 220/447 - loss 0.19929970 - time (sec): 81.27 - samples/sec: 547.92 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-15 03:12:48,479 epoch 3 - iter 264/447 - loss 0.19553507 - time (sec): 95.98 - samples/sec: 538.22 - lr: 0.000124 - momentum: 0.000000 |
|
2023-10-15 03:13:03,626 epoch 3 - iter 308/447 - loss 0.19427403 - time (sec): 111.12 - samples/sec: 535.59 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-15 03:13:19,671 epoch 3 - iter 352/447 - loss 0.18939750 - time (sec): 127.17 - samples/sec: 541.01 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-15 03:13:35,203 epoch 3 - iter 396/447 - loss 0.18730435 - time (sec): 142.70 - samples/sec: 538.73 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-15 03:13:50,772 epoch 3 - iter 440/447 - loss 0.18261588 - time (sec): 158.27 - samples/sec: 538.56 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-15 03:13:53,203 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:13:53,203 EPOCH 3 done: loss 0.1828 - lr: 0.000117 |
|
2023-10-15 03:14:19,555 DEV : loss 0.21196609735488892 - f1-score (micro avg) 0.6645 |
|
2023-10-15 03:14:19,582 saving best model |
|
2023-10-15 03:14:29,099 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:14:44,594 epoch 4 - iter 44/447 - loss 0.12458170 - time (sec): 15.49 - samples/sec: 528.51 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-15 03:15:00,234 epoch 4 - iter 88/447 - loss 0.12489415 - time (sec): 31.13 - samples/sec: 546.25 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-15 03:15:17,300 epoch 4 - iter 132/447 - loss 0.12039860 - time (sec): 48.20 - samples/sec: 549.42 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-15 03:15:33,506 epoch 4 - iter 176/447 - loss 0.11268941 - time (sec): 64.40 - samples/sec: 558.22 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-15 03:15:49,080 epoch 4 - iter 220/447 - loss 0.11047919 - time (sec): 79.98 - samples/sec: 553.19 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-15 03:16:04,739 epoch 4 - iter 264/447 - loss 0.10848879 - time (sec): 95.64 - samples/sec: 554.24 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-15 03:16:19,804 epoch 4 - iter 308/447 - loss 0.10754764 - time (sec): 110.70 - samples/sec: 552.22 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-15 03:16:35,058 epoch 4 - iter 352/447 - loss 0.10601664 - time (sec): 125.96 - samples/sec: 548.80 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-15 03:16:50,082 epoch 4 - iter 396/447 - loss 0.10799045 - time (sec): 140.98 - samples/sec: 546.38 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-15 03:17:05,246 epoch 4 - iter 440/447 - loss 0.10594070 - time (sec): 156.14 - samples/sec: 545.97 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-15 03:17:07,682 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:17:07,682 EPOCH 4 done: loss 0.1050 - lr: 0.000100 |
|
2023-10-15 03:17:33,504 DEV : loss 0.1502165049314499 - f1-score (micro avg) 0.7452 |
|
2023-10-15 03:17:33,531 saving best model |
|
2023-10-15 03:17:37,818 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:17:53,611 epoch 5 - iter 44/447 - loss 0.07166056 - time (sec): 15.79 - samples/sec: 554.12 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-15 03:18:10,004 epoch 5 - iter 88/447 - loss 0.07049452 - time (sec): 32.18 - samples/sec: 561.32 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-15 03:18:25,262 epoch 5 - iter 132/447 - loss 0.06741331 - time (sec): 47.44 - samples/sec: 552.50 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-15 03:18:41,423 epoch 5 - iter 176/447 - loss 0.06793876 - time (sec): 63.60 - samples/sec: 550.92 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-15 03:18:58,865 epoch 5 - iter 220/447 - loss 0.06793416 - time (sec): 81.04 - samples/sec: 549.77 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-15 03:19:14,393 epoch 5 - iter 264/447 - loss 0.07122002 - time (sec): 96.57 - samples/sec: 547.49 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-15 03:19:29,723 epoch 5 - iter 308/447 - loss 0.06996369 - time (sec): 111.90 - samples/sec: 545.54 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-15 03:19:45,050 epoch 5 - iter 352/447 - loss 0.07078724 - time (sec): 127.23 - samples/sec: 542.96 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-15 03:20:00,352 epoch 5 - iter 396/447 - loss 0.06886941 - time (sec): 142.53 - samples/sec: 540.46 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-15 03:20:15,902 epoch 5 - iter 440/447 - loss 0.06749023 - time (sec): 158.08 - samples/sec: 539.13 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-15 03:20:18,377 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:20:18,377 EPOCH 5 done: loss 0.0671 - lr: 0.000084 |
|
2023-10-15 03:20:44,645 DEV : loss 0.16935649514198303 - f1-score (micro avg) 0.7586 |
|
2023-10-15 03:20:44,671 saving best model |
|
2023-10-15 03:20:48,900 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:21:04,483 epoch 6 - iter 44/447 - loss 0.04472399 - time (sec): 15.58 - samples/sec: 547.14 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-15 03:21:21,706 epoch 6 - iter 88/447 - loss 0.05060492 - time (sec): 32.80 - samples/sec: 540.75 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-15 03:21:37,259 epoch 6 - iter 132/447 - loss 0.04694672 - time (sec): 48.36 - samples/sec: 547.01 - lr: 0.000079 - momentum: 0.000000 |
|
2023-10-15 03:21:53,172 epoch 6 - iter 176/447 - loss 0.04332383 - time (sec): 64.27 - samples/sec: 552.15 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-15 03:22:08,300 epoch 6 - iter 220/447 - loss 0.04304184 - time (sec): 79.40 - samples/sec: 545.59 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-15 03:22:24,538 epoch 6 - iter 264/447 - loss 0.04446927 - time (sec): 95.64 - samples/sec: 543.72 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-15 03:22:40,667 epoch 6 - iter 308/447 - loss 0.04546331 - time (sec): 111.76 - samples/sec: 544.78 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-15 03:22:56,084 epoch 6 - iter 352/447 - loss 0.04531105 - time (sec): 127.18 - samples/sec: 543.54 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-15 03:23:11,080 epoch 6 - iter 396/447 - loss 0.04538256 - time (sec): 142.18 - samples/sec: 539.62 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-15 03:23:26,907 epoch 6 - iter 440/447 - loss 0.04478858 - time (sec): 158.00 - samples/sec: 540.52 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-15 03:23:29,259 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:23:29,259 EPOCH 6 done: loss 0.0447 - lr: 0.000067 |
|
2023-10-15 03:23:55,465 DEV : loss 0.19000039994716644 - f1-score (micro avg) 0.7643 |
|
2023-10-15 03:23:55,491 saving best model |
|
2023-10-15 03:23:58,764 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:24:13,720 epoch 7 - iter 44/447 - loss 0.02523236 - time (sec): 14.95 - samples/sec: 533.71 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-15 03:24:29,663 epoch 7 - iter 88/447 - loss 0.02638231 - time (sec): 30.90 - samples/sec: 560.20 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-15 03:24:45,034 epoch 7 - iter 132/447 - loss 0.02697728 - time (sec): 46.27 - samples/sec: 553.82 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-15 03:25:00,846 epoch 7 - iter 176/447 - loss 0.02819881 - time (sec): 62.08 - samples/sec: 555.69 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-15 03:25:18,235 epoch 7 - iter 220/447 - loss 0.02964525 - time (sec): 79.47 - samples/sec: 553.72 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-15 03:25:34,141 epoch 7 - iter 264/447 - loss 0.02964954 - time (sec): 95.37 - samples/sec: 547.81 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-15 03:25:49,642 epoch 7 - iter 308/447 - loss 0.02874052 - time (sec): 110.88 - samples/sec: 546.42 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-15 03:26:05,033 epoch 7 - iter 352/447 - loss 0.02885843 - time (sec): 126.27 - samples/sec: 544.05 - lr: 0.000054 - momentum: 0.000000 |
|
2023-10-15 03:26:20,247 epoch 7 - iter 396/447 - loss 0.03106093 - time (sec): 141.48 - samples/sec: 541.83 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-15 03:26:36,365 epoch 7 - iter 440/447 - loss 0.03077679 - time (sec): 157.60 - samples/sec: 542.12 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-15 03:26:38,706 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:26:38,707 EPOCH 7 done: loss 0.0307 - lr: 0.000050 |
|
2023-10-15 03:27:04,502 DEV : loss 0.19090385735034943 - f1-score (micro avg) 0.7667 |
|
2023-10-15 03:27:04,528 saving best model |
|
2023-10-15 03:27:08,206 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:27:23,975 epoch 8 - iter 44/447 - loss 0.02393544 - time (sec): 15.77 - samples/sec: 531.15 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-15 03:27:41,509 epoch 8 - iter 88/447 - loss 0.03310152 - time (sec): 33.30 - samples/sec: 548.42 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-15 03:27:57,178 epoch 8 - iter 132/447 - loss 0.02942044 - time (sec): 48.97 - samples/sec: 549.60 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-15 03:28:12,622 epoch 8 - iter 176/447 - loss 0.02706905 - time (sec): 64.41 - samples/sec: 548.00 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-15 03:28:27,968 epoch 8 - iter 220/447 - loss 0.02599557 - time (sec): 79.76 - samples/sec: 546.01 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-15 03:28:43,769 epoch 8 - iter 264/447 - loss 0.02513171 - time (sec): 95.56 - samples/sec: 539.38 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-15 03:28:59,570 epoch 8 - iter 308/447 - loss 0.02314944 - time (sec): 111.36 - samples/sec: 541.22 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-15 03:29:15,312 epoch 8 - iter 352/447 - loss 0.02305557 - time (sec): 127.10 - samples/sec: 543.78 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-15 03:29:30,462 epoch 8 - iter 396/447 - loss 0.02275457 - time (sec): 142.25 - samples/sec: 541.33 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-15 03:29:45,730 epoch 8 - iter 440/447 - loss 0.02291247 - time (sec): 157.52 - samples/sec: 540.82 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-15 03:29:48,168 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:29:48,169 EPOCH 8 done: loss 0.0227 - lr: 0.000034 |
|
2023-10-15 03:30:14,304 DEV : loss 0.19714942574501038 - f1-score (micro avg) 0.7611 |
|
2023-10-15 03:30:14,331 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:30:30,530 epoch 9 - iter 44/447 - loss 0.01520226 - time (sec): 16.20 - samples/sec: 548.79 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-15 03:30:46,254 epoch 9 - iter 88/447 - loss 0.01346179 - time (sec): 31.92 - samples/sec: 549.44 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-15 03:31:02,109 epoch 9 - iter 132/447 - loss 0.01312156 - time (sec): 47.78 - samples/sec: 543.20 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-15 03:31:17,445 epoch 9 - iter 176/447 - loss 0.01278497 - time (sec): 63.11 - samples/sec: 541.11 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-15 03:31:32,608 epoch 9 - iter 220/447 - loss 0.01260025 - time (sec): 78.28 - samples/sec: 538.95 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-15 03:31:48,556 epoch 9 - iter 264/447 - loss 0.01287074 - time (sec): 94.22 - samples/sec: 538.71 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-15 03:32:05,035 epoch 9 - iter 308/447 - loss 0.01389175 - time (sec): 110.70 - samples/sec: 543.42 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-15 03:32:20,276 epoch 9 - iter 352/447 - loss 0.01437736 - time (sec): 125.94 - samples/sec: 539.85 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-15 03:32:35,454 epoch 9 - iter 396/447 - loss 0.01477394 - time (sec): 141.12 - samples/sec: 538.22 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-15 03:32:52,774 epoch 9 - iter 440/447 - loss 0.01578011 - time (sec): 158.44 - samples/sec: 538.41 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-15 03:32:55,142 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:32:55,142 EPOCH 9 done: loss 0.0157 - lr: 0.000017 |
|
2023-10-15 03:33:21,058 DEV : loss 0.20480625331401825 - f1-score (micro avg) 0.7619 |
|
2023-10-15 03:33:21,085 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:33:36,258 epoch 10 - iter 44/447 - loss 0.00974946 - time (sec): 15.17 - samples/sec: 540.00 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-15 03:33:52,230 epoch 10 - iter 88/447 - loss 0.01297197 - time (sec): 31.14 - samples/sec: 556.12 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-15 03:34:07,703 epoch 10 - iter 132/447 - loss 0.01147150 - time (sec): 46.62 - samples/sec: 556.70 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-15 03:34:25,028 epoch 10 - iter 176/447 - loss 0.01480113 - time (sec): 63.94 - samples/sec: 554.72 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-15 03:34:40,920 epoch 10 - iter 220/447 - loss 0.01375097 - time (sec): 79.83 - samples/sec: 546.10 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-15 03:34:55,965 epoch 10 - iter 264/447 - loss 0.01369117 - time (sec): 94.88 - samples/sec: 542.48 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-15 03:35:11,409 epoch 10 - iter 308/447 - loss 0.01260225 - time (sec): 110.32 - samples/sec: 543.21 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-15 03:35:26,842 epoch 10 - iter 352/447 - loss 0.01255970 - time (sec): 125.76 - samples/sec: 544.24 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-15 03:35:41,667 epoch 10 - iter 396/447 - loss 0.01280365 - time (sec): 140.58 - samples/sec: 539.59 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-15 03:35:57,666 epoch 10 - iter 440/447 - loss 0.01242779 - time (sec): 156.58 - samples/sec: 543.84 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-15 03:36:00,120 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:36:00,120 EPOCH 10 done: loss 0.0123 - lr: 0.000001 |
|
2023-10-15 03:36:26,213 DEV : loss 0.20496107637882233 - f1-score (micro avg) 0.756 |
|
2023-10-15 03:36:26,839 ---------------------------------------------------------------------------------------------------- |
|
2023-10-15 03:36:26,840 Loading model from best epoch ... |
|
2023-10-15 03:36:34,278 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-15 03:36:56,851 |
|
Results: |
|
- F-score (micro) 0.7451 |
|
- F-score (macro) 0.6272 |
|
- Accuracy 0.6072 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         loc     0.8249    0.8775    0.8504       596 |
|
        pers     0.6897    0.7808    0.7324       333 |
|
         org     0.4596    0.5606    0.5051       132 |
|
        prod     0.5273    0.4394    0.4793        66 |
|
        time     0.5472    0.5918    0.5686        49 |
|
|
|
   micro avg     0.7148    0.7781    0.7451      1176 |
|
   macro avg     0.6097    0.6500    0.6272      1176 |
|
weighted avg     0.7173    0.7781    0.7457      1176 |
|
|
|
2023-10-15 03:36:56,851 ---------------------------------------------------------------------------------------------------- |
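The three averaged rows in the "By class" table can be rederived from the per-class rows, which is a quick sanity check on the report. The sketch below copies the logged numbers by hand and assumes the usual definitions: macro F1 as the unweighted mean of the class F1 scores, weighted F1 as the support-weighted mean, and micro F1 as the harmonic mean of the micro-averaged precision and recall. The variable names are illustrative.

```python
# Per-class rows copied from the table: (precision, recall, f1, support).
by_class = {
    "loc":  (0.8249, 0.8775, 0.8504, 596),
    "pers": (0.6897, 0.7808, 0.7324, 333),
    "org":  (0.4596, 0.5606, 0.5051, 132),
    "prod": (0.5273, 0.4394, 0.4793, 66),
    "time": (0.5472, 0.5918, 0.5686, 49),
}


def harmonic_mean(precision: float, recall: float) -> float:
    """F1 score for a given precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(support for _, _, _, support in by_class.values())

# Macro average: every class counts equally, regardless of size.
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for _, _, f1, support in by_class.values()) / total_support

# Micro average: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = harmonic_mean(0.7148, 0.7781)

print(f"support={total_support} micro={micro_f1:.4f} "
      f"macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

Because the inputs are already rounded to four decimals, the recomputed values should agree with the logged averages only up to rounding. The gap between micro and macro F1 reflects the weaker `org`, `prod` and `time` classes, which have far less support than `loc` and `pers`.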
|
|