2023-10-15 03:04:59,576 ----------------------------------------------------------------------------------------------------
2023-10-15 03:04:59,577 Model: "SequenceTagger(
(embeddings): ByT5Embeddings(
(model): T5EncoderModel(
(shared): Embedding(384, 1472)
(encoder): T5Stack(
(embed_tokens): Embedding(384, 1472)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
(relative_attention_bias): Embedding(32, 6)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-11): 11 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=1472, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-15 03:04:59,577 ----------------------------------------------------------------------------------------------------
2023-10-15 03:04:59,578 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-15 03:04:59,578 ----------------------------------------------------------------------------------------------------
2023-10-15 03:04:59,578 Train: 3575 sentences
2023-10-15 03:04:59,578 (train_with_dev=False, train_with_test=False)
2023-10-15 03:04:59,578 ----------------------------------------------------------------------------------------------------
2023-10-15 03:04:59,578 Training Params:
2023-10-15 03:04:59,578 - learning_rate: "0.00015"
2023-10-15 03:04:59,578 - mini_batch_size: "8"
2023-10-15 03:04:59,578 - max_epochs: "10"
2023-10-15 03:04:59,578 - shuffle: "True"
2023-10-15 03:04:59,578 ----------------------------------------------------------------------------------------------------
2023-10-15 03:04:59,578 Plugins:
2023-10-15 03:04:59,578 - TensorboardLogger
2023-10-15 03:04:59,578 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 03:04:59,578 ----------------------------------------------------------------------------------------------------
2023-10-15 03:04:59,578 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 03:04:59,578 - metric: "('micro avg', 'f1-score')"
2023-10-15 03:04:59,578 ----------------------------------------------------------------------------------------------------
2023-10-15 03:04:59,578 Computation:
2023-10-15 03:04:59,578 - compute on device: cuda:0
2023-10-15 03:04:59,578 - embedding storage: none
2023-10-15 03:04:59,578 ----------------------------------------------------------------------------------------------------
2023-10-15 03:04:59,578 Model training base path: "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5"
2023-10-15 03:04:59,578 ----------------------------------------------------------------------------------------------------
2023-10-15 03:04:59,578 ----------------------------------------------------------------------------------------------------
2023-10-15 03:04:59,579 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-15 03:05:15,993 epoch 1 - iter 44/447 - loss 3.01978883 - time (sec): 16.41 - samples/sec: 527.97 - lr: 0.000014 - momentum: 0.000000
2023-10-15 03:05:31,936 epoch 1 - iter 88/447 - loss 3.00133266 - time (sec): 32.36 - samples/sec: 511.64 - lr: 0.000029 - momentum: 0.000000
2023-10-15 03:05:47,512 epoch 1 - iter 132/447 - loss 2.92259371 - time (sec): 47.93 - samples/sec: 521.08 - lr: 0.000044 - momentum: 0.000000
2023-10-15 03:06:03,335 epoch 1 - iter 176/447 - loss 2.78179957 - time (sec): 63.76 - samples/sec: 534.87 - lr: 0.000059 - momentum: 0.000000
2023-10-15 03:06:19,358 epoch 1 - iter 220/447 - loss 2.62597041 - time (sec): 79.78 - samples/sec: 538.56 - lr: 0.000073 - momentum: 0.000000
2023-10-15 03:06:35,950 epoch 1 - iter 264/447 - loss 2.45030473 - time (sec): 96.37 - samples/sec: 541.07 - lr: 0.000088 - momentum: 0.000000
2023-10-15 03:06:51,275 epoch 1 - iter 308/447 - loss 2.29870021 - time (sec): 111.70 - samples/sec: 532.50 - lr: 0.000103 - momentum: 0.000000
2023-10-15 03:07:08,861 epoch 1 - iter 352/447 - loss 2.09892316 - time (sec): 129.28 - samples/sec: 530.92 - lr: 0.000118 - momentum: 0.000000
2023-10-15 03:07:24,037 epoch 1 - iter 396/447 - loss 1.94680522 - time (sec): 144.46 - samples/sec: 530.10 - lr: 0.000133 - momentum: 0.000000
2023-10-15 03:07:39,742 epoch 1 - iter 440/447 - loss 1.79459938 - time (sec): 160.16 - samples/sec: 532.17 - lr: 0.000147 - momentum: 0.000000
2023-10-15 03:07:42,159 ----------------------------------------------------------------------------------------------------
2023-10-15 03:07:42,159 EPOCH 1 done: loss 1.7747 - lr: 0.000147
2023-10-15 03:08:06,055 DEV : loss 0.46788281202316284 - f1-score (micro avg) 0.0
2023-10-15 03:08:06,081 ----------------------------------------------------------------------------------------------------
2023-10-15 03:08:21,806 epoch 2 - iter 44/447 - loss 0.52250284 - time (sec): 15.72 - samples/sec: 565.95 - lr: 0.000148 - momentum: 0.000000
2023-10-15 03:08:37,461 epoch 2 - iter 88/447 - loss 0.46455257 - time (sec): 31.38 - samples/sec: 551.24 - lr: 0.000147 - momentum: 0.000000
2023-10-15 03:08:52,852 epoch 2 - iter 132/447 - loss 0.45965555 - time (sec): 46.77 - samples/sec: 536.18 - lr: 0.000145 - momentum: 0.000000
2023-10-15 03:09:08,719 epoch 2 - iter 176/447 - loss 0.43668872 - time (sec): 62.64 - samples/sec: 536.73 - lr: 0.000143 - momentum: 0.000000
2023-10-15 03:09:24,008 epoch 2 - iter 220/447 - loss 0.40776519 - time (sec): 77.93 - samples/sec: 538.54 - lr: 0.000142 - momentum: 0.000000
2023-10-15 03:09:39,895 epoch 2 - iter 264/447 - loss 0.39044184 - time (sec): 93.81 - samples/sec: 538.32 - lr: 0.000140 - momentum: 0.000000
2023-10-15 03:09:57,207 epoch 2 - iter 308/447 - loss 0.37910806 - time (sec): 111.13 - samples/sec: 539.10 - lr: 0.000139 - momentum: 0.000000
2023-10-15 03:10:12,303 epoch 2 - iter 352/447 - loss 0.36499181 - time (sec): 126.22 - samples/sec: 537.08 - lr: 0.000137 - momentum: 0.000000
2023-10-15 03:10:27,876 epoch 2 - iter 396/447 - loss 0.34942105 - time (sec): 141.79 - samples/sec: 538.09 - lr: 0.000135 - momentum: 0.000000
2023-10-15 03:10:43,586 epoch 2 - iter 440/447 - loss 0.33846353 - time (sec): 157.50 - samples/sec: 540.65 - lr: 0.000134 - momentum: 0.000000
2023-10-15 03:10:46,019 ----------------------------------------------------------------------------------------------------
2023-10-15 03:10:46,019 EPOCH 2 done: loss 0.3358 - lr: 0.000134
2023-10-15 03:11:11,866 DEV : loss 0.221779927611351 - f1-score (micro avg) 0.5672
2023-10-15 03:11:11,892 saving best model
2023-10-15 03:11:12,500 ----------------------------------------------------------------------------------------------------
2023-10-15 03:11:28,182 epoch 3 - iter 44/447 - loss 0.21697222 - time (sec): 15.68 - samples/sec: 549.97 - lr: 0.000132 - momentum: 0.000000
2023-10-15 03:11:43,508 epoch 3 - iter 88/447 - loss 0.20839941 - time (sec): 31.01 - samples/sec: 533.33 - lr: 0.000130 - momentum: 0.000000
2023-10-15 03:12:02,109 epoch 3 - iter 132/447 - loss 0.21174967 - time (sec): 49.61 - samples/sec: 546.53 - lr: 0.000128 - momentum: 0.000000
2023-10-15 03:12:18,292 epoch 3 - iter 176/447 - loss 0.20384305 - time (sec): 65.79 - samples/sec: 550.72 - lr: 0.000127 - momentum: 0.000000
2023-10-15 03:12:33,767 epoch 3 - iter 220/447 - loss 0.19929970 - time (sec): 81.27 - samples/sec: 547.92 - lr: 0.000125 - momentum: 0.000000
2023-10-15 03:12:48,479 epoch 3 - iter 264/447 - loss 0.19553507 - time (sec): 95.98 - samples/sec: 538.22 - lr: 0.000124 - momentum: 0.000000
2023-10-15 03:13:03,626 epoch 3 - iter 308/447 - loss 0.19427403 - time (sec): 111.12 - samples/sec: 535.59 - lr: 0.000122 - momentum: 0.000000
2023-10-15 03:13:19,671 epoch 3 - iter 352/447 - loss 0.18939750 - time (sec): 127.17 - samples/sec: 541.01 - lr: 0.000120 - momentum: 0.000000
2023-10-15 03:13:35,203 epoch 3 - iter 396/447 - loss 0.18730435 - time (sec): 142.70 - samples/sec: 538.73 - lr: 0.000119 - momentum: 0.000000
2023-10-15 03:13:50,772 epoch 3 - iter 440/447 - loss 0.18261588 - time (sec): 158.27 - samples/sec: 538.56 - lr: 0.000117 - momentum: 0.000000
2023-10-15 03:13:53,203 ----------------------------------------------------------------------------------------------------
2023-10-15 03:13:53,203 EPOCH 3 done: loss 0.1828 - lr: 0.000117
2023-10-15 03:14:19,555 DEV : loss 0.21196609735488892 - f1-score (micro avg) 0.6645
2023-10-15 03:14:19,582 saving best model
2023-10-15 03:14:29,099 ----------------------------------------------------------------------------------------------------
2023-10-15 03:14:44,594 epoch 4 - iter 44/447 - loss 0.12458170 - time (sec): 15.49 - samples/sec: 528.51 - lr: 0.000115 - momentum: 0.000000
2023-10-15 03:15:00,234 epoch 4 - iter 88/447 - loss 0.12489415 - time (sec): 31.13 - samples/sec: 546.25 - lr: 0.000113 - momentum: 0.000000
2023-10-15 03:15:17,300 epoch 4 - iter 132/447 - loss 0.12039860 - time (sec): 48.20 - samples/sec: 549.42 - lr: 0.000112 - momentum: 0.000000
2023-10-15 03:15:33,506 epoch 4 - iter 176/447 - loss 0.11268941 - time (sec): 64.40 - samples/sec: 558.22 - lr: 0.000110 - momentum: 0.000000
2023-10-15 03:15:49,080 epoch 4 - iter 220/447 - loss 0.11047919 - time (sec): 79.98 - samples/sec: 553.19 - lr: 0.000109 - momentum: 0.000000
2023-10-15 03:16:04,739 epoch 4 - iter 264/447 - loss 0.10848879 - time (sec): 95.64 - samples/sec: 554.24 - lr: 0.000107 - momentum: 0.000000
2023-10-15 03:16:19,804 epoch 4 - iter 308/447 - loss 0.10754764 - time (sec): 110.70 - samples/sec: 552.22 - lr: 0.000105 - momentum: 0.000000
2023-10-15 03:16:35,058 epoch 4 - iter 352/447 - loss 0.10601664 - time (sec): 125.96 - samples/sec: 548.80 - lr: 0.000104 - momentum: 0.000000
2023-10-15 03:16:50,082 epoch 4 - iter 396/447 - loss 0.10799045 - time (sec): 140.98 - samples/sec: 546.38 - lr: 0.000102 - momentum: 0.000000
2023-10-15 03:17:05,246 epoch 4 - iter 440/447 - loss 0.10594070 - time (sec): 156.14 - samples/sec: 545.97 - lr: 0.000100 - momentum: 0.000000
2023-10-15 03:17:07,682 ----------------------------------------------------------------------------------------------------
2023-10-15 03:17:07,682 EPOCH 4 done: loss 0.1050 - lr: 0.000100
2023-10-15 03:17:33,504 DEV : loss 0.1502165049314499 - f1-score (micro avg) 0.7452
2023-10-15 03:17:33,531 saving best model
2023-10-15 03:17:37,818 ----------------------------------------------------------------------------------------------------
2023-10-15 03:17:53,611 epoch 5 - iter 44/447 - loss 0.07166056 - time (sec): 15.79 - samples/sec: 554.12 - lr: 0.000098 - momentum: 0.000000
2023-10-15 03:18:10,004 epoch 5 - iter 88/447 - loss 0.07049452 - time (sec): 32.18 - samples/sec: 561.32 - lr: 0.000097 - momentum: 0.000000
2023-10-15 03:18:25,262 epoch 5 - iter 132/447 - loss 0.06741331 - time (sec): 47.44 - samples/sec: 552.50 - lr: 0.000095 - momentum: 0.000000
2023-10-15 03:18:41,423 epoch 5 - iter 176/447 - loss 0.06793876 - time (sec): 63.60 - samples/sec: 550.92 - lr: 0.000094 - momentum: 0.000000
2023-10-15 03:18:58,865 epoch 5 - iter 220/447 - loss 0.06793416 - time (sec): 81.04 - samples/sec: 549.77 - lr: 0.000092 - momentum: 0.000000
2023-10-15 03:19:14,393 epoch 5 - iter 264/447 - loss 0.07122002 - time (sec): 96.57 - samples/sec: 547.49 - lr: 0.000090 - momentum: 0.000000
2023-10-15 03:19:29,723 epoch 5 - iter 308/447 - loss 0.06996369 - time (sec): 111.90 - samples/sec: 545.54 - lr: 0.000089 - momentum: 0.000000
2023-10-15 03:19:45,050 epoch 5 - iter 352/447 - loss 0.07078724 - time (sec): 127.23 - samples/sec: 542.96 - lr: 0.000087 - momentum: 0.000000
2023-10-15 03:20:00,352 epoch 5 - iter 396/447 - loss 0.06886941 - time (sec): 142.53 - samples/sec: 540.46 - lr: 0.000085 - momentum: 0.000000
2023-10-15 03:20:15,902 epoch 5 - iter 440/447 - loss 0.06749023 - time (sec): 158.08 - samples/sec: 539.13 - lr: 0.000084 - momentum: 0.000000
2023-10-15 03:20:18,377 ----------------------------------------------------------------------------------------------------
2023-10-15 03:20:18,377 EPOCH 5 done: loss 0.0671 - lr: 0.000084
2023-10-15 03:20:44,645 DEV : loss 0.16935649514198303 - f1-score (micro avg) 0.7586
2023-10-15 03:20:44,671 saving best model
2023-10-15 03:20:48,900 ----------------------------------------------------------------------------------------------------
2023-10-15 03:21:04,483 epoch 6 - iter 44/447 - loss 0.04472399 - time (sec): 15.58 - samples/sec: 547.14 - lr: 0.000082 - momentum: 0.000000
2023-10-15 03:21:21,706 epoch 6 - iter 88/447 - loss 0.05060492 - time (sec): 32.80 - samples/sec: 540.75 - lr: 0.000080 - momentum: 0.000000
2023-10-15 03:21:37,259 epoch 6 - iter 132/447 - loss 0.04694672 - time (sec): 48.36 - samples/sec: 547.01 - lr: 0.000079 - momentum: 0.000000
2023-10-15 03:21:53,172 epoch 6 - iter 176/447 - loss 0.04332383 - time (sec): 64.27 - samples/sec: 552.15 - lr: 0.000077 - momentum: 0.000000
2023-10-15 03:22:08,300 epoch 6 - iter 220/447 - loss 0.04304184 - time (sec): 79.40 - samples/sec: 545.59 - lr: 0.000075 - momentum: 0.000000
2023-10-15 03:22:24,538 epoch 6 - iter 264/447 - loss 0.04446927 - time (sec): 95.64 - samples/sec: 543.72 - lr: 0.000074 - momentum: 0.000000
2023-10-15 03:22:40,667 epoch 6 - iter 308/447 - loss 0.04546331 - time (sec): 111.76 - samples/sec: 544.78 - lr: 0.000072 - momentum: 0.000000
2023-10-15 03:22:56,084 epoch 6 - iter 352/447 - loss 0.04531105 - time (sec): 127.18 - samples/sec: 543.54 - lr: 0.000070 - momentum: 0.000000
2023-10-15 03:23:11,080 epoch 6 - iter 396/447 - loss 0.04538256 - time (sec): 142.18 - samples/sec: 539.62 - lr: 0.000069 - momentum: 0.000000
2023-10-15 03:23:26,907 epoch 6 - iter 440/447 - loss 0.04478858 - time (sec): 158.00 - samples/sec: 540.52 - lr: 0.000067 - momentum: 0.000000
2023-10-15 03:23:29,259 ----------------------------------------------------------------------------------------------------
2023-10-15 03:23:29,259 EPOCH 6 done: loss 0.0447 - lr: 0.000067
2023-10-15 03:23:55,465 DEV : loss 0.19000039994716644 - f1-score (micro avg) 0.7643
2023-10-15 03:23:55,491 saving best model
2023-10-15 03:23:58,764 ----------------------------------------------------------------------------------------------------
2023-10-15 03:24:13,720 epoch 7 - iter 44/447 - loss 0.02523236 - time (sec): 14.95 - samples/sec: 533.71 - lr: 0.000065 - momentum: 0.000000
2023-10-15 03:24:29,663 epoch 7 - iter 88/447 - loss 0.02638231 - time (sec): 30.90 - samples/sec: 560.20 - lr: 0.000064 - momentum: 0.000000
2023-10-15 03:24:45,034 epoch 7 - iter 132/447 - loss 0.02697728 - time (sec): 46.27 - samples/sec: 553.82 - lr: 0.000062 - momentum: 0.000000
2023-10-15 03:25:00,846 epoch 7 - iter 176/447 - loss 0.02819881 - time (sec): 62.08 - samples/sec: 555.69 - lr: 0.000060 - momentum: 0.000000
2023-10-15 03:25:18,235 epoch 7 - iter 220/447 - loss 0.02964525 - time (sec): 79.47 - samples/sec: 553.72 - lr: 0.000059 - momentum: 0.000000
2023-10-15 03:25:34,141 epoch 7 - iter 264/447 - loss 0.02964954 - time (sec): 95.37 - samples/sec: 547.81 - lr: 0.000057 - momentum: 0.000000
2023-10-15 03:25:49,642 epoch 7 - iter 308/447 - loss 0.02874052 - time (sec): 110.88 - samples/sec: 546.42 - lr: 0.000055 - momentum: 0.000000
2023-10-15 03:26:05,033 epoch 7 - iter 352/447 - loss 0.02885843 - time (sec): 126.27 - samples/sec: 544.05 - lr: 0.000054 - momentum: 0.000000
2023-10-15 03:26:20,247 epoch 7 - iter 396/447 - loss 0.03106093 - time (sec): 141.48 - samples/sec: 541.83 - lr: 0.000052 - momentum: 0.000000
2023-10-15 03:26:36,365 epoch 7 - iter 440/447 - loss 0.03077679 - time (sec): 157.60 - samples/sec: 542.12 - lr: 0.000050 - momentum: 0.000000
2023-10-15 03:26:38,706 ----------------------------------------------------------------------------------------------------
2023-10-15 03:26:38,707 EPOCH 7 done: loss 0.0307 - lr: 0.000050
2023-10-15 03:27:04,502 DEV : loss 0.19090385735034943 - f1-score (micro avg) 0.7667
2023-10-15 03:27:04,528 saving best model
2023-10-15 03:27:08,206 ----------------------------------------------------------------------------------------------------
2023-10-15 03:27:23,975 epoch 8 - iter 44/447 - loss 0.02393544 - time (sec): 15.77 - samples/sec: 531.15 - lr: 0.000049 - momentum: 0.000000
2023-10-15 03:27:41,509 epoch 8 - iter 88/447 - loss 0.03310152 - time (sec): 33.30 - samples/sec: 548.42 - lr: 0.000047 - momentum: 0.000000
2023-10-15 03:27:57,178 epoch 8 - iter 132/447 - loss 0.02942044 - time (sec): 48.97 - samples/sec: 549.60 - lr: 0.000045 - momentum: 0.000000
2023-10-15 03:28:12,622 epoch 8 - iter 176/447 - loss 0.02706905 - time (sec): 64.41 - samples/sec: 548.00 - lr: 0.000044 - momentum: 0.000000
2023-10-15 03:28:27,968 epoch 8 - iter 220/447 - loss 0.02599557 - time (sec): 79.76 - samples/sec: 546.01 - lr: 0.000042 - momentum: 0.000000
2023-10-15 03:28:43,769 epoch 8 - iter 264/447 - loss 0.02513171 - time (sec): 95.56 - samples/sec: 539.38 - lr: 0.000040 - momentum: 0.000000
2023-10-15 03:28:59,570 epoch 8 - iter 308/447 - loss 0.02314944 - time (sec): 111.36 - samples/sec: 541.22 - lr: 0.000039 - momentum: 0.000000
2023-10-15 03:29:15,312 epoch 8 - iter 352/447 - loss 0.02305557 - time (sec): 127.10 - samples/sec: 543.78 - lr: 0.000037 - momentum: 0.000000
2023-10-15 03:29:30,462 epoch 8 - iter 396/447 - loss 0.02275457 - time (sec): 142.25 - samples/sec: 541.33 - lr: 0.000035 - momentum: 0.000000
2023-10-15 03:29:45,730 epoch 8 - iter 440/447 - loss 0.02291247 - time (sec): 157.52 - samples/sec: 540.82 - lr: 0.000034 - momentum: 0.000000
2023-10-15 03:29:48,168 ----------------------------------------------------------------------------------------------------
2023-10-15 03:29:48,169 EPOCH 8 done: loss 0.0227 - lr: 0.000034
2023-10-15 03:30:14,304 DEV : loss 0.19714942574501038 - f1-score (micro avg) 0.7611
2023-10-15 03:30:14,331 ----------------------------------------------------------------------------------------------------
2023-10-15 03:30:30,530 epoch 9 - iter 44/447 - loss 0.01520226 - time (sec): 16.20 - samples/sec: 548.79 - lr: 0.000032 - momentum: 0.000000
2023-10-15 03:30:46,254 epoch 9 - iter 88/447 - loss 0.01346179 - time (sec): 31.92 - samples/sec: 549.44 - lr: 0.000030 - momentum: 0.000000
2023-10-15 03:31:02,109 epoch 9 - iter 132/447 - loss 0.01312156 - time (sec): 47.78 - samples/sec: 543.20 - lr: 0.000029 - momentum: 0.000000
2023-10-15 03:31:17,445 epoch 9 - iter 176/447 - loss 0.01278497 - time (sec): 63.11 - samples/sec: 541.11 - lr: 0.000027 - momentum: 0.000000
2023-10-15 03:31:32,608 epoch 9 - iter 220/447 - loss 0.01260025 - time (sec): 78.28 - samples/sec: 538.95 - lr: 0.000025 - momentum: 0.000000
2023-10-15 03:31:48,556 epoch 9 - iter 264/447 - loss 0.01287074 - time (sec): 94.22 - samples/sec: 538.71 - lr: 0.000024 - momentum: 0.000000
2023-10-15 03:32:05,035 epoch 9 - iter 308/447 - loss 0.01389175 - time (sec): 110.70 - samples/sec: 543.42 - lr: 0.000022 - momentum: 0.000000
2023-10-15 03:32:20,276 epoch 9 - iter 352/447 - loss 0.01437736 - time (sec): 125.94 - samples/sec: 539.85 - lr: 0.000020 - momentum: 0.000000
2023-10-15 03:32:35,454 epoch 9 - iter 396/447 - loss 0.01477394 - time (sec): 141.12 - samples/sec: 538.22 - lr: 0.000019 - momentum: 0.000000
2023-10-15 03:32:52,774 epoch 9 - iter 440/447 - loss 0.01578011 - time (sec): 158.44 - samples/sec: 538.41 - lr: 0.000017 - momentum: 0.000000
2023-10-15 03:32:55,142 ----------------------------------------------------------------------------------------------------
2023-10-15 03:32:55,142 EPOCH 9 done: loss 0.0157 - lr: 0.000017
2023-10-15 03:33:21,058 DEV : loss 0.20480625331401825 - f1-score (micro avg) 0.7619
2023-10-15 03:33:21,085 ----------------------------------------------------------------------------------------------------
2023-10-15 03:33:36,258 epoch 10 - iter 44/447 - loss 0.00974946 - time (sec): 15.17 - samples/sec: 540.00 - lr: 0.000015 - momentum: 0.000000
2023-10-15 03:33:52,230 epoch 10 - iter 88/447 - loss 0.01297197 - time (sec): 31.14 - samples/sec: 556.12 - lr: 0.000014 - momentum: 0.000000
2023-10-15 03:34:07,703 epoch 10 - iter 132/447 - loss 0.01147150 - time (sec): 46.62 - samples/sec: 556.70 - lr: 0.000012 - momentum: 0.000000
2023-10-15 03:34:25,028 epoch 10 - iter 176/447 - loss 0.01480113 - time (sec): 63.94 - samples/sec: 554.72 - lr: 0.000010 - momentum: 0.000000
2023-10-15 03:34:40,920 epoch 10 - iter 220/447 - loss 0.01375097 - time (sec): 79.83 - samples/sec: 546.10 - lr: 0.000009 - momentum: 0.000000
2023-10-15 03:34:55,965 epoch 10 - iter 264/447 - loss 0.01369117 - time (sec): 94.88 - samples/sec: 542.48 - lr: 0.000007 - momentum: 0.000000
2023-10-15 03:35:11,409 epoch 10 - iter 308/447 - loss 0.01260225 - time (sec): 110.32 - samples/sec: 543.21 - lr: 0.000005 - momentum: 0.000000
2023-10-15 03:35:26,842 epoch 10 - iter 352/447 - loss 0.01255970 - time (sec): 125.76 - samples/sec: 544.24 - lr: 0.000004 - momentum: 0.000000
2023-10-15 03:35:41,667 epoch 10 - iter 396/447 - loss 0.01280365 - time (sec): 140.58 - samples/sec: 539.59 - lr: 0.000002 - momentum: 0.000000
2023-10-15 03:35:57,666 epoch 10 - iter 440/447 - loss 0.01242779 - time (sec): 156.58 - samples/sec: 543.84 - lr: 0.000001 - momentum: 0.000000
2023-10-15 03:36:00,120 ----------------------------------------------------------------------------------------------------
2023-10-15 03:36:00,120 EPOCH 10 done: loss 0.0123 - lr: 0.000001
2023-10-15 03:36:26,213 DEV : loss 0.20496107637882233 - f1-score (micro avg) 0.756
2023-10-15 03:36:26,839 ----------------------------------------------------------------------------------------------------
2023-10-15 03:36:26,840 Loading model from best epoch ...
2023-10-15 03:36:34,278 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-15 03:36:56,851
Results:
- F-score (micro) 0.7451
- F-score (macro) 0.6272
- Accuracy 0.6072
By class:
              precision    recall  f1-score   support

         loc     0.8249    0.8775    0.8504       596
        pers     0.6897    0.7808    0.7324       333
         org     0.4596    0.5606    0.5051       132
        prod     0.5273    0.4394    0.4793        66
        time     0.5472    0.5918    0.5686        49

   micro avg     0.7148    0.7781    0.7451      1176
   macro avg     0.6097    0.6500    0.6272      1176
weighted avg     0.7173    0.7781    0.7457      1176
2023-10-15 03:36:56,851 ----------------------------------------------------------------------------------------------------