2023-10-14 04:28:24,168 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,171 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 04:28:24,171 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,171 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-14 04:28:24,171 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,171 Train: 6183 sentences
2023-10-14 04:28:24,171 (train_with_dev=False, train_with_test=False)
2023-10-14 04:28:24,171 ----------------------------------------------------------------------------------------------------
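Note: the tagger and corpus above could be assembled in Flair roughly as sketched below. This is a minimal approximation, not the original training script: the run itself uses a custom ByT5Embeddings wrapper (as printed above), which is replaced here by the stock TransformerWordEmbeddings class, and the dataset/checkpoint arguments are inferred from the cached dataset path and the model base path further down in this log.

from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# topres19th (English) split of HIPE-2022; the keyword arguments are an
# assumption inferred from the cached path ".../ner_hipe_2022/v2.1/topres19th/en/..."
corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")
label_dict = corpus.make_label_dictionary(label_type="ner")

# ByT5 encoder embeddings; last layer only and "first" subtoken pooling,
# as encoded in the base path suffix "poolingfirst-layers-1-crfFalse"
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",  # inferred from the base path
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# no RNN, no CRF, no reprojection: a single Linear(1472 -> 13) head on top,
# matching the printed architecture
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_rnn=False,
    use_crf=False,
    reproject_embeddings=False,
)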
2023-10-14 04:28:24,171 Training Params:
2023-10-14 04:28:24,171 - learning_rate: "0.00016"
2023-10-14 04:28:24,172 - mini_batch_size: "8"
2023-10-14 04:28:24,172 - max_epochs: "10"
2023-10-14 04:28:24,172 - shuffle: "True"
2023-10-14 04:28:24,172 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,172 Plugins:
2023-10-14 04:28:24,172 - TensorboardLogger
2023-10-14 04:28:24,172 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 04:28:24,172 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,172 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 04:28:24,172 - metric: "('micro avg', 'f1-score')"
2023-10-14 04:28:24,172 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,172 Computation:
2023-10-14 04:28:24,172 - compute on device: cuda:0
2023-10-14 04:28:24,173 - embedding storage: none
2023-10-14 04:28:24,173 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,173 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5"
2023-10-14 04:28:24,173 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,173 ----------------------------------------------------------------------------------------------------
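Note: the training parameters listed above map onto Flair's fine-tuning entry point roughly as in the sketch below (continuing the sketch above). trainer.fine_tune uses AdamW with a linear warmup schedule by default, which is consistent with the LinearScheduler (warmup_fraction 0.1) and zero momentum shown in this log; the TensorboardLogger plugin and the exact plugin wiring of the original run are not reproduced here.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# learning_rate, mini_batch_size and max_epochs as listed under "Training Params"
trainer.fine_tune(
    "hmbench-topres19th/en-hmbyt5-preliminary/"
    "byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5",
    learning_rate=0.00016,
    mini_batch_size=8,
    max_epochs=10,
)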
2023-10-14 04:28:24,173 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-14 04:29:05,072 epoch 1 - iter 77/773 - loss 2.54244239 - time (sec): 40.90 - samples/sec: 319.10 - lr: 0.000016 - momentum: 0.000000
2023-10-14 04:29:46,524 epoch 1 - iter 154/773 - loss 2.48615845 - time (sec): 82.35 - samples/sec: 296.47 - lr: 0.000032 - momentum: 0.000000
2023-10-14 04:30:29,288 epoch 1 - iter 231/773 - loss 2.28699351 - time (sec): 125.11 - samples/sec: 303.76 - lr: 0.000048 - momentum: 0.000000
2023-10-14 04:31:09,818 epoch 1 - iter 308/773 - loss 2.07436902 - time (sec): 165.64 - samples/sec: 304.93 - lr: 0.000064 - momentum: 0.000000
2023-10-14 04:31:49,883 epoch 1 - iter 385/773 - loss 1.84298193 - time (sec): 205.71 - samples/sec: 305.33 - lr: 0.000079 - momentum: 0.000000
2023-10-14 04:32:29,519 epoch 1 - iter 462/773 - loss 1.62458957 - time (sec): 245.34 - samples/sec: 304.06 - lr: 0.000095 - momentum: 0.000000
2023-10-14 04:33:10,622 epoch 1 - iter 539/773 - loss 1.43438535 - time (sec): 286.45 - samples/sec: 303.31 - lr: 0.000111 - momentum: 0.000000
2023-10-14 04:33:51,136 epoch 1 - iter 616/773 - loss 1.28509395 - time (sec): 326.96 - samples/sec: 303.63 - lr: 0.000127 - momentum: 0.000000
2023-10-14 04:34:30,851 epoch 1 - iter 693/773 - loss 1.16641135 - time (sec): 366.68 - samples/sec: 304.52 - lr: 0.000143 - momentum: 0.000000
2023-10-14 04:35:10,251 epoch 1 - iter 770/773 - loss 1.06740930 - time (sec): 406.08 - samples/sec: 305.20 - lr: 0.000159 - momentum: 0.000000
2023-10-14 04:35:11,658 ----------------------------------------------------------------------------------------------------
2023-10-14 04:35:11,658 EPOCH 1 done: loss 1.0650 - lr: 0.000159
2023-10-14 04:35:28,397 DEV : loss 0.10037031024694443 - f1-score (micro avg) 0.4469
2023-10-14 04:35:28,428 saving best model
2023-10-14 04:35:29,328 ----------------------------------------------------------------------------------------------------
2023-10-14 04:36:09,179 epoch 2 - iter 77/773 - loss 0.13592744 - time (sec): 39.85 - samples/sec: 313.04 - lr: 0.000158 - momentum: 0.000000
2023-10-14 04:36:48,447 epoch 2 - iter 154/773 - loss 0.12546525 - time (sec): 79.12 - samples/sec: 311.73 - lr: 0.000156 - momentum: 0.000000
2023-10-14 04:37:28,738 epoch 2 - iter 231/773 - loss 0.12188259 - time (sec): 119.41 - samples/sec: 314.85 - lr: 0.000155 - momentum: 0.000000
2023-10-14 04:38:09,450 epoch 2 - iter 308/773 - loss 0.11747475 - time (sec): 160.12 - samples/sec: 309.53 - lr: 0.000153 - momentum: 0.000000
2023-10-14 04:38:49,769 epoch 2 - iter 385/773 - loss 0.11284653 - time (sec): 200.44 - samples/sec: 310.92 - lr: 0.000151 - momentum: 0.000000
2023-10-14 04:39:29,955 epoch 2 - iter 462/773 - loss 0.11231366 - time (sec): 240.62 - samples/sec: 308.68 - lr: 0.000149 - momentum: 0.000000
2023-10-14 04:40:10,666 epoch 2 - iter 539/773 - loss 0.10792935 - time (sec): 281.34 - samples/sec: 306.91 - lr: 0.000148 - momentum: 0.000000
2023-10-14 04:40:51,913 epoch 2 - iter 616/773 - loss 0.10489975 - time (sec): 322.58 - samples/sec: 304.91 - lr: 0.000146 - momentum: 0.000000
2023-10-14 04:41:32,743 epoch 2 - iter 693/773 - loss 0.10300596 - time (sec): 363.41 - samples/sec: 304.24 - lr: 0.000144 - momentum: 0.000000
2023-10-14 04:42:13,919 epoch 2 - iter 770/773 - loss 0.09903163 - time (sec): 404.59 - samples/sec: 306.03 - lr: 0.000142 - momentum: 0.000000
2023-10-14 04:42:15,471 ----------------------------------------------------------------------------------------------------
2023-10-14 04:42:15,471 EPOCH 2 done: loss 0.0989 - lr: 0.000142
2023-10-14 04:42:33,065 DEV : loss 0.0493462048470974 - f1-score (micro avg) 0.7889
2023-10-14 04:42:33,094 saving best model
2023-10-14 04:42:35,694 ----------------------------------------------------------------------------------------------------
2023-10-14 04:43:16,469 epoch 3 - iter 77/773 - loss 0.05743308 - time (sec): 40.77 - samples/sec: 290.22 - lr: 0.000140 - momentum: 0.000000
2023-10-14 04:43:57,375 epoch 3 - iter 154/773 - loss 0.06599354 - time (sec): 81.68 - samples/sec: 295.49 - lr: 0.000139 - momentum: 0.000000
2023-10-14 04:44:37,367 epoch 3 - iter 231/773 - loss 0.06091843 - time (sec): 121.67 - samples/sec: 294.82 - lr: 0.000137 - momentum: 0.000000
2023-10-14 04:45:18,627 epoch 3 - iter 308/773 - loss 0.06058528 - time (sec): 162.93 - samples/sec: 293.19 - lr: 0.000135 - momentum: 0.000000
2023-10-14 04:45:59,351 epoch 3 - iter 385/773 - loss 0.05868897 - time (sec): 203.65 - samples/sec: 289.93 - lr: 0.000133 - momentum: 0.000000
2023-10-14 04:46:40,530 epoch 3 - iter 462/773 - loss 0.05682454 - time (sec): 244.83 - samples/sec: 294.16 - lr: 0.000132 - momentum: 0.000000
2023-10-14 04:47:21,353 epoch 3 - iter 539/773 - loss 0.05871382 - time (sec): 285.66 - samples/sec: 295.04 - lr: 0.000130 - momentum: 0.000000
2023-10-14 04:48:02,038 epoch 3 - iter 616/773 - loss 0.05735535 - time (sec): 326.34 - samples/sec: 300.23 - lr: 0.000128 - momentum: 0.000000
2023-10-14 04:48:42,542 epoch 3 - iter 693/773 - loss 0.05682201 - time (sec): 366.85 - samples/sec: 303.34 - lr: 0.000126 - momentum: 0.000000
2023-10-14 04:49:23,073 epoch 3 - iter 770/773 - loss 0.05721273 - time (sec): 407.38 - samples/sec: 303.82 - lr: 0.000125 - momentum: 0.000000
2023-10-14 04:49:24,586 ----------------------------------------------------------------------------------------------------
2023-10-14 04:49:24,586 EPOCH 3 done: loss 0.0571 - lr: 0.000125
2023-10-14 04:49:42,105 DEV : loss 0.052890341728925705 - f1-score (micro avg) 0.7961
2023-10-14 04:49:42,134 saving best model
2023-10-14 04:49:43,123 ----------------------------------------------------------------------------------------------------
2023-10-14 04:50:23,376 epoch 4 - iter 77/773 - loss 0.03274705 - time (sec): 40.25 - samples/sec: 294.43 - lr: 0.000123 - momentum: 0.000000
2023-10-14 04:51:03,479 epoch 4 - iter 154/773 - loss 0.03605609 - time (sec): 80.35 - samples/sec: 291.14 - lr: 0.000121 - momentum: 0.000000
2023-10-14 04:51:44,703 epoch 4 - iter 231/773 - loss 0.03632022 - time (sec): 121.58 - samples/sec: 301.99 - lr: 0.000119 - momentum: 0.000000
2023-10-14 04:52:24,782 epoch 4 - iter 308/773 - loss 0.03954406 - time (sec): 161.66 - samples/sec: 299.79 - lr: 0.000117 - momentum: 0.000000
2023-10-14 04:53:03,301 epoch 4 - iter 385/773 - loss 0.03855328 - time (sec): 200.18 - samples/sec: 301.88 - lr: 0.000116 - momentum: 0.000000
2023-10-14 04:53:44,031 epoch 4 - iter 462/773 - loss 0.03697764 - time (sec): 240.91 - samples/sec: 305.26 - lr: 0.000114 - momentum: 0.000000
2023-10-14 04:54:25,514 epoch 4 - iter 539/773 - loss 0.03544277 - time (sec): 282.39 - samples/sec: 306.90 - lr: 0.000112 - momentum: 0.000000
2023-10-14 04:55:06,007 epoch 4 - iter 616/773 - loss 0.03525164 - time (sec): 322.88 - samples/sec: 306.33 - lr: 0.000110 - momentum: 0.000000
2023-10-14 04:55:46,241 epoch 4 - iter 693/773 - loss 0.03645316 - time (sec): 363.12 - samples/sec: 304.84 - lr: 0.000109 - momentum: 0.000000
2023-10-14 04:56:26,905 epoch 4 - iter 770/773 - loss 0.03581575 - time (sec): 403.78 - samples/sec: 306.52 - lr: 0.000107 - momentum: 0.000000
2023-10-14 04:56:28,433 ----------------------------------------------------------------------------------------------------
2023-10-14 04:56:28,433 EPOCH 4 done: loss 0.0357 - lr: 0.000107
2023-10-14 04:56:45,803 DEV : loss 0.05653312802314758 - f1-score (micro avg) 0.816
2023-10-14 04:56:45,831 saving best model
2023-10-14 04:56:48,419 ----------------------------------------------------------------------------------------------------
2023-10-14 04:57:28,340 epoch 5 - iter 77/773 - loss 0.02077936 - time (sec): 39.92 - samples/sec: 317.84 - lr: 0.000105 - momentum: 0.000000
2023-10-14 04:58:09,403 epoch 5 - iter 154/773 - loss 0.02462730 - time (sec): 80.98 - samples/sec: 310.25 - lr: 0.000103 - momentum: 0.000000
2023-10-14 04:58:50,054 epoch 5 - iter 231/773 - loss 0.02366038 - time (sec): 121.63 - samples/sec: 306.66 - lr: 0.000101 - momentum: 0.000000
2023-10-14 04:59:30,275 epoch 5 - iter 308/773 - loss 0.02724055 - time (sec): 161.85 - samples/sec: 305.53 - lr: 0.000100 - momentum: 0.000000
2023-10-14 05:00:11,004 epoch 5 - iter 385/773 - loss 0.02658499 - time (sec): 202.58 - samples/sec: 307.71 - lr: 0.000098 - momentum: 0.000000
2023-10-14 05:00:52,378 epoch 5 - iter 462/773 - loss 0.02583538 - time (sec): 243.95 - samples/sec: 305.21 - lr: 0.000096 - momentum: 0.000000
2023-10-14 05:01:33,002 epoch 5 - iter 539/773 - loss 0.02497714 - time (sec): 284.58 - samples/sec: 306.36 - lr: 0.000094 - momentum: 0.000000
2023-10-14 05:02:14,734 epoch 5 - iter 616/773 - loss 0.02475567 - time (sec): 326.31 - samples/sec: 305.63 - lr: 0.000093 - momentum: 0.000000
2023-10-14 05:02:55,559 epoch 5 - iter 693/773 - loss 0.02437182 - time (sec): 367.13 - samples/sec: 303.83 - lr: 0.000091 - momentum: 0.000000
2023-10-14 05:03:36,276 epoch 5 - iter 770/773 - loss 0.02394195 - time (sec): 407.85 - samples/sec: 303.56 - lr: 0.000089 - momentum: 0.000000
2023-10-14 05:03:37,764 ----------------------------------------------------------------------------------------------------
2023-10-14 05:03:37,764 EPOCH 5 done: loss 0.0240 - lr: 0.000089
2023-10-14 05:03:55,466 DEV : loss 0.0688859298825264 - f1-score (micro avg) 0.8031
2023-10-14 05:03:55,496 ----------------------------------------------------------------------------------------------------
2023-10-14 05:04:35,617 epoch 6 - iter 77/773 - loss 0.01506946 - time (sec): 40.12 - samples/sec: 287.07 - lr: 0.000087 - momentum: 0.000000
2023-10-14 05:05:15,281 epoch 6 - iter 154/773 - loss 0.01615607 - time (sec): 79.78 - samples/sec: 300.51 - lr: 0.000085 - momentum: 0.000000
2023-10-14 05:05:55,755 epoch 6 - iter 231/773 - loss 0.01532933 - time (sec): 120.26 - samples/sec: 300.00 - lr: 0.000084 - momentum: 0.000000
2023-10-14 05:06:36,427 epoch 6 - iter 308/773 - loss 0.01523630 - time (sec): 160.93 - samples/sec: 303.87 - lr: 0.000082 - momentum: 0.000000
2023-10-14 05:07:17,537 epoch 6 - iter 385/773 - loss 0.01580359 - time (sec): 202.04 - samples/sec: 307.17 - lr: 0.000080 - momentum: 0.000000
2023-10-14 05:07:58,881 epoch 6 - iter 462/773 - loss 0.01646299 - time (sec): 243.38 - samples/sec: 305.67 - lr: 0.000078 - momentum: 0.000000
2023-10-14 05:08:38,955 epoch 6 - iter 539/773 - loss 0.01563844 - time (sec): 283.46 - samples/sec: 305.15 - lr: 0.000077 - momentum: 0.000000
2023-10-14 05:09:19,517 epoch 6 - iter 616/773 - loss 0.01644155 - time (sec): 324.02 - samples/sec: 305.40 - lr: 0.000075 - momentum: 0.000000
2023-10-14 05:09:59,785 epoch 6 - iter 693/773 - loss 0.01652567 - time (sec): 364.29 - samples/sec: 306.53 - lr: 0.000073 - momentum: 0.000000
2023-10-14 05:10:39,880 epoch 6 - iter 770/773 - loss 0.01657071 - time (sec): 404.38 - samples/sec: 306.49 - lr: 0.000071 - momentum: 0.000000
2023-10-14 05:10:41,347 ----------------------------------------------------------------------------------------------------
2023-10-14 05:10:41,347 EPOCH 6 done: loss 0.0165 - lr: 0.000071
2023-10-14 05:10:58,493 DEV : loss 0.07852932810783386 - f1-score (micro avg) 0.8047
2023-10-14 05:10:58,532 ----------------------------------------------------------------------------------------------------
2023-10-14 05:11:37,820 epoch 7 - iter 77/773 - loss 0.01159909 - time (sec): 39.29 - samples/sec: 289.93 - lr: 0.000069 - momentum: 0.000000
2023-10-14 05:12:17,357 epoch 7 - iter 154/773 - loss 0.01174655 - time (sec): 78.82 - samples/sec: 299.84 - lr: 0.000068 - momentum: 0.000000
2023-10-14 05:12:58,566 epoch 7 - iter 231/773 - loss 0.01105339 - time (sec): 120.03 - samples/sec: 305.27 - lr: 0.000066 - momentum: 0.000000
2023-10-14 05:13:40,261 epoch 7 - iter 308/773 - loss 0.01203373 - time (sec): 161.73 - samples/sec: 307.36 - lr: 0.000064 - momentum: 0.000000
2023-10-14 05:14:21,982 epoch 7 - iter 385/773 - loss 0.01173296 - time (sec): 203.45 - samples/sec: 308.46 - lr: 0.000062 - momentum: 0.000000
2023-10-14 05:15:02,810 epoch 7 - iter 462/773 - loss 0.01119084 - time (sec): 244.28 - samples/sec: 310.97 - lr: 0.000061 - momentum: 0.000000
2023-10-14 05:15:42,806 epoch 7 - iter 539/773 - loss 0.01125000 - time (sec): 284.27 - samples/sec: 308.90 - lr: 0.000059 - momentum: 0.000000
2023-10-14 05:16:21,757 epoch 7 - iter 616/773 - loss 0.01175103 - time (sec): 323.22 - samples/sec: 305.92 - lr: 0.000057 - momentum: 0.000000
2023-10-14 05:17:02,506 epoch 7 - iter 693/773 - loss 0.01159665 - time (sec): 363.97 - samples/sec: 306.82 - lr: 0.000055 - momentum: 0.000000
2023-10-14 05:17:42,862 epoch 7 - iter 770/773 - loss 0.01145544 - time (sec): 404.33 - samples/sec: 306.33 - lr: 0.000054 - momentum: 0.000000
2023-10-14 05:17:44,336 ----------------------------------------------------------------------------------------------------
2023-10-14 05:17:44,337 EPOCH 7 done: loss 0.0114 - lr: 0.000054
2023-10-14 05:18:01,345 DEV : loss 0.08401428908109665 - f1-score (micro avg) 0.8103
2023-10-14 05:18:01,376 ----------------------------------------------------------------------------------------------------
2023-10-14 05:18:41,541 epoch 8 - iter 77/773 - loss 0.00927132 - time (sec): 40.16 - samples/sec: 306.76 - lr: 0.000052 - momentum: 0.000000
2023-10-14 05:19:20,478 epoch 8 - iter 154/773 - loss 0.00950282 - time (sec): 79.10 - samples/sec: 313.87 - lr: 0.000050 - momentum: 0.000000
2023-10-14 05:20:00,678 epoch 8 - iter 231/773 - loss 0.00813838 - time (sec): 119.30 - samples/sec: 311.60 - lr: 0.000048 - momentum: 0.000000
2023-10-14 05:20:40,036 epoch 8 - iter 308/773 - loss 0.00833648 - time (sec): 158.66 - samples/sec: 311.51 - lr: 0.000046 - momentum: 0.000000
2023-10-14 05:21:20,322 epoch 8 - iter 385/773 - loss 0.00752591 - time (sec): 198.94 - samples/sec: 312.32 - lr: 0.000045 - momentum: 0.000000
2023-10-14 05:22:01,368 epoch 8 - iter 462/773 - loss 0.00804190 - time (sec): 239.99 - samples/sec: 310.99 - lr: 0.000043 - momentum: 0.000000
2023-10-14 05:22:41,494 epoch 8 - iter 539/773 - loss 0.00813464 - time (sec): 280.12 - samples/sec: 309.32 - lr: 0.000041 - momentum: 0.000000
2023-10-14 05:23:21,340 epoch 8 - iter 616/773 - loss 0.00780247 - time (sec): 319.96 - samples/sec: 307.58 - lr: 0.000039 - momentum: 0.000000
2023-10-14 05:24:01,896 epoch 8 - iter 693/773 - loss 0.00773844 - time (sec): 360.52 - samples/sec: 308.98 - lr: 0.000038 - momentum: 0.000000
2023-10-14 05:24:43,490 epoch 8 - iter 770/773 - loss 0.00792155 - time (sec): 402.11 - samples/sec: 307.94 - lr: 0.000036 - momentum: 0.000000
2023-10-14 05:24:45,039 ----------------------------------------------------------------------------------------------------
2023-10-14 05:24:45,039 EPOCH 8 done: loss 0.0079 - lr: 0.000036
2023-10-14 05:25:01,962 DEV : loss 0.09088896214962006 - f1-score (micro avg) 0.8079
2023-10-14 05:25:01,991 ----------------------------------------------------------------------------------------------------
2023-10-14 05:25:41,763 epoch 9 - iter 77/773 - loss 0.00236214 - time (sec): 39.77 - samples/sec: 280.21 - lr: 0.000034 - momentum: 0.000000
2023-10-14 05:26:20,893 epoch 9 - iter 154/773 - loss 0.00440063 - time (sec): 78.90 - samples/sec: 284.50 - lr: 0.000032 - momentum: 0.000000
2023-10-14 05:27:01,418 epoch 9 - iter 231/773 - loss 0.00459419 - time (sec): 119.42 - samples/sec: 294.35 - lr: 0.000030 - momentum: 0.000000
2023-10-14 05:27:41,980 epoch 9 - iter 308/773 - loss 0.00521844 - time (sec): 159.99 - samples/sec: 299.28 - lr: 0.000029 - momentum: 0.000000
2023-10-14 05:28:22,697 epoch 9 - iter 385/773 - loss 0.00571228 - time (sec): 200.70 - samples/sec: 304.45 - lr: 0.000027 - momentum: 0.000000
2023-10-14 05:29:03,129 epoch 9 - iter 462/773 - loss 0.00515251 - time (sec): 241.14 - samples/sec: 305.17 - lr: 0.000025 - momentum: 0.000000
2023-10-14 05:29:43,674 epoch 9 - iter 539/773 - loss 0.00494185 - time (sec): 281.68 - samples/sec: 307.35 - lr: 0.000023 - momentum: 0.000000
2023-10-14 05:30:24,625 epoch 9 - iter 616/773 - loss 0.00506031 - time (sec): 322.63 - samples/sec: 307.02 - lr: 0.000022 - momentum: 0.000000
2023-10-14 05:31:04,499 epoch 9 - iter 693/773 - loss 0.00526032 - time (sec): 362.51 - samples/sec: 307.10 - lr: 0.000020 - momentum: 0.000000
2023-10-14 05:31:44,877 epoch 9 - iter 770/773 - loss 0.00537554 - time (sec): 402.88 - samples/sec: 307.31 - lr: 0.000018 - momentum: 0.000000
2023-10-14 05:31:46,348 ----------------------------------------------------------------------------------------------------
2023-10-14 05:31:46,348 EPOCH 9 done: loss 0.0054 - lr: 0.000018
2023-10-14 05:32:03,139 DEV : loss 0.09677362442016602 - f1-score (micro avg) 0.8153
2023-10-14 05:32:03,169 ----------------------------------------------------------------------------------------------------
2023-10-14 05:32:43,330 epoch 10 - iter 77/773 - loss 0.00305417 - time (sec): 40.16 - samples/sec: 311.88 - lr: 0.000016 - momentum: 0.000000
2023-10-14 05:33:24,175 epoch 10 - iter 154/773 - loss 0.00415636 - time (sec): 81.00 - samples/sec: 308.13 - lr: 0.000014 - momentum: 0.000000
2023-10-14 05:34:04,361 epoch 10 - iter 231/773 - loss 0.00316323 - time (sec): 121.19 - samples/sec: 301.80 - lr: 0.000013 - momentum: 0.000000
2023-10-14 05:34:44,873 epoch 10 - iter 308/773 - loss 0.00355807 - time (sec): 161.70 - samples/sec: 299.73 - lr: 0.000011 - momentum: 0.000000
2023-10-14 05:35:26,020 epoch 10 - iter 385/773 - loss 0.00380355 - time (sec): 202.85 - samples/sec: 298.77 - lr: 0.000009 - momentum: 0.000000
2023-10-14 05:36:06,842 epoch 10 - iter 462/773 - loss 0.00383419 - time (sec): 243.67 - samples/sec: 300.43 - lr: 0.000007 - momentum: 0.000000
2023-10-14 05:36:47,496 epoch 10 - iter 539/773 - loss 0.00373753 - time (sec): 284.32 - samples/sec: 300.40 - lr: 0.000006 - momentum: 0.000000
2023-10-14 05:37:26,965 epoch 10 - iter 616/773 - loss 0.00385105 - time (sec): 323.79 - samples/sec: 300.61 - lr: 0.000004 - momentum: 0.000000
2023-10-14 05:38:07,918 epoch 10 - iter 693/773 - loss 0.00408200 - time (sec): 364.75 - samples/sec: 303.62 - lr: 0.000002 - momentum: 0.000000
2023-10-14 05:38:48,898 epoch 10 - iter 770/773 - loss 0.00412668 - time (sec): 405.73 - samples/sec: 305.15 - lr: 0.000000 - momentum: 0.000000
2023-10-14 05:38:50,408 ----------------------------------------------------------------------------------------------------
2023-10-14 05:38:50,408 EPOCH 10 done: loss 0.0041 - lr: 0.000000
2023-10-14 05:39:07,969 DEV : loss 0.09867867082357407 - f1-score (micro avg) 0.82
2023-10-14 05:39:07,999 saving best model
2023-10-14 05:39:11,593 ----------------------------------------------------------------------------------------------------
2023-10-14 05:39:11,595 Loading model from best epoch ...
2023-10-14 05:39:15,952 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
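Note: the 13 tags form a BIOES scheme over the three span types (LOC, BUILDING, STREET), which Flair decodes back into spans at prediction time. A minimal usage sketch, assuming the checkpoint is available locally under the base path above (the example sentence is made up):

from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-topres19th/en-hmbyt5-preliminary/"
    "byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("He walked from Trafalgar Square down the Strand .")
tagger.predict(sentence)

# print each predicted span with its label and confidence
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, round(span.score, 2))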
2023-10-14 05:40:09,975
Results:
- F-score (micro) 0.7993
- F-score (macro) 0.7174
- Accuracy 0.6873
By class:
              precision    recall  f1-score   support

         LOC     0.8470    0.8541    0.8505       946
    BUILDING     0.5508    0.5568    0.5538       185
      STREET     0.7288    0.7679    0.7478        56

   micro avg     0.7950    0.8037    0.7993      1187
   macro avg     0.7089    0.7262    0.7174      1187
weighted avg     0.7952    0.8037    0.7994      1187
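Note: the macro F-score is the unweighted mean of the three per-class F1 values, (0.8505 + 0.5538 + 0.7478) / 3 ≈ 0.7174, whereas the micro F-score pools all 1187 gold spans, so the much larger LOC class dominates it (0.7993).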
2023-10-14 05:40:09,976 ----------------------------------------------------------------------------------------------------