2023-10-14 04:28:24,168 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,171 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
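The printed shapes are enough to tally the encoder's parameters by hand. The sketch below is an illustrative back-of-the-envelope count derived only from the shapes above (assuming `shared` and `embed_tokens` are tied, as is usual for T5, so the embedding is counted once); it is not an official figure.

```python
# Rough parameter tally for the encoder printed above, using the
# layer shapes from the log (back-of-the-envelope, not authoritative).

D_MODEL, INNER, D_FF, VOCAB, N_BLOCKS = 1472, 384, 3584, 384, 12

# Self-attention: q, k, v project d_model -> inner; o projects back. No biases.
attn = 3 * D_MODEL * INNER + INNER * D_MODEL
# Gated feed-forward: two input projections (wi_0, wi_1) plus the output wo.
ff = 2 * D_MODEL * D_FF + D_FF * D_MODEL
# Each block carries two RMSNorm weight vectors of size d_model.
norms = 2 * D_MODEL

per_block = attn + ff + norms
rel_bias = 32 * 6                    # relative_attention_bias, block 0 only
embeddings = VOCAB * D_MODEL         # shared == embed_tokens (tied), count once
encoder_total = embeddings + N_BLOCKS * per_block + rel_bias + D_MODEL  # + final norm

print(f"per block: {per_block:,}")          # 18,090,880
print(f"encoder total: {encoder_total:,}")  # 217,657,472
```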
2023-10-14 04:28:24,171 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,171 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-14 04:28:24,171 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,171 Train: 6183 sentences
2023-10-14 04:28:24,171 (train_with_dev=False, train_with_test=False)
2023-10-14 04:28:24,171 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,171 Training Params:
2023-10-14 04:28:24,171 - learning_rate: "0.00016"
2023-10-14 04:28:24,172 - mini_batch_size: "8"
2023-10-14 04:28:24,172 - max_epochs: "10"
2023-10-14 04:28:24,172 - shuffle: "True"
2023-10-14 04:28:24,172 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,172 Plugins:
2023-10-14 04:28:24,172 - TensorboardLogger
2023-10-14 04:28:24,172 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 04:28:24,172 ----------------------------------------------------------------------------------------------------
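The `LinearScheduler` plugin with `warmup_fraction: '0.1'` ramps the learning rate linearly from 0 to the peak of 0.00016 over the first 10% of all training steps (773 of the 7730 mini-batches here), then decays it linearly back to 0. A minimal sketch of that schedule, which reproduces the `lr:` values logged below; the exact step accounting inside Flair is an assumption.

```python
def linear_schedule(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 773 * 10   # 773 mini-batches per epoch, 10 epochs
peak = 0.00016

# Matches the logged values: ~0.000016 at iter 77 of epoch 1,
# ~0.000159 at iter 770 of epoch 1, ~0.000158 at iter 77 of epoch 2,
# and 0 at the very last step.
print(round(linear_schedule(77, total, peak), 6))
print(round(linear_schedule(770, total, peak), 6))
print(round(linear_schedule(773 + 77, total, peak), 6))
```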
2023-10-14 04:28:24,172 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 04:28:24,172 - metric: "('micro avg', 'f1-score')"
2023-10-14 04:28:24,172 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,172 Computation:
2023-10-14 04:28:24,172 - compute on device: cuda:0
2023-10-14 04:28:24,173 - embedding storage: none
2023-10-14 04:28:24,173 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,173 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5"
2023-10-14 04:28:24,173 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,173 ----------------------------------------------------------------------------------------------------
2023-10-14 04:28:24,173 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-14 04:29:05,072 epoch 1 - iter 77/773 - loss 2.54244239 - time (sec): 40.90 - samples/sec: 319.10 - lr: 0.000016 - momentum: 0.000000
2023-10-14 04:29:46,524 epoch 1 - iter 154/773 - loss 2.48615845 - time (sec): 82.35 - samples/sec: 296.47 - lr: 0.000032 - momentum: 0.000000
2023-10-14 04:30:29,288 epoch 1 - iter 231/773 - loss 2.28699351 - time (sec): 125.11 - samples/sec: 303.76 - lr: 0.000048 - momentum: 0.000000
2023-10-14 04:31:09,818 epoch 1 - iter 308/773 - loss 2.07436902 - time (sec): 165.64 - samples/sec: 304.93 - lr: 0.000064 - momentum: 0.000000
2023-10-14 04:31:49,883 epoch 1 - iter 385/773 - loss 1.84298193 - time (sec): 205.71 - samples/sec: 305.33 - lr: 0.000079 - momentum: 0.000000
2023-10-14 04:32:29,519 epoch 1 - iter 462/773 - loss 1.62458957 - time (sec): 245.34 - samples/sec: 304.06 - lr: 0.000095 - momentum: 0.000000
2023-10-14 04:33:10,622 epoch 1 - iter 539/773 - loss 1.43438535 - time (sec): 286.45 - samples/sec: 303.31 - lr: 0.000111 - momentum: 0.000000
2023-10-14 04:33:51,136 epoch 1 - iter 616/773 - loss 1.28509395 - time (sec): 326.96 - samples/sec: 303.63 - lr: 0.000127 - momentum: 0.000000
2023-10-14 04:34:30,851 epoch 1 - iter 693/773 - loss 1.16641135 - time (sec): 366.68 - samples/sec: 304.52 - lr: 0.000143 - momentum: 0.000000
2023-10-14 04:35:10,251 epoch 1 - iter 770/773 - loss 1.06740930 - time (sec): 406.08 - samples/sec: 305.20 - lr: 0.000159 - momentum: 0.000000
2023-10-14 04:35:11,658 ----------------------------------------------------------------------------------------------------
2023-10-14 04:35:11,658 EPOCH 1 done: loss 1.0650 - lr: 0.000159
2023-10-14 04:35:28,397 DEV : loss 0.10037031024694443 - f1-score (micro avg) 0.4469
2023-10-14 04:35:28,428 saving best model
2023-10-14 04:35:29,328 ----------------------------------------------------------------------------------------------------
2023-10-14 04:36:09,179 epoch 2 - iter 77/773 - loss 0.13592744 - time (sec): 39.85 - samples/sec: 313.04 - lr: 0.000158 - momentum: 0.000000
2023-10-14 04:36:48,447 epoch 2 - iter 154/773 - loss 0.12546525 - time (sec): 79.12 - samples/sec: 311.73 - lr: 0.000156 - momentum: 0.000000
2023-10-14 04:37:28,738 epoch 2 - iter 231/773 - loss 0.12188259 - time (sec): 119.41 - samples/sec: 314.85 - lr: 0.000155 - momentum: 0.000000
2023-10-14 04:38:09,450 epoch 2 - iter 308/773 - loss 0.11747475 - time (sec): 160.12 - samples/sec: 309.53 - lr: 0.000153 - momentum: 0.000000
2023-10-14 04:38:49,769 epoch 2 - iter 385/773 - loss 0.11284653 - time (sec): 200.44 - samples/sec: 310.92 - lr: 0.000151 - momentum: 0.000000
2023-10-14 04:39:29,955 epoch 2 - iter 462/773 - loss 0.11231366 - time (sec): 240.62 - samples/sec: 308.68 - lr: 0.000149 - momentum: 0.000000
2023-10-14 04:40:10,666 epoch 2 - iter 539/773 - loss 0.10792935 - time (sec): 281.34 - samples/sec: 306.91 - lr: 0.000148 - momentum: 0.000000
2023-10-14 04:40:51,913 epoch 2 - iter 616/773 - loss 0.10489975 - time (sec): 322.58 - samples/sec: 304.91 - lr: 0.000146 - momentum: 0.000000
2023-10-14 04:41:32,743 epoch 2 - iter 693/773 - loss 0.10300596 - time (sec): 363.41 - samples/sec: 304.24 - lr: 0.000144 - momentum: 0.000000
2023-10-14 04:42:13,919 epoch 2 - iter 770/773 - loss 0.09903163 - time (sec): 404.59 - samples/sec: 306.03 - lr: 0.000142 - momentum: 0.000000
2023-10-14 04:42:15,471 ----------------------------------------------------------------------------------------------------
2023-10-14 04:42:15,471 EPOCH 2 done: loss 0.0989 - lr: 0.000142
2023-10-14 04:42:33,065 DEV : loss 0.0493462048470974 - f1-score (micro avg) 0.7889
2023-10-14 04:42:33,094 saving best model
2023-10-14 04:42:35,694 ----------------------------------------------------------------------------------------------------
2023-10-14 04:43:16,469 epoch 3 - iter 77/773 - loss 0.05743308 - time (sec): 40.77 - samples/sec: 290.22 - lr: 0.000140 - momentum: 0.000000
2023-10-14 04:43:57,375 epoch 3 - iter 154/773 - loss 0.06599354 - time (sec): 81.68 - samples/sec: 295.49 - lr: 0.000139 - momentum: 0.000000
2023-10-14 04:44:37,367 epoch 3 - iter 231/773 - loss 0.06091843 - time (sec): 121.67 - samples/sec: 294.82 - lr: 0.000137 - momentum: 0.000000
2023-10-14 04:45:18,627 epoch 3 - iter 308/773 - loss 0.06058528 - time (sec): 162.93 - samples/sec: 293.19 - lr: 0.000135 - momentum: 0.000000
2023-10-14 04:45:59,351 epoch 3 - iter 385/773 - loss 0.05868897 - time (sec): 203.65 - samples/sec: 289.93 - lr: 0.000133 - momentum: 0.000000
2023-10-14 04:46:40,530 epoch 3 - iter 462/773 - loss 0.05682454 - time (sec): 244.83 - samples/sec: 294.16 - lr: 0.000132 - momentum: 0.000000
2023-10-14 04:47:21,353 epoch 3 - iter 539/773 - loss 0.05871382 - time (sec): 285.66 - samples/sec: 295.04 - lr: 0.000130 - momentum: 0.000000
2023-10-14 04:48:02,038 epoch 3 - iter 616/773 - loss 0.05735535 - time (sec): 326.34 - samples/sec: 300.23 - lr: 0.000128 - momentum: 0.000000
2023-10-14 04:48:42,542 epoch 3 - iter 693/773 - loss 0.05682201 - time (sec): 366.85 - samples/sec: 303.34 - lr: 0.000126 - momentum: 0.000000
2023-10-14 04:49:23,073 epoch 3 - iter 770/773 - loss 0.05721273 - time (sec): 407.38 - samples/sec: 303.82 - lr: 0.000125 - momentum: 0.000000
2023-10-14 04:49:24,586 ----------------------------------------------------------------------------------------------------
2023-10-14 04:49:24,586 EPOCH 3 done: loss 0.0571 - lr: 0.000125
2023-10-14 04:49:42,105 DEV : loss 0.052890341728925705 - f1-score (micro avg) 0.7961
2023-10-14 04:49:42,134 saving best model
2023-10-14 04:49:43,123 ----------------------------------------------------------------------------------------------------
2023-10-14 04:50:23,376 epoch 4 - iter 77/773 - loss 0.03274705 - time (sec): 40.25 - samples/sec: 294.43 - lr: 0.000123 - momentum: 0.000000
2023-10-14 04:51:03,479 epoch 4 - iter 154/773 - loss 0.03605609 - time (sec): 80.35 - samples/sec: 291.14 - lr: 0.000121 - momentum: 0.000000
2023-10-14 04:51:44,703 epoch 4 - iter 231/773 - loss 0.03632022 - time (sec): 121.58 - samples/sec: 301.99 - lr: 0.000119 - momentum: 0.000000
2023-10-14 04:52:24,782 epoch 4 - iter 308/773 - loss 0.03954406 - time (sec): 161.66 - samples/sec: 299.79 - lr: 0.000117 - momentum: 0.000000
2023-10-14 04:53:03,301 epoch 4 - iter 385/773 - loss 0.03855328 - time (sec): 200.18 - samples/sec: 301.88 - lr: 0.000116 - momentum: 0.000000
2023-10-14 04:53:44,031 epoch 4 - iter 462/773 - loss 0.03697764 - time (sec): 240.91 - samples/sec: 305.26 - lr: 0.000114 - momentum: 0.000000
2023-10-14 04:54:25,514 epoch 4 - iter 539/773 - loss 0.03544277 - time (sec): 282.39 - samples/sec: 306.90 - lr: 0.000112 - momentum: 0.000000
2023-10-14 04:55:06,007 epoch 4 - iter 616/773 - loss 0.03525164 - time (sec): 322.88 - samples/sec: 306.33 - lr: 0.000110 - momentum: 0.000000
2023-10-14 04:55:46,241 epoch 4 - iter 693/773 - loss 0.03645316 - time (sec): 363.12 - samples/sec: 304.84 - lr: 0.000109 - momentum: 0.000000
2023-10-14 04:56:26,905 epoch 4 - iter 770/773 - loss 0.03581575 - time (sec): 403.78 - samples/sec: 306.52 - lr: 0.000107 - momentum: 0.000000
2023-10-14 04:56:28,433 ----------------------------------------------------------------------------------------------------
2023-10-14 04:56:28,433 EPOCH 4 done: loss 0.0357 - lr: 0.000107
2023-10-14 04:56:45,803 DEV : loss 0.05653312802314758 - f1-score (micro avg) 0.816
2023-10-14 04:56:45,831 saving best model
2023-10-14 04:56:48,419 ----------------------------------------------------------------------------------------------------
2023-10-14 04:57:28,340 epoch 5 - iter 77/773 - loss 0.02077936 - time (sec): 39.92 - samples/sec: 317.84 - lr: 0.000105 - momentum: 0.000000
2023-10-14 04:58:09,403 epoch 5 - iter 154/773 - loss 0.02462730 - time (sec): 80.98 - samples/sec: 310.25 - lr: 0.000103 - momentum: 0.000000
2023-10-14 04:58:50,054 epoch 5 - iter 231/773 - loss 0.02366038 - time (sec): 121.63 - samples/sec: 306.66 - lr: 0.000101 - momentum: 0.000000
2023-10-14 04:59:30,275 epoch 5 - iter 308/773 - loss 0.02724055 - time (sec): 161.85 - samples/sec: 305.53 - lr: 0.000100 - momentum: 0.000000
2023-10-14 05:00:11,004 epoch 5 - iter 385/773 - loss 0.02658499 - time (sec): 202.58 - samples/sec: 307.71 - lr: 0.000098 - momentum: 0.000000
2023-10-14 05:00:52,378 epoch 5 - iter 462/773 - loss 0.02583538 - time (sec): 243.95 - samples/sec: 305.21 - lr: 0.000096 - momentum: 0.000000
2023-10-14 05:01:33,002 epoch 5 - iter 539/773 - loss 0.02497714 - time (sec): 284.58 - samples/sec: 306.36 - lr: 0.000094 - momentum: 0.000000
2023-10-14 05:02:14,734 epoch 5 - iter 616/773 - loss 0.02475567 - time (sec): 326.31 - samples/sec: 305.63 - lr: 0.000093 - momentum: 0.000000
2023-10-14 05:02:55,559 epoch 5 - iter 693/773 - loss 0.02437182 - time (sec): 367.13 - samples/sec: 303.83 - lr: 0.000091 - momentum: 0.000000
2023-10-14 05:03:36,276 epoch 5 - iter 770/773 - loss 0.02394195 - time (sec): 407.85 - samples/sec: 303.56 - lr: 0.000089 - momentum: 0.000000
2023-10-14 05:03:37,764 ----------------------------------------------------------------------------------------------------
2023-10-14 05:03:37,764 EPOCH 5 done: loss 0.0240 - lr: 0.000089
2023-10-14 05:03:55,466 DEV : loss 0.0688859298825264 - f1-score (micro avg) 0.8031
2023-10-14 05:03:55,496 ----------------------------------------------------------------------------------------------------
2023-10-14 05:04:35,617 epoch 6 - iter 77/773 - loss 0.01506946 - time (sec): 40.12 - samples/sec: 287.07 - lr: 0.000087 - momentum: 0.000000
2023-10-14 05:05:15,281 epoch 6 - iter 154/773 - loss 0.01615607 - time (sec): 79.78 - samples/sec: 300.51 - lr: 0.000085 - momentum: 0.000000
2023-10-14 05:05:55,755 epoch 6 - iter 231/773 - loss 0.01532933 - time (sec): 120.26 - samples/sec: 300.00 - lr: 0.000084 - momentum: 0.000000
2023-10-14 05:06:36,427 epoch 6 - iter 308/773 - loss 0.01523630 - time (sec): 160.93 - samples/sec: 303.87 - lr: 0.000082 - momentum: 0.000000
2023-10-14 05:07:17,537 epoch 6 - iter 385/773 - loss 0.01580359 - time (sec): 202.04 - samples/sec: 307.17 - lr: 0.000080 - momentum: 0.000000
2023-10-14 05:07:58,881 epoch 6 - iter 462/773 - loss 0.01646299 - time (sec): 243.38 - samples/sec: 305.67 - lr: 0.000078 - momentum: 0.000000
2023-10-14 05:08:38,955 epoch 6 - iter 539/773 - loss 0.01563844 - time (sec): 283.46 - samples/sec: 305.15 - lr: 0.000077 - momentum: 0.000000
2023-10-14 05:09:19,517 epoch 6 - iter 616/773 - loss 0.01644155 - time (sec): 324.02 - samples/sec: 305.40 - lr: 0.000075 - momentum: 0.000000
2023-10-14 05:09:59,785 epoch 6 - iter 693/773 - loss 0.01652567 - time (sec): 364.29 - samples/sec: 306.53 - lr: 0.000073 - momentum: 0.000000
2023-10-14 05:10:39,880 epoch 6 - iter 770/773 - loss 0.01657071 - time (sec): 404.38 - samples/sec: 306.49 - lr: 0.000071 - momentum: 0.000000
2023-10-14 05:10:41,347 ----------------------------------------------------------------------------------------------------
2023-10-14 05:10:41,347 EPOCH 6 done: loss 0.0165 - lr: 0.000071
2023-10-14 05:10:58,493 DEV : loss 0.07852932810783386 - f1-score (micro avg) 0.8047
2023-10-14 05:10:58,532 ----------------------------------------------------------------------------------------------------
2023-10-14 05:11:37,820 epoch 7 - iter 77/773 - loss 0.01159909 - time (sec): 39.29 - samples/sec: 289.93 - lr: 0.000069 - momentum: 0.000000
2023-10-14 05:12:17,357 epoch 7 - iter 154/773 - loss 0.01174655 - time (sec): 78.82 - samples/sec: 299.84 - lr: 0.000068 - momentum: 0.000000
2023-10-14 05:12:58,566 epoch 7 - iter 231/773 - loss 0.01105339 - time (sec): 120.03 - samples/sec: 305.27 - lr: 0.000066 - momentum: 0.000000
2023-10-14 05:13:40,261 epoch 7 - iter 308/773 - loss 0.01203373 - time (sec): 161.73 - samples/sec: 307.36 - lr: 0.000064 - momentum: 0.000000
2023-10-14 05:14:21,982 epoch 7 - iter 385/773 - loss 0.01173296 - time (sec): 203.45 - samples/sec: 308.46 - lr: 0.000062 - momentum: 0.000000
2023-10-14 05:15:02,810 epoch 7 - iter 462/773 - loss 0.01119084 - time (sec): 244.28 - samples/sec: 310.97 - lr: 0.000061 - momentum: 0.000000
2023-10-14 05:15:42,806 epoch 7 - iter 539/773 - loss 0.01125000 - time (sec): 284.27 - samples/sec: 308.90 - lr: 0.000059 - momentum: 0.000000
2023-10-14 05:16:21,757 epoch 7 - iter 616/773 - loss 0.01175103 - time (sec): 323.22 - samples/sec: 305.92 - lr: 0.000057 - momentum: 0.000000
2023-10-14 05:17:02,506 epoch 7 - iter 693/773 - loss 0.01159665 - time (sec): 363.97 - samples/sec: 306.82 - lr: 0.000055 - momentum: 0.000000
2023-10-14 05:17:42,862 epoch 7 - iter 770/773 - loss 0.01145544 - time (sec): 404.33 - samples/sec: 306.33 - lr: 0.000054 - momentum: 0.000000
2023-10-14 05:17:44,336 ----------------------------------------------------------------------------------------------------
2023-10-14 05:17:44,337 EPOCH 7 done: loss 0.0114 - lr: 0.000054
2023-10-14 05:18:01,345 DEV : loss 0.08401428908109665 - f1-score (micro avg) 0.8103
2023-10-14 05:18:01,376 ----------------------------------------------------------------------------------------------------
2023-10-14 05:18:41,541 epoch 8 - iter 77/773 - loss 0.00927132 - time (sec): 40.16 - samples/sec: 306.76 - lr: 0.000052 - momentum: 0.000000
2023-10-14 05:19:20,478 epoch 8 - iter 154/773 - loss 0.00950282 - time (sec): 79.10 - samples/sec: 313.87 - lr: 0.000050 - momentum: 0.000000
2023-10-14 05:20:00,678 epoch 8 - iter 231/773 - loss 0.00813838 - time (sec): 119.30 - samples/sec: 311.60 - lr: 0.000048 - momentum: 0.000000
2023-10-14 05:20:40,036 epoch 8 - iter 308/773 - loss 0.00833648 - time (sec): 158.66 - samples/sec: 311.51 - lr: 0.000046 - momentum: 0.000000
2023-10-14 05:21:20,322 epoch 8 - iter 385/773 - loss 0.00752591 - time (sec): 198.94 - samples/sec: 312.32 - lr: 0.000045 - momentum: 0.000000
2023-10-14 05:22:01,368 epoch 8 - iter 462/773 - loss 0.00804190 - time (sec): 239.99 - samples/sec: 310.99 - lr: 0.000043 - momentum: 0.000000
2023-10-14 05:22:41,494 epoch 8 - iter 539/773 - loss 0.00813464 - time (sec): 280.12 - samples/sec: 309.32 - lr: 0.000041 - momentum: 0.000000
2023-10-14 05:23:21,340 epoch 8 - iter 616/773 - loss 0.00780247 - time (sec): 319.96 - samples/sec: 307.58 - lr: 0.000039 - momentum: 0.000000
2023-10-14 05:24:01,896 epoch 8 - iter 693/773 - loss 0.00773844 - time (sec): 360.52 - samples/sec: 308.98 - lr: 0.000038 - momentum: 0.000000
2023-10-14 05:24:43,490 epoch 8 - iter 770/773 - loss 0.00792155 - time (sec): 402.11 - samples/sec: 307.94 - lr: 0.000036 - momentum: 0.000000
2023-10-14 05:24:45,039 ----------------------------------------------------------------------------------------------------
2023-10-14 05:24:45,039 EPOCH 8 done: loss 0.0079 - lr: 0.000036
2023-10-14 05:25:01,962 DEV : loss 0.09088896214962006 - f1-score (micro avg) 0.8079
2023-10-14 05:25:01,991 ----------------------------------------------------------------------------------------------------
2023-10-14 05:25:41,763 epoch 9 - iter 77/773 - loss 0.00236214 - time (sec): 39.77 - samples/sec: 280.21 - lr: 0.000034 - momentum: 0.000000
2023-10-14 05:26:20,893 epoch 9 - iter 154/773 - loss 0.00440063 - time (sec): 78.90 - samples/sec: 284.50 - lr: 0.000032 - momentum: 0.000000
2023-10-14 05:27:01,418 epoch 9 - iter 231/773 - loss 0.00459419 - time (sec): 119.42 - samples/sec: 294.35 - lr: 0.000030 - momentum: 0.000000
2023-10-14 05:27:41,980 epoch 9 - iter 308/773 - loss 0.00521844 - time (sec): 159.99 - samples/sec: 299.28 - lr: 0.000029 - momentum: 0.000000
2023-10-14 05:28:22,697 epoch 9 - iter 385/773 - loss 0.00571228 - time (sec): 200.70 - samples/sec: 304.45 - lr: 0.000027 - momentum: 0.000000
2023-10-14 05:29:03,129 epoch 9 - iter 462/773 - loss 0.00515251 - time (sec): 241.14 - samples/sec: 305.17 - lr: 0.000025 - momentum: 0.000000
2023-10-14 05:29:43,674 epoch 9 - iter 539/773 - loss 0.00494185 - time (sec): 281.68 - samples/sec: 307.35 - lr: 0.000023 - momentum: 0.000000
2023-10-14 05:30:24,625 epoch 9 - iter 616/773 - loss 0.00506031 - time (sec): 322.63 - samples/sec: 307.02 - lr: 0.000022 - momentum: 0.000000
2023-10-14 05:31:04,499 epoch 9 - iter 693/773 - loss 0.00526032 - time (sec): 362.51 - samples/sec: 307.10 - lr: 0.000020 - momentum: 0.000000
2023-10-14 05:31:44,877 epoch 9 - iter 770/773 - loss 0.00537554 - time (sec): 402.88 - samples/sec: 307.31 - lr: 0.000018 - momentum: 0.000000
2023-10-14 05:31:46,348 ----------------------------------------------------------------------------------------------------
2023-10-14 05:31:46,348 EPOCH 9 done: loss 0.0054 - lr: 0.000018
2023-10-14 05:32:03,139 DEV : loss 0.09677362442016602 - f1-score (micro avg) 0.8153
2023-10-14 05:32:03,169 ----------------------------------------------------------------------------------------------------
2023-10-14 05:32:43,330 epoch 10 - iter 77/773 - loss 0.00305417 - time (sec): 40.16 - samples/sec: 311.88 - lr: 0.000016 - momentum: 0.000000
2023-10-14 05:33:24,175 epoch 10 - iter 154/773 - loss 0.00415636 - time (sec): 81.00 - samples/sec: 308.13 - lr: 0.000014 - momentum: 0.000000
2023-10-14 05:34:04,361 epoch 10 - iter 231/773 - loss 0.00316323 - time (sec): 121.19 - samples/sec: 301.80 - lr: 0.000013 - momentum: 0.000000
2023-10-14 05:34:44,873 epoch 10 - iter 308/773 - loss 0.00355807 - time (sec): 161.70 - samples/sec: 299.73 - lr: 0.000011 - momentum: 0.000000
2023-10-14 05:35:26,020 epoch 10 - iter 385/773 - loss 0.00380355 - time (sec): 202.85 - samples/sec: 298.77 - lr: 0.000009 - momentum: 0.000000
2023-10-14 05:36:06,842 epoch 10 - iter 462/773 - loss 0.00383419 - time (sec): 243.67 - samples/sec: 300.43 - lr: 0.000007 - momentum: 0.000000
2023-10-14 05:36:47,496 epoch 10 - iter 539/773 - loss 0.00373753 - time (sec): 284.32 - samples/sec: 300.40 - lr: 0.000006 - momentum: 0.000000
2023-10-14 05:37:26,965 epoch 10 - iter 616/773 - loss 0.00385105 - time (sec): 323.79 - samples/sec: 300.61 - lr: 0.000004 - momentum: 0.000000
2023-10-14 05:38:07,918 epoch 10 - iter 693/773 - loss 0.00408200 - time (sec): 364.75 - samples/sec: 303.62 - lr: 0.000002 - momentum: 0.000000
2023-10-14 05:38:48,898 epoch 10 - iter 770/773 - loss 0.00412668 - time (sec): 405.73 - samples/sec: 305.15 - lr: 0.000000 - momentum: 0.000000
2023-10-14 05:38:50,408 ----------------------------------------------------------------------------------------------------
2023-10-14 05:38:50,408 EPOCH 10 done: loss 0.0041 - lr: 0.000000
2023-10-14 05:39:07,969 DEV : loss 0.09867867082357407 - f1-score (micro avg) 0.82
2023-10-14 05:39:07,999 saving best model
2023-10-14 05:39:11,593 ----------------------------------------------------------------------------------------------------
2023-10-14 05:39:11,595 Loading model from best epoch ...
2023-10-14 05:39:15,952 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
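The 13-tag dictionary is simply the `O` tag plus the four BIOES positions (Single, Begin, End, Inside) for each of the three span types. A small sketch that regenerates it in the order logged above:

```python
# O plus the BIOES positions (S, B, E, I) for each span type gives 13 tags.
span_types = ["LOC", "BUILDING", "STREET"]
tags = ["O"] + [f"{pos}-{t}" for t in span_types for pos in "SBEI"]

print(tags)
# ['O', 'S-LOC', 'B-LOC', 'E-LOC', 'I-LOC', 'S-BUILDING', 'B-BUILDING',
#  'E-BUILDING', 'I-BUILDING', 'S-STREET', 'B-STREET', 'E-STREET', 'I-STREET']
```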
2023-10-14 05:40:09,975
Results:
- F-score (micro) 0.7993
- F-score (macro) 0.7174
- Accuracy 0.6873

By class:
              precision    recall  f1-score   support

         LOC     0.8470    0.8541    0.8505       946
    BUILDING     0.5508    0.5568    0.5538       185
      STREET     0.7288    0.7679    0.7478        56

   micro avg     0.7950    0.8037    0.7993      1187
   macro avg     0.7089    0.7262    0.7174      1187
weighted avg     0.7952    0.8037    0.7994      1187
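The averaged rows follow directly from the per-class rows: micro F1 is the harmonic mean of micro precision and recall, macro F1 is the unweighted mean of the class F1 scores, and the weighted average weighs each class F1 by its support. A quick consistency check against the table above:

```python
# Per-class rows from the table: precision, recall, f1, support.
per_class = {
    "LOC":      (0.8470, 0.8541, 0.8505, 946),
    "BUILDING": (0.5508, 0.5568, 0.5538, 185),
    "STREET":   (0.7288, 0.7679, 0.7478, 56),
}

# Micro F1: harmonic mean of micro precision and micro recall.
p, r = 0.7950, 0.8037
micro_f1 = 2 * p * r / (p + r)

# Macro F1: unweighted mean of the class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted F1: class F1 weighted by support.
total_support = sum(s for *_, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
# 0.7993 0.7174 0.7994  -- matches the micro/macro/weighted avg rows
```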
2023-10-14 05:40:09,976 ----------------------------------------------------------------------------------------------------