2023-10-13 17:15:31,806 ----------------------------------------------------------------------------------------------------
2023-10-13 17:15:31,809 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
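The layer shapes printed above determine the encoder's parameter count. The sketch below tallies them in plain Python; it assumes (as is standard for T5 models) that `embed_tokens` shares its weight matrix with `shared`, so the 384 x 1472 byte embedding is counted once.

```python
# Parameter count implied by the architecture printout above.
d_model, d_ff, inner, vocab = 1472, 3584, 384, 384

attn = 3 * d_model * inner + inner * d_model  # q, k, v project down; o projects back up
ff = 2 * d_model * d_ff + d_ff * d_model      # gated FF: wi_0, wi_1, wo (no biases)
norms = 2 * d_model                           # two RMSNorm weight vectors per block
per_block = attn + ff + norms

encoder = (
    12 * per_block        # blocks 0..11
    + vocab * d_model     # shared byte embedding (tied with embed_tokens)
    + 32 * 6              # relative_attention_bias (block 0 only)
    + d_model             # final_layer_norm
)
print(per_block, encoder)  # ~18.1M per block, ~218M total
```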
2023-10-13 17:15:31,809 ----------------------------------------------------------------------------------------------------
2023-10-13 17:15:31,810 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-13 17:15:31,810 ----------------------------------------------------------------------------------------------------
2023-10-13 17:15:31,810 Train: 6183 sentences
2023-10-13 17:15:31,810 (train_with_dev=False, train_with_test=False)
2023-10-13 17:15:31,810 ----------------------------------------------------------------------------------------------------
2023-10-13 17:15:31,810 Training Params:
2023-10-13 17:15:31,810 - learning_rate: "0.00015"
2023-10-13 17:15:31,810 - mini_batch_size: "8"
2023-10-13 17:15:31,810 - max_epochs: "10"
2023-10-13 17:15:31,810 - shuffle: "True"
2023-10-13 17:15:31,811 ----------------------------------------------------------------------------------------------------
2023-10-13 17:15:31,811 Plugins:
2023-10-13 17:15:31,811 - TensorboardLogger
2023-10-13 17:15:31,811 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 17:15:31,811 ----------------------------------------------------------------------------------------------------
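The lr column in the iteration lines below follows the LinearScheduler with warmup_fraction 0.1. A minimal reconstruction, assuming linear warmup over the first 10% of the 10 x 773 total steps followed by linear decay to zero (which matches the logged values):

```python
# Linear warmup + linear decay, reconstructed from the logged lr column.
peak_lr = 0.00015
steps_per_epoch = 773
total_steps = 10 * steps_per_epoch     # 7730
warmup_steps = int(0.1 * total_steps)  # 773, i.e. exactly epoch 1

def lr_at(step):
    """Learning rate at a given global step (0-based)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# epoch 1, iter 77  -> logged lr 0.000015
# epoch 2, iter 77  -> logged lr 0.000148 (global step 773 + 77)
```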
2023-10-13 17:15:31,811 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 17:15:31,811 - metric: "('micro avg', 'f1-score')"
2023-10-13 17:15:31,811 ----------------------------------------------------------------------------------------------------
2023-10-13 17:15:31,811 Computation:
2023-10-13 17:15:31,811 - compute on device: cuda:0
2023-10-13 17:15:31,811 - embedding storage: none
2023-10-13 17:15:31,811 ----------------------------------------------------------------------------------------------------
2023-10-13 17:15:31,811 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3"
2023-10-13 17:15:31,812 ----------------------------------------------------------------------------------------------------
2023-10-13 17:15:31,812 ----------------------------------------------------------------------------------------------------
2023-10-13 17:15:31,812 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 17:16:11,330 epoch 1 - iter 77/773 - loss 2.53777300 - time (sec): 39.52 - samples/sec: 292.16 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:16:51,228 epoch 1 - iter 154/773 - loss 2.49369541 - time (sec): 79.41 - samples/sec: 303.18 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:17:31,929 epoch 1 - iter 231/773 - loss 2.32948188 - time (sec): 120.11 - samples/sec: 304.80 - lr: 0.000045 - momentum: 0.000000
2023-10-13 17:18:13,110 epoch 1 - iter 308/773 - loss 2.11470827 - time (sec): 161.30 - samples/sec: 304.89 - lr: 0.000060 - momentum: 0.000000
2023-10-13 17:18:54,504 epoch 1 - iter 385/773 - loss 1.90288187 - time (sec): 202.69 - samples/sec: 301.50 - lr: 0.000075 - momentum: 0.000000
2023-10-13 17:19:34,785 epoch 1 - iter 462/773 - loss 1.67712086 - time (sec): 242.97 - samples/sec: 301.73 - lr: 0.000089 - momentum: 0.000000
2023-10-13 17:20:14,388 epoch 1 - iter 539/773 - loss 1.47746186 - time (sec): 282.57 - samples/sec: 303.06 - lr: 0.000104 - momentum: 0.000000
2023-10-13 17:20:54,452 epoch 1 - iter 616/773 - loss 1.31731180 - time (sec): 322.64 - samples/sec: 304.67 - lr: 0.000119 - momentum: 0.000000
2023-10-13 17:21:34,169 epoch 1 - iter 693/773 - loss 1.19495894 - time (sec): 362.35 - samples/sec: 305.47 - lr: 0.000134 - momentum: 0.000000
2023-10-13 17:22:14,919 epoch 1 - iter 770/773 - loss 1.08518740 - time (sec): 403.10 - samples/sec: 307.41 - lr: 0.000149 - momentum: 0.000000
2023-10-13 17:22:16,338 ----------------------------------------------------------------------------------------------------
2023-10-13 17:22:16,338 EPOCH 1 done: loss 1.0824 - lr: 0.000149
2023-10-13 17:22:32,706 DEV : loss 0.09791397303342819 - f1-score (micro avg) 0.0
2023-10-13 17:22:32,733 ----------------------------------------------------------------------------------------------------
2023-10-13 17:23:12,872 epoch 2 - iter 77/773 - loss 0.13435438 - time (sec): 40.14 - samples/sec: 279.67 - lr: 0.000148 - momentum: 0.000000
2023-10-13 17:23:53,937 epoch 2 - iter 154/773 - loss 0.12660142 - time (sec): 81.20 - samples/sec: 285.64 - lr: 0.000147 - momentum: 0.000000
2023-10-13 17:24:34,663 epoch 2 - iter 231/773 - loss 0.12315260 - time (sec): 121.93 - samples/sec: 297.53 - lr: 0.000145 - momentum: 0.000000
2023-10-13 17:25:14,614 epoch 2 - iter 308/773 - loss 0.12161588 - time (sec): 161.88 - samples/sec: 302.05 - lr: 0.000143 - momentum: 0.000000
2023-10-13 17:25:54,874 epoch 2 - iter 385/773 - loss 0.11903730 - time (sec): 202.14 - samples/sec: 304.42 - lr: 0.000142 - momentum: 0.000000
2023-10-13 17:26:34,375 epoch 2 - iter 462/773 - loss 0.11583604 - time (sec): 241.64 - samples/sec: 302.69 - lr: 0.000140 - momentum: 0.000000
2023-10-13 17:27:13,810 epoch 2 - iter 539/773 - loss 0.11302760 - time (sec): 281.07 - samples/sec: 301.81 - lr: 0.000138 - momentum: 0.000000
2023-10-13 17:27:54,601 epoch 2 - iter 616/773 - loss 0.10939658 - time (sec): 321.87 - samples/sec: 306.16 - lr: 0.000137 - momentum: 0.000000
2023-10-13 17:28:34,289 epoch 2 - iter 693/773 - loss 0.10642948 - time (sec): 361.55 - samples/sec: 305.63 - lr: 0.000135 - momentum: 0.000000
2023-10-13 17:29:15,133 epoch 2 - iter 770/773 - loss 0.10414025 - time (sec): 402.40 - samples/sec: 308.04 - lr: 0.000133 - momentum: 0.000000
2023-10-13 17:29:16,532 ----------------------------------------------------------------------------------------------------
2023-10-13 17:29:16,533 EPOCH 2 done: loss 0.1042 - lr: 0.000133
2023-10-13 17:29:34,169 DEV : loss 0.06052974984049797 - f1-score (micro avg) 0.7483
2023-10-13 17:29:34,204 saving best model
2023-10-13 17:29:35,171 ----------------------------------------------------------------------------------------------------
2023-10-13 17:30:15,916 epoch 3 - iter 77/773 - loss 0.06637866 - time (sec): 40.74 - samples/sec: 315.46 - lr: 0.000132 - momentum: 0.000000
2023-10-13 17:30:57,103 epoch 3 - iter 154/773 - loss 0.07367969 - time (sec): 81.93 - samples/sec: 307.59 - lr: 0.000130 - momentum: 0.000000
2023-10-13 17:31:38,243 epoch 3 - iter 231/773 - loss 0.06770262 - time (sec): 123.07 - samples/sec: 303.97 - lr: 0.000128 - momentum: 0.000000
2023-10-13 17:32:19,573 epoch 3 - iter 308/773 - loss 0.06804529 - time (sec): 164.40 - samples/sec: 301.03 - lr: 0.000127 - momentum: 0.000000
2023-10-13 17:33:01,236 epoch 3 - iter 385/773 - loss 0.06731084 - time (sec): 206.06 - samples/sec: 297.59 - lr: 0.000125 - momentum: 0.000000
2023-10-13 17:33:42,967 epoch 3 - iter 462/773 - loss 0.06679348 - time (sec): 247.79 - samples/sec: 298.22 - lr: 0.000123 - momentum: 0.000000
2023-10-13 17:34:22,539 epoch 3 - iter 539/773 - loss 0.06486323 - time (sec): 287.37 - samples/sec: 300.80 - lr: 0.000122 - momentum: 0.000000
2023-10-13 17:35:02,752 epoch 3 - iter 616/773 - loss 0.06256280 - time (sec): 327.58 - samples/sec: 301.75 - lr: 0.000120 - momentum: 0.000000
2023-10-13 17:35:43,326 epoch 3 - iter 693/773 - loss 0.06248414 - time (sec): 368.15 - samples/sec: 302.07 - lr: 0.000118 - momentum: 0.000000
2023-10-13 17:36:23,839 epoch 3 - iter 770/773 - loss 0.06337434 - time (sec): 408.67 - samples/sec: 302.55 - lr: 0.000117 - momentum: 0.000000
2023-10-13 17:36:25,542 ----------------------------------------------------------------------------------------------------
2023-10-13 17:36:25,543 EPOCH 3 done: loss 0.0635 - lr: 0.000117
2023-10-13 17:36:43,553 DEV : loss 0.05895433574914932 - f1-score (micro avg) 0.7747
2023-10-13 17:36:43,582 saving best model
2023-10-13 17:36:46,243 ----------------------------------------------------------------------------------------------------
2023-10-13 17:37:26,456 epoch 4 - iter 77/773 - loss 0.03920921 - time (sec): 40.21 - samples/sec: 297.47 - lr: 0.000115 - momentum: 0.000000
2023-10-13 17:38:05,917 epoch 4 - iter 154/773 - loss 0.04471010 - time (sec): 79.67 - samples/sec: 302.53 - lr: 0.000113 - momentum: 0.000000
2023-10-13 17:38:46,036 epoch 4 - iter 231/773 - loss 0.04357768 - time (sec): 119.79 - samples/sec: 306.88 - lr: 0.000112 - momentum: 0.000000
2023-10-13 17:39:25,576 epoch 4 - iter 308/773 - loss 0.04110929 - time (sec): 159.33 - samples/sec: 303.72 - lr: 0.000110 - momentum: 0.000000
2023-10-13 17:40:06,355 epoch 4 - iter 385/773 - loss 0.04203085 - time (sec): 200.11 - samples/sec: 303.61 - lr: 0.000108 - momentum: 0.000000
2023-10-13 17:40:47,876 epoch 4 - iter 462/773 - loss 0.04164789 - time (sec): 241.63 - samples/sec: 305.88 - lr: 0.000107 - momentum: 0.000000
2023-10-13 17:41:27,686 epoch 4 - iter 539/773 - loss 0.04178221 - time (sec): 281.44 - samples/sec: 305.45 - lr: 0.000105 - momentum: 0.000000
2023-10-13 17:42:08,800 epoch 4 - iter 616/773 - loss 0.04237077 - time (sec): 322.55 - samples/sec: 306.22 - lr: 0.000103 - momentum: 0.000000
2023-10-13 17:42:49,799 epoch 4 - iter 693/773 - loss 0.04162761 - time (sec): 363.55 - samples/sec: 307.00 - lr: 0.000102 - momentum: 0.000000
2023-10-13 17:43:30,072 epoch 4 - iter 770/773 - loss 0.04094877 - time (sec): 403.82 - samples/sec: 306.52 - lr: 0.000100 - momentum: 0.000000
2023-10-13 17:43:31,577 ----------------------------------------------------------------------------------------------------
2023-10-13 17:43:31,577 EPOCH 4 done: loss 0.0409 - lr: 0.000100
2023-10-13 17:43:48,933 DEV : loss 0.061349667608737946 - f1-score (micro avg) 0.8024
2023-10-13 17:43:48,960 saving best model
2023-10-13 17:43:51,557 ----------------------------------------------------------------------------------------------------
2023-10-13 17:44:32,555 epoch 5 - iter 77/773 - loss 0.02670187 - time (sec): 40.99 - samples/sec: 319.39 - lr: 0.000098 - momentum: 0.000000
2023-10-13 17:45:12,578 epoch 5 - iter 154/773 - loss 0.02458544 - time (sec): 81.02 - samples/sec: 301.83 - lr: 0.000097 - momentum: 0.000000
2023-10-13 17:45:52,902 epoch 5 - iter 231/773 - loss 0.02523199 - time (sec): 121.34 - samples/sec: 307.11 - lr: 0.000095 - momentum: 0.000000
2023-10-13 17:46:33,070 epoch 5 - iter 308/773 - loss 0.02445345 - time (sec): 161.51 - samples/sec: 308.37 - lr: 0.000093 - momentum: 0.000000
2023-10-13 17:47:14,170 epoch 5 - iter 385/773 - loss 0.02715882 - time (sec): 202.61 - samples/sec: 309.92 - lr: 0.000092 - momentum: 0.000000
2023-10-13 17:47:54,217 epoch 5 - iter 462/773 - loss 0.02748993 - time (sec): 242.66 - samples/sec: 311.16 - lr: 0.000090 - momentum: 0.000000
2023-10-13 17:48:34,167 epoch 5 - iter 539/773 - loss 0.02757364 - time (sec): 282.61 - samples/sec: 310.41 - lr: 0.000088 - momentum: 0.000000
2023-10-13 17:49:13,961 epoch 5 - iter 616/773 - loss 0.02745903 - time (sec): 322.40 - samples/sec: 311.14 - lr: 0.000087 - momentum: 0.000000
2023-10-13 17:49:53,667 epoch 5 - iter 693/773 - loss 0.02735164 - time (sec): 362.11 - samples/sec: 310.47 - lr: 0.000085 - momentum: 0.000000
2023-10-13 17:50:32,927 epoch 5 - iter 770/773 - loss 0.02703869 - time (sec): 401.37 - samples/sec: 308.69 - lr: 0.000083 - momentum: 0.000000
2023-10-13 17:50:34,344 ----------------------------------------------------------------------------------------------------
2023-10-13 17:50:34,345 EPOCH 5 done: loss 0.0271 - lr: 0.000083
2023-10-13 17:50:51,329 DEV : loss 0.07207323610782623 - f1-score (micro avg) 0.7876
2023-10-13 17:50:51,359 ----------------------------------------------------------------------------------------------------
2023-10-13 17:51:32,093 epoch 6 - iter 77/773 - loss 0.02037186 - time (sec): 40.73 - samples/sec: 324.37 - lr: 0.000082 - momentum: 0.000000
2023-10-13 17:52:11,821 epoch 6 - iter 154/773 - loss 0.02091576 - time (sec): 80.46 - samples/sec: 306.02 - lr: 0.000080 - momentum: 0.000000
2023-10-13 17:52:53,494 epoch 6 - iter 231/773 - loss 0.02128225 - time (sec): 122.13 - samples/sec: 310.25 - lr: 0.000078 - momentum: 0.000000
2023-10-13 17:53:34,630 epoch 6 - iter 308/773 - loss 0.02091718 - time (sec): 163.27 - samples/sec: 307.22 - lr: 0.000077 - momentum: 0.000000
2023-10-13 17:54:14,102 epoch 6 - iter 385/773 - loss 0.01902102 - time (sec): 202.74 - samples/sec: 304.66 - lr: 0.000075 - momentum: 0.000000
2023-10-13 17:54:54,541 epoch 6 - iter 462/773 - loss 0.01975430 - time (sec): 243.18 - samples/sec: 306.46 - lr: 0.000073 - momentum: 0.000000
2023-10-13 17:55:33,990 epoch 6 - iter 539/773 - loss 0.01975584 - time (sec): 282.63 - samples/sec: 305.55 - lr: 0.000072 - momentum: 0.000000
2023-10-13 17:56:13,256 epoch 6 - iter 616/773 - loss 0.01933327 - time (sec): 321.89 - samples/sec: 304.57 - lr: 0.000070 - momentum: 0.000000
2023-10-13 17:56:53,888 epoch 6 - iter 693/773 - loss 0.01883937 - time (sec): 362.53 - samples/sec: 304.22 - lr: 0.000068 - momentum: 0.000000
2023-10-13 17:57:34,581 epoch 6 - iter 770/773 - loss 0.01913272 - time (sec): 403.22 - samples/sec: 307.28 - lr: 0.000067 - momentum: 0.000000
2023-10-13 17:57:36,011 ----------------------------------------------------------------------------------------------------
2023-10-13 17:57:36,012 EPOCH 6 done: loss 0.0193 - lr: 0.000067
2023-10-13 17:57:52,925 DEV : loss 0.08004289120435715 - f1-score (micro avg) 0.7896
2023-10-13 17:57:52,953 ----------------------------------------------------------------------------------------------------
2023-10-13 17:58:33,473 epoch 7 - iter 77/773 - loss 0.01052829 - time (sec): 40.52 - samples/sec: 315.07 - lr: 0.000065 - momentum: 0.000000
2023-10-13 17:59:12,688 epoch 7 - iter 154/773 - loss 0.01217112 - time (sec): 79.73 - samples/sec: 311.68 - lr: 0.000063 - momentum: 0.000000
2023-10-13 17:59:52,552 epoch 7 - iter 231/773 - loss 0.01258506 - time (sec): 119.60 - samples/sec: 311.55 - lr: 0.000062 - momentum: 0.000000
2023-10-13 18:00:33,331 epoch 7 - iter 308/773 - loss 0.01270538 - time (sec): 160.38 - samples/sec: 311.57 - lr: 0.000060 - momentum: 0.000000
2023-10-13 18:01:13,969 epoch 7 - iter 385/773 - loss 0.01276005 - time (sec): 201.01 - samples/sec: 309.45 - lr: 0.000058 - momentum: 0.000000
2023-10-13 18:01:53,522 epoch 7 - iter 462/773 - loss 0.01229893 - time (sec): 240.57 - samples/sec: 308.18 - lr: 0.000057 - momentum: 0.000000
2023-10-13 18:02:33,945 epoch 7 - iter 539/773 - loss 0.01220461 - time (sec): 280.99 - samples/sec: 307.69 - lr: 0.000055 - momentum: 0.000000
2023-10-13 18:03:14,733 epoch 7 - iter 616/773 - loss 0.01161193 - time (sec): 321.78 - samples/sec: 308.13 - lr: 0.000054 - momentum: 0.000000
2023-10-13 18:03:55,166 epoch 7 - iter 693/773 - loss 0.01260338 - time (sec): 362.21 - samples/sec: 307.12 - lr: 0.000052 - momentum: 0.000000
2023-10-13 18:04:34,914 epoch 7 - iter 770/773 - loss 0.01235583 - time (sec): 401.96 - samples/sec: 307.96 - lr: 0.000050 - momentum: 0.000000
2023-10-13 18:04:36,433 ----------------------------------------------------------------------------------------------------
2023-10-13 18:04:36,433 EPOCH 7 done: loss 0.0124 - lr: 0.000050
2023-10-13 18:04:53,194 DEV : loss 0.09459361433982849 - f1-score (micro avg) 0.792
2023-10-13 18:04:53,223 ----------------------------------------------------------------------------------------------------
2023-10-13 18:05:33,830 epoch 8 - iter 77/773 - loss 0.01144622 - time (sec): 40.60 - samples/sec: 328.26 - lr: 0.000048 - momentum: 0.000000
2023-10-13 18:06:15,374 epoch 8 - iter 154/773 - loss 0.00970948 - time (sec): 82.15 - samples/sec: 316.58 - lr: 0.000047 - momentum: 0.000000
2023-10-13 18:06:56,115 epoch 8 - iter 231/773 - loss 0.00944804 - time (sec): 122.89 - samples/sec: 308.35 - lr: 0.000045 - momentum: 0.000000
2023-10-13 18:07:37,287 epoch 8 - iter 308/773 - loss 0.00923867 - time (sec): 164.06 - samples/sec: 306.77 - lr: 0.000043 - momentum: 0.000000
2023-10-13 18:08:18,386 epoch 8 - iter 385/773 - loss 0.00978032 - time (sec): 205.16 - samples/sec: 310.98 - lr: 0.000042 - momentum: 0.000000
2023-10-13 18:08:59,496 epoch 8 - iter 462/773 - loss 0.01173064 - time (sec): 246.27 - samples/sec: 308.52 - lr: 0.000040 - momentum: 0.000000
2023-10-13 18:09:39,687 epoch 8 - iter 539/773 - loss 0.01090035 - time (sec): 286.46 - samples/sec: 307.75 - lr: 0.000039 - momentum: 0.000000
2023-10-13 18:10:18,679 epoch 8 - iter 616/773 - loss 0.01064762 - time (sec): 325.45 - samples/sec: 304.49 - lr: 0.000037 - momentum: 0.000000
2023-10-13 18:10:58,847 epoch 8 - iter 693/773 - loss 0.01013823 - time (sec): 365.62 - samples/sec: 303.87 - lr: 0.000035 - momentum: 0.000000
2023-10-13 18:11:38,667 epoch 8 - iter 770/773 - loss 0.00975702 - time (sec): 405.44 - samples/sec: 305.54 - lr: 0.000034 - momentum: 0.000000
2023-10-13 18:11:40,105 ----------------------------------------------------------------------------------------------------
2023-10-13 18:11:40,106 EPOCH 8 done: loss 0.0098 - lr: 0.000034
2023-10-13 18:11:57,039 DEV : loss 0.09352090209722519 - f1-score (micro avg) 0.7842
2023-10-13 18:11:57,070 ----------------------------------------------------------------------------------------------------
2023-10-13 18:12:37,400 epoch 9 - iter 77/773 - loss 0.00652820 - time (sec): 40.33 - samples/sec: 318.29 - lr: 0.000032 - momentum: 0.000000
2023-10-13 18:13:18,257 epoch 9 - iter 154/773 - loss 0.00570655 - time (sec): 81.18 - samples/sec: 317.61 - lr: 0.000030 - momentum: 0.000000
2023-10-13 18:13:57,276 epoch 9 - iter 231/773 - loss 0.00585034 - time (sec): 120.20 - samples/sec: 310.14 - lr: 0.000028 - momentum: 0.000000
2023-10-13 18:14:37,023 epoch 9 - iter 308/773 - loss 0.00612878 - time (sec): 159.95 - samples/sec: 308.35 - lr: 0.000027 - momentum: 0.000000
2023-10-13 18:15:16,631 epoch 9 - iter 385/773 - loss 0.00671168 - time (sec): 199.56 - samples/sec: 307.49 - lr: 0.000025 - momentum: 0.000000
2023-10-13 18:15:56,285 epoch 9 - iter 462/773 - loss 0.00718371 - time (sec): 239.21 - samples/sec: 304.49 - lr: 0.000024 - momentum: 0.000000
2023-10-13 18:16:37,807 epoch 9 - iter 539/773 - loss 0.00797348 - time (sec): 280.74 - samples/sec: 305.58 - lr: 0.000022 - momentum: 0.000000
2023-10-13 18:17:18,932 epoch 9 - iter 616/773 - loss 0.00801701 - time (sec): 321.86 - samples/sec: 304.66 - lr: 0.000020 - momentum: 0.000000
2023-10-13 18:17:59,375 epoch 9 - iter 693/773 - loss 0.00786144 - time (sec): 362.30 - samples/sec: 304.33 - lr: 0.000019 - momentum: 0.000000
2023-10-13 18:18:40,458 epoch 9 - iter 770/773 - loss 0.00750063 - time (sec): 403.39 - samples/sec: 307.03 - lr: 0.000017 - momentum: 0.000000
2023-10-13 18:18:41,956 ----------------------------------------------------------------------------------------------------
2023-10-13 18:18:41,956 EPOCH 9 done: loss 0.0075 - lr: 0.000017
2023-10-13 18:18:59,133 DEV : loss 0.09976237267255783 - f1-score (micro avg) 0.7751
2023-10-13 18:18:59,164 ----------------------------------------------------------------------------------------------------
2023-10-13 18:19:39,491 epoch 10 - iter 77/773 - loss 0.00581833 - time (sec): 40.33 - samples/sec: 301.35 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:20:18,915 epoch 10 - iter 154/773 - loss 0.00435431 - time (sec): 79.75 - samples/sec: 296.06 - lr: 0.000014 - momentum: 0.000000
2023-10-13 18:20:59,083 epoch 10 - iter 231/773 - loss 0.00547099 - time (sec): 119.92 - samples/sec: 294.88 - lr: 0.000012 - momentum: 0.000000
2023-10-13 18:21:38,863 epoch 10 - iter 308/773 - loss 0.00540846 - time (sec): 159.70 - samples/sec: 301.38 - lr: 0.000010 - momentum: 0.000000
2023-10-13 18:22:18,093 epoch 10 - iter 385/773 - loss 0.00499552 - time (sec): 198.93 - samples/sec: 304.11 - lr: 0.000009 - momentum: 0.000000
2023-10-13 18:22:58,689 epoch 10 - iter 462/773 - loss 0.00533351 - time (sec): 239.52 - samples/sec: 307.37 - lr: 0.000007 - momentum: 0.000000
2023-10-13 18:23:39,202 epoch 10 - iter 539/773 - loss 0.00590520 - time (sec): 280.04 - samples/sec: 307.79 - lr: 0.000005 - momentum: 0.000000
2023-10-13 18:24:20,024 epoch 10 - iter 616/773 - loss 0.00588295 - time (sec): 320.86 - samples/sec: 309.60 - lr: 0.000004 - momentum: 0.000000
2023-10-13 18:25:00,654 epoch 10 - iter 693/773 - loss 0.00607221 - time (sec): 361.49 - samples/sec: 308.83 - lr: 0.000002 - momentum: 0.000000
2023-10-13 18:25:40,608 epoch 10 - iter 770/773 - loss 0.00573109 - time (sec): 401.44 - samples/sec: 308.29 - lr: 0.000000 - momentum: 0.000000
2023-10-13 18:25:42,149 ----------------------------------------------------------------------------------------------------
2023-10-13 18:25:42,149 EPOCH 10 done: loss 0.0057 - lr: 0.000000
2023-10-13 18:25:59,998 DEV : loss 0.10486873239278793 - f1-score (micro avg) 0.7791
2023-10-13 18:26:01,376 ----------------------------------------------------------------------------------------------------
2023-10-13 18:26:01,377 Loading model from best epoch ...
2023-10-13 18:26:05,333 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
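The 13 tags above form a BIOES scheme over the three entity types (LOC, BUILDING, STREET). A minimal sketch of how such tag sequences collapse into entity spans (the decoder below is illustrative, not Flair's internal implementation):

```python
def decode_bioes(tags):
    """Collapse BIOES tags into (label, start, end) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":            # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":          # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((label, start, i + 1))
            start = None
    return spans

print(decode_bioes(["O", "S-LOC", "B-STREET", "I-STREET", "E-STREET"]))
```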
2023-10-13 18:27:01,370
Results:
- F-score (micro) 0.7957
- F-score (macro) 0.7121
- Accuracy 0.6857
By class:
                precision    recall  f1-score   support

         LOC       0.8471    0.8436    0.8453       946
    BUILDING       0.5362    0.6811    0.6000       185
      STREET       0.7037    0.6786    0.6909        56

   micro avg       0.7815    0.8104    0.7957      1187
   macro avg       0.6957    0.7344    0.7121      1187
weighted avg       0.7919    0.8104    0.7998      1187
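As a sanity check, the averaged rows of the report follow directly from the per-class rows: macro is an unweighted mean of class F1s, weighted is a support-weighted mean, and micro F1 is the harmonic mean of the micro precision and recall:

```python
# Per-class (precision, recall, f1, support) from the report above.
per_class = {
    "LOC":      (0.8471, 0.8436, 0.8453, 946),
    "BUILDING": (0.5362, 0.6811, 0.6000, 185),
    "STREET":   (0.7037, 0.6786, 0.6909, 56),
}

macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
total = sum(s for *_, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total
micro_p, micro_r = 0.7815, 0.8104
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
```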
2023-10-13 18:27:01,370 ----------------------------------------------------------------------------------------------------