2023-10-13 10:28:31,999 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,002 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
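For reference, the encoder's parameter count can be tallied directly from the shapes printed above. A rough sketch (all dimensions read from the module dump; the `shared` and `embed_tokens` embeddings are counted once since T5 ties them, and the `FusedRMSNorm` layers carry only a weight vector):

```python
# Tally parameters implied by the module dump above.
d_model, d_ff, d_proj, vocab = 1472, 3584, 384, 384
n_blocks = 12

embedding = vocab * d_model                                   # shared / embed_tokens (tied)
attention = 3 * d_model * d_proj + d_proj * d_model           # q, k, v + o (no bias)
feed_forward = 2 * d_model * d_ff + d_ff * d_model            # wi_0, wi_1 + wo (no bias)
per_block = attention + feed_forward + 2 * d_model            # + two RMSNorm weights

total = (embedding
         + n_blocks * per_block
         + 32 * 6        # relative_attention_bias, block 0 only
         + d_model)      # final_layer_norm

print(f"{total:,}")      # roughly 218M encoder parameters
```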
2023-10-13 10:28:32,002 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,002 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 10:28:32,002 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,002 Train: 7936 sentences
2023-10-13 10:28:32,002 (train_with_dev=False, train_with_test=False)
2023-10-13 10:28:32,002 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,002 Training Params:
2023-10-13 10:28:32,002 - learning_rate: "0.00016"
2023-10-13 10:28:32,002 - mini_batch_size: "8"
2023-10-13 10:28:32,002 - max_epochs: "10"
2023-10-13 10:28:32,003 - shuffle: "True"
2023-10-13 10:28:32,003 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,003 Plugins:
2023-10-13 10:28:32,003 - TensorboardLogger
2023-10-13 10:28:32,003 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 10:28:32,003 ----------------------------------------------------------------------------------------------------
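The per-iteration learning rates logged below are consistent with a linear warmup/decay schedule: ramp up over the first 10% of all steps (`warmup_fraction: '0.1'`) to the peak `learning_rate`, then decay linearly to zero. A minimal sketch of that behaviour (not Flair's actual `LinearScheduler` implementation; step counts taken from this run, 992 iterations × 10 epochs):

```python
def linear_schedule(step, peak_lr=0.00016, total_steps=9920, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay to 0. Sketch of the logged schedule, not Flair's code."""
    warmup_steps = int(total_steps * warmup_fraction)  # 992 here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Matches the values in the log lines below:
print(f"{linear_schedule(99):.6f}")    # epoch 1, iter 99  -> 0.000016
print(f"{linear_schedule(990):.6f}")   # epoch 1, iter 990 -> 0.000160
print(f"{linear_schedule(1091):.6f}")  # epoch 2, iter 99  -> 0.000158
```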
2023-10-13 10:28:32,003 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 10:28:32,003 - metric: "('micro avg', 'f1-score')"
2023-10-13 10:28:32,003 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,003 Computation:
2023-10-13 10:28:32,003 - compute on device: cuda:0
2023-10-13 10:28:32,003 - embedding storage: none
2023-10-13 10:28:32,003 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,003 Model training base path: "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5"
2023-10-13 10:28:32,004 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,004 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,004 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 10:29:22,990 epoch 1 - iter 99/992 - loss 2.54607160 - time (sec): 50.98 - samples/sec: 346.76 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:30:12,629 epoch 1 - iter 198/992 - loss 2.45530289 - time (sec): 100.62 - samples/sec: 331.63 - lr: 0.000032 - momentum: 0.000000
2023-10-13 10:31:03,198 epoch 1 - iter 297/992 - loss 2.22291643 - time (sec): 151.19 - samples/sec: 334.65 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:31:51,815 epoch 1 - iter 396/992 - loss 1.99643086 - time (sec): 199.81 - samples/sec: 328.36 - lr: 0.000064 - momentum: 0.000000
2023-10-13 10:32:42,894 epoch 1 - iter 495/992 - loss 1.74941636 - time (sec): 250.89 - samples/sec: 325.20 - lr: 0.000080 - momentum: 0.000000
2023-10-13 10:33:33,129 epoch 1 - iter 594/992 - loss 1.53508071 - time (sec): 301.12 - samples/sec: 323.92 - lr: 0.000096 - momentum: 0.000000
2023-10-13 10:34:23,728 epoch 1 - iter 693/992 - loss 1.36371276 - time (sec): 351.72 - samples/sec: 325.04 - lr: 0.000112 - momentum: 0.000000
2023-10-13 10:35:13,806 epoch 1 - iter 792/992 - loss 1.22715710 - time (sec): 401.80 - samples/sec: 324.30 - lr: 0.000128 - momentum: 0.000000
2023-10-13 10:36:04,123 epoch 1 - iter 891/992 - loss 1.10440570 - time (sec): 452.12 - samples/sec: 327.35 - lr: 0.000144 - momentum: 0.000000
2023-10-13 10:36:54,592 epoch 1 - iter 990/992 - loss 1.01705618 - time (sec): 502.59 - samples/sec: 325.82 - lr: 0.000160 - momentum: 0.000000
2023-10-13 10:36:55,567 ----------------------------------------------------------------------------------------------------
2023-10-13 10:36:55,567 EPOCH 1 done: loss 1.0159 - lr: 0.000160
2023-10-13 10:37:22,889 DEV : loss 0.1508796662092209 - f1-score (micro avg) 0.6481
2023-10-13 10:37:22,931 saving best model
2023-10-13 10:37:23,947 ----------------------------------------------------------------------------------------------------
2023-10-13 10:38:18,742 epoch 2 - iter 99/992 - loss 0.18502898 - time (sec): 54.79 - samples/sec: 302.05 - lr: 0.000158 - momentum: 0.000000
2023-10-13 10:39:12,710 epoch 2 - iter 198/992 - loss 0.16311170 - time (sec): 108.76 - samples/sec: 305.55 - lr: 0.000156 - momentum: 0.000000
2023-10-13 10:40:05,412 epoch 2 - iter 297/992 - loss 0.15362429 - time (sec): 161.46 - samples/sec: 310.91 - lr: 0.000155 - momentum: 0.000000
2023-10-13 10:40:57,282 epoch 2 - iter 396/992 - loss 0.15052045 - time (sec): 213.33 - samples/sec: 308.57 - lr: 0.000153 - momentum: 0.000000
2023-10-13 10:41:48,850 epoch 2 - iter 495/992 - loss 0.14467975 - time (sec): 264.90 - samples/sec: 310.68 - lr: 0.000151 - momentum: 0.000000
2023-10-13 10:42:41,866 epoch 2 - iter 594/992 - loss 0.14120945 - time (sec): 317.92 - samples/sec: 310.06 - lr: 0.000149 - momentum: 0.000000
2023-10-13 10:43:33,582 epoch 2 - iter 693/992 - loss 0.13726963 - time (sec): 369.63 - samples/sec: 310.73 - lr: 0.000148 - momentum: 0.000000
2023-10-13 10:44:29,297 epoch 2 - iter 792/992 - loss 0.13323570 - time (sec): 425.35 - samples/sec: 307.54 - lr: 0.000146 - momentum: 0.000000
2023-10-13 10:45:23,019 epoch 2 - iter 891/992 - loss 0.13115777 - time (sec): 479.07 - samples/sec: 305.23 - lr: 0.000144 - momentum: 0.000000
2023-10-13 10:46:19,168 epoch 2 - iter 990/992 - loss 0.12735787 - time (sec): 535.22 - samples/sec: 305.89 - lr: 0.000142 - momentum: 0.000000
2023-10-13 10:46:20,180 ----------------------------------------------------------------------------------------------------
2023-10-13 10:46:20,181 EPOCH 2 done: loss 0.1273 - lr: 0.000142
2023-10-13 10:46:47,084 DEV : loss 0.08622897416353226 - f1-score (micro avg) 0.7445
2023-10-13 10:46:47,127 saving best model
2023-10-13 10:46:49,839 ----------------------------------------------------------------------------------------------------
2023-10-13 10:47:46,493 epoch 3 - iter 99/992 - loss 0.07517256 - time (sec): 56.65 - samples/sec: 284.28 - lr: 0.000140 - momentum: 0.000000
2023-10-13 10:48:39,495 epoch 3 - iter 198/992 - loss 0.08038487 - time (sec): 109.65 - samples/sec: 295.60 - lr: 0.000139 - momentum: 0.000000
2023-10-13 10:49:29,787 epoch 3 - iter 297/992 - loss 0.07981542 - time (sec): 159.94 - samples/sec: 303.29 - lr: 0.000137 - momentum: 0.000000
2023-10-13 10:50:21,492 epoch 3 - iter 396/992 - loss 0.08034004 - time (sec): 211.65 - samples/sec: 306.67 - lr: 0.000135 - momentum: 0.000000
2023-10-13 10:51:10,594 epoch 3 - iter 495/992 - loss 0.07926530 - time (sec): 260.75 - samples/sec: 310.48 - lr: 0.000133 - momentum: 0.000000
2023-10-13 10:52:02,752 epoch 3 - iter 594/992 - loss 0.07675274 - time (sec): 312.91 - samples/sec: 312.36 - lr: 0.000132 - momentum: 0.000000
2023-10-13 10:52:53,359 epoch 3 - iter 693/992 - loss 0.07650777 - time (sec): 363.52 - samples/sec: 314.27 - lr: 0.000130 - momentum: 0.000000
2023-10-13 10:53:44,643 epoch 3 - iter 792/992 - loss 0.07496469 - time (sec): 414.80 - samples/sec: 314.81 - lr: 0.000128 - momentum: 0.000000
2023-10-13 10:54:37,296 epoch 3 - iter 891/992 - loss 0.07279589 - time (sec): 467.45 - samples/sec: 314.83 - lr: 0.000126 - momentum: 0.000000
2023-10-13 10:55:30,001 epoch 3 - iter 990/992 - loss 0.07288910 - time (sec): 520.16 - samples/sec: 314.61 - lr: 0.000125 - momentum: 0.000000
2023-10-13 10:55:31,036 ----------------------------------------------------------------------------------------------------
2023-10-13 10:55:31,036 EPOCH 3 done: loss 0.0728 - lr: 0.000125
2023-10-13 10:55:58,293 DEV : loss 0.0825371965765953 - f1-score (micro avg) 0.7707
2023-10-13 10:55:58,349 saving best model
2023-10-13 10:56:01,110 ----------------------------------------------------------------------------------------------------
2023-10-13 10:56:53,520 epoch 4 - iter 99/992 - loss 0.05384753 - time (sec): 52.41 - samples/sec: 314.45 - lr: 0.000123 - momentum: 0.000000
2023-10-13 10:57:42,908 epoch 4 - iter 198/992 - loss 0.05210955 - time (sec): 101.79 - samples/sec: 321.15 - lr: 0.000121 - momentum: 0.000000
2023-10-13 10:58:35,321 epoch 4 - iter 297/992 - loss 0.04795320 - time (sec): 154.21 - samples/sec: 317.21 - lr: 0.000119 - momentum: 0.000000
2023-10-13 10:59:27,126 epoch 4 - iter 396/992 - loss 0.04830221 - time (sec): 206.01 - samples/sec: 316.89 - lr: 0.000117 - momentum: 0.000000
2023-10-13 11:00:19,194 epoch 4 - iter 495/992 - loss 0.04876538 - time (sec): 258.08 - samples/sec: 319.29 - lr: 0.000116 - momentum: 0.000000
2023-10-13 11:01:16,686 epoch 4 - iter 594/992 - loss 0.04825072 - time (sec): 315.57 - samples/sec: 314.28 - lr: 0.000114 - momentum: 0.000000
2023-10-13 11:02:13,835 epoch 4 - iter 693/992 - loss 0.04824595 - time (sec): 372.72 - samples/sec: 308.38 - lr: 0.000112 - momentum: 0.000000
2023-10-13 11:03:12,486 epoch 4 - iter 792/992 - loss 0.04915292 - time (sec): 431.37 - samples/sec: 304.00 - lr: 0.000110 - momentum: 0.000000
2023-10-13 11:04:07,648 epoch 4 - iter 891/992 - loss 0.04956659 - time (sec): 486.53 - samples/sec: 303.02 - lr: 0.000109 - momentum: 0.000000
2023-10-13 11:04:58,156 epoch 4 - iter 990/992 - loss 0.05060249 - time (sec): 537.04 - samples/sec: 304.81 - lr: 0.000107 - momentum: 0.000000
2023-10-13 11:04:59,158 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:59,158 EPOCH 4 done: loss 0.0506 - lr: 0.000107
2023-10-13 11:05:25,002 DEV : loss 0.10595876723527908 - f1-score (micro avg) 0.7755
2023-10-13 11:05:25,051 saving best model
2023-10-13 11:05:27,786 ----------------------------------------------------------------------------------------------------
2023-10-13 11:06:19,003 epoch 5 - iter 99/992 - loss 0.03030251 - time (sec): 51.21 - samples/sec: 327.85 - lr: 0.000105 - momentum: 0.000000
2023-10-13 11:07:10,468 epoch 5 - iter 198/992 - loss 0.03358011 - time (sec): 102.68 - samples/sec: 324.66 - lr: 0.000103 - momentum: 0.000000
2023-10-13 11:08:03,278 epoch 5 - iter 297/992 - loss 0.03869957 - time (sec): 155.49 - samples/sec: 321.86 - lr: 0.000101 - momentum: 0.000000
2023-10-13 11:08:56,370 epoch 5 - iter 396/992 - loss 0.03861410 - time (sec): 208.58 - samples/sec: 318.55 - lr: 0.000100 - momentum: 0.000000
2023-10-13 11:09:52,646 epoch 5 - iter 495/992 - loss 0.03807750 - time (sec): 264.86 - samples/sec: 311.47 - lr: 0.000098 - momentum: 0.000000
2023-10-13 11:10:42,137 epoch 5 - iter 594/992 - loss 0.03891176 - time (sec): 314.35 - samples/sec: 312.64 - lr: 0.000096 - momentum: 0.000000
2023-10-13 11:11:36,838 epoch 5 - iter 693/992 - loss 0.03893009 - time (sec): 369.05 - samples/sec: 309.49 - lr: 0.000094 - momentum: 0.000000
2023-10-13 11:12:29,850 epoch 5 - iter 792/992 - loss 0.03920528 - time (sec): 422.06 - samples/sec: 310.98 - lr: 0.000093 - momentum: 0.000000
2023-10-13 11:13:19,013 epoch 5 - iter 891/992 - loss 0.03906993 - time (sec): 471.22 - samples/sec: 313.33 - lr: 0.000091 - momentum: 0.000000
2023-10-13 11:14:10,723 epoch 5 - iter 990/992 - loss 0.03870073 - time (sec): 522.93 - samples/sec: 313.20 - lr: 0.000089 - momentum: 0.000000
2023-10-13 11:14:11,614 ----------------------------------------------------------------------------------------------------
2023-10-13 11:14:11,614 EPOCH 5 done: loss 0.0387 - lr: 0.000089
2023-10-13 11:14:40,264 DEV : loss 0.12928840517997742 - f1-score (micro avg) 0.7545
2023-10-13 11:14:40,315 ----------------------------------------------------------------------------------------------------
2023-10-13 11:15:34,003 epoch 6 - iter 99/992 - loss 0.02303380 - time (sec): 53.69 - samples/sec: 318.24 - lr: 0.000087 - momentum: 0.000000
2023-10-13 11:16:24,574 epoch 6 - iter 198/992 - loss 0.02422261 - time (sec): 104.26 - samples/sec: 321.60 - lr: 0.000085 - momentum: 0.000000
2023-10-13 11:17:14,756 epoch 6 - iter 297/992 - loss 0.02528001 - time (sec): 154.44 - samples/sec: 319.42 - lr: 0.000084 - momentum: 0.000000
2023-10-13 11:18:08,316 epoch 6 - iter 396/992 - loss 0.02751581 - time (sec): 208.00 - samples/sec: 316.97 - lr: 0.000082 - momentum: 0.000000
2023-10-13 11:19:02,202 epoch 6 - iter 495/992 - loss 0.02730045 - time (sec): 261.88 - samples/sec: 314.10 - lr: 0.000080 - momentum: 0.000000
2023-10-13 11:19:54,079 epoch 6 - iter 594/992 - loss 0.02740559 - time (sec): 313.76 - samples/sec: 314.14 - lr: 0.000078 - momentum: 0.000000
2023-10-13 11:20:45,462 epoch 6 - iter 693/992 - loss 0.02803668 - time (sec): 365.14 - samples/sec: 314.89 - lr: 0.000077 - momentum: 0.000000
2023-10-13 11:21:36,302 epoch 6 - iter 792/992 - loss 0.02830503 - time (sec): 415.98 - samples/sec: 314.74 - lr: 0.000075 - momentum: 0.000000
2023-10-13 11:22:27,272 epoch 6 - iter 891/992 - loss 0.02753052 - time (sec): 466.95 - samples/sec: 315.17 - lr: 0.000073 - momentum: 0.000000
2023-10-13 11:23:22,766 epoch 6 - iter 990/992 - loss 0.02823935 - time (sec): 522.45 - samples/sec: 313.35 - lr: 0.000071 - momentum: 0.000000
2023-10-13 11:23:23,978 ----------------------------------------------------------------------------------------------------
2023-10-13 11:23:23,978 EPOCH 6 done: loss 0.0284 - lr: 0.000071
2023-10-13 11:23:50,511 DEV : loss 0.1517123430967331 - f1-score (micro avg) 0.7531
2023-10-13 11:23:50,554 ----------------------------------------------------------------------------------------------------
2023-10-13 11:24:45,873 epoch 7 - iter 99/992 - loss 0.02153415 - time (sec): 55.32 - samples/sec: 298.05 - lr: 0.000069 - momentum: 0.000000
2023-10-13 11:25:40,846 epoch 7 - iter 198/992 - loss 0.01976551 - time (sec): 110.29 - samples/sec: 292.37 - lr: 0.000068 - momentum: 0.000000
2023-10-13 11:26:36,948 epoch 7 - iter 297/992 - loss 0.01958761 - time (sec): 166.39 - samples/sec: 295.63 - lr: 0.000066 - momentum: 0.000000
2023-10-13 11:27:29,131 epoch 7 - iter 396/992 - loss 0.02027603 - time (sec): 218.58 - samples/sec: 297.93 - lr: 0.000064 - momentum: 0.000000
2023-10-13 11:28:21,300 epoch 7 - iter 495/992 - loss 0.01929391 - time (sec): 270.74 - samples/sec: 301.02 - lr: 0.000062 - momentum: 0.000000
2023-10-13 11:29:13,969 epoch 7 - iter 594/992 - loss 0.01900255 - time (sec): 323.41 - samples/sec: 302.36 - lr: 0.000061 - momentum: 0.000000
2023-10-13 11:30:07,017 epoch 7 - iter 693/992 - loss 0.01962984 - time (sec): 376.46 - samples/sec: 303.03 - lr: 0.000059 - momentum: 0.000000
2023-10-13 11:30:58,497 epoch 7 - iter 792/992 - loss 0.01983379 - time (sec): 427.94 - samples/sec: 303.80 - lr: 0.000057 - momentum: 0.000000
2023-10-13 11:31:51,112 epoch 7 - iter 891/992 - loss 0.02059606 - time (sec): 480.56 - samples/sec: 306.03 - lr: 0.000055 - momentum: 0.000000
2023-10-13 11:32:42,304 epoch 7 - iter 990/992 - loss 0.02176361 - time (sec): 531.75 - samples/sec: 307.97 - lr: 0.000053 - momentum: 0.000000
2023-10-13 11:32:43,201 ----------------------------------------------------------------------------------------------------
2023-10-13 11:32:43,202 EPOCH 7 done: loss 0.0217 - lr: 0.000053
2023-10-13 11:33:09,891 DEV : loss 0.1812378168106079 - f1-score (micro avg) 0.7595
2023-10-13 11:33:09,941 ----------------------------------------------------------------------------------------------------
2023-10-13 11:34:04,722 epoch 8 - iter 99/992 - loss 0.01704028 - time (sec): 54.78 - samples/sec: 300.82 - lr: 0.000052 - momentum: 0.000000
2023-10-13 11:34:56,792 epoch 8 - iter 198/992 - loss 0.01462415 - time (sec): 106.85 - samples/sec: 310.66 - lr: 0.000050 - momentum: 0.000000
2023-10-13 11:35:49,552 epoch 8 - iter 297/992 - loss 0.01536696 - time (sec): 159.61 - samples/sec: 308.94 - lr: 0.000048 - momentum: 0.000000
2023-10-13 11:36:39,869 epoch 8 - iter 396/992 - loss 0.01614481 - time (sec): 209.93 - samples/sec: 314.36 - lr: 0.000046 - momentum: 0.000000
2023-10-13 11:37:32,877 epoch 8 - iter 495/992 - loss 0.01559292 - time (sec): 262.93 - samples/sec: 312.81 - lr: 0.000045 - momentum: 0.000000
2023-10-13 11:38:23,968 epoch 8 - iter 594/992 - loss 0.01640693 - time (sec): 314.02 - samples/sec: 313.95 - lr: 0.000043 - momentum: 0.000000
2023-10-13 11:39:15,526 epoch 8 - iter 693/992 - loss 0.01635913 - time (sec): 365.58 - samples/sec: 314.32 - lr: 0.000041 - momentum: 0.000000
2023-10-13 11:40:05,322 epoch 8 - iter 792/992 - loss 0.01623154 - time (sec): 415.38 - samples/sec: 314.29 - lr: 0.000039 - momentum: 0.000000
2023-10-13 11:40:57,198 epoch 8 - iter 891/992 - loss 0.01718004 - time (sec): 467.25 - samples/sec: 314.59 - lr: 0.000037 - momentum: 0.000000
2023-10-13 11:41:49,259 epoch 8 - iter 990/992 - loss 0.01717590 - time (sec): 519.31 - samples/sec: 315.32 - lr: 0.000036 - momentum: 0.000000
2023-10-13 11:41:50,206 ----------------------------------------------------------------------------------------------------
2023-10-13 11:41:50,206 EPOCH 8 done: loss 0.0172 - lr: 0.000036
2023-10-13 11:42:16,985 DEV : loss 0.19388391077518463 - f1-score (micro avg) 0.7615
2023-10-13 11:42:17,029 ----------------------------------------------------------------------------------------------------
2023-10-13 11:43:09,797 epoch 9 - iter 99/992 - loss 0.01365430 - time (sec): 52.77 - samples/sec: 298.89 - lr: 0.000034 - momentum: 0.000000
2023-10-13 11:44:01,364 epoch 9 - iter 198/992 - loss 0.01301086 - time (sec): 104.33 - samples/sec: 304.87 - lr: 0.000032 - momentum: 0.000000
2023-10-13 11:44:55,127 epoch 9 - iter 297/992 - loss 0.01218563 - time (sec): 158.10 - samples/sec: 306.14 - lr: 0.000030 - momentum: 0.000000
2023-10-13 11:45:47,425 epoch 9 - iter 396/992 - loss 0.01242401 - time (sec): 210.39 - samples/sec: 309.23 - lr: 0.000029 - momentum: 0.000000
2023-10-13 11:46:37,878 epoch 9 - iter 495/992 - loss 0.01331846 - time (sec): 260.85 - samples/sec: 311.47 - lr: 0.000027 - momentum: 0.000000
2023-10-13 11:47:28,572 epoch 9 - iter 594/992 - loss 0.01257105 - time (sec): 311.54 - samples/sec: 307.73 - lr: 0.000025 - momentum: 0.000000
2023-10-13 11:48:20,226 epoch 9 - iter 693/992 - loss 0.01263550 - time (sec): 363.19 - samples/sec: 311.99 - lr: 0.000023 - momentum: 0.000000
2023-10-13 11:49:12,057 epoch 9 - iter 792/992 - loss 0.01330865 - time (sec): 415.03 - samples/sec: 312.75 - lr: 0.000022 - momentum: 0.000000
2023-10-13 11:50:03,410 epoch 9 - iter 891/992 - loss 0.01361265 - time (sec): 466.38 - samples/sec: 315.44 - lr: 0.000020 - momentum: 0.000000
2023-10-13 11:50:54,785 epoch 9 - iter 990/992 - loss 0.01318658 - time (sec): 517.75 - samples/sec: 315.95 - lr: 0.000018 - momentum: 0.000000
2023-10-13 11:50:55,861 ----------------------------------------------------------------------------------------------------
2023-10-13 11:50:55,861 EPOCH 9 done: loss 0.0132 - lr: 0.000018
2023-10-13 11:51:24,536 DEV : loss 0.20332369208335876 - f1-score (micro avg) 0.7693
2023-10-13 11:51:24,583 ----------------------------------------------------------------------------------------------------
2023-10-13 11:52:17,634 epoch 10 - iter 99/992 - loss 0.00750081 - time (sec): 53.05 - samples/sec: 319.09 - lr: 0.000016 - momentum: 0.000000
2023-10-13 11:53:10,300 epoch 10 - iter 198/992 - loss 0.00995278 - time (sec): 105.71 - samples/sec: 308.47 - lr: 0.000014 - momentum: 0.000000
2023-10-13 11:54:01,858 epoch 10 - iter 297/992 - loss 0.00947042 - time (sec): 157.27 - samples/sec: 308.50 - lr: 0.000013 - momentum: 0.000000
2023-10-13 11:54:52,270 epoch 10 - iter 396/992 - loss 0.01003417 - time (sec): 207.68 - samples/sec: 310.08 - lr: 0.000011 - momentum: 0.000000
2023-10-13 11:55:43,447 epoch 10 - iter 495/992 - loss 0.00926563 - time (sec): 258.86 - samples/sec: 313.62 - lr: 0.000009 - momentum: 0.000000
2023-10-13 11:56:34,584 epoch 10 - iter 594/992 - loss 0.00963875 - time (sec): 310.00 - samples/sec: 316.09 - lr: 0.000007 - momentum: 0.000000
2023-10-13 11:57:26,440 epoch 10 - iter 693/992 - loss 0.00967293 - time (sec): 361.85 - samples/sec: 316.90 - lr: 0.000006 - momentum: 0.000000
2023-10-13 11:58:16,976 epoch 10 - iter 792/992 - loss 0.01004068 - time (sec): 412.39 - samples/sec: 318.73 - lr: 0.000004 - momentum: 0.000000
2023-10-13 11:59:06,927 epoch 10 - iter 891/992 - loss 0.00984311 - time (sec): 462.34 - samples/sec: 319.79 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:00:01,405 epoch 10 - iter 990/992 - loss 0.01031698 - time (sec): 516.82 - samples/sec: 316.56 - lr: 0.000000 - momentum: 0.000000
2023-10-13 12:00:02,536 ----------------------------------------------------------------------------------------------------
2023-10-13 12:00:02,536 EPOCH 10 done: loss 0.0103 - lr: 0.000000
2023-10-13 12:00:28,857 DEV : loss 0.20612968504428864 - f1-score (micro avg) 0.7629
2023-10-13 12:00:29,870 ----------------------------------------------------------------------------------------------------
2023-10-13 12:00:29,872 Loading model from best epoch ...
2023-10-13 12:00:34,701 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
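The 13 tags above follow the BIOES scheme (Begin/Inside/End/Single plus O) over the PER, LOC, and ORG types. A hypothetical decoder sketch showing how such a tag sequence maps back to entity spans (simplified: assumes a well-formed sequence as emitted by a trained tagger, with no repair of inconsistent transitions):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                       # single-token entity
            spans.append((label, i, i + 1))
        elif prefix == "B":                     # entity opens here
            start = i
        elif prefix == "E" and start is not None:  # entity closes here
            spans.append((label, start, i + 1))
            start = None
    return spans

print(bioes_to_spans(["B-PER", "E-PER", "O", "S-LOC"]))
# [('PER', 0, 2), ('LOC', 3, 4)]
```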
2023-10-13 12:01:02,091
Results:
- F-score (micro) 0.7756
- F-score (macro) 0.6868
- Accuracy 0.6592
By class:
              precision    recall  f1-score   support

         LOC     0.8223    0.8550    0.8383       655
         PER     0.6926    0.7982    0.7417       223
         ORG     0.5392    0.4331    0.4803       127

   micro avg     0.7625    0.7891    0.7756      1005
   macro avg     0.6847    0.6954    0.6868      1005
weighted avg     0.7578    0.7891    0.7716      1005
2023-10-13 12:01:02,091 ----------------------------------------------------------------------------------------------------
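The summary rows of the final evaluation can be reproduced from the per-class rows. A quick sketch (F1 and support values copied from the log) verifying the macro average, an unweighted mean over classes, and the support-weighted average:

```python
# Per-class (f1, support) from the final test evaluation above.
classes = {"LOC": (0.8383, 655), "PER": (0.7417, 223), "ORG": (0.4803, 127)}

macro_f1 = sum(f1 for f1, _ in classes.values()) / len(classes)
support = sum(n for _, n in classes.values())
weighted_f1 = sum(f1 * n for f1, n in classes.values()) / support

print(f"macro avg f1:    {macro_f1:.4f}")     # 0.6868, as logged
print(f"weighted avg f1: {weighted_f1:.4f}")  # 0.7716, as logged
```

The micro average is computed differently: it pools true/false positives and negatives across all classes before computing precision and recall, which is why it is reported separately as the headline F-score (0.7756).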