2023-10-13 10:28:31,999 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,002 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 10:28:32,002 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,002 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 10:28:32,002 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,002 Train: 7936 sentences
2023-10-13 10:28:32,002 (train_with_dev=False, train_with_test=False)
2023-10-13 10:28:32,002 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,002 Training Params:
2023-10-13 10:28:32,002 - learning_rate: "0.00016"
2023-10-13 10:28:32,002 - mini_batch_size: "8"
2023-10-13 10:28:32,002 - max_epochs: "10"
2023-10-13 10:28:32,003 - shuffle: "True"
2023-10-13 10:28:32,003 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,003 Plugins:
2023-10-13 10:28:32,003 - TensorboardLogger
2023-10-13 10:28:32,003 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 10:28:32,003 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,003 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 10:28:32,003 - metric: "('micro avg', 'f1-score')"
2023-10-13 10:28:32,003 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,003 Computation:
2023-10-13 10:28:32,003 - compute on device: cuda:0
2023-10-13 10:28:32,003 - embedding storage: none
2023-10-13 10:28:32,003 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,003 Model training base path: "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5"
2023-10-13 10:28:32,004 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,004 ----------------------------------------------------------------------------------------------------
2023-10-13 10:28:32,004 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 10:29:22,990 epoch 1 - iter 99/992 - loss 2.54607160 - time (sec): 50.98 - samples/sec: 346.76 - lr: 0.000016 - momentum: 0.000000
2023-10-13 10:30:12,629 epoch 1 - iter 198/992 - loss 2.45530289 - time (sec): 100.62 - samples/sec: 331.63 - lr: 0.000032 - momentum: 0.000000
2023-10-13 10:31:03,198 epoch 1 - iter 297/992 - loss 2.22291643 - time (sec): 151.19 - samples/sec: 334.65 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:31:51,815 epoch 1 - iter 396/992 - loss 1.99643086 - time (sec): 199.81 - samples/sec: 328.36 - lr: 0.000064 - momentum: 0.000000
2023-10-13 10:32:42,894 epoch 1 - iter 495/992 - loss 1.74941636 - time (sec): 250.89 - samples/sec: 325.20 - lr: 0.000080 - momentum: 0.000000
2023-10-13 10:33:33,129 epoch 1 - iter 594/992 - loss 1.53508071 - time (sec): 301.12 - samples/sec: 323.92 - lr: 0.000096 - momentum: 0.000000
2023-10-13 10:34:23,728 epoch 1 - iter 693/992 - loss 1.36371276 - time (sec): 351.72 - samples/sec: 325.04 - lr: 0.000112 - momentum: 0.000000
2023-10-13 10:35:13,806 epoch 1 - iter 792/992 - loss 1.22715710 - time (sec): 401.80 - samples/sec: 324.30 - lr: 0.000128 - momentum: 0.000000
2023-10-13 10:36:04,123 epoch 1 - iter 891/992 - loss 1.10440570 - time (sec): 452.12 - samples/sec: 327.35 - lr: 0.000144 - momentum: 0.000000
2023-10-13 10:36:54,592 epoch 1 - iter 990/992 - loss 1.01705618 - time (sec): 502.59 - samples/sec: 325.82 - lr: 0.000160 - momentum: 0.000000
2023-10-13 10:36:55,567 ----------------------------------------------------------------------------------------------------
2023-10-13 10:36:55,567 EPOCH 1 done: loss 1.0159 - lr: 0.000160
2023-10-13 10:37:22,889 DEV : loss 0.1508796662092209 - f1-score (micro avg) 0.6481
2023-10-13 10:37:22,931 saving best model
2023-10-13 10:37:23,947 ----------------------------------------------------------------------------------------------------
2023-10-13 10:38:18,742 epoch 2 - iter 99/992 - loss 0.18502898 - time (sec): 54.79 - samples/sec: 302.05 - lr: 0.000158 - momentum: 0.000000
2023-10-13 10:39:12,710 epoch 2 - iter 198/992 - loss 0.16311170 - time (sec): 108.76 - samples/sec: 305.55 - lr: 0.000156 - momentum: 0.000000
2023-10-13 10:40:05,412 epoch 2 - iter 297/992 - loss 0.15362429 - time (sec): 161.46 - samples/sec: 310.91 - lr: 0.000155 - momentum: 0.000000
2023-10-13 10:40:57,282 epoch 2 - iter 396/992 - loss 0.15052045 - time (sec): 213.33 - samples/sec: 308.57 - lr: 0.000153 - momentum: 0.000000
2023-10-13 10:41:48,850 epoch 2 - iter 495/992 - loss 0.14467975 - time (sec): 264.90 - samples/sec: 310.68 - lr: 0.000151 - momentum: 0.000000
2023-10-13 10:42:41,866 epoch 2 - iter 594/992 - loss 0.14120945 - time (sec): 317.92 - samples/sec: 310.06 - lr: 0.000149 - momentum: 0.000000
2023-10-13 10:43:33,582 epoch 2 - iter 693/992 - loss 0.13726963 - time (sec): 369.63 - samples/sec: 310.73 - lr: 0.000148 - momentum: 0.000000
2023-10-13 10:44:29,297 epoch 2 - iter 792/992 - loss 0.13323570 - time (sec): 425.35 - samples/sec: 307.54 - lr: 0.000146 - momentum: 0.000000
2023-10-13 10:45:23,019 epoch 2 - iter 891/992 - loss 0.13115777 - time (sec): 479.07 - samples/sec: 305.23 - lr: 0.000144 - momentum: 0.000000
2023-10-13 10:46:19,168 epoch 2 - iter 990/992 - loss 0.12735787 - time (sec): 535.22 - samples/sec: 305.89 - lr: 0.000142 - momentum: 0.000000
2023-10-13 10:46:20,180 ----------------------------------------------------------------------------------------------------
2023-10-13 10:46:20,181 EPOCH 2 done: loss 0.1273 - lr: 0.000142
2023-10-13 10:46:47,084 DEV : loss 0.08622897416353226 - f1-score (micro avg) 0.7445
2023-10-13 10:46:47,127 saving best model
2023-10-13 10:46:49,839 ----------------------------------------------------------------------------------------------------
2023-10-13 10:47:46,493 epoch 3 - iter 99/992 - loss 0.07517256 - time (sec): 56.65 - samples/sec: 284.28 - lr: 0.000140 - momentum: 0.000000
2023-10-13 10:48:39,495 epoch 3 - iter 198/992 - loss 0.08038487 - time (sec): 109.65 - samples/sec: 295.60 - lr: 0.000139 - momentum: 0.000000
2023-10-13 10:49:29,787 epoch 3 - iter 297/992 - loss 0.07981542 - time (sec): 159.94 - samples/sec: 303.29 - lr: 0.000137 - momentum: 0.000000
2023-10-13 10:50:21,492 epoch 3 - iter 396/992 - loss 0.08034004 - time (sec): 211.65 - samples/sec: 306.67 - lr: 0.000135 - momentum: 0.000000
2023-10-13 10:51:10,594 epoch 3 - iter 495/992 - loss 0.07926530 - time (sec): 260.75 - samples/sec: 310.48 - lr: 0.000133 - momentum: 0.000000
2023-10-13 10:52:02,752 epoch 3 - iter 594/992 - loss 0.07675274 - time (sec): 312.91 - samples/sec: 312.36 - lr: 0.000132 - momentum: 0.000000
2023-10-13 10:52:53,359 epoch 3 - iter 693/992 - loss 0.07650777 - time (sec): 363.52 - samples/sec: 314.27 - lr: 0.000130 - momentum: 0.000000
2023-10-13 10:53:44,643 epoch 3 - iter 792/992 - loss 0.07496469 - time (sec): 414.80 - samples/sec: 314.81 - lr: 0.000128 - momentum: 0.000000
2023-10-13 10:54:37,296 epoch 3 - iter 891/992 - loss 0.07279589 - time (sec): 467.45 - samples/sec: 314.83 - lr: 0.000126 - momentum: 0.000000
2023-10-13 10:55:30,001 epoch 3 - iter 990/992 - loss 0.07288910 - time (sec): 520.16 - samples/sec: 314.61 - lr: 0.000125 - momentum: 0.000000
2023-10-13 10:55:31,036 ----------------------------------------------------------------------------------------------------
2023-10-13 10:55:31,036 EPOCH 3 done: loss 0.0728 - lr: 0.000125
2023-10-13 10:55:58,293 DEV : loss 0.0825371965765953 - f1-score (micro avg) 0.7707
2023-10-13 10:55:58,349 saving best model
2023-10-13 10:56:01,110 ----------------------------------------------------------------------------------------------------
2023-10-13 10:56:53,520 epoch 4 - iter 99/992 - loss 0.05384753 - time (sec): 52.41 - samples/sec: 314.45 - lr: 0.000123 - momentum: 0.000000
2023-10-13 10:57:42,908 epoch 4 - iter 198/992 - loss 0.05210955 - time (sec): 101.79 - samples/sec: 321.15 - lr: 0.000121 - momentum: 0.000000
2023-10-13 10:58:35,321 epoch 4 - iter 297/992 - loss 0.04795320 - time (sec): 154.21 - samples/sec: 317.21 - lr: 0.000119 - momentum: 0.000000
2023-10-13 10:59:27,126 epoch 4 - iter 396/992 - loss 0.04830221 - time (sec): 206.01 - samples/sec: 316.89 - lr: 0.000117 - momentum: 0.000000
2023-10-13 11:00:19,194 epoch 4 - iter 495/992 - loss 0.04876538 - time (sec): 258.08 - samples/sec: 319.29 - lr: 0.000116 - momentum: 0.000000
2023-10-13 11:01:16,686 epoch 4 - iter 594/992 - loss 0.04825072 - time (sec): 315.57 - samples/sec: 314.28 - lr: 0.000114 - momentum: 0.000000
2023-10-13 11:02:13,835 epoch 4 - iter 693/992 - loss 0.04824595 - time (sec): 372.72 - samples/sec: 308.38 - lr: 0.000112 - momentum: 0.000000
2023-10-13 11:03:12,486 epoch 4 - iter 792/992 - loss 0.04915292 - time (sec): 431.37 - samples/sec: 304.00 - lr: 0.000110 - momentum: 0.000000
2023-10-13 11:04:07,648 epoch 4 - iter 891/992 - loss 0.04956659 - time (sec): 486.53 - samples/sec: 303.02 - lr: 0.000109 - momentum: 0.000000
2023-10-13 11:04:58,156 epoch 4 - iter 990/992 - loss 0.05060249 - time (sec): 537.04 - samples/sec: 304.81 - lr: 0.000107 - momentum: 0.000000
2023-10-13 11:04:59,158 ----------------------------------------------------------------------------------------------------
2023-10-13 11:04:59,158 EPOCH 4 done: loss 0.0506 - lr: 0.000107
2023-10-13 11:05:25,002 DEV : loss 0.10595876723527908 - f1-score (micro avg) 0.7755
2023-10-13 11:05:25,051 saving best model
2023-10-13 11:05:27,786 ----------------------------------------------------------------------------------------------------
2023-10-13 11:06:19,003 epoch 5 - iter 99/992 - loss 0.03030251 - time (sec): 51.21 - samples/sec: 327.85 - lr: 0.000105 - momentum: 0.000000
2023-10-13 11:07:10,468 epoch 5 - iter 198/992 - loss 0.03358011 - time (sec): 102.68 - samples/sec: 324.66 - lr: 0.000103 - momentum: 0.000000
2023-10-13 11:08:03,278 epoch 5 - iter 297/992 - loss 0.03869957 - time (sec): 155.49 - samples/sec: 321.86 - lr: 0.000101 - momentum: 0.000000
2023-10-13 11:08:56,370 epoch 5 - iter 396/992 - loss 0.03861410 - time (sec): 208.58 - samples/sec: 318.55 - lr: 0.000100 - momentum: 0.000000
2023-10-13 11:09:52,646 epoch 5 - iter 495/992 - loss 0.03807750 - time (sec): 264.86 - samples/sec: 311.47 - lr: 0.000098 - momentum: 0.000000
2023-10-13 11:10:42,137 epoch 5 - iter 594/992 - loss 0.03891176 - time (sec): 314.35 - samples/sec: 312.64 - lr: 0.000096 - momentum: 0.000000
2023-10-13 11:11:36,838 epoch 5 - iter 693/992 - loss 0.03893009 - time (sec): 369.05 - samples/sec: 309.49 - lr: 0.000094 - momentum: 0.000000
2023-10-13 11:12:29,850 epoch 5 - iter 792/992 - loss 0.03920528 - time (sec): 422.06 - samples/sec: 310.98 - lr: 0.000093 - momentum: 0.000000
2023-10-13 11:13:19,013 epoch 5 - iter 891/992 - loss 0.03906993 - time (sec): 471.22 - samples/sec: 313.33 - lr: 0.000091 - momentum: 0.000000
2023-10-13 11:14:10,723 epoch 5 - iter 990/992 - loss 0.03870073 - time (sec): 522.93 - samples/sec: 313.20 - lr: 0.000089 - momentum: 0.000000
2023-10-13 11:14:11,614 ----------------------------------------------------------------------------------------------------
2023-10-13 11:14:11,614 EPOCH 5 done: loss 0.0387 - lr: 0.000089
2023-10-13 11:14:40,264 DEV : loss 0.12928840517997742 - f1-score (micro avg) 0.7545
2023-10-13 11:14:40,315 ----------------------------------------------------------------------------------------------------
2023-10-13 11:15:34,003 epoch 6 - iter 99/992 - loss 0.02303380 - time (sec): 53.69 - samples/sec: 318.24 - lr: 0.000087 - momentum: 0.000000
2023-10-13 11:16:24,574 epoch 6 - iter 198/992 - loss 0.02422261 - time (sec): 104.26 - samples/sec: 321.60 - lr: 0.000085 - momentum: 0.000000
2023-10-13 11:17:14,756 epoch 6 - iter 297/992 - loss 0.02528001 - time (sec): 154.44 - samples/sec: 319.42 - lr: 0.000084 - momentum: 0.000000
2023-10-13 11:18:08,316 epoch 6 - iter 396/992 - loss 0.02751581 - time (sec): 208.00 - samples/sec: 316.97 - lr: 0.000082 - momentum: 0.000000
2023-10-13 11:19:02,202 epoch 6 - iter 495/992 - loss 0.02730045 - time (sec): 261.88 - samples/sec: 314.10 - lr: 0.000080 - momentum: 0.000000
2023-10-13 11:19:54,079 epoch 6 - iter 594/992 - loss 0.02740559 - time (sec): 313.76 - samples/sec: 314.14 - lr: 0.000078 - momentum: 0.000000
2023-10-13 11:20:45,462 epoch 6 - iter 693/992 - loss 0.02803668 - time (sec): 365.14 - samples/sec: 314.89 - lr: 0.000077 - momentum: 0.000000
2023-10-13 11:21:36,302 epoch 6 - iter 792/992 - loss 0.02830503 - time (sec): 415.98 - samples/sec: 314.74 - lr: 0.000075 - momentum: 0.000000
2023-10-13 11:22:27,272 epoch 6 - iter 891/992 - loss 0.02753052 - time (sec): 466.95 - samples/sec: 315.17 - lr: 0.000073 - momentum: 0.000000
2023-10-13 11:23:22,766 epoch 6 - iter 990/992 - loss 0.02823935 - time (sec): 522.45 - samples/sec: 313.35 - lr: 0.000071 - momentum: 0.000000
2023-10-13 11:23:23,978 ----------------------------------------------------------------------------------------------------
2023-10-13 11:23:23,978 EPOCH 6 done: loss 0.0284 - lr: 0.000071
2023-10-13 11:23:50,511 DEV : loss 0.1517123430967331 - f1-score (micro avg) 0.7531
2023-10-13 11:23:50,554 ----------------------------------------------------------------------------------------------------
2023-10-13 11:24:45,873 epoch 7 - iter 99/992 - loss 0.02153415 - time (sec): 55.32 - samples/sec: 298.05 - lr: 0.000069 - momentum: 0.000000
2023-10-13 11:25:40,846 epoch 7 - iter 198/992 - loss 0.01976551 - time (sec): 110.29 - samples/sec: 292.37 - lr: 0.000068 - momentum: 0.000000
2023-10-13 11:26:36,948 epoch 7 - iter 297/992 - loss 0.01958761 - time (sec): 166.39 - samples/sec: 295.63 - lr: 0.000066 - momentum: 0.000000
2023-10-13 11:27:29,131 epoch 7 - iter 396/992 - loss 0.02027603 - time (sec): 218.58 - samples/sec: 297.93 - lr: 0.000064 - momentum: 0.000000
2023-10-13 11:28:21,300 epoch 7 - iter 495/992 - loss 0.01929391 - time (sec): 270.74 - samples/sec: 301.02 - lr: 0.000062 - momentum: 0.000000
2023-10-13 11:29:13,969 epoch 7 - iter 594/992 - loss 0.01900255 - time (sec): 323.41 - samples/sec: 302.36 - lr: 0.000061 - momentum: 0.000000
2023-10-13 11:30:07,017 epoch 7 - iter 693/992 - loss 0.01962984 - time (sec): 376.46 - samples/sec: 303.03 - lr: 0.000059 - momentum: 0.000000
2023-10-13 11:30:58,497 epoch 7 - iter 792/992 - loss 0.01983379 - time (sec): 427.94 - samples/sec: 303.80 - lr: 0.000057 - momentum: 0.000000
2023-10-13 11:31:51,112 epoch 7 - iter 891/992 - loss 0.02059606 - time (sec): 480.56 - samples/sec: 306.03 - lr: 0.000055 - momentum: 0.000000
2023-10-13 11:32:42,304 epoch 7 - iter 990/992 - loss 0.02176361 - time (sec): 531.75 - samples/sec: 307.97 - lr: 0.000053 - momentum: 0.000000
2023-10-13 11:32:43,201 ----------------------------------------------------------------------------------------------------
2023-10-13 11:32:43,202 EPOCH 7 done: loss 0.0217 - lr: 0.000053
2023-10-13 11:33:09,891 DEV : loss 0.1812378168106079 - f1-score (micro avg) 0.7595
2023-10-13 11:33:09,941 ----------------------------------------------------------------------------------------------------
2023-10-13 11:34:04,722 epoch 8 - iter 99/992 - loss 0.01704028 - time (sec): 54.78 - samples/sec: 300.82 - lr: 0.000052 - momentum: 0.000000
2023-10-13 11:34:56,792 epoch 8 - iter 198/992 - loss 0.01462415 - time (sec): 106.85 - samples/sec: 310.66 - lr: 0.000050 - momentum: 0.000000
2023-10-13 11:35:49,552 epoch 8 - iter 297/992 - loss 0.01536696 - time (sec): 159.61 - samples/sec: 308.94 - lr: 0.000048 - momentum: 0.000000
2023-10-13 11:36:39,869 epoch 8 - iter 396/992 - loss 0.01614481 - time (sec): 209.93 - samples/sec: 314.36 - lr: 0.000046 - momentum: 0.000000
2023-10-13 11:37:32,877 epoch 8 - iter 495/992 - loss 0.01559292 - time (sec): 262.93 - samples/sec: 312.81 - lr: 0.000045 - momentum: 0.000000
2023-10-13 11:38:23,968 epoch 8 - iter 594/992 - loss 0.01640693 - time (sec): 314.02 - samples/sec: 313.95 - lr: 0.000043 - momentum: 0.000000
2023-10-13 11:39:15,526 epoch 8 - iter 693/992 - loss 0.01635913 - time (sec): 365.58 - samples/sec: 314.32 - lr: 0.000041 - momentum: 0.000000
2023-10-13 11:40:05,322 epoch 8 - iter 792/992 - loss 0.01623154 - time (sec): 415.38 - samples/sec: 314.29 - lr: 0.000039 - momentum: 0.000000
2023-10-13 11:40:57,198 epoch 8 - iter 891/992 - loss 0.01718004 - time (sec): 467.25 - samples/sec: 314.59 - lr: 0.000037 - momentum: 0.000000
2023-10-13 11:41:49,259 epoch 8 - iter 990/992 - loss 0.01717590 - time (sec): 519.31 - samples/sec: 315.32 - lr: 0.000036 - momentum: 0.000000
2023-10-13 11:41:50,206 ----------------------------------------------------------------------------------------------------
2023-10-13 11:41:50,206 EPOCH 8 done: loss 0.0172 - lr: 0.000036
2023-10-13 11:42:16,985 DEV : loss 0.19388391077518463 - f1-score (micro avg) 0.7615
2023-10-13 11:42:17,029 ----------------------------------------------------------------------------------------------------
2023-10-13 11:43:09,797 epoch 9 - iter 99/992 - loss 0.01365430 - time (sec): 52.77 - samples/sec: 298.89 - lr: 0.000034 - momentum: 0.000000
2023-10-13 11:44:01,364 epoch 9 - iter 198/992 - loss 0.01301086 - time (sec): 104.33 - samples/sec: 304.87 - lr: 0.000032 - momentum: 0.000000
2023-10-13 11:44:55,127 epoch 9 - iter 297/992 - loss 0.01218563 - time (sec): 158.10 - samples/sec: 306.14 - lr: 0.000030 - momentum: 0.000000
2023-10-13 11:45:47,425 epoch 9 - iter 396/992 - loss 0.01242401 - time (sec): 210.39 - samples/sec: 309.23 - lr: 0.000029 - momentum: 0.000000
2023-10-13 11:46:37,878 epoch 9 - iter 495/992 - loss 0.01331846 - time (sec): 260.85 - samples/sec: 311.47 - lr: 0.000027 - momentum: 0.000000
2023-10-13 11:47:28,572 epoch 9 - iter 594/992 - loss 0.01257105 - time (sec): 311.54 - samples/sec: 307.73 - lr: 0.000025 - momentum: 0.000000
2023-10-13 11:48:20,226 epoch 9 - iter 693/992 - loss 0.01263550 - time (sec): 363.19 - samples/sec: 311.99 - lr: 0.000023 - momentum: 0.000000
2023-10-13 11:49:12,057 epoch 9 - iter 792/992 - loss 0.01330865 - time (sec): 415.03 - samples/sec: 312.75 - lr: 0.000022 - momentum: 0.000000
2023-10-13 11:50:03,410 epoch 9 - iter 891/992 - loss 0.01361265 - time (sec): 466.38 - samples/sec: 315.44 - lr: 0.000020 - momentum: 0.000000
2023-10-13 11:50:54,785 epoch 9 - iter 990/992 - loss 0.01318658 - time (sec): 517.75 - samples/sec: 315.95 - lr: 0.000018 - momentum: 0.000000
2023-10-13 11:50:55,861 ----------------------------------------------------------------------------------------------------
2023-10-13 11:50:55,861 EPOCH 9 done: loss 0.0132 - lr: 0.000018
2023-10-13 11:51:24,536 DEV : loss 0.20332369208335876 - f1-score (micro avg) 0.7693
2023-10-13 11:51:24,583 ----------------------------------------------------------------------------------------------------
2023-10-13 11:52:17,634 epoch 10 - iter 99/992 - loss 0.00750081 - time (sec): 53.05 - samples/sec: 319.09 - lr: 0.000016 - momentum: 0.000000
2023-10-13 11:53:10,300 epoch 10 - iter 198/992 - loss 0.00995278 - time (sec): 105.71 - samples/sec: 308.47 - lr: 0.000014 - momentum: 0.000000
2023-10-13 11:54:01,858 epoch 10 - iter 297/992 - loss 0.00947042 - time (sec): 157.27 - samples/sec: 308.50 - lr: 0.000013 - momentum: 0.000000
2023-10-13 11:54:52,270 epoch 10 - iter 396/992 - loss 0.01003417 - time (sec): 207.68 - samples/sec: 310.08 - lr: 0.000011 - momentum: 0.000000
2023-10-13 11:55:43,447 epoch 10 - iter 495/992 - loss 0.00926563 - time (sec): 258.86 - samples/sec: 313.62 - lr: 0.000009 - momentum: 0.000000
2023-10-13 11:56:34,584 epoch 10 - iter 594/992 - loss 0.00963875 - time (sec): 310.00 - samples/sec: 316.09 - lr: 0.000007 - momentum: 0.000000
2023-10-13 11:57:26,440 epoch 10 - iter 693/992 - loss 0.00967293 - time (sec): 361.85 - samples/sec: 316.90 - lr: 0.000006 - momentum: 0.000000
2023-10-13 11:58:16,976 epoch 10 - iter 792/992 - loss 0.01004068 - time (sec): 412.39 - samples/sec: 318.73 - lr: 0.000004 - momentum: 0.000000
2023-10-13 11:59:06,927 epoch 10 - iter 891/992 - loss 0.00984311 - time (sec): 462.34 - samples/sec: 319.79 - lr: 0.000002 - momentum: 0.000000
2023-10-13 12:00:01,405 epoch 10 - iter 990/992 - loss 0.01031698 - time (sec): 516.82 - samples/sec: 316.56 - lr: 0.000000 - momentum: 0.000000
2023-10-13 12:00:02,536 ----------------------------------------------------------------------------------------------------
2023-10-13 12:00:02,536 EPOCH 10 done: loss 0.0103 - lr: 0.000000
2023-10-13 12:00:28,857 DEV : loss 0.20612968504428864 - f1-score (micro avg) 0.7629
2023-10-13 12:00:29,870 ----------------------------------------------------------------------------------------------------
2023-10-13 12:00:29,872 Loading model from best epoch ...
2023-10-13 12:00:34,701 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-13 12:01:02,091
Results:
- F-score (micro) 0.7756
- F-score (macro) 0.6868
- Accuracy 0.6592

By class:
              precision    recall  f1-score   support

         LOC     0.8223    0.8550    0.8383       655
         PER     0.6926    0.7982    0.7417       223
         ORG     0.5392    0.4331    0.4803       127

   micro avg     0.7625    0.7891    0.7756      1005
   macro avg     0.6847    0.6954    0.6868      1005
weighted avg     0.7578    0.7891    0.7716      1005

2023-10-13 12:01:02,091 ----------------------------------------------------------------------------------------------------