2023-10-13 18:40:22,179 ----------------------------------------------------------------------------------------------------
2023-10-13 18:40:22,181 Model: "SequenceTagger(
(embeddings): ByT5Embeddings(
(model): T5EncoderModel(
(shared): Embedding(384, 1472)
(encoder): T5Stack(
(embed_tokens): Embedding(384, 1472)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
(relative_attention_bias): Embedding(32, 6)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-11): 11 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=1472, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 18:40:22,181 ----------------------------------------------------------------------------------------------------
2023-10-13 18:40:22,181 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-13 18:40:22,181 ----------------------------------------------------------------------------------------------------
2023-10-13 18:40:22,181 Train: 14465 sentences
2023-10-13 18:40:22,181 (train_with_dev=False, train_with_test=False)
2023-10-13 18:40:22,182 ----------------------------------------------------------------------------------------------------
2023-10-13 18:40:22,182 Training Params:
2023-10-13 18:40:22,182 - learning_rate: "0.00016"
2023-10-13 18:40:22,182 - mini_batch_size: "4"
2023-10-13 18:40:22,182 - max_epochs: "10"
2023-10-13 18:40:22,182 - shuffle: "True"
2023-10-13 18:40:22,182 ----------------------------------------------------------------------------------------------------
2023-10-13 18:40:22,182 Plugins:
2023-10-13 18:40:22,182 - TensorboardLogger
2023-10-13 18:40:22,182 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 18:40:22,182 ----------------------------------------------------------------------------------------------------
2023-10-13 18:40:22,182 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 18:40:22,182 - metric: "('micro avg', 'f1-score')"
2023-10-13 18:40:22,182 ----------------------------------------------------------------------------------------------------
2023-10-13 18:40:22,182 Computation:
2023-10-13 18:40:22,183 - compute on device: cuda:0
2023-10-13 18:40:22,183 - embedding storage: none
2023-10-13 18:40:22,183 ----------------------------------------------------------------------------------------------------
2023-10-13 18:40:22,183 Model training base path: "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2"
2023-10-13 18:40:22,183 ----------------------------------------------------------------------------------------------------
2023-10-13 18:40:22,183 ----------------------------------------------------------------------------------------------------
2023-10-13 18:40:22,183 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 18:41:58,767 epoch 1 - iter 361/3617 - loss 2.52124674 - time (sec): 96.58 - samples/sec: 386.79 - lr: 0.000016 - momentum: 0.000000
2023-10-13 18:43:34,508 epoch 1 - iter 722/3617 - loss 2.11644111 - time (sec): 192.32 - samples/sec: 388.65 - lr: 0.000032 - momentum: 0.000000
2023-10-13 18:45:09,650 epoch 1 - iter 1083/3617 - loss 1.64036489 - time (sec): 287.46 - samples/sec: 393.68 - lr: 0.000048 - momentum: 0.000000
2023-10-13 18:46:45,788 epoch 1 - iter 1444/3617 - loss 1.30596637 - time (sec): 383.60 - samples/sec: 393.95 - lr: 0.000064 - momentum: 0.000000
2023-10-13 18:48:22,571 epoch 1 - iter 1805/3617 - loss 1.08312946 - time (sec): 480.39 - samples/sec: 393.97 - lr: 0.000080 - momentum: 0.000000
2023-10-13 18:49:59,710 epoch 1 - iter 2166/3617 - loss 0.93558591 - time (sec): 577.53 - samples/sec: 392.82 - lr: 0.000096 - momentum: 0.000000
2023-10-13 18:51:37,931 epoch 1 - iter 2527/3617 - loss 0.82548930 - time (sec): 675.75 - samples/sec: 392.25 - lr: 0.000112 - momentum: 0.000000
2023-10-13 18:53:15,367 epoch 1 - iter 2888/3617 - loss 0.73875966 - time (sec): 773.18 - samples/sec: 391.69 - lr: 0.000128 - momentum: 0.000000
2023-10-13 18:54:56,085 epoch 1 - iter 3249/3617 - loss 0.67039148 - time (sec): 873.90 - samples/sec: 391.02 - lr: 0.000144 - momentum: 0.000000
2023-10-13 18:56:40,552 epoch 1 - iter 3610/3617 - loss 0.61660340 - time (sec): 978.37 - samples/sec: 387.65 - lr: 0.000160 - momentum: 0.000000
2023-10-13 18:56:42,400 ----------------------------------------------------------------------------------------------------
2023-10-13 18:56:42,400 EPOCH 1 done: loss 0.6157 - lr: 0.000160
2023-10-13 18:57:20,914 DEV : loss 0.13260270655155182 - f1-score (micro avg) 0.5508
2023-10-13 18:57:20,972 saving best model
2023-10-13 18:57:21,837 ----------------------------------------------------------------------------------------------------
2023-10-13 18:59:02,582 epoch 2 - iter 361/3617 - loss 0.11003363 - time (sec): 100.74 - samples/sec: 364.24 - lr: 0.000158 - momentum: 0.000000
2023-10-13 19:00:41,098 epoch 2 - iter 722/3617 - loss 0.10629388 - time (sec): 199.26 - samples/sec: 374.92 - lr: 0.000156 - momentum: 0.000000
2023-10-13 19:02:20,551 epoch 2 - iter 1083/3617 - loss 0.10240780 - time (sec): 298.71 - samples/sec: 378.36 - lr: 0.000155 - momentum: 0.000000
2023-10-13 19:03:57,204 epoch 2 - iter 1444/3617 - loss 0.10038512 - time (sec): 395.36 - samples/sec: 381.86 - lr: 0.000153 - momentum: 0.000000
2023-10-13 19:05:34,866 epoch 2 - iter 1805/3617 - loss 0.09845307 - time (sec): 493.03 - samples/sec: 385.47 - lr: 0.000151 - momentum: 0.000000
2023-10-13 19:07:10,754 epoch 2 - iter 2166/3617 - loss 0.09905672 - time (sec): 588.91 - samples/sec: 384.71 - lr: 0.000149 - momentum: 0.000000
2023-10-13 19:08:47,888 epoch 2 - iter 2527/3617 - loss 0.09733769 - time (sec): 686.05 - samples/sec: 385.00 - lr: 0.000148 - momentum: 0.000000
2023-10-13 19:10:26,147 epoch 2 - iter 2888/3617 - loss 0.09555977 - time (sec): 784.31 - samples/sec: 386.99 - lr: 0.000146 - momentum: 0.000000
2023-10-13 19:12:03,836 epoch 2 - iter 3249/3617 - loss 0.09404990 - time (sec): 882.00 - samples/sec: 387.21 - lr: 0.000144 - momentum: 0.000000
2023-10-13 19:13:40,645 epoch 2 - iter 3610/3617 - loss 0.09376699 - time (sec): 978.81 - samples/sec: 387.37 - lr: 0.000142 - momentum: 0.000000
2023-10-13 19:13:42,408 ----------------------------------------------------------------------------------------------------
2023-10-13 19:13:42,408 EPOCH 2 done: loss 0.0938 - lr: 0.000142
2023-10-13 19:14:21,070 DEV : loss 0.1215815320611 - f1-score (micro avg) 0.5926
2023-10-13 19:14:21,126 saving best model
2023-10-13 19:14:23,664 ----------------------------------------------------------------------------------------------------
2023-10-13 19:16:01,864 epoch 3 - iter 361/3617 - loss 0.05640694 - time (sec): 98.20 - samples/sec: 399.92 - lr: 0.000140 - momentum: 0.000000
2023-10-13 19:17:37,015 epoch 3 - iter 722/3617 - loss 0.05950424 - time (sec): 193.35 - samples/sec: 392.78 - lr: 0.000139 - momentum: 0.000000
2023-10-13 19:19:13,647 epoch 3 - iter 1083/3617 - loss 0.06066695 - time (sec): 289.98 - samples/sec: 391.53 - lr: 0.000137 - momentum: 0.000000
2023-10-13 19:20:51,628 epoch 3 - iter 1444/3617 - loss 0.06135792 - time (sec): 387.96 - samples/sec: 389.69 - lr: 0.000135 - momentum: 0.000000
2023-10-13 19:22:29,614 epoch 3 - iter 1805/3617 - loss 0.06119400 - time (sec): 485.95 - samples/sec: 390.11 - lr: 0.000133 - momentum: 0.000000
2023-10-13 19:24:06,053 epoch 3 - iter 2166/3617 - loss 0.06300933 - time (sec): 582.39 - samples/sec: 388.28 - lr: 0.000132 - momentum: 0.000000
2023-10-13 19:25:44,104 epoch 3 - iter 2527/3617 - loss 0.06310662 - time (sec): 680.44 - samples/sec: 390.67 - lr: 0.000130 - momentum: 0.000000
2023-10-13 19:27:20,850 epoch 3 - iter 2888/3617 - loss 0.06456942 - time (sec): 777.18 - samples/sec: 388.75 - lr: 0.000128 - momentum: 0.000000
2023-10-13 19:29:01,123 epoch 3 - iter 3249/3617 - loss 0.06429657 - time (sec): 877.46 - samples/sec: 388.25 - lr: 0.000126 - momentum: 0.000000
2023-10-13 19:30:43,178 epoch 3 - iter 3610/3617 - loss 0.06441539 - time (sec): 979.51 - samples/sec: 387.28 - lr: 0.000124 - momentum: 0.000000
2023-10-13 19:30:44,906 ----------------------------------------------------------------------------------------------------
2023-10-13 19:30:44,907 EPOCH 3 done: loss 0.0644 - lr: 0.000124
2023-10-13 19:31:24,062 DEV : loss 0.1586698442697525 - f1-score (micro avg) 0.6321
2023-10-13 19:31:24,118 saving best model
2023-10-13 19:31:26,657 ----------------------------------------------------------------------------------------------------
2023-10-13 19:33:04,849 epoch 4 - iter 361/3617 - loss 0.04452930 - time (sec): 98.19 - samples/sec: 378.11 - lr: 0.000123 - momentum: 0.000000
2023-10-13 19:34:44,327 epoch 4 - iter 722/3617 - loss 0.04066016 - time (sec): 197.67 - samples/sec: 385.17 - lr: 0.000121 - momentum: 0.000000
2023-10-13 19:36:23,824 epoch 4 - iter 1083/3617 - loss 0.04578705 - time (sec): 297.16 - samples/sec: 382.49 - lr: 0.000119 - momentum: 0.000000
2023-10-13 19:38:02,132 epoch 4 - iter 1444/3617 - loss 0.04562518 - time (sec): 395.47 - samples/sec: 381.71 - lr: 0.000117 - momentum: 0.000000
2023-10-13 19:39:39,551 epoch 4 - iter 1805/3617 - loss 0.04646266 - time (sec): 492.89 - samples/sec: 383.13 - lr: 0.000116 - momentum: 0.000000
2023-10-13 19:41:15,820 epoch 4 - iter 2166/3617 - loss 0.04519717 - time (sec): 589.16 - samples/sec: 384.26 - lr: 0.000114 - momentum: 0.000000
2023-10-13 19:42:55,082 epoch 4 - iter 2527/3617 - loss 0.04533348 - time (sec): 688.42 - samples/sec: 383.51 - lr: 0.000112 - momentum: 0.000000
2023-10-13 19:44:36,264 epoch 4 - iter 2888/3617 - loss 0.04473135 - time (sec): 789.60 - samples/sec: 382.69 - lr: 0.000110 - momentum: 0.000000
2023-10-13 19:46:16,493 epoch 4 - iter 3249/3617 - loss 0.04510623 - time (sec): 889.83 - samples/sec: 383.59 - lr: 0.000108 - momentum: 0.000000
2023-10-13 19:47:53,562 epoch 4 - iter 3610/3617 - loss 0.04637355 - time (sec): 986.90 - samples/sec: 384.36 - lr: 0.000107 - momentum: 0.000000
2023-10-13 19:47:55,182 ----------------------------------------------------------------------------------------------------
2023-10-13 19:47:55,182 EPOCH 4 done: loss 0.0464 - lr: 0.000107
2023-10-13 19:48:34,460 DEV : loss 0.2130165547132492 - f1-score (micro avg) 0.6471
2023-10-13 19:48:34,518 saving best model
2023-10-13 19:48:37,082 ----------------------------------------------------------------------------------------------------
2023-10-13 19:50:14,518 epoch 5 - iter 361/3617 - loss 0.02758511 - time (sec): 97.43 - samples/sec: 397.38 - lr: 0.000105 - momentum: 0.000000
2023-10-13 19:51:50,077 epoch 5 - iter 722/3617 - loss 0.02910352 - time (sec): 192.99 - samples/sec: 401.98 - lr: 0.000103 - momentum: 0.000000
2023-10-13 19:53:24,280 epoch 5 - iter 1083/3617 - loss 0.02918864 - time (sec): 287.19 - samples/sec: 398.11 - lr: 0.000101 - momentum: 0.000000
2023-10-13 19:55:00,737 epoch 5 - iter 1444/3617 - loss 0.03179865 - time (sec): 383.65 - samples/sec: 400.35 - lr: 0.000100 - momentum: 0.000000
2023-10-13 19:56:40,217 epoch 5 - iter 1805/3617 - loss 0.03047882 - time (sec): 483.13 - samples/sec: 397.62 - lr: 0.000098 - momentum: 0.000000
2023-10-13 19:58:17,353 epoch 5 - iter 2166/3617 - loss 0.03136089 - time (sec): 580.27 - samples/sec: 393.32 - lr: 0.000096 - momentum: 0.000000
2023-10-13 19:59:58,091 epoch 5 - iter 2527/3617 - loss 0.03107949 - time (sec): 681.01 - samples/sec: 391.65 - lr: 0.000094 - momentum: 0.000000
2023-10-13 20:01:34,403 epoch 5 - iter 2888/3617 - loss 0.03125495 - time (sec): 777.32 - samples/sec: 392.23 - lr: 0.000092 - momentum: 0.000000
2023-10-13 20:03:15,510 epoch 5 - iter 3249/3617 - loss 0.03139656 - time (sec): 878.42 - samples/sec: 388.36 - lr: 0.000091 - momentum: 0.000000
2023-10-13 20:04:55,226 epoch 5 - iter 3610/3617 - loss 0.03151360 - time (sec): 978.14 - samples/sec: 387.71 - lr: 0.000089 - momentum: 0.000000
2023-10-13 20:04:56,955 ----------------------------------------------------------------------------------------------------
2023-10-13 20:04:56,955 EPOCH 5 done: loss 0.0316 - lr: 0.000089
2023-10-13 20:05:37,673 DEV : loss 0.2452983260154724 - f1-score (micro avg) 0.6203
2023-10-13 20:05:37,733 ----------------------------------------------------------------------------------------------------
2023-10-13 20:07:19,848 epoch 6 - iter 361/3617 - loss 0.01651558 - time (sec): 102.11 - samples/sec: 372.95 - lr: 0.000087 - momentum: 0.000000
2023-10-13 20:08:58,158 epoch 6 - iter 722/3617 - loss 0.01869225 - time (sec): 200.42 - samples/sec: 375.10 - lr: 0.000085 - momentum: 0.000000
2023-10-13 20:10:36,227 epoch 6 - iter 1083/3617 - loss 0.01996453 - time (sec): 298.49 - samples/sec: 375.91 - lr: 0.000084 - momentum: 0.000000
2023-10-13 20:12:14,888 epoch 6 - iter 1444/3617 - loss 0.02211179 - time (sec): 397.15 - samples/sec: 379.66 - lr: 0.000082 - momentum: 0.000000
2023-10-13 20:13:51,084 epoch 6 - iter 1805/3617 - loss 0.02255716 - time (sec): 493.35 - samples/sec: 381.30 - lr: 0.000080 - momentum: 0.000000
2023-10-13 20:15:26,711 epoch 6 - iter 2166/3617 - loss 0.02243773 - time (sec): 588.98 - samples/sec: 383.43 - lr: 0.000078 - momentum: 0.000000
2023-10-13 20:17:02,468 epoch 6 - iter 2527/3617 - loss 0.02283358 - time (sec): 684.73 - samples/sec: 385.02 - lr: 0.000076 - momentum: 0.000000
2023-10-13 20:18:38,763 epoch 6 - iter 2888/3617 - loss 0.02312108 - time (sec): 781.03 - samples/sec: 387.56 - lr: 0.000075 - momentum: 0.000000
2023-10-13 20:20:14,206 epoch 6 - iter 3249/3617 - loss 0.02327124 - time (sec): 876.47 - samples/sec: 388.97 - lr: 0.000073 - momentum: 0.000000
2023-10-13 20:21:49,847 epoch 6 - iter 3610/3617 - loss 0.02419111 - time (sec): 972.11 - samples/sec: 390.08 - lr: 0.000071 - momentum: 0.000000
2023-10-13 20:21:51,535 ----------------------------------------------------------------------------------------------------
2023-10-13 20:21:51,535 EPOCH 6 done: loss 0.0242 - lr: 0.000071
2023-10-13 20:22:30,924 DEV : loss 0.27563825249671936 - f1-score (micro avg) 0.6231
2023-10-13 20:22:30,982 ----------------------------------------------------------------------------------------------------
2023-10-13 20:24:08,378 epoch 7 - iter 361/3617 - loss 0.01521082 - time (sec): 97.39 - samples/sec: 396.60 - lr: 0.000069 - momentum: 0.000000
2023-10-13 20:25:46,638 epoch 7 - iter 722/3617 - loss 0.01323902 - time (sec): 195.65 - samples/sec: 389.93 - lr: 0.000068 - momentum: 0.000000
2023-10-13 20:27:24,850 epoch 7 - iter 1083/3617 - loss 0.01408018 - time (sec): 293.87 - samples/sec: 392.34 - lr: 0.000066 - momentum: 0.000000
2023-10-13 20:29:00,668 epoch 7 - iter 1444/3617 - loss 0.01314577 - time (sec): 389.68 - samples/sec: 389.85 - lr: 0.000064 - momentum: 0.000000
2023-10-13 20:30:36,682 epoch 7 - iter 1805/3617 - loss 0.01419142 - time (sec): 485.70 - samples/sec: 391.03 - lr: 0.000062 - momentum: 0.000000
2023-10-13 20:32:12,581 epoch 7 - iter 2166/3617 - loss 0.01554244 - time (sec): 581.60 - samples/sec: 394.58 - lr: 0.000060 - momentum: 0.000000
2023-10-13 20:33:47,873 epoch 7 - iter 2527/3617 - loss 0.01578059 - time (sec): 676.89 - samples/sec: 394.74 - lr: 0.000059 - momentum: 0.000000
2023-10-13 20:35:23,614 epoch 7 - iter 2888/3617 - loss 0.01553333 - time (sec): 772.63 - samples/sec: 393.38 - lr: 0.000057 - momentum: 0.000000
2023-10-13 20:36:59,404 epoch 7 - iter 3249/3617 - loss 0.01558103 - time (sec): 868.42 - samples/sec: 392.48 - lr: 0.000055 - momentum: 0.000000
2023-10-13 20:38:37,239 epoch 7 - iter 3610/3617 - loss 0.01537530 - time (sec): 966.25 - samples/sec: 392.34 - lr: 0.000053 - momentum: 0.000000
2023-10-13 20:38:39,132 ----------------------------------------------------------------------------------------------------
2023-10-13 20:38:39,133 EPOCH 7 done: loss 0.0153 - lr: 0.000053
2023-10-13 20:39:17,634 DEV : loss 0.33923789858818054 - f1-score (micro avg) 0.6456
2023-10-13 20:39:17,691 ----------------------------------------------------------------------------------------------------
2023-10-13 20:40:58,364 epoch 8 - iter 361/3617 - loss 0.01277912 - time (sec): 100.67 - samples/sec: 378.20 - lr: 0.000052 - momentum: 0.000000
2023-10-13 20:42:39,601 epoch 8 - iter 722/3617 - loss 0.01152385 - time (sec): 201.91 - samples/sec: 384.45 - lr: 0.000050 - momentum: 0.000000
2023-10-13 20:44:18,870 epoch 8 - iter 1083/3617 - loss 0.01084500 - time (sec): 301.18 - samples/sec: 385.62 - lr: 0.000048 - momentum: 0.000000
2023-10-13 20:45:57,550 epoch 8 - iter 1444/3617 - loss 0.00969400 - time (sec): 399.86 - samples/sec: 386.70 - lr: 0.000046 - momentum: 0.000000
2023-10-13 20:47:33,176 epoch 8 - iter 1805/3617 - loss 0.00998289 - time (sec): 495.48 - samples/sec: 385.30 - lr: 0.000044 - momentum: 0.000000
2023-10-13 20:49:09,788 epoch 8 - iter 2166/3617 - loss 0.01065515 - time (sec): 592.09 - samples/sec: 388.20 - lr: 0.000043 - momentum: 0.000000
2023-10-13 20:50:44,987 epoch 8 - iter 2527/3617 - loss 0.01060654 - time (sec): 687.29 - samples/sec: 387.95 - lr: 0.000041 - momentum: 0.000000
2023-10-13 20:52:20,864 epoch 8 - iter 2888/3617 - loss 0.01058643 - time (sec): 783.17 - samples/sec: 388.32 - lr: 0.000039 - momentum: 0.000000
2023-10-13 20:53:59,752 epoch 8 - iter 3249/3617 - loss 0.01058392 - time (sec): 882.06 - samples/sec: 388.07 - lr: 0.000037 - momentum: 0.000000
2023-10-13 20:55:42,153 epoch 8 - iter 3610/3617 - loss 0.01028093 - time (sec): 984.46 - samples/sec: 385.49 - lr: 0.000036 - momentum: 0.000000
2023-10-13 20:55:43,713 ----------------------------------------------------------------------------------------------------
2023-10-13 20:55:43,713 EPOCH 8 done: loss 0.0103 - lr: 0.000036
2023-10-13 20:56:23,186 DEV : loss 0.33330851793289185 - f1-score (micro avg) 0.6595
2023-10-13 20:56:23,252 saving best model
2023-10-13 20:56:25,832 ----------------------------------------------------------------------------------------------------
2023-10-13 20:58:02,028 epoch 9 - iter 361/3617 - loss 0.00470043 - time (sec): 96.19 - samples/sec: 378.73 - lr: 0.000034 - momentum: 0.000000
2023-10-13 20:59:40,260 epoch 9 - iter 722/3617 - loss 0.00531921 - time (sec): 194.42 - samples/sec: 384.79 - lr: 0.000032 - momentum: 0.000000
2023-10-13 21:01:18,063 epoch 9 - iter 1083/3617 - loss 0.00654361 - time (sec): 292.22 - samples/sec: 387.48 - lr: 0.000030 - momentum: 0.000000
2023-10-13 21:02:57,618 epoch 9 - iter 1444/3617 - loss 0.00676990 - time (sec): 391.78 - samples/sec: 385.89 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:04:40,159 epoch 9 - iter 1805/3617 - loss 0.00686632 - time (sec): 494.32 - samples/sec: 383.67 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:06:17,098 epoch 9 - iter 2166/3617 - loss 0.00685262 - time (sec): 591.26 - samples/sec: 385.86 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:07:52,623 epoch 9 - iter 2527/3617 - loss 0.00658488 - time (sec): 686.78 - samples/sec: 386.29 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:09:28,784 epoch 9 - iter 2888/3617 - loss 0.00668225 - time (sec): 782.95 - samples/sec: 384.96 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:11:07,550 epoch 9 - iter 3249/3617 - loss 0.00659644 - time (sec): 881.71 - samples/sec: 385.57 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:12:45,880 epoch 9 - iter 3610/3617 - loss 0.00676788 - time (sec): 980.04 - samples/sec: 386.92 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:12:47,696 ----------------------------------------------------------------------------------------------------
2023-10-13 21:12:47,696 EPOCH 9 done: loss 0.0068 - lr: 0.000018
2023-10-13 21:13:27,417 DEV : loss 0.3747117519378662 - f1-score (micro avg) 0.6531
2023-10-13 21:13:27,476 ----------------------------------------------------------------------------------------------------
2023-10-13 21:15:06,935 epoch 10 - iter 361/3617 - loss 0.00214926 - time (sec): 99.46 - samples/sec: 384.82 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:16:45,731 epoch 10 - iter 722/3617 - loss 0.00207497 - time (sec): 198.25 - samples/sec: 383.67 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:18:22,667 epoch 10 - iter 1083/3617 - loss 0.00303010 - time (sec): 295.19 - samples/sec: 384.39 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:20:05,149 epoch 10 - iter 1444/3617 - loss 0.00354649 - time (sec): 397.67 - samples/sec: 382.40 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:21:42,479 epoch 10 - iter 1805/3617 - loss 0.00366430 - time (sec): 495.00 - samples/sec: 382.22 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:23:21,257 epoch 10 - iter 2166/3617 - loss 0.00439229 - time (sec): 593.78 - samples/sec: 382.12 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:25:03,186 epoch 10 - iter 2527/3617 - loss 0.00430784 - time (sec): 695.71 - samples/sec: 381.88 - lr: 0.000005 - momentum: 0.000000
2023-10-13 21:26:45,484 epoch 10 - iter 2888/3617 - loss 0.00395290 - time (sec): 798.01 - samples/sec: 381.84 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:28:24,586 epoch 10 - iter 3249/3617 - loss 0.00388602 - time (sec): 897.11 - samples/sec: 379.67 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:30:08,585 epoch 10 - iter 3610/3617 - loss 0.00392291 - time (sec): 1001.11 - samples/sec: 378.90 - lr: 0.000000 - momentum: 0.000000
2023-10-13 21:30:10,346 ----------------------------------------------------------------------------------------------------
2023-10-13 21:30:10,347 EPOCH 10 done: loss 0.0039 - lr: 0.000000
2023-10-13 21:30:52,851 DEV : loss 0.3852955400943756 - f1-score (micro avg) 0.6562
2023-10-13 21:30:53,782 ----------------------------------------------------------------------------------------------------
2023-10-13 21:30:53,784 Loading model from best epoch ...
2023-10-13 21:30:57,709 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-13 21:31:56,313
Results:
- F-score (micro) 0.6292
- F-score (macro) 0.4868
- Accuracy 0.4702
By class:
              precision    recall  f1-score   support
         loc     0.6219    0.7597    0.6839       591
        pers     0.5708    0.7003    0.6289       357
         org     0.1571    0.1392    0.1477        79
   micro avg     0.5772    0.6913    0.6292      1027
   macro avg     0.4499    0.5331    0.4868      1027
weighted avg     0.5684    0.6913    0.6236      1027
2023-10-13 21:31:56,313 ----------------------------------------------------------------------------------------------------