2023-11-16 03:28:06,601 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,603 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(250003, 1024)
        (position_embeddings): Embedding(514, 1024, padding_idx=1)
        (token_type_embeddings): Embedding(1, 1024)
        (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-23): 24 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=1024, out_features=1024, bias=True)
                (key): Linear(in_features=1024, out_features=1024, bias=True)
                (value): Linear(in_features=1024, out_features=1024, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=1024, out_features=1024, bias=True)
                (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=1024, out_features=4096, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=4096, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=1024, out_features=1024, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1024, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-11-16 03:28:06,603 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,603 MultiCorpus: 30000 train + 10000 dev + 10000 test sentences
 - ColumnCorpus Corpus: 20000 train + 0 dev + 0 test sentences - /root/.flair/datasets/ner_multi_xtreme/en
 - ColumnCorpus Corpus: 10000 train + 10000 dev + 10000 test sentences - /root/.flair/datasets/ner_multi_xtreme/ka
2023-11-16 03:28:06,603 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,603 Train: 30000 sentences
2023-11-16 03:28:06,603 (train_with_dev=False, train_with_test=False)
2023-11-16 03:28:06,603 ----------------------------------------------------------------------------------------------------
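The model printout and corpus summary above fix the architecture and data, but the training script itself is not part of this log. The following Flair sketch is a hedged reconstruction: argument names follow Flair's public API, while anything not printed above (e.g. that the English dev/test splits were dropped before building the MultiCorpus, which the one-line loader below does not replicate) is an assumption.

from flair.datasets import NER_MULTI_XTREME
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# XTREME (WikiANN) English + Georgian as one MultiCorpus; the log above
# shows 20000 en + 10000 ka = 30000 training sentences.
corpus = NER_MULTI_XTREME(languages=["en", "ka"])

# 13 labels: O plus BIOES tags for LOC, ORG and PER, matching the tag
# dictionary printed at the end of this log.
label_dict = corpus.make_label_dictionary(label_type="ner")

# xlm-roberta-large backbone, fine-tuned end to end, as in the
# XLMRobertaModel printout above.
embeddings = TransformerWordEmbeddings("xlm-roberta-large", fine_tune=True)

# No RNN, no CRF, no reprojection: the head reduces to LockedDropout(0.5)
# followed by Linear(1024 -> 13) trained with CrossEntropyLoss, exactly
# the modules shown in the printout.
tagger = SequenceTagger(
    hidden_size=256,  # unused when use_rnn=False
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)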
2023-11-16 03:28:06,604 Training Params:
2023-11-16 03:28:06,604 - learning_rate: "5e-06"
2023-11-16 03:28:06,604 - mini_batch_size: "4"
2023-11-16 03:28:06,604 - max_epochs: "10"
2023-11-16 03:28:06,604 - shuffle: "True"
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Plugins:
2023-11-16 03:28:06,604 - TensorboardLogger
2023-11-16 03:28:06,604 - LinearScheduler | warmup_fraction: '0.1'
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Final evaluation on model from best epoch (best-model.pt)
2023-11-16 03:28:06,604 - metric: "('micro avg', 'f1-score')"
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Computation:
2023-11-16 03:28:06,604 - compute on device: cuda:0
2023-11-16 03:28:06,604 - embedding storage: none
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Model training base path: "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-3"
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
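A matching sketch of the fine-tuning call that would produce the parameter block above. Flair's ModelTrainer.fine_tune applies a linear scheduler with warmup by default, consistent with the logged "LinearScheduler | warmup_fraction: '0.1'"; the TensorboardLogger plugin import path is taken from recent Flair releases and is an assumption here.

from flair.trainers import ModelTrainer
from flair.trainers.plugins import TensorboardLogger

# tagger and corpus as constructed in the sketch further up.
trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-3",
    learning_rate=5e-6,               # logged as "5e-06"
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    embeddings_storage_mode="none",   # "embedding storage: none" above
    plugins=[TensorboardLogger()],    # assumption: default log directory
)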
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Logging anything other than scalars to TensorBoard is currently not supported.
2023-11-16 03:29:38,193 epoch 1 - iter 750/7500 - loss 2.70469216 - time (sec): 91.59 - samples/sec: 264.85 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:31:09,213 epoch 1 - iter 1500/7500 - loss 2.24893654 - time (sec): 182.61 - samples/sec: 261.81 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:32:42,308 epoch 1 - iter 2250/7500 - loss 1.97006153 - time (sec): 275.70 - samples/sec: 260.33 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:34:16,815 epoch 1 - iter 3000/7500 - loss 1.72031860 - time (sec): 370.21 - samples/sec: 260.02 - lr: 0.000002 - momentum: 0.000000
2023-11-16 03:35:50,112 epoch 1 - iter 3750/7500 - loss 1.52308109 - time (sec): 463.51 - samples/sec: 259.42 - lr: 0.000002 - momentum: 0.000000
2023-11-16 03:37:23,760 epoch 1 - iter 4500/7500 - loss 1.36457847 - time (sec): 557.15 - samples/sec: 259.48 - lr: 0.000003 - momentum: 0.000000
2023-11-16 03:38:57,168 epoch 1 - iter 5250/7500 - loss 1.24407079 - time (sec): 650.56 - samples/sec: 259.07 - lr: 0.000003 - momentum: 0.000000
2023-11-16 03:40:28,972 epoch 1 - iter 6000/7500 - loss 1.15260515 - time (sec): 742.37 - samples/sec: 259.75 - lr: 0.000004 - momentum: 0.000000
2023-11-16 03:42:03,894 epoch 1 - iter 6750/7500 - loss 1.07519645 - time (sec): 837.29 - samples/sec: 258.95 - lr: 0.000004 - momentum: 0.000000
2023-11-16 03:43:39,060 epoch 1 - iter 7500/7500 - loss 1.01557427 - time (sec): 932.45 - samples/sec: 258.24 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:43:39,062 ----------------------------------------------------------------------------------------------------
2023-11-16 03:43:39,063 EPOCH 1 done: loss 1.0156 - lr: 0.000005
2023-11-16 03:44:06,229 DEV : loss 0.27559971809387207 - f1-score (micro avg) 0.8152
2023-11-16 03:44:08,725 saving best model
2023-11-16 03:44:10,470 ----------------------------------------------------------------------------------------------------
2023-11-16 03:45:42,474 epoch 2 - iter 750/7500 - loss 0.39106376 - time (sec): 92.00 - samples/sec: 261.03 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:47:15,847 epoch 2 - iter 1500/7500 - loss 0.40555598 - time (sec): 185.37 - samples/sec: 261.97 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:48:49,533 epoch 2 - iter 2250/7500 - loss 0.40652252 - time (sec): 279.06 - samples/sec: 260.36 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:50:24,376 epoch 2 - iter 3000/7500 - loss 0.40712357 - time (sec): 373.90 - samples/sec: 258.58 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:52:01,501 epoch 2 - iter 3750/7500 - loss 0.40345429 - time (sec): 471.03 - samples/sec: 256.65 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:53:38,242 epoch 2 - iter 4500/7500 - loss 0.40372313 - time (sec): 567.77 - samples/sec: 255.87 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:55:11,702 epoch 2 - iter 5250/7500 - loss 0.40504927 - time (sec): 661.23 - samples/sec: 255.50 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:56:44,579 epoch 2 - iter 6000/7500 - loss 0.40569421 - time (sec): 754.11 - samples/sec: 256.15 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:58:17,886 epoch 2 - iter 6750/7500 - loss 0.40571892 - time (sec): 847.41 - samples/sec: 256.18 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:59:50,847 epoch 2 - iter 7500/7500 - loss 0.40365851 - time (sec): 940.37 - samples/sec: 256.06 - lr: 0.000004 - momentum: 0.000000
2023-11-16 03:59:50,849 ----------------------------------------------------------------------------------------------------
2023-11-16 03:59:50,849 EPOCH 2 done: loss 0.4037 - lr: 0.000004
2023-11-16 04:00:17,681 DEV : loss 0.271997332572937 - f1-score (micro avg) 0.8697
2023-11-16 04:00:20,070 saving best model
2023-11-16 04:00:23,060 ----------------------------------------------------------------------------------------------------
2023-11-16 04:01:57,142 epoch 3 - iter 750/7500 - loss 0.34646794 - time (sec): 94.08 - samples/sec: 250.74 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:03:32,257 epoch 3 - iter 1500/7500 - loss 0.33277165 - time (sec): 189.19 - samples/sec: 253.91 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:05:06,742 epoch 3 - iter 2250/7500 - loss 0.34013081 - time (sec): 283.68 - samples/sec: 253.23 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:06:41,133 epoch 3 - iter 3000/7500 - loss 0.33864371 - time (sec): 378.07 - samples/sec: 253.41 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:08:14,833 epoch 3 - iter 3750/7500 - loss 0.34190452 - time (sec): 471.77 - samples/sec: 254.37 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:09:45,391 epoch 3 - iter 4500/7500 - loss 0.34219639 - time (sec): 562.33 - samples/sec: 256.12 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:11:18,334 epoch 3 - iter 5250/7500 - loss 0.34365478 - time (sec): 655.27 - samples/sec: 256.94 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:12:52,829 epoch 3 - iter 6000/7500 - loss 0.34431528 - time (sec): 749.76 - samples/sec: 256.24 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:14:25,065 epoch 3 - iter 6750/7500 - loss 0.34309773 - time (sec): 842.00 - samples/sec: 257.59 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:15:57,201 epoch 3 - iter 7500/7500 - loss 0.34251715 - time (sec): 934.14 - samples/sec: 257.77 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:15:57,204 ----------------------------------------------------------------------------------------------------
2023-11-16 04:15:57,204 EPOCH 3 done: loss 0.3425 - lr: 0.000004
2023-11-16 04:16:24,728 DEV : loss 0.2714731991291046 - f1-score (micro avg) 0.8842
2023-11-16 04:16:27,191 saving best model
2023-11-16 04:16:29,639 ----------------------------------------------------------------------------------------------------
2023-11-16 04:18:06,042 epoch 4 - iter 750/7500 - loss 0.29074268 - time (sec): 96.40 - samples/sec: 252.40 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:19:39,895 epoch 4 - iter 1500/7500 - loss 0.29294947 - time (sec): 190.25 - samples/sec: 256.92 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:21:12,116 epoch 4 - iter 2250/7500 - loss 0.29693683 - time (sec): 282.47 - samples/sec: 257.67 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:22:43,053 epoch 4 - iter 3000/7500 - loss 0.29670062 - time (sec): 373.41 - samples/sec: 259.41 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:24:17,342 epoch 4 - iter 3750/7500 - loss 0.29561519 - time (sec): 467.70 - samples/sec: 257.80 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:25:50,373 epoch 4 - iter 4500/7500 - loss 0.29194840 - time (sec): 560.73 - samples/sec: 258.18 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:27:24,822 epoch 4 - iter 5250/7500 - loss 0.29857267 - time (sec): 655.18 - samples/sec: 257.96 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:28:58,980 epoch 4 - iter 6000/7500 - loss 0.30018714 - time (sec): 749.34 - samples/sec: 257.24 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:30:33,294 epoch 4 - iter 6750/7500 - loss 0.30336094 - time (sec): 843.65 - samples/sec: 257.08 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:32:07,976 epoch 4 - iter 7500/7500 - loss 0.30240959 - time (sec): 938.33 - samples/sec: 256.62 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:32:07,980 ----------------------------------------------------------------------------------------------------
2023-11-16 04:32:07,980 EPOCH 4 done: loss 0.3024 - lr: 0.000003
2023-11-16 04:32:35,569 DEV : loss 0.2897871732711792 - f1-score (micro avg) 0.8922
2023-11-16 04:32:38,075 saving best model
2023-11-16 04:32:40,983 ----------------------------------------------------------------------------------------------------
2023-11-16 04:34:14,736 epoch 5 - iter 750/7500 - loss 0.22168761 - time (sec): 93.75 - samples/sec: 260.88 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:35:48,275 epoch 5 - iter 1500/7500 - loss 0.23358638 - time (sec): 187.29 - samples/sec: 258.77 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:37:23,490 epoch 5 - iter 2250/7500 - loss 0.24130242 - time (sec): 282.50 - samples/sec: 256.72 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:38:57,959 epoch 5 - iter 3000/7500 - loss 0.24848714 - time (sec): 376.97 - samples/sec: 257.38 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:40:32,648 epoch 5 - iter 3750/7500 - loss 0.25384312 - time (sec): 471.66 - samples/sec: 255.68 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:42:08,140 epoch 5 - iter 4500/7500 - loss 0.25352346 - time (sec): 567.15 - samples/sec: 254.83 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:43:42,002 epoch 5 - iter 5250/7500 - loss 0.25599881 - time (sec): 661.01 - samples/sec: 255.09 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:45:13,434 epoch 5 - iter 6000/7500 - loss 0.25515887 - time (sec): 752.45 - samples/sec: 255.70 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:46:47,959 epoch 5 - iter 6750/7500 - loss 0.25539887 - time (sec): 846.97 - samples/sec: 255.87 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:48:21,835 epoch 5 - iter 7500/7500 - loss 0.25660205 - time (sec): 940.85 - samples/sec: 255.94 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:48:21,838 ----------------------------------------------------------------------------------------------------
2023-11-16 04:48:21,838 EPOCH 5 done: loss 0.2566 - lr: 0.000003
2023-11-16 04:48:49,130 DEV : loss 0.28101304173469543 - f1-score (micro avg) 0.8973
2023-11-16 04:48:51,696 saving best model
2023-11-16 04:48:53,741 ----------------------------------------------------------------------------------------------------
2023-11-16 04:50:26,277 epoch 6 - iter 750/7500 - loss 0.22465859 - time (sec): 92.53 - samples/sec: 255.27 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:52:00,738 epoch 6 - iter 1500/7500 - loss 0.21970656 - time (sec): 186.99 - samples/sec: 254.83 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:53:34,911 epoch 6 - iter 2250/7500 - loss 0.21946764 - time (sec): 281.17 - samples/sec: 255.61 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:55:09,828 epoch 6 - iter 3000/7500 - loss 0.21638489 - time (sec): 376.08 - samples/sec: 255.02 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:56:43,313 epoch 6 - iter 3750/7500 - loss 0.21414458 - time (sec): 469.57 - samples/sec: 255.78 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:58:15,828 epoch 6 - iter 4500/7500 - loss 0.21434532 - time (sec): 562.08 - samples/sec: 256.74 - lr: 0.000002 - momentum: 0.000000
2023-11-16 04:59:47,824 epoch 6 - iter 5250/7500 - loss 0.21772911 - time (sec): 654.08 - samples/sec: 257.12 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:01:19,648 epoch 6 - iter 6000/7500 - loss 0.21657089 - time (sec): 745.90 - samples/sec: 257.68 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:02:53,268 epoch 6 - iter 6750/7500 - loss 0.21549326 - time (sec): 839.52 - samples/sec: 257.74 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:04:26,550 epoch 6 - iter 7500/7500 - loss 0.21351207 - time (sec): 932.80 - samples/sec: 258.14 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:04:26,555 ----------------------------------------------------------------------------------------------------
2023-11-16 05:04:26,555 EPOCH 6 done: loss 0.2135 - lr: 0.000002
2023-11-16 05:04:53,798 DEV : loss 0.3079068958759308 - f1-score (micro avg) 0.9002
2023-11-16 05:04:56,055 saving best model
2023-11-16 05:04:58,666 ----------------------------------------------------------------------------------------------------
2023-11-16 05:06:33,192 epoch 7 - iter 750/7500 - loss 0.18268097 - time (sec): 94.52 - samples/sec: 251.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:08:05,085 epoch 7 - iter 1500/7500 - loss 0.18175139 - time (sec): 186.42 - samples/sec: 257.03 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:09:42,400 epoch 7 - iter 2250/7500 - loss 0.19001507 - time (sec): 283.73 - samples/sec: 252.39 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:11:16,393 epoch 7 - iter 3000/7500 - loss 0.18641112 - time (sec): 377.72 - samples/sec: 253.16 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:12:51,038 epoch 7 - iter 3750/7500 - loss 0.18515279 - time (sec): 472.37 - samples/sec: 253.66 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:14:26,806 epoch 7 - iter 4500/7500 - loss 0.18525402 - time (sec): 568.14 - samples/sec: 253.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:16:01,622 epoch 7 - iter 5250/7500 - loss 0.18863436 - time (sec): 662.95 - samples/sec: 253.50 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:17:35,258 epoch 7 - iter 6000/7500 - loss 0.18494686 - time (sec): 756.59 - samples/sec: 253.73 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:19:09,299 epoch 7 - iter 6750/7500 - loss 0.18556342 - time (sec): 850.63 - samples/sec: 254.64 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:20:43,065 epoch 7 - iter 7500/7500 - loss 0.18644460 - time (sec): 944.40 - samples/sec: 254.97 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:20:43,069 ----------------------------------------------------------------------------------------------------
2023-11-16 05:20:43,069 EPOCH 7 done: loss 0.1864 - lr: 0.000002
2023-11-16 05:21:10,981 DEV : loss 0.2802160382270813 - f1-score (micro avg) 0.9048
2023-11-16 05:21:13,241 saving best model
2023-11-16 05:21:15,612 ----------------------------------------------------------------------------------------------------
2023-11-16 05:22:49,896 epoch 8 - iter 750/7500 - loss 0.14122739 - time (sec): 94.28 - samples/sec: 259.14 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:24:23,137 epoch 8 - iter 1500/7500 - loss 0.14874139 - time (sec): 187.52 - samples/sec: 258.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:25:56,602 epoch 8 - iter 2250/7500 - loss 0.15341856 - time (sec): 280.99 - samples/sec: 257.70 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:27:28,992 epoch 8 - iter 3000/7500 - loss 0.15416389 - time (sec): 373.38 - samples/sec: 258.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:29:03,053 epoch 8 - iter 3750/7500 - loss 0.15634692 - time (sec): 467.44 - samples/sec: 257.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:30:37,470 epoch 8 - iter 4500/7500 - loss 0.15700278 - time (sec): 561.85 - samples/sec: 256.31 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:32:10,851 epoch 8 - iter 5250/7500 - loss 0.15692674 - time (sec): 655.24 - samples/sec: 256.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:33:44,315 epoch 8 - iter 6000/7500 - loss 0.15879525 - time (sec): 748.70 - samples/sec: 256.96 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:35:16,930 epoch 8 - iter 6750/7500 - loss 0.15726830 - time (sec): 841.31 - samples/sec: 257.53 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:36:52,298 epoch 8 - iter 7500/7500 - loss 0.15647824 - time (sec): 936.68 - samples/sec: 257.07 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:36:52,300 ----------------------------------------------------------------------------------------------------
2023-11-16 05:36:52,301 EPOCH 8 done: loss 0.1565 - lr: 0.000001
2023-11-16 05:37:19,973 DEV : loss 0.3105733096599579 - f1-score (micro avg) 0.9056
2023-11-16 05:37:21,975 saving best model
2023-11-16 05:37:24,268 ----------------------------------------------------------------------------------------------------
2023-11-16 05:38:57,543 epoch 9 - iter 750/7500 - loss 0.13578701 - time (sec): 93.27 - samples/sec: 260.82 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:40:30,477 epoch 9 - iter 1500/7500 - loss 0.13977943 - time (sec): 186.21 - samples/sec: 258.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:42:04,338 epoch 9 - iter 2250/7500 - loss 0.13579281 - time (sec): 280.07 - samples/sec: 257.18 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:43:38,134 epoch 9 - iter 3000/7500 - loss 0.13083188 - time (sec): 373.86 - samples/sec: 257.67 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:45:11,777 epoch 9 - iter 3750/7500 - loss 0.13761002 - time (sec): 467.51 - samples/sec: 257.61 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:46:46,321 epoch 9 - iter 4500/7500 - loss 0.13992387 - time (sec): 562.05 - samples/sec: 256.71 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:48:20,030 epoch 9 - iter 5250/7500 - loss 0.13868841 - time (sec): 655.76 - samples/sec: 256.12 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:49:52,363 epoch 9 - iter 6000/7500 - loss 0.13924211 - time (sec): 748.09 - samples/sec: 256.89 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:51:24,421 epoch 9 - iter 6750/7500 - loss 0.13714285 - time (sec): 840.15 - samples/sec: 257.44 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:52:57,141 epoch 9 - iter 7500/7500 - loss 0.13574777 - time (sec): 932.87 - samples/sec: 258.12 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:52:57,144 ----------------------------------------------------------------------------------------------------
2023-11-16 05:52:57,144 EPOCH 9 done: loss 0.1357 - lr: 0.000001
2023-11-16 05:53:24,122 DEV : loss 0.30354949831962585 - f1-score (micro avg) 0.9069
2023-11-16 05:53:26,256 saving best model
2023-11-16 05:53:28,623 ----------------------------------------------------------------------------------------------------
2023-11-16 05:55:01,065 epoch 10 - iter 750/7500 - loss 0.11662424 - time (sec): 92.44 - samples/sec: 258.78 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:56:34,198 epoch 10 - iter 1500/7500 - loss 0.10739844 - time (sec): 185.57 - samples/sec: 260.22 - lr: 0.000000 - momentum: 0.000000
2023-11-16 05:58:07,104 epoch 10 - iter 2250/7500 - loss 0.11728002 - time (sec): 278.48 - samples/sec: 261.23 - lr: 0.000000 - momentum: 0.000000
2023-11-16 05:59:38,545 epoch 10 - iter 3000/7500 - loss 0.11111246 - time (sec): 369.92 - samples/sec: 263.47 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:01:13,020 epoch 10 - iter 3750/7500 - loss 0.11185424 - time (sec): 464.39 - samples/sec: 261.72 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:02:47,617 epoch 10 - iter 4500/7500 - loss 0.11443883 - time (sec): 558.99 - samples/sec: 260.01 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:04:22,079 epoch 10 - iter 5250/7500 - loss 0.11684866 - time (sec): 653.45 - samples/sec: 259.18 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:05:55,899 epoch 10 - iter 6000/7500 - loss 0.11690532 - time (sec): 747.27 - samples/sec: 258.80 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:07:29,604 epoch 10 - iter 6750/7500 - loss 0.11669926 - time (sec): 840.98 - samples/sec: 258.11 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:09:02,429 epoch 10 - iter 7500/7500 - loss 0.11723510 - time (sec): 933.80 - samples/sec: 257.87 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:09:02,432 ----------------------------------------------------------------------------------------------------
2023-11-16 06:09:02,432 EPOCH 10 done: loss 0.1172 - lr: 0.000000
2023-11-16 06:09:29,736 DEV : loss 0.3160940110683441 - f1-score (micro avg) 0.9064
2023-11-16 06:09:34,595 ----------------------------------------------------------------------------------------------------
2023-11-16 06:09:34,598 Loading model from best epoch ...
2023-11-16 06:09:44,517 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER
2023-11-16 06:10:13,551
Results:
- F-score (micro) 0.9076
- F-score (macro) 0.9067
- Accuracy 0.8601
By class:
              precision    recall  f1-score   support

         LOC     0.9066    0.9143    0.9105      5288
         PER     0.9231    0.9485    0.9356      3962
         ORG     0.8737    0.8742    0.8739      3807

   micro avg     0.9022    0.9130    0.9076     13057
   macro avg     0.9012    0.9123    0.9067     13057
weighted avg     0.9020    0.9130    0.9075     13057
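Note that the micro F-score is the harmonic mean of micro precision and recall: 2 x 0.9022 x 0.9130 / (0.9022 + 0.9130) ≈ 0.9076, consistent with the table above.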
2023-11-16 06:10:13,551 ----------------------------------------------------------------------------------------------------
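To use the trained model, load the best-model.pt checkpoint saved above and call predict; Flair decodes the 13-tag BIOES scheme into labeled spans. A minimal usage sketch (the Georgian example sentence is illustrative only):

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint that was saved as "best model" during training.
tagger = SequenceTagger.load(
    "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-3/best-model.pt"
)

sentence = Sentence("გიორგი ცხოვრობს თბილისში.")  # "Giorgi lives in Tbilisi."
tagger.predict(sentence)

# BIOES tags are decoded into PER / LOC / ORG spans.
for entity in sentence.get_spans("ner"):
    print(entity)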