2023-10-17 18:14:36,572 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:36,573 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:14:36,573 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:36,573 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 18:14:36,573 ----------------------------------------------------------------------------------------------------
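For readers who want to reproduce this setup, the following is a minimal sketch (not the original training script) of how the tagger printed above can be assembled in Flair. The dataset loader name, the base model identifier, and the constructor arguments marked in the comments are assumptions inferred from the corpus line and the run name in the training base path below.

```python
# Minimal sketch, assuming Flair >= 0.12; not the original training script.
from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# Dutch split of the ICDAR-Europeana NER corpus (assumed loader for the corpus shown above)
corpus = NER_ICDAR_EUROPEANA(language="nl")
label_dict = corpus.make_label_dictionary(label_type="ner")  # 13 BIOES tags over PER/LOC/ORG

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed from the run name
    layers="-1",               # "layers-1" in the run name: last transformer layer only
    subtoken_pooling="first",  # "poolingfirst" in the run name
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,             # "crfFalse" in the run name; no CRF layer in the printout above
    use_rnn=False,             # no RNN module in the printout above
    reproject_embeddings=False,
)
```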
2023-10-17 18:14:36,573 Train: 5777 sentences
2023-10-17 18:14:36,573 (train_with_dev=False, train_with_test=False)
2023-10-17 18:14:36,573 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:36,573 Training Params:
2023-10-17 18:14:36,573 - learning_rate: "5e-05"
2023-10-17 18:14:36,573 - mini_batch_size: "8"
2023-10-17 18:14:36,573 - max_epochs: "10"
2023-10-17 18:14:36,574 - shuffle: "True"
2023-10-17 18:14:36,574 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:36,574 Plugins:
2023-10-17 18:14:36,574 - TensorboardLogger
2023-10-17 18:14:36,574 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:14:36,574 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:36,574 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:14:36,574 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:14:36,574 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:36,574 Computation:
2023-10-17 18:14:36,574 - compute on device: cuda:0
2023-10-17 18:14:36,574 - embedding storage: none
2023-10-17 18:14:36,574 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:36,574 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 18:14:36,574 ----------------------------------------------------------------------------------------------------
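The training parameters and base path logged above map directly onto Flair's fine-tuning entry point. A hedged continuation of the earlier sketch (reusing `tagger` and `corpus` from it); the warmup handling is an assumption about this Flair version's defaults.

```python
# Continuation of the earlier sketch: fine-tune with the parameters logged above.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator"
    "-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=5e-5,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
)
# fine_tune() uses a linear schedule with warmup by default (warmup_fraction 0.1 in this log);
# the TensorBoard logging seen under "Plugins" was added via a trainer plugin in the actual run.
```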
2023-10-17 18:14:36,574 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:36,574 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:14:41,661 epoch 1 - iter 72/723 - loss 2.82846299 - time (sec): 5.09 - samples/sec: 3184.96 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:14:47,525 epoch 1 - iter 144/723 - loss 1.53756178 - time (sec): 10.95 - samples/sec: 3122.44 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:14:52,788 epoch 1 - iter 216/723 - loss 1.08563753 - time (sec): 16.21 - samples/sec: 3188.21 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:14:58,171 epoch 1 - iter 288/723 - loss 0.85363225 - time (sec): 21.60 - samples/sec: 3202.36 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:15:03,632 epoch 1 - iter 360/723 - loss 0.70678074 - time (sec): 27.06 - samples/sec: 3220.89 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:15:09,301 epoch 1 - iter 432/723 - loss 0.60781553 - time (sec): 32.73 - samples/sec: 3226.45 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:15:14,759 epoch 1 - iter 504/723 - loss 0.53815724 - time (sec): 38.18 - samples/sec: 3229.85 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:15:20,287 epoch 1 - iter 576/723 - loss 0.48624214 - time (sec): 43.71 - samples/sec: 3230.61 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:15:26,129 epoch 1 - iter 648/723 - loss 0.44583351 - time (sec): 49.55 - samples/sec: 3209.61 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:15:30,721 epoch 1 - iter 720/723 - loss 0.41328459 - time (sec): 54.15 - samples/sec: 3244.08 - lr: 0.000050 - momentum: 0.000000
2023-10-17 18:15:30,886 ----------------------------------------------------------------------------------------------------
2023-10-17 18:15:30,886 EPOCH 1 done: loss 0.4122 - lr: 0.000050
2023-10-17 18:15:33,753 DEV : loss 0.12260212749242783 - f1-score (micro avg) 0.5904
2023-10-17 18:15:33,773 saving best model
2023-10-17 18:15:34,229 ----------------------------------------------------------------------------------------------------
2023-10-17 18:15:39,667 epoch 2 - iter 72/723 - loss 0.08593148 - time (sec): 5.44 - samples/sec: 3423.88 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:15:44,495 epoch 2 - iter 144/723 - loss 0.08917940 - time (sec): 10.26 - samples/sec: 3457.27 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:15:50,485 epoch 2 - iter 216/723 - loss 0.09223283 - time (sec): 16.25 - samples/sec: 3293.87 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:15:55,565 epoch 2 - iter 288/723 - loss 0.08777997 - time (sec): 21.33 - samples/sec: 3334.97 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:16:00,549 epoch 2 - iter 360/723 - loss 0.09090521 - time (sec): 26.32 - samples/sec: 3337.51 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:16:05,459 epoch 2 - iter 432/723 - loss 0.09183452 - time (sec): 31.23 - samples/sec: 3344.78 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:16:11,257 epoch 2 - iter 504/723 - loss 0.08898983 - time (sec): 37.03 - samples/sec: 3329.60 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:16:16,170 epoch 2 - iter 576/723 - loss 0.08716904 - time (sec): 41.94 - samples/sec: 3327.72 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:16:21,460 epoch 2 - iter 648/723 - loss 0.08641612 - time (sec): 47.23 - samples/sec: 3325.11 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:16:26,879 epoch 2 - iter 720/723 - loss 0.08419639 - time (sec): 52.65 - samples/sec: 3339.21 - lr: 0.000044 - momentum: 0.000000
2023-10-17 18:16:27,044 ----------------------------------------------------------------------------------------------------
2023-10-17 18:16:27,045 EPOCH 2 done: loss 0.0842 - lr: 0.000044
2023-10-17 18:16:30,379 DEV : loss 0.07783125340938568 - f1-score (micro avg) 0.8082
2023-10-17 18:16:30,401 saving best model
2023-10-17 18:16:30,846 ----------------------------------------------------------------------------------------------------
2023-10-17 18:16:36,693 epoch 3 - iter 72/723 - loss 0.05301331 - time (sec): 5.84 - samples/sec: 2992.47 - lr: 0.000044 - momentum: 0.000000
2023-10-17 18:16:41,929 epoch 3 - iter 144/723 - loss 0.05340362 - time (sec): 11.08 - samples/sec: 3145.26 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:16:46,948 epoch 3 - iter 216/723 - loss 0.05843774 - time (sec): 16.10 - samples/sec: 3203.11 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:16:52,164 epoch 3 - iter 288/723 - loss 0.05891078 - time (sec): 21.31 - samples/sec: 3207.70 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:16:57,780 epoch 3 - iter 360/723 - loss 0.05823104 - time (sec): 26.93 - samples/sec: 3203.75 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:17:03,835 epoch 3 - iter 432/723 - loss 0.06089242 - time (sec): 32.99 - samples/sec: 3215.06 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:17:09,285 epoch 3 - iter 504/723 - loss 0.06007862 - time (sec): 38.44 - samples/sec: 3237.98 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:17:14,379 epoch 3 - iter 576/723 - loss 0.05958555 - time (sec): 43.53 - samples/sec: 3254.82 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:17:19,692 epoch 3 - iter 648/723 - loss 0.05839709 - time (sec): 48.84 - samples/sec: 3251.06 - lr: 0.000039 - momentum: 0.000000
2023-10-17 18:17:24,980 epoch 3 - iter 720/723 - loss 0.05846821 - time (sec): 54.13 - samples/sec: 3241.95 - lr: 0.000039 - momentum: 0.000000
2023-10-17 18:17:25,236 ----------------------------------------------------------------------------------------------------
2023-10-17 18:17:25,236 EPOCH 3 done: loss 0.0585 - lr: 0.000039
2023-10-17 18:17:28,635 DEV : loss 0.06642612814903259 - f1-score (micro avg) 0.8681
2023-10-17 18:17:28,656 saving best model
2023-10-17 18:17:29,133 ----------------------------------------------------------------------------------------------------
2023-10-17 18:17:34,244 epoch 4 - iter 72/723 - loss 0.03361156 - time (sec): 5.11 - samples/sec: 3379.83 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:17:39,440 epoch 4 - iter 144/723 - loss 0.03971135 - time (sec): 10.30 - samples/sec: 3358.00 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:17:44,886 epoch 4 - iter 216/723 - loss 0.03902710 - time (sec): 15.75 - samples/sec: 3322.89 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:17:50,204 epoch 4 - iter 288/723 - loss 0.03924984 - time (sec): 21.07 - samples/sec: 3303.57 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:17:55,096 epoch 4 - iter 360/723 - loss 0.03839109 - time (sec): 25.96 - samples/sec: 3310.04 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:18:00,634 epoch 4 - iter 432/723 - loss 0.03903095 - time (sec): 31.50 - samples/sec: 3308.90 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:18:06,123 epoch 4 - iter 504/723 - loss 0.04064308 - time (sec): 36.99 - samples/sec: 3313.10 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:18:11,189 epoch 4 - iter 576/723 - loss 0.04149638 - time (sec): 42.05 - samples/sec: 3315.45 - lr: 0.000034 - momentum: 0.000000
2023-10-17 18:18:16,556 epoch 4 - iter 648/723 - loss 0.04083870 - time (sec): 47.42 - samples/sec: 3315.10 - lr: 0.000034 - momentum: 0.000000
2023-10-17 18:18:21,965 epoch 4 - iter 720/723 - loss 0.04158702 - time (sec): 52.83 - samples/sec: 3323.32 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:18:22,157 ----------------------------------------------------------------------------------------------------
2023-10-17 18:18:22,158 EPOCH 4 done: loss 0.0416 - lr: 0.000033
2023-10-17 18:18:26,317 DEV : loss 0.08198380470275879 - f1-score (micro avg) 0.8284
2023-10-17 18:18:26,337 ----------------------------------------------------------------------------------------------------
2023-10-17 18:18:31,722 epoch 5 - iter 72/723 - loss 0.02307719 - time (sec): 5.38 - samples/sec: 3302.59 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:18:36,964 epoch 5 - iter 144/723 - loss 0.02546761 - time (sec): 10.62 - samples/sec: 3328.56 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:18:41,939 epoch 5 - iter 216/723 - loss 0.02670835 - time (sec): 15.60 - samples/sec: 3328.47 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:18:47,488 epoch 5 - iter 288/723 - loss 0.02866238 - time (sec): 21.15 - samples/sec: 3329.01 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:18:52,380 epoch 5 - iter 360/723 - loss 0.02813552 - time (sec): 26.04 - samples/sec: 3347.59 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:18:57,646 epoch 5 - iter 432/723 - loss 0.02897696 - time (sec): 31.31 - samples/sec: 3336.12 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:19:03,014 epoch 5 - iter 504/723 - loss 0.03086408 - time (sec): 36.68 - samples/sec: 3303.79 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:19:08,556 epoch 5 - iter 576/723 - loss 0.03134206 - time (sec): 42.22 - samples/sec: 3291.15 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:19:14,073 epoch 5 - iter 648/723 - loss 0.03166141 - time (sec): 47.73 - samples/sec: 3282.55 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:19:19,823 epoch 5 - iter 720/723 - loss 0.03223354 - time (sec): 53.48 - samples/sec: 3284.17 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:19:20,031 ----------------------------------------------------------------------------------------------------
2023-10-17 18:19:20,031 EPOCH 5 done: loss 0.0322 - lr: 0.000028
2023-10-17 18:19:23,459 DEV : loss 0.09982232004404068 - f1-score (micro avg) 0.8539
2023-10-17 18:19:23,478 ----------------------------------------------------------------------------------------------------
2023-10-17 18:19:28,683 epoch 6 - iter 72/723 - loss 0.01778433 - time (sec): 5.20 - samples/sec: 3306.78 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:19:33,594 epoch 6 - iter 144/723 - loss 0.02139824 - time (sec): 10.11 - samples/sec: 3389.91 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:19:39,086 epoch 6 - iter 216/723 - loss 0.02084459 - time (sec): 15.61 - samples/sec: 3359.56 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:19:43,976 epoch 6 - iter 288/723 - loss 0.01975764 - time (sec): 20.50 - samples/sec: 3322.86 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:19:50,263 epoch 6 - iter 360/723 - loss 0.02010295 - time (sec): 26.78 - samples/sec: 3268.81 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:19:55,145 epoch 6 - iter 432/723 - loss 0.02157312 - time (sec): 31.67 - samples/sec: 3326.35 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:20:00,278 epoch 6 - iter 504/723 - loss 0.02145423 - time (sec): 36.80 - samples/sec: 3327.88 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:20:05,845 epoch 6 - iter 576/723 - loss 0.02219732 - time (sec): 42.37 - samples/sec: 3342.04 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:20:10,852 epoch 6 - iter 648/723 - loss 0.02281230 - time (sec): 47.37 - samples/sec: 3341.14 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:20:16,104 epoch 6 - iter 720/723 - loss 0.02320535 - time (sec): 52.62 - samples/sec: 3340.45 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:20:16,251 ----------------------------------------------------------------------------------------------------
2023-10-17 18:20:16,251 EPOCH 6 done: loss 0.0233 - lr: 0.000022
2023-10-17 18:20:19,622 DEV : loss 0.10310214757919312 - f1-score (micro avg) 0.8637
2023-10-17 18:20:19,639 ----------------------------------------------------------------------------------------------------
2023-10-17 18:20:24,696 epoch 7 - iter 72/723 - loss 0.01137758 - time (sec): 5.06 - samples/sec: 3312.99 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:20:29,955 epoch 7 - iter 144/723 - loss 0.01035726 - time (sec): 10.31 - samples/sec: 3328.52 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:20:35,456 epoch 7 - iter 216/723 - loss 0.01222737 - time (sec): 15.82 - samples/sec: 3347.94 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:20:40,772 epoch 7 - iter 288/723 - loss 0.01398472 - time (sec): 21.13 - samples/sec: 3325.67 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:20:46,399 epoch 7 - iter 360/723 - loss 0.01391412 - time (sec): 26.76 - samples/sec: 3309.49 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:20:51,771 epoch 7 - iter 432/723 - loss 0.01726287 - time (sec): 32.13 - samples/sec: 3299.59 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:20:57,440 epoch 7 - iter 504/723 - loss 0.01722411 - time (sec): 37.80 - samples/sec: 3304.69 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:21:02,335 epoch 7 - iter 576/723 - loss 0.01664986 - time (sec): 42.69 - samples/sec: 3312.40 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:21:07,462 epoch 7 - iter 648/723 - loss 0.01678267 - time (sec): 47.82 - samples/sec: 3321.58 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:21:12,580 epoch 7 - iter 720/723 - loss 0.01673920 - time (sec): 52.94 - samples/sec: 3318.91 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:21:12,758 ----------------------------------------------------------------------------------------------------
2023-10-17 18:21:12,758 EPOCH 7 done: loss 0.0168 - lr: 0.000017
2023-10-17 18:21:16,524 DEV : loss 0.12066669762134552 - f1-score (micro avg) 0.8634
2023-10-17 18:21:16,541 ----------------------------------------------------------------------------------------------------
2023-10-17 18:21:21,843 epoch 8 - iter 72/723 - loss 0.00672480 - time (sec): 5.30 - samples/sec: 3449.42 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:21:27,188 epoch 8 - iter 144/723 - loss 0.00814462 - time (sec): 10.65 - samples/sec: 3398.01 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:21:32,219 epoch 8 - iter 216/723 - loss 0.00959825 - time (sec): 15.68 - samples/sec: 3412.55 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:21:37,317 epoch 8 - iter 288/723 - loss 0.00947515 - time (sec): 20.77 - samples/sec: 3372.69 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:21:42,967 epoch 8 - iter 360/723 - loss 0.01067416 - time (sec): 26.42 - samples/sec: 3350.48 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:21:47,982 epoch 8 - iter 432/723 - loss 0.01055203 - time (sec): 31.44 - samples/sec: 3372.63 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:21:52,946 epoch 8 - iter 504/723 - loss 0.01033183 - time (sec): 36.40 - samples/sec: 3359.17 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:21:58,319 epoch 8 - iter 576/723 - loss 0.01006916 - time (sec): 41.78 - samples/sec: 3347.55 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:22:03,879 epoch 8 - iter 648/723 - loss 0.01046073 - time (sec): 47.34 - samples/sec: 3343.65 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:22:09,026 epoch 8 - iter 720/723 - loss 0.01086649 - time (sec): 52.48 - samples/sec: 3350.17 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:22:09,176 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:09,176 EPOCH 8 done: loss 0.0108 - lr: 0.000011
2023-10-17 18:22:12,573 DEV : loss 0.12902837991714478 - f1-score (micro avg) 0.8617
2023-10-17 18:22:12,591 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:18,040 epoch 9 - iter 72/723 - loss 0.00542030 - time (sec): 5.45 - samples/sec: 3217.95 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:22:23,260 epoch 9 - iter 144/723 - loss 0.00471316 - time (sec): 10.67 - samples/sec: 3292.45 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:22:28,631 epoch 9 - iter 216/723 - loss 0.00496387 - time (sec): 16.04 - samples/sec: 3327.91 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:22:34,047 epoch 9 - iter 288/723 - loss 0.00662306 - time (sec): 21.45 - samples/sec: 3324.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:22:39,129 epoch 9 - iter 360/723 - loss 0.00710139 - time (sec): 26.54 - samples/sec: 3306.26 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:22:44,235 epoch 9 - iter 432/723 - loss 0.00756603 - time (sec): 31.64 - samples/sec: 3336.32 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:22:50,214 epoch 9 - iter 504/723 - loss 0.00737050 - time (sec): 37.62 - samples/sec: 3275.66 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:22:55,692 epoch 9 - iter 576/723 - loss 0.00778212 - time (sec): 43.10 - samples/sec: 3280.49 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:23:00,511 epoch 9 - iter 648/723 - loss 0.00759945 - time (sec): 47.92 - samples/sec: 3296.91 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:23:06,028 epoch 9 - iter 720/723 - loss 0.00777119 - time (sec): 53.44 - samples/sec: 3286.78 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:23:06,230 ----------------------------------------------------------------------------------------------------
2023-10-17 18:23:06,230 EPOCH 9 done: loss 0.0077 - lr: 0.000006
2023-10-17 18:23:09,690 DEV : loss 0.14203666150569916 - f1-score (micro avg) 0.8652
2023-10-17 18:23:09,725 ----------------------------------------------------------------------------------------------------
2023-10-17 18:23:15,274 epoch 10 - iter 72/723 - loss 0.00541186 - time (sec): 5.55 - samples/sec: 3342.77 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:23:20,833 epoch 10 - iter 144/723 - loss 0.00416299 - time (sec): 11.11 - samples/sec: 3195.94 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:23:26,151 epoch 10 - iter 216/723 - loss 0.00402157 - time (sec): 16.42 - samples/sec: 3212.81 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:23:31,122 epoch 10 - iter 288/723 - loss 0.00415435 - time (sec): 21.39 - samples/sec: 3264.59 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:23:36,696 epoch 10 - iter 360/723 - loss 0.00474951 - time (sec): 26.97 - samples/sec: 3276.79 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:23:42,377 epoch 10 - iter 432/723 - loss 0.00538210 - time (sec): 32.65 - samples/sec: 3274.77 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:23:47,603 epoch 10 - iter 504/723 - loss 0.00603509 - time (sec): 37.88 - samples/sec: 3278.50 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:23:52,459 epoch 10 - iter 576/723 - loss 0.00551894 - time (sec): 42.73 - samples/sec: 3305.81 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:23:57,911 epoch 10 - iter 648/723 - loss 0.00616743 - time (sec): 48.18 - samples/sec: 3310.83 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:24:03,071 epoch 10 - iter 720/723 - loss 0.00577409 - time (sec): 53.34 - samples/sec: 3289.75 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:24:03,303 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:03,304 EPOCH 10 done: loss 0.0058 - lr: 0.000000
2023-10-17 18:24:07,157 DEV : loss 0.14152230322360992 - f1-score (micro avg) 0.8652
2023-10-17 18:24:07,568 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:07,569 Loading model from best epoch ...
2023-10-17 18:24:09,014 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 18:24:11,493
Results:
- F-score (micro) 0.845
- F-score (macro) 0.726
- Accuracy 0.7393
By class:
              precision    recall  f1-score   support

         PER     0.8364    0.8485    0.8424       482
         LOC     0.9297    0.8952    0.9121       458
         ORG     0.4265    0.4203    0.4234        69

   micro avg     0.8497    0.8404    0.8450      1009
   macro avg     0.7309    0.7213    0.7260      1009
weighted avg     0.8507    0.8404    0.8454      1009
2023-10-17 18:24:11,494 ----------------------------------------------------------------------------------------------------
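After training, the best checkpoint can be loaded for inference. A minimal usage sketch, assuming the checkpoint path matches the training base path above and the standard Flair prediction API; the example sentence is illustrative only.

```python
# Minimal inference sketch: load the best checkpoint from this run and tag one sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator"
    "-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

sentence = Sentence("Vincent van Gogh werd geboren in Zundert .")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span)  # predicted PER/LOC/ORG spans with confidence scores
```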