2023-10-17 19:00:55,795 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:55,796 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
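The printed module shapes let you read off parameter counts directly. As a minimal sketch (assuming PyTorch's default Embedding/LayerNorm parameterization, i.e. LayerNorm carries a weight and a bias vector), the ElectraEmbeddings block above amounts to:

```python
# Parameter counts for the ElectraEmbeddings block, taken from the
# shapes in the model repr above (sketch, not an official count).
word_embeddings = 32001 * 768      # vocab size x hidden size
position_embeddings = 512 * 768    # max sequence length x hidden size
token_type_embeddings = 2 * 768    # two segment types
layer_norm = 768 + 768             # LayerNorm weight + bias

total = (word_embeddings + position_embeddings
         + token_type_embeddings + layer_norm)
print(total)  # 24973056 (~25M parameters in the embedding block alone)
```

The 13-way linear head at the bottom of the repr is tiny by comparison (768 * 13 + 13 = 9,997 parameters).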
2023-10-17 19:00:55,796 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:55,796 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 19:00:55,796 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:55,796 Train: 5777 sentences
2023-10-17 19:00:55,796 (train_with_dev=False, train_with_test=False)
2023-10-17 19:00:55,796 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:55,796 Training Params:
2023-10-17 19:00:55,796 - learning_rate: "5e-05"
2023-10-17 19:00:55,796 - mini_batch_size: "8"
2023-10-17 19:00:55,796 - max_epochs: "10"
2023-10-17 19:00:55,796 - shuffle: "True"
2023-10-17 19:00:55,796 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:55,796 Plugins:
2023-10-17 19:00:55,796 - TensorboardLogger
2023-10-17 19:00:55,797 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 19:00:55,797 ----------------------------------------------------------------------------------------------------
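The LinearScheduler plugin with warmup_fraction 0.1 ramps the learning rate from 0 to the peak 5e-05 over the first 10% of optimizer steps, then decays it linearly back to 0. A minimal sketch (assuming 723 batches per epoch times 10 epochs = 7,230 total steps, which matches the per-iteration lr values in the epoch logs):

```python
def linear_warmup_lr(step, total_steps=7230, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup then linear decay (sketch of a LinearScheduler-style
    schedule; the exact step bookkeeping in Flair may differ slightly)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 723 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Step 72 of epoch 1 gives ~5e-06, step 720 ~5e-05, and the final step ~0,
# consistent with the lr column logged during training.
print(round(linear_warmup_lr(72), 6))
```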
2023-10-17 19:00:55,797 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 19:00:55,797 - metric: "('micro avg', 'f1-score')"
2023-10-17 19:00:55,797 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:55,797 Computation:
2023-10-17 19:00:55,797 - compute on device: cuda:0
2023-10-17 19:00:55,797 - embedding storage: none
2023-10-17 19:00:55,797 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:55,797 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 19:00:55,797 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:55,797 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:55,797 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 19:01:01,305 epoch 1 - iter 72/723 - loss 2.43346961 - time (sec): 5.51 - samples/sec: 3053.64 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:01:06,577 epoch 1 - iter 144/723 - loss 1.39735442 - time (sec): 10.78 - samples/sec: 3151.18 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:01:11,902 epoch 1 - iter 216/723 - loss 0.98647245 - time (sec): 16.10 - samples/sec: 3181.79 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:01:17,195 epoch 1 - iter 288/723 - loss 0.77444671 - time (sec): 21.40 - samples/sec: 3222.92 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:01:22,154 epoch 1 - iter 360/723 - loss 0.64792967 - time (sec): 26.36 - samples/sec: 3272.45 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:01:27,784 epoch 1 - iter 432/723 - loss 0.55703287 - time (sec): 31.99 - samples/sec: 3262.43 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:01:33,257 epoch 1 - iter 504/723 - loss 0.49720863 - time (sec): 37.46 - samples/sec: 3257.16 - lr: 0.000035 - momentum: 0.000000
2023-10-17 19:01:38,961 epoch 1 - iter 576/723 - loss 0.44895161 - time (sec): 43.16 - samples/sec: 3238.06 - lr: 0.000040 - momentum: 0.000000
2023-10-17 19:01:44,455 epoch 1 - iter 648/723 - loss 0.41042010 - time (sec): 48.66 - samples/sec: 3237.70 - lr: 0.000045 - momentum: 0.000000
2023-10-17 19:01:49,975 epoch 1 - iter 720/723 - loss 0.38196732 - time (sec): 54.18 - samples/sec: 3239.23 - lr: 0.000050 - momentum: 0.000000
2023-10-17 19:01:50,192 ----------------------------------------------------------------------------------------------------
2023-10-17 19:01:50,192 EPOCH 1 done: loss 0.3808 - lr: 0.000050
2023-10-17 19:01:53,250 DEV : loss 0.07317255437374115 - f1-score (micro avg) 0.8295
2023-10-17 19:01:53,272 saving best model
2023-10-17 19:01:53,806 ----------------------------------------------------------------------------------------------------
2023-10-17 19:01:58,797 epoch 2 - iter 72/723 - loss 0.09594623 - time (sec): 4.99 - samples/sec: 3324.68 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:02:04,165 epoch 2 - iter 144/723 - loss 0.09490250 - time (sec): 10.36 - samples/sec: 3306.29 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:02:09,207 epoch 2 - iter 216/723 - loss 0.09141357 - time (sec): 15.40 - samples/sec: 3358.20 - lr: 0.000048 - momentum: 0.000000
2023-10-17 19:02:14,375 epoch 2 - iter 288/723 - loss 0.09618996 - time (sec): 20.57 - samples/sec: 3351.79 - lr: 0.000048 - momentum: 0.000000
2023-10-17 19:02:19,860 epoch 2 - iter 360/723 - loss 0.09056097 - time (sec): 26.05 - samples/sec: 3367.70 - lr: 0.000047 - momentum: 0.000000
2023-10-17 19:02:25,515 epoch 2 - iter 432/723 - loss 0.08791455 - time (sec): 31.71 - samples/sec: 3381.75 - lr: 0.000047 - momentum: 0.000000
2023-10-17 19:02:30,295 epoch 2 - iter 504/723 - loss 0.09029004 - time (sec): 36.49 - samples/sec: 3370.87 - lr: 0.000046 - momentum: 0.000000
2023-10-17 19:02:35,898 epoch 2 - iter 576/723 - loss 0.08956464 - time (sec): 42.09 - samples/sec: 3368.16 - lr: 0.000046 - momentum: 0.000000
2023-10-17 19:02:40,963 epoch 2 - iter 648/723 - loss 0.08990159 - time (sec): 47.16 - samples/sec: 3356.48 - lr: 0.000045 - momentum: 0.000000
2023-10-17 19:02:46,034 epoch 2 - iter 720/723 - loss 0.08747206 - time (sec): 52.23 - samples/sec: 3361.45 - lr: 0.000044 - momentum: 0.000000
2023-10-17 19:02:46,197 ----------------------------------------------------------------------------------------------------
2023-10-17 19:02:46,197 EPOCH 2 done: loss 0.0873 - lr: 0.000044
2023-10-17 19:02:50,030 DEV : loss 0.06033102422952652 - f1-score (micro avg) 0.8512
2023-10-17 19:02:50,046 saving best model
2023-10-17 19:02:50,562 ----------------------------------------------------------------------------------------------------
2023-10-17 19:02:55,561 epoch 3 - iter 72/723 - loss 0.05939468 - time (sec): 5.00 - samples/sec: 3464.35 - lr: 0.000044 - momentum: 0.000000
2023-10-17 19:03:00,234 epoch 3 - iter 144/723 - loss 0.06603411 - time (sec): 9.67 - samples/sec: 3510.27 - lr: 0.000043 - momentum: 0.000000
2023-10-17 19:03:05,632 epoch 3 - iter 216/723 - loss 0.06721595 - time (sec): 15.07 - samples/sec: 3457.17 - lr: 0.000043 - momentum: 0.000000
2023-10-17 19:03:10,987 epoch 3 - iter 288/723 - loss 0.06225454 - time (sec): 20.42 - samples/sec: 3431.78 - lr: 0.000042 - momentum: 0.000000
2023-10-17 19:03:16,247 epoch 3 - iter 360/723 - loss 0.06149114 - time (sec): 25.68 - samples/sec: 3439.35 - lr: 0.000042 - momentum: 0.000000
2023-10-17 19:03:21,493 epoch 3 - iter 432/723 - loss 0.06630663 - time (sec): 30.93 - samples/sec: 3417.42 - lr: 0.000041 - momentum: 0.000000
2023-10-17 19:03:26,570 epoch 3 - iter 504/723 - loss 0.06678596 - time (sec): 36.01 - samples/sec: 3413.39 - lr: 0.000041 - momentum: 0.000000
2023-10-17 19:03:32,030 epoch 3 - iter 576/723 - loss 0.06411011 - time (sec): 41.47 - samples/sec: 3413.58 - lr: 0.000040 - momentum: 0.000000
2023-10-17 19:03:37,092 epoch 3 - iter 648/723 - loss 0.06295898 - time (sec): 46.53 - samples/sec: 3408.05 - lr: 0.000039 - momentum: 0.000000
2023-10-17 19:03:42,333 epoch 3 - iter 720/723 - loss 0.06222050 - time (sec): 51.77 - samples/sec: 3391.79 - lr: 0.000039 - momentum: 0.000000
2023-10-17 19:03:42,513 ----------------------------------------------------------------------------------------------------
2023-10-17 19:03:42,513 EPOCH 3 done: loss 0.0621 - lr: 0.000039
2023-10-17 19:03:45,685 DEV : loss 0.06847171485424042 - f1-score (micro avg) 0.865
2023-10-17 19:03:45,702 saving best model
2023-10-17 19:03:46,180 ----------------------------------------------------------------------------------------------------
2023-10-17 19:03:51,616 epoch 4 - iter 72/723 - loss 0.04840124 - time (sec): 5.43 - samples/sec: 3391.17 - lr: 0.000038 - momentum: 0.000000
2023-10-17 19:03:57,075 epoch 4 - iter 144/723 - loss 0.04176500 - time (sec): 10.89 - samples/sec: 3354.24 - lr: 0.000038 - momentum: 0.000000
2023-10-17 19:04:01,951 epoch 4 - iter 216/723 - loss 0.04392943 - time (sec): 15.77 - samples/sec: 3360.14 - lr: 0.000037 - momentum: 0.000000
2023-10-17 19:04:07,383 epoch 4 - iter 288/723 - loss 0.04364348 - time (sec): 21.20 - samples/sec: 3342.12 - lr: 0.000037 - momentum: 0.000000
2023-10-17 19:04:12,696 epoch 4 - iter 360/723 - loss 0.04293801 - time (sec): 26.51 - samples/sec: 3294.13 - lr: 0.000036 - momentum: 0.000000
2023-10-17 19:04:17,989 epoch 4 - iter 432/723 - loss 0.04366335 - time (sec): 31.81 - samples/sec: 3293.19 - lr: 0.000036 - momentum: 0.000000
2023-10-17 19:04:23,052 epoch 4 - iter 504/723 - loss 0.04340638 - time (sec): 36.87 - samples/sec: 3327.19 - lr: 0.000035 - momentum: 0.000000
2023-10-17 19:04:28,459 epoch 4 - iter 576/723 - loss 0.04418918 - time (sec): 42.28 - samples/sec: 3317.02 - lr: 0.000034 - momentum: 0.000000
2023-10-17 19:04:33,632 epoch 4 - iter 648/723 - loss 0.04365686 - time (sec): 47.45 - samples/sec: 3318.67 - lr: 0.000034 - momentum: 0.000000
2023-10-17 19:04:38,779 epoch 4 - iter 720/723 - loss 0.04506250 - time (sec): 52.60 - samples/sec: 3341.13 - lr: 0.000033 - momentum: 0.000000
2023-10-17 19:04:38,929 ----------------------------------------------------------------------------------------------------
2023-10-17 19:04:38,929 EPOCH 4 done: loss 0.0450 - lr: 0.000033
2023-10-17 19:04:42,195 DEV : loss 0.08221735805273056 - f1-score (micro avg) 0.8364
2023-10-17 19:04:42,212 ----------------------------------------------------------------------------------------------------
2023-10-17 19:04:47,487 epoch 5 - iter 72/723 - loss 0.03743549 - time (sec): 5.27 - samples/sec: 3199.31 - lr: 0.000033 - momentum: 0.000000
2023-10-17 19:04:52,273 epoch 5 - iter 144/723 - loss 0.03409042 - time (sec): 10.06 - samples/sec: 3290.44 - lr: 0.000032 - momentum: 0.000000
2023-10-17 19:04:58,219 epoch 5 - iter 216/723 - loss 0.03376897 - time (sec): 16.01 - samples/sec: 3273.26 - lr: 0.000032 - momentum: 0.000000
2023-10-17 19:05:03,291 epoch 5 - iter 288/723 - loss 0.03120938 - time (sec): 21.08 - samples/sec: 3286.84 - lr: 0.000031 - momentum: 0.000000
2023-10-17 19:05:08,653 epoch 5 - iter 360/723 - loss 0.02837158 - time (sec): 26.44 - samples/sec: 3279.57 - lr: 0.000031 - momentum: 0.000000
2023-10-17 19:05:13,853 epoch 5 - iter 432/723 - loss 0.03029903 - time (sec): 31.64 - samples/sec: 3307.30 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:05:19,092 epoch 5 - iter 504/723 - loss 0.03306042 - time (sec): 36.88 - samples/sec: 3329.69 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:05:24,266 epoch 5 - iter 576/723 - loss 0.03308298 - time (sec): 42.05 - samples/sec: 3336.33 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:05:29,270 epoch 5 - iter 648/723 - loss 0.03349256 - time (sec): 47.06 - samples/sec: 3343.06 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:05:34,586 epoch 5 - iter 720/723 - loss 0.03379937 - time (sec): 52.37 - samples/sec: 3357.59 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:05:34,737 ----------------------------------------------------------------------------------------------------
2023-10-17 19:05:34,737 EPOCH 5 done: loss 0.0338 - lr: 0.000028
2023-10-17 19:05:38,334 DEV : loss 0.10720644146203995 - f1-score (micro avg) 0.8427
2023-10-17 19:05:38,350 ----------------------------------------------------------------------------------------------------
2023-10-17 19:05:43,670 epoch 6 - iter 72/723 - loss 0.01321373 - time (sec): 5.32 - samples/sec: 3423.73 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:05:48,889 epoch 6 - iter 144/723 - loss 0.01673976 - time (sec): 10.54 - samples/sec: 3405.46 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:05:53,981 epoch 6 - iter 216/723 - loss 0.02074721 - time (sec): 15.63 - samples/sec: 3431.04 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:05:59,313 epoch 6 - iter 288/723 - loss 0.02272213 - time (sec): 20.96 - samples/sec: 3397.60 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:06:04,725 epoch 6 - iter 360/723 - loss 0.02522995 - time (sec): 26.37 - samples/sec: 3400.38 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:06:09,964 epoch 6 - iter 432/723 - loss 0.02397191 - time (sec): 31.61 - samples/sec: 3400.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:06:15,057 epoch 6 - iter 504/723 - loss 0.02395899 - time (sec): 36.71 - samples/sec: 3398.23 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:06:19,981 epoch 6 - iter 576/723 - loss 0.02394027 - time (sec): 41.63 - samples/sec: 3393.65 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:06:25,252 epoch 6 - iter 648/723 - loss 0.02321616 - time (sec): 46.90 - samples/sec: 3383.09 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:06:30,319 epoch 6 - iter 720/723 - loss 0.02360596 - time (sec): 51.97 - samples/sec: 3381.82 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:06:30,493 ----------------------------------------------------------------------------------------------------
2023-10-17 19:06:30,493 EPOCH 6 done: loss 0.0236 - lr: 0.000022
2023-10-17 19:06:33,631 DEV : loss 0.11060281097888947 - f1-score (micro avg) 0.8685
2023-10-17 19:06:33,647 saving best model
2023-10-17 19:06:34,132 ----------------------------------------------------------------------------------------------------
2023-10-17 19:06:39,278 epoch 7 - iter 72/723 - loss 0.00876763 - time (sec): 5.14 - samples/sec: 3437.08 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:06:44,216 epoch 7 - iter 144/723 - loss 0.02003264 - time (sec): 10.08 - samples/sec: 3403.22 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:06:49,975 epoch 7 - iter 216/723 - loss 0.01713238 - time (sec): 15.84 - samples/sec: 3350.31 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:06:55,422 epoch 7 - iter 288/723 - loss 0.01901167 - time (sec): 21.29 - samples/sec: 3363.59 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:07:01,160 epoch 7 - iter 360/723 - loss 0.01889764 - time (sec): 27.03 - samples/sec: 3314.38 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:07:06,516 epoch 7 - iter 432/723 - loss 0.01931896 - time (sec): 32.38 - samples/sec: 3310.20 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:07:11,616 epoch 7 - iter 504/723 - loss 0.01874983 - time (sec): 37.48 - samples/sec: 3321.01 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:07:16,600 epoch 7 - iter 576/723 - loss 0.01774094 - time (sec): 42.46 - samples/sec: 3335.39 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:07:21,515 epoch 7 - iter 648/723 - loss 0.01802068 - time (sec): 47.38 - samples/sec: 3345.34 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:07:26,773 epoch 7 - iter 720/723 - loss 0.01760980 - time (sec): 52.64 - samples/sec: 3338.15 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:07:26,934 ----------------------------------------------------------------------------------------------------
2023-10-17 19:07:26,934 EPOCH 7 done: loss 0.0176 - lr: 0.000017
2023-10-17 19:07:30,123 DEV : loss 0.12062493711709976 - f1-score (micro avg) 0.871
2023-10-17 19:07:30,140 saving best model
2023-10-17 19:07:30,646 ----------------------------------------------------------------------------------------------------
2023-10-17 19:07:35,825 epoch 8 - iter 72/723 - loss 0.00784529 - time (sec): 5.18 - samples/sec: 3430.07 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:07:41,032 epoch 8 - iter 144/723 - loss 0.01225137 - time (sec): 10.38 - samples/sec: 3398.94 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:07:46,165 epoch 8 - iter 216/723 - loss 0.01162890 - time (sec): 15.52 - samples/sec: 3352.59 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:07:51,358 epoch 8 - iter 288/723 - loss 0.01081302 - time (sec): 20.71 - samples/sec: 3356.51 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:07:56,407 epoch 8 - iter 360/723 - loss 0.01066576 - time (sec): 25.76 - samples/sec: 3346.47 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:08:01,553 epoch 8 - iter 432/723 - loss 0.01039858 - time (sec): 30.91 - samples/sec: 3357.31 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:08:06,835 epoch 8 - iter 504/723 - loss 0.01051658 - time (sec): 36.19 - samples/sec: 3341.97 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:08:12,477 epoch 8 - iter 576/723 - loss 0.01174400 - time (sec): 41.83 - samples/sec: 3352.33 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:08:17,598 epoch 8 - iter 648/723 - loss 0.01179448 - time (sec): 46.95 - samples/sec: 3354.69 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:08:23,116 epoch 8 - iter 720/723 - loss 0.01201515 - time (sec): 52.47 - samples/sec: 3346.33 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:08:23,307 ----------------------------------------------------------------------------------------------------
2023-10-17 19:08:23,307 EPOCH 8 done: loss 0.0120 - lr: 0.000011
2023-10-17 19:08:26,870 DEV : loss 0.12349953502416611 - f1-score (micro avg) 0.8772
2023-10-17 19:08:26,886 saving best model
2023-10-17 19:08:27,376 ----------------------------------------------------------------------------------------------------
2023-10-17 19:08:32,712 epoch 9 - iter 72/723 - loss 0.00578653 - time (sec): 5.33 - samples/sec: 3302.60 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:08:38,063 epoch 9 - iter 144/723 - loss 0.00590740 - time (sec): 10.68 - samples/sec: 3389.04 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:08:42,976 epoch 9 - iter 216/723 - loss 0.00660418 - time (sec): 15.60 - samples/sec: 3394.47 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:08:48,105 epoch 9 - iter 288/723 - loss 0.00656305 - time (sec): 20.73 - samples/sec: 3401.06 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:08:53,371 epoch 9 - iter 360/723 - loss 0.00596195 - time (sec): 25.99 - samples/sec: 3407.27 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:08:58,592 epoch 9 - iter 432/723 - loss 0.00648461 - time (sec): 31.21 - samples/sec: 3392.16 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:09:04,561 epoch 9 - iter 504/723 - loss 0.00760297 - time (sec): 37.18 - samples/sec: 3353.67 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:09:09,754 epoch 9 - iter 576/723 - loss 0.00747124 - time (sec): 42.37 - samples/sec: 3344.01 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:09:15,067 epoch 9 - iter 648/723 - loss 0.00742990 - time (sec): 47.69 - samples/sec: 3352.27 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:09:19,783 epoch 9 - iter 720/723 - loss 0.00779320 - time (sec): 52.40 - samples/sec: 3354.70 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:09:19,935 ----------------------------------------------------------------------------------------------------
2023-10-17 19:09:19,936 EPOCH 9 done: loss 0.0078 - lr: 0.000006
2023-10-17 19:09:23,169 DEV : loss 0.13878272473812103 - f1-score (micro avg) 0.8739
2023-10-17 19:09:23,185 ----------------------------------------------------------------------------------------------------
2023-10-17 19:09:28,694 epoch 10 - iter 72/723 - loss 0.00379082 - time (sec): 5.51 - samples/sec: 3264.76 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:09:33,696 epoch 10 - iter 144/723 - loss 0.00277850 - time (sec): 10.51 - samples/sec: 3333.21 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:09:39,096 epoch 10 - iter 216/723 - loss 0.00300045 - time (sec): 15.91 - samples/sec: 3324.90 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:09:44,374 epoch 10 - iter 288/723 - loss 0.00434885 - time (sec): 21.19 - samples/sec: 3333.17 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:09:49,498 epoch 10 - iter 360/723 - loss 0.00447209 - time (sec): 26.31 - samples/sec: 3338.90 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:09:55,377 epoch 10 - iter 432/723 - loss 0.00446912 - time (sec): 32.19 - samples/sec: 3295.01 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:10:00,772 epoch 10 - iter 504/723 - loss 0.00425532 - time (sec): 37.59 - samples/sec: 3273.65 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:10:05,867 epoch 10 - iter 576/723 - loss 0.00443165 - time (sec): 42.68 - samples/sec: 3277.16 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:10:10,997 epoch 10 - iter 648/723 - loss 0.00471256 - time (sec): 47.81 - samples/sec: 3300.13 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:10:16,284 epoch 10 - iter 720/723 - loss 0.00513082 - time (sec): 53.10 - samples/sec: 3311.35 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:10:16,431 ----------------------------------------------------------------------------------------------------
2023-10-17 19:10:16,432 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-17 19:10:19,626 DEV : loss 0.1487891674041748 - f1-score (micro avg) 0.8743
2023-10-17 19:10:20,036 ----------------------------------------------------------------------------------------------------
2023-10-17 19:10:20,037 Loading model from best epoch ...
2023-10-17 19:10:21,726 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
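The 13-tag dictionary above is a BIOES scheme: S- marks a single-token entity, while B-/I-/E- mark the beginning, inside, and end of a multi-token one. A minimal pure-Python decoder sketch (`bioes_to_spans` is a hypothetical helper name, not a Flair API):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive)
    spans, e.g. for tags drawn from this tagger's label dictionary."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                      # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":                    # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((label, start, i + 1))
            start = None
        # "I" continues the current span; nothing to do
    return spans

print(bioes_to_spans(["O", "B-PER", "E-PER", "S-LOC"]))
# [('PER', 1, 3), ('LOC', 3, 4)]
```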
2023-10-17 19:10:24,520
Results:
- F-score (micro) 0.8384
- F-score (macro) 0.7459
- Accuracy 0.7301
By class:
              precision    recall  f1-score   support

         PER     0.8697    0.8029    0.8350       482
         LOC     0.9105    0.8886    0.8994       458
         ORG     0.4535    0.5652    0.5032        69

   micro avg     0.8517    0.8256    0.8384      1009
   macro avg     0.7446    0.7523    0.7459      1009
weighted avg     0.8597    0.8256    0.8415      1009
2023-10-17 19:10:24,520 ----------------------------------------------------------------------------------------------------
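The aggregate test scores can be recomputed from the per-class rows above. A short sketch showing how micro, macro, and weighted F1 relate to the table (values copied from the log; rounding to 4 decimals reproduces the reported scores):

```python
# Per-class results from the final test evaluation:
# label: (precision, recall, f1, support)
classes = {
    "PER": (0.8697, 0.8029, 0.8350, 482),
    "LOC": (0.9105, 0.8886, 0.8994, 458),
    "ORG": (0.4535, 0.5652, 0.5032, 69),
}

# Macro F1: unweighted mean of per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in classes.values()) / len(classes)

# Micro F1: harmonic mean of the pooled (micro) precision and recall.
micro_p, micro_r = 0.8517, 0.8256
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Weighted F1: per-class F1 weighted by support.
total = sum(n for _, _, _, n in classes.values())
weighted_f1 = sum(f1 * n for _, _, f1, n in classes.values()) / total

print(round(macro_f1, 4), round(micro_f1, 4), round(weighted_f1, 4))
# 0.7459 0.8384 0.8415
```

Note the best model was chosen on dev micro-F1 (0.8772 at epoch 8), while the held-out test micro-F1 is 0.8384; the gap is largely driven by the small, hard ORG class.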