2023-10-17 16:31:56,593 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:56,594 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:31:56,594 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:56,594 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 16:31:56,595 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:56,595 Train: 5777 sentences
2023-10-17 16:31:56,595 (train_with_dev=False, train_with_test=False)
2023-10-17 16:31:56,595 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:56,595 Training Params:
2023-10-17 16:31:56,595 - learning_rate: "3e-05"
2023-10-17 16:31:56,595 - mini_batch_size: "8"
2023-10-17 16:31:56,595 - max_epochs: "10"
2023-10-17 16:31:56,595 - shuffle: "True"
2023-10-17 16:31:56,595 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:56,595 Plugins:
2023-10-17 16:31:56,595 - TensorboardLogger
2023-10-17 16:31:56,595 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:31:56,595 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:56,595 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:31:56,595 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:31:56,595 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:56,595 Computation:
2023-10-17 16:31:56,595 - compute on device: cuda:0
2023-10-17 16:31:56,595 - embedding storage: none
2023-10-17 16:31:56,595 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:56,595 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 16:31:56,595 ----------------------------------------------------------------------------------------------------
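A matching sketch of the training call implied by the parameters and plugins above (learning rate 3e-05, mini-batch size 8, 10 epochs, linear schedule with 10% warmup). It reuses the tagger and corpus objects from the sketch after the model dump; the TensorboardLogger plugin is omitted, and fine_tune()'s default AdamW + linear warmup/decay schedule is assumed to correspond to the logged LinearScheduler.

```python
# Sketch, assuming the `tagger` and `corpus` objects from the previous snippet.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator"
    "-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2",
    learning_rate=3e-5,  # "learning_rate: 3e-05"
    mini_batch_size=8,   # "mini_batch_size: 8"
    max_epochs=10,       # "max_epochs: 10"; shuffle=True is the default
    # fine_tune() defaults to AdamW with a linear warmup/decay schedule,
    # matching the "LinearScheduler | warmup_fraction: 0.1" plugin above.
)
```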
2023-10-17 16:31:56,595 ----------------------------------------------------------------------------------------------------
2023-10-17 16:31:56,595 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 16:32:02,943 epoch 1 - iter 72/723 - loss 3.00768245 - time (sec): 6.35 - samples/sec: 2875.21 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:32:08,183 epoch 1 - iter 144/723 - loss 1.83377417 - time (sec): 11.59 - samples/sec: 3076.95 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:32:13,449 epoch 1 - iter 216/723 - loss 1.34809257 - time (sec): 16.85 - samples/sec: 3108.76 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:32:18,647 epoch 1 - iter 288/723 - loss 1.06011472 - time (sec): 22.05 - samples/sec: 3178.28 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:32:24,054 epoch 1 - iter 360/723 - loss 0.87287490 - time (sec): 27.46 - samples/sec: 3221.22 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:32:29,261 epoch 1 - iter 432/723 - loss 0.75621233 - time (sec): 32.66 - samples/sec: 3237.17 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:32:34,595 epoch 1 - iter 504/723 - loss 0.66869532 - time (sec): 38.00 - samples/sec: 3252.04 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:32:39,865 epoch 1 - iter 576/723 - loss 0.59841051 - time (sec): 43.27 - samples/sec: 3265.45 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:32:45,074 epoch 1 - iter 648/723 - loss 0.54748897 - time (sec): 48.48 - samples/sec: 3263.03 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:32:50,460 epoch 1 - iter 720/723 - loss 0.50420486 - time (sec): 53.86 - samples/sec: 3263.27 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:32:50,630 ----------------------------------------------------------------------------------------------------
2023-10-17 16:32:50,630 EPOCH 1 done: loss 0.5034 - lr: 0.000030
2023-10-17 16:32:53,442 DEV : loss 0.09952846169471741 - f1-score (micro avg) 0.7835
2023-10-17 16:32:53,458 saving best model
2023-10-17 16:32:53,801 ----------------------------------------------------------------------------------------------------
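The lr column in epoch 1 shows the LinearScheduler with warmup_fraction 0.1 at work: with 723 batches per epoch over 10 epochs (7230 steps), the learning rate ramps linearly up to 3e-05 during roughly the first epoch and then decays linearly towards 0. A small standalone sketch of such a schedule (not Flair's own implementation) reproduces the logged values:

```python
# Standalone sketch of a linear warmup/decay schedule, assuming
# 723 steps/epoch x 10 epochs and warmup_fraction = 0.1 (not Flair's own code).
def linear_lr(step, peak_lr=3e-5, total_steps=723 * 10, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)  # 723 steps ~= epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps           # warm-up ramp
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # linear decay

print(linear_lr(72))    # ~3e-06 -> "lr: 0.000003" at epoch 1, iter 72/723
print(linear_lr(723))   # ~3e-05 at the end of epoch 1 (peak)
print(linear_lr(7155))  # ~3e-07 -> "lr: 0.000000" near the end of epoch 10
```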
2023-10-17 16:32:58,765 epoch 2 - iter 72/723 - loss 0.13271887 - time (sec): 4.96 - samples/sec: 3334.96 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:33:03,658 epoch 2 - iter 144/723 - loss 0.12163388 - time (sec): 9.86 - samples/sec: 3400.14 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:33:09,123 epoch 2 - iter 216/723 - loss 0.10850316 - time (sec): 15.32 - samples/sec: 3329.04 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:33:14,409 epoch 2 - iter 288/723 - loss 0.10125886 - time (sec): 20.61 - samples/sec: 3345.74 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:33:19,451 epoch 2 - iter 360/723 - loss 0.09796960 - time (sec): 25.65 - samples/sec: 3351.30 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:33:24,872 epoch 2 - iter 432/723 - loss 0.09334936 - time (sec): 31.07 - samples/sec: 3384.78 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:33:30,230 epoch 2 - iter 504/723 - loss 0.09221920 - time (sec): 36.43 - samples/sec: 3378.17 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:33:35,347 epoch 2 - iter 576/723 - loss 0.09019324 - time (sec): 41.55 - samples/sec: 3370.17 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:33:40,659 epoch 2 - iter 648/723 - loss 0.08963976 - time (sec): 46.86 - samples/sec: 3375.95 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:33:45,971 epoch 2 - iter 720/723 - loss 0.09017542 - time (sec): 52.17 - samples/sec: 3368.84 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:33:46,144 ----------------------------------------------------------------------------------------------------
2023-10-17 16:33:46,144 EPOCH 2 done: loss 0.0903 - lr: 0.000027
2023-10-17 16:33:49,394 DEV : loss 0.09726935625076294 - f1-score (micro avg) 0.7537
2023-10-17 16:33:49,411 ----------------------------------------------------------------------------------------------------
2023-10-17 16:33:55,148 epoch 3 - iter 72/723 - loss 0.07215370 - time (sec): 5.74 - samples/sec: 3173.83 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:34:00,300 epoch 3 - iter 144/723 - loss 0.06483466 - time (sec): 10.89 - samples/sec: 3249.51 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:34:05,931 epoch 3 - iter 216/723 - loss 0.06657008 - time (sec): 16.52 - samples/sec: 3294.95 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:34:11,026 epoch 3 - iter 288/723 - loss 0.06642041 - time (sec): 21.61 - samples/sec: 3298.27 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:34:16,051 epoch 3 - iter 360/723 - loss 0.06531288 - time (sec): 26.64 - samples/sec: 3330.33 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:34:21,370 epoch 3 - iter 432/723 - loss 0.06197145 - time (sec): 31.96 - samples/sec: 3347.61 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:34:26,222 epoch 3 - iter 504/723 - loss 0.06234391 - time (sec): 36.81 - samples/sec: 3352.64 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:34:31,001 epoch 3 - iter 576/723 - loss 0.06144159 - time (sec): 41.59 - samples/sec: 3362.47 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:34:36,364 epoch 3 - iter 648/723 - loss 0.06093836 - time (sec): 46.95 - samples/sec: 3359.96 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:34:41,970 epoch 3 - iter 720/723 - loss 0.06193645 - time (sec): 52.56 - samples/sec: 3339.84 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:34:42,210 ----------------------------------------------------------------------------------------------------
2023-10-17 16:34:42,210 EPOCH 3 done: loss 0.0619 - lr: 0.000023
2023-10-17 16:34:45,535 DEV : loss 0.06067139655351639 - f1-score (micro avg) 0.8784
2023-10-17 16:34:45,557 saving best model
2023-10-17 16:34:45,993 ----------------------------------------------------------------------------------------------------
2023-10-17 16:34:51,188 epoch 4 - iter 72/723 - loss 0.03604175 - time (sec): 5.19 - samples/sec: 3431.92 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:34:56,194 epoch 4 - iter 144/723 - loss 0.04356223 - time (sec): 10.20 - samples/sec: 3418.58 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:35:01,218 epoch 4 - iter 216/723 - loss 0.03967575 - time (sec): 15.22 - samples/sec: 3391.10 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:35:06,482 epoch 4 - iter 288/723 - loss 0.04323559 - time (sec): 20.48 - samples/sec: 3354.49 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:35:12,130 epoch 4 - iter 360/723 - loss 0.04616657 - time (sec): 26.13 - samples/sec: 3318.56 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:35:17,425 epoch 4 - iter 432/723 - loss 0.04481773 - time (sec): 31.43 - samples/sec: 3340.12 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:35:22,592 epoch 4 - iter 504/723 - loss 0.04476274 - time (sec): 36.59 - samples/sec: 3354.14 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:35:27,892 epoch 4 - iter 576/723 - loss 0.04301556 - time (sec): 41.89 - samples/sec: 3352.09 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:35:33,137 epoch 4 - iter 648/723 - loss 0.04247650 - time (sec): 47.14 - samples/sec: 3352.96 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:35:38,436 epoch 4 - iter 720/723 - loss 0.04245231 - time (sec): 52.44 - samples/sec: 3351.38 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:35:38,580 ----------------------------------------------------------------------------------------------------
2023-10-17 16:35:38,580 EPOCH 4 done: loss 0.0424 - lr: 0.000020
2023-10-17 16:35:42,472 DEV : loss 0.06585061550140381 - f1-score (micro avg) 0.88
2023-10-17 16:35:42,489 saving best model
2023-10-17 16:35:42,962 ----------------------------------------------------------------------------------------------------
2023-10-17 16:35:47,869 epoch 5 - iter 72/723 - loss 0.04265918 - time (sec): 4.90 - samples/sec: 3353.74 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:35:53,060 epoch 5 - iter 144/723 - loss 0.03658285 - time (sec): 10.10 - samples/sec: 3370.07 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:35:58,295 epoch 5 - iter 216/723 - loss 0.03498816 - time (sec): 15.33 - samples/sec: 3369.09 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:36:03,141 epoch 5 - iter 288/723 - loss 0.03432794 - time (sec): 20.18 - samples/sec: 3361.53 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:36:08,503 epoch 5 - iter 360/723 - loss 0.03464982 - time (sec): 25.54 - samples/sec: 3369.88 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:36:13,602 epoch 5 - iter 432/723 - loss 0.03485007 - time (sec): 30.64 - samples/sec: 3380.28 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:36:19,327 epoch 5 - iter 504/723 - loss 0.03480689 - time (sec): 36.36 - samples/sec: 3367.59 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:36:25,026 epoch 5 - iter 576/723 - loss 0.03484667 - time (sec): 42.06 - samples/sec: 3362.37 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:36:30,154 epoch 5 - iter 648/723 - loss 0.03347624 - time (sec): 47.19 - samples/sec: 3371.81 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:36:35,020 epoch 5 - iter 720/723 - loss 0.03333207 - time (sec): 52.05 - samples/sec: 3372.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:36:35,222 ----------------------------------------------------------------------------------------------------
2023-10-17 16:36:35,222 EPOCH 5 done: loss 0.0333 - lr: 0.000017
2023-10-17 16:36:38,511 DEV : loss 0.08196107298135757 - f1-score (micro avg) 0.8394
2023-10-17 16:36:38,529 ----------------------------------------------------------------------------------------------------
2023-10-17 16:36:44,227 epoch 6 - iter 72/723 - loss 0.03440873 - time (sec): 5.70 - samples/sec: 3228.25 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:36:49,782 epoch 6 - iter 144/723 - loss 0.02600069 - time (sec): 11.25 - samples/sec: 3159.61 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:36:54,634 epoch 6 - iter 216/723 - loss 0.02505399 - time (sec): 16.10 - samples/sec: 3284.53 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:36:59,642 epoch 6 - iter 288/723 - loss 0.02392099 - time (sec): 21.11 - samples/sec: 3329.96 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:37:05,080 epoch 6 - iter 360/723 - loss 0.02284558 - time (sec): 26.55 - samples/sec: 3331.65 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:37:10,418 epoch 6 - iter 432/723 - loss 0.02214248 - time (sec): 31.89 - samples/sec: 3347.77 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:37:15,719 epoch 6 - iter 504/723 - loss 0.02141012 - time (sec): 37.19 - samples/sec: 3352.93 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:37:20,686 epoch 6 - iter 576/723 - loss 0.02233096 - time (sec): 42.16 - samples/sec: 3348.19 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:37:25,758 epoch 6 - iter 648/723 - loss 0.02339352 - time (sec): 47.23 - samples/sec: 3346.09 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:37:30,821 epoch 6 - iter 720/723 - loss 0.02328563 - time (sec): 52.29 - samples/sec: 3356.86 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:37:31,137 ----------------------------------------------------------------------------------------------------
2023-10-17 16:37:31,137 EPOCH 6 done: loss 0.0232 - lr: 0.000013
2023-10-17 16:37:34,605 DEV : loss 0.1128404513001442 - f1-score (micro avg) 0.8366
2023-10-17 16:37:34,630 ----------------------------------------------------------------------------------------------------
2023-10-17 16:37:40,108 epoch 7 - iter 72/723 - loss 0.01703668 - time (sec): 5.48 - samples/sec: 3361.43 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:37:45,433 epoch 7 - iter 144/723 - loss 0.01533515 - time (sec): 10.80 - samples/sec: 3352.87 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:37:50,493 epoch 7 - iter 216/723 - loss 0.01594799 - time (sec): 15.86 - samples/sec: 3353.59 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:37:55,922 epoch 7 - iter 288/723 - loss 0.01731920 - time (sec): 21.29 - samples/sec: 3335.13 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:38:01,120 epoch 7 - iter 360/723 - loss 0.01755588 - time (sec): 26.49 - samples/sec: 3360.12 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:38:06,133 epoch 7 - iter 432/723 - loss 0.01741061 - time (sec): 31.50 - samples/sec: 3366.32 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:38:11,380 epoch 7 - iter 504/723 - loss 0.01882233 - time (sec): 36.75 - samples/sec: 3343.50 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:38:16,383 epoch 7 - iter 576/723 - loss 0.01883728 - time (sec): 41.75 - samples/sec: 3356.38 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:38:21,516 epoch 7 - iter 648/723 - loss 0.01884261 - time (sec): 46.88 - samples/sec: 3354.11 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:38:26,987 epoch 7 - iter 720/723 - loss 0.01918716 - time (sec): 52.36 - samples/sec: 3354.59 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:38:27,191 ----------------------------------------------------------------------------------------------------
2023-10-17 16:38:27,191 EPOCH 7 done: loss 0.0192 - lr: 0.000010
2023-10-17 16:38:31,364 DEV : loss 0.10547000914812088 - f1-score (micro avg) 0.8758
2023-10-17 16:38:31,408 ----------------------------------------------------------------------------------------------------
2023-10-17 16:38:36,744 epoch 8 - iter 72/723 - loss 0.01480279 - time (sec): 5.33 - samples/sec: 3207.05 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:38:42,026 epoch 8 - iter 144/723 - loss 0.01398233 - time (sec): 10.62 - samples/sec: 3223.61 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:38:47,455 epoch 8 - iter 216/723 - loss 0.01388295 - time (sec): 16.04 - samples/sec: 3243.82 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:38:52,719 epoch 8 - iter 288/723 - loss 0.01560830 - time (sec): 21.31 - samples/sec: 3253.01 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:38:57,608 epoch 8 - iter 360/723 - loss 0.01346550 - time (sec): 26.20 - samples/sec: 3262.82 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:39:03,090 epoch 8 - iter 432/723 - loss 0.01242414 - time (sec): 31.68 - samples/sec: 3281.19 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:39:08,362 epoch 8 - iter 504/723 - loss 0.01329213 - time (sec): 36.95 - samples/sec: 3308.45 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:39:13,433 epoch 8 - iter 576/723 - loss 0.01413418 - time (sec): 42.02 - samples/sec: 3319.59 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:39:18,797 epoch 8 - iter 648/723 - loss 0.01466463 - time (sec): 47.39 - samples/sec: 3330.57 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:39:24,125 epoch 8 - iter 720/723 - loss 0.01453908 - time (sec): 52.72 - samples/sec: 3331.91 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:39:24,294 ----------------------------------------------------------------------------------------------------
2023-10-17 16:39:24,295 EPOCH 8 done: loss 0.0146 - lr: 0.000007
2023-10-17 16:39:27,548 DEV : loss 0.11225023120641708 - f1-score (micro avg) 0.8738
2023-10-17 16:39:27,569 ----------------------------------------------------------------------------------------------------
2023-10-17 16:39:33,008 epoch 9 - iter 72/723 - loss 0.01105950 - time (sec): 5.44 - samples/sec: 3327.69 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:39:38,897 epoch 9 - iter 144/723 - loss 0.01071290 - time (sec): 11.33 - samples/sec: 3253.51 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:39:43,889 epoch 9 - iter 216/723 - loss 0.01095588 - time (sec): 16.32 - samples/sec: 3309.64 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:39:49,555 epoch 9 - iter 288/723 - loss 0.01087741 - time (sec): 21.98 - samples/sec: 3242.17 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:39:54,889 epoch 9 - iter 360/723 - loss 0.01075241 - time (sec): 27.32 - samples/sec: 3274.38 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:40:00,069 epoch 9 - iter 432/723 - loss 0.01016298 - time (sec): 32.50 - samples/sec: 3297.01 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:40:05,009 epoch 9 - iter 504/723 - loss 0.01061815 - time (sec): 37.44 - samples/sec: 3311.66 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:40:10,029 epoch 9 - iter 576/723 - loss 0.00974392 - time (sec): 42.46 - samples/sec: 3324.65 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:40:15,475 epoch 9 - iter 648/723 - loss 0.00992418 - time (sec): 47.90 - samples/sec: 3306.12 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:40:21,097 epoch 9 - iter 720/723 - loss 0.00977684 - time (sec): 53.53 - samples/sec: 3279.59 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:40:21,279 ----------------------------------------------------------------------------------------------------
2023-10-17 16:40:21,279 EPOCH 9 done: loss 0.0097 - lr: 0.000003
2023-10-17 16:40:24,489 DEV : loss 0.12020541727542877 - f1-score (micro avg) 0.8718
2023-10-17 16:40:24,506 ----------------------------------------------------------------------------------------------------
2023-10-17 16:40:29,889 epoch 10 - iter 72/723 - loss 0.00414168 - time (sec): 5.38 - samples/sec: 3371.93 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:40:34,809 epoch 10 - iter 144/723 - loss 0.00510478 - time (sec): 10.30 - samples/sec: 3386.67 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:40:39,459 epoch 10 - iter 216/723 - loss 0.00443617 - time (sec): 14.95 - samples/sec: 3354.89 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:40:45,030 epoch 10 - iter 288/723 - loss 0.00582761 - time (sec): 20.52 - samples/sec: 3301.79 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:40:50,363 epoch 10 - iter 360/723 - loss 0.00623510 - time (sec): 25.86 - samples/sec: 3317.75 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:40:55,508 epoch 10 - iter 432/723 - loss 0.00600288 - time (sec): 31.00 - samples/sec: 3317.56 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:41:00,864 epoch 10 - iter 504/723 - loss 0.00660354 - time (sec): 36.36 - samples/sec: 3320.53 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:41:06,132 epoch 10 - iter 576/723 - loss 0.00641511 - time (sec): 41.63 - samples/sec: 3326.39 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:41:11,678 epoch 10 - iter 648/723 - loss 0.00674726 - time (sec): 47.17 - samples/sec: 3325.81 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:41:17,289 epoch 10 - iter 720/723 - loss 0.00692422 - time (sec): 52.78 - samples/sec: 3328.41 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:41:17,478 ----------------------------------------------------------------------------------------------------
2023-10-17 16:41:17,479 EPOCH 10 done: loss 0.0070 - lr: 0.000000
2023-10-17 16:41:21,045 DEV : loss 0.12641684710979462 - f1-score (micro avg) 0.8674
2023-10-17 16:41:21,413 ----------------------------------------------------------------------------------------------------
2023-10-17 16:41:21,415 Loading model from best epoch ...
2023-10-17 16:41:22,773 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
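The 13-tag dictionary above is the BIOES encoding of the three entity types (LOC, PER, ORG) plus O. A minimal sketch of loading the saved best checkpoint for inference; the path is the base path from the log plus best-model.pt, and the sentence is a hypothetical example:

```python
# Sketch: load the best checkpoint and tag a (hypothetical) Dutch sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator"
    "-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

sentence = Sentence("Vincent van Gogh werd geboren in Zundert.")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):   # entity spans decoded from the BIOES tags
    label = span.get_label("ner")
    print(span.text, label.value, label.score)
```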
2023-10-17 16:41:25,530
Results:
- F-score (micro) 0.8635
- F-score (macro) 0.7539
- Accuracy 0.7666
By class:
              precision    recall  f1-score   support

         PER     0.8607    0.8589    0.8598       482
         LOC     0.9405    0.8974    0.9184       458
         ORG     0.5686    0.4203    0.4833        69

   micro avg     0.8813    0.8464    0.8635      1009
   macro avg     0.7899    0.7255    0.7539      1009
weighted avg     0.8770    0.8464    0.8607      1009
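As a quick consistency check, the micro-average F-score is simply the harmonic mean of the micro precision and recall reported in the table:

```python
# Sanity check: micro F1 = harmonic mean of micro precision and recall.
p, r = 0.8813, 0.8464
print(round(2 * p * r / (p + r), 4))  # 0.8635, matching "F-score (micro)" above
```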
2023-10-17 16:41:25,531 ----------------------------------------------------------------------------------------------------