2023-10-17 16:05:51,820 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:51,821 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:05:51,822 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:51,822 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 16:05:51,822 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:51,822 Train: 5777 sentences
2023-10-17 16:05:51,822 (train_with_dev=False, train_with_test=False)
2023-10-17 16:05:51,822 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:51,822 Training Params:
2023-10-17 16:05:51,822 - learning_rate: "3e-05"
2023-10-17 16:05:51,822 - mini_batch_size: "4"
2023-10-17 16:05:51,822 - max_epochs: "10"
2023-10-17 16:05:51,822 - shuffle: "True"
2023-10-17 16:05:51,822 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:51,822 Plugins:
2023-10-17 16:05:51,822 - TensorboardLogger
2023-10-17 16:05:51,822 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:05:51,822 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:51,822 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:05:51,822 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:05:51,822 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:51,822 Computation:
2023-10-17 16:05:51,822 - compute on device: cuda:0
2023-10-17 16:05:51,822 - embedding storage: none
2023-10-17 16:05:51,822 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:51,822 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 16:05:51,822 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:51,822 ----------------------------------------------------------------------------------------------------
2023-10-17 16:05:51,823 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 16:05:58,938 epoch 1 - iter 144/1445 - loss 2.61442915 - time (sec): 7.11 - samples/sec: 2565.02 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:06:05,920 epoch 1 - iter 288/1445 - loss 1.48508856 - time (sec): 14.10 - samples/sec: 2529.03 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:06:13,014 epoch 1 - iter 432/1445 - loss 1.10139561 - time (sec): 21.19 - samples/sec: 2472.30 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:06:20,202 epoch 1 - iter 576/1445 - loss 0.86933963 - time (sec): 28.38 - samples/sec: 2469.61 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:06:27,561 epoch 1 - iter 720/1445 - loss 0.72081647 - time (sec): 35.74 - samples/sec: 2474.93 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:06:34,978 epoch 1 - iter 864/1445 - loss 0.62859943 - time (sec): 43.15 - samples/sec: 2450.29 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:06:42,071 epoch 1 - iter 1008/1445 - loss 0.55956683 - time (sec): 50.25 - samples/sec: 2459.27 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:06:49,693 epoch 1 - iter 1152/1445 - loss 0.50313260 - time (sec): 57.87 - samples/sec: 2441.54 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:06:56,773 epoch 1 - iter 1296/1445 - loss 0.46190545 - time (sec): 64.95 - samples/sec: 2435.48 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:07:04,076 epoch 1 - iter 1440/1445 - loss 0.42817614 - time (sec): 72.25 - samples/sec: 2432.73 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:07:04,322 ----------------------------------------------------------------------------------------------------
2023-10-17 16:07:04,323 EPOCH 1 done: loss 0.4276 - lr: 0.000030
2023-10-17 16:07:07,040 DEV : loss 0.08966663479804993 - f1-score (micro avg) 0.8004
2023-10-17 16:07:07,056 saving best model
2023-10-17 16:07:07,420 ----------------------------------------------------------------------------------------------------
2023-10-17 16:07:14,351 epoch 2 - iter 144/1445 - loss 0.13373991 - time (sec): 6.93 - samples/sec: 2388.36 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:07:21,190 epoch 2 - iter 288/1445 - loss 0.12005429 - time (sec): 13.77 - samples/sec: 2433.94 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:07:28,283 epoch 2 - iter 432/1445 - loss 0.10977419 - time (sec): 20.86 - samples/sec: 2444.98 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:07:35,769 epoch 2 - iter 576/1445 - loss 0.10492325 - time (sec): 28.35 - samples/sec: 2432.19 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:07:42,654 epoch 2 - iter 720/1445 - loss 0.10061786 - time (sec): 35.23 - samples/sec: 2439.70 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:07:49,741 epoch 2 - iter 864/1445 - loss 0.09806729 - time (sec): 42.32 - samples/sec: 2485.05 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:07:56,799 epoch 2 - iter 1008/1445 - loss 0.09598940 - time (sec): 49.38 - samples/sec: 2492.22 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:08:03,879 epoch 2 - iter 1152/1445 - loss 0.09402430 - time (sec): 56.46 - samples/sec: 2480.00 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:08:11,024 epoch 2 - iter 1296/1445 - loss 0.09410389 - time (sec): 63.60 - samples/sec: 2487.14 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:08:18,116 epoch 2 - iter 1440/1445 - loss 0.09305196 - time (sec): 70.69 - samples/sec: 2486.04 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:08:18,336 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:18,336 EPOCH 2 done: loss 0.0932 - lr: 0.000027
2023-10-17 16:08:21,685 DEV : loss 0.09091634303331375 - f1-score (micro avg) 0.8244
2023-10-17 16:08:21,701 saving best model
2023-10-17 16:08:22,149 ----------------------------------------------------------------------------------------------------
2023-10-17 16:08:29,534 epoch 3 - iter 144/1445 - loss 0.06992594 - time (sec): 7.38 - samples/sec: 2466.82 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:08:36,552 epoch 3 - iter 288/1445 - loss 0.06304889 - time (sec): 14.40 - samples/sec: 2457.36 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:08:43,801 epoch 3 - iter 432/1445 - loss 0.06653206 - time (sec): 21.65 - samples/sec: 2514.49 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:08:50,910 epoch 3 - iter 576/1445 - loss 0.06750477 - time (sec): 28.76 - samples/sec: 2479.06 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:08:58,054 epoch 3 - iter 720/1445 - loss 0.06590474 - time (sec): 35.90 - samples/sec: 2471.29 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:09:05,405 epoch 3 - iter 864/1445 - loss 0.06338105 - time (sec): 43.25 - samples/sec: 2473.63 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:09:12,492 epoch 3 - iter 1008/1445 - loss 0.06397475 - time (sec): 50.34 - samples/sec: 2451.70 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:09:19,380 epoch 3 - iter 1152/1445 - loss 0.06368394 - time (sec): 57.23 - samples/sec: 2443.68 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:09:26,479 epoch 3 - iter 1296/1445 - loss 0.06411821 - time (sec): 64.32 - samples/sec: 2452.52 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:09:33,742 epoch 3 - iter 1440/1445 - loss 0.06471648 - time (sec): 71.59 - samples/sec: 2452.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:09:34,000 ----------------------------------------------------------------------------------------------------
2023-10-17 16:09:34,001 EPOCH 3 done: loss 0.0647 - lr: 0.000023
2023-10-17 16:09:37,163 DEV : loss 0.08195469528436661 - f1-score (micro avg) 0.8681
2023-10-17 16:09:37,180 saving best model
2023-10-17 16:09:37,649 ----------------------------------------------------------------------------------------------------
2023-10-17 16:09:44,774 epoch 4 - iter 144/1445 - loss 0.03793689 - time (sec): 7.12 - samples/sec: 2500.53 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:09:51,815 epoch 4 - iter 288/1445 - loss 0.04584517 - time (sec): 14.16 - samples/sec: 2460.92 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:09:58,809 epoch 4 - iter 432/1445 - loss 0.04581201 - time (sec): 21.16 - samples/sec: 2439.30 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:10:05,955 epoch 4 - iter 576/1445 - loss 0.04818147 - time (sec): 28.30 - samples/sec: 2427.68 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:10:13,182 epoch 4 - iter 720/1445 - loss 0.05102175 - time (sec): 35.53 - samples/sec: 2440.70 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:10:20,228 epoch 4 - iter 864/1445 - loss 0.05185422 - time (sec): 42.58 - samples/sec: 2465.40 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:10:27,265 epoch 4 - iter 1008/1445 - loss 0.05102231 - time (sec): 49.61 - samples/sec: 2473.93 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:10:34,786 epoch 4 - iter 1152/1445 - loss 0.04988097 - time (sec): 57.14 - samples/sec: 2457.91 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:10:41,874 epoch 4 - iter 1296/1445 - loss 0.04932647 - time (sec): 64.22 - samples/sec: 2460.98 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:10:49,054 epoch 4 - iter 1440/1445 - loss 0.04942813 - time (sec): 71.40 - samples/sec: 2461.23 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:10:49,309 ----------------------------------------------------------------------------------------------------
2023-10-17 16:10:49,309 EPOCH 4 done: loss 0.0494 - lr: 0.000020
2023-10-17 16:10:52,521 DEV : loss 0.10280071198940277 - f1-score (micro avg) 0.8594
2023-10-17 16:10:52,537 ----------------------------------------------------------------------------------------------------
2023-10-17 16:10:59,669 epoch 5 - iter 144/1445 - loss 0.05086725 - time (sec): 7.13 - samples/sec: 2306.72 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:11:06,676 epoch 5 - iter 288/1445 - loss 0.03847390 - time (sec): 14.14 - samples/sec: 2406.40 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:11:13,729 epoch 5 - iter 432/1445 - loss 0.03722457 - time (sec): 21.19 - samples/sec: 2437.47 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:11:20,636 epoch 5 - iter 576/1445 - loss 0.03791969 - time (sec): 28.10 - samples/sec: 2413.95 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:11:27,649 epoch 5 - iter 720/1445 - loss 0.03762962 - time (sec): 35.11 - samples/sec: 2451.19 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:11:34,669 epoch 5 - iter 864/1445 - loss 0.03832143 - time (sec): 42.13 - samples/sec: 2458.12 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:11:41,965 epoch 5 - iter 1008/1445 - loss 0.03898272 - time (sec): 49.43 - samples/sec: 2477.51 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:11:49,173 epoch 5 - iter 1152/1445 - loss 0.03774384 - time (sec): 56.63 - samples/sec: 2497.13 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:11:56,200 epoch 5 - iter 1296/1445 - loss 0.03646897 - time (sec): 63.66 - samples/sec: 2499.37 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:12:03,012 epoch 5 - iter 1440/1445 - loss 0.03614791 - time (sec): 70.47 - samples/sec: 2491.29 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:12:03,244 ----------------------------------------------------------------------------------------------------
2023-10-17 16:12:03,244 EPOCH 5 done: loss 0.0361 - lr: 0.000017
2023-10-17 16:12:06,435 DEV : loss 0.10241829603910446 - f1-score (micro avg) 0.8564
2023-10-17 16:12:06,451 ----------------------------------------------------------------------------------------------------
2023-10-17 16:12:13,704 epoch 6 - iter 144/1445 - loss 0.04938968 - time (sec): 7.25 - samples/sec: 2536.58 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:12:20,795 epoch 6 - iter 288/1445 - loss 0.03451984 - time (sec): 14.34 - samples/sec: 2478.84 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:12:27,672 epoch 6 - iter 432/1445 - loss 0.03115367 - time (sec): 21.22 - samples/sec: 2492.77 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:12:34,599 epoch 6 - iter 576/1445 - loss 0.02804596 - time (sec): 28.15 - samples/sec: 2497.72 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:12:41,827 epoch 6 - iter 720/1445 - loss 0.02653754 - time (sec): 35.37 - samples/sec: 2500.60 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:12:48,924 epoch 6 - iter 864/1445 - loss 0.02661037 - time (sec): 42.47 - samples/sec: 2513.53 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:12:56,046 epoch 6 - iter 1008/1445 - loss 0.02512559 - time (sec): 49.59 - samples/sec: 2514.30 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:13:03,060 epoch 6 - iter 1152/1445 - loss 0.02561509 - time (sec): 56.61 - samples/sec: 2493.42 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:13:09,934 epoch 6 - iter 1296/1445 - loss 0.02667536 - time (sec): 63.48 - samples/sec: 2489.39 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:13:16,851 epoch 6 - iter 1440/1445 - loss 0.02699248 - time (sec): 70.40 - samples/sec: 2493.45 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:13:17,148 ----------------------------------------------------------------------------------------------------
2023-10-17 16:13:17,149 EPOCH 6 done: loss 0.0269 - lr: 0.000013
2023-10-17 16:13:20,688 DEV : loss 0.12599067389965057 - f1-score (micro avg) 0.8492
2023-10-17 16:13:20,704 ----------------------------------------------------------------------------------------------------
2023-10-17 16:13:27,879 epoch 7 - iter 144/1445 - loss 0.01798898 - time (sec): 7.17 - samples/sec: 2565.89 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:13:34,920 epoch 7 - iter 288/1445 - loss 0.01577177 - time (sec): 14.22 - samples/sec: 2547.63 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:13:41,963 epoch 7 - iter 432/1445 - loss 0.01581476 - time (sec): 21.26 - samples/sec: 2502.35 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:13:49,180 epoch 7 - iter 576/1445 - loss 0.01575283 - time (sec): 28.47 - samples/sec: 2493.63 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:13:56,308 epoch 7 - iter 720/1445 - loss 0.01733916 - time (sec): 35.60 - samples/sec: 2499.92 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:14:03,346 epoch 7 - iter 864/1445 - loss 0.01744267 - time (sec): 42.64 - samples/sec: 2486.88 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:14:10,369 epoch 7 - iter 1008/1445 - loss 0.01855480 - time (sec): 49.66 - samples/sec: 2474.02 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:14:17,360 epoch 7 - iter 1152/1445 - loss 0.01883145 - time (sec): 56.65 - samples/sec: 2473.49 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:14:24,443 epoch 7 - iter 1296/1445 - loss 0.01853015 - time (sec): 63.74 - samples/sec: 2467.26 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:14:31,615 epoch 7 - iter 1440/1445 - loss 0.01913705 - time (sec): 70.91 - samples/sec: 2476.79 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:14:31,850 ----------------------------------------------------------------------------------------------------
2023-10-17 16:14:31,850 EPOCH 7 done: loss 0.0192 - lr: 0.000010
2023-10-17 16:14:35,121 DEV : loss 0.12630820274353027 - f1-score (micro avg) 0.8673
2023-10-17 16:14:35,139 ----------------------------------------------------------------------------------------------------
2023-10-17 16:14:42,235 epoch 8 - iter 144/1445 - loss 0.01838052 - time (sec): 7.09 - samples/sec: 2411.28 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:14:49,383 epoch 8 - iter 288/1445 - loss 0.01367032 - time (sec): 14.24 - samples/sec: 2402.76 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:14:56,511 epoch 8 - iter 432/1445 - loss 0.01331123 - time (sec): 21.37 - samples/sec: 2435.34 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:15:03,633 epoch 8 - iter 576/1445 - loss 0.01435875 - time (sec): 28.49 - samples/sec: 2432.92 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:15:10,448 epoch 8 - iter 720/1445 - loss 0.01281387 - time (sec): 35.31 - samples/sec: 2421.01 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:15:17,642 epoch 8 - iter 864/1445 - loss 0.01215775 - time (sec): 42.50 - samples/sec: 2445.73 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:15:24,878 epoch 8 - iter 1008/1445 - loss 0.01228586 - time (sec): 49.74 - samples/sec: 2457.97 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:15:32,003 epoch 8 - iter 1152/1445 - loss 0.01409772 - time (sec): 56.86 - samples/sec: 2453.22 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:15:39,161 epoch 8 - iter 1296/1445 - loss 0.01486635 - time (sec): 64.02 - samples/sec: 2465.22 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:15:46,199 epoch 8 - iter 1440/1445 - loss 0.01428808 - time (sec): 71.06 - samples/sec: 2471.79 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:15:46,432 ----------------------------------------------------------------------------------------------------
2023-10-17 16:15:46,433 EPOCH 8 done: loss 0.0143 - lr: 0.000007
2023-10-17 16:15:49,712 DEV : loss 0.147283136844635 - f1-score (micro avg) 0.8614
2023-10-17 16:15:49,727 ----------------------------------------------------------------------------------------------------
2023-10-17 16:15:56,769 epoch 9 - iter 144/1445 - loss 0.01334705 - time (sec): 7.04 - samples/sec: 2570.34 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:16:04,493 epoch 9 - iter 288/1445 - loss 0.01614528 - time (sec): 14.76 - samples/sec: 2495.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:16:11,470 epoch 9 - iter 432/1445 - loss 0.01383466 - time (sec): 21.74 - samples/sec: 2484.14 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:16:18,444 epoch 9 - iter 576/1445 - loss 0.01256669 - time (sec): 28.72 - samples/sec: 2482.26 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:16:25,416 epoch 9 - iter 720/1445 - loss 0.01229858 - time (sec): 35.69 - samples/sec: 2506.55 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:16:32,172 epoch 9 - iter 864/1445 - loss 0.01145068 - time (sec): 42.44 - samples/sec: 2524.48 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:16:39,190 epoch 9 - iter 1008/1445 - loss 0.01150023 - time (sec): 49.46 - samples/sec: 2506.71 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:16:46,049 epoch 9 - iter 1152/1445 - loss 0.01113663 - time (sec): 56.32 - samples/sec: 2506.36 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:16:53,266 epoch 9 - iter 1296/1445 - loss 0.01096505 - time (sec): 63.54 - samples/sec: 2492.67 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:17:00,355 epoch 9 - iter 1440/1445 - loss 0.01055015 - time (sec): 70.63 - samples/sec: 2485.57 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:17:00,582 ----------------------------------------------------------------------------------------------------
2023-10-17 16:17:00,582 EPOCH 9 done: loss 0.0105 - lr: 0.000003
2023-10-17 16:17:03,819 DEV : loss 0.13861502707004547 - f1-score (micro avg) 0.8674
2023-10-17 16:17:03,835 ----------------------------------------------------------------------------------------------------
2023-10-17 16:17:10,875 epoch 10 - iter 144/1445 - loss 0.00264139 - time (sec): 7.04 - samples/sec: 2578.53 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:17:17,605 epoch 10 - iter 288/1445 - loss 0.00303990 - time (sec): 13.77 - samples/sec: 2533.99 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:17:24,447 epoch 10 - iter 432/1445 - loss 0.00328206 - time (sec): 20.61 - samples/sec: 2433.78 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:17:31,703 epoch 10 - iter 576/1445 - loss 0.00421475 - time (sec): 27.87 - samples/sec: 2431.67 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:17:38,866 epoch 10 - iter 720/1445 - loss 0.00504231 - time (sec): 35.03 - samples/sec: 2448.92 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:17:45,871 epoch 10 - iter 864/1445 - loss 0.00543516 - time (sec): 42.03 - samples/sec: 2446.74 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:17:52,967 epoch 10 - iter 1008/1445 - loss 0.00520470 - time (sec): 49.13 - samples/sec: 2457.22 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:18:00,019 epoch 10 - iter 1152/1445 - loss 0.00518206 - time (sec): 56.18 - samples/sec: 2464.51 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:18:07,103 epoch 10 - iter 1296/1445 - loss 0.00570006 - time (sec): 63.27 - samples/sec: 2479.72 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:18:14,425 epoch 10 - iter 1440/1445 - loss 0.00595358 - time (sec): 70.59 - samples/sec: 2488.80 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:18:14,655 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:14,656 EPOCH 10 done: loss 0.0060 - lr: 0.000000
2023-10-17 16:18:17,857 DEV : loss 0.150188148021698 - f1-score (micro avg) 0.8647
2023-10-17 16:18:18,219 ----------------------------------------------------------------------------------------------------
2023-10-17 16:18:18,220 Loading model from best epoch ...
2023-10-17 16:18:19,595 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 16:18:22,388
Results:
- F-score (micro) 0.8422
- F-score (macro) 0.7199
- Accuracy 0.7358
By class:
              precision    recall  f1-score   support

         PER     0.8524    0.8506    0.8515       482
         LOC     0.9390    0.8406    0.8871       458
         ORG     0.5333    0.3478    0.4211        69

   micro avg     0.8750    0.8117    0.8422      1009
   macro avg     0.7749    0.6797    0.7199      1009
weighted avg     0.8699    0.8117    0.8382      1009
2023-10-17 16:18:22,388 ----------------------------------------------------------------------------------------------------