2023-10-17 18:08:20,831 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:20,832 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:08:20,832 ----------------------------------------------------------------------------------------------------
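Note: the architecture printed above is a standard Flair SequenceTagger with a transformer backbone and a plain linear classification head (no CRF, no RNN). A minimal, non-authoritative sketch of how such a tagger is typically assembled, assuming the backbone is hmteams/teams-base-historic-multilingual-discriminator as suggested by the training base path further down, and that the tag dictionary comes from the corpus sketch below:

from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# backbone name inferred from the training base path below; treat it as an assumption
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",               # "layers-1" in the base path
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,                  # unused when use_rnn=False
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,    # built from the corpus (see sketch below)
    tag_type="ner",
    use_crf=False,                    # "crfFalse" in the base path
    use_rnn=False,                    # the printed model has only LockedDropout + Linear
    reproject_embeddings=False,
)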
2023-10-17 18:08:20,832 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 18:08:20,832 ----------------------------------------------------------------------------------------------------
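Note: a minimal sketch of loading this corpus with Flair's built-in HIPE-2022 loader; the argument names are assumptions based on the dataset path logged above (newseye/fi):

from flair.datasets import NER_HIPE_2022
from flair.data import MultiCorpus

# newseye/fi split of HIPE-2022, as indicated by the dataset path in the log
corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")
corpora = MultiCorpus([corpus])

# yields the 17-tag dictionary (including O) printed at the end of this log
tag_dictionary = corpora.make_label_dictionary(label_type="ner")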
2023-10-17 18:08:20,832 Train: 1166 sentences
2023-10-17 18:08:20,832 (train_with_dev=False, train_with_test=False)
2023-10-17 18:08:20,832 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:20,832 Training Params:
2023-10-17 18:08:20,832 - learning_rate: "3e-05"
2023-10-17 18:08:20,832 - mini_batch_size: "4"
2023-10-17 18:08:20,832 - max_epochs: "10"
2023-10-17 18:08:20,832 - shuffle: "True"
2023-10-17 18:08:20,832 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:20,832 Plugins:
2023-10-17 18:08:20,832 - TensorboardLogger
2023-10-17 18:08:20,832 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:08:20,832 ----------------------------------------------------------------------------------------------------
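Note: the training parameters and plugins above correspond to a fine-tuning run. A hedged sketch using Flair's ModelTrainer; the TensorBoard logging and linear warmup scheduler listed as plugins are assumed to be attached by fine_tune defaults and/or the surrounding benchmark script:

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpora)
trainer.fine_tune(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=3e-05,
    mini_batch_size=4,
    max_epochs=10,
    # a linear LR schedule with warmup_fraction=0.1 (the LinearScheduler plugin above)
    # is assumed to be the fine_tune default here; TensorBoard logging is assumed to be
    # enabled by the surrounding hmbench training script.
)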
2023-10-17 18:08:20,832 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:08:20,832 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:08:20,832 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:20,832 Computation:
2023-10-17 18:08:20,832 - compute on device: cuda:0
2023-10-17 18:08:20,832 - embedding storage: none
2023-10-17 18:08:20,833 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:20,833 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 18:08:20,833 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:20,833 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:20,833 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:08:22,668 epoch 1 - iter 29/292 - loss 3.83095474 - time (sec): 1.83 - samples/sec: 2821.52 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:08:24,296 epoch 1 - iter 58/292 - loss 3.23658540 - time (sec): 3.46 - samples/sec: 2713.21 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:08:25,947 epoch 1 - iter 87/292 - loss 2.54606603 - time (sec): 5.11 - samples/sec: 2640.12 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:08:27,753 epoch 1 - iter 116/292 - loss 2.06575240 - time (sec): 6.92 - samples/sec: 2708.68 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:08:29,506 epoch 1 - iter 145/292 - loss 1.72231768 - time (sec): 8.67 - samples/sec: 2751.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:08:31,136 epoch 1 - iter 174/292 - loss 1.52369828 - time (sec): 10.30 - samples/sec: 2745.06 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:08:32,635 epoch 1 - iter 203/292 - loss 1.39043738 - time (sec): 11.80 - samples/sec: 2745.15 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:08:34,112 epoch 1 - iter 232/292 - loss 1.27968694 - time (sec): 13.28 - samples/sec: 2716.28 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:08:35,587 epoch 1 - iter 261/292 - loss 1.19004896 - time (sec): 14.75 - samples/sec: 2691.85 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:08:37,377 epoch 1 - iter 290/292 - loss 1.09299373 - time (sec): 16.54 - samples/sec: 2678.21 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:08:37,468 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:37,469 EPOCH 1 done: loss 1.0906 - lr: 0.000030
2023-10-17 18:08:38,354 DEV : loss 0.18845295906066895 - f1-score (micro avg) 0.4088
2023-10-17 18:08:38,363 saving best model
2023-10-17 18:08:38,787 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:40,590 epoch 2 - iter 29/292 - loss 0.23954317 - time (sec): 1.80 - samples/sec: 2710.24 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:08:42,259 epoch 2 - iter 58/292 - loss 0.21325555 - time (sec): 3.47 - samples/sec: 2631.21 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:08:43,845 epoch 2 - iter 87/292 - loss 0.21658595 - time (sec): 5.05 - samples/sec: 2615.84 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:08:45,581 epoch 2 - iter 116/292 - loss 0.22084222 - time (sec): 6.79 - samples/sec: 2590.21 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:08:47,197 epoch 2 - iter 145/292 - loss 0.21637883 - time (sec): 8.41 - samples/sec: 2611.73 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:08:49,047 epoch 2 - iter 174/292 - loss 0.20713289 - time (sec): 10.26 - samples/sec: 2651.82 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:08:50,795 epoch 2 - iter 203/292 - loss 0.19540900 - time (sec): 12.00 - samples/sec: 2695.67 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:08:52,425 epoch 2 - iter 232/292 - loss 0.19774814 - time (sec): 13.63 - samples/sec: 2678.32 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:08:54,129 epoch 2 - iter 261/292 - loss 0.19336878 - time (sec): 15.34 - samples/sec: 2651.92 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:08:55,633 epoch 2 - iter 290/292 - loss 0.18836344 - time (sec): 16.84 - samples/sec: 2625.39 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:08:55,724 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:55,724 EPOCH 2 done: loss 0.1893 - lr: 0.000027
2023-10-17 18:08:57,214 DEV : loss 0.12409567087888718 - f1-score (micro avg) 0.7039
2023-10-17 18:08:57,222 saving best model
2023-10-17 18:08:57,674 ----------------------------------------------------------------------------------------------------
2023-10-17 18:08:59,473 epoch 3 - iter 29/292 - loss 0.10343379 - time (sec): 1.79 - samples/sec: 2569.81 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:09:01,112 epoch 3 - iter 58/292 - loss 0.10878038 - time (sec): 3.43 - samples/sec: 2664.32 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:09:02,742 epoch 3 - iter 87/292 - loss 0.11194346 - time (sec): 5.06 - samples/sec: 2556.99 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:09:04,578 epoch 3 - iter 116/292 - loss 0.10777722 - time (sec): 6.90 - samples/sec: 2630.96 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:09:06,197 epoch 3 - iter 145/292 - loss 0.10578385 - time (sec): 8.52 - samples/sec: 2640.27 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:09:07,680 epoch 3 - iter 174/292 - loss 0.10165213 - time (sec): 10.00 - samples/sec: 2607.74 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:09:09,448 epoch 3 - iter 203/292 - loss 0.10049750 - time (sec): 11.77 - samples/sec: 2656.01 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:09:11,008 epoch 3 - iter 232/292 - loss 0.10804139 - time (sec): 13.33 - samples/sec: 2645.16 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:09:12,756 epoch 3 - iter 261/292 - loss 0.11308089 - time (sec): 15.08 - samples/sec: 2670.35 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:09:14,328 epoch 3 - iter 290/292 - loss 0.11125189 - time (sec): 16.65 - samples/sec: 2657.12 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:09:14,420 ----------------------------------------------------------------------------------------------------
2023-10-17 18:09:14,420 EPOCH 3 done: loss 0.1109 - lr: 0.000023
2023-10-17 18:09:15,677 DEV : loss 0.1368805170059204 - f1-score (micro avg) 0.7664
2023-10-17 18:09:15,683 saving best model
2023-10-17 18:09:16,127 ----------------------------------------------------------------------------------------------------
2023-10-17 18:09:17,772 epoch 4 - iter 29/292 - loss 0.07602933 - time (sec): 1.64 - samples/sec: 2710.92 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:09:19,506 epoch 4 - iter 58/292 - loss 0.06899111 - time (sec): 3.38 - samples/sec: 2704.75 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:09:21,191 epoch 4 - iter 87/292 - loss 0.06879207 - time (sec): 5.06 - samples/sec: 2690.07 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:09:22,832 epoch 4 - iter 116/292 - loss 0.06850169 - time (sec): 6.70 - samples/sec: 2615.43 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:09:24,556 epoch 4 - iter 145/292 - loss 0.06645933 - time (sec): 8.43 - samples/sec: 2638.53 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:09:26,057 epoch 4 - iter 174/292 - loss 0.06576952 - time (sec): 9.93 - samples/sec: 2611.46 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:09:27,849 epoch 4 - iter 203/292 - loss 0.07071067 - time (sec): 11.72 - samples/sec: 2638.15 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:09:29,650 epoch 4 - iter 232/292 - loss 0.07305983 - time (sec): 13.52 - samples/sec: 2644.58 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:09:31,248 epoch 4 - iter 261/292 - loss 0.07376713 - time (sec): 15.12 - samples/sec: 2625.02 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:09:32,875 epoch 4 - iter 290/292 - loss 0.07138872 - time (sec): 16.75 - samples/sec: 2643.47 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:09:32,964 ----------------------------------------------------------------------------------------------------
2023-10-17 18:09:32,964 EPOCH 4 done: loss 0.0712 - lr: 0.000020
2023-10-17 18:09:34,213 DEV : loss 0.10807133466005325 - f1-score (micro avg) 0.7623
2023-10-17 18:09:34,222 ----------------------------------------------------------------------------------------------------
2023-10-17 18:09:35,831 epoch 5 - iter 29/292 - loss 0.05834853 - time (sec): 1.61 - samples/sec: 2385.75 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:09:37,492 epoch 5 - iter 58/292 - loss 0.05588902 - time (sec): 3.27 - samples/sec: 2556.16 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:09:39,195 epoch 5 - iter 87/292 - loss 0.04660830 - time (sec): 4.97 - samples/sec: 2644.73 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:09:40,748 epoch 5 - iter 116/292 - loss 0.04283946 - time (sec): 6.52 - samples/sec: 2595.41 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:09:42,451 epoch 5 - iter 145/292 - loss 0.04946763 - time (sec): 8.23 - samples/sec: 2632.15 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:09:44,177 epoch 5 - iter 174/292 - loss 0.05168536 - time (sec): 9.95 - samples/sec: 2635.84 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:09:45,728 epoch 5 - iter 203/292 - loss 0.05007307 - time (sec): 11.50 - samples/sec: 2659.48 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:09:47,189 epoch 5 - iter 232/292 - loss 0.05154864 - time (sec): 12.97 - samples/sec: 2627.91 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:09:48,832 epoch 5 - iter 261/292 - loss 0.05190958 - time (sec): 14.61 - samples/sec: 2642.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:09:50,727 epoch 5 - iter 290/292 - loss 0.04837098 - time (sec): 16.50 - samples/sec: 2671.02 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:09:50,865 ----------------------------------------------------------------------------------------------------
2023-10-17 18:09:50,866 EPOCH 5 done: loss 0.0480 - lr: 0.000017
2023-10-17 18:09:52,174 DEV : loss 0.14370526373386383 - f1-score (micro avg) 0.7589
2023-10-17 18:09:52,182 ----------------------------------------------------------------------------------------------------
2023-10-17 18:09:53,762 epoch 6 - iter 29/292 - loss 0.02581687 - time (sec): 1.58 - samples/sec: 2547.80 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:09:55,422 epoch 6 - iter 58/292 - loss 0.02827170 - time (sec): 3.24 - samples/sec: 2570.39 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:09:57,156 epoch 6 - iter 87/292 - loss 0.02844652 - time (sec): 4.97 - samples/sec: 2620.17 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:09:58,882 epoch 6 - iter 116/292 - loss 0.03172945 - time (sec): 6.70 - samples/sec: 2642.15 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:10:00,513 epoch 6 - iter 145/292 - loss 0.03178169 - time (sec): 8.33 - samples/sec: 2614.11 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:10:02,214 epoch 6 - iter 174/292 - loss 0.03164753 - time (sec): 10.03 - samples/sec: 2592.78 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:10:03,769 epoch 6 - iter 203/292 - loss 0.03087760 - time (sec): 11.59 - samples/sec: 2624.66 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:10:05,645 epoch 6 - iter 232/292 - loss 0.03350463 - time (sec): 13.46 - samples/sec: 2650.21 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:10:07,312 epoch 6 - iter 261/292 - loss 0.03346027 - time (sec): 15.13 - samples/sec: 2651.04 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:10:08,915 epoch 6 - iter 290/292 - loss 0.03440584 - time (sec): 16.73 - samples/sec: 2648.35 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:10:09,006 ----------------------------------------------------------------------------------------------------
2023-10-17 18:10:09,007 EPOCH 6 done: loss 0.0343 - lr: 0.000013
2023-10-17 18:10:10,343 DEV : loss 0.14083223044872284 - f1-score (micro avg) 0.7277
2023-10-17 18:10:10,348 ----------------------------------------------------------------------------------------------------
2023-10-17 18:10:12,061 epoch 7 - iter 29/292 - loss 0.02684070 - time (sec): 1.71 - samples/sec: 2449.55 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:10:13,716 epoch 7 - iter 58/292 - loss 0.02804202 - time (sec): 3.37 - samples/sec: 2601.88 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:10:15,396 epoch 7 - iter 87/292 - loss 0.02728816 - time (sec): 5.05 - samples/sec: 2533.64 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:10:17,025 epoch 7 - iter 116/292 - loss 0.02959221 - time (sec): 6.68 - samples/sec: 2518.88 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:10:18,802 epoch 7 - iter 145/292 - loss 0.02635240 - time (sec): 8.45 - samples/sec: 2598.73 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:10:20,511 epoch 7 - iter 174/292 - loss 0.02532558 - time (sec): 10.16 - samples/sec: 2604.77 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:10:22,109 epoch 7 - iter 203/292 - loss 0.02582429 - time (sec): 11.76 - samples/sec: 2608.96 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:10:23,786 epoch 7 - iter 232/292 - loss 0.02531994 - time (sec): 13.44 - samples/sec: 2622.46 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:10:25,608 epoch 7 - iter 261/292 - loss 0.02453137 - time (sec): 15.26 - samples/sec: 2652.54 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:10:27,172 epoch 7 - iter 290/292 - loss 0.02590830 - time (sec): 16.82 - samples/sec: 2622.54 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:10:27,280 ----------------------------------------------------------------------------------------------------
2023-10-17 18:10:27,280 EPOCH 7 done: loss 0.0258 - lr: 0.000010
2023-10-17 18:10:28,530 DEV : loss 0.1593863070011139 - f1-score (micro avg) 0.755
2023-10-17 18:10:28,536 ----------------------------------------------------------------------------------------------------
2023-10-17 18:10:30,524 epoch 8 - iter 29/292 - loss 0.02520487 - time (sec): 1.99 - samples/sec: 2309.62 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:10:32,414 epoch 8 - iter 58/292 - loss 0.02786812 - time (sec): 3.88 - samples/sec: 2523.46 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:10:34,001 epoch 8 - iter 87/292 - loss 0.02098684 - time (sec): 5.46 - samples/sec: 2578.86 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:10:35,714 epoch 8 - iter 116/292 - loss 0.01947693 - time (sec): 7.18 - samples/sec: 2632.57 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:10:37,312 epoch 8 - iter 145/292 - loss 0.01741232 - time (sec): 8.77 - samples/sec: 2604.14 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:10:38,895 epoch 8 - iter 174/292 - loss 0.02044811 - time (sec): 10.36 - samples/sec: 2612.55 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:10:40,581 epoch 8 - iter 203/292 - loss 0.02099220 - time (sec): 12.04 - samples/sec: 2680.30 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:10:42,079 epoch 8 - iter 232/292 - loss 0.02151490 - time (sec): 13.54 - samples/sec: 2639.95 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:10:43,602 epoch 8 - iter 261/292 - loss 0.02058897 - time (sec): 15.07 - samples/sec: 2620.07 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:10:45,364 epoch 8 - iter 290/292 - loss 0.01989925 - time (sec): 16.83 - samples/sec: 2633.04 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:10:45,451 ----------------------------------------------------------------------------------------------------
2023-10-17 18:10:45,451 EPOCH 8 done: loss 0.0198 - lr: 0.000007
2023-10-17 18:10:46,749 DEV : loss 0.16249039769172668 - f1-score (micro avg) 0.7335
2023-10-17 18:10:46,755 ----------------------------------------------------------------------------------------------------
2023-10-17 18:10:48,382 epoch 9 - iter 29/292 - loss 0.00668975 - time (sec): 1.63 - samples/sec: 2507.95 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:10:50,068 epoch 9 - iter 58/292 - loss 0.01877431 - time (sec): 3.31 - samples/sec: 2774.90 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:10:51,767 epoch 9 - iter 87/292 - loss 0.01592873 - time (sec): 5.01 - samples/sec: 2794.63 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:10:53,379 epoch 9 - iter 116/292 - loss 0.01512380 - time (sec): 6.62 - samples/sec: 2752.74 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:10:55,042 epoch 9 - iter 145/292 - loss 0.01369025 - time (sec): 8.29 - samples/sec: 2723.12 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:10:56,664 epoch 9 - iter 174/292 - loss 0.01356662 - time (sec): 9.91 - samples/sec: 2658.39 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:10:58,319 epoch 9 - iter 203/292 - loss 0.01300736 - time (sec): 11.56 - samples/sec: 2644.31 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:11:00,072 epoch 9 - iter 232/292 - loss 0.01299119 - time (sec): 13.32 - samples/sec: 2661.66 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:11:01,819 epoch 9 - iter 261/292 - loss 0.01374651 - time (sec): 15.06 - samples/sec: 2629.91 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:11:03,765 epoch 9 - iter 290/292 - loss 0.01334354 - time (sec): 17.01 - samples/sec: 2605.18 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:11:03,860 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:03,860 EPOCH 9 done: loss 0.0134 - lr: 0.000003
2023-10-17 18:11:05,174 DEV : loss 0.16921038925647736 - f1-score (micro avg) 0.7516
2023-10-17 18:11:05,184 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:06,961 epoch 10 - iter 29/292 - loss 0.00890121 - time (sec): 1.78 - samples/sec: 2247.21 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:11:08,579 epoch 10 - iter 58/292 - loss 0.00743095 - time (sec): 3.39 - samples/sec: 2377.05 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:11:10,242 epoch 10 - iter 87/292 - loss 0.00759210 - time (sec): 5.06 - samples/sec: 2500.96 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:11:11,869 epoch 10 - iter 116/292 - loss 0.00820561 - time (sec): 6.68 - samples/sec: 2534.58 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:11:13,619 epoch 10 - iter 145/292 - loss 0.00773890 - time (sec): 8.43 - samples/sec: 2638.46 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:11:15,276 epoch 10 - iter 174/292 - loss 0.01031477 - time (sec): 10.09 - samples/sec: 2628.47 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:11:17,122 epoch 10 - iter 203/292 - loss 0.01109630 - time (sec): 11.94 - samples/sec: 2621.48 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:11:18,735 epoch 10 - iter 232/292 - loss 0.01091261 - time (sec): 13.55 - samples/sec: 2613.13 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:11:20,332 epoch 10 - iter 261/292 - loss 0.01138787 - time (sec): 15.15 - samples/sec: 2600.35 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:11:22,014 epoch 10 - iter 290/292 - loss 0.01147225 - time (sec): 16.83 - samples/sec: 2627.67 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:11:22,108 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:22,108 EPOCH 10 done: loss 0.0114 - lr: 0.000000
2023-10-17 18:11:23,353 DEV : loss 0.16777698695659637 - f1-score (micro avg) 0.7565
2023-10-17 18:11:23,699 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:23,700 Loading model from best epoch ...
2023-10-17 18:11:25,061 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
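Note: to use the resulting checkpoint for inference, a minimal sketch; the file name best-model.pt is taken from the log, and the Finnish example sentence is purely illustrative:

from flair.data import Sentence
from flair.models import SequenceTagger

# load the best checkpoint saved during this run
tagger = SequenceTagger.load("best-model.pt")

# illustrative Finnish sentence; any tokenized text works
sentence = Sentence("Helsingin Sanomat kertoo uutisia Suomesta .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span)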
2023-10-17 18:11:27,419
Results:
- F-score (micro) 0.7545
- F-score (macro) 0.6648
- Accuracy 0.6275
By class:
                 precision    recall  f1-score   support

          PER       0.8040    0.8132    0.8086       348
          LOC       0.7419    0.7931    0.7667       261
          ORG       0.3333    0.3462    0.3396        52
    HumanProd       0.7619    0.7273    0.7442        22

    micro avg       0.7422    0.7672    0.7545       683
    macro avg       0.6603    0.6699    0.6648       683
 weighted avg       0.7431    0.7672    0.7548       683
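Note: as a quick consistency check, the micro-averaged F-score is the harmonic mean of the micro precision and recall reported in the table:

# sanity check of the reported micro F1
p, r = 0.7422, 0.7672
print(round(2 * p * r / (p + r), 4))  # 0.7545, matching "F-score (micro)" above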
2023-10-17 18:11:27,419 ----------------------------------------------------------------------------------------------------