2023-10-17 18:22:08,334 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:08,335 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:22:08,336 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:08,336 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 18:22:08,336 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:08,336 Train: 1166 sentences
2023-10-17 18:22:08,336 (train_with_dev=False, train_with_test=False)
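The corpus counts above come from the Finnish NewsEye subset of HIPE-2022 (v2.1, with document separators, as the dataset path indicates). A sketch of loading it with Flair's built-in loader is given below; the exact keyword names, in particular the document-separator switch, are assumptions and should be checked against the Flair version used.

from flair.datasets import NER_HIPE_2022

# Finnish NewsEye subset of HIPE-2022; the version (v2.1) and document-separator
# handling follow the dataset path above (parameter names are assumptions).
corpus = NER_HIPE_2022(
    dataset_name="newseye",
    language="fi",
    add_document_separator=True,
)
print(corpus)  # expected: 1166 train + 165 dev + 415 test sentences

# Label dictionary over the entity types (LOC, PER, ORG, HumanProd); the tagger
# expands it into the 17 BIOES tags listed later in this log.
label_dict = corpus.make_label_dictionary(label_type="ner")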
2023-10-17 18:22:08,336 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:08,336 Training Params:
2023-10-17 18:22:08,336 - learning_rate: "3e-05"
2023-10-17 18:22:08,336 - mini_batch_size: "4"
2023-10-17 18:22:08,336 - max_epochs: "10"
2023-10-17 18:22:08,336 - shuffle: "True"
2023-10-17 18:22:08,336 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:08,336 Plugins:
2023-10-17 18:22:08,336 - TensorboardLogger
2023-10-17 18:22:08,336 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:22:08,336 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:08,336 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:22:08,336 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:22:08,336 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:08,336 Computation:
2023-10-17 18:22:08,336 - compute on device: cuda:0
2023-10-17 18:22:08,336 - embedding storage: none
2023-10-17 18:22:08,337 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:08,337 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
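The hyperparameters, scheduler, and base path above correspond to a standard Flair fine-tuning run. A sketch is shown below, reusing corpus and tagger from the sketches above; Flair's fine_tune applies a linear learning-rate schedule with warmup by default, and the TensorBoard logging seen here would be attached through a trainer plugin in the original script (the exact plugin wiring is not stated in this log, so it is omitted).

from flair.trainers import ModelTrainer

# tagger and corpus as in the sketches above.
trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=3e-05,  # "learning_rate: 3e-05"
    mini_batch_size=4,    # "mini_batch_size: 4"
    max_epochs=10,        # "max_epochs: 10"
)
# The epoch with the best dev micro F1 is saved as best-model.pt under the base path.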
2023-10-17 18:22:08,337 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:08,337 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:08,337 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:22:10,053 epoch 1 - iter 29/292 - loss 3.29075352 - time (sec): 1.71 - samples/sec: 2529.78 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:22:11,718 epoch 1 - iter 58/292 - loss 2.81746381 - time (sec): 3.38 - samples/sec: 2685.26 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:22:13,288 epoch 1 - iter 87/292 - loss 2.22634437 - time (sec): 4.95 - samples/sec: 2663.13 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:22:14,926 epoch 1 - iter 116/292 - loss 1.80283354 - time (sec): 6.59 - samples/sec: 2663.05 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:22:16,496 epoch 1 - iter 145/292 - loss 1.56272639 - time (sec): 8.16 - samples/sec: 2641.34 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:22:18,110 epoch 1 - iter 174/292 - loss 1.38891179 - time (sec): 9.77 - samples/sec: 2642.90 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:22:20,065 epoch 1 - iter 203/292 - loss 1.21331209 - time (sec): 11.73 - samples/sec: 2676.70 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:22:21,688 epoch 1 - iter 232/292 - loss 1.09882571 - time (sec): 13.35 - samples/sec: 2669.49 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:22:23,225 epoch 1 - iter 261/292 - loss 1.01402133 - time (sec): 14.89 - samples/sec: 2658.37 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:22:25,033 epoch 1 - iter 290/292 - loss 0.92983375 - time (sec): 16.70 - samples/sec: 2650.54 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:22:25,121 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:25,121 EPOCH 1 done: loss 0.9276 - lr: 0.000030
2023-10-17 18:22:26,443 DEV : loss 0.1775444895029068 - f1-score (micro avg) 0.5148
2023-10-17 18:22:26,450 saving best model
2023-10-17 18:22:26,868 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:28,503 epoch 2 - iter 29/292 - loss 0.19367025 - time (sec): 1.63 - samples/sec: 2651.22 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:22:30,294 epoch 2 - iter 58/292 - loss 0.28925208 - time (sec): 3.42 - samples/sec: 2797.41 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:22:31,929 epoch 2 - iter 87/292 - loss 0.25382216 - time (sec): 5.06 - samples/sec: 2779.74 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:22:33,588 epoch 2 - iter 116/292 - loss 0.23765017 - time (sec): 6.72 - samples/sec: 2748.33 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:22:35,478 epoch 2 - iter 145/292 - loss 0.22656271 - time (sec): 8.61 - samples/sec: 2709.18 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:22:37,097 epoch 2 - iter 174/292 - loss 0.21665063 - time (sec): 10.23 - samples/sec: 2637.37 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:22:38,621 epoch 2 - iter 203/292 - loss 0.20686273 - time (sec): 11.75 - samples/sec: 2643.01 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:22:40,328 epoch 2 - iter 232/292 - loss 0.20255896 - time (sec): 13.46 - samples/sec: 2662.00 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:22:41,991 epoch 2 - iter 261/292 - loss 0.19422242 - time (sec): 15.12 - samples/sec: 2642.25 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:22:43,690 epoch 2 - iter 290/292 - loss 0.19259531 - time (sec): 16.82 - samples/sec: 2623.71 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:22:43,786 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:43,786 EPOCH 2 done: loss 0.1923 - lr: 0.000027
2023-10-17 18:22:45,082 DEV : loss 0.11384982615709305 - f1-score (micro avg) 0.6875
2023-10-17 18:22:45,091 saving best model
2023-10-17 18:22:45,666 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:47,733 epoch 3 - iter 29/292 - loss 0.12609457 - time (sec): 2.07 - samples/sec: 2495.80 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:22:49,343 epoch 3 - iter 58/292 - loss 0.11403106 - time (sec): 3.67 - samples/sec: 2427.26 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:22:51,129 epoch 3 - iter 87/292 - loss 0.09945726 - time (sec): 5.46 - samples/sec: 2600.62 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:22:52,898 epoch 3 - iter 116/292 - loss 0.11604692 - time (sec): 7.23 - samples/sec: 2636.42 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:22:54,383 epoch 3 - iter 145/292 - loss 0.11717357 - time (sec): 8.71 - samples/sec: 2629.09 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:22:55,956 epoch 3 - iter 174/292 - loss 0.11808605 - time (sec): 10.29 - samples/sec: 2640.12 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:22:57,541 epoch 3 - iter 203/292 - loss 0.12234197 - time (sec): 11.87 - samples/sec: 2635.35 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:22:59,154 epoch 3 - iter 232/292 - loss 0.12383806 - time (sec): 13.49 - samples/sec: 2627.86 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:23:00,925 epoch 3 - iter 261/292 - loss 0.11671173 - time (sec): 15.26 - samples/sec: 2624.03 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:23:02,462 epoch 3 - iter 290/292 - loss 0.11564586 - time (sec): 16.79 - samples/sec: 2616.71 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:23:02,628 ----------------------------------------------------------------------------------------------------
2023-10-17 18:23:02,629 EPOCH 3 done: loss 0.1151 - lr: 0.000023
2023-10-17 18:23:03,924 DEV : loss 0.10733965039253235 - f1-score (micro avg) 0.7304
2023-10-17 18:23:03,930 saving best model
2023-10-17 18:23:04,410 ----------------------------------------------------------------------------------------------------
2023-10-17 18:23:06,273 epoch 4 - iter 29/292 - loss 0.06862492 - time (sec): 1.86 - samples/sec: 2819.48 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:23:08,027 epoch 4 - iter 58/292 - loss 0.06978255 - time (sec): 3.61 - samples/sec: 2660.66 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:23:09,753 epoch 4 - iter 87/292 - loss 0.06502896 - time (sec): 5.34 - samples/sec: 2518.04 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:23:11,418 epoch 4 - iter 116/292 - loss 0.06528685 - time (sec): 7.01 - samples/sec: 2534.77 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:23:13,378 epoch 4 - iter 145/292 - loss 0.06746468 - time (sec): 8.97 - samples/sec: 2496.87 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:23:15,096 epoch 4 - iter 174/292 - loss 0.06978266 - time (sec): 10.68 - samples/sec: 2485.01 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:23:16,777 epoch 4 - iter 203/292 - loss 0.07330462 - time (sec): 12.36 - samples/sec: 2502.00 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:23:18,682 epoch 4 - iter 232/292 - loss 0.07255344 - time (sec): 14.27 - samples/sec: 2519.53 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:23:20,390 epoch 4 - iter 261/292 - loss 0.07636325 - time (sec): 15.98 - samples/sec: 2496.08 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:23:22,272 epoch 4 - iter 290/292 - loss 0.07363746 - time (sec): 17.86 - samples/sec: 2475.07 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:23:22,372 ----------------------------------------------------------------------------------------------------
2023-10-17 18:23:22,372 EPOCH 4 done: loss 0.0737 - lr: 0.000020
2023-10-17 18:23:23,974 DEV : loss 0.12645508348941803 - f1-score (micro avg) 0.7176
2023-10-17 18:23:23,981 ----------------------------------------------------------------------------------------------------
2023-10-17 18:23:25,982 epoch 5 - iter 29/292 - loss 0.04528808 - time (sec): 2.00 - samples/sec: 2699.83 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:23:27,540 epoch 5 - iter 58/292 - loss 0.04526513 - time (sec): 3.56 - samples/sec: 2532.24 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:23:29,069 epoch 5 - iter 87/292 - loss 0.04063193 - time (sec): 5.09 - samples/sec: 2508.69 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:23:30,896 epoch 5 - iter 116/292 - loss 0.04159612 - time (sec): 6.91 - samples/sec: 2539.02 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:23:32,631 epoch 5 - iter 145/292 - loss 0.04703466 - time (sec): 8.65 - samples/sec: 2565.18 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:23:34,258 epoch 5 - iter 174/292 - loss 0.05258129 - time (sec): 10.28 - samples/sec: 2596.51 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:23:35,870 epoch 5 - iter 203/292 - loss 0.05073233 - time (sec): 11.89 - samples/sec: 2612.89 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:23:37,437 epoch 5 - iter 232/292 - loss 0.04940808 - time (sec): 13.46 - samples/sec: 2596.12 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:23:39,027 epoch 5 - iter 261/292 - loss 0.05129740 - time (sec): 15.05 - samples/sec: 2595.94 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:23:40,862 epoch 5 - iter 290/292 - loss 0.05371207 - time (sec): 16.88 - samples/sec: 2614.18 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:23:40,962 ----------------------------------------------------------------------------------------------------
2023-10-17 18:23:40,962 EPOCH 5 done: loss 0.0534 - lr: 0.000017
2023-10-17 18:23:42,232 DEV : loss 0.11940745264291763 - f1-score (micro avg) 0.7759
2023-10-17 18:23:42,239 saving best model
2023-10-17 18:23:42,727 ----------------------------------------------------------------------------------------------------
2023-10-17 18:23:44,728 epoch 6 - iter 29/292 - loss 0.06336152 - time (sec): 2.00 - samples/sec: 2809.12 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:23:46,222 epoch 6 - iter 58/292 - loss 0.04984604 - time (sec): 3.49 - samples/sec: 2664.89 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:23:48,037 epoch 6 - iter 87/292 - loss 0.04339346 - time (sec): 5.31 - samples/sec: 2646.12 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:23:49,752 epoch 6 - iter 116/292 - loss 0.04573025 - time (sec): 7.02 - samples/sec: 2690.52 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:23:51,307 epoch 6 - iter 145/292 - loss 0.04211581 - time (sec): 8.58 - samples/sec: 2648.93 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:23:53,077 epoch 6 - iter 174/292 - loss 0.04062692 - time (sec): 10.35 - samples/sec: 2636.85 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:23:54,688 epoch 6 - iter 203/292 - loss 0.03879493 - time (sec): 11.96 - samples/sec: 2612.05 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:23:56,342 epoch 6 - iter 232/292 - loss 0.04138779 - time (sec): 13.61 - samples/sec: 2617.60 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:23:57,903 epoch 6 - iter 261/292 - loss 0.03921456 - time (sec): 15.17 - samples/sec: 2635.69 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:23:59,507 epoch 6 - iter 290/292 - loss 0.03847237 - time (sec): 16.78 - samples/sec: 2629.93 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:23:59,620 ----------------------------------------------------------------------------------------------------
2023-10-17 18:23:59,620 EPOCH 6 done: loss 0.0383 - lr: 0.000013
2023-10-17 18:24:00,937 DEV : loss 0.1452915370464325 - f1-score (micro avg) 0.7638
2023-10-17 18:24:00,945 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:02,863 epoch 7 - iter 29/292 - loss 0.01800297 - time (sec): 1.92 - samples/sec: 2613.26 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:24:04,612 epoch 7 - iter 58/292 - loss 0.02732078 - time (sec): 3.67 - samples/sec: 2476.56 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:24:06,343 epoch 7 - iter 87/292 - loss 0.02475918 - time (sec): 5.40 - samples/sec: 2476.61 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:24:08,295 epoch 7 - iter 116/292 - loss 0.02385267 - time (sec): 7.35 - samples/sec: 2503.29 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:24:10,151 epoch 7 - iter 145/292 - loss 0.02314634 - time (sec): 9.20 - samples/sec: 2530.23 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:24:12,025 epoch 7 - iter 174/292 - loss 0.02813068 - time (sec): 11.08 - samples/sec: 2523.76 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:24:13,510 epoch 7 - iter 203/292 - loss 0.02705599 - time (sec): 12.56 - samples/sec: 2491.99 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:24:15,209 epoch 7 - iter 232/292 - loss 0.02818280 - time (sec): 14.26 - samples/sec: 2494.83 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:24:16,833 epoch 7 - iter 261/292 - loss 0.03043653 - time (sec): 15.89 - samples/sec: 2520.51 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:24:18,505 epoch 7 - iter 290/292 - loss 0.02929832 - time (sec): 17.56 - samples/sec: 2519.15 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:24:18,596 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:18,597 EPOCH 7 done: loss 0.0292 - lr: 0.000010
2023-10-17 18:24:19,886 DEV : loss 0.1461927443742752 - f1-score (micro avg) 0.7742
2023-10-17 18:24:19,891 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:21,646 epoch 8 - iter 29/292 - loss 0.03964611 - time (sec): 1.75 - samples/sec: 2658.25 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:24:23,335 epoch 8 - iter 58/292 - loss 0.03889202 - time (sec): 3.44 - samples/sec: 2646.02 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:24:25,092 epoch 8 - iter 87/292 - loss 0.02958935 - time (sec): 5.20 - samples/sec: 2789.28 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:24:26,913 epoch 8 - iter 116/292 - loss 0.02593733 - time (sec): 7.02 - samples/sec: 2775.65 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:24:28,631 epoch 8 - iter 145/292 - loss 0.02276293 - time (sec): 8.74 - samples/sec: 2748.87 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:24:30,309 epoch 8 - iter 174/292 - loss 0.01998816 - time (sec): 10.42 - samples/sec: 2708.65 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:24:31,954 epoch 8 - iter 203/292 - loss 0.01899104 - time (sec): 12.06 - samples/sec: 2670.48 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:24:33,562 epoch 8 - iter 232/292 - loss 0.01835548 - time (sec): 13.67 - samples/sec: 2619.08 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:24:35,287 epoch 8 - iter 261/292 - loss 0.01831061 - time (sec): 15.39 - samples/sec: 2611.58 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:24:37,075 epoch 8 - iter 290/292 - loss 0.02306213 - time (sec): 17.18 - samples/sec: 2580.44 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:24:37,178 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:37,179 EPOCH 8 done: loss 0.0230 - lr: 0.000007
2023-10-17 18:24:38,457 DEV : loss 0.16589583456516266 - f1-score (micro avg) 0.7576
2023-10-17 18:24:38,463 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:40,345 epoch 9 - iter 29/292 - loss 0.01221716 - time (sec): 1.88 - samples/sec: 2532.17 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:24:42,286 epoch 9 - iter 58/292 - loss 0.01392285 - time (sec): 3.82 - samples/sec: 2575.67 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:24:44,059 epoch 9 - iter 87/292 - loss 0.01599142 - time (sec): 5.59 - samples/sec: 2575.62 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:24:45,697 epoch 9 - iter 116/292 - loss 0.01608586 - time (sec): 7.23 - samples/sec: 2530.87 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:24:47,691 epoch 9 - iter 145/292 - loss 0.01384258 - time (sec): 9.23 - samples/sec: 2501.83 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:24:49,376 epoch 9 - iter 174/292 - loss 0.01495323 - time (sec): 10.91 - samples/sec: 2523.02 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:24:50,967 epoch 9 - iter 203/292 - loss 0.01795553 - time (sec): 12.50 - samples/sec: 2506.00 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:24:52,554 epoch 9 - iter 232/292 - loss 0.01783076 - time (sec): 14.09 - samples/sec: 2476.96 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:24:54,238 epoch 9 - iter 261/292 - loss 0.01670887 - time (sec): 15.77 - samples/sec: 2509.78 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:24:55,939 epoch 9 - iter 290/292 - loss 0.01876896 - time (sec): 17.48 - samples/sec: 2524.52 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:24:56,043 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:56,043 EPOCH 9 done: loss 0.0186 - lr: 0.000003
2023-10-17 18:24:57,544 DEV : loss 0.1677304357290268 - f1-score (micro avg) 0.7592
2023-10-17 18:24:57,549 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:59,366 epoch 10 - iter 29/292 - loss 0.00784158 - time (sec): 1.82 - samples/sec: 2728.00 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:25:01,066 epoch 10 - iter 58/292 - loss 0.02249789 - time (sec): 3.52 - samples/sec: 2635.67 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:25:02,622 epoch 10 - iter 87/292 - loss 0.02059791 - time (sec): 5.07 - samples/sec: 2559.88 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:25:04,353 epoch 10 - iter 116/292 - loss 0.02078976 - time (sec): 6.80 - samples/sec: 2614.57 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:25:06,078 epoch 10 - iter 145/292 - loss 0.01755985 - time (sec): 8.53 - samples/sec: 2630.15 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:25:07,773 epoch 10 - iter 174/292 - loss 0.01454950 - time (sec): 10.22 - samples/sec: 2670.86 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:25:09,340 epoch 10 - iter 203/292 - loss 0.01411050 - time (sec): 11.79 - samples/sec: 2663.23 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:25:10,842 epoch 10 - iter 232/292 - loss 0.01338063 - time (sec): 13.29 - samples/sec: 2664.45 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:25:12,449 epoch 10 - iter 261/292 - loss 0.01542384 - time (sec): 14.90 - samples/sec: 2666.82 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:25:14,092 epoch 10 - iter 290/292 - loss 0.01437749 - time (sec): 16.54 - samples/sec: 2670.08 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:25:14,205 ----------------------------------------------------------------------------------------------------
2023-10-17 18:25:14,205 EPOCH 10 done: loss 0.0149 - lr: 0.000000
2023-10-17 18:25:15,518 DEV : loss 0.16656455397605896 - f1-score (micro avg) 0.7686
2023-10-17 18:25:15,903 ----------------------------------------------------------------------------------------------------
2023-10-17 18:25:15,904 Loading model from best epoch ...
2023-10-17 18:25:18,148 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 18:25:21,021
Results:
- F-score (micro) 0.757
- F-score (macro) 0.6877
- Accuracy 0.6327
By class:
              precision    recall  f1-score   support

         PER     0.7801    0.8563    0.8164       348
         LOC     0.6565    0.8276    0.7322       261
         ORG     0.5116    0.4231    0.4632        52
   HumanProd     0.7083    0.7727    0.7391        22

   micro avg     0.7108    0.8097    0.7570       683
   macro avg     0.6642    0.7199    0.6877       683
weighted avg     0.7101    0.8097    0.7549       683
2023-10-17 18:25:21,021 ----------------------------------------------------------------------------------------------------
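For completeness, a minimal sketch of loading the saved best model and tagging a sentence; the checkpoint path assumes the base path used in this run, and the example sentence is purely illustrative.

from flair.data import Sentence
from flair.models import SequenceTagger

# Hypothetical local path: best-model.pt inside this run's base path.
tagger = SequenceTagger.load(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("Helsingin Sanomat kertoi asiasta eilen.")  # illustrative Finnish sentence
tagger.predict(sentence)

for label in sentence.get_labels("ner"):
    print(label)  # predicted PER / LOC / ORG / HumanProd spans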