2023-10-17 18:11:54,463 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:11:54,464 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 18:11:54,464 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 Train: 1166 sentences
2023-10-17 18:11:54,464 (train_with_dev=False, train_with_test=False)
2023-10-17 18:11:54,464 ----------------------------------------------------------------------------------------------------
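Note: the corpus block above is produced by Flair's built-in HIPE-2022 loader. A minimal loading sketch follows; it is not taken from this log, and the parameter names (dataset_name, language, add_document_separator) are assumptions based on the current flair.datasets API and the dataset path printed above.

# Hedged sketch: loading the HIPE-2022 newseye/fi split with Flair.
from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(
    dataset_name="newseye",       # sub-corpus, from the dataset path above
    language="fi",                # Finnish split, from the dataset path above
    add_document_separator=True,  # assumed from the ".../with_doc_seperator" directory
)
print(corpus)  # expected: 1166 train + 165 dev + 415 test sentences

# Label dictionary for the "ner" label type (the 17-tag dictionary printed
# near the end of this log).
label_dict = corpus.make_label_dictionary(label_type="ner")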
2023-10-17 18:11:54,464 Training Params:
2023-10-17 18:11:54,464 - learning_rate: "5e-05"
2023-10-17 18:11:54,464 - mini_batch_size: "4"
2023-10-17 18:11:54,464 - max_epochs: "10"
2023-10-17 18:11:54,464 - shuffle: "True"
2023-10-17 18:11:54,464 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 Plugins:
2023-10-17 18:11:54,464 - TensorboardLogger
2023-10-17 18:11:54,464 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:11:54,464 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:11:54,464 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:11:54,464 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 Computation:
2023-10-17 18:11:54,465 - compute on device: cuda:0
2023-10-17 18:11:54,465 - embedding storage: none
2023-10-17 18:11:54,465 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,465 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 18:11:54,465 ----------------------------------------------------------------------------------------------------
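Note: the architecture and hyperparameters printed above can be approximated with the short Flair sketch below. It is not the original training script. The backbone identifier is inferred from the model training base path ("hmteams/teams-base-historic-multilingual-discriminator"), and use_rnn/use_crf/reproject_embeddings are chosen to match the printed module list (only LockedDropout, a 768-to-17 Linear layer and CrossEntropyLoss on top of the embeddings). In recent Flair versions, trainer.fine_tune() configures the linear warmup schedule shown under Plugins by default; the TensorboardLogger plugin is omitted here.

# Hedged reproduction sketch; assumptions are noted inline.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus and label dictionary as in the corpus sketch above.
corpus = NER_HIPE_2022(dataset_name="newseye", language="fi", add_document_separator=True)
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # inferred from the base path
    layers="-1",               # "layers-1" in the base path: last transformer layer only
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,             # unused here, since no RNN is attached
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,               # "crfFalse" in the base path
    use_rnn=False,               # matches the printed modules (no LSTM)
    reproject_embeddings=False,  # keeps the final Linear at in_features=768
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "local-output-path",  # the run above used the long base path printed before this block
    learning_rate=5e-5,
    mini_batch_size=4,
    max_epochs=10,
)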
2023-10-17 18:11:54,465 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,465 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:11:56,232 epoch 1 - iter 29/292 - loss 3.72244560 - time (sec): 1.77 - samples/sec: 2929.45 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:11:57,841 epoch 1 - iter 58/292 - loss 2.87723673 - time (sec): 3.38 - samples/sec: 2783.65 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:11:59,491 epoch 1 - iter 87/292 - loss 2.16638363 - time (sec): 5.02 - samples/sec: 2686.42 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:12:01,276 epoch 1 - iter 116/292 - loss 1.73400770 - time (sec): 6.81 - samples/sec: 2752.02 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:12:03,016 epoch 1 - iter 145/292 - loss 1.44462307 - time (sec): 8.55 - samples/sec: 2790.43 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:12:04,649 epoch 1 - iter 174/292 - loss 1.27857058 - time (sec): 10.18 - samples/sec: 2777.00 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:12:06,182 epoch 1 - iter 203/292 - loss 1.16794464 - time (sec): 11.72 - samples/sec: 2765.04 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:12:07,673 epoch 1 - iter 232/292 - loss 1.07277868 - time (sec): 13.21 - samples/sec: 2731.07 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:12:09,189 epoch 1 - iter 261/292 - loss 1.00143201 - time (sec): 14.72 - samples/sec: 2697.24 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:12:10,973 epoch 1 - iter 290/292 - loss 0.92152928 - time (sec): 16.51 - samples/sec: 2684.11 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:12:11,060 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:11,060 EPOCH 1 done: loss 0.9198 - lr: 0.000049
2023-10-17 18:12:12,098 DEV : loss 0.1843765377998352 - f1-score (micro avg) 0.3702
2023-10-17 18:12:12,103 saving best model
2023-10-17 18:12:12,449 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:14,179 epoch 2 - iter 29/292 - loss 0.20142189 - time (sec): 1.73 - samples/sec: 2819.84 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:12:15,816 epoch 2 - iter 58/292 - loss 0.19284668 - time (sec): 3.37 - samples/sec: 2711.29 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:12:17,358 epoch 2 - iter 87/292 - loss 0.19371720 - time (sec): 4.91 - samples/sec: 2694.01 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:12:19,037 epoch 2 - iter 116/292 - loss 0.19402488 - time (sec): 6.59 - samples/sec: 2670.45 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:12:20,644 epoch 2 - iter 145/292 - loss 0.19150294 - time (sec): 8.19 - samples/sec: 2679.62 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:12:22,547 epoch 2 - iter 174/292 - loss 0.18668752 - time (sec): 10.10 - samples/sec: 2693.88 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:12:24,314 epoch 2 - iter 203/292 - loss 0.17585029 - time (sec): 11.86 - samples/sec: 2727.66 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:12:25,925 epoch 2 - iter 232/292 - loss 0.17976168 - time (sec): 13.47 - samples/sec: 2710.00 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:12:27,638 epoch 2 - iter 261/292 - loss 0.17722873 - time (sec): 15.19 - samples/sec: 2678.16 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:12:29,117 epoch 2 - iter 290/292 - loss 0.17283090 - time (sec): 16.67 - samples/sec: 2653.11 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:12:29,206 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:29,206 EPOCH 2 done: loss 0.1732 - lr: 0.000045
2023-10-17 18:12:30,464 DEV : loss 0.13461413979530334 - f1-score (micro avg) 0.6947
2023-10-17 18:12:30,472 saving best model
2023-10-17 18:12:30,975 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:32,778 epoch 3 - iter 29/292 - loss 0.09618809 - time (sec): 1.80 - samples/sec: 2562.17 - lr: 0.000044 - momentum: 0.000000
2023-10-17 18:12:34,376 epoch 3 - iter 58/292 - loss 0.10261956 - time (sec): 3.40 - samples/sec: 2692.25 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:12:35,945 epoch 3 - iter 87/292 - loss 0.10491458 - time (sec): 4.97 - samples/sec: 2606.49 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:12:37,720 epoch 3 - iter 116/292 - loss 0.09932210 - time (sec): 6.74 - samples/sec: 2692.42 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:12:39,353 epoch 3 - iter 145/292 - loss 0.09832985 - time (sec): 8.37 - samples/sec: 2685.39 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:12:40,829 epoch 3 - iter 174/292 - loss 0.09621778 - time (sec): 9.85 - samples/sec: 2647.67 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:12:42,621 epoch 3 - iter 203/292 - loss 0.09595474 - time (sec): 11.64 - samples/sec: 2684.96 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:12:44,216 epoch 3 - iter 232/292 - loss 0.10537357 - time (sec): 13.24 - samples/sec: 2663.50 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:12:45,975 epoch 3 - iter 261/292 - loss 0.10883278 - time (sec): 15.00 - samples/sec: 2684.69 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:12:47,552 epoch 3 - iter 290/292 - loss 0.10855203 - time (sec): 16.57 - samples/sec: 2669.18 - lr: 0.000039 - momentum: 0.000000
2023-10-17 18:12:47,643 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:47,643 EPOCH 3 done: loss 0.1085 - lr: 0.000039
2023-10-17 18:12:48,942 DEV : loss 0.1387384980916977 - f1-score (micro avg) 0.6935
2023-10-17 18:12:48,947 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:50,574 epoch 4 - iter 29/292 - loss 0.07413975 - time (sec): 1.63 - samples/sec: 2740.39 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:12:52,258 epoch 4 - iter 58/292 - loss 0.06064259 - time (sec): 3.31 - samples/sec: 2759.68 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:12:53,852 epoch 4 - iter 87/292 - loss 0.06381220 - time (sec): 4.90 - samples/sec: 2776.94 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:12:55,435 epoch 4 - iter 116/292 - loss 0.06394328 - time (sec): 6.49 - samples/sec: 2702.57 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:12:57,367 epoch 4 - iter 145/292 - loss 0.06459610 - time (sec): 8.42 - samples/sec: 2641.27 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:12:58,848 epoch 4 - iter 174/292 - loss 0.06399316 - time (sec): 9.90 - samples/sec: 2618.85 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:13:00,637 epoch 4 - iter 203/292 - loss 0.06982796 - time (sec): 11.69 - samples/sec: 2645.30 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:13:02,466 epoch 4 - iter 232/292 - loss 0.07008573 - time (sec): 13.52 - samples/sec: 2645.36 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:13:04,108 epoch 4 - iter 261/292 - loss 0.07002274 - time (sec): 15.16 - samples/sec: 2618.03 - lr: 0.000034 - momentum: 0.000000
2023-10-17 18:13:05,744 epoch 4 - iter 290/292 - loss 0.06785738 - time (sec): 16.80 - samples/sec: 2635.55 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:13:05,832 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:05,832 EPOCH 4 done: loss 0.0676 - lr: 0.000033
2023-10-17 18:13:07,093 DEV : loss 0.12671661376953125 - f1-score (micro avg) 0.7692
2023-10-17 18:13:07,098 saving best model
2023-10-17 18:13:07,538 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:09,105 epoch 5 - iter 29/292 - loss 0.05161666 - time (sec): 1.56 - samples/sec: 2451.61 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:13:10,738 epoch 5 - iter 58/292 - loss 0.04878953 - time (sec): 3.20 - samples/sec: 2613.08 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:13:12,483 epoch 5 - iter 87/292 - loss 0.04146877 - time (sec): 4.94 - samples/sec: 2660.09 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:13:14,013 epoch 5 - iter 116/292 - loss 0.03720400 - time (sec): 6.47 - samples/sec: 2616.06 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:13:15,709 epoch 5 - iter 145/292 - loss 0.04613955 - time (sec): 8.17 - samples/sec: 2650.67 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:13:17,484 epoch 5 - iter 174/292 - loss 0.04788267 - time (sec): 9.94 - samples/sec: 2638.20 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:13:19,056 epoch 5 - iter 203/292 - loss 0.04607835 - time (sec): 11.52 - samples/sec: 2656.76 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:13:20,498 epoch 5 - iter 232/292 - loss 0.04885133 - time (sec): 12.96 - samples/sec: 2629.38 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:13:22,140 epoch 5 - iter 261/292 - loss 0.04946117 - time (sec): 14.60 - samples/sec: 2644.39 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:13:24,065 epoch 5 - iter 290/292 - loss 0.05008020 - time (sec): 16.53 - samples/sec: 2667.43 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:13:24,200 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:24,201 EPOCH 5 done: loss 0.0497 - lr: 0.000028
2023-10-17 18:13:25,432 DEV : loss 0.13220328092575073 - f1-score (micro avg) 0.7615
2023-10-17 18:13:25,437 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:26,974 epoch 6 - iter 29/292 - loss 0.02271820 - time (sec): 1.54 - samples/sec: 2617.26 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:13:28,618 epoch 6 - iter 58/292 - loss 0.02683061 - time (sec): 3.18 - samples/sec: 2617.63 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:13:30,384 epoch 6 - iter 87/292 - loss 0.02927446 - time (sec): 4.95 - samples/sec: 2634.53 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:13:32,140 epoch 6 - iter 116/292 - loss 0.02925845 - time (sec): 6.70 - samples/sec: 2640.94 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:13:33,793 epoch 6 - iter 145/292 - loss 0.02975885 - time (sec): 8.36 - samples/sec: 2605.91 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:13:35,574 epoch 6 - iter 174/292 - loss 0.02942265 - time (sec): 10.14 - samples/sec: 2565.70 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:13:37,166 epoch 6 - iter 203/292 - loss 0.02904656 - time (sec): 11.73 - samples/sec: 2592.82 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:13:39,032 epoch 6 - iter 232/292 - loss 0.03367419 - time (sec): 13.59 - samples/sec: 2624.29 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:13:40,680 epoch 6 - iter 261/292 - loss 0.03362666 - time (sec): 15.24 - samples/sec: 2631.26 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:13:42,284 epoch 6 - iter 290/292 - loss 0.03352873 - time (sec): 16.85 - samples/sec: 2630.18 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:13:42,376 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:42,376 EPOCH 6 done: loss 0.0334 - lr: 0.000022
2023-10-17 18:13:43,648 DEV : loss 0.15977859497070312 - f1-score (micro avg) 0.7592
2023-10-17 18:13:43,653 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:45,352 epoch 7 - iter 29/292 - loss 0.02607234 - time (sec): 1.70 - samples/sec: 2468.52 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:13:46,952 epoch 7 - iter 58/292 - loss 0.02640207 - time (sec): 3.30 - samples/sec: 2655.79 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:13:48,604 epoch 7 - iter 87/292 - loss 0.03014010 - time (sec): 4.95 - samples/sec: 2582.80 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:13:50,236 epoch 7 - iter 116/292 - loss 0.03008604 - time (sec): 6.58 - samples/sec: 2554.63 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:13:51,999 epoch 7 - iter 145/292 - loss 0.02617809 - time (sec): 8.35 - samples/sec: 2632.22 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:13:53,689 epoch 7 - iter 174/292 - loss 0.02372276 - time (sec): 10.03 - samples/sec: 2637.70 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:13:55,281 epoch 7 - iter 203/292 - loss 0.02211949 - time (sec): 11.63 - samples/sec: 2638.78 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:13:56,931 epoch 7 - iter 232/292 - loss 0.02126649 - time (sec): 13.28 - samples/sec: 2653.80 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:13:58,709 epoch 7 - iter 261/292 - loss 0.02005502 - time (sec): 15.06 - samples/sec: 2688.35 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:14:00,269 epoch 7 - iter 290/292 - loss 0.02058177 - time (sec): 16.62 - samples/sec: 2655.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:14:00,379 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:00,379 EPOCH 7 done: loss 0.0205 - lr: 0.000017
2023-10-17 18:14:01,624 DEV : loss 0.15402135252952576 - f1-score (micro avg) 0.757
2023-10-17 18:14:01,629 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:03,436 epoch 8 - iter 29/292 - loss 0.01399093 - time (sec): 1.81 - samples/sec: 2540.88 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:14:05,294 epoch 8 - iter 58/292 - loss 0.01777831 - time (sec): 3.66 - samples/sec: 2669.75 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:14:06,888 epoch 8 - iter 87/292 - loss 0.01394238 - time (sec): 5.26 - samples/sec: 2679.88 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:14:08,636 epoch 8 - iter 116/292 - loss 0.01327971 - time (sec): 7.01 - samples/sec: 2696.56 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:14:10,244 epoch 8 - iter 145/292 - loss 0.01180690 - time (sec): 8.61 - samples/sec: 2652.66 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:14:11,804 epoch 8 - iter 174/292 - loss 0.01183391 - time (sec): 10.17 - samples/sec: 2659.75 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:14:13,500 epoch 8 - iter 203/292 - loss 0.01461138 - time (sec): 11.87 - samples/sec: 2719.37 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:14:14,988 epoch 8 - iter 232/292 - loss 0.01425436 - time (sec): 13.36 - samples/sec: 2676.13 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:14:16,530 epoch 8 - iter 261/292 - loss 0.01362277 - time (sec): 14.90 - samples/sec: 2649.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:14:18,276 epoch 8 - iter 290/292 - loss 0.01307128 - time (sec): 16.65 - samples/sec: 2661.65 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:14:18,366 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:18,366 EPOCH 8 done: loss 0.0130 - lr: 0.000011
2023-10-17 18:14:19,617 DEV : loss 0.1692199856042862 - f1-score (micro avg) 0.7725
2023-10-17 18:14:19,622 saving best model
2023-10-17 18:14:20,051 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:21,697 epoch 9 - iter 29/292 - loss 0.00569490 - time (sec): 1.64 - samples/sec: 2479.03 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:14:23,397 epoch 9 - iter 58/292 - loss 0.01617149 - time (sec): 3.34 - samples/sec: 2747.21 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:14:25,048 epoch 9 - iter 87/292 - loss 0.01390161 - time (sec): 5.00 - samples/sec: 2802.73 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:14:26,634 epoch 9 - iter 116/292 - loss 0.01213710 - time (sec): 6.58 - samples/sec: 2769.76 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:14:28,322 epoch 9 - iter 145/292 - loss 0.01099370 - time (sec): 8.27 - samples/sec: 2728.66 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:14:30,093 epoch 9 - iter 174/292 - loss 0.01120828 - time (sec): 10.04 - samples/sec: 2623.04 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:14:31,727 epoch 9 - iter 203/292 - loss 0.01044705 - time (sec): 11.67 - samples/sec: 2618.98 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:14:33,463 epoch 9 - iter 232/292 - loss 0.00985920 - time (sec): 13.41 - samples/sec: 2642.78 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:14:35,069 epoch 9 - iter 261/292 - loss 0.01125430 - time (sec): 15.02 - samples/sec: 2638.05 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:14:36,917 epoch 9 - iter 290/292 - loss 0.01153666 - time (sec): 16.86 - samples/sec: 2627.36 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:14:37,001 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:37,001 EPOCH 9 done: loss 0.0115 - lr: 0.000006
2023-10-17 18:14:38,248 DEV : loss 0.16192570328712463 - f1-score (micro avg) 0.7742
2023-10-17 18:14:38,253 saving best model
2023-10-17 18:14:38,648 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:40,380 epoch 10 - iter 29/292 - loss 0.00879055 - time (sec): 1.73 - samples/sec: 2309.25 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:14:41,997 epoch 10 - iter 58/292 - loss 0.00594428 - time (sec): 3.34 - samples/sec: 2412.00 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:14:43,671 epoch 10 - iter 87/292 - loss 0.00500674 - time (sec): 5.02 - samples/sec: 2520.21 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:14:45,276 epoch 10 - iter 116/292 - loss 0.00433942 - time (sec): 6.62 - samples/sec: 2557.56 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:14:47,004 epoch 10 - iter 145/292 - loss 0.00402439 - time (sec): 8.35 - samples/sec: 2664.56 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:14:48,651 epoch 10 - iter 174/292 - loss 0.00608918 - time (sec): 10.00 - samples/sec: 2652.86 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:14:50,466 epoch 10 - iter 203/292 - loss 0.00741514 - time (sec): 11.81 - samples/sec: 2648.87 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:14:52,070 epoch 10 - iter 232/292 - loss 0.00948936 - time (sec): 13.42 - samples/sec: 2638.99 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:14:53,673 epoch 10 - iter 261/292 - loss 0.00939826 - time (sec): 15.02 - samples/sec: 2622.28 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:14:55,352 epoch 10 - iter 290/292 - loss 0.00888394 - time (sec): 16.70 - samples/sec: 2648.16 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:14:55,443 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:55,443 EPOCH 10 done: loss 0.0088 - lr: 0.000000
2023-10-17 18:14:56,679 DEV : loss 0.16277405619621277 - f1-score (micro avg) 0.7692
2023-10-17 18:14:57,004 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:57,005 Loading model from best epoch ...
2023-10-17 18:14:58,412 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 18:15:00,795
Results:
- F-score (micro) 0.7612
- F-score (macro) 0.7118
- Accuracy 0.6356
By class:
              precision    recall  f1-score   support

         PER     0.8172    0.8477    0.8322       348
         LOC     0.6431    0.8352    0.7267       261
         ORG     0.4400    0.4231    0.4314        52
   HumanProd     0.9000    0.8182    0.8571        22

   micro avg     0.7182    0.8097    0.7612       683
   macro avg     0.7001    0.7311    0.7118       683
weighted avg     0.7246    0.8097    0.7621       683
2023-10-17 18:15:00,796 ----------------------------------------------------------------------------------------------------
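Note: the scores above are computed with the checkpoint saved as best-model.pt, which is re-loaded before the final evaluation. The usage sketch below shows how such a saved tagger can be applied; the example sentence is purely illustrative and the "ner" label type is an assumption.

# Hedged usage sketch for the saved checkpoint.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("best-model.pt")

sentence = Sentence("Helsingin kaupunginvaltuusto kokoontui eilen.")  # illustrative example
tagger.predict(sentence)

for span in sentence.get_spans("ner"):  # label type assumed to be "ner"
    print(span)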