2023-10-17 18:11:54,463 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:11:54,464 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 18:11:54,464 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 Train: 1166 sentences
2023-10-17 18:11:54,464 (train_with_dev=False, train_with_test=False)
2023-10-17 18:11:54,464 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 Training Params:
2023-10-17 18:11:54,464 - learning_rate: "5e-05"
2023-10-17 18:11:54,464 - mini_batch_size: "4"
2023-10-17 18:11:54,464 - max_epochs: "10"
2023-10-17 18:11:54,464 - shuffle: "True"
2023-10-17 18:11:54,464 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 Plugins:
2023-10-17 18:11:54,464 - TensorboardLogger
2023-10-17 18:11:54,464 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:11:54,464 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:11:54,464 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:11:54,464 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,464 Computation:
2023-10-17 18:11:54,465 - compute on device: cuda:0
2023-10-17 18:11:54,465 - embedding storage: none
2023-10-17 18:11:54,465 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,465 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 18:11:54,465 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,465 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:54,465 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:11:56,232 epoch 1 - iter 29/292 - loss 3.72244560 - time (sec): 1.77 - samples/sec: 2929.45 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:11:57,841 epoch 1 - iter 58/292 - loss 2.87723673 - time (sec): 3.38 - samples/sec: 2783.65 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:11:59,491 epoch 1 - iter 87/292 - loss 2.16638363 - time (sec): 5.02 - samples/sec: 2686.42 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:12:01,276 epoch 1 - iter 116/292 - loss 1.73400770 - time (sec): 6.81 - samples/sec: 2752.02 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:12:03,016 epoch 1 - iter 145/292 - loss 1.44462307 - time (sec): 8.55 - samples/sec: 2790.43 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:12:04,649 epoch 1 - iter 174/292 - loss 1.27857058 - time (sec): 10.18 - samples/sec: 2777.00 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:12:06,182 epoch 1 - iter 203/292 - loss 1.16794464 - time (sec): 11.72 - samples/sec: 2765.04 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:12:07,673 epoch 1 - iter 232/292 - loss 1.07277868 - time (sec): 13.21 - samples/sec: 2731.07 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:12:09,189 epoch 1 - iter 261/292 - loss 1.00143201 - time (sec): 14.72 - samples/sec: 2697.24 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:12:10,973 epoch 1 - iter 290/292 - loss 0.92152928 - time (sec): 16.51 - samples/sec: 2684.11 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:12:11,060 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:11,060 EPOCH 1 done: loss 0.9198 - lr: 0.000049
2023-10-17 18:12:12,098 DEV : loss 0.1843765377998352 - f1-score (micro avg) 0.3702
2023-10-17 18:12:12,103 saving best model
2023-10-17 18:12:12,449 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:14,179 epoch 2 - iter 29/292 - loss 0.20142189 - time (sec): 1.73 - samples/sec: 2819.84 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:12:15,816 epoch 2 - iter 58/292 - loss 0.19284668 - time (sec): 3.37 - samples/sec: 2711.29 - lr: 0.000049 - momentum: 0.000000
2023-10-17 18:12:17,358 epoch 2 - iter 87/292 - loss 0.19371720 - time (sec): 4.91 - samples/sec: 2694.01 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:12:19,037 epoch 2 - iter 116/292 - loss 0.19402488 - time (sec): 6.59 - samples/sec: 2670.45 - lr: 0.000048 - momentum: 0.000000
2023-10-17 18:12:20,644 epoch 2 - iter 145/292 - loss 0.19150294 - time (sec): 8.19 - samples/sec: 2679.62 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:12:22,547 epoch 2 - iter 174/292 - loss 0.18668752 - time (sec): 10.10 - samples/sec: 2693.88 - lr: 0.000047 - momentum: 0.000000
2023-10-17 18:12:24,314 epoch 2 - iter 203/292 - loss 0.17585029 - time (sec): 11.86 - samples/sec: 2727.66 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:12:25,925 epoch 2 - iter 232/292 - loss 0.17976168 - time (sec): 13.47 - samples/sec: 2710.00 - lr: 0.000046 - momentum: 0.000000
2023-10-17 18:12:27,638 epoch 2 - iter 261/292 - loss 0.17722873 - time (sec): 15.19 - samples/sec: 2678.16 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:12:29,117 epoch 2 - iter 290/292 - loss 0.17283090 - time (sec): 16.67 - samples/sec: 2653.11 - lr: 0.000045 - momentum: 0.000000
2023-10-17 18:12:29,206 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:29,206 EPOCH 2 done: loss 0.1732 - lr: 0.000045
2023-10-17 18:12:30,464 DEV : loss 0.13461413979530334 - f1-score (micro avg) 0.6947
2023-10-17 18:12:30,472 saving best model
2023-10-17 18:12:30,975 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:32,778 epoch 3 - iter 29/292 - loss 0.09618809 - time (sec): 1.80 - samples/sec: 2562.17 - lr: 0.000044 - momentum: 0.000000
2023-10-17 18:12:34,376 epoch 3 - iter 58/292 - loss 0.10261956 - time (sec): 3.40 - samples/sec: 2692.25 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:12:35,945 epoch 3 - iter 87/292 - loss 0.10491458 - time (sec): 4.97 - samples/sec: 2606.49 - lr: 0.000043 - momentum: 0.000000
2023-10-17 18:12:37,720 epoch 3 - iter 116/292 - loss 0.09932210 - time (sec): 6.74 - samples/sec: 2692.42 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:12:39,353 epoch 3 - iter 145/292 - loss 0.09832985 - time (sec): 8.37 - samples/sec: 2685.39 - lr: 0.000042 - momentum: 0.000000
2023-10-17 18:12:40,829 epoch 3 - iter 174/292 - loss 0.09621778 - time (sec): 9.85 - samples/sec: 2647.67 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:12:42,621 epoch 3 - iter 203/292 - loss 0.09595474 - time (sec): 11.64 - samples/sec: 2684.96 - lr: 0.000041 - momentum: 0.000000
2023-10-17 18:12:44,216 epoch 3 - iter 232/292 - loss 0.10537357 - time (sec): 13.24 - samples/sec: 2663.50 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:12:45,975 epoch 3 - iter 261/292 - loss 0.10883278 - time (sec): 15.00 - samples/sec: 2684.69 - lr: 0.000040 - momentum: 0.000000
2023-10-17 18:12:47,552 epoch 3 - iter 290/292 - loss 0.10855203 - time (sec): 16.57 - samples/sec: 2669.18 - lr: 0.000039 - momentum: 0.000000
2023-10-17 18:12:47,643 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:47,643 EPOCH 3 done: loss 0.1085 - lr: 0.000039
2023-10-17 18:12:48,942 DEV : loss 0.1387384980916977 - f1-score (micro avg) 0.6935
2023-10-17 18:12:48,947 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:50,574 epoch 4 - iter 29/292 - loss 0.07413975 - time (sec): 1.63 - samples/sec: 2740.39 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:12:52,258 epoch 4 - iter 58/292 - loss 0.06064259 - time (sec): 3.31 - samples/sec: 2759.68 - lr: 0.000038 - momentum: 0.000000
2023-10-17 18:12:53,852 epoch 4 - iter 87/292 - loss 0.06381220 - time (sec): 4.90 - samples/sec: 2776.94 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:12:55,435 epoch 4 - iter 116/292 - loss 0.06394328 - time (sec): 6.49 - samples/sec: 2702.57 - lr: 0.000037 - momentum: 0.000000
2023-10-17 18:12:57,367 epoch 4 - iter 145/292 - loss 0.06459610 - time (sec): 8.42 - samples/sec: 2641.27 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:12:58,848 epoch 4 - iter 174/292 - loss 0.06399316 - time (sec): 9.90 - samples/sec: 2618.85 - lr: 0.000036 - momentum: 0.000000
2023-10-17 18:13:00,637 epoch 4 - iter 203/292 - loss 0.06982796 - time (sec): 11.69 - samples/sec: 2645.30 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:13:02,466 epoch 4 - iter 232/292 - loss 0.07008573 - time (sec): 13.52 - samples/sec: 2645.36 - lr: 0.000035 - momentum: 0.000000
2023-10-17 18:13:04,108 epoch 4 - iter 261/292 - loss 0.07002274 - time (sec): 15.16 - samples/sec: 2618.03 - lr: 0.000034 - momentum: 0.000000
2023-10-17 18:13:05,744 epoch 4 - iter 290/292 - loss 0.06785738 - time (sec): 16.80 - samples/sec: 2635.55 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:13:05,832 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:05,832 EPOCH 4 done: loss 0.0676 - lr: 0.000033
2023-10-17 18:13:07,093 DEV : loss 0.12671661376953125 - f1-score (micro avg) 0.7692
2023-10-17 18:13:07,098 saving best model
2023-10-17 18:13:07,538 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:09,105 epoch 5 - iter 29/292 - loss 0.05161666 - time (sec): 1.56 - samples/sec: 2451.61 - lr: 0.000033 - momentum: 0.000000
2023-10-17 18:13:10,738 epoch 5 - iter 58/292 - loss 0.04878953 - time (sec): 3.20 - samples/sec: 2613.08 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:13:12,483 epoch 5 - iter 87/292 - loss 0.04146877 - time (sec): 4.94 - samples/sec: 2660.09 - lr: 0.000032 - momentum: 0.000000
2023-10-17 18:13:14,013 epoch 5 - iter 116/292 - loss 0.03720400 - time (sec): 6.47 - samples/sec: 2616.06 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:13:15,709 epoch 5 - iter 145/292 - loss 0.04613955 - time (sec): 8.17 - samples/sec: 2650.67 - lr: 0.000031 - momentum: 0.000000
2023-10-17 18:13:17,484 epoch 5 - iter 174/292 - loss 0.04788267 - time (sec): 9.94 - samples/sec: 2638.20 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:13:19,056 epoch 5 - iter 203/292 - loss 0.04607835 - time (sec): 11.52 - samples/sec: 2656.76 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:13:20,498 epoch 5 - iter 232/292 - loss 0.04885133 - time (sec): 12.96 - samples/sec: 2629.38 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:13:22,140 epoch 5 - iter 261/292 - loss 0.04946117 - time (sec): 14.60 - samples/sec: 2644.39 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:13:24,065 epoch 5 - iter 290/292 - loss 0.05008020 - time (sec): 16.53 - samples/sec: 2667.43 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:13:24,200 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:24,201 EPOCH 5 done: loss 0.0497 - lr: 0.000028
2023-10-17 18:13:25,432 DEV : loss 0.13220328092575073 - f1-score (micro avg) 0.7615
2023-10-17 18:13:25,437 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:26,974 epoch 6 - iter 29/292 - loss 0.02271820 - time (sec): 1.54 - samples/sec: 2617.26 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:13:28,618 epoch 6 - iter 58/292 - loss 0.02683061 - time (sec): 3.18 - samples/sec: 2617.63 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:13:30,384 epoch 6 - iter 87/292 - loss 0.02927446 - time (sec): 4.95 - samples/sec: 2634.53 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:13:32,140 epoch 6 - iter 116/292 - loss 0.02925845 - time (sec): 6.70 - samples/sec: 2640.94 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:13:33,793 epoch 6 - iter 145/292 - loss 0.02975885 - time (sec): 8.36 - samples/sec: 2605.91 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:13:35,574 epoch 6 - iter 174/292 - loss 0.02942265 - time (sec): 10.14 - samples/sec: 2565.70 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:13:37,166 epoch 6 - iter 203/292 - loss 0.02904656 - time (sec): 11.73 - samples/sec: 2592.82 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:13:39,032 epoch 6 - iter 232/292 - loss 0.03367419 - time (sec): 13.59 - samples/sec: 2624.29 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:13:40,680 epoch 6 - iter 261/292 - loss 0.03362666 - time (sec): 15.24 - samples/sec: 2631.26 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:13:42,284 epoch 6 - iter 290/292 - loss 0.03352873 - time (sec): 16.85 - samples/sec: 2630.18 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:13:42,376 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:42,376 EPOCH 6 done: loss 0.0334 - lr: 0.000022
2023-10-17 18:13:43,648 DEV : loss 0.15977859497070312 - f1-score (micro avg) 0.7592
2023-10-17 18:13:43,653 ----------------------------------------------------------------------------------------------------
2023-10-17 18:13:45,352 epoch 7 - iter 29/292 - loss 0.02607234 - time (sec): 1.70 - samples/sec: 2468.52 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:13:46,952 epoch 7 - iter 58/292 - loss 0.02640207 - time (sec): 3.30 - samples/sec: 2655.79 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:13:48,604 epoch 7 - iter 87/292 - loss 0.03014010 - time (sec): 4.95 - samples/sec: 2582.80 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:13:50,236 epoch 7 - iter 116/292 - loss 0.03008604 - time (sec): 6.58 - samples/sec: 2554.63 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:13:51,999 epoch 7 - iter 145/292 - loss 0.02617809 - time (sec): 8.35 - samples/sec: 2632.22 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:13:53,689 epoch 7 - iter 174/292 - loss 0.02372276 - time (sec): 10.03 - samples/sec: 2637.70 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:13:55,281 epoch 7 - iter 203/292 - loss 0.02211949 - time (sec): 11.63 - samples/sec: 2638.78 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:13:56,931 epoch 7 - iter 232/292 - loss 0.02126649 - time (sec): 13.28 - samples/sec: 2653.80 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:13:58,709 epoch 7 - iter 261/292 - loss 0.02005502 - time (sec): 15.06 - samples/sec: 2688.35 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:14:00,269 epoch 7 - iter 290/292 - loss 0.02058177 - time (sec): 16.62 - samples/sec: 2655.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:14:00,379 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:00,379 EPOCH 7 done: loss 0.0205 - lr: 0.000017
2023-10-17 18:14:01,624 DEV : loss 0.15402135252952576 - f1-score (micro avg) 0.757
2023-10-17 18:14:01,629 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:03,436 epoch 8 - iter 29/292 - loss 0.01399093 - time (sec): 1.81 - samples/sec: 2540.88 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:14:05,294 epoch 8 - iter 58/292 - loss 0.01777831 - time (sec): 3.66 - samples/sec: 2669.75 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:14:06,888 epoch 8 - iter 87/292 - loss 0.01394238 - time (sec): 5.26 - samples/sec: 2679.88 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:14:08,636 epoch 8 - iter 116/292 - loss 0.01327971 - time (sec): 7.01 - samples/sec: 2696.56 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:14:10,244 epoch 8 - iter 145/292 - loss 0.01180690 - time (sec): 8.61 - samples/sec: 2652.66 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:14:11,804 epoch 8 - iter 174/292 - loss 0.01183391 - time (sec): 10.17 - samples/sec: 2659.75 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:14:13,500 epoch 8 - iter 203/292 - loss 0.01461138 - time (sec): 11.87 - samples/sec: 2719.37 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:14:14,988 epoch 8 - iter 232/292 - loss 0.01425436 - time (sec): 13.36 - samples/sec: 2676.13 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:14:16,530 epoch 8 - iter 261/292 - loss 0.01362277 - time (sec): 14.90 - samples/sec: 2649.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:14:18,276 epoch 8 - iter 290/292 - loss 0.01307128 - time (sec): 16.65 - samples/sec: 2661.65 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:14:18,366 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:18,366 EPOCH 8 done: loss 0.0130 - lr: 0.000011
2023-10-17 18:14:19,617 DEV : loss 0.1692199856042862 - f1-score (micro avg) 0.7725
2023-10-17 18:14:19,622 saving best model
2023-10-17 18:14:20,051 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:21,697 epoch 9 - iter 29/292 - loss 0.00569490 - time (sec): 1.64 - samples/sec: 2479.03 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:14:23,397 epoch 9 - iter 58/292 - loss 0.01617149 - time (sec): 3.34 - samples/sec: 2747.21 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:14:25,048 epoch 9 - iter 87/292 - loss 0.01390161 - time (sec): 5.00 - samples/sec: 2802.73 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:14:26,634 epoch 9 - iter 116/292 - loss 0.01213710 - time (sec): 6.58 - samples/sec: 2769.76 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:14:28,322 epoch 9 - iter 145/292 - loss 0.01099370 - time (sec): 8.27 - samples/sec: 2728.66 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:14:30,093 epoch 9 - iter 174/292 - loss 0.01120828 - time (sec): 10.04 - samples/sec: 2623.04 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:14:31,727 epoch 9 - iter 203/292 - loss 0.01044705 - time (sec): 11.67 - samples/sec: 2618.98 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:14:33,463 epoch 9 - iter 232/292 - loss 0.00985920 - time (sec): 13.41 - samples/sec: 2642.78 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:14:35,069 epoch 9 - iter 261/292 - loss 0.01125430 - time (sec): 15.02 - samples/sec: 2638.05 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:14:36,917 epoch 9 - iter 290/292 - loss 0.01153666 - time (sec): 16.86 - samples/sec: 2627.36 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:14:37,001 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:37,001 EPOCH 9 done: loss 0.0115 - lr: 0.000006
2023-10-17 18:14:38,248 DEV : loss 0.16192570328712463 - f1-score (micro avg) 0.7742
2023-10-17 18:14:38,253 saving best model
2023-10-17 18:14:38,648 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:40,380 epoch 10 - iter 29/292 - loss 0.00879055 - time (sec): 1.73 - samples/sec: 2309.25 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:14:41,997 epoch 10 - iter 58/292 - loss 0.00594428 - time (sec): 3.34 - samples/sec: 2412.00 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:14:43,671 epoch 10 - iter 87/292 - loss 0.00500674 - time (sec): 5.02 - samples/sec: 2520.21 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:14:45,276 epoch 10 - iter 116/292 - loss 0.00433942 - time (sec): 6.62 - samples/sec: 2557.56 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:14:47,004 epoch 10 - iter 145/292 - loss 0.00402439 - time (sec): 8.35 - samples/sec: 2664.56 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:14:48,651 epoch 10 - iter 174/292 - loss 0.00608918 - time (sec): 10.00 - samples/sec: 2652.86 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:14:50,466 epoch 10 - iter 203/292 - loss 0.00741514 - time (sec): 11.81 - samples/sec: 2648.87 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:14:52,070 epoch 10 - iter 232/292 - loss 0.00948936 - time (sec): 13.42 - samples/sec: 2638.99 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:14:53,673 epoch 10 - iter 261/292 - loss 0.00939826 - time (sec): 15.02 - samples/sec: 2622.28 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:14:55,352 epoch 10 - iter 290/292 - loss 0.00888394 - time (sec): 16.70 - samples/sec: 2648.16 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:14:55,443 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:55,443 EPOCH 10 done: loss 0.0088 - lr: 0.000000
2023-10-17 18:14:56,679 DEV : loss 0.16277405619621277 - f1-score (micro avg) 0.7692
2023-10-17 18:14:57,004 ----------------------------------------------------------------------------------------------------
2023-10-17 18:14:57,005 Loading model from best epoch ...
2023-10-17 18:14:58,412 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 18:15:00,795 Results:
- F-score (micro) 0.7612
- F-score (macro) 0.7118
- Accuracy 0.6356

By class:
              precision    recall  f1-score   support

         PER     0.8172    0.8477    0.8322       348
         LOC     0.6431    0.8352    0.7267       261
         ORG     0.4400    0.4231    0.4314        52
   HumanProd     0.9000    0.8182    0.8571        22

   micro avg     0.7182    0.8097    0.7612       683
   macro avg     0.7001    0.7311    0.7118       683
weighted avg     0.7246    0.8097    0.7621       683

2023-10-17 18:15:00,796 ----------------------------------------------------------------------------------------------------
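
The summary figures in the final results block can be cross-checked from the per-class rows. The following is a minimal sketch, not part of the original run: it assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro F1 as the unweighted mean of per-class F1, weighted F1 as the support-weighted mean), and the names `by_class` and `f1` are hypothetical helpers introduced here. Because the logged values are rounded to four decimals, the comparison uses a tolerance rather than exact equality.

```python
import math

# Per-class (precision, recall, f1, support), copied from the "By class" table in the log.
by_class = {
    "PER": (0.8172, 0.8477, 0.8322, 348),
    "LOC": (0.6431, 0.8352, 0.7267, 261),
    "ORG": (0.4400, 0.4231, 0.4314, 52),
    "HumanProd": (0.9000, 0.8182, 0.8571, 22),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Micro F1 recomputed from the precision and recall in the "micro avg" row.
micro_f1 = f1(0.7182, 0.8097)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(s for _, _, _, s in by_class.values())
weighted_f1 = sum(f * s for _, _, f, s in by_class.values()) / total_support

# 1166 training sentences at mini_batch_size 4 should match the 292 iterations per epoch.
iters_per_epoch = math.ceil(1166 / 4)

print(f"micro F1    ~ {micro_f1:.4f} (log: 0.7612)")
print(f"macro F1    ~ {macro_f1:.4f} (log: 0.7118)")
print(f"weighted F1 ~ {weighted_f1:.4f} (log: 0.7621)")
print(f"support     = {total_support} (log: 683)")
print(f"iterations  = {iters_per_epoch} (log: 292)")
```

Small differences in the last decimal place are expected, since the inputs are the already-rounded values printed in the log rather than the raw counts.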