|
2023-10-17 18:11:54,463 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:11:54,464 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 18:11:54,464 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:11:54,464 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-17 18:11:54,464 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:11:54,464 Train: 1166 sentences |
|
2023-10-17 18:11:54,464 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 18:11:54,464 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:11:54,464 Training Params: |
|
2023-10-17 18:11:54,464 - learning_rate: "5e-05" |
|
2023-10-17 18:11:54,464 - mini_batch_size: "4" |
|
2023-10-17 18:11:54,464 - max_epochs: "10" |
|
2023-10-17 18:11:54,464 - shuffle: "True" |
|
2023-10-17 18:11:54,464 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:11:54,464 Plugins: |
|
2023-10-17 18:11:54,464 - TensorboardLogger |
|
2023-10-17 18:11:54,464 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 18:11:54,464 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:11:54,464 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 18:11:54,464 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 18:11:54,464 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:11:54,464 Computation: |
|
2023-10-17 18:11:54,465 - compute on device: cuda:0 |
|
2023-10-17 18:11:54,465 - embedding storage: none |
|
2023-10-17 18:11:54,465 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:11:54,465 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-17 18:11:54,465 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:11:54,465 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:11:54,465 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 18:11:56,232 epoch 1 - iter 29/292 - loss 3.72244560 - time (sec): 1.77 - samples/sec: 2929.45 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:11:57,841 epoch 1 - iter 58/292 - loss 2.87723673 - time (sec): 3.38 - samples/sec: 2783.65 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:11:59,491 epoch 1 - iter 87/292 - loss 2.16638363 - time (sec): 5.02 - samples/sec: 2686.42 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:12:01,276 epoch 1 - iter 116/292 - loss 1.73400770 - time (sec): 6.81 - samples/sec: 2752.02 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:12:03,016 epoch 1 - iter 145/292 - loss 1.44462307 - time (sec): 8.55 - samples/sec: 2790.43 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:12:04,649 epoch 1 - iter 174/292 - loss 1.27857058 - time (sec): 10.18 - samples/sec: 2777.00 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 18:12:06,182 epoch 1 - iter 203/292 - loss 1.16794464 - time (sec): 11.72 - samples/sec: 2765.04 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 18:12:07,673 epoch 1 - iter 232/292 - loss 1.07277868 - time (sec): 13.21 - samples/sec: 2731.07 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 18:12:09,189 epoch 1 - iter 261/292 - loss 1.00143201 - time (sec): 14.72 - samples/sec: 2697.24 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 18:12:10,973 epoch 1 - iter 290/292 - loss 0.92152928 - time (sec): 16.51 - samples/sec: 2684.11 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 18:12:11,060 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:12:11,060 EPOCH 1 done: loss 0.9198 - lr: 0.000049 |
|
2023-10-17 18:12:12,098 DEV : loss 0.1843765377998352 - f1-score (micro avg) 0.3702 |
|
2023-10-17 18:12:12,103 saving best model |
|
2023-10-17 18:12:12,449 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:12:14,179 epoch 2 - iter 29/292 - loss 0.20142189 - time (sec): 1.73 - samples/sec: 2819.84 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 18:12:15,816 epoch 2 - iter 58/292 - loss 0.19284668 - time (sec): 3.37 - samples/sec: 2711.29 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 18:12:17,358 epoch 2 - iter 87/292 - loss 0.19371720 - time (sec): 4.91 - samples/sec: 2694.01 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 18:12:19,037 epoch 2 - iter 116/292 - loss 0.19402488 - time (sec): 6.59 - samples/sec: 2670.45 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 18:12:20,644 epoch 2 - iter 145/292 - loss 0.19150294 - time (sec): 8.19 - samples/sec: 2679.62 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 18:12:22,547 epoch 2 - iter 174/292 - loss 0.18668752 - time (sec): 10.10 - samples/sec: 2693.88 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 18:12:24,314 epoch 2 - iter 203/292 - loss 0.17585029 - time (sec): 11.86 - samples/sec: 2727.66 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 18:12:25,925 epoch 2 - iter 232/292 - loss 0.17976168 - time (sec): 13.47 - samples/sec: 2710.00 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 18:12:27,638 epoch 2 - iter 261/292 - loss 0.17722873 - time (sec): 15.19 - samples/sec: 2678.16 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 18:12:29,117 epoch 2 - iter 290/292 - loss 0.17283090 - time (sec): 16.67 - samples/sec: 2653.11 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 18:12:29,206 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:12:29,206 EPOCH 2 done: loss 0.1732 - lr: 0.000045 |
|
2023-10-17 18:12:30,464 DEV : loss 0.13461413979530334 - f1-score (micro avg) 0.6947 |
|
2023-10-17 18:12:30,472 saving best model |
|
2023-10-17 18:12:30,975 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:12:32,778 epoch 3 - iter 29/292 - loss 0.09618809 - time (sec): 1.80 - samples/sec: 2562.17 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 18:12:34,376 epoch 3 - iter 58/292 - loss 0.10261956 - time (sec): 3.40 - samples/sec: 2692.25 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 18:12:35,945 epoch 3 - iter 87/292 - loss 0.10491458 - time (sec): 4.97 - samples/sec: 2606.49 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 18:12:37,720 epoch 3 - iter 116/292 - loss 0.09932210 - time (sec): 6.74 - samples/sec: 2692.42 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 18:12:39,353 epoch 3 - iter 145/292 - loss 0.09832985 - time (sec): 8.37 - samples/sec: 2685.39 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 18:12:40,829 epoch 3 - iter 174/292 - loss 0.09621778 - time (sec): 9.85 - samples/sec: 2647.67 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 18:12:42,621 epoch 3 - iter 203/292 - loss 0.09595474 - time (sec): 11.64 - samples/sec: 2684.96 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 18:12:44,216 epoch 3 - iter 232/292 - loss 0.10537357 - time (sec): 13.24 - samples/sec: 2663.50 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 18:12:45,975 epoch 3 - iter 261/292 - loss 0.10883278 - time (sec): 15.00 - samples/sec: 2684.69 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 18:12:47,552 epoch 3 - iter 290/292 - loss 0.10855203 - time (sec): 16.57 - samples/sec: 2669.18 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 18:12:47,643 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:12:47,643 EPOCH 3 done: loss 0.1085 - lr: 0.000039 |
|
2023-10-17 18:12:48,942 DEV : loss 0.1387384980916977 - f1-score (micro avg) 0.6935 |
|
2023-10-17 18:12:48,947 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:12:50,574 epoch 4 - iter 29/292 - loss 0.07413975 - time (sec): 1.63 - samples/sec: 2740.39 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 18:12:52,258 epoch 4 - iter 58/292 - loss 0.06064259 - time (sec): 3.31 - samples/sec: 2759.68 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 18:12:53,852 epoch 4 - iter 87/292 - loss 0.06381220 - time (sec): 4.90 - samples/sec: 2776.94 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 18:12:55,435 epoch 4 - iter 116/292 - loss 0.06394328 - time (sec): 6.49 - samples/sec: 2702.57 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 18:12:57,367 epoch 4 - iter 145/292 - loss 0.06459610 - time (sec): 8.42 - samples/sec: 2641.27 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 18:12:58,848 epoch 4 - iter 174/292 - loss 0.06399316 - time (sec): 9.90 - samples/sec: 2618.85 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 18:13:00,637 epoch 4 - iter 203/292 - loss 0.06982796 - time (sec): 11.69 - samples/sec: 2645.30 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 18:13:02,466 epoch 4 - iter 232/292 - loss 0.07008573 - time (sec): 13.52 - samples/sec: 2645.36 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 18:13:04,108 epoch 4 - iter 261/292 - loss 0.07002274 - time (sec): 15.16 - samples/sec: 2618.03 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 18:13:05,744 epoch 4 - iter 290/292 - loss 0.06785738 - time (sec): 16.80 - samples/sec: 2635.55 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 18:13:05,832 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:13:05,832 EPOCH 4 done: loss 0.0676 - lr: 0.000033 |
|
2023-10-17 18:13:07,093 DEV : loss 0.12671661376953125 - f1-score (micro avg) 0.7692 |
|
2023-10-17 18:13:07,098 saving best model |
|
2023-10-17 18:13:07,538 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:13:09,105 epoch 5 - iter 29/292 - loss 0.05161666 - time (sec): 1.56 - samples/sec: 2451.61 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 18:13:10,738 epoch 5 - iter 58/292 - loss 0.04878953 - time (sec): 3.20 - samples/sec: 2613.08 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 18:13:12,483 epoch 5 - iter 87/292 - loss 0.04146877 - time (sec): 4.94 - samples/sec: 2660.09 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 18:13:14,013 epoch 5 - iter 116/292 - loss 0.03720400 - time (sec): 6.47 - samples/sec: 2616.06 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 18:13:15,709 epoch 5 - iter 145/292 - loss 0.04613955 - time (sec): 8.17 - samples/sec: 2650.67 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 18:13:17,484 epoch 5 - iter 174/292 - loss 0.04788267 - time (sec): 9.94 - samples/sec: 2638.20 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 18:13:19,056 epoch 5 - iter 203/292 - loss 0.04607835 - time (sec): 11.52 - samples/sec: 2656.76 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 18:13:20,498 epoch 5 - iter 232/292 - loss 0.04885133 - time (sec): 12.96 - samples/sec: 2629.38 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:13:22,140 epoch 5 - iter 261/292 - loss 0.04946117 - time (sec): 14.60 - samples/sec: 2644.39 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 18:13:24,065 epoch 5 - iter 290/292 - loss 0.05008020 - time (sec): 16.53 - samples/sec: 2667.43 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 18:13:24,200 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:13:24,201 EPOCH 5 done: loss 0.0497 - lr: 0.000028 |
|
2023-10-17 18:13:25,432 DEV : loss 0.13220328092575073 - f1-score (micro avg) 0.7615 |
|
2023-10-17 18:13:25,437 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:13:26,974 epoch 6 - iter 29/292 - loss 0.02271820 - time (sec): 1.54 - samples/sec: 2617.26 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 18:13:28,618 epoch 6 - iter 58/292 - loss 0.02683061 - time (sec): 3.18 - samples/sec: 2617.63 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 18:13:30,384 epoch 6 - iter 87/292 - loss 0.02927446 - time (sec): 4.95 - samples/sec: 2634.53 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:13:32,140 epoch 6 - iter 116/292 - loss 0.02925845 - time (sec): 6.70 - samples/sec: 2640.94 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:13:33,793 epoch 6 - iter 145/292 - loss 0.02975885 - time (sec): 8.36 - samples/sec: 2605.91 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:13:35,574 epoch 6 - iter 174/292 - loss 0.02942265 - time (sec): 10.14 - samples/sec: 2565.70 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:13:37,166 epoch 6 - iter 203/292 - loss 0.02904656 - time (sec): 11.73 - samples/sec: 2592.82 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:13:39,032 epoch 6 - iter 232/292 - loss 0.03367419 - time (sec): 13.59 - samples/sec: 2624.29 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 18:13:40,680 epoch 6 - iter 261/292 - loss 0.03362666 - time (sec): 15.24 - samples/sec: 2631.26 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 18:13:42,284 epoch 6 - iter 290/292 - loss 0.03352873 - time (sec): 16.85 - samples/sec: 2630.18 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 18:13:42,376 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:13:42,376 EPOCH 6 done: loss 0.0334 - lr: 0.000022 |
|
2023-10-17 18:13:43,648 DEV : loss 0.15977859497070312 - f1-score (micro avg) 0.7592 |
|
2023-10-17 18:13:43,653 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:13:45,352 epoch 7 - iter 29/292 - loss 0.02607234 - time (sec): 1.70 - samples/sec: 2468.52 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 18:13:46,952 epoch 7 - iter 58/292 - loss 0.02640207 - time (sec): 3.30 - samples/sec: 2655.79 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:13:48,604 epoch 7 - iter 87/292 - loss 0.03014010 - time (sec): 4.95 - samples/sec: 2582.80 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:13:50,236 epoch 7 - iter 116/292 - loss 0.03008604 - time (sec): 6.58 - samples/sec: 2554.63 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:13:51,999 epoch 7 - iter 145/292 - loss 0.02617809 - time (sec): 8.35 - samples/sec: 2632.22 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:13:53,689 epoch 7 - iter 174/292 - loss 0.02372276 - time (sec): 10.03 - samples/sec: 2637.70 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:13:55,281 epoch 7 - iter 203/292 - loss 0.02211949 - time (sec): 11.63 - samples/sec: 2638.78 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:13:56,931 epoch 7 - iter 232/292 - loss 0.02126649 - time (sec): 13.28 - samples/sec: 2653.80 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:13:58,709 epoch 7 - iter 261/292 - loss 0.02005502 - time (sec): 15.06 - samples/sec: 2688.35 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 18:14:00,269 epoch 7 - iter 290/292 - loss 0.02058177 - time (sec): 16.62 - samples/sec: 2655.10 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 18:14:00,379 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:14:00,379 EPOCH 7 done: loss 0.0205 - lr: 0.000017 |
|
2023-10-17 18:14:01,624 DEV : loss 0.15402135252952576 - f1-score (micro avg) 0.757 |
|
2023-10-17 18:14:01,629 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:14:03,436 epoch 8 - iter 29/292 - loss 0.01399093 - time (sec): 1.81 - samples/sec: 2540.88 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:14:05,294 epoch 8 - iter 58/292 - loss 0.01777831 - time (sec): 3.66 - samples/sec: 2669.75 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:14:06,888 epoch 8 - iter 87/292 - loss 0.01394238 - time (sec): 5.26 - samples/sec: 2679.88 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:14:08,636 epoch 8 - iter 116/292 - loss 0.01327971 - time (sec): 7.01 - samples/sec: 2696.56 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:14:10,244 epoch 8 - iter 145/292 - loss 0.01180690 - time (sec): 8.61 - samples/sec: 2652.66 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:14:11,804 epoch 8 - iter 174/292 - loss 0.01183391 - time (sec): 10.17 - samples/sec: 2659.75 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 18:14:13,500 epoch 8 - iter 203/292 - loss 0.01461138 - time (sec): 11.87 - samples/sec: 2719.37 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 18:14:14,988 epoch 8 - iter 232/292 - loss 0.01425436 - time (sec): 13.36 - samples/sec: 2676.13 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:14:16,530 epoch 8 - iter 261/292 - loss 0.01362277 - time (sec): 14.90 - samples/sec: 2649.15 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:14:18,276 epoch 8 - iter 290/292 - loss 0.01307128 - time (sec): 16.65 - samples/sec: 2661.65 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:14:18,366 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:14:18,366 EPOCH 8 done: loss 0.0130 - lr: 0.000011 |
|
2023-10-17 18:14:19,617 DEV : loss 0.1692199856042862 - f1-score (micro avg) 0.7725 |
|
2023-10-17 18:14:19,622 saving best model |
|
2023-10-17 18:14:20,051 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:14:21,697 epoch 9 - iter 29/292 - loss 0.00569490 - time (sec): 1.64 - samples/sec: 2479.03 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:14:23,397 epoch 9 - iter 58/292 - loss 0.01617149 - time (sec): 3.34 - samples/sec: 2747.21 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:14:25,048 epoch 9 - iter 87/292 - loss 0.01390161 - time (sec): 5.00 - samples/sec: 2802.73 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:14:26,634 epoch 9 - iter 116/292 - loss 0.01213710 - time (sec): 6.58 - samples/sec: 2769.76 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:14:28,322 epoch 9 - iter 145/292 - loss 0.01099370 - time (sec): 8.27 - samples/sec: 2728.66 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:14:30,093 epoch 9 - iter 174/292 - loss 0.01120828 - time (sec): 10.04 - samples/sec: 2623.04 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:14:31,727 epoch 9 - iter 203/292 - loss 0.01044705 - time (sec): 11.67 - samples/sec: 2618.98 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 18:14:33,463 epoch 9 - iter 232/292 - loss 0.00985920 - time (sec): 13.41 - samples/sec: 2642.78 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 18:14:35,069 epoch 9 - iter 261/292 - loss 0.01125430 - time (sec): 15.02 - samples/sec: 2638.05 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:14:36,917 epoch 9 - iter 290/292 - loss 0.01153666 - time (sec): 16.86 - samples/sec: 2627.36 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:14:37,001 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:14:37,001 EPOCH 9 done: loss 0.0115 - lr: 0.000006 |
|
2023-10-17 18:14:38,248 DEV : loss 0.16192570328712463 - f1-score (micro avg) 0.7742 |
|
2023-10-17 18:14:38,253 saving best model |
|
2023-10-17 18:14:38,648 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:14:40,380 epoch 10 - iter 29/292 - loss 0.00879055 - time (sec): 1.73 - samples/sec: 2309.25 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:14:41,997 epoch 10 - iter 58/292 - loss 0.00594428 - time (sec): 3.34 - samples/sec: 2412.00 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:14:43,671 epoch 10 - iter 87/292 - loss 0.00500674 - time (sec): 5.02 - samples/sec: 2520.21 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:14:45,276 epoch 10 - iter 116/292 - loss 0.00433942 - time (sec): 6.62 - samples/sec: 2557.56 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:14:47,004 epoch 10 - iter 145/292 - loss 0.00402439 - time (sec): 8.35 - samples/sec: 2664.56 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:14:48,651 epoch 10 - iter 174/292 - loss 0.00608918 - time (sec): 10.00 - samples/sec: 2652.86 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:14:50,466 epoch 10 - iter 203/292 - loss 0.00741514 - time (sec): 11.81 - samples/sec: 2648.87 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:14:52,070 epoch 10 - iter 232/292 - loss 0.00948936 - time (sec): 13.42 - samples/sec: 2638.99 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 18:14:53,673 epoch 10 - iter 261/292 - loss 0.00939826 - time (sec): 15.02 - samples/sec: 2622.28 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 18:14:55,352 epoch 10 - iter 290/292 - loss 0.00888394 - time (sec): 16.70 - samples/sec: 2648.16 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 18:14:55,443 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:14:55,443 EPOCH 10 done: loss 0.0088 - lr: 0.000000 |
|
2023-10-17 18:14:56,679 DEV : loss 0.16277405619621277 - f1-score (micro avg) 0.7692 |
|
2023-10-17 18:14:57,004 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:14:57,005 Loading model from best epoch ... |
|
2023-10-17 18:14:58,412 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-17 18:15:00,795 |
|
Results: |
|
- F-score (micro) 0.7612 |
|
- F-score (macro) 0.7118 |
|
- Accuracy 0.6356 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
         PER     0.8172    0.8477    0.8322       348 |
         LOC     0.6431    0.8352    0.7267       261 |
         ORG     0.4400    0.4231    0.4314        52 |
   HumanProd     0.9000    0.8182    0.8571        22 |
|
   micro avg     0.7182    0.8097    0.7612       683 |
   macro avg     0.7001    0.7311    0.7118       683 |
weighted avg     0.7246    0.8097    0.7621       683 |
|
|
|
2023-10-17 18:15:00,796 ---------------------------------------------------------------------------------------------------- |
|
|