2023-10-17 17:37:29,787 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,788 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
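The LockedDropout module above is Flair's variational dropout: it samples a single dropout mask per sequence and reuses that mask at every timestep, instead of resampling per timestep as standard dropout does. A minimal stdlib-only sketch of the idea (illustrative; `locked_dropout`, its list-of-lists input, and the seeding are assumptions, not Flair's implementation):

```python
import random

def locked_dropout(sequence, p=0.5, seed=None):
    """Variational ("locked") dropout: sample ONE mask over the feature
    dimension and reuse it at every timestep, scaling survivors by 1/(1-p).
    `sequence` is a list of timesteps, each a list of feature values."""
    rng = random.Random(seed)
    n_features = len(sequence[0])
    # One Bernoulli mask, shared by all timesteps (unlike standard dropout).
    mask = [0.0 if rng.random() < p else 1.0 / (1.0 - p)
            for _ in range(n_features)]
    return [[x * m for x, m in zip(step, mask)] for step in sequence]

seq = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
dropped = locked_dropout(seq, p=0.5, seed=0)
# A feature zeroed at one timestep is zeroed at every timestep.
```

Sharing the mask across time keeps the same features suppressed for the whole sentence, which regularizes recurrent/sequence models more consistently than independent per-timestep masks.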
|
2023-10-17 17:37:29,788 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,788 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 17:37:29,788 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,788 Train: 1166 sentences
2023-10-17 17:37:29,788 (train_with_dev=False, train_with_test=False)
2023-10-17 17:37:29,788 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,788 Training Params:
2023-10-17 17:37:29,788 - learning_rate: "5e-05"
2023-10-17 17:37:29,789 - mini_batch_size: "8"
2023-10-17 17:37:29,789 - max_epochs: "10"
2023-10-17 17:37:29,789 - shuffle: "True"
2023-10-17 17:37:29,789 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,789 Plugins:
2023-10-17 17:37:29,789 - TensorboardLogger
2023-10-17 17:37:29,789 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:37:29,789 ----------------------------------------------------------------------------------------------------
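The LinearScheduler plugin with warmup_fraction '0.1' ramps the learning rate linearly from zero up to the peak (5e-05) over the first 10% of training steps, then decays it linearly back to zero, which matches the shape of the per-iteration lr values logged below (146 iterations per epoch x 10 epochs). A hedged sketch of such a schedule (`linear_lr` is a hypothetical helper; Flair's exact step accounting may differ by a small offset):

```python
def linear_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay to zero over the remainder."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 146 * 10   # iterations per epoch x max_epochs, per the log below
peak = 5e-05
# Warmup spans roughly epoch 1; the lr then falls linearly, reaching ~0
# at the final iteration, consistent with the logged lr column.
```

With these numbers the warmup window is exactly one epoch (146 steps), which is why epoch 1 below logs a rising lr and every later epoch logs a slowly falling one.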
|
2023-10-17 17:37:29,789 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:37:29,789 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:37:29,789 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,789 Computation:
2023-10-17 17:37:29,789 - compute on device: cuda:0
2023-10-17 17:37:29,789 - embedding storage: none
2023-10-17 17:37:29,789 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,789 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 17:37:29,789 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,789 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:29,789 Logging anything other than scalars to TensorBoard is currently not supported.
|
2023-10-17 17:37:31,187 epoch 1 - iter 14/146 - loss 3.37573558 - time (sec): 1.40 - samples/sec: 2716.09 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:37:32,560 epoch 1 - iter 28/146 - loss 2.95919448 - time (sec): 2.77 - samples/sec: 2752.12 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:37:34,759 epoch 1 - iter 42/146 - loss 2.24791386 - time (sec): 4.97 - samples/sec: 2756.23 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:37:36,522 epoch 1 - iter 56/146 - loss 1.84282911 - time (sec): 6.73 - samples/sec: 2657.88 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:37:37,701 epoch 1 - iter 70/146 - loss 1.61728577 - time (sec): 7.91 - samples/sec: 2728.57 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:37:39,260 epoch 1 - iter 84/146 - loss 1.44152327 - time (sec): 9.47 - samples/sec: 2721.35 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:37:40,590 epoch 1 - iter 98/146 - loss 1.28259256 - time (sec): 10.80 - samples/sec: 2763.18 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:37:41,945 epoch 1 - iter 112/146 - loss 1.15921053 - time (sec): 12.16 - samples/sec: 2807.55 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:37:43,407 epoch 1 - iter 126/146 - loss 1.06002134 - time (sec): 13.62 - samples/sec: 2821.55 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:37:44,867 epoch 1 - iter 140/146 - loss 0.98907932 - time (sec): 15.08 - samples/sec: 2804.11 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:37:45,491 ----------------------------------------------------------------------------------------------------
2023-10-17 17:37:45,491 EPOCH 1 done: loss 0.9535 - lr: 0.000048
2023-10-17 17:37:46,312 DEV : loss 0.16792482137680054 - f1-score (micro avg) 0.5135
2023-10-17 17:37:46,317 saving best model
2023-10-17 17:37:46,657 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:37:48,271 epoch 2 - iter 14/146 - loss 0.21406715 - time (sec): 1.61 - samples/sec: 2819.73 - lr: 0.000050 - momentum: 0.000000
2023-10-17 17:37:49,772 epoch 2 - iter 28/146 - loss 0.19088152 - time (sec): 3.11 - samples/sec: 2721.56 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:37:51,118 epoch 2 - iter 42/146 - loss 0.18486880 - time (sec): 4.46 - samples/sec: 2842.05 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:37:52,987 epoch 2 - iter 56/146 - loss 0.17632972 - time (sec): 6.33 - samples/sec: 2819.25 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:37:54,311 epoch 2 - iter 70/146 - loss 0.17389479 - time (sec): 7.65 - samples/sec: 2888.93 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:37:55,626 epoch 2 - iter 84/146 - loss 0.17353218 - time (sec): 8.97 - samples/sec: 2995.75 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:37:56,773 epoch 2 - iter 98/146 - loss 0.17824868 - time (sec): 10.12 - samples/sec: 2997.07 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:37:57,986 epoch 2 - iter 112/146 - loss 0.18444188 - time (sec): 11.33 - samples/sec: 2983.30 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:37:59,584 epoch 2 - iter 126/146 - loss 0.18735041 - time (sec): 12.93 - samples/sec: 2976.20 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:38:01,174 epoch 2 - iter 140/146 - loss 0.18194918 - time (sec): 14.52 - samples/sec: 2946.22 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:38:01,804 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:01,804 EPOCH 2 done: loss 0.1799 - lr: 0.000045
2023-10-17 17:38:03,284 DEV : loss 0.1151837706565857 - f1-score (micro avg) 0.668
2023-10-17 17:38:03,289 saving best model
2023-10-17 17:38:03,764 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:38:05,411 epoch 3 - iter 14/146 - loss 0.11815419 - time (sec): 1.65 - samples/sec: 2724.25 - lr: 0.000044 - momentum: 0.000000
2023-10-17 17:38:06,872 epoch 3 - iter 28/146 - loss 0.11456933 - time (sec): 3.11 - samples/sec: 2816.48 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:38:08,324 epoch 3 - iter 42/146 - loss 0.12498663 - time (sec): 4.56 - samples/sec: 2901.73 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:38:09,615 epoch 3 - iter 56/146 - loss 0.11790534 - time (sec): 5.85 - samples/sec: 2942.75 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:38:10,964 epoch 3 - iter 70/146 - loss 0.11400928 - time (sec): 7.20 - samples/sec: 2934.56 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:38:12,331 epoch 3 - iter 84/146 - loss 0.10974986 - time (sec): 8.57 - samples/sec: 2940.41 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:38:13,931 epoch 3 - iter 98/146 - loss 0.10732321 - time (sec): 10.17 - samples/sec: 2945.09 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:38:15,528 epoch 3 - iter 112/146 - loss 0.10781926 - time (sec): 11.76 - samples/sec: 2947.88 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:38:16,806 epoch 3 - iter 126/146 - loss 0.10821794 - time (sec): 13.04 - samples/sec: 2952.99 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:38:18,148 epoch 3 - iter 140/146 - loss 0.10839025 - time (sec): 14.38 - samples/sec: 2930.90 - lr: 0.000039 - momentum: 0.000000
2023-10-17 17:38:18,954 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:18,954 EPOCH 3 done: loss 0.1088 - lr: 0.000039
2023-10-17 17:38:20,213 DEV : loss 0.10572662949562073 - f1-score (micro avg) 0.6921
2023-10-17 17:38:20,217 saving best model
2023-10-17 17:38:20,697 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:38:21,822 epoch 4 - iter 14/146 - loss 0.09358790 - time (sec): 1.12 - samples/sec: 3095.06 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:38:23,043 epoch 4 - iter 28/146 - loss 0.07984905 - time (sec): 2.34 - samples/sec: 3154.25 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:38:24,710 epoch 4 - iter 42/146 - loss 0.06872645 - time (sec): 4.00 - samples/sec: 3109.62 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:38:26,305 epoch 4 - iter 56/146 - loss 0.06929502 - time (sec): 5.60 - samples/sec: 2991.71 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:38:27,794 epoch 4 - iter 70/146 - loss 0.06615047 - time (sec): 7.09 - samples/sec: 2967.01 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:38:29,354 epoch 4 - iter 84/146 - loss 0.06341990 - time (sec): 8.65 - samples/sec: 2962.93 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:38:31,194 epoch 4 - iter 98/146 - loss 0.06994356 - time (sec): 10.49 - samples/sec: 2918.65 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:38:32,546 epoch 4 - iter 112/146 - loss 0.06799879 - time (sec): 11.84 - samples/sec: 2904.32 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:38:34,013 epoch 4 - iter 126/146 - loss 0.07029829 - time (sec): 13.31 - samples/sec: 2925.93 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:38:35,487 epoch 4 - iter 140/146 - loss 0.07084618 - time (sec): 14.78 - samples/sec: 2901.54 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:38:36,018 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:36,018 EPOCH 4 done: loss 0.0699 - lr: 0.000034
2023-10-17 17:38:37,313 DEV : loss 0.126682311296463 - f1-score (micro avg) 0.7511
2023-10-17 17:38:37,318 saving best model
2023-10-17 17:38:37,798 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:38:39,540 epoch 5 - iter 14/146 - loss 0.05401972 - time (sec): 1.74 - samples/sec: 2868.10 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:38:41,233 epoch 5 - iter 28/146 - loss 0.04019088 - time (sec): 3.43 - samples/sec: 2849.95 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:38:42,521 epoch 5 - iter 42/146 - loss 0.04999847 - time (sec): 4.72 - samples/sec: 2983.70 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:38:43,717 epoch 5 - iter 56/146 - loss 0.05432339 - time (sec): 5.92 - samples/sec: 2985.17 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:38:45,230 epoch 5 - iter 70/146 - loss 0.05670146 - time (sec): 7.43 - samples/sec: 2940.20 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:38:46,681 epoch 5 - iter 84/146 - loss 0.05548785 - time (sec): 8.88 - samples/sec: 2943.86 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:38:48,071 epoch 5 - iter 98/146 - loss 0.05219240 - time (sec): 10.27 - samples/sec: 2986.20 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:38:49,402 epoch 5 - iter 112/146 - loss 0.05188564 - time (sec): 11.60 - samples/sec: 2993.44 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:38:50,761 epoch 5 - iter 126/146 - loss 0.04945330 - time (sec): 12.96 - samples/sec: 2977.12 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:38:52,167 epoch 5 - iter 140/146 - loss 0.04807587 - time (sec): 14.37 - samples/sec: 2983.25 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:38:52,632 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:52,632 EPOCH 5 done: loss 0.0471 - lr: 0.000028
2023-10-17 17:38:53,948 DEV : loss 0.12283353507518768 - f1-score (micro avg) 0.764
2023-10-17 17:38:53,953 saving best model
2023-10-17 17:38:54,398 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:38:55,858 epoch 6 - iter 14/146 - loss 0.03681802 - time (sec): 1.46 - samples/sec: 2991.47 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:38:57,148 epoch 6 - iter 28/146 - loss 0.03075934 - time (sec): 2.75 - samples/sec: 2831.24 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:38:59,358 epoch 6 - iter 42/146 - loss 0.03043773 - time (sec): 4.96 - samples/sec: 2519.99 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:39:00,913 epoch 6 - iter 56/146 - loss 0.03076017 - time (sec): 6.51 - samples/sec: 2587.83 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:39:02,240 epoch 6 - iter 70/146 - loss 0.02960510 - time (sec): 7.84 - samples/sec: 2593.20 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:39:03,797 epoch 6 - iter 84/146 - loss 0.02886610 - time (sec): 9.40 - samples/sec: 2581.09 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:39:05,279 epoch 6 - iter 98/146 - loss 0.03043924 - time (sec): 10.88 - samples/sec: 2590.89 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:39:06,942 epoch 6 - iter 112/146 - loss 0.03186061 - time (sec): 12.54 - samples/sec: 2628.15 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:39:08,443 epoch 6 - iter 126/146 - loss 0.03215404 - time (sec): 14.04 - samples/sec: 2662.69 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:39:10,324 epoch 6 - iter 140/146 - loss 0.03176934 - time (sec): 15.92 - samples/sec: 2692.28 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:39:10,845 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:10,845 EPOCH 6 done: loss 0.0315 - lr: 0.000023
2023-10-17 17:39:12,121 DEV : loss 0.10742789506912231 - f1-score (micro avg) 0.7627
2023-10-17 17:39:12,125 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:39:13,750 epoch 7 - iter 14/146 - loss 0.01262368 - time (sec): 1.62 - samples/sec: 2937.75 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:39:15,331 epoch 7 - iter 28/146 - loss 0.02207873 - time (sec): 3.20 - samples/sec: 2714.29 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:39:16,945 epoch 7 - iter 42/146 - loss 0.02262086 - time (sec): 4.82 - samples/sec: 2793.54 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:39:18,482 epoch 7 - iter 56/146 - loss 0.02038830 - time (sec): 6.36 - samples/sec: 2833.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:39:20,148 epoch 7 - iter 70/146 - loss 0.01965031 - time (sec): 8.02 - samples/sec: 2839.38 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:39:21,490 epoch 7 - iter 84/146 - loss 0.01978349 - time (sec): 9.36 - samples/sec: 2883.88 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:39:23,003 epoch 7 - iter 98/146 - loss 0.02074431 - time (sec): 10.88 - samples/sec: 2891.38 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:39:24,323 epoch 7 - iter 112/146 - loss 0.01969168 - time (sec): 12.20 - samples/sec: 2922.56 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:39:25,487 epoch 7 - iter 126/146 - loss 0.01981820 - time (sec): 13.36 - samples/sec: 2922.63 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:39:26,788 epoch 7 - iter 140/146 - loss 0.02159605 - time (sec): 14.66 - samples/sec: 2925.17 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:39:27,363 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:27,363 EPOCH 7 done: loss 0.0213 - lr: 0.000017
2023-10-17 17:39:28,709 DEV : loss 0.1274358183145523 - f1-score (micro avg) 0.7559
2023-10-17 17:39:28,715 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:39:30,242 epoch 8 - iter 14/146 - loss 0.01472338 - time (sec): 1.53 - samples/sec: 2973.44 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:39:31,880 epoch 8 - iter 28/146 - loss 0.02160793 - time (sec): 3.16 - samples/sec: 2927.18 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:39:33,565 epoch 8 - iter 42/146 - loss 0.01728589 - time (sec): 4.85 - samples/sec: 2815.92 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:39:34,870 epoch 8 - iter 56/146 - loss 0.01931026 - time (sec): 6.15 - samples/sec: 2868.69 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:39:36,236 epoch 8 - iter 70/146 - loss 0.01779502 - time (sec): 7.52 - samples/sec: 2842.77 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:39:37,477 epoch 8 - iter 84/146 - loss 0.01627756 - time (sec): 8.76 - samples/sec: 2865.82 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:39:38,941 epoch 8 - iter 98/146 - loss 0.01662670 - time (sec): 10.22 - samples/sec: 2873.57 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:39:40,407 epoch 8 - iter 112/146 - loss 0.01534973 - time (sec): 11.69 - samples/sec: 2859.92 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:39:41,805 epoch 8 - iter 126/146 - loss 0.01445702 - time (sec): 13.09 - samples/sec: 2884.04 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:39:43,494 epoch 8 - iter 140/146 - loss 0.01553068 - time (sec): 14.78 - samples/sec: 2900.40 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:39:43,971 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:43,971 EPOCH 8 done: loss 0.0153 - lr: 0.000012
2023-10-17 17:39:45,269 DEV : loss 0.141546830534935 - f1-score (micro avg) 0.7462
2023-10-17 17:39:45,274 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:39:46,828 epoch 9 - iter 14/146 - loss 0.00978563 - time (sec): 1.55 - samples/sec: 2657.05 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:39:48,213 epoch 9 - iter 28/146 - loss 0.01442725 - time (sec): 2.94 - samples/sec: 2729.65 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:39:49,441 epoch 9 - iter 42/146 - loss 0.01169914 - time (sec): 4.17 - samples/sec: 2734.87 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:39:50,688 epoch 9 - iter 56/146 - loss 0.01092067 - time (sec): 5.41 - samples/sec: 2777.73 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:39:52,381 epoch 9 - iter 70/146 - loss 0.00994777 - time (sec): 7.11 - samples/sec: 2837.20 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:39:53,855 epoch 9 - iter 84/146 - loss 0.01072843 - time (sec): 8.58 - samples/sec: 2879.33 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:39:55,147 epoch 9 - iter 98/146 - loss 0.01072225 - time (sec): 9.87 - samples/sec: 2884.07 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:39:56,641 epoch 9 - iter 112/146 - loss 0.00993920 - time (sec): 11.37 - samples/sec: 2881.71 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:39:58,337 epoch 9 - iter 126/146 - loss 0.01048425 - time (sec): 13.06 - samples/sec: 2909.76 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:39:59,721 epoch 9 - iter 140/146 - loss 0.01035252 - time (sec): 14.45 - samples/sec: 2919.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:40:00,613 ----------------------------------------------------------------------------------------------------
2023-10-17 17:40:00,613 EPOCH 9 done: loss 0.0104 - lr: 0.000006
2023-10-17 17:40:01,845 DEV : loss 0.1482328176498413 - f1-score (micro avg) 0.7843
2023-10-17 17:40:01,849 saving best model
2023-10-17 17:40:02,314 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:40:03,886 epoch 10 - iter 14/146 - loss 0.01064113 - time (sec): 1.56 - samples/sec: 2989.33 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:40:05,357 epoch 10 - iter 28/146 - loss 0.00752883 - time (sec): 3.04 - samples/sec: 3075.90 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:40:06,801 epoch 10 - iter 42/146 - loss 0.00810847 - time (sec): 4.48 - samples/sec: 3031.72 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:40:08,183 epoch 10 - iter 56/146 - loss 0.00767424 - time (sec): 5.86 - samples/sec: 2976.27 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:40:09,664 epoch 10 - iter 70/146 - loss 0.00631805 - time (sec): 7.34 - samples/sec: 2947.32 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:40:11,180 epoch 10 - iter 84/146 - loss 0.00616265 - time (sec): 8.86 - samples/sec: 2935.80 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:40:12,703 epoch 10 - iter 98/146 - loss 0.00683143 - time (sec): 10.38 - samples/sec: 2905.56 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:40:14,228 epoch 10 - iter 112/146 - loss 0.00780732 - time (sec): 11.91 - samples/sec: 2914.68 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:40:15,862 epoch 10 - iter 126/146 - loss 0.00805699 - time (sec): 13.54 - samples/sec: 2908.24 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:40:17,212 epoch 10 - iter 140/146 - loss 0.00786973 - time (sec): 14.89 - samples/sec: 2910.88 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:40:17,622 ----------------------------------------------------------------------------------------------------
2023-10-17 17:40:17,623 EPOCH 10 done: loss 0.0077 - lr: 0.000000
2023-10-17 17:40:18,868 DEV : loss 0.14403872191905975 - f1-score (micro avg) 0.7783
2023-10-17 17:40:19,212 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:40:19,213 Loading model from best epoch ...
2023-10-17 17:40:20,598 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
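The 17 tags are the BIOES encoding of the four entity types (LOC, PER, ORG, HumanProd) plus O: S- marks a single-token entity, while B-/I-/E- mark the beginning, inside, and end of a multi-token one (4 types x 4 prefixes + O = 17). A minimal decoder sketch (illustrative; `bioes_to_spans` is a hypothetical helper, not Flair's decoder):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end_exclusive, type) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, etype = tag.split("-", 1)
        if prefix == "S":            # single-token entity
            spans.append((i, i + 1, etype))
            start = None
        elif prefix == "B":          # open a multi-token entity
            start = i
        elif prefix == "E" and start is not None:  # close it
            spans.append((start, i + 1, etype))
            start = None
        # "I" just continues an open span
    return spans

tags = ["O", "B-PER", "I-PER", "E-PER", "S-LOC", "O"]
# -> [(1, 4, 'PER'), (4, 5, 'LOC')]
```

Evaluation below is span-level: a prediction counts as correct only if the decoded span boundaries and type both match, which is why the per-class table reports entity support (683) rather than token counts.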
|
2023-10-17 17:40:23,115
Results:
- F-score (micro) 0.7612
- F-score (macro) 0.701
- Accuracy 0.6366

By class:
              precision    recall  f1-score   support

         PER     0.8122    0.8448    0.8282       348
         LOC     0.6605    0.8199    0.7316       261
         ORG     0.4615    0.4615    0.4615        52
   HumanProd     0.7500    0.8182    0.7826        22

   micro avg     0.7218    0.8053    0.7612       683
   macro avg     0.6710    0.7361    0.7010       683
weighted avg     0.7255    0.8053    0.7619       683
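The aggregate rows follow from the per-class rows: macro-F1 is the unweighted mean of the class F1 scores, while micro-F1 pools true positives and predictions across classes. A quick sanity check (per-class counts reconstructed here from precision x recall x support):

```python
# (precision, recall, f1, support) per class, from the table above
rows = {
    "PER":       (0.8122, 0.8448, 0.8282, 348),
    "LOC":       (0.6605, 0.8199, 0.7316, 261),
    "ORG":       (0.4615, 0.4615, 0.4615, 52),
    "HumanProd": (0.7500, 0.8182, 0.7826, 22),
}

# Macro-F1: unweighted mean of per-class F1.
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)

# Micro: pool counts. TP = recall * support; predicted = TP / precision.
tp = sum(round(r * s) for _, r, _, s in rows.values())
pred = sum(round(round(r * s) / p) for p, r, _, s in rows.values())
gold = sum(s for _, _, _, s in rows.values())
micro_p, micro_r = tp / pred, tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
# macro_f1 ~ 0.7010 and micro_f1 ~ 0.7612, matching the log.
```

The gap between micro (0.7612) and macro (0.7010) F1 reflects the weak ORG class: macro averaging gives its 0.4615 F1 equal weight despite only 52 supporting entities.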
|
|
|
2023-10-17 17:40:23,116 ----------------------------------------------------------------------------------------------------