|
2023-10-17 17:34:10,698 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:10,699 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 17:34:10,699 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:10,699 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-17 17:34:10,699 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:10,699 Train: 1166 sentences |
|
2023-10-17 17:34:10,699 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 17:34:10,699 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:10,699 Training Params: |
|
2023-10-17 17:34:10,699 - learning_rate: "3e-05" |
|
2023-10-17 17:34:10,700 - mini_batch_size: "8" |
|
2023-10-17 17:34:10,700 - max_epochs: "10" |
|
2023-10-17 17:34:10,700 - shuffle: "True" |
|
2023-10-17 17:34:10,700 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:10,700 Plugins: |
|
2023-10-17 17:34:10,700 - TensorboardLogger |
|
2023-10-17 17:34:10,700 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 17:34:10,700 ---------------------------------------------------------------------------------------------------- |
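The `lr` column in the per-iteration lines below follows from the parameters logged above (`learning_rate` 3e-05, `mini_batch_size` 8, `max_epochs` 10, `warmup_fraction` 0.1, 1166 training sentences). A minimal sketch of a linear warmup followed by linear decay reproduces those values after rounding. The function name `linear_schedule_lr` is hypothetical, and the exact step at which the trainer samples the learning rate is an assumption; this is not Flair's own implementation.

```python
import math


def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Learning rate after `step` optimizer updates (1-based).

    Assumed shape: linear ramp from 0 to peak_lr during warmup,
    then linear decay back to 0 at total_steps.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step <= warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the log: 1166 train sentences, batch size 8, 10 epochs.
batches_per_epoch = math.ceil(1166 / 8)   # matches the "iter .../146" counter
total_steps = batches_per_epoch * 10


def logged_lr(epoch, iteration):
    """Format the scheduled rate the way the log prints it (6 decimals)."""
    step = (epoch - 1) * batches_per_epoch + iteration
    return f"{linear_schedule_lr(step, total_steps, 3e-5, 0.1):.6f}"


for epoch, iteration in [(1, 14), (1, 140), (2, 14), (10, 140)]:
    print(f"epoch {epoch} iter {iteration}: lr {logged_lr(epoch, iteration)}")
```

With a warmup fraction of 0.1 and 10 epochs, the warmup covers exactly the first epoch, which is why the logged rate peaks at the start of epoch 2 and decays afterwards.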
|
2023-10-17 17:34:10,700 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 17:34:10,700 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 17:34:10,700 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:10,700 Computation: |
|
2023-10-17 17:34:10,700 - compute on device: cuda:0 |
|
2023-10-17 17:34:10,700 - embedding storage: none |
|
2023-10-17 17:34:10,700 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:10,700 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-17 17:34:10,700 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:10,700 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:10,700 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 17:34:12,102 epoch 1 - iter 14/146 - loss 3.41781629 - time (sec): 1.40 - samples/sec: 2706.86 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 17:34:13,479 epoch 1 - iter 28/146 - loss 3.19523070 - time (sec): 2.78 - samples/sec: 2743.78 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 17:34:15,696 epoch 1 - iter 42/146 - loss 2.72247714 - time (sec): 5.00 - samples/sec: 2741.67 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 17:34:17,030 epoch 1 - iter 56/146 - loss 2.24817868 - time (sec): 6.33 - samples/sec: 2826.85 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 17:34:18,216 epoch 1 - iter 70/146 - loss 1.97425059 - time (sec): 7.52 - samples/sec: 2872.38 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 17:34:19,771 epoch 1 - iter 84/146 - loss 1.76062958 - time (sec): 9.07 - samples/sec: 2841.10 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 17:34:21,105 epoch 1 - iter 98/146 - loss 1.56691924 - time (sec): 10.40 - samples/sec: 2868.07 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 17:34:22,445 epoch 1 - iter 112/146 - loss 1.41624902 - time (sec): 11.74 - samples/sec: 2905.87 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 17:34:23,861 epoch 1 - iter 126/146 - loss 1.29548011 - time (sec): 13.16 - samples/sec: 2919.54 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 17:34:25,272 epoch 1 - iter 140/146 - loss 1.20756763 - time (sec): 14.57 - samples/sec: 2901.50 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 17:34:25,909 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:25,909 EPOCH 1 done: loss 1.1629 - lr: 0.000029 |
|
2023-10-17 17:34:26,796 DEV : loss 0.21301493048667908 - f1-score (micro avg) 0.4358 |
|
2023-10-17 17:34:26,804 saving best model |
|
2023-10-17 17:34:27,174 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:29,054 epoch 2 - iter 14/146 - loss 0.26935291 - time (sec): 1.88 - samples/sec: 2422.00 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 17:34:30,577 epoch 2 - iter 28/146 - loss 0.23373936 - time (sec): 3.40 - samples/sec: 2491.77 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 17:34:31,942 epoch 2 - iter 42/146 - loss 0.22380424 - time (sec): 4.77 - samples/sec: 2659.63 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 17:34:33,819 epoch 2 - iter 56/146 - loss 0.20873831 - time (sec): 6.64 - samples/sec: 2685.69 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 17:34:35,150 epoch 2 - iter 70/146 - loss 0.20731025 - time (sec): 7.97 - samples/sec: 2772.54 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 17:34:36,476 epoch 2 - iter 84/146 - loss 0.20314886 - time (sec): 9.30 - samples/sec: 2888.49 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 17:34:37,622 epoch 2 - iter 98/146 - loss 0.20692578 - time (sec): 10.45 - samples/sec: 2901.97 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 17:34:38,837 epoch 2 - iter 112/146 - loss 0.21286406 - time (sec): 11.66 - samples/sec: 2898.05 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 17:34:40,431 epoch 2 - iter 126/146 - loss 0.21773797 - time (sec): 13.26 - samples/sec: 2902.32 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 17:34:42,013 epoch 2 - iter 140/146 - loss 0.21049606 - time (sec): 14.84 - samples/sec: 2882.52 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 17:34:42,634 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:42,634 EPOCH 2 done: loss 0.2088 - lr: 0.000027 |
|
2023-10-17 17:34:43,905 DEV : loss 0.12763920426368713 - f1-score (micro avg) 0.6166 |
|
2023-10-17 17:34:43,910 saving best model |
|
2023-10-17 17:34:44,373 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:45,986 epoch 3 - iter 14/146 - loss 0.13733881 - time (sec): 1.61 - samples/sec: 2783.38 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 17:34:47,467 epoch 3 - iter 28/146 - loss 0.12832365 - time (sec): 3.09 - samples/sec: 2830.11 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 17:34:48,938 epoch 3 - iter 42/146 - loss 0.13952727 - time (sec): 4.56 - samples/sec: 2899.24 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 17:34:50,224 epoch 3 - iter 56/146 - loss 0.13596454 - time (sec): 5.85 - samples/sec: 2943.42 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 17:34:51,562 epoch 3 - iter 70/146 - loss 0.13330878 - time (sec): 7.19 - samples/sec: 2939.38 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 17:34:52,922 epoch 3 - iter 84/146 - loss 0.12656844 - time (sec): 8.55 - samples/sec: 2946.73 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 17:34:54,523 epoch 3 - iter 98/146 - loss 0.12323148 - time (sec): 10.15 - samples/sec: 2950.36 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 17:34:56,140 epoch 3 - iter 112/146 - loss 0.12354407 - time (sec): 11.76 - samples/sec: 2947.41 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 17:34:57,428 epoch 3 - iter 126/146 - loss 0.12251103 - time (sec): 13.05 - samples/sec: 2950.33 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 17:34:58,780 epoch 3 - iter 140/146 - loss 0.12318821 - time (sec): 14.40 - samples/sec: 2926.48 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 17:34:59,602 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:34:59,602 EPOCH 3 done: loss 0.1235 - lr: 0.000024 |
|
2023-10-17 17:35:00,844 DEV : loss 0.11239827424287796 - f1-score (micro avg) 0.7169 |
|
2023-10-17 17:35:00,848 saving best model |
|
2023-10-17 17:35:01,320 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:35:02,428 epoch 4 - iter 14/146 - loss 0.10523599 - time (sec): 1.11 - samples/sec: 3124.48 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 17:35:03,652 epoch 4 - iter 28/146 - loss 0.09148276 - time (sec): 2.33 - samples/sec: 3164.43 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 17:35:05,339 epoch 4 - iter 42/146 - loss 0.07551795 - time (sec): 4.02 - samples/sec: 3099.36 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 17:35:06,930 epoch 4 - iter 56/146 - loss 0.07430281 - time (sec): 5.61 - samples/sec: 2987.42 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 17:35:08,428 epoch 4 - iter 70/146 - loss 0.07274444 - time (sec): 7.11 - samples/sec: 2959.51 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 17:35:09,989 epoch 4 - iter 84/146 - loss 0.07061816 - time (sec): 8.67 - samples/sec: 2956.86 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 17:35:11,818 epoch 4 - iter 98/146 - loss 0.07687115 - time (sec): 10.50 - samples/sec: 2916.54 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 17:35:13,168 epoch 4 - iter 112/146 - loss 0.07669767 - time (sec): 11.85 - samples/sec: 2902.91 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 17:35:14,638 epoch 4 - iter 126/146 - loss 0.08004943 - time (sec): 13.32 - samples/sec: 2924.14 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 17:35:16,115 epoch 4 - iter 140/146 - loss 0.07957329 - time (sec): 14.79 - samples/sec: 2899.39 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 17:35:16,652 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:35:16,652 EPOCH 4 done: loss 0.0787 - lr: 0.000020 |
|
2023-10-17 17:35:17,960 DEV : loss 0.12767371535301208 - f1-score (micro avg) 0.7347 |
|
2023-10-17 17:35:17,965 saving best model |
|
2023-10-17 17:35:18,422 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:35:20,198 epoch 5 - iter 14/146 - loss 0.06713213 - time (sec): 1.77 - samples/sec: 2816.55 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 17:35:22,147 epoch 5 - iter 28/146 - loss 0.05002344 - time (sec): 3.72 - samples/sec: 2629.73 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 17:35:23,492 epoch 5 - iter 42/146 - loss 0.06121285 - time (sec): 5.07 - samples/sec: 2780.50 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 17:35:24,688 epoch 5 - iter 56/146 - loss 0.06588173 - time (sec): 6.26 - samples/sec: 2820.49 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 17:35:26,222 epoch 5 - iter 70/146 - loss 0.06594720 - time (sec): 7.80 - samples/sec: 2802.19 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 17:35:27,674 epoch 5 - iter 84/146 - loss 0.06721579 - time (sec): 9.25 - samples/sec: 2827.08 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 17:35:29,074 epoch 5 - iter 98/146 - loss 0.06330378 - time (sec): 10.65 - samples/sec: 2880.43 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 17:35:30,422 epoch 5 - iter 112/146 - loss 0.06266183 - time (sec): 12.00 - samples/sec: 2895.08 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 17:35:31,794 epoch 5 - iter 126/146 - loss 0.06018288 - time (sec): 13.37 - samples/sec: 2886.31 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 17:35:33,215 epoch 5 - iter 140/146 - loss 0.05869652 - time (sec): 14.79 - samples/sec: 2898.02 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 17:35:33,682 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:35:33,683 EPOCH 5 done: loss 0.0575 - lr: 0.000017 |
|
2023-10-17 17:35:35,033 DEV : loss 0.11119718104600906 - f1-score (micro avg) 0.7366 |
|
2023-10-17 17:35:35,042 saving best model |
|
2023-10-17 17:35:35,505 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:35:36,875 epoch 6 - iter 14/146 - loss 0.04824613 - time (sec): 1.37 - samples/sec: 3188.47 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 17:35:38,086 epoch 6 - iter 28/146 - loss 0.03975259 - time (sec): 2.58 - samples/sec: 3016.19 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 17:35:39,837 epoch 6 - iter 42/146 - loss 0.04375931 - time (sec): 4.33 - samples/sec: 2885.12 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 17:35:41,314 epoch 6 - iter 56/146 - loss 0.04373476 - time (sec): 5.81 - samples/sec: 2902.21 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 17:35:42,542 epoch 6 - iter 70/146 - loss 0.04210856 - time (sec): 7.04 - samples/sec: 2889.54 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 17:35:44,084 epoch 6 - iter 84/146 - loss 0.04064450 - time (sec): 8.58 - samples/sec: 2827.82 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 17:35:45,532 epoch 6 - iter 98/146 - loss 0.04063501 - time (sec): 10.03 - samples/sec: 2811.44 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 17:35:47,178 epoch 6 - iter 112/146 - loss 0.04206829 - time (sec): 11.67 - samples/sec: 2824.16 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 17:35:48,636 epoch 6 - iter 126/146 - loss 0.04214994 - time (sec): 13.13 - samples/sec: 2847.92 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 17:35:50,492 epoch 6 - iter 140/146 - loss 0.04140993 - time (sec): 14.99 - samples/sec: 2860.81 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 17:35:51,009 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:35:51,009 EPOCH 6 done: loss 0.0411 - lr: 0.000014 |
|
2023-10-17 17:35:52,347 DEV : loss 0.12232689559459686 - f1-score (micro avg) 0.7387 |
|
2023-10-17 17:35:52,353 saving best model |
|
2023-10-17 17:35:52,811 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:35:54,469 epoch 7 - iter 14/146 - loss 0.02256882 - time (sec): 1.66 - samples/sec: 2879.40 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 17:35:56,042 epoch 7 - iter 28/146 - loss 0.03288779 - time (sec): 3.23 - samples/sec: 2693.38 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 17:35:57,619 epoch 7 - iter 42/146 - loss 0.03598142 - time (sec): 4.81 - samples/sec: 2800.42 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 17:35:59,128 epoch 7 - iter 56/146 - loss 0.03345291 - time (sec): 6.31 - samples/sec: 2852.01 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 17:36:00,762 epoch 7 - iter 70/146 - loss 0.03144006 - time (sec): 7.95 - samples/sec: 2865.15 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 17:36:02,090 epoch 7 - iter 84/146 - loss 0.03099728 - time (sec): 9.28 - samples/sec: 2910.56 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 17:36:03,582 epoch 7 - iter 98/146 - loss 0.03123906 - time (sec): 10.77 - samples/sec: 2920.04 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 17:36:04,878 epoch 7 - iter 112/146 - loss 0.03086938 - time (sec): 12.06 - samples/sec: 2954.52 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 17:36:06,038 epoch 7 - iter 126/146 - loss 0.02983269 - time (sec): 13.23 - samples/sec: 2952.40 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 17:36:07,324 epoch 7 - iter 140/146 - loss 0.03052926 - time (sec): 14.51 - samples/sec: 2955.31 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 17:36:07,903 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:36:07,903 EPOCH 7 done: loss 0.0303 - lr: 0.000010 |
|
2023-10-17 17:36:09,300 DEV : loss 0.11899629980325699 - f1-score (micro avg) 0.7489 |
|
2023-10-17 17:36:09,306 saving best model |
|
2023-10-17 17:36:09,770 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:36:11,285 epoch 8 - iter 14/146 - loss 0.01724715 - time (sec): 1.51 - samples/sec: 2998.30 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 17:36:12,868 epoch 8 - iter 28/146 - loss 0.02957612 - time (sec): 3.10 - samples/sec: 2991.71 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 17:36:14,548 epoch 8 - iter 42/146 - loss 0.02774801 - time (sec): 4.78 - samples/sec: 2859.21 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 17:36:15,854 epoch 8 - iter 56/146 - loss 0.03053498 - time (sec): 6.08 - samples/sec: 2902.91 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 17:36:17,216 epoch 8 - iter 70/146 - loss 0.02811800 - time (sec): 7.44 - samples/sec: 2871.92 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 17:36:18,479 epoch 8 - iter 84/146 - loss 0.02621154 - time (sec): 8.71 - samples/sec: 2883.41 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 17:36:19,931 epoch 8 - iter 98/146 - loss 0.02607277 - time (sec): 10.16 - samples/sec: 2892.05 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 17:36:21,382 epoch 8 - iter 112/146 - loss 0.02474799 - time (sec): 11.61 - samples/sec: 2879.71 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 17:36:22,964 epoch 8 - iter 126/146 - loss 0.02303571 - time (sec): 13.19 - samples/sec: 2861.56 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 17:36:24,668 epoch 8 - iter 140/146 - loss 0.02411078 - time (sec): 14.90 - samples/sec: 2877.30 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 17:36:25,140 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:36:25,140 EPOCH 8 done: loss 0.0239 - lr: 0.000007 |
|
2023-10-17 17:36:26,405 DEV : loss 0.1307678520679474 - f1-score (micro avg) 0.7686 |
|
2023-10-17 17:36:26,410 saving best model |
|
2023-10-17 17:36:26,879 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:36:28,444 epoch 9 - iter 14/146 - loss 0.01901972 - time (sec): 1.56 - samples/sec: 2641.86 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 17:36:29,830 epoch 9 - iter 28/146 - loss 0.02385709 - time (sec): 2.95 - samples/sec: 2719.97 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 17:36:31,047 epoch 9 - iter 42/146 - loss 0.02182791 - time (sec): 4.16 - samples/sec: 2735.27 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 17:36:32,276 epoch 9 - iter 56/146 - loss 0.01826679 - time (sec): 5.39 - samples/sec: 2787.55 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 17:36:33,957 epoch 9 - iter 70/146 - loss 0.01817023 - time (sec): 7.08 - samples/sec: 2849.20 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 17:36:35,421 epoch 9 - iter 84/146 - loss 0.01820976 - time (sec): 8.54 - samples/sec: 2893.14 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 17:36:36,738 epoch 9 - iter 98/146 - loss 0.01827366 - time (sec): 9.86 - samples/sec: 2888.66 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 17:36:38,078 epoch 9 - iter 112/146 - loss 0.01689217 - time (sec): 11.20 - samples/sec: 2925.35 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 17:36:39,820 epoch 9 - iter 126/146 - loss 0.01694711 - time (sec): 12.94 - samples/sec: 2937.51 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 17:36:41,252 epoch 9 - iter 140/146 - loss 0.01782277 - time (sec): 14.37 - samples/sec: 2934.83 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 17:36:42,148 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:36:42,148 EPOCH 9 done: loss 0.0181 - lr: 0.000004 |
|
2023-10-17 17:36:43,479 DEV : loss 0.13454537093639374 - f1-score (micro avg) 0.7615 |
|
2023-10-17 17:36:43,487 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:36:45,042 epoch 10 - iter 14/146 - loss 0.01570550 - time (sec): 1.55 - samples/sec: 3010.72 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 17:36:46,505 epoch 10 - iter 28/146 - loss 0.01766356 - time (sec): 3.02 - samples/sec: 3095.27 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 17:36:47,949 epoch 10 - iter 42/146 - loss 0.02308068 - time (sec): 4.46 - samples/sec: 3045.17 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 17:36:49,330 epoch 10 - iter 56/146 - loss 0.01941746 - time (sec): 5.84 - samples/sec: 2986.50 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 17:36:50,825 epoch 10 - iter 70/146 - loss 0.01612324 - time (sec): 7.34 - samples/sec: 2950.05 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 17:36:52,332 epoch 10 - iter 84/146 - loss 0.01574558 - time (sec): 8.84 - samples/sec: 2940.92 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 17:36:53,845 epoch 10 - iter 98/146 - loss 0.01635377 - time (sec): 10.36 - samples/sec: 2912.53 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 17:36:55,386 epoch 10 - iter 112/146 - loss 0.01613895 - time (sec): 11.90 - samples/sec: 2916.89 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 17:36:56,988 epoch 10 - iter 126/146 - loss 0.01578624 - time (sec): 13.50 - samples/sec: 2917.22 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 17:36:58,336 epoch 10 - iter 140/146 - loss 0.01493481 - time (sec): 14.85 - samples/sec: 2919.38 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 17:36:58,751 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:36:58,751 EPOCH 10 done: loss 0.0147 - lr: 0.000000 |
|
2023-10-17 17:37:00,104 DEV : loss 0.13558898866176605 - f1-score (micro avg) 0.7582 |
|
2023-10-17 17:37:00,455 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:37:00,456 Loading model from best epoch ... |
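The "best epoch" loaded here can be recovered from the DEV lines above. The sketch below copies the ten dev micro-F1 scores from this log and applies a strict-improvement rule, which is consistent with the "saving best model" messages appearing after epochs 1 to 8 only. The helper names are hypothetical, and whether the trainer uses a strict or non-strict comparison is an assumption.

```python
# Dev micro-F1 per epoch, copied from the DEV lines of this log.
dev_f1 = {
    1: 0.4358, 2: 0.6166, 3: 0.7169, 4: 0.7347, 5: 0.7366,
    6: 0.7387, 7: 0.7489, 8: 0.7686, 9: 0.7615, 10: 0.7582,
}


def epochs_that_save(scores):
    """Epochs whose dev score strictly beats every earlier epoch."""
    best = float("-inf")
    saved = []
    for epoch in sorted(scores):
        if scores[epoch] > best:
            best = scores[epoch]
            saved.append(epoch)
    return saved


def best_epoch(scores):
    """Epoch with the highest dev score (the last checkpoint written)."""
    return max(scores, key=scores.get)


print("checkpoints written after epochs:", epochs_that_save(dev_f1))
print("best epoch:", best_epoch(dev_f1))
```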
|
2023-10-17 17:37:01,822 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-17 17:37:04,514 |
|
Results: |
|
- F-score (micro) 0.7634 |
|
- F-score (macro) 0.6812 |
|
- Accuracy 0.635 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         PER     0.8362    0.8506    0.8433       348 |
|
         LOC     0.6292    0.8582    0.7261       261 |
|
         ORG     0.4737    0.3462    0.4000        52 |
|
   HumanProd     0.7391    0.7727    0.7556        22 |
|
|
|
   micro avg     0.7198    0.8126    0.7634       683 |
|
   macro avg     0.6695    0.7069    0.6812       683 |
|
weighted avg     0.7264    0.8126    0.7619       683 |
|
|
|
2023-10-17 17:37:04,515 ---------------------------------------------------------------------------------------------------- |
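The three average rows in the report above can be cross-checked from the per-class rows. The sketch below copies the per-class precision, recall, F1 and support from this log, then recovers approximate true-positive and predicted-span counts by rounding. That rounding step is an assumption, since the log does not print raw counts. The "Accuracy" figure is not reproduced here because its definition is not stated in the log.

```python
# (precision, recall, f1, support) per class, copied from the report above.
rows = {
    "PER": (0.8362, 0.8506, 0.8433, 348),
    "LOC": (0.6292, 0.8582, 0.7261, 261),
    "ORG": (0.4737, 0.3462, 0.4000, 52),
    "HumanProd": (0.7391, 0.7727, 0.7556, 22),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(s for _, _, _, s in rows.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in rows.values()) / len(rows)

# Weighted average: per-class F1 weighted by support.
weighted_f1 = sum(f * s for _, _, f, s in rows.values()) / total_support

# Micro average: pool counts over classes. Counts are reconstructed by
# rounding recall * support (true positives) and TP / precision (predicted).
true_pos = sum(round(r * s) for _, r, _, s in rows.values())
predicted = sum(round(round(r * s) / p) for p, r, _, s in rows.values())
micro_precision = true_pos / predicted
micro_recall = true_pos / total_support
micro_f1 = f1(micro_precision, micro_recall)

print(f"micro    P={micro_precision:.4f} R={micro_recall:.4f} F1={micro_f1:.4f}")
print(f"macro    F1={macro_f1:.4f}")
print(f"weighted F1={weighted_f1:.4f}")
```

The gap between the micro and macro F-scores comes mostly from ORG: it has only 52 gold spans, so its low F1 pulls the unweighted macro mean down far more than it affects the pooled micro score.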
|
|