|
2023-10-17 18:05:02,992 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:02,993 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 18:05:02,993 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:02,993 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-17 18:05:02,994 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:02,994 Train: 1166 sentences |
|
2023-10-17 18:05:02,994 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 18:05:02,994 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:02,994 Training Params: |
|
2023-10-17 18:05:02,994 - learning_rate: "5e-05" |
|
2023-10-17 18:05:02,994 - mini_batch_size: "8" |
|
2023-10-17 18:05:02,994 - max_epochs: "10" |
|
2023-10-17 18:05:02,994 - shuffle: "True" |
|
2023-10-17 18:05:02,994 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:02,994 Plugins: |
|
2023-10-17 18:05:02,994 - TensorboardLogger |
|
2023-10-17 18:05:02,994 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 18:05:02,994 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:02,994 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 18:05:02,994 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 18:05:02,994 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:02,994 Computation: |
|
2023-10-17 18:05:02,994 - compute on device: cuda:0 |
|
2023-10-17 18:05:02,994 - embedding storage: none |
|
2023-10-17 18:05:02,994 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:02,994 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3" |
|
2023-10-17 18:05:02,994 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:02,994 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:02,994 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 18:05:04,216 epoch 1 - iter 14/146 - loss 3.47286437 - time (sec): 1.22 - samples/sec: 3020.40 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:05:05,861 epoch 1 - iter 28/146 - loss 2.95733369 - time (sec): 2.87 - samples/sec: 3024.74 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:05:07,185 epoch 1 - iter 42/146 - loss 2.39187734 - time (sec): 4.19 - samples/sec: 3083.46 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:05:08,974 epoch 1 - iter 56/146 - loss 1.95449677 - time (sec): 5.98 - samples/sec: 2967.80 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:05:10,441 epoch 1 - iter 70/146 - loss 1.66046494 - time (sec): 7.45 - samples/sec: 2956.40 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:05:12,085 epoch 1 - iter 84/146 - loss 1.44854117 - time (sec): 9.09 - samples/sec: 2915.08 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 18:05:13,346 epoch 1 - iter 98/146 - loss 1.29761420 - time (sec): 10.35 - samples/sec: 2963.66 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 18:05:14,662 epoch 1 - iter 112/146 - loss 1.20737029 - time (sec): 11.67 - samples/sec: 2956.59 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 18:05:16,112 epoch 1 - iter 126/146 - loss 1.10631188 - time (sec): 13.12 - samples/sec: 2951.03 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 18:05:17,340 epoch 1 - iter 140/146 - loss 1.03133640 - time (sec): 14.34 - samples/sec: 2977.96 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 18:05:17,983 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:17,983 EPOCH 1 done: loss 1.0020 - lr: 0.000048 |
|
2023-10-17 18:05:19,171 DEV : loss 0.19741927087306976 - f1-score (micro avg) 0.4553 |
|
2023-10-17 18:05:19,177 saving best model |
|
2023-10-17 18:05:19,608 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:21,033 epoch 2 - iter 14/146 - loss 0.25789668 - time (sec): 1.42 - samples/sec: 3035.78 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-17 18:05:22,205 epoch 2 - iter 28/146 - loss 0.23029793 - time (sec): 2.59 - samples/sec: 2944.73 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 18:05:23,767 epoch 2 - iter 42/146 - loss 0.21329382 - time (sec): 4.16 - samples/sec: 3004.58 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 18:05:25,642 epoch 2 - iter 56/146 - loss 0.21601636 - time (sec): 6.03 - samples/sec: 2908.44 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 18:05:26,915 epoch 2 - iter 70/146 - loss 0.21063728 - time (sec): 7.30 - samples/sec: 2912.94 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 18:05:28,452 epoch 2 - iter 84/146 - loss 0.20436532 - time (sec): 8.84 - samples/sec: 2886.36 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 18:05:30,128 epoch 2 - iter 98/146 - loss 0.19733768 - time (sec): 10.52 - samples/sec: 2898.31 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 18:05:31,546 epoch 2 - iter 112/146 - loss 0.19273671 - time (sec): 11.94 - samples/sec: 2908.88 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 18:05:33,102 epoch 2 - iter 126/146 - loss 0.19300856 - time (sec): 13.49 - samples/sec: 2911.12 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 18:05:34,409 epoch 2 - iter 140/146 - loss 0.19113966 - time (sec): 14.80 - samples/sec: 2894.14 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 18:05:34,939 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:34,939 EPOCH 2 done: loss 0.1881 - lr: 0.000045 |
|
2023-10-17 18:05:36,222 DEV : loss 0.13468769192695618 - f1-score (micro avg) 0.6377 |
|
2023-10-17 18:05:36,228 saving best model |
|
2023-10-17 18:05:36,706 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:38,233 epoch 3 - iter 14/146 - loss 0.11423988 - time (sec): 1.52 - samples/sec: 3080.79 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 18:05:39,736 epoch 3 - iter 28/146 - loss 0.11646607 - time (sec): 3.02 - samples/sec: 3041.37 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 18:05:41,158 epoch 3 - iter 42/146 - loss 0.12663839 - time (sec): 4.45 - samples/sec: 2980.57 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 18:05:42,666 epoch 3 - iter 56/146 - loss 0.11742262 - time (sec): 5.96 - samples/sec: 2917.25 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 18:05:44,139 epoch 3 - iter 70/146 - loss 0.11414842 - time (sec): 7.43 - samples/sec: 2933.62 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 18:05:45,700 epoch 3 - iter 84/146 - loss 0.11775656 - time (sec): 8.99 - samples/sec: 2925.79 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 18:05:46,857 epoch 3 - iter 98/146 - loss 0.11622171 - time (sec): 10.15 - samples/sec: 2956.79 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 18:05:48,433 epoch 3 - iter 112/146 - loss 0.10943556 - time (sec): 11.72 - samples/sec: 2963.12 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 18:05:49,959 epoch 3 - iter 126/146 - loss 0.10714731 - time (sec): 13.25 - samples/sec: 2946.48 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 18:05:51,484 epoch 3 - iter 140/146 - loss 0.10632914 - time (sec): 14.77 - samples/sec: 2906.52 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 18:05:52,028 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:52,028 EPOCH 3 done: loss 0.1094 - lr: 0.000039 |
|
2023-10-17 18:05:53,295 DEV : loss 0.11463181674480438 - f1-score (micro avg) 0.7632 |
|
2023-10-17 18:05:53,302 saving best model |
|
2023-10-17 18:05:53,742 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:05:55,188 epoch 4 - iter 14/146 - loss 0.06282635 - time (sec): 1.44 - samples/sec: 3137.79 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 18:05:56,757 epoch 4 - iter 28/146 - loss 0.05920855 - time (sec): 3.01 - samples/sec: 3022.23 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 18:05:57,989 epoch 4 - iter 42/146 - loss 0.06414759 - time (sec): 4.25 - samples/sec: 3033.61 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 18:05:59,649 epoch 4 - iter 56/146 - loss 0.06138867 - time (sec): 5.91 - samples/sec: 2989.23 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 18:06:01,189 epoch 4 - iter 70/146 - loss 0.06194662 - time (sec): 7.45 - samples/sec: 2986.56 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 18:06:02,572 epoch 4 - iter 84/146 - loss 0.06618929 - time (sec): 8.83 - samples/sec: 3000.74 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 18:06:04,127 epoch 4 - iter 98/146 - loss 0.06718422 - time (sec): 10.38 - samples/sec: 2940.97 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 18:06:05,598 epoch 4 - iter 112/146 - loss 0.07020686 - time (sec): 11.85 - samples/sec: 2949.15 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 18:06:07,060 epoch 4 - iter 126/146 - loss 0.07201581 - time (sec): 13.32 - samples/sec: 2939.98 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 18:06:08,314 epoch 4 - iter 140/146 - loss 0.07102769 - time (sec): 14.57 - samples/sec: 2918.29 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 18:06:08,903 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:06:08,903 EPOCH 4 done: loss 0.0717 - lr: 0.000034 |
|
2023-10-17 18:06:10,317 DEV : loss 0.10693204402923584 - f1-score (micro avg) 0.7439 |
|
2023-10-17 18:06:10,322 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:06:11,781 epoch 5 - iter 14/146 - loss 0.03715351 - time (sec): 1.46 - samples/sec: 2881.34 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 18:06:13,056 epoch 5 - iter 28/146 - loss 0.04279936 - time (sec): 2.73 - samples/sec: 3043.99 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 18:06:14,564 epoch 5 - iter 42/146 - loss 0.04029772 - time (sec): 4.24 - samples/sec: 3112.72 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 18:06:16,028 epoch 5 - iter 56/146 - loss 0.04254317 - time (sec): 5.70 - samples/sec: 3035.67 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 18:06:17,575 epoch 5 - iter 70/146 - loss 0.05147397 - time (sec): 7.25 - samples/sec: 2907.16 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 18:06:19,092 epoch 5 - iter 84/146 - loss 0.04970674 - time (sec): 8.77 - samples/sec: 2943.52 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 18:06:20,386 epoch 5 - iter 98/146 - loss 0.04728730 - time (sec): 10.06 - samples/sec: 2947.61 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 18:06:21,887 epoch 5 - iter 112/146 - loss 0.04576205 - time (sec): 11.56 - samples/sec: 2916.40 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:06:23,372 epoch 5 - iter 126/146 - loss 0.04431415 - time (sec): 13.05 - samples/sec: 2949.96 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:06:24,968 epoch 5 - iter 140/146 - loss 0.04350621 - time (sec): 14.65 - samples/sec: 2939.01 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 18:06:25,457 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:06:25,457 EPOCH 5 done: loss 0.0430 - lr: 0.000028 |
|
2023-10-17 18:06:26,744 DEV : loss 0.1116286963224411 - f1-score (micro avg) 0.7702 |
|
2023-10-17 18:06:26,749 saving best model |
|
2023-10-17 18:06:27,203 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:06:28,618 epoch 6 - iter 14/146 - loss 0.04553866 - time (sec): 1.41 - samples/sec: 2913.20 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 18:06:30,185 epoch 6 - iter 28/146 - loss 0.03490225 - time (sec): 2.98 - samples/sec: 2979.56 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 18:06:31,516 epoch 6 - iter 42/146 - loss 0.04149229 - time (sec): 4.31 - samples/sec: 2877.42 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:06:32,937 epoch 6 - iter 56/146 - loss 0.03838322 - time (sec): 5.73 - samples/sec: 2794.34 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:06:34,459 epoch 6 - iter 70/146 - loss 0.03566342 - time (sec): 7.25 - samples/sec: 2817.37 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:06:36,022 epoch 6 - iter 84/146 - loss 0.03699880 - time (sec): 8.82 - samples/sec: 2887.86 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:06:37,123 epoch 6 - iter 98/146 - loss 0.03665218 - time (sec): 9.92 - samples/sec: 2914.67 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:06:38,530 epoch 6 - iter 112/146 - loss 0.03524922 - time (sec): 11.32 - samples/sec: 2914.45 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:06:39,961 epoch 6 - iter 126/146 - loss 0.03441689 - time (sec): 12.76 - samples/sec: 2944.67 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 18:06:41,501 epoch 6 - iter 140/146 - loss 0.03221802 - time (sec): 14.30 - samples/sec: 2971.92 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 18:06:42,233 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:06:42,234 EPOCH 6 done: loss 0.0318 - lr: 0.000023 |
|
2023-10-17 18:06:43,498 DEV : loss 0.12657296657562256 - f1-score (micro avg) 0.7738 |
|
2023-10-17 18:06:43,503 saving best model |
|
2023-10-17 18:06:43,948 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:06:45,372 epoch 7 - iter 14/146 - loss 0.02524628 - time (sec): 1.42 - samples/sec: 2856.66 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 18:06:46,593 epoch 7 - iter 28/146 - loss 0.02680280 - time (sec): 2.64 - samples/sec: 2870.59 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:06:47,949 epoch 7 - iter 42/146 - loss 0.02213969 - time (sec): 4.00 - samples/sec: 2918.32 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:06:49,309 epoch 7 - iter 56/146 - loss 0.02139683 - time (sec): 5.36 - samples/sec: 2981.76 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:06:50,826 epoch 7 - iter 70/146 - loss 0.02696387 - time (sec): 6.87 - samples/sec: 3016.39 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:06:52,353 epoch 7 - iter 84/146 - loss 0.02742539 - time (sec): 8.40 - samples/sec: 2924.97 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:06:53,828 epoch 7 - iter 98/146 - loss 0.02526137 - time (sec): 9.88 - samples/sec: 2923.43 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:06:55,369 epoch 7 - iter 112/146 - loss 0.02426403 - time (sec): 11.42 - samples/sec: 2891.27 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:06:57,093 epoch 7 - iter 126/146 - loss 0.02384645 - time (sec): 13.14 - samples/sec: 2859.96 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:06:58,546 epoch 7 - iter 140/146 - loss 0.02357317 - time (sec): 14.60 - samples/sec: 2899.92 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 18:06:59,258 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:06:59,258 EPOCH 7 done: loss 0.0229 - lr: 0.000017 |
|
2023-10-17 18:07:00,727 DEV : loss 0.14095118641853333 - f1-score (micro avg) 0.7478 |
|
2023-10-17 18:07:00,731 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:07:02,124 epoch 8 - iter 14/146 - loss 0.01576638 - time (sec): 1.39 - samples/sec: 2988.30 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:07:03,698 epoch 8 - iter 28/146 - loss 0.01550301 - time (sec): 2.97 - samples/sec: 2868.29 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:07:05,126 epoch 8 - iter 42/146 - loss 0.01623872 - time (sec): 4.39 - samples/sec: 2890.47 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:07:06,652 epoch 8 - iter 56/146 - loss 0.02023556 - time (sec): 5.92 - samples/sec: 2938.03 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:07:08,136 epoch 8 - iter 70/146 - loss 0.02022101 - time (sec): 7.40 - samples/sec: 2935.66 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:07:09,637 epoch 8 - iter 84/146 - loss 0.01985932 - time (sec): 8.90 - samples/sec: 2943.32 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:07:11,362 epoch 8 - iter 98/146 - loss 0.01851297 - time (sec): 10.63 - samples/sec: 2911.68 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 18:07:12,649 epoch 8 - iter 112/146 - loss 0.01853413 - time (sec): 11.92 - samples/sec: 2887.11 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 18:07:14,247 epoch 8 - iter 126/146 - loss 0.01856705 - time (sec): 13.52 - samples/sec: 2907.94 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:07:15,590 epoch 8 - iter 140/146 - loss 0.01763127 - time (sec): 14.86 - samples/sec: 2898.42 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:07:16,079 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:07:16,079 EPOCH 8 done: loss 0.0176 - lr: 0.000012 |
|
2023-10-17 18:07:17,329 DEV : loss 0.1616245061159134 - f1-score (micro avg) 0.7654 |
|
2023-10-17 18:07:17,334 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:07:18,671 epoch 9 - iter 14/146 - loss 0.01183606 - time (sec): 1.34 - samples/sec: 2853.15 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:07:20,453 epoch 9 - iter 28/146 - loss 0.01570372 - time (sec): 3.12 - samples/sec: 2805.11 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:07:22,240 epoch 9 - iter 42/146 - loss 0.01929006 - time (sec): 4.90 - samples/sec: 2875.05 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:07:23,638 epoch 9 - iter 56/146 - loss 0.01748759 - time (sec): 6.30 - samples/sec: 2857.79 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:07:24,850 epoch 9 - iter 70/146 - loss 0.01629176 - time (sec): 7.52 - samples/sec: 2892.31 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:07:26,312 epoch 9 - iter 84/146 - loss 0.01521756 - time (sec): 8.98 - samples/sec: 2916.25 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:07:27,669 epoch 9 - iter 98/146 - loss 0.01433705 - time (sec): 10.33 - samples/sec: 2895.91 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:07:29,090 epoch 9 - iter 112/146 - loss 0.01425475 - time (sec): 11.76 - samples/sec: 2951.20 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 18:07:30,475 epoch 9 - iter 126/146 - loss 0.01361620 - time (sec): 13.14 - samples/sec: 2915.46 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 18:07:31,749 epoch 9 - iter 140/146 - loss 0.01297694 - time (sec): 14.41 - samples/sec: 2921.88 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:07:32,406 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:07:32,406 EPOCH 9 done: loss 0.0124 - lr: 0.000006 |
|
2023-10-17 18:07:33,632 DEV : loss 0.1573966145515442 - f1-score (micro avg) 0.773 |
|
2023-10-17 18:07:33,637 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:07:35,439 epoch 10 - iter 14/146 - loss 0.01626036 - time (sec): 1.80 - samples/sec: 2789.02 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:07:36,974 epoch 10 - iter 28/146 - loss 0.01146527 - time (sec): 3.34 - samples/sec: 2861.72 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:07:38,303 epoch 10 - iter 42/146 - loss 0.01070628 - time (sec): 4.67 - samples/sec: 2915.67 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:07:39,553 epoch 10 - iter 56/146 - loss 0.00897110 - time (sec): 5.91 - samples/sec: 2970.08 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:07:40,942 epoch 10 - iter 70/146 - loss 0.00880247 - time (sec): 7.30 - samples/sec: 2932.76 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:07:42,486 epoch 10 - iter 84/146 - loss 0.00928154 - time (sec): 8.85 - samples/sec: 2866.28 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:07:43,982 epoch 10 - iter 98/146 - loss 0.00880370 - time (sec): 10.34 - samples/sec: 2874.41 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:07:45,598 epoch 10 - iter 112/146 - loss 0.00874673 - time (sec): 11.96 - samples/sec: 2885.33 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:07:46,803 epoch 10 - iter 126/146 - loss 0.00921825 - time (sec): 13.16 - samples/sec: 2915.35 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 18:07:48,198 epoch 10 - iter 140/146 - loss 0.00968192 - time (sec): 14.56 - samples/sec: 2915.90 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 18:07:49,005 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:07:49,005 EPOCH 10 done: loss 0.0093 - lr: 0.000000 |
|
2023-10-17 18:07:50,352 DEV : loss 0.1613503247499466 - f1-score (micro avg) 0.7775 |
|
2023-10-17 18:07:50,357 saving best model |
|
2023-10-17 18:07:51,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:07:51,157 Loading model from best epoch ... |
|
2023-10-17 18:07:52,769 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-17 18:07:55,187 |
|
Results: |
|
- F-score (micro) 0.7562 |
|
- F-score (macro) 0.6535 |
|
- Accuracy 0.6287 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         PER     0.8229    0.8678    0.8448       348 |
|
         LOC     0.6350    0.8199    0.7157       261 |
|
         ORG     0.4400    0.4231    0.4314        52 |
|
   HumanProd     0.6087    0.6364    0.6222        22 |
|
|
|
   micro avg     0.7104    0.8082    0.7562       683 |
|
   macro avg     0.6266    0.6868    0.6535       683 |
|
weighted avg     0.7150    0.8082    0.7568       683 |
|
|
|
2023-10-17 18:07:55,187 ---------------------------------------------------------------------------------------------------- |
|
|