|
2023-10-17 18:18:51,727 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:18:51,728 Model: "SequenceTagger( |
|
  (embeddings): TransformerWordEmbeddings( |
|
    (model): ElectraModel( |
|
      (embeddings): ElectraEmbeddings( |
|
        (word_embeddings): Embedding(32001, 768) |
|
        (position_embeddings): Embedding(512, 768) |
|
        (token_type_embeddings): Embedding(2, 768) |
|
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
        (dropout): Dropout(p=0.1, inplace=False) |
|
      ) |
|
      (encoder): ElectraEncoder( |
|
        (layer): ModuleList( |
|
          (0-11): 12 x ElectraLayer( |
|
            (attention): ElectraAttention( |
|
              (self): ElectraSelfAttention( |
|
                (query): Linear(in_features=768, out_features=768, bias=True) |
|
                (key): Linear(in_features=768, out_features=768, bias=True) |
|
                (value): Linear(in_features=768, out_features=768, bias=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
              (output): ElectraSelfOutput( |
|
                (dense): Linear(in_features=768, out_features=768, bias=True) |
|
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
            ) |
|
            (intermediate): ElectraIntermediate( |
|
              (dense): Linear(in_features=768, out_features=3072, bias=True) |
|
              (intermediate_act_fn): GELUActivation() |
|
            ) |
|
            (output): ElectraOutput( |
|
              (dense): Linear(in_features=3072, out_features=768, bias=True) |
|
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
              (dropout): Dropout(p=0.1, inplace=False) |
|
            ) |
|
          ) |
|
        ) |
|
      ) |
|
    ) |
|
  ) |
|
  (locked_dropout): LockedDropout(p=0.5) |
|
  (linear): Linear(in_features=768, out_features=17, bias=True) |
|
  (loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 18:18:51,728 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:18:51,728 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-17 18:18:51,728 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:18:51,728 Train: 1166 sentences |
|
2023-10-17 18:18:51,728 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 18:18:51,728 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:18:51,728 Training Params: |
|
2023-10-17 18:18:51,728 - learning_rate: "5e-05" |
|
2023-10-17 18:18:51,728 - mini_batch_size: "8" |
|
2023-10-17 18:18:51,728 - max_epochs: "10" |
|
2023-10-17 18:18:51,728 - shuffle: "True" |
|
2023-10-17 18:18:51,728 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:18:51,728 Plugins: |
|
2023-10-17 18:18:51,728 - TensorboardLogger |
|
2023-10-17 18:18:51,728 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 18:18:51,728 ---------------------------------------------------------------------------------------------------- |
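The training parameters and the LinearScheduler plugin listed above imply a learning rate that rises linearly over the first 10% of optimizer steps and then falls linearly to zero. The following is a minimal sketch of that shape, not Flair's own code. It assumes one scheduler step per mini-batch and the usual linear warmup/decay formula; the helper name `linear_lr` is hypothetical.

```python
import math

# Values copied from the log above.
TRAIN_SENTENCES = 1166
MINI_BATCH_SIZE = 8
MAX_EPOCHS = 10
PEAK_LR = 5e-05
WARMUP_FRACTION = 0.1

# Assumption: one scheduler step per mini-batch, last partial batch included.
steps_per_epoch = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
total_steps = steps_per_epoch * MAX_EPOCHS
warmup_steps = round(total_steps * WARMUP_FRACTION)


def linear_lr(step: int) -> float:
    """Hypothetical helper: learning rate after `step` optimizer steps."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / warmup_steps
    # Linear decay from the peak down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, warmup_steps // 2, warmup_steps, total_steps):
        print(step, f"{linear_lr(step):.6f}")
```

Under these assumptions there are 146 steps per epoch, so warmup ends exactly at the end of epoch 1. That is consistent with the `iter x/146` counters and with the `lr` column rising through epoch 1 and falling from epoch 2 onward.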
|
2023-10-17 18:18:51,728 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 18:18:51,728 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 18:18:51,729 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:18:51,729 Computation: |
|
2023-10-17 18:18:51,729 - compute on device: cuda:0 |
|
2023-10-17 18:18:51,729 - embedding storage: none |
|
2023-10-17 18:18:51,729 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:18:51,729 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-17 18:18:51,729 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:18:51,729 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:18:51,729 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 18:18:53,207 epoch 1 - iter 14/146 - loss 3.57138140 - time (sec): 1.48 - samples/sec: 3117.02 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:18:54,839 epoch 1 - iter 28/146 - loss 3.25629001 - time (sec): 3.11 - samples/sec: 2929.60 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:18:56,377 epoch 1 - iter 42/146 - loss 2.62329353 - time (sec): 4.65 - samples/sec: 2849.91 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:18:58,017 epoch 1 - iter 56/146 - loss 2.16516736 - time (sec): 6.29 - samples/sec: 2897.17 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:18:59,544 epoch 1 - iter 70/146 - loss 1.79573783 - time (sec): 7.81 - samples/sec: 2954.85 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:19:00,998 epoch 1 - iter 84/146 - loss 1.58428127 - time (sec): 9.27 - samples/sec: 2941.43 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 18:19:02,279 epoch 1 - iter 98/146 - loss 1.44725030 - time (sec): 10.55 - samples/sec: 2972.42 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 18:19:03,479 epoch 1 - iter 112/146 - loss 1.32663289 - time (sec): 11.75 - samples/sec: 2990.30 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 18:19:04,639 epoch 1 - iter 126/146 - loss 1.23401294 - time (sec): 12.91 - samples/sec: 2988.08 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 18:19:06,192 epoch 1 - iter 140/146 - loss 1.13546776 - time (sec): 14.46 - samples/sec: 2968.79 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 18:19:06,709 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:19:06,709 EPOCH 1 done: loss 1.1051 - lr: 0.000048 |
|
2023-10-17 18:19:07,780 DEV : loss 0.18029391765594482 - f1-score (micro avg) 0.4518 |
|
2023-10-17 18:19:07,786 saving best model |
|
2023-10-17 18:19:08,120 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:19:09,578 epoch 2 - iter 14/146 - loss 0.20642889 - time (sec): 1.46 - samples/sec: 3236.63 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-17 18:19:10,987 epoch 2 - iter 28/146 - loss 0.19816193 - time (sec): 2.87 - samples/sec: 3059.66 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 18:19:12,153 epoch 2 - iter 42/146 - loss 0.21577244 - time (sec): 4.03 - samples/sec: 3137.94 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 18:19:13,699 epoch 2 - iter 56/146 - loss 0.21590787 - time (sec): 5.58 - samples/sec: 3016.92 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 18:19:15,252 epoch 2 - iter 70/146 - loss 0.20769861 - time (sec): 7.13 - samples/sec: 2997.00 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 18:19:16,922 epoch 2 - iter 84/146 - loss 0.19984801 - time (sec): 8.80 - samples/sec: 2995.60 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 18:19:18,566 epoch 2 - iter 98/146 - loss 0.19155294 - time (sec): 10.44 - samples/sec: 3007.57 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 18:19:19,863 epoch 2 - iter 112/146 - loss 0.19859158 - time (sec): 11.74 - samples/sec: 3030.45 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 18:19:21,351 epoch 2 - iter 126/146 - loss 0.19709102 - time (sec): 13.23 - samples/sec: 2962.73 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 18:19:22,731 epoch 2 - iter 140/146 - loss 0.19273806 - time (sec): 14.61 - samples/sec: 2964.24 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 18:19:23,139 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:19:23,139 EPOCH 2 done: loss 0.1917 - lr: 0.000045 |
|
2023-10-17 18:19:24,422 DEV : loss 0.1203632652759552 - f1-score (micro avg) 0.6582 |
|
2023-10-17 18:19:24,428 saving best model |
|
2023-10-17 18:19:24,853 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:19:26,517 epoch 3 - iter 14/146 - loss 0.09267755 - time (sec): 1.66 - samples/sec: 2600.35 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 18:19:27,981 epoch 3 - iter 28/146 - loss 0.10326544 - time (sec): 3.13 - samples/sec: 2865.97 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 18:19:29,298 epoch 3 - iter 42/146 - loss 0.09818320 - time (sec): 4.44 - samples/sec: 2782.45 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 18:19:30,952 epoch 3 - iter 56/146 - loss 0.09912457 - time (sec): 6.10 - samples/sec: 2856.56 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 18:19:32,383 epoch 3 - iter 70/146 - loss 0.09709361 - time (sec): 7.53 - samples/sec: 2916.81 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 18:19:33,511 epoch 3 - iter 84/146 - loss 0.09349095 - time (sec): 8.66 - samples/sec: 2928.46 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 18:19:34,943 epoch 3 - iter 98/146 - loss 0.09138427 - time (sec): 10.09 - samples/sec: 2961.60 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 18:19:36,615 epoch 3 - iter 112/146 - loss 0.10036149 - time (sec): 11.76 - samples/sec: 2903.60 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 18:19:38,076 epoch 3 - iter 126/146 - loss 0.09911074 - time (sec): 13.22 - samples/sec: 2908.58 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 18:19:39,489 epoch 3 - iter 140/146 - loss 0.10141433 - time (sec): 14.63 - samples/sec: 2910.08 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 18:19:40,109 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:19:40,109 EPOCH 3 done: loss 0.1021 - lr: 0.000039 |
|
2023-10-17 18:19:41,334 DEV : loss 0.11388804763555527 - f1-score (micro avg) 0.6987 |
|
2023-10-17 18:19:41,339 saving best model |
|
2023-10-17 18:19:41,750 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:19:43,144 epoch 4 - iter 14/146 - loss 0.06414953 - time (sec): 1.39 - samples/sec: 3119.81 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 18:19:44,622 epoch 4 - iter 28/146 - loss 0.05727491 - time (sec): 2.87 - samples/sec: 3113.91 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 18:19:45,777 epoch 4 - iter 42/146 - loss 0.05337460 - time (sec): 4.02 - samples/sec: 3175.45 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 18:19:47,224 epoch 4 - iter 56/146 - loss 0.05571035 - time (sec): 5.47 - samples/sec: 3116.51 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 18:19:48,677 epoch 4 - iter 70/146 - loss 0.05504218 - time (sec): 6.92 - samples/sec: 3039.65 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 18:19:49,994 epoch 4 - iter 84/146 - loss 0.05492070 - time (sec): 8.24 - samples/sec: 3050.17 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 18:19:51,688 epoch 4 - iter 98/146 - loss 0.06015328 - time (sec): 9.94 - samples/sec: 3020.01 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 18:19:53,192 epoch 4 - iter 112/146 - loss 0.06333164 - time (sec): 11.44 - samples/sec: 3024.92 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 18:19:54,596 epoch 4 - iter 126/146 - loss 0.06145486 - time (sec): 12.84 - samples/sec: 2992.93 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 18:19:56,082 epoch 4 - iter 140/146 - loss 0.06061764 - time (sec): 14.33 - samples/sec: 2988.54 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 18:19:56,587 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:19:56,587 EPOCH 4 done: loss 0.0598 - lr: 0.000034 |
|
2023-10-17 18:19:57,821 DEV : loss 0.1203216090798378 - f1-score (micro avg) 0.7059 |
|
2023-10-17 18:19:57,827 saving best model |
|
2023-10-17 18:19:58,254 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:19:59,666 epoch 5 - iter 14/146 - loss 0.06225692 - time (sec): 1.41 - samples/sec: 2691.92 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 18:20:01,084 epoch 5 - iter 28/146 - loss 0.04994899 - time (sec): 2.83 - samples/sec: 2840.72 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 18:20:02,658 epoch 5 - iter 42/146 - loss 0.04348932 - time (sec): 4.40 - samples/sec: 2902.82 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 18:20:03,902 epoch 5 - iter 56/146 - loss 0.03584489 - time (sec): 5.65 - samples/sec: 2918.20 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 18:20:05,298 epoch 5 - iter 70/146 - loss 0.04017369 - time (sec): 7.04 - samples/sec: 2968.42 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 18:20:06,800 epoch 5 - iter 84/146 - loss 0.04557808 - time (sec): 8.54 - samples/sec: 2966.33 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 18:20:08,284 epoch 5 - iter 98/146 - loss 0.04297787 - time (sec): 10.03 - samples/sec: 2971.64 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 18:20:09,323 epoch 5 - iter 112/146 - loss 0.04523185 - time (sec): 11.07 - samples/sec: 2988.29 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:20:10,752 epoch 5 - iter 126/146 - loss 0.04441842 - time (sec): 12.50 - samples/sec: 3007.23 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:20:12,525 epoch 5 - iter 140/146 - loss 0.04365021 - time (sec): 14.27 - samples/sec: 2987.44 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 18:20:13,147 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:20:13,148 EPOCH 5 done: loss 0.0434 - lr: 0.000028 |
|
2023-10-17 18:20:14,430 DEV : loss 0.13626918196678162 - f1-score (micro avg) 0.7626 |
|
2023-10-17 18:20:14,435 saving best model |
|
2023-10-17 18:20:14,859 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:20:16,178 epoch 6 - iter 14/146 - loss 0.02567778 - time (sec): 1.32 - samples/sec: 2909.87 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 18:20:17,659 epoch 6 - iter 28/146 - loss 0.02978228 - time (sec): 2.80 - samples/sec: 2840.82 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 18:20:19,453 epoch 6 - iter 42/146 - loss 0.02585818 - time (sec): 4.59 - samples/sec: 2744.48 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:20:21,013 epoch 6 - iter 56/146 - loss 0.02827968 - time (sec): 6.15 - samples/sec: 2777.90 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:20:22,447 epoch 6 - iter 70/146 - loss 0.03036843 - time (sec): 7.59 - samples/sec: 2776.72 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:20:23,931 epoch 6 - iter 84/146 - loss 0.02907844 - time (sec): 9.07 - samples/sec: 2764.46 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:20:25,337 epoch 6 - iter 98/146 - loss 0.02934355 - time (sec): 10.48 - samples/sec: 2805.93 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:20:26,986 epoch 6 - iter 112/146 - loss 0.03072343 - time (sec): 12.13 - samples/sec: 2840.23 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:20:28,501 epoch 6 - iter 126/146 - loss 0.02984132 - time (sec): 13.64 - samples/sec: 2851.61 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 18:20:29,795 epoch 6 - iter 140/146 - loss 0.02945338 - time (sec): 14.93 - samples/sec: 2865.59 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 18:20:30,379 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:20:30,379 EPOCH 6 done: loss 0.0291 - lr: 0.000023 |
|
2023-10-17 18:20:31,674 DEV : loss 0.12430983781814575 - f1-score (micro avg) 0.7601 |
|
2023-10-17 18:20:31,685 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:20:33,162 epoch 7 - iter 14/146 - loss 0.01936975 - time (sec): 1.48 - samples/sec: 2723.21 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 18:20:34,672 epoch 7 - iter 28/146 - loss 0.02306047 - time (sec): 2.99 - samples/sec: 2848.46 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:20:36,124 epoch 7 - iter 42/146 - loss 0.02391709 - time (sec): 4.44 - samples/sec: 2826.28 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:20:37,323 epoch 7 - iter 56/146 - loss 0.02142364 - time (sec): 5.64 - samples/sec: 2838.38 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:20:38,950 epoch 7 - iter 70/146 - loss 0.02362173 - time (sec): 7.26 - samples/sec: 2818.86 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:20:40,530 epoch 7 - iter 84/146 - loss 0.02178736 - time (sec): 8.84 - samples/sec: 2859.09 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:20:41,874 epoch 7 - iter 98/146 - loss 0.02161645 - time (sec): 10.19 - samples/sec: 2885.03 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:20:43,483 epoch 7 - iter 112/146 - loss 0.01989179 - time (sec): 11.80 - samples/sec: 2911.61 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:20:44,838 epoch 7 - iter 126/146 - loss 0.02011158 - time (sec): 13.15 - samples/sec: 2934.47 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:20:46,397 epoch 7 - iter 140/146 - loss 0.02190589 - time (sec): 14.71 - samples/sec: 2915.69 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 18:20:46,912 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:20:46,912 EPOCH 7 done: loss 0.0220 - lr: 0.000017 |
|
2023-10-17 18:20:48,161 DEV : loss 0.13426262140274048 - f1-score (micro avg) 0.7555 |
|
2023-10-17 18:20:48,166 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:20:49,790 epoch 8 - iter 14/146 - loss 0.00931042 - time (sec): 1.62 - samples/sec: 2783.08 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:20:51,345 epoch 8 - iter 28/146 - loss 0.01739007 - time (sec): 3.18 - samples/sec: 2910.73 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:20:52,902 epoch 8 - iter 42/146 - loss 0.01451596 - time (sec): 4.74 - samples/sec: 2893.79 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:20:54,489 epoch 8 - iter 56/146 - loss 0.01415561 - time (sec): 6.32 - samples/sec: 2901.64 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:20:55,872 epoch 8 - iter 70/146 - loss 0.01323646 - time (sec): 7.71 - samples/sec: 2911.93 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:20:57,064 epoch 8 - iter 84/146 - loss 0.01398395 - time (sec): 8.90 - samples/sec: 2942.06 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:20:58,563 epoch 8 - iter 98/146 - loss 0.01498906 - time (sec): 10.40 - samples/sec: 2970.57 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 18:20:59,831 epoch 8 - iter 112/146 - loss 0.01619772 - time (sec): 11.66 - samples/sec: 2984.01 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 18:21:00,910 epoch 8 - iter 126/146 - loss 0.01546532 - time (sec): 12.74 - samples/sec: 2986.70 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:21:02,605 epoch 8 - iter 140/146 - loss 0.01469667 - time (sec): 14.44 - samples/sec: 2953.15 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:21:03,120 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:21:03,120 EPOCH 8 done: loss 0.0145 - lr: 0.000012 |
|
2023-10-17 18:21:04,469 DEV : loss 0.1528475284576416 - f1-score (micro avg) 0.7964 |
|
2023-10-17 18:21:04,477 saving best model |
|
2023-10-17 18:21:04,986 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:21:06,462 epoch 9 - iter 14/146 - loss 0.00654971 - time (sec): 1.47 - samples/sec: 2731.72 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:21:07,829 epoch 9 - iter 28/146 - loss 0.01042476 - time (sec): 2.84 - samples/sec: 3029.21 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:21:09,456 epoch 9 - iter 42/146 - loss 0.00979821 - time (sec): 4.47 - samples/sec: 2995.46 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:21:10,877 epoch 9 - iter 56/146 - loss 0.00776655 - time (sec): 5.89 - samples/sec: 2991.33 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:21:12,371 epoch 9 - iter 70/146 - loss 0.00765477 - time (sec): 7.38 - samples/sec: 2972.77 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:21:13,842 epoch 9 - iter 84/146 - loss 0.00853630 - time (sec): 8.85 - samples/sec: 2881.96 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:21:15,287 epoch 9 - iter 98/146 - loss 0.00833710 - time (sec): 10.30 - samples/sec: 2869.04 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:21:16,932 epoch 9 - iter 112/146 - loss 0.00889373 - time (sec): 11.94 - samples/sec: 2855.52 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 18:21:18,337 epoch 9 - iter 126/146 - loss 0.00916008 - time (sec): 13.35 - samples/sec: 2866.00 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 18:21:20,144 epoch 9 - iter 140/146 - loss 0.01040999 - time (sec): 15.16 - samples/sec: 2840.72 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:21:20,785 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:21:20,785 EPOCH 9 done: loss 0.0102 - lr: 0.000006 |
|
2023-10-17 18:21:22,122 DEV : loss 0.15759235620498657 - f1-score (micro avg) 0.7729 |
|
2023-10-17 18:21:22,129 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:21:23,627 epoch 10 - iter 14/146 - loss 0.00461926 - time (sec): 1.50 - samples/sec: 2600.98 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:21:24,914 epoch 10 - iter 28/146 - loss 0.00522148 - time (sec): 2.78 - samples/sec: 2808.60 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:21:26,424 epoch 10 - iter 42/146 - loss 0.00457207 - time (sec): 4.29 - samples/sec: 2837.86 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:21:27,827 epoch 10 - iter 56/146 - loss 0.00463493 - time (sec): 5.70 - samples/sec: 2888.41 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:21:29,449 epoch 10 - iter 70/146 - loss 0.00433431 - time (sec): 7.32 - samples/sec: 2974.36 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:21:30,782 epoch 10 - iter 84/146 - loss 0.00576813 - time (sec): 8.65 - samples/sec: 2969.54 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:21:32,528 epoch 10 - iter 98/146 - loss 0.00654355 - time (sec): 10.40 - samples/sec: 2903.48 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:21:33,908 epoch 10 - iter 112/146 - loss 0.00753927 - time (sec): 11.78 - samples/sec: 2913.78 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:21:35,323 epoch 10 - iter 126/146 - loss 0.00754727 - time (sec): 13.19 - samples/sec: 2877.14 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 18:21:36,771 epoch 10 - iter 140/146 - loss 0.00764366 - time (sec): 14.64 - samples/sec: 2919.96 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 18:21:37,361 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:21:37,362 EPOCH 10 done: loss 0.0075 - lr: 0.000000 |
|
2023-10-17 18:21:38,630 DEV : loss 0.16067492961883545 - f1-score (micro avg) 0.7768 |
|
2023-10-17 18:21:38,964 ---------------------------------------------------------------------------------------------------- |
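The "saving best model" lines above appear only after epochs 1, 2, 3, 4, 5 and 8. A minimal sketch of a selection rule that reproduces this pattern from the logged dev scores follows. The strict greater-than comparison is an assumption and is not taken from the trainer's source.

```python
# Dev micro-F1 per epoch, copied from the DEV lines of the log above.
dev_f1 = [0.4518, 0.6582, 0.6987, 0.7059, 0.7626,
          0.7601, 0.7555, 0.7964, 0.7729, 0.7768]

best_score = float("-inf")
saved_epochs = []
for epoch, score in enumerate(dev_f1, start=1):
    # Assumption: a checkpoint is written only on a strict improvement.
    if score > best_score:
        best_score = score
        saved_epochs.append(epoch)

best_epoch = saved_epochs[-1]
print(saved_epochs, best_epoch, best_score)
```

With this rule, the checkpoint reloaded for the final test evaluation is the one from epoch 8 (dev micro-F1 0.7964), not the one from the last epoch.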
|
2023-10-17 18:21:38,965 Loading model from best epoch ... |
|
2023-10-17 18:21:40,331 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
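The 17 tags listed above are the outside tag `O` plus four span-position prefixes (S, B, E, I) for each of the four entity types. This matches `out_features=17` on the tagger's final linear layer. A small sketch that rebuilds the list in the logged order (the variable names are illustrative only):

```python
# Entity types and prefixes in the order they appear in the logged dictionary.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
prefixes = ["S", "B", "E", "I"]  # single, begin, end, inside

# "O" first, then every prefix for each entity type.
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in prefixes]

print(len(tags))
print(", ".join(tags))
```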
|
2023-10-17 18:21:42,849 |
|
Results: |
|
- F-score (micro) 0.7645 |
|
- F-score (macro) 0.669 |
|
- Accuracy 0.6352 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|

|
         PER     0.8232    0.8563    0.8394       348 |
|
         LOC     0.6585    0.8276    0.7334       261 |
|
         ORG     0.5116    0.4231    0.4632        52 |
|
   HumanProd     0.5714    0.7273    0.6400        22 |
|

|
   micro avg     0.7254    0.8082    0.7645       683 |
|
   macro avg     0.6412    0.7086    0.6690       683 |
|
weighted avg     0.7284    0.8082    0.7639       683 |
|
|
|
2023-10-17 18:21:42,849 ---------------------------------------------------------------------------------------------------- |
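The micro and macro averages in the report above can be cross-checked from the per-class rows. The sketch below recovers integer span counts by rounding, because the report prints only four decimals. Those counts are an inference and do not appear in the log. The reported Accuracy figure is not covered here.

```python
# (precision, recall, support) per class, copied from the report above.
report = {
    "PER": (0.8232, 0.8563, 348),
    "LOC": (0.6585, 0.8276, 261),
    "ORG": (0.5116, 0.4231, 52),
    "HumanProd": (0.5714, 0.7273, 22),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Assumption: true-positive and predicted-span counts are recovered by rounding.
counts = {}
for label, (precision, recall, support) in report.items():
    true_pos = round(recall * support)
    predicted = round(true_pos / precision)
    counts[label] = (true_pos, predicted, support)

total_tp = sum(tp for tp, _, _ in counts.values())
total_pred = sum(pred for _, pred, _ in counts.values())
total_gold = sum(gold for _, _, gold in counts.values())

# Micro average: pool the counts over all classes, then compute the scores.
micro_p = total_tp / total_pred
micro_r = total_tp / total_gold
micro_f1 = f1(micro_p, micro_r)

# Macro average: compute F1 per class, then take the unweighted mean.
macro_f1 = sum(f1(tp / pred, tp / gold) for tp, pred, gold in counts.values()) / len(counts)

print(f"micro P={micro_p:.4f} R={micro_r:.4f} F1={micro_f1:.4f}")
print(f"macro F1={macro_f1:.4f}")
```

The gap between the micro and macro F-scores comes from the two small classes (ORG and HumanProd): the macro average weights them equally with PER and LOC, while the micro average is dominated by the larger classes.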
|
|