|
2023-10-17 18:15:25,593 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:25,594 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 18:15:25,594 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:25,594 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-17 18:15:25,594 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:25,594 Train: 1166 sentences |
|
2023-10-17 18:15:25,594 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 18:15:25,594 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:25,594 Training Params: |
|
2023-10-17 18:15:25,594 - learning_rate: "3e-05" |
|
2023-10-17 18:15:25,594 - mini_batch_size: "8" |
|
2023-10-17 18:15:25,594 - max_epochs: "10" |
|
2023-10-17 18:15:25,594 - shuffle: "True" |
|
2023-10-17 18:15:25,594 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:25,595 Plugins: |
|
2023-10-17 18:15:25,595 - TensorboardLogger |
|
2023-10-17 18:15:25,595 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 18:15:25,595 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:25,595 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 18:15:25,595 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 18:15:25,595 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:25,595 Computation: |
|
2023-10-17 18:15:25,595 - compute on device: cuda:0 |
|
2023-10-17 18:15:25,595 - embedding storage: none |
|
2023-10-17 18:15:25,595 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:25,595 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-17 18:15:25,595 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:25,595 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:25,595 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 18:15:27,065 epoch 1 - iter 14/146 - loss 3.61689160 - time (sec): 1.47 - samples/sec: 3134.32 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:15:28,697 epoch 1 - iter 28/146 - loss 3.45583081 - time (sec): 3.10 - samples/sec: 2937.46 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:15:30,233 epoch 1 - iter 42/146 - loss 3.03740792 - time (sec): 4.64 - samples/sec: 2855.93 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:15:31,892 epoch 1 - iter 56/146 - loss 2.50761450 - time (sec): 6.30 - samples/sec: 2892.85 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:15:33,437 epoch 1 - iter 70/146 - loss 2.08084143 - time (sec): 7.84 - samples/sec: 2944.69 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:15:34,911 epoch 1 - iter 84/146 - loss 1.83799606 - time (sec): 9.32 - samples/sec: 2926.52 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 18:15:36,193 epoch 1 - iter 98/146 - loss 1.68258983 - time (sec): 10.60 - samples/sec: 2958.89 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:15:37,405 epoch 1 - iter 112/146 - loss 1.54466778 - time (sec): 11.81 - samples/sec: 2975.35 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 18:15:38,570 epoch 1 - iter 126/146 - loss 1.43607685 - time (sec): 12.97 - samples/sec: 2973.29 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:15:40,140 epoch 1 - iter 140/146 - loss 1.32061406 - time (sec): 14.54 - samples/sec: 2952.12 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:15:40,645 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:40,646 EPOCH 1 done: loss 1.2846 - lr: 0.000029 |
|
2023-10-17 18:15:41,637 DEV : loss 0.19552753865718842 - f1-score (micro avg) 0.4402 |
|
2023-10-17 18:15:41,642 saving best model |
|
2023-10-17 18:15:41,975 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:43,448 epoch 2 - iter 14/146 - loss 0.25210580 - time (sec): 1.47 - samples/sec: 3201.40 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 18:15:44,863 epoch 2 - iter 28/146 - loss 0.23091153 - time (sec): 2.89 - samples/sec: 3036.34 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:15:46,028 epoch 2 - iter 42/146 - loss 0.25363766 - time (sec): 4.05 - samples/sec: 3121.72 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:15:47,571 epoch 2 - iter 56/146 - loss 0.25203999 - time (sec): 5.60 - samples/sec: 3007.28 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 18:15:49,098 epoch 2 - iter 70/146 - loss 0.23974587 - time (sec): 7.12 - samples/sec: 3000.34 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 18:15:50,767 epoch 2 - iter 84/146 - loss 0.23286681 - time (sec): 8.79 - samples/sec: 2998.47 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 18:15:52,419 epoch 2 - iter 98/146 - loss 0.21984282 - time (sec): 10.44 - samples/sec: 3007.83 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 18:15:53,712 epoch 2 - iter 112/146 - loss 0.22405276 - time (sec): 11.74 - samples/sec: 3031.68 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 18:15:55,192 epoch 2 - iter 126/146 - loss 0.21982674 - time (sec): 13.22 - samples/sec: 2965.80 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 18:15:56,565 epoch 2 - iter 140/146 - loss 0.21373885 - time (sec): 14.59 - samples/sec: 2968.47 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 18:15:56,974 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:15:56,975 EPOCH 2 done: loss 0.2122 - lr: 0.000027 |
|
2023-10-17 18:15:58,208 DEV : loss 0.13967962563037872 - f1-score (micro avg) 0.6198 |
|
2023-10-17 18:15:58,215 saving best model |
|
2023-10-17 18:15:58,640 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:16:00,122 epoch 3 - iter 14/146 - loss 0.11131063 - time (sec): 1.47 - samples/sec: 2930.17 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:16:01,582 epoch 3 - iter 28/146 - loss 0.13050366 - time (sec): 2.93 - samples/sec: 3052.87 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:16:02,892 epoch 3 - iter 42/146 - loss 0.12162901 - time (sec): 4.24 - samples/sec: 2912.72 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 18:16:04,557 epoch 3 - iter 56/146 - loss 0.11964735 - time (sec): 5.91 - samples/sec: 2946.82 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:16:06,006 epoch 3 - iter 70/146 - loss 0.11783808 - time (sec): 7.36 - samples/sec: 2983.56 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:16:07,147 epoch 3 - iter 84/146 - loss 0.11516116 - time (sec): 8.50 - samples/sec: 2982.07 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 18:16:08,617 epoch 3 - iter 98/146 - loss 0.11251443 - time (sec): 9.97 - samples/sec: 2996.69 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:16:10,335 epoch 3 - iter 112/146 - loss 0.12177305 - time (sec): 11.69 - samples/sec: 2921.42 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:16:11,812 epoch 3 - iter 126/146 - loss 0.12094707 - time (sec): 13.17 - samples/sec: 2920.99 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:16:13,234 epoch 3 - iter 140/146 - loss 0.12211446 - time (sec): 14.59 - samples/sec: 2919.44 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 18:16:13,860 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:16:13,860 EPOCH 3 done: loss 0.1230 - lr: 0.000024 |
|
2023-10-17 18:16:15,099 DEV : loss 0.12319271266460419 - f1-score (micro avg) 0.7021 |
|
2023-10-17 18:16:15,103 saving best model |
|
2023-10-17 18:16:15,516 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:16:16,910 epoch 4 - iter 14/146 - loss 0.07446918 - time (sec): 1.39 - samples/sec: 3126.83 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 18:16:18,406 epoch 4 - iter 28/146 - loss 0.07544335 - time (sec): 2.88 - samples/sec: 3098.50 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 18:16:19,573 epoch 4 - iter 42/146 - loss 0.07942405 - time (sec): 4.05 - samples/sec: 3153.97 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 18:16:21,016 epoch 4 - iter 56/146 - loss 0.08085290 - time (sec): 5.49 - samples/sec: 3103.44 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 18:16:22,472 epoch 4 - iter 70/146 - loss 0.07764363 - time (sec): 6.95 - samples/sec: 3028.09 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 18:16:23,964 epoch 4 - iter 84/146 - loss 0.07552423 - time (sec): 8.44 - samples/sec: 2977.51 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:16:25,675 epoch 4 - iter 98/146 - loss 0.08142206 - time (sec): 10.15 - samples/sec: 2955.13 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:16:27,189 epoch 4 - iter 112/146 - loss 0.08418781 - time (sec): 11.67 - samples/sec: 2965.89 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:16:28,598 epoch 4 - iter 126/146 - loss 0.08204856 - time (sec): 13.08 - samples/sec: 2939.69 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 18:16:30,075 epoch 4 - iter 140/146 - loss 0.08171260 - time (sec): 14.55 - samples/sec: 2942.50 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:16:30,568 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:16:30,568 EPOCH 4 done: loss 0.0810 - lr: 0.000020 |
|
2023-10-17 18:16:31,869 DEV : loss 0.11123495548963547 - f1-score (micro avg) 0.7265 |
|
2023-10-17 18:16:31,874 saving best model |
|
2023-10-17 18:16:32,310 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:16:33,707 epoch 5 - iter 14/146 - loss 0.07595815 - time (sec): 1.40 - samples/sec: 2719.13 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 18:16:35,138 epoch 5 - iter 28/146 - loss 0.06655941 - time (sec): 2.83 - samples/sec: 2843.09 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:16:36,720 epoch 5 - iter 42/146 - loss 0.05683304 - time (sec): 4.41 - samples/sec: 2898.57 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:16:37,969 epoch 5 - iter 56/146 - loss 0.04997737 - time (sec): 5.66 - samples/sec: 2912.50 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 18:16:39,377 epoch 5 - iter 70/146 - loss 0.05412388 - time (sec): 7.07 - samples/sec: 2958.66 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:16:40,880 epoch 5 - iter 84/146 - loss 0.05953456 - time (sec): 8.57 - samples/sec: 2958.21 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:16:42,379 epoch 5 - iter 98/146 - loss 0.05702498 - time (sec): 10.07 - samples/sec: 2960.03 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:16:43,443 epoch 5 - iter 112/146 - loss 0.05850990 - time (sec): 11.13 - samples/sec: 2970.86 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 18:16:44,906 epoch 5 - iter 126/146 - loss 0.05714029 - time (sec): 12.59 - samples/sec: 2983.70 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 18:16:46,694 epoch 5 - iter 140/146 - loss 0.05542780 - time (sec): 14.38 - samples/sec: 2963.74 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 18:16:47,334 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:16:47,334 EPOCH 5 done: loss 0.0550 - lr: 0.000017 |
|
2023-10-17 18:16:48,612 DEV : loss 0.11169999092817307 - f1-score (micro avg) 0.754 |
|
2023-10-17 18:16:48,618 saving best model |
|
2023-10-17 18:16:49,041 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:16:50,331 epoch 6 - iter 14/146 - loss 0.03957065 - time (sec): 1.28 - samples/sec: 2984.68 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:16:51,800 epoch 6 - iter 28/146 - loss 0.04961863 - time (sec): 2.75 - samples/sec: 2888.36 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:16:53,397 epoch 6 - iter 42/146 - loss 0.04313213 - time (sec): 4.35 - samples/sec: 2896.92 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 18:16:54,981 epoch 6 - iter 56/146 - loss 0.04127585 - time (sec): 5.93 - samples/sec: 2880.26 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:16:56,440 epoch 6 - iter 70/146 - loss 0.04424832 - time (sec): 7.39 - samples/sec: 2849.49 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:16:57,963 epoch 6 - iter 84/146 - loss 0.04266501 - time (sec): 8.92 - samples/sec: 2812.03 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:16:59,396 epoch 6 - iter 98/146 - loss 0.04351976 - time (sec): 10.35 - samples/sec: 2840.40 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 18:17:01,045 epoch 6 - iter 112/146 - loss 0.04419179 - time (sec): 12.00 - samples/sec: 2870.34 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:17:02,529 epoch 6 - iter 126/146 - loss 0.04318850 - time (sec): 13.48 - samples/sec: 2885.05 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:17:03,791 epoch 6 - iter 140/146 - loss 0.04286066 - time (sec): 14.74 - samples/sec: 2902.70 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 18:17:04,362 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:17:04,362 EPOCH 6 done: loss 0.0422 - lr: 0.000014 |
|
2023-10-17 18:17:05,596 DEV : loss 0.11784238368272781 - f1-score (micro avg) 0.7661 |
|
2023-10-17 18:17:05,601 saving best model |
|
2023-10-17 18:17:06,046 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:17:07,424 epoch 7 - iter 14/146 - loss 0.03385022 - time (sec): 1.37 - samples/sec: 2927.23 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 18:17:08,837 epoch 7 - iter 28/146 - loss 0.03372986 - time (sec): 2.79 - samples/sec: 3052.80 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 18:17:10,292 epoch 7 - iter 42/146 - loss 0.03560445 - time (sec): 4.24 - samples/sec: 2957.91 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:17:11,474 epoch 7 - iter 56/146 - loss 0.03370650 - time (sec): 5.42 - samples/sec: 2950.45 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:17:13,086 epoch 7 - iter 70/146 - loss 0.03515693 - time (sec): 7.03 - samples/sec: 2910.61 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:17:14,681 epoch 7 - iter 84/146 - loss 0.03267296 - time (sec): 8.63 - samples/sec: 2930.16 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 18:17:16,030 epoch 7 - iter 98/146 - loss 0.03209009 - time (sec): 9.98 - samples/sec: 2945.41 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:17:17,833 epoch 7 - iter 112/146 - loss 0.03060223 - time (sec): 11.78 - samples/sec: 2915.36 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:17:19,181 epoch 7 - iter 126/146 - loss 0.03048273 - time (sec): 13.13 - samples/sec: 2939.36 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 18:17:20,728 epoch 7 - iter 140/146 - loss 0.03009423 - time (sec): 14.68 - samples/sec: 2922.40 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:17:21,244 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:17:21,244 EPOCH 7 done: loss 0.0305 - lr: 0.000010 |
|
2023-10-17 18:17:22,487 DEV : loss 0.12970943748950958 - f1-score (micro avg) 0.7652 |
|
2023-10-17 18:17:22,491 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:17:24,101 epoch 8 - iter 14/146 - loss 0.03967823 - time (sec): 1.61 - samples/sec: 2808.92 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 18:17:25,640 epoch 8 - iter 28/146 - loss 0.04068898 - time (sec): 3.15 - samples/sec: 2938.80 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:17:27,175 epoch 8 - iter 42/146 - loss 0.03075275 - time (sec): 4.68 - samples/sec: 2926.50 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:17:28,734 epoch 8 - iter 56/146 - loss 0.02745463 - time (sec): 6.24 - samples/sec: 2938.76 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:17:30,104 epoch 8 - iter 70/146 - loss 0.02452288 - time (sec): 7.61 - samples/sec: 2947.94 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 18:17:31,279 epoch 8 - iter 84/146 - loss 0.02648615 - time (sec): 8.79 - samples/sec: 2979.13 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:17:32,802 epoch 8 - iter 98/146 - loss 0.02578929 - time (sec): 10.31 - samples/sec: 2995.43 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:17:34,106 epoch 8 - iter 112/146 - loss 0.02578146 - time (sec): 11.61 - samples/sec: 2997.10 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 18:17:35,181 epoch 8 - iter 126/146 - loss 0.02544646 - time (sec): 12.69 - samples/sec: 2999.56 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 18:17:36,877 epoch 8 - iter 140/146 - loss 0.02347594 - time (sec): 14.38 - samples/sec: 2964.33 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 18:17:37,393 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:17:37,393 EPOCH 8 done: loss 0.0231 - lr: 0.000007 |
|
2023-10-17 18:17:38,624 DEV : loss 0.12909089028835297 - f1-score (micro avg) 0.7606 |
|
2023-10-17 18:17:38,629 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:17:40,113 epoch 9 - iter 14/146 - loss 0.01610542 - time (sec): 1.48 - samples/sec: 2717.22 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:17:41,485 epoch 9 - iter 28/146 - loss 0.02338741 - time (sec): 2.86 - samples/sec: 3014.68 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:17:43,099 epoch 9 - iter 42/146 - loss 0.01898118 - time (sec): 4.47 - samples/sec: 2995.68 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:17:44,530 epoch 9 - iter 56/146 - loss 0.01556164 - time (sec): 5.90 - samples/sec: 2986.01 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 18:17:46,025 epoch 9 - iter 70/146 - loss 0.01504665 - time (sec): 7.39 - samples/sec: 2968.23 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:17:47,261 epoch 9 - iter 84/146 - loss 0.01595822 - time (sec): 8.63 - samples/sec: 2956.70 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:17:48,678 epoch 9 - iter 98/146 - loss 0.01550102 - time (sec): 10.05 - samples/sec: 2941.07 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 18:17:50,319 epoch 9 - iter 112/146 - loss 0.01652017 - time (sec): 11.69 - samples/sec: 2918.00 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:17:51,698 epoch 9 - iter 126/146 - loss 0.01700961 - time (sec): 13.07 - samples/sec: 2927.73 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:17:53,503 epoch 9 - iter 140/146 - loss 0.01918912 - time (sec): 14.87 - samples/sec: 2894.91 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 18:17:54,135 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:17:54,136 EPOCH 9 done: loss 0.0195 - lr: 0.000004 |
|
2023-10-17 18:17:55,386 DEV : loss 0.129518061876297 - f1-score (micro avg) 0.7489 |
|
2023-10-17 18:17:55,392 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:17:56,839 epoch 10 - iter 14/146 - loss 0.01275460 - time (sec): 1.45 - samples/sec: 2691.79 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:17:58,169 epoch 10 - iter 28/146 - loss 0.00987069 - time (sec): 2.78 - samples/sec: 2816.79 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:17:59,720 epoch 10 - iter 42/146 - loss 0.00991440 - time (sec): 4.33 - samples/sec: 2815.40 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 18:18:01,134 epoch 10 - iter 56/146 - loss 0.00988055 - time (sec): 5.74 - samples/sec: 2865.60 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:18:02,734 epoch 10 - iter 70/146 - loss 0.01038453 - time (sec): 7.34 - samples/sec: 2965.40 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:18:04,048 epoch 10 - iter 84/146 - loss 0.01235361 - time (sec): 8.66 - samples/sec: 2968.13 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 18:18:05,765 epoch 10 - iter 98/146 - loss 0.01293458 - time (sec): 10.37 - samples/sec: 2910.71 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 18:18:07,131 epoch 10 - iter 112/146 - loss 0.01413777 - time (sec): 11.74 - samples/sec: 2923.45 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 18:18:08,559 epoch 10 - iter 126/146 - loss 0.01605264 - time (sec): 13.17 - samples/sec: 2882.86 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 18:18:09,996 epoch 10 - iter 140/146 - loss 0.01671795 - time (sec): 14.60 - samples/sec: 2927.32 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 18:18:10,578 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:18:10,578 EPOCH 10 done: loss 0.0165 - lr: 0.000000 |
|
2023-10-17 18:18:12,075 DEV : loss 0.13104750216007233 - f1-score (micro avg) 0.7522 |
|
2023-10-17 18:18:12,423 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 18:18:12,425 Loading model from best epoch ... |
|
2023-10-17 18:18:13,828 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-17 18:18:16,268 |
|
Results: |
|
- F-score (micro) 0.7599 |
|
- F-score (macro) 0.6534 |
|
- Accuracy 0.6298 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         PER     0.8110    0.8506    0.8303       348 |
|
         LOC     0.6879    0.8276    0.7513       261 |
|
         ORG     0.3654    0.3654    0.3654        52 |
|
   HumanProd     0.6522    0.6818    0.6667        22 |
|
|
|
   micro avg     0.7241    0.7994    0.7599       683 |
|
   macro avg     0.6291    0.6813    0.6534       683 |
|
weighted avg     0.7249    0.7994    0.7594       683 |
|
|
|
2023-10-17 18:18:16,268 ---------------------------------------------------------------------------------------------------- |
|
|