|
2023-10-25 20:57:45,050 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:45,051 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(64001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-25 20:57:45,051 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:45,051 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-25 20:57:45,051 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:45,051 Train: 1166 sentences |
|
2023-10-25 20:57:45,051 (train_with_dev=False, train_with_test=False) |
|
2023-10-25 20:57:45,051 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:45,051 Training Params: |
|
2023-10-25 20:57:45,051 - learning_rate: "5e-05" |
|
2023-10-25 20:57:45,051 - mini_batch_size: "8" |
|
2023-10-25 20:57:45,051 - max_epochs: "10" |
|
2023-10-25 20:57:45,051 - shuffle: "True" |
|
2023-10-25 20:57:45,051 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:45,051 Plugins: |
|
2023-10-25 20:57:45,051 - TensorboardLogger |
|
2023-10-25 20:57:45,051 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-25 20:57:45,051 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:45,051 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-25 20:57:45,051 - metric: "('micro avg', 'f1-score')" |
|
2023-10-25 20:57:45,051 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:45,052 Computation: |
|
2023-10-25 20:57:45,052 - compute on device: cuda:0 |
|
2023-10-25 20:57:45,052 - embedding storage: none |
|
2023-10-25 20:57:45,052 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:45,052 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-25 20:57:45,052 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:45,052 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:45,052 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-25 20:57:45,830 epoch 1 - iter 14/146 - loss 3.01622847 - time (sec): 0.78 - samples/sec: 4751.19 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 20:57:46,820 epoch 1 - iter 28/146 - loss 2.22477731 - time (sec): 1.77 - samples/sec: 4730.67 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 20:57:47,986 epoch 1 - iter 42/146 - loss 1.61094234 - time (sec): 2.93 - samples/sec: 4669.33 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 20:57:48,920 epoch 1 - iter 56/146 - loss 1.36453122 - time (sec): 3.87 - samples/sec: 4681.34 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 20:57:49,784 epoch 1 - iter 70/146 - loss 1.18628181 - time (sec): 4.73 - samples/sec: 4724.75 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 20:57:50,749 epoch 1 - iter 84/146 - loss 1.04111618 - time (sec): 5.70 - samples/sec: 4734.07 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 20:57:51,569 epoch 1 - iter 98/146 - loss 0.95230138 - time (sec): 6.52 - samples/sec: 4705.80 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-25 20:57:52,336 epoch 1 - iter 112/146 - loss 0.88580683 - time (sec): 7.28 - samples/sec: 4691.89 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-25 20:57:53,173 epoch 1 - iter 126/146 - loss 0.82747503 - time (sec): 8.12 - samples/sec: 4669.42 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-25 20:57:54,064 epoch 1 - iter 140/146 - loss 0.77136082 - time (sec): 9.01 - samples/sec: 4629.27 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-25 20:57:54,587 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:54,588 EPOCH 1 done: loss 0.7478 - lr: 0.000048 |
|
2023-10-25 20:57:55,254 DEV : loss 0.1681840568780899 - f1-score (micro avg) 0.5124 |
|
2023-10-25 20:57:55,258 saving best model |
|
2023-10-25 20:57:55,724 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:57:56,621 epoch 2 - iter 14/146 - loss 0.20128551 - time (sec): 0.90 - samples/sec: 4563.17 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-25 20:57:57,445 epoch 2 - iter 28/146 - loss 0.20146634 - time (sec): 1.72 - samples/sec: 4541.90 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-25 20:57:58,360 epoch 2 - iter 42/146 - loss 0.17940294 - time (sec): 2.63 - samples/sec: 4542.16 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-25 20:57:59,324 epoch 2 - iter 56/146 - loss 0.17263735 - time (sec): 3.60 - samples/sec: 4578.63 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-25 20:58:00,282 epoch 2 - iter 70/146 - loss 0.16408791 - time (sec): 4.56 - samples/sec: 4539.43 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-25 20:58:01,100 epoch 2 - iter 84/146 - loss 0.16825407 - time (sec): 5.37 - samples/sec: 4527.82 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-25 20:58:01,982 epoch 2 - iter 98/146 - loss 0.17462443 - time (sec): 6.26 - samples/sec: 4519.57 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-25 20:58:02,872 epoch 2 - iter 112/146 - loss 0.17533292 - time (sec): 7.15 - samples/sec: 4507.14 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-25 20:58:03,920 epoch 2 - iter 126/146 - loss 0.17354615 - time (sec): 8.19 - samples/sec: 4578.78 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-25 20:58:04,812 epoch 2 - iter 140/146 - loss 0.16374339 - time (sec): 9.09 - samples/sec: 4679.53 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-25 20:58:05,196 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:58:05,196 EPOCH 2 done: loss 0.1616 - lr: 0.000045 |
|
2023-10-25 20:58:06,105 DEV : loss 0.11024896055459976 - f1-score (micro avg) 0.682 |
|
2023-10-25 20:58:06,110 saving best model |
|
2023-10-25 20:58:06,722 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:58:07,698 epoch 3 - iter 14/146 - loss 0.63961016 - time (sec): 0.97 - samples/sec: 4852.67 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-25 20:58:08,656 epoch 3 - iter 28/146 - loss 0.35982451 - time (sec): 1.93 - samples/sec: 4830.06 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-25 20:58:09,621 epoch 3 - iter 42/146 - loss 0.26511020 - time (sec): 2.90 - samples/sec: 4682.42 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-25 20:58:10,686 epoch 3 - iter 56/146 - loss 0.22289918 - time (sec): 3.96 - samples/sec: 4397.11 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-25 20:58:11,506 epoch 3 - iter 70/146 - loss 0.19749367 - time (sec): 4.78 - samples/sec: 4428.40 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-25 20:58:12,273 epoch 3 - iter 84/146 - loss 0.17871944 - time (sec): 5.55 - samples/sec: 4475.12 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-25 20:58:13,143 epoch 3 - iter 98/146 - loss 0.16277166 - time (sec): 6.42 - samples/sec: 4557.86 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-25 20:58:14,048 epoch 3 - iter 112/146 - loss 0.15065108 - time (sec): 7.32 - samples/sec: 4587.59 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-25 20:58:14,988 epoch 3 - iter 126/146 - loss 0.14336333 - time (sec): 8.26 - samples/sec: 4661.00 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-25 20:58:15,926 epoch 3 - iter 140/146 - loss 0.13551970 - time (sec): 9.20 - samples/sec: 4615.07 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-25 20:58:16,350 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:58:16,350 EPOCH 3 done: loss 0.1345 - lr: 0.000039 |
|
2023-10-25 20:58:17,262 DEV : loss 0.11921205371618271 - f1-score (micro avg) 0.7597 |
|
2023-10-25 20:58:17,266 saving best model |
|
2023-10-25 20:58:17,746 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:58:18,649 epoch 4 - iter 14/146 - loss 0.10516627 - time (sec): 0.90 - samples/sec: 4576.84 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-25 20:58:19,517 epoch 4 - iter 28/146 - loss 0.07107744 - time (sec): 1.77 - samples/sec: 4923.37 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-25 20:58:20,388 epoch 4 - iter 42/146 - loss 0.06235583 - time (sec): 2.64 - samples/sec: 4755.79 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-25 20:58:21,254 epoch 4 - iter 56/146 - loss 0.05881476 - time (sec): 3.51 - samples/sec: 4755.40 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-25 20:58:22,257 epoch 4 - iter 70/146 - loss 0.05872437 - time (sec): 4.51 - samples/sec: 4726.52 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-25 20:58:23,161 epoch 4 - iter 84/146 - loss 0.05638929 - time (sec): 5.41 - samples/sec: 4665.22 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-25 20:58:23,967 epoch 4 - iter 98/146 - loss 0.05876076 - time (sec): 6.22 - samples/sec: 4726.33 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-25 20:58:24,995 epoch 4 - iter 112/146 - loss 0.05938172 - time (sec): 7.25 - samples/sec: 4730.39 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-25 20:58:25,817 epoch 4 - iter 126/146 - loss 0.05658355 - time (sec): 8.07 - samples/sec: 4768.04 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-25 20:58:26,711 epoch 4 - iter 140/146 - loss 0.05586036 - time (sec): 8.96 - samples/sec: 4756.48 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-25 20:58:27,068 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:58:27,068 EPOCH 4 done: loss 0.0544 - lr: 0.000034 |
|
2023-10-25 20:58:27,984 DEV : loss 0.12713664770126343 - f1-score (micro avg) 0.7342 |
|
2023-10-25 20:58:27,988 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:58:28,752 epoch 5 - iter 14/146 - loss 0.02468832 - time (sec): 0.76 - samples/sec: 4653.36 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-25 20:58:29,736 epoch 5 - iter 28/146 - loss 0.02918576 - time (sec): 1.75 - samples/sec: 4582.50 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-25 20:58:30,628 epoch 5 - iter 42/146 - loss 0.02943205 - time (sec): 2.64 - samples/sec: 4764.19 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-25 20:58:31,539 epoch 5 - iter 56/146 - loss 0.03396994 - time (sec): 3.55 - samples/sec: 4835.17 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-25 20:58:32,304 epoch 5 - iter 70/146 - loss 0.03515634 - time (sec): 4.31 - samples/sec: 4808.79 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-25 20:58:33,274 epoch 5 - iter 84/146 - loss 0.03735341 - time (sec): 5.28 - samples/sec: 4747.26 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 20:58:34,265 epoch 5 - iter 98/146 - loss 0.03861920 - time (sec): 6.28 - samples/sec: 4763.08 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 20:58:35,213 epoch 5 - iter 112/146 - loss 0.03553686 - time (sec): 7.22 - samples/sec: 4752.47 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 20:58:36,085 epoch 5 - iter 126/146 - loss 0.03369232 - time (sec): 8.10 - samples/sec: 4762.98 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 20:58:36,989 epoch 5 - iter 140/146 - loss 0.03255683 - time (sec): 9.00 - samples/sec: 4784.99 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 20:58:37,319 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:58:37,320 EPOCH 5 done: loss 0.0329 - lr: 0.000028 |
|
2023-10-25 20:58:38,235 DEV : loss 0.13207438588142395 - f1-score (micro avg) 0.7431 |
|
2023-10-25 20:58:38,239 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:58:39,294 epoch 6 - iter 14/146 - loss 0.01695521 - time (sec): 1.05 - samples/sec: 4951.89 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 20:58:40,188 epoch 6 - iter 28/146 - loss 0.02311012 - time (sec): 1.95 - samples/sec: 4724.81 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 20:58:41,100 epoch 6 - iter 42/146 - loss 0.02372267 - time (sec): 2.86 - samples/sec: 4704.55 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 20:58:42,262 epoch 6 - iter 56/146 - loss 0.02395132 - time (sec): 4.02 - samples/sec: 4535.45 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 20:58:43,220 epoch 6 - iter 70/146 - loss 0.02718854 - time (sec): 4.98 - samples/sec: 4455.30 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 20:58:44,250 epoch 6 - iter 84/146 - loss 0.02461396 - time (sec): 6.01 - samples/sec: 4453.46 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 20:58:45,088 epoch 6 - iter 98/146 - loss 0.02420481 - time (sec): 6.85 - samples/sec: 4489.02 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 20:58:45,977 epoch 6 - iter 112/146 - loss 0.02314700 - time (sec): 7.74 - samples/sec: 4520.31 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 20:58:46,885 epoch 6 - iter 126/146 - loss 0.02490428 - time (sec): 8.64 - samples/sec: 4577.35 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 20:58:47,670 epoch 6 - iter 140/146 - loss 0.02435065 - time (sec): 9.43 - samples/sec: 4547.67 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 20:58:48,021 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:58:48,021 EPOCH 6 done: loss 0.0237 - lr: 0.000023 |
|
2023-10-25 20:58:48,942 DEV : loss 0.144234299659729 - f1-score (micro avg) 0.7811 |
|
2023-10-25 20:58:48,946 saving best model |
|
2023-10-25 20:58:49,550 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:58:50,465 epoch 7 - iter 14/146 - loss 0.02913716 - time (sec): 0.91 - samples/sec: 4483.73 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 20:58:51,312 epoch 7 - iter 28/146 - loss 0.01908361 - time (sec): 1.76 - samples/sec: 4764.35 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 20:58:52,484 epoch 7 - iter 42/146 - loss 0.01671570 - time (sec): 2.93 - samples/sec: 4581.70 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 20:58:53,411 epoch 7 - iter 56/146 - loss 0.01520409 - time (sec): 3.86 - samples/sec: 4584.57 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 20:58:54,229 epoch 7 - iter 70/146 - loss 0.01433610 - time (sec): 4.67 - samples/sec: 4661.75 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 20:58:55,224 epoch 7 - iter 84/146 - loss 0.01388687 - time (sec): 5.67 - samples/sec: 4643.06 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 20:58:56,081 epoch 7 - iter 98/146 - loss 0.01544967 - time (sec): 6.53 - samples/sec: 4644.60 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 20:58:56,956 epoch 7 - iter 112/146 - loss 0.01505549 - time (sec): 7.40 - samples/sec: 4623.87 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 20:58:57,784 epoch 7 - iter 126/146 - loss 0.01601917 - time (sec): 8.23 - samples/sec: 4633.05 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 20:58:58,651 epoch 7 - iter 140/146 - loss 0.01602666 - time (sec): 9.10 - samples/sec: 4709.87 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 20:58:59,008 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:58:59,009 EPOCH 7 done: loss 0.0161 - lr: 0.000017 |
|
2023-10-25 20:58:59,927 DEV : loss 0.1582878828048706 - f1-score (micro avg) 0.7458 |
|
2023-10-25 20:58:59,932 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:59:00,830 epoch 8 - iter 14/146 - loss 0.00430692 - time (sec): 0.90 - samples/sec: 4845.41 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 20:59:01,630 epoch 8 - iter 28/146 - loss 0.00918028 - time (sec): 1.70 - samples/sec: 4999.65 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 20:59:02,471 epoch 8 - iter 42/146 - loss 0.01094742 - time (sec): 2.54 - samples/sec: 4764.20 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 20:59:03,356 epoch 8 - iter 56/146 - loss 0.01000401 - time (sec): 3.42 - samples/sec: 4687.03 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 20:59:04,189 epoch 8 - iter 70/146 - loss 0.01022771 - time (sec): 4.26 - samples/sec: 4830.38 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 20:59:05,109 epoch 8 - iter 84/146 - loss 0.01016162 - time (sec): 5.18 - samples/sec: 4856.82 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 20:59:05,996 epoch 8 - iter 98/146 - loss 0.01079623 - time (sec): 6.06 - samples/sec: 4861.19 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 20:59:06,993 epoch 8 - iter 112/146 - loss 0.00990041 - time (sec): 7.06 - samples/sec: 4791.22 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 20:59:08,025 epoch 8 - iter 126/146 - loss 0.00935641 - time (sec): 8.09 - samples/sec: 4816.07 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 20:59:08,886 epoch 8 - iter 140/146 - loss 0.00928960 - time (sec): 8.95 - samples/sec: 4776.76 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 20:59:09,321 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:59:09,321 EPOCH 8 done: loss 0.0107 - lr: 0.000012 |
|
2023-10-25 20:59:10,237 DEV : loss 0.17408005893230438 - f1-score (micro avg) 0.7442 |
|
2023-10-25 20:59:10,241 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:59:11,090 epoch 9 - iter 14/146 - loss 0.00897464 - time (sec): 0.85 - samples/sec: 4664.55 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 20:59:12,014 epoch 9 - iter 28/146 - loss 0.01043103 - time (sec): 1.77 - samples/sec: 4651.90 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 20:59:12,849 epoch 9 - iter 42/146 - loss 0.01054132 - time (sec): 2.61 - samples/sec: 4743.05 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 20:59:13,741 epoch 9 - iter 56/146 - loss 0.00869160 - time (sec): 3.50 - samples/sec: 4620.65 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 20:59:14,641 epoch 9 - iter 70/146 - loss 0.00705254 - time (sec): 4.40 - samples/sec: 4619.49 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 20:59:15,607 epoch 9 - iter 84/146 - loss 0.00677544 - time (sec): 5.36 - samples/sec: 4692.36 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 20:59:16,540 epoch 9 - iter 98/146 - loss 0.00713812 - time (sec): 6.30 - samples/sec: 4716.11 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 20:59:17,422 epoch 9 - iter 112/146 - loss 0.01059142 - time (sec): 7.18 - samples/sec: 4714.65 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 20:59:18,563 epoch 9 - iter 126/146 - loss 0.00958536 - time (sec): 8.32 - samples/sec: 4608.11 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 20:59:19,488 epoch 9 - iter 140/146 - loss 0.01016471 - time (sec): 9.25 - samples/sec: 4617.10 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 20:59:19,835 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:59:19,835 EPOCH 9 done: loss 0.0098 - lr: 0.000006 |
|
2023-10-25 20:59:20,751 DEV : loss 0.1856423020362854 - f1-score (micro avg) 0.7553 |
|
2023-10-25 20:59:20,755 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:59:21,541 epoch 10 - iter 14/146 - loss 0.00410973 - time (sec): 0.78 - samples/sec: 4832.34 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 20:59:22,346 epoch 10 - iter 28/146 - loss 0.00363689 - time (sec): 1.59 - samples/sec: 4706.27 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 20:59:23,198 epoch 10 - iter 42/146 - loss 0.00418360 - time (sec): 2.44 - samples/sec: 4664.27 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 20:59:24,273 epoch 10 - iter 56/146 - loss 0.00693018 - time (sec): 3.52 - samples/sec: 4809.94 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 20:59:25,255 epoch 10 - iter 70/146 - loss 0.00605639 - time (sec): 4.50 - samples/sec: 4743.21 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 20:59:26,236 epoch 10 - iter 84/146 - loss 0.00595401 - time (sec): 5.48 - samples/sec: 4846.88 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 20:59:27,182 epoch 10 - iter 98/146 - loss 0.00546547 - time (sec): 6.43 - samples/sec: 4870.19 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 20:59:27,970 epoch 10 - iter 112/146 - loss 0.00649974 - time (sec): 7.21 - samples/sec: 4852.31 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 20:59:28,775 epoch 10 - iter 126/146 - loss 0.00591971 - time (sec): 8.02 - samples/sec: 4824.36 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 20:59:29,578 epoch 10 - iter 140/146 - loss 0.00621200 - time (sec): 8.82 - samples/sec: 4858.39 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-25 20:59:29,909 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:59:29,910 EPOCH 10 done: loss 0.0063 - lr: 0.000000 |
|
2023-10-25 20:59:30,826 DEV : loss 0.18521195650100708 - f1-score (micro avg) 0.7611 |
|
2023-10-25 20:59:31,305 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 20:59:31,306 Loading model from best epoch ... |
|
2023-10-25 20:59:32,867 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-25 20:59:34,407 |
|
Results: |
|
- F-score (micro) 0.7399 |
|
- F-score (macro) 0.6807 |
|
- Accuracy 0.6134 |
|
|
|
By class: |
|
             precision    recall  f1-score   support |
|
|
|
         PER    0.7945    0.8333    0.8135       348 |
|
         LOC    0.6228    0.8161    0.7065       261 |
|
         ORG    0.4203    0.5577    0.4793        52 |
|
   HumanProd    0.6800    0.7727    0.7234        22 |
|
|
|
   micro avg    0.6854    0.8038    0.7399       683 |
|
   macro avg    0.6294    0.7450    0.6807       683 |
|
weighted avg    0.6967    0.8038    0.7442       683 |
|
|
|
2023-10-25 20:59:34,407 ---------------------------------------------------------------------------------------------------- |
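The summary rows of the report above can be cross-checked from the per-class rows. A minimal sketch, assuming the usual definitions: F1 as the harmonic mean of precision and recall, macro average as the unweighted mean over classes, and weighted average as the support-weighted mean. The helper names are hypothetical, and the inputs are the four-decimal values copied from the log, so agreement is only expected to about three decimals:

```python
def f1(precision, recall):
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, f1-score, support) copied from the "By class" report.
BY_CLASS = {
    "PER": (0.7945, 0.8333, 0.8135, 348),
    "LOC": (0.6228, 0.8161, 0.7065, 261),
    "ORG": (0.4203, 0.5577, 0.4793, 52),
    "HumanProd": (0.6800, 0.7727, 0.7234, 22),
}


def macro_f1(rows):
    # Unweighted mean of the per-class F1 scores.
    return sum(r[2] for r in rows.values()) / len(rows)


def weighted_f1(rows):
    # Mean of the per-class F1 scores, weighted by class support.
    total = sum(r[3] for r in rows.values())
    return sum(r[2] * r[3] for r in rows.values()) / total


if __name__ == "__main__":
    print(f"micro F1 from micro P/R: {f1(0.6854, 0.8038):.4f}")
    print(f"macro F1:                {macro_f1(BY_CLASS):.4f}")
    print(f"weighted F1:             {weighted_f1(BY_CLASS):.4f}")
```

The gap between the micro and macro scores comes mostly from ORG, which has the lowest F1 but only 52 of the 683 gold entities.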
|
|