2023-10-17 18:22:56,458 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:56,460 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:22:56,460 ----------------------------------------------------------------------------------------------------
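For reference, a SequenceTagger with the components printed above could be assembled in Flair roughly as follows. This is a minimal sketch, not the verbatim script used for this run: the embedding model id is inferred from the training base path further down, and the exact keyword arguments (e.g. for NER_HIPE_2022) are assumptions.

# Minimal sketch (assumed, not the exact training script for this run).
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# HIPE-2020 German corpus, matching the MultiCorpus line below (argument names assumed).
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="de")
label_dict = corpus.make_label_dictionary(label_type="ner")

# hmTEAMS historic multilingual ELECTRA discriminator; last layer only and
# first-subtoken pooling, matching "poolingfirst-layers-1" in the base path below.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Plain linear head over the 21 BIOES tags, no CRF and no RNN ("crfFalse" in the base path).
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)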
2023-10-17 18:22:56,460 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 18:22:56,460 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:56,460 Train: 3575 sentences
2023-10-17 18:22:56,460 (train_with_dev=False, train_with_test=False)
2023-10-17 18:22:56,460 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:56,460 Training Params:
2023-10-17 18:22:56,461 - learning_rate: "3e-05"
2023-10-17 18:22:56,461 - mini_batch_size: "8"
2023-10-17 18:22:56,461 - max_epochs: "10"
2023-10-17 18:22:56,461 - shuffle: "True"
2023-10-17 18:22:56,461 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:56,461 Plugins:
2023-10-17 18:22:56,461 - TensorboardLogger
2023-10-17 18:22:56,461 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:22:56,461 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:56,461 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:22:56,461 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:22:56,461 ----------------------------------------------------------------------------------------------------
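The hyperparameters above correspond roughly to a fine_tune() call like the sketch below (continuing the objects from the sketch above; keyword names and the warmup wiring are assumptions based on Flair's trainer API, and the TensorboardLogger plugin is omitted for brevity).

# Continuing the sketch above. fine_tune() uses AdamW with a linear warmup schedule,
# which is why the per-iteration lines below report momentum: 0.000000.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=3e-05,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,          # assumed keyword, matching "shuffle: True" above
    warmup_fraction=0.1,   # assumed keyword, matching the LinearScheduler plugin above
)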
2023-10-17 18:22:56,461 Computation:
2023-10-17 18:22:56,461 - compute on device: cuda:0
2023-10-17 18:22:56,461 - embedding storage: none
2023-10-17 18:22:56,462 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:56,462 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 18:22:56,462 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:56,462 ----------------------------------------------------------------------------------------------------
2023-10-17 18:22:56,462 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:23:00,459 epoch 1 - iter 44/447 - loss 3.47864138 - time (sec): 4.00 - samples/sec: 2030.88 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:23:04,581 epoch 1 - iter 88/447 - loss 2.55710229 - time (sec): 8.12 - samples/sec: 2032.52 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:23:08,886 epoch 1 - iter 132/447 - loss 1.86053232 - time (sec): 12.42 - samples/sec: 2020.82 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:23:12,858 epoch 1 - iter 176/447 - loss 1.50695410 - time (sec): 16.39 - samples/sec: 2015.45 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:23:17,209 epoch 1 - iter 220/447 - loss 1.29221222 - time (sec): 20.74 - samples/sec: 1991.95 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:23:21,488 epoch 1 - iter 264/447 - loss 1.14030675 - time (sec): 25.02 - samples/sec: 1988.13 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:23:25,570 epoch 1 - iter 308/447 - loss 1.02286371 - time (sec): 29.11 - samples/sec: 1989.96 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:23:29,803 epoch 1 - iter 352/447 - loss 0.92044886 - time (sec): 33.34 - samples/sec: 2001.84 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:23:33,922 epoch 1 - iter 396/447 - loss 0.83590367 - time (sec): 37.46 - samples/sec: 2028.50 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:23:38,350 epoch 1 - iter 440/447 - loss 0.77269581 - time (sec): 41.89 - samples/sec: 2032.07 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:23:38,985 ----------------------------------------------------------------------------------------------------
2023-10-17 18:23:38,986 EPOCH 1 done: loss 0.7632 - lr: 0.000029
2023-10-17 18:23:45,635 DEV : loss 0.1704760640859604 - f1-score (micro avg) 0.5929
2023-10-17 18:23:45,691 saving best model
2023-10-17 18:23:46,230 ----------------------------------------------------------------------------------------------------
2023-10-17 18:23:50,723 epoch 2 - iter 44/447 - loss 0.19666925 - time (sec): 4.49 - samples/sec: 2213.33 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:23:55,147 epoch 2 - iter 88/447 - loss 0.19892337 - time (sec): 8.91 - samples/sec: 2059.19 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:23:59,486 epoch 2 - iter 132/447 - loss 0.18893141 - time (sec): 13.25 - samples/sec: 1977.00 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:24:03,659 epoch 2 - iter 176/447 - loss 0.17899012 - time (sec): 17.43 - samples/sec: 1995.89 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:24:07,581 epoch 2 - iter 220/447 - loss 0.17159262 - time (sec): 21.35 - samples/sec: 2002.25 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:24:11,986 epoch 2 - iter 264/447 - loss 0.16684092 - time (sec): 25.75 - samples/sec: 2016.70 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:24:15,928 epoch 2 - iter 308/447 - loss 0.16348986 - time (sec): 29.70 - samples/sec: 2029.33 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:24:20,022 epoch 2 - iter 352/447 - loss 0.16073205 - time (sec): 33.79 - samples/sec: 2035.11 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:24:24,189 epoch 2 - iter 396/447 - loss 0.15953827 - time (sec): 37.96 - samples/sec: 2024.86 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:24:28,341 epoch 2 - iter 440/447 - loss 0.15682018 - time (sec): 42.11 - samples/sec: 2024.85 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:24:28,988 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:28,988 EPOCH 2 done: loss 0.1561 - lr: 0.000027
2023-10-17 18:24:40,469 DEV : loss 0.13275983929634094 - f1-score (micro avg) 0.7155
2023-10-17 18:24:40,525 saving best model
2023-10-17 18:24:41,904 ----------------------------------------------------------------------------------------------------
2023-10-17 18:24:45,996 epoch 3 - iter 44/447 - loss 0.10231429 - time (sec): 4.09 - samples/sec: 2070.71 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:24:50,353 epoch 3 - iter 88/447 - loss 0.09837450 - time (sec): 8.45 - samples/sec: 1960.79 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:24:54,770 epoch 3 - iter 132/447 - loss 0.08928842 - time (sec): 12.86 - samples/sec: 1997.73 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:24:59,020 epoch 3 - iter 176/447 - loss 0.08587204 - time (sec): 17.11 - samples/sec: 1990.85 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:25:03,231 epoch 3 - iter 220/447 - loss 0.08780169 - time (sec): 21.32 - samples/sec: 1994.11 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:25:07,773 epoch 3 - iter 264/447 - loss 0.08849332 - time (sec): 25.87 - samples/sec: 2014.73 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:25:11,823 epoch 3 - iter 308/447 - loss 0.08785527 - time (sec): 29.91 - samples/sec: 2011.18 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:25:15,770 epoch 3 - iter 352/447 - loss 0.08793258 - time (sec): 33.86 - samples/sec: 2015.16 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:25:19,769 epoch 3 - iter 396/447 - loss 0.08667014 - time (sec): 37.86 - samples/sec: 2022.88 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:25:24,022 epoch 3 - iter 440/447 - loss 0.08804741 - time (sec): 42.11 - samples/sec: 2022.37 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:25:24,684 ----------------------------------------------------------------------------------------------------
2023-10-17 18:25:24,684 EPOCH 3 done: loss 0.0873 - lr: 0.000023
2023-10-17 18:25:36,642 DEV : loss 0.12239837646484375 - f1-score (micro avg) 0.7448
2023-10-17 18:25:36,699 saving best model
2023-10-17 18:25:38,097 ----------------------------------------------------------------------------------------------------
2023-10-17 18:25:42,133 epoch 4 - iter 44/447 - loss 0.04228104 - time (sec): 4.03 - samples/sec: 1847.41 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:25:46,467 epoch 4 - iter 88/447 - loss 0.04192572 - time (sec): 8.37 - samples/sec: 1942.68 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:25:50,981 epoch 4 - iter 132/447 - loss 0.04579833 - time (sec): 12.88 - samples/sec: 1997.86 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:25:55,113 epoch 4 - iter 176/447 - loss 0.04637274 - time (sec): 17.01 - samples/sec: 2023.77 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:25:59,247 epoch 4 - iter 220/447 - loss 0.05171639 - time (sec): 21.15 - samples/sec: 2023.94 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:26:03,672 epoch 4 - iter 264/447 - loss 0.04922290 - time (sec): 25.57 - samples/sec: 2008.52 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:26:07,840 epoch 4 - iter 308/447 - loss 0.05116720 - time (sec): 29.74 - samples/sec: 2012.52 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:26:12,058 epoch 4 - iter 352/447 - loss 0.05417139 - time (sec): 33.96 - samples/sec: 2016.53 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:26:16,379 epoch 4 - iter 396/447 - loss 0.05412233 - time (sec): 38.28 - samples/sec: 2015.16 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:26:20,315 epoch 4 - iter 440/447 - loss 0.05304027 - time (sec): 42.21 - samples/sec: 2025.18 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:26:20,936 ----------------------------------------------------------------------------------------------------
2023-10-17 18:26:20,936 EPOCH 4 done: loss 0.0528 - lr: 0.000020
2023-10-17 18:26:32,048 DEV : loss 0.15048351883888245 - f1-score (micro avg) 0.7634
2023-10-17 18:26:32,112 saving best model
2023-10-17 18:26:33,525 ----------------------------------------------------------------------------------------------------
2023-10-17 18:26:37,619 epoch 5 - iter 44/447 - loss 0.03447358 - time (sec): 4.09 - samples/sec: 2087.76 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:26:41,725 epoch 5 - iter 88/447 - loss 0.03612779 - time (sec): 8.20 - samples/sec: 2077.75 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:26:46,105 epoch 5 - iter 132/447 - loss 0.03196480 - time (sec): 12.57 - samples/sec: 2104.83 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:26:50,102 epoch 5 - iter 176/447 - loss 0.03188731 - time (sec): 16.57 - samples/sec: 2098.67 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:26:54,032 epoch 5 - iter 220/447 - loss 0.03579741 - time (sec): 20.50 - samples/sec: 2075.87 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:26:58,319 epoch 5 - iter 264/447 - loss 0.03656570 - time (sec): 24.79 - samples/sec: 2059.65 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:27:02,597 epoch 5 - iter 308/447 - loss 0.03466670 - time (sec): 29.07 - samples/sec: 2049.26 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:27:06,755 epoch 5 - iter 352/447 - loss 0.03453502 - time (sec): 33.23 - samples/sec: 2034.28 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:27:10,873 epoch 5 - iter 396/447 - loss 0.03511712 - time (sec): 37.34 - samples/sec: 2026.61 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:27:14,974 epoch 5 - iter 440/447 - loss 0.03542620 - time (sec): 41.44 - samples/sec: 2035.88 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:27:16,386 ----------------------------------------------------------------------------------------------------
2023-10-17 18:27:16,386 EPOCH 5 done: loss 0.0350 - lr: 0.000017
2023-10-17 18:27:27,531 DEV : loss 0.1623251587152481 - f1-score (micro avg) 0.7844
2023-10-17 18:27:27,594 saving best model
2023-10-17 18:27:29,062 ----------------------------------------------------------------------------------------------------
2023-10-17 18:27:33,401 epoch 6 - iter 44/447 - loss 0.01509559 - time (sec): 4.33 - samples/sec: 2206.74 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:27:37,584 epoch 6 - iter 88/447 - loss 0.01823388 - time (sec): 8.52 - samples/sec: 2079.31 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:27:41,590 epoch 6 - iter 132/447 - loss 0.01937156 - time (sec): 12.52 - samples/sec: 2051.81 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:27:45,621 epoch 6 - iter 176/447 - loss 0.01793456 - time (sec): 16.55 - samples/sec: 2041.66 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:27:49,595 epoch 6 - iter 220/447 - loss 0.01946342 - time (sec): 20.53 - samples/sec: 2042.92 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:27:53,724 epoch 6 - iter 264/447 - loss 0.02107942 - time (sec): 24.66 - samples/sec: 2059.88 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:27:57,943 epoch 6 - iter 308/447 - loss 0.02236597 - time (sec): 28.88 - samples/sec: 2051.73 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:28:02,225 epoch 6 - iter 352/447 - loss 0.02262167 - time (sec): 33.16 - samples/sec: 2052.87 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:28:06,722 epoch 6 - iter 396/447 - loss 0.02326928 - time (sec): 37.66 - samples/sec: 2057.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:28:10,793 epoch 6 - iter 440/447 - loss 0.02302421 - time (sec): 41.73 - samples/sec: 2049.48 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:28:11,421 ----------------------------------------------------------------------------------------------------
2023-10-17 18:28:11,421 EPOCH 6 done: loss 0.0230 - lr: 0.000013
2023-10-17 18:28:22,898 DEV : loss 0.18466560542583466 - f1-score (micro avg) 0.7896
2023-10-17 18:28:22,963 saving best model
2023-10-17 18:28:24,375 ----------------------------------------------------------------------------------------------------
2023-10-17 18:28:28,484 epoch 7 - iter 44/447 - loss 0.01071602 - time (sec): 4.10 - samples/sec: 2124.81 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:28:32,529 epoch 7 - iter 88/447 - loss 0.01722305 - time (sec): 8.15 - samples/sec: 2079.46 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:28:36,648 epoch 7 - iter 132/447 - loss 0.01562625 - time (sec): 12.27 - samples/sec: 2048.27 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:28:40,896 epoch 7 - iter 176/447 - loss 0.01506616 - time (sec): 16.52 - samples/sec: 2039.77 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:28:45,030 epoch 7 - iter 220/447 - loss 0.01543355 - time (sec): 20.65 - samples/sec: 2035.10 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:28:49,087 epoch 7 - iter 264/447 - loss 0.01583832 - time (sec): 24.71 - samples/sec: 2030.05 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:28:53,199 epoch 7 - iter 308/447 - loss 0.01579137 - time (sec): 28.82 - samples/sec: 2038.85 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:28:57,215 epoch 7 - iter 352/447 - loss 0.01554866 - time (sec): 32.84 - samples/sec: 2025.54 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:29:01,841 epoch 7 - iter 396/447 - loss 0.01580312 - time (sec): 37.46 - samples/sec: 2044.11 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:29:06,083 epoch 7 - iter 440/447 - loss 0.01627817 - time (sec): 41.70 - samples/sec: 2038.44 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:29:06,720 ----------------------------------------------------------------------------------------------------
2023-10-17 18:29:06,720 EPOCH 7 done: loss 0.0160 - lr: 0.000010
2023-10-17 18:29:18,210 DEV : loss 0.1884232461452484 - f1-score (micro avg) 0.7829
2023-10-17 18:29:18,273 ----------------------------------------------------------------------------------------------------
2023-10-17 18:29:22,356 epoch 8 - iter 44/447 - loss 0.00844862 - time (sec): 4.08 - samples/sec: 1946.01 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:29:26,631 epoch 8 - iter 88/447 - loss 0.00991015 - time (sec): 8.36 - samples/sec: 1976.96 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:29:31,495 epoch 8 - iter 132/447 - loss 0.00813816 - time (sec): 13.22 - samples/sec: 2024.96 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:29:35,485 epoch 8 - iter 176/447 - loss 0.00781880 - time (sec): 17.21 - samples/sec: 2003.52 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:29:39,968 epoch 8 - iter 220/447 - loss 0.00905071 - time (sec): 21.69 - samples/sec: 1973.23 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:29:44,408 epoch 8 - iter 264/447 - loss 0.00964617 - time (sec): 26.13 - samples/sec: 1955.13 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:29:48,891 epoch 8 - iter 308/447 - loss 0.01023682 - time (sec): 30.62 - samples/sec: 1964.53 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:29:53,138 epoch 8 - iter 352/447 - loss 0.01126515 - time (sec): 34.86 - samples/sec: 1964.94 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:29:57,471 epoch 8 - iter 396/447 - loss 0.01068923 - time (sec): 39.20 - samples/sec: 1947.11 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:30:01,738 epoch 8 - iter 440/447 - loss 0.01036581 - time (sec): 43.46 - samples/sec: 1966.25 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:30:02,342 ----------------------------------------------------------------------------------------------------
2023-10-17 18:30:02,342 EPOCH 8 done: loss 0.0102 - lr: 0.000007
2023-10-17 18:30:14,071 DEV : loss 0.20807437598705292 - f1-score (micro avg) 0.8002
2023-10-17 18:30:14,141 saving best model
2023-10-17 18:30:15,594 ----------------------------------------------------------------------------------------------------
2023-10-17 18:30:19,632 epoch 9 - iter 44/447 - loss 0.01126364 - time (sec): 4.03 - samples/sec: 1945.90 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:30:23,938 epoch 9 - iter 88/447 - loss 0.00956380 - time (sec): 8.34 - samples/sec: 1948.16 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:30:28,056 epoch 9 - iter 132/447 - loss 0.00736466 - time (sec): 12.46 - samples/sec: 1977.24 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:30:32,761 epoch 9 - iter 176/447 - loss 0.00652351 - time (sec): 17.16 - samples/sec: 1987.56 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:30:36,806 epoch 9 - iter 220/447 - loss 0.00638266 - time (sec): 21.21 - samples/sec: 1966.54 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:30:40,969 epoch 9 - iter 264/447 - loss 0.00625169 - time (sec): 25.37 - samples/sec: 1963.44 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:30:45,081 epoch 9 - iter 308/447 - loss 0.00658656 - time (sec): 29.48 - samples/sec: 1972.54 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:30:49,485 epoch 9 - iter 352/447 - loss 0.00665066 - time (sec): 33.89 - samples/sec: 1988.90 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:30:54,117 epoch 9 - iter 396/447 - loss 0.00700973 - time (sec): 38.52 - samples/sec: 1987.54 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:30:58,353 epoch 9 - iter 440/447 - loss 0.00692549 - time (sec): 42.75 - samples/sec: 1996.72 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:30:58,976 ----------------------------------------------------------------------------------------------------
2023-10-17 18:30:58,977 EPOCH 9 done: loss 0.0069 - lr: 0.000003
2023-10-17 18:31:10,510 DEV : loss 0.20884786546230316 - f1-score (micro avg) 0.8053
2023-10-17 18:31:10,590 saving best model
2023-10-17 18:31:12,128 ----------------------------------------------------------------------------------------------------
2023-10-17 18:31:16,504 epoch 10 - iter 44/447 - loss 0.00304434 - time (sec): 4.37 - samples/sec: 2074.87 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:31:21,479 epoch 10 - iter 88/447 - loss 0.00250309 - time (sec): 9.35 - samples/sec: 2027.70 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:31:25,867 epoch 10 - iter 132/447 - loss 0.00262760 - time (sec): 13.73 - samples/sec: 1988.83 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:31:29,989 epoch 10 - iter 176/447 - loss 0.00262202 - time (sec): 17.86 - samples/sec: 2011.63 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:31:34,089 epoch 10 - iter 220/447 - loss 0.00294347 - time (sec): 21.96 - samples/sec: 2004.03 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:31:38,444 epoch 10 - iter 264/447 - loss 0.00372192 - time (sec): 26.31 - samples/sec: 1990.88 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:31:42,924 epoch 10 - iter 308/447 - loss 0.00443006 - time (sec): 30.79 - samples/sec: 1961.30 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:31:47,821 epoch 10 - iter 352/447 - loss 0.00455511 - time (sec): 35.69 - samples/sec: 1910.52 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:31:52,601 epoch 10 - iter 396/447 - loss 0.00430018 - time (sec): 40.47 - samples/sec: 1890.24 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:31:56,738 epoch 10 - iter 440/447 - loss 0.00448107 - time (sec): 44.60 - samples/sec: 1909.00 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:31:57,429 ----------------------------------------------------------------------------------------------------
2023-10-17 18:31:57,429 EPOCH 10 done: loss 0.0044 - lr: 0.000000
2023-10-17 18:32:08,477 DEV : loss 0.2179129272699356 - f1-score (micro avg) 0.8045
2023-10-17 18:32:09,127 ----------------------------------------------------------------------------------------------------
2023-10-17 18:32:09,130 Loading model from best epoch ...
2023-10-17 18:32:11,833 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-17 18:32:19,302
Results:
- F-score (micro) 0.7693
- F-score (macro) 0.706
- Accuracy 0.6448
By class:
              precision    recall  f1-score   support

         loc     0.8632    0.8473    0.8552       596
        pers     0.6787    0.7928    0.7313       333
         org     0.5500    0.5833    0.5662       132
        prod     0.6885    0.6364    0.6614        66
        time     0.7391    0.6939    0.7158        49

   micro avg     0.7551    0.7840    0.7693      1176
   macro avg     0.7039    0.7107    0.7060      1176
weighted avg     0.7608    0.7840    0.7710      1176
2023-10-17 18:32:19,303 ----------------------------------------------------------------------------------------------------
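After training, the saved checkpoint can be used for tagging roughly as follows (a sketch; the checkpoint path is the local best-model.pt inside the base path logged above, and the example sentence is a made-up placeholder).

# Sketch: load the best checkpoint saved above and tag a sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("Der Gemeinderat von Zürich tagte gestern im Rathaus .")
tagger.predict(sentence)

# The BIOES tags listed above are decoded into entity spans (loc, pers, org, prod, time).
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, round(span.get_label("ner").score, 4))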