2023-10-17 16:56:44,723 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,725 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
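An aside on the head dimension: the final `Linear(in_features=768, out_features=21)` matches the BIOES tag dictionary printed at evaluation time (the tag `O` plus S/B/E/I variants of the five HIPE-2020 entity types). A quick illustrative sketch (plain Python, not Flair code) of where the count 21 comes from:

```python
# Rebuild the tagger's output dictionary from the BIOES scheme.
# Entity types and prefix order are taken from the tag dictionary
# printed later in this log ("O, S-loc, B-loc, E-loc, I-loc, ...").
entity_types = ["loc", "pers", "org", "prod", "time"]
tag_dictionary = ["O"] + [
    f"{prefix}-{etype}" for etype in entity_types for prefix in ("S", "B", "E", "I")
]
print(len(tag_dictionary))  # 21, matching the linear head's out_features
```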
|
2023-10-17 16:56:44,725 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,725 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 16:56:44,725 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,725 Train: 3575 sentences
2023-10-17 16:56:44,725 (train_with_dev=False, train_with_test=False)
2023-10-17 16:56:44,726 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,726 Training Params:
2023-10-17 16:56:44,726 - learning_rate: "5e-05"
2023-10-17 16:56:44,726 - mini_batch_size: "8"
2023-10-17 16:56:44,726 - max_epochs: "10"
2023-10-17 16:56:44,726 - shuffle: "True"
2023-10-17 16:56:44,726 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,726 Plugins:
2023-10-17 16:56:44,726 - TensorboardLogger
2023-10-17 16:56:44,726 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:56:44,726 ----------------------------------------------------------------------------------------------------
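The LinearScheduler plugin with `warmup_fraction: '0.1'` ramps the learning rate up linearly over the first 10% of total optimizer steps, then decays it linearly toward zero, which is what the per-iteration `lr` values in the epochs below show. A sketch of that schedule (hypothetical helper, not Flair's actual API; assumes 447 iterations per epoch over 10 epochs, as in this run):

```python
def linear_schedule_lr(step, total_steps, base_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup followed by linear decay (sketch of the plugin's behavior)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 447 * 10  # iterations per epoch x max_epochs
print(round(linear_schedule_lr(44, total), 6))         # ~0.000005 (epoch 1, iter 44)
print(round(linear_schedule_lr(440, total), 6))        # ~0.000049 (epoch 1, iter 440)
print(round(linear_schedule_lr(447 + 440, total), 6))  # ~0.000045 (epoch 2, iter 440)
```

These reproduce the logged `lr` values to the printed precision; the `momentum: 0.000000` column is unaffected by this scheduler.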
|
2023-10-17 16:56:44,726 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:56:44,726 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:56:44,726 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,726 Computation:
2023-10-17 16:56:44,727 - compute on device: cuda:0
2023-10-17 16:56:44,727 - embedding storage: none
2023-10-17 16:56:44,727 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,727 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 16:56:44,727 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,727 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:44,727 Logging anything other than scalars to TensorBoard is currently not supported.
|
2023-10-17 16:56:48,757 epoch 1 - iter 44/447 - loss 3.30809376 - time (sec): 4.03 - samples/sec: 1917.00 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:56:53,024 epoch 1 - iter 88/447 - loss 2.13499911 - time (sec): 8.30 - samples/sec: 2028.44 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:56:57,150 epoch 1 - iter 132/447 - loss 1.59167597 - time (sec): 12.42 - samples/sec: 2057.26 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:57:01,074 epoch 1 - iter 176/447 - loss 1.29369185 - time (sec): 16.35 - samples/sec: 2065.76 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:57:05,222 epoch 1 - iter 220/447 - loss 1.10953451 - time (sec): 20.49 - samples/sec: 2060.94 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:57:09,610 epoch 1 - iter 264/447 - loss 0.95129997 - time (sec): 24.88 - samples/sec: 2085.41 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:57:13,715 epoch 1 - iter 308/447 - loss 0.86138651 - time (sec): 28.99 - samples/sec: 2081.98 - lr: 0.000034 - momentum: 0.000000
2023-10-17 16:57:17,684 epoch 1 - iter 352/447 - loss 0.78885677 - time (sec): 32.96 - samples/sec: 2073.24 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:57:21,974 epoch 1 - iter 396/447 - loss 0.72376654 - time (sec): 37.25 - samples/sec: 2071.32 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:57:25,932 epoch 1 - iter 440/447 - loss 0.67845000 - time (sec): 41.20 - samples/sec: 2065.52 - lr: 0.000049 - momentum: 0.000000
2023-10-17 16:57:26,563 ----------------------------------------------------------------------------------------------------
2023-10-17 16:57:26,564 EPOCH 1 done: loss 0.6716 - lr: 0.000049
2023-10-17 16:57:33,391 DEV : loss 0.23900346457958221 - f1-score (micro avg)  0.6082
2023-10-17 16:57:33,446 saving best model
2023-10-17 16:57:33,993 ----------------------------------------------------------------------------------------------------
2023-10-17 16:57:38,068 epoch 2 - iter 44/447 - loss 0.16398010 - time (sec): 4.07 - samples/sec: 2093.32 - lr: 0.000049 - momentum: 0.000000
2023-10-17 16:57:42,038 epoch 2 - iter 88/447 - loss 0.15526367 - time (sec): 8.04 - samples/sec: 2099.43 - lr: 0.000049 - momentum: 0.000000
2023-10-17 16:57:46,047 epoch 2 - iter 132/447 - loss 0.14942996 - time (sec): 12.05 - samples/sec: 2041.00 - lr: 0.000048 - momentum: 0.000000
2023-10-17 16:57:49,992 epoch 2 - iter 176/447 - loss 0.15411946 - time (sec): 16.00 - samples/sec: 2007.81 - lr: 0.000048 - momentum: 0.000000
2023-10-17 16:57:54,245 epoch 2 - iter 220/447 - loss 0.15333183 - time (sec): 20.25 - samples/sec: 2048.44 - lr: 0.000047 - momentum: 0.000000
2023-10-17 16:57:58,661 epoch 2 - iter 264/447 - loss 0.15730108 - time (sec): 24.67 - samples/sec: 2054.37 - lr: 0.000047 - momentum: 0.000000
2023-10-17 16:58:02,647 epoch 2 - iter 308/447 - loss 0.15903483 - time (sec): 28.65 - samples/sec: 2059.87 - lr: 0.000046 - momentum: 0.000000
2023-10-17 16:58:06,891 epoch 2 - iter 352/447 - loss 0.15414908 - time (sec): 32.90 - samples/sec: 2065.39 - lr: 0.000046 - momentum: 0.000000
2023-10-17 16:58:11,277 epoch 2 - iter 396/447 - loss 0.14780656 - time (sec): 37.28 - samples/sec: 2074.68 - lr: 0.000045 - momentum: 0.000000
2023-10-17 16:58:15,283 epoch 2 - iter 440/447 - loss 0.14671942 - time (sec): 41.29 - samples/sec: 2065.99 - lr: 0.000045 - momentum: 0.000000
2023-10-17 16:58:15,903 ----------------------------------------------------------------------------------------------------
2023-10-17 16:58:15,903 EPOCH 2 done: loss 0.1463 - lr: 0.000045
2023-10-17 16:58:27,617 DEV : loss 0.13179324567317963 - f1-score (micro avg)  0.7139
2023-10-17 16:58:27,683 saving best model
2023-10-17 16:58:29,135 ----------------------------------------------------------------------------------------------------
|
2023-10-17 16:58:33,213 epoch 3 - iter 44/447 - loss 0.09154608 - time (sec): 4.07 - samples/sec: 2108.98 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:58:37,346 epoch 3 - iter 88/447 - loss 0.08126586 - time (sec): 8.21 - samples/sec: 2077.58 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:58:41,669 epoch 3 - iter 132/447 - loss 0.07757411 - time (sec): 12.53 - samples/sec: 2078.10 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:58:45,855 epoch 3 - iter 176/447 - loss 0.07876017 - time (sec): 16.72 - samples/sec: 2033.65 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:58:50,377 epoch 3 - iter 220/447 - loss 0.08044112 - time (sec): 21.24 - samples/sec: 2013.71 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:58:54,907 epoch 3 - iter 264/447 - loss 0.08197015 - time (sec): 25.77 - samples/sec: 2008.00 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:58:59,464 epoch 3 - iter 308/447 - loss 0.07876519 - time (sec): 30.33 - samples/sec: 1981.62 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:59:03,614 epoch 3 - iter 352/447 - loss 0.07855959 - time (sec): 34.47 - samples/sec: 1988.29 - lr: 0.000040 - momentum: 0.000000
2023-10-17 16:59:07,970 epoch 3 - iter 396/447 - loss 0.07891854 - time (sec): 38.83 - samples/sec: 1994.34 - lr: 0.000040 - momentum: 0.000000
2023-10-17 16:59:12,029 epoch 3 - iter 440/447 - loss 0.07888207 - time (sec): 42.89 - samples/sec: 1990.43 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:59:12,673 ----------------------------------------------------------------------------------------------------
2023-10-17 16:59:12,673 EPOCH 3 done: loss 0.0788 - lr: 0.000039
2023-10-17 16:59:24,476 DEV : loss 0.1531600058078766 - f1-score (micro avg)  0.7569
2023-10-17 16:59:24,536 saving best model
2023-10-17 16:59:25,929 ----------------------------------------------------------------------------------------------------
2023-10-17 16:59:30,444 epoch 4 - iter 44/447 - loss 0.05990645 - time (sec): 4.51 - samples/sec: 1989.46 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:59:34,512 epoch 4 - iter 88/447 - loss 0.04887083 - time (sec): 8.58 - samples/sec: 2018.74 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:59:38,671 epoch 4 - iter 132/447 - loss 0.04855191 - time (sec): 12.74 - samples/sec: 2033.66 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:59:42,795 epoch 4 - iter 176/447 - loss 0.05102939 - time (sec): 16.86 - samples/sec: 2019.73 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:59:46,890 epoch 4 - iter 220/447 - loss 0.05332854 - time (sec): 20.96 - samples/sec: 2033.26 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:59:51,075 epoch 4 - iter 264/447 - loss 0.05472661 - time (sec): 25.14 - samples/sec: 2039.55 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:59:55,185 epoch 4 - iter 308/447 - loss 0.05514047 - time (sec): 29.25 - samples/sec: 2032.86 - lr: 0.000035 - momentum: 0.000000
2023-10-17 16:59:59,158 epoch 4 - iter 352/447 - loss 0.05271247 - time (sec): 33.22 - samples/sec: 2030.20 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:00:03,723 epoch 4 - iter 396/447 - loss 0.05290971 - time (sec): 37.79 - samples/sec: 2033.66 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:00:07,795 epoch 4 - iter 440/447 - loss 0.05333484 - time (sec): 41.86 - samples/sec: 2033.09 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:00:08,446 ----------------------------------------------------------------------------------------------------
2023-10-17 17:00:08,446 EPOCH 4 done: loss 0.0527 - lr: 0.000033
2023-10-17 17:00:19,058 DEV : loss 0.1661703735589981 - f1-score (micro avg)  0.7689
2023-10-17 17:00:19,121 saving best model
2023-10-17 17:00:19,717 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:00:23,641 epoch 5 - iter 44/447 - loss 0.02390067 - time (sec): 3.92 - samples/sec: 2091.95 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:00:27,667 epoch 5 - iter 88/447 - loss 0.03146218 - time (sec): 7.95 - samples/sec: 2154.91 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:00:32,534 epoch 5 - iter 132/447 - loss 0.03162818 - time (sec): 12.81 - samples/sec: 2086.40 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:00:36,393 epoch 5 - iter 176/447 - loss 0.02971463 - time (sec): 16.67 - samples/sec: 2064.86 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:00:40,741 epoch 5 - iter 220/447 - loss 0.02941788 - time (sec): 21.02 - samples/sec: 2073.01 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:00:45,054 epoch 5 - iter 264/447 - loss 0.03197842 - time (sec): 25.34 - samples/sec: 2059.58 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:00:49,175 epoch 5 - iter 308/447 - loss 0.03204083 - time (sec): 29.46 - samples/sec: 2046.51 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:00:53,272 epoch 5 - iter 352/447 - loss 0.03206859 - time (sec): 33.55 - samples/sec: 2034.85 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:00:57,581 epoch 5 - iter 396/447 - loss 0.03511202 - time (sec): 37.86 - samples/sec: 2033.68 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:01:01,561 epoch 5 - iter 440/447 - loss 0.03540433 - time (sec): 41.84 - samples/sec: 2035.72 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:01:02,212 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:02,212 EPOCH 5 done: loss 0.0351 - lr: 0.000028
2023-10-17 17:01:12,880 DEV : loss 0.19611674547195435 - f1-score (micro avg)  0.7828
2023-10-17 17:01:12,937 saving best model
2023-10-17 17:01:14,312 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:18,523 epoch 6 - iter 44/447 - loss 0.01715331 - time (sec): 4.21 - samples/sec: 2090.26 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:01:23,111 epoch 6 - iter 88/447 - loss 0.01976064 - time (sec): 8.79 - samples/sec: 2081.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:01:27,390 epoch 6 - iter 132/447 - loss 0.02660722 - time (sec): 13.07 - samples/sec: 2041.65 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:01:31,491 epoch 6 - iter 176/447 - loss 0.02603617 - time (sec): 17.17 - samples/sec: 1997.37 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:01:35,489 epoch 6 - iter 220/447 - loss 0.02428162 - time (sec): 21.17 - samples/sec: 1953.34 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:01:39,549 epoch 6 - iter 264/447 - loss 0.02414744 - time (sec): 25.23 - samples/sec: 1973.76 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:01:43,891 epoch 6 - iter 308/447 - loss 0.02596565 - time (sec): 29.57 - samples/sec: 1999.62 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:01:48,147 epoch 6 - iter 352/447 - loss 0.02425663 - time (sec): 33.83 - samples/sec: 1995.53 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:01:52,679 epoch 6 - iter 396/447 - loss 0.02469203 - time (sec): 38.36 - samples/sec: 2010.34 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:01:56,788 epoch 6 - iter 440/447 - loss 0.02423347 - time (sec): 42.47 - samples/sec: 2013.25 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:01:57,424 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:57,424 EPOCH 6 done: loss 0.0248 - lr: 0.000022
2023-10-17 17:02:09,220 DEV : loss 0.23869967460632324 - f1-score (micro avg)  0.7879
2023-10-17 17:02:09,276 saving best model
2023-10-17 17:02:10,678 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:02:14,967 epoch 7 - iter 44/447 - loss 0.00914659 - time (sec): 4.28 - samples/sec: 2158.31 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:02:19,119 epoch 7 - iter 88/447 - loss 0.01155494 - time (sec): 8.44 - samples/sec: 2041.34 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:02:23,516 epoch 7 - iter 132/447 - loss 0.01063806 - time (sec): 12.83 - samples/sec: 2017.80 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:02:27,519 epoch 7 - iter 176/447 - loss 0.01089153 - time (sec): 16.84 - samples/sec: 2006.83 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:02:31,541 epoch 7 - iter 220/447 - loss 0.01611888 - time (sec): 20.86 - samples/sec: 2003.44 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:02:35,594 epoch 7 - iter 264/447 - loss 0.01739927 - time (sec): 24.91 - samples/sec: 2005.94 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:02:39,858 epoch 7 - iter 308/447 - loss 0.01620851 - time (sec): 29.18 - samples/sec: 2006.53 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:02:43,914 epoch 7 - iter 352/447 - loss 0.01519289 - time (sec): 33.23 - samples/sec: 2023.06 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:02:47,996 epoch 7 - iter 396/447 - loss 0.01501461 - time (sec): 37.31 - samples/sec: 2029.97 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:02:52,301 epoch 7 - iter 440/447 - loss 0.01504242 - time (sec): 41.62 - samples/sec: 2043.92 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:02:53,038 ----------------------------------------------------------------------------------------------------
2023-10-17 17:02:53,038 EPOCH 7 done: loss 0.0150 - lr: 0.000017
2023-10-17 17:03:04,279 DEV : loss 0.235974982380867 - f1-score (micro avg)  0.7969
2023-10-17 17:03:04,335 saving best model
2023-10-17 17:03:05,741 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:09,994 epoch 8 - iter 44/447 - loss 0.00261419 - time (sec): 4.25 - samples/sec: 2028.93 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:03:14,169 epoch 8 - iter 88/447 - loss 0.00424733 - time (sec): 8.42 - samples/sec: 2005.72 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:03:18,279 epoch 8 - iter 132/447 - loss 0.00574360 - time (sec): 12.53 - samples/sec: 2043.17 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:03:22,478 epoch 8 - iter 176/447 - loss 0.00668216 - time (sec): 16.73 - samples/sec: 2019.77 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:03:26,679 epoch 8 - iter 220/447 - loss 0.00675792 - time (sec): 20.93 - samples/sec: 2005.06 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:03:30,956 epoch 8 - iter 264/447 - loss 0.00747790 - time (sec): 25.21 - samples/sec: 2017.30 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:03:35,163 epoch 8 - iter 308/447 - loss 0.00697616 - time (sec): 29.42 - samples/sec: 2020.15 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:03:39,472 epoch 8 - iter 352/447 - loss 0.00788368 - time (sec): 33.73 - samples/sec: 2006.34 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:03:43,711 epoch 8 - iter 396/447 - loss 0.00799914 - time (sec): 37.97 - samples/sec: 2010.70 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:03:48,109 epoch 8 - iter 440/447 - loss 0.00821087 - time (sec): 42.36 - samples/sec: 2011.44 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:03:48,746 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:48,746 EPOCH 8 done: loss 0.0081 - lr: 0.000011
2023-10-17 17:04:00,423 DEV : loss 0.25937923789024353 - f1-score (micro avg)  0.7981
2023-10-17 17:04:00,486 saving best model
2023-10-17 17:04:01,884 ----------------------------------------------------------------------------------------------------
|
2023-10-17 17:04:06,305 epoch 9 - iter 44/447 - loss 0.00723446 - time (sec): 4.42 - samples/sec: 2026.19 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:04:10,864 epoch 9 - iter 88/447 - loss 0.00487931 - time (sec): 8.98 - samples/sec: 2114.36 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:04:14,962 epoch 9 - iter 132/447 - loss 0.00554836 - time (sec): 13.07 - samples/sec: 2079.81 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:04:19,149 epoch 9 - iter 176/447 - loss 0.00594698 - time (sec): 17.26 - samples/sec: 2043.96 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:04:23,564 epoch 9 - iter 220/447 - loss 0.00618049 - time (sec): 21.68 - samples/sec: 2039.55 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:04:27,826 epoch 9 - iter 264/447 - loss 0.00575274 - time (sec): 25.94 - samples/sec: 2044.86 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:04:31,919 epoch 9 - iter 308/447 - loss 0.00511891 - time (sec): 30.03 - samples/sec: 2041.76 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:04:35,941 epoch 9 - iter 352/447 - loss 0.00557735 - time (sec): 34.05 - samples/sec: 2033.36 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:04:39,858 epoch 9 - iter 396/447 - loss 0.00624259 - time (sec): 37.97 - samples/sec: 2034.83 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:04:43,841 epoch 9 - iter 440/447 - loss 0.00613617 - time (sec): 41.95 - samples/sec: 2035.35 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:04:44,441 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:44,441 EPOCH 9 done: loss 0.0061 - lr: 0.000006
2023-10-17 17:04:55,640 DEV : loss 0.25230181217193604 - f1-score (micro avg)  0.8013
2023-10-17 17:04:55,695 saving best model
2023-10-17 17:04:57,150 ----------------------------------------------------------------------------------------------------
2023-10-17 17:05:01,029 epoch 10 - iter 44/447 - loss 0.00272054 - time (sec): 3.88 - samples/sec: 2204.05 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:05:05,159 epoch 10 - iter 88/447 - loss 0.00268195 - time (sec): 8.01 - samples/sec: 2094.47 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:05:08,962 epoch 10 - iter 132/447 - loss 0.00252588 - time (sec): 11.81 - samples/sec: 2083.97 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:05:13,241 epoch 10 - iter 176/447 - loss 0.00294056 - time (sec): 16.09 - samples/sec: 2082.48 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:05:17,614 epoch 10 - iter 220/447 - loss 0.00255976 - time (sec): 20.46 - samples/sec: 2064.85 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:05:22,162 epoch 10 - iter 264/447 - loss 0.00364326 - time (sec): 25.01 - samples/sec: 2055.52 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:05:26,085 epoch 10 - iter 308/447 - loss 0.00413231 - time (sec): 28.93 - samples/sec: 2051.16 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:05:30,151 epoch 10 - iter 352/447 - loss 0.00402208 - time (sec): 33.00 - samples/sec: 2070.54 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:05:34,136 epoch 10 - iter 396/447 - loss 0.00401550 - time (sec): 36.98 - samples/sec: 2073.31 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:05:38,349 epoch 10 - iter 440/447 - loss 0.00394541 - time (sec): 41.20 - samples/sec: 2067.87 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:05:39,001 ----------------------------------------------------------------------------------------------------
2023-10-17 17:05:39,001 EPOCH 10 done: loss 0.0039 - lr: 0.000000
2023-10-17 17:05:50,065 DEV : loss 0.25091075897216797 - f1-score (micro avg)  0.8008
2023-10-17 17:05:50,781 ----------------------------------------------------------------------------------------------------
2023-10-17 17:05:50,783 Loading model from best epoch ...
2023-10-17 17:05:53,439 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
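The checkpoint loaded here is the one with the highest dev micro-F1, which the "saving best model" lines track. As a sketch (dev scores copied from the DEV lines above), the selection reduces to:

```python
# Per-epoch dev micro-F1, transcribed from the DEV lines of this log.
dev_f1 = {1: 0.6082, 2: 0.7139, 3: 0.7569, 4: 0.7689, 5: 0.7828,
          6: 0.7879, 7: 0.7969, 8: 0.7981, 9: 0.8013, 10: 0.8008}
best_epoch = max(dev_f1, key=dev_f1.get)
print(best_epoch, dev_f1[best_epoch])  # epoch 9; note epoch 10 did not trigger a save
```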
|
2023-10-17 17:06:00,784
Results:
- F-score (micro) 0.769
- F-score (macro) 0.691
- Accuracy 0.645

By class:
              precision    recall  f1-score   support

         loc     0.8699    0.8523    0.8610       596
        pers     0.7037    0.7988    0.7482       333
         org     0.5156    0.5000    0.5077       132
        prod     0.5806    0.5455    0.5625        66
        time     0.7755    0.7755    0.7755        49

   micro avg     0.7610    0.7772    0.7690      1176
   macro avg     0.6891    0.6944    0.6910      1176
weighted avg     0.7629    0.7772    0.7691      1176
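The summary rows of the report are internally consistent with the per-class rows. A small cross-check (sketch; values transcribed from the table, micro F1 recomputed as the harmonic mean of the reported micro precision and recall):

```python
# label: (precision, recall, f1, support), from the "By class" table above.
per_class = {
    "loc":  (0.8699, 0.8523, 0.8610, 596),
    "pers": (0.7037, 0.7988, 0.7482, 333),
    "org":  (0.5156, 0.5000, 0.5077, 132),
    "prod": (0.5806, 0.5455, 0.5625, 66),
    "time": (0.7755, 0.7755, 0.7755, 49),
}
total_support = sum(s for *_, s in per_class.values())
# Macro average: unweighted mean of per-class F1.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
# Weighted average: support-weighted mean of per-class F1.
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support
# Micro F1: harmonic mean of the reported micro precision and recall.
micro_p, micro_r = 0.7610, 0.7772
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(total_support, round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# -> 1176 0.691 0.7691 0.769, matching the avg rows of the report
```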
|
2023-10-17 17:06:00,784 ----------------------------------------------------------------------------------------------------