2023-10-07 02:50:46,978 ----------------------------------------------------------------------------------------------------
2023-10-07 02:50:46,979 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-07 02:50:46,979 ----------------------------------------------------------------------------------------------------
2023-10-07 02:50:46,979 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-07 02:50:46,979 ----------------------------------------------------------------------------------------------------
2023-10-07 02:50:46,979 Train: 1100 sentences
2023-10-07 02:50:46,980 (train_with_dev=False, train_with_test=False)
2023-10-07 02:50:46,980 ----------------------------------------------------------------------------------------------------
2023-10-07 02:50:46,980 Training Params:
2023-10-07 02:50:46,980 - learning_rate: "0.00016"
2023-10-07 02:50:46,980 - mini_batch_size: "8"
2023-10-07 02:50:46,980 - max_epochs: "10"
2023-10-07 02:50:46,980 - shuffle: "True"
2023-10-07 02:50:46,980 ----------------------------------------------------------------------------------------------------
2023-10-07 02:50:46,980 Plugins:
2023-10-07 02:50:46,980 - TensorboardLogger
2023-10-07 02:50:46,980 - LinearScheduler | warmup_fraction: '0.1'
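The lr column in the iteration lines below follows this schedule: a linear ramp to the peak learning rate over the first 10% of steps, then a linear decay to zero. A minimal sketch of that curve, assuming 138 iterations/epoch x 10 epochs = 1380 optimizer steps and peak lr 0.00016 (the function name and step conventions are illustrative, not Flair's exact LinearScheduler implementation):

```python
# Sketch of a linear warmup + linear decay learning-rate schedule.
# Assumption: this mirrors the shape of Flair's LinearScheduler plugin,
# not its exact code; off-by-one step conventions may differ from the log.

def linear_lr(step: int, peak_lr: float, total_steps: int, warmup_fraction: float) -> float:
    """Learning rate at a 0-based optimizer step."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup: linear ramp from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay: linear descent from peak_lr down to 0 over the remaining steps.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# This run: 138 iterations/epoch x 10 epochs, warmup_fraction 0.1 -> 138 warmup steps.
TOTAL = 138 * 10
print(linear_lr(13, 0.00016, TOTAL, 0.1))   # early in epoch 1, still warming up
print(linear_lr(138, 0.00016, TOTAL, 0.1))  # end of warmup: peak lr
print(linear_lr(700, 0.00016, TOTAL, 0.1))  # mid-training, decaying
```

With these assumptions the curve tracks the logged lr values: rising through epoch 1 (0.000014 → 0.000150), peaking near 0.00016, then decaying to ~0.000002 by the final iterations of epoch 10.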
2023-10-07 02:50:46,980 ----------------------------------------------------------------------------------------------------
2023-10-07 02:50:46,980 Final evaluation on model from best epoch (best-model.pt)
2023-10-07 02:50:46,980 - metric: "('micro avg', 'f1-score')"
2023-10-07 02:50:46,980 ----------------------------------------------------------------------------------------------------
2023-10-07 02:50:46,980 Computation:
2023-10-07 02:50:46,980 - compute on device: cuda:0
2023-10-07 02:50:46,980 - embedding storage: none
2023-10-07 02:50:46,980 ----------------------------------------------------------------------------------------------------
2023-10-07 02:50:46,980 Model training base path: "hmbench-ajmc/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5"
2023-10-07 02:50:46,981 ----------------------------------------------------------------------------------------------------
2023-10-07 02:50:46,981 ----------------------------------------------------------------------------------------------------
2023-10-07 02:50:46,981 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-07 02:50:56,610 epoch 1 - iter 13/138 - loss 3.23940148 - time (sec): 9.63 - samples/sec: 234.11 - lr: 0.000014 - momentum: 0.000000
2023-10-07 02:51:06,432 epoch 1 - iter 26/138 - loss 3.23247267 - time (sec): 19.45 - samples/sec: 236.35 - lr: 0.000029 - momentum: 0.000000
2023-10-07 02:51:15,816 epoch 1 - iter 39/138 - loss 3.22264854 - time (sec): 28.83 - samples/sec: 234.44 - lr: 0.000044 - momentum: 0.000000
2023-10-07 02:51:24,869 epoch 1 - iter 52/138 - loss 3.20650886 - time (sec): 37.89 - samples/sec: 233.82 - lr: 0.000059 - momentum: 0.000000
2023-10-07 02:51:34,050 epoch 1 - iter 65/138 - loss 3.17692821 - time (sec): 47.07 - samples/sec: 231.00 - lr: 0.000074 - momentum: 0.000000
2023-10-07 02:51:43,237 epoch 1 - iter 78/138 - loss 3.12659848 - time (sec): 56.25 - samples/sec: 231.78 - lr: 0.000089 - momentum: 0.000000
2023-10-07 02:51:52,645 epoch 1 - iter 91/138 - loss 3.06408818 - time (sec): 65.66 - samples/sec: 230.14 - lr: 0.000104 - momentum: 0.000000
2023-10-07 02:52:02,020 epoch 1 - iter 104/138 - loss 2.99104784 - time (sec): 75.04 - samples/sec: 228.88 - lr: 0.000119 - momentum: 0.000000
2023-10-07 02:52:11,388 epoch 1 - iter 117/138 - loss 2.90959570 - time (sec): 84.41 - samples/sec: 228.96 - lr: 0.000134 - momentum: 0.000000
2023-10-07 02:52:21,050 epoch 1 - iter 130/138 - loss 2.82199950 - time (sec): 94.07 - samples/sec: 229.74 - lr: 0.000150 - momentum: 0.000000
2023-10-07 02:52:26,362 ----------------------------------------------------------------------------------------------------
2023-10-07 02:52:26,362 EPOCH 1 done: loss 2.7733 - lr: 0.000150
2023-10-07 02:52:32,727 DEV : loss 1.7714149951934814 - f1-score (micro avg) 0.0
2023-10-07 02:52:32,732 ----------------------------------------------------------------------------------------------------
2023-10-07 02:52:41,598 epoch 2 - iter 13/138 - loss 1.74744445 - time (sec): 8.86 - samples/sec: 216.82 - lr: 0.000158 - momentum: 0.000000
2023-10-07 02:52:50,810 epoch 2 - iter 26/138 - loss 1.62447698 - time (sec): 18.08 - samples/sec: 222.99 - lr: 0.000157 - momentum: 0.000000
2023-10-07 02:53:00,319 epoch 2 - iter 39/138 - loss 1.54616233 - time (sec): 27.59 - samples/sec: 227.43 - lr: 0.000155 - momentum: 0.000000
2023-10-07 02:53:09,957 epoch 2 - iter 52/138 - loss 1.44001383 - time (sec): 37.22 - samples/sec: 228.26 - lr: 0.000153 - momentum: 0.000000
2023-10-07 02:53:19,641 epoch 2 - iter 65/138 - loss 1.36801088 - time (sec): 46.91 - samples/sec: 226.51 - lr: 0.000152 - momentum: 0.000000
2023-10-07 02:53:29,612 epoch 2 - iter 78/138 - loss 1.27695003 - time (sec): 56.88 - samples/sec: 226.67 - lr: 0.000150 - momentum: 0.000000
2023-10-07 02:53:39,457 epoch 2 - iter 91/138 - loss 1.23570652 - time (sec): 66.72 - samples/sec: 227.53 - lr: 0.000148 - momentum: 0.000000
2023-10-07 02:53:48,820 epoch 2 - iter 104/138 - loss 1.18748962 - time (sec): 76.09 - samples/sec: 228.29 - lr: 0.000147 - momentum: 0.000000
2023-10-07 02:53:58,140 epoch 2 - iter 117/138 - loss 1.15095374 - time (sec): 85.41 - samples/sec: 228.29 - lr: 0.000145 - momentum: 0.000000
2023-10-07 02:54:07,469 epoch 2 - iter 130/138 - loss 1.10561969 - time (sec): 94.74 - samples/sec: 227.02 - lr: 0.000143 - momentum: 0.000000
2023-10-07 02:54:13,054 ----------------------------------------------------------------------------------------------------
2023-10-07 02:54:13,054 EPOCH 2 done: loss 1.0854 - lr: 0.000143
2023-10-07 02:54:19,640 DEV : loss 0.6780990362167358 - f1-score (micro avg) 0.0
2023-10-07 02:54:19,645 ----------------------------------------------------------------------------------------------------
2023-10-07 02:54:28,626 epoch 3 - iter 13/138 - loss 0.63548034 - time (sec): 8.98 - samples/sec: 215.38 - lr: 0.000141 - momentum: 0.000000
2023-10-07 02:54:38,928 epoch 3 - iter 26/138 - loss 0.60374176 - time (sec): 19.28 - samples/sec: 223.74 - lr: 0.000139 - momentum: 0.000000
2023-10-07 02:54:47,780 epoch 3 - iter 39/138 - loss 0.58436161 - time (sec): 28.13 - samples/sec: 222.01 - lr: 0.000137 - momentum: 0.000000
2023-10-07 02:54:57,058 epoch 3 - iter 52/138 - loss 0.57245968 - time (sec): 37.41 - samples/sec: 221.67 - lr: 0.000136 - momentum: 0.000000
2023-10-07 02:55:07,364 epoch 3 - iter 65/138 - loss 0.56564918 - time (sec): 47.72 - samples/sec: 222.41 - lr: 0.000134 - momentum: 0.000000
2023-10-07 02:55:17,153 epoch 3 - iter 78/138 - loss 0.56027276 - time (sec): 57.51 - samples/sec: 223.51 - lr: 0.000132 - momentum: 0.000000
2023-10-07 02:55:26,847 epoch 3 - iter 91/138 - loss 0.54900006 - time (sec): 67.20 - samples/sec: 223.67 - lr: 0.000131 - momentum: 0.000000
2023-10-07 02:55:36,332 epoch 3 - iter 104/138 - loss 0.53125939 - time (sec): 76.69 - samples/sec: 224.16 - lr: 0.000129 - momentum: 0.000000
2023-10-07 02:55:46,210 epoch 3 - iter 117/138 - loss 0.51026333 - time (sec): 86.56 - samples/sec: 223.56 - lr: 0.000127 - momentum: 0.000000
2023-10-07 02:55:55,739 epoch 3 - iter 130/138 - loss 0.50269540 - time (sec): 96.09 - samples/sec: 222.33 - lr: 0.000126 - momentum: 0.000000
2023-10-07 02:56:01,879 ----------------------------------------------------------------------------------------------------
2023-10-07 02:56:01,880 EPOCH 3 done: loss 0.4967 - lr: 0.000126
2023-10-07 02:56:08,501 DEV : loss 0.37396812438964844 - f1-score (micro avg) 0.5606
2023-10-07 02:56:08,507 saving best model
2023-10-07 02:56:09,386 ----------------------------------------------------------------------------------------------------
2023-10-07 02:56:18,343 epoch 4 - iter 13/138 - loss 0.37294611 - time (sec): 8.96 - samples/sec: 219.08 - lr: 0.000123 - momentum: 0.000000
2023-10-07 02:56:27,465 epoch 4 - iter 26/138 - loss 0.35023092 - time (sec): 18.08 - samples/sec: 215.73 - lr: 0.000121 - momentum: 0.000000
2023-10-07 02:56:37,563 epoch 4 - iter 39/138 - loss 0.34009719 - time (sec): 28.18 - samples/sec: 217.60 - lr: 0.000120 - momentum: 0.000000
2023-10-07 02:56:46,702 epoch 4 - iter 52/138 - loss 0.32901441 - time (sec): 37.31 - samples/sec: 216.59 - lr: 0.000118 - momentum: 0.000000
2023-10-07 02:56:56,550 epoch 4 - iter 65/138 - loss 0.32718879 - time (sec): 47.16 - samples/sec: 218.69 - lr: 0.000116 - momentum: 0.000000
2023-10-07 02:57:06,963 epoch 4 - iter 78/138 - loss 0.31695286 - time (sec): 57.58 - samples/sec: 221.00 - lr: 0.000115 - momentum: 0.000000
2023-10-07 02:57:16,390 epoch 4 - iter 91/138 - loss 0.31790304 - time (sec): 67.00 - samples/sec: 221.71 - lr: 0.000113 - momentum: 0.000000
2023-10-07 02:57:26,409 epoch 4 - iter 104/138 - loss 0.30531621 - time (sec): 77.02 - samples/sec: 221.21 - lr: 0.000111 - momentum: 0.000000
2023-10-07 02:57:35,729 epoch 4 - iter 117/138 - loss 0.29577957 - time (sec): 86.34 - samples/sec: 220.51 - lr: 0.000110 - momentum: 0.000000
2023-10-07 02:57:45,472 epoch 4 - iter 130/138 - loss 0.29688831 - time (sec): 96.08 - samples/sec: 221.33 - lr: 0.000108 - momentum: 0.000000
2023-10-07 02:57:51,598 ----------------------------------------------------------------------------------------------------
2023-10-07 02:57:51,599 EPOCH 4 done: loss 0.2930 - lr: 0.000108
2023-10-07 02:57:58,238 DEV : loss 0.24758128821849823 - f1-score (micro avg) 0.787
2023-10-07 02:57:58,244 saving best model
2023-10-07 02:57:59,176 ----------------------------------------------------------------------------------------------------
2023-10-07 02:58:08,957 epoch 5 - iter 13/138 - loss 0.26177444 - time (sec): 9.78 - samples/sec: 230.38 - lr: 0.000105 - momentum: 0.000000
2023-10-07 02:58:18,596 epoch 5 - iter 26/138 - loss 0.23181115 - time (sec): 19.42 - samples/sec: 226.17 - lr: 0.000104 - momentum: 0.000000
2023-10-07 02:58:27,653 epoch 5 - iter 39/138 - loss 0.22413738 - time (sec): 28.48 - samples/sec: 223.73 - lr: 0.000102 - momentum: 0.000000
2023-10-07 02:58:38,289 epoch 5 - iter 52/138 - loss 0.21715395 - time (sec): 39.11 - samples/sec: 224.61 - lr: 0.000100 - momentum: 0.000000
2023-10-07 02:58:48,186 epoch 5 - iter 65/138 - loss 0.21899650 - time (sec): 49.01 - samples/sec: 223.98 - lr: 0.000099 - momentum: 0.000000
2023-10-07 02:58:57,933 epoch 5 - iter 78/138 - loss 0.21427119 - time (sec): 58.76 - samples/sec: 224.27 - lr: 0.000097 - momentum: 0.000000
2023-10-07 02:59:07,909 epoch 5 - iter 91/138 - loss 0.21300241 - time (sec): 68.73 - samples/sec: 224.23 - lr: 0.000095 - momentum: 0.000000
2023-10-07 02:59:17,179 epoch 5 - iter 104/138 - loss 0.21259146 - time (sec): 78.00 - samples/sec: 223.71 - lr: 0.000094 - momentum: 0.000000
2023-10-07 02:59:26,509 epoch 5 - iter 117/138 - loss 0.20669994 - time (sec): 87.33 - samples/sec: 222.14 - lr: 0.000092 - momentum: 0.000000
2023-10-07 02:59:36,268 epoch 5 - iter 130/138 - loss 0.20240058 - time (sec): 97.09 - samples/sec: 222.27 - lr: 0.000090 - momentum: 0.000000
2023-10-07 02:59:41,783 ----------------------------------------------------------------------------------------------------
2023-10-07 02:59:41,783 EPOCH 5 done: loss 0.1996 - lr: 0.000090
2023-10-07 02:59:48,469 DEV : loss 0.18343476951122284 - f1-score (micro avg) 0.8146
2023-10-07 02:59:48,475 saving best model
2023-10-07 02:59:49,474 ----------------------------------------------------------------------------------------------------
2023-10-07 02:59:58,999 epoch 6 - iter 13/138 - loss 0.14529243 - time (sec): 9.52 - samples/sec: 212.42 - lr: 0.000088 - momentum: 0.000000
2023-10-07 03:00:09,035 epoch 6 - iter 26/138 - loss 0.17101439 - time (sec): 19.56 - samples/sec: 218.82 - lr: 0.000086 - momentum: 0.000000
2023-10-07 03:00:18,866 epoch 6 - iter 39/138 - loss 0.15746332 - time (sec): 29.39 - samples/sec: 221.02 - lr: 0.000084 - momentum: 0.000000
2023-10-07 03:00:28,960 epoch 6 - iter 52/138 - loss 0.14626629 - time (sec): 39.48 - samples/sec: 220.85 - lr: 0.000083 - momentum: 0.000000
2023-10-07 03:00:38,210 epoch 6 - iter 65/138 - loss 0.15063673 - time (sec): 48.73 - samples/sec: 219.06 - lr: 0.000081 - momentum: 0.000000
2023-10-07 03:00:47,870 epoch 6 - iter 78/138 - loss 0.14478979 - time (sec): 58.39 - samples/sec: 219.64 - lr: 0.000079 - momentum: 0.000000
2023-10-07 03:00:58,277 epoch 6 - iter 91/138 - loss 0.13908013 - time (sec): 68.80 - samples/sec: 220.12 - lr: 0.000077 - momentum: 0.000000
2023-10-07 03:01:08,339 epoch 6 - iter 104/138 - loss 0.13844338 - time (sec): 78.86 - samples/sec: 220.49 - lr: 0.000076 - momentum: 0.000000
2023-10-07 03:01:17,890 epoch 6 - iter 117/138 - loss 0.13893122 - time (sec): 88.41 - samples/sec: 220.00 - lr: 0.000074 - momentum: 0.000000
2023-10-07 03:01:27,985 epoch 6 - iter 130/138 - loss 0.13719839 - time (sec): 98.51 - samples/sec: 220.01 - lr: 0.000072 - momentum: 0.000000
2023-10-07 03:01:33,551 ----------------------------------------------------------------------------------------------------
2023-10-07 03:01:33,551 EPOCH 6 done: loss 0.1367 - lr: 0.000072
2023-10-07 03:01:40,239 DEV : loss 0.14827768504619598 - f1-score (micro avg) 0.8437
2023-10-07 03:01:40,245 saving best model
2023-10-07 03:01:41,257 ----------------------------------------------------------------------------------------------------
2023-10-07 03:01:50,987 epoch 7 - iter 13/138 - loss 0.10834373 - time (sec): 9.73 - samples/sec: 222.03 - lr: 0.000070 - momentum: 0.000000
2023-10-07 03:02:00,702 epoch 7 - iter 26/138 - loss 0.11546109 - time (sec): 19.44 - samples/sec: 224.61 - lr: 0.000068 - momentum: 0.000000
2023-10-07 03:02:09,804 epoch 7 - iter 39/138 - loss 0.11006260 - time (sec): 28.55 - samples/sec: 219.83 - lr: 0.000066 - momentum: 0.000000
2023-10-07 03:02:19,559 epoch 7 - iter 52/138 - loss 0.10901675 - time (sec): 38.30 - samples/sec: 218.15 - lr: 0.000065 - momentum: 0.000000
2023-10-07 03:02:29,692 epoch 7 - iter 65/138 - loss 0.11088053 - time (sec): 48.43 - samples/sec: 220.30 - lr: 0.000063 - momentum: 0.000000
2023-10-07 03:02:40,419 epoch 7 - iter 78/138 - loss 0.10564613 - time (sec): 59.16 - samples/sec: 221.06 - lr: 0.000061 - momentum: 0.000000
2023-10-07 03:02:50,308 epoch 7 - iter 91/138 - loss 0.10392941 - time (sec): 69.05 - samples/sec: 221.08 - lr: 0.000060 - momentum: 0.000000
2023-10-07 03:02:59,945 epoch 7 - iter 104/138 - loss 0.10210098 - time (sec): 78.69 - samples/sec: 219.71 - lr: 0.000058 - momentum: 0.000000
2023-10-07 03:03:09,766 epoch 7 - iter 117/138 - loss 0.09954443 - time (sec): 88.51 - samples/sec: 218.56 - lr: 0.000056 - momentum: 0.000000
2023-10-07 03:03:19,622 epoch 7 - iter 130/138 - loss 0.09865909 - time (sec): 98.36 - samples/sec: 218.63 - lr: 0.000055 - momentum: 0.000000
2023-10-07 03:03:25,486 ----------------------------------------------------------------------------------------------------
2023-10-07 03:03:25,487 EPOCH 7 done: loss 0.1000 - lr: 0.000055
2023-10-07 03:03:32,354 DEV : loss 0.13571377098560333 - f1-score (micro avg) 0.8565
2023-10-07 03:03:32,360 saving best model
2023-10-07 03:03:33,360 ----------------------------------------------------------------------------------------------------
2023-10-07 03:03:43,654 epoch 8 - iter 13/138 - loss 0.08716852 - time (sec): 10.29 - samples/sec: 212.40 - lr: 0.000052 - momentum: 0.000000
2023-10-07 03:03:53,371 epoch 8 - iter 26/138 - loss 0.07268907 - time (sec): 20.01 - samples/sec: 215.21 - lr: 0.000050 - momentum: 0.000000
2023-10-07 03:04:03,021 epoch 8 - iter 39/138 - loss 0.07320897 - time (sec): 29.66 - samples/sec: 212.35 - lr: 0.000049 - momentum: 0.000000
2023-10-07 03:04:12,646 epoch 8 - iter 52/138 - loss 0.07592629 - time (sec): 39.28 - samples/sec: 212.07 - lr: 0.000047 - momentum: 0.000000
2023-10-07 03:04:22,071 epoch 8 - iter 65/138 - loss 0.08505494 - time (sec): 48.71 - samples/sec: 213.39 - lr: 0.000045 - momentum: 0.000000
2023-10-07 03:04:32,123 epoch 8 - iter 78/138 - loss 0.08635753 - time (sec): 58.76 - samples/sec: 215.21 - lr: 0.000044 - momentum: 0.000000
2023-10-07 03:04:42,221 epoch 8 - iter 91/138 - loss 0.08850028 - time (sec): 68.86 - samples/sec: 216.85 - lr: 0.000042 - momentum: 0.000000
2023-10-07 03:04:51,686 epoch 8 - iter 104/138 - loss 0.08625315 - time (sec): 78.32 - samples/sec: 217.90 - lr: 0.000040 - momentum: 0.000000
2023-10-07 03:05:02,471 epoch 8 - iter 117/138 - loss 0.08087891 - time (sec): 89.11 - samples/sec: 220.20 - lr: 0.000039 - momentum: 0.000000
2023-10-07 03:05:12,097 epoch 8 - iter 130/138 - loss 0.07859175 - time (sec): 98.73 - samples/sec: 219.13 - lr: 0.000037 - momentum: 0.000000
2023-10-07 03:05:17,502 ----------------------------------------------------------------------------------------------------
2023-10-07 03:05:17,503 EPOCH 8 done: loss 0.0798 - lr: 0.000037
2023-10-07 03:05:24,259 DEV : loss 0.12491855770349503 - f1-score (micro avg) 0.8612
2023-10-07 03:05:24,265 saving best model
2023-10-07 03:05:25,303 ----------------------------------------------------------------------------------------------------
2023-10-07 03:05:35,634 epoch 9 - iter 13/138 - loss 0.04433517 - time (sec): 10.33 - samples/sec: 220.75 - lr: 0.000034 - momentum: 0.000000
2023-10-07 03:05:45,131 epoch 9 - iter 26/138 - loss 0.06776162 - time (sec): 19.83 - samples/sec: 212.50 - lr: 0.000033 - momentum: 0.000000
2023-10-07 03:05:55,255 epoch 9 - iter 39/138 - loss 0.06552928 - time (sec): 29.95 - samples/sec: 214.10 - lr: 0.000031 - momentum: 0.000000
2023-10-07 03:06:05,172 epoch 9 - iter 52/138 - loss 0.06581716 - time (sec): 39.87 - samples/sec: 217.43 - lr: 0.000029 - momentum: 0.000000
2023-10-07 03:06:15,188 epoch 9 - iter 65/138 - loss 0.06534515 - time (sec): 49.88 - samples/sec: 218.27 - lr: 0.000028 - momentum: 0.000000
2023-10-07 03:06:25,480 epoch 9 - iter 78/138 - loss 0.06522383 - time (sec): 60.17 - samples/sec: 219.30 - lr: 0.000026 - momentum: 0.000000
2023-10-07 03:06:35,650 epoch 9 - iter 91/138 - loss 0.06538079 - time (sec): 70.34 - samples/sec: 219.69 - lr: 0.000024 - momentum: 0.000000
2023-10-07 03:06:45,318 epoch 9 - iter 104/138 - loss 0.06513024 - time (sec): 80.01 - samples/sec: 220.57 - lr: 0.000023 - momentum: 0.000000
2023-10-07 03:06:54,504 epoch 9 - iter 117/138 - loss 0.06527947 - time (sec): 89.20 - samples/sec: 218.40 - lr: 0.000021 - momentum: 0.000000
2023-10-07 03:07:04,048 epoch 9 - iter 130/138 - loss 0.06560602 - time (sec): 98.74 - samples/sec: 219.00 - lr: 0.000019 - momentum: 0.000000
2023-10-07 03:07:09,707 ----------------------------------------------------------------------------------------------------
2023-10-07 03:07:09,707 EPOCH 9 done: loss 0.0668 - lr: 0.000019
2023-10-07 03:07:16,350 DEV : loss 0.11793527007102966 - f1-score (micro avg) 0.8623
2023-10-07 03:07:16,356 saving best model
2023-10-07 03:07:17,356 ----------------------------------------------------------------------------------------------------
2023-10-07 03:07:26,928 epoch 10 - iter 13/138 - loss 0.04765528 - time (sec): 9.57 - samples/sec: 219.73 - lr: 0.000017 - momentum: 0.000000
2023-10-07 03:07:36,332 epoch 10 - iter 26/138 - loss 0.05095999 - time (sec): 18.98 - samples/sec: 224.82 - lr: 0.000015 - momentum: 0.000000
2023-10-07 03:07:46,285 epoch 10 - iter 39/138 - loss 0.04979380 - time (sec): 28.93 - samples/sec: 223.59 - lr: 0.000013 - momentum: 0.000000
2023-10-07 03:07:56,079 epoch 10 - iter 52/138 - loss 0.04846799 - time (sec): 38.72 - samples/sec: 223.10 - lr: 0.000012 - momentum: 0.000000
2023-10-07 03:08:06,186 epoch 10 - iter 65/138 - loss 0.04940159 - time (sec): 48.83 - samples/sec: 222.61 - lr: 0.000010 - momentum: 0.000000
2023-10-07 03:08:16,145 epoch 10 - iter 78/138 - loss 0.05443687 - time (sec): 58.79 - samples/sec: 222.17 - lr: 0.000008 - momentum: 0.000000
2023-10-07 03:08:26,271 epoch 10 - iter 91/138 - loss 0.05744336 - time (sec): 68.91 - samples/sec: 222.83 - lr: 0.000007 - momentum: 0.000000
2023-10-07 03:08:35,811 epoch 10 - iter 104/138 - loss 0.06336397 - time (sec): 78.45 - samples/sec: 222.05 - lr: 0.000005 - momentum: 0.000000
2023-10-07 03:08:45,236 epoch 10 - iter 117/138 - loss 0.06398527 - time (sec): 87.88 - samples/sec: 222.12 - lr: 0.000003 - momentum: 0.000000
2023-10-07 03:08:54,652 epoch 10 - iter 130/138 - loss 0.06288008 - time (sec): 97.29 - samples/sec: 220.63 - lr: 0.000002 - momentum: 0.000000
2023-10-07 03:09:00,548 ----------------------------------------------------------------------------------------------------
2023-10-07 03:09:00,548 EPOCH 10 done: loss 0.0624 - lr: 0.000002
2023-10-07 03:09:07,241 DEV : loss 0.11792446672916412 - f1-score (micro avg) 0.8592
2023-10-07 03:09:08,235 ----------------------------------------------------------------------------------------------------
2023-10-07 03:09:08,237 Loading model from best epoch ...
2023-10-07 03:09:11,271 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
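The 25-tag dictionary above is the BIOES tagging scheme over the six AjMC entity types: a single O tag plus S(ingle)/B(egin)/E(nd)/I(nside) variants of each type. A quick sketch reconstructing that tagset:

```python
# Reconstruct the tagger's BIOES dictionary: O + 4 positional variants
# per entity type. Order here is illustrative; Flair's dictionary order
# comes from the training corpus.
entity_types = ["scope", "pers", "work", "loc", "object", "date"]
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 1 + 4 * 6 = 25
```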
2023-10-07 03:09:18,469
Results:
- F-score (micro) 0.8854
- F-score (macro) 0.5288
- Accuracy 0.8173

By class:
              precision    recall  f1-score   support

       scope     0.8920    0.8920    0.8920       176
        pers     0.9091    0.9375    0.9231       128
        work     0.8077    0.8514    0.8289        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.8808    0.8901    0.8854       382
   macro avg     0.5218    0.5362    0.5288       382
weighted avg     0.8721    0.8901    0.8809       382
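The summary rows follow directly from the per-class scores: macro F1 is the unweighted mean of the per-class F1 values (which is why the two zero-support-heavy classes, object and loc, pull it down to ~0.53 despite a strong micro F1), and micro F1 is the harmonic mean of the pooled precision and recall. A sketch recomputing both from the table above:

```python
# Recompute the macro and micro F1 rows from the per-class table.
per_class_f1 = [0.8920, 0.9231, 0.8289, 0.0000, 0.0000]  # scope, pers, work, object, loc

# Macro F1: unweighted mean over classes; the two all-zero classes
# (2 support each) drag it far below micro F1.
macro_f1 = sum(per_class_f1) / len(per_class_f1)
print(round(macro_f1, 4))  # 0.5288

# Micro F1: harmonic mean of pooled precision and recall.
p, r = 0.8808, 0.8901
micro_f1 = 2 * p * r / (p + r)
print(round(micro_f1, 4))  # 0.8854
```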
2023-10-07 03:09:18,469 ----------------------------------------------------------------------------------------------------