2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,648 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,648 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,648 Train: 3575 sentences
2023-10-18 18:12:26,648 (train_with_dev=False, train_with_test=False)
2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,648 Training Params:
2023-10-18 18:12:26,648 - learning_rate: "3e-05"
2023-10-18 18:12:26,648 - mini_batch_size: "4"
2023-10-18 18:12:26,648 - max_epochs: "10"
2023-10-18 18:12:26,648 - shuffle: "True"
2023-10-18 18:12:26,648 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Plugins:
2023-10-18 18:12:26,649 - TensorboardLogger
2023-10-18 18:12:26,649 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:12:26,649 - metric: "('micro avg', 'f1-score')"
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Computation:
2023-10-18 18:12:26,649 - compute on device: cuda:0
2023-10-18 18:12:26,649 - embedding storage: none
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:26,649 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 18:12:28,063 epoch 1 - iter 89/894 - loss 4.30065895 - time (sec): 1.41 - samples/sec: 5863.68 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:12:29,477 epoch 1 - iter 178/894 - loss 4.05118813 - time (sec): 2.83 - samples/sec: 5957.89 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:12:30,886 epoch 1 - iter 267/894 - loss 3.74797971 - time (sec): 4.24 - samples/sec: 6281.29 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:12:32,265 epoch 1 - iter 356/894 - loss 3.35481920 - time (sec): 5.62 - samples/sec: 6340.56 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:12:33,666 epoch 1 - iter 445/894 - loss 2.91699777 - time (sec): 7.02 - samples/sec: 6417.64 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:12:35,047 epoch 1 - iter 534/894 - loss 2.58442281 - time (sec): 8.40 - samples/sec: 6383.78 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:12:36,429 epoch 1 - iter 623/894 - loss 2.32823076 - time (sec): 9.78 - samples/sec: 6307.91 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:12:37,798 epoch 1 - iter 712/894 - loss 2.12806775 - time (sec): 11.15 - samples/sec: 6264.38 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:12:39,182 epoch 1 - iter 801/894 - loss 1.97667733 - time (sec): 12.53 - samples/sec: 6194.87 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:12:40,561 epoch 1 - iter 890/894 - loss 1.83834318 - time (sec): 13.91 - samples/sec: 6197.81 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:12:40,624 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:40,625 EPOCH 1 done: loss 1.8338 - lr: 0.000030
2023-10-18 18:12:42,893 DEV : loss 0.4581781327724457 - f1-score (micro avg) 0.0
2023-10-18 18:12:42,916 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:44,292 epoch 2 - iter 89/894 - loss 0.59867823 - time (sec): 1.38 - samples/sec: 6435.72 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:12:45,665 epoch 2 - iter 178/894 - loss 0.57480149 - time (sec): 2.75 - samples/sec: 6269.54 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:12:47,021 epoch 2 - iter 267/894 - loss 0.56770059 - time (sec): 4.10 - samples/sec: 6239.82 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:12:48,401 epoch 2 - iter 356/894 - loss 0.56163379 - time (sec): 5.49 - samples/sec: 6187.91 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:12:49,776 epoch 2 - iter 445/894 - loss 0.54920126 - time (sec): 6.86 - samples/sec: 6175.95 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:12:51,155 epoch 2 - iter 534/894 - loss 0.54336327 - time (sec): 8.24 - samples/sec: 6073.43 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:12:52,540 epoch 2 - iter 623/894 - loss 0.53277504 - time (sec): 9.62 - samples/sec: 6109.68 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:12:53,993 epoch 2 - iter 712/894 - loss 0.51554670 - time (sec): 11.08 - samples/sec: 6207.01 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:12:55,391 epoch 2 - iter 801/894 - loss 0.51151749 - time (sec): 12.47 - samples/sec: 6234.99 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:12:56,802 epoch 2 - iter 890/894 - loss 0.50365857 - time (sec): 13.89 - samples/sec: 6205.94 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:12:56,873 ----------------------------------------------------------------------------------------------------
2023-10-18 18:12:56,874 EPOCH 2 done: loss 0.5047 - lr: 0.000027
2023-10-18 18:13:02,051 DEV : loss 0.34852492809295654 - f1-score (micro avg) 0.0967
2023-10-18 18:13:02,074 saving best model
2023-10-18 18:13:02,108 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:03,520 epoch 3 - iter 89/894 - loss 0.42103620 - time (sec): 1.41 - samples/sec: 6298.51 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:13:04,895 epoch 3 - iter 178/894 - loss 0.40181846 - time (sec): 2.79 - samples/sec: 6186.38 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:13:06,296 epoch 3 - iter 267/894 - loss 0.41796152 - time (sec): 4.19 - samples/sec: 6196.39 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:13:07,671 epoch 3 - iter 356/894 - loss 0.42404257 - time (sec): 5.56 - samples/sec: 6124.96 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:13:09,042 epoch 3 - iter 445/894 - loss 0.41948278 - time (sec): 6.93 - samples/sec: 6176.89 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:13:10,487 epoch 3 - iter 534/894 - loss 0.41394295 - time (sec): 8.38 - samples/sec: 6298.19 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:13:11,914 epoch 3 - iter 623/894 - loss 0.42265535 - time (sec): 9.81 - samples/sec: 6275.67 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:13:13,303 epoch 3 - iter 712/894 - loss 0.41519781 - time (sec): 11.20 - samples/sec: 6238.64 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:13:14,686 epoch 3 - iter 801/894 - loss 0.41634385 - time (sec): 12.58 - samples/sec: 6174.11 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:13:16,059 epoch 3 - iter 890/894 - loss 0.41644892 - time (sec): 13.95 - samples/sec: 6174.42 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:13:16,119 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:16,119 EPOCH 3 done: loss 0.4159 - lr: 0.000023
2023-10-18 18:13:21,342 DEV : loss 0.32269880175590515 - f1-score (micro avg) 0.2706
2023-10-18 18:13:21,366 saving best model
2023-10-18 18:13:21,403 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:22,790 epoch 4 - iter 89/894 - loss 0.35650452 - time (sec): 1.39 - samples/sec: 5576.49 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:13:24,169 epoch 4 - iter 178/894 - loss 0.37533141 - time (sec): 2.77 - samples/sec: 5757.37 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:13:25,573 epoch 4 - iter 267/894 - loss 0.40715939 - time (sec): 4.17 - samples/sec: 5835.06 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:13:26,968 epoch 4 - iter 356/894 - loss 0.40776949 - time (sec): 5.56 - samples/sec: 5922.05 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:13:28,396 epoch 4 - iter 445/894 - loss 0.39118533 - time (sec): 6.99 - samples/sec: 6059.11 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:13:29,766 epoch 4 - iter 534/894 - loss 0.38025983 - time (sec): 8.36 - samples/sec: 6110.48 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:13:31,206 epoch 4 - iter 623/894 - loss 0.37519774 - time (sec): 9.80 - samples/sec: 6187.68 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:13:32,610 epoch 4 - iter 712/894 - loss 0.37572856 - time (sec): 11.21 - samples/sec: 6218.96 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:13:33,990 epoch 4 - iter 801/894 - loss 0.37183429 - time (sec): 12.59 - samples/sec: 6197.42 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:13:35,366 epoch 4 - iter 890/894 - loss 0.37518355 - time (sec): 13.96 - samples/sec: 6178.59 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:13:35,422 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:35,422 EPOCH 4 done: loss 0.3762 - lr: 0.000020
2023-10-18 18:13:40,338 DEV : loss 0.32122358679771423 - f1-score (micro avg) 0.2991
2023-10-18 18:13:40,361 saving best model
2023-10-18 18:13:40,395 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:41,655 epoch 5 - iter 89/894 - loss 0.37027438 - time (sec): 1.26 - samples/sec: 7062.84 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:13:42,917 epoch 5 - iter 178/894 - loss 0.34873801 - time (sec): 2.52 - samples/sec: 6657.06 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:13:44,643 epoch 5 - iter 267/894 - loss 0.34681192 - time (sec): 4.25 - samples/sec: 6165.03 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:13:46,029 epoch 5 - iter 356/894 - loss 0.35184385 - time (sec): 5.63 - samples/sec: 6206.58 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:13:47,434 epoch 5 - iter 445/894 - loss 0.34721870 - time (sec): 7.04 - samples/sec: 6219.04 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:13:48,822 epoch 5 - iter 534/894 - loss 0.34945722 - time (sec): 8.43 - samples/sec: 6192.10 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:13:50,180 epoch 5 - iter 623/894 - loss 0.35630689 - time (sec): 9.78 - samples/sec: 6119.76 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:13:51,574 epoch 5 - iter 712/894 - loss 0.35521245 - time (sec): 11.18 - samples/sec: 6089.61 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:13:52,948 epoch 5 - iter 801/894 - loss 0.35160330 - time (sec): 12.55 - samples/sec: 6080.81 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:13:54,243 epoch 5 - iter 890/894 - loss 0.35440492 - time (sec): 13.85 - samples/sec: 6232.31 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:13:54,305 ----------------------------------------------------------------------------------------------------
2023-10-18 18:13:54,305 EPOCH 5 done: loss 0.3544 - lr: 0.000017
2023-10-18 18:13:59,306 DEV : loss 0.3080124258995056 - f1-score (micro avg) 0.3158
2023-10-18 18:13:59,331 saving best model
2023-10-18 18:13:59,364 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:00,783 epoch 6 - iter 89/894 - loss 0.33467325 - time (sec): 1.42 - samples/sec: 5601.86 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:14:02,208 epoch 6 - iter 178/894 - loss 0.30810723 - time (sec): 2.84 - samples/sec: 6105.34 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:14:03,625 epoch 6 - iter 267/894 - loss 0.30292506 - time (sec): 4.26 - samples/sec: 5918.09 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:14:05,038 epoch 6 - iter 356/894 - loss 0.32078430 - time (sec): 5.67 - samples/sec: 6105.76 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:14:06,396 epoch 6 - iter 445/894 - loss 0.32374234 - time (sec): 7.03 - samples/sec: 6237.92 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:14:07,777 epoch 6 - iter 534/894 - loss 0.32865529 - time (sec): 8.41 - samples/sec: 6183.14 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:14:09,178 epoch 6 - iter 623/894 - loss 0.32702913 - time (sec): 9.81 - samples/sec: 6127.61 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:14:10,596 epoch 6 - iter 712/894 - loss 0.32725480 - time (sec): 11.23 - samples/sec: 6209.99 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:14:11,856 epoch 6 - iter 801/894 - loss 0.32393042 - time (sec): 12.49 - samples/sec: 6227.81 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:14:13,084 epoch 6 - iter 890/894 - loss 0.33649160 - time (sec): 13.72 - samples/sec: 6284.34 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:14:13,135 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:13,135 EPOCH 6 done: loss 0.3365 - lr: 0.000013
2023-10-18 18:14:18,439 DEV : loss 0.3021206855773926 - f1-score (micro avg) 0.3207
2023-10-18 18:14:18,464 saving best model
2023-10-18 18:14:18,496 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:19,958 epoch 7 - iter 89/894 - loss 0.26960562 - time (sec): 1.46 - samples/sec: 6365.87 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:14:21,346 epoch 7 - iter 178/894 - loss 0.30468972 - time (sec): 2.85 - samples/sec: 6151.31 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:14:22,767 epoch 7 - iter 267/894 - loss 0.33313252 - time (sec): 4.27 - samples/sec: 6407.92 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:14:24,187 epoch 7 - iter 356/894 - loss 0.33431427 - time (sec): 5.69 - samples/sec: 6344.58 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:14:25,577 epoch 7 - iter 445/894 - loss 0.33591565 - time (sec): 7.08 - samples/sec: 6302.37 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:14:26,922 epoch 7 - iter 534/894 - loss 0.33295271 - time (sec): 8.43 - samples/sec: 6245.70 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:14:28,331 epoch 7 - iter 623/894 - loss 0.32496234 - time (sec): 9.83 - samples/sec: 6181.53 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:14:29,748 epoch 7 - iter 712/894 - loss 0.32501740 - time (sec): 11.25 - samples/sec: 6160.77 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:14:31,133 epoch 7 - iter 801/894 - loss 0.31957759 - time (sec): 12.64 - samples/sec: 6170.46 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:14:32,480 epoch 7 - iter 890/894 - loss 0.32210478 - time (sec): 13.98 - samples/sec: 6164.55 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:14:32,542 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:32,543 EPOCH 7 done: loss 0.3228 - lr: 0.000010
2023-10-18 18:14:37,838 DEV : loss 0.30848488211631775 - f1-score (micro avg) 0.3318
2023-10-18 18:14:37,863 saving best model
2023-10-18 18:14:37,897 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:39,315 epoch 8 - iter 89/894 - loss 0.31380512 - time (sec): 1.42 - samples/sec: 6724.19 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:14:40,788 epoch 8 - iter 178/894 - loss 0.30161176 - time (sec): 2.89 - samples/sec: 6165.41 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:14:42,187 epoch 8 - iter 267/894 - loss 0.31408641 - time (sec): 4.29 - samples/sec: 6172.07 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:14:43,636 epoch 8 - iter 356/894 - loss 0.32273847 - time (sec): 5.74 - samples/sec: 6020.27 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:14:45,011 epoch 8 - iter 445/894 - loss 0.32694106 - time (sec): 7.11 - samples/sec: 6034.66 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:14:46,388 epoch 8 - iter 534/894 - loss 0.32254045 - time (sec): 8.49 - samples/sec: 6039.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:14:47,761 epoch 8 - iter 623/894 - loss 0.31843528 - time (sec): 9.86 - samples/sec: 6009.62 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:14:49,180 epoch 8 - iter 712/894 - loss 0.31741782 - time (sec): 11.28 - samples/sec: 6032.12 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:14:50,555 epoch 8 - iter 801/894 - loss 0.31067949 - time (sec): 12.66 - samples/sec: 6049.15 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:14:51,967 epoch 8 - iter 890/894 - loss 0.31317693 - time (sec): 14.07 - samples/sec: 6118.58 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:14:52,032 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:52,032 EPOCH 8 done: loss 0.3121 - lr: 0.000007
2023-10-18 18:14:57,331 DEV : loss 0.304724782705307 - f1-score (micro avg) 0.3341
2023-10-18 18:14:57,355 saving best model
2023-10-18 18:14:57,395 ----------------------------------------------------------------------------------------------------
2023-10-18 18:14:58,774 epoch 9 - iter 89/894 - loss 0.28201030 - time (sec): 1.38 - samples/sec: 5974.75 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:15:00,149 epoch 9 - iter 178/894 - loss 0.30076729 - time (sec): 2.75 - samples/sec: 5684.88 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:15:01,567 epoch 9 - iter 267/894 - loss 0.29425291 - time (sec): 4.17 - samples/sec: 5863.62 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:15:03,049 epoch 9 - iter 356/894 - loss 0.30464495 - time (sec): 5.65 - samples/sec: 5799.05 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:15:04,519 epoch 9 - iter 445/894 - loss 0.30458727 - time (sec): 7.12 - samples/sec: 5906.97 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:15:05,921 epoch 9 - iter 534/894 - loss 0.30462622 - time (sec): 8.53 - samples/sec: 6011.64 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:15:07,339 epoch 9 - iter 623/894 - loss 0.29970434 - time (sec): 9.94 - samples/sec: 6014.50 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:15:08,843 epoch 9 - iter 712/894 - loss 0.29994186 - time (sec): 11.45 - samples/sec: 5930.13 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:15:10,270 epoch 9 - iter 801/894 - loss 0.30270378 - time (sec): 12.87 - samples/sec: 6023.83 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:15:11,655 epoch 9 - iter 890/894 - loss 0.30548739 - time (sec): 14.26 - samples/sec: 6053.64 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:15:11,717 ----------------------------------------------------------------------------------------------------
2023-10-18 18:15:11,717 EPOCH 9 done: loss 0.3060 - lr: 0.000003
2023-10-18 18:15:16,672 DEV : loss 0.3093281090259552 - f1-score (micro avg) 0.3296
2023-10-18 18:15:16,697 ----------------------------------------------------------------------------------------------------
2023-10-18 18:15:18,092 epoch 10 - iter 89/894 - loss 0.35217359 - time (sec): 1.39 - samples/sec: 5946.91 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:15:19,482 epoch 10 - iter 178/894 - loss 0.32604035 - time (sec): 2.78 - samples/sec: 5919.61 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:15:20,873 epoch 10 - iter 267/894 - loss 0.30390326 - time (sec): 4.18 - samples/sec: 6002.65 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:15:22,239 epoch 10 - iter 356/894 - loss 0.29975890 - time (sec): 5.54 - samples/sec: 5978.58 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:15:23,593 epoch 10 - iter 445/894 - loss 0.30706171 - time (sec): 6.90 - samples/sec: 5917.41 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:15:25,047 epoch 10 - iter 534/894 - loss 0.29824962 - time (sec): 8.35 - samples/sec: 5945.27 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:15:26,410 epoch 10 - iter 623/894 - loss 0.29881777 - time (sec): 9.71 - samples/sec: 5970.96 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:15:28,198 epoch 10 - iter 712/894 - loss 0.30115841 - time (sec): 11.50 - samples/sec: 5963.76 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:15:29,602 epoch 10 - iter 801/894 - loss 0.30301864 - time (sec): 12.90 - samples/sec: 5967.88 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:15:31,020 epoch 10 - iter 890/894 - loss 0.30007494 - time (sec): 14.32 - samples/sec: 6006.28 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:15:31,082 ----------------------------------------------------------------------------------------------------
2023-10-18 18:15:31,082 EPOCH 10 done: loss 0.3003 - lr: 0.000000
2023-10-18 18:15:36,037 DEV : loss 0.30702558159828186 - f1-score (micro avg) 0.3351
2023-10-18 18:15:36,062 saving best model
2023-10-18 18:15:36,125 ----------------------------------------------------------------------------------------------------
2023-10-18 18:15:36,126 Loading model from best epoch ...
2023-10-18 18:15:36,208 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-18 18:15:38,533
Results:
- F-score (micro) 0.3319
- F-score (macro) 0.1314
- Accuracy 0.2091
By class:
              precision    recall  f1-score   support

         loc     0.4859    0.5503    0.5161       596
        pers     0.1228    0.1652    0.1408       333
         org     0.0000    0.0000    0.0000       132
        prod     0.0000    0.0000    0.0000        66
        time     0.0000    0.0000    0.0000        49

   micro avg     0.3383    0.3257    0.3319      1176
   macro avg     0.1217    0.1431    0.1314      1176
weighted avg     0.2810    0.3257    0.3015      1176
2023-10-18 18:15:38,533 ----------------------------------------------------------------------------------------------------