2023-10-19 20:32:00,351 ----------------------------------------------------------------------------------------------------
2023-10-19 20:32:00,351 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-19 20:32:00,351 ----------------------------------------------------------------------------------------------------
2023-10-19 20:32:00,351 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-19 20:32:00,351 ----------------------------------------------------------------------------------------------------
2023-10-19 20:32:00,351 Train: 7142 sentences
2023-10-19 20:32:00,351 (train_with_dev=False, train_with_test=False)
2023-10-19 20:32:00,351 ----------------------------------------------------------------------------------------------------
2023-10-19 20:32:00,351 Training Params:
2023-10-19 20:32:00,351 - learning_rate: "3e-05"
2023-10-19 20:32:00,351 - mini_batch_size: "4"
2023-10-19 20:32:00,352 - max_epochs: "10"
2023-10-19 20:32:00,352 - shuffle: "True"
2023-10-19 20:32:00,352 ----------------------------------------------------------------------------------------------------
2023-10-19 20:32:00,352 Plugins:
2023-10-19 20:32:00,352 - TensorboardLogger
2023-10-19 20:32:00,352 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 20:32:00,352 ----------------------------------------------------------------------------------------------------
2023-10-19 20:32:00,352 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 20:32:00,352 - metric: "('micro avg', 'f1-score')"
2023-10-19 20:32:00,352 ----------------------------------------------------------------------------------------------------
2023-10-19 20:32:00,352 Computation:
2023-10-19 20:32:00,352 - compute on device: cuda:0
2023-10-19 20:32:00,352 - embedding storage: none
2023-10-19 20:32:00,352 ----------------------------------------------------------------------------------------------------
2023-10-19 20:32:00,352 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-19 20:32:00,352 ----------------------------------------------------------------------------------------------------
2023-10-19 20:32:00,352 ----------------------------------------------------------------------------------------------------
2023-10-19 20:32:00,352 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 20:32:03,297 epoch 1 - iter 178/1786 - loss 3.32754534 - time (sec): 2.94 - samples/sec: 8301.19 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:32:06,455 epoch 1 - iter 356/1786 - loss 2.99552020 - time (sec): 6.10 - samples/sec: 8180.03 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:32:09,601 epoch 1 - iter 534/1786 - loss 2.50760636 - time (sec): 9.25 - samples/sec: 8150.43 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:32:12,742 epoch 1 - iter 712/1786 - loss 2.07292944 - time (sec): 12.39 - samples/sec: 8259.58 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:32:15,794 epoch 1 - iter 890/1786 - loss 1.82574013 - time (sec): 15.44 - samples/sec: 8191.99 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:32:18,850 epoch 1 - iter 1068/1786 - loss 1.65402095 - time (sec): 18.50 - samples/sec: 8111.22 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:32:21,884 epoch 1 - iter 1246/1786 - loss 1.52017083 - time (sec): 21.53 - samples/sec: 8051.47 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:32:24,890 epoch 1 - iter 1424/1786 - loss 1.40266291 - time (sec): 24.54 - samples/sec: 8076.12 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:32:27,811 epoch 1 - iter 1602/1786 - loss 1.30836370 - time (sec): 27.46 - samples/sec: 8142.50 - lr: 0.000027 - momentum: 0.000000
2023-10-19 20:32:31,068 epoch 1 - iter 1780/1786 - loss 1.23181977 - time (sec): 30.72 - samples/sec: 8084.96 - lr: 0.000030 - momentum: 0.000000
2023-10-19 20:32:31,159 ----------------------------------------------------------------------------------------------------
2023-10-19 20:32:31,160 EPOCH 1 done: loss 1.2311 - lr: 0.000030
2023-10-19 20:32:32,634 DEV : loss 0.332442045211792 - f1-score (micro avg) 0.086
2023-10-19 20:32:32,648 saving best model
2023-10-19 20:32:32,680 ----------------------------------------------------------------------------------------------------
2023-10-19 20:32:35,792 epoch 2 - iter 178/1786 - loss 0.48971429 - time (sec): 3.11 - samples/sec: 8407.28 - lr: 0.000030 - momentum: 0.000000
2023-10-19 20:32:38,828 epoch 2 - iter 356/1786 - loss 0.48727058 - time (sec): 6.15 - samples/sec: 8212.21 - lr: 0.000029 - momentum: 0.000000
2023-10-19 20:32:41,833 epoch 2 - iter 534/1786 - loss 0.47308375 - time (sec): 9.15 - samples/sec: 8096.70 - lr: 0.000029 - momentum: 0.000000
2023-10-19 20:32:44,934 epoch 2 - iter 712/1786 - loss 0.47093786 - time (sec): 12.25 - samples/sec: 8191.56 - lr: 0.000029 - momentum: 0.000000
2023-10-19 20:32:47,998 epoch 2 - iter 890/1786 - loss 0.45727677 - time (sec): 15.32 - samples/sec: 8181.28 - lr: 0.000028 - momentum: 0.000000
2023-10-19 20:32:51,091 epoch 2 - iter 1068/1786 - loss 0.45758949 - time (sec): 18.41 - samples/sec: 8168.29 - lr: 0.000028 - momentum: 0.000000
2023-10-19 20:32:54,094 epoch 2 - iter 1246/1786 - loss 0.45211801 - time (sec): 21.41 - samples/sec: 8132.84 - lr: 0.000028 - momentum: 0.000000
2023-10-19 20:32:57,133 epoch 2 - iter 1424/1786 - loss 0.44511293 - time (sec): 24.45 - samples/sec: 8142.91 - lr: 0.000027 - momentum: 0.000000
2023-10-19 20:33:00,236 epoch 2 - iter 1602/1786 - loss 0.44288817 - time (sec): 27.56 - samples/sec: 8134.49 - lr: 0.000027 - momentum: 0.000000
2023-10-19 20:33:03,253 epoch 2 - iter 1780/1786 - loss 0.43672204 - time (sec): 30.57 - samples/sec: 8119.04 - lr: 0.000027 - momentum: 0.000000
2023-10-19 20:33:03,343 ----------------------------------------------------------------------------------------------------
2023-10-19 20:33:03,344 EPOCH 2 done: loss 0.4369 - lr: 0.000027
2023-10-19 20:33:05,683 DEV : loss 0.2546565532684326 - f1-score (micro avg) 0.366
2023-10-19 20:33:05,696 saving best model
2023-10-19 20:33:05,730 ----------------------------------------------------------------------------------------------------
2023-10-19 20:33:08,715 epoch 3 - iter 178/1786 - loss 0.37082634 - time (sec): 2.99 - samples/sec: 7746.66 - lr: 0.000026 - momentum: 0.000000
2023-10-19 20:33:11,668 epoch 3 - iter 356/1786 - loss 0.36635335 - time (sec): 5.94 - samples/sec: 8104.85 - lr: 0.000026 - momentum: 0.000000
2023-10-19 20:33:14,833 epoch 3 - iter 534/1786 - loss 0.36916731 - time (sec): 9.10 - samples/sec: 8042.33 - lr: 0.000026 - momentum: 0.000000
2023-10-19 20:33:17,972 epoch 3 - iter 712/1786 - loss 0.37973970 - time (sec): 12.24 - samples/sec: 8096.35 - lr: 0.000025 - momentum: 0.000000
2023-10-19 20:33:20,911 epoch 3 - iter 890/1786 - loss 0.38024056 - time (sec): 15.18 - samples/sec: 8137.05 - lr: 0.000025 - momentum: 0.000000
2023-10-19 20:33:23,735 epoch 3 - iter 1068/1786 - loss 0.37160042 - time (sec): 18.00 - samples/sec: 8260.60 - lr: 0.000025 - momentum: 0.000000
2023-10-19 20:33:26,857 epoch 3 - iter 1246/1786 - loss 0.36830405 - time (sec): 21.13 - samples/sec: 8222.40 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:33:30,028 epoch 3 - iter 1424/1786 - loss 0.36360124 - time (sec): 24.30 - samples/sec: 8191.11 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:33:33,157 epoch 3 - iter 1602/1786 - loss 0.35846373 - time (sec): 27.43 - samples/sec: 8164.32 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:33:36,217 epoch 3 - iter 1780/1786 - loss 0.35471508 - time (sec): 30.49 - samples/sec: 8118.57 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:33:36,332 ----------------------------------------------------------------------------------------------------
2023-10-19 20:33:36,332 EPOCH 3 done: loss 0.3546 - lr: 0.000023
2023-10-19 20:33:39,156 DEV : loss 0.22689871490001678 - f1-score (micro avg) 0.4293
2023-10-19 20:33:39,170 saving best model
2023-10-19 20:33:39,203 ----------------------------------------------------------------------------------------------------
2023-10-19 20:33:42,292 epoch 4 - iter 178/1786 - loss 0.33183201 - time (sec): 3.09 - samples/sec: 7976.40 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:33:45,317 epoch 4 - iter 356/1786 - loss 0.33466415 - time (sec): 6.11 - samples/sec: 8090.12 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:33:48,413 epoch 4 - iter 534/1786 - loss 0.31911501 - time (sec): 9.21 - samples/sec: 8052.80 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:33:51,452 epoch 4 - iter 712/1786 - loss 0.31645588 - time (sec): 12.25 - samples/sec: 8078.11 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:33:54,555 epoch 4 - iter 890/1786 - loss 0.31727146 - time (sec): 15.35 - samples/sec: 8064.96 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:33:57,632 epoch 4 - iter 1068/1786 - loss 0.31778035 - time (sec): 18.43 - samples/sec: 8114.20 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:34:00,664 epoch 4 - iter 1246/1786 - loss 0.32088801 - time (sec): 21.46 - samples/sec: 8061.89 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:34:03,815 epoch 4 - iter 1424/1786 - loss 0.32135691 - time (sec): 24.61 - samples/sec: 8065.68 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:34:06,801 epoch 4 - iter 1602/1786 - loss 0.32028615 - time (sec): 27.60 - samples/sec: 8032.40 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:34:09,981 epoch 4 - iter 1780/1786 - loss 0.31755337 - time (sec): 30.78 - samples/sec: 8054.80 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:34:10,091 ----------------------------------------------------------------------------------------------------
2023-10-19 20:34:10,091 EPOCH 4 done: loss 0.3173 - lr: 0.000020
2023-10-19 20:34:12,477 DEV : loss 0.21633675694465637 - f1-score (micro avg) 0.4634
2023-10-19 20:34:12,492 saving best model
2023-10-19 20:34:12,528 ----------------------------------------------------------------------------------------------------
2023-10-19 20:34:15,215 epoch 5 - iter 178/1786 - loss 0.27680115 - time (sec): 2.69 - samples/sec: 9072.91 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:34:17,836 epoch 5 - iter 356/1786 - loss 0.28824880 - time (sec): 5.31 - samples/sec: 9164.73 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:34:20,835 epoch 5 - iter 534/1786 - loss 0.29282530 - time (sec): 8.31 - samples/sec: 8785.11 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:34:23,952 epoch 5 - iter 712/1786 - loss 0.29251154 - time (sec): 11.42 - samples/sec: 8608.71 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:34:27,467 epoch 5 - iter 890/1786 - loss 0.29527347 - time (sec): 14.94 - samples/sec: 8294.34 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:34:30,608 epoch 5 - iter 1068/1786 - loss 0.29254338 - time (sec): 18.08 - samples/sec: 8231.58 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:34:33,649 epoch 5 - iter 1246/1786 - loss 0.29410203 - time (sec): 21.12 - samples/sec: 8210.30 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:34:36,935 epoch 5 - iter 1424/1786 - loss 0.29602451 - time (sec): 24.41 - samples/sec: 8141.60 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:34:40,074 epoch 5 - iter 1602/1786 - loss 0.29176349 - time (sec): 27.55 - samples/sec: 8101.73 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:34:43,157 epoch 5 - iter 1780/1786 - loss 0.29064522 - time (sec): 30.63 - samples/sec: 8094.19 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:34:43,261 ----------------------------------------------------------------------------------------------------
2023-10-19 20:34:43,261 EPOCH 5 done: loss 0.2905 - lr: 0.000017
2023-10-19 20:34:46,153 DEV : loss 0.20622387528419495 - f1-score (micro avg) 0.4786
2023-10-19 20:34:46,168 saving best model
2023-10-19 20:34:46,201 ----------------------------------------------------------------------------------------------------
2023-10-19 20:34:49,342 epoch 6 - iter 178/1786 - loss 0.27032943 - time (sec): 3.14 - samples/sec: 7933.80 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:34:52,435 epoch 6 - iter 356/1786 - loss 0.27170061 - time (sec): 6.23 - samples/sec: 7725.16 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:34:55,670 epoch 6 - iter 534/1786 - loss 0.26713977 - time (sec): 9.47 - samples/sec: 7598.11 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:34:58,761 epoch 6 - iter 712/1786 - loss 0.26494070 - time (sec): 12.56 - samples/sec: 7680.41 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:35:01,862 epoch 6 - iter 890/1786 - loss 0.26762243 - time (sec): 15.66 - samples/sec: 7704.47 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:35:04,978 epoch 6 - iter 1068/1786 - loss 0.27032426 - time (sec): 18.78 - samples/sec: 7748.42 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:35:08,054 epoch 6 - iter 1246/1786 - loss 0.27154388 - time (sec): 21.85 - samples/sec: 7823.54 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:35:11,117 epoch 6 - iter 1424/1786 - loss 0.27377412 - time (sec): 24.91 - samples/sec: 7910.87 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:35:14,161 epoch 6 - iter 1602/1786 - loss 0.27292448 - time (sec): 27.96 - samples/sec: 7962.08 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:35:17,363 epoch 6 - iter 1780/1786 - loss 0.27251347 - time (sec): 31.16 - samples/sec: 7950.61 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:35:17,476 ----------------------------------------------------------------------------------------------------
2023-10-19 20:35:17,477 EPOCH 6 done: loss 0.2720 - lr: 0.000013
2023-10-19 20:35:19,851 DEV : loss 0.1968582421541214 - f1-score (micro avg) 0.4949
2023-10-19 20:35:19,865 saving best model
2023-10-19 20:35:19,900 ----------------------------------------------------------------------------------------------------
2023-10-19 20:35:22,936 epoch 7 - iter 178/1786 - loss 0.24006764 - time (sec): 3.03 - samples/sec: 7523.24 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:35:25,964 epoch 7 - iter 356/1786 - loss 0.25690864 - time (sec): 6.06 - samples/sec: 7769.94 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:35:29,059 epoch 7 - iter 534/1786 - loss 0.25439491 - time (sec): 9.16 - samples/sec: 7909.75 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:35:32,058 epoch 7 - iter 712/1786 - loss 0.25285107 - time (sec): 12.16 - samples/sec: 7921.22 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:35:35,190 epoch 7 - iter 890/1786 - loss 0.25158788 - time (sec): 15.29 - samples/sec: 7977.69 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:35:38,159 epoch 7 - iter 1068/1786 - loss 0.25381241 - time (sec): 18.26 - samples/sec: 7966.12 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:35:41,156 epoch 7 - iter 1246/1786 - loss 0.25712643 - time (sec): 21.25 - samples/sec: 7918.27 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:35:44,414 epoch 7 - iter 1424/1786 - loss 0.25746357 - time (sec): 24.51 - samples/sec: 8037.79 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:35:47,477 epoch 7 - iter 1602/1786 - loss 0.25757421 - time (sec): 27.58 - samples/sec: 8119.90 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:35:50,558 epoch 7 - iter 1780/1786 - loss 0.25799854 - time (sec): 30.66 - samples/sec: 8090.47 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:35:50,663 ----------------------------------------------------------------------------------------------------
2023-10-19 20:35:50,664 EPOCH 7 done: loss 0.2576 - lr: 0.000010
2023-10-19 20:35:53,513 DEV : loss 0.19658498466014862 - f1-score (micro avg) 0.4945
2023-10-19 20:35:53,528 ----------------------------------------------------------------------------------------------------
2023-10-19 20:35:56,505 epoch 8 - iter 178/1786 - loss 0.22844374 - time (sec): 2.98 - samples/sec: 7926.19 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:35:59,573 epoch 8 - iter 356/1786 - loss 0.23968996 - time (sec): 6.04 - samples/sec: 8226.99 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:36:02,632 epoch 8 - iter 534/1786 - loss 0.24477437 - time (sec): 9.10 - samples/sec: 8151.27 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:36:05,766 epoch 8 - iter 712/1786 - loss 0.24078464 - time (sec): 12.24 - samples/sec: 8098.25 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:36:08,877 epoch 8 - iter 890/1786 - loss 0.25006937 - time (sec): 15.35 - samples/sec: 7993.62 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:36:11,957 epoch 8 - iter 1068/1786 - loss 0.25146410 - time (sec): 18.43 - samples/sec: 8001.77 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:36:15,041 epoch 8 - iter 1246/1786 - loss 0.24964134 - time (sec): 21.51 - samples/sec: 7964.04 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:36:18,053 epoch 8 - iter 1424/1786 - loss 0.24702717 - time (sec): 24.53 - samples/sec: 7974.14 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:36:21,146 epoch 8 - iter 1602/1786 - loss 0.24740242 - time (sec): 27.62 - samples/sec: 8066.94 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:36:24,282 epoch 8 - iter 1780/1786 - loss 0.24591802 - time (sec): 30.75 - samples/sec: 8065.89 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:36:24,388 ----------------------------------------------------------------------------------------------------
2023-10-19 20:36:24,388 EPOCH 8 done: loss 0.2465 - lr: 0.000007
2023-10-19 20:36:26,749 DEV : loss 0.19389420747756958 - f1-score (micro avg) 0.5054
2023-10-19 20:36:26,764 saving best model
2023-10-19 20:36:26,799 ----------------------------------------------------------------------------------------------------
2023-10-19 20:36:29,850 epoch 9 - iter 178/1786 - loss 0.25583969 - time (sec): 3.05 - samples/sec: 7983.20 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:36:32,832 epoch 9 - iter 356/1786 - loss 0.24145976 - time (sec): 6.03 - samples/sec: 8129.78 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:36:35,417 epoch 9 - iter 534/1786 - loss 0.24379068 - time (sec): 8.62 - samples/sec: 8495.79 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:36:38,167 epoch 9 - iter 712/1786 - loss 0.24757821 - time (sec): 11.37 - samples/sec: 8686.25 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:36:41,229 epoch 9 - iter 890/1786 - loss 0.24744320 - time (sec): 14.43 - samples/sec: 8711.33 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:36:44,224 epoch 9 - iter 1068/1786 - loss 0.24257814 - time (sec): 17.42 - samples/sec: 8575.61 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:36:47,249 epoch 9 - iter 1246/1786 - loss 0.24236958 - time (sec): 20.45 - samples/sec: 8537.05 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:36:50,305 epoch 9 - iter 1424/1786 - loss 0.24100770 - time (sec): 23.51 - samples/sec: 8446.55 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:36:53,554 epoch 9 - iter 1602/1786 - loss 0.23994016 - time (sec): 26.75 - samples/sec: 8359.62 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:36:56,756 epoch 9 - iter 1780/1786 - loss 0.23975442 - time (sec): 29.96 - samples/sec: 8274.90 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:36:56,862 ----------------------------------------------------------------------------------------------------
2023-10-19 20:36:56,862 EPOCH 9 done: loss 0.2396 - lr: 0.000003
2023-10-19 20:36:59,733 DEV : loss 0.19508205354213715 - f1-score (micro avg) 0.5112
2023-10-19 20:36:59,749 saving best model
2023-10-19 20:36:59,785 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:02,860 epoch 10 - iter 178/1786 - loss 0.22700481 - time (sec): 3.07 - samples/sec: 8525.85 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:37:05,989 epoch 10 - iter 356/1786 - loss 0.22976632 - time (sec): 6.20 - samples/sec: 8348.54 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:37:09,068 epoch 10 - iter 534/1786 - loss 0.23666307 - time (sec): 9.28 - samples/sec: 8401.78 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:37:12,053 epoch 10 - iter 712/1786 - loss 0.23862130 - time (sec): 12.27 - samples/sec: 8350.81 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:37:15,183 epoch 10 - iter 890/1786 - loss 0.23849067 - time (sec): 15.40 - samples/sec: 8264.52 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:37:18,236 epoch 10 - iter 1068/1786 - loss 0.23907350 - time (sec): 18.45 - samples/sec: 8214.37 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:37:21,420 epoch 10 - iter 1246/1786 - loss 0.23722320 - time (sec): 21.63 - samples/sec: 8133.94 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:37:24,454 epoch 10 - iter 1424/1786 - loss 0.23624508 - time (sec): 24.67 - samples/sec: 8100.71 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:37:27,377 epoch 10 - iter 1602/1786 - loss 0.23688162 - time (sec): 27.59 - samples/sec: 8104.12 - lr: 0.000000 - momentum: 0.000000
2023-10-19 20:37:30,390 epoch 10 - iter 1780/1786 - loss 0.23692269 - time (sec): 30.60 - samples/sec: 8091.67 - lr: 0.000000 - momentum: 0.000000
2023-10-19 20:37:30,493 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:30,494 EPOCH 10 done: loss 0.2370 - lr: 0.000000
2023-10-19 20:37:32,851 DEV : loss 0.1948169320821762 - f1-score (micro avg) 0.5124
2023-10-19 20:37:32,865 saving best model
2023-10-19 20:37:32,928 ----------------------------------------------------------------------------------------------------
2023-10-19 20:37:32,928 Loading model from best epoch ...
2023-10-19 20:37:33,001 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 20:37:37,533
Results:
- F-score (micro) 0.4123
- F-score (macro) 0.241
- Accuracy 0.2682
By class:
              precision    recall  f1-score   support

         LOC     0.4067    0.5233    0.4577      1095
         PER     0.4154    0.5020    0.4546      1012
         ORG     0.0765    0.0392    0.0519       357
   HumanProd     0.0000    0.0000    0.0000        33

   micro avg     0.3890    0.4385    0.4123      2497
   macro avg     0.2246    0.2661    0.2410      2497
weighted avg     0.3576    0.4385    0.3924      2497
2023-10-19 20:37:37,533 ----------------------------------------------------------------------------------------------------