2023-10-25 16:03:45,800 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,801 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 16:03:45,801 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 Train: 7142 sentences
2023-10-25 16:03:45,802 (train_with_dev=False, train_with_test=False)
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 Training Params:
2023-10-25 16:03:45,802 - learning_rate: "3e-05"
2023-10-25 16:03:45,802 - mini_batch_size: "8"
2023-10-25 16:03:45,802 - max_epochs: "10"
2023-10-25 16:03:45,802 - shuffle: "True"
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 Plugins:
2023-10-25 16:03:45,802 - TensorboardLogger
2023-10-25 16:03:45,802 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
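The LinearScheduler plugin above warms the learning rate up over the first 10% of steps and then decays it linearly to zero, which is what the lr column in the batch logs below shows (ramping from 3e-06 to 3e-05 across epoch 1, then falling towards 0 by epoch 10). A minimal sketch of that schedule, assuming 893 iterations per epoch for 10 epochs as reported in the log (`linear_lr` is an illustrative name, not Flair's API):

```python
def linear_lr(step, base_lr=3e-05, total_steps=8930, warmup_fraction=0.1):
    """Linear warmup to base_lr, then linear decay to 0 (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 893 = exactly one epoch here
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    # decay linearly from base_lr at the end of warmup to 0 at the last step
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)
```

With these assumed step counts the function reproduces the logged values: ~3e-06 at iter 89 of epoch 1, 3e-05 at the end of epoch 1, and ~0 in the final iterations of epoch 10.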
2023-10-25 16:03:45,802 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 16:03:45,802 - metric: "('micro avg', 'f1-score')"
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 Computation:
2023-10-25 16:03:45,802 - compute on device: cuda:0
2023-10-25 16:03:45,802 - embedding storage: none
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 16:03:51,443 epoch 1 - iter 89/893 - loss 2.11453385 - time (sec): 5.64 - samples/sec: 4239.80 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:03:57,103 epoch 1 - iter 178/893 - loss 1.35559978 - time (sec): 11.30 - samples/sec: 4304.94 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:04:02,845 epoch 1 - iter 267/893 - loss 1.03942567 - time (sec): 17.04 - samples/sec: 4288.79 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:04:08,818 epoch 1 - iter 356/893 - loss 0.83395820 - time (sec): 23.01 - samples/sec: 4326.26 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:04:14,737 epoch 1 - iter 445/893 - loss 0.71286418 - time (sec): 28.93 - samples/sec: 4262.92 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:04:20,666 epoch 1 - iter 534/893 - loss 0.62634480 - time (sec): 34.86 - samples/sec: 4252.19 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:04:26,458 epoch 1 - iter 623/893 - loss 0.55642282 - time (sec): 40.65 - samples/sec: 4291.46 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:04:32,257 epoch 1 - iter 712/893 - loss 0.50694716 - time (sec): 46.45 - samples/sec: 4285.38 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:04:38,477 epoch 1 - iter 801/893 - loss 0.46774892 - time (sec): 52.67 - samples/sec: 4241.13 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:04:44,453 epoch 1 - iter 890/893 - loss 0.43611724 - time (sec): 58.65 - samples/sec: 4231.79 - lr: 0.000030 - momentum: 0.000000
2023-10-25 16:04:44,630 ----------------------------------------------------------------------------------------------------
2023-10-25 16:04:44,630 EPOCH 1 done: loss 0.4353 - lr: 0.000030
2023-10-25 16:04:48,501 DEV : loss 0.10428953170776367 - f1-score (micro avg) 0.7302
2023-10-25 16:04:48,524 saving best model
2023-10-25 16:04:48,998 ----------------------------------------------------------------------------------------------------
2023-10-25 16:04:55,053 epoch 2 - iter 89/893 - loss 0.09529641 - time (sec): 6.05 - samples/sec: 4160.31 - lr: 0.000030 - momentum: 0.000000
2023-10-25 16:05:00,721 epoch 2 - iter 178/893 - loss 0.10475713 - time (sec): 11.72 - samples/sec: 4023.51 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:05:06,626 epoch 2 - iter 267/893 - loss 0.10179451 - time (sec): 17.63 - samples/sec: 4106.60 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:05:12,511 epoch 2 - iter 356/893 - loss 0.10043505 - time (sec): 23.51 - samples/sec: 4255.39 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:05:18,187 epoch 2 - iter 445/893 - loss 0.10096008 - time (sec): 29.19 - samples/sec: 4269.11 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:05:23,889 epoch 2 - iter 534/893 - loss 0.10266929 - time (sec): 34.89 - samples/sec: 4262.37 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:05:29,682 epoch 2 - iter 623/893 - loss 0.10198042 - time (sec): 40.68 - samples/sec: 4270.98 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:05:35,406 epoch 2 - iter 712/893 - loss 0.10191959 - time (sec): 46.41 - samples/sec: 4260.38 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:05:41,234 epoch 2 - iter 801/893 - loss 0.10071170 - time (sec): 52.23 - samples/sec: 4274.86 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:05:47,006 epoch 2 - iter 890/893 - loss 0.10161902 - time (sec): 58.01 - samples/sec: 4275.23 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:05:47,186 ----------------------------------------------------------------------------------------------------
2023-10-25 16:05:47,186 EPOCH 2 done: loss 0.1016 - lr: 0.000027
2023-10-25 16:05:52,168 DEV : loss 0.10751491039991379 - f1-score (micro avg) 0.7383
2023-10-25 16:05:52,188 saving best model
2023-10-25 16:05:52,845 ----------------------------------------------------------------------------------------------------
2023-10-25 16:05:58,345 epoch 3 - iter 89/893 - loss 0.06374085 - time (sec): 5.50 - samples/sec: 4324.31 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:06:04,113 epoch 3 - iter 178/893 - loss 0.06156188 - time (sec): 11.27 - samples/sec: 4279.94 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:06:09,777 epoch 3 - iter 267/893 - loss 0.06173512 - time (sec): 16.93 - samples/sec: 4295.50 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:06:15,478 epoch 3 - iter 356/893 - loss 0.06171922 - time (sec): 22.63 - samples/sec: 4245.33 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:06:21,478 epoch 3 - iter 445/893 - loss 0.06261596 - time (sec): 28.63 - samples/sec: 4188.25 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:06:27,370 epoch 3 - iter 534/893 - loss 0.06408842 - time (sec): 34.52 - samples/sec: 4210.14 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:06:33,249 epoch 3 - iter 623/893 - loss 0.06241415 - time (sec): 40.40 - samples/sec: 4222.35 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:06:39,262 epoch 3 - iter 712/893 - loss 0.06180175 - time (sec): 46.42 - samples/sec: 4251.49 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:06:45,091 epoch 3 - iter 801/893 - loss 0.06228107 - time (sec): 52.24 - samples/sec: 4264.61 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:06:50,664 epoch 3 - iter 890/893 - loss 0.06105178 - time (sec): 57.82 - samples/sec: 4288.20 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:06:50,839 ----------------------------------------------------------------------------------------------------
2023-10-25 16:06:50,839 EPOCH 3 done: loss 0.0610 - lr: 0.000023
2023-10-25 16:06:54,937 DEV : loss 0.1158902570605278 - f1-score (micro avg) 0.7961
2023-10-25 16:06:54,961 saving best model
2023-10-25 16:06:55,624 ----------------------------------------------------------------------------------------------------
2023-10-25 16:07:01,352 epoch 4 - iter 89/893 - loss 0.03630781 - time (sec): 5.73 - samples/sec: 4149.21 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:07:07,076 epoch 4 - iter 178/893 - loss 0.03930768 - time (sec): 11.45 - samples/sec: 4208.72 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:07:12,786 epoch 4 - iter 267/893 - loss 0.04266963 - time (sec): 17.16 - samples/sec: 4161.09 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:07:18,726 epoch 4 - iter 356/893 - loss 0.04067651 - time (sec): 23.10 - samples/sec: 4223.20 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:07:24,414 epoch 4 - iter 445/893 - loss 0.03956743 - time (sec): 28.79 - samples/sec: 4226.27 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:07:30,471 epoch 4 - iter 534/893 - loss 0.03919699 - time (sec): 34.85 - samples/sec: 4221.37 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:07:36,402 epoch 4 - iter 623/893 - loss 0.03896199 - time (sec): 40.78 - samples/sec: 4226.47 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:07:42,413 epoch 4 - iter 712/893 - loss 0.04026796 - time (sec): 46.79 - samples/sec: 4243.35 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:07:48,236 epoch 4 - iter 801/893 - loss 0.04044564 - time (sec): 52.61 - samples/sec: 4230.96 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:07:54,111 epoch 4 - iter 890/893 - loss 0.04197328 - time (sec): 58.49 - samples/sec: 4243.52 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:07:54,289 ----------------------------------------------------------------------------------------------------
2023-10-25 16:07:54,290 EPOCH 4 done: loss 0.0420 - lr: 0.000020
2023-10-25 16:07:59,427 DEV : loss 0.1348925083875656 - f1-score (micro avg) 0.7792
2023-10-25 16:07:59,450 ----------------------------------------------------------------------------------------------------
2023-10-25 16:08:05,238 epoch 5 - iter 89/893 - loss 0.02848867 - time (sec): 5.79 - samples/sec: 4191.05 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:08:10,903 epoch 5 - iter 178/893 - loss 0.03195236 - time (sec): 11.45 - samples/sec: 4197.68 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:08:16,954 epoch 5 - iter 267/893 - loss 0.03181815 - time (sec): 17.50 - samples/sec: 4164.19 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:08:22,702 epoch 5 - iter 356/893 - loss 0.03233453 - time (sec): 23.25 - samples/sec: 4173.83 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:08:28,352 epoch 5 - iter 445/893 - loss 0.03353148 - time (sec): 28.90 - samples/sec: 4170.75 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:08:33,942 epoch 5 - iter 534/893 - loss 0.03379190 - time (sec): 34.49 - samples/sec: 4191.81 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:08:39,759 epoch 5 - iter 623/893 - loss 0.03405769 - time (sec): 40.31 - samples/sec: 4233.44 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:08:45,341 epoch 5 - iter 712/893 - loss 0.03407352 - time (sec): 45.89 - samples/sec: 4255.30 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:08:51,211 epoch 5 - iter 801/893 - loss 0.03330545 - time (sec): 51.76 - samples/sec: 4302.16 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:08:56,795 epoch 5 - iter 890/893 - loss 0.03306264 - time (sec): 57.34 - samples/sec: 4327.41 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:08:56,979 ----------------------------------------------------------------------------------------------------
2023-10-25 16:08:56,980 EPOCH 5 done: loss 0.0332 - lr: 0.000017
2023-10-25 16:09:01,006 DEV : loss 0.15769165754318237 - f1-score (micro avg) 0.8062
2023-10-25 16:09:01,026 saving best model
2023-10-25 16:09:02,411 ----------------------------------------------------------------------------------------------------
2023-10-25 16:09:08,234 epoch 6 - iter 89/893 - loss 0.01753563 - time (sec): 5.82 - samples/sec: 4087.62 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:09:13,933 epoch 6 - iter 178/893 - loss 0.02304834 - time (sec): 11.52 - samples/sec: 4237.86 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:09:19,772 epoch 6 - iter 267/893 - loss 0.02763774 - time (sec): 17.36 - samples/sec: 4234.69 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:09:25,569 epoch 6 - iter 356/893 - loss 0.02508677 - time (sec): 23.15 - samples/sec: 4244.50 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:09:31,199 epoch 6 - iter 445/893 - loss 0.02568297 - time (sec): 28.79 - samples/sec: 4292.22 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:09:36,617 epoch 6 - iter 534/893 - loss 0.02645846 - time (sec): 34.20 - samples/sec: 4321.48 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:09:42,157 epoch 6 - iter 623/893 - loss 0.02627985 - time (sec): 39.74 - samples/sec: 4347.19 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:09:48,255 epoch 6 - iter 712/893 - loss 0.02557212 - time (sec): 45.84 - samples/sec: 4336.26 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:09:54,094 epoch 6 - iter 801/893 - loss 0.02579576 - time (sec): 51.68 - samples/sec: 4314.21 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:10:00,081 epoch 6 - iter 890/893 - loss 0.02567987 - time (sec): 57.67 - samples/sec: 4301.20 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:10:00,267 ----------------------------------------------------------------------------------------------------
2023-10-25 16:10:00,268 EPOCH 6 done: loss 0.0258 - lr: 0.000013
2023-10-25 16:10:04,475 DEV : loss 0.17935216426849365 - f1-score (micro avg) 0.7969
2023-10-25 16:10:04,496 ----------------------------------------------------------------------------------------------------
2023-10-25 16:10:10,572 epoch 7 - iter 89/893 - loss 0.01624137 - time (sec): 6.07 - samples/sec: 4274.27 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:10:16,503 epoch 7 - iter 178/893 - loss 0.01696138 - time (sec): 12.01 - samples/sec: 4285.74 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:10:22,335 epoch 7 - iter 267/893 - loss 0.01803794 - time (sec): 17.84 - samples/sec: 4312.78 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:10:27,975 epoch 7 - iter 356/893 - loss 0.01768749 - time (sec): 23.48 - samples/sec: 4346.49 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:10:33,788 epoch 7 - iter 445/893 - loss 0.01843579 - time (sec): 29.29 - samples/sec: 4297.72 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:10:39,585 epoch 7 - iter 534/893 - loss 0.01959203 - time (sec): 35.09 - samples/sec: 4315.20 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:10:45,169 epoch 7 - iter 623/893 - loss 0.01997406 - time (sec): 40.67 - samples/sec: 4310.17 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:10:50,640 epoch 7 - iter 712/893 - loss 0.01981503 - time (sec): 46.14 - samples/sec: 4324.79 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:10:56,455 epoch 7 - iter 801/893 - loss 0.02006920 - time (sec): 51.96 - samples/sec: 4324.53 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:11:02,151 epoch 7 - iter 890/893 - loss 0.02019253 - time (sec): 57.65 - samples/sec: 4302.38 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:11:02,324 ----------------------------------------------------------------------------------------------------
2023-10-25 16:11:02,325 EPOCH 7 done: loss 0.0203 - lr: 0.000010
2023-10-25 16:11:07,186 DEV : loss 0.19400793313980103 - f1-score (micro avg) 0.7987
2023-10-25 16:11:07,207 ----------------------------------------------------------------------------------------------------
2023-10-25 16:11:12,822 epoch 8 - iter 89/893 - loss 0.01615825 - time (sec): 5.61 - samples/sec: 4433.19 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:11:18,357 epoch 8 - iter 178/893 - loss 0.01263981 - time (sec): 11.15 - samples/sec: 4305.66 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:11:24,197 epoch 8 - iter 267/893 - loss 0.01173664 - time (sec): 16.99 - samples/sec: 4396.90 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:11:29,935 epoch 8 - iter 356/893 - loss 0.01453444 - time (sec): 22.73 - samples/sec: 4356.89 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:11:35,822 epoch 8 - iter 445/893 - loss 0.01465221 - time (sec): 28.61 - samples/sec: 4331.12 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:11:41,600 epoch 8 - iter 534/893 - loss 0.01486509 - time (sec): 34.39 - samples/sec: 4364.99 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:11:47,224 epoch 8 - iter 623/893 - loss 0.01399005 - time (sec): 40.02 - samples/sec: 4365.09 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:11:52,868 epoch 8 - iter 712/893 - loss 0.01436240 - time (sec): 45.66 - samples/sec: 4373.14 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:11:58,440 epoch 8 - iter 801/893 - loss 0.01450736 - time (sec): 51.23 - samples/sec: 4355.28 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:12:04,233 epoch 8 - iter 890/893 - loss 0.01512636 - time (sec): 57.02 - samples/sec: 4349.06 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:12:04,409 ----------------------------------------------------------------------------------------------------
2023-10-25 16:12:04,409 EPOCH 8 done: loss 0.0151 - lr: 0.000007
2023-10-25 16:12:09,035 DEV : loss 0.20422013103961945 - f1-score (micro avg) 0.8059
2023-10-25 16:12:09,056 ----------------------------------------------------------------------------------------------------
2023-10-25 16:12:14,823 epoch 9 - iter 89/893 - loss 0.00799801 - time (sec): 5.76 - samples/sec: 4645.19 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:12:20,720 epoch 9 - iter 178/893 - loss 0.01018495 - time (sec): 11.66 - samples/sec: 4454.68 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:12:26,567 epoch 9 - iter 267/893 - loss 0.00963239 - time (sec): 17.51 - samples/sec: 4454.23 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:12:32,090 epoch 9 - iter 356/893 - loss 0.01103547 - time (sec): 23.03 - samples/sec: 4396.07 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:12:37,638 epoch 9 - iter 445/893 - loss 0.01201791 - time (sec): 28.58 - samples/sec: 4326.54 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:12:43,104 epoch 9 - iter 534/893 - loss 0.01156303 - time (sec): 34.05 - samples/sec: 4310.93 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:12:49,695 epoch 9 - iter 623/893 - loss 0.01084288 - time (sec): 40.64 - samples/sec: 4238.58 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:12:55,519 epoch 9 - iter 712/893 - loss 0.01103957 - time (sec): 46.46 - samples/sec: 4242.86 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:13:01,126 epoch 9 - iter 801/893 - loss 0.01106522 - time (sec): 52.07 - samples/sec: 4252.21 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:13:06,983 epoch 9 - iter 890/893 - loss 0.01082742 - time (sec): 57.92 - samples/sec: 4282.64 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:13:07,162 ----------------------------------------------------------------------------------------------------
2023-10-25 16:13:07,162 EPOCH 9 done: loss 0.0108 - lr: 0.000003
2023-10-25 16:13:11,368 DEV : loss 0.20651929080486298 - f1-score (micro avg) 0.8026
2023-10-25 16:13:11,385 ----------------------------------------------------------------------------------------------------
2023-10-25 16:13:17,093 epoch 10 - iter 89/893 - loss 0.00654699 - time (sec): 5.71 - samples/sec: 4580.14 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:13:22,645 epoch 10 - iter 178/893 - loss 0.00895836 - time (sec): 11.26 - samples/sec: 4462.50 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:13:28,297 epoch 10 - iter 267/893 - loss 0.00924465 - time (sec): 16.91 - samples/sec: 4473.34 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:13:33,931 epoch 10 - iter 356/893 - loss 0.00922855 - time (sec): 22.54 - samples/sec: 4409.65 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:13:39,442 epoch 10 - iter 445/893 - loss 0.00964830 - time (sec): 28.06 - samples/sec: 4405.86 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:13:45,127 epoch 10 - iter 534/893 - loss 0.00919840 - time (sec): 33.74 - samples/sec: 4429.88 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:13:50,731 epoch 10 - iter 623/893 - loss 0.00902350 - time (sec): 39.34 - samples/sec: 4452.91 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:13:56,364 epoch 10 - iter 712/893 - loss 0.00826685 - time (sec): 44.98 - samples/sec: 4448.66 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:14:01,868 epoch 10 - iter 801/893 - loss 0.00834390 - time (sec): 50.48 - samples/sec: 4431.37 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:14:07,462 epoch 10 - iter 890/893 - loss 0.00816210 - time (sec): 56.08 - samples/sec: 4426.97 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:14:07,632 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:07,633 EPOCH 10 done: loss 0.0082 - lr: 0.000000
2023-10-25 16:14:12,660 DEV : loss 0.21082164347171783 - f1-score (micro avg) 0.8053
2023-10-25 16:14:13,131 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:13,133 Loading model from best epoch ...
2023-10-25 16:14:14,951 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
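The 17-tag dictionary above is a BIOES scheme: S-ingle, B-egin, E-nd, and I-nside prefixes for each of the four entity types, plus O. A minimal sketch of how such a tag sequence decodes into entity spans, assuming well-formed sequences (`bioes_to_spans` is an illustrative helper, not part of Flair):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end_exclusive, label) spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
        elif tag.startswith("S-"):          # single-token entity
            spans.append((i, i + 1, tag[2:]))
            start, label = None, None
        elif tag.startswith("B-"):          # open a multi-token entity
            start, label = i, tag[2:]
        elif tag.startswith("E-") and label == tag[2:]:  # close it
            spans.append((start, i + 1, label))
            start, label = None, None
        # I- tags simply continue the currently open span
    return spans

# e.g. ["B-PER", "I-PER", "E-PER", "O", "S-LOC"] -> [(0, 3, "PER"), (4, 5, "LOC")]
```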
2023-10-25 16:14:28,429
Results:
- F-score (micro) 0.6946
- F-score (macro) 0.6319
- Accuracy 0.5453
By class:
              precision    recall  f1-score   support

         LOC     0.7229    0.6338    0.6754      1095
         PER     0.8044    0.7885    0.7964      1012
         ORG     0.4569    0.5490    0.4987       357
   HumanProd     0.4783    0.6667    0.5570        33

   micro avg     0.7046    0.6848    0.6946      2497
   macro avg     0.6156    0.6595    0.6319      2497
weighted avg     0.7147    0.6848    0.6976      2497
2023-10-25 16:14:28,430 ----------------------------------------------------------------------------------------------------
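As a sanity check, the micro-averaged scores can be reconstructed from the per-class rows above: per class, TP ≈ recall × support and predicted count ≈ TP / precision. A sketch (the recovered values agree with the reported micro avg to within the table's 4-decimal rounding):

```python
# per-class (precision, recall, support), copied from the test-set table above
by_class = {
    "LOC":       (0.7229, 0.6338, 1095),
    "PER":       (0.8044, 0.7885, 1012),
    "ORG":       (0.4569, 0.5490, 357),
    "HumanProd": (0.4783, 0.6667, 33),
}

# recover approximate counts: TP = recall * support, predicted = TP / precision
tp   = sum(r * s for p, r, s in by_class.values())
pred = sum(r * s / p for p, r, s in by_class.values())
gold = sum(s for _, _, s in by_class.values())       # 2497 gold entities

micro_p  = tp / pred
micro_r  = tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
# within rounding of the reported micro avg: 0.7046 / 0.6848 / 0.6946
```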