2023-10-25 16:03:45,800 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,801 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 16:03:45,801 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 Train: 7142 sentences
2023-10-25 16:03:45,802 (train_with_dev=False, train_with_test=False)
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 Training Params:
2023-10-25 16:03:45,802 - learning_rate: "3e-05"
2023-10-25 16:03:45,802 - mini_batch_size: "8"
2023-10-25 16:03:45,802 - max_epochs: "10"
2023-10-25 16:03:45,802 - shuffle: "True"
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 Plugins:
2023-10-25 16:03:45,802 - TensorboardLogger
2023-10-25 16:03:45,802 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
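The LinearScheduler plugin warms the learning rate up over the first 10% of all optimizer steps and then decays it linearly to zero; the lr column in the iteration lines that follow traces exactly this triangle (rising to 3e-05 near the end of epoch 1, then falling to 0.000000). A minimal sketch of that schedule, assuming per-step updates over 10 epochs of 893 mini-batches as in this run (the exact step granularity inside Flair's scheduler is an assumption here):

```python
def linear_schedule_lr(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (one-cycle triangle).

    Assumed to mirror Flair's LinearScheduler; the per-step granularity
    is a simplification for illustration.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 893  # epochs x mini-batches in this run
print(f"{linear_schedule_lr(89, total):.6f}")     # early warmup, ~0.000003
print(f"{linear_schedule_lr(893, total):.6f}")    # end of warmup, ~0.000030
print(f"{linear_schedule_lr(total, total):.6f}")  # final step, 0.000000
```

The three printed values line up with the lr reported at iter 89 of epoch 1, the end of epoch 1, and the last iteration of epoch 10.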
2023-10-25 16:03:45,802 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 16:03:45,802 - metric: "('micro avg', 'f1-score')"
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 Computation:
2023-10-25 16:03:45,802 - compute on device: cuda:0
2023-10-25 16:03:45,802 - embedding storage: none
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:45,802 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 16:03:51,443 epoch 1 - iter 89/893 - loss 2.11453385 - time (sec): 5.64 - samples/sec: 4239.80 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:03:57,103 epoch 1 - iter 178/893 - loss 1.35559978 - time (sec): 11.30 - samples/sec: 4304.94 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:04:02,845 epoch 1 - iter 267/893 - loss 1.03942567 - time (sec): 17.04 - samples/sec: 4288.79 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:04:08,818 epoch 1 - iter 356/893 - loss 0.83395820 - time (sec): 23.01 - samples/sec: 4326.26 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:04:14,737 epoch 1 - iter 445/893 - loss 0.71286418 - time (sec): 28.93 - samples/sec: 4262.92 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:04:20,666 epoch 1 - iter 534/893 - loss 0.62634480 - time (sec): 34.86 - samples/sec: 4252.19 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:04:26,458 epoch 1 - iter 623/893 - loss 0.55642282 - time (sec): 40.65 - samples/sec: 4291.46 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:04:32,257 epoch 1 - iter 712/893 - loss 0.50694716 - time (sec): 46.45 - samples/sec: 4285.38 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:04:38,477 epoch 1 - iter 801/893 - loss 0.46774892 - time (sec): 52.67 - samples/sec: 4241.13 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:04:44,453 epoch 1 - iter 890/893 - loss 0.43611724 - time (sec): 58.65 - samples/sec: 4231.79 - lr: 0.000030 - momentum: 0.000000
2023-10-25 16:04:44,630 ----------------------------------------------------------------------------------------------------
2023-10-25 16:04:44,630 EPOCH 1 done: loss 0.4353 - lr: 0.000030
2023-10-25 16:04:48,501 DEV : loss 0.10428953170776367 - f1-score (micro avg) 0.7302
2023-10-25 16:04:48,524 saving best model
2023-10-25 16:04:48,998 ----------------------------------------------------------------------------------------------------
2023-10-25 16:04:55,053 epoch 2 - iter 89/893 - loss 0.09529641 - time (sec): 6.05 - samples/sec: 4160.31 - lr: 0.000030 - momentum: 0.000000
2023-10-25 16:05:00,721 epoch 2 - iter 178/893 - loss 0.10475713 - time (sec): 11.72 - samples/sec: 4023.51 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:05:06,626 epoch 2 - iter 267/893 - loss 0.10179451 - time (sec): 17.63 - samples/sec: 4106.60 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:05:12,511 epoch 2 - iter 356/893 - loss 0.10043505 - time (sec): 23.51 - samples/sec: 4255.39 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:05:18,187 epoch 2 - iter 445/893 - loss 0.10096008 - time (sec): 29.19 - samples/sec: 4269.11 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:05:23,889 epoch 2 - iter 534/893 - loss 0.10266929 - time (sec): 34.89 - samples/sec: 4262.37 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:05:29,682 epoch 2 - iter 623/893 - loss 0.10198042 - time (sec): 40.68 - samples/sec: 4270.98 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:05:35,406 epoch 2 - iter 712/893 - loss 0.10191959 - time (sec): 46.41 - samples/sec: 4260.38 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:05:41,234 epoch 2 - iter 801/893 - loss 0.10071170 - time (sec): 52.23 - samples/sec: 4274.86 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:05:47,006 epoch 2 - iter 890/893 - loss 0.10161902 - time (sec): 58.01 - samples/sec: 4275.23 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:05:47,186 ----------------------------------------------------------------------------------------------------
2023-10-25 16:05:47,186 EPOCH 2 done: loss 0.1016 - lr: 0.000027
2023-10-25 16:05:52,168 DEV : loss 0.10751491039991379 - f1-score (micro avg) 0.7383
2023-10-25 16:05:52,188 saving best model
2023-10-25 16:05:52,845 ----------------------------------------------------------------------------------------------------
2023-10-25 16:05:58,345 epoch 3 - iter 89/893 - loss 0.06374085 - time (sec): 5.50 - samples/sec: 4324.31 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:06:04,113 epoch 3 - iter 178/893 - loss 0.06156188 - time (sec): 11.27 - samples/sec: 4279.94 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:06:09,777 epoch 3 - iter 267/893 - loss 0.06173512 - time (sec): 16.93 - samples/sec: 4295.50 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:06:15,478 epoch 3 - iter 356/893 - loss 0.06171922 - time (sec): 22.63 - samples/sec: 4245.33 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:06:21,478 epoch 3 - iter 445/893 - loss 0.06261596 - time (sec): 28.63 - samples/sec: 4188.25 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:06:27,370 epoch 3 - iter 534/893 - loss 0.06408842 - time (sec): 34.52 - samples/sec: 4210.14 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:06:33,249 epoch 3 - iter 623/893 - loss 0.06241415 - time (sec): 40.40 - samples/sec: 4222.35 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:06:39,262 epoch 3 - iter 712/893 - loss 0.06180175 - time (sec): 46.42 - samples/sec: 4251.49 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:06:45,091 epoch 3 - iter 801/893 - loss 0.06228107 - time (sec): 52.24 - samples/sec: 4264.61 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:06:50,664 epoch 3 - iter 890/893 - loss 0.06105178 - time (sec): 57.82 - samples/sec: 4288.20 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:06:50,839 ----------------------------------------------------------------------------------------------------
2023-10-25 16:06:50,839 EPOCH 3 done: loss 0.0610 - lr: 0.000023
2023-10-25 16:06:54,937 DEV : loss 0.1158902570605278 - f1-score (micro avg) 0.7961
2023-10-25 16:06:54,961 saving best model
2023-10-25 16:06:55,624 ----------------------------------------------------------------------------------------------------
2023-10-25 16:07:01,352 epoch 4 - iter 89/893 - loss 0.03630781 - time (sec): 5.73 - samples/sec: 4149.21 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:07:07,076 epoch 4 - iter 178/893 - loss 0.03930768 - time (sec): 11.45 - samples/sec: 4208.72 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:07:12,786 epoch 4 - iter 267/893 - loss 0.04266963 - time (sec): 17.16 - samples/sec: 4161.09 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:07:18,726 epoch 4 - iter 356/893 - loss 0.04067651 - time (sec): 23.10 - samples/sec: 4223.20 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:07:24,414 epoch 4 - iter 445/893 - loss 0.03956743 - time (sec): 28.79 - samples/sec: 4226.27 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:07:30,471 epoch 4 - iter 534/893 - loss 0.03919699 - time (sec): 34.85 - samples/sec: 4221.37 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:07:36,402 epoch 4 - iter 623/893 - loss 0.03896199 - time (sec): 40.78 - samples/sec: 4226.47 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:07:42,413 epoch 4 - iter 712/893 - loss 0.04026796 - time (sec): 46.79 - samples/sec: 4243.35 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:07:48,236 epoch 4 - iter 801/893 - loss 0.04044564 - time (sec): 52.61 - samples/sec: 4230.96 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:07:54,111 epoch 4 - iter 890/893 - loss 0.04197328 - time (sec): 58.49 - samples/sec: 4243.52 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:07:54,289 ----------------------------------------------------------------------------------------------------
2023-10-25 16:07:54,290 EPOCH 4 done: loss 0.0420 - lr: 0.000020
2023-10-25 16:07:59,427 DEV : loss 0.1348925083875656 - f1-score (micro avg) 0.7792
2023-10-25 16:07:59,450 ----------------------------------------------------------------------------------------------------
2023-10-25 16:08:05,238 epoch 5 - iter 89/893 - loss 0.02848867 - time (sec): 5.79 - samples/sec: 4191.05 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:08:10,903 epoch 5 - iter 178/893 - loss 0.03195236 - time (sec): 11.45 - samples/sec: 4197.68 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:08:16,954 epoch 5 - iter 267/893 - loss 0.03181815 - time (sec): 17.50 - samples/sec: 4164.19 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:08:22,702 epoch 5 - iter 356/893 - loss 0.03233453 - time (sec): 23.25 - samples/sec: 4173.83 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:08:28,352 epoch 5 - iter 445/893 - loss 0.03353148 - time (sec): 28.90 - samples/sec: 4170.75 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:08:33,942 epoch 5 - iter 534/893 - loss 0.03379190 - time (sec): 34.49 - samples/sec: 4191.81 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:08:39,759 epoch 5 - iter 623/893 - loss 0.03405769 - time (sec): 40.31 - samples/sec: 4233.44 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:08:45,341 epoch 5 - iter 712/893 - loss 0.03407352 - time (sec): 45.89 - samples/sec: 4255.30 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:08:51,211 epoch 5 - iter 801/893 - loss 0.03330545 - time (sec): 51.76 - samples/sec: 4302.16 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:08:56,795 epoch 5 - iter 890/893 - loss 0.03306264 - time (sec): 57.34 - samples/sec: 4327.41 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:08:56,979 ----------------------------------------------------------------------------------------------------
2023-10-25 16:08:56,980 EPOCH 5 done: loss 0.0332 - lr: 0.000017
2023-10-25 16:09:01,006 DEV : loss 0.15769165754318237 - f1-score (micro avg) 0.8062
2023-10-25 16:09:01,026 saving best model
2023-10-25 16:09:02,411 ----------------------------------------------------------------------------------------------------
2023-10-25 16:09:08,234 epoch 6 - iter 89/893 - loss 0.01753563 - time (sec): 5.82 - samples/sec: 4087.62 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:09:13,933 epoch 6 - iter 178/893 - loss 0.02304834 - time (sec): 11.52 - samples/sec: 4237.86 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:09:19,772 epoch 6 - iter 267/893 - loss 0.02763774 - time (sec): 17.36 - samples/sec: 4234.69 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:09:25,569 epoch 6 - iter 356/893 - loss 0.02508677 - time (sec): 23.15 - samples/sec: 4244.50 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:09:31,199 epoch 6 - iter 445/893 - loss 0.02568297 - time (sec): 28.79 - samples/sec: 4292.22 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:09:36,617 epoch 6 - iter 534/893 - loss 0.02645846 - time (sec): 34.20 - samples/sec: 4321.48 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:09:42,157 epoch 6 - iter 623/893 - loss 0.02627985 - time (sec): 39.74 - samples/sec: 4347.19 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:09:48,255 epoch 6 - iter 712/893 - loss 0.02557212 - time (sec): 45.84 - samples/sec: 4336.26 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:09:54,094 epoch 6 - iter 801/893 - loss 0.02579576 - time (sec): 51.68 - samples/sec: 4314.21 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:10:00,081 epoch 6 - iter 890/893 - loss 0.02567987 - time (sec): 57.67 - samples/sec: 4301.20 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:10:00,267 ----------------------------------------------------------------------------------------------------
2023-10-25 16:10:00,268 EPOCH 6 done: loss 0.0258 - lr: 0.000013
2023-10-25 16:10:04,475 DEV : loss 0.17935216426849365 - f1-score (micro avg) 0.7969
2023-10-25 16:10:04,496 ----------------------------------------------------------------------------------------------------
2023-10-25 16:10:10,572 epoch 7 - iter 89/893 - loss 0.01624137 - time (sec): 6.07 - samples/sec: 4274.27 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:10:16,503 epoch 7 - iter 178/893 - loss 0.01696138 - time (sec): 12.01 - samples/sec: 4285.74 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:10:22,335 epoch 7 - iter 267/893 - loss 0.01803794 - time (sec): 17.84 - samples/sec: 4312.78 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:10:27,975 epoch 7 - iter 356/893 - loss 0.01768749 - time (sec): 23.48 - samples/sec: 4346.49 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:10:33,788 epoch 7 - iter 445/893 - loss 0.01843579 - time (sec): 29.29 - samples/sec: 4297.72 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:10:39,585 epoch 7 - iter 534/893 - loss 0.01959203 - time (sec): 35.09 - samples/sec: 4315.20 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:10:45,169 epoch 7 - iter 623/893 - loss 0.01997406 - time (sec): 40.67 - samples/sec: 4310.17 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:10:50,640 epoch 7 - iter 712/893 - loss 0.01981503 - time (sec): 46.14 - samples/sec: 4324.79 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:10:56,455 epoch 7 - iter 801/893 - loss 0.02006920 - time (sec): 51.96 - samples/sec: 4324.53 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:11:02,151 epoch 7 - iter 890/893 - loss 0.02019253 - time (sec): 57.65 - samples/sec: 4302.38 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:11:02,324 ----------------------------------------------------------------------------------------------------
2023-10-25 16:11:02,325 EPOCH 7 done: loss 0.0203 - lr: 0.000010
2023-10-25 16:11:07,186 DEV : loss 0.19400793313980103 - f1-score (micro avg) 0.7987
2023-10-25 16:11:07,207 ----------------------------------------------------------------------------------------------------
2023-10-25 16:11:12,822 epoch 8 - iter 89/893 - loss 0.01615825 - time (sec): 5.61 - samples/sec: 4433.19 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:11:18,357 epoch 8 - iter 178/893 - loss 0.01263981 - time (sec): 11.15 - samples/sec: 4305.66 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:11:24,197 epoch 8 - iter 267/893 - loss 0.01173664 - time (sec): 16.99 - samples/sec: 4396.90 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:11:29,935 epoch 8 - iter 356/893 - loss 0.01453444 - time (sec): 22.73 - samples/sec: 4356.89 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:11:35,822 epoch 8 - iter 445/893 - loss 0.01465221 - time (sec): 28.61 - samples/sec: 4331.12 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:11:41,600 epoch 8 - iter 534/893 - loss 0.01486509 - time (sec): 34.39 - samples/sec: 4364.99 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:11:47,224 epoch 8 - iter 623/893 - loss 0.01399005 - time (sec): 40.02 - samples/sec: 4365.09 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:11:52,868 epoch 8 - iter 712/893 - loss 0.01436240 - time (sec): 45.66 - samples/sec: 4373.14 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:11:58,440 epoch 8 - iter 801/893 - loss 0.01450736 - time (sec): 51.23 - samples/sec: 4355.28 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:12:04,233 epoch 8 - iter 890/893 - loss 0.01512636 - time (sec): 57.02 - samples/sec: 4349.06 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:12:04,409 ----------------------------------------------------------------------------------------------------
2023-10-25 16:12:04,409 EPOCH 8 done: loss 0.0151 - lr: 0.000007
2023-10-25 16:12:09,035 DEV : loss 0.20422013103961945 - f1-score (micro avg) 0.8059
2023-10-25 16:12:09,056 ----------------------------------------------------------------------------------------------------
2023-10-25 16:12:14,823 epoch 9 - iter 89/893 - loss 0.00799801 - time (sec): 5.76 - samples/sec: 4645.19 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:12:20,720 epoch 9 - iter 178/893 - loss 0.01018495 - time (sec): 11.66 - samples/sec: 4454.68 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:12:26,567 epoch 9 - iter 267/893 - loss 0.00963239 - time (sec): 17.51 - samples/sec: 4454.23 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:12:32,090 epoch 9 - iter 356/893 - loss 0.01103547 - time (sec): 23.03 - samples/sec: 4396.07 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:12:37,638 epoch 9 - iter 445/893 - loss 0.01201791 - time (sec): 28.58 - samples/sec: 4326.54 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:12:43,104 epoch 9 - iter 534/893 - loss 0.01156303 - time (sec): 34.05 - samples/sec: 4310.93 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:12:49,695 epoch 9 - iter 623/893 - loss 0.01084288 - time (sec): 40.64 - samples/sec: 4238.58 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:12:55,519 epoch 9 - iter 712/893 - loss 0.01103957 - time (sec): 46.46 - samples/sec: 4242.86 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:13:01,126 epoch 9 - iter 801/893 - loss 0.01106522 - time (sec): 52.07 - samples/sec: 4252.21 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:13:06,983 epoch 9 - iter 890/893 - loss 0.01082742 - time (sec): 57.92 - samples/sec: 4282.64 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:13:07,162 ----------------------------------------------------------------------------------------------------
2023-10-25 16:13:07,162 EPOCH 9 done: loss 0.0108 - lr: 0.000003
2023-10-25 16:13:11,368 DEV : loss 0.20651929080486298 - f1-score (micro avg) 0.8026
2023-10-25 16:13:11,385 ----------------------------------------------------------------------------------------------------
2023-10-25 16:13:17,093 epoch 10 - iter 89/893 - loss 0.00654699 - time (sec): 5.71 - samples/sec: 4580.14 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:13:22,645 epoch 10 - iter 178/893 - loss 0.00895836 - time (sec): 11.26 - samples/sec: 4462.50 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:13:28,297 epoch 10 - iter 267/893 - loss 0.00924465 - time (sec): 16.91 - samples/sec: 4473.34 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:13:33,931 epoch 10 - iter 356/893 - loss 0.00922855 - time (sec): 22.54 - samples/sec: 4409.65 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:13:39,442 epoch 10 - iter 445/893 - loss 0.00964830 - time (sec): 28.06 - samples/sec: 4405.86 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:13:45,127 epoch 10 - iter 534/893 - loss 0.00919840 - time (sec): 33.74 - samples/sec: 4429.88 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:13:50,731 epoch 10 - iter 623/893 - loss 0.00902350 - time (sec): 39.34 - samples/sec: 4452.91 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:13:56,364 epoch 10 - iter 712/893 - loss 0.00826685 - time (sec): 44.98 - samples/sec: 4448.66 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:14:01,868 epoch 10 - iter 801/893 - loss 0.00834390 - time (sec): 50.48 - samples/sec: 4431.37 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:14:07,462 epoch 10 - iter 890/893 - loss 0.00816210 - time (sec): 56.08 - samples/sec: 4426.97 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:14:07,632 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:07,633 EPOCH 10 done: loss 0.0082 - lr: 0.000000
2023-10-25 16:14:12,660 DEV : loss 0.21082164347171783 - f1-score (micro avg) 0.8053
2023-10-25 16:14:13,131 ----------------------------------------------------------------------------------------------------
2023-10-25 16:14:13,133 Loading model from best epoch ...
2023-10-25 16:14:14,951 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
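The 17-tag dictionary follows the BIOES scheme (Single, Begin, Inside, End, plus O) over the four entity types PER, LOC, ORG, and HumanProd; at prediction time, tag sequences are decoded back into entity spans. A minimal illustrative decoder for such sequences (a sketch, not Flair's actual decoding code):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":            # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":          # entity opens here
            start = i
        elif prefix == "E" and start is not None:  # entity closes here
            spans.append((label, start, i + 1))
            start = None
        elif prefix != "I":          # "O" or malformed tag resets any open span
            start = None
    return spans

print(bioes_to_spans(["O", "S-LOC", "B-PER", "I-PER", "E-PER", "O"]))
# [('LOC', 1, 2), ('PER', 2, 5)]
```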
2023-10-25 16:14:28,429
Results:
- F-score (micro) 0.6946
- F-score (macro) 0.6319
- Accuracy 0.5453

By class:
              precision    recall  f1-score   support

         LOC     0.7229    0.6338    0.6754      1095
         PER     0.8044    0.7885    0.7964      1012
         ORG     0.4569    0.5490    0.4987       357
   HumanProd     0.4783    0.6667    0.5570        33

   micro avg     0.7046    0.6848    0.6946      2497
   macro avg     0.6156    0.6595    0.6319      2497
weighted avg     0.7147    0.6848    0.6976      2497
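The micro-averaged row can be cross-checked from the per-class rows: recovering approximate true-positive and prediction counts from each class's recall, precision, and support, then pooling them, reproduces the reported micro precision/recall/F1 (0.7046 / 0.6848 / 0.6946) up to rounding. A small sketch of that consistency check (the integer counts are recovered by rounding, so this is approximate by construction):

```python
# (precision, recall, support) per class, copied from the table above
per_class = {
    "LOC": (0.7229, 0.6338, 1095),
    "PER": (0.8044, 0.7885, 1012),
    "ORG": (0.4569, 0.5490, 357),
    "HumanProd": (0.4783, 0.6667, 33),
}

# TP ~= recall * support; predicted count ~= TP / precision
tp = sum(round(r * s) for p, r, s in per_class.values())
pred = sum(round(round(r * s) / p) for p, r, s in per_class.values())
gold = sum(s for _, _, s in per_class.values())

micro_p = tp / pred   # pooled precision over all classes
micro_r = tp / gold   # pooled recall over all classes
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(f"{micro_p:.4f} {micro_r:.4f} {micro_f1:.4f}")  # ~ 0.7046 0.6848 0.6946
```

This is why micro-F1 (0.6946) sits much closer to the large LOC/PER classes than macro-F1 (0.6319), which weights the weak ORG and tiny HumanProd classes equally.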
2023-10-25 16:14:28,430 ----------------------------------------------------------------------------------------------------