2023-10-25 15:18:28,415 ----------------------------------------------------------------------------------------------------
2023-10-25 15:18:28,416 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 15:18:28,416 ----------------------------------------------------------------------------------------------------
2023-10-25 15:18:28,416 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 15:18:28,416 ----------------------------------------------------------------------------------------------------
2023-10-25 15:18:28,416 Train: 7142 sentences
2023-10-25 15:18:28,417 (train_with_dev=False, train_with_test=False)
2023-10-25 15:18:28,417 ----------------------------------------------------------------------------------------------------
2023-10-25 15:18:28,417 Training Params:
2023-10-25 15:18:28,417 - learning_rate: "5e-05"
2023-10-25 15:18:28,417 - mini_batch_size: "8"
2023-10-25 15:18:28,417 - max_epochs: "10"
2023-10-25 15:18:28,417 - shuffle: "True"
2023-10-25 15:18:28,417 ----------------------------------------------------------------------------------------------------
2023-10-25 15:18:28,417 Plugins:
2023-10-25 15:18:28,417 - TensorboardLogger
2023-10-25 15:18:28,417 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 15:18:28,417 ----------------------------------------------------------------------------------------------------
2023-10-25 15:18:28,417 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 15:18:28,417 - metric: "('micro avg', 'f1-score')"
2023-10-25 15:18:28,417 ----------------------------------------------------------------------------------------------------
2023-10-25 15:18:28,417 Computation:
2023-10-25 15:18:28,418 - compute on device: cuda:0
2023-10-25 15:18:28,418 - embedding storage: none
2023-10-25 15:18:28,418 ----------------------------------------------------------------------------------------------------
2023-10-25 15:18:28,418 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 15:18:28,418 ----------------------------------------------------------------------------------------------------
2023-10-25 15:18:28,418 ----------------------------------------------------------------------------------------------------
2023-10-25 15:18:28,418 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 15:18:34,554 epoch 1 - iter 89/893 - loss 1.99536437 - time (sec): 6.13 - samples/sec: 4156.59 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:18:40,462 epoch 1 - iter 178/893 - loss 1.28347735 - time (sec): 12.04 - samples/sec: 4054.53 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:18:46,184 epoch 1 - iter 267/893 - loss 0.96621297 - time (sec): 17.77 - samples/sec: 4067.33 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:18:52,212 epoch 1 - iter 356/893 - loss 0.78153614 - time (sec): 23.79 - samples/sec: 4068.56 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:18:58,243 epoch 1 - iter 445/893 - loss 0.65825937 - time (sec): 29.82 - samples/sec: 4089.24 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:19:04,627 epoch 1 - iter 534/893 - loss 0.56876226 - time (sec): 36.21 - samples/sec: 4096.24 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:19:10,683 epoch 1 - iter 623/893 - loss 0.51115509 - time (sec): 42.26 - samples/sec: 4096.86 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:19:16,477 epoch 1 - iter 712/893 - loss 0.46644377 - time (sec): 48.06 - samples/sec: 4119.52 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:19:22,505 epoch 1 - iter 801/893 - loss 0.43085918 - time (sec): 54.09 - samples/sec: 4116.89 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:19:28,588 epoch 1 - iter 890/893 - loss 0.40247481 - time (sec): 60.17 - samples/sec: 4123.30 - lr: 0.000050 - momentum: 0.000000
2023-10-25 15:19:28,768 ----------------------------------------------------------------------------------------------------
2023-10-25 15:19:28,768 EPOCH 1 done: loss 0.4017 - lr: 0.000050
2023-10-25 15:19:32,698 DEV : loss 0.129494771361351 - f1-score (micro avg)  0.7147
2023-10-25 15:19:32,721 saving best model
2023-10-25 15:19:33,191 ----------------------------------------------------------------------------------------------------
2023-10-25 15:19:39,171 epoch 2 - iter 89/893 - loss 0.11570827 - time (sec): 5.98 - samples/sec: 4292.85 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:19:44,703 epoch 2 - iter 178/893 - loss 0.11300717 - time (sec): 11.51 - samples/sec: 4096.21 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:19:50,654 epoch 2 - iter 267/893 - loss 0.10732609 - time (sec): 17.46 - samples/sec: 4226.81 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:19:56,334 epoch 2 - iter 356/893 - loss 0.10846905 - time (sec): 23.14 - samples/sec: 4255.34 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:20:02,251 epoch 2 - iter 445/893 - loss 0.10556608 - time (sec): 29.06 - samples/sec: 4268.35 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:20:07,968 epoch 2 - iter 534/893 - loss 0.10460658 - time (sec): 34.78 - samples/sec: 4277.84 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:20:13,816 epoch 2 - iter 623/893 - loss 0.10460754 - time (sec): 40.62 - samples/sec: 4304.95 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:20:19,498 epoch 2 - iter 712/893 - loss 0.10363498 - time (sec): 46.31 - samples/sec: 4259.99 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:20:25,661 epoch 2 - iter 801/893 - loss 0.10398883 - time (sec): 52.47 - samples/sec: 4243.14 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:20:31,642 epoch 2 - iter 890/893 - loss 0.10345717 - time (sec): 58.45 - samples/sec: 4238.82 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:20:31,837 ----------------------------------------------------------------------------------------------------
2023-10-25 15:20:31,837 EPOCH 2 done: loss 0.1032 - lr: 0.000044
2023-10-25 15:20:36,043 DEV : loss 0.1055043414235115 - f1-score (micro avg)  0.7684
2023-10-25 15:20:36,066 saving best model
2023-10-25 15:20:36,737 ----------------------------------------------------------------------------------------------------
2023-10-25 15:20:43,647 epoch 3 - iter 89/893 - loss 0.06314996 - time (sec): 6.91 - samples/sec: 3683.92 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:20:49,448 epoch 3 - iter 178/893 - loss 0.05732830 - time (sec): 12.71 - samples/sec: 3851.88 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:20:55,350 epoch 3 - iter 267/893 - loss 0.06133970 - time (sec): 18.61 - samples/sec: 4029.84 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:21:01,032 epoch 3 - iter 356/893 - loss 0.06089097 - time (sec): 24.29 - samples/sec: 4104.42 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:21:06,667 epoch 3 - iter 445/893 - loss 0.06302484 - time (sec): 29.93 - samples/sec: 4110.82 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:21:12,590 epoch 3 - iter 534/893 - loss 0.06411251 - time (sec): 35.85 - samples/sec: 4120.08 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:21:18,784 epoch 3 - iter 623/893 - loss 0.06511810 - time (sec): 42.04 - samples/sec: 4134.48 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:21:24,821 epoch 3 - iter 712/893 - loss 0.06623456 - time (sec): 48.08 - samples/sec: 4153.85 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:21:30,738 epoch 3 - iter 801/893 - loss 0.06510690 - time (sec): 54.00 - samples/sec: 4169.43 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:21:36,527 epoch 3 - iter 890/893 - loss 0.06434040 - time (sec): 59.79 - samples/sec: 4151.20 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:21:36,707 ----------------------------------------------------------------------------------------------------
2023-10-25 15:21:36,707 EPOCH 3 done: loss 0.0643 - lr: 0.000039
2023-10-25 15:21:40,732 DEV : loss 0.10246531665325165 - f1-score (micro avg)  0.7631
2023-10-25 15:21:40,752 ----------------------------------------------------------------------------------------------------
2023-10-25 15:21:46,872 epoch 4 - iter 89/893 - loss 0.04679698 - time (sec): 6.12 - samples/sec: 4077.93 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:21:52,842 epoch 4 - iter 178/893 - loss 0.04843306 - time (sec): 12.09 - samples/sec: 4130.07 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:21:58,578 epoch 4 - iter 267/893 - loss 0.04773681 - time (sec): 17.82 - samples/sec: 4119.76 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:22:04,436 epoch 4 - iter 356/893 - loss 0.04706353 - time (sec): 23.68 - samples/sec: 4188.67 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:22:10,388 epoch 4 - iter 445/893 - loss 0.04712251 - time (sec): 29.63 - samples/sec: 4192.99 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:22:16,548 epoch 4 - iter 534/893 - loss 0.04626453 - time (sec): 35.79 - samples/sec: 4217.43 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:22:22,420 epoch 4 - iter 623/893 - loss 0.04772297 - time (sec): 41.67 - samples/sec: 4200.40 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:22:28,169 epoch 4 - iter 712/893 - loss 0.04802724 - time (sec): 47.42 - samples/sec: 4177.73 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:22:34,271 epoch 4 - iter 801/893 - loss 0.04644315 - time (sec): 53.52 - samples/sec: 4194.85 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:22:40,085 epoch 4 - iter 890/893 - loss 0.04640298 - time (sec): 59.33 - samples/sec: 4183.14 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:22:40,257 ----------------------------------------------------------------------------------------------------
2023-10-25 15:22:40,257 EPOCH 4 done: loss 0.0465 - lr: 0.000033
2023-10-25 15:22:45,236 DEV : loss 0.12659873068332672 - f1-score (micro avg)  0.795
2023-10-25 15:22:45,256 saving best model
2023-10-25 15:22:45,952 ----------------------------------------------------------------------------------------------------
2023-10-25 15:22:51,820 epoch 5 - iter 89/893 - loss 0.03385452 - time (sec): 5.86 - samples/sec: 4049.42 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:22:57,539 epoch 5 - iter 178/893 - loss 0.03418212 - time (sec): 11.58 - samples/sec: 4271.19 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:23:03,297 epoch 5 - iter 267/893 - loss 0.03554924 - time (sec): 17.34 - samples/sec: 4336.00 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:23:08,625 epoch 5 - iter 356/893 - loss 0.03505123 - time (sec): 22.67 - samples/sec: 4387.92 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:23:14,192 epoch 5 - iter 445/893 - loss 0.03409722 - time (sec): 28.24 - samples/sec: 4378.75 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:23:20,011 epoch 5 - iter 534/893 - loss 0.03585306 - time (sec): 34.05 - samples/sec: 4402.79 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:23:25,675 epoch 5 - iter 623/893 - loss 0.03523579 - time (sec): 39.72 - samples/sec: 4383.02 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:23:31,217 epoch 5 - iter 712/893 - loss 0.03457740 - time (sec): 45.26 - samples/sec: 4379.15 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:23:36,764 epoch 5 - iter 801/893 - loss 0.03595039 - time (sec): 50.81 - samples/sec: 4386.59 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:23:42,344 epoch 5 - iter 890/893 - loss 0.03641465 - time (sec): 56.39 - samples/sec: 4398.95 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:23:42,509 ----------------------------------------------------------------------------------------------------
2023-10-25 15:23:42,510 EPOCH 5 done: loss 0.0363 - lr: 0.000028
2023-10-25 15:23:46,433 DEV : loss 0.15950827300548553 - f1-score (micro avg)  0.8068
2023-10-25 15:23:46,454 saving best model
2023-10-25 15:23:47,110 ----------------------------------------------------------------------------------------------------
2023-10-25 15:23:53,621 epoch 6 - iter 89/893 - loss 0.01843165 - time (sec): 6.51 - samples/sec: 3866.75 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:23:59,070 epoch 6 - iter 178/893 - loss 0.02060189 - time (sec): 11.96 - samples/sec: 4081.75 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:24:04,597 epoch 6 - iter 267/893 - loss 0.02287458 - time (sec): 17.48 - samples/sec: 4238.78 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:24:10,175 epoch 6 - iter 356/893 - loss 0.02618184 - time (sec): 23.06 - samples/sec: 4299.53 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:24:15,632 epoch 6 - iter 445/893 - loss 0.02572160 - time (sec): 28.52 - samples/sec: 4391.01 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:24:20,990 epoch 6 - iter 534/893 - loss 0.02653895 - time (sec): 33.88 - samples/sec: 4380.81 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:24:26,774 epoch 6 - iter 623/893 - loss 0.02582608 - time (sec): 39.66 - samples/sec: 4384.91 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:24:32,333 epoch 6 - iter 712/893 - loss 0.02556832 - time (sec): 45.22 - samples/sec: 4406.91 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:24:37,881 epoch 6 - iter 801/893 - loss 0.02666351 - time (sec): 50.77 - samples/sec: 4411.11 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:24:43,421 epoch 6 - iter 890/893 - loss 0.02679083 - time (sec): 56.31 - samples/sec: 4408.04 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:24:43,612 ----------------------------------------------------------------------------------------------------
2023-10-25 15:24:43,612 EPOCH 6 done: loss 0.0269 - lr: 0.000022
2023-10-25 15:24:47,464 DEV : loss 0.16690844297409058 - f1-score (micro avg)  0.7946
2023-10-25 15:24:47,486 ----------------------------------------------------------------------------------------------------
2023-10-25 15:24:53,278 epoch 7 - iter 89/893 - loss 0.01604361 - time (sec): 5.79 - samples/sec: 4603.66 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:24:58,745 epoch 7 - iter 178/893 - loss 0.01903158 - time (sec): 11.26 - samples/sec: 4502.09 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:25:04,355 epoch 7 - iter 267/893 - loss 0.01762160 - time (sec): 16.87 - samples/sec: 4447.11 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:25:10,018 epoch 7 - iter 356/893 - loss 0.01799719 - time (sec): 22.53 - samples/sec: 4449.66 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:25:15,579 epoch 7 - iter 445/893 - loss 0.01816463 - time (sec): 28.09 - samples/sec: 4437.15 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:25:21,206 epoch 7 - iter 534/893 - loss 0.01762382 - time (sec): 33.72 - samples/sec: 4408.92 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:25:26,582 epoch 7 - iter 623/893 - loss 0.01818342 - time (sec): 39.09 - samples/sec: 4391.43 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:25:32,470 epoch 7 - iter 712/893 - loss 0.01864109 - time (sec): 44.98 - samples/sec: 4401.35 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:25:37,922 epoch 7 - iter 801/893 - loss 0.01859268 - time (sec): 50.43 - samples/sec: 4410.62 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:25:43,454 epoch 7 - iter 890/893 - loss 0.01876551 - time (sec): 55.97 - samples/sec: 4432.61 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:25:43,621 ----------------------------------------------------------------------------------------------------
2023-10-25 15:25:43,621 EPOCH 7 done: loss 0.0187 - lr: 0.000017
2023-10-25 15:25:48,547 DEV : loss 0.20097887516021729 - f1-score (micro avg)  0.797
2023-10-25 15:25:48,569 ----------------------------------------------------------------------------------------------------
2023-10-25 15:25:54,290 epoch 8 - iter 89/893 - loss 0.01460596 - time (sec): 5.72 - samples/sec: 4188.29 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:25:59,972 epoch 8 - iter 178/893 - loss 0.01450563 - time (sec): 11.40 - samples/sec: 4260.99 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:26:05,738 epoch 8 - iter 267/893 - loss 0.01437589 - time (sec): 17.17 - samples/sec: 4309.25 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:26:11,493 epoch 8 - iter 356/893 - loss 0.01437436 - time (sec): 22.92 - samples/sec: 4260.09 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:26:17,302 epoch 8 - iter 445/893 - loss 0.01387661 - time (sec): 28.73 - samples/sec: 4259.05 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:26:23,205 epoch 8 - iter 534/893 - loss 0.01340798 - time (sec): 34.63 - samples/sec: 4266.55 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:26:29,267 epoch 8 - iter 623/893 - loss 0.01346100 - time (sec): 40.70 - samples/sec: 4280.38 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:26:34,847 epoch 8 - iter 712/893 - loss 0.01427616 - time (sec): 46.28 - samples/sec: 4271.12 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:26:40,330 epoch 8 - iter 801/893 - loss 0.01452412 - time (sec): 51.76 - samples/sec: 4297.64 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:26:45,869 epoch 8 - iter 890/893 - loss 0.01415925 - time (sec): 57.30 - samples/sec: 4328.86 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:26:46,048 ----------------------------------------------------------------------------------------------------
2023-10-25 15:26:46,049 EPOCH 8 done: loss 0.0142 - lr: 0.000011
2023-10-25 15:26:50,095 DEV : loss 0.20207209885120392 - f1-score (micro avg)  0.8049
2023-10-25 15:26:50,116 ----------------------------------------------------------------------------------------------------
2023-10-25 15:26:56,812 epoch 9 - iter 89/893 - loss 0.00563361 - time (sec): 6.69 - samples/sec: 3687.01 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:27:02,376 epoch 9 - iter 178/893 - loss 0.00548688 - time (sec): 12.26 - samples/sec: 3995.04 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:27:07,978 epoch 9 - iter 267/893 - loss 0.00713590 - time (sec): 17.86 - samples/sec: 4125.43 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:27:13,974 epoch 9 - iter 356/893 - loss 0.00914907 - time (sec): 23.86 - samples/sec: 4219.03 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:27:19,731 epoch 9 - iter 445/893 - loss 0.01028740 - time (sec): 29.61 - samples/sec: 4249.69 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:27:25,466 epoch 9 - iter 534/893 - loss 0.01006046 - time (sec): 35.35 - samples/sec: 4282.64 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:27:31,112 epoch 9 - iter 623/893 - loss 0.01076164 - time (sec): 40.99 - samples/sec: 4271.27 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:27:36,878 epoch 9 - iter 712/893 - loss 0.01047859 - time (sec): 46.76 - samples/sec: 4289.54 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:27:42,412 epoch 9 - iter 801/893 - loss 0.01028654 - time (sec): 52.29 - samples/sec: 4294.61 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:27:48,005 epoch 9 - iter 890/893 - loss 0.01012660 - time (sec): 57.89 - samples/sec: 4285.49 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:27:48,168 ----------------------------------------------------------------------------------------------------
2023-10-25 15:27:48,169 EPOCH 9 done: loss 0.0101 - lr: 0.000006
2023-10-25 15:27:52,536 DEV : loss 0.21088457107543945 - f1-score (micro avg)  0.8116
2023-10-25 15:27:52,560 saving best model
2023-10-25 15:27:53,280 ----------------------------------------------------------------------------------------------------
2023-10-25 15:27:58,925 epoch 10 - iter 89/893 - loss 0.00743254 - time (sec): 5.64 - samples/sec: 4235.48 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:28:04,835 epoch 10 - iter 178/893 - loss 0.00751409 - time (sec): 11.55 - samples/sec: 4297.28 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:28:10,905 epoch 10 - iter 267/893 - loss 0.00573510 - time (sec): 17.62 - samples/sec: 4241.46 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:28:16,625 epoch 10 - iter 356/893 - loss 0.00576235 - time (sec): 23.34 - samples/sec: 4280.64 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:28:22,156 epoch 10 - iter 445/893 - loss 0.00542024 - time (sec): 28.87 - samples/sec: 4228.88 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:28:27,932 epoch 10 - iter 534/893 - loss 0.00569832 - time (sec): 34.65 - samples/sec: 4272.05 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:28:33,724 epoch 10 - iter 623/893 - loss 0.00590285 - time (sec): 40.44 - samples/sec: 4273.13 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:28:39,373 epoch 10 - iter 712/893 - loss 0.00577265 - time (sec): 46.09 - samples/sec: 4289.95 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:28:45,228 epoch 10 - iter 801/893 - loss 0.00558124 - time (sec): 51.94 - samples/sec: 4264.95 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:28:51,161 epoch 10 - iter 890/893 - loss 0.00610789 - time (sec): 57.88 - samples/sec: 4286.41 - lr: 0.000000 - momentum: 0.000000
2023-10-25 15:28:51,332 ----------------------------------------------------------------------------------------------------
2023-10-25 15:28:51,332 EPOCH 10 done: loss 0.0061 - lr: 0.000000
2023-10-25 15:28:56,407 DEV : loss 0.21562287211418152 - f1-score (micro avg)  0.8067
2023-10-25 15:28:56,870 ----------------------------------------------------------------------------------------------------
2023-10-25 15:28:56,871 Loading model from best epoch ...
2023-10-25 15:28:58,698 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
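The 17-tag dictionary printed above is the BIOES tagging scheme over the run's four entity types plus the O tag. A minimal sketch reconstructing that dictionary (the S/B/E/I prefix order is taken from the log line as printed):

```python
# Rebuild the tag dictionary printed by the tagger: the O tag plus
# S-/B-/E-/I- prefixed tags for each of the four entity types (BIOES scheme).
entity_types = ["PER", "LOC", "ORG", "HumanProd"]
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in "SBEI"]

print(len(tags))   # 17, matching Linear(out_features=17) in the model above
print(tags[:5])    # ['O', 'S-PER', 'B-PER', 'E-PER', 'I-PER']
```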
2023-10-25 15:29:10,894
Results:
- F-score (micro) 0.6867
- F-score (macro) 0.6046
- Accuracy 0.5377

By class:
              precision    recall  f1-score   support

         LOC     0.6919    0.6667    0.6791      1095
         PER     0.7771    0.7717    0.7744      1012
         ORG     0.4685    0.5630    0.5115       357
   HumanProd     0.3438    0.6667    0.4536        33

   micro avg     0.6792    0.6944    0.6867      2497
   macro avg     0.5703    0.6670    0.6046      2497
weighted avg     0.6899    0.6944    0.6908      2497
2023-10-25 15:29:10,895 ----------------------------------------------------------------------------------------------------
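As a consistency check, the micro-averaged test scores can be re-derived from the per-class precision, recall, and support in the report above. The sketch below reconstructs approximate true-positive and predicted-span counts from the rounded figures, so it agrees with the report only to its printed precision:

```python
# Reconstruct micro-averaged metrics from the per-class report.
# TP counts are recovered from (recall * support); predicted counts from
# (TP / precision). All inputs are rounded, so this is an approximate check.
per_class = {
    # class: (precision, recall, support)
    "LOC": (0.6919, 0.6667, 1095),
    "PER": (0.7771, 0.7717, 1012),
    "ORG": (0.4685, 0.5630, 357),
    "HumanProd": (0.3438, 0.6667, 33),
}

tp = pred = supp = 0
for precision, recall, support in per_class.values():
    c_tp = round(recall * support)     # true positives for this class
    tp += c_tp
    pred += round(c_tp / precision)    # predicted spans for this class
    supp += support

micro_p = tp / pred
micro_r = tp / supp
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_p, 4), round(micro_r, 4), round(micro_f1, 4))
# → 0.6792 0.6944 0.6867, matching the "micro avg" row
```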