2023-10-25 21:02:46,321 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,322 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:02:46,322 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,322 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:02:46,322 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,322 Train: 1166 sentences
2023-10-25 21:02:46,322 (train_with_dev=False, train_with_test=False)
2023-10-25 21:02:46,322 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,322 Training Params:
2023-10-25 21:02:46,322 - learning_rate: "5e-05"
2023-10-25 21:02:46,322 - mini_batch_size: "4"
2023-10-25 21:02:46,322 - max_epochs: "10"
2023-10-25 21:02:46,322 - shuffle: "True"
2023-10-25 21:02:46,322 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Plugins:
2023-10-25 21:02:46,323 - TensorboardLogger
2023-10-25 21:02:46,323 - LinearScheduler | warmup_fraction: '0.1'
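The hyperparameters and plugins above correspond to Flair's `ModelTrainer.fine_tune` API. A minimal sketch of how such a run is typically launched follows; the corpus and model constructor arguments are assumptions inferred from this log's header and base path, not the actual script that produced the run.

```python
# Hypothetical reconstruction of the training setup behind this log.
# Everything inside run_training() is an assumption inferred from the
# log header (model name, corpus, base-path flags), not the real script.

PARAMS = {
    "learning_rate": 5e-05,
    "mini_batch_size": 4,
    "max_epochs": 10,
}

def run_training(base_path: str) -> None:
    # Lazy imports so the sketch can be read without Flair installed.
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # HIPE-2022 "newseye" Finnish split, matching the corpus path in the log.
    corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")

    embeddings = TransformerWordEmbeddings(
        model="dbmdz/bert-base-historic-multilingual-64k-td-cased",
        layers="-1",               # "layers-1" in the base path: last layer only
        subtoken_pooling="first",  # "poolingfirst" in the base path
        fine_tune=True,
    )
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=corpus.make_label_dictionary(label_type="ner"),
        tag_type="ner",
        use_crf=False,             # "crfFalse" in the base path
        use_rnn=False,
    )
    ModelTrainer(tagger, corpus).fine_tune(base_path, **PARAMS)
```

The `fine_tune` entry point applies a linear learning-rate schedule with warmup by default, which matches the `LinearScheduler | warmup_fraction: '0.1'` plugin and the lr ramp visible in epoch 1 below.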
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:02:46,323 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Computation:
2023-10-25 21:02:46,323 - compute on device: cuda:0
2023-10-25 21:02:46,323 - embedding storage: none
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:46,323 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:02:47,568 epoch 1 - iter 29/292 - loss 2.70493946 - time (sec): 1.24 - samples/sec: 3001.90 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:02:48,960 epoch 1 - iter 58/292 - loss 1.64294689 - time (sec): 2.64 - samples/sec: 3433.28 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:02:50,359 epoch 1 - iter 87/292 - loss 1.24359019 - time (sec): 4.03 - samples/sec: 3589.43 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:02:51,628 epoch 1 - iter 116/292 - loss 1.05004462 - time (sec): 5.30 - samples/sec: 3552.13 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:02:52,874 epoch 1 - iter 145/292 - loss 0.91786982 - time (sec): 6.55 - samples/sec: 3520.91 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:02:54,229 epoch 1 - iter 174/292 - loss 0.80347598 - time (sec): 7.91 - samples/sec: 3546.58 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:02:55,489 epoch 1 - iter 203/292 - loss 0.74188701 - time (sec): 9.17 - samples/sec: 3431.97 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:02:56,762 epoch 1 - iter 232/292 - loss 0.68879598 - time (sec): 10.44 - samples/sec: 3393.36 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:02:57,981 epoch 1 - iter 261/292 - loss 0.65138888 - time (sec): 11.66 - samples/sec: 3344.55 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:02:59,311 epoch 1 - iter 290/292 - loss 0.60276932 - time (sec): 12.99 - samples/sec: 3401.64 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:02:59,388 ----------------------------------------------------------------------------------------------------
2023-10-25 21:02:59,388 EPOCH 1 done: loss 0.5995 - lr: 0.000049
2023-10-25 21:03:00,042 DEV : loss 0.15543945133686066 - f1-score (micro avg) 0.5759
2023-10-25 21:03:00,046 saving best model
2023-10-25 21:03:00,575 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:01,799 epoch 2 - iter 29/292 - loss 0.16952144 - time (sec): 1.22 - samples/sec: 3371.46 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:03:03,040 epoch 2 - iter 58/292 - loss 0.18657242 - time (sec): 2.46 - samples/sec: 3294.57 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:03:04,352 epoch 2 - iter 87/292 - loss 0.16106239 - time (sec): 3.78 - samples/sec: 3300.13 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:03:05,608 epoch 2 - iter 116/292 - loss 0.15644988 - time (sec): 5.03 - samples/sec: 3352.82 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:03:06,836 epoch 2 - iter 145/292 - loss 0.15373069 - time (sec): 6.26 - samples/sec: 3411.63 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:03:08,014 epoch 2 - iter 174/292 - loss 0.16590765 - time (sec): 7.44 - samples/sec: 3373.45 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:03:09,249 epoch 2 - iter 203/292 - loss 0.17045348 - time (sec): 8.67 - samples/sec: 3377.62 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:03:10,551 epoch 2 - iter 232/292 - loss 0.17007656 - time (sec): 9.97 - samples/sec: 3343.68 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:03:11,963 epoch 2 - iter 261/292 - loss 0.16415110 - time (sec): 11.39 - samples/sec: 3441.55 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:03:13,297 epoch 2 - iter 290/292 - loss 0.15537644 - time (sec): 12.72 - samples/sec: 3482.88 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:03:13,376 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:13,376 EPOCH 2 done: loss 0.1553 - lr: 0.000045
2023-10-25 21:03:14,282 DEV : loss 0.14280402660369873 - f1-score (micro avg) 0.7113
2023-10-25 21:03:14,287 saving best model
2023-10-25 21:03:14,953 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:16,338 epoch 3 - iter 29/292 - loss 0.12799538 - time (sec): 1.38 - samples/sec: 3483.45 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:03:17,670 epoch 3 - iter 58/292 - loss 0.09862348 - time (sec): 2.71 - samples/sec: 3517.39 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:03:18,972 epoch 3 - iter 87/292 - loss 0.08769634 - time (sec): 4.02 - samples/sec: 3425.70 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:03:20,258 epoch 3 - iter 116/292 - loss 0.08875247 - time (sec): 5.30 - samples/sec: 3388.13 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:03:21,512 epoch 3 - iter 145/292 - loss 0.08886591 - time (sec): 6.56 - samples/sec: 3325.24 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:03:22,754 epoch 3 - iter 174/292 - loss 0.09139678 - time (sec): 7.80 - samples/sec: 3283.17 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:03:24,065 epoch 3 - iter 203/292 - loss 0.08933904 - time (sec): 9.11 - samples/sec: 3332.68 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:03:25,360 epoch 3 - iter 232/292 - loss 0.08802096 - time (sec): 10.40 - samples/sec: 3352.74 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:03:26,654 epoch 3 - iter 261/292 - loss 0.09013801 - time (sec): 11.70 - samples/sec: 3350.70 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:03:28,018 epoch 3 - iter 290/292 - loss 0.08726593 - time (sec): 13.06 - samples/sec: 3372.98 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:03:28,107 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:28,108 EPOCH 3 done: loss 0.0879 - lr: 0.000039
2023-10-25 21:03:29,012 DEV : loss 0.16181543469429016 - f1-score (micro avg) 0.7412
2023-10-25 21:03:29,017 saving best model
2023-10-25 21:03:29,682 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:31,011 epoch 4 - iter 29/292 - loss 0.08251774 - time (sec): 1.33 - samples/sec: 3144.69 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:03:32,296 epoch 4 - iter 58/292 - loss 0.06039563 - time (sec): 2.61 - samples/sec: 3480.66 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:03:33,577 epoch 4 - iter 87/292 - loss 0.05907690 - time (sec): 3.89 - samples/sec: 3336.97 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:03:34,854 epoch 4 - iter 116/292 - loss 0.05470702 - time (sec): 5.17 - samples/sec: 3343.45 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:03:36,253 epoch 4 - iter 145/292 - loss 0.05698655 - time (sec): 6.57 - samples/sec: 3300.17 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:03:37,613 epoch 4 - iter 174/292 - loss 0.05507272 - time (sec): 7.93 - samples/sec: 3315.00 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:03:38,875 epoch 4 - iter 203/292 - loss 0.06136269 - time (sec): 9.19 - samples/sec: 3312.60 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:03:40,241 epoch 4 - iter 232/292 - loss 0.06014642 - time (sec): 10.56 - samples/sec: 3353.87 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:03:41,520 epoch 4 - iter 261/292 - loss 0.05735605 - time (sec): 11.84 - samples/sec: 3372.49 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:03:42,826 epoch 4 - iter 290/292 - loss 0.05824407 - time (sec): 13.14 - samples/sec: 3365.23 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:03:42,909 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:42,909 EPOCH 4 done: loss 0.0580 - lr: 0.000033
2023-10-25 21:03:43,968 DEV : loss 0.1569732278585434 - f1-score (micro avg) 0.7606
2023-10-25 21:03:43,972 saving best model
2023-10-25 21:03:44,658 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:45,907 epoch 5 - iter 29/292 - loss 0.02448988 - time (sec): 1.25 - samples/sec: 2965.12 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:03:47,242 epoch 5 - iter 58/292 - loss 0.03323303 - time (sec): 2.58 - samples/sec: 3230.68 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:03:48,588 epoch 5 - iter 87/292 - loss 0.03405391 - time (sec): 3.93 - samples/sec: 3407.07 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:03:49,819 epoch 5 - iter 116/292 - loss 0.03187030 - time (sec): 5.16 - samples/sec: 3408.51 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:03:51,096 epoch 5 - iter 145/292 - loss 0.03696710 - time (sec): 6.43 - samples/sec: 3345.01 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:03:52,458 epoch 5 - iter 174/292 - loss 0.03833785 - time (sec): 7.80 - samples/sec: 3310.13 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:03:53,820 epoch 5 - iter 203/292 - loss 0.04040647 - time (sec): 9.16 - samples/sec: 3370.90 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:03:55,148 epoch 5 - iter 232/292 - loss 0.03785862 - time (sec): 10.49 - samples/sec: 3380.50 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:03:56,505 epoch 5 - iter 261/292 - loss 0.03665954 - time (sec): 11.84 - samples/sec: 3356.28 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:03:57,830 epoch 5 - iter 290/292 - loss 0.03585809 - time (sec): 13.17 - samples/sec: 3363.05 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:03:57,905 ----------------------------------------------------------------------------------------------------
2023-10-25 21:03:57,905 EPOCH 5 done: loss 0.0357 - lr: 0.000028
2023-10-25 21:03:58,815 DEV : loss 0.14395155012607574 - f1-score (micro avg) 0.7265
2023-10-25 21:03:58,819 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:00,208 epoch 6 - iter 29/292 - loss 0.02165138 - time (sec): 1.39 - samples/sec: 3839.74 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:04:01,519 epoch 6 - iter 58/292 - loss 0.01941242 - time (sec): 2.70 - samples/sec: 3485.44 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:04:02,781 epoch 6 - iter 87/292 - loss 0.02557862 - time (sec): 3.96 - samples/sec: 3472.87 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:04:04,147 epoch 6 - iter 116/292 - loss 0.02482614 - time (sec): 5.33 - samples/sec: 3550.68 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:04:05,427 epoch 6 - iter 145/292 - loss 0.02631017 - time (sec): 6.61 - samples/sec: 3457.32 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:04:06,786 epoch 6 - iter 174/292 - loss 0.02314945 - time (sec): 7.97 - samples/sec: 3450.43 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:04:08,068 epoch 6 - iter 203/292 - loss 0.02313133 - time (sec): 9.25 - samples/sec: 3440.85 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:04:09,331 epoch 6 - iter 232/292 - loss 0.02352516 - time (sec): 10.51 - samples/sec: 3447.74 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:04:10,604 epoch 6 - iter 261/292 - loss 0.02453468 - time (sec): 11.78 - samples/sec: 3461.04 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:04:11,847 epoch 6 - iter 290/292 - loss 0.02499737 - time (sec): 13.03 - samples/sec: 3387.70 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:04:11,936 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:11,936 EPOCH 6 done: loss 0.0251 - lr: 0.000022
2023-10-25 21:04:12,842 DEV : loss 0.17060092091560364 - f1-score (micro avg) 0.7447
2023-10-25 21:04:12,846 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:14,184 epoch 7 - iter 29/292 - loss 0.03038679 - time (sec): 1.34 - samples/sec: 3200.51 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:04:15,504 epoch 7 - iter 58/292 - loss 0.02199783 - time (sec): 2.66 - samples/sec: 3269.59 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:04:16,943 epoch 7 - iter 87/292 - loss 0.02107799 - time (sec): 4.10 - samples/sec: 3377.74 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:04:18,274 epoch 7 - iter 116/292 - loss 0.01988430 - time (sec): 5.43 - samples/sec: 3358.94 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:04:19,515 epoch 7 - iter 145/292 - loss 0.02054682 - time (sec): 6.67 - samples/sec: 3387.20 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:04:20,827 epoch 7 - iter 174/292 - loss 0.01902887 - time (sec): 7.98 - samples/sec: 3402.18 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:04:22,089 epoch 7 - iter 203/292 - loss 0.01967890 - time (sec): 9.24 - samples/sec: 3381.86 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:04:23,370 epoch 7 - iter 232/292 - loss 0.02047310 - time (sec): 10.52 - samples/sec: 3353.31 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:04:24,632 epoch 7 - iter 261/292 - loss 0.02022304 - time (sec): 11.78 - samples/sec: 3367.80 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:04:25,851 epoch 7 - iter 290/292 - loss 0.01979598 - time (sec): 13.00 - samples/sec: 3390.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:04:25,946 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:25,946 EPOCH 7 done: loss 0.0196 - lr: 0.000017
2023-10-25 21:04:26,861 DEV : loss 0.1871466189622879 - f1-score (micro avg) 0.7439
2023-10-25 21:04:26,866 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:28,143 epoch 8 - iter 29/292 - loss 0.00718740 - time (sec): 1.28 - samples/sec: 3597.72 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:04:29,467 epoch 8 - iter 58/292 - loss 0.01503456 - time (sec): 2.60 - samples/sec: 3365.84 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:04:30,790 epoch 8 - iter 87/292 - loss 0.01557495 - time (sec): 3.92 - samples/sec: 3190.64 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:04:32,093 epoch 8 - iter 116/292 - loss 0.01500266 - time (sec): 5.23 - samples/sec: 3234.34 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:04:33,336 epoch 8 - iter 145/292 - loss 0.01724417 - time (sec): 6.47 - samples/sec: 3304.72 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:04:34,647 epoch 8 - iter 174/292 - loss 0.01760521 - time (sec): 7.78 - samples/sec: 3357.34 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:04:35,925 epoch 8 - iter 203/292 - loss 0.01743967 - time (sec): 9.06 - samples/sec: 3381.28 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:04:37,315 epoch 8 - iter 232/292 - loss 0.01549544 - time (sec): 10.45 - samples/sec: 3403.73 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:04:38,612 epoch 8 - iter 261/292 - loss 0.01484755 - time (sec): 11.74 - samples/sec: 3405.07 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:04:39,938 epoch 8 - iter 290/292 - loss 0.01565987 - time (sec): 13.07 - samples/sec: 3394.16 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:04:40,013 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:40,013 EPOCH 8 done: loss 0.0159 - lr: 0.000011
2023-10-25 21:04:41,089 DEV : loss 0.2016552984714508 - f1-score (micro avg) 0.7642
2023-10-25 21:04:41,094 saving best model
2023-10-25 21:04:41,658 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:42,938 epoch 9 - iter 29/292 - loss 0.00883147 - time (sec): 1.28 - samples/sec: 3170.38 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:04:44,271 epoch 9 - iter 58/292 - loss 0.01566657 - time (sec): 2.61 - samples/sec: 3276.96 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:04:45,628 epoch 9 - iter 87/292 - loss 0.01114646 - time (sec): 3.97 - samples/sec: 3248.19 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:04:46,926 epoch 9 - iter 116/292 - loss 0.00886227 - time (sec): 5.27 - samples/sec: 3159.36 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:04:48,286 epoch 9 - iter 145/292 - loss 0.00720736 - time (sec): 6.63 - samples/sec: 3205.00 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:04:49,636 epoch 9 - iter 174/292 - loss 0.00633132 - time (sec): 7.98 - samples/sec: 3302.03 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:04:50,954 epoch 9 - iter 203/292 - loss 0.00655342 - time (sec): 9.29 - samples/sec: 3292.40 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:04:52,300 epoch 9 - iter 232/292 - loss 0.00853524 - time (sec): 10.64 - samples/sec: 3304.99 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:04:53,593 epoch 9 - iter 261/292 - loss 0.00803946 - time (sec): 11.93 - samples/sec: 3297.35 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:04:54,911 epoch 9 - iter 290/292 - loss 0.00829946 - time (sec): 13.25 - samples/sec: 3327.92 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:04:54,992 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:54,992 EPOCH 9 done: loss 0.0082 - lr: 0.000006
2023-10-25 21:04:55,903 DEV : loss 0.2061367630958557 - f1-score (micro avg) 0.7543
2023-10-25 21:04:55,908 ----------------------------------------------------------------------------------------------------
2023-10-25 21:04:57,187 epoch 10 - iter 29/292 - loss 0.00318513 - time (sec): 1.28 - samples/sec: 3035.16 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:04:58,468 epoch 10 - iter 58/292 - loss 0.00296072 - time (sec): 2.56 - samples/sec: 3051.77 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:04:59,815 epoch 10 - iter 87/292 - loss 0.00649624 - time (sec): 3.91 - samples/sec: 3244.54 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:05:01,138 epoch 10 - iter 116/292 - loss 0.00570956 - time (sec): 5.23 - samples/sec: 3328.03 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:05:02,456 epoch 10 - iter 145/292 - loss 0.00505520 - time (sec): 6.55 - samples/sec: 3425.62 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:05:03,752 epoch 10 - iter 174/292 - loss 0.00507932 - time (sec): 7.84 - samples/sec: 3488.66 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:05:05,062 epoch 10 - iter 203/292 - loss 0.00532013 - time (sec): 9.15 - samples/sec: 3509.91 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:05:06,295 epoch 10 - iter 232/292 - loss 0.00556513 - time (sec): 10.39 - samples/sec: 3461.67 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:05:07,561 epoch 10 - iter 261/292 - loss 0.00596959 - time (sec): 11.65 - samples/sec: 3425.39 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:05:08,803 epoch 10 - iter 290/292 - loss 0.00553911 - time (sec): 12.89 - samples/sec: 3424.82 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:05:08,887 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:08,887 EPOCH 10 done: loss 0.0058 - lr: 0.000000
2023-10-25 21:05:09,796 DEV : loss 0.20963872969150543 - f1-score (micro avg) 0.7366
2023-10-25 21:05:10,330 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:10,332 Loading model from best epoch ...
2023-10-25 21:05:12,043 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 21:05:13,585 
Results:
- F-score (micro) 0.7502
- F-score (macro) 0.6592
- Accuracy 0.6229

By class:
              precision    recall  f1-score   support

         PER     0.7754    0.8333    0.8033       348
         LOC     0.6889    0.8314    0.7535       261
         ORG     0.3962    0.4038    0.4000        52
   HumanProd     0.6071    0.7727    0.6800        22

   micro avg     0.7078    0.7980    0.7502       683
   macro avg     0.6169    0.7103    0.6592       683
weighted avg     0.7081    0.7980    0.7496       683

2023-10-25 21:05:13,585 ----------------------------------------------------------------------------------------------------
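The per-iteration progress lines in this log follow a fixed format, which makes them easy to mine for downstream analysis such as plotting the loss curve or the lr schedule. A minimal sketch follows; `parse_progress` is a hypothetical helper written for this log format, not part of Flair.

```python
import re
from typing import Optional

# Matches Flair's per-iteration progress lines, e.g.
# "... epoch 1 - iter 29/292 - loss 2.70493946 - time (sec): 1.24 - samples/sec: 3001.90 - lr: 0.000005 ..."
LINE_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+) - "
    r"loss (?P<loss>[\d.]+) - .*?lr: (?P<lr>[\d.]+)"
)

def parse_progress(line: str) -> Optional[dict]:
    """Extract epoch, iteration, loss, and learning rate from one log line."""
    m = LINE_RE.search(line)
    if m is None:
        return None  # not a progress line (separator, DEV score, etc.)
    return {
        "epoch": int(m["epoch"]),
        "iter": int(m["iter"]),
        "total": int(m["total"]),
        "loss": float(m["loss"]),
        "lr": float(m["lr"]),
    }
```

Applied line by line over the log above, this yields one record per logged iteration, e.g. epoch 1 / iter 29 with loss 2.70493946 at lr 5e-06.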