|
2023-10-25 12:17:41,121 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:17:41,122 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(64001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-25 12:17:41,122 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:17:41,122 MultiCorpus: 6183 train + 680 dev + 2113 test sentences |
|
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator |
|
2023-10-25 12:17:41,122 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:17:41,122 Train: 6183 sentences |
|
2023-10-25 12:17:41,122 (train_with_dev=False, train_with_test=False) |
|
2023-10-25 12:17:41,122 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:17:41,122 Training Params: |
|
2023-10-25 12:17:41,122 - learning_rate: "5e-05" |
|
2023-10-25 12:17:41,122 - mini_batch_size: "8" |
|
2023-10-25 12:17:41,122 - max_epochs: "10" |
|
2023-10-25 12:17:41,122 - shuffle: "True" |
|
2023-10-25 12:17:41,122 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:17:41,122 Plugins: |
|
2023-10-25 12:17:41,122 - TensorboardLogger |
|
2023-10-25 12:17:41,122 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-25 12:17:41,122 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:17:41,122 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-25 12:17:41,122 - metric: "('micro avg', 'f1-score')" |
|
2023-10-25 12:17:41,123 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:17:41,123 Computation: |
|
2023-10-25 12:17:41,123 - compute on device: cuda:0 |
|
2023-10-25 12:17:41,123 - embedding storage: none |
|
2023-10-25 12:17:41,123 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:17:41,123 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-25 12:17:41,123 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:17:41,123 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:17:41,123 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-25 12:17:45,684 epoch 1 - iter 77/773 - loss 1.71289552 - time (sec): 4.56 - samples/sec: 2872.98 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 12:17:50,444 epoch 1 - iter 154/773 - loss 0.94501693 - time (sec): 9.32 - samples/sec: 2799.47 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 12:17:55,307 epoch 1 - iter 231/773 - loss 0.69098775 - time (sec): 14.18 - samples/sec: 2696.13 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 12:18:00,140 epoch 1 - iter 308/773 - loss 0.55669950 - time (sec): 19.02 - samples/sec: 2649.90 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 12:18:04,990 epoch 1 - iter 385/773 - loss 0.46930884 - time (sec): 23.87 - samples/sec: 2631.31 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 12:18:09,855 epoch 1 - iter 462/773 - loss 0.40669569 - time (sec): 28.73 - samples/sec: 2629.20 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 12:18:14,604 epoch 1 - iter 539/773 - loss 0.36224903 - time (sec): 33.48 - samples/sec: 2612.89 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-25 12:18:19,364 epoch 1 - iter 616/773 - loss 0.33137895 - time (sec): 38.24 - samples/sec: 2611.99 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-25 12:18:24,185 epoch 1 - iter 693/773 - loss 0.30555756 - time (sec): 43.06 - samples/sec: 2612.71 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-25 12:18:29,384 epoch 1 - iter 770/773 - loss 0.28622854 - time (sec): 48.26 - samples/sec: 2563.43 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-25 12:18:29,573 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:18:29,573 EPOCH 1 done: loss 0.2850 - lr: 0.000050 |
|
2023-10-25 12:18:32,030 DEV : loss 0.052840519696474075 - f1-score (micro avg) 0.7706 |
|
2023-10-25 12:18:32,047 saving best model |
|
2023-10-25 12:18:32,576 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:18:37,118 epoch 2 - iter 77/773 - loss 0.05777603 - time (sec): 4.54 - samples/sec: 2790.00 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-25 12:18:41,639 epoch 2 - iter 154/773 - loss 0.07598892 - time (sec): 9.06 - samples/sec: 2842.23 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-25 12:18:46,219 epoch 2 - iter 231/773 - loss 0.07264176 - time (sec): 13.64 - samples/sec: 2836.67 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-25 12:18:50,901 epoch 2 - iter 308/773 - loss 0.07623153 - time (sec): 18.32 - samples/sec: 2835.14 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-25 12:18:55,465 epoch 2 - iter 385/773 - loss 0.08302008 - time (sec): 22.89 - samples/sec: 2833.14 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-25 12:18:59,946 epoch 2 - iter 462/773 - loss 0.08126031 - time (sec): 27.37 - samples/sec: 2802.44 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-25 12:19:04,391 epoch 2 - iter 539/773 - loss 0.08188120 - time (sec): 31.81 - samples/sec: 2784.58 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-25 12:19:08,772 epoch 2 - iter 616/773 - loss 0.08100590 - time (sec): 36.19 - samples/sec: 2755.24 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-25 12:19:13,067 epoch 2 - iter 693/773 - loss 0.08064490 - time (sec): 40.49 - samples/sec: 2748.75 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-25 12:19:17,497 epoch 2 - iter 770/773 - loss 0.08177812 - time (sec): 44.92 - samples/sec: 2759.05 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-25 12:19:17,666 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:19:17,667 EPOCH 2 done: loss 0.0817 - lr: 0.000044 |
|
2023-10-25 12:19:20,724 DEV : loss 0.06832588464021683 - f1-score (micro avg) 0.6442 |
|
2023-10-25 12:19:20,741 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:19:25,307 epoch 3 - iter 77/773 - loss 0.04838851 - time (sec): 4.56 - samples/sec: 2605.03 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-25 12:19:29,785 epoch 3 - iter 154/773 - loss 0.05469717 - time (sec): 9.04 - samples/sec: 2680.01 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-25 12:19:34,143 epoch 3 - iter 231/773 - loss 0.05654808 - time (sec): 13.40 - samples/sec: 2717.16 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-25 12:19:38,588 epoch 3 - iter 308/773 - loss 0.05669582 - time (sec): 17.85 - samples/sec: 2728.53 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-25 12:19:42,926 epoch 3 - iter 385/773 - loss 0.05551828 - time (sec): 22.18 - samples/sec: 2732.02 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-25 12:19:47,727 epoch 3 - iter 462/773 - loss 0.05835246 - time (sec): 26.98 - samples/sec: 2696.94 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-25 12:19:52,542 epoch 3 - iter 539/773 - loss 0.06076653 - time (sec): 31.80 - samples/sec: 2700.73 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-25 12:19:57,349 epoch 3 - iter 616/773 - loss 0.05833860 - time (sec): 36.61 - samples/sec: 2701.25 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-25 12:20:02,064 epoch 3 - iter 693/773 - loss 0.05658941 - time (sec): 41.32 - samples/sec: 2695.26 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-25 12:20:06,656 epoch 3 - iter 770/773 - loss 0.05697211 - time (sec): 45.91 - samples/sec: 2699.68 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-25 12:20:06,827 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:20:06,828 EPOCH 3 done: loss 0.0570 - lr: 0.000039 |
|
2023-10-25 12:20:09,398 DEV : loss 0.06559617072343826 - f1-score (micro avg) 0.6748 |
|
2023-10-25 12:20:09,416 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:20:13,912 epoch 4 - iter 77/773 - loss 0.03785036 - time (sec): 4.50 - samples/sec: 2585.85 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-25 12:20:18,649 epoch 4 - iter 154/773 - loss 0.04259835 - time (sec): 9.23 - samples/sec: 2623.43 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-25 12:20:23,636 epoch 4 - iter 231/773 - loss 0.04827369 - time (sec): 14.22 - samples/sec: 2597.03 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-25 12:20:28,127 epoch 4 - iter 308/773 - loss 0.04600678 - time (sec): 18.71 - samples/sec: 2596.64 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-25 12:20:32,900 epoch 4 - iter 385/773 - loss 0.04496290 - time (sec): 23.48 - samples/sec: 2651.95 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-25 12:20:38,132 epoch 4 - iter 462/773 - loss 0.04314026 - time (sec): 28.72 - samples/sec: 2590.46 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-25 12:20:42,961 epoch 4 - iter 539/773 - loss 0.04146073 - time (sec): 33.54 - samples/sec: 2593.41 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-25 12:20:47,332 epoch 4 - iter 616/773 - loss 0.04115461 - time (sec): 37.91 - samples/sec: 2583.47 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-25 12:20:51,848 epoch 4 - iter 693/773 - loss 0.04126242 - time (sec): 42.43 - samples/sec: 2616.47 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-25 12:20:56,348 epoch 4 - iter 770/773 - loss 0.04056032 - time (sec): 46.93 - samples/sec: 2635.02 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-25 12:20:56,522 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:20:56,522 EPOCH 4 done: loss 0.0405 - lr: 0.000033 |
|
2023-10-25 12:20:59,335 DEV : loss 0.09151480346918106 - f1-score (micro avg) 0.7633 |
|
2023-10-25 12:20:59,352 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:21:04,044 epoch 5 - iter 77/773 - loss 0.02112046 - time (sec): 4.69 - samples/sec: 2721.05 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-25 12:21:08,686 epoch 5 - iter 154/773 - loss 0.01842961 - time (sec): 9.33 - samples/sec: 2662.66 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-25 12:21:12,928 epoch 5 - iter 231/773 - loss 0.01846319 - time (sec): 13.57 - samples/sec: 2723.90 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-25 12:21:17,112 epoch 5 - iter 308/773 - loss 0.02359632 - time (sec): 17.76 - samples/sec: 2764.13 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-25 12:21:21,343 epoch 5 - iter 385/773 - loss 0.02524343 - time (sec): 21.99 - samples/sec: 2805.99 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-25 12:21:25,495 epoch 5 - iter 462/773 - loss 0.02906964 - time (sec): 26.14 - samples/sec: 2830.80 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-25 12:21:29,675 epoch 5 - iter 539/773 - loss 0.02877031 - time (sec): 30.32 - samples/sec: 2839.55 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 12:21:34,125 epoch 5 - iter 616/773 - loss 0.02875024 - time (sec): 34.77 - samples/sec: 2860.94 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-25 12:21:38,506 epoch 5 - iter 693/773 - loss 0.02760480 - time (sec): 39.15 - samples/sec: 2847.13 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 12:21:42,830 epoch 5 - iter 770/773 - loss 0.02771260 - time (sec): 43.48 - samples/sec: 2847.59 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-25 12:21:43,003 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:21:43,003 EPOCH 5 done: loss 0.0279 - lr: 0.000028 |
|
2023-10-25 12:21:45,710 DEV : loss 0.1106417253613472 - f1-score (micro avg) 0.7591 |
|
2023-10-25 12:21:45,728 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:21:50,111 epoch 6 - iter 77/773 - loss 0.01580263 - time (sec): 4.38 - samples/sec: 2696.31 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 12:21:54,324 epoch 6 - iter 154/773 - loss 0.01650220 - time (sec): 8.59 - samples/sec: 2848.75 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-25 12:21:58,492 epoch 6 - iter 231/773 - loss 0.02049639 - time (sec): 12.76 - samples/sec: 2818.64 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 12:22:02,787 epoch 6 - iter 308/773 - loss 0.01879335 - time (sec): 17.06 - samples/sec: 2803.10 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-25 12:22:07,156 epoch 6 - iter 385/773 - loss 0.01859473 - time (sec): 21.43 - samples/sec: 2792.82 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-25 12:22:11,622 epoch 6 - iter 462/773 - loss 0.01757656 - time (sec): 25.89 - samples/sec: 2802.22 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 12:22:16,024 epoch 6 - iter 539/773 - loss 0.01795437 - time (sec): 30.29 - samples/sec: 2834.82 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-25 12:22:20,564 epoch 6 - iter 616/773 - loss 0.01760159 - time (sec): 34.83 - samples/sec: 2833.47 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 12:22:25,025 epoch 6 - iter 693/773 - loss 0.01752017 - time (sec): 39.30 - samples/sec: 2837.97 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-25 12:22:29,522 epoch 6 - iter 770/773 - loss 0.01757611 - time (sec): 43.79 - samples/sec: 2823.70 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 12:22:29,691 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:22:29,691 EPOCH 6 done: loss 0.0175 - lr: 0.000022 |
|
2023-10-25 12:22:32,413 DEV : loss 0.12988701462745667 - f1-score (micro avg) 0.7323 |
|
2023-10-25 12:22:32,432 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:22:37,573 epoch 7 - iter 77/773 - loss 0.01270465 - time (sec): 5.14 - samples/sec: 2461.00 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-25 12:22:42,090 epoch 7 - iter 154/773 - loss 0.01340211 - time (sec): 9.66 - samples/sec: 2681.39 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 12:22:46,564 epoch 7 - iter 231/773 - loss 0.01191530 - time (sec): 14.13 - samples/sec: 2706.10 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-25 12:22:50,845 epoch 7 - iter 308/773 - loss 0.01125805 - time (sec): 18.41 - samples/sec: 2716.57 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-25 12:22:55,183 epoch 7 - iter 385/773 - loss 0.01159184 - time (sec): 22.75 - samples/sec: 2737.83 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 12:22:59,448 epoch 7 - iter 462/773 - loss 0.01228398 - time (sec): 27.01 - samples/sec: 2738.62 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-25 12:23:04,007 epoch 7 - iter 539/773 - loss 0.01193303 - time (sec): 31.57 - samples/sec: 2714.44 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 12:23:08,480 epoch 7 - iter 616/773 - loss 0.01164081 - time (sec): 36.05 - samples/sec: 2735.41 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-25 12:23:12,994 epoch 7 - iter 693/773 - loss 0.01185873 - time (sec): 40.56 - samples/sec: 2754.05 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 12:23:17,234 epoch 7 - iter 770/773 - loss 0.01137596 - time (sec): 44.80 - samples/sec: 2765.93 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-25 12:23:17,394 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:23:17,394 EPOCH 7 done: loss 0.0114 - lr: 0.000017 |
|
2023-10-25 12:23:20,000 DEV : loss 0.13432981073856354 - f1-score (micro avg) 0.7571 |
|
2023-10-25 12:23:20,021 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:23:24,663 epoch 8 - iter 77/773 - loss 0.00837517 - time (sec): 4.64 - samples/sec: 2619.85 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 12:23:29,170 epoch 8 - iter 154/773 - loss 0.00984375 - time (sec): 9.15 - samples/sec: 2640.40 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-25 12:23:33,748 epoch 8 - iter 231/773 - loss 0.00851381 - time (sec): 13.73 - samples/sec: 2700.21 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-25 12:23:38,097 epoch 8 - iter 308/773 - loss 0.00773937 - time (sec): 18.07 - samples/sec: 2726.57 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 12:23:42,316 epoch 8 - iter 385/773 - loss 0.00789454 - time (sec): 22.29 - samples/sec: 2797.84 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-25 12:23:46,760 epoch 8 - iter 462/773 - loss 0.00721549 - time (sec): 26.74 - samples/sec: 2810.19 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 12:23:51,020 epoch 8 - iter 539/773 - loss 0.00741554 - time (sec): 31.00 - samples/sec: 2833.22 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-25 12:23:55,330 epoch 8 - iter 616/773 - loss 0.00763241 - time (sec): 35.31 - samples/sec: 2833.08 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 12:23:59,779 epoch 8 - iter 693/773 - loss 0.00757973 - time (sec): 39.76 - samples/sec: 2815.75 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-25 12:24:04,116 epoch 8 - iter 770/773 - loss 0.00740762 - time (sec): 44.09 - samples/sec: 2810.96 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 12:24:04,278 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:24:04,279 EPOCH 8 done: loss 0.0074 - lr: 0.000011 |
|
2023-10-25 12:24:06,829 DEV : loss 0.12856917083263397 - f1-score (micro avg) 0.7841 |
|
2023-10-25 12:24:06,851 saving best model |
|
2023-10-25 12:24:07,574 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:24:12,197 epoch 9 - iter 77/773 - loss 0.00218328 - time (sec): 4.62 - samples/sec: 2511.74 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-25 12:24:16,700 epoch 9 - iter 154/773 - loss 0.00263026 - time (sec): 9.12 - samples/sec: 2668.23 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-25 12:24:21,233 epoch 9 - iter 231/773 - loss 0.00409824 - time (sec): 13.66 - samples/sec: 2651.16 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 12:24:25,754 epoch 9 - iter 308/773 - loss 0.00407327 - time (sec): 18.18 - samples/sec: 2671.13 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-25 12:24:30,270 epoch 9 - iter 385/773 - loss 0.00406267 - time (sec): 22.69 - samples/sec: 2704.09 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 12:24:34,963 epoch 9 - iter 462/773 - loss 0.00439371 - time (sec): 27.39 - samples/sec: 2709.28 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-25 12:24:39,738 epoch 9 - iter 539/773 - loss 0.00465950 - time (sec): 32.16 - samples/sec: 2697.73 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 12:24:44,235 epoch 9 - iter 616/773 - loss 0.00455909 - time (sec): 36.66 - samples/sec: 2697.30 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-25 12:24:48,680 epoch 9 - iter 693/773 - loss 0.00453870 - time (sec): 41.10 - samples/sec: 2712.52 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 12:24:53,176 epoch 9 - iter 770/773 - loss 0.00471129 - time (sec): 45.60 - samples/sec: 2710.85 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-25 12:24:53,348 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:24:53,348 EPOCH 9 done: loss 0.0047 - lr: 0.000006 |
|
2023-10-25 12:24:56,315 DEV : loss 0.12616777420043945 - f1-score (micro avg) 0.778 |
|
2023-10-25 12:24:56,334 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:25:01,018 epoch 10 - iter 77/773 - loss 0.00130179 - time (sec): 4.68 - samples/sec: 2718.05 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 12:25:05,516 epoch 10 - iter 154/773 - loss 0.00302749 - time (sec): 9.18 - samples/sec: 2663.57 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-25 12:25:09,770 epoch 10 - iter 231/773 - loss 0.00291699 - time (sec): 13.43 - samples/sec: 2781.05 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-25 12:25:14,055 epoch 10 - iter 308/773 - loss 0.00301012 - time (sec): 17.72 - samples/sec: 2833.57 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 12:25:18,465 epoch 10 - iter 385/773 - loss 0.00292380 - time (sec): 22.13 - samples/sec: 2838.08 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-25 12:25:22,891 epoch 10 - iter 462/773 - loss 0.00308081 - time (sec): 26.55 - samples/sec: 2850.43 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 12:25:27,055 epoch 10 - iter 539/773 - loss 0.00316596 - time (sec): 30.72 - samples/sec: 2869.90 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-25 12:25:31,255 epoch 10 - iter 616/773 - loss 0.00349236 - time (sec): 34.92 - samples/sec: 2854.14 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 12:25:35,580 epoch 10 - iter 693/773 - loss 0.00373209 - time (sec): 39.24 - samples/sec: 2858.12 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-25 12:25:39,752 epoch 10 - iter 770/773 - loss 0.00373405 - time (sec): 43.42 - samples/sec: 2852.74 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-25 12:25:39,912 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:25:39,913 EPOCH 10 done: loss 0.0037 - lr: 0.000000 |
|
2023-10-25 12:25:42,889 DEV : loss 0.1320255696773529 - f1-score (micro avg) 0.7708 |
|
2023-10-25 12:25:43,369 ---------------------------------------------------------------------------------------------------- |
|
2023-10-25 12:25:43,370 Loading model from best epoch ... |
|
2023-10-25 12:25:45,103 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET |
|
2023-10-25 12:25:53,829 |
|
Results: |
|
- F-score (micro) 0.7923 |
|
- F-score (macro) 0.6885 |
|
- Accuracy 0.6776 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.8402    0.8393    0.8398       946 |
|
    BUILDING     0.6323    0.5297    0.5765       185 |
|
      STREET     0.6379    0.6607    0.6491        56 |
|
|
|
   micro avg     0.8022    0.7826    0.7923      1187 |
|
   macro avg     0.7035    0.6766    0.6885      1187 |
|
weighted avg     0.7983    0.7826    0.7897      1187 |
|
|
|
2023-10-25 12:25:53,829 ---------------------------------------------------------------------------------------------------- |
|
|