2023-10-25 18:53:21,074 ----------------------------------------------------------------------------------------------------
2023-10-25 18:53:21,075 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(64001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-25 18:53:21,075 ----------------------------------------------------------------------------------------------------
2023-10-25 18:53:21,075 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 18:53:21,075 ----------------------------------------------------------------------------------------------------
2023-10-25 18:53:21,075 Train: 20847 sentences
2023-10-25 18:53:21,075 (train_with_dev=False, train_with_test=False)
2023-10-25 18:53:21,075 ----------------------------------------------------------------------------------------------------
2023-10-25 18:53:21,075 Training Params:
2023-10-25 18:53:21,075 - learning_rate: "5e-05"
2023-10-25 18:53:21,075 - mini_batch_size: "8"
2023-10-25 18:53:21,075 - max_epochs: "10"
2023-10-25 18:53:21,075 - shuffle: "True"
2023-10-25 18:53:21,075 ----------------------------------------------------------------------------------------------------
2023-10-25 18:53:21,075 Plugins:
2023-10-25 18:53:21,075 - TensorboardLogger
2023-10-25 18:53:21,075 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 18:53:21,075 ----------------------------------------------------------------------------------------------------
2023-10-25 18:53:21,075 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 18:53:21,075 - metric: "('micro avg', 'f1-score')"
2023-10-25 18:53:21,075 ----------------------------------------------------------------------------------------------------
2023-10-25 18:53:21,075 Computation:
2023-10-25 18:53:21,075 - compute on device: cuda:0
2023-10-25 18:53:21,075 - embedding storage: none
2023-10-25 18:53:21,075 ----------------------------------------------------------------------------------------------------
2023-10-25 18:53:21,075 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-25 18:53:21,076 ----------------------------------------------------------------------------------------------------
2023-10-25 18:53:21,076 ----------------------------------------------------------------------------------------------------
2023-10-25 18:53:21,076 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 18:53:35,071 epoch 1 - iter 260/2606 - loss 1.22654989 - time (sec): 13.99 - samples/sec: 2582.53 - lr: 0.000005 - momentum: 0.000000
2023-10-25 18:53:48,825 epoch 1 - iter 520/2606 - loss 0.79397378 - time (sec): 27.75 - samples/sec: 2647.86 - lr: 0.000010 - momentum: 0.000000
2023-10-25 18:54:03,050 epoch 1 - iter 780/2606 - loss 0.60720555 - time (sec): 41.97 - samples/sec: 2660.39 - lr: 0.000015 - momentum: 0.000000
2023-10-25 18:54:17,076 epoch 1 - iter 1040/2606 - loss 0.51582665 - time (sec): 56.00 - samples/sec: 2652.91 - lr: 0.000020 - momentum: 0.000000
2023-10-25 18:54:30,805 epoch 1 - iter 1300/2606 - loss 0.46165890 - time (sec): 69.73 - samples/sec: 2629.90 - lr: 0.000025 - momentum: 0.000000
2023-10-25 18:54:44,643 epoch 1 - iter 1560/2606 - loss 0.42197187 - time (sec): 83.57 - samples/sec: 2612.42 - lr: 0.000030 - momentum: 0.000000
2023-10-25 18:54:59,068 epoch 1 - iter 1820/2606 - loss 0.38914452 - time (sec): 97.99 - samples/sec: 2631.50 - lr: 0.000035 - momentum: 0.000000
2023-10-25 18:55:14,246 epoch 1 - iter 2080/2606 - loss 0.36051912 - time (sec): 113.17 - samples/sec: 2613.37 - lr: 0.000040 - momentum: 0.000000
2023-10-25 18:55:28,081 epoch 1 - iter 2340/2606 - loss 0.34210880 - time (sec): 127.00 - samples/sec: 2610.36 - lr: 0.000045 - momentum: 0.000000
2023-10-25 18:55:41,859 epoch 1 - iter 2600/2606 - loss 0.32875774 - time (sec): 140.78 - samples/sec: 2601.68 - lr: 0.000050 - momentum: 0.000000
2023-10-25 18:55:42,255 ----------------------------------------------------------------------------------------------------
2023-10-25 18:55:42,256 EPOCH 1 done: loss 0.3282 - lr: 0.000050
2023-10-25 18:55:46,806 DEV : loss 0.10818666964769363 - f1-score (micro avg) 0.3028
2023-10-25 18:55:46,831 saving best model
2023-10-25 18:55:47,171 ----------------------------------------------------------------------------------------------------
2023-10-25 18:56:00,856 epoch 2 - iter 260/2606 - loss 0.16703388 - time (sec): 13.68 - samples/sec: 2673.86 - lr: 0.000049 - momentum: 0.000000
2023-10-25 18:56:14,745 epoch 2 - iter 520/2606 - loss 0.16020151 - time (sec): 27.57 - samples/sec: 2639.84 - lr: 0.000049 - momentum: 0.000000
2023-10-25 18:56:28,759 epoch 2 - iter 780/2606 - loss 0.16073044 - time (sec): 41.59 - samples/sec: 2650.19 - lr: 0.000048 - momentum: 0.000000
2023-10-25 18:56:42,925 epoch 2 - iter 1040/2606 - loss 0.17210066 - time (sec): 55.75 - samples/sec: 2643.62 - lr: 0.000048 - momentum: 0.000000
2023-10-25 18:56:56,952 epoch 2 - iter 1300/2606 - loss 0.18255173 - time (sec): 69.78 - samples/sec: 2621.28 - lr: 0.000047 - momentum: 0.000000
2023-10-25 18:57:11,062 epoch 2 - iter 1560/2606 - loss 0.18257560 - time (sec): 83.89 - samples/sec: 2640.91 - lr: 0.000047 - momentum: 0.000000
2023-10-25 18:57:24,788 epoch 2 - iter 1820/2606 - loss 0.18125754 - time (sec): 97.62 - samples/sec: 2639.70 - lr: 0.000046 - momentum: 0.000000
2023-10-25 18:57:38,312 epoch 2 - iter 2080/2606 - loss 0.18852407 - time (sec): 111.14 - samples/sec: 2624.26 - lr: 0.000046 - momentum: 0.000000
2023-10-25 18:57:52,289 epoch 2 - iter 2340/2606 - loss 0.20269247 - time (sec): 125.12 - samples/sec: 2626.55 - lr: 0.000045 - momentum: 0.000000
2023-10-25 18:58:05,728 epoch 2 - iter 2600/2606 - loss 0.21971447 - time (sec): 138.56 - samples/sec: 2644.17 - lr: 0.000044 - momentum: 0.000000
2023-10-25 18:58:06,094 ----------------------------------------------------------------------------------------------------
2023-10-25 18:58:06,094 EPOCH 2 done: loss 0.2193 - lr: 0.000044
2023-10-25 18:58:12,220 DEV : loss 0.15937478840351105 - f1-score (micro avg) 0.146
2023-10-25 18:58:12,245 ----------------------------------------------------------------------------------------------------
2023-10-25 18:58:26,118 epoch 3 - iter 260/2606 - loss 0.35785536 - time (sec): 13.87 - samples/sec: 2625.46 - lr: 0.000044 - momentum: 0.000000
2023-10-25 18:58:40,275 epoch 3 - iter 520/2606 - loss 0.33128432 - time (sec): 28.03 - samples/sec: 2711.15 - lr: 0.000043 - momentum: 0.000000
2023-10-25 18:58:54,085 epoch 3 - iter 780/2606 - loss 0.32057211 - time (sec): 41.84 - samples/sec: 2695.43 - lr: 0.000043 - momentum: 0.000000
2023-10-25 18:59:07,877 epoch 3 - iter 1040/2606 - loss 0.31732087 - time (sec): 55.63 - samples/sec: 2662.56 - lr: 0.000042 - momentum: 0.000000
2023-10-25 18:59:21,706 epoch 3 - iter 1300/2606 - loss 0.31159665 - time (sec): 69.46 - samples/sec: 2666.29 - lr: 0.000042 - momentum: 0.000000
2023-10-25 18:59:35,403 epoch 3 - iter 1560/2606 - loss 0.29696758 - time (sec): 83.16 - samples/sec: 2639.12 - lr: 0.000041 - momentum: 0.000000
2023-10-25 18:59:48,699 epoch 3 - iter 1820/2606 - loss 0.27941389 - time (sec): 96.45 - samples/sec: 2649.65 - lr: 0.000041 - momentum: 0.000000
2023-10-25 19:00:03,073 epoch 3 - iter 2080/2606 - loss 0.26628342 - time (sec): 110.83 - samples/sec: 2661.43 - lr: 0.000040 - momentum: 0.000000
2023-10-25 19:00:16,640 epoch 3 - iter 2340/2606 - loss 0.25955628 - time (sec): 124.39 - samples/sec: 2642.95 - lr: 0.000039 - momentum: 0.000000
2023-10-25 19:00:30,814 epoch 3 - iter 2600/2606 - loss 0.25232625 - time (sec): 138.57 - samples/sec: 2646.42 - lr: 0.000039 - momentum: 0.000000
2023-10-25 19:00:31,097 ----------------------------------------------------------------------------------------------------
2023-10-25 19:00:31,098 EPOCH 3 done: loss 0.2522 - lr: 0.000039
2023-10-25 19:00:37,278 DEV : loss 0.1657640039920807 - f1-score (micro avg) 0.1864
2023-10-25 19:00:37,304 ----------------------------------------------------------------------------------------------------
2023-10-25 19:00:51,401 epoch 4 - iter 260/2606 - loss 0.15124015 - time (sec): 14.10 - samples/sec: 2591.51 - lr: 0.000038 - momentum: 0.000000
2023-10-25 19:01:06,091 epoch 4 - iter 520/2606 - loss 0.16792346 - time (sec): 28.79 - samples/sec: 2664.43 - lr: 0.000038 - momentum: 0.000000
2023-10-25 19:01:19,837 epoch 4 - iter 780/2606 - loss 0.16285129 - time (sec): 42.53 - samples/sec: 2663.29 - lr: 0.000037 - momentum: 0.000000
2023-10-25 19:01:34,811 epoch 4 - iter 1040/2606 - loss 0.15299021 - time (sec): 57.51 - samples/sec: 2637.57 - lr: 0.000037 - momentum: 0.000000
2023-10-25 19:01:49,026 epoch 4 - iter 1300/2606 - loss 0.14524141 - time (sec): 71.72 - samples/sec: 2644.43 - lr: 0.000036 - momentum: 0.000000
2023-10-25 19:02:02,607 epoch 4 - iter 1560/2606 - loss 0.14577143 - time (sec): 85.30 - samples/sec: 2630.63 - lr: 0.000036 - momentum: 0.000000
2023-10-25 19:02:16,480 epoch 4 - iter 1820/2606 - loss 0.14482886 - time (sec): 99.18 - samples/sec: 2640.54 - lr: 0.000035 - momentum: 0.000000
2023-10-25 19:02:30,428 epoch 4 - iter 2080/2606 - loss 0.14186931 - time (sec): 113.12 - samples/sec: 2636.86 - lr: 0.000034 - momentum: 0.000000
2023-10-25 19:02:43,625 epoch 4 - iter 2340/2606 - loss 0.13854088 - time (sec): 126.32 - samples/sec: 2637.26 - lr: 0.000034 - momentum: 0.000000
2023-10-25 19:02:56,979 epoch 4 - iter 2600/2606 - loss 0.13696604 - time (sec): 139.67 - samples/sec: 2626.30 - lr: 0.000033 - momentum: 0.000000
2023-10-25 19:02:57,258 ----------------------------------------------------------------------------------------------------
2023-10-25 19:02:57,259 EPOCH 4 done: loss 0.1369 - lr: 0.000033
2023-10-25 19:03:03,450 DEV : loss 0.1679009050130844 - f1-score (micro avg) 0.2583
2023-10-25 19:03:03,476 ----------------------------------------------------------------------------------------------------
2023-10-25 19:03:17,388 epoch 5 - iter 260/2606 - loss 0.12128412 - time (sec): 13.91 - samples/sec: 2601.28 - lr: 0.000033 - momentum: 0.000000
2023-10-25 19:03:31,093 epoch 5 - iter 520/2606 - loss 0.10950431 - time (sec): 27.62 - samples/sec: 2579.15 - lr: 0.000032 - momentum: 0.000000
2023-10-25 19:03:45,240 epoch 5 - iter 780/2606 - loss 0.11025977 - time (sec): 41.76 - samples/sec: 2612.93 - lr: 0.000032 - momentum: 0.000000
2023-10-25 19:03:59,070 epoch 5 - iter 1040/2606 - loss 0.12022034 - time (sec): 55.59 - samples/sec: 2650.10 - lr: 0.000031 - momentum: 0.000000
2023-10-25 19:04:13,019 epoch 5 - iter 1300/2606 - loss 0.11986942 - time (sec): 69.54 - samples/sec: 2649.73 - lr: 0.000031 - momentum: 0.000000
2023-10-25 19:04:26,967 epoch 5 - iter 1560/2606 - loss 0.12190371 - time (sec): 83.49 - samples/sec: 2649.25 - lr: 0.000030 - momentum: 0.000000
2023-10-25 19:04:41,286 epoch 5 - iter 1820/2606 - loss 0.12535053 - time (sec): 97.81 - samples/sec: 2604.46 - lr: 0.000029 - momentum: 0.000000
2023-10-25 19:04:54,862 epoch 5 - iter 2080/2606 - loss 0.12512345 - time (sec): 111.38 - samples/sec: 2622.64 - lr: 0.000029 - momentum: 0.000000
2023-10-25 19:05:08,753 epoch 5 - iter 2340/2606 - loss 0.12311329 - time (sec): 125.28 - samples/sec: 2623.06 - lr: 0.000028 - momentum: 0.000000
2023-10-25 19:05:23,105 epoch 5 - iter 2600/2606 - loss 0.11877944 - time (sec): 139.63 - samples/sec: 2625.34 - lr: 0.000028 - momentum: 0.000000
2023-10-25 19:05:23,396 ----------------------------------------------------------------------------------------------------
2023-10-25 19:05:23,397 EPOCH 5 done: loss 0.1186 - lr: 0.000028
2023-10-25 19:05:29,639 DEV : loss 0.24302677810192108 - f1-score (micro avg) 0.3593
2023-10-25 19:05:29,664 saving best model
2023-10-25 19:05:30,133 ----------------------------------------------------------------------------------------------------
2023-10-25 19:05:44,343 epoch 6 - iter 260/2606 - loss 0.06961303 - time (sec): 14.20 - samples/sec: 2677.08 - lr: 0.000027 - momentum: 0.000000
2023-10-25 19:05:58,247 epoch 6 - iter 520/2606 - loss 0.06637601 - time (sec): 28.11 - samples/sec: 2600.18 - lr: 0.000027 - momentum: 0.000000
2023-10-25 19:06:11,653 epoch 6 - iter 780/2606 - loss 0.07414255 - time (sec): 41.51 - samples/sec: 2613.60 - lr: 0.000026 - momentum: 0.000000
2023-10-25 19:06:25,533 epoch 6 - iter 1040/2606 - loss 0.10031781 - time (sec): 55.39 - samples/sec: 2648.30 - lr: 0.000026 - momentum: 0.000000
2023-10-25 19:06:39,167 epoch 6 - iter 1300/2606 - loss 0.13040414 - time (sec): 69.03 - samples/sec: 2637.32 - lr: 0.000025 - momentum: 0.000000
2023-10-25 19:06:53,355 epoch 6 - iter 1560/2606 - loss 0.13377462 - time (sec): 83.22 - samples/sec: 2645.86 - lr: 0.000024 - momentum: 0.000000
2023-10-25 19:07:06,957 epoch 6 - iter 1820/2606 - loss 0.13543341 - time (sec): 96.82 - samples/sec: 2631.12 - lr: 0.000024 - momentum: 0.000000
2023-10-25 19:07:20,744 epoch 6 - iter 2080/2606 - loss 0.13407847 - time (sec): 110.61 - samples/sec: 2636.25 - lr: 0.000023 - momentum: 0.000000
2023-10-25 19:07:35,017 epoch 6 - iter 2340/2606 - loss 0.13093426 - time (sec): 124.88 - samples/sec: 2635.80 - lr: 0.000023 - momentum: 0.000000
2023-10-25 19:07:50,193 epoch 6 - iter 2600/2606 - loss 0.12726324 - time (sec): 140.05 - samples/sec: 2616.54 - lr: 0.000022 - momentum: 0.000000
2023-10-25 19:07:50,554 ----------------------------------------------------------------------------------------------------
2023-10-25 19:07:50,555 EPOCH 6 done: loss 0.1275 - lr: 0.000022
2023-10-25 19:07:56,751 DEV : loss 0.21893833577632904 - f1-score (micro avg) 0.2173
2023-10-25 19:07:56,777 ----------------------------------------------------------------------------------------------------
2023-10-25 19:08:10,969 epoch 7 - iter 260/2606 - loss 0.11553804 - time (sec): 14.19 - samples/sec: 2546.35 - lr: 0.000022 - momentum: 0.000000
2023-10-25 19:08:25,327 epoch 7 - iter 520/2606 - loss 0.11916207 - time (sec): 28.55 - samples/sec: 2588.66 - lr: 0.000021 - momentum: 0.000000
2023-10-25 19:08:39,233 epoch 7 - iter 780/2606 - loss 0.12128387 - time (sec): 42.46 - samples/sec: 2613.09 - lr: 0.000021 - momentum: 0.000000
2023-10-25 19:08:53,554 epoch 7 - iter 1040/2606 - loss 0.12599467 - time (sec): 56.78 - samples/sec: 2633.23 - lr: 0.000020 - momentum: 0.000000
2023-10-25 19:09:07,296 epoch 7 - iter 1300/2606 - loss 0.12603993 - time (sec): 70.52 - samples/sec: 2630.77 - lr: 0.000019 - momentum: 0.000000
2023-10-25 19:09:21,108 epoch 7 - iter 1560/2606 - loss 0.13415502 - time (sec): 84.33 - samples/sec: 2656.62 - lr: 0.000019 - momentum: 0.000000
2023-10-25 19:09:35,497 epoch 7 - iter 1820/2606 - loss 0.13605167 - time (sec): 98.72 - samples/sec: 2651.85 - lr: 0.000018 - momentum: 0.000000
2023-10-25 19:09:49,453 epoch 7 - iter 2080/2606 - loss 0.13494947 - time (sec): 112.68 - samples/sec: 2647.74 - lr: 0.000018 - momentum: 0.000000
2023-10-25 19:10:03,525 epoch 7 - iter 2340/2606 - loss 0.13572897 - time (sec): 126.75 - samples/sec: 2621.39 - lr: 0.000017 - momentum: 0.000000
2023-10-25 19:10:17,206 epoch 7 - iter 2600/2606 - loss 0.13869799 - time (sec): 140.43 - samples/sec: 2611.15 - lr: 0.000017 - momentum: 0.000000
2023-10-25 19:10:17,500 ----------------------------------------------------------------------------------------------------
2023-10-25 19:10:17,501 EPOCH 7 done: loss 0.1388 - lr: 0.000017
2023-10-25 19:10:24,378 DEV : loss 0.22277408838272095 - f1-score (micro avg) 0.1056
2023-10-25 19:10:24,404 ----------------------------------------------------------------------------------------------------
2023-10-25 19:10:38,339 epoch 8 - iter 260/2606 - loss 0.13444344 - time (sec): 13.93 - samples/sec: 2599.23 - lr: 0.000016 - momentum: 0.000000
2023-10-25 19:10:52,192 epoch 8 - iter 520/2606 - loss 0.12655038 - time (sec): 27.79 - samples/sec: 2594.77 - lr: 0.000016 - momentum: 0.000000
2023-10-25 19:11:06,051 epoch 8 - iter 780/2606 - loss 0.13953058 - time (sec): 41.65 - samples/sec: 2615.05 - lr: 0.000015 - momentum: 0.000000
2023-10-25 19:11:19,501 epoch 8 - iter 1040/2606 - loss 0.15672566 - time (sec): 55.10 - samples/sec: 2596.78 - lr: 0.000014 - momentum: 0.000000
2023-10-25 19:11:33,165 epoch 8 - iter 1300/2606 - loss 0.16238670 - time (sec): 68.76 - samples/sec: 2630.61 - lr: 0.000014 - momentum: 0.000000
2023-10-25 19:11:46,784 epoch 8 - iter 1560/2606 - loss 0.16000621 - time (sec): 82.38 - samples/sec: 2609.36 - lr: 0.000013 - momentum: 0.000000
2023-10-25 19:12:01,346 epoch 8 - iter 1820/2606 - loss 0.16256852 - time (sec): 96.94 - samples/sec: 2607.62 - lr: 0.000013 - momentum: 0.000000
2023-10-25 19:12:15,435 epoch 8 - iter 2080/2606 - loss 0.16386440 - time (sec): 111.03 - samples/sec: 2610.02 - lr: 0.000012 - momentum: 0.000000
2023-10-25 19:12:29,146 epoch 8 - iter 2340/2606 - loss 0.16468898 - time (sec): 124.74 - samples/sec: 2638.59 - lr: 0.000012 - momentum: 0.000000
2023-10-25 19:12:42,925 epoch 8 - iter 2600/2606 - loss 0.16416334 - time (sec): 138.52 - samples/sec: 2645.99 - lr: 0.000011 - momentum: 0.000000
2023-10-25 19:12:43,215 ----------------------------------------------------------------------------------------------------
2023-10-25 19:12:43,215 EPOCH 8 done: loss 0.1644 - lr: 0.000011
2023-10-25 19:12:50,042 DEV : loss 0.25141724944114685 - f1-score (micro avg) 0.0708
2023-10-25 19:12:50,069 ----------------------------------------------------------------------------------------------------
2023-10-25 19:13:04,144 epoch 9 - iter 260/2606 - loss 0.16456916 - time (sec): 14.07 - samples/sec: 2636.38 - lr: 0.000011 - momentum: 0.000000
2023-10-25 19:13:18,077 epoch 9 - iter 520/2606 - loss 0.17390022 - time (sec): 28.01 - samples/sec: 2556.96 - lr: 0.000010 - momentum: 0.000000
2023-10-25 19:13:32,123 epoch 9 - iter 780/2606 - loss 0.16945904 - time (sec): 42.05 - samples/sec: 2607.81 - lr: 0.000009 - momentum: 0.000000
2023-10-25 19:13:45,929 epoch 9 - iter 1040/2606 - loss 0.17187574 - time (sec): 55.86 - samples/sec: 2605.36 - lr: 0.000009 - momentum: 0.000000
2023-10-25 19:13:59,718 epoch 9 - iter 1300/2606 - loss 0.17064245 - time (sec): 69.65 - samples/sec: 2590.76 - lr: 0.000008 - momentum: 0.000000
2023-10-25 19:14:13,646 epoch 9 - iter 1560/2606 - loss 0.16565863 - time (sec): 83.58 - samples/sec: 2599.36 - lr: 0.000008 - momentum: 0.000000
2023-10-25 19:14:27,426 epoch 9 - iter 1820/2606 - loss 0.16328328 - time (sec): 97.36 - samples/sec: 2588.64 - lr: 0.000007 - momentum: 0.000000
2023-10-25 19:14:41,710 epoch 9 - iter 2080/2606 - loss 0.15985803 - time (sec): 111.64 - samples/sec: 2608.36 - lr: 0.000007 - momentum: 0.000000
2023-10-25 19:14:55,830 epoch 9 - iter 2340/2606 - loss 0.15644552 - time (sec): 125.76 - samples/sec: 2630.74 - lr: 0.000006 - momentum: 0.000000
2023-10-25 19:15:09,880 epoch 9 - iter 2600/2606 - loss 0.15293464 - time (sec): 139.81 - samples/sec: 2618.85 - lr: 0.000006 - momentum: 0.000000
2023-10-25 19:15:10,303 ----------------------------------------------------------------------------------------------------
2023-10-25 19:15:10,304 EPOCH 9 done: loss 0.1527 - lr: 0.000006
2023-10-25 19:15:17,210 DEV : loss 0.23577940464019775 - f1-score (micro avg) 0.1376
2023-10-25 19:15:17,236 ----------------------------------------------------------------------------------------------------
2023-10-25 19:15:31,281 epoch 10 - iter 260/2606 - loss 0.11824962 - time (sec): 14.04 - samples/sec: 2639.64 - lr: 0.000005 - momentum: 0.000000
2023-10-25 19:15:45,059 epoch 10 - iter 520/2606 - loss 0.13577297 - time (sec): 27.82 - samples/sec: 2632.39 - lr: 0.000004 - momentum: 0.000000
2023-10-25 19:15:58,722 epoch 10 - iter 780/2606 - loss 0.14718022 - time (sec): 41.48 - samples/sec: 2638.99 - lr: 0.000004 - momentum: 0.000000
2023-10-25 19:16:12,501 epoch 10 - iter 1040/2606 - loss 0.15180573 - time (sec): 55.26 - samples/sec: 2626.46 - lr: 0.000003 - momentum: 0.000000
2023-10-25 19:16:26,627 epoch 10 - iter 1300/2606 - loss 0.15379125 - time (sec): 69.39 - samples/sec: 2612.90 - lr: 0.000003 - momentum: 0.000000
2023-10-25 19:16:40,065 epoch 10 - iter 1560/2606 - loss 0.15876599 - time (sec): 82.83 - samples/sec: 2594.32 - lr: 0.000002 - momentum: 0.000000
2023-10-25 19:16:54,014 epoch 10 - iter 1820/2606 - loss 0.15804883 - time (sec): 96.78 - samples/sec: 2597.72 - lr: 0.000002 - momentum: 0.000000
2023-10-25 19:17:07,637 epoch 10 - iter 2080/2606 - loss 0.15931423 - time (sec): 110.40 - samples/sec: 2606.04 - lr: 0.000001 - momentum: 0.000000
2023-10-25 19:17:22,183 epoch 10 - iter 2340/2606 - loss 0.15752814 - time (sec): 124.95 - samples/sec: 2623.44 - lr: 0.000001 - momentum: 0.000000
2023-10-25 19:17:36,063 epoch 10 - iter 2600/2606 - loss 0.15722249 - time (sec): 138.83 - samples/sec: 2636.58 - lr: 0.000000 - momentum: 0.000000
2023-10-25 19:17:36,487 ----------------------------------------------------------------------------------------------------
2023-10-25 19:17:36,487 EPOCH 10 done: loss 0.1571 - lr: 0.000000
2023-10-25 19:17:43,301 DEV : loss 0.22768089175224304 - f1-score (micro avg) 0.0925
2023-10-25 19:17:43,799 ----------------------------------------------------------------------------------------------------
2023-10-25 19:17:43,800 Loading model from best epoch ...
2023-10-25 19:17:45,401 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 19:17:55,259
Results:
- F-score (micro) 0.4822
- F-score (macro) 0.3067
- Accuracy 0.3221
By class:
              precision    recall  f1-score   support

         LOC     0.5077    0.6820    0.5821      1214
         PER     0.4086    0.4703    0.4373       808
         ORG     0.2050    0.2096    0.2073       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4380    0.5364    0.4822      2390
   macro avg     0.2803    0.3405    0.3067      2390
weighted avg     0.4263    0.5364    0.4741      2390
2023-10-25 19:17:55,259 ----------------------------------------------------------------------------------------------------
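The averaged rows of the final report can be cross-checked from the per-class rows. The sketch below is not part of the Flair output. It assumes the usual definitions (micro F1 as the harmonic mean of micro precision and recall, macro F1 as the unweighted mean of class F1 scores, weighted F1 as the support-weighted mean), and the names `per_class` and `f1` are illustrative only. Inputs are the four-decimal values printed in the log, so results agree only up to rounding.

```python
# Per-class (precision, recall, f1, support), copied from the "By class" table in the log.
per_class = {
    "LOC": (0.5077, 0.6820, 0.5821, 1214),
    "PER": (0.4086, 0.4703, 0.4373, 808),
    "ORG": (0.2050, 0.2096, 0.2073, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1 from the micro-averaged precision and recall reported in the log.
micro_f1 = f1(0.4380, 0.5364)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 scores weighted by support.
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

The 15 HumanProd mentions score 0.0 and count as a full quarter of the macro average, which is why the macro F-score sits well below the micro F-score in this run.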