stefan-it: Upload ./training.log with huggingface_hub (3325523)
2023-10-25 14:09:44,320 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,321 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 14:09:44,321 ----------------------------------------------------------------------------------------------------
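The parameter budget implied by the module shapes printed above can be tallied by hand. The following back-of-the-envelope sketch (pure Python, ignoring non-trainable buffers, assuming standard BERT-base wiring) is illustrative, not taken from the run:

```python
# Hidden size, feed-forward size, and layer count read off the model repr.
H, FF, LAYERS = 768, 3072, 12

def linear(n_in, n_out):
    """Parameters of a Linear layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out

# word/position/token-type embeddings + embedding LayerNorm (weight + bias)
embeddings = 64001 * H + 512 * H + 2 * H + 2 * H

per_layer = (
    3 * linear(H, H)          # query / key / value projections
    + linear(H, H) + 2 * H    # attention output dense + LayerNorm
    + linear(H, FF)           # intermediate dense
    + linear(FF, H) + 2 * H   # output dense + LayerNorm
)
encoder = LAYERS * per_layer
pooler = linear(H, H)
classifier = linear(H, 17)    # the (linear) head: 768 -> 17 tags

total = embeddings + encoder + pooler + classifier  # ~135.2M parameters
```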
2023-10-25 14:09:44,321 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 14:09:44,321 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Train: 20847 sentences
2023-10-25 14:09:44,322 (train_with_dev=False, train_with_test=False)
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Training Params:
2023-10-25 14:09:44,322 - learning_rate: "3e-05"
2023-10-25 14:09:44,322 - mini_batch_size: "8"
2023-10-25 14:09:44,322 - max_epochs: "10"
2023-10-25 14:09:44,322 - shuffle: "True"
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Plugins:
2023-10-25 14:09:44,322 - TensorboardLogger
2023-10-25 14:09:44,322 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 14:09:44,322 - metric: "('micro avg', 'f1-score')"
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Computation:
2023-10-25 14:09:44,322 - compute on device: cuda:0
2023-10-25 14:09:44,322 - embedding storage: none
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
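The per-step learning rates logged below are consistent with a one-cycle linear schedule: warmup over the first 10% of the 26,060 total steps (2,606 mini-batches x 10 epochs) up to the 3e-05 peak, then linear decay to zero. A minimal sketch of that schedule (the exact step bookkeeping inside Flair's LinearScheduler plugin may differ slightly):

```python
def linear_schedule_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of training,
    then linear decay to zero by the final step."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

TOTAL = 2606 * 10  # 2606 mini-batches per epoch, 10 epochs
# e.g. step 260 (epoch 1, iter 260) gives ~3e-6, matching "lr: 0.000003" below
```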
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 14:09:59,239 epoch 1 - iter 260/2606 - loss 1.63450744 - time (sec): 14.92 - samples/sec: 2481.30 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:10:13,050 epoch 1 - iter 520/2606 - loss 1.02153022 - time (sec): 28.73 - samples/sec: 2544.03 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:10:27,475 epoch 1 - iter 780/2606 - loss 0.78405696 - time (sec): 43.15 - samples/sec: 2578.80 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:10:41,459 epoch 1 - iter 1040/2606 - loss 0.64562869 - time (sec): 57.14 - samples/sec: 2592.31 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:10:55,957 epoch 1 - iter 1300/2606 - loss 0.56108502 - time (sec): 71.63 - samples/sec: 2630.53 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:11:09,915 epoch 1 - iter 1560/2606 - loss 0.50028254 - time (sec): 85.59 - samples/sec: 2628.92 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:11:23,827 epoch 1 - iter 1820/2606 - loss 0.45800118 - time (sec): 99.50 - samples/sec: 2628.90 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:11:37,878 epoch 1 - iter 2080/2606 - loss 0.42269524 - time (sec): 113.55 - samples/sec: 2629.76 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:11:51,504 epoch 1 - iter 2340/2606 - loss 0.39871138 - time (sec): 127.18 - samples/sec: 2609.72 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:12:05,905 epoch 1 - iter 2600/2606 - loss 0.37747153 - time (sec): 141.58 - samples/sec: 2591.92 - lr: 0.000030 - momentum: 0.000000
2023-10-25 14:12:06,237 ----------------------------------------------------------------------------------------------------
2023-10-25 14:12:06,238 EPOCH 1 done: loss 0.3772 - lr: 0.000030
2023-10-25 14:12:09,991 DEV : loss 0.13742859661579132 - f1-score (micro avg) 0.3201
2023-10-25 14:12:10,014 saving best model
2023-10-25 14:12:10,611 ----------------------------------------------------------------------------------------------------
2023-10-25 14:12:25,774 epoch 2 - iter 260/2606 - loss 0.14519096 - time (sec): 15.16 - samples/sec: 2511.96 - lr: 0.000030 - momentum: 0.000000
2023-10-25 14:12:40,756 epoch 2 - iter 520/2606 - loss 0.14211845 - time (sec): 30.14 - samples/sec: 2536.32 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:12:55,478 epoch 2 - iter 780/2606 - loss 0.13915106 - time (sec): 44.87 - samples/sec: 2528.96 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:13:09,729 epoch 2 - iter 1040/2606 - loss 0.14043910 - time (sec): 59.12 - samples/sec: 2534.95 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:13:24,189 epoch 2 - iter 1300/2606 - loss 0.14258354 - time (sec): 73.58 - samples/sec: 2531.58 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:13:38,159 epoch 2 - iter 1560/2606 - loss 0.14286427 - time (sec): 87.55 - samples/sec: 2532.92 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:13:52,969 epoch 2 - iter 1820/2606 - loss 0.14570149 - time (sec): 102.36 - samples/sec: 2531.49 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:14:08,066 epoch 2 - iter 2080/2606 - loss 0.14415040 - time (sec): 117.45 - samples/sec: 2524.04 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:14:21,896 epoch 2 - iter 2340/2606 - loss 0.14393099 - time (sec): 131.28 - samples/sec: 2511.98 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:14:36,338 epoch 2 - iter 2600/2606 - loss 0.14285847 - time (sec): 145.73 - samples/sec: 2513.81 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:14:36,722 ----------------------------------------------------------------------------------------------------
2023-10-25 14:14:36,723 EPOCH 2 done: loss 0.1427 - lr: 0.000027
2023-10-25 14:14:43,804 DEV : loss 0.1843709945678711 - f1-score (micro avg) 0.3536
2023-10-25 14:14:43,827 saving best model
2023-10-25 14:14:44,490 ----------------------------------------------------------------------------------------------------
2023-10-25 14:14:59,471 epoch 3 - iter 260/2606 - loss 0.08794930 - time (sec): 14.98 - samples/sec: 2450.26 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:15:13,545 epoch 3 - iter 520/2606 - loss 0.09651930 - time (sec): 29.05 - samples/sec: 2464.24 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:15:28,051 epoch 3 - iter 780/2606 - loss 0.09574977 - time (sec): 43.56 - samples/sec: 2469.11 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:15:42,170 epoch 3 - iter 1040/2606 - loss 0.09264605 - time (sec): 57.68 - samples/sec: 2526.05 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:15:56,294 epoch 3 - iter 1300/2606 - loss 0.09201435 - time (sec): 71.80 - samples/sec: 2550.69 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:16:10,616 epoch 3 - iter 1560/2606 - loss 0.09390110 - time (sec): 86.12 - samples/sec: 2544.81 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:16:24,326 epoch 3 - iter 1820/2606 - loss 0.09474851 - time (sec): 99.83 - samples/sec: 2554.95 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:16:38,591 epoch 3 - iter 2080/2606 - loss 0.09517834 - time (sec): 114.10 - samples/sec: 2559.85 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:16:52,243 epoch 3 - iter 2340/2606 - loss 0.09607960 - time (sec): 127.75 - samples/sec: 2559.26 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:17:07,509 epoch 3 - iter 2600/2606 - loss 0.09548921 - time (sec): 143.02 - samples/sec: 2562.55 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:17:07,844 ----------------------------------------------------------------------------------------------------
2023-10-25 14:17:07,844 EPOCH 3 done: loss 0.0956 - lr: 0.000023
2023-10-25 14:17:14,739 DEV : loss 0.17078012228012085 - f1-score (micro avg) 0.4127
2023-10-25 14:17:14,763 saving best model
2023-10-25 14:17:15,286 ----------------------------------------------------------------------------------------------------
2023-10-25 14:17:30,053 epoch 4 - iter 260/2606 - loss 0.07013202 - time (sec): 14.77 - samples/sec: 2448.77 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:17:44,224 epoch 4 - iter 520/2606 - loss 0.06707470 - time (sec): 28.94 - samples/sec: 2510.68 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:17:58,435 epoch 4 - iter 780/2606 - loss 0.06721877 - time (sec): 43.15 - samples/sec: 2530.08 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:18:12,528 epoch 4 - iter 1040/2606 - loss 0.06904912 - time (sec): 57.24 - samples/sec: 2502.44 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:18:26,510 epoch 4 - iter 1300/2606 - loss 0.06928579 - time (sec): 71.22 - samples/sec: 2498.94 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:18:41,075 epoch 4 - iter 1560/2606 - loss 0.06747754 - time (sec): 85.79 - samples/sec: 2546.69 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:18:55,036 epoch 4 - iter 1820/2606 - loss 0.06629190 - time (sec): 99.75 - samples/sec: 2529.68 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:19:09,844 epoch 4 - iter 2080/2606 - loss 0.06673662 - time (sec): 114.56 - samples/sec: 2531.10 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:19:23,694 epoch 4 - iter 2340/2606 - loss 0.06702601 - time (sec): 128.41 - samples/sec: 2542.34 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:19:39,275 epoch 4 - iter 2600/2606 - loss 0.06621362 - time (sec): 143.99 - samples/sec: 2543.76 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:19:39,660 ----------------------------------------------------------------------------------------------------
2023-10-25 14:19:39,660 EPOCH 4 done: loss 0.0662 - lr: 0.000020
2023-10-25 14:19:46,463 DEV : loss 0.273721307516098 - f1-score (micro avg) 0.386
2023-10-25 14:19:46,487 ----------------------------------------------------------------------------------------------------
2023-10-25 14:20:00,639 epoch 5 - iter 260/2606 - loss 0.06691840 - time (sec): 14.15 - samples/sec: 2630.55 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:20:14,529 epoch 5 - iter 520/2606 - loss 0.05760987 - time (sec): 28.04 - samples/sec: 2591.09 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:20:29,637 epoch 5 - iter 780/2606 - loss 0.06058874 - time (sec): 43.15 - samples/sec: 2563.17 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:20:44,571 epoch 5 - iter 1040/2606 - loss 0.05700962 - time (sec): 58.08 - samples/sec: 2551.13 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:21:00,977 epoch 5 - iter 1300/2606 - loss 0.05444334 - time (sec): 74.49 - samples/sec: 2449.95 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:21:16,309 epoch 5 - iter 1560/2606 - loss 0.05260313 - time (sec): 89.82 - samples/sec: 2439.97 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:21:30,068 epoch 5 - iter 1820/2606 - loss 0.05245316 - time (sec): 103.58 - samples/sec: 2469.81 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:21:44,396 epoch 5 - iter 2080/2606 - loss 0.05144192 - time (sec): 117.91 - samples/sec: 2493.38 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:21:59,116 epoch 5 - iter 2340/2606 - loss 0.05047561 - time (sec): 132.63 - samples/sec: 2507.01 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:22:13,165 epoch 5 - iter 2600/2606 - loss 0.05025500 - time (sec): 146.68 - samples/sec: 2498.92 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:22:13,537 ----------------------------------------------------------------------------------------------------
2023-10-25 14:22:13,538 EPOCH 5 done: loss 0.0503 - lr: 0.000017
2023-10-25 14:22:20,360 DEV : loss 0.31281721591949463 - f1-score (micro avg) 0.3713
2023-10-25 14:22:20,384 ----------------------------------------------------------------------------------------------------
2023-10-25 14:22:34,887 epoch 6 - iter 260/2606 - loss 0.02750655 - time (sec): 14.50 - samples/sec: 2716.97 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:22:49,531 epoch 6 - iter 520/2606 - loss 0.03366210 - time (sec): 29.15 - samples/sec: 2697.50 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:23:03,291 epoch 6 - iter 780/2606 - loss 0.03476698 - time (sec): 42.91 - samples/sec: 2635.59 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:23:17,137 epoch 6 - iter 1040/2606 - loss 0.03562665 - time (sec): 56.75 - samples/sec: 2628.17 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:23:30,397 epoch 6 - iter 1300/2606 - loss 0.03616456 - time (sec): 70.01 - samples/sec: 2608.05 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:23:44,783 epoch 6 - iter 1560/2606 - loss 0.03621415 - time (sec): 84.40 - samples/sec: 2590.93 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:23:59,274 epoch 6 - iter 1820/2606 - loss 0.03535518 - time (sec): 98.89 - samples/sec: 2591.35 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:24:13,367 epoch 6 - iter 2080/2606 - loss 0.03506987 - time (sec): 112.98 - samples/sec: 2601.35 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:24:28,280 epoch 6 - iter 2340/2606 - loss 0.03467514 - time (sec): 127.90 - samples/sec: 2589.94 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:24:42,383 epoch 6 - iter 2600/2606 - loss 0.03512806 - time (sec): 142.00 - samples/sec: 2576.94 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:24:42,750 ----------------------------------------------------------------------------------------------------
2023-10-25 14:24:42,750 EPOCH 6 done: loss 0.0351 - lr: 0.000013
2023-10-25 14:24:49,602 DEV : loss 0.3527540862560272 - f1-score (micro avg) 0.3748
2023-10-25 14:24:49,626 ----------------------------------------------------------------------------------------------------
2023-10-25 14:25:03,759 epoch 7 - iter 260/2606 - loss 0.03135937 - time (sec): 14.13 - samples/sec: 2463.00 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:25:18,452 epoch 7 - iter 520/2606 - loss 0.02798732 - time (sec): 28.82 - samples/sec: 2492.97 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:25:31,831 epoch 7 - iter 780/2606 - loss 0.02966098 - time (sec): 42.20 - samples/sec: 2488.39 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:25:46,387 epoch 7 - iter 1040/2606 - loss 0.02906116 - time (sec): 56.76 - samples/sec: 2501.44 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:26:00,733 epoch 7 - iter 1300/2606 - loss 0.02817930 - time (sec): 71.11 - samples/sec: 2513.06 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:26:15,858 epoch 7 - iter 1560/2606 - loss 0.02768488 - time (sec): 86.23 - samples/sec: 2549.14 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:26:31,035 epoch 7 - iter 1820/2606 - loss 0.02723490 - time (sec): 101.41 - samples/sec: 2567.64 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:26:45,536 epoch 7 - iter 2080/2606 - loss 0.02700111 - time (sec): 115.91 - samples/sec: 2566.07 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:26:59,816 epoch 7 - iter 2340/2606 - loss 0.02648609 - time (sec): 130.19 - samples/sec: 2564.80 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:27:13,882 epoch 7 - iter 2600/2606 - loss 0.02661855 - time (sec): 144.26 - samples/sec: 2541.36 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:27:14,206 ----------------------------------------------------------------------------------------------------
2023-10-25 14:27:14,206 EPOCH 7 done: loss 0.0266 - lr: 0.000010
2023-10-25 14:27:21,216 DEV : loss 0.4295073449611664 - f1-score (micro avg) 0.3762
2023-10-25 14:27:21,253 ----------------------------------------------------------------------------------------------------
2023-10-25 14:27:36,663 epoch 8 - iter 260/2606 - loss 0.02181414 - time (sec): 15.41 - samples/sec: 2521.69 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:27:50,289 epoch 8 - iter 520/2606 - loss 0.01996708 - time (sec): 29.03 - samples/sec: 2522.06 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:28:04,303 epoch 8 - iter 780/2606 - loss 0.01971643 - time (sec): 43.05 - samples/sec: 2548.27 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:28:18,416 epoch 8 - iter 1040/2606 - loss 0.01910012 - time (sec): 57.16 - samples/sec: 2548.36 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:28:32,654 epoch 8 - iter 1300/2606 - loss 0.01932305 - time (sec): 71.40 - samples/sec: 2571.72 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:28:46,927 epoch 8 - iter 1560/2606 - loss 0.01911397 - time (sec): 85.67 - samples/sec: 2599.53 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:29:00,738 epoch 8 - iter 1820/2606 - loss 0.01903817 - time (sec): 99.48 - samples/sec: 2594.02 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:29:15,408 epoch 8 - iter 2080/2606 - loss 0.01902221 - time (sec): 114.15 - samples/sec: 2627.51 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:29:29,121 epoch 8 - iter 2340/2606 - loss 0.01834796 - time (sec): 127.87 - samples/sec: 2611.02 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:29:42,489 epoch 8 - iter 2600/2606 - loss 0.01832678 - time (sec): 141.23 - samples/sec: 2594.69 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:29:42,908 ----------------------------------------------------------------------------------------------------
2023-10-25 14:29:42,908 EPOCH 8 done: loss 0.0183 - lr: 0.000007
2023-10-25 14:29:49,377 DEV : loss 0.4789745509624481 - f1-score (micro avg) 0.383
2023-10-25 14:29:49,402 ----------------------------------------------------------------------------------------------------
2023-10-25 14:30:03,298 epoch 9 - iter 260/2606 - loss 0.01111594 - time (sec): 13.89 - samples/sec: 2464.98 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:30:17,830 epoch 9 - iter 520/2606 - loss 0.01321447 - time (sec): 28.43 - samples/sec: 2547.08 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:30:32,851 epoch 9 - iter 780/2606 - loss 0.01217180 - time (sec): 43.45 - samples/sec: 2505.70 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:30:46,570 epoch 9 - iter 1040/2606 - loss 0.01514475 - time (sec): 57.17 - samples/sec: 2524.17 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:31:00,565 epoch 9 - iter 1300/2606 - loss 0.01506380 - time (sec): 71.16 - samples/sec: 2521.69 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:31:14,890 epoch 9 - iter 1560/2606 - loss 0.01489752 - time (sec): 85.49 - samples/sec: 2535.73 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:31:29,184 epoch 9 - iter 1820/2606 - loss 0.01427346 - time (sec): 99.78 - samples/sec: 2541.19 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:31:43,563 epoch 9 - iter 2080/2606 - loss 0.01410493 - time (sec): 114.16 - samples/sec: 2565.31 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:31:57,830 epoch 9 - iter 2340/2606 - loss 0.01382165 - time (sec): 128.43 - samples/sec: 2577.17 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:32:11,563 epoch 9 - iter 2600/2606 - loss 0.01346922 - time (sec): 142.16 - samples/sec: 2580.05 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:32:11,840 ----------------------------------------------------------------------------------------------------
2023-10-25 14:32:11,840 EPOCH 9 done: loss 0.0134 - lr: 0.000003
2023-10-25 14:32:18,364 DEV : loss 0.46503743529319763 - f1-score (micro avg) 0.3877
2023-10-25 14:32:18,390 ----------------------------------------------------------------------------------------------------
2023-10-25 14:32:32,275 epoch 10 - iter 260/2606 - loss 0.00779959 - time (sec): 13.88 - samples/sec: 2537.23 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:32:47,937 epoch 10 - iter 520/2606 - loss 0.00959445 - time (sec): 29.55 - samples/sec: 2480.44 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:33:03,639 epoch 10 - iter 780/2606 - loss 0.00886606 - time (sec): 45.25 - samples/sec: 2363.03 - lr: 0.000002 - momentum: 0.000000
2023-10-25 14:33:17,842 epoch 10 - iter 1040/2606 - loss 0.00899256 - time (sec): 59.45 - samples/sec: 2439.45 - lr: 0.000002 - momentum: 0.000000
2023-10-25 14:33:32,181 epoch 10 - iter 1300/2606 - loss 0.00983203 - time (sec): 73.79 - samples/sec: 2485.30 - lr: 0.000002 - momentum: 0.000000
2023-10-25 14:33:47,503 epoch 10 - iter 1560/2606 - loss 0.00977552 - time (sec): 89.11 - samples/sec: 2483.36 - lr: 0.000001 - momentum: 0.000000
2023-10-25 14:34:01,483 epoch 10 - iter 1820/2606 - loss 0.01040378 - time (sec): 103.09 - samples/sec: 2500.37 - lr: 0.000001 - momentum: 0.000000
2023-10-25 14:34:15,530 epoch 10 - iter 2080/2606 - loss 0.00988332 - time (sec): 117.14 - samples/sec: 2505.05 - lr: 0.000001 - momentum: 0.000000
2023-10-25 14:34:30,609 epoch 10 - iter 2340/2606 - loss 0.00983035 - time (sec): 132.22 - samples/sec: 2485.61 - lr: 0.000000 - momentum: 0.000000
2023-10-25 14:34:44,672 epoch 10 - iter 2600/2606 - loss 0.00949045 - time (sec): 146.28 - samples/sec: 2505.61 - lr: 0.000000 - momentum: 0.000000
2023-10-25 14:34:44,984 ----------------------------------------------------------------------------------------------------
2023-10-25 14:34:44,984 EPOCH 10 done: loss 0.0095 - lr: 0.000000
2023-10-25 14:34:51,458 DEV : loss 0.46764612197875977 - f1-score (micro avg) 0.3931
2023-10-25 14:34:52,026 ----------------------------------------------------------------------------------------------------
2023-10-25 14:34:52,027 Loading model from best epoch ...
2023-10-25 14:34:53,898 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
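The 17-tag dictionary above follows the BIOES scheme (Single, Begin, End, Inside, plus O) over the four entity types. A minimal decoder turning such a tag sequence into entity spans could look like this (an illustrative helper, not Flair's actual implementation):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans.

    Malformed sequences (e.g. an E- tag with no matching B-) are silently
    dropped rather than repaired in this sketch.
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                 # single-token entity
            spans.append((lab, i, i + 1))
            start, label = None, None
        elif prefix == "B":               # entity opens
            start, label = i, lab
        elif prefix == "E" and start is not None and lab == label:
            spans.append((lab, start, i + 1))  # entity closes
            start, label = None, None
        # prefix == "I" just continues an open span
    return spans
```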
2023-10-25 14:35:03,672
Results:
- F-score (micro) 0.4654
- F-score (macro) 0.2881
- Accuracy 0.3061

By class:
              precision    recall  f1-score   support

         LOC     0.5611    0.6203    0.5892      1214
         PER     0.4804    0.2884    0.3604       808
         ORG     0.2175    0.1898    0.2027       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4932    0.4406    0.4654      2390
   macro avg     0.3148    0.2746    0.2881      2390
weighted avg     0.4796    0.4406    0.4511      2390
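The averages in the table can be cross-checked from the per-class rows: macro F1 is the unweighted mean of the class F1 scores, weighted F1 weights them by support, and micro F1 is the harmonic mean of the micro-averaged precision and recall. A quick sanity check in plain Python:

```python
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

# per-class (precision, recall, f1, support) rows from the table above
per_class = {
    "LOC":       (0.5611, 0.6203, 0.5892, 1214),
    "PER":       (0.4804, 0.2884, 0.3604,  808),
    "ORG":       (0.2175, 0.1898, 0.2027,  353),
    "HumanProd": (0.0000, 0.0000, 0.0000,   15),
}

macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)
weighted_f1 = (sum(row[2] * row[3] for row in per_class.values())
               / sum(row[3] for row in per_class.values()))
micro_f1 = f1(0.4932, 0.4406)  # micro-averaged precision and recall
```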
2023-10-25 14:35:03,672 ----------------------------------------------------------------------------------------------------