2023-10-25 14:35:36,306 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 14:35:36,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 14:35:36,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 Train: 20847 sentences
2023-10-25 14:35:36,307 (train_with_dev=False, train_with_test=False)
2023-10-25 14:35:36,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 Training Params:
2023-10-25 14:35:36,307 - learning_rate: "5e-05"
2023-10-25 14:35:36,307 - mini_batch_size: "8"
2023-10-25 14:35:36,307 - max_epochs: "10"
2023-10-25 14:35:36,307 - shuffle: "True"
2023-10-25 14:35:36,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 Plugins:
2023-10-25 14:35:36,307 - TensorboardLogger
2023-10-25 14:35:36,307 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 14:35:36,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 14:35:36,307 - metric: "('micro avg', 'f1-score')"
2023-10-25 14:35:36,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,307 Computation:
2023-10-25 14:35:36,307 - compute on device: cuda:0
2023-10-25 14:35:36,307 - embedding storage: none
2023-10-25 14:35:36,308 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,308 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 14:35:36,308 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,308 ----------------------------------------------------------------------------------------------------
2023-10-25 14:35:36,308 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 14:35:50,873 epoch 1 - iter 260/2606 - loss 1.43007414 - time (sec): 14.56 - samples/sec: 2541.19 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:36:05,195 epoch 1 - iter 520/2606 - loss 0.88642286 - time (sec): 28.89 - samples/sec: 2530.05 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:36:20,044 epoch 1 - iter 780/2606 - loss 0.68416171 - time (sec): 43.74 - samples/sec: 2544.39 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:36:34,261 epoch 1 - iter 1040/2606 - loss 0.56767487 - time (sec): 57.95 - samples/sec: 2555.77 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:36:48,977 epoch 1 - iter 1300/2606 - loss 0.49684606 - time (sec): 72.67 - samples/sec: 2593.07 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:37:03,212 epoch 1 - iter 1560/2606 - loss 0.44567246 - time (sec): 86.90 - samples/sec: 2589.23 - lr: 0.000030 - momentum: 0.000000
2023-10-25 14:37:17,495 epoch 1 - iter 1820/2606 - loss 0.41053349 - time (sec): 101.19 - samples/sec: 2585.19 - lr: 0.000035 - momentum: 0.000000
2023-10-25 14:37:31,904 epoch 1 - iter 2080/2606 - loss 0.38157875 - time (sec): 115.60 - samples/sec: 2583.32 - lr: 0.000040 - momentum: 0.000000
2023-10-25 14:37:45,329 epoch 1 - iter 2340/2606 - loss 0.36197031 - time (sec): 129.02 - samples/sec: 2572.51 - lr: 0.000045 - momentum: 0.000000
2023-10-25 14:37:59,142 epoch 1 - iter 2600/2606 - loss 0.34509274 - time (sec): 142.83 - samples/sec: 2569.22 - lr: 0.000050 - momentum: 0.000000
2023-10-25 14:37:59,394 ----------------------------------------------------------------------------------------------------
2023-10-25 14:37:59,395 EPOCH 1 done: loss 0.3449 - lr: 0.000050
2023-10-25 14:38:03,168 DEV : loss 0.14861008524894714 - f1-score (micro avg) 0.3075
2023-10-25 14:38:03,193 saving best model
2023-10-25 14:38:03,720 ----------------------------------------------------------------------------------------------------
2023-10-25 14:38:17,970 epoch 2 - iter 260/2606 - loss 0.16010954 - time (sec): 14.25 - samples/sec: 2672.86 - lr: 0.000049 - momentum: 0.000000
2023-10-25 14:38:32,344 epoch 2 - iter 520/2606 - loss 0.16216140 - time (sec): 28.62 - samples/sec: 2671.12 - lr: 0.000049 - momentum: 0.000000
2023-10-25 14:38:47,501 epoch 2 - iter 780/2606 - loss 0.16202218 - time (sec): 43.78 - samples/sec: 2591.71 - lr: 0.000048 - momentum: 0.000000
2023-10-25 14:39:01,279 epoch 2 - iter 1040/2606 - loss 0.16269454 - time (sec): 57.56 - samples/sec: 2603.63 - lr: 0.000048 - momentum: 0.000000
2023-10-25 14:39:15,219 epoch 2 - iter 1300/2606 - loss 0.16161422 - time (sec): 71.50 - samples/sec: 2605.20 - lr: 0.000047 - momentum: 0.000000
2023-10-25 14:39:28,712 epoch 2 - iter 1560/2606 - loss 0.16050413 - time (sec): 84.99 - samples/sec: 2609.10 - lr: 0.000047 - momentum: 0.000000
2023-10-25 14:39:43,253 epoch 2 - iter 1820/2606 - loss 0.16221687 - time (sec): 99.53 - samples/sec: 2603.35 - lr: 0.000046 - momentum: 0.000000
2023-10-25 14:39:57,463 epoch 2 - iter 2080/2606 - loss 0.16080811 - time (sec): 113.74 - samples/sec: 2606.42 - lr: 0.000046 - momentum: 0.000000
2023-10-25 14:40:11,416 epoch 2 - iter 2340/2606 - loss 0.15978570 - time (sec): 127.69 - samples/sec: 2582.57 - lr: 0.000045 - momentum: 0.000000
2023-10-25 14:40:26,509 epoch 2 - iter 2600/2606 - loss 0.15840144 - time (sec): 142.79 - samples/sec: 2565.54 - lr: 0.000044 - momentum: 0.000000
2023-10-25 14:40:26,911 ----------------------------------------------------------------------------------------------------
2023-10-25 14:40:26,911 EPOCH 2 done: loss 0.1582 - lr: 0.000044
2023-10-25 14:40:34,413 DEV : loss 0.2031174749135971 - f1-score (micro avg) 0.3265
2023-10-25 14:40:34,438 saving best model
2023-10-25 14:40:35,160 ----------------------------------------------------------------------------------------------------
2023-10-25 14:40:49,669 epoch 3 - iter 260/2606 - loss 0.09809148 - time (sec): 14.51 - samples/sec: 2529.64 - lr: 0.000044 - momentum: 0.000000
2023-10-25 14:41:03,475 epoch 3 - iter 520/2606 - loss 0.11472092 - time (sec): 28.31 - samples/sec: 2528.48 - lr: 0.000043 - momentum: 0.000000
2023-10-25 14:41:17,287 epoch 3 - iter 780/2606 - loss 0.11436832 - time (sec): 42.13 - samples/sec: 2553.09 - lr: 0.000043 - momentum: 0.000000
2023-10-25 14:41:31,543 epoch 3 - iter 1040/2606 - loss 0.10975702 - time (sec): 56.38 - samples/sec: 2584.12 - lr: 0.000042 - momentum: 0.000000
2023-10-25 14:41:45,618 epoch 3 - iter 1300/2606 - loss 0.10721993 - time (sec): 70.46 - samples/sec: 2599.38 - lr: 0.000042 - momentum: 0.000000
2023-10-25 14:41:59,386 epoch 3 - iter 1560/2606 - loss 0.11156032 - time (sec): 84.22 - samples/sec: 2602.20 - lr: 0.000041 - momentum: 0.000000
2023-10-25 14:42:13,163 epoch 3 - iter 1820/2606 - loss 0.11371807 - time (sec): 98.00 - samples/sec: 2602.70 - lr: 0.000041 - momentum: 0.000000
2023-10-25 14:42:27,303 epoch 3 - iter 2080/2606 - loss 0.11301284 - time (sec): 112.14 - samples/sec: 2604.51 - lr: 0.000040 - momentum: 0.000000
2023-10-25 14:42:40,919 epoch 3 - iter 2340/2606 - loss 0.11223997 - time (sec): 125.76 - samples/sec: 2599.81 - lr: 0.000039 - momentum: 0.000000
2023-10-25 14:42:55,448 epoch 3 - iter 2600/2606 - loss 0.11074835 - time (sec): 140.29 - samples/sec: 2612.40 - lr: 0.000039 - momentum: 0.000000
2023-10-25 14:42:55,774 ----------------------------------------------------------------------------------------------------
2023-10-25 14:42:55,774 EPOCH 3 done: loss 0.1106 - lr: 0.000039
2023-10-25 14:43:02,639 DEV : loss 0.19291090965270996 - f1-score (micro avg) 0.3613
2023-10-25 14:43:02,664 saving best model
2023-10-25 14:43:03,328 ----------------------------------------------------------------------------------------------------
2023-10-25 14:43:17,067 epoch 4 - iter 260/2606 - loss 0.09500081 - time (sec): 13.74 - samples/sec: 2632.02 - lr: 0.000038 - momentum: 0.000000
2023-10-25 14:43:31,086 epoch 4 - iter 520/2606 - loss 0.09830307 - time (sec): 27.76 - samples/sec: 2617.40 - lr: 0.000038 - momentum: 0.000000
2023-10-25 14:43:46,455 epoch 4 - iter 780/2606 - loss 0.09455546 - time (sec): 43.13 - samples/sec: 2531.37 - lr: 0.000037 - momentum: 0.000000
2023-10-25 14:44:00,720 epoch 4 - iter 1040/2606 - loss 0.09159831 - time (sec): 57.39 - samples/sec: 2495.89 - lr: 0.000037 - momentum: 0.000000
2023-10-25 14:44:14,439 epoch 4 - iter 1300/2606 - loss 0.09101211 - time (sec): 71.11 - samples/sec: 2502.90 - lr: 0.000036 - momentum: 0.000000
2023-10-25 14:44:29,101 epoch 4 - iter 1560/2606 - loss 0.08649454 - time (sec): 85.77 - samples/sec: 2547.17 - lr: 0.000036 - momentum: 0.000000
2023-10-25 14:44:42,716 epoch 4 - iter 1820/2606 - loss 0.08545536 - time (sec): 99.39 - samples/sec: 2538.89 - lr: 0.000035 - momentum: 0.000000
2023-10-25 14:44:57,358 epoch 4 - iter 2080/2606 - loss 0.08615483 - time (sec): 114.03 - samples/sec: 2542.81 - lr: 0.000034 - momentum: 0.000000
2023-10-25 14:45:11,633 epoch 4 - iter 2340/2606 - loss 0.08630774 - time (sec): 128.30 - samples/sec: 2544.39 - lr: 0.000034 - momentum: 0.000000
2023-10-25 14:45:26,422 epoch 4 - iter 2600/2606 - loss 0.08626198 - time (sec): 143.09 - samples/sec: 2559.67 - lr: 0.000033 - momentum: 0.000000
2023-10-25 14:45:26,820 ----------------------------------------------------------------------------------------------------
2023-10-25 14:45:26,820 EPOCH 4 done: loss 0.0863 - lr: 0.000033
2023-10-25 14:45:33,781 DEV : loss 0.2698976397514343 - f1-score (micro avg) 0.3764
2023-10-25 14:45:33,806 saving best model
2023-10-25 14:45:34,465 ----------------------------------------------------------------------------------------------------
2023-10-25 14:45:48,639 epoch 5 - iter 260/2606 - loss 0.06464820 - time (sec): 14.17 - samples/sec: 2626.43 - lr: 0.000033 - momentum: 0.000000
2023-10-25 14:46:03,035 epoch 5 - iter 520/2606 - loss 0.06110421 - time (sec): 28.57 - samples/sec: 2543.20 - lr: 0.000032 - momentum: 0.000000
2023-10-25 14:46:17,861 epoch 5 - iter 780/2606 - loss 0.06016891 - time (sec): 43.39 - samples/sec: 2548.67 - lr: 0.000032 - momentum: 0.000000
2023-10-25 14:46:32,445 epoch 5 - iter 1040/2606 - loss 0.06173170 - time (sec): 57.98 - samples/sec: 2555.72 - lr: 0.000031 - momentum: 0.000000
2023-10-25 14:46:46,809 epoch 5 - iter 1300/2606 - loss 0.06133749 - time (sec): 72.34 - samples/sec: 2522.61 - lr: 0.000031 - momentum: 0.000000
2023-10-25 14:47:01,071 epoch 5 - iter 1560/2606 - loss 0.06007684 - time (sec): 86.60 - samples/sec: 2530.55 - lr: 0.000030 - momentum: 0.000000
2023-10-25 14:47:15,449 epoch 5 - iter 1820/2606 - loss 0.06078408 - time (sec): 100.98 - samples/sec: 2533.31 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:47:29,645 epoch 5 - iter 2080/2606 - loss 0.06059075 - time (sec): 115.18 - samples/sec: 2552.45 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:47:44,001 epoch 5 - iter 2340/2606 - loss 0.06046212 - time (sec): 129.53 - samples/sec: 2566.87 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:47:57,841 epoch 5 - iter 2600/2606 - loss 0.06127827 - time (sec): 143.37 - samples/sec: 2556.47 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:47:58,210 ----------------------------------------------------------------------------------------------------
2023-10-25 14:47:58,210 EPOCH 5 done: loss 0.0613 - lr: 0.000028
2023-10-25 14:48:05,406 DEV : loss 0.3795294165611267 - f1-score (micro avg) 0.3326
2023-10-25 14:48:05,434 ----------------------------------------------------------------------------------------------------
2023-10-25 14:48:20,290 epoch 6 - iter 260/2606 - loss 0.03491277 - time (sec): 14.86 - samples/sec: 2652.35 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:48:35,779 epoch 6 - iter 520/2606 - loss 0.04096153 - time (sec): 30.34 - samples/sec: 2591.06 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:48:49,582 epoch 6 - iter 780/2606 - loss 0.04201657 - time (sec): 44.15 - samples/sec: 2561.47 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:49:03,570 epoch 6 - iter 1040/2606 - loss 0.04304112 - time (sec): 58.13 - samples/sec: 2565.68 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:49:17,029 epoch 6 - iter 1300/2606 - loss 0.04336291 - time (sec): 71.59 - samples/sec: 2550.43 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:49:31,156 epoch 6 - iter 1560/2606 - loss 0.04379894 - time (sec): 85.72 - samples/sec: 2550.97 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:49:45,284 epoch 6 - iter 1820/2606 - loss 0.04398689 - time (sec): 99.85 - samples/sec: 2566.43 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:49:59,537 epoch 6 - iter 2080/2606 - loss 0.04462424 - time (sec): 114.10 - samples/sec: 2575.83 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:50:13,887 epoch 6 - iter 2340/2606 - loss 0.04389891 - time (sec): 128.45 - samples/sec: 2578.71 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:50:27,662 epoch 6 - iter 2600/2606 - loss 0.04366628 - time (sec): 142.23 - samples/sec: 2572.78 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:50:28,029 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:28,030 EPOCH 6 done: loss 0.0437 - lr: 0.000022
2023-10-25 14:50:34,281 DEV : loss 0.3496846556663513 - f1-score (micro avg) 0.3677
2023-10-25 14:50:34,307 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:49,498 epoch 7 - iter 260/2606 - loss 0.03844051 - time (sec): 15.19 - samples/sec: 2291.49 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:51:04,819 epoch 7 - iter 520/2606 - loss 0.03763631 - time (sec): 30.51 - samples/sec: 2355.19 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:51:19,867 epoch 7 - iter 780/2606 - loss 0.04085011 - time (sec): 45.56 - samples/sec: 2305.17 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:51:35,277 epoch 7 - iter 1040/2606 - loss 0.04072254 - time (sec): 60.97 - samples/sec: 2328.77 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:51:49,777 epoch 7 - iter 1300/2606 - loss 0.04275275 - time (sec): 75.47 - samples/sec: 2367.75 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:52:04,500 epoch 7 - iter 1560/2606 - loss 0.04412206 - time (sec): 90.19 - samples/sec: 2437.18 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:52:19,146 epoch 7 - iter 1820/2606 - loss 0.04560131 - time (sec): 104.84 - samples/sec: 2483.64 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:52:33,348 epoch 7 - iter 2080/2606 - loss 0.04383087 - time (sec): 119.04 - samples/sec: 2498.57 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:52:47,443 epoch 7 - iter 2340/2606 - loss 0.04287800 - time (sec): 133.13 - samples/sec: 2508.05 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:53:00,987 epoch 7 - iter 2600/2606 - loss 0.04293301 - time (sec): 146.68 - samples/sec: 2499.36 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:53:01,273 ----------------------------------------------------------------------------------------------------
2023-10-25 14:53:01,273 EPOCH 7 done: loss 0.0430 - lr: 0.000017
2023-10-25 14:53:07,539 DEV : loss 0.36888352036476135 - f1-score (micro avg) 0.3547
2023-10-25 14:53:07,566 ----------------------------------------------------------------------------------------------------
2023-10-25 14:53:22,206 epoch 8 - iter 260/2606 - loss 0.02816273 - time (sec): 14.64 - samples/sec: 2654.17 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:53:36,106 epoch 8 - iter 520/2606 - loss 0.02706506 - time (sec): 28.54 - samples/sec: 2565.72 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:53:50,421 epoch 8 - iter 780/2606 - loss 0.03077908 - time (sec): 42.85 - samples/sec: 2559.81 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:54:04,697 epoch 8 - iter 1040/2606 - loss 0.03177140 - time (sec): 57.13 - samples/sec: 2549.73 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:54:19,291 epoch 8 - iter 1300/2606 - loss 0.03241797 - time (sec): 71.72 - samples/sec: 2560.08 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:54:34,241 epoch 8 - iter 1560/2606 - loss 0.04472240 - time (sec): 86.67 - samples/sec: 2569.48 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:54:48,203 epoch 8 - iter 1820/2606 - loss 0.06082510 - time (sec): 100.64 - samples/sec: 2564.29 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:55:03,356 epoch 8 - iter 2080/2606 - loss 0.07237813 - time (sec): 115.79 - samples/sec: 2590.36 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:55:17,114 epoch 8 - iter 2340/2606 - loss 0.07735544 - time (sec): 129.55 - samples/sec: 2577.12 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:55:30,594 epoch 8 - iter 2600/2606 - loss 0.08074278 - time (sec): 143.03 - samples/sec: 2562.17 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:55:31,015 ----------------------------------------------------------------------------------------------------
2023-10-25 14:55:31,015 EPOCH 8 done: loss 0.0806 - lr: 0.000011
2023-10-25 14:55:37,310 DEV : loss 0.3248702585697174 - f1-score (micro avg) 0.2139
2023-10-25 14:55:37,335 ----------------------------------------------------------------------------------------------------
2023-10-25 14:55:51,369 epoch 9 - iter 260/2606 - loss 0.09042002 - time (sec): 14.03 - samples/sec: 2440.70 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:56:05,964 epoch 9 - iter 520/2606 - loss 0.10574243 - time (sec): 28.63 - samples/sec: 2529.19 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:56:20,271 epoch 9 - iter 780/2606 - loss 0.10510265 - time (sec): 42.93 - samples/sec: 2535.61 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:56:34,213 epoch 9 - iter 1040/2606 - loss 0.11581644 - time (sec): 56.88 - samples/sec: 2537.07 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:56:48,138 epoch 9 - iter 1300/2606 - loss 0.13853151 - time (sec): 70.80 - samples/sec: 2534.52 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:57:02,718 epoch 9 - iter 1560/2606 - loss 0.15621653 - time (sec): 85.38 - samples/sec: 2538.87 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:57:16,241 epoch 9 - iter 1820/2606 - loss 0.16924463 - time (sec): 98.90 - samples/sec: 2563.69 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:57:31,403 epoch 9 - iter 2080/2606 - loss 0.17046855 - time (sec): 114.07 - samples/sec: 2567.41 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:57:45,821 epoch 9 - iter 2340/2606 - loss 0.17400048 - time (sec): 128.48 - samples/sec: 2576.01 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:57:59,796 epoch 9 - iter 2600/2606 - loss 0.17816634 - time (sec): 142.46 - samples/sec: 2574.61 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:58:00,124 ----------------------------------------------------------------------------------------------------
2023-10-25 14:58:00,124 EPOCH 9 done: loss 0.1782 - lr: 0.000006
2023-10-25 14:58:06,453 DEV : loss 0.22723859548568726 - f1-score (micro avg) 0.0329
2023-10-25 14:58:06,479 ----------------------------------------------------------------------------------------------------
2023-10-25 14:58:20,284 epoch 10 - iter 260/2606 - loss 0.19008451 - time (sec): 13.80 - samples/sec: 2551.98 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:58:34,669 epoch 10 - iter 520/2606 - loss 0.19479366 - time (sec): 28.19 - samples/sec: 2599.77 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:58:48,736 epoch 10 - iter 780/2606 - loss 0.19834988 - time (sec): 42.26 - samples/sec: 2530.31 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:59:02,964 epoch 10 - iter 1040/2606 - loss 0.19699333 - time (sec): 56.48 - samples/sec: 2567.60 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:59:17,539 epoch 10 - iter 1300/2606 - loss 0.18959408 - time (sec): 71.06 - samples/sec: 2580.81 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:59:32,058 epoch 10 - iter 1560/2606 - loss 0.18754436 - time (sec): 85.58 - samples/sec: 2585.89 - lr: 0.000002 - momentum: 0.000000
2023-10-25 14:59:46,141 epoch 10 - iter 1820/2606 - loss 0.19219019 - time (sec): 99.66 - samples/sec: 2586.46 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:00:00,445 epoch 10 - iter 2080/2606 - loss 0.19419016 - time (sec): 113.96 - samples/sec: 2574.81 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:00:14,275 epoch 10 - iter 2340/2606 - loss 0.19370314 - time (sec): 127.79 - samples/sec: 2571.63 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:00:29,133 epoch 10 - iter 2600/2606 - loss 0.19280325 - time (sec): 142.65 - samples/sec: 2569.33 - lr: 0.000000 - momentum: 0.000000
2023-10-25 15:00:29,439 ----------------------------------------------------------------------------------------------------
2023-10-25 15:00:29,439 EPOCH 10 done: loss 0.1929 - lr: 0.000000
2023-10-25 15:00:36,359 DEV : loss 0.25543084740638733 - f1-score (micro avg) 0.0519
2023-10-25 15:00:37,019 ----------------------------------------------------------------------------------------------------
2023-10-25 15:00:37,020 Loading model from best epoch ...
2023-10-25 15:00:39,021 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 15:00:49,051
Results:
- F-score (micro) 0.4518
- F-score (macro) 0.3002
- Accuracy 0.2956
By class:
              precision    recall  f1-score   support
         LOC     0.4964    0.5700    0.5307      1214
         PER     0.3949    0.4394    0.4159       808
         ORG     0.2628    0.2465    0.2544       353
   HumanProd     0.0000    0.0000    0.0000        15
   micro avg     0.4312    0.4745    0.4518      2390
   macro avg     0.2885    0.3140    0.3002      2390
weighted avg     0.4245    0.4745    0.4477      2390
2023-10-25 15:00:49,051 ----------------------------------------------------------------------------------------------------
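
The log above selects the final model by dev micro-F1 (see the "saving best model" lines). As a rough sketch of how that selection could be recovered from the log text alone, the snippet below scans the "EPOCH n done" and "DEV :" lines and reports the epoch with the highest dev score. The function name `best_dev_epoch` and both regular expressions are assumptions made for illustration. They are not part of Flair and only match the line format seen in this log. The sample lines are copied from epochs 3 to 5 above.

```python
import re

# Hypothetical patterns (not a Flair API); they mirror the line format in this log.
EPOCH_DONE = re.compile(r"EPOCH (?P<epoch>\d+) done")
DEV_LINE = re.compile(r"DEV : loss [\d.]+ - f1-score \(micro avg\) (?P<f1>[\d.]+)")


def best_dev_epoch(log_text):
    """Return (epoch, dev_f1) for the highest dev micro-F1 found in log_text."""
    scores = {}
    current_epoch = None
    for line in log_text.splitlines():
        done = EPOCH_DONE.search(line)
        if done:
            # Remember which epoch the next DEV line belongs to.
            current_epoch = int(done.group("epoch"))
            continue
        dev = DEV_LINE.search(line)
        if dev and current_epoch is not None:
            scores[current_epoch] = float(dev.group("f1"))
    if not scores:
        raise ValueError("no DEV lines found in log text")
    # On ties, max() keeps the earliest epoch because dicts preserve insertion order.
    best = max(scores, key=scores.get)
    return best, scores[best]


# Lines copied from epochs 3 to 5 of the log above.
sample = """\
2023-10-25 14:42:55,774 EPOCH 3 done: loss 0.1106 - lr: 0.000039
2023-10-25 14:43:02,639 DEV : loss 0.19291090965270996 - f1-score (micro avg) 0.3613
2023-10-25 14:45:26,820 EPOCH 4 done: loss 0.0863 - lr: 0.000033
2023-10-25 14:45:33,781 DEV : loss 0.2698976397514343 - f1-score (micro avg) 0.3764
2023-10-25 14:47:58,210 EPOCH 5 done: loss 0.0613 - lr: 0.000028
2023-10-25 14:48:05,406 DEV : loss 0.3795294165611267 - f1-score (micro avg) 0.3326
"""

print(best_dev_epoch(sample))  # (4, 0.3764)
```

On this sample the sketch picks epoch 4, which agrees with the last "saving best model" entry in the log. The test-set results reported at the end were therefore presumably produced by that checkpoint, not by the degraded epochs 8 to 10.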