2023-10-16 20:08:16,375 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
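(Note: the layer shapes in the summary above allow a back-of-the-envelope parameter count. This is a sketch derived only from the printed shapes, counting weights, biases, and LayerNorm affine parameters; it is not an official figure from the log.)

```python
# Rough parameter count for the architecture printed above,
# computed from the shapes in the model summary.
def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus bias of a Linear(in_features=n_in, out_features=n_out)."""
    return n_in * n_out + n_out

hidden, inter, vocab, n_tags = 768, 3072, 32001, 17

# word + position + token-type embeddings, plus LayerNorm (weight and bias)
embeddings = vocab * hidden + 512 * hidden + 2 * hidden + 2 * hidden

per_layer = (
    3 * linear_params(hidden, hidden)   # query / key / value
    + linear_params(hidden, hidden)     # attention output dense
    + 2 * hidden                        # attention LayerNorm
    + linear_params(hidden, inter)      # intermediate dense
    + linear_params(inter, hidden)      # output dense
    + 2 * hidden                        # output LayerNorm
)

encoder = 12 * per_layer                # 12 x BertLayer
pooler = linear_params(hidden, hidden)
tagger_head = linear_params(hidden, n_tags)  # the 17-tag linear head

total = embeddings + encoder + pooler + tagger_head
print(f"per layer: {per_layer:,}  total: {total:,}")  # ~110.6M parameters
```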
2023-10-16 20:08:16,376 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-16 20:08:16,376 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 Train: 1085 sentences
2023-10-16 20:08:16,376 (train_with_dev=False, train_with_test=False)
2023-10-16 20:08:16,376 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 Training Params:
2023-10-16 20:08:16,376 - learning_rate: "3e-05"
2023-10-16 20:08:16,376 - mini_batch_size: "4"
2023-10-16 20:08:16,376 - max_epochs: "10"
2023-10-16 20:08:16,376 - shuffle: "True"
2023-10-16 20:08:16,376 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 Plugins:
2023-10-16 20:08:16,376 - LinearScheduler | warmup_fraction: '0.1'
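(Note: the lr column in the epoch logs reflects this plugin: linear warmup over the first 10% of steps up to the peak of 3e-05, then linear decay to zero. A minimal sketch of that schedule, assuming 272 batches x 10 epochs = 2720 total steps as implied by the iteration counts in this log; Flair's actual LinearScheduler implementation may differ in small details:)

```python
def linear_schedule(step: int, total_steps: int = 2720,
                    warmup_fraction: float = 0.1, peak_lr: float = 3e-5) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)  # 272 steps = epoch 1 here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Reproduces the lr column below: end of epoch 1 (step 270) is near the
# 3e-05 peak, end of epoch 2 (step 544) has decayed to about 2.7e-05.
print(round(linear_schedule(270), 6), round(linear_schedule(544), 6))
```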
2023-10-16 20:08:16,376 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 20:08:16,376 - metric: "('micro avg', 'f1-score')"
2023-10-16 20:08:16,376 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 Computation:
2023-10-16 20:08:16,377 - compute on device: cuda:0
2023-10-16 20:08:16,377 - embedding storage: none
2023-10-16 20:08:16,377 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,377 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-16 20:08:16,377 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,377 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:17,873 epoch 1 - iter 27/272 - loss 3.03869039 - time (sec): 1.50 - samples/sec: 3386.21 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:08:19,637 epoch 1 - iter 54/272 - loss 2.61368780 - time (sec): 3.26 - samples/sec: 3353.24 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:08:21,288 epoch 1 - iter 81/272 - loss 1.99970409 - time (sec): 4.91 - samples/sec: 3317.85 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:08:22,892 epoch 1 - iter 108/272 - loss 1.60221038 - time (sec): 6.51 - samples/sec: 3384.49 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:08:24,494 epoch 1 - iter 135/272 - loss 1.36857841 - time (sec): 8.12 - samples/sec: 3404.33 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:08:25,924 epoch 1 - iter 162/272 - loss 1.24173113 - time (sec): 9.55 - samples/sec: 3370.84 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:08:27,471 epoch 1 - iter 189/272 - loss 1.11940563 - time (sec): 11.09 - samples/sec: 3341.30 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:08:28,953 epoch 1 - iter 216/272 - loss 1.02508320 - time (sec): 12.58 - samples/sec: 3329.53 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:08:30,571 epoch 1 - iter 243/272 - loss 0.93120235 - time (sec): 14.19 - samples/sec: 3329.49 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:08:32,030 epoch 1 - iter 270/272 - loss 0.87450918 - time (sec): 15.65 - samples/sec: 3306.80 - lr: 0.000030 - momentum: 0.000000
2023-10-16 20:08:32,123 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:32,123 EPOCH 1 done: loss 0.8714 - lr: 0.000030
2023-10-16 20:08:33,138 DEV : loss 0.16372087597846985 - f1-score (micro avg) 0.6596
2023-10-16 20:08:33,142 saving best model
2023-10-16 20:08:33,504 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:34,972 epoch 2 - iter 27/272 - loss 0.19904837 - time (sec): 1.47 - samples/sec: 3231.28 - lr: 0.000030 - momentum: 0.000000
2023-10-16 20:08:36,508 epoch 2 - iter 54/272 - loss 0.16653628 - time (sec): 3.00 - samples/sec: 3310.15 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:08:38,099 epoch 2 - iter 81/272 - loss 0.15689792 - time (sec): 4.59 - samples/sec: 3349.68 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:08:39,541 epoch 2 - iter 108/272 - loss 0.16347169 - time (sec): 6.04 - samples/sec: 3364.95 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:08:41,015 epoch 2 - iter 135/272 - loss 0.17363768 - time (sec): 7.51 - samples/sec: 3300.26 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:08:42,594 epoch 2 - iter 162/272 - loss 0.17440064 - time (sec): 9.09 - samples/sec: 3269.51 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:08:44,277 epoch 2 - iter 189/272 - loss 0.16931364 - time (sec): 10.77 - samples/sec: 3271.48 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:08:45,872 epoch 2 - iter 216/272 - loss 0.16682327 - time (sec): 12.37 - samples/sec: 3321.29 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:08:47,465 epoch 2 - iter 243/272 - loss 0.16163978 - time (sec): 13.96 - samples/sec: 3336.00 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:08:49,024 epoch 2 - iter 270/272 - loss 0.15802832 - time (sec): 15.52 - samples/sec: 3344.12 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:08:49,108 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:49,108 EPOCH 2 done: loss 0.1578 - lr: 0.000027
2023-10-16 20:08:50,522 DEV : loss 0.124007448554039 - f1-score (micro avg) 0.7647
2023-10-16 20:08:50,526 saving best model
2023-10-16 20:08:50,982 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:52,657 epoch 3 - iter 27/272 - loss 0.10639237 - time (sec): 1.67 - samples/sec: 3479.95 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:08:54,265 epoch 3 - iter 54/272 - loss 0.09668779 - time (sec): 3.28 - samples/sec: 3546.28 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:08:55,779 epoch 3 - iter 81/272 - loss 0.09388957 - time (sec): 4.80 - samples/sec: 3414.38 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:08:57,368 epoch 3 - iter 108/272 - loss 0.10422929 - time (sec): 6.38 - samples/sec: 3362.35 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:08:58,804 epoch 3 - iter 135/272 - loss 0.10468543 - time (sec): 7.82 - samples/sec: 3316.85 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:09:00,296 epoch 3 - iter 162/272 - loss 0.10192530 - time (sec): 9.31 - samples/sec: 3305.64 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:09:01,973 epoch 3 - iter 189/272 - loss 0.09803531 - time (sec): 10.99 - samples/sec: 3351.01 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:09:03,442 epoch 3 - iter 216/272 - loss 0.09598613 - time (sec): 12.46 - samples/sec: 3328.11 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:09:04,960 epoch 3 - iter 243/272 - loss 0.09428346 - time (sec): 13.98 - samples/sec: 3330.45 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:09:06,593 epoch 3 - iter 270/272 - loss 0.09356033 - time (sec): 15.61 - samples/sec: 3315.79 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:09:06,686 ----------------------------------------------------------------------------------------------------
2023-10-16 20:09:06,686 EPOCH 3 done: loss 0.0934 - lr: 0.000023
2023-10-16 20:09:08,109 DEV : loss 0.11912587285041809 - f1-score (micro avg) 0.7703
2023-10-16 20:09:08,113 saving best model
2023-10-16 20:09:08,574 ----------------------------------------------------------------------------------------------------
2023-10-16 20:09:10,270 epoch 4 - iter 27/272 - loss 0.06533739 - time (sec): 1.69 - samples/sec: 3074.95 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:09:11,827 epoch 4 - iter 54/272 - loss 0.05477897 - time (sec): 3.25 - samples/sec: 3259.29 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:09:13,351 epoch 4 - iter 81/272 - loss 0.04947201 - time (sec): 4.77 - samples/sec: 3350.20 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:09:14,941 epoch 4 - iter 108/272 - loss 0.05320227 - time (sec): 6.36 - samples/sec: 3384.90 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:09:16,458 epoch 4 - iter 135/272 - loss 0.05332532 - time (sec): 7.88 - samples/sec: 3301.85 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:09:18,246 epoch 4 - iter 162/272 - loss 0.05318251 - time (sec): 9.67 - samples/sec: 3310.54 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:09:19,815 epoch 4 - iter 189/272 - loss 0.05540684 - time (sec): 11.24 - samples/sec: 3289.42 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:09:21,646 epoch 4 - iter 216/272 - loss 0.05287585 - time (sec): 13.07 - samples/sec: 3221.21 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:09:23,205 epoch 4 - iter 243/272 - loss 0.05039066 - time (sec): 14.63 - samples/sec: 3246.02 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:09:24,612 epoch 4 - iter 270/272 - loss 0.05055159 - time (sec): 16.03 - samples/sec: 3234.24 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:09:24,707 ----------------------------------------------------------------------------------------------------
2023-10-16 20:09:24,708 EPOCH 4 done: loss 0.0504 - lr: 0.000020
2023-10-16 20:09:26,163 DEV : loss 0.13422761857509613 - f1-score (micro avg) 0.7808
2023-10-16 20:09:26,168 saving best model
2023-10-16 20:09:26,610 ----------------------------------------------------------------------------------------------------
2023-10-16 20:09:28,226 epoch 5 - iter 27/272 - loss 0.03954786 - time (sec): 1.61 - samples/sec: 3061.53 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:09:29,701 epoch 5 - iter 54/272 - loss 0.03936235 - time (sec): 3.09 - samples/sec: 3054.57 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:09:31,220 epoch 5 - iter 81/272 - loss 0.04126617 - time (sec): 4.61 - samples/sec: 3255.14 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:09:32,926 epoch 5 - iter 108/272 - loss 0.04314098 - time (sec): 6.32 - samples/sec: 3195.32 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:09:34,448 epoch 5 - iter 135/272 - loss 0.04076125 - time (sec): 7.84 - samples/sec: 3187.96 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:09:35,991 epoch 5 - iter 162/272 - loss 0.03995894 - time (sec): 9.38 - samples/sec: 3211.17 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:09:37,641 epoch 5 - iter 189/272 - loss 0.03840579 - time (sec): 11.03 - samples/sec: 3236.95 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:09:39,245 epoch 5 - iter 216/272 - loss 0.03843528 - time (sec): 12.63 - samples/sec: 3251.87 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:09:40,775 epoch 5 - iter 243/272 - loss 0.03804151 - time (sec): 14.16 - samples/sec: 3261.62 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:09:42,284 epoch 5 - iter 270/272 - loss 0.03802072 - time (sec): 15.67 - samples/sec: 3278.84 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:09:42,478 ----------------------------------------------------------------------------------------------------
2023-10-16 20:09:42,478 EPOCH 5 done: loss 0.0376 - lr: 0.000017
2023-10-16 20:09:43,958 DEV : loss 0.1476840376853943 - f1-score (micro avg) 0.8111
2023-10-16 20:09:43,963 saving best model
2023-10-16 20:09:44,440 ----------------------------------------------------------------------------------------------------
2023-10-16 20:09:45,995 epoch 6 - iter 27/272 - loss 0.02354845 - time (sec): 1.55 - samples/sec: 2910.31 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:09:47,584 epoch 6 - iter 54/272 - loss 0.02510456 - time (sec): 3.14 - samples/sec: 3012.29 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:09:49,317 epoch 6 - iter 81/272 - loss 0.02501468 - time (sec): 4.87 - samples/sec: 3132.48 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:09:51,157 epoch 6 - iter 108/272 - loss 0.02624276 - time (sec): 6.71 - samples/sec: 3211.82 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:09:52,701 epoch 6 - iter 135/272 - loss 0.02955035 - time (sec): 8.26 - samples/sec: 3249.35 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:09:54,174 epoch 6 - iter 162/272 - loss 0.03040959 - time (sec): 9.73 - samples/sec: 3190.36 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:09:55,753 epoch 6 - iter 189/272 - loss 0.03027120 - time (sec): 11.31 - samples/sec: 3171.70 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:09:57,356 epoch 6 - iter 216/272 - loss 0.02953089 - time (sec): 12.91 - samples/sec: 3212.26 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:09:58,957 epoch 6 - iter 243/272 - loss 0.02753869 - time (sec): 14.51 - samples/sec: 3255.77 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:10:00,376 epoch 6 - iter 270/272 - loss 0.02794685 - time (sec): 15.93 - samples/sec: 3252.15 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:10:00,470 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:00,470 EPOCH 6 done: loss 0.0283 - lr: 0.000013
2023-10-16 20:10:01,986 DEV : loss 0.14637169241905212 - f1-score (micro avg) 0.8147
2023-10-16 20:10:01,991 saving best model
2023-10-16 20:10:02,502 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:04,065 epoch 7 - iter 27/272 - loss 0.01433158 - time (sec): 1.56 - samples/sec: 3247.49 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:10:05,829 epoch 7 - iter 54/272 - loss 0.01183116 - time (sec): 3.33 - samples/sec: 3389.33 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:10:07,497 epoch 7 - iter 81/272 - loss 0.01733263 - time (sec): 4.99 - samples/sec: 3317.72 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:10:09,156 epoch 7 - iter 108/272 - loss 0.02093726 - time (sec): 6.65 - samples/sec: 3348.06 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:10:10,693 epoch 7 - iter 135/272 - loss 0.02097625 - time (sec): 8.19 - samples/sec: 3298.29 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:10:12,346 epoch 7 - iter 162/272 - loss 0.02032468 - time (sec): 9.84 - samples/sec: 3335.63 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:10:13,932 epoch 7 - iter 189/272 - loss 0.01915034 - time (sec): 11.43 - samples/sec: 3301.37 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:10:15,419 epoch 7 - iter 216/272 - loss 0.01901545 - time (sec): 12.92 - samples/sec: 3275.63 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:10:17,028 epoch 7 - iter 243/272 - loss 0.01969638 - time (sec): 14.52 - samples/sec: 3249.17 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:10:18,534 epoch 7 - iter 270/272 - loss 0.02060808 - time (sec): 16.03 - samples/sec: 3227.84 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:10:18,630 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:18,630 EPOCH 7 done: loss 0.0205 - lr: 0.000010
2023-10-16 20:10:20,635 DEV : loss 0.1661161333322525 - f1-score (micro avg) 0.781
2023-10-16 20:10:20,645 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:22,161 epoch 8 - iter 27/272 - loss 0.01704553 - time (sec): 1.51 - samples/sec: 2895.07 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:10:23,681 epoch 8 - iter 54/272 - loss 0.02020109 - time (sec): 3.03 - samples/sec: 3024.26 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:10:25,148 epoch 8 - iter 81/272 - loss 0.01720121 - time (sec): 4.50 - samples/sec: 3068.64 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:10:26,765 epoch 8 - iter 108/272 - loss 0.01554735 - time (sec): 6.12 - samples/sec: 3136.58 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:10:28,432 epoch 8 - iter 135/272 - loss 0.01454416 - time (sec): 7.79 - samples/sec: 3170.01 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:10:30,022 epoch 8 - iter 162/272 - loss 0.01429672 - time (sec): 9.38 - samples/sec: 3191.64 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:10:31,677 epoch 8 - iter 189/272 - loss 0.01340823 - time (sec): 11.03 - samples/sec: 3241.91 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:10:33,293 epoch 8 - iter 216/272 - loss 0.01381698 - time (sec): 12.65 - samples/sec: 3214.06 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:10:35,051 epoch 8 - iter 243/272 - loss 0.01311001 - time (sec): 14.40 - samples/sec: 3239.09 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:10:36,567 epoch 8 - iter 270/272 - loss 0.01358429 - time (sec): 15.92 - samples/sec: 3247.18 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:10:36,672 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:36,672 EPOCH 8 done: loss 0.0135 - lr: 0.000007
2023-10-16 20:10:38,151 DEV : loss 0.15581990778446198 - f1-score (micro avg) 0.8202
2023-10-16 20:10:38,155 saving best model
2023-10-16 20:10:38,576 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:40,287 epoch 9 - iter 27/272 - loss 0.00855607 - time (sec): 1.71 - samples/sec: 3430.97 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:10:41,832 epoch 9 - iter 54/272 - loss 0.00640941 - time (sec): 3.25 - samples/sec: 3389.45 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:10:43,315 epoch 9 - iter 81/272 - loss 0.00951773 - time (sec): 4.74 - samples/sec: 3242.50 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:10:44,925 epoch 9 - iter 108/272 - loss 0.00906601 - time (sec): 6.35 - samples/sec: 3276.97 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:10:46,512 epoch 9 - iter 135/272 - loss 0.00901014 - time (sec): 7.93 - samples/sec: 3307.04 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:10:48,086 epoch 9 - iter 162/272 - loss 0.00920996 - time (sec): 9.51 - samples/sec: 3316.02 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:10:49,743 epoch 9 - iter 189/272 - loss 0.00916628 - time (sec): 11.17 - samples/sec: 3275.70 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:10:51,431 epoch 9 - iter 216/272 - loss 0.00950579 - time (sec): 12.85 - samples/sec: 3302.60 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:10:53,012 epoch 9 - iter 243/272 - loss 0.00904451 - time (sec): 14.43 - samples/sec: 3292.51 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:10:54,523 epoch 9 - iter 270/272 - loss 0.01016922 - time (sec): 15.95 - samples/sec: 3254.14 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:10:54,612 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:54,612 EPOCH 9 done: loss 0.0103 - lr: 0.000003
2023-10-16 20:10:56,087 DEV : loss 0.16827048361301422 - f1-score (micro avg) 0.7978
2023-10-16 20:10:56,091 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:57,763 epoch 10 - iter 27/272 - loss 0.00219439 - time (sec): 1.67 - samples/sec: 3283.09 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:10:59,157 epoch 10 - iter 54/272 - loss 0.00215367 - time (sec): 3.06 - samples/sec: 3131.12 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:11:00,669 epoch 10 - iter 81/272 - loss 0.00391488 - time (sec): 4.58 - samples/sec: 3096.20 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:11:02,286 epoch 10 - iter 108/272 - loss 0.00571526 - time (sec): 6.19 - samples/sec: 3153.11 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:11:03,838 epoch 10 - iter 135/272 - loss 0.00553198 - time (sec): 7.75 - samples/sec: 3185.79 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:11:05,318 epoch 10 - iter 162/272 - loss 0.00540099 - time (sec): 9.22 - samples/sec: 3154.28 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:11:07,006 epoch 10 - iter 189/272 - loss 0.00681792 - time (sec): 10.91 - samples/sec: 3200.59 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:11:08,579 epoch 10 - iter 216/272 - loss 0.00808721 - time (sec): 12.49 - samples/sec: 3222.74 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:11:10,229 epoch 10 - iter 243/272 - loss 0.00790858 - time (sec): 14.14 - samples/sec: 3235.57 - lr: 0.000000 - momentum: 0.000000
2023-10-16 20:11:11,880 epoch 10 - iter 270/272 - loss 0.00880521 - time (sec): 15.79 - samples/sec: 3272.87 - lr: 0.000000 - momentum: 0.000000
2023-10-16 20:11:11,972 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:11,972 EPOCH 10 done: loss 0.0088 - lr: 0.000000
2023-10-16 20:11:13,486 DEV : loss 0.16773611307144165 - f1-score (micro avg) 0.8051
2023-10-16 20:11:13,859 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:13,860 Loading model from best epoch ...
2023-10-16 20:11:15,379 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-16 20:11:18,000
Results:
- F-score (micro) 0.7798
- F-score (macro) 0.724
- Accuracy 0.6534
By class:
              precision    recall  f1-score   support

         LOC     0.8114    0.8686    0.8390       312
         PER     0.6923    0.8654    0.7692       208
         ORG     0.4375    0.3818    0.4078        55
   HumanProd     0.7857    1.0000    0.8800        22

   micro avg     0.7373    0.8275    0.7798       597
   macro avg     0.6817    0.7789    0.7240       597
weighted avg     0.7345    0.8275    0.7765       597
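(Note: the summary scores are consistent with the per-class table: micro F1 is the harmonic mean of the micro-averaged precision and recall, and macro F1 is the unweighted mean of the four class F1 scores. A quick check, pure arithmetic on the numbers reported above:)

```python
# Per-class F1 scores and micro-averaged precision/recall from the table.
class_f1 = {"LOC": 0.8390, "PER": 0.7692, "ORG": 0.4078, "HumanProd": 0.8800}
micro_p, micro_r = 0.7373, 0.8275

# Micro F1 = harmonic mean of micro precision and micro recall.
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
# Macro F1 = unweighted mean of the per-class F1 scores.
macro_f1 = sum(class_f1.values()) / len(class_f1)

print(round(micro_f1, 4))  # 0.7798, the reported F-score (micro)
print(round(macro_f1, 4))  # 0.724, the reported F-score (macro)
```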
2023-10-16 20:11:18,001 ----------------------------------------------------------------------------------------------------