2023-10-16 20:08:16,375 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 20:08:16,376 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-16 20:08:16,376 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 Train: 1085 sentences
2023-10-16 20:08:16,376 (train_with_dev=False, train_with_test=False)
2023-10-16 20:08:16,376 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 Training Params:
2023-10-16 20:08:16,376 - learning_rate: "3e-05"
2023-10-16 20:08:16,376 - mini_batch_size: "4"
2023-10-16 20:08:16,376 - max_epochs: "10"
2023-10-16 20:08:16,376 - shuffle: "True"
2023-10-16 20:08:16,376 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 Plugins:
2023-10-16 20:08:16,376 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 20:08:16,376 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 20:08:16,376 - metric: "('micro avg', 'f1-score')"
2023-10-16 20:08:16,376 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,376 Computation:
2023-10-16 20:08:16,377 - compute on device: cuda:0
2023-10-16 20:08:16,377 - embedding storage: none
2023-10-16 20:08:16,377 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:16,377 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-16 20:08:16,377 ----------------------------------------------------------------------------------------------------
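A run with the hyperparameters logged above could be reproduced with a Flair fine-tuning script roughly like the following. This is a sketch, not the original training script: the `NER_HIPE_2022` loader arguments and the `base_path` value passed in are assumptions inferred from the log, and the actual training call needs `flair` installed plus the CUDA device reported above, so it is kept inside a function that is not invoked here.

```python
# Hyperparameters exactly as reported in the "Training Params" section above.
TRAIN_PARAMS = {
    "learning_rate": 3e-5,
    "mini_batch_size": 4,
    "max_epochs": 10,
    "shuffle": True,
}


def fine_tune_tagger(base_path: str):
    """Sketch of the fine-tuning run; requires `pip install flair` and a GPU."""
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # Swedish NewsEye split of HIPE-2022 (loader argument names are assumptions).
    corpus = NER_HIPE_2022(dataset_name="newseye", language="sv")
    label_dict = corpus.make_label_dictionary(label_type="ner")

    # hmBERT backbone; last layer only and first-subtoken pooling,
    # matching "poolingfirst-layers-1" in the base path.
    embeddings = TransformerWordEmbeddings(
        "dbmdz/bert-base-historic-multilingual-cased",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,  # "crfFalse" in the base path
        use_rnn=False,
    )
    # fine_tune() applies a linear schedule with warmup by default,
    # which is consistent with the LinearScheduler plugin logged above.
    ModelTrainer(tagger, corpus).fine_tune(base_path, **TRAIN_PARAMS)
```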
2023-10-16 20:08:16,377 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:17,873 epoch 1 - iter 27/272 - loss 3.03869039 - time (sec): 1.50 - samples/sec: 3386.21 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:08:19,637 epoch 1 - iter 54/272 - loss 2.61368780 - time (sec): 3.26 - samples/sec: 3353.24 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:08:21,288 epoch 1 - iter 81/272 - loss 1.99970409 - time (sec): 4.91 - samples/sec: 3317.85 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:08:22,892 epoch 1 - iter 108/272 - loss 1.60221038 - time (sec): 6.51 - samples/sec: 3384.49 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:08:24,494 epoch 1 - iter 135/272 - loss 1.36857841 - time (sec): 8.12 - samples/sec: 3404.33 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:08:25,924 epoch 1 - iter 162/272 - loss 1.24173113 - time (sec): 9.55 - samples/sec: 3370.84 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:08:27,471 epoch 1 - iter 189/272 - loss 1.11940563 - time (sec): 11.09 - samples/sec: 3341.30 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:08:28,953 epoch 1 - iter 216/272 - loss 1.02508320 - time (sec): 12.58 - samples/sec: 3329.53 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:08:30,571 epoch 1 - iter 243/272 - loss 0.93120235 - time (sec): 14.19 - samples/sec: 3329.49 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:08:32,030 epoch 1 - iter 270/272 - loss 0.87450918 - time (sec): 15.65 - samples/sec: 3306.80 - lr: 0.000030 - momentum: 0.000000
2023-10-16 20:08:32,123 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:32,123 EPOCH 1 done: loss 0.8714 - lr: 0.000030
2023-10-16 20:08:33,138 DEV : loss 0.16372087597846985 - f1-score (micro avg) 0.6596
2023-10-16 20:08:33,142 saving best model
2023-10-16 20:08:33,504 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:34,972 epoch 2 - iter 27/272 - loss 0.19904837 - time (sec): 1.47 - samples/sec: 3231.28 - lr: 0.000030 - momentum: 0.000000
2023-10-16 20:08:36,508 epoch 2 - iter 54/272 - loss 0.16653628 - time (sec): 3.00 - samples/sec: 3310.15 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:08:38,099 epoch 2 - iter 81/272 - loss 0.15689792 - time (sec): 4.59 - samples/sec: 3349.68 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:08:39,541 epoch 2 - iter 108/272 - loss 0.16347169 - time (sec): 6.04 - samples/sec: 3364.95 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:08:41,015 epoch 2 - iter 135/272 - loss 0.17363768 - time (sec): 7.51 - samples/sec: 3300.26 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:08:42,594 epoch 2 - iter 162/272 - loss 0.17440064 - time (sec): 9.09 - samples/sec: 3269.51 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:08:44,277 epoch 2 - iter 189/272 - loss 0.16931364 - time (sec): 10.77 - samples/sec: 3271.48 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:08:45,872 epoch 2 - iter 216/272 - loss 0.16682327 - time (sec): 12.37 - samples/sec: 3321.29 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:08:47,465 epoch 2 - iter 243/272 - loss 0.16163978 - time (sec): 13.96 - samples/sec: 3336.00 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:08:49,024 epoch 2 - iter 270/272 - loss 0.15802832 - time (sec): 15.52 - samples/sec: 3344.12 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:08:49,108 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:49,108 EPOCH 2 done: loss 0.1578 - lr: 0.000027
2023-10-16 20:08:50,522 DEV : loss 0.124007448554039 - f1-score (micro avg) 0.7647
2023-10-16 20:08:50,526 saving best model
2023-10-16 20:08:50,982 ----------------------------------------------------------------------------------------------------
2023-10-16 20:08:52,657 epoch 3 - iter 27/272 - loss 0.10639237 - time (sec): 1.67 - samples/sec: 3479.95 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:08:54,265 epoch 3 - iter 54/272 - loss 0.09668779 - time (sec): 3.28 - samples/sec: 3546.28 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:08:55,779 epoch 3 - iter 81/272 - loss 0.09388957 - time (sec): 4.80 - samples/sec: 3414.38 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:08:57,368 epoch 3 - iter 108/272 - loss 0.10422929 - time (sec): 6.38 - samples/sec: 3362.35 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:08:58,804 epoch 3 - iter 135/272 - loss 0.10468543 - time (sec): 7.82 - samples/sec: 3316.85 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:09:00,296 epoch 3 - iter 162/272 - loss 0.10192530 - time (sec): 9.31 - samples/sec: 3305.64 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:09:01,973 epoch 3 - iter 189/272 - loss 0.09803531 - time (sec): 10.99 - samples/sec: 3351.01 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:09:03,442 epoch 3 - iter 216/272 - loss 0.09598613 - time (sec): 12.46 - samples/sec: 3328.11 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:09:04,960 epoch 3 - iter 243/272 - loss 0.09428346 - time (sec): 13.98 - samples/sec: 3330.45 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:09:06,593 epoch 3 - iter 270/272 - loss 0.09356033 - time (sec): 15.61 - samples/sec: 3315.79 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:09:06,686 ----------------------------------------------------------------------------------------------------
2023-10-16 20:09:06,686 EPOCH 3 done: loss 0.0934 - lr: 0.000023
2023-10-16 20:09:08,109 DEV : loss 0.11912587285041809 - f1-score (micro avg) 0.7703
2023-10-16 20:09:08,113 saving best model
2023-10-16 20:09:08,574 ----------------------------------------------------------------------------------------------------
2023-10-16 20:09:10,270 epoch 4 - iter 27/272 - loss 0.06533739 - time (sec): 1.69 - samples/sec: 3074.95 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:09:11,827 epoch 4 - iter 54/272 - loss 0.05477897 - time (sec): 3.25 - samples/sec: 3259.29 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:09:13,351 epoch 4 - iter 81/272 - loss 0.04947201 - time (sec): 4.77 - samples/sec: 3350.20 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:09:14,941 epoch 4 - iter 108/272 - loss 0.05320227 - time (sec): 6.36 - samples/sec: 3384.90 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:09:16,458 epoch 4 - iter 135/272 - loss 0.05332532 - time (sec): 7.88 - samples/sec: 3301.85 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:09:18,246 epoch 4 - iter 162/272 - loss 0.05318251 - time (sec): 9.67 - samples/sec: 3310.54 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:09:19,815 epoch 4 - iter 189/272 - loss 0.05540684 - time (sec): 11.24 - samples/sec: 3289.42 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:09:21,646 epoch 4 - iter 216/272 - loss 0.05287585 - time (sec): 13.07 - samples/sec: 3221.21 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:09:23,205 epoch 4 - iter 243/272 - loss 0.05039066 - time (sec): 14.63 - samples/sec: 3246.02 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:09:24,612 epoch 4 - iter 270/272 - loss 0.05055159 - time (sec): 16.03 - samples/sec: 3234.24 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:09:24,707 ----------------------------------------------------------------------------------------------------
2023-10-16 20:09:24,708 EPOCH 4 done: loss 0.0504 - lr: 0.000020
2023-10-16 20:09:26,163 DEV : loss 0.13422761857509613 - f1-score (micro avg) 0.7808
2023-10-16 20:09:26,168 saving best model
2023-10-16 20:09:26,610 ----------------------------------------------------------------------------------------------------
2023-10-16 20:09:28,226 epoch 5 - iter 27/272 - loss 0.03954786 - time (sec): 1.61 - samples/sec: 3061.53 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:09:29,701 epoch 5 - iter 54/272 - loss 0.03936235 - time (sec): 3.09 - samples/sec: 3054.57 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:09:31,220 epoch 5 - iter 81/272 - loss 0.04126617 - time (sec): 4.61 - samples/sec: 3255.14 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:09:32,926 epoch 5 - iter 108/272 - loss 0.04314098 - time (sec): 6.32 - samples/sec: 3195.32 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:09:34,448 epoch 5 - iter 135/272 - loss 0.04076125 - time (sec): 7.84 - samples/sec: 3187.96 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:09:35,991 epoch 5 - iter 162/272 - loss 0.03995894 - time (sec): 9.38 - samples/sec: 3211.17 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:09:37,641 epoch 5 - iter 189/272 - loss 0.03840579 - time (sec): 11.03 - samples/sec: 3236.95 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:09:39,245 epoch 5 - iter 216/272 - loss 0.03843528 - time (sec): 12.63 - samples/sec: 3251.87 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:09:40,775 epoch 5 - iter 243/272 - loss 0.03804151 - time (sec): 14.16 - samples/sec: 3261.62 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:09:42,284 epoch 5 - iter 270/272 - loss 0.03802072 - time (sec): 15.67 - samples/sec: 3278.84 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:09:42,478 ----------------------------------------------------------------------------------------------------
2023-10-16 20:09:42,478 EPOCH 5 done: loss 0.0376 - lr: 0.000017
2023-10-16 20:09:43,958 DEV : loss 0.1476840376853943 - f1-score (micro avg) 0.8111
2023-10-16 20:09:43,963 saving best model
2023-10-16 20:09:44,440 ----------------------------------------------------------------------------------------------------
2023-10-16 20:09:45,995 epoch 6 - iter 27/272 - loss 0.02354845 - time (sec): 1.55 - samples/sec: 2910.31 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:09:47,584 epoch 6 - iter 54/272 - loss 0.02510456 - time (sec): 3.14 - samples/sec: 3012.29 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:09:49,317 epoch 6 - iter 81/272 - loss 0.02501468 - time (sec): 4.87 - samples/sec: 3132.48 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:09:51,157 epoch 6 - iter 108/272 - loss 0.02624276 - time (sec): 6.71 - samples/sec: 3211.82 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:09:52,701 epoch 6 - iter 135/272 - loss 0.02955035 - time (sec): 8.26 - samples/sec: 3249.35 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:09:54,174 epoch 6 - iter 162/272 - loss 0.03040959 - time (sec): 9.73 - samples/sec: 3190.36 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:09:55,753 epoch 6 - iter 189/272 - loss 0.03027120 - time (sec): 11.31 - samples/sec: 3171.70 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:09:57,356 epoch 6 - iter 216/272 - loss 0.02953089 - time (sec): 12.91 - samples/sec: 3212.26 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:09:58,957 epoch 6 - iter 243/272 - loss 0.02753869 - time (sec): 14.51 - samples/sec: 3255.77 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:10:00,376 epoch 6 - iter 270/272 - loss 0.02794685 - time (sec): 15.93 - samples/sec: 3252.15 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:10:00,470 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:00,470 EPOCH 6 done: loss 0.0283 - lr: 0.000013
2023-10-16 20:10:01,986 DEV : loss 0.14637169241905212 - f1-score (micro avg) 0.8147
2023-10-16 20:10:01,991 saving best model
2023-10-16 20:10:02,502 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:04,065 epoch 7 - iter 27/272 - loss 0.01433158 - time (sec): 1.56 - samples/sec: 3247.49 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:10:05,829 epoch 7 - iter 54/272 - loss 0.01183116 - time (sec): 3.33 - samples/sec: 3389.33 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:10:07,497 epoch 7 - iter 81/272 - loss 0.01733263 - time (sec): 4.99 - samples/sec: 3317.72 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:10:09,156 epoch 7 - iter 108/272 - loss 0.02093726 - time (sec): 6.65 - samples/sec: 3348.06 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:10:10,693 epoch 7 - iter 135/272 - loss 0.02097625 - time (sec): 8.19 - samples/sec: 3298.29 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:10:12,346 epoch 7 - iter 162/272 - loss 0.02032468 - time (sec): 9.84 - samples/sec: 3335.63 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:10:13,932 epoch 7 - iter 189/272 - loss 0.01915034 - time (sec): 11.43 - samples/sec: 3301.37 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:10:15,419 epoch 7 - iter 216/272 - loss 0.01901545 - time (sec): 12.92 - samples/sec: 3275.63 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:10:17,028 epoch 7 - iter 243/272 - loss 0.01969638 - time (sec): 14.52 - samples/sec: 3249.17 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:10:18,534 epoch 7 - iter 270/272 - loss 0.02060808 - time (sec): 16.03 - samples/sec: 3227.84 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:10:18,630 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:18,630 EPOCH 7 done: loss 0.0205 - lr: 0.000010
2023-10-16 20:10:20,635 DEV : loss 0.1661161333322525 - f1-score (micro avg) 0.781
2023-10-16 20:10:20,645 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:22,161 epoch 8 - iter 27/272 - loss 0.01704553 - time (sec): 1.51 - samples/sec: 2895.07 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:10:23,681 epoch 8 - iter 54/272 - loss 0.02020109 - time (sec): 3.03 - samples/sec: 3024.26 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:10:25,148 epoch 8 - iter 81/272 - loss 0.01720121 - time (sec): 4.50 - samples/sec: 3068.64 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:10:26,765 epoch 8 - iter 108/272 - loss 0.01554735 - time (sec): 6.12 - samples/sec: 3136.58 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:10:28,432 epoch 8 - iter 135/272 - loss 0.01454416 - time (sec): 7.79 - samples/sec: 3170.01 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:10:30,022 epoch 8 - iter 162/272 - loss 0.01429672 - time (sec): 9.38 - samples/sec: 3191.64 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:10:31,677 epoch 8 - iter 189/272 - loss 0.01340823 - time (sec): 11.03 - samples/sec: 3241.91 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:10:33,293 epoch 8 - iter 216/272 - loss 0.01381698 - time (sec): 12.65 - samples/sec: 3214.06 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:10:35,051 epoch 8 - iter 243/272 - loss 0.01311001 - time (sec): 14.40 - samples/sec: 3239.09 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:10:36,567 epoch 8 - iter 270/272 - loss 0.01358429 - time (sec): 15.92 - samples/sec: 3247.18 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:10:36,672 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:36,672 EPOCH 8 done: loss 0.0135 - lr: 0.000007
2023-10-16 20:10:38,151 DEV : loss 0.15581990778446198 - f1-score (micro avg) 0.8202
2023-10-16 20:10:38,155 saving best model
2023-10-16 20:10:38,576 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:40,287 epoch 9 - iter 27/272 - loss 0.00855607 - time (sec): 1.71 - samples/sec: 3430.97 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:10:41,832 epoch 9 - iter 54/272 - loss 0.00640941 - time (sec): 3.25 - samples/sec: 3389.45 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:10:43,315 epoch 9 - iter 81/272 - loss 0.00951773 - time (sec): 4.74 - samples/sec: 3242.50 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:10:44,925 epoch 9 - iter 108/272 - loss 0.00906601 - time (sec): 6.35 - samples/sec: 3276.97 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:10:46,512 epoch 9 - iter 135/272 - loss 0.00901014 - time (sec): 7.93 - samples/sec: 3307.04 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:10:48,086 epoch 9 - iter 162/272 - loss 0.00920996 - time (sec): 9.51 - samples/sec: 3316.02 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:10:49,743 epoch 9 - iter 189/272 - loss 0.00916628 - time (sec): 11.17 - samples/sec: 3275.70 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:10:51,431 epoch 9 - iter 216/272 - loss 0.00950579 - time (sec): 12.85 - samples/sec: 3302.60 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:10:53,012 epoch 9 - iter 243/272 - loss 0.00904451 - time (sec): 14.43 - samples/sec: 3292.51 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:10:54,523 epoch 9 - iter 270/272 - loss 0.01016922 - time (sec): 15.95 - samples/sec: 3254.14 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:10:54,612 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:54,612 EPOCH 9 done: loss 0.0103 - lr: 0.000003
2023-10-16 20:10:56,087 DEV : loss 0.16827048361301422 - f1-score (micro avg) 0.7978
2023-10-16 20:10:56,091 ----------------------------------------------------------------------------------------------------
2023-10-16 20:10:57,763 epoch 10 - iter 27/272 - loss 0.00219439 - time (sec): 1.67 - samples/sec: 3283.09 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:10:59,157 epoch 10 - iter 54/272 - loss 0.00215367 - time (sec): 3.06 - samples/sec: 3131.12 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:11:00,669 epoch 10 - iter 81/272 - loss 0.00391488 - time (sec): 4.58 - samples/sec: 3096.20 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:11:02,286 epoch 10 - iter 108/272 - loss 0.00571526 - time (sec): 6.19 - samples/sec: 3153.11 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:11:03,838 epoch 10 - iter 135/272 - loss 0.00553198 - time (sec): 7.75 - samples/sec: 3185.79 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:11:05,318 epoch 10 - iter 162/272 - loss 0.00540099 - time (sec): 9.22 - samples/sec: 3154.28 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:11:07,006 epoch 10 - iter 189/272 - loss 0.00681792 - time (sec): 10.91 - samples/sec: 3200.59 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:11:08,579 epoch 10 - iter 216/272 - loss 0.00808721 - time (sec): 12.49 - samples/sec: 3222.74 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:11:10,229 epoch 10 - iter 243/272 - loss 0.00790858 - time (sec): 14.14 - samples/sec: 3235.57 - lr: 0.000000 - momentum: 0.000000
2023-10-16 20:11:11,880 epoch 10 - iter 270/272 - loss 0.00880521 - time (sec): 15.79 - samples/sec: 3272.87 - lr: 0.000000 - momentum: 0.000000
2023-10-16 20:11:11,972 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:11,972 EPOCH 10 done: loss 0.0088 - lr: 0.000000
2023-10-16 20:11:13,486 DEV : loss 0.16773611307144165 - f1-score (micro avg) 0.8051
2023-10-16 20:11:13,859 ----------------------------------------------------------------------------------------------------
2023-10-16 20:11:13,860 Loading model from best epoch ...
2023-10-16 20:11:15,379 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-16 20:11:18,000
Results:
- F-score (micro) 0.7798
- F-score (macro) 0.724
- Accuracy 0.6534

By class:
              precision    recall  f1-score   support

         LOC     0.8114    0.8686    0.8390       312
         PER     0.6923    0.8654    0.7692       208
         ORG     0.4375    0.3818    0.4078        55
   HumanProd     0.7857    1.0000    0.8800        22

   micro avg     0.7373    0.8275    0.7798       597
   macro avg     0.6817    0.7789    0.7240       597
weighted avg     0.7345    0.8275    0.7765       597

2023-10-16 20:11:18,001 ----------------------------------------------------------------------------------------------------
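As a sanity check, the micro, macro, and weighted F1 aggregates in the final table follow directly from the per-class scores and supports reported above. The snippet below recomputes them using only numbers copied from this log:

```python
# Per-class (precision, recall, f1, support) exactly as reported in the log.
per_class = {
    "LOC":       (0.8114, 0.8686, 0.8390, 312),
    "PER":       (0.6923, 0.8654, 0.7692, 208),
    "ORG":       (0.4375, 0.3818, 0.4078, 55),
    "HumanProd": (0.7857, 1.0000, 0.8800, 22),
}

# Micro F1 is the harmonic mean of the reported micro precision and recall.
micro_p, micro_r = 0.7373, 0.8275
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted F1 weights each class F1 by its support (597 entities in total).
total_support = sum(n for *_, n in per_class.values())
weighted_f1 = sum(f1 * n for _, _, f1, n in per_class.values()) / total_support

# Matches the log: micro 0.7798, macro 0.7240, weighted 0.7765, support 597.
print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4), total_support)
```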