2023-10-25 20:52:43,281 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Train: 1166 sentences
2023-10-25 20:52:43,282 (train_with_dev=False, train_with_test=False)
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Training Params:
2023-10-25 20:52:43,282 - learning_rate: "5e-05"
2023-10-25 20:52:43,282 - mini_batch_size: "4"
2023-10-25 20:52:43,282 - max_epochs: "10"
2023-10-25 20:52:43,282 - shuffle: "True"
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Plugins:
2023-10-25 20:52:43,282 - TensorboardLogger
2023-10-25 20:52:43,282 - LinearScheduler | warmup_fraction: '0.1'
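The LinearScheduler plugin explains the `lr` column in the per-iteration lines below: the rate climbs linearly from 0 to the 5e-05 peak over the first 10% of steps (all of epoch 1, roughly), then decays linearly back to 0 by the last step. A minimal sketch of that schedule in plain Python (not Flair's implementation; the step counts are taken from this log, 292 iterations per epoch over 10 epochs):

```python
def linear_warmup_lr(step, total_steps, peak_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0, as in this run's lr column."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # ramp up during warmup
    # decay over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 292 * 10  # iterations per epoch x epochs, per this log
print(linear_warmup_lr(29, total))   # early epoch 1: ~0.000005, matching the log
print(linear_warmup_lr(292, total))  # end of warmup: the 5e-05 peak
```

With warmup_fraction 0.1 the warmup covers exactly one epoch here, which is why epoch 1's lr values ramp upward while every later epoch's values drift down.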
|
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 20:52:43,282 - metric: "('micro avg', 'f1-score')"
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Computation:
2023-10-25 20:52:43,282 - compute on device: cuda:0
2023-10-25 20:52:43,282 - embedding storage: none
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:43,282 Logging anything other than scalars to TensorBoard is currently not supported.
|
2023-10-25 20:52:44,591 epoch 1 - iter 29/292 - loss 3.24438831 - time (sec): 1.31 - samples/sec: 3643.15 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:52:45,844 epoch 1 - iter 58/292 - loss 2.35737757 - time (sec): 2.56 - samples/sec: 3396.11 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:52:47,145 epoch 1 - iter 87/292 - loss 1.70602648 - time (sec): 3.86 - samples/sec: 3418.58 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:52:48,484 epoch 1 - iter 116/292 - loss 1.39064461 - time (sec): 5.20 - samples/sec: 3415.94 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:52:49,917 epoch 1 - iter 145/292 - loss 1.17555138 - time (sec): 6.63 - samples/sec: 3402.10 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:52:51,237 epoch 1 - iter 174/292 - loss 1.01899326 - time (sec): 7.95 - samples/sec: 3443.73 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:52:52,538 epoch 1 - iter 203/292 - loss 0.91213345 - time (sec): 9.25 - samples/sec: 3426.03 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:52:53,826 epoch 1 - iter 232/292 - loss 0.83161711 - time (sec): 10.54 - samples/sec: 3403.05 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:52:55,130 epoch 1 - iter 261/292 - loss 0.76267135 - time (sec): 11.85 - samples/sec: 3400.51 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:52:56,405 epoch 1 - iter 290/292 - loss 0.71294027 - time (sec): 13.12 - samples/sec: 3371.82 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:52:56,483 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:56,483 EPOCH 1 done: loss 0.7122 - lr: 0.000049
2023-10-25 20:52:56,997 DEV : loss 0.14957387745380402 - f1-score (micro avg) 0.567
2023-10-25 20:52:57,002 saving best model
2023-10-25 20:52:57,519 ----------------------------------------------------------------------------------------------------
2023-10-25 20:52:58,790 epoch 2 - iter 29/292 - loss 0.17704339 - time (sec): 1.27 - samples/sec: 3344.25 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:53:00,044 epoch 2 - iter 58/292 - loss 0.17561646 - time (sec): 2.52 - samples/sec: 3391.82 - lr: 0.000049 - momentum: 0.000000
2023-10-25 20:53:01,370 epoch 2 - iter 87/292 - loss 0.17159875 - time (sec): 3.85 - samples/sec: 3503.44 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:53:02,630 epoch 2 - iter 116/292 - loss 0.16582732 - time (sec): 5.11 - samples/sec: 3467.99 - lr: 0.000048 - momentum: 0.000000
2023-10-25 20:53:03,858 epoch 2 - iter 145/292 - loss 0.16514765 - time (sec): 6.34 - samples/sec: 3353.34 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:53:05,182 epoch 2 - iter 174/292 - loss 0.15569460 - time (sec): 7.66 - samples/sec: 3412.54 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:53:06,547 epoch 2 - iter 203/292 - loss 0.14640467 - time (sec): 9.03 - samples/sec: 3502.60 - lr: 0.000046 - momentum: 0.000000
2023-10-25 20:53:07,883 epoch 2 - iter 232/292 - loss 0.14581795 - time (sec): 10.36 - samples/sec: 3440.88 - lr: 0.000046 - momentum: 0.000000
2023-10-25 20:53:09,217 epoch 2 - iter 261/292 - loss 0.14877150 - time (sec): 11.70 - samples/sec: 3357.27 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:53:10,602 epoch 2 - iter 290/292 - loss 0.15023947 - time (sec): 13.08 - samples/sec: 3379.62 - lr: 0.000045 - momentum: 0.000000
2023-10-25 20:53:10,691 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:10,691 EPOCH 2 done: loss 0.1508 - lr: 0.000045
2023-10-25 20:53:11,772 DEV : loss 0.13169609010219574 - f1-score (micro avg) 0.6407
2023-10-25 20:53:11,777 saving best model
2023-10-25 20:53:12,452 ----------------------------------------------------------------------------------------------------
|
2023-10-25 20:53:13,788 epoch 3 - iter 29/292 - loss 0.08817301 - time (sec): 1.33 - samples/sec: 3800.10 - lr: 0.000044 - momentum: 0.000000
2023-10-25 20:53:15,148 epoch 3 - iter 58/292 - loss 0.07997124 - time (sec): 2.69 - samples/sec: 3792.63 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:53:16,442 epoch 3 - iter 87/292 - loss 0.08332501 - time (sec): 3.99 - samples/sec: 3484.50 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:53:17,735 epoch 3 - iter 116/292 - loss 0.08493381 - time (sec): 5.28 - samples/sec: 3430.17 - lr: 0.000042 - momentum: 0.000000
2023-10-25 20:53:19,013 epoch 3 - iter 145/292 - loss 0.08790732 - time (sec): 6.56 - samples/sec: 3456.09 - lr: 0.000042 - momentum: 0.000000
2023-10-25 20:53:20,255 epoch 3 - iter 174/292 - loss 0.08164116 - time (sec): 7.80 - samples/sec: 3330.52 - lr: 0.000041 - momentum: 0.000000
2023-10-25 20:53:21,582 epoch 3 - iter 203/292 - loss 0.08657696 - time (sec): 9.13 - samples/sec: 3404.26 - lr: 0.000041 - momentum: 0.000000
2023-10-25 20:53:22,907 epoch 3 - iter 232/292 - loss 0.08717220 - time (sec): 10.45 - samples/sec: 3388.03 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:53:24,256 epoch 3 - iter 261/292 - loss 0.08829596 - time (sec): 11.80 - samples/sec: 3397.56 - lr: 0.000040 - momentum: 0.000000
2023-10-25 20:53:25,532 epoch 3 - iter 290/292 - loss 0.08553678 - time (sec): 13.08 - samples/sec: 3390.93 - lr: 0.000039 - momentum: 0.000000
2023-10-25 20:53:25,619 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:25,619 EPOCH 3 done: loss 0.0855 - lr: 0.000039
2023-10-25 20:53:26,532 DEV : loss 0.1209733635187149 - f1-score (micro avg) 0.7101
2023-10-25 20:53:26,536 saving best model
2023-10-25 20:53:27,211 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:28,480 epoch 4 - iter 29/292 - loss 0.04193234 - time (sec): 1.27 - samples/sec: 2908.67 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:53:29,729 epoch 4 - iter 58/292 - loss 0.05216175 - time (sec): 2.52 - samples/sec: 3085.61 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:53:31,136 epoch 4 - iter 87/292 - loss 0.04828892 - time (sec): 3.92 - samples/sec: 3187.99 - lr: 0.000037 - momentum: 0.000000
2023-10-25 20:53:32,401 epoch 4 - iter 116/292 - loss 0.05429306 - time (sec): 5.19 - samples/sec: 3318.14 - lr: 0.000037 - momentum: 0.000000
2023-10-25 20:53:33,740 epoch 4 - iter 145/292 - loss 0.04835703 - time (sec): 6.53 - samples/sec: 3520.02 - lr: 0.000036 - momentum: 0.000000
2023-10-25 20:53:34,991 epoch 4 - iter 174/292 - loss 0.05061998 - time (sec): 7.78 - samples/sec: 3422.69 - lr: 0.000036 - momentum: 0.000000
2023-10-25 20:53:36,354 epoch 4 - iter 203/292 - loss 0.05117584 - time (sec): 9.14 - samples/sec: 3450.24 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:53:37,606 epoch 4 - iter 232/292 - loss 0.05313204 - time (sec): 10.39 - samples/sec: 3423.20 - lr: 0.000035 - momentum: 0.000000
2023-10-25 20:53:38,852 epoch 4 - iter 261/292 - loss 0.05248535 - time (sec): 11.64 - samples/sec: 3398.13 - lr: 0.000034 - momentum: 0.000000
2023-10-25 20:53:40,096 epoch 4 - iter 290/292 - loss 0.05345773 - time (sec): 12.88 - samples/sec: 3434.57 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:53:40,181 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:40,182 EPOCH 4 done: loss 0.0537 - lr: 0.000033
2023-10-25 20:53:41,097 DEV : loss 0.11280453205108643 - f1-score (micro avg) 0.7522
2023-10-25 20:53:41,102 saving best model
2023-10-25 20:53:41,756 ----------------------------------------------------------------------------------------------------
|
2023-10-25 20:53:43,027 epoch 5 - iter 29/292 - loss 0.03067125 - time (sec): 1.27 - samples/sec: 3116.28 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:53:44,281 epoch 5 - iter 58/292 - loss 0.03432643 - time (sec): 2.52 - samples/sec: 3410.73 - lr: 0.000032 - momentum: 0.000000
2023-10-25 20:53:45,604 epoch 5 - iter 87/292 - loss 0.04094254 - time (sec): 3.85 - samples/sec: 3442.26 - lr: 0.000032 - momentum: 0.000000
2023-10-25 20:53:46,927 epoch 5 - iter 116/292 - loss 0.03856020 - time (sec): 5.17 - samples/sec: 3419.71 - lr: 0.000031 - momentum: 0.000000
2023-10-25 20:53:48,182 epoch 5 - iter 145/292 - loss 0.03449220 - time (sec): 6.42 - samples/sec: 3411.26 - lr: 0.000031 - momentum: 0.000000
2023-10-25 20:53:49,492 epoch 5 - iter 174/292 - loss 0.03424483 - time (sec): 7.73 - samples/sec: 3425.22 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:53:50,794 epoch 5 - iter 203/292 - loss 0.03475940 - time (sec): 9.04 - samples/sec: 3410.84 - lr: 0.000030 - momentum: 0.000000
2023-10-25 20:53:52,091 epoch 5 - iter 232/292 - loss 0.03428080 - time (sec): 10.33 - samples/sec: 3432.78 - lr: 0.000029 - momentum: 0.000000
2023-10-25 20:53:53,399 epoch 5 - iter 261/292 - loss 0.03246941 - time (sec): 11.64 - samples/sec: 3388.06 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:53:54,768 epoch 5 - iter 290/292 - loss 0.03384146 - time (sec): 13.01 - samples/sec: 3402.63 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:53:54,843 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:54,843 EPOCH 5 done: loss 0.0341 - lr: 0.000028
2023-10-25 20:53:55,776 DEV : loss 0.14307908713817596 - f1-score (micro avg) 0.7343
2023-10-25 20:53:55,781 ----------------------------------------------------------------------------------------------------
2023-10-25 20:53:57,014 epoch 6 - iter 29/292 - loss 0.02283523 - time (sec): 1.23 - samples/sec: 3403.33 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:53:58,274 epoch 6 - iter 58/292 - loss 0.02378104 - time (sec): 2.49 - samples/sec: 3439.00 - lr: 0.000027 - momentum: 0.000000
2023-10-25 20:53:59,705 epoch 6 - iter 87/292 - loss 0.02477152 - time (sec): 3.92 - samples/sec: 3555.45 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:54:01,001 epoch 6 - iter 116/292 - loss 0.02923244 - time (sec): 5.22 - samples/sec: 3558.00 - lr: 0.000026 - momentum: 0.000000
2023-10-25 20:54:02,338 epoch 6 - iter 145/292 - loss 0.02764328 - time (sec): 6.56 - samples/sec: 3408.14 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:54:03,788 epoch 6 - iter 174/292 - loss 0.02758029 - time (sec): 8.01 - samples/sec: 3357.42 - lr: 0.000025 - momentum: 0.000000
2023-10-25 20:54:05,097 epoch 6 - iter 203/292 - loss 0.02712844 - time (sec): 9.32 - samples/sec: 3364.89 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:54:06,398 epoch 6 - iter 232/292 - loss 0.02556567 - time (sec): 10.62 - samples/sec: 3381.89 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:54:07,644 epoch 6 - iter 261/292 - loss 0.02621766 - time (sec): 11.86 - samples/sec: 3356.01 - lr: 0.000023 - momentum: 0.000000
2023-10-25 20:54:08,936 epoch 6 - iter 290/292 - loss 0.02847300 - time (sec): 13.15 - samples/sec: 3366.52 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:54:09,016 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:09,016 EPOCH 6 done: loss 0.0284 - lr: 0.000022
2023-10-25 20:54:10,097 DEV : loss 0.16628390550613403 - f1-score (micro avg) 0.7522
2023-10-25 20:54:10,102 saving best model
2023-10-25 20:54:10,660 ----------------------------------------------------------------------------------------------------
|
2023-10-25 20:54:11,922 epoch 7 - iter 29/292 - loss 0.01313140 - time (sec): 1.26 - samples/sec: 3320.18 - lr: 0.000022 - momentum: 0.000000
2023-10-25 20:54:13,181 epoch 7 - iter 58/292 - loss 0.01062088 - time (sec): 2.52 - samples/sec: 3319.42 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:54:14,445 epoch 7 - iter 87/292 - loss 0.01259981 - time (sec): 3.78 - samples/sec: 3345.63 - lr: 0.000021 - momentum: 0.000000
2023-10-25 20:54:15,657 epoch 7 - iter 116/292 - loss 0.01549817 - time (sec): 5.00 - samples/sec: 3351.89 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:54:17,051 epoch 7 - iter 145/292 - loss 0.01629611 - time (sec): 6.39 - samples/sec: 3443.66 - lr: 0.000020 - momentum: 0.000000
2023-10-25 20:54:18,411 epoch 7 - iter 174/292 - loss 0.01683554 - time (sec): 7.75 - samples/sec: 3456.79 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:54:19,716 epoch 7 - iter 203/292 - loss 0.02085836 - time (sec): 9.06 - samples/sec: 3467.56 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:54:20,977 epoch 7 - iter 232/292 - loss 0.02012985 - time (sec): 10.32 - samples/sec: 3450.94 - lr: 0.000018 - momentum: 0.000000
2023-10-25 20:54:22,272 epoch 7 - iter 261/292 - loss 0.01963116 - time (sec): 11.61 - samples/sec: 3441.95 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:54:23,519 epoch 7 - iter 290/292 - loss 0.01887461 - time (sec): 12.86 - samples/sec: 3434.82 - lr: 0.000017 - momentum: 0.000000
2023-10-25 20:54:23,603 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:23,603 EPOCH 7 done: loss 0.0188 - lr: 0.000017
2023-10-25 20:54:24,526 DEV : loss 0.18740706145763397 - f1-score (micro avg) 0.7209
2023-10-25 20:54:24,531 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:25,835 epoch 8 - iter 29/292 - loss 0.01005555 - time (sec): 1.30 - samples/sec: 3400.17 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:54:27,199 epoch 8 - iter 58/292 - loss 0.01177608 - time (sec): 2.67 - samples/sec: 3735.36 - lr: 0.000016 - momentum: 0.000000
2023-10-25 20:54:28,454 epoch 8 - iter 87/292 - loss 0.01023423 - time (sec): 3.92 - samples/sec: 3525.50 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:54:29,725 epoch 8 - iter 116/292 - loss 0.01018425 - time (sec): 5.19 - samples/sec: 3503.67 - lr: 0.000015 - momentum: 0.000000
2023-10-25 20:54:30,974 epoch 8 - iter 145/292 - loss 0.00987982 - time (sec): 6.44 - samples/sec: 3494.30 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:54:32,348 epoch 8 - iter 174/292 - loss 0.01053102 - time (sec): 7.82 - samples/sec: 3468.90 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:54:33,697 epoch 8 - iter 203/292 - loss 0.01060943 - time (sec): 9.17 - samples/sec: 3480.83 - lr: 0.000013 - momentum: 0.000000
2023-10-25 20:54:35,002 epoch 8 - iter 232/292 - loss 0.01150807 - time (sec): 10.47 - samples/sec: 3462.88 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:54:36,267 epoch 8 - iter 261/292 - loss 0.01076409 - time (sec): 11.74 - samples/sec: 3411.72 - lr: 0.000012 - momentum: 0.000000
2023-10-25 20:54:37,558 epoch 8 - iter 290/292 - loss 0.01018919 - time (sec): 13.03 - samples/sec: 3390.12 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:54:37,644 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:37,644 EPOCH 8 done: loss 0.0101 - lr: 0.000011
2023-10-25 20:54:38,559 DEV : loss 0.19930434226989746 - f1-score (micro avg) 0.7309
2023-10-25 20:54:38,563 ----------------------------------------------------------------------------------------------------
|
2023-10-25 20:54:39,907 epoch 9 - iter 29/292 - loss 0.01344550 - time (sec): 1.34 - samples/sec: 3432.35 - lr: 0.000011 - momentum: 0.000000
2023-10-25 20:54:41,169 epoch 9 - iter 58/292 - loss 0.01055378 - time (sec): 2.60 - samples/sec: 3307.88 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:54:42,418 epoch 9 - iter 87/292 - loss 0.00815295 - time (sec): 3.85 - samples/sec: 3152.34 - lr: 0.000010 - momentum: 0.000000
2023-10-25 20:54:43,710 epoch 9 - iter 116/292 - loss 0.00642132 - time (sec): 5.15 - samples/sec: 3294.32 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:54:44,977 epoch 9 - iter 145/292 - loss 0.00692122 - time (sec): 6.41 - samples/sec: 3257.05 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:54:46,255 epoch 9 - iter 174/292 - loss 0.00736428 - time (sec): 7.69 - samples/sec: 3255.49 - lr: 0.000008 - momentum: 0.000000
2023-10-25 20:54:47,546 epoch 9 - iter 203/292 - loss 0.00688575 - time (sec): 8.98 - samples/sec: 3341.57 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:54:48,746 epoch 9 - iter 232/292 - loss 0.00605517 - time (sec): 10.18 - samples/sec: 3389.43 - lr: 0.000007 - momentum: 0.000000
2023-10-25 20:54:50,084 epoch 9 - iter 261/292 - loss 0.00604344 - time (sec): 11.52 - samples/sec: 3467.54 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:54:51,342 epoch 9 - iter 290/292 - loss 0.00647095 - time (sec): 12.78 - samples/sec: 3460.46 - lr: 0.000006 - momentum: 0.000000
2023-10-25 20:54:51,421 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:51,421 EPOCH 9 done: loss 0.0064 - lr: 0.000006
2023-10-25 20:54:52,337 DEV : loss 0.20804047584533691 - f1-score (micro avg) 0.735
2023-10-25 20:54:52,342 ----------------------------------------------------------------------------------------------------
2023-10-25 20:54:53,531 epoch 10 - iter 29/292 - loss 0.00167655 - time (sec): 1.19 - samples/sec: 3673.47 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:54:54,800 epoch 10 - iter 58/292 - loss 0.00843115 - time (sec): 2.46 - samples/sec: 3456.32 - lr: 0.000005 - momentum: 0.000000
2023-10-25 20:54:56,074 epoch 10 - iter 87/292 - loss 0.00620319 - time (sec): 3.73 - samples/sec: 3501.50 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:54:57,422 epoch 10 - iter 116/292 - loss 0.00516689 - time (sec): 5.08 - samples/sec: 3410.62 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:54:58,751 epoch 10 - iter 145/292 - loss 0.00519706 - time (sec): 6.41 - samples/sec: 3393.23 - lr: 0.000003 - momentum: 0.000000
2023-10-25 20:55:00,068 epoch 10 - iter 174/292 - loss 0.00530414 - time (sec): 7.73 - samples/sec: 3331.36 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:55:01,375 epoch 10 - iter 203/292 - loss 0.00462386 - time (sec): 9.03 - samples/sec: 3418.00 - lr: 0.000002 - momentum: 0.000000
2023-10-25 20:55:02,678 epoch 10 - iter 232/292 - loss 0.00511570 - time (sec): 10.34 - samples/sec: 3447.71 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:55:03,957 epoch 10 - iter 261/292 - loss 0.00465885 - time (sec): 11.61 - samples/sec: 3411.83 - lr: 0.000001 - momentum: 0.000000
2023-10-25 20:55:05,243 epoch 10 - iter 290/292 - loss 0.00423569 - time (sec): 12.90 - samples/sec: 3419.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 20:55:05,325 ----------------------------------------------------------------------------------------------------
2023-10-25 20:55:05,325 EPOCH 10 done: loss 0.0042 - lr: 0.000000
2023-10-25 20:55:06,243 DEV : loss 0.21092882752418518 - f1-score (micro avg) 0.7292
2023-10-25 20:55:06,767 ----------------------------------------------------------------------------------------------------
|
2023-10-25 20:55:06,768 Loading model from best epoch ...
2023-10-25 20:55:08,658 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
|
2023-10-25 20:55:10,238
Results:
- F-score (micro) 0.7654
- F-score (macro) 0.6924
- Accuracy 0.6478

By class:
              precision    recall  f1-score   support

         PER     0.7832    0.8305    0.8061       348
         LOC     0.6959    0.8506    0.7655       261
         ORG     0.5278    0.3654    0.4318        52
   HumanProd     0.7200    0.8182    0.7660        22

   micro avg     0.7316    0.8023    0.7654       683
   macro avg     0.6817    0.7162    0.6924       683
weighted avg     0.7284    0.8023    0.7608       683
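The gap between the micro (0.7654) and macro (0.6924) rows above comes from how the per-class scores are pooled: micro averaging aggregates true/false positives and false negatives over all classes, so frequent classes like PER and LOC dominate, while macro averaging is the unweighted mean of per-class F1, so the weak ORG score drags it down. A minimal sketch with hypothetical per-class counts (illustrative numbers, not reconstructed from this run):

```python
def f1(tp, fp, fn):
    """F1 from raw counts; returns 0.0 on empty denominators."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# hypothetical (tp, fp, fn) per class: one frequent, one rare-and-weak
counts = {"PER": (80, 20, 20), "ORG": (5, 5, 15)}

micro = f1(*[sum(c[i] for c in counts.values()) for i in range(3)])
macro = sum(f1(*c) for c in counts.values()) / len(counts)
print(micro, macro)  # micro > macro: the frequent class dominates the pooled score
```

The same effect appears in this run: ORG (F1 0.4318 on only 52 gold spans) pulls the macro average well below the micro average.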
|
2023-10-25 20:55:10,238 ----------------------------------------------------------------------------------------------------