|
2023-10-13 13:20:08,359 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:20:08,360 Model: "SequenceTagger( |
|
  (embeddings): TransformerWordEmbeddings( |
|
    (model): BertModel( |
|
      (embeddings): BertEmbeddings( |
|
        (word_embeddings): Embedding(32001, 768) |
|
        (position_embeddings): Embedding(512, 768) |
|
        (token_type_embeddings): Embedding(2, 768) |
|
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
        (dropout): Dropout(p=0.1, inplace=False) |
|
      ) |
|
      (encoder): BertEncoder( |
|
        (layer): ModuleList( |
|
          (0-11): 12 x BertLayer( |
|
            (attention): BertAttention( |
|
              (self): BertSelfAttention( |
|
                (query): Linear(in_features=768, out_features=768, bias=True) |
|
                (key): Linear(in_features=768, out_features=768, bias=True) |
|
                (value): Linear(in_features=768, out_features=768, bias=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
              (output): BertSelfOutput( |
|
                (dense): Linear(in_features=768, out_features=768, bias=True) |
|
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
            ) |
|
            (intermediate): BertIntermediate( |
|
              (dense): Linear(in_features=768, out_features=3072, bias=True) |
|
              (intermediate_act_fn): GELUActivation() |
|
            ) |
|
            (output): BertOutput( |
|
              (dense): Linear(in_features=3072, out_features=768, bias=True) |
|
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
              (dropout): Dropout(p=0.1, inplace=False) |
|
            ) |
|
          ) |
|
        ) |
|
      ) |
|
      (pooler): BertPooler( |
|
        (dense): Linear(in_features=768, out_features=768, bias=True) |
|
        (activation): Tanh() |
|
      ) |
|
    ) |
|
  ) |
|
  (locked_dropout): LockedDropout(p=0.5) |
|
  (linear): Linear(in_features=768, out_features=21, bias=True) |
|
  (loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-13 13:20:08,360 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:20:08,360 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-13 13:20:08,360 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:20:08,360 Train: 3575 sentences |
|
2023-10-13 13:20:08,360 (train_with_dev=False, train_with_test=False) |
|
2023-10-13 13:20:08,360 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:20:08,360 Training Params: |
|
2023-10-13 13:20:08,360 - learning_rate: "5e-05" |
|
2023-10-13 13:20:08,360 - mini_batch_size: "8" |
|
2023-10-13 13:20:08,360 - max_epochs: "10" |
|
2023-10-13 13:20:08,360 - shuffle: "True" |
|
2023-10-13 13:20:08,361 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:20:08,361 Plugins: |
|
2023-10-13 13:20:08,361 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-13 13:20:08,361 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:20:08,361 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-13 13:20:08,361 - metric: "('micro avg', 'f1-score')" |
|
2023-10-13 13:20:08,361 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:20:08,361 Computation: |
|
2023-10-13 13:20:08,361 - compute on device: cuda:0 |
|
2023-10-13 13:20:08,361 - embedding storage: none |
|
2023-10-13 13:20:08,361 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:20:08,361 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-13 13:20:08,361 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:20:08,361 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:20:11,185 epoch 1 - iter 44/447 - loss 2.98851820 - time (sec): 2.82 - samples/sec: 3102.17 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 13:20:14,004 epoch 1 - iter 88/447 - loss 1.95325123 - time (sec): 5.64 - samples/sec: 3148.92 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 13:20:16,621 epoch 1 - iter 132/447 - loss 1.49794150 - time (sec): 8.26 - samples/sec: 3114.70 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 13:20:19,630 epoch 1 - iter 176/447 - loss 1.20997650 - time (sec): 11.27 - samples/sec: 3062.86 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 13:20:22,366 epoch 1 - iter 220/447 - loss 1.03351243 - time (sec): 14.00 - samples/sec: 3050.14 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 13:20:25,109 epoch 1 - iter 264/447 - loss 0.91449018 - time (sec): 16.75 - samples/sec: 3041.16 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 13:20:27,854 epoch 1 - iter 308/447 - loss 0.82776650 - time (sec): 19.49 - samples/sec: 3038.53 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-13 13:20:30,594 epoch 1 - iter 352/447 - loss 0.75655567 - time (sec): 22.23 - samples/sec: 3042.01 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-13 13:20:33,213 epoch 1 - iter 396/447 - loss 0.69589232 - time (sec): 24.85 - samples/sec: 3043.96 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-13 13:20:36,387 epoch 1 - iter 440/447 - loss 0.64524193 - time (sec): 28.03 - samples/sec: 3041.44 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-13 13:20:36,800 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:20:36,800 EPOCH 1 done: loss 0.6389 - lr: 0.000049 |
|
2023-10-13 13:20:41,834 DEV : loss 0.17851579189300537 - f1-score (micro avg) 0.6619 |
|
2023-10-13 13:20:41,868 saving best model |
|
2023-10-13 13:20:42,189 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:20:45,136 epoch 2 - iter 44/447 - loss 0.18786776 - time (sec): 2.95 - samples/sec: 3040.94 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-13 13:20:48,309 epoch 2 - iter 88/447 - loss 0.18005101 - time (sec): 6.12 - samples/sec: 3024.51 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-13 13:20:50,957 epoch 2 - iter 132/447 - loss 0.17172587 - time (sec): 8.77 - samples/sec: 2987.88 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 13:20:53,671 epoch 2 - iter 176/447 - loss 0.17670397 - time (sec): 11.48 - samples/sec: 3002.83 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 13:20:56,617 epoch 2 - iter 220/447 - loss 0.17152176 - time (sec): 14.43 - samples/sec: 2986.38 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-13 13:20:59,327 epoch 2 - iter 264/447 - loss 0.16193941 - time (sec): 17.14 - samples/sec: 3019.15 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-13 13:21:01,946 epoch 2 - iter 308/447 - loss 0.15929042 - time (sec): 19.75 - samples/sec: 3022.81 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-13 13:21:04,541 epoch 2 - iter 352/447 - loss 0.15834174 - time (sec): 22.35 - samples/sec: 3031.62 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-13 13:21:07,674 epoch 2 - iter 396/447 - loss 0.15381226 - time (sec): 25.48 - samples/sec: 3014.93 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 13:21:10,380 epoch 2 - iter 440/447 - loss 0.15235393 - time (sec): 28.19 - samples/sec: 3024.09 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 13:21:10,888 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:21:10,889 EPOCH 2 done: loss 0.1514 - lr: 0.000045 |
|
2023-10-13 13:21:19,373 DEV : loss 0.12778525054454803 - f1-score (micro avg) 0.6984 |
|
2023-10-13 13:21:19,406 saving best model |
|
2023-10-13 13:21:19,820 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:21:22,656 epoch 3 - iter 44/447 - loss 0.10065484 - time (sec): 2.83 - samples/sec: 3018.94 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-13 13:21:25,786 epoch 3 - iter 88/447 - loss 0.08666742 - time (sec): 5.96 - samples/sec: 2992.75 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-13 13:21:28,598 epoch 3 - iter 132/447 - loss 0.08572550 - time (sec): 8.78 - samples/sec: 2986.73 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-13 13:21:31,298 epoch 3 - iter 176/447 - loss 0.08491542 - time (sec): 11.48 - samples/sec: 3003.02 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-13 13:21:33,912 epoch 3 - iter 220/447 - loss 0.08321697 - time (sec): 14.09 - samples/sec: 2985.23 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-13 13:21:36,687 epoch 3 - iter 264/447 - loss 0.08312643 - time (sec): 16.87 - samples/sec: 2997.64 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-13 13:21:39,389 epoch 3 - iter 308/447 - loss 0.08303934 - time (sec): 19.57 - samples/sec: 2999.67 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-13 13:21:42,237 epoch 3 - iter 352/447 - loss 0.07934551 - time (sec): 22.41 - samples/sec: 3012.75 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-13 13:21:44,842 epoch 3 - iter 396/447 - loss 0.08105453 - time (sec): 25.02 - samples/sec: 3030.91 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-13 13:21:47,987 epoch 3 - iter 440/447 - loss 0.08108549 - time (sec): 28.16 - samples/sec: 3028.88 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-13 13:21:48,390 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:21:48,390 EPOCH 3 done: loss 0.0808 - lr: 0.000039 |
|
2023-10-13 13:21:56,816 DEV : loss 0.13502156734466553 - f1-score (micro avg) 0.7328 |
|
2023-10-13 13:21:56,851 saving best model |
|
2023-10-13 13:21:57,269 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:21:59,999 epoch 4 - iter 44/447 - loss 0.04201711 - time (sec): 2.72 - samples/sec: 3098.25 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-13 13:22:02,677 epoch 4 - iter 88/447 - loss 0.05222326 - time (sec): 5.40 - samples/sec: 3037.66 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-13 13:22:05,495 epoch 4 - iter 132/447 - loss 0.04940786 - time (sec): 8.22 - samples/sec: 3037.73 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 13:22:08,212 epoch 4 - iter 176/447 - loss 0.04414691 - time (sec): 10.94 - samples/sec: 3046.88 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 13:22:11,658 epoch 4 - iter 220/447 - loss 0.04915488 - time (sec): 14.38 - samples/sec: 3015.37 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-13 13:22:14,498 epoch 4 - iter 264/447 - loss 0.05177181 - time (sec): 17.22 - samples/sec: 3015.62 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-13 13:22:17,164 epoch 4 - iter 308/447 - loss 0.04997384 - time (sec): 19.89 - samples/sec: 3006.17 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-13 13:22:19,895 epoch 4 - iter 352/447 - loss 0.05023042 - time (sec): 22.62 - samples/sec: 2998.23 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-13 13:22:23,206 epoch 4 - iter 396/447 - loss 0.05174515 - time (sec): 25.93 - samples/sec: 2981.06 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-13 13:22:25,978 epoch 4 - iter 440/447 - loss 0.05238596 - time (sec): 28.70 - samples/sec: 2969.31 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-13 13:22:26,430 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:22:26,431 EPOCH 4 done: loss 0.0529 - lr: 0.000033 |
|
2023-10-13 13:22:35,016 DEV : loss 0.15478116273880005 - f1-score (micro avg) 0.7741 |
|
2023-10-13 13:22:35,048 saving best model |
|
2023-10-13 13:22:35,500 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:22:38,504 epoch 5 - iter 44/447 - loss 0.03979897 - time (sec): 3.00 - samples/sec: 2988.15 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-13 13:22:41,411 epoch 5 - iter 88/447 - loss 0.03334224 - time (sec): 5.91 - samples/sec: 2922.69 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 13:22:44,290 epoch 5 - iter 132/447 - loss 0.03321982 - time (sec): 8.79 - samples/sec: 2971.55 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 13:22:47,083 epoch 5 - iter 176/447 - loss 0.03182161 - time (sec): 11.58 - samples/sec: 2994.52 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-13 13:22:49,725 epoch 5 - iter 220/447 - loss 0.03370102 - time (sec): 14.22 - samples/sec: 2998.29 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-13 13:22:52,582 epoch 5 - iter 264/447 - loss 0.03501394 - time (sec): 17.08 - samples/sec: 3001.26 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 13:22:55,826 epoch 5 - iter 308/447 - loss 0.03844377 - time (sec): 20.32 - samples/sec: 2980.60 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 13:22:58,482 epoch 5 - iter 352/447 - loss 0.03808194 - time (sec): 22.98 - samples/sec: 2984.83 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 13:23:01,347 epoch 5 - iter 396/447 - loss 0.03629037 - time (sec): 25.85 - samples/sec: 2970.33 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 13:23:04,266 epoch 5 - iter 440/447 - loss 0.03561767 - time (sec): 28.76 - samples/sec: 2968.14 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 13:23:04,682 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:23:04,682 EPOCH 5 done: loss 0.0353 - lr: 0.000028 |
|
2023-10-13 13:23:13,184 DEV : loss 0.17760951817035675 - f1-score (micro avg) 0.7682 |
|
2023-10-13 13:23:13,217 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:23:16,042 epoch 6 - iter 44/447 - loss 0.01877968 - time (sec): 2.82 - samples/sec: 3048.07 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 13:23:18,942 epoch 6 - iter 88/447 - loss 0.01932285 - time (sec): 5.72 - samples/sec: 3077.70 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 13:23:21,625 epoch 6 - iter 132/447 - loss 0.02015927 - time (sec): 8.41 - samples/sec: 3088.96 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 13:23:24,907 epoch 6 - iter 176/447 - loss 0.01821601 - time (sec): 11.69 - samples/sec: 3065.39 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 13:23:27,771 epoch 6 - iter 220/447 - loss 0.01881791 - time (sec): 14.55 - samples/sec: 2979.48 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 13:23:30,488 epoch 6 - iter 264/447 - loss 0.01860310 - time (sec): 17.27 - samples/sec: 2980.78 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 13:23:33,366 epoch 6 - iter 308/447 - loss 0.01815398 - time (sec): 20.15 - samples/sec: 2976.57 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 13:23:36,065 epoch 6 - iter 352/447 - loss 0.01915966 - time (sec): 22.85 - samples/sec: 2977.83 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 13:23:38,865 epoch 6 - iter 396/447 - loss 0.01991320 - time (sec): 25.65 - samples/sec: 2999.07 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 13:23:41,624 epoch 6 - iter 440/447 - loss 0.02043759 - time (sec): 28.41 - samples/sec: 3003.11 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 13:23:42,034 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:23:42,034 EPOCH 6 done: loss 0.0205 - lr: 0.000022 |
|
2023-10-13 13:23:50,494 DEV : loss 0.21282611787319183 - f1-score (micro avg) 0.7583 |
|
2023-10-13 13:23:50,525 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:23:53,838 epoch 7 - iter 44/447 - loss 0.02198471 - time (sec): 3.31 - samples/sec: 2998.52 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 13:23:56,555 epoch 7 - iter 88/447 - loss 0.01806647 - time (sec): 6.03 - samples/sec: 2959.07 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 13:23:59,442 epoch 7 - iter 132/447 - loss 0.01433410 - time (sec): 8.92 - samples/sec: 2982.30 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 13:24:02,316 epoch 7 - iter 176/447 - loss 0.01416791 - time (sec): 11.79 - samples/sec: 3005.39 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 13:24:05,100 epoch 7 - iter 220/447 - loss 0.01395297 - time (sec): 14.57 - samples/sec: 3015.13 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 13:24:07,721 epoch 7 - iter 264/447 - loss 0.01425224 - time (sec): 17.19 - samples/sec: 3003.40 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 13:24:10,471 epoch 7 - iter 308/447 - loss 0.01599349 - time (sec): 19.95 - samples/sec: 3015.57 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 13:24:13,263 epoch 7 - iter 352/447 - loss 0.01639863 - time (sec): 22.74 - samples/sec: 3007.11 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 13:24:15,883 epoch 7 - iter 396/447 - loss 0.01654187 - time (sec): 25.36 - samples/sec: 3012.06 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 13:24:18,717 epoch 7 - iter 440/447 - loss 0.01632206 - time (sec): 28.19 - samples/sec: 3031.51 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 13:24:19,099 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:24:19,099 EPOCH 7 done: loss 0.0163 - lr: 0.000017 |
|
2023-10-13 13:24:27,956 DEV : loss 0.21246877312660217 - f1-score (micro avg) 0.7732 |
|
2023-10-13 13:24:27,990 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:24:30,849 epoch 8 - iter 44/447 - loss 0.01236522 - time (sec): 2.86 - samples/sec: 3006.90 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 13:24:33,776 epoch 8 - iter 88/447 - loss 0.01153258 - time (sec): 5.78 - samples/sec: 2962.65 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 13:24:36,431 epoch 8 - iter 132/447 - loss 0.01262017 - time (sec): 8.44 - samples/sec: 3003.64 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 13:24:39,104 epoch 8 - iter 176/447 - loss 0.01112667 - time (sec): 11.11 - samples/sec: 3015.10 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 13:24:41,911 epoch 8 - iter 220/447 - loss 0.00946565 - time (sec): 13.92 - samples/sec: 3002.28 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 13:24:44,625 epoch 8 - iter 264/447 - loss 0.00948303 - time (sec): 16.63 - samples/sec: 3018.28 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 13:24:47,483 epoch 8 - iter 308/447 - loss 0.00851173 - time (sec): 19.49 - samples/sec: 2998.77 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 13:24:50,628 epoch 8 - iter 352/447 - loss 0.00968723 - time (sec): 22.64 - samples/sec: 2987.84 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 13:24:53,754 epoch 8 - iter 396/447 - loss 0.00996054 - time (sec): 25.76 - samples/sec: 2977.54 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 13:24:56,483 epoch 8 - iter 440/447 - loss 0.00980498 - time (sec): 28.49 - samples/sec: 2985.89 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 13:24:56,967 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:24:56,967 EPOCH 8 done: loss 0.0096 - lr: 0.000011 |
|
2023-10-13 13:25:05,316 DEV : loss 0.21407929062843323 - f1-score (micro avg) 0.7841 |
|
2023-10-13 13:25:05,349 saving best model |
|
2023-10-13 13:25:06,114 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:25:08,889 epoch 9 - iter 44/447 - loss 0.00542228 - time (sec): 2.77 - samples/sec: 2990.28 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 13:25:11,925 epoch 9 - iter 88/447 - loss 0.00463397 - time (sec): 5.81 - samples/sec: 2922.50 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 13:25:14,532 epoch 9 - iter 132/447 - loss 0.00602455 - time (sec): 8.42 - samples/sec: 2961.12 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 13:25:17,407 epoch 9 - iter 176/447 - loss 0.00746601 - time (sec): 11.29 - samples/sec: 2945.80 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 13:25:20,398 epoch 9 - iter 220/447 - loss 0.00670432 - time (sec): 14.28 - samples/sec: 2945.21 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 13:25:23,382 epoch 9 - iter 264/447 - loss 0.00608308 - time (sec): 17.27 - samples/sec: 2921.85 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 13:25:26,159 epoch 9 - iter 308/447 - loss 0.00653327 - time (sec): 20.04 - samples/sec: 2942.50 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 13:25:29,535 epoch 9 - iter 352/447 - loss 0.00606227 - time (sec): 23.42 - samples/sec: 2950.37 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 13:25:32,391 epoch 9 - iter 396/447 - loss 0.00639259 - time (sec): 26.28 - samples/sec: 2943.53 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 13:25:35,184 epoch 9 - iter 440/447 - loss 0.00614557 - time (sec): 29.07 - samples/sec: 2940.07 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 13:25:35,554 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:25:35,555 EPOCH 9 done: loss 0.0066 - lr: 0.000006 |
|
2023-10-13 13:25:43,820 DEV : loss 0.23063816130161285 - f1-score (micro avg) 0.781 |
|
2023-10-13 13:25:43,854 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:25:47,476 epoch 10 - iter 44/447 - loss 0.00098734 - time (sec): 3.62 - samples/sec: 2730.27 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 13:25:50,549 epoch 10 - iter 88/447 - loss 0.00181722 - time (sec): 6.69 - samples/sec: 2770.81 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 13:25:53,340 epoch 10 - iter 132/447 - loss 0.00316493 - time (sec): 9.48 - samples/sec: 2826.86 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 13:25:56,007 epoch 10 - iter 176/447 - loss 0.00376538 - time (sec): 12.15 - samples/sec: 2864.66 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 13:25:58,830 epoch 10 - iter 220/447 - loss 0.00318066 - time (sec): 14.97 - samples/sec: 2891.81 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 13:26:01,539 epoch 10 - iter 264/447 - loss 0.00282318 - time (sec): 17.68 - samples/sec: 2901.48 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 13:26:04,410 epoch 10 - iter 308/447 - loss 0.00291001 - time (sec): 20.55 - samples/sec: 2894.73 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 13:26:07,417 epoch 10 - iter 352/447 - loss 0.00276025 - time (sec): 23.56 - samples/sec: 2890.66 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 13:26:10,104 epoch 10 - iter 396/447 - loss 0.00297373 - time (sec): 26.25 - samples/sec: 2912.16 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 13:26:12,949 epoch 10 - iter 440/447 - loss 0.00306137 - time (sec): 29.09 - samples/sec: 2939.98 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 13:26:13,356 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:26:13,357 EPOCH 10 done: loss 0.0030 - lr: 0.000000 |
|
2023-10-13 13:26:21,613 DEV : loss 0.2350376695394516 - f1-score (micro avg) 0.7849 |
|
2023-10-13 13:26:21,645 saving best model |
|
2023-10-13 13:26:22,417 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 13:26:22,419 Loading model from best epoch ... |
|
2023-10-13 13:26:23,884 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-13 13:26:28,948 |
|
Results: |
|
- F-score (micro) 0.7551 |
|
- F-score (macro) 0.6861 |
|
- Accuracy 0.6253 |
|
|
|
By class: |
|
               precision    recall  f1-score   support |
|
|
|
         loc      0.8361    0.8473    0.8417       596 |
|
        pers      0.6658    0.7778    0.7175       333 |
|
         org      0.5620    0.5152    0.5375       132 |
|
        prod      0.6333    0.5758    0.6032        66 |
|
        time      0.6909    0.7755    0.7308        49 |
|
|
|
   micro avg      0.7388    0.7721    0.7551      1176 |
|
   macro avg      0.6776    0.6983    0.6861      1176 |
|
weighted avg      0.7397    0.7721    0.7544      1176 |
|
|
|
2023-10-13 13:26:28,948 ---------------------------------------------------------------------------------------------------- |
|
|