|
2023-10-18 17:57:48,044 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:57:48,044 Model: "SequenceTagger( |
|
  (embeddings): TransformerWordEmbeddings( |
|
    (model): BertModel( |
|
      (embeddings): BertEmbeddings( |
|
        (word_embeddings): Embedding(32001, 128) |
|
        (position_embeddings): Embedding(512, 128) |
|
        (token_type_embeddings): Embedding(2, 128) |
|
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
        (dropout): Dropout(p=0.1, inplace=False) |
|
      ) |
|
      (encoder): BertEncoder( |
|
        (layer): ModuleList( |
|
          (0-1): 2 x BertLayer( |
|
            (attention): BertAttention( |
|
              (self): BertSelfAttention( |
|
                (query): Linear(in_features=128, out_features=128, bias=True) |
|
                (key): Linear(in_features=128, out_features=128, bias=True) |
|
                (value): Linear(in_features=128, out_features=128, bias=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
              (output): BertSelfOutput( |
|
                (dense): Linear(in_features=128, out_features=128, bias=True) |
|
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
                (dropout): Dropout(p=0.1, inplace=False) |
|
              ) |
|
            ) |
|
            (intermediate): BertIntermediate( |
|
              (dense): Linear(in_features=128, out_features=512, bias=True) |
|
              (intermediate_act_fn): GELUActivation() |
|
            ) |
|
            (output): BertOutput( |
|
              (dense): Linear(in_features=512, out_features=128, bias=True) |
|
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
              (dropout): Dropout(p=0.1, inplace=False) |
|
            ) |
|
          ) |
|
        ) |
|
      ) |
|
      (pooler): BertPooler( |
|
        (dense): Linear(in_features=128, out_features=128, bias=True) |
|
        (activation): Tanh() |
|
      ) |
|
    ) |
|
  ) |
|
  (locked_dropout): LockedDropout(p=0.5) |
|
  (linear): Linear(in_features=128, out_features=21, bias=True) |
|
  (loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-18 17:57:48,044 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:57:48,044 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-18 17:57:48,044 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:57:48,044 Train: 3575 sentences |
|
2023-10-18 17:57:48,044 (train_with_dev=False, train_with_test=False) |
|
2023-10-18 17:57:48,044 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:57:48,044 Training Params: |
|
2023-10-18 17:57:48,044 - learning_rate: "5e-05" |
|
2023-10-18 17:57:48,045 - mini_batch_size: "8" |
|
2023-10-18 17:57:48,045 - max_epochs: "10" |
|
2023-10-18 17:57:48,045 - shuffle: "True" |
|
2023-10-18 17:57:48,045 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:57:48,045 Plugins: |
|
2023-10-18 17:57:48,045 - TensorboardLogger |
|
2023-10-18 17:57:48,045 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-18 17:57:48,045 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:57:48,045 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-18 17:57:48,045 - metric: "('micro avg', 'f1-score')" |
|
2023-10-18 17:57:48,045 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:57:48,045 Computation: |
|
2023-10-18 17:57:48,045 - compute on device: cuda:0 |
|
2023-10-18 17:57:48,045 - embedding storage: none |
|
2023-10-18 17:57:48,045 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:57:48,045 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-18 17:57:48,045 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:57:48,045 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:57:48,045 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-18 17:57:49,179 epoch 1 - iter 44/447 - loss 3.19284290 - time (sec): 1.13 - samples/sec: 7978.22 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 17:57:50,225 epoch 1 - iter 88/447 - loss 3.01363940 - time (sec): 2.18 - samples/sec: 8580.02 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 17:57:51,189 epoch 1 - iter 132/447 - loss 2.78665781 - time (sec): 3.14 - samples/sec: 8321.31 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 17:57:52,175 epoch 1 - iter 176/447 - loss 2.47324587 - time (sec): 4.13 - samples/sec: 8193.55 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 17:57:53,220 epoch 1 - iter 220/447 - loss 2.16020751 - time (sec): 5.17 - samples/sec: 8079.06 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 17:57:54,218 epoch 1 - iter 264/447 - loss 1.90715039 - time (sec): 6.17 - samples/sec: 8154.90 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 17:57:55,266 epoch 1 - iter 308/447 - loss 1.71530413 - time (sec): 7.22 - samples/sec: 8207.96 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-18 17:57:56,340 epoch 1 - iter 352/447 - loss 1.55384500 - time (sec): 8.29 - samples/sec: 8290.74 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-18 17:57:57,336 epoch 1 - iter 396/447 - loss 1.44428539 - time (sec): 9.29 - samples/sec: 8304.00 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-18 17:57:58,334 epoch 1 - iter 440/447 - loss 1.36744870 - time (sec): 10.29 - samples/sec: 8290.89 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-18 17:57:58,494 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:57:58,494 EPOCH 1 done: loss 1.3584 - lr: 0.000049 |
|
2023-10-18 17:58:00,741 DEV : loss 0.43687644600868225 - f1-score (micro avg) 0.0 |
|
2023-10-18 17:58:00,766 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:58:01,735 epoch 2 - iter 44/447 - loss 0.48119690 - time (sec): 0.97 - samples/sec: 8619.57 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-18 17:58:02,731 epoch 2 - iter 88/447 - loss 0.50993594 - time (sec): 1.96 - samples/sec: 8729.02 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-18 17:58:03,741 epoch 2 - iter 132/447 - loss 0.50165300 - time (sec): 2.97 - samples/sec: 8368.61 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-18 17:58:04,741 epoch 2 - iter 176/447 - loss 0.49887066 - time (sec): 3.97 - samples/sec: 8304.89 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-18 17:58:05,794 epoch 2 - iter 220/447 - loss 0.50462302 - time (sec): 5.03 - samples/sec: 8465.43 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-18 17:58:06,779 epoch 2 - iter 264/447 - loss 0.49268319 - time (sec): 6.01 - samples/sec: 8482.14 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-18 17:58:07,825 epoch 2 - iter 308/447 - loss 0.48765160 - time (sec): 7.06 - samples/sec: 8639.41 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-18 17:58:08,801 epoch 2 - iter 352/447 - loss 0.48463496 - time (sec): 8.03 - samples/sec: 8531.89 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-18 17:58:09,788 epoch 2 - iter 396/447 - loss 0.48158940 - time (sec): 9.02 - samples/sec: 8544.51 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-18 17:58:10,794 epoch 2 - iter 440/447 - loss 0.47795623 - time (sec): 10.03 - samples/sec: 8519.62 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-18 17:58:10,947 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:58:10,947 EPOCH 2 done: loss 0.4791 - lr: 0.000045 |
|
2023-10-18 17:58:15,860 DEV : loss 0.33681756258010864 - f1-score (micro avg) 0.1217 |
|
2023-10-18 17:58:15,886 saving best model |
|
2023-10-18 17:58:15,920 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:58:16,685 epoch 3 - iter 44/447 - loss 0.42838384 - time (sec): 0.76 - samples/sec: 11780.38 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-18 17:58:17,389 epoch 3 - iter 88/447 - loss 0.42564911 - time (sec): 1.47 - samples/sec: 11937.67 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-18 17:58:18,327 epoch 3 - iter 132/447 - loss 0.42033282 - time (sec): 2.41 - samples/sec: 10893.04 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-18 17:58:19,293 epoch 3 - iter 176/447 - loss 0.43188443 - time (sec): 3.37 - samples/sec: 10082.16 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-18 17:58:20,281 epoch 3 - iter 220/447 - loss 0.42083961 - time (sec): 4.36 - samples/sec: 9613.01 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-18 17:58:21,325 epoch 3 - iter 264/447 - loss 0.42146771 - time (sec): 5.40 - samples/sec: 9324.08 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-18 17:58:22,389 epoch 3 - iter 308/447 - loss 0.41172801 - time (sec): 6.47 - samples/sec: 9113.86 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-18 17:58:23,791 epoch 3 - iter 352/447 - loss 0.41407969 - time (sec): 7.87 - samples/sec: 8626.39 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-18 17:58:24,857 epoch 3 - iter 396/447 - loss 0.41044893 - time (sec): 8.94 - samples/sec: 8581.94 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-18 17:58:25,880 epoch 3 - iter 440/447 - loss 0.40707204 - time (sec): 9.96 - samples/sec: 8581.10 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-18 17:58:26,031 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:58:26,031 EPOCH 3 done: loss 0.4069 - lr: 0.000039 |
|
2023-10-18 17:58:30,935 DEV : loss 0.3142178952693939 - f1-score (micro avg) 0.2985 |
|
2023-10-18 17:58:30,960 saving best model |
|
2023-10-18 17:58:30,992 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:58:31,990 epoch 4 - iter 44/447 - loss 0.39689983 - time (sec): 1.00 - samples/sec: 8133.45 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-18 17:58:33,066 epoch 4 - iter 88/447 - loss 0.35623413 - time (sec): 2.07 - samples/sec: 8753.23 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-18 17:58:34,068 epoch 4 - iter 132/447 - loss 0.35755301 - time (sec): 3.07 - samples/sec: 8651.03 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-18 17:58:35,038 epoch 4 - iter 176/447 - loss 0.36151376 - time (sec): 4.05 - samples/sec: 8718.94 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-18 17:58:36,046 epoch 4 - iter 220/447 - loss 0.35271799 - time (sec): 5.05 - samples/sec: 8718.35 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-18 17:58:37,004 epoch 4 - iter 264/447 - loss 0.35719487 - time (sec): 6.01 - samples/sec: 8702.25 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-18 17:58:37,979 epoch 4 - iter 308/447 - loss 0.35411738 - time (sec): 6.99 - samples/sec: 8668.62 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-18 17:58:38,962 epoch 4 - iter 352/447 - loss 0.35874790 - time (sec): 7.97 - samples/sec: 8662.23 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-18 17:58:39,950 epoch 4 - iter 396/447 - loss 0.35964077 - time (sec): 8.96 - samples/sec: 8604.79 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-18 17:58:40,968 epoch 4 - iter 440/447 - loss 0.35906851 - time (sec): 9.97 - samples/sec: 8547.40 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-18 17:58:41,133 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:58:41,134 EPOCH 4 done: loss 0.3586 - lr: 0.000033 |
|
2023-10-18 17:58:46,412 DEV : loss 0.3010146915912628 - f1-score (micro avg) 0.329 |
|
2023-10-18 17:58:46,437 saving best model |
|
2023-10-18 17:58:46,470 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:58:47,540 epoch 5 - iter 44/447 - loss 0.34061996 - time (sec): 1.07 - samples/sec: 8009.08 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-18 17:58:48,573 epoch 5 - iter 88/447 - loss 0.32307230 - time (sec): 2.10 - samples/sec: 8478.89 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-18 17:58:49,572 epoch 5 - iter 132/447 - loss 0.32451711 - time (sec): 3.10 - samples/sec: 8213.47 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-18 17:58:50,586 epoch 5 - iter 176/447 - loss 0.32488369 - time (sec): 4.12 - samples/sec: 8309.14 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-18 17:58:51,579 epoch 5 - iter 220/447 - loss 0.32269515 - time (sec): 5.11 - samples/sec: 8223.00 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-18 17:58:52,605 epoch 5 - iter 264/447 - loss 0.33241103 - time (sec): 6.13 - samples/sec: 8169.84 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 17:58:53,709 epoch 5 - iter 308/447 - loss 0.33192877 - time (sec): 7.24 - samples/sec: 8188.65 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 17:58:54,814 epoch 5 - iter 352/447 - loss 0.33433260 - time (sec): 8.34 - samples/sec: 8220.85 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 17:58:55,825 epoch 5 - iter 396/447 - loss 0.33154866 - time (sec): 9.35 - samples/sec: 8252.27 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 17:58:56,791 epoch 5 - iter 440/447 - loss 0.32622718 - time (sec): 10.32 - samples/sec: 8221.07 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 17:58:56,967 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:58:56,967 EPOCH 5 done: loss 0.3247 - lr: 0.000028 |
|
2023-10-18 17:59:02,199 DEV : loss 0.2914736866950989 - f1-score (micro avg) 0.3482 |
|
2023-10-18 17:59:02,224 saving best model |
|
2023-10-18 17:59:02,256 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:59:03,191 epoch 6 - iter 44/447 - loss 0.30115644 - time (sec): 0.93 - samples/sec: 8943.48 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 17:59:03,991 epoch 6 - iter 88/447 - loss 0.30685456 - time (sec): 1.73 - samples/sec: 9404.40 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 17:59:04,865 epoch 6 - iter 132/447 - loss 0.29321190 - time (sec): 2.61 - samples/sec: 9170.89 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 17:59:05,845 epoch 6 - iter 176/447 - loss 0.31362991 - time (sec): 3.59 - samples/sec: 8985.59 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 17:59:06,829 epoch 6 - iter 220/447 - loss 0.31663488 - time (sec): 4.57 - samples/sec: 8843.88 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 17:59:07,875 epoch 6 - iter 264/447 - loss 0.32593502 - time (sec): 5.62 - samples/sec: 8930.99 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 17:59:08,867 epoch 6 - iter 308/447 - loss 0.31944221 - time (sec): 6.61 - samples/sec: 8921.37 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 17:59:09,908 epoch 6 - iter 352/447 - loss 0.30844029 - time (sec): 7.65 - samples/sec: 8814.12 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 17:59:11,009 epoch 6 - iter 396/447 - loss 0.31191116 - time (sec): 8.75 - samples/sec: 8792.22 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 17:59:12,066 epoch 6 - iter 440/447 - loss 0.30921094 - time (sec): 9.81 - samples/sec: 8704.33 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 17:59:12,223 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:59:12,223 EPOCH 6 done: loss 0.3085 - lr: 0.000022 |
|
2023-10-18 17:59:17,516 DEV : loss 0.29000064730644226 - f1-score (micro avg) 0.3639 |
|
2023-10-18 17:59:17,540 saving best model |
|
2023-10-18 17:59:17,571 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:59:18,576 epoch 7 - iter 44/447 - loss 0.28426196 - time (sec): 1.00 - samples/sec: 8758.45 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 17:59:19,615 epoch 7 - iter 88/447 - loss 0.30391801 - time (sec): 2.04 - samples/sec: 8399.06 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 17:59:20,678 epoch 7 - iter 132/447 - loss 0.30469420 - time (sec): 3.11 - samples/sec: 8215.37 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 17:59:21,698 epoch 7 - iter 176/447 - loss 0.30084472 - time (sec): 4.13 - samples/sec: 8124.85 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 17:59:22,736 epoch 7 - iter 220/447 - loss 0.29345470 - time (sec): 5.16 - samples/sec: 8110.09 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 17:59:23,759 epoch 7 - iter 264/447 - loss 0.29138463 - time (sec): 6.19 - samples/sec: 8252.21 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 17:59:24,754 epoch 7 - iter 308/447 - loss 0.29351467 - time (sec): 7.18 - samples/sec: 8305.01 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 17:59:25,825 epoch 7 - iter 352/447 - loss 0.29638985 - time (sec): 8.25 - samples/sec: 8355.19 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 17:59:26,879 epoch 7 - iter 396/447 - loss 0.29660285 - time (sec): 9.31 - samples/sec: 8329.27 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 17:59:27,902 epoch 7 - iter 440/447 - loss 0.29359818 - time (sec): 10.33 - samples/sec: 8260.63 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 17:59:28,060 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:59:28,061 EPOCH 7 done: loss 0.2939 - lr: 0.000017 |
|
2023-10-18 17:59:33,329 DEV : loss 0.2924981713294983 - f1-score (micro avg) 0.3632 |
|
2023-10-18 17:59:33,353 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:59:34,451 epoch 8 - iter 44/447 - loss 0.30018572 - time (sec): 1.10 - samples/sec: 8010.32 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 17:59:35,486 epoch 8 - iter 88/447 - loss 0.29132373 - time (sec): 2.13 - samples/sec: 8515.38 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 17:59:36,483 epoch 8 - iter 132/447 - loss 0.29254593 - time (sec): 3.13 - samples/sec: 8346.49 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 17:59:37,528 epoch 8 - iter 176/447 - loss 0.28696166 - time (sec): 4.17 - samples/sec: 8362.77 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 17:59:38,565 epoch 8 - iter 220/447 - loss 0.28064859 - time (sec): 5.21 - samples/sec: 8490.03 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 17:59:39,596 epoch 8 - iter 264/447 - loss 0.28058588 - time (sec): 6.24 - samples/sec: 8423.76 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 17:59:40,604 epoch 8 - iter 308/447 - loss 0.28283717 - time (sec): 7.25 - samples/sec: 8324.22 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 17:59:41,671 epoch 8 - iter 352/447 - loss 0.27965960 - time (sec): 8.32 - samples/sec: 8320.97 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 17:59:42,712 epoch 8 - iter 396/447 - loss 0.28457473 - time (sec): 9.36 - samples/sec: 8303.89 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 17:59:43,663 epoch 8 - iter 440/447 - loss 0.28032758 - time (sec): 10.31 - samples/sec: 8254.52 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 17:59:43,820 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:59:43,820 EPOCH 8 done: loss 0.2786 - lr: 0.000011 |
|
2023-10-18 17:59:48,835 DEV : loss 0.2947298288345337 - f1-score (micro avg) 0.3628 |
|
2023-10-18 17:59:48,860 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:59:49,885 epoch 9 - iter 44/447 - loss 0.21804140 - time (sec): 1.02 - samples/sec: 8158.29 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 17:59:50,878 epoch 9 - iter 88/447 - loss 0.24378083 - time (sec): 2.02 - samples/sec: 8148.24 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 17:59:51,953 epoch 9 - iter 132/447 - loss 0.26094469 - time (sec): 3.09 - samples/sec: 8394.90 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 17:59:52,966 epoch 9 - iter 176/447 - loss 0.27186542 - time (sec): 4.11 - samples/sec: 8515.92 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 17:59:53,966 epoch 9 - iter 220/447 - loss 0.27565457 - time (sec): 5.11 - samples/sec: 8388.76 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 17:59:54,956 epoch 9 - iter 264/447 - loss 0.27726648 - time (sec): 6.10 - samples/sec: 8356.24 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 17:59:55,952 epoch 9 - iter 308/447 - loss 0.27713457 - time (sec): 7.09 - samples/sec: 8417.32 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 17:59:56,953 epoch 9 - iter 352/447 - loss 0.26979378 - time (sec): 8.09 - samples/sec: 8536.18 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 17:59:57,973 epoch 9 - iter 396/447 - loss 0.27600964 - time (sec): 9.11 - samples/sec: 8472.65 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 17:59:58,940 epoch 9 - iter 440/447 - loss 0.27663352 - time (sec): 10.08 - samples/sec: 8473.73 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 17:59:59,089 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:59:59,089 EPOCH 9 done: loss 0.2764 - lr: 0.000006 |
|
2023-10-18 18:00:04,399 DEV : loss 0.286447137594223 - f1-score (micro avg) 0.3807 |
|
2023-10-18 18:00:04,424 saving best model |
|
2023-10-18 18:00:04,454 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:00:05,491 epoch 10 - iter 44/447 - loss 0.26465749 - time (sec): 1.04 - samples/sec: 9417.64 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 18:00:06,502 epoch 10 - iter 88/447 - loss 0.27584699 - time (sec): 2.05 - samples/sec: 8735.04 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 18:00:07,506 epoch 10 - iter 132/447 - loss 0.26228332 - time (sec): 3.05 - samples/sec: 8604.53 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 18:00:08,501 epoch 10 - iter 176/447 - loss 0.26937474 - time (sec): 4.05 - samples/sec: 8701.97 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 18:00:09,466 epoch 10 - iter 220/447 - loss 0.26880653 - time (sec): 5.01 - samples/sec: 8505.78 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 18:00:10,492 epoch 10 - iter 264/447 - loss 0.26811965 - time (sec): 6.04 - samples/sec: 8551.71 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 18:00:11,548 epoch 10 - iter 308/447 - loss 0.27210434 - time (sec): 7.09 - samples/sec: 8570.30 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 18:00:12,593 epoch 10 - iter 352/447 - loss 0.27503554 - time (sec): 8.14 - samples/sec: 8491.65 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 18:00:13,615 epoch 10 - iter 396/447 - loss 0.27246757 - time (sec): 9.16 - samples/sec: 8414.29 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 18:00:14,613 epoch 10 - iter 440/447 - loss 0.26857705 - time (sec): 10.16 - samples/sec: 8371.37 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-18 18:00:14,781 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:00:14,781 EPOCH 10 done: loss 0.2677 - lr: 0.000000 |
|
2023-10-18 18:00:20,055 DEV : loss 0.2858006954193115 - f1-score (micro avg) 0.3819 |
|
2023-10-18 18:00:20,079 saving best model |
|
2023-10-18 18:00:20,140 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:00:20,140 Loading model from best epoch ... |
|
2023-10-18 18:00:20,216 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-18 18:00:22,461 |
|
Results: |
|
- F-score (micro) 0.3942 |
|
- F-score (macro) 0.1902 |
|
- Accuracy 0.2574 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         loc     0.5108    0.5940    0.5493       596 |
|
        pers     0.2347    0.2763    0.2538       333 |
|
         org     0.0000    0.0000    0.0000       132 |
|
        time     0.1875    0.1224    0.1481        49 |
|
        prod     0.0000    0.0000    0.0000        66 |
|
|
|
   micro avg     0.4047    0.3844    0.3942      1176 |
|
   macro avg     0.1866    0.1985    0.1902      1176 |
|
weighted avg     0.3332    0.3844    0.3564      1176 |
|
|
|
2023-10-18 18:00:22,461 ---------------------------------------------------------------------------------------------------- |
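
The averaged rows of the "By class" table above can be cross-checked from the per-class rows. The sketch below is not part of the training run. It copies the four-decimal values from the log, so the recomputed numbers can differ from the logged ones in the last digit. The helper names (`macro_avg`, `weighted_avg`, `f1`) are illustrative, and micro F1 is derived here from the logged micro precision and recall rather than from raw span counts, which the log does not contain.

```python
# Per-class scores copied from the "By class" table:
# (precision, recall, f1-score, support)
per_class = {
    "loc":  (0.5108, 0.5940, 0.5493, 596),
    "pers": (0.2347, 0.2763, 0.2538, 333),
    "org":  (0.0000, 0.0000, 0.0000, 132),
    "time": (0.1875, 0.1224, 0.1481, 49),
    "prod": (0.0000, 0.0000, 0.0000, 66),
}

def macro_avg(rows, col):
    """Unweighted mean of one column over all classes."""
    return sum(r[col] for r in rows) / len(rows)

def weighted_avg(rows, col):
    """Mean of one column weighted by class support (column 3)."""
    total = sum(r[3] for r in rows)
    return sum(r[col] * r[3] for r in rows) / total

def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

rows = list(per_class.values())
total_support = sum(r[3] for r in rows)
macro_f1 = macro_avg(rows, 2)
weighted_f1 = weighted_avg(rows, 2)
micro_f1 = f1(0.4047, 0.3844)  # logged micro precision and recall

print(f"support={total_support} macro_f1={macro_f1:.4f} "
      f"weighted_f1={weighted_f1:.4f} micro_f1={micro_f1:.4f}")
```

The two classes with zero scores (org, prod) count fully in the macro average but only by their support in the micro and weighted averages, which is why macro F1 is roughly half of micro F1 in this run.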
|
|
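
The lr column in the per-iteration lines above is consistent with a linear warmup over the first 10% of steps (warmup_fraction 0.1) followed by a linear decay to zero. The sketch below reproduces that shape under stated assumptions: 447 iterations per epoch over 10 epochs gives 4470 steps in total, the peak is the configured learning_rate of 5e-05, and the formula is the common warmup-then-decay rule. The function name `linear_schedule_lr` and the exact step at which the value is logged are assumptions, not taken from the Flair source.

```python
def linear_schedule_lr(step, peak_lr=5e-5, total_steps=4470, warmup_fraction=0.1):
    """Learning rate after `step` optimizer steps.

    Rises linearly from 0 to peak_lr during warmup, then falls
    linearly to 0 at total_steps.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

ITERS_PER_EPOCH = 447  # from the "iter N/447" fields in the log

def global_step(epoch, iteration):
    """Convert a 1-based epoch and an iteration count to a global step."""
    return (epoch - 1) * ITERS_PER_EPOCH + iteration

# Print the schedule at a few (epoch, iter) positions that appear in the log.
for epoch, iteration in [(1, 44), (1, 440), (2, 440), (5, 220), (10, 440)]:
    lr = linear_schedule_lr(global_step(epoch, iteration))
    print(f"epoch {epoch} - iter {iteration}/447 - lr: {lr:.6f}")
```

Formatted to six decimals, these values can be compared directly against the lr fields of the matching log lines.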