|
2023-10-18 18:15:44,757 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:15:44,758 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 128) |
|
(position_embeddings): Embedding(512, 128) |
|
(token_type_embeddings): Embedding(2, 128) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-1): 2 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=128, out_features=128, bias=True) |
|
(key): Linear(in_features=128, out_features=128, bias=True) |
|
(value): Linear(in_features=128, out_features=128, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=128, out_features=512, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=512, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=128, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-18 18:15:44,758 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:15:44,758 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-18 18:15:44,758 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:15:44,758 Train: 3575 sentences |
|
2023-10-18 18:15:44,758 (train_with_dev=False, train_with_test=False) |
|
2023-10-18 18:15:44,758 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:15:44,758 Training Params: |
|
2023-10-18 18:15:44,758 - learning_rate: "5e-05" |
|
2023-10-18 18:15:44,758 - mini_batch_size: "4" |
|
2023-10-18 18:15:44,758 - max_epochs: "10" |
|
2023-10-18 18:15:44,758 - shuffle: "True" |
|
2023-10-18 18:15:44,758 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:15:44,758 Plugins: |
|
2023-10-18 18:15:44,758 - TensorboardLogger |
|
2023-10-18 18:15:44,758 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-18 18:15:44,758 ---------------------------------------------------------------------------------------------------- |
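The `lr` column in the per-iteration lines that follow is consistent with a linear warmup over the first 10% of steps, then a linear decay to zero. The sketch below reproduces those values from the logged parameters (3575 training sentences, `mini_batch_size` 4, `max_epochs` 10, `learning_rate` 5e-05, `warmup_fraction` 0.1). The function name `linear_warmup_lr` and the exact formula are assumptions made for illustration; this is not taken from the Flair `LinearScheduler` source.

```python
import math

def linear_warmup_lr(step, peak_lr, total_steps, warmup_fraction):
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)

# Values taken from the log header above.
steps_per_epoch = math.ceil(3575 / 4)   # matches "iter .../894"
total_steps = steps_per_epoch * 10      # max_epochs = 10

def lr_at(epoch, iteration):
    """Learning rate at a 1-based epoch and iteration, formatted like the log."""
    step = (epoch - 1) * steps_per_epoch + iteration
    return f"{linear_warmup_lr(step, 5e-05, total_steps, 0.1):.6f}"

print(lr_at(1, 89))    # warmup phase
print(lr_at(2, 89))    # start of decay
print(lr_at(10, 890))  # end of training
```

Under these assumptions the warmup ends exactly at the end of epoch 1, which is why the logged rate peaks at 0.000050 there and falls in every later epoch.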
|
2023-10-18 18:15:44,758 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-18 18:15:44,758 - metric: "('micro avg', 'f1-score')" |
|
2023-10-18 18:15:44,758 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:15:44,758 Computation: |
|
2023-10-18 18:15:44,758 - compute on device: cuda:0 |
|
2023-10-18 18:15:44,758 - embedding storage: none |
|
2023-10-18 18:15:44,758 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:15:44,758 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-18 18:15:44,758 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:15:44,758 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:15:44,759 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-18 18:15:46,187 epoch 1 - iter 89/894 - loss 4.25883135 - time (sec): 1.43 - samples/sec: 5802.08 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 18:15:47,561 epoch 1 - iter 178/894 - loss 3.88846431 - time (sec): 2.80 - samples/sec: 6012.25 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 18:15:48,974 epoch 1 - iter 267/894 - loss 3.39207820 - time (sec): 4.21 - samples/sec: 6314.28 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 18:15:50,370 epoch 1 - iter 356/894 - loss 2.85985897 - time (sec): 5.61 - samples/sec: 6345.58 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 18:15:51,764 epoch 1 - iter 445/894 - loss 2.42305557 - time (sec): 7.01 - samples/sec: 6428.18 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 18:15:53,024 epoch 1 - iter 534/894 - loss 2.14394411 - time (sec): 8.27 - samples/sec: 6485.91 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 18:15:54,306 epoch 1 - iter 623/894 - loss 1.93877586 - time (sec): 9.55 - samples/sec: 6461.23 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-18 18:15:55,643 epoch 1 - iter 712/894 - loss 1.78000923 - time (sec): 10.88 - samples/sec: 6416.44 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-18 18:15:57,005 epoch 1 - iter 801/894 - loss 1.66000782 - time (sec): 12.25 - samples/sec: 6339.69 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-18 18:15:58,322 epoch 1 - iter 890/894 - loss 1.54698441 - time (sec): 13.56 - samples/sec: 6357.22 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-18 18:15:58,378 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:15:58,378 EPOCH 1 done: loss 1.5432 - lr: 0.000050 |
|
2023-10-18 18:16:00,654 DEV : loss 0.3984837830066681 - f1-score (micro avg) 0.0 |
|
2023-10-18 18:16:00,682 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:16:02,073 epoch 2 - iter 89/894 - loss 0.52594792 - time (sec): 1.39 - samples/sec: 6369.10 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-18 18:16:03,433 epoch 2 - iter 178/894 - loss 0.50071493 - time (sec): 2.75 - samples/sec: 6264.97 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-18 18:16:04,866 epoch 2 - iter 267/894 - loss 0.49628263 - time (sec): 4.18 - samples/sec: 6123.09 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-18 18:16:06,248 epoch 2 - iter 356/894 - loss 0.48832046 - time (sec): 5.57 - samples/sec: 6098.08 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-18 18:16:07,641 epoch 2 - iter 445/894 - loss 0.48124232 - time (sec): 6.96 - samples/sec: 6088.49 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-18 18:16:09,013 epoch 2 - iter 534/894 - loss 0.47516643 - time (sec): 8.33 - samples/sec: 6006.41 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-18 18:16:10,390 epoch 2 - iter 623/894 - loss 0.46557034 - time (sec): 9.71 - samples/sec: 6057.11 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-18 18:16:11,800 epoch 2 - iter 712/894 - loss 0.45087699 - time (sec): 11.12 - samples/sec: 6183.99 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-18 18:16:13,191 epoch 2 - iter 801/894 - loss 0.44947313 - time (sec): 12.51 - samples/sec: 6218.03 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-18 18:16:14,558 epoch 2 - iter 890/894 - loss 0.44333084 - time (sec): 13.88 - samples/sec: 6210.45 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-18 18:16:14,616 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:16:14,616 EPOCH 2 done: loss 0.4443 - lr: 0.000044 |
|
2023-10-18 18:16:19,949 DEV : loss 0.3260151147842407 - f1-score (micro avg) 0.2507 |
|
2023-10-18 18:16:19,976 saving best model |
|
2023-10-18 18:16:20,012 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:16:21,412 epoch 3 - iter 89/894 - loss 0.37344402 - time (sec): 1.40 - samples/sec: 6354.88 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-18 18:16:22,774 epoch 3 - iter 178/894 - loss 0.35634230 - time (sec): 2.76 - samples/sec: 6244.10 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-18 18:16:24,157 epoch 3 - iter 267/894 - loss 0.36909354 - time (sec): 4.14 - samples/sec: 6261.41 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-18 18:16:25,496 epoch 3 - iter 356/894 - loss 0.37521279 - time (sec): 5.48 - samples/sec: 6213.43 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-18 18:16:26,874 epoch 3 - iter 445/894 - loss 0.37028765 - time (sec): 6.86 - samples/sec: 6242.59 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-18 18:16:28,313 epoch 3 - iter 534/894 - loss 0.36555683 - time (sec): 8.30 - samples/sec: 6357.70 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-18 18:16:29,728 epoch 3 - iter 623/894 - loss 0.37443295 - time (sec): 9.72 - samples/sec: 6334.47 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-18 18:16:31,122 epoch 3 - iter 712/894 - loss 0.36744880 - time (sec): 11.11 - samples/sec: 6287.03 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-18 18:16:32,488 epoch 3 - iter 801/894 - loss 0.36894331 - time (sec): 12.48 - samples/sec: 6224.81 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-18 18:16:33,882 epoch 3 - iter 890/894 - loss 0.36997707 - time (sec): 13.87 - samples/sec: 6210.54 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-18 18:16:33,941 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:16:33,942 EPOCH 3 done: loss 0.3694 - lr: 0.000039 |
|
2023-10-18 18:16:39,229 DEV : loss 0.3036574423313141 - f1-score (micro avg) 0.3299 |
|
2023-10-18 18:16:39,257 saving best model |
|
2023-10-18 18:16:39,295 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:16:40,691 epoch 4 - iter 89/894 - loss 0.31548673 - time (sec): 1.39 - samples/sec: 5546.15 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-18 18:16:41,893 epoch 4 - iter 178/894 - loss 0.33092983 - time (sec): 2.60 - samples/sec: 6132.05 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-18 18:16:43,135 epoch 4 - iter 267/894 - loss 0.35748682 - time (sec): 3.84 - samples/sec: 6337.31 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-18 18:16:44,610 epoch 4 - iter 356/894 - loss 0.35730649 - time (sec): 5.31 - samples/sec: 6200.84 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-18 18:16:46,065 epoch 4 - iter 445/894 - loss 0.34162987 - time (sec): 6.77 - samples/sec: 6260.27 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-18 18:16:47,448 epoch 4 - iter 534/894 - loss 0.33088873 - time (sec): 8.15 - samples/sec: 6268.61 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-18 18:16:48,891 epoch 4 - iter 623/894 - loss 0.32593421 - time (sec): 9.59 - samples/sec: 6322.04 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-18 18:16:50,274 epoch 4 - iter 712/894 - loss 0.32575031 - time (sec): 10.98 - samples/sec: 6348.62 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-18 18:16:51,707 epoch 4 - iter 801/894 - loss 0.32238412 - time (sec): 12.41 - samples/sec: 6285.42 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-18 18:16:53,074 epoch 4 - iter 890/894 - loss 0.32559948 - time (sec): 13.78 - samples/sec: 6261.46 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-18 18:16:53,135 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:16:53,135 EPOCH 4 done: loss 0.3265 - lr: 0.000033 |
|
2023-10-18 18:16:58,438 DEV : loss 0.3067420423030853 - f1-score (micro avg) 0.3352 |
|
2023-10-18 18:16:58,466 saving best model |
|
2023-10-18 18:16:58,498 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:16:59,905 epoch 5 - iter 89/894 - loss 0.31476768 - time (sec): 1.41 - samples/sec: 6326.63 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-18 18:17:01,274 epoch 5 - iter 178/894 - loss 0.29060870 - time (sec): 2.78 - samples/sec: 6049.80 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-18 18:17:02,622 epoch 5 - iter 267/894 - loss 0.29162793 - time (sec): 4.12 - samples/sec: 6349.59 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-18 18:17:04,072 epoch 5 - iter 356/894 - loss 0.29757568 - time (sec): 5.57 - samples/sec: 6274.00 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-18 18:17:05,440 epoch 5 - iter 445/894 - loss 0.29353139 - time (sec): 6.94 - samples/sec: 6306.29 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-18 18:17:06,820 epoch 5 - iter 534/894 - loss 0.29628911 - time (sec): 8.32 - samples/sec: 6270.32 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 18:17:08,186 epoch 5 - iter 623/894 - loss 0.30268124 - time (sec): 9.69 - samples/sec: 6181.30 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 18:17:09,575 epoch 5 - iter 712/894 - loss 0.30243496 - time (sec): 11.08 - samples/sec: 6145.96 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 18:17:10,966 epoch 5 - iter 801/894 - loss 0.29872617 - time (sec): 12.47 - samples/sec: 6122.41 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 18:17:12,436 epoch 5 - iter 890/894 - loss 0.30206675 - time (sec): 13.94 - samples/sec: 6192.20 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 18:17:12,491 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:17:12,491 EPOCH 5 done: loss 0.3021 - lr: 0.000028 |
|
2023-10-18 18:17:17,514 DEV : loss 0.29209402203559875 - f1-score (micro avg) 0.3512 |
|
2023-10-18 18:17:17,541 saving best model |
|
2023-10-18 18:17:17,576 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:17:18,938 epoch 6 - iter 89/894 - loss 0.28333048 - time (sec): 1.36 - samples/sec: 5836.96 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 18:17:20,633 epoch 6 - iter 178/894 - loss 0.25675668 - time (sec): 3.06 - samples/sec: 5679.16 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 18:17:21,994 epoch 6 - iter 267/894 - loss 0.25036742 - time (sec): 4.42 - samples/sec: 5706.85 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 18:17:23,384 epoch 6 - iter 356/894 - loss 0.26571766 - time (sec): 5.81 - samples/sec: 5964.57 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 18:17:24,788 epoch 6 - iter 445/894 - loss 0.26798149 - time (sec): 7.21 - samples/sec: 6082.58 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 18:17:26,140 epoch 6 - iter 534/894 - loss 0.27319482 - time (sec): 8.56 - samples/sec: 6074.22 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 18:17:27,493 epoch 6 - iter 623/894 - loss 0.27136640 - time (sec): 9.92 - samples/sec: 6064.28 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 18:17:28,873 epoch 6 - iter 712/894 - loss 0.27289306 - time (sec): 11.30 - samples/sec: 6174.47 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 18:17:30,260 epoch 6 - iter 801/894 - loss 0.27008042 - time (sec): 12.68 - samples/sec: 6133.47 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 18:17:31,622 epoch 6 - iter 890/894 - loss 0.28116086 - time (sec): 14.04 - samples/sec: 6138.46 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 18:17:31,678 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:17:31,678 EPOCH 6 done: loss 0.2811 - lr: 0.000022 |
|
2023-10-18 18:17:36,730 DEV : loss 0.2932937443256378 - f1-score (micro avg) 0.3493 |
|
2023-10-18 18:17:36,758 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:17:38,169 epoch 7 - iter 89/894 - loss 0.22158974 - time (sec): 1.41 - samples/sec: 6593.41 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 18:17:39,537 epoch 7 - iter 178/894 - loss 0.25346068 - time (sec): 2.78 - samples/sec: 6309.08 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 18:17:41,035 epoch 7 - iter 267/894 - loss 0.27860276 - time (sec): 4.28 - samples/sec: 6398.57 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 18:17:42,457 epoch 7 - iter 356/894 - loss 0.27970492 - time (sec): 5.70 - samples/sec: 6335.40 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 18:17:43,845 epoch 7 - iter 445/894 - loss 0.27912838 - time (sec): 7.09 - samples/sec: 6296.72 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 18:17:45,223 epoch 7 - iter 534/894 - loss 0.27591335 - time (sec): 8.46 - samples/sec: 6216.40 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 18:17:46,594 epoch 7 - iter 623/894 - loss 0.26892576 - time (sec): 9.84 - samples/sec: 6180.69 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 18:17:47,986 epoch 7 - iter 712/894 - loss 0.26882585 - time (sec): 11.23 - samples/sec: 6174.19 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 18:17:49,380 epoch 7 - iter 801/894 - loss 0.26356169 - time (sec): 12.62 - samples/sec: 6177.37 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 18:17:50,736 epoch 7 - iter 890/894 - loss 0.26609943 - time (sec): 13.98 - samples/sec: 6167.03 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 18:17:50,794 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:17:50,794 EPOCH 7 done: loss 0.2667 - lr: 0.000017 |
|
2023-10-18 18:17:56,124 DEV : loss 0.3029758632183075 - f1-score (micro avg) 0.3684 |
|
2023-10-18 18:17:56,153 saving best model |
|
2023-10-18 18:17:56,190 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:17:57,395 epoch 8 - iter 89/894 - loss 0.25478181 - time (sec): 1.20 - samples/sec: 7916.80 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 18:17:58,758 epoch 8 - iter 178/894 - loss 0.24482678 - time (sec): 2.57 - samples/sec: 6943.02 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 18:18:00,134 epoch 8 - iter 267/894 - loss 0.25577237 - time (sec): 3.94 - samples/sec: 6715.16 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 18:18:01,491 epoch 8 - iter 356/894 - loss 0.26327040 - time (sec): 5.30 - samples/sec: 6517.93 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 18:18:02,866 epoch 8 - iter 445/894 - loss 0.26619424 - time (sec): 6.68 - samples/sec: 6430.84 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 18:18:04,249 epoch 8 - iter 534/894 - loss 0.26295140 - time (sec): 8.06 - samples/sec: 6363.38 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 18:18:05,610 epoch 8 - iter 623/894 - loss 0.25983363 - time (sec): 9.42 - samples/sec: 6292.77 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 18:18:07,002 epoch 8 - iter 712/894 - loss 0.25819637 - time (sec): 10.81 - samples/sec: 6295.11 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 18:18:08,360 epoch 8 - iter 801/894 - loss 0.25250941 - time (sec): 12.17 - samples/sec: 6291.86 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 18:18:09,708 epoch 8 - iter 890/894 - loss 0.25517432 - time (sec): 13.52 - samples/sec: 6368.90 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 18:18:09,767 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:18:09,768 EPOCH 8 done: loss 0.2542 - lr: 0.000011 |
|
2023-10-18 18:18:15,190 DEV : loss 0.30212923884391785 - f1-score (micro avg) 0.3703 |
|
2023-10-18 18:18:15,218 saving best model |
|
2023-10-18 18:18:15,254 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:18:16,676 epoch 9 - iter 89/894 - loss 0.22003771 - time (sec): 1.42 - samples/sec: 5794.02 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 18:18:18,054 epoch 9 - iter 178/894 - loss 0.23577347 - time (sec): 2.80 - samples/sec: 5592.59 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 18:18:19,450 epoch 9 - iter 267/894 - loss 0.23253033 - time (sec): 4.19 - samples/sec: 5831.48 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 18:18:20,847 epoch 9 - iter 356/894 - loss 0.24026224 - time (sec): 5.59 - samples/sec: 5862.72 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 18:18:22,278 epoch 9 - iter 445/894 - loss 0.24192239 - time (sec): 7.02 - samples/sec: 5991.87 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 18:18:23,699 epoch 9 - iter 534/894 - loss 0.24368213 - time (sec): 8.44 - samples/sec: 6069.84 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 18:18:25,087 epoch 9 - iter 623/894 - loss 0.23891496 - time (sec): 9.83 - samples/sec: 6082.71 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 18:18:26,427 epoch 9 - iter 712/894 - loss 0.23933687 - time (sec): 11.17 - samples/sec: 6076.22 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 18:18:27,723 epoch 9 - iter 801/894 - loss 0.24261585 - time (sec): 12.47 - samples/sec: 6219.89 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 18:18:29,148 epoch 9 - iter 890/894 - loss 0.24459348 - time (sec): 13.89 - samples/sec: 6213.34 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 18:18:29,209 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:18:29,209 EPOCH 9 done: loss 0.2449 - lr: 0.000006 |
|
2023-10-18 18:18:34,561 DEV : loss 0.3134533762931824 - f1-score (micro avg) 0.3652 |
|
2023-10-18 18:18:34,588 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:18:35,969 epoch 10 - iter 89/894 - loss 0.28204273 - time (sec): 1.38 - samples/sec: 6010.18 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 18:18:37,357 epoch 10 - iter 178/894 - loss 0.25940032 - time (sec): 2.77 - samples/sec: 5955.00 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 18:18:38,751 epoch 10 - iter 267/894 - loss 0.24066822 - time (sec): 4.16 - samples/sec: 6023.31 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 18:18:40,128 epoch 10 - iter 356/894 - loss 0.23618971 - time (sec): 5.54 - samples/sec: 5981.87 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 18:18:41,496 epoch 10 - iter 445/894 - loss 0.24428962 - time (sec): 6.91 - samples/sec: 5907.36 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 18:18:42,898 epoch 10 - iter 534/894 - loss 0.23708175 - time (sec): 8.31 - samples/sec: 5973.97 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 18:18:44,191 epoch 10 - iter 623/894 - loss 0.23826333 - time (sec): 9.60 - samples/sec: 6039.64 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 18:18:45,476 epoch 10 - iter 712/894 - loss 0.24098653 - time (sec): 10.89 - samples/sec: 6299.71 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 18:18:46,713 epoch 10 - iter 801/894 - loss 0.24280119 - time (sec): 12.12 - samples/sec: 6351.83 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 18:18:47,960 epoch 10 - iter 890/894 - loss 0.24048726 - time (sec): 13.37 - samples/sec: 6433.47 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-18 18:18:48,018 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:18:48,018 EPOCH 10 done: loss 0.2407 - lr: 0.000000 |
|
2023-10-18 18:18:53,067 DEV : loss 0.3089354932308197 - f1-score (micro avg) 0.3602 |
|
2023-10-18 18:18:53,126 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 18:18:53,126 Loading model from best epoch ... |
|
2023-10-18 18:18:53,208 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-18 18:18:55,501 |
|
Results: |
|
- F-score (micro) 0.3689 |
|
- F-score (macro) 0.2044 |
|
- Accuracy 0.2367 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         loc     0.4930    0.5906    0.5374       596 |
|
        pers     0.1657    0.2462    0.1981       333 |
|
         org     1.0000    0.0076    0.0150       132 |
|
        time     0.3438    0.2245    0.2716        49 |
|
        prod     0.0000    0.0000    0.0000        66 |
|
|
|
   micro avg     0.3591    0.3793    0.3689      1176 |
|
   macro avg     0.4005    0.2138    0.2044      1176 |
|
weighted avg     0.4233    0.3793    0.3414      1176 |
|
|
|
2023-10-18 18:18:55,501 ---------------------------------------------------------------------------------------------------- |
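As a consistency check on the results table, the micro-averaged F-score should equal the harmonic mean of the micro precision and recall, and the macro F-score should equal the unweighted mean of the per-class F1 values. The sketch below recomputes both from the rounded figures in the log. The variable names are illustrative, and small differences in the last digit are expected because the inputs are already rounded to four decimals.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Per-class F1 values copied from the "By class" table.
per_class_f1 = {
    "loc": 0.5374,
    "pers": 0.1981,
    "org": 0.0150,
    "time": 0.2716,
    "prod": 0.0000,
}

# Micro average: computed from the pooled precision and recall row.
micro_f1 = f1(0.3591, 0.3793)

# Macro average: plain mean over classes, ignoring support.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(f"micro F1 = {micro_f1:.4f}")
print(f"macro F1 = {macro_f1:.4f}")
```

The gap between the two averages comes mostly from `org` and `prod`: each counts as a full fifth of the macro score despite near-zero F1, while the micro score is dominated by the 596 `loc` and 333 `pers` entities.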
|
|