|
2023-10-18 17:36:25,474 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:36:25,474 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 128) |
|
(position_embeddings): Embedding(512, 128) |
|
(token_type_embeddings): Embedding(2, 128) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-1): 2 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=128, out_features=128, bias=True) |
|
(key): Linear(in_features=128, out_features=128, bias=True) |
|
(value): Linear(in_features=128, out_features=128, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=128, out_features=512, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=512, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=128, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-18 17:36:25,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:36:25,475 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-18 17:36:25,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:36:25,475 Train: 3575 sentences |
|
2023-10-18 17:36:25,475 (train_with_dev=False, train_with_test=False) |
|
2023-10-18 17:36:25,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:36:25,475 Training Params: |
|
2023-10-18 17:36:25,475 - learning_rate: "3e-05" |
|
2023-10-18 17:36:25,475 - mini_batch_size: "4" |
|
2023-10-18 17:36:25,475 - max_epochs: "10" |
|
2023-10-18 17:36:25,475 - shuffle: "True" |
|
2023-10-18 17:36:25,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:36:25,475 Plugins: |
|
2023-10-18 17:36:25,475 - TensorboardLogger |
|
2023-10-18 17:36:25,475 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-18 17:36:25,475 ---------------------------------------------------------------------------------------------------- |
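The `lr` values logged per iteration in this run are consistent with linear warmup to the peak learning rate followed by linear decay to zero. The sketch below reproduces them under stated assumptions: 894 iterations per epoch (3575 training sentences at a mini-batch size of 4, rounded up), 10 epochs, and warmup over the first 10% of all steps. The function name `linear_schedule_lr` is hypothetical and is not Flair's API, and whether Flair logs the rate before or after the optimizer step is not visible from the log.

```python
import math

TRAIN_SENTENCES = 3575   # from the "Train:" line of the log
MINI_BATCH_SIZE = 4      # from "Training Params"
MAX_EPOCHS = 10
PEAK_LR = 3e-5
WARMUP_FRACTION = 0.1    # from the LinearScheduler plugin line

ITERS_PER_EPOCH = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)  # 894
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS                      # 8940


def linear_schedule_lr(step, total_steps=TOTAL_STEPS, peak_lr=PEAK_LR,
                       warmup_fraction=WARMUP_FRACTION):
    """Hypothetical helper: linear warmup, then linear decay to zero."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Ramp up proportionally to the number of steps taken so far.
        return peak_lr * step / warmup_steps
    # Decay proportionally to the number of steps still remaining.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


# Global step for "epoch 2 - iter 178/894" is 894 + 178 = 1072.
print(f"{linear_schedule_lr(89):.6f}")
print(f"{linear_schedule_lr(1072):.6f}")
```

Formatted to six decimals, these values can be compared against the `lr:` field of the corresponding log lines.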
|
2023-10-18 17:36:25,475 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-18 17:36:25,475 - metric: "('micro avg', 'f1-score')" |
|
2023-10-18 17:36:25,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:36:25,475 Computation: |
|
2023-10-18 17:36:25,475 - compute on device: cuda:0 |
|
2023-10-18 17:36:25,475 - embedding storage: none |
|
2023-10-18 17:36:25,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:36:25,475 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-18 17:36:25,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:36:25,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:36:25,476 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-18 17:36:27,488 epoch 1 - iter 89/894 - loss 3.60685157 - time (sec): 2.01 - samples/sec: 4119.17 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 17:36:28,899 epoch 1 - iter 178/894 - loss 3.43579194 - time (sec): 3.42 - samples/sec: 4885.46 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 17:36:30,247 epoch 1 - iter 267/894 - loss 3.14895937 - time (sec): 4.77 - samples/sec: 5303.08 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 17:36:31,459 epoch 1 - iter 356/894 - loss 2.77341929 - time (sec): 5.98 - samples/sec: 5655.16 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 17:36:32,840 epoch 1 - iter 445/894 - loss 2.39208728 - time (sec): 7.36 - samples/sec: 5861.60 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 17:36:34,162 epoch 1 - iter 534/894 - loss 2.10409346 - time (sec): 8.69 - samples/sec: 5970.64 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 17:36:35,578 epoch 1 - iter 623/894 - loss 1.90065689 - time (sec): 10.10 - samples/sec: 5975.34 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 17:36:36,982 epoch 1 - iter 712/894 - loss 1.72992901 - time (sec): 11.51 - samples/sec: 6027.23 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 17:36:38,453 epoch 1 - iter 801/894 - loss 1.60878755 - time (sec): 12.98 - samples/sec: 6010.66 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 17:36:39,851 epoch 1 - iter 890/894 - loss 1.50933559 - time (sec): 14.37 - samples/sec: 5985.93 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 17:36:39,914 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:36:39,915 EPOCH 1 done: loss 1.5042 - lr: 0.000030 |
|
2023-10-18 17:36:42,200 DEV : loss 0.4629151523113251 - f1-score (micro avg) 0.0 |
|
2023-10-18 17:36:42,223 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:36:43,588 epoch 2 - iter 89/894 - loss 0.58852432 - time (sec): 1.36 - samples/sec: 6949.95 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-18 17:36:45,030 epoch 2 - iter 178/894 - loss 0.54411825 - time (sec): 2.81 - samples/sec: 6592.57 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 17:36:46,418 epoch 2 - iter 267/894 - loss 0.54683486 - time (sec): 4.19 - samples/sec: 6584.48 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 17:36:47,767 epoch 2 - iter 356/894 - loss 0.54433138 - time (sec): 5.54 - samples/sec: 6322.95 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-18 17:36:49,142 epoch 2 - iter 445/894 - loss 0.53650942 - time (sec): 6.92 - samples/sec: 6371.03 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 17:36:50,518 epoch 2 - iter 534/894 - loss 0.52210596 - time (sec): 8.29 - samples/sec: 6303.55 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 17:36:51,893 epoch 2 - iter 623/894 - loss 0.52262297 - time (sec): 9.67 - samples/sec: 6270.43 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-18 17:36:53,298 epoch 2 - iter 712/894 - loss 0.51891633 - time (sec): 11.07 - samples/sec: 6255.01 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 17:36:54,668 epoch 2 - iter 801/894 - loss 0.51519139 - time (sec): 12.44 - samples/sec: 6258.79 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 17:36:56,134 epoch 2 - iter 890/894 - loss 0.50867215 - time (sec): 13.91 - samples/sec: 6200.35 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-18 17:36:56,191 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:36:56,191 EPOCH 2 done: loss 0.5093 - lr: 0.000027 |
|
2023-10-18 17:37:01,301 DEV : loss 0.3628327250480652 - f1-score (micro avg) 0.128 |
|
2023-10-18 17:37:01,323 saving best model |
|
2023-10-18 17:37:01,358 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:37:02,719 epoch 3 - iter 89/894 - loss 0.43219902 - time (sec): 1.36 - samples/sec: 5777.27 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 17:37:04,080 epoch 3 - iter 178/894 - loss 0.44103438 - time (sec): 2.72 - samples/sec: 6127.02 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 17:37:05,441 epoch 3 - iter 267/894 - loss 0.44023008 - time (sec): 4.08 - samples/sec: 6031.94 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-18 17:37:06,846 epoch 3 - iter 356/894 - loss 0.42765276 - time (sec): 5.49 - samples/sec: 6102.39 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 17:37:08,214 epoch 3 - iter 445/894 - loss 0.42248114 - time (sec): 6.86 - samples/sec: 6168.53 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 17:37:09,589 epoch 3 - iter 534/894 - loss 0.41947771 - time (sec): 8.23 - samples/sec: 6237.17 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-18 17:37:11,046 epoch 3 - iter 623/894 - loss 0.41258339 - time (sec): 9.69 - samples/sec: 6138.52 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 17:37:12,450 epoch 3 - iter 712/894 - loss 0.41900656 - time (sec): 11.09 - samples/sec: 6123.10 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 17:37:13,856 epoch 3 - iter 801/894 - loss 0.41744775 - time (sec): 12.50 - samples/sec: 6214.48 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-18 17:37:15,244 epoch 3 - iter 890/894 - loss 0.41728557 - time (sec): 13.89 - samples/sec: 6206.23 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 17:37:15,303 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:37:15,303 EPOCH 3 done: loss 0.4170 - lr: 0.000023 |
|
2023-10-18 17:37:20,445 DEV : loss 0.3314690589904785 - f1-score (micro avg) 0.2698 |
|
2023-10-18 17:37:20,468 saving best model |
|
2023-10-18 17:37:20,503 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:37:21,875 epoch 4 - iter 89/894 - loss 0.38232423 - time (sec): 1.37 - samples/sec: 6676.33 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 17:37:23,287 epoch 4 - iter 178/894 - loss 0.39602551 - time (sec): 2.78 - samples/sec: 6380.14 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-18 17:37:24,673 epoch 4 - iter 267/894 - loss 0.39988427 - time (sec): 4.17 - samples/sec: 6420.02 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 17:37:26,052 epoch 4 - iter 356/894 - loss 0.39339554 - time (sec): 5.55 - samples/sec: 6443.98 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 17:37:27,417 epoch 4 - iter 445/894 - loss 0.39230378 - time (sec): 6.91 - samples/sec: 6371.85 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-18 17:37:28,798 epoch 4 - iter 534/894 - loss 0.38516435 - time (sec): 8.29 - samples/sec: 6330.75 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 17:37:30,201 epoch 4 - iter 623/894 - loss 0.38031629 - time (sec): 9.70 - samples/sec: 6311.69 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 17:37:31,595 epoch 4 - iter 712/894 - loss 0.37447322 - time (sec): 11.09 - samples/sec: 6273.58 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-18 17:37:32,954 epoch 4 - iter 801/894 - loss 0.38018360 - time (sec): 12.45 - samples/sec: 6255.81 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 17:37:34,316 epoch 4 - iter 890/894 - loss 0.37844184 - time (sec): 13.81 - samples/sec: 6243.02 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 17:37:34,373 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:37:34,373 EPOCH 4 done: loss 0.3783 - lr: 0.000020 |
|
2023-10-18 17:37:39,281 DEV : loss 0.32610201835632324 - f1-score (micro avg) 0.2978 |
|
2023-10-18 17:37:39,304 saving best model |
|
2023-10-18 17:37:39,339 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:37:40,724 epoch 5 - iter 89/894 - loss 0.34035485 - time (sec): 1.38 - samples/sec: 5898.59 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-18 17:37:42,375 epoch 5 - iter 178/894 - loss 0.35772845 - time (sec): 3.04 - samples/sec: 5284.30 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 17:37:43,750 epoch 5 - iter 267/894 - loss 0.33815125 - time (sec): 4.41 - samples/sec: 5452.24 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 17:37:45,156 epoch 5 - iter 356/894 - loss 0.34272345 - time (sec): 5.82 - samples/sec: 5785.33 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-18 17:37:46,572 epoch 5 - iter 445/894 - loss 0.33976236 - time (sec): 7.23 - samples/sec: 5891.13 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 17:37:47,964 epoch 5 - iter 534/894 - loss 0.33880704 - time (sec): 8.62 - samples/sec: 6013.37 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 17:37:49,368 epoch 5 - iter 623/894 - loss 0.34084873 - time (sec): 10.03 - samples/sec: 6040.46 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-18 17:37:50,759 epoch 5 - iter 712/894 - loss 0.35022281 - time (sec): 11.42 - samples/sec: 6055.53 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 17:37:52,154 epoch 5 - iter 801/894 - loss 0.34654579 - time (sec): 12.81 - samples/sec: 6032.95 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 17:37:53,575 epoch 5 - iter 890/894 - loss 0.34927526 - time (sec): 14.23 - samples/sec: 6056.28 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-18 17:37:53,637 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:37:53,637 EPOCH 5 done: loss 0.3517 - lr: 0.000017 |
|
2023-10-18 17:37:58,511 DEV : loss 0.32136932015419006 - f1-score (micro avg) 0.3208 |
|
2023-10-18 17:37:58,534 saving best model |
|
2023-10-18 17:37:58,569 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:37:59,945 epoch 6 - iter 89/894 - loss 0.36756032 - time (sec): 1.38 - samples/sec: 6097.41 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 17:38:01,382 epoch 6 - iter 178/894 - loss 0.32745591 - time (sec): 2.81 - samples/sec: 6605.56 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 17:38:02,775 epoch 6 - iter 267/894 - loss 0.30942266 - time (sec): 4.21 - samples/sec: 6400.64 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-18 17:38:04,183 epoch 6 - iter 356/894 - loss 0.32095647 - time (sec): 5.61 - samples/sec: 6229.94 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 17:38:05,584 epoch 6 - iter 445/894 - loss 0.33483226 - time (sec): 7.01 - samples/sec: 6199.86 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 17:38:06,942 epoch 6 - iter 534/894 - loss 0.33463000 - time (sec): 8.37 - samples/sec: 6185.54 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-18 17:38:08,349 epoch 6 - iter 623/894 - loss 0.33461203 - time (sec): 9.78 - samples/sec: 6155.34 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 17:38:09,745 epoch 6 - iter 712/894 - loss 0.33184603 - time (sec): 11.18 - samples/sec: 6172.66 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 17:38:11,146 epoch 6 - iter 801/894 - loss 0.33294506 - time (sec): 12.58 - samples/sec: 6183.44 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-18 17:38:12,527 epoch 6 - iter 890/894 - loss 0.33341179 - time (sec): 13.96 - samples/sec: 6176.34 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 17:38:12,590 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:38:12,591 EPOCH 6 done: loss 0.3332 - lr: 0.000013 |
|
2023-10-18 17:38:17,742 DEV : loss 0.3205994963645935 - f1-score (micro avg) 0.3318 |
|
2023-10-18 17:38:17,764 saving best model |
|
2023-10-18 17:38:17,798 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:38:19,163 epoch 7 - iter 89/894 - loss 0.29987830 - time (sec): 1.36 - samples/sec: 6253.50 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 17:38:20,522 epoch 7 - iter 178/894 - loss 0.30154270 - time (sec): 2.72 - samples/sec: 6215.09 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-18 17:38:21,880 epoch 7 - iter 267/894 - loss 0.30844541 - time (sec): 4.08 - samples/sec: 6057.61 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 17:38:23,289 epoch 7 - iter 356/894 - loss 0.32341333 - time (sec): 5.49 - samples/sec: 6107.38 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 17:38:24,696 epoch 7 - iter 445/894 - loss 0.31465865 - time (sec): 6.90 - samples/sec: 6131.33 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-18 17:38:26,051 epoch 7 - iter 534/894 - loss 0.31193555 - time (sec): 8.25 - samples/sec: 6114.94 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 17:38:27,409 epoch 7 - iter 623/894 - loss 0.31791789 - time (sec): 9.61 - samples/sec: 6138.35 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 17:38:28,858 epoch 7 - iter 712/894 - loss 0.31733624 - time (sec): 11.06 - samples/sec: 6214.57 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-18 17:38:30,222 epoch 7 - iter 801/894 - loss 0.31813531 - time (sec): 12.42 - samples/sec: 6192.16 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 17:38:31,623 epoch 7 - iter 890/894 - loss 0.32055924 - time (sec): 13.82 - samples/sec: 6234.66 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 17:38:31,685 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:38:31,685 EPOCH 7 done: loss 0.3200 - lr: 0.000010 |
|
2023-10-18 17:38:36,835 DEV : loss 0.3110288083553314 - f1-score (micro avg) 0.3412 |
|
2023-10-18 17:38:36,860 saving best model |
|
2023-10-18 17:38:36,895 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:38:38,329 epoch 8 - iter 89/894 - loss 0.31231747 - time (sec): 1.43 - samples/sec: 5728.08 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-18 17:38:39,843 epoch 8 - iter 178/894 - loss 0.30057737 - time (sec): 2.95 - samples/sec: 5735.50 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 17:38:41,272 epoch 8 - iter 267/894 - loss 0.30029382 - time (sec): 4.38 - samples/sec: 5676.62 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 17:38:42,712 epoch 8 - iter 356/894 - loss 0.30980555 - time (sec): 5.82 - samples/sec: 5706.75 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-18 17:38:44,107 epoch 8 - iter 445/894 - loss 0.30969262 - time (sec): 7.21 - samples/sec: 5800.47 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 17:38:45,505 epoch 8 - iter 534/894 - loss 0.31959907 - time (sec): 8.61 - samples/sec: 5878.88 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 17:38:46,943 epoch 8 - iter 623/894 - loss 0.30926784 - time (sec): 10.05 - samples/sec: 6036.97 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-18 17:38:48,357 epoch 8 - iter 712/894 - loss 0.31223576 - time (sec): 11.46 - samples/sec: 6062.16 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 17:38:49,749 epoch 8 - iter 801/894 - loss 0.31292463 - time (sec): 12.85 - samples/sec: 6068.04 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 17:38:51,135 epoch 8 - iter 890/894 - loss 0.31032021 - time (sec): 14.24 - samples/sec: 6057.33 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-18 17:38:51,196 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:38:51,196 EPOCH 8 done: loss 0.3102 - lr: 0.000007 |
|
2023-10-18 17:38:56,436 DEV : loss 0.30947345495224 - f1-score (micro avg) 0.339 |
|
2023-10-18 17:38:56,458 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:38:57,843 epoch 9 - iter 89/894 - loss 0.29949212 - time (sec): 1.38 - samples/sec: 5989.48 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 17:38:59,288 epoch 9 - iter 178/894 - loss 0.29343631 - time (sec): 2.83 - samples/sec: 6029.11 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 17:39:00,671 epoch 9 - iter 267/894 - loss 0.30787930 - time (sec): 4.21 - samples/sec: 6030.45 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-18 17:39:02,046 epoch 9 - iter 356/894 - loss 0.30027561 - time (sec): 5.59 - samples/sec: 6029.94 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 17:39:03,440 epoch 9 - iter 445/894 - loss 0.29995985 - time (sec): 6.98 - samples/sec: 6025.96 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 17:39:04,791 epoch 9 - iter 534/894 - loss 0.29442790 - time (sec): 8.33 - samples/sec: 6144.19 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-18 17:39:06,097 epoch 9 - iter 623/894 - loss 0.30032066 - time (sec): 9.64 - samples/sec: 6266.70 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 17:39:07,422 epoch 9 - iter 712/894 - loss 0.30283528 - time (sec): 10.96 - samples/sec: 6231.47 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 17:39:08,823 epoch 9 - iter 801/894 - loss 0.30451824 - time (sec): 12.36 - samples/sec: 6219.24 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-18 17:39:10,226 epoch 9 - iter 890/894 - loss 0.30418948 - time (sec): 13.77 - samples/sec: 6262.83 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 17:39:10,283 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:39:10,284 EPOCH 9 done: loss 0.3038 - lr: 0.000003 |
|
2023-10-18 17:39:15,233 DEV : loss 0.3095887303352356 - f1-score (micro avg) 0.3394 |
|
2023-10-18 17:39:15,256 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:39:16,812 epoch 10 - iter 89/894 - loss 0.29817324 - time (sec): 1.56 - samples/sec: 5131.42 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 17:39:18,228 epoch 10 - iter 178/894 - loss 0.28193937 - time (sec): 2.97 - samples/sec: 5823.01 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-18 17:39:19,623 epoch 10 - iter 267/894 - loss 0.29211979 - time (sec): 4.37 - samples/sec: 5904.15 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 17:39:20,984 epoch 10 - iter 356/894 - loss 0.29996436 - time (sec): 5.73 - samples/sec: 5984.52 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 17:39:22,362 epoch 10 - iter 445/894 - loss 0.29997785 - time (sec): 7.11 - samples/sec: 6177.77 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-18 17:39:23,733 epoch 10 - iter 534/894 - loss 0.29823122 - time (sec): 8.48 - samples/sec: 6172.32 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 17:39:25,116 epoch 10 - iter 623/894 - loss 0.30457304 - time (sec): 9.86 - samples/sec: 6151.50 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 17:39:26,496 epoch 10 - iter 712/894 - loss 0.30446491 - time (sec): 11.24 - samples/sec: 6143.72 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-18 17:39:27,902 epoch 10 - iter 801/894 - loss 0.30439421 - time (sec): 12.65 - samples/sec: 6139.08 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-18 17:39:29,318 epoch 10 - iter 890/894 - loss 0.30167890 - time (sec): 14.06 - samples/sec: 6136.82 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-18 17:39:29,382 ---------------------------------------------------------------------------------------------------- |
|
2023-10-18 17:39:29,382 EPOCH 10 done: loss 0.3020 - lr: 0.000000 |
|
2023-10-18 17:39:34,569 DEV : loss 0.3073885440826416 - f1-score (micro avg) 0.3415 |
|
2023-10-18 17:39:34,595 saving best model |
|
2023-10-18 17:39:34,654 ---------------------------------------------------------------------------------------------------- |
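The `DEV` lines above carry the per-epoch development loss and micro-averaged F1 that decide when the best model is saved. A minimal sketch for pulling those values out of a log like this one, assuming the line layout stays exactly as printed here; `dev_scores` is a hypothetical helper, not part of Flair, and the sample uses three lines copied from this log.

```python
import re

# Matches the "DEV : loss <x> - f1-score (micro avg) <y>" layout seen in this log.
DEV_LINE = re.compile(
    r"DEV : loss (?P<loss>[\d.]+) - f1-score \(micro avg\) (?P<f1>[\d.]+)"
)


def dev_scores(log_text):
    """Return one (dev_loss, dev_f1) tuple per evaluated epoch, in log order."""
    return [(float(m["loss"]), float(m["f1"])) for m in DEV_LINE.finditer(log_text)]


sample = """\
2023-10-18 17:38:36,835 DEV : loss 0.3110288083553314 - f1-score (micro avg) 0.3412
2023-10-18 17:38:56,436 DEV : loss 0.30947345495224 - f1-score (micro avg) 0.339
2023-10-18 17:39:34,569 DEV : loss 0.3073885440826416 - f1-score (micro avg) 0.3415
"""

scores = dev_scores(sample)
# Position (within the sample) of the evaluation with the highest dev F1.
best_index = max(range(len(scores)), key=lambda i: scores[i][1])
```

Run over the full log, the same function would return ten tuples, one per epoch.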
|
2023-10-18 17:39:34,654 Loading model from best epoch ... |
|
2023-10-18 17:39:34,729 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-18 17:39:36,931 |
|
Results: |
|
- F-score (micro) 0.3501 |
|
- F-score (macro) 0.1397 |
|
- Accuracy 0.2236 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         loc     0.5160    0.5419    0.5286       596 |
|
        pers     0.1604    0.1802    0.1697       333 |
|
         org     0.0000    0.0000    0.0000       132 |
|
        prod     0.0000    0.0000    0.0000        66 |
|
        time     0.0000    0.0000    0.0000        49 |
|
|
|
   micro avg     0.3785    0.3257    0.3501      1176 |
|
   macro avg     0.1353    0.1444    0.1397      1176 |
|
weighted avg     0.3069    0.3257    0.3160      1176 |
|
|
|
2023-10-18 17:39:36,931 ---------------------------------------------------------------------------------------------------- |
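The aggregate rows of the report above can be cross-checked from the per-class rows. The sketch below is a rough sanity check rather than a reproduction of Flair's evaluation. It assumes F1 is the harmonic mean of precision and recall, that the macro average is the unweighted mean of the per-class F1 scores, and that the weighted average uses support as the weight. Because the inputs are the four-decimal values printed in the log, results can differ from the logged ones in the last digit.

```python
# (precision, recall, f1, support) per class, copied from the report above.
per_class = {
    "loc":  (0.5160, 0.5419, 0.5286, 596),
    "pers": (0.1604, 0.1802, 0.1697, 333),
    "org":  (0.0000, 0.0000, 0.0000, 132),
    "prod": (0.0000, 0.0000, 0.0000, 66),
    "time": (0.0000, 0.0000, 0.0000, 49),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(s for _, _, _, s in per_class.values())

# Micro F1 from the reported micro-averaged precision and recall.
micro_f1 = f1(0.3785, 0.3257)
# Macro F1: every class counts equally, so the three zero-F1 classes pull it down.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)
# Weighted F1: classes count in proportion to their support.
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total_support

print(total_support, round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
```

The gap between the micro and macro scores follows directly from `org`, `prod`, and `time` scoring zero while `loc` dominates the support.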
|
|