2023-10-18 18:24:23,192 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 Train:  3575 sentences
2023-10-18 18:24:23,193         (train_with_dev=False, train_with_test=False)
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 Training Params:
2023-10-18 18:24:23,193  - learning_rate: "3e-05"
2023-10-18 18:24:23,193  - mini_batch_size: "4"
2023-10-18 18:24:23,193  - max_epochs: "10"
2023-10-18 18:24:23,193  - shuffle: "True"
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 Plugins:
2023-10-18 18:24:23,193  - TensorboardLogger
2023-10-18 18:24:23,193  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
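The LinearScheduler plugin with warmup_fraction '0.1' accounts for the lr column in the iteration logs that follow: the learning rate climbs linearly to the 3e-05 peak over the first 10% of the 894 × 10 = 8940 total mini-batch steps, then decays linearly to zero. A minimal sketch of that shape, assuming the standard transformers-style linear warmup/decay (the helper name is illustrative, not a Flair API):

```python
def linear_warmup_lr(step: int, total_steps: int, peak_lr: float,
                     warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 894 * 10  # 894 mini-batches per epoch, 10 epochs

# Step 89 (early in epoch 1) is still in warmup; step 894 is the peak.
print(f"{linear_warmup_lr(89, total, 3e-05):.6f}")   # matches the first logged lr
print(f"{linear_warmup_lr(894, total, 3e-05):.6f}")  # matches the peak lr
```

With these parameters the schedule reproduces the logged values: roughly 0.000003 at iteration 89 of epoch 1, 0.000030 at the end of epoch 1, and 0.000000 by the last iterations of epoch 10.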
|
2023-10-18 18:24:23,193 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:24:23,193  - metric: "('micro avg', 'f1-score')"
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 Computation:
2023-10-18 18:24:23,193  - compute on device: cuda:0
2023-10-18 18:24:23,193  - embedding storage: none
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,194 Logging anything other than scalars to TensorBoard is currently not supported.
|
2023-10-18 18:24:24,611 epoch 1 - iter 89/894 - loss 3.44377310 - time (sec): 1.42 - samples/sec: 6732.15 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:24:26,013 epoch 1 - iter 178/894 - loss 3.32507717 - time (sec): 2.82 - samples/sec: 6559.53 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:24:27,420 epoch 1 - iter 267/894 - loss 3.03650720 - time (sec): 4.23 - samples/sec: 6473.93 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:24:28,797 epoch 1 - iter 356/894 - loss 2.71169589 - time (sec): 5.60 - samples/sec: 6340.06 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:24:30,202 epoch 1 - iter 445/894 - loss 2.36182152 - time (sec): 7.01 - samples/sec: 6333.75 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:24:31,595 epoch 1 - iter 534/894 - loss 2.09769477 - time (sec): 8.40 - samples/sec: 6275.33 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:24:32,979 epoch 1 - iter 623/894 - loss 1.90205848 - time (sec): 9.79 - samples/sec: 6265.56 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:24:34,360 epoch 1 - iter 712/894 - loss 1.74916876 - time (sec): 11.17 - samples/sec: 6233.53 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:24:35,679 epoch 1 - iter 801/894 - loss 1.63097149 - time (sec): 12.49 - samples/sec: 6242.13 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:24:36,995 epoch 1 - iter 890/894 - loss 1.53628366 - time (sec): 13.80 - samples/sec: 6237.65 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:24:37,062 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:37,062 EPOCH 1 done: loss 1.5305 - lr: 0.000030
2023-10-18 18:24:39,256 DEV : loss 0.47118452191352844 - f1-score (micro avg)  0.0
2023-10-18 18:24:39,282 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:40,513 epoch 2 - iter 89/894 - loss 0.58579553 - time (sec): 1.23 - samples/sec: 7877.62 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:24:41,931 epoch 2 - iter 178/894 - loss 0.57032839 - time (sec): 2.65 - samples/sec: 7060.10 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:24:43,315 epoch 2 - iter 267/894 - loss 0.54052276 - time (sec): 4.03 - samples/sec: 6714.69 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:24:44,702 epoch 2 - iter 356/894 - loss 0.51251248 - time (sec): 5.42 - samples/sec: 6568.34 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:24:46,091 epoch 2 - iter 445/894 - loss 0.50579319 - time (sec): 6.81 - samples/sec: 6390.32 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:24:47,483 epoch 2 - iter 534/894 - loss 0.49805171 - time (sec): 8.20 - samples/sec: 6308.43 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:24:48,888 epoch 2 - iter 623/894 - loss 0.49569052 - time (sec): 9.61 - samples/sec: 6336.78 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:24:50,296 epoch 2 - iter 712/894 - loss 0.49586035 - time (sec): 11.01 - samples/sec: 6311.45 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:24:51,661 epoch 2 - iter 801/894 - loss 0.49461561 - time (sec): 12.38 - samples/sec: 6282.93 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:24:53,027 epoch 2 - iter 890/894 - loss 0.49190223 - time (sec): 13.74 - samples/sec: 6277.14 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:24:53,086 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:53,086 EPOCH 2 done: loss 0.4922 - lr: 0.000027
2023-10-18 18:24:58,310 DEV : loss 0.35969656705856323 - f1-score (micro avg)  0.0576
2023-10-18 18:24:58,336 saving best model
2023-10-18 18:24:58,373 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:59,774 epoch 3 - iter 89/894 - loss 0.45189352 - time (sec): 1.40 - samples/sec: 6287.24 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:25:01,047 epoch 3 - iter 178/894 - loss 0.45925371 - time (sec): 2.67 - samples/sec: 6402.24 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:25:02,416 epoch 3 - iter 267/894 - loss 0.44356716 - time (sec): 4.04 - samples/sec: 6398.43 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:25:03,800 epoch 3 - iter 356/894 - loss 0.42841623 - time (sec): 5.43 - samples/sec: 6271.91 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:25:05,217 epoch 3 - iter 445/894 - loss 0.43422146 - time (sec): 6.84 - samples/sec: 6278.06 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:25:06,578 epoch 3 - iter 534/894 - loss 0.42632128 - time (sec): 8.20 - samples/sec: 6239.49 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:25:07,998 epoch 3 - iter 623/894 - loss 0.42026616 - time (sec): 9.62 - samples/sec: 6325.99 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:25:09,371 epoch 3 - iter 712/894 - loss 0.41871103 - time (sec): 11.00 - samples/sec: 6274.20 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:25:10,805 epoch 3 - iter 801/894 - loss 0.41988152 - time (sec): 12.43 - samples/sec: 6266.34 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:25:12,173 epoch 3 - iter 890/894 - loss 0.41575753 - time (sec): 13.80 - samples/sec: 6252.09 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:25:12,233 ----------------------------------------------------------------------------------------------------
2023-10-18 18:25:12,233 EPOCH 3 done: loss 0.4158 - lr: 0.000023
2023-10-18 18:25:17,416 DEV : loss 0.328664094209671 - f1-score (micro avg)  0.2589
2023-10-18 18:25:17,442 saving best model
2023-10-18 18:25:17,476 ----------------------------------------------------------------------------------------------------
2023-10-18 18:25:18,887 epoch 4 - iter 89/894 - loss 0.36856508 - time (sec): 1.41 - samples/sec: 6414.14 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:25:20,308 epoch 4 - iter 178/894 - loss 0.36110083 - time (sec): 2.83 - samples/sec: 6278.32 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:25:21,721 epoch 4 - iter 267/894 - loss 0.36322010 - time (sec): 4.24 - samples/sec: 6454.81 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:25:23,115 epoch 4 - iter 356/894 - loss 0.37579106 - time (sec): 5.64 - samples/sec: 6440.42 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:25:24,499 epoch 4 - iter 445/894 - loss 0.37427337 - time (sec): 7.02 - samples/sec: 6284.65 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:25:25,951 epoch 4 - iter 534/894 - loss 0.37959096 - time (sec): 8.47 - samples/sec: 6292.40 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:25:27,329 epoch 4 - iter 623/894 - loss 0.38174423 - time (sec): 9.85 - samples/sec: 6236.47 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:25:28,726 epoch 4 - iter 712/894 - loss 0.38408338 - time (sec): 11.25 - samples/sec: 6196.42 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:25:30,100 epoch 4 - iter 801/894 - loss 0.38386136 - time (sec): 12.62 - samples/sec: 6167.78 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:25:31,481 epoch 4 - iter 890/894 - loss 0.38339230 - time (sec): 14.00 - samples/sec: 6153.34 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:25:31,546 ----------------------------------------------------------------------------------------------------
2023-10-18 18:25:31,546 EPOCH 4 done: loss 0.3834 - lr: 0.000020
2023-10-18 18:25:36,818 DEV : loss 0.3191424310207367 - f1-score (micro avg)  0.2918
2023-10-18 18:25:36,845 saving best model
2023-10-18 18:25:36,886 ----------------------------------------------------------------------------------------------------
2023-10-18 18:25:38,253 epoch 5 - iter 89/894 - loss 0.41005231 - time (sec): 1.37 - samples/sec: 5681.44 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:25:39,623 epoch 5 - iter 178/894 - loss 0.37709380 - time (sec): 2.74 - samples/sec: 5787.17 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:25:41,011 epoch 5 - iter 267/894 - loss 0.37250446 - time (sec): 4.12 - samples/sec: 5878.04 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:25:42,478 epoch 5 - iter 356/894 - loss 0.35129430 - time (sec): 5.59 - samples/sec: 6049.95 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:25:43,913 epoch 5 - iter 445/894 - loss 0.34544567 - time (sec): 7.03 - samples/sec: 6161.99 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:25:45,368 epoch 5 - iter 534/894 - loss 0.34722635 - time (sec): 8.48 - samples/sec: 6210.15 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:25:46,735 epoch 5 - iter 623/894 - loss 0.34754497 - time (sec): 9.85 - samples/sec: 6190.05 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:25:48,093 epoch 5 - iter 712/894 - loss 0.35160168 - time (sec): 11.21 - samples/sec: 6172.18 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:25:49,446 epoch 5 - iter 801/894 - loss 0.35284732 - time (sec): 12.56 - samples/sec: 6183.91 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:25:50,843 epoch 5 - iter 890/894 - loss 0.35750849 - time (sec): 13.96 - samples/sec: 6181.58 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:25:50,902 ----------------------------------------------------------------------------------------------------
2023-10-18 18:25:50,902 EPOCH 5 done: loss 0.3570 - lr: 0.000017
2023-10-18 18:25:55,876 DEV : loss 0.31827524304389954 - f1-score (micro avg)  0.3341
2023-10-18 18:25:55,904 saving best model
2023-10-18 18:25:55,939 ----------------------------------------------------------------------------------------------------
2023-10-18 18:25:57,667 epoch 6 - iter 89/894 - loss 0.32626758 - time (sec): 1.73 - samples/sec: 4966.40 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:25:59,055 epoch 6 - iter 178/894 - loss 0.34440906 - time (sec): 3.12 - samples/sec: 5573.96 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:26:00,471 epoch 6 - iter 267/894 - loss 0.34248903 - time (sec): 4.53 - samples/sec: 6008.91 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:26:01,873 epoch 6 - iter 356/894 - loss 0.34427275 - time (sec): 5.93 - samples/sec: 5992.98 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:26:03,277 epoch 6 - iter 445/894 - loss 0.34981341 - time (sec): 7.34 - samples/sec: 5988.96 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:26:04,635 epoch 6 - iter 534/894 - loss 0.34619955 - time (sec): 8.70 - samples/sec: 5929.90 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:26:06,023 epoch 6 - iter 623/894 - loss 0.34619718 - time (sec): 10.08 - samples/sec: 5980.05 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:26:07,446 epoch 6 - iter 712/894 - loss 0.34673980 - time (sec): 11.51 - samples/sec: 5974.78 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:26:08,831 epoch 6 - iter 801/894 - loss 0.34705892 - time (sec): 12.89 - samples/sec: 6008.23 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:26:10,230 epoch 6 - iter 890/894 - loss 0.34088003 - time (sec): 14.29 - samples/sec: 6032.22 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:26:10,287 ----------------------------------------------------------------------------------------------------
2023-10-18 18:26:10,287 EPOCH 6 done: loss 0.3400 - lr: 0.000013
2023-10-18 18:26:15,258 DEV : loss 0.3164794147014618 - f1-score (micro avg)  0.3395
2023-10-18 18:26:15,285 saving best model
2023-10-18 18:26:15,325 ----------------------------------------------------------------------------------------------------
2023-10-18 18:26:16,771 epoch 7 - iter 89/894 - loss 0.34476077 - time (sec): 1.45 - samples/sec: 5879.83 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:26:18,127 epoch 7 - iter 178/894 - loss 0.32276911 - time (sec): 2.80 - samples/sec: 5986.68 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:26:19,496 epoch 7 - iter 267/894 - loss 0.32152674 - time (sec): 4.17 - samples/sec: 5955.22 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:26:20,911 epoch 7 - iter 356/894 - loss 0.32068103 - time (sec): 5.59 - samples/sec: 6115.23 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:26:22,310 epoch 7 - iter 445/894 - loss 0.32162727 - time (sec): 6.98 - samples/sec: 6156.14 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:26:23,729 epoch 7 - iter 534/894 - loss 0.31705506 - time (sec): 8.40 - samples/sec: 6170.93 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:26:25,113 epoch 7 - iter 623/894 - loss 0.32675305 - time (sec): 9.79 - samples/sec: 6179.87 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:26:26,522 epoch 7 - iter 712/894 - loss 0.32578820 - time (sec): 11.20 - samples/sec: 6226.74 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:26:27,895 epoch 7 - iter 801/894 - loss 0.32694328 - time (sec): 12.57 - samples/sec: 6206.43 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:26:29,268 epoch 7 - iter 890/894 - loss 0.32690021 - time (sec): 13.94 - samples/sec: 6177.73 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:26:29,329 ----------------------------------------------------------------------------------------------------
2023-10-18 18:26:29,329 EPOCH 7 done: loss 0.3260 - lr: 0.000010
2023-10-18 18:26:34,643 DEV : loss 0.3100321888923645 - f1-score (micro avg)  0.3492
2023-10-18 18:26:34,671 saving best model
2023-10-18 18:26:34,709 ----------------------------------------------------------------------------------------------------
2023-10-18 18:26:36,180 epoch 8 - iter 89/894 - loss 0.33805047 - time (sec): 1.47 - samples/sec: 5813.17 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:26:37,559 epoch 8 - iter 178/894 - loss 0.32541421 - time (sec): 2.85 - samples/sec: 5960.72 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:26:38,944 epoch 8 - iter 267/894 - loss 0.33510004 - time (sec): 4.23 - samples/sec: 6116.97 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:26:40,347 epoch 8 - iter 356/894 - loss 0.33625387 - time (sec): 5.64 - samples/sec: 6157.91 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:26:41,784 epoch 8 - iter 445/894 - loss 0.33118526 - time (sec): 7.07 - samples/sec: 6120.14 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:26:43,292 epoch 8 - iter 534/894 - loss 0.32648379 - time (sec): 8.58 - samples/sec: 6186.97 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:26:44,700 epoch 8 - iter 623/894 - loss 0.32349710 - time (sec): 9.99 - samples/sec: 6109.44 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:26:46,131 epoch 8 - iter 712/894 - loss 0.32373014 - time (sec): 11.42 - samples/sec: 6103.37 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:26:47,564 epoch 8 - iter 801/894 - loss 0.31732168 - time (sec): 12.85 - samples/sec: 6139.46 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:26:48,931 epoch 8 - iter 890/894 - loss 0.32090107 - time (sec): 14.22 - samples/sec: 6063.28 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:26:48,988 ----------------------------------------------------------------------------------------------------
2023-10-18 18:26:48,988 EPOCH 8 done: loss 0.3205 - lr: 0.000007
2023-10-18 18:26:54,256 DEV : loss 0.30898723006248474 - f1-score (micro avg)  0.3486
2023-10-18 18:26:54,283 ----------------------------------------------------------------------------------------------------
2023-10-18 18:26:55,650 epoch 9 - iter 89/894 - loss 0.28474499 - time (sec): 1.37 - samples/sec: 5868.01 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:26:57,145 epoch 9 - iter 178/894 - loss 0.31255629 - time (sec): 2.86 - samples/sec: 6241.57 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:26:58,541 epoch 9 - iter 267/894 - loss 0.33568219 - time (sec): 4.26 - samples/sec: 6184.45 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:26:59,975 epoch 9 - iter 356/894 - loss 0.32610043 - time (sec): 5.69 - samples/sec: 6138.41 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:27:01,404 epoch 9 - iter 445/894 - loss 0.32342253 - time (sec): 7.12 - samples/sec: 6047.66 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:27:02,888 epoch 9 - iter 534/894 - loss 0.31927800 - time (sec): 8.60 - samples/sec: 6123.51 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:27:04,266 epoch 9 - iter 623/894 - loss 0.31440826 - time (sec): 9.98 - samples/sec: 6173.64 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:27:05,656 epoch 9 - iter 712/894 - loss 0.31534195 - time (sec): 11.37 - samples/sec: 6135.91 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:27:07,025 epoch 9 - iter 801/894 - loss 0.31358640 - time (sec): 12.74 - samples/sec: 6146.45 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:27:08,474 epoch 9 - iter 890/894 - loss 0.31067410 - time (sec): 14.19 - samples/sec: 6084.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:27:08,534 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:08,534 EPOCH 9 done: loss 0.3110 - lr: 0.000003
2023-10-18 18:27:13,828 DEV : loss 0.3121003806591034 - f1-score (micro avg)  0.3518
2023-10-18 18:27:13,855 saving best model
2023-10-18 18:27:13,896 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:15,302 epoch 10 - iter 89/894 - loss 0.26168045 - time (sec): 1.41 - samples/sec: 6367.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:27:16,694 epoch 10 - iter 178/894 - loss 0.27101137 - time (sec): 2.80 - samples/sec: 6264.19 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:27:18,047 epoch 10 - iter 267/894 - loss 0.26561479 - time (sec): 4.15 - samples/sec: 6068.32 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:27:19,410 epoch 10 - iter 356/894 - loss 0.27933512 - time (sec): 5.51 - samples/sec: 6042.11 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:27:20,758 epoch 10 - iter 445/894 - loss 0.28638971 - time (sec): 6.86 - samples/sec: 6055.18 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:27:22,121 epoch 10 - iter 534/894 - loss 0.29135151 - time (sec): 8.22 - samples/sec: 6039.98 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:27:23,547 epoch 10 - iter 623/894 - loss 0.29451250 - time (sec): 9.65 - samples/sec: 6018.63 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:27:24,955 epoch 10 - iter 712/894 - loss 0.29782593 - time (sec): 11.06 - samples/sec: 6105.22 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:27:26,233 epoch 10 - iter 801/894 - loss 0.29562823 - time (sec): 12.34 - samples/sec: 6294.96 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:27:27,476 epoch 10 - iter 890/894 - loss 0.30303112 - time (sec): 13.58 - samples/sec: 6351.18 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:27:27,528 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:27,528 EPOCH 10 done: loss 0.3035 - lr: 0.000000
2023-10-18 18:27:32,523 DEV : loss 0.3079068958759308 - f1-score (micro avg)  0.3549
2023-10-18 18:27:32,550 saving best model
2023-10-18 18:27:32,611 ----------------------------------------------------------------------------------------------------
|
2023-10-18 18:27:32,612 Loading model from best epoch ...
2023-10-18 18:27:32,693 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
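The 21-tag dictionary is the BIOES encoding of the five HIPE-2020 entity types: each type gets a Single, Begin, End, and Inside variant, plus the shared outside tag "O" (which is also why the tagger's final linear layer has out_features=21). A quick reconstruction (variable names are illustrative):

```python
# BIOES tag set for the five entity types, in the order the tagger prints them.
entity_types = ["loc", "pers", "org", "prod", "time"]
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 21, matching the linear layer's output dimension
```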
|
2023-10-18 18:27:35,046
Results:
- F-score (micro) 0.3371
- F-score (macro) 0.1334
- Accuracy 0.2144

By class:
              precision    recall  f1-score   support

         loc     0.4716    0.5436    0.5051       596
        pers     0.1452    0.1832    0.1620       333
         org     0.0000    0.0000    0.0000       132
        prod     0.0000    0.0000    0.0000        66
        time     0.0000    0.0000    0.0000        49

   micro avg     0.3475    0.3274    0.3371      1176
   macro avg     0.1234    0.1454    0.1334      1176
weighted avg     0.2801    0.3274    0.3018      1176
|
2023-10-18 18:27:35,047 ----------------------------------------------------------------------------------------------------