2023-10-18 18:07:05,092 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,092 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
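The shapes in the module summary above fully determine the model size. As a sanity check (not part of the original log), a small pure-Python sketch totals the tensors listed for this BERT-tiny backbone plus the 21-way tagging head:

```python
def linear(n_in, n_out, bias=True):
    # weight matrix plus optional bias vector
    return n_in * n_out + (n_out if bias else 0)

def layer_norm(dim):
    return 2 * dim  # scale + shift

def embedding(num, dim):
    return num * dim

d, ffn = 128, 512
bert_layer = (
    3 * linear(d, d)    # query, key, value
    + linear(d, d)      # attention output dense
    + layer_norm(d)
    + linear(d, ffn)    # intermediate
    + linear(ffn, d)    # output
    + layer_norm(d)
)
total = (
    embedding(32001, d) + embedding(512, d) + embedding(2, d)
    + layer_norm(d)
    + 2 * bert_layer    # (0-1): 2 x BertLayer
    + linear(d, d)      # pooler
    + linear(d, 21)     # tagging head, 21 tags
)
print(total)  # 4577941
```

The backbone is tiny by BERT standards (about 4.6M parameters, most of them in the 32001-entry word-embedding table), which is consistent with the high samples/sec throughput reported later in the log.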
2023-10-18 18:07:05,092 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,092 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 18:07:05,092 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,092 Train: 3575 sentences
2023-10-18 18:07:05,092 (train_with_dev=False, train_with_test=False)
2023-10-18 18:07:05,092 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,092 Training Params:
2023-10-18 18:07:05,093 - learning_rate: "3e-05"
2023-10-18 18:07:05,093 - mini_batch_size: "8"
2023-10-18 18:07:05,093 - max_epochs: "10"
2023-10-18 18:07:05,093 - shuffle: "True"
2023-10-18 18:07:05,093 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,093 Plugins:
2023-10-18 18:07:05,093 - TensorboardLogger
2023-10-18 18:07:05,093 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:07:05,093 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,093 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:07:05,093 - metric: "('micro avg', 'f1-score')"
2023-10-18 18:07:05,093 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,093 Computation:
2023-10-18 18:07:05,093 - compute on device: cuda:0
2023-10-18 18:07:05,093 - embedding storage: none
2023-10-18 18:07:05,093 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,093 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 18:07:05,093 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,093 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:05,093 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 18:07:05,891 epoch 1 - iter 44/447 - loss 3.50461292 - time (sec): 0.80 - samples/sec: 10088.17 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:07:06,695 epoch 1 - iter 88/447 - loss 3.36992254 - time (sec): 1.60 - samples/sec: 10075.15 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:07:07,556 epoch 1 - iter 132/447 - loss 3.17925117 - time (sec): 2.46 - samples/sec: 10196.84 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:07:08,433 epoch 1 - iter 176/447 - loss 2.91552427 - time (sec): 3.34 - samples/sec: 10158.79 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:07:09,438 epoch 1 - iter 220/447 - loss 2.64264091 - time (sec): 4.34 - samples/sec: 9779.21 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:07:10,433 epoch 1 - iter 264/447 - loss 2.37544260 - time (sec): 5.34 - samples/sec: 9466.40 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:07:11,401 epoch 1 - iter 308/447 - loss 2.14809863 - time (sec): 6.31 - samples/sec: 9273.60 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:07:12,414 epoch 1 - iter 352/447 - loss 1.96232764 - time (sec): 7.32 - samples/sec: 9127.34 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:07:13,473 epoch 1 - iter 396/447 - loss 1.78387650 - time (sec): 8.38 - samples/sec: 9196.65 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:07:14,484 epoch 1 - iter 440/447 - loss 1.66986552 - time (sec): 9.39 - samples/sec: 9077.85 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:07:14,643 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:14,643 EPOCH 1 done: loss 1.6561 - lr: 0.000029
2023-10-18 18:07:16,854 DEV : loss 0.47917860746383667 - f1-score (micro avg)  0.0
2023-10-18 18:07:16,878 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:17,885 epoch 2 - iter 44/447 - loss 0.59675206 - time (sec): 1.01 - samples/sec: 8748.78 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:07:18,906 epoch 2 - iter 88/447 - loss 0.57436524 - time (sec): 2.03 - samples/sec: 8496.81 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:07:19,912 epoch 2 - iter 132/447 - loss 0.57218780 - time (sec): 3.03 - samples/sec: 8338.73 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:07:20,901 epoch 2 - iter 176/447 - loss 0.57549297 - time (sec): 4.02 - samples/sec: 8262.18 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:07:21,897 epoch 2 - iter 220/447 - loss 0.57292048 - time (sec): 5.02 - samples/sec: 8299.12 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:07:22,896 epoch 2 - iter 264/447 - loss 0.56116937 - time (sec): 6.02 - samples/sec: 8342.46 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:07:23,874 epoch 2 - iter 308/447 - loss 0.56094107 - time (sec): 7.00 - samples/sec: 8327.98 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:07:24,902 epoch 2 - iter 352/447 - loss 0.55109095 - time (sec): 8.02 - samples/sec: 8331.66 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:07:25,979 epoch 2 - iter 396/447 - loss 0.54595643 - time (sec): 9.10 - samples/sec: 8433.04 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:07:26,986 epoch 2 - iter 440/447 - loss 0.54125168 - time (sec): 10.11 - samples/sec: 8421.53 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:07:27,142 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:27,142 EPOCH 2 done: loss 0.5395 - lr: 0.000027
2023-10-18 18:07:31,982 DEV : loss 0.3830968141555786 - f1-score (micro avg)  0.0046
2023-10-18 18:07:32,007 saving best model
2023-10-18 18:07:32,043 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:33,021 epoch 3 - iter 44/447 - loss 0.49082765 - time (sec): 0.98 - samples/sec: 8339.91 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:07:34,035 epoch 3 - iter 88/447 - loss 0.51096977 - time (sec): 1.99 - samples/sec: 8496.04 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:07:35,097 epoch 3 - iter 132/447 - loss 0.49344824 - time (sec): 3.05 - samples/sec: 8568.51 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:07:36,123 epoch 3 - iter 176/447 - loss 0.49144506 - time (sec): 4.08 - samples/sec: 8540.93 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:07:37,170 epoch 3 - iter 220/447 - loss 0.47976846 - time (sec): 5.13 - samples/sec: 8459.17 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:07:38,209 epoch 3 - iter 264/447 - loss 0.46171274 - time (sec): 6.17 - samples/sec: 8428.96 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:07:39,280 epoch 3 - iter 308/447 - loss 0.45866907 - time (sec): 7.24 - samples/sec: 8351.11 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:07:40,298 epoch 3 - iter 352/447 - loss 0.45353460 - time (sec): 8.25 - samples/sec: 8282.42 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:07:41,314 epoch 3 - iter 396/447 - loss 0.45002822 - time (sec): 9.27 - samples/sec: 8318.39 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:07:42,298 epoch 3 - iter 440/447 - loss 0.45039203 - time (sec): 10.26 - samples/sec: 8305.69 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:07:42,459 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:42,460 EPOCH 3 done: loss 0.4492 - lr: 0.000023
2023-10-18 18:07:47,591 DEV : loss 0.35753190517425537 - f1-score (micro avg)  0.0929
2023-10-18 18:07:47,616 saving best model
2023-10-18 18:07:47,647 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:48,651 epoch 4 - iter 44/447 - loss 0.42225951 - time (sec): 1.00 - samples/sec: 8473.16 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:07:49,655 epoch 4 - iter 88/447 - loss 0.44058185 - time (sec): 2.01 - samples/sec: 8560.44 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:07:50,648 epoch 4 - iter 132/447 - loss 0.43632193 - time (sec): 3.00 - samples/sec: 8600.18 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:07:51,725 epoch 4 - iter 176/447 - loss 0.42490939 - time (sec): 4.08 - samples/sec: 8563.05 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:07:52,835 epoch 4 - iter 220/447 - loss 0.42362563 - time (sec): 5.19 - samples/sec: 8595.40 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:07:53,870 epoch 4 - iter 264/447 - loss 0.42543379 - time (sec): 6.22 - samples/sec: 8444.87 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:07:54,934 epoch 4 - iter 308/447 - loss 0.41859725 - time (sec): 7.29 - samples/sec: 8347.96 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:07:55,958 epoch 4 - iter 352/447 - loss 0.42123722 - time (sec): 8.31 - samples/sec: 8286.21 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:07:57,002 epoch 4 - iter 396/447 - loss 0.42071393 - time (sec): 9.35 - samples/sec: 8201.04 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:07:58,069 epoch 4 - iter 440/447 - loss 0.41441893 - time (sec): 10.42 - samples/sec: 8196.17 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:07:58,228 ----------------------------------------------------------------------------------------------------
2023-10-18 18:07:58,229 EPOCH 4 done: loss 0.4146 - lr: 0.000020
2023-10-18 18:08:03,384 DEV : loss 0.3443821668624878 - f1-score (micro avg)  0.2113
2023-10-18 18:08:03,411 saving best model
2023-10-18 18:08:03,442 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:04,547 epoch 5 - iter 44/447 - loss 0.38157833 - time (sec): 1.10 - samples/sec: 7597.16 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:08:05,639 epoch 5 - iter 88/447 - loss 0.38080177 - time (sec): 2.20 - samples/sec: 8217.75 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:08:06,677 epoch 5 - iter 132/447 - loss 0.38920520 - time (sec): 3.23 - samples/sec: 8330.35 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:08:07,682 epoch 5 - iter 176/447 - loss 0.38415499 - time (sec): 4.24 - samples/sec: 8330.93 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:08:08,668 epoch 5 - iter 220/447 - loss 0.39229553 - time (sec): 5.23 - samples/sec: 8266.94 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:08:09,712 epoch 5 - iter 264/447 - loss 0.38988800 - time (sec): 6.27 - samples/sec: 8187.04 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:08:10,751 epoch 5 - iter 308/447 - loss 0.39386531 - time (sec): 7.31 - samples/sec: 8197.79 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:08:11,774 epoch 5 - iter 352/447 - loss 0.39212543 - time (sec): 8.33 - samples/sec: 8281.85 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:08:12,788 epoch 5 - iter 396/447 - loss 0.39073713 - time (sec): 9.35 - samples/sec: 8287.81 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:08:13,784 epoch 5 - iter 440/447 - loss 0.39025883 - time (sec): 10.34 - samples/sec: 8254.89 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:08:13,937 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:13,938 EPOCH 5 done: loss 0.3880 - lr: 0.000017
2023-10-18 18:08:19,214 DEV : loss 0.32710015773773193 - f1-score (micro avg)  0.2681
2023-10-18 18:08:19,239 saving best model
2023-10-18 18:08:19,269 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:20,299 epoch 6 - iter 44/447 - loss 0.32138402 - time (sec): 1.03 - samples/sec: 8948.29 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:08:21,338 epoch 6 - iter 88/447 - loss 0.35939093 - time (sec): 2.07 - samples/sec: 8475.70 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:08:22,396 epoch 6 - iter 132/447 - loss 0.38072210 - time (sec): 3.13 - samples/sec: 8262.49 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:08:23,491 epoch 6 - iter 176/447 - loss 0.38219611 - time (sec): 4.22 - samples/sec: 8121.81 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:08:24,532 epoch 6 - iter 220/447 - loss 0.38590948 - time (sec): 5.26 - samples/sec: 8195.23 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:08:25,609 epoch 6 - iter 264/447 - loss 0.38458419 - time (sec): 6.34 - samples/sec: 8136.28 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:08:26,669 epoch 6 - iter 308/447 - loss 0.37526689 - time (sec): 7.40 - samples/sec: 8061.49 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:08:27,702 epoch 6 - iter 352/447 - loss 0.37111345 - time (sec): 8.43 - samples/sec: 8062.46 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:08:28,769 epoch 6 - iter 396/447 - loss 0.36328385 - time (sec): 9.50 - samples/sec: 8096.07 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:08:29,786 epoch 6 - iter 440/447 - loss 0.36654173 - time (sec): 10.52 - samples/sec: 8084.61 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:08:29,946 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:29,946 EPOCH 6 done: loss 0.3674 - lr: 0.000013
2023-10-18 18:08:35,149 DEV : loss 0.3198098838329315 - f1-score (micro avg)  0.2982
2023-10-18 18:08:35,174 saving best model
2023-10-18 18:08:35,213 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:36,213 epoch 7 - iter 44/447 - loss 0.30880564 - time (sec): 1.00 - samples/sec: 8101.35 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:08:37,227 epoch 7 - iter 88/447 - loss 0.34698433 - time (sec): 2.01 - samples/sec: 8636.29 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:08:38,218 epoch 7 - iter 132/447 - loss 0.34845425 - time (sec): 3.00 - samples/sec: 8564.49 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:08:39,278 epoch 7 - iter 176/447 - loss 0.34775342 - time (sec): 4.06 - samples/sec: 8745.68 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:08:40,278 epoch 7 - iter 220/447 - loss 0.35181495 - time (sec): 5.06 - samples/sec: 8700.53 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:08:41,293 epoch 7 - iter 264/447 - loss 0.35415511 - time (sec): 6.08 - samples/sec: 8740.50 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:08:42,269 epoch 7 - iter 308/447 - loss 0.35519252 - time (sec): 7.06 - samples/sec: 8636.30 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:08:43,309 epoch 7 - iter 352/447 - loss 0.36019247 - time (sec): 8.09 - samples/sec: 8566.63 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:08:44,299 epoch 7 - iter 396/447 - loss 0.36381442 - time (sec): 9.09 - samples/sec: 8521.82 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:08:45,298 epoch 7 - iter 440/447 - loss 0.35927689 - time (sec): 10.08 - samples/sec: 8457.32 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:08:45,469 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:45,469 EPOCH 7 done: loss 0.3616 - lr: 0.000010
2023-10-18 18:08:50,395 DEV : loss 0.319092720746994 - f1-score (micro avg)  0.3018
2023-10-18 18:08:50,420 saving best model
2023-10-18 18:08:50,457 ----------------------------------------------------------------------------------------------------
2023-10-18 18:08:51,438 epoch 8 - iter 44/447 - loss 0.33586850 - time (sec): 0.98 - samples/sec: 8029.13 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:08:52,394 epoch 8 - iter 88/447 - loss 0.34939112 - time (sec): 1.94 - samples/sec: 7813.59 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:08:53,388 epoch 8 - iter 132/447 - loss 0.34860416 - time (sec): 2.93 - samples/sec: 8130.83 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:08:54,411 epoch 8 - iter 176/447 - loss 0.36296272 - time (sec): 3.95 - samples/sec: 8116.89 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:08:55,442 epoch 8 - iter 220/447 - loss 0.35372586 - time (sec): 4.98 - samples/sec: 8030.27 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:08:56,441 epoch 8 - iter 264/447 - loss 0.35124092 - time (sec): 5.98 - samples/sec: 8024.65 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:08:57,459 epoch 8 - iter 308/447 - loss 0.34747995 - time (sec): 7.00 - samples/sec: 8214.22 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:08:58,478 epoch 8 - iter 352/447 - loss 0.35373290 - time (sec): 8.02 - samples/sec: 8332.01 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:08:59,461 epoch 8 - iter 396/447 - loss 0.35277945 - time (sec): 9.00 - samples/sec: 8297.12 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:09:00,541 epoch 8 - iter 440/447 - loss 0.35015304 - time (sec): 10.08 - samples/sec: 8324.91 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:09:01,065 ----------------------------------------------------------------------------------------------------
2023-10-18 18:09:01,065 EPOCH 8 done: loss 0.3492 - lr: 0.000007
2023-10-18 18:09:06,001 DEV : loss 0.31162384152412415 - f1-score (micro avg)  0.3008
2023-10-18 18:09:06,028 ----------------------------------------------------------------------------------------------------
2023-10-18 18:09:07,212 epoch 9 - iter 44/447 - loss 0.36624375 - time (sec): 1.18 - samples/sec: 8244.97 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:09:08,241 epoch 9 - iter 88/447 - loss 0.37242831 - time (sec): 2.21 - samples/sec: 8235.91 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:09:09,304 epoch 9 - iter 132/447 - loss 0.35798892 - time (sec): 3.28 - samples/sec: 8329.30 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:09:10,265 epoch 9 - iter 176/447 - loss 0.36040509 - time (sec): 4.24 - samples/sec: 8162.04 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:09:11,281 epoch 9 - iter 220/447 - loss 0.34292561 - time (sec): 5.25 - samples/sec: 8191.00 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:09:12,283 epoch 9 - iter 264/447 - loss 0.35220271 - time (sec): 6.25 - samples/sec: 8200.26 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:09:13,337 epoch 9 - iter 308/447 - loss 0.34817812 - time (sec): 7.31 - samples/sec: 8175.54 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:09:14,342 epoch 9 - iter 352/447 - loss 0.34389750 - time (sec): 8.31 - samples/sec: 8219.01 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:09:15,337 epoch 9 - iter 396/447 - loss 0.34225937 - time (sec): 9.31 - samples/sec: 8251.47 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:09:16,385 epoch 9 - iter 440/447 - loss 0.34245648 - time (sec): 10.36 - samples/sec: 8172.11 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:09:16,574 ----------------------------------------------------------------------------------------------------
2023-10-18 18:09:16,574 EPOCH 9 done: loss 0.3424 - lr: 0.000003
2023-10-18 18:09:21,816 DEV : loss 0.3109774589538574 - f1-score (micro avg)  0.3081
2023-10-18 18:09:21,841 saving best model
2023-10-18 18:09:21,871 ----------------------------------------------------------------------------------------------------
2023-10-18 18:09:22,862 epoch 10 - iter 44/447 - loss 0.36620473 - time (sec): 0.99 - samples/sec: 8918.75 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:09:23,843 epoch 10 - iter 88/447 - loss 0.36388255 - time (sec): 1.97 - samples/sec: 8882.55 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:09:24,809 epoch 10 - iter 132/447 - loss 0.35657952 - time (sec): 2.94 - samples/sec: 8690.34 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:09:25,762 epoch 10 - iter 176/447 - loss 0.35887161 - time (sec): 3.89 - samples/sec: 8547.45 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:09:26,768 epoch 10 - iter 220/447 - loss 0.35777360 - time (sec): 4.90 - samples/sec: 8436.02 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:09:27,835 epoch 10 - iter 264/447 - loss 0.35322405 - time (sec): 5.96 - samples/sec: 8422.42 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:09:28,938 epoch 10 - iter 308/447 - loss 0.34508246 - time (sec): 7.07 - samples/sec: 8386.42 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:09:29,971 epoch 10 - iter 352/447 - loss 0.34032089 - time (sec): 8.10 - samples/sec: 8370.11 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:09:30,970 epoch 10 - iter 396/447 - loss 0.33827017 - time (sec): 9.10 - samples/sec: 8395.25 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:09:31,965 epoch 10 - iter 440/447 - loss 0.33662610 - time (sec): 10.09 - samples/sec: 8439.74 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:09:32,118 ----------------------------------------------------------------------------------------------------
2023-10-18 18:09:32,119 EPOCH 10 done: loss 0.3368 - lr: 0.000000
2023-10-18 18:09:37,350 DEV : loss 0.3095775544643402 - f1-score (micro avg)  0.3138
2023-10-18 18:09:37,375 saving best model
2023-10-18 18:09:37,436 ----------------------------------------------------------------------------------------------------
2023-10-18 18:09:37,436 Loading model from best epoch ...
2023-10-18 18:09:37,514 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
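The 21 tags above follow the BIOES scheme: O plus the four prefixes S (single), B (begin), I (inside), and E (end) for each of the five entity types (4 x 5 + 1 = 21). A minimal illustrative decoder (a hypothetical helper, not Flair's own span-extraction code) shows how such a tag sequence maps to entity spans:

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans; end is exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":          # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":        # entity opens here
            start = i
        elif prefix == "E" and start is not None:  # entity closes here
            spans.append((label, start, i + 1))
            start = None
        # "I" simply continues a span opened by "B"
    return spans

print(bioes_to_spans(["S-loc", "O", "B-pers", "I-pers", "E-pers"]))
# [('loc', 0, 1), ('pers', 2, 5)]
```

Since the evaluation below is span-level, a prediction only counts as correct when the whole BIOES-decoded span and its label match the gold span.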
2023-10-18 18:09:39,395

Results:
- F-score (micro) 0.3101
- F-score (macro) 0.1149
- Accuracy 0.1915

By class:
              precision    recall  f1-score   support

         loc     0.4433    0.4983    0.4692       596
        pers     0.1168    0.0961    0.1054       333
         org     0.0000    0.0000    0.0000       132
        prod     0.0000    0.0000    0.0000        66
        time     0.0000    0.0000    0.0000        49

   micro avg     0.3478    0.2798    0.3101      1176
   macro avg     0.1120    0.1189    0.1149      1176
weighted avg     0.2577    0.2798    0.2676      1176
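The average rows can be re-derived from the per-class values. A short consistency check (f1 scores and supports copied from the table above; micro precision/recall are taken from the micro-avg row, since computing them from scratch would need the raw TP/FP/FN counts):

```python
# per-class f1 and support, copied from the report above
f1 = {"loc": 0.4692, "pers": 0.1054, "org": 0.0, "prod": 0.0, "time": 0.0}
support = {"loc": 596, "pers": 333, "org": 132, "prod": 66, "time": 49}

micro_p, micro_r = 0.3478, 0.2798
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)   # harmonic mean
macro_f1 = sum(f1.values()) / len(f1)                    # unweighted class mean
n = sum(support.values())                                # 1176 gold spans
weighted_f1 = sum(f1[c] * support[c] for c in f1) / n    # support-weighted mean

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
# 0.3101 0.1149 0.2676
```

The gap between micro (0.3101) and macro (0.1149) f1 reflects the class imbalance: the model only learned the two largest classes (loc, pers) and scores zero on org, prod, and time.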
2023-10-18 18:09:39,395 ----------------------------------------------------------------------------------------------------