2023-10-18 18:00:28,510 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:28,511 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 18:00:28,511 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:28,511 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 18:00:28,511 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:28,511 Train: 3575 sentences
2023-10-18 18:00:28,511 (train_with_dev=False, train_with_test=False)
2023-10-18 18:00:28,511 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:28,511 Training Params:
2023-10-18 18:00:28,511 - learning_rate: "3e-05"
2023-10-18 18:00:28,511 - mini_batch_size: "4"
2023-10-18 18:00:28,511 - max_epochs: "10"
2023-10-18 18:00:28,511 - shuffle: "True"
2023-10-18 18:00:28,511 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:28,511 Plugins:
2023-10-18 18:00:28,511 - TensorboardLogger
2023-10-18 18:00:28,511 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:00:28,511 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:28,511 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:00:28,511 - metric: "('micro avg', 'f1-score')"
2023-10-18 18:00:28,511 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:28,511 Computation:
2023-10-18 18:00:28,512 - compute on device: cuda:0
2023-10-18 18:00:28,512 - embedding storage: none
2023-10-18 18:00:28,512 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:28,512 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 18:00:28,512 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:28,512 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:28,512 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 18:00:29,842 epoch 1 - iter 89/894 - loss 3.47720738 - time (sec): 1.33 - samples/sec: 6076.79 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:00:31,074 epoch 1 - iter 178/894 - loss 3.26012265 - time (sec): 2.56 - samples/sec: 6404.36 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:00:32,466 epoch 1 - iter 267/894 - loss 2.94964775 - time (sec): 3.95 - samples/sec: 6442.77 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:00:33,867 epoch 1 - iter 356/894 - loss 2.55034837 - time (sec): 5.36 - samples/sec: 6441.75 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:00:35,262 epoch 1 - iter 445/894 - loss 2.22532146 - time (sec): 6.75 - samples/sec: 6366.77 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:00:36,638 epoch 1 - iter 534/894 - loss 1.98044920 - time (sec): 8.13 - samples/sec: 6274.87 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:00:38,008 epoch 1 - iter 623/894 - loss 1.78776894 - time (sec): 9.50 - samples/sec: 6241.02 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:00:39,396 epoch 1 - iter 712/894 - loss 1.63136830 - time (sec): 10.88 - samples/sec: 6265.61 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:00:40,838 epoch 1 - iter 801/894 - loss 1.50892338 - time (sec): 12.33 - samples/sec: 6302.71 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:00:42,211 epoch 1 - iter 890/894 - loss 1.41796507 - time (sec): 13.70 - samples/sec: 6288.61 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:00:42,272 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:42,272 EPOCH 1 done: loss 1.4167 - lr: 0.000030
2023-10-18 18:00:44,511 DEV : loss 0.46221593022346497 - f1-score (micro avg) 0.0
2023-10-18 18:00:44,536 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:45,908 epoch 2 - iter 89/894 - loss 0.53965102 - time (sec): 1.37 - samples/sec: 6473.76 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:00:47,295 epoch 2 - iter 178/894 - loss 0.53998811 - time (sec): 2.76 - samples/sec: 6299.09 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:00:48,666 epoch 2 - iter 267/894 - loss 0.53762330 - time (sec): 4.13 - samples/sec: 6199.42 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:00:50,051 epoch 2 - iter 356/894 - loss 0.53608027 - time (sec): 5.52 - samples/sec: 6099.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:00:51,436 epoch 2 - iter 445/894 - loss 0.53380079 - time (sec): 6.90 - samples/sec: 6092.46 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:00:52,845 epoch 2 - iter 534/894 - loss 0.51793174 - time (sec): 8.31 - samples/sec: 6117.45 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:00:54,238 epoch 2 - iter 623/894 - loss 0.51806145 - time (sec): 9.70 - samples/sec: 6078.34 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:00:55,547 epoch 2 - iter 712/894 - loss 0.50629182 - time (sec): 11.01 - samples/sec: 6243.33 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:00:56,887 epoch 2 - iter 801/894 - loss 0.50500686 - time (sec): 12.35 - samples/sec: 6297.13 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:00:58,286 epoch 2 - iter 890/894 - loss 0.49779972 - time (sec): 13.75 - samples/sec: 6275.11 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:00:58,341 ----------------------------------------------------------------------------------------------------
2023-10-18 18:00:58,341 EPOCH 2 done: loss 0.4984 - lr: 0.000027
2023-10-18 18:01:03,629 DEV : loss 0.36310797929763794 - f1-score (micro avg) 0.0659
2023-10-18 18:01:03,656 saving best model
2023-10-18 18:01:03,691 ----------------------------------------------------------------------------------------------------
2023-10-18 18:01:05,140 epoch 3 - iter 89/894 - loss 0.46261754 - time (sec): 1.45 - samples/sec: 5713.97 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:01:06,556 epoch 3 - iter 178/894 - loss 0.48194337 - time (sec): 2.86 - samples/sec: 5973.79 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:01:07,954 epoch 3 - iter 267/894 - loss 0.46562612 - time (sec): 4.26 - samples/sec: 6198.53 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:01:09,365 epoch 3 - iter 356/894 - loss 0.46102644 - time (sec): 5.67 - samples/sec: 6220.14 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:01:10,727 epoch 3 - iter 445/894 - loss 0.45309118 - time (sec): 7.04 - samples/sec: 6213.32 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:01:12,120 epoch 3 - iter 534/894 - loss 0.43617221 - time (sec): 8.43 - samples/sec: 6233.77 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:01:13,497 epoch 3 - iter 623/894 - loss 0.42980845 - time (sec): 9.81 - samples/sec: 6237.84 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:01:14,859 epoch 3 - iter 712/894 - loss 0.42508549 - time (sec): 11.17 - samples/sec: 6196.81 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:01:16,272 epoch 3 - iter 801/894 - loss 0.42413621 - time (sec): 12.58 - samples/sec: 6195.05 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:01:17,657 epoch 3 - iter 890/894 - loss 0.42143294 - time (sec): 13.97 - samples/sec: 6165.33 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:01:17,715 ----------------------------------------------------------------------------------------------------
2023-10-18 18:01:17,716 EPOCH 3 done: loss 0.4214 - lr: 0.000023
2023-10-18 18:01:23,002 DEV : loss 0.34216511249542236 - f1-score (micro avg) 0.2547
2023-10-18 18:01:23,028 saving best model
2023-10-18 18:01:23,062 ----------------------------------------------------------------------------------------------------
2023-10-18 18:01:24,460 epoch 4 - iter 89/894 - loss 0.38257387 - time (sec): 1.40 - samples/sec: 6145.77 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:01:25,846 epoch 4 - iter 178/894 - loss 0.41047799 - time (sec): 2.78 - samples/sec: 6227.56 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:01:27,219 epoch 4 - iter 267/894 - loss 0.40657509 - time (sec): 4.16 - samples/sec: 6293.75 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:01:28,618 epoch 4 - iter 356/894 - loss 0.38753137 - time (sec): 5.56 - samples/sec: 6387.97 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:01:30,013 epoch 4 - iter 445/894 - loss 0.38928001 - time (sec): 6.95 - samples/sec: 6490.07 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:01:31,419 epoch 4 - iter 534/894 - loss 0.39281104 - time (sec): 8.36 - samples/sec: 6339.34 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:01:32,780 epoch 4 - iter 623/894 - loss 0.38261370 - time (sec): 9.72 - samples/sec: 6331.61 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:01:34,156 epoch 4 - iter 712/894 - loss 0.38588466 - time (sec): 11.09 - samples/sec: 6292.75 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:01:35,451 epoch 4 - iter 801/894 - loss 0.38455298 - time (sec): 12.39 - samples/sec: 6253.02 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:01:36,706 epoch 4 - iter 890/894 - loss 0.38059774 - time (sec): 13.64 - samples/sec: 6323.32 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:01:36,760 ----------------------------------------------------------------------------------------------------
2023-10-18 18:01:36,760 EPOCH 4 done: loss 0.3807 - lr: 0.000020
2023-10-18 18:01:41,760 DEV : loss 0.33357590436935425 - f1-score (micro avg) 0.2831
2023-10-18 18:01:41,786 saving best model
2023-10-18 18:01:41,824 ----------------------------------------------------------------------------------------------------
2023-10-18 18:01:43,066 epoch 5 - iter 89/894 - loss 0.36835419 - time (sec): 1.24 - samples/sec: 6812.86 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:01:44,652 epoch 5 - iter 178/894 - loss 0.36526170 - time (sec): 2.83 - samples/sec: 6471.45 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:01:46,113 epoch 5 - iter 267/894 - loss 0.36494379 - time (sec): 4.29 - samples/sec: 6370.54 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:01:47,503 epoch 5 - iter 356/894 - loss 0.36282400 - time (sec): 5.68 - samples/sec: 6281.29 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:01:48,808 epoch 5 - iter 445/894 - loss 0.36785318 - time (sec): 6.98 - samples/sec: 6253.29 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:01:50,333 epoch 5 - iter 534/894 - loss 0.36650001 - time (sec): 8.51 - samples/sec: 6099.75 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:01:51,703 epoch 5 - iter 623/894 - loss 0.36490601 - time (sec): 9.88 - samples/sec: 6137.93 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:01:53,130 epoch 5 - iter 712/894 - loss 0.36466791 - time (sec): 11.31 - samples/sec: 6171.94 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:01:54,542 epoch 5 - iter 801/894 - loss 0.36141892 - time (sec): 12.72 - samples/sec: 6147.71 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:01:55,971 epoch 5 - iter 890/894 - loss 0.35848906 - time (sec): 14.15 - samples/sec: 6095.21 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:01:56,030 ----------------------------------------------------------------------------------------------------
2023-10-18 18:01:56,030 EPOCH 5 done: loss 0.3583 - lr: 0.000017
2023-10-18 18:02:01,024 DEV : loss 0.3251766562461853 - f1-score (micro avg) 0.3029
2023-10-18 18:02:01,050 saving best model
2023-10-18 18:02:01,083 ----------------------------------------------------------------------------------------------------
2023-10-18 18:02:02,492 epoch 6 - iter 89/894 - loss 0.29911401 - time (sec): 1.41 - samples/sec: 6589.86 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:02:03,865 epoch 6 - iter 178/894 - loss 0.33016547 - time (sec): 2.78 - samples/sec: 6372.59 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:02:05,251 epoch 6 - iter 267/894 - loss 0.35328330 - time (sec): 4.17 - samples/sec: 6238.38 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:02:06,633 epoch 6 - iter 356/894 - loss 0.35220783 - time (sec): 5.55 - samples/sec: 6233.65 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:02:08,026 epoch 6 - iter 445/894 - loss 0.35655484 - time (sec): 6.94 - samples/sec: 6290.15 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:02:09,414 epoch 6 - iter 534/894 - loss 0.35239010 - time (sec): 8.33 - samples/sec: 6249.98 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:02:10,804 epoch 6 - iter 623/894 - loss 0.34528875 - time (sec): 9.72 - samples/sec: 6207.28 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:02:12,183 epoch 6 - iter 712/894 - loss 0.34025091 - time (sec): 11.10 - samples/sec: 6210.97 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:02:13,577 epoch 6 - iter 801/894 - loss 0.33584252 - time (sec): 12.49 - samples/sec: 6224.43 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:02:14,955 epoch 6 - iter 890/894 - loss 0.34022539 - time (sec): 13.87 - samples/sec: 6210.36 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:02:15,018 ----------------------------------------------------------------------------------------------------
2023-10-18 18:02:15,018 EPOCH 6 done: loss 0.3399 - lr: 0.000013
2023-10-18 18:02:20,353 DEV : loss 0.32095086574554443 - f1-score (micro avg) 0.3121
2023-10-18 18:02:20,379 saving best model
2023-10-18 18:02:20,414 ----------------------------------------------------------------------------------------------------
2023-10-18 18:02:21,792 epoch 7 - iter 89/894 - loss 0.29036338 - time (sec): 1.38 - samples/sec: 5916.59 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:02:23,232 epoch 7 - iter 178/894 - loss 0.32315512 - time (sec): 2.82 - samples/sec: 6279.32 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:02:24,652 epoch 7 - iter 267/894 - loss 0.32199177 - time (sec): 4.24 - samples/sec: 6133.97 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:02:26,089 epoch 7 - iter 356/894 - loss 0.31963448 - time (sec): 5.67 - samples/sec: 6344.48 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:02:27,455 epoch 7 - iter 445/894 - loss 0.32435811 - time (sec): 7.04 - samples/sec: 6307.25 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:02:28,909 epoch 7 - iter 534/894 - loss 0.32202630 - time (sec): 8.49 - samples/sec: 6337.39 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:02:30,284 epoch 7 - iter 623/894 - loss 0.31985686 - time (sec): 9.87 - samples/sec: 6234.95 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:02:31,679 epoch 7 - iter 712/894 - loss 0.32618461 - time (sec): 11.26 - samples/sec: 6240.32 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:02:33,026 epoch 7 - iter 801/894 - loss 0.32697320 - time (sec): 12.61 - samples/sec: 6189.31 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:02:34,430 epoch 7 - iter 890/894 - loss 0.32816588 - time (sec): 14.02 - samples/sec: 6146.80 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:02:34,490 ----------------------------------------------------------------------------------------------------
2023-10-18 18:02:34,490 EPOCH 7 done: loss 0.3275 - lr: 0.000010
2023-10-18 18:02:39,856 DEV : loss 0.3155861496925354 - f1-score (micro avg) 0.3204
2023-10-18 18:02:39,883 saving best model
2023-10-18 18:02:39,923 ----------------------------------------------------------------------------------------------------
2023-10-18 18:02:41,289 epoch 8 - iter 89/894 - loss 0.29543599 - time (sec): 1.37 - samples/sec: 5810.43 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:02:42,643 epoch 8 - iter 178/894 - loss 0.30056323 - time (sec): 2.72 - samples/sec: 5629.53 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:02:44,023 epoch 8 - iter 267/894 - loss 0.30791330 - time (sec): 4.10 - samples/sec: 5858.01 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:02:45,424 epoch 8 - iter 356/894 - loss 0.32106558 - time (sec): 5.50 - samples/sec: 5892.48 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:02:46,790 epoch 8 - iter 445/894 - loss 0.31239277 - time (sec): 6.87 - samples/sec: 5886.26 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:02:48,201 epoch 8 - iter 534/894 - loss 0.31223381 - time (sec): 8.28 - samples/sec: 5855.44 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:02:49,694 epoch 8 - iter 623/894 - loss 0.30739347 - time (sec): 9.77 - samples/sec: 5953.62 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:02:51,271 epoch 8 - iter 712/894 - loss 0.31585638 - time (sec): 11.35 - samples/sec: 5965.33 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:02:52,694 epoch 8 - iter 801/894 - loss 0.31590622 - time (sec): 12.77 - samples/sec: 5942.88 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:02:54,107 epoch 8 - iter 890/894 - loss 0.31458767 - time (sec): 14.18 - samples/sec: 6001.79 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:02:54,198 ----------------------------------------------------------------------------------------------------
2023-10-18 18:02:54,198 EPOCH 8 done: loss 0.3137 - lr: 0.000007
2023-10-18 18:02:59,584 DEV : loss 0.3094746768474579 - f1-score (micro avg) 0.3213
2023-10-18 18:02:59,611 saving best model
2023-10-18 18:02:59,650 ----------------------------------------------------------------------------------------------------
2023-10-18 18:03:01,115 epoch 9 - iter 89/894 - loss 0.33597402 - time (sec): 1.47 - samples/sec: 6725.14 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:03:02,484 epoch 9 - iter 178/894 - loss 0.34653954 - time (sec): 2.83 - samples/sec: 6500.85 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:03:03,748 epoch 9 - iter 267/894 - loss 0.33458364 - time (sec): 4.10 - samples/sec: 6705.02 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:03:04,977 epoch 9 - iter 356/894 - loss 0.33027321 - time (sec): 5.33 - samples/sec: 6569.42 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:03:06,220 epoch 9 - iter 445/894 - loss 0.31343385 - time (sec): 6.57 - samples/sec: 6604.81 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:03:07,466 epoch 9 - iter 534/894 - loss 0.32342682 - time (sec): 7.82 - samples/sec: 6658.65 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:03:08,806 epoch 9 - iter 623/894 - loss 0.31647783 - time (sec): 9.16 - samples/sec: 6601.37 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:03:10,198 epoch 9 - iter 712/894 - loss 0.31201076 - time (sec): 10.55 - samples/sec: 6552.60 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:03:11,583 epoch 9 - iter 801/894 - loss 0.30997413 - time (sec): 11.93 - samples/sec: 6503.82 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:03:12,987 epoch 9 - iter 890/894 - loss 0.31007472 - time (sec): 13.34 - samples/sec: 6456.74 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:03:13,055 ----------------------------------------------------------------------------------------------------
2023-10-18 18:03:13,055 EPOCH 9 done: loss 0.3113 - lr: 0.000003
2023-10-18 18:03:18,075 DEV : loss 0.313385546207428 - f1-score (micro avg) 0.3291
2023-10-18 18:03:18,103 saving best model
2023-10-18 18:03:18,134 ----------------------------------------------------------------------------------------------------
2023-10-18 18:03:19,534 epoch 10 - iter 89/894 - loss 0.33123978 - time (sec): 1.40 - samples/sec: 6376.22 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:03:20,922 epoch 10 - iter 178/894 - loss 0.33004373 - time (sec): 2.79 - samples/sec: 6381.39 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:03:22,281 epoch 10 - iter 267/894 - loss 0.31823562 - time (sec): 4.15 - samples/sec: 6234.71 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:03:23,648 epoch 10 - iter 356/894 - loss 0.32311896 - time (sec): 5.51 - samples/sec: 6115.74 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:03:25,371 epoch 10 - iter 445/894 - loss 0.32122107 - time (sec): 7.24 - samples/sec: 5912.86 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:03:26,701 epoch 10 - iter 534/894 - loss 0.31637427 - time (sec): 8.57 - samples/sec: 5920.11 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:03:28,124 epoch 10 - iter 623/894 - loss 0.30980662 - time (sec): 9.99 - samples/sec: 5999.17 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:03:29,517 epoch 10 - iter 712/894 - loss 0.30583933 - time (sec): 11.38 - samples/sec: 6044.37 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:03:30,833 epoch 10 - iter 801/894 - loss 0.30618552 - time (sec): 12.70 - samples/sec: 6098.80 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:03:32,243 epoch 10 - iter 890/894 - loss 0.30716566 - time (sec): 14.11 - samples/sec: 6110.70 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:03:32,301 ----------------------------------------------------------------------------------------------------
2023-10-18 18:03:32,301 EPOCH 10 done: loss 0.3067 - lr: 0.000000
2023-10-18 18:03:37,341 DEV : loss 0.3104316294193268 - f1-score (micro avg) 0.3269
2023-10-18 18:03:37,398 ----------------------------------------------------------------------------------------------------
2023-10-18 18:03:37,399 Loading model from best epoch ...
2023-10-18 18:03:37,476 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-18 18:03:39,816
Results:
- F-score (micro) 0.3144
- F-score (macro) 0.1225
- Accuracy 0.1974
By class:
              precision    recall  f1-score   support

         loc     0.4540    0.5050    0.4782       596
        pers     0.1281    0.1411    0.1343       333
         org     0.0000    0.0000    0.0000       132
        prod     0.0000    0.0000    0.0000        66
        time     0.0000    0.0000    0.0000        49

   micro avg     0.3353    0.2959    0.3144      1176
   macro avg     0.1164    0.1292    0.1225      1176
weighted avg     0.2664    0.2959    0.2804      1176
2023-10-18 18:03:39,816 ----------------------------------------------------------------------------------------------------
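
The averaged rows in the "By class" table can be re-derived from the per-class rows. The snippet below is a minimal sketch that assumes the standard definitions (micro F1 as the harmonic mean of micro precision and recall, macro F1 as the unweighted mean of per-class F1, weighted F1 as the support-weighted mean). The variable names are illustrative and are not taken from Flair. Because the inputs are the rounded values printed above, the results agree with the log only to about four decimal places.

```python
# Per-class rows copied from the "By class" table: (precision, recall, f1, support).
by_class = {
    "loc":  (0.4540, 0.5050, 0.4782, 596),
    "pers": (0.1281, 0.1411, 0.1343, 333),
    "org":  (0.0000, 0.0000, 0.0000, 132),
    "prod": (0.0000, 0.0000, 0.0000, 66),
    "time": (0.0000, 0.0000, 0.0000, 49),
}

# Micro-averaged precision and recall as printed in the "micro avg" row.
micro_precision, micro_recall = 0.3353, 0.2959


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1: harmonic mean of the pooled precision and recall.
micro_f1 = f1(micro_precision, micro_recall)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 weighted by the number of gold entities (support).
total_support = sum(row[3] for row in by_class.values())
weighted_f1 = sum(row[2] * row[3] for row in by_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

The macro score is far below the micro score because org, prod and time contribute an F1 of zero each while together covering only 247 of the 1176 gold entities.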