2023-10-18 14:36:33,366 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:33,366 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 14:36:33,366 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:33,366 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:36:33,366 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:33,366 Train: 1100 sentences
2023-10-18 14:36:33,366 (train_with_dev=False, train_with_test=False)
2023-10-18 14:36:33,366 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:33,367 Training Params:
2023-10-18 14:36:33,367 - learning_rate: "3e-05"
2023-10-18 14:36:33,367 - mini_batch_size: "4"
2023-10-18 14:36:33,367 - max_epochs: "10"
2023-10-18 14:36:33,367 - shuffle: "True"
2023-10-18 14:36:33,367 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:33,367 Plugins:
2023-10-18 14:36:33,367 - TensorboardLogger
2023-10-18 14:36:33,367 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:36:33,367 ----------------------------------------------------------------------------------------------------
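The `LinearScheduler` with `warmup_fraction: '0.1'` implies the learning rate climbs linearly from 0 to the peak of 3e-05 over the first 10% of optimizer steps (275 batches/epoch × 10 epochs = 2750 steps, so 275 warmup steps), then decays linearly back to 0. A minimal sketch of this schedule — a hypothetical helper, not Flair's actual implementation — reproduces the lr values in the log below:

```python
# Sketch of the linear warmup + linear decay schedule implied by
# "LinearScheduler | warmup_fraction: '0.1'". Hypothetical helper,
# not Flair's own code.

def linear_schedule(step, total_steps, peak_lr, warmup_fraction=0.1):
    """LR at a given optimizer step: linear warmup, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

TOTAL = 2750   # 275 mini-batches per epoch x 10 epochs
PEAK = 3e-05   # the configured learning_rate

# Matches the logged values: lr ~0.000029 near the end of epoch 1
# (still warming up) and ~0 at the end of epoch 10.
print(round(linear_schedule(270, TOTAL, PEAK), 6))   # 2.9e-05
print(round(linear_schedule(2750, TOTAL, PEAK), 6))  # 0.0
```

This matches the log: lr rises through epoch 1 (warmup ends at step 275, i.e. the epoch boundary) and then falls toward 0.000000 by epoch 10.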
2023-10-18 14:36:33,367 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:36:33,367 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:36:33,367 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:33,367 Computation:
2023-10-18 14:36:33,367 - compute on device: cuda:0
2023-10-18 14:36:33,367 - embedding storage: none
2023-10-18 14:36:33,367 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:33,367 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 14:36:33,367 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:33,367 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:33,367 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:36:33,791 epoch 1 - iter 27/275 - loss 3.45619634 - time (sec): 0.42 - samples/sec: 5230.73 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:36:34,208 epoch 1 - iter 54/275 - loss 3.53073752 - time (sec): 0.84 - samples/sec: 5327.26 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:36:34,619 epoch 1 - iter 81/275 - loss 3.46212523 - time (sec): 1.25 - samples/sec: 5520.99 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:36:35,014 epoch 1 - iter 108/275 - loss 3.32851857 - time (sec): 1.65 - samples/sec: 5543.00 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:36:35,417 epoch 1 - iter 135/275 - loss 3.18983791 - time (sec): 2.05 - samples/sec: 5531.34 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:36:35,831 epoch 1 - iter 162/275 - loss 3.01174189 - time (sec): 2.46 - samples/sec: 5533.36 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:36:36,255 epoch 1 - iter 189/275 - loss 2.82548982 - time (sec): 2.89 - samples/sec: 5540.79 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:36:36,649 epoch 1 - iter 216/275 - loss 2.65145767 - time (sec): 3.28 - samples/sec: 5519.36 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:36:37,050 epoch 1 - iter 243/275 - loss 2.48063195 - time (sec): 3.68 - samples/sec: 5502.57 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:36:37,459 epoch 1 - iter 270/275 - loss 2.35694701 - time (sec): 4.09 - samples/sec: 5449.68 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:36:37,545 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:37,545 EPOCH 1 done: loss 2.3269 - lr: 0.000029
2023-10-18 14:36:37,797 DEV : loss 0.9024934768676758 - f1-score (micro avg) 0.0
2023-10-18 14:36:37,803 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:38,205 epoch 2 - iter 27/275 - loss 1.04597814 - time (sec): 0.40 - samples/sec: 6019.56 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:36:38,612 epoch 2 - iter 54/275 - loss 1.06133788 - time (sec): 0.81 - samples/sec: 5877.07 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:36:39,007 epoch 2 - iter 81/275 - loss 1.01487211 - time (sec): 1.20 - samples/sec: 5783.90 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:36:39,412 epoch 2 - iter 108/275 - loss 1.03819427 - time (sec): 1.61 - samples/sec: 5804.67 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:36:39,812 epoch 2 - iter 135/275 - loss 1.00072248 - time (sec): 2.01 - samples/sec: 5787.40 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:36:40,215 epoch 2 - iter 162/275 - loss 0.95904475 - time (sec): 2.41 - samples/sec: 5665.82 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:36:40,614 epoch 2 - iter 189/275 - loss 0.94371477 - time (sec): 2.81 - samples/sec: 5714.57 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:36:41,023 epoch 2 - iter 216/275 - loss 0.92129802 - time (sec): 3.22 - samples/sec: 5693.25 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:36:41,403 epoch 2 - iter 243/275 - loss 0.89080202 - time (sec): 3.60 - samples/sec: 5634.09 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:36:41,773 epoch 2 - iter 270/275 - loss 0.88200667 - time (sec): 3.97 - samples/sec: 5652.29 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:36:41,843 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:41,843 EPOCH 2 done: loss 0.8774 - lr: 0.000027
2023-10-18 14:36:42,206 DEV : loss 0.6456888914108276 - f1-score (micro avg) 0.1618
2023-10-18 14:36:42,211 saving best model
2023-10-18 14:36:42,243 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:42,644 epoch 3 - iter 27/275 - loss 0.62278804 - time (sec): 0.40 - samples/sec: 5258.73 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:36:43,056 epoch 3 - iter 54/275 - loss 0.69758740 - time (sec): 0.81 - samples/sec: 5519.26 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:36:43,460 epoch 3 - iter 81/275 - loss 0.69895332 - time (sec): 1.22 - samples/sec: 5549.30 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:36:43,861 epoch 3 - iter 108/275 - loss 0.67870849 - time (sec): 1.62 - samples/sec: 5590.13 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:36:44,260 epoch 3 - iter 135/275 - loss 0.69228442 - time (sec): 2.02 - samples/sec: 5551.39 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:36:44,669 epoch 3 - iter 162/275 - loss 0.67676551 - time (sec): 2.43 - samples/sec: 5558.72 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:36:45,094 epoch 3 - iter 189/275 - loss 0.67144051 - time (sec): 2.85 - samples/sec: 5565.94 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:36:45,528 epoch 3 - iter 216/275 - loss 0.67066038 - time (sec): 3.28 - samples/sec: 5455.08 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:36:45,935 epoch 3 - iter 243/275 - loss 0.67616565 - time (sec): 3.69 - samples/sec: 5476.23 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:36:46,331 epoch 3 - iter 270/275 - loss 0.67052385 - time (sec): 4.09 - samples/sec: 5473.35 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:36:46,407 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:46,407 EPOCH 3 done: loss 0.6781 - lr: 0.000023
2023-10-18 14:36:46,899 DEV : loss 0.5147013068199158 - f1-score (micro avg) 0.2226
2023-10-18 14:36:46,904 saving best model
2023-10-18 14:36:46,938 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:47,338 epoch 4 - iter 27/275 - loss 0.56192949 - time (sec): 0.40 - samples/sec: 5280.27 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:36:47,744 epoch 4 - iter 54/275 - loss 0.58035071 - time (sec): 0.81 - samples/sec: 5369.24 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:36:48,158 epoch 4 - iter 81/275 - loss 0.61288291 - time (sec): 1.22 - samples/sec: 5403.89 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:36:48,592 epoch 4 - iter 108/275 - loss 0.63425335 - time (sec): 1.65 - samples/sec: 5445.33 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:36:48,987 epoch 4 - iter 135/275 - loss 0.63477596 - time (sec): 2.05 - samples/sec: 5334.87 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:36:49,396 epoch 4 - iter 162/275 - loss 0.61887265 - time (sec): 2.46 - samples/sec: 5385.44 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:36:49,798 epoch 4 - iter 189/275 - loss 0.59727387 - time (sec): 2.86 - samples/sec: 5454.39 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:36:50,227 epoch 4 - iter 216/275 - loss 0.59528584 - time (sec): 3.29 - samples/sec: 5494.51 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:36:50,643 epoch 4 - iter 243/275 - loss 0.57702778 - time (sec): 3.70 - samples/sec: 5411.25 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:36:51,058 epoch 4 - iter 270/275 - loss 0.57960283 - time (sec): 4.12 - samples/sec: 5388.00 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:36:51,141 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:51,141 EPOCH 4 done: loss 0.5796 - lr: 0.000020
2023-10-18 14:36:51,502 DEV : loss 0.4463545083999634 - f1-score (micro avg) 0.2611
2023-10-18 14:36:51,506 saving best model
2023-10-18 14:36:51,541 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:51,973 epoch 5 - iter 27/275 - loss 0.54498405 - time (sec): 0.43 - samples/sec: 4931.32 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:36:52,387 epoch 5 - iter 54/275 - loss 0.53632267 - time (sec): 0.84 - samples/sec: 5186.15 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:36:52,793 epoch 5 - iter 81/275 - loss 0.56778411 - time (sec): 1.25 - samples/sec: 5422.63 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:36:53,202 epoch 5 - iter 108/275 - loss 0.55103820 - time (sec): 1.66 - samples/sec: 5448.36 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:36:53,604 epoch 5 - iter 135/275 - loss 0.52634884 - time (sec): 2.06 - samples/sec: 5348.31 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:36:54,010 epoch 5 - iter 162/275 - loss 0.54865693 - time (sec): 2.47 - samples/sec: 5415.74 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:36:54,414 epoch 5 - iter 189/275 - loss 0.53785239 - time (sec): 2.87 - samples/sec: 5409.32 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:36:54,816 epoch 5 - iter 216/275 - loss 0.53605532 - time (sec): 3.27 - samples/sec: 5470.15 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:36:55,221 epoch 5 - iter 243/275 - loss 0.54605645 - time (sec): 3.68 - samples/sec: 5539.12 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:36:55,623 epoch 5 - iter 270/275 - loss 0.53545867 - time (sec): 4.08 - samples/sec: 5468.56 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:36:55,696 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:55,696 EPOCH 5 done: loss 0.5360 - lr: 0.000017
2023-10-18 14:36:56,063 DEV : loss 0.3909608721733093 - f1-score (micro avg) 0.4147
2023-10-18 14:36:56,067 saving best model
2023-10-18 14:36:56,101 ----------------------------------------------------------------------------------------------------
2023-10-18 14:36:56,502 epoch 6 - iter 27/275 - loss 0.38667056 - time (sec): 0.40 - samples/sec: 5678.79 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:36:56,895 epoch 6 - iter 54/275 - loss 0.44539454 - time (sec): 0.79 - samples/sec: 5510.39 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:36:57,287 epoch 6 - iter 81/275 - loss 0.43904788 - time (sec): 1.19 - samples/sec: 5579.63 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:36:57,676 epoch 6 - iter 108/275 - loss 0.47185525 - time (sec): 1.57 - samples/sec: 5601.41 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:36:58,087 epoch 6 - iter 135/275 - loss 0.47304307 - time (sec): 1.99 - samples/sec: 5713.36 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:36:58,484 epoch 6 - iter 162/275 - loss 0.45971614 - time (sec): 2.38 - samples/sec: 5715.27 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:36:58,880 epoch 6 - iter 189/275 - loss 0.46054158 - time (sec): 2.78 - samples/sec: 5711.46 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:36:59,272 epoch 6 - iter 216/275 - loss 0.46634615 - time (sec): 3.17 - samples/sec: 5684.25 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:36:59,669 epoch 6 - iter 243/275 - loss 0.47350805 - time (sec): 3.57 - samples/sec: 5690.93 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:37:00,074 epoch 6 - iter 270/275 - loss 0.47499535 - time (sec): 3.97 - samples/sec: 5659.19 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:37:00,144 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:00,144 EPOCH 6 done: loss 0.4754 - lr: 0.000013
2023-10-18 14:37:00,508 DEV : loss 0.3617814779281616 - f1-score (micro avg) 0.4787
2023-10-18 14:37:00,512 saving best model
2023-10-18 14:37:00,545 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:00,954 epoch 7 - iter 27/275 - loss 0.56443656 - time (sec): 0.41 - samples/sec: 6005.68 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:37:01,358 epoch 7 - iter 54/275 - loss 0.50534845 - time (sec): 0.81 - samples/sec: 5551.73 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:37:01,756 epoch 7 - iter 81/275 - loss 0.49268904 - time (sec): 1.21 - samples/sec: 5587.27 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:37:02,159 epoch 7 - iter 108/275 - loss 0.48705259 - time (sec): 1.61 - samples/sec: 5544.37 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:37:02,579 epoch 7 - iter 135/275 - loss 0.48987296 - time (sec): 2.03 - samples/sec: 5512.33 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:37:02,983 epoch 7 - iter 162/275 - loss 0.48551192 - time (sec): 2.44 - samples/sec: 5437.37 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:37:03,403 epoch 7 - iter 189/275 - loss 0.48652435 - time (sec): 2.86 - samples/sec: 5466.15 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:37:03,804 epoch 7 - iter 216/275 - loss 0.47536884 - time (sec): 3.26 - samples/sec: 5426.92 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:37:04,217 epoch 7 - iter 243/275 - loss 0.46704692 - time (sec): 3.67 - samples/sec: 5430.03 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:37:04,613 epoch 7 - iter 270/275 - loss 0.46114947 - time (sec): 4.07 - samples/sec: 5473.46 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:37:04,690 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:04,690 EPOCH 7 done: loss 0.4590 - lr: 0.000010
2023-10-18 14:37:05,059 DEV : loss 0.3537524342536926 - f1-score (micro avg) 0.5072
2023-10-18 14:37:05,062 saving best model
2023-10-18 14:37:05,097 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:05,485 epoch 8 - iter 27/275 - loss 0.47960427 - time (sec): 0.39 - samples/sec: 5980.89 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:37:05,886 epoch 8 - iter 54/275 - loss 0.45993071 - time (sec): 0.79 - samples/sec: 5951.66 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:37:06,300 epoch 8 - iter 81/275 - loss 0.45830077 - time (sec): 1.20 - samples/sec: 5648.05 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:37:06,726 epoch 8 - iter 108/275 - loss 0.45962537 - time (sec): 1.63 - samples/sec: 5711.83 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:37:07,139 epoch 8 - iter 135/275 - loss 0.44293332 - time (sec): 2.04 - samples/sec: 5700.88 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:37:07,562 epoch 8 - iter 162/275 - loss 0.43660952 - time (sec): 2.46 - samples/sec: 5628.88 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:37:07,965 epoch 8 - iter 189/275 - loss 0.43568566 - time (sec): 2.87 - samples/sec: 5503.48 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:37:08,364 epoch 8 - iter 216/275 - loss 0.44748474 - time (sec): 3.27 - samples/sec: 5488.29 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:37:08,776 epoch 8 - iter 243/275 - loss 0.44554978 - time (sec): 3.68 - samples/sec: 5494.63 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:37:09,187 epoch 8 - iter 270/275 - loss 0.43700141 - time (sec): 4.09 - samples/sec: 5456.24 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:37:09,263 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:09,263 EPOCH 8 done: loss 0.4374 - lr: 0.000007
2023-10-18 14:37:09,636 DEV : loss 0.3434961140155792 - f1-score (micro avg) 0.509
2023-10-18 14:37:09,641 saving best model
2023-10-18 14:37:09,676 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:10,075 epoch 9 - iter 27/275 - loss 0.44231506 - time (sec): 0.40 - samples/sec: 5288.08 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:37:10,471 epoch 9 - iter 54/275 - loss 0.44934559 - time (sec): 0.79 - samples/sec: 5443.80 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:37:10,873 epoch 9 - iter 81/275 - loss 0.44218712 - time (sec): 1.20 - samples/sec: 5546.31 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:37:11,280 epoch 9 - iter 108/275 - loss 0.44180089 - time (sec): 1.60 - samples/sec: 5545.45 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:37:11,678 epoch 9 - iter 135/275 - loss 0.44618910 - time (sec): 2.00 - samples/sec: 5520.79 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:37:12,075 epoch 9 - iter 162/275 - loss 0.45630016 - time (sec): 2.40 - samples/sec: 5532.09 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:37:12,486 epoch 9 - iter 189/275 - loss 0.43987687 - time (sec): 2.81 - samples/sec: 5624.41 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:37:12,904 epoch 9 - iter 216/275 - loss 0.43092048 - time (sec): 3.23 - samples/sec: 5570.46 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:37:13,322 epoch 9 - iter 243/275 - loss 0.42220015 - time (sec): 3.64 - samples/sec: 5560.04 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:37:13,723 epoch 9 - iter 270/275 - loss 0.41670759 - time (sec): 4.05 - samples/sec: 5548.61 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:37:13,798 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:13,798 EPOCH 9 done: loss 0.4161 - lr: 0.000003
2023-10-18 14:37:14,165 DEV : loss 0.33459872007369995 - f1-score (micro avg) 0.5156
2023-10-18 14:37:14,169 saving best model
2023-10-18 14:37:14,204 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:14,625 epoch 10 - iter 27/275 - loss 0.36380550 - time (sec): 0.42 - samples/sec: 5485.60 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:37:15,011 epoch 10 - iter 54/275 - loss 0.38570137 - time (sec): 0.81 - samples/sec: 5855.00 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:37:15,384 epoch 10 - iter 81/275 - loss 0.40437623 - time (sec): 1.18 - samples/sec: 5736.95 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:37:15,747 epoch 10 - iter 108/275 - loss 0.40790946 - time (sec): 1.54 - samples/sec: 5759.11 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:37:16,120 epoch 10 - iter 135/275 - loss 0.40109853 - time (sec): 1.91 - samples/sec: 5778.30 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:37:16,491 epoch 10 - iter 162/275 - loss 0.40055477 - time (sec): 2.29 - samples/sec: 5834.25 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:37:16,872 epoch 10 - iter 189/275 - loss 0.40524405 - time (sec): 2.67 - samples/sec: 5886.69 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:37:17,248 epoch 10 - iter 216/275 - loss 0.41846748 - time (sec): 3.04 - samples/sec: 5932.54 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:37:17,614 epoch 10 - iter 243/275 - loss 0.41928472 - time (sec): 3.41 - samples/sec: 5890.41 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:37:17,991 epoch 10 - iter 270/275 - loss 0.41660013 - time (sec): 3.79 - samples/sec: 5898.11 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:37:18,059 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:18,059 EPOCH 10 done: loss 0.4176 - lr: 0.000000
2023-10-18 14:37:18,427 DEV : loss 0.334285706281662 - f1-score (micro avg) 0.52
2023-10-18 14:37:18,431 saving best model
2023-10-18 14:37:18,498 ----------------------------------------------------------------------------------------------------
2023-10-18 14:37:18,498 Loading model from best epoch ...
2023-10-18 14:37:18,569 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 14:37:18,855
Results:
- F-score (micro) 0.5605
- F-score (macro) 0.3278
- Accuracy 0.3959
By class:
              precision    recall  f1-score   support

       scope     0.5745    0.6136    0.5934       176
        pers     0.8795    0.5703    0.6919       128
        work     0.2991    0.4324    0.3536        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.5635    0.5576    0.5605       382
   macro avg     0.3506    0.3233    0.3278       382
weighted avg     0.6173    0.5576    0.5738       382
2023-10-18 14:37:18,856 ----------------------------------------------------------------------------------------------------