|
2023-10-16 20:25:02,771 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:02,772 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-16 20:25:02,772 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:02,772 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-16 20:25:02,772 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:02,772 Train: 1085 sentences |
|
2023-10-16 20:25:02,772 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 20:25:02,772 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:02,772 Training Params: |
|
2023-10-16 20:25:02,772 - learning_rate: "5e-05" |
|
2023-10-16 20:25:02,772 - mini_batch_size: "4" |
|
2023-10-16 20:25:02,772 - max_epochs: "10" |
|
2023-10-16 20:25:02,772 - shuffle: "True" |
|
2023-10-16 20:25:02,772 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:02,772 Plugins: |
|
2023-10-16 20:25:02,772 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 20:25:02,772 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:02,772 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 20:25:02,773 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 20:25:02,773 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:02,773 Computation: |
|
2023-10-16 20:25:02,773 - compute on device: cuda:0 |
|
2023-10-16 20:25:02,773 - embedding storage: none |
|
2023-10-16 20:25:02,773 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:02,773 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-16 20:25:02,773 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:02,773 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:04,358 epoch 1 - iter 27/272 - loss 2.71590439 - time (sec): 1.58 - samples/sec: 3201.32 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 20:25:06,100 epoch 1 - iter 54/272 - loss 2.00496171 - time (sec): 3.33 - samples/sec: 3247.62 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 20:25:07,724 epoch 1 - iter 81/272 - loss 1.51156654 - time (sec): 4.95 - samples/sec: 3354.08 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 20:25:09,312 epoch 1 - iter 108/272 - loss 1.27030756 - time (sec): 6.54 - samples/sec: 3287.03 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 20:25:10,850 epoch 1 - iter 135/272 - loss 1.08958487 - time (sec): 8.08 - samples/sec: 3298.26 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 20:25:12,433 epoch 1 - iter 162/272 - loss 0.97036492 - time (sec): 9.66 - samples/sec: 3297.79 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 20:25:13,892 epoch 1 - iter 189/272 - loss 0.86351205 - time (sec): 11.12 - samples/sec: 3313.32 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 20:25:15,471 epoch 1 - iter 216/272 - loss 0.78804070 - time (sec): 12.70 - samples/sec: 3304.35 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 20:25:16,978 epoch 1 - iter 243/272 - loss 0.73891617 - time (sec): 14.20 - samples/sec: 3280.93 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-16 20:25:18,553 epoch 1 - iter 270/272 - loss 0.68670621 - time (sec): 15.78 - samples/sec: 3286.32 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 20:25:18,634 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:18,634 EPOCH 1 done: loss 0.6852 - lr: 0.000049 |
|
2023-10-16 20:25:19,719 DEV : loss 0.1688479781150818 - f1-score (micro avg) 0.6594 |
|
2023-10-16 20:25:19,723 saving best model |
|
2023-10-16 20:25:20,078 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:21,573 epoch 2 - iter 27/272 - loss 0.21134804 - time (sec): 1.49 - samples/sec: 3023.64 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 20:25:23,209 epoch 2 - iter 54/272 - loss 0.19104150 - time (sec): 3.13 - samples/sec: 2994.50 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 20:25:24,904 epoch 2 - iter 81/272 - loss 0.16881257 - time (sec): 4.82 - samples/sec: 3147.51 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 20:25:26,492 epoch 2 - iter 108/272 - loss 0.15477895 - time (sec): 6.41 - samples/sec: 3259.89 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 20:25:27,898 epoch 2 - iter 135/272 - loss 0.16197566 - time (sec): 7.82 - samples/sec: 3262.77 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 20:25:29,445 epoch 2 - iter 162/272 - loss 0.15573021 - time (sec): 9.37 - samples/sec: 3278.79 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 20:25:31,005 epoch 2 - iter 189/272 - loss 0.15034055 - time (sec): 10.93 - samples/sec: 3257.50 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 20:25:32,546 epoch 2 - iter 216/272 - loss 0.14836079 - time (sec): 12.47 - samples/sec: 3300.85 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 20:25:34,201 epoch 2 - iter 243/272 - loss 0.15183843 - time (sec): 14.12 - samples/sec: 3275.94 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 20:25:35,804 epoch 2 - iter 270/272 - loss 0.14814663 - time (sec): 15.72 - samples/sec: 3293.44 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 20:25:35,909 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:35,909 EPOCH 2 done: loss 0.1476 - lr: 0.000045 |
|
2023-10-16 20:25:37,371 DEV : loss 0.12901346385478973 - f1-score (micro avg) 0.7405 |
|
2023-10-16 20:25:37,375 saving best model |
|
2023-10-16 20:25:37,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:39,354 epoch 3 - iter 27/272 - loss 0.11333207 - time (sec): 1.53 - samples/sec: 2827.72 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-16 20:25:40,966 epoch 3 - iter 54/272 - loss 0.11744646 - time (sec): 3.15 - samples/sec: 3020.31 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 20:25:42,433 epoch 3 - iter 81/272 - loss 0.10995233 - time (sec): 4.61 - samples/sec: 3070.65 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 20:25:44,259 epoch 3 - iter 108/272 - loss 0.09882256 - time (sec): 6.44 - samples/sec: 3078.13 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 20:25:45,788 epoch 3 - iter 135/272 - loss 0.09334488 - time (sec): 7.97 - samples/sec: 3158.88 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 20:25:47,355 epoch 3 - iter 162/272 - loss 0.09177092 - time (sec): 9.53 - samples/sec: 3254.09 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 20:25:48,983 epoch 3 - iter 189/272 - loss 0.08768206 - time (sec): 11.16 - samples/sec: 3222.15 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 20:25:50,395 epoch 3 - iter 216/272 - loss 0.08548366 - time (sec): 12.57 - samples/sec: 3199.11 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 20:25:52,138 epoch 3 - iter 243/272 - loss 0.08434458 - time (sec): 14.32 - samples/sec: 3214.91 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 20:25:53,734 epoch 3 - iter 270/272 - loss 0.08288908 - time (sec): 15.91 - samples/sec: 3250.41 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-16 20:25:53,836 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:53,837 EPOCH 3 done: loss 0.0827 - lr: 0.000039 |
|
2023-10-16 20:25:55,308 DEV : loss 0.11182602494955063 - f1-score (micro avg) 0.7607 |
|
2023-10-16 20:25:55,313 saving best model |
|
2023-10-16 20:25:55,742 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:25:57,369 epoch 4 - iter 27/272 - loss 0.09273322 - time (sec): 1.62 - samples/sec: 3257.48 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 20:25:59,074 epoch 4 - iter 54/272 - loss 0.06332921 - time (sec): 3.33 - samples/sec: 3278.76 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 20:26:00,863 epoch 4 - iter 81/272 - loss 0.05000570 - time (sec): 5.12 - samples/sec: 3284.74 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 20:26:02,391 epoch 4 - iter 108/272 - loss 0.05317997 - time (sec): 6.65 - samples/sec: 3233.92 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 20:26:03,796 epoch 4 - iter 135/272 - loss 0.05115972 - time (sec): 8.05 - samples/sec: 3192.39 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 20:26:05,313 epoch 4 - iter 162/272 - loss 0.05239520 - time (sec): 9.57 - samples/sec: 3230.61 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 20:26:06,833 epoch 4 - iter 189/272 - loss 0.05245734 - time (sec): 11.09 - samples/sec: 3219.61 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 20:26:08,387 epoch 4 - iter 216/272 - loss 0.05505467 - time (sec): 12.64 - samples/sec: 3212.56 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 20:26:09,884 epoch 4 - iter 243/272 - loss 0.05486060 - time (sec): 14.14 - samples/sec: 3236.75 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 20:26:11,527 epoch 4 - iter 270/272 - loss 0.05274898 - time (sec): 15.78 - samples/sec: 3287.76 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 20:26:11,611 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:26:11,611 EPOCH 4 done: loss 0.0527 - lr: 0.000033 |
|
2023-10-16 20:26:13,072 DEV : loss 0.11778035759925842 - f1-score (micro avg) 0.779 |
|
2023-10-16 20:26:13,077 saving best model |
|
2023-10-16 20:26:13,534 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:26:15,154 epoch 5 - iter 27/272 - loss 0.03281956 - time (sec): 1.62 - samples/sec: 3190.04 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 20:26:16,537 epoch 5 - iter 54/272 - loss 0.02540293 - time (sec): 3.00 - samples/sec: 3167.17 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 20:26:18,094 epoch 5 - iter 81/272 - loss 0.02966326 - time (sec): 4.56 - samples/sec: 3185.84 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 20:26:19,639 epoch 5 - iter 108/272 - loss 0.02853719 - time (sec): 6.10 - samples/sec: 3221.43 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 20:26:21,104 epoch 5 - iter 135/272 - loss 0.02748740 - time (sec): 7.57 - samples/sec: 3306.16 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 20:26:22,625 epoch 5 - iter 162/272 - loss 0.03060723 - time (sec): 9.09 - samples/sec: 3299.94 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 20:26:24,438 epoch 5 - iter 189/272 - loss 0.03248342 - time (sec): 10.90 - samples/sec: 3350.10 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 20:26:26,120 epoch 5 - iter 216/272 - loss 0.03492039 - time (sec): 12.58 - samples/sec: 3362.14 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 20:26:27,581 epoch 5 - iter 243/272 - loss 0.03439472 - time (sec): 14.05 - samples/sec: 3330.71 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 20:26:29,103 epoch 5 - iter 270/272 - loss 0.03494240 - time (sec): 15.57 - samples/sec: 3322.15 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 20:26:29,187 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:26:29,187 EPOCH 5 done: loss 0.0348 - lr: 0.000028 |
|
2023-10-16 20:26:30,654 DEV : loss 0.15889212489128113 - f1-score (micro avg) 0.7651 |
|
2023-10-16 20:26:30,659 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:26:32,312 epoch 6 - iter 27/272 - loss 0.01204604 - time (sec): 1.65 - samples/sec: 3473.64 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 20:26:33,804 epoch 6 - iter 54/272 - loss 0.01561420 - time (sec): 3.14 - samples/sec: 3301.32 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 20:26:35,254 epoch 6 - iter 81/272 - loss 0.02034542 - time (sec): 4.59 - samples/sec: 3292.94 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 20:26:36,871 epoch 6 - iter 108/272 - loss 0.02945264 - time (sec): 6.21 - samples/sec: 3296.60 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 20:26:38,555 epoch 6 - iter 135/272 - loss 0.02776080 - time (sec): 7.90 - samples/sec: 3275.42 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 20:26:40,018 epoch 6 - iter 162/272 - loss 0.02920672 - time (sec): 9.36 - samples/sec: 3235.23 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 20:26:41,454 epoch 6 - iter 189/272 - loss 0.02714912 - time (sec): 10.79 - samples/sec: 3206.56 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 20:26:43,065 epoch 6 - iter 216/272 - loss 0.02642711 - time (sec): 12.40 - samples/sec: 3229.05 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 20:26:44,842 epoch 6 - iter 243/272 - loss 0.02513871 - time (sec): 14.18 - samples/sec: 3213.64 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 20:26:46,559 epoch 6 - iter 270/272 - loss 0.02635651 - time (sec): 15.90 - samples/sec: 3246.49 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 20:26:46,662 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:26:46,663 EPOCH 6 done: loss 0.0263 - lr: 0.000022 |
|
2023-10-16 20:26:48,128 DEV : loss 0.1309671252965927 - f1-score (micro avg) 0.814 |
|
2023-10-16 20:26:48,132 saving best model |
|
2023-10-16 20:26:48,579 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:26:50,302 epoch 7 - iter 27/272 - loss 0.02793060 - time (sec): 1.72 - samples/sec: 3573.44 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 20:26:51,864 epoch 7 - iter 54/272 - loss 0.01866505 - time (sec): 3.28 - samples/sec: 3334.01 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 20:26:53,247 epoch 7 - iter 81/272 - loss 0.02043068 - time (sec): 4.66 - samples/sec: 3190.07 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 20:26:54,776 epoch 7 - iter 108/272 - loss 0.02008078 - time (sec): 6.19 - samples/sec: 3238.12 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 20:26:56,265 epoch 7 - iter 135/272 - loss 0.01850352 - time (sec): 7.68 - samples/sec: 3244.72 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 20:26:57,828 epoch 7 - iter 162/272 - loss 0.01706133 - time (sec): 9.24 - samples/sec: 3222.33 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 20:26:59,415 epoch 7 - iter 189/272 - loss 0.01752863 - time (sec): 10.83 - samples/sec: 3267.49 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 20:27:01,170 epoch 7 - iter 216/272 - loss 0.01847442 - time (sec): 12.58 - samples/sec: 3299.75 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 20:27:02,721 epoch 7 - iter 243/272 - loss 0.01770913 - time (sec): 14.13 - samples/sec: 3301.87 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 20:27:04,282 epoch 7 - iter 270/272 - loss 0.01870407 - time (sec): 15.70 - samples/sec: 3303.30 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 20:27:04,362 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:27:04,363 EPOCH 7 done: loss 0.0187 - lr: 0.000017 |
|
2023-10-16 20:27:05,851 DEV : loss 0.1352917104959488 - f1-score (micro avg) 0.8318 |
|
2023-10-16 20:27:05,856 saving best model |
|
2023-10-16 20:27:06,310 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:27:07,883 epoch 8 - iter 27/272 - loss 0.02844365 - time (sec): 1.57 - samples/sec: 3365.77 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 20:27:09,547 epoch 8 - iter 54/272 - loss 0.02318735 - time (sec): 3.23 - samples/sec: 3266.73 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 20:27:10,937 epoch 8 - iter 81/272 - loss 0.02212318 - time (sec): 4.62 - samples/sec: 3251.07 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 20:27:12,585 epoch 8 - iter 108/272 - loss 0.02009919 - time (sec): 6.27 - samples/sec: 3291.31 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 20:27:13,993 epoch 8 - iter 135/272 - loss 0.01813660 - time (sec): 7.68 - samples/sec: 3332.79 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 20:27:15,606 epoch 8 - iter 162/272 - loss 0.01792248 - time (sec): 9.29 - samples/sec: 3313.51 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 20:27:17,195 epoch 8 - iter 189/272 - loss 0.01588766 - time (sec): 10.88 - samples/sec: 3259.91 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 20:27:18,843 epoch 8 - iter 216/272 - loss 0.01585827 - time (sec): 12.53 - samples/sec: 3317.30 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 20:27:20,358 epoch 8 - iter 243/272 - loss 0.01556947 - time (sec): 14.04 - samples/sec: 3321.62 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 20:27:22,042 epoch 8 - iter 270/272 - loss 0.01456604 - time (sec): 15.73 - samples/sec: 3299.76 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 20:27:22,123 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:27:22,124 EPOCH 8 done: loss 0.0147 - lr: 0.000011 |
|
2023-10-16 20:27:23,577 DEV : loss 0.14705629646778107 - f1-score (micro avg) 0.8125 |
|
2023-10-16 20:27:23,582 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:27:25,205 epoch 9 - iter 27/272 - loss 0.00269306 - time (sec): 1.62 - samples/sec: 3374.15 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 20:27:26,868 epoch 9 - iter 54/272 - loss 0.00618586 - time (sec): 3.28 - samples/sec: 3431.07 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 20:27:28,481 epoch 9 - iter 81/272 - loss 0.00636148 - time (sec): 4.90 - samples/sec: 3446.13 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 20:27:29,994 epoch 9 - iter 108/272 - loss 0.00665333 - time (sec): 6.41 - samples/sec: 3330.00 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 20:27:31,591 epoch 9 - iter 135/272 - loss 0.00608565 - time (sec): 8.01 - samples/sec: 3301.21 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 20:27:33,233 epoch 9 - iter 162/272 - loss 0.00623476 - time (sec): 9.65 - samples/sec: 3270.96 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 20:27:34,836 epoch 9 - iter 189/272 - loss 0.00642243 - time (sec): 11.25 - samples/sec: 3261.76 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 20:27:36,355 epoch 9 - iter 216/272 - loss 0.00750627 - time (sec): 12.77 - samples/sec: 3275.92 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 20:27:37,932 epoch 9 - iter 243/272 - loss 0.00778906 - time (sec): 14.35 - samples/sec: 3256.12 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 20:27:39,408 epoch 9 - iter 270/272 - loss 0.00748381 - time (sec): 15.83 - samples/sec: 3260.78 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 20:27:39,523 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:27:39,524 EPOCH 9 done: loss 0.0074 - lr: 0.000006 |
|
2023-10-16 20:27:41,555 DEV : loss 0.1639147847890854 - f1-score (micro avg) 0.8059 |
|
2023-10-16 20:27:41,560 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:27:43,194 epoch 10 - iter 27/272 - loss 0.00747063 - time (sec): 1.63 - samples/sec: 2978.29 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 20:27:44,786 epoch 10 - iter 54/272 - loss 0.00601378 - time (sec): 3.22 - samples/sec: 3048.90 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 20:27:46,444 epoch 10 - iter 81/272 - loss 0.00620060 - time (sec): 4.88 - samples/sec: 3206.59 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 20:27:48,036 epoch 10 - iter 108/272 - loss 0.00729168 - time (sec): 6.47 - samples/sec: 3263.59 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 20:27:49,472 epoch 10 - iter 135/272 - loss 0.00799108 - time (sec): 7.91 - samples/sec: 3275.27 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 20:27:50,990 epoch 10 - iter 162/272 - loss 0.00758061 - time (sec): 9.43 - samples/sec: 3263.02 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 20:27:52,655 epoch 10 - iter 189/272 - loss 0.00796482 - time (sec): 11.09 - samples/sec: 3272.48 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 20:27:54,428 epoch 10 - iter 216/272 - loss 0.00741487 - time (sec): 12.87 - samples/sec: 3311.93 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 20:27:55,870 epoch 10 - iter 243/272 - loss 0.00745972 - time (sec): 14.31 - samples/sec: 3288.99 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 20:27:57,319 epoch 10 - iter 270/272 - loss 0.00713898 - time (sec): 15.76 - samples/sec: 3275.97 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 20:27:57,430 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:27:57,430 EPOCH 10 done: loss 0.0071 - lr: 0.000000 |
|
2023-10-16 20:27:58,968 DEV : loss 0.15600299835205078 - f1-score (micro avg) 0.8103 |
|
2023-10-16 20:27:59,352 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 20:27:59,353 Loading model from best epoch ... |
|
2023-10-16 20:28:00,915 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-16 20:28:03,131 |
|
Results: |
|
- F-score (micro) 0.7875 |
|
- F-score (macro) 0.7428 |
|
- Accuracy 0.6649 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|

         LOC     0.8152    0.8910    0.8515       312 |
|
         PER     0.6818    0.8654    0.7627       208 |
|
         ORG     0.4737    0.4909    0.4821        55 |
|
   HumanProd     0.8077    0.9545    0.8750        22 |
|
|

   micro avg     0.7355    0.8476    0.7875       597 |
|
   macro avg     0.6946    0.8005    0.7428       597 |
|
weighted avg     0.7370    0.8476    0.7874       597 |
|
|
|
2023-10-16 20:28:03,131 ---------------------------------------------------------------------------------------------------- |
|
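The averaged rows in the "By class" table can be cross-checked against the per-class rows. The following is a minimal sketch, not part of Flair or of the original run: the names `per_class`, `macro_avg`, and `weighted_avg` are illustrative, and the inputs are the rounded values copied from the log, so results agree with the log only to about four decimal places. The micro average is computed from pooled true-positive, false-positive, and false-negative counts, which the log does not report, so it is not recomputed here.

```python
# Per-class scores copied from the "By class" table in the log:
# label -> (precision, recall, f1-score, support)
per_class = {
    "LOC": (0.8152, 0.8910, 0.8515, 312),
    "PER": (0.6818, 0.8654, 0.7627, 208),
    "ORG": (0.4737, 0.4909, 0.4821, 55),
    "HumanProd": (0.8077, 0.9545, 0.8750, 22),
}

PRECISION, RECALL, F1, SUPPORT = range(4)


def macro_avg(index):
    """Unweighted mean of one metric over all classes."""
    values = [row[index] for row in per_class.values()]
    return sum(values) / len(values)


def weighted_avg(index):
    """Mean of one metric over all classes, weighted by class support."""
    total = sum(row[SUPPORT] for row in per_class.values())
    return sum(row[index] * row[SUPPORT] for row in per_class.values()) / total


total_support = sum(row[SUPPORT] for row in per_class.values())
macro_f1 = macro_avg(F1)
weighted_f1 = weighted_avg(F1)

# Compare these with the "macro avg" and "weighted avg" rows of the log.
print(f"support={total_support} macro_f1={macro_f1:.4f} weighted_f1={weighted_f1:.4f}")
```

The gap between the macro and weighted F1 comes mostly from ORG: it has the lowest per-class F1 and only 55 of the 597 test entities, so it pulls the unweighted mean down more than the support-weighted one.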
|