2023-10-25 14:09:44,320 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,321 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 14:09:44,321 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,321 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 14:09:44,321 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Train: 20847 sentences
2023-10-25 14:09:44,322 (train_with_dev=False, train_with_test=False)
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Training Params:
2023-10-25 14:09:44,322 - learning_rate: "3e-05"
2023-10-25 14:09:44,322 - mini_batch_size: "8"
2023-10-25 14:09:44,322 - max_epochs: "10"
2023-10-25 14:09:44,322 - shuffle: "True"
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Plugins:
2023-10-25 14:09:44,322 - TensorboardLogger
2023-10-25 14:09:44,322 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
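The LinearScheduler plugin listed above warms the learning rate up linearly over the first 10% of batches and then decays it linearly to zero, which matches the per-iteration `lr` values in the epoch logs below (about 3e-06 after 260 batches, peaking near the configured 3e-05 at the end of epoch 1, and reaching 0 at the end of epoch 10). A minimal sketch of that schedule, assuming the scheduler's step count is batches per epoch times max_epochs (2606 × 10 here):

```python
def linear_warmup_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (sketch of a linear warmup schedule)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # decay phase: scale by the fraction of post-warmup steps remaining
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 2606 * 10  # 2606 batches/epoch x 10 epochs = 26,060 scheduler steps
print(f"{linear_warmup_lr(260, total):.6f}")    # → 0.000003 (epoch 1, iter 260)
print(f"{linear_warmup_lr(2600, total):.6f}")   # → 0.000030 (end of epoch 1, near peak)
print(f"{linear_warmup_lr(26060, total):.6f}")  # → 0.000000 (final step)
```

The helper name and the step-count assumption are illustrative, not taken from the Flair source; the computed values are checked only against the lr column of this log.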
2023-10-25 14:09:44,322 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 14:09:44,322 - metric: "('micro avg', 'f1-score')"
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Computation:
2023-10-25 14:09:44,322 - compute on device: cuda:0
2023-10-25 14:09:44,322 - embedding storage: none
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 ----------------------------------------------------------------------------------------------------
2023-10-25 14:09:44,322 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 14:09:59,239 epoch 1 - iter 260/2606 - loss 1.63450744 - time (sec): 14.92 - samples/sec: 2481.30 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:10:13,050 epoch 1 - iter 520/2606 - loss 1.02153022 - time (sec): 28.73 - samples/sec: 2544.03 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:10:27,475 epoch 1 - iter 780/2606 - loss 0.78405696 - time (sec): 43.15 - samples/sec: 2578.80 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:10:41,459 epoch 1 - iter 1040/2606 - loss 0.64562869 - time (sec): 57.14 - samples/sec: 2592.31 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:10:55,957 epoch 1 - iter 1300/2606 - loss 0.56108502 - time (sec): 71.63 - samples/sec: 2630.53 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:11:09,915 epoch 1 - iter 1560/2606 - loss 0.50028254 - time (sec): 85.59 - samples/sec: 2628.92 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:11:23,827 epoch 1 - iter 1820/2606 - loss 0.45800118 - time (sec): 99.50 - samples/sec: 2628.90 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:11:37,878 epoch 1 - iter 2080/2606 - loss 0.42269524 - time (sec): 113.55 - samples/sec: 2629.76 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:11:51,504 epoch 1 - iter 2340/2606 - loss 0.39871138 - time (sec): 127.18 - samples/sec: 2609.72 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:12:05,905 epoch 1 - iter 2600/2606 - loss 0.37747153 - time (sec): 141.58 - samples/sec: 2591.92 - lr: 0.000030 - momentum: 0.000000
2023-10-25 14:12:06,237 ----------------------------------------------------------------------------------------------------
2023-10-25 14:12:06,238 EPOCH 1 done: loss 0.3772 - lr: 0.000030
2023-10-25 14:12:09,991 DEV : loss 0.13742859661579132 - f1-score (micro avg) 0.3201
2023-10-25 14:12:10,014 saving best model
2023-10-25 14:12:10,611 ----------------------------------------------------------------------------------------------------
2023-10-25 14:12:25,774 epoch 2 - iter 260/2606 - loss 0.14519096 - time (sec): 15.16 - samples/sec: 2511.96 - lr: 0.000030 - momentum: 0.000000
2023-10-25 14:12:40,756 epoch 2 - iter 520/2606 - loss 0.14211845 - time (sec): 30.14 - samples/sec: 2536.32 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:12:55,478 epoch 2 - iter 780/2606 - loss 0.13915106 - time (sec): 44.87 - samples/sec: 2528.96 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:13:09,729 epoch 2 - iter 1040/2606 - loss 0.14043910 - time (sec): 59.12 - samples/sec: 2534.95 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:13:24,189 epoch 2 - iter 1300/2606 - loss 0.14258354 - time (sec): 73.58 - samples/sec: 2531.58 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:13:38,159 epoch 2 - iter 1560/2606 - loss 0.14286427 - time (sec): 87.55 - samples/sec: 2532.92 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:13:52,969 epoch 2 - iter 1820/2606 - loss 0.14570149 - time (sec): 102.36 - samples/sec: 2531.49 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:14:08,066 epoch 2 - iter 2080/2606 - loss 0.14415040 - time (sec): 117.45 - samples/sec: 2524.04 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:14:21,896 epoch 2 - iter 2340/2606 - loss 0.14393099 - time (sec): 131.28 - samples/sec: 2511.98 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:14:36,338 epoch 2 - iter 2600/2606 - loss 0.14285847 - time (sec): 145.73 - samples/sec: 2513.81 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:14:36,722 ----------------------------------------------------------------------------------------------------
2023-10-25 14:14:36,723 EPOCH 2 done: loss 0.1427 - lr: 0.000027
2023-10-25 14:14:43,804 DEV : loss 0.1843709945678711 - f1-score (micro avg) 0.3536
2023-10-25 14:14:43,827 saving best model
2023-10-25 14:14:44,490 ----------------------------------------------------------------------------------------------------
2023-10-25 14:14:59,471 epoch 3 - iter 260/2606 - loss 0.08794930 - time (sec): 14.98 - samples/sec: 2450.26 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:15:13,545 epoch 3 - iter 520/2606 - loss 0.09651930 - time (sec): 29.05 - samples/sec: 2464.24 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:15:28,051 epoch 3 - iter 780/2606 - loss 0.09574977 - time (sec): 43.56 - samples/sec: 2469.11 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:15:42,170 epoch 3 - iter 1040/2606 - loss 0.09264605 - time (sec): 57.68 - samples/sec: 2526.05 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:15:56,294 epoch 3 - iter 1300/2606 - loss 0.09201435 - time (sec): 71.80 - samples/sec: 2550.69 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:16:10,616 epoch 3 - iter 1560/2606 - loss 0.09390110 - time (sec): 86.12 - samples/sec: 2544.81 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:16:24,326 epoch 3 - iter 1820/2606 - loss 0.09474851 - time (sec): 99.83 - samples/sec: 2554.95 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:16:38,591 epoch 3 - iter 2080/2606 - loss 0.09517834 - time (sec): 114.10 - samples/sec: 2559.85 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:16:52,243 epoch 3 - iter 2340/2606 - loss 0.09607960 - time (sec): 127.75 - samples/sec: 2559.26 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:17:07,509 epoch 3 - iter 2600/2606 - loss 0.09548921 - time (sec): 143.02 - samples/sec: 2562.55 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:17:07,844 ----------------------------------------------------------------------------------------------------
2023-10-25 14:17:07,844 EPOCH 3 done: loss 0.0956 - lr: 0.000023
2023-10-25 14:17:14,739 DEV : loss 0.17078012228012085 - f1-score (micro avg) 0.4127
2023-10-25 14:17:14,763 saving best model
2023-10-25 14:17:15,286 ----------------------------------------------------------------------------------------------------
2023-10-25 14:17:30,053 epoch 4 - iter 260/2606 - loss 0.07013202 - time (sec): 14.77 - samples/sec: 2448.77 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:17:44,224 epoch 4 - iter 520/2606 - loss 0.06707470 - time (sec): 28.94 - samples/sec: 2510.68 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:17:58,435 epoch 4 - iter 780/2606 - loss 0.06721877 - time (sec): 43.15 - samples/sec: 2530.08 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:18:12,528 epoch 4 - iter 1040/2606 - loss 0.06904912 - time (sec): 57.24 - samples/sec: 2502.44 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:18:26,510 epoch 4 - iter 1300/2606 - loss 0.06928579 - time (sec): 71.22 - samples/sec: 2498.94 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:18:41,075 epoch 4 - iter 1560/2606 - loss 0.06747754 - time (sec): 85.79 - samples/sec: 2546.69 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:18:55,036 epoch 4 - iter 1820/2606 - loss 0.06629190 - time (sec): 99.75 - samples/sec: 2529.68 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:19:09,844 epoch 4 - iter 2080/2606 - loss 0.06673662 - time (sec): 114.56 - samples/sec: 2531.10 - lr: 0.000021 - momentum: 0.000000
2023-10-25 14:19:23,694 epoch 4 - iter 2340/2606 - loss 0.06702601 - time (sec): 128.41 - samples/sec: 2542.34 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:19:39,275 epoch 4 - iter 2600/2606 - loss 0.06621362 - time (sec): 143.99 - samples/sec: 2543.76 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:19:39,660 ----------------------------------------------------------------------------------------------------
2023-10-25 14:19:39,660 EPOCH 4 done: loss 0.0662 - lr: 0.000020
2023-10-25 14:19:46,463 DEV : loss 0.273721307516098 - f1-score (micro avg) 0.386
2023-10-25 14:19:46,487 ----------------------------------------------------------------------------------------------------
2023-10-25 14:20:00,639 epoch 5 - iter 260/2606 - loss 0.06691840 - time (sec): 14.15 - samples/sec: 2630.55 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:20:14,529 epoch 5 - iter 520/2606 - loss 0.05760987 - time (sec): 28.04 - samples/sec: 2591.09 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:20:29,637 epoch 5 - iter 780/2606 - loss 0.06058874 - time (sec): 43.15 - samples/sec: 2563.17 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:20:44,571 epoch 5 - iter 1040/2606 - loss 0.05700962 - time (sec): 58.08 - samples/sec: 2551.13 - lr: 0.000019 - momentum: 0.000000
2023-10-25 14:21:00,977 epoch 5 - iter 1300/2606 - loss 0.05444334 - time (sec): 74.49 - samples/sec: 2449.95 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:21:16,309 epoch 5 - iter 1560/2606 - loss 0.05260313 - time (sec): 89.82 - samples/sec: 2439.97 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:21:30,068 epoch 5 - iter 1820/2606 - loss 0.05245316 - time (sec): 103.58 - samples/sec: 2469.81 - lr: 0.000018 - momentum: 0.000000
2023-10-25 14:21:44,396 epoch 5 - iter 2080/2606 - loss 0.05144192 - time (sec): 117.91 - samples/sec: 2493.38 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:21:59,116 epoch 5 - iter 2340/2606 - loss 0.05047561 - time (sec): 132.63 - samples/sec: 2507.01 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:22:13,165 epoch 5 - iter 2600/2606 - loss 0.05025500 - time (sec): 146.68 - samples/sec: 2498.92 - lr: 0.000017 - momentum: 0.000000
2023-10-25 14:22:13,537 ----------------------------------------------------------------------------------------------------
2023-10-25 14:22:13,538 EPOCH 5 done: loss 0.0503 - lr: 0.000017
2023-10-25 14:22:20,360 DEV : loss 0.31281721591949463 - f1-score (micro avg) 0.3713
2023-10-25 14:22:20,384 ----------------------------------------------------------------------------------------------------
2023-10-25 14:22:34,887 epoch 6 - iter 260/2606 - loss 0.02750655 - time (sec): 14.50 - samples/sec: 2716.97 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:22:49,531 epoch 6 - iter 520/2606 - loss 0.03366210 - time (sec): 29.15 - samples/sec: 2697.50 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:23:03,291 epoch 6 - iter 780/2606 - loss 0.03476698 - time (sec): 42.91 - samples/sec: 2635.59 - lr: 0.000016 - momentum: 0.000000
2023-10-25 14:23:17,137 epoch 6 - iter 1040/2606 - loss 0.03562665 - time (sec): 56.75 - samples/sec: 2628.17 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:23:30,397 epoch 6 - iter 1300/2606 - loss 0.03616456 - time (sec): 70.01 - samples/sec: 2608.05 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:23:44,783 epoch 6 - iter 1560/2606 - loss 0.03621415 - time (sec): 84.40 - samples/sec: 2590.93 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:23:59,274 epoch 6 - iter 1820/2606 - loss 0.03535518 - time (sec): 98.89 - samples/sec: 2591.35 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:24:13,367 epoch 6 - iter 2080/2606 - loss 0.03506987 - time (sec): 112.98 - samples/sec: 2601.35 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:24:28,280 epoch 6 - iter 2340/2606 - loss 0.03467514 - time (sec): 127.90 - samples/sec: 2589.94 - lr: 0.000014 - momentum: 0.000000
2023-10-25 14:24:42,383 epoch 6 - iter 2600/2606 - loss 0.03512806 - time (sec): 142.00 - samples/sec: 2576.94 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:24:42,750 ----------------------------------------------------------------------------------------------------
2023-10-25 14:24:42,750 EPOCH 6 done: loss 0.0351 - lr: 0.000013
2023-10-25 14:24:49,602 DEV : loss 0.3527540862560272 - f1-score (micro avg) 0.3748
2023-10-25 14:24:49,626 ----------------------------------------------------------------------------------------------------
2023-10-25 14:25:03,759 epoch 7 - iter 260/2606 - loss 0.03135937 - time (sec): 14.13 - samples/sec: 2463.00 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:25:18,452 epoch 7 - iter 520/2606 - loss 0.02798732 - time (sec): 28.82 - samples/sec: 2492.97 - lr: 0.000013 - momentum: 0.000000
2023-10-25 14:25:31,831 epoch 7 - iter 780/2606 - loss 0.02966098 - time (sec): 42.20 - samples/sec: 2488.39 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:25:46,387 epoch 7 - iter 1040/2606 - loss 0.02906116 - time (sec): 56.76 - samples/sec: 2501.44 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:26:00,733 epoch 7 - iter 1300/2606 - loss 0.02817930 - time (sec): 71.11 - samples/sec: 2513.06 - lr: 0.000012 - momentum: 0.000000
2023-10-25 14:26:15,858 epoch 7 - iter 1560/2606 - loss 0.02768488 - time (sec): 86.23 - samples/sec: 2549.14 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:26:31,035 epoch 7 - iter 1820/2606 - loss 0.02723490 - time (sec): 101.41 - samples/sec: 2567.64 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:26:45,536 epoch 7 - iter 2080/2606 - loss 0.02700111 - time (sec): 115.91 - samples/sec: 2566.07 - lr: 0.000011 - momentum: 0.000000
2023-10-25 14:26:59,816 epoch 7 - iter 2340/2606 - loss 0.02648609 - time (sec): 130.19 - samples/sec: 2564.80 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:27:13,882 epoch 7 - iter 2600/2606 - loss 0.02661855 - time (sec): 144.26 - samples/sec: 2541.36 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:27:14,206 ----------------------------------------------------------------------------------------------------
2023-10-25 14:27:14,206 EPOCH 7 done: loss 0.0266 - lr: 0.000010
2023-10-25 14:27:21,216 DEV : loss 0.4295073449611664 - f1-score (micro avg) 0.3762
2023-10-25 14:27:21,253 ----------------------------------------------------------------------------------------------------
2023-10-25 14:27:36,663 epoch 8 - iter 260/2606 - loss 0.02181414 - time (sec): 15.41 - samples/sec: 2521.69 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:27:50,289 epoch 8 - iter 520/2606 - loss 0.01996708 - time (sec): 29.03 - samples/sec: 2522.06 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:28:04,303 epoch 8 - iter 780/2606 - loss 0.01971643 - time (sec): 43.05 - samples/sec: 2548.27 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:28:18,416 epoch 8 - iter 1040/2606 - loss 0.01910012 - time (sec): 57.16 - samples/sec: 2548.36 - lr: 0.000009 - momentum: 0.000000
2023-10-25 14:28:32,654 epoch 8 - iter 1300/2606 - loss 0.01932305 - time (sec): 71.40 - samples/sec: 2571.72 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:28:46,927 epoch 8 - iter 1560/2606 - loss 0.01911397 - time (sec): 85.67 - samples/sec: 2599.53 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:29:00,738 epoch 8 - iter 1820/2606 - loss 0.01903817 - time (sec): 99.48 - samples/sec: 2594.02 - lr: 0.000008 - momentum: 0.000000
2023-10-25 14:29:15,408 epoch 8 - iter 2080/2606 - loss 0.01902221 - time (sec): 114.15 - samples/sec: 2627.51 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:29:29,121 epoch 8 - iter 2340/2606 - loss 0.01834796 - time (sec): 127.87 - samples/sec: 2611.02 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:29:42,489 epoch 8 - iter 2600/2606 - loss 0.01832678 - time (sec): 141.23 - samples/sec: 2594.69 - lr: 0.000007 - momentum: 0.000000
2023-10-25 14:29:42,908 ----------------------------------------------------------------------------------------------------
2023-10-25 14:29:42,908 EPOCH 8 done: loss 0.0183 - lr: 0.000007
2023-10-25 14:29:49,377 DEV : loss 0.4789745509624481 - f1-score (micro avg) 0.383
2023-10-25 14:29:49,402 ----------------------------------------------------------------------------------------------------
2023-10-25 14:30:03,298 epoch 9 - iter 260/2606 - loss 0.01111594 - time (sec): 13.89 - samples/sec: 2464.98 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:30:17,830 epoch 9 - iter 520/2606 - loss 0.01321447 - time (sec): 28.43 - samples/sec: 2547.08 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:30:32,851 epoch 9 - iter 780/2606 - loss 0.01217180 - time (sec): 43.45 - samples/sec: 2505.70 - lr: 0.000006 - momentum: 0.000000
2023-10-25 14:30:46,570 epoch 9 - iter 1040/2606 - loss 0.01514475 - time (sec): 57.17 - samples/sec: 2524.17 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:31:00,565 epoch 9 - iter 1300/2606 - loss 0.01506380 - time (sec): 71.16 - samples/sec: 2521.69 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:31:14,890 epoch 9 - iter 1560/2606 - loss 0.01489752 - time (sec): 85.49 - samples/sec: 2535.73 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:31:29,184 epoch 9 - iter 1820/2606 - loss 0.01427346 - time (sec): 99.78 - samples/sec: 2541.19 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:31:43,563 epoch 9 - iter 2080/2606 - loss 0.01410493 - time (sec): 114.16 - samples/sec: 2565.31 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:31:57,830 epoch 9 - iter 2340/2606 - loss 0.01382165 - time (sec): 128.43 - samples/sec: 2577.17 - lr: 0.000004 - momentum: 0.000000
2023-10-25 14:32:11,563 epoch 9 - iter 2600/2606 - loss 0.01346922 - time (sec): 142.16 - samples/sec: 2580.05 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:32:11,840 ----------------------------------------------------------------------------------------------------
2023-10-25 14:32:11,840 EPOCH 9 done: loss 0.0134 - lr: 0.000003
2023-10-25 14:32:18,364 DEV : loss 0.46503743529319763 - f1-score (micro avg) 0.3877
2023-10-25 14:32:18,390 ----------------------------------------------------------------------------------------------------
2023-10-25 14:32:32,275 epoch 10 - iter 260/2606 - loss 0.00779959 - time (sec): 13.88 - samples/sec: 2537.23 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:32:47,937 epoch 10 - iter 520/2606 - loss 0.00959445 - time (sec): 29.55 - samples/sec: 2480.44 - lr: 0.000003 - momentum: 0.000000
2023-10-25 14:33:03,639 epoch 10 - iter 780/2606 - loss 0.00886606 - time (sec): 45.25 - samples/sec: 2363.03 - lr: 0.000002 - momentum: 0.000000
2023-10-25 14:33:17,842 epoch 10 - iter 1040/2606 - loss 0.00899256 - time (sec): 59.45 - samples/sec: 2439.45 - lr: 0.000002 - momentum: 0.000000
2023-10-25 14:33:32,181 epoch 10 - iter 1300/2606 - loss 0.00983203 - time (sec): 73.79 - samples/sec: 2485.30 - lr: 0.000002 - momentum: 0.000000
2023-10-25 14:33:47,503 epoch 10 - iter 1560/2606 - loss 0.00977552 - time (sec): 89.11 - samples/sec: 2483.36 - lr: 0.000001 - momentum: 0.000000
2023-10-25 14:34:01,483 epoch 10 - iter 1820/2606 - loss 0.01040378 - time (sec): 103.09 - samples/sec: 2500.37 - lr: 0.000001 - momentum: 0.000000
2023-10-25 14:34:15,530 epoch 10 - iter 2080/2606 - loss 0.00988332 - time (sec): 117.14 - samples/sec: 2505.05 - lr: 0.000001 - momentum: 0.000000
2023-10-25 14:34:30,609 epoch 10 - iter 2340/2606 - loss 0.00983035 - time (sec): 132.22 - samples/sec: 2485.61 - lr: 0.000000 - momentum: 0.000000
2023-10-25 14:34:44,672 epoch 10 - iter 2600/2606 - loss 0.00949045 - time (sec): 146.28 - samples/sec: 2505.61 - lr: 0.000000 - momentum: 0.000000
2023-10-25 14:34:44,984 ----------------------------------------------------------------------------------------------------
2023-10-25 14:34:44,984 EPOCH 10 done: loss 0.0095 - lr: 0.000000
2023-10-25 14:34:51,458 DEV : loss 0.46764612197875977 - f1-score (micro avg) 0.3931
2023-10-25 14:34:52,026 ----------------------------------------------------------------------------------------------------
2023-10-25 14:34:52,027 Loading model from best epoch ...
2023-10-25 14:34:53,898 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
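The 17-entry tag dictionary above follows the BIOES scheme: one O tag plus four positional variants (S = single-token span, B = begin, E = end, I = inside) for each of the four entity types, which is also why the tagger's final linear layer has out_features=17. A hypothetical reconstruction of that dictionary's label list:

```python
# Entity types and the BIOES position prefixes, in the order the log prints them
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{pos}-{etype}" for etype in entity_types
                for pos in ("S", "B", "E", "I")]

print(len(tags))   # → 17 (4 types x 4 positions + O)
print(tags[:5])    # → ['O', 'S-LOC', 'B-LOC', 'E-LOC', 'I-LOC']
```

The list-building code is illustrative; Flair builds its tag dictionary internally from the corpus annotations.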
2023-10-25 14:35:03,672
Results:
- F-score (micro) 0.4654
- F-score (macro) 0.2881
- Accuracy 0.3061

By class:
              precision    recall  f1-score   support

         LOC     0.5611    0.6203    0.5892      1214
         PER     0.4804    0.2884    0.3604       808
         ORG     0.2175    0.1898    0.2027       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4932    0.4406    0.4654      2390
   macro avg     0.3148    0.2746    0.2881      2390
weighted avg     0.4796    0.4406    0.4511      2390
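Each f1-score in the table is the harmonic mean of the precision and recall on its row; the micro average pools all 2390 gold spans, while the macro average is the unweighted mean over the four classes (including HumanProd's zeros). A quick check of the table's micro and macro rows:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values taken from the by-class table above
print(round(f1(0.4932, 0.4406), 4))  # → 0.4654 (micro avg f1)

per_class_f1 = [f1(0.5611, 0.6203),  # LOC
                f1(0.4804, 0.2884),  # PER
                f1(0.2175, 0.1898),  # ORG
                f1(0.0000, 0.0000)]  # HumanProd
print(round(sum(per_class_f1) / len(per_class_f1), 4))  # → 0.2881 (macro avg f1)
```

Note the macro row's precision and recall columns are likewise plain means of the per-class columns, so a class with only 15 support (HumanProd) drags the macro scores well below the micro ones.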
2023-10-25 14:35:03,672 ----------------------------------------------------------------------------------------------------