2023-10-25 09:57:58,522 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,523 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
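The printed module shapes are enough to estimate the model size by hand. A minimal sketch (plain Python; shapes copied from the repr above, counting `Linear` weights plus biases and the two affine vectors of each `LayerNorm`, ignoring buffers):

```python
# Parameter count reconstructed from the module shapes printed above.
# Linear(i, o, bias=True) has i*o + o parameters; LayerNorm(d) has 2*d.

def linear(i, o):
    return i * o + o

def layer_norm(d):
    return 2 * d

embeddings = (
    64001 * 768       # word_embeddings
    + 512 * 768       # position_embeddings
    + 2 * 768         # token_type_embeddings
    + layer_norm(768)
)

per_layer = (
    3 * linear(768, 768)    # query, key, value
    + linear(768, 768)      # self-attention output dense
    + layer_norm(768)
    + linear(768, 3072)     # intermediate
    + linear(3072, 768)     # output
    + layer_norm(768)
)

encoder = 12 * per_layer
pooler = linear(768, 768)
tagger_head = linear(768, 13)   # 13 BIOES tags

total = embeddings + encoder + pooler + tagger_head
print(f"~{total / 1e6:.1f}M parameters")  # ~135.2M parameters
```

Most of the budget sits in the 64k-entry word embedding matrix (~49M) and the 12 encoder layers (~85M); the task head adds under 10k parameters.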
2023-10-25 09:57:58,523 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,524 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 09:57:58,524 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,524 Train: 6183 sentences
2023-10-25 09:57:58,524 (train_with_dev=False, train_with_test=False)
2023-10-25 09:57:58,524 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,524 Training Params:
2023-10-25 09:57:58,524 - learning_rate: "5e-05"
2023-10-25 09:57:58,524 - mini_batch_size: "8"
2023-10-25 09:57:58,524 - max_epochs: "10"
2023-10-25 09:57:58,524 - shuffle: "True"
2023-10-25 09:57:58,524 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,524 Plugins:
2023-10-25 09:57:58,524 - TensorboardLogger
2023-10-25 09:57:58,524 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 09:57:58,524 ----------------------------------------------------------------------------------------------------
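The LinearScheduler plugin with `warmup_fraction: 0.1` corresponds to a linear warmup over the first 10% of update steps followed by linear decay to zero. A rough sketch of that schedule (pure Python; assumes 773 batches/epoch × 10 epochs = 7730 steps, matching the iteration counts logged below — an approximation of Flair's actual implementation, not a copy of it):

```python
# Approximate per-step learning rate: linear warmup, then linear decay.
BASE_LR = 5e-5
STEPS_PER_EPOCH = 773
TOTAL_STEPS = 10 * STEPS_PER_EPOCH        # 7730
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)     # 773 = one epoch of warmup

def lr_at(step: int) -> float:
    """Ramp up linearly for the first 10% of steps, then decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    return BASE_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Consistent with the logged values: epoch 1, iter 77 -> lr ~0.000005;
# end of epoch 1 (step 773) -> peak 0.000050; final step -> 0.
```

This explains why the logged `lr` column climbs through epoch 1 and then shrinks steadily toward zero by epoch 10.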
2023-10-25 09:57:58,524 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 09:57:58,524 - metric: "('micro avg', 'f1-score')"
2023-10-25 09:57:58,524 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,524 Computation:
2023-10-25 09:57:58,524 - compute on device: cuda:0
2023-10-25 09:57:58,524 - embedding storage: none
2023-10-25 09:57:58,524 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,525 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 09:57:58,525 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,525 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:58,525 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 09:58:03,484 epoch 1 - iter 77/773 - loss 2.03904562 - time (sec): 4.96 - samples/sec: 2546.24 - lr: 0.000005 - momentum: 0.000000
2023-10-25 09:58:08,424 epoch 1 - iter 154/773 - loss 1.12019790 - time (sec): 9.90 - samples/sec: 2562.94 - lr: 0.000010 - momentum: 0.000000
2023-10-25 09:58:13,526 epoch 1 - iter 231/773 - loss 0.79717628 - time (sec): 15.00 - samples/sec: 2507.80 - lr: 0.000015 - momentum: 0.000000
2023-10-25 09:58:18,475 epoch 1 - iter 308/773 - loss 0.62691977 - time (sec): 19.95 - samples/sec: 2515.65 - lr: 0.000020 - momentum: 0.000000
2023-10-25 09:58:23,289 epoch 1 - iter 385/773 - loss 0.53255000 - time (sec): 24.76 - samples/sec: 2487.10 - lr: 0.000025 - momentum: 0.000000
2023-10-25 09:58:28,225 epoch 1 - iter 462/773 - loss 0.46194793 - time (sec): 29.70 - samples/sec: 2488.13 - lr: 0.000030 - momentum: 0.000000
2023-10-25 09:58:33,164 epoch 1 - iter 539/773 - loss 0.41113608 - time (sec): 34.64 - samples/sec: 2494.45 - lr: 0.000035 - momentum: 0.000000
2023-10-25 09:58:38,154 epoch 1 - iter 616/773 - loss 0.37213224 - time (sec): 39.63 - samples/sec: 2506.66 - lr: 0.000040 - momentum: 0.000000
2023-10-25 09:58:43,158 epoch 1 - iter 693/773 - loss 0.34227260 - time (sec): 44.63 - samples/sec: 2495.53 - lr: 0.000045 - momentum: 0.000000
2023-10-25 09:58:48,050 epoch 1 - iter 770/773 - loss 0.31635614 - time (sec): 49.52 - samples/sec: 2504.31 - lr: 0.000050 - momentum: 0.000000
2023-10-25 09:58:48,231 ----------------------------------------------------------------------------------------------------
2023-10-25 09:58:48,232 EPOCH 1 done: loss 0.3160 - lr: 0.000050
2023-10-25 09:58:50,793 DEV : loss 0.07084957510232925 - f1-score (micro avg) 0.6729
2023-10-25 09:58:50,820 saving best model
2023-10-25 09:58:51,326 ----------------------------------------------------------------------------------------------------
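Iteration lines in this fixed format are easy to post-process for plotting loss or learning-rate curves. A small sketch (the regex is my own, written against the line format above; `parse_iter_line` is a hypothetical helper, not part of Flair):

```python
import re
from typing import Optional

# Fields: epoch, iter, total iters, loss, elapsed seconds, samples/sec, lr.
ITER_RE = re.compile(
    r"epoch (\d+) - iter (\d+)/(\d+) - loss ([\d.]+) - "
    r"time \(sec\): ([\d.]+) - samples/sec: ([\d.]+) - lr: ([\d.]+)"
)

def parse_iter_line(line: str) -> Optional[dict]:
    """Extract the numeric fields from one iteration log line."""
    m = ITER_RE.search(line)
    if not m:
        return None
    epoch, it, total, loss, secs, sps, lr = m.groups()
    return {
        "epoch": int(epoch),
        "iter": int(it),
        "total_iters": int(total),
        "loss": float(loss),
        "time_sec": float(secs),
        "samples_per_sec": float(sps),
        "lr": float(lr),
    }

line = ("2023-10-25 09:58:48,050 epoch 1 - iter 770/773 - loss 0.31635614 "
        "- time (sec): 49.52 - samples/sec: 2504.31 - lr: 0.000050 "
        "- momentum: 0.000000")
row = parse_iter_line(line)
```

Lines that are not iteration records (separators, DEV scores, "saving best model") simply return `None` and can be skipped.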
2023-10-25 09:58:56,425 epoch 2 - iter 77/773 - loss 0.08383621 - time (sec): 5.10 - samples/sec: 2422.97 - lr: 0.000049 - momentum: 0.000000
2023-10-25 09:59:01,475 epoch 2 - iter 154/773 - loss 0.08060651 - time (sec): 10.15 - samples/sec: 2410.00 - lr: 0.000049 - momentum: 0.000000
2023-10-25 09:59:06,628 epoch 2 - iter 231/773 - loss 0.08038522 - time (sec): 15.30 - samples/sec: 2410.53 - lr: 0.000048 - momentum: 0.000000
2023-10-25 09:59:11,762 epoch 2 - iter 308/773 - loss 0.07891260 - time (sec): 20.43 - samples/sec: 2417.00 - lr: 0.000048 - momentum: 0.000000
2023-10-25 09:59:16,729 epoch 2 - iter 385/773 - loss 0.07927934 - time (sec): 25.40 - samples/sec: 2421.31 - lr: 0.000047 - momentum: 0.000000
2023-10-25 09:59:21,933 epoch 2 - iter 462/773 - loss 0.07826921 - time (sec): 30.60 - samples/sec: 2413.32 - lr: 0.000047 - momentum: 0.000000
2023-10-25 09:59:26,942 epoch 2 - iter 539/773 - loss 0.07567819 - time (sec): 35.61 - samples/sec: 2427.55 - lr: 0.000046 - momentum: 0.000000
2023-10-25 09:59:32,543 epoch 2 - iter 616/773 - loss 0.07482291 - time (sec): 41.21 - samples/sec: 2403.65 - lr: 0.000046 - momentum: 0.000000
2023-10-25 09:59:37,548 epoch 2 - iter 693/773 - loss 0.07633291 - time (sec): 46.22 - samples/sec: 2417.97 - lr: 0.000045 - momentum: 0.000000
2023-10-25 09:59:42,566 epoch 2 - iter 770/773 - loss 0.07631245 - time (sec): 51.24 - samples/sec: 2418.55 - lr: 0.000044 - momentum: 0.000000
2023-10-25 09:59:42,749 ----------------------------------------------------------------------------------------------------
2023-10-25 09:59:42,750 EPOCH 2 done: loss 0.0763 - lr: 0.000044
2023-10-25 09:59:45,492 DEV : loss 0.05928114056587219 - f1-score (micro avg) 0.7621
2023-10-25 09:59:45,523 saving best model
2023-10-25 09:59:46,246 ----------------------------------------------------------------------------------------------------
2023-10-25 09:59:51,253 epoch 3 - iter 77/773 - loss 0.03484718 - time (sec): 5.00 - samples/sec: 2494.02 - lr: 0.000044 - momentum: 0.000000
2023-10-25 09:59:56,296 epoch 3 - iter 154/773 - loss 0.03415565 - time (sec): 10.05 - samples/sec: 2431.13 - lr: 0.000043 - momentum: 0.000000
2023-10-25 10:00:01,430 epoch 3 - iter 231/773 - loss 0.03721366 - time (sec): 15.18 - samples/sec: 2406.07 - lr: 0.000043 - momentum: 0.000000
2023-10-25 10:00:06,559 epoch 3 - iter 308/773 - loss 0.04116588 - time (sec): 20.31 - samples/sec: 2404.77 - lr: 0.000042 - momentum: 0.000000
2023-10-25 10:00:11,520 epoch 3 - iter 385/773 - loss 0.04243653 - time (sec): 25.27 - samples/sec: 2419.43 - lr: 0.000042 - momentum: 0.000000
2023-10-25 10:00:16,488 epoch 3 - iter 462/773 - loss 0.04382048 - time (sec): 30.24 - samples/sec: 2448.99 - lr: 0.000041 - momentum: 0.000000
2023-10-25 10:00:21,593 epoch 3 - iter 539/773 - loss 0.04459189 - time (sec): 35.34 - samples/sec: 2449.95 - lr: 0.000041 - momentum: 0.000000
2023-10-25 10:00:26,549 epoch 3 - iter 616/773 - loss 0.04536472 - time (sec): 40.30 - samples/sec: 2454.64 - lr: 0.000040 - momentum: 0.000000
2023-10-25 10:00:31,453 epoch 3 - iter 693/773 - loss 0.04577056 - time (sec): 45.20 - samples/sec: 2458.03 - lr: 0.000039 - momentum: 0.000000
2023-10-25 10:00:36,478 epoch 3 - iter 770/773 - loss 0.04613942 - time (sec): 50.23 - samples/sec: 2461.36 - lr: 0.000039 - momentum: 0.000000
2023-10-25 10:00:36,682 ----------------------------------------------------------------------------------------------------
2023-10-25 10:00:36,683 EPOCH 3 done: loss 0.0461 - lr: 0.000039
2023-10-25 10:00:39,113 DEV : loss 0.07082913815975189 - f1-score (micro avg) 0.7401
2023-10-25 10:00:39,131 ----------------------------------------------------------------------------------------------------
2023-10-25 10:00:44,166 epoch 4 - iter 77/773 - loss 0.02209400 - time (sec): 5.03 - samples/sec: 2555.77 - lr: 0.000038 - momentum: 0.000000
2023-10-25 10:00:49,060 epoch 4 - iter 154/773 - loss 0.02495574 - time (sec): 9.93 - samples/sec: 2463.86 - lr: 0.000038 - momentum: 0.000000
2023-10-25 10:00:54,178 epoch 4 - iter 231/773 - loss 0.02855728 - time (sec): 15.05 - samples/sec: 2494.99 - lr: 0.000037 - momentum: 0.000000
2023-10-25 10:00:59,390 epoch 4 - iter 308/773 - loss 0.03133128 - time (sec): 20.26 - samples/sec: 2491.55 - lr: 0.000037 - momentum: 0.000000
2023-10-25 10:01:04,471 epoch 4 - iter 385/773 - loss 0.03152342 - time (sec): 25.34 - samples/sec: 2456.61 - lr: 0.000036 - momentum: 0.000000
2023-10-25 10:01:09,557 epoch 4 - iter 462/773 - loss 0.03107184 - time (sec): 30.42 - samples/sec: 2460.15 - lr: 0.000036 - momentum: 0.000000
2023-10-25 10:01:14,638 epoch 4 - iter 539/773 - loss 0.03090760 - time (sec): 35.51 - samples/sec: 2475.10 - lr: 0.000035 - momentum: 0.000000
2023-10-25 10:01:19,778 epoch 4 - iter 616/773 - loss 0.03088891 - time (sec): 40.65 - samples/sec: 2462.11 - lr: 0.000034 - momentum: 0.000000
2023-10-25 10:01:24,787 epoch 4 - iter 693/773 - loss 0.03076526 - time (sec): 45.65 - samples/sec: 2454.39 - lr: 0.000034 - momentum: 0.000000
2023-10-25 10:01:29,768 epoch 4 - iter 770/773 - loss 0.03111074 - time (sec): 50.64 - samples/sec: 2444.60 - lr: 0.000033 - momentum: 0.000000
2023-10-25 10:01:29,960 ----------------------------------------------------------------------------------------------------
2023-10-25 10:01:29,960 EPOCH 4 done: loss 0.0312 - lr: 0.000033
2023-10-25 10:01:32,618 DEV : loss 0.08776696026325226 - f1-score (micro avg) 0.7409
2023-10-25 10:01:32,634 ----------------------------------------------------------------------------------------------------
2023-10-25 10:01:37,624 epoch 5 - iter 77/773 - loss 0.02788476 - time (sec): 4.99 - samples/sec: 2283.51 - lr: 0.000033 - momentum: 0.000000
2023-10-25 10:01:42,764 epoch 5 - iter 154/773 - loss 0.02471748 - time (sec): 10.13 - samples/sec: 2431.01 - lr: 0.000032 - momentum: 0.000000
2023-10-25 10:01:47,922 epoch 5 - iter 231/773 - loss 0.02124558 - time (sec): 15.29 - samples/sec: 2447.42 - lr: 0.000032 - momentum: 0.000000
2023-10-25 10:01:53,049 epoch 5 - iter 308/773 - loss 0.02130901 - time (sec): 20.41 - samples/sec: 2441.56 - lr: 0.000031 - momentum: 0.000000
2023-10-25 10:01:58,037 epoch 5 - iter 385/773 - loss 0.02252948 - time (sec): 25.40 - samples/sec: 2441.86 - lr: 0.000031 - momentum: 0.000000
2023-10-25 10:02:03,283 epoch 5 - iter 462/773 - loss 0.02151139 - time (sec): 30.65 - samples/sec: 2434.62 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:02:08,345 epoch 5 - iter 539/773 - loss 0.02272270 - time (sec): 35.71 - samples/sec: 2429.79 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:02:13,261 epoch 5 - iter 616/773 - loss 0.02261758 - time (sec): 40.63 - samples/sec: 2440.11 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:02:18,235 epoch 5 - iter 693/773 - loss 0.02263034 - time (sec): 45.60 - samples/sec: 2447.06 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:02:23,287 epoch 5 - iter 770/773 - loss 0.02286498 - time (sec): 50.65 - samples/sec: 2446.74 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:02:23,467 ----------------------------------------------------------------------------------------------------
2023-10-25 10:02:23,467 EPOCH 5 done: loss 0.0228 - lr: 0.000028
2023-10-25 10:02:26,021 DEV : loss 0.10517541319131851 - f1-score (micro avg) 0.7738
2023-10-25 10:02:26,040 saving best model
2023-10-25 10:02:26,757 ----------------------------------------------------------------------------------------------------
2023-10-25 10:02:31,936 epoch 6 - iter 77/773 - loss 0.01060825 - time (sec): 5.18 - samples/sec: 2429.49 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:02:37,075 epoch 6 - iter 154/773 - loss 0.01094672 - time (sec): 10.32 - samples/sec: 2419.14 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:02:42,090 epoch 6 - iter 231/773 - loss 0.01297909 - time (sec): 15.33 - samples/sec: 2439.59 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:02:47,138 epoch 6 - iter 308/773 - loss 0.01451729 - time (sec): 20.38 - samples/sec: 2450.65 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:02:52,051 epoch 6 - iter 385/773 - loss 0.01449527 - time (sec): 25.29 - samples/sec: 2412.22 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:02:57,072 epoch 6 - iter 462/773 - loss 0.01536073 - time (sec): 30.31 - samples/sec: 2411.44 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:03:02,074 epoch 6 - iter 539/773 - loss 0.01493132 - time (sec): 35.32 - samples/sec: 2426.18 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:03:07,077 epoch 6 - iter 616/773 - loss 0.01532872 - time (sec): 40.32 - samples/sec: 2455.13 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:03:12,152 epoch 6 - iter 693/773 - loss 0.01546009 - time (sec): 45.39 - samples/sec: 2455.84 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:03:17,621 epoch 6 - iter 770/773 - loss 0.01587373 - time (sec): 50.86 - samples/sec: 2435.45 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:03:17,830 ----------------------------------------------------------------------------------------------------
2023-10-25 10:03:17,831 EPOCH 6 done: loss 0.0159 - lr: 0.000022
2023-10-25 10:03:20,818 DEV : loss 0.10992585122585297 - f1-score (micro avg) 0.7592
2023-10-25 10:03:20,835 ----------------------------------------------------------------------------------------------------
2023-10-25 10:03:25,522 epoch 7 - iter 77/773 - loss 0.00548947 - time (sec): 4.69 - samples/sec: 2601.12 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:03:30,408 epoch 7 - iter 154/773 - loss 0.01141571 - time (sec): 9.57 - samples/sec: 2606.13 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:03:35,507 epoch 7 - iter 231/773 - loss 0.01079487 - time (sec): 14.67 - samples/sec: 2600.40 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:03:40,827 epoch 7 - iter 308/773 - loss 0.01099606 - time (sec): 19.99 - samples/sec: 2516.09 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:03:46,346 epoch 7 - iter 385/773 - loss 0.01110398 - time (sec): 25.51 - samples/sec: 2448.01 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:03:51,677 epoch 7 - iter 462/773 - loss 0.01103215 - time (sec): 30.84 - samples/sec: 2399.15 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:03:57,134 epoch 7 - iter 539/773 - loss 0.01112854 - time (sec): 36.30 - samples/sec: 2394.48 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:04:02,346 epoch 7 - iter 616/773 - loss 0.01041358 - time (sec): 41.51 - samples/sec: 2395.92 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:04:07,437 epoch 7 - iter 693/773 - loss 0.01037195 - time (sec): 46.60 - samples/sec: 2392.68 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:04:12,559 epoch 7 - iter 770/773 - loss 0.01062967 - time (sec): 51.72 - samples/sec: 2390.07 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:04:12,770 ----------------------------------------------------------------------------------------------------
2023-10-25 10:04:12,771 EPOCH 7 done: loss 0.0106 - lr: 0.000017
2023-10-25 10:04:16,220 DEV : loss 0.12441226094961166 - f1-score (micro avg) 0.768
2023-10-25 10:04:16,242 ----------------------------------------------------------------------------------------------------
2023-10-25 10:04:21,552 epoch 8 - iter 77/773 - loss 0.01141740 - time (sec): 5.31 - samples/sec: 2362.59 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:04:26,934 epoch 8 - iter 154/773 - loss 0.00925878 - time (sec): 10.69 - samples/sec: 2411.77 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:04:32,233 epoch 8 - iter 231/773 - loss 0.00818945 - time (sec): 15.99 - samples/sec: 2361.92 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:04:37,479 epoch 8 - iter 308/773 - loss 0.00843259 - time (sec): 21.24 - samples/sec: 2332.50 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:04:42,853 epoch 8 - iter 385/773 - loss 0.00881106 - time (sec): 26.61 - samples/sec: 2302.79 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:04:48,080 epoch 8 - iter 462/773 - loss 0.00919388 - time (sec): 31.84 - samples/sec: 2297.67 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:04:53,452 epoch 8 - iter 539/773 - loss 0.00913278 - time (sec): 37.21 - samples/sec: 2282.50 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:04:58,881 epoch 8 - iter 616/773 - loss 0.00832774 - time (sec): 42.64 - samples/sec: 2310.18 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:05:04,376 epoch 8 - iter 693/773 - loss 0.00817666 - time (sec): 48.13 - samples/sec: 2314.71 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:05:09,725 epoch 8 - iter 770/773 - loss 0.00807240 - time (sec): 53.48 - samples/sec: 2314.18 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:05:09,948 ----------------------------------------------------------------------------------------------------
2023-10-25 10:05:09,948 EPOCH 8 done: loss 0.0081 - lr: 0.000011
2023-10-25 10:05:12,914 DEV : loss 0.11879534274339676 - f1-score (micro avg) 0.7653
2023-10-25 10:05:12,938 ----------------------------------------------------------------------------------------------------
2023-10-25 10:05:18,125 epoch 9 - iter 77/773 - loss 0.00379106 - time (sec): 5.19 - samples/sec: 2229.98 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:05:23,477 epoch 9 - iter 154/773 - loss 0.00536332 - time (sec): 10.54 - samples/sec: 2262.36 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:05:28,825 epoch 9 - iter 231/773 - loss 0.00487666 - time (sec): 15.89 - samples/sec: 2307.17 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:05:34,164 epoch 9 - iter 308/773 - loss 0.00444790 - time (sec): 21.22 - samples/sec: 2326.98 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:05:39,560 epoch 9 - iter 385/773 - loss 0.00409264 - time (sec): 26.62 - samples/sec: 2350.96 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:05:44,934 epoch 9 - iter 462/773 - loss 0.00392048 - time (sec): 31.99 - samples/sec: 2348.07 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:05:50,225 epoch 9 - iter 539/773 - loss 0.00393411 - time (sec): 37.29 - samples/sec: 2367.10 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:05:54,787 epoch 9 - iter 616/773 - loss 0.00373331 - time (sec): 41.85 - samples/sec: 2399.22 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:05:59,572 epoch 9 - iter 693/773 - loss 0.00434126 - time (sec): 46.63 - samples/sec: 2397.01 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:06:04,607 epoch 9 - iter 770/773 - loss 0.00435473 - time (sec): 51.67 - samples/sec: 2396.05 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:06:04,793 ----------------------------------------------------------------------------------------------------
2023-10-25 10:06:04,794 EPOCH 9 done: loss 0.0043 - lr: 0.000006
2023-10-25 10:06:07,519 DEV : loss 0.1255100518465042 - f1-score (micro avg) 0.79
2023-10-25 10:06:07,536 saving best model
2023-10-25 10:06:08,244 ----------------------------------------------------------------------------------------------------
2023-10-25 10:06:13,192 epoch 10 - iter 77/773 - loss 0.00296306 - time (sec): 4.95 - samples/sec: 2493.27 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:06:17,962 epoch 10 - iter 154/773 - loss 0.00363997 - time (sec): 9.72 - samples/sec: 2415.75 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:06:22,715 epoch 10 - iter 231/773 - loss 0.00251245 - time (sec): 14.47 - samples/sec: 2472.97 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:06:27,822 epoch 10 - iter 308/773 - loss 0.00259746 - time (sec): 19.58 - samples/sec: 2456.70 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:06:33,684 epoch 10 - iter 385/773 - loss 0.00245530 - time (sec): 25.44 - samples/sec: 2398.88 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:06:38,992 epoch 10 - iter 462/773 - loss 0.00267112 - time (sec): 30.75 - samples/sec: 2385.70 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:06:44,386 epoch 10 - iter 539/773 - loss 0.00238484 - time (sec): 36.14 - samples/sec: 2366.86 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:06:49,620 epoch 10 - iter 616/773 - loss 0.00268458 - time (sec): 41.37 - samples/sec: 2374.04 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:06:54,913 epoch 10 - iter 693/773 - loss 0.00240025 - time (sec): 46.67 - samples/sec: 2373.18 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:07:00,374 epoch 10 - iter 770/773 - loss 0.00227698 - time (sec): 52.13 - samples/sec: 2374.69 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:07:00,578 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:00,578 EPOCH 10 done: loss 0.0023 - lr: 0.000000
2023-10-25 10:07:03,614 DEV : loss 0.12568846344947815 - f1-score (micro avg) 0.7918
2023-10-25 10:07:03,635 saving best model
2023-10-25 10:07:04,884 ----------------------------------------------------------------------------------------------------
2023-10-25 10:07:04,886 Loading model from best epoch ...
2023-10-25 10:07:07,226 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
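The 13-tag dictionary above follows the BIOES scheme over the corpus's three entity types: `O` plus `S-`/`B-`/`E-`/`I-` variants of LOC, BUILDING and STREET. As a quick consistency check, the set can be regenerated in a couple of lines:

```python
# Reconstruct the BIOES tag dictionary from the three entity types.
TYPES = ["LOC", "BUILDING", "STREET"]
PREFIXES = ["S", "B", "E", "I"]   # Single, Begin, End, Inside

tags = ["O"] + [f"{p}-{t}" for t in TYPES for p in PREFIXES]
# 1 + 4 * 3 = 13 tags, matching the tagger's output layer
# (Linear(in_features=768, out_features=13)).
```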
2023-10-25 10:07:18,217
Results:
- F-score (micro) 0.7958
- F-score (macro) 0.7034
- Accuracy 0.6827

By class:
              precision    recall  f1-score   support

         LOC     0.8311    0.8478    0.8394       946
    BUILDING     0.6201    0.6000    0.6099       185
      STREET     0.6441    0.6786    0.6609        56

   micro avg     0.7905    0.8012    0.7958      1187
   macro avg     0.6984    0.7088    0.7034      1187
weighted avg     0.7894    0.8012    0.7952      1187

2023-10-25 10:07:18,218 ----------------------------------------------------------------------------------------------------
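As a sanity check on the table above: each f1-score is the harmonic mean of its precision and recall, and the micro average pools all 1187 test entities, so the strong LOC class dominates it. Verified in plain Python:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Micro average from the table: precision 0.7905, recall 0.8012.
micro_f1 = f1(0.7905, 0.8012)   # ~0.7958, matching the reported F-score
loc_f1 = f1(0.8311, 0.8478)     # ~0.8394, matching the LOC row
```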