2023-10-25 09:48:20,325 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,326 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 09:48:20,326 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 Train: 6183 sentences
2023-10-25 09:48:20,327 (train_with_dev=False, train_with_test=False)
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 Training Params:
2023-10-25 09:48:20,327 - learning_rate: "3e-05"
2023-10-25 09:48:20,327 - mini_batch_size: "8"
2023-10-25 09:48:20,327 - max_epochs: "10"
2023-10-25 09:48:20,327 - shuffle: "True"
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 Plugins:
2023-10-25 09:48:20,327 - TensorboardLogger
2023-10-25 09:48:20,327 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 09:48:20,327 - metric: "('micro avg', 'f1-score')"
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 Computation:
2023-10-25 09:48:20,327 - compute on device: cuda:0
2023-10-25 09:48:20,327 - embedding storage: none
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,327 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 09:48:20,327 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,328 ----------------------------------------------------------------------------------------------------
2023-10-25 09:48:20,328 Logging anything other than scalars to TensorBoard is currently not supported.
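The lr column in the iteration lines below tracks the LinearScheduler with warmup_fraction '0.1' from the plugin list above: a roughly linear warmup from zero to the peak learning rate of 3e-05 over the first tenth of the 7,730 total steps (10 epochs x 773 iterations), then linear decay to zero. A minimal sketch of that schedule; the function name and exact arithmetic are illustrative assumptions, not Flair's implementation:

```python
def linear_schedule_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero.

    Illustrative sketch consistent with the lr column in this log;
    not the actual LinearScheduler code.
    """
    warmup_steps = int(total_steps * warmup_fraction)  # 773 of 7730 here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps          # warmup phase
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay
```

Checked against the log: step 77 gives about 3e-06, step 773 (end of epoch 1) the peak 3e-05, and step 1546 (end of epoch 2) about 2.7e-05, matching the logged lr values.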
2023-10-25 09:48:25,998 epoch 1 - iter 77/773 - loss 2.44068512 - time (sec): 5.67 - samples/sec: 2227.15 - lr: 0.000003 - momentum: 0.000000
2023-10-25 09:48:30,949 epoch 1 - iter 154/773 - loss 1.34447366 - time (sec): 10.62 - samples/sec: 2388.61 - lr: 0.000006 - momentum: 0.000000
2023-10-25 09:48:35,983 epoch 1 - iter 231/773 - loss 0.95565315 - time (sec): 15.65 - samples/sec: 2403.03 - lr: 0.000009 - momentum: 0.000000
2023-10-25 09:48:40,857 epoch 1 - iter 308/773 - loss 0.74863025 - time (sec): 20.53 - samples/sec: 2444.69 - lr: 0.000012 - momentum: 0.000000
2023-10-25 09:48:45,768 epoch 1 - iter 385/773 - loss 0.63300379 - time (sec): 25.44 - samples/sec: 2421.02 - lr: 0.000015 - momentum: 0.000000
2023-10-25 09:48:50,689 epoch 1 - iter 462/773 - loss 0.54698749 - time (sec): 30.36 - samples/sec: 2433.96 - lr: 0.000018 - momentum: 0.000000
2023-10-25 09:48:55,538 epoch 1 - iter 539/773 - loss 0.48462883 - time (sec): 35.21 - samples/sec: 2453.96 - lr: 0.000021 - momentum: 0.000000
2023-10-25 09:49:00,693 epoch 1 - iter 616/773 - loss 0.43581574 - time (sec): 40.36 - samples/sec: 2460.96 - lr: 0.000024 - momentum: 0.000000
2023-10-25 09:49:05,753 epoch 1 - iter 693/773 - loss 0.39859972 - time (sec): 45.42 - samples/sec: 2452.02 - lr: 0.000027 - momentum: 0.000000
2023-10-25 09:49:10,816 epoch 1 - iter 770/773 - loss 0.36632636 - time (sec): 50.49 - samples/sec: 2456.53 - lr: 0.000030 - momentum: 0.000000
2023-10-25 09:49:11,013 ----------------------------------------------------------------------------------------------------
2023-10-25 09:49:11,014 EPOCH 1 done: loss 0.3659 - lr: 0.000030
2023-10-25 09:49:13,651 DEV : loss 0.06680409610271454 - f1-score (micro avg)  0.673
2023-10-25 09:49:13,670 saving best model
2023-10-25 09:49:14,233 ----------------------------------------------------------------------------------------------------
2023-10-25 09:49:19,151 epoch 2 - iter 77/773 - loss 0.08307367 - time (sec): 4.92 - samples/sec: 2511.83 - lr: 0.000030 - momentum: 0.000000
2023-10-25 09:49:24,030 epoch 2 - iter 154/773 - loss 0.07562126 - time (sec): 9.79 - samples/sec: 2496.58 - lr: 0.000029 - momentum: 0.000000
2023-10-25 09:49:29,067 epoch 2 - iter 231/773 - loss 0.07838681 - time (sec): 14.83 - samples/sec: 2486.59 - lr: 0.000029 - momentum: 0.000000
2023-10-25 09:49:34,072 epoch 2 - iter 308/773 - loss 0.07706583 - time (sec): 19.84 - samples/sec: 2489.64 - lr: 0.000029 - momentum: 0.000000
2023-10-25 09:49:39,069 epoch 2 - iter 385/773 - loss 0.07642793 - time (sec): 24.83 - samples/sec: 2476.52 - lr: 0.000028 - momentum: 0.000000
2023-10-25 09:49:43,970 epoch 2 - iter 462/773 - loss 0.07419511 - time (sec): 29.73 - samples/sec: 2483.97 - lr: 0.000028 - momentum: 0.000000
2023-10-25 09:49:49,027 epoch 2 - iter 539/773 - loss 0.07195341 - time (sec): 34.79 - samples/sec: 2484.94 - lr: 0.000028 - momentum: 0.000000
2023-10-25 09:49:54,504 epoch 2 - iter 616/773 - loss 0.07050405 - time (sec): 40.27 - samples/sec: 2460.10 - lr: 0.000027 - momentum: 0.000000
2023-10-25 09:49:59,469 epoch 2 - iter 693/773 - loss 0.07203292 - time (sec): 45.23 - samples/sec: 2470.66 - lr: 0.000027 - momentum: 0.000000
2023-10-25 09:50:04,418 epoch 2 - iter 770/773 - loss 0.07229932 - time (sec): 50.18 - samples/sec: 2469.35 - lr: 0.000027 - momentum: 0.000000
2023-10-25 09:50:04,614 ----------------------------------------------------------------------------------------------------
2023-10-25 09:50:04,615 EPOCH 2 done: loss 0.0723 - lr: 0.000027
2023-10-25 09:50:07,405 DEV : loss 0.05944029986858368 - f1-score (micro avg)  0.7636
2023-10-25 09:50:07,428 saving best model
2023-10-25 09:50:08,212 ----------------------------------------------------------------------------------------------------
2023-10-25 09:50:13,281 epoch 3 - iter 77/773 - loss 0.03624577 - time (sec): 5.07 - samples/sec: 2463.78 - lr: 0.000026 - momentum: 0.000000
2023-10-25 09:50:18,315 epoch 3 - iter 154/773 - loss 0.03645789 - time (sec): 10.10 - samples/sec: 2418.55 - lr: 0.000026 - momentum: 0.000000
2023-10-25 09:50:23,400 epoch 3 - iter 231/773 - loss 0.03657292 - time (sec): 15.19 - samples/sec: 2405.52 - lr: 0.000026 - momentum: 0.000000
2023-10-25 09:50:28,540 epoch 3 - iter 308/773 - loss 0.03934264 - time (sec): 20.32 - samples/sec: 2403.18 - lr: 0.000025 - momentum: 0.000000
2023-10-25 09:50:33,684 epoch 3 - iter 385/773 - loss 0.03930106 - time (sec): 25.47 - samples/sec: 2400.76 - lr: 0.000025 - momentum: 0.000000
2023-10-25 09:50:38,552 epoch 3 - iter 462/773 - loss 0.04052705 - time (sec): 30.34 - samples/sec: 2441.13 - lr: 0.000025 - momentum: 0.000000
2023-10-25 09:50:43,656 epoch 3 - iter 539/773 - loss 0.04200801 - time (sec): 35.44 - samples/sec: 2443.28 - lr: 0.000024 - momentum: 0.000000
2023-10-25 09:50:48,660 epoch 3 - iter 616/773 - loss 0.04251387 - time (sec): 40.45 - samples/sec: 2445.86 - lr: 0.000024 - momentum: 0.000000
2023-10-25 09:50:53,646 epoch 3 - iter 693/773 - loss 0.04265944 - time (sec): 45.43 - samples/sec: 2445.82 - lr: 0.000024 - momentum: 0.000000
2023-10-25 09:50:58,686 epoch 3 - iter 770/773 - loss 0.04239354 - time (sec): 50.47 - samples/sec: 2449.55 - lr: 0.000023 - momentum: 0.000000
2023-10-25 09:50:58,909 ----------------------------------------------------------------------------------------------------
2023-10-25 09:50:58,909 EPOCH 3 done: loss 0.0423 - lr: 0.000023
2023-10-25 09:51:01,639 DEV : loss 0.07079580426216125 - f1-score (micro avg)  0.7485
2023-10-25 09:51:01,657 ----------------------------------------------------------------------------------------------------
2023-10-25 09:51:06,731 epoch 4 - iter 77/773 - loss 0.02225122 - time (sec): 5.07 - samples/sec: 2536.78 - lr: 0.000023 - momentum: 0.000000
2023-10-25 09:51:11,642 epoch 4 - iter 154/773 - loss 0.02524008 - time (sec): 9.98 - samples/sec: 2450.31 - lr: 0.000023 - momentum: 0.000000
2023-10-25 09:51:16,877 epoch 4 - iter 231/773 - loss 0.02602377 - time (sec): 15.22 - samples/sec: 2466.82 - lr: 0.000022 - momentum: 0.000000
2023-10-25 09:51:21,955 epoch 4 - iter 308/773 - loss 0.02739077 - time (sec): 20.30 - samples/sec: 2486.84 - lr: 0.000022 - momentum: 0.000000
2023-10-25 09:51:26,855 epoch 4 - iter 385/773 - loss 0.02692404 - time (sec): 25.20 - samples/sec: 2470.56 - lr: 0.000022 - momentum: 0.000000
2023-10-25 09:51:31,800 epoch 4 - iter 462/773 - loss 0.02708752 - time (sec): 30.14 - samples/sec: 2483.30 - lr: 0.000021 - momentum: 0.000000
2023-10-25 09:51:36,791 epoch 4 - iter 539/773 - loss 0.02609069 - time (sec): 35.13 - samples/sec: 2501.52 - lr: 0.000021 - momentum: 0.000000
2023-10-25 09:51:41,808 epoch 4 - iter 616/773 - loss 0.02645417 - time (sec): 40.15 - samples/sec: 2492.57 - lr: 0.000021 - momentum: 0.000000
2023-10-25 09:51:46,771 epoch 4 - iter 693/773 - loss 0.02725811 - time (sec): 45.11 - samples/sec: 2483.94 - lr: 0.000020 - momentum: 0.000000
2023-10-25 09:51:51,638 epoch 4 - iter 770/773 - loss 0.02781422 - time (sec): 49.98 - samples/sec: 2476.71 - lr: 0.000020 - momentum: 0.000000
2023-10-25 09:51:51,928 ----------------------------------------------------------------------------------------------------
2023-10-25 09:51:51,929 EPOCH 4 done: loss 0.0278 - lr: 0.000020
2023-10-25 09:51:54,482 DEV : loss 0.09247004240751266 - f1-score (micro avg)  0.7868
2023-10-25 09:51:54,510 saving best model
2023-10-25 09:51:55,226 ----------------------------------------------------------------------------------------------------
2023-10-25 09:52:00,327 epoch 5 - iter 77/773 - loss 0.02243221 - time (sec): 5.10 - samples/sec: 2233.83 - lr: 0.000020 - momentum: 0.000000
2023-10-25 09:52:05,427 epoch 5 - iter 154/773 - loss 0.02225000 - time (sec): 10.20 - samples/sec: 2414.07 - lr: 0.000019 - momentum: 0.000000
2023-10-25 09:52:10,533 epoch 5 - iter 231/773 - loss 0.01978642 - time (sec): 15.30 - samples/sec: 2444.43 - lr: 0.000019 - momentum: 0.000000
2023-10-25 09:52:15,595 epoch 5 - iter 308/773 - loss 0.02239947 - time (sec): 20.37 - samples/sec: 2447.16 - lr: 0.000019 - momentum: 0.000000
2023-10-25 09:52:20,586 epoch 5 - iter 385/773 - loss 0.02443414 - time (sec): 25.36 - samples/sec: 2446.04 - lr: 0.000018 - momentum: 0.000000
2023-10-25 09:52:25,754 epoch 5 - iter 462/773 - loss 0.02333707 - time (sec): 30.53 - samples/sec: 2444.34 - lr: 0.000018 - momentum: 0.000000
2023-10-25 09:52:30,804 epoch 5 - iter 539/773 - loss 0.02353120 - time (sec): 35.58 - samples/sec: 2438.86 - lr: 0.000018 - momentum: 0.000000
2023-10-25 09:52:35,821 epoch 5 - iter 616/773 - loss 0.02287409 - time (sec): 40.59 - samples/sec: 2442.07 - lr: 0.000017 - momentum: 0.000000
2023-10-25 09:52:40,961 epoch 5 - iter 693/773 - loss 0.02315495 - time (sec): 45.73 - samples/sec: 2439.91 - lr: 0.000017 - momentum: 0.000000
2023-10-25 09:52:46,106 epoch 5 - iter 770/773 - loss 0.02367241 - time (sec): 50.88 - samples/sec: 2435.85 - lr: 0.000017 - momentum: 0.000000
2023-10-25 09:52:46,279 ----------------------------------------------------------------------------------------------------
2023-10-25 09:52:46,279 EPOCH 5 done: loss 0.0236 - lr: 0.000017
2023-10-25 09:52:48,935 DEV : loss 0.08800413459539413 - f1-score (micro avg)  0.7835
2023-10-25 09:52:48,951 ----------------------------------------------------------------------------------------------------
2023-10-25 09:52:53,960 epoch 6 - iter 77/773 - loss 0.00982011 - time (sec): 5.01 - samples/sec: 2512.18 - lr: 0.000016 - momentum: 0.000000
2023-10-25 09:52:59,003 epoch 6 - iter 154/773 - loss 0.01078328 - time (sec): 10.05 - samples/sec: 2483.04 - lr: 0.000016 - momentum: 0.000000
2023-10-25 09:53:04,477 epoch 6 - iter 231/773 - loss 0.01419181 - time (sec): 15.52 - samples/sec: 2409.37 - lr: 0.000016 - momentum: 0.000000
2023-10-25 09:53:09,531 epoch 6 - iter 308/773 - loss 0.01599070 - time (sec): 20.58 - samples/sec: 2426.99 - lr: 0.000015 - momentum: 0.000000
2023-10-25 09:53:14,405 epoch 6 - iter 385/773 - loss 0.01515093 - time (sec): 25.45 - samples/sec: 2397.06 - lr: 0.000015 - momentum: 0.000000
2023-10-25 09:53:19,406 epoch 6 - iter 462/773 - loss 0.01623985 - time (sec): 30.45 - samples/sec: 2400.37 - lr: 0.000015 - momentum: 0.000000
2023-10-25 09:53:24,366 epoch 6 - iter 539/773 - loss 0.01569609 - time (sec): 35.41 - samples/sec: 2419.50 - lr: 0.000014 - momentum: 0.000000
2023-10-25 09:53:29,359 epoch 6 - iter 616/773 - loss 0.01651881 - time (sec): 40.41 - samples/sec: 2449.80 - lr: 0.000014 - momentum: 0.000000
2023-10-25 09:53:34,430 epoch 6 - iter 693/773 - loss 0.01616537 - time (sec): 45.48 - samples/sec: 2451.34 - lr: 0.000014 - momentum: 0.000000
2023-10-25 09:53:39,468 epoch 6 - iter 770/773 - loss 0.01584310 - time (sec): 50.51 - samples/sec: 2452.20 - lr: 0.000013 - momentum: 0.000000
2023-10-25 09:53:39,658 ----------------------------------------------------------------------------------------------------
2023-10-25 09:53:39,658 EPOCH 6 done: loss 0.0158 - lr: 0.000013
2023-10-25 09:53:42,928 DEV : loss 0.11100301146507263 - f1-score (micro avg)  0.7586
2023-10-25 09:53:42,946 ----------------------------------------------------------------------------------------------------
2023-10-25 09:53:48,033 epoch 7 - iter 77/773 - loss 0.01063580 - time (sec): 5.08 - samples/sec: 2397.23 - lr: 0.000013 - momentum: 0.000000
2023-10-25 09:53:53,188 epoch 7 - iter 154/773 - loss 0.01434576 - time (sec): 10.24 - samples/sec: 2436.04 - lr: 0.000013 - momentum: 0.000000
2023-10-25 09:53:58,266 epoch 7 - iter 231/773 - loss 0.01280713 - time (sec): 15.32 - samples/sec: 2490.55 - lr: 0.000012 - momentum: 0.000000
2023-10-25 09:54:03,465 epoch 7 - iter 308/773 - loss 0.01179169 - time (sec): 20.52 - samples/sec: 2451.55 - lr: 0.000012 - momentum: 0.000000
2023-10-25 09:54:08,512 epoch 7 - iter 385/773 - loss 0.01188043 - time (sec): 25.56 - samples/sec: 2442.80 - lr: 0.000012 - momentum: 0.000000
2023-10-25 09:54:13,621 epoch 7 - iter 462/773 - loss 0.01181791 - time (sec): 30.67 - samples/sec: 2412.34 - lr: 0.000011 - momentum: 0.000000
2023-10-25 09:54:18,982 epoch 7 - iter 539/773 - loss 0.01160376 - time (sec): 36.03 - samples/sec: 2412.04 - lr: 0.000011 - momentum: 0.000000
2023-10-25 09:54:24,039 epoch 7 - iter 616/773 - loss 0.01099026 - time (sec): 41.09 - samples/sec: 2420.38 - lr: 0.000011 - momentum: 0.000000
2023-10-25 09:54:29,120 epoch 7 - iter 693/773 - loss 0.01056242 - time (sec): 46.17 - samples/sec: 2414.93 - lr: 0.000010 - momentum: 0.000000
2023-10-25 09:54:34,123 epoch 7 - iter 770/773 - loss 0.01026174 - time (sec): 51.17 - samples/sec: 2415.69 - lr: 0.000010 - momentum: 0.000000
2023-10-25 09:54:34,322 ----------------------------------------------------------------------------------------------------
2023-10-25 09:54:34,322 EPOCH 7 done: loss 0.0102 - lr: 0.000010
2023-10-25 09:54:36,914 DEV : loss 0.1104028970003128 - f1-score (micro avg)  0.7847
2023-10-25 09:54:36,933 ----------------------------------------------------------------------------------------------------
2023-10-25 09:54:42,059 epoch 8 - iter 77/773 - loss 0.00902845 - time (sec): 5.12 - samples/sec: 2447.37 - lr: 0.000010 - momentum: 0.000000
2023-10-25 09:54:47,039 epoch 8 - iter 154/773 - loss 0.00733489 - time (sec): 10.10 - samples/sec: 2551.79 - lr: 0.000009 - momentum: 0.000000
2023-10-25 09:54:52,056 epoch 8 - iter 231/773 - loss 0.00762665 - time (sec): 15.12 - samples/sec: 2497.49 - lr: 0.000009 - momentum: 0.000000
2023-10-25 09:54:57,067 epoch 8 - iter 308/773 - loss 0.00816844 - time (sec): 20.13 - samples/sec: 2460.35 - lr: 0.000009 - momentum: 0.000000
2023-10-25 09:55:02,176 epoch 8 - iter 385/773 - loss 0.00826861 - time (sec): 25.24 - samples/sec: 2427.67 - lr: 0.000008 - momentum: 0.000000
2023-10-25 09:55:07,189 epoch 8 - iter 462/773 - loss 0.00783738 - time (sec): 30.25 - samples/sec: 2417.84 - lr: 0.000008 - momentum: 0.000000
2023-10-25 09:55:12,188 epoch 8 - iter 539/773 - loss 0.00740576 - time (sec): 35.25 - samples/sec: 2409.14 - lr: 0.000008 - momentum: 0.000000
2023-10-25 09:55:17,334 epoch 8 - iter 616/773 - loss 0.00670498 - time (sec): 40.40 - samples/sec: 2438.22 - lr: 0.000007 - momentum: 0.000000
2023-10-25 09:55:22,409 epoch 8 - iter 693/773 - loss 0.00647742 - time (sec): 45.47 - samples/sec: 2449.99 - lr: 0.000007 - momentum: 0.000000
2023-10-25 09:55:27,686 epoch 8 - iter 770/773 - loss 0.00657971 - time (sec): 50.75 - samples/sec: 2438.64 - lr: 0.000007 - momentum: 0.000000
2023-10-25 09:55:27,910 ----------------------------------------------------------------------------------------------------
2023-10-25 09:55:27,910 EPOCH 8 done: loss 0.0066 - lr: 0.000007
2023-10-25 09:55:30,932 DEV : loss 0.126515731215477 - f1-score (micro avg)  0.7705
2023-10-25 09:55:30,952 ----------------------------------------------------------------------------------------------------
2023-10-25 09:55:35,948 epoch 9 - iter 77/773 - loss 0.00352912 - time (sec): 4.99 - samples/sec: 2314.99 - lr: 0.000006 - momentum: 0.000000
2023-10-25 09:55:41,037 epoch 9 - iter 154/773 - loss 0.00411738 - time (sec): 10.08 - samples/sec: 2364.06 - lr: 0.000006 - momentum: 0.000000
2023-10-25 09:55:45,993 epoch 9 - iter 231/773 - loss 0.00342230 - time (sec): 15.04 - samples/sec: 2436.84 - lr: 0.000006 - momentum: 0.000000
2023-10-25 09:55:51,280 epoch 9 - iter 308/773 - loss 0.00417078 - time (sec): 20.33 - samples/sec: 2429.69 - lr: 0.000005 - momentum: 0.000000
2023-10-25 09:55:56,485 epoch 9 - iter 385/773 - loss 0.00410253 - time (sec): 25.53 - samples/sec: 2451.17 - lr: 0.000005 - momentum: 0.000000
2023-10-25 09:56:01,460 epoch 9 - iter 462/773 - loss 0.00411207 - time (sec): 30.51 - samples/sec: 2462.56 - lr: 0.000005 - momentum: 0.000000
2023-10-25 09:56:06,653 epoch 9 - iter 539/773 - loss 0.00389372 - time (sec): 35.70 - samples/sec: 2472.27 - lr: 0.000004 - momentum: 0.000000
2023-10-25 09:56:11,766 epoch 9 - iter 616/773 - loss 0.00353167 - time (sec): 40.81 - samples/sec: 2460.01 - lr: 0.000004 - momentum: 0.000000
2023-10-25 09:56:16,752 epoch 9 - iter 693/773 - loss 0.00362333 - time (sec): 45.80 - samples/sec: 2440.61 - lr: 0.000004 - momentum: 0.000000
2023-10-25 09:56:21,727 epoch 9 - iter 770/773 - loss 0.00379019 - time (sec): 50.77 - samples/sec: 2438.22 - lr: 0.000003 - momentum: 0.000000
2023-10-25 09:56:21,933 ----------------------------------------------------------------------------------------------------
2023-10-25 09:56:21,934 EPOCH 9 done: loss 0.0038 - lr: 0.000003
2023-10-25 09:56:25,021 DEV : loss 0.12340500205755234 - f1-score (micro avg)  0.7823
2023-10-25 09:56:25,042 ----------------------------------------------------------------------------------------------------
2023-10-25 09:56:30,157 epoch 10 - iter 77/773 - loss 0.00453140 - time (sec): 5.11 - samples/sec: 2412.07 - lr: 0.000003 - momentum: 0.000000
2023-10-25 09:56:35,026 epoch 10 - iter 154/773 - loss 0.00572525 - time (sec): 9.98 - samples/sec: 2351.54 - lr: 0.000003 - momentum: 0.000000
2023-10-25 09:56:39,971 epoch 10 - iter 231/773 - loss 0.00379453 - time (sec): 14.93 - samples/sec: 2397.12 - lr: 0.000002 - momentum: 0.000000
2023-10-25 09:56:45,045 epoch 10 - iter 308/773 - loss 0.00410640 - time (sec): 20.00 - samples/sec: 2404.59 - lr: 0.000002 - momentum: 0.000000
2023-10-25 09:56:50,059 epoch 10 - iter 385/773 - loss 0.00356080 - time (sec): 25.02 - samples/sec: 2439.49 - lr: 0.000002 - momentum: 0.000000
2023-10-25 09:56:55,022 epoch 10 - iter 462/773 - loss 0.00339071 - time (sec): 29.98 - samples/sec: 2446.81 - lr: 0.000001 - momentum: 0.000000
2023-10-25 09:57:00,096 epoch 10 - iter 539/773 - loss 0.00337113 - time (sec): 35.05 - samples/sec: 2440.30 - lr: 0.000001 - momentum: 0.000000
2023-10-25 09:57:05,154 epoch 10 - iter 616/773 - loss 0.00330523 - time (sec): 40.11 - samples/sec: 2448.87 - lr: 0.000001 - momentum: 0.000000
2023-10-25 09:57:10,122 epoch 10 - iter 693/773 - loss 0.00301120 - time (sec): 45.08 - samples/sec: 2456.87 - lr: 0.000000 - momentum: 0.000000
2023-10-25 09:57:15,200 epoch 10 - iter 770/773 - loss 0.00310954 - time (sec): 50.16 - samples/sec: 2468.08 - lr: 0.000000 - momentum: 0.000000
2023-10-25 09:57:15,384 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:15,384 EPOCH 10 done: loss 0.0031 - lr: 0.000000
2023-10-25 09:57:17,906 DEV : loss 0.12285467237234116 - f1-score (micro avg)  0.7886
2023-10-25 09:57:17,925 saving best model
2023-10-25 09:57:19,238 ----------------------------------------------------------------------------------------------------
2023-10-25 09:57:19,240 Loading model from best epoch ...
2023-10-25 09:57:21,486 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
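The 13-tag dictionary above is the BIOES encoding of the three span types LOC, BUILDING and STREET: single-token spans get an S- tag, multi-token spans get B- at the start, E- at the end, and I- in between, and all other tokens are O. A small illustrative encoder (a hypothetical helper, not part of Flair):

```python
def to_bioes(tokens, spans):
    """Encode labelled (start, end, label) spans as BIOES tags.

    `end` is exclusive. Illustrative sketch of the 13-tag scheme above;
    not Flair's own span-encoding code.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"S-{label}"          # single-token span
        else:
            tags[start] = f"B-{label}"          # span start
            tags[end - 1] = f"E-{label}"        # span end
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"          # span interior
    return tags
```

For example, `to_bioes(["Trafalgar", "Square", "in", "London"], [(0, 2, "LOC"), (3, 4, "LOC")])` yields `["B-LOC", "E-LOC", "O", "S-LOC"]`.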
2023-10-25 09:57:30,130
Results:
- F-score (micro) 0.8131
- F-score (macro) 0.7236
- Accuracy 0.7047

By class:
              precision    recall  f1-score   support

         LOC     0.8469    0.8710    0.8588       946
    BUILDING     0.6073    0.6270    0.6170       185
      STREET     0.6613    0.7321    0.6949        56

   micro avg     0.8002    0.8265    0.8131      1187
   macro avg     0.7052    0.7434    0.7236      1187
weighted avg     0.8008    0.8265    0.8134      1187
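The averages in the table are consistent with one another: micro-avg F1 is the harmonic mean of the aggregate precision and recall, macro-avg F1 is the unweighted mean of the per-class F1 scores, and weighted-avg F1 weights each class F1 by its support. A quick arithmetic check against the numbers above:

```python
def f1(p, r):
    # Harmonic mean of precision and recall
    return 2 * p * r / (p + r)

# precision, recall, support per class, from the table above
per_class = {
    "LOC":      (0.8469, 0.8710, 946),
    "BUILDING": (0.6073, 0.6270, 185),
    "STREET":   (0.6613, 0.7321, 56),
}

# micro avg: harmonic mean of the aggregate precision/recall row
micro_f1 = f1(0.8002, 0.8265)                                            # ~0.8131
# macro avg: unweighted mean of per-class F1
macro_f1 = sum(f1(p, r) for p, r, _ in per_class.values()) / len(per_class)  # ~0.7236
# weighted avg: per-class F1 weighted by support (1187 total)
weighted_f1 = sum(f1(p, r) * s for p, r, s in per_class.values()) / 1187     # ~0.8134
```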
2023-10-25 09:57:30,130 ----------------------------------------------------------------------------------------------------