2023-10-17 14:55:09,495 ----------------------------------------------------------------------------------------------------
2023-10-17 14:55:09,496 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 14:55:09,496 ----------------------------------------------------------------------------------------------------
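For orientation: the printout above is a 12-layer, 768-dimensional ELECTRA-style encoder feeding a single Linear(768 -> 13) classification head trained with cross-entropy loss, i.e. no CRF and no recurrent layers on top. A minimal sketch of how such a tagger is typically assembled in Flair follows; the backbone name and the pooling/context/CRF flags are inferred from the model base path logged further down and are assumptions, not values read from this log. The `label_dict` variable is the 13-tag BIOES dictionary built from the corpus (see the corpus-loading sketch after the corpus section below).

# Sketch only; not the repository's actual training script.
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# Backbone assumed from the base path logged below:
# "hmteams/teams-base-historic-multilingual-discriminator", last layer only ("layers-1"),
# first-subtoken pooling ("poolingfirst"), no document context ("wsFalse").
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
    use_context=False,
)

# `label_dict` comes from the corpus (see the loading sketch after the corpus section below).
tagger = SequenceTagger(
    hidden_size=256,            # unused without an RNN on top
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,              # "crfFalse": plain Linear(768 -> 13) + CrossEntropyLoss head
    use_rnn=False,
    reproject_embeddings=False,
)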
2023-10-17 14:55:09,496 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 14:55:09,496 ----------------------------------------------------------------------------------------------------
2023-10-17 14:55:09,497 Train: 7936 sentences
2023-10-17 14:55:09,497 (train_with_dev=False, train_with_test=False)
2023-10-17 14:55:09,497 ----------------------------------------------------------------------------------------------------
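To load the same data with Flair, the corpus logged above can presumably be fetched via the built-in dataset loader; a sketch, where the loader name and its `language` argument are assumed from the corpus name and cache path printed above:

from flair.datasets import NER_ICDAR_EUROPEANA  # loader name assumed from the corpus name above

# Downloads/caches under ~/.flair/datasets/ner_icdar_europeana/fr on first use.
corpus = NER_ICDAR_EUROPEANA(language="fr")
print(corpus)  # expected: 7936 train + 992 dev + 992 test sentences

# The 13-tag BIOES dictionary used by the tagger (`label_dict` in the sketch above):
label_dict = corpus.make_label_dictionary(label_type="ner")
print(label_dict)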
2023-10-17 14:55:09,497 Training Params:
2023-10-17 14:55:09,497 - learning_rate: "3e-05"
2023-10-17 14:55:09,497 - mini_batch_size: "8"
2023-10-17 14:55:09,497 - max_epochs: "10"
2023-10-17 14:55:09,497 - shuffle: "True"
2023-10-17 14:55:09,497 ----------------------------------------------------------------------------------------------------
2023-10-17 14:55:09,497 Plugins:
2023-10-17 14:55:09,497 - TensorboardLogger
2023-10-17 14:55:09,497 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 14:55:09,497 ----------------------------------------------------------------------------------------------------
2023-10-17 14:55:09,497 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 14:55:09,497 - metric: "('micro avg', 'f1-score')"
2023-10-17 14:55:09,497 ----------------------------------------------------------------------------------------------------
2023-10-17 14:55:09,497 Computation:
2023-10-17 14:55:09,497 - compute on device: cuda:0
2023-10-17 14:55:09,497 - embedding storage: none
2023-10-17 14:55:09,497 ----------------------------------------------------------------------------------------------------
2023-10-17 14:55:09,497 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 14:55:09,497 ----------------------------------------------------------------------------------------------------
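Putting the logged hyperparameters together, the fine-tuning call was presumably along these lines (a sketch that continues the two sketches above; Flair's `fine_tune` applies a linear learning-rate schedule with warmup by default, which appears to correspond to the LinearScheduler plugin with warmup_fraction 0.1 logged above, and the TensorBoard plugin wiring is not reproduced here):

from flair.trainers import ModelTrainer

# `tagger` and `corpus` as constructed in the sketches above.
trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=3e-05,   # matches "- learning_rate: 3e-05"
    mini_batch_size=8,     # matches "- mini_batch_size: 8"
    max_epochs=10,         # matches "- max_epochs: 10"
)

This run additionally saves the best dev-epoch checkpoint (best-model.pt) and evaluates it at the end, as logged above; the exact flags controlling that are not shown in the log and are therefore omitted from the sketch.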
2023-10-17 14:55:09,497 ----------------------------------------------------------------------------------------------------
2023-10-17 14:55:09,497 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 14:55:15,336 epoch 1 - iter 99/992 - loss 2.77233788 - time (sec): 5.84 - samples/sec: 2762.70 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:55:21,438 epoch 1 - iter 198/992 - loss 1.67037158 - time (sec): 11.94 - samples/sec: 2810.79 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:55:27,193 epoch 1 - iter 297/992 - loss 1.22191274 - time (sec): 17.69 - samples/sec: 2784.25 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:55:32,952 epoch 1 - iter 396/992 - loss 0.97575122 - time (sec): 23.45 - samples/sec: 2781.30 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:55:38,693 epoch 1 - iter 495/992 - loss 0.83006039 - time (sec): 29.19 - samples/sec: 2760.78 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:55:44,467 epoch 1 - iter 594/992 - loss 0.71867904 - time (sec): 34.97 - samples/sec: 2772.54 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:55:50,622 epoch 1 - iter 693/992 - loss 0.63373292 - time (sec): 41.12 - samples/sec: 2769.32 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:55:56,539 epoch 1 - iter 792/992 - loss 0.57062966 - time (sec): 47.04 - samples/sec: 2774.49 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:56:02,349 epoch 1 - iter 891/992 - loss 0.52216394 - time (sec): 52.85 - samples/sec: 2781.94 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:56:08,420 epoch 1 - iter 990/992 - loss 0.48106559 - time (sec): 58.92 - samples/sec: 2777.41 - lr: 0.000030 - momentum: 0.000000
2023-10-17 14:56:08,530 ----------------------------------------------------------------------------------------------------
2023-10-17 14:56:08,530 EPOCH 1 done: loss 0.4802 - lr: 0.000030
2023-10-17 14:56:11,906 DEV : loss 0.09239894896745682 - f1-score (micro avg) 0.69
2023-10-17 14:56:11,928 saving best model
2023-10-17 14:56:12,272 ----------------------------------------------------------------------------------------------------
2023-10-17 14:56:18,363 epoch 2 - iter 99/992 - loss 0.11493867 - time (sec): 6.09 - samples/sec: 2798.64 - lr: 0.000030 - momentum: 0.000000
2023-10-17 14:56:24,274 epoch 2 - iter 198/992 - loss 0.10800470 - time (sec): 12.00 - samples/sec: 2785.06 - lr: 0.000029 - momentum: 0.000000
2023-10-17 14:56:29,911 epoch 2 - iter 297/992 - loss 0.10734803 - time (sec): 17.64 - samples/sec: 2820.64 - lr: 0.000029 - momentum: 0.000000
2023-10-17 14:56:35,592 epoch 2 - iter 396/992 - loss 0.10779127 - time (sec): 23.32 - samples/sec: 2799.07 - lr: 0.000029 - momentum: 0.000000
2023-10-17 14:56:41,915 epoch 2 - iter 495/992 - loss 0.10673539 - time (sec): 29.64 - samples/sec: 2769.34 - lr: 0.000028 - momentum: 0.000000
2023-10-17 14:56:47,905 epoch 2 - iter 594/992 - loss 0.10552213 - time (sec): 35.63 - samples/sec: 2749.29 - lr: 0.000028 - momentum: 0.000000
2023-10-17 14:56:53,856 epoch 2 - iter 693/992 - loss 0.10707402 - time (sec): 41.58 - samples/sec: 2740.93 - lr: 0.000028 - momentum: 0.000000
2023-10-17 14:56:59,581 epoch 2 - iter 792/992 - loss 0.10586383 - time (sec): 47.31 - samples/sec: 2751.71 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:57:05,744 epoch 2 - iter 891/992 - loss 0.10483571 - time (sec): 53.47 - samples/sec: 2753.62 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:57:11,629 epoch 2 - iter 990/992 - loss 0.10522458 - time (sec): 59.36 - samples/sec: 2757.89 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:57:11,744 ----------------------------------------------------------------------------------------------------
2023-10-17 14:57:11,744 EPOCH 2 done: loss 0.1052 - lr: 0.000027
2023-10-17 14:57:15,729 DEV : loss 0.0876772329211235 - f1-score (micro avg) 0.7535
2023-10-17 14:57:15,756 saving best model
2023-10-17 14:57:16,213 ----------------------------------------------------------------------------------------------------
2023-10-17 14:57:21,845 epoch 3 - iter 99/992 - loss 0.08022628 - time (sec): 5.63 - samples/sec: 2701.59 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:57:27,757 epoch 3 - iter 198/992 - loss 0.08031838 - time (sec): 11.54 - samples/sec: 2729.77 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:57:33,831 epoch 3 - iter 297/992 - loss 0.07836688 - time (sec): 17.61 - samples/sec: 2746.30 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:57:39,836 epoch 3 - iter 396/992 - loss 0.07442244 - time (sec): 23.62 - samples/sec: 2759.52 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:57:45,675 epoch 3 - iter 495/992 - loss 0.07411089 - time (sec): 29.46 - samples/sec: 2768.82 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:57:51,690 epoch 3 - iter 594/992 - loss 0.07456331 - time (sec): 35.47 - samples/sec: 2780.49 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:57:57,446 epoch 3 - iter 693/992 - loss 0.07434798 - time (sec): 41.23 - samples/sec: 2774.61 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:58:03,168 epoch 3 - iter 792/992 - loss 0.07472551 - time (sec): 46.95 - samples/sec: 2779.78 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:58:09,356 epoch 3 - iter 891/992 - loss 0.07318574 - time (sec): 53.14 - samples/sec: 2776.76 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:58:15,110 epoch 3 - iter 990/992 - loss 0.07399399 - time (sec): 58.89 - samples/sec: 2778.86 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:58:15,219 ----------------------------------------------------------------------------------------------------
2023-10-17 14:58:15,219 EPOCH 3 done: loss 0.0739 - lr: 0.000023
2023-10-17 14:58:18,781 DEV : loss 0.08807346224784851 - f1-score (micro avg) 0.755
2023-10-17 14:58:18,804 saving best model
2023-10-17 14:58:19,266 ----------------------------------------------------------------------------------------------------
2023-10-17 14:58:25,033 epoch 4 - iter 99/992 - loss 0.04956166 - time (sec): 5.76 - samples/sec: 2737.94 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:58:31,264 epoch 4 - iter 198/992 - loss 0.05434284 - time (sec): 11.99 - samples/sec: 2716.72 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:58:36,836 epoch 4 - iter 297/992 - loss 0.05733946 - time (sec): 17.56 - samples/sec: 2758.90 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:58:42,898 epoch 4 - iter 396/992 - loss 0.05426292 - time (sec): 23.63 - samples/sec: 2756.56 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:58:48,800 epoch 4 - iter 495/992 - loss 0.05485342 - time (sec): 29.53 - samples/sec: 2765.68 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:58:54,693 epoch 4 - iter 594/992 - loss 0.05628478 - time (sec): 35.42 - samples/sec: 2760.51 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:59:00,816 epoch 4 - iter 693/992 - loss 0.05626172 - time (sec): 41.54 - samples/sec: 2750.81 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:59:06,690 epoch 4 - iter 792/992 - loss 0.05709857 - time (sec): 47.42 - samples/sec: 2751.91 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:59:12,827 epoch 4 - iter 891/992 - loss 0.05685953 - time (sec): 53.56 - samples/sec: 2750.63 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:59:18,691 epoch 4 - iter 990/992 - loss 0.05615262 - time (sec): 59.42 - samples/sec: 2754.59 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:59:18,816 ----------------------------------------------------------------------------------------------------
2023-10-17 14:59:18,816 EPOCH 4 done: loss 0.0563 - lr: 0.000020
2023-10-17 14:59:22,248 DEV : loss 0.12442335486412048 - f1-score (micro avg) 0.754
2023-10-17 14:59:22,269 ----------------------------------------------------------------------------------------------------
2023-10-17 14:59:27,842 epoch 5 - iter 99/992 - loss 0.04149832 - time (sec): 5.57 - samples/sec: 2872.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:59:34,016 epoch 5 - iter 198/992 - loss 0.04152572 - time (sec): 11.75 - samples/sec: 2793.52 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:59:40,734 epoch 5 - iter 297/992 - loss 0.04103463 - time (sec): 18.46 - samples/sec: 2714.78 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:59:46,580 epoch 5 - iter 396/992 - loss 0.04204274 - time (sec): 24.31 - samples/sec: 2727.02 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:59:52,832 epoch 5 - iter 495/992 - loss 0.04310676 - time (sec): 30.56 - samples/sec: 2737.91 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:59:58,758 epoch 5 - iter 594/992 - loss 0.04258103 - time (sec): 36.49 - samples/sec: 2726.69 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:00:05,057 epoch 5 - iter 693/992 - loss 0.04329139 - time (sec): 42.79 - samples/sec: 2719.01 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:00:11,054 epoch 5 - iter 792/992 - loss 0.04291426 - time (sec): 48.78 - samples/sec: 2704.32 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:00:16,943 epoch 5 - iter 891/992 - loss 0.04170637 - time (sec): 54.67 - samples/sec: 2700.67 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:00:23,012 epoch 5 - iter 990/992 - loss 0.04090239 - time (sec): 60.74 - samples/sec: 2694.20 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:00:23,123 ----------------------------------------------------------------------------------------------------
2023-10-17 15:00:23,123 EPOCH 5 done: loss 0.0408 - lr: 0.000017
2023-10-17 15:00:26,694 DEV : loss 0.1553419977426529 - f1-score (micro avg) 0.757
2023-10-17 15:00:26,717 saving best model
2023-10-17 15:00:27,166 ----------------------------------------------------------------------------------------------------
2023-10-17 15:00:33,331 epoch 6 - iter 99/992 - loss 0.02513130 - time (sec): 6.16 - samples/sec: 2689.96 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:00:39,188 epoch 6 - iter 198/992 - loss 0.02780468 - time (sec): 12.01 - samples/sec: 2785.50 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:00:44,948 epoch 6 - iter 297/992 - loss 0.02839834 - time (sec): 17.77 - samples/sec: 2822.68 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:00:50,855 epoch 6 - iter 396/992 - loss 0.02785029 - time (sec): 23.68 - samples/sec: 2822.21 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:00:56,852 epoch 6 - iter 495/992 - loss 0.03132603 - time (sec): 29.68 - samples/sec: 2802.51 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:01:03,041 epoch 6 - iter 594/992 - loss 0.03290732 - time (sec): 35.87 - samples/sec: 2800.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:01:08,750 epoch 6 - iter 693/992 - loss 0.03275845 - time (sec): 41.58 - samples/sec: 2788.17 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:01:14,597 epoch 6 - iter 792/992 - loss 0.03349634 - time (sec): 47.42 - samples/sec: 2773.28 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:01:20,458 epoch 6 - iter 891/992 - loss 0.03345707 - time (sec): 53.28 - samples/sec: 2773.30 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:01:26,307 epoch 6 - iter 990/992 - loss 0.03373942 - time (sec): 59.13 - samples/sec: 2767.63 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:01:26,419 ----------------------------------------------------------------------------------------------------
2023-10-17 15:01:26,419 EPOCH 6 done: loss 0.0338 - lr: 0.000013
2023-10-17 15:01:30,490 DEV : loss 0.16105015575885773 - f1-score (micro avg) 0.7726
2023-10-17 15:01:30,511 saving best model
2023-10-17 15:01:30,955 ----------------------------------------------------------------------------------------------------
2023-10-17 15:01:36,828 epoch 7 - iter 99/992 - loss 0.02381798 - time (sec): 5.87 - samples/sec: 2705.66 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:01:42,830 epoch 7 - iter 198/992 - loss 0.02353458 - time (sec): 11.87 - samples/sec: 2738.73 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:01:48,749 epoch 7 - iter 297/992 - loss 0.02504263 - time (sec): 17.79 - samples/sec: 2747.14 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:01:55,101 epoch 7 - iter 396/992 - loss 0.02614405 - time (sec): 24.14 - samples/sec: 2749.97 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:02:01,604 epoch 7 - iter 495/992 - loss 0.02434582 - time (sec): 30.65 - samples/sec: 2751.51 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:02:07,430 epoch 7 - iter 594/992 - loss 0.02437965 - time (sec): 36.47 - samples/sec: 2754.37 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:02:13,086 epoch 7 - iter 693/992 - loss 0.02378273 - time (sec): 42.13 - samples/sec: 2757.97 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:02:19,137 epoch 7 - iter 792/992 - loss 0.02441024 - time (sec): 48.18 - samples/sec: 2743.72 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:02:25,086 epoch 7 - iter 891/992 - loss 0.02525223 - time (sec): 54.13 - samples/sec: 2737.86 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:02:30,788 epoch 7 - iter 990/992 - loss 0.02504337 - time (sec): 59.83 - samples/sec: 2734.40 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:02:30,903 ----------------------------------------------------------------------------------------------------
2023-10-17 15:02:30,903 EPOCH 7 done: loss 0.0252 - lr: 0.000010
2023-10-17 15:02:34,466 DEV : loss 0.19216737151145935 - f1-score (micro avg) 0.7635
2023-10-17 15:02:34,487 ----------------------------------------------------------------------------------------------------
2023-10-17 15:02:40,105 epoch 8 - iter 99/992 - loss 0.01270573 - time (sec): 5.62 - samples/sec: 2896.52 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:02:45,885 epoch 8 - iter 198/992 - loss 0.01300986 - time (sec): 11.40 - samples/sec: 2834.30 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:02:51,778 epoch 8 - iter 297/992 - loss 0.01459488 - time (sec): 17.29 - samples/sec: 2821.29 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:02:57,542 epoch 8 - iter 396/992 - loss 0.01561924 - time (sec): 23.05 - samples/sec: 2832.29 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:03:03,553 epoch 8 - iter 495/992 - loss 0.01691767 - time (sec): 29.06 - samples/sec: 2801.37 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:03:09,405 epoch 8 - iter 594/992 - loss 0.01739702 - time (sec): 34.92 - samples/sec: 2782.27 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:03:15,390 epoch 8 - iter 693/992 - loss 0.01752797 - time (sec): 40.90 - samples/sec: 2778.72 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:03:21,256 epoch 8 - iter 792/992 - loss 0.01832980 - time (sec): 46.77 - samples/sec: 2788.63 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:03:26,957 epoch 8 - iter 891/992 - loss 0.01848759 - time (sec): 52.47 - samples/sec: 2787.76 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:03:33,284 epoch 8 - iter 990/992 - loss 0.01807558 - time (sec): 58.80 - samples/sec: 2782.92 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:03:33,407 ----------------------------------------------------------------------------------------------------
2023-10-17 15:03:33,407 EPOCH 8 done: loss 0.0181 - lr: 0.000007
2023-10-17 15:03:36,953 DEV : loss 0.20749090611934662 - f1-score (micro avg) 0.7613
2023-10-17 15:03:36,977 ----------------------------------------------------------------------------------------------------
2023-10-17 15:03:42,840 epoch 9 - iter 99/992 - loss 0.01140057 - time (sec): 5.86 - samples/sec: 2871.72 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:03:48,743 epoch 9 - iter 198/992 - loss 0.01229614 - time (sec): 11.76 - samples/sec: 2870.30 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:03:54,675 epoch 9 - iter 297/992 - loss 0.01290086 - time (sec): 17.70 - samples/sec: 2835.09 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:04:00,390 epoch 9 - iter 396/992 - loss 0.01384047 - time (sec): 23.41 - samples/sec: 2833.12 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:04:06,550 epoch 9 - iter 495/992 - loss 0.01440933 - time (sec): 29.57 - samples/sec: 2816.04 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:04:12,304 epoch 9 - iter 594/992 - loss 0.01487478 - time (sec): 35.32 - samples/sec: 2805.99 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:04:18,764 epoch 9 - iter 693/992 - loss 0.01494095 - time (sec): 41.79 - samples/sec: 2766.97 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:04:25,004 epoch 9 - iter 792/992 - loss 0.01466637 - time (sec): 48.02 - samples/sec: 2744.42 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:04:31,488 epoch 9 - iter 891/992 - loss 0.01464725 - time (sec): 54.51 - samples/sec: 2711.33 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:04:37,644 epoch 9 - iter 990/992 - loss 0.01489035 - time (sec): 60.67 - samples/sec: 2698.37 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:04:37,757 ----------------------------------------------------------------------------------------------------
2023-10-17 15:04:37,757 EPOCH 9 done: loss 0.0149 - lr: 0.000003
2023-10-17 15:04:41,294 DEV : loss 0.21198396384716034 - f1-score (micro avg) 0.7669
2023-10-17 15:04:41,316 ----------------------------------------------------------------------------------------------------
2023-10-17 15:04:47,397 epoch 10 - iter 99/992 - loss 0.01491946 - time (sec): 6.08 - samples/sec: 2782.08 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:04:53,310 epoch 10 - iter 198/992 - loss 0.01291025 - time (sec): 11.99 - samples/sec: 2782.64 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:04:59,419 epoch 10 - iter 297/992 - loss 0.01185798 - time (sec): 18.10 - samples/sec: 2771.44 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:05:05,218 epoch 10 - iter 396/992 - loss 0.01205737 - time (sec): 23.90 - samples/sec: 2742.16 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:05:11,758 epoch 10 - iter 495/992 - loss 0.01209685 - time (sec): 30.44 - samples/sec: 2711.64 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:05:17,984 epoch 10 - iter 594/992 - loss 0.01139644 - time (sec): 36.67 - samples/sec: 2680.71 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:05:23,872 epoch 10 - iter 693/992 - loss 0.01165819 - time (sec): 42.55 - samples/sec: 2692.82 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:05:30,014 epoch 10 - iter 792/992 - loss 0.01139237 - time (sec): 48.70 - samples/sec: 2679.16 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:05:35,725 epoch 10 - iter 891/992 - loss 0.01100199 - time (sec): 54.41 - samples/sec: 2683.81 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:05:41,981 epoch 10 - iter 990/992 - loss 0.01054514 - time (sec): 60.66 - samples/sec: 2699.31 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:05:42,082 ----------------------------------------------------------------------------------------------------
2023-10-17 15:05:42,083 EPOCH 10 done: loss 0.0105 - lr: 0.000000
2023-10-17 15:05:46,454 DEV : loss 0.21605056524276733 - f1-score (micro avg) 0.7578
2023-10-17 15:05:46,824 ----------------------------------------------------------------------------------------------------
2023-10-17 15:05:46,825 Loading model from best epoch ...
2023-10-17 15:05:48,292 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
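To reuse the selected checkpoint, the usual Flair pattern is to load best-model.pt and call predict; a sketch, where the path is assumed to be relative to the base path logged above and the example sentence is made up:

from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("Victor Hugo est né à Besançon .")  # hypothetical example sentence
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value)  # PER / LOC / ORG spans in the tag scheme listed above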
2023-10-17 15:05:51,908
Results:
- F-score (micro) 0.7745
- F-score (macro) 0.6864
- Accuracy 0.6529
By class:
              precision    recall  f1-score   support

         LOC     0.8356    0.8534    0.8444       655
         PER     0.7328    0.7623    0.7473       223
         ORG     0.4552    0.4803    0.4674       127

   micro avg     0.7633    0.7861    0.7745      1005
   macro avg     0.6745    0.6987    0.6864      1005
weighted avg     0.7647    0.7861    0.7752      1005
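As a quick consistency check, the micro-average F1 above is the harmonic mean of the logged micro-average precision and recall:

# Harmonic mean of the logged micro-average precision and recall.
precision, recall = 0.7633, 0.7861
f1 = 2 * precision * recall / (precision + recall)
print(f"{f1:.4f}")  # ~0.7745, matching "F-score (micro) 0.7745" above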
2023-10-17 15:05:51,908 ----------------------------------------------------------------------------------------------------