2023-10-15 21:53:18,052 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,053 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-15 21:53:18,053 ----------------------------------------------------------------------------------------------------
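The layer shapes printed above are enough to tally the model's parameter count by hand. The following is a sketch derived only from the dimensions in the log (dropout layers hold no weights; helper names are mine, not Flair's), landing at the familiar ~110M parameters of BERT-base plus the 17-tag linear head:

```python
# Rough parameter tally for the BertModel + tag head printed above
# (derived from the layer shapes in the log; bias terms included).

def linear(n_in, n_out):      # weight matrix + bias vector
    return n_in * n_out + n_out

def layer_norm(dim):          # scale + shift
    return 2 * dim

# word / position / token-type embeddings + LayerNorm
emb = 32001 * 768 + 512 * 768 + 2 * 768 + layer_norm(768)

per_layer = (
    3 * linear(768, 768)      # query, key, value
    + linear(768, 768)        # attention output dense
    + layer_norm(768)
    + linear(768, 3072)       # intermediate
    + linear(3072, 768)       # output
    + layer_norm(768)
)

total = emb + 12 * per_layer + linear(768, 768) + linear(768, 17)  # + pooler + tag head
print(f"{total:,}")  # prints 110,631,185
```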
2023-10-15 21:53:18,053 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 21:53:18,053 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,053 Train: 20847 sentences
2023-10-15 21:53:18,053 (train_with_dev=False, train_with_test=False)
2023-10-15 21:53:18,053 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,054 Training Params:
2023-10-15 21:53:18,054 - learning_rate: "3e-05"
2023-10-15 21:53:18,054 - mini_batch_size: "8"
2023-10-15 21:53:18,054 - max_epochs: "10"
2023-10-15 21:53:18,054 - shuffle: "True"
2023-10-15 21:53:18,054 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,054 Plugins:
2023-10-15 21:53:18,054 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 21:53:18,054 ----------------------------------------------------------------------------------------------------
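The LinearScheduler plugin with warmup_fraction 0.1 ramps the learning rate from 0 to the peak of 3e-05 over the first 10% of the 26,060 total steps (10 epochs x 2,606 iterations), then decays it linearly back to zero. A minimal sketch of that schedule in its usual form (the function name is illustrative, not Flair's API), which reproduces the lr column in the epoch logs below:

```python
# Linear warmup + linear decay, matching the lr column in this log.
PEAK_LR = 3e-5
TOTAL_STEPS = 10 * 2606                 # max_epochs * iterations per epoch
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)   # warmup_fraction: 0.1

def lr_at(step):
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

print(round(lr_at(260), 6))    # epoch 1, iter 260 -> 3e-06, as logged
print(round(lr_at(26060), 6))  # end of training   -> 0.0
```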
2023-10-15 21:53:18,054 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 21:53:18,054 - metric: "('micro avg', 'f1-score')"
2023-10-15 21:53:18,054 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,054 Computation:
2023-10-15 21:53:18,054 - compute on device: cuda:0
2023-10-15 21:53:18,054 - embedding storage: none
2023-10-15 21:53:18,054 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,054 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-15 21:53:18,054 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,054 ----------------------------------------------------------------------------------------------------
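Each progress line below follows a fixed format, so the numeric fields can be extracted with a small regex. A sketch written against the exact format seen in this log (not an official Flair parser):

```python
import re

# One progress line copied verbatim from this log.
LINE = ("2023-10-15 21:53:36,396 epoch 1 - iter 260/2606 - loss 1.86299955 "
        "- time (sec): 18.34 - samples/sec: 1940.73 - lr: 0.000003 "
        "- momentum: 0.000000")

PAT = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+) "
    r"- loss (?P<loss>[\d.]+) .* lr: (?P<lr>[\d.]+)"
)

m = PAT.search(LINE)
print(int(m["epoch"]), int(m["iter"]), float(m["loss"]), float(m["lr"]))
# 1 260 1.86299955 3e-06
```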
2023-10-15 21:53:36,396 epoch 1 - iter 260/2606 - loss 1.86299955 - time (sec): 18.34 - samples/sec: 1940.73 - lr: 0.000003 - momentum: 0.000000
2023-10-15 21:53:56,212 epoch 1 - iter 520/2606 - loss 1.13662102 - time (sec): 38.16 - samples/sec: 1910.14 - lr: 0.000006 - momentum: 0.000000
2023-10-15 21:54:15,437 epoch 1 - iter 780/2606 - loss 0.86190244 - time (sec): 57.38 - samples/sec: 1889.39 - lr: 0.000009 - momentum: 0.000000
2023-10-15 21:54:34,769 epoch 1 - iter 1040/2606 - loss 0.71609132 - time (sec): 76.71 - samples/sec: 1876.81 - lr: 0.000012 - momentum: 0.000000
2023-10-15 21:54:55,490 epoch 1 - iter 1300/2606 - loss 0.61712413 - time (sec): 97.43 - samples/sec: 1853.07 - lr: 0.000015 - momentum: 0.000000
2023-10-15 21:55:14,612 epoch 1 - iter 1560/2606 - loss 0.54551314 - time (sec): 116.56 - samples/sec: 1870.50 - lr: 0.000018 - momentum: 0.000000
2023-10-15 21:55:32,894 epoch 1 - iter 1820/2606 - loss 0.49840634 - time (sec): 134.84 - samples/sec: 1891.60 - lr: 0.000021 - momentum: 0.000000
2023-10-15 21:55:51,872 epoch 1 - iter 2080/2606 - loss 0.46070053 - time (sec): 153.82 - samples/sec: 1891.67 - lr: 0.000024 - momentum: 0.000000
2023-10-15 21:56:10,667 epoch 1 - iter 2340/2606 - loss 0.43323816 - time (sec): 172.61 - samples/sec: 1891.10 - lr: 0.000027 - momentum: 0.000000
2023-10-15 21:56:30,708 epoch 1 - iter 2600/2606 - loss 0.40652695 - time (sec): 192.65 - samples/sec: 1901.93 - lr: 0.000030 - momentum: 0.000000
2023-10-15 21:56:31,204 ----------------------------------------------------------------------------------------------------
2023-10-15 21:56:31,205 EPOCH 1 done: loss 0.4058 - lr: 0.000030
2023-10-15 21:56:36,969 DEV : loss 0.1317945271730423 - f1-score (micro avg) 0.3034
2023-10-15 21:56:36,997 saving best model
2023-10-15 21:56:37,371 ----------------------------------------------------------------------------------------------------
2023-10-15 21:56:56,944 epoch 2 - iter 260/2606 - loss 0.15883373 - time (sec): 19.57 - samples/sec: 1981.52 - lr: 0.000030 - momentum: 0.000000
2023-10-15 21:57:15,533 epoch 2 - iter 520/2606 - loss 0.15021341 - time (sec): 38.16 - samples/sec: 1965.18 - lr: 0.000029 - momentum: 0.000000
2023-10-15 21:57:34,242 epoch 2 - iter 780/2606 - loss 0.14428957 - time (sec): 56.87 - samples/sec: 1959.75 - lr: 0.000029 - momentum: 0.000000
2023-10-15 21:57:53,321 epoch 2 - iter 1040/2606 - loss 0.14632439 - time (sec): 75.95 - samples/sec: 1959.02 - lr: 0.000029 - momentum: 0.000000
2023-10-15 21:58:12,435 epoch 2 - iter 1300/2606 - loss 0.15002425 - time (sec): 95.06 - samples/sec: 1944.89 - lr: 0.000028 - momentum: 0.000000
2023-10-15 21:58:31,688 epoch 2 - iter 1560/2606 - loss 0.14638804 - time (sec): 114.32 - samples/sec: 1945.13 - lr: 0.000028 - momentum: 0.000000
2023-10-15 21:58:50,111 epoch 2 - iter 1820/2606 - loss 0.14688279 - time (sec): 132.74 - samples/sec: 1943.92 - lr: 0.000028 - momentum: 0.000000
2023-10-15 21:59:10,029 epoch 2 - iter 2080/2606 - loss 0.14589200 - time (sec): 152.66 - samples/sec: 1945.70 - lr: 0.000027 - momentum: 0.000000
2023-10-15 21:59:27,961 epoch 2 - iter 2340/2606 - loss 0.14640886 - time (sec): 170.59 - samples/sec: 1936.19 - lr: 0.000027 - momentum: 0.000000
2023-10-15 21:59:46,509 epoch 2 - iter 2600/2606 - loss 0.14645028 - time (sec): 189.14 - samples/sec: 1939.04 - lr: 0.000027 - momentum: 0.000000
2023-10-15 21:59:46,864 ----------------------------------------------------------------------------------------------------
2023-10-15 21:59:46,864 EPOCH 2 done: loss 0.1465 - lr: 0.000027
2023-10-15 21:59:55,993 DEV : loss 0.14893855154514313 - f1-score (micro avg) 0.3593
2023-10-15 21:59:56,021 saving best model
2023-10-15 21:59:56,501 ----------------------------------------------------------------------------------------------------
2023-10-15 22:00:15,212 epoch 3 - iter 260/2606 - loss 0.12119952 - time (sec): 18.71 - samples/sec: 1931.29 - lr: 0.000026 - momentum: 0.000000
2023-10-15 22:00:32,904 epoch 3 - iter 520/2606 - loss 0.10577303 - time (sec): 36.40 - samples/sec: 1889.81 - lr: 0.000026 - momentum: 0.000000
2023-10-15 22:00:51,018 epoch 3 - iter 780/2606 - loss 0.10660695 - time (sec): 54.51 - samples/sec: 1888.83 - lr: 0.000026 - momentum: 0.000000
2023-10-15 22:01:09,553 epoch 3 - iter 1040/2606 - loss 0.10403521 - time (sec): 73.05 - samples/sec: 1900.94 - lr: 0.000025 - momentum: 0.000000
2023-10-15 22:01:28,741 epoch 3 - iter 1300/2606 - loss 0.09848180 - time (sec): 92.24 - samples/sec: 1908.68 - lr: 0.000025 - momentum: 0.000000
2023-10-15 22:01:48,191 epoch 3 - iter 1560/2606 - loss 0.09910196 - time (sec): 111.69 - samples/sec: 1908.52 - lr: 0.000025 - momentum: 0.000000
2023-10-15 22:02:07,616 epoch 3 - iter 1820/2606 - loss 0.09891702 - time (sec): 131.11 - samples/sec: 1915.71 - lr: 0.000024 - momentum: 0.000000
2023-10-15 22:02:26,705 epoch 3 - iter 2080/2606 - loss 0.09897481 - time (sec): 150.20 - samples/sec: 1913.68 - lr: 0.000024 - momentum: 0.000000
2023-10-15 22:02:46,332 epoch 3 - iter 2340/2606 - loss 0.09916865 - time (sec): 169.83 - samples/sec: 1921.55 - lr: 0.000024 - momentum: 0.000000
2023-10-15 22:03:06,710 epoch 3 - iter 2600/2606 - loss 0.09857598 - time (sec): 190.20 - samples/sec: 1928.95 - lr: 0.000023 - momentum: 0.000000
2023-10-15 22:03:07,074 ----------------------------------------------------------------------------------------------------
2023-10-15 22:03:07,075 EPOCH 3 done: loss 0.0986 - lr: 0.000023
2023-10-15 22:03:16,081 DEV : loss 0.25157859921455383 - f1-score (micro avg) 0.3061
2023-10-15 22:03:16,107 ----------------------------------------------------------------------------------------------------
2023-10-15 22:03:34,214 epoch 4 - iter 260/2606 - loss 0.05397925 - time (sec): 18.11 - samples/sec: 1934.95 - lr: 0.000023 - momentum: 0.000000
2023-10-15 22:03:52,530 epoch 4 - iter 520/2606 - loss 0.06221217 - time (sec): 36.42 - samples/sec: 1969.41 - lr: 0.000023 - momentum: 0.000000
2023-10-15 22:04:11,985 epoch 4 - iter 780/2606 - loss 0.06302086 - time (sec): 55.88 - samples/sec: 1973.10 - lr: 0.000022 - momentum: 0.000000
2023-10-15 22:04:32,212 epoch 4 - iter 1040/2606 - loss 0.06274338 - time (sec): 76.10 - samples/sec: 1965.75 - lr: 0.000022 - momentum: 0.000000
2023-10-15 22:04:51,603 epoch 4 - iter 1300/2606 - loss 0.06736005 - time (sec): 95.49 - samples/sec: 1958.77 - lr: 0.000022 - momentum: 0.000000
2023-10-15 22:05:10,408 epoch 4 - iter 1560/2606 - loss 0.06710813 - time (sec): 114.30 - samples/sec: 1948.45 - lr: 0.000021 - momentum: 0.000000
2023-10-15 22:05:29,180 epoch 4 - iter 1820/2606 - loss 0.06660339 - time (sec): 133.07 - samples/sec: 1945.03 - lr: 0.000021 - momentum: 0.000000
2023-10-15 22:05:48,515 epoch 4 - iter 2080/2606 - loss 0.06584737 - time (sec): 152.41 - samples/sec: 1945.57 - lr: 0.000021 - momentum: 0.000000
2023-10-15 22:06:06,641 epoch 4 - iter 2340/2606 - loss 0.06648156 - time (sec): 170.53 - samples/sec: 1941.17 - lr: 0.000020 - momentum: 0.000000
2023-10-15 22:06:25,362 epoch 4 - iter 2600/2606 - loss 0.06655453 - time (sec): 189.25 - samples/sec: 1937.73 - lr: 0.000020 - momentum: 0.000000
2023-10-15 22:06:25,729 ----------------------------------------------------------------------------------------------------
2023-10-15 22:06:25,730 EPOCH 4 done: loss 0.0665 - lr: 0.000020
2023-10-15 22:06:34,799 DEV : loss 0.24660253524780273 - f1-score (micro avg) 0.362
2023-10-15 22:06:34,826 saving best model
2023-10-15 22:06:35,303 ----------------------------------------------------------------------------------------------------
2023-10-15 22:06:55,662 epoch 5 - iter 260/2606 - loss 0.05181151 - time (sec): 20.35 - samples/sec: 1892.16 - lr: 0.000020 - momentum: 0.000000
2023-10-15 22:07:13,967 epoch 5 - iter 520/2606 - loss 0.05220789 - time (sec): 38.66 - samples/sec: 1891.65 - lr: 0.000019 - momentum: 0.000000
2023-10-15 22:07:32,278 epoch 5 - iter 780/2606 - loss 0.05016679 - time (sec): 56.97 - samples/sec: 1906.72 - lr: 0.000019 - momentum: 0.000000
2023-10-15 22:07:51,830 epoch 5 - iter 1040/2606 - loss 0.04829786 - time (sec): 76.52 - samples/sec: 1925.97 - lr: 0.000019 - momentum: 0.000000
2023-10-15 22:08:11,380 epoch 5 - iter 1300/2606 - loss 0.04933213 - time (sec): 96.07 - samples/sec: 1932.05 - lr: 0.000018 - momentum: 0.000000
2023-10-15 22:08:29,300 epoch 5 - iter 1560/2606 - loss 0.05025896 - time (sec): 113.99 - samples/sec: 1940.82 - lr: 0.000018 - momentum: 0.000000
2023-10-15 22:08:48,060 epoch 5 - iter 1820/2606 - loss 0.05051188 - time (sec): 132.75 - samples/sec: 1930.96 - lr: 0.000018 - momentum: 0.000000
2023-10-15 22:09:07,640 epoch 5 - iter 2080/2606 - loss 0.05049316 - time (sec): 152.33 - samples/sec: 1933.14 - lr: 0.000017 - momentum: 0.000000
2023-10-15 22:09:25,995 epoch 5 - iter 2340/2606 - loss 0.05090602 - time (sec): 170.69 - samples/sec: 1936.70 - lr: 0.000017 - momentum: 0.000000
2023-10-15 22:09:44,781 epoch 5 - iter 2600/2606 - loss 0.05050105 - time (sec): 189.47 - samples/sec: 1935.41 - lr: 0.000017 - momentum: 0.000000
2023-10-15 22:09:45,200 ----------------------------------------------------------------------------------------------------
2023-10-15 22:09:45,200 EPOCH 5 done: loss 0.0505 - lr: 0.000017
2023-10-15 22:09:53,589 DEV : loss 0.39237016439437866 - f1-score (micro avg) 0.344
2023-10-15 22:09:53,617 ----------------------------------------------------------------------------------------------------
2023-10-15 22:10:12,902 epoch 6 - iter 260/2606 - loss 0.03079347 - time (sec): 19.28 - samples/sec: 1889.86 - lr: 0.000016 - momentum: 0.000000
2023-10-15 22:10:31,638 epoch 6 - iter 520/2606 - loss 0.03422273 - time (sec): 38.02 - samples/sec: 1933.13 - lr: 0.000016 - momentum: 0.000000
2023-10-15 22:10:50,704 epoch 6 - iter 780/2606 - loss 0.03367764 - time (sec): 57.09 - samples/sec: 1931.45 - lr: 0.000016 - momentum: 0.000000
2023-10-15 22:11:09,610 epoch 6 - iter 1040/2606 - loss 0.03423013 - time (sec): 75.99 - samples/sec: 1942.77 - lr: 0.000015 - momentum: 0.000000
2023-10-15 22:11:27,791 epoch 6 - iter 1300/2606 - loss 0.03565700 - time (sec): 94.17 - samples/sec: 1944.82 - lr: 0.000015 - momentum: 0.000000
2023-10-15 22:11:47,040 epoch 6 - iter 1560/2606 - loss 0.03707415 - time (sec): 113.42 - samples/sec: 1943.90 - lr: 0.000015 - momentum: 0.000000
2023-10-15 22:12:06,589 epoch 6 - iter 1820/2606 - loss 0.03663239 - time (sec): 132.97 - samples/sec: 1946.09 - lr: 0.000014 - momentum: 0.000000
2023-10-15 22:12:25,841 epoch 6 - iter 2080/2606 - loss 0.03621919 - time (sec): 152.22 - samples/sec: 1948.42 - lr: 0.000014 - momentum: 0.000000
2023-10-15 22:12:45,101 epoch 6 - iter 2340/2606 - loss 0.03653715 - time (sec): 171.48 - samples/sec: 1940.81 - lr: 0.000014 - momentum: 0.000000
2023-10-15 22:13:03,123 epoch 6 - iter 2600/2606 - loss 0.03719827 - time (sec): 189.50 - samples/sec: 1932.05 - lr: 0.000013 - momentum: 0.000000
2023-10-15 22:13:03,666 ----------------------------------------------------------------------------------------------------
2023-10-15 22:13:03,666 EPOCH 6 done: loss 0.0371 - lr: 0.000013
2023-10-15 22:13:12,012 DEV : loss 0.3723573386669159 - f1-score (micro avg) 0.3884
2023-10-15 22:13:12,040 saving best model
2023-10-15 22:13:12,529 ----------------------------------------------------------------------------------------------------
2023-10-15 22:13:30,591 epoch 7 - iter 260/2606 - loss 0.03143927 - time (sec): 18.05 - samples/sec: 1930.80 - lr: 0.000013 - momentum: 0.000000
2023-10-15 22:13:49,009 epoch 7 - iter 520/2606 - loss 0.03028943 - time (sec): 36.47 - samples/sec: 1959.23 - lr: 0.000013 - momentum: 0.000000
2023-10-15 22:14:08,538 epoch 7 - iter 780/2606 - loss 0.02982462 - time (sec): 56.00 - samples/sec: 1932.31 - lr: 0.000012 - momentum: 0.000000
2023-10-15 22:14:29,308 epoch 7 - iter 1040/2606 - loss 0.02849778 - time (sec): 76.77 - samples/sec: 1920.09 - lr: 0.000012 - momentum: 0.000000
2023-10-15 22:14:48,189 epoch 7 - iter 1300/2606 - loss 0.02740309 - time (sec): 95.65 - samples/sec: 1916.83 - lr: 0.000012 - momentum: 0.000000
2023-10-15 22:15:07,172 epoch 7 - iter 1560/2606 - loss 0.02795793 - time (sec): 114.64 - samples/sec: 1922.54 - lr: 0.000011 - momentum: 0.000000
2023-10-15 22:15:26,120 epoch 7 - iter 1820/2606 - loss 0.02696128 - time (sec): 133.58 - samples/sec: 1933.43 - lr: 0.000011 - momentum: 0.000000
2023-10-15 22:15:45,154 epoch 7 - iter 2080/2606 - loss 0.02587610 - time (sec): 152.62 - samples/sec: 1937.39 - lr: 0.000011 - momentum: 0.000000
2023-10-15 22:16:03,410 epoch 7 - iter 2340/2606 - loss 0.02517437 - time (sec): 170.87 - samples/sec: 1927.26 - lr: 0.000010 - momentum: 0.000000
2023-10-15 22:16:22,695 epoch 7 - iter 2600/2606 - loss 0.02561039 - time (sec): 190.16 - samples/sec: 1926.39 - lr: 0.000010 - momentum: 0.000000
2023-10-15 22:16:23,259 ----------------------------------------------------------------------------------------------------
2023-10-15 22:16:23,259 EPOCH 7 done: loss 0.0256 - lr: 0.000010
2023-10-15 22:16:31,497 DEV : loss 0.37943604588508606 - f1-score (micro avg) 0.3863
2023-10-15 22:16:31,525 ----------------------------------------------------------------------------------------------------
2023-10-15 22:16:51,372 epoch 8 - iter 260/2606 - loss 0.01682548 - time (sec): 19.85 - samples/sec: 1966.40 - lr: 0.000010 - momentum: 0.000000
2023-10-15 22:17:10,615 epoch 8 - iter 520/2606 - loss 0.01934422 - time (sec): 39.09 - samples/sec: 1968.64 - lr: 0.000009 - momentum: 0.000000
2023-10-15 22:17:29,460 epoch 8 - iter 780/2606 - loss 0.01904607 - time (sec): 57.93 - samples/sec: 1949.77 - lr: 0.000009 - momentum: 0.000000
2023-10-15 22:17:48,691 epoch 8 - iter 1040/2606 - loss 0.01943065 - time (sec): 77.16 - samples/sec: 1938.17 - lr: 0.000009 - momentum: 0.000000
2023-10-15 22:18:07,840 epoch 8 - iter 1300/2606 - loss 0.01802102 - time (sec): 96.31 - samples/sec: 1933.12 - lr: 0.000008 - momentum: 0.000000
2023-10-15 22:18:27,932 epoch 8 - iter 1560/2606 - loss 0.01930885 - time (sec): 116.41 - samples/sec: 1921.71 - lr: 0.000008 - momentum: 0.000000
2023-10-15 22:18:45,937 epoch 8 - iter 1820/2606 - loss 0.01997459 - time (sec): 134.41 - samples/sec: 1929.98 - lr: 0.000008 - momentum: 0.000000
2023-10-15 22:19:04,279 epoch 8 - iter 2080/2606 - loss 0.01985683 - time (sec): 152.75 - samples/sec: 1933.12 - lr: 0.000007 - momentum: 0.000000
2023-10-15 22:19:22,910 epoch 8 - iter 2340/2606 - loss 0.02005650 - time (sec): 171.38 - samples/sec: 1924.45 - lr: 0.000007 - momentum: 0.000000
2023-10-15 22:19:41,991 epoch 8 - iter 2600/2606 - loss 0.01999583 - time (sec): 190.46 - samples/sec: 1926.05 - lr: 0.000007 - momentum: 0.000000
2023-10-15 22:19:42,365 ----------------------------------------------------------------------------------------------------
2023-10-15 22:19:42,365 EPOCH 8 done: loss 0.0200 - lr: 0.000007
2023-10-15 22:19:50,593 DEV : loss 0.5024023652076721 - f1-score (micro avg) 0.3622
2023-10-15 22:19:50,620 ----------------------------------------------------------------------------------------------------
2023-10-15 22:20:09,631 epoch 9 - iter 260/2606 - loss 0.01423242 - time (sec): 19.01 - samples/sec: 1979.41 - lr: 0.000006 - momentum: 0.000000
2023-10-15 22:20:27,640 epoch 9 - iter 520/2606 - loss 0.01556764 - time (sec): 37.02 - samples/sec: 1958.87 - lr: 0.000006 - momentum: 0.000000
2023-10-15 22:20:46,047 epoch 9 - iter 780/2606 - loss 0.01574355 - time (sec): 55.43 - samples/sec: 1936.01 - lr: 0.000006 - momentum: 0.000000
2023-10-15 22:21:04,365 epoch 9 - iter 1040/2606 - loss 0.01640986 - time (sec): 73.74 - samples/sec: 1936.03 - lr: 0.000005 - momentum: 0.000000
2023-10-15 22:21:24,067 epoch 9 - iter 1300/2606 - loss 0.01548556 - time (sec): 93.45 - samples/sec: 1940.35 - lr: 0.000005 - momentum: 0.000000
2023-10-15 22:21:42,933 epoch 9 - iter 1560/2606 - loss 0.01493432 - time (sec): 112.31 - samples/sec: 1945.73 - lr: 0.000005 - momentum: 0.000000
2023-10-15 22:22:01,991 epoch 9 - iter 1820/2606 - loss 0.01494702 - time (sec): 131.37 - samples/sec: 1943.02 - lr: 0.000004 - momentum: 0.000000
2023-10-15 22:22:21,758 epoch 9 - iter 2080/2606 - loss 0.01477139 - time (sec): 151.14 - samples/sec: 1936.57 - lr: 0.000004 - momentum: 0.000000
2023-10-15 22:22:41,198 epoch 9 - iter 2340/2606 - loss 0.01481877 - time (sec): 170.58 - samples/sec: 1938.07 - lr: 0.000004 - momentum: 0.000000
2023-10-15 22:22:59,969 epoch 9 - iter 2600/2606 - loss 0.01461832 - time (sec): 189.35 - samples/sec: 1935.85 - lr: 0.000003 - momentum: 0.000000
2023-10-15 22:23:00,420 ----------------------------------------------------------------------------------------------------
2023-10-15 22:23:00,420 EPOCH 9 done: loss 0.0146 - lr: 0.000003
2023-10-15 22:23:08,661 DEV : loss 0.4772779941558838 - f1-score (micro avg) 0.3756
2023-10-15 22:23:08,688 ----------------------------------------------------------------------------------------------------
2023-10-15 22:23:27,098 epoch 10 - iter 260/2606 - loss 0.00972956 - time (sec): 18.41 - samples/sec: 1934.67 - lr: 0.000003 - momentum: 0.000000
2023-10-15 22:23:45,965 epoch 10 - iter 520/2606 - loss 0.01172751 - time (sec): 37.28 - samples/sec: 1916.83 - lr: 0.000003 - momentum: 0.000000
2023-10-15 22:24:04,310 epoch 10 - iter 780/2606 - loss 0.01065313 - time (sec): 55.62 - samples/sec: 1918.39 - lr: 0.000002 - momentum: 0.000000
2023-10-15 22:24:22,807 epoch 10 - iter 1040/2606 - loss 0.01016933 - time (sec): 74.12 - samples/sec: 1924.29 - lr: 0.000002 - momentum: 0.000000
2023-10-15 22:24:41,423 epoch 10 - iter 1300/2606 - loss 0.00957097 - time (sec): 92.73 - samples/sec: 1920.20 - lr: 0.000002 - momentum: 0.000000
2023-10-15 22:25:01,089 epoch 10 - iter 1560/2606 - loss 0.00974723 - time (sec): 112.40 - samples/sec: 1927.24 - lr: 0.000001 - momentum: 0.000000
2023-10-15 22:25:20,400 epoch 10 - iter 1820/2606 - loss 0.01003146 - time (sec): 131.71 - samples/sec: 1932.90 - lr: 0.000001 - momentum: 0.000000
2023-10-15 22:25:40,816 epoch 10 - iter 2080/2606 - loss 0.01015912 - time (sec): 152.13 - samples/sec: 1929.34 - lr: 0.000001 - momentum: 0.000000
2023-10-15 22:26:00,417 epoch 10 - iter 2340/2606 - loss 0.00999977 - time (sec): 171.73 - samples/sec: 1928.34 - lr: 0.000000 - momentum: 0.000000
2023-10-15 22:26:18,749 epoch 10 - iter 2600/2606 - loss 0.01000144 - time (sec): 190.06 - samples/sec: 1929.69 - lr: 0.000000 - momentum: 0.000000
2023-10-15 22:26:19,135 ----------------------------------------------------------------------------------------------------
2023-10-15 22:26:19,136 EPOCH 10 done: loss 0.0100 - lr: 0.000000
2023-10-15 22:26:28,202 DEV : loss 0.4924582839012146 - f1-score (micro avg) 0.3716
2023-10-15 22:26:28,679 ----------------------------------------------------------------------------------------------------
2023-10-15 22:26:28,681 Loading model from best epoch ...
2023-10-15 22:26:30,137 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 22:26:45,458
Results:
- F-score (micro) 0.4795
- F-score (macro) 0.3252
- Accuracy 0.3189
By class:
              precision    recall  f1-score   support

         LOC     0.5055    0.6483    0.5680      1214
         PER     0.4040    0.4270    0.4152       808
         ORG     0.2940    0.3456    0.3177       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4415    0.5247    0.4795      2390
   macro avg     0.3009    0.3552    0.3252      2390
weighted avg     0.4367    0.5247    0.4758      2390
2023-10-15 22:26:45,458 ----------------------------------------------------------------------------------------------------
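The summary scores above hang together: the macro F1 is the unweighted mean of the four per-class F1 scores, and the micro F1 is the harmonic mean of the micro-averaged precision and recall. A quick check using the (rounded) values printed in the table, so the comparison needs a small tolerance:

```python
# Recompute the summary scores from the per-class table above.
class_f1 = {"LOC": 0.5680, "PER": 0.4152, "ORG": 0.3177, "HumanProd": 0.0000}

macro_f1 = sum(class_f1.values()) / len(class_f1)

micro_p, micro_r = 0.4415, 0.5247          # micro avg precision / recall
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4), round(micro_f1, 4))  # 0.3252 0.4795
```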