2023-10-15 21:53:18,052 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,053 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 21:53:18,053 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,053 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 21:53:18,053 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,053 Train: 20847 sentences
2023-10-15 21:53:18,053 (train_with_dev=False, train_with_test=False)
2023-10-15 21:53:18,053 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,054 Training Params:
2023-10-15 21:53:18,054 - learning_rate: "3e-05"
2023-10-15 21:53:18,054 - mini_batch_size: "8"
2023-10-15 21:53:18,054 - max_epochs: "10"
2023-10-15 21:53:18,054 - shuffle: "True"
2023-10-15 21:53:18,054 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,054 Plugins:
2023-10-15 21:53:18,054 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 21:53:18,054 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,054 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 21:53:18,054 - metric: "('micro avg', 'f1-score')"
2023-10-15 21:53:18,054 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,054 Computation:
2023-10-15 21:53:18,054 - compute on device: cuda:0
2023-10-15 21:53:18,054 - embedding storage: none
2023-10-15 21:53:18,054 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,054 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-15 21:53:18,054 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:18,054 ----------------------------------------------------------------------------------------------------
2023-10-15 21:53:36,396 epoch 1 - iter 260/2606 - loss 1.86299955 - time (sec): 18.34 - samples/sec: 1940.73 - lr: 0.000003 - momentum: 0.000000
2023-10-15 21:53:56,212 epoch 1 - iter 520/2606 - loss 1.13662102 - time (sec): 38.16 - samples/sec: 1910.14 - lr: 0.000006 - momentum: 0.000000
2023-10-15 21:54:15,437 epoch 1 - iter 780/2606 - loss 0.86190244 - time (sec): 57.38 - samples/sec: 1889.39 - lr: 0.000009 - momentum: 0.000000
2023-10-15 21:54:34,769 epoch 1 - iter 1040/2606 - loss 0.71609132 - time (sec): 76.71 - samples/sec: 1876.81 - lr: 0.000012 - momentum: 0.000000
2023-10-15 21:54:55,490 epoch 1 - iter 1300/2606 - loss 0.61712413 - time (sec): 97.43 - samples/sec: 1853.07 - lr: 0.000015 - momentum: 0.000000
2023-10-15 21:55:14,612 epoch 1 - iter 1560/2606 - loss 0.54551314 - time (sec): 116.56 - samples/sec: 1870.50 - lr: 0.000018 - momentum: 0.000000
2023-10-15 21:55:32,894 epoch 1 - iter 1820/2606 - loss 0.49840634 - time (sec): 134.84 - samples/sec: 1891.60 - lr: 0.000021 - momentum: 0.000000
2023-10-15 21:55:51,872 epoch 1 - iter 2080/2606 - loss 0.46070053 - time (sec): 153.82 - samples/sec: 1891.67 - lr: 0.000024 - momentum: 0.000000
2023-10-15 21:56:10,667 epoch 1 - iter 2340/2606 - loss 0.43323816 - time (sec): 172.61 - samples/sec: 1891.10 - lr: 0.000027 - momentum: 0.000000
2023-10-15 21:56:30,708 epoch 1 - iter 2600/2606 - loss 0.40652695 - time (sec): 192.65 - samples/sec: 1901.93 - lr: 0.000030 - momentum: 0.000000
2023-10-15 21:56:31,204 ----------------------------------------------------------------------------------------------------
2023-10-15 21:56:31,205 EPOCH 1 done: loss 0.4058 - lr: 0.000030
2023-10-15 21:56:36,969 DEV : loss 0.1317945271730423 - f1-score (micro avg) 0.3034
2023-10-15 21:56:36,997 saving best model
2023-10-15 21:56:37,371 ----------------------------------------------------------------------------------------------------
2023-10-15 21:56:56,944 epoch 2 - iter 260/2606 - loss 0.15883373 - time (sec): 19.57 - samples/sec: 1981.52 - lr: 0.000030 - momentum: 0.000000
2023-10-15 21:57:15,533 epoch 2 - iter 520/2606 - loss 0.15021341 - time (sec): 38.16 - samples/sec: 1965.18 - lr: 0.000029 - momentum: 0.000000
2023-10-15 21:57:34,242 epoch 2 - iter 780/2606 - loss 0.14428957 - time (sec): 56.87 - samples/sec: 1959.75 - lr: 0.000029 - momentum: 0.000000
2023-10-15 21:57:53,321 epoch 2 - iter 1040/2606 - loss 0.14632439 - time (sec): 75.95 - samples/sec: 1959.02 - lr: 0.000029 - momentum: 0.000000
2023-10-15 21:58:12,435 epoch 2 - iter 1300/2606 - loss 0.15002425 - time (sec): 95.06 - samples/sec: 1944.89 - lr: 0.000028 - momentum: 0.000000
2023-10-15 21:58:31,688 epoch 2 - iter 1560/2606 - loss 0.14638804 - time (sec): 114.32 - samples/sec: 1945.13 - lr: 0.000028 - momentum: 0.000000
2023-10-15 21:58:50,111 epoch 2 - iter 1820/2606 - loss 0.14688279 - time (sec): 132.74 - samples/sec: 1943.92 - lr: 0.000028 - momentum: 0.000000
2023-10-15 21:59:10,029 epoch 2 - iter 2080/2606 - loss 0.14589200 - time (sec): 152.66 - samples/sec: 1945.70 - lr: 0.000027 - momentum: 0.000000
2023-10-15 21:59:27,961 epoch 2 - iter 2340/2606 - loss 0.14640886 - time (sec): 170.59 - samples/sec: 1936.19 - lr: 0.000027 - momentum: 0.000000
2023-10-15 21:59:46,509 epoch 2 - iter 2600/2606 - loss 0.14645028 - time (sec): 189.14 - samples/sec: 1939.04 - lr: 0.000027 - momentum: 0.000000
2023-10-15 21:59:46,864 ----------------------------------------------------------------------------------------------------
2023-10-15 21:59:46,864 EPOCH 2 done: loss 0.1465 - lr: 0.000027
2023-10-15 21:59:55,993 DEV : loss 0.14893855154514313 - f1-score (micro avg) 0.3593
2023-10-15 21:59:56,021 saving best model
2023-10-15 21:59:56,501 ----------------------------------------------------------------------------------------------------
2023-10-15 22:00:15,212 epoch 3 - iter 260/2606 - loss 0.12119952 - time (sec): 18.71 - samples/sec: 1931.29 - lr: 0.000026 - momentum: 0.000000
2023-10-15 22:00:32,904 epoch 3 - iter 520/2606 - loss 0.10577303 - time (sec): 36.40 - samples/sec: 1889.81 - lr: 0.000026 - momentum: 0.000000
2023-10-15 22:00:51,018 epoch 3 - iter 780/2606 - loss 0.10660695 - time (sec): 54.51 - samples/sec: 1888.83 - lr: 0.000026 - momentum: 0.000000
2023-10-15 22:01:09,553 epoch 3 - iter 1040/2606 - loss 0.10403521 - time (sec): 73.05 - samples/sec: 1900.94 - lr: 0.000025 - momentum: 0.000000
2023-10-15 22:01:28,741 epoch 3 - iter 1300/2606 - loss 0.09848180 - time (sec): 92.24 - samples/sec: 1908.68 - lr: 0.000025 - momentum: 0.000000
2023-10-15 22:01:48,191 epoch 3 - iter 1560/2606 - loss 0.09910196 - time (sec): 111.69 - samples/sec: 1908.52 - lr: 0.000025 - momentum: 0.000000
2023-10-15 22:02:07,616 epoch 3 - iter 1820/2606 - loss 0.09891702 - time (sec): 131.11 - samples/sec: 1915.71 - lr: 0.000024 - momentum: 0.000000
2023-10-15 22:02:26,705 epoch 3 - iter 2080/2606 - loss 0.09897481 - time (sec): 150.20 - samples/sec: 1913.68 - lr: 0.000024 - momentum: 0.000000
2023-10-15 22:02:46,332 epoch 3 - iter 2340/2606 - loss 0.09916865 - time (sec): 169.83 - samples/sec: 1921.55 - lr: 0.000024 - momentum: 0.000000
2023-10-15 22:03:06,710 epoch 3 - iter 2600/2606 - loss 0.09857598 - time (sec): 190.20 - samples/sec: 1928.95 - lr: 0.000023 - momentum: 0.000000
2023-10-15 22:03:07,074 ----------------------------------------------------------------------------------------------------
2023-10-15 22:03:07,075 EPOCH 3 done: loss 0.0986 - lr: 0.000023
2023-10-15 22:03:16,081 DEV : loss 0.25157859921455383 - f1-score (micro avg) 0.3061
2023-10-15 22:03:16,107 ----------------------------------------------------------------------------------------------------
2023-10-15 22:03:34,214 epoch 4 - iter 260/2606 - loss 0.05397925 - time (sec): 18.11 - samples/sec: 1934.95 - lr: 0.000023 - momentum: 0.000000
2023-10-15 22:03:52,530 epoch 4 - iter 520/2606 - loss 0.06221217 - time (sec): 36.42 - samples/sec: 1969.41 - lr: 0.000023 - momentum: 0.000000
2023-10-15 22:04:11,985 epoch 4 - iter 780/2606 - loss 0.06302086 - time (sec): 55.88 - samples/sec: 1973.10 - lr: 0.000022 - momentum: 0.000000
2023-10-15 22:04:32,212 epoch 4 - iter 1040/2606 - loss 0.06274338 - time (sec): 76.10 - samples/sec: 1965.75 - lr: 0.000022 - momentum: 0.000000
2023-10-15 22:04:51,603 epoch 4 - iter 1300/2606 - loss 0.06736005 - time (sec): 95.49 - samples/sec: 1958.77 - lr: 0.000022 - momentum: 0.000000
2023-10-15 22:05:10,408 epoch 4 - iter 1560/2606 - loss 0.06710813 - time (sec): 114.30 - samples/sec: 1948.45 - lr: 0.000021 - momentum: 0.000000
2023-10-15 22:05:29,180 epoch 4 - iter 1820/2606 - loss 0.06660339 - time (sec): 133.07 - samples/sec: 1945.03 - lr: 0.000021 - momentum: 0.000000
2023-10-15 22:05:48,515 epoch 4 - iter 2080/2606 - loss 0.06584737 - time (sec): 152.41 - samples/sec: 1945.57 - lr: 0.000021 - momentum: 0.000000
2023-10-15 22:06:06,641 epoch 4 - iter 2340/2606 - loss 0.06648156 - time (sec): 170.53 - samples/sec: 1941.17 - lr: 0.000020 - momentum: 0.000000
2023-10-15 22:06:25,362 epoch 4 - iter 2600/2606 - loss 0.06655453 - time (sec): 189.25 - samples/sec: 1937.73 - lr: 0.000020 - momentum: 0.000000
2023-10-15 22:06:25,729 ----------------------------------------------------------------------------------------------------
2023-10-15 22:06:25,730 EPOCH 4 done: loss 0.0665 - lr: 0.000020
2023-10-15 22:06:34,799 DEV : loss 0.24660253524780273 - f1-score (micro avg) 0.362
2023-10-15 22:06:34,826 saving best model
2023-10-15 22:06:35,303 ----------------------------------------------------------------------------------------------------
2023-10-15 22:06:55,662 epoch 5 - iter 260/2606 - loss 0.05181151 - time (sec): 20.35 - samples/sec: 1892.16 - lr: 0.000020 - momentum: 0.000000
2023-10-15 22:07:13,967 epoch 5 - iter 520/2606 - loss 0.05220789 - time (sec): 38.66 - samples/sec: 1891.65 - lr: 0.000019 - momentum: 0.000000
2023-10-15 22:07:32,278 epoch 5 - iter 780/2606 - loss 0.05016679 - time (sec): 56.97 - samples/sec: 1906.72 - lr: 0.000019 - momentum: 0.000000
2023-10-15 22:07:51,830 epoch 5 - iter 1040/2606 - loss 0.04829786 - time (sec): 76.52 - samples/sec: 1925.97 - lr: 0.000019 - momentum: 0.000000
2023-10-15 22:08:11,380 epoch 5 - iter 1300/2606 - loss 0.04933213 - time (sec): 96.07 - samples/sec: 1932.05 - lr: 0.000018 - momentum: 0.000000
2023-10-15 22:08:29,300 epoch 5 - iter 1560/2606 - loss 0.05025896 - time (sec): 113.99 - samples/sec: 1940.82 - lr: 0.000018 - momentum: 0.000000
2023-10-15 22:08:48,060 epoch 5 - iter 1820/2606 - loss 0.05051188 - time (sec): 132.75 - samples/sec: 1930.96 - lr: 0.000018 - momentum: 0.000000
2023-10-15 22:09:07,640 epoch 5 - iter 2080/2606 - loss 0.05049316 - time (sec): 152.33 - samples/sec: 1933.14 - lr: 0.000017 - momentum: 0.000000
2023-10-15 22:09:25,995 epoch 5 - iter 2340/2606 - loss 0.05090602 - time (sec): 170.69 - samples/sec: 1936.70 - lr: 0.000017 - momentum: 0.000000
2023-10-15 22:09:44,781 epoch 5 - iter 2600/2606 - loss 0.05050105 - time (sec): 189.47 - samples/sec: 1935.41 - lr: 0.000017 - momentum: 0.000000
2023-10-15 22:09:45,200 ----------------------------------------------------------------------------------------------------
2023-10-15 22:09:45,200 EPOCH 5 done: loss 0.0505 - lr: 0.000017
2023-10-15 22:09:53,589 DEV : loss 0.39237016439437866 - f1-score (micro avg) 0.344
2023-10-15 22:09:53,617 ----------------------------------------------------------------------------------------------------
2023-10-15 22:10:12,902 epoch 6 - iter 260/2606 - loss 0.03079347 - time (sec): 19.28 - samples/sec: 1889.86 - lr: 0.000016 - momentum: 0.000000
2023-10-15 22:10:31,638 epoch 6 - iter 520/2606 - loss 0.03422273 - time (sec): 38.02 - samples/sec: 1933.13 - lr: 0.000016 - momentum: 0.000000
2023-10-15 22:10:50,704 epoch 6 - iter 780/2606 - loss 0.03367764 - time (sec): 57.09 - samples/sec: 1931.45 - lr: 0.000016 - momentum: 0.000000
2023-10-15 22:11:09,610 epoch 6 - iter 1040/2606 - loss 0.03423013 - time (sec): 75.99 - samples/sec: 1942.77 - lr: 0.000015 - momentum: 0.000000
2023-10-15 22:11:27,791 epoch 6 - iter 1300/2606 - loss 0.03565700 - time (sec): 94.17 - samples/sec: 1944.82 - lr: 0.000015 - momentum: 0.000000
2023-10-15 22:11:47,040 epoch 6 - iter 1560/2606 - loss 0.03707415 - time (sec): 113.42 - samples/sec: 1943.90 - lr: 0.000015 - momentum: 0.000000
2023-10-15 22:12:06,589 epoch 6 - iter 1820/2606 - loss 0.03663239 - time (sec): 132.97 - samples/sec: 1946.09 - lr: 0.000014 - momentum: 0.000000
2023-10-15 22:12:25,841 epoch 6 - iter 2080/2606 - loss 0.03621919 - time (sec): 152.22 - samples/sec: 1948.42 - lr: 0.000014 - momentum: 0.000000
2023-10-15 22:12:45,101 epoch 6 - iter 2340/2606 - loss 0.03653715 - time (sec): 171.48 - samples/sec: 1940.81 - lr: 0.000014 - momentum: 0.000000
2023-10-15 22:13:03,123 epoch 6 - iter 2600/2606 - loss 0.03719827 - time (sec): 189.50 - samples/sec: 1932.05 - lr: 0.000013 - momentum: 0.000000
2023-10-15 22:13:03,666 ----------------------------------------------------------------------------------------------------
2023-10-15 22:13:03,666 EPOCH 6 done: loss 0.0371 - lr: 0.000013
2023-10-15 22:13:12,012 DEV : loss 0.3723573386669159 - f1-score (micro avg) 0.3884
2023-10-15 22:13:12,040 saving best model
2023-10-15 22:13:12,529 ----------------------------------------------------------------------------------------------------
2023-10-15 22:13:30,591 epoch 7 - iter 260/2606 - loss 0.03143927 - time (sec): 18.05 - samples/sec: 1930.80 - lr: 0.000013 - momentum: 0.000000
2023-10-15 22:13:49,009 epoch 7 - iter 520/2606 - loss 0.03028943 - time (sec): 36.47 - samples/sec: 1959.23 - lr: 0.000013 - momentum: 0.000000
2023-10-15 22:14:08,538 epoch 7 - iter 780/2606 - loss 0.02982462 - time (sec): 56.00 - samples/sec: 1932.31 - lr: 0.000012 - momentum: 0.000000
2023-10-15 22:14:29,308 epoch 7 - iter 1040/2606 - loss 0.02849778 - time (sec): 76.77 - samples/sec: 1920.09 - lr: 0.000012 - momentum: 0.000000
2023-10-15 22:14:48,189 epoch 7 - iter 1300/2606 - loss 0.02740309 - time (sec): 95.65 - samples/sec: 1916.83 - lr: 0.000012 - momentum: 0.000000
2023-10-15 22:15:07,172 epoch 7 - iter 1560/2606 - loss 0.02795793 - time (sec): 114.64 - samples/sec: 1922.54 - lr: 0.000011 - momentum: 0.000000
2023-10-15 22:15:26,120 epoch 7 - iter 1820/2606 - loss 0.02696128 - time (sec): 133.58 - samples/sec: 1933.43 - lr: 0.000011 - momentum: 0.000000
2023-10-15 22:15:45,154 epoch 7 - iter 2080/2606 - loss 0.02587610 - time (sec): 152.62 - samples/sec: 1937.39 - lr: 0.000011 - momentum: 0.000000
2023-10-15 22:16:03,410 epoch 7 - iter 2340/2606 - loss 0.02517437 - time (sec): 170.87 - samples/sec: 1927.26 - lr: 0.000010 - momentum: 0.000000
2023-10-15 22:16:22,695 epoch 7 - iter 2600/2606 - loss 0.02561039 - time (sec): 190.16 - samples/sec: 1926.39 - lr: 0.000010 - momentum: 0.000000
2023-10-15 22:16:23,259 ----------------------------------------------------------------------------------------------------
2023-10-15 22:16:23,259 EPOCH 7 done: loss 0.0256 - lr: 0.000010
2023-10-15 22:16:31,497 DEV : loss 0.37943604588508606 - f1-score (micro avg) 0.3863
2023-10-15 22:16:31,525 ----------------------------------------------------------------------------------------------------
2023-10-15 22:16:51,372 epoch 8 - iter 260/2606 - loss 0.01682548 - time (sec): 19.85 - samples/sec: 1966.40 - lr: 0.000010 - momentum: 0.000000
2023-10-15 22:17:10,615 epoch 8 - iter 520/2606 - loss 0.01934422 - time (sec): 39.09 - samples/sec: 1968.64 - lr: 0.000009 - momentum: 0.000000
2023-10-15 22:17:29,460 epoch 8 - iter 780/2606 - loss 0.01904607 - time (sec): 57.93 - samples/sec: 1949.77 - lr: 0.000009 - momentum: 0.000000
2023-10-15 22:17:48,691 epoch 8 - iter 1040/2606 - loss 0.01943065 - time (sec): 77.16 - samples/sec: 1938.17 - lr: 0.000009 - momentum: 0.000000
2023-10-15 22:18:07,840 epoch 8 - iter 1300/2606 - loss 0.01802102 - time (sec): 96.31 - samples/sec: 1933.12 - lr: 0.000008 - momentum: 0.000000
2023-10-15 22:18:27,932 epoch 8 - iter 1560/2606 - loss 0.01930885 - time (sec): 116.41 - samples/sec: 1921.71 - lr: 0.000008 - momentum: 0.000000
2023-10-15 22:18:45,937 epoch 8 - iter 1820/2606 - loss 0.01997459 - time (sec): 134.41 - samples/sec: 1929.98 - lr: 0.000008 - momentum: 0.000000
2023-10-15 22:19:04,279 epoch 8 - iter 2080/2606 - loss 0.01985683 - time (sec): 152.75 - samples/sec: 1933.12 - lr: 0.000007 - momentum: 0.000000
2023-10-15 22:19:22,910 epoch 8 - iter 2340/2606 - loss 0.02005650 - time (sec): 171.38 - samples/sec: 1924.45 - lr: 0.000007 - momentum: 0.000000
2023-10-15 22:19:41,991 epoch 8 - iter 2600/2606 - loss 0.01999583 - time (sec): 190.46 - samples/sec: 1926.05 - lr: 0.000007 - momentum: 0.000000
2023-10-15 22:19:42,365 ----------------------------------------------------------------------------------------------------
2023-10-15 22:19:42,365 EPOCH 8 done: loss 0.0200 - lr: 0.000007
2023-10-15 22:19:50,593 DEV : loss 0.5024023652076721 - f1-score (micro avg) 0.3622
2023-10-15 22:19:50,620 ----------------------------------------------------------------------------------------------------
2023-10-15 22:20:09,631 epoch 9 - iter 260/2606 - loss 0.01423242 - time (sec): 19.01 - samples/sec: 1979.41 - lr: 0.000006 - momentum: 0.000000
2023-10-15 22:20:27,640 epoch 9 - iter 520/2606 - loss 0.01556764 - time (sec): 37.02 - samples/sec: 1958.87 - lr: 0.000006 - momentum: 0.000000
2023-10-15 22:20:46,047 epoch 9 - iter 780/2606 - loss 0.01574355 - time (sec): 55.43 - samples/sec: 1936.01 - lr: 0.000006 - momentum: 0.000000
2023-10-15 22:21:04,365 epoch 9 - iter 1040/2606 - loss 0.01640986 - time (sec): 73.74 - samples/sec: 1936.03 - lr: 0.000005 - momentum: 0.000000
2023-10-15 22:21:24,067 epoch 9 - iter 1300/2606 - loss 0.01548556 - time (sec): 93.45 - samples/sec: 1940.35 - lr: 0.000005 - momentum: 0.000000
2023-10-15 22:21:42,933 epoch 9 - iter 1560/2606 - loss 0.01493432 - time (sec): 112.31 - samples/sec: 1945.73 - lr: 0.000005 - momentum: 0.000000
2023-10-15 22:22:01,991 epoch 9 - iter 1820/2606 - loss 0.01494702 - time (sec): 131.37 - samples/sec: 1943.02 - lr: 0.000004 - momentum: 0.000000
2023-10-15 22:22:21,758 epoch 9 - iter 2080/2606 - loss 0.01477139 - time (sec): 151.14 - samples/sec: 1936.57 - lr: 0.000004 - momentum: 0.000000
2023-10-15 22:22:41,198 epoch 9 - iter 2340/2606 - loss 0.01481877 - time (sec): 170.58 - samples/sec: 1938.07 - lr: 0.000004 - momentum: 0.000000
2023-10-15 22:22:59,969 epoch 9 - iter 2600/2606 - loss 0.01461832 - time (sec): 189.35 - samples/sec: 1935.85 - lr: 0.000003 - momentum: 0.000000
2023-10-15 22:23:00,420 ----------------------------------------------------------------------------------------------------
2023-10-15 22:23:00,420 EPOCH 9 done: loss 0.0146 - lr: 0.000003
2023-10-15 22:23:08,661 DEV : loss 0.4772779941558838 - f1-score (micro avg) 0.3756
2023-10-15 22:23:08,688 ----------------------------------------------------------------------------------------------------
2023-10-15 22:23:27,098 epoch 10 - iter 260/2606 - loss 0.00972956 - time (sec): 18.41 - samples/sec: 1934.67 - lr: 0.000003 - momentum: 0.000000
2023-10-15 22:23:45,965 epoch 10 - iter 520/2606 - loss 0.01172751 - time (sec): 37.28 - samples/sec: 1916.83 - lr: 0.000003 - momentum: 0.000000
2023-10-15 22:24:04,310 epoch 10 - iter 780/2606 - loss 0.01065313 - time (sec): 55.62 - samples/sec: 1918.39 - lr: 0.000002 - momentum: 0.000000
2023-10-15 22:24:22,807 epoch 10 - iter 1040/2606 - loss 0.01016933 - time (sec): 74.12 - samples/sec: 1924.29 - lr: 0.000002 - momentum: 0.000000
2023-10-15 22:24:41,423 epoch 10 - iter 1300/2606 - loss 0.00957097 - time (sec): 92.73 - samples/sec: 1920.20 - lr: 0.000002 - momentum: 0.000000
2023-10-15 22:25:01,089 epoch 10 - iter 1560/2606 - loss 0.00974723 - time (sec): 112.40 - samples/sec: 1927.24 - lr: 0.000001 - momentum: 0.000000
2023-10-15 22:25:20,400 epoch 10 - iter 1820/2606 - loss 0.01003146 - time (sec): 131.71 - samples/sec: 1932.90 - lr: 0.000001 - momentum: 0.000000
2023-10-15 22:25:40,816 epoch 10 - iter 2080/2606 - loss 0.01015912 - time (sec): 152.13 - samples/sec: 1929.34 - lr: 0.000001 - momentum: 0.000000
2023-10-15 22:26:00,417 epoch 10 - iter 2340/2606 - loss 0.00999977 - time (sec): 171.73 - samples/sec: 1928.34 - lr: 0.000000 - momentum: 0.000000
2023-10-15 22:26:18,749 epoch 10 - iter 2600/2606 - loss 0.01000144 - time (sec): 190.06 - samples/sec: 1929.69 - lr: 0.000000 - momentum: 0.000000
2023-10-15 22:26:19,135 ----------------------------------------------------------------------------------------------------
2023-10-15 22:26:19,136 EPOCH 10 done: loss 0.0100 - lr: 0.000000
2023-10-15 22:26:28,202 DEV : loss 0.4924582839012146 - f1-score (micro avg) 0.3716
2023-10-15 22:26:28,679 ----------------------------------------------------------------------------------------------------
2023-10-15 22:26:28,681 Loading model from best epoch ...
2023-10-15 22:26:30,137 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 22:26:45,458
Results:
- F-score (micro) 0.4795
- F-score (macro) 0.3252
- Accuracy 0.3189

By class:
              precision    recall  f1-score   support

         LOC     0.5055    0.6483    0.5680      1214
         PER     0.4040    0.4270    0.4152       808
         ORG     0.2940    0.3456    0.3177       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4415    0.5247    0.4795      2390
   macro avg     0.3009    0.3552    0.3252      2390
weighted avg     0.4367    0.5247    0.4758      2390

2023-10-15 22:26:45,458 ----------------------------------------------------------------------------------------------------