2023-10-15 17:12:09,248 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 17:12:09,249 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 17:12:09,249 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 Train: 20847 sentences
2023-10-15 17:12:09,249 (train_with_dev=False, train_with_test=False)
2023-10-15 17:12:09,249 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 Training Params:
2023-10-15 17:12:09,249 - learning_rate: "5e-05"
2023-10-15 17:12:09,249 - mini_batch_size: "8"
2023-10-15 17:12:09,249 - max_epochs: "10"
2023-10-15 17:12:09,249 - shuffle: "True"
2023-10-15 17:12:09,249 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 Plugins:
2023-10-15 17:12:09,249 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 17:12:09,249 ----------------------------------------------------------------------------------------------------
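The `lr` column in the per-iteration lines below can be read against the logged parameters (peak `learning_rate` 5e-05, `warmup_fraction` 0.1, 20847 training sentences, `mini_batch_size` 8, `max_epochs` 10). The following is a minimal sketch of a linear warmup followed by linear decay to zero; the function name `linear_schedule_lr` and its exact formula are assumptions made for illustration, not Flair's actual `LinearScheduler` implementation.

```python
import math


def linear_schedule_lr(step, peak_lr, total_steps, warmup_fraction):
    """Hypothetical helper: learning rate after `step` optimizer updates (1-based).

    Rises linearly to `peak_lr` during warmup, then falls linearly to zero.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


# Values taken from the log: 20847 sentences, batch size 8, 10 epochs.
iters_per_epoch = math.ceil(20847 / 8)
total_steps = iters_per_epoch * 10

for epoch, it in [(1, 260), (1, 2600), (2, 260), (2, 2600), (10, 2600)]:
    step = (epoch - 1) * iters_per_epoch + it
    lr = linear_schedule_lr(step, 5e-5, total_steps, 0.1)
    print(f"epoch {epoch} - iter {it}/{iters_per_epoch} - lr: {lr:.6f}")
```

Under these assumptions the warmup spans exactly the first epoch (2606 of 26060 updates), which is consistent with the logged `lr` reaching 0.000050 at the end of epoch 1 and decaying afterwards.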
2023-10-15 17:12:09,249 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 17:12:09,249 - metric: "('micro avg', 'f1-score')"
2023-10-15 17:12:09,249 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 Computation:
2023-10-15 17:12:09,249 - compute on device: cuda:0
2023-10-15 17:12:09,249 - embedding storage: none
2023-10-15 17:12:09,250 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,250 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-15 17:12:09,250 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,250 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:27,851 epoch 1 - iter 260/2606 - loss 1.50266157 - time (sec): 18.60 - samples/sec: 1874.61 - lr: 0.000005 - momentum: 0.000000
2023-10-15 17:12:46,100 epoch 1 - iter 520/2606 - loss 0.98485226 - time (sec): 36.85 - samples/sec: 1923.74 - lr: 0.000010 - momentum: 0.000000
2023-10-15 17:13:05,208 epoch 1 - iter 780/2606 - loss 0.74065196 - time (sec): 55.96 - samples/sec: 1931.59 - lr: 0.000015 - momentum: 0.000000
2023-10-15 17:13:24,603 epoch 1 - iter 1040/2606 - loss 0.61022759 - time (sec): 75.35 - samples/sec: 1930.01 - lr: 0.000020 - momentum: 0.000000
2023-10-15 17:13:43,557 epoch 1 - iter 1300/2606 - loss 0.53605747 - time (sec): 94.31 - samples/sec: 1935.21 - lr: 0.000025 - momentum: 0.000000
2023-10-15 17:14:03,940 epoch 1 - iter 1560/2606 - loss 0.47601742 - time (sec): 114.69 - samples/sec: 1936.25 - lr: 0.000030 - momentum: 0.000000
2023-10-15 17:14:22,308 epoch 1 - iter 1820/2606 - loss 0.43680600 - time (sec): 133.06 - samples/sec: 1943.35 - lr: 0.000035 - momentum: 0.000000
2023-10-15 17:14:40,531 epoch 1 - iter 2080/2606 - loss 0.40770913 - time (sec): 151.28 - samples/sec: 1949.84 - lr: 0.000040 - momentum: 0.000000
2023-10-15 17:14:58,591 epoch 1 - iter 2340/2606 - loss 0.38693475 - time (sec): 169.34 - samples/sec: 1941.41 - lr: 0.000045 - momentum: 0.000000
2023-10-15 17:15:18,047 epoch 1 - iter 2600/2606 - loss 0.36558100 - time (sec): 188.80 - samples/sec: 1942.74 - lr: 0.000050 - momentum: 0.000000
2023-10-15 17:15:18,426 ----------------------------------------------------------------------------------------------------
2023-10-15 17:15:18,426 EPOCH 1 done: loss 0.3652 - lr: 0.000050
2023-10-15 17:15:24,131 DEV : loss 0.15214306116104126 - f1-score (micro avg) 0.2904
2023-10-15 17:15:24,157 saving best model
2023-10-15 17:15:24,554 ----------------------------------------------------------------------------------------------------
2023-10-15 17:15:42,495 epoch 2 - iter 260/2606 - loss 0.21425752 - time (sec): 17.94 - samples/sec: 1856.99 - lr: 0.000049 - momentum: 0.000000
2023-10-15 17:16:01,095 epoch 2 - iter 520/2606 - loss 0.18763868 - time (sec): 36.54 - samples/sec: 1924.91 - lr: 0.000049 - momentum: 0.000000
2023-10-15 17:16:19,351 epoch 2 - iter 780/2606 - loss 0.18203253 - time (sec): 54.80 - samples/sec: 1930.66 - lr: 0.000048 - momentum: 0.000000
2023-10-15 17:16:38,367 epoch 2 - iter 1040/2606 - loss 0.17096932 - time (sec): 73.81 - samples/sec: 1936.80 - lr: 0.000048 - momentum: 0.000000
2023-10-15 17:16:56,653 epoch 2 - iter 1300/2606 - loss 0.16919457 - time (sec): 92.10 - samples/sec: 1939.99 - lr: 0.000047 - momentum: 0.000000
2023-10-15 17:17:16,519 epoch 2 - iter 1560/2606 - loss 0.16758215 - time (sec): 111.96 - samples/sec: 1947.98 - lr: 0.000047 - momentum: 0.000000
2023-10-15 17:17:35,476 epoch 2 - iter 1820/2606 - loss 0.16614397 - time (sec): 130.92 - samples/sec: 1941.10 - lr: 0.000046 - momentum: 0.000000
2023-10-15 17:17:54,052 epoch 2 - iter 2080/2606 - loss 0.16420431 - time (sec): 149.50 - samples/sec: 1942.57 - lr: 0.000046 - momentum: 0.000000
2023-10-15 17:18:13,424 epoch 2 - iter 2340/2606 - loss 0.16007055 - time (sec): 168.87 - samples/sec: 1941.56 - lr: 0.000045 - momentum: 0.000000
2023-10-15 17:18:33,096 epoch 2 - iter 2600/2606 - loss 0.15826879 - time (sec): 188.54 - samples/sec: 1942.06 - lr: 0.000044 - momentum: 0.000000
2023-10-15 17:18:33,596 ----------------------------------------------------------------------------------------------------
2023-10-15 17:18:33,596 EPOCH 2 done: loss 0.1580 - lr: 0.000044
2023-10-15 17:18:42,526 DEV : loss 0.1702265739440918 - f1-score (micro avg) 0.3155
2023-10-15 17:18:42,552 saving best model
2023-10-15 17:18:43,006 ----------------------------------------------------------------------------------------------------
2023-10-15 17:19:00,588 epoch 3 - iter 260/2606 - loss 0.11956974 - time (sec): 17.58 - samples/sec: 1906.64 - lr: 0.000044 - momentum: 0.000000
2023-10-15 17:19:18,723 epoch 3 - iter 520/2606 - loss 0.11388643 - time (sec): 35.72 - samples/sec: 1921.25 - lr: 0.000043 - momentum: 0.000000
2023-10-15 17:19:39,576 epoch 3 - iter 780/2606 - loss 0.10643473 - time (sec): 56.57 - samples/sec: 1944.70 - lr: 0.000043 - momentum: 0.000000
2023-10-15 17:19:59,157 epoch 3 - iter 1040/2606 - loss 0.10454978 - time (sec): 76.15 - samples/sec: 1936.60 - lr: 0.000042 - momentum: 0.000000
2023-10-15 17:20:18,167 epoch 3 - iter 1300/2606 - loss 0.10544035 - time (sec): 95.16 - samples/sec: 1939.11 - lr: 0.000042 - momentum: 0.000000
2023-10-15 17:20:36,446 epoch 3 - iter 1560/2606 - loss 0.10643284 - time (sec): 113.44 - samples/sec: 1946.85 - lr: 0.000041 - momentum: 0.000000
2023-10-15 17:20:55,437 epoch 3 - iter 1820/2606 - loss 0.10632511 - time (sec): 132.43 - samples/sec: 1944.58 - lr: 0.000041 - momentum: 0.000000
2023-10-15 17:21:14,086 epoch 3 - iter 2080/2606 - loss 0.10739551 - time (sec): 151.08 - samples/sec: 1936.58 - lr: 0.000040 - momentum: 0.000000
2023-10-15 17:21:32,654 epoch 3 - iter 2340/2606 - loss 0.10835971 - time (sec): 169.65 - samples/sec: 1935.80 - lr: 0.000039 - momentum: 0.000000
2023-10-15 17:21:52,442 epoch 3 - iter 2600/2606 - loss 0.10794681 - time (sec): 189.43 - samples/sec: 1935.52 - lr: 0.000039 - momentum: 0.000000
2023-10-15 17:21:52,809 ----------------------------------------------------------------------------------------------------
2023-10-15 17:21:52,809 EPOCH 3 done: loss 0.1081 - lr: 0.000039
2023-10-15 17:22:01,270 DEV : loss 0.18563616275787354 - f1-score (micro avg) 0.3485
2023-10-15 17:22:01,316 saving best model
2023-10-15 17:22:03,535 ----------------------------------------------------------------------------------------------------
2023-10-15 17:22:22,669 epoch 4 - iter 260/2606 - loss 0.07658858 - time (sec): 19.13 - samples/sec: 1936.28 - lr: 0.000038 - momentum: 0.000000
2023-10-15 17:22:42,190 epoch 4 - iter 520/2606 - loss 0.08210827 - time (sec): 38.65 - samples/sec: 1943.33 - lr: 0.000038 - momentum: 0.000000
2023-10-15 17:23:01,175 epoch 4 - iter 780/2606 - loss 0.08206491 - time (sec): 57.64 - samples/sec: 1925.85 - lr: 0.000037 - momentum: 0.000000
2023-10-15 17:23:20,345 epoch 4 - iter 1040/2606 - loss 0.08053312 - time (sec): 76.81 - samples/sec: 1916.75 - lr: 0.000037 - momentum: 0.000000
2023-10-15 17:23:39,237 epoch 4 - iter 1300/2606 - loss 0.08024248 - time (sec): 95.70 - samples/sec: 1916.94 - lr: 0.000036 - momentum: 0.000000
2023-10-15 17:23:58,673 epoch 4 - iter 1560/2606 - loss 0.08189957 - time (sec): 115.13 - samples/sec: 1923.89 - lr: 0.000036 - momentum: 0.000000
2023-10-15 17:24:18,116 epoch 4 - iter 1820/2606 - loss 0.08169910 - time (sec): 134.58 - samples/sec: 1922.71 - lr: 0.000035 - momentum: 0.000000
2023-10-15 17:24:37,989 epoch 4 - iter 2080/2606 - loss 0.08278118 - time (sec): 154.45 - samples/sec: 1904.84 - lr: 0.000034 - momentum: 0.000000
2023-10-15 17:24:56,294 epoch 4 - iter 2340/2606 - loss 0.08182834 - time (sec): 172.76 - samples/sec: 1905.38 - lr: 0.000034 - momentum: 0.000000
2023-10-15 17:25:15,512 epoch 4 - iter 2600/2606 - loss 0.08143557 - time (sec): 191.97 - samples/sec: 1910.29 - lr: 0.000033 - momentum: 0.000000
2023-10-15 17:25:15,865 ----------------------------------------------------------------------------------------------------
2023-10-15 17:25:15,865 EPOCH 4 done: loss 0.0813 - lr: 0.000033
2023-10-15 17:25:24,202 DEV : loss 0.2632576823234558 - f1-score (micro avg) 0.3843
2023-10-15 17:25:24,237 saving best model
2023-10-15 17:25:24,727 ----------------------------------------------------------------------------------------------------
2023-10-15 17:25:45,899 epoch 5 - iter 260/2606 - loss 0.05364757 - time (sec): 21.17 - samples/sec: 1882.52 - lr: 0.000033 - momentum: 0.000000
2023-10-15 17:26:04,238 epoch 5 - iter 520/2606 - loss 0.05447531 - time (sec): 39.51 - samples/sec: 1922.77 - lr: 0.000032 - momentum: 0.000000
2023-10-15 17:26:23,473 epoch 5 - iter 780/2606 - loss 0.05594421 - time (sec): 58.74 - samples/sec: 1892.98 - lr: 0.000032 - momentum: 0.000000
2023-10-15 17:26:43,477 epoch 5 - iter 1040/2606 - loss 0.05342593 - time (sec): 78.75 - samples/sec: 1903.69 - lr: 0.000031 - momentum: 0.000000
2023-10-15 17:27:02,033 epoch 5 - iter 1300/2606 - loss 0.05823903 - time (sec): 97.30 - samples/sec: 1920.69 - lr: 0.000031 - momentum: 0.000000
2023-10-15 17:27:20,738 epoch 5 - iter 1560/2606 - loss 0.05721454 - time (sec): 116.01 - samples/sec: 1913.24 - lr: 0.000030 - momentum: 0.000000
2023-10-15 17:27:39,173 epoch 5 - iter 1820/2606 - loss 0.05790703 - time (sec): 134.44 - samples/sec: 1906.47 - lr: 0.000029 - momentum: 0.000000
2023-10-15 17:27:58,531 epoch 5 - iter 2080/2606 - loss 0.05849359 - time (sec): 153.80 - samples/sec: 1908.55 - lr: 0.000029 - momentum: 0.000000
2023-10-15 17:28:17,095 epoch 5 - iter 2340/2606 - loss 0.05875878 - time (sec): 172.37 - samples/sec: 1914.33 - lr: 0.000028 - momentum: 0.000000
2023-10-15 17:28:36,319 epoch 5 - iter 2600/2606 - loss 0.05959147 - time (sec): 191.59 - samples/sec: 1913.86 - lr: 0.000028 - momentum: 0.000000
2023-10-15 17:28:36,730 ----------------------------------------------------------------------------------------------------
2023-10-15 17:28:36,730 EPOCH 5 done: loss 0.0596 - lr: 0.000028
2023-10-15 17:28:44,981 DEV : loss 0.2743144631385803 - f1-score (micro avg) 0.3518
2023-10-15 17:28:45,008 ----------------------------------------------------------------------------------------------------
2023-10-15 17:29:03,660 epoch 6 - iter 260/2606 - loss 0.03453990 - time (sec): 18.65 - samples/sec: 1920.29 - lr: 0.000027 - momentum: 0.000000
2023-10-15 17:29:22,853 epoch 6 - iter 520/2606 - loss 0.04058561 - time (sec): 37.84 - samples/sec: 1962.50 - lr: 0.000027 - momentum: 0.000000
2023-10-15 17:29:42,468 epoch 6 - iter 780/2606 - loss 0.04199312 - time (sec): 57.46 - samples/sec: 1951.09 - lr: 0.000026 - momentum: 0.000000
2023-10-15 17:30:01,376 epoch 6 - iter 1040/2606 - loss 0.04376216 - time (sec): 76.37 - samples/sec: 1942.05 - lr: 0.000026 - momentum: 0.000000
2023-10-15 17:30:19,228 epoch 6 - iter 1300/2606 - loss 0.04610481 - time (sec): 94.22 - samples/sec: 1931.70 - lr: 0.000025 - momentum: 0.000000
2023-10-15 17:30:40,254 epoch 6 - iter 1560/2606 - loss 0.04504652 - time (sec): 115.24 - samples/sec: 1915.81 - lr: 0.000024 - momentum: 0.000000
2023-10-15 17:30:58,796 epoch 6 - iter 1820/2606 - loss 0.04499603 - time (sec): 133.79 - samples/sec: 1915.50 - lr: 0.000024 - momentum: 0.000000
2023-10-15 17:31:18,595 epoch 6 - iter 2080/2606 - loss 0.04487564 - time (sec): 153.59 - samples/sec: 1912.99 - lr: 0.000023 - momentum: 0.000000
2023-10-15 17:31:37,219 epoch 6 - iter 2340/2606 - loss 0.04366420 - time (sec): 172.21 - samples/sec: 1917.65 - lr: 0.000023 - momentum: 0.000000
2023-10-15 17:31:55,917 epoch 6 - iter 2600/2606 - loss 0.04322832 - time (sec): 190.91 - samples/sec: 1918.66 - lr: 0.000022 - momentum: 0.000000
2023-10-15 17:31:56,494 ----------------------------------------------------------------------------------------------------
2023-10-15 17:31:56,494 EPOCH 6 done: loss 0.0432 - lr: 0.000022
2023-10-15 17:32:04,745 DEV : loss 0.3391795754432678 - f1-score (micro avg) 0.3928
2023-10-15 17:32:04,773 saving best model
2023-10-15 17:32:05,262 ----------------------------------------------------------------------------------------------------
2023-10-15 17:32:23,978 epoch 7 - iter 260/2606 - loss 0.03643105 - time (sec): 18.71 - samples/sec: 1804.68 - lr: 0.000022 - momentum: 0.000000
2023-10-15 17:32:43,242 epoch 7 - iter 520/2606 - loss 0.03438457 - time (sec): 37.98 - samples/sec: 1866.92 - lr: 0.000021 - momentum: 0.000000
2023-10-15 17:33:01,573 epoch 7 - iter 780/2606 - loss 0.03467784 - time (sec): 56.31 - samples/sec: 1857.31 - lr: 0.000021 - momentum: 0.000000
2023-10-15 17:33:20,953 epoch 7 - iter 1040/2606 - loss 0.03271919 - time (sec): 75.69 - samples/sec: 1881.72 - lr: 0.000020 - momentum: 0.000000
2023-10-15 17:33:40,723 epoch 7 - iter 1300/2606 - loss 0.03405536 - time (sec): 95.46 - samples/sec: 1891.86 - lr: 0.000019 - momentum: 0.000000
2023-10-15 17:33:59,097 epoch 7 - iter 1560/2606 - loss 0.03411291 - time (sec): 113.83 - samples/sec: 1903.15 - lr: 0.000019 - momentum: 0.000000
2023-10-15 17:34:18,022 epoch 7 - iter 1820/2606 - loss 0.03486039 - time (sec): 132.76 - samples/sec: 1920.98 - lr: 0.000018 - momentum: 0.000000
2023-10-15 17:34:36,899 epoch 7 - iter 2080/2606 - loss 0.03442730 - time (sec): 151.64 - samples/sec: 1927.00 - lr: 0.000018 - momentum: 0.000000
2023-10-15 17:34:57,116 epoch 7 - iter 2340/2606 - loss 0.03334037 - time (sec): 171.85 - samples/sec: 1912.03 - lr: 0.000017 - momentum: 0.000000
2023-10-15 17:35:16,378 epoch 7 - iter 2600/2606 - loss 0.03422553 - time (sec): 191.11 - samples/sec: 1918.88 - lr: 0.000017 - momentum: 0.000000
2023-10-15 17:35:16,884 ----------------------------------------------------------------------------------------------------
2023-10-15 17:35:16,884 EPOCH 7 done: loss 0.0343 - lr: 0.000017
2023-10-15 17:35:25,272 DEV : loss 0.3687325119972229 - f1-score (micro avg) 0.3729
2023-10-15 17:35:25,317 ----------------------------------------------------------------------------------------------------
2023-10-15 17:35:45,736 epoch 8 - iter 260/2606 - loss 0.02369143 - time (sec): 20.42 - samples/sec: 1926.18 - lr: 0.000016 - momentum: 0.000000
2023-10-15 17:36:05,692 epoch 8 - iter 520/2606 - loss 0.02091381 - time (sec): 40.37 - samples/sec: 1915.24 - lr: 0.000016 - momentum: 0.000000
2023-10-15 17:36:25,235 epoch 8 - iter 780/2606 - loss 0.02276536 - time (sec): 59.92 - samples/sec: 1915.08 - lr: 0.000015 - momentum: 0.000000
2023-10-15 17:36:43,567 epoch 8 - iter 1040/2606 - loss 0.02383309 - time (sec): 78.25 - samples/sec: 1913.80 - lr: 0.000014 - momentum: 0.000000
2023-10-15 17:37:02,312 epoch 8 - iter 1300/2606 - loss 0.02346249 - time (sec): 96.99 - samples/sec: 1907.47 - lr: 0.000014 - momentum: 0.000000
2023-10-15 17:37:21,147 epoch 8 - iter 1560/2606 - loss 0.02319510 - time (sec): 115.83 - samples/sec: 1908.88 - lr: 0.000013 - momentum: 0.000000
2023-10-15 17:37:40,866 epoch 8 - iter 1820/2606 - loss 0.02286996 - time (sec): 135.55 - samples/sec: 1909.35 - lr: 0.000013 - momentum: 0.000000
2023-10-15 17:37:59,177 epoch 8 - iter 2080/2606 - loss 0.02346536 - time (sec): 153.86 - samples/sec: 1906.92 - lr: 0.000012 - momentum: 0.000000
2023-10-15 17:38:18,146 epoch 8 - iter 2340/2606 - loss 0.02312509 - time (sec): 172.83 - samples/sec: 1909.04 - lr: 0.000012 - momentum: 0.000000
2023-10-15 17:38:36,335 epoch 8 - iter 2600/2606 - loss 0.02326902 - time (sec): 191.02 - samples/sec: 1916.84 - lr: 0.000011 - momentum: 0.000000
2023-10-15 17:38:36,880 ----------------------------------------------------------------------------------------------------
2023-10-15 17:38:36,880 EPOCH 8 done: loss 0.0232 - lr: 0.000011
2023-10-15 17:38:45,922 DEV : loss 0.4457707703113556 - f1-score (micro avg) 0.3707
2023-10-15 17:38:45,950 ----------------------------------------------------------------------------------------------------
2023-10-15 17:39:04,477 epoch 9 - iter 260/2606 - loss 0.01712180 - time (sec): 18.53 - samples/sec: 1918.33 - lr: 0.000011 - momentum: 0.000000
2023-10-15 17:39:23,052 epoch 9 - iter 520/2606 - loss 0.01713874 - time (sec): 37.10 - samples/sec: 1927.72 - lr: 0.000010 - momentum: 0.000000
2023-10-15 17:39:41,699 epoch 9 - iter 780/2606 - loss 0.01649312 - time (sec): 55.75 - samples/sec: 1939.85 - lr: 0.000009 - momentum: 0.000000
2023-10-15 17:40:00,707 epoch 9 - iter 1040/2606 - loss 0.01656224 - time (sec): 74.76 - samples/sec: 1931.44 - lr: 0.000009 - momentum: 0.000000
2023-10-15 17:40:19,694 epoch 9 - iter 1300/2606 - loss 0.01693325 - time (sec): 93.74 - samples/sec: 1918.88 - lr: 0.000008 - momentum: 0.000000
2023-10-15 17:40:39,415 epoch 9 - iter 1560/2606 - loss 0.01641210 - time (sec): 113.46 - samples/sec: 1928.31 - lr: 0.000008 - momentum: 0.000000
2023-10-15 17:40:57,945 epoch 9 - iter 1820/2606 - loss 0.01612644 - time (sec): 131.99 - samples/sec: 1933.37 - lr: 0.000007 - momentum: 0.000000
2023-10-15 17:41:16,629 epoch 9 - iter 2080/2606 - loss 0.01614170 - time (sec): 150.68 - samples/sec: 1929.74 - lr: 0.000007 - momentum: 0.000000
2023-10-15 17:41:36,069 epoch 9 - iter 2340/2606 - loss 0.01580469 - time (sec): 170.12 - samples/sec: 1933.42 - lr: 0.000006 - momentum: 0.000000
2023-10-15 17:41:55,657 epoch 9 - iter 2600/2606 - loss 0.01557134 - time (sec): 189.71 - samples/sec: 1933.00 - lr: 0.000006 - momentum: 0.000000
2023-10-15 17:41:56,016 ----------------------------------------------------------------------------------------------------
2023-10-15 17:41:56,016 EPOCH 9 done: loss 0.0156 - lr: 0.000006
2023-10-15 17:42:05,154 DEV : loss 0.46648621559143066 - f1-score (micro avg) 0.3789
2023-10-15 17:42:05,182 ----------------------------------------------------------------------------------------------------
2023-10-15 17:42:23,987 epoch 10 - iter 260/2606 - loss 0.01270944 - time (sec): 18.80 - samples/sec: 1908.91 - lr: 0.000005 - momentum: 0.000000
2023-10-15 17:42:43,038 epoch 10 - iter 520/2606 - loss 0.01205548 - time (sec): 37.85 - samples/sec: 1934.15 - lr: 0.000004 - momentum: 0.000000
2023-10-15 17:43:01,852 epoch 10 - iter 780/2606 - loss 0.01025038 - time (sec): 56.67 - samples/sec: 1948.42 - lr: 0.000004 - momentum: 0.000000
2023-10-15 17:43:21,081 epoch 10 - iter 1040/2606 - loss 0.00978047 - time (sec): 75.90 - samples/sec: 1931.66 - lr: 0.000003 - momentum: 0.000000
2023-10-15 17:43:41,232 epoch 10 - iter 1300/2606 - loss 0.00973465 - time (sec): 96.05 - samples/sec: 1930.85 - lr: 0.000003 - momentum: 0.000000
2023-10-15 17:44:00,420 epoch 10 - iter 1560/2606 - loss 0.00965036 - time (sec): 115.24 - samples/sec: 1923.94 - lr: 0.000002 - momentum: 0.000000
2023-10-15 17:44:19,182 epoch 10 - iter 1820/2606 - loss 0.00972466 - time (sec): 134.00 - samples/sec: 1927.01 - lr: 0.000002 - momentum: 0.000000
2023-10-15 17:44:37,598 epoch 10 - iter 2080/2606 - loss 0.01031159 - time (sec): 152.41 - samples/sec: 1925.30 - lr: 0.000001 - momentum: 0.000000
2023-10-15 17:44:56,824 epoch 10 - iter 2340/2606 - loss 0.01023345 - time (sec): 171.64 - samples/sec: 1917.64 - lr: 0.000001 - momentum: 0.000000
2023-10-15 17:45:16,466 epoch 10 - iter 2600/2606 - loss 0.01014657 - time (sec): 191.28 - samples/sec: 1917.45 - lr: 0.000000 - momentum: 0.000000
2023-10-15 17:45:16,848 ----------------------------------------------------------------------------------------------------
2023-10-15 17:45:16,848 EPOCH 10 done: loss 0.0102 - lr: 0.000000
2023-10-15 17:45:26,009 DEV : loss 0.44499194622039795 - f1-score (micro avg) 0.3941
2023-10-15 17:45:26,057 saving best model
2023-10-15 17:45:27,099 ----------------------------------------------------------------------------------------------------
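The "saving best model" lines above follow the dev micro-F1 reported after each epoch. A minimal sketch of that selection rule, assuming a checkpoint is written whenever the dev score strictly exceeds the best score so far (the helper name `checkpoint_epochs` and the strict comparison are assumptions; the log contains no ties that would distinguish `>` from `>=`):

```python
# Dev micro-F1 per epoch, copied from the DEV lines of this log.
dev_f1 = [0.2904, 0.3155, 0.3485, 0.3843, 0.3518,
          0.3928, 0.3729, 0.3707, 0.3789, 0.3941]


def checkpoint_epochs(scores):
    """Hypothetical helper: 1-based epochs at which a new best score is reached."""
    best = float("-inf")
    saved = []
    for epoch, score in enumerate(scores, start=1):
        if score > best:  # assumption: strict improvement triggers a save
            best = score
            saved.append(epoch)
    return saved


print(checkpoint_epochs(dev_f1))
```

With the scores above this yields epochs 1, 2, 3, 4, 6 and 10, matching the epochs after which the log reports "saving best model". Note that the dev loss rises over the same period (0.152 after epoch 1, 0.445 after epoch 10), so selection here is driven by F1 rather than loss.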
2023-10-15 17:45:27,100 Loading model from best epoch ...
2023-10-15 17:45:28,580 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 17:45:44,396
Results:
- F-score (micro) 0.466
- F-score (macro) 0.3223
- Accuracy 0.3071
By class:
              precision    recall  f1-score   support

         LOC     0.4869    0.5519    0.5174      1214
         PER     0.4213    0.5037    0.4589       808
         ORG     0.3062    0.3201    0.3130       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4380    0.4979    0.4660      2390
   macro avg     0.3036    0.3439    0.3223      2390
weighted avg     0.4350    0.4979    0.4642      2390
2023-10-15 17:45:44,396 ----------------------------------------------------------------------------------------------------
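The three averaged F1 rows in the final report can be cross-checked from the other numbers in the same table. This is a small sketch using the standard definitions (F1 as the harmonic mean of precision and recall, macro as the unweighted mean over classes, weighted as the support-weighted mean); the variable names are illustrative, and all inputs are the rounded values printed in the log, so agreement is only expected to about four decimal places.

```python
# (precision, recall, f1, support) per class, copied from the report above.
per_class = {
    "LOC": (0.4869, 0.5519, 0.5174, 1214),
    "PER": (0.4213, 0.5037, 0.4589, 808),
    "ORG": (0.3062, 0.3201, 0.3130, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(n for _, _, _, n in per_class.values())

# Micro F1 from the logged micro-averaged precision and recall.
micro_f1 = f1(0.4380, 0.4979)
# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)
# Weighted F1: per-class F1 weighted by support.
weighted_f1 = sum(f * n for _, _, f, n in per_class.values()) / total_support

print(f"micro {micro_f1:.4f}  macro {macro_f1:.4f}  weighted {weighted_f1:.4f}")
```

The macro average is pulled well below the micro average by the HumanProd class, which has only 15 test instances and an F1 of 0.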