2023-10-15 17:12:09,248 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 17:12:09,249 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 17:12:09,249 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 Train: 20847 sentences
2023-10-15 17:12:09,249 (train_with_dev=False, train_with_test=False)
2023-10-15 17:12:09,249 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 Training Params:
2023-10-15 17:12:09,249  - learning_rate: "5e-05"
2023-10-15 17:12:09,249  - mini_batch_size: "8"
2023-10-15 17:12:09,249  - max_epochs: "10"
2023-10-15 17:12:09,249  - shuffle: "True"
2023-10-15 17:12:09,249 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 Plugins:
2023-10-15 17:12:09,249  - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 17:12:09,249 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 17:12:09,249  - metric: "('micro avg', 'f1-score')"
2023-10-15 17:12:09,249 ----------------------------------------------------------------------------------------------------
2023-10-15 17:12:09,249 Computation:
2023-10-15 17:12:09,249  - compute on device: cuda:0
2023-10-15 17:12:09,249  - embedding storage: none
2023-10-15 17:12:09,250 ----------------------------------------------------------------------------------------------------
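The configuration above can be reproduced with a short Flair script. The following is a minimal sketch inferred from the logged architecture, run name, and training parameters; the original hmbench training script is not part of this log, and the NER_HIPE_2022 loader arguments in particular are an assumption.

```python
# Sketch of the logged setup, NOT the original hmbench script.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Assumption: loader arguments for the German NewsEye split of HIPE-2022.
corpus = NER_HIPE_2022(dataset_name="newseye", language="de")

# "poolingfirst" and "layers-1" in the run name suggest first-subtoken
# pooling over the last transformer layer, with fine-tuning enabled.
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-cased",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

label_dict = corpus.make_label_dictionary(label_type="ner")

# "crfFalse" in the run name and the printed module list imply a plain
# linear head (768 -> 17 tags) with no CRF, RNN, or reprojection layer.
tagger = SequenceTagger(
    hidden_size=256,  # unused when use_rnn=False; kept for the constructor
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=5e-05,  # matches Training Params above
    mini_batch_size=8,
    max_epochs=10,
)  # fine_tune defaults to a linear schedule with 10% warmup, as logged under Plugins
```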
"hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3" 2023-10-15 17:12:09,250 ---------------------------------------------------------------------------------------------------- 2023-10-15 17:12:09,250 ---------------------------------------------------------------------------------------------------- 2023-10-15 17:12:27,851 epoch 1 - iter 260/2606 - loss 1.50266157 - time (sec): 18.60 - samples/sec: 1874.61 - lr: 0.000005 - momentum: 0.000000 2023-10-15 17:12:46,100 epoch 1 - iter 520/2606 - loss 0.98485226 - time (sec): 36.85 - samples/sec: 1923.74 - lr: 0.000010 - momentum: 0.000000 2023-10-15 17:13:05,208 epoch 1 - iter 780/2606 - loss 0.74065196 - time (sec): 55.96 - samples/sec: 1931.59 - lr: 0.000015 - momentum: 0.000000 2023-10-15 17:13:24,603 epoch 1 - iter 1040/2606 - loss 0.61022759 - time (sec): 75.35 - samples/sec: 1930.01 - lr: 0.000020 - momentum: 0.000000 2023-10-15 17:13:43,557 epoch 1 - iter 1300/2606 - loss 0.53605747 - time (sec): 94.31 - samples/sec: 1935.21 - lr: 0.000025 - momentum: 0.000000 2023-10-15 17:14:03,940 epoch 1 - iter 1560/2606 - loss 0.47601742 - time (sec): 114.69 - samples/sec: 1936.25 - lr: 0.000030 - momentum: 0.000000 2023-10-15 17:14:22,308 epoch 1 - iter 1820/2606 - loss 0.43680600 - time (sec): 133.06 - samples/sec: 1943.35 - lr: 0.000035 - momentum: 0.000000 2023-10-15 17:14:40,531 epoch 1 - iter 2080/2606 - loss 0.40770913 - time (sec): 151.28 - samples/sec: 1949.84 - lr: 0.000040 - momentum: 0.000000 2023-10-15 17:14:58,591 epoch 1 - iter 2340/2606 - loss 0.38693475 - time (sec): 169.34 - samples/sec: 1941.41 - lr: 0.000045 - momentum: 0.000000 2023-10-15 17:15:18,047 epoch 1 - iter 2600/2606 - loss 0.36558100 - time (sec): 188.80 - samples/sec: 1942.74 - lr: 0.000050 - momentum: 0.000000 2023-10-15 17:15:18,426 ---------------------------------------------------------------------------------------------------- 2023-10-15 17:15:18,426 EPOCH 1 done: loss 0.3652 - lr: 0.000050 2023-10-15 17:15:24,131 DEV : loss 0.15214306116104126 - f1-score (micro avg) 0.2904 2023-10-15 17:15:24,157 saving best model 2023-10-15 17:15:24,554 ---------------------------------------------------------------------------------------------------- 2023-10-15 17:15:42,495 epoch 2 - iter 260/2606 - loss 0.21425752 - time (sec): 17.94 - samples/sec: 1856.99 - lr: 0.000049 - momentum: 0.000000 2023-10-15 17:16:01,095 epoch 2 - iter 520/2606 - loss 0.18763868 - time (sec): 36.54 - samples/sec: 1924.91 - lr: 0.000049 - momentum: 0.000000 2023-10-15 17:16:19,351 epoch 2 - iter 780/2606 - loss 0.18203253 - time (sec): 54.80 - samples/sec: 1930.66 - lr: 0.000048 - momentum: 0.000000 2023-10-15 17:16:38,367 epoch 2 - iter 1040/2606 - loss 0.17096932 - time (sec): 73.81 - samples/sec: 1936.80 - lr: 0.000048 - momentum: 0.000000 2023-10-15 17:16:56,653 epoch 2 - iter 1300/2606 - loss 0.16919457 - time (sec): 92.10 - samples/sec: 1939.99 - lr: 0.000047 - momentum: 0.000000 2023-10-15 17:17:16,519 epoch 2 - iter 1560/2606 - loss 0.16758215 - time (sec): 111.96 - samples/sec: 1947.98 - lr: 0.000047 - momentum: 0.000000 2023-10-15 17:17:35,476 epoch 2 - iter 1820/2606 - loss 0.16614397 - time (sec): 130.92 - samples/sec: 1941.10 - lr: 0.000046 - momentum: 0.000000 2023-10-15 17:17:54,052 epoch 2 - iter 2080/2606 - loss 0.16420431 - time (sec): 149.50 - samples/sec: 1942.57 - lr: 0.000046 - momentum: 0.000000 2023-10-15 17:18:13,424 epoch 2 - iter 2340/2606 - loss 0.16007055 - time (sec): 168.87 - samples/sec: 1941.56 
2023-10-15 17:15:42,495 epoch 2 - iter 260/2606 - loss 0.21425752 - time (sec): 17.94 - samples/sec: 1856.99 - lr: 0.000049 - momentum: 0.000000
2023-10-15 17:16:01,095 epoch 2 - iter 520/2606 - loss 0.18763868 - time (sec): 36.54 - samples/sec: 1924.91 - lr: 0.000049 - momentum: 0.000000
2023-10-15 17:16:19,351 epoch 2 - iter 780/2606 - loss 0.18203253 - time (sec): 54.80 - samples/sec: 1930.66 - lr: 0.000048 - momentum: 0.000000
2023-10-15 17:16:38,367 epoch 2 - iter 1040/2606 - loss 0.17096932 - time (sec): 73.81 - samples/sec: 1936.80 - lr: 0.000048 - momentum: 0.000000
2023-10-15 17:16:56,653 epoch 2 - iter 1300/2606 - loss 0.16919457 - time (sec): 92.10 - samples/sec: 1939.99 - lr: 0.000047 - momentum: 0.000000
2023-10-15 17:17:16,519 epoch 2 - iter 1560/2606 - loss 0.16758215 - time (sec): 111.96 - samples/sec: 1947.98 - lr: 0.000047 - momentum: 0.000000
2023-10-15 17:17:35,476 epoch 2 - iter 1820/2606 - loss 0.16614397 - time (sec): 130.92 - samples/sec: 1941.10 - lr: 0.000046 - momentum: 0.000000
2023-10-15 17:17:54,052 epoch 2 - iter 2080/2606 - loss 0.16420431 - time (sec): 149.50 - samples/sec: 1942.57 - lr: 0.000046 - momentum: 0.000000
2023-10-15 17:18:13,424 epoch 2 - iter 2340/2606 - loss 0.16007055 - time (sec): 168.87 - samples/sec: 1941.56 - lr: 0.000045 - momentum: 0.000000
2023-10-15 17:18:33,096 epoch 2 - iter 2600/2606 - loss 0.15826879 - time (sec): 188.54 - samples/sec: 1942.06 - lr: 0.000044 - momentum: 0.000000
2023-10-15 17:18:33,596 ----------------------------------------------------------------------------------------------------
2023-10-15 17:18:33,596 EPOCH 2 done: loss 0.1580 - lr: 0.000044
2023-10-15 17:18:42,526 DEV : loss 0.1702265739440918 - f1-score (micro avg) 0.3155
2023-10-15 17:18:42,552 saving best model
2023-10-15 17:18:43,006 ----------------------------------------------------------------------------------------------------
2023-10-15 17:19:00,588 epoch 3 - iter 260/2606 - loss 0.11956974 - time (sec): 17.58 - samples/sec: 1906.64 - lr: 0.000044 - momentum: 0.000000
2023-10-15 17:19:18,723 epoch 3 - iter 520/2606 - loss 0.11388643 - time (sec): 35.72 - samples/sec: 1921.25 - lr: 0.000043 - momentum: 0.000000
2023-10-15 17:19:39,576 epoch 3 - iter 780/2606 - loss 0.10643473 - time (sec): 56.57 - samples/sec: 1944.70 - lr: 0.000043 - momentum: 0.000000
2023-10-15 17:19:59,157 epoch 3 - iter 1040/2606 - loss 0.10454978 - time (sec): 76.15 - samples/sec: 1936.60 - lr: 0.000042 - momentum: 0.000000
2023-10-15 17:20:18,167 epoch 3 - iter 1300/2606 - loss 0.10544035 - time (sec): 95.16 - samples/sec: 1939.11 - lr: 0.000042 - momentum: 0.000000
2023-10-15 17:20:36,446 epoch 3 - iter 1560/2606 - loss 0.10643284 - time (sec): 113.44 - samples/sec: 1946.85 - lr: 0.000041 - momentum: 0.000000
2023-10-15 17:20:55,437 epoch 3 - iter 1820/2606 - loss 0.10632511 - time (sec): 132.43 - samples/sec: 1944.58 - lr: 0.000041 - momentum: 0.000000
2023-10-15 17:21:14,086 epoch 3 - iter 2080/2606 - loss 0.10739551 - time (sec): 151.08 - samples/sec: 1936.58 - lr: 0.000040 - momentum: 0.000000
2023-10-15 17:21:32,654 epoch 3 - iter 2340/2606 - loss 0.10835971 - time (sec): 169.65 - samples/sec: 1935.80 - lr: 0.000039 - momentum: 0.000000
2023-10-15 17:21:52,442 epoch 3 - iter 2600/2606 - loss 0.10794681 - time (sec): 189.43 - samples/sec: 1935.52 - lr: 0.000039 - momentum: 0.000000
2023-10-15 17:21:52,809 ----------------------------------------------------------------------------------------------------
2023-10-15 17:21:52,809 EPOCH 3 done: loss 0.1081 - lr: 0.000039
2023-10-15 17:22:01,270 DEV : loss 0.18563616275787354 - f1-score (micro avg) 0.3485
2023-10-15 17:22:01,316 saving best model
2023-10-15 17:22:03,535 ----------------------------------------------------------------------------------------------------
2023-10-15 17:22:22,669 epoch 4 - iter 260/2606 - loss 0.07658858 - time (sec): 19.13 - samples/sec: 1936.28 - lr: 0.000038 - momentum: 0.000000
2023-10-15 17:22:42,190 epoch 4 - iter 520/2606 - loss 0.08210827 - time (sec): 38.65 - samples/sec: 1943.33 - lr: 0.000038 - momentum: 0.000000
2023-10-15 17:23:01,175 epoch 4 - iter 780/2606 - loss 0.08206491 - time (sec): 57.64 - samples/sec: 1925.85 - lr: 0.000037 - momentum: 0.000000
2023-10-15 17:23:20,345 epoch 4 - iter 1040/2606 - loss 0.08053312 - time (sec): 76.81 - samples/sec: 1916.75 - lr: 0.000037 - momentum: 0.000000
2023-10-15 17:23:39,237 epoch 4 - iter 1300/2606 - loss 0.08024248 - time (sec): 95.70 - samples/sec: 1916.94 - lr: 0.000036 - momentum: 0.000000
2023-10-15 17:23:58,673 epoch 4 - iter 1560/2606 - loss 0.08189957 - time (sec): 115.13 - samples/sec: 1923.89 - lr: 0.000036 - momentum: 0.000000
2023-10-15 17:24:18,116 epoch 4 - iter 1820/2606 - loss 0.08169910 - time (sec): 134.58 - samples/sec: 1922.71 - lr: 0.000035 - momentum: 0.000000
2023-10-15 17:24:37,989 epoch 4 - iter 2080/2606 - loss 0.08278118 - time (sec): 154.45 - samples/sec: 1904.84 - lr: 0.000034 - momentum: 0.000000
2023-10-15 17:24:56,294 epoch 4 - iter 2340/2606 - loss 0.08182834 - time (sec): 172.76 - samples/sec: 1905.38 - lr: 0.000034 - momentum: 0.000000
2023-10-15 17:25:15,512 epoch 4 - iter 2600/2606 - loss 0.08143557 - time (sec): 191.97 - samples/sec: 1910.29 - lr: 0.000033 - momentum: 0.000000
2023-10-15 17:25:15,865 ----------------------------------------------------------------------------------------------------
2023-10-15 17:25:15,865 EPOCH 4 done: loss 0.0813 - lr: 0.000033
2023-10-15 17:25:24,202 DEV : loss 0.2632576823234558 - f1-score (micro avg) 0.3843
2023-10-15 17:25:24,237 saving best model
2023-10-15 17:25:24,727 ----------------------------------------------------------------------------------------------------
2023-10-15 17:25:45,899 epoch 5 - iter 260/2606 - loss 0.05364757 - time (sec): 21.17 - samples/sec: 1882.52 - lr: 0.000033 - momentum: 0.000000
2023-10-15 17:26:04,238 epoch 5 - iter 520/2606 - loss 0.05447531 - time (sec): 39.51 - samples/sec: 1922.77 - lr: 0.000032 - momentum: 0.000000
2023-10-15 17:26:23,473 epoch 5 - iter 780/2606 - loss 0.05594421 - time (sec): 58.74 - samples/sec: 1892.98 - lr: 0.000032 - momentum: 0.000000
2023-10-15 17:26:43,477 epoch 5 - iter 1040/2606 - loss 0.05342593 - time (sec): 78.75 - samples/sec: 1903.69 - lr: 0.000031 - momentum: 0.000000
2023-10-15 17:27:02,033 epoch 5 - iter 1300/2606 - loss 0.05823903 - time (sec): 97.30 - samples/sec: 1920.69 - lr: 0.000031 - momentum: 0.000000
2023-10-15 17:27:20,738 epoch 5 - iter 1560/2606 - loss 0.05721454 - time (sec): 116.01 - samples/sec: 1913.24 - lr: 0.000030 - momentum: 0.000000
2023-10-15 17:27:39,173 epoch 5 - iter 1820/2606 - loss 0.05790703 - time (sec): 134.44 - samples/sec: 1906.47 - lr: 0.000029 - momentum: 0.000000
2023-10-15 17:27:58,531 epoch 5 - iter 2080/2606 - loss 0.05849359 - time (sec): 153.80 - samples/sec: 1908.55 - lr: 0.000029 - momentum: 0.000000
2023-10-15 17:28:17,095 epoch 5 - iter 2340/2606 - loss 0.05875878 - time (sec): 172.37 - samples/sec: 1914.33 - lr: 0.000028 - momentum: 0.000000
2023-10-15 17:28:36,319 epoch 5 - iter 2600/2606 - loss 0.05959147 - time (sec): 191.59 - samples/sec: 1913.86 - lr: 0.000028 - momentum: 0.000000
2023-10-15 17:28:36,730 ----------------------------------------------------------------------------------------------------
2023-10-15 17:28:36,730 EPOCH 5 done: loss 0.0596 - lr: 0.000028
2023-10-15 17:28:44,981 DEV : loss 0.2743144631385803 - f1-score (micro avg) 0.3518
2023-10-15 17:28:45,008 ----------------------------------------------------------------------------------------------------
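Note that "saving best model" is only logged when the dev micro-F1 improves on the best value so far; epoch 5's 0.3518 falls short of the 0.3843 reached in epoch 4, so no checkpoint is written here. Schematically, the selection loop amounts to the following sketch (train_one_epoch and evaluate_dev_f1 are hypothetical helpers, not Flair's actual trainer code):

```python
best_f1 = float("-inf")
for epoch in range(1, max_epochs + 1):
    train_one_epoch(model)            # hypothetical helper
    dev_f1 = evaluate_dev_f1(model)   # hypothetical helper: micro-avg F1 on dev
    if dev_f1 > best_f1:              # only improvements trigger a save
        best_f1 = dev_f1
        model.save("best-model.pt")   # the checkpoint reloaded at the end
```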
2023-10-15 17:29:03,660 epoch 6 - iter 260/2606 - loss 0.03453990 - time (sec): 18.65 - samples/sec: 1920.29 - lr: 0.000027 - momentum: 0.000000
2023-10-15 17:29:22,853 epoch 6 - iter 520/2606 - loss 0.04058561 - time (sec): 37.84 - samples/sec: 1962.50 - lr: 0.000027 - momentum: 0.000000
2023-10-15 17:29:42,468 epoch 6 - iter 780/2606 - loss 0.04199312 - time (sec): 57.46 - samples/sec: 1951.09 - lr: 0.000026 - momentum: 0.000000
2023-10-15 17:30:01,376 epoch 6 - iter 1040/2606 - loss 0.04376216 - time (sec): 76.37 - samples/sec: 1942.05 - lr: 0.000026 - momentum: 0.000000
2023-10-15 17:30:19,228 epoch 6 - iter 1300/2606 - loss 0.04610481 - time (sec): 94.22 - samples/sec: 1931.70 - lr: 0.000025 - momentum: 0.000000
2023-10-15 17:30:40,254 epoch 6 - iter 1560/2606 - loss 0.04504652 - time (sec): 115.24 - samples/sec: 1915.81 - lr: 0.000024 - momentum: 0.000000
2023-10-15 17:30:58,796 epoch 6 - iter 1820/2606 - loss 0.04499603 - time (sec): 133.79 - samples/sec: 1915.50 - lr: 0.000024 - momentum: 0.000000
2023-10-15 17:31:18,595 epoch 6 - iter 2080/2606 - loss 0.04487564 - time (sec): 153.59 - samples/sec: 1912.99 - lr: 0.000023 - momentum: 0.000000
2023-10-15 17:31:37,219 epoch 6 - iter 2340/2606 - loss 0.04366420 - time (sec): 172.21 - samples/sec: 1917.65 - lr: 0.000023 - momentum: 0.000000
2023-10-15 17:31:55,917 epoch 6 - iter 2600/2606 - loss 0.04322832 - time (sec): 190.91 - samples/sec: 1918.66 - lr: 0.000022 - momentum: 0.000000
2023-10-15 17:31:56,494 ----------------------------------------------------------------------------------------------------
2023-10-15 17:31:56,494 EPOCH 6 done: loss 0.0432 - lr: 0.000022
2023-10-15 17:32:04,745 DEV : loss 0.3391795754432678 - f1-score (micro avg) 0.3928
2023-10-15 17:32:04,773 saving best model
2023-10-15 17:32:05,262 ----------------------------------------------------------------------------------------------------
2023-10-15 17:32:23,978 epoch 7 - iter 260/2606 - loss 0.03643105 - time (sec): 18.71 - samples/sec: 1804.68 - lr: 0.000022 - momentum: 0.000000
2023-10-15 17:32:43,242 epoch 7 - iter 520/2606 - loss 0.03438457 - time (sec): 37.98 - samples/sec: 1866.92 - lr: 0.000021 - momentum: 0.000000
2023-10-15 17:33:01,573 epoch 7 - iter 780/2606 - loss 0.03467784 - time (sec): 56.31 - samples/sec: 1857.31 - lr: 0.000021 - momentum: 0.000000
2023-10-15 17:33:20,953 epoch 7 - iter 1040/2606 - loss 0.03271919 - time (sec): 75.69 - samples/sec: 1881.72 - lr: 0.000020 - momentum: 0.000000
2023-10-15 17:33:40,723 epoch 7 - iter 1300/2606 - loss 0.03405536 - time (sec): 95.46 - samples/sec: 1891.86 - lr: 0.000019 - momentum: 0.000000
2023-10-15 17:33:59,097 epoch 7 - iter 1560/2606 - loss 0.03411291 - time (sec): 113.83 - samples/sec: 1903.15 - lr: 0.000019 - momentum: 0.000000
2023-10-15 17:34:18,022 epoch 7 - iter 1820/2606 - loss 0.03486039 - time (sec): 132.76 - samples/sec: 1920.98 - lr: 0.000018 - momentum: 0.000000
2023-10-15 17:34:36,899 epoch 7 - iter 2080/2606 - loss 0.03442730 - time (sec): 151.64 - samples/sec: 1927.00 - lr: 0.000018 - momentum: 0.000000
2023-10-15 17:34:57,116 epoch 7 - iter 2340/2606 - loss 0.03334037 - time (sec): 171.85 - samples/sec: 1912.03 - lr: 0.000017 - momentum: 0.000000
2023-10-15 17:35:16,378 epoch 7 - iter 2600/2606 - loss 0.03422553 - time (sec): 191.11 - samples/sec: 1918.88 - lr: 0.000017 - momentum: 0.000000
2023-10-15 17:35:16,884 ----------------------------------------------------------------------------------------------------
2023-10-15 17:35:16,884 EPOCH 7 done: loss 0.0343 - lr: 0.000017
2023-10-15 17:35:25,272 DEV : loss 0.3687325119972229 - f1-score (micro avg) 0.3729
2023-10-15 17:35:25,317 ----------------------------------------------------------------------------------------------------
2023-10-15 17:35:45,736 epoch 8 - iter 260/2606 - loss 0.02369143 - time (sec): 20.42 - samples/sec: 1926.18 - lr: 0.000016 - momentum: 0.000000
2023-10-15 17:36:05,692 epoch 8 - iter 520/2606 - loss 0.02091381 - time (sec): 40.37 - samples/sec: 1915.24 - lr: 0.000016 - momentum: 0.000000
2023-10-15 17:36:25,235 epoch 8 - iter 780/2606 - loss 0.02276536 - time (sec): 59.92 - samples/sec: 1915.08 - lr: 0.000015 - momentum: 0.000000
2023-10-15 17:36:43,567 epoch 8 - iter 1040/2606 - loss 0.02383309 - time (sec): 78.25 - samples/sec: 1913.80 - lr: 0.000014 - momentum: 0.000000
2023-10-15 17:37:02,312 epoch 8 - iter 1300/2606 - loss 0.02346249 - time (sec): 96.99 - samples/sec: 1907.47 - lr: 0.000014 - momentum: 0.000000
2023-10-15 17:37:21,147 epoch 8 - iter 1560/2606 - loss 0.02319510 - time (sec): 115.83 - samples/sec: 1908.88 - lr: 0.000013 - momentum: 0.000000
2023-10-15 17:37:40,866 epoch 8 - iter 1820/2606 - loss 0.02286996 - time (sec): 135.55 - samples/sec: 1909.35 - lr: 0.000013 - momentum: 0.000000
2023-10-15 17:37:59,177 epoch 8 - iter 2080/2606 - loss 0.02346536 - time (sec): 153.86 - samples/sec: 1906.92 - lr: 0.000012 - momentum: 0.000000
2023-10-15 17:38:18,146 epoch 8 - iter 2340/2606 - loss 0.02312509 - time (sec): 172.83 - samples/sec: 1909.04 - lr: 0.000012 - momentum: 0.000000
2023-10-15 17:38:36,335 epoch 8 - iter 2600/2606 - loss 0.02326902 - time (sec): 191.02 - samples/sec: 1916.84 - lr: 0.000011 - momentum: 0.000000
2023-10-15 17:38:36,880 ----------------------------------------------------------------------------------------------------
2023-10-15 17:38:36,880 EPOCH 8 done: loss 0.0232 - lr: 0.000011
2023-10-15 17:38:45,922 DEV : loss 0.4457707703113556 - f1-score (micro avg) 0.3707
2023-10-15 17:38:45,950 ----------------------------------------------------------------------------------------------------
2023-10-15 17:39:04,477 epoch 9 - iter 260/2606 - loss 0.01712180 - time (sec): 18.53 - samples/sec: 1918.33 - lr: 0.000011 - momentum: 0.000000
2023-10-15 17:39:23,052 epoch 9 - iter 520/2606 - loss 0.01713874 - time (sec): 37.10 - samples/sec: 1927.72 - lr: 0.000010 - momentum: 0.000000
2023-10-15 17:39:41,699 epoch 9 - iter 780/2606 - loss 0.01649312 - time (sec): 55.75 - samples/sec: 1939.85 - lr: 0.000009 - momentum: 0.000000
2023-10-15 17:40:00,707 epoch 9 - iter 1040/2606 - loss 0.01656224 - time (sec): 74.76 - samples/sec: 1931.44 - lr: 0.000009 - momentum: 0.000000
2023-10-15 17:40:19,694 epoch 9 - iter 1300/2606 - loss 0.01693325 - time (sec): 93.74 - samples/sec: 1918.88 - lr: 0.000008 - momentum: 0.000000
2023-10-15 17:40:39,415 epoch 9 - iter 1560/2606 - loss 0.01641210 - time (sec): 113.46 - samples/sec: 1928.31 - lr: 0.000008 - momentum: 0.000000
2023-10-15 17:40:57,945 epoch 9 - iter 1820/2606 - loss 0.01612644 - time (sec): 131.99 - samples/sec: 1933.37 - lr: 0.000007 - momentum: 0.000000
2023-10-15 17:41:16,629 epoch 9 - iter 2080/2606 - loss 0.01614170 - time (sec): 150.68 - samples/sec: 1929.74 - lr: 0.000007 - momentum: 0.000000
2023-10-15 17:41:36,069 epoch 9 - iter 2340/2606 - loss 0.01580469 - time (sec): 170.12 - samples/sec: 1933.42 - lr: 0.000006 - momentum: 0.000000
2023-10-15 17:41:55,657 epoch 9 - iter 2600/2606 - loss 0.01557134 - time (sec): 189.71 - samples/sec: 1933.00 - lr: 0.000006 - momentum: 0.000000
2023-10-15 17:41:56,016 ----------------------------------------------------------------------------------------------------
2023-10-15 17:41:56,016 EPOCH 9 done: loss 0.0156 - lr: 0.000006
2023-10-15 17:42:05,154 DEV : loss 0.46648621559143066 - f1-score (micro avg) 0.3789
2023-10-15 17:42:05,182 ----------------------------------------------------------------------------------------------------
2023-10-15 17:42:23,987 epoch 10 - iter 260/2606 - loss 0.01270944 - time (sec): 18.80 - samples/sec: 1908.91 - lr: 0.000005 - momentum: 0.000000
2023-10-15 17:42:43,038 epoch 10 - iter 520/2606 - loss 0.01205548 - time (sec): 37.85 - samples/sec: 1934.15 - lr: 0.000004 - momentum: 0.000000
2023-10-15 17:43:01,852 epoch 10 - iter 780/2606 - loss 0.01025038 - time (sec): 56.67 - samples/sec: 1948.42 - lr: 0.000004 - momentum: 0.000000
2023-10-15 17:43:21,081 epoch 10 - iter 1040/2606 - loss 0.00978047 - time (sec): 75.90 - samples/sec: 1931.66 - lr: 0.000003 - momentum: 0.000000
2023-10-15 17:43:41,232 epoch 10 - iter 1300/2606 - loss 0.00973465 - time (sec): 96.05 - samples/sec: 1930.85 - lr: 0.000003 - momentum: 0.000000
2023-10-15 17:44:00,420 epoch 10 - iter 1560/2606 - loss 0.00965036 - time (sec): 115.24 - samples/sec: 1923.94 - lr: 0.000002 - momentum: 0.000000
2023-10-15 17:44:19,182 epoch 10 - iter 1820/2606 - loss 0.00972466 - time (sec): 134.00 - samples/sec: 1927.01 - lr: 0.000002 - momentum: 0.000000
2023-10-15 17:44:37,598 epoch 10 - iter 2080/2606 - loss 0.01031159 - time (sec): 152.41 - samples/sec: 1925.30 - lr: 0.000001 - momentum: 0.000000
2023-10-15 17:44:56,824 epoch 10 - iter 2340/2606 - loss 0.01023345 - time (sec): 171.64 - samples/sec: 1917.64 - lr: 0.000001 - momentum: 0.000000
2023-10-15 17:45:16,466 epoch 10 - iter 2600/2606 - loss 0.01014657 - time (sec): 191.28 - samples/sec: 1917.45 - lr: 0.000000 - momentum: 0.000000
2023-10-15 17:45:16,848 ----------------------------------------------------------------------------------------------------
2023-10-15 17:45:16,848 EPOCH 10 done: loss 0.0102 - lr: 0.000000
2023-10-15 17:45:26,009 DEV : loss 0.44499194622039795 - f1-score (micro avg) 0.3941
2023-10-15 17:45:26,057 saving best model
2023-10-15 17:45:27,099 ----------------------------------------------------------------------------------------------------
2023-10-15 17:45:27,100 Loading model from best epoch ...
2023-10-15 17:45:28,580 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 17:45:44,396 Results:
- F-score (micro) 0.466
- F-score (macro) 0.3223
- Accuracy 0.3071

By class:
              precision    recall  f1-score   support

         LOC     0.4869    0.5519    0.5174      1214
         PER     0.4213    0.5037    0.4589       808
         ORG     0.3062    0.3201    0.3130       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4380    0.4979    0.4660      2390
   macro avg     0.3036    0.3439    0.3223      2390
weighted avg     0.4350    0.4979    0.4642      2390

2023-10-15 17:45:44,396 ----------------------------------------------------------------------------------------------------
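The best checkpoint (epoch 10, dev micro-F1 0.3941) reaches 0.4660 micro-F1 on test; the macro average is much lower because the rare HumanProd class (support 15) is never predicted correctly. Loading the saved model for inference should look roughly like the sketch below; the sentence is an arbitrary German example, and the path must point at the best-model.pt written under the training base path above.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint saved as "best-model.pt" during training.
tagger = SequenceTagger.load("best-model.pt")

sentence = Sentence("Der Verein wurde 1887 in Wien gegründet.")
tagger.predict(sentence)

# Entity spans are decoded from the BIOES tag dictionary listed above
# (LOC, PER, ORG, HumanProd).
for span in sentence.get_spans("ner"):
    print(span)
```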