2023-10-15 16:38:09,257 ----------------------------------------------------------------------------------------------------
2023-10-15 16:38:09,258 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
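A back-of-the-envelope parameter count can be read off the module dump above (weights plus biases only, assuming the standard BERT-base shapes shown; this sketch is not part of the original log):

```python
# Shapes taken from the module dump above.
H, FF, VOCAB, MAXPOS, LAYERS, TAGS = 768, 3072, 32001, 512, 12, 17

def linear(i, o):
    return i * o + o                      # weight matrix + bias vector

layernorm = 2 * H                         # gamma + beta

embeddings = VOCAB * H + MAXPOS * H + 2 * H + layernorm
per_layer = (3 * linear(H, H)             # query / key / value
             + linear(H, H) + layernorm   # attention output + LayerNorm
             + linear(H, FF)              # intermediate
             + linear(FF, H) + layernorm) # output + LayerNorm
pooler = linear(H, H)

bert_total = embeddings + LAYERS * per_layer + pooler
tagger_head = linear(H, TAGS)             # the final (linear) 768 -> 17 layer

print(per_layer, bert_total, tagger_head)
```

This gives roughly 110.6M parameters for the BERT encoder and a 13k-parameter tagging head, consistent with a BERT-base backbone.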
2023-10-15 16:38:09,258 ----------------------------------------------------------------------------------------------------
2023-10-15 16:38:09,258 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 16:38:09,258 ----------------------------------------------------------------------------------------------------
2023-10-15 16:38:09,258 Train: 20847 sentences
2023-10-15 16:38:09,258 (train_with_dev=False, train_with_test=False)
2023-10-15 16:38:09,258 ----------------------------------------------------------------------------------------------------
2023-10-15 16:38:09,258 Training Params:
2023-10-15 16:38:09,258 - learning_rate: "3e-05"
2023-10-15 16:38:09,258 - mini_batch_size: "8"
2023-10-15 16:38:09,258 - max_epochs: "10"
2023-10-15 16:38:09,258 - shuffle: "True"
2023-10-15 16:38:09,258 ----------------------------------------------------------------------------------------------------
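As a quick consistency check on the parameters above (an illustrative sketch, not part of the log): 20,847 training sentences at mini_batch_size 8 implies ceil(20847 / 8) batches per epoch, which is where the "iter .../2606" counts below come from.

```python
import math

train_sentences = 20847   # from the corpus summary above
mini_batch_size = 8       # from Training Params
max_epochs = 10

iters_per_epoch = math.ceil(train_sentences / mini_batch_size)
total_steps = iters_per_epoch * max_epochs

print(iters_per_epoch, total_steps)  # -> 2606 26060
```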
2023-10-15 16:38:09,258 Plugins:
2023-10-15 16:38:09,258 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 16:38:09,258 ----------------------------------------------------------------------------------------------------
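The LinearScheduler plugin with warmup_fraction 0.1 can be sketched as below (an assumption about its semantics: a linear ramp from 0 to the base learning rate over the first 10% of steps, then linear decay to 0). The values match the lr column logged during training.

```python
BASE_LR = 3e-05               # learning_rate from Training Params
TOTAL_STEPS = 2606 * 10       # iterations per epoch x max_epochs
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)

def lr_at(step: int) -> float:
    """Linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    return BASE_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Epoch 1, iter 2600 (end of warmup): close to the peak of 3e-05.
print(lr_at(2600))
```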
2023-10-15 16:38:09,258 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 16:38:09,258 - metric: "('micro avg', 'f1-score')"
2023-10-15 16:38:09,258 ----------------------------------------------------------------------------------------------------
2023-10-15 16:38:09,258 Computation:
2023-10-15 16:38:09,258 - compute on device: cuda:0
2023-10-15 16:38:09,258 - embedding storage: none
2023-10-15 16:38:09,258 ----------------------------------------------------------------------------------------------------
2023-10-15 16:38:09,258 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-15 16:38:09,259 ----------------------------------------------------------------------------------------------------
2023-10-15 16:38:09,259 ----------------------------------------------------------------------------------------------------
2023-10-15 16:38:28,045 epoch 1 - iter 260/2606 - loss 1.78304728 - time (sec): 18.79 - samples/sec: 1856.12 - lr: 0.000003 - momentum: 0.000000
2023-10-15 16:38:46,594 epoch 1 - iter 520/2606 - loss 1.15216820 - time (sec): 37.33 - samples/sec: 1898.75 - lr: 0.000006 - momentum: 0.000000
2023-10-15 16:39:05,821 epoch 1 - iter 780/2606 - loss 0.86568725 - time (sec): 56.56 - samples/sec: 1910.94 - lr: 0.000009 - momentum: 0.000000
2023-10-15 16:39:25,235 epoch 1 - iter 1040/2606 - loss 0.70886674 - time (sec): 75.98 - samples/sec: 1914.21 - lr: 0.000012 - momentum: 0.000000
2023-10-15 16:39:44,071 epoch 1 - iter 1300/2606 - loss 0.61350901 - time (sec): 94.81 - samples/sec: 1924.89 - lr: 0.000015 - momentum: 0.000000
2023-10-15 16:40:03,717 epoch 1 - iter 1560/2606 - loss 0.54047028 - time (sec): 114.46 - samples/sec: 1940.17 - lr: 0.000018 - momentum: 0.000000
2023-10-15 16:40:22,096 epoch 1 - iter 1820/2606 - loss 0.49291843 - time (sec): 132.84 - samples/sec: 1946.58 - lr: 0.000021 - momentum: 0.000000
2023-10-15 16:40:40,387 epoch 1 - iter 2080/2606 - loss 0.45739180 - time (sec): 151.13 - samples/sec: 1951.82 - lr: 0.000024 - momentum: 0.000000
2023-10-15 16:40:59,388 epoch 1 - iter 2340/2606 - loss 0.43100594 - time (sec): 170.13 - samples/sec: 1932.42 - lr: 0.000027 - momentum: 0.000000
2023-10-15 16:41:18,986 epoch 1 - iter 2600/2606 - loss 0.40515264 - time (sec): 189.73 - samples/sec: 1933.21 - lr: 0.000030 - momentum: 0.000000
2023-10-15 16:41:19,367 ----------------------------------------------------------------------------------------------------
2023-10-15 16:41:19,367 EPOCH 1 done: loss 0.4047 - lr: 0.000030
2023-10-15 16:41:25,180 DEV : loss 0.1570487767457962 - f1-score (micro avg) 0.2895
2023-10-15 16:41:25,209 saving best model
2023-10-15 16:41:25,591 ----------------------------------------------------------------------------------------------------
2023-10-15 16:41:43,895 epoch 2 - iter 260/2606 - loss 0.20098129 - time (sec): 18.30 - samples/sec: 1820.25 - lr: 0.000030 - momentum: 0.000000
2023-10-15 16:42:02,865 epoch 2 - iter 520/2606 - loss 0.18249185 - time (sec): 37.27 - samples/sec: 1887.05 - lr: 0.000029 - momentum: 0.000000
2023-10-15 16:42:21,212 epoch 2 - iter 780/2606 - loss 0.17192597 - time (sec): 55.62 - samples/sec: 1902.07 - lr: 0.000029 - momentum: 0.000000
2023-10-15 16:42:40,345 epoch 2 - iter 1040/2606 - loss 0.16132827 - time (sec): 74.75 - samples/sec: 1912.44 - lr: 0.000029 - momentum: 0.000000
2023-10-15 16:42:58,644 epoch 2 - iter 1300/2606 - loss 0.16050052 - time (sec): 93.05 - samples/sec: 1920.11 - lr: 0.000028 - momentum: 0.000000
2023-10-15 16:43:18,524 epoch 2 - iter 1560/2606 - loss 0.15792990 - time (sec): 112.93 - samples/sec: 1931.30 - lr: 0.000028 - momentum: 0.000000
2023-10-15 16:43:37,445 epoch 2 - iter 1820/2606 - loss 0.15662757 - time (sec): 131.85 - samples/sec: 1927.40 - lr: 0.000028 - momentum: 0.000000
2023-10-15 16:43:56,056 epoch 2 - iter 2080/2606 - loss 0.15414219 - time (sec): 150.46 - samples/sec: 1930.09 - lr: 0.000027 - momentum: 0.000000
2023-10-15 16:44:15,590 epoch 2 - iter 2340/2606 - loss 0.15078285 - time (sec): 170.00 - samples/sec: 1928.67 - lr: 0.000027 - momentum: 0.000000
2023-10-15 16:44:35,237 epoch 2 - iter 2600/2606 - loss 0.14899030 - time (sec): 189.64 - samples/sec: 1930.76 - lr: 0.000027 - momentum: 0.000000
2023-10-15 16:44:35,739 ----------------------------------------------------------------------------------------------------
2023-10-15 16:44:35,739 EPOCH 2 done: loss 0.1487 - lr: 0.000027
2023-10-15 16:44:44,918 DEV : loss 0.14093472063541412 - f1-score (micro avg) 0.3357
2023-10-15 16:44:44,945 saving best model
2023-10-15 16:44:45,501 ----------------------------------------------------------------------------------------------------
2023-10-15 16:45:03,303 epoch 3 - iter 260/2606 - loss 0.10491536 - time (sec): 17.80 - samples/sec: 1883.64 - lr: 0.000026 - momentum: 0.000000
2023-10-15 16:45:21,796 epoch 3 - iter 520/2606 - loss 0.10122591 - time (sec): 36.29 - samples/sec: 1890.96 - lr: 0.000026 - momentum: 0.000000
2023-10-15 16:45:42,689 epoch 3 - iter 780/2606 - loss 0.09961218 - time (sec): 57.18 - samples/sec: 1923.87 - lr: 0.000026 - momentum: 0.000000
2023-10-15 16:46:02,554 epoch 3 - iter 1040/2606 - loss 0.09635866 - time (sec): 77.05 - samples/sec: 1914.08 - lr: 0.000025 - momentum: 0.000000
2023-10-15 16:46:21,515 epoch 3 - iter 1300/2606 - loss 0.09838515 - time (sec): 96.01 - samples/sec: 1921.98 - lr: 0.000025 - momentum: 0.000000
2023-10-15 16:46:39,827 epoch 3 - iter 1560/2606 - loss 0.09935068 - time (sec): 114.32 - samples/sec: 1931.86 - lr: 0.000025 - momentum: 0.000000
2023-10-15 16:46:58,868 epoch 3 - iter 1820/2606 - loss 0.09839679 - time (sec): 133.36 - samples/sec: 1931.00 - lr: 0.000024 - momentum: 0.000000
2023-10-15 16:47:17,465 epoch 3 - iter 2080/2606 - loss 0.09880177 - time (sec): 151.96 - samples/sec: 1925.39 - lr: 0.000024 - momentum: 0.000000
2023-10-15 16:47:35,970 epoch 3 - iter 2340/2606 - loss 0.09778731 - time (sec): 170.46 - samples/sec: 1926.54 - lr: 0.000024 - momentum: 0.000000
2023-10-15 16:47:55,086 epoch 3 - iter 2600/2606 - loss 0.09692475 - time (sec): 189.58 - samples/sec: 1934.04 - lr: 0.000023 - momentum: 0.000000
2023-10-15 16:47:55,463 ----------------------------------------------------------------------------------------------------
2023-10-15 16:47:55,463 EPOCH 3 done: loss 0.0970 - lr: 0.000023
2023-10-15 16:48:03,673 DEV : loss 0.20114129781723022 - f1-score (micro avg) 0.3372
2023-10-15 16:48:03,699 saving best model
2023-10-15 16:48:04,299 ----------------------------------------------------------------------------------------------------
2023-10-15 16:48:22,997 epoch 4 - iter 260/2606 - loss 0.05906672 - time (sec): 18.70 - samples/sec: 1981.27 - lr: 0.000023 - momentum: 0.000000
2023-10-15 16:48:42,832 epoch 4 - iter 520/2606 - loss 0.06444519 - time (sec): 38.53 - samples/sec: 1949.40 - lr: 0.000023 - momentum: 0.000000
2023-10-15 16:49:01,447 epoch 4 - iter 780/2606 - loss 0.06892697 - time (sec): 57.15 - samples/sec: 1942.37 - lr: 0.000022 - momentum: 0.000000
2023-10-15 16:49:20,367 epoch 4 - iter 1040/2606 - loss 0.06689883 - time (sec): 76.07 - samples/sec: 1935.40 - lr: 0.000022 - momentum: 0.000000
2023-10-15 16:49:39,006 epoch 4 - iter 1300/2606 - loss 0.06644882 - time (sec): 94.71 - samples/sec: 1937.06 - lr: 0.000022 - momentum: 0.000000
2023-10-15 16:49:58,334 epoch 4 - iter 1560/2606 - loss 0.06701638 - time (sec): 114.03 - samples/sec: 1942.46 - lr: 0.000021 - momentum: 0.000000
2023-10-15 16:50:17,146 epoch 4 - iter 1820/2606 - loss 0.06820439 - time (sec): 132.85 - samples/sec: 1947.78 - lr: 0.000021 - momentum: 0.000000
2023-10-15 16:50:36,366 epoch 4 - iter 2080/2606 - loss 0.06885462 - time (sec): 152.07 - samples/sec: 1934.72 - lr: 0.000021 - momentum: 0.000000
2023-10-15 16:50:54,739 epoch 4 - iter 2340/2606 - loss 0.06876316 - time (sec): 170.44 - samples/sec: 1931.29 - lr: 0.000020 - momentum: 0.000000
2023-10-15 16:51:13,995 epoch 4 - iter 2600/2606 - loss 0.06837534 - time (sec): 189.70 - samples/sec: 1933.23 - lr: 0.000020 - momentum: 0.000000
2023-10-15 16:51:14,350 ----------------------------------------------------------------------------------------------------
2023-10-15 16:51:14,350 EPOCH 4 done: loss 0.0687 - lr: 0.000020
2023-10-15 16:51:22,591 DEV : loss 0.30411383509635925 - f1-score (micro avg) 0.3834
2023-10-15 16:51:22,619 saving best model
2023-10-15 16:51:23,272 ----------------------------------------------------------------------------------------------------
2023-10-15 16:51:44,016 epoch 5 - iter 260/2606 - loss 0.05158986 - time (sec): 20.74 - samples/sec: 1921.40 - lr: 0.000020 - momentum: 0.000000
2023-10-15 16:52:02,369 epoch 5 - iter 520/2606 - loss 0.05047142 - time (sec): 39.10 - samples/sec: 1943.13 - lr: 0.000019 - momentum: 0.000000
2023-10-15 16:52:20,621 epoch 5 - iter 780/2606 - loss 0.05164843 - time (sec): 57.35 - samples/sec: 1939.10 - lr: 0.000019 - momentum: 0.000000
2023-10-15 16:52:40,125 epoch 5 - iter 1040/2606 - loss 0.04863786 - time (sec): 76.85 - samples/sec: 1950.69 - lr: 0.000019 - momentum: 0.000000
2023-10-15 16:52:59,496 epoch 5 - iter 1300/2606 - loss 0.05114388 - time (sec): 96.22 - samples/sec: 1942.27 - lr: 0.000018 - momentum: 0.000000
2023-10-15 16:53:18,027 epoch 5 - iter 1560/2606 - loss 0.05030163 - time (sec): 114.75 - samples/sec: 1934.20 - lr: 0.000018 - momentum: 0.000000
2023-10-15 16:53:36,276 epoch 5 - iter 1820/2606 - loss 0.04918969 - time (sec): 133.00 - samples/sec: 1927.14 - lr: 0.000018 - momentum: 0.000000
2023-10-15 16:53:55,482 epoch 5 - iter 2080/2606 - loss 0.04969722 - time (sec): 152.21 - samples/sec: 1928.53 - lr: 0.000017 - momentum: 0.000000
2023-10-15 16:54:13,930 epoch 5 - iter 2340/2606 - loss 0.04985016 - time (sec): 170.66 - samples/sec: 1933.53 - lr: 0.000017 - momentum: 0.000000
2023-10-15 16:54:32,872 epoch 5 - iter 2600/2606 - loss 0.05013012 - time (sec): 189.60 - samples/sec: 1933.97 - lr: 0.000017 - momentum: 0.000000
2023-10-15 16:54:33,288 ----------------------------------------------------------------------------------------------------
2023-10-15 16:54:33,288 EPOCH 5 done: loss 0.0501 - lr: 0.000017
2023-10-15 16:54:41,515 DEV : loss 0.3421619236469269 - f1-score (micro avg) 0.3831
2023-10-15 16:54:41,542 ----------------------------------------------------------------------------------------------------
2023-10-15 16:55:00,271 epoch 6 - iter 260/2606 - loss 0.02836933 - time (sec): 18.73 - samples/sec: 1912.47 - lr: 0.000016 - momentum: 0.000000
2023-10-15 16:55:19,461 epoch 6 - iter 520/2606 - loss 0.03322309 - time (sec): 37.92 - samples/sec: 1958.69 - lr: 0.000016 - momentum: 0.000000
2023-10-15 16:55:39,110 epoch 6 - iter 780/2606 - loss 0.03417720 - time (sec): 57.57 - samples/sec: 1947.43 - lr: 0.000016 - momentum: 0.000000
2023-10-15 16:55:58,123 epoch 6 - iter 1040/2606 - loss 0.03561005 - time (sec): 76.58 - samples/sec: 1936.64 - lr: 0.000015 - momentum: 0.000000
2023-10-15 16:56:15,638 epoch 6 - iter 1300/2606 - loss 0.03797069 - time (sec): 94.09 - samples/sec: 1934.24 - lr: 0.000015 - momentum: 0.000000
2023-10-15 16:56:35,345 epoch 6 - iter 1560/2606 - loss 0.03933476 - time (sec): 113.80 - samples/sec: 1940.11 - lr: 0.000015 - momentum: 0.000000
2023-10-15 16:56:53,573 epoch 6 - iter 1820/2606 - loss 0.03849459 - time (sec): 132.03 - samples/sec: 1940.99 - lr: 0.000014 - momentum: 0.000000
2023-10-15 16:57:14,002 epoch 6 - iter 2080/2606 - loss 0.03816553 - time (sec): 152.46 - samples/sec: 1927.13 - lr: 0.000014 - momentum: 0.000000
2023-10-15 16:57:32,779 epoch 6 - iter 2340/2606 - loss 0.03680409 - time (sec): 171.24 - samples/sec: 1928.56 - lr: 0.000014 - momentum: 0.000000
2023-10-15 16:57:51,496 epoch 6 - iter 2600/2606 - loss 0.03661753 - time (sec): 189.95 - samples/sec: 1928.30 - lr: 0.000013 - momentum: 0.000000
2023-10-15 16:57:52,077 ----------------------------------------------------------------------------------------------------
2023-10-15 16:57:52,077 EPOCH 6 done: loss 0.0367 - lr: 0.000013
2023-10-15 16:58:00,492 DEV : loss 0.40107467770576477 - f1-score (micro avg) 0.3727
2023-10-15 16:58:00,522 ----------------------------------------------------------------------------------------------------
2023-10-15 16:58:18,805 epoch 7 - iter 260/2606 - loss 0.02807878 - time (sec): 18.28 - samples/sec: 1847.43 - lr: 0.000013 - momentum: 0.000000
2023-10-15 16:58:37,987 epoch 7 - iter 520/2606 - loss 0.02940039 - time (sec): 37.46 - samples/sec: 1892.59 - lr: 0.000013 - momentum: 0.000000
2023-10-15 16:58:56,264 epoch 7 - iter 780/2606 - loss 0.02822614 - time (sec): 55.74 - samples/sec: 1876.29 - lr: 0.000012 - momentum: 0.000000
2023-10-15 16:59:15,716 epoch 7 - iter 1040/2606 - loss 0.02631228 - time (sec): 75.19 - samples/sec: 1894.14 - lr: 0.000012 - momentum: 0.000000
2023-10-15 16:59:35,276 epoch 7 - iter 1300/2606 - loss 0.02795012 - time (sec): 94.75 - samples/sec: 1905.97 - lr: 0.000012 - momentum: 0.000000
2023-10-15 16:59:53,587 epoch 7 - iter 1560/2606 - loss 0.02688673 - time (sec): 113.06 - samples/sec: 1916.11 - lr: 0.000011 - momentum: 0.000000
2023-10-15 17:00:12,419 epoch 7 - iter 1820/2606 - loss 0.02730602 - time (sec): 131.90 - samples/sec: 1933.54 - lr: 0.000011 - momentum: 0.000000
2023-10-15 17:00:31,363 epoch 7 - iter 2080/2606 - loss 0.02716500 - time (sec): 150.84 - samples/sec: 1937.16 - lr: 0.000011 - momentum: 0.000000
2023-10-15 17:00:50,648 epoch 7 - iter 2340/2606 - loss 0.02641485 - time (sec): 170.12 - samples/sec: 1931.45 - lr: 0.000010 - momentum: 0.000000
2023-10-15 17:01:10,567 epoch 7 - iter 2600/2606 - loss 0.02685152 - time (sec): 190.04 - samples/sec: 1929.68 - lr: 0.000010 - momentum: 0.000000
2023-10-15 17:01:11,036 ----------------------------------------------------------------------------------------------------
2023-10-15 17:01:11,036 EPOCH 7 done: loss 0.0268 - lr: 0.000010
2023-10-15 17:01:19,242 DEV : loss 0.36185500025749207 - f1-score (micro avg) 0.4043
2023-10-15 17:01:19,273 saving best model
2023-10-15 17:01:19,919 ----------------------------------------------------------------------------------------------------
2023-10-15 17:01:39,331 epoch 8 - iter 260/2606 - loss 0.01643423 - time (sec): 19.41 - samples/sec: 2026.45 - lr: 0.000010 - momentum: 0.000000
2023-10-15 17:01:59,051 epoch 8 - iter 520/2606 - loss 0.01851950 - time (sec): 39.13 - samples/sec: 1976.19 - lr: 0.000009 - momentum: 0.000000
2023-10-15 17:02:18,480 epoch 8 - iter 780/2606 - loss 0.01850720 - time (sec): 58.56 - samples/sec: 1959.55 - lr: 0.000009 - momentum: 0.000000
2023-10-15 17:02:36,845 epoch 8 - iter 1040/2606 - loss 0.02011290 - time (sec): 76.92 - samples/sec: 1946.80 - lr: 0.000009 - momentum: 0.000000
2023-10-15 17:02:55,283 epoch 8 - iter 1300/2606 - loss 0.01977515 - time (sec): 95.36 - samples/sec: 1940.14 - lr: 0.000008 - momentum: 0.000000
2023-10-15 17:03:14,146 epoch 8 - iter 1560/2606 - loss 0.01909661 - time (sec): 114.22 - samples/sec: 1935.70 - lr: 0.000008 - momentum: 0.000000
2023-10-15 17:03:33,519 epoch 8 - iter 1820/2606 - loss 0.01939229 - time (sec): 133.60 - samples/sec: 1937.25 - lr: 0.000008 - momentum: 0.000000
2023-10-15 17:03:51,713 epoch 8 - iter 2080/2606 - loss 0.02001143 - time (sec): 151.79 - samples/sec: 1932.90 - lr: 0.000007 - momentum: 0.000000
2023-10-15 17:04:10,894 epoch 8 - iter 2340/2606 - loss 0.01961925 - time (sec): 170.97 - samples/sec: 1929.77 - lr: 0.000007 - momentum: 0.000000
2023-10-15 17:04:29,158 epoch 8 - iter 2600/2606 - loss 0.02003412 - time (sec): 189.23 - samples/sec: 1934.90 - lr: 0.000007 - momentum: 0.000000
2023-10-15 17:04:29,709 ----------------------------------------------------------------------------------------------------
2023-10-15 17:04:29,709 EPOCH 8 done: loss 0.0200 - lr: 0.000007
2023-10-15 17:04:38,859 DEV : loss 0.4060736298561096 - f1-score (micro avg) 0.4087
2023-10-15 17:04:38,892 saving best model
2023-10-15 17:04:39,411 ----------------------------------------------------------------------------------------------------
2023-10-15 17:04:58,303 epoch 9 - iter 260/2606 - loss 0.01725916 - time (sec): 18.89 - samples/sec: 1881.58 - lr: 0.000006 - momentum: 0.000000
2023-10-15 17:05:17,100 epoch 9 - iter 520/2606 - loss 0.01816425 - time (sec): 37.68 - samples/sec: 1897.84 - lr: 0.000006 - momentum: 0.000000
2023-10-15 17:05:35,858 epoch 9 - iter 780/2606 - loss 0.01585953 - time (sec): 56.44 - samples/sec: 1915.98 - lr: 0.000006 - momentum: 0.000000
2023-10-15 17:05:54,651 epoch 9 - iter 1040/2606 - loss 0.01513412 - time (sec): 75.24 - samples/sec: 1919.13 - lr: 0.000005 - momentum: 0.000000
2023-10-15 17:06:13,353 epoch 9 - iter 1300/2606 - loss 0.01568114 - time (sec): 93.94 - samples/sec: 1914.89 - lr: 0.000005 - momentum: 0.000000
2023-10-15 17:06:33,351 epoch 9 - iter 1560/2606 - loss 0.01491052 - time (sec): 113.94 - samples/sec: 1920.33 - lr: 0.000005 - momentum: 0.000000
2023-10-15 17:06:52,085 epoch 9 - iter 1820/2606 - loss 0.01453602 - time (sec): 132.67 - samples/sec: 1923.52 - lr: 0.000004 - momentum: 0.000000
2023-10-15 17:07:10,935 epoch 9 - iter 2080/2606 - loss 0.01454802 - time (sec): 151.52 - samples/sec: 1919.01 - lr: 0.000004 - momentum: 0.000000
2023-10-15 17:07:30,661 epoch 9 - iter 2340/2606 - loss 0.01402508 - time (sec): 171.25 - samples/sec: 1920.68 - lr: 0.000004 - momentum: 0.000000
2023-10-15 17:07:50,213 epoch 9 - iter 2600/2606 - loss 0.01358373 - time (sec): 190.80 - samples/sec: 1921.95 - lr: 0.000003 - momentum: 0.000000
2023-10-15 17:07:50,566 ----------------------------------------------------------------------------------------------------
2023-10-15 17:07:50,567 EPOCH 9 done: loss 0.0136 - lr: 0.000003
2023-10-15 17:07:59,619 DEV : loss 0.5201111435890198 - f1-score (micro avg) 0.3708
2023-10-15 17:07:59,648 ----------------------------------------------------------------------------------------------------
2023-10-15 17:08:18,216 epoch 10 - iter 260/2606 - loss 0.00949472 - time (sec): 18.57 - samples/sec: 1933.16 - lr: 0.000003 - momentum: 0.000000
2023-10-15 17:08:37,149 epoch 10 - iter 520/2606 - loss 0.00956979 - time (sec): 37.50 - samples/sec: 1952.46 - lr: 0.000003 - momentum: 0.000000
2023-10-15 17:08:55,678 epoch 10 - iter 780/2606 - loss 0.00953648 - time (sec): 56.03 - samples/sec: 1970.66 - lr: 0.000002 - momentum: 0.000000
2023-10-15 17:09:14,776 epoch 10 - iter 1040/2606 - loss 0.00918127 - time (sec): 75.13 - samples/sec: 1951.47 - lr: 0.000002 - momentum: 0.000000
2023-10-15 17:09:34,593 epoch 10 - iter 1300/2606 - loss 0.00921523 - time (sec): 94.94 - samples/sec: 1953.32 - lr: 0.000002 - momentum: 0.000000
2023-10-15 17:09:53,789 epoch 10 - iter 1560/2606 - loss 0.00881008 - time (sec): 114.14 - samples/sec: 1942.44 - lr: 0.000001 - momentum: 0.000000
2023-10-15 17:10:12,660 epoch 10 - iter 1820/2606 - loss 0.00913303 - time (sec): 133.01 - samples/sec: 1941.32 - lr: 0.000001 - momentum: 0.000000
2023-10-15 17:10:31,001 epoch 10 - iter 2080/2606 - loss 0.00971053 - time (sec): 151.35 - samples/sec: 1938.83 - lr: 0.000001 - momentum: 0.000000
2023-10-15 17:10:49,857 epoch 10 - iter 2340/2606 - loss 0.00969515 - time (sec): 170.21 - samples/sec: 1933.79 - lr: 0.000000 - momentum: 0.000000
2023-10-15 17:11:09,481 epoch 10 - iter 2600/2606 - loss 0.00962657 - time (sec): 189.83 - samples/sec: 1932.12 - lr: 0.000000 - momentum: 0.000000
2023-10-15 17:11:09,863 ----------------------------------------------------------------------------------------------------
2023-10-15 17:11:09,863 EPOCH 10 done: loss 0.0096 - lr: 0.000000
2023-10-15 17:11:18,876 DEV : loss 0.4862280786037445 - f1-score (micro avg) 0.3916
2023-10-15 17:11:19,325 ----------------------------------------------------------------------------------------------------
2023-10-15 17:11:19,326 Loading model from best epoch ...
2023-10-15 17:11:20,868 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 17:11:37,464
Results:
- F-score (micro) 0.464
- F-score (macro) 0.3025
- Accuracy 0.3068
By class:
              precision    recall  f1-score   support

         LOC     0.4868    0.5783    0.5286      1214
         PER     0.4134    0.5050    0.4546       808
         ORG     0.2706    0.1955    0.2270       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4380    0.4933    0.4640      2390
   macro avg     0.2927    0.3197    0.3025      2390
weighted avg     0.4270    0.4933    0.4557      2390
2023-10-15 17:11:37,464 ----------------------------------------------------------------------------------------------------
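The reported averages can be recomputed from the per-class rows above (a consistency check on the table, not new results): micro F1 is the harmonic mean of the micro precision and recall, macro F1 the unweighted mean of the per-class F1 scores.

```python
# Per-class f1-scores from the evaluation table above.
per_class_f1 = {"LOC": 0.5286, "PER": 0.4546, "ORG": 0.2270, "HumanProd": 0.0}

micro_p, micro_r = 0.4380, 0.4933   # micro avg precision / recall
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(micro_f1, macro_f1)  # ~0.4640 and ~0.3025, as reported
```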