2023-10-18 18:24:23,192 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
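The printed architecture is small enough to sanity-check by hand. Below is a minimal sketch (plain Python; the shapes are copied from the module dump above, while the grouping into embeddings / layer / head and the variable names are my own) that recomputes the parameter counts implied by the printed tensor shapes:

```python
# Parameter counts implied by the shapes in the module dump above
# (hidden=128, intermediate=512, vocab=32001, 2 layers, 21 output tags).
hidden, inter, vocab, n_layers, n_tags = 128, 512, 32001, 2, 21

# BertEmbeddings: word/position/token-type tables plus one LayerNorm (weight + bias)
embeddings = vocab * hidden + 512 * hidden + 2 * hidden + 2 * hidden

# One BertLayer: Q/K/V/output projections, the FFN, and two LayerNorms
attention = 4 * (hidden * hidden + hidden)            # query/key/value/dense
ffn = (hidden * inter + inter) + (inter * hidden + hidden)
layer = attention + ffn + 2 * (2 * hidden)

pooler = hidden * hidden + hidden
head = hidden * n_tags + n_tags                       # the final tagging linear

total = embeddings + n_layers * layer + pooler + head
print(embeddings, layer, head, total)
```

The total of roughly 4.6 M parameters is consistent with the "bert-tiny" size class named in the model base path.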
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 Train: 3575 sentences
2023-10-18 18:24:23,193 (train_with_dev=False, train_with_test=False)
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 Training Params:
2023-10-18 18:24:23,193 - learning_rate: "3e-05"
2023-10-18 18:24:23,193 - mini_batch_size: "4"
2023-10-18 18:24:23,193 - max_epochs: "10"
2023-10-18 18:24:23,193 - shuffle: "True"
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 Plugins:
2023-10-18 18:24:23,193 - TensorboardLogger
2023-10-18 18:24:23,193 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
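The LinearScheduler plugin with warmup_fraction 0.1 corresponds to a linear warmup to the peak learning rate followed by a linear decay to zero. A minimal sketch (plain Python; the step counts, 894 iterations × 10 epochs, are read off the progress lines below, and the function name `lr_at` is mine) that reproduces the lr values printed during training:

```python
def lr_at(step, max_lr=3e-5, total_steps=8940, warmup_fraction=0.1):
    """Linear warmup to max_lr, then linear decay to zero.

    Hypothetical re-implementation of the schedule; 8940 = 894 iters x 10 epochs.
    """
    warmup = int(total_steps * warmup_fraction)  # 894 warmup steps
    if step <= warmup:
        return max_lr * step / warmup
    return max_lr * (total_steps - step) / (total_steps - warmup)

# epoch 1, iter 89                    -> printed as lr: 0.000003
# epoch 1, iter 890 (~ end of warmup) -> printed as lr: 0.000030
# epoch 2, iter 890 (global step 1784) -> printed as lr: 0.000027
print(round(lr_at(89), 6), round(lr_at(894), 6), round(lr_at(1784), 6))
```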
2023-10-18 18:24:23,193 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 18:24:23,193 - metric: "('micro avg', 'f1-score')"
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 Computation:
2023-10-18 18:24:23,193 - compute on device: cuda:0
2023-10-18 18:24:23,193 - embedding storage: none
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,193 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:23,194 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 18:24:24,611 epoch 1 - iter 89/894 - loss 3.44377310 - time (sec): 1.42 - samples/sec: 6732.15 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:24:26,013 epoch 1 - iter 178/894 - loss 3.32507717 - time (sec): 2.82 - samples/sec: 6559.53 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:24:27,420 epoch 1 - iter 267/894 - loss 3.03650720 - time (sec): 4.23 - samples/sec: 6473.93 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:24:28,797 epoch 1 - iter 356/894 - loss 2.71169589 - time (sec): 5.60 - samples/sec: 6340.06 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:24:30,202 epoch 1 - iter 445/894 - loss 2.36182152 - time (sec): 7.01 - samples/sec: 6333.75 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:24:31,595 epoch 1 - iter 534/894 - loss 2.09769477 - time (sec): 8.40 - samples/sec: 6275.33 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:24:32,979 epoch 1 - iter 623/894 - loss 1.90205848 - time (sec): 9.79 - samples/sec: 6265.56 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:24:34,360 epoch 1 - iter 712/894 - loss 1.74916876 - time (sec): 11.17 - samples/sec: 6233.53 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:24:35,679 epoch 1 - iter 801/894 - loss 1.63097149 - time (sec): 12.49 - samples/sec: 6242.13 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:24:36,995 epoch 1 - iter 890/894 - loss 1.53628366 - time (sec): 13.80 - samples/sec: 6237.65 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:24:37,062 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:37,062 EPOCH 1 done: loss 1.5305 - lr: 0.000030
2023-10-18 18:24:39,256 DEV : loss 0.47118452191352844 - f1-score (micro avg) 0.0
2023-10-18 18:24:39,282 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:40,513 epoch 2 - iter 89/894 - loss 0.58579553 - time (sec): 1.23 - samples/sec: 7877.62 - lr: 0.000030 - momentum: 0.000000
2023-10-18 18:24:41,931 epoch 2 - iter 178/894 - loss 0.57032839 - time (sec): 2.65 - samples/sec: 7060.10 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:24:43,315 epoch 2 - iter 267/894 - loss 0.54052276 - time (sec): 4.03 - samples/sec: 6714.69 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:24:44,702 epoch 2 - iter 356/894 - loss 0.51251248 - time (sec): 5.42 - samples/sec: 6568.34 - lr: 0.000029 - momentum: 0.000000
2023-10-18 18:24:46,091 epoch 2 - iter 445/894 - loss 0.50579319 - time (sec): 6.81 - samples/sec: 6390.32 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:24:47,483 epoch 2 - iter 534/894 - loss 0.49805171 - time (sec): 8.20 - samples/sec: 6308.43 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:24:48,888 epoch 2 - iter 623/894 - loss 0.49569052 - time (sec): 9.61 - samples/sec: 6336.78 - lr: 0.000028 - momentum: 0.000000
2023-10-18 18:24:50,296 epoch 2 - iter 712/894 - loss 0.49586035 - time (sec): 11.01 - samples/sec: 6311.45 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:24:51,661 epoch 2 - iter 801/894 - loss 0.49461561 - time (sec): 12.38 - samples/sec: 6282.93 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:24:53,027 epoch 2 - iter 890/894 - loss 0.49190223 - time (sec): 13.74 - samples/sec: 6277.14 - lr: 0.000027 - momentum: 0.000000
2023-10-18 18:24:53,086 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:53,086 EPOCH 2 done: loss 0.4922 - lr: 0.000027
2023-10-18 18:24:58,310 DEV : loss 0.35969656705856323 - f1-score (micro avg) 0.0576
2023-10-18 18:24:58,336 saving best model
2023-10-18 18:24:58,373 ----------------------------------------------------------------------------------------------------
2023-10-18 18:24:59,774 epoch 3 - iter 89/894 - loss 0.45189352 - time (sec): 1.40 - samples/sec: 6287.24 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:25:01,047 epoch 3 - iter 178/894 - loss 0.45925371 - time (sec): 2.67 - samples/sec: 6402.24 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:25:02,416 epoch 3 - iter 267/894 - loss 0.44356716 - time (sec): 4.04 - samples/sec: 6398.43 - lr: 0.000026 - momentum: 0.000000
2023-10-18 18:25:03,800 epoch 3 - iter 356/894 - loss 0.42841623 - time (sec): 5.43 - samples/sec: 6271.91 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:25:05,217 epoch 3 - iter 445/894 - loss 0.43422146 - time (sec): 6.84 - samples/sec: 6278.06 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:25:06,578 epoch 3 - iter 534/894 - loss 0.42632128 - time (sec): 8.20 - samples/sec: 6239.49 - lr: 0.000025 - momentum: 0.000000
2023-10-18 18:25:07,998 epoch 3 - iter 623/894 - loss 0.42026616 - time (sec): 9.62 - samples/sec: 6325.99 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:25:09,371 epoch 3 - iter 712/894 - loss 0.41871103 - time (sec): 11.00 - samples/sec: 6274.20 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:25:10,805 epoch 3 - iter 801/894 - loss 0.41988152 - time (sec): 12.43 - samples/sec: 6266.34 - lr: 0.000024 - momentum: 0.000000
2023-10-18 18:25:12,173 epoch 3 - iter 890/894 - loss 0.41575753 - time (sec): 13.80 - samples/sec: 6252.09 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:25:12,233 ----------------------------------------------------------------------------------------------------
2023-10-18 18:25:12,233 EPOCH 3 done: loss 0.4158 - lr: 0.000023
2023-10-18 18:25:17,416 DEV : loss 0.328664094209671 - f1-score (micro avg) 0.2589
2023-10-18 18:25:17,442 saving best model
2023-10-18 18:25:17,476 ----------------------------------------------------------------------------------------------------
2023-10-18 18:25:18,887 epoch 4 - iter 89/894 - loss 0.36856508 - time (sec): 1.41 - samples/sec: 6414.14 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:25:20,308 epoch 4 - iter 178/894 - loss 0.36110083 - time (sec): 2.83 - samples/sec: 6278.32 - lr: 0.000023 - momentum: 0.000000
2023-10-18 18:25:21,721 epoch 4 - iter 267/894 - loss 0.36322010 - time (sec): 4.24 - samples/sec: 6454.81 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:25:23,115 epoch 4 - iter 356/894 - loss 0.37579106 - time (sec): 5.64 - samples/sec: 6440.42 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:25:24,499 epoch 4 - iter 445/894 - loss 0.37427337 - time (sec): 7.02 - samples/sec: 6284.65 - lr: 0.000022 - momentum: 0.000000
2023-10-18 18:25:25,951 epoch 4 - iter 534/894 - loss 0.37959096 - time (sec): 8.47 - samples/sec: 6292.40 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:25:27,329 epoch 4 - iter 623/894 - loss 0.38174423 - time (sec): 9.85 - samples/sec: 6236.47 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:25:28,726 epoch 4 - iter 712/894 - loss 0.38408338 - time (sec): 11.25 - samples/sec: 6196.42 - lr: 0.000021 - momentum: 0.000000
2023-10-18 18:25:30,100 epoch 4 - iter 801/894 - loss 0.38386136 - time (sec): 12.62 - samples/sec: 6167.78 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:25:31,481 epoch 4 - iter 890/894 - loss 0.38339230 - time (sec): 14.00 - samples/sec: 6153.34 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:25:31,546 ----------------------------------------------------------------------------------------------------
2023-10-18 18:25:31,546 EPOCH 4 done: loss 0.3834 - lr: 0.000020
2023-10-18 18:25:36,818 DEV : loss 0.3191424310207367 - f1-score (micro avg) 0.2918
2023-10-18 18:25:36,845 saving best model
2023-10-18 18:25:36,886 ----------------------------------------------------------------------------------------------------
2023-10-18 18:25:38,253 epoch 5 - iter 89/894 - loss 0.41005231 - time (sec): 1.37 - samples/sec: 5681.44 - lr: 0.000020 - momentum: 0.000000
2023-10-18 18:25:39,623 epoch 5 - iter 178/894 - loss 0.37709380 - time (sec): 2.74 - samples/sec: 5787.17 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:25:41,011 epoch 5 - iter 267/894 - loss 0.37250446 - time (sec): 4.12 - samples/sec: 5878.04 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:25:42,478 epoch 5 - iter 356/894 - loss 0.35129430 - time (sec): 5.59 - samples/sec: 6049.95 - lr: 0.000019 - momentum: 0.000000
2023-10-18 18:25:43,913 epoch 5 - iter 445/894 - loss 0.34544567 - time (sec): 7.03 - samples/sec: 6161.99 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:25:45,368 epoch 5 - iter 534/894 - loss 0.34722635 - time (sec): 8.48 - samples/sec: 6210.15 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:25:46,735 epoch 5 - iter 623/894 - loss 0.34754497 - time (sec): 9.85 - samples/sec: 6190.05 - lr: 0.000018 - momentum: 0.000000
2023-10-18 18:25:48,093 epoch 5 - iter 712/894 - loss 0.35160168 - time (sec): 11.21 - samples/sec: 6172.18 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:25:49,446 epoch 5 - iter 801/894 - loss 0.35284732 - time (sec): 12.56 - samples/sec: 6183.91 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:25:50,843 epoch 5 - iter 890/894 - loss 0.35750849 - time (sec): 13.96 - samples/sec: 6181.58 - lr: 0.000017 - momentum: 0.000000
2023-10-18 18:25:50,902 ----------------------------------------------------------------------------------------------------
2023-10-18 18:25:50,902 EPOCH 5 done: loss 0.3570 - lr: 0.000017
2023-10-18 18:25:55,876 DEV : loss 0.31827524304389954 - f1-score (micro avg) 0.3341
2023-10-18 18:25:55,904 saving best model
2023-10-18 18:25:55,939 ----------------------------------------------------------------------------------------------------
2023-10-18 18:25:57,667 epoch 6 - iter 89/894 - loss 0.32626758 - time (sec): 1.73 - samples/sec: 4966.40 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:25:59,055 epoch 6 - iter 178/894 - loss 0.34440906 - time (sec): 3.12 - samples/sec: 5573.96 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:26:00,471 epoch 6 - iter 267/894 - loss 0.34248903 - time (sec): 4.53 - samples/sec: 6008.91 - lr: 0.000016 - momentum: 0.000000
2023-10-18 18:26:01,873 epoch 6 - iter 356/894 - loss 0.34427275 - time (sec): 5.93 - samples/sec: 5992.98 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:26:03,277 epoch 6 - iter 445/894 - loss 0.34981341 - time (sec): 7.34 - samples/sec: 5988.96 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:26:04,635 epoch 6 - iter 534/894 - loss 0.34619955 - time (sec): 8.70 - samples/sec: 5929.90 - lr: 0.000015 - momentum: 0.000000
2023-10-18 18:26:06,023 epoch 6 - iter 623/894 - loss 0.34619718 - time (sec): 10.08 - samples/sec: 5980.05 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:26:07,446 epoch 6 - iter 712/894 - loss 0.34673980 - time (sec): 11.51 - samples/sec: 5974.78 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:26:08,831 epoch 6 - iter 801/894 - loss 0.34705892 - time (sec): 12.89 - samples/sec: 6008.23 - lr: 0.000014 - momentum: 0.000000
2023-10-18 18:26:10,230 epoch 6 - iter 890/894 - loss 0.34088003 - time (sec): 14.29 - samples/sec: 6032.22 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:26:10,287 ----------------------------------------------------------------------------------------------------
2023-10-18 18:26:10,287 EPOCH 6 done: loss 0.3400 - lr: 0.000013
2023-10-18 18:26:15,258 DEV : loss 0.3164794147014618 - f1-score (micro avg) 0.3395
2023-10-18 18:26:15,285 saving best model
2023-10-18 18:26:15,325 ----------------------------------------------------------------------------------------------------
2023-10-18 18:26:16,771 epoch 7 - iter 89/894 - loss 0.34476077 - time (sec): 1.45 - samples/sec: 5879.83 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:26:18,127 epoch 7 - iter 178/894 - loss 0.32276911 - time (sec): 2.80 - samples/sec: 5986.68 - lr: 0.000013 - momentum: 0.000000
2023-10-18 18:26:19,496 epoch 7 - iter 267/894 - loss 0.32152674 - time (sec): 4.17 - samples/sec: 5955.22 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:26:20,911 epoch 7 - iter 356/894 - loss 0.32068103 - time (sec): 5.59 - samples/sec: 6115.23 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:26:22,310 epoch 7 - iter 445/894 - loss 0.32162727 - time (sec): 6.98 - samples/sec: 6156.14 - lr: 0.000012 - momentum: 0.000000
2023-10-18 18:26:23,729 epoch 7 - iter 534/894 - loss 0.31705506 - time (sec): 8.40 - samples/sec: 6170.93 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:26:25,113 epoch 7 - iter 623/894 - loss 0.32675305 - time (sec): 9.79 - samples/sec: 6179.87 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:26:26,522 epoch 7 - iter 712/894 - loss 0.32578820 - time (sec): 11.20 - samples/sec: 6226.74 - lr: 0.000011 - momentum: 0.000000
2023-10-18 18:26:27,895 epoch 7 - iter 801/894 - loss 0.32694328 - time (sec): 12.57 - samples/sec: 6206.43 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:26:29,268 epoch 7 - iter 890/894 - loss 0.32690021 - time (sec): 13.94 - samples/sec: 6177.73 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:26:29,329 ----------------------------------------------------------------------------------------------------
2023-10-18 18:26:29,329 EPOCH 7 done: loss 0.3260 - lr: 0.000010
2023-10-18 18:26:34,643 DEV : loss 0.3100321888923645 - f1-score (micro avg) 0.3492
2023-10-18 18:26:34,671 saving best model
2023-10-18 18:26:34,709 ----------------------------------------------------------------------------------------------------
2023-10-18 18:26:36,180 epoch 8 - iter 89/894 - loss 0.33805047 - time (sec): 1.47 - samples/sec: 5813.17 - lr: 0.000010 - momentum: 0.000000
2023-10-18 18:26:37,559 epoch 8 - iter 178/894 - loss 0.32541421 - time (sec): 2.85 - samples/sec: 5960.72 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:26:38,944 epoch 8 - iter 267/894 - loss 0.33510004 - time (sec): 4.23 - samples/sec: 6116.97 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:26:40,347 epoch 8 - iter 356/894 - loss 0.33625387 - time (sec): 5.64 - samples/sec: 6157.91 - lr: 0.000009 - momentum: 0.000000
2023-10-18 18:26:41,784 epoch 8 - iter 445/894 - loss 0.33118526 - time (sec): 7.07 - samples/sec: 6120.14 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:26:43,292 epoch 8 - iter 534/894 - loss 0.32648379 - time (sec): 8.58 - samples/sec: 6186.97 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:26:44,700 epoch 8 - iter 623/894 - loss 0.32349710 - time (sec): 9.99 - samples/sec: 6109.44 - lr: 0.000008 - momentum: 0.000000
2023-10-18 18:26:46,131 epoch 8 - iter 712/894 - loss 0.32373014 - time (sec): 11.42 - samples/sec: 6103.37 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:26:47,564 epoch 8 - iter 801/894 - loss 0.31732168 - time (sec): 12.85 - samples/sec: 6139.46 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:26:48,931 epoch 8 - iter 890/894 - loss 0.32090107 - time (sec): 14.22 - samples/sec: 6063.28 - lr: 0.000007 - momentum: 0.000000
2023-10-18 18:26:48,988 ----------------------------------------------------------------------------------------------------
2023-10-18 18:26:48,988 EPOCH 8 done: loss 0.3205 - lr: 0.000007
2023-10-18 18:26:54,256 DEV : loss 0.30898723006248474 - f1-score (micro avg) 0.3486
2023-10-18 18:26:54,283 ----------------------------------------------------------------------------------------------------
2023-10-18 18:26:55,650 epoch 9 - iter 89/894 - loss 0.28474499 - time (sec): 1.37 - samples/sec: 5868.01 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:26:57,145 epoch 9 - iter 178/894 - loss 0.31255629 - time (sec): 2.86 - samples/sec: 6241.57 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:26:58,541 epoch 9 - iter 267/894 - loss 0.33568219 - time (sec): 4.26 - samples/sec: 6184.45 - lr: 0.000006 - momentum: 0.000000
2023-10-18 18:26:59,975 epoch 9 - iter 356/894 - loss 0.32610043 - time (sec): 5.69 - samples/sec: 6138.41 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:27:01,404 epoch 9 - iter 445/894 - loss 0.32342253 - time (sec): 7.12 - samples/sec: 6047.66 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:27:02,888 epoch 9 - iter 534/894 - loss 0.31927800 - time (sec): 8.60 - samples/sec: 6123.51 - lr: 0.000005 - momentum: 0.000000
2023-10-18 18:27:04,266 epoch 9 - iter 623/894 - loss 0.31440826 - time (sec): 9.98 - samples/sec: 6173.64 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:27:05,656 epoch 9 - iter 712/894 - loss 0.31534195 - time (sec): 11.37 - samples/sec: 6135.91 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:27:07,025 epoch 9 - iter 801/894 - loss 0.31358640 - time (sec): 12.74 - samples/sec: 6146.45 - lr: 0.000004 - momentum: 0.000000
2023-10-18 18:27:08,474 epoch 9 - iter 890/894 - loss 0.31067410 - time (sec): 14.19 - samples/sec: 6084.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:27:08,534 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:08,534 EPOCH 9 done: loss 0.3110 - lr: 0.000003
2023-10-18 18:27:13,828 DEV : loss 0.3121003806591034 - f1-score (micro avg) 0.3518
2023-10-18 18:27:13,855 saving best model
2023-10-18 18:27:13,896 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:15,302 epoch 10 - iter 89/894 - loss 0.26168045 - time (sec): 1.41 - samples/sec: 6367.82 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:27:16,694 epoch 10 - iter 178/894 - loss 0.27101137 - time (sec): 2.80 - samples/sec: 6264.19 - lr: 0.000003 - momentum: 0.000000
2023-10-18 18:27:18,047 epoch 10 - iter 267/894 - loss 0.26561479 - time (sec): 4.15 - samples/sec: 6068.32 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:27:19,410 epoch 10 - iter 356/894 - loss 0.27933512 - time (sec): 5.51 - samples/sec: 6042.11 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:27:20,758 epoch 10 - iter 445/894 - loss 0.28638971 - time (sec): 6.86 - samples/sec: 6055.18 - lr: 0.000002 - momentum: 0.000000
2023-10-18 18:27:22,121 epoch 10 - iter 534/894 - loss 0.29135151 - time (sec): 8.22 - samples/sec: 6039.98 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:27:23,547 epoch 10 - iter 623/894 - loss 0.29451250 - time (sec): 9.65 - samples/sec: 6018.63 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:27:24,955 epoch 10 - iter 712/894 - loss 0.29782593 - time (sec): 11.06 - samples/sec: 6105.22 - lr: 0.000001 - momentum: 0.000000
2023-10-18 18:27:26,233 epoch 10 - iter 801/894 - loss 0.29562823 - time (sec): 12.34 - samples/sec: 6294.96 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:27:27,476 epoch 10 - iter 890/894 - loss 0.30303112 - time (sec): 13.58 - samples/sec: 6351.18 - lr: 0.000000 - momentum: 0.000000
2023-10-18 18:27:27,528 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:27,528 EPOCH 10 done: loss 0.3035 - lr: 0.000000
2023-10-18 18:27:32,523 DEV : loss 0.3079068958759308 - f1-score (micro avg) 0.3549
2023-10-18 18:27:32,550 saving best model
2023-10-18 18:27:32,611 ----------------------------------------------------------------------------------------------------
2023-10-18 18:27:32,612 Loading model from best epoch ...
2023-10-18 18:27:32,693 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
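The 21 tags form a BIOES scheme over five entity types, which also explains the out_features=21 of the final linear layer in the model dump. A sketch (plain Python; the type names and tag order are copied from the dictionary printed above):

```python
entity_types = ["loc", "pers", "org", "prod", "time"]
# BIOES tagging: Single, Begin, End, Inside span markers, plus the "O" tag
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types
                for prefix in ("S", "B", "E", "I")]
print(len(tags))  # matches the 21-way linear output layer
```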
2023-10-18 18:27:35,046
Results:
- F-score (micro) 0.3371
- F-score (macro) 0.1334
- Accuracy 0.2144
By class:
              precision    recall  f1-score   support

         loc     0.4716    0.5436    0.5051       596
        pers     0.1452    0.1832    0.1620       333
         org     0.0000    0.0000    0.0000       132
        prod     0.0000    0.0000    0.0000        66
        time     0.0000    0.0000    0.0000        49

   micro avg     0.3475    0.3274    0.3371      1176
   macro avg     0.1234    0.1454    0.1334      1176
weighted avg     0.2801    0.3274    0.3018      1176
2023-10-18 18:27:35,047 ----------------------------------------------------------------------------------------------------
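The reported averages can be cross-checked from the per-class rows. A sketch (plain Python; the numbers are copied from the table above, and small rounding differences are expected since the table itself is rounded to four decimals):

```python
# (precision, recall, f1-score, support) per class, copied from the table
classes = {
    "loc":  (0.4716, 0.5436, 0.5051, 596),
    "pers": (0.1452, 0.1832, 0.1620, 333),
    "org":  (0.0000, 0.0000, 0.0000, 132),
    "prod": (0.0000, 0.0000, 0.0000, 66),
    "time": (0.0000, 0.0000, 0.0000, 49),
}
total_support = sum(s for _, _, _, s in classes.values())            # 1176

# macro: unweighted mean over classes; weighted: support-weighted mean
macro_f1 = sum(f for _, _, f, _ in classes.values()) / len(classes)
weighted_f1 = sum(f * s for _, _, f, s in classes.values()) / total_support

# micro F1 is the harmonic mean of micro precision and micro recall
micro_p, micro_r = 0.3475, 0.3274
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(macro_f1, weighted_f1, micro_f1)
```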