2023-10-15 23:01:18,561 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,562 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Train: 20847 sentences
2023-10-15 23:01:18,563 (train_with_dev=False, train_with_test=False)
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Training Params:
2023-10-15 23:01:18,563 - learning_rate: "3e-05"
2023-10-15 23:01:18,563 - mini_batch_size: "4"
2023-10-15 23:01:18,563 - max_epochs: "10"
2023-10-15 23:01:18,563 - shuffle: "True"
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Plugins:
2023-10-15 23:01:18,563 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 23:01:18,563 - metric: "('micro avg', 'f1-score')"
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Computation:
2023-10-15 23:01:18,563 - compute on device: cuda:0
2023-10-15 23:01:18,563 - embedding storage: none
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
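The LinearScheduler plugin with warmup_fraction 0.1 explains the lr column in the iteration logs below: over 10 epochs × 5,212 iterations the rate climbs linearly from 0 to the peak of 3e-05 during the first 10% of steps (exactly epoch 1), then decays linearly to 0. A minimal sketch of that schedule (plain Python, illustrating the behaviour seen in this log, not Flair's actual implementation):

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0.

    Sketch of the LinearScheduler behaviour visible in this log,
    not Flair's code.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 5212  # max_epochs * iterations per epoch
peak = 3e-05       # learning_rate from the params above

print(round(linear_schedule_lr(5212, total, peak), 6))      # end of epoch 1: peak reached
print(round(linear_schedule_lr(2 * 5212, total, peak), 6))  # end of epoch 2: ~2.7e-05, as logged
```

This matches the logged values: lr 0.000030 at the last iteration of epoch 1 and 0.000027 at the end of epoch 2.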
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:43,495 epoch 1 - iter 521/5212 - loss 1.57816846 - time (sec): 24.93 - samples/sec: 1428.73 - lr: 0.000003 - momentum: 0.000000
2023-10-15 23:02:08,663 epoch 1 - iter 1042/5212 - loss 1.00464763 - time (sec): 50.10 - samples/sec: 1457.84 - lr: 0.000006 - momentum: 0.000000
2023-10-15 23:02:33,770 epoch 1 - iter 1563/5212 - loss 0.77914777 - time (sec): 75.21 - samples/sec: 1443.74 - lr: 0.000009 - momentum: 0.000000
2023-10-15 23:02:59,065 epoch 1 - iter 2084/5212 - loss 0.65585694 - time (sec): 100.50 - samples/sec: 1436.30 - lr: 0.000012 - momentum: 0.000000
2023-10-15 23:03:24,499 epoch 1 - iter 2605/5212 - loss 0.57208897 - time (sec): 125.93 - samples/sec: 1434.86 - lr: 0.000015 - momentum: 0.000000
2023-10-15 23:03:49,897 epoch 1 - iter 3126/5212 - loss 0.50902242 - time (sec): 151.33 - samples/sec: 1443.16 - lr: 0.000018 - momentum: 0.000000
2023-10-15 23:04:14,803 epoch 1 - iter 3647/5212 - loss 0.46908985 - time (sec): 176.24 - samples/sec: 1449.68 - lr: 0.000021 - momentum: 0.000000
2023-10-15 23:04:40,120 epoch 1 - iter 4168/5212 - loss 0.43684486 - time (sec): 201.56 - samples/sec: 1447.01 - lr: 0.000024 - momentum: 0.000000
2023-10-15 23:05:05,064 epoch 1 - iter 4689/5212 - loss 0.41352998 - time (sec): 226.50 - samples/sec: 1443.15 - lr: 0.000027 - momentum: 0.000000
2023-10-15 23:05:30,792 epoch 1 - iter 5210/5212 - loss 0.39086124 - time (sec): 252.23 - samples/sec: 1456.59 - lr: 0.000030 - momentum: 0.000000
2023-10-15 23:05:30,882 ----------------------------------------------------------------------------------------------------
2023-10-15 23:05:30,882 EPOCH 1 done: loss 0.3908 - lr: 0.000030
2023-10-15 23:05:36,703 DEV : loss 0.1551702618598938 - f1-score (micro avg) 0.3358
2023-10-15 23:05:36,733 saving best model
2023-10-15 23:05:37,187 ----------------------------------------------------------------------------------------------------
2023-10-15 23:06:01,685 epoch 2 - iter 521/5212 - loss 0.18051053 - time (sec): 24.50 - samples/sec: 1584.27 - lr: 0.000030 - momentum: 0.000000
2023-10-15 23:06:25,892 epoch 2 - iter 1042/5212 - loss 0.17423932 - time (sec): 48.70 - samples/sec: 1540.79 - lr: 0.000029 - momentum: 0.000000
2023-10-15 23:06:52,002 epoch 2 - iter 1563/5212 - loss 0.16721027 - time (sec): 74.81 - samples/sec: 1491.10 - lr: 0.000029 - momentum: 0.000000
2023-10-15 23:07:17,432 epoch 2 - iter 2084/5212 - loss 0.16343997 - time (sec): 100.24 - samples/sec: 1485.17 - lr: 0.000029 - momentum: 0.000000
2023-10-15 23:07:42,744 epoch 2 - iter 2605/5212 - loss 0.16700608 - time (sec): 125.56 - samples/sec: 1475.93 - lr: 0.000028 - momentum: 0.000000
2023-10-15 23:08:08,383 epoch 2 - iter 3126/5212 - loss 0.16521606 - time (sec): 151.19 - samples/sec: 1474.40 - lr: 0.000028 - momentum: 0.000000
2023-10-15 23:08:33,266 epoch 2 - iter 3647/5212 - loss 0.16752019 - time (sec): 176.08 - samples/sec: 1469.68 - lr: 0.000028 - momentum: 0.000000
2023-10-15 23:08:58,928 epoch 2 - iter 4168/5212 - loss 0.16512155 - time (sec): 201.74 - samples/sec: 1475.67 - lr: 0.000027 - momentum: 0.000000
2023-10-15 23:09:23,467 epoch 2 - iter 4689/5212 - loss 0.16625606 - time (sec): 226.28 - samples/sec: 1461.72 - lr: 0.000027 - momentum: 0.000000
2023-10-15 23:09:48,392 epoch 2 - iter 5210/5212 - loss 0.16642187 - time (sec): 251.20 - samples/sec: 1461.65 - lr: 0.000027 - momentum: 0.000000
2023-10-15 23:09:48,506 ----------------------------------------------------------------------------------------------------
2023-10-15 23:09:48,506 EPOCH 2 done: loss 0.1664 - lr: 0.000027
2023-10-15 23:09:57,611 DEV : loss 0.1759524643421173 - f1-score (micro avg) 0.345
2023-10-15 23:09:57,639 saving best model
2023-10-15 23:09:58,174 ----------------------------------------------------------------------------------------------------
2023-10-15 23:10:23,351 epoch 3 - iter 521/5212 - loss 0.13813373 - time (sec): 25.17 - samples/sec: 1436.32 - lr: 0.000026 - momentum: 0.000000
2023-10-15 23:10:47,860 epoch 3 - iter 1042/5212 - loss 0.12806272 - time (sec): 49.68 - samples/sec: 1386.53 - lr: 0.000026 - momentum: 0.000000
2023-10-15 23:11:12,719 epoch 3 - iter 1563/5212 - loss 0.12631653 - time (sec): 74.54 - samples/sec: 1385.07 - lr: 0.000026 - momentum: 0.000000
2023-10-15 23:11:37,395 epoch 3 - iter 2084/5212 - loss 0.12662569 - time (sec): 99.22 - samples/sec: 1401.10 - lr: 0.000025 - momentum: 0.000000
2023-10-15 23:12:02,305 epoch 3 - iter 2605/5212 - loss 0.12161677 - time (sec): 124.13 - samples/sec: 1420.15 - lr: 0.000025 - momentum: 0.000000
2023-10-15 23:12:27,636 epoch 3 - iter 3126/5212 - loss 0.12396518 - time (sec): 149.46 - samples/sec: 1431.89 - lr: 0.000025 - momentum: 0.000000
2023-10-15 23:12:52,997 epoch 3 - iter 3647/5212 - loss 0.12304621 - time (sec): 174.82 - samples/sec: 1440.00 - lr: 0.000024 - momentum: 0.000000
2023-10-15 23:13:18,146 epoch 3 - iter 4168/5212 - loss 0.12220986 - time (sec): 199.97 - samples/sec: 1440.78 - lr: 0.000024 - momentum: 0.000000
2023-10-15 23:13:43,519 epoch 3 - iter 4689/5212 - loss 0.12332670 - time (sec): 225.34 - samples/sec: 1451.48 - lr: 0.000024 - momentum: 0.000000
2023-10-15 23:14:09,472 epoch 3 - iter 5210/5212 - loss 0.12074165 - time (sec): 251.30 - samples/sec: 1462.04 - lr: 0.000023 - momentum: 0.000000
2023-10-15 23:14:09,556 ----------------------------------------------------------------------------------------------------
2023-10-15 23:14:09,556 EPOCH 3 done: loss 0.1207 - lr: 0.000023
2023-10-15 23:14:18,570 DEV : loss 0.2367718517780304 - f1-score (micro avg) 0.3203
2023-10-15 23:14:18,597 ----------------------------------------------------------------------------------------------------
2023-10-15 23:14:43,445 epoch 4 - iter 521/5212 - loss 0.06947319 - time (sec): 24.85 - samples/sec: 1424.08 - lr: 0.000023 - momentum: 0.000000
2023-10-15 23:15:08,358 epoch 4 - iter 1042/5212 - loss 0.07944653 - time (sec): 49.76 - samples/sec: 1444.97 - lr: 0.000023 - momentum: 0.000000
2023-10-15 23:15:33,949 epoch 4 - iter 1563/5212 - loss 0.07739611 - time (sec): 75.35 - samples/sec: 1475.25 - lr: 0.000022 - momentum: 0.000000
2023-10-15 23:15:59,604 epoch 4 - iter 2084/5212 - loss 0.07559714 - time (sec): 101.01 - samples/sec: 1484.11 - lr: 0.000022 - momentum: 0.000000
2023-10-15 23:16:24,880 epoch 4 - iter 2605/5212 - loss 0.07966053 - time (sec): 126.28 - samples/sec: 1482.15 - lr: 0.000022 - momentum: 0.000000
2023-10-15 23:16:49,889 epoch 4 - iter 3126/5212 - loss 0.08000675 - time (sec): 151.29 - samples/sec: 1474.73 - lr: 0.000021 - momentum: 0.000000
2023-10-15 23:17:14,928 epoch 4 - iter 3647/5212 - loss 0.07959465 - time (sec): 176.33 - samples/sec: 1469.41 - lr: 0.000021 - momentum: 0.000000
2023-10-15 23:17:40,303 epoch 4 - iter 4168/5212 - loss 0.07917318 - time (sec): 201.70 - samples/sec: 1472.61 - lr: 0.000021 - momentum: 0.000000
2023-10-15 23:18:04,896 epoch 4 - iter 4689/5212 - loss 0.08036966 - time (sec): 226.30 - samples/sec: 1465.66 - lr: 0.000020 - momentum: 0.000000
2023-10-15 23:18:29,915 epoch 4 - iter 5210/5212 - loss 0.07972575 - time (sec): 251.32 - samples/sec: 1461.10 - lr: 0.000020 - momentum: 0.000000
2023-10-15 23:18:30,013 ----------------------------------------------------------------------------------------------------
2023-10-15 23:18:30,013 EPOCH 4 done: loss 0.0797 - lr: 0.000020
2023-10-15 23:18:39,084 DEV : loss 0.31254345178604126 - f1-score (micro avg) 0.3583
2023-10-15 23:18:39,115 saving best model
2023-10-15 23:18:39,534 ----------------------------------------------------------------------------------------------------
2023-10-15 23:19:05,445 epoch 5 - iter 521/5212 - loss 0.05681791 - time (sec): 25.91 - samples/sec: 1487.06 - lr: 0.000020 - momentum: 0.000000
2023-10-15 23:19:30,420 epoch 5 - iter 1042/5212 - loss 0.05805110 - time (sec): 50.88 - samples/sec: 1439.01 - lr: 0.000019 - momentum: 0.000000
2023-10-15 23:19:55,322 epoch 5 - iter 1563/5212 - loss 0.05743481 - time (sec): 75.79 - samples/sec: 1436.75 - lr: 0.000019 - momentum: 0.000000
2023-10-15 23:20:21,081 epoch 5 - iter 2084/5212 - loss 0.05774534 - time (sec): 101.54 - samples/sec: 1454.04 - lr: 0.000019 - momentum: 0.000000
2023-10-15 23:20:46,653 epoch 5 - iter 2605/5212 - loss 0.05950250 - time (sec): 127.12 - samples/sec: 1461.36 - lr: 0.000018 - momentum: 0.000000
2023-10-15 23:21:10,671 epoch 5 - iter 3126/5212 - loss 0.06076355 - time (sec): 151.14 - samples/sec: 1465.65 - lr: 0.000018 - momentum: 0.000000
2023-10-15 23:21:35,899 epoch 5 - iter 3647/5212 - loss 0.06139140 - time (sec): 176.36 - samples/sec: 1456.99 - lr: 0.000018 - momentum: 0.000000
2023-10-15 23:22:01,629 epoch 5 - iter 4168/5212 - loss 0.06089801 - time (sec): 202.09 - samples/sec: 1460.17 - lr: 0.000017 - momentum: 0.000000
2023-10-15 23:22:26,887 epoch 5 - iter 4689/5212 - loss 0.05994322 - time (sec): 227.35 - samples/sec: 1457.48 - lr: 0.000017 - momentum: 0.000000
2023-10-15 23:22:52,006 epoch 5 - iter 5210/5212 - loss 0.05926432 - time (sec): 252.47 - samples/sec: 1455.20 - lr: 0.000017 - momentum: 0.000000
2023-10-15 23:22:52,093 ----------------------------------------------------------------------------------------------------
2023-10-15 23:22:52,093 EPOCH 5 done: loss 0.0593 - lr: 0.000017
2023-10-15 23:23:01,238 DEV : loss 0.3170505166053772 - f1-score (micro avg) 0.3805
2023-10-15 23:23:01,265 saving best model
2023-10-15 23:23:01,736 ----------------------------------------------------------------------------------------------------
2023-10-15 23:23:26,895 epoch 6 - iter 521/5212 - loss 0.03813445 - time (sec): 25.15 - samples/sec: 1449.45 - lr: 0.000016 - momentum: 0.000000
2023-10-15 23:23:51,946 epoch 6 - iter 1042/5212 - loss 0.04569272 - time (sec): 50.20 - samples/sec: 1468.13 - lr: 0.000016 - momentum: 0.000000
2023-10-15 23:24:16,335 epoch 6 - iter 1563/5212 - loss 0.04146612 - time (sec): 74.59 - samples/sec: 1480.52 - lr: 0.000016 - momentum: 0.000000
2023-10-15 23:24:40,611 epoch 6 - iter 2084/5212 - loss 0.04255530 - time (sec): 98.87 - samples/sec: 1494.73 - lr: 0.000015 - momentum: 0.000000
2023-10-15 23:25:04,819 epoch 6 - iter 2605/5212 - loss 0.04259498 - time (sec): 123.08 - samples/sec: 1489.41 - lr: 0.000015 - momentum: 0.000000
2023-10-15 23:25:30,224 epoch 6 - iter 3126/5212 - loss 0.04613026 - time (sec): 148.48 - samples/sec: 1486.72 - lr: 0.000015 - momentum: 0.000000
2023-10-15 23:25:55,884 epoch 6 - iter 3647/5212 - loss 0.04513844 - time (sec): 174.14 - samples/sec: 1489.69 - lr: 0.000014 - momentum: 0.000000
2023-10-15 23:26:21,370 epoch 6 - iter 4168/5212 - loss 0.04411342 - time (sec): 199.63 - samples/sec: 1488.13 - lr: 0.000014 - momentum: 0.000000
2023-10-15 23:26:46,746 epoch 6 - iter 4689/5212 - loss 0.04421772 - time (sec): 225.00 - samples/sec: 1481.69 - lr: 0.000014 - momentum: 0.000000
2023-10-15 23:27:11,521 epoch 6 - iter 5210/5212 - loss 0.04352330 - time (sec): 249.78 - samples/sec: 1469.27 - lr: 0.000013 - momentum: 0.000000
2023-10-15 23:27:11,673 ----------------------------------------------------------------------------------------------------
2023-10-15 23:27:11,673 EPOCH 6 done: loss 0.0435 - lr: 0.000013
2023-10-15 23:27:20,816 DEV : loss 0.43000391125679016 - f1-score (micro avg) 0.3767
2023-10-15 23:27:20,843 ----------------------------------------------------------------------------------------------------
2023-10-15 23:27:46,148 epoch 7 - iter 521/5212 - loss 0.04318070 - time (sec): 25.30 - samples/sec: 1379.43 - lr: 0.000013 - momentum: 0.000000
2023-10-15 23:28:10,477 epoch 7 - iter 1042/5212 - loss 0.03545449 - time (sec): 49.63 - samples/sec: 1443.93 - lr: 0.000013 - momentum: 0.000000
2023-10-15 23:28:35,968 epoch 7 - iter 1563/5212 - loss 0.03539388 - time (sec): 75.12 - samples/sec: 1442.06 - lr: 0.000012 - momentum: 0.000000
2023-10-15 23:29:01,785 epoch 7 - iter 2084/5212 - loss 0.03433173 - time (sec): 100.94 - samples/sec: 1463.11 - lr: 0.000012 - momentum: 0.000000
2023-10-15 23:29:27,015 epoch 7 - iter 2605/5212 - loss 0.03377936 - time (sec): 126.17 - samples/sec: 1454.97 - lr: 0.000012 - momentum: 0.000000
2023-10-15 23:29:52,232 epoch 7 - iter 3126/5212 - loss 0.03443693 - time (sec): 151.39 - samples/sec: 1457.71 - lr: 0.000011 - momentum: 0.000000
2023-10-15 23:30:17,653 epoch 7 - iter 3647/5212 - loss 0.03352446 - time (sec): 176.81 - samples/sec: 1463.31 - lr: 0.000011 - momentum: 0.000000
2023-10-15 23:30:42,898 epoch 7 - iter 4168/5212 - loss 0.03294138 - time (sec): 202.05 - samples/sec: 1465.08 - lr: 0.000011 - momentum: 0.000000
2023-10-15 23:31:07,935 epoch 7 - iter 4689/5212 - loss 0.03247734 - time (sec): 227.09 - samples/sec: 1452.48 - lr: 0.000010 - momentum: 0.000000
2023-10-15 23:31:33,498 epoch 7 - iter 5210/5212 - loss 0.03215531 - time (sec): 252.65 - samples/sec: 1453.51 - lr: 0.000010 - momentum: 0.000000
2023-10-15 23:31:33,617 ----------------------------------------------------------------------------------------------------
2023-10-15 23:31:33,617 EPOCH 7 done: loss 0.0321 - lr: 0.000010
2023-10-15 23:31:41,895 DEV : loss 0.43989118933677673 - f1-score (micro avg) 0.3651
2023-10-15 23:31:41,925 ----------------------------------------------------------------------------------------------------
2023-10-15 23:32:07,556 epoch 8 - iter 521/5212 - loss 0.02030482 - time (sec): 25.63 - samples/sec: 1523.60 - lr: 0.000010 - momentum: 0.000000
2023-10-15 23:32:32,868 epoch 8 - iter 1042/5212 - loss 0.02029145 - time (sec): 50.94 - samples/sec: 1512.19 - lr: 0.000009 - momentum: 0.000000
2023-10-15 23:32:58,850 epoch 8 - iter 1563/5212 - loss 0.02030246 - time (sec): 76.92 - samples/sec: 1477.19 - lr: 0.000009 - momentum: 0.000000
2023-10-15 23:33:24,355 epoch 8 - iter 2084/5212 - loss 0.02102241 - time (sec): 102.43 - samples/sec: 1467.10 - lr: 0.000009 - momentum: 0.000000
2023-10-15 23:33:49,462 epoch 8 - iter 2605/5212 - loss 0.02014836 - time (sec): 127.54 - samples/sec: 1461.76 - lr: 0.000008 - momentum: 0.000000
2023-10-15 23:34:14,915 epoch 8 - iter 3126/5212 - loss 0.02105174 - time (sec): 152.99 - samples/sec: 1464.66 - lr: 0.000008 - momentum: 0.000000
2023-10-15 23:34:39,835 epoch 8 - iter 3647/5212 - loss 0.02225207 - time (sec): 177.91 - samples/sec: 1459.81 - lr: 0.000008 - momentum: 0.000000
2023-10-15 23:35:04,921 epoch 8 - iter 4168/5212 - loss 0.02254482 - time (sec): 202.99 - samples/sec: 1458.26 - lr: 0.000007 - momentum: 0.000000
2023-10-15 23:35:29,983 epoch 8 - iter 4689/5212 - loss 0.02221020 - time (sec): 228.06 - samples/sec: 1448.89 - lr: 0.000007 - momentum: 0.000000
2023-10-15 23:35:55,253 epoch 8 - iter 5210/5212 - loss 0.02190849 - time (sec): 253.33 - samples/sec: 1450.23 - lr: 0.000007 - momentum: 0.000000
2023-10-15 23:35:55,335 ----------------------------------------------------------------------------------------------------
2023-10-15 23:35:55,335 EPOCH 8 done: loss 0.0219 - lr: 0.000007
2023-10-15 23:36:03,613 DEV : loss 0.4804233908653259 - f1-score (micro avg) 0.3607
2023-10-15 23:36:03,641 ----------------------------------------------------------------------------------------------------
2023-10-15 23:36:28,832 epoch 9 - iter 521/5212 - loss 0.01165488 - time (sec): 25.19 - samples/sec: 1499.73 - lr: 0.000006 - momentum: 0.000000
2023-10-15 23:36:53,661 epoch 9 - iter 1042/5212 - loss 0.01448827 - time (sec): 50.02 - samples/sec: 1454.98 - lr: 0.000006 - momentum: 0.000000
2023-10-15 23:37:18,548 epoch 9 - iter 1563/5212 - loss 0.01652036 - time (sec): 74.91 - samples/sec: 1434.87 - lr: 0.000006 - momentum: 0.000000
2023-10-15 23:37:43,820 epoch 9 - iter 2084/5212 - loss 0.01722537 - time (sec): 100.18 - samples/sec: 1429.82 - lr: 0.000005 - momentum: 0.000000
2023-10-15 23:38:09,602 epoch 9 - iter 2605/5212 - loss 0.01646965 - time (sec): 125.96 - samples/sec: 1441.59 - lr: 0.000005 - momentum: 0.000000
2023-10-15 23:38:35,803 epoch 9 - iter 3126/5212 - loss 0.01574425 - time (sec): 152.16 - samples/sec: 1440.33 - lr: 0.000005 - momentum: 0.000000
2023-10-15 23:39:01,121 epoch 9 - iter 3647/5212 - loss 0.01557122 - time (sec): 177.48 - samples/sec: 1439.93 - lr: 0.000004 - momentum: 0.000000
2023-10-15 23:39:26,348 epoch 9 - iter 4168/5212 - loss 0.01516313 - time (sec): 202.71 - samples/sec: 1446.55 - lr: 0.000004 - momentum: 0.000000
2023-10-15 23:39:51,863 epoch 9 - iter 4689/5212 - loss 0.01529793 - time (sec): 228.22 - samples/sec: 1450.74 - lr: 0.000004 - momentum: 0.000000
2023-10-15 23:40:17,189 epoch 9 - iter 5210/5212 - loss 0.01498075 - time (sec): 253.55 - samples/sec: 1448.86 - lr: 0.000003 - momentum: 0.000000
2023-10-15 23:40:17,276 ----------------------------------------------------------------------------------------------------
2023-10-15 23:40:17,277 EPOCH 9 done: loss 0.0150 - lr: 0.000003
2023-10-15 23:40:25,600 DEV : loss 0.4658600091934204 - f1-score (micro avg) 0.3626
2023-10-15 23:40:25,629 ----------------------------------------------------------------------------------------------------
2023-10-15 23:40:50,716 epoch 10 - iter 521/5212 - loss 0.01097631 - time (sec): 25.09 - samples/sec: 1420.87 - lr: 0.000003 - momentum: 0.000000
2023-10-15 23:41:16,378 epoch 10 - iter 1042/5212 - loss 0.01039988 - time (sec): 50.75 - samples/sec: 1410.82 - lr: 0.000003 - momentum: 0.000000
2023-10-15 23:41:41,327 epoch 10 - iter 1563/5212 - loss 0.01008363 - time (sec): 75.70 - samples/sec: 1411.51 - lr: 0.000002 - momentum: 0.000000
2023-10-15 23:42:06,378 epoch 10 - iter 2084/5212 - loss 0.01093550 - time (sec): 100.75 - samples/sec: 1418.35 - lr: 0.000002 - momentum: 0.000000
2023-10-15 23:42:30,674 epoch 10 - iter 2605/5212 - loss 0.01038744 - time (sec): 125.04 - samples/sec: 1425.51 - lr: 0.000002 - momentum: 0.000000
2023-10-15 23:42:56,355 epoch 10 - iter 3126/5212 - loss 0.01029665 - time (sec): 150.73 - samples/sec: 1439.37 - lr: 0.000001 - momentum: 0.000000
2023-10-15 23:43:21,935 epoch 10 - iter 3647/5212 - loss 0.01072722 - time (sec): 176.30 - samples/sec: 1448.23 - lr: 0.000001 - momentum: 0.000000
2023-10-15 23:43:47,826 epoch 10 - iter 4168/5212 - loss 0.01038615 - time (sec): 202.20 - samples/sec: 1455.01 - lr: 0.000001 - momentum: 0.000000
2023-10-15 23:44:14,058 epoch 10 - iter 4689/5212 - loss 0.01029547 - time (sec): 228.43 - samples/sec: 1451.80 - lr: 0.000000 - momentum: 0.000000
2023-10-15 23:44:38,931 epoch 10 - iter 5210/5212 - loss 0.01030819 - time (sec): 253.30 - samples/sec: 1450.34 - lr: 0.000000 - momentum: 0.000000
2023-10-15 23:44:39,022 ----------------------------------------------------------------------------------------------------
2023-10-15 23:44:39,022 EPOCH 10 done: loss 0.0103 - lr: 0.000000
2023-10-15 23:44:47,300 DEV : loss 0.4685536324977875 - f1-score (micro avg) 0.3793
2023-10-15 23:44:47,791 ----------------------------------------------------------------------------------------------------
2023-10-15 23:44:47,793 Loading model from best epoch ...
2023-10-15 23:44:49,249 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
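The 17 tags follow the BIOES scheme (Begin, Inside, End, Single, plus Outside) over the four entity types LOC, PER, ORG and HumanProd. As an illustration of how such a tag sequence maps back to entity spans (a simplified decoder, not Flair's internal one):

```python
def decode_bioes(tags):
    """Turn a BIOES tag sequence into (start, end, label) spans.

    Illustrative decoder for the tagset above; Flair's own span
    extraction handles malformed sequences more carefully.
    """
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":          # single-token entity
            spans.append((i, i, label))
            start = None
        elif prefix == "B":        # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((start, i, label))
            start = None
    return spans

print(decode_bioes(["B-LOC", "I-LOC", "E-LOC", "O", "S-PER"]))
# [(0, 2, 'LOC'), (4, 4, 'PER')]
```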
2023-10-15 23:45:04,628
Results:
- F-score (micro) 0.4903
- F-score (macro) 0.334
- Accuracy 0.3303
By class:
              precision    recall  f1-score   support

         LOC     0.5200    0.6425    0.5748      1214
         PER     0.3872    0.5223    0.4447       808
         ORG     0.2901    0.3484    0.3166       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4395    0.5544    0.4903      2390
   macro avg     0.2993    0.3783    0.3340      2390
weighted avg     0.4379    0.5544    0.4891      2390
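The micro average pools true/false positives across all classes before computing F1, while the macro average is the unweighted mean of the per-class F1 scores; both can be verified from the table above. A quick arithmetic check (plain Python, values copied from the table):

```python
# Per-class F1 values from the "By class" table above
class_f1 = {"LOC": 0.5748, "PER": 0.4447, "ORG": 0.3166, "HumanProd": 0.0000}

# Macro F1: unweighted mean over classes
macro_f1 = sum(class_f1.values()) / len(class_f1)
print(round(macro_f1, 4))  # 0.334

# Micro F1: harmonic mean of the pooled precision and recall
p, r = 0.4395, 0.5544
micro_f1 = 2 * p * r / (p + r)
print(round(micro_f1, 4))  # 0.4903
```

Both reproduce the reported scores; the rare HumanProd class (15 support, zero F1) is what drags the macro average well below the micro average.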
2023-10-15 23:45:04,629 ----------------------------------------------------------------------------------------------------