2023-10-15 23:01:18,561 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,562 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Train:  20847 sentences
2023-10-15 23:01:18,563         (train_with_dev=False, train_with_test=False)
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Training Params:
2023-10-15 23:01:18,563  - learning_rate: "3e-05"
2023-10-15 23:01:18,563  - mini_batch_size: "4"
2023-10-15 23:01:18,563  - max_epochs: "10"
2023-10-15 23:01:18,563  - shuffle: "True"
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Plugins:
2023-10-15 23:01:18,563  - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 23:01:18,563  - metric: "('micro avg', 'f1-score')"
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Computation:
2023-10-15 23:01:18,563  - compute on device: cuda:0
2023-10-15 23:01:18,563  - embedding storage: none
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:43,495 epoch 1 - iter 521/5212 - loss 1.57816846 - time (sec): 24.93 - samples/sec: 1428.73 - lr: 0.000003 - momentum: 0.000000
2023-10-15 23:02:08,663 epoch 1 - iter 1042/5212 - loss 1.00464763 - time (sec): 50.10 - samples/sec: 1457.84 - lr: 0.000006 - momentum: 0.000000
2023-10-15 23:02:33,770 epoch 1 - iter 1563/5212 - loss 0.77914777 - time (sec): 75.21 - samples/sec: 1443.74 - lr: 0.000009 - momentum: 0.000000
2023-10-15 23:02:59,065 epoch 1 - iter 2084/5212 - loss 0.65585694 - time (sec): 100.50 - samples/sec: 1436.30 - lr: 0.000012 - momentum: 0.000000
2023-10-15 23:03:24,499 epoch 1 - iter 2605/5212 - loss 0.57208897 - time (sec): 125.93 - samples/sec: 1434.86 - lr: 0.000015 - momentum: 0.000000
2023-10-15 23:03:49,897 epoch 1 - iter 3126/5212 - loss 0.50902242 - time (sec): 151.33 - samples/sec: 1443.16 - lr: 0.000018 - momentum: 0.000000
2023-10-15 23:04:14,803 epoch 1 - iter 3647/5212 - loss 0.46908985 - time (sec): 176.24 - samples/sec: 1449.68 - lr: 0.000021 - momentum: 0.000000
2023-10-15 23:04:40,120 epoch 1 - iter 4168/5212 - loss 0.43684486 - time (sec): 201.56 - samples/sec: 1447.01 - lr: 0.000024 - momentum: 0.000000
2023-10-15 23:05:05,064 epoch 1 - iter 4689/5212 - loss 0.41352998 - time (sec): 226.50 - samples/sec: 1443.15 - lr: 0.000027 - momentum: 0.000000
2023-10-15 23:05:30,792 epoch 1 - iter 5210/5212 - loss 0.39086124 - time (sec): 252.23 - samples/sec: 1456.59 - lr: 0.000030 - momentum: 0.000000
2023-10-15 23:05:30,882 ----------------------------------------------------------------------------------------------------
2023-10-15 23:05:30,882 EPOCH 1 done: loss 0.3908 - lr: 0.000030
2023-10-15 23:05:36,703 DEV : loss 0.1551702618598938 - f1-score (micro avg)  0.3358
2023-10-15 23:05:36,733 saving best model
2023-10-15 23:05:37,187 ----------------------------------------------------------------------------------------------------
2023-10-15 23:06:01,685 epoch 2 - iter 521/5212 - loss 0.18051053 - time (sec): 24.50 - samples/sec: 1584.27 - lr: 0.000030 - momentum: 0.000000
2023-10-15 23:06:25,892 epoch 2 - iter 1042/5212 - loss 0.17423932 - time (sec): 48.70 - samples/sec: 1540.79 - lr: 0.000029 - momentum: 0.000000
2023-10-15 23:06:52,002 epoch 2 - iter 1563/5212 - loss 0.16721027 - time (sec): 74.81 - samples/sec: 1491.10 - lr: 0.000029 - momentum: 0.000000
2023-10-15 23:07:17,432 epoch 2 - iter 2084/5212 - loss 0.16343997 - time (sec): 100.24 - samples/sec: 1485.17 - lr: 0.000029 - momentum: 0.000000
2023-10-15 23:07:42,744 epoch 2 - iter 2605/5212 - loss 0.16700608 - time (sec): 125.56 - samples/sec: 1475.93 - lr: 0.000028 - momentum: 0.000000
2023-10-15 23:08:08,383 epoch 2 - iter 3126/5212 - loss 0.16521606 - time (sec): 151.19 - samples/sec: 1474.40 - lr: 0.000028 - momentum: 0.000000
2023-10-15 23:08:33,266 epoch 2 - iter 3647/5212 - loss 0.16752019 - time (sec): 176.08 - samples/sec: 1469.68 - lr: 0.000028 - momentum: 0.000000
2023-10-15 23:08:58,928 epoch 2 - iter 4168/5212 - loss 0.16512155 - time (sec): 201.74 - samples/sec: 1475.67 - lr: 0.000027 - momentum: 0.000000
2023-10-15 23:09:23,467 epoch 2 - iter 4689/5212 - loss 0.16625606 - time (sec): 226.28 - samples/sec: 1461.72 - lr: 0.000027 - momentum: 0.000000
2023-10-15 23:09:48,392 epoch 2 - iter 5210/5212 - loss 0.16642187 - time (sec): 251.20 - samples/sec: 1461.65 - lr: 0.000027 - momentum: 0.000000
2023-10-15 23:09:48,506 ----------------------------------------------------------------------------------------------------
2023-10-15 23:09:48,506 EPOCH 2 done: loss 0.1664 - lr: 0.000027
2023-10-15 23:09:57,611 DEV : loss 0.1759524643421173 - f1-score (micro avg)  0.345
2023-10-15 23:09:57,639 saving best model
2023-10-15 23:09:58,174 ----------------------------------------------------------------------------------------------------
2023-10-15 23:10:23,351 epoch 3 - iter 521/5212 - loss 0.13813373 - time (sec): 25.17 - samples/sec: 1436.32 - lr: 0.000026 - momentum: 0.000000
2023-10-15 23:10:47,860 epoch 3 - iter 1042/5212 - loss 0.12806272 - time (sec): 49.68 - samples/sec: 1386.53 - lr: 0.000026 - momentum: 0.000000
2023-10-15 23:11:12,719 epoch 3 - iter 1563/5212 - loss 0.12631653 - time (sec): 74.54 - samples/sec: 1385.07 - lr: 0.000026 - momentum: 0.000000
2023-10-15 23:11:37,395 epoch 3 - iter 2084/5212 - loss 0.12662569 - time (sec): 99.22 - samples/sec: 1401.10 - lr: 0.000025 - momentum: 0.000000
2023-10-15 23:12:02,305 epoch 3 - iter 2605/5212 - loss 0.12161677 - time (sec): 124.13 - samples/sec: 1420.15 - lr: 0.000025 - momentum: 0.000000
2023-10-15 23:12:27,636 epoch 3 - iter 3126/5212 - loss 0.12396518 - time (sec): 149.46 - samples/sec: 1431.89 - lr: 0.000025 - momentum: 0.000000
2023-10-15 23:12:52,997 epoch 3 - iter 3647/5212 - loss 0.12304621 - time (sec): 174.82 - samples/sec: 1440.00 - lr: 0.000024 - momentum: 0.000000
2023-10-15 23:13:18,146 epoch 3 - iter 4168/5212 - loss 0.12220986 - time (sec): 199.97 - samples/sec: 1440.78 - lr: 0.000024 - momentum: 0.000000
2023-10-15 23:13:43,519 epoch 3 - iter 4689/5212 - loss 0.12332670 - time (sec): 225.34 - samples/sec: 1451.48 - lr: 0.000024 - momentum: 0.000000
2023-10-15 23:14:09,472 epoch 3 - iter 5210/5212 - loss 0.12074165 - time (sec): 251.30 - samples/sec: 1462.04 - lr: 0.000023 - momentum: 0.000000
2023-10-15 23:14:09,556 ----------------------------------------------------------------------------------------------------
2023-10-15 23:14:09,556 EPOCH 3 done: loss 0.1207 - lr: 0.000023
2023-10-15 23:14:18,570 DEV : loss 0.2367718517780304 - f1-score (micro avg)  0.3203
2023-10-15 23:14:18,597 ----------------------------------------------------------------------------------------------------
2023-10-15 23:14:43,445 epoch 4 - iter 521/5212 - loss 0.06947319 - time (sec): 24.85 - samples/sec: 1424.08 - lr: 0.000023 - momentum: 0.000000
2023-10-15 23:15:08,358 epoch 4 - iter 1042/5212 - loss 0.07944653 - time (sec): 49.76 - samples/sec: 1444.97 - lr: 0.000023 - momentum: 0.000000
2023-10-15 23:15:33,949 epoch 4 - iter 1563/5212 - loss 0.07739611 - time (sec): 75.35 - samples/sec: 1475.25 - lr: 0.000022 - momentum: 0.000000
2023-10-15 23:15:59,604 epoch 4 - iter 2084/5212 - loss 0.07559714 - time (sec): 101.01 - samples/sec: 1484.11 - lr: 0.000022 - momentum: 0.000000
2023-10-15 23:16:24,880 epoch 4 - iter 2605/5212 - loss 0.07966053 - time (sec): 126.28 - samples/sec: 1482.15 - lr: 0.000022 - momentum: 0.000000
2023-10-15 23:16:49,889 epoch 4 - iter 3126/5212 - loss 0.08000675 - time (sec): 151.29 - samples/sec: 1474.73 - lr: 0.000021 - momentum: 0.000000
2023-10-15 23:17:14,928 epoch 4 - iter 3647/5212 - loss 0.07959465 - time (sec): 176.33 - samples/sec: 1469.41 - lr: 0.000021 - momentum: 0.000000
2023-10-15 23:17:40,303 epoch 4 - iter 4168/5212 - loss 0.07917318 - time (sec): 201.70 - samples/sec: 1472.61 - lr: 0.000021 - momentum: 0.000000
2023-10-15 23:18:04,896 epoch 4 - iter 4689/5212 - loss 0.08036966 - time (sec): 226.30 - samples/sec: 1465.66 - lr: 0.000020 - momentum: 0.000000
2023-10-15 23:18:29,915 epoch 4 - iter 5210/5212 - loss 0.07972575 - time (sec): 251.32 - samples/sec: 1461.10 - lr: 0.000020 - momentum: 0.000000
2023-10-15 23:18:30,013 ----------------------------------------------------------------------------------------------------
2023-10-15 23:18:30,013 EPOCH 4 done: loss 0.0797 - lr: 0.000020
2023-10-15 23:18:39,084 DEV : loss 0.31254345178604126 - f1-score (micro avg)  0.3583
2023-10-15 23:18:39,115 saving best model
2023-10-15 23:18:39,534 ----------------------------------------------------------------------------------------------------
2023-10-15 23:19:05,445 epoch 5 - iter 521/5212 - loss 0.05681791 - time (sec): 25.91 - samples/sec: 1487.06 - lr: 0.000020 - momentum: 0.000000
2023-10-15 23:19:30,420 epoch 5 - iter 1042/5212 - loss 0.05805110 - time (sec): 50.88 - samples/sec: 1439.01 - lr: 0.000019 - momentum: 0.000000
2023-10-15 23:19:55,322 epoch 5 - iter 1563/5212 - loss 0.05743481 - time (sec): 75.79 - samples/sec: 1436.75 - lr: 0.000019 - momentum: 0.000000
2023-10-15 23:20:21,081 epoch 5 - iter 2084/5212 - loss 0.05774534 - time (sec): 101.54 - samples/sec: 1454.04 - lr: 0.000019 - momentum: 0.000000
2023-10-15 23:20:46,653 epoch 5 - iter 2605/5212 - loss 0.05950250 - time (sec): 127.12 - samples/sec: 1461.36 - lr: 0.000018 - momentum: 0.000000
2023-10-15 23:21:10,671 epoch 5 - iter 3126/5212 - loss 0.06076355 - time (sec): 151.14 - samples/sec: 1465.65 - lr: 0.000018 - momentum: 0.000000
2023-10-15 23:21:35,899 epoch 5 - iter 3647/5212 - loss 0.06139140 - time (sec): 176.36 - samples/sec: 1456.99 - lr: 0.000018 - momentum: 0.000000
2023-10-15 23:22:01,629 epoch 5 - iter 4168/5212 - loss 0.06089801 - time (sec): 202.09 - samples/sec: 1460.17 - lr: 0.000017 - momentum: 0.000000
2023-10-15 23:22:26,887 epoch 5 - iter 4689/5212 - loss 0.05994322 - time (sec): 227.35 - samples/sec: 1457.48 - lr: 0.000017 - momentum: 0.000000
2023-10-15 23:22:52,006 epoch 5 - iter 5210/5212 - loss 0.05926432 - time (sec): 252.47 - samples/sec: 1455.20 - lr: 0.000017 - momentum: 0.000000
2023-10-15 23:22:52,093 ----------------------------------------------------------------------------------------------------
2023-10-15 23:22:52,093 EPOCH 5 done: loss 0.0593 - lr: 0.000017
2023-10-15 23:23:01,238 DEV : loss 0.3170505166053772 - f1-score (micro avg)  0.3805
2023-10-15 23:23:01,265 saving best model
2023-10-15 23:23:01,736 ----------------------------------------------------------------------------------------------------
2023-10-15 23:23:26,895 epoch 6 - iter 521/5212 - loss 0.03813445 - time (sec): 25.15 - samples/sec: 1449.45 - lr: 0.000016 - momentum: 0.000000
2023-10-15 23:23:51,946 epoch 6 - iter 1042/5212 - loss 0.04569272 - time (sec): 50.20 - samples/sec: 1468.13 - lr: 0.000016 - momentum: 0.000000
2023-10-15 23:24:16,335 epoch 6 - iter 1563/5212 - loss 0.04146612 - time (sec): 74.59 - samples/sec: 1480.52 - lr: 0.000016 - momentum: 0.000000
2023-10-15 23:24:40,611 epoch 6 - iter 2084/5212 - loss 0.04255530 - time (sec): 98.87 - samples/sec: 1494.73 - lr: 0.000015 - momentum: 0.000000
2023-10-15 23:25:04,819 epoch 6 - iter 2605/5212 - loss 0.04259498 - time (sec): 123.08 - samples/sec: 1489.41 - lr: 0.000015 - momentum: 0.000000
2023-10-15 23:25:30,224 epoch 6 - iter 3126/5212 - loss 0.04613026 - time (sec): 148.48 - samples/sec: 1486.72 - lr: 0.000015 - momentum: 0.000000
2023-10-15 23:25:55,884 epoch 6 - iter 3647/5212 - loss 0.04513844 - time (sec): 174.14 - samples/sec: 1489.69 - lr: 0.000014 - momentum: 0.000000
2023-10-15 23:26:21,370 epoch 6 - iter 4168/5212 - loss 0.04411342 - time (sec): 199.63 - samples/sec: 1488.13 - lr: 0.000014 - momentum: 0.000000
2023-10-15 23:26:46,746 epoch 6 - iter 4689/5212 - loss 0.04421772 - time (sec): 225.00 - samples/sec: 1481.69 - lr: 0.000014 - momentum: 0.000000
2023-10-15 23:27:11,521 epoch 6 - iter 5210/5212 - loss 0.04352330 - time (sec): 249.78 - samples/sec: 1469.27 - lr: 0.000013 - momentum: 0.000000
2023-10-15 23:27:11,673 ----------------------------------------------------------------------------------------------------
2023-10-15 23:27:11,673 EPOCH 6 done: loss 0.0435 - lr: 0.000013
2023-10-15 23:27:20,816 DEV : loss 0.43000391125679016 - f1-score (micro avg)  0.3767
2023-10-15 23:27:20,843 ----------------------------------------------------------------------------------------------------
2023-10-15 23:27:46,148 epoch 7 - iter 521/5212 - loss 0.04318070 - time (sec): 25.30 - samples/sec: 1379.43 - lr: 0.000013 - momentum: 0.000000
2023-10-15 23:28:10,477 epoch 7 - iter 1042/5212 - loss 0.03545449 - time (sec): 49.63 - samples/sec: 1443.93 - lr: 0.000013 - momentum: 0.000000
2023-10-15 23:28:35,968 epoch 7 - iter 1563/5212 - loss 0.03539388 - time (sec): 75.12 - samples/sec: 1442.06 - lr: 0.000012 - momentum: 0.000000
2023-10-15 23:29:01,785 epoch 7 - iter 2084/5212 - loss 0.03433173 - time (sec): 100.94 - samples/sec: 1463.11 - lr: 0.000012 - momentum: 0.000000
2023-10-15 23:29:27,015 epoch 7 - iter 2605/5212 - loss 0.03377936 - time (sec): 126.17 - samples/sec: 1454.97 - lr: 0.000012 - momentum: 0.000000
2023-10-15 23:29:52,232 epoch 7 - iter 3126/5212 - loss 0.03443693 - time (sec): 151.39 - samples/sec: 1457.71 - lr: 0.000011 - momentum: 0.000000
2023-10-15 23:30:17,653 epoch 7 - iter 3647/5212 - loss 0.03352446 - time (sec): 176.81 - samples/sec: 1463.31 - lr: 0.000011 - momentum: 0.000000
2023-10-15 23:30:42,898 epoch 7 - iter 4168/5212 - loss 0.03294138 - time (sec): 202.05 - samples/sec: 1465.08 - lr: 0.000011 - momentum: 0.000000
2023-10-15 23:31:07,935 epoch 7 - iter 4689/5212 - loss 0.03247734 - time (sec): 227.09 - samples/sec: 1452.48 - lr: 0.000010 - momentum: 0.000000
2023-10-15 23:31:33,498 epoch 7 - iter 5210/5212 - loss 0.03215531 - time (sec): 252.65 - samples/sec: 1453.51 - lr: 0.000010 - momentum: 0.000000
2023-10-15 23:31:33,617 ----------------------------------------------------------------------------------------------------
2023-10-15 23:31:33,617 EPOCH 7 done: loss 0.0321 - lr: 0.000010
2023-10-15 23:31:41,895 DEV : loss 0.43989118933677673 - f1-score (micro avg)  0.3651
2023-10-15 23:31:41,925 ----------------------------------------------------------------------------------------------------
2023-10-15 23:32:07,556 epoch 8 - iter 521/5212 - loss 0.02030482 - time (sec): 25.63 - samples/sec: 1523.60 - lr: 0.000010 - momentum: 0.000000
2023-10-15 23:32:32,868 epoch 8 - iter 1042/5212 - loss 0.02029145 - time (sec): 50.94 - samples/sec: 1512.19 - lr: 0.000009 - momentum: 0.000000
2023-10-15 23:32:58,850 epoch 8 - iter 1563/5212 - loss 0.02030246 - time (sec): 76.92 - samples/sec: 1477.19 - lr: 0.000009 - momentum: 0.000000
2023-10-15 23:33:24,355 epoch 8 - iter 2084/5212 - loss 0.02102241 - time (sec): 102.43 - samples/sec: 1467.10 - lr: 0.000009 - momentum: 0.000000
2023-10-15 23:33:49,462 epoch 8 - iter 2605/5212 - loss 0.02014836 - time (sec): 127.54 - samples/sec: 1461.76 - lr: 0.000008 - momentum: 0.000000
2023-10-15 23:34:14,915 epoch 8 - iter 3126/5212 - loss 0.02105174 - time (sec): 152.99 - samples/sec: 1464.66 - lr: 0.000008 - momentum: 0.000000
2023-10-15 23:34:39,835 epoch 8 - iter 3647/5212 - loss 0.02225207 - time (sec): 177.91 - samples/sec: 1459.81 - lr: 0.000008 - momentum: 0.000000
2023-10-15 23:35:04,921 epoch 8 - iter 4168/5212 - loss 0.02254482 - time (sec): 202.99 - samples/sec: 1458.26 - lr: 0.000007 - momentum: 0.000000
2023-10-15 23:35:29,983 epoch 8 - iter 4689/5212 - loss 0.02221020 - time (sec): 228.06 - samples/sec: 1448.89 - lr: 0.000007 - momentum: 0.000000
2023-10-15 23:35:55,253 epoch 8 - iter 5210/5212 - loss 0.02190849 - time (sec): 253.33 - samples/sec: 1450.23 - lr: 0.000007 - momentum: 0.000000
2023-10-15 23:35:55,335 ----------------------------------------------------------------------------------------------------
2023-10-15 23:35:55,335 EPOCH 8 done: loss 0.0219 - lr: 0.000007
2023-10-15 23:36:03,613 DEV : loss 0.4804233908653259 - f1-score (micro avg)  0.3607
2023-10-15 23:36:03,641 ----------------------------------------------------------------------------------------------------
2023-10-15 23:36:28,832 epoch 9 - iter 521/5212 - loss 0.01165488 - time (sec): 25.19 - samples/sec: 1499.73 - lr: 0.000006 - momentum: 0.000000
2023-10-15 23:36:53,661 epoch 9 - iter 1042/5212 - loss 0.01448827 - time (sec): 50.02 - samples/sec: 1454.98 - lr: 0.000006 - momentum: 0.000000
2023-10-15 23:37:18,548 epoch 9 - iter 1563/5212 - loss 0.01652036 - time (sec): 74.91 - samples/sec: 1434.87 - lr: 0.000006 - momentum: 0.000000
2023-10-15 23:37:43,820 epoch 9 - iter 2084/5212 - loss 0.01722537 - time (sec): 100.18 - samples/sec: 1429.82 - lr: 0.000005 - momentum: 0.000000
2023-10-15 23:38:09,602 epoch 9 - iter 2605/5212 - loss 0.01646965 - time (sec): 125.96 - samples/sec: 1441.59 - lr: 0.000005 - momentum: 0.000000
2023-10-15 23:38:35,803 epoch 9 - iter 3126/5212 - loss 0.01574425 - time (sec): 152.16 - samples/sec: 1440.33 - lr: 0.000005 - momentum: 0.000000
2023-10-15 23:39:01,121 epoch 9 - iter 3647/5212 - loss 0.01557122 - time (sec): 177.48 - samples/sec: 1439.93 - lr: 0.000004 - momentum: 0.000000
2023-10-15 23:39:26,348 epoch 9 - iter 4168/5212 - loss 0.01516313 - time (sec): 202.71 - samples/sec: 1446.55 - lr: 0.000004 - momentum: 0.000000
2023-10-15 23:39:51,863 epoch 9 - iter 4689/5212 - loss 0.01529793 - time (sec): 228.22 - samples/sec: 1450.74 - lr: 0.000004 - momentum: 0.000000
2023-10-15 23:40:17,189 epoch 9 - iter 5210/5212 - loss 0.01498075 - time (sec): 253.55 - samples/sec: 1448.86 - lr: 0.000003 - momentum: 0.000000
2023-10-15 23:40:17,276 ----------------------------------------------------------------------------------------------------
2023-10-15 23:40:17,277 EPOCH 9 done: loss 0.0150 - lr: 0.000003
2023-10-15 23:40:25,600 DEV : loss 0.4658600091934204 - f1-score (micro avg)  0.3626
2023-10-15 23:40:25,629 ----------------------------------------------------------------------------------------------------
2023-10-15 23:40:50,716 epoch 10 - iter 521/5212 - loss 0.01097631 - time (sec): 25.09 - samples/sec: 1420.87 - lr: 0.000003 - momentum: 0.000000
2023-10-15 23:41:16,378 epoch 10 - iter 1042/5212 - loss 0.01039988 - time (sec): 50.75 - samples/sec: 1410.82 - lr: 0.000003 - momentum: 0.000000
2023-10-15 23:41:41,327 epoch 10 - iter 1563/5212 - loss 0.01008363 - time (sec): 75.70 - samples/sec: 1411.51 - lr: 0.000002 - momentum: 0.000000
2023-10-15 23:42:06,378 epoch 10 - iter 2084/5212 - loss 0.01093550 - time (sec): 100.75 - samples/sec: 1418.35 - lr: 0.000002 - momentum: 0.000000
2023-10-15 23:42:30,674 epoch 10 - iter 2605/5212 - loss 0.01038744 - time (sec): 125.04 - samples/sec: 1425.51 - lr: 0.000002 - momentum: 0.000000
2023-10-15 23:42:56,355 epoch 10 - iter 3126/5212 - loss 0.01029665 - time (sec): 150.73 - samples/sec: 1439.37 - lr: 0.000001 - momentum: 0.000000
2023-10-15 23:43:21,935 epoch 10 - iter 3647/5212 - loss 0.01072722 - time (sec): 176.30 - samples/sec: 1448.23 - lr: 0.000001 - momentum: 0.000000
2023-10-15 23:43:47,826 epoch 10 - iter 4168/5212 - loss 0.01038615 - time (sec): 202.20 - samples/sec: 1455.01 - lr: 0.000001 - momentum: 0.000000
2023-10-15 23:44:14,058 epoch 10 - iter 4689/5212 - loss 0.01029547 - time (sec): 228.43 - samples/sec: 1451.80 - lr: 0.000000 - momentum: 0.000000
2023-10-15 23:44:38,931 epoch 10 - iter 5210/5212 - loss 0.01030819 - time (sec): 253.30 - samples/sec: 1450.34 - lr: 0.000000 - momentum: 0.000000
2023-10-15 23:44:39,022 ----------------------------------------------------------------------------------------------------
2023-10-15 23:44:39,022 EPOCH 10 done: loss 0.0103 - lr: 0.000000
2023-10-15 23:44:47,300 DEV : loss 0.4685536324977875 - f1-score (micro avg)  0.3793
2023-10-15 23:44:47,791 ----------------------------------------------------------------------------------------------------
2023-10-15 23:44:47,793 Loading model from best epoch ...
2023-10-15 23:44:49,249 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
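The 17 tags above are the BIOES encoding of the corpus's four entity types (LOC, PER, ORG, HumanProd) plus the outside tag O: 4 types × 4 positional prefixes + 1 = 17, which also matches the tagger's final Linear layer (out_features=17). A quick sanity check in plain Python (not Flair's Dictionary class):

```python
# Rebuild the BIOES tagset for the four entity types in the log and
# confirm it yields the 17 tags the SequenceTagger predicts.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
prefixes = ["S", "B", "E", "I"]  # Single, Begin, End, Inside

tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in prefixes]
```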
2023-10-15 23:45:04,628 
Results:
- F-score (micro) 0.4903
- F-score (macro) 0.334
- Accuracy 0.3303

By class:
              precision    recall  f1-score   support

         LOC     0.5200    0.6425    0.5748      1214
         PER     0.3872    0.5223    0.4447       808
         ORG     0.2901    0.3484    0.3166       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4395    0.5544    0.4903      2390
   macro avg     0.2993    0.3783    0.3340      2390
weighted avg     0.4379    0.5544    0.4891      2390

2023-10-15 23:45:04,629 ----------------------------------------------------------------------------------------------------
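The aggregate rows of the final table follow directly from the per-class rows: micro avg F1 is the harmonic mean of the pooled precision and recall, macro avg is the unweighted mean of the class F1 scores, and weighted avg weights each class F1 by its support. A short sketch verifying the logged values (figures are copied from the table above; the helper names are illustrative):

```python
# Per-class (precision, recall, f1, support) copied from the evaluation table.
by_class = {
    "LOC":       (0.5200, 0.6425, 0.5748, 1214),
    "PER":       (0.3872, 0.5223, 0.4447,  808),
    "ORG":       (0.2901, 0.3484, 0.3166,  353),
    "HumanProd": (0.0000, 0.0000, 0.0000,   15),
}

def f1(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

total_support = sum(s for *_, s in by_class.values())            # 2390 entities
micro_f1 = f1(0.4395, 0.5544)                                     # pooled P/R from the micro avg row
macro_f1 = sum(c[2] for c in by_class.values()) / len(by_class)   # unweighted mean of class F1
weighted_f1 = sum(c[2] * c[3] for c in by_class.values()) / total_support
```

Note how the 15-support HumanProd class (F1 0.0) drags the macro average far below the micro average, while barely moving the weighted one.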