2023-10-15 23:01:18,561 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,562 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
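The final `Linear(in_features=768, out_features=17)` head sizes the output to the tag dictionary: this corpus has four entity types (LOC, PER, ORG, HumanProd), and under the BIOES scheme they expand to four tags each (S-, B-, E-, I-) plus one shared O tag. A quick illustrative sketch of that count (not Flair code):

```python
# Why the SequenceTagger head has 17 outputs: four entity types,
# each expanded into BIOES tags, plus the shared "O" tag.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
bioes_prefixes = ["S", "B", "E", "I"]  # Single, Begin, End, Inside

tag_dictionary = ["O"] + [f"{p}-{t}" for t in entity_types for p in bioes_prefixes]

print(len(tag_dictionary))  # 17
print(tag_dictionary[:5])   # ['O', 'S-LOC', 'B-LOC', 'E-LOC', 'I-LOC']
```

This matches the tag dictionary printed at the end of the log ("Dictionary with 17 tags: O, S-LOC, B-LOC, ...").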
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Train:  20847 sentences
2023-10-15 23:01:18,563         (train_with_dev=False, train_with_test=False)
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Training Params:
2023-10-15 23:01:18,563  - learning_rate: "3e-05"
2023-10-15 23:01:18,563  - mini_batch_size: "4"
2023-10-15 23:01:18,563  - max_epochs: "10"
2023-10-15 23:01:18,563  - shuffle: "True"
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Plugins:
2023-10-15 23:01:18,563  - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 23:01:18,563  - metric: "('micro avg', 'f1-score')"
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Computation:
2023-10-15 23:01:18,563  - compute on device: cuda:0
2023-10-15 23:01:18,563  - embedding storage: none
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:18,563 ----------------------------------------------------------------------------------------------------
2023-10-15 23:01:43,495 epoch 1 - iter 521/5212 - loss 1.57816846 - time (sec): 24.93 - samples/sec:
1428.73 - lr: 0.000003 - momentum: 0.000000
2023-10-15 23:02:08,663 epoch 1 - iter 1042/5212 - loss 1.00464763 - time (sec): 50.10 - samples/sec: 1457.84 - lr: 0.000006 - momentum: 0.000000
2023-10-15 23:02:33,770 epoch 1 - iter 1563/5212 - loss 0.77914777 - time (sec): 75.21 - samples/sec: 1443.74 - lr: 0.000009 - momentum: 0.000000
2023-10-15 23:02:59,065 epoch 1 - iter 2084/5212 - loss 0.65585694 - time (sec): 100.50 - samples/sec: 1436.30 - lr: 0.000012 - momentum: 0.000000
2023-10-15 23:03:24,499 epoch 1 - iter 2605/5212 - loss 0.57208897 - time (sec): 125.93 - samples/sec: 1434.86 - lr: 0.000015 - momentum: 0.000000
2023-10-15 23:03:49,897 epoch 1 - iter 3126/5212 - loss 0.50902242 - time (sec): 151.33 - samples/sec: 1443.16 - lr: 0.000018 - momentum: 0.000000
2023-10-15 23:04:14,803 epoch 1 - iter 3647/5212 - loss 0.46908985 - time (sec): 176.24 - samples/sec: 1449.68 - lr: 0.000021 - momentum: 0.000000
2023-10-15 23:04:40,120 epoch 1 - iter 4168/5212 - loss 0.43684486 - time (sec): 201.56 - samples/sec: 1447.01 - lr: 0.000024 - momentum: 0.000000
2023-10-15 23:05:05,064 epoch 1 - iter 4689/5212 - loss 0.41352998 - time (sec): 226.50 - samples/sec: 1443.15 - lr: 0.000027 - momentum: 0.000000
2023-10-15 23:05:30,792 epoch 1 - iter 5210/5212 - loss 0.39086124 - time (sec): 252.23 - samples/sec: 1456.59 - lr: 0.000030 - momentum: 0.000000
2023-10-15 23:05:30,882 ----------------------------------------------------------------------------------------------------
2023-10-15 23:05:30,882 EPOCH 1 done: loss 0.3908 - lr: 0.000030
2023-10-15 23:05:36,703 DEV : loss 0.1551702618598938 - f1-score (micro avg)  0.3358
2023-10-15 23:05:36,733 saving best model
2023-10-15 23:05:37,187 ----------------------------------------------------------------------------------------------------
2023-10-15 23:06:01,685 epoch 2 - iter 521/5212 - loss 0.18051053 - time (sec): 24.50 - samples/sec: 1584.27 - lr: 0.000030 - momentum: 0.000000
2023-10-15 23:06:25,892 epoch 2 - iter 1042/5212
- loss 0.17423932 - time (sec): 48.70 - samples/sec: 1540.79 - lr: 0.000029 - momentum: 0.000000
2023-10-15 23:06:52,002 epoch 2 - iter 1563/5212 - loss 0.16721027 - time (sec): 74.81 - samples/sec: 1491.10 - lr: 0.000029 - momentum: 0.000000
2023-10-15 23:07:17,432 epoch 2 - iter 2084/5212 - loss 0.16343997 - time (sec): 100.24 - samples/sec: 1485.17 - lr: 0.000029 - momentum: 0.000000
2023-10-15 23:07:42,744 epoch 2 - iter 2605/5212 - loss 0.16700608 - time (sec): 125.56 - samples/sec: 1475.93 - lr: 0.000028 - momentum: 0.000000
2023-10-15 23:08:08,383 epoch 2 - iter 3126/5212 - loss 0.16521606 - time (sec): 151.19 - samples/sec: 1474.40 - lr: 0.000028 - momentum: 0.000000
2023-10-15 23:08:33,266 epoch 2 - iter 3647/5212 - loss 0.16752019 - time (sec): 176.08 - samples/sec: 1469.68 - lr: 0.000028 - momentum: 0.000000
2023-10-15 23:08:58,928 epoch 2 - iter 4168/5212 - loss 0.16512155 - time (sec): 201.74 - samples/sec: 1475.67 - lr: 0.000027 - momentum: 0.000000
2023-10-15 23:09:23,467 epoch 2 - iter 4689/5212 - loss 0.16625606 - time (sec): 226.28 - samples/sec: 1461.72 - lr: 0.000027 - momentum: 0.000000
2023-10-15 23:09:48,392 epoch 2 - iter 5210/5212 - loss 0.16642187 - time (sec): 251.20 - samples/sec: 1461.65 - lr: 0.000027 - momentum: 0.000000
2023-10-15 23:09:48,506 ----------------------------------------------------------------------------------------------------
2023-10-15 23:09:48,506 EPOCH 2 done: loss 0.1664 - lr: 0.000027
2023-10-15 23:09:57,611 DEV : loss 0.1759524643421173 - f1-score (micro avg)  0.345
2023-10-15 23:09:57,639 saving best model
2023-10-15 23:09:58,174 ----------------------------------------------------------------------------------------------------
2023-10-15 23:10:23,351 epoch 3 - iter 521/5212 - loss 0.13813373 - time (sec): 25.17 - samples/sec: 1436.32 - lr: 0.000026 - momentum: 0.000000
2023-10-15 23:10:47,860 epoch 3 - iter 1042/5212 - loss 0.12806272 - time (sec): 49.68 - samples/sec: 1386.53 - lr: 0.000026 - momentum:
0.000000
2023-10-15 23:11:12,719 epoch 3 - iter 1563/5212 - loss 0.12631653 - time (sec): 74.54 - samples/sec: 1385.07 - lr: 0.000026 - momentum: 0.000000
2023-10-15 23:11:37,395 epoch 3 - iter 2084/5212 - loss 0.12662569 - time (sec): 99.22 - samples/sec: 1401.10 - lr: 0.000025 - momentum: 0.000000
2023-10-15 23:12:02,305 epoch 3 - iter 2605/5212 - loss 0.12161677 - time (sec): 124.13 - samples/sec: 1420.15 - lr: 0.000025 - momentum: 0.000000
2023-10-15 23:12:27,636 epoch 3 - iter 3126/5212 - loss 0.12396518 - time (sec): 149.46 - samples/sec: 1431.89 - lr: 0.000025 - momentum: 0.000000
2023-10-15 23:12:52,997 epoch 3 - iter 3647/5212 - loss 0.12304621 - time (sec): 174.82 - samples/sec: 1440.00 - lr: 0.000024 - momentum: 0.000000
2023-10-15 23:13:18,146 epoch 3 - iter 4168/5212 - loss 0.12220986 - time (sec): 199.97 - samples/sec: 1440.78 - lr: 0.000024 - momentum: 0.000000
2023-10-15 23:13:43,519 epoch 3 - iter 4689/5212 - loss 0.12332670 - time (sec): 225.34 - samples/sec: 1451.48 - lr: 0.000024 - momentum: 0.000000
2023-10-15 23:14:09,472 epoch 3 - iter 5210/5212 - loss 0.12074165 - time (sec): 251.30 - samples/sec: 1462.04 - lr: 0.000023 - momentum: 0.000000
2023-10-15 23:14:09,556 ----------------------------------------------------------------------------------------------------
2023-10-15 23:14:09,556 EPOCH 3 done: loss 0.1207 - lr: 0.000023
2023-10-15 23:14:18,570 DEV : loss 0.2367718517780304 - f1-score (micro avg)  0.3203
2023-10-15 23:14:18,597 ----------------------------------------------------------------------------------------------------
2023-10-15 23:14:43,445 epoch 4 - iter 521/5212 - loss 0.06947319 - time (sec): 24.85 - samples/sec: 1424.08 - lr: 0.000023 - momentum: 0.000000
2023-10-15 23:15:08,358 epoch 4 - iter 1042/5212 - loss 0.07944653 - time (sec): 49.76 - samples/sec: 1444.97 - lr: 0.000023 - momentum: 0.000000
2023-10-15 23:15:33,949 epoch 4 - iter 1563/5212 - loss 0.07739611 - time (sec): 75.35 - samples/sec: 1475.25 - lr: 0.000022 -
momentum: 0.000000
2023-10-15 23:15:59,604 epoch 4 - iter 2084/5212 - loss 0.07559714 - time (sec): 101.01 - samples/sec: 1484.11 - lr: 0.000022 - momentum: 0.000000
2023-10-15 23:16:24,880 epoch 4 - iter 2605/5212 - loss 0.07966053 - time (sec): 126.28 - samples/sec: 1482.15 - lr: 0.000022 - momentum: 0.000000
2023-10-15 23:16:49,889 epoch 4 - iter 3126/5212 - loss 0.08000675 - time (sec): 151.29 - samples/sec: 1474.73 - lr: 0.000021 - momentum: 0.000000
2023-10-15 23:17:14,928 epoch 4 - iter 3647/5212 - loss 0.07959465 - time (sec): 176.33 - samples/sec: 1469.41 - lr: 0.000021 - momentum: 0.000000
2023-10-15 23:17:40,303 epoch 4 - iter 4168/5212 - loss 0.07917318 - time (sec): 201.70 - samples/sec: 1472.61 - lr: 0.000021 - momentum: 0.000000
2023-10-15 23:18:04,896 epoch 4 - iter 4689/5212 - loss 0.08036966 - time (sec): 226.30 - samples/sec: 1465.66 - lr: 0.000020 - momentum: 0.000000
2023-10-15 23:18:29,915 epoch 4 - iter 5210/5212 - loss 0.07972575 - time (sec): 251.32 - samples/sec: 1461.10 - lr: 0.000020 - momentum: 0.000000
2023-10-15 23:18:30,013 ----------------------------------------------------------------------------------------------------
2023-10-15 23:18:30,013 EPOCH 4 done: loss 0.0797 - lr: 0.000020
2023-10-15 23:18:39,084 DEV : loss 0.31254345178604126 - f1-score (micro avg)  0.3583
2023-10-15 23:18:39,115 saving best model
2023-10-15 23:18:39,534 ----------------------------------------------------------------------------------------------------
2023-10-15 23:19:05,445 epoch 5 - iter 521/5212 - loss 0.05681791 - time (sec): 25.91 - samples/sec: 1487.06 - lr: 0.000020 - momentum: 0.000000
2023-10-15 23:19:30,420 epoch 5 - iter 1042/5212 - loss 0.05805110 - time (sec): 50.88 - samples/sec: 1439.01 - lr: 0.000019 - momentum: 0.000000
2023-10-15 23:19:55,322 epoch 5 - iter 1563/5212 - loss 0.05743481 - time (sec): 75.79 - samples/sec: 1436.75 - lr: 0.000019 - momentum: 0.000000
2023-10-15 23:20:21,081 epoch 5 - iter 2084/5212 - loss 0.05774534 -
time (sec): 101.54 - samples/sec: 1454.04 - lr: 0.000019 - momentum: 0.000000
2023-10-15 23:20:46,653 epoch 5 - iter 2605/5212 - loss 0.05950250 - time (sec): 127.12 - samples/sec: 1461.36 - lr: 0.000018 - momentum: 0.000000
2023-10-15 23:21:10,671 epoch 5 - iter 3126/5212 - loss 0.06076355 - time (sec): 151.14 - samples/sec: 1465.65 - lr: 0.000018 - momentum: 0.000000
2023-10-15 23:21:35,899 epoch 5 - iter 3647/5212 - loss 0.06139140 - time (sec): 176.36 - samples/sec: 1456.99 - lr: 0.000018 - momentum: 0.000000
2023-10-15 23:22:01,629 epoch 5 - iter 4168/5212 - loss 0.06089801 - time (sec): 202.09 - samples/sec: 1460.17 - lr: 0.000017 - momentum: 0.000000
2023-10-15 23:22:26,887 epoch 5 - iter 4689/5212 - loss 0.05994322 - time (sec): 227.35 - samples/sec: 1457.48 - lr: 0.000017 - momentum: 0.000000
2023-10-15 23:22:52,006 epoch 5 - iter 5210/5212 - loss 0.05926432 - time (sec): 252.47 - samples/sec: 1455.20 - lr: 0.000017 - momentum: 0.000000
2023-10-15 23:22:52,093 ----------------------------------------------------------------------------------------------------
2023-10-15 23:22:52,093 EPOCH 5 done: loss 0.0593 - lr: 0.000017
2023-10-15 23:23:01,238 DEV : loss 0.3170505166053772 - f1-score (micro avg)  0.3805
2023-10-15 23:23:01,265 saving best model
2023-10-15 23:23:01,736 ----------------------------------------------------------------------------------------------------
2023-10-15 23:23:26,895 epoch 6 - iter 521/5212 - loss 0.03813445 - time (sec): 25.15 - samples/sec: 1449.45 - lr: 0.000016 - momentum: 0.000000
2023-10-15 23:23:51,946 epoch 6 - iter 1042/5212 - loss 0.04569272 - time (sec): 50.20 - samples/sec: 1468.13 - lr: 0.000016 - momentum: 0.000000
2023-10-15 23:24:16,335 epoch 6 - iter 1563/5212 - loss 0.04146612 - time (sec): 74.59 - samples/sec: 1480.52 - lr: 0.000016 - momentum: 0.000000
2023-10-15 23:24:40,611 epoch 6 - iter 2084/5212 - loss 0.04255530 - time (sec): 98.87 - samples/sec: 1494.73 - lr: 0.000015 - momentum: 0.000000
2023-10-15
23:25:04,819 epoch 6 - iter 2605/5212 - loss 0.04259498 - time (sec): 123.08 - samples/sec: 1489.41 - lr: 0.000015 - momentum: 0.000000
2023-10-15 23:25:30,224 epoch 6 - iter 3126/5212 - loss 0.04613026 - time (sec): 148.48 - samples/sec: 1486.72 - lr: 0.000015 - momentum: 0.000000
2023-10-15 23:25:55,884 epoch 6 - iter 3647/5212 - loss 0.04513844 - time (sec): 174.14 - samples/sec: 1489.69 - lr: 0.000014 - momentum: 0.000000
2023-10-15 23:26:21,370 epoch 6 - iter 4168/5212 - loss 0.04411342 - time (sec): 199.63 - samples/sec: 1488.13 - lr: 0.000014 - momentum: 0.000000
2023-10-15 23:26:46,746 epoch 6 - iter 4689/5212 - loss 0.04421772 - time (sec): 225.00 - samples/sec: 1481.69 - lr: 0.000014 - momentum: 0.000000
2023-10-15 23:27:11,521 epoch 6 - iter 5210/5212 - loss 0.04352330 - time (sec): 249.78 - samples/sec: 1469.27 - lr: 0.000013 - momentum: 0.000000
2023-10-15 23:27:11,673 ----------------------------------------------------------------------------------------------------
2023-10-15 23:27:11,673 EPOCH 6 done: loss 0.0435 - lr: 0.000013
2023-10-15 23:27:20,816 DEV : loss 0.43000391125679016 - f1-score (micro avg)  0.3767
2023-10-15 23:27:20,843 ----------------------------------------------------------------------------------------------------
2023-10-15 23:27:46,148 epoch 7 - iter 521/5212 - loss 0.04318070 - time (sec): 25.30 - samples/sec: 1379.43 - lr: 0.000013 - momentum: 0.000000
2023-10-15 23:28:10,477 epoch 7 - iter 1042/5212 - loss 0.03545449 - time (sec): 49.63 - samples/sec: 1443.93 - lr: 0.000013 - momentum: 0.000000
2023-10-15 23:28:35,968 epoch 7 - iter 1563/5212 - loss 0.03539388 - time (sec): 75.12 - samples/sec: 1442.06 - lr: 0.000012 - momentum: 0.000000
2023-10-15 23:29:01,785 epoch 7 - iter 2084/5212 - loss 0.03433173 - time (sec): 100.94 - samples/sec: 1463.11 - lr: 0.000012 - momentum: 0.000000
2023-10-15 23:29:27,015 epoch 7 - iter 2605/5212 - loss 0.03377936 - time (sec): 126.17 - samples/sec: 1454.97 - lr: 0.000012 - momentum:
0.000000
2023-10-15 23:29:52,232 epoch 7 - iter 3126/5212 - loss 0.03443693 - time (sec): 151.39 - samples/sec: 1457.71 - lr: 0.000011 - momentum: 0.000000
2023-10-15 23:30:17,653 epoch 7 - iter 3647/5212 - loss 0.03352446 - time (sec): 176.81 - samples/sec: 1463.31 - lr: 0.000011 - momentum: 0.000000
2023-10-15 23:30:42,898 epoch 7 - iter 4168/5212 - loss 0.03294138 - time (sec): 202.05 - samples/sec: 1465.08 - lr: 0.000011 - momentum: 0.000000
2023-10-15 23:31:07,935 epoch 7 - iter 4689/5212 - loss 0.03247734 - time (sec): 227.09 - samples/sec: 1452.48 - lr: 0.000010 - momentum: 0.000000
2023-10-15 23:31:33,498 epoch 7 - iter 5210/5212 - loss 0.03215531 - time (sec): 252.65 - samples/sec: 1453.51 - lr: 0.000010 - momentum: 0.000000
2023-10-15 23:31:33,617 ----------------------------------------------------------------------------------------------------
2023-10-15 23:31:33,617 EPOCH 7 done: loss 0.0321 - lr: 0.000010
2023-10-15 23:31:41,895 DEV : loss 0.43989118933677673 - f1-score (micro avg)  0.3651
2023-10-15 23:31:41,925 ----------------------------------------------------------------------------------------------------
2023-10-15 23:32:07,556 epoch 8 - iter 521/5212 - loss 0.02030482 - time (sec): 25.63 - samples/sec: 1523.60 - lr: 0.000010 - momentum: 0.000000
2023-10-15 23:32:32,868 epoch 8 - iter 1042/5212 - loss 0.02029145 - time (sec): 50.94 - samples/sec: 1512.19 - lr: 0.000009 - momentum: 0.000000
2023-10-15 23:32:58,850 epoch 8 - iter 1563/5212 - loss 0.02030246 - time (sec): 76.92 - samples/sec: 1477.19 - lr: 0.000009 - momentum: 0.000000
2023-10-15 23:33:24,355 epoch 8 - iter 2084/5212 - loss 0.02102241 - time (sec): 102.43 - samples/sec: 1467.10 - lr: 0.000009 - momentum: 0.000000
2023-10-15 23:33:49,462 epoch 8 - iter 2605/5212 - loss 0.02014836 - time (sec): 127.54 - samples/sec: 1461.76 - lr: 0.000008 - momentum: 0.000000
2023-10-15 23:34:14,915 epoch 8 - iter 3126/5212 - loss 0.02105174 - time (sec): 152.99 - samples/sec: 1464.66 - lr:
0.000008 - momentum: 0.000000
2023-10-15 23:34:39,835 epoch 8 - iter 3647/5212 - loss 0.02225207 - time (sec): 177.91 - samples/sec: 1459.81 - lr: 0.000008 - momentum: 0.000000
2023-10-15 23:35:04,921 epoch 8 - iter 4168/5212 - loss 0.02254482 - time (sec): 202.99 - samples/sec: 1458.26 - lr: 0.000007 - momentum: 0.000000
2023-10-15 23:35:29,983 epoch 8 - iter 4689/5212 - loss 0.02221020 - time (sec): 228.06 - samples/sec: 1448.89 - lr: 0.000007 - momentum: 0.000000
2023-10-15 23:35:55,253 epoch 8 - iter 5210/5212 - loss 0.02190849 - time (sec): 253.33 - samples/sec: 1450.23 - lr: 0.000007 - momentum: 0.000000
2023-10-15 23:35:55,335 ----------------------------------------------------------------------------------------------------
2023-10-15 23:35:55,335 EPOCH 8 done: loss 0.0219 - lr: 0.000007
2023-10-15 23:36:03,613 DEV : loss 0.4804233908653259 - f1-score (micro avg)  0.3607
2023-10-15 23:36:03,641 ----------------------------------------------------------------------------------------------------
2023-10-15 23:36:28,832 epoch 9 - iter 521/5212 - loss 0.01165488 - time (sec): 25.19 - samples/sec: 1499.73 - lr: 0.000006 - momentum: 0.000000
2023-10-15 23:36:53,661 epoch 9 - iter 1042/5212 - loss 0.01448827 - time (sec): 50.02 - samples/sec: 1454.98 - lr: 0.000006 - momentum: 0.000000
2023-10-15 23:37:18,548 epoch 9 - iter 1563/5212 - loss 0.01652036 - time (sec): 74.91 - samples/sec: 1434.87 - lr: 0.000006 - momentum: 0.000000
2023-10-15 23:37:43,820 epoch 9 - iter 2084/5212 - loss 0.01722537 - time (sec): 100.18 - samples/sec: 1429.82 - lr: 0.000005 - momentum: 0.000000
2023-10-15 23:38:09,602 epoch 9 - iter 2605/5212 - loss 0.01646965 - time (sec): 125.96 - samples/sec: 1441.59 - lr: 0.000005 - momentum: 0.000000
2023-10-15 23:38:35,803 epoch 9 - iter 3126/5212 - loss 0.01574425 - time (sec): 152.16 - samples/sec: 1440.33 - lr: 0.000005 - momentum: 0.000000
2023-10-15 23:39:01,121 epoch 9 - iter 3647/5212 - loss 0.01557122 - time (sec): 177.48 - samples/sec:
1439.93 - lr: 0.000004 - momentum: 0.000000
2023-10-15 23:39:26,348 epoch 9 - iter 4168/5212 - loss 0.01516313 - time (sec): 202.71 - samples/sec: 1446.55 - lr: 0.000004 - momentum: 0.000000
2023-10-15 23:39:51,863 epoch 9 - iter 4689/5212 - loss 0.01529793 - time (sec): 228.22 - samples/sec: 1450.74 - lr: 0.000004 - momentum: 0.000000
2023-10-15 23:40:17,189 epoch 9 - iter 5210/5212 - loss 0.01498075 - time (sec): 253.55 - samples/sec: 1448.86 - lr: 0.000003 - momentum: 0.000000
2023-10-15 23:40:17,276 ----------------------------------------------------------------------------------------------------
2023-10-15 23:40:17,277 EPOCH 9 done: loss 0.0150 - lr: 0.000003
2023-10-15 23:40:25,600 DEV : loss 0.4658600091934204 - f1-score (micro avg)  0.3626
2023-10-15 23:40:25,629 ----------------------------------------------------------------------------------------------------
2023-10-15 23:40:50,716 epoch 10 - iter 521/5212 - loss 0.01097631 - time (sec): 25.09 - samples/sec: 1420.87 - lr: 0.000003 - momentum: 0.000000
2023-10-15 23:41:16,378 epoch 10 - iter 1042/5212 - loss 0.01039988 - time (sec): 50.75 - samples/sec: 1410.82 - lr: 0.000003 - momentum: 0.000000
2023-10-15 23:41:41,327 epoch 10 - iter 1563/5212 - loss 0.01008363 - time (sec): 75.70 - samples/sec: 1411.51 - lr: 0.000002 - momentum: 0.000000
2023-10-15 23:42:06,378 epoch 10 - iter 2084/5212 - loss 0.01093550 - time (sec): 100.75 - samples/sec: 1418.35 - lr: 0.000002 - momentum: 0.000000
2023-10-15 23:42:30,674 epoch 10 - iter 2605/5212 - loss 0.01038744 - time (sec): 125.04 - samples/sec: 1425.51 - lr: 0.000002 - momentum: 0.000000
2023-10-15 23:42:56,355 epoch 10 - iter 3126/5212 - loss 0.01029665 - time (sec): 150.73 - samples/sec: 1439.37 - lr: 0.000001 - momentum: 0.000000
2023-10-15 23:43:21,935 epoch 10 - iter 3647/5212 - loss 0.01072722 - time (sec): 176.30 - samples/sec: 1448.23 - lr: 0.000001 - momentum: 0.000000
2023-10-15 23:43:47,826 epoch 10 - iter 4168/5212 - loss 0.01038615 - time (sec):
202.20 - samples/sec: 1455.01 - lr: 0.000001 - momentum: 0.000000
2023-10-15 23:44:14,058 epoch 10 - iter 4689/5212 - loss 0.01029547 - time (sec): 228.43 - samples/sec: 1451.80 - lr: 0.000000 - momentum: 0.000000
2023-10-15 23:44:38,931 epoch 10 - iter 5210/5212 - loss 0.01030819 - time (sec): 253.30 - samples/sec: 1450.34 - lr: 0.000000 - momentum: 0.000000
2023-10-15 23:44:39,022 ----------------------------------------------------------------------------------------------------
2023-10-15 23:44:39,022 EPOCH 10 done: loss 0.0103 - lr: 0.000000
2023-10-15 23:44:47,300 DEV : loss 0.4685536324977875 - f1-score (micro avg)  0.3793
2023-10-15 23:44:47,791 ----------------------------------------------------------------------------------------------------
2023-10-15 23:44:47,793 Loading model from best epoch ...
2023-10-15 23:44:49,249 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-15 23:45:04,628
Results:
- F-score (micro) 0.4903
- F-score (macro) 0.334
- Accuracy 0.3303

By class:
              precision    recall  f1-score   support

         LOC     0.5200    0.6425    0.5748      1214
         PER     0.3872    0.5223    0.4447       808
         ORG     0.2901    0.3484    0.3166       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4395    0.5544    0.4903      2390
   macro avg     0.2993    0.3783    0.3340      2390
weighted avg     0.4379    0.5544    0.4891      2390

2023-10-15 23:45:04,629 ----------------------------------------------------------------------------------------------------
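The `lr` column in the log follows the LinearScheduler plugin (`warmup_fraction: '0.1'`): over 10 epochs of 5212 batches each, the learning rate ramps linearly up to the peak of 3e-05 during the first 10% of steps (all of epoch 1), then decays linearly to zero. A minimal reconstruction of that schedule (my own sketch, not Flair's actual LinearScheduler code):

```python
def linear_lr(step: int, total_steps: int, peak_lr: float = 3e-05,
              warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to 0 (sketch only)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 5212  # 10 epochs x 5212 batches per epoch

# End of epoch 1 (= end of warmup): lr peaks at 3e-05, as logged.
print(linear_lr(5212, total))            # 3e-05
# End of epoch 2: decayed to ~2.7e-05, matching "lr: 0.000027" in the log.
print(round(linear_lr(2 * 5212, total), 6))
# End of training: lr has decayed to 0, matching "lr: 0.000000".
print(linear_lr(total, total))           # 0.0
```

The values this sketch produces at epoch boundaries line up with the lr figures logged above, which is why I believe it captures the schedule's shape.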
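The final aggregates are consistent with the per-class table: macro F1 is the unweighted mean of the four class F1 scores, and micro F1 is the harmonic mean of the pooled ("micro avg") precision and recall. A quick recomputation from the logged values:

```python
# Per-class scores from the final results table: (precision, recall, f1, support)
per_class = {
    "LOC":       (0.5200, 0.6425, 0.5748, 1214),
    "PER":       (0.3872, 0.5223, 0.4447,  808),
    "ORG":       (0.2901, 0.3484, 0.3166,  353),
    "HumanProd": (0.0000, 0.0000, 0.0000,   15),
}

# Macro F1: unweighted mean of per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
print(round(macro_f1, 4))  # 0.334, as logged for "F-score (macro)"

# Micro F1: harmonic mean of the pooled precision and recall from "micro avg".
p, r = 0.4395, 0.5544
micro_f1 = 2 * p * r / (p + r)
print(round(micro_f1, 4))  # 0.4903, as logged for "F-score (micro)"
```

Note that HumanProd (support 15) scores zero across the board, which drags the macro average well below the micro average.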