2023-10-17 17:06:21,757 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,759 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:06:21,759 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,760 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 17:06:21,760 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,760 Train: 3575 sentences
2023-10-17 17:06:21,760 (train_with_dev=False, train_with_test=False)
2023-10-17 17:06:21,760 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,760 Training Params:
2023-10-17 17:06:21,760 - learning_rate: "3e-05"
2023-10-17 17:06:21,760 - mini_batch_size: "4"
2023-10-17 17:06:21,760 - max_epochs: "10"
2023-10-17 17:06:21,760 - shuffle: "True"
2023-10-17 17:06:21,760 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,760 Plugins:
2023-10-17 17:06:21,761 - TensorboardLogger
2023-10-17 17:06:21,761 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:06:21,761 ----------------------------------------------------------------------------------------------------
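The LinearScheduler with warmup_fraction '0.1' explains the lr column in the per-iteration lines below: over 10 epochs of 894 mini-batches (8940 steps), the learning rate ramps linearly from 0 to the peak of 3e-05 during the first 10% of steps (all of epoch 1), then decays linearly back to 0. A minimal sketch of that schedule (the function name is ours, not Flair's API):

```python
def linear_warmup_lr(step: int, total_steps: int,
                     peak_lr: float = 3e-05,
                     warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to 0, as seen in this log."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # ramp up during warmup
    # decay linearly over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# 10 epochs x 894 iterations = 8940 total steps; warmup covers exactly epoch 1
total = 8940
print(linear_warmup_lr(894, total))   # end of epoch 1: peak 3e-05
print(linear_warmup_lr(1780, total))  # end of epoch 2: ~2.7e-05, matching the log
```

This reproduces the logged values, e.g. lr 0.000003 at epoch 1 iter 89 and lr 0.000027 by the end of epoch 2.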
2023-10-17 17:06:21,761 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:06:21,761 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:06:21,761 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,761 Computation:
2023-10-17 17:06:21,761 - compute on device: cuda:0
2023-10-17 17:06:21,761 - embedding storage: none
2023-10-17 17:06:21,761 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,761 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 17:06:21,761 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,761 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,762 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:06:28,965 epoch 1 - iter 89/894 - loss 3.35553705 - time (sec): 7.20 - samples/sec: 1085.50 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:06:36,102 epoch 1 - iter 178/894 - loss 2.15226997 - time (sec): 14.34 - samples/sec: 1188.33 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:06:43,076 epoch 1 - iter 267/894 - loss 1.59270429 - time (sec): 21.31 - samples/sec: 1211.02 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:06:50,128 epoch 1 - iter 356/894 - loss 1.29963321 - time (sec): 28.36 - samples/sec: 1197.16 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:06:57,328 epoch 1 - iter 445/894 - loss 1.10845361 - time (sec): 35.56 - samples/sec: 1196.45 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:07:04,764 epoch 1 - iter 534/894 - loss 0.94884297 - time (sec): 43.00 - samples/sec: 1220.21 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:07:11,870 epoch 1 - iter 623/894 - loss 0.85688276 - time (sec): 50.11 - samples/sec: 1217.69 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:07:18,923 epoch 1 - iter 712/894 - loss 0.78786576 - time (sec): 57.16 - samples/sec: 1209.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:07:25,981 epoch 1 - iter 801/894 - loss 0.72081724 - time (sec): 64.22 - samples/sec: 1218.84 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:07:32,853 epoch 1 - iter 890/894 - loss 0.67816083 - time (sec): 71.09 - samples/sec: 1210.62 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:07:33,168 ----------------------------------------------------------------------------------------------------
2023-10-17 17:07:33,169 EPOCH 1 done: loss 0.6751 - lr: 0.000030
2023-10-17 17:07:40,078 DEV : loss 0.16959424316883087 - f1-score (micro avg) 0.6302
2023-10-17 17:07:40,139 saving best model
2023-10-17 17:07:40,689 ----------------------------------------------------------------------------------------------------
2023-10-17 17:07:47,691 epoch 2 - iter 89/894 - loss 0.16170417 - time (sec): 7.00 - samples/sec: 1224.08 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:07:54,736 epoch 2 - iter 178/894 - loss 0.16216802 - time (sec): 14.04 - samples/sec: 1216.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:08:01,696 epoch 2 - iter 267/894 - loss 0.16214263 - time (sec): 21.00 - samples/sec: 1188.59 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:08:08,639 epoch 2 - iter 356/894 - loss 0.16774037 - time (sec): 27.95 - samples/sec: 1156.25 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:08:15,757 epoch 2 - iter 445/894 - loss 0.16271357 - time (sec): 35.07 - samples/sec: 1192.89 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:08:22,973 epoch 2 - iter 534/894 - loss 0.16015759 - time (sec): 42.28 - samples/sec: 1213.83 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:08:29,908 epoch 2 - iter 623/894 - loss 0.16321569 - time (sec): 49.22 - samples/sec: 1213.44 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:08:36,967 epoch 2 - iter 712/894 - loss 0.15788501 - time (sec): 56.28 - samples/sec: 1221.15 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:08:44,149 epoch 2 - iter 801/894 - loss 0.15390744 - time (sec): 63.46 - samples/sec: 1231.76 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:08:51,058 epoch 2 - iter 890/894 - loss 0.15220708 - time (sec): 70.37 - samples/sec: 1224.43 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:08:51,360 ----------------------------------------------------------------------------------------------------
2023-10-17 17:08:51,360 EPOCH 2 done: loss 0.1524 - lr: 0.000027
2023-10-17 17:09:02,557 DEV : loss 0.12176849693059921 - f1-score (micro avg) 0.7254
2023-10-17 17:09:02,612 saving best model
2023-10-17 17:09:04,170 ----------------------------------------------------------------------------------------------------
2023-10-17 17:09:10,984 epoch 3 - iter 89/894 - loss 0.09276748 - time (sec): 6.81 - samples/sec: 1267.72 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:09:17,975 epoch 3 - iter 178/894 - loss 0.08115097 - time (sec): 13.80 - samples/sec: 1282.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:09:24,918 epoch 3 - iter 267/894 - loss 0.07820107 - time (sec): 20.74 - samples/sec: 1274.60 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:09:31,744 epoch 3 - iter 356/894 - loss 0.08294175 - time (sec): 27.57 - samples/sec: 1243.39 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:09:38,744 epoch 3 - iter 445/894 - loss 0.09010169 - time (sec): 34.57 - samples/sec: 1247.70 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:09:45,969 epoch 3 - iter 534/894 - loss 0.09080906 - time (sec): 41.80 - samples/sec: 1250.36 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:09:52,993 epoch 3 - iter 623/894 - loss 0.08871421 - time (sec): 48.82 - samples/sec: 1243.02 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:09:59,840 epoch 3 - iter 712/894 - loss 0.09038709 - time (sec): 55.67 - samples/sec: 1245.20 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:10:07,030 epoch 3 - iter 801/894 - loss 0.08766641 - time (sec): 62.86 - samples/sec: 1245.16 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:10:13,791 epoch 3 - iter 890/894 - loss 0.08969348 - time (sec): 69.62 - samples/sec: 1237.85 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:10:14,092 ----------------------------------------------------------------------------------------------------
2023-10-17 17:10:14,093 EPOCH 3 done: loss 0.0896 - lr: 0.000023
2023-10-17 17:10:25,352 DEV : loss 0.1799084097146988 - f1-score (micro avg) 0.7667
2023-10-17 17:10:25,417 saving best model
2023-10-17 17:10:26,913 ----------------------------------------------------------------------------------------------------
2023-10-17 17:10:34,130 epoch 4 - iter 89/894 - loss 0.04863283 - time (sec): 7.21 - samples/sec: 1263.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:10:41,093 epoch 4 - iter 178/894 - loss 0.04099925 - time (sec): 14.18 - samples/sec: 1233.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:10:48,147 epoch 4 - iter 267/894 - loss 0.04397038 - time (sec): 21.23 - samples/sec: 1231.83 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:10:55,108 epoch 4 - iter 356/894 - loss 0.04541991 - time (sec): 28.19 - samples/sec: 1222.30 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:11:02,139 epoch 4 - iter 445/894 - loss 0.04950959 - time (sec): 35.22 - samples/sec: 1223.17 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:11:09,087 epoch 4 - iter 534/894 - loss 0.05147770 - time (sec): 42.17 - samples/sec: 1227.55 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:11:15,904 epoch 4 - iter 623/894 - loss 0.05294425 - time (sec): 48.99 - samples/sec: 1225.03 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:11:22,717 epoch 4 - iter 712/894 - loss 0.05437475 - time (sec): 55.80 - samples/sec: 1222.21 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:11:29,800 epoch 4 - iter 801/894 - loss 0.05429004 - time (sec): 62.88 - samples/sec: 1239.61 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:11:36,591 epoch 4 - iter 890/894 - loss 0.05636947 - time (sec): 69.67 - samples/sec: 1237.10 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:11:36,892 ----------------------------------------------------------------------------------------------------
2023-10-17 17:11:36,892 EPOCH 4 done: loss 0.0561 - lr: 0.000020
2023-10-17 17:11:48,364 DEV : loss 0.1766158789396286 - f1-score (micro avg) 0.7846
2023-10-17 17:11:48,429 saving best model
2023-10-17 17:11:49,013 ----------------------------------------------------------------------------------------------------
2023-10-17 17:11:55,899 epoch 5 - iter 89/894 - loss 0.02346057 - time (sec): 6.88 - samples/sec: 1208.74 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:12:02,805 epoch 5 - iter 178/894 - loss 0.02856937 - time (sec): 13.79 - samples/sec: 1255.07 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:12:09,972 epoch 5 - iter 267/894 - loss 0.03132166 - time (sec): 20.96 - samples/sec: 1286.18 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:12:16,856 epoch 5 - iter 356/894 - loss 0.02984524 - time (sec): 27.84 - samples/sec: 1248.91 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:12:23,989 epoch 5 - iter 445/894 - loss 0.02930967 - time (sec): 34.97 - samples/sec: 1258.23 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:12:31,236 epoch 5 - iter 534/894 - loss 0.03050672 - time (sec): 42.22 - samples/sec: 1247.10 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:12:38,277 epoch 5 - iter 623/894 - loss 0.03127495 - time (sec): 49.26 - samples/sec: 1236.68 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:12:45,381 epoch 5 - iter 712/894 - loss 0.03193147 - time (sec): 56.37 - samples/sec: 1223.56 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:12:52,394 epoch 5 - iter 801/894 - loss 0.03422109 - time (sec): 63.38 - samples/sec: 1229.63 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:12:59,246 epoch 5 - iter 890/894 - loss 0.03503814 - time (sec): 70.23 - samples/sec: 1227.14 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:12:59,545 ----------------------------------------------------------------------------------------------------
2023-10-17 17:12:59,545 EPOCH 5 done: loss 0.0349 - lr: 0.000017
2023-10-17 17:13:10,849 DEV : loss 0.21198122203350067 - f1-score (micro avg) 0.7885
2023-10-17 17:13:10,921 saving best model
2023-10-17 17:13:11,561 ----------------------------------------------------------------------------------------------------
2023-10-17 17:13:18,742 epoch 6 - iter 89/894 - loss 0.02637658 - time (sec): 7.18 - samples/sec: 1234.51 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:13:25,820 epoch 6 - iter 178/894 - loss 0.02577714 - time (sec): 14.26 - samples/sec: 1296.69 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:13:32,810 epoch 6 - iter 267/894 - loss 0.02710819 - time (sec): 21.25 - samples/sec: 1270.16 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:13:39,791 epoch 6 - iter 356/894 - loss 0.02666332 - time (sec): 28.23 - samples/sec: 1227.51 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:13:46,741 epoch 6 - iter 445/894 - loss 0.02637469 - time (sec): 35.18 - samples/sec: 1187.72 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:13:53,721 epoch 6 - iter 534/894 - loss 0.02508981 - time (sec): 42.16 - samples/sec: 1193.26 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:14:00,773 epoch 6 - iter 623/894 - loss 0.02558045 - time (sec): 49.21 - samples/sec: 1214.27 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:14:07,755 epoch 6 - iter 712/894 - loss 0.02487283 - time (sec): 56.19 - samples/sec: 1217.35 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:14:14,925 epoch 6 - iter 801/894 - loss 0.02546208 - time (sec): 63.36 - samples/sec: 1233.36 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:14:21,885 epoch 6 - iter 890/894 - loss 0.02514569 - time (sec): 70.32 - samples/sec: 1227.81 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:14:22,187 ----------------------------------------------------------------------------------------------------
2023-10-17 17:14:22,188 EPOCH 6 done: loss 0.0251 - lr: 0.000013
2023-10-17 17:14:33,664 DEV : loss 0.2273338884115219 - f1-score (micro avg) 0.7968
2023-10-17 17:14:33,725 saving best model
2023-10-17 17:14:34,286 ----------------------------------------------------------------------------------------------------
2023-10-17 17:14:41,116 epoch 7 - iter 89/894 - loss 0.01113447 - time (sec): 6.83 - samples/sec: 1357.75 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:14:47,866 epoch 7 - iter 178/894 - loss 0.01482375 - time (sec): 13.58 - samples/sec: 1282.12 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:14:55,103 epoch 7 - iter 267/894 - loss 0.01066913 - time (sec): 20.81 - samples/sec: 1259.51 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:15:01,996 epoch 7 - iter 356/894 - loss 0.01261009 - time (sec): 27.71 - samples/sec: 1231.95 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:15:09,192 epoch 7 - iter 445/894 - loss 0.01383171 - time (sec): 34.90 - samples/sec: 1210.20 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:15:16,625 epoch 7 - iter 534/894 - loss 0.01506180 - time (sec): 42.34 - samples/sec: 1205.84 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:15:24,059 epoch 7 - iter 623/894 - loss 0.01506110 - time (sec): 49.77 - samples/sec: 1192.69 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:15:31,155 epoch 7 - iter 712/894 - loss 0.01402772 - time (sec): 56.87 - samples/sec: 1196.58 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:15:38,312 epoch 7 - iter 801/894 - loss 0.01427605 - time (sec): 64.02 - samples/sec: 1196.24 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:15:45,497 epoch 7 - iter 890/894 - loss 0.01356603 - time (sec): 71.21 - samples/sec: 1210.17 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:15:45,817 ----------------------------------------------------------------------------------------------------
2023-10-17 17:15:45,817 EPOCH 7 done: loss 0.0136 - lr: 0.000010
2023-10-17 17:15:57,508 DEV : loss 0.22650040686130524 - f1-score (micro avg) 0.7936
2023-10-17 17:15:57,572 ----------------------------------------------------------------------------------------------------
2023-10-17 17:16:04,599 epoch 8 - iter 89/894 - loss 0.00378100 - time (sec): 7.02 - samples/sec: 1240.96 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:16:11,774 epoch 8 - iter 178/894 - loss 0.00485349 - time (sec): 14.20 - samples/sec: 1202.92 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:16:18,720 epoch 8 - iter 267/894 - loss 0.00570011 - time (sec): 21.15 - samples/sec: 1226.08 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:16:25,777 epoch 8 - iter 356/894 - loss 0.00742284 - time (sec): 28.20 - samples/sec: 1208.89 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:16:32,610 epoch 8 - iter 445/894 - loss 0.00867863 - time (sec): 35.04 - samples/sec: 1208.75 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:16:39,472 epoch 8 - iter 534/894 - loss 0.00889162 - time (sec): 41.90 - samples/sec: 1229.43 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:16:46,397 epoch 8 - iter 623/894 - loss 0.00804801 - time (sec): 48.82 - samples/sec: 1229.27 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:16:53,369 epoch 8 - iter 712/894 - loss 0.00890930 - time (sec): 55.79 - samples/sec: 1227.77 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:17:00,521 epoch 8 - iter 801/894 - loss 0.00880723 - time (sec): 62.95 - samples/sec: 1242.78 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:17:07,383 epoch 8 - iter 890/894 - loss 0.00956987 - time (sec): 69.81 - samples/sec: 1235.33 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:17:07,703 ----------------------------------------------------------------------------------------------------
2023-10-17 17:17:07,703 EPOCH 8 done: loss 0.0095 - lr: 0.000007
2023-10-17 17:17:18,616 DEV : loss 0.23670461773872375 - f1-score (micro avg) 0.7992
2023-10-17 17:17:18,672 saving best model
2023-10-17 17:17:19,238 ----------------------------------------------------------------------------------------------------
2023-10-17 17:17:26,401 epoch 9 - iter 89/894 - loss 0.00869033 - time (sec): 7.16 - samples/sec: 1254.57 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:17:33,491 epoch 9 - iter 178/894 - loss 0.00462704 - time (sec): 14.25 - samples/sec: 1345.73 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:17:40,684 epoch 9 - iter 267/894 - loss 0.00598833 - time (sec): 21.44 - samples/sec: 1281.67 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:17:48,065 epoch 9 - iter 356/894 - loss 0.00662606 - time (sec): 28.82 - samples/sec: 1236.67 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:17:55,938 epoch 9 - iter 445/894 - loss 0.00663771 - time (sec): 36.70 - samples/sec: 1219.57 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:18:03,426 epoch 9 - iter 534/894 - loss 0.00594140 - time (sec): 44.19 - samples/sec: 1213.99 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:18:10,766 epoch 9 - iter 623/894 - loss 0.00591749 - time (sec): 51.53 - samples/sec: 1203.03 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:18:17,816 epoch 9 - iter 712/894 - loss 0.00614631 - time (sec): 58.58 - samples/sec: 1195.29 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:18:24,990 epoch 9 - iter 801/894 - loss 0.00628642 - time (sec): 65.75 - samples/sec: 1189.56 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:18:32,106 epoch 9 - iter 890/894 - loss 0.00634680 - time (sec): 72.87 - samples/sec: 1182.30 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:18:32,440 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:32,441 EPOCH 9 done: loss 0.0063 - lr: 0.000003
2023-10-17 17:18:44,236 DEV : loss 0.2532997727394104 - f1-score (micro avg) 0.7974
2023-10-17 17:18:44,299 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:51,392 epoch 10 - iter 89/894 - loss 0.00200554 - time (sec): 7.09 - samples/sec: 1215.85 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:18:58,618 epoch 10 - iter 178/894 - loss 0.00218017 - time (sec): 14.32 - samples/sec: 1183.44 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:19:05,806 epoch 10 - iter 267/894 - loss 0.00214375 - time (sec): 21.50 - samples/sec: 1162.88 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:19:13,058 epoch 10 - iter 356/894 - loss 0.00271732 - time (sec): 28.76 - samples/sec: 1172.68 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:19:20,143 epoch 10 - iter 445/894 - loss 0.00304840 - time (sec): 35.84 - samples/sec: 1189.24 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:19:27,455 epoch 10 - iter 534/894 - loss 0.00430996 - time (sec): 43.15 - samples/sec: 1206.22 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:19:34,413 epoch 10 - iter 623/894 - loss 0.00440082 - time (sec): 50.11 - samples/sec: 1202.00 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:19:41,398 epoch 10 - iter 712/894 - loss 0.00444493 - time (sec): 57.10 - samples/sec: 1211.19 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:19:48,402 epoch 10 - iter 801/894 - loss 0.00433071 - time (sec): 64.10 - samples/sec: 1207.94 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:19:55,518 epoch 10 - iter 890/894 - loss 0.00411415 - time (sec): 71.22 - samples/sec: 1209.86 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:19:55,836 ----------------------------------------------------------------------------------------------------
2023-10-17 17:19:55,837 EPOCH 10 done: loss 0.0041 - lr: 0.000000
2023-10-17 17:20:07,378 DEV : loss 0.2552371919155121 - f1-score (micro avg) 0.802
2023-10-17 17:20:07,436 saving best model
2023-10-17 17:20:09,515 ----------------------------------------------------------------------------------------------------
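A "saving best model" line appears only when an epoch's dev micro-F1 improves on the running best, which is why epochs 7 and 9 save nothing; with the scores logged above, the checkpoint from epoch 10 is what ends up in best-model.pt. A quick sanity check of that selection (scores copied from the DEV lines):

```python
# Dev micro-F1 per epoch, copied from the DEV lines in the log above
dev_f1 = [0.6302, 0.7254, 0.7667, 0.7846, 0.7885,
          0.7968, 0.7936, 0.7992, 0.7974, 0.8020]

# Epoch with the highest dev score is the one kept as best-model.pt
best_epoch = max(range(1, len(dev_f1) + 1), key=lambda e: dev_f1[e - 1])
print(best_epoch)  # 10
```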
2023-10-17 17:20:09,517 Loading model from best epoch ...
2023-10-17 17:20:11,846 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
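The 21-tag dictionary is the BIOES encoding of the five HIPE-2020 entity types (loc, pers, org, prod, time): one O tag plus S/B/E/I variants per type, which also explains the Linear(in_features=768, out_features=21) output layer in the model above. Reconstructing it:

```python
entity_types = ["loc", "pers", "org", "prod", "time"]

# One "outside" tag plus Single/Begin/End/Inside per entity type (BIOES scheme)
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types
                for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 1 + 5 * 4 = 21
```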
2023-10-17 17:20:18,216
Results:
- F-score (micro) 0.7678
- F-score (macro) 0.6812
- Accuracy 0.6375

By class:
              precision    recall  f1-score   support

         loc     0.8517    0.8674    0.8595       596
        pers     0.7018    0.7988    0.7472       333
         org     0.5273    0.4394    0.4793       132
        prod     0.6182    0.5152    0.5620        66
        time     0.7826    0.7347    0.7579        49

   micro avg     0.7611    0.7747    0.7678      1176
   macro avg     0.6963    0.6711    0.6812      1176
weighted avg     0.7569    0.7747    0.7641      1176
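The gap between the micro (0.7678) and macro (0.6812) F-scores reflects class imbalance: macro averages the five class F1s with equal weight, so the weaker org and prod classes pull it down, while micro pools all 1176 test spans. Both values can be reproduced from the table (micro here via its precision/recall row, since the raw TP/FP/FN counts are not in the log):

```python
# Per-class F1 and the micro-avg precision/recall row, copied from the table
class_f1 = {"loc": 0.8595, "pers": 0.7472, "org": 0.4793,
            "prod": 0.5620, "time": 0.7579}

macro_f1 = sum(class_f1.values()) / len(class_f1)  # unweighted class mean

micro_p, micro_r = 0.7611, 0.7747
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)  # harmonic mean

print(round(macro_f1, 4), round(micro_f1, 4))  # 0.6812 0.7678
```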

2023-10-17 17:20:18,216 ----------------------------------------------------------------------------------------------------