2023-10-17 17:06:21,757 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,759 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:06:21,759 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,760 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 17:06:21,760 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,760 Train: 3575 sentences
2023-10-17 17:06:21,760 (train_with_dev=False, train_with_test=False)
2023-10-17 17:06:21,760 ----------------------------------------------------------------------------------------------------
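[Note] The model and corpus printed above can be set up with Flair roughly as follows. This is a minimal sketch, not the original hmbench training script: the backbone id (hmteams/teams-base-historic-multilingual-discriminator), last-layer pooling, "first" subtoken pooling and the disabled CRF are read off the model base path logged further down, and the corpus options are assumptions based on the dataset path shown above.

# Minimal sketch (assumptions noted inline); not the original hmbench script.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# HIPE-2020 German split of the HIPE-2022 data; document separators are an
# assumption inferred from the dataset path ending in "with_doc_seperator".
corpus = NER_HIPE_2022(
    dataset_name="hipe2020",
    language="de",
    version="v2.1",
    add_document_separator=True,
)

# Backbone and pooling settings inferred from the base path
# ("hmteams/teams-base-historic-multilingual-discriminator", "poolingfirst", "layers-1").
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

label_dict = corpus.make_label_dictionary(label_type="ner")  # 21 BIOES tags over loc/pers/org/prod/time

# No RNN and no CRF ("crfFalse" in the base path), so the head is
# LockedDropout + Linear(768 -> 21) with CrossEntropyLoss, matching the printout.
tagger = SequenceTagger(
    hidden_size=256,  # ignored when use_rnn=False
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_rnn=False,
    use_crf=False,
    reproject_embeddings=False,
)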
2023-10-17 17:06:21,760 Training Params:
2023-10-17 17:06:21,760 - learning_rate: "3e-05"
2023-10-17 17:06:21,760 - mini_batch_size: "4"
2023-10-17 17:06:21,760 - max_epochs: "10"
2023-10-17 17:06:21,760 - shuffle: "True"
2023-10-17 17:06:21,760 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,760 Plugins:
2023-10-17 17:06:21,761 - TensorboardLogger
2023-10-17 17:06:21,761 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:06:21,761 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,761 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:06:21,761 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:06:21,761 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,761 Computation:
2023-10-17 17:06:21,761 - compute on device: cuda:0
2023-10-17 17:06:21,761 - embedding storage: none
2023-10-17 17:06:21,761 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,761 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 17:06:21,761 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:21,761 ----------------------------------------------------------------------------------------------------
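[Note] The training parameters above (learning rate 3e-05, mini-batch size 4, 10 epochs, linear warmup over 10% of steps, model selection by micro F1 on dev) map onto Flair's fine-tuning API roughly as sketched below. This continues the sketch above and is an approximation: the TensorBoard plugin and the exact best-model handling of the original run are omitted.

# Continues the sketch above; `tagger` and `corpus` as constructed there.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# fine_tune() uses AdamW with a linear learning-rate schedule; warmup_fraction=0.1
# corresponds to the LinearScheduler warmup listed in the plugins above.
trainer.fine_tune(
    "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
    warmup_fraction=0.1,
    main_evaluation_metric=("micro avg", "f1-score"),
)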
2023-10-17 17:06:21,762 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:06:28,965 epoch 1 - iter 89/894 - loss 3.35553705 - time (sec): 7.20 - samples/sec: 1085.50 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:06:36,102 epoch 1 - iter 178/894 - loss 2.15226997 - time (sec): 14.34 - samples/sec: 1188.33 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:06:43,076 epoch 1 - iter 267/894 - loss 1.59270429 - time (sec): 21.31 - samples/sec: 1211.02 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:06:50,128 epoch 1 - iter 356/894 - loss 1.29963321 - time (sec): 28.36 - samples/sec: 1197.16 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:06:57,328 epoch 1 - iter 445/894 - loss 1.10845361 - time (sec): 35.56 - samples/sec: 1196.45 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:07:04,764 epoch 1 - iter 534/894 - loss 0.94884297 - time (sec): 43.00 - samples/sec: 1220.21 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:07:11,870 epoch 1 - iter 623/894 - loss 0.85688276 - time (sec): 50.11 - samples/sec: 1217.69 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:07:18,923 epoch 1 - iter 712/894 - loss 0.78786576 - time (sec): 57.16 - samples/sec: 1209.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:07:25,981 epoch 1 - iter 801/894 - loss 0.72081724 - time (sec): 64.22 - samples/sec: 1218.84 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:07:32,853 epoch 1 - iter 890/894 - loss 0.67816083 - time (sec): 71.09 - samples/sec: 1210.62 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:07:33,168 ----------------------------------------------------------------------------------------------------
2023-10-17 17:07:33,169 EPOCH 1 done: loss 0.6751 - lr: 0.000030
2023-10-17 17:07:40,078 DEV : loss 0.16959424316883087 - f1-score (micro avg) 0.6302
2023-10-17 17:07:40,139 saving best model
2023-10-17 17:07:40,689 ----------------------------------------------------------------------------------------------------
2023-10-17 17:07:47,691 epoch 2 - iter 89/894 - loss 0.16170417 - time (sec): 7.00 - samples/sec: 1224.08 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:07:54,736 epoch 2 - iter 178/894 - loss 0.16216802 - time (sec): 14.04 - samples/sec: 1216.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:08:01,696 epoch 2 - iter 267/894 - loss 0.16214263 - time (sec): 21.00 - samples/sec: 1188.59 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:08:08,639 epoch 2 - iter 356/894 - loss 0.16774037 - time (sec): 27.95 - samples/sec: 1156.25 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:08:15,757 epoch 2 - iter 445/894 - loss 0.16271357 - time (sec): 35.07 - samples/sec: 1192.89 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:08:22,973 epoch 2 - iter 534/894 - loss 0.16015759 - time (sec): 42.28 - samples/sec: 1213.83 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:08:29,908 epoch 2 - iter 623/894 - loss 0.16321569 - time (sec): 49.22 - samples/sec: 1213.44 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:08:36,967 epoch 2 - iter 712/894 - loss 0.15788501 - time (sec): 56.28 - samples/sec: 1221.15 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:08:44,149 epoch 2 - iter 801/894 - loss 0.15390744 - time (sec): 63.46 - samples/sec: 1231.76 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:08:51,058 epoch 2 - iter 890/894 - loss 0.15220708 - time (sec): 70.37 - samples/sec: 1224.43 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:08:51,360 ----------------------------------------------------------------------------------------------------
2023-10-17 17:08:51,360 EPOCH 2 done: loss 0.1524 - lr: 0.000027
2023-10-17 17:09:02,557 DEV : loss 0.12176849693059921 - f1-score (micro avg) 0.7254
2023-10-17 17:09:02,612 saving best model
2023-10-17 17:09:04,170 ----------------------------------------------------------------------------------------------------
2023-10-17 17:09:10,984 epoch 3 - iter 89/894 - loss 0.09276748 - time (sec): 6.81 - samples/sec: 1267.72 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:09:17,975 epoch 3 - iter 178/894 - loss 0.08115097 - time (sec): 13.80 - samples/sec: 1282.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:09:24,918 epoch 3 - iter 267/894 - loss 0.07820107 - time (sec): 20.74 - samples/sec: 1274.60 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:09:31,744 epoch 3 - iter 356/894 - loss 0.08294175 - time (sec): 27.57 - samples/sec: 1243.39 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:09:38,744 epoch 3 - iter 445/894 - loss 0.09010169 - time (sec): 34.57 - samples/sec: 1247.70 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:09:45,969 epoch 3 - iter 534/894 - loss 0.09080906 - time (sec): 41.80 - samples/sec: 1250.36 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:09:52,993 epoch 3 - iter 623/894 - loss 0.08871421 - time (sec): 48.82 - samples/sec: 1243.02 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:09:59,840 epoch 3 - iter 712/894 - loss 0.09038709 - time (sec): 55.67 - samples/sec: 1245.20 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:10:07,030 epoch 3 - iter 801/894 - loss 0.08766641 - time (sec): 62.86 - samples/sec: 1245.16 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:10:13,791 epoch 3 - iter 890/894 - loss 0.08969348 - time (sec): 69.62 - samples/sec: 1237.85 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:10:14,092 ----------------------------------------------------------------------------------------------------
2023-10-17 17:10:14,093 EPOCH 3 done: loss 0.0896 - lr: 0.000023
2023-10-17 17:10:25,352 DEV : loss 0.1799084097146988 - f1-score (micro avg) 0.7667
2023-10-17 17:10:25,417 saving best model
2023-10-17 17:10:26,913 ----------------------------------------------------------------------------------------------------
2023-10-17 17:10:34,130 epoch 4 - iter 89/894 - loss 0.04863283 - time (sec): 7.21 - samples/sec: 1263.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:10:41,093 epoch 4 - iter 178/894 - loss 0.04099925 - time (sec): 14.18 - samples/sec: 1233.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:10:48,147 epoch 4 - iter 267/894 - loss 0.04397038 - time (sec): 21.23 - samples/sec: 1231.83 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:10:55,108 epoch 4 - iter 356/894 - loss 0.04541991 - time (sec): 28.19 - samples/sec: 1222.30 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:11:02,139 epoch 4 - iter 445/894 - loss 0.04950959 - time (sec): 35.22 - samples/sec: 1223.17 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:11:09,087 epoch 4 - iter 534/894 - loss 0.05147770 - time (sec): 42.17 - samples/sec: 1227.55 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:11:15,904 epoch 4 - iter 623/894 - loss 0.05294425 - time (sec): 48.99 - samples/sec: 1225.03 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:11:22,717 epoch 4 - iter 712/894 - loss 0.05437475 - time (sec): 55.80 - samples/sec: 1222.21 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:11:29,800 epoch 4 - iter 801/894 - loss 0.05429004 - time (sec): 62.88 - samples/sec: 1239.61 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:11:36,591 epoch 4 - iter 890/894 - loss 0.05636947 - time (sec): 69.67 - samples/sec: 1237.10 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:11:36,892 ----------------------------------------------------------------------------------------------------
2023-10-17 17:11:36,892 EPOCH 4 done: loss 0.0561 - lr: 0.000020
2023-10-17 17:11:48,364 DEV : loss 0.1766158789396286 - f1-score (micro avg) 0.7846
2023-10-17 17:11:48,429 saving best model
2023-10-17 17:11:49,013 ----------------------------------------------------------------------------------------------------
2023-10-17 17:11:55,899 epoch 5 - iter 89/894 - loss 0.02346057 - time (sec): 6.88 - samples/sec: 1208.74 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:12:02,805 epoch 5 - iter 178/894 - loss 0.02856937 - time (sec): 13.79 - samples/sec: 1255.07 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:12:09,972 epoch 5 - iter 267/894 - loss 0.03132166 - time (sec): 20.96 - samples/sec: 1286.18 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:12:16,856 epoch 5 - iter 356/894 - loss 0.02984524 - time (sec): 27.84 - samples/sec: 1248.91 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:12:23,989 epoch 5 - iter 445/894 - loss 0.02930967 - time (sec): 34.97 - samples/sec: 1258.23 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:12:31,236 epoch 5 - iter 534/894 - loss 0.03050672 - time (sec): 42.22 - samples/sec: 1247.10 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:12:38,277 epoch 5 - iter 623/894 - loss 0.03127495 - time (sec): 49.26 - samples/sec: 1236.68 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:12:45,381 epoch 5 - iter 712/894 - loss 0.03193147 - time (sec): 56.37 - samples/sec: 1223.56 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:12:52,394 epoch 5 - iter 801/894 - loss 0.03422109 - time (sec): 63.38 - samples/sec: 1229.63 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:12:59,246 epoch 5 - iter 890/894 - loss 0.03503814 - time (sec): 70.23 - samples/sec: 1227.14 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:12:59,545 ----------------------------------------------------------------------------------------------------
2023-10-17 17:12:59,545 EPOCH 5 done: loss 0.0349 - lr: 0.000017
2023-10-17 17:13:10,849 DEV : loss 0.21198122203350067 - f1-score (micro avg) 0.7885
2023-10-17 17:13:10,921 saving best model
2023-10-17 17:13:11,561 ----------------------------------------------------------------------------------------------------
2023-10-17 17:13:18,742 epoch 6 - iter 89/894 - loss 0.02637658 - time (sec): 7.18 - samples/sec: 1234.51 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:13:25,820 epoch 6 - iter 178/894 - loss 0.02577714 - time (sec): 14.26 - samples/sec: 1296.69 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:13:32,810 epoch 6 - iter 267/894 - loss 0.02710819 - time (sec): 21.25 - samples/sec: 1270.16 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:13:39,791 epoch 6 - iter 356/894 - loss 0.02666332 - time (sec): 28.23 - samples/sec: 1227.51 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:13:46,741 epoch 6 - iter 445/894 - loss 0.02637469 - time (sec): 35.18 - samples/sec: 1187.72 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:13:53,721 epoch 6 - iter 534/894 - loss 0.02508981 - time (sec): 42.16 - samples/sec: 1193.26 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:14:00,773 epoch 6 - iter 623/894 - loss 0.02558045 - time (sec): 49.21 - samples/sec: 1214.27 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:14:07,755 epoch 6 - iter 712/894 - loss 0.02487283 - time (sec): 56.19 - samples/sec: 1217.35 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:14:14,925 epoch 6 - iter 801/894 - loss 0.02546208 - time (sec): 63.36 - samples/sec: 1233.36 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:14:21,885 epoch 6 - iter 890/894 - loss 0.02514569 - time (sec): 70.32 - samples/sec: 1227.81 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:14:22,187 ----------------------------------------------------------------------------------------------------
2023-10-17 17:14:22,188 EPOCH 6 done: loss 0.0251 - lr: 0.000013
2023-10-17 17:14:33,664 DEV : loss 0.2273338884115219 - f1-score (micro avg) 0.7968
2023-10-17 17:14:33,725 saving best model
2023-10-17 17:14:34,286 ----------------------------------------------------------------------------------------------------
2023-10-17 17:14:41,116 epoch 7 - iter 89/894 - loss 0.01113447 - time (sec): 6.83 - samples/sec: 1357.75 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:14:47,866 epoch 7 - iter 178/894 - loss 0.01482375 - time (sec): 13.58 - samples/sec: 1282.12 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:14:55,103 epoch 7 - iter 267/894 - loss 0.01066913 - time (sec): 20.81 - samples/sec: 1259.51 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:15:01,996 epoch 7 - iter 356/894 - loss 0.01261009 - time (sec): 27.71 - samples/sec: 1231.95 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:15:09,192 epoch 7 - iter 445/894 - loss 0.01383171 - time (sec): 34.90 - samples/sec: 1210.20 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:15:16,625 epoch 7 - iter 534/894 - loss 0.01506180 - time (sec): 42.34 - samples/sec: 1205.84 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:15:24,059 epoch 7 - iter 623/894 - loss 0.01506110 - time (sec): 49.77 - samples/sec: 1192.69 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:15:31,155 epoch 7 - iter 712/894 - loss 0.01402772 - time (sec): 56.87 - samples/sec: 1196.58 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:15:38,312 epoch 7 - iter 801/894 - loss 0.01427605 - time (sec): 64.02 - samples/sec: 1196.24 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:15:45,497 epoch 7 - iter 890/894 - loss 0.01356603 - time (sec): 71.21 - samples/sec: 1210.17 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:15:45,817 ----------------------------------------------------------------------------------------------------
2023-10-17 17:15:45,817 EPOCH 7 done: loss 0.0136 - lr: 0.000010
2023-10-17 17:15:57,508 DEV : loss 0.22650040686130524 - f1-score (micro avg) 0.7936
2023-10-17 17:15:57,572 ----------------------------------------------------------------------------------------------------
2023-10-17 17:16:04,599 epoch 8 - iter 89/894 - loss 0.00378100 - time (sec): 7.02 - samples/sec: 1240.96 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:16:11,774 epoch 8 - iter 178/894 - loss 0.00485349 - time (sec): 14.20 - samples/sec: 1202.92 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:16:18,720 epoch 8 - iter 267/894 - loss 0.00570011 - time (sec): 21.15 - samples/sec: 1226.08 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:16:25,777 epoch 8 - iter 356/894 - loss 0.00742284 - time (sec): 28.20 - samples/sec: 1208.89 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:16:32,610 epoch 8 - iter 445/894 - loss 0.00867863 - time (sec): 35.04 - samples/sec: 1208.75 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:16:39,472 epoch 8 - iter 534/894 - loss 0.00889162 - time (sec): 41.90 - samples/sec: 1229.43 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:16:46,397 epoch 8 - iter 623/894 - loss 0.00804801 - time (sec): 48.82 - samples/sec: 1229.27 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:16:53,369 epoch 8 - iter 712/894 - loss 0.00890930 - time (sec): 55.79 - samples/sec: 1227.77 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:17:00,521 epoch 8 - iter 801/894 - loss 0.00880723 - time (sec): 62.95 - samples/sec: 1242.78 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:17:07,383 epoch 8 - iter 890/894 - loss 0.00956987 - time (sec): 69.81 - samples/sec: 1235.33 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:17:07,703 ----------------------------------------------------------------------------------------------------
2023-10-17 17:17:07,703 EPOCH 8 done: loss 0.0095 - lr: 0.000007
2023-10-17 17:17:18,616 DEV : loss 0.23670461773872375 - f1-score (micro avg) 0.7992
2023-10-17 17:17:18,672 saving best model
2023-10-17 17:17:19,238 ----------------------------------------------------------------------------------------------------
2023-10-17 17:17:26,401 epoch 9 - iter 89/894 - loss 0.00869033 - time (sec): 7.16 - samples/sec: 1254.57 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:17:33,491 epoch 9 - iter 178/894 - loss 0.00462704 - time (sec): 14.25 - samples/sec: 1345.73 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:17:40,684 epoch 9 - iter 267/894 - loss 0.00598833 - time (sec): 21.44 - samples/sec: 1281.67 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:17:48,065 epoch 9 - iter 356/894 - loss 0.00662606 - time (sec): 28.82 - samples/sec: 1236.67 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:17:55,938 epoch 9 - iter 445/894 - loss 0.00663771 - time (sec): 36.70 - samples/sec: 1219.57 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:18:03,426 epoch 9 - iter 534/894 - loss 0.00594140 - time (sec): 44.19 - samples/sec: 1213.99 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:18:10,766 epoch 9 - iter 623/894 - loss 0.00591749 - time (sec): 51.53 - samples/sec: 1203.03 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:18:17,816 epoch 9 - iter 712/894 - loss 0.00614631 - time (sec): 58.58 - samples/sec: 1195.29 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:18:24,990 epoch 9 - iter 801/894 - loss 0.00628642 - time (sec): 65.75 - samples/sec: 1189.56 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:18:32,106 epoch 9 - iter 890/894 - loss 0.00634680 - time (sec): 72.87 - samples/sec: 1182.30 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:18:32,440 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:32,441 EPOCH 9 done: loss 0.0063 - lr: 0.000003
2023-10-17 17:18:44,236 DEV : loss 0.2532997727394104 - f1-score (micro avg) 0.7974
2023-10-17 17:18:44,299 ----------------------------------------------------------------------------------------------------
2023-10-17 17:18:51,392 epoch 10 - iter 89/894 - loss 0.00200554 - time (sec): 7.09 - samples/sec: 1215.85 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:18:58,618 epoch 10 - iter 178/894 - loss 0.00218017 - time (sec): 14.32 - samples/sec: 1183.44 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:19:05,806 epoch 10 - iter 267/894 - loss 0.00214375 - time (sec): 21.50 - samples/sec: 1162.88 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:19:13,058 epoch 10 - iter 356/894 - loss 0.00271732 - time (sec): 28.76 - samples/sec: 1172.68 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:19:20,143 epoch 10 - iter 445/894 - loss 0.00304840 - time (sec): 35.84 - samples/sec: 1189.24 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:19:27,455 epoch 10 - iter 534/894 - loss 0.00430996 - time (sec): 43.15 - samples/sec: 1206.22 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:19:34,413 epoch 10 - iter 623/894 - loss 0.00440082 - time (sec): 50.11 - samples/sec: 1202.00 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:19:41,398 epoch 10 - iter 712/894 - loss 0.00444493 - time (sec): 57.10 - samples/sec: 1211.19 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:19:48,402 epoch 10 - iter 801/894 - loss 0.00433071 - time (sec): 64.10 - samples/sec: 1207.94 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:19:55,518 epoch 10 - iter 890/894 - loss 0.00411415 - time (sec): 71.22 - samples/sec: 1209.86 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:19:55,836 ----------------------------------------------------------------------------------------------------
2023-10-17 17:19:55,837 EPOCH 10 done: loss 0.0041 - lr: 0.000000
2023-10-17 17:20:07,378 DEV : loss 0.2552371919155121 - f1-score (micro avg) 0.802
2023-10-17 17:20:07,436 saving best model
2023-10-17 17:20:09,515 ----------------------------------------------------------------------------------------------------
2023-10-17 17:20:09,517 Loading model from best epoch ...
2023-10-17 17:20:11,846 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-17 17:20:18,216
Results:
- F-score (micro) 0.7678
- F-score (macro) 0.6812
- Accuracy 0.6375
By class:
              precision    recall  f1-score   support

         loc     0.8517    0.8674    0.8595       596
        pers     0.7018    0.7988    0.7472       333
         org     0.5273    0.4394    0.4793       132
        prod     0.6182    0.5152    0.5620        66
        time     0.7826    0.7347    0.7579        49

   micro avg     0.7611    0.7747    0.7678      1176
   macro avg     0.6963    0.6711    0.6812      1176
weighted avg     0.7569    0.7747    0.7641      1176
2023-10-17 17:20:18,216 ----------------------------------------------------------------------------------------------------
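[Note] The best checkpoint evaluated above can be loaded for inference in the usual Flair way. A small sketch, assuming the checkpoint file name from this log (the sample sentence is made up):

# Minimal inference sketch for the trained checkpoint.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("best-model.pt")

sentence = Sentence("Der Kanton Zürich liegt in der Schweiz .")  # made-up example sentence
tagger.predict(sentence)

# Print each predicted entity span with its label and confidence.
for entity in sentence.get_spans("ner"):
    label = entity.get_label("ner")
    print(entity.text, label.value, round(label.score, 4))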