2023-10-17 22:54:53,029 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:53,030 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 22:54:53,030 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:53,031 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 22:54:53,031 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:53,031 Train: 5901 sentences
2023-10-17 22:54:53,031 (train_with_dev=False, train_with_test=False)
2023-10-17 22:54:53,031 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:53,031 Training Params:
2023-10-17 22:54:53,031 - learning_rate: "5e-05"
2023-10-17 22:54:53,031 - mini_batch_size: "4"
2023-10-17 22:54:53,031 - max_epochs: "10"
2023-10-17 22:54:53,031 - shuffle: "True"
2023-10-17 22:54:53,031 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:53,031 Plugins:
2023-10-17 22:54:53,031 - TensorboardLogger
2023-10-17 22:54:53,031 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 22:54:53,031 ----------------------------------------------------------------------------------------------------
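The learning_rate, mini_batch_size, max_epochs and LinearScheduler warmup_fraction entries above are enough to approximate the per-step learning rate. The following is a minimal sketch, assuming the scheduler ramps linearly from 0 to the peak rate over the warmup steps and then decays linearly to 0; the function name `linear_schedule_lr` is hypothetical and this is not taken from the Flair source.

```python
import math

def linear_schedule_lr(step, total_steps, warmup_fraction, peak_lr):
    """Assumed shape: linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

# Values taken from the log header: 5901 train sentences, batch size 4, 10 epochs.
steps_per_epoch = math.ceil(5901 / 4)
total_steps = steps_per_epoch * 10

def lr_at(epoch, iteration):
    """Learning rate at a given (1-based) epoch and iteration within that epoch."""
    step = (epoch - 1) * steps_per_epoch + iteration
    return linear_schedule_lr(step, total_steps, 0.1, 5e-05)

for epoch, iteration in [(1, 147), (1, 1470), (2, 147), (10, 1470)]:
    print(f"epoch {epoch} - iter {iteration} - lr: {lr_at(epoch, iteration):.6f}")
```

Under these assumptions the warmup spans the whole first epoch, so the rate peaks near 0.000050 at the end of epoch 1 and falls to roughly zero by the end of epoch 10.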
2023-10-17 22:54:53,031 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 22:54:53,031 - metric: "('micro avg', 'f1-score')"
2023-10-17 22:54:53,031 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:53,031 Computation:
2023-10-17 22:54:53,031 - compute on device: cuda:0
2023-10-17 22:54:53,031 - embedding storage: none
2023-10-17 22:54:53,031 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:53,031 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 22:54:53,031 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:53,031 ----------------------------------------------------------------------------------------------------
2023-10-17 22:54:53,032 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 22:55:00,436 epoch 1 - iter 147/1476 - loss 2.61410065 - time (sec): 7.40 - samples/sec: 2290.24 - lr: 0.000005 - momentum: 0.000000
2023-10-17 22:55:07,619 epoch 1 - iter 294/1476 - loss 1.63401124 - time (sec): 14.59 - samples/sec: 2183.35 - lr: 0.000010 - momentum: 0.000000
2023-10-17 22:55:14,513 epoch 1 - iter 441/1476 - loss 1.23823546 - time (sec): 21.48 - samples/sec: 2196.79 - lr: 0.000015 - momentum: 0.000000
2023-10-17 22:55:21,963 epoch 1 - iter 588/1476 - loss 0.99381365 - time (sec): 28.93 - samples/sec: 2221.21 - lr: 0.000020 - momentum: 0.000000
2023-10-17 22:55:29,464 epoch 1 - iter 735/1476 - loss 0.82861680 - time (sec): 36.43 - samples/sec: 2270.11 - lr: 0.000025 - momentum: 0.000000
2023-10-17 22:55:36,924 epoch 1 - iter 882/1476 - loss 0.72615788 - time (sec): 43.89 - samples/sec: 2273.52 - lr: 0.000030 - momentum: 0.000000
2023-10-17 22:55:44,082 epoch 1 - iter 1029/1476 - loss 0.65613249 - time (sec): 51.05 - samples/sec: 2257.24 - lr: 0.000035 - momentum: 0.000000
2023-10-17 22:55:51,290 epoch 1 - iter 1176/1476 - loss 0.59494235 - time (sec): 58.26 - samples/sec: 2262.84 - lr: 0.000040 - momentum: 0.000000
2023-10-17 22:55:58,694 epoch 1 - iter 1323/1476 - loss 0.54631725 - time (sec): 65.66 - samples/sec: 2261.98 - lr: 0.000045 - momentum: 0.000000
2023-10-17 22:56:06,128 epoch 1 - iter 1470/1476 - loss 0.50564963 - time (sec): 73.10 - samples/sec: 2271.56 - lr: 0.000050 - momentum: 0.000000
2023-10-17 22:56:06,396 ----------------------------------------------------------------------------------------------------
2023-10-17 22:56:06,396 EPOCH 1 done: loss 0.5053 - lr: 0.000050
2023-10-17 22:56:12,869 DEV : loss 0.16856464743614197 - f1-score (micro avg) 0.7249
2023-10-17 22:56:12,904 saving best model
2023-10-17 22:56:13,312 ----------------------------------------------------------------------------------------------------
2023-10-17 22:56:20,687 epoch 2 - iter 147/1476 - loss 0.14697420 - time (sec): 7.37 - samples/sec: 2507.53 - lr: 0.000049 - momentum: 0.000000
2023-10-17 22:56:27,733 epoch 2 - iter 294/1476 - loss 0.14960317 - time (sec): 14.42 - samples/sec: 2392.91 - lr: 0.000049 - momentum: 0.000000
2023-10-17 22:56:35,139 epoch 2 - iter 441/1476 - loss 0.14816186 - time (sec): 21.83 - samples/sec: 2317.35 - lr: 0.000048 - momentum: 0.000000
2023-10-17 22:56:42,420 epoch 2 - iter 588/1476 - loss 0.14307432 - time (sec): 29.11 - samples/sec: 2304.79 - lr: 0.000048 - momentum: 0.000000
2023-10-17 22:56:49,369 epoch 2 - iter 735/1476 - loss 0.14205158 - time (sec): 36.06 - samples/sec: 2263.30 - lr: 0.000047 - momentum: 0.000000
2023-10-17 22:56:56,737 epoch 2 - iter 882/1476 - loss 0.13867117 - time (sec): 43.42 - samples/sec: 2252.93 - lr: 0.000047 - momentum: 0.000000
2023-10-17 22:57:03,747 epoch 2 - iter 1029/1476 - loss 0.13860745 - time (sec): 50.43 - samples/sec: 2246.86 - lr: 0.000046 - momentum: 0.000000
2023-10-17 22:57:10,914 epoch 2 - iter 1176/1476 - loss 0.13663928 - time (sec): 57.60 - samples/sec: 2245.71 - lr: 0.000046 - momentum: 0.000000
2023-10-17 22:57:18,134 epoch 2 - iter 1323/1476 - loss 0.13771676 - time (sec): 64.82 - samples/sec: 2253.18 - lr: 0.000045 - momentum: 0.000000
2023-10-17 22:57:25,405 epoch 2 - iter 1470/1476 - loss 0.13622049 - time (sec): 72.09 - samples/sec: 2277.56 - lr: 0.000044 - momentum: 0.000000
2023-10-17 22:57:25,970 ----------------------------------------------------------------------------------------------------
2023-10-17 22:57:25,970 EPOCH 2 done: loss 0.1350 - lr: 0.000044
2023-10-17 22:57:37,562 DEV : loss 0.1357511579990387 - f1-score (micro avg) 0.806
2023-10-17 22:57:37,592 saving best model
2023-10-17 22:57:38,178 ----------------------------------------------------------------------------------------------------
2023-10-17 22:57:45,993 epoch 3 - iter 147/1476 - loss 0.07789933 - time (sec): 7.81 - samples/sec: 2380.81 - lr: 0.000044 - momentum: 0.000000
2023-10-17 22:57:53,060 epoch 3 - iter 294/1476 - loss 0.08181281 - time (sec): 14.88 - samples/sec: 2342.38 - lr: 0.000043 - momentum: 0.000000
2023-10-17 22:57:59,874 epoch 3 - iter 441/1476 - loss 0.08737757 - time (sec): 21.69 - samples/sec: 2288.54 - lr: 0.000043 - momentum: 0.000000
2023-10-17 22:58:06,546 epoch 3 - iter 588/1476 - loss 0.09070383 - time (sec): 28.37 - samples/sec: 2322.34 - lr: 0.000042 - momentum: 0.000000
2023-10-17 22:58:13,353 epoch 3 - iter 735/1476 - loss 0.08945432 - time (sec): 35.17 - samples/sec: 2311.59 - lr: 0.000042 - momentum: 0.000000
2023-10-17 22:58:20,726 epoch 3 - iter 882/1476 - loss 0.08694692 - time (sec): 42.55 - samples/sec: 2319.25 - lr: 0.000041 - momentum: 0.000000
2023-10-17 22:58:28,090 epoch 3 - iter 1029/1476 - loss 0.09008265 - time (sec): 49.91 - samples/sec: 2332.14 - lr: 0.000041 - momentum: 0.000000
2023-10-17 22:58:34,993 epoch 3 - iter 1176/1476 - loss 0.08900752 - time (sec): 56.81 - samples/sec: 2316.86 - lr: 0.000040 - momentum: 0.000000
2023-10-17 22:58:42,197 epoch 3 - iter 1323/1476 - loss 0.08727828 - time (sec): 64.02 - samples/sec: 2310.81 - lr: 0.000039 - momentum: 0.000000
2023-10-17 22:58:49,967 epoch 3 - iter 1470/1476 - loss 0.08534777 - time (sec): 71.79 - samples/sec: 2309.57 - lr: 0.000039 - momentum: 0.000000
2023-10-17 22:58:50,257 ----------------------------------------------------------------------------------------------------
2023-10-17 22:58:50,257 EPOCH 3 done: loss 0.0851 - lr: 0.000039
2023-10-17 22:59:01,890 DEV : loss 0.13451558351516724 - f1-score (micro avg) 0.8094
2023-10-17 22:59:01,925 saving best model
2023-10-17 22:59:02,473 ----------------------------------------------------------------------------------------------------
2023-10-17 22:59:09,707 epoch 4 - iter 147/1476 - loss 0.05105380 - time (sec): 7.22 - samples/sec: 2150.03 - lr: 0.000038 - momentum: 0.000000
2023-10-17 22:59:16,983 epoch 4 - iter 294/1476 - loss 0.04798395 - time (sec): 14.49 - samples/sec: 2164.31 - lr: 0.000038 - momentum: 0.000000
2023-10-17 22:59:24,121 epoch 4 - iter 441/1476 - loss 0.04694055 - time (sec): 21.63 - samples/sec: 2191.91 - lr: 0.000037 - momentum: 0.000000
2023-10-17 22:59:31,928 epoch 4 - iter 588/1476 - loss 0.05761020 - time (sec): 29.44 - samples/sec: 2271.58 - lr: 0.000037 - momentum: 0.000000
2023-10-17 22:59:39,147 epoch 4 - iter 735/1476 - loss 0.06008736 - time (sec): 36.66 - samples/sec: 2287.63 - lr: 0.000036 - momentum: 0.000000
2023-10-17 22:59:46,000 epoch 4 - iter 882/1476 - loss 0.06216032 - time (sec): 43.51 - samples/sec: 2274.28 - lr: 0.000036 - momentum: 0.000000
2023-10-17 22:59:53,352 epoch 4 - iter 1029/1476 - loss 0.06083285 - time (sec): 50.86 - samples/sec: 2265.37 - lr: 0.000035 - momentum: 0.000000
2023-10-17 23:00:00,510 epoch 4 - iter 1176/1476 - loss 0.05979852 - time (sec): 58.02 - samples/sec: 2260.82 - lr: 0.000034 - momentum: 0.000000
2023-10-17 23:00:07,666 epoch 4 - iter 1323/1476 - loss 0.06094034 - time (sec): 65.17 - samples/sec: 2258.56 - lr: 0.000034 - momentum: 0.000000
2023-10-17 23:00:15,525 epoch 4 - iter 1470/1476 - loss 0.06439233 - time (sec): 73.03 - samples/sec: 2271.09 - lr: 0.000033 - momentum: 0.000000
2023-10-17 23:00:15,786 ----------------------------------------------------------------------------------------------------
2023-10-17 23:00:15,787 EPOCH 4 done: loss 0.0647 - lr: 0.000033
2023-10-17 23:00:27,492 DEV : loss 0.17153489589691162 - f1-score (micro avg) 0.8151
2023-10-17 23:00:27,527 saving best model
2023-10-17 23:00:28,067 ----------------------------------------------------------------------------------------------------
2023-10-17 23:00:35,275 epoch 5 - iter 147/1476 - loss 0.02693424 - time (sec): 7.21 - samples/sec: 2208.32 - lr: 0.000033 - momentum: 0.000000
2023-10-17 23:00:43,160 epoch 5 - iter 294/1476 - loss 0.04437274 - time (sec): 15.09 - samples/sec: 2088.84 - lr: 0.000032 - momentum: 0.000000
2023-10-17 23:00:51,347 epoch 5 - iter 441/1476 - loss 0.04231060 - time (sec): 23.28 - samples/sec: 2074.10 - lr: 0.000032 - momentum: 0.000000
2023-10-17 23:00:58,899 epoch 5 - iter 588/1476 - loss 0.03964767 - time (sec): 30.83 - samples/sec: 2130.13 - lr: 0.000031 - momentum: 0.000000
2023-10-17 23:01:06,222 epoch 5 - iter 735/1476 - loss 0.04489672 - time (sec): 38.15 - samples/sec: 2171.31 - lr: 0.000031 - momentum: 0.000000
2023-10-17 23:01:13,502 epoch 5 - iter 882/1476 - loss 0.04379355 - time (sec): 45.43 - samples/sec: 2190.95 - lr: 0.000030 - momentum: 0.000000
2023-10-17 23:01:20,560 epoch 5 - iter 1029/1476 - loss 0.04424447 - time (sec): 52.49 - samples/sec: 2199.56 - lr: 0.000029 - momentum: 0.000000
2023-10-17 23:01:28,426 epoch 5 - iter 1176/1476 - loss 0.04709511 - time (sec): 60.36 - samples/sec: 2242.67 - lr: 0.000029 - momentum: 0.000000
2023-10-17 23:01:35,493 epoch 5 - iter 1323/1476 - loss 0.04587284 - time (sec): 67.42 - samples/sec: 2233.46 - lr: 0.000028 - momentum: 0.000000
2023-10-17 23:01:42,740 epoch 5 - iter 1470/1476 - loss 0.04539351 - time (sec): 74.67 - samples/sec: 2221.97 - lr: 0.000028 - momentum: 0.000000
2023-10-17 23:01:43,007 ----------------------------------------------------------------------------------------------------
2023-10-17 23:01:43,007 EPOCH 5 done: loss 0.0453 - lr: 0.000028
2023-10-17 23:01:55,382 DEV : loss 0.16670416295528412 - f1-score (micro avg) 0.8243
2023-10-17 23:01:55,429 saving best model
2023-10-17 23:01:55,891 ----------------------------------------------------------------------------------------------------
2023-10-17 23:02:03,242 epoch 6 - iter 147/1476 - loss 0.04188951 - time (sec): 7.35 - samples/sec: 2208.61 - lr: 0.000027 - momentum: 0.000000
2023-10-17 23:02:10,517 epoch 6 - iter 294/1476 - loss 0.03674983 - time (sec): 14.62 - samples/sec: 2221.86 - lr: 0.000027 - momentum: 0.000000
2023-10-17 23:02:18,067 epoch 6 - iter 441/1476 - loss 0.03424510 - time (sec): 22.17 - samples/sec: 2259.51 - lr: 0.000026 - momentum: 0.000000
2023-10-17 23:02:25,103 epoch 6 - iter 588/1476 - loss 0.03260930 - time (sec): 29.21 - samples/sec: 2247.83 - lr: 0.000026 - momentum: 0.000000
2023-10-17 23:02:32,401 epoch 6 - iter 735/1476 - loss 0.03228254 - time (sec): 36.51 - samples/sec: 2229.88 - lr: 0.000025 - momentum: 0.000000
2023-10-17 23:02:40,187 epoch 6 - iter 882/1476 - loss 0.03612267 - time (sec): 44.29 - samples/sec: 2240.29 - lr: 0.000024 - momentum: 0.000000
2023-10-17 23:02:47,290 epoch 6 - iter 1029/1476 - loss 0.03358404 - time (sec): 51.40 - samples/sec: 2230.21 - lr: 0.000024 - momentum: 0.000000
2023-10-17 23:02:54,497 epoch 6 - iter 1176/1476 - loss 0.03290098 - time (sec): 58.60 - samples/sec: 2243.05 - lr: 0.000023 - momentum: 0.000000
2023-10-17 23:03:02,407 epoch 6 - iter 1323/1476 - loss 0.03435333 - time (sec): 66.51 - samples/sec: 2251.49 - lr: 0.000023 - momentum: 0.000000
2023-10-17 23:03:09,382 epoch 6 - iter 1470/1476 - loss 0.03301237 - time (sec): 73.49 - samples/sec: 2256.53 - lr: 0.000022 - momentum: 0.000000
2023-10-17 23:03:09,649 ----------------------------------------------------------------------------------------------------
2023-10-17 23:03:09,649 EPOCH 6 done: loss 0.0329 - lr: 0.000022
2023-10-17 23:03:21,566 DEV : loss 0.19735385477542877 - f1-score (micro avg) 0.8315
2023-10-17 23:03:21,602 saving best model
2023-10-17 23:03:22,194 ----------------------------------------------------------------------------------------------------
2023-10-17 23:03:30,571 epoch 7 - iter 147/1476 - loss 0.02345715 - time (sec): 8.37 - samples/sec: 2270.00 - lr: 0.000022 - momentum: 0.000000
2023-10-17 23:03:38,053 epoch 7 - iter 294/1476 - loss 0.02262037 - time (sec): 15.86 - samples/sec: 2241.11 - lr: 0.000021 - momentum: 0.000000
2023-10-17 23:03:45,134 epoch 7 - iter 441/1476 - loss 0.02093600 - time (sec): 22.94 - samples/sec: 2236.74 - lr: 0.000021 - momentum: 0.000000
2023-10-17 23:03:53,381 epoch 7 - iter 588/1476 - loss 0.01790932 - time (sec): 31.18 - samples/sec: 2197.93 - lr: 0.000020 - momentum: 0.000000
2023-10-17 23:04:01,112 epoch 7 - iter 735/1476 - loss 0.01989970 - time (sec): 38.92 - samples/sec: 2148.51 - lr: 0.000019 - momentum: 0.000000
2023-10-17 23:04:08,498 epoch 7 - iter 882/1476 - loss 0.02070017 - time (sec): 46.30 - samples/sec: 2160.07 - lr: 0.000019 - momentum: 0.000000
2023-10-17 23:04:16,020 epoch 7 - iter 1029/1476 - loss 0.02122502 - time (sec): 53.82 - samples/sec: 2166.98 - lr: 0.000018 - momentum: 0.000000
2023-10-17 23:04:23,338 epoch 7 - iter 1176/1476 - loss 0.02137335 - time (sec): 61.14 - samples/sec: 2174.04 - lr: 0.000018 - momentum: 0.000000
2023-10-17 23:04:30,882 epoch 7 - iter 1323/1476 - loss 0.02158579 - time (sec): 68.69 - samples/sec: 2174.28 - lr: 0.000017 - momentum: 0.000000
2023-10-17 23:04:38,059 epoch 7 - iter 1470/1476 - loss 0.02065480 - time (sec): 75.86 - samples/sec: 2184.32 - lr: 0.000017 - momentum: 0.000000
2023-10-17 23:04:38,347 ----------------------------------------------------------------------------------------------------
2023-10-17 23:04:38,347 EPOCH 7 done: loss 0.0206 - lr: 0.000017
2023-10-17 23:04:50,314 DEV : loss 0.21340471506118774 - f1-score (micro avg) 0.839
2023-10-17 23:04:50,348 saving best model
2023-10-17 23:04:50,888 ----------------------------------------------------------------------------------------------------
2023-10-17 23:04:58,545 epoch 8 - iter 147/1476 - loss 0.01817788 - time (sec): 7.64 - samples/sec: 2509.33 - lr: 0.000016 - momentum: 0.000000
2023-10-17 23:05:06,162 epoch 8 - iter 294/1476 - loss 0.01429890 - time (sec): 15.26 - samples/sec: 2279.74 - lr: 0.000016 - momentum: 0.000000
2023-10-17 23:05:13,685 epoch 8 - iter 441/1476 - loss 0.01404232 - time (sec): 22.78 - samples/sec: 2318.88 - lr: 0.000015 - momentum: 0.000000
2023-10-17 23:05:21,215 epoch 8 - iter 588/1476 - loss 0.01341993 - time (sec): 30.31 - samples/sec: 2310.71 - lr: 0.000014 - momentum: 0.000000
2023-10-17 23:05:28,651 epoch 8 - iter 735/1476 - loss 0.01234219 - time (sec): 37.75 - samples/sec: 2281.94 - lr: 0.000014 - momentum: 0.000000
2023-10-17 23:05:36,056 epoch 8 - iter 882/1476 - loss 0.01255617 - time (sec): 45.15 - samples/sec: 2248.13 - lr: 0.000013 - momentum: 0.000000
2023-10-17 23:05:43,528 epoch 8 - iter 1029/1476 - loss 0.01279467 - time (sec): 52.63 - samples/sec: 2236.05 - lr: 0.000013 - momentum: 0.000000
2023-10-17 23:05:50,830 epoch 8 - iter 1176/1476 - loss 0.01363296 - time (sec): 59.93 - samples/sec: 2241.20 - lr: 0.000012 - momentum: 0.000000
2023-10-17 23:05:58,137 epoch 8 - iter 1323/1476 - loss 0.01384585 - time (sec): 67.24 - samples/sec: 2236.28 - lr: 0.000012 - momentum: 0.000000
2023-10-17 23:06:05,073 epoch 8 - iter 1470/1476 - loss 0.01355309 - time (sec): 74.17 - samples/sec: 2235.95 - lr: 0.000011 - momentum: 0.000000
2023-10-17 23:06:05,338 ----------------------------------------------------------------------------------------------------
2023-10-17 23:06:05,338 EPOCH 8 done: loss 0.0135 - lr: 0.000011
2023-10-17 23:06:16,942 DEV : loss 0.21790701150894165 - f1-score (micro avg) 0.8359
2023-10-17 23:06:16,973 ----------------------------------------------------------------------------------------------------
2023-10-17 23:06:23,866 epoch 9 - iter 147/1476 - loss 0.00401227 - time (sec): 6.89 - samples/sec: 2184.20 - lr: 0.000011 - momentum: 0.000000
2023-10-17 23:06:31,070 epoch 9 - iter 294/1476 - loss 0.00427172 - time (sec): 14.10 - samples/sec: 2209.47 - lr: 0.000010 - momentum: 0.000000
2023-10-17 23:06:38,966 epoch 9 - iter 441/1476 - loss 0.00795735 - time (sec): 21.99 - samples/sec: 2360.74 - lr: 0.000009 - momentum: 0.000000
2023-10-17 23:06:45,704 epoch 9 - iter 588/1476 - loss 0.00775844 - time (sec): 28.73 - samples/sec: 2314.97 - lr: 0.000009 - momentum: 0.000000
2023-10-17 23:06:52,721 epoch 9 - iter 735/1476 - loss 0.00709240 - time (sec): 35.75 - samples/sec: 2300.82 - lr: 0.000008 - momentum: 0.000000
2023-10-17 23:07:00,353 epoch 9 - iter 882/1476 - loss 0.00746520 - time (sec): 43.38 - samples/sec: 2317.72 - lr: 0.000008 - momentum: 0.000000
2023-10-17 23:07:07,391 epoch 9 - iter 1029/1476 - loss 0.00734797 - time (sec): 50.42 - samples/sec: 2292.46 - lr: 0.000007 - momentum: 0.000000
2023-10-17 23:07:14,735 epoch 9 - iter 1176/1476 - loss 0.00905542 - time (sec): 57.76 - samples/sec: 2298.94 - lr: 0.000007 - momentum: 0.000000
2023-10-17 23:07:21,931 epoch 9 - iter 1323/1476 - loss 0.00876966 - time (sec): 64.96 - samples/sec: 2303.55 - lr: 0.000006 - momentum: 0.000000
2023-10-17 23:07:28,917 epoch 9 - iter 1470/1476 - loss 0.00840203 - time (sec): 71.94 - samples/sec: 2301.70 - lr: 0.000006 - momentum: 0.000000
2023-10-17 23:07:29,220 ----------------------------------------------------------------------------------------------------
2023-10-17 23:07:29,220 EPOCH 9 done: loss 0.0084 - lr: 0.000006
2023-10-17 23:07:40,884 DEV : loss 0.22359006106853485 - f1-score (micro avg) 0.8372
2023-10-17 23:07:40,923 ----------------------------------------------------------------------------------------------------
2023-10-17 23:07:47,911 epoch 10 - iter 147/1476 - loss 0.00147687 - time (sec): 6.99 - samples/sec: 2119.63 - lr: 0.000005 - momentum: 0.000000
2023-10-17 23:07:55,405 epoch 10 - iter 294/1476 - loss 0.00457169 - time (sec): 14.48 - samples/sec: 2132.53 - lr: 0.000004 - momentum: 0.000000
2023-10-17 23:08:02,488 epoch 10 - iter 441/1476 - loss 0.00420220 - time (sec): 21.56 - samples/sec: 2167.19 - lr: 0.000004 - momentum: 0.000000
2023-10-17 23:08:10,041 epoch 10 - iter 588/1476 - loss 0.00602254 - time (sec): 29.12 - samples/sec: 2263.71 - lr: 0.000003 - momentum: 0.000000
2023-10-17 23:08:17,049 epoch 10 - iter 735/1476 - loss 0.00562953 - time (sec): 36.12 - samples/sec: 2252.28 - lr: 0.000003 - momentum: 0.000000
2023-10-17 23:08:24,354 epoch 10 - iter 882/1476 - loss 0.00721583 - time (sec): 43.43 - samples/sec: 2268.84 - lr: 0.000002 - momentum: 0.000000
2023-10-17 23:08:31,340 epoch 10 - iter 1029/1476 - loss 0.00711626 - time (sec): 50.41 - samples/sec: 2272.54 - lr: 0.000002 - momentum: 0.000000
2023-10-17 23:08:38,403 epoch 10 - iter 1176/1476 - loss 0.00666628 - time (sec): 57.48 - samples/sec: 2277.09 - lr: 0.000001 - momentum: 0.000000
2023-10-17 23:08:45,938 epoch 10 - iter 1323/1476 - loss 0.00644189 - time (sec): 65.01 - samples/sec: 2299.98 - lr: 0.000001 - momentum: 0.000000
2023-10-17 23:08:53,126 epoch 10 - iter 1470/1476 - loss 0.00604544 - time (sec): 72.20 - samples/sec: 2291.20 - lr: 0.000000 - momentum: 0.000000
2023-10-17 23:08:53,437 ----------------------------------------------------------------------------------------------------
2023-10-17 23:08:53,437 EPOCH 10 done: loss 0.0060 - lr: 0.000000
2023-10-17 23:09:05,404 DEV : loss 0.2264208048582077 - f1-score (micro avg) 0.838
2023-10-17 23:09:05,856 ----------------------------------------------------------------------------------------------------
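The final evaluation uses the checkpoint from the epoch with the best dev micro-averaged F1. As a rough sketch of how that epoch could be recovered from the DEV lines of this log, the snippet below parses them with a regular expression; the helper name `best_dev_epoch` is hypothetical, and it assumes one DEV line per epoch in order, with ties resolved in favour of the earlier epoch.

```python
import re

# DEV lines copied from the log above (timestamps omitted).
DEV_LINES = """\
DEV : loss 0.16856464743614197 - f1-score (micro avg) 0.7249
DEV : loss 0.1357511579990387 - f1-score (micro avg) 0.806
DEV : loss 0.13451558351516724 - f1-score (micro avg) 0.8094
DEV : loss 0.17153489589691162 - f1-score (micro avg) 0.8151
DEV : loss 0.16670416295528412 - f1-score (micro avg) 0.8243
DEV : loss 0.19735385477542877 - f1-score (micro avg) 0.8315
DEV : loss 0.21340471506118774 - f1-score (micro avg) 0.839
DEV : loss 0.21790701150894165 - f1-score (micro avg) 0.8359
DEV : loss 0.22359006106853485 - f1-score (micro avg) 0.8372
DEV : loss 0.2264208048582077 - f1-score (micro avg) 0.838
"""

DEV_PATTERN = re.compile(r"DEV : loss (\S+) - f1-score \(micro avg\)\s+(\S+)")

def best_dev_epoch(log_text):
    """Return (epoch, f1) for the highest dev F1; epochs are counted from 1."""
    scores = [float(m.group(2)) for m in DEV_PATTERN.finditer(log_text)]
    best_index = max(range(len(scores)), key=lambda i: (scores[i], -i))
    return best_index + 1, scores[best_index]

best_epoch, best_f1 = best_dev_epoch(DEV_LINES)
print(f"best epoch: {best_epoch} (dev micro F1 {best_f1})")
```

The result agrees with the log, where "saving best model" appears for the last time after epoch 7.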
2023-10-17 23:09:05,857 Loading model from best epoch ...
2023-10-17 23:09:07,337 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
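The 21-tag dictionary above follows a BIOES-style layout: one O tag plus S-, B-, E- and I- variants for each of the five entity types, which matches the 21 output features of the final linear layer. The sketch below rebuilds that tag list and decodes a tag sequence into spans; the function names are hypothetical and the decoding rules are a simplified assumption, not Flair's own implementation.

```python
ENTITY_TYPES = ["loc", "pers", "org", "time", "prod"]

def build_bioes_tags(entity_types):
    """Build the tag list in the same order as the logged dictionary."""
    tags = ["O"]
    for entity_type in entity_types:
        tags.extend(f"{prefix}-{entity_type}" for prefix in ("S", "B", "E", "I"))
    return tags

def decode_bioes(tags):
    """Convert a BIOES tag sequence to (start, end, type) spans, end inclusive.

    Malformed sequences (e.g. an E- tag without a matching B-) are dropped.
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, entity_type = tag.split("-", 1)
        if prefix == "S":
            spans.append((i, i, entity_type))
            start = None
        elif prefix == "B":
            start, label = i, entity_type
        elif prefix == "E" and start is not None and label == entity_type:
            spans.append((start, i, entity_type))
            start = None
        elif prefix == "I" and start is not None and label == entity_type:
            continue  # still inside the current span
        else:
            start = None  # inconsistent tag: discard the open span
    return spans

all_tags = build_bioes_tags(ENTITY_TYPES)
example = ["O", "B-pers", "E-pers", "O", "S-loc"]
print(len(all_tags), decode_bioes(example))
```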
2023-10-17 23:09:13,652
Results:
- F-score (micro) 0.8038
- F-score (macro) 0.702
- Accuracy 0.6918
By class:
              precision    recall  f1-score   support

         loc     0.8422    0.8834    0.8623       858
        pers     0.7978    0.8007    0.7993       537
         org     0.6016    0.5833    0.5923       132
        prod     0.6508    0.6721    0.6613        61
        time     0.5373    0.6667    0.5950        54

   micro avg     0.7908    0.8173    0.8038      1642
   macro avg     0.6859    0.7213    0.7020      1642
weighted avg     0.7912    0.8173    0.8037      1642
2023-10-17 23:09:13,652 ----------------------------------------------------------------------------------------------------
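The averaged rows of the per-class table can be cross-checked from the class rows. The sketch below recomputes the macro and weighted F1 from the rounded per-class values and the micro F1 from the logged micro precision and recall; because the inputs are rounded to four decimals, results may differ from the log in the last digit. The helper names are illustrative only.

```python
# (precision, recall, f1, support) per class, copied from the table above.
ROWS = {
    "loc": (0.8422, 0.8834, 0.8623, 858),
    "pers": (0.7978, 0.8007, 0.7993, 537),
    "org": (0.6016, 0.5833, 0.5923, 132),
    "prod": (0.6508, 0.6721, 0.6613, 61),
    "time": (0.5373, 0.6667, 0.5950, 54),
}

def macro_avg(rows, col):
    """Unweighted mean of one column across classes."""
    return sum(r[col] for r in rows.values()) / len(rows)

def weighted_avg(rows, col):
    """Mean of one column weighted by class support."""
    total = sum(r[3] for r in rows.values())
    return sum(r[col] * r[3] for r in rows.values()) / total

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

total_support = sum(r[3] for r in ROWS.values())
macro_f1 = macro_avg(ROWS, 2)
weighted_f1 = weighted_avg(ROWS, 2)
micro_f1 = f1(0.7908, 0.8173)  # micro precision and recall from the log

print(f"support={total_support} macro={macro_f1:.4f} "
      f"weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```

The gap between micro F1 (about 0.80) and macro F1 (about 0.70) reflects the weaker scores on the low-support classes org, prod and time, which count equally in the macro average.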