Upload ./training.log with huggingface_hub
training.log ADDED (+242 -0)
@@ -0,0 +1,242 @@
2023-10-25 21:07:49,395 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Train:  1166 sentences
2023-10-25 21:07:49,396         (train_with_dev=False, train_with_test=False)
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Training Params:
2023-10-25 21:07:49,396  - learning_rate: "5e-05"
2023-10-25 21:07:49,396  - mini_batch_size: "8"
2023-10-25 21:07:49,396  - max_epochs: "10"
2023-10-25 21:07:49,396  - shuffle: "True"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Plugins:
2023-10-25 21:07:49,396  - TensorboardLogger
2023-10-25 21:07:49,396  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:07:49,396  - metric: "('micro avg', 'f1-score')"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Computation:
2023-10-25 21:07:49,396  - compute on device: cuda:0
2023-10-25 21:07:49,396  - embedding storage: none
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,397 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:07:50,189 epoch 1 - iter 14/146 - loss 2.76093476 - time (sec): 0.79 - samples/sec: 4306.04 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:50,999 epoch 1 - iter 28/146 - loss 2.16927590 - time (sec): 1.60 - samples/sec: 4446.03 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:07:51,943 epoch 1 - iter 42/146 - loss 1.56576289 - time (sec): 2.55 - samples/sec: 4650.49 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:07:52,744 epoch 1 - iter 56/146 - loss 1.31295193 - time (sec): 3.35 - samples/sec: 4680.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:07:53,535 epoch 1 - iter 70/146 - loss 1.14715761 - time (sec): 4.14 - samples/sec: 4718.31 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:07:54,508 epoch 1 - iter 84/146 - loss 1.02143305 - time (sec): 5.11 - samples/sec: 4686.68 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:07:55,436 epoch 1 - iter 98/146 - loss 0.90968592 - time (sec): 6.04 - samples/sec: 4766.54 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:07:56,476 epoch 1 - iter 112/146 - loss 0.82600257 - time (sec): 7.08 - samples/sec: 4773.82 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:07:57,299 epoch 1 - iter 126/146 - loss 0.76327971 - time (sec): 7.90 - samples/sec: 4816.97 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:07:58,227 epoch 1 - iter 140/146 - loss 0.70269213 - time (sec): 8.83 - samples/sec: 4846.77 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:07:58,636 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:58,637 EPOCH 1 done: loss 0.6875 - lr: 0.000048
2023-10-25 21:07:59,149 DEV : loss 0.1478959619998932 - f1-score (micro avg)  0.6147
2023-10-25 21:07:59,153 saving best model
2023-10-25 21:07:59,662 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:00,616 epoch 2 - iter 14/146 - loss 0.16108065 - time (sec): 0.95 - samples/sec: 4811.08 - lr: 0.000050 - momentum: 0.000000
2023-10-25 21:08:01,592 epoch 2 - iter 28/146 - loss 0.15067828 - time (sec): 1.93 - samples/sec: 4899.11 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:08:02,467 epoch 2 - iter 42/146 - loss 0.15492269 - time (sec): 2.80 - samples/sec: 4896.36 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:08:03,367 epoch 2 - iter 56/146 - loss 0.16102186 - time (sec): 3.70 - samples/sec: 4832.43 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:08:04,144 epoch 2 - iter 70/146 - loss 0.15945960 - time (sec): 4.48 - samples/sec: 4838.41 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:08:04,909 epoch 2 - iter 84/146 - loss 0.16413615 - time (sec): 5.25 - samples/sec: 4824.82 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:08:05,687 epoch 2 - iter 98/146 - loss 0.15984296 - time (sec): 6.02 - samples/sec: 4843.16 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:08:06,672 epoch 2 - iter 112/146 - loss 0.15388793 - time (sec): 7.01 - samples/sec: 4844.91 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:08:07,514 epoch 2 - iter 126/146 - loss 0.14973376 - time (sec): 7.85 - samples/sec: 4895.41 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:08:08,387 epoch 2 - iter 140/146 - loss 0.14951333 - time (sec): 8.72 - samples/sec: 4914.47 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:08:08,740 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:08,740 EPOCH 2 done: loss 0.1491 - lr: 0.000045
2023-10-25 21:08:09,802 DEV : loss 0.10244478285312653 - f1-score (micro avg)  0.722
2023-10-25 21:08:09,806 saving best model
2023-10-25 21:08:10,488 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:11,393 epoch 3 - iter 14/146 - loss 0.09051348 - time (sec): 0.90 - samples/sec: 4622.43 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:08:12,169 epoch 3 - iter 28/146 - loss 0.08682021 - time (sec): 1.68 - samples/sec: 4486.31 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:08:13,122 epoch 3 - iter 42/146 - loss 0.08134884 - time (sec): 2.63 - samples/sec: 4551.81 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:08:13,976 epoch 3 - iter 56/146 - loss 0.07868931 - time (sec): 3.49 - samples/sec: 4454.44 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:08:15,095 epoch 3 - iter 70/146 - loss 0.08088185 - time (sec): 4.60 - samples/sec: 4610.30 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:08:15,955 epoch 3 - iter 84/146 - loss 0.08151152 - time (sec): 5.47 - samples/sec: 4731.13 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:08:16,823 epoch 3 - iter 98/146 - loss 0.08031537 - time (sec): 6.33 - samples/sec: 4791.34 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:08:17,535 epoch 3 - iter 112/146 - loss 0.08190989 - time (sec): 7.05 - samples/sec: 4836.59 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:08:18,326 epoch 3 - iter 126/146 - loss 0.08448286 - time (sec): 7.84 - samples/sec: 4838.21 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:08:19,203 epoch 3 - iter 140/146 - loss 0.08269211 - time (sec): 8.71 - samples/sec: 4854.04 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:08:19,638 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:19,639 EPOCH 3 done: loss 0.0836 - lr: 0.000039
2023-10-25 21:08:20,557 DEV : loss 0.10184833407402039 - f1-score (micro avg)  0.7212
2023-10-25 21:08:20,562 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:21,520 epoch 4 - iter 14/146 - loss 0.06152145 - time (sec): 0.96 - samples/sec: 5271.53 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:08:22,334 epoch 4 - iter 28/146 - loss 0.05606755 - time (sec): 1.77 - samples/sec: 4945.36 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:08:23,145 epoch 4 - iter 42/146 - loss 0.05920449 - time (sec): 2.58 - samples/sec: 4896.12 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:08:24,097 epoch 4 - iter 56/146 - loss 0.05389913 - time (sec): 3.53 - samples/sec: 4806.51 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:08:24,831 epoch 4 - iter 70/146 - loss 0.05447504 - time (sec): 4.27 - samples/sec: 4767.50 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:08:25,787 epoch 4 - iter 84/146 - loss 0.05784428 - time (sec): 5.22 - samples/sec: 4714.07 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:08:26,661 epoch 4 - iter 98/146 - loss 0.05700628 - time (sec): 6.10 - samples/sec: 4729.43 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:08:27,499 epoch 4 - iter 112/146 - loss 0.05350858 - time (sec): 6.94 - samples/sec: 4689.34 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:08:28,507 epoch 4 - iter 126/146 - loss 0.05232477 - time (sec): 7.94 - samples/sec: 4696.73 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:08:29,417 epoch 4 - iter 140/146 - loss 0.05225578 - time (sec): 8.85 - samples/sec: 4793.42 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:08:29,757 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:29,757 EPOCH 4 done: loss 0.0523 - lr: 0.000034
2023-10-25 21:08:30,673 DEV : loss 0.10563240945339203 - f1-score (micro avg)  0.7404
2023-10-25 21:08:30,678 saving best model
2023-10-25 21:08:31,335 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:32,246 epoch 5 - iter 14/146 - loss 0.02488614 - time (sec): 0.91 - samples/sec: 5101.79 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:08:33,018 epoch 5 - iter 28/146 - loss 0.02254102 - time (sec): 1.68 - samples/sec: 4976.30 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:08:33,914 epoch 5 - iter 42/146 - loss 0.02974599 - time (sec): 2.58 - samples/sec: 5077.10 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:08:34,838 epoch 5 - iter 56/146 - loss 0.02885809 - time (sec): 3.50 - samples/sec: 4900.07 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:08:35,787 epoch 5 - iter 70/146 - loss 0.02769486 - time (sec): 4.45 - samples/sec: 4744.69 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:08:36,619 epoch 5 - iter 84/146 - loss 0.02843650 - time (sec): 5.28 - samples/sec: 4729.31 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:08:37,565 epoch 5 - iter 98/146 - loss 0.03048721 - time (sec): 6.23 - samples/sec: 4678.06 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:08:38,453 epoch 5 - iter 112/146 - loss 0.03202213 - time (sec): 7.12 - samples/sec: 4695.52 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:08:39,677 epoch 5 - iter 126/146 - loss 0.03216288 - time (sec): 8.34 - samples/sec: 4614.52 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:08:40,447 epoch 5 - iter 140/146 - loss 0.03299174 - time (sec): 9.11 - samples/sec: 4678.91 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:08:40,842 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:40,842 EPOCH 5 done: loss 0.0327 - lr: 0.000028
2023-10-25 21:08:41,755 DEV : loss 0.10233564674854279 - f1-score (micro avg)  0.7706
2023-10-25 21:08:41,760 saving best model
2023-10-25 21:08:42,313 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:43,115 epoch 6 - iter 14/146 - loss 0.01610677 - time (sec): 0.80 - samples/sec: 5199.94 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:08:44,073 epoch 6 - iter 28/146 - loss 0.01771595 - time (sec): 1.76 - samples/sec: 4763.73 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:08:45,030 epoch 6 - iter 42/146 - loss 0.01749347 - time (sec): 2.72 - samples/sec: 4851.45 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:08:45,893 epoch 6 - iter 56/146 - loss 0.01637983 - time (sec): 3.58 - samples/sec: 4827.76 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:08:46,839 epoch 6 - iter 70/146 - loss 0.02013402 - time (sec): 4.52 - samples/sec: 4762.24 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:08:47,690 epoch 6 - iter 84/146 - loss 0.01826125 - time (sec): 5.38 - samples/sec: 4727.79 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:08:48,586 epoch 6 - iter 98/146 - loss 0.01955487 - time (sec): 6.27 - samples/sec: 4801.37 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:08:49,583 epoch 6 - iter 112/146 - loss 0.02207506 - time (sec): 7.27 - samples/sec: 4730.97 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:08:50,408 epoch 6 - iter 126/146 - loss 0.02262062 - time (sec): 8.09 - samples/sec: 4746.73 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:08:51,267 epoch 6 - iter 140/146 - loss 0.02172538 - time (sec): 8.95 - samples/sec: 4777.39 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:08:51,616 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:51,617 EPOCH 6 done: loss 0.0229 - lr: 0.000023
2023-10-25 21:08:52,528 DEV : loss 0.127528578042984 - f1-score (micro avg)  0.7511
2023-10-25 21:08:52,532 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:53,355 epoch 7 - iter 14/146 - loss 0.01150269 - time (sec): 0.82 - samples/sec: 5084.44 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:08:54,542 epoch 7 - iter 28/146 - loss 0.01576881 - time (sec): 2.01 - samples/sec: 5019.70 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:08:55,297 epoch 7 - iter 42/146 - loss 0.01762664 - time (sec): 2.76 - samples/sec: 4869.93 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:08:56,135 epoch 7 - iter 56/146 - loss 0.01668696 - time (sec): 3.60 - samples/sec: 4798.61 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:08:56,965 epoch 7 - iter 70/146 - loss 0.01515176 - time (sec): 4.43 - samples/sec: 4829.07 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:08:57,897 epoch 7 - iter 84/146 - loss 0.01337344 - time (sec): 5.36 - samples/sec: 4908.17 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:08:58,754 epoch 7 - iter 98/146 - loss 0.01419310 - time (sec): 6.22 - samples/sec: 4934.76 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:08:59,525 epoch 7 - iter 112/146 - loss 0.01622088 - time (sec): 6.99 - samples/sec: 4885.45 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:09:00,360 epoch 7 - iter 126/146 - loss 0.01549015 - time (sec): 7.83 - samples/sec: 4890.68 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:09:01,272 epoch 7 - iter 140/146 - loss 0.01516680 - time (sec): 8.74 - samples/sec: 4861.79 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:09:01,676 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:01,676 EPOCH 7 done: loss 0.0146 - lr: 0.000017
2023-10-25 21:09:02,587 DEV : loss 0.13177402317523956 - f1-score (micro avg)  0.7689
2023-10-25 21:09:02,591 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:03,473 epoch 8 - iter 14/146 - loss 0.02475525 - time (sec): 0.88 - samples/sec: 4376.44 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:09:04,487 epoch 8 - iter 28/146 - loss 0.01803622 - time (sec): 1.90 - samples/sec: 4481.02 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:09:05,574 epoch 8 - iter 42/146 - loss 0.01556120 - time (sec): 2.98 - samples/sec: 4529.62 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:09:06,474 epoch 8 - iter 56/146 - loss 0.01480972 - time (sec): 3.88 - samples/sec: 4494.81 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:09:07,396 epoch 8 - iter 70/146 - loss 0.01592848 - time (sec): 4.80 - samples/sec: 4523.32 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:09:08,211 epoch 8 - iter 84/146 - loss 0.01469362 - time (sec): 5.62 - samples/sec: 4618.26 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:09:08,991 epoch 8 - iter 98/146 - loss 0.01471966 - time (sec): 6.40 - samples/sec: 4605.33 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:09:09,977 epoch 8 - iter 112/146 - loss 0.01407256 - time (sec): 7.39 - samples/sec: 4671.32 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:09:10,789 epoch 8 - iter 126/146 - loss 0.01427331 - time (sec): 8.20 - samples/sec: 4669.48 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:09:11,640 epoch 8 - iter 140/146 - loss 0.01285895 - time (sec): 9.05 - samples/sec: 4747.94 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:09:11,945 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:11,946 EPOCH 8 done: loss 0.0125 - lr: 0.000012
2023-10-25 21:09:13,015 DEV : loss 0.14684733748435974 - f1-score (micro avg)  0.7458
2023-10-25 21:09:13,020 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:13,897 epoch 9 - iter 14/146 - loss 0.00374008 - time (sec): 0.88 - samples/sec: 5312.36 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:09:14,741 epoch 9 - iter 28/146 - loss 0.00602499 - time (sec): 1.72 - samples/sec: 5187.02 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:09:15,543 epoch 9 - iter 42/146 - loss 0.00469098 - time (sec): 2.52 - samples/sec: 4978.45 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:09:16,616 epoch 9 - iter 56/146 - loss 0.00530783 - time (sec): 3.59 - samples/sec: 4869.76 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:09:17,612 epoch 9 - iter 70/146 - loss 0.00888826 - time (sec): 4.59 - samples/sec: 4850.68 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:09:18,526 epoch 9 - iter 84/146 - loss 0.00861823 - time (sec): 5.50 - samples/sec: 4792.41 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:09:19,463 epoch 9 - iter 98/146 - loss 0.00801291 - time (sec): 6.44 - samples/sec: 4800.57 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:09:20,244 epoch 9 - iter 112/146 - loss 0.00827974 - time (sec): 7.22 - samples/sec: 4759.68 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:09:21,147 epoch 9 - iter 126/146 - loss 0.00760175 - time (sec): 8.13 - samples/sec: 4737.94 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:09:22,026 epoch 9 - iter 140/146 - loss 0.00722376 - time (sec): 9.00 - samples/sec: 4748.75 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:09:22,355 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:22,355 EPOCH 9 done: loss 0.0073 - lr: 0.000006
2023-10-25 21:09:23,267 DEV : loss 0.1549414098262787 - f1-score (micro avg)  0.7692
2023-10-25 21:09:23,271 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:24,097 epoch 10 - iter 14/146 - loss 0.00128690 - time (sec): 0.82 - samples/sec: 5057.66 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:09:24,940 epoch 10 - iter 28/146 - loss 0.00525616 - time (sec): 1.67 - samples/sec: 5092.22 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:09:25,838 epoch 10 - iter 42/146 - loss 0.00459227 - time (sec): 2.57 - samples/sec: 4870.74 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:09:26,823 epoch 10 - iter 56/146 - loss 0.00708331 - time (sec): 3.55 - samples/sec: 4768.34 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:09:27,709 epoch 10 - iter 70/146 - loss 0.00587957 - time (sec): 4.44 - samples/sec: 4762.88 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:09:28,497 epoch 10 - iter 84/146 - loss 0.00554849 - time (sec): 5.22 - samples/sec: 4713.50 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:09:29,533 epoch 10 - iter 98/146 - loss 0.00596721 - time (sec): 6.26 - samples/sec: 4666.95 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:09:30,455 epoch 10 - iter 112/146 - loss 0.00572547 - time (sec): 7.18 - samples/sec: 4724.35 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:09:31,589 epoch 10 - iter 126/146 - loss 0.00547473 - time (sec): 8.32 - samples/sec: 4612.75 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:09:32,468 epoch 10 - iter 140/146 - loss 0.00498630 - time (sec): 9.20 - samples/sec: 4651.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:09:32,839 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:32,839 EPOCH 10 done: loss 0.0049 - lr: 0.000000
2023-10-25 21:09:33,748 DEV : loss 0.15722544491291046 - f1-score (micro avg)  0.7706
2023-10-25 21:09:34,271 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:34,272 Loading model from best epoch ...
2023-10-25 21:09:35,987 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 21:09:37,504
Results:
- F-score (micro) 0.7628
- F-score (macro) 0.6702
- Accuracy 0.6352

By class:
              precision    recall  f1-score   support

         PER     0.8319    0.8391    0.8355       348
         LOC     0.6656    0.8314    0.7394       261
         ORG     0.4468    0.4038    0.4242        52
   HumanProd     0.6818    0.6818    0.6818        22

   micro avg     0.7306    0.7980    0.7628       683
   macro avg     0.6565    0.6890    0.6702       683
weighted avg     0.7342    0.7980    0.7625       683

2023-10-25 21:09:37,504 ----------------------------------------------------------------------------------------------------