2023-10-25 21:05:36,363 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Train:  1166 sentences
2023-10-25 21:05:36,364         (train_with_dev=False, train_with_test=False)
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Training Params:
2023-10-25 21:05:36,364  - learning_rate: "3e-05" 
2023-10-25 21:05:36,364  - mini_batch_size: "8"
2023-10-25 21:05:36,364  - max_epochs: "10"
2023-10-25 21:05:36,364  - shuffle: "True"
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,364 Plugins:
2023-10-25 21:05:36,364  - TensorboardLogger
2023-10-25 21:05:36,364  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
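The LinearScheduler with warmup_fraction 0.1 explains the lr column in the epoch logs below: the rate climbs linearly toward the peak of 3e-05 over the first ~10% of steps (146 of the 10 × 146 = 1460 total batches), then decays linearly to zero. A minimal sketch of that schedule, assuming the standard linear-warmup/linear-decay formula (the real implementation lives inside Flair's scheduler plugin):

```python
def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Sketch of a linear warmup + linear decay schedule.

    Ramps from 0 to peak_lr over the first warmup_fraction of steps,
    then decays linearly back to 0 by the final step.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


# Matches the log: epoch 1, iter 14 of 1460 total steps shows lr ~0.000003
print(linear_warmup_lr(14, 1460, 3e-05))
```

At step 146 the rate peaks at 3e-05 (as seen at the start of epoch 2), and by the last step of epoch 10 it has decayed to 0.000000, consistent with the final log lines.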
2023-10-25 21:05:36,364 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:05:36,364  - metric: "('micro avg', 'f1-score')"
2023-10-25 21:05:36,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,365 Computation:
2023-10-25 21:05:36,365  - compute on device: cuda:0
2023-10-25 21:05:36,365  - embedding storage: none
2023-10-25 21:05:36,365 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,365 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 21:05:36,365 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,365 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:36,365 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:05:37,164 epoch 1 - iter 14/146 - loss 2.83025878 - time (sec): 0.80 - samples/sec: 4267.88 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:05:37,955 epoch 1 - iter 28/146 - loss 2.46691922 - time (sec): 1.59 - samples/sec: 4479.04 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:05:38,902 epoch 1 - iter 42/146 - loss 1.86829011 - time (sec): 2.54 - samples/sec: 4669.08 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:05:39,724 epoch 1 - iter 56/146 - loss 1.54384758 - time (sec): 3.36 - samples/sec: 4663.76 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:05:40,514 epoch 1 - iter 70/146 - loss 1.34624853 - time (sec): 4.15 - samples/sec: 4705.75 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:05:41,496 epoch 1 - iter 84/146 - loss 1.20137129 - time (sec): 5.13 - samples/sec: 4668.86 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:05:42,434 epoch 1 - iter 98/146 - loss 1.07011610 - time (sec): 6.07 - samples/sec: 4743.31 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:05:43,498 epoch 1 - iter 112/146 - loss 0.97878778 - time (sec): 7.13 - samples/sec: 4737.41 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:05:44,331 epoch 1 - iter 126/146 - loss 0.90038816 - time (sec): 7.96 - samples/sec: 4778.42 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:05:45,275 epoch 1 - iter 140/146 - loss 0.82667252 - time (sec): 8.91 - samples/sec: 4803.31 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:05:45,697 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:45,697 EPOCH 1 done: loss 0.8075 - lr: 0.000029
2023-10-25 21:05:46,358 DEV : loss 0.17039310932159424 - f1-score (micro avg)  0.5702
2023-10-25 21:05:46,362 saving best model
2023-10-25 21:05:46,879 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:47,843 epoch 2 - iter 14/146 - loss 0.20229098 - time (sec): 0.96 - samples/sec: 4761.19 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:05:48,858 epoch 2 - iter 28/146 - loss 0.18766990 - time (sec): 1.98 - samples/sec: 4776.95 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:05:49,755 epoch 2 - iter 42/146 - loss 0.18648775 - time (sec): 2.88 - samples/sec: 4775.20 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:05:50,655 epoch 2 - iter 56/146 - loss 0.19029101 - time (sec): 3.77 - samples/sec: 4741.42 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:05:51,431 epoch 2 - iter 70/146 - loss 0.18913930 - time (sec): 4.55 - samples/sec: 4763.79 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:05:52,200 epoch 2 - iter 84/146 - loss 0.19230771 - time (sec): 5.32 - samples/sec: 4757.70 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:05:53,030 epoch 2 - iter 98/146 - loss 0.18747787 - time (sec): 6.15 - samples/sec: 4744.20 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:05:54,055 epoch 2 - iter 112/146 - loss 0.18004954 - time (sec): 7.18 - samples/sec: 4732.79 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:05:54,915 epoch 2 - iter 126/146 - loss 0.17380432 - time (sec): 8.03 - samples/sec: 4783.21 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:05:55,761 epoch 2 - iter 140/146 - loss 0.17394433 - time (sec): 8.88 - samples/sec: 4827.33 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:05:56,111 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:56,111 EPOCH 2 done: loss 0.1735 - lr: 0.000027
2023-10-25 21:05:57,015 DEV : loss 0.10457519441843033 - f1-score (micro avg)  0.7177
2023-10-25 21:05:57,019 saving best model
2023-10-25 21:05:57,704 ----------------------------------------------------------------------------------------------------
2023-10-25 21:05:58,621 epoch 3 - iter 14/146 - loss 0.09931197 - time (sec): 0.91 - samples/sec: 4567.22 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:05:59,407 epoch 3 - iter 28/146 - loss 0.09405829 - time (sec): 1.70 - samples/sec: 4431.34 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:06:00,370 epoch 3 - iter 42/146 - loss 0.08962058 - time (sec): 2.66 - samples/sec: 4500.54 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:06:01,259 epoch 3 - iter 56/146 - loss 0.08813612 - time (sec): 3.55 - samples/sec: 4372.04 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:06:02,395 epoch 3 - iter 70/146 - loss 0.09225973 - time (sec): 4.69 - samples/sec: 4529.05 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:06:03,276 epoch 3 - iter 84/146 - loss 0.09278810 - time (sec): 5.57 - samples/sec: 4642.69 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:06:04,165 epoch 3 - iter 98/146 - loss 0.09139854 - time (sec): 6.46 - samples/sec: 4698.69 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:06:04,901 epoch 3 - iter 112/146 - loss 0.09379452 - time (sec): 7.19 - samples/sec: 4736.93 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:06:05,723 epoch 3 - iter 126/146 - loss 0.09468401 - time (sec): 8.02 - samples/sec: 4729.48 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:06:06,622 epoch 3 - iter 140/146 - loss 0.09287278 - time (sec): 8.91 - samples/sec: 4744.25 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:06:07,059 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:07,060 EPOCH 3 done: loss 0.0934 - lr: 0.000024
2023-10-25 21:06:08,132 DEV : loss 0.09595039486885071 - f1-score (micro avg)  0.7332
2023-10-25 21:06:08,137 saving best model
2023-10-25 21:06:08,810 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:09,790 epoch 4 - iter 14/146 - loss 0.07411911 - time (sec): 0.98 - samples/sec: 5160.10 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:06:10,610 epoch 4 - iter 28/146 - loss 0.06840387 - time (sec): 1.80 - samples/sec: 4874.67 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:06:11,437 epoch 4 - iter 42/146 - loss 0.07292162 - time (sec): 2.62 - samples/sec: 4818.23 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:06:12,396 epoch 4 - iter 56/146 - loss 0.06469956 - time (sec): 3.58 - samples/sec: 4740.65 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:06:13,143 epoch 4 - iter 70/146 - loss 0.06518597 - time (sec): 4.33 - samples/sec: 4699.77 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:06:14,098 epoch 4 - iter 84/146 - loss 0.06696645 - time (sec): 5.29 - samples/sec: 4659.86 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:06:14,934 epoch 4 - iter 98/146 - loss 0.06661296 - time (sec): 6.12 - samples/sec: 4711.64 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:06:15,737 epoch 4 - iter 112/146 - loss 0.06367292 - time (sec): 6.92 - samples/sec: 4697.66 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:06:16,739 epoch 4 - iter 126/146 - loss 0.06160560 - time (sec): 7.93 - samples/sec: 4707.84 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:06:17,616 epoch 4 - iter 140/146 - loss 0.06032160 - time (sec): 8.80 - samples/sec: 4821.21 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:06:17,933 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:17,933 EPOCH 4 done: loss 0.0601 - lr: 0.000020
2023-10-25 21:06:18,846 DEV : loss 0.10524275153875351 - f1-score (micro avg)  0.7642
2023-10-25 21:06:18,850 saving best model
2023-10-25 21:06:19,534 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:20,415 epoch 5 - iter 14/146 - loss 0.02996552 - time (sec): 0.88 - samples/sec: 5280.96 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:06:21,159 epoch 5 - iter 28/146 - loss 0.03043510 - time (sec): 1.62 - samples/sec: 5157.31 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:06:22,023 epoch 5 - iter 42/146 - loss 0.03731623 - time (sec): 2.48 - samples/sec: 5263.23 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:06:22,930 epoch 5 - iter 56/146 - loss 0.03644991 - time (sec): 3.39 - samples/sec: 5056.90 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:06:23,888 epoch 5 - iter 70/146 - loss 0.03446408 - time (sec): 4.35 - samples/sec: 4852.99 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:06:24,758 epoch 5 - iter 84/146 - loss 0.03395159 - time (sec): 5.22 - samples/sec: 4784.45 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:06:25,718 epoch 5 - iter 98/146 - loss 0.03646640 - time (sec): 6.18 - samples/sec: 4713.73 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:06:26,607 epoch 5 - iter 112/146 - loss 0.03898602 - time (sec): 7.07 - samples/sec: 4726.24 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:06:27,672 epoch 5 - iter 126/146 - loss 0.03853003 - time (sec): 8.13 - samples/sec: 4730.88 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:06:28,445 epoch 5 - iter 140/146 - loss 0.03950165 - time (sec): 8.91 - samples/sec: 4785.32 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:06:28,835 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:28,836 EPOCH 5 done: loss 0.0395 - lr: 0.000017
2023-10-25 21:06:29,746 DEV : loss 0.10796511173248291 - f1-score (micro avg)  0.7617
2023-10-25 21:06:29,751 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:30,562 epoch 6 - iter 14/146 - loss 0.02045471 - time (sec): 0.81 - samples/sec: 5141.92 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:06:31,512 epoch 6 - iter 28/146 - loss 0.02431460 - time (sec): 1.76 - samples/sec: 4759.68 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:06:32,460 epoch 6 - iter 42/146 - loss 0.02349163 - time (sec): 2.71 - samples/sec: 4864.11 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:06:33,331 epoch 6 - iter 56/146 - loss 0.02207213 - time (sec): 3.58 - samples/sec: 4826.86 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:06:34,276 epoch 6 - iter 70/146 - loss 0.02538588 - time (sec): 4.52 - samples/sec: 4762.20 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:06:35,169 epoch 6 - iter 84/146 - loss 0.02401518 - time (sec): 5.42 - samples/sec: 4691.30 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:06:36,103 epoch 6 - iter 98/146 - loss 0.02319494 - time (sec): 6.35 - samples/sec: 4741.03 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:06:37,157 epoch 6 - iter 112/146 - loss 0.02399453 - time (sec): 7.41 - samples/sec: 4643.43 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:06:38,007 epoch 6 - iter 126/146 - loss 0.02529112 - time (sec): 8.26 - samples/sec: 4654.13 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:06:38,896 epoch 6 - iter 140/146 - loss 0.02430354 - time (sec): 9.14 - samples/sec: 4677.55 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:06:39,258 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:39,258 EPOCH 6 done: loss 0.0249 - lr: 0.000014
2023-10-25 21:06:40,322 DEV : loss 0.11456426978111267 - f1-score (micro avg)  0.7849
2023-10-25 21:06:40,327 saving best model
2023-10-25 21:06:40,998 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:41,831 epoch 7 - iter 14/146 - loss 0.01377810 - time (sec): 0.83 - samples/sec: 5025.09 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:06:43,030 epoch 7 - iter 28/146 - loss 0.02000949 - time (sec): 2.03 - samples/sec: 4964.90 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:06:43,818 epoch 7 - iter 42/146 - loss 0.02159133 - time (sec): 2.82 - samples/sec: 4774.67 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:06:44,680 epoch 7 - iter 56/146 - loss 0.02141280 - time (sec): 3.68 - samples/sec: 4696.42 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:06:45,530 epoch 7 - iter 70/146 - loss 0.02033340 - time (sec): 4.53 - samples/sec: 4722.77 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:06:46,479 epoch 7 - iter 84/146 - loss 0.01873330 - time (sec): 5.48 - samples/sec: 4804.21 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:06:47,374 epoch 7 - iter 98/146 - loss 0.01877254 - time (sec): 6.37 - samples/sec: 4815.74 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:06:48,168 epoch 7 - iter 112/146 - loss 0.01963305 - time (sec): 7.17 - samples/sec: 4765.05 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:06:48,984 epoch 7 - iter 126/146 - loss 0.01898464 - time (sec): 7.98 - samples/sec: 4793.92 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:06:49,883 epoch 7 - iter 140/146 - loss 0.01845447 - time (sec): 8.88 - samples/sec: 4782.28 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:06:50,299 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:50,299 EPOCH 7 done: loss 0.0181 - lr: 0.000010
2023-10-25 21:06:51,208 DEV : loss 0.13842682540416718 - f1-score (micro avg)  0.7788
2023-10-25 21:06:51,213 ----------------------------------------------------------------------------------------------------
2023-10-25 21:06:52,098 epoch 8 - iter 14/146 - loss 0.02179360 - time (sec): 0.88 - samples/sec: 4362.26 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:06:53,100 epoch 8 - iter 28/146 - loss 0.01457476 - time (sec): 1.89 - samples/sec: 4502.85 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:06:54,176 epoch 8 - iter 42/146 - loss 0.01363156 - time (sec): 2.96 - samples/sec: 4560.96 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:06:55,065 epoch 8 - iter 56/146 - loss 0.01358221 - time (sec): 3.85 - samples/sec: 4529.92 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:06:55,987 epoch 8 - iter 70/146 - loss 0.01502044 - time (sec): 4.77 - samples/sec: 4552.97 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:06:56,784 epoch 8 - iter 84/146 - loss 0.01489034 - time (sec): 5.57 - samples/sec: 4658.20 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:06:57,539 epoch 8 - iter 98/146 - loss 0.01449699 - time (sec): 6.32 - samples/sec: 4659.56 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:06:58,527 epoch 8 - iter 112/146 - loss 0.01427531 - time (sec): 7.31 - samples/sec: 4717.80 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:06:59,324 epoch 8 - iter 126/146 - loss 0.01369188 - time (sec): 8.11 - samples/sec: 4719.35 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:07:00,154 epoch 8 - iter 140/146 - loss 0.01269696 - time (sec): 8.94 - samples/sec: 4804.98 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:07:00,462 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:00,462 EPOCH 8 done: loss 0.0128 - lr: 0.000007
2023-10-25 21:07:01,376 DEV : loss 0.15397675335407257 - f1-score (micro avg)  0.7315
2023-10-25 21:07:01,380 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:02,243 epoch 9 - iter 14/146 - loss 0.00725404 - time (sec): 0.86 - samples/sec: 5396.63 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:07:03,063 epoch 9 - iter 28/146 - loss 0.00858283 - time (sec): 1.68 - samples/sec: 5303.83 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:07:03,844 epoch 9 - iter 42/146 - loss 0.00697012 - time (sec): 2.46 - samples/sec: 5096.55 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:07:04,896 epoch 9 - iter 56/146 - loss 0.00943578 - time (sec): 3.51 - samples/sec: 4980.37 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:07:05,865 epoch 9 - iter 70/146 - loss 0.01141288 - time (sec): 4.48 - samples/sec: 4965.81 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:07:06,742 epoch 9 - iter 84/146 - loss 0.01134343 - time (sec): 5.36 - samples/sec: 4920.44 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:07:07,668 epoch 9 - iter 98/146 - loss 0.01085725 - time (sec): 6.29 - samples/sec: 4919.30 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:07:08,425 epoch 9 - iter 112/146 - loss 0.01140709 - time (sec): 7.04 - samples/sec: 4880.19 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:09,309 epoch 9 - iter 126/146 - loss 0.01114475 - time (sec): 7.93 - samples/sec: 4855.84 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:10,174 epoch 9 - iter 140/146 - loss 0.01073435 - time (sec): 8.79 - samples/sec: 4862.91 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:10,483 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:10,483 EPOCH 9 done: loss 0.0109 - lr: 0.000004
2023-10-25 21:07:11,390 DEV : loss 0.15072251856327057 - f1-score (micro avg)  0.7598
2023-10-25 21:07:11,395 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:12,200 epoch 10 - iter 14/146 - loss 0.00398947 - time (sec): 0.80 - samples/sec: 5186.37 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:07:13,172 epoch 10 - iter 28/146 - loss 0.00875898 - time (sec): 1.78 - samples/sec: 4781.14 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:07:14,022 epoch 10 - iter 42/146 - loss 0.00922190 - time (sec): 2.63 - samples/sec: 4759.26 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:07:14,955 epoch 10 - iter 56/146 - loss 0.01137857 - time (sec): 3.56 - samples/sec: 4758.01 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:07:15,813 epoch 10 - iter 70/146 - loss 0.00974451 - time (sec): 4.42 - samples/sec: 4783.44 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:07:16,574 epoch 10 - iter 84/146 - loss 0.00891124 - time (sec): 5.18 - samples/sec: 4755.93 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:07:17,576 epoch 10 - iter 98/146 - loss 0.01023278 - time (sec): 6.18 - samples/sec: 4727.60 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:07:18,489 epoch 10 - iter 112/146 - loss 0.00961012 - time (sec): 7.09 - samples/sec: 4783.79 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:07:19,586 epoch 10 - iter 126/146 - loss 0.00898474 - time (sec): 8.19 - samples/sec: 4684.43 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:07:20,438 epoch 10 - iter 140/146 - loss 0.00851157 - time (sec): 9.04 - samples/sec: 4730.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:07:20,797 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:20,797 EPOCH 10 done: loss 0.0087 - lr: 0.000000
2023-10-25 21:07:21,711 DEV : loss 0.1518029123544693 - f1-score (micro avg)  0.7588
2023-10-25 21:07:22,231 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:22,233 Loading model from best epoch ...
2023-10-25 21:07:23,959 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
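The 17-tag dictionary above follows the BIOES scheme (Single, Begin, End, Inside, plus O) over four entity types. A minimal sketch of how such a tag sequence decodes into entity spans; `bioes_spans` is a hypothetical helper for illustration, not Flair's own decoder:

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans.

    S-X is a single-token entity; B-X ... E-X (with optional I-X between)
    is a multi-token entity; O closes any pending span without emitting it.
    """
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, _, name = tag.partition("-")
        if prefix == "S":
            spans.append((name, i, i + 1))
            start, label = None, None
        elif prefix == "B":
            start, label = i, name
        elif prefix == "E" and label == name:
            spans.append((name, start, i + 1))
            start, label = None, None
        # I-X simply continues an open span of the same label
    return spans


# e.g. a single-token LOC followed by a three-token PER
print(bioes_spans(["S-LOC", "O", "B-PER", "I-PER", "E-PER"]))
```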
2023-10-25 21:07:25,507 
Results:
- F-score (micro) 0.7581
- F-score (macro) 0.6581
- Accuracy 0.6331

By class:
              precision    recall  f1-score   support

         PER     0.7855    0.8420    0.8128       348
         LOC     0.6943    0.8352    0.7583       261
         ORG     0.4348    0.3846    0.4082        52
   HumanProd     0.5926    0.7273    0.6531        22

   micro avg     0.7197    0.8009    0.7581       683
   macro avg     0.6268    0.6973    0.6581       683
weighted avg     0.7177    0.8009    0.7560       683

2023-10-25 21:07:25,507 ----------------------------------------------------------------------------------------------------
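The averaged rows in the report above can be cross-checked from the per-class numbers: the macro F-score is the unweighted mean of the four per-class F1 values, while the micro F-score is the harmonic mean of the pooled precision and recall (0.7197 and 0.8009). A small sketch reproducing both from the table:

```python
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Per-class (precision, recall) taken from the report above
per_class = {
    "PER": (0.7855, 0.8420),
    "LOC": (0.6943, 0.8352),
    "ORG": (0.4348, 0.3846),
    "HumanProd": (0.5926, 0.7273),
}

# Macro avg: unweighted mean of per-class F1 -> ~0.6581
macro_f1 = sum(f1(p, r) for p, r in per_class.values()) / len(per_class)

# Micro avg: F1 of pooled precision/recall over all 683 entities -> ~0.7581
micro_f1 = f1(0.7197, 0.8009)

print(round(macro_f1, 4), round(micro_f1, 4))
```

Both values agree with the "F-score (macro) 0.6581" and "F-score (micro) 0.7581" lines reported above.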