Commit 5924672 by stefan-it (1 parent: 3ac3035)

Upload ./training.log with huggingface_hub

Files changed (1):
  training.log: +242 -0 (added)
2023-10-25 21:07:49,395 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Train: 1166 sentences
2023-10-25 21:07:49,396 (train_with_dev=False, train_with_test=False)
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Training Params:
2023-10-25 21:07:49,396 - learning_rate: "5e-05"
2023-10-25 21:07:49,396 - mini_batch_size: "8"
2023-10-25 21:07:49,396 - max_epochs: "10"
2023-10-25 21:07:49,396 - shuffle: "True"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Plugins:
2023-10-25 21:07:49,396 - TensorboardLogger
2023-10-25 21:07:49,396 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:07:49,396 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Computation:
2023-10-25 21:07:49,396 - compute on device: cuda:0
2023-10-25 21:07:49,396 - embedding storage: none
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,396 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:49,397 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:07:50,189 epoch 1 - iter 14/146 - loss 2.76093476 - time (sec): 0.79 - samples/sec: 4306.04 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:07:50,999 epoch 1 - iter 28/146 - loss 2.16927590 - time (sec): 1.60 - samples/sec: 4446.03 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:07:51,943 epoch 1 - iter 42/146 - loss 1.56576289 - time (sec): 2.55 - samples/sec: 4650.49 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:07:52,744 epoch 1 - iter 56/146 - loss 1.31295193 - time (sec): 3.35 - samples/sec: 4680.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:07:53,535 epoch 1 - iter 70/146 - loss 1.14715761 - time (sec): 4.14 - samples/sec: 4718.31 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:07:54,508 epoch 1 - iter 84/146 - loss 1.02143305 - time (sec): 5.11 - samples/sec: 4686.68 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:07:55,436 epoch 1 - iter 98/146 - loss 0.90968592 - time (sec): 6.04 - samples/sec: 4766.54 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:07:56,476 epoch 1 - iter 112/146 - loss 0.82600257 - time (sec): 7.08 - samples/sec: 4773.82 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:07:57,299 epoch 1 - iter 126/146 - loss 0.76327971 - time (sec): 7.90 - samples/sec: 4816.97 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:07:58,227 epoch 1 - iter 140/146 - loss 0.70269213 - time (sec): 8.83 - samples/sec: 4846.77 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:07:58,636 ----------------------------------------------------------------------------------------------------
2023-10-25 21:07:58,637 EPOCH 1 done: loss 0.6875 - lr: 0.000048
2023-10-25 21:07:59,149 DEV : loss 0.1478959619998932 - f1-score (micro avg) 0.6147
2023-10-25 21:07:59,153 saving best model
2023-10-25 21:07:59,662 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:00,616 epoch 2 - iter 14/146 - loss 0.16108065 - time (sec): 0.95 - samples/sec: 4811.08 - lr: 0.000050 - momentum: 0.000000
2023-10-25 21:08:01,592 epoch 2 - iter 28/146 - loss 0.15067828 - time (sec): 1.93 - samples/sec: 4899.11 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:08:02,467 epoch 2 - iter 42/146 - loss 0.15492269 - time (sec): 2.80 - samples/sec: 4896.36 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:08:03,367 epoch 2 - iter 56/146 - loss 0.16102186 - time (sec): 3.70 - samples/sec: 4832.43 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:08:04,144 epoch 2 - iter 70/146 - loss 0.15945960 - time (sec): 4.48 - samples/sec: 4838.41 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:08:04,909 epoch 2 - iter 84/146 - loss 0.16413615 - time (sec): 5.25 - samples/sec: 4824.82 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:08:05,687 epoch 2 - iter 98/146 - loss 0.15984296 - time (sec): 6.02 - samples/sec: 4843.16 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:08:06,672 epoch 2 - iter 112/146 - loss 0.15388793 - time (sec): 7.01 - samples/sec: 4844.91 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:08:07,514 epoch 2 - iter 126/146 - loss 0.14973376 - time (sec): 7.85 - samples/sec: 4895.41 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:08:08,387 epoch 2 - iter 140/146 - loss 0.14951333 - time (sec): 8.72 - samples/sec: 4914.47 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:08:08,740 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:08,740 EPOCH 2 done: loss 0.1491 - lr: 0.000045
2023-10-25 21:08:09,802 DEV : loss 0.10244478285312653 - f1-score (micro avg) 0.722
2023-10-25 21:08:09,806 saving best model
2023-10-25 21:08:10,488 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:11,393 epoch 3 - iter 14/146 - loss 0.09051348 - time (sec): 0.90 - samples/sec: 4622.43 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:08:12,169 epoch 3 - iter 28/146 - loss 0.08682021 - time (sec): 1.68 - samples/sec: 4486.31 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:08:13,122 epoch 3 - iter 42/146 - loss 0.08134884 - time (sec): 2.63 - samples/sec: 4551.81 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:08:13,976 epoch 3 - iter 56/146 - loss 0.07868931 - time (sec): 3.49 - samples/sec: 4454.44 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:08:15,095 epoch 3 - iter 70/146 - loss 0.08088185 - time (sec): 4.60 - samples/sec: 4610.30 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:08:15,955 epoch 3 - iter 84/146 - loss 0.08151152 - time (sec): 5.47 - samples/sec: 4731.13 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:08:16,823 epoch 3 - iter 98/146 - loss 0.08031537 - time (sec): 6.33 - samples/sec: 4791.34 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:08:17,535 epoch 3 - iter 112/146 - loss 0.08190989 - time (sec): 7.05 - samples/sec: 4836.59 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:08:18,326 epoch 3 - iter 126/146 - loss 0.08448286 - time (sec): 7.84 - samples/sec: 4838.21 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:08:19,203 epoch 3 - iter 140/146 - loss 0.08269211 - time (sec): 8.71 - samples/sec: 4854.04 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:08:19,638 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:19,639 EPOCH 3 done: loss 0.0836 - lr: 0.000039
2023-10-25 21:08:20,557 DEV : loss 0.10184833407402039 - f1-score (micro avg) 0.7212
2023-10-25 21:08:20,562 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:21,520 epoch 4 - iter 14/146 - loss 0.06152145 - time (sec): 0.96 - samples/sec: 5271.53 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:08:22,334 epoch 4 - iter 28/146 - loss 0.05606755 - time (sec): 1.77 - samples/sec: 4945.36 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:08:23,145 epoch 4 - iter 42/146 - loss 0.05920449 - time (sec): 2.58 - samples/sec: 4896.12 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:08:24,097 epoch 4 - iter 56/146 - loss 0.05389913 - time (sec): 3.53 - samples/sec: 4806.51 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:08:24,831 epoch 4 - iter 70/146 - loss 0.05447504 - time (sec): 4.27 - samples/sec: 4767.50 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:08:25,787 epoch 4 - iter 84/146 - loss 0.05784428 - time (sec): 5.22 - samples/sec: 4714.07 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:08:26,661 epoch 4 - iter 98/146 - loss 0.05700628 - time (sec): 6.10 - samples/sec: 4729.43 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:08:27,499 epoch 4 - iter 112/146 - loss 0.05350858 - time (sec): 6.94 - samples/sec: 4689.34 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:08:28,507 epoch 4 - iter 126/146 - loss 0.05232477 - time (sec): 7.94 - samples/sec: 4696.73 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:08:29,417 epoch 4 - iter 140/146 - loss 0.05225578 - time (sec): 8.85 - samples/sec: 4793.42 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:08:29,757 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:29,757 EPOCH 4 done: loss 0.0523 - lr: 0.000034
2023-10-25 21:08:30,673 DEV : loss 0.10563240945339203 - f1-score (micro avg) 0.7404
2023-10-25 21:08:30,678 saving best model
2023-10-25 21:08:31,335 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:32,246 epoch 5 - iter 14/146 - loss 0.02488614 - time (sec): 0.91 - samples/sec: 5101.79 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:08:33,018 epoch 5 - iter 28/146 - loss 0.02254102 - time (sec): 1.68 - samples/sec: 4976.30 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:08:33,914 epoch 5 - iter 42/146 - loss 0.02974599 - time (sec): 2.58 - samples/sec: 5077.10 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:08:34,838 epoch 5 - iter 56/146 - loss 0.02885809 - time (sec): 3.50 - samples/sec: 4900.07 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:08:35,787 epoch 5 - iter 70/146 - loss 0.02769486 - time (sec): 4.45 - samples/sec: 4744.69 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:08:36,619 epoch 5 - iter 84/146 - loss 0.02843650 - time (sec): 5.28 - samples/sec: 4729.31 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:08:37,565 epoch 5 - iter 98/146 - loss 0.03048721 - time (sec): 6.23 - samples/sec: 4678.06 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:08:38,453 epoch 5 - iter 112/146 - loss 0.03202213 - time (sec): 7.12 - samples/sec: 4695.52 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:08:39,677 epoch 5 - iter 126/146 - loss 0.03216288 - time (sec): 8.34 - samples/sec: 4614.52 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:08:40,447 epoch 5 - iter 140/146 - loss 0.03299174 - time (sec): 9.11 - samples/sec: 4678.91 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:08:40,842 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:40,842 EPOCH 5 done: loss 0.0327 - lr: 0.000028
2023-10-25 21:08:41,755 DEV : loss 0.10233564674854279 - f1-score (micro avg) 0.7706
2023-10-25 21:08:41,760 saving best model
2023-10-25 21:08:42,313 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:43,115 epoch 6 - iter 14/146 - loss 0.01610677 - time (sec): 0.80 - samples/sec: 5199.94 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:08:44,073 epoch 6 - iter 28/146 - loss 0.01771595 - time (sec): 1.76 - samples/sec: 4763.73 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:08:45,030 epoch 6 - iter 42/146 - loss 0.01749347 - time (sec): 2.72 - samples/sec: 4851.45 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:08:45,893 epoch 6 - iter 56/146 - loss 0.01637983 - time (sec): 3.58 - samples/sec: 4827.76 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:08:46,839 epoch 6 - iter 70/146 - loss 0.02013402 - time (sec): 4.52 - samples/sec: 4762.24 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:08:47,690 epoch 6 - iter 84/146 - loss 0.01826125 - time (sec): 5.38 - samples/sec: 4727.79 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:08:48,586 epoch 6 - iter 98/146 - loss 0.01955487 - time (sec): 6.27 - samples/sec: 4801.37 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:08:49,583 epoch 6 - iter 112/146 - loss 0.02207506 - time (sec): 7.27 - samples/sec: 4730.97 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:08:50,408 epoch 6 - iter 126/146 - loss 0.02262062 - time (sec): 8.09 - samples/sec: 4746.73 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:08:51,267 epoch 6 - iter 140/146 - loss 0.02172538 - time (sec): 8.95 - samples/sec: 4777.39 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:08:51,616 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:51,617 EPOCH 6 done: loss 0.0229 - lr: 0.000023
2023-10-25 21:08:52,528 DEV : loss 0.127528578042984 - f1-score (micro avg) 0.7511
2023-10-25 21:08:52,532 ----------------------------------------------------------------------------------------------------
2023-10-25 21:08:53,355 epoch 7 - iter 14/146 - loss 0.01150269 - time (sec): 0.82 - samples/sec: 5084.44 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:08:54,542 epoch 7 - iter 28/146 - loss 0.01576881 - time (sec): 2.01 - samples/sec: 5019.70 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:08:55,297 epoch 7 - iter 42/146 - loss 0.01762664 - time (sec): 2.76 - samples/sec: 4869.93 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:08:56,135 epoch 7 - iter 56/146 - loss 0.01668696 - time (sec): 3.60 - samples/sec: 4798.61 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:08:56,965 epoch 7 - iter 70/146 - loss 0.01515176 - time (sec): 4.43 - samples/sec: 4829.07 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:08:57,897 epoch 7 - iter 84/146 - loss 0.01337344 - time (sec): 5.36 - samples/sec: 4908.17 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:08:58,754 epoch 7 - iter 98/146 - loss 0.01419310 - time (sec): 6.22 - samples/sec: 4934.76 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:08:59,525 epoch 7 - iter 112/146 - loss 0.01622088 - time (sec): 6.99 - samples/sec: 4885.45 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:09:00,360 epoch 7 - iter 126/146 - loss 0.01549015 - time (sec): 7.83 - samples/sec: 4890.68 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:09:01,272 epoch 7 - iter 140/146 - loss 0.01516680 - time (sec): 8.74 - samples/sec: 4861.79 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:09:01,676 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:01,676 EPOCH 7 done: loss 0.0146 - lr: 0.000017
2023-10-25 21:09:02,587 DEV : loss 0.13177402317523956 - f1-score (micro avg) 0.7689
2023-10-25 21:09:02,591 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:03,473 epoch 8 - iter 14/146 - loss 0.02475525 - time (sec): 0.88 - samples/sec: 4376.44 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:09:04,487 epoch 8 - iter 28/146 - loss 0.01803622 - time (sec): 1.90 - samples/sec: 4481.02 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:09:05,574 epoch 8 - iter 42/146 - loss 0.01556120 - time (sec): 2.98 - samples/sec: 4529.62 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:09:06,474 epoch 8 - iter 56/146 - loss 0.01480972 - time (sec): 3.88 - samples/sec: 4494.81 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:09:07,396 epoch 8 - iter 70/146 - loss 0.01592848 - time (sec): 4.80 - samples/sec: 4523.32 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:09:08,211 epoch 8 - iter 84/146 - loss 0.01469362 - time (sec): 5.62 - samples/sec: 4618.26 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:09:08,991 epoch 8 - iter 98/146 - loss 0.01471966 - time (sec): 6.40 - samples/sec: 4605.33 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:09:09,977 epoch 8 - iter 112/146 - loss 0.01407256 - time (sec): 7.39 - samples/sec: 4671.32 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:09:10,789 epoch 8 - iter 126/146 - loss 0.01427331 - time (sec): 8.20 - samples/sec: 4669.48 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:09:11,640 epoch 8 - iter 140/146 - loss 0.01285895 - time (sec): 9.05 - samples/sec: 4747.94 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:09:11,945 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:11,946 EPOCH 8 done: loss 0.0125 - lr: 0.000012
2023-10-25 21:09:13,015 DEV : loss 0.14684733748435974 - f1-score (micro avg) 0.7458
2023-10-25 21:09:13,020 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:13,897 epoch 9 - iter 14/146 - loss 0.00374008 - time (sec): 0.88 - samples/sec: 5312.36 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:09:14,741 epoch 9 - iter 28/146 - loss 0.00602499 - time (sec): 1.72 - samples/sec: 5187.02 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:09:15,543 epoch 9 - iter 42/146 - loss 0.00469098 - time (sec): 2.52 - samples/sec: 4978.45 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:09:16,616 epoch 9 - iter 56/146 - loss 0.00530783 - time (sec): 3.59 - samples/sec: 4869.76 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:09:17,612 epoch 9 - iter 70/146 - loss 0.00888826 - time (sec): 4.59 - samples/sec: 4850.68 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:09:18,526 epoch 9 - iter 84/146 - loss 0.00861823 - time (sec): 5.50 - samples/sec: 4792.41 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:09:19,463 epoch 9 - iter 98/146 - loss 0.00801291 - time (sec): 6.44 - samples/sec: 4800.57 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:09:20,244 epoch 9 - iter 112/146 - loss 0.00827974 - time (sec): 7.22 - samples/sec: 4759.68 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:09:21,147 epoch 9 - iter 126/146 - loss 0.00760175 - time (sec): 8.13 - samples/sec: 4737.94 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:09:22,026 epoch 9 - iter 140/146 - loss 0.00722376 - time (sec): 9.00 - samples/sec: 4748.75 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:09:22,355 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:22,355 EPOCH 9 done: loss 0.0073 - lr: 0.000006
2023-10-25 21:09:23,267 DEV : loss 0.1549414098262787 - f1-score (micro avg) 0.7692
2023-10-25 21:09:23,271 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:24,097 epoch 10 - iter 14/146 - loss 0.00128690 - time (sec): 0.82 - samples/sec: 5057.66 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:09:24,940 epoch 10 - iter 28/146 - loss 0.00525616 - time (sec): 1.67 - samples/sec: 5092.22 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:09:25,838 epoch 10 - iter 42/146 - loss 0.00459227 - time (sec): 2.57 - samples/sec: 4870.74 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:09:26,823 epoch 10 - iter 56/146 - loss 0.00708331 - time (sec): 3.55 - samples/sec: 4768.34 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:09:27,709 epoch 10 - iter 70/146 - loss 0.00587957 - time (sec): 4.44 - samples/sec: 4762.88 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:09:28,497 epoch 10 - iter 84/146 - loss 0.00554849 - time (sec): 5.22 - samples/sec: 4713.50 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:09:29,533 epoch 10 - iter 98/146 - loss 0.00596721 - time (sec): 6.26 - samples/sec: 4666.95 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:09:30,455 epoch 10 - iter 112/146 - loss 0.00572547 - time (sec): 7.18 - samples/sec: 4724.35 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:09:31,589 epoch 10 - iter 126/146 - loss 0.00547473 - time (sec): 8.32 - samples/sec: 4612.75 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:09:32,468 epoch 10 - iter 140/146 - loss 0.00498630 - time (sec): 9.20 - samples/sec: 4651.83 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:09:32,839 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:32,839 EPOCH 10 done: loss 0.0049 - lr: 0.000000
2023-10-25 21:09:33,748 DEV : loss 0.15722544491291046 - f1-score (micro avg) 0.7706
2023-10-25 21:09:34,271 ----------------------------------------------------------------------------------------------------
2023-10-25 21:09:34,272 Loading model from best epoch ...
2023-10-25 21:09:35,987 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
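The 17 tags above are a BIOES encoding of four entity types (LOC, PER, ORG, HumanProd): S- marks a single-token entity, while B-/I-/E- mark the beginning, inside, and end of a multi-token one, plus O for non-entity tokens. A hypothetical decoder sketch (not Flair's internal code) showing how entity spans are recovered from such a tag sequence:

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, _, lab = tag.partition("-")
        if prefix == "S":                 # single-token entity
            spans.append((lab, i, i + 1))
            start, label = None, None
        elif prefix == "B":               # begin a multi-token entity
            start, label = i, lab
        elif prefix == "E" and start is not None and lab == label:
            spans.append((label, start, i + 1))   # end: emit the span
            start, label = None, None
        elif prefix == "I" and lab == label:
            continue                      # still inside the open span
        else:
            start, label = None, None     # malformed sequence: drop it
    return spans

# e.g. ["S-LOC", "O", "B-PER", "I-PER", "E-PER"]
# decodes to [("LOC", 0, 1), ("PER", 2, 5)]
```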
2023-10-25 21:09:37,504
Results:
- F-score (micro) 0.7628
- F-score (macro) 0.6702
- Accuracy 0.6352

By class:
              precision    recall  f1-score   support

         PER     0.8319    0.8391    0.8355       348
         LOC     0.6656    0.8314    0.7394       261
         ORG     0.4468    0.4038    0.4242        52
   HumanProd     0.6818    0.6818    0.6818        22

   micro avg     0.7306    0.7980    0.7628       683
   macro avg     0.6565    0.6890    0.6702       683
weighted avg     0.7342    0.7980    0.7625       683

2023-10-25 21:09:37,504 ----------------------------------------------------------------------------------------------------
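As a sanity check, the aggregate rows of the table above follow from the per-class rows: macro avg is the unweighted mean of the class F1 scores, weighted avg weights them by support, and the micro-avg F1 is the harmonic mean of micro precision and recall. A small sketch with the values transcribed from the table (helper names are illustrative):

```python
# Per-class (f1-score, support), copied from the evaluation table.
per_class = {
    "PER":       (0.8355, 348),
    "LOC":       (0.7394, 261),
    "ORG":       (0.4242,  52),
    "HumanProd": (0.6818,  22),
}

def macro_f1(per_class):
    """Unweighted mean of per-class F1 scores."""
    return sum(f1 for f1, _ in per_class.values()) / len(per_class)

def weighted_f1(per_class):
    """Support-weighted mean of per-class F1 scores."""
    total = sum(n for _, n in per_class.values())
    return sum(f1 * n for f1, n in per_class.values()) / total

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

assert abs(macro_f1(per_class) - 0.6702) < 5e-5       # macro avg row
assert abs(weighted_f1(per_class) - 0.7625) < 5e-5    # weighted avg row
assert abs(f1(0.7306, 0.7980) - 0.7628) < 5e-5        # micro avg row
```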