stefan-it committed on
Commit 410f4df
1 Parent(s): 916e0d2

Upload folder using huggingface_hub

best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d8894b6085f785253b0f5588417cdf608e5fc40f224a1c05aebf1fb70af03dc6
+ size 19050210
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 17:48:44 0.0000 1.3532 0.4525 0.0000 0.0000 0.0000 0.0000
+ 2 17:49:03 0.0000 0.5064 0.3579 0.2115 0.0344 0.0592 0.0308
+ 3 17:49:22 0.0000 0.4296 0.3420 0.4061 0.1759 0.2455 0.1428
+ 4 17:49:41 0.0000 0.3846 0.3150 0.4061 0.2705 0.3247 0.2012
+ 5 17:50:01 0.0000 0.3570 0.3158 0.3788 0.2823 0.3235 0.2019
+ 6 17:50:21 0.0000 0.3326 0.3055 0.3731 0.3182 0.3435 0.2174
+ 7 17:50:40 0.0000 0.3195 0.3022 0.3661 0.3346 0.3497 0.2222
+ 8 17:50:59 0.0000 0.3097 0.3043 0.3841 0.3253 0.3522 0.2235
+ 9 17:51:19 0.0000 0.3029 0.3005 0.3721 0.3378 0.3541 0.2257
+ 10 17:51:38 0.0000 0.3025 0.3002 0.3776 0.3401 0.3579 0.2283
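
The per-epoch table above can be inspected programmatically. The following is a minimal sketch, assuming whitespace-separated columns as rendered here; the rows are embedded as a string rather than read from `loss.tsv`, and the helper names `parse_loss_table` and `best_epoch` are hypothetical, not part of Flair.

```python
# Rows copied from the loss.tsv diff above (diff "+" markers removed).
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 17:48:44 0.0000 1.3532 0.4525 0.0000 0.0000 0.0000 0.0000
2 17:49:03 0.0000 0.5064 0.3579 0.2115 0.0344 0.0592 0.0308
3 17:49:22 0.0000 0.4296 0.3420 0.4061 0.1759 0.2455 0.1428
4 17:49:41 0.0000 0.3846 0.3150 0.4061 0.2705 0.3247 0.2012
5 17:50:01 0.0000 0.3570 0.3158 0.3788 0.2823 0.3235 0.2019
6 17:50:21 0.0000 0.3326 0.3055 0.3731 0.3182 0.3435 0.2174
7 17:50:40 0.0000 0.3195 0.3022 0.3661 0.3346 0.3497 0.2222
8 17:50:59 0.0000 0.3097 0.3043 0.3841 0.3253 0.3522 0.2235
9 17:51:19 0.0000 0.3029 0.3005 0.3721 0.3378 0.3541 0.2257
10 17:51:38 0.0000 0.3025 0.3002 0.3776 0.3401 0.3579 0.2283
"""


def parse_loss_table(text):
    """Split the table on whitespace and return one dict per epoch row."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def best_epoch(rows, metric="DEV_F1"):
    """Return the row with the highest value of the given metric."""
    return max(rows, key=lambda row: float(row[metric]))


rows = parse_loss_table(LOSS_TSV)
best = best_epoch(rows)
print(best["EPOCH"], best["DEV_F1"])  # prints: 10 0.3579
```

The `LEARNING_RATE` column reads `0.0000` in every row because the learning rate never exceeds 3e-05, which is below the four-decimal precision of this file.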
runs/events.out.tfevents.1697651308.46dc0c540dd0.2878.4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:947483c78418d15dc00283fad12fbd741dbef987db29cd29f805afde85db5bd5
+ size 502124
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,247 @@
+ 2023-10-18 17:48:28,457 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:48:28,458 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(32001, 128)
+         (position_embeddings): Embedding(512, 128)
+         (token_type_embeddings): Embedding(2, 128)
+         (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-1): 2 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=128, out_features=128, bias=True)
+                 (key): Linear(in_features=128, out_features=128, bias=True)
+                 (value): Linear(in_features=128, out_features=128, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=128, out_features=128, bias=True)
+                 (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=128, out_features=512, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=512, out_features=128, bias=True)
+               (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=128, out_features=128, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=128, out_features=21, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
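
The shapes printed in the module summary above are enough to estimate the number of trainable parameters. The sketch below is plain arithmetic, not a call into PyTorch or Flair; it assumes every `Linear` carries a bias (as printed, `bias=True`) and every `LayerNorm` has a weight and a bias vector (`elementwise_affine=True`). The helper names `linear` and `layer_norm` are hypothetical.

```python
def linear(n_in, n_out):
    """Parameters of a Linear layer with bias: weight matrix plus bias vector."""
    return n_in * n_out + n_out


def layer_norm(n):
    """Parameters of an affine LayerNorm: one weight and one bias per feature."""
    return 2 * n


# Sizes read off the printed module summary.
HIDDEN, FEED_FORWARD = 128, 512
VOCAB, POSITIONS, TOKEN_TYPES = 32001, 512, 2
LAYERS, TAGS = 2, 21

embeddings = (VOCAB + POSITIONS + TOKEN_TYPES) * HIDDEN + layer_norm(HIDDEN)
per_layer = (
    4 * linear(HIDDEN, HIDDEN)        # query, key, value, attention output
    + layer_norm(HIDDEN)              # LayerNorm after attention
    + linear(HIDDEN, FEED_FORWARD)    # intermediate dense
    + linear(FEED_FORWARD, HIDDEN)    # output dense
    + layer_norm(HIDDEN)              # LayerNorm after feed-forward
)
pooler = linear(HIDDEN, HIDDEN)
tagger_head = linear(HIDDEN, TAGS)

total = embeddings + LAYERS * per_layer + pooler + tagger_head
print(f"embeddings:  {embeddings:,}")
print(f"per layer:   {per_layer:,}")
print(f"total:       {total:,}")
```

Under these assumptions the total comes to roughly 4.58 million parameters, of which the word-embedding matrix (32001 x 128) accounts for about 4.1 million.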
+ 2023-10-18 17:48:28,458 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
+ - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
+ 2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:48:28,458 Train: 3575 sentences
+ 2023-10-18 17:48:28,458 (train_with_dev=False, train_with_test=False)
+ 2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:48:28,458 Training Params:
+ 2023-10-18 17:48:28,458 - learning_rate: "3e-05"
+ 2023-10-18 17:48:28,458 - mini_batch_size: "4"
+ 2023-10-18 17:48:28,458 - max_epochs: "10"
+ 2023-10-18 17:48:28,458 - shuffle: "True"
+ 2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:48:28,458 Plugins:
+ 2023-10-18 17:48:28,458 - TensorboardLogger
+ 2023-10-18 17:48:28,458 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
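
The training parameters and the `LinearScheduler` plugin listed above determine the `lr` values that appear in the iteration lines of this log. The following is a minimal sketch of a linear warmup followed by linear decay, assuming the schedule advances once per mini-batch and spans all 10 epochs; it is an illustrative reimplementation, not Flair's code, and the function name `linear_schedule_lr` is hypothetical.

```python
import math

TRAIN_SENTENCES = 3575
MINI_BATCH_SIZE = 4
MAX_EPOCHS = 10
PEAK_LR = 3e-5
WARMUP_FRACTION = 0.1

# 3575 sentences in batches of 4 give the 894 iterations per epoch seen in the log.
ITERS_PER_EPOCH = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_FRACTION)


def linear_schedule_lr(step):
    """Learning rate after `step` optimizer steps (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay: fall linearly from the peak to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


# Epoch 1, iteration 89, and epoch 2, iteration 890 (global step 894 + 890).
print(f"{linear_schedule_lr(89):.6f}")
print(f"{linear_schedule_lr(894 + 890):.6f}")
```

With these assumptions the warmup covers exactly the first epoch (894 of 8940 steps), which is consistent with the log reaching `lr: 0.000030` at the end of epoch 1 and decaying to `0.000000` by the end of epoch 10.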
+ 2023-10-18 17:48:28,458 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-18 17:48:28,458 - metric: "('micro avg', 'f1-score')"
+ 2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:48:28,458 Computation:
+ 2023-10-18 17:48:28,458 - compute on device: cuda:0
+ 2023-10-18 17:48:28,458 - embedding storage: none
+ 2023-10-18 17:48:28,458 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:48:28,459 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
+ 2023-10-18 17:48:28,459 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:48:28,459 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:48:28,459 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-18 17:48:29,748 epoch 1 - iter 89/894 - loss 3.19630071 - time (sec): 1.29 - samples/sec: 7045.16 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-18 17:48:31,082 epoch 1 - iter 178/894 - loss 2.96353199 - time (sec): 2.62 - samples/sec: 7187.38 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-18 17:48:32,421 epoch 1 - iter 267/894 - loss 2.74540200 - time (sec): 3.96 - samples/sec: 6702.85 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-18 17:48:33,790 epoch 1 - iter 356/894 - loss 2.43894253 - time (sec): 5.33 - samples/sec: 6411.16 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-18 17:48:35,196 epoch 1 - iter 445/894 - loss 2.12976788 - time (sec): 6.74 - samples/sec: 6277.64 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-18 17:48:36,585 epoch 1 - iter 534/894 - loss 1.88648563 - time (sec): 8.13 - samples/sec: 6263.25 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-18 17:48:37,976 epoch 1 - iter 623/894 - loss 1.67326583 - time (sec): 9.52 - samples/sec: 6416.47 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-18 17:48:39,326 epoch 1 - iter 712/894 - loss 1.53353448 - time (sec): 10.87 - samples/sec: 6403.24 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-18 17:48:40,702 epoch 1 - iter 801/894 - loss 1.42922638 - time (sec): 12.24 - samples/sec: 6366.34 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-18 17:48:42,044 epoch 1 - iter 890/894 - loss 1.35456652 - time (sec): 13.58 - samples/sec: 6351.81 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-18 17:48:42,100 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:48:42,100 EPOCH 1 done: loss 1.3532 - lr: 0.000030
+ 2023-10-18 17:48:44,322 DEV : loss 0.45246344804763794 - f1-score (micro avg) 0.0
+ 2023-10-18 17:48:44,344 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:48:45,711 epoch 2 - iter 89/894 - loss 0.52574422 - time (sec): 1.37 - samples/sec: 6183.97 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-18 17:48:47,080 epoch 2 - iter 178/894 - loss 0.55545770 - time (sec): 2.74 - samples/sec: 6374.43 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-18 17:48:48,520 epoch 2 - iter 267/894 - loss 0.54150980 - time (sec): 4.18 - samples/sec: 6041.74 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-18 17:48:49,990 epoch 2 - iter 356/894 - loss 0.53379165 - time (sec): 5.64 - samples/sec: 5904.15 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-18 17:48:51,410 epoch 2 - iter 445/894 - loss 0.53062271 - time (sec): 7.07 - samples/sec: 6094.43 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-18 17:48:52,758 epoch 2 - iter 534/894 - loss 0.52081983 - time (sec): 8.41 - samples/sec: 6121.72 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-18 17:48:54,219 epoch 2 - iter 623/894 - loss 0.51963233 - time (sec): 9.87 - samples/sec: 6246.67 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-18 17:48:55,574 epoch 2 - iter 712/894 - loss 0.51540611 - time (sec): 11.23 - samples/sec: 6162.82 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-18 17:48:56,953 epoch 2 - iter 801/894 - loss 0.51041159 - time (sec): 12.61 - samples/sec: 6165.15 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-18 17:48:58,341 epoch 2 - iter 890/894 - loss 0.50535478 - time (sec): 14.00 - samples/sec: 6164.74 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-18 17:48:58,402 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:48:58,403 EPOCH 2 done: loss 0.5064 - lr: 0.000027
+ 2023-10-18 17:49:03,571 DEV : loss 0.3578655421733856 - f1-score (micro avg) 0.0592
+ 2023-10-18 17:49:03,594 saving best model
+ 2023-10-18 17:49:03,627 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:49:05,037 epoch 3 - iter 89/894 - loss 0.48336938 - time (sec): 1.41 - samples/sec: 6484.71 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-18 17:49:06,438 epoch 3 - iter 178/894 - loss 0.47658882 - time (sec): 2.81 - samples/sec: 6324.71 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-18 17:49:07,814 epoch 3 - iter 267/894 - loss 0.45712463 - time (sec): 4.19 - samples/sec: 6340.69 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-18 17:49:09,179 epoch 3 - iter 356/894 - loss 0.46393711 - time (sec): 5.55 - samples/sec: 6172.52 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-18 17:49:10,558 epoch 3 - iter 445/894 - loss 0.44830358 - time (sec): 6.93 - samples/sec: 6114.08 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-18 17:49:11,950 epoch 3 - iter 534/894 - loss 0.44669476 - time (sec): 8.32 - samples/sec: 6129.35 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-18 17:49:13,338 epoch 3 - iter 623/894 - loss 0.43924135 - time (sec): 9.71 - samples/sec: 6133.73 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-18 17:49:14,741 epoch 3 - iter 712/894 - loss 0.43899574 - time (sec): 11.11 - samples/sec: 6177.45 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-18 17:49:16,164 epoch 3 - iter 801/894 - loss 0.43256443 - time (sec): 12.54 - samples/sec: 6201.00 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-18 17:49:17,548 epoch 3 - iter 890/894 - loss 0.42949594 - time (sec): 13.92 - samples/sec: 6194.26 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-18 17:49:17,606 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:49:17,606 EPOCH 3 done: loss 0.4296 - lr: 0.000023
+ 2023-10-18 17:49:22,809 DEV : loss 0.34198394417762756 - f1-score (micro avg) 0.2455
+ 2023-10-18 17:49:22,832 saving best model
+ 2023-10-18 17:49:22,865 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:49:24,232 epoch 4 - iter 89/894 - loss 0.41318889 - time (sec): 1.37 - samples/sec: 5961.29 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-18 17:49:25,688 epoch 4 - iter 178/894 - loss 0.38872500 - time (sec): 2.82 - samples/sec: 6480.28 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-18 17:49:27,081 epoch 4 - iter 267/894 - loss 0.38659751 - time (sec): 4.22 - samples/sec: 6377.23 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-18 17:49:28,484 epoch 4 - iter 356/894 - loss 0.39484288 - time (sec): 5.62 - samples/sec: 6340.75 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-18 17:49:29,929 epoch 4 - iter 445/894 - loss 0.38596894 - time (sec): 7.06 - samples/sec: 6296.23 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-18 17:49:31,291 epoch 4 - iter 534/894 - loss 0.38762013 - time (sec): 8.43 - samples/sec: 6266.21 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-18 17:49:32,694 epoch 4 - iter 623/894 - loss 0.38363860 - time (sec): 9.83 - samples/sec: 6235.32 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-18 17:49:34,090 epoch 4 - iter 712/894 - loss 0.38723100 - time (sec): 11.22 - samples/sec: 6214.74 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-18 17:49:35,490 epoch 4 - iter 801/894 - loss 0.38665888 - time (sec): 12.62 - samples/sec: 6162.80 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-18 17:49:36,891 epoch 4 - iter 890/894 - loss 0.38610173 - time (sec): 14.03 - samples/sec: 6143.24 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-18 17:49:36,957 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:49:36,957 EPOCH 4 done: loss 0.3846 - lr: 0.000020
+ 2023-10-18 17:49:41,899 DEV : loss 0.3150075674057007 - f1-score (micro avg) 0.3247
+ 2023-10-18 17:49:41,923 saving best model
+ 2023-10-18 17:49:41,956 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:49:43,421 epoch 5 - iter 89/894 - loss 0.38144140 - time (sec): 1.46 - samples/sec: 5968.14 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-18 17:49:44,878 epoch 5 - iter 178/894 - loss 0.35024717 - time (sec): 2.92 - samples/sec: 6219.91 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-18 17:49:46,330 epoch 5 - iter 267/894 - loss 0.35588428 - time (sec): 4.37 - samples/sec: 5912.08 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-18 17:49:47,698 epoch 5 - iter 356/894 - loss 0.35467769 - time (sec): 5.74 - samples/sec: 6017.05 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-18 17:49:49,081 epoch 5 - iter 445/894 - loss 0.36077938 - time (sec): 7.12 - samples/sec: 5971.89 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-18 17:49:50,469 epoch 5 - iter 534/894 - loss 0.36734085 - time (sec): 8.51 - samples/sec: 5956.77 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-18 17:49:51,914 epoch 5 - iter 623/894 - loss 0.36702674 - time (sec): 9.96 - samples/sec: 6018.41 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-18 17:49:53,392 epoch 5 - iter 712/894 - loss 0.36851786 - time (sec): 11.44 - samples/sec: 6059.43 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-18 17:49:54,785 epoch 5 - iter 801/894 - loss 0.36449745 - time (sec): 12.83 - samples/sec: 6069.19 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-18 17:49:56,193 epoch 5 - iter 890/894 - loss 0.35757282 - time (sec): 14.24 - samples/sec: 6057.28 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-18 17:49:56,259 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:49:56,259 EPOCH 5 done: loss 0.3570 - lr: 0.000017
+ 2023-10-18 17:50:01,547 DEV : loss 0.3157925307750702 - f1-score (micro avg) 0.3235
+ 2023-10-18 17:50:01,571 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:50:03,005 epoch 6 - iter 89/894 - loss 0.34112212 - time (sec): 1.43 - samples/sec: 5988.42 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-18 17:50:04,422 epoch 6 - iter 178/894 - loss 0.33324202 - time (sec): 2.85 - samples/sec: 5812.42 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-18 17:50:05,839 epoch 6 - iter 267/894 - loss 0.32141621 - time (sec): 4.27 - samples/sec: 5680.00 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-18 17:50:07,210 epoch 6 - iter 356/894 - loss 0.34308998 - time (sec): 5.64 - samples/sec: 5766.94 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-18 17:50:08,571 epoch 6 - iter 445/894 - loss 0.34151889 - time (sec): 7.00 - samples/sec: 5841.83 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-18 17:50:10,009 epoch 6 - iter 534/894 - loss 0.35052796 - time (sec): 8.44 - samples/sec: 6032.34 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-18 17:50:11,412 epoch 6 - iter 623/894 - loss 0.34315959 - time (sec): 9.84 - samples/sec: 6061.10 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-18 17:50:12,775 epoch 6 - iter 712/894 - loss 0.33510239 - time (sec): 11.20 - samples/sec: 6112.00 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-18 17:50:14,192 epoch 6 - iter 801/894 - loss 0.33637264 - time (sec): 12.62 - samples/sec: 6162.15 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-18 17:50:15,601 epoch 6 - iter 890/894 - loss 0.33235115 - time (sec): 14.03 - samples/sec: 6145.37 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-18 17:50:15,658 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:50:15,658 EPOCH 6 done: loss 0.3326 - lr: 0.000013
+ 2023-10-18 17:50:20,999 DEV : loss 0.30548417568206787 - f1-score (micro avg) 0.3435
+ 2023-10-18 17:50:21,022 saving best model
+ 2023-10-18 17:50:21,056 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:50:22,623 epoch 7 - iter 89/894 - loss 0.31222141 - time (sec): 1.57 - samples/sec: 5714.42 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-18 17:50:24,005 epoch 7 - iter 178/894 - loss 0.32705010 - time (sec): 2.95 - samples/sec: 5872.12 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-18 17:50:25,399 epoch 7 - iter 267/894 - loss 0.32263937 - time (sec): 4.34 - samples/sec: 5931.39 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-18 17:50:26,782 epoch 7 - iter 356/894 - loss 0.32544371 - time (sec): 5.73 - samples/sec: 5904.42 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-18 17:50:28,201 epoch 7 - iter 445/894 - loss 0.31601726 - time (sec): 7.14 - samples/sec: 5919.08 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-18 17:50:29,634 epoch 7 - iter 534/894 - loss 0.31615422 - time (sec): 8.58 - samples/sec: 6012.46 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-18 17:50:31,015 epoch 7 - iter 623/894 - loss 0.31916724 - time (sec): 9.96 - samples/sec: 6066.94 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-18 17:50:32,413 epoch 7 - iter 712/894 - loss 0.32144582 - time (sec): 11.36 - samples/sec: 6137.21 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-18 17:50:33,805 epoch 7 - iter 801/894 - loss 0.32125099 - time (sec): 12.75 - samples/sec: 6150.79 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-18 17:50:35,151 epoch 7 - iter 890/894 - loss 0.31897858 - time (sec): 14.09 - samples/sec: 6117.17 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-18 17:50:35,210 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:50:35,210 EPOCH 7 done: loss 0.3195 - lr: 0.000010
+ 2023-10-18 17:50:40,535 DEV : loss 0.3022187650203705 - f1-score (micro avg) 0.3497
+ 2023-10-18 17:50:40,559 saving best model
+ 2023-10-18 17:50:40,599 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:50:42,000 epoch 8 - iter 89/894 - loss 0.34555510 - time (sec): 1.40 - samples/sec: 6326.33 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-18 17:50:43,433 epoch 8 - iter 178/894 - loss 0.32944189 - time (sec): 2.83 - samples/sec: 6467.94 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-18 17:50:44,836 epoch 8 - iter 267/894 - loss 0.32918232 - time (sec): 4.24 - samples/sec: 6225.29 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-18 17:50:46,332 epoch 8 - iter 356/894 - loss 0.31980933 - time (sec): 5.73 - samples/sec: 6146.00 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-18 17:50:47,718 epoch 8 - iter 445/894 - loss 0.31177364 - time (sec): 7.12 - samples/sec: 6270.04 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-18 17:50:49,196 epoch 8 - iter 534/894 - loss 0.30996565 - time (sec): 8.60 - samples/sec: 6170.04 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-18 17:50:50,598 epoch 8 - iter 623/894 - loss 0.31266497 - time (sec): 10.00 - samples/sec: 6092.52 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-18 17:50:52,119 epoch 8 - iter 712/894 - loss 0.30729815 - time (sec): 11.52 - samples/sec: 6096.38 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-18 17:50:53,590 epoch 8 - iter 801/894 - loss 0.31538057 - time (sec): 12.99 - samples/sec: 6039.31 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-18 17:50:54,948 epoch 8 - iter 890/894 - loss 0.31034285 - time (sec): 14.35 - samples/sec: 6002.89 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-18 17:50:55,010 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:50:55,010 EPOCH 8 done: loss 0.3097 - lr: 0.000007
+ 2023-10-18 17:50:59,964 DEV : loss 0.30434364080429077 - f1-score (micro avg) 0.3522
+ 2023-10-18 17:50:59,988 saving best model
+ 2023-10-18 17:51:00,025 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:51:01,426 epoch 9 - iter 89/894 - loss 0.23959273 - time (sec): 1.40 - samples/sec: 6024.01 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-18 17:51:02,796 epoch 9 - iter 178/894 - loss 0.27105112 - time (sec): 2.77 - samples/sec: 6003.56 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-18 17:51:04,184 epoch 9 - iter 267/894 - loss 0.28143971 - time (sec): 4.16 - samples/sec: 6298.31 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-18 17:51:05,606 epoch 9 - iter 356/894 - loss 0.29591837 - time (sec): 5.58 - samples/sec: 6341.19 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-18 17:51:07,028 epoch 9 - iter 445/894 - loss 0.29997546 - time (sec): 7.00 - samples/sec: 6173.84 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-18 17:51:08,706 epoch 9 - iter 534/894 - loss 0.30091527 - time (sec): 8.68 - samples/sec: 5943.69 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-18 17:51:10,103 epoch 9 - iter 623/894 - loss 0.30243132 - time (sec): 10.08 - samples/sec: 5982.17 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-18 17:51:11,554 epoch 9 - iter 712/894 - loss 0.29460013 - time (sec): 11.53 - samples/sec: 6064.68 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-18 17:51:12,917 epoch 9 - iter 801/894 - loss 0.30229177 - time (sec): 12.89 - samples/sec: 6047.04 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-18 17:51:14,303 epoch 9 - iter 890/894 - loss 0.30375418 - time (sec): 14.28 - samples/sec: 6036.73 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-18 17:51:14,365 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:51:14,365 EPOCH 9 done: loss 0.3029 - lr: 0.000003
+ 2023-10-18 17:51:19,332 DEV : loss 0.30048123002052307 - f1-score (micro avg) 0.3541
+ 2023-10-18 17:51:19,357 saving best model
+ 2023-10-18 17:51:19,393 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:51:20,890 epoch 10 - iter 89/894 - loss 0.29449068 - time (sec): 1.50 - samples/sec: 6558.64 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-18 17:51:22,283 epoch 10 - iter 178/894 - loss 0.30220126 - time (sec): 2.89 - samples/sec: 6265.58 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-18 17:51:23,707 epoch 10 - iter 267/894 - loss 0.28820853 - time (sec): 4.31 - samples/sec: 6138.84 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-18 17:51:25,097 epoch 10 - iter 356/894 - loss 0.29937325 - time (sec): 5.70 - samples/sec: 6225.42 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-18 17:51:26,464 epoch 10 - iter 445/894 - loss 0.29761073 - time (sec): 7.07 - samples/sec: 6096.08 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-18 17:51:27,899 epoch 10 - iter 534/894 - loss 0.29810315 - time (sec): 8.51 - samples/sec: 6153.61 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-18 17:51:29,274 epoch 10 - iter 623/894 - loss 0.30838358 - time (sec): 9.88 - samples/sec: 6263.76 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-18 17:51:30,631 epoch 10 - iter 712/894 - loss 0.31079931 - time (sec): 11.24 - samples/sec: 6198.45 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-18 17:51:31,990 epoch 10 - iter 801/894 - loss 0.30309182 - time (sec): 12.60 - samples/sec: 6192.14 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-18 17:51:33,352 epoch 10 - iter 890/894 - loss 0.30297146 - time (sec): 13.96 - samples/sec: 6148.71 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-18 17:51:33,434 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:51:33,434 EPOCH 10 done: loss 0.3025 - lr: 0.000000
+ 2023-10-18 17:51:38,772 DEV : loss 0.300153523683548 - f1-score (micro avg) 0.3579
+ 2023-10-18 17:51:38,797 saving best model
+ 2023-10-18 17:51:38,868 ----------------------------------------------------------------------------------------------------
+ 2023-10-18 17:51:38,868 Loading model from best epoch ...
+ 2023-10-18 17:51:38,945 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
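
The 21 tags listed above follow a BIOES-style scheme: `O` plus the prefixes `S` (single-token entity), `B` (begin), `I` (inside) and `E` (end) for each of the five entity types. The sketch below rebuilds that tag list and shows one strict way to turn a tag sequence into entity spans. It is an illustration only; the function `bioes_to_spans` is hypothetical and Flair's own span decoding may treat malformed sequences differently.

```python
ENTITY_TYPES = ["loc", "pers", "org", "prod", "time"]

# "O" plus four prefixed tags per entity type, in the order printed in the log.
TAGS = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in "SBEI"]


def bioes_to_spans(tags):
    """Return (start, end, label) spans, with inclusive token indices.

    Strict decoding: a multi-token span must start with B-, continue with I-
    and close with E- of the same label; anything else discards the open span.
    """
    spans = []
    open_span = None  # (start_index, label) of a span begun by a B- tag
    for i, tag in enumerate(tags):
        if tag == "O":
            open_span = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":
            spans.append((i, i, label))
            open_span = None
        elif prefix == "B":
            open_span = (i, label)
        elif prefix == "I" and open_span is not None and open_span[1] == label:
            continue  # still inside the open span
        elif prefix == "E" and open_span is not None and open_span[1] == label:
            spans.append((open_span[0], i, label))
            open_span = None
        else:
            open_span = None  # malformed sequence: drop the open span
    return spans


example = ["O", "B-pers", "I-pers", "E-pers", "O", "S-loc"]
print(len(TAGS))                 # prints: 21
print(bioes_to_spans(example))   # prints: [(1, 3, 'pers'), (5, 5, 'loc')]
```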
+ 2023-10-18 17:51:41,291
+ Results:
+ - F-score (micro) 0.3715
+ - F-score (macro) 0.1528
+ - Accuracy 0.2374
+
+ By class:
+               precision    recall  f1-score   support
+
+          loc     0.4882    0.5923    0.5353       596
+         pers     0.1927    0.2072    0.1997       333
+          org     0.0000    0.0000    0.0000       132
+         time     0.0500    0.0204    0.0290        49
+         prod     0.0000    0.0000    0.0000        66
+
+    micro avg     0.3842    0.3597    0.3715      1176
+    macro avg     0.1462    0.1640    0.1528      1176
+ weighted avg     0.3041    0.3597    0.3290      1176
+
+ 2023-10-18 17:51:41,291 ----------------------------------------------------------------------------------------------------
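
The summary rows of the report above can be cross-checked against the per-class rows. This is a minimal sketch using only the rounded numbers printed in the log, so the recomputed values agree with the report only to about four decimals; the variable names are illustrative.

```python
# (precision, recall, f1, support) per class, copied from the report above.
per_class = {
    "loc":  (0.4882, 0.5923, 0.5353, 596),
    "pers": (0.1927, 0.2072, 0.1997, 333),
    "org":  (0.0000, 0.0000, 0.0000, 132),
    "time": (0.0500, 0.0204, 0.0290, 49),
    "prod": (0.0000, 0.0000, 0.0000, 66),
}

total_support = sum(support for _, _, _, support in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by the class support.
weighted_f1 = sum(f1 * support for _, _, f1, support in per_class.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_precision, micro_recall = 0.3842, 0.3597
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(total_support)
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"micro F1    {micro_f1:.4f}")
```

The gap between the micro F1 (0.3715) and the macro F1 (0.1528) reflects the class imbalance: `loc` dominates the support and is the only class with a moderate score, while `org` and `prod` are never predicted correctly.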