Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- test.tsv +0 -0
- training.log +243 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2db67e9b0a4d5462817d2fafaf260f0136ac5e8af7e51966d7f2aa80fcd6eb78
+size 443335879
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      13:20:41   0.0000         0.6389      0.1785    0.7203         0.6122      0.6619  0.5045
+2      13:21:19   0.0000         0.1514      0.1278    0.6438         0.7631      0.6984  0.5571
+3      13:21:56   0.0000         0.0808      0.1350    0.7082         0.7592      0.7328  0.5946
+4      13:22:35   0.0000         0.0529      0.1548    0.7687         0.7795      0.7741  0.6516
+5      13:23:13   0.0000         0.0353      0.1776    0.7543         0.7826      0.7682  0.6425
+6      13:23:50   0.0000         0.0205      0.2128    0.7598         0.7568      0.7583  0.6343
+7      13:24:27   0.0000         0.0163      0.2125    0.7545         0.7928      0.7732  0.6463
+8      13:25:05   0.0000         0.0096      0.2141    0.7734         0.7952      0.7841  0.6630
+9      13:25:43   0.0000         0.0066      0.2306    0.7703         0.7920      0.7810  0.6578
+10     13:26:21   0.0000         0.0030      0.2350    0.7713         0.7991      0.7849  0.6632
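The rows of loss.tsv can be checked programmatically. The following is a minimal sketch, not part of the commit, that parses the rows shown above and picks the epoch with the highest DEV_F1 and the one with the lowest DEV_LOSS. The helper names `parse_loss_table` and `best_row` are hypothetical, and the rows are embedded as whitespace-separated text (the real file is presumably tab-separated).

```python
# Rows copied from the loss.tsv diff; splitting on any whitespace is an assumption
# that also works for the tab-separated original.
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 13:20:41 0.0000 0.6389 0.1785 0.7203 0.6122 0.6619 0.5045
2 13:21:19 0.0000 0.1514 0.1278 0.6438 0.7631 0.6984 0.5571
3 13:21:56 0.0000 0.0808 0.1350 0.7082 0.7592 0.7328 0.5946
4 13:22:35 0.0000 0.0529 0.1548 0.7687 0.7795 0.7741 0.6516
5 13:23:13 0.0000 0.0353 0.1776 0.7543 0.7826 0.7682 0.6425
6 13:23:50 0.0000 0.0205 0.2128 0.7598 0.7568 0.7583 0.6343
7 13:24:27 0.0000 0.0163 0.2125 0.7545 0.7928 0.7732 0.6463
8 13:25:05 0.0000 0.0096 0.2141 0.7734 0.7952 0.7841 0.6630
9 13:25:43 0.0000 0.0066 0.2306 0.7703 0.7920 0.7810 0.6578
10 13:26:21 0.0000 0.0030 0.2350 0.7713 0.7991 0.7849 0.6632
"""


def parse_loss_table(text):
    """Turn the header and rows into a list of dicts keyed by column name."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def best_row(rows, column, pick=max):
    """Return the row whose numeric value in `column` is selected by `pick`."""
    return pick(rows, key=lambda row: float(row[column]))


rows = parse_loss_table(LOSS_TSV)
best_f1 = best_row(rows, "DEV_F1", pick=max)
lowest_dev_loss = best_row(rows, "DEV_LOSS", pick=min)

print("best DEV_F1:", best_f1["EPOCH"], best_f1["DEV_F1"])
print("lowest DEV_LOSS:", lowest_dev_loss["EPOCH"], lowest_dev_loss["DEV_LOSS"])
```

On these rows the highest DEV_F1 falls in epoch 10 while the lowest DEV_LOSS falls in epoch 2, so the two criteria select different checkpoints.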
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,243 @@
+2023-10-13 13:20:08,359 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:20:08,360 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=768, out_features=768, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=21, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-13 13:20:08,360 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:20:08,360 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
+ - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
+2023-10-13 13:20:08,360 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:20:08,360 Train: 3575 sentences
+2023-10-13 13:20:08,360 (train_with_dev=False, train_with_test=False)
+2023-10-13 13:20:08,360 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:20:08,360 Training Params:
+2023-10-13 13:20:08,360 - learning_rate: "5e-05"
+2023-10-13 13:20:08,360 - mini_batch_size: "8"
+2023-10-13 13:20:08,360 - max_epochs: "10"
+2023-10-13 13:20:08,360 - shuffle: "True"
+2023-10-13 13:20:08,361 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:20:08,361 Plugins:
+2023-10-13 13:20:08,361 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-13 13:20:08,361 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:20:08,361 Final evaluation on model from best epoch (best-model.pt)
+2023-10-13 13:20:08,361 - metric: "('micro avg', 'f1-score')"
+2023-10-13 13:20:08,361 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:20:08,361 Computation:
+2023-10-13 13:20:08,361 - compute on device: cuda:0
+2023-10-13 13:20:08,361 - embedding storage: none
+2023-10-13 13:20:08,361 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:20:08,361 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
+2023-10-13 13:20:08,361 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:20:08,361 ----------------------------------------------------------------------------------------------------
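The logged parameters (learning_rate 5e-05, mini_batch_size 8, max_epochs 10, warmup_fraction 0.1, 3575 training sentences) are enough to reproduce the `lr` values printed in the per-iteration lines. The following is a minimal sketch of a generic linear warmup followed by linear decay. It is an assumption that Flair's LinearScheduler plugin follows exactly this formula, and the function name `linear_schedule_lr` is hypothetical; at the six-decimal precision of the log, a one-step offset in either direction would not be visible.

```python
import math


def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0 (generic formula)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / (total_steps - warmup_steps))


# Values taken from the log header.
train_sentences = 3575
mini_batch_size = 8
max_epochs = 10
peak_lr = 5e-05
warmup_fraction = 0.1

steps_per_epoch = math.ceil(train_sentences / mini_batch_size)  # 447, as in "iter 44/447"
total_steps = steps_per_epoch * max_epochs


def lr_at(epoch, iteration):
    """Learning rate at a given (1-based) epoch and iteration, formatted like the log."""
    step = (epoch - 1) * steps_per_epoch + iteration
    return f"{linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):.6f}"


for epoch, iteration in [(1, 44), (1, 440), (2, 440), (5, 44), (10, 440)]:
    print(f"epoch {epoch} iter {iteration}/{steps_per_epoch} lr: {lr_at(epoch, iteration)}")
```

Under these assumptions the warmup covers exactly the first epoch (447 of 4470 steps), which matches the logged learning rate rising during epoch 1 and falling from epoch 2 onward.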
+2023-10-13 13:20:11,185 epoch 1 - iter 44/447 - loss 2.98851820 - time (sec): 2.82 - samples/sec: 3102.17 - lr: 0.000005 - momentum: 0.000000
+2023-10-13 13:20:14,004 epoch 1 - iter 88/447 - loss 1.95325123 - time (sec): 5.64 - samples/sec: 3148.92 - lr: 0.000010 - momentum: 0.000000
+2023-10-13 13:20:16,621 epoch 1 - iter 132/447 - loss 1.49794150 - time (sec): 8.26 - samples/sec: 3114.70 - lr: 0.000015 - momentum: 0.000000
+2023-10-13 13:20:19,630 epoch 1 - iter 176/447 - loss 1.20997650 - time (sec): 11.27 - samples/sec: 3062.86 - lr: 0.000020 - momentum: 0.000000
+2023-10-13 13:20:22,366 epoch 1 - iter 220/447 - loss 1.03351243 - time (sec): 14.00 - samples/sec: 3050.14 - lr: 0.000024 - momentum: 0.000000
+2023-10-13 13:20:25,109 epoch 1 - iter 264/447 - loss 0.91449018 - time (sec): 16.75 - samples/sec: 3041.16 - lr: 0.000029 - momentum: 0.000000
+2023-10-13 13:20:27,854 epoch 1 - iter 308/447 - loss 0.82776650 - time (sec): 19.49 - samples/sec: 3038.53 - lr: 0.000034 - momentum: 0.000000
+2023-10-13 13:20:30,594 epoch 1 - iter 352/447 - loss 0.75655567 - time (sec): 22.23 - samples/sec: 3042.01 - lr: 0.000039 - momentum: 0.000000
+2023-10-13 13:20:33,213 epoch 1 - iter 396/447 - loss 0.69589232 - time (sec): 24.85 - samples/sec: 3043.96 - lr: 0.000044 - momentum: 0.000000
+2023-10-13 13:20:36,387 epoch 1 - iter 440/447 - loss 0.64524193 - time (sec): 28.03 - samples/sec: 3041.44 - lr: 0.000049 - momentum: 0.000000
+2023-10-13 13:20:36,800 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:20:36,800 EPOCH 1 done: loss 0.6389 - lr: 0.000049
+2023-10-13 13:20:41,834 DEV : loss 0.17851579189300537 - f1-score (micro avg) 0.6619
+2023-10-13 13:20:41,868 saving best model
+2023-10-13 13:20:42,189 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:20:45,136 epoch 2 - iter 44/447 - loss 0.18786776 - time (sec): 2.95 - samples/sec: 3040.94 - lr: 0.000049 - momentum: 0.000000
+2023-10-13 13:20:48,309 epoch 2 - iter 88/447 - loss 0.18005101 - time (sec): 6.12 - samples/sec: 3024.51 - lr: 0.000049 - momentum: 0.000000
+2023-10-13 13:20:50,957 epoch 2 - iter 132/447 - loss 0.17172587 - time (sec): 8.77 - samples/sec: 2987.88 - lr: 0.000048 - momentum: 0.000000
+2023-10-13 13:20:53,671 epoch 2 - iter 176/447 - loss 0.17670397 - time (sec): 11.48 - samples/sec: 3002.83 - lr: 0.000048 - momentum: 0.000000
+2023-10-13 13:20:56,617 epoch 2 - iter 220/447 - loss 0.17152176 - time (sec): 14.43 - samples/sec: 2986.38 - lr: 0.000047 - momentum: 0.000000
+2023-10-13 13:20:59,327 epoch 2 - iter 264/447 - loss 0.16193941 - time (sec): 17.14 - samples/sec: 3019.15 - lr: 0.000047 - momentum: 0.000000
+2023-10-13 13:21:01,946 epoch 2 - iter 308/447 - loss 0.15929042 - time (sec): 19.75 - samples/sec: 3022.81 - lr: 0.000046 - momentum: 0.000000
+2023-10-13 13:21:04,541 epoch 2 - iter 352/447 - loss 0.15834174 - time (sec): 22.35 - samples/sec: 3031.62 - lr: 0.000046 - momentum: 0.000000
+2023-10-13 13:21:07,674 epoch 2 - iter 396/447 - loss 0.15381226 - time (sec): 25.48 - samples/sec: 3014.93 - lr: 0.000045 - momentum: 0.000000
+2023-10-13 13:21:10,380 epoch 2 - iter 440/447 - loss 0.15235393 - time (sec): 28.19 - samples/sec: 3024.09 - lr: 0.000045 - momentum: 0.000000
+2023-10-13 13:21:10,888 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:21:10,889 EPOCH 2 done: loss 0.1514 - lr: 0.000045
+2023-10-13 13:21:19,373 DEV : loss 0.12778525054454803 - f1-score (micro avg) 0.6984
+2023-10-13 13:21:19,406 saving best model
+2023-10-13 13:21:19,820 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:21:22,656 epoch 3 - iter 44/447 - loss 0.10065484 - time (sec): 2.83 - samples/sec: 3018.94 - lr: 0.000044 - momentum: 0.000000
+2023-10-13 13:21:25,786 epoch 3 - iter 88/447 - loss 0.08666742 - time (sec): 5.96 - samples/sec: 2992.75 - lr: 0.000043 - momentum: 0.000000
+2023-10-13 13:21:28,598 epoch 3 - iter 132/447 - loss 0.08572550 - time (sec): 8.78 - samples/sec: 2986.73 - lr: 0.000043 - momentum: 0.000000
+2023-10-13 13:21:31,298 epoch 3 - iter 176/447 - loss 0.08491542 - time (sec): 11.48 - samples/sec: 3003.02 - lr: 0.000042 - momentum: 0.000000
+2023-10-13 13:21:33,912 epoch 3 - iter 220/447 - loss 0.08321697 - time (sec): 14.09 - samples/sec: 2985.23 - lr: 0.000042 - momentum: 0.000000
+2023-10-13 13:21:36,687 epoch 3 - iter 264/447 - loss 0.08312643 - time (sec): 16.87 - samples/sec: 2997.64 - lr: 0.000041 - momentum: 0.000000
+2023-10-13 13:21:39,389 epoch 3 - iter 308/447 - loss 0.08303934 - time (sec): 19.57 - samples/sec: 2999.67 - lr: 0.000041 - momentum: 0.000000
+2023-10-13 13:21:42,237 epoch 3 - iter 352/447 - loss 0.07934551 - time (sec): 22.41 - samples/sec: 3012.75 - lr: 0.000040 - momentum: 0.000000
+2023-10-13 13:21:44,842 epoch 3 - iter 396/447 - loss 0.08105453 - time (sec): 25.02 - samples/sec: 3030.91 - lr: 0.000040 - momentum: 0.000000
+2023-10-13 13:21:47,987 epoch 3 - iter 440/447 - loss 0.08108549 - time (sec): 28.16 - samples/sec: 3028.88 - lr: 0.000039 - momentum: 0.000000
+2023-10-13 13:21:48,390 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:21:48,390 EPOCH 3 done: loss 0.0808 - lr: 0.000039
+2023-10-13 13:21:56,816 DEV : loss 0.13502156734466553 - f1-score (micro avg) 0.7328
+2023-10-13 13:21:56,851 saving best model
+2023-10-13 13:21:57,269 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:21:59,999 epoch 4 - iter 44/447 - loss 0.04201711 - time (sec): 2.72 - samples/sec: 3098.25 - lr: 0.000038 - momentum: 0.000000
+2023-10-13 13:22:02,677 epoch 4 - iter 88/447 - loss 0.05222326 - time (sec): 5.40 - samples/sec: 3037.66 - lr: 0.000038 - momentum: 0.000000
+2023-10-13 13:22:05,495 epoch 4 - iter 132/447 - loss 0.04940786 - time (sec): 8.22 - samples/sec: 3037.73 - lr: 0.000037 - momentum: 0.000000
+2023-10-13 13:22:08,212 epoch 4 - iter 176/447 - loss 0.04414691 - time (sec): 10.94 - samples/sec: 3046.88 - lr: 0.000037 - momentum: 0.000000
+2023-10-13 13:22:11,658 epoch 4 - iter 220/447 - loss 0.04915488 - time (sec): 14.38 - samples/sec: 3015.37 - lr: 0.000036 - momentum: 0.000000
+2023-10-13 13:22:14,498 epoch 4 - iter 264/447 - loss 0.05177181 - time (sec): 17.22 - samples/sec: 3015.62 - lr: 0.000036 - momentum: 0.000000
+2023-10-13 13:22:17,164 epoch 4 - iter 308/447 - loss 0.04997384 - time (sec): 19.89 - samples/sec: 3006.17 - lr: 0.000035 - momentum: 0.000000
+2023-10-13 13:22:19,895 epoch 4 - iter 352/447 - loss 0.05023042 - time (sec): 22.62 - samples/sec: 2998.23 - lr: 0.000035 - momentum: 0.000000
+2023-10-13 13:22:23,206 epoch 4 - iter 396/447 - loss 0.05174515 - time (sec): 25.93 - samples/sec: 2981.06 - lr: 0.000034 - momentum: 0.000000
+2023-10-13 13:22:25,978 epoch 4 - iter 440/447 - loss 0.05238596 - time (sec): 28.70 - samples/sec: 2969.31 - lr: 0.000033 - momentum: 0.000000
+2023-10-13 13:22:26,430 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:22:26,431 EPOCH 4 done: loss 0.0529 - lr: 0.000033
+2023-10-13 13:22:35,016 DEV : loss 0.15478116273880005 - f1-score (micro avg) 0.7741
+2023-10-13 13:22:35,048 saving best model
+2023-10-13 13:22:35,500 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:22:38,504 epoch 5 - iter 44/447 - loss 0.03979897 - time (sec): 3.00 - samples/sec: 2988.15 - lr: 0.000033 - momentum: 0.000000
+2023-10-13 13:22:41,411 epoch 5 - iter 88/447 - loss 0.03334224 - time (sec): 5.91 - samples/sec: 2922.69 - lr: 0.000032 - momentum: 0.000000
+2023-10-13 13:22:44,290 epoch 5 - iter 132/447 - loss 0.03321982 - time (sec): 8.79 - samples/sec: 2971.55 - lr: 0.000032 - momentum: 0.000000
+2023-10-13 13:22:47,083 epoch 5 - iter 176/447 - loss 0.03182161 - time (sec): 11.58 - samples/sec: 2994.52 - lr: 0.000031 - momentum: 0.000000
+2023-10-13 13:22:49,725 epoch 5 - iter 220/447 - loss 0.03370102 - time (sec): 14.22 - samples/sec: 2998.29 - lr: 0.000031 - momentum: 0.000000
+2023-10-13 13:22:52,582 epoch 5 - iter 264/447 - loss 0.03501394 - time (sec): 17.08 - samples/sec: 3001.26 - lr: 0.000030 - momentum: 0.000000
+2023-10-13 13:22:55,826 epoch 5 - iter 308/447 - loss 0.03844377 - time (sec): 20.32 - samples/sec: 2980.60 - lr: 0.000030 - momentum: 0.000000
+2023-10-13 13:22:58,482 epoch 5 - iter 352/447 - loss 0.03808194 - time (sec): 22.98 - samples/sec: 2984.83 - lr: 0.000029 - momentum: 0.000000
+2023-10-13 13:23:01,347 epoch 5 - iter 396/447 - loss 0.03629037 - time (sec): 25.85 - samples/sec: 2970.33 - lr: 0.000028 - momentum: 0.000000
+2023-10-13 13:23:04,266 epoch 5 - iter 440/447 - loss 0.03561767 - time (sec): 28.76 - samples/sec: 2968.14 - lr: 0.000028 - momentum: 0.000000
+2023-10-13 13:23:04,682 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:23:04,682 EPOCH 5 done: loss 0.0353 - lr: 0.000028
+2023-10-13 13:23:13,184 DEV : loss 0.17760951817035675 - f1-score (micro avg) 0.7682
+2023-10-13 13:23:13,217 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:23:16,042 epoch 6 - iter 44/447 - loss 0.01877968 - time (sec): 2.82 - samples/sec: 3048.07 - lr: 0.000027 - momentum: 0.000000
+2023-10-13 13:23:18,942 epoch 6 - iter 88/447 - loss 0.01932285 - time (sec): 5.72 - samples/sec: 3077.70 - lr: 0.000027 - momentum: 0.000000
+2023-10-13 13:23:21,625 epoch 6 - iter 132/447 - loss 0.02015927 - time (sec): 8.41 - samples/sec: 3088.96 - lr: 0.000026 - momentum: 0.000000
+2023-10-13 13:23:24,907 epoch 6 - iter 176/447 - loss 0.01821601 - time (sec): 11.69 - samples/sec: 3065.39 - lr: 0.000026 - momentum: 0.000000
+2023-10-13 13:23:27,771 epoch 6 - iter 220/447 - loss 0.01881791 - time (sec): 14.55 - samples/sec: 2979.48 - lr: 0.000025 - momentum: 0.000000
+2023-10-13 13:23:30,488 epoch 6 - iter 264/447 - loss 0.01860310 - time (sec): 17.27 - samples/sec: 2980.78 - lr: 0.000025 - momentum: 0.000000
+2023-10-13 13:23:33,366 epoch 6 - iter 308/447 - loss 0.01815398 - time (sec): 20.15 - samples/sec: 2976.57 - lr: 0.000024 - momentum: 0.000000
+2023-10-13 13:23:36,065 epoch 6 - iter 352/447 - loss 0.01915966 - time (sec): 22.85 - samples/sec: 2977.83 - lr: 0.000023 - momentum: 0.000000
+2023-10-13 13:23:38,865 epoch 6 - iter 396/447 - loss 0.01991320 - time (sec): 25.65 - samples/sec: 2999.07 - lr: 0.000023 - momentum: 0.000000
+2023-10-13 13:23:41,624 epoch 6 - iter 440/447 - loss 0.02043759 - time (sec): 28.41 - samples/sec: 3003.11 - lr: 0.000022 - momentum: 0.000000
+2023-10-13 13:23:42,034 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:23:42,034 EPOCH 6 done: loss 0.0205 - lr: 0.000022
+2023-10-13 13:23:50,494 DEV : loss 0.21282611787319183 - f1-score (micro avg) 0.7583
+2023-10-13 13:23:50,525 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:23:53,838 epoch 7 - iter 44/447 - loss 0.02198471 - time (sec): 3.31 - samples/sec: 2998.52 - lr: 0.000022 - momentum: 0.000000
+2023-10-13 13:23:56,555 epoch 7 - iter 88/447 - loss 0.01806647 - time (sec): 6.03 - samples/sec: 2959.07 - lr: 0.000021 - momentum: 0.000000
+2023-10-13 13:23:59,442 epoch 7 - iter 132/447 - loss 0.01433410 - time (sec): 8.92 - samples/sec: 2982.30 - lr: 0.000021 - momentum: 0.000000
+2023-10-13 13:24:02,316 epoch 7 - iter 176/447 - loss 0.01416791 - time (sec): 11.79 - samples/sec: 3005.39 - lr: 0.000020 - momentum: 0.000000
+2023-10-13 13:24:05,100 epoch 7 - iter 220/447 - loss 0.01395297 - time (sec): 14.57 - samples/sec: 3015.13 - lr: 0.000020 - momentum: 0.000000
+2023-10-13 13:24:07,721 epoch 7 - iter 264/447 - loss 0.01425224 - time (sec): 17.19 - samples/sec: 3003.40 - lr: 0.000019 - momentum: 0.000000
+2023-10-13 13:24:10,471 epoch 7 - iter 308/447 - loss 0.01599349 - time (sec): 19.95 - samples/sec: 3015.57 - lr: 0.000018 - momentum: 0.000000
+2023-10-13 13:24:13,263 epoch 7 - iter 352/447 - loss 0.01639863 - time (sec): 22.74 - samples/sec: 3007.11 - lr: 0.000018 - momentum: 0.000000
+2023-10-13 13:24:15,883 epoch 7 - iter 396/447 - loss 0.01654187 - time (sec): 25.36 - samples/sec: 3012.06 - lr: 0.000017 - momentum: 0.000000
+2023-10-13 13:24:18,717 epoch 7 - iter 440/447 - loss 0.01632206 - time (sec): 28.19 - samples/sec: 3031.51 - lr: 0.000017 - momentum: 0.000000
+2023-10-13 13:24:19,099 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:24:19,099 EPOCH 7 done: loss 0.0163 - lr: 0.000017
+2023-10-13 13:24:27,956 DEV : loss 0.21246877312660217 - f1-score (micro avg) 0.7732
+2023-10-13 13:24:27,990 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:24:30,849 epoch 8 - iter 44/447 - loss 0.01236522 - time (sec): 2.86 - samples/sec: 3006.90 - lr: 0.000016 - momentum: 0.000000
+2023-10-13 13:24:33,776 epoch 8 - iter 88/447 - loss 0.01153258 - time (sec): 5.78 - samples/sec: 2962.65 - lr: 0.000016 - momentum: 0.000000
+2023-10-13 13:24:36,431 epoch 8 - iter 132/447 - loss 0.01262017 - time (sec): 8.44 - samples/sec: 3003.64 - lr: 0.000015 - momentum: 0.000000
+2023-10-13 13:24:39,104 epoch 8 - iter 176/447 - loss 0.01112667 - time (sec): 11.11 - samples/sec: 3015.10 - lr: 0.000015 - momentum: 0.000000
+2023-10-13 13:24:41,911 epoch 8 - iter 220/447 - loss 0.00946565 - time (sec): 13.92 - samples/sec: 3002.28 - lr: 0.000014 - momentum: 0.000000
+2023-10-13 13:24:44,625 epoch 8 - iter 264/447 - loss 0.00948303 - time (sec): 16.63 - samples/sec: 3018.28 - lr: 0.000013 - momentum: 0.000000
+2023-10-13 13:24:47,483 epoch 8 - iter 308/447 - loss 0.00851173 - time (sec): 19.49 - samples/sec: 2998.77 - lr: 0.000013 - momentum: 0.000000
+2023-10-13 13:24:50,628 epoch 8 - iter 352/447 - loss 0.00968723 - time (sec): 22.64 - samples/sec: 2987.84 - lr: 0.000012 - momentum: 0.000000
+2023-10-13 13:24:53,754 epoch 8 - iter 396/447 - loss 0.00996054 - time (sec): 25.76 - samples/sec: 2977.54 - lr: 0.000012 - momentum: 0.000000
+2023-10-13 13:24:56,483 epoch 8 - iter 440/447 - loss 0.00980498 - time (sec): 28.49 - samples/sec: 2985.89 - lr: 0.000011 - momentum: 0.000000
+2023-10-13 13:24:56,967 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:24:56,967 EPOCH 8 done: loss 0.0096 - lr: 0.000011
+2023-10-13 13:25:05,316 DEV : loss 0.21407929062843323 - f1-score (micro avg) 0.7841
+2023-10-13 13:25:05,349 saving best model
+2023-10-13 13:25:06,114 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:25:08,889 epoch 9 - iter 44/447 - loss 0.00542228 - time (sec): 2.77 - samples/sec: 2990.28 - lr: 0.000011 - momentum: 0.000000
+2023-10-13 13:25:11,925 epoch 9 - iter 88/447 - loss 0.00463397 - time (sec): 5.81 - samples/sec: 2922.50 - lr: 0.000010 - momentum: 0.000000
+2023-10-13 13:25:14,532 epoch 9 - iter 132/447 - loss 0.00602455 - time (sec): 8.42 - samples/sec: 2961.12 - lr: 0.000010 - momentum: 0.000000
+2023-10-13 13:25:17,407 epoch 9 - iter 176/447 - loss 0.00746601 - time (sec): 11.29 - samples/sec: 2945.80 - lr: 0.000009 - momentum: 0.000000
+2023-10-13 13:25:20,398 epoch 9 - iter 220/447 - loss 0.00670432 - time (sec): 14.28 - samples/sec: 2945.21 - lr: 0.000008 - momentum: 0.000000
+2023-10-13 13:25:23,382 epoch 9 - iter 264/447 - loss 0.00608308 - time (sec): 17.27 - samples/sec: 2921.85 - lr: 0.000008 - momentum: 0.000000
+2023-10-13 13:25:26,159 epoch 9 - iter 308/447 - loss 0.00653327 - time (sec): 20.04 - samples/sec: 2942.50 - lr: 0.000007 - momentum: 0.000000
+2023-10-13 13:25:29,535 epoch 9 - iter 352/447 - loss 0.00606227 - time (sec): 23.42 - samples/sec: 2950.37 - lr: 0.000007 - momentum: 0.000000
+2023-10-13 13:25:32,391 epoch 9 - iter 396/447 - loss 0.00639259 - time (sec): 26.28 - samples/sec: 2943.53 - lr: 0.000006 - momentum: 0.000000
+2023-10-13 13:25:35,184 epoch 9 - iter 440/447 - loss 0.00614557 - time (sec): 29.07 - samples/sec: 2940.07 - lr: 0.000006 - momentum: 0.000000
+2023-10-13 13:25:35,554 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:25:35,555 EPOCH 9 done: loss 0.0066 - lr: 0.000006
+2023-10-13 13:25:43,820 DEV : loss 0.23063816130161285 - f1-score (micro avg) 0.781
+2023-10-13 13:25:43,854 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:25:47,476 epoch 10 - iter 44/447 - loss 0.00098734 - time (sec): 3.62 - samples/sec: 2730.27 - lr: 0.000005 - momentum: 0.000000
+2023-10-13 13:25:50,549 epoch 10 - iter 88/447 - loss 0.00181722 - time (sec): 6.69 - samples/sec: 2770.81 - lr: 0.000005 - momentum: 0.000000
+2023-10-13 13:25:53,340 epoch 10 - iter 132/447 - loss 0.00316493 - time (sec): 9.48 - samples/sec: 2826.86 - lr: 0.000004 - momentum: 0.000000
+2023-10-13 13:25:56,007 epoch 10 - iter 176/447 - loss 0.00376538 - time (sec): 12.15 - samples/sec: 2864.66 - lr: 0.000003 - momentum: 0.000000
+2023-10-13 13:25:58,830 epoch 10 - iter 220/447 - loss 0.00318066 - time (sec): 14.97 - samples/sec: 2891.81 - lr: 0.000003 - momentum: 0.000000
+2023-10-13 13:26:01,539 epoch 10 - iter 264/447 - loss 0.00282318 - time (sec): 17.68 - samples/sec: 2901.48 - lr: 0.000002 - momentum: 0.000000
+2023-10-13 13:26:04,410 epoch 10 - iter 308/447 - loss 0.00291001 - time (sec): 20.55 - samples/sec: 2894.73 - lr: 0.000002 - momentum: 0.000000
+2023-10-13 13:26:07,417 epoch 10 - iter 352/447 - loss 0.00276025 - time (sec): 23.56 - samples/sec: 2890.66 - lr: 0.000001 - momentum: 0.000000
+2023-10-13 13:26:10,104 epoch 10 - iter 396/447 - loss 0.00297373 - time (sec): 26.25 - samples/sec: 2912.16 - lr: 0.000001 - momentum: 0.000000
+2023-10-13 13:26:12,949 epoch 10 - iter 440/447 - loss 0.00306137 - time (sec): 29.09 - samples/sec: 2939.98 - lr: 0.000000 - momentum: 0.000000
+2023-10-13 13:26:13,356 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:26:13,357 EPOCH 10 done: loss 0.0030 - lr: 0.000000
+2023-10-13 13:26:21,613 DEV : loss 0.2350376695394516 - f1-score (micro avg) 0.7849
+2023-10-13 13:26:21,645 saving best model
+2023-10-13 13:26:22,417 ----------------------------------------------------------------------------------------------------
+2023-10-13 13:26:22,419 Loading model from best epoch ...
+2023-10-13 13:26:23,884 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
+2023-10-13 13:26:28,948
+Results:
+- F-score (micro) 0.7551
+- F-score (macro) 0.6861
+- Accuracy 0.6253
+
+By class:
+              precision    recall  f1-score   support
+
+         loc     0.8361    0.8473    0.8417       596
+        pers     0.6658    0.7778    0.7175       333
+         org     0.5620    0.5152    0.5375       132
+        prod     0.6333    0.5758    0.6032        66
+        time     0.6909    0.7755    0.7308        49
+
+   micro avg     0.7388    0.7721    0.7551      1176
+   macro avg     0.6776    0.6983    0.6861      1176
+weighted avg     0.7397    0.7721    0.7544      1176
+
+2023-10-13 13:26:28,948 ----------------------------------------------------------------------------------------------------
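The three averaged rows of the by-class table follow from the per-class rows using the standard definitions of micro, macro, and support-weighted F1. The following is a minimal sketch that recomputes them from the rounded values printed in the log, so agreement is only expected to four decimals. The variable names are hypothetical, and the micro F1 is derived from the logged micro precision and recall because raw true-positive counts are not part of the log.

```python
# Per-class rows copied from the "By class" table: (precision, recall, f1, support).
by_class = {
    "loc": (0.8361, 0.8473, 0.8417, 596),
    "pers": (0.6658, 0.7778, 0.7175, 333),
    "org": (0.5620, 0.5152, 0.5375, 132),
    "prod": (0.6333, 0.5758, 0.6032, 66),
    "time": (0.6909, 0.7755, 0.7308, 49),
}

total_support = sum(support for _, _, _, support in by_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# Weighted average: per-class F1 weighted by the number of gold entities (support).
weighted_f1 = sum(f1 * support for _, _, f1, support in by_class.values()) / total_support

# Micro F1: harmonic mean of the logged micro precision and micro recall.
micro_precision, micro_recall = 0.7388, 0.7721
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"support total: {total_support}")
print(f"micro F1:      {micro_f1:.4f}")
print(f"macro F1:      {macro_f1:.4f}")
print(f"weighted F1:   {weighted_f1:.4f}")
```

The macro score sits well below the micro score because the two weakest classes, org and prod, carry the same weight as loc in the macro average despite having far fewer entities.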