Upload folder using huggingface_hub

- best-model.pt   +3 -0
- dev.tsv         +0 -0
- loss.tsv       +11 -0
- test.tsv        +0 -0
- training.log  +240 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4946d68c4a2b4b2130d4de5877f894682d6d9e7e9190efc919a92ec9f237ca3f
+size 443323527
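The three `+` lines above are not the checkpoint itself but a Git LFS pointer: the repository stores only the content hash and size, and the 443 MB weight file lives in LFS storage. A minimal sketch of reading such a pointer (the `parse_lfs_pointer` helper is illustrative, not part of huggingface_hub):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a {key: value} dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer content shown in the diff above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4946d68c4a2b4b2130d4de5877f894682d6d9e7e9190efc919a92ec9f237ca3f
size 443323527
"""

info = parse_lfs_pointer(pointer)
print(info["oid"])        # SHA-256 content address of the real file
print(int(info["size"]))  # checkpoint size in bytes
```

Cloning the repo with `git lfs` installed (or downloading via `huggingface_hub`) resolves the pointer to the actual `best-model.pt`.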
dev.tsv
ADDED
The diff for this file is too large to render. See raw diff.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      23:49:56   0.0000         0.3687      0.1275    0.2297         0.4394      0.3017  0.1778
+2      23:54:18   0.0000         0.1873      0.1455    0.2548         0.5833      0.3546  0.2166
+3      23:58:40   0.0000         0.1415      0.2712    0.2230         0.5777      0.3217  0.1930
+4      00:03:02   0.0000         0.1010      0.2461    0.2653         0.6231      0.3722  0.2302
+5      00:07:25   0.0000         0.0756      0.2759    0.2991         0.5739      0.3933  0.2465
+6      00:11:47   0.0000         0.0563      0.3152    0.2543         0.5852      0.3546  0.2167
+7      00:16:10   0.0000         0.0418      0.3469    0.2641         0.6023      0.3672  0.2263
+8      00:20:31   0.0000         0.0296      0.4152    0.2587         0.6061      0.3626  0.2225
+9      00:24:52   0.0000         0.0210      0.4039    0.2535         0.6250      0.3607  0.2209
+10     00:29:13   0.0000         0.0141      0.4168    0.2670         0.6174      0.3728  0.2304
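Flair selects `best-model.pt` by dev micro-F1, so the saved checkpoint corresponds to the row with the highest DEV_F1. A minimal sketch of recovering that epoch from a `loss.tsv` like the one above (columns abridged here to EPOCH and DEV_F1 for brevity; Flair writes the file tab-separated):

```python
import csv
import io

# Abridged copy of the loss.tsv above: epoch number and dev micro-F1.
LOSS_TSV = "EPOCH\tDEV_F1\n" + "\n".join(
    f"{e}\t{f1}" for e, f1 in [
        (1, 0.3017), (2, 0.3546), (3, 0.3217), (4, 0.3722), (5, 0.3933),
        (6, 0.3546), (7, 0.3672), (8, 0.3626), (9, 0.3607), (10, 0.3728),
    ]
)

rows = list(csv.DictReader(io.StringIO(LOSS_TSV), delimiter="\t"))
best = max(rows, key=lambda r: float(r["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # epoch 5 scores highest on dev F1
```

This matches the training log: "saving best model" last appears after epoch 5 (dev F1 0.3933), and no later epoch beats it.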
test.tsv
ADDED
The diff for this file is too large to render. See raw diff.
training.log
ADDED
@@ -0,0 +1,240 @@
+2023-10-15 23:45:36,931 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:45:36,932 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=768, out_features=768, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=17, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-15 23:45:36,932 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:45:36,932 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
+ - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
+2023-10-15 23:45:36,932 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:45:36,932 Train:  20847 sentences
+2023-10-15 23:45:36,932         (train_with_dev=False, train_with_test=False)
+2023-10-15 23:45:36,932 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:45:36,932 Training Params:
+2023-10-15 23:45:36,932  - learning_rate: "5e-05"
+2023-10-15 23:45:36,933  - mini_batch_size: "4"
+2023-10-15 23:45:36,933  - max_epochs: "10"
+2023-10-15 23:45:36,933  - shuffle: "True"
+2023-10-15 23:45:36,933 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:45:36,933 Plugins:
+2023-10-15 23:45:36,933  - LinearScheduler | warmup_fraction: '0.1'
+2023-10-15 23:45:36,933 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:45:36,933 Final evaluation on model from best epoch (best-model.pt)
+2023-10-15 23:45:36,933  - metric: "('micro avg', 'f1-score')"
+2023-10-15 23:45:36,933 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:45:36,933 Computation:
+2023-10-15 23:45:36,933  - compute on device: cuda:0
+2023-10-15 23:45:36,933  - embedding storage: none
+2023-10-15 23:45:36,933 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:45:36,933 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
+2023-10-15 23:45:36,933 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:45:36,933 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:46:01,690 epoch 1 - iter 521/5212 - loss 1.37202838 - time (sec): 24.76 - samples/sec: 1438.81 - lr: 0.000005 - momentum: 0.000000
+2023-10-15 23:46:26,941 epoch 1 - iter 1042/5212 - loss 0.87856005 - time (sec): 50.01 - samples/sec: 1460.52 - lr: 0.000010 - momentum: 0.000000
+2023-10-15 23:46:51,993 epoch 1 - iter 1563/5212 - loss 0.68736436 - time (sec): 75.06 - samples/sec: 1446.56 - lr: 0.000015 - momentum: 0.000000
+2023-10-15 23:47:17,119 epoch 1 - iter 2084/5212 - loss 0.58462865 - time (sec): 100.18 - samples/sec: 1440.83 - lr: 0.000020 - momentum: 0.000000
+2023-10-15 23:47:42,651 epoch 1 - iter 2605/5212 - loss 0.51553132 - time (sec): 125.72 - samples/sec: 1437.34 - lr: 0.000025 - momentum: 0.000000
+2023-10-15 23:48:08,713 epoch 1 - iter 3126/5212 - loss 0.46300876 - time (sec): 151.78 - samples/sec: 1438.92 - lr: 0.000030 - momentum: 0.000000
+2023-10-15 23:48:33,975 epoch 1 - iter 3647/5212 - loss 0.43081269 - time (sec): 177.04 - samples/sec: 1443.11 - lr: 0.000035 - momentum: 0.000000
+2023-10-15 23:48:59,438 epoch 1 - iter 4168/5212 - loss 0.40412772 - time (sec): 202.50 - samples/sec: 1440.23 - lr: 0.000040 - momentum: 0.000000
+2023-10-15 23:49:24,710 epoch 1 - iter 4689/5212 - loss 0.38575539 - time (sec): 227.78 - samples/sec: 1435.07 - lr: 0.000045 - momentum: 0.000000
+2023-10-15 23:49:50,594 epoch 1 - iter 5210/5212 - loss 0.36876867 - time (sec): 253.66 - samples/sec: 1448.37 - lr: 0.000050 - momentum: 0.000000
+2023-10-15 23:49:50,688 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:49:50,688 EPOCH 1 done: loss 0.3687 - lr: 0.000050
+2023-10-15 23:49:56,687 DEV : loss 0.12752242386341095 - f1-score (micro avg) 0.3017
+2023-10-15 23:49:56,714 saving best model
+2023-10-15 23:49:57,179 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:50:22,567 epoch 2 - iter 521/5212 - loss 0.20030192 - time (sec): 25.39 - samples/sec: 1528.79 - lr: 0.000049 - momentum: 0.000000
+2023-10-15 23:50:47,800 epoch 2 - iter 1042/5212 - loss 0.19679440 - time (sec): 50.62 - samples/sec: 1482.50 - lr: 0.000049 - momentum: 0.000000
+2023-10-15 23:51:13,032 epoch 2 - iter 1563/5212 - loss 0.18526110 - time (sec): 75.85 - samples/sec: 1470.72 - lr: 0.000048 - momentum: 0.000000
+2023-10-15 23:51:38,382 epoch 2 - iter 2084/5212 - loss 0.18140515 - time (sec): 101.20 - samples/sec: 1471.11 - lr: 0.000048 - momentum: 0.000000
+2023-10-15 23:52:03,691 epoch 2 - iter 2605/5212 - loss 0.18582562 - time (sec): 126.51 - samples/sec: 1464.80 - lr: 0.000047 - momentum: 0.000000
+2023-10-15 23:52:28,906 epoch 2 - iter 3126/5212 - loss 0.18577517 - time (sec): 151.72 - samples/sec: 1469.25 - lr: 0.000047 - momentum: 0.000000
+2023-10-15 23:52:53,821 epoch 2 - iter 3647/5212 - loss 0.18828427 - time (sec): 176.64 - samples/sec: 1465.00 - lr: 0.000046 - momentum: 0.000000
+2023-10-15 23:53:19,622 epoch 2 - iter 4168/5212 - loss 0.18599284 - time (sec): 202.44 - samples/sec: 1470.56 - lr: 0.000046 - momentum: 0.000000
+2023-10-15 23:53:44,440 epoch 2 - iter 4689/5212 - loss 0.18772856 - time (sec): 227.26 - samples/sec: 1455.41 - lr: 0.000045 - momentum: 0.000000
+2023-10-15 23:54:09,398 epoch 2 - iter 5210/5212 - loss 0.18732109 - time (sec): 252.22 - samples/sec: 1455.77 - lr: 0.000044 - momentum: 0.000000
+2023-10-15 23:54:09,513 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:54:09,513 EPOCH 2 done: loss 0.1873 - lr: 0.000044
+2023-10-15 23:54:18,685 DEV : loss 0.14554405212402344 - f1-score (micro avg) 0.3546
+2023-10-15 23:54:18,713 saving best model
+2023-10-15 23:54:19,332 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:54:44,600 epoch 3 - iter 521/5212 - loss 0.16892477 - time (sec): 25.26 - samples/sec: 1431.22 - lr: 0.000044 - momentum: 0.000000
+2023-10-15 23:55:09,173 epoch 3 - iter 1042/5212 - loss 0.15480170 - time (sec): 49.84 - samples/sec: 1382.24 - lr: 0.000043 - momentum: 0.000000
+2023-10-15 23:55:34,192 epoch 3 - iter 1563/5212 - loss 0.15253235 - time (sec): 74.86 - samples/sec: 1379.25 - lr: 0.000043 - momentum: 0.000000
+2023-10-15 23:55:59,000 epoch 3 - iter 2084/5212 - loss 0.14966988 - time (sec): 99.66 - samples/sec: 1394.82 - lr: 0.000042 - momentum: 0.000000
+2023-10-15 23:56:23,271 epoch 3 - iter 2605/5212 - loss 0.14248758 - time (sec): 123.94 - samples/sec: 1422.36 - lr: 0.000042 - momentum: 0.000000
+2023-10-15 23:56:48,693 epoch 3 - iter 3126/5212 - loss 0.14250745 - time (sec): 149.36 - samples/sec: 1432.87 - lr: 0.000041 - momentum: 0.000000
+2023-10-15 23:57:14,127 epoch 3 - iter 3647/5212 - loss 0.14348930 - time (sec): 174.79 - samples/sec: 1440.23 - lr: 0.000041 - momentum: 0.000000
+2023-10-15 23:57:39,517 epoch 3 - iter 4168/5212 - loss 0.14496238 - time (sec): 200.18 - samples/sec: 1439.25 - lr: 0.000040 - momentum: 0.000000
+2023-10-15 23:58:05,279 epoch 3 - iter 4689/5212 - loss 0.14349134 - time (sec): 225.94 - samples/sec: 1447.61 - lr: 0.000039 - momentum: 0.000000
+2023-10-15 23:58:31,311 epoch 3 - iter 5210/5212 - loss 0.14146075 - time (sec): 251.98 - samples/sec: 1458.09 - lr: 0.000039 - momentum: 0.000000
+2023-10-15 23:58:31,401 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:58:31,401 EPOCH 3 done: loss 0.1415 - lr: 0.000039
+2023-10-15 23:58:40,504 DEV : loss 0.27121227979660034 - f1-score (micro avg) 0.3217
+2023-10-15 23:58:40,532 ----------------------------------------------------------------------------------------------------
+2023-10-15 23:59:05,417 epoch 4 - iter 521/5212 - loss 0.09043585 - time (sec): 24.88 - samples/sec: 1421.95 - lr: 0.000038 - momentum: 0.000000
+2023-10-15 23:59:30,195 epoch 4 - iter 1042/5212 - loss 0.10494334 - time (sec): 49.66 - samples/sec: 1447.79 - lr: 0.000038 - momentum: 0.000000
+2023-10-15 23:59:55,656 epoch 4 - iter 1563/5212 - loss 0.09724528 - time (sec): 75.12 - samples/sec: 1479.73 - lr: 0.000037 - momentum: 0.000000
+2023-10-16 00:00:21,419 epoch 4 - iter 2084/5212 - loss 0.09872646 - time (sec): 100.89 - samples/sec: 1485.87 - lr: 0.000037 - momentum: 0.000000
+2023-10-16 00:00:46,935 epoch 4 - iter 2605/5212 - loss 0.10353381 - time (sec): 126.40 - samples/sec: 1480.74 - lr: 0.000036 - momentum: 0.000000
+2023-10-16 00:01:12,909 epoch 4 - iter 3126/5212 - loss 0.10532981 - time (sec): 152.38 - samples/sec: 1464.22 - lr: 0.000036 - momentum: 0.000000
+2023-10-16 00:01:38,193 epoch 4 - iter 3647/5212 - loss 0.10412690 - time (sec): 177.66 - samples/sec: 1458.40 - lr: 0.000035 - momentum: 0.000000
+2023-10-16 00:02:03,668 epoch 4 - iter 4168/5212 - loss 0.10176002 - time (sec): 203.14 - samples/sec: 1462.23 - lr: 0.000034 - momentum: 0.000000
+2023-10-16 00:02:28,489 epoch 4 - iter 4689/5212 - loss 0.10219116 - time (sec): 227.96 - samples/sec: 1455.00 - lr: 0.000034 - momentum: 0.000000
+2023-10-16 00:02:53,672 epoch 4 - iter 5210/5212 - loss 0.10091424 - time (sec): 253.14 - samples/sec: 1450.58 - lr: 0.000033 - momentum: 0.000000
+2023-10-16 00:02:53,769 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:02:53,769 EPOCH 4 done: loss 0.1010 - lr: 0.000033
+2023-10-16 00:03:02,079 DEV : loss 0.24610090255737305 - f1-score (micro avg) 0.3722
+2023-10-16 00:03:02,108 saving best model
+2023-10-16 00:03:02,652 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:03:28,355 epoch 5 - iter 521/5212 - loss 0.08946468 - time (sec): 25.70 - samples/sec: 1499.17 - lr: 0.000033 - momentum: 0.000000
+2023-10-16 00:03:53,305 epoch 5 - iter 1042/5212 - loss 0.08272511 - time (sec): 50.65 - samples/sec: 1445.65 - lr: 0.000032 - momentum: 0.000000
+2023-10-16 00:04:19,363 epoch 5 - iter 1563/5212 - loss 0.07858447 - time (sec): 76.71 - samples/sec: 1419.47 - lr: 0.000032 - momentum: 0.000000
+2023-10-16 00:04:45,160 epoch 5 - iter 2084/5212 - loss 0.07638850 - time (sec): 102.51 - samples/sec: 1440.40 - lr: 0.000031 - momentum: 0.000000
+2023-10-16 00:05:10,817 epoch 5 - iter 2605/5212 - loss 0.07945967 - time (sec): 128.16 - samples/sec: 1449.44 - lr: 0.000031 - momentum: 0.000000
+2023-10-16 00:05:35,603 epoch 5 - iter 3126/5212 - loss 0.07924286 - time (sec): 152.95 - samples/sec: 1448.28 - lr: 0.000030 - momentum: 0.000000
+2023-10-16 00:06:00,818 epoch 5 - iter 3647/5212 - loss 0.07893342 - time (sec): 178.16 - samples/sec: 1442.27 - lr: 0.000029 - momentum: 0.000000
+2023-10-16 00:06:26,438 epoch 5 - iter 4168/5212 - loss 0.07696755 - time (sec): 203.78 - samples/sec: 1448.05 - lr: 0.000029 - momentum: 0.000000
+2023-10-16 00:06:51,588 epoch 5 - iter 4689/5212 - loss 0.07559505 - time (sec): 228.93 - samples/sec: 1447.40 - lr: 0.000028 - momentum: 0.000000
+2023-10-16 00:07:16,799 epoch 5 - iter 5210/5212 - loss 0.07550940 - time (sec): 254.14 - samples/sec: 1445.61 - lr: 0.000028 - momentum: 0.000000
+2023-10-16 00:07:16,884 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:07:16,885 EPOCH 5 done: loss 0.0756 - lr: 0.000028
+2023-10-16 00:07:25,246 DEV : loss 0.27587252855300903 - f1-score (micro avg) 0.3933
+2023-10-16 00:07:25,276 saving best model
+2023-10-16 00:07:25,899 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:07:51,242 epoch 6 - iter 521/5212 - loss 0.04500721 - time (sec): 25.34 - samples/sec: 1438.74 - lr: 0.000027 - momentum: 0.000000
+2023-10-16 00:08:16,452 epoch 6 - iter 1042/5212 - loss 0.05219254 - time (sec): 50.55 - samples/sec: 1458.09 - lr: 0.000027 - momentum: 0.000000
+2023-10-16 00:08:41,812 epoch 6 - iter 1563/5212 - loss 0.05073335 - time (sec): 75.91 - samples/sec: 1454.84 - lr: 0.000026 - momentum: 0.000000
+2023-10-16 00:09:07,218 epoch 6 - iter 2084/5212 - loss 0.05244273 - time (sec): 101.32 - samples/sec: 1458.61 - lr: 0.000026 - momentum: 0.000000
+2023-10-16 00:09:32,274 epoch 6 - iter 2605/5212 - loss 0.05597317 - time (sec): 126.37 - samples/sec: 1450.57 - lr: 0.000025 - momentum: 0.000000
+2023-10-16 00:09:58,365 epoch 6 - iter 3126/5212 - loss 0.05870564 - time (sec): 152.46 - samples/sec: 1447.89 - lr: 0.000024 - momentum: 0.000000
+2023-10-16 00:10:23,762 epoch 6 - iter 3647/5212 - loss 0.05885555 - time (sec): 177.86 - samples/sec: 1458.55 - lr: 0.000024 - momentum: 0.000000
+2023-10-16 00:10:49,124 epoch 6 - iter 4168/5212 - loss 0.05780173 - time (sec): 203.22 - samples/sec: 1461.81 - lr: 0.000023 - momentum: 0.000000
+2023-10-16 00:11:14,399 epoch 6 - iter 4689/5212 - loss 0.05720709 - time (sec): 228.50 - samples/sec: 1459.04 - lr: 0.000023 - momentum: 0.000000
+2023-10-16 00:11:39,181 epoch 6 - iter 5210/5212 - loss 0.05635932 - time (sec): 253.28 - samples/sec: 1448.97 - lr: 0.000022 - momentum: 0.000000
+2023-10-16 00:11:39,333 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:11:39,333 EPOCH 6 done: loss 0.0563 - lr: 0.000022
+2023-10-16 00:11:47,640 DEV : loss 0.3152172863483429 - f1-score (micro avg) 0.3546
+2023-10-16 00:11:47,669 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:12:12,705 epoch 7 - iter 521/5212 - loss 0.05081149 - time (sec): 25.03 - samples/sec: 1394.27 - lr: 0.000022 - momentum: 0.000000
+2023-10-16 00:12:37,682 epoch 7 - iter 1042/5212 - loss 0.04120887 - time (sec): 50.01 - samples/sec: 1432.98 - lr: 0.000021 - momentum: 0.000000
+2023-10-16 00:13:03,240 epoch 7 - iter 1563/5212 - loss 0.04350707 - time (sec): 75.57 - samples/sec: 1433.54 - lr: 0.000021 - momentum: 0.000000
+2023-10-16 00:13:29,103 epoch 7 - iter 2084/5212 - loss 0.04429995 - time (sec): 101.43 - samples/sec: 1456.00 - lr: 0.000020 - momentum: 0.000000
+2023-10-16 00:13:54,490 epoch 7 - iter 2605/5212 - loss 0.04318346 - time (sec): 126.82 - samples/sec: 1447.52 - lr: 0.000019 - momentum: 0.000000
+2023-10-16 00:14:19,664 epoch 7 - iter 3126/5212 - loss 0.04497001 - time (sec): 151.99 - samples/sec: 1451.89 - lr: 0.000019 - momentum: 0.000000
+2023-10-16 00:14:44,996 epoch 7 - iter 3647/5212 - loss 0.04321396 - time (sec): 177.33 - samples/sec: 1459.04 - lr: 0.000018 - momentum: 0.000000
+2023-10-16 00:15:10,347 epoch 7 - iter 4168/5212 - loss 0.04275074 - time (sec): 202.68 - samples/sec: 1460.57 - lr: 0.000018 - momentum: 0.000000
+2023-10-16 00:15:36,189 epoch 7 - iter 4689/5212 - loss 0.04205276 - time (sec): 228.52 - samples/sec: 1443.40 - lr: 0.000017 - momentum: 0.000000
+2023-10-16 00:16:01,971 epoch 7 - iter 5210/5212 - loss 0.04183362 - time (sec): 254.30 - samples/sec: 1444.10 - lr: 0.000017 - momentum: 0.000000
+2023-10-16 00:16:02,089 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:16:02,090 EPOCH 7 done: loss 0.0418 - lr: 0.000017
+2023-10-16 00:16:10,408 DEV : loss 0.3469476103782654 - f1-score (micro avg) 0.3672
+2023-10-16 00:16:10,453 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:16:37,044 epoch 8 - iter 521/5212 - loss 0.02217721 - time (sec): 26.59 - samples/sec: 1468.68 - lr: 0.000016 - momentum: 0.000000
+2023-10-16 00:17:02,282 epoch 8 - iter 1042/5212 - loss 0.02729592 - time (sec): 51.83 - samples/sec: 1486.36 - lr: 0.000016 - momentum: 0.000000
+2023-10-16 00:17:27,560 epoch 8 - iter 1563/5212 - loss 0.02832787 - time (sec): 77.10 - samples/sec: 1473.74 - lr: 0.000015 - momentum: 0.000000
+2023-10-16 00:17:53,339 epoch 8 - iter 2084/5212 - loss 0.02989946 - time (sec): 102.88 - samples/sec: 1460.61 - lr: 0.000014 - momentum: 0.000000
+2023-10-16 00:18:18,756 epoch 8 - iter 2605/5212 - loss 0.02875976 - time (sec): 128.30 - samples/sec: 1453.04 - lr: 0.000014 - momentum: 0.000000
+2023-10-16 00:18:44,249 epoch 8 - iter 3126/5212 - loss 0.02972707 - time (sec): 153.79 - samples/sec: 1457.00 - lr: 0.000013 - momentum: 0.000000
+2023-10-16 00:19:08,428 epoch 8 - iter 3647/5212 - loss 0.02955815 - time (sec): 177.97 - samples/sec: 1459.29 - lr: 0.000013 - momentum: 0.000000
+2023-10-16 00:19:32,685 epoch 8 - iter 4168/5212 - loss 0.02992807 - time (sec): 202.23 - samples/sec: 1463.77 - lr: 0.000012 - momentum: 0.000000
+2023-10-16 00:19:56,993 epoch 8 - iter 4689/5212 - loss 0.02991417 - time (sec): 226.54 - samples/sec: 1458.61 - lr: 0.000012 - momentum: 0.000000
+2023-10-16 00:20:22,203 epoch 8 - iter 5210/5212 - loss 0.02960201 - time (sec): 251.75 - samples/sec: 1459.33 - lr: 0.000011 - momentum: 0.000000
+2023-10-16 00:20:22,290 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:20:22,290 EPOCH 8 done: loss 0.0296 - lr: 0.000011
+2023-10-16 00:20:31,432 DEV : loss 0.4151654839515686 - f1-score (micro avg) 0.3626
+2023-10-16 00:20:31,464 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:20:56,517 epoch 9 - iter 521/5212 - loss 0.01705744 - time (sec): 25.05 - samples/sec: 1507.99 - lr: 0.000011 - momentum: 0.000000
+2023-10-16 00:21:21,288 epoch 9 - iter 1042/5212 - loss 0.02156692 - time (sec): 49.82 - samples/sec: 1460.72 - lr: 0.000010 - momentum: 0.000000
+2023-10-16 00:21:46,082 epoch 9 - iter 1563/5212 - loss 0.02449670 - time (sec): 74.62 - samples/sec: 1440.44 - lr: 0.000009 - momentum: 0.000000
+2023-10-16 00:22:10,930 epoch 9 - iter 2084/5212 - loss 0.02337409 - time (sec): 99.46 - samples/sec: 1440.07 - lr: 0.000009 - momentum: 0.000000
+2023-10-16 00:22:36,386 epoch 9 - iter 2605/5212 - loss 0.02237419 - time (sec): 124.92 - samples/sec: 1453.59 - lr: 0.000008 - momentum: 0.000000
+2023-10-16 00:23:01,612 epoch 9 - iter 3126/5212 - loss 0.02194368 - time (sec): 150.15 - samples/sec: 1459.66 - lr: 0.000008 - momentum: 0.000000
+2023-10-16 00:23:26,731 epoch 9 - iter 3647/5212 - loss 0.02224411 - time (sec): 175.26 - samples/sec: 1458.11 - lr: 0.000007 - momentum: 0.000000
+2023-10-16 00:23:51,921 epoch 9 - iter 4168/5212 - loss 0.02139231 - time (sec): 200.46 - samples/sec: 1462.79 - lr: 0.000007 - momentum: 0.000000
+2023-10-16 00:24:17,364 epoch 9 - iter 4689/5212 - loss 0.02129007 - time (sec): 225.90 - samples/sec: 1465.66 - lr: 0.000006 - momentum: 0.000000
+2023-10-16 00:24:42,558 epoch 9 - iter 5210/5212 - loss 0.02095764 - time (sec): 251.09 - samples/sec: 1463.02 - lr: 0.000006 - momentum: 0.000000
+2023-10-16 00:24:42,651 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:24:42,652 EPOCH 9 done: loss 0.0210 - lr: 0.000006
+2023-10-16 00:24:52,948 DEV : loss 0.40386104583740234 - f1-score (micro avg) 0.3607
+2023-10-16 00:24:52,983 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:25:18,379 epoch 10 - iter 521/5212 - loss 0.01307094 - time (sec): 25.39 - samples/sec: 1403.60 - lr: 0.000005 - momentum: 0.000000
+2023-10-16 00:25:43,265 epoch 10 - iter 1042/5212 - loss 0.01629748 - time (sec): 50.28 - samples/sec: 1423.93 - lr: 0.000004 - momentum: 0.000000
+2023-10-16 00:26:08,138 epoch 10 - iter 1563/5212 - loss 0.01516469 - time (sec): 75.15 - samples/sec: 1421.72 - lr: 0.000004 - momentum: 0.000000
+2023-10-16 00:26:33,158 epoch 10 - iter 2084/5212 - loss 0.01544442 - time (sec): 100.17 - samples/sec: 1426.48 - lr: 0.000003 - momentum: 0.000000
+2023-10-16 00:26:58,058 epoch 10 - iter 2605/5212 - loss 0.01529237 - time (sec): 125.07 - samples/sec: 1425.16 - lr: 0.000003 - momentum: 0.000000
+2023-10-16 00:27:23,342 epoch 10 - iter 3126/5212 - loss 0.01463251 - time (sec): 150.36 - samples/sec: 1442.88 - lr: 0.000002 - momentum: 0.000000
+2023-10-16 00:27:48,516 epoch 10 - iter 3647/5212 - loss 0.01441137 - time (sec): 175.53 - samples/sec: 1454.61 - lr: 0.000002 - momentum: 0.000000
+2023-10-16 00:28:14,171 epoch 10 - iter 4168/5212 - loss 0.01390284 - time (sec): 201.19 - samples/sec: 1462.31 - lr: 0.000001 - momentum: 0.000000
+2023-10-16 00:28:39,493 epoch 10 - iter 4689/5212 - loss 0.01394799 - time (sec): 226.51 - samples/sec: 1464.10 - lr: 0.000001 - momentum: 0.000000
+2023-10-16 00:29:04,370 epoch 10 - iter 5210/5212 - loss 0.01412248 - time (sec): 251.39 - samples/sec: 1461.38 - lr: 0.000000 - momentum: 0.000000
+2023-10-16 00:29:04,465 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:29:04,465 EPOCH 10 done: loss 0.0141 - lr: 0.000000
+2023-10-16 00:29:13,621 DEV : loss 0.41680729389190674 - f1-score (micro avg) 0.3728
+2023-10-16 00:29:14,169 ----------------------------------------------------------------------------------------------------
+2023-10-16 00:29:14,171 Loading model from best epoch ...
+2023-10-16 00:29:15,700 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
+2023-10-16 00:29:30,958
+Results:
+- F-score (micro) 0.4193
+- F-score (macro) 0.2665
+- Accuracy 0.2701
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.4918    0.5206    0.5058      1214
+         PER     0.3522    0.4084    0.3782       808
+         ORG     0.2314    0.1501    0.1821       353
+   HumanProd     0.0000    0.0000    0.0000        15
+
+   micro avg     0.4141    0.4247    0.4193      2390
+   macro avg     0.2689    0.2698    0.2665      2390
+weighted avg     0.4031    0.4247    0.4117      2390
+
+2023-10-16 00:29:30,958 ----------------------------------------------------------------------------------------------------
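The per-iteration lr values in the log are produced by the LinearScheduler plugin with warmup_fraction '0.1': the rate ramps linearly from 0 to the peak 5e-05 over the first 10% of the 52,120 total batches (10 epochs x 5,212), then decays linearly back to 0. A minimal sketch of that schedule, assuming the scheduler steps once per mini-batch (the function name is illustrative, not Flair's API):

```python
def linear_warmup_lr(step: int, total_steps: int, peak_lr: float,
                     warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr over warmup_fraction of training, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

TOTAL = 10 * 5212   # 52,120 optimizer steps over the whole run
PEAK = 5e-05        # learning_rate from the Training Params block

# Reproduces the logged values:
print(f"{linear_warmup_lr(521, TOTAL, PEAK):.6f}")          # 0.000005 (epoch 1, iter 521)
print(f"{linear_warmup_lr(5210, TOTAL, PEAK):.6f}")         # 0.000050 (end of epoch 1)
print(f"{linear_warmup_lr(5212 + 5210, TOTAL, PEAK):.6f}")  # 0.000044 (end of epoch 2)
```

Note that warmup here spans exactly the first epoch, which is why the lr climbs to 0.000050 during epoch 1 and decays slowly afterwards, reaching 0.000000 at the final logged iteration.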