stefan-it committed
Commit aa3ae23
1 Parent(s): 9400af4

Upload folder using huggingface_hub

Files changed (5)
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +240 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4946d68c4a2b4b2130d4de5877f894682d6d9e7e9190efc919a92ec9f237ca3f
+ size 443323527
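
best-model.pt is stored as a Git LFS pointer, so the ~443 MB of weights only materialize after an LFS-enabled clone or download. As a minimal sketch (the local path and the example sentence are assumptions, not part of this commit), the checkpoint loads like any Flair SequenceTagger:

```python
# Minimal sketch, assuming the repository has been cloned locally with
# Git LFS so that best-model.pt actually contains the weights.
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the fine-tuned checkpoint from this commit.
tagger = SequenceTagger.load("best-model.pt")

# Tag a German example sentence (hypothetical input).
sentence = Sentence("George Washington ging nach Washington.")
tagger.predict(sentence)

# Print the recognized entity spans.
for entity in sentence.get_spans("ner"):
    print(entity)
```

The tag set the model predicts (LOC, PER, ORG and HumanProd in BIOES encoding) is listed near the end of training.log below.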
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+ 1      23:49:56   0.0000         0.3687      0.1275    0.2297         0.4394      0.3017  0.1778
+ 2      23:54:18   0.0000         0.1873      0.1455    0.2548         0.5833      0.3546  0.2166
+ 3      23:58:40   0.0000         0.1415      0.2712    0.2230         0.5777      0.3217  0.1930
+ 4      00:03:02   0.0000         0.1010      0.2461    0.2653         0.6231      0.3722  0.2302
+ 5      00:07:25   0.0000         0.0756      0.2759    0.2991         0.5739      0.3933  0.2465
+ 6      00:11:47   0.0000         0.0563      0.3152    0.2543         0.5852      0.3546  0.2167
+ 7      00:16:10   0.0000         0.0418      0.3469    0.2641         0.6023      0.3672  0.2263
+ 8      00:20:31   0.0000         0.0296      0.4152    0.2587         0.6061      0.3626  0.2225
+ 9      00:24:52   0.0000         0.0210      0.4039    0.2535         0.6250      0.3607  0.2209
+ 10     00:29:13   0.0000         0.0141      0.4168    0.2670         0.6174      0.3728  0.2304
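
loss.tsv holds one row per epoch. A small pandas/matplotlib sketch (assuming the file has been downloaded next to the script; Flair writes it tab-separated) makes the trend easier to see:

```python
# Minimal sketch for plotting the per-epoch metrics from loss.tsv.
import pandas as pd
import matplotlib.pyplot as plt

# Flair writes loss.tsv as a tab-separated file with one row per epoch.
df = pd.read_csv("loss.tsv", sep="\t")

fig, ax = plt.subplots()
ax.plot(df["EPOCH"], df["TRAIN_LOSS"], label="train loss")
ax.plot(df["EPOCH"], df["DEV_LOSS"], label="dev loss")
ax.plot(df["EPOCH"], df["DEV_F1"], label="dev micro-F1")
ax.set_xlabel("epoch")
ax.legend()
plt.show()
```

DEV_F1 peaks at epoch 5 (0.3933) while TRAIN_LOSS keeps falling, which matches training.log below: no "saving best model" line appears after epoch 5, and the final evaluation restores that checkpoint.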
test.tsv ADDED
The diff for this file is too large to render. See raw diff
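
dev.tsv and test.tsv contain the per-token predictions behind the dev and test scores. A sketch of a reader, assuming the usual Flair evaluation layout of one "token gold_label predicted_label" triple per line with blank lines between sentences (the exact layout varies across Flair versions, so check the raw file first):

```python
# Sketch of a reader for dev.tsv / test.tsv; the column layout is an
# assumption -- verify against the raw file before relying on it.
from typing import List, Tuple

def read_predictions(path: str) -> List[List[Tuple[str, str, str]]]:
    sentences, current = [], []
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line:  # blank line = sentence boundary
                if current:
                    sentences.append(current)
                    current = []
                continue
            token, gold, predicted = line.split()  # whitespace-separated
            current.append((token, gold, predicted))
    if current:
        sentences.append(current)
    return sentences

# Quick sanity check: token-level label disagreement on the test split.
sents = read_predictions("test.tsv")
pairs = [(g, p) for sent in sents for _, g, p in sent]
mismatches = sum(g != p for g, p in pairs)
print(f"{mismatches}/{len(pairs)} token labels differ from gold")
```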
 
training.log ADDED
@@ -0,0 +1,240 @@
+ 2023-10-15 23:45:36,931 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:45:36,932 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=17, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-15 23:45:36,932 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:45:36,932 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
+  - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
+ 2023-10-15 23:45:36,932 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:45:36,932 Train: 20847 sentences
+ 2023-10-15 23:45:36,932 (train_with_dev=False, train_with_test=False)
+ 2023-10-15 23:45:36,932 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:45:36,932 Training Params:
+ 2023-10-15 23:45:36,932  - learning_rate: "5e-05"
+ 2023-10-15 23:45:36,933  - mini_batch_size: "4"
+ 2023-10-15 23:45:36,933  - max_epochs: "10"
+ 2023-10-15 23:45:36,933  - shuffle: "True"
+ 2023-10-15 23:45:36,933 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:45:36,933 Plugins:
+ 2023-10-15 23:45:36,933  - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-15 23:45:36,933 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:45:36,933 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-15 23:45:36,933  - metric: "('micro avg', 'f1-score')"
+ 2023-10-15 23:45:36,933 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:45:36,933 Computation:
+ 2023-10-15 23:45:36,933  - compute on device: cuda:0
+ 2023-10-15 23:45:36,933  - embedding storage: none
+ 2023-10-15 23:45:36,933 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:45:36,933 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
+ 2023-10-15 23:45:36,933 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:45:36,933 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:46:01,690 epoch 1 - iter 521/5212 - loss 1.37202838 - time (sec): 24.76 - samples/sec: 1438.81 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-15 23:46:26,941 epoch 1 - iter 1042/5212 - loss 0.87856005 - time (sec): 50.01 - samples/sec: 1460.52 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-15 23:46:51,993 epoch 1 - iter 1563/5212 - loss 0.68736436 - time (sec): 75.06 - samples/sec: 1446.56 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-15 23:47:17,119 epoch 1 - iter 2084/5212 - loss 0.58462865 - time (sec): 100.18 - samples/sec: 1440.83 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-15 23:47:42,651 epoch 1 - iter 2605/5212 - loss 0.51553132 - time (sec): 125.72 - samples/sec: 1437.34 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-15 23:48:08,713 epoch 1 - iter 3126/5212 - loss 0.46300876 - time (sec): 151.78 - samples/sec: 1438.92 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-15 23:48:33,975 epoch 1 - iter 3647/5212 - loss 0.43081269 - time (sec): 177.04 - samples/sec: 1443.11 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-15 23:48:59,438 epoch 1 - iter 4168/5212 - loss 0.40412772 - time (sec): 202.50 - samples/sec: 1440.23 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-15 23:49:24,710 epoch 1 - iter 4689/5212 - loss 0.38575539 - time (sec): 227.78 - samples/sec: 1435.07 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-15 23:49:50,594 epoch 1 - iter 5210/5212 - loss 0.36876867 - time (sec): 253.66 - samples/sec: 1448.37 - lr: 0.000050 - momentum: 0.000000
+ 2023-10-15 23:49:50,688 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:49:50,688 EPOCH 1 done: loss 0.3687 - lr: 0.000050
+ 2023-10-15 23:49:56,687 DEV : loss 0.12752242386341095 - f1-score (micro avg)  0.3017
+ 2023-10-15 23:49:56,714 saving best model
+ 2023-10-15 23:49:57,179 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:50:22,567 epoch 2 - iter 521/5212 - loss 0.20030192 - time (sec): 25.39 - samples/sec: 1528.79 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-15 23:50:47,800 epoch 2 - iter 1042/5212 - loss 0.19679440 - time (sec): 50.62 - samples/sec: 1482.50 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-15 23:51:13,032 epoch 2 - iter 1563/5212 - loss 0.18526110 - time (sec): 75.85 - samples/sec: 1470.72 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-15 23:51:38,382 epoch 2 - iter 2084/5212 - loss 0.18140515 - time (sec): 101.20 - samples/sec: 1471.11 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-15 23:52:03,691 epoch 2 - iter 2605/5212 - loss 0.18582562 - time (sec): 126.51 - samples/sec: 1464.80 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-15 23:52:28,906 epoch 2 - iter 3126/5212 - loss 0.18577517 - time (sec): 151.72 - samples/sec: 1469.25 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-15 23:52:53,821 epoch 2 - iter 3647/5212 - loss 0.18828427 - time (sec): 176.64 - samples/sec: 1465.00 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-15 23:53:19,622 epoch 2 - iter 4168/5212 - loss 0.18599284 - time (sec): 202.44 - samples/sec: 1470.56 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-15 23:53:44,440 epoch 2 - iter 4689/5212 - loss 0.18772856 - time (sec): 227.26 - samples/sec: 1455.41 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-15 23:54:09,398 epoch 2 - iter 5210/5212 - loss 0.18732109 - time (sec): 252.22 - samples/sec: 1455.77 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-15 23:54:09,513 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:54:09,513 EPOCH 2 done: loss 0.1873 - lr: 0.000044
+ 2023-10-15 23:54:18,685 DEV : loss 0.14554405212402344 - f1-score (micro avg)  0.3546
+ 2023-10-15 23:54:18,713 saving best model
+ 2023-10-15 23:54:19,332 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:54:44,600 epoch 3 - iter 521/5212 - loss 0.16892477 - time (sec): 25.26 - samples/sec: 1431.22 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-15 23:55:09,173 epoch 3 - iter 1042/5212 - loss 0.15480170 - time (sec): 49.84 - samples/sec: 1382.24 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-15 23:55:34,192 epoch 3 - iter 1563/5212 - loss 0.15253235 - time (sec): 74.86 - samples/sec: 1379.25 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-15 23:55:59,000 epoch 3 - iter 2084/5212 - loss 0.14966988 - time (sec): 99.66 - samples/sec: 1394.82 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-15 23:56:23,271 epoch 3 - iter 2605/5212 - loss 0.14248758 - time (sec): 123.94 - samples/sec: 1422.36 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-15 23:56:48,693 epoch 3 - iter 3126/5212 - loss 0.14250745 - time (sec): 149.36 - samples/sec: 1432.87 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-15 23:57:14,127 epoch 3 - iter 3647/5212 - loss 0.14348930 - time (sec): 174.79 - samples/sec: 1440.23 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-15 23:57:39,517 epoch 3 - iter 4168/5212 - loss 0.14496238 - time (sec): 200.18 - samples/sec: 1439.25 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-15 23:58:05,279 epoch 3 - iter 4689/5212 - loss 0.14349134 - time (sec): 225.94 - samples/sec: 1447.61 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-15 23:58:31,311 epoch 3 - iter 5210/5212 - loss 0.14146075 - time (sec): 251.98 - samples/sec: 1458.09 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-15 23:58:31,401 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:58:31,401 EPOCH 3 done: loss 0.1415 - lr: 0.000039
+ 2023-10-15 23:58:40,504 DEV : loss 0.27121227979660034 - f1-score (micro avg)  0.3217
+ 2023-10-15 23:58:40,532 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 23:59:05,417 epoch 4 - iter 521/5212 - loss 0.09043585 - time (sec): 24.88 - samples/sec: 1421.95 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-15 23:59:30,195 epoch 4 - iter 1042/5212 - loss 0.10494334 - time (sec): 49.66 - samples/sec: 1447.79 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-15 23:59:55,656 epoch 4 - iter 1563/5212 - loss 0.09724528 - time (sec): 75.12 - samples/sec: 1479.73 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-16 00:00:21,419 epoch 4 - iter 2084/5212 - loss 0.09872646 - time (sec): 100.89 - samples/sec: 1485.87 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-16 00:00:46,935 epoch 4 - iter 2605/5212 - loss 0.10353381 - time (sec): 126.40 - samples/sec: 1480.74 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-16 00:01:12,909 epoch 4 - iter 3126/5212 - loss 0.10532981 - time (sec): 152.38 - samples/sec: 1464.22 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-16 00:01:38,193 epoch 4 - iter 3647/5212 - loss 0.10412690 - time (sec): 177.66 - samples/sec: 1458.40 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-16 00:02:03,668 epoch 4 - iter 4168/5212 - loss 0.10176002 - time (sec): 203.14 - samples/sec: 1462.23 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-16 00:02:28,489 epoch 4 - iter 4689/5212 - loss 0.10219116 - time (sec): 227.96 - samples/sec: 1455.00 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-16 00:02:53,672 epoch 4 - iter 5210/5212 - loss 0.10091424 - time (sec): 253.14 - samples/sec: 1450.58 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-16 00:02:53,769 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:02:53,769 EPOCH 4 done: loss 0.1010 - lr: 0.000033
+ 2023-10-16 00:03:02,079 DEV : loss 0.24610090255737305 - f1-score (micro avg)  0.3722
+ 2023-10-16 00:03:02,108 saving best model
+ 2023-10-16 00:03:02,652 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:03:28,355 epoch 5 - iter 521/5212 - loss 0.08946468 - time (sec): 25.70 - samples/sec: 1499.17 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-16 00:03:53,305 epoch 5 - iter 1042/5212 - loss 0.08272511 - time (sec): 50.65 - samples/sec: 1445.65 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-16 00:04:19,363 epoch 5 - iter 1563/5212 - loss 0.07858447 - time (sec): 76.71 - samples/sec: 1419.47 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-16 00:04:45,160 epoch 5 - iter 2084/5212 - loss 0.07638850 - time (sec): 102.51 - samples/sec: 1440.40 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-16 00:05:10,817 epoch 5 - iter 2605/5212 - loss 0.07945967 - time (sec): 128.16 - samples/sec: 1449.44 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-16 00:05:35,603 epoch 5 - iter 3126/5212 - loss 0.07924286 - time (sec): 152.95 - samples/sec: 1448.28 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-16 00:06:00,818 epoch 5 - iter 3647/5212 - loss 0.07893342 - time (sec): 178.16 - samples/sec: 1442.27 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-16 00:06:26,438 epoch 5 - iter 4168/5212 - loss 0.07696755 - time (sec): 203.78 - samples/sec: 1448.05 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-16 00:06:51,588 epoch 5 - iter 4689/5212 - loss 0.07559505 - time (sec): 228.93 - samples/sec: 1447.40 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-16 00:07:16,799 epoch 5 - iter 5210/5212 - loss 0.07550940 - time (sec): 254.14 - samples/sec: 1445.61 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-16 00:07:16,884 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:07:16,885 EPOCH 5 done: loss 0.0756 - lr: 0.000028
+ 2023-10-16 00:07:25,246 DEV : loss 0.27587252855300903 - f1-score (micro avg)  0.3933
+ 2023-10-16 00:07:25,276 saving best model
+ 2023-10-16 00:07:25,899 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:07:51,242 epoch 6 - iter 521/5212 - loss 0.04500721 - time (sec): 25.34 - samples/sec: 1438.74 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-16 00:08:16,452 epoch 6 - iter 1042/5212 - loss 0.05219254 - time (sec): 50.55 - samples/sec: 1458.09 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-16 00:08:41,812 epoch 6 - iter 1563/5212 - loss 0.05073335 - time (sec): 75.91 - samples/sec: 1454.84 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-16 00:09:07,218 epoch 6 - iter 2084/5212 - loss 0.05244273 - time (sec): 101.32 - samples/sec: 1458.61 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-16 00:09:32,274 epoch 6 - iter 2605/5212 - loss 0.05597317 - time (sec): 126.37 - samples/sec: 1450.57 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-16 00:09:58,365 epoch 6 - iter 3126/5212 - loss 0.05870564 - time (sec): 152.46 - samples/sec: 1447.89 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-16 00:10:23,762 epoch 6 - iter 3647/5212 - loss 0.05885555 - time (sec): 177.86 - samples/sec: 1458.55 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-16 00:10:49,124 epoch 6 - iter 4168/5212 - loss 0.05780173 - time (sec): 203.22 - samples/sec: 1461.81 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-16 00:11:14,399 epoch 6 - iter 4689/5212 - loss 0.05720709 - time (sec): 228.50 - samples/sec: 1459.04 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-16 00:11:39,181 epoch 6 - iter 5210/5212 - loss 0.05635932 - time (sec): 253.28 - samples/sec: 1448.97 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-16 00:11:39,333 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:11:39,333 EPOCH 6 done: loss 0.0563 - lr: 0.000022
+ 2023-10-16 00:11:47,640 DEV : loss 0.3152172863483429 - f1-score (micro avg)  0.3546
+ 2023-10-16 00:11:47,669 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:12:12,705 epoch 7 - iter 521/5212 - loss 0.05081149 - time (sec): 25.03 - samples/sec: 1394.27 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-16 00:12:37,682 epoch 7 - iter 1042/5212 - loss 0.04120887 - time (sec): 50.01 - samples/sec: 1432.98 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-16 00:13:03,240 epoch 7 - iter 1563/5212 - loss 0.04350707 - time (sec): 75.57 - samples/sec: 1433.54 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-16 00:13:29,103 epoch 7 - iter 2084/5212 - loss 0.04429995 - time (sec): 101.43 - samples/sec: 1456.00 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-16 00:13:54,490 epoch 7 - iter 2605/5212 - loss 0.04318346 - time (sec): 126.82 - samples/sec: 1447.52 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-16 00:14:19,664 epoch 7 - iter 3126/5212 - loss 0.04497001 - time (sec): 151.99 - samples/sec: 1451.89 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-16 00:14:44,996 epoch 7 - iter 3647/5212 - loss 0.04321396 - time (sec): 177.33 - samples/sec: 1459.04 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-16 00:15:10,347 epoch 7 - iter 4168/5212 - loss 0.04275074 - time (sec): 202.68 - samples/sec: 1460.57 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-16 00:15:36,189 epoch 7 - iter 4689/5212 - loss 0.04205276 - time (sec): 228.52 - samples/sec: 1443.40 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-16 00:16:01,971 epoch 7 - iter 5210/5212 - loss 0.04183362 - time (sec): 254.30 - samples/sec: 1444.10 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-16 00:16:02,089 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:16:02,090 EPOCH 7 done: loss 0.0418 - lr: 0.000017
+ 2023-10-16 00:16:10,408 DEV : loss 0.3469476103782654 - f1-score (micro avg)  0.3672
+ 2023-10-16 00:16:10,453 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:16:37,044 epoch 8 - iter 521/5212 - loss 0.02217721 - time (sec): 26.59 - samples/sec: 1468.68 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-16 00:17:02,282 epoch 8 - iter 1042/5212 - loss 0.02729592 - time (sec): 51.83 - samples/sec: 1486.36 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-16 00:17:27,560 epoch 8 - iter 1563/5212 - loss 0.02832787 - time (sec): 77.10 - samples/sec: 1473.74 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-16 00:17:53,339 epoch 8 - iter 2084/5212 - loss 0.02989946 - time (sec): 102.88 - samples/sec: 1460.61 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-16 00:18:18,756 epoch 8 - iter 2605/5212 - loss 0.02875976 - time (sec): 128.30 - samples/sec: 1453.04 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-16 00:18:44,249 epoch 8 - iter 3126/5212 - loss 0.02972707 - time (sec): 153.79 - samples/sec: 1457.00 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-16 00:19:08,428 epoch 8 - iter 3647/5212 - loss 0.02955815 - time (sec): 177.97 - samples/sec: 1459.29 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-16 00:19:32,685 epoch 8 - iter 4168/5212 - loss 0.02992807 - time (sec): 202.23 - samples/sec: 1463.77 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-16 00:19:56,993 epoch 8 - iter 4689/5212 - loss 0.02991417 - time (sec): 226.54 - samples/sec: 1458.61 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-16 00:20:22,203 epoch 8 - iter 5210/5212 - loss 0.02960201 - time (sec): 251.75 - samples/sec: 1459.33 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-16 00:20:22,290 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:20:22,290 EPOCH 8 done: loss 0.0296 - lr: 0.000011
+ 2023-10-16 00:20:31,432 DEV : loss 0.4151654839515686 - f1-score (micro avg)  0.3626
+ 2023-10-16 00:20:31,464 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:20:56,517 epoch 9 - iter 521/5212 - loss 0.01705744 - time (sec): 25.05 - samples/sec: 1507.99 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-16 00:21:21,288 epoch 9 - iter 1042/5212 - loss 0.02156692 - time (sec): 49.82 - samples/sec: 1460.72 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-16 00:21:46,082 epoch 9 - iter 1563/5212 - loss 0.02449670 - time (sec): 74.62 - samples/sec: 1440.44 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-16 00:22:10,930 epoch 9 - iter 2084/5212 - loss 0.02337409 - time (sec): 99.46 - samples/sec: 1440.07 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-16 00:22:36,386 epoch 9 - iter 2605/5212 - loss 0.02237419 - time (sec): 124.92 - samples/sec: 1453.59 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-16 00:23:01,612 epoch 9 - iter 3126/5212 - loss 0.02194368 - time (sec): 150.15 - samples/sec: 1459.66 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-16 00:23:26,731 epoch 9 - iter 3647/5212 - loss 0.02224411 - time (sec): 175.26 - samples/sec: 1458.11 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-16 00:23:51,921 epoch 9 - iter 4168/5212 - loss 0.02139231 - time (sec): 200.46 - samples/sec: 1462.79 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-16 00:24:17,364 epoch 9 - iter 4689/5212 - loss 0.02129007 - time (sec): 225.90 - samples/sec: 1465.66 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-16 00:24:42,558 epoch 9 - iter 5210/5212 - loss 0.02095764 - time (sec): 251.09 - samples/sec: 1463.02 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-16 00:24:42,651 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:24:42,652 EPOCH 9 done: loss 0.0210 - lr: 0.000006
+ 2023-10-16 00:24:52,948 DEV : loss 0.40386104583740234 - f1-score (micro avg)  0.3607
+ 2023-10-16 00:24:52,983 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:25:18,379 epoch 10 - iter 521/5212 - loss 0.01307094 - time (sec): 25.39 - samples/sec: 1403.60 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-16 00:25:43,265 epoch 10 - iter 1042/5212 - loss 0.01629748 - time (sec): 50.28 - samples/sec: 1423.93 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-16 00:26:08,138 epoch 10 - iter 1563/5212 - loss 0.01516469 - time (sec): 75.15 - samples/sec: 1421.72 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-16 00:26:33,158 epoch 10 - iter 2084/5212 - loss 0.01544442 - time (sec): 100.17 - samples/sec: 1426.48 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-16 00:26:58,058 epoch 10 - iter 2605/5212 - loss 0.01529237 - time (sec): 125.07 - samples/sec: 1425.16 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-16 00:27:23,342 epoch 10 - iter 3126/5212 - loss 0.01463251 - time (sec): 150.36 - samples/sec: 1442.88 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-16 00:27:48,516 epoch 10 - iter 3647/5212 - loss 0.01441137 - time (sec): 175.53 - samples/sec: 1454.61 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-16 00:28:14,171 epoch 10 - iter 4168/5212 - loss 0.01390284 - time (sec): 201.19 - samples/sec: 1462.31 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-16 00:28:39,493 epoch 10 - iter 4689/5212 - loss 0.01394799 - time (sec): 226.51 - samples/sec: 1464.10 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-16 00:29:04,370 epoch 10 - iter 5210/5212 - loss 0.01412248 - time (sec): 251.39 - samples/sec: 1461.38 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-16 00:29:04,465 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:29:04,465 EPOCH 10 done: loss 0.0141 - lr: 0.000000
+ 2023-10-16 00:29:13,621 DEV : loss 0.41680729389190674 - f1-score (micro avg)  0.3728
+ 2023-10-16 00:29:14,169 ----------------------------------------------------------------------------------------------------
+ 2023-10-16 00:29:14,171 Loading model from best epoch ...
+ 2023-10-16 00:29:15,700 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
+ 2023-10-16 00:29:30,958
+ Results:
+ - F-score (micro) 0.4193
+ - F-score (macro) 0.2665
+ - Accuracy 0.2701
+
+ By class:
+               precision    recall  f1-score   support
+
+          LOC     0.4918    0.5206    0.5058      1214
+          PER     0.3522    0.4084    0.3782       808
+          ORG     0.2314    0.1501    0.1821       353
+    HumanProd     0.0000    0.0000    0.0000        15
+
+    micro avg     0.4141    0.4247    0.4193      2390
+    macro avg     0.2689    0.2698    0.2665      2390
+ weighted avg     0.4031    0.4247    0.4117      2390
+
+ 2023-10-16 00:29:30,958 ----------------------------------------------------------------------------------------------------
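
For reference, the hyperparameters recorded above (dbmdz/bert-base-historic-multilingual-cased embeddings, batch size 4, 10 epochs, peak learning rate 5e-05, linear schedule with warmup fraction 0.1, no CRF) correspond roughly to the following Flair fine-tuning script. This is a reconstruction from the log rather than the original script; the pooling and layer settings are inferred from the base-path name, and the output path is a placeholder:

```python
# Rough reconstruction of the training setup in training.log; a sketch,
# not the original script. Pooling/layer/CRF choices are inferred from the
# base-path name ("poolingfirst", "layers-1", "crfFalse").
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# NER_HIPE_2022 newseye/de, as listed in the MultiCorpus line of the log.
corpus = NER_HIPE_2022(dataset_name="newseye", language="de")
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-cased",
    layers="-1",               # last layer only ("layers-1" in the base path)
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,           # unused since use_rnn=False
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,             # "crfFalse" in the base path
    use_rnn=False,
)

# fine_tune() uses AdamW with a linear schedule and warmup_fraction 0.1 by
# default, matching the LinearScheduler plugin line in the log.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "output/hmbench-newseye-de",  # placeholder path, not the one in the log
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
)
```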