beomi committed on
Commit a8e8214
1 Parent(s): 081e85a

Update README.md

Files changed (1): README.md (+77 -0)
@@ -138,6 +138,83 @@ But Llama 3 is a new technology, and like any new technology, there are risks as
 
  Please see the Responsible Use Guide available at [http://llama.meta.com/responsible-use-guide](http://llama.meta.com/responsible-use-guide)
 
+ ## Benchmark Scores
+
+ - vllm (pretrained=beomi/Llama-3-Open-Ko-8B,revision=081e85a,tensor_parallel_size=1,dtype=bfloat16,data_parallel_size=2,gpu_memory_utilization=0.8), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: auto
+
+ | Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
+ |----------------------------------------------------------|-------|------|-----:|-----------|-----:|---|------|
+ |haerae |N/A |none | 5|acc |0.6801|± |0.0138|
+ | | |none | 5|acc_norm |0.6801|± |0.0138|
+ | - haerae_general_knowledge | 1|none | 5|acc |0.4375|± |0.0375|
+ | | |none | 5|acc_norm |0.4375|± |0.0375|
+ | - haerae_history | 1|none | 5|acc |0.7340|± |0.0323|
+ | | |none | 5|acc_norm |0.7340|± |0.0323|
+ | - haerae_loan_word | 1|none | 5|acc |0.7870|± |0.0316|
+ | | |none | 5|acc_norm |0.7870|± |0.0316|
+ | - haerae_rare_word | 1|none | 5|acc |0.7012|± |0.0228|
+ | | |none | 5|acc_norm |0.7012|± |0.0228|
+ | - haerae_standard_nomenclature | 1|none | 5|acc |0.7190|± |0.0365|
+ | | |none | 5|acc_norm |0.7190|± |0.0365|
+ |kmmlu_direct |N/A |none | 5|exact_match|0.4054|± |0.0026|
+ | - kmmlu_direct_accounting | 2|none | 5|exact_match|0.3600|± |0.0482|
+ | - kmmlu_direct_agricultural_sciences | 2|none | 5|exact_match|0.3130|± |0.0147|
+ | - kmmlu_direct_aviation_engineering_and_maintenance | 2|none | 5|exact_match|0.3690|± |0.0153|
+ | - kmmlu_direct_biology | 2|none | 5|exact_match|0.3330|± |0.0149|
+ | - kmmlu_direct_chemical_engineering | 2|none | 5|exact_match|0.4190|± |0.0156|
+ | - kmmlu_direct_chemistry | 2|none | 5|exact_match|0.3833|± |0.0199|
+ | - kmmlu_direct_civil_engineering | 2|none | 5|exact_match|0.3870|± |0.0154|
+ | - kmmlu_direct_computer_science | 2|none | 5|exact_match|0.6340|± |0.0152|
+ | - kmmlu_direct_construction | 2|none | 5|exact_match|0.3340|± |0.0149|
+ | - kmmlu_direct_criminal_law | 2|none | 5|exact_match|0.2850|± |0.0320|
+ | - kmmlu_direct_ecology | 2|none | 5|exact_match|0.4210|± |0.0156|
+ | - kmmlu_direct_economics | 2|none | 5|exact_match|0.4077|± |0.0433|
+ | - kmmlu_direct_education | 2|none | 5|exact_match|0.5000|± |0.0503|
+ | - kmmlu_direct_electrical_engineering | 2|none | 5|exact_match|0.3620|± |0.0152|
+ | - kmmlu_direct_electronics_engineering | 2|none | 5|exact_match|0.4790|± |0.0158|
+ | - kmmlu_direct_energy_management | 2|none | 5|exact_match|0.3110|± |0.0146|
+ | - kmmlu_direct_environmental_science | 2|none | 5|exact_match|0.3210|± |0.0148|
+ | - kmmlu_direct_fashion | 2|none | 5|exact_match|0.4190|± |0.0156|
+ | - kmmlu_direct_food_processing | 2|none | 5|exact_match|0.3600|± |0.0152|
+ | - kmmlu_direct_gas_technology_and_engineering | 2|none | 5|exact_match|0.3320|± |0.0149|
+ | - kmmlu_direct_geomatics | 2|none | 5|exact_match|0.3640|± |0.0152|
+ | - kmmlu_direct_health | 2|none | 5|exact_match|0.5100|± |0.0502|
+ | - kmmlu_direct_industrial_engineer | 2|none | 5|exact_match|0.3970|± |0.0155|
+ | - kmmlu_direct_information_technology | 2|none | 5|exact_match|0.5720|± |0.0157|
+ | - kmmlu_direct_interior_architecture_and_design | 2|none | 5|exact_match|0.4740|± |0.0158|
+ | - kmmlu_direct_korean_history | 2|none | 5|exact_match|0.2700|± |0.0446|
+ | - kmmlu_direct_law | 2|none | 5|exact_match|0.3990|± |0.0155|
+ | - kmmlu_direct_machine_design_and_manufacturing | 2|none | 5|exact_match|0.4080|± |0.0155|
+ | - kmmlu_direct_management | 2|none | 5|exact_match|0.4660|± |0.0158|
+ | - kmmlu_direct_maritime_engineering | 2|none | 5|exact_match|0.4417|± |0.0203|
+ | - kmmlu_direct_marketing | 2|none | 5|exact_match|0.6720|± |0.0149|
+ | - kmmlu_direct_materials_engineering | 2|none | 5|exact_match|0.4130|± |0.0156|
+ | - kmmlu_direct_math | 2|none | 5|exact_match|0.2567|± |0.0253|
+ | - kmmlu_direct_mechanical_engineering | 2|none | 5|exact_match|0.3800|± |0.0154|
+ | - kmmlu_direct_nondestructive_testing | 2|none | 5|exact_match|0.3890|± |0.0154|
+ | - kmmlu_direct_patent | 2|none | 5|exact_match|0.2700|± |0.0446|
+ | - kmmlu_direct_political_science_and_sociology | 2|none | 5|exact_match|0.4433|± |0.0287|
+ | - kmmlu_direct_psychology | 2|none | 5|exact_match|0.3620|± |0.0152|
+ | - kmmlu_direct_public_safety | 2|none | 5|exact_match|0.3200|± |0.0148|
+ | - kmmlu_direct_railway_and_automotive_engineering | 2|none | 5|exact_match|0.3200|± |0.0148|
+ | - kmmlu_direct_real_estate | 2|none | 5|exact_match|0.3650|± |0.0341|
+ | - kmmlu_direct_refrigerating_machinery | 2|none | 5|exact_match|0.3210|± |0.0148|
+ | - kmmlu_direct_social_welfare | 2|none | 5|exact_match|0.4500|± |0.0157|
+ | - kmmlu_direct_taxation | 2|none | 5|exact_match|0.3550|± |0.0339|
+ | - kmmlu_direct_telecommunications_and_wireless_technology| 2|none | 5|exact_match|0.5490|± |0.0157|
+ |kobest_boolq | 1|none | 5|acc |0.7984|± |0.0107|
+ | | |none | 5|f1 |0.7961|± |N/A |
+ |kobest_copa | 1|none | 5|acc |0.8150|± |0.0123|
+ | | |none | 5|f1 |0.8148|± |N/A |
+ |kobest_hellaswag | 1|none | 5|acc |0.4800|± |0.0224|
+ | | |none | 5|f1 |0.4771|± |N/A |
+ | | |none | 5|acc_norm |0.6120|± |0.0218|
+ |kobest_sentineg | 1|none | 5|acc |0.9597|± |0.0099|
+ | | |none | 5|f1 |0.9597|± |N/A |
+ |haerae |N/A |none | 5|acc |0.6801|± |0.0138|
+ | | |none | 5|acc_norm |0.6801|± |0.0138|
+ |kmmlu_direct|N/A |none | 5|exact_match|0.4054|± |0.0026|
+
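
Reviewer note on the Stderr column: the reported values look consistent with the sample standard error of a mean of 0/1 scores, `sqrt(p * (1 - p) / (n - 1))`. That formula is an assumption about how the evaluation tool computes the column, not something this commit states. The sketch below recomputes a few rows under that assumption and estimates the number of examples behind each one. The helper names `binomial_stderr` and `implied_sample_count` are hypothetical, and the implied counts are only approximate because the table rounds to four decimals.

```python
import math


def binomial_stderr(acc: float, n: int) -> float:
    """Sample standard error of the mean of n binary (0/1) scores.

    Assumes the n - 1 (Bessel-corrected) variance; this is a guess at how
    the Stderr column was produced, not a documented fact.
    """
    return math.sqrt(acc * (1.0 - acc) / (n - 1))


def implied_sample_count(acc: float, stderr: float) -> int:
    """Invert binomial_stderr to estimate how many examples a task had."""
    return round(acc * (1.0 - acc) / stderr ** 2) + 1


# (Value, Stderr) pairs copied from the benchmark table in this commit.
rows = {
    "kmmlu_direct_accounting": (0.3600, 0.0482),
    "kmmlu_direct_biology": (0.3330, 0.0149),
    "kobest_copa": (0.8150, 0.0123),
}

for task, (acc, stderr) in rows.items():
    n = implied_sample_count(acc, stderr)
    recomputed = binomial_stderr(acc, n)
    print(f"{task}: implied n ~ {n}, recomputed stderr = {recomputed:.4f}")
```

If the assumption holds, the recomputed value for each row matches the reported Stderr to four decimals. The same check gives a rough sense of which subsets are small (stderr near 0.05) and which are large (stderr near 0.015).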
 
  ## Citation instructions