MT-Bench Results

by 0-hero - opened Apr 17

0-hero

Apr 17

edited Apr 17

MT-Bench

Model	MT-Bench
Claude 3 Opus	9.43
GPT-4-1106-Preview	9.32
Claude 3 Sonnet	9.18
WizardLM-2 8x22B	9.12
GPT-4-0314	8.96
Mixtral-8x22B-Instruct-v0.1	8.66
zephyr-orpo-141b-A35b-v0.1	8.17
Matter-0.2-8x22B	8.00

Asaf-Yehudai

Apr 17

Nice!
It will be interesting to see more benchmark results here.
I guess Mixtral-8x22B-Instruct-v0.1 is better at multilingual tasks than WizardLM-2 8x22B.
Maybe merging them can work even better :)
