mlx_lm.server gives wonky answers

#49
by conleysa

Hello! I am noticing that when I run llama-3-8b-instruct using mlx_lm.server, I get strange answers: for example, I send it a query and it responds with information about dog breeds. On the other hand, if I load the model with mlx_lm.load and call mlx_lm.generate directly, I get reasonable responses.
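
For reference, this is roughly what the direct path looks like for me (the model path and prompt are just examples):

```python
from mlx_lm import load, generate

# Example model path; any MLX-converted Llama 3 instruct model behaves the same for me
model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")

# Llama 3 instruct models expect their chat template, so apply it before generating
messages = [{"role": "user", "content": "Write a SQL query that counts rows per day."}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```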

Is there any reason the new llama-3 shouldn't work when served via mlx_lm.server?

I can run llama-2-13b as a server the same way and get reasonable responses.
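
In case it helps, this is roughly how I'm querying the server (assuming it was started with mlx_lm.server and is listening on the default localhost:8080 with the OpenAI-style chat completions endpoint):

```python
import requests

# Assumes the server was started with something like:
#   mlx_lm.server --model mlx-community/Meta-Llama-3-8B-Instruct-4bit
# and is listening on the default host/port (localhost:8080)
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "Write a SQL query that counts rows per day."}
        ],
        "max_tokens": 256,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```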
