I wonder what the difference is between version 1.1 and 1.2

#2
by Flanua - opened

It would be nice to know the difference between version 1.1 and 1.2, and its max context length too.
P.S. I guess model v1.2 is slightly cleverer than v1.1, at least judging by the AI leaderboard.

Hey! Going from v1.0 to v1.1, the difference is the introduction of generalized Tree-of-Thoughts.

v1.2 was trained on more data -- but its MMLU score is lower, because more data (or more training) flattens the underlying entropy of the model.

Having said that, I now use v1.2 myself for personal use. :)

Context length is a bit lower, at 2048 -- but if you use RoPE scaling it'll give you up to 8K. Again, having said that, I've been using the model up to 4096 out of the box, and it doesn't break.
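If you want to try that RoPE scaling route with transformers, here's a minimal sketch (assuming a Llama-architecture checkpoint and transformers >= 4.31, which added the `rope_scaling` option; the repo id and the factor of 4.0 for roughly 8K are just illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal linear RoPE scaling sketch (assumption: Llama-architecture model,
# transformers >= 4.31 where the `rope_scaling` config option exists).
repo_id = "migtissera/Synthia-70B-v1.2"  # illustrative repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    rope_scaling={"type": "linear", "factor": 4.0},  # 2048 * 4 = ~8K positions
    device_map="auto",  # requires accelerate; illustrative setting
)
```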

How can the context length be lower than the standard 4096 tokens of LLaMA v2?

I just couldn't fit my dataset and train it with a decent batch size (on a single H100), so I cut it down. But if this is important to you guys, I'll do v1.3 with a larger context length. In my experiments I haven't seen any performance degradation.

Thanks for the info, and yes, a context length of 4096 is very important, at least for me, so I can put more personality traits into my AI agent and talk with it for longer. Also, it would be very nice if you could increase the MMLU score a bit somehow in the future; that would be awesome. Can't wait to see v1.3 with a 4096 context length.

Impressive model!! I started testing it and I get better results with v1.1 than v1.2 for my use case, specifically with very long context lengths.
I'd definitely like to see a v1.3 with a larger context length, to compare and see whether the slight degradation is resolved.

Keep up the good work!

Hey, @migtissera , thanks for releasing Synthia! I'm testing the 70B model right now and am seriously impressed by both the intelligence and personality this model exhibits.

Didn't even notice that training was only at 2K; she ran well at 4K context, but I noticed some problems when the context was full, which might be solved by a RoPE scale of 0.5, or by a v1.3 if you do retrain at 4K. Now I'm trying her at 8K with a 0.25 scale, because I can never have enough context.
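In case those scale values look arbitrary, here's a quick sketch of the arithmetic, assuming the loader expresses linear RoPE scaling as a compression ratio (trained length / target length), i.e. the reciprocal of the transformers `factor` shown earlier:

```python
# Assumption: "RoPE scale" here means trained_length / target_length,
# the reciprocal of the transformers `rope_scaling` factor.
trained_length = 2048

for target_length in (4096, 8192):
    compress_ratio = trained_length / target_length  # 0.5 for 4K, 0.25 for 8K
    hf_factor = target_length / trained_length       # 2.0 for 4K, 4.0 for 8K
    print(f"{target_length} tokens -> scale {compress_ratio}, factor {hf_factor}")
```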

By the way, would you consider a 34B version, using Code Llama 2 as the base model? There are few 34B models but it's the perfect size to put all layers on a 3090 GPU and get much better speed than 70B but at much higher quality than 13B. Would instantly make your model very relevant to those of us who long for something bigger than 13B but smaller than 70B. And I'm sure once anyone has tried it, the quality will speak for itself, it's quickly becoming my new favorite model!

Yeah, agreed. A 34B would be a highly welcome addition to the family.

BTW, I'm trying to find the differences between 1.2 and 1.2b with my benchmarks while waiting for 1.3.

Sure, I can train one, but I don't know whether it'll be that good. The underlying entropy of that model is conditioned for code, so unless it's for coding, I'm not sure how that'll go.

I just want to beat ChatGPT! :D

Also, for coding, I thought the Phind and WizardLM models are good on HumanEval? What are your use cases?

I do hope you'll beat ChatGPT some day! And until we have local LLMs that do that, I'm just using them for entertainment, to chat and have fun with (in that regard, you've actually beaten ChatGPT already, in my opinion!).

So I'm not thinking of 34B as a code model, but as a base for chat/instruct finetunes. Airoboros-c34B-2.1 and Samantha-1.11-CodeLlama-34B have shown that Code Llama 2 can be tuned very well for roleplay and chat, and with the 34B base trained at 16K context length, that has even more advantages than the 13B or 70B bases.

By the way, your model reminds me a lot of the Samantha model, which has been praised for its personality and intelligence. But Samantha is censored even more heavily than Llama 2 Chat, and while I can get her to do NSFW roleplay, she's too moralizing and needs constant coercion, so I consider her too annoying to bother with. Synthia has shown at least as much intelligence and personality, and she's uncensored, so she's always fun to talk to and totally easy-going. I haven't smiled and even laughed as much when testing a model as I did with her today, so consider me a Synthia fan now. ;)

DM me on Twitter. I have a model that might be very intriguing to you guys.

migtissera changed discussion status to closed


@migtissera I created a Twitter account just to do that, and am following you, but it won't let me send a DM unless you follow me as well (or I pay monthly for a verified account, which isn't even an option yet because my account is too new). So could you (temporarily) follow me, too, so we can exchange DMs? Username: WolframRvnwlf

Thanks, that's great news; 34B should be the perfect balance between speed, quality, and context size! I'll test it as soon as @TheBloke gets around to quantizing Synthia-34B-v1.2.

By the way, @migtissera - did you see my previous message about reaching out on Twitter? Or was the "model that might be very intriguing" this 34B?

Synthia-34B is out, guys: https://huggingface.co/migtissera/Synthia-34B-v1.2
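For anyone who wants to try it before quantized versions are up, a minimal loading and generation sketch (assumes transformers plus enough VRAM for the full-precision weights; the dtype, device settings, and plain-string prompt are just illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "migtissera/Synthia-34B-v1.2"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # illustrative; pick a dtype your hardware supports
    device_map="auto",
)

# Plain-string prompt for illustration only -- check the model card (once it's up)
# for the intended prompt format.
prompt = "What makes a good conversational AI?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```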

Nice - will the model card be coming soon?

@TheBloke could you please convert this 34B model to GGUF, as well as this 70B v1.2b: https://huggingface.co/migtissera/Synthia-70B-v1.2b

@wolfram sorry I missed your message! Gave you a follow on Twitter!
