Better aligned models obtained by weak-to-strong model extrapolation (ExPO)
- Weak-to-Strong Extrapolation Expedites Alignment
  Paper • 2404.16792
- chujiezheng/LLaMA3-iterative-DPO-final-ExPO
  Text Generation
- chujiezheng/Mistral7B-PairRM-SPPO-ExPO
  Text Generation
- chujiezheng/Snorkel-Mistral-PairRM-DPO-ExPO
  Text Generation