FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design Paper • 2401.14112 • Published Jan 25 • 17
Article 🧑‍⚖️ "Replacing Judges with Juries" using distilabel By alvarobartt • 15 days ago • 14
Self-Play Preference Optimization for Language Model Alignment Paper • 2405.00675 • Published 16 days ago • 18
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published 15 days ago • 92
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published 19 days ago • 62
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation 19 days ago • 68
Article ⚗️ 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together By burtenshaw • 19 days ago • 25
Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • 21 days ago • 54
Article Releasing Youtube-Commons: a massive open corpus for conversational and multimodal data By Pclanglais • 30 days ago • 20
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4 • 57
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 172
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 54
Arcee's MergeKit: A Toolkit for Merging Large Language Models Paper • 2403.13257 • Published Mar 20 • 16
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models Paper • 2402.19427 • Published Feb 29 • 48
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 566
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated 3 days ago • 302
World Model on Million-Length Video And Language With RingAttention Paper • 2402.08268 • Published Feb 13 • 33
👑 Monarch Collection Family of 7B models that combine excellent reasoning and conversational abilities. • 7 items • Updated Mar 22 • 9
🐶 Beagle Collection Merges done using mergekit and LazyMergekit: https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb#scrollTo=d5mYzDo1q96y • 8 items • Updated Mar 22 • 6
Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling Paper • 2402.10211 • Published Feb 15 • 8
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 131
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback Paper • 2402.01391 • Published Feb 2 • 41
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model Paper • 2401.09417 • Published Jan 17 • 51
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models Paper • 2401.06066 • Published Jan 11 • 35
DocGraphLM: Documental Graph Language Model for Information Extraction Paper • 2401.02823 • Published Jan 5 • 32
Preference Datasets for DPO Collection This collection contains a list of curated preference datasets for DPO fine-tuning, aimed at intent alignment of LLMs • 7 items • Updated Apr 4 • 19
Datasets built with ⚗️ distilabel Collection This collection contains some datasets generated and/or labelled using https://github.com/argilla-io/distilabel • 5 items • Updated Apr 4 • 5
Can Language Models Solve Graph Problems in Natural Language? Paper • 2305.10037 • Published May 17, 2023 • 1
Comparing DPO with IPO and KTO Collection A collection of chat models to explore the differences between three alignment techniques: DPO, IPO, and KTO. • 56 items • Updated Jan 9 • 31
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2 • 61
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM Paper • 2401.02994 • Published Jan 4 • 44
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution Paper • 2401.03065 • Published Jan 5 • 10
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts Paper • 2401.04081 • Published Jan 8 • 68
LLM Augmented LLMs: Expanding Capabilities through Composition Paper • 2401.02412 • Published Jan 4 • 35
The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction Paper • 2312.13558 • Published Dec 21, 2023 • 5
Understanding LLMs: A Comprehensive Overview from Training to Inference Paper • 2401.02038 • Published Jan 4 • 59