FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design Paper • 2401.14112 • Published Jan 25 • 17
Article 🧑‍⚖️ "Replacing Judges with Juries" using distilabel By alvarobartt • 15 days ago • 14
Self-Play Preference Optimization for Language Model Alignment Paper • 2405.00675 • Published 16 days ago • 18
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published 15 days ago • 92
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published 19 days ago • 62
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation 19 days ago • 68
Article ⚗️ 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together By burtenshaw • 19 days ago • 25
Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • 21 days ago • 54
Article Releasing Youtube-Commons: a massive open corpus for conversational and multimodal data By Pclanglais • 30 days ago • 20
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4 • 57
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 172
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 54
Arcee's MergeKit: A Toolkit for Merging Large Language Models Paper • 2403.13257 • Published Mar 20 • 16
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models Paper • 2402.19427 • Published Feb 29 • 48
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 566
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated 3 days ago • 302
World Model on Million-Length Video And Language With RingAttention Paper • 2402.08268 • Published Feb 13 • 33
👑 Monarch Collection Family of 7B models that combine excellent reasoning and conversational abilities. • 7 items • Updated Mar 22 • 9
🐶 Beagle Collection Merges done using mergekit and LazyMergekit: https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb#scrollTo=d5mYzDo1q96y • 8 items • Updated Mar 22 • 6
Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling Paper • 2402.10211 • Published Feb 15 • 8
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 131
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback Paper • 2402.01391 • Published Feb 2 • 41
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model Paper • 2401.09417 • Published Jan 17 • 51
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models Paper • 2401.06066 • Published Jan 11 • 35
DocGraphLM: Documental Graph Language Model for Information Extraction Paper • 2401.02823 • Published Jan 5 • 32
Preference Datasets for DPO Collection This collection contains a list of curated preference datasets for DPO fine-tuning, aimed at intent alignment of LLMs • 7 items • Updated Apr 4 • 19
Datasets built with ⚗️ distilabel Collection This collection contains some datasets generated and/or labelled using https://github.com/argilla-io/distilabel • 5 items • Updated Apr 4 • 5
Can Language Models Solve Graph Problems in Natural Language? Paper • 2305.10037 • Published May 17, 2023 • 1
Comparing DPO with IPO and KTO Collection A collection of chat models to explore the differences between three alignment techniques: DPO, IPO, and KTO. • 56 items • Updated Jan 9 • 31
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2 • 61
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM Paper • 2401.02994 • Published Jan 4 • 44
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution Paper • 2401.03065 • Published Jan 5 • 10
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts Paper • 2401.04081 • Published Jan 8 • 68
LLM Augmented LLMs: Expanding Capabilities through Composition Paper • 2401.02412 • Published Jan 4 • 35
The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction Paper • 2312.13558 • Published Dec 21, 2023 • 5
Understanding LLMs: A Comprehensive Overview from Training to Inference Paper • 2401.02038 • Published Jan 4 • 59