- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
  Paper • 2402.14905 • Published • 81
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 567
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 122
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 99

Collections including paper arxiv:2404.19296
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 567
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 94
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 102
- TransformerFAM: Feedback attention is working memory
  Paper • 2404.09173 • Published • 42

- Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
  Paper • 2403.16990 • Published • 24
- ViTAR: Vision Transformer with Any Resolution
  Paper • 2403.18361 • Published • 48
- Getting it Right: Improving Spatial Consistency in Text-to-Image Models
  Paper • 2404.01197 • Published • 29
- Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
  Paper • 2404.01367 • Published • 19

- ODA: Observation-Driven Agent for integrating LLMs and Knowledge Graphs
  Paper • 2404.07677 • Published • 1
- ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
  Paper • 2404.07738 • Published • 2
- Scaling Instructable Agents Across Many Simulated Worlds
  Paper • 2404.10179 • Published • 23
- A Multimodal Automated Interpretability Agent
  Paper • 2404.14394 • Published • 19

- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 99
- Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
  Paper • 2404.00399 • Published • 39
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 102
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
  Paper • 2404.08801 • Published • 62