- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
  Paper • 2402.14905 • Published • 81
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 567
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 122
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 99

Collections including paper arxiv:2404.19296
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 567
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 94
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 102
- TransformerFAM: Feedback attention is working memory
  Paper • 2404.09173 • Published • 42

- Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
  Paper • 2403.16990 • Published • 24
- ViTAR: Vision Transformer with Any Resolution
  Paper • 2403.18361 • Published • 48
- Getting it Right: Improving Spatial Consistency in Text-to-Image Models
  Paper • 2404.01197 • Published • 29
- Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
  Paper • 2404.01367 • Published • 19

- ODA: Observation-Driven Agent for integrating LLMs and Knowledge Graphs
  Paper • 2404.07677 • Published • 1
- ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
  Paper • 2404.07738 • Published • 2
- Scaling Instructable Agents Across Many Simulated Worlds
  Paper • 2404.10179 • Published • 23
- A Multimodal Automated Interpretability Agent
  Paper • 2404.14394 • Published • 19

- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 99
- Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
  Paper • 2404.00399 • Published • 39
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 102
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
  Paper • 2404.08801 • Published • 62