MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14, 2024 • 122
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model Paper • 2402.03766 • Published Feb 6, 2024 • 9
Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models Paper • 2403.20331 • Published Mar 29, 2024 • 14
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10, 2024 • 93
FABLES: Evaluating faithfulness and content selection in book-length summarization Paper • 2404.01261 • Published Apr 1, 2024 • 3
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints Paper • 2305.13245 • Published May 22, 2023 • 5
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22, 2024 • 122
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores Paper • 2311.05908 • Published Nov 10, 2023 • 11
STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases Paper • 2404.13207 • Published Apr 19, 2024
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published Apr 29, 2024 • 115
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2, 2024 • 103
You Only Cache Once: Decoder-Decoder Architectures for Language Models Paper • 2405.05254 • Published May 2024 • 8
SUTRA: Scalable Multilingual Language Model Architecture Paper • 2405.06694 • Published May 2024 • 34
What matters when building vision-language models? Paper • 2405.02246 • Published May 2024 • 87
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model Paper • 2405.09215 • Published May 2024 • 14
Many-Shot In-Context Learning in Multimodal Foundation Models Paper • 2405.09798 • Published May 2024 • 25
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published May 2024 • 42
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention Paper • 2405.12981 • Published May 2024 • 23
Aya 23: Open Weight Releases to Further Multilingual Progress Paper • 2405.15032 • Published May 2024 • 21
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Paper • 2404.19752 • Published Apr 30, 2024 • 20
Jina CLIP: Your CLIP Model Is Also Your Text Retriever Paper • 2405.20204 • Published May 2024 • 19
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts Paper • 2405.19893 • Published May 2024 • 15
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images Paper • 2403.11703 • Published Mar 18, 2024 • 13