MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14, 2024 • 122
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model Paper • 2402.03766 • Published Feb 6, 2024 • 9
Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models Paper • 2403.20331 • Published Mar 29, 2024 • 14
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10, 2024 • 93
FABLES: Evaluating faithfulness and content selection in book-length summarization Paper • 2404.01261 • Published Apr 1, 2024 • 3
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints Paper • 2305.13245 • Published May 22, 2023 • 5
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22, 2024 • 122
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores Paper • 2311.05908 • Published Nov 10, 2023 • 11
STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases Paper • 2404.13207 • Published Apr 19, 2024
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published Apr 29, 2024 • 115
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2, 2024 • 103
You Only Cache Once: Decoder-Decoder Architectures for Language Models Paper • 2405.05254 • Published May 2024 • 8
SUTRA: Scalable Multilingual Language Model Architecture Paper • 2405.06694 • Published May 2024 • 34
What matters when building vision-language models? Paper • 2405.02246 • Published May 2024 • 87
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model Paper • 2405.09215 • Published May 2024 • 14
Many-Shot In-Context Learning in Multimodal Foundation Models Paper • 2405.09798 • Published May 2024 • 25
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published May 2024 • 42
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention Paper • 2405.12981 • Published May 2024 • 23
Aya 23: Open Weight Releases to Further Multilingual Progress Paper • 2405.15032 • Published May 2024 • 21
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Paper • 2404.19752 • Published Apr 30, 2024 • 20
Jina CLIP: Your CLIP Model Is Also Your Text Retriever Paper • 2405.20204 • Published May 2024 • 19
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts Paper • 2405.19893 • Published May 2024 • 15
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images Paper • 2403.11703 • Published Mar 18, 2024 • 13