Molbap (Pablo Montalvo)

upvoted 2 articles 4 days ago

Article

Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task

4 days ago

• 15

Article

License to Call: Introducing Transformers Agents 2.0

8 days ago

• 65

upvoted a collection 5 days ago

PaliGemma Release

Collection

Pretrained and mix checkpoints for PaliGemma • 11 items • Updated 3 days ago • 91

upvoted an article 5 days ago

Article

2024-04-22 - Hub Incident Post Mortem

3 days ago

• 15

upvoted an article 6 days ago

Article

PaliGemma – Google's Cutting-Edge Open Vision Language Model

7 days ago

• 98

upvoted a paper about 1 month ago

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

Paper • 2404.06512 • Published Apr 9 • 29

upvoted a paper 3 months ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 566

upvoted 4 papers 4 months ago

upvoted a collection 4 months ago

SigLIP

Collection

Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 8 items • Updated 6 days ago • 24

upvoted a paper 5 months ago

Distributed Inference and Fine-tuning of Large Language Models Over The Internet

Paper • 2312.08361 • Published Dec 13, 2023 • 23

upvoted a paper 6 months ago

GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 171

upvoted a paper 10 months ago

Retentive Network: A Successor to Transformer for Large Language Models

Paper • 2307.08621 • Published Jul 17, 2023 • 167

Pablo Montalvo PRO

Molbap's activity

Small Language Model Meets with Reinforced Vision Vocabulary

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark

Scalable Pre-training of Large Autoregressive Image Models
