victor (Victor Mustar)

upvoted an article 1 day ago

Article

Evaling llm-jp-eval (evals are hard)

1 day ago

• 2

upvoted a paper 1 day ago

Sakuga-42M Dataset: Scaling Up Cartoon Research

Paper • 2405.07425 • Published 6 days ago • 3

upvoted 2 collections 2 days ago

everything-ai

Collection

PaliGemma Release

Collection

Pretrained and mix checkpoints for PaliGemma • 11 items • Updated 2 days ago • 88

upvoted 2 collections 3 days ago

Compressed LLMs for nm-vllm

Collection

LLMs compressed using SparseGPT and GPTQ for optimized inference with nm-vllm https://github.com/neuralmagic/nm-vllm • 17 items • Updated 9 days ago • 7

Sparse Foundational Llama 2 Models

Collection

Sparse pre-trained and fine-tuned Llama models made by Neural Magic + Cerebras • 27 items • Updated 1 day ago • 5

upvoted a paper 3 days ago

Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment

Paper • 2405.03594 • Published 13 days ago • 6

upvoted 3 articles 3 days ago

Article

Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task

3 days ago

• 15

Article

Hugging Face + Google Visual Blocks

3 days ago

• 16

Article

PaliGemma – Google's Cutting-Edge Open Vision Language Model

5 days ago

• 95

upvoted a paper 4 days ago

LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

Paper • 2405.00732 • Published 20 days ago • 107

upvoted a collection 4 days ago

Yi-1.5 (2024/05)

Collection

6 items • Updated 7 days ago • 61

upvoted 3 articles 5 days ago

Article

Introducing the Open Arabic LLM Leaderboard

5 days ago

• 40

Article

Adapt custom AI models to the trainer API and to 🤗

5 days ago

• 14

Article

Hugging Face x LangChain : A new partner package in LangChain

5 days ago

• 47

upvoted 5 collections 6 days ago

upvoted 2 articles 9 days ago

Article

Everything About Long Context Fine-tuning

9 days ago

• 9

Article

Inference for PROs

Sep 22, 2023

• 15

upvoted a paper 9 days ago

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

Paper • 1901.02860 • Published Jan 9, 2019 • 2

upvoted a paper 10 days ago

Chatbot is Not All You Need: Information-rich Prompting for More Realistic Responses

Paper • 2312.16233 • Published Dec 25, 2023 • 2

upvoted a collection 10 days ago

Awesome SFT datasets

Collection

A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12 • 89

upvoted an article 10 days ago

Article

SeeMoE: Implementing a MoE Vision Language Model from Scratch

13 days ago

• 24

upvoted a collection 11 days ago

Granite Code Models

Collection

A series of code models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 10 items • Updated 7 days ago • 118

upvoted 4 papers 13 days ago

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

Paper • 2405.01535 • Published 17 days ago • 92

On Bringing Robots Home

Paper • 2311.16098 • Published Nov 27, 2023 • 2

EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars

Paper • 2404.19110 • Published 19 days ago • 3

3D Gaussian Blendshapes for Head Avatar Animation

Paper • 2404.19398 • Published 19 days ago • 2

upvoted a collection 13 days ago

🎭 Avatars

Collection

The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 33 items • Updated 5 days ago • 49

upvoted 3 articles 16 days ago

Article

🧑‍⚖️ "Replacing Judges with Juries" using distilabel

16 days ago

• 14

Article

A Guide to Designing New Functional Proteins and Improving Protein Function, Stability, and Diversity with Generative AI

5 days ago

• 15

Article

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

18 days ago

• 50

upvoted 2 papers 17 days ago

InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation

Paper • 2404.19427 • Published 19 days ago • 64

Octopus v4: Graph of language models

Paper • 2404.19296 • Published 19 days ago • 89

upvoted a collection 17 days ago

ZeroGPU Spaces

Collection

ZeroGPU Spaces made by the community • 16 items • Updated 2 days ago • 145

upvoted a collection 18 days ago

GreenBitAI MLX LLM

Collection

GreenBitAI's Low-bit LLMs in MLX format • 69 items • Updated 12 days ago • 4

upvoted an article 18 days ago

Article

RAG chatbot using llama3

26 days ago

• 24

upvoted 2 papers 18 days ago

KAN: Kolmogorov-Arnold Networks

Paper • 2404.19756 • Published 19 days ago • 91

Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published 20 days ago • 62

upvoted an article 19 days ago

Article

Improving Prompt Consistency with Structured Generations

19 days ago

• 41

upvoted a paper 19 days ago

AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation

Paper • 2404.12753 • Published 30 days ago • 38

upvoted 2 articles 20 days ago

Article

Expanding Model Context and Creating Chat Models with a Single Click

21 days ago

• 29

Article

⚗️ 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together

20 days ago

• 25

upvoted 4 papers 20 days ago

PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning

Paper • 2404.16994 • Published 24 days ago • 30

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Paper • 2404.16821 • Published 24 days ago • 48

PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Paper • 2404.16022 • Published 25 days ago • 16

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

Paper • 2404.15653 • Published 25 days ago • 24

upvoted a collection 20 days ago

LLaVA++ (LLaMA-3 and Phi-3-Mini)

Collection

Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated 19 days ago • 21

upvoted a collection 23 days ago

Meta Llama 3

Collection

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated about 1 month ago • 523

upvoted an article 24 days ago

Article

Can We Train Chat Models with Raw Data?

24 days ago

• 17

upvoted a collection 24 days ago

Phi-3

Collection

Phi-3 family of models • 7 items • Updated 2 days ago • 200

upvoted a paper 24 days ago

How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

Paper • 2404.14047 • Published 27 days ago • 37

upvoted a collection 25 days ago

OpenELM Instruct Models

Collection

4 items • Updated Apr 12 • 96

upvoted a paper 25 days ago

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

Paper • 2404.14619 • Published 26 days ago • 120

upvoted an article 26 days ago

Article

LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!)

25 days ago

• 38

upvoted a paper 26 days ago

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published 27 days ago • 230

upvoted an article 26 days ago

Article

seemore: Implement a Vision Language Model from Scratch

7 days ago

• 41

Victor Mustar PRO

Articles

Inference for PROs
