Philipp Schmid

philschmid

https://www.philschmid.de

_philschmid

philschmid

AI & ML interests

None yet

Articles

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

16 days ago

• 47

Hugging Face Collaborates with Microsoft to Launch Hugging Face Model Catalog on Azure

May 24, 2023

Creating a Coding Assistant with StarCoder

May 9, 2023

Accelerating Hugging Face Transformers with AWS Inferentia2

Apr 17, 2023

Hugging Face and AWS partner to make AI more accessible

Feb 21, 2023

• 1

Pre-Train BERT with Hugging Face Transformers and Habana Gaudi

Aug 22, 2022

• 3

Convert Transformers to ONNX with Hugging Face Optimum

Jun 22, 2022

Accelerated Inference with Optimum and Transformers Pipelines

May 10, 2022

Accelerate BERT inference with Hugging Face Transformers and AWS Inferentia

Mar 16, 2022

Case Study: Millisecond Latency using Hugging Face Infinity and modern CPUs

Jan 13, 2022

Deploy GPT-J 6B for inference using Hugging Face Transformers and Amazon SageMaker

Jan 11, 2022

Few-shot learning in practice: GPT-NEO and the 🤗 Accelerated Inference API

Jun 3, 2021

• 1

Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker

Apr 8, 2021

Organizations

philschmid's activity

upvoted a collection 4 days ago

Yi-1.5 (2024/05)

Collection

6 items • Updated 4 days ago • 59

upvoted a paper 5 days ago

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Paper • 2405.04434 • Published 9 days ago • 6

upvoted a paper 15 days ago

Iterative Reasoning Preference Optimization

Paper • 2404.19733 • Published 16 days ago • 41

upvoted a paper 18 days ago

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

Paper • 2404.03715 • Published Apr 4 • 57

upvoted a paper 19 days ago

Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks

Paper • 2404.14723 • Published 24 days ago • 9

upvoted an article 28 days ago

Article

Welcome Llama 3 - Meta's new open LLM

29 days ago

• 238

upvoted a paper about 1 month ago

JetMoE: Reaching Llama2 Performance with 0.1M Dollars

Paper • 2404.07413 • Published Apr 11 • 32

upvoted 2 articles about 1 month ago

Article

Welcome Gemma - Google's new open LLM

Feb 21

• 9

Article

CodeGemma - an official Google release for code LLMs

Apr 9

• 95

upvoted a paper about 1 month ago

Octopus v2: On-device language model for super agent

Paper • 2404.01744 • Published Apr 2 • 53

upvoted a collection about 1 month ago

HF-curated models available on Workers AI

Collection

A collection of models curated with Hugging Face that can be run on Cloudflare's Workers AI serverless inference platform. • 15 items • Updated Apr 2 • 48

upvoted a paper about 2 months ago

Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28 • 98

upvoted a collection about 2 months ago

MoEs papers reading list

Collection

41 items • Updated 4 days ago • 122

upvoted 2 papers about 2 months ago

Aligning Modalities in Vision Large Language Models via Preference Fine-tuning

Paper • 2402.11411 • Published Feb 18 • 1

Simple and Scalable Strategies to Continually Pre-train Large Language Models

Paper • 2403.08763 • Published Mar 13 • 48

upvoted a paper 2 months ago

ORPO: Monolithic Preference Optimization without Reference Model

Paper • 2403.07691 • Published Mar 12 • 54

upvoted a collection 5 months ago

Awesome SFT datasets

Collection

A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12 • 89

upvoted 2 collections 7 months ago

Distil-Whisper Models

Collection

The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated Mar 21 • 34

Zephyr 7B

Collection

Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 137

upvoted a paper 7 months ago

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 116

upvoted a paper 8 months ago

Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 84

upvoted a paper 9 months ago

Code Llama: Open Foundation Models for Code

Paper • 2308.12950 • Published Aug 24, 2023 • 19

Articles

Welcome Llama 3 - Meta's new open LLM

Making thousands of open LLMs bloom in the Vertex AI Model Garden

CodeGemma - an official Google release for code LLMs

Bringing serverless GPU inference to Hugging Face users

Easily Train Models with H100 GPUs on NVIDIA DGX Cloud

Welcome Gemma - Google's new open LLM

From OpenAI to Open LLMs with Messages API

Hugging Face Text Generation Inference available for AWS Inferentia2

Hugging Face and Google partner for open AI collaboration

Mixture of Experts Explained

Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face

Deploy Embedding Models with Hugging Face Inference Endpoints

Llama 2 on Amazon SageMaker a Benchmark

Fine-tuning Llama 2 70B using PyTorch FSDP

Spread Your Wings: Falcon 180B is here

Code Llama: Llama 2 learns to code

Introducing SafeCoder

Hugging Face Platform on the AWS Marketplace: Pay with your AWS Account

Llama 2 is here - get it on Hugging Face

Deploy LLMs with Hugging Face Inference Endpoints

The Falcon has landed in the Hugging Face ecosystem

Introducing the Hugging Face LLM Inference Container for Amazon SageMaker
