Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints
Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker — Apr 8, 2021
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models — Paper 2404.02258, published Apr 2, 2024
Improved Baselines with Visual Instruction Tuning — Paper 2310.03744, published Oct 5, 2023
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models — Paper 2402.13064, published Feb 20, 2024