AkaLlama Collection Korean adaptation of the Llama 3 LLM suite, developed by MIR Lab @ Yonsei University • 3 items • Updated about 7 hours ago • 1
Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean Paper • 2403.10882 • Published Mar 16 • 4
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 11 items • Updated 1 day ago • 85
NuNerZero - Zero-Shot NER Collection The best compact zero-shot NER models, released under an MIT license • 4 items • Updated 8 days ago • 11
Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch By AviSoori1x • 11 days ago • 21
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • 24 days ago • 37
Llama 2 Family Collection This collection hosts the transformers and original repos of the Llama 2 and Llama Guard releases • 13 items • Updated 30 days ago • 28
Granite Time Series Models Collection A collection of time series models trained by IBM, licensed under the CDLA-Permissive-2.0 license. • 3 items • Updated 11 days ago • 4
Granite Code Models Collection A series of code models trained by IBM, licensed under the Apache 2.0 license. We release both the base pretrained and instruct models. • 10 items • Updated 6 days ago • 117
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation 20 days ago • 68
Article Overview of natively supported quantization schemes in 🤗 Transformers Sep 12, 2023 • 6
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published 18 days ago • 61
Article Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints 18 days ago • 50
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA May 24, 2023 • 30
Korean Datasets I've released so far. Collection A collection of the Korean datasets I have uploaded so far. • 6 items • Updated Dec 29, 2023 • 14
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training Paper • 2309.10400 • Published Sep 19, 2023 • 22
FewMany Collection A benchmark for few-shot classification with many classes • 8 items • Updated about 1 month ago • 6
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published 26 days ago • 230
LayoutLM Collection The LayoutLM series are Transformer encoders useful for document AI tasks such as invoice parsing, document image classification and DocVQA. • 5 items • Updated 10 days ago • 9
Table Transformer Collection The Table Transformer (TATR) is a series of object detection models useful for table extraction from PDF images. • 5 items • Updated 10 days ago • 12
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent 27 days ago • 71
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 30 days ago • 521
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 235
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 54
HF-curated models available on Workers AI Collection A collection of models curated with Hugging Face that can be run on Cloudflare's Workers AI serverless inference platform. • 15 items • Updated Apr 2 • 48
Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk Paper • 2401.05033 • Published Jan 10 • 14
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 89
Arcee's MergeKit: A Toolkit for Merging Large Language Models Paper • 2403.13257 • Published Mar 20 • 16
Simple and Scalable Strategies to Continually Pre-train Large Language Models Paper • 2403.08763 • Published Mar 13 • 48
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference Paper • 2403.04132 • Published Mar 7 • 38
DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models Paper • 2403.00818 • Published Feb 26 • 13
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention Paper • 2303.16199 • Published Mar 28, 2023 • 4
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 566
GPTVQ: The Blessing of Dimensionality for LLM Quantization Paper • 2402.15319 • Published Feb 23 • 19