AkaLlama Collection Korean adaptation of the Llama 3 LLM suite, developed by MIR Lab @ Yonsei University • 3 items • Updated about 7 hours ago • 1
Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean Paper • 2403.10882 • Published Mar 16 • 4
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 11 items • Updated 1 day ago • 85
NuNerZero - Zero-Shot NER Collection The best compact zero-shot NER models, released under an MIT license • 4 items • Updated 8 days ago • 11
Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch By AviSoori1x • 11 days ago • 21
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • 24 days ago • 37
Llama 2 Family Collection This collection hosts the transformers and original repos of the Llama 2 and Llama Guard releases • 13 items • Updated 30 days ago • 28
Granite Time Series Models Collection A collection of time series models trained by IBM, licensed under the CDLA-Permissive-2.0 license. • 3 items • Updated 11 days ago • 4
Granite Code Models Collection A series of code models trained by IBM, licensed under the Apache 2.0 license. We release both the base pretrained and instruct models. • 10 items • Updated 6 days ago • 117
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation 20 days ago • 68
Article Overview of natively supported quantization schemes in 🤗 Transformers Sep 12, 2023 • 6
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published 18 days ago • 61
Article Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints 18 days ago • 50
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA May 24, 2023 • 30
Korean Datasets I've released so far. Collection A collection of the Korean datasets I have uploaded so far. • 6 items • Updated Dec 29, 2023 • 14
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training Paper • 2309.10400 • Published Sep 19, 2023 • 22
FewMany Collection A benchmark for few-shot classification with many classes • 8 items • Updated about 1 month ago • 6
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published 26 days ago • 230
LayoutLM Collection The LayoutLM series are Transformer encoders useful for document AI tasks such as invoice parsing, document image classification and DocVQA. • 5 items • Updated 10 days ago • 9
Table Transformer Collection The Table Transformer (TATR) is a series of object detection models useful for table extraction from PDF images. • 5 items • Updated 10 days ago • 12
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent 27 days ago • 71
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 30 days ago • 521
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 235
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 54
HF-curated models available on Workers AI Collection A collection of models curated with Hugging Face that can be run on Cloudflare's Workers AI serverless inference platform. • 15 items • Updated Apr 2 • 48
Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk Paper • 2401.05033 • Published Jan 10 • 14
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 89
Arcee's MergeKit: A Toolkit for Merging Large Language Models Paper • 2403.13257 • Published Mar 20 • 16
Simple and Scalable Strategies to Continually Pre-train Large Language Models Paper • 2403.08763 • Published Mar 13 • 48
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference Paper • 2403.04132 • Published Mar 7 • 38
DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models Paper • 2403.00818 • Published Feb 26 • 13
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention Paper • 2303.16199 • Published Mar 28, 2023 • 4
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 566
GPTVQ: The Blessing of Dimensionality for LLM Quantization Paper • 2402.15319 • Published Feb 23 • 19