Article Exploration of Job Application Automation with Data Scraping By herooooooooo • 10 days ago • 3
Article Glaze and the Effectiveness of Anti-AI Methods for Diffusion Models By parsee-mizuhashi • 5 days ago • 1
Article Synthetic dataset generation techniques: Self-Instruct By davanstrien • 5 days ago • 3
LlamaForTokenClassification Collection Fine Tuned llama variants for Token Classification • 6 items • Updated 7 days ago • 2
Terminus XL Collection v-prediction SDXL clone with zero-terminal SNR noise schedule • 8 items • Updated 26 days ago • 5
Article Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task By danaaubakirova • 4 days ago • 15
Article Evalverse: Revolutionizing Large Language Model Evaluation with a Unified, User-Friendly Framework By Yescia • 13 days ago • 1
Article Advancing Open-source Large Language Models in the Medical & Healthcare Domain By aaditya • 10 days ago • 3
Article Adapt custom AI models to the trainer API and to 🤗 By not-lain • 6 days ago • 15
Article Knowledge Distillation for Fine-Tuning a GPT-3.5 Judge: Enhancing Accuracy and Performance By Andyrasika • 7 days ago • 4
Article SeeMoE: Implementing a MoE Vision Language Model from Scratch By AviSoori1x • 14 days ago • 24
Article Can we create pedagogically valuable multi-turn synthetic datasets from Cosmopedia? By davanstrien • 13 days ago • 6
Article Train Custom Models on Hugging Face Spaces with AutoTrain SpaceRunner By abhishek • 11 days ago • 6
Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch By AviSoori1x • 13 days ago • 21
Article Expanding Model Context and Creating Chat Models with a Single Click By maywell • 22 days ago • 30
Article ⚗️ 🧑🏼🌾 Let's grow some Domain Specific Datasets together By burtenshaw • 21 days ago • 25
Article A Guide to Designing New Functional Proteins and Improving Protein Function, Stability, and Diversity with Generative AI By AmelieSchreiber • 6 days ago • 15
Article Fish Speech V1 - New Multilingual Open Source TTS Model By lengyue233 • 17 days ago • 4
Article Token Merging for fast LLM inference : Background and first trials with Mistral By samchain • 20 days ago • 1
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation 21 days ago • 69
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • 26 days ago • 42
Article Fine Tuning a LLM Using Kubernetes with Intel® Xeon® Scalable Processors By dmsuehir • 26 days ago • 2
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • 8 days ago • 41
Article Post-OCR-Correction: 1 billion words dataset of automated OCR correction by LLM By Pclanglais • 24 days ago • 10
Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • 24 days ago • 54
Article Estimating Memory Consumption of LLMs for Inference and Fine-Tuning for Cohere Command-R+ By Andyrasika • 24 days ago • 6
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Apr 18 • 525
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent 28 days ago • 71
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published 28 days ago • 230
Article Releasing Youtube-Commons: a massive open corpus for conversational and multimodal data By Pclanglais • Apr 18 • 20
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 56
Article Ryght’s Journey to Empower Healthcare and Life Sciences with Expert Support from Hugging Face Apr 16 • 6
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2 • 101
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent Paper • 2404.03648 • Published Apr 4 • 22
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15 • 127
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis Paper • 2307.01952 • Published Jul 4, 2023 • 74
ClassPruning: Speed Up Image Restoration Networks by Dynamic N:M Pruning Paper • 2211.05488 • Published Nov 10, 2022 • 1
Multi-Curve Translator for High-Resolution Photorealistic Image Translation Paper • 2203.07756 • Published Mar 15, 2022 • 1
Modular Degradation Simulation and Restoration for Under-Display Camera Paper • 2209.11455 • Published Sep 23, 2022 • 1
StarEnhancer: Learning Real-Time and Style-Aware Image Enhancement Paper • 2107.12898 • Published Jul 27, 2021 • 2
Rethinking Performance Gains in Image Dehazing Networks Paper • 2209.11448 • Published Sep 23, 2022 • 1
Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Models Paper • 2310.17086 • Published Oct 26, 2023 • 1
IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations Paper • 2404.01266 • Published Apr 1 • 1
Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations Paper • 2201.12961 • Published Jan 31, 2022 • 1