Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models Paper • 2405.15574 • Published 5 days ago • 41
Transformers Can Do Arithmetic with the Right Embeddings Paper • 2405.17399 • Published 1 day ago • 31
Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published 13 days ago • 93
MAmmoTH2 Collection Scaling up instruction data from the web to build better LLMs • 11 items • Updated 3 days ago • 6
To Asymmetry and Beyond: Structured Pruning of Sequence to Sequence Models for Improved Inference Efficiency Paper • 2304.02721 • Published Apr 5, 2023 • 2
CommonCanvas Collection Collection of models trained on the CommonCatalogue datasets • 8 items • Updated 13 days ago • 5
CommonCatalog Collection Common Catalog, a dataset with Creative Commons licensed images and machine-generated caption pairs • 8 items • Updated 13 days ago • 7
MADLAD-400 Collection Models and spaces for MADLAD-400: A Multilingual And Document-Level Large Audited Dataset • 8 items • Updated Nov 14, 2023 • 5
Chronos Models Collection Chronos: Pretrained (language) models for time series forecasting based on the T5 architecture. • 6 items • Updated Mar 18 • 25
Speaker Diarization Datasets Collection A collection of speaker diarization datasets compatible with Diarizers. • 6 items • Updated about 1 hour ago • 1
End-to-end speaker segmentation for overlap-aware resegmentation Paper • 2104.04045 • Published Apr 8, 2021 • 1
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation Paper • 2210.13248 • Published Oct 24, 2022 • 1
Article Train custom AI models with the trainer API and adapt them to 🤗 By not-lain • 4 days ago • 19
Granite Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 14 items • Updated 7 days ago • 133
llama 3 self-align experiments Collection Replicating the pipeline for StarCoder-2 Instruct on Llama-3-8B with some tweaks (https://huggingface.co/blog/sc2-instruct) • 4 items • Updated 20 days ago • 6
Community Tools Collection Cool HF tools that I and others at HF work on that I regularly use • 4 items • Updated 8 days ago • 3
Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval Paper • 2311.05800 • Published Nov 10, 2023 • 2
🦢SWIM-IR Dataset Collection 29 million Synthetic Wikipedia-based Multilingual Retrieval Training Pairs. • 4 items • Updated Apr 28 • 6
PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits Paper • 2305.02547 • Published May 4, 2023 • 5
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4 • 58
ablation-models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 7 items • Updated 24 days ago • 20
Generalizable Face Landmarking Guided by Conditional Face Warping Paper • 2404.12322 • Published Apr 18 • 1
Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning Paper • 2404.12897 • Published Apr 19 • 1
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis Paper • 2404.13686 • Published Apr 21 • 26
Antidote Project Collection Data and models generated within the Antidote Project (https://univ-cotedazur.eu/antidote) • 20 items • Updated 23 days ago • 5
BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing Paper • 2206.15076 • Published Jun 30, 2022 • 3
Arcee's MergeKit: A Toolkit for Merging Large Language Models Paper • 2403.13257 • Published Mar 20 • 17
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated 23 days ago • 82
MGM Collection Official model collection for the paper "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" • 13 items • Updated 26 days ago • 43
Inference Endpoints For Eval Spec. Models Collection Models I want to use upstream as part of evaluation libraries, then use to optimize evaluations and downstream applications. • 8 items • Updated Apr 5 • 2
HF-curated models available on Workers AI Collection A collection of models curated with Hugging Face that can be run on Cloudflare's Workers AI serverless inference platform. • 15 items • Updated Apr 2 • 48
REDFM: a Filtered and Multilingual Relation Extraction Dataset Paper • 2306.09802 • Published Jun 16, 2023 • 4
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models Paper • 2403.18814 • Published Mar 27 • 40
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Paper • 2404.00399 • Published Mar 30 • 39
SambaLingo Collection Expert models that adapt Llama2 to a diverse set of languages from around the world. • 27 items • Updated Apr 17 • 34
UDOP Collection UDOP is a general multimodal model for document AI • 4 items • Updated 7 days ago • 20
Neural Circuit Diagrams: Robust Diagrams for the Communication, Implementation, and Analysis of Deep Learning Architectures Paper • 2402.05424 • Published Feb 8 • 17