JournalistsonHF's Collections
Datasets
Updated 10 days ago
A curated list of datasets to train your models
HuggingFaceFW/fineweb • Viewer • Updated about 17 hours ago • 23.8k downloads • 1.27k likes
HuggingFaceTB/cosmopedia • Viewer • Updated Apr 16 • 5.11k downloads • 496 likes
academic-datasets/AMMeBa • Preview • Updated 11 days ago
HuggingFaceM4/OBELICS • Viewer • Updated Aug 22, 2023 • 4.74k downloads • 116 likes
bigcode/the-stack-v2 • Viewer • Updated Apr 23 • 560 downloads • 211 likes
pixparse/pdfa-eng-wds • Viewer • Updated Mar 29 • 4.37k downloads • 88 likes
pixparse/idl-wds • Viewer • Updated Mar 29 • 3.76k downloads • 106 likes
argilla/OpenHermesPreferences • Viewer • Updated Mar 1 • 20.2k downloads • 170 likes
argilla/Capybara-Preferences • Viewer • Updated 23 days ago • 2.32k downloads • 34 likes
PleIAs/YouTube-Commons • Viewer • Updated Apr 18 • 581 downloads • 273 likes
PleIAs/French-PD-Newspapers • Viewer • Updated Mar 19 • 6 downloads • 60 likes
mozilla-foundation/common_voice_17_0 • Viewer • Updated Apr 25 • 54.3k downloads • 88 likes
satellogic/EarthView • Viewer • Updated 22 days ago • 9.51k downloads • 94 likes