mrfakename 
posted an update Feb 22
Hugging Face has announced Cosmo 1B, a fully open-source Phi competitor with an open-source dataset. The dataset, dubbed "Cosmopedia," is published on the Hugging Face Hub under the Apache 2.0 license. It was generated with Mixtral 8x7B, using articles and textbooks from various sources (AutoMathText, OpenStax, WikiHow, etc.) as "seed data."

Model: HuggingFaceTB/cosmo-1b
Dataset: HuggingFaceTB/cosmopedia