Collections
Discover the best community collections!
Collections trending this week
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
  Paper • 1701.06538 • Published • 4
- Sparse Networks from Scratch: Faster Training without Losing Performance
  Paper • 1907.04840 • Published • 3
- ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
  Paper • 1910.02054 • Published • 3
- A Mixture of h-1 Heads is Better than h Heads
  Paper • 2005.06537 • Published • 2

- yentinglin/Taiwan-LLM-8x7B-DPO
  Text Generation • Updated • 1.87k • 15
- yentinglin/Taiwan-LLM-13B-v2.0-chat
  Text Generation • Updated • 1.78k • 44
- Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model
  Paper • 2311.17487 • Published • 2
- yentinglin/Taiwan-LLM-13B-v2.0-chat-awq
  Text Generation • Updated • 109 • 3