arxiv:2405.03594
Michael Goin
mgoin
AI & ML interests
LLM inference optimization, compression, quantization, pruning, distillation
Papers: 3
Spaces: 3
Models: 51
mgoin/Llama-2-7b-chat-hf-pruned95 • Text Generation
mgoin/Llama-2-7b-chat-hf-pruned90 • Text Generation
mgoin/Llama-2-7b-chat-hf-pruned85 • Text Generation • 1
mgoin/Llama-2-7b-chat-hf-pruned80 • Text Generation
mgoin/Llama-2-7b-chat-hf-pruned75 • Text Generation • 7
mgoin/Hermes-2-Pro-Llama-3-8B-Marlin • Text Generation • 6 • 1
mgoin/Meta-Llama-3-70B-Instruct-Marlin • Text Generation • 350 • 4
mgoin/Meta-Llama-3-8B-Instruct-Marlin • Text Generation • 46
mgoin/Meta-Llama-3-70B-Instruct-GPTQ • Text Generation • 180 • 1
mgoin/TinyLlama-1.1B-Chat-v1.0-pruned50-quant-ds • Text Generation • 3
Datasets: None public yet