arxiv:2405.03594
Michael Goin
mgoin
AI & ML interests
LLM inference optimization, compression, quantization, pruning, distillation
Papers: 3
Spaces: 3
Models: 51
mgoin/Llama-2-7b-chat-hf-pruned95 • Text Generation • 2
mgoin/Llama-2-7b-chat-hf-pruned90 • Text Generation
mgoin/Llama-2-7b-chat-hf-pruned85 • Text Generation • 9
mgoin/Llama-2-7b-chat-hf-pruned80 • Text Generation
mgoin/Llama-2-7b-chat-hf-pruned75 • Text Generation • 7
mgoin/Hermes-2-Pro-Llama-3-8B-Marlin • Text Generation • 7 • 1
mgoin/Meta-Llama-3-70B-Instruct-Marlin • Text Generation • 440 • 5
mgoin/Meta-Llama-3-8B-Instruct-Marlin • Text Generation • 52
mgoin/Meta-Llama-3-70B-Instruct-GPTQ • Text Generation • 193 • 1
mgoin/TinyLlama-1.1B-Chat-v1.0-pruned50-quant-ds • Text Generation • 3
Datasets: none public yet