[Cache Request] meta-llama/Meta-Llama-3-70B-Instruct

#56
by CodeVinayak - opened

Please add the following model to the Neuron cache.

AWS Inferentia and Trainium org
edited Apr 19

The model is now cached for 0.0.21. It can take up to an hour for the SageMaker deployment snippet to appear in the model card, but you can start using the model right away.

dacorvo changed discussion status to closed
