Inference Endpoints (dedicated) documentation

API Reference (Swagger)

🤗 Inference Endpoints can be used through the UI or programmatically through an API. The API exposes an OpenAPI (Swagger) specification for each available route.
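
As a minimal sketch of calling the management API directly with Python, the snippet below lists the endpoints in a namespace. It assumes the v2 base URL from the Swagger specification, a user access token in the `HF_TOKEN` environment variable, a hypothetical namespace, and that the list response contains an `items` array with a `status.state` field per endpoint; check the spec for the exact routes and schemas.

```python
import os

import requests

# Assumed management API base URL from the OpenAPI specification.
API_URL = "https://api.endpoints.huggingface.cloud/v2/endpoint"
NAMESPACE = "my-username"  # hypothetical user or organization namespace

response = requests.get(
    f"{API_URL}/{NAMESPACE}",
    headers={"Authorization": f"Bearer {os.environ['HF_TOKEN']}"},
)
response.raise_for_status()

# Assumed response shape: {"items": [...]}, one entry per endpoint.
for endpoint in response.json()["items"]:
    print(endpoint["name"], endpoint["status"]["state"])
```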

Update May 2024: We have renamed the instances; further details can be found in the pricing documentation. For example, a 1x A10G instance is now specified as instance_type: nvidia-a10g, instance_size: x1.
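
The sketch below shows where the renamed fields appear when creating an endpoint with the `huggingface_hub` client. The endpoint name, model repository, vendor, and region are placeholder values for illustration only.

```python
from huggingface_hub import create_inference_endpoint

endpoint = create_inference_endpoint(
    "my-endpoint-name",            # hypothetical endpoint name
    repository="openai-community/gpt2",  # placeholder model repository
    framework="pytorch",
    task="text-generation",
    accelerator="gpu",
    vendor="aws",                  # placeholder cloud vendor
    region="us-east-1",            # placeholder region
    type="protected",
    instance_type="nvidia-a10g",   # renamed instance type (1x A10G)
    instance_size="x1",            # renamed instance size
)
print(endpoint.status)
```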
