Input validation error: `inputs` tokens + `max_new_tokens` must be <= 2048. on Mixtral8x7b 32K token

#199
by sunnykusawa - opened

I have deployed Mixtral on SageMaker using the Hugging Face image. I get good responses for small input prompts, but when I send a somewhat larger prompt I get this error: Input validation error: `inputs` tokens + `max_new_tokens` must be <= 2048.
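For readers unfamiliar with this limit, the arithmetic behind the error can be sketched as below. This is only an illustration of the condition stated in the error message, not the server's actual code; the function name `fits_token_budget` and its parameters are hypothetical.

```python
def fits_token_budget(input_tokens: int, max_new_tokens: int,
                      max_total_tokens: int = 2048) -> bool:
    """Return True if prompt tokens plus requested new tokens fit the total limit.

    Hypothetical helper mirroring the condition in the error message:
    inputs tokens + max_new_tokens must be <= max_total_tokens.
    """
    return input_tokens + max_new_tokens <= max_total_tokens


# A short prompt fits under the 2048 limit from the error message.
small_ok = fits_token_budget(input_tokens=1500, max_new_tokens=512)

# A longer prompt with the same max_new_tokens exceeds it.
large_ok = fits_token_budget(input_tokens=1800, max_new_tokens=512)

# The same longer prompt fits once the total limit is raised to 32768.
large_ok_raised = fits_token_budget(1800, 512, max_total_tokens=32768)
```

So the request is rejected as soon as the prompt length plus the requested generation length crosses the configured total, regardless of the model's 32K context window.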

Similar errors occur in HuggingChat.

I was able to solve this issue for the SageMaker endpoint.

We need to set the environment variables MAX_INPUT_LENGTH and MAX_TOTAL_TOKENS.

When deploying the LLM with SageMaker, add these environment variables:

import json

hub = {
    'HF_MODEL_ID': 'mistralai/Mixtral-8x7B-Instruct-v0.1',
    'SM_NUM_GPUS': json.dumps(8),
    # Maximum prompt length: any value up to the 32768 limit, as required
    # (leave room for the tokens to be generated).
    'MAX_INPUT_LENGTH': '30000',
    'MAX_TOTAL_TOKENS': '32768',
    'MAX_BATCH_PREFILL_TOKENS': '32768',
    'MAX_BATCH_TOTAL_TOKENS': '32768',
}

This raises the maximum input length to 30000 tokens and the total limit (input plus generated tokens) from the 2048 in the error message to 32768.
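To avoid typing inconsistent values, the settings can be generated and checked in one place. The following is a minimal sketch, not part of any SDK: `build_tgi_env` is a hypothetical helper, and it assumes the container expects the input limit to be strictly below the total limit and all values passed as strings.

```python
def build_tgi_env(max_input_length: int, max_total_tokens: int = 32768) -> dict:
    """Build the token-limit environment variables as strings.

    Hypothetical helper. Assumes the input limit must stay strictly below
    the total limit so that at least one token can be generated.
    """
    if not 0 < max_input_length < max_total_tokens:
        raise ValueError(
            f"MAX_INPUT_LENGTH ({max_input_length}) must be positive and "
            f"below MAX_TOTAL_TOKENS ({max_total_tokens})"
        )
    return {
        "MAX_INPUT_LENGTH": str(max_input_length),
        "MAX_TOTAL_TOKENS": str(max_total_tokens),
        # Batch limits set equal to the total limit, as in the post above.
        "MAX_BATCH_PREFILL_TOKENS": str(max_total_tokens),
        "MAX_BATCH_TOTAL_TOKENS": str(max_total_tokens),
    }


# Same values as in the post: 30000 input tokens, 32768 total.
env = build_tgi_env(30000)
```

The resulting dictionary can be merged into `hub` (for example `hub.update(env)`) before creating the model. With these values, a 30000-token prompt leaves at most 2768 tokens for generation.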
