xianbao HF staff committed on
Commit 8aad28d
1 Parent(s): c1ba757

Make the app more scalable.

Increase the concurrency limit to shorten the queue. The Inference API should scale well enough to handle the additional concurrent requests.

Files changed (1)
  1. app.py +2 -1
app.py CHANGED
@@ -90,5 +90,6 @@ gr.ChatInterface(
     fn=generate,
     chatbot=gr.Chatbot(show_label=False, show_share_button=False, show_copy_button=True, likeable=True, layout="panel"),
     additional_inputs=additional_inputs,
-    title="Mixtral 46.7B"
+    title="Mixtral 46.7B",
+    concurrency_limit=20
 ).launch(show_api=False)
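
The commit message's reasoning (a higher concurrency limit drains the queue faster) can be illustrated with a small standard-library sketch. This is not Gradio's queue implementation: `handle_request`, `drain_queue`, and `simulate` are hypothetical names, each request is modeled as a fixed `asyncio.sleep`, and the comparison against a limit of 1 is an assumption, since the commit does not state the previous value.

```python
import asyncio
import time


async def handle_request(slots: asyncio.Semaphore, work_seconds: float) -> None:
    # A request waits in the queue until a slot is free, then "runs"
    # for work_seconds (a stand-in for one call to the inference backend).
    async with slots:
        await asyncio.sleep(work_seconds)


async def drain_queue(num_requests: int, concurrency_limit: int,
                      work_seconds: float) -> float:
    # The semaphore caps how many requests are processed at the same time,
    # which is the role concurrency_limit plays in this simplified model.
    slots = asyncio.Semaphore(concurrency_limit)
    start = time.perf_counter()
    await asyncio.gather(
        *(handle_request(slots, work_seconds) for _ in range(num_requests))
    )
    return time.perf_counter() - start


def simulate(num_requests: int, concurrency_limit: int,
             work_seconds: float = 0.05) -> float:
    """Return the wall-clock seconds needed to serve all queued requests."""
    return asyncio.run(drain_queue(num_requests, concurrency_limit, work_seconds))


if __name__ == "__main__":
    # Compare an assumed limit of 1 with the limit of 20 set in this commit.
    for limit in (1, 20):
        print(f"concurrency_limit={limit}: {simulate(20, limit):.2f}s")
```

In this model the total time falls roughly in proportion to the limit, but only as long as the backend keeps up with the extra parallel requests, which is the assumption the commit message makes about the Inference API.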