Deploy Streamlit LLM Demo

Hi,

I built a Streamlit chat-with-your-PDF app and now want to deploy it so that my team members can try it out.
What would be good (cheap) options for hosting such an application? Most tutorials use an OpenAI model, but I am using open-source models, so I need something that gives me access to a GPU in the cloud.

At the moment I am running the app locally on my Mac M2 without any problems, even with the quantized version of Mixtral (33 GB RAM required). Smaller models that require 7 GB of RAM or less would also be an option.
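
For sizing a cloud GPU, a back-of-the-envelope estimate of the memory needed for the model weights can help. The sketch below is only a rough approximation: the parameter counts (about 46.7 billion for Mixtral 8x7B, 7 billion for a small model) and the bits-per-weight values are assumptions for illustration, not measurements from this app, and the function name is made up. Real usage is higher because of the KV cache and runtime overhead.

```python
def estimate_weight_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = num_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 1e9


if __name__ == "__main__":
    # Assumed figures: ~46.7e9 parameters at ~5.5 bits per weight (Mixtral-sized),
    # and 7e9 parameters at 4 bits per weight (a small quantized model).
    for name, params, bits in [
        ("Mixtral-sized, ~5.5-bit", 46.7e9, 5.5),
        ("7B model, 4-bit", 7e9, 4.0),
    ]:
        print(f"{name}: about {estimate_weight_gb(params, bits):.1f} GB for weights")
```

The GPU (or machine) memory should leave headroom above this number for the context window and the rest of the app.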

Any suggestions here?

Hi @moxinator98

There’s a tutorial blog post to get you started. It shows how a Streamlit app can use a model (Llama 2) served by an LLM hosting platform (Replicate) for response generation:

Hope this helps!
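
As a rough idea of what the hosted-model approach looks like, here is a minimal sketch, not the code from the tutorial. It assumes the `replicate` Python package is installed, that `REPLICATE_API_TOKEN` is set in the environment, and that the model reference `meta/llama-2-7b-chat` accepts a `prompt` input. The helper names `build_prompt` and `ask_hosted_llm` are hypothetical, and the prompt wording is just an example.

```python
import os

# Assumed model reference on Replicate; swap in whichever hosted model you use.
MODEL_REF = "meta/llama-2-7b-chat"


def build_prompt(context_chunks, question):
    """Combine retrieved PDF chunks and the user's question into one prompt."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def ask_hosted_llm(context_chunks, question):
    """Send the prompt to the hosted model and return the generated text."""
    import replicate  # imported lazily so build_prompt works without the package

    if not os.environ.get("REPLICATE_API_TOKEN"):
        raise RuntimeError("Set REPLICATE_API_TOKEN before calling the hosted model.")
    output = replicate.run(MODEL_REF, input={"prompt": build_prompt(context_chunks, question)})
    # Language models on Replicate typically yield the text in pieces.
    return "".join(output)
```

In the Streamlit app, the question from `st.chat_input` and the chunks retrieved from the PDF would be passed to `ask_hosted_llm`, and the result shown inside `st.chat_message`. With this setup the Streamlit app itself needs no GPU, since generation runs on the hosting platform.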

Thanks! I also thought about putting the Streamlit app into a container and sharing it with colleagues by sending them the files, so that no cloud service is needed. Do you have any experience with this?

There’s a tutorial for deploying using Docker that you can check out:
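
For a sense of what the container setup involves, here is a small sketch that generates a minimal Dockerfile for a Streamlit app. It is an illustration under assumptions, not the tutorial's exact file: it assumes the entry point is called `app.py`, that dependencies are listed in `requirements.txt`, and that Streamlit's default port 8501 is used. The helper names are hypothetical.

```python
from pathlib import Path

# Minimal Dockerfile for a Streamlit app; file names and port are assumptions.
DOCKERFILE_TEMPLATE = """\
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE {port}
CMD ["streamlit", "run", "{entrypoint}", "--server.port={port}", "--server.address=0.0.0.0"]
"""


def render_dockerfile(entrypoint="app.py", port=8501):
    """Return the Dockerfile text for the given entry point and port."""
    return DOCKERFILE_TEMPLATE.format(entrypoint=entrypoint, port=port)


def write_dockerfile(target_dir=".", entrypoint="app.py", port=8501):
    """Write the Dockerfile into target_dir and return its path."""
    path = Path(target_dir) / "Dockerfile"
    path.write_text(render_dockerfile(entrypoint, port))
    return path


if __name__ == "__main__":
    print(render_dockerfile())
```

Colleagues who receive the project folder could then run `docker build -t pdf-chat .` followed by `docker run -p 8501:8501 pdf-chat` and open `http://localhost:8501` (the image name `pdf-chat` is just an example). Their machines still need enough memory for whichever model the app loads.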