I built a Streamlit chat-with-your-PDF app and now want to deploy it so that my team members can try it out.
What would be good (cheap) options for hosting such an application? Most tutorials use an OpenAI model, but I am using open-source models, so I need something that gives me access to a GPU in the cloud.
At the moment I am running the app locally on my Mac M2 without any problems, even with the quantized version of Mixtral (33 GB RAM required). Smaller models that require 7 GB of RAM or less would also be an option.
Any suggestions here?