How to scale a Streamlit app for multiple concurrent users on AWS?

I want to deploy an LLM RAG application in which users can upload their PDF files; the content is then chunked and embedded in a vector store, and the LLM weights are downloaded from HuggingFace and served from an AWS EC2 instance. What steps must I follow to deploy this retrieval-augmented generation app on AWS so that multiple users can access it concurrently, upload their own PDF files, and get answers?
