How do I scale a Streamlit app for multiple concurrent users on AWS?

I want to deploy an LLM RAG (Retrieval-Augmented Generation) application in which users upload their PDF files and the content is chunked and embedded in a vector store. The LLM weights are downloaded from HuggingFace and served from an AWS EC2 instance. What steps must I follow to deploy this app on AWS so that multiple users can access it concurrently, upload their own PDF files, and get answers based only on their own documents?
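
To make the concurrency requirement concrete, here is a minimal sketch of the per-user isolation I have in mind. It is not my actual code: `SessionStore`, `chunk_text`, and the chunk sizes are hypothetical names and values chosen for illustration, the text is assumed to be already extracted from the PDF, and a plain in-memory dictionary stands in for the vector store (no embedding is done here).

```python
import uuid
from collections import defaultdict

# Illustrative values only; real chunk sizes depend on the embedding model.
CHUNK_SIZE = 200     # characters per chunk
CHUNK_OVERLAP = 50   # characters shared between neighbouring chunks


def chunk_text(text, size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    """Split text into fixed-size character chunks with overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


class SessionStore:
    """Hypothetical stand-in for a vector store, keyed by session id.

    Each user session gets its own list of chunks, so one user's
    uploads are never returned for another user's questions.
    """

    def __init__(self):
        self._chunks = defaultdict(list)

    def new_session(self):
        # A random id per browser session; assumed to be kept by the UI layer.
        return uuid.uuid4().hex

    def add_document(self, session_id, text):
        # Chunk the extracted PDF text and file it under this session only.
        chunks = chunk_text(text)
        self._chunks[session_id].extend(chunks)
        return len(chunks)

    def chunks_for(self, session_id):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._chunks.get(session_id, []))


if __name__ == "__main__":
    store = SessionStore()
    alice = store.new_session()
    bob = store.new_session()
    store.add_document(alice, "some extracted PDF text " * 40)
    print(len(store.chunks_for(alice)), len(store.chunks_for(bob)))
```

In a real deployment I assume the dictionary would be replaced by a vector store that can filter by user or session, and the session id would come from the Streamlit session rather than being generated by hand.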