Best practice for deploying Streamlit apps which handle big datasets (tens of GB)?

pablolopeznto · May 1, 2026, 4:09pm

Hi,

I have recently been working on a Streamlit app for bioinformatics workflows. The app works well in local environments, but I’m struggling to find a suitable deployment strategy because of the size of the input data. Currently, it browses the host filesystem to access input files directly, and these can add up to tens of GB per sample.

Typical Streamlit options seem unfit for this project.

Creating a Docker image is technically feasible, relying on st.file_uploader() for file transfer. This approach is appropriate for small-to-medium files, but it has been reported as unstable for big datasets (memory, timeouts, browser limitations). An alternative would be to mount the entire host filesystem as a container volume, but that is insecure.
Deploying the app on Streamlit Community Cloud is impractical for the same reason: its file size constraints.

What would be an adequate architecture to manage big input files in this scenario?

Thanks in advance!

Pablo

AgentStreamy · May 1, 2026, 4:09pm

Hey Pablo, thanks for sharing your challenge. Bioinformatics data can get massive, and deploying Streamlit apps for these workflows is definitely tricky!

For large files (tens of GB), st.file_uploader() is not recommended because of the browser, memory, and timeout limitations you’ve noticed. Streamlit Community Cloud and similar platforms also have strict resource limits (e.g., 1 GB RAM and a 200 MB upload limit by default) and are not designed for persistent or large-scale storage. Mounting the host filesystem as a container volume is insecure and does not scale to multi-user or cloud deployments.

The best practice is to store large files externally (e.g., AWS S3, Google Cloud Storage, or similar object storage) and have your app access them via secure APIs or presigned URLs. Users upload files directly to cloud storage, and your Streamlit app processes them by referencing their cloud locations, which avoids both the upload bottleneck and the security risks of broad filesystem access. This approach is widely recommended for big data workflows in Streamlit and cloud environments, as discussed in the Streamlit docs and community forums.
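As a rough sketch of the "reference the cloud location" idea, here is how an app might turn an s3:// URI into a time-limited download URL. It assumes AWS S3 and the third-party boto3 package with credentials already configured; the thread does not say which provider is in use, and the function names, bucket, and key are made up for illustration.

```python
from urllib.parse import urlparse


def parse_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key); raise ValueError otherwise."""
    parsed = urlparse(uri)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"Not a valid s3:// URI: {uri!r}")
    return parsed.netloc, key


def presigned_download_url(uri: str, expires_in: int = 3600) -> str:
    """Return a temporary HTTPS URL for the object (hypothetical helper).

    Assumption: boto3 is installed and AWS credentials are configured.
    It is imported lazily so the URI parsing above works without it.
    """
    import boto3  # third-party dependency, not part of the standard library

    bucket, key = parse_s3_uri(uri)
    client = boto3.client("s3")
    return client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )
```

In the app, the URI could come from st.text_input() instead of st.file_uploader(), so the data itself never passes through the browser. Uploads of this size would normally go straight to the bucket with the provider's CLI or SDK rather than through the app.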

If you need to process files on-premises, consider running your Streamlit app on a secure internal server with direct access to the data, or use a hybrid approach where users upload files to a secure network share or object storage, and the app reads from there. For cloud deployments, always use external storage and avoid direct filesystem access or large file uploads through the browser. This architecture is robust, secure, and scalable for bioinformatics workflows with large datasets.
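For the on-premises case, one way to give the app direct access to the data without exposing the whole host filesystem is to confine it to a single data directory and read files in chunks. This is a minimal sketch using only the standard library; the helper names and chunk size are assumptions for illustration, not anything from Streamlit itself.

```python
from pathlib import Path
from typing import Iterator


def resolve_under_root(root: Path, relative: str) -> Path:
    """Resolve a user-supplied path and refuse anything outside `root`."""
    root = root.resolve()
    candidate = (root / relative).resolve()  # collapses '..' and symlinks
    try:
        candidate.relative_to(root)
    except ValueError:
        raise PermissionError(f"{relative!r} is outside the data directory")
    return candidate


def iter_chunks(path: Path, chunk_size: int = 8 * 1024 * 1024) -> Iterator[bytes]:
    """Yield the file piece by piece so it is never fully loaded into memory."""
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            yield chunk
```

With a Docker deployment, the same idea applies at the container level: mount only the data directory (ideally read-only) as the volume, and let the app accept paths relative to it.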

Sources:

How to deploy an app when my files are very large (Community Cloud, streamlit-cloud; January 20, 2022; 1 reply, 1826 views)
Use local file path to access the file when the app is on the air (Using Streamlit; August 17, 2022; 3 replies, 3232 views)
CSV over 2 GB (Using Streamlit; August 19, 2020; 8 replies, 4818 views)
Deployed app removes the uploaded files during running (Deployment; August 8, 2022; 1 reply, 381 views)
Streamlit data limit (Using Streamlit; March 22, 2021; 7 replies, 8183 views)
