Download file best practices

ksdaftari · May 3, 2023, 3:55pm

Summary

For some background, my use case is that we want to create a dashboard where we can review some of the raw FTP files we receive. The dashboard downloads the data from S3 (a 1 GB XML file), parses it, and displays it to the user in a prettified way (e.g. some of the XML entries are better displayed as a dataframe table).

The main thing I'm trying to work out is how best to handle the download. Right now I'm using the download_file method (Downloading files - Boto3 1.26.124 documentation), but I'm worried that downloading the file to disk will leave it on the server indefinitely with nothing cleaning it up. In the past I have downloaded files into memory instead of to a file, but I don't know whether that will cause memory issues, especially if the data is also cached in memory and the server runs out of memory. I'm curious how others have solved this.
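
One possible shape for the download-then-clean-up approach is sketched below: a context manager that removes the temporary file on exit, plus a streaming parse so the full XML tree is not held in memory. The helper names (downloaded_tempfile, count_elements, s3_download) are hypothetical, and boto3 is imported lazily so the other helpers can be tried without it.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from contextlib import contextmanager


@contextmanager
def downloaded_tempfile(download, suffix=".xml"):
    """Yield the path of a temp file filled by download(path); delete it on exit."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # only the path is needed; the downloader reopens the file
    try:
        download(path)
        yield path
    finally:
        # Runs even if parsing raises, so the file does not linger on disk.
        if os.path.exists(path):
            os.remove(path)


def count_elements(path, tag):
    """Count <tag> elements while streaming the file with iterparse."""
    count = 0
    for _, elem in ET.iterparse(path, events=("end",)):
        if elem.tag == tag:
            count += 1
        elem.clear()  # drop text, attributes and children already processed
    return count


def s3_download(bucket, key):
    """Return a callable that downloads s3://bucket/key to a given local path."""
    import boto3  # lazy import: only needed when actually talking to S3

    client = boto3.client("s3")
    return lambda path: client.download_file(bucket, key, path)
```

Usage would look like `with downloaded_tempfile(s3_download("my-bucket", "raw/file.xml")) as path: ...`, where the bucket and key are placeholders. Note that a hard kill of the process can still skip the cleanup, so this reduces leftover files but does not strictly guarantee their removal.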

I was also wondering whether cache_data with persist=True would help, but I still have questions about it.
I am looking for advice on the proper use of the cache_data persist parameter. I have the following questions:

What are the main use cases for the persist parameter? Is it mainly for potentially large amounts of data that would take up too much space in memory, where disk may have more room, e.g. downloading a 1 GB XML file that I then do data manipulation on?
Given that there is no integration with the ttl functionality, does data cached with persist stay on the server's disk indefinitely, or is there a process that can clean up or delete it? I am mainly worried about using this parameter and having files saved to disk indefinitely, which raises long-term disk space concerns on the server.
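
For comparison with persist, here is a hedged sketch of an in-memory alternative: cache only the parsed (and usually much smaller) rows with st.cache_data, bounded by its ttl and max_entries parameters, while the raw download lives in a temporary file that is deleted as soon as parsing finishes. The names parse_entries and build_cached_loader, the "entry" tag, and the one-hour / two-entry limits are assumptions for illustration; streamlit and boto3 are imported lazily so the parsing helper can be tried on its own.

```python
import tempfile
import xml.etree.ElementTree as ET


def parse_entries(source, tag="entry"):
    """Return one attribute dict per <tag> element (hypothetical schema).

    source may be a file path or a binary file object.
    """
    rows = []
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            rows.append(dict(elem.attrib))  # copy before clearing
            elem.clear()
    return rows


def build_cached_loader(ttl_seconds=3600, max_entries=2):
    """Build a cached loader; requires streamlit and boto3 to be installed."""
    import boto3
    import streamlit as st

    # In-memory cache: entries expire after ttl_seconds and at most
    # max_entries results are kept, which bounds memory use.
    @st.cache_data(ttl=ttl_seconds, max_entries=max_entries)
    def load_entries(bucket, key, tag="entry"):
        # The raw file exists only inside this block; it is deleted on close.
        with tempfile.NamedTemporaryFile(suffix=".xml") as tmp:
            boto3.client("s3").download_fileobj(bucket, key, tmp)
            tmp.seek(0)
            return parse_entries(tmp, tag)

    return load_entries
```

The cache key here is the (bucket, key, tag) arguments, so the 1 GB payload itself is never hashed or stored; only the parsed rows are.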

Another idea: I don't know if there is some way to use a temporary directory, but I still don't know how to guarantee that it gets deleted after a certain amount of time, or what other safety features like that are available.
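
For the time-based cleanup concern, one common pattern is to sweep a download directory and delete files older than some cutoff, for example at app start or from a scheduled job. The sketch below uses only the standard library; the directory name dashboard_downloads, the function purge_stale_files, and the one-hour cutoff are assumptions for illustration, not something Streamlit provides.

```python
import os
import tempfile
import time
from pathlib import Path

# Hypothetical location for downloaded files, under the system temp directory.
DOWNLOAD_DIR = Path(tempfile.gettempdir()) / "dashboard_downloads"


def purge_stale_files(directory, max_age_seconds, now=None):
    """Delete regular files in directory whose mtime is older than the cutoff.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for path in Path(directory).iterdir():
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)


# Example call (one-hour cutoff), e.g. at the top of the dashboard script:
# DOWNLOAD_DIR.mkdir(exist_ok=True)
# purge_stale_files(DOWNLOAD_DIR, max_age_seconds=3600)
```

This only bounds how long files can linger between sweeps; it is a safety net alongside deleting each file right after use, not a replacement for it.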

system · October 30, 2023, 3:55pm

This topic was automatically closed 180 days after the last reply. New replies are no longer allowed.
