Database or CSV files

Hi team, I have a big doubt: which is faster to use, a database or CSV files?

I have an app that displays and filters about 80 CSV files, roughly 1 GB of data. I need to read every file and filter the data to show results. This process takes about 5 minutes, and it has to re-filter on every settings change.

How can I make my app more efficient with so much data?

Thanks a lot

Hi @optimvs

Have you looked into caching your data using st.cache_data (st.cache_data - Streamlit Docs)?

Also see this post, which suggests persisting the cache to disk with st.cache_data(persist="disk").
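
As a minimal sketch of that idea (the folder path and file layout are made up for illustration), caching the expensive load step with a disk-persisted cache could look like this:

```python
import glob

import pandas as pd
import streamlit as st


@st.cache_data(persist="disk")  # keep the result in memory and on disk
def load_all_csvs(folder: str) -> pd.DataFrame:
    # Read every CSV in the folder once and concatenate into a single frame.
    # This only re-runs when the argument (or the function body) changes,
    # so reruns triggered by widget changes skip the 5-minute load.
    frames = [pd.read_csv(path) for path in sorted(glob.glob(f"{folder}/*.csv"))]
    return pd.concat(frames, ignore_index=True)


df = load_all_csvs("data")  # hypothetical folder name
st.write(df.head())
```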

Another area to explore for improving performance is to determine whether the entire dataset is actually needed; here are some questions to consider:

  • Are all columns needed?
  • If only certain columns are used, you can pass the usecols parameter to pd.read_csv so the rest are never parsed (see the sketch after this list).
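
For example (the column names and dtypes below are hypothetical, adjust them to your data), reading only the columns you actually use can cut both load time and memory:

```python
import pandas as pd

# Hypothetical subset of columns the app actually displays or filters on.
needed_cols = ["timestamp", "category", "value"]

df = pd.read_csv(
    "data/part_01.csv",              # hypothetical file name
    usecols=needed_cols,             # skip parsing columns you never use
    dtype={"category": "category"},  # smaller dtypes also reduce memory
    parse_dates=["timestamp"],
)
```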

Another great read is the blog post by @randyzwitch on building performant apps (6 Tips for Improving Your App Performance | Streamlit), in particular sections 4 (*Remove unused data*) and 5 (*Optimize data storage formats*).
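
One common way to apply the storage-format tip, assuming pyarrow (or fastparquet) is installed, is to convert the CSVs to Parquet once and have the app read the Parquet files instead. A rough sketch of a one-off conversion script (paths are hypothetical):

```python
import glob
import os

import pandas as pd

# Run this once, outside the Streamlit app.
for csv_path in glob.glob("data/*.csv"):
    parquet_path = os.path.splitext(csv_path)[0] + ".parquet"
    pd.read_csv(csv_path).to_parquet(parquet_path, index=False)

# In the app, pd.read_parquet is typically much faster than pd.read_csv
# and preserves dtypes, so dates and categories don't need re-parsing.
```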

Hope this helps!


Make sure you only read (and parse) the files once, not on every settings change.


It depends on what you are doing. Ultimately you have to test which one is faster in your case.

Reading the files is usually the bottleneck. You can address that with caching. If filtering also takes too much time, try caching the filtered results as well. Read the caching docs, and test your implementation to check that there is an improvement.
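
A rough sketch of that two-level approach (file path, column names, and widget options below are illustrative, not from the original post): cache the load once, and cache each filter result keyed by the widget values, so switching back to a previous setting is instant.

```python
import pandas as pd
import streamlit as st


@st.cache_data(persist="disk")
def load_data() -> pd.DataFrame:
    # Expensive step: runs once, then is served from the cache.
    return pd.read_csv("data/combined.csv")  # hypothetical path


@st.cache_data
def filter_data(category: str, min_value: float) -> pd.DataFrame:
    # Cached per unique (category, min_value) combination.
    df = load_data()
    return df[(df["category"] == category) & (df["value"] >= min_value)]


category = st.selectbox("Category", ["a", "b", "c"])       # hypothetical options
min_value = st.slider("Minimum value", 0.0, 100.0, 10.0)
st.dataframe(filter_data(category, min_value))
```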