FAQ: How to improve performance of apps with large data

Problem

Your app uses a large database or large data files (CSV or JSON), and you’ve noticed that the app is slow and want to improve its speed.

Solution

Here are some suggestions that you can look into to speed up your app:

1. Removing unused data

An important point to consider is whether you really need the entire data file. Oftentimes, you may need only a small subset of the original dataset. So instead of loading the entire file, load only the columns you actually need, which consumes less memory and consequently improves the speed of the app.

Here’s how to load only the columns you need (say we need only the 3 columns x1, x2 and x3 instead of the entire dataset) from a CSV data file using Pandas:

import pandas as pd

# Read just the three required columns instead of the full file
df = pd.read_csv('data.csv', usecols=['x1', 'x2', 'x3'])

2. Caching the data

You can use st.cache_data to cache your data:

import pandas as pd
import streamlit as st

@st.cache_data
def load_csv_data():
    # Runs only on the first call; later calls return the cached result
    df = pd.read_csv('data.csv', usecols=['x1', 'x2', 'x3'])
    return df

Briefly, this loads the data from disk on the first run, and subsequent runs then load it from the in-memory cache. The savings can really add up if your app uses the same data more than once.
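As a quick illustration (with throwaway variable names), any call after the first one is served from the cache instead of re-reading the file:

df_summary = load_csv_data()  # first call: reads the CSV and caches the result
df_details = load_csv_data()  # subsequent calls: served from the cache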

Furthermore, there’s also the persist="disk" option that allows you to cache the data to the local disk.
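For example, the loader above can be made to persist its cache like so:

@st.cache_data(persist="disk")
def load_csv_data():
    # The cached result is also written to local disk
    df = pd.read_csv('data.csv', usecols=['x1', 'x2', 'x3'])
    return df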

3. Choosing an optimal data storage format

If you’re using a lot of data in CSV or JSON formats, consider switching to a more computer-friendly format like Apache Parquet or Apache Arrow IPC. While CSV and JSON are optimized for humans, they’re not the speediest for computers! Opting for a binary format like Parquet or Arrow removes the need to parse text into data types like integers, floats, and strings, which is a time-consuming process. Binary formats also usually come with metadata and logical partitioning, which Python can use for efficient data handling.
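As a sketch of the one-time conversion (this assumes pandas with the pyarrow package installed; the file names are illustrative):

import pandas as pd

# One-time conversion: read the CSV and write it back out as Parquet
pd.read_csv('data.csv').to_parquet('data.parquet')

# In the app: Parquet is column-oriented, so only the requested
# columns are read from disk
df = pd.read_parquet('data.parquet', columns=['x1', 'x2', 'x3'])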
