Slow Data Display

Hello,

I am new to Streamlit and to coding in general, and I'm working on a custom app to display some of our organization's data. The app works, but it is very slow: zooming in and out and using the other interactive features takes a long time. I think caching could solve this, but I'm not sure how to integrate it into my current app. Any help would be great!

import streamlit as st
import altair as alt
import pandas as pd
import erddapy
from vega_datasets import data

# Connect to ERDDAP

st.title('SWOT Prawler Data')
e = erddapy.ERDDAP('http://heron.pmel.noaa.gov:8080/erddap', protocol='tabledap')
e.dataset_id = 'TELON001_PRAWC_N001'  # Data Set to Use

# Pull Data from ERDDAP

dfp = e.to_pandas()
dfp['time (UTC)'] = pd.to_datetime(dfp['time (UTC)'])

# Create a subset of the data

sub = dfp.loc[:, ['time (UTC)', 'SB_Depth', 'SB_Temp', 'SB_Conductivity', 'wetlab_Chlorophyll']]
source = sub
brush = alt.selection(type='interval')

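# Dropdown to select the variable used to color the points
# Note: 'Optode_Dissolved_O2' is not among the columns kept in `sub`, so that option has no data to plot
option = st.selectbox(
    'Select a Dataset',
    ['SB_Temp:Q', 'SB_Conductivity:Q', 'Optode_Dissolved_O2:O', 'wetlab_Chlorophyll:Q'])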

# Top panel to plot the data

c = alt.Chart(
    source,
    title="SWOT Prawler Data"
).mark_circle(size=30).encode(
    x=alt.X('time (UTC):T', scale=alt.Scale(clamp=True, padding=10)),
    y=alt.Y('SB_Depth:Q', axis=alt.Axis(title='Depth (m)'),
            scale=alt.Scale(zero=False, padding=5, domain=[500, 0])),
    color=alt.condition(brush, option, alt.value('lightgray'))
).add_selection(
    brush
).properties(
    width=700,
    height=300
)

# 2nd chart: shows only the points selected by the brush in the top panel
second = alt.Chart(
    source,
    title="Zoomed in Plot"
).mark_circle(size=30).encode(
    x='time (UTC):T',
    y=alt.Y('SB_Depth:Q', sort="descending"),
    color=alt.Color(option, sort='descending', scale=alt.Scale(scheme="redblue")),
    tooltip=[
        alt.Tooltip('time (UTC):O', title='Datetime'),
        alt.Tooltip('SB_Depth:O', title='Depth'),
        alt.Tooltip(option, title=option)
    ]
).transform_filter(
    brush
).properties(
    height=500, width=700
).interactive()

c & second

Hey @drdevereaux,

Welcome to the forum :wave:

I'm getting timeouts from ERDDAP when I run your code, so I wasn't able to test it fully, but here's an example gist that moves the e.to_pandas() call into a cached function. That should be enough to get you the speedup you need.

The parameters of the @st.cache() call might need tweaking, but I wasn't able to run it to check.
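For anyone wondering why caching helps here: Streamlit reruns the whole script from top to bottom each time a widget such as the selectbox changes, so an uncached e.to_pandas() downloads the data again on every rerun. Below is a rough, standard-library-only sketch of the idea, not the gist itself. It uses functools.lru_cache as a stand-in for @st.cache() (newer Streamlit releases provide st.cache_data for this purpose), and load_data, run_script, fetch_count, and the sample rows are all made up for illustration.

```python
import functools
import time

fetch_count = 0  # counts how often the slow "download" actually runs


@functools.lru_cache(maxsize=None)  # stand-in for @st.cache() in a Streamlit app
def load_data(dataset_id):
    """Pretend to download a dataset; in the app this would wrap e.to_pandas()."""
    global fetch_count
    fetch_count += 1
    time.sleep(0.1)  # simulated network delay
    # Hypothetical rows standing in for the ERDDAP table.
    return (("2020-01-01T00:00:00Z", 10.0), ("2020-01-01T00:10:00Z", 12.5))


def run_script():
    """One 'rerun' of the app script, as triggered by a widget interaction."""
    return load_data("TELON001_PRAWC_N001")


# Three reruns, but only the first one pays for the download.
for _ in range(3):
    rows = run_script()
```

In the real app, the ERDDAP connection and the e.to_pandas() call would move into the decorated function, and the rest of the script would use the DataFrame it returns.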

Thanks so much for taking the time to look at this! This works on my end and gives me a much better idea of how to integrate caching for future streamlit projects. I appreciate the help!