Hi there,
I am using Streamlit to build a dashboard with plots drawn from several dataframes. The app computes these dataframes from text inputs inside a Streamlit form and stores them in a dataclass. The computation is very expensive, so I cache the resulting dataclass with st.cache_data. Below is a minimal snippet that reproduces the error and shows the context.
import time
from dataclasses import dataclass

import pandas as pd
import streamlit as st


@dataclass
class GroupOfDataFrames:
    df1: pd.DataFrame
    df2: pd.DataFrame
    df3: pd.DataFrame


@st.cache_data(show_spinner=False, max_entries=10)
def get_data(parameter_1: str, parameter_2: str) -> GroupOfDataFrames:
    df1 = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
    df2 = pd.DataFrame({"c": [7, 8, 9], "d": [10, 11, 12]})
    df3 = pd.DataFrame({"e": [13, 14, 15], "f": [16, 17, 18]})
    time.sleep(60)
    return GroupOfDataFrames(df1, df2, df3)


with st.form(key="my_form"):
    parameter_1 = st.text_input("Parameter 1", value="a")
    parameter_2 = st.text_input("Parameter 2", value="b")
    submit_button = st.form_submit_button(label="Submit")

if submit_button:
    data = get_data(parameter_1, parameter_2)
else:
    st.stop()

st.write("Dataframe 1")
st.dataframe(data.df1)
st.write("Dataframe 2")
st.dataframe(data.df2)
st.write("Dataframe 3")
st.dataframe(data.df3)
The snippet runs smoothly when a single user interacts with the app. The problem appears when one user is still waiting for a result and another user requests the same data (the form with the same parameters). To reproduce the error, launch the app and open it in two browser tabs. Submit the form with the default parameters in the first tab, then, while it is still computing, click Submit in the second tab with exactly the same parameters.
The traceback I am getting is:
UnserializableReturnValueError: Cannot serialize the return value (of type __main__.GroupOfDataFrames) in get_data(). st.cache_data uses pickle to serialize the function’s return value and safely store it in the cache without mutating the original object. Please convert the return value to a pickle-serializable type. If you want to cache unserializable objects such as database connections or Tensorflow sessions, use st.cache_resource instead (see our docs for differences).
Traceback:
  File "/path-to-project/.venv/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
  File "/path-to-project/streamlit_test.py", line 31, in <module>
    data = get_data(parameter_1, parameter_2)
  File "/path-to-project/.venv/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 212, in wrapper
    return cached_func(*args, **kwargs)
  File "/path-to-project/.venv/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 243, in __call__
    return self._get_or_create_cached_value(args, kwargs)
  File "/path-to-project/.venv/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 267, in _get_or_create_cached_value
    return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
  File "/path-to-project/.venv/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 343, in _handle_cache_miss
    raise UnserializableReturnValueError
I am using cache_data since GroupOfDataFrames is serializable. In any case, cache_resource does not work either.
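For anyone who wants to check the serializability claim outside Streamlit, here is a minimal standard-library sketch. It rests on assumptions: plain dicts stand in for the DataFrames so that pandas is not needed, and can_pickle is a hypothetical helper, not part of Streamlit. The second half simulates one possible explanation that I have not confirmed: the traceback shows that Streamlit runs the script with exec, so the class statement is executed again on every rerun, and an instance created before a rerun would then belong to a class object that is no longer the one bound to the name.

```python
import pickle
from dataclasses import dataclass


@dataclass
class GroupOfDataFrames:
    # Assumption: plain dicts stand in for the pandas DataFrames.
    df1: dict
    df2: dict


def can_pickle(obj) -> bool:
    """Hypothetical helper: True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


original = GroupOfDataFrames({"a": [1, 2, 3]}, {"c": [7, 8, 9]})

# The instance's class is the one currently bound to the name.
picklable_before = can_pickle(original)


# Simulate a script rerun: the class statement executes again and
# rebinds the module-level name to a brand-new class object.
@dataclass
class GroupOfDataFrames:
    df1: dict
    df2: dict


# `original` still points at the old class object. pickle stores
# classes by qualified name and checks that the name resolves to the
# very same object, so this attempt is rejected.
picklable_after = can_pickle(original)

print(picklable_before, picklable_after)
```

Run as a plain script, the first check succeeds and the second fails with a PicklingError saying the class is not the same object as the one found under that name. Whether this is what actually happens when two sessions hit the same cache entry is only a guess on my part.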
The app runs as a service in a Kubernetes cluster, but the error is reproducible locally.
I am using Python 3.10.8 and Streamlit 1.29.0.
Could you please help me?