Polars is a new Python library that often executes much faster (around 10x) than Pandas. You can convert Pandas dataframes to Polars dataframes and vice versa with Polars' from_pandas and to_pandas functions.
I am converting my Pandas functions to Polars, function by function.
I have noticed that the Streamlit cache does not seem to support Polars dataframes. It gives me an error if I try to pass a Polars dataframe to a “cached” function.
My workaround is to convert every Polars dataframe back to Pandas so that each function returns a Pandas dataframe.
I was wondering if there are plans to support Polars dataframes in st.cache.
@st.cache was deprecated in Streamlit 1.18.0, so st.cache will never support caching Polars dataframes. We recommend using one of the new caching decorators, @st.cache_data, as a replacement to cache data.
Here’s an example demonstrating caching of a Polars dataframe:
@Fabio I’m guessing you’re running into the UnhashableParamError when passing a Polars dataframe as an argument to a cache-decorated function. To tell Streamlit to stop hashing the argument, add a leading underscore to the argument’s name in the function signature:
On taking another look, I realize you’re correct in thinking that the function will not rerun when the excluded parameter changes, if it is the only parameter to the function.
What you can do in this case is pass another input param to the cached function that changes whenever the polars dataframe changes. One such option is to use polars.DataFrame.hash_rows in conjunction with polars.Series.view. The first method hashes and combines the rows in the polars DataFrame. As the result is an unhashable polars.Series object, we convert it to a NumPy array containing the UInt64 hashes:
Apparently, with the upcoming Pandas 2.0, converting from Pandas to Polars and vice versa will become a “free” operation (both have an underlying Arrow structure), which I think solves the issue.
@st.cache_resource appears to treat Polars dataframes better. I think the serialisation/pickle aspect of @st.cache causes inflation of the Polars df as well as inconsistencies when a hash is calculated?
I am new to Streamlit, and I agree with @matth. I was using the @st.cache_data decorator on the function that loads my Polars dataframe via .read_csv() (~3.5 GB), and it slowed down loading and visualizations considerably compared to reading the data directly (i.e., without the decorator or a function).