InternalHashError: 1 - DF readin @st.cache bug

eliztnguyen · October 6, 2020, 5:02pm

I keep getting this error, and the recommended hash_funcs workaround doesn’t seem to work…

ERROR:
InternalHashError: 1

While caching the return value of setupResults(), Streamlit encountered an object of type pandas.core.frame.DataFrame, which it does not know how to hash.

In this specific case, it’s very likely you found a Streamlit bug so please file a bug report here.

In the meantime, you can try bypassing this error by registering a custom hash function via the hash_funcs keyword in @st.cache(). For example:

@st.cache(hash_funcs={pandas.core.frame.DataFrame: my_hash_func})
def my_func(...):
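
The error message leaves my_hash_func undefined. As a rough sketch of what such a function could look like, the snippet below builds a digest from the column labels plus pandas' own row hashes. The name hash_dataframe is hypothetical, and nothing in this thread confirms that registering it resolves the InternalHashError.

```python
import hashlib

import pandas as pd


def hash_dataframe(df: pd.DataFrame) -> str:
    """Hypothetical stand-in for my_hash_func: digest of columns, index and values."""
    digest = hashlib.sha256()
    # Column labels are not covered by hash_pandas_object, so add them explicitly.
    digest.update(repr(list(df.columns)).encode("utf-8"))
    # One uint64 hash per row, computed from the index and the row's values.
    row_hashes = pd.util.hash_pandas_object(df, index=True)
    digest.update(row_hashes.to_numpy().tobytes())
    return digest.hexdigest()


# Intended use with the decorator from the error message (not executed here):
# @st.cache(hash_funcs={pd.DataFrame: hash_dataframe})
# def setupResults(results): ...
```

Note that pd.util.hash_pandas_object raises a TypeError if an object column holds unhashable values such as lists, so this sketch assumes plain scalar or string columns.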

Alex_31 · October 6, 2020, 5:22pm

Hi,

Can you please provide a code snippet so that we can reproduce the error?

Thanks,
Alex

eliztnguyen · October 6, 2020, 7:44pm

Hi Alex. I guess I should delete that comment (I’m just not sure how).
I found an error in my code that was triggering that “bug” message.
I don’t think it was a bug.
I was able to resolve it.

Streamlit caught something that my IDE didn’t.

eliztnguyen · October 9, 2020, 1:59am

Unfortunately I was wrong. This bug is happening again periodically.

eliztnguyen · October 9, 2020, 2:10am

I believe these are the functions triggering the error:

#@st.cache(hash_funcs={pandas.core.frame.DataFrame: my_hash_func})
def readinresults(data):
    '''simpler readin function for excel file'''
    df = pd.read_excel(data)
    return df

@st.cache
def readinFile(filepath, extension):
    '''function to read in file with the following options;
    :param filepath: written as "str";
    :param extension: "csv", "xlsx", "xls";
    :return: data as dataframe
    '''

    tempdf = pd.DataFrame()

    if extension == "csv":
        tempdf = pd.read_csv(filepath)
    elif extension == "xlsx" or extension == "xls":
        tempdf = pd.read_excel(filepath)
    return tempdf


@st.cache
def setupResults(results):
    '''
    set up results for processing in Accurate Insight project
    :param results: results dataframe for one_organization-one_survey
    :return: results dataframe with column names, new columns, and only if "submitted"==True
    '''

    # subset data to only submitted surveys
    results = results[results["submitted"] == True]
    ### NOTE: these include submitted surveys that have ZEROS, which means NO ANSWER

    # add new "Location - Department" column
    locdepCol = results["Location Name"] + " - " + results["Department Name"]
    results["Location - Department"] = locdepCol

    return results
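
One detail in setupResults stands out: the new column is assigned onto a filtered slice of the input. In pandas versions that emit it, this pattern triggers a SettingWithCopyWarning. Below is a minimal sketch of the same logic with an explicit copy, assuming the column names from the function above. The name setup_results is hypothetical, and whether this has anything to do with the InternalHashError is not established in this thread.

```python
import pandas as pd


def setup_results(results: pd.DataFrame) -> pd.DataFrame:
    """Keep submitted surveys and add a combined location/department column."""
    # .copy() gives an independent frame, so the caller's data is left untouched.
    submitted = results[results["submitted"] == True].copy()
    submitted["Location - Department"] = (
        submitted["Location Name"] + " - " + submitted["Department Name"]
    )
    return submitted
```

Returning a fresh frame also keeps the function from mutating an object that a cache may still be holding a reference to.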

I can’t seem to post a link to my GitHub repository, but here’s the reference: https://github.com/eliztnguyen/Activated-Insights-Streamlit

And my deployed app:
share.streamlit.io/eliztnguyen/activated-insights-streamlit/main/ActivatedInsights_Streamlit.py
