I keep getting the same error returned when I use st.cache before a function doing a complex query & pandas manipulation. The original function queries an API for JSON data twice, and then a SQL database, puts them in pandas, does a few manipulations, and then joins all 3 into 1 dataframe. Several of the columns contain lists, which seems like it might be the problem… But when I separated out the SQL query into its own function (that I'm also attempting to cache), it now seems to error on the SQL function. I've looked all over for documentation to resolve this and am having trouble finding the solution. Screenshot attached of the error.
Hello @ccoggins, welcome to the forums!
The documentation for Streamlit has just been updated with a page on advanced caching, most notably a bit on caching a database connection.
It's hard to say without your original code, but my guess would be that the code you are trying to cache which involves the SQL function opens a thread, and this thread cannot be hashed by Streamlit by default. You can configure the way Streamlit deals with caching this thread, though. As a first try, does it work with @st.cache(allow_output_mutation=True)?
Thanks! I just got to sit down with the new documentation, and I updated a few things that seem to have been beneficial to my Streamlit dashboard. I had stumbled onto allow_output_mutation=True prior to my post, and it didn't fix the issue. What I did do was separate the SQL engine connection into its own function and cache it with allow_output_mutation=True, and then passed that into the function that holds the SQL query and pandas manipulations. Even after that, the function containing the SQL query and pandas manipulations still returned the same error above.
What I also tried is turning off caching for this function, and the entire dashboard worked fine. I also tried passing in some of the arguments mentioned in the new documentation, and it didn't resolve anything (some of the concepts are new to me, so it's likely I was doing them wrong). When I tried hash_funcs={DBConnection: id}, it returned the error NameError: name 'DBConnection' is not defined. After changing DBConnection to my engine connection object, it returns the same error I originally posted about. It's probably a rookie mistake; I can't tell if DBConnection is a variable/object I should pass in or an argument Streamlit looks for.
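(Editor's note, for anyone hitting the same wall: the key in a hash_funcs dict is a class object you import yourself, not a name Streamlit resolves for you. A minimal stdlib sketch of the idea, using sqlite3 as a stand-in for whatever client library you actually use:)

```python
import sqlite3

# hash_funcs maps a class (the imported type itself) to a function that
# turns instances of it into something hashable -- here, their id().
hash_funcs = {sqlite3.Connection: id}

conn = sqlite3.connect(":memory:")

# Streamlit-style lookup: find the entry matching the object's type.
stand_in = hash_funcs[type(conn)](conn)
print(isinstance(stand_in, int))  # True: id() returns an int
```

So `DBConnection` in the docs is a placeholder for the concrete connection class of your library, which has to be importable in your script.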
Here's the function I'm having trouble caching, for context:
def get_songs(engine):
    sql = """
    SELECT *
    FROM "public"."table"
    LIMIT 10;
    """
    df = pd.read_sql(sql, con=engine)
    df.id = df.id.astype(str)
    df.col5 = [','.join(map(str, i)) for i in df.col5]
    df.col5 = df.col5.replace({'text1': 'text1 edit', 'text2': 'text2 edit'}, regex=True)
    # col2, col3, and col4 are list objects; col5 is a string
    df['col6'] = (df.col2 + df.col3 + df.col4 + df.col5.apply(lambda x: [x])).map(list).apply(lambda x: sorted(x))
    df = df[['id', 'col6']]
    return df
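(Editor's note: the col6 construction above can be checked in isolation on a toy frame; the data below is hypothetical, mirroring the column shapes described in the comment.)

```python
import pandas as pd

# Toy frame: col2/col3/col4 hold lists, col5 holds a string.
df = pd.DataFrame({
    "id": [1],
    "col2": [["a"]],
    "col3": [["b"]],
    "col4": [["c"]],
    "col5": ["d"],
})

# Adding Series of lists concatenates the lists element-wise;
# wrapping col5 in a one-element list lets it join the concatenation.
df["col6"] = (df.col2 + df.col3 + df.col4 + df.col5.apply(lambda x: [x])).map(list).apply(sorted)
print(df["col6"].iloc[0])  # ['a', 'b', 'c', 'd']
```

Note that list-valued columns like col6 are unhashable, which is one reason Streamlit's default hasher can choke on a dataframe like this.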
Just to be sure, I did a minimal example with sqlite3 so we can debug together! Here is what I understood from your problem:
load_sqlite.py
import sqlite3

conn = sqlite3.connect("test.db")
conn.execute(
    """CREATE TABLE COMPANY
    (ID INT PRIMARY KEY NOT NULL,
    NAME TEXT NOT NULL,
    AGE INT NOT NULL,
    ADDRESS CHAR(50),
    SALARY REAL);"""
)
conn.execute(
    "INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY) \
    VALUES (1, 'Paul', 32, 'California', 20000.00 )"
)
conn.execute(
    "INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY) \
    VALUES (2, 'Allen', 25, 'Texas', 15000.00 )"
)
conn.execute(
    "INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY) \
    VALUES (3, 'Teddy', 23, 'Norway', 20000.00 )"
)
conn.execute(
    "INSERT INTO COMPANY (ID,NAME,AGE,ADDRESS,SALARY) \
    VALUES (4, 'Mark', 25, 'Rich-Mond ', 65000.00 )"
)
conn.commit()
print("Records created successfully")
conn.close()
You can run python load_sqlite.py to create a local test.db which contains a table.
app.py
import sqlite3
from sqlite3 import Connection

import pandas as pd
import streamlit as st


@st.cache(allow_output_mutation=True)
def get_connection(path):
    """Put the connection in cache to reuse if path does not change."""
    return sqlite3.connect(path)


@st.cache(hash_funcs={Connection: id})
def get_data(engine):
    """If the Connection.id is the same as before, use the cached dataframe"""
    sql = """
    SELECT * FROM COMPANY;
    """
    df = pd.read_sql(sql, con=engine)
    return df


engine = get_connection("test.db")
data = get_data(engine)
st.write(data)
Run with streamlit run app.py.
- On NameError: name 'DBConnection' is not defined: you need to import it in your Python script; it's an object/class. In my app.py, if I remove from sqlite3 import Connection, I get NameError: name 'Connection' is not defined.
- The @st.cache(hash_funcs={Connection: id}) means: Streamlit, each time you come across a Connection object in the cached method, when you compute the cache key, use the id value of the Connection, and don't try to use the rest of the Connection object as a cache key, because it might change dynamically given other factors, like if there's a Thread object within the Connection that we don't know how to hash. Freely replace Connection by DBConnection in your case.
- You need to look for this DBConnection depending on the SQL client library you are using under SQLAlchemy. I only looked for from psycopg2.engine import DBConnection for PostgreSQL and could not find it. What SQL client + SQL DB are you using?
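(Editor's note: to make the hash_funcs={Connection: id} idea concrete, here is a plain-Python sketch, with no Streamlit, of what such a mapping buys you when building a cache key. This is an illustration of the mechanism, not Streamlit's actual implementation: the unhashable connection is replaced by its id, so repeated calls with the same connection produce the same key.)

```python
import sqlite3

# type -> function producing a hashable stand-in for instances of that type
hash_funcs = {sqlite3.Connection: id}

def cache_key(*args):
    """Build a cache key, substituting objects covered by hash_funcs."""
    parts = []
    for arg in args:
        for typ, fn in hash_funcs.items():
            if isinstance(arg, typ):
                parts.append(fn(arg))  # e.g. id(connection)
                break
        else:
            parts.append(hash(arg))
    return tuple(parts)

conn = sqlite3.connect(":memory:")
# Same connection + same query -> identical key -> cache hit
print(cache_key(conn, "SELECT 1;") == cache_key(conn, "SELECT 1;"))  # True
```

A different connection object would yield a different id and hence a different key, which is exactly the invalidation behavior you want.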
Wow, @andfanilo. This is such an amazing post, and that example you gave perfectly captures the new hash_funcs workflow for database connections.
@ccoggins: The suggestion from @andfanilo is exactly correct. Please let us know if hash_funcs={Connection: id} works for you. If so, we will consider making this the default behavior for sqlite3.Connection.
If this suggestion doesn't work, please know that the next version of Streamlit will include additional information so that we can diagnose and fix this problem.
Thank you for using Streamlit. We're excited to get this use-case working for you!
Sorry for the delayed response! @andfanilo your explanation of hash functions is great.
Being fully transparent, I don't remember when I did it, but I did get it working. It's still a little over my head so I could have it wrong… but I believe I'm mirroring your implementation using sqlalchemy. Here's a toy version of my implementation:
import pandas as pd
import sqlalchemy
import streamlit as st


@st.cache(allow_output_mutation=True)
def connect_engine(url):
    engine = sqlalchemy.create_engine(url)
    return engine


@st.cache(hash_funcs={sqlalchemy.engine.base.Engine: id})
def get_data(engine):
    sql = """
    SELECT * FROM table;
    """
    df = pd.read_sql(sql, con=engine)
    return df


connection_url = "db_connection_string"
engine = connect_engine(connection_url)
data = get_data(engine)
st.write(data)
Great, glad you got it to work!
Good, now I know I can reuse that example everywhere!
Great, @ccoggins! This is a very common use-case and we'd like to simplify it further in future releases. In the meantime though, that's the exact right approach, and I'm so glad this worked for you.
Thanks for the kind words, @andfanilo!
Hey @ccoggins,
Wanted to give you a quick update that we created a new GitHub issue to update the default behavior Adrien was mentioning and create a hash_func for type _thread._local.
This has been bothering me for so long. Thanks for helping me solve my problem!
Thanks so much for complementing the super helpful @andfanilo example with the sqlalchemy variant; between the two of you, you really helped me move beyond the hash challenge of the db connection as a Streamlit newbie.
Well, almost a year later, let me mark this as the solution. Happy Streamlit-ing™
Just chiming in with another data point. In my case for sqlite3, it works after I replace
@st.cache(hash_funcs={DBConnection: id})
with
@st.cache(hash_funcs={sqlite3.Connection: id})
@ccoggins, you saved my life! This unhashable error was killing me. My company, which shall remain nameless to save it from public humiliation, is using ColdFusion to develop its web apps and I simply can't abide by it! Streamlit is my only hope and you have helped keep hope alive, my friend!
For psycopg2 you can use
@st.cache(hash_funcs={psycopg2.extensions.connection: id})
Thanks! That worked for me as well
For psycopg2 you can use:
import streamlit as st
import psycopg2
import pandas as pd


# Initialize connection.
# Uses st.experimental_singleton to only run once.
@st.experimental_singleton()
def get_connection():
    return psycopg2.connect(**st.secrets["postgres"])


@st.cache(hash_funcs={psycopg2.extensions.connection: id})
def get_data(engine):
    """If the Connection.id is the same as before, use the cached dataframe"""
    query = 'select "Email" from emails_only;'
    data = pd.read_sql(query, engine)
    return data


def main():
    engine = get_connection()
    data = get_data(engine)
    st.write(data)


if __name__ == "__main__":
    main()
Hope it helps.
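(Editor's note: for intuition, st.experimental_singleton behaves like a memoized factory that always hands back the same object. A rough stdlib analogue of that behavior, not Streamlit's actual implementation:)

```python
import functools

@functools.lru_cache(maxsize=None)
def get_resource():
    # First call creates the resource; every later call returns
    # the exact same cached object.
    return object()

print(get_resource() is get_resource())  # True
```

This is why it suits connections: the expensive connect happens once, and every rerun of the script reuses the same live object.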