On refresh, database app errors out

Hey y’all,
This will sound like a strange situation.

So I made a Streamlit app that connects to a DB2 server, submits a query, and simply displays the query result as a pandas dataframe, using the jaydebeapi library. When I first launch the Streamlit server, the app displays the dataframe fine, no errors…all fine and dandy.

But here’s the weird part. If I hit the browser refresh button, or if the app refreshes at any point after the first initial load, I get a jaydebeapi-specific error. If I restart the server, the first load is fine again!? But if it refreshes after that initial load, boom, I get the error message all over again.

I also executed a non-Streamlit version of the script in an IPython session, multiple times, and got zero errors. So it is something odd on the Streamlit side, where it is not saving state or a session, that is somehow causing my jaydebeapi error.

Here’s the error message (some info scrubbed):
java.sql.SQLExceptionPyRaisable: java.sql.SQLException: No suitable driver found for jdbc:db2://some_server:50000/some_database
File "c:\users\user\appdata\local\continuum\anaconda3_64bit\envs\dashboards\lib\site-packages\streamlit\ScriptRunner.py", line 306, in _run_script
    exec(code, module.__dict__)
File "D:\pyscripts\streamlit\defect_rates.py", line 17, in <module>
    jars=['D:/JDBC_Drivers/folder1/db2jcc4.jar', 'D:/JDBC_Drivers/folder1/db2jcc.jar'])
File "c:\users\user\appdata\local\continuum\anaconda3_64bit\envs\dashboards\lib\site-packages\jaydebeapi\__init__.py", line 381, in connect
    jconn = _jdbc_connect(jclassname, url, driver_args, jars, libs)
File "c:\users\user\appdata\local\continuum\anaconda3_64bit\envs\dashboards\lib\site-packages\jaydebeapi\__init__.py", line 199, in _jdbc_connect_jpype
    return jpype.java.sql.DriverManager.getConnection(url, *dargs)

Here’s my Streamlit app code:
from getpass import getpass
import jaydebeapi as jdba
import os
import pandas as pd
import streamlit as st

'''
Web App to Obtain Some Data
'''

user = os.environ['some_user_var']
pwd = os.environ['some_pwd_var']

conn = jdba.connect('com.ibm.db2.jcc.DB2Driver', 'jdbc:db2://some_server:50000/some_database',
                    [user, pwd],
                    jars=['D:/JDBC_Drivers/folder1/db2jcc4.jar', 'D:/JDBC_Drivers/folder1/db2jcc.jar'])

sql = """
SELECT
*
FROM
some_table
FETCH FIRST 200 ROWS ONLY
"""

df = pd.read_sql(sql, conn, index_col=None)

conn.close()
st.write(df)

I can execute the above script, excluding the last line, in an IPython session multiple times and get no errors at all.

Sorry I should have included my setup:
OS: Windows 10
Python version: 3.7 with Anaconda environment activated
Browser: Chrome version 77

Hey pybokeh, thanks for writing!

I’m not sure exactly what’s causing this, but one thing Streamlit does differently from, e.g., IPython is that the Streamlit Python process stays open between runs of your app, so any global app data (including state held by third-party libs like jaydebeapi) won’t be reset between runs.
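
To make that concrete, here’s a tiny sketch (unrelated to jaydebeapi; the helper module name is just made up for illustration). Anything stored in an imported module lives as long as the server process, so it survives browser refreshes:

# state_demo.py - a throwaway helper module imported by the app (hypothetical)
counter = 0

# app.py
import streamlit as st
import state_demo

# The app script itself is re-executed on every refresh, but imported modules
# are only loaded once per server process, so this value keeps growing.
state_demo.counter += 1
st.write('This process has rerun the script', state_demo.counter, 'times')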

It may be that we need to do something special with jaydebeapi, but could you also give this a try:

@st.cache(ignore_hash=True)
def get_db_conn():
    user = os.environ['some_user_var']
    pwd = os.environ['some_pwd_var']
    return jdba.connect(
        'com.ibm.db2.jcc.DB2Driver',
        'jdbc:db2://some_server:50000/some_database',
        [user, pwd],
        jars=['D:/JDBC_Drivers/folder1/db2jcc4.jar',
              'D:/JDBC_Drivers/folder1/db2jcc.jar'])

conn = get_db_conn()
# The rest of your app
# omit the conn.close() call
st.write(df)

This is untested - I haven’t used jaydebeapi before - but what I’ve done is factor the code that establishes the database connection into a function that is decorated with @st.cache. This means that the connection object will be cached in memory between runs of your script, so that the connection is established when the app first runs and then doesn’t need to be re-established if you re-run it before exiting Streamlit. (Also note that you should probably remove the conn.close() call before the app finishes, since you’ll be reusing that connection on subsequent runs.)

What @tim just said :wink: except add ignore_hash=True to @st.cache, otherwise it will choke trying to cache your database connection.

@st.cache(ignore_hash=True)
def get_db_conn():
   ...

Also, @tim’s note about closing the database connection is very important! We need to think about how to fix that in a future iteration of st.cache!

Whoops - I’ll edit my post. Thanks for the catch, Adrien!

Close, but I’m getting two different warning messages and one traceback:
1st)
More information: to prevent unexpected behavior, Streamlit tries to detect mutations in cached objects so it can alert the user if needed. However, something went wrong while performing this check.

Please file a bug.

To stop this warning from showing in the meantime, try one of the following:

  • Preferred: modify your code to avoid using this type of object.
  • Or add the argument ignore_cache=True to the st.cache decorator.

2nd)
Streamlit failed to hash an object of type <class 'function'>.

More information: to prevent unexpected behavior, Streamlit tries to detect mutations in cached objects so it can alert the user if needed. However, something went wrong while performing this check.

Please file a bug.

To stop this warning from showing in the meantime, try one of the following:

  • Preferred: modify your code to avoid using this type of object.
  • Or add the argument ignore_cache=True to the st.cache decorator.

Then the actual traceback:
TypeError: object supporting the buffer API required
File "c:\users\user\appdata\local\continuum\anaconda3_64bit\envs\dashboards\lib\site-packages\streamlit\ScriptRunner.py", line 306, in _run_script
    exec(code, module.__dict__)
File "D:\pyscripts\streamlit\defect_rates.py", line 21, in <module>
    conn = get_db_conn()
File "c:\users\user\appdata\local\continuum\anaconda3_64bit\envs\dashboards\lib\site-packages\streamlit\caching.py", line 544, in wrapped_func
    return get_or_set_cache()
File "c:\users\user\appdata\local\continuum\anaconda3_64bit\envs\dashboards\lib\site-packages\streamlit\caching.py", line 509, in get_or_set_cache
    code_hasher.update(func)
File "c:\users\user\appdata\local\continuum\anaconda3_64bit\envs\dashboards\lib\site-packages\streamlit\hashing.py", line 185, in update
    self._update(self.hasher, obj, context)
File "c:\users\user\appdata\local\continuum\anaconda3_64bit\envs\dashboards\lib\site-packages\streamlit\hashing.py", line 217, in _update
    hasher.update(b)

On a related note, the warning message suggests adding ignore_cache=True, but @Adrien_Treuille’s suggestion says to add ignore_hash=True. I tried ignore_cache=True and got an error saying that argument doesn’t exist, so the warning message needs to change - just FYI.


OK, so I moved the two lines that read the username and password from os.environ out of the get_db_conn() function and placed them above it. Restarted the server. The dataframe renders at start-up, but if I hit refresh, there is a couple of seconds’ pause and then I get the dreaded “Python has stopped working” Windows dialog. I started a new terminal session and got the same message again. I will reboot my computer and report back. I will also try entering my username and password in plain text to see if I still get problems.

Sorry, I missed the top line of the 1st warning message when pasting. It should have included:
Streamlit failed to hash an object of type <os.environ>.

So this hinted to me that I needed to move the os.environ lines outside of the decorated get_db_conn() method.

After extensive trial and error, here is what I’ve found:
It errors out if the os.environ lines are referenced inside the decorated get_db_conn() method. When I move them outside of the decorated method, all is good…but only when I’m not using the jaydebeapi library, because with jaydebeapi I still get the “Python has stopped working” Windows message.

So I switched to the pyodbc library to connect to that same DB server, keeping the os.environ lines outside of the decorated method, to see whether I would still get the “Python has stopped working” Windows message. With pyodbc, everything works fine now. Yay! As a further test, I moved the os.environ lines back inside the decorated method, and it errors out again. Moved them back outside, all is good!

So apparently the jaydebeapi library still can’t handle whatever is going on with the decorated method, even with the os.environ lines moved outside of it, whereas the pyodbc library seems to work OK.

TL;DR - the st.cache decorator doesn’t play nice with os.environ lookups, so move them outside of the decorated method. Even with os.environ moved outside, jaydebeapi still errors, but pyodbc does not.
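
For reference, here’s roughly what my working version looks like now (just a sketch - the ODBC driver name and connection-string keywords are placeholders for my actual settings):

import os
import pandas as pd
import pyodbc
import streamlit as st

# Credentials read at module level, so st.cache never has to hash os.environ
user = os.environ['some_user_var']
pwd = os.environ['some_pwd_var']

@st.cache(ignore_hash=True)
def get_db_conn():
    # Placeholder connection string - adjust driver/DSN to the local ODBC setup
    return pyodbc.connect(
        'DRIVER={IBM DB2 ODBC DRIVER};'
        'HOSTNAME=some_server;PORT=50000;DATABASE=some_database;PROTOCOL=TCPIP;'
        f'UID={user};PWD={pwd};')

conn = get_db_conn()

sql = """
SELECT
*
FROM
some_table
FETCH FIRST 200 ROWS ONLY
"""

df = pd.read_sql(sql, conn, index_col=None)
st.write(df)  # no conn.close() here, since the cached connection is reused on reruns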

Submitted a bug report based on your experience. Thanks, @pybokeh!

@pybokeh (and @tim): I also added a feature request to handle the cache object lifecycle issue that came up in this conversation.