Snowflake to DataFrame long runtime

I am helping build streamlit apps for a client that pulls in live data from snowflake tables. When dumping data into a a data frame using code below, the runtime is way too long for a sufficient user experience. To be clear, the connection to snowflake and query seem to run very fast it’s the .fetch_pandas_all() that is taking a long time.

Example:
cur.execute(query)
df=cur.fetch_pandas_all()

Code reference: Using Pandas DataFrames with the Python Connector — Snowflake Documentation

I know others have figured out how to deploy streamlit apps within corporate walls and not have this runtime issue. Looking for suggestions or alternatives. Thanks!

Hi @ksoderholm22, welcome to the forum!

The eventual answer will be to build your streamlit apps within Snowflake, but that’s not available yet (sneak preview here).

In the meantime, if you’re regularly rerunning the same queries to pull data into pandas, then you should consider using st.experimental_memo to cache the results so you don’t have to re-run the query and fetch_pandas_all every time. st.experimental_memo - Streamlit Docs

e.g.

@st.experimental_memo
def get_df(query):
    cur.execute(query)
    return cur.fetch_pandas_all()

You might also consider that you may not actually need to load the whole dataset, and might be able to transform it within the query itself before returning the data to python.

1 Like