Unable to deploy Streamlit app from Snowflake quickstart

Hello Community,

I am following every step of the Snowflake quickstart at the link below:

https://quickstarts.snowflake.com/guide/getting_started_with_snowpark_for_python_streamlit/index.html?index=..%2F..index#0

I created a .py file on GitHub, linked a Streamlit app on Streamlit Cloud to this file, and tried deploying the app with both Python 3.8 and 3.9. During deployment the query gets executed on Snowflake, but Streamlit is unable to render the results and throws an unknown error. Is Streamlit Cloud facing an outage right now?

-Mandar


Welcome to the community!
Unfortunately this is too little information to give any advice.

  • What is the error message in the web console?
  • Please share a link to your public github repo.

The problem is that Streamlit does not display a clear error message. Here is the link to my repo. See if you can replicate this code in your own repo and test the Streamlit app. It is a Streamlit + Snowflake app that displays warehouse credit consumption, nothing special.

I think there are files missing in your github repo:

  • config.json
  • requirements.txt

I took them out because it is a public repo. Here is the content of my config.json file. I have re-uploaded the requirements.txt file.

{
    "account" : "",
    "user" : "user id",
    "password" : "pwd",
    "role" : "ACCOUNTADMIN",
    "database" : "SNOWFLAKE",
    "schema" : "ACCOUNT_USAGE",
    "warehouse" : "COMPUTE_WH"
}

This will not work because the file is missing.
Use st.secrets for this.

I had this config.json file in the repository before, and with the given connection parameters the app was able to connect to Snowflake and execute queries. I have already checked this under query_history in Snowflake. It was also visible in the Streamlit logs at the bottom right corner under Manage your app. I only removed the file before posting the link to my public repository here, so that I would not expose the credentials to my Snowflake account.

If the GitHub repo mentioned above is the one the deployment runs from, then it cannot work while the json file is missing.

Use st.secrets, because that’s what it’s meant for.

You should be able to use st.secrets as recommended by making this change:

def create_session_object():
    connection_parameters = st.secrets["snowflake"]
    session = Session.builder.configs(connection_parameters).create()
    return session

And then you can create a .streamlit/secrets.toml file (make sure not to upload this to GitHub, but when you create the app on Community Cloud you can copy and paste the same content):

[snowflake]
account = ...
user = ...
...

Can you test and see if that works locally, and then add those secrets to Community Cloud?
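Before building the session, it may help to check that the [snowflake] section really contains every key the app expects, since a missing or empty value can otherwise surface as a less obvious connection error. The sketch below is only an illustration: missing_connection_keys and REQUIRED_KEYS are hypothetical names, and the key list simply mirrors the config.json posted above rather than what Snowflake strictly requires.

```python
# Hypothetical helper: the key list mirrors the config.json shown earlier in this thread.
REQUIRED_KEYS = ("account", "user", "password", "role", "database", "schema", "warehouse")


def missing_connection_keys(params):
    """Return the expected keys that are absent or empty in a mapping of parameters."""
    return [key for key in REQUIRED_KEYS if key not in params or not params[key]]


# Made-up example values; in the app the mapping would be the [snowflake] secrets section.
example = {"account": "", "user": "user id", "password": "pwd"}
missing = missing_connection_keys(example)
print(missing)
```

In the app itself, the result could be shown with st.error(...) before any attempt to create the session, so a configuration problem is reported explicitly.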


I configured secrets and I can still reproduce the same error. Here is the link to my app. My analysis says that the statement below is the culprit:

snow_df_co2 = snow_df_co2.to_pandas()

Link to my Streamlit app:

https://mandarha1-streamlit-repo-sfdemo1-secrets-85s98s.streamlit.app/

Typo in line 8 of your code


Corrected, but the error remains. When I comment out the line below, the app works just fine. When I put the line back in the code, the app stops working.

snow_df_co2 = snow_df_co2.to_pandas()

Moreover, Streamlit does not report anything about the error under Manage app. The app simply crashes without an error message.

[screenshot of the Streamlit logs]

Please don’t post images of error messages here in the forum.
Have you tested your streamlit app on your local computer before pushing to github?

My guess is that your database query has an error, because the query is not executed until the snow_df_co2.to_pandas() call is reached.

See also in the snowpark documentation:

More importantly, note that at this point nothing is executed on the server because of lazy evaluation–which reduces the amount of data exchanged between Snowflake and the client/application. Also note that when working with Streamlit we need Pandas DataFrames and Snowpark API for Python exposes a method to convert Snowpark DataFrames to Pandas. An action, for example to_pandas() in our case, causes the DataFrame to be evaluated and sends the corresponding generated SQL statement to the server for execution.
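The lazy evaluation described in that passage can be illustrated with a toy model. This is a minimal sketch in plain Python, not the Snowpark API: LazyFrame, its filter method, and the sample rows are all made up for illustration. It only shows the general idea that transformations are recorded and nothing runs until an action is called.

```python
class LazyFrame:
    """Toy model of a lazily evaluated DataFrame (not the Snowpark API)."""

    def __init__(self, source_rows, predicates=(), log=None):
        self._source_rows = source_rows
        self._predicates = tuple(predicates)
        # Shared log so that derived frames record executions in one place.
        self.log = [] if log is None else log

    def filter(self, predicate):
        # Transformation: only records the step, touches no data.
        return LazyFrame(self._source_rows, self._predicates + (predicate,), self.log)

    def collect(self):
        # Action: this is the point where the recorded steps actually run.
        self.log.append(f"executed {len(self._predicates)} filter(s)")
        rows = list(self._source_rows)
        for predicate in self._predicates:
            rows = [row for row in rows if predicate(row)]
        return rows


# Made-up rows, for illustration only.
frame = LazyFrame([
    {"WAREHOUSE_NAME": "COMPUTE_WH", "CREDITS_USED": 3},
    {"WAREHOUSE_NAME": None, "CREDITS_USED": 1},
])
filtered = frame.filter(lambda row: row["WAREHOUSE_NAME"] is not None)
log_before = list(filtered.log)  # still empty: nothing has run yet
rows = filtered.collect()        # the filter runs only now
```

In this toy model, a mistake in any recorded step would raise only when collect() is called, which is why a problem anywhere in the chain tends to show up on the line that triggers the action.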

ok, noted.

My database query does not have an error. I can see the query being executed on Snowflake.

If you follow the link in my original post, you will see that I am following the official documentation for the Snowpark and Streamlit integration (in fact I literally copied the code), and it fails at the to_pandas conversion. This happens on my local Streamlit installation as well as on Community Cloud. I tried it on the cloud just to make sure that there is no problem with my installation.

I doubt it, because you said it crashes on the line with snow_df_co2.to_pandas().
Set Streamlit and Streamlit Cloud aside and focus on your Snowflake problem.
Strip your example, and especially the database query, down until something works, and then build it up again.

To reiterate: there is no problem in the Snowflake query. I have already provided a screenshot of the Streamlit logs, and I have also verified the same in the Snowflake backend. The query is executed and returns 6 rows in total. Here is the query that was executed on Snowflake and returned 6 rows:

SELECT * FROM (
    SELECT "WAREHOUSE_NAME", sum("CREDITS_USED") AS "Total Credit Consumption"
    FROM (
        SELECT "WAREHOUSE_NAME", to_decimal("CREDITS_USED", 10, 0) AS "CREDITS_USED"
        FROM WAREHOUSE_METERING_HISTORY
        WHERE "WAREHOUSE_NAME" IS NOT NULL
    )
    GROUP BY "WAREHOUSE_NAME"
)
ORDER BY "Total Credit Consumption" DESC NULLS LAST
LIMIT 10;

Anyway, I am closing this topic, because without a proper error message from Streamlit Cloud it is not possible to identify the root cause.

When I tried your code locally, a very clear error message showed up:

SnowparkFetchDataException: (1406): Failed to fetch a Pandas Dataframe. The error is: to_pandas() did not return a Pandas DataFrame. If you use session.sql(...).to_pandas(), the input query can only be a SELECT statement. Or you can use session.sql(...).collect() to get a list of Row objects for a non-SELECT statement, then convert it to a Pandas DataFrame.

Now, this error message doesn’t really make sense, because what you are doing is in fact a select call, as far as I can tell, but I tried their suggestion of using .collect() and converting it to a dataframe, and that worked fine.

You might look at the snowflakedb/snowpark-python repository on GitHub to see if this has already been reported as an issue, and if not, then open a new issue for it.

Anyway, this code seems to work fine, in place of .to_pandas()

snow_df_co2 = pd.DataFrame(snow_df_co2.collect())
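If the column names should be carried over explicitly, one option is to turn each collected row into a dict before handing the list to pandas. The following is a rough sketch under assumptions: it relies on each row exposing an as_dict() method, as Snowpark Row objects do, and FakeRow, rows_to_records, and the sample values are hypothetical stand-ins so that the snippet runs without a Snowflake connection.

```python
def rows_to_records(rows):
    """Turn collected rows into a list of plain dicts keyed by column name."""
    return [row.as_dict() for row in rows]


class FakeRow:
    """Hypothetical stand-in for a Snowpark Row, for illustration only."""

    def __init__(self, **fields):
        self._values = dict(fields)

    def as_dict(self):
        return dict(self._values)


# Made-up values; a real app would pass snow_df_co2.collect() instead.
records = rows_to_records([
    FakeRow(WAREHOUSE_NAME="COMPUTE_WH", TOTAL_CREDITS=12),
    FakeRow(WAREHOUSE_NAME="LOAD_WH", TOTAL_CREDITS=5),
])
# With pandas installed, pd.DataFrame(records) would then build the DataFrame.
```

This keeps the conversion step visible and easy to inspect while the to_pandas() problem is being investigated.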

Thanks Blackary. This has really helped and fixed my issue.

The error message you mentioned occurred even after I used a SELECT statement with session.sql. When I used the Streamlit / Snowpark DataFrame syntax, the app crashed without any error message (even though the correct query was being executed on the Snowflake backend), and the next time I had to reboot it just to see the same error message. It was really frustrating!

I can now proceed on my actual business logic :smiley:


Hello,
Thanks a lot for this thread. I have a very similar issue.
I am building a simple Streamlit app in Snowflake, where I want to display a line graph based on Snowflake data.
Once I collect the data (from a SELECT statement) with data = session.sql(sql).collect(),
I want to convert the output (supposedly a Snowpark DataFrame?) with df = data.to_pandas().
I get the error: AttributeError: 'list' object has no attribute 'to_pandas'
I am able to do the job with pd.DataFrame, but I would like to understand why I can't use the to_pandas() method.
Thanks for your feedback,
Vincent

Hi @vincentswid!

In this case, if you call collect(), it runs the query and returns a list of Row objects, and a plain Python list has no to_pandas() method.
If you call to_pandas() INSTEAD, it returns a pandas DataFrame.

e.g.

data = session.sql(sql) # no .collect()
df = data.to_pandas()