Writing a parquet file to Snowflake via the PUT command

In my deployed Streamlit app, the user can upload .mat and .hea files, and I turn them into CSV/parquet files to upload to Snowflake. How can I get this part working?:

df.to_parquet(os.path.join(os.path.dirname(__file__), "data/data.parquet"), engine='fastparquet')
filename = os.path.join(os.path.dirname(__file__), "data/data.parquet")
query = "put file://." + filename + " @MY_STAGE"
session.sql(query).collect()

In this example I want to turn my pandas df into a parquet file, but when the put command runs I get this error:

2023-03-21 12:58:32.633 query: [put file://./app/snowflakeai/file_handling/data.parquet @MY_STAGE]
2023-03-21 12:58:32.810 query execution done
2023-03-21 12:58:32.813 Failed to execute query [queryID: None] put file://./app/snowflakeai/file_handling/data.parquet @MY_STAGE
253006: 253006: File doesn't exist: ['./app/snowflakeai/file_handling/data.parquet']

Where do files get saved when you write to them, and how do I access them with the put command?

I don't see anything obviously wrong with your script, but according to the query logs it looks like you're trying to upload data.parquet instead of data/data.parquet. Here's a slightly modified version of your script that works fine for me. I would recommend creating the path once and using the same path variable every time, to make sure you're not trying to upload a different file from the one you created.

from pathlib import Path
...

MY_STAGE = "TEST"

path = Path(__file__).parent / "data" / "data.parquet"
path.parent.mkdir(parents=True, exist_ok=True)

df.to_parquet(path)

query = f"put file://{path} @{MY_STAGE}"

st.write(query)

if st.button("Put file"):
    st.write(session.sql(query).collect())

I changed the code to not have the data directory and still got the same output. I think writing to the file works fine, but for some reason the Snowflake put command can't find my file. I'm using Snowpark with a session. Locally it worked fine because I was able to give an absolute path to the saved files, but that doesn't appear to work in the deployed environment.

df.to_parquet(os.path.join(os.path.dirname(__file__), "data.parquet"), engine='fastparquet')
filename = os.path.join(os.path.dirname(__file__), "data.parquet")
query = "put file://." + filename + " @MY_STAGE"
session.sql(query).collect()

This is how I do it at the moment, and I get the same error as before:
File doesn't exist: ['./app/snowflakeai/file_handling/data.parquet']

I suspect that the issue is related to using __file__ to get the location of the data, vs. the location that the Streamlit app is running from. However, that should be solved if you get the absolute path and use that value both for the to_parquet and for the put.

In the case of my code, that would be

path = Path(__file__).parent.absolute() / "data" / "data.parquet"

You would also have to remove the . in the put query for that to work.

That was it! Thanks for the help!
