Issue with connecting a Streamlit web app to BigQuery

Hi,
I am building a web app with the Streamlit framework that connects to BigQuery to fetch tables. The issue is that the BigQuery authentication link, and the prompt for pasting the generated authentication code, appear in the CLI on the backend. What I want is for the link to appear in the web app itself, so that once the app is hosted, users can click the link and paste the generated code into a field in the web app. Can someone suggest a way to do that?

Hi @Shashank_Vats -

We use BigQuery internally (as well as Streamlit to do our reporting!). What you want to do is create a service account key file.

If you have the file in its default location, you can do something like the following so that your credentials aren’t saved in your Python code:

from google.cloud import bigquery

# If you don't specify credentials when constructing the client, the
# client library will look for credentials in the environment.
client = bigquery.Client()

Best,
Randy

The problem is that the app is meant to be used by multiple users, most of whom will have different project IDs, so I don't think a service account key file will serve the purpose here. Hence we want authentication to happen on the front end, every time, for each user.
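
For reference, a rough sketch of what per-user authentication on the front end could look like, assuming the google-auth-oauthlib package and an OAuth client registered in the Google Cloud console. The client ID, client secret, redirect URI, and the helper names make_client_config and bigquery_client_for_user are all placeholders, not something from this thread. How the user actually obtains the code to paste depends on the redirect URI configured for that OAuth client, which this sketch leaves open.

```python
# Sketch only: client ID/secret, redirect URI and helper names are assumptions.
SCOPES = ["https://www.googleapis.com/auth/bigquery"]


def make_client_config(client_id, client_secret, redirect_uri):
    """Build the dict that google_auth_oauthlib expects for a web OAuth client."""
    return {
        "web": {
            "client_id": client_id,
            "client_secret": client_secret,
            "auth_uri": "https://accounts.google.com/o/oauth2/auth",
            "token_uri": "https://oauth2.googleapis.com/token",
            "redirect_uris": [redirect_uri],
        }
    }


def bigquery_client_for_user(config, redirect_uri, project_id):
    """Show the auth link in the app; return a client once a code is pasted."""
    # Imported here so the helper above can be used without these packages.
    import streamlit as st
    from google.cloud import bigquery
    from google_auth_oauthlib.flow import Flow

    # Reuse credentials already obtained in this browser session.
    if "bq_credentials" in st.session_state:
        return bigquery.Client(
            project=project_id, credentials=st.session_state["bq_credentials"]
        )

    # Keep one flow and one URL per session: Streamlit reruns the script on
    # every interaction, and the pasted code must go back to the same flow.
    if "oauth_flow" not in st.session_state:
        flow = Flow.from_client_config(
            config, scopes=SCOPES, redirect_uri=redirect_uri
        )
        auth_url, _state = flow.authorization_url(prompt="consent")
        st.session_state["oauth_flow"] = flow
        st.session_state["oauth_url"] = auth_url

    st.markdown(f"[Authorize access to BigQuery]({st.session_state['oauth_url']})")
    code = st.text_input("Paste the authorization code here")
    if not code:
        return None  # nothing pasted yet

    flow = st.session_state["oauth_flow"]
    flow.fetch_token(code=code)  # exchange the pasted code for tokens
    st.session_state["bq_credentials"] = flow.credentials
    return bigquery.Client(project=project_id, credentials=flow.credentials)
```

In the app script you would call bigquery_client_for_user(...) near the top and stop rendering the rest of the page while it returns None. Each user supplies their own project ID, so no shared service account key is involved.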

@randyzwitch or anyone, can you please share an example of caching Google BigQuery data to improve Streamlit app performance? I am getting my dataframe from BigQuery using the code below, but when I put @st.cache before my function definition, I get an error about hashing.

df = (
    bqclient.query(query_string)
    .result()
    .to_dataframe(bqstorage_client=bqstorageclient)
)

What does the bqstorage_client=bqstorageclient part of the code do? If you remove it, st.cache should work fine as far as I can tell, since the result will be a pandas dataframe (which can be cached).
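
A minimal sketch of that idea, assuming a newer Streamlit release that provides st.cache_data and st.cache_resource (this thread uses the older st.cache). The function names fetch_dataframe and build_cached_fetch and the 10-minute ttl are made up for illustration. The point is that the cached function takes only the query string, so Streamlit never has to hash a client object.

```python
def fetch_dataframe(client, query_string):
    """Run the query and return the result as a pandas DataFrame."""
    return client.query(query_string).result().to_dataframe()


def build_cached_fetch():
    """Return a query function whose results are cached per query string."""
    # Imported here so fetch_dataframe can be used without these packages.
    import streamlit as st
    from google.cloud import bigquery

    @st.cache_resource
    def get_client():
        # One shared client; credentials are looked up in the environment.
        return bigquery.Client()

    @st.cache_data(ttl=600)  # assumed 10-minute lifetime, adjust as needed
    def cached_fetch(query_string):
        # Only the query string is an argument, so only it gets hashed.
        return fetch_dataframe(get_client(), query_string)

    return cached_fetch
```

In the app you would then write something like cached_fetch = build_cached_fetch() followed by df = cached_fetch(query_string); repeated runs with the same query string should come from the cache until the ttl expires.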

Best,
Randy

Thanks @randyzwitch! You’re right, once I remove the “bqstorage_client=bqstorageclient”, I don’t get the hashing error. Now my function includes the below code:

df = (
    bqclient.query(query_string)
    .result()
    .to_dataframe()
)

However, the caching still doesn’t seem to be working. I will explain further on this thread.