Introducing st.experimental_connection: Easily connect your app to data sources and APIs using our new connection feature. Find more details in our blog post and the API reference, and stay tuned for in-depth documentation! In the meantime, explore our updated MySQL and Snowflake connection tutorials for examples of this feature.
Here’s my take on a few benefits of Streamlit having this built in (even though, depending on the connection type, it is in fact wrapping SQLAlchemy, psycopg, etc.):
Standard and more automated handling of credentials.
Without st.experimental_connection, you have to decide where to store the credentials (hopefully in secrets.toml), and manually pull them out and pass them to the underlying library. With the built-in connection, you store your credentials in secrets and they get passed to the connection automatically.
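As a sketch of what that looks like for the SQL connection (the connection name "sql" matches the tutorials; all credential values below are hypothetical placeholders), the credentials live in .streamlit/secrets.toml under a [connections.<name>] section:

```toml
# .streamlit/secrets.toml -- hypothetical placeholder values
[connections.sql]
dialect = "mysql"
host = "localhost"
port = 3306
database = "pets"
username = "my_user"
password = "my_password"
```

st.experimental_connection("sql") then picks these up on its own; you never read st.secrets yourself or pass each value to the driver.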
Automatic caching of connection object and read results
Usually there is a bunch of boilerplate code required if you want to a) cache the database connection object so that it can be reused, and b) cache the results of reads from your database so you don’t make a new request every time your app reruns. With the built-in connection, both are handled for you.
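To make the two layers concrete, here is a plain-Python sketch (not Streamlit code; stdlib sqlite3 and functools.lru_cache stand in for st.cache_resource and st.cache_data, and the table/data are made up) of the boilerplate being described:

```python
import sqlite3
from functools import lru_cache

@lru_cache(maxsize=1)
def get_connection():
    # Stand-in for caching the connection object (like st.cache_resource):
    # the connection is created once and the same object is reused.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE pet_owners (owner TEXT, pet TEXT)")
    conn.execute("INSERT INTO pet_owners VALUES ('jerry', 'fish')")
    conn.commit()
    return conn

@lru_cache(maxsize=32)
def run_query(sql: str) -> tuple:
    # Stand-in for caching read results (like st.cache_data):
    # repeating the same SQL string hits the cache, not the database.
    return tuple(get_connection().execute(sql).fetchall())
```

With the built-in connection, both of these layers come for free: the connection object is cached for you, and conn.query() results are cached with a configurable TTL.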
A generally-standard API across data sources.
Today, if you are using SQLite, PostgreSQL, BigQuery, Snowflake, S3, Google Sheets, or something else to store your data, the code required to a) set up a connection to the data source and b) read some data from it will look very different from source to source. If you use the built-in connection types, or community-created ones built on st.experimental_connection, the API for setting up your connection and using it will look very similar across these different types of data sources.
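For example, here is a non-runnable sketch of that shared shape across a SQL database and S3 (the connection names, bucket path, and table are hypothetical; FilesConnection comes from the community st-files-connection package):

```python
import streamlit as st
from st_files_connection import FilesConnection

# SQL source: credentials are resolved from secrets.toml
sql_conn = st.experimental_connection("sql")
df1 = sql_conn.query("SELECT * FROM pet_owners")

# S3 source: same "create connection, then read" shape,
# just a different connection type and read method
s3_conn = st.experimental_connection("s3", type=FilesConnection)
df2 = s3_conn.read("my-bucket/pet_owners.csv", input_format="csv")
```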
As somebody who builds a lot of Streamlit apps that connect to a variety of different sources, I am looking forward to all of these benefits, and my guess is that this standard interface will be especially helpful for those getting started with Streamlit who want to connect to a third-party data source.
Hi @blackary, would you mind commenting on experimental_connection vs cache_resource, please? Outside of the credential handling, are there other reasons to use the former? (Our company handles the process for us through some credential magic so that we don’t have to enter passwords.)
Your point 2 is interesting… If I understand it correctly, with this code:
conn = st.experimental_connection("sql")
df = conn.query("select * from pet_owners")
The result from the query is cached, so subsequent runs reuse the result from the first run, is that right? However, if the data in the database is expected to change between queries, then should I avoid this experimental_connection mechanism and stick with cache_resource? Thanks.
Hi @HHest, st.experimental_connection uses cache_resource under the hood to cache the database connection. It also accepts a ttl argument, so you can specify how long results are cached for the queries you run on the data. If you pass ttl=0, the results will not be cached at all. So, if you don’t ever want to cache any queries, then:
conn = st.experimental_connection("sql", ttl=0)
conn.query("...") # will always re-run the query, never cache the data
@blackary, thank you for the answer, but that’s exactly my doubt about the new feature. It looks like added complexity without clear value.
In my experience, .env files and environment variables seem to be much more common. They are suitable for any type of deployment (container or direct) and supported by many, many tools. Streamlit secrets doesn’t look like a bonus, imo.
Two aspects there. First, automatic caching is a recipe for hard-to-catch, non-obvious errors when the data changes. That looks like more of a minus than a plus to me. Second, isn’t that exactly what @st.cache_resource does, in a more obvious way, for the connection (not the data)? Assuming the underlying library supports cross-thread operation on the connection, that is. Hiding the actual connection is a big question too, imo.
Isn’t that why ORMs exist (SQLAlchemy)? Plus, if you use a connection and a query, you still have to write platform-specific queries (which can differ across platforms depending on supported types and feature sets). An ORM also hides the connection, but it gives you an object model as a result, and that object model can be applied to different sources.
What I’m looking for is an example where, say, SQLAlchemy is a bad choice and st.experimental_connection would really be preferable.
P.S.: imo, caching data by default is a very bad choice.
We just added a blog post detailing it as well, and we will have more documentation soon about use cases:
Just like Secrets, st.experimental_connection is a totally optional feature. If you have an existing credential-management solution, and driver imports and caching that work for your use case, you should totally just use that.
Connections can also support environment variable usage, for example if you want to use the FilesConnection to access S3, all you need to do is specify your AWS credentials using ENV the normal way and it will work.
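For instance, S3 access via fsspec/s3fs picks up the standard AWS environment variables (the values below are placeholders):

```shell
# Standard AWS environment variables -- placeholder values
export AWS_ACCESS_KEY_ID="your-access-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
export AWS_DEFAULT_REGION="us-east-1"
```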
The intent is to simplify and promote access to ORMs and existing tools like the ones you mentioned. SQLConnection uses SQLAlchemy underneath; FilesConnection uses fsspec. For many developers who are not already familiar with these tools, it can be a great entry point that allows more rapid prototyping, more iterative development, and more compact, clearer, portable code.
If you prefer to have more control and already have a workflow that works well for you, you should continue using your existing approach; there’s no need to bother with st.connection.
I still see the arrow widget; it’s just dependent on whether or not there is other content in the sidebar. If there is no content in the sidebar, then navigation doesn’t collapse under the arrow widget. Can you add st.sidebar.write('') and see if the menu shows up collapsed with the arrow then?