Ideas for Magic beyond Streamlit 1.0.0?

I spent some time last week thinking about future “Magic” features for Streamlit, and I thought it might be interesting to share some of those ideas with the community and to give the community an opportunity to put forward 10x ideas that are not yet on the Roadmap. I came at this from a personal slant, so this is all heavily flavored with “YMMV” & “FWIW”. Here’s an excerpt from what I wrote:

I am still getting familiar with the Magic features, but as I understand it from looking at the GitHub code, a key aspect is that you are rewriting the Python Abstract Syntax Tree on the fly to eliminate a lot of boilerplate code. I am a participant in OpenAI’s Codex private beta (as well as GitHub Copilot), and it’s clear that the world is moving rapidly towards a sort of push-button wish fulfillment for creating program functions. That fits well with what Streamlit is already good at. An integration there would be a quick win, and I suspect many other alternatives are likely to arise.
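To make the AST idea concrete, here is a minimal sketch of that kind of rewrite using only the standard library `ast` module. It is not Streamlit's actual implementation: the function name `add_magic_sketch` and the `show` callback are made up for illustration, and skipping call expressions is a simplification chosen for this example.

```python
import ast


def add_magic_sketch(source, display_name="show"):
    """Wrap top-level bare expressions in a call to `display_name`.

    Hypothetical helper for illustration only, not Streamlit's code.
    """
    tree = ast.parse(source)
    new_body = []
    for stmt in tree.body:
        # A "bare" expression is a statement that is only a value, e.g. `x * 2`.
        # Calls are left alone here (an assumption made for this sketch).
        is_bare = isinstance(stmt, ast.Expr) and not isinstance(stmt.value, ast.Call)
        if is_bare:
            call = ast.Call(
                func=ast.Name(id=display_name, ctx=ast.Load()),
                args=[stmt.value],
                keywords=[],
            )
            stmt = ast.copy_location(ast.Expr(value=call), stmt)
        new_body.append(stmt)
    tree.body = new_body
    ast.fix_missing_locations(tree)
    return compile(tree, filename="<magic-sketch>", mode="exec")


# Usage: collect whatever the rewritten script "displays".
shown = []
code = add_magic_sketch("x = 21\nx * 2\n'hello'\n")
exec(code, {"show": shown.append})
print(shown)
```

The script author never writes `show(...)`; the rewrite adds it, which is the boilerplate-removal effect described above.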

I believe the bigger challenges, and therefore the bigger opportunities for 10x ‘Magic’ features, are in figuring out ways to give individual data scientists the ability to compete with the data collection and processing “moats” that surround large organizations. For example, in a domain I know well, book publishing, a couple of big companies have huge advantages in terms of their access to large corpora of books. How is the individual or small team to compete with Amazon’s or Google’s access to millions of digitized books that they acquired through mechanisms that were, frankly, barely legal? People in universities can sometimes get lesser levels of access through programs like HathiTrust or through noncommercial collaboration, but that’s not good enough. The same situation applies in other industries, of course, where a handful of big companies control crucial troves of data in social networks, media, location, and so on. On the processing side, too, when the competition is among huge companies building trillion-parameter models, it can be daunting for individual data scientists to secure comparable resources.

So I would argue that one obvious opportunity for ‘magic’ is to somehow radically improve individual data scientists’ ability to collect and process large data sets, and I would add the requirement that to be ‘magic’, it should not be dependent on the goodwill of the FAANG companies or any national government. :wink:

As most of us probably realize, barriers to data collection, access, and sharing are long-standing and heavily non-software-based problems that have resisted many diligent and earnest attempts at creating noncommercial solutions based on metadata and data interchange standards. 10x changes to overcome these issues will not be easy, but maybe Streamlit has an opportunity to play a role. One intuition is that it might involve some sort of decentralization, blockchain, or peer-to-peer play to take advantage of those 4.5M data scientist downloads!

What other ideas are there for 10x Magic?


Hi @fredzannarbor -

Not sure what you’re suggesting here in terms of OpenAI integration…are you saying that you think it’s possible for Streamlit to incorporate OpenAI Codex as the way to open up more magic integrations?

Best,
Randy


Yes, it should be very easy. There is an OpenAI Python API, and it is just a question of selecting the right engine (Codex) and formulating the right query to OpenAI. Codex works really well if you give it comment blocks. Something like this:

st.codex("""
create a function that transposes dataframe and gives user choice of stylers
transpose dataframe
get list of available stylers
create streamlit select with list of available stylers
apply selected styler
""")

would probably work. There would be a lot of experimentation required to figure out the best approach, but Codex is really good at creating functions on demand, and creating functions is a big part of the Streamlit workflow.
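As a rough sketch of the plumbing such a call would need, the snippet below turns a plain-language description into a comment-block prompt and checks that whatever comes back at least parses as Python. Everything here is hypothetical: `build_prompt`, `generate_function`, and the `complete` callback are illustrative names, and `fake_complete` stands in for a real model call so that no particular OpenAI API signature is assumed.

```python
import ast
from typing import Callable


def build_prompt(description: str) -> str:
    """Turn a multi-line description into a Python comment block."""
    steps = [line.strip() for line in description.splitlines() if line.strip()]
    return "\n".join(f"# {step}" for step in steps) + "\n"


def generate_function(description: str, complete: Callable[[str], str]) -> str:
    """Ask `complete` (any prompt -> code callable) for source code.

    Raises SyntaxError if the returned text is not valid Python.
    """
    generated = complete(build_prompt(description))
    ast.parse(generated)  # syntax check only; says nothing about behavior
    return generated


def fake_complete(prompt: str) -> str:
    # Stand-in for a code-generation model; always returns the same function.
    return "def transpose(df):\n    return df.T\n"


prompt = build_prompt("""
transpose dataframe
get list of available stylers
""")
source = generate_function("transpose dataframe", fake_complete)
print(prompt)
print(source)
```

Passing the model call in as a callable keeps the experimentation (engine choice, prompt wording) separate from the wrapper itself.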

I’m sure you could get an organizational invite from OpenAI if you don’t already have one. I am on the private beta for Codex and would be glad to help get the ball rolling.

Fred

Interesting.

One thing I could see as an issue with adding this directly to the Streamlit core library is that it’s somewhat non-deterministic, if only because the Python runtime has no idea what to expect back. Our magic functions do check for a number of common data types, but only a minor subset of what Codex might return. And then people would think Streamlit was broken, instead of Codex being excessively clever :slight_smile:

This would be an interesting thing to package up as a potential Streamlit component though. Put all of the mechanisms to call Codex into a simple-ish authentication function, then do a check similar to what Streamlit does for magic, incrementally defining more of the output types as someone trips over an error.
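One way to picture that incremental type check is a small registry that maps output types to renderers and fails loudly on anything it has not been taught yet. This is only a sketch under assumptions: the names `renders`, `render`, and `UnsupportedOutput` are invented here, and the tuples returned by the renderers stand in for real Streamlit display calls.

```python
RENDERERS = {}


def renders(*types):
    """Register a renderer for one or more output types."""
    def register(func):
        for output_type in types:
            RENDERERS[output_type] = func
        return func
    return register


class UnsupportedOutput(TypeError):
    """Raised when generated code returns a type with no renderer yet."""


@renders(str)
def render_text(value):
    return ("text", value)


@renders(int, float)
def render_number(value):
    return ("number", value)


@renders(list, tuple)
def render_sequence(value):
    return ("table", list(value))


def render(value):
    # Walk the class hierarchy so subclasses reuse a parent's renderer.
    for cls in type(value).__mro__:
        if cls in RENDERERS:
            return RENDERERS[cls](value)
    raise UnsupportedOutput(
        f"generated code returned {type(value).__name__}, which has no renderer yet"
    )


print(render("hi"))
print(render((1, 2)))
```

Each time someone trips over `UnsupportedOutput`, supporting the new type is one more decorated function, and the error message points at the generated code rather than at Streamlit.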

Best,
Randy


I might be interested in giving that a try!
