I spent some time last week thinking about future “Magic” features from Streamlit, and I thought it might be interesting to share some of those ideas with the community and to give the community an opportunity to put forward 10x ideas that are not yet on the Roadmap. I came at this from a personal slant, so this is all heavily flavored with “YMMV” & “FWIW”. Here’s an excerpt from what I wrote:
I am still getting familiar with the Magic features, but as I understand it from looking at the GitHub code, a key aspect is that you are rewriting the Python Abstract Syntax Tree on the fly to eliminate a lot of boilerplate code. I am a participant in OpenAI’s Codex private beta (as well as GitHub Copilot), and it’s clear that the world is moving rapidly towards a sort of push-button wish fulfillment for creating program functions. That fits well with what Streamlit is already good at. An integration there would be a quick win, and I suspect many other alternatives are likely to arise.
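For anyone who hasn't looked at AST rewriting before, here is a minimal sketch of the general technique using Python's standard `ast` module. To be clear, this is not Streamlit's actual implementation, just my own illustration of the idea: each bare expression at the top level of a script gets wrapped in a call to a display function. The names `show`, `wrap_bare_expressions`, and `run_with_magic` are hypothetical, made up for this example.

```python
import ast


def wrap_bare_expressions(source, display_name="show"):
    """Parse `source` and turn each top-level bare expression `e` into `show(e)`.

    `display_name` is a hypothetical display function, not a Streamlit API.
    """
    tree = ast.parse(source)
    new_body = []
    for stmt in tree.body:
        if isinstance(stmt, ast.Expr):
            # Build the call `show(<original expression>)`.
            call = ast.Call(
                func=ast.Name(id=display_name, ctx=ast.Load()),
                args=[stmt.value],
                keywords=[],
            )
            # Keep the original line numbers so tracebacks stay meaningful.
            stmt = ast.copy_location(ast.Expr(value=call), stmt)
        new_body.append(stmt)
    tree.body = new_body
    # Fill in location info for the nodes created above.
    return ast.fix_missing_locations(tree)


def run_with_magic(source):
    """Run the rewritten script and collect everything passed to `show`."""
    shown = []
    namespace = {"show": shown.append}
    exec(compile(wrap_bare_expressions(source), "<magic>", "exec"), namespace)
    return shown


script = "x = 21\nx * 2\n'hello'\n"
print(run_with_magic(script))  # [42, 'hello']
```

The assignment `x = 21` is left alone, while the two bare expressions are displayed without the script author writing any display call. This sketch wraps every top-level expression statement, including ordinary function calls; a real tool would presumably be more selective.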
I believe the bigger challenges, and therefore the bigger opportunities for 10x ‘Magic’ features, are in figuring out ways to give individual data scientists the ability to compete with the data collection and processing “moats” that surround large organizations. For example in a domain I know well, book publishing, a couple of big companies have huge advantages in terms of their access to large corpora of books. How is the individual or small team to compete with Amazon or Google’s access to millions of digitized books that they acquired through mechanisms that were, frankly, barely legal? People in universities can sometimes get lesser levels of access through programs like HathiTrust or through noncommercial collaboration, but that’s not good enough. The same situation applies in other industries, of course, where a handful of big companies control crucial troves of data in social networks, media, location, and so on. On the processing side, too, when the competition is among huge companies building trillion-parameter models, it can be daunting for individual data scientists to secure comparable resources.
So I would argue that one obvious opportunity for ‘magic’ is to somehow radically improve individual data scientists’ ability to collect and process large data sets, and I would add the requirement that to be ‘magic’, it should not be dependent on the goodwill of the FAANG companies or any national government.
As most of us probably realize, barriers to data collection, access, and sharing are long-standing and heavily non-software-based problems that have resisted many diligent and earnest attempts at creating noncommercial solutions based on metadata and data interchange standards. 10x changes to overcome these issues will not be easy, but maybe Streamlit has an opportunity to play a role. One intuition is that it might involve some sort of decentralization, blockchain, or peer-to-peer play to take advantage of those 4.5M data scientist downloads!
What other ideas are there for 10x Magic?