Streamlit is very convenient for quickly developing apps that have a small, fixed state space (i.e., all the widgets).
How, if at all, is it possible to interactively evolve some more complex state, like a dictionary of annotations? Streamlit is good at interactively showing data instances. But how can I record, say, a binary label per instance that the user indicates with a button click or keystroke?
For me, an important part of what I do is reviewing and annotating data, often observation by observation, and reviewing and evaluating models, observation by observation (and then retraining or showing results to stakeholders).
So imagine a deduplication algorithm for database records. I would want to show pairs of records I think are duplicates, paging through them, marking them as dupes or not dupes. I might train a logistic regression on features or improve features. I then want to go through the pairs, see the two records side by side, and the prediction. If they are in the training set, I can see if we have a false positive or negative (and why). If they are not in the training set, I can add them.
Finally, I can use a similar UI to show to stakeholders what the algorithm is going to do.
I find myself spending an inordinate amount of time working on these kinds of data science UIs. Streamlit may do this right now, it's just not immediately apparent how. I think this is at least part of what Lutz is also asking.
I should mention that a slightly more orthodox (but perhaps less intuitive) approach would be to use a mutable cache object by passing ignore_hash=True to st.cache.
Hi Adrien,
thanks for the responses. I already checked out the SessionState code and implemented a proof-of-concept of an annotation script, but it still feels very hacky and is far from readable.
Can you elaborate on the ignore_hash idea? I don't see how hashing (or not hashing) the output of a function call changes anything.
Maybe my other question (Memoize/cache partial function) goes in a similar direction? If you allowed a function to be executed another time when the last call with the same arguments yielded None, I could see how to implement an annotation script.
To elaborate, ignore_hash=True lets you create mutable state. For example:
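import streamlit as st

@st.cache(ignore_hash=True)
def get_state():
    # The returned list is cached, so every rerun gets the same object.
    return []

state = get_state()
state.append(len(state))  # mutate the cached list in place
st.write(state)
st.button('Rerun')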
Every time you run this script, it appends an element to the state.
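Why this works can be sketched without Streamlit: a memoized function hands back the same list object on every call, so a mutation made during one run is still there in the next. The sketch below uses functools.lru_cache purely as a stand-in for st.cache(ignore_hash=True), and run_script is a hypothetical name for one top-to-bottom execution of the script. It is an analogy under those assumptions, not a description of how Streamlit's cache is implemented.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_state():
    # Evaluated once; every later call returns the very same list object.
    return []


def run_script():
    # Hypothetical stand-in for one top-to-bottom run of the script.
    state = get_state()
    state.append(len(state))
    return list(state)  # copy, so each snapshot is independent


first = run_script()
second = run_script()
third = run_script()
print(first, second, third)
```

Each call to run_script sees the elements appended by the earlier calls, because get_state never builds a fresh list.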
I also responded to your partial cache function in the other thread.
Thanks for all the great questions and happy app creating!!
Hi Adrien,
thanks for the example!
Based on your code, I coded this small prototype:
import streamlit as st

data = ["eins", "zwei", "drei", "vier", "fünf"]
categories = ["good", "bad"]

@st.cache(ignore_hash=True)
def get_annotation():
    # Mutable dict shared across reruns: instance -> category.
    return {}

# Placeholders, so the heading can be overwritten later in the same run.
instance = st.empty()
buttons = {}
for cat in categories:
    buttons[cat] = st.empty()

annotation = get_annotation()

if len(annotation)<len(data):
    for cat in categories:
        buttons[cat] = st.button(cat)

    # Show the first instance that has no annotation yet.
    instance.markdown("# "+data[len(annotation)])

    for cat in categories:
        if buttons[cat]:
            # Record the click, then show the next instance if there is one.
            index = len(annotation)
            annotation[data[index]] = cat
            if len(annotation) < len(data):
                instance.markdown("# " + data[len(annotation)])

st.write(annotation)
It is relatively readable and does what I want. Just one small issue: why is the text rendered one time too many? That is, with 5 data instances, I have to click 6 times (the last click happens while the last data instance is shown a second time, and is inconsequential).
The extra prompt appears because of a tricky quirk of button semantics: you're updating the state after the button is clicked (in the if buttons[cat]: block) but before the script is rerun.
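To make the ordering concrete, here is a minimal pure-Python model of the prototype, with no Streamlit involved. The function run_script and its clicked argument are hypothetical: each call stands for one run of the script, and clicked is the category whose button returned True in that run (None on the initial load). It assumes the click handling sits inside the len(annotation) < len(data) branch.

```python
data = ["eins", "zwei", "drei", "vier", "fünf"]


def run_script(annotation, clicked=None):
    """Hypothetical model of one run; returns the heading left on screen."""
    shown = None
    if len(annotation) < len(data):
        # Heading and buttons are drawn from the state as it is at the
        # start of the run, before any click is handled.
        shown = data[len(annotation)]
        if clicked is not None:
            # The state is only updated after rendering.
            annotation[data[len(annotation)]] = clicked
            if len(annotation) < len(data):
                # The placeholder is overwritten with the next instance.
                shown = data[len(annotation)]
            # Otherwise nothing overwrites it: the last item stays visible.
    return shown


annotation = {}
history = [run_script(annotation)]  # initial load, no click
for _ in data:
    history.append(run_script(annotation, "good"))  # five real clicks
history.append(run_script(annotation, "good"))  # the inconsequential sixth
print(history)
```

In this model the fifth click records the last label but leaves the last heading and the buttons on screen, and only the sixth run notices that nothing is left to annotate.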
To be honest, the conversation is making me rethink the button API a tiny bit.
In an ideal world, this is how I think your code should be written:
import streamlit as st

data = ["eins", "zwei", "drei", "vier", "fünf"]
categories = ["good", "bad"]

@st.cache(ignore_hash=True)
def get_annotation():
    return {}

annotation = get_annotation()
index = len(annotation)  # index of the first instance not yet annotated
if index < len(data):
    st.markdown("# " + data[index])
    for cat in categories:
        if st.button(cat):
            # Record the label, then restart the script from the top.
            annotation[data[index]] = cat
            st.rerun()
st.write(annotation)
But unfortunately, st.rerun() does not exist.
I think it could be hacked together using st.ScriptRunner.RerunException, but this requires knowledge of the internal workings of Streamlit which I do not possess. I'm asking the eng team on our internal Slack channel if they can help.
Please sit tight and I'll get back to you.
p.s. You're helping us understand and improve Streamlit's design. Thank you for these great questions!
@Adrien_Treuille I'm glad to see you're rethinking the button API; I've found that it never behaves as I would expect. The ability to trigger a rerun would be a nice addition. The other awkward part about a button is that it seems to be set to True if the button was previously clicked. This makes it awkward when I want to use the button to trigger some action and update the state in the app (it gets stuck in an infinite loop since the button stays True). What I would expect is something where you:
1. Click button to trigger an update
2. Execute code to modify data objects
3. Re-run the top-down execution with the modified data objects
What I've observed in the past is something like
import streamlit as st
import requests

# Placeholder endpoint; requests needs an explicit scheme.
external_api = 'http://localhost/foo'
color = st.multiselect(
    'What are your favorite colors',
    ('Green', 'Yellow', 'Red', 'Blue'))
submit = st.button('send to server')
if submit:
    requests.post(external_api, json={'color': color})
Will just infinitely send the default color to the server since submit stays True. I can try to create a self contained example later if it would be helpful.
> Re-run the top-down execution with the modified data objects
I agree that your three-part flow for how a button should work is probably right. We're thinking about how to do that. One API would be something like
> Will just infinitely send the default color to the server since submit stays True.
I find this very surprising. The way the buttons work now is that the app is run from top to bottom with the button returning True; the next time the app is run, it should be set back to False.
> I can try to create a self contained example later if it would be helpful.
That would be great. If we can reproduce this behavior and it differs from what I just described, then this is definitely a bug we should fix! Thank you!!
This is all very cool and interesting. I was able to take @Lutz's example and convert it to load a DataFrame, add annotations to the DataFrame, and finally save it for a current project. There are, of course, many possible embellishments (saving work so far, seeking up to elements not yet annotated, quitting early, etc).
I also ran into the (same) problem where it shows the last item twice. Additionally, the necessity of writing the same code twice to get it to "run" was weird but I just wrote a display() function. All of the global state is making the functional programmer in me twitch.
There are minor things (being able to put the buttons in a row) that I'd like to see, otherwise. I foresee some NLP applications where you might want to return the index of the selection (I'm trying to think about things I have done in the past).
I have no idea if it is "good", though.
import streamlit as st
import pandas as pd

# Map each button label to the numeric code stored in the DataFrame.
categories = {"good": 3, "ambiguous": 2, "skip": 1, "bad": 0}

@st.cache(ignore_hash=True)
def get_data():
    data = pd.read_csv("test.csv")
    data["annotation"] = None
    return data

@st.cache(ignore_hash=True)
def get_annotation():
    # Mutable state: index of the row currently being annotated.
    return {"row": 0}

row = st.empty()
match = st.empty()
buttons = {}
data = get_data()
annotation = get_annotation()

def detail():
    # Render the current row into the two placeholders.
    current_obs = data.loc[annotation["row"]]
    row.markdown(f"# {annotation['row'] + 1}")
    match.markdown(f"**{current_obs['location']}** matched **{current_obs['area']}**")

if annotation["row"] < len(data.index):
    for cat in categories.keys():
        buttons[cat] = st.button(cat)
    detail()
    for cat in categories.keys():
        if buttons[cat]:
            # Store the label, advance, and show the next row if any.
            data.loc[annotation["row"], "annotation"] = categories[cat]
            annotation["row"] += 1
            if annotation["row"] < len(data.index):
                detail()
else:
    data.to_csv("test_annotated.csv")
    st.write("finished")
I tried to reproduce the odd behavior yesterday and was unable to. I've been trying to remember the exact conditions, but until I'm able to reproduce it, let's assume that it was user error.
Decorating a function seems like a natural way to encapsulate the action that a button should take, though I'm a little unclear on how you would place the button on the screen. Would it be something like:
import streamlit as st

@st.button('A button')
def callback():
    do_something()
    do_something_else()

st.title('Example')
st.write('Lorem ipsum dolor sit amet, consectetur adipiscing elit')
callback()
Since we're brainstorming cool APIs, a solution that would avoid that problem is something like:
st.button("Click me!", callback=my_callback)
...but it's unclear how that would work given Streamlit's execution model.
So a more "Streamlity" solution would be to limit what can be done in the callback function by transforming it into a pure "state transition function", like this:
SessionState is one of these objects we've proposed in the past. It holds information that persists across reruns of the same script, on a per-user basis.
We'd make SessionState objects have an .update decorator that is used to mark a function as a "state transition function". That is, a function whose sole purpose is to take a SessionState object and update it, and which is not allowed to do things like refer to outer scope objects. This is much less general than just a "callback", but I think it's (potentially!) a really nice and clean architecture. It also maps to Streamlit's execution model really well.
The update argument in st.click only accepts state transition functions.
So when the button is clicked, Streamlit would first call the update function and then rerun the script from top to bottom.
In fact, most Python decorators allow this dual decorator / kwarg formulation.
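The dual formulation follows from the fact that @deco above a function is just shorthand for passing that function to deco. A minimal sketch, in which on_click and handlers are hypothetical names made up for illustration and not part of Streamlit:

```python
handlers = {}  # hypothetical registry: button label -> callback


def on_click(label, callback=None):
    """Register a callback for `label`, as a kwarg or as a decorator."""

    def register(func):
        handlers[label] = func
        return func  # hand the function back unchanged

    if callback is not None:
        return register(callback)  # kwarg form: on_click("Save", callback=f)
    return register  # decorator form: @on_click("Load")


def save():
    return "saved"


on_click("Save", callback=save)


@on_click("Load")
def load():
    return "loaded"


print(handlers["Save"](), handlers["Load"]())
```

Both spellings end up with the same entry in the registry, so an API can offer either one without changing what happens on a click.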
Semantics thoughts
The semantics which I think would make sense would be to run the callback immediately after the click and before the subsequent rerun of the Streamlit script.
> The semantics which I think would make sense would be to run the callback immediately after the click and before the subsequent rerun of the Streamlit script.
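A rough model of that ordering in plain Python, assuming nothing about how Streamlit would implement it: script, next_page and click are hypothetical names, and the state is an ordinary dict standing in for a per-user state object.

```python
def script(state, log):
    # Hypothetical stand-in for the Streamlit script: it only reads state.
    log.append(f"render page {state['page']}")


def next_page(state):
    # State transition function: its only job is to update the state.
    state["page"] += 1


def click(state, log, update):
    # Proposed ordering: run the update first, then rerun top to bottom.
    update(state)
    script(state, log)


state = {"page": 0}
log = []
script(state, log)  # initial run
click(state, log, next_page)  # first click
click(state, log, next_page)  # second click
print(log)
```

Because the update runs before the rerun, every render already sees the new state, which avoids the stale final render discussed earlier in the thread.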
Agreed. One of the main Streamlit apps that I've been working on talks to other APIs and serves mainly as a frontend interface. So for example, I might make a GET request to the backend and populate a list ['a', 'b', 'c'] displayed on the Streamlit app. I might also have options ['d', 'e', 'f'] displayed with a checkbox next to each item. Then below I would have a button to submit the selected items to make a POST request to the backend API. After clicking the button, I would want the action to be triggered and then restart the execution from the top of the script.
I created a little Gist to demonstrate this.
This has some odd behavior, such as state not updating when I would expect and updating when I would not expect it to. This might be a user error but the source of the problem is not clear.
This example is also slightly different than @thiago's suggestion since state is being managed outside of the Streamlit app (although in the real app I'm managing some state such as the page number using the SessionState object).
This is very helpful @jeremyjordan. FYI: I think the main next step for us is improvements on the caching, then we will get to state / callbacks, hopefully all in 2019.
We have some updates regarding Session State. It's now natively supported in Streamlit via the 0.84.0 release! One of the examples in the Session State Demo App is annotating data.