STqdm : a tqdm-like progress bar for streamlit

Wirg · February 22, 2021, 6:46pm

Hello there !

I integrated tqdm with streamlit.
As discussed here, I am sharing the project with the community.

Link to :

the repo : GitHub - Wirg/stqdm: stqdm is the simplest way to handle a progress bar in streamlit app.
pypi : stqdm · PyPI

Install :
pip install stqdm

Some examples :

Directly

from time import sleep

from stqdm import stqdm

for _ in stqdm(range(50)):
    for _ in stqdm(range(15)):
        sleep(0.5)

Using pandas

from time import sleep

import pandas as pd
from stqdm import stqdm

stqdm.pandas()

pd.Series(range(50)).progress_map(lambda x: sleep(1))
pd.DataFrame({"a": range(50)}).progress_apply(lambda x: sleep(1), axis=1)

thiago · February 22, 2021, 10:11pm

Awesome! @adrien always wanted this

Adrien_Treuille · February 23, 2021, 12:24am

OMG THIS IS SO AWESOME!! Looking forward to using this in every project @Wirg!

andfanilo · February 23, 2021, 9:05am

@Wirg eheh nice seeing you here!

Could you add your “component” to the Component tracker so we keep “track” of it ?

Cheers,
Fanilo

Wirg · February 23, 2021, 4:28pm

Hi

I hope this will prove useful.

@andfanilo I think I did it. I updated the post. Is this what you meant ?

Have a nice day

andfanilo · February 23, 2021, 6:18pm

Yes this is perfect thanks a lot for your work!

milcent · February 25, 2021, 1:01pm

Really awesome!!

luca · March 8, 2021, 5:52pm

This is awesome!

I think it could be included natively in streamlit. What do you all think?

Wirg · March 9, 2021, 10:41am

Hi @luca,

Thank you for the suggestion.
I think the standard way to go about it will be to include it in tqdm itself.
tqdm is already doing this kind of switch to work seamlessly whether you are working in a normal python script, a notebook, or something else.
This also fits streamlit's direction: the same code should be usable seamlessly both in a normal python script and in a streamlit app.

I think that before making that kind of move, the package has to prove itself stable, usable, and used:

I guess that neither the tqdm nor the streamlit team has any interest in paying the toll of maintaining something that will be used by only 10 people.
On the other hand, the package itself is not quite mature yet, mainly because the only feedback I have is my own. Integrating it into a bigger library could slow down development. Right now I am thinking of some improvements: using a single widget (rather than 2) by integrating the text into the progress bar, and adding colors (at least to signal errors and help a user find the failing iteration).

I don’t know if other people agree with me on these two points:

waiting before integrating the package into a bigger one
the next improvements

Wirg · April 15, 2021, 8:51am

Hi !

Quick check for everyone using stqdm ! I am happy to have you onboard.

How do you feel about the package ?
Is it useful ?
Do you feel that some features are missing ?

Have a nice day !

mihir · April 29, 2021, 8:57pm

Hey, I really like using it so far. I am having a little bit of trouble however when trying to use it to display upload progress when uploading to S3. The tqdm progress bar in the console prints fine, but I can’t seem to get the progress bar to display in my Streamlit app. Do you have any tips on how I could diagnose this behavior?

import io  # needed for io.BytesIO below

import boto3
import streamlit as st
from botocore.client import Config
from stqdm import stqdm

BUCKET_FOLDER = 'my_bucket'
KB = float(1024)

def hook(t):
    """
    params:
        t (tqdm): Accepts tqdm class
    """
    def inner(bytes_amount):
        bytes_amount = bytes_amount / KB
        t.update(bytes_amount)

    return inner


def upload_file_to_s3(upload_obj):
    """
    params:
        upload_obj (UploadedFile): Accepts Streamlit's uploaded file class
    """
    s3 = boto3.client('s3', region_name='us-east-2', config=Config(s3={'addressing_style': 'path'}), )
    upload_location = BUCKET_FOLDER + upload_obj.name
    xlsx_bytes = io.BytesIO(upload_obj.getvalue())

    try:
        with stqdm(total=upload_obj.size / KB,
                   unit='kB', 
                   unit_scale=True,
                   unit_divisor=KB,
                   desc=upload_obj.name,
                   leave=True) as t:
            s3.upload_fileobj(Fileobj=xlsx_bytes, Bucket=BUCKET_NAME, Key=upload_location, Callback=hook(t))

    except Exception as e:
        st.error(e)

My console outputs:

my_file.xlsx: 100%|██████████| 11.7/11.7 [00:00<00:00, 15.1kB/s]2021-04-29 13:42:05.476 Thread 'ThreadPoolExecutor-1_0': missing ReportContext

But the streamlit application’s stqdm bar looks like:

my_file.xlsx: 0% 0.00/11.7 [00:00<?, ?kB/s]
|□□□□□□□□□□|

Wirg · May 7, 2021, 11:21am

Hi @mihir,

Thanks for asking. I have been trying to reproduce this and I think I am lacking some context.

Is the working console output coming from the console output of the server with streamlit run ?
What is happening on streamlit side ? Is the progress bar stuck ?

From what I have tried, download / upload would not work at all. The reason is that upload_fileobj & download_fileobj create threads.
From what I know of streamlit so far, running code in sub-threads can cause issues if not handled properly.
For example, I get a missing ReportContext when running a slightly modified version of your code. Improve "missing ReportContext" threading error · Issue #1326 · streamlit/streamlit · GitHub

I did not find a way to force that in boto3, but you can ask boto3’s fileobj functions not to create threads by adding it to the transfer config (TransferConfig is imported from boto3.s3.transfer).
Using s3.download_fileobj(Bucket=BUCKET_NAME, Key=path, Fileobj=bytes_file, Callback=hook(t), Config=TransferConfig(use_threads=False)) works for me.
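
To make the callback wiring easier to see, here is a minimal sketch of the same pattern applied to an upload. It runs without boto3 or streamlit: FakeS3Client and FakeProgress are hypothetical stand-ins written for this example, and upload_with_progress is an illustrative helper, not part of stqdm or boto3. The only assumption about boto3 is that upload_fileobj accepts Fileobj, Bucket, Key, Callback and Config and calls Callback with the number of bytes transferred since the previous call.

```python
import io

KB = 1024


class FakeProgress:
    """Hypothetical stand-in for a tqdm/stqdm bar: it only accumulates update() amounts."""

    def __init__(self, total):
        self.total = total
        self.n = 0.0

    def update(self, amount):
        self.n += amount


class FakeS3Client:
    """Hypothetical stand-in for a boto3 S3 client.

    It reads the file in chunks in the calling thread and reports each
    chunk size to Callback, mimicking a non-threaded transfer.
    """

    def upload_fileobj(self, Fileobj, Bucket, Key, Callback=None, Config=None):
        while True:
            chunk = Fileobj.read(4096)
            if not chunk:
                break
            if Callback is not None:
                Callback(len(chunk))


def upload_with_progress(s3_client, fileobj, bucket, key, progress, transfer_config=None):
    """Upload fileobj and forward transferred bytes (converted to kB) to the progress bar."""

    def on_bytes(bytes_amount):
        progress.update(bytes_amount / KB)

    s3_client.upload_fileobj(
        Fileobj=fileobj,
        Bucket=bucket,
        Key=key,
        Callback=on_bytes,
        Config=transfer_config,  # with real boto3: TransferConfig(use_threads=False)
    )


payload = io.BytesIO(b"x" * 10240)  # 10 kB of dummy data
bar = FakeProgress(total=10240 / KB)
upload_with_progress(FakeS3Client(), payload, "some-bucket", "some/key.xlsx", bar)
print(bar.n, "of", bar.total, "kB reported")
```

With the real libraries, you would pass boto3.client('s3') as s3_client, an stqdm(total=...) bar as progress, and TransferConfig(use_threads=False) as transfer_config, so that the callback runs in the script thread where streamlit can update the widget.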

Is this what your issue was ?

mihir · May 10, 2021, 11:04pm

The working console output came from the console of the server running streamlit run, correct. The Streamlit application’s progress bar was stuck.

I added in the disable threading config from the boto3.s3 and it worked! Thanks so much @Wirg

jeffrichardchemistry · August 30, 2021, 1:12pm

I really liked this package.

qiuwei · November 24, 2021, 9:57am

Does it support multiprocessing?

Wirg · November 26, 2021, 3:34pm

@qiuwei
It depends on what you mean by supporting multiprocessing.
I would say yes.
For example, this would work.


from multiprocessing import Pool
from time import sleep

import streamlit as st
from stqdm import stqdm


def sleep_and_return(i):
    sleep(0.5)
    return i


N = 100
with Pool(processes=5) as pool:
    for i in stqdm(pool.imap(sleep_and_return, range(N)), total=N):
        st.write(i)

Bhargav_Choithwani · February 12, 2022, 3:51pm

Hello, @Wirg
The stqdm package is really very awesome. It is really a good option to use instead of st.progress.

I am currently facing one issue while using stqdm. Will you please give any suggestions for the below case

The process starts by uploading a file in streamlit, and then a for loop runs some data manipulation.
The problem appears if I click the stop button or close the page in the middle of the for loop where I use stqdm.

On the next run, the code runs up to the stqdm call but doesn’t execute the for loop where I use stqdm for the progress bar.
It doesn’t show any error and just shows "running" in the upper right corner.

The only way to fix is to relaunch the streamlit page.

Can you please suggest any way to resolve this.

Regards,
Bhargav

Wirg · February 28, 2022, 4:45pm

Hi @Bhargav_Choithwani ,

Sorry for the late answer, I missed the notification.

I am not really clear on your use case.
Can you provide an example?

From my understanding, you are stopping a running for-loop.
A minimal example would be something like this? Am I right?

from time import sleep

from stqdm import stqdm


for _ in stqdm(range(500)):
    sleep(0.5)

You stop it (by pressing stop or closing the window) at let’s say the 50th iteration.
Then you restart the window. Are you looking to start it from the 50th iteration and not from the start?

Bhargav_Choithwani · March 6, 2022, 12:44pm

Hello, @Wirg

Below is the sample script
##############################

If someone clicks the stop button or closes the browser tab while the loop is executing, and we then open the page again and click the “Start” button to restart the loop from the beginning, the script gets stuck at the stqdm part and doesn’t throw any error.

Can you please suggest any way for handling this issue ?

Regards,
Bhargav

Wirg · March 10, 2022, 10:06am

Hi @Bhargav_Choithwani ,

You may have a misconception about what stqdm is doing and how streamlit works.
This code is rerun with or without stqdm; without stqdm you probably just don’t see it.
In a nutshell, every time you change something on the front side (widget, stop and restart …), streamlit reruns the full script after updating the values of the input widgets (like your process_button value).

If you want to implement caching with streamlit within a loop, the simplest way would be to do something like this using st.cache.

from time import sleep
import streamlit as st
from stqdm import stqdm


@st.cache  # Use streamlit to cache the result of this function for each index
def process_for_index(index: int) -> int:
    sleep(0.5)
    return 2 * index + 1

for i in stqdm(range(50)):
    st.write(process_for_index(i))

You will pass through the progress bar for the first indexes but it will be a lot faster because the result is cached.
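
To illustrate this behaviour without running a streamlit app, here is a small plain-Python simulation. It is only a sketch: functools.lru_cache is used as a stand-in for st.cache (it is not streamlit's cache), and run_script, stop_after and calls are hypothetical names made up for this example. Also note that recent streamlit releases deprecate st.cache in favor of st.cache_data, so the decorator name may differ depending on your version.

```python
from functools import lru_cache

calls = []  # every index that is actually computed (cache misses only)


@lru_cache(maxsize=None)  # stand-in for st.cache in this sketch
def process_for_index(index: int) -> int:
    calls.append(index)
    return 2 * index + 1


def run_script(stop_after=None):
    """Simulate one full run of the script; stop_after mimics pressing stop."""
    results = []
    for i in range(10):
        if stop_after is not None and i >= stop_after:
            break
        results.append(process_for_index(i))
    return results


first_run = run_script(stop_after=4)  # interrupted after 4 iterations
second_run = run_script()             # rerun from the start, as streamlit would

# The second run loops over all indexes again, but the first 4 come from the cache.
print(len(first_run), len(second_run), len(calls))
```

The loop always restarts from index 0, but each index is only computed once across both runs, which is why the progress bar moves quickly through the already processed part.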

You can try implementing another cache system that skips the first indexes, but it will be hard and probably counterintuitive.
If you are afraid of performance issues, mainly because of the interaction with the widget or the cache system, don’t be.
Unless you are in a gigantic loop with micro operations (>> 1k), the caching system will definitely be faster, and stqdm uses tqdm as a backend to avoid unnecessary updates.
If you are in this case, I advise you to batch the processing steps and cache each batch.

from time import sleep
from typing import List

import streamlit as st
from stqdm import stqdm


def process_for_index(index: int) -> int:
    sleep(0.05)
    return 2 * index + 1


@st.cache  # Use streamlit to cache the results of this function for each batch
def process_for_multiple_indexes(indexes: List[int]) -> List[int]:
    return [process_for_index(index) for index in indexes]


batch_size = 500

for batch_start in stqdm(range(0, 5000, batch_size)):
    batch_results = process_for_multiple_indexes(list(range(batch_start, batch_start + batch_size)))
    for result in batch_results:
        st.write(result)  # Note that writing each result is optional

Have a nice day,
