STqdm : a tqdm-like progress bar for streamlit

Hello there !

I integrated tqdm with streamlit.
As discussed here, I am sharing the project with the community. :slight_smile:

Link to :

Install :
pip install stqdm

Some examples :

  • Directly
from time import sleep

from stqdm import stqdm

for _ in stqdm(range(50)):
    for _ in stqdm(range(15)):
        sleep(0.5)

[image: nested_stqdm]

  • Using pandas
from time import sleep

import pandas as pd
from stqdm import stqdm

stqdm.pandas()

pd.Series(range(50)).progress_map(lambda x: sleep(1))
pd.DataFrame({"a": range(50)}).progress_apply(lambda x: sleep(1), axis=1)

Awesome! @adrien always wanted this :smiley:

OMG THIS IS SO AWESOME!! Looking forward to using this in every project @Wirg!

@Wirg eheh :slight_smile: nice seeing you here!

Could you add your "component" to the Component tracker so we keep "track" of it :stuck_out_tongue: ?

Cheers,
Fanilo

Hi :slight_smile:

I hope this will prove useful.

@andfanilo I think I did it. I updated the post. Is this what you meant ?

Have a nice day

Yes this is perfect :slight_smile: thanks a lot for your work!

Really awesome!!

This is awesome!

I think it could be included natively in streamlit. What do you all think?

Hi @luca,

Thank you for the suggestion.
I think the standard way to go about it will be to include it in tqdm itself.
tqdm is already doing this kind of switch to work seamlessly whether you are working in a normal python script, a notebook, or something else.
This also fits streamlit's direction: the same code should be usable seamlessly both in a normal python script and in a streamlit app.

I think that before doing that kind of move, the package has to prove stable, usable, and used :

  • I guess that neither the tqdm team nor the streamlit team has any interest in paying the cost of maintaining something that will be used by only 10 people
  • on the other hand, the package itself is not quite mature yet, mainly because the only feedback I have is mine. :wink: Integrating it into a bigger library could slow down development. Right now I am thinking of some improvements: using a single widget (rather than 2) by integrating the text into the progress bar, and adding colors (at least to signal errors and help a user debug and find the failing iteration).

I don't know if other people agree with me on this?

  • waiting to integrate the package to a bigger one
  • the next improvements

Hi !

Quick check for everyone using stqdm ! :slight_smile: I am happy to have you on board. :wink:

How do you feel about the package ?
Is it useful ?
Do you feel that some features are missing ?

Have a nice day !

Hey, I really like using it so far. I am having a little bit of trouble however when trying to use it to display upload progress when uploading to S3. The tqdm progress bar in the console prints fine, but I can't seem to get the progress bar to display in my Streamlit app. Do you have any tips on how I could diagnose this behavior?

import io  # needed for io.BytesIO below

import streamlit as st
import boto3
from botocore.client import Config
from stqdm import stqdm

BUCKET_FOLDER = 'my_bucket'
KB = float(1024)

def hook(t):
    """
    params:
        t (tqdm): Accepts tqdm class
    """
    def inner(bytes_amount):
        bytes_amount = bytes_amount / KB
        t.update(bytes_amount)

    return inner


def upload_file_to_s3(upload_obj):
    """
    params:
        upload_obj (UploadedFile): Accepts Streamlit's uploaded file class
    """
    s3 = boto3.client('s3', region_name='us-east-2', config=Config(s3={'addressing_style': 'path'}), )
    upload_location = BUCKET_FOLDER + upload_obj.name
    xlsx_bytes = io.BytesIO(upload_obj.getvalue())

    try:
        with stqdm(total=upload_obj.size / KB,
                   unit='kB', 
                   unit_scale=True,
                   unit_divisor=KB,
                   desc=upload_obj.name,
                   leave=True) as t:
            s3.upload_fileobj(Fileobj=xlsx_bytes, Bucket=BUCKET_NAME, Key=upload_location, Callback=hook(t))

    except Exception as e:
        st.error(e)

My console outputs:

my_file.xlsx: 100%|██████████| 11.7/11.7 [00:00<00:00, 15.1kB/s]2021-04-29 13:42:05.476 Thread 'ThreadPoolExecutor-1_0': missing ReportContext

But the streamlit application's stqdm bar looks like:

my_file.xlsx: 0% 0.00/11.7 [00:00<?, ?kB/s]
|▔▔▔▔▔▔▔▔▔▔| 

Hi @mihir,

Thanks for asking. I have been trying to reproduce this and I think I am lacking some context. :slight_smile:

Is the working console output coming from the console output of the server with streamlit run ?
What is happening on streamlit side ? Is the progress bar stuck ?

From what I have tried, download / upload would not work at all. The reason for that is that upload_fileobj & download_fileobj are creating threads.
From what I know of streamlit so far, subthreading can produce issues if not handled properly.
For example, I get a "missing ReportContext" message when running a slightly modified version of your code. See: Improve "missing ReportContext" threading error · Issue #1326 · streamlit/streamlit · GitHub

I did not find a way to force that in boto3, but you can ask boto3's fileobj functions not to spawn threads by passing that option in their transfer config.
Using s3.download_fileobj(Bucket=BUCKET_NAME, Key=path, Fileobj=bytes_file, Callback=hook(t), Config=TransferConfig(use_threads=False)) works for me (TransferConfig is imported from boto3.s3.transfer).
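
As a minimal sketch of the same idea for the upload case: the helper names make_progress_hook and upload_without_threads are hypothetical, and the bucket, key and client are whatever your app already uses. The bar argument can be any tqdm-like object with an update method, for example an stqdm instance.

```python
import io

KB = 1024.0  # bytes per kB, as in the snippet above


def make_progress_hook(bar):
    """Return a boto3 callback that advances `bar` by the transferred kB."""
    def inner(bytes_amount):
        # boto3 reports the number of bytes sent since the previous call
        bar.update(bytes_amount / KB)
    return inner


def upload_without_threads(s3_client, bucket, key, data, bar):
    """Upload `data` (bytes) with boto3's transfer threads disabled."""
    # Imported here so the hook above can be used without boto3 installed.
    from boto3.s3.transfer import TransferConfig

    s3_client.upload_fileobj(
        Fileobj=io.BytesIO(data),
        Bucket=bucket,
        Key=key,
        Callback=make_progress_hook(bar),
        # Keep the transfer in the calling thread so the streamlit
        # widget is updated from the script thread.
        Config=TransferConfig(use_threads=False),
    )
```

In a streamlit app this would be called inside something like `with stqdm(total=len(data) / KB, unit='kB') as bar:`, with the client created by boto3.client('s3').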

Is this what your issue was ?

The working console output came from the console of the server running streamlit run, correct. The Streamlit application's progress bar was stuck.

I added the disable-threading config from boto3.s3 and it worked! Thanks so much @Wirg

I really liked this package.

Does it support multiprocessing?

@qiuwei
It depends on what you mean by supporting multiprocessing.
I would say yes.
For example, this would work.


from multiprocessing import Pool
from time import sleep

import streamlit as st
from stqdm import stqdm


def sleep_and_return(i):
    sleep(0.5)
    return i


N = 100
with Pool(processes=5) as pool:
    for i in stqdm(pool.imap(sleep_and_return, range(N)), total=N):
        st.write(i)

Hello, @Wirg
The stqdm package is really awesome. It is a really good option to use instead of st.progress.

I am currently facing one issue while using stqdm. Could you please give any suggestions for the case below?

The process starts by uploading a file in streamlit, and then a for loop runs for some data manipulation.
The problem appears if I click the stop button or close the page while the for loop that uses stqdm is running.

On the next run, the code runs up to the stqdm call but does not execute the for loop where I use stqdm for the progress bar.
It doesn't show any error and just shows "Running" in the upper-right corner.

The only way to fix it is to relaunch the streamlit page.

Can you please suggest any way to resolve this?

Regards,
Bhargav

Hi @Bhargav_Choithwani ,

Sorry for the late answer, I missed the notification.

I am not really clear on your use case.
Can you provide an example?

From my understanding, you are stopping a running for-loop.
A minimal example would be something like this? Am I right?

from time import sleep

from stqdm import stqdm


for _ in stqdm(range(500)):
    sleep(0.5)

You stop it (by pressing stop or closing the window) at, let's say, the 50th iteration.
Then you restart the window. Are you looking to start it from the 50th iteration and not from the start?

Hello, @Wirg

Below is the sample script
[image: sample script]

If someone clicks the stop button or closes the browser tab while the loop is running, and we then reopen the page and click the "Start" button to restart the loop from the beginning, the script gets stuck at the stqdm part and doesn't throw any error.

Can you please suggest any way for handling this issue ?

Regards,
Bhargav

Hi @Bhargav_Choithwani ,

You may have a misconception about what stqdm is doing and how streamlit works.
This code is rerun with or without stqdm; without stqdm you probably just don't see it.
In a nutshell, every time you change something on the front side (widget, stop and restart …), streamlit reruns the full script after updating the values of the input widgets (like your process_button value).

If you want to implement caching with streamlit within a loop, the simplest way would be to do something like this using st.cache.

from time import sleep
import streamlit as st
from stqdm import stqdm


@st.cache  # Use streamlit to cache the result of this function for each index
def process_for_index(index: int) -> int:
    sleep(0.5)
    return 2 * index + 1

for i in stqdm(range(50)):
    st.write(process_for_index(i))

You will pass through the progress bar for the first indexes but it will be a lot faster because the result is cached.
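
To make that concrete outside of streamlit, here is a plain-python sketch of two consecutive runs of the same loop. It uses functools.lru_cache as a stand-in for st.cache, and run_script is a hypothetical helper that plays the role of one full rerun of the script; both are assumptions made only so the sketch runs on its own.

```python
from functools import lru_cache
from time import sleep

calls = 0  # how many times the slow body really ran


@lru_cache(maxsize=None)  # stand-in for @st.cache (assumption for this sketch)
def process_for_index(index: int) -> int:
    global calls
    calls += 1
    sleep(0.01)  # pretend this is the expensive step
    return 2 * index + 1


def run_script(n: int) -> list:
    """Simulate one rerun of the script: the loop always starts at index 0."""
    return [process_for_index(i) for i in range(n)]


first = run_script(20)       # a run that was stopped after 20 iterations
calls_after_first = calls

second = run_script(50)      # the rerun starts over, but 0..19 come from the cache
calls_after_second = calls
```

The second run still iterates over the first 20 indexes, so the progress bar passes through them, but only the 30 new indexes pay the cost of the slow step.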

You can try implementing another cache system that skips the first indexes, but it will be hard and probably counterintuitive.
If you are afraid of performance issues, mainly because of the interaction with the widget or with the cache system, don't be.
Unless you are in a gigantic loop with micro operations (>> 1k), the caching system will definitely be faster, and stqdm uses tqdm as a backend to avoid unnecessary updates.
If you are in this case, I advise you to batch the processing steps and cache per batch.

from time import sleep
from typing import List

import streamlit as st
from stqdm import stqdm


def process_for_index(index: int) -> int:
    sleep(0.05)
    return 2 * index + 1


@st.cache  # Use streamlit to cache the results of this function for i
def process_for_multiple_indexes(indexes: List[int]) -> List[int]:
    return [process_for_index(index) for index in indexes]


batch_size = 500

for batch_start in stqdm(range(0, 5000, batch_size)):
    batch_results = process_for_multiple_indexes(list(range(batch_start, batch_start + batch_size)))
    for result in batch_results:
        st.write(result)  # Note that writing each result is optional

Have a nice day,