How to get a large object using `requests` in a Streamlit app?

I need to build a web application with the Streamlit Python package, and it has to call an external endpoint through the requests package. The problem is that this endpoint can take more than 30 seconds to return a large object that is very expensive to build, which causes a timeout.
My approach was to split the request into two. The first endpoint starts building the large object in the background and immediately returns a uuid that identifies the transaction. The Streamlit client then calls the second endpoint with that uuid, and it returns a step attribute that ranges from 1 to 100. When step reaches 100, the large object is ready and is returned as well. I have already created and tested both endpoints, but I am having difficulty writing the code that calls them and updates the progress bar in the Streamlit app.

How can I write Python 3.10 code in Streamlit that works like a JavaScript Promise, releasing the page immediately and then running the second request repeatedly? Is it possible to do this with the streamlit-js-eval module? Is there another alternative?

I’m planning to run my app locally using Streamlit version 1.28.2 with Python 3.10+.

I would appreciate any additional information or advice.

One option that might work for you is to set up a separate thread to download the file, and another one to do whatever else you want to happen while it’s downloading. Here’s a rough example:

import streamlit as st
from typing import Callable
import time
import threading
from streamlit.runtime.scriptrunner import add_script_run_ctx


def download_large_file():
    progress = st.progress(0, text="Downloading...")
    for i in range(100):
        time.sleep(0.1)
        progress.progress(i + 1)
    st.success("Download complete!")


st.title("Downloading large file in the background")


def setup_tasks(funcs: list[Callable]):
    threads = [threading.Thread(target=func) for func in funcs]
    for thread in threads:
        add_script_run_ctx(thread)
        thread.start()

    for thread in threads:
        thread.join()


def other_code():
    st.write("Doing other things!")


if __name__ == "__main__":
    setup_tasks([download_large_file, other_code])
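
For the two-endpoint setup described in the question, the polling step itself could look roughly like the sketch below. It is only a sketch under assumptions: the paths `/start` and `/status/<uuid>`, the JSON fields `uuid`, `step` and `result`, and the helper names `poll_until_ready` and `render_download` are all hypothetical, so adjust them to whatever your endpoints actually expose.

```python
import time
from typing import Any, Callable


def poll_until_ready(
    fetch_status: Callable[[], dict[str, Any]],
    on_progress: Callable[[int], None],
    interval: float = 1.0,
    timeout: float = 300.0,
) -> dict[str, Any]:
    """Call fetch_status() until it reports step >= 100, then return that payload."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        # Clamp to the 0-100 range that st.progress accepts for integers.
        step = max(0, min(int(status["step"]), 100))
        on_progress(step)
        if step >= 100:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("the object was not ready before the timeout")
        time.sleep(interval)


def render_download(base_url: str) -> Any:
    """Streamlit wiring; endpoint paths and JSON field names are assumptions."""
    # Imported here so poll_until_ready stays usable without these packages.
    import requests
    import streamlit as st

    # First endpoint: starts the build and returns the transaction id right away.
    start = requests.post(f"{base_url}/start", timeout=10)
    start.raise_for_status()
    job_id = start.json()["uuid"]

    bar = st.progress(0, text="Building object...")

    def fetch_status() -> dict[str, Any]:
        # Second endpoint: each call is short, so it stays under the timeout.
        response = requests.get(f"{base_url}/status/{job_id}", timeout=10)
        response.raise_for_status()
        return response.json()

    status = poll_until_ready(
        fetch_status,
        lambda step: bar.progress(step, text=f"Building object... {step}%"),
    )
    st.success("Download complete!")
    return status.get("result")
```

Because Streamlit reruns the script on every interaction, calling `render_download` as written would start a new job on each rerun. Keeping the uuid in `st.session_state` and only calling the first endpoint when it is missing is one way to avoid that.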

Hi @blackary, thank you very much for your suggestion. I used your approach, but with st.status instead of st.progress and a few other modifications, and the app behaved better on reload.

See the simplified code below:

import threading
import time
from collections.abc import Callable

import streamlit as st
from streamlit.runtime.scriptrunner import add_script_run_ctx

if "last_progress" not in st.session_state:
    st.session_state["last_progress"] = 0


def setup_tasks(funcs: list[Callable]):
    threads = [threading.Thread(target=func) for func in funcs]
    for thread in threads:
        add_script_run_ctx(thread)
        thread.start()

    for thread in threads:
        thread.join()


st.title("Download from external source")
msg = """
    Obtaining datasets from an external source can be a time-consuming task.
    You can use this function to download them in the background while
    you work on other important things.
    """
st.write(msg)

available_datasets = ["other_dataset", "x1_dataset", "x2_dataset", "x3_dataset"]


def other_code():
    with st.expander("Management"):
        tab1, tab2 = st.tabs(["**Delete dataset**", "**Others functions**"])
        # "**Remove dataset**"
        with tab1:
            with st.form("data_exclusion"):
                dataset_name = st.selectbox(
                    label="Dataset name",
                    options=available_datasets,
                    key="dataset_alias_to_delete",
                )

                submitted = st.form_submit_button(label="Delete")

                if submitted:
                    print("delete", dataset_name, "last_progress =", st.session_state["last_progress"])
                else:
                    pass
                    # st.warning("Select a dataset to delete!")

        with tab2:
            st.write("Some code here")


setup_tasks([other_code])

with st.status("Downloading data...", expanded=True) as status:
    st.success("Contacting the server...")
    while True:
        i = st.session_state["last_progress"]
        time.sleep(0.125)  # simulates waiting for the server
        if i > 99:
            print("break while loop")
            break
        i += 1
        if i % 10 == 0:
            print("i =", i)
        # Persist the progress so a rerun resumes instead of starting over
        st.session_state["last_progress"] = i
    st.success("Download complete!")
    status.update(label="Download complete!", state="complete", expanded=False)


def clear_session_state():
    st.session_state["last_progress"] = 0


st.button("Rerun", on_click=clear_session_state)

print("tasks finished")