st.write_stream writes the answer to the logs instead of the app

Hey everyone,

  1. I am running my app locally on Colab for test purposes. It's a simple RAG app, and I want to display the answer using st.write_stream(). The input for this method comes from a generator (a Haystack pipeline) which streams correctly in Colab. But in the Streamlit app, st.write_stream renders "text_embedderllm" and the LLM answer is written to the logs instead (see the note after the snippet).
  2. Here is a snippet of my code:
prompt_builder = PromptBuilder(
    template=prompt_template
)

llm = OpenAIGenerator(
    streaming_callback=lambda chunk: print(chunk.content, end="", flush=True),
    generation_kwargs={
        "temperature": 0.1,
        "top_p": 1,
        "max_tokens": 1024,
    },
)

text_embedder = OpenAITextEmbedder()

retriever = PineconeEmbeddingRetriever(
    document_store=document_store
)

### RAG query pipeline ###
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", text_embedder)
query_pipeline.add_component("retriever", retriever)
query_pipeline.add_component("prompt_builder", prompt_builder)
query_pipeline.add_component("llm", llm)

### Connecting pipeline ###
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
query_pipeline.connect("retriever.documents", "prompt_builder.documents")
query_pipeline.connect("prompt_builder", "llm")

# App title
st.set_page_config(page_title="👩‍⚕️ Docteur K.")

with st.sidebar:
    st.image("/content/kataryna.jpg", use_column_width=True)
    st.markdown("<h1 style='text-align: center; color: white;'>Docteur Catherine d'Opale</h1>", unsafe_allow_html=True)
    st.markdown("<h1 style='text-align: center; color: white;'>Oncologue médicale virtuelle</h1>", unsafe_allow_html=True)

# Store LLM generated responses
if "messages" not in st.session_state.keys():
    st.session_state.messages = [{"role": "assistant", "content": "Comment puis-je vous aider?"}]

# Display or clear chat messages
for message in st.session_state.messages:
    with st.chat_message(message["role"], avatar='/content/kataryna.jpg'):
        st.write(message["content"])

# Function for generating response
#def generate_kataryna_response(query):
#    output = query_pipeline.run(
#        {
#            "text_embedder": {"text": query},
#            "prompt_builder": {"query": query},
#        }
#    )
#    value = output["llm"]["replies"]  # Access value by key
#    return value

# User-provided prompt
if query := st.chat_input("What is up?"):
    st.session_state.messages.append({"role": "user", "content": query})
    with st.chat_message("user", avatar='/content/user.jpg'):
        st.write(query)

    # Generate a new response if last message is not from assistant
    with st.chat_message("assistant", avatar='/content/kataryna.jpg'):
        with st.spinner("..."):
            output = query_pipeline.run(
                {
                    "text_embedder": {"text": query},
                    "prompt_builder": {"query": query},
                }
            )
            response = st.write_stream(output)
    st.session_state.messages.append({"role": "assistant", "content": response})
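For context on the symptom: query_pipeline.run() returns a dict keyed by component name, and st.write_stream iterates whatever it receives, so a dict comes out as its concatenated keys. Meanwhile the streaming_callback above prints every chunk to stdout, which is why the answer lands in the logs. A minimal illustration (the dict contents below are assumed for the example, not taken from a real run):

# Assumed shape of the pipeline result: a dict keyed by component name
output = {"text_embedder": {"meta": {}}, "llm": {"replies": ["the actual answer"]}}

# Iterating a dict yields its keys, so st.write_stream(output)
# renders "text_embedderllm" instead of the reply text
print("".join(output))  # -> text_embedderllm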
  1. It's not really an error, because I found a way to display the answer in the app without streaming. But I need the streaming: without it, the user has to wait while the whole answer is generated inside a function before anything is shown (a possible bridge is sketched after this list).

  2. I am using Streamlit 1.31 and Python 3.9.
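One possible bridge between Haystack's callback-based streaming and st.write_stream, as mentioned above: push each chunk into a queue from a background thread and expose the queue as a generator. This is a minimal sketch, assuming the generator's streaming_callback attribute can be reassigned after construction and that chunk.content holds the text delta; the helper name stream_pipeline_answer is made up for illustration:

import queue
import threading

def stream_pipeline_answer(query):
    """Yield LLM chunks as they arrive so st.write_stream can render them."""
    q = queue.Queue()
    sentinel = object()  # marks the end of the stream

    # Route each streamed chunk into the queue instead of stdout
    llm.streaming_callback = lambda chunk: q.put(chunk.content)

    def run():
        query_pipeline.run({
            "text_embedder": {"text": query},
            "prompt_builder": {"query": query},
        })
        q.put(sentinel)  # signal completion

    threading.Thread(target=run, daemon=True).start()

    # Yield chunks until the pipeline run finishes
    while (item := q.get()) is not sentinel:
        yield item

The chat block could then call response = st.write_stream(stream_pipeline_answer(query)), so response holds the full concatenated answer for session state.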

Thanks for your help


Hello @Boltzmann08,

Here is a sample adjustment to your message handling:

# Simulated function to fetch new messages (replace with your actual fetching logic)
def fetch_new_messages():
    # Placeholder: Implement logic to check for and return new messages
    return []

# Check for new messages and update session state
new_messages = fetch_new_messages()
if new_messages:
    st.session_state.messages.extend(new_messages)

# Existing logic to display messages
for message in st.session_state.messages:
    with st.chat_message(message["role"], avatar='/content/kataryna.jpg'):
        st.write(message["content"])

Hope this helps!

Kind Regards,
Sahir Maharaj
Data Scientist | AI Engineer
