Langchain stream

goldengrape · May 22, 2023, 6:05pm

Display the streaming output from LangChain to Streamlit

from langchain.callbacks.base import BaseCallbackHandler
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage
import streamlit as st

class StreamHandler(BaseCallbackHandler):
    def __init__(self, container, initial_text="", display_method='markdown'):
        self.container = container
        self.text = initial_text
        self.display_method = display_method

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        # The "/" suffix makes token boundaries visible; remove it for clean output.
        self.text += token + "/"
        display_function = getattr(self.container, self.display_method, None)
        if display_function is not None:
            display_function(self.text)
        else:
            raise ValueError(f"Invalid display_method: {self.display_method}")

query = st.text_input("input your query", value="Tell me a joke")
ask_button = st.button("ask")

st.markdown("### streaming box")
chat_box = st.empty()
stream_handler = StreamHandler(chat_box, display_method='write')
chat = ChatOpenAI(max_tokens=25, streaming=True, callbacks=[stream_handler])

st.markdown("### together box")

if query and ask_button:
    response = chat([HumanMessage(content=query)])
    llm_response = response.content
    st.markdown(llm_response)
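
The handler logic can be exercised without an API key or a running Streamlit app by driving it with a stub container. The following is only a sketch: `RecordingContainer` and `TokenAccumulator` are hypothetical names, the class deliberately does not subclass `BaseCallbackHandler` so that it runs with the standard library alone, and the "/" separator is left out.

```python
class RecordingContainer:
    """Hypothetical stand-in for st.empty(): records every render call."""

    def __init__(self):
        self.calls = []

    def markdown(self, body):
        self.calls.append(("markdown", body))

    def write(self, body):
        self.calls.append(("write", body))


class TokenAccumulator:
    """Accumulates tokens and re-renders the full text on each new token."""

    def __init__(self, container, initial_text="", display_method="markdown"):
        self.container = container
        self.text = initial_text
        self.display_method = display_method

    def on_llm_new_token(self, token, **kwargs):
        self.text += token
        # Look up the render method by name, as the StreamHandler does.
        display_function = getattr(self.container, self.display_method, None)
        if display_function is None:
            raise ValueError(f"Invalid display_method: {self.display_method}")
        display_function(self.text)


box = RecordingContainer()
handler = TokenAccumulator(box, display_method="write")
for tok in ["Why", " did", " the", " chicken"]:
    handler.on_llm_new_token(tok)
```

Each token triggers one render of the whole accumulated text, which is why the container should be a single `st.empty()` placeholder rather than a new element per token.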

asehmi · May 23, 2023, 3:00pm

Great. Thanks for sharing. Worked nicely.

Vignes · May 23, 2023, 8:15pm

Hey @goldengrape,
how do I stream the output of a SequentialChain() that has two input variables, ‘context’ and ‘query’, on Streamlit?

chat_llm = AzureChatOpenAI(max_tokens=25,
                           streaming=True,
                           callbacks=[stream_handler])
memory = ConversationBufferWindowMemory(memory_key="chat_history", k=15, input_key="query", output_key="AIassistant")
context_chain = LLMChain(llm=chat_llm, prompt=context_prompt_template)
llm_chain = LLMChain(llm=chat_llm, prompt=prompt_template, output_key="AIassistant")
overall_chain = SequentialChain(chains=[context_chain, llm_chain], input_variables=["context", "query"], verbose=False, memory=st.session_state.entity_memory)
if user_input:
    res = lang_chain_load_retrieve(user_input, FAISS_DATABASE, API_KEY)
    context = ""
    for y in range(len(res)):
        context = context + "\n" + str(res[y])
    response = overall_chain.run({"context": context, "query": user_input})

Currently, I get ‘TypeError: Object of type StreamHandler is not JSON serializable’

goldengrape · May 23, 2023, 9:40pm

@Vignes

In my experience, LangChain is a very complex, high-level abstraction. If you follow their examples exactly, it is easy to get good results, but if you try to modify something yourself, you often run into complicated bugs because so much is hidden inside it.

Just by looking at this part of your code, I have no idea what is happening. Also, I haven’t obtained the Azure OpenAI API key yet, so I cannot test AzureChatOpenAI either.

If I were to debug it, I would first test whether the response is output properly when streaming is set to False.

Instead of using Streamlit and a custom stream_handler, I suggest using langchain’s built-in StreamingStdOutCallbackHandler to check if the streaming output works correctly. Please refer to the following link for more information: https://python.langchain.com/en/latest/modules/models/llms/examples/streaming_llm.html

If everything mentioned above is working fine, I noticed that the error message states: “Object of type StreamHandler is not JSON serializable.” It’s possible that the information returned by the AI is in JSON format. In that case, you might need to extract a specific part of the JSON, such as the text or token, and then pass it to the StreamHandler for processing. You can refer to the “output parser” reference for guidance: https://python.langchain.com/en/latest/modules/prompts/output_parsers/getting_started.html
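
As a point of reference, this error message has the form Python's `json` module uses when it is asked to encode an object it does not know how to serialize, and the type it names is the object being encoded. The sketch below reproduces the same message with the standard library only. It assumes, without a traceback to confirm it, that something in the chain serializes the LLM's parameters together with the callbacks list; the `params` dict and the bare `StreamHandler` class are stand-ins for illustration.

```python
import json


class StreamHandler:
    """Bare stand-in for the custom callback handler (no LangChain needed)."""


# Hypothetical parameter dict that includes the handler object itself.
params = {"max_tokens": 25, "streaming": True, "callbacks": [StreamHandler()]}

try:
    json.dumps(params)
    message = None
except TypeError as exc:
    # json cannot encode arbitrary class instances.
    message = str(exc)

print(message)
```

If that assumption holds, the thing to look for is where the handler object ends up inside data that gets serialized, rather than the format of the model's reply.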

Or, if your entire program’s code is not very long, you may want to copy all the code along with the error messages into GPT-4 or Claude 100k and let it do the debugging.

In fact, I wrote this StreamHandler with the help of GPT-4. I gave GPT-4 the callback description page and let it come up with the code. These models are pretty good at this.

goldengrape · May 24, 2023, 4:11am

Update: it now supports not only streaming display but also synchronized voice reading.

https://gist.github.com/goldengrape/84ce3624fd5be8bc14f9117c3e6ef81a

langchain_stream_in_streamlit

from langchain.callbacks.base import BaseCallbackHandler
import azure.cognitiveservices.speech as speechsdk
import os

class StreamDisplayHandler(BaseCallbackHandler):
    def __init__(self, container, initial_text="", display_method='markdown'):
        self.container = container
        self.text = initial_text
        self.display_method = display_method
        self.new_sentence = ""

This file has been truncated.

Vignes · June 1, 2023, 3:06pm

Hey @goldengrape,
Thanks for your suggestion. I tried LangChain’s built-in StreamingStdOutCallbackHandler to check whether the streaming output worked correctly, and I was able to stream the response in the terminal. But, as mentioned earlier, I was looking for a way to stream the output on Streamlit. I was able to do this by adopting a custom stream handler (StreamlitCallbackHandler(BaseCallbackHandler)). I then attached a callback_manager to the LLM before running the SequentialChain().

class StreamlitCallbackHandler(BaseCallbackHandler):
    def __init__(self, streamlit_text, res_box):
        self.streamlit_text = streamlit_text
        self.current_text = ""
        self.res_box = res_box
        self.llm_response_started = False

    def on_llm_new_token(self, token: str, *args, **kwargs):
        # Clear the placeholder message once the first token arrives.
        if not self.llm_response_started:
            self.res_box.empty()
            self.llm_response_started = True
        self.current_text += token
        self.streamlit_text.markdown(self.current_text)

    # Provide empty implementations for the required methods ...

if user_input:
    # …the same code as shared earlier
    res_box = st.empty()

    start_response = time.time()

    response_container = st.empty()

    text = random.choice(llm_pre_response_texts)
    res_box.markdown(f'*{text}*')
    callback_manager = CallbackManager([StreamlitCallbackHandler(response_container, res_box)])
    chat_llm.callback_manager = callback_manager
    llm_chain.callback_manager = callback_manager
    overall_chain.callback_manager = callback_manager

    print("Generating LLM response...")
    response = overall_chain.run({"context": context, "query": user_input})
    print(response)
    print("LLM response generated.")
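
The placeholder behaviour of this handler (show a waiting message, then clear it when the first token arrives) can be checked in isolation. This is a minimal sketch with stub objects: `StubPlaceholder` and `FirstTokenSwap` are hypothetical names, and the stub only imitates the two `st.empty()` methods the handler calls.

```python
class StubPlaceholder:
    """Hypothetical stand-in for st.empty(): holds one piece of content."""

    def __init__(self):
        self.content = None

    def markdown(self, body):
        self.content = body

    def empty(self):
        self.content = None


class FirstTokenSwap:
    """Clears the waiting message on the first token, then streams text."""

    def __init__(self, streamlit_text, res_box):
        self.streamlit_text = streamlit_text
        self.res_box = res_box
        self.current_text = ""
        self.llm_response_started = False

    def on_llm_new_token(self, token, *args, **kwargs):
        if not self.llm_response_started:
            self.res_box.empty()  # remove the waiting message exactly once
            self.llm_response_started = True
        self.current_text += token
        self.streamlit_text.markdown(self.current_text)


res_box = StubPlaceholder()
response_container = StubPlaceholder()
res_box.markdown("*Thinking...*")
waiting_message = res_box.content  # what the user sees before any token

handler = FirstTokenSwap(response_container, res_box)
handler.on_llm_new_token("Hello")
handler.on_llm_new_token(" world")
```

Keeping the waiting message and the streamed answer in separate placeholders means clearing one never disturbs the other.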

Eshaan_Agarwal · June 8, 2023, 6:21am

@goldengrape Hi, would this work if I provide custom CSS to it? I am trying to implement a chatbot app, using st-chat (GitHub - AI-Yash/st-chat: Streamlit Component, for a Chatbot UI) to create the chat UI, but I am having a hard time integrating streaming support into it. Can somebody please let me know a way?

antoineyb17 · June 25, 2023, 4:31pm

Hey, same here. It would be great to implement this streaming feature in streamlit_chat! Does anyone have an idea of how to do such a thing? All my previous attempts have failed so far.

fabmeyer · July 5, 2023, 12:27pm

With the latest version of Streamlit (1.24), streaming is possible, but ONLY for some special cases like OpenAI’s chat completion API. I am working on a Streamlit app that uses LangChain’s RetrievalQAWithSourcesChain to answer questions from text documents.

Is there no possibility to add streaming with Streamlit + LangChain RetrievalQAWithSourcesChain ?

Kalim_Amzad · October 23, 2023, 8:07pm

When streaming, how can I count the token usage and cost?

I tried extending OpenAICallbackHandler:

def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
    """Collect token usage."""
    if response.llm_output is None:
        return None

but after the streaming, response.llm_output is None.

I also tried the following, but it did not work.

with get_openai_callback() as cb:
    st_cb = StreamHandler(st.empty())
    response = chain.run(input=user_query, callbacks=[st_cb])
    st.write(cb)

Could you please help?
Thanks in advance
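
One possible workaround, sketched here under stated assumptions, is to count the streamed chunks in the handler itself instead of relying on `response.llm_output`. `StreamTokenCounter` is a hypothetical name, the class omits the LangChain base class so it runs on its own, each non-empty chunk is assumed to correspond to roughly one completion token, prompt tokens are not counted at all, and the price passed to `estimated_cost` is a placeholder rather than a real rate.

```python
class StreamTokenCounter:
    """Counts streamed chunks as an approximation of completion tokens."""

    def __init__(self):
        self.completion_chunks = 0

    def on_llm_new_token(self, token, **kwargs):
        # Skip empty chunks, which carry no text.
        if token:
            self.completion_chunks += 1

    def estimated_cost(self, usd_per_1k_tokens):
        # Placeholder pricing: the caller supplies the rate.
        return self.completion_chunks / 1000 * usd_per_1k_tokens


counter = StreamTokenCounter()
for tok in ["", "Why", " did", " the", " chicken"]:
    counter.on_llm_new_token(tok)

print(counter.completion_chunks, counter.estimated_cost(2.0))
```

Because this only approximates the completion side, the result should be treated as an estimate; an exact count would need a tokenizer applied to both the prompt and the final text.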

Timothee_de_Almeida · March 1, 2024, 10:34am

Hello, using st.markdown(var) instead of st.write(var) does it.

system · August 28, 2024, 10:35am

This topic was automatically closed 180 days after the last reply. New replies are no longer allowed.
