LangChain stream

Display streaming output from LangChain in Streamlit

from langchain.callbacks.base import BaseCallbackHandler
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage
import streamlit as st

class StreamHandler(BaseCallbackHandler):
    def __init__(self, container, initial_text="", display_method='markdown'):
        self.container = container          # a Streamlit placeholder, e.g. st.empty()
        self.text = initial_text
        self.display_method = display_method

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        # Append each new token and re-render the accumulated text in the container
        self.text += token
        display_function = getattr(self.container, self.display_method, None)
        if display_function is not None:
            display_function(self.text)
        else:
            raise ValueError(f"Invalid display_method: {self.display_method}")

query = st.text_input("input your query", value="Tell me a joke")
ask_button = st.button("ask")

st.markdown("### streaming box")
chat_box = st.empty()
stream_handler = StreamHandler(chat_box, display_method='write')
chat = ChatOpenAI(max_tokens=25, streaming=True, callbacks=[stream_handler])

st.markdown("### together box")

if query and ask_button:
    response = chat([HumanMessage(content=query)])
    llm_response = response.content
    st.markdown(llm_response)
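To try this, save it as a script and launch it with streamlit run; ChatOpenAI picks up the OpenAI key from the OPENAI_API_KEY environment variable by default.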

Great. Thanks for sharing. Worked nicely.


Hey @goldengrape,
how do I stream the output of a SequentialChain() that has two input variables, 'context' and 'query', on Streamlit?

chat_llm = AzureChatOpenAI(max_tokens=25,
                           streaming=True,
                           callbacks=[stream_handler])
memory = ConversationBufferWindowMemory(memory_key="chat_history", k=15,
                                        input_key="query", output_key="AIassistant")
context_chain = LLMChain(llm=chat_llm, prompt=context_prompt_template)
llm_chain = LLMChain(llm=chat_llm, prompt=prompt_template, output_key="AIassistant")
overall_chain = SequentialChain(chains=[context_chain, llm_chain],
                                input_variables=["context", "query"],
                                verbose=False, memory=st.session_state.entity_memory)

if user_input:
    res = lang_chain_load_retrieve(user_input, FAISS_DATABASE, API_KEY)
    context = ""
    for y in range(len(res)):
        context = context + "\n" + str(res[y])
    response = overall_chain.run({"context": context, "query": user_input})

Currently, I get ‘TypeError: Object of type StreamHandler is not JSON serializable’

@Vignes

In my experience, LangChain is a very complex HIGH-LEVEL abstraction: if you follow their examples exactly, it's easy to get good results, but if you try to modify anything yourself, it often produces very complicated bugs, because so much information is hidden inside it.

Just by looking at this part of your code, I have no idea what is happening. Also, I haven’t obtained the Azure OpenAI API key yet, so I cannot test AzureChatOpenAI either.

If I were to debug it, I would first test whether the response is output correctly when streaming is set to False.
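For example, a minimal non-streaming sanity check might look like this (a sketch reusing the imports from the first example; if it prints a complete answer, the chain itself is fine and the problem is in the callback wiring):

chat = ChatOpenAI(max_tokens=25, streaming=False)
response = chat([HumanMessage(content="Tell me a joke")])
print(response.content)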

Instead of using Streamlit and a custom stream_handler, I suggest using LangChain's built-in StreamingStdOutCallbackHandler to check whether the streaming output works correctly. Please refer to the following link for more information: https://python.langchain.com/en/latest/modules/models/llms/examples/streaming_llm.html
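For reference, the example on that page looks roughly like this (a sketch based on the linked docs, using the plain OpenAI LLM; tokens are printed to the terminal as they arrive):

from langchain.llms import OpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# Every new token is written straight to stdout by the built-in handler
llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)
resp = llm("Write me a song about sparkling water.")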

If everything mentioned above is working fine, I noticed that the error message states: “Object of type StreamHandler is not JSON serializable.” It’s possible that the information returned by the AI is in JSON format. In that case, you might need to extract a specific part of the JSON, such as the text or token, and then pass it to the StreamHandler for processing. You can refer to the “output parser” reference for guidance: https://python.langchain.com/en/latest/modules/prompts/output_parsers/getting_started.html
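One way to check what the handler actually receives is a defensive variant of on_llm_new_token (a hypothetical debugging sketch, not part of the handler above), e.g. inside StreamHandler:

def on_llm_new_token(self, token: str, **kwargs) -> None:
    # Hypothetical debugging variant: surface anything that is not a plain string
    if not isinstance(token, str):
        print(f"unexpected token type {type(token)!r}: {token!r}")
        return
    self.text += token
    self.container.markdown(self.text)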

Or, if your entire program's code is not very long, you may want to copy all of the code along with the error messages into GPT-4 or Claude 100k and let the model do the debugging.

In fact, I wrote this StreamHandler with the help of GPT-4: I gave it the callback documentation page and let it come up with the implementation. It did a pretty good job.


Upgrade: it now supports not only streaming display but also synchronized voice read-aloud.

Hey @goldengrape,
Thanks for your suggestion. I tried LangChain's built-in StreamingStdOutCallbackHandler to check whether the streaming output worked correctly, and I was able to stream the response in the terminal. But, as mentioned earlier, I was looking for a way to stream the output in Streamlit. I got it working by writing a custom stream handler (StreamlitCallbackHandler, subclassing BaseCallbackHandler) and attaching a callback_manager to the LLM before running the SequentialChain().

class StreamlitCallbackHandler(BaseCallbackHandler):
    def __init__(self, streamlit_text, res_box):
        self.streamlit_text = streamlit_text   # placeholder that shows the streamed answer
        self.current_text = ""
        self.res_box = res_box                 # placeholder that shows the pre-response text
        self.llm_response_started = False

    def on_llm_new_token(self, token: str, *args, **kwargs):
        if not self.llm_response_started:
            self.res_box.empty()               # clear the pre-response text once tokens arrive
            self.llm_response_started = True
        self.current_text += token
        self.streamlit_text.markdown(self.current_text)

    # Provide empty implementations for the required methods ...

if user_input:
    # ...the same code as shared earlier
    res_box = st.empty()

    start_response = time.time()

    response_container = st.empty()

    text = random.choice(llm_pre_response_texts)
    res_box.markdown(f'*{text}*')
    callback_manager = CallbackManager([StreamlitCallbackHandler(response_container, res_box)])
    # Attach the same callback manager to the LLM and to both chains
    chat_llm.callback_manager = callback_manager
    llm_chain.callback_manager = callback_manager
    overall_chain.callback_manager = callback_manager

    print("Generating LLM response...")
    response = overall_chain.run({"context": context, "query": user_input})
    print(response)
    print("LLM response generated.")