Hey everyone,
- I am running my app locally on Colab for testing purposes. It's a simple RAG app. I want to display the answer using st.write_stream(). The input for this method comes from a generator (a Haystack pipeline), which streams correctly in Colab. But in the Streamlit app, st.write_stream() just displays "text_embedderllm" and the LLM answer ends up in the logs instead.
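For context, streaming from a plain generator works the way I expect: a minimal sketch like the one below (fake_stream is just a hypothetical stand-in for the LLM output, not part of my app) displays the text incrementally with st.write_stream():

import time
import streamlit as st

def fake_stream():
    # stand-in generator yielding words one at a time
    for word in "Bonjour, comment puis-je vous aider ?".split():
        yield word + " "
        time.sleep(0.05)

st.write_stream(fake_stream())  # renders the text chunk by chunk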
- Here is a snippet of my code:
prompt_builder = PromptBuilder(
    template=prompt_template
)
llm = OpenAIGenerator(
    api_key=MISTRAL_API_KEY,
    model=MODEL,
    api_base_url=API_BASE_URL,
    streaming_callback=lambda chunk: print(chunk.content, end="", flush=True),
    generation_kwargs={
        "temperature": 0.1,
        "top_p": 1,
        "max_tokens": 1024,
    }
)
text_embedder = OpenAITextEmbedder(
    api_key=MISTRAL_API_KEY,
    model='mistral-embed',
    api_base_url=API_BASE_URL
)
retriever = PineconeEmbeddingRetriever(
    document_store=document_store
)
### RAG query pipeline ###
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", text_embedder)
query_pipeline.add_component("retriever", retriever)
query_pipeline.add_component("prompt_builder", prompt_builder)
query_pipeline.add_component("llm", llm)
### Connecting pipeline ###
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
query_pipeline.connect("retriever.documents", "prompt_builder.documents")
query_pipeline.connect("prompt_builder", "llm")
# App title
st.set_page_config(page_title="👩⚕️ Docteur K.")
with st.sidebar:
    st.sidebar.image("/content/kataryna.jpg", use_column_width=True)
    st.markdown("<h1 style='text-align: center; color: white;'>Docteur Catherine d'Opale</h1>", unsafe_allow_html=True)
    st.markdown("<h1 style='text-align: center; color: white;'>Oncologue médicale virtuelle</h1>", unsafe_allow_html=True)
# Store LLM generated responses
if "messages" not in st.session_state.keys():
st.session_state.messages = [{"role": "assistant", "content": "Comment puis-je vous aider?"}]
# Display or clear chat messages
for message in st.session_state.messages:
    with st.chat_message(message["role"], avatar='/content/kataryna.jpg'):
        st.write(message["content"])
# Function for generating response
#@st.cache_resource
#def generate_kataryna_response(query):
#    output = query_pipeline.run(
#        {
#            "text_embedder": {"text": query},
#            "prompt_builder": {"query": query},
#        }
#    )
#    value = output["llm"]["replies"]  # Access value by key
#    return value
# User-provided prompt
if query := st.chat_input("What is up?"):
    st.session_state.messages.append({"role": "user", "content": query})
    with st.chat_message("user", avatar='/content/user.jpg'):
        st.write(query)
    # Generate a new response if last message is not from assistant
    with st.chat_message("assistant", avatar='/content/kataryna.jpg'):
        with st.spinner("..."):
            output = query_pipeline.run(
                {
                    "text_embedder": {"text": query},
                    "prompt_builder": {"query": query},
                }
            )
            response = st.write_stream(output)
    st.session_state.messages.append({"role": "assistant", "content": response})
- It's not really an error, because I did find a way to display the answer in the app without streaming (the commented-out function above). But I need the streaming: without it, the user has to wait while the whole answer is generated inside a function before anything is shown.
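What I think is missing is a way to turn the chunks from streaming_callback into a generator that st.write_stream() can consume. Below is only a rough sketch of the idea I have in mind, using a queue and a background thread; the run_pipeline_streaming helper, the sentinel, and the reassignment of llm.streaming_callback are my own guesses, so I'm not sure this is the right approach:

import threading
from queue import Queue

def run_pipeline_streaming(query):
    # Hypothetical helper: push each streamed chunk into a queue, run the
    # pipeline in a background thread, and yield the chunks so that
    # st.write_stream can consume them as a generator.
    chunk_queue = Queue()
    done = object()  # sentinel marking the end of the stream

    # Assumption on my part: overriding the callback on the existing component works
    llm.streaming_callback = lambda chunk: chunk_queue.put(chunk.content)

    def run():
        query_pipeline.run(
            {
                "text_embedder": {"text": query},
                "prompt_builder": {"query": query},
            }
        )
        chunk_queue.put(done)

    threading.Thread(target=run, daemon=True).start()

    while (item := chunk_queue.get()) is not done:
        yield item

# and then in the chat block:
# response = st.write_stream(run_pipeline_streaming(query))

Is something along these lines the intended way to use st.write_stream() with a Haystack pipeline, or is there a simpler built-in option?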
- I am using Streamlit 1.31 and Python 3.9.
Thanks for your help