Hello everyone.
I’m having trouble displaying a streaming reply in my chatbot built with the Mistral AI API.
The Mistral API docs give a working example (displaying the live response in the terminal):
client = MistralClient(api_key=MISTRAL_API_KEY)
messages = [ChatMessage(role="user", content="write python program to find prime numbers")]
stream_response = client.chat_stream(model=model, messages=messages)
for chunk in stream_response:
    print(chunk.choices[0].delta.content)
In my Streamlit RAG chatbot app, this would be:
stream_response = client.chat_stream(
    model=model,
    messages=messages
)
response = "test"
for chunk in stream_response:
    st.write_stream(chunk.choices[0].delta.content)
but this gives me the following error:
streamlit.errors.StreamlitAPIException: st.write_stream expects a generator or stream-like object as input not <class 'str'>. Please use st.write instead for this data type.
Has anyone ever used the Mistral AI API with a streaming response in a Streamlit chatbot?
Thank you in advance!
Goyo
March 20, 2024, 8:09pm
2
What happens if you follow the advice included in the error message?
That’s what I did.
The function client.chat_stream() returns a stream-like object, but it doesn’t seem to be recognized as such by Streamlit, hence my question.
The Mistral API docs don’t give any other function for displaying a streaming response.
Goyo
March 20, 2024, 11:06pm
4
I don’t know how well passing that object to write_stream
would work, but that is not what you are doing. Maybe you should try that too.
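For example, something like this generator wrapper might satisfy write_stream (an untested sketch, assuming the chunk structure from your first example):
def stream_text(stream_response):
    # Yield only the text of each chunk; skip empty/None deltas
    for chunk in stream_response:
        content = chunk.choices[0].delta.content
        if content:
            yield content

st.write_stream(stream_text(client.chat_stream(model=model, messages=messages)))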
Yes, I’ve tried passing the object directly, but it doesn’t work, because to get a streaming response you have to iterate over the chunks returned by the object.
I don’t think it’s possible to do it as-is with Mistral AI, unfortunately…
Goyo
March 21, 2024, 9:34am
6
So what exactly is wrong with using st.write
as the error message suggests?
In my opinion, the error comes from the object itself.
OpenAI’s client.chat() function has a stream=True argument and returns a single “streamable” object.
Mistral displays a streaming response by looping over the chunks yielded by client.chat_stream().
Therefore st.write_stream() can’t receive a streaming object from Mistral, because the client doesn’t provide one.
For the moment, I’m getting around the problem by displaying a non-streaming response, but it is less visually engaging.
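Concretely, the workaround looks like this (a sketch of the non-streaming version, using the same client and messages as above):
# Non-streaming call: the full answer arrives at once, then is displayed
chat_response = client.chat(model=model, messages=messages)
st.write(chat_response.choices[0].message.content)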
Goyo
March 21, 2024, 3:13pm
8
You mean this?
for chunk in stream_response:
    st.write(chunk.choices[0].delta.content)
Unfortunately no, this gives me one error per token ^^
I don’t understand why this solution doesn’t work.
Goyo
March 21, 2024, 6:09pm
10
I don’t understand it either. What does "one error per token" mean?
To display the streamed response, Mistral’s API goes through a for loop that iterates over all the chunks generated by the client.chat_stream() object.
There are therefore hundreds of chunks for one response.
Since hundreds of errors are also generated when I try the following solution:
stream_response = client.chat_stream(model, messages)
for chunk in stream_response:
    st.write(chunk.choices[0].delta.content)
I deduced that Streamlit was unable to display a streaming response from the Mistral API.
It finally works!!
Here is the final code. Huge thanks to @Intelligent_Bit3942 for his working example, which I adapted to my case.
from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage
import streamlit as st
import json
import faiss
import numpy as np
model = "open-mixtral-8x7b"
mistral_api_key = st.secrets["MISTRAL_API_KEY"]
client = MistralClient(api_key=mistral_api_key)
st.title("Assistant ChatBot catalogue 2024")
def load_json(rep: str):
    # Load the catalogue from a JSON file
    with open(rep, encoding='UTF-8') as f:
        return json.load(f)

def split_chunk(data, chunk_size):
    # Serialize each entry, then group the entries into chunks of chunk_size
    data_str = [json.dumps(entry) for entry in data]
    chunks = [data_str[i:i + chunk_size] for i in range(0, len(data_str), chunk_size)]
    print(f"Nb. chunks = {len(chunks)}")
    return chunks
def get_text_embedding(input):
    # Embed a single text with Mistral's embedding model
    embeddings_batch_response = client.embeddings(
        model='mistral-embed',
        input=input
    )
    return embeddings_batch_response.data[0].embedding

def load_vector_db(text_embedded):
    # Build a FAISS L2 index over the precomputed embeddings
    d = text_embedded.shape[1]
    index = faiss.IndexFlatL2(d)
    index.add(text_embedded)
    return index

def find_similar_chunk(index, question_embeddings, chunks):
    # Retrieve the 2 chunks closest to the question embedding
    D, I = index.search(question_embeddings, k=2)  # distance, index
    return [chunks[i] for i in I.tolist()[0]]
def prompt_chat(retrieved_chunk, question):
    # French RAG prompt: answer concisely in French, using only the retrieved context
    return f"""
Les informations contextuelles sont les suivantes.
---------------------
{retrieved_chunk}
---------------------
Compte tenu des informations contextuelles et sans connaissances préalables,
réponds en français à la question suivante de manière concise.
Utilise des listes pour plus de lisibilité.
Question: {question}
Réponse:
"""
# Load the data
data = load_json('catalogue_2024.json')
chunks = split_chunk(data, 3)
text_embeddings = np.load("catalogue_embeddings.npy")
index = load_vector_db(text_embeddings)
if "messages" not in st.session_state:
st.session_state["messages"] = [{"role": "assistant", "content": "Comment puis-je vous aider?"}]
st.session_state["History"] = []
st.session_state.History.append(ChatMessage(role="assitant", content="Comment puis-je vous aider?"))
for msg in st.session_state.messages:
    st.chat_message(msg["role"]).write(msg["content"])
if prompt := st.chat_input():
    # Embed the question and retrieve the most similar catalogue chunks
    question_embeddings = np.array([get_text_embedding(prompt)])
    retrieved_chunk = find_similar_chunk(index, question_embeddings, chunks)
    p = prompt_chat(retrieved_chunk=retrieved_chunk, question=prompt)
    st.session_state.messages.append({"role": "user", "content": prompt})
    st.session_state.History.append(ChatMessage(role="user", content=p))
    st.chat_message("user").write(prompt)
    with st.chat_message("assistant"):
        message_placeholder = st.empty()
        full_response = ""
        # Stream the answer chunk by chunk into a placeholder
        for response in client.chat_stream(
            model=model,
            messages=st.session_state.History[1:]
        ):
            full_response += (response.choices[0].delta.content or "")
            message_placeholder.markdown(full_response + "|")
        message_placeholder.markdown(full_response)
    st.session_state.History.append(ChatMessage(role="assistant", content=full_response))
    st.session_state.messages.append({"role": "assistant", "content": full_response})
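For completeness, catalogue_embeddings.npy is assumed to have been precomputed from the same chunks beforehand, for example with something along these lines (a hypothetical one-off script reusing get_text_embedding; joining each chunk into one string is my assumption):
# One-off precomputation (assumption): embed each chunk once and save to disk
embeddings = np.array([get_text_embedding(" ".join(chunk)) for chunk in chunks])
np.save("catalogue_embeddings.npy", embeddings)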