Hello
I am building a chatbot-style app in Streamlit using the OpenAI API. It works well for short exchanges, but once the conversation gets longer (6–8 turns), the app becomes laggy or occasionally freezes.
I am using st.session_state to store the conversation history and st.chat_message to render the messages.
I am not sure whether the slowdown comes from the way I'm updating the UI or from how Streamlit handles memory with larger payloads.
I have tried clearing older messages from session_state, limiting the tokens in each prompt, and even splitting the history into chunks, but the lag persists. Are there known limits or best practices for handling multi-turn conversations with LLMs in Streamlit?
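For reference, here is a minimal sketch of the kind of trimming I have been trying before each API call (the helper name `trim_history` and the `max_turns` cutoff are just my own choices, not anything from Streamlit or OpenAI):

```python
def trim_history(messages, max_turns=8):
    """Keep any system prompt plus only the most recent `max_turns`
    user/assistant messages, dropping older turns before the API call."""
    system = [m for m in messages if m["role"] == "system"]
    recent = [m for m in messages if m["role"] != "system"][-max_turns:]
    return system + recent
```

I call this on the list stored in st.session_state right before sending it to the API, so the payload stays bounded, but the UI still lags.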
I am also exploring ways to stream responses, but I'm curious what others have done to keep performance smooth. I have already read 6 Tips for Improving Your App Performance | Streamlit on this topic.
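For the streaming part, the approach I'm experimenting with is wrapping the OpenAI stream in a plain generator that yields only the text deltas, which st.write_stream can consume. A rough sketch (the chunk shape assumes the OpenAI chat-completions streaming format; the helper name is my own):

```python
def text_deltas(stream):
    """Yield just the text pieces from an OpenAI chat-completions stream,
    skipping empty deltas, so they can be rendered incrementally."""
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            yield delta
```

In the app I would then do something like `full_reply = st.write_stream(text_deltas(stream))` inside a `st.chat_message("assistant")` block and append `full_reply` to session_state afterwards, but I haven't measured yet whether this helps with the lag.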
While researching this, I also came across Perplexity AI, which made me wonder how some tools manage long LLM sessions so efficiently. Any tips, examples, or tricks for optimizing long conversations in Streamlit would be really helpful!
Thank you!