Hello
I am building a chatbot-style app in Streamlit using the OpenAI API. It works well for short exchanges, but once the conversation gets longer (6–8 turns), the app becomes laggy or occasionally freezes.
I am using st.session_state to store the conversation history and st.chat_message to render the messages.
I am not sure whether the slowdown comes from the way I'm updating the UI or from how Streamlit handles memory with larger payloads.
I have tried clearing older messages from session_state, limiting the tokens in each prompt, and even splitting the history into chunks, but the lag persists. Are there known limits or best practices for handling multi-turn conversations with LLMs in Streamlit?
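For reference, here is a minimal sketch of the kind of trimming I have been trying before each API call (the helper name `trim_history` and the `max_turns` cutoff are just my own choices, not anything from Streamlit or OpenAI):

```python
def trim_history(messages, max_turns=8):
    """Keep any system prompt plus only the most recent `max_turns`
    user/assistant messages, dropping older turns before the API call."""
    system = [m for m in messages if m["role"] == "system"]
    recent = [m for m in messages if m["role"] != "system"][-max_turns:]
    return system + recent
```

I call this on the list stored in st.session_state right before sending it to the API, so the payload stays bounded, but the UI still lags.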
I am also exploring ways to stream responses, but I'm curious what others have done to keep performance smooth. I have already read 6 Tips for Improving Your App Performance | Streamlit on this topic.
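For the streaming part, the approach I'm experimenting with is wrapping the OpenAI stream in a plain generator that yields only the text deltas, which st.write_stream can consume. A rough sketch (the chunk shape assumes the OpenAI chat-completions streaming format; the helper name is my own):

```python
def text_deltas(stream):
    """Yield just the text pieces from an OpenAI chat-completions stream,
    skipping empty deltas, so they can be rendered incrementally."""
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            yield delta
```

In the app I would then do something like `full_reply = st.write_stream(text_deltas(stream))` inside a `st.chat_message("assistant")` block and append `full_reply` to session_state afterwards, but I haven't measured yet whether this helps with the lag.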
While researching this, I also came across Perplexity AI, which made me wonder how some tools manage long LLM sessions so efficiently. Any tips, examples, or tricks for optimizing long conversations in Streamlit would be really helpful!
Thank you!