Streamlit application killed

godot63 · April 27, 2020, 3:17pm

Each time I tweet about a Streamlit application I just built, the application goes down shortly after. The tmux session console just says: killed. Until I restart the server, I cannot even connect with PuTTY to relaunch the app. It’s an AWS EC2 micro server, which is fine most of the time but not under short-term heavy load. Does anyone have experience solving this problem without necessarily buying a bigger server? I don’t mind if the application crashes from time to time; I would just like it to recover. Also, I have no idea what is happening: all the monitoring graphs look OK. I suspect there are too many sessions, but maybe somebody knows a better way to analyze the problem, which might help me find a solution. Thanks for any insight.

randyzwitch · April 27, 2020, 8:41pm

Hi @godot63 -

From the sound of it, this is related to a quick burst of traffic. Are you using anything like systemd to do auto-restarting?

I’ll have to ask internally whether we have any load testing metrics or guidelines around concurrent sessions.

Best,
Randy

godot63 · April 28, 2020, 4:51am

hi @randyzwitch Randy

thanks for your reply. It is the burst for sure. I believe one side effect of Streamlit's caching mechanism is that apps use up a lot of memory, depending on how much is being cached. There is not much you can do: more memory, less caching, or, as you suggest, live with it but make sure the application restarts. I want to go with the last solution and will try systemd. I’m new to Linux, so that is another challenge I have here. Since AWS calls everything elastic, I wonder whether there is such a thing as elastic memory; I am willing to pay a bit for the few hours after a tweet when the app is really busy. For that I would like to analyse what the requirements would be. If you come up with any answers regarding testing and metrics, I would be most grateful.

cheers,
Lukas

randyzwitch · April 28, 2020, 12:57pm

It might be the case you are running up against one of our cache design decisions, outlined here in the documentation:

https://docs.streamlit.io/caching.html#example-5-use-caching-to-speed-up-your-app-across-users

It sounds like what you might be seeing is that your AWS instance is undersized for every possible combination of your app inputs, but it’s only noticeable when many users hit it. Meaning, perhaps it takes 1000 runs to overload the cache, which you don’t notice individually, but when 1000 users in one hour do it, then it causes the issue. If that’s the case, then there’s nothing to do but make a bigger AWS instance.

You might also try using the ttl keyword argument on st.cache(), but that’s more about managing the freshness of the result than memory management:

https://docs.streamlit.io/api.html?highlight=ttl#streamlit.cache

ttl (float or None) – The maximum number of seconds to keep an entry in the cache, or None if cache entries should not expire. The default is None.
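
To make the memory point concrete, here is a minimal sketch in plain Python of a cache bounded both by age (like ttl) and by entry count. It is not Streamlit's implementation; `bounded_ttl_cache`, `max_entries`, `clock`, and `expensive` are hypothetical names chosen for illustration. The idea is only that expiry alone does not cap memory, while a limit on the number of entries does.

```python
import time
from collections import OrderedDict
from functools import wraps


def bounded_ttl_cache(max_entries=100, ttl=None, clock=time.monotonic):
    """Hypothetical helper: cache results, capped by entry count and by age.

    Only positional, hashable arguments are supported in this sketch.
    """

    def decorator(func):
        store = OrderedDict()  # args -> (timestamp, value), least recently used first

        @wraps(func)
        def wrapper(*args):
            now = clock()
            if args in store:
                stamp, value = store[args]
                if ttl is None or now - stamp < ttl:
                    store.move_to_end(args)  # mark as recently used
                    return value
                del store[args]  # entry is older than ttl, recompute below
            value = func(*args)
            store[args] = (now, value)
            while len(store) > max_entries:
                store.popitem(last=False)  # evict the least recently used entry
            return value

        wrapper.cache_size = lambda: len(store)
        return wrapper

    return decorator


@bounded_ttl_cache(max_entries=2, ttl=60)
def expensive(n):
    # Stand-in for a slow computation or data load.
    return n * n
```

With `max_entries=2`, calling `expensive` with a third distinct input evicts the least recently used result, so the cache never holds more than two entries no matter how many users or input combinations arrive.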

godot63 · April 29, 2020, 3:27am

Thanks again @randyzwitch. I now start my apps from the script below, which restarts the process if it is killed. The line [ -e stopme ] && break exits the loop if a file named stopme is found in the folder. However, I am still interested to know whether there is a way to stress-test an app. I will also check with AWS whether there is a way to increase the memory on demand for a limited amount of time.

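#!/bin/bash
# starts the traffic app and restarts it if crashed

while true; do
    # stop restarting once a file named "stopme" exists in this folder
    [ -e stopme ] && break
    streamlit run app.py --server.enableCORS False --server.port 8502
    # brief pause so an app that fails at startup does not restart in a tight loop
    sleep 5
done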
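
The same restart-on-exit idea can be sketched in Python, with a growing pause between restarts so a process that dies immediately does not hammer a small server. This is an illustrative sketch, not a replacement for systemd; `supervise` and its parameters are hypothetical names, and the command in the comment at the end simply mirrors the script above.

```python
import subprocess
import sys
import time
from pathlib import Path


def supervise(cmd, stop_file="stopme", max_runs=None,
              base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Run cmd again each time it exits, until stop_file exists.

    Returns the number of times cmd was started.
    """
    runs = 0
    delay = base_delay
    while not Path(stop_file).exists():
        if max_runs is not None and runs >= max_runs:
            break
        started = time.monotonic()
        exit_code = subprocess.call(cmd)  # blocks until the process exits
        runs += 1
        print(f"process exited with code {exit_code}", file=sys.stderr)
        # A run that lasted a while counts as healthy: start backoff over.
        if time.monotonic() - started > max_delay:
            delay = base_delay
        sleep(delay)
        delay = min(delay * 2, max_delay)  # wait longer after each quick failure
    return runs


# Example call (assumes streamlit is on PATH and app.py is in this folder):
# supervise(["streamlit", "run", "app.py", "--server.port", "8502"])
```

As with the shell loop, creating a file named stopme in the working directory ends the loop once the current process exits.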

randyzwitch · April 29, 2020, 1:14pm

I’ll take this back to our engineering team and see if they have any suggestions. Off the top of my head, it might be possible to use something like Selenium to enumerate the possible input values to your app.
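
As a first, rough approximation of a traffic burst, the sketch below fires concurrent HTTP GET requests at a URL using only the Python standard library. An important caveat: Streamlit runs the app script per browser session over a WebSocket, so plain GET requests exercise page serving but not the script itself or its cache; that is why a browser-driven tool such as Selenium is needed to test the app code. `fetch` and `load_test` are hypothetical helper names, and the URL in the usage comment is an assumption based on the port in the script above.

```python
import http.client
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10.0):
    """Issue one GET request and return (ok, seconds_taken)."""
    started = time.monotonic()
    try:
        with urlopen(url, timeout=timeout) as response:
            response.read()
            ok = 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error statuses.
        ok = False
    return ok, time.monotonic() - started


def load_test(url, users=20, requests_per_user=5):
    """Simulate `users` concurrent clients, each sending several requests."""

    def one_user(_):
        return [fetch(url) for _ in range(requests_per_user)]

    with ThreadPoolExecutor(max_workers=users) as pool:
        batches = list(pool.map(one_user, range(users)))
    results = [result for batch in batches for result in batch]
    return {
        "requests": len(results),
        "failures": sum(1 for ok, _ in results if not ok),
        "slowest_seconds": max(seconds for _, seconds in results),
    }


# Example call (assumes the app is listening locally on port 8502):
# print(load_test("http://localhost:8502/", users=50))
```

Watching memory on the server (for example with top or free) while this runs gives a first idea of how the instance behaves under concurrent connections.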

Fabio · July 4, 2020, 10:02am

Hi,

My app runs on a dedicated server behind nginx. Nobody except me is using it at the moment. When I run especially hard tests, Streamlit seems to crash and I get a 506 error from nginx. I need to restart Streamlit (streamlit run myapp) and all is fine again.

What is the best way to monitor that streamlit is up and running and to restart it if not?
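
One possible approach, sketched here under stated assumptions, is to poll the app over HTTP and trigger a restart after several consecutive failed checks. `is_up` and `watch` are hypothetical helpers, and the `restart` callback is whatever restarts the app on your machine. The health URL is also an assumption: Streamlit releases from around this time served a health endpoint at /healthz, while newer releases use /_stcore/health, so check which one your version answers.

```python
import http.client
import time
from urllib.request import urlopen


def is_up(url, timeout=5.0):
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        return False


def watch(url, restart, checks, failures_allowed=3, interval=10.0,
          sleep=time.sleep, probe=is_up):
    """Probe url `checks` times; call restart() after repeated failures.

    Returns how many times restart() was called.
    """
    failures = 0
    restarts = 0
    for _ in range(checks):
        if probe(url):
            failures = 0
        else:
            failures += 1
            if failures >= failures_allowed:
                restart()
                restarts += 1
                failures = 0  # give the restarted app time to come back
        sleep(interval)
    return restarts


# Example call (URL and restart command are assumptions for illustration):
# import subprocess
# watch("http://localhost:8501/healthz",
#       restart=lambda: subprocess.Popen(["streamlit", "run", "myapp.py"]),
#       checks=8640)
```

Requiring several consecutive failures avoids restarting the app because of a single slow response during a heavy test.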

Many thanks

Fabio

astromath · September 28, 2020, 6:58pm

Hello,

Have you found a solution for monitoring?
I recently learned about jmeter and it might be what you are looking for.
You can use it to monitor the server’s behavior as well as simulate many simultaneous users and see what happens.

Cheers
Alex

Fabio · September 29, 2020, 7:47am

Thanks Alex,

I will certainly try. I think the problem was due to a bug in my software, because the app has not crashed since.

Best

Fabio

JCCKwong · December 9, 2020, 3:02am

Hi everyone,

I’ve been having issues with crashes when more than one user is using the app. I have to reboot the app from streamlit share with each crash.

Is there a way to auto-reboot if a crash occurs?

Otherwise, Streamlit has been amazing, super intuitive to make beautiful apps!!

Best,
Jethro

SXAPDX · December 14, 2020, 6:40pm

Hi Randy,

We are building an app to be used by a college class of up to 80 students. I would like to validate that the app will be able to handle that number of users. We are running on Heroku.

Is there a recommended approach you have found to test this ahead of time?

Thanks

Soren
