App crashes after some time without error

After some hours or days, one of my streamlit apps on share.streamlit.io crashes without any error message in the logs. I wonder why this might be happening and how to troubleshoot the issue.

What happens if the cache of an app gets so full that the resource constraints are exceeded? Would this trigger an error in the log or fail silently?

Thank you for your help

Log from last crash: logs-sbi-benchmark-streamlit-main-posteriors.py-2021-01-18T11_11_29.421Z.txt · GitHub
Repo of app: GitHub - sbi-benchmark/streamlit: Streamlit app for interactive results
App on streamlit (might be offline again): https://share.streamlit.io/sbi-benchmark/streamlit/main/posteriors.py

Hey @jml,

I do expect the Streamlit app to crash and silently restart if your cache gets too big and you hit resource limits. Here are some tips to limit the size of your cache from another forum topic.
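As a rough illustration of what bounding a cache means, here is a minimal sketch that uses the standard library's `functools.lru_cache` as a stand-in, so it runs without Streamlit. The loader `load_posterior` and the task names are hypothetical. In Streamlit itself, the cache decorator accepts `max_entries` and `ttl` parameters for the same purpose; check the docs for the version you are running.

```python
from functools import lru_cache


@lru_cache(maxsize=2)  # keep at most 2 results; the least recently used one is evicted
def load_posterior(task: str) -> list:
    # Hypothetical stand-in for an expensive load or computation.
    return [task] * 3


# Three distinct inputs go through a cache that holds only two entries.
for task in ("task_a", "task_b", "task_c"):
    load_posterior(task)

info = load_posterior.cache_info()
print(info.currsize, info.misses)
```

With a bound like this, memory used by cached results stops growing once the limit is reached, at the cost of recomputing evicted entries.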

Happy Streamlitin’ :balloon:
Fanilo

Hi @andfanilo,

thanks for the reply and tips. I limited caching in order to narrow down whether this indeed is the problem.

When crashed, however, the app did not automatically restart. Should automatic restarting be default behavior?

Hey @jml, I actually am not sure about fault-tolerance behavior :confused: let’s wait for an answer from the Streamlit team :balloon: @randyzwitch @Marisa_Smith

Hey @jml (and @andfanilo!)

I believe we used to automatically restart apps but they were then getting into crash loops as they reached the resource limits over and over.

So currently, if an app crashes because it hit resource limits, it won’t restart automatically.

We do have plans in our roadmap to notify the developer if an app that they successfully deployed goes down at some later time due to reaching resource limits, but at the moment I can’t give you a time estimate on that (we are still in the midst of planning 2021!)

Cheers,
Marisa

Hi,

@Marisa_Smith, are there any updates on a time estimate for the notification you mentioned? Our app keeps crashing after a couple of days, and website visitors are then faced with a meaningless error message.

Restarting the app manually fixes the problem for a few days, but having to keep checking by hand whether things are still alright is very annoying. An automated solution to catch this would be highly useful.
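Until such a notification exists, one possible stopgap is an external check run on a schedule. The sketch below uses only the standard library and assumes that a crashed app either fails to respond or answers with a non-200 status; that assumption is not verified for share.streamlit.io, where an error page might still be served with status 200. The function name `app_is_up` and the `opener` parameter are made up for this example.

```python
import urllib.error
import urllib.request


def app_is_up(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if the URL answers with HTTP 200, False otherwise.

    `opener` is injectable so the check can be tested without network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection failures, timeouts, and HTTP error statuses end up here.
        return False
```

It could be called from a cron job or a scheduled CI workflow, for example `app_is_up("https://share.streamlit.io/sbi-benchmark/streamlit/main/posteriors.py")`, sending an alert whenever it returns `False`.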

Cheers

Hey @jml,

The team is working on the next version of the Sharing platform, which addresses this. I just asked them whether we have a time estimate for this feature, and I will get back to you as soon as I know more!

Marisa

Great, thank you! Did they get back to you with an estimate? I unfortunately keep running into this issue.

Hey @jml,

Just got confirmation from our engineering team that this is scheduled to launch this quarter!!! :tada:

They are not exactly sure of the date just yet, as I believe it still needs to go through some more beta testing, but you can expect this to be released before the end of Q2!