Streamlit stops by itself after running for a while on AWS EC2

Hello friends,
I have a problem hosting my Streamlit app on an AWS EC2 instance, and I was wondering if anyone can help. Basically, my app runs a real-time training job on the EC2 instance, which takes up to 2 hours. Training runs smoothly for the first few tens of epochs (around 20 min), but after a while it stops by itself before it completes. I did not refresh the app or interrupt anything; it just stops by itself… The CPU usage of the EC2 instance then drops, as if nothing were running there.
However, I tried hosting this app on my local computer and everything is fine. The whole training process can be finished without any problem.
Please help me if anyone knows what's going on… Thanks!

Same problem here… Did you figure it out?