Preventing malicious bots from crashing a Streamlit app

aab · July 21, 2023, 6:24am

Hello folks,

I am really new to Streamlit development. I have rolled out an app on an EC2 instance using Docker. My setup is a multi-container approach: web server → multipage Streamlit app → RESTful API (with an in-memory cache). One day ago my app went down. After inspecting the logs in all containers, I found that some malicious requests were the last events before the container stopped:
2023-07-15 20:06:23.349 “browser.browser” is not a valid config option. If you previously had this config option set, it may have been removed.

2023-07-16 04:00:01.653 MediaFileHandler: Missing file .env
2023-07-19 00:03:02.637 MediaFileHandler: Missing file wp-includes/wlwmanifest.xml
Stopping…

It seems that the last event was a request for WordPress files, obviously a malicious request, but I don’t understand why the app crashed. When I request the same path with a browser (domain-name/wp-includes/wlwmanifest.xml), or any other non-existent resource, it returns nothing, and nothing shows up in the logs either.

Does anyone have an idea? Is there a bug in Streamlit, was it just not configured properly, or is the root of the problem probably not in Streamlit at all?

Thanks for any ideas in advance!

marduk · July 21, 2023, 7:06am

Hi there @aab ,

Can you share your Dockerfile and config.toml file? First let’s check what is causing this:

2023-07-15 20:06:23.349 “browser.browser” is not a valid config option. If you previously had this config option set, it may have been removed.

On the other hand, did you check resource usage? Maybe there is a memory leak in your app, and each bot “visit” was building it up until it crashed.

aab · July 21, 2023, 9:04am

Hi @marduk

thanks for your response. Sure:

Dockerfile:

FROM python:3.11-slim
WORKDIR /app
COPY ./requirements.txt requirements.txt
RUN pip install --upgrade pip
RUN pip3 install --no-cache-dir --upgrade -r requirements.txt
COPY ./src/ ./
ENTRYPOINT ["streamlit", "run", "main.py", "--server.port=8501", "--server.address=0.0.0.0"]

requirements.txt

streamlit==1.24.0
requests==2.31.0
python-dotenv==1.0.0

docker-compose.yml:

  app:
    build: ./app
    container_name: stream-app
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://app:8501/_stcore/health"]
      interval: 1m30s
      timeout: 10s
      retries: 3
    depends_on:
      - fastapi
      - redis
    environment:
      PRODUCTION: 'true'
    env_file:
      - .env
    expose:
      - 8051
    networks:
      - prod_network
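
One thing worth checking in the compose file above: the healthcheck calls `curl`, and as far as I know the `python:3.11-slim` base image does not ship with `curl`, so the check may fail regardless of the app’s state. (The `expose` entry also lists 8051 while the Dockerfile starts Streamlit on 8501; `expose` is essentially informational inside a Docker network, so that mismatch is probably harmless, but worth aligning.) Below is a minimal sketch of a Python-only probe, assuming the `/_stcore/health` endpoint from the healthcheck above. The function name `is_healthy` and the default URL are illustrative, not part of the original setup:

```python
import urllib.error
import urllib.request


def is_healthy(url: str = "http://localhost:8501/_stcore/health",
               timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure or an HTTP error status
        return False
```

A small script that calls `is_healthy()` and exits with 0 or 1 accordingly could then stand in for the `curl` command in the healthcheck `test` entry.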

config.toml:

[browser]
browser.gatherUsageStats = false

I haven’t checked the resource usage yet. I have come across mentions of it on the internet, but was not sure how to check it …
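
For a first look at memory from inside the app, here is a minimal sketch using only the standard library, assuming the app runs on Linux (as it does in the `python:3.11-slim` container). `peak_rss_mib` is a made-up helper name. From the host, `docker stats` shows per-container memory as well.

```python
import resource


def peak_rss_mib() -> float:
    """Peak resident set size of the current process in MiB (Linux)."""
    # On Linux ru_maxrss is reported in kibibytes (on macOS it is bytes)
    peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak_kib / 1024


print(f"peak memory so far: {peak_rss_mib():.1f} MiB")
```

Logging this value periodically, for example once per script rerun, would show whether memory keeps climbing between bot visits.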

blackary · July 21, 2023, 1:43pm

Two quick things:

Could you repost your Dockerfile, docker-compose, etc. files as code blocks? You can do this by adding triple backticks ``` around each snippet. That way they are formatted correctly, so I and others can test them out.
In my experience, it’s not uncommon for bots to try to access WordPress endpoints on random domains, just because WordPress is a popular platform and is often deployed in ways that leave some vulnerabilities. This shouldn’t affect your running app in any meaningful way.

I would focus on two main things for debugging this:

Try to run your app independently of docker, and see if you experience any crashes
Try to run a super simple app using your docker setup, and see if you experience any crashes

That should help narrow down whether it’s a docker issue or an issue with your app itself.
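
To see how often such probes actually hit the app, the `MediaFileHandler: Missing file` lines from the log can be tallied. This is a rough sketch, assuming the log format shown in the first post; the `PROBE_HINTS` tuple and the function name are illustrative guesses, not an exhaustive list of scanner paths:

```python
import re
from collections import Counter

# Matches lines like "2023-07-16 04:00:01.653 MediaFileHandler: Missing file .env"
MISSING_FILE = re.compile(r"MediaFileHandler: Missing file (?P<path>\S+)")

# Hypothetical hints for paths that typical scanners ask for
PROBE_HINTS = ("wp-", ".env", ".git", "phpmyadmin")


def count_probes(log_lines):
    """Count missing-file requests whose path looks like a scanner probe."""
    hits = Counter()
    for line in log_lines:
        match = MISSING_FILE.search(line)
        if match and any(hint in match["path"].lower() for hint in PROBE_HINTS):
            hits[match["path"]] += 1
    return hits


sample = [
    "2023-07-16 04:00:01.653 MediaFileHandler: Missing file .env",
    "2023-07-19 00:03:02.637 MediaFileHandler: Missing file wp-includes/wlwmanifest.xml",
    "Stopping...",
]
print(count_probes(sample))
```

If the counts are small and spread over days, as in the log excerpt, that would support the view that the probes themselves are background noise rather than the cause of the crash.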

aab · July 21, 2023, 2:03pm

@blackary thanks for the hint with ```!
And thank you for your reply!

The app was up for a week or two; it’s not crashing after an hour or two. My basic security concept is to isolate the app and not run it on the host machine: port 8051 is exposed only inside a Docker network and is accessed by a web server. The host machine only exposes ports 80, 443 and 22. Since the app is exposed to the internet, I have implemented a basic login function:

import os

import streamlit as st


def login():
    # Get the login password from environment variables
    actual_password = os.environ['LOGIN_PASSWORD']
    # Initialize session state for login status
    st.session_state.login_successful = st.session_state.get('login_successful', False)

    # If the user is not logged in yet
    if not st.session_state.login_successful:
        placeholder = st.empty()

        # Show login form
        with placeholder.form("login"):
            st.markdown("#### Enter your password")
            password = st.text_input("Password", type="password")
            submit = st.form_submit_button("Login")

        # After user submits the form, check if the password is correct
        if submit:
            if password == actual_password:
                # Correct password, clear the form and mark login as successful
                placeholder.empty()
                st.success("Login successful")
                st.session_state.login_successful = True
            else:
                # Incorrect password, show an error
                st.error("Login failed")
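
One small hardening step for the password check above, offered as a sketch rather than a complete auth solution: `hmac.compare_digest` from the standard library compares in constant time, which avoids leaking information through timing the way a plain `==` can. The helper name `password_matches` is made up for illustration:

```python
import hmac


def password_matches(entered: str, actual: str) -> bool:
    """Compare two passwords without short-circuiting on the first mismatch."""
    # Encode to bytes so non-ASCII passwords are accepted as well
    return hmac.compare_digest(entered.encode("utf-8"), actual.encode("utf-8"))


print(password_matches("secret", "secret"))  # True
print(password_matches("guess", "secret"))   # False
```

In `login()` this would replace the `password == actual_password` comparison.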

My next steps:

I am going to monitor resource usage, especially the memory leak topic, as suggested by @marduk: https://blog.streamlit.io/common-app-problems-resource-limits/
I haven’t implemented caching yet; I’m going to do it anyway. https://docs.streamlit.io/library/advanced-features/caching?ref=blog.streamlit.io
If the previous steps don’t help, I’m going to try out your step 2.
