Make a streamlit web app use a chromedriver.exe file

Hello!

I’d like to deploy my web app through the streamlit community cloud but I’m getting this error:

AttributeError: This app has encountered an error. The original error message is redacted to prevent data leaks. Full error details have been recorded in the logs (if you're on Streamlit Cloud, click on 'Manage app' in the lower right of your app).
Traceback:
File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 541, in _run_script
    exec(code, module.__dict__)
File "/mount/src/corners-betting/CornersBetting.py", line 17, in <module>
    mws = import_MyWebScrapingTools().MyWsTools()
File "MyWebscrapingTools.py", line 29, in __init__
    self.driver = init_driver()
File "MyWebscrapingTools.py", line 25, in init_driver
    chrome_service = Service(ChromeDriverManager().install())
File "/home/adminuser/venv/lib/python3.9/site-packages/webdriver_manager/chrome.py", line 40, in install
    driver_path = self._get_driver_binary_path(self.driver)
File "/home/adminuser/venv/lib/python3.9/site-packages/webdriver_manager/core/manager.py", line 40, in _get_driver_binary_path
    file = self._download_manager.download_file(driver.get_driver_download_url(os_type))
File "/home/adminuser/venv/lib/python3.9/site-packages/webdriver_manager/drivers/chrome.py", line 32, in get_driver_download_url
    driver_version_to_download = self.get_driver_version_to_download()
File "/home/adminuser/venv/lib/python3.9/site-packages/webdriver_manager/core/driver.py", line 48, in get_driver_version_to_download
    return self.get_latest_release_version()
File "/home/adminuser/venv/lib/python3.9/site-packages/webdriver_manager/drivers/chrome.py", line 64, in get_latest_release_version
    determined_browser_version = ".".join(determined_browser_version.split(".")[:3])

The app should do some web scraping with Selenium, but to do so it needs to execute a chromedriver.exe file.
I tried two ways to do it:

  1. Upload the latest chromedriver.exe to the GitHub repo and use the GitHub link as executable_path in webdriver.Chrome(executable_path). This fails to read the file.
  2. Use Service(ChromeDriverManager().install()) to install the latest chromedriver.exe automatically on every run. This fails with the error above (and raises a question: where should it be installed?).

This is the link to the GitHub page. (Note that I use a custom module to speed up some webscraping actions but it’s just some functions built on top of selenium).

Do you know any solution?
TIA

You cannot use a Windows .exe on Streamlit Cloud, since it is a Debian-based Linux system.

Clear, thanks.
Any solution to run the webscraping script anyway?

Hi @LeonardoAcquaroli

Have you tried creating a packages.txt file with the following content:

chromium

This would be the equivalent of installing chromium in Ubuntu via

apt-get install chromium

Hope this helps!

Thanks, I tried and now the error says:

[13:10:32] 🐍 Python dependencies were installed from /mount/src/corners-betting/requirements.txt using pip.
Check if streamlit is installed
Streamlit is already installed
[13:10:33] 📦 Processed dependencies!
/bin/sh: 1: google-chrome: not found
/bin/sh: 1: google-chrome-stable: not found
/bin/sh: 1: google-chrome-beta: not found
/bin/sh: 1: google-chrome-dev: not found
/bin/sh: 1: google-chrome: not found
/bin/sh: 1: google-chrome-stable: not found
/bin/sh: 1: google-chrome-beta: not found
/bin/sh: 1: google-chrome-dev: not found
2023-10-17 13:11:10.201 Uncaught app exception
Traceback (most recent call last):
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 541, in _run_script
    exec(code, module.__dict__)
  File "/mount/src/corners-betting/CornersBetting.py", line 17, in <module>
    mws = import_MyWebScrapingTools().MyWsTools()
  File "MyWebscrapingTools.py", line 29, in __init__
    self.driver = init_driver()
  File "MyWebscrapingTools.py", line 25, in init_driver
    chrome_service = Service(ChromeDriverManager().install())
  [...]
  File "/home/adminuser/venv/lib/python3.9/site-packages/webdriver_manager/drivers/chrome.py", line 64, in get_latest_release_version
    determined_browser_version = ".".join(determined_browser_version.split(".")[:3])
AttributeError: 'NoneType' object has no attribute 'split'

This part is new and I don't know what it means:

/bin/sh: 1: google-chrome: not found
/bin/sh: 1: google-chrome-stable: not found
/bin/sh: 1: google-chrome-beta: not found
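
The `google-chrome: not found` lines and the AttributeError appear to be the same problem seen twice: webdriver_manager seems to run a list of Chrome command names to read the browser version, none of them exists in the container, and the resulting None is then passed to `.split(".")`. The sketch below is not webdriver_manager's code. It only mimics that kind of probing with the standard library; `detect_browser_version`, `major_minor_build`, and the candidate list are hypothetical names chosen for illustration.

```python
import re
import shutil
import subprocess
from typing import Optional, Sequence

# Command names to probe. The google-chrome* names come from the log above;
# "chromium" is an assumption about what the Debian chromium package installs.
CANDIDATES = ("google-chrome", "google-chrome-stable", "chromium")


def detect_browser_version(candidates: Sequence[str] = CANDIDATES) -> Optional[str]:
    """Return the first version string reported by a candidate binary, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is None:
            continue  # same situation as "/bin/sh: 1: google-chrome: not found"
        try:
            result = subprocess.run(
                [path, "--version"], capture_output=True, text=True, timeout=10
            )
        except (OSError, subprocess.TimeoutExpired):
            continue
        match = re.search(r"\d+\.\d+\.\d+(\.\d+)?", result.stdout)
        if match:
            return match.group(0)
    return None


def major_minor_build(version: Optional[str]) -> Optional[str]:
    """Keep the first three version components, guarding against None."""
    if version is None:
        # Without this guard, None.split(".") raises the AttributeError above.
        return None
    return ".".join(version.split(".")[:3])
```

If `detect_browser_version()` returns None on the deployed app, no browser binary with one of those names is on PATH, which would match the traceback.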

I have the same question :sob:, have you solved these errors?

You cannot use a Windows executable (chromedriver.exe) on Streamlit Cloud.

See here a simple working example how to use selenium on streamlit cloud:

Thanks so much!!
But I have no experience with Docker :joy:
It doesn't look simple to me.
I will study it hard, haha :yum:

You don’t need docker, this is just one of the options to test and run the streamlit app locally.

Thank you very much for your reply!!
However, the example included Docker-related content, and I am unsure about how to use it without Docker.
My packages.txt includes:
chromium
chromium-driver

And my dependencies have been successfully installed.
But the error message I encountered is as follows:

Traceback (most recent call last):
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/mount/src/domain/app.py", line 51, in <module>
    output1.text=main1.mian1(url_input,yumingA_input,yumingB_input,ua_input).replace('\n','<br>')
  File "/mount/src/domain/main1.py", line 7, in mian1
    Access.getPage(url,yumingA,ua)
  File "/mount/src/domain/Access.py", line 35, in getPage
    driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
  File "/home/adminuser/venv/lib/python3.9/site-packages/webdriver_manager/chrome.py", line 40, in install
    driver_path = self._get_driver_binary_path(self.driver)
  File "/home/adminuser/venv/lib/python3.9/site-packages/webdriver_manager/core/manager.py", line 40, in _get_driver_binary_path
    file = self._download_manager.download_file(driver.get_driver_download_url(os_type))
  File "/home/adminuser/venv/lib/python3.9/site-packages/webdriver_manager/drivers/chrome.py", line 32, in get_driver_download_url
    driver_version_to_download = self.get_driver_version_to_download()
  File "/home/adminuser/venv/lib/python3.9/site-packages/webdriver_manager/core/driver.py", line 48, in get_driver_version_to_download
    return self.get_latest_release_version()
  File "/home/adminuser/venv/lib/python3.9/site-packages/webdriver_manager/drivers/chrome.py", line 64, in get_latest_release_version
    determined_browser_version = ".".join(determined_browser_version.split(".")[:3])
AttributeError: 'NoneType' object has no attribute 'split'

What else should I do :dizzy_face:

You don’t have to use docker, it only makes development much easier and faster.
You only need the following files from my repo:

  • packages.txt
  • requirements.txt
  • streamlit_app.py

It seems that you use ChromeDriverManager(). I am not sure whether this will work on Streamlit Cloud.

Just have a look at my streamlit_app.py file to see a working example with selenium.
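
A minimal sketch of that approach, assuming packages.txt lists chromium and chromium-driver so that a chromedriver binary ends up on PATH. It is not a copy of streamlit_app.py from the repo; `chrome_arguments` and `build_driver` are hypothetical names, and the selenium import is deferred so the file can still be imported on a machine without Selenium.

```python
import shutil
from typing import List


def chrome_arguments() -> List[str]:
    """Flags commonly used for headless Chrome inside a container."""
    return [
        "--headless",
        "--no-sandbox",
        "--disable-dev-shm-usage",
        "--disable-gpu",
    ]


def build_driver():
    """Create a Chrome driver from the system chromedriver, without ChromeDriverManager."""
    # Imported here so the selenium package is only needed when a driver is built.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    # Assumed to be installed by the chromium-driver apt package.
    driver_path = shutil.which("chromedriver")
    if driver_path is None:
        raise RuntimeError("chromedriver not found on PATH; check packages.txt")

    options = Options()
    for argument in chrome_arguments():
        options.add_argument(argument)

    return webdriver.Chrome(service=Service(executable_path=driver_path), options=options)
```

In the app this would be used as `driver = build_driver()`, followed by `driver.get(url)` and a `driver.quit()` once scraping is done.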

Thank you very, very much!!!
Using the approach from your repo resolved my errors.
I really appreciate your help!!

But there's another issue. I'm using the following code in my application:

mitmdump_process = subprocess.Popen(['mitmdump', '-s', 'addons.py','-p', '8080'], stdout=sys.stdout, stderr=sys.stderr, env=env)

When I run this code in my local IDE, it prints logs to the console. However, when I deploy my app on Streamlit Cloud, I can’t see these logs in the left black console of the browser.

Do you know how I can print logs in Streamlit Cloud?
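
One thing that may be worth trying is to stop handing sys.stdout to the child process and instead read its output through a pipe, re-emitting every line from Python. The sketch below assumes nothing about mitmdump itself; `stream_process_output` is a hypothetical helper, and whether the lines then show up in the Cloud log panel is an assumption that has not been verified here.

```python
import subprocess
import sys
import threading
from typing import Callable, List, Tuple


def print_line(line: str) -> None:
    """Default sink: write the line to stdout and flush immediately."""
    print(line, file=sys.stdout, flush=True)


def stream_process_output(
    command: List[str], sink: Callable[[str], None] = print_line, **popen_kwargs
) -> Tuple[subprocess.Popen, threading.Thread]:
    """Start `command` and pass each line of its stdout/stderr to `sink`."""
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
        bufsize=1,  # line buffered in text mode
        **popen_kwargs,
    )

    def pump() -> None:
        # Iterating the pipe yields lines until the child closes its output.
        for line in process.stdout:
            sink(line.rstrip("\n"))

    thread = threading.Thread(target=pump, daemon=True)
    thread.start()
    return process, thread
```

With the command from the post, the call would look like `stream_process_output(['mitmdump', '-s', 'addons.py', '-p', '8080'], env=env)`. Passing a different `sink`, for example `logging.getLogger(__name__).info`, routes the same lines through the logging module.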

I really have to thank you @Franky1!
I still have an issue that I can't figure out on my own. With Selenium I have to scrape data from three different webpages for three different dataframes to be shown.

The first dataframe works, but the scraping of the second webpage fails to find the elements in the page and raises a timeout exception (the timeout is 10 s).

File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
File "/mount/src/corners-betting/CornersBetting.py", line 102, in <module>
    team_corners = single_team(code, team)
File "/mount/src/corners-betting/CornersBetting.py", line 94, in single_team
    team_corners_table = pd.merge(corners_for(), corners_against(), left_index=True, right_index=True, suffixes=('', '_y'))
File "/mount/src/corners-betting/CornersBetting.py", line 68, in corners_for
    corners_for_team_table = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, 'matchlogs_for')))
File "/home/adminuser/venv/lib/python3.9/site-packages/selenium/webdriver/support/wait.py", line 101, in until
    raise TimeoutException(message, screen, stacktrace)

where:

  1. is a custom webscraping module built on top of selenium that works in a local runtime
  2. the webpage to be scraped is Atalanta Match Logs (Pass Types), Serie A | FBref.com where “922493f3” is the code and “Atalanta” the team parameters of the function corners_for()

I can't understand why the driver loads the page but then can't find the web elements. I also tried:

  1. Using a totally different website, with the same result.
  2. Inverting the order of the dataframes (and so of the scraping processes).
  3. Closing the driver and creating a new driver variable before each driver call.
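
When a wait times out only on the deployed app, it can help to look at what the headless browser actually received. Some sites serve a consent page, a bot check, or a different layout to headless or datacenter traffic; whether that is happening here is only a guess. The sketch below relies on nothing but the `title`, `current_url`, and `page_source` attributes of a Selenium driver, and `save_debug_snapshot` is a hypothetical helper name.

```python
from pathlib import Path


def save_debug_snapshot(driver, directory: str = ".", name: str = "timeout") -> Path:
    """Write the page the driver currently holds to an HTML file and return its path."""
    target = Path(directory) / f"{name}.html"
    # Record where the browser ended up, in case of a redirect.
    header = f"<!-- url: {driver.current_url} | title: {driver.title} -->\n"
    target.write_text(header + driver.page_source, encoding="utf-8")
    return target
```

Calling it from an `except TimeoutException:` block around the WebDriverWait call, and then showing the file content with `st.code`, would show whether the element with ID 'matchlogs_for' is present in the HTML at all.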

Hello, I am still running into the following error despite trying what you did in streamlit_app.py. Perhaps I missed something. It works locally, but not when deployed to Streamlit.

Here is the error:

selenium.common.exceptions.WebDriverException: Message: Service /home/appuser/.cache/selenium/chromedriver/linux64/122.0.6261.94/chromedriver unexpectedly exited. Status code was: 127

Here is my code for reference:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service as ChromeService

from bs4 import BeautifulSoup as bs
import time
import csv
import streamlit as st
import os
import shutil


@st.cache_resource(show_spinner=False)
def get_webdriver_options() -> Options:
    options = Options()
    options.add_argument("--headless")
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--disable-gpu")
    options.add_argument("--disable-features=NetworkService")
    options.add_argument("--window-size=1920,1080")  # Chrome expects width,height
    options.add_argument("--disable-features=VizDisplayCompositor")
    options.add_argument('--ignore-certificate-errors')
    return options

@st.cache_resource(show_spinner=False)
def get_chromedriver_path() -> str:
    return shutil.which('chromedriver')

@st.cache_resource(show_spinner=False)
def get_logpath() -> str:
    return os.path.join(os.getcwd(), 'selenium.log')

def get_webdriver_service(logpath) -> ChromeService:
    service = ChromeService(
        executable_path=get_chromedriver_path(),
        log_output = logpath
    )
    return service

def scrape_comments(url):
    driver = webdriver.Chrome(options=get_webdriver_options(),
                        service=get_webdriver_service(get_logpath()))
    driver.get("https://www.linkedin.com/login?fromSignIn=true&trk=guest_homepage-basic_nav-header-signin")

Any help would be greatly appreciated!
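
The path in the error points into ~/.cache/selenium, which is where Selenium Manager stores drivers it downloads itself. That suggests `shutil.which('chromedriver')` returned None, so no system driver was found and Selenium fell back to its own download. Status 127 is the usual shell code for a command that could not be found or loaded. Both points are inferences from the error message, not something confirmed in this thread. A small guard, here with the hypothetical name `require_executable`, would make that case fail early with a readable message:

```python
import shutil
from typing import Optional


def require_executable(name: str, path: Optional[str] = None) -> str:
    """Return the full path of `name`, or raise if it is not on the search path."""
    found = shutil.which(name, path=path)
    if found is None:
        raise RuntimeError(
            f"{name!r} was not found on PATH; on Streamlit Cloud this may "
            "mean the matching apt package is missing from packages.txt"
        )
    return found
```

In the code above it would replace the body of `get_chromedriver_path()` with `return require_executable('chromedriver')`, so a missing chromium-driver package shows up as a clear error and not as a silent fallback.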