Issue with Selenium on a Streamlit app

KoxNoob · April 9, 2021, 6:47pm

Thanks again @randyzwitch and @Franky1 !

Indeed, no error message in the log.

Let me explain the aim: I’m trying to collect text and odds from bookmaker websites (Winamax, Unibet, Betclic, etc., as in my previous example) using Selenium.

So normally, if everything works, I should have numbers instead of NaN in the dataframe…

Franky1 · April 11, 2021, 8:50pm

@randyzwitch
After taking a closer look at SeleniumBase, I don’t know how to use it for webscraping. It’s a pytest framework, but I haven’t found an example of how to use the classes in a python script for webscraping. Do you have a small example?

@KoxNoob
Also, I had massive problems getting Selenium to work on streamlit sharing. I had imagined it to be easier. Locally on Windows 10 and in a local Docker container, I got it up and running relatively quickly. But the deployment to streamlit sharing of the app itself kept going wrong without any plausible error message.

My efforts for a very very simple example (proof of concept) can be found here:

https://share.streamlit.io/franky1/streamlit-selenium/main

KoxNoob · April 15, 2021, 12:42pm

By the way, thanks a lot @Franky1 !

randyzwitch · April 15, 2021, 2:44pm

I guess I was viewing this as a roundabout way of getting Selenium installed. SeleniumBase is used for testing, but by installing that, I would assume that the other Selenium package would then be able to detect the proper driver and make things happen.

In general, I’ll pass along the feedback to our product team about “What if there isn’t an apt package?” as part of the install process, not sure where adding custom apt repositories might be on the roadmap (if at all).

Best,
Randy

ferdie · April 23, 2021, 5:54pm

Hey @KoxNoob! I was wondering if you found any solution since I plan to use Selenium to do some scraping with Streamlit Sharing. I have the same issue as you, where the code runs locally but not when deployed online.

I honestly think we need a browser installed in the github repo somehow so the webdriver can use it, but I am unsure how that will work.

andfanilo · April 30, 2021, 6:11pm

Hey all,

Given that we can now use conda to install packages on Streamlit Share (see it here), we can install firefox-esr via packages.txt (as mentioned by @ferdie you need a browser to be installed for the webdriver) and install geckodriver through conda (here’s the package).

Then we need to hardcode the path to geckodriver in the conda environment:

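from selenium import webdriver
from selenium.webdriver.firefox.options import Options

# Run Firefox without a display, as required on the server
firefoxOptions = Options()
firefoxOptions.add_argument("--headless")
driver = webdriver.Firefox(
    options=firefoxOptions,
    # geckodriver installed by conda; this path may change
    executable_path="/home/appuser/.conda/bin/geckodriver",
)
driver.get(URL)  # URL is the page to scrape, defined elsewhere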

This geckodriver, as opposed to an executable pushed from the GitHub repo, can be used by the app and… I think it works. At least I can query the Unibet tables; I’ll let you test further.

I can’t guarantee it will work forever: if the Cloud team decides to change the path to the conda environment, you’ll have to look for it again.
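
One way to soften the hardcoded path, sketched here as an assumption rather than something tested on Streamlit Share: look the driver up on PATH first and fall back to the known conda location only if that fails. The helper name `find_driver` is hypothetical; `shutil.which` comes from the standard library.

```python
import os
import shutil

# Path quoted in the post above; it may change if the conda environment moves.
CONDA_GECKODRIVER = "/home/appuser/.conda/bin/geckodriver"


def find_driver(name, fallbacks=()):
    """Return the path of an executable called `name`, or None if not found.

    PATH is searched first; the `fallbacks` paths are then tried in order.
    """
    found = shutil.which(name)
    if found:
        return found
    for candidate in fallbacks:
        # Accept a fallback only if it exists and is executable.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    path = find_driver("geckodriver", fallbacks=[CONDA_GECKODRIVER])
    print(path or "geckodriver not found; check the conda environment path")
```

The result could then be passed as `executable_path` in the snippet above instead of the literal string.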

App: https://share.streamlit.io/andfanilo/s4a-selenium/main/app.py
Code: GitHub - andfanilo/s4a-selenium: Test Selenium + Firefox on Streamlit Share

Cheers,
Fanilo

ferdie · April 30, 2021, 8:37pm

This is exciting work, thank you so much @andfanilo!

By any chance do you know the package name to install the Google Chrome web browser? My project used the Google chromedriver, and it’ll be a little tedious to switch over to Firefox.

andfanilo · April 30, 2021, 8:47pm

Yeah, Chrome is going to be a bit harder, I think. I don’t think you can easily install Chrome on Debian Buster (which is the OS for Streamlit Share apps). You may have a better chance installing chromium via packages.txt (chromium is the open source base for Chrome), then chromedriver the same way as geckodriver (hardcoding the path to the conda-installed chromedriver in your app).
Hopefully Selenium knows how to pick up the installed chromium without playing with the options; otherwise you’ll have to find a way to tell it to use chromium. I haven’t tested this, but I hope you can work it out this way!

I’ll try later maybe.
Fanilo

Franky1 · May 2, 2021, 10:18pm

@ferdie

Add to your packages.txt file:

chromium
chromium-driver
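
With those two apt packages installed, a headless Chromium session might be started roughly like this. This is a sketch, not taken from the example project: the helper names `chromium_arguments` and `make_chromium_driver` are hypothetical, the flag list is a common choice for containers rather than a confirmed requirement, and it assumes `chromedriver` ends up on PATH so that Selenium can find it without an explicit path.

```python
def chromium_arguments():
    """Command-line flags commonly used for headless Chromium in a container."""
    return [
        "--headless",               # no display is available on the server
        "--no-sandbox",             # the sandbox often cannot start inside containers
        "--disable-dev-shm-usage",  # avoid relying on a small /dev/shm
        "--disable-gpu",
    ]


def make_chromium_driver():
    """Create a Selenium driver for Chromium (assumes chromedriver is on PATH)."""
    # Imported here so the flag helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in chromium_arguments():
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

Calling `make_chromium_driver()` and then `driver.get(URL)` would mirror the Firefox snippet earlier in the thread.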

See a simple example project here:

gmerticariu · May 3, 2021, 1:11pm

Hi @andfanilo! With the new release you cannot use more than one package manager at a time. Either you use pip (requirements.txt) or you use conda (environment.yml). You won’t be able to use both though.

andfanilo · May 3, 2021, 1:30pm

Sounds good to me, conda and pip install conflicts have always been hard to debug.

ferdie · May 3, 2021, 10:36pm

Hey Fanilo, so I’m currently in the process of debugging, and I was wondering if you can help me.

I completely switched to Firefox since I still ran into issues with chromium, but now I am running into this exception when trying to run the function you made to test Selenium in your app.

selenium.common.exceptions.WebDriverException: Message: Failed to decode response from marionette

For context, I copied over your code (some code is omitted) into a function test_st() found in sel.py:

    URL = "https://www.unibet.fr/sport/football/europa-league/europa-league-matchs"
    XPATH = "//*[@class='ui-mainview-block eventpath-wrapper']"
    TIMEOUT = 40

    firefoxOptions = Options()
    firefoxOptions.add_argument("--headless")
    driver = webdriver.Firefox(
        options=firefoxOptions,
        executable_path="/home/appuser/.conda/bin/geckodriver",
    )
    driver.get(URL)

    try:
        WebDriverWait(driver, TIMEOUT).until(
            EC.visibility_of_element_located((By.XPATH, XPATH,))
        )
... etc

And if you were to press the button test fanilo's function here on the sidebar of my app, it would run your code and run into the exception. Locally it does work, but not through Streamlit Sharing.

This is a bigger project, and I was hoping to isolate whether there is an issue with my code, or whether it’s just how I set up everything in my directory.

Any help would be appreciated, thank you again!

App: https://share.streamlit.io/ftaruc/grailed/main/st-app.py
Code: GitHub - ftaruc/grailed: grailed web application for streamlit

Edward_Lam · March 25, 2022, 9:27am

Hi @andfanilo

I am trying to clone the latest version of the code from your GitHub. For some reason, I got a permission denied error while deploying this app again on Streamlit Cloud:

I was using the same packages.txt and requirements and tried every Python version. I really don’t know what went wrong.

Would you mind helping me?

Thanks & Best Regards,
Ed

Franky1 · March 25, 2022, 10:18am

@Edward_Lam

I confirm this, the project seems to be broken if you run a fresh install on streamlit cloud.
The root cause seems to be that access to the /tmp folder is restricted.

IMHO headless webscraping on streamlit cloud is always a pain…
I had no luck with firefox/geckodriver in the past and used chromium/chromedriver which seems to be more stable.
You could try my example, which still works

GitHub - Franky1/Streamlit-Selenium: Streamlit project to test Selenium running on Streamlit Cloud

Edward_Lam · March 25, 2022, 11:46am

Hi @Franky1,

Thanks a lot for your help. I tried your code, and unfortunately this time it gives me a “selenium.common.exceptions.WebDriverException: Message: unknown error: cannot create temp dir for user data dir” error. I searched online and it seems like it is due to a change with recent version of chrome url. Which chromedriver version were you using, so that I can specify it in the packages?

Thanks & Regards,

Ed

Franky1 · March 25, 2022, 11:52am

I did not specify any version in packages.txt; I just redeployed my GitHub repo on Streamlit Cloud and it seems to work. Btw, I used Python 3.9 as the runtime on Streamlit Cloud.

Edward_Lam · March 26, 2022, 3:11am

Hi @Franky1,

I seem to have located the problem. It has the same root cause as with geckodriver: Chromium tries to create a user data directory in /tmp, but creating that directory is currently restricted.

I think your app still works because the directory was already created. Indeed you are right, headless webscraping on Streamlit Cloud is a pain…
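
If the diagnosis above is right, one workaround to try (an untested assumption, not something confirmed in this thread) is to hand Chromium a user data directory that the app is allowed to create, using Chromium's `--user-data-dir` flag. The helpers `writable_profile_dir` and `user_data_dir_argument` are hypothetical, and the caller has to guess which base directories might be writable.

```python
import os
import tempfile


def writable_profile_dir(candidates):
    """Return a fresh directory under the first writable candidate, else None."""
    for base in candidates:
        try:
            os.makedirs(base, exist_ok=True)
            return tempfile.mkdtemp(prefix="chromium-profile-", dir=base)
        except OSError:
            continue  # this location is not writable; try the next one
    return None


def user_data_dir_argument(candidates):
    """Build the --user-data-dir flag, or return None if nothing is writable."""
    profile = writable_profile_dir(candidates)
    return f"--user-data-dir={profile}" if profile else None
```

The returned string could be passed to `options.add_argument(...)` before creating the driver, for example with candidates such as a folder under the home directory followed by `tempfile.gettempdir()`.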

ashok2216-A · June 9, 2023, 5:49am

I recently deployed a project, but I am facing an error. Does anyone have a solution for this? It indicates that chromium has crashed.

MarrosSaldanha · January 31, 2024, 3:15am

Simple and perfect solution. It worked for me.
