Biopython BLAST module help

Hello, I am trying to deploy an app that uses Biopython’s wrapper for the NCBI blastp program. Unfortunately, I keep getting the error /bin/sh: 1: blastp: not found when deployed (though the code does work locally using streamlit run app.py).

An example code snippet for what I’m trying to do is below. The NamedTemporaryFile part is there because NcbiblastpCommandline requires local file paths, and I’m not sure whether this is also what’s causing the above error. Thank you for any help on this specific issue!

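from tempfile import NamedTemporaryFile

from Bio.Blast.Applications import NcbiblastpCommandline

# blastp takes file paths, so the sequences go into a temporary FASTA file.
# The default mode is binary, so write() expects bytes here.
with NamedTemporaryFile(dir='.', suffix='.fasta') as f:
    f.write(input_string)
    f.flush()  # make sure the data is on disk before blastp reads it

    # Compare the file against itself (query and subject are the same path)
    cline_pblast = NcbiblastpCommandline(
        query=f.name,
        subject=f.name
    )

    # Calling the command object runs blastp and returns (stdout, stderr)
    out = cline_pblast()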
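
On whether the temporary file is the culprit: the shell message says the blastp executable itself was not found, which is a separate matter from how the input file is written. Still, here is a minimal sketch of the temp-file pattern using only the standard library. The helper name `write_temp_fasta` and the sample record are made up for illustration, and the sketch assumes a POSIX host, where an open temporary file can be read through its path.

```python
import os
from tempfile import NamedTemporaryFile

def write_temp_fasta(fasta_text, directory=None):
    """Hypothetical helper: write FASTA text to a temp file and return it, still open."""
    # directory=None uses the system temp dir; pass "." to mirror the snippet above
    f = NamedTemporaryFile(mode="w", dir=directory, suffix=".fasta")
    f.write(fasta_text)
    f.flush()  # push buffered text to disk so an external tool can read it
    return f

sample = ">seq1\nMKTAYIAKQR\n"  # made-up record for illustration

with write_temp_fasta(sample) as tmp:
    path = tmp.name
    exists_while_open = os.path.exists(path)
    with open(path) as check:
        on_disk = check.read()

# The file is deleted as soon as the with block closes it
exists_after_close = os.path.exists(path)
```

The path is only valid inside the `with` block, so any external command has to run before the block exits.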

What is in your requirements file and/or can you link your repository?

Sure thing. Here is a link to the requirements.txt file

Repository link here with the page I’m having issues with at pages/generate_lefi_data.py

I’ve never used this package, so I’m no expert. Looking at the documentation I gather you’ll need to get the setup for BLAST+ also included with your dependencies. The app dependencies link I posted above would be the starting point if you need to go beyond what you can pip install. I don’t know how quick I’m going to find the exact answer or if there will be some fundamental incompatibility depending on the exact architecture of BLAST+, but I thought I’d share at least that much while I’m passively poking around at it. :slight_smile:
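
One way to confirm that the executable is what is missing is to look it up on PATH before calling it. This is a rough sketch using only the standard library; `find_blastp` and `build_blastp_command` are hypothetical helper names, and the argument list assumes the standard blastp options `-query`, `-subject`, and `-outfmt`.

```python
import shutil

def find_blastp(executable="blastp"):
    """Return the full path to the executable, or None if it is not on PATH."""
    return shutil.which(executable)

def build_blastp_command(query_path, subject_path, executable="blastp"):
    """Build an argument list for a query-vs-subject protein search (tabular output)."""
    return [executable, "-query", query_path, "-subject", subject_path, "-outfmt", "6"]

blastp_path = find_blastp()
if blastp_path is None:
    status = "blastp is not on PATH; BLAST+ has to be installed at the system level"
else:
    status = f"blastp found at {blastp_path}"
print(status)
```

If this prints the not-found message on the deployed host, the binary has to come from a system-level install, since pip only installs the Python wrapper. On Streamlit Community Cloud, apt packages are usually listed in a packages.txt file; whether the Debian package ncbi-blast+ resolves this particular app is an untested assumption.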


Hi, I did try adding BLAST+ as a dependency via a conda environment.yml file, since BLAST+ didn’t appear to be installable via pip. Here is the link to my .yml file. Unfortunately, I am still getting the error message '/bin/sh: 1: blastp: not found'.

I did a quick search to see if I could find another use case, and I found an example with blast where the dependency was recorded as blast instead of blast-plus. Did you by chance try it both ways? (I am kind of stabbing in the dark here, but just checking.)

Here was the gist I found: blast yaml file · GitHub

Yep, I’ve tried both. I separated blast and blast-plus into two separate apps (blast .yml link here, blast-plus .yml link here) and weirdly I’m getting new errors for both. It appears to be a streamlit/conda issue that I’m having a difficult time resolving. Thanks for all of your help and ideas so far!

Here is the full error message for the blast version, which is the same I get for the blast-plus version. The important line seems to be OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /home/appuser/.local/lib/python3.9/site-packages/certifi/cacert.pem but I don’t understand why I’m getting that.

# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<


    Traceback (most recent call last):
      File "/home/appuser/.conda/lib/python3.9/site-packages/conda/exceptions.py", line 1129, in __call__
        return func(*args, **kwargs)
      File "/home/appuser/.conda/lib/python3.9/site-packages/conda_env/cli/main.py", line 80, in do_call
        exit_code = getattr(module, func_name)(args, parser)
      File "/home/appuser/.conda/lib/python3.9/site-packages/conda/notices/core.py", line 75, in wrapper
        display_notices(
      File "/home/appuser/.conda/lib/python3.9/site-packages/conda/notices/core.py", line 39, in display_notices
        channel_notice_responses = http.get_notice_responses(channel_name_urls, silent=silent)
      File "/home/appuser/.conda/lib/python3.9/site-packages/conda/notices/http.py", line 36, in get_notice_responses
        return tuple(
      File "/home/appuser/.conda/lib/python3.9/site-packages/conda/notices/http.py", line 39, in <genexpr>
        (
      File "/home/appuser/.conda/lib/python3.9/concurrent/futures/_base.py", line 609, in result_iterator
        yield fs.pop().result()
      File "/home/appuser/.conda/lib/python3.9/concurrent/futures/_base.py", line 439, in result
        return self.__get_result()
      File "/home/appuser/.conda/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
        raise self._exception
      File "/home/appuser/.conda/lib/python3.9/concurrent/futures/thread.py", line 58, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/home/appuser/.conda/lib/python3.9/site-packages/conda/notices/http.py", line 42, in <lambda>
        lambda args: get_channel_notice_response(*args), url_and_names
      File "/home/appuser/.conda/lib/python3.9/site-packages/conda/notices/cache.py", line 37, in wrapper
        return_value = func(url, name)
      File "/home/appuser/.conda/lib/python3.9/site-packages/conda/notices/http.py", line 58, in get_channel_notice_response
        resp = session.get(url, allow_redirects=False, timeout=5)  # timeout: connect, read
      File "/home/appuser/.conda/lib/python3.9/site-packages/requests/sessions.py", line 600, in get
        return self.request("GET", url, **kwargs)
      File "/home/appuser/.conda/lib/python3.9/site-packages/requests/sessions.py", line 587, in request
        resp = self.send(prep, **send_kwargs)
      File "/home/appuser/.conda/lib/python3.9/site-packages/requests/sessions.py", line 701, in send
        r = adapter.send(request, **kwargs)
      File "/home/appuser/.conda/lib/python3.9/site-packages/requests/adapters.py", line 460, in send
        self.cert_verify(conn, request.url, verify, cert)
      File "/home/appuser/.conda/lib/python3.9/site-packages/requests/adapters.py", line 263, in cert_verify
        raise OSError(
    OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /home/appuser/.local/lib/python3.9/site-packages/certifi/cacert.pem


`$ /home/appuser/.conda/bin/conda-env update -n base --file environment.yml`

  environment variables:
                 CIO_TEST=<not set>
  CONDA_AUTO_UPDATE_CONDA=false
               CONDA_ROOT=/home/appuser/.conda
           CURL_CA_BUNDLE=<not set>
                     PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    PYTHON_GET_PIP_SHA256=36c6f6214694ef64cc70f4127ac0ccec668408a93825359d998fb31d24968d67
       PYTHON_GET_PIP_URL=https://github.com/pypa/get-pip/raw/6d265be7a6b5bc4e9c5c07646aee0bf0394be03d/public/get-pip.py
       PYTHON_PIP_VERSION=22.0.4
PYTHON_SETUPTOOLS_VERSION=58.1.0
           PYTHON_VERSION=3.9.15
       REQUESTS_CA_BUNDLE=<not set>
            SSL_CERT_FILE=<not set>

     active environment : None
       user config file : /home/appuser/.condarc
 populated config files : /home/appuser/.condarc
          conda version : 22.9.0
    conda-build version : not installed
         python version : 3.9.12.final.0
       virtual packages : __linux=5.10.133=0
                          __glibc=2.31=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /home/appuser/.conda  (writable)
      conda av data dir : /home/appuser/.conda/etc/conda
  conda av metadata url : None
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/appuser/.conda/pkgs
       envs directories : /home/appuser/.conda/envs
               platform : linux-64
             user-agent : conda/22.9.0 requests/2.28.1 CPython/3.9.12 Linux/5.10.133+ debian/11 glibc/2.31
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False



An unexpected error has occurred. Conda has prepared the above report.

[17:32:24] ❗️ installer returned a non-zero exit code
[17:32:24] ❗️ Error during processing dependencies! Please fix the error and push an update, or try restarting the app.
[17:33:55] ❗️ Streamlit server consistently failed status checks
[17:33:55] ❗️ Please fix the errors, push an update to the git repo, or reboot the app.
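
A note on reading the report above: the failing code runs from /home/appuser/.conda/..., while the certificate path it rejects sits under /home/appuser/.local/..., and REQUESTS_CA_BUNDLE, CURL_CA_BUNDLE, and SSL_CERT_FILE are all listed as not set. In an environment where Python can be run, a rough diagnostic like the one below lists which CA bundle locations are configured and whether they exist. `ca_bundle_report` is a hypothetical helper name, and since the failure here happens during dependency installation, this is only a sketch for local or post-install inspection.

```python
import os
import ssl

def ca_bundle_report(env=None):
    """Map each CA-bundle setting to (path, exists_on_disk)."""
    env = os.environ if env is None else env
    report = {}
    # Environment variables that requests, curl, and OpenSSL consult
    for name in ("REQUESTS_CA_BUNDLE", "CURL_CA_BUNDLE", "SSL_CERT_FILE"):
        value = env.get(name)
        report[name] = (value, bool(value) and os.path.exists(value))
    # Defaults compiled into the OpenSSL build that Python links against
    default = ssl.get_default_verify_paths()
    for label, path in (("openssl_cafile", default.cafile),
                        ("openssl_capath", default.capath)):
        report[label] = (path, bool(path) and os.path.exists(path))
    return report

for key, (path, ok) in ca_bundle_report().items():
    print(f"{key}: {path!r} exists={ok}")
```

An entry whose path is set but does not exist would match the "invalid path" wording in the OSError.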

When I try to get to the BLAST+ install files from here, I do get a username/password prompt from the FTP link. I did see in a gist that someone specified some security information in the configuration for BLAST+ (for something other than Streamlit Cloud), so my non-expert thought is that if this is possible, it will be necessary to manually copy the setup files for BLAST+ into your repository. However, I really am wondering if Streamlit Cloud will be able to accommodate the installation. I don’t know how complex of a thing BLAST+ is. It says it’s command line tools (rather than a Python package), so I don’t know if there is a way to get it on Streamlit Cloud or not. I haven’t personally done anything vastly complicated with the system myself.

Hopefully someone with deeper knowledge can chime in here, as I don’t think I’ll be quick to figure out more. :slight_smile:

Thanks for all the work to look into it! I’ve been downloading the tools using conda (blast, blast-plus), so I didn’t need to copy BLAST+ setup files or provide a username/password.

I’ll also add that I’ve tried troubleshooting the above error OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /home/appuser/.local/lib/python3.9/site-packages/certifi/cacert.pem based on this previous forum post, with little success. I get one of two errors depending on my setup:

  • If I set the .yml python requirement to >= 3.9 and deploy with python3.10 in the advanced settings, the app loads but can’t find any packages that were installed via pip
  • If I set the .yml python requirement to == 3.10 and deploy with 3.10, the app fails to load with an error reading: ModuleNotFoundError: No module named 'conda.cli.main_info'

Maybe unrelated, but where in the repo are you putting the dependencies file? I noticed that the app is within the lefi/ folder and you were putting environment.yml in that same folder as well. I remember Streamlit Cloud failing if the dependencies file was not in the root of the repo, or if you had multiple dependency files – then the whole thing had to be rebooted.

Good point – I made sure to remove the environment.yml file in the root lefi/ directory, but I’m still getting similar conda issues. For example, this app (github repo here), using python3.10 during deployment, gives this error: ModuleNotFoundError: No module named 'conda.cli.main_info'

Generally, it’d be helpful to get more useful conda environment error messages; it’s very difficult to troubleshoot things like this.

This seems to work in a separate repo using a requirements.txt. I wasn’t sure what kind of file the app expects, so to test further I just put in some random FASTA sequence and got an error (unrelated to dependencies).

What type of error did you get? It would be great to replicate what you made, since I’ve had issues using the requirements.txt approach since the blast package seems to require conda for installation. It’s possible that the error you got is related to what I’ve been dealing with.

I’d initially linked a requirements file I had been using but I unfortunately deleted it to try troubleshooting some of the other options (in retrospect, I should’ve kept things in separate directories).

It was not related to a missing dependency, it is an IndexError: string index out of range, coming from the process_fasta function. (Just curious, why not use Bio.SeqIO to parse the file?)

Traceback:
Traceback (most recent call last):

  File "/home/appuser/venv/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 564, in _run_script
    exec(code, module.__dict__)
  File "/app/lefi_blastplus/pages/generate_lefi_data.py", line 175, in <module>
    nodes, edges = fasta_to_dicts(fasta_file)
  File "/app/lefi_blastplus/pages/generate_lefi_data.py", line 50, in fasta_to_dicts
    fasta_dict = process_fasta(fasta_file)
  File "/app/lefi_blastplus/pages/generate_lefi_data.py", line 26, in process_fasta
    while lines[new_index][0] != '>':
IndexError: string index out of range
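
For what it’s worth, "string index out of range" on lines[new_index][0] is what indexing an empty string raises, so a blank line in the input is one plausible trigger. Below is a minimal sketch of a parser that tolerates blank lines, using only the standard library. `parse_fasta` and the sample text are made up for illustration and are not the app’s actual process_fasta function.

```python
def parse_fasta(text):
    """Parse FASTA text into {header: sequence}, skipping blank lines."""
    records = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # an empty line has no character at index 0
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line  # join wrapped sequence lines
    return records

example = ">a\nMKT\n\nAYI\n>b\nQRS\n"  # made-up input containing a blank line
print(parse_fasta(example))
```

Using startswith(">") instead of line[0] avoids the IndexError on empty lines. Bio.SeqIO.parse(handle, "fasta") is the library route to the same result and also handles record descriptions.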

No good reason. I realized my mistake, though, and fixed the bug here. It was a very dumb error from when I tried to make the parsing better, and I should probably just move to the SeqIO method.

Any other thoughts on how to get this to work? I’m still finding the conda environment-related error messages difficult to understand. Thanks for everyone’s input so far; I realize this is a bit tricky.