Shell script "dvc pull" not working

I use os.system("dvc pull") to load a .csv data file (labeled_projects.csv) from my Google service account (Google Drive), and it had been working well since I deployed it a few months ago. The code itself is loaded from my GitHub account.

However, the code suddenly stopped loading the .csv file, and I now get the error message FileNotFoundError: [Errno 2] No such file or directory: '/mount/src/mlops/data/labeled_projects.csv'.

The Streamlit server provides no error message regarding the execution of os.system("dvc pull").

Replacing os.system("dvc pull") with a .sh file created with the tempfile package and executed with the subprocess package does not help. I get the same FileNotFoundError message, and still no error message about “dvc pull”.

Also, running the command find . -name 'labeled_projects.csv' on the Streamlit server returns no matches.

The code “dvc pull” works fine if executed locally.

Thanks for your help!

Hi @tonypeng

I’ve recently used Popen from subprocess to run a shell script file. There’s also subprocess.run, which you could try as well. In either case, please look into the stdout and/or stderr parameters so that any error messages are captured.

Here’s a code snippet that you can add to your Streamlit app and see if that helps.

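import subprocess

# Run the script and capture both output streams so that errors are not lost.
process = subprocess.Popen(['bash', 'run.sh'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# communicate() waits for the script to finish and returns the captured bytes.
stdout, stderr = process.communicate()
print("Output:", stdout.decode())
print("Error:", stderr.decode())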
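
For the subprocess.run route mentioned above, a minimal sketch could look like the following. The helper name run_and_capture is hypothetical, and the example command calls the Python interpreter rather than run.sh so that it runs anywhere; swap in ['bash', 'run.sh'] for the real script.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return (returncode, stdout, stderr)."""
    # capture_output=True collects stdout and stderr so they can be displayed;
    # text=True decodes both streams to str.
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Placeholder command (an assumption for illustration only).
    code, out, err = run_and_capture([sys.executable, "-c", "print('hello')"])
    print("Return code:", code)
    print("Output:", out)
    print("Error:", err)
```

A non-zero return code together with the captured stderr is usually the quickest hint as to why a command did nothing.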

Thanks @dataprofessor for the code snippet.

Using the code snippet you provided, it appears that, in my case, a plain shell call of dvc pull from within the Streamlit app.py file does not reach the dvc executable that is installed on the Streamlit server from the requirements.txt file.
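
One way to check this kind of mismatch, as a rough sketch: compare what a bare dvc command resolves to on PATH with the interpreter that is running the app. The helper name describe_dvc_lookup is hypothetical and is not part of dvc or Streamlit.

```python
import importlib.util
import shutil
import sys


def describe_dvc_lookup():
    """Collect facts about where dvc would be found from this process."""
    return {
        # Executable that a bare "dvc" command resolves to, or None if it is not on PATH.
        "dvc_on_path": shutil.which("dvc"),
        # Interpreter that is running this script (and therefore the app).
        "python": sys.executable,
        # Whether the dvc package can be imported by this interpreter.
        "dvc_importable": importlib.util.find_spec("dvc") is not None,
    }


if __name__ == "__main__":
    for key, value in describe_dvc_lookup().items():
        print(f"{key}: {value}")
```

If dvc_on_path is None while dvc_importable is True, the package is installed but the shell cannot find the command, which would match the behaviour described here.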

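The following code solved my problem. It runs dvc as a module through sys.executable, the Python interpreter that is running the app, instead of relying on the shell to find a dvc command.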

import sys
import subprocess
import streamlit as st

def pull_data_with_dvc():
    cmd = [sys.executable, "-m", "dvc", "pull"]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        st.write("Data pulled successfully!")
        st.write(result.stdout)
    else:
        st.write("Error pulling data!")
        st.write(result.stderr)

# Use this function somewhere in your Streamlit app.
pull_data_with_dvc()
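
A possible variant, as a sketch only: run the pull and then confirm that the expected file is actually there, so a silent failure surfaces right away instead of as a later FileNotFoundError. The function name pull_and_verify and its cmd parameter are hypothetical; the default command is the one used above.

```python
import subprocess
import sys
from pathlib import Path


def pull_and_verify(expected_file, cmd=None):
    """Run the pull command, then confirm that expected_file exists."""
    if cmd is None:
        # Same invocation as above: dvc run through the current interpreter.
        cmd = [sys.executable, "-m", "dvc", "pull"]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the captured error text instead of failing silently.
        raise RuntimeError(f"Command failed: {result.stderr.strip()}")
    if not Path(expected_file).exists():
        raise FileNotFoundError(f"Command succeeded but {expected_file} is missing")
    return result.stdout
```

In the app this could be called as pull_and_verify("data/labeled_projects.csv"), assuming that relative path matches the repository layout.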

Thanks again!


Glad to see that you’ve found a working implementation, congrats!
