pytesseract.pytesseract.TesseractNotFoundError

I am trying to deploy an app that uses OCR. The following is the error I face:

Here is my requirements.txt:

arrow==1.2.3
numpy==1.24.1
opencv_python_headless==4.7.0.68
pandas==1.4.4
Pillow==9.4.0
pytesseract==0.3.10
python_dateutil==2.8.2
streamlit==1.17.0

Here is my packages.txt:

tesseract-ocr

I have tried the solution from the GitHub issue "Pytesseract", but it did not work for me.

Can someone help?


Did you do a manual reboot after changing the environment? I recommend always doing that. From your Community Cloud admin panel, where you see your apps listed, click the three dots to the right of your app and select reboot.

Can you share the output of the console after that clean reboot to see if there are any messages about environment setup? Also, can you link your GitHub repo? Sometimes there are trivial typos in a filename like package.txt instead of packages.txt, so it’s good to verify those details, too.

If all other errors can be ruled out, it might help to set the path manually:

import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'/usr/bin/tesseract'
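If you would rather not hard-code the path, a minimal sketch along these lines looks the binary up first. The helper name `find_tesseract` and the fallback locations are assumptions for illustration, not part of pytesseract; adjust them to wherever your environment installs Tesseract.

```python
import os
import shutil


def find_tesseract(fallbacks=("/usr/bin/tesseract", "/usr/local/bin/tesseract")):
    """Return a path to the tesseract binary, or None if it cannot be found.

    `fallbacks` is a hypothetical list of common install locations.
    """
    # Prefer whatever is already on PATH.
    found = shutil.which("tesseract")
    if found:
        return found
    # Otherwise try the assumed fallback locations one by one.
    for candidate in fallbacks:
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


tesseract_path = find_tesseract()

try:
    import pytesseract

    if tesseract_path:
        # Same attribute as in the snippet above, set only when a binary exists.
        pytesseract.pytesseract.tesseract_cmd = tesseract_path
except ImportError:
    # pytesseract is not installed in this environment.
    pass
```

If `find_tesseract()` returns `None`, the binary is probably not installed at all, which points back to packages.txt not being picked up rather than to a path problem.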

It was actually an issue with the structure of the directory. Moving the packages.txt file to the root directory as mentioned in the Streamlit docs helped clear this error.
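For anyone who wants to check this before pushing, here is a rough sketch that looks for a misplaced or misnamed packages.txt. The function name `check_packages_file` is made up for this example, and it only assumes what is described above: that the file has to sit in the repository root.

```python
from pathlib import Path


def check_packages_file(repo_root):
    """Return a list of problems with packages.txt placement (empty if none)."""
    root = Path(repo_root)
    problems = []

    # The file is expected directly in the repository root.
    if not (root / "packages.txt").is_file():
        problems.append("packages.txt is missing from the repository root")

    # Copies inside subdirectories are reported as misplaced.
    for path in root.rglob("packages.txt"):
        if path.parent != root:
            problems.append(f"packages.txt found in subdirectory: {path.relative_to(root)}")

    # A common typo in the filename.
    if (root / "package.txt").is_file():
        problems.append("found package.txt; the expected name is packages.txt")

    return problems


if __name__ == "__main__":
    # Run from the top of your local checkout.
    for problem in check_packages_file("."):
        print(problem)
```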

Thank you for your suggestions, I believe these will be useful for users when they are trying to debug an error.


Could you share your GitHub repository?