ImportError: `rapidocr-onnxruntime` package not found even though the package is listed in requirements.txt

Hello community,

I am having a dependency issue in my app: https://bex-rag-tutorial.streamlit.app/

The app works fine most of the time, but one critical feature needs RapidOCR to read text from images, which requires the rapidocr-onnxruntime package. I have listed it in requirements.txt, and the build logs show that it is installed. But when I upload an image and click “Process file”, I get this error (short version):

ImportError: `rapidocr-onnxruntime` package not found, please install it with `pip install rapidocr-onnxruntime`

Here is the full traceback:

ImportError: libGL.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

────────────────────── Traceback (most recent call last) ───────────────────────
  /home/adminuser/venv/lib/python3.12/site-packages/streamlit/runtime/scriptrunner/exec_code.py:85 in exec_func_with_error_handling

  /home/adminuser/venv/lib/python3.12/site-packages/streamlit/runtime/scriptrunner/script_runner.py:576 in code_to_exec

  /mount/src/rag_tutorial_hackernoon/app.py:37 in <module>

    34 │   │   │   │
    35 │   │   │   │   try:
    36 │   │   │   │   │   # Process the document
  ❱ 37 │   │   │   │   │   chunks = process_document(uploaded_file.name)
    38 │   │   │   │   │
    39 │   │   │   │   │   # Create RAG chain
    40 │   │   │   │   │   st.session_state.rag_chain = create_rag_chain(chunk

  /mount/src/rag_tutorial_hackernoon/src/document_processor.py:16 in process_document

    13 │   if source.lower().endswith(".pdf"):
    14 │   │   return process_pdf(source)
    15 │   elif source.lower().endswith((".png", ".jpg", ".jpeg")):
  ❱ 16 │   │   return process_image(source)
    17 │   else:
    18 │   │   raise ValueError(f"Unsupported file type: {source}")
    19

  /mount/src/rag_tutorial_hackernoon/src/document_processor.py:44 in process_image

    41 │   # Extract text from image using OCR
    42 │   with open(source, "rb") as image_file:
    43 │   │   image_bytes = image_file.read()
  ❱ 44 │   extracted_text = extract_from_images_with_rapidocr([image_bytes])
    45 │   documents = [Document(page_content=extracted_text, metadata={"sourc
    46 │   return split_documents(documents)
    47

  /home/adminuser/venv/lib/python3.12/site-packages/langchain_community/document_loaders/parsers/pdf.py:70 in extract_from_images_with_rapidocr

     67 │   try:
     68 │   │   from rapidocr_onnxruntime import RapidOCR
     69 │   except ImportError:
  ❱  70 │   │   raise ImportError(
     71 │   │   │   "`rapidocr-onnxruntime` package not found, please install
     72 │   │   │   "`pip install rapidocr-onnxruntime`"
     73 │   │   )
────────────────────────────────────────────────────────────────────────────────
ImportError: `rapidocr-onnxruntime` package not found, please install it with `pip install rapidocr-onnxruntime`
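
The first exception in the traceback (libGL.so.1) suggests the import itself fails on a missing shared library, and the `except ImportError` in langchain_community then replaces it with the generic "package not found" message. A minimal sketch for surfacing the underlying error is below. `diagnose_import` is a hypothetical helper I made up for this, and `cv2` is only my guess at which dependency needs libGL; only `rapidocr_onnxruntime` comes from the traceback.

```python
import importlib


def diagnose_import(module_name: str) -> str:
    """Try to import a module and report the original error message, if any."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # Keep the real message (e.g. a missing shared library) instead of
        # hiding it behind a generic "package not found" hint.
        return f"{module_name} failed to import: {exc}"
    return f"{module_name} imported successfully"


if __name__ == "__main__":
    # "cv2" is an assumption; "rapidocr_onnxruntime" is the module named in the traceback.
    for name in ("cv2", "rapidocr_onnxruntime"):
        print(diagnose_import(name))
```

Running this in the deployed environment (for example, printing the result with `st.write`) should show whether the package is really missing or whether it is installed but cannot load a system library.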

Any ideas on how to solve it?

My Python version is 3.9.19 and Streamlit is 1.37.1.
