Summary
ImportError: cannot import name 'HOCRConverter' from 'pdfminer.converter' (/Users/aaditroychowdhury/Documents/OpenAI/PDF_Chatbot_V2/venv/lib/python3.9/site-packages/pdfminer/converter.py)
Traceback:
File "/Users/aaditroychowdhury/Documents/OpenAI/PDF_Chatbot_V2/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
File "/Users/aaditroychowdhury/Documents/OpenAI/PDF_Chatbot_V2/app.py", line 4, in <module>
    from Helper_Module import *
File "/Users/aaditroychowdhury/Documents/OpenAI/PDF_Chatbot_V2/Helper_Module.py", line 14, in <module>
    from pdfminer.high_level import extract_text
File "/Users/aaditroychowdhury/Documents/OpenAI/PDF_Chatbot_V2/venv/lib/python3.9/site-packages/pdfminer/high_level.py", line 8, in <module>
    from .converter import (
I keep getting the above error even though all the dependencies are installed and my requirements.txt file is accurate.
Steps to reproduce
Code snippet:
import os
import re
import glob
import shutil
import tabula
import requests
import pandas as pd
import streamlit as st
from typing import List
from PyPDF2 import PdfReader
from datetime import datetime
from bs4 import BeautifulSoup
from io import StringIO
from pdfminer.high_level import extract_text
from langchain.vectorstores import Chroma
from langchain.docstore.document import Document
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import (
    TextLoader,
    PDFMinerLoader,
    UnstructuredWordDocumentLoader,
    CSVLoader,
    UnstructuredHTMLLoader,
    UnstructuredODTLoader,
    UnstructuredPowerPointLoader,
)
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
import Config_Paremeters
from Config_Streamlit import *
from dotenv import load_dotenv
I have provided the import statements above.
Debug info
- Python version: 3.9.6
- Using PipEnv
- OS version: Version 114.0.5735.133 (Official Build) (arm64)
- Browser version: Version 114.0.5735.133 (Official Build) (arm64)
Requirements file
aiosignal==1.3.1
altair==5.0.1
anyio==3.7.0
appnope==0.1.3
asttokens==2.2.1
async-timeout==4.0.2
attrs==23.1.0
backcall==0.2.0
backoff==2.2.1
beautifulsoup4==4.12.2
blinker==1.6.2
cachetools==5.3.1
camelot-py==0.11.0
certifi==2023.5.7
cffi==1.15.1
chardet==5.1.0
charset-normalizer==3.1.0
chromadb==0.3.26
click==8.1.3
clickhouse-connect==0.6.3
coloredlogs==15.0.1
comm==0.1.3
contourpy==1.1.0
cryptography==41.0.1
cycler==0.11.0
dataclasses-json==0.5.8
debugpy==1.6.7
decorator==5.1.1
distro==1.8.0
duckdb==0.8.1
et-xmlfile==1.1.0
exceptiongroup==1.1.1
executing==1.2.0
faiss-cpu==1.7.4
fastapi==0.97.0
filelock==3.12.2
flatbuffers==23.5.26
fonttools==4.40.0
frozenlist==1.3.3
fsspec==2023.6.0
ghostscript==0.7
gitdb==4.0.10
GitPython==3.1.31
greenlet==2.0.2
h11==0.14.0
hnswlib==0.7.0
httptools==0.5.0
huggingface-hub==0.15.1
humanfriendly==10.0
idna==3.4
importlib-metadata==6.7.0
ipykernel==6.23.2
ipython==8.14.0
ipywidgets==8.0.6
jedi==0.18.2
Jinja2==3.1.2
joblib==1.2.0
jsonschema==4.17.3
jupyter_client==8.2.0
jupyter_core==5.3.1
jupyterlab-widgets==3.0.7
kiwisolver==1.4.4
langchain==0.0.200
langchainplus-sdk==0.0.16
lz4==4.3.2
markdown-it-py==3.0.0
MarkupSafe==2.1.3
marshmallow==3.19.0
marshmallow-enum==1.5.1
matplotlib==3.7.1
matplotlib-inline==0.1.6
mdurl==0.1.2
monotonic==1.6
mpmath==1.3.0
multidict==6.0.4
mypy-extensions==1.0.0
nest-asyncio==1.5.6
nltk==3.8.1
numexpr==2.8.4
numpy==1.25.0
onnxruntime==1.15.1
openai==0.27.8
openapi-schema-pydantic==1.2.4
openpyxl==3.1.2
overrides==7.3.1
packaging==23.1
pandas==2.0.2
parso==0.8.3
pdfminer.six==20221105
pdfminer3k==1.3.4
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.5.0
platformdirs==3.8.0
ply==3.11
posthog==3.0.1
prompt-toolkit==3.0.38
protobuf==4.23.3
psutil==5.9.5
ptyprocess==0.7.0
pulsar-client==3.2.0
pure-eval==0.2.2
pyarrow==12.0.1
pycparser==2.21
pydantic==1.10.9
pydeck==0.8.1b0
Pygments==2.15.1
Pympler==1.0.1
PyMuPDF==1.22.5
pyparsing==3.1.0
pypdf==3.11.0
PyPDF2==3.0.1
pyrsistent==0.19.3
python-dateutil==2.8.2
python-dotenv==1.0.0
pytz==2023.3
pytz-deprecation-shim==0.1.0.post0
PyYAML==6.0
pyzmq==25.1.0
regex==2023.6.3
requests==2.31.0
rich==13.4.2
safetensors==0.3.1
six==1.16.0
smmap==5.0.0
sniffio==1.3.0
soupsieve==2.4.1
SQLAlchemy==2.0.16
stack-data==0.6.2
starlette==0.27.0
streamlit==1.24.0
streamlit-chat==0.0.2.2
sympy==1.12
tabula-py==2.7.0
tabulate==0.9.0
tenacity==8.2.2
tiktoken==0.4.0
tk==0.1.0
tokenizers==0.13.3
toml==0.10.2
toolz==0.12.0
tornado==6.3.2
tqdm==4.65.0
traitlets==5.9.0
transformers==4.30.2
typing-inspect==0.9.0
typing_extensions==4.6.3
tzdata==2023.3
tzlocal==4.3.1
urllib3==2.0.3
uvicorn==0.22.0
uvloop==0.17.0
validators==0.20.0
watchdog==3.0.0
watchfiles==0.19.0
wcwidth==0.2.6
websockets==11.0.3
widgetsnbextension==4.0.7
yarl==1.9.2
zipp==3.15.0
zstandard==0.21.0
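
One thing that stands out in the list above is that it pins both pdfminer.six and pdfminer3k. Both distributions install a top-level package named `pdfminer`, so one may be overwriting files from the other, which could explain a `converter.py` that lacks `HOCRConverter`. This is a possible lead, not a confirmed cause. The sketch below is a minimal diagnostic, run inside the same virtual environment, that lists which installed distributions claim a given import name. The helper name `distributions_providing` is made up for illustration; it relies only on the standard-library `importlib.metadata` module.

```python
from importlib import metadata


def distributions_providing(import_name):
    """Return the sorted names of installed distributions that ship
    a top-level package or module called `import_name`."""
    owners = set()
    for dist in metadata.distributions():
        # Many distributions record their top-level names in top_level.txt.
        top_level = dist.read_text("top_level.txt") or ""
        names = {line.strip() for line in top_level.splitlines() if line.strip()}
        if not names and dist.files:
            # Fallback: take the first path component of each recorded file.
            names = {f.parts[0] for f in dist.files if len(f.parts) > 1}
        if import_name in names:
            dist_name = dist.metadata.get("Name")
            if dist_name:
                owners.add(dist_name)
    return sorted(owners)


# More than one name in this list means several distributions
# write into the same site-packages/pdfminer directory.
print(distributions_providing("pdfminer"))
```

If two distributions show up, uninstalling both and then reinstalling only the one the project actually needs would be a reasonable next step to try.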