Hello, I am getting an error when running an app in the cloud, although it works locally. I suspect the underlying issue comes from the camelot package, which somehow does not work in the cloud even though the same version is installed in both environments. Has anyone successfully used this package in their apps for PDF processing? This is the call I am making:
tables = camelot.read_pdf(self.file_path, pages=str(page_number + 1))
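To narrow this down, here is a minimal sketch of a wrapper around that call that logs the full camelot error in the cloud logs before re-raising it, so the failure is not hidden. The function name `read_tables_verbose` and the `reader` parameter are hypothetical and not part of my repo; `reader` only exists so the wrapper can be tried without camelot installed.

```python
import logging
import traceback

logger = logging.getLogger(__name__)


def read_tables_verbose(file_path, page_number, reader=None):
    """Read the tables of one page and log any failure in full.

    `reader` is a hypothetical injection point; when it is None,
    camelot.read_pdf is imported and used.
    """
    try:
        if reader is None:
            # Imported inside the try block so a failing import is logged too.
            import camelot
            reader = camelot.read_pdf
        # camelot expects 1-based page numbers passed as a string.
        return reader(file_path, pages=str(page_number + 1))
    except Exception:
        logger.error(
            "camelot failed on page %d of %s:\n%s",
            page_number + 1,
            file_path,
            traceback.format_exc(),
        )
        raise
```

If the logged error points at a missing system library rather than the Python package, that would explain why the same camelot version behaves differently in the two environments, but that is only a guess at this point.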
Thank you
Repo:
Error logs:
[05:50:14] 🔄 Updated app!
[22:54:56] 🐙 Pulling code changes from Github...
[22:54:57] 📦 Processing dependencies...
[22:54:57] 📦 Processed dependencies!
[22:55:01] 🔄 Updated app!
[2024-03-06T04:54:25Z WARN lance::dataset] No existing dataset at /mount/src/untitledassitanttool/src/InformationProcessor/../STM\company_data_de59793c-6562-466f-adba-4065b5f397f4/pdf_extracted_content.lance, it will be created
2024-03-06 04:54:25.699 Uncaught app exception
Traceback (most recent call last):
  File "/home/adminuser/venv/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
    exec(code, module.__dict__)
  File "/mount/src/untitledassitanttool/main.py", line 4, in <module>
    app.main()
  File "/mount/src/untitledassitanttool/src/ui/streamlit_app.py", line 124, in main
    response = generate_llm_response(prompt)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mount/src/untitledassitanttool/src/ui/streamlit_app.py", line 96, in generate_llm_response
    ingestor.file_broker() # Process files only once
    ^^^^^^^^^^^^^^^^^^^^^^
  File "/mount/src/untitledassitanttool/src/InformationProcessor/ingestors.py", line 62, in file_broker
    self.ingest_pdf(file, open_ai=self.ai_credentials)
  File "/mount/src/untitledassitanttool/src/InformationProcessor/ingestors.py", line 111, in ingest_pdf
    processed_pdf, size = call_pdf_preprocess(payload, open_ai)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mount/src/untitledassitanttool/src/InformationProcessor/preprocessor.py", line 156, in call_pdf_preprocess
    extracted_text, page_count = pdf_processor.parse_pdf()
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mount/src/untitledassitanttool/src/InformationProcessor/preprocessor.py", line 87, in parse_pdf
    for page_text, images, _ in results:
        ^^^^^^^^^^^^^^^^^^^^
TypeError: cannot unpack non-iterable NoneType object
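
The TypeError suggests that at least one entry in `results` is None, meaning the per-page processing returned nothing for that page instead of a 3-tuple. Below is a rough sketch of how the loop could skip such entries and report which page failed. The helper name `iter_page_results` is hypothetical, and the shape of `results` (3-tuples, with None for a failed page) is an assumption based only on the traceback.

```python
def iter_page_results(results):
    """Yield (page_index, page_text, images, extra) for pages that produced output.

    Assumes `results` is an iterable of 3-tuples, with None standing in
    for a page whose processing returned nothing.
    """
    for index, result in enumerate(results):
        if result is None:
            # Report the 1-based page number so the failing page can be inspected.
            print(f"page {index + 1}: no result returned, skipping")
            continue
        page_text, images, extra = result
        yield index, page_text, images, extra
```

This only avoids the crash and shows which pages are affected. The page processing itself would still need to be fixed so that it stops returning None in the cloud.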