Make a chatbot that reads and answers from PDF files

I need your guidance on building a Streamlit app. I want to make an online, web-based app for my personal use that works like ChatGPT. I mean it should answer the way ChatGPT does, but only from my personal data, e.g. a complete book or even several books.

Hey @Arslan_Zahid, check out our blog post on using LlamaIndex for a very similar use case!

Thanks a lot, Caroline, for your response. I did try your code, but I am facing this error:

ImportError: pypdf is required to read PDF files: pip install pypdf
I am trying to retrieve data from a PDF file.

Can you please help me?

Is pypdf installed/included in your requirements.txt file?

Yes, it is included.

Can you share the link to your GitHub repo?

Thank you Caroline.
Link:

This seems to be an error coming from LlamaIndex – I was able to resolve it by deleting the PyPDF import statement and deploying it with Python 3.11 (see forked repo here)

Again, thanks Caroline. I am moved by the way you are trying to help me.
The world really needs people like you, who always help others without even knowing them.
I have tried to follow the steps based on the forked repo you shared.
But I am still getting this error:
App - https://hellow-ewcdyf6xtecpudp8tuptlw.streamlit.app/
ImportError: pypdf is required to read PDF files: pip install pypdf
Repo - GitHub - Arzal007/hellow


Did you delete the deployed app and redeploy it using Python 3.11 instead? This error seems to be related to LlamaIndex and Python 3.9. Here’s our doc on picking your Python version when you re-deploy the app.
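To confirm which Python version the deployed app is actually running, and whether pypdf can be imported there, a small diagnostic along these lines may help. This is only a sketch using the standard library; the function name `environment_report` is made up for illustration and is not part of Streamlit or LlamaIndex.

```python
import importlib.util
import sys


def environment_report(package: str = "pypdf") -> dict:
    """Return the running Python version and whether `package` can be imported."""
    return {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "package": package,
        # find_spec returns None when the top-level package is not installed
        "installed": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    print(environment_report())
```

Inside the app, the returned dictionary could be displayed with `st.write(environment_report())` to check the environment after a redeploy.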


It works!!!
Thanks a ton…
Can I also add multiple PDF files to the data repository? Is there any limit?

I mean in the data folder I have in the backend…

Awesome. There isn’t a limit, but you may hit a (1 GB) resource limit on Community Cloud at some point.
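As a rough way to keep an eye on this, the PDFs in the data folder can be listed and their sizes added up. The sketch below assumes the folder is called `data` and treats 1 GB as the ceiling mentioned above; the helper names `pdf_sizes` and `within_limit` are hypothetical. File size on disk is only a proxy, since the limit applies to the app's resources as a whole and the index built from the files also takes up memory.

```python
from pathlib import Path

LIMIT_BYTES = 1 * 1024**3  # assumed 1 GB ceiling, taken from the reply above


def pdf_sizes(data_dir: str = "data") -> dict:
    """Map each PDF file name in `data_dir` to its size in bytes."""
    folder = Path(data_dir)
    if not folder.is_dir():
        return {}
    return {p.name: p.stat().st_size for p in sorted(folder.glob("*.pdf"))}


def within_limit(sizes: dict, limit: int = LIMIT_BYTES) -> bool:
    """True if the combined size of the files stays under `limit`."""
    return sum(sizes.values()) < limit


if __name__ == "__main__":
    sizes = pdf_sizes()
    for name, size in sizes.items():
        print(f"{name}: {size / 1024**2:.1f} MB")
    print("within limit:", within_limit(sizes))
```

The listing also shows which files are actually present in the folder of the deployed repo.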

Can I also use the same code for GPT-4 if I just change model="gpt-3.5-turbo" to model="gpt-4"?

Hi Caroline,
Actually, I need to interpret graphs from the data as well. So can gpt-4 be used with this code?
And for that, do I just have to change model="gpt-3.5-turbo" to model="gpt-4" in the code?
I will update the API key as well.
Also, do I have to re-deploy the app for this?

You won’t need to re-deploy it for this change; you can just update the code and update the API key.

Greetings Caroline…

If I want to chat with multiple PDF files, as I asked you before, do I only have to add them to the data directory?

I mean, will any changes be required in the code as well?

I have added 3-4 files regarding the Bain Report, but the app does not seem to read them all. Am I missing some steps?

Also, I tried to reboot the app and I am getting an error like this:

tenacity.RetryError: RetryError[<Future at 0x7f1bed6c1610 state=finished raised RateLimitError>]

I have even deleted the app, tried to re-deploy it, and deleted all the other PDF files, but I am still getting the same error:

tenacity.RetryError: RetryError[<Future at 0x7fc6ae7a1190 state=finished raised RateLimitError>]
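The traceback shows tenacity giving up after its retries, with the last attempt having raised a RateLimitError, presumably from the model provider's API. A common way to handle rate limits is to wait longer between successive attempts (exponential backoff). The sketch below illustrates that idea only; `RateLimitError` here is a stand-in class and `call_with_backoff` is a hypothetical helper, not the code LlamaIndex or tenacity actually runs.

```python
import time


class RateLimitError(Exception):
    """Stand-in for the provider's rate-limit exception (hypothetical here)."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `func`, retrying on RateLimitError with exponentially growing waits."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            # wait base_delay, then 2x, 4x, ... before the next attempt
            sleep(base_delay * 2**attempt)
```

Backoff only helps when the limit is about request frequency. If the error is caused by an exhausted quota or a billing issue on the API key, retrying will keep failing, so checking the account's usage is worth doing first.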
