DocDocGo: chatbot does "infinite" web research, creates KBs from websites or your files

Hi Dimitry @ReasonMeThis,

This is very nice indeed… I could feel it in my bones that you were about to release something cool! I have the same issue of an ever-expanding list of personal collections so will appreciate the collection name search. I love the chat export capability!

The dev guide is super-useful… which reminds me I ought to finish my “chat with DDG using email” prototype, that leverages DDG’s API.

Now that you have Omni in there, I assume at some point it might be possible to support multi-media (multimodal) content ingest.

A simple use case example is: upload an image PDF or picture and send to gpt-4o to extract text or a description of the image; then vectorize this into a DDG collection. Admittedly, pure text extraction from images can be done with OCR tooling, but the more general “chat with a picture” would be a super-interesting feature. E.g. the feature could be used to upload a screenshot of an Excel chart or any data chart prior to having a conversation with it.

I don’t think many changes would be required in DDG to be honest. You just need to distinguish between image and text docs (unless LangChain will do that), and submit the b64-encoded image to gpt-4o with a custom prompt to analyze the doc and extract text and data.

@FareedKhan-dev posted this “chat with graph” app a few months ago. It uses Gemini, but is an example of chatting with a data chart (i.e. a graph).

@Charly_Wargnier delved into some other use cases for MM LLMs like Omni.

Thanks for regularly updating DocDocGo… I use my own version of it all the time and try to merge your changes ASAP.

Cheers,
Arvindra

2 Likes