DocDocGo: chatbot does "infinite" web research, creates KBs from websites or your files

Hi Dimitry @ReasonMeThis,

This is very nice indeed… I could feel it in my bones that you were about to release something cool! I have the same issue of an ever-expanding list of personal collections so will appreciate the collection name search. I love the chat export capability!

The dev guide is super-useful… which reminds me I ought to finish my “chat with DDG using email” prototype, that leverages DDG’s API.

Now that you have Omni in there, I assume at some point it might be possible to support multi-media (multimodal) content ingest.

A simple use case example is: upload an image PDF or picture and send to gpt-4o to extract text or a description of the image; then vectorize this into a DDG collection. Admittedly, pure text extraction from images can be done with OCR tooling, but the more general “chat with a picture” would be a super-interesting feature. E.g. the feature could be used to upload a screenshot of an Excel chart or any data chart prior to having a conversation with it.

I don’t think many changes would be required in DDG to be honest. You just need to distinguish between image and text docs (unless LangChain will do that), and submit the b64-encoded image to gpt-4o with a custom prompt to analyze the doc and extract text and data.

@FareedKhan-dev posted this “chat with graph” app a few months ago. It uses Gemini, but is an example of chatting with a data chart (i.e. a graph).

@Charly_Wargnier delved into some other use cases for MM LLMs like Omni.

Thanks for regularly updating DocDocGo… I use my own version of it all the time and try to merge your changes ASAP.

Cheers,
Arvindra

2 Likes

@asehmi Thank you for the kind words! Supporting multimodality is definitely in my plans, including both images and audio (when OpenAI unlocks it). “Chat with picture” would indeed be very interesting, though of course for many use such cases using regular ChatGPT would more than suffice - so, I would need to think about how best to support multimodal use cases where DDG’s capabilities would be handy.

I’m currently implementing tracking timestamps for collection creation and updates, so that the many collections that have now accumulated could be selectively displayed based on age and unneeded old collections could be deleted or put in “cold storage”. Many other exciting updates are planned!

1 Like

:tada:Announcing DocDocGo version 0.2:tada:

DocDocGo is growing up! Let’s celebrate by describing the new features.

1. DDG finally has a logo that’s not just the :owl: character!

Its design continues the noble tradition of trying too hard to be clever, starting with the name “DocDocGo” :grin:

@StreamlitTeam continues to ship - the new st.logo feature arrived just in time!

2. Re-engineered collections

With over 150 public collections and growing, it was time to manage them better. Now, collections track how recently they were created/updated. Use:

  • /db list - see the 20 most recent collections (in a cute table)
  • /db list 21+ - see the next 20, and so on
  • /db list blah - list collections with “blah” in the name

DDG will remind you of the availablej commands when you use /db or /db list.

Safety improvement: Deleting a collection by number, e.g. /db delete 42, now only works if it was first listed by /db list.

3. Default modes

Tired of typing /reseach or /research heatseek? Let your fingers rest by selecting a default mode in the Streamlit sidebar.

You can still override it when you need to - just type the desired command as usual, for example: /help What's the difference between regular research and heatseek?.

4. UI improvements

The UI has been refreshed with a new intro screen that has:

  • The one-click sample commands to get you started if you’re new
  • The new and improved welcome message

Take the new version for a spin and let me know what you think! :rocket:

1 Like

GPT-4o mini and other updates

First, DocDocGo had a teeny-tiny issue with its database growing too gigantic for its VM. The issue is now fixed, so if you tried to use DocDocGo and couldn’t access it, my apologies - it’s now back in action! :muscle::

Updates:

For users:

  1. The default model is now GPT-4o mini (thanks, OpenAI! :rocket:) instead of GPT-3.5, so expect a significant improvement in the quality of responses. If you use DocDocGo with you own OpenAI API key, you can also switch to GPT-4o, GPT-4, or the old friend GPT-3.5.
  2. The database has been upgraded and cleaned up for better performance.

For developers:

  1. The developer docs have been significantly expanded to include lots of goodies on hosting the Chroma database server on AWS and GCP, running locally with Docker, cleaning it, etc. There is a lot more that can (and should) be added, so if something is unclear or you have any questions or suggestions, feel free to DM me here or on LinkdIn.
  2. In addition to cleaning up the database, Langchain and Chroma DB have been upgraded to the latest versions (0.2.10 and 0.5.4, respectively). A slew of deprecated imports have been updated, and a bug related to a breaking change in Langchain’s Chroma has been found and squashed :beetle:.

I hope you enjoy using DocDocGo with the shiny new GPT-4o mini! Because of the model switch and other major changes, it’s possible that there may be some hiccups that I haven’t caught yet. Feel free to reach out on GitHub if you see any :spider:, and the exterminator will be dispatched posthaste!

1 Like