GitHub QA Chatbot with Langchain!

Hello, guys!

Our project team made a GitHub Repository ChatBot.
With GitHub QA ChatBot you can

  • Ask questions about a specific GitHub repository
  • Visualize the folder structure and file contents

ChatGPT can answer questions in many fields. However, ChatGPT is not trained on the latest frameworks or the latest code, so we have to study by reading the GitHub repository ourselves. Based on this, we developed the app.

We may have to close our demo app if the OpenAI API key costs too much :smiling_face_with_tear: But we want to share it with many people, who are our strength, and we want to keep this service running for a long time.
And here are the demo app link and GitHub repository link!

For Whom?

  • For anyone who wants to learn a new repository that ChatGPT cannot answer questions about
  • For anyone who wants to learn coding by exploring a GitHub repository


The tech stack is like this:

GitHub REST API - fetching repository info with a DFS algorithm
LangChain - communicating with GPT-3.5-turbo
Vector DB - FAISS (using the MMR algorithm)
Streamlit - deployment

How does it work?

  • QA Chat Bot Page
1. Get the user and repository information through the GitHub REST API, traversing the repository with a DFS algorithm. We end up with data shaped like
{file_name_1: file_content, file_name_2: file_content, file_name_3: file_content}
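The DFS traversal above can be sketched as follows. A real version would call the GitHub REST API contents endpoint per directory; here the repository listing is a mocked nested dict (a hypothetical stand-in) so the traversal logic is runnable offline:

```python
# Sketch of the DFS step: walk a repository tree depth-first and
# collect {file_path: file_content}. Directories are dicts, files are
# strings; in the real app each directory level comes from the GitHub
# REST API instead of an in-memory dict.

def collect_files(tree, path=""):
    """Depth-first walk: return {file_path: file_content}."""
    files = {}
    for name, node in tree.items():
        full = f"{path}/{name}" if path else name
        if isinstance(node, dict):   # a directory: recurse into it
            files.update(collect_files(node, full))
        else:                        # a file: keep its content
            files[full] = node
    return files

# Hypothetical repository listing for illustration
repo = {
    "README.md": "# demo",
    "src": {"app.py": "print('hi')", "utils": {"dfs.py": "def dfs(): ..."}},
}

result = collect_files(repo)
```

The same recursion works whether the children come from a dict or from paginated API responses; only the "list this directory" call changes.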

2. Convert the dictionary data to LangChain Document types.
   We also manually add a document containing the folder structure, built with Python's anytree,
   so the QA chat bot can also answer questions about the folder structure.
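The post uses anytree for this step; here is a dependency-free sketch of the same idea, rendering the file paths from step 1 into an indented tree string that can be appended as one extra document:

```python
# Build an indented folder-structure string from the {path: content}
# dict of step 1. The real app uses anytree; this sketch nests the
# paths into plain dicts and renders them by depth instead.

def render_tree(paths):
    # Nest "a/b/c" paths into {name: subtree} (empty dict = leaf file)
    root = {}
    for p in sorted(paths):
        node = root
        for part in p.split("/"):
            node = node.setdefault(part, {})

    # Walk the nested dict, indenting two spaces per level
    def walk(node, depth=0):
        lines = []
        for name, child in node.items():
            lines.append("  " * depth + name)
            lines.extend(walk(child, depth + 1))
        return lines

    return "\n".join(walk(root))

structure_doc = render_tree(["README.md", "src/app.py", "src/utils/dfs.py"])
```

Feeding this string in as one more document is what lets the retriever answer "what does the folder layout look like?" questions.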

3. Chunk the documents into 1000-token pieces with 0 overlap and index them with FAISS.
   - We wanted to use the Pinecone vector database, but because we are on the free tier,
     we can't handle multiple users with Pinecone.
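The chunking rule (fixed size, zero overlap) can be sketched like this. LangChain's splitters count tokens; this offline sketch splits on characters as an approximation, and `chunk` is a hypothetical helper, not the app's actual splitter:

```python
# Minimal fixed-size chunking with configurable overlap. With
# overlap=0 (as in the app) consecutive chunks share no text; a
# positive overlap would repeat the tail of each chunk at the head
# of the next.

def chunk(text, size=1000, overlap=0):
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

pieces = chunk("a" * 2500, size=1000, overlap=0)
```

Each piece would then be embedded and added to the FAISS index, which serves retrieval with `search_type="mmr"`.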

4. Answer questions with the LangChain QA retriever over the FAISS index.

  • Folder Structure Page
1. Get the data with pip's anytree package.
2. With DFS, we build a tree structure.
3. For visualization, we used the streamlit-agraph library.
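The visualization step boils down to turning the file paths into node and edge lists. streamlit-agraph takes `Node` and `Edge` objects; this sketch uses plain dicts as stand-ins so the traversal is runnable offline:

```python
# Turn file paths into the node/edge lists a graph view needs: one
# node per path segment, one edge from each parent directory to its
# child. Plain dicts stand in for streamlit-agraph's Node/Edge objects.

def to_graph(paths):
    nodes, edges = [], []
    seen = set()
    for p in paths:
        parts = p.split("/")
        for i, part in enumerate(parts):
            node_id = "/".join(parts[:i + 1])
            if node_id not in seen:       # add each node only once
                seen.add(node_id)
                nodes.append({"id": node_id, "label": part})
            if i > 0:                     # link parent dir -> child
                edges.append({"source": "/".join(parts[:i]),
                              "target": node_id})
    return nodes, edges

nodes, edges = to_graph(["src/app.py", "src/utils/dfs.py"])
```

In the app, each dict would become `Node(id=..., label=...)` / `Edge(source=..., target=...)` before being handed to `agraph`.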

What are the Limitations?

- We use the GitHub REST API, which is limited to 5,000 requests per hour.
- The OpenAI API key costs money.
- So far, we can attach a memory to the LangChain QA retriever, but the bot can't actually make use of what's in that memory.
- If the repository you want to analyze has lots of files, it can take a long time.
 

Finally, I want to say thanks to the Streamlit community.
When there was something I didn't know, I asked questions in the Streamlit community,
and the people there responded kindly.


Hi @aza1200,

Thanks for sharing! Pretty cool app.


Really cool, I like its simplicity.

Until now , we can get a memory in the LangChain QA Retriever, but it can’t utilize based on what’s in its memory.

What do you mean by this?


Oh, that means …

Although we followed the LangChain memory buffer approach (ConversationalRetrievalChain + Memory · Issue #2303 · langchain-ai/langchain · GitHub),
if I ask the chatbot what my most recent question was, it can't answer that.
That means my chatbot doesn't have the ability to utilize the memory of what I asked before.

I guess the reason is Streamlit's characteristic of rerunning the script every time a button is clicked.

We embedded the code for the LangChain memory, but it doesn't work right now … haha

And here is the code:

    # Keep the conversation memory in st.session_state so it survives
    # Streamlit's script reruns. Note: st.session_state['chat_memory']
    # raises a KeyError if the key was never set, so check membership first.
    if 'chat_memory' not in st.session_state:
        st.session_state['chat_memory'] = ConversationBufferMemory(
            memory_key="chat_history", return_messages=True
        )
    memory = st.session_state['chat_memory']

    qa_chain = ConversationalRetrievalChain.from_llm(
        llm=open_ai_model,
        memory=memory,
        retriever=retriever,
        get_chat_history=lambda h: h,
        verbose=True,
    )

Thanks for your comment! :grinning: