Error when creating vector store using Streamlit with LangChain

If you’re creating a debugging post, please include the following info:

  1. Share the link to the public app (deployed on Community Cloud).
  2. Share the link to your app’s public GitHub repository (cenace-LLM/streamlit_app.py at main · anmerino-pnd/cenace-LLM · GitHub).
  3. Share the full text of the error message (not a screenshot).
Traceback:
File "/home/adminuser/venv/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 584, in _run_script
    exec(code, module.__dict__)
File "/mount/src/cenace-llm/streamlit_app.py", line 113, in <module>
    main()
File "/mount/src/cenace-llm/streamlit_app.py", line 104, in main
    vectore_store = get_vector_store(chunks)
                    ^^^^^^^^^^^^^^^^^^^^^^^^
File "/mount/src/cenace-llm/streamlit_app.py", line 55, in get_vector_store
    vector_store = FAISS.from_documents(chunks, embeddings)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/adminuser/venv/lib/python3.11/site-packages/langchain_core/vectorstores.py", line 550, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/adminuser/venv/lib/python3.11/site-packages/langchain_community/vectorstores/faiss.py", line 965, in from_texts
    embeddings = embedding.embed_documents(texts)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/adminuser/venv/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 204, in embed_documents
    embeddings = self._embed(instruction_pairs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/adminuser/venv/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 192, in _embed
    return [self._process_emb_response(prompt) for prompt in iter_]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/adminuser/venv/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 192, in <listcomp>
    return [self._process_emb_response(prompt) for prompt in iter_]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/adminuser/venv/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 163, in _process_emb_response
    raise ValueError(f"Error raised by inference endpoint: {e}")
  4. Share the Streamlit and Python versions.
streamlit==1.24.1
python==3.11.9

I’m not sure what’s happening. I’ve been following a YouTube tutorial and my code is almost identical; I only changed the embedding model, which I didn’t think would be a problem.

import streamlit as st
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

def get_vector_store(chunks):
    """Get vectors for each chunk."""
    embeddings = OllamaEmbeddings(model='gemma:2b')
    st.write(type(chunks), type(embeddings))
    # observed output: builtins.list, langchain_community.embeddings.ollama.OllamaEmbeddings
    vector_store = FAISS.from_documents(chunks, embeddings)
    return vector_store

Hi @anmerino-pnd,

Thanks for sharing this question!

When passing in the model name, can you try omitting the tag suffix? For example, use nomic-embed-text instead of nomic-embed-text:latest, or gemma instead of gemma:2b.

I would also add that gemma is a text-generation model, not an embedding model, so it might not work here. Consider a dedicated embedding model from Ollama instead, such as nomic-embed-text, mxbai-embed-large, or all-minilm (see the quick check below).
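
As a quick sanity check, you can try embedding a single string before building the whole store. This is a minimal sketch, assuming the model has already been pulled locally (ollama pull nomic-embed-text) and the Ollama server is reachable at its default address:

from langchain_community.embeddings import OllamaEmbeddings

embeddings = OllamaEmbeddings(model='nomic-embed-text')  # note: no ':latest' tag
vector = embeddings.embed_query("hello world")  # one round trip to the endpoint
print(len(vector))  # the model's embedding dimension (768 for nomic-embed-text)

If even this single call raises "Error raised by inference endpoint", the problem is the connection to Ollama rather than your chunks.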

You should also consider wrapping the call in a try/except block to surface the underlying error:

import streamlit as st
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

def get_vector_store(chunks):
    """Get vectors for each chunk."""
    try:
        embeddings = OllamaEmbeddings(model='nomic-embed-text')
        st.write("total chunks:", len(chunks))
        vector_store = FAISS.from_documents(chunks, embeddings)
        return vector_store
    except Exception as e:
        # show the underlying error in the app instead of a bare traceback
        st.error(f"Failed to create vector store: {e}")
        return None
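
Then in main() you’d want to guard against the None return; a minimal sketch, assuming chunks already holds your split documents:

vector_store = get_vector_store(chunks)
if vector_store is None:
    st.stop()  # halt this script run so later code doesn't use a missing store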