*Connection Error When Deploying a Streamlit App That Uses a Local Ollama LLM Model*

I’m experiencing a persistent connection error when deploying my Streamlit app that relies on an Ollama LLM model running on my local machine.

Environment Details:

  • Local Machine:
    • Operating System: macOS
    • Ollama Version: 0.1.31 (also tried downgrading to 0.0.11)
  • Streamlit App:
    • Uses LangChain, Ollama LLM, and Chroma for vector storage
  • Deployment Platform:
    • Streamlit Community Cloud
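  • App Wiring (simplified):
    • The snippet below is roughly how the app connects these pieces. Module paths match the older LangChain release I'm on, and the model name and persist directory are placeholders from my setup; prompt templates and retrieval logic are omitted.

      from langchain.llms import Ollama
      from langchain.embeddings import OllamaEmbeddings
      from langchain.vectorstores import Chroma

      # Both the LLM and the embeddings point at the local Ollama server.
      llm = Ollama(base_url="http://localhost:11434", model="llama2")
      embeddings = OllamaEmbeddings(base_url="http://localhost:11434", model="llama2")

      # Chroma persists the vector store on disk next to the app.
      vectorstore = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)
      retriever = vectorstore.as_retriever()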

Problem Description:

  • When running the app locally, it functions correctly and communicates with the Ollama server without issues.

  • Upon deploying the app on Streamlit Community Cloud, I encounter the following error:

    Connection error: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f01295e4380>: Failed to establish a new connection: [Errno 111] Connection refused'))
    

What I’ve Tried So Far:

  1. Verified Ollama Server is Running:

    • Started the Ollama server using ollama serve.

    • Confirmed it’s listening on port 11434.

    • Checked running processes with ps aux | grep ollama and saw:

      /Applications/Ollama.app/Contents/Resources/ollama serve
      
  2. Checked API Endpoints:

    • Accessed http://localhost:11434 and received "ollama is running".
    • Tried curl http://localhost:11434/api/ps and got {"models":[]}.
    • Attempted curl http://localhost:11434/api/models and curl http://localhost:11434/api/generate, but received 404 Not Found errors.
  3. Verified Installed Models:

    • Ran ollama list and confirmed models like llama2, llama3, and others are installed.

      NAME                  ID              SIZE      MODIFIED
      llama2:latest         78e26419b446    3.8 GB    3 hours ago
      llama3:latest         365c0bd3c000    4.7 GB    13 days ago
      
  4. Tested API Calls:

    • Tried curl -X POST http://localhost:11434/api/generate -d '{"model": "llama2", "prompt": "Hello"}' and received a 404 Not Found error.
    • Noted that curl http://localhost:11434/api/ps returns {"models":[]}, indicating no models are running.
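    • For reference, here is the Python equivalent of that curl call, which is roughly the request the app itself builds (the "stream": False flag follows the API docs I was working from):

      import requests

      # Same payload as the curl test above; a 404 here reproduces the problem.
      resp = requests.post(
          "http://localhost:11434/api/generate",
          json={"model": "llama2", "prompt": "Hello", "stream": False},
          timeout=60,
      )
      print(resp.status_code, resp.text)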
  5. Checked for Port Conflicts and Firewall Issues:

    • Used lsof -i :11434 to confirm no other service is occupying the port.
    • Ensured firewall settings are not blocking port 11434.
  6. Verified Ollama Version and API Changes:

    • Concluded, based on the 404 responses and on discussions I found online, that Ollama 0.1.31 had deprecated REST API endpoints like /api/generate (this is my interpretation and may well be wrong).
    • On that assumption, attributed the 404 errors to the deprecated endpoints.
  7. Attempted to Use Correct API Endpoints:

    • Modified the base URL in the code to remove /api prefix.
    • Tested endpoints like http://localhost:11434/generate, but still received 404 Not Found errors.
  8. Explored Using the Ollama CLI Instead of the REST API:

    • Created a custom LLM class in LangChain to interact with the Ollama CLI using Python’s subprocess module.

    • Code snippet for the custom LLM class:

      from langchain.llms.base import LLM
      import subprocess

      class OllamaCLI(LLM):
          # Declared as a class-level field: LangChain's LLM base class is a
          # Pydantic model, so assigning undeclared attributes in __init__ fails.
          model: str = "llama2"

          def _call(self, prompt, stop=None):
              # "ollama run <model>" reads the prompt from stdin and prints the
              # completion; `stop` is accepted for interface compatibility only.
              cmd = ["ollama", "run", self.model]
              process = subprocess.Popen(
                  cmd,
                  stdin=subprocess.PIPE,
                  stdout=subprocess.PIPE,
                  stderr=subprocess.PIPE,
                  text=True,
              )
              stdout, stderr = process.communicate(input=prompt)
              if process.returncode != 0:
                  raise RuntimeError(f"Ollama Error: {stderr}")
              return stdout

          @property
          def _llm_type(self):
              return "ollama_cli"
      
    • Adjusted the application to use this custom LLM class.

    • Tested locally, and the app works using the CLI.
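
    • A minimal usage sketch (simplified; the real app wraps this in a retrieval chain):

      # Instantiate with the default model and call it like any other LangChain LLM.
      llm = OllamaCLI()
      answer = llm("Why is the sky blue?")  # routed through _call(), which shells out to the CLI
      print(answer)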

  9. Attempted to Uninstall Ollama and Install an Older Version:

    • Tried brew uninstall ollama, but received Error: Cask 'ollama' is not installed.
    • Realized Ollama was installed as a standalone application in /Applications/Ollama.app.
  10. Manually Uninstalled Ollama:

    • Quit the Ollama application and killed running processes with pkill -f ollama.

    • Deleted Ollama.app from the Applications folder.

    • Removed associated files and directories:

      rm -rf ~/Library/Application\ Support/Ollama
      rm -rf ~/Library/Caches/com.ollama*
      rm -rf ~/Library/Preferences/com.ollama*
      rm -rf ~/Library/Logs/Ollama
      rm -rf ~/.ollama
      
  11. Installed Ollama Version 0.0.11 with REST API Support:

    • Downloaded the older version from Ollama Releases.
    • Installed the binary to /usr/local/bin.
    • Verified installation with ollama -v, which now shows ollama version is 0.0.11.
    • Started the server with ollama serve.
    • Successfully accessed the REST API endpoints.
  12. Adjusted Application to Use REST API:

    • Updated the base URL in the application to include /api prefix.
    • Changed the model names to match those available in version 0.0.11.
    • Tested the app locally, and it works as expected.
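    • The relevant change was essentially making the /api-prefixed base URL configurable rather than hard-coded, roughly like this (OLLAMA_BASE_URL is just the name I use for the setting; response parsing is omitted because the format differs between versions):

      import os
      import requests

      # Read the base URL from the environment so local and deployed runs can differ.
      OLLAMA_BASE_URL = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434/api")

      def generate(prompt, model="llama2"):
          # POST to the generate endpoint; the caller handles the (streaming) response.
          return requests.post(
              f"{OLLAMA_BASE_URL}/generate",
              json={"model": model, "prompt": prompt},
              stream=True,
              timeout=120,
          )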
  13. Deployment Challenges Remain:

    • Deployed the app on Streamlit Community Cloud.
    • Still receiving connection errors because the deployed app cannot reach the local Ollama server.

My Questions:

  1. Is it possible for a deployed Streamlit app to connect to a local Ollama server?

    • Given that the app is running on Streamlit Community Cloud and the Ollama server is on my local machine, I suspect network connectivity issues are preventing communication.
  2. If not, what are the recommended approaches to make the Ollama server accessible to the deployed app?

    • Are there secure methods to expose the Ollama server to the internet without compromising security?
    • Should I consider hosting the Ollama server on a cloud platform?
  3. Alternatively, is it more feasible to use a cloud-based LLM like OpenAI’s GPT-3.5-turbo for deployment scenarios?

    • Considering the complexities and potential security risks, would switching to a cloud-based LLM service be more practical?
  4. What are the best practices for deploying apps that require a custom LLM backend?

    • How do others handle deploying applications that rely on local models?

Additional Information:

  • Network Limitations:

    • My local machine is behind a NAT firewall with a dynamic IP, making direct connections from the internet challenging.
    • I am hesitant to set up port forwarding due to security concerns.
  • Security Concerns:

    • Exposing my local server to the internet might introduce vulnerabilities.
    • I prefer a solution that maintains security while allowing the deployed app to access the LLM.
  • Deployment Constraints:

    • The app is deployed on Streamlit Community Cloud, which doesn’t allow running background processes like the Ollama server.

What I’m Seeking:

  • Advice on how to enable communication between the deployed Streamlit app and the Ollama server.
  • Recommendations for securely hosting the Ollama server in a way that the deployed app can access it.
  • Best practices for deploying apps that require a custom LLM backend.
  • Insights on whether switching to a cloud-based LLM service would be more practical for deployment.

Thank you for your assistance!


Additional Notes:

  • I’ve read through the Ollama GitHub Repository and the LangChain Ollama Integration Documentation to understand the API changes and potential workarounds.
  • I understand (though I may be misreading the changelogs and discussions) that newer versions of Ollama have deprecated the REST API in favor of a gRPC-based API and CLI interactions, which might not be supported by LangChain yet.
  • Modifying the application to use the CLI works locally but doesn’t resolve the deployment issue due to the inability to run the Ollama server on Streamlit Community Cloud.

I appreciate any guidance or suggestions on how to proceed.