Iñigo Etxaniz

Running Ollama in a Container Without Internet Access

Prerequisites

This setup has been tested on Ubuntu. To enable GPU acceleration, install the NVIDIA Container Toolkit before running the project:

sudo apt install nvidia-container-toolkit
sudo systemctl restart docker
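
Depending on your distribution, you may also need to register the NVIDIA runtime with Docker before restarting the daemon (sudo nvidia-ctk runtime configure --runtime=docker). A quick way to confirm that containers can actually see the GPU is to run nvidia-smi inside a throwaway container:

docker run --rm --gpus all ubuntu nvidia-smi

If the driver table prints, GPU passthrough is working and Ollama will be able to use it.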

Motivation

With the growing demand for AI-powered applications, running large language models locally is becoming increasingly common. However, in some cases, it is essential to ensure confidentiality and prevent unintended data leaks by running Ollama in a fully sandboxed environment without internet access. This setup allows organizations or individuals to maintain full control over their AI models while still leveraging their capabilities.

Additionally, this setup enables fine-grained control over networking, allowing a dedicated model-downloading instance to have internet access while keeping the model-serving instance completely isolated.

Project Overview

This project provides a Docker-based solution to run Ollama models in an isolated environment while keeping the model-downloading instance on a network with internet access. The architecture consists of the following components (a docker-compose sketch follows the list):

  • Ollama Runner: Runs AI models in a sandboxed network without internet access.
  • Ollama Updater: Responsible for downloading models from the internet and sharing them with the runner.
  • Nginx Reverse Proxy: Bridges the network gap, allowing access to the runner while maintaining isolation.
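
To make that split concrete, here is roughly how such a docker-compose file can look. This is a minimal sketch, not the repository's actual file: the service and network names, the shared volume and the published port are assumptions, and GPU settings are omitted.

services:
  ollama-runner:
    image: ollama/ollama
    volumes:
      - ollama-models:/root/.ollama    # model store shared with the updater
    networks:
      - runner-net                     # internal-only network, no route to the internet

  ollama-updater:
    image: ollama/ollama
    volumes:
      - ollama-models:/root/.ollama    # same volume, so pulled models become visible to the runner
    networks:
      - updater-net                    # regular network with internet access for pulls

  proxy:
    image: nginx:alpine                # serves an nginx.conf (not shown) that forwards to ollama-runner:11434
    ports:
      - "11434:11434"                  # host-facing entry point for the runner's API
    networks:
      - runner-net                     # can reach the isolated runner
      - proxy-net                      # non-internal network, so the published port is reachable

networks:
  runner-net:
    internal: true                     # Docker adds no gateway, so outbound traffic is blocked
  updater-net: {}
  proxy-net: {}

volumes:
  ollama-models: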

Tested with VS Code's Continue Extension

To ensure practical usability, I tested this configuration with the Continue extension in Visual Studio Code, and it worked correctly. This means developers can integrate the local Ollama instance seamlessly with their workflow while keeping the model execution isolated.
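
Continue only needs to know where the Ollama API is exposed. An entry along these lines in its config.json should be enough; treat it as illustrative, since the field layout can differ between Continue versions and the host/port depend on how the reverse proxy is published:

{
  "models": [
    {
      "title": "DeepSeek R1 14B (local)",
      "provider": "ollama",
      "model": "deepseek-r1:14b",
      "apiBase": "http://localhost:11434"
    }
  ]
}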

Demonstrating Network Isolation

One of the key aspects of this project is network isolation, ensuring that the Ollama Runner does not have internet access while the Ollama Updater does. To verify this, we can run the following script:

bash network-isolation-test-script.sh

This script performs the following tests:

  • Check internet access:
    • The ollama-updater container should be able to reach google.com.
    • The ollama-runner container should be fully isolated and unable to reach the internet.
  • Check Ollama API accessibility:
    • The ollama-runner should be able to access its API internally.
    • The ollama-updater should be able to access its own API.

Expected output:

✓ ollama-updater network has internet access (Expected)
✓ ollama-runner network is properly isolated (Expected)
✓ Ollama API accessible from runner network
✓ Ollama API accessible from updater network

This confirms that the model execution environment remains secure and offline, while updates can still be managed efficiently.
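
If you prefer to reproduce these checks by hand, a rough manual equivalent looks like this (the script itself may do things differently). The commands borrow each container's network namespace via --network container:<name> and use the curlimages/curl image, so nothing extra needs to be installed inside the Ollama containers:

# The updater's network should reach the internet
docker run --rm --network container:ollama-updater curlimages/curl -sS --max-time 5 https://google.com > /dev/null && echo "updater: online"

# The runner's network should fail the same request
docker run --rm --network container:ollama-runner curlimages/curl -sS --max-time 5 https://google.com > /dev/null || echo "runner: isolated"

# The Ollama API should still answer inside each namespace
docker run --rm --network container:ollama-runner curlimages/curl -s http://localhost:11434/api/tags
docker run --rm --network container:ollama-updater curlimages/curl -s http://localhost:11434/api/tags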

Technical Setup

Key Features

  • Fully sandboxed Ollama model execution
  • Separate container for downloading models with internet access
  • Nginx reverse proxy for controlled access
  • Custom small Docker networks (/28) to ensure isolation (sketched after this list)
  • GPU acceleration via NVIDIA toolkit (if available)
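
The small /28 networks mentioned in the list can be declared directly in docker-compose. The subnet values below are placeholders; pick ranges that do not clash with anything else on your host:

networks:
  runner-net:
    internal: true                 # no gateway to the outside world
    ipam:
      config:
        - subnet: 172.28.0.0/28    # 16 addresses: plenty for the runner and the proxy
  updater-net:
    ipam:
      config:
        - subnet: 172.28.0.16/28   # small dedicated range that keeps internet access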

Deployment Steps

  1. Install Docker and NVIDIA Container Toolkit (if using GPU).
  2. Clone the repository:
   git clone https://github.com/ietxaniz/ollama-local.git
   cd ollama-local
  3. Start the services:
   docker-compose up -d
  4. Verify running containers:
   docker ps
  5. Use the Ollama Runner to serve models, while the Ollama Updater manages downloads.
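
Once the stack is up, the runner's API should be reachable through the reverse proxy. Assuming the proxy publishes port 11434 on the host (adjust the address to your compose file), a quick smoke test is to list the installed models over HTTP:

curl http://localhost:11434/api/tags

This is the same endpoint that clients such as the Continue extension talk to.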

Managing Models

To list available models:

docker exec -it ollama-runner ollama ls

To pull new models:

docker exec -it ollama-updater ollama pull deepseek-r1:14b

To check active models:

docker exec -it ollama-runner ollama ps

Future Possibilities

While this project provides a robust foundation for securely running Ollama, I am considering extending it further to explore Retrieval-Augmented Generation (RAG). This could enhance local AI capabilities by integrating external knowledge bases while keeping execution sandboxed.

If you're interested in contributing or have suggestions, feel free to open an issue in the GitHub repository.

Conclusion

This setup ensures that AI models can run securely in an isolated environment while maintaining the flexibility to update and manage them efficiently. Whether for security reasons or simply to experiment with self-hosted AI, this approach provides a reliable solution.

Would you like to see more extensions of this project? Let me know in the comments!
