Containers Uncovered: More Than Just Lightweight Virtual Machines!
If you’re like me — always wondering how things work and eager to build them with your own mind and hands — you’re in the right place!
In the first part of this article (Part 1), I attempted to build a minimal container system using only Go, relying on Linux’s unshare and namespaces. It was purely a demonstration, and I wasn’t aiming to develop a fully functional container runtime tool. I intentionally left out many critical aspects, such as security, networking, and image management.
I initially thought it would be simple, but I quickly realized that even a basic container system involves thousands of concepts and implementations. However, my passion for understanding and building things kept me going.
Now, after a year since my first article on Building a Minimal Container in Go, I’ve realized that both my code and my original article need a fresh perspective. So, it’s time for a revisit!
System Architecture
Core Components
- User CLI
Responsibilities:
Parse user commands (run, exec, ps, rm)
Communicate with the daemon via RPC or another IPC mechanism
Format and display output
Key Features:
Command completion
Output formatting (JSON/YAML)
Log streaming
- Container Daemon
Responsibilities:
Manage container lifecycle
Maintain container state database
Coordinate between components
Key Features:
REST/gRPC API
Event logging
Resource tracking
- Container Runtime
Components:
Namespace Manager: handles the CLONE_NEW* flags (real-world runtimes deal with many more)
Cgroups Manager: Resource constraints
Filesystem Setup: RootFS preparation
Features:
OCI runtime spec compliance
User namespace remapping
Seccomp/AppArmor profiles
- Image Service
Components:
Registry Client: Docker Hub integration, or your own image registry if you want to go wild
Layer Manager: OverlayFS/BTRFS
Snapshotter: Copy-on-write layers
Features:
Image caching
Signature verification
Garbage collection
- Network Manager
Components:
CNI Plugins: Bridge, MACVLAN, IPVLAN
IPAM: DHCP/Static allocation
Service Mesh: DNS, service discovery
Features:
Multi-host networking
Network policies
Port mapping
- Storage Driver
Components:
Volume Manager: Bind mounts
Snapshot Manager: Incremental backups
Quota Enforcer: Disk limits
Features:
Persistent storage
Temporary filesystems
Encryption support
This schema gives you the bigger picture:
+---------------------+
| User CLI |
| (run, exec, ps, rm) |
+----------+----------+
|
| (gRPC/HTTP)
v
+---------------------+
| Container Daemon |
| (State Management) |
+----------+----------+
|
+------------------+------------------+
| | |
+----------+----------+ +-----+--------+ +-------+---------+
| Container Runtime | | Image Service| | Network Manager |
| (namespace/cgroups) | | (OCI Images) | | (CNI Plugins) |
+----------+----------+ +-----+--------+ +-------+---------+
| | |
+---------v---------+ +------v-------+ +--------v---------+
| Linux Kernel | | Storage Driver| | Host Networking |
| - namespaces | | (OverlayFS) | | (iptables/bridge)|
| - cgroups v2 | +---------------+ +------------------+
| - capabilities |
+--------------------+
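To make the CLI-to-daemon arrow at the top of the schema concrete, here is a minimal, hypothetical sketch of that boundary over plain HTTP. The endpoint path and payload shape are my own assumptions; a production daemon would expose a gRPC API over a unix socket and persist container state:
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// RunRequest is an assumed payload shape for a "run" command
// sent by the CLI to the daemon.
type RunRequest struct {
	Image string   `json:"image"`
	Cmd   []string `json:"cmd"`
}

func main() {
	// Hypothetical endpoint; real runtimes use gRPC over a unix socket.
	http.HandleFunc("/containers/run", func(w http.ResponseWriter, r *http.Request) {
		var req RunRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		// Here the daemon would hand off to the container runtime.
		log.Printf("run requested: image=%s cmd=%v", req.Image, req.Cmd)
		w.WriteHeader(http.StatusAccepted)
	})
	log.Fatal(http.ListenAndServe("127.0.0.1:7777", nil))
}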
It has been a long journey for me to learn and think through every component. I encountered many challenges, especially with aspects like OverlayFS and networking.
My biggest issue in my first implementation was networking. It was really difficult to isolate the child container and set up its own bridged network.
To solve network isolation, you need to think clearly 🤔 at this stage.
First, you need to create a pair of connected virtual interfaces on the host (a veth pair, optionally attached to a host bridge):
The first interface remains on the host.
The second interface is moved to the child container 🫙.
The real challenge here is managing command signaling between the host and the child container.
In my approach, I build a proof-of-concept implementation.
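Before we dig in, here is a small, self-contained demo of that signaling, using a pipe handed to the child through ExtraFiles. The names are illustrative; in the full listing later in this article, the child runs its network setup without waiting, which can race with the host moving the veth (one likely cause of the "failed to bring up veth" error shown near the end):
package main

import (
	"log"
	"os"
	"os/exec"
)

// Minimal demo of host/child signaling over a pipe; the same pattern
// can be grafted onto run() and child() in the full listing below.
func main() {
	if len(os.Args) > 1 && os.Args[1] == "child" {
		ready := os.NewFile(3, "ready") // fd 3 = first ExtraFiles entry
		buf := make([]byte, 1)
		ready.Read(buf) // block until the parent signals
		log.Println("child: host finished network setup, continuing")
		return
	}
	r, w, err := os.Pipe()
	if err != nil {
		log.Fatal(err)
	}
	cmd := exec.Command("/proc/self/exe", "child")
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	cmd.ExtraFiles = []*os.File{r} // child inherits the read end as fd 3
	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}
	// ... the host would run setupHostNetwork(cmd.Process.Pid) here ...
	w.Write([]byte{1}) // signal the child to proceed
	w.Close()
	cmd.Wait()
}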
Understanding Container Networking
When we create containers, one of the most crucial aspects is network isolation. Think of it like giving each container its own private network environment, complete with its own network interfaces, IP addresses, and routing rules. Let’s break down how we achieve this in our container implementation.
The Network Setup Process
- Creating the Network Namespace
First, we create a separate network namespace for our container. This is like giving the container its own private networking room:
const ContainerName = "mycontainer"
func createNetworkNamespace(name string) error {
// Create directory for network namespaces
if err := os.MkdirAll("/var/run/netns", 0755); err != nil {
return err
}
// Create the namespace file
nsFile := filepath.Join("/var/run/netns", name)
fd, err := os.Create(nsFile)
if err != nil {
return err
}
fd.Close()
// Bind mount it to make it accessible
return syscall.Mount("/proc/self/ns/net", nsFile, "bind", syscall.MS_BIND, "")
}
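One subtlety: as written, this bind-mounts the network namespace of whichever process calls it. Since run() calls it from the host before starting the child, the file under /var/run/netns actually points at the host's namespace; the container's real namespace comes from CLONE_NEWNET, and the veth end is later moved into it by PID rather than by this name.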
- Setting Up Virtual Network Interfaces
We create a virtual network cable (veth pair) to connect our container to the host system:
const (
VethHost = "veth0" // Host end of the cable
VethContainer = "veth1" // Container end of the cable
ContainerIP = "10.0.0.2/24"
HostIP = "10.0.0.1/24"
Gateway = "10.0.0.1"
)
The setup happens in two parts:
1. On the host side:
func setupHostNetwork(pid int) error {
// Create the virtual network cable (veth pair)
if err := exec.Command("ip", "link", "add", VethHost, "type", "veth",
"peer", "name", VethContainer).Run(); err != nil {
return fmt.Errorf("failed to create veth pair: %v", err)
}
// Move one end to the container
if err := exec.Command("ip", "link", "set", VethContainer,
"netns", fmt.Sprintf("%d", pid)).Run(); err != nil {
return fmt.Errorf("failed to move veth to container: %v", err)
}
// Configure the host end
if err := exec.Command("ip", "link", "set", VethHost, "up").Run(); err != nil {
return fmt.Errorf("failed to bring up host interface: %v", err)
}
if err := exec.Command("ip", "addr", "add", HostIP, "dev", VethHost).Run(); err != nil {
return fmt.Errorf("failed to assign IP to host interface: %v", err)
}
	return nil
}
2. Inside the container:
func setupContainerNetwork() error {
// Enable the loopback interface
if err := exec.Command("ip", "link", "set", "lo", "up").Run(); err != nil {
return fmt.Errorf("failed to bring up lo: %v", err)
}
// Configure the container's network interface
if err := exec.Command("ip", "link", "set", VethContainer, "up").Run(); err != nil {
return fmt.Errorf("failed to bring up veth: %v", err)
}
if err := exec.Command("ip", "addr", "add", ContainerIP,
"dev", VethContainer).Run(); err != nil {
return fmt.Errorf("failed to assign IP to veth: %v", err)
}
if err := exec.Command("ip", "route", "add", "default",
"via", Gateway).Run(); err != nil {
return fmt.Errorf("failed to add default route: %v", err)
}
	return nil
}
Internet Connectivity
To allow our container to access the internet, we need to set up NAT (Network Address Translation) rules. This is like setting up a router for our container:
func setupHostNetwork(pid int) error {
// Get the host's internet-connected interface
iface, err := getDefaultInterface()
if err != nil {
return fmt.Errorf("failed to get default interface: %v", err)
}
// Set up NAT and forwarding rules
cmds := [][]string{
{"sysctl", "-w", "net.ipv4.ip_forward=1"},
{"iptables", "-t", "nat", "-A", "POSTROUTING",
"-s", "10.0.0.0/24", "-o", iface, "-j", "MASQUERADE"},
{"iptables", "-A", "FORWARD", "-i", iface,
"-o", VethHost, "-j", "ACCEPT"},
{"iptables", "-A", "FORWARD", "-i", VethHost,
"-o", iface, "-j", "ACCEPT"},
}
for _, cmd := range cmds {
if out, err := exec.Command(cmd[0], cmd[1:]...).CombinedOutput(); err != nil {
return fmt.Errorf("failed %v: %s\n%s", cmd, err, out)
}
}
	return nil
}
Finally, Resource Cleanup
One often overlooked but crucial aspect is cleaning up network resources when the container stops. Our implementation handles this through a ResourceManager:
type ResourceManager struct {
containerName string
vethHost string
mounts []string
namespaces []string
}
func (rm *ResourceManager) cleanupNetwork() error {
// Clean up iptables rules
if err := rm.cleanupIptablesRules(); err != nil {
log.Printf("Warning: iptables cleanup failed: %v", err)
}
// Remove the virtual network interface
if out, err := exec.Command("ip", "link", "delete",
rm.vethHost).CombinedOutput(); err != nil {
log.Printf("Warning: failed to delete veth interface: %v (%s)", err, out)
}
return nil
}
How It All Works Together
1. When starting a container:
Create a new network namespace
Create virtual network interfaces (veth pair)
Configure IP addresses and routing
Set up NAT for internet access
Mount necessary filesystems and set up devices
2. During container runtime:
Container uses its virtual network interface for all network communication
Outgoing traffic goes through NAT to reach the internet
Incoming traffic is routed back to the container
3. When stopping a container:
Clean up iptables rules
Remove virtual interfaces
Unmount network namespace
Remove namespace files
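A quick end-to-end check, assuming your rootfs has iproute2 and ping installed: start the container, then inside it run ip addr show veth1 to confirm the interface, ping -c 1 10.0.0.1 to confirm the route to the host, and ping -c 1 8.8.8.8 to confirm NAT is working.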
Common Issues and Debugging
When implementing container networking, you might encounter these common issues:
1. DNS Resolution Problems
Our implementation includes DNS setup:
// in most cases this will cause an error; still trying to solve it
func setupDNS() error {
resolvHost := "/etc/resolv.conf"
resolvContainer := filepath.Join(RootFS, "etc/resolv.conf")
return syscall.Mount(resolvHost, resolvContainer, "bind",
syscall.MS_BIND|syscall.MS_RDONLY, "")
}
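On many distributions /etc/resolv.conf is a symlink (for example, to systemd-resolved's stub file), so the bind mount may fail or end up exposing a stub resolver at 127.0.0.53 that is unreachable from the container's namespace. A simpler alternative, assuming the host file is readable, is to copy its contents into the rootfs; copyDNSConfig is a hypothetical helper, not part of the listing below:
// Hypothetical alternative to the bind mount: copy the host's
// resolver configuration into the container's rootfs, avoiding
// symlink and mount-ordering problems.
func copyDNSConfig() error {
	data, err := os.ReadFile("/etc/resolv.conf")
	if err != nil {
		return err
	}
	if err := os.MkdirAll(filepath.Join(RootFS, "etc"), 0755); err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(RootFS, "etc/resolv.conf"), data, 0644)
}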
2. Network Interface Issues
Always check interface status with ip link show
Verify IP assignments with ip addr show
Check routing with ip route show
3. Connection Problems
Verify iptables rules are correctly set
Check IP forwarding is enabled
Ensure the host interface is up and working
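You can run all of these checks from the host against the container's namespace by entering it with nsenter and the child's PID. The helper below is illustrative and not part of the main listing:
// Illustrative debug helper: run the usual diagnostics inside the
// container's network namespace, entered by PID via nsenter.
func debugContainerNet(pid int) {
	checks := [][]string{
		{"ip", "link", "show"},
		{"ip", "addr", "show"},
		{"ip", "route", "show"},
	}
	for _, args := range checks {
		full := append([]string{"-t", fmt.Sprintf("%d", pid), "-n"}, args...)
		out, _ := exec.Command("nsenter", full...).CombinedOutput()
		fmt.Printf("$ nsenter %s\n%s\n", strings.Join(full, " "), out)
	}
}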
Security Considerations
Our implementation includes several security features:
1. Network Isolation
Each container gets its own network namespace
Network traffic is isolated between containers
2. Resource Cleanup
Proper cleanup of network resources prevents resource leaks
Automatic cleanup on container exit
This networking implementation provides a solid foundation for container isolation while maintaining internet connectivity. While it’s simpler than production container runtimes, it demonstrates the core concepts of container networking.
This was the hard part for me, and I tried many implementations to get it working. You have to keep in mind what each command does and where it executes: sometimes you find yourself trying to create veths inside the container, or you cannot move the container's interface from the host into the child.
You may want to read my previous article to follow what we are doing here. I have cleaned up my code and put everything back together to test network isolation.
Do not forget to change RootFS to point at your own root filesystem ("ubuntu or whatever image you will run").
package main
import (
	"fmt"
	"log"
	"os"
	"os/exec"
	"os/signal"
	"path/filepath"
	"strings"
	"syscall"
)
const (
ContainerName = "mycontainer"
VethHost = "veth0"
VethContainer = "veth1"
ContainerIP = "10.0.0.2/24"
HostIP = "10.0.0.1/24"
Gateway = "10.0.0.1"
RootFS = "/mnt/drive/go-projects/lc-images-regs/ubuntu_fs"
)
type ResourceManager struct {
containerName string
vethHost string
mounts []string
namespaces []string
}
func NewResourceManager(containerName string) *ResourceManager {
return &ResourceManager{
containerName: containerName,
vethHost: VethHost,
mounts: []string{
"/proc",
"/dev/pts",
"/dev",
},
namespaces: []string{
"net",
"uts",
"pid",
"ipc",
},
}
}
func (rm *ResourceManager) Setup() {
// Set up signal handling
sigChan := make(chan os.Signal, 1)
signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)
go func() {
sig := <-sigChan
log.Printf("Received signal %v, cleaning up...", sig)
rm.Cleanup()
os.Exit(1)
}()
}
func (rm *ResourceManager) Cleanup() error {
var errors []string
// 1. Clean up network resources
if err := rm.cleanupNetwork(); err != nil {
errors = append(errors, fmt.Sprintf("network cleanup error: %v", err))
}
// 2. Clean up mounts
if err := rm.cleanupMounts(); err != nil {
errors = append(errors, fmt.Sprintf("mount cleanup error: %v", err))
}
// 3. Clean up namespaces
if err := rm.cleanupNamespaces(); err != nil {
errors = append(errors, fmt.Sprintf("namespace cleanup error: %v", err))
}
if len(errors) > 0 {
return fmt.Errorf("cleanup errors: %s", strings.Join(errors, "; "))
}
return nil
}
func (rm *ResourceManager) cleanupNetwork() error {
// Clean up iptables rules first
if err := rm.cleanupIptablesRules(); err != nil {
log.Printf("Warning: iptables cleanup failed: %v", err)
}
// Clean up veth interfaces
if out, err := exec.Command("ip", "link", "delete", rm.vethHost).CombinedOutput(); err != nil {
log.Printf("Warning: failed to delete veth interface: %v (%s)", err, out)
}
return nil
}
func (rm *ResourceManager) cleanupIptablesRules() error {
iface, err := getDefaultInterface()
if err != nil {
return fmt.Errorf("failed to get default interface: %v", err)
}
rules := [][]string{
{"iptables", "-D", "FORWARD", "-i", iface, "-o", rm.vethHost, "-j", "ACCEPT"},
{"iptables", "-D", "FORWARD", "-i", rm.vethHost, "-o", iface, "-j", "ACCEPT"},
{"iptables", "-t", "nat", "-D", "POSTROUTING", "-s", "10.0.0.0/24", "-o", iface, "-j", "MASQUERADE"},
}
for _, rule := range rules {
if out, err := exec.Command(rule[0], rule[1:]...).CombinedOutput(); err != nil {
log.Printf("Warning: failed to remove iptables rule: %v (%s)", err, out)
}
}
return nil
}
func (rm *ResourceManager) cleanupMounts() error {
for _, mount := range rm.mounts {
mountPath := filepath.Join(RootFS, mount)
if err := syscall.Unmount(mountPath, syscall.MNT_DETACH); err != nil {
log.Printf("Warning: failed to unmount %s: %v", mountPath, err)
}
}
return nil
}
func (rm *ResourceManager) cleanupNamespaces() error {
for _, ns := range rm.namespaces {
nsPath := filepath.Join("/var/run/netns", rm.containerName)
if err := syscall.Unmount(nsPath, syscall.MNT_DETACH); err != nil {
log.Printf("Warning: failed to unmount namespace %s: %v", ns, err)
}
if err := os.Remove(nsPath); err != nil {
log.Printf("Warning: failed to remove namespace file %s: %v", nsPath, err)
}
}
return nil
}
func cleanupExistingResources() error {
// Cleanup network namespace
if _, err := os.Stat("/var/run/netns/" + ContainerName); err == nil {
if err := cleanupNetworkNamespace(ContainerName); err != nil {
return fmt.Errorf("failed to cleanup existing network namespace: %v", err)
}
}
// Cleanup veth interfaces
if _, err := exec.Command("ip", "link", "show", VethHost).CombinedOutput(); err == nil {
if err := exec.Command("ip", "link", "delete", VethHost).Run(); err != nil {
return fmt.Errorf("failed to delete existing veth interface: %v", err)
}
}
// Cleanup iptables rules
if err := cleanupIptablesRules(); err != nil {
return fmt.Errorf("failed to cleanup iptables rules: %v", err)
}
return nil
}
func cleanupIptablesRules() error {
iface, err := getDefaultInterface()
if err != nil {
return fmt.Errorf("failed to get default interface: %v", err)
}
cmds := [][]string{
{"iptables", "-D", "FORWARD", "-i", iface, "-o", VethHost, "-j", "ACCEPT"},
{"iptables", "-D", "FORWARD", "-i", VethHost, "-o", iface, "-j", "ACCEPT"},
{"iptables", "-t", "nat", "-D", "POSTROUTING", "-s", "10.0.0.0/24", "-o", iface, "-j", "MASQUERADE"},
}
for _, cmd := range cmds {
// Ignore errors since rules might not exist
exec.Command(cmd[0], cmd[1:]...).Run()
}
return nil
}
func getDefaultInterface() (string, error) {
out, err := exec.Command("ip", "route", "show", "default").CombinedOutput()
if err != nil {
return "", err
}
fields := strings.Fields(string(out))
for i, field := range fields {
if field == "dev" && i+1 < len(fields) {
return fields[i+1], nil
}
}
return "", fmt.Errorf("no default interface found")
}
func main() {
if len(os.Args) < 2 {
log.Fatal("Usage: [run|child] command [args...]")
}
switch os.Args[1] {
case "run":
run()
case "child":
child()
default:
log.Fatalf("unknown command: %s", os.Args[1])
}
}
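// NOTE: setupCgroups below writes the legacy cgroup v1 layout
// (separate cpu/ and memory/ hierarchies). On a host running the
// unified cgroup v2 hierarchy, the file names differ
// (cpu.weight, memory.max, cgroup.procs).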
func setupCgroups(containerName string, pid int, cpuShares, memoryLimitMB int) error {
	cgroupBase := "/sys/fs/cgroup"
	containerID := containerName
// Create CPU cgroup
cpuPath := filepath.Join(cgroupBase, "cpu", containerID)
os.MkdirAll(cpuPath, 0755)
os.WriteFile(filepath.Join(cpuPath, "cpu.shares"), []byte(fmt.Sprintf("%d", cpuShares)), 0644)
os.WriteFile(filepath.Join(cpuPath, "tasks"), []byte(fmt.Sprintf("%d", pid)), 0644)
// Create memory cgroup
memoryPath := filepath.Join(cgroupBase, "memory", containerID)
os.MkdirAll(memoryPath, 0755)
os.WriteFile(filepath.Join(memoryPath, "memory.limit_in_bytes"), []byte(fmt.Sprintf("%d", memoryLimitMB*1024*1024)), 0644)
os.WriteFile(filepath.Join(memoryPath, "tasks"), []byte(fmt.Sprintf("%d", pid)), 0644)
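	// NOTE: the uid/gid mappings below only take effect for a child
	// created with CLONE_NEWUSER; run() does not set that flag, so
	// these writes are expected to fail (their errors are ignored here).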
uidMap := fmt.Sprintf("0 %d 1", os.Getuid())
gidMap := fmt.Sprintf("0 %d 1", os.Getgid())
os.WriteFile(fmt.Sprintf("/proc/%d/uid_map", pid), []byte(uidMap), 0644)
os.WriteFile(fmt.Sprintf("/proc/%d/gid_map", pid), []byte(gidMap), 0644)
return nil
}
func run() {
rm := NewResourceManager(ContainerName)
rm.Setup()
defer rm.Cleanup()
if err := cleanupExistingResources(); err != nil {
log.Printf("Cleanup warning: %v", err)
}
// Create network namespace
if err := createNetworkNamespace(ContainerName); err != nil {
log.Fatalf("Failed to create network namespace: %v", err)
}
// Start container process
cmd := exec.Command("/proc/self/exe", append([]string{"child"}, os.Args[2:]...)...)
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
cmd.SysProcAttr = &syscall.SysProcAttr{
Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS | syscall.CLONE_NEWNET |
syscall.CLONE_NEWIPC,
}
if err := cmd.Start(); err != nil {
log.Fatalf("Failed to start container: %v", err)
}
pid := cmd.Process.Pid
if err := setupCgroups(ContainerName, pid, 512, 256); err != nil { // Example: 512 CPU shares, 256 MB memory limit
log.Fatalf("Failed to setup cgroups: %v", err)
}
// Configure host-side networking
if err := setupHostNetwork(cmd.Process.Pid); err != nil {
log.Fatalf("Failed to setup host network: %v", err)
}
// Wait for container to exit
if err := cmd.Wait(); err != nil {
log.Fatalf("Container failed: %v", err)
}
// Cleanup
if err := cleanupNetworkNamespace(ContainerName); err != nil {
log.Printf("Failed to cleanup network namespace: %v", err)
}
}
func child() {
// Setup container environment
if err := setupContainer(); err != nil {
log.Fatalf("Failed to setup container: %v", err)
}
// Execute command
if len(os.Args) < 3 {
log.Fatal("No command specified")
}
cmd := exec.Command(os.Args[2], os.Args[3:]...)
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
if err := cmd.Run(); err != nil {
log.Fatalf("Command failed: %v", err)
}
}
func createNetworkNamespace(name string) error {
// Create bind mount for ip netns compatibility
if err := os.MkdirAll("/var/run/netns", 0755); err != nil {
return err
}
// Create namespace file
nsFile := filepath.Join("/var/run/netns", name)
fd, err := os.Create(nsFile)
if err != nil {
return err
}
fd.Close()
// Bind mount
return syscall.Mount("/proc/self/ns/net", nsFile, "bind", syscall.MS_BIND, "")
}
func cleanupNetworkNamespace(name string) error {
nsFile := filepath.Join("/var/run/netns", name)
if err := syscall.Unmount(nsFile, 0); err != nil {
return fmt.Errorf("failed to unmount network namespace: %v", err)
}
// Remove the file to complete cleanup.
if err := os.Remove(nsFile); err != nil {
return fmt.Errorf("failed to remove namespace file %s: %v", nsFile, err)
}
return nil
}
func setupHostNetwork(pid int) error {
// Get host's default interface
iface, err := getDefaultInterface()
if err != nil {
return fmt.Errorf("failed to get default interface: %v", err)
}
// Create veth pair
if err := exec.Command("ip", "link", "add", VethHost, "type", "veth", "peer", "name", VethContainer).Run(); err != nil {
return fmt.Errorf("failed to create veth pair: %v", err)
}
// Move container end to namespace
if err := exec.Command("ip", "link", "set", VethContainer, "netns", fmt.Sprintf("%d", pid)).Run(); err != nil {
return fmt.Errorf("failed to move veth to container: %v", err)
}
// Configure host interface
if err := exec.Command("ip", "link", "set", VethHost, "up").Run(); err != nil {
return fmt.Errorf("failed to bring up host interface: %v", err)
}
if err := exec.Command("ip", "addr", "add", HostIP, "dev", VethHost).Run(); err != nil {
return fmt.Errorf("failed to assign IP to host interface: %v", err)
}
cmds := [][]string{
{"sysctl", "-w", "net.ipv4.ip_forward=1"},
{"iptables", "-t", "nat", "-A", "POSTROUTING", "-s", "10.0.0.0/24", "-o", iface, "-j", "MASQUERADE"},
{"iptables", "-A", "FORWARD", "-i", iface, "-o", VethHost, "-j", "ACCEPT"},
{"iptables", "-A", "FORWARD", "-i", VethHost, "-o", iface, "-j", "ACCEPT"},
}
for _, cmd := range cmds {
if out, err := exec.Command(cmd[0], cmd[1:]...).CombinedOutput(); err != nil {
return fmt.Errorf("failed %v: %s\n%s", cmd, err, out)
}
}
return nil
}
func setupContainer() error {
// Setup root filesystem
if err := syscall.Chroot(RootFS); err != nil {
return fmt.Errorf("chroot failed: %v", err)
}
if err := os.Chdir("/"); err != nil {
return fmt.Errorf("chdir failed: %v", err)
}
// Mount proc
if err := syscall.Mount("proc", "/proc", "proc", 0, ""); err != nil {
return fmt.Errorf("failed to mount proc: %v", err)
}
// Setup devices
if err := setupDevices(); err != nil {
return fmt.Errorf("failed to setup devices: %v", err)
}
// Configure network
if err := setupContainerNetwork(); err != nil {
return fmt.Errorf("failed to setup network: %v", err)
}
//if err := setupDNS(); err != nil {
// return fmt.Errorf("DNS setup failed: %v", err)
//}
return nil
}
func setupDNS() error {
// Copy host's resolv.conf
resolvHost := "/etc/resolv.conf"
resolvContainer := filepath.Join(RootFS, "etc/resolv.conf")
// Create container's /etc if missing
if err := os.MkdirAll(filepath.Join(RootFS, "etc"), 0755); err != nil {
return err
}
// Bind mount host's resolv.conf
return syscall.Mount(resolvHost, resolvContainer, "bind", syscall.MS_BIND|syscall.MS_RDONLY, "")
}
func setupDevices() error {
// Mount tmpfs for /dev
if err := syscall.Mount("tmpfs", "/dev", "tmpfs", 0, "size=64k,mode=755"); err != nil {
return err
}
// Create /dev/pts directory if missing
devPts := "/dev/pts"
if err := os.MkdirAll(devPts, 0755); err != nil {
return fmt.Errorf("mkdir %s failed: %v", devPts, err)
}
// Mount devpts on /dev/pts for pty support
if err := syscall.Mount("devpts", devPts, "devpts", 0, "mode=0620,ptmxmode=0666"); err != nil {
return fmt.Errorf("failed to mount devpts on %s: %v", devPts, err)
}
// Create basic devices
devices := []struct {
name string
major uint32
minor uint32
}{
{"null", 1, 3},
{"zero", 1, 5},
{"random", 1, 8},
{"urandom", 1, 9},
}
for _, dev := range devices {
path := filepath.Join("/dev", dev.name)
if err := syscall.Mknod(path, syscall.S_IFCHR|0666, int(makedev(dev.major, dev.minor))); err != nil {
return err
}
}
return nil
}
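// NOTE: this simple (major<<8)|minor encoding matches the kernel's
// device-number layout only for small major/minor values, which is
// fine for the basic devices created above.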
func makedev(major, minor uint32) uint64 {
return (uint64(major) << 8) | uint64(minor)
}
func setupContainerNetwork() error {
// Bring up loopback
if err := exec.Command("ip", "link", "set", "lo", "up").Run(); err != nil {
return fmt.Errorf("failed to bring up lo: %v", err)
}
// Configure veth interface
if err := exec.Command("ip", "link", "set", VethContainer, "up").Run(); err != nil {
return fmt.Errorf("failed to bring up veth: %v", err)
}
if err := exec.Command("ip", "addr", "add", ContainerIP, "dev", VethContainer).Run(); err != nil {
return fmt.Errorf("failed to assign IP to veth: %v", err)
}
if err := exec.Command("ip", "route", "add", "default", "via", Gateway).Run(); err != nil {
return fmt.Errorf("failed to add default route: %v", err)
}
return nil
}
Important point: you must mount and create the essential virtual devices, and establish communication (such as pipes or signals) between the host and the child container; see setupDevices and the ResourceManager's Setup in the listing above.
Now you have a broad overview, but you still have a long journey ahead to achieve what production-ready container runtime systems offer.
If you need system file images to test your code, you can use Docker to download one.
$ docker run -d --rm --name ubuntu_fs ubuntu:20.04 sleep 1000
$ mkdir -p ./ubuntu_fs
$ docker cp ubuntu_fs:/ ./ubuntu_fs
$ docker stop ubuntu_fs
Or use a tool like debootstrap:
sudo apt-get update
sudo apt-get install debootstrap
sudo mkdir -p /path/to/rootfs
sudo debootstrap stable /path/to/rootfs http://deb.debian.org/debian
Sometimes, while testing, you may need to install software in your container image from the host if your child container struggles to access the internet.
For an Alpine rootfs:
sudo chroot /path/to/rootfs /bin/sh -c "apk add --no-cache iproute2"
Or for the ubuntu_fs image:
sudo chroot /mnt/drive/go-projects/lc-images-regs/ubuntu_fs /bin/sh -c "apt-get update && apt-get install -y iproute2"
Note: Sometimes, when you start the container with the following command, using Bash as the entry point, you may encounter a bug:
sudo go run Main.go run sudo /bin/bash
You will see errors like these:
2025/02/21 00:26:28 Failed to setup container: failed to setup network: failed to bring up veth: exit status 1
2025/02/21 01:26:28 Failed to setup host network: failed to assign IP to host interface: exit status 1
exit status 1
This happens when leftover resources were not cleaned up properly. You can either retry the command (it usually succeeds within a few tries) or fix the cleanup logic.
You still need to implement DNS to align with the original system design. What we built is just a proof of concept application.
My next step is to ensure resource limitations and create an image composer like Docker while utilizing OverlayFS. Until then, if you need any help, feel free to DM me.
I have created a Discord channel dedicated to this topic; join me:
Happy coding, everybody!