Building a Goroutine Pool in Go


0. Introduction

As mentioned previously, when the native HTTP server in Go handles client connections, it spawns a goroutine for each connection, which is a rather brute-force approach. To understand this better, let's look at the Go source code. First, define the simplest possible HTTP server:

package main

import (
    "fmt"
    "log"
    "net/http"
)

func myHandler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Hello there!\n")
}

func main() {
    http.HandleFunc("/", myHandler) // register the handler for the root route
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Let's follow the entry point, the http.ListenAndServe function.

// file: net/http/server.go
func ListenAndServe(addr string, handler Handler) error {
    server := &Server{Addr: addr, Handler: handler}
    return server.ListenAndServe()
}

func (srv *Server) ListenAndServe() error {
    addr := srv.Addr
    if addr == "" {
        addr = ":http"
    }
    ln, err := net.Listen("tcp", addr)
    if err != nil {
        return err
    }
    return srv.Serve(tcpKeepAliveListener{ln.(*net.TCPListener)})
}

func (srv *Server) Serve(l net.Listener) error {
    defer l.Close()
    ...
    for {
        rw, e := l.Accept()
        if e != nil {
            // error handling
            return e
        }
        tempDelay = 0
        c, err := srv.newConn(rw)
        if err != nil {
            continue
        }
        c.setState(c.rwc, StateNew) // before Serve can return
        go c.serve()
    }
}

First, net.Listen listens on the network port. rw, e := l.Accept() then accepts a TCP connection from that port, and go c.serve() spawns a goroutine for each connection to handle it. I also mentioned that the fasthttp framework performs better than the native net/http framework, and one reason is its use of a goroutine pool. So the question is: if we were to implement a goroutine pool ourselves, how would we do it? Let's start with the simplest implementation.

1. Weak Version

In Go, goroutines are launched with the go keyword. Goroutine resources differ from those in a temporary object pool: they cannot be put back and retrieved later. So the goroutines in a pool should run continuously: they work when tasks are available and block when there are none, which has little impact on the scheduling of other goroutines. Tasks can be delivered to the goroutines through a channel. Here is a simple, weak version:

func Gopool() {
    start := time.Now()
    wg := new(sync.WaitGroup)
    data := make(chan int, 100)

    // Start 10 long-lived worker goroutines.
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
            for v := range data {
                fmt.Println("goroutine:", n, v)
            }
        }(i)
    }

    // Send 10000 tasks into the channel.
    for i := 0; i < 10000; i++ {
        data <- i
    }
    close(data)
    wg.Wait()
    end := time.Now()
    fmt.Println(end.Sub(start))
}

The above code also measures the program's running time. For comparison, here is a version that does not use a pool:

func Nopool() {
    start := time.Now()
    wg := new(sync.WaitGroup)

    for i := 0; i < 10000; i++ {
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
            //fmt.Println("goroutine", n)
        }(i)
    }
    wg.Wait()

    end := time.Now()
    fmt.Println(end.Sub(start))
}

Comparing the running times, the code using the goroutine pool finishes in about two-thirds of the time of the code without a pool. This test is still rough, though, so next we use the Go benchmark facility introduced in the reflect article. The test code is as follows (with irrelevant code removed):

package pool

import (
    "sync"
    "testing"
)

func Gopool() {
    wg := new(sync.WaitGroup)
    data := make(chan int, 100)

    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
            for range data {
            }
        }(i)
    }

    for i := 0; i < 10000; i++ {
        data <- i
    }
    close(data)
    wg.Wait()
}

func Nopool() {
    wg := new(sync.WaitGroup)

    for i := 0; i < 10000; i++ {
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
        }(i)
    }
    wg.Wait()
}

func BenchmarkGopool(b *testing.B) {
    for i := 0; i < b.N; i++ {
        Gopool()
    }
}

func BenchmarkNopool(b *testing.B) {
    for i := 0; i < b.N; i++ {
        Nopool()
    }
}

The final test results are as follows. The code using the goroutine pool indeed has a shorter execution time.

$ go test -bench='.' gopool_test.go
BenchmarkGopool-8            500       2696750 ns/op
BenchmarkNopool-8            500       3204035 ns/op
PASS

2. Upgraded Version

For a good goroutine pool, we often have more requirements. One of the most pressing is the ability to customize the function that a goroutine runs. A task is essentially a function address plus its arguments. But what if the submitted functions have different shapes (different parameters or return values)? A relatively simple approach is to introduce reflection.

package main

import (
    "fmt"
    "reflect"
    "sync"
)

type worker struct {
    Func interface{}
    Args []reflect.Value
}

func main() {
    var wg sync.WaitGroup

    channels := make(chan worker, 10)
    for i := 0; i < 5; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for ch := range channels {
                reflect.ValueOf(ch.Func).Call(ch.Args)
            }
        }()
    }

    for i := 0; i < 100; i++ {
        wk := worker{
            Func: func(x, y int) {
                fmt.Println(x + y)
            },
            Args: []reflect.Value{reflect.ValueOf(i), reflect.ValueOf(i)},
        }
        channels <- wk
    }
    close(channels)
    wg.Wait()
}

However, reflection brings its own performance cost. The goroutine pool was designed to solve a performance problem, and now we have introduced a new one. So what should we do? Use closures.

package main

import (
    "fmt"
    "sync"
)

type worker struct {
    Func func()
}

func main() {
    var wg sync.WaitGroup

    channels := make(chan worker, 10)

    for i := 0; i < 5; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for ch := range channels {
                ch.Func()
            }
        }()
    }

    for i := 0; i < 100; i++ {
        j := i // copy the loop variable so each closure captures its own value
        wk := worker{
            Func: func() {
                fmt.Println(j + j)
            },
        }
        channels <- wk
    }
    close(channels)
    wg.Wait()
}

It is worth noting that closures in Go can easily cause problems if used carelessly: the key point is that a closure captures a reference to a variable, not a copy of its value. This is only a simplified goroutine pool; a real implementation must handle many more details, such as a stop channel to shut the pool down. But the core of a goroutine pool is here.

3. Relationship between Goroutine Pool and CPU Cores

So, is there a relationship between the number of goroutines in the pool and the number of CPU cores? It depends on the case.

1. The goroutine pool is not fully loaded

This means that as soon as data appears in the data channel, a goroutine takes it away. In this case, as long as the CPU can schedule them, a pool size equal to the number of CPU cores is optimal. Tests confirm this.

2. Data in the data channel backs up

This means there are not enough goroutines. If the tasks are not CPU-intensive (most are not) and mostly block on I/O, then within a certain range, more goroutines generally help. The exact range depends on the specific workload and needs to be analyzed case by case.
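The I/O-bound case can be demonstrated by simulating blocking I/O with a short sleep and timing the same batch of tasks with different pool sizes. The helper name timePool is hypothetical, and the measured times will of course vary by machine.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// timePool runs 40 simulated I/O-bound tasks (a 1ms sleep each) on a
// pool of the given size and returns the wall-clock time. Illustrative only.
func timePool(workers int) time.Duration {
	start := time.Now()
	tasks := make(chan int)
	wg := new(sync.WaitGroup)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range tasks {
				time.Sleep(time.Millisecond) // stand-in for blocking I/O
			}
		}()
	}
	for i := 0; i < 40; i++ {
		tasks <- i
	}
	close(tasks)
	wg.Wait()
	return time.Since(start)
}

func main() {
	// With tasks that block on I/O, 40 workers finish the batch much
	// faster than 4, even on a machine with far fewer than 40 cores.
	fmt.Println("4 workers: ", timePool(4))
	fmt.Println("40 workers:", timePool(40))
}
```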

Leapcell: The Next-Gen Serverless Platform for Golang App Hosting

Finally, I would like to recommend Leapcell, a platform well suited to deploying Golang services.


1. Multi-Language Support

  • Develop with JavaScript, Python, Go, or Rust.

2. Deploy unlimited projects for free

  • Pay only for usage — no requests, no charges.

3. Unbeatable Cost Efficiency

  • Pay-as-you-go with no idle charges.
  • Example: $25 supports 6.94M requests at a 60ms average response time.

4. Streamlined Developer Experience

  • Intuitive UI for effortless setup.
  • Fully automated CI/CD pipelines and GitOps integration.
  • Real-time metrics and logging for actionable insights.

5. Effortless Scalability and High Performance

  • Auto-scaling to handle high concurrency with ease.
  • Zero operational overhead — just focus on building.


Explore more in the documentation!

Leapcell Twitter: https://x.com/LeapcellHQ
