0. Introduction
Previously, I mentioned that when Go's native HTTP server handles client connections, it spawns a goroutine for each connection, which is a rather brute-force approach. To understand this more deeply, let's look at the Go source code. First, define the simplest HTTP server:
package main

import (
	"fmt"
	"log"
	"net/http"
)

func myHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello there!\n")
}

func main() {
	http.HandleFunc("/", myHandler) // set the access route
	log.Fatal(http.ListenAndServe(":8080", nil))
}
Follow the entry point, the http.ListenAndServe function.
// file: net/http/server.go
func ListenAndServe(addr string, handler Handler) error {
	server := &Server{Addr: addr, Handler: handler}
	return server.ListenAndServe()
}

func (srv *Server) ListenAndServe() error {
	addr := srv.Addr
	if addr == "" {
		addr = ":http"
	}
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return err
	}
	return srv.Serve(tcpKeepAliveListener{ln.(*net.TCPListener)})
}

func (srv *Server) Serve(l net.Listener) error {
	defer l.Close()
	...
	for {
		rw, e := l.Accept()
		if e != nil {
			// error handling (elided)
			return e
		}
		tempDelay = 0
		c, err := srv.newConn(rw)
		if err != nil {
			continue
		}
		c.setState(c.rwc, StateNew) // before Serve can return
		go c.serve()
	}
}
First, net.Listen listens on the network port. rw, e := l.Accept() then retrieves a TCP connection from the listener, and go c.serve() spawns a goroutine for each TCP connection to handle it. I also mentioned that the fasthttp framework performs better than the native net/http framework, and one of the reasons is its use of a goroutine pool. So the question is: if we were to implement a goroutine pool ourselves, how would we do it? Let's start with the simplest implementation.
1. Weak Version
In Go, goroutines are launched with the go keyword. Goroutine resources are different from temporary object pools: a goroutine cannot be put back and retrieved again later. So the goroutines in a pool should keep running, working when tasks arrive and blocking when idle, which has little impact on the scheduling of other goroutines. Tasks can be delivered to the goroutines through a channel. Here is a simple, weak version:
func Gopool() {
	start := time.Now()
	wg := new(sync.WaitGroup)
	data := make(chan int, 100)
	// Start a fixed pool of 10 worker goroutines.
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			for v := range data {
				fmt.Println("goroutine:", n, v)
			}
		}(i)
	}
	// Feed 10000 tasks to the pool.
	for i := 0; i < 10000; i++ {
		data <- i
	}
	close(data)
	wg.Wait()
	end := time.Now()
	fmt.Println(end.Sub(start))
}
The above code also calculates the running time of the program. For comparison, here is a version without using a pool:
func Nopool() {
	start := time.Now()
	wg := new(sync.WaitGroup)
	for i := 0; i < 10000; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			//fmt.Println("goroutine", n)
		}(i)
	}
	wg.Wait()
	end := time.Now()
	fmt.Println(end.Sub(start))
}
Finally, comparing the running times, the pooled version runs in about 2/3 of the time of the version without a pool. Of course, this test is still rough. Next, we use Go's benchmark testing method, introduced in the reflect article, to test. The test code is as follows (irrelevant code removed):
package pool

import (
	"sync"
	"testing"
)

func Gopool() {
	wg := new(sync.WaitGroup)
	data := make(chan int, 100)
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			for range data {
			}
		}(i)
	}
	for i := 0; i < 10000; i++ {
		data <- i
	}
	close(data)
	wg.Wait()
}

func Nopool() {
	wg := new(sync.WaitGroup)
	for i := 0; i < 10000; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
		}(i)
	}
	wg.Wait()
}

func BenchmarkGopool(b *testing.B) {
	for i := 0; i < b.N; i++ {
		Gopool()
	}
}

func BenchmarkNopool(b *testing.B) {
	for i := 0; i < b.N; i++ {
		Nopool()
	}
}
The final test results are as follows. The code using the goroutine pool indeed has a shorter execution time.
$ go test -bench='.' gopool_test.go
BenchmarkGopool-8 500 2696750 ns/op
BenchmarkNopool-8 500 3204035 ns/op
PASS
2. Upgraded Version
For a good goroutine pool, we often have more requirements. One of the most pressing is the ability to customize the function a goroutine runs. A function is essentially a function address plus its parameters. But what if the functions to be submitted have different signatures (different parameters or return values)? A relatively simple approach is to introduce reflection.
package main

import (
	"fmt"
	"reflect"
	"sync"
)

type worker struct {
	Func interface{}     // the function to run, stored as an empty interface
	Args []reflect.Value // its arguments, wrapped as reflect.Values
}

func main() {
	var wg sync.WaitGroup
	channels := make(chan worker, 10)
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for ch := range channels {
				reflect.ValueOf(ch.Func).Call(ch.Args)
			}
		}()
	}
	for i := 0; i < 100; i++ {
		wk := worker{
			Func: func(x, y int) {
				fmt.Println(x + y)
			},
			Args: []reflect.Value{reflect.ValueOf(i), reflect.ValueOf(i)},
		}
		channels <- wk
	}
	close(channels)
	wg.Wait()
}
However, introducing reflection brings its own performance cost. The goroutine pool was designed to solve a performance problem, and now we have introduced a new one. So what should we do? Use closures.
package main

import (
	"fmt"
	"sync"
)

type worker struct {
	Func func() // a closure captures the arguments, so no reflection is needed
}

func main() {
	var wg sync.WaitGroup
	channels := make(chan worker, 10)
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for ch := range channels {
				//reflect.ValueOf(ch.Func).Call(ch.Args)
				ch.Func()
			}
		}()
	}
	for i := 0; i < 100; i++ {
		j := i // copy the loop variable so each closure captures its own value
		wk := worker{
			Func: func() {
				fmt.Println(j + j)
			},
		}
		channels <- wk
	}
	close(channels)
	wg.Wait()
}
It is worth noting that closures in Go can easily cause problems if used carelessly. A key point in understanding them is that a closure captures a reference to a variable, not a copy; that is why the code above copies the loop variable (j := i) before capturing it (in Go versions before 1.22, the loop variable is shared across iterations). This is just a simplified version of a goroutine pool. A real implementation needs to consider many details, such as setting up a stop channel to shut down the pool, but the core of the goroutine pool lies here.
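To make that last point concrete, here is a minimal sketch of how a stop channel could be wired into the pool. The Pool type, its field names, and the drop-pending-tasks shutdown policy are assumptions made for this example, not part of the original code:

package main

import (
	"fmt"
	"sync"
)

// Pool is an illustrative goroutine pool with a stop channel.
type Pool struct {
	tasks chan func()   // queued tasks
	stop  chan struct{} // closed to ask workers to exit
	wg    sync.WaitGroup
}

func NewPool(workers, queueSize int) *Pool {
	p := &Pool{
		tasks: make(chan func(), queueSize),
		stop:  make(chan struct{}),
	}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for {
				select {
				case <-p.stop:
					// Pool stopped: exit even if tasks are still queued.
					return
				case task, ok := <-p.tasks:
					if !ok {
						// Task channel closed: normal exit after draining.
						return
					}
					task()
				}
			}
		}()
	}
	return p
}

// Submit queues a task, blocking if the queue is full.
func (p *Pool) Submit(task func()) { p.tasks <- task }

// Stop asks all workers to exit without draining the queue.
func (p *Pool) Stop() {
	close(p.stop)
	p.wg.Wait()
}

func main() {
	p := NewPool(5, 10)
	for i := 0; i < 20; i++ {
		j := i // copy the loop variable for the closure, as above
		p.Submit(func() { fmt.Println("task", j) })
	}
	p.Stop() // queued tasks may be dropped; output is nondeterministic
}

Closing stop wakes every worker blocked in the select, so the pool can shut down even if the task channel is never closed; a graceful variant would close the task channel instead and let the workers drain it.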
3. Relationship between Goroutine Pool and CPU Cores
So, is there a relationship between the number of goroutines in the goroutine pool and the number of CPU cores? This depends on the situation, so let's consider two cases.
1. The goroutine pool is not fully utilized
This means that as soon as data appears in the data channel, a goroutine takes it away. In this case, as long as the CPU can schedule them all, making the number of goroutines in the pool equal to the number of CPU cores is optimal. Tests have confirmed this.
2. Data in the data channel is backed up
This means there are not enough goroutines. If the goroutines' tasks are not CPU-intensive (most are not) and mostly block on I/O, then generally, within a certain range, the more goroutines the better. The exact range depends on the specific workload.
Leapcell: The Next-Gen Serverless Platform for Golang App Hosting
Finally, I would like to recommend Leapcell, a platform that is well suited to deploying Golang services.
1. Multi-Language Support
- Develop with JavaScript, Python, Go, or Rust.
2. Deploy unlimited projects for free
- Pay only for usage — no requests, no charges.
3. Unbeatable Cost Efficiency
- Pay-as-you-go with no idle charges.
- Example: $25 supports 6.94M requests at a 60ms average response time.
4. Streamlined Developer Experience
- Intuitive UI for effortless setup.
- Fully automated CI/CD pipelines and GitOps integration.
- Real - time metrics and logging for actionable insights.
5. Effortless Scalability and High Performance
- Auto-scaling to handle high concurrency with ease.
- Zero operational overhead — just focus on building.
Explore more in the documentation!
Leapcell Twitter: https://x.com/LeapcellHQ