Kevin Wan

Posted on May 3, 2022

My best practices on Go fuzzing

#go #webdev #microservices #testing

As programmers, we all hope that our code is bug-free! But the fact is that "bug-free" can only be disproved, never proven. Go 1.18 officially provides a great tool that helps us find those bugs (in most cases): go fuzzing.

Most of the attention on Go 1.18 goes to generics, but I really think go fuzzing is the most useful feature of Go 1.18 at the moment, even compared with generics!

In this article, we'll take a closer look at go fuzzing:

What is it?
How to use go fuzzing?
What are the best practices?

What is go fuzzing

According to the official documentation, go fuzzing is a way to automate tests by continuously feeding different inputs to a program and analyzing the code coverage to intelligently find failing cases. The problems it uncovers are usually ones that are hard to find by other means.

How to use go fuzzing

The official rules for writing fuzz tests are:

The function name must start with Fuzz, its only argument is *testing.F, and it has no return value
Fuzz tests must be in a *_test.go file
The fuzz target is a call to (*testing.F).Fuzz; the function passed to it takes *testing.T as its first parameter, followed by the fuzzing arguments, and has no return value
There can be only one fuzz target in each fuzz test
Calling f.Add(...) requires arguments of the same order and types as the fuzzing arguments
Fuzzing arguments support only the following types:
- string, []byte
- int, int8, int16, int32/rune, int64
- uint, uint8/byte, uint16, uint32, uint64
- float32, float64
- bool
The fuzz target must not rely on global state, because it runs in parallel.
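
The rules above can be sketched in a small test file. This is only an illustration, not code from go-zero: JoinN is a hypothetical function invented for the example, and the property it checks (the length of the joined string) is just one possible way to verify a result.

```go
package main

import (
	"strings"
	"testing"
)

// JoinN is a hypothetical function under test (not from the article):
// it joins n copies of s with commas.
func JoinN(s string, n uint8) string {
	parts := make([]string, n)
	for i := range parts {
		parts[i] = s
	}
	return strings.Join(parts, ",")
}

// FuzzJoinN starts with "Fuzz", takes only *testing.F, and returns nothing.
func FuzzJoinN(f *testing.F) {
	// Seed values must match the fuzzing arguments in order and type:
	// (string, uint8).
	f.Add("a", uint8(3))

	// The single fuzz target: *testing.T first, then the fuzzing arguments.
	f.Fuzz(func(t *testing.T, s string, n uint8) {
		got := JoinN(s, n)

		// Expected length: n copies of s plus n-1 separators.
		want := int(n) * len(s)
		if n > 0 {
			want += int(n) - 1
		}
		if len(got) != want {
			t.Errorf("JoinN(%q, %d) has length %d, want %d", s, n, len(got), want)
		}
	})
}
```

Note that the fuzz target only uses its own arguments and local variables, which keeps it independent of global state.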

Run `fuzzing tests`

If I write a fuzzing test, e.g.



// See https://github.com/zeromicro/go-zero/blob/master/core/mr/mapreduce_fuzz_test.go for the specific code
func FuzzMapReduce(f *testing.F) {
  ...
}

Then we can execute it like this.



go test -fuzz=MapReduce

We would get something like the following result.



fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
fuzz: elapsed: 0s, gathering baseline coverage: 2/2 completed, now fuzzing with 10 workers
fuzz: elapsed: 3s, execs: 3338 (1112/sec), new interesting: 56 (total: 57)
fuzz: elapsed: 6s, execs: 6770 (1144/sec), new interesting: 62 (total: 63)
fuzz: elapsed: 9s, execs: 10157 (1129/sec), new interesting: 69 (total: 70)
fuzz: elapsed: 12s, execs: 13586 (1143/sec), new interesting: 72 (total: 73)
^Cfuzz: elapsed: 13s, execs: 14031 (1084/sec), new interesting: 72 (total: 73)
PASS
ok github.com/zeromicro/go-zero/core/mr 13.169s

The ^C appears because I pressed Ctrl-C to stop the test; see the official documentation for details.

Best practices in go-zero

Based on my experience with go-zero, I've summarized the best practices in four preliminary steps.

1. Define the fuzzing arguments: first figure out what the fuzzing arguments should be, then write the fuzzing target around them.
2. Think about how to write the fuzzing target: the focus is on verifying the correctness of the results. Because the fuzzing arguments are given "randomly", there has to be a general way to verify the results.
3. Think about how to print the input of a failed case, so that a new unit test can be generated from it.
4. Write a new unit test based on the output of the failed fuzzing test. This unit test is used to debug the problem found by fuzzing and is then kept for CI.

Next, I'll demonstrate these steps with a simple slice summation function. The real cases in go-zero are somewhat more complicated, so at the end of the article I'll point to how go-zero applies go fuzzing internally, as a reference for writing complex scenarios.

Here is a bug-injected implementation of the summation code.



func Sum(vals []int64) int64 {
  var total int64

  for _, val := range vals {
    if val%1e5 != 0 {
      total += val
    }
  }

  return total
}

1. Define `fuzzing arguments`

You need at least one fuzzing argument, otherwise go fuzzing has nothing to generate inputs from. So even if there is no obvious input, we must define a fuzzing argument that affects the result. Here we use the number of slice elements as the fuzzing argument, and go fuzzing will automatically generate different values for it based on the code coverage it observes.



func FuzzSum(f *testing.F) {
  f.Add(10)
  f.Fuzz(func(t *testing.T, n int) {
    n %= 20
    ...
  })
}

Here n is the number of slice elements that go fuzzing generates. To keep the slice from growing too large, we limit n to fewer than 20 elements (0 is fine). We also add the value 10 as a seed input (called the corpus in go fuzzing); it only gives go fuzzing a starting point for its cold start, so the exact value doesn't matter.

2. How to write the `fuzzing target`

This step focuses on writing a verifiable fuzzing target: test code that, from the given fuzzing arguments, generates data and verifies the correctness of the results.

For our Sum function this is relatively simple: generate a random slice of n elements and add up the expected result as we go, as follows.



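import (
  "math/rand"
  "testing"
  "time"

  "github.com/stretchr/testify/assert"
)

func FuzzSum(f *testing.F) {
  rand.Seed(time.Now().UnixNano())

  // Seed corpus: start with a slice of 10 elements.
  f.Add(10)
  f.Fuzz(func(t *testing.T, n int) {
    // Keep the slice short; a negative n yields an empty slice.
    n %= 20
    var vals []int64
    var expect int64
    for i := 0; i < n; i++ {
      // Generate a random element and add it to our own expected sum.
      val := rand.Int63() % 1e6
      vals = append(vals, val)
      expect += val
    }

    // Sum must agree with our own summation.
    assert.Equal(t, expect, Sum(vals))
  })
}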

This code is easy to follow: we compute the sum ourselves and compare it with the result of Sum, so I won't explain it in detail. In complex scenarios you need to think carefully about how to write the verification code, but it should not be too difficult. If it is, you may not understand the function under test well enough, or the function may need to be simplified.

At this point, you can run fuzzing tests with the following command, and the result will be similar to the following.



$ go test -fuzz=Sum
fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
fuzz: elapsed: 0s, gathering baseline coverage: 2/2 completed, now fuzzing with 10 workers
fuzz: elapsed: 0s, execs: 6672 (33646/sec), new interesting: 7 (total: 6)
--- FAIL: FuzzSum (0.21s)
    --- FAIL: FuzzSum (0.00s)
        sum_fuzz_test.go:34:
              Error Trace: sum_fuzz_test.go:34
                                  value.go:556
                                  value.go:339
                                  fuzz.go:334
              Error: Not equal:
                            expected: 8736932
                            actual : 8636932
              Test: FuzzSum

    Failing input written to testdata/fuzz/FuzzSum/739002313aceff0ff5ef993030bbde9115541cabee2554e6c9f3faaf581f2004
    To re-run:
    go test -run=FuzzSum/739002313aceff0ff5ef993030bbde9115541cabee2554e6c9f3faaf581f2004
FAIL
exit status 1
FAIL github.com/kevwan/fuzzing 0.614s

So here's the problem! We can see that the result is wrong, but it is hard to tell why. Look closely at the output above: how would you analyze it?

3. How to print the input for the failed case

For the failed test above, if we could print the input and turn it into a simple test case, we could debug it directly. Ideally the printed input can be copied and pasted straight into the new test case. If the format is not right, adjusting so many lines of input one by one is tedious, and there may be more than one failing case.

So we changed the code to the following.



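// Requires the additional imports "fmt" and "strings".
func FuzzSum(f *testing.F) {
  rand.Seed(time.Now().UnixNano())

  f.Add(10)
  f.Fuzz(func(t *testing.T, n int) {
    n %= 20
    var vals []int64
    var expect int64
    // buf collects the generated values, one per line, in a form that
    // can be pasted into a []int64 literal.
    var buf strings.Builder
    buf.WriteString("\n")
    for i := 0; i < n; i++ {
      val := rand.Int63() % 1e6
      vals = append(vals, val)
      expect += val
      buf.WriteString(fmt.Sprintf("%d,\n", val))
    }

    // On failure, the collected input is printed as the assertion message.
    assert.Equal(t, expect, Sum(vals), buf.String())
  })
}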

Running the command again gives the following result.



$ go test -fuzz=Sum
fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
fuzz: elapsed: 0s, gathering baseline coverage: 2/2 completed, now fuzzing with 10 workers
fuzz: elapsed: 0s, execs: 1402 (10028/sec), new interesting: 10 (total: 8)
--- FAIL: FuzzSum (0.16s)
    --- FAIL: FuzzSum (0.00s)
        sum_fuzz_test.go:34:
              Error Trace: sum_fuzz_test.go:34
                                  value.go:556
                                  value.go:339
                                  fuzz.go:334
              Error: Not equal:
                            expected: 5823336
                            actual : 5623336
              Test: FuzzSum
              Messages:
                            799023,
                            110387,
                            811082,
                            115543,
                            859422,
                            997646,
                            200000,
                            399008,
                            7905,
                            931332,
                            591988,

    Failing input written to testdata/fuzz/FuzzSum/26d024acf85aae88f3291bf7e1c6f473eab8b051f2adb1bf05d4491bc49f5767
    To re-run:
    go test -run=FuzzSum/26d024acf85aae88f3291bf7e1c6f473eab8b051f2adb1bf05d4491bc49f5767
FAIL
exit status 1
FAIL github.com/kevwan/fuzzing 0.602s
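
One limitation of the test above is that the failing input saved under testdata/fuzz records only n, while the slice values come from a time-seeded random source, so re-running the saved input may not regenerate the same values. A possible variant, which is not the approach taken in this article, is to make the random seed a second fuzzing argument. The sketch below assumes a hypothetical helper genVals and a hypothetical test name FuzzSumSeeded, and uses only the standard library instead of assert.

```go
package main

import (
	"math/rand"
	"testing"
)

// Sum is the bug-injected implementation from the article.
func Sum(vals []int64) int64 {
	var total int64
	for _, val := range vals {
		if val%1e5 != 0 {
			total += val
		}
	}
	return total
}

// genVals is a hypothetical helper: for a given (n, seed) pair it always
// produces the same slice and the same expected sum.
func genVals(n int, seed int64) ([]int64, int64) {
	r := rand.New(rand.NewSource(seed)) // local source, no global state
	n %= 20
	var vals []int64
	var expect int64
	for i := 0; i < n; i++ {
		val := r.Int63() % 1e6
		vals = append(vals, val)
		expect += val
	}
	return vals, expect
}

// FuzzSumSeeded takes both the element count and the seed as fuzzing
// arguments, so a saved failing input determines the whole slice.
func FuzzSumSeeded(f *testing.F) {
	f.Add(10, int64(1))
	f.Fuzz(func(t *testing.T, n int, seed int64) {
		vals, expect := genVals(n, seed)
		if got := Sum(vals); got != expect {
			t.Errorf("Sum(%v) = %d, want %d", vals, got, expect)
		}
	})
}
```

With this shape, printing the input is still useful for building a unit test, but the saved corpus entry should be enough on its own to regenerate the same slice.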

4. Write a new test case

Based on the output of the failed case above, we can write the following code. The test skeleton is written by hand, and the input values are pasted in directly from the output.



func TestSumFuzzCase1(t *testing.T) {
  vals := []int64{
    799023,
    110387,
    811082,
    115543,
    859422,
    997646,
    200000,
    399008,
    7905,
    931332,
    591988,
  }
  assert.Equal(t, int64(5823336), Sum(vals))
}

This makes it easy to debug and to add a valid unit test to ensure that the bug never comes up again.
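
The article does not show the fix itself. As a minimal sketch, assuming the intended behaviour is simply to add up every element, the corrected function and the regression test might look like this. SumFixed and TestSumFixedFuzzCase1 are names made up for the example, and the check uses the standard library rather than assert to keep it self-contained.

```go
package main

import "testing"

// SumFixed is a hypothetical corrected version of Sum: the injected
// condition that skipped multiples of 1e5 is removed.
func SumFixed(vals []int64) int64 {
	var total int64
	for _, val := range vals {
		total += val
	}
	return total
}

// TestSumFixedFuzzCase1 replays the failing input found by the fuzz test.
func TestSumFixedFuzzCase1(t *testing.T) {
	vals := []int64{
		799023,
		110387,
		811082,
		115543,
		859422,
		997646,
		200000, // the value the buggy Sum skipped
		399008,
		7905,
		931332,
		591988,
	}
	if got, want := SumFixed(vals), int64(5823336); got != want {
		t.Errorf("SumFixed(vals) = %d, want %d", got, want)
	}
}
```

The difference between the expected and actual values in the failing output is exactly 200000, which points at the one element that is a multiple of 1e5.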

More experience with `go fuzzing`

Go versioning issues

Even though Go 1.18 has been released, most projects will not immediately upgrade their production code to 1.18. So what do we do if testing.F, which go fuzzing introduces, is not available there?

If the production code (go.mod) has not been upgraded to Go 1.18 but we have upgraded locally (which I fully recommend), we just need to put the FuzzSum above into a file with a name like sum_fuzz_test.go and add the following directives at the top of the file.



//go:build go1.18
// +build go1.18

Note: The third line must be a blank line, otherwise the directives will be treated as part of the package comment.

This way the production build reports no errors no matter which Go version it uses, and since we usually run fuzz tests locally, they are not affected.

go fuzzing does not reproduce failures

The steps above work for simple cases. Sometimes things get more complicated: a new unit test built from the failing input does not reproduce the problem (especially when a goroutine deadlock is involved), as in the following output.



go test -fuzz=MapReduce
fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
fuzz: elapsed: 0s, gathering baseline coverage: 2/2 completed, now fuzzing with 10 workers
fuzz: elapsed: 3s, execs: 3681 (1227/sec), new interesting: 54 (total: 55)
...
fuzz: elapsed: 1m21s, execs: 92705 (1101/sec), new interesting: 85 (total: 86)
--- FAIL: FuzzMapReduce (80.96s)
    fuzzing process hung or terminated unexpectedly: exit status 2
    Failing input written to testdata/fuzz/FuzzMapReduce/ee6a61e8c968adad2e629fba11984532cac5d177c4899d3e0b7c2949a0a3d840
    To re-run:
    go test -run=FuzzMapReduce/ee6a61e8c968adad2e629fba11984532cac5d177c4899d3e0b7c2949a0a3d840
FAIL
exit status 1
FAIL github.com/zeromicro/go-zero/core/mr 81.471s

In this case, all we are told is that the fuzzing process hung or terminated unexpectedly, with exit status 2, and re-running the failing input usually does not reproduce the problem. Why is only the exit status reported? I went through the source code of go fuzzing: each fuzzing test runs in a separate process, and go fuzzing discards that process's output and shows only its exit status. So how do we solve this problem?

After careful analysis, I decided to write a regular unit test that imitates the fuzzing test. It keeps the failure in the same process and prints the error message to standard output. The code is roughly as follows.



func TestSumFuzzRandom(t *testing.T) {
  const times = 100000
  rand.Seed(time.Now().UnixNano())

  // Each iteration plays the role of one fuzzing execution.
  for i := 0; i < times; i++ {
    n := rand.Intn(20)
    var vals []int64
    var expect int64
    // buf collects the generated input so it can be printed on failure.
    var buf strings.Builder
    buf.WriteString("\n")
    for j := 0; j < n; j++ {
      val := rand.Int63() % 1e6
      vals = append(vals, val)
      expect += val
      buf.WriteString(fmt.Sprintf("%d,\n", val))
    }

    assert.Equal(t, expect, Sum(vals), buf.String())
  }
}

This way we get a simple simulation of go fuzzing of our own, and any failure produces clear output. There may be a way to control this within go fuzzing itself that a closer study would reveal; if you know of one, please let me know.

This simulated case takes a long time to run, so we don't want it executed on every CI run. I put it in a separate file named something like sum_fuzzcase_test.go and added the following directives at the top of the file.



//go:build fuzz
// +build fuzz

This way we can add -tags fuzz whenever we need to run this simulated case, e.g.



go test -tags fuzz ./...

Complex usage examples

The example above is still relatively simple. If you run into a complex scenario and don't know how to write the test, you can first look at how go-zero applies go fuzzing, as follows.