Trung Duong
ByteDance/Sonic: A Lightning-Fast JSON Library for Go

"In the world of microservices, every millisecond counts. See how TikTok's engineering team revolutionized JSON processing in Go."

Have you ever deployed a Go service and watched your CPU spike when processing thousands of JSON requests? You're not alone. TikTok's engineers faced this exact challenge at a massive scale - billions of requests per day. Their solution? They created Sonic, a JSON library that's changing the game for Go developers.

The JSON Problem Every Go Developer Knows

If you're a Go developer, you've probably used the standard encoding/json package. It works, but let's be real - it's not the fastest kid on the block. And if you're building modern web services or APIs, JSON is everywhere, from REST payloads to configuration files. At TikTok's scale, with services processing millions of JSON requests per second, even a small improvement in JSON handling makes a real difference in server costs and user experience.

Let's look at a common scenario:

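// Typical JSON decoding with the standard library
import "encoding/json"

// Post is a placeholder; the Title field is illustrative.
type Post struct {
    Title string `json:"title"`
}

type User struct {
    Name  string `json:"name"`
    Posts []Post `json:"posts"`
}

// decodeUser runs once per request, millions of times at scale.
func decodeUser(data []byte) (User, error) {
    var user User
    err := json.Unmarshal(data, &user)
    return user, err
}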

At small scale, this works fine. But when you're handling TikTok-scale traffic, those milliseconds add up to significant server costs and latency. That's exactly why ByteDance's team decided to tackle this challenge head-on.

The Speed Champion: Meet Sonic

Let's look at some real numbers first. These are the figures published in the Sonic repository for a medium-sized JSON document (about 13KB):

                Time per op    Memory per op    Allocations per op
Standard JSON:  106,322 ns     49,136 bytes     789
Sonic:          32,393 ns      11,965 bytes     4

That works out to roughly 3.3 times faster, with about a quarter of the memory and a tiny fraction of the allocations.

The Sonic repository also publishes results for two other payload sizes:

  • Small (400B, 11 keys, 3 layers)
  • Large (635KB, 10000+ keys, 6 layers)

See bench.sh in the Sonic repository for the benchmark code.
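
If you're wondering where figures like ns/op and allocations per operation come from, here is a minimal sketch that measures encoding/json with Go's own testing package. The profile type and the measureMarshal helper are made up for illustration; this is not Sonic's benchmark suite, and the numbers you get will depend on your machine.

```go
package main

import (
	"encoding/json"
	"testing"
)

// profile is a hypothetical payload, far smaller than the 13KB sample above.
type profile struct {
	Name string   `json:"name"`
	Tags []string `json:"tags"`
}

// measureMarshal runs json.Marshal under the testing package's benchmark
// driver and returns the collected timing and allocation statistics.
func measureMarshal() testing.BenchmarkResult {
	v := profile{Name: "gopher", Tags: []string{"go", "json"}}
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := json.Marshal(v); err != nil {
				b.Fatal(err)
			}
		}
	})
}
```

The returned result exposes NsPerOp(), AllocedBytesPerOp(), and AllocsPerOp(), which correspond to the three columns in the comparison above.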

The Secret Sauce: Four Simple Tricks

1. Just-In-Time Compilation (JIT)

Imagine you're a chef. Instead of following a generic recipe every time, you create a special, optimized recipe for each specific dish you make frequently. That's what JIT does!

  • Regular Go JSON library: Runs every value through the same generic, reflection-based code
  • Sonic: Generates specialized code at runtime for each of your JSON structures, then caches and reuses it
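
Real JIT compilation emits machine code, which is well beyond a blog snippet. The "specialize once, reuse many times" idea can still be sketched in plain Go: inspect a type a single time, build a closure tailored to it, and cache that closure. Everything below (encoderFor, buildEncoder, marshalSketch) is hypothetical and only handles strings, ints, and structs made of those; it is not how Sonic is implemented.

```go
package main

import (
	"reflect"
	"strconv"
	"strings"
	"sync"
)

// encoderFunc appends the JSON form of one value to the builder.
type encoderFunc func(b *strings.Builder, v reflect.Value)

// encoders caches one specialized encoderFunc per Go type.
var encoders sync.Map // reflect.Type -> encoderFunc

// encoderFor returns the cached encoder for t, building it on first use.
func encoderFor(t reflect.Type) encoderFunc {
	if f, ok := encoders.Load(t); ok {
		return f.(encoderFunc)
	}
	f := buildEncoder(t)
	encoders.Store(t, f)
	return f
}

// buildEncoder inspects t once and returns a closure specialized for it.
// This sketch supports only string, int, and structs made of those, and it
// treats the whole `json` tag as the key name (tag options are not parsed).
func buildEncoder(t reflect.Type) encoderFunc {
	switch t.Kind() {
	case reflect.String:
		// strconv.Quote matches JSON quoting for plain ASCII text only.
		return func(b *strings.Builder, v reflect.Value) {
			b.WriteString(strconv.Quote(v.String()))
		}
	case reflect.Int:
		return func(b *strings.Builder, v reflect.Value) {
			b.WriteString(strconv.FormatInt(v.Int(), 10))
		}
	case reflect.Struct:
		// Precompute each `"key":` prefix and field encoder up front.
		type step struct {
			prefix string
			enc    encoderFunc
		}
		steps := make([]step, t.NumField())
		for i := range steps {
			f := t.Field(i)
			name := f.Tag.Get("json")
			if name == "" {
				name = f.Name
			}
			sep := ","
			if i == 0 {
				sep = "{"
			}
			steps[i] = step{sep + strconv.Quote(name) + ":", buildEncoder(f.Type)}
		}
		return func(b *strings.Builder, v reflect.Value) {
			if len(steps) == 0 {
				b.WriteByte('{')
			}
			for i, s := range steps {
				b.WriteString(s.prefix)
				s.enc(b, v.Field(i))
			}
			b.WriteByte('}')
		}
	}
	panic("unsupported kind: " + t.Kind().String())
}

// marshalSketch encodes v using the cached encoder for its type.
func marshalSketch(v any) string {
	var b strings.Builder
	rv := reflect.ValueOf(v)
	encoderFor(rv.Type())(&b, rv)
	return b.String()
}
```

The first call for a given type pays the cost of walking its fields; later calls go straight to the cached closure. A JIT takes the same idea one step further by replacing the closure with generated machine code.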

2. SIMD: Doing More Work at Once

Think of SIMD (Single Instruction, Multiple Data) like having multiple hands to do a task:

  • Regular way: Sort cards one by one
  • SIMD way: Sort multiple cards at once

Sonic uses these "multiple hands" (SIMD instructions) to examine many bytes of JSON with a single CPU instruction instead of one byte at a time, making everything faster.
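
Go has no portable way to write SIMD instructions directly, but you can get a feel for the idea with a related trick called SWAR ("SIMD within a register"): load eight bytes into one uint64 and test all of them with a handful of arithmetic operations. The sketch below searches for a byte, such as the quote that ends a JSON string. It is an illustration only, not Sonic's code, and the function names are made up.

```go
package main

import (
	"encoding/binary"
	"math/bits"
)

const (
	lo = 0x0101010101010101 // 0x01 in each of the eight byte lanes
	hi = 0x8080808080808080 // 0x80 in each of the eight byte lanes
)

// indexByteScalar checks one byte per loop iteration.
func indexByteScalar(s []byte, c byte) int {
	for i, b := range s {
		if b == c {
			return i
		}
	}
	return -1
}

// indexByteSWAR checks eight bytes per loop iteration by treating them
// as a single 64-bit word.
func indexByteSWAR(s []byte, c byte) int {
	pattern := lo * uint64(c) // c repeated in every lane
	i := 0
	for ; i+8 <= len(s); i += 8 {
		word := binary.LittleEndian.Uint64(s[i:])
		x := word ^ pattern // lanes equal to c become 0x00
		// Classic zero-byte test: the lowest set bit of m marks the
		// first lane that was zero, i.e. the first occurrence of c.
		if m := (x - lo) &^ x & hi; m != 0 {
			return i + bits.TrailingZeros64(m)/8
		}
	}
	// Fewer than eight bytes left: finish one byte at a time.
	for ; i < len(s); i++ {
		if s[i] == c {
			return i
		}
	}
	return -1
}
```

Real SIMD registers are wider (16, 32, or more bytes) and come with dedicated compare instructions, so the gain over the byte-by-byte loop is larger than this sketch suggests.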

3. Smart Memory Usage

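Here's a clever trick Sonic uses: when a string in your JSON contains no escape sequences, Sonic doesn't make a copy. Instead, the decoded Go string just points into the original JSON buffer. It's like giving directions instead of drawing a new map! The trade-off is that the whole buffer stays in memory for as long as any decoded string is still in use; if that matters for your service, Sonic's CopyString option switches back to copying.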

// Example JSON
{"name": "John", "city": "New York"}

// Standard library: Makes copies of "John" and "New York"
// Sonic: Just references these strings if they're simple
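
To make "points to the original" concrete, here is a small sketch of the difference between copying a string out of a buffer and viewing it in place. It uses unsafe.String and unsafe.StringData, which need Go 1.20 or newer. The helper names are made up, and this illustrates the general zero-copy technique rather than Sonic's actual code.

```go
package main

import "unsafe"

// copyString allocates new memory and copies buf[start:end] into it.
func copyString(buf []byte, start, end int) string {
	return string(buf[start:end])
}

// viewString returns a string that points into buf without copying.
// The caller must not modify buf while the string is in use, and the whole
// buffer stays reachable for as long as the string does.
func viewString(buf []byte, start, end int) string {
	if start >= end {
		return ""
	}
	return unsafe.String(&buf[start], end-start)
}

// sharesMemory reports whether s begins at the same address as buf[start].
func sharesMemory(s string, buf []byte, start int) bool {
	return len(s) > 0 && unsafe.StringData(s) == &buf[start]
}
```

With buf holding {"name": "John"}, both helpers return "John" for the range 10 to 14, but only the view shares memory with the buffer. That saves an allocation per string, at the cost of keeping the buffer alive.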

4. Optional Features = Better Speed

Sonic makes some smart choices about what features to make optional:

  • Doesn't sort map keys by default (saves roughly 10% of processing time)
  • Doesn't escape HTML by default (saves roughly 15% of processing time)

You can still turn these features on if you need them, but by making them optional, Sonic stays fast for most common uses.
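
If you've never noticed these two features, the standard library makes them easy to see, because encoding/json does both by default. The sketch below uses only the standard library, so it shows what the features do rather than how Sonic toggles them; per Sonic's README, the matching switches are the EscapeHTML and SortMapKeys fields on sonic.Config. The helper names are made up for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
)

// marshalEscaped uses json.Marshal, which sorts map keys and escapes
// <, >, and & so the output is safe to embed in HTML.
func marshalEscaped(v any) (string, error) {
	out, err := json.Marshal(v)
	return string(out), err
}

// marshalRaw turns HTML escaping off through json.Encoder.
// Map keys are still sorted; the standard library has no switch for that.
func marshalRaw(v any) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(v); err != nil {
		return "", err
	}
	// Encode appends a trailing newline; drop it for comparison.
	return string(bytes.TrimSuffix(buf.Bytes(), []byte("\n"))), nil
}
```

For the map {"b": "<p>", "a": "x&y"}, marshalEscaped produces {"a":"x\u0026y","b":"\u003cp\u003e"} while marshalRaw produces {"a":"x&y","b":"<p>"}. Both the sorting and the escaping are extra work on every call, which is why Sonic leaves them off unless you ask.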

Design

Sonic's design document summarizes the approach in four points:

  1. Aiming at the function-call overhead cost by the codec dynamic-assembly, JIT tech is used to assemble opcodes (asm) corresponding to the schema at runtime, which is finally cached into the off-heap memory in the form of Golang functions.
  2. For practical scenarios where big data and small data coexist, we use pre-conditional judgment (string size, floating precision, etc.) to combine SIMD with scalar instructions to achieve the best adaptation.
  3. As for insufficiency in compiling optimization of go language, we decided to use C/Clang to write and compile core computational functions, and developed a set of asm2asm tools to translate the fully optimized x86 assembly into plan9 and finally load it into Golang runtime.
  4. Given the big speed gap between parsing and skipping, the lazy-load mechanism is certainly used in our AST parser, but in a more adaptive and efficient way to reduce the overhead of multiple-key queries.

In detail, we conducted some further optimization:

  1. Since the native-asm functions cannot be inlined in Golang, we found that its cost even exceeded the improvement brought by the optimization of the C compiler. So we reimplemented a set of lightweight function-calls in JIT:
    • Global-function-table + static offset for calling instruction
    • Pass parameters using registers
  2. sync.Map was used to cache the codecs at first, but for our quasi-static (reads far outnumber writes), few-element (usually no more than a few dozen) scenarios, its performance is not optimal, so we reimplemented a high-performance and concurrent-safe cache with open-addressing-hash + RCU tech.

From Sonic Design
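
The cache in that last point is worth a closer look. RCU (read-copy-update) means readers never take a lock: they read an immutable snapshot, while a writer copies the snapshot, changes the copy, and publishes it atomically. Here is a minimal sketch of that pattern, assuming Go 1.19+ for atomic.Pointer. It uses Go's built-in map instead of an open-addressing hash table, and the cowCache type is made up for illustration, so treat it as the general idea rather than Sonic's implementation.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// cowCache is a read-mostly cache. Readers load an immutable snapshot
// without locking; writers copy the snapshot and publish a new one.
type cowCache[K comparable, V any] struct {
	mu   sync.Mutex              // serializes writers only
	snap atomic.Pointer[map[K]V] // current snapshot, never mutated in place
}

// Get looks up k in the current snapshot. It never blocks.
func (c *cowCache[K, V]) Get(k K) (V, bool) {
	m := c.snap.Load()
	if m == nil {
		var zero V
		return zero, false
	}
	v, ok := (*m)[k]
	return v, ok
}

// Set copies the current snapshot, adds the entry, and publishes the copy.
// The copy costs O(n), which is acceptable when writes are rare and the
// cache holds only a few dozen entries.
func (c *cowCache[K, V]) Set(k K, v V) {
	c.mu.Lock()
	defer c.mu.Unlock()

	next := make(map[K]V)
	if old := c.snap.Load(); old != nil {
		for key, val := range *old {
			next[key] = val
		}
	}
	next[k] = v
	c.snap.Store(&next)
}
```

The zero value is ready to use: declare var codecs cowCache[string, int], call codecs.Set once per new type, and call codecs.Get on every request. Reads cost one atomic load plus a map lookup, which suits the quasi-static workload described above.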

Real-World Benefits

Let's talk numbers that matter in the real world:

  1. Memory Usage (for medium JSON):

    • Standard Go: Uses about 49KB of memory with 789 allocations
    • Sonic: Uses only about 12KB with just 4 allocations
    • That's huge when you're processing millions of requests!
  2. Practical Impact:

    • Faster API responses
    • Lower server costs
    • Better user experience
    • More requests handled per server

Should You Switch to Sonic?

Sonic might be great for you if:

  • You process lots of JSON data
  • Performance is important for your application
  • You're working on AMD64 or ARM64 processors

But remember:

  • It requires Go version 1.17 or higher
  • It works on Linux, macOS, and Windows
  • Some features (like HTML escaping) need to be explicitly enabled if you need them

Quick Start Example

Here's how simple it is to use Sonic:

package main

import "github.com/bytedance/sonic"

func main() {
    // Encoding
    data := map[string]string{"hello": "world"}
    out, err := sonic.Marshal(data)
    if err != nil {
        panic(err)
    }

    // Decoding
    var result map[string]string
    if err := sonic.Unmarshal(out, &result); err != nil {
        panic(err)
    }
}

Wrapping Up

Sonic shows us that even with something as common as JSON processing, there's still room for impressive improvements. By using modern CPU features (SIMD), smart compilation (JIT), and thoughtful design choices, it achieves remarkable performance gains over the standard library.

Remember: In software development, it's not just about making things work - sometimes, making them work faster can open up new possibilities for what your applications can achieve!

Further Reading

Want to dive deeper? Here are some great resources:

  1. Sonic GitHub Repository
  2. Technical Introduction Document
  3. Sonic Benchmark Details
  4. SIMD
  5. JIT compilation
  6. sync.Map

What do you think about Sonic? Have you tried using it in your Go projects? Let me know in the comments below!