Aryan Anand
The Missing Guide to Rust's Atomic::Ordering

When working with concurrent programming in Rust, atomic operations provide a powerful way to manage shared state safely. However, one aspect that often confuses developers—especially those new to low-level concurrency—is atomic memory ordering. The Ordering variants in Rust's std::sync::atomic module control how operations on atomics are perceived across threads, ensuring correctness while balancing performance.

In this article, we'll break down what atomic Ordering really means, why it matters, and how to choose the right ordering for your use case. We'll implement a Mutex from scratch, build up through the different orderings (Relaxed, Acquire, Release, AcqRel, and SeqCst), examine their trade-offs, and use practical examples with real-world analogies to understand the concept.

Explaining Atomics

The word atomic comes from the Greek word ἄτομος, meaning indivisible, something that cannot be cut into smaller pieces. In computer science, it is used to describe an operation that is indivisible: it is either fully completed, or it didn’t happen yet.

Atomics in Rust are used to perform small operations (add, subtract, compare-and-swap, etc.) on shared memory. A normal x = x + 1 statement compiles to a separate load, modify, and store, and another thread can sneak in between those steps. An atomic operation, on the other hand, performs the whole read-modify-write as one indivisible step that no other thread can observe halfway through, preventing data races. This makes atomics perfect for implementing Mutexes.

use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};

const LOCKED: bool = true;
const UNLOCKED: bool = false;

// Intentionally incorrect implementation of a Mutex
pub struct Mutex<T> {
    locked: AtomicBool,
    v: UnsafeCell<T>,
}

// SAFETY: only sound if the lock actually provides exclusion (it doesn't yet!)
unsafe impl<T: Send> Sync for Mutex<T> {}

impl<T> Mutex<T> {
    pub fn new(t: T) -> Self {
        Self {
            locked: AtomicBool::new(UNLOCKED),
            v: UnsafeCell::new(t),
        }
    }
    // Takes in a closure/function (what to do after getting exclusive access to <T>)
    pub fn with_lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        // For the sake of understanding we'll use a SpinLock.
        // One should never use a SpinLock in production.
        while self.locked.load(Ordering::Relaxed) != UNLOCKED {}
        self.locked.store(LOCKED, Ordering::Relaxed);
        // UnsafeCell hands us a raw pointer, so we convert it to a mutable
        // reference inside `unsafe` before passing it to our closure `f`.
        let ret = f(unsafe { &mut *self.v.get() });
        self.locked.store(UNLOCKED, Ordering::Relaxed);
        ret
    }
}

For simplicity's sake, just ignore the ordering for now and let's use Ordering::Relaxed, because we're all relaxed people in general, hehe.

But why Atomics?

CPUs and modern compilers often reorder instructions to improve performance and CPU utilisation. However, this becomes dangerous when multiple independent entities (i.e., threads) share memory: reordering can cause data races and leave threads stalled for a long time.

Now our poorly implemented Mutex falls prey to this: the instructions could get shuffled, and all our well-thought-out implementation goes to waste, since it could reorder into something like:

    pub fn with_lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        self.locked.store(UNLOCKED, Ordering::Relaxed);
        while self.locked.load(Ordering::Relaxed) != UNLOCKED {}
        self.locked.store(LOCKED, Ordering::Relaxed);
        let ret = f(unsafe { &mut *self.v.get() });
        ret
    }

This would interfere with another thread that currently holds the lock. Or it could reorder into something like:

pub fn with_lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
    self.locked.store(LOCKED, Ordering::Relaxed);
    while self.locked.load(Ordering::Relaxed) != UNLOCKED {}
    let ret = f(unsafe { &mut *self.v.get() });
    self.locked.store(UNLOCKED, Ordering::Relaxed);
    ret
}

This could lead to the mutex locking itself out of its own data, deadlocking forever and freezing the program.

How does Atomic::Ordering save us from this ?

It gives special instructions to the compiler and the CPU, i.e., when they may reorder operations and when they must not.

Ordering::Acquire/Release (together) – Ensures previous writes are visible across threads

Concept

  • Release (on the store): ensures that all writes done before the release are visible to another thread that later acquires the same atomic.
  • Acquire (on the load): prevents subsequent reads/writes from being moved before the acquire operation.

Example 1: Imagine T1 is preparing a pizza order, and T2 is the delivery person.

fn thread1() {
    DATA = 42;
    FLAG.store(true, Ordering::Release); // Guarantees DATA is written before FLAG!
}

fn thread2() {
    while !FLAG.load(Ordering::Acquire) {} // Guarantees we see the DATA update!
    println!("{}", DATA); // Always prints 42.
}


Now, as the compiler gets these instructions:

  • T1 (cook) sets DATA = 42 and then flips FLAG to true.
  • T2 (delivery) waits until FLAG == true and only then picks up DATA.
  • No chance of reading old data!

(Diagram: three threads observing the updated data.)

Example 2 : Think of T1 as a warehouse preparing a package, and T2 as a delivery worker picking it up.

fn thread1() {
    DATA = 42;
    FLAG.store(true, Ordering::Release); // "Releases" the data first, then flips the flag
}

fn thread2() {
    while !FLAG.load(Ordering::Acquire) {} // Guarantees we see previous writes
    println!("{}", DATA); // Always prints 42.
}

Similarly :-

  • T1 (warehouse) packs the order (DATA = 42), then sets FLAG to true.
  • T2 (delivery) will NOT pick up FLAG == true until DATA = 42 is fully written.
  • T2 always gets the correct value.

Key Point: Release ensures that all previous writes (like DATA = 42) are visible before setting FLAG = true.

Ordering::Acquire - Ensures previous writes are seen

Example : Imagine T1 is the Chef cooking and T2 is the Waiter

Scenario:

After the chef cooks..

  • The waiter (T2) checks if the dish is ready (Acquire).
  • Once the ticket (flag) is marked "Ready," the waiter knows the dish is complete.
  • But before checking, they might do unrelated tasks in any order.
while !FLAG.load(Ordering::Acquire) {}  // Wait for dish to be marked as "Ready" and then load the data
println!("{}", DATA);  // Guaranteed to be correct after acquire!

Explanation:

  • Acquire ensures that once the waiter sees FLAG == true, they will also see the completed dish (DATA = 42) and pick it up.
  • However, unrelated earlier tasks (like setting up plates), i.e., instructions before the load, may still execute in any order.

Using Release for a store ensures all prior changes are visible to the Acquire load that observes it. However, on a read-modify-write operation, specifying only Acquire leaves the store half effectively Relaxed (and specifying only Release leaves the load half Relaxed), losing half of the ordering guarantees. When both halves matter, use Ordering::AcqRel.

ENSURES WE SEE ALL MEMORY CHANGES MADE BY THE PREVIOUS LOCK OWNERS

Ordering::Release – Ensures future accesses see the change made

Example : Imagine T1 is the Chef cooking and T2 is the Waiter

Scenario:

Before the waiter picks the dish...

  • The chef (T1) must finish preparing the dish before marking the order as ready (Release).
  • Or else, customers could receive a half-cooked meal.

Code Analogy:

DATA = 42; 
FLAG.store(true, Ordering::Release);  // Only set the flag after the food is ready (data is set to 42)

Explanation:

  • Release makes sure everything before it happens first (the dish is ready before the flag is flipped).

An Acquire load that observes a Release store sees every write made before that store, preventing outdated or inconsistent data. The pair acts as a barrier, enforcing memory consistency across threads.

ENSURES THAT FUTURE READS WILL SEE THIS UPDATED VALUE

Ordering::SeqCst – Ensures a globally consistent order of operations

Example: Imagine T1 is the Customer A and T2 is the Customer B

Scenario:

Before transferring money...

  • Customer A (T1) must withdraw the money from their account before depositing it into Customer B's account.
  • Customer B (T2) will only see the deposit once Customer A's withdrawal is complete.
  • Both actions must happen in a globally consistent order, ensuring that no thread (i.e., no customer) will observe the operations out of order, even if both threads are executed on different processors.

Code Analogy:

ACCOUNT_A.withdraw(50);
ACCOUNT_B.deposit(50);
FLAG.store(true, Ordering::SeqCst);  // No thread will observe these operations out of order

Explanation:

  • SeqCst ensures that all threads observe the operations in a globally consistent order. This means Customer A’s withdrawal is seen before Customer B’s deposit. No other thread will see the operations in a different order, thus preventing race conditions and ensuring the bank's accounting is correct.

Ordering::Relaxed – Atomic Counters (No Synchronization Guaranteed)

Scenario

  • Modifies the shared variable atomically, without any ordering guarantees relative to other memory operations.
  • Commonly used for counters.
use std::sync::atomic::{AtomicUsize, Ordering};

static VISITOR_COUNT: AtomicUsize = AtomicUsize::new(0);

fn thread_1() {
    VISITOR_COUNT.fetch_add(1, Ordering::Relaxed);
}
fn thread_2() {
    VISITOR_COUNT.fetch_add(1, Ordering::Relaxed);
}

Works Fine Because:

  • We don't care when a thread sees the updated count.
  • As long as the final count is correct, we’re good.

If We Used SeqCst Instead:

  • Each update would enforce global synchronization, slowing down performance.

Using everything we know now to fix our Mutex implementation

When a mutex is unlocked, a happens-before relationship is created between that unlock operation and the next lock operation on the same mutex. This ensures that:

The next thread that locks the mutex will see all of the changes that were made by the thread that unlocked the mutex.

use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};

const LOCKED: bool = true;
const UNLOCKED: bool = false;

// Correct implementation of a Mutex
pub struct Mutex<T> {
    locked: AtomicBool,
    v: UnsafeCell<T>,
}

// SAFETY: the lock guarantees only one thread accesses `v` at a time.
unsafe impl<T: Send> Sync for Mutex<T> {}

impl<T> Mutex<T> {
    pub fn new(t: T) -> Self {
        Self {
            locked: AtomicBool::new(UNLOCKED),
            v: UnsafeCell::new(t),
        }
    }
    // Takes in a closure/function (what to do after getting exclusive access to <T>)
    pub fn with_lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        // For the sake of understanding we'll use a SpinLock.
        // One should never use a SpinLock in production.

        // compare_exchange_weak checks for UNLOCKED and stores LOCKED in one
        // indivisible step, so two threads can never both claim the lock
        // (a separate load followed by a store would leave a gap between them).
        // Acquire on success ensures we see every write made by the thread
        // that last released the lock; Relaxed is enough on failure, since we
        // just retry.
        while self
            .locked
            .compare_exchange_weak(UNLOCKED, LOCKED, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {}
        let ret = f(unsafe { &mut *self.v.get() });

        // Release ensures all our writes become visible to the next thread
        // that Acquire-loads the lock.
        self.locked.store(UNLOCKED, Ordering::Release);
        ret
    }
}

Simple Rule of Thumb

| Operation | Ordering Used | Purpose |
| --- | --- | --- |
| Loading (Reading) | Ordering::Acquire | Ensures this read sees all writes made before the matching Release store. A Relaxed load may or may not see them, depending on how the CPU reordered the instructions. |
| Storing (Writing) | Ordering::Release | Ensures all writes made before this store are visible to threads that later Acquire-load the value. A Relaxed store gives no such guarantee, since it may be reordered ahead of earlier writes. |

References:

std::memory_order
Rust Release and acquire memory
Crust of Rust: Atomics and Memory Ordering
