illia wolkow

Posted on Feb 7

ACID in Databases: A Comprehensive Guide

#acid #database #security #tutorial

In database systems, transactions play a crucial role in ensuring data integrity and reliability. A transactional database allows operations to be executed as a single, atomic unit, ensuring that either all operations succeed or none at all. To achieve this level of reliability, databases adhere to the ACID properties: Atomicity, Consistency, Isolation, and Durability.

Decoding ACID
The term ACID refers to a set of four essential properties that guarantee transactions in a database are processed reliably. These properties are Atomicity, Consistency, Isolation, and Durability, and they work together to ensure the accuracy and integrity of a database, even in the event of system failures or concurrent access by multiple users.

Let's break down each of these properties.

Atomicity Atomicity ensures that a transaction is treated as a single, indivisible unit of work. Either all the operations within the transaction are completed successfully, or none are. If any part of the transaction fails, the database must roll back to its previous state, as if the transaction had never occurred.

For example, consider a bank transfer where money is moved from one account to another. The transaction involves two operations: subtracting money from the sender’s account and adding it to the recipient’s account. Atomicity guarantees that either both operations succeed, or neither operation happens, preventing inconsistencies.

Consistency Consistency ensures that a transaction brings the database from one valid state to another. After a transaction completes, the database must satisfy all integrity constraints (such as unique keys, foreign keys, etc.), ensuring that data remains valid and accurate.

For example, if a transaction updates the balance of a bank account, Consistency ensures that the new balance adheres to predefined rules, such as non-negative balances or limits on transactions.

Isolation Isolation ensures that transactions are executed independently of one another, even if they occur concurrently. Changes made by a transaction should not be visible to other transactions until the transaction is committed, preventing issues such as dirty reads, non-repeatable reads, and phantom reads.

There are various isolation levels that define the extent of this independence:

3.1 Read Committed
At the Read Committed isolation level, a transaction can only read data that has been committed by other transactions. Uncommitted changes made by other transactions are invisible, preventing dirty reads. However, Read Committed does not prevent non-repeatable reads, meaning a value could change if another transaction commits an update after the initial read.

3.2 Snapshot Isolation (Repeatable Read)
In Snapshot Isolation, also known as Repeatable Read, a transaction sees a consistent snapshot of the database as it existed at the start of the transaction. This ensures that once a value is read, it will not change for the duration of the transaction, preventing both dirty reads and non-repeatable reads.

3.3 MVCC (Multi-Version Concurrency Control)
Many databases, such as PostgreSQL, use a technique called Multi-Version Concurrency Control (MVCC) to implement snapshot isolation. MVCC allows the database to maintain multiple versions of data. Each transaction reads from its own consistent snapshot, preventing conflicts between transactions and improving performance in high-concurrency environments.

3.4 The Lost Update Problem
One problem that Isolation helps mitigate is the Lost Update problem, which occurs when two concurrent transactions read the same data and both update it based on the initial value. Without proper isolation, one transaction's update may overwrite the other's, resulting in data loss. Higher isolation levels, such as snapshot isolation, prevent this issue by ensuring transactions do not interfere with each other.

Solution 1: Atomic Updates (Optimistic Approach)
An atomic update ensures that a read-modify-write operation is executed as a single, indivisible unit. This is often achieved using SQL commands like UPDATE with an expression rather than first reading the value and then writing it back.

For the counter example, you can use a single SQL query that increments the value atomically:

In this approach, the database manages the concurrency internally. If multiple users try to update the counter simultaneously, the database handles the updates atomically, preventing lost updates.
Solution 2: Explicit Locking (Pessimistic Approach)
Explicit locking ensures that only one transaction can modify a piece of data at a time, while others must wait for the lock to be released. This approach can be used to prevent lost updates but may reduce performance in high-concurrency environments.

For example, in PostgreSQL, you can lock a row before updating the counter:

In this scenario, the
Solution 3: Automatic Lost Update Detection
Some databases offer mechanisms to automatically detect and resolve lost updates. One common technique is optimistic concurrency control, which uses versioning or timestamps to detect if a record has been modified by another transaction before allowing an update.

Durability Durability ensures that once a transaction has been committed, its changes are permanent and will survive any system failure (e.g., power loss, crashes). This is typically achieved by writing transaction data to non-volatile storage (like a hard drive or SSD) before the transaction is considered committed.

For example, in a bank system, once a transaction is completed and the changes (e.g., updated balances) are written to disk, Durability guarantees that those changes will persist even if the system crashes immediately afterward.

In summary, the ACID properties—Atomicity, Consistency, Isolation, and Durability—form the foundation of reliable and consistent transaction processing in databases. By adhering to these principles, databases can ensure data integrity and provide mechanisms for handling concurrent transactions without conflicts or data loss.

A popular method for detecting lost updates is using a version or timestamp column in the table. When a transaction updates a row, it checks if the version number is the same as when it initially read the row. If the version number has changed, the database knows another transaction has modified the data and can abort or retry the transaction.

For example, you can implement this logic as follows:

If the

Durability Durability ensures that once a transaction has been committed, its changes are permanent and will survive any system failure (e.g., power loss, crashes). This is typically achieved by writing transaction data to non-volatile storage (like a hard drive or SSD) before the transaction is considered committed.

DEV Community

ACID in Databases: A Comprehensive Guide

Top comments (0)

Read next

Master Type-Safe Object Keys with TypeScript keyof typeof

Advanced Single Table Design Patterns With DynamoDB

Password Hashing and Data Integrity in Real-World Implementation

Choosing the Right Time Series Database for Your Use Case