ACID
In the realm of database management, ensuring the reliability and integrity of data is paramount. This is where the ACID principles come into play. ACID, an acronym for Atomicity, Consistency, Isolation, and Durability, represents a set of properties that guarantee reliable processing of database transactions. These principles are foundational to traditional relational database management systems (RDBMS) and are crucial for applications where data accuracy and consistency are non-negotiable.
Before we start, let's recap what is the transaction:
Transaction - is a sequence of one or more queries executed as a single unit of work. The main principle of a transaction is "all or nothing", meaning:
- If all queries in the transaction succeed, the changes are safely saved in the database.
- If any query in the transaction fails, all changes are undone, restoring the database to its original state before the transaction started.
β οΈ When changes are saved, we say the transaction is committed. If changes are undone, the transaction is rolled back β οΈ
Imagine you are buying a laptop from an online store. When you click "Place Order", several things must happen in the database:
- Check if the laptop is in stock.
- Deduct the laptop from the stock.
- Charge your credit card.
- Create an order record in the system.
- Send a confirmation email.
We can think of it as a single unit of work. Let's also consider that each query interacts with various tables:
π€ You can ask, why in this scenario i need a transaction, well, lets see what will happen if we didnt use it:
- The laptop is taken from stock, but the payment fails β now the store has fewer laptops but no money.
- The payment goes through, but thereβs no order record β you paid, but the store has no idea what you bought!
Using a transaction keeps everything in sync. If something goes wrong, it rolls back all the steps to make sure nothing is left half-finished. Let's consider an example where all changes are committed only if every query is successful
A transaction makes sure everything is done correctly or nothing is done at all!
If you are familiar with SQL, you can also see how this would look in a query (pseudocode):
BEGIN TRANSACTION;
-- 1. Check if the laptop is in stock
SELECT stock_quantity FROM laptops WHERE laptop_id = 1;
-- 2. Deduct the laptop from the stock
UPDATE laptops SET stock_quantity = stock_quantity - 1 WHERE laptop_id = 1;
-- 3. Charge the credit card (simplified)
INSERT INTO payments (user_id, amount) VALUES (123, 1000);
-- 4. Create an order record in the system
INSERT INTO orders (user_id, laptop_id, order_date) VALUES (123, 1, '2025-02-23');
-- 5. Send a confirmation email (simplified)
INSERT INTO email_queue (user_id, email_type, status) VALUES (123, 'order_confirmation', 'pending');
-- 6. If everything is successful, commit the transaction
COMMIT;
Now that we understand what a transaction is, let's explore what the ACID acronym stands for ! πππ
Atomicity
All or nothing. A transaction either fully completes or has no effect.
Atomicity is one of the core principles of database transactions, ensuring that a transaction is treated as a single, indivisible unit of work. This means:
β
All-or-Nothing β If a transaction completes successfully, all its changes are saved.
β
Rollback on Failure β If any part of the transaction fails, all changes are discarded, keeping the database unchanged.
π Key points that define atomacity:
- All-or-Nothing Execution β A transaction is either fully completed or not executed at all
- Rollback on Failure β If one query in the transaction fails, all previous successful queries are undone
- No Partial Updates β If a problem occurs, the database remains as if the transaction never happened
π¨ What happens if Atomicity is NOT ensured?
- The payment succeeds, but the stock update fails β You are charged, but the store doesn't reserve your laptop!
- The stock is updated, but the payment fails β The laptop is removed from inventory without receiving money.
Consistency
Rules are followed. A transaction moves the database from one valid state to another.
It ensures that a transaction brings the database from one valid state to another. In other words, a transaction must always follow the rules, constraints, and data integrity of the database, maintaining its integrity throughout the process. This means:
β
Valid Transitions β A transaction will only commit if it preserves the database's rules and integrity.
β
Invalid Transitions β If the transaction violates any of the databaseβs constraints (data type rules, referential integrity), it will be rolled back.
π Key Points that Define Consistency:
- Integrity Preservation β The database must transition from one consistent state to another, adhering to defined rules ( constraints, triggers, business rules).
- No Invalid Data β Transactions that would result in invalid data (violating primary key, foreign key, or unique constraints) are not allowed to commit.
- Enforcing Constraints β All database rules such as constraints (like unique values, not null, foreign keys) are maintained throughout the transaction.
- Automatic Rollback β If a transaction causes an inconsistency, the database will automatically roll back to its last consistent state, ensuring no corrupt data is saved.
- No Partial Data β Transactions that leave data in an inconsistent state (such as violating business logic) will not complete.
π¨ What happens if Consistency is NOT ensured?
- Payment fails after inventory update -> If the stock is deducted but payment is not successful (due to a system error), the database may reflect that the laptop has been sold, but the payment was not processed. This creates inconsistency, as the stock is reduced without receiving money.
- Order without valid data -> If the transaction violates integrity constraints (an order is created without a valid customer ID or payment), the database might store incomplete or invalid data.
- Incomplete transactions -> The database might store records with missing or invalid information, leading to errors in reporting, business logic, or even operational decisions.
Isolation
Transactions don't interfere. Each transaction is independent of others happening at the same time.
Isolation is one of the key principles of the ACID properties in database transactions. It ensures that each transaction is executed in isolation from other concurrent transactions. This means that even if multiple transactions are running at the same time, the changes made by each transaction are not visible to others until they are fully committed, preventing them from interfering with each other. This means:
β
Independent Transactions β Transactions are isolated from each other, ensuring that they donβt affect each otherβs results.
β
No Interference β Even though transactions are executed simultaneously, one transactionβs data changes will not be visible to another transaction until fully committed.
π Key Points that Define Isolation:
- Transaction Independence β Each transaction is executed as if it is the only transaction running in the system. No transaction can access the intermediate (uncommitted) data of another transaction.
- Prevents Dirty Reads β A transaction should not read data that is in the middle of being modified by another transaction (dirty reads).
- Prevents Non-Repeatable Reads β Data read by a transaction cannot change before it is completed, ensuring that subsequent reads within the same transaction give consistent results.
- Prevents Phantom Reads β A transaction should not see new rows that were added or deleted by another transaction after it started.
- Isolation Levels β Isolation can be adjusted with different levels (Read Uncommitted, Read Committed, Repeatable Read, Serializable), offering a trade-off between performance and strict isolation.
π¨ What happens if Isolation is NOT ensured?
- Dirty Reads -> One transaction might read uncommitted changes made by another transaction. For example, Customer A might deduct the stock of the laptop, but Customer B reads the same stock quantity before Customer A commits the transaction.
- Lost Updates -> If two transactions are modifying the same data at the same time, one transaction might overwrite the changes of the other, leading to lost data.
- Inconsistent Data -> A transaction might use data that is still being modified by another transaction, leading to inconsistent results and potential errors.
- Phantom Reads -> A transaction may not see the same set of rows (inventory) every time it queries the database if another transaction inserts, deletes, or updates rows during the course of the first transaction.
Durability
Changes stick. Once a transaction is complete, its changes are permanent, even if the system fails.
This principle ensures that once a transaction has been committed, its changes are permanent, even in the event of a system failure (power loss or crash). This means:
β
Permanent Changes β Once a transaction is committed, the changes are guaranteed to persist, no matter what happens next.
β
System Failures Don't Lose Data β Even if the database crashes after the transaction commits, the changes will remain intact.
π Key Points that Define Durability
- Permanent Commit β Once a transaction is successfully committed, its results are saved permanently to disk and will survive system crashes.
- Crash Recovery β In the event of a crash, the database can recover to its last committed state, ensuring no data loss.
- Data Integrity After Failure β Even if there is a failure after the transaction is committed, the database will never revert to a previous, inconsistent state.
- No Rollback After Commit β After a transaction is committed, thereβs no way to undo its changes unless explicitly done by another transaction.
- Guarantee of Persistence β Durability guarantees that once the transaction is completed, it becomes part of the database history.
π¨ What happens if Durability is NOT ensured?
- After committing a transaction, if the system crashes, any committed changes could be lost -> Payment might be processed, but the system could fail before saving the updated inventory or generating the order record.
- When the system recovers, the transaction could be lost -> inconsistent data such as uncharged payments, incorrect stock levels, or missing orders.
π In summary, ACID provides the foundation for secure and efficient transaction management, helping prevent errors, inconsistencies, and data corruption, which ultimately builds trust in database systems and ensures smooth operations for end-users and businesses.
BASE
With the advent of NoSQL databases, a new paradigm emerged for managing and manipulating data, emphasizing flexibility and scalability over rigid consistency. This shift led to the development of the BASE model, which stands for Basically Available, Soft state, Eventual consistency. Unlike traditional relational databases that prioritize strict consistency and transactional integrity, the BASE model embraces a more relaxed approach to data management. The BASE model is designed to address the challenges of distributed systems, where data is spread across multiple nodes and consistency is not always immediate. It allows for high availability and partition tolerance, making it ideal for large-scale applications where real-time consistency is less critical than continuous accessibility.
Basically Available
The system always provides a response to queries, but the data might not be the most recent.
Imagine you have a distributed database with multiple nodes. If one node fails, the system can still respond to queries using data from other nodes, even if that data is slightly outdated. This ensures that the system is always available to users, even during failures.
Soft State
The system's state can change over time, even without new input, due to eventual consistency.
In a shopping cart application, if you add an item to your cart, the system might not immediately reflect this change across all servers. Over time, the system will update and synchronize, ensuring that all servers eventually show the same state.
Eventual Consistency
If no new updates are made, all accesses to a given data item will eventually return the last updated value.
Suppose you update your profile picture on a social media platform. Due to eventual consistency, some users might see your old profile picture for a short period. However, after some time, everyone will see the new picture as the system synchronizes the updates across all servers.
π In summary, BASE provides a flexible and scalable approach to database management, prioritizing availability and eventual consistency over strict transaction rules. This makes it ideal for distributed systems, allowing businesses to handle large-scale data efficiently while maintaining system responsiveness and fault tolerance.
ACID vs BASE
- Consistency & Availability - ACID prioritizes strong consistency and reliability, making it suitable for critical applications where data integrity is paramount. BASE prioritizes availability and partition tolerance, making it suitable for large-scale, distributed systems where immediate consistency is less critical.
- Scalability - BASE systems are generally more scalable due to their relaxed consistency requirements, making them suitable for high-traffic applications. ACID systems can be less scalable due to the overhead of maintaining strong consistency.
- Use Cases - ACID is preferred for traditional enterprise applications like banking and healthcare, where data integrity and consistency are crucial. BASE is preferred for modern web applications, IoT, and real-time analytics, where scalability and availability are more important.
π In summary, the choice between ACID and BASE depends on the specific requirements of the application, particularly the trade-offs between consistency, availability, and scalability.
Helpful Links π€
Text resources:
- What Is BASE in Database Engineering?
- A Guide to ACID Properties in Database Management Systems
- Whatβs the Difference Between an ACID and a BASE Database?
- ACID Databases β Atomicity, Consistency, Isolation & Durability Explained
Video resources:
Top comments (0)