The Transactional Concurrency Dilemma
In the dynamic world of database management, concurrency refers to the simultaneous execution of multiple transactions across complex systems. While this approach enables high-performance, efficient data processing, it introduces critical challenges that can compromise data integrity. Anomalies such as dirty reads, non-repeatable reads, phantom reads, and lost updates threaten the fundamental reliability of database operations, creating risks that range from minor inconsistencies to significant financial and operational disruptions.
Understanding these concurrency problems is essential for developers and system architects to design robust, dependable transactional systems.
Dirty Reads
Scenario: The Risky Loan Approval
Sami works as a bank loan officer processing a critical $50,000 loan application. Simultaneously, another system process is attempting to update the applicant's credit status:
- Transaction A begins reviewing the loan application
- Concurrent Transaction B starts updating the applicant's credit score
- Transaction B temporarily marks the credit score as improved (not yet committed)
- Transaction A reads this temporary, unconfirmed credit score
- Transaction B rolls back the credit score change due to an error
Potential Impact:
- Loan might be approved based on a false credit improvement
- Critical financial decision made on non-verified information
- Significant financial risk for the bank
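To make the timeline above concrete, here is a minimal Python sketch that simulates a dirty read with a toy in-memory store. The store, the transaction steps, and the credit-score values are all invented for illustration and are not tied to any real database API.

```python
# Toy in-memory store that separates committed data from uncommitted
# (in-flight) changes, to illustrate how a dirty read happens.

class ToyStore:
    def __init__(self):
        self.committed = {"credit_score": 580}   # confirmed data
        self.uncommitted = {}                    # staged, not yet committed

    def write(self, key, value):
        # Transaction B stages a change without committing it.
        self.uncommitted[key] = value

    def dirty_read(self, key):
        # A READ UNCOMMITTED-style read: staged changes are visible.
        return self.uncommitted.get(key, self.committed[key])

    def rollback(self):
        self.uncommitted.clear()

store = ToyStore()

store.write("credit_score", 720)                          # Transaction B: tentative improvement
seen_by_loan_officer = store.dirty_read("credit_score")   # Transaction A reads 720
store.rollback()                                          # Transaction B fails and rolls back

print("Loan decision based on:", seen_by_loan_officer)           # 720 (dirty value)
print("Actual committed score:", store.committed["credit_score"])  # 580
```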
Non-Repeatable Reads
Scenario: Stock Trading Platform
A day trader monitors a volatile stock in real-time:
- First query reveals stock price at $100
- Concurrent trading activity triggers price updates
- Subsequent read shows stock price at $95
- Trader's decision-making becomes unreliable due to inconsistent data views
Potential Impact:
- Unpredictable trading strategies
- Potential financial losses
- Undermined trust in trading platform
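The same anomaly can be sketched in a few lines of Python: two identical reads inside one logical transaction return different values because a concurrent transaction committed an update in between. The prices and the ACME symbol are made up for illustration.

```python
# Toy committed store: updates become visible to readers immediately,
# which is what allows a non-repeatable read inside one transaction.

committed = {"ACME": 100.0}

def read_price(symbol):
    return committed[symbol]

# Transaction A (the trader's session) reads the price.
first_read = read_price("ACME")            # 100.0

# Transaction B (concurrent market activity) updates and commits.
committed["ACME"] = 95.0

# Transaction A repeats the identical query and gets a different answer.
second_read = read_price("ACME")           # 95.0

print("First read :", first_read)
print("Second read:", second_read)
print("Non-repeatable read occurred:", first_read != second_read)
```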
Phantom Reads
Scenario: E-commerce Inventory Management
An online retailer processes multiple simultaneous orders for limited stock items:
- Initial inventory check shows 5 gaming consoles available
Concurrent transactions:
- Transaction A starts inventory verification
- Transaction B adds 2 new gaming consoles to inventory
- Transaction A's subsequent read now shows 7 consoles
- Unexpected inventory changes create reconciliation challenges
Potential Impact:
- Overselling potential
- Inventory tracking inconsistencies
- Customer experience disruption
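A phantom read is about the set of rows matching a condition changing between two identical queries in the same transaction. The toy Python sketch below uses a plain list as a stand-in for the inventory table; the rows and SKUs are illustrative only.

```python
# Toy "inventory" table: each dict is a row. A phantom read means the set
# of rows matching a predicate changes between two identical queries
# inside the same transaction.

inventory = [
    {"sku": f"CONSOLE-{i}", "product": "gaming console", "in_stock": True}
    for i in range(5)
]

def count_available(table):
    return sum(1 for row in table
               if row["product"] == "gaming console" and row["in_stock"])

# Transaction A: first inventory verification
first_count = count_available(inventory)      # 5

# Transaction B: two new consoles arrive and are committed
inventory.append({"sku": "CONSOLE-5", "product": "gaming console", "in_stock": True})
inventory.append({"sku": "CONSOLE-6", "product": "gaming console", "in_stock": True})

# Transaction A: the same query now matches extra "phantom" rows
second_count = count_available(inventory)     # 7

print("First count :", first_count)
print("Second count:", second_count)
```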
Lost Update
Scenario: Travel Booking System
Two customers simultaneously attempt to book the last seat on a flight:
- Initial seat availability: 1 seat remaining
- Customer A begins booking process
- Customer B starts identical booking process
- System fails to properly manage concurrent writes
- One booking transaction silently overwrites the other
Potential Impact:
- Potential revenue loss
- Customer trust erosion
- Operational chaos
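One common defence against lost updates is optimistic locking: each row carries a version number, and an update only succeeds if the version it read is still current. The sketch below uses Python's built-in sqlite3 module; the flights table and its columns are hypothetical and only serve to illustrate the pattern.

```python
import sqlite3

# Optimistic locking: the UPDATE only applies if the version read earlier
# is still current, so a concurrent write is detected (rowcount == 0)
# instead of silently winning.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (id INTEGER PRIMARY KEY, seats_left INTEGER, version INTEGER)")
conn.execute("INSERT INTO flights VALUES (1, 1, 0)")
conn.commit()

def read_flight(flight_id):
    return conn.execute(
        "SELECT seats_left, version FROM flights WHERE id = ?", (flight_id,)
    ).fetchone()

def try_book(flight_id, version_seen):
    cur = conn.execute(
        "UPDATE flights SET seats_left = seats_left - 1, version = version + 1 "
        "WHERE id = ? AND version = ? AND seats_left > 0",
        (flight_id, version_seen),
    )
    conn.commit()
    return cur.rowcount == 1   # 0 means another booking got there first

# Both customers read the same state: 1 seat left, version 0.
_, version_a = read_flight(1)
_, version_b = read_flight(1)

print("Customer A booked:", try_book(1, version_a))  # True: first write wins
print("Customer B booked:", try_book(1, version_b))  # False: stale version, rejected
```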
Concurrency is not just a technical challenge—it's a critical business risk management issue that demands sophisticated, thoughtful solutions.
Isolation Levels
In the complex landscape of database management, isolation levels represent sophisticated strategies for managing concurrent transactions. These levels define how different transactions interact and view data simultaneously, ranging from minimal protection (Read Uncommitted) to absolute data integrity (Serializable). Each isolation level offers a unique balance between performance and consistency, addressing potential concurrency challenges like dirty reads, non-repeatable reads, and phantom reads.
By understanding and strategically implementing these isolation mechanisms, developers can design robust, reliable transactional systems that maintain data integrity while optimizing system performance.
Read Uncommitted
Scenario: Cryptocurrency Trading Platform
- High-frequency trading system processing multiple transactions
- No protection against dirty reads
- Potential for processing unconfirmed market data
Pros:
- Highest performance
- Minimal system overhead
- Maximum concurrency
Cons:
- Highest risk of data inconsistency
- Unreliable for critical systems
Use Cases:
- Non-critical analytics
- High-speed data streaming
- Preliminary market tracking
Example databases that support it:
- SQL Server (configurable)
Read Committed
Scenario: Banking Transaction Processing
- Customer checks account balance
- Prevents reading uncommitted transactions
- Ensures only confirmed financial data is visible
Pros:
- Prevents dirty reads
- Default isolation in many databases
- Reasonable performance-consistency balance
Cons:
- Allows non-repeatable reads
- Potential for phantom read issues
Use Cases:
- Standard financial applications
- E-commerce platforms
- General transactional systems
Example databases that support it:
- PostgreSQL (default)
- Oracle, MySQL
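As a quick illustration, the sketch below sets the isolation level explicitly for one transaction using psycopg2 against PostgreSQL, where READ COMMITTED is already the default. The connection string and the accounts table are placeholders you would replace with your own.

```python
import psycopg2

# Placeholder connection string; replace with your own server details.
conn = psycopg2.connect("dbname=bank user=app password=secret host=localhost")

with conn:  # wraps the block in a transaction and commits on success
    with conn.cursor() as cur:
        # READ COMMITTED is PostgreSQL's default, but it can be stated
        # explicitly as the first statement of the transaction.
        cur.execute("SET TRANSACTION ISOLATION LEVEL READ COMMITTED")

        # Only committed data is visible here: an in-flight deposit from
        # another session will not appear until that session commits.
        cur.execute("SELECT balance FROM accounts WHERE id = %s", (42,))
        balance = cur.fetchone()[0]
        print("Confirmed balance:", balance)

conn.close()
```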
Repeatable Read (often implemented as snapshot isolation)
Scenario: Inventory Management System
- Warehouse manager generating stock reports
- Guarantees consistent data view during entire transaction
- Prevents other transactions from modifying viewed data
Pros:
- Consistent data snapshots
- Prevents non-repeatable reads
- Suitable for analytical processes
Cons:
- Potential performance overhead
- Risk of deadlocks
- Allows phantom reads (per the SQL standard; some engines prevent them in practice)
Use Cases:
- Detailed financial reporting
- Batch processing
- Complex data analysis
Example databases that support it:
- MySQL (InnoDB Engine)
- PostgreSQL
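A PostgreSQL-flavoured sketch of the guarantee, again using psycopg2: inside a REPEATABLE READ transaction, repeating the same query returns the same snapshot even if another session commits changes in between. The connection details and the stock table are placeholders.

```python
import psycopg2

# Placeholder connection details; adjust for your environment.
conn = psycopg2.connect("dbname=warehouse user=app host=localhost")

with conn:
    with conn.cursor() as cur:
        cur.execute("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ")

        # The first read establishes the snapshot for the whole transaction.
        cur.execute("SELECT SUM(quantity) FROM stock")
        first_total = cur.fetchone()[0]

        # ... another session may UPDATE and COMMIT stock rows here ...

        # The second read still sees the original snapshot, so the totals
        # match and the stock report stays internally consistent.
        cur.execute("SELECT SUM(quantity) FROM stock")
        second_total = cur.fetchone()[0]

        assert first_total == second_total

conn.close()
```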
Serializable
Scenario: Global Financial Compliance Reporting
- Generating regulatory financial statements
- Complete isolation between concurrent transactions
- Prevents all concurrency anomalies
Pros:
- Absolute data consistency
- Eliminates all read phenomena
- Guaranteed transaction integrity
Cons:
- Significant performance impact
- High resource consumption
- Potential system bottlenecks
Use Cases:
- Critical financial systems
- Regulatory compliance reporting
- High-stakes transactional environments
Example databases that support it:
- SQL Server
- Oracle
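Serializable transactions can be aborted when the database cannot find an equivalent serial order, so applications typically wrap them in a retry loop. A hedged psycopg2 sketch of that pattern follows; the connection string and the ledger table are placeholders.

```python
import psycopg2
from psycopg2 import errors

# Placeholder connection details; adjust for your environment.
conn = psycopg2.connect("dbname=finance user=app host=localhost")

def post_ledger_entry(amount, attempts=5):
    # Serializable transactions may fail with a serialization error when a
    # conflict is detected, so retrying the whole transaction is the usual pattern.
    for _ in range(attempts):
        try:
            with conn:  # commits on success, rolls back on exception
                with conn.cursor() as cur:
                    cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
                    cur.execute("INSERT INTO ledger (amount) VALUES (%s)", (amount,))
            return True
        except errors.SerializationFailure:
            conn.rollback()   # clear the failed transaction before retrying
            continue
    return False

print("Entry recorded:", post_ledger_entry(1000))
conn.close()
```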
Before deciding which isolation level to use, CONSIDER...
Isolation level selection involves balancing:
- Performance requirements
- Data consistency needs
- Specific application constraints
- Potential concurrency challenges
Developers must carefully evaluate their system's unique requirements, understanding that higher isolation levels provide more protection at the cost of reduced performance.
More about Durability
Database durability represents a critical mechanism ensuring that committed transactions survive system failures, power outages, and unexpected interruptions. Durability techniques create robust data protection strategies that guarantee information integrity, allowing databases to recover and maintain a consistent state even under extreme operational conditions.
Here are some popular durability techniques used in database systems:
Write-Ahead Logging (WAL)
Concept: Logs all changes before applying them to the main database
Purpose: Enables complete transaction recovery
Databases: PostgreSQL, SQLite
Mechanism:
- Records transaction details in sequential log
- Allows replay of operations after system restart
- Ensures no committed transaction is lost
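SQLite makes WAL easy to try from Python's standard library: switching the journal mode means committed changes are appended to a write-ahead log file first and folded into the main database later. A small sketch (the database file name is arbitrary):

```python
import sqlite3

# Switch an SQLite database to write-ahead logging. Committed changes are
# appended to the "-wal" file first and checkpointed into the main database
# later, so a committed transaction can be recovered after a crash.
conn = sqlite3.connect("orders.db")
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print("Journal mode:", mode)   # prints "wal"

conn.execute("CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, total REAL)")
conn.execute("INSERT INTO orders (total) VALUES (?)", (99.90,))
conn.commit()                  # durable once the WAL record reaches disk

conn.close()
```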
Transaction Logs
Concept: Maintains sequential record of database modifications
Purpose: Supports point-in-time recovery
Databases: SQL Server, Oracle
Mechanism:
- Captures every database state change
- Provides rollback and recovery capabilities
- Allows restoration to specific historical moment
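To illustrate the idea rather than any vendor's on-disk format, here is a toy append-only transaction log in Python: every change is recorded as a log entry, and the state can be rebuilt by replaying the log up to any chosen point. This is a didactic sketch, not how SQL Server or Oracle actually store their logs.

```python
# Toy transaction log: every committed change is appended as one record,
# and the database state can be rebuilt by replaying the log, including
# replaying only up to a chosen point (point-in-time recovery).

log = []  # in a real system this would be a durable, sequential file

def commit(txn_id, key, value):
    log.append({"txn": txn_id, "key": key, "value": value})

def replay(up_to_txn=None):
    state = {}
    for record in log:
        if up_to_txn is not None and record["txn"] > up_to_txn:
            break
        state[record["key"]] = record["value"]
    return state

commit(1, "account:42", 1000)
commit(2, "account:42", 900)
commit(3, "account:7", 250)

print("Latest state       :", replay())               # all three changes applied
print("As of transaction 1:", replay(up_to_txn=1))    # point-in-time recovery
```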
There are more durability techniques; I'll include extra resources at the bottom of the article.
Data Consistency vs. Reading Consistency
Data Consistency
Data consistency ensures that information remains accurate, complete, and reliable across different system components and operations. It prevents data corruption and maintains system trustworthiness by ensuring that all copies of data remain synchronized and valid.
Maintaining Data Consistency:
- Implement atomic transactions that complete entirely or not at all
- Use strict validation mechanisms before data modifications
- Apply locking strategies to prevent concurrent conflicting updates
- Utilize versioning to track and reconcile data changes
- Implement rollback mechanisms for failed or partial updates
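A compact sqlite3 sketch of the first points above: a money transfer either applies both updates or, if validation fails, rolls back entirely. The account IDs and balances are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 500.0), (2, 200.0)])
conn.commit()

def transfer(src, dst, amount):
    try:
        # sqlite3 opens a transaction implicitly before the first UPDATE.
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))

        # Validation before committing: never allow a negative balance.
        (new_balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if new_balance < 0:
            raise ValueError("insufficient funds")

        conn.commit()      # both updates become durable together
        return True
    except Exception:
        conn.rollback()    # neither update survives a failure
        return False

print(transfer(1, 2, 300))   # True: consistent transfer
print(transfer(1, 2, 900))   # False: rejected and fully rolled back
```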
Reading Data in Distributed Systems
In distributed systems, reading data becomes complex when multiple data sources or nodes can serve the same information. This creates a fundamental challenge: how to ensure data read consistency while maintaining system performance and availability?
Eventual Consistency
Eventual consistency emerges as a pragmatic solution, accepting temporary data divergence with a guarantee of final synchronization. It prioritizes system responsiveness and availability over immediate, strict consistency.
Example: Social Media Post Synchronization
- User posts content on Qabilah (an Arabic social network)
- Servers in different global regions temporarily show slightly different data
- Within milliseconds to seconds, all servers converge to the same consistent state
- System prioritizes quick user experience over immediate perfect synchronization
By accepting short-term inconsistencies, eventual consistency allows distributed systems to remain responsive, scalable, and resilient, making it crucial for high-performance, globally distributed applications.
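A toy Python sketch of the convergence idea: a write lands on one regional replica first, reads may briefly disagree, and a background synchronization step brings every replica to the same state. The region names and the replication mechanics are invented for illustration.

```python
# Toy eventual consistency: three regional replicas of a user's latest post.
# Writes land on one replica first; the others catch up asynchronously.

replicas = {"riyadh": None, "cairo": None, "frankfurt": None}
pending = []  # replication queue standing in for async propagation

def write_post(region, post):
    replicas[region] = post     # accepted immediately in one region
    pending.append(post)        # queued for the other replicas

def sync():
    # Background replication: eventually every replica converges.
    while pending:
        post = pending.pop(0)
        for region in replicas:
            replicas[region] = post

write_post("riyadh", "New post on Qabilah!")

print("Before sync:", replicas)  # regions temporarily disagree
sync()
print("After sync :", replicas)  # all regions converged to the same state
```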
Generally:
Database transactions represent a critical intersection of performance and data integrity. The challenges of concurrent systems demand sophisticated strategies to protect information reliability.
Main Insights
Concurrency Risks
- Dirty reads
- Non-repeatable reads
- Phantom reads
- Lost updates
Strategic Considerations
Successful transaction management requires balancing performance, consistency, and system-specific constraints. Isolation levels are not one-size-fits-all solutions but carefully calibrated mechanisms.
Final Word
Transactional systems are more than technical mechanisms—they're essential risk management tools. Success depends on understanding potential failure modes and implementing intelligent mitigation strategies.
The complexity of database transactions demands continuous learning and adaptive approaches. By recognizing the delicate balance between system performance and data integrity, developers can build robust, reliable computing environments.