Franck Pachot for YugabyteDB

Posted on Dec 28, 2024 • Edited on Dec 29, 2024

Document data modeling to avoid write skew anomalies

#yugabytedb #distributed #postgres #json

In a previous post, I explained that a serializable isolation level is not the only solution to avoid write skew anomalies. Normalizing the database schema, I converted the read-write conflict into a write-write conflict. This strategy allows us to achieve serialization within the read-committed isolation level using a SELECT FOR UPDATE.
I used Martin Kleppmann's example of doctors' on-call shifts to demonstrate write skew anomalies.

Document data modeling is often confused with denormalization, but it can serve as a normalized alternative to the two-table solution, effectively avoiding write skew anomalies. In this post, I will demonstrate how to store doctors’ on-call shifts using a JSONB document. By utilizing a single table as a document database, I can achieve a solution similar to a normalized schema, which helps prevent write skew.

To implement this, I will create a single table to store the shifts. The doctors' information will be stored in a JSONB array that includes their names and on-call status.

yugabyte=# create table shifts (
 shift_id int primary key
 ,doctors jsonb
);
CREATE TABLE

yugabyte=# insert into shifts (shift_id, doctors)
values (1, '[
 {"name": "Alice", "on_call": true},
 {"name": "Bob",   "on_call": true}
]'::jsonb)
returning shift_id, jsonb_pretty (doctors) as doctors
;

 shift_id |         doctors
----------+--------------------------
        1 | [                       +
          |     {                   +
          |         "name": "Alice",+
          |         "on_call": true +
          |     },                  +
          |     {                   +
          |         "name": "Bob",  +
          |         "on_call": true +
          |     }                   +
          | ]
(1 row)

INSERT 0 1

Bob starts a transaction and checks the on-call doctors for shift 1:

yugabyte=# -- Bob

yugabyte=# begin isolation level read committed;
BEGIN

yugabyte=*# select shift_id, doctor
 from shifts
    , jsonb_array_elements(doctors) as doctor
 where shift_id = 1 and doctor->>'on_call' = 'true'
 for update
;

 shift_id |               doctor
----------+------------------------------------
        1 | {"name": "Alice", "on_call": true}
        1 | {"name": "Bob", "on_call": true}
(2 rows)

As there are two on-call doctors, Bob can release himself, and another doctor will remain:

yugabyte=# -- Bob

yugabyte=*# update shifts
set doctors = jsonb_set(
  doctors,
  (
    -- get the array index ( 1-based provided by ordinality to 0-based for json_set)
    select format('{%s,on_call}', idx - 1)::text[]
    from jsonb_array_elements(doctors) with ordinality arr(doc, idx)
    where doc->>'name' = 'Bob'
  ),
  'false'::jsonb
)
where shift_id = 1
returning shift_id, jsonb_pretty (doctors) as doctors;


 shift_id |         doctors
----------+--------------------------
        1 | [                       +
          |     {                   +
          |         "name": "Alice",+
          |         "on_call": true +
          |     },                  +
          |     {                   +
          |         "name": "Bob",  +
          |         "on_call": false+
          |     }                   +
          | ]
(1 row)

UPDATE 1

Updating a JSON document in an SQL database is not straightforward. I will do the same in a future blog post with a NoSQL API to see if it is easier.

This configuration is valid and Bob can commit his changes. However, at the same time Alice has the same idea and checks the on-call doctors for the same shift:

yugabyte=# -- Alice

yugabyte=# begin isolation level read committed;
BEGIN

yugabyte=*# select shift_id, doctor
 from shifts
    , jsonb_array_elements(doctors) as doctor
 where shift_id = 1 and doctor->>'on_call' = 'true'
 for update
;

...

Because of transaction isolation, Alice cannot see Bob's ongoing changes. However, when she uses SELECT FOR UPDATE, the database waits to see if Bob's transaction will commit or roll back so that an accurate result can be returned to her.

Bob terminates his transaction with a commit to make his changes durable and visible to others:

yugabyte=# -- Bob

yugabyte=*# commit;
COMMIT

Immediately, Alice's transaction continues and displays the current state of the database:

yugabyte=# -- Alice

yugabyte=# select shift_id, doctor
yugabyte-#  from shifts
yugabyte-#     , jsonb_array_elements(doctors) as doctor
yugabyte-#  where shift_id = 1 and doctor->>'on_call' = 'true'
yugabyte-#  for update
yugabyte-# ;

...

 shift_id |               doctor
----------+------------------------------------
        1 | {"name": "Alice", "on_call": true}
(1 row)

Alice sees that she is the only on-call doctor for this shift. She must cancel her attempt step back from being on-call:

yugabyte=# -- Alice

yugabyte=# rollback;
ROLLBACK

The state of the database is consistent, with one doctor on-call:

yugabyte=# select shift_id, jsonb_pretty (doctors) as doctors
from shifts
;
 shift_id |         doctors
----------+--------------------------
        1 | [                       +
          |     {                   +
          |         "name": "Alice",+
          |         "on_call": true +
          |     },                  +
          |     {                   +
          |         "name": "Bob",  +
          |         "on_call": false+
          |     }                   +
          | ]
(1 row)

yugabyte=#

Understanding how your database works, mainly locking and conflict detection, is essential for getting consistent results with high performance. YugabyteDB maintains runtime compatibility with PostgreSQL, so you won’t need to learn new behaviors. Contrary to others (like Spanner, CockroachDB, or AWS Aurora DSQL whose compatibility is limited to the wire protocol and dialect), YugabyteDB operates like PostgreSQL, supporting all isolation levels, explicit locking, and the same data types, including JSON and JSONB.

At the logical level, we identify two business entities: shifts and doctors, which are connected through a one-to-many relationship, the on-call status of a doctor for each shift. In this and the previous post, we have explored three data modeling options: two tables connected by a foreign key, a single doctor table with a shift attribute, and a single shift table with doctors in an embedded document. The appropriate choice depends on data volume, access patterns, and performance needs.

I used name attributes like "Bob" and "Alice," which might introduce data duplication since doctors are likely stored in another collection or table. However, such applications typically use immutable keys, like UUIDs, and are not subject to update anomalies, which is the main reason for normalization.

Embedding documents can provide a consistent and efficient alternative to the traditional normal forms approach. Document databases are particularly well-suited for scenarios where related data is frequently accessed together and must be locked together, as in this example.

Top comments (5)

Mladen Gogala • Dec 28 '24

Tom Kyte wrote about "write consistency" in his "Expert Database Architecture" book. In Tom's book, transaction is restarted when it finds changed query results after concurrent transaction completes. Granted, Tom wasn't writing about PostgreSQL compatible databases as he used to be a VP at Oracle Corp. but I imagine that PgSQL has similar concurrency mechanisms. It would be very interesting to see how would that operate in a DSQL environment.

Franck Pachot • Dec 28 '24 • Edited

Yes, write consistency is tricky in Multi-Version Concurrency Control databases. Tom Kyte had an awesome thread starting with "oh geez -- i'm almost afraid to publish this one" on AskTOM where he explained how he discovered this, with emails being sent twice (from trigger, using non-transactional external procedure)

Oracle Database can restart a statement (not a transaction) when a non-repeatable read is encountered (when the write state conflicts with the read state). It can do that in Read-Committed because RC allows different MVCC read times for each statement in a transaction. To avoid too many restarts, Oracle can acquire more locks on restart (to wait rather than fail).
PostgreSQL doesn't have this (because it would require implicit savepoints for each statement, and savepoints are expensive in PostgreSQL) and restarts only the reading of the row. This results in inconsistent snapshots (results with rows from two states), but it still fits the SQL standard definition of read committed (which requires reading only the committed changes, ignoring that in MVCC databases, they can come from different commit times).
YugabyteDB does statement restarts like Oracle (using implicit savepoints for each statement in a read-committed transaction—they are scalable in YugabyteDB). It is better than PostgreSQL as the result is always time-consistent. It's still PG-compatible but compatible with good case behavior. YugabyteDB can do that in Read Committed or at higher levels if it is the first statement of the transaction, as long as no result was sent to the application (because the database doesn't know if the application did something non-transactional with those results, so it cannot transparently rollback and restart). YugabyteDB doesn't re-raise the triggers in this case and provides an easy way to troubleshoot, so it's the best implementation I know. However, there are always trade-offs in distributed systems, and this requires more Raft consensus synchronization (one per statement) when in Read Committed mode.
Aurora DSQL doesn't transparently restart. It has no Read Committed, no savepoints, and detects conflicts only at commit where it's too late to do anything else that rollback the transaction and raises an error. Aurora DSQL is based on Optimistic Concurrency Control, and the trade-off is that applications must avoid conflicts and be ready to retry transactions until they can commit. Other PostgreSQL-compatible databases are available when this is not the case.

Mladen Gogala • Dec 28 '24 • Edited

Thanks for correcting me. Yes, it is a statement restart, not a transaction restart. Eggnog was apparently quite potent this year. The main problem with the save points in PgSQL is its lack of SCN and rowid values. CTID, xmin and xmax cannot be used for that purpose. Does YB have anything like SCN and rowid?

Franck Pachot • Dec 28 '24

Yes. YugabyteDB equivalent of SCN is the timestamp from the Hybrid Logical Clock (NTP time + Lamport correction to get it monotonically increasing). So no wraparound problems 😃
YugabyteDB equivalent of rowid is an encoded version of the primary key (ybctid) as rows are stored in their primary key LSM tree rather than heap tables.

Mladen Gogala • Dec 28 '24

Thanks Franck! Merry Christmas and a happy New Year to both you and your family.

DEV Community

Document data modeling to avoid write skew anomalies

Top comments (5)

Read next

Master JSON Handling in JavaScript! 🖥️🚀

Postgres in prod. Part 1 - setup and startup for growth.

Managing MQTT User Information with PostgreSQL in a Separate Container

PostgreSQL CONCAT Strings