Did you know every browser has a built-in key-value database?
I’ve been building a Chrome extension, and learning about IndexedDB while using it for an offline mode. Everything that the app reads from the API gets processed and stored locally. Then the app renders and is reactive to changes from the database.
Having done loads of backend development, I'm curious to know how I can use IndexedDB (IDB for short) with strong consistency - basically without losing data. After reading in some threads that data is easy to corrupt in IDB, it's time to explore the transaction model, and suss out what code patterns might lead to unexepected inconsistency.
So I coded up a little app that would do lots of writes into the same "store" (IDBs name for table). The idea is to use web workers to execute simultaneous writes across threads, and see if we can stomp on parallel updates to lose data. "Stomping" means using concurrency to have code execute and overwrite each others' state.
Since I use Svelte for everything, I set up a simple app with tailwind and daisyUI. You can see an example of this in my post "Svelte Without Kit".
First, the docs
After perusing IDB docs on transactions, I didn't see any details about thread safety or anything about under the hood. Is there some kind of multiversion concurrency control happening? Hard to say. But there was one example, and it emphasized that transactions are applied in order by when the transaction is created. Still not sure what that means across threads, but an ordering guarantee is a good sign that transactions provide some predictability, so let's write some code to find out!
You can find the complete code of the following files:
- App.svelte // Svelte component and UI
- worker.js // web worker that interacts with IDB
- pool.js // utility function for distributing work among workers
- database.js // utility function for connecting to IDB
Test One: update inside transaction
My first test sends messages to a web worker to update a "thread" object in IDB, and then verifies that the update succeeded. The workers each add a random number "labelId" to an array on the thread object. The idea is to test - if workers update the same thread object's array simultaneously, will one overwrite array the other in IDB and lose a labelId?
Here's the database code for updating the thread object, notice I am adding the labelId after starting the transaction and fetching the thread object from the store -
// threadId and labelId variables are in scope
database() // returns a promise with an instance of IDBDatabase
.then((db) => {
// start transaction
const txn = db.transaction(['threads'], 'readwrite')
txn.oncomplete = () => db.close()
const store = txn.objectStore('threads')
return new Promise((resolve, reject) => {
// read thread object from IDB store
const getThread = store.get(threadId)
getThread.onsuccess = ({ target }) => {
// create new thread object if it doesn't exist
const thread = target.result || { id: threadId, labelIds: [], attr: true }
// add labelId to array in thread object
if (!thread.labelIds.includes(labelId)) {
thread.labelIds.push(labelId)
}
// write thread object to IDB store
const putThread = store.put(thread)
putThread.onsuccess = () => resolve(thread)
putThread.onerror = () => reject(new Error(`Error writing message: ${putThread.error}`))
}
getThread.onerror = () => reject(new Error(`Error getting message: ${getThread.error}`))
})
})
For our App UI, we have a button to toggle the test running or not, and a running test will messages to a pool of workers. Refer to below for the full code, but it basically looks like this:
function startStopTest() {
if (testRunning) {
const promises = []
for (let step = 0; step < 15; step++) {
const lId = Math.floor(Math.random() * 1000)
const p = messageWorker({ threadId: step % 5, labelId: lId, action: 'addLabel' }, url)
.then(//... another message)
.then(//... another message)
promises.push(p)
}
Promise.all(promises)
.then(function (results) {
//... process results
})
.then(tick)
.then(startStopTest)
}
}
Let's fire it up!
Result
There are no missing labelIds in any thread object's list, so no data is being corrupted across simultaneous updates. Well that was underwhelming!
Test Two: Add another update
Next I'll try adding a second update to the worker, this time flipping a boolean on the object. The database code is the same as before - start transaction, get a thread object, modify it, put object back in store. My idea is to see if adding more updates to the same object can cause corruption.
./worker.js
function toggleAttr(threadId, labelId) {
return database().then((db) => {
const txn = db.transaction(['threads'], 'readwrite')
txn.oncomplete = () => db.close()
const store = txn.objectStore('threads')
return new Promise((resolve, reject) => {
const getThread = store.get(threadId)
getThread.onsuccess = ({ target }) => {
const thread = target.result
thread.attr = !thread.attr
const putThread = store.put(thread)
putThread.onsuccess = () => resolve()
putThread.onerror = () => reject(new Error(`Error writing thread: ${putThread.error}`))
}
getThread.onerror = () => reject(new Error(`Error getting thread: ${getThread.error}`))
})
})
}
Get it going!
Result
Same result, no corruption
Test Three: separate read and write transactions
I'm curious what happens if I read the object in one transaction, update the list, then write the object in a separate transaction. This will test if the consistency I've achieved so far is a result of transaction serialization or something.
Let's modify our worker code, it will fetch a thread object in one transaction:
function getThread(db, threadId) {
const txn = db.transaction(['threads'], 'readonly')
const store = txn.objectStore('threads')
return new Promise((resolve, reject) => {
const getThread = store.get(threadId)
getThread.onsuccess = ({ target }) => {
const thread = target.result || { id: threadId, labelIds: [], attr: true }
resolve({ db, thread })
}
getThread.onerror = () => reject(new Error(`Error getting thread: ${getThread.error}`))
})
}
Then it updates the object list, before starting a new transaction and writing the object to IDB:
database()
.then((db) => getThread(db, threadId))
.then(({ db, thread }) => {
if (!thread.labelIds.includes(labelId)) {
thread.labelIds.push(labelId)
}
return new Promise((resolve, reject) => {
const txn = db.transaction(['threads'], 'readwrite')
txn.oncomplete = () => db.close()
const store = txn.objectStore('threads')
const putThread = store.put(thread)
putThread.onsuccess = () => resolve()
putThread.onerror = () => reject(new Error(`Error writing thread: ${putThread.error}`))
})
})
Let 'er rip!
Result
Immediately we see errors, missing labelIds:
Corruption accomplished! Almost half of the objects are missing labelIds. That's bad!
This suggests that updating a thread object outside of a transaction, then writing it to the store, is susceptible to data loss. But before you jump up and say "obviously!", think about the complex applications you've built. It's pretty common for data modifications to happen outside of database transactions, for example in response to user input, and then to push that data to storage, for example through an API. But IDB is only consistent with modifications to an object that is also fetched/read within the same transaction. At least according to this test.
Test Four: mixing updates inside and outside transactions
Finally, I wanted to know if updates made outside transactions could conflict with updates happening inside transactions. So I added another "conflicting" action to the worker, one that deletes the labelId outside of a transaction. And reverted the original add labelId action to use an in-transaction update, which should be safe from data loss?
Result
Tests were still passing:
But labelIds were not being removed from the thread objects:
Once I update the delete labelId worker to modify inside a transaction, labelId lists look more reasonable:
Analysis
Ok so what's going on here? IndexedDB seems capable of handling concurrent updates as long as objects are read, updated and written in the same transaction.
If I go back to the IDB docs, I recall how it highlighted that transactions followed some kind of ordering, based on when they are created. This leads me to think that transactions are "serialized", meaning they act as if they are executed one at a time (very watered-down definition of serialization).
Here's a sketch of the pattern that leads to data loss -
Notice that there is no way to order the "put" transactions that doesn't lose data. If Worker1 writes first, its array of labelIds will get overwritten by Worker2, and vice versa.
Now what if we update within a transaction -
Order doesn't matter for these transactions, as long as they are executed serially, the object in IDB doesn't lose any data!
Conclusion
So it seems that IDB is safe as long as updates are made to objects fetched within the same transaction. I even tested transactions with long running code between the get and put, and data consistency held. (There are some situations where transactions get closed after they are passed between closures, but that raises an error).
While it's great to see that IDB can be used with consistency, this mechanism dramatically affects how code using IDB is written! You can't reliably accumulate changes to an object, and then write that object to the database, without potential for stomping on another update happening simultaneously. So if your app has multiple code paths for updating IDB, this is a risk.
I should also note that there are some updates where order does really matter. For example, adding then removing a value relies on the "add" happening before the "remove", otherwise the final value is wrong. Again, pretty common for this to happen in a web app, say a user clicks really fast.
These kind of consistency problems are exacerbated by the fact that IDB only supports a primitive put method - there's no way to make partial or conditional updates. So as a developer, I'm left with solving for consistency, and it requires me to be very deliberate about how state is updated.
Ah well, keep on coding!
App.svelte
Simple page with a button to start and stop tests, as well as a summary of the results of the running test.
<script>
import { messageWorker } from './pool'
import { tick } from 'svelte'
let successCount = 0,
failedCount = 0,
totalCount = 0,
lastError = undefined,
testRunning = false
const url = new URL('./worker.js', import.meta.url)
function toggleTest() {
testRunning = !testRunning
if (testRunning) {
successCount = failedCount = totalCount = 0
lastError = undefined
}
startStopTest()
}
function startStopTest() {
if (testRunning) {
const promises = []
for (let step = 0; step < 15; step++) {
const lId = Math.floor(Math.random() * 1000)
const p = messageWorker({ threadId: step % 5, labelId: lId, action: 'addLabel' }, url)
.then(function ({ threadId, labelId }) {
return messageWorker({ threadId, labelId, action: 'toggleAttr' }, url)
})
.then(function ({ threadId, labelId }) {
return messageWorker({ threadId, labelId, action: 'verifyLabel' }, url)
})
.then(function ({ threadId, labelId }) {
return messageWorker({ threadId, labelId, action: 'deleteLabel' }, url)
})
promises.push(p)
}
Promise.all(promises)
.then(function (results) {
results.forEach(({ error, threadId, labelId }) => {
if (error) {
failedCount = failedCount + 1
lastError = { error, threadId, labelId }
} else {
successCount = successCount + 1
}
totalCount = totalCount + 1
})
return Promise.resolve()
})
.then(tick)
.then(startStopTest)
}
}
</script>
<main>
{#if testRunning}
<button class="btn btn-primary" on:click={toggleTest}>Stop test</button>
{:else}
<button class="btn btn-warning" on:click={toggleTest}>Start test</button>
{/if}
<p class="text-sm font-light py-4">
{totalCount} total tests, {successCount} passed, {failedCount} failed
</p>
{#if lastError != undefined}
<p class="text-lg text-error">
{lastError.error?.message || 'Unknown Error'}
</p>
{:else}
<p class="font-semibold italic text-lg text-primary-content">No errors</p>
{/if}
</main>
database.js
A utility function to open a connection to the database, and return the connection in a promise.
// ./database.js
export function database() {
return new Promise((resolve, reject) => {
const request = indexedDB.open('test_db', 1)
request.onsuccess = () => resolve(request.result)
request.onerror = () => reject(request.error)
request.onblocked = () => console.warn('pending till unblocked')
request.onupgradeneeded = (e) => {
const up = request.result
if (e.oldVersion < 1) {
const threads = up.createObjectStore('threads', {
keyPath: 'id',
autoIncrement: false
})
}
}
})
}
pool.js
A utility function to manage a pool of web workers and message a worker. A promise is returned, that will be invoked when the worker responds. Promises are tracked by a nonce
value is that is passed with the message.
// ./pool.js
const workers = []
const promises = {}
function resolveMessage(data) {
promises[data.nonce].resolve(data)
delete promises[data.nonce]
}
export function messageWorker(message, url) {
const ix = Math.floor(Math.random() * 5)
if (workers.at(ix) == undefined) {
workers[ix] = new Worker(url, { type: 'module' })
workers[ix].onmessage = (event) => resolveMessage(event.data)
}
const rand = Math.floor(Math.random() * 50000)
const nonce = `${rand}`
const promise = new Promise((resolve) => {
promises[nonce] = { resolve }
})
workers[ix].postMessage({ nonce, ...message })
return promise
}
worker.js
A background worker for manipulating data with various transaction patterns.
//./worker.js
import { database } from './database'
function handleMessage(message, respond) {
let pr
if (message.action == 'addLabel')
pr = addLabel(message.threadId, message.labelId)
else if (message.action == 'toggleAttr')
pr = toggleAttr(message.threadId, message.labelId)
else if (message.action == 'verifyLabel')
pr = verifyLabel(message.threadId, message.labelId)
else if (message.action == 'deleteLabel')
pr = deleteLabel(message.threadId, message.labelId)
else
pr = Promise.reject(new Error('No action in message'))
pr
.then(() => respond({ success: true, ...message }))
.catch((error) =>
respond({ error, success: false, ...message })
)
}
function addLabel(threadId, labelId) {
return database().then((db) => {
const txn = db.transaction(['threads'], 'readwrite')
txn.oncomplete = () => db.close()
const store = txn.objectStore('threads')
return new Promise((resolve, reject) => {
const getThread = store.get(threadId)
getThread.onsuccess = ({ target }) => {
const thread = target.result || { id: threadId, labelIds: [], attr: true }
if (!thread.labelIds.includes(labelId)) {
thread.labelIds.push(labelId)
}
const putThread = store.put(thread)
putThread.onsuccess = () => resolve()
putThread.onerror = () => reject(new Error(`Error writing thread: ${putThread.error}`))
}
getThread.onerror = () => reject(new Error(`Error getting thread: ${getThread.error}`))
})
})
}
function toggleAttr(threadId, labelId) {
return database()
.then((db) => getThread(db, threadId))
.then(({ db, thread }) => {
thread.attr = !thread.attr
return new Promise(function (resolve, reject) {
const txn = db.transaction(['threads'], 'readwrite')
txn.oncomplete = () => db.close()
const store = txn.objectStore('threads')
const putThread = store.put(thread)
putThread.onsuccess = () => resolve()
putThread.onerror = () => reject(new Error(`Error writing thread: ${putThread.error}`))
})
})
}
function verifyLabel(threadId, labelId) {
return database()
.then((db) => getThread(db, threadId))
.then(({ db, thread }) => {
db.close()
if (thread == undefined) {
return Promise.reject(new Error(`Missing thread ${threadId}`))
} else if (!thread.labelIds.includes(labelId)) {
return Promise.reject(new Error(`Thread ${threadId} is missing label ${labelId}`))
} else {
return Promise.resolve()
}
})
}
function deleteLabel(threadId, labelId) {
return database()
.then((db) => getThread(db, threadId))
.then(({ db, thread }) => {
thread.labelIds = thread.labelIds.filter((lid) => lid != labelId)
return new Promise((resolve, reject) => {
const txn = db.transaction(['threads'], 'readwrite')
txn.oncomplete = () => db.close()
const store = txn.objectStore('threads')
const putThread = store.put(thread)
putThread.onsuccess = () => resolve()
putThread.onerror = () => reject(new Error(`Error writing thread: ${putThread.error}`))
})
})
}
function getThread(db, threadId) {
const txn = db.transaction(['threads'], 'readonly')
const store = txn.objectStore('threads')
return new Promise((resolve, reject) => {
const getThread = store.get(threadId)
getThread.onsuccess = ({ target }) => {
const thread = target.result || { id: threadId, labelIds: [], attr: true }
resolve({ db, thread })
}
getThread.onerror = () => reject(new Error(`Error getting thread: ${getThread.error}`))
})
}
self.onmessage = (message) => {
handleMessage(message.data, postMessage)
}
Top comments (2)
Thanks so much for these thoughtful tests and writeup. Have you come up with a strategy for addressing this in your apps?
I was thinking perhaps something like tracking an active transaction as a singleton and using it til it's closed, assuming that it has been opened with the correct object stores; and if not, awaiting its completion before creating a new transaction.
A more conservative approach might be to just use a mutex throughout: every unit of work / transaction must await the completion of any current transaction.
Keeping an "active transaction" singleton comes with some pretty big drawbacks - it limits the app from using web workers, and fails to work if multiple tabs are open to the same page/site.
So far I've been able to get away with modeling the data where every update is idempotent. Sometimes this does require keeping extra data, such as keeping a list of deleted elements rather than deleting them from the object outright.
But for more complex applications, I do think the mutex approach would be best. Check out the Web Locks API, as it is designed to work across multiple open tabs.