So, you heard about immutability and all the sweet benefits that come from it, but what is it actually?
We'll be using TypeScript in this article, but the same principles apply to other languages.
In simple words, immutability means that a value can never be changed or, in a deeper sense, that the observable behaviour of a value can never be changed.
When you deal with immutable values, you can always safely "capture" them because you know that they will never change – doing the same with mutable values requires making a copy.
Let's have a look at an example of immutable code:
let arr: ReadonlyArray<number> = []
while (arr.length < 10) {
  const newArr = [...arr, arr.length]
  setTimeout(() => {
    console.log(newArr.length)
  }, 0)
  arr = newArr
}
When executed, this code will print every number between 1 and 10 (inclusive) to the console.
If we were to do the same with mutable code, we would get a totally different result. Let's have a look:
const arr: Array<number> = []
while (arr.length < 10) {
  arr.push(arr.length)
  setTimeout(() => {
    console.log(arr.length)
  }, 0)
}
This code will instead print the number 10 ten times.
What we've done here is showcase what "capture" means: the call to setTimeout takes a reference from the local scope and carries it across to a different scope, and setTimeout will always execute after all the synchronous operations in our program are done. At that point, in the first example, we effectively have 10 different arrays, each with a different length, while in the second example, we have a single array containing all the numbers from 0 to 9 (and thus a length of 10).
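To make the idea of "capture" more concrete, here is a minimal sketch (not part of the original examples) showing that a callback closes over a variable and only runs once all synchronous work has finished:

let count = 0

setTimeout(() => {
  // By the time this callback runs, every synchronous increment below has already happened.
  console.log(count) // 3
}, 0)

count++
count++
count++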
Copy On Write
The technique we used in the previous example is commonly known as "copy on write", meaning whenever we intend to "mutate" something, we create a new copy of it, which allows the initial structure (pre-mutation) to remain unchanged (aka frozen).
We can verify that no changes happen to the original structure by using Object.freeze, which will raise an error (in strict mode) anytime a mutation is attempted:
let arr: ReadonlyArray<number> = Object.freeze([])
while (arr.length < 10) {
  const newArr = Object.freeze([...arr, arr.length])
  setTimeout(() => {
    console.log(newArr.length)
  }, 0)
  arr = newArr
}
The same technique can be applied to objects and generally any structure that can be "cloned".
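For instance, here is a hedged sketch of copy on write applied to a plain object (the Settings type and its fields are made up purely for illustration):

type Settings = Readonly<{ theme: string; fontSize: number }>

const original: Settings = { theme: "dark", fontSize: 14 }

// "Mutate" by building a brand new object instead of touching the original.
const updated: Settings = { ...original, fontSize: 16 }

console.log(original.fontSize) // 14 – the original stays unchanged
console.log(updated.fontSize)  // 16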
All good, isn't it? Well, all good except for performance: cloning becomes progressively more expensive as things grow.
const copyOnWrite = (n: number) => {
  console.time(`copy-on-write-${n}`)
  let arr: ReadonlyArray<number> = []
  while (arr.length < n) {
    const newArr = [...arr, arr.length]
    arr = newArr
  }
  console.timeEnd(`copy-on-write-${n}`)
}
copyOnWrite(10)
copyOnWrite(100)
copyOnWrite(1000)
copyOnWrite(10000)
copyOnWrite(100000)
This program will produce:
copy-on-write-10: 0.074ms
copy-on-write-100: 0.093ms
copy-on-write-1000: 1.034ms
copy-on-write-10000: 82.26ms
copy-on-write-100000: 52.010s
Clearly, this becomes prohibitive very quickly: building an array of n elements this way copies 1 + 2 + ... + n ≈ n²/2 elements in total, so the total work grows quadratically. This is especially problematic in problems of unbounded size (where you don't know the size of the structures you'll need to clone).
So, is immutability screwed for good? No... but copy on write is.
Persistent Data Structures
I won't go into all the details and techniques around Persistent Data Structures. If you're interested in the topic in more detail, I can recommend the amazing course by Erik Demaine entitled "Advanced Data Structures", available for free on the MIT OpenCourseWare platform: https://ocw.mit.edu/courses/6-851-advanced-data-structures-spring-2012/.
For the sake of this article, we'll say that persistent data structures are data structures that remember their history and each new "copy" shares as much as possible with the original.
In the case of the example seen earlier, a persistent array structure should remember and share all the elements with the previous one instead of having to clone everything every time.
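As a rough illustration of structural sharing (this is not how Chunk is implemented internally, just a minimal sketch), consider an immutable cons list where every "new" list simply reuses the previous one as its tail:

type List<A> = { readonly head: A; readonly tail: List<A> | undefined }

const prepend = <A>(head: A, tail: List<A> | undefined): List<A> => ({ head, tail })

const one = prepend(1, undefined)
const two = prepend(2, one)   // reuses `one` instead of copying it
const three = prepend(3, two) // reuses `two`, which reuses `one`

console.log(three.tail === two) // true – the old structure is shared as-is

No matter how long the list gets, each "update" allocates a single node and shares everything else with the previous version.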
Let's do the same example using a persistent array. We will use the "Chunk" data structure from the new @fp-ts/data, which is based on the following paper: http://aleksandar-prokopec.com/resources/docs/lcpc-conc-trees.pdf.
import * as Chunk from "@fp-ts/data/Chunk"

const persistent = (n: number) => {
  console.time(`persistent-${n}`)
  let arr: Chunk.Chunk<number> = Chunk.empty()
  while (arr.length < n) {
    const newArr = Chunk.append(Chunk.size(arr))(arr)
    arr = newArr
  }
  console.timeEnd(`persistent-${n}`)
}

persistent(10)
persistent(100)
persistent(1000)
persistent(10000)
persistent(100000)
As we can see, the code looks almost the same as the previous one, but its execution time doesn't even compare:
persistent-10: 0.534ms
persistent-100: 0.728ms
persistent-1000: 1.493ms
persistent-10000: 14.425ms
persistent-100000: 10.705ms
It may seem surprising that 100000 is even faster than 10000; that's because V8 (the JS engine behind Node.js) was able to JIT-optimize the code.
The point is that the cost of each operation no longer blows up with size, and all the previous references can be captured safely as they remain valid.
Magic? Feels like it sometimes.
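To tie this back to the opening example, here is a sketch (assuming @fp-ts/data is installed, and using only the Chunk functions already shown above) of the capture scenario rewritten with a persistent structure:

import * as Chunk from "@fp-ts/data/Chunk"

let arr: Chunk.Chunk<number> = Chunk.empty()
while (arr.length < 10) {
  const newArr = Chunk.append(Chunk.size(arr))(arr)
  setTimeout(() => {
    // Each callback captured its own chunk; none of them is ever mutated.
    console.log(newArr.length)
  }, 0)
  arr = newArr
}

As with the ReadonlyArray version, this logs every length from 1 to 10, but each step shares its elements with the previous one instead of cloning them.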
Conclusion
If you like these kinds of things, don't forget to check out all the work that is going on at https://github.com/fp-ts and https://github.com/effect-TS. And, if you can, donate a few $$ to support the amazing work of the contributors.
Corollary
Why is the Record & Tuple proposal https://github.com/tc39/proposal-record-tuple bad?
It conflates immutability with copy on write & structural freeze, and its implementation doesn't allow custom types to be mixed in, so using it is an all-or-nothing choice.
Overall, if accepted, I feel it would slow the adoption of proper immutable data structures in a close-to-irreversible manner, and it would be ignored by the functional community that immutability targets in the first place and where the concept of immutability comes from.
What would be better?
As you can see from this article, it is already possible to do everything we need to do, but some custom syntax would be nice; namely, the proposal would allow records and tuples to be compared using === and to be created using #{ a: 0 } / #[0, 1, 2], which is arguably a nice-to-have.
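For context, a plain fact about today's JavaScript (not something taken from the proposal text itself): === on objects and arrays compares references, which is exactly what the proposed value-based === for records and tuples would change:

const x = { a: 0 }
const y = { a: 0 }

console.log(x === y) // false – two different references, even though structurally equal
console.log(x === x) // true  – same reference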
An example of it would be the operator overloading proposal https://github.com/tc39/proposal-operator-overloading that would allow overriding ===
, and it would also solve in one shot many other problems (like numeric types extension, pipeable operator, and all the ones listed in the proposal).
In addition to that, a custom #
operator could be added to create objects & arrays using a prototype that overloads ===
making it value based.
Top comments (9)
In most applications, immutability is by convention rather than by logic – because of the performance costs. You can use as const in TypeScript to enforce immutability without a performance penalty. However, immutability is mostly associated with tree-walking reconciliation in order to achieve reactivity. The observer pattern, now seeing a renaissance as reactive signals, is better suited for such applications.
Have you even read the article before commenting? as const is type-level and doesn't do anything whatsoever at runtime; if you specify as const and then do copy on write, you will have 100% the same issues as not using as const. This article has nothing to do with reactivity or observer patterns, it's about a single thing: immutability.

I'm aware of that limitation. I never said it was enforced on the runtime level, or over reference boundaries. That much should be obvious to anybody with a bit of knowledge about TypeScript. And most people are very likely to meet the concept of immutability as part of a reactive framework first. If you are talking about immutability, it doesn't hurt to show the common use cases.
ReadonlyArray<number> is already immutable from the perspective of TS in that case; as const would produce a tuple type, which is not what is needed in the examples (as we need an array). This article, as stated at the beginning, introduces a concept that is valid across languages (TS is only used as an example), and the concept it introduces is how to achieve immutability without compromising performance (by using persistent data structures).

Nice article! About Chunks and persistent data structures in general, is there any big difference in terms of memory footprint? From the benchmark we can see that Chunks scale much better (from a compute-time perspective) than Arrays when the size of the collection grows, but have you got any input about the memory usage while the program is running? I'm guessing that it gets garbage collected correctly, but I would be interested to know if there exists some sort of drawback from a memory usage perspective that is hard to see in simple examples.
Thanks :) Sharing structure is more memory-efficient than copying everything, so I would say that if you want immutability, then persistent structures are the most efficient way. Clearly there is a price for immutability in itself; if you can achieve the same result using mutable structures, they should be preferred when caring about efficiency.
Thanks a lot for this quite clear article. It's concerning the way that immutability is heading; adding this performance impact will only make people completely avoid the feature. Regarding syntax, personally I wish JavaScript could implement a val declaration (like in Scala) to declare a truly immutable variable (hopefully backed by persistent data structures).

@mikearnaldi what's the other side of the coin, e.g. on Chunk (first) usage?
Chunk is O(1) in get, so performance is close to array access, and it is materialized to an array whenever traversed. In some sense it can be thought of as a "deferred" array with amortized O(1) for prepend / append / concat / get / slice.