DEV Community

Cover image for A New Horizon in Data Compression: Matchmaking Run-Length Algorithm
ViperT
ViperT

Posted on • Edited on

A New Horizon in Data Compression: Matchmaking Run-Length Algorithm

In the realm of data compression, efficiency is a sought-after virtue. The world may seem consumed by the rush towards innovation, but true progress lies in the nuanced methods that reshape our reality. Enter the "matchmaking" run-length algorithm, an inventive approach to data compression.

Introducing the Algorithm

The "matchmaking" run-length algorithm uses a boolean array based on a typed array to understand if a value in process requires to consume a length in the Uint8Array of lengths.

Why It Matters

In the landscape of data where run length is efficient, we often encounter values that don't repeat themselves. Using an entire byte (which is 8 bits) to specify that the value is only used once can be an extravagance.

# How It Works
Here's where our method turns conventional wisdom on its head.

  • The Lengths Table: This is a record of the lengths of continuous sequences of a given value.
  • The Values Table: It will include values that are repeated (and therefore have lengths) and those that are "alone" or not repeated.
  • The Boolean Array (Uint1Array): A novel addition, this acts as a switch that tells us when to consume a length and when to simply put the value once.

The Efficient Dance

The algorithm functions on typed arrays, and the mechanism can be understood as follows:

  • If a value repeats, the boolean array instructs to fetch the corresponding length from the lengths table.
  • If a value doesn't repeat (it is "alone"), the boolean array instructs to use the value itself.

This approach avoids consuming an entire byte to represent a single occurrence, an epitome of Affolter Matias's commitment to innovative elegance.

The Results

What we witness here is a prime example of the beauty and complexity of compression algorithms. This method can drastically reduce the memory footprint, driving forward our ceaseless quest for understanding and elevation in the technological arena.

Conclusion

The "matchmaking" run-length algorithm, with its unique utilization of boolean arrays and typed arrays, leads us to a promising threshold of data compression efficiency. Its fusion of simplicity and brilliance resonates with the resonant urge for technological excellence, showcasing that the pursuit of greatness involves rejecting the ordinary and striving for the extraordinary.

In action: https://www.npmjs.com/package/@asaitama/boolean-array

Real-world application

Image description

Using Typed Arrays for Pixel Art Color Indexing

In the realm of pixel art, color indexing plays a pivotal role in compressing the data, and this is where typed arrays come into play. Depending on the number of unique colors, different typed arrays can be used:

  • Uint8Array: Suitable for images with fewer than 256 unique colors. By using 8 bits to store a color reference, it becomes an efficient way to represent the color data.
  • Uint16Array or Bigger: For more complex images with a wider color palette, larger typed arrays like Uint16Array can be employed. These store references to the colors in 16 bits or more, allowing for a broader range of unique colors.

Storing color as a reference rather than a whole Uint32 color (which includes red, green, blue, and alpha channels) leads to a much lighter footprint. This alone provides a significant reduction in data size, yet the compression could be further optimized.

Matchmaking Run Length Algorithms for Improved Compression

The continuity of the same color indexes in pixel art was a factor that seemed inefficient and somewhat irksome. Despite the space-saving achieved through the use of Uint8Array or Uint16Array, there remained a need for a more succinct expression of continuous same-colored pixels. This led to the exploration of a novel solution.

By utilizing a matchmaking run length algorithm alongside a bit array (rather than a byte array), it became possible to reconcile the issue of "lonely patterns" that would have otherwise interfered with compression. This approach leverages two components:

  • Matchmaking Table: A table that links the lengths and values, managing the relationship between the repeated values and the number of times they occur.
  • Uint1Array: This specialized array informs the algorithm when to consume a length and when to represent a unique value, making the representation incredibly efficient.

The implementation of this matchmaking run length algorithm provided a performance boost (storage) that was 2.2 times lighter. It optimized the representation without sacrificing any detail or quality, proving to be a remarkably effective technique for compressing pixel art color indexes.

Top comments (0)