Remi van der Laan

Creating a Sleek Masonry Gallery with React and WebAssembly

Three others and I have been working on a tool called Allusion in our spare time: a free image organization application built for artists. It runs in Electron as a ReactJS application.
One of its key components is the image gallery. Since users may import thousands of images, we can't just render them all using pure HTML and CSS. Over the course of development we tried out several out-of-the-box ReactJS packages (mainly react-window and react-virtualized), but none really suited our needs, be it in design or performance.
In the end, we wrote our own super slick image gallery from scratch. It turned out quite nice, so I wanted to share our findings.

The requirements we set for ourselves:

  • Keep as much as possible off the main UI thread to keep everything snappy
  • Keep computation time within a few milliseconds for up to ~10,000 images
  • Configurable thumbnail sizes
  • Three layout modes: A simple grid, vertical (column) masonry, horizontal (row) masonry

The main caveat of our method is that it needs to know image resolutions beforehand, though it could probably be adapted to measure them on the fly as well. Measuring on the fly is what made the alternatives we tried feel clunky, so we avoided it. Since we store the image dimensions in a database anyway, it's no problem for our use-case.

Our gallery is built up of three main sections:

  • The masonry layout algorithm itself, written in Rust
  • The webworker and shared memory between the main thread and WASM
  • The virtualized image renderer as a ReactJS component

Masonry algorithm in WebAssembly

Rust was something I had wanted to get into for a while, and it's a natural fit for WASM modules.
The module is set up with wasm-pack, which outputs the WASM file along with TypeScript definitions as an easily importable ES6 module.

Transferring data

To provide the WASM package with the image dimensions it uses as input, we define a vector of Transform structs:

pub struct Transform {
    src_width: u16,
    src_height: u16,
}

We chose to read the output of the layout computation from the same entry, for which we'll need some extra fields:

pub struct Transform {
    src_width: u16, // input dimensions (pixels)
    src_height: u16,
    width: u16,     // output dimensions (pixels)
    height: u16,
    left: u16,      // output offset in the layout (pixels)
    top: u16,
}

We then define a Layout as follows:

pub struct Layout {
    num_items: usize,
    items: Vec<Transform>,
    thumbnail_size: u16, // the desired output size
    padding: u16,        // the amount of pixels in between two images
}

To make the image dimensions accessible from JavaScript, we expose a method that returns a pointer to that items vector in WASM memory:

impl Layout {
    pub fn items(&self) -> *const Transform {
        self.items.as_ptr()
    }
}
Back in JavaScript land, we create a typed-array view on the WASM memory at that pointer and write the image dimensions into it one by one:
import { default as init, Layout } from 'masonry/pkg/masonry';

const MAX_ITEMS = 200_000; // reserved capacity: a few megabytes covers hundreds of thousands of images

const WASM = await init('masonry/pkg/masonry_bg.wasm');
const layout = Layout.new(numItems);
const itemsPtr = layout.items();
const items = new Uint16Array(WASM.memory.buffer, itemsPtr, MAX_ITEMS * 6); // 6 uint16s per item

async function computeLayout(images: Image[], containerWidth: number) {
  for (let i = 0; i < images.length; i++) {
    // Every item consists of 6 uint16s
    items[i * 6 + 0] = images[i].width;
    items[i * 6 + 1] = images[i].height;
  }
  await layout.compute_vertical(containerWidth); // I'll cover this method next!
  // And now we can do something with the layout!
}

function getItemTransform(index: number) {
  return {
    width:  items[index * 6 + 2], // same order as in the Rust struct
    height: items[index * 6 + 3],
    left:   items[index * 6 + 4],
    top:    items[index * 6 + 5],
  };
}

At first, we allocated memory for the transforms every time the layout was computed, but in practice the layout is re-computed many times over. To eliminate that overhead, we simply reserve a chunk of memory that we use for the lifetime of the module. With just a few megabytes we can support hundreds of thousands of images.
One extra change was necessary: the top offset can easily grow beyond the uint16 maximum of 65,535 pixels. For rows of square 200px thumbnails, four per row, we hit that limit after roughly 330 rows, or about 1,300 images. That's no good. We therefore moved the top offsets to a separate vector of u32 values, which is enough for over 20 million such rows.
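
For illustration, the constructor then looks roughly like this. This is a sketch: the capacity and default values are made up, the structs are assumed to be annotated with #[wasm_bindgen], Transform is assumed to derive Clone and Default, and top_offsets is the separate u32 vector just described.

use wasm_bindgen::prelude::*;

const MAX_ITEMS: usize = 200_000; // reserve ~2.4 MB of uint16 transforms up front

#[wasm_bindgen]
impl Layout {
    pub fn new(num_items: usize) -> Layout {
        Layout {
            num_items,
            // Allocate the full capacity once, so the buffer (and the pointer we
            // hand to JavaScript) never moves when the layout is recomputed
            items: vec![Transform::default(); MAX_ITEMS],
            // Top offsets live in a separate u32 vector so tall layouts don't overflow
            top_offsets: vec![0; MAX_ITEMS],
            thumbnail_size: 300,
            padding: 8,
        }
    }
}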

Layout algorithms

The vertical masonry layout is my personal favourite, so that's the one I'll cover here. It's quite simple really: we determine the number of columns that fit within the container width given the desired column width, then iteratively place each image in the shortest column up to that point.

Animation of the vertical masonry layout algorithm

impl Layout {
    pub fn compute_vertical(&mut self, container_width: u16) -> u32 {
        // First: Determine width of each column and initialize each column height at 0 pixels
        let (col_width, mut col_heights) = {
            let container_width = f32::from(container_width);
            let n_columns = (container_width / f32::from(self.thumbnail_size)).round();
            if n_columns == 0.0 {
                return 0;
            }

            let col_width = (container_width / n_columns).round() as u16;
            let col_heights: Vec<u32> = vec![0; n_columns as usize];
            (col_width, col_heights)
        };
        let item_width = col_width - self.padding;

        // Then loop over all images and place them in the shortest column
        let (current_items, _) = self.items.split_at_mut(self.num_items);
        for (item, top_offset) in current_items.iter_mut().zip(self.top_offsets.iter_mut()) {
            item.width = item_width;
            // Scale the height according to the source aspect ratio
            item.height = ((f32::from(item_width) / f32::from(item.src_width))
                * f32::from(item.src_height))
                .round() as u16;

            let shortest_col_index = col_heights
                .iter()
                .enumerate()
                .min_by_key(|(_idx, &val)| val)
                .map_or(0, |(idx, _val)| idx);

            item.left = shortest_col_index as u16 * col_width;
            *top_offset = col_heights[shortest_col_index];

            col_heights[shortest_col_index] += u32::from(item.height) + u32::from(self.padding);
        }

        // Return height of longest column
        col_heights.iter().max().map_or(0, |max| *max)
    }
}

Performance

Now, is this any good in practice? Well, I implemented the same layout computation function in TypeScript (transpiled down to JavaScript), and measured the performance of both for a gallery of 5000 images in release mode:

(Chart: layout computation time for 5,000 images, WASM vs. TypeScript)

It's a solid 0.2ms faster! Yeah... WebAssembly might have been a little overkill for a simple O(n) computation like this. It might even end up slower than the TS equivalent, since we need to copy all of the image dimensions into a buffer first. It does pave the way for a more complex layout computation though (I'll link to some resources at the end), for which I'm sure it would pay off.
As for the high peaks in the WASM measurements, I'm not completely sure what causes those. I would have expected those to happen for the TS version instead, since Rust doesn't do garbage collection. I couldn't find any weird things happening in the glue code generated by wasm-pack so I suspect it must be something from the WebAssembly runtime itself.
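
For reference, the TypeScript counterpart is essentially a direct translation of the Rust function, operating on plain objects instead of WASM memory. This is a sketch rather than the exact code we benchmarked:

interface Item {
  srcWidth: number;
  srcHeight: number;
  width: number;
  height: number;
  left: number;
  top: number;
}

function computeVerticalTS(
  items: Item[],
  containerWidth: number,
  thumbnailSize: number,
  padding: number,
): number {
  const nColumns = Math.round(containerWidth / thumbnailSize);
  if (nColumns === 0) return 0;
  const colWidth = Math.round(containerWidth / nColumns);
  const itemWidth = colWidth - padding;
  const colHeights = new Array<number>(nColumns).fill(0);

  for (const item of items) {
    item.width = itemWidth;
    item.height = Math.round((itemWidth / item.srcWidth) * item.srcHeight);

    // Place the image in the shortest column so far
    let shortest = 0;
    for (let c = 1; c < nColumns; c++) {
      if (colHeights[c] < colHeights[shortest]) shortest = c;
    }
    item.left = shortest * colWidth;
    item.top = colHeights[shortest];
    colHeights[shortest] += item.height + padding;
  }

  // Return the height of the longest column
  return Math.max(...colHeights);
}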

WebWorker with shared memory

Even though the computation takes less than a millisecond on my machine, it might not on low-end devices or under heavy load.
By computing the layout in a WebWorker, it won't interrupt the main UI thread, so the application stays responsive.
We opted to set up the WebWorker using Comlink, mainly for its ease of use.
We don't want to copy the memory buffer every time a message is sent from the worker. Figuring out how to set up shared memory between the WASM memory in the worker and the main thread was the biggest time sink of this adventure.
At first we sent the buffer as a Transferable, but this stopped working in a recent release of Chrome. Instead, we configure the WASM memory to be backed by a SharedArrayBuffer, which gives us the same capability. This is not supported out of the box: follow this guide to learn more.
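
For reference, SharedArrayBuffer requires cross-origin isolation in current Chromium. In an Electron app, one way to opt in is to inject the COOP/COEP headers from the main process. The snippet below illustrates the general approach; it is not Allusion's exact setup:

// main process: enable cross-origin isolation so SharedArrayBuffer is available
import { app, session } from 'electron';

app.whenReady().then(() => {
  session.defaultSession.webRequest.onHeadersReceived((details, callback) => {
    callback({
      responseHeaders: {
        ...details.responseHeaders,
        'Cross-Origin-Opener-Policy': ['same-origin'],
        'Cross-Origin-Embedder-Policy': ['require-corp'],
      },
    });
  });
});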

// masonry.worker.ts
import { default as init, InitOutput, Layout } from 'masonry/pkg/masonry';
import { expose } from 'comlink';

export class MasonryWorker {
  WASM?: InitOutput;
  layout?: Layout;
  items?: Uint16Array;

  async initializeLayout(numItems: number): Promise<Uint16Array> {
    this.WASM = await init('./wasm/masonry/pkg/masonry_bg.wasm');
    this.layout = Layout.new(numItems);
    const itemsPtr = this.layout.items();
    // The WASM memory is backed by a SharedArrayBuffer, so this view is usable
    // from both the worker and the main thread
    const sharedArrayBuffer = this.WASM.__wbindgen_export_0.buffer;
    this.items = new Uint16Array(sharedArrayBuffer, itemsPtr, MAX_ITEMS * 6); // 6 uint16s per item
    return this.items;
  }
}
expose(MasonryWorker, self);

// MasonryWorkerAdapter.ts
import { Remote, wrap } from 'comlink';
import MasonryWorkerClass, { MasonryWorker } from './masonry.worker';

export class MasonryWorkerAdapter {
  worker?: Remote<MasonryWorker>;
  items?: Uint16Array;

  async initialize(numItems: number) {
    const WorkerFactory = wrap<typeof MasonryWorker>(new MasonryWorkerClass());
    this.worker = await new WorkerFactory();
    this.items = await this.worker.initializeLayout(numItems);
    // And now here in the main thread we can access WASM memory that was initialized in the worker!
  }
}


Virtualized gallery renderer

The last step is to actually render the images in the layout that is computed. Since this is intended for a ReactJS application, the images are rendered as DOM nodes, but the same layout could also be used to render images in a canvas.
We could just put all images in the DOM, since the browser is quite good at only rendering what is visible in the viewport. We can make it a lot faster though, by only putting the images that are visible in the viewport into the DOM tree. This is called "virtualized rendering".
Any time the viewport dimensions change, the user scrolls, or a similar event occurs, we have to re-evaluate which images to render.

const VirtualizedRenderer = ({ containerWidth, images }: VirtualizedRendererProps) => {
  const layout = useMemo(() => ..., []);
  const viewportRef = useRef<HTMLDivElement>(null);
  const containerHeight = useMemo(() => layout.recompute(containerWidth), [layout, containerWidth]);

  // Find the top and bottom edge of the viewport in the layout (we do a binary search, sketched below)
  const [startRenderIndex, endRenderIndex] = determineViewportRegion(
    layout,
    viewportRef.current?.scrollTop ?? 0,
    viewportRef.current?.clientHeight ?? 0,
  );

  return (
    // One div as the scrollable viewport
    <div className={className} onScroll={handleScroll} ref={viewportRef}>
      {/* One div for the content */}
      <div style={{ width: containerWidth, height: containerHeight, position: 'relative' }}>
        {images.slice(startRenderIndex, endRenderIndex + 1).map((im, index) => {
          const fileListIndex = startRenderIndex + index;
          const transform = layout.getItemLayout(fileListIndex);
          return (
            <img
              key={im.id}
              style={{ position: 'absolute', ...transform }}
              src={im.src}
            />
          );
        })}
      </div>
    </div>
  );
};
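
For completeness, the viewport lookup referenced above looks roughly like this. It's a hypothetical sketch of determineViewportRegion, assuming the layout object exposes the item count and per-item transforms; item tops are only roughly in ascending order in a masonry layout, so the result is padded with a small overscan margin:

// Sketch of the viewport-region lookup (not the exact Allusion code)
function determineViewportRegion(
  layout: { numItems: number; getItemLayout(i: number): { top: number; height: number } },
  scrollTop: number,
  clientHeight: number,
  overscan = 10,
): [number, number] {
  if (layout.numItems === 0) return [0, 0];
  const viewportBottom = scrollTop + clientHeight;

  // Binary search for the first item that starts at or below the top of the viewport
  let lo = 0;
  let hi = layout.numItems - 1;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (layout.getItemLayout(mid).top < scrollTop) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }

  // Scan forward until items start below the bottom edge of the viewport
  let end = lo;
  while (end < layout.numItems && layout.getItemLayout(end).top <= viewportBottom) {
    end++;
  }

  return [Math.max(0, lo - overscan), Math.min(layout.numItems - 1, end + overscan)];
}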

Putting it all together, this is what we ended up with (links to a video on Imgur):
Demo video

Conclusion

Computing the masonry layout performs great. Scrolling is also much smoother, and the whole thing is more flexible than the popular NPM packages we tried out.
Making use of WebAssembly was not really worth the hassle in the end, since the computation is fairly simple. It was a nicely scoped problem for learning some Rust, though. Running the computation in a WebWorker does make all the difference. Use workers, people!

There are certainly improvements to be made. You could, for instance, only compute the layout for the section of the viewport you are looking at.
There are much bigger bottlenecks in the code surrounding the layout computation though: it may take dozens of milliseconds to fetch thousands of images from the database and to insert their resolutions into WASM memory. That could be solved by streaming in the data as it is being fetched. Both of these would add unnecessary complexity for our current use case, so we're calling it a day at this point!

Resources:
