DEV Community

keaukraine

Variable Rate Shading on Adreno GPUs 🇺🇦

“With high screen DPI doesn’t come high GPU fillrate” is the main problem of GPUs nowadays. Modern consoles struggle to sustain a stable 30 fps, let alone 60 fps, on large 4K screens. The common technique to increase FPS is rendering at a lower resolution and upscaling with techniques like DLSS and FSR. But modern VR-capable hardware has to target both very high frame rates and high image quality, and upscaling shows its limitations here: depending on the implementation, the image will be blurry, oversharpened, or will show ghosting artifacts. Variable rate shading (VRS) is a temporally stable approach to improving performance with (if applied correctly) virtually unnoticeable quality reduction.

Modern mobile Adreno GPUs by Qualcomm support Variable Rate Shading, and phones with these GPUs have been available since autumn 2021. Because our live wallpapers have to be power-efficient, we got a test device with an Adreno 642L to implement this feature in our apps.

What is Variable Rate Shading

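The idea behind VRS is to run the fragment shader once for a block of adjacent pixels and reuse the resulting color for every covered pixel in that block, while triangle coverage is still resolved at full resolution.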
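To get a feel for the numbers, here is a minimal sketch of how many fragment shader invocations a single full-screen draw needs at different block sizes. The 1080x2400 resolution and the function name are assumptions made for illustration, and the estimate is idealized: real savings are smaller because partially covered blocks at triangle edges are still shaded and shading is not the only cost of a frame.

```python
def shaded_fragments(width, height, block_w, block_h):
    """Idealized number of fragment shader invocations for one full-screen
    draw when a single invocation covers a block_w x block_h pixel block."""
    blocks_x = (width + block_w - 1) // block_w   # round up partial blocks
    blocks_y = (height + block_h - 1) // block_h
    return blocks_x * blocks_y


# Hypothetical render target size, not taken from the article.
WIDTH, HEIGHT = 1080, 2400

for block_w, block_h in [(1, 1), (2, 1), (2, 2), (4, 2), (4, 4)]:
    count = shaded_fragments(WIDTH, HEIGHT, block_w, block_h)
    print(f"{block_w}x{block_h}: {count} invocations")
```

A 4x4 block cuts the invocation count for that draw to one sixteenth, which is why heavy rates are so attractive for large, smooth surfaces.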

A good explanation of how VRS is implemented on Adreno GPUs can be found in the official Qualcomm Developer blog here. You can see how simple it is by looking at this image from the aforementioned blog post:
VRS - image by Qualcomm

VRS is better than a generic downsample of the whole frame because:

  1. It preserves geometry edges (except when the shape is determined by discarding fragments).
  2. It can be adjusted per draw call: one object can be rendered at full detail while another is rendered at reduced quality.
  3. It can be applied dynamically to hold a target FPS by gradually reducing image quality.
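The third point can be sketched as a small feedback loop. This is only an illustration under assumptions, not something our wallpapers ship: the ladder of block sizes, the `headroom` threshold and the function name are all hypothetical.

```python
# Ladder of block sizes (width, height), finest first. Illustrative ordering.
RATE_LADDER = [(1, 1), (2, 1), (2, 2), (4, 2), (4, 4)]


def next_rate_index(index, frame_ms, budget_ms, headroom=0.8):
    """Step one rung coarser when a frame is over budget, one rung finer
    when it is comfortably under budget, otherwise stay where we are."""
    if frame_ms > budget_ms and index < len(RATE_LADDER) - 1:
        return index + 1
    if frame_ms < budget_ms * headroom and index > 0:
        return index - 1
    return index


# Simulated frame times in milliseconds against a 16.6 ms budget.
index = 0
for frame_ms in [18.0, 19.5, 17.0, 12.0, 16.0]:
    index = next_rate_index(index, frame_ms, budget_ms=16.6)
print("block size after simulation:", RATE_LADDER[index])
```

The gap between the two thresholds keeps the rate from oscillating every frame when the frame time hovers around the budget.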

Implementation

On Snapdragon SoCs it is implemented with the QCOM_shading_rate extension. Adreno GPUs support blocks of 1x1, 1x2, 2x1, 2x2, 4x2, and 4x4 pixels. Please note that some useful dimensions like 2x4 or 4x1 are not available because the hardware does not support them.
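Because the list of block sizes has gaps, it can help to map a wished-for block size to the closest one the hardware accepts. The sketch below is a hypothetical helper (the name and the fallback policy are our assumptions, not part of the extension): it picks the largest supported block that does not exceed the request in either dimension.

```python
# Block sizes (width, height) listed above as supported on Adreno.
SUPPORTED_BLOCKS = [(1, 1), (1, 2), (2, 1), (2, 2), (4, 2), (4, 4)]


def closest_supported(block_w, block_h):
    """Largest supported block that fits inside the requested block."""
    if block_w < 1 or block_h < 1:
        raise ValueError("block dimensions must be at least 1")
    candidates = [
        (w, h) for (w, h) in SUPPORTED_BLOCKS if w <= block_w and h <= block_h
    ]
    # (1, 1) always qualifies, so candidates is never empty here.
    return max(candidates, key=lambda block: block[0] * block[1])


print(closest_supported(4, 1))  # falls back to a narrower block
print(closest_supported(2, 4))  # falls back to a shorter block
```

Never exceeding the request in either dimension keeps the fallback conservative: it may save less work than hoped, but it never reduces quality along an axis that was meant to stay sharp.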

To apply VRS to certain objects, you simply call glShadingRateQCOM with the desired rate before the corresponding draw calls.

To disable VRS for geometries which should preserve details and be rendered at native shading rate, simply call glShadingRateQCOM with 1x1 block size.
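The call order can be sketched without a GPU. In the snippet below `RecordingGL` is a hypothetical stand-in that only records calls, and rates are plain labels such as "4x2" rather than real GL enum values; in an actual renderer `shading_rate` is where glShadingRateQCOM would be invoked.

```python
class RecordingGL:
    """Stand-in for a GL binding: records calls instead of issuing them."""

    def __init__(self):
        self.calls = []

    def shading_rate(self, rate):
        # A real renderer would call glShadingRateQCOM here.
        self.calls.append(("shading_rate", rate))

    def draw(self, name):
        self.calls.append(("draw", name))


def render_frame(gl, objects):
    """objects is a list of (name, rate) pairs in draw order."""
    current = None
    for name, rate in objects:
        if rate != current:  # skip redundant state changes
            gl.shading_rate(rate)
            current = rate
        gl.draw(name)
    if current != "1x1":
        # Leave the state at native rate for whatever is drawn next.
        gl.shading_rate("1x1")


gl = RecordingGL()
render_frame(gl, [("sky", "4x2"), ("ground", "2x1"), ("leaves", "1x1")])
for call in gl.calls:
    print(call)
```

Sorting draw calls so that objects with the same rate are adjacent keeps the number of rate changes low, though whether that matters for performance is something to measure rather than assume.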

One of the first apps we’ve added VRS support to is Bonsai Live Wallpaper. This is a good example because it has 3 very different types of geometries ranging from perfect candidates for VRS optimizations to the very unsuitable ones.

Let’s take a look at a typical scene from the app and how different parts of the image can benefit from a reduced shading rate:
Bonsai 3D live wallpaper screenshot

The best candidate for VRS is geometry that is blurred and has little color variation between fragments. So for the sky background we apply a fairly heavy 4x2 VRS, which still introduces virtually no quality degradation, especially with a constantly moving camera.

At the opposite end of the scale is the leaf geometry. In the screenshot below we applied 4x4 VRS to the whole scene to showcase the issue with alpha-testing. Please note that the branches, while rendered with the same heavy 4x4 reduction in this example, still have smooth, anti-aliased edges, clearly showing a benefit of VRS over traditional upscaling.
VRS distortions on geometries with discarded fragments
Needless to say, VRS is clearly not suitable for geometries with discarded fragments.

Also, because VRS is applied in screen space, it introduces significant distortions to transparent dust particles. Their size is comparable to a VRS block, so they start flickering during movement. I’ve noticed a somewhat similar rendering technique in the COD:MW game on PC when enabling half-resolution particles: sparks and other small particles flicker way too much and look very blocky.

And somewhere between these two geometries lies the ground plane, where we apply a 2x1 rate reduction. Image quality stays OK because 2x1 blocks reduce only the horizontal shading resolution, and on the ground the color difference between horizontally adjacent pixels is smaller than between vertically adjacent ones.

Where VRS definitely shines is on geometries with very little color difference between adjacent fragments, and Bonsai wallpaper has a stylized silhouette mode where fragments use literally a single color:
Bonsai live wallpaper, silhouette mode

Here we have 3 types of shaders:

  1. Alpha-testing for leaves. We already know that we should not apply VRS to these geometries.
  2. Solid black silhouette and ground. The heaviest 4x4 VRS introduces literally zero quality degradation.
  3. For the sky gradient we use 2x1 blocks. Technically it would be perfect to have 4x1 or even 16x1 blocks, because the gradient changes only vertically and adjacent horizontal fragments have identical color, but 2x1 is the widest single-row block Adreno hardware supports.

With all of these applied, the scene renders identically (a screenshot comparison found 0 differing pixels) while shading is 1.5x faster.
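A zero-difference check like this is easy to reproduce. The sketch below is a stand-in for whatever image tool you prefer, not the tool we used: it assumes both screenshots are already decoded into equally sized rows of RGB tuples, and the function name is hypothetical.

```python
def count_different_pixels(image_a, image_b):
    """Count pixels that differ between two images given as lists of rows,
    where each row is a list of (r, g, b) tuples."""
    if len(image_a) != len(image_b):
        raise ValueError("images have different heights")
    different = 0
    for row_a, row_b in zip(image_a, image_b):
        if len(row_a) != len(row_b):
            raise ValueError("images have different widths")
        different += sum(1 for pa, pb in zip(row_a, row_b) if pa != pb)
    return different


# Tiny 2x2 examples: a solid black silhouette and a copy with one changed pixel.
BLACK = (0, 0, 0)
reference = [[BLACK, BLACK], [BLACK, BLACK]]
changed = [[BLACK, BLACK], [BLACK, (1, 0, 0)]]

print(count_different_pixels(reference, reference))
print(count_different_pixels(reference, changed))
```

An exact comparison only makes sense for scenes like the silhouette mode; for regular scenes a per-channel tolerance would be needed to separate VRS artifacts from noise.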

Dynamic quality

All our wallpapers have some way of reducing GPU load when the battery is low. Usually this is done by limiting FPS and omitting a couple of effects.

For more efficient power usage we apply stronger VRS to certain objects in low battery mode: tree trunks are shaded with 2x1 blocks, while the sky and transparent effects (light shafts and vignette) are shaded with 4x4 blocks instead of 4x2 or 2x2. This reduction of quality is still almost unnoticeable but reduces GPU load by an additional 3%.
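One simple way to organize this is a base table of rates plus a small set of low-battery overrides. The sketch below is illustrative only: the object names are hypothetical, and the regular-mode entries for trunks (1x1) and transparent effects (2x2) are assumptions inferred from the text rather than values read from our code.

```python
# Regular-mode rates. "trunk" and "effects" values are assumptions.
REGULAR_RATES = {
    "sky": "4x2",
    "effects": "2x2",
    "ground": "2x1",
    "trunk": "1x1",
    "leaves": "1x1",     # alpha-tested, never coarsened
    "particles": "1x1",  # too small for coarse blocks
}

# Only the objects whose rate changes in low battery mode.
LOW_BATTERY_OVERRIDES = {
    "sky": "4x4",
    "effects": "4x4",
    "trunk": "2x1",
}


def rate_for(obj, low_battery):
    """Return the block size label to use for an object in the given mode."""
    if low_battery and obj in LOW_BATTERY_OVERRIDES:
        return LOW_BATTERY_OVERRIDES[obj]
    return REGULAR_RATES[obj]


for obj in REGULAR_RATES:
    print(obj, rate_for(obj, low_battery=False), rate_for(obj, low_battery=True))
```

Keeping the overrides separate makes it obvious which objects are never coarsened, such as the alpha-tested leaves.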

Performance gains vs quality tradeoff

You will be hard-pressed to find any difference between the original and VRS-optimized rendering: color deviation is negligible, and blocky artifacts are really hard to spot. Only ImageMagick was able to show the differing pixels:
Image quality comparison

Both the VRS-enabled and the regular rendering pipelines hold a steady 120 FPS on our test device (Samsung Galaxy A52s), so we ran Snapdragon Profiler to analyze the performance and efficiency of the optimized build. Here are the numbers:

Bonsai 3D Live Wallpaper, regular mode:
VRS performance table

Bonsai 3D Live Wallpaper, battery saving mode:
VRS performance table

Bonsai 3D Live Wallpaper, silhouette mode:
VRS performance table

In the silhouette scene we don’t use different VRS blocks for regular and power saving modes, because it already uses the maximum block size and still renders an image identical to the non-VRS one.


Long story short, we’ve improved rendering efficiency by approximately 30% with little to (literally) no image quality reduction.