Superluminal is a powerful sampling-based profiling tool that allows you to analyse games and applications written in C++, Rust and .NET, without making any modifications or adding performance markers.
It runs on Windows and supports Windows, Xbox One®, Xbox Series X®, PlayStation®4 and PlayStation®5 applications and games. For access to restricted platforms (consoles), you’ll need to confirm your Developer status with Superluminal; you’ll then receive the relevant plugins through an automated update.
It does not support mobile platforms. If you’re working on Android, you can use simpleperf and the System Trace View from the Android Studio Profiler instead. On iOS, you can use Instruments and the Time Profiler template.
To use it, you’ll need a paid license, but you can redeem a 14-day trial for evaluation purposes.
Let’s see how we can leverage Superluminal to profile our game. For this guide, I’ll be using Superluminal v.1.0.6470.
Symbols
Before we start profiling and analysing our game or application, let’s make sure we have debug symbols set up correctly.
Symbols are files that attach extra information to your executable, useful for debugging. They typically include function and variable names, source code filenames, line numbers, scopes, class names and so on.
In the Superluminal Settings window, you can define additional symbol file locations:
The new PDB parser option, at the bottom of the window, provides faster symbol resolution, but it is an experimental feature and may be unstable.
Unity includes stripped debug symbols as part of the engine installation; you can find the relevant PDB files in the Editor\PlaybackEngines[Platform]\Variations folder. Unity also maintains a public symbol server that provides stripped Windows debug symbols, located at https://symbolserver.unity3d.com/. There are no public symbol servers for other platforms.
When making a Windows build, make sure to enable “Copy PDB files” to ensure your final executable includes debug symbols.
On the other hand, always make sure that option is disabled before making release builds that you intend to distribute. Symbols result in a larger build size and expose additional information about your game’s source that is not meant to be shared.
Taking a capture
Superluminal allows you to profile your game to collect runtime performance information, or the editor itself (measuring, for example, asset import times and build times) to find bottlenecks and speed up your workflow.
To start collecting data, you first need to copy mono-profiler-superluminal.dll next to your executable (either the game or the editor). The DLL is normally located in the installation folder, under C:\Program Files\Superluminal\Performance\Unity; pick the x64 or x86 version to match the architecture of your executable. If you miss this step, you’ll get a prompt with a link to the DLL folder.
Since Unity 2022.2.0f1, there is built-in support for Superluminal. If you are using an older version, you also need to add a command line argument, -monoProfiler superluminal, so Unity knows that it needs to look for the DLL you just provided.
When running the editor, you can add the Command line argument from the Hub:
After that, make sure to restart the game or the editor.
If you attempt to use the argument on a version of Unity that has built-in support for Superluminal, you’ll get a warning.
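If you launch a player build from your own tooling, the version rule above can be encoded in a small helper. This is only a sketch: the SuperluminalLaunch class and its method names are hypothetical, and it simplifies the check by treating every 2022.2 release (including pre-release builds) as having built-in support.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of Unity or Superluminal.
public static class SuperluminalLaunch
{
    // True when the Unity version is older than 2022.2, which is when the
    // -monoProfiler argument is still needed according to the text above.
    // Assumes a version string shaped like "2021.3.5f1".
    public static bool NeedsMonoProfilerArgument(string unityVersion)
    {
        string[] parts = unityVersion.Split('.');
        int major = int.Parse(parts[0]);
        int minor = int.Parse(parts[1]);
        return major < 2022 || (major == 2022 && minor < 2);
    }

    // Builds the start info for the executable, adding the argument only
    // for versions without built-in Superluminal support.
    public static ProcessStartInfo BuildStartInfo(string executablePath, string unityVersion)
    {
        var info = new ProcessStartInfo(executablePath) { UseShellExecute = false };
        if (NeedsMonoProfilerArgument(unityVersion))
        {
            info.Arguments = "-monoProfiler superluminal";
        }
        return info;
    }
}
```

Passing the result to Process.Start launches the build; the executable path is whatever your own build produces.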
From the Superluminal window, you can select Run and then select the relevant executable, either for your game or for the Unity editor. This will start the application and attach the process directly.
Another way to use Superluminal is to attach it to a running instance. You can filter by processes that contain “unity” in their name, and as shown, you’ll find a few of them.
When profiling the engine, the one you’re looking for is the Unity.exe with the window title set to your project. The others are background processes, such as AssetImporters.
You can also select multiple processes if needed. If you want to include processes that may be spawned later on, you can select “Enable child process profiling” in the Capture options, and optionally define a filter to reduce noise.
The default sampling frequency is 8 kHz, but it can be changed from the Capture options when you select either Run or Attach. Different platforms offer different frequency ranges.
As shown above, you can also limit the capture to a specific duration or file size.
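While the Unity profiler is an instrumentation-based tool, Superluminal is mainly a sampling tool. This means that your capture needs to be long enough to contain a useful number of samples. For example, at 8 kHz a function that runs for 1 ms is hit by roughly 8 samples, so very short or infrequent calls only show up reliably in longer captures.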
When you’ve collected enough data, you can stop the capture and start analysing it. At that point, Superluminal will start resolving symbols, which might take a few minutes.
When that is done, a new window will appear with your capture.
Analysing your capture
Let’s go through the capture data now. The main window will show all available threads, and the work they’ve been doing.
You can expand a thread by pressing the Expand button next to it, and you can filter it or change ordering by right-clicking on it. To make things easier to navigate, you should select all the threads you don’t need and hide them.
Threads are sorted by how much work they are doing, so in most cases you’ll find the Main Thread at the top. You can look for specific portions of your execution by hitting Ctrl+F, which lets you search by Function Name or Instrumentation event. You can also select a time range by dragging in the main window, then pressing F.
When you hover over a specific call, you might see dashed lines appearing. These indicate thread interactions. By clicking on one, you’ll get more information on the blocking calls in the Thread Interaction tab below, including blocking and unblocking callstacks.
In the Call Graph below you get a top-down view of the most expensive calls in the specified time range. This includes the function name, the module (the DLL it comes from), exclusive and inclusive time, and thread state (whether it’s executing or synchronising). Inclusive time refers to the time taken by the function itself and all its children recursively, while exclusive time is the time spent only in that function.
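To make the difference between the two columns concrete, here is a minimal sketch that computes inclusive time from exclusive times on a tiny call tree. The CallNode type, the function names and the timings are all made up for illustration; this is not how Superluminal stores its data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical call-tree node used only for this illustration.
public sealed class CallNode
{
    public string Name { get; }
    public double ExclusiveMs { get; }   // time spent in the function body itself
    public List<CallNode> Children { get; } = new List<CallNode>();

    public CallNode(string name, double exclusiveMs, params CallNode[] children)
    {
        Name = name;
        ExclusiveMs = exclusiveMs;
        Children.AddRange(children);
    }

    // Inclusive time = own exclusive time + inclusive time of every child, recursively.
    public double InclusiveMs => ExclusiveMs + Children.Sum(c => c.InclusiveMs);
}

public static class InclusiveExclusiveDemo
{
    // Update (1 ms) calls Physics (2 ms), which calls Raycast (3 ms),
    // and Render (4 ms). All values are invented.
    public static CallNode BuildSampleTree()
    {
        return new CallNode("Update", 1.0,
            new CallNode("Physics", 2.0,
                new CallNode("Raycast", 3.0)),
            new CallNode("Render", 4.0));
    }
}
```

In this tree, Update has an exclusive time of 1 ms but an inclusive time of 10 ms (1 + 2 + 3 + 4), while Physics has 2 ms exclusive and 5 ms inclusive. Sorting by inclusive time points you at the expensive subtree; sorting by exclusive time points you at the function actually burning the CPU.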
You’ll also get a pie chart showing function time distribution on the right.
The Expand Hot Path button (the flame-looking one at the top of the Call Graph tab) goes through the call stack and highlights the most expensive calls, which is quite convenient.
In some cases, you might want to focus on a single call and analyse from there. To do so, you can right-click on a call and select “Set as Root”. This will refresh the Call Graph so that it starts from that specific call, which makes it easier to navigate.
If you switch to the Function List tab, it will show a raw list of all the functions called in your sample, allowing you to sort by Exclusive time and understand what’s taking most of your CPU time.
You can right-click on a function call in the Threads view and select “Show Symbol info” to check, in the Modules window, that the relevant debug symbols were resolved.
If you haven’t set up the symbols or added the DLL next to your executable, you’ll notice this warning in the Source and Disassembly view when selecting the function from the Function List. You’ll then be unable to inspect source file information.
Adding Instrumentation markers
While Superluminal is mainly a sampling tool, it also lets you add instrumentation through its Performance API, which you can use to create event markers and set thread names.
To do that in Unity, you can use the SuperluminalPerf .NET wrapper, available on NuGet (https://www.nuget.org/packages/SuperluminalPerf/) and GitHub (https://github.com/xoofx/SuperluminalPerf).
First of all, initialise the API as soon as the game starts:
SuperluminalPerf.Initialize();
If Superluminal is not in the default path, you’ll need to pass the actual path for the PerformanceAPI.dll to this call.
You can then create markers by starting and ending events, as shown below:
SuperluminalPerf.BeginEvent("LevelLoading");
// Your Level loading logic
SuperluminalPerf.EndEvent();
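Since every BeginEvent needs a matching EndEvent, it can help to wrap the pair in a disposable scope so the event is closed even if the code in between throws or returns early. The sketch below is one possible way to do that; MarkerScope and MarkerScopeExample are hypothetical helpers, not part of SuperluminalPerf. The begin and end calls are passed in as delegates so the helper compiles and can be tested without the profiler installed.

```csharp
using System;

// Hypothetical helper (not part of SuperluminalPerf): begins an event on
// construction and ends it on Dispose, keeping the pair balanced.
public sealed class MarkerScope : IDisposable
{
    private readonly Action _end;
    private bool _disposed;

    public MarkerScope(string name, Action<string> begin, Action end)
    {
        if (begin == null) throw new ArgumentNullException(nameof(begin));
        _end = end ?? throw new ArgumentNullException(nameof(end));
        begin(name);
    }

    public void Dispose()
    {
        if (_disposed) return;   // a second Dispose must not end an outer event
        _disposed = true;
        _end();
    }
}

public static class MarkerScopeExample
{
    // In a project that references SuperluminalPerf, pass
    //   name => SuperluminalPerf.BeginEvent(name)   and   () => SuperluminalPerf.EndEvent()
    // as the two delegates.
    public static void LoadLevel(Action<string> begin, Action end, Action loadLogic)
    {
        using (new MarkerScope("LevelLoading", begin, end))
        {
            loadLogic();   // your level loading logic
        }
    }
}
```

The using block guarantees that the end delegate runs when the scope exits, which is the same behaviour as writing the BeginEvent and EndEvent calls in a try/finally by hand.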
Finally, you can set the current thread name using the related call:
SuperluminalPerf.SetCurrentThreadName("LevelLoadingThread");