Creating an emulator for Xbox 360 games on a PC is challenging, and developers may encounter treacherous bugs at each stage of development. Let's explore some common issues that may arise during the process using the Xenia project as a case study.
Intro
While searching for GameDev content, I found an article from the developers behind the Xenia emulator. The graphics developer shared their experiences with emulating Xbox 360 and how they've managed to make it work. I really enjoyed the article, so I was particularly pleased to see that the project is still going strong. I think this is a great project to check using PVS-Studio. Yet, It's also a cool opportunity for me to write my first article—a true win-win situation :)
Xenia is a research emulator for the Xbox 360 platform. The project aims to experiment, research, and educate on the topic of emulating modern devices and operating systems. Lower your black flags, pirates! All information is obtained via reverse engineering of legally purchased devices, games, and content published online.
The PVS-Studio static analyzer likely needs no introduction, so I'll just note that I used the latest 7.33 release along with the plugin for Visual Studio.
Btw, since I'm talking about GameDev, I'd like to highlight that in the latest release, my team has significantly fostered the quality of analysis for Unreal Engine-driven projects. You can learn more about it here.
This project has no release tags or branches, so the code matches the 3d30b2e commit.
I'll start with the major issues and smoothly move on to suggestions on how to enhance the code. Let's drive and delve into the detected errors!
Errors? He-he, errors :)
Fragment N1
void StfsContainerDevice::BlockToOffsetSVOD(size_t block, ....)
{
....
const size_t BLOCK_SIZE = 0x800;
const size_t HASH_BLOCK_SIZE = 0x1000;
const size_t BLOCKS_PER_L0_HASH = 0x198;
const size_t HASHES_PER_L1_HASH = 0xA1C4;
const size_t BLOCKS_PER_FILE = 0x14388;
const size_t MAX_FILE_SIZE = 0xA290000;
const size_t BLOCK_OFFSET =
header_.metadata.volume_descriptor.svod.start_data_block();
....
// Resolve the true block address and file index
size_t true_block = block - (BLOCK_OFFSET * 2);
....
size_t file_block = true_block % BLOCKS_PER_FILE;
size_t file_index = true_block / BLOCKS_PER_FILE;
size_t offset = 0;
// Calculate offset caused by Level0 Hash Tables
size_t level0_table_count = (file_block / BLOCKS_PER_L0_HASH) + 1;
offset += level0_table_count * HASH_BLOCK_SIZE;
// Calculate offset caused by Level1 Hash Tables
size_t level1_table_count = (level0_table_count / HASHES_PER_L1_HASH) + 1;
offset += level1_table_count * HASH_BLOCK_SIZE;
....
}
The PVS-Studio warning:
V1064 The 'level0_table_count' operand of integer division is less than the 'HASHES_PER_L1_HASH' one. The result will always be zero. stfs_container_device.cc 500
The analyzer warns that the level1_table_count value always equals 0 because the level0_table_count left operand is less than the HASHES_PER_L1_HASH right operand during an integer division operation. The value of_ HASHES_PER_L1_HASH_ is 41412, so to determine the level0_table_count value, take a look at the code above.
The file_block variable is computed by dividing the true_block by BLOCKS_PER_FILE, so it's within the range of [0 .. 82823].
The BLOCKS_PER_L0_HASH variable divides the value by 408, and the 1 is added to the result. When file_block reaches its maximum value, we'll get 202, so the level0_table_count variable value is within the range of [1 .. 203].
So, the level1_table_count variable is computed as 203/41412+1, and equals to 1 for any true_block values.
Could we get it wrong somewhere? It appears not, it's not just our analyzer that warns about it.
It's an interesting case where we can't even spot the error at a glance. A reviewer might easily miss it because it's time-consuming and tedious to calculate manually.
There is a code comment that might shed light on this mystery. Perhaps someone has already got some thoughts on the matter?
Fragment N2
if (unwind_info->CountOfCodes % 1)
{
// Count of unwind codes must always be even.
std::memset(&unwind_info->UnwindCode[unwind_info->CountOfCodes + 1], 0,
sizeof(UNWIND_CODE));
...
}
The PVS-Studio warning:
V1063 The modulo by 1 operation is meaningless. The result will always be zero. x64_code_cache_win.cc 299
The comment says that the condition is a check whether the variable is even. However, the modulo by 1 always yields 0, so the condition won't be executed.
It seems that a bug can't lurk in such a straightforward task. To ensure the program operates correctly, it'd be better to use division with modulo by 2 instead of 1:
if (unwind_info->CountOfCodes % 2)
Fragment N3
Be careful when using union, because we can easily get undefined behavior here.
The PVS-Studio warning:
V614 Uninitialized variable 'desc.page_count' used. xex_module.cc 594
int XexModule::ReadImageBasicCompressed(....)
{
....
for (uint32_t i = 0; i < xex_security_info()->page_descriptor_count; i++)
{
// Byteswap the bitfield manually.
xex2_page_descriptor desc;
desc.value = xe::byte_swap(
xex_security_info()->page_descriptors[i].value);
total_size += desc.page_count * heap->page_size();
}
....
}
The code creates an object of the xex2_page_descriptor structure, which looks like this:
struct xex2_page_descriptor
{
union
{
xe::be<uint32_t> value; // 0x0
struct
{
xex2_section_type info : 4;
uint32_t page_count : 28;
};
};
char data_digest[0x14]; // 0x4
};
When working with union in C++, we can read only from the active data member that was last written to. Otherwise, the behavior is undefined. This sets C++ apart from C, where we can write to one data member and read from another.
Some may not be aware of the behavior or may simply forget about it. Compilers come to the rescue here—they support this behavior as a non-standard extension. However, don't rely on it; undefined behavior may arise in the future once we update or change a compiler.
How can we address the issue in C++? Starting with C++20, we can use std::bit_cast for these purposes:
struct xex2_section_info
{
xex2_section_type info : 4;
uint32_t page_count : 28;
};
....
xe::be<uint32_t> value = xe::byte_swap(
xex_security_info()->page_descriptors[i].value
);
auto section_info = std::bit_cast<xex2_section_info>(value);
total_size += section_info.page_count * heap->page_size();
Before C++20, we can use memcry:
struct xex2_section_info
{
xex2_section_type info : 4;
uint32_t page_count : 28;
};
....
xe::be<uint32_t> value = xe::byte_swap(
xex_security_info()->page_descriptors[i].value
);
xex2_section_info section_info;
memcpy(§ion_info, &value, sizeof(section_info);
total_size += section_info.page_count * heap->page_size();
The reader may argue, "Fine, we used to treat the written value as a value of a different type, and now we copy it."No worries! Compilers are set up to detect the pattern and optimize it, no copying will occur.
Alternatively, we can implement bit_cast before C++20.
Here are some other similar warnings:
- V614 Uninitialized variable 'desc.page_count' used. xex_module.h 89
- V614 Uninitialized variable 'desc.page_count' used. xex_module.cc 594
- V614 Uninitialized variable 'desc.page_count' used. xex_module.cc 995
- V614 Uninitialized variable 'desc.info' used. xex_module.cc 996
- V614 Uninitialized variable 'desc.page_count' used. xex_module.cc 1071
- V614 Uninitialized variable 'desc.page_count' used. xex_module.cc 1472
- V614 Uninitialized variable 'desc.info' used. xex_module.cc 1474
- V614 Uninitialized variable 'page_descriptor.page_count' used. user_module.cc 687
Fragment N4
Sometimes, the formatting doesn't help us understand the code either. Let's take a look at such a case:
void D3D12CommandProcessor::CheckSubmissionFence(....)
{
....
if (SUCCEEDED(
direct_queue->Signal(queue_operations_since_submission_fence_,
fence_value) &&
SUCCEEDED(queue_operations_since_submission_fence_
->SetEventOnCompletion(fence_value,
fence_completion_event_))))
{
WaitForSingleObject(fence_completion_event_, INFINITE);
queue_operations_done_since_submission_signal_ = false;
}
....
}
The PVS-Studio warning:
V716 Suspicious type conversion: bool -> HRESULT. A cast is performed between semantically different types. d3d12_command_processor.cc 2649
The analyzer detected a suspicious logical operation with operands of the HRESULT and bool types. We can write such an operation but like for what? It makes no sense because HRESULT represents a status and has a complex format that's unrelated to bool.
The code operates as follows:
- The ID3D12CommandQueue::Signal member function has been called via the direct_queue pointer and returns HRESULT.
- The left operand is converted from HRESULT to bool. So, any non-zero value will be true, otherwise—false.
- If the left operand is true, the ID3D12Fence::SetEventOnCompletion member function is called via the queue_operations_since_submission_fence_ pointer.
- The result of the previous operation is passed to the SUCCEEDED macro that correctly converts HRESULT to bool.
- The result of the previous conversion is passed to the SUCCEEDED macro.
- A branch will be selected based on the macro result.
Developers might simply have misplaced the parentheses. So, the final code should use two SUCCEEDED results as operands of a logical AND:
if (SUCCEEDED(direct_queue
->Signal(queue_operations_since_submission_fence_,
fence_value))
&&
SUCCEEDED(queue_operations_since_submission_fence_
->SetEventOnCompletion(fence_value,
fence_completion_event_)))
{
....
}
I think it's still a pleasure to look at such a wall of code within the if statement. To enhance readability, I'd put it in a variable:
bool res = SUCCEEDED(
direct_queue->Signal(queue_operations_since_submission_fence_,
fence_value)
);
res = res
&& SUCCEEDED(
queue_operations_since_submission_fence_
->SetEventOnCompletion(fence_value, fence_completion_event_)
)
);
if (res)
{
....
}
Fragments N5
Copy-paste errors can often be difficult to spot, that's why we thoroughly check the code, both via code reviews and using the analyzer.
resolve_fsi_clear_32bpp_pipeline_ =
ui::vulkan::util::CreateComputePipeline(....);
if (resolve_fsi_clear_32bpp_pipeline_ == VK_NULL_HANDLE) {
XELOGE(
"VulkanRenderTargetCache: Failed to create the 32bpp resolve EDRAM "
"buffer clear pipeline");
Shutdown();
return false;
}
resolve_fsi_clear_64bpp_pipeline_ =
ui::vulkan::util::CreateComputePipeline(....);
if (resolve_fsi_clear_32bpp_pipeline_ == VK_NULL_HANDLE) { // <=
XELOGE(
"VulkanRenderTargetCache: Failed to create the 64bpp resolve EDRAM "
"buffer clear pipeline");
Shutdown();
return false;
}
We can observe some similar code blocks for defining and checking the resolve_fsi_clear_32bpp_pipeline_ and resolve_fsi_clear_64bpp_pipeline_ variables, for which the PVS-Studio analyzer issues a warning:
V1051 Consider checking for misprints. It's possible that the 'resolve_fsi_clear_64bpp_pipeline_' should be checked here. vulkan_render_target_cache.cc 778
The developers redundantly checked resolve_fsi_clear_32bpp_pipeline_ _for validity instead of _resolve_fsi_clear_64bpp_pipeline_. It's a pretty straightforward case—the string in the second condition indicates an error related to the 64bpp variable. The fix is simple: just replace the variable in the second condition with resolve_fsi_clear_64bpp_pipeline_.
Fragment N6
template <Domain domain_>
struct NtSystemClock
{
....
[[nodiscard]] static time_point now() noexcept
{
if constexpr (domain_ == Domain::Host)
{
// QueryHostSystemTime() returns
// windows epoch times even on POSIX
return from_file_time(Clock::QueryHostSystemTime());
}
else if constexpr (domain_ == Domain::Guest)
{
return from_file_time(Clock::QueryGuestSystemTime());
}
}
....
};
The PVS-Studio warning:
V591 Non-void function should return a value. chrono.h 110
Inside the function, the domain_ data member is checked against the enum elements:
enum class Domain
{
// boring host clock:
Host,
// adheres to guest scaling
// (differrent speed, changing clock drift etc):
Guest
};
While there are only two values, just like in the check, we can't be sure that there won't be extra elements in the future. Therefore, we should make the function always return a value for all execution branches, or the code should not compile. As a fix, we can use the following (before C++23, it looks like this):
template <typename>
struct always_false : std::false_type {};
template <typename T>
constexpr auto always_false_v = always_false<T>::value;
[[nodiscard]] static time_point now() noexcept
{
if constexpr (domain_ == Domain::Host)
{
// QueryHostSystemTime() returns windows epoch times even on POSIX
return from_file_time(Clock::QueryHostSystemTime());
}
else if constexpr (domain_ == Domain::Guest)
{
return from_file_time(Clock::QueryGuestSystemTime());
}
else
{
static_assert(always_false_v<decltype(domain_)>,
"Your message.");
}
}
Starting with C++23, we can greatly streamline the code using static_assert(false, "....") without an extra entity like the always_false class template.
Fragment N7
Speaking about errors, we often recall null dereferencing. The key to fixing such an error is to put the check in the right place.
The PVS-Studio warning:
V595 The 'extra' pointer was utilized before it was verified against nullptr. Check lines: 51, 52. xam_app.cc 51
X_HRESULT XamApp::DispatchMessageSync(....){
....
auto extra = memory_->TranslateVirtual<X_KENUMERATOR_CONTENT_AGGREGATE*>(
data->extra_ptr
);
auto buffer = memory_->TranslateVirtual(data->buffer_ptr);
auto e = kernel_state_->object_table()
->LookupObject<XEnumerator>(extra->handle);
if (!e || !buffer || !extra)
{
return X_E_INVALIDARG;
}
....
}
On the surface, all goes well: developers created three pointers and assumed that they could be null. To make code below operate correctly, devs added the checks for validity with an early return from the function.
However, once developers created the e pointer, they used extra. If it's null, its dereferencing leads to undefined behavior. Unfortunately, checking extra for validity occurres too late.
Here's the fixed code:
auto extra = memory_->TranslateVirtual<X_KENUMERATOR_CONTENT_AGGREGATE*>(
data->extra_ptr
);
auto buffer = memory_->TranslateVirtual(data->buffer_ptr);
if (!buffer || !extra)
{
return X_E_INVALIDARG;
}
auto e = kernel_state_->object_table()
->LookupObject<XEnumerator>(extra->handle);
if (!e)
{
return X_E_INVALIDARG;
}
This is a similar warning:
- V595 The 'writable_first_' pointer was utilized before it was verified against nullptr. Check lines: 100, 105. graphics_upload_buffer_pool.cc 100
Fragment N8
As we know, memory is allocated via the operator new, and we should deallocate memory at the end. Do the Xenia developers use this approach consistently across the project? Let's peek at an example:
X_STATUS SDLAudioSystem::CreateDriver(
size_t index,
xe::threading::Semaphore* semaphore,
AudioDriver** out_driver
)
{
assert_not_null(out_driver);
auto driver = new SDLAudioDriver(memory_, semaphore);
if (!driver->Initialize())
{
driver->Shutdown();
return X_STATUS_UNSUCCESSFUL;
}
*out_driver = driver;
return X_STATUS_SUCCESS;
}
The PVS-Studio warning:
V773 The function was exited without releasing the 'driver' pointer. A memory leak is possible. sdl_audio_system.cc 37
Devs made an early return from a function, and deallocated the resources that the driver had initialized during construction, but they missed the SDLAudioDriver object. It results in a memory leak, and it's not the only one:
- V773 The function was exited without releasing the 'driver' pointer. A memory leak is possible. sdl_audio_system.cc 37
- V773 The function was exited without releasing the 'driver' pointer. A memory leak is possible. xaudio2_audio_system.cc 38
- V773 The function was exited without releasing the 'module' pointer. A memory leak is possible. user_module.cc 376
- V773 The function was exited without releasing the 'sem' pointer. A memory leak is possible. xsemaphore.cc 80
Down with manual control, use the RAII idiom!
assert_not_null(out_driver);
auto driver = std::make_unique<SDLAudioDriver>(memory_, semaphore);
if (!driver->Initialize())
{
driver->Shutdown();
return X_STATUS_UNSUCCESSFUL;
}
*out_driver = driver.release();
return X_STATUS_SUCCESS;
Fragments N9
Let's move on to a very suspicious code fragment:
static TextureExtent CalculateExtent(const FormatInfo* format_info,
uint32_t pitch, uint32_t height,
uint32_t depth, bool is_tiled,
bool is_guest)
{
TextureExtent extent;
extent.depth = depth;
if (is_guest)
{
....
// Is depth special?
extent.depth = extent.depth;
}
return extent;
}
The PVS-Studio warning:
V570 The 'extent.depth' variable is assigned to itself. texture_extent.cc 58
The TextureExtent::depth data member is assigned to itself in the then branch. I find it hard to come up with a solution here, but something is wrong.
Fragment N10
Before using memset, it's better to check what data it handles.
bool GetInfo(const std::filesystem::path& path, FileInfo* out_info)
{
std::memset(out_info, 0, sizeof(FileInfo));
....
if (....) return false;
/* fill 'out_info' data members */
return true;
}
An object of the FileInfo type is passed to the_ memset_ function as an argument, which looks as follows:
struct FileInfo {
enum class Type {
kFile,
kDirectory,
};
Type type;
std::filesystem::path name;
std::filesystem::path path;
size_t total_size;
uint64_t create_timestamp;
uint64_t access_timestamp;
uint64_t write_timestamp;
};
It includes the std::filesystem::path type, which isn't trivially copyable. _Using such data in the _memset function may lead to undefined behavior, and the analyzer warns us about it:
V780 The object 'out_info' of a non-passive (non-PDS) type cannot be initialized using the memset function. filesystem_win.cc 209
I'd suggest rewriting this code in modern C++ using std::optional:
std::optional<FileInfo> GetInfo(const std::filesystem::path &path)
{
if (....) return {};
FileInfo out_info {};
/* fill 'out_info' data members */
return std::move(out_info);
}
Fragment N11
We always need to stay cautious. Here's a deceptively simple case where things go wrong:
bool Emulator::ExceptionCallback(....)
{
....
double f[32];
....
for (int i = 0; i < 32; i++) {
XELOGE(" f{:<3} = {:016X} = (double){} = (float){}", i,
*reinterpret_cast<uint64_t*>(&context->f[i]),
context->f[i],
*(float*)&context->f[i]);
}
....
}
There are two dangerous conversions of the double pointer:
- to the uint64_t pointer;
- to the float pointer.
This is a pretty serious error that violates the rules of strict aliasing. It leads to undefined behavior.
The PVS-Studio analyzer warns us about it:
V615 An odd explicit conversion from 'double *' type to 'float *' type. emulator.cc 595
We can fix it here as in fragment N4.
Fragment N12
In different projects, we can encounter an error that causes an unconditional return within a loop at the first iteration. Let's take a look at the following code snippet:
size_t SingleLayoutDescriptorSetPool::Allocate()
{
....
// Two iterations so if vkAllocateDescriptorSets fails
// even with a non-zero current_pool_sets_remaining_,
// another attempt will be made in a new pool.
for (uint32_t i = 0; i < 2; ++i)
{
if ( current_pool_ != VK_NULL_HANDLE
&& !current_pool_sets_remaining_)
{
full_pools_.push_back(current_pool_);
current_pool_ = VK_NULL_HANDLE;
}
....
--current_pool_sets_remaining_;
descriptor_sets_.push_back(descriptor_set);
return descriptor_sets_.size() - 1;
}
....
}
The PVS-Studio warning:
V612 An unconditional 'return' within a loop. single_layout_descriptor_set_pool.cc 110
The comment makes it clear that we need two iterations. At the end of the loop body, there is an unconditional return, which leads to an unexpected return.
Fragment N13
We've examined scenarios where a check is necessary but incorrectly placed. Now, let's consider the code where the check is in the right place yet redundant.
bool Setup(TestSuite& suite)
{
// Reset memory.
memory_->Reset();
std::unique_ptr<xe::cpu::backend::Backend> backend;
if (!backend)
{
#if XE_ARCH_AMD64
if (cvars::cpu == "x64")
{
backend.reset(new xe::cpu::backend::x64::X64Backend());
}
#endif // XE_ARCH
if (cvars::cpu == "any")
{
if (!backend)
{
#if XE_ARCH_AMD64
backend.reset(new xe::cpu::backend::x64::X64Backend());
#endif // XE_ARCH
}
}
}
....
}
The analyzer warnings:
V614 The 'backend' smart pointer is utilized immediately after being declared or reset. It is suspicious that no value was assigned to it. ppc_testing_main.cc 201
As we know, the std::unique_ptr constructor creates an object and initializes it to null by default. That's why the check after the declaration doesn't matter; the control flow will proceed to the then branch.
Once there, we encounter a wall of nested checks and preprocessor directives. It can be tricky to read code like this. We may notice that the smart pointer will be initialized only if the XE_ARCH_AMD64 macro is expanded to a non-zero value. We can facilitate it this way:
bool Setup(TestSuite& suite)
{
// Reset memory.
memory_->Reset();
std::unique_ptr<xe::cpu::backend::Backend> backend;
#if XE_ARCH_AMD64
if (cvars::cpu == "x64" || cvars::cpu == "any")
{
backend.reset(new xe::cpu::backend::x64::X64Backend());
}
#endif // XE_ARCH
....
}
Fragment N14
std::shared_ptr<cpptoml::table>
ParseConfig(const std::filesystem::path& config_path)
{
try
{
return ParseFile(config_path);
}
catch (cpptoml::parse_exception e)
{
xe::FatalError(
fmt::format("Failed to parse config file '{}':\n\n{}",
xe::path_to_utf8(config_path),
e.what())
);
return nullptr;
}
}
Here is the exception catching block but look closely — something strange is going on here. The exception in the catch block is caught by value, not by reference.
It's better to catch exceptions by reference because it enables us to:
- avoid creating the exception object copy;
- catch all publicly inherited exceptions from this class. Catching by value results in type slicing, which leads to the loss of information from derived types.
The analyzer reports:
V746 Object slicing. An exception should be caught by reference rather than by value. config.cc 58
Fragment N15
Now let's look at the warnings related to the class construction:
class ImGuiDialog
{
public:
~ImGuiDialog();
....
protected:
virtual void OnShow() {}
virtual void OnClose() {}
virtual void OnDraw(ImGuiIO& io) {}
};
The PVS-Studio warning:
V599 The destructor was not declared as a virtual one, although the 'ImGuiDialog' class contains virtual functions. imgui_dialog.cc 46
The warning helps us prevent possible issues that may arise from using a pointer to the base class.
There are virtual functions in the ImGuiDialog class. This means that there should be derived classes. It'd be better to introduce the destructor as virtual. Otherwise, undefined behavior arises when we destroy a derived class object via the pointer to the base class.
Fragment N16
Speaking of inheritance, it's also important to remember the rules of using virtual functions in the class constructors and destructors.
The PVS-Studio warning:
V1053 Calling the 'Reset' virtual function in the destructor may lead to unexpected result at runtime. assembler.cc 18
class Assembler
{
public:
explicit Assembler(Backend* backend);
virtual ~Assembler();
virtual bool Initialize();
virtual void Reset();
....
}
Assembler::~Assembler() { Reset(); }
The code fragment contains the Assembler class, which calls the Assembler::Reset virtual function in its destructor.
class X64Assembler : public Assembler
{
public:
explicit X64Assembler(X64Backend* backend);
~X64Assembler() override;
bool Initialize() override;
void Reset() override;
....
}
Here's its derived class, X64Assembler, that overrides the Reset virtual function. If we delete an object of the X64Assembler class, the destructor of the base class, Assembler, will be called. In the destructor, the Reset function is called from the base class, not from the derived. Developers might've expected an overridden function to be called.
My colleague described the pattern in more detail in a separate article and offered the following solution:
class Assembler
{
private:
void ResetImpl();
public:
explicit Assembler(Backend* backend);
virtual ~Assembler();
virtual bool Initialize();
virtual void Reset();
....
}
void Assembler::ResetImpl() { /* free only Assembler resources */ }
Assembler::~Assembler() { ResetImpl(); }
void Assembler::Reset() { ResetImpl(); }
class X64Assembler : public Assembler
{
private:
void ResetImpl();
public:
explicit X64Assembler(X64Backend* backend);
~X64Assembler() override;
bool Initialize() override;
void Reset() override;
....
}
void X64Assembler::ResetImpl()
{
/* free only X64Assembler resources */
}
X64Assembler::~X64Assembler() { ResetImpl(); }
void X64Assembler::Reset()
{
ResetImpl(); // free X64Assembler resources
Assembler::Reset(); // free resources of the base class
}
Outro
I'd like to show you the other bugs detected in the project, but I suppose this may interest only to the project maintainers. Just a quick note to remind you that PVS-Studio has a free license for open-source projects and educational purposes. If you haven't had a chance to try it yet, I highly recommend getting a trial :)
Top comments (0)