DEV Community

spO0q

PHP: The Garbage Collector explained with simple words

The Garbage Collector (GC) is the internal memory management system in PHP, but there are some subtleties to understand.

🤔 Why does the GC even exist?

The GC automates memory management, so you don't have to allocate and free memory by hand, which would be tedious and error-prone.

This allows developers to focus on their business logic without worrying excessively about 'Out of Memory' errors.

Of course, it's not magic.

🎯 10,000 objects in short

Freeing objects that are no longer needed prevents memory leaks.

PHP relies on reference counting to decide what to drop: each value keeps a count of the references pointing to it. When no reference points to a particular object anymore (i.e., its reference count is 0), this object is freed.
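To make the counting visible, here is a minimal sketch (assuming PHP 8.0+). The `Tracked` class and its static `$log` are hypothetical helpers, only there to record when the destructor runs:

```php
<?php

// Hypothetical helper class: logs a message when an instance is destroyed.
final class Tracked
{
    /** @var string[] */
    public static array $log = [];

    public function __construct(private string $name)
    {
    }

    public function __destruct()
    {
        self::$log[] = "freed {$this->name}";
    }
}

$first = new Tracked('report'); // one reference to the object
$second = $first;               // two references to the same object

unset($first);                  // one reference left: nothing is freed
$logAfterFirstUnset = Tracked::$log;

unset($second);                 // no reference left: the destructor runs
$logAfterSecondUnset = Tracked::$log;

var_dump($logAfterFirstUnset, $logAfterSecondUnset);
```

The object only goes away once the last variable pointing to it is unset.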

It works pretty well, but some references can be problematic:

class A {
    public $b;
}

class B {
    public $a;
}

$a = new A();
$b = new B();

$a->b = $b;
$b->a = $a;

unset($a);
unset($b);

In this case of poor design, PHP will not free the memory even after we unset $a and $b: the two objects still reference each other, so their reference counts never reach zero and PHP believes they are still in use.

Fortunately, there is another mechanism called the Cycle Collector for that:

gc_collect_cycles();

Roughly speaking, the collector starts from the values that might belong to a cycle (the "possible roots"), traverses their references, and marks the objects still in use, which reveals the objects to collect (the unmarked ones).

However, PHP does not trigger automatic cycle collection until the threshold of 10,000 possible roots (values that may be part of a cycle) is reached.

Again, it's not magic, and a collection run has a cost, so you should only call gc_collect_cycles() yourself in a few cases.
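As a rough illustration (assuming PHP 7.4+), the sketch below rebuilds the circular reference with a single hypothetical `Node` class that stands in for A and B. Its static `$destroyed` counter is an assumption added here to observe when the objects really go away:

```php
<?php

// Hypothetical stand-in for classes A and B: counts destroyed instances.
final class Node
{
    public static int $destroyed = 0;
    public ?Node $peer = null;

    public function __destruct()
    {
        self::$destroyed++;
    }
}

gc_enable(); // make sure the cycle collector is on

$a = new Node();
$b = new Node();
$a->peer = $b;
$b->peer = $a;

unset($a, $b); // the objects still reference each other

$destroyedBefore = Node::$destroyed; // still 0: reference counting alone is stuck

$collected = gc_collect_cycles();    // run the cycle collector by hand

$destroyedAfter = Node::$destroyed;  // both objects are gone now

printf("before: %d, after: %d\n", $destroyedBefore, $destroyedAfter);
```

In a short script, you would rarely need the manual call, as everything is cleared when the script ends.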

💸 TANSTAAFL

Bad design can lead to overly complex relationships between objects, which means more references and more frequent garbage collection.

Each reference-counted object requires additional storage for its reference count.

Source: Wikipedia - Reference counting

The overhead of memory cleanup can significantly affect overall performance and increase execution time in specific scenarios.

In 2014, Composer got a huge performance boost just by calling the gc_disable() function.

Source: Composer - disabling GC

Since then, PHP 7 has drastically improved the GC, so it is not what it was in 2014.

In addition, PHP 8 versions improved memory allocation strategies, and PHP 8.3 added more statistics to gc_status() (a function available since PHP 7.3) for better monitoring of GC operations.

Most PHP applications are request-driven, and the memory is automatically cleared at the end of the request.

Again, it's pretty cool but not magic. What happens with asynchronous requests and long-lived objects/daemons?

You may experience memory leaks at some point.
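For a long-running process, one possible approach is to collect cycles at regular intervals and keep an eye on memory usage. The sketch below is only an illustration: `runWorker()`, its parameters, and the leaky handler are hypothetical, not part of any library.

```php
<?php

/**
 * Hypothetical worker loop: handles each job and collects cycles
 * every $gcEvery jobs.
 */
function runWorker(iterable $jobs, callable $handle, int $gcEvery = 100): array
{
    $processed = 0;
    $collected = 0;

    foreach ($jobs as $job) {
        $handle($job);
        $processed++;

        if ($processed % $gcEvery === 0) {
            // Free the cycles accumulated since the last run.
            $collected += gc_collect_cycles();
        }
    }

    return [
        'processed' => $processed,
        'collected' => $collected,
        'memory'    => memory_get_usage(),
    ];
}

// A deliberately leaky handler: every job leaves a self-referencing object behind.
$leakyHandler = static function (int $job): void {
    $node = new stdClass();
    $node->self = $node;
};

$stats = runWorker(range(1, 250), $leakyHandler, 100);

print_r($stats);
```

The right interval depends on the workload, so it is worth measuring before settling on a value.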

🐘 How different is PHP's GC?

At this point, you might not see how PHP's GC differs from other languages.

Many other languages do not rely on reference counting to collect garbage, or implement it differently.

For example, many use a tracing algorithm: a graph traversal that starts from a set of roots, marks every object it can reach, and frees the rest. In its basic form, it does not free memory incrementally as references disappear, but in dedicated collection passes.
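To give an idea of the traversal, here is a toy mark phase written in PHP. It is not how any real runtime is implemented: the `markReachable()` function and the `$graph` array (object id => ids it references) are hypothetical.

```php
<?php

/**
 * Toy mark phase: returns the ids reachable from the given roots.
 *
 * @param array<string, string[]> $graph object id => ids it references
 * @param string[]                $roots ids that are known to be in use
 * @return string[]
 */
function markReachable(array $graph, array $roots): array
{
    $marked = [];
    $stack = $roots;

    while ($stack !== []) {
        $id = array_pop($stack);

        if (isset($marked[$id])) {
            continue; // already visited
        }

        $marked[$id] = true;

        foreach ($graph[$id] ?? [] as $child) {
            $stack[] = $child;
        }
    }

    return array_keys($marked);
}

$graph = [
    'app' => ['db'],
    'db'  => [],
    'a'   => ['b'], // 'a' and 'b' only reference each other
    'b'   => ['a'],
];

$reachable = markReachable($graph, ['app']);
$garbage = array_values(array_diff(array_keys($graph), $reachable));

print_r($garbage); // the unmarked ids: 'a' and 'b'
```

Note that the cycle between 'a' and 'b' is not a problem here, as nothing reachable from the roots points to them.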

Besides, some languages do not allow such direct control (e.g., on/off at runtime).

As usual, each approach has advantages and drawbacks, so you may see some hybrid approaches.

🧑‍💻 Interacting with PHP's GC

You can leverage the built-in gc_* helpers.

For example:

  • gc_collect_cycles() manually triggers the collection of cycles
  • gc_status() gives the current status of the collector
  • gc_disable() disables the cycle collector
  • gc_enable() enables it

These functions are helpful for debugging or fine-tuning garbage collection when necessary.
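A short sketch of these helpers in action (assuming PHP 7.3+ for gc_status()). The loop only exists to create some garbage to collect:

```php
<?php

gc_enable();

$before = gc_status();

// Create 50 self-referencing objects that reference counting cannot free.
for ($i = 0; $i < 50; $i++) {
    $node = new stdClass();
    $node->self = $node;
}
unset($node);

$freed = gc_collect_cycles();
$after = gc_status();

printf(
    "runs: %d -> %d, collected: %d -> %d, freed by this call: %d\n",
    $before['runs'],
    $after['runs'],
    $before['collected'],
    $after['collected'],
    $freed
);

gc_disable();                    // cycle collection is off, reference counting still works
$enabledWhileOff = gc_enabled(); // false

gc_enable();
$enabledAgain = gc_enabled();    // true
```

The `runs`, `collected`, `threshold`, and `roots` keys are available in all versions that provide gc_status(); PHP 8.3 adds more.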

🐞 Understanding memory errors

You can read this post for further insights:

💪 Weak Maps to the rescue?

PHP 7.4 introduced Weak References and PHP 8 introduced Weak Maps.

A Weak Map could be described as a collection of Weak References.

This data structure is a key-value store whose keys are objects: it lets you attach data to an object without keeping that object alive.

You may see it as temporary storage that is cleared as soon as the key object is no longer needed, as the map holds no strong reference that could prevent the garbage collection:

$object = new stdClass();
$map = new WeakMap();
$map[$object] = true; // the key is only held weakly
$object->name = 'some name';
print_r($map); // $object is stored in $map

unset($object); // drop the last strong reference

print_r($map); // $object is cleaned and no longer available: $map is empty

✅ Pros

  • pretty straightforward
  • great for caching or memoization (e.g., expensive computations)

❌ Cons

  • while keys (objects) do not prevent garbage collection, values can: a value that holds a reference to its own key may keep the entry alive, so the term "arbitrary values" can be misleading (prefer simple data types as values)
  • valuable use cases are limited
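As an illustration of the memoization use case, here is a minimal sketch (assuming PHP 8.0+). The `FingerprintCache` class and its methods are hypothetical, and the hash stands in for an expensive computation:

```php
<?php

// Hypothetical cache: memoizes a value per object without keeping the object alive.
final class FingerprintCache
{
    private WeakMap $cache;
    public int $computations = 0;

    public function __construct()
    {
        $this->cache = new WeakMap();
    }

    public function get(object $subject): string
    {
        if (!isset($this->cache[$subject])) {
            $this->computations++;
            // The value is a plain string, so it cannot reference the key.
            $this->cache[$subject] = hash('sha256', serialize($subject));
        }

        return $this->cache[$subject];
    }

    public function size(): int
    {
        return count($this->cache);
    }
}

$cache = new FingerprintCache();

$user = new stdClass();
$user->name = 'some name';

$first = $cache->get($user);  // computed
$second = $cache->get($user); // served from the cache

$sizeBefore = $cache->size(); // 1 entry

unset($user);                 // the entry disappears with its key

$sizeAfter = $cache->size();  // 0 entries
```

There is no need for manual invalidation: once the object is gone, so is its cached value.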

🔥 Optimize the code

  • leverage design patterns that reduce interdependencies
  • use dependency injection
  • avoid loading large datasets into memory: use collections and generators instead of huge arrays
  • monitor memory usage
  • profile your code with metrics
  • use gc_enable(), gc_disable(), and gc_collect_cycles() sparingly
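For the generator tip, a small sketch: the hypothetical `numbers()` function yields values one at a time instead of building the whole array first, as range() would.

```php
<?php

/**
 * Hypothetical data source: yields the integers from 1 to $count,
 * one at a time.
 */
function numbers(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield $i;
    }
}

$total = 0;

// Only one value is held in memory per iteration.
foreach (numbers(10000) as $number) {
    $total += $number;
}

printf("total: %d, peak memory: %d bytes\n", $total, memory_get_peak_usage());
```

The same idea applies to database rows or file lines: process them as they come rather than loading everything upfront.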

Wrap up

For most usages, you won't have to worry about memory management, as PHP already handles it.

However, because modern stacks utilize long-lived objects, you need to monitor your application for potential memory leaks.

If you run into issues, you may have to optimize the code and/or interact with the GC directly.
