Always looking to improve the way I write code, I discovered Adam Wathan and his book Refactoring to Collections a few years ago. Suffice it to say that its slogan, Never write another loop again, aroused my curiosity.
From the start, the goal is clear: never write a for/foreach/while loop again. My first reaction was: impossible! These structures are so ingrained in our habits that I couldn't see myself doing without them.
Let's take a very simple example written in the classic way:
public function doubleAllValue(array $numbers)
{
    $result = [];
    foreach ($numbers as $number) {
        $result[] = $number * 2;
    }
    return $result;
}
For each number in the variable $numbers, multiply it by two and save the result in the temporary variable $result.
Another way to write this processing is to use native PHP functions. Continuing the previous example, we will use array_map().
public function doubleAllValue(array $numbers)
{
    // An arrow function takes a single expression: no braces, no return.
    return array_map(fn($number) => $number * 2, $numbers);
}
What do we see here? No more foreach, no more temporary variable, and a single instruction. So we are on the right track! In addition, our business logic now lives in a function (anonymous here) which could be reused if extracted.
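To make the reuse point concrete, here is a minimal sketch in which the callback is stored in a variable and passed to array_map() twice. The variable names ($double, $prices, $scores) and the sample arrays are assumptions for illustration, and the arrow function syntax assumes PHP 7.4 or later.

```php
<?php

// Hypothetical reusable callback, extracted from the array_map() call.
$double = fn(int $number): int => $number * 2;

// The same callback can now be applied to any array of integers.
$prices = array_map($double, [1, 2, 3]);   // [2, 4, 6]
$scores = array_map($double, [10, 20]);    // [20, 40]
```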
PHP has other native array functions, such as array_filter and array_reduce. But these functions have several disadvantages:
- Their signatures are inconsistent: the order of the arguments differs from one function to another, so we often have to refer to the documentation.
array_map($callback, $array);
array_filter($array, $callback);
- Combining these functions to carry out a particular processing quickly becomes hard to read. Let's say we want to double the values of positive numbers:
class integer
{
    public function doubleAllPositiveValue(array $numbers)
    {
        return array_map(function($number) {
                return $number * 2;
            },
            array_filter($numbers, function($number) {
                return $number > 0;
            })
        );
    }
}
Suffice it to say that this is not very readable. We could define methods to make it easier to read:
class integer
{
    public function doubleAllPositiveValue(array $numbers)
    {
        return array_map(function($number) {
                return $number * 2;
            },
            $this->keepOnlyPositiveValue($numbers)
        );
    }

    private function keepOnlyPositiveValue(array $numbers)
    {
        return array_filter($numbers, function($number) {
            return $number > 0;
        });
    }
}
A little better. We could go even further:
class integer
{
    public function doubleAllPositiveValue(array $numbers)
    {
        return array_map(
            $this->getDoubleValueCallback(),
            $this->keepOnlyPositiveValue($numbers)
        );
    }

    private function getDoubleValueCallback()
    {
        return function($number) {
            return $number * 2;
        };
    }

    private function keepOnlyPositiveValue(array $numbers)
    {
        return array_filter($numbers, function($number) {
            return $number > 0;
        });
    }
}
The doubleAllPositiveValue method is now more readable, but its reading direction is reversed: the first step is on the last line ($this->keepOnlyPositiveValue($numbers)) and its result is then processed by the line before it ($this->getDoubleValueCallback()). Moreover, a method that returns a function is not the easiest thing to understand, and especially to use, for the uninitiated.
If only we had a mechanism that let us describe, line by line, what we want to do:
$result = $numbers
    ->filterPositiveValue()
    ->doubleValue()
;
Laravel collections
And that's where Adam Wathan comes to our rescue with collection pipelines. In his book, he relies on Laravel and its collection classes and methods. But nothing prevents you from using another library or writing your own classes.
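As a rough idea of what "writing your own classes" could look like, here is a minimal sketch of a fluent wrapper around array_filter() and array_map(). The class name SimpleCollection is hypothetical and this is not Laravel's Collection: in particular, this sketch reindexes the array after filtering, which is a choice made here to keep the output simple.

```php
<?php

// Hypothetical minimal class for illustration; not Laravel's Collection.
final class SimpleCollection
{
    /** @var array */
    private $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    public function filter(callable $callback): self
    {
        // array_filter() preserves the original keys, so this sketch
        // reindexes the result with array_values().
        return new self(array_values(array_filter($this->items, $callback)));
    }

    public function map(callable $callback): self
    {
        return new self(array_map($callback, $this->items));
    }

    public function toArray(): array
    {
        return $this->items;
    }
}

// Each step reads top to bottom, in the order it is executed.
$result = (new SimpleCollection([-1, 2, -3, 4]))
    ->filter(fn($number) => $number > 0)
    ->map(fn($number) => $number * 2)
    ->toArray();
// $result is [4, 8]
```

Each method returns a new instance rather than modifying the current one, which is what makes the calls safe to chain.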
Let's return to our previous examples:
public function doubleAllValue(array $numbers)
{
    return Collection::make($numbers)
        ->map(
            function($number) {
                return $number * 2;
            }
        )
        ->toArray()
    ;
}
For information:
- The Collection::make method constructs a Collection object from a native array.
- The toArray() method returns the array contained in the Collection object.
For this example, which is quite simple, we don't really see the point of these pipelines. But they become quite powerful when you start chaining them:
public function doubleAllPositiveValue(array $numbers)
{
    return Collection::make($numbers)
        ->filter(function($number) {
            return $number > 0;
        })
        ->map(function($number) {
            return $number * 2;
        })
        ->toArray();
}
Here it becomes interesting, because our processing is written in reading order: we start by filtering the positive numbers, then we multiply them by two. And using a class to manage our arrays opens up further possibilities. For example, say I want to find the first positive number in an array, and return 0 if there is none.
Classically (using early returns to avoid temporary variables):
public function getFirstPositifValue(array $numbers)
{
    foreach ($numbers as $number) {
        if ($number > 0) {
            return $number;
        }
    }
    return 0;
}
With a method of the Collection class:
public function getFirstPositifValue(array $numbers)
{
    return Collection::make($numbers)
        ->first(
            function($number) {
                return $number > 0;
            },
            0
        );
}
I don't know what you think, but I find it rather elegant and easy to understand! And these are just simple examples; it gets even more interesting with more complicated processing.
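For readers not using Laravel, the contract relied on above (return the first item that passes a test, or a default value) can be sketched with a plain function. The name firstMatching is hypothetical and this is not Laravel's implementation, only an illustration of the behaviour; it assumes PHP 7.4 or later for the arrow functions.

```php
<?php

/**
 * Hypothetical helper: returns the first item accepted by $callback,
 * or $default when no item matches.
 */
function firstMatching(array $items, callable $callback, $default = null)
{
    foreach ($items as $item) {
        if ($callback($item)) {
            return $item;
        }
    }

    return $default;
}

$isPositive = fn($number) => $number > 0;

$first = firstMatching([-5, 0, 3, 7], $isPositive, 0);   // 3
$fallback = firstMatching([-5, -2], $isPositive, 0);     // 0
```

The loop has not disappeared; it is written once, inside the helper, and the calling code only states the condition and the default.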
So if you are interested in collection pipelines, I strongly advise you to start by:
- asking for the free chapter of his book
- watching the screencast available for free on his site
Refactoring to collections: Never write another loop again.
This link is not an affiliate link.
Do you know collection pipelines? Do you use them (definitely yes if you are using Laravel)?
Thank you for reading, and let's stay in touch!
If you liked this article, please share. Join me also on Twitter/X for more PHP tips.
Top comments (3)
Adding Arrow functions and First Class Callable Syntax:
Thank you
I don't do it because the problem is still the same, the order of the instructions is still weird with array functions.
Laravel collections don't solve anything in terms of iterations in the code, they simply change the place of the loops, and in many cases in a much less efficient way. I only use Laravel Collections with my own data in a few places in a controlled way. When you work with large amounts of data they are a huge bottleneck.