The Task
PWC 310 Task 1, Arrays Intersection asks us to implement set intersection from lists of numbers. For musical accompaniment, I suggest Everywhere by Fleetwood Mac.
You are given a list of array of integers.
Write a script to return the common elements
in all the arrays.
-
Example 1
- Input:
$list = ( [1, 2, 3, 4], [4, 5, 6, 1], [4, 2, 1, 3] )
- Output:
(1, 4)
- Input:
-
Example 2
- Input:
$list = ( [1, 0, 2, 3], [2, 4, 5] )
- Output:
(2)
- Input:
-
Example 3
- Input:
$list = ( [1, 2, 3], [4, 5], [6] )
- Output:
()
- Input:
The Plan
Use a hash as a set. If we knew that all the lists have only distinct elements, then we could just check that an element occurs the right number of times. But if a number can repeat in a list, then counting is not enough.
In the hash, map the number to a bit map. If the number occurs in list i, then bit i will be set. At the end, if the number of bits set is equal to the number of lists, then that number must occur in every list.
A possible limitation of using a bit map is that it limits us to 64 lists, which seems like an acceptable compromise. Perl also has the vec
function for bit vectors, but that would require an hour of review and experimentation. Maybe I'll come back to it.
The Code
sub asect(@list)
{
my %common;
my $allBits = 2 ** scalar(@list) - 1;
for ( 0 .. $#list )
{
my $bit = 1 << $_;
$common{$_} |= $bit for $list[$_]->@*;
}
return [ grep { $common{$_} == $allBits } sort { $a <=> $b } keys %common ];
}
my %common
-- the keys of this hash are the numbers from the lists.my $allBits = 2 ** scalar(@list) - 1
-- This creates a mask of 1s for the number of lists. For instance, if there are 3 lists, '2**3-1` is 7, which is 111 in binary.for ( 0 .. $#list )
the indexes of the lists.my $bit = 1 << $_
An integer where exactly one bit is set: the one corresponding to the list number-
$common{$_} |= $bit for $list[$_]->@*
- Loop over the words in the _i_th list.
- A little confusing here, because
$_
has two different contexts. Infor $list[$_]->@*
,$_
is the list index from the top-level for loop, but in$common{$_}
it refers to the number taken from the list. - The
|=
operator sets bit i if it occurs in list i, even if it occurs more than once.
At exit of the loop, every number from all the lists is present as a key in %common
. That key maps to a value which has bits set corresponding to the lists in which it occurred. We want to return the numbers that have all the bits set. Breaking down the last statement from right to left:
-
sort { $a <=> $b } keys %common
I'm adding a sort to create a predictable output order, which is useful in unit test. -
grep { $common{$_} == $allBits }
Selecting only the numbers that have all the bits set. -
[ ... ]
enclosing the result in square brackets returns an array reference. I prefer returning references for three reasons:- (1) It avoids an array copy, which could be inefficient.
- (2) it means we always return a scalar, which resolves ambiguities if the function is called in an array context (for instance, as an argument to
say
or in a unit-testis()
test). - (3) Because of reason (2), it's easier to write a unit test, and therefore I am more likely to write good unit tests.
Top comments (0)