If I know anything about code performance, it's that my intuition about performance is horrible. Going through the backlog of Perl Weekly Challenges that I missed, I found myself spending way too much time thinking about the problem of reversing numbers.
In Task 2 of Perl Weekly Challenge #176 we are set a reasonably easy task:
Write a script to find out all Reversible Numbers below 100.
A number is said to be a reversible if sum of the number
and its reverse had only odd digits.
A straightforward approach starts with considering the possible pairs of numbers that could work. A number with all odd digits will be odd. Therefore it will be a sum of an even number and an odd number. Adding together a two-digit number and its reverse means that the only candidates are numbers where the first digit is even and second is odd, or vice versa.
Perl has a string-generating function called glob
. It's usually used in the context of generating possible file names that match a pattern, but it can also be used in a way that also appears in shells such as bash
, ksh
, and zsh
.
$ echo {a,b}{1,2}
a1 a2 b1 b2
That is, lists of comma-separated values in braces create a cross-product of strings from the lists. To generate the odd-and-even numbers that we need, we can use glob
like this:
my @candidate = ( glob("{1,3,5,7,9}{0,2,4,6,8}"), glob("{2,4,6,8}{1,3,5,7,9}") );
From that list of candidates, we need to select (think grep
) those that meet the criteria from the problem statement.
grep { $_ < 100 && allOdd($_ + revNum($_)) } @candidate;
All we need to do then is implement the allOdd
and revNum
functions.
These sorts of problems where we manipulate the digits in a number have two kinds of solutions: we can treat the number as a string or an array of characters, and manipulate the characters; or we can treat it as a number and do arithmetic to isolate the digits as integers.
My intuition, born of experience in C, is that doing math with integers is more efficient than doing string operations. But is it true in Perl?
Let's consider two possible implementations of reversing the digits of a number. The first uses modulo and division operations to reverse the number. Starting from the list significant digit, keep multiplying by ten and adding more digits:
sub rev1($n)
{
use integer;
my $r = $n % 10;
while ( $n = int($n/10) )
{
$r = ($r * 10) + ($n % 10);
}
return $r;
}
The second implementation turns the number into a string and uses the reverse
operator that is built into Perl. That might give us leading zeroes, which would incorrectly be seen as octal when the number is used in a sum, so we also need to remove leading zeroes.
sub rev2($n)
{
(my $r = reverse("$n")) =~ s/^0+//;
return $r;
}
So which is more efficient? Perl has a core module for benchmarking blocks of code: use Benchmark qw/cmpthese/;
The cmpthese
function takes blocks of code and runs them to accumulate CPU time, and then shows the relative performance.
Here's what the benchmarking code looks like:
cmpthese(-5, {
numeric => sub {
rev1($_) for 123400 .. 124500;
},
string => sub {
rev2($_) for 123400 .. 124500;
},
}
);
And here's a sample result of running the benchmark:
Rate numeric string
numeric 1592/s -- -53%
string 3394/s 113% --
The string version is nearly twice as fast as the numeric version. My intuition that my intuition is bad is confirmed.
Why is the string version faster? Well, it contains no loops because the looping is hidden in the built-in version of reverse
, which is probably well-optimized C code. The numeric Perl version is probably slower because the looping is done in interpreted code, which means every variable reference and every test goes through the byte-code interpreter.
Conclusion: don't be afraid to treat numbers as strings in Perl, at least not for efficiency reasons.
Incidentally, my intuition about C programs was much better. Below is the same pair of functions as a C program, and a quick-and-dirty test of the performance. The numeric version is about three times faster in my environment.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
void reverse(char s[])
{
int i, j;
char c;
for (i = 0, j = strlen(s)-1; i<j; i++, j--) {
c = s[i];
s[i] = s[j];
s[j] = c;
}
}
int rev1(int n)
{
int r = n % 10;
while ( (n = n / 10) != 0 )
{
r = r*10 + n % 10;
}
return r;
}
int rev2(int n)
{
char s[16];
sprintf(s, "%d", n);
reverse(s);
return ( atoi(s) );
}
int main(int argc, char **argv)
{
int r;
const int start = 123400000;
const int end = 123500000;
clock_t begin = clock();
for ( int n = start; n <= end; n++ )
{
r = rev1(n);
}
clock_t finish = clock();
double t = (double)(finish - begin) / CLOCKS_PER_SEC;
printf("rev numeric %d %f\n", r, t);
begin = clock();
for ( int n = start; n <= end; n++ )
{
r = rev2(n);
}
finish = clock();
t = (double)(finish - begin) / CLOCKS_PER_SEC;
printf("rev string %d %f\n", r, t);
}
Top comments (0)