Perl is a great language for many problems, and the Perl Weekly Challenge is a good excuse to try out language features. Week 215 offers two relatively easy challenges; one that scratches the surface of textual processing, and another that exercises arrays.
Task 1: Odd One Out
You are given a list of words (alphabetic characters only)
of same size.
Write a script to remove all words not sorted alphabetically
and print the number of words in the list that are not
alphabetically sorted.
Example 1
Input: @words = ('abc', 'xyz', 'tsu')
Output: 1
The words 'abc' and 'xyz' are sorted and can't be removed.
The word 'tsu' is not sorted and hence can be removed.
Like many of the challenges, the description has its ambiguities. It's not obvious why the words all have to be the same length, but it's in the specification, so we'll use that as a restriction. All the given examples are lowercase strings; should we use ordinary alphabetic ordering, or strict lexicographic ordering? Let's start by assuming, for simplicity, that we have strings limited to the English alphabet and it should be case-independent.
The problem seems easy enough. Iterate over the @words
array, and count each one that meets some criteria. In Perl, the first thing to reach for when we see "select from list" is the grep
function. Conveniently, it returns the number of matches in scalar context, so the core of our solution is going to look like
my $removeCount = grep { not isOrdered($_) } @words.
The challenge then shifts to defining an isOrdered
function for a single word. One way to do it is to sort the letters of the word, and then compare against the word. One of Perl's strengths is the number of built-in operators and functions that manipulate text. The lc
function will convert a string to lowercase; the split
function will give us an array of letters; the join
function will turn an array into a string; and the sort
function sorts in lexicographical order by default.
my $inOrder = join("", sort split(//, lc($word) );
if ( lc($word) eq $inOrder ) { ... }
Suppose we want to avoid the sort and do our own check in some misguided attempt at premature optimization. The word will be in alphabetic order if the character value of each subsequent character is an ascending sequence. Perl has ord()
available to get the numeric value of a character (and the inverse chr()
, but that's not going to come up this week).
So, now we want to iterate over pairs of elements in an array. This comes up a lot in Perl Weekly Challenge problems. There's a library function, List::MoreUtils::slide
that encapsulates this, but I like an idiom that looks like this:
my $first = shift @array;
while ( @array ) {
my $next = shift @array;
... # do something with $first and $next
$first = $next;
}
Going through the array explicitly gives us the opportunity to quit as soon as we see it isn't going to work, unlike the slide
function which would always go through the whole array.
The shift
operator consumes the array, so this may not be appropriate if we need to keep the array around for other purposes, but it will work in this case because the array of characters is temporary.
sub isOrdered($word)
{
my @char = split(//, lc($word));
my $first = shift @char;
while ( my $next = shift @char )
{
return false if ord($first) > ord($next);
$first = $next;
}
return true;
}
What if the alphabet isn't limited to English 'a' to 'z'? Let's not go too wild, but think about European languages like German and Spanish. What if there were some Unicode characters like 'ñ' or 'ß'? Perl has a long history of dealing with Unicode and UTF-8. A good introduction to the problem is this classic (in the Perl world) article about sorting and the Stackoverflow answer about Unicode that it references.
That's a rabbit hole that could consume the entire week, so I'll cut to the chase. Fortunately, most string operations in Perl are aware of UTF-8 and do the right thing, mostly. For instance, split
will return characters, not just bytes, and lc
will intelligently use locales. ord
is not quite as magically smart. Unicode::Collate is the module we need to do a least-effort attempt to handle a wider range than simple ASCII. Use the state
feature to initialize a collator object just once, and replace the ord
comparison with a smarter method from the module.
sub isOrdered($word)
{
use Unicode::Collate;
state $Collator = Unicode::Collate->new();
my @char = split(//, lc($word));
my $first = shift @char;
while ( my $next = shift @char )
{
return false if $Collator->gt($first, $next) > 0;
$first = $next;
}
return true;
}
The complete solution to the task is on Github.
Task 2: Number Placement
You are given a list of numbers having just 0 and 1.
You are also given placement count (>=1).
Write a script to find out if it is possible to replace 0
with 1 in the given list. The only condition is that you
can only replace when there is no 1 on either side.
Print 1 if it is possible otherwise 0.
Example
Input: @numbers = (1,0,0,0,0,0,0,0,1), $count = 3
Output: 1
It is possible to place 3 ones into the sequence like this: (1,0,1,0,1,0,1,0,1)
Tackling this problem by brute force will bog down in testing for boundary conditions, but there's an insight that makes it almost trivial: what if we had three consecutive values in hand? Then we could see if they are all zeroes and replace the middle one.
Accordingly, instead of starting from the front of the array, let's jump ahead to element $numbers[2]
and look backward.
sub numberPlacement($list, $count)
{
for ( my $i = 2; $count && $i <= $list->$#* ; $i++ )
{
if ( $list->[$i-2] == 0 && $list->[$i-1] == 0 && $list->[$i] == 0 )
{
$list->[$i-1] = 1;
$count--;
}
}
return ( $count == 0 ? 1 : 0 );
}
By starting at 2, we handle the special cases of arrays less than 3 long. Counting down with $count
and using it in the for-loop condition gives us a concise test for success at the end. Perl gives us the last index of the array ($list->$#*
), which makes a concise loop termination condition (and a good reason to look back instead of forward).
One thing we might consider here is that we are modifying the original array. If the problem requires us to answer the question without actually doing the operation, we could operate on a copy of the numbers list instead.
This solution would look similar in C, Java, or really any language with ALGOL ancestry. It's a strength of Perl that it uses recognizable constructs from other languages to make it accessible to new learners. But of course, Perl is its own language and learning by analogy has its pitfalls.
The problem says the input list consists only of 1s and 0s. We should probably validate that, especially if taking input from the command line. Validation is another common thing that Perl does well. We can use a regular expression applied to each argument.
sub usage { "Usage: $0 -c COUNT [1|0]..." }
my @list = @ARGV;
do { say STDERR usage(); exit 1; } if @list == 0 || grep !/^[01]$/, @list;
In learning natural languages, we speak of "false friends" -- words that look recognizable, but actually mean something completely different. For instance, an English speaker might look at the Spanish word "embarrazada" and think "embarrassed." Unfortunately, the word's primary meaning is "pregnant". Hilarity ensues.
I've used do
here because I want to execute two statements in case the validation fails. The do
statement in Perl is a false friend to C and Java programmers. In those languages, it introduces a loop that will be executed at least once, because the condition for the while
loop is evaluated at the end. In Perl, do
just encapsulates a group of statements. It may be followed by a while
clause, but it can also be used anywhere that a code block is useful instead of a simple expression. And if do
is followed by while
, hilarity ensues. The while
condition is evaluated first, just like in an ordinary while loop.
The complete solution to the task is on Github.
Top comments (0)