Favour George

Posted on Sep 2, 2023 • Originally published at psycode.hashnode.dev on Jul 25, 2023

Python Itertools: Mastering Efficient Iteration for Enhanced Productivity

What is Itertools?

Itertools is a module in Python's standard library that provides a collection of tools for building and working with iterators in a fast and memory-efficient way. Its iterators are most often consumed with a for loop.

Now, before we start exploring the itertools module, let's understand a few important terms used in iteration.

Iterables

An iterable is any data structure that can be iterated or looped over to access its individual items. Examples of iterables in Python include lists, tuples, sets, strings and dictionaries.

Iterator

An iterator is an object that allows you to traverse or move through elements of an iterable one at a time. Iterators are stateful, meaning that they maintain an internal state to keep track of their current position during iteration.
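
To make the difference concrete, here is a minimal sketch using the built-in iter() and next() functions. The list and variable names are made up for illustration.

```python
# A list is an iterable; iter() asks it for a fresh iterator.
letters = ["a", "b", "c"]
letters_iter = iter(letters)

# Each next() call advances the iterator's internal position by one item.
first = next(letters_iter)
second = next(letters_iter)

# The iterator remembers where it stopped, so only "c" is left.
remaining = list(letters_iter)

# Once exhausted, next() raises StopIteration unless a default is given.
after_end = next(letters_iter, None)

print(first, second, remaining, after_end)
```

The list itself is untouched by all of this; only the iterator's position changes.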

Now with that out of the way, let's explore the various tools available to us through Itertools.

Types of Iterators in Itertools

There are mainly three types of iterators available to us in Itertools:

Infinite Iterators
Finite Iterators
Combinatoric Iterators

Infinite Iterators

Infinite iterators can produce an endless sequence of values, so a loop over one never finishes on its own. To stop it, you have to specify a condition for breaking out of the loop.

Itertools provides us with a few of these iterators, which include:

Count(start, step)

The count iterator returns evenly spaced numbers starting from the number passed to its start parameter (0 by default). The spacing is set by its step parameter (1 by default), which can be quite useful. Let's see how it works in practice:

from itertools import count

for i in count(start=2, step=2):
    if i < 10:
        print(i)
    else:
        break

In the code above, we first imported the count iterator from the itertools module. We then set up a for loop with the count iterator starting from 2 with a step size of 2. Inside the loop, we break out as soon as the value reaches 10. The output of this code is the even numbers from 2 up to, but not including, 10: 2, 4, 6 and 8.
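
As an alternative to a manual break, an infinite iterator can be capped with islice, another itertools function that this article does not otherwise cover. The sketch below assumes we simply want the first four values.

```python
from itertools import count, islice

# islice(iterable, stop) yields at most `stop` items and then ends,
# so the infinite count iterator is never run past the fourth value.
first_four_evens = list(islice(count(start=2, step=2), 4))

print(first_four_evens)  # [2, 4, 6, 8]
```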

Cycle(iterable)

The cycle() iterator takes an iterable as an argument and iterates through each item of the iterable in perpetuity unless a condition is provided to break out of the loop as shown below:

from itertools import cycle

tracker = 0
numbers = [1, 2]
for i in cycle(numbers):
    if tracker < 5:
        print(i)
    else:
        break
    tracker += 1

In the code above, we import the cycle iterator, define a tracker variable to keep track of the number of iterations and set up an iterable in the form of a list containing two numbers. We then construct a for loop with the cycle iterator and pass in the list of numbers as an argument. Finally, we set up a condition to break out of the loop once the number of iterations (i.e. the tracker variable) reaches five. The output of this code is five values, alternating between the two numbers: 1, 2, 1, 2, 1.
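
A common way to avoid the manual counter is to pair cycle with a finite iterable through the built-in zip, which stops as soon as its shortest input runs out. The task and owner names below are hypothetical.

```python
from itertools import cycle

tasks = ["write", "review", "test", "deploy", "monitor"]  # hypothetical data
owners = cycle(["alice", "bob"])  # hypothetical names, repeated forever

# zip ends when the finite list of tasks is exhausted,
# so no break condition is needed.
assignments = list(zip(tasks, owners))

for task, owner in assignments:
    print(task, owner)
```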

Repeat(object, times=None)

The repeat iterator takes an object (which may itself be an iterable) as an argument and returns that same object over and over, infinitely, unless its times parameter is given a value, as shown below:

from itertools import repeat

numbers = [1, 2, 3, 4, 5]
for i in repeat(numbers, 5):
    print(i)

In the code above, we imported the repeat iterator, then we defined an iterable in the form of a list of numbers. We then set up a for loop with the repeat iterator, passing in our list of numbers as an argument along with the number of times we want to repeat, in this case, 5. The output of the code is the whole list [1, 2, 3, 4, 5] printed five times, once per line. Note that repeat returns the list itself each time, not its individual items.
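
Without the times argument, repeat is often used to feed a constant value into the built-in map. This is only a sketch of that pattern, using the built-in pow as the function.

```python
from itertools import repeat

# map pulls one item from each input per step and stops when the
# shortest input (range(5)) is exhausted, so the endless stream of
# 2s from repeat is safe here.
squares = list(map(pow, range(5), repeat(2)))

print(squares)  # [0, 1, 4, 9, 16]
```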

Finite Iterators

Finite iterators have a limited sequence of values to iterate over: they stop once their input iterables are exhausted. Itertools has quite a few finite iterators for us to use.

Accumulate(iterable)

The accumulate iterator takes in an iterable as an argument and returns its accumulated results. By default it uses addition and returns running sums, but it also accepts an optional two-argument function as a second argument. For this article, we'll be using the multiplication operator as shown below:

from itertools import accumulate
from operator import mul

numbers = [1, 2, 3, 4, 5]
for i in accumulate(numbers, mul):
    print(i)

In the code above, we imported the accumulate iterator and also imported the multiplication operator from the operator module. We then defined an iterable in the form of a list of numbers. Finally, we set up a for loop with the accumulate iterator, passing in our list of numbers along with the operator we wish to use, in this case, the multiplication operator from the operator module, as arguments. This code prints the accumulated product of the numbers in our list: 1, 2, 6, 24 and 120.
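
For comparison, here is a short sketch of the default behaviour (running sums) and of passing a different two-argument function, in this case the built-in max to get a running maximum. The second list is made up for illustration.

```python
from itertools import accumulate

# With no function given, accumulate adds as it goes.
running_sums = list(accumulate([1, 2, 3, 4, 5]))

# Any function of two arguments works; max keeps the largest value so far.
running_max = list(accumulate([3, 1, 4, 1, 5], max))

print(running_sums)  # [1, 3, 6, 10, 15]
print(running_max)   # [3, 3, 4, 4, 5]
```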

Compress(data, selectors)

The compress iterator is quite handy for filtering the first iterable (data) with the second iterable (selectors). It keeps each item of data whose corresponding selector is truthy. In this example, we use a list of booleans as the selectors. Let's show this in code:

from itertools import compress
numbers = [1, 2, 3, 4, 5]
bool_list = [True, False, True, False, True]

for i in compress(numbers, bool_list):
    print(i)

In the code above, we import the compress iterator, define a list of numbers which will be used as our data and define another list of booleans which will be used as our selectors.

We then set up a for loop with the compress iterator, passing in our two lists as data and selectors. This code returns the items in our first list (data) whose positions match True in our second list (selectors): 1, 3 and 5.
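
The selectors do not have to be written out by hand. As a sketch, they can be computed from a second, parallel list; the names and scores below are hypothetical.

```python
from itertools import compress

names = ["ann", "ben", "cat", "dan"]  # hypothetical data
scores = [72, 40, 55, 30]             # hypothetical, one score per name

# The generator yields True or False for each score, and compress
# keeps the name at every position where it yields True.
passed = list(compress(names, (score >= 50 for score in scores)))

print(passed)  # ['ann', 'cat']
```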

Dropwhile(predicate, iterable)

The dropwhile iterator takes a predicate (a function that returns True or False) and an iterable as arguments. It drops elements for as long as the predicate returns True. Once it hits an element for which the predicate returns False, it keeps that element along with every element that comes after it, without checking the predicate again. Let's show this with code:


from itertools import dropwhile

def greater_than_three(x):
    return x > 3

num = [6, 4, 5, 6, 2, 6, 7, 9]
for i in dropwhile(greater_than_three, num):
    print(i)

In the code above, we imported the dropwhile iterator, created a function that returns True if the number provided as an argument is greater than three and finally defined an iterable in the form of a list of numbers. We then set up a for loop with the dropwhile iterator, passing in our predicate (the function) and iterable (the list of numbers) as arguments. The output of this code is every item in the list from the first one that makes the predicate return False onwards: 2, 6, 7 and 9. The leading 6, 4, 5 and 6 are dropped.
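
One place this behaviour is handy is skipping a leading block of lines, such as comments at the top of a file. The lines below are invented for the sketch; note that the later comment line is kept, because dropwhile stops testing after the first False.

```python
from itertools import dropwhile

lines = ["# header", "# more header", "data1", "# kept", "data2"]  # hypothetical

# Only the leading run of lines starting with "#" is dropped.
body = list(dropwhile(lambda line: line.startswith("#"), lines))

print(body)  # ['data1', '# kept', 'data2']
```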

Filterfalse(predicate, iterable)

The filterfalse iterator takes a predicate (which can be a function) and an iterable as arguments. It returns the values of the iterable for which the predicate returns False and drops the values for which it returns True. Let's show this with code:

from itertools import filterfalse

def less_than_four(x):
    return x < 4

num = [6, 7, 8, 9, 1, 6, 3, 0, 9, 2]
for i in filterfalse(less_than_four, num):
    print(i)

In the code above, we imported the filterfalse iterator, defined a function that returns True if the number passed into it is less than four and created an iterable in the form of a list of numbers. We then, as usual, set up a for loop with the filterfalse iterator, passing in our function and iterable as arguments. This code returns the items in our iterable for which the predicate returns False (i.e. those that are not less than four): 6, 7, 8, 9, 6 and 9.
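
filterfalse is the mirror image of the built-in filter, so the two together can split one iterable into two groups. A minimal sketch, with a made-up is_even helper:

```python
from itertools import filterfalse

def is_even(x):
    # Hypothetical predicate used only for this example.
    return x % 2 == 0

numbers = range(10)

evens = list(filter(is_even, numbers))       # predicate returns True
odds = list(filterfalse(is_even, numbers))   # predicate returns False

print(evens)  # [0, 2, 4, 6, 8]
print(odds)   # [1, 3, 5, 7, 9]
```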

Takewhile(predicate, iterable)

The takewhile iterator takes a predicate and an iterable as arguments. It is the counterpart of the dropwhile iterator. It returns values of the iterable for as long as the predicate returns True, but once it hits an item for which the predicate returns False, it stops and drops that item and the rest of the values in the iterable. Let's show this with code:


from itertools import takewhile

def greater_than_three(x):
    return x > 3

num = [6, 4, 5, 6, 2, 6, 7, 9]
for i in takewhile(greater_than_three, num):
    print(i)

In the code above, we imported the takewhile iterator, defined a function that returns True if the value passed to it is greater than three and created our iterable as a list of numbers as usual. We then set up a for loop with the takewhile iterator, passing in our function and iterable as arguments. The output of this code is every item in the list before the first item for which the predicate returns False: 6, 4, 5 and 6.
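
To see how takewhile and dropwhile relate, the sketch below runs both on the same list with the same predicate. The two results split the list at the first item that fails the predicate.

```python
from itertools import dropwhile, takewhile

def greater_than_three(x):
    return x > 3

num = [6, 4, 5, 6, 2, 6, 7, 9]

head = list(takewhile(greater_than_three, num))  # items before the first failure
tail = list(dropwhile(greater_than_three, num))  # the first failure and the rest

print(head)  # [6, 4, 5, 6]
print(tail)  # [2, 6, 7, 9]
```

Joined back together, head and tail give the original list.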

Zip_longest(*iterables, fillvalue=None)

The zip_longest iterator can be used to iterate over two or more iterables together. If the iterables are not of the same length, the missing values of the shorter ones are filled with fillvalue, which defaults to None. You can provide your own fill value as shown below:

from itertools import zip_longest

for i in zip_longest('abcd', '12', fillvalue='None'):
    print(i)

In the code above, we import the zip_longest iterator and set up a for loop with it, passing in two iterables and a fill value as arguments. The second iterable ('12') is two characters short, so its missing values are filled in. Note that the fill value here is the string 'None', not the None object. The output is the tuples ('a', '1'), ('b', '2'), ('c', 'None') and ('d', 'None').
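
For contrast, here is a small sketch comparing the built-in zip, which stops at the shortest input, with zip_longest using its default fill value.

```python
from itertools import zip_longest

# zip silently discards the unmatched 'c' and 'd'.
short_pairs = list(zip("abcd", "12"))

# zip_longest keeps going and fills the gaps with None by default.
long_pairs = list(zip_longest("abcd", "12"))

print(short_pairs)  # [('a', '1'), ('b', '2')]
print(long_pairs)   # [('a', '1'), ('b', '2'), ('c', None), ('d', None)]
```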

Combinatoric Iterators

Combinatoric iterators are a type of iterator that focuses on generating various combinations and permutations from an iterable. They are particularly useful for solving combinatorial problems. Itertools provides us with a handful of these combinatorial generators.

Combinations(iterable, r)

The combinations iterator takes an iterable as an argument along with an integer r, which is the length of each combination. It returns each combination as an r-length tuple. Let's see this in code:

from itertools import combinations

for i in combinations('ABCD', 2):
    print(''.join(i))

In the code above, we imported the combinations iterator and then set up a for loop with it, providing a string as an iterable and 2 as the length of each combination. This code would normally return a tuple for each combination, but by using the join method we've joined the two values in each tuple into a single string: AB, AC, AD, BC, BD and CD.

It is important to note that the combinations are emitted in lexicographic order according to the order of the input, so if the input is sorted (e.g. alphabetically, A-Z), the output will be sorted too. Also, the combinations will not contain repeated values (e.g. AA, BB) if all the input elements are unique.
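
The notes above can be checked with a short sketch. It assumes Python 3.8 or later for math.comb, which is used here only to confirm the expected count.

```python
from itertools import combinations
from math import comb

pairs = list(combinations("ABCD", 2))
print(len(pairs), comb(4, 2))  # both are 6

# Ordering follows the input positions, not the alphabet.
unsorted_pairs = list(combinations("BA", 2))
print(unsorted_pairs)  # [('B', 'A')]

# Elements are treated as unique by position, so duplicate input
# values can produce repeated-looking combinations.
duplicate_pairs = list(combinations("AAB", 2))
print(duplicate_pairs)  # [('A', 'A'), ('A', 'B'), ('A', 'B')]
```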

Combinations_with_replacement(iterable, r)

The combinations_with_replacement iterator is quite similar to the combinations iterator, except that it also creates combinations where an element is repeated (e.g. AA, BB). The code below prints ten strings: AA, AB, AC, AD, BB, BC, BD, CC, CD and DD. Let's see this in code:

from itertools import combinations_with_replacement

for i in combinations_with_replacement('ABCD', 2):
    print(''.join(i))

Permutations(iterable, r)

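The permutations iterator returns successive r-length permutations of the elements of the iterable you give it. Unlike combinations, the order of the elements matters, so AB and BA are both produced. If r is omitted, it defaults to the length of the iterable. Let's see this in code: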

from itertools import permutations

for i in permutations('ABCD', 2):
    print(''.join(i))
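
To compare the three combinatoric iterators side by side, here is a small sketch that runs each of them on the shorter string 'ABC' with a length of 2. The joined helper is made up for this example.

```python
from itertools import combinations, combinations_with_replacement, permutations

def joined(tuples):
    # Hypothetical helper: turn each tuple of characters into a string.
    return ["".join(t) for t in tuples]

combos = joined(combinations("ABC", 2))
combos_repl = joined(combinations_with_replacement("ABC", 2))
perms = joined(permutations("ABC", 2))

print(combos)       # ['AB', 'AC', 'BC']
print(combos_repl)  # ['AA', 'AB', 'AC', 'BB', 'BC', 'CC']
print(perms)        # ['AB', 'AC', 'BA', 'BC', 'CA', 'CB']

# With r omitted, permutations uses the full length of the input.
full_perms = joined(permutations("ABC"))
print(len(full_perms))  # 6
```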

That's it for this article. As you can see, the itertools module provides us with a collection of very handy functions that we can use for efficient iteration without coding them from scratch ourselves. For a more comprehensive look at more of these tools, be sure to visit the docs.