Jason C. McDonald

Dead Simple Python: Loops and Iterators

Like the articles? Buy the book! Dead Simple Python by Jason C. McDonald is available from No Starch Press.


Remember the last time you lost something?

You probably turned your house upside down looking for it. You go through, room by room, while the people around you ask pointless questions like "where was the last place you had it?" (Seriously, if I knew that, I wouldn't be looking for it!) It'd be great to optimize your search, but your house isn't sorted...or particularly well organized, if you're anything like me. You're stuck with a linear search.

In programming, as in real life, we don't usually get data handed to us in any meaningful order. We start out with a whole mess, and we have to perform tasks on it. Searching through unordered data is probably the first example that springs to mind, but there are hundreds of other things you might want to do: convert all Fahrenheit temperature recordings to Celsius, find the average of all the data points, whatever.

"Yeah, yeah, that's what loops are for!"

But this is Python. Loops here are on a whole different level. They're so good, they're practically criminal.

An Overview of Loops

Let's get the boring stuff out of the way, shall we?

In Python, like in most languages, we have two basic loops: while and for.

while

A while loop is pretty basic.

clue = None
while clue is None:
    clue = searchLocation()

As long as the loop condition, clue is None in this case, evaluates to True, the loop's code will be executed.

In Python, we also have a couple of useful keywords: break immediately stops the loop, while continue skips to the next iteration of the loop.
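
To see both keywords side by side, here's a minimal sketch. The readings list, the use of None for a missing reading, and the -1 end marker are all made up for this example.

```python
# Hypothetical data: None marks a missing reading, -1 marks the end of input.
readings = [72, None, 65, 80, -1, 99]

valid = []
index = 0
while index < len(readings):
    reading = readings[index]
    index += 1
    if reading is None:
        continue  # skip the rest of this pass and re-check the condition
    if reading == -1:
        break     # leave the loop entirely; 99 is never examined
    valid.append(reading)

print(valid)  # [72, 65, 80]
```

Notice that index is incremented before the continue. If it came after, the loop would get stuck on the missing reading forever.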

One of the most useful aspects of break is if we want to run the same code until the user provides valid input.

while True:
    try:
        age = int(input("Enter your age: "))
    except ValueError:
        print("Please enter a valid integer!")
    else:
        if age > 0:
            break

As soon as we encounter the break statement, we exit the loop. Granted, that was a fairly convoluted example, but it demonstrates the point. You also often see while True: used in game loops.

Gotcha Alert: If you've ever worked with loops in any language, you're already familiar with the infinite loop. This is most often caused by a while condition which always evaluates to True and no break statement within the loop.

for

Coming from Java, C++, or many similar ALGOL-style languages, you're probably familiar with the tripartite for loop: for i := 1; i < 100; i := i + 1. I don't know about you, but when I first encountered that, it scared the dickens out of me. I'm comfortable with it now, but it just doesn't possess the elegant simplicity of Python, does it?

Python's for loop looks vastly different. The Python equivalent to that pseudocode above is...

for i in range(1,100):
    print(i)

range() is a special "function" in Python that returns a sequence. (Technically, it's not a function at all, but that's getting pretty deep into pedantics.)
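
A quick sketch of what range() actually hands back may help here. The key detail is that the stop value is excluded, so range(1,100) above ends at 99. The values below are arbitrary, picked just for illustration.

```python
# range(start, stop) counts from start up to, but not including, stop.
print(list(range(1, 5)))       # [1, 2, 3, 4]

# With a single argument, counting starts at 0.
print(list(range(3)))          # [0, 1, 2]

# An optional third argument sets the step, which may be negative.
print(list(range(10, 0, -3)))  # [10, 7, 4, 1]

# The numbers are produced on demand, so even a huge range is cheap to create.
big = range(1, 1_000_000_000)
print(len(big))                # 999999999
print(big[-1])                 # 999999999
```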

This is the impressive thing about Python's for loop: it iterates over a special type of object, called an iterable, which we'll talk about later.

For now, it's easiest to understand that we can iterate over a sequential data structure, like an array (called a "list" in Python).

Thus, we can do this...

places = ['Nashville', 'Norway', 'Bonaire', 'Zimbabwe', 'Chicago', 'Czechoslovakia']
for place in places:
    print(place)

print("...and back!")

...and we get this...

Nashville
Norway
Bonaire
Zimbabwe
Chicago
Czechoslovakia
...and back!

for...else

Python has another unique little trick in its loops: the else clause! After the loop is completed, and has not encountered a break statement, it will run the code in else. However, if the loop is broken out of manually, it will skip the else altogether.

places = ['Nashville', 'Norway', 'Bonaire', 'Zimbabwe', 'Chicago', 'Czechoslovakia']
villain_at = 'Mali'

for place in places:
    if place == villain_at:
        print("Villain captured!")
        break
else:
    print("The villain got away again.")

Since 'Mali' wasn't in the list, we see the message "The villain got away again." However, if we change the value of villain_at to Norway, we'll see "Villain captured!" instead.
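
One subtle point: the else clause also runs when the loop body never executes at all. Here's a small sketch of the same search pattern, using a hypothetical describe() helper that is not part of the example above.

```python
def describe(n):
    """Hypothetical helper: report the first divisor of n found, if any."""
    for candidate in range(2, n):
        if n % candidate == 0:
            message = f"{n} is divisible by {candidate}"
            break  # found a divisor, so the else below is skipped
    else:
        # Runs only when the loop finished without hitting break,
        # including when range(2, n) was empty and the body never ran.
        message = f"{n} has no smaller divisors"
    return message


print(describe(15))  # 15 is divisible by 3
print(describe(13))  # 13 has no smaller divisors
print(describe(2))   # 2 has no smaller divisors
```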

Where's the do?

Python does not have a do...while loop. If you're looking for one, the typical Python convention is to use a while True: with an inner break condition, like we demonstrated earlier.
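
As a rough sketch of that convention: the loop body runs at least once, and the condition you'd normally put after the do block becomes an inverted break at the bottom. The responses list is a made-up stand-in for user input, so the example runs without a keyboard.

```python
# Hypothetical stand-in for three rounds of user input.
responses = ["", "   ", "Carmen"]

attempts = 0
while True:
    # The body always runs at least once, just like a do...while.
    name = responses.pop(0).strip()
    attempts += 1
    # "do ... while name is empty" becomes "break once name is NOT empty".
    if name:
        break

print(name, attempts)  # Carmen 3
```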

A Few Containers

Python has a number of containers, or data structures, that hold data. We won't go into much depth on any of these, but I want to quickly skim over the most important ones:

list

A list is a mutable sequence (basically, an array).

It is defined with square brackets [ ], and you can access its elements via index.

foo = [2, 4, 2, 3]

print(foo[1])
>>> 4

foo[1] = 42
print(foo)
>>> [2, 42, 2, 3]

Although there is no strict technical requirement for it, the typical convention is for lists to only contain items of the same type ("homogeneous").

tuple

A tuple is an immutable sequence. Once you've defined it, you technically can't change it (recall the meaning of immutability from before). This means you can't add or remove elements from a tuple after it's been defined.

A tuple is defined within parentheses ( ), and you can access its elements via index.

foo = (2, 4, 2, 3)

print(foo[1])
>>> 4

foo[1] = 42
>>> TypeError: 'tuple' object does not support item assignment

Unlike lists, standard convention permits tuples to contain elements of different types ("heterogeneous").

set

A set is an unordered mutable collection that is guaranteed not to have duplicates. That "unordered" part is important to remember: the sequence of individual elements cannot be guaranteed!

A set is defined within curly braces { }, although if you want an empty set, you must say foo = set(), as foo = {} creates a dict. You cannot access its elements via index, since it is unordered.

foo = {2, 4, 2, 3}

print(foo)
>>> {2, 3, 4}

print(foo[1])
>>> TypeError: 'set' object is not subscriptable

For an object to be added to a set, it also must be hashable. An object is hashable if:

  1. It defines the method __hash__(), which returns a hash as an integer. (See below)

  2. It defines the method __eq__() for comparing two objects.

A valid hash should always be the same for the same object (value), and it should be reasonably unique, so that it is somewhat uncommon that another object returns the same hash. (Two or more objects having the same hash is called a hash collision, and they still happen.)
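
To make that concrete, here's a minimal sketch of a hashable class. The Agent class is hypothetical, invented only for this illustration. The important part is that __hash__() is built from the same values that __eq__() compares, so equal objects always share a hash.

```python
class Agent:
    """Hypothetical class for illustration only."""

    def __init__(self, name, number):
        self.name = name
        self.number = number

    def __eq__(self, other):
        if not isinstance(other, Agent):
            return NotImplemented
        return (self.name, self.number) == (other.name, other.number)

    def __hash__(self):
        # Hash the same values that __eq__ compares.
        return hash((self.name, self.number))


# Two equal agents collapse into a single set element.
agents = {Agent("Ivan Idea", 1324595), Agent("Ivan Idea", 1324595)}
print(len(agents))  # 1

# Built-in mutable containers, such as list, are not hashable.
try:
    {[1, 2, 3]}
except TypeError as error:
    print(f"TypeError: {error}")
```

In practice, this means the attributes feeding the hash shouldn't change while the object lives in a set, or the set may no longer be able to find it.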

Dictionary (dict)

A dict is a key-value data structure.

It is defined within curly braces { }, using : to separate keys and values. As of Python 3.7, a dict preserves the order in which its items were inserted, but you still cannot access its elements via a positional index; instead, you indicate the key within square brackets [ ] in much the same way.

foo = {'a' : 1, 'b' : 2, 'c' : 3, 'd' : 4}

print(foo['b'])
>>> 2

foo['b'] = 42
print(foo)
>>> {'a': 1, 'b': 42, 'c': 3, 'd': 4}

Only hashable objects may be used as dictionary keys. (See the section on set above for more information on hashability.)
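
A short sketch of what that rule looks like in practice. The sightings dictionary and its contents are invented for this example.

```python
# A tuple of hashable items is itself hashable, so it works as a key.
sightings = {("Montreal", "Canada"): 3, ("Bonaire", "Caribbean"): 1}
print(sightings[("Montreal", "Canada")])  # 3

# A list is mutable and unhashable, so using one as a key fails.
try:
    sightings[["Norway", "Europe"]] = 2
except TypeError as error:
    print(f"TypeError: {error}")

# The restriction applies to keys only; values can be anything.
sightings[("Norway", "Europe")] = [2, 5]
print(len(sightings))  # 3
```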

Other Data Structures

Python offers additional containers besides the basics. You can find many of them in the collections module of the standard library.
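
As a small taste, here's a sketch using three containers from collections: Counter, defaultdict, and deque. The data is made up, and this only scratches the surface of what each one offers.

```python
from collections import Counter, defaultdict, deque

places = ["Nashville", "Norway", "Nashville", "Chicago", "Nashville"]

# Counter tallies how many times each item appears.
visits = Counter(places)
print(visits["Nashville"])    # 3
print(visits.most_common(1))  # [('Nashville', 3)]

# defaultdict creates a default value the first time a key is used.
by_letter = defaultdict(list)
for place in places:
    by_letter[place[0]].append(place)
print(by_letter["C"])         # ['Chicago']

# deque supports fast appends and pops at both ends.
queue = deque(["Bonaire", "Zimbabwe"])
queue.appendleft("Norway")
print(queue.pop())            # Zimbabwe
print(queue.popleft())        # Norway
```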

Unpacking a Container

There's an important piece of Python syntax we haven't talked about yet, but which will be useful shortly. We can assign each of the items in a container to a variable! This is called unpacking.

Of course, we need to know exactly how many items we're unpacking for this to work, otherwise we'll get a ValueError exception.

Let's look at a basic example, using a tuple.

fullname = ('Carmen', 'Sandiego')
first, last = fullname
print(first)
>>> Carmen
print(last)
>>> Sandiego

The secret sauce is in that second line. We can list multiple variables to assign to, separated by commas. Python will unpack the container on the right side of the equal sign, assigning each value to a variable in order, left-to-right.
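
Here's a quick sketch of what happens when the counts don't match, using a hypothetical three-part name.

```python
fullname = ('Carmen', 'Isabella', 'Sandiego')  # hypothetical three-part name

# Two variables, three values: Python refuses to guess.
try:
    first, last = fullname
except ValueError as error:
    print(f"ValueError: {error}")

# With matching counts, unpacking works fine.
first, middle, last = fullname
print(last)  # Sandiego

# The same syntax works with other containers, such as a list.
x, y = [3, 4]
print(x + y)  # 7
```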

Gotcha Alert: Remember, set is unordered! While you can technically do this with a set, you can't be certain what value is assigned to what variable. It isn't guaranteed to be in any order; the fact that sets usually unpack their values in sorted order is incidental, and NOT guaranteed!

The in Thing

Python offers a nifty keyword, in, for checking if a particular element is found within a container.

places = ['Nashville', 'Norway', 'Bonaire', 'Zimbabwe', 'Chicago', 'Czechoslovakia']

if 'Nashville' in places:
    print("Music city!")

This works with many containers, including lists, tuples, sets, and even with dictionary keys (but not dictionary values).

If you want one of your custom classes to support the in operator, you need only to define the __contains__(self, item) method, which should return True or False. (See the documentation).
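
A minimal sketch of that, using a hypothetical Watchlist class. The case-insensitive matching is just a design choice for this example, to show that __contains__() can apply whatever logic you like.

```python
class Watchlist:
    """Hypothetical class for illustration only."""

    def __init__(self, *names):
        # Store lowercase names so lookups ignore capitalization.
        self._names = set()
        for name in names:
            self._names.add(name.lower())

    def __contains__(self, item):
        return item.lower() in self._names


watchlist = Watchlist("Vic the Slick", "Patty Larceny")

print("vic the slick" in watchlist)   # True
print("Sarah Nade" in watchlist)      # False
print("Sarah Nade" not in watchlist)  # True
```

Defining __contains__() gives you not in for free.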

Iterators

Python's loops are designed to work with iterables, which I mentioned earlier. These are objects that can be iterated over, using an iterator.

Cricket sounds.

Okay, let's take this from the top. A Python container object, such as a list, is also an iterable, because it has an __iter__() method defined, which returns an iterator object.

An iterator has a __next__() method defined, which in the case of a container iterator, returns the next item. Even unordered containers, like set(), can be traversed using iterators.

When nothing else can be returned by __next__(), it raises a specialized exception called StopIteration. This can be caught and handled using the typical try...except.
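
You can drive an iterator by hand to watch this happen. Here's a small sketch using the built-in iter() and next() functions on a made-up list of clues.

```python
clues = ['footprint', 'ticket stub']
clue_iter = iter(clues)  # calls clues.__iter__() behind the scenes

print(next(clue_iter))   # footprint
print(next(clue_iter))   # ticket stub

# The iterator is now exhausted, so one more call raises StopIteration.
try:
    next(clue_iter)
except StopIteration:
    print("No more clues.")

# next() also accepts a default to return instead of raising.
print(next(clue_iter, 'nothing left'))  # nothing left
```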

Let's look again at a for loop traversing over a list, for example...

dossiers = ['The Contessa', 'Double Trouble', 'Eartha Brute', 'Kneemoi', 'Patty Larceny', 'RoboCrook', 'Sarah Nade', 'Top Grunge', 'Vic the Slick', 'Wonder Rat']

for crook in dossiers:
    print(crook)

dossiers is a list object, which is an iterable. When Python reaches the for loop, it does three things:

  1. Calls iter(dossiers), which in turn executes dossiers.__iter__(). This returns an iterator object that we'll call list_iter. This iterator object will be used by the loop.

  2. For each iteration of the loop, it calls next(list_iter), which executes list_iter.__next__(), and assigns the returned value to crook.

  3. If the iterator threw the special exception StopIteration, the loop is finished, and we exit.

It might be easier to understand this if I rewrite that logic in a while True: loop...

list_iter = iter(dossiers)
while True:
    try:
        crook = next(list_iter)
        print(crook)
    except StopIteration:
        break

If you try both loops, you'll see they do the exact same thing!

Understanding how __iter__(), __next__(), and the StopIteration exception work, you can now make your own classes iterable!

Hack Alert: While it's fairly typical to define your iterator class separately from your iterable class, you don't necessarily have to! As long as both methods are defined in your class, and __next__() behaves appropriately, you can just define __iter__() to return self.

It's worth noting that iterators themselves are iterables: they have a __iter__() method which returns self.
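
Here's a minimal sketch of a class acting as its own iterator. The Countdown class is hypothetical, made up for this illustration.

```python
class Countdown:
    """Hypothetical example: an object that is its own iterator."""

    def __init__(self, start):
        self._current = start

    def __iter__(self):
        # The object is its own iterator, so just hand back self.
        return self

    def __next__(self):
        if self._current <= 0:
            raise StopIteration
        value = self._current
        self._current -= 1
        return value


timer = Countdown(3)
for tick in timer:
    print(tick)     # prints 3, then 2, then 1

# The same object is now exhausted, so a second pass yields nothing.
print(list(timer))  # []
```

That last line shows the trade-off: an object that is its own iterator can only be traversed once, whereas a separate iterator class lets every loop start fresh.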

The Curious Case of the Dictionary

Let's say we have a dictionary we want to work with...

locations = {
    'Parade Ground': None,
    'Ste.-Catherine Street': None,
    'Pont Victoria': None,
    'Underground City': None,
    'Mont Royal Park': None,
    'Fine Arts Museum': None,
    'Humor Hall of Fame': 'The Warrant',
    'Lachine Canal': 'The Loot',
    'Montreal Jazz Festival': None,
    'Olympic Stadium': None,
    'St. Lawrence River': 'The Crook',
    'Old Montréal': None,
    'McGill University': None,
    'Chalet Lookout': None,
    'Île Notre-Dame': None
    }

If we just wanted to see each of the items in it, we'd just use a for loop. So, this should work, right?

for location in locations:
    print(location)

Oops! That only shows us the keys, not the values. Definitely not what we're wanting, is it? What in the world is going on?

dict.__iter__() returns a dict_keyiterator object, which does what its class name suggests: it iterates over the keys, but not the values.

To get both the key and value, we need to call locations.items(), which returns a dict_items object. dict_items.__iter__() returns a dict_itemiterator, which will return each key-value pair in the dictionary as a tuple.
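
For a side-by-side look, here's a sketch using a small made-up dictionary called stash. The iterator class names printed at the end are what CPython reports.

```python
stash = {'The Loot': 'Lachine Canal', 'The Crook': 'St. Lawrence River'}

# Iterating the dictionary directly gives keys only.
print(list(stash))           # ['The Loot', 'The Crook']

# keys(), values(), and items() each return an iterable view.
print(list(stash.keys()))    # ['The Loot', 'The Crook']
print(list(stash.values()))  # ['Lachine Canal', 'St. Lawrence River']
print(list(stash.items()))
# [('The Loot', 'Lachine Canal'), ('The Crook', 'St. Lawrence River')]

# The iterator classes behind the scenes:
print(type(iter(stash)).__name__)          # dict_keyiterator
print(type(iter(stash.items())).__name__)  # dict_itemiterator
```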

Legacy Note: If you're using Python 2, you should call locations.iteritems() instead.

Remember earlier, when we talked about unpacking? The fact we're dealing with each pair as a tuple means we can unpack those into two variables.

for key, value in locations.items():
    print(f'{key} => {value}')

That prints out the following:

Parade Ground => None
Ste.-Catherine Street => None
Pont Victoria => None
Underground City => None
Mont Royal Park => None
Fine Arts Museum => None
Humor Hall of Fame => The Warrant
Lachine Canal => The Loot
Montreal Jazz Festival => None
Olympic Stadium => None
St. Lawrence River => The Crook
Old Montréal => None
McGill University => None
Chalet Lookout => None
Île Notre-Dame => None

Ahhh, that's more like it! Now we can work with the data. For example, I might want to record the important information in another dictionary.

information = {}

for location, result in locations.items():
    if result is not None:
        information[result] = location

# Win the game!
print(information['The Loot'])
print(information['The Warrant'])
print(information['The Crook'])

print("Vic the Slick....in jaaaaaaaaail!")

That will find the Loot, Warrant, and Crook, and list them in the proper order:

Lachine Canal
Humor Hall of Fame
St. Lawrence River
Vic the Slick....in jaaaaaaaaail!

Behold, the crime fighting power of loops and iterators!

Your Own Iterators

I already mentioned earlier that you can make your own iterables and iterators, but showing is better than telling!

Imagine we want to keep a list of agents handy, so we can always identify them by their agent number. However, there are some agents that we can't talk about. We can accomplish this pretty easily by storing agent id and name in a dictionary, and then maintaining a list of classified agents.

Gotcha Alert: Remember from our discussion of classes, there isn't actually such a thing as a private variable in Python. If you REALLY intend to keep secrets, use industry standard encryption and security practices, or at least don't expose your API to any VILE operatives. ;)

For starters, here's the basic structure of that class:

class AgentRoster:
    def __init__(self):
        self._agents = {}
        self._classified = []

    def add_agent(self, name, number, classified=False):
        self._agents[number] = name
        if classified:
            self._classified.append(name)

    def validate_number(self, number):
        try:
            name = self._agents[number]
        except KeyError:
            return False
        else:
            return True

    def lookup_agent(self, number):
        try:
            name = self._agents[number]
        except KeyError:
            name = "<NO KNOWN AGENT>"
        else:
            if name in self._classified:
                name = "<CLASSIFIED>"
        return name

We can go ahead and test that out, just for posterity:

roster = AgentRoster()

roster.add_agent("Ann Tickwitee", 2539634)
roster.add_agent("Ivan Idea", 1324595)
roster.add_agent("Rock Solid", 1385723)
roster.add_agent("Chase Devineaux", 1495263, True)

print(roster.validate_number(2539634))
>>> True
print(roster.validate_number(9583253))
>>> False

print(roster.lookup_agent(1324595))
>>> Ivan Idea
print(roster.lookup_agent(9583253))
>>> <NO KNOWN AGENT>
print(roster.lookup_agent(1495263))
>>> <CLASSIFIED>

Great, that works exactly as expected! Now, what if we want to be able to loop through the entire dictionary, perhaps as part of some awesome code that shows their name and current location on a snazzy global map?

However, we don't want to just access the roster._agents dictionary directly, because that will disregard the whole "classified" aspect of this class. How do we handle that?

As I mentioned before, we could just have this class also serve as its own iterator, meaning it has a __next__() method. In that case, __iter__() would only return self. However, this is Dead Simple Python, so let's skip the annoyingly simplistic stuff and actually create a separate iterator class.

In this example, I'll actually turn that dictionary into a list of tuples, which will allow me to use indexing. (Remember, a dictionary is looked up by key, not by position.) I'll also figure out how many agents aren't classified. All of that logic belongs in the __init__() method, of course:

class AgentRoster_Iterator:

    def __init__(self, container):
        self._roster = list(container._agents.items())
        self._classified = container._classified
        self._max = len(self._roster) - len(self._classified)
        self._index = 0

To be an iterator, the class must have a __next__() method; that's the only requirement! Remember, that method needs to raise StopIteration as soon as we have no more data to return.

I'll define AgentRoster_Iterator's __next__() method as follows:

class AgentRoster_Iterator:

    # ...snip...

    def __next__(self):
        if self._index == self._max:
            raise StopIteration
        else:
            r = self._roster[self._index]
            self._index += 1
            return r

Now we return to the AgentRoster class, where we need to add an __iter__() method that returns an appropriate iterator object.

class AgentRoster:

    # ...snip...

    def __iter__(self):
        return AgentRoster_Iterator(self)

That little bit of magic is all it takes, and now our AgentRoster class behaves exactly as expected with a loop! This code...

roster = AgentRoster()

roster.add_agent("Ann Tickwitee", 2539634)
roster.add_agent("Ivan Idea", 1324595)
roster.add_agent("Rock Solid", 1385723)
roster.add_agent("Chase Devineaux", 1495263, True)

for number, name in roster:
    print(f'{name}, id #{number}')

...produces...

Ann Tickwitee, id #2539634
Ivan Idea, id #1324595
Rock Solid, id #1385723
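
One caveat: this iterator stops after len(roster) - len(classified) items, so it only hides classified agents when they happen to be added last. Below is a minimal sketch of an order-independent variant, which skips classified entries as it walks the list instead of counting them up front. This is one possible approach, not the only one. The trimmed-down AgentRoster is repeated so the example runs on its own, and the iterator's __iter__() method is my addition.

```python
class AgentRoster:
    def __init__(self):
        self._agents = {}
        self._classified = []

    def add_agent(self, name, number, classified=False):
        self._agents[number] = name
        if classified:
            self._classified.append(name)

    def __iter__(self):
        return AgentRoster_Iterator(self)


class AgentRoster_Iterator:
    def __init__(self, container):
        self._roster = list(container._agents.items())
        self._classified = container._classified
        self._index = 0

    def __iter__(self):
        # Iterators are iterables too, so return self.
        return self

    def __next__(self):
        # Keep advancing until we find an agent we're allowed to show.
        while self._index < len(self._roster):
            number, name = self._roster[self._index]
            self._index += 1
            if name not in self._classified:
                return number, name
        # Nothing left to show.
        raise StopIteration


roster = AgentRoster()
roster.add_agent("Ann Tickwitee", 2539634)
roster.add_agent("Chase Devineaux", 1495263, True)  # classified, not last
roster.add_agent("Ivan Idea", 1324595)
roster.add_agent("Rock Solid", 1385723)

for number, name in roster:
    print(f'{name}, id #{number}')
# Ann Tickwitee, id #2539634
# Ivan Idea, id #1324595
# Rock Solid, id #1385723
```

Nothing in the roster is modified along the way; the classified agent is still stored internally, just never handed out by the iterator.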

Looking Forward

I hear that Pythonista in the back: "Wait, wait, we can't be done yet! You haven't even touched on list comprehensions yet!"

Python indeed adds a whole additional level of magic on top of loops and iterators, with a special tool called a generator. Closely related is another incredible tool called a comprehension, which is like a deliciously compact loop for creating a data structure.

I've also deliberately skipped such goodness as zip() and enumerate(), which make loops and iteration even more powerful. I would have included them here, but I didn't want to make the article too long. (It's already pushing it.) I'll be touching on those later as well.

I see some of you are already vibrating with excitement, but alas, you're going to have to wait until the next article to learn more.

Review

Let's review the most important concepts from this section:

  • A while loop runs as long as its condition evaluates to True.
  • You can break out of a loop with the break keyword, or skip to the next iteration with the continue keyword.
  • A for loop iterates over an iterable (an object that can be iterated over), such as a list.
  • The range() function returns an iterable sequence of numbers, which can be used in a for loop, e.g. for i in range(1, 100).
  • Python does NOT have a do...while loop. Use a while True: loop with an explicit break statement within it.
  • Python has four basic data structures, or containers:
    • Lists are mutable, ordered, sequential structures...basically, arrays.
    • Tuples are immutable, ordered, sequential structures. Think list, but you can't modify the contents.
    • Sets are mutable, unordered structures that are guaranteed never to have any duplicate elements. They can only store hashable objects.
    • Dictionaries are mutable structures that store key-value pairs, preserving insertion order as of Python 3.7. You look up items by key, not by index. Only hashable objects may be used as keys.
  • You can unpack the values of a container into multiple variables using the convention a, b, c = someContainer. The number of variables on the left and the number of elements in the container on the right must be the same!
  • You can quickly check if an element is in a container with the in keyword. If you want your class to support this, define the __contains__() method.
  • Python's containers are examples of iterables: they return iterators that can traverse their contents. An iterable object always returns an iterator object via its __iter__() method.
  • An iterator object always has a __next__() method, which returns a value. A container iterator's __next__() method would return the next element in the container. When there is nothing more to return, the iterator raises the StopIteration exception.

Ned Batchelder has a phenomenal talk on iterators and loops entitled "Loop Like A Native". I strongly recommend checking it out!

Also, as usual, be sure to read the documentation. There's plenty more you can do with loops, containers, and iterators.


Thank you to deniska, grym, and ikanobori (Freenode IRC #python) for suggested revisions.

Top comments (14)

aymanone

Hello, it's a perfect article, the whole series really, but there's something wrong in the code for the agents iterator: as you implemented it, it will return the first n items in the agents list even if they're classified.

Jason C. McDonald

Hm. I just tested the code as implemented again, and it doesn't display the classified agents. If your tests have produced otherwise, would you mind sharing a screenshot? Thanks!

aymanone

Yes, it'll work right in that case, but try this:

agents.add(not secret)
agents.add(secret)
agents.add(not secret)
agents.add(secret)

Then self.agents = [not secret, secret, not secret, secret], and len(self.agents) - len(self.secret) is 4 - 2 = 2. So what will print is self.agents[0] (not secret), then self.agents[1] (secret).

Actually, it's no big deal, not a deal at all, but when I reached that part of the article I read it several times, because I thought maybe I had missed something. Thanks for this series; I hope you continue it.

Jason C. McDonald

Ah, I hadn't quite addressed the [] operator in that example. Good catch.

Pete (he/him)

I noticed this error case while reading through the article as well. For the sake of learning, I just couldn't move on and ignore it.

Here is my fix. I am sure there are better ways to accomplish this, so please critique and let me know how I could better accomplish this.

The only thing I changed was the next method as follows:

    def __next__(self):
        if self._index == self._max:
            raise StopIteration
        else:
            _number, _name = self._roster[self._index]
            if _name in self._classified:
                self._roster.pop(self._index)
            r = self._roster[self._index]
            self._index += 1
            return r

Jason C. McDonald

Hey, that's pretty good. However, my only concern is that it would delete the internally stored information about the classified agent (which we don't want).

Pete (he/him)

Ah, good point.

Take 2 [move classified to end of _roster]:

            if _name in self._classified:
                self._roster.append(self._roster.pop(self._index))
            r = self._roster[self._index]

Thanks so much for these articles and for being so responsive. They are written very well , engaging, and a great resource.

Jason C. McDonald

If I were going to fix this problem (which I may well do soon -- I have to take another pass through this material when writing the book), I would actually define the __getitem__() function instead, as that controls the behavior of the [] operator.

This all comes down to separation of concerns. It shouldn't be the responsibility of __next__() to mutate the internal data to obscure information. Its only job should be to determine whether it exposes that information, and how.

Of course, in all honesty, there's nothing preventing a direct call to agents._roster[1] (Python has no private variables). If we were going to obfuscate or remove classified data, that should really occur on the add_agent() function.

Anna R Dunster

I see another comment thread addressed what I was wondering about the classified agents showing up if they weren't last ;)

For some reason I had the idea that lists and arrays were different in some functional way, but from your article it sounds like they are functionally the same things, just with different vocabulary based on language?

I have a hard time grokking hashability. I've tried several times but something eludes me about the logic of why one thing is hashable and another isn't. I'd rather understand it than memorize it/look things up to check when it matters.

(Also, funny thing, OSX has grokking in the dictionary, but not hashable? what.)

Jason C. McDonald

Yes, lists and arrays are effectively the same things. (There are some implementation differences internal to the language, mind you.)

Hashing means you perform some function, usually a one-way (lossy, as it were) transformation, on the data to produce a (usually) unique value, often shorter than the original.

For example, here's the hashes for a few strings, according to Python's hash function. You'll notice they all produce a unique integer value, and all those integers are the same length.

"Hello" → 3483667490880649043
"Me" → 6066670828640188320
"Reverse the polarity of the neutron flow." → 7317767150724217908
"What evil lurks in the hearts of men? The shadow knows!" → -6411620787934941530

When two different input values have the same hash, that's known as a "hash collision", so any container that relies on a hash (such as Python's dictionaries) needs to be able to handle that situation.

For more information, watch this excellent explanation by the legendary @vaidehijoshi :

Anna R Dunster

Thanks for the link, that's a great video and the visuals are quite helpful. Will check out the rest of her series, too.

zhenmisher

Hello, I reimplemented the __next__ method to avoid exposing classified elements:

class AgentRoster_Iterator:

    def __init__(self, container):
        self._roster = list(container._agents.items())
        self._classified = container._classified
        self._max = len(self._roster)
        self._index = 0

    def __next__(self):
        while self._index < self._max:
            num, name = self._roster[self._index]
            self._index += 1
            if name not in self._classified:
                return num, name
        raise StopIteration

Jason C. McDonald

Oh, top notch, thanks for catching that, mate.