2021-04-06T15:54:26Z

How to Write Unit Tests in Python, Part 2: Game of Life

This is the second part of my series on unit testing Python applications. In this article I will introduce you to Conway's Game of Life, an interesting simulation that plays animated patterns on a grid.

Game of Life

My implementation of this game has an engine part, where the data structures and algorithm of the simulation are implemented, and a graphical user interface (GUI) part. In this article I will focus on testing the engine (testing GUIs will be covered in a future article).

Testing this code will present us with a few new challenges. In this article you will learn about the following topics:

  • Test parametrization
  • Building test matrices
  • Basic mocking
  • Testing Python exceptions

A Game of Life Primer

For those of you who are not familiar with Conway's Game of Life, here is a short introduction to how this simulation works.

The Game of Life runs on an infinite two-dimensional grid. Each cell in the grid can be in one of two states: alive or dead. At each time step, the simulation runs through every cell to update its state based on two rules. The rules are based on the current state of the cell and the state of its eight neighboring cells. Here are the rules that govern the simulation:

  • The survival rule: If the cell is alive and has exactly 2 or 3 neighbors alive, then it stays alive, else it dies.
  • The birth rule: If the cell is dead and has exactly 3 neighbors alive, then it becomes alive, else it stays dead.

With these simple rules it is possible to build many interesting patterns, some of which constantly change and mutate. The simulation rules can be generalized by making the number of neighbors used by the survival and birth rules configurable, so the standard survival rule can be described as [2, 3] and the birth rule as [3].
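As a rough illustration of the two rules, here is a minimal sketch of a function that decides the next state of a single cell. The function name next_state and its parameters are hypothetical and are not taken from life.py; the survival and birth arguments follow the generalized [2, 3] and [3] notation described above.

```python
def next_state(alive, live_neighbors, survival=(2, 3), birth=(3,)):
    """Return True if the cell is alive in the next time step.

    Hypothetical helper for illustration only, not part of life.py.
    """
    if alive:
        # Survival rule: a live cell stays alive only with an allowed count.
        return live_neighbors in survival
    # Birth rule: a dead cell becomes alive only with an allowed count.
    return live_neighbors in birth
```

With the default arguments, a live cell with 2 neighbors stays alive, while a dead cell with 2 neighbors stays dead.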

In the next section I'll show you an implementation of this simulation in Python!

Game of Life in Python

My implementation of the Game of Life engine in Python is available in a single Python file, without any external dependencies. You can look at it on GitHub: https://github.com/miguelgrinberg/python-testing/blob/main/life/life.py.

The code consists of two classes: CellList and Life.

The CellList class is an auxiliary class that maintains a list of cells, each given by a tuple with the format (x, y). Below you can see the structure of this class:

class CellList:
    """Maintain a list of (x, y) cells."""

    def __init__(self):
        pass

    def has(self, x, y):
        """Check if a cell exists in this list."""
        pass

    def set(self, x, y, value=None):
        """Add, remove or toggle a cell in this list."""
        pass

    def __iter__(self):
        """Iterator over the cells in this list."""
        pass

I think this class should be mostly self-explanatory. To help you see how the class is used, here is an example Python session that creates a list of cells:

>>> from life import CellList
>>> c = CellList()
>>> c.set(0, 0, True)  # add (0, 0) to the list
>>> c.has(0, 0)
True
>>> c.has(1, 0)
False
>>> c.set(3, 2)  # toggle (3, 2)
>>> list(c)
[(0, 0), (3, 2)]
>>> c.set(0, 0, False)  # remove (0, 0)
>>> list(c)
[(3, 2)]

The one aspect of the CellList class that may need a clarification is the __iter__() method. When this method is implemented in a class, instances become iterables. That means that you can convert an object to a list using the expression list(c) that you see above, or to a set with set(c). You can also iterate over the items stored in the object using a for-loop.
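To make the iterable behavior more concrete, here is a small sketch with a hypothetical Countdown class that has nothing to do with life.py. It only shows that a class with an __iter__() method can be passed to list(), set() or used in a loop.

```python
class Countdown:
    """Hypothetical example class, unrelated to life.py."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Writing __iter__() as a generator is a simple way to make
        # instances iterable.
        for n in range(self.start, 0, -1):
            yield n


c = Countdown(3)
as_list = list(c)          # convert to a list
as_set = set(c)            # convert to a set
total = sum(n for n in c)  # iterate with a loop-like expression
```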

The Life class is where the logic of the simulation is implemented. Here is the high-level structure of this class:

class Life:
    """Game of Life simulation."""

    def __init__(self, survival=[2, 3], birth=[3]):
        pass

    def rules_str(self):
        """Return the rules of the game as a printable string."""
        pass

    def load(self, filename):
        """Load a pattern from a file into the game grid."""
        pass

    def toggle(self, x, y):
        """Toggle a cell in the grid."""
        pass

    def living_cells(self):
        """Iterate over the living cells."""
        pass

    def bounding_box(self):
        """Return the bounding box that includes all living cells."""
        pass

    def advance(self):
        """Advance the simulation by one time unit."""
        pass

    def _advance_cell(self, x, y):
        """Calculate the new state of a cell."""
        pass

While most implementations of the Game of Life in Python that you come across use a grid of specific dimensions, my version is true to the original design of the game and works on an infinite grid. The straightforward approach of implementing the grid as a two-dimensional array or list becomes problematic if the grid needs to have infinite dimensions. The key to implementing an infinite grid is to store only the cells that are alive, as a flat list. You can probably guess that the CellList class is used for this.
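The idea of storing only the live cells can be sketched in a few lines. This example uses a plain Python set of (x, y) tuples, which is an assumption made for brevity and is not necessarily how CellList is implemented internally. The is_alive() helper is hypothetical.

```python
# Hypothetical sketch: only live cells are stored, so the grid has no
# fixed width or height.
alive = set()
alive.add((0, 0))
alive.add((-1_000_000, 2_500_000))  # far-away cells cost nothing extra


def is_alive(cells, x, y):
    """Any cell that is not stored is considered dead."""
    return (x, y) in cells
```

The memory used grows with the number of live cells, not with the area of the grid they span.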

The most important method in the Life class is advance(). This method runs the simulation for one step, updating the state of all cells according to the survival and birth rules. The _advance_cell() is a helper method that is used internally by advance().

The load() method can import a pattern of cells from a file. I implemented two fairly popular formats used by many implementations, the Life 1.05 and Life 1.06 formats. Both are plain text formats that can be viewed and edited in a standard text editor.

The rules_str() method is a helper method that can be used to get a printable description of the rules that are configured. For the standard rules this method will return 23/3, which indicates that 2 and 3 neighbors are used for the survival rule and 3 for the birth rule.
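One possible way to build such a string is shown below. This is only a sketch under the assumption that the rules are given as lists of integers; format_rules is a hypothetical name and the actual rules_str() method may be written differently.

```python
def format_rules(survival, birth):
    """Hypothetical helper that formats rules as 'survival/birth'."""
    survival_part = ''.join(str(n) for n in survival)
    birth_part = ''.join(str(n) for n in birth)
    return survival_part + '/' + birth_part
```

For the standard rules, format_rules([2, 3], [3]) produces the 23/3 string mentioned above.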

The toggle() method inverts the state of a cell. The bounding_box() method returns the rectangular coordinates that contain all the live cells in the simulation. The living_cells() method returns an iterator over the list of cells that are alive. These three methods are there to support a GUI implementation designed to visualize the game, and they are used by the life_gui.py example implementation I have written.
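The bounding box calculation can be sketched as follows. The (min_x, min_y, max_x, max_y) tuple order is inferred from the assertions used in the tests later in this article, and the standalone function below is a hypothetical illustration that assumes at least one live cell. It is not the method from life.py.

```python
def bounding_box(cells):
    """Return (min_x, min_y, max_x, max_y) for a non-empty set of cells.

    Hypothetical standalone version; the tuple order is an assumption
    based on the test assertions shown in this article.
    """
    cells = list(cells)
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    return min(xs), min(ys), max(xs), max(ys)
```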

Running the Simulation

Before we get into the specifics of how to test this code, you may want to play with it. Follow these steps to get it set up on your computer:

  • Clone the GitHub repository.
  • Create a virtual environment and install pygame in it, the only third-party dependency.
  • Change to the life directory.
  • Run python life_gui.py [pattern-file] to start a simulation. There are many interesting patterns to try in the patterns sub-directory. The example pattern you see at the top of this article can be started with the command python life_gui.py patterns/pentadecathlon.txt. If a pattern file isn't provided, the simulation starts on an empty grid.

Once the simulation is running, there are a few keys that you can use:

  • Space to pause and resume the simulation
  • Esc to exit
  • Arrow keys to scroll through the infinite grid
  • + and - to zoom in or out
  • c to center the grid
  • Mouse click on a cell to toggle its state (you may want to do this while the simulation is paused)

Testing the Game of Life Engine

As a recap of the first article, let's quickly review why it would be a good idea to write unit tests for this application:

  • As a primary goal, we want to make sure that in the future, if we make changes or additions to the application we don't inadvertently break the existing logic.
  • As a secondary goal, we want to make sure that all the less common code paths that are harder or less convenient to test manually are tested by creating a controlled test environment.

As you've seen above, my implementation of the Game of Life engine has two classes, but only one of them is public, meaning that it is designed to be used from other code such as a GUI. Should the approach to testing be different for public vs internal code?

While other developers may have different opinions on this matter, my approach to unit testing does not change based on the code being publicly accessible or not. The goal of unit testing is to make sure that code does not break, and since breakages can happen anywhere, all code needs to be tested with the same level of attention.

For the Game of Life engine I'll be creating two subclasses of unittest.TestCase, one for the CellList class and another for the Life class.

A good practice is to name the Python modules containing tests with the same name as the subject of the tests, but with a test_ prefix. The CellList and Life classes live in a life.py module, so the tests for them will be in test_life.py.

Here is a first implementation of test_life.py, with just empty test cases:

import unittest
from life import CellList, Life

class TestCellList(unittest.TestCase):
    pass

class TestLife(unittest.TestCase):
    pass

Make sure you have the life.py and test_life.py files in the same directory, and that you have pytest and pytest-cov installed in your virtual environment. The following command runs pytest with branch tracking and code coverage, as you learned in the first part of this series, to confirm that the test framework is operational:

(venv) $ pytest --cov=life --cov-report=term-missing --cov-branch
========================= test session starts =========================
platform darwin -- Python 3.8.6, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /Users/mgrinberg/Documents/dev/python/testing/life
plugins: cov-2.11.1
collected 0 items


---------- coverage: platform darwin, python 3.8.6-final-0 -----------
Name      Stmts   Miss Branch BrPart  Cover   Missing
-----------------------------------------------------
life.py      98     84     70      0     8%   5, 9, 13-26, 30-32, 39-41, 45-47, 51-79, 83, 87, 91-103, 107-119, 123-132
-----------------------------------------------------
TOTAL        98     84     70      0     8%

======================== no tests ran in 0.12s =======================

Testing the CellList Class

To test the CellList class we can just use the methods of the class to create a list of cells, and then verify that the list contains the expected cells in it.

First we can test that an empty cell list is indeed empty:

class TestCellList(unittest.TestCase):
    def test_empty(self):
        c = CellList()
        assert list(c) == []

Next we can use the set() method with the value argument set to True, which effectively adds cells to the list:

class TestCellList(unittest.TestCase):
    # ...

    def test_set_true(self):
        c = CellList()
        c.set(1, 2, True)
        assert c.has(1, 2)
        assert list(c) == [(1, 2)]
        c.set(500, 600, True)
        assert c.has(1, 2) and c.has(500, 600)
        assert list(c) == [(1, 2), (500, 600)]
        c.set(1, 2, True)  # make sure a cell can be set to True twice
        assert c.has(1, 2) and c.has(500, 600)
        assert list(c) == [(1, 2), (500, 600)]

The third test can do something similar, but with the value argument set to False:

class TestCellList(unittest.TestCase):
    # ...

    def test_set_false(self):
        c = CellList()
        c.set(1, 2, False)
        assert not c.has(1, 2)
        assert list(c) == []
        c.set(1, 2, True)
        c.set(1, 2, False)
        assert not c.has(1, 2)
        assert list(c) == []
        c.set(1, 2, True)
        c.set(3, 2, True)
        c.set(1, 2, False)
        assert not c.has(1, 2)
        assert c.has(3, 2)
        assert list(c) == [(3, 2)]

The value argument of the set() method can also be omitted, in which case the method works as a toggle. This is the focus of the final unit test:

class TestCellList(unittest.TestCase):
    # ...

    def test_set_default(self):
        c = CellList()
        c.set(1, 2)
        assert c.has(1, 2)
        assert list(c) == [(1, 2)]
        c.set(1, 2)
        assert not c.has(1, 2)
        assert list(c) == []

That should be all the testing needed for the CellList class. We can confirm by running the test suite:

(venv) $ pytest --cov=life --cov-report=term-missing --cov-branch
========================= test session starts =========================
platform darwin -- Python 3.8.6, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /Users/mgrinberg/Documents/dev/python/testing/life
plugins: cov-2.11.1
collected 4 items

test_life.py ....                                               [100%]

---------- coverage: platform darwin, python 3.8.6-final-0 -----------
Name      Stmts   Miss Branch BrPart  Cover   Missing
-----------------------------------------------------
life.py      98     67     70      0    26%   39-41, 45-47, 51-79, 83, 87, 91-103, 107-119, 123-132
-----------------------------------------------------
TOTAL        98     67     70      0    26%


========================== 4 passed in 0.10s =========================

From this we can see that we have increased coverage to 26% of the life.py file, and that all the lines of code that are part of the CellList class are covered by the current set of unit tests.

We have made some progress, but clearly we still have a lot of work to do!

Testing the Life Class

Now we get to the most interesting part of this exercise, which is to test the game logic in the Life class.

Looking at the test coverage report, the first set of lines without coverage is 39-41, which maps to the body of the Life.__init__() method. Like we did with CellList, we can begin by making sure that a brand new object has a sound structure, and that should take care of the constructor method. This includes making sure that the survival and birth attributes are set correctly and that the grid of the game is completely empty.

class TestLife(unittest.TestCase):
    def test_new(self):
        life = Life()
        assert life.survival == [2, 3]
        assert life.birth == [3]
        assert list(life.living_cells()) == []

    def test_new_custom(self):
        life = Life([3, 4], [4, 7, 8])
        assert life.survival == [3, 4]
        assert life.birth == [4, 7, 8]
        assert list(life.living_cells()) == []

If you haven't looked at the internals of the Life class you may be wondering how you can know what to check in a test for a constructor. The answer is simple. You can't always be effective at testing a piece of code if you are not familiar with how it works internally. We've seen that the CellList class was fairly easy to test without really understanding how it works; we were able to use public methods in all the assertions. The Life class is more complex, and some of the assertions in the test above need to look inside the class at internal attributes. Other developers may disagree with this approach, but I find it necessary to be thorough.

So far so good. The next set of lines without coverage are 45-47, which correspond to the body of the rules_str() method. This is a short method that returns a string representation of the rules that are defined by the survival and birth attributes. Since we created two unit tests above for the default and a custom set of rules, we can take advantage of these tests and expand them to also verify the result of calling rules_str(). Here are the two unit tests with this update:

class TestLife(unittest.TestCase):
    def test_new(self):
        life = Life()
        assert life.survival == [2, 3]
        assert life.birth == [3]
        assert list(life.living_cells()) == []
        assert life.rules_str() == '23/3'

    def test_new_custom(self):
        life = Life([3, 4], [4, 7, 8])
        assert life.survival == [3, 4]
        assert life.birth == [4, 7, 8]
        assert list(life.living_cells()) == []
        assert life.rules_str() == '34/478'

Code coverage with these two tests plus the previous four from the CellList class is at 32%:

---------- coverage: platform darwin, python 3.8.6-final-0 -----------
Name      Stmts   Miss Branch BrPart  Cover   Missing
-----------------------------------------------------
life.py      98     60     70      0    32%   51-79, 83, 91-103, 107-119, 123-132
-----------------------------------------------------
TOTAL        98     60     70      0    32%

The load() Method

The next chunk of lines without test coverage are 51-79, which are the implementation of the load() method. When you look at these lines in the life.py source file you'll realize that this is the first non-trivial piece of code that we need to test. For example, this method loads data from a file. How can we do this within the controlled environment of a test?

One option is to refactor the code so that the file operations are in a separate method from the actual parsing, so then we can test the parser, but ignore the file I/O. The other option is actually much simpler: we can just create a few test files and let the code load them for real during the tests.

Given that loading data from a file is very quick, I've decided to leave the code alone and just create some test files that are designed especially for my unit tests. To begin with, I came up with the very simple pattern that you see in the image below:

Life Test Pattern

If we assume that the top-left cell is at coordinates (0, 0), then the four cells that are alive in this grid are at (10, 10), (11, 11), (15, 10) and (17, 10). A representation of this in the Life 1.05 file format can be given as follows:

#Life 1.05
#D test pattern 1
#N
#P 10 10
*.
.*
#P 15 10
*.*

The first line in this file is the header, which allows parsers to know what the format of the rest of the file is. The #D lines are comments that parsers need to ignore. The #N line indicates that this grid should follow the standard rules of Game of Life, with 2 and 3 cells in the survival rule and 3 cells in the birth rule. The #P line sets a starting position in X and Y coordinates. What follows is a matrix of characters where a * indicates a cell that is alive, and a . indicates a cell that is dead. The #P line followed by a matrix of characters can be repeated as many times as needed. The LifeWiki site has more details about the Life 1.05 file format.

Save the above file with the name pattern1.txt. You can also find this file in the GitHub repository for this project.
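To see how the #P blocks translate into cell coordinates, here is a simplified parsing sketch. It is not the load() method from life.py: the parse_life_105_cells name is hypothetical, and the sketch assumes well-formed input and skips the header, #D, #N and #R lines without interpreting them.

```python
def parse_life_105_cells(text):
    """Return the set of live (x, y) cells described by the #P blocks.

    Hypothetical, simplified parser for illustration only.
    """
    cells = set()
    origin_x = origin_y = row = 0
    for line in text.splitlines():
        line = line.strip()
        if line.startswith('#P'):
            # A #P line starts a new block at the given position.
            _, x, y = line.split()
            origin_x, origin_y, row = int(x), int(y), 0
        elif line and not line.startswith('#'):
            # A data row: each * is a live cell, each . is a dead cell.
            for col, char in enumerate(line):
                if char == '*':
                    cells.add((origin_x + col, origin_y + row))
            row += 1
    return cells


PATTERN1 = """#Life 1.05
#D test pattern 1
#N
#P 10 10
*.
.*
#P 15 10
*.*
"""

cells = parse_life_105_cells(PATTERN1)
```

The first block contributes (10, 10) and (11, 11), and the second block contributes (15, 10) and (17, 10), matching the four cells listed above.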

The Life 1.06 format is much simpler. It just has a header and data lines, where each data line has the X and Y coordinates of a cell that is alive. I have added an extension to this format to allow comments on lines that begin with #. Here is the same pattern shown above in Life 1.06 format. Save this version as pattern2.txt.

#Life 1.06
# test pattern 2
10 10
11 11
15 10
17 10

The LifeWiki site also has a page dedicated to the Life 1.06 file format.
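Because each data line is just a pair of coordinates, this format takes very little code to read. The sketch below is a hypothetical standalone parser, not the code in life.py; it assumes well-formed input and treats every line that begins with # (including the header) as something to skip.

```python
def parse_life_106_cells(text):
    """Return the set of live (x, y) cells listed in Life 1.06 text.

    Hypothetical, simplified parser for illustration only.
    """
    cells = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip the header, comments and blank lines
        x, y = line.split()
        cells.add((int(x), int(y)))
    return cells


PATTERN2 = """#Life 1.06
# test pattern 2
10 10
11 11
15 10
17 10
"""

cells = parse_life_106_cells(PATTERN2)
```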

We are now ready to write two unit tests that load the two pattern files created above, and then make sure the state of the Life instance is correct:

class TestLife(unittest.TestCase):
    # ...

    def test_load_life_1_05(self):
        life = Life()
        life.load('pattern1.txt')
        assert life.survival == [2, 3]
        assert life.birth == [3]
        assert set(life.living_cells()) == {
            (10, 10), (11, 11), (15, 10), (17, 10)}
        assert life.bounding_box() == (10, 10, 17, 11)

    def test_load_life_1_06(self):
        life = Life()
        life.load('pattern2.txt')
        assert life.survival == [2, 3]
        assert life.birth == [3]
        assert set(life.living_cells()) == {
            (10, 10), (11, 11), (15, 10), (17, 10)}
        assert life.bounding_box() == (10, 10, 17, 11)

The tests load the pattern files and then check that all the attributes of the life object have the expected values. When checking the list of cells that are alive, the living_cells() iterator is converted to a set instead of a list, which will make the comparison work regardless of the order in which the cells are given.
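The difference between comparing lists and comparing sets can be seen in this short sketch, which uses two hypothetical lists holding the same cells in a different order:

```python
cells_a = [(10, 10), (11, 11), (15, 10), (17, 10)]
cells_b = [(17, 10), (10, 10), (15, 10), (11, 11)]

# Lists compare element by element, so order matters.
lists_equal = cells_a == cells_b

# Sets only compare membership, so order is ignored.
sets_equal = set(cells_a) == set(cells_b)
```

One trade-off to keep in mind is that a set also discards duplicates, so this comparison would not detect a cell that is reported twice.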

These two tests are very convenient to also verify the bounding_box() method of the Life class, which returns the rectangular coordinates that contain all the cells that are alive, so I've added a check for that as well.

With these tests, coverage is now at 73%:

---------- coverage: platform darwin, python 3.8.6-final-0 -----------
Name      Stmts   Miss Branch BrPart  Cover   Missing
-----------------------------------------------------
life.py      98     24     70      2    73%   62, 79, 83, 107-119, 123-132
-----------------------------------------------------
TOTAL        98     24     70      2    73%

This report indicates that we do not have full coverage of the load() method yet, with lines 62 and 79 not being tested, so we have more work to do on this method.

Test Parametrization

Before we continue writing tests, did you notice that the two unit tests for the load() method have a lot in common? In fact, the only difference between them is that they load different files. Other than that they are identical.

This is actually a very common occurrence in unit tests. There are many situations in which you have some test logic that needs to run multiple times, but with different input arguments. In this case the argument is the file to load.

Instead of duplicating the test logic as I did in the previous section, we are going to use a technique called test parametrization to have a single instance of the test.

Pytest provides parametrization support, but unfortunately this functionality is not very flexible and in particular does not work for tests that are implemented as methods of a unittest.TestCase subclass, which is how the tests in this article are written. I use a standalone package called parameterized, which supports tests written as functions or methods and is fully compatible with both pytest and unittest. Add this package to your virtual environment:

(venv) $ pip install parameterized

In its simplest form, parameterized allows you to duplicate a test function or method on the fly using different arguments. Below you can see how the two tests from the previous section can be collapsed into one using the parameterized.expand() decorator:

from parameterized import parameterized

# ...

class TestLife(unittest.TestCase):
    # ...

    @parameterized.expand([('pattern1.txt',), ('pattern2.txt',)])
    def test_load(self, pattern):
        life = Life()
        life.load(pattern)
        assert life.survival == [2, 3]
        assert life.birth == [3]
        assert set(life.living_cells()) == {
            (10, 10), (11, 11), (15, 10), (17, 10)}
        assert life.bounding_box() == (10, 10, 17, 11)

The list that is passed as an argument to the parameterized.expand() decorator defines what arguments the decorated function needs to be called with. The decorator expects each item in the list to be a tuple of arguments. Since in this case each version of the test needs a single argument, the tuples use the strange ('filename',) syntax that forces single-element tuples. For this particular test we pass [('pattern1.txt',), ('pattern2.txt',)] to generate tests for the two pattern files.
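The trailing comma is what makes the difference, as this short sketch shows. Without it, the parentheses are just grouping and the value is a plain string:

```python
not_a_tuple = ('pattern1.txt')   # parentheses only group; this is a str
one_tuple = ('pattern1.txt',)    # the trailing comma creates a tuple

kinds = (type(not_a_tuple).__name__, type(one_tuple).__name__)
```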

Later in this article you are going to see a more advanced usage of the parameterized.expand() decorator that includes multiple arguments per test.

File Loading Corner Cases

Going back to the tests, we have two lines of code from the load() method that are not covered yet. Using the source code as a reference, line 62 is a section of the function that handles a #R option in the Life 1.05 format. This option is used to specify a non-standard set of survival and birth rules to be used, as an alternative to the #N that we used above, which specifies the default rules.

To cover line 62 we can simply add a third pattern file that uses custom rules. Here is pattern3.txt:

#Life 1.05
#D test pattern 3
#R 34/45
#P 10 10
*.
.*
*.

We can now create a test that loads this file:

class TestLife(unittest.TestCase):
    # ...

    def test_load_life_custom_rules(self):
        life = Life()
        life.load('pattern3.txt')
        assert life.survival == [3, 4]
        assert life.birth == [4, 5]
        assert list(life.living_cells()) == [(10, 10), (11, 11), (10, 12)]
        assert life.bounding_box() == (10, 10, 11, 12)

It should be possible to combine this third test of the load() method with the previous two, but that would require converting a few more values into test arguments. I prefer to keep the test separate, which helps keep the parameterized.expand() decorator from the other two tests in its current very simple form.

Testing Exceptions

The remaining line within the load() method that needs coverage is line 79. This is the last line of the method, which triggers when the input file cannot be parsed. The interesting thing about this line of code is that it is a raise statement, which will basically cause the program to crash with a stack trace, so here we have a new challenge.

The easiest way to trigger this line is to add another test that loads a file that is not recognized as either Life 1.05 or Life 1.06. Here is pattern4.txt, which we will use for that purpose:

this is not a life file

But when I attempt to load this file the program is going to crash, and we can't allow that. When you need to test that a piece of code raises an exception, the standard assert statement is not sufficient, because it cannot catch the exception. Both pytest and unittest provide specialized asserts for exceptions. In the next test I'm going to use the pytest.raises() assertion to check that the load() method crashes with a RuntimeError when given a bad file:

import pytest

# ...

class TestLife(unittest.TestCase):
    # ...

    def test_load_invalid(self):
        life = Life()
        with pytest.raises(RuntimeError):
            life.load('pattern4.txt')

And with this we raised coverage to 76%:

---------- coverage: platform darwin, python 3.8.6-final-0 -----------
Name      Stmts   Miss Branch BrPart  Cover   Missing
-----------------------------------------------------
life.py      98     22     70      0    76%   83, 107-119, 123-132
-----------------------------------------------------
TOTAL        98     22     70      0    76%
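For comparison, the unittest package offers assertRaises(), which can be used as a context manager in the same way as pytest.raises(). The sketch below is self-contained, so it uses a hypothetical load_pattern() function as a stand-in for Life.load(); the function and its error message are assumptions made only for this example.

```python
import unittest


def load_pattern(text):
    """Hypothetical stand-in for Life.load(), used only in this sketch."""
    if not text.startswith('#Life 1.0'):
        raise RuntimeError('Unknown file format')
    return text


class TestLoadInvalid(unittest.TestCase):
    def test_load_invalid(self):
        # The test passes only if RuntimeError is raised inside the block.
        with self.assertRaises(RuntimeError):
            load_pattern('this is not a life file')
```

Either style works when the tests are run with pytest, so the choice is mostly a matter of preference.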

Line 83 is the body of the toggle() method. Since this is an easy one, let's build a quick test for it:

class TestLife(unittest.TestCase):
    # ...

    def test_toggle(self):
        life = Life()
        life.toggle(5, 5)
        assert list(life.living_cells()) == [(5, 5)]
        life.toggle(5, 6)
        life.toggle(5, 5)
        assert list(life.living_cells()) == [(5, 6)]

Building Test Matrices

We now have 76% coverage of the life.py module, but as always, we are more interested in the other 24%. There are two methods of the Life class left to test: advance() and _advance_cell(), corresponding to the two ranges of lines missing coverage.

The _advance_cell() method is where the core logic of the Game of Life simulation resides. This method takes the X and Y coordinates of a cell, and returns a boolean response to indicate if the cell is alive or dead in the next time step. The method computes the number of alive neighbors for this cell, and based on that and the survival and birth rules set in the object determines what the next state is. This method is not public; it is only supposed to be called by the advance() method as a helper. But as we discussed before, public and private code are equally likely to break, so the approach to testing (in my opinion) should not change.

The advance() method implements a loop over all the cells that can potentially change their state, and calls _advance_cell() on all of them, and then updates the grid to reflect the new states for all the cells.
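To illustrate the general idea of one simulation step on a sparse grid, here is a minimal sketch that works on a plain set of (x, y) tuples. It is not the code from life.py: the step() function, the NEIGHBOR_OFFSETS constant and the use of a Counter are all assumptions chosen to keep the example short. The cells that can potentially change are the live cells plus every cell adjacent to a live cell.

```python
from collections import Counter

# Offsets of the eight neighbors of a cell.
NEIGHBOR_OFFSETS = [(dx, dy)
                    for dx in (-1, 0, 1)
                    for dy in (-1, 0, 1)
                    if (dx, dy) != (0, 0)]


def step(alive, survival=(2, 3), birth=(3,)):
    """Return the set of live cells after one time step (hypothetical)."""
    # For every cell adjacent to a live cell, count its live neighbors.
    counts = Counter((x + dx, y + dy)
                     for x, y in alive
                     for dx, dy in NEIGHBOR_OFFSETS)
    new_alive = set()
    # Candidates: live cells, plus dead cells with at least one live neighbor.
    for cell in alive | set(counts):
        n = counts[cell]  # a Counter returns 0 for cells it has not seen
        if cell in alive:
            if n in survival:
                new_alive.add(cell)
        elif n in birth:
            new_alive.add(cell)
    return new_alive


blinker = {(0, 1), (1, 1), (2, 1)}
after_one = step(blinker)
after_two = step(after_one)
```

A horizontal row of three cells, often called a blinker, turns into a vertical row after one step and returns to its original shape after two.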

We'll start with the _advance_cell() method, which is easier to test when you consider that it does not make calls into any other code, unlike advance(). The idea is to focus on a cell, let's say (0, 0). The test can set the cell alive, then set, say, 3 of its neighbors alive as well, and then assert that the method returns True, indicating that the cell remains alive for the next time unit. Here is a first approach to building this test:

import random

# ...

class TestLife(unittest.TestCase):
    # ...

    def test_advance_cell(self):
        life = Life()
        life.toggle(0, 0)
        neighbors = [(-1, -1), (0, -1), (1, -1),
                     (-1, 0), (1, 0),
                     (-1, 1), (0, 1), (1, 1)]
        for i in range(3):
            n = random.choice(neighbors)
            neighbors.remove(n)
            life.toggle(*n)

        assert life._advance_cell(0, 0)

The neighbors list contains all the neighbors of cell (0, 0). To add some variety to the test, instead of always setting the same 3 neighbors alive, I'm using random.choice() to pick neighbors randomly, removing each one from the list after it is picked so that it cannot be selected twice. This makes the test use a randomly selected set of 3 neighbors on each run, and presumably cover the code better, since different combinations are likely to be used across test runs.
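As a side note, the standard library also has random.sample(), which picks several distinct items in a single call. The sketch below shows this alternative; the pick_neighbors() helper and its optional seed argument are hypothetical and are not used by the tests in this article. Passing a seed makes the selection repeatable, which can help when a randomized test fails and needs to be reproduced.

```python
import random

NEIGHBORS = [(-1, -1), (0, -1), (1, -1),
             (-1, 0), (1, 0),
             (-1, 1), (0, 1), (1, 1)]


def pick_neighbors(count, seed=None):
    """Pick `count` distinct neighbors of (0, 0) at random (hypothetical)."""
    rng = random.Random(seed)  # seed=None uses a nondeterministic seed
    return rng.sample(NEIGHBORS, count)
```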

The test suite appears to indicate that this is a very successful test, as we've covered all but one of the lines in the _advance_cell() method:

---------- coverage: platform darwin, python 3.8.6-final-0 -----------
Name      Stmts   Miss Branch BrPart  Cover   Missing
-----------------------------------------------------
life.py      98     14     70      1    85%   107-119, 132
-----------------------------------------------------
TOTAL        98     14     70      1    85%

But this is actually a great example of why only looking at code coverage is not always sufficient. If you look at the source code, line 132, the only line that isn't covered, is the line that applies the birth rule. It makes sense that this line isn't covered because we are starting from a cell that is alive, so the rule that applies to that cell is the survival rule.

But even after you understand that, the survival rule has nine possible cases, right? We know that the rule says that if the cell has 2 or 3 neighbors, then it remains alive, and if it has 0, 1, 4, 5, 6, 7, or 8 it dies. Out of these 9 possible cases we have only tested one so far, the case of having 3 neighbors. And we are not even considering that the Life class can be configured with a custom survival rule, which changes all the results. So you can see that even though code coverage shows us great numbers, it does not tell the whole story.

The first improvement that we can make is to test other numbers of neighbors. We can do this by writing a few more tests, but these are going to be almost identical. We've learned how to use test parametrization to avoid repetition, so we can do the same here.

An easy extension would be to test for 2 and 3 neighbors:

class TestLife(unittest.TestCase):
    # ...

    @parameterized.expand([(2,), (3,)])
    def test_advance_cell(self, num_neighbors):
        life = Life()
        life.toggle(0, 0)
        neighbors = [(-1, -1), (0, -1), (1, -1),
                     (-1, 0), (1, 0),
                     (-1, 1), (0, 1), (1, 1)]
        for i in range(num_neighbors):
            n = random.choice(neighbors)
            neighbors.remove(n)
            life.toggle(*n)

        assert life._advance_cell(0, 0)

But this still leaves out the 7 other neighbor counts, all of which cause the cell to die. With a bit of effort we can generalize the test so that it works with every possible number of neighbors:

class TestLife(unittest.TestCase):
    # ...

    @parameterized.expand([(n,) for n in range(9)])
    def test_advance_cell(self, num_neighbors):
        life = Life()
        life.toggle(0, 0)
        neighbors = [(-1, -1), (0, -1), (1, -1),
                     (-1, 0), (1, 0),
                     (-1, 1), (0, 1), (1, 1)]
        for i in range(num_neighbors):
            n = random.choice(neighbors)
            neighbors.remove(n)
            life.toggle(*n)

        new_state = life._advance_cell(0, 0)
        if num_neighbors in [2, 3]:
            assert new_state is True
        else:
            assert new_state is False

The strange expression that you see as an argument to the parameterized.expand() decorator is a list comprehension. I'm using it to convert range(9) to a list of single-element tuples in the format [(0,), (1,), (2,), ..., (8,)] as required by the decorator.
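If list comprehensions are new to you, this short sketch shows the same list built in two ways, first with the comprehension and then with an equivalent explicit loop:

```python
# The list comprehension used in the decorator.
params = [(n,) for n in range(9)]

# The same list built with an explicit loop.
params_loop = []
for n in range(9):
    params_loop.append((n,))
```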

Then in the body of the test I store the result of calling _advance_cell() in the new_state variable. If the num_neighbors argument is 2 or 3, then I assert for True on this variable. For any other number of neighbors, the assert ensures the result is False. From a single test we are now running the 9 different possibilities the survival rule can be applied to!

Note how when you run this updated test the code coverage does not change from the previous numbers, even though we are now covering many more cases than before. That means we are still missing coverage of the birth rule, which never triggers because we are setting the state of cell (0, 0) to alive for all the test combinations.

What we need now is to repeat all these parametrized tests, but for a cell that starts as a dead cell, right? So we can build a separate parametrized test for the birth rule. Or, we can add another level of parametrization. Instead of running 9 neighbor combinations for a live cell, we can run 18 combinations, 9 starting with a live cell, and another 9 starting with a dead cell.

We can add a second argument to the parametrization of this test called alive, which is set to True for the possible nine configurations, and then to False for another nine. This is called a test matrix, because we have two dimensions:

  • alive can be True or False
  • num_neighbors can be 0, 1, 2, 3, 4, 5, 6, 7, or 8, or the equivalent range(9)

How can we generate the test parametrization for this? There is a function in the itertools module from the Python standard library called product() that is very useful to expand a matrix to a plain list:

>>> import itertools
>>> list(itertools.product([True, False], range(9)))
[(True, 0), (True, 1), (True, 2), (True, 3), (True, 4), (True, 5), (True, 6), (True, 7), (True, 8), (False, 0), (False, 1), (False, 2), (False, 3), (False, 4), (False, 5), (False, 6), (False, 7), (False, 8)]

The product() function returns an iterable of tuples, which is exactly what the parameterized.expand() decorator needs. Here is the updated test:

import itertools

class TestLife(unittest.TestCase):
    # ...

    @parameterized.expand(itertools.product([True, False], range(9)))
    def test_advance_cell(self, alive, num_neighbors):
        life = Life()
        if alive:
            life.toggle(0, 0)
        neighbors = [(-1, -1), (0, -1), (1, -1),
                     (-1, 0), (1, 0),
                     (-1, 1), (0, 1), (1, 1)]
        for i in range(num_neighbors):
            n = random.choice(neighbors)
            neighbors.remove(n)
            life.toggle(*n)

        new_state = life._advance_cell(0, 0)
        if alive:
            # survival rule
            if num_neighbors in [2, 3]:
                assert new_state is True
            else:
                assert new_state is False
        else:
            # birth rule
            if num_neighbors in [3]:
                assert new_state is True
            else:
                assert new_state is False

In this new version of the test the parameterized.expand() decorator gets the result of the product of our two testing dimensions. Note that there is no need to convert this to a list as I did in the previous example. Since now we have two elements in each of the tuples, we need two arguments in the test function, so I've added alive.

In the first part of the test, I only set the (0, 0) cell to alive if the alive argument is True. The assertion portion of the test at the end is now a bit more complex, because depending on the value of alive we have to assert for the survival or birth rules.
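
The nested if/else in the assertion can also be read as a single question: which rule applies, and is the neighbor count in it? Here is a minimal sketch of that idea. The expected_state() helper is hypothetical and is not part of life.py or of the test above; its defaults simply mirror the standard rules.

```python
def expected_state(alive, num_neighbors, survival=(2, 3), birth=(3,)):
    """Return the state a cell should have after one step.

    Hypothetical helper for illustration; it is not part of life.py.
    """
    # A live cell is judged by the survival rule, a dead one by the birth rule.
    rule = survival if alive else birth
    return num_neighbors in rule


print(expected_state(True, 2))   # True: a live cell with 2 neighbors survives
print(expected_state(False, 2))  # False: a dead cell needs exactly 3
```

With a helper like this the end of the test could shrink to a single line such as assert new_state is expected_state(alive, num_neighbors), at the cost of putting a bit of logic in the test itself.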

With this improvement we are now fully covering the _advance_cell() method:

---------- coverage: platform darwin, python 3.8.6-final-0 -----------
Name      Stmts   Miss Branch BrPart  Cover   Missing
-----------------------------------------------------
life.py      98     13     70      0    86%   107-119
-----------------------------------------------------
TOTAL        98     13     70      0    86%

But are we really done with this method? As you recall, my implementation of the Game of Life engine allows for custom rules, which can be given in the survival and birth arguments when you create a Life object. If we really wanted to be thorough, we should also test one or more custom sets of rules, right?

The current version of the test hardcodes the [2, 3] survival rule and [3] birth rule. We can add two more layers of parametrization and generalize this as well. What I did in the final version of this test is to add two more dimensions to the test matrix, one with two survival rules (default and custom) and another with two birth rules (also default and custom). This generates four possible combinations of survival and birth rules, for each of which we want to run the 18 combinations we are already testing. Here is the test, now with four layers of parametrization in the test matrix:

class TestLife(unittest.TestCase):
    # ...

    @parameterized.expand(itertools.product(
        [[2, 3], [4]],  # two different survival rules
        [[3], [3, 4]],  # two different birth rules
        [True, False],  # two possible states for the cell
        range(0, 9),    # nine possible numbers of neighbors
    ))
    def test_advance_cell(self, survival, birth, alive, num_neighbors):
        life = Life(survival, birth)
        if alive:
            life.toggle(0, 0)
        neighbors = [(-1, -1), (0, -1), (1, -1),
                     (-1, 0), (1, 0),
                     (-1, 1), (0, 1), (1, 1)]
        for i in range(num_neighbors):
            n = random.choice(neighbors)
            neighbors.remove(n)
            life.toggle(*n)

        new_state = life._advance_cell(0, 0)
        if alive:
            # survival rule
            if num_neighbors in survival:
                assert new_state is True
            else:
                assert new_state is False
        else:
            # birth rule
            if num_neighbors in birth:
                assert new_state is True
            else:
                assert new_state is False

This parametrized test is now handling 2 x 2 x 2 x 9 = 72 different tests!
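
If you want to double-check that number, the same matrix can be expanded outside of the decorator. This is just a sketch for verification; the matrix variable is my own name and does not appear in the test file:

```python
import itertools

# The same four dimensions that are passed to parameterized.expand().
matrix = list(itertools.product(
    [[2, 3], [4]],   # survival rules
    [[3], [3, 4]],   # birth rules
    [True, False],   # initial state of the cell
    range(9),        # number of live neighbors
))

print(len(matrix))   # 72
print(matrix[0])     # ([2, 3], [3], True, 0)
print(matrix[-1])    # ([4], [3, 4], False, 8)
```

The last dimension changes fastest, so the first nine entries are the nine neighbor counts for a live cell under the default rules.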

Mocking

We are now at 86% coverage, and the only remaining part of life.py to test is the advance() method. And this brings up the last challenge. This method calls the _advance_cell() method as a helper, which makes separating the logic in advance() from that of the helper method harder.

We have done a very thorough job in testing _advance_cell(), so we really do not need that method to run while we test advance(). In fact, it would be better if we could prevent the helper method from running and instead inject True or False return values into those calls as we find convenient.

Not only that, we would also like to make sure that advance() calls _advance_cell() strictly as necessary. For example, we would consider it a failure if advance() called _advance_cell() two or more times for the same cell, or for a cell that has no chance of changing state, as that could turn into a performance problem.

To summarize, we are pretty confident that the _advance_cell() method is working well, so we would like to use an alternative version of it while we test advance(). This alternative version should allow us to make it return True or False according to our testing needs, and also should allow us to record what calls advance() issues to the method, so that we can make sure it calls it only as much as necessary.

To do all this we are going to use a technique called mocking. The idea of using a mock is to temporarily hijack part of the application and replace it with a fake version that is easily controlled from the test. The unittest framework that we are using has a mock sub-package that implements this.

First, let me show you how mocks work in a Python session:

>>> from unittest import mock
>>> fake = mock.MagicMock()
>>> fake.return_value = 42
>>> fake()
42
>>> fake('hello')
42
>>> fake('hello', number=123)
42

In this example I assign a MagicMock object to the fake variable, and set its return_value attribute to 42. Now I can call fake as a function, with any arguments, and I always get 42 as the return value.

In addition to being a fake function, the MagicMock instance keeps track of the calls it receives:

>>> fake.call_count
3
>>> fake.call_args_list
[call(), call('hello'), call('hello', number=123)]

Isn't this fantastic? We can now create a mock replacement for _advance_cell() and make it return True or False without having to run the actual logic of the simulation, and we can also keep track of what calls advance() makes into it.
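
To give an idea of how the recorded calls can be inspected, here is a small sketch that looks for repeated calls. The list of cells and the variable names are made up for the example, and the args attribute of call objects requires Python 3.8 or newer:

```python
from unittest import mock

fake = mock.MagicMock(return_value=False)

# Simulate a caller that evaluates cell (1, 2) twice by mistake.
for cell in [(0, 0), (1, 2), (1, 2)]:
    fake(*cell)

# Each recorded call exposes its positional arguments as .args (Python 3.8+).
seen = [c.args for c in fake.call_args_list]
duplicates = {args for args in seen if seen.count(args) > 1}

print(fake.call_count)  # 3
print(duplicates)       # {(1, 2)}
```

A test that expects no repeated work could assert that duplicates is empty, or just compare call_count against the expected number of calls.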

The mock package provides a few patching functions that allow you to inject a MagicMock instance into a part of your application. To replace a method within a class we can use the mock.patch.object() function:

from unittest import mock

class TestLife(unittest.TestCase):
    # ...

    @mock.patch.object(Life, '_advance_cell')
    def test_advance_false(self, mock_advance_cell):
        mock_advance_cell.return_value = False
        # ...

    @mock.patch.object(Life, '_advance_cell')
    def test_advance_true(self, mock_advance_cell):
        mock_advance_cell.return_value = True
        # ...

Here you can see that we decorated the two tests with the mock.patch.object() decorator, indicating that we want to patch the Life._advance_cell() method. The mock object that is injected into the class is passed to the test as an argument, and in this example we configure it to return False and True respectively. After that we can construct tests that invoke advance(), and any calls to _advance_cell() will go through the mock method and return the configured value without running any logic.
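
If you want to see the patching mechanism in isolation, the sketch below applies it to a tiny stand-in class. The Grid class and its methods are invented for this example and have nothing to do with life.py. It also uses patch.object() as a context manager instead of a decorator, which is another form the function supports:

```python
from unittest import mock


class Grid:
    """Tiny stand-in class, used here instead of the real Life class."""

    def _helper(self, x, y):
        raise RuntimeError('the real helper should not run in this test')

    def run(self):
        # Calls the helper for two cells and collects the results.
        return [self._helper(0, 0), self._helper(1, 0)]


# The mock is only installed inside the "with" block; the original
# method is restored when the block exits.
with mock.patch.object(Grid, '_helper') as mock_helper:
    mock_helper.return_value = True
    results = Grid().run()
    calls = mock_helper.call_args_list

print(results)  # [True, True]
print(calls)    # [call(0, 0), call(1, 0)]
```

Note that the recorded calls contain only the x and y arguments, without self, because the mock is installed as a plain class attribute and not as a bound method.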

The two aspects of the advance() method that need to be checked in these tests are:

  • The return value of _advance_cell() determines the new state of each cell.
  • The _advance_cell() method is called the minimum number of times for the given cell configuration.

As you saw above, I decided to split the testing of the method into two tests, one that mocks _advance_cell() to return False and another that returns True. Because the assertion portion of these tests is significantly different I decided not to use parametrization.

For the False test I designed the following test grid:

Life Test Pattern

Here I have three cells that are alive: (10, 10), (12, 10) and (20, 20). The cells that can potentially change state in any given configuration for the Game of Life are the cells that are alive and all of their neighbors. Any dead cells that don't have at least one neighbor that is alive will remain dead, so there is no point in checking them. The 24 cells that need to be inspected in this configuration are marked with X in the diagram.

An important part of this test is to make sure _advance_cell() is called 24 times, which means that we expect the advance() method to avoid evaluating cells more than once. If you look at the implementation of the method you'll see that there is code to prevent duplicate calls for the same cell.

Here is the implementation of the test:

class TestLife(unittest.TestCase):
    # ...

    @mock.patch.object(Life, '_advance_cell')
    def test_advance_false(self, mock_advance_cell):
        mock_advance_cell.return_value = False
        life = Life()
        life.toggle(10, 10)
        life.toggle(12, 10)
        life.toggle(20, 20)
        life.advance()

        # there should be exactly 24 calls to _advance_cell:
        # - 9 around the (10, 10) cell
        # - 6 around the (12, 10) cell (3 were already processed by (10, 10))
        # - 9 around the (20, 20) cell
        assert mock_advance_cell.call_count == 24
        assert list(life.living_cells()) == []

The test creates the grid configuration shown above, and then calls advance(). As indicated above, we make sure that the _advance_cell() mock was called exactly 24 times. Since all these calls are going to return False, we expect the updated grid to be completely empty, which we can check with the living_cells() method.
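
The count of 24 can be verified independently of the Life class by collecting every live cell together with its eight neighbors into a set, which removes the overlaps. The cells_to_check() function below is a hypothetical helper written for this check; the actual advance() method may organize this work differently:

```python
def cells_to_check(living):
    """Return the set of cells whose state could change in the next step.

    Hypothetical helper for illustration; life.py may organize this
    differently.
    """
    candidates = set()
    for x, y in living:
        # The cell itself plus its eight neighbors (a 3x3 block).
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                candidates.add((x + dx, y + dy))
    return candidates


print(len(cells_to_check([(10, 10), (12, 10), (20, 20)])))  # 24
```

The blocks around (10, 10) and (12, 10) share the three cells in the x = 11 column, which is why the total is 24 and not 27.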

For the second test I decided to slightly change the grid configuration:

Life Test Pattern

There are now a couple of cells that are closer together, so the number of calls to _advance_cell() drops to 21 when duplicates are removed. And since the mock now returns True, all 21 of those cells will be alive in the updated state of the grid. This is checked against a set containing these 21 cells.

class TestLife(unittest.TestCase):
    # ...

    @mock.patch.object(Life, '_advance_cell')
    def test_advance_true(self, mock_advance_cell):
        mock_advance_cell.return_value = True
        life = Life()
        life.toggle(10, 10)
        life.toggle(11, 10)
        life.toggle(20, 20)
        life.advance()

        # there should be exactly 21 calls to _advance_cell:
        # - 9 around the (10, 10) cell
        # - 3 around the (11, 10) cell (6 were already processed by (10, 10))
        # - 9 around the (20, 20) cell
        assert mock_advance_cell.call_count == 21

        # since the mocked _advance_cell returns True in all cases, all 21
        # cells must be alive
        assert set(life.living_cells()) == {
            (9, 9), (10, 9), (11, 9), (12, 9),
            (9, 10), (10, 10), (11, 10), (12, 10),
            (9, 11), (10, 11), (11, 11), (12, 11),
            (19, 19), (20, 19), (21, 19),
            (19, 20), (20, 20), (21, 20),
            (19, 21), (20, 21), (21, 21),
        }
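
The expected set in this test is typed out by hand, so it is worth confirming that it really has 21 entries. As a rough cross-check, the same set can be built from coordinate ranges. This is only a sketch, and the expected name is mine:

```python
# Two rectangular blocks: x from 9 to 12 by y from 9 to 11 (the merged area
# around (10, 10) and (11, 10)), and x, y from 19 to 21 (around (20, 20)).
expected = (
    {(x, y) for x in range(9, 13) for y in range(9, 12)} |
    {(x, y) for x in range(19, 22) for y in range(19, 22)}
)

print(len(expected))  # 21
```

I still prefer the explicit list in the test itself, because it can be compared against the diagram at a glance.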

And with this, we have completed the testing of life.py. A run of the test suite now returns 100% coverage, from 13 tests that get expanded to 85 through parametrization:

(venv) $ pytest --cov=life --cov-report=term-missing --cov-branch
========================= test session starts =========================
platform darwin -- Python 3.8.6, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /Users/mgrinberg/Documents/dev/python/testing/life
plugins: cov-2.11.1
collected 85 items

test_life.py .................................................. [ 58%]
...................................                             [100%]

---------- coverage: platform darwin, python 3.8.6-final-0 -----------
Name      Stmts   Miss Branch BrPart  Cover   Missing
-----------------------------------------------------
life.py      98      0     70      0   100%
-----------------------------------------------------
TOTAL        98      0     70      0   100%


========================= 85 passed in 0.27s =========================

Conclusion

Phew! This felt like a testing roller coaster. Guided by test coverage, and sometimes also by common sense, we were able to create a complete testing suite for the Game of Life engine, while learning a few testing techniques along the way.

I hope this gives you some new testing tools that you can employ in your own projects. There will be more testing tricks in the next installment of the series, so stay tuned for the next article!

7 comments

  • #1 Michal said 2021-04-09T12:37:26Z

    Hello Miguel, great article again, thank you. The parameterized library looks very interesting, I will have to examine it more closely. Do you know whether it's possible to build the test matrix by stacking the expand decorator, instead of using itertools.product?

  • #2 Miguel Grinberg said 2021-04-09T15:32:03Z

    @Michal: I don't think stacking decorators works with this library.

  • #3 Jackson said 2021-04-23T04:52:49Z

    The best article about Unit test and pytest I have ever read, really useful and easy to understand. How long for you to release part 3? Do you want to make a series ?

  • #4 Miguel Grinberg said 2021-04-23T09:29:10Z

    @Jackson: thanks! These articles actually take a lot of time, so I'd say 1-2 months until I'm ready to share the next one.

  • #5 rouizi said 2021-05-09T23:56:28Z

    Thank you for the great tutorial

  • #6 nod said 2021-05-17T20:08:52Z

    I like this game.

  • #7 joris said 2021-06-22T14:16:39Z

    Great tutorial. Thanks for sharing. I didn't know the pygame module. Good discovery. I can't wait to read the next one!
