The Flask Mega-Tutorial, Part VII: Unit Testing (2012)

(Great news! There is a new version of this tutorial!)

This is the seventh article in the series in which I document my experience writing web applications in Python using the Flask microframework.

The goal of the tutorial series is to develop a decently featured microblogging application that, demonstrating a total lack of originality, I have decided to call microblog.

NOTE: This article was revised in September 2014 to be in sync with current versions of Python and Flask.

In the previous chapters of this tutorial we concentrated on adding functionality to our little application, a step at a time. By now we have a database-enabled application that can register users, log them in and out, and let them view and edit their profiles.

In this session we are not going to add any new features to our application. Instead, we are going to find ways to add robustness to the code that we have already written, and we will also create a testing framework that will help us prevent failures and regressions in the future.

Let's find a bug

I mentioned at the end of the last chapter that I have intentionally introduced a bug in the application. Let me describe what the bug is, then we will use it to see what happens to our application when it does not work as expected.

The problem in the application is that there is no effort to keep the nicknames of our users unique. The initial nickname of a user is chosen automatically by the application. If the OpenID provider provides a nickname for the user, then we will just use it. If not, we will use the username part of the email address as the nickname. If we get two users with the same nickname, then the second one will not be able to register. To make matters worse, in the profile edit form we let users change their nicknames to whatever they want, and again there is no effort to avoid name collisions.

We will address these problems later, after we analyze how the application behaves when an error occurs.

Flask debugging support

So let's see what happens when we trigger our bug.

Let's start by creating a brand new database. On Linux:

rm app.db

or on Windows:

del app.db

You need two OpenID accounts to reproduce this bug, ideally from different providers, so that their cookies don't make this more complicated. Follow these steps to create a nickname collision:

  • login with your first account
  • go to the edit profile page and change the nickname to 'dup'
  • logout
  • login with your second account
  • go to the edit profile page and change the nickname to 'dup'

Oops! We've got an exception from SQLAlchemy. The text of the error reads:

IntegrityError: (IntegrityError) column nickname is not unique u'UPDATE user SET nickname=?, about_me=? WHERE = ?' (u'dup', u'', 2)

What follows after the error is a stack trace of the error, and actually it is a pretty nice one, where you can go to any frame and inspect source code or even evaluate expressions right in the browser.

The error is pretty clear: we tried to write a duplicate nickname to the database. The database model has a unique constraint on the nickname field, so this is an invalid operation.
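To see the same kind of failure outside of the application, here is a minimal sketch that uses Python's built-in sqlite3 module instead of SQLAlchemy. The table layout and the second nickname ('susan') are assumptions made up for the illustration, not the application's real schema:

```python
import sqlite3

# In-memory database with a table loosely shaped like the tutorial's user table
# (an assumption for this sketch, not the real schema).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE user (id INTEGER PRIMARY KEY, nickname TEXT UNIQUE, about_me TEXT)')
conn.execute("INSERT INTO user (nickname, about_me) VALUES ('dup', '')")
conn.execute("INSERT INTO user (nickname, about_me) VALUES ('susan', '')")

error = None
try:
    # Renaming user 2 to a nickname that user 1 already holds violates the constraint.
    conn.execute('UPDATE user SET nickname=?, about_me=? WHERE id=?', ('dup', '', 2))
except sqlite3.IntegrityError as exc:
    error = exc

print(type(error).__name__, error)
```

The exact wording of the message depends on the SQLite version, but the exception type is an IntegrityError in either case.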

In addition to the actual error, we have a secondary problem on our hands. If a user inadvertently causes an error in our application (this one or any other that causes an exception), it will be the user who gets the revealing error message and the stack trace, not us. While this is a fantastic feature while we are developing, it is something we definitely do not want our users to ever see.

All this time we have been running our application in debug mode. The debug mode is enabled when the application starts, by passing a debug=True argument to the run method. This is how we coded our start-up script.

When we are developing the application this is convenient, but we need to make sure it is turned off when we run our application in production mode. Let's just create another starter script that runs with debugging disabled (file

from app import app
app.run(debug=False)

Now restart the application with:


And now try again to rename the nickname on the second account to 'dup'.

This time we do not get an error. Instead, we get an HTTP error code 500, which is Internal Server Error. Not a great looking error, but at least we are not exposing any details of our application to strangers. The error 500 page is generated by Flask when debugging is off and an unhandled exception occurs.

While this is better, we now have two new issues. The first is cosmetic: the default error 500 page is ugly. The second is much more important. With things as they are, we would never know when or if a user experiences a failure in our application, because when debugging is turned off application failures are silently dismissed. Luckily there are easy ways to address both problems.

Custom HTTP error handlers

Flask provides a mechanism for an application to install its own error pages. As an example, let's define custom error pages for the HTTP errors 404 and 500, the two most common ones. Defining pages for other errors works in the same way.

To declare a custom error handler the errorhandler decorator is used (file app/

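@app.errorhandler(404)
def not_found_error(error):
    return render_template('404.html'), 404

@app.errorhandler(500)
def internal_error(error):
    # a failed database operation leaves the session unusable, so reset it first
    db.session.rollback()
    return render_template('500.html'), 500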

Not much to talk about for these, as they are almost self-explanatory. The only interesting bit is the rollback statement in the error 500 handler. This is necessary because this function will be called as a result of an exception. If the exception was triggered by a database error then the database session is going to arrive in an invalid state, so we have to roll it back in case a working session is needed for the rendering of the template for the 500 error.

Here is the template for the 404 error:

<!-- extend base layout -->
{% extends "base.html" %}

{% block content %}
  <h1>File Not Found</h1>
  <p><a href="{{ url_for('index') }}">Back</a></p>
{% endblock %}

And here is the one for the 500 error:

<!-- extend base layout -->
{% extends "base.html" %}

{% block content %}
  <h1>An unexpected error has occurred</h1>
  <p>The administrator has been notified. Sorry for the inconvenience!</p>
  <p><a href="{{ url_for('index') }}">Back</a></p>
{% endblock %}

Note that in both cases we continue to use our base.html layout, so that the error page has the look and feel of the application.

Sending errors via email

To address our second problem we are going to configure two reporting mechanisms for application errors. The first of them is to have the application send us an email each time an error occurs.

Before we get into this let's configure an email server and an administrator list in our application (file

# mail server settings
MAIL_SERVER = 'localhost'
MAIL_PORT = 25
MAIL_USERNAME = None
MAIL_PASSWORD = None

# administrator list
ADMINS = ['']

Of course it will be up to you to change these to what makes sense.

Flask uses the regular Python logging module, so setting up an email when there is an exception is pretty easy (file app/


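from config import ADMINS, MAIL_SERVER, MAIL_PORT, MAIL_USERNAME, MAIL_PASSWORD

if not app.debug:
    import logging
    from logging.handlers import SMTPHandler
    credentials = None
    # only authenticate against the mail server when credentials are configured
    if MAIL_USERNAME or MAIL_PASSWORD:
        credentials = (MAIL_USERNAME, MAIL_PASSWORD)
    mail_handler = SMTPHandler((MAIL_SERVER, MAIL_PORT), 'no-reply@' + MAIL_SERVER, ADMINS, 'microblog failure', credentials)
    # send emails for errors only
    mail_handler.setLevel(logging.ERROR)
    app.logger.addHandler(mail_handler)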

Note that we are only enabling the emails when we run without debugging.

Testing this on a development PC that does not have an email server is easy, thanks to Python's SMTP debugging server. Just open a new console window (command prompt for Windows users) and run the following to start a fake email server:

python -m smtpd -n -c DebuggingServer localhost:25

When this is running, the emails sent by the application will be received and displayed in the console window.

Logging to a file

Receiving errors via email is nice, but sometimes this isn't enough. There are some failure conditions that do not end in an exception and aren't a major problem, yet we may want to keep track of them in a log in case we need to do some debugging.

For this reason, we are also going to maintain a log file for the application.

Enabling file logging is similar to the email logging (file app/

if not app.debug:
    import logging
    from logging.handlers import RotatingFileHandler
    file_handler = RotatingFileHandler('tmp/microblog.log', 'a', 1 * 1024 * 1024, 10)
    file_handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s: %(message)s [in %(pathname)s:%(lineno)d]'))
    # lower the level on both the logger and the handler so INFO messages are kept
    app.logger.setLevel(logging.INFO)
    file_handler.setLevel(logging.INFO)
    app.logger.addHandler(file_handler)
    app.logger.info('microblog startup')

The log file will go to our tmp directory, with name microblog.log. We are using the RotatingFileHandler so that there is a limit to the amount of logs that are generated. In this case we are limiting the size of a log file to one megabyte, and we will keep the last ten log files as backups.
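To get a feel for how rotation behaves, here is a small standalone sketch. It uses deliberately tiny limits (200 bytes per file and two backups) and a temporary directory in place of tmp, and the logger name is made up for the example, so none of these values come from the application:

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

with tempfile.TemporaryDirectory() as log_dir:  # stands in for the tmp directory
    log_path = os.path.join(log_dir, 'microblog.log')

    # Tiny limits so rotation is easy to observe: 200 bytes per file, 2 backups.
    handler = RotatingFileHandler(log_path, 'a', 200, 2)
    logger = logging.getLogger('rotation_demo')  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)

    # Write far more than the three files can hold between them.
    for i in range(50):
        logger.info('message number %d', i)

    # Release the file before the temporary directory is cleaned up.
    logger.removeHandler(handler)
    handler.close()

    log_files = sorted(os.listdir(log_dir))
    largest = max(os.path.getsize(os.path.join(log_dir, name)) for name in log_files)

print(log_files, largest)
```

No matter how many messages are written, the directory never holds more than the current file plus the configured number of backups, and the oldest entries are the ones that get discarded.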

The logging.Formatter class provides custom formatting for the log messages. Since these messages are going to a file, we want them to have as much information as possible, so we write a timestamp, the logging level and the file and line number where the message originated in addition to the log message and the stack trace.

To make the logging more useful, we are lowering the logging level, both in the app logger and the file logger handler, as this will give us the opportunity to write useful messages to the log without having to call them errors. As an example, we start by logging the application startup at the informational level. From now on, each time you start the application without debugging, the log will record the event.
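The interaction between the logger level and the handler levels can be a little confusing, so here is a rough sketch of it. ListHandler is a hypothetical helper written for this example that just collects messages in a list; one instance plays the role of the email handler and the other the role of the file handler:

```python
import logging

class ListHandler(logging.Handler):
    """Hypothetical handler that stores formatted records in a list."""
    def __init__(self, level):
        super().__init__(level)
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))

logger = logging.getLogger('level_demo')  # hypothetical logger name
logger.setLevel(logging.INFO)             # the logger drops anything below INFO
logger.propagate = False

mail_like = ListHandler(logging.ERROR)    # plays the role of the email handler
file_like = ListHandler(logging.INFO)     # plays the role of the file handler
file_like.setFormatter(logging.Formatter('%(levelname)s: %(message)s [in %(pathname)s:%(lineno)d]'))
logger.addHandler(mail_like)
logger.addHandler(file_like)

logger.debug('too detailed, dropped by the logger itself')
logger.info('microblog startup')
logger.error('something broke')

print(mail_like.messages)
print(file_like.messages)
```

The email-like handler only receives the error, while the file-like handler receives both the informational message and the error. The debug message reaches neither of them.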

While we don't have a lot of need for a logger at this time, debugging a web server that is online and in use is very difficult. Logging messages to a file is an extremely useful tool in diagnosing and locating issues, so we are now all ready to go should we need to use this feature.

The bug fix

Let's fix our nickname duplication bug.

As discussed earlier, there are two places that are currently not handling duplicates. The first is in the after_login handler for Flask-OpenID. This is called when a user successfully logs in to the system and we need to create a new User instance. Here is the affected snippet of code, with the fix in it (file app/

    if user is None:
        nickname = resp.nickname
        if nickname is None or nickname == "":
            nickname = resp.email.split('@')[0]
        nickname = User.make_unique_nickname(nickname)
        user = User(nickname=nickname, email=resp.email)

The way we solve the problem is by letting the User class pick a unique name for us. This is what the new make_unique_nickname method does (file app/

class User(db.Model):
    # ...
    @staticmethod
    def make_unique_nickname(nickname):
        if User.query.filter_by(nickname=nickname).first() is None:
            return nickname
        version = 2
        while True:
            new_nickname = nickname + str(version)
            # stop at the first candidate that is not in the database
            if User.query.filter_by(nickname=new_nickname).first() is None:
                break
            version += 1
        return new_nickname
    # ...

This method simply adds a counter to the requested nickname until a unique name is found. For example, if the username "miguel" exists, the method will suggest "miguel2", but if that also exists it will go to "miguel3" and so on. Note that we coded the method as a static method, since this operation does not apply to any particular instance of the class.
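If you want to play with the counter logic without a database, the following sketch does the same thing against a plain Python set. The taken argument is an assumption of the example that stands in for the query against the users table:

```python
def make_unique_nickname(nickname, taken):
    """Return nickname, or nickname plus the first free counter starting at 2."""
    if nickname not in taken:
        return nickname
    version = 2
    while nickname + str(version) in taken:
        version += 1
    return nickname + str(version)

taken = {'miguel', 'miguel2'}  # nicknames that already exist
print(make_unique_nickname('susan', taken))
print(make_unique_nickname('miguel', taken))
```

The first call returns the name unchanged because it is free, while the second one skips over "miguel2" and settles on "miguel3".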

The second place where we have problems with duplicate nicknames is the view function for the edit profile page. This one is a little trickier to handle, because it is the user choosing the nickname. The correct thing to do here is to not accept a duplicated nickname and let the user enter another one. We will address this by adding custom validation to the nickname form field. If the user enters an invalid nickname we'll just fail the validation for the field, and that will send the user back to the edit profile page. To add our validation we just override the form's validate method (file app/

from app.models import User

class EditForm(Form):
    nickname = StringField('nickname', validators=[DataRequired()])
    about_me = TextAreaField('about_me', validators=[Length(min=0, max=140)])

    def __init__(self, original_nickname, *args, **kwargs):
        Form.__init__(self, *args, **kwargs)
        self.original_nickname = original_nickname

    def validate(self):
        if not Form.validate(self):
            return False
        if self.nickname.data == self.original_nickname:
            return True
        user = User.query.filter_by(nickname=self.nickname.data).first()
        if user is not None:
            self.nickname.errors.append('This nickname is already in use. Please choose another one.')
            return False
        return True

The form constructor now takes a new argument original_nickname. The validate method uses it to determine if the nickname has changed or not. If it hasn't changed then it accepts it. If it has changed, then it makes sure the new nickname does not exist in the database.
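The decision that validate makes can be reduced to a small function. This is only a sketch of the logic, with the database lookup replaced by a set of existing nicknames (an assumption of the example) and the field errors replaced by a returned list:

```python
def validate_nickname(new_nickname, original_nickname, taken):
    """Return a list of error messages; an empty list means the value is accepted."""
    if new_nickname == original_nickname:
        # The user did not change the nickname, so there is nothing to check.
        return []
    if new_nickname in taken:
        return ['This nickname is already in use. Please choose another one.']
    return []

taken = {'john', 'susan'}
print(validate_nickname('john', 'john', taken))    # unchanged nickname
print(validate_nickname('susan', 'john', taken))   # collides with another user
print(validate_nickname('johnny', 'john', taken))  # new and unused
```

Only the second call produces an error. The first one is accepted even though "john" is in the set, because that entry is the user's own current nickname.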

Next we add the new constructor argument to the view function:

@app.route('/edit', methods=['GET', 'POST'])
def edit():
    form = EditForm(g.user.nickname)
    # ...

To complete this change we have to enable field errors to show in our template for the form (file app/templates/edit.html):

        <td>Your nickname:</td>
            {{ form.nickname(size=24) }}
            {% for error in form.errors.nickname %}
            <br><span style="color: red;">[{{ error }}]</span>
            {% endfor %}

Now the bug is fixed and duplicates will be prevented... except when they are not. We still have a potential problem with concurrent access to the database by two or more threads or processes, but this will be the topic of a future article.

At this point you can try again to select a duplicated name to see how the form nicely handles the error.

Unit testing framework

To close this session on testing, let's talk about automated testing a bit.

As the application grows in size it gets more and more difficult to ensure that code changes don't break existing functionality.

The traditional approach to prevent regressions is a very good one. You write tests that exercise all the different features of the application. Each test runs a focused part and verifies that the result obtained is the expected one. The tests are executed periodically to make sure that the application works as expected. When the test coverage is large you can have confidence that modifications and additions do not affect the application in a bad way just by running the tests.

We will now build a very simple testing framework using Python's unittest module (file

import os
import unittest

from config import basedir
from app import app, db
from app.models import User

class TestCase(unittest.TestCase):
    def setUp(self):
        app.config['TESTING'] = True
        app.config['WTF_CSRF_ENABLED'] = False
        app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///' + os.path.join(basedir, 'test.db')
        self.app = app.test_client()
        db.create_all()

    def tearDown(self):
        db.session.remove()
        db.drop_all()

    def test_avatar(self):
        u = User(nickname='john', email='')
        avatar = u.avatar(128)
        expected = ''
        assert avatar[0:len(expected)] == expected

    def test_make_unique_nickname(self):
        u = User(nickname='john', email='')
        db.session.add(u)
        db.session.commit()
        nickname = User.make_unique_nickname('john')
        assert nickname != 'john'
        u = User(nickname=nickname, email='')
        db.session.add(u)
        db.session.commit()
        nickname2 = User.make_unique_nickname('john')
        assert nickname2 != 'john'
        assert nickname2 != nickname

if __name__ == '__main__':
    unittest.main()

Discussing the unittest module is outside the scope of this article. Let's just say that class TestCase holds our tests. The setUp and tearDown methods are special: they run before and after each test respectively. A more complex setup could include several groups of tests, each represented by a unittest.TestCase subclass, and each group would then have independent setUp and tearDown methods.

These particular setUp and tearDown methods are pretty generic. In setUp the configuration is edited a bit. For instance, we want the testing database to be different from the main database. In tearDown we just reset the database contents.
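To show why resetting the database around each test matters, here is a minimal sketch that does not depend on the application at all. It uses an in-memory SQLite database through the sqlite3 module, and the class name, table and tests are invented for the example:

```python
import sqlite3
import unittest

class NicknameTestCase(unittest.TestCase):
    def setUp(self):
        # Runs before every test: build a fresh, empty in-memory database.
        self.conn = sqlite3.connect(':memory:')
        self.conn.execute('CREATE TABLE user (nickname TEXT UNIQUE)')

    def tearDown(self):
        # Runs after every test: discard the database.
        self.conn.close()

    def count_users(self):
        return self.conn.execute('SELECT COUNT(*) FROM user').fetchone()[0]

    def test_insert_one(self):
        self.conn.execute("INSERT INTO user VALUES ('john')")
        self.assertEqual(self.count_users(), 1)

    def test_starts_empty(self):
        # Passes regardless of test order, because setUp rebuilt the table.
        self.assertEqual(self.count_users(), 0)

# Run the tests programmatically so the outcome can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NicknameTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```

Because each test gets its own database, the user inserted by one test is never visible to the other, so the tests do not depend on the order in which they run.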

Tests are implemented as methods. A test is supposed to run some function of the application that has a known outcome, and it should fail an assertion if the result differs from the expected one.

So far we have two tests in the testing framework. The first one verifies that the Gravatar avatar URLs from the previous article are generated correctly. Note how the expected avatar is hardcoded in the test and checked against the one returned by the User class.
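As a reminder of what such a test exercises, here is a sketch of how a Gravatar URL can be built. Gravatar identifies a user by the MD5 hash of the trimmed, lowercased email address. The function name, the example address and the d and s query parameters are assumptions of this sketch and may not match the avatar method of the User class exactly:

```python
import hashlib

def avatar(email, size):
    # Gravatar keys on the MD5 hex digest of the trimmed, lowercased email.
    digest = hashlib.md5(email.strip().lower().encode('utf-8')).hexdigest()
    # The query parameters (default image and size) are assumptions of this sketch.
    return 'https://www.gravatar.com/avatar/%s?d=mm&s=%d' % (digest, size)

url = avatar('john@example.com', 128)  # hypothetical address
expected_prefix = 'https://www.gravatar.com/avatar/'

# Same style of check as the test: compare only the leading part of the URL.
assert url[0:len(expected_prefix)] == expected_prefix
print(url)
```

Comparing only the beginning of the URL keeps the test from breaking if the query string changes later.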

The second test verifies the make_unique_nickname method we just wrote, also in the User class. This test is a bit more elaborate: it creates a new user and writes it to the database, then ensures the same name is not allowed as a unique name. It then creates a second user with the suggested unique name and tries one more time to request the first nickname. The expected result for this second part is to get a suggested nickname that is different from the previous two.

To run the test suite you just run the script. On Linux or Mac:


And on Windows:


If there are any errors, you will get a report in the console.

Final words

This ends today's discussion of debugging, errors and testing. I hope you found this article useful.

As always, if you have any comments please write below.

The code of the microblog application updated with today's changes is available below for download:


As always, the flask virtual environment and the database are not included. See previous articles for instructions on how to generate them.

I hope to see you again in the next installment of this series.

Thank you for reading!


  • #101 Sun Qingyao said

    May I ask why should we call db.session.remove() in tearDown?

    This doesn't seem very necessary to me, since flask-sqlalchemy has already handled this, and we haven't set SQLALCHEMY_COMMIT_ON_TEARDOWN = True in this case.

  • #102 Miguel Grinberg said

    @Sun: that ensures that the session is clean when the next test runs.

  • #103 Sun Qingyao said

    Thanks for your response! I didn't see it until I discovered it on the 5th page.

    Ah yes, that makes sense. But what's the purpose of calling db.drop_all() at the end of each test then? I know it's used to reset the database, but we've already removed the session, which is never committed to the database. Do we still need to destroy the database? I'm just wondering, would it be more appropriate to put db.create_all() and db.drop_all() in setUpClass and tearDownClass, which are called only once for a class?

    Also, just curious: which hashing algorithm are you using to generate gravatar for users on this blog (I mean "", not "microblog")? I've tried hashing my email with MD5, but it doesn't seem to return the same result :/

  • #104 Sun Qingyao said

    Oh I made a mistake: in fact db.session.commit() is called in test_make_unique_nickname.

    But I'm still wondering: Why do we need to call both db.session.remove() and db.drop_all()? Now I think a mere db.drop_all() is capable of resetting the test database...

  • #105 Miguel Grinberg said

    @Sun: you need to reset the db.session object as well as the database, in case it has any objects.

  • #106 bryan said

    Thank you so much for this tutorial. I really appreciate your effort, dedication, and hard work to put all this together.
    Can you please recommend any sources that cover the potential problem with concurrent access to the database by two or more threads or processes?


  • #107 Miguel Grinberg said

    @bryan: You are welcome! Database servers have several access mechanisms to prevent concurrency problems. Everything you do from a SQLAlchemy session is sent to the database server as transactions, meaning that a sequence of operations is considered atomic and if anything in that sequence fails, the entire sequence is discarded (this is the "A" in ACID). If multiple threads or processes have open sessions, the database handles them according to the defined isolation mode (the "I" in ACID). See for more details.

  • #108 Zach said

    Thanks for all you do, Miguel. I have a suggestion, which has been very helpful for me. Take the "drop_all" out of the teardown, and run your tests through flask-script. To do that, in your testing module, which I'll call, add the following method :

    def run_all_tests():
        suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCase)

    Then, in your file, add the following flask-script command:

    def runtests_server():
    import atexit
    import tests
    atexit.register(drop_db) #This should be another command in the module that drops and recreates the database
    print("RUNNING A SERVER")
    server= Server(port=5000,host='',use_reloader=False,use_debugger=None)

    That command will do something really beautiful: calling "python runtests_server" will run your tests, possibly populating your database with some fake users, then immediately run a server for you to interact with. When you quit the server, the database will immediately be depopulated (as your teardown method would typically do). I also like immediately starting a shell after the tests (rather than a server) to interact with my test-populated database:

    shell = Shell(make_context=make_shell_context),False)

    Note that you have to specify use_reloader=False when starting the server, otherwise the module will be reloaded and your tests will run again, which is not what we want in this case. I know that this doesn't use the newer Flask CLI, but for me it's a great configuration, and I would love to see it pop up in your highly influential guidance for others to enjoy.

  • #109 Miguel Grinberg said

    @Zach: I'm not so sure this is a generic enough workflow to promote. The problems that I see with it are that 1) there is no separation between a test and a development database, so your own data is lost when you run the tests, 2) you are mixing automated with manual testing, which is not a good practice. Unit tests should run in fully automated mode, so that you can run them often as part of your build script. 3) Not a good practice to share the same database between all the tests, as the order in which the tests run can affect the results. Each test should use a brand new database.

    You do have some interesting ideas here that could be useful if they were implemented as options that are disabled by default, like the starting of a shell or a server after a test.

  • #110 stephen said

    It looks like you've removed this unit testing section from the new version of your tutorial and you no longer link this legacy version to a github repo. I wanted to see the code files for this unit test section to see where you place the files, as I am struggling to import app, db etc. into my test module.

  • #111 Miguel Grinberg said

    @stephen: the old repo is still online:

  • #112 Jake Bayley said

    Hi Miguel,

    Is there a reason you don't test the actual routes with the test client?


  • #113 Miguel Grinberg said

    @Jake: this chapter is focused on unit testing, which is about testing small portions of the application in isolation. This is a lot easier than testing end-to-end with the test client.
