The Flask Mega-Tutorial, Part XVI: Debugging, Testing and Profiling

This is the sixteenth article in the series in which I document my experience writing web applications in Python using the Flask microframework.

The goal of the tutorial series is to develop a decently featured microblogging application that demonstrating total lack of originality I have decided to call microblog.

NOTE: This article was revised in September 2014 to be in sync with current versions of Python and Flask.

Here is an index of all the articles in the series that have been published to date:

Our little microblog application is starting to feel decent enough for a release, so it is time to start thinking about cleaning things up as much as we can. Recently, a reader of this blog (Hi, George!) reported a strange database bug that we are going to debug today. This should help us realize that no matter how careful we are and how much we test our application there is a good chance we will miss some bugs. Unfortunately users are very good at finding them!

Instead of just fixing this bug and forgetting about it until we come across another one, we are going to take some proactive measures to be better prepared for the next one.

In the first part of this article we'll cover debugging, and I'll show you some techniques I use to debug tough problems.

Later we will see how we can have an idea of how effective our testing strategy is. We will specifically look at measuring how much of our application our unit tests are exercising, something called test coverage.

And finally, we will think about another type of problems that many applications can suffer from, bad performance. We will look at profiling techniques to find the parts of our application that are slow.

Sounds good? Let's get started then.

The Bug

This problem was found by a reader of this blog, after he implemented a new function that allows users to delete their own posts. The official version of microblog does not have that, so we'll quickly put together this feature so that we can work on the bug.

The view function that we will use to delete a post is below (file app/views.py):

@app.route('/delete/<int:id>')
@login_required
def delete(id):
    post = Post.query.get(id)
    if post is None:
        flash('Post not found.')
        return redirect(url_for('index'))
    if post.author.id != g.user.id:
        flash('You cannot delete this post.')
        return redirect(url_for('index'))
    db.session.delete(post)
    db.session.commit()
    flash('Your post has been deleted.')
    return redirect(url_for('index'))

To invoke this function we will add a delete link on all posts that are authored by the logged in user (file app/templates/post.html):

{% if post.author.id == g.user.id %}
<div><a href="{{ url_for('delete', id=post.id) }}">{{ _('Delete') }}</a></div>
{% endif %}

There is nothing earth shattering here, we've done this many times already.

NOTE: The bug discussed in this section only existed when using an old version of Flask-WhooshAlchemy. If you would like to experience the bug then uninstall the current version and then install version 0.54a of this extension.

Now let's go ahead and start our application in production mode, to experience it like a regular user would. Linux and Mac users do it like this:

$ ./runp.py

Windows users do it like this instead:

flask/Scripts/python runp.py

Now as a user, write a post, then change your mind and delete it. And right when you click the delete link... Boom!

We get a terse message that says that the application has encountered an error and that the administrator has been notified. This message is actually our 500.html template. In production mode Flask issues the status code 500 template back to the client when an exception occurs while handling a request. Since we are now in production mode we will not see error messages or stack traces.

Debugging issues from the field

If you recall the unit testing article, we enabled a couple of debugging services to run on the production version of our application. Back then we created a logger that writes to a log file, so that the application can write debugging or diagnostic information at run time. Flask itself writes the stack trace of any exceptions that aren't handled before ending the request with a code 500 error. As an extra option, we have also enabled an email based logger that will send the people in the admin list an email when an error is written to the log message.

So for a bug like the one above we would get some debugging information captured in two places, the log file and an email sent to us.

A stack trace isn't much to go on, but it's far better than nothing. Assuming we know nothing about the problem, we now need to figure out what happened from the stack trace alone. Here is a copy of this particular stack trace:

127.0.0.1 - - [03/Mar/2013 23:57:39] "GET /delete/12 HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/home/microblog/flask/lib/python2.7/site-packages/flask/app.py", line 1701, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/microblog/flask/lib/python2.7/site-packages/flask/app.py", line 1689, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/home/microblog/flask/lib/python2.7/site-packages/flask/app.py", line 1687, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/microblog/flask/lib/python2.7/site-packages/flask/app.py", line 1360, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/microblog/flask/lib/python2.7/site-packages/flask/app.py", line 1358, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/microblog/flask/lib/python2.7/site-packages/flask/app.py", line 1344, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/microblog/flask/lib/python2.7/site-packages/flask_login.py", line 496, in decorated_view
    return fn(*args, **kwargs)
  File "/home/microblog/app/views.py", line 195, in delete
    db.session.delete(post)
  File "/home/microblog/flask/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 114, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/home/microblog/flask/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1400, in delete
    self._attach(state)
  File "/home/microblog/flask/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1656, in _attach
    state.session_id, self.hash_key))
InvalidRequestError: Object '<Post at 0xff35e7ac>' is already attached to session '1' (this is '3')

If you are used to reading stack traces in other languages, then be aware that Python shows the stack traces in reverse order, with the bottom frame being the one that caused the exception.

Now how do we make sense of this?

From the above stack trace we can see that the exception was triggered from SQLAlchemy's session handling code, which is safe to assume from the sqlalchemy/orm/session.py filename in the bottom most frame.

With stack traces it is always useful to find out what was the last statement in our own code that executed. If we start at the bottom and we go up one frame at a time we'll find that the fourth frame is in our app/views.py, more specifically in the db.session.delete(post) statement in our delete() view function.

Now we know that SQLAlchemy is unable to register the deletion of this post in the database session. But we still don't know why.

If you look at the text of the exception in the bottom most line of the stack trace the problem appears to be that the Post object is attached to a session '1' from before, and now we are trying to attach the same object to another session '3'.

If you Google the exception message you will find that most people having issues with this are using a multithreaded web server, and they get two requests trying to add the same object to different sessions at the same time. But we are using Python's development web server which is single threaded, so that could not be our case. There is a different problem here that causes two sessions to be active at the same time, while there should only be one.

To see if we can learn more about the problem we should try to reproduce the bug in a more controlled environment. Luckily, trying to do this in the development version of the application does reproduce, and this is a bit better because when the exception occurs in development mode we get Flask's web based stack trace instead of the 500.html template.

The web stack trace is nice because it allows you to inspect code and evaluate expressions all from the browser. Without really understanding much of what's going on in this code we know there is a session '1' (which we can assume is the first session ever created) that for some reason didn't get deleted like normal sessions do when the request that created them ends. So a good approach to make progress would be to figure out who is creating this first session that never goes away.

Using the Python Debugger

The easiest way to find out who is creating an object is to put a breakpoint in the object's constructor. A breakpoint is a command that interrupts the program when certain condition is met. At this point it is possible to inspect the program, like obtaining the stack trace at the time of interruption, checking or even changing variable values, etc. Breakpoints are one of the basic features found in debuggers. For this task we will use the debugger that comes with Python, appropriately called pdb.

But what class are we looking for? Let's go back to the web based stack trace and look some more there. In the bottom most stack frame we can use the code viewer and the Python console (note the icons to enable these on the right side when you hover over the frame) to find out the class that is used for sessions. In the code pane we see that we are inside a Session class, which is likely the base class for database sessions in SQLAlchemy. Since the context in the bottom stack frame is inside the session object we can just get the actual class of the session in the console, by running:

>>> print self
<flask_sqlalchemy._SignallingSession object at 0xff34914c>

And now we know that the sessions that we are using are defined by Flask-SQLAlchemy, because likely this extension defines its own session class, which is a subclass of SQLAlchemy's Session.

Now we can go and inspect the source code for the Flask-SQLAlchemy extension in flask/lib/python2.7/site-packages/flask_sqlalchemy.py and locate class _SignallingSession and its __init__() constructor, and now we are ready to play with the debugger.

There is more than one way to insert a breakpoint in a Python application. The simplest one is to just write the following at the place we want the program to stop:

import pdb; pdb.set_trace()

So we'll go ahead and temporarily insert this breakpoint in the _SignallingSession class constructor (file flask/lib/python2.7/site-packages/flask_sqlalchemy.py):

class _SignallingSession(Session):

    def __init__(self, db, autocommit=False, autoflush=False, **options):
        import pdb; pdb.set_trace() # <-- this is temporary!
        self.app = db.get_app()
        self._model_changes = {}
        Session.__init__(self, autocommit=autocommit, autoflush=autoflush,
                         extension=db.session_extensions,
                         bind=db.engine,
                         binds=db.get_binds(self.app), **options)

    # ...

So let's run the application again to see what happens:

$ ./run.py
> /home/microblog/flask/lib/python2.7/site-packages/flask_sqlalchemy.py(198)__init__()
-> self.app = db.get_app()
(Pdb)

Since there is no "Running on..." message printed we know the server didn't actually complete its start up procedures. The interruption that brought us to the pdb prompt occurred because in some part of the application someone requested the creation of our mysterious session!

The most important question that we need to answer right now is where are we in the application, because that will tell us who is requesting the creation of this session '1' we can't get rid of later. We will use the bt command (short for backtrace) to get a stack trace:

(Pdb) bt
  /home/microblog/run.py(2)<module>()
-> from app import app
  /home/microblog/app/__init__.py(44)<module>()
-> from app import views, models
  /home/microblog/app/views.py(6)<module>()
-> from forms import LoginForm, EditForm, PostForm, SearchForm
  /home/microblog/app/forms.py(4)<module>()
-> from app.models import User
  /home/microblog/app/models.py(92)<module>()
-> whooshalchemy.whoosh_index(app, Post)
  /home/microblog/flask/lib/python2.6/site-packages/flask_whooshalchemy.py(168)whoosh_index()
-> _create_index(app, model))
  /home/microblog/flask/lib/python2.6/site-packages/flask_whooshalchemy.py(199)_create_index()
-> model.query = _QueryProxy(model.query, primary_key,
  /home/microblog/flask/lib/python2.6/site-packages/flask_sqlalchemy.py(397)__get__()
-> return type.query_class(mapper, session=self.sa.session())
  /home/microblog/flask/lib/python2.6/site-packages/sqlalchemy/orm/scoping.py(54)__call__()
-> return self.registry()
  /home/microblog/flask/lib/python2.6/site-packages/sqlalchemy/util/_collections.py(852)__call__()
-> return self.registry.setdefault(key, self.createfunc())
> /home/microblog/flask/lib/python2.6/site-packages/flask_sqlalchemy.py(198)__init__()
-> self.app = db.get_app()
(Pdb)

As we did before, we start from the bottom and locate the first stack frame that has code that we recognize as ours. That turns out to be our models.py file at line 92, which is the initialization of our full text search engine:

whooshalchemy.whoosh_index(app, Post)

Hmm. We are not doing anything intentional that would trigger the creation of a database session this early in the application's life, but it appears that the act of initializing Flask-WhooshAlchemy does create a database session.

This feels like this isn't our bug after all, maybe some sort of interaction between the two Flask extensions that wrap SQLAlchemy and Whoosh. We could stop here and ask for help from the developers of these two fine extensions or the communities around them. Or we could continue debugging to see if we can figure this out and have the problem solved today. I'll keep going, if you are bored already feel free to jump to the next section.

Let's have another look at the stack trace. We call whoosh_index(), which in turn calls _create_index(). The line of code in _create_index() is this:

model.query = _QueryProxy(model.query, primary_key,
            searcher, model)

The model variable in this context is set to our Post class, we passed it as an argument in our call to whoosh_index(). With that in mind, it appears Flask-WhooshAlchemy is creating a Post.query wrapper that takes the original Post.query as an argument, plus some other Whoosh specific stuff. And here is the interesting part. According to the stack trace above, the next function in the call stack is __get__(), one of Python's descriptor methods.

The __get__() method is used to implement descriptors, which are attributes that have behavior associated with them instead of just a value. Each time the descriptor is referenced the function __get__() is called. Then the function is supposed to return the value of the attribute. The only attribute that is mentioned in this line of code is query, so now we know that this seemingly simple attribute that we have used in the past to generate our database queries isn't really an attribute but a descriptor. The remainder of the stack trace then is dealing with computing the value of the model.query expression that appears in the right side, in preparation of creating the _QueryProxy constructor.

Let's continue down the stack trace to see what happens next. The instruction in __get__() is this one:

return type.query_class(mapper, session=self.sa.session())

And this is a pretty revealing piece of code. When we say, for example, User.query.get(id) we are indirectly calling this __get__() method to provide the query object, and here we can see that this query object implicitly brings a database session along with it!

When Flask-WhooshAlchemy says model.query that also triggers a session to be created and associated with the query object. But the query object that Flask-WhooshAlchemy requests isn't short lived like any normal queries we run inside our view functions. Flask-WhooshAlchemy is wrapping this query object into its own query object, which it then stores back into model.query. Since there is no __set__() method counterpart, the new object will be stored as an attribute. For our Post class that means that after Flask-WhooshAlchemy completes its initialization we will have a descriptor and an attribute with the same name. According to the precedence rules, in this case the attribute wins, which is expected, since if not our Whoosh searches would not have worked.

The important aspect of all this is that this code is setting a persistent attribute that inside has our session '1'. Even though the first request handled by the application will use this session and then forget about it, the session does not go away because it is still referenced by the Post.query attribute. This is our bug!

The bug is caused by the confusing (my opinion) nature of descriptors. They look like regular attributes, so people tend to use them as such. The Flask-WhooshAlchemy developer just wanted to create an enhanced query object that can store some state useful for Whoosh queries, but didn't realize that referencing the query attribute of a model does way more than it seems, as there is hidden behavior associated with this attribute that starts a database session.

Regression Testing

For many, the most logical thing to do at this point would be to fix the Flask-WhooshAlchemy code and move on. But if we just do that, then what protects us from this or a similar bug happening in the future? For example, what happens if a year from now we decide to update Flask-WhooshAlchemy to a new version and forget that we had applied a custom fix to it?

The best option every time a bug is discovered is to create a unit test for it, so that we can make sure we don't have a regression in the future.

There is some trickiness in creating a test for this bug, as we need to simulate two requests inside a single test. The first request will query a Post object, simulating the query that we make to request data for displaying in the web page. Since this is the first query it will use the session '1'. Then we need to forget that session and make a new one, exactly like Flask-SQLAlchemy does. Trying to delete the Post object on the second session should trigger this bug, because the first session would not have gone away as expected.

After taking another peek at the source code for Flask-SQLAlchemy we can see that new sessions are created using the db.create_scoped_session() function, and when a request ends a session is destroyed by calling db.session.remove(). Knowing this makes it easy to write a test for this bug:

def test_delete_post(self):
    # create a user and a post
    u = User(nickname='john', email='john@example.com')
    p = Post(body='test post', author=u, timestamp=datetime.utcnow())
    db.session.add(u)
    db.session.add(p)
    db.session.commit()
    # query the post and destroy the session
    p = Post.query.get(1)
    db.session.remove()
    # delete the post using a new session
    db.session = db.create_scoped_session()
    db.session.delete(p)
    db.session.commit()

And sure enough, now when we run our test suite the failure appears:

$ ./tests.py
.E....
======================================================================
ERROR: test_delete_post (__main__.TestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests.py", line 133, in test_delete_post
    db.session.delete(p)
  File "/home/microblog/flask/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 114, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/home/microblog/flask/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1400, in delete
    self._attach(state)
  File "/home/microblog/flask/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1656, in _attach
    state.session_id, self.hash_key))
InvalidRequestError: Object '<Post at 0xff09b7ac>' is already attached to session '1' (this is '3')

----------------------------------------------------------------------
Ran 6 tests in 3.852s

FAILED (errors=1)

The Fix

To address this problem we need to find an alternative way of attaching Flask-WhooshAlchemy's query object to the model.

The documentation for Flask-SQLAlchemy mentions there is a model.query_class attribute that contains the class to use for queries. This is actually a much cleaner way to make Flask-SQLAlchemy use a custom query class than what Flask-WhooshAlchemy is doing. If we configure Flask-SQLAlchemy to create queries using the Whoosh enabled query class (which is already a subclass of Flask-SQLAlchemy's BaseQuery), then we should have the same result as before, but without the bug.

I have created a fork of the Flask-WhooshAlchemy project on github where I have implemented these changes. If you want to see the changes you can see the github diff for my commit, or you can also download the fixed extension and install it in place of your original flask_whooshalchemy.py file.

I have submitted my fix to the developer of Flask-WhooshAlchemy, so I hope at some point it will get included in an official version (Update: my fix has been included in version 0.56).

Test Coverage

One way we can greatly reduce the chances of having bugs after we deploy our application is to have a comprehensive test suite. We already have a unit testing framework, but how do we know how much of our application are we currently testing with it?

A test coverage tool can observe a running application and take note of which lines of code execute and which do not. After execution ends it can produce a report showing what lines have not executed. If we had such report for our test suite we would know right away what parts of our code need tests that exercise them.

Python has a coverage tool that we can use called simply coverage that we installed way back when we started this tutorial. This tool can be used as a command line tool or can also be started from inside a script. To make it easier to not forget to run it we will go with the latter.

These are the changes that we need to make to add a coverage report to our test suite (file tests.py):

from coverage import coverage
cov = coverage(branch=True, omit=['flask/*', 'tests.py'])
cov.start()

# ...

if __name__ == '__main__':
    try:
        unittest.main()
    except:
        pass
    cov.stop()
    cov.save()
    print "\n\nCoverage Report:\n"
    cov.report()
    print "HTML version: " + os.path.join(basedir, "tmp/coverage/index.html")
    cov.html_report(directory='tmp/coverage')
    cov.erase()

We begin by initializing the coverage module at the very top of the script. The branch = True argument requests that branch analysis is done in addition to regular line based coverage. The omit argument makes sure we do not get coverage report for the modules we have installed in our virtual environment and for the unit testing framework itself, we just want coverage for our application code.

To gather coverage statistics we just call cov.start(), then run our unit tests. We have to catch and pass exceptions from the unit testing framework, because if not the script would end without giving us a chance to produce a coverage report. After we are back from the testing we stop coverage with cov.stop(), and write the results with cov.save(). Finally, cov.report() dumps the data to the console, cov.html_report() generates a nicer HTML report with the same date, and cov.erase() deletes the data file.

Here is an example test run with coverage report included (note I left an intentional failure in there):

$ ./tests.py
.....F
    ======================================================================
FAIL: test_translation (__main__.TestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests.py", line 143, in test_translation
    assert microsoft_translate(u'English', 'en', 'es') == u'Inglés'
AssertionError

----------------------------------------------------------------------
Ran 6 tests in 3.981s

FAILED (failures=1)

Coverage Report:

Name             Stmts   Miss Branch BrMiss  Cover   Missing
------------------------------------------------------------
app/__init__        39      0      6      3    93%
app/decorators       6      2      0      0    67%   5-6
app/emails          14      6      0      0    57%   9, 12-15, 21
app/forms           30     14      8      8    42%   15-16, 19-30
app/models          63      8     10      1    88%   32, 37, 47, 50, 53, 56, 78, 90
app/momentjs        12      5      0      0    58%   5, 8, 11, 14, 17
app/translate       33     24      4      3    27%   10-36, 39-56
app/views          169    124     46     46    21%   16, 20, 24-30, 34-37, 41, 45-46, 53-67, 75-81, 88-109, 113-114, 120-125, 132-143, 149-164, 169-183, 188-198, 203-205, 210-211, 218
config              22      0      0      0   100%
------------------------------------------------------------
TOTAL              388    183     74     61    47%

HTML version: /home/microblog/tmp/coverage/index.html

With this report we know that we are testing 47% of our application. And we are given the list of lines that didn't execute during the test run, so we just need to review those lines and think about what tests we can write that use them.

We can see that our coverage for app/models.py is pretty high (88%), since we have mostly focused on testing our models. The app/views.py coverage is pretty low (21%) because we aren't executing view function code in our tests yet.

In addition to lines of code that were missed, this tool gives us branch coverage information in the Branch and BrMiss columns. Consider the following example script:

def f(x):
    if x >= 0:
        x = x + 1
    return x

f(1)

If we run coverage in this simple function we will get 100% coverage, because when the function gets 1 as an argument its three lines execute. But we have never executed this function with the argument set to 0 or less, and that causes a different behavior. In a more complex case that could cause a bug.

Branch coverage tells us how many branches of code we have missed, and this is another pointer to the kinds of tests we might be missing in our suite.

The coverage tool also generates a very nice HTML based report that displays the source code annotated with color coded markers for line and branches covered and missed.

Continuing with our testing strategy centered in testing the models we can look at the parts of our app/models.py file that have no test coverage. This is easy to visualize in the HTML report, from where we can obtain the following list:

  • User.make_valid_nickname()
  • User.is_authenticated()
  • User.is_active()
  • User.is_anonymous()
  • User.get_id()
  • User.__repr__()
  • Post.__repr__()
  • User.make_unique_nickname() (only for the branch case where the given name is already unique)

We can put the first five in the list in a new test:

def test_user(self):
    # make valid nicknames
    n = User.make_valid_nickname('John_123')
    assert n == 'John_123'
    n = User.make_valid_nickname('John_[123]\n')
    assert n == 'John_123'
    # create a user
    u = User(nickname='john', email='john@example.com')
    db.session.add(u)
    db.session.commit()
    assert u.is_authenticated() is True
    assert u.is_active() is True
    assert u.is_anonymous() is False
    assert u.id == int(u.get_id())

The __repr__() functions are really used internally, so we don't need to test them. For this, we can mark them as not interesting as follows:

def __repr__(self):  # pragma: no cover
    return '<User %r>' % (self.nickname)

Finally, back when we wrote the test for make_unique_nickname() we focused only on how the function resolves a name collision, but we forgot to provide a test case that is unique and requires no handling. We can expand our existing test to cover that case as well:

def test_make_unique_nickname(self):
    # create a user and write it to the database
    u = User(nickname='john', email='john@example.com')
    db.session.add(u)
    db.session.commit()
    nickname = User.make_unique_nickname('susan')
    assert nickname == 'susan'
    nickname = User.make_unique_nickname('john')
    assert nickname != 'john'
    #...

And with these simple changes we get to almost 100% coverage on our models.py source file. When running on Python 2.7 there are only two Python 3 specific lines of code that are missed.

For now we'll call it good. Some other day we may decide to pick this up again and figure out a good way to test view functions, but we should feel good that we have full coverage on the code that talks to the database.

Profiling for performance

The next topic of the day is performance. There is nothing more frustrating for users than having to wait a long time for pages to load. We want to make sure our application is as fast as it can be so we need to take some measures to be prepared to analyze performance problems.

The technique that we are going to use is called profiling. A code profiler watches a running program, pretty much like coverage tools do, but instead of noting which lines execute and which don't it notes how much time is spent on each function. At the end of the profiling period a report is generated that lists all the functions that executed and how long was spent in each. Sorting this list from the largest to the smallest time will give us a pretty good idea of where we should spend time optimizing code.

Python comes with a nice source code profiler called cProfile. We could embed this profiler into our application directly, but before we do any work it is a good idea to search if someone has done the integration work already. A quick search for "Flask profiler" tells us that the Werkzeug module used by Flask comes with a profiler plugin that is all ready to go, so we'll just use that.

To enable the Werkzeug profiler we can create another starter script like run.py. Let's call it profile.py:

#!flask/bin/python
from werkzeug.contrib.profiler import ProfilerMiddleware
from app import app

app.config['PROFILE'] = True
app.wsgi_app = ProfilerMiddleware(app.wsgi_app, restrictions=[30])
app.run(debug = True)

Starting the application with the above script will enable the profiler to show the 30 most expensive functions for each request (the syntax for the restrictions argument is documented here).

Once the application starts, each request will show a profiler summary. Here is an example:

--------------------------------------------------------------------------------
PATH: '/'
         95477 function calls (89364 primitive calls) in 0.202 seconds

   Ordered by: internal time, call count
   List reduced from 1587 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.061    0.061    0.061    0.061 {method 'commit' of 'sqlite3.Connection' objects}
        1    0.013    0.013    0.018    0.018 flask/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/pysqlite.py:278(dbapi)
    16807    0.006    0.000    0.006    0.000 {isinstance}
     5053    0.006    0.000    0.012    0.000 flask/lib/python2.7/site-packages/jinja2/nodes.py:163(iter_child_nodes)
8746/8733    0.005    0.000    0.005    0.000 {getattr}
      817    0.004    0.000    0.011    0.000 flask/lib/python2.7/site-packages/jinja2/lexer.py:548(tokeniter)
        1    0.004    0.004    0.004    0.004 /usr/lib/python2.7/sqlite3/dbapi2.py:24(<module>)
        4    0.004    0.001    0.015    0.004 {__import__}
        1    0.004    0.004    0.009    0.009 flask/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/__init__.py:7(<module>)
   1808/8    0.003    0.000    0.033    0.004 flask/lib/python2.7/site-packages/jinja2/visitor.py:34(visit)
     9013    0.003    0.000    0.005    0.000 flask/lib/python2.7/site-packages/jinja2/nodes.py:147(iter_fields)
     2822    0.003    0.000    0.003    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
      738    0.003    0.000    0.003    0.000 {method 'split' of 'str' objects}
     1808    0.003    0.000    0.006    0.000 flask/lib/python2.7/site-packages/jinja2/visitor.py:26(get_visitor)
     2862    0.003    0.000    0.003    0.000 {method 'append' of 'list' objects}
  110/106    0.002    0.000    0.008    0.000 flask/lib/python2.7/site-packages/jinja2/parser.py:544(parse_primary)
       11    0.002    0.000    0.002    0.000 {posix.stat}
        5    0.002    0.000    0.010    0.002 flask/lib/python2.7/site-packages/sqlalchemy/engine/base.py:1549(_execute_clauseelement)
        1    0.002    0.002    0.004    0.004 flask/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/base.py:124(<module>)
  1229/36    0.002    0.000    0.008    0.000 flask/lib/python2.7/site-packages/jinja2/nodes.py:183(find_all)
    416/4    0.002    0.000    0.006    0.002 flask/lib/python2.7/site-packages/jinja2/visitor.py:58(generic_visit)
   101/10    0.002    0.000    0.003    0.000 flask/lib/python2.7/sre_compile.py:32(_compile)
       15    0.002    0.000    0.003    0.000 flask/lib/python2.7/site-packages/sqlalchemy/schema.py:1094(_make_proxy)
        8    0.002    0.000    0.002    0.000 {method 'execute' of 'sqlite3.Cursor' objects}
        1    0.002    0.002    0.002    0.002 flask/lib/python2.7/encodings/base64_codec.py:8(<module>)
        2    0.002    0.001    0.002    0.001 {method 'close' of 'sqlite3.Connection' objects}
        1    0.001    0.001    0.001    0.001 flask/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/pysqlite.py:215(<module>)
        2    0.001    0.001    0.002    0.001 flask/lib/python2.7/site-packages/wtforms/form.py:162(__call__)
      980    0.001    0.000    0.001    0.000 {id}
  936/127    0.001    0.000    0.008    0.000 flask/lib/python2.7/site-packages/jinja2/visitor.py:41(generic_visit)

--------------------------------------------------------------------------------

127.0.0.1 - - [09/Mar/2013 19:35:49] "GET / HTTP/1.1" 200 -

The columns in this report are as follows:

  • ncalls: number of times this function was called.
  • tottime: total time spent inside this function.
  • percall: this is tottime divided by ncalls.
  • cumtime: total time spent inside this function and any functions called from it.
  • percall: cumtime divided by ncalls.
  • filename:lineno(function): the function name and location.

It is interesting to note that our templates will also appear here as functions. This is because Jinja2 templates are compiled to Python code. That means that the profiler will not only point us to slow code we have written but also to slow templates!

We really don't have a performance problem at this point, at least not in this one request. We can see that the most time consuming functions are related to the sqlite3 database and Jinja2 template rendering, which is totally expected. Note in the header at the top that the this request took just 0.2 seconds to complete, so the scale of these per-function times is so small that it is in the noise.

As the application grows, it is useful to run new requests we add through the profiler, to make sure we are not doing anything that is slow.

Database Performance

To end this article, we are going to look at database performance. We've seen above that our database handling is at the top of the profiler report, so it would be good to have a system in place to get an alert when and if our database becomes too slow during production use.

The Flask-SQLAlchemy documentation mentions a get_debug_queries function, which returns a list of all the queries issued during the request with their durations.

This is very useful information. The intended use appears to be to time queries during debugging or testing, but being able to send an alert when a query takes too long is useful as a feature for the production version, even if it adds a little bit of overhead.

To use this feature during production we need to enable it in the configuration (file config.py):

SQLALCHEMY_RECORD_QUERIES = True

We are also going to setup a threshold for what we will consider a slow query (file config.py):

# slow database query threshold (in seconds)
DATABASE_QUERY_TIMEOUT = 0.5

To check if we need to send any alerts we'll add a hook after each request. In Flask this is easy, we just set up a after_request handler (file app/views.py):

from flask.ext.sqlalchemy import get_debug_queries
from config import DATABASE_QUERY_TIMEOUT

@app.after_request
def after_request(response):
    for query in get_debug_queries():
        if query.duration >= DATABASE_QUERY_TIMEOUT:
            app.logger.warning("SLOW QUERY: %s\nParameters: %s\nDuration: %fs\nContext: %s\n" % (query.statement, query.parameters, query.duration, query.context))
    return response

This will write to the log any queries that took longer than half a second. The information in the log will include the SQL statement, the actual parameters used in that statement, the duration and the location in the sources from where the query was issued.

With a small database like ours it is unlikely that we will have slow queries, but as the database and the application grow we may find that some database queries need to be optimized, for example with the addition of indexes. By simply checking the log file from time to time we will learn if some of the queries we are issuing need optimization.

Final words

Today we have done a number of mundane, yet important things all robust applications must have. The code for the updated application can be downloaded below:

Download microblog-0.16.zip.

Users comfortable with github can also get the code from here.

It feels like we are reaching the end of this tutorial series, as I'm running out of ideas for what to present. In the next, possibly last chapter we will look at deployment options, both traditional and cloud based.

If you have any topics that you feel have not been given coverage in this tutorial please let me know in the comments below.

I hope to see you next time!

Miguel

41 comments

  • #1 Slarti said :

    This whole series has been wonderful. Great detail, clear explanations and examples. I'm not sure how you're defining "traditional," but when you go over deployments, please include something on packaging your app in a way that's friendly to the package management system and filesystem layout of the server it'll be running on in production. (RPM/deb/whatever, Linux FHS, etc.) Speaking as someone who's a systems admin by trade, I can't tell you how often and how badly I despair when developers treat tarballs or git checkouts into rathole subdirectories with their own self-contained locations for confs, logs, etc. like they're legitimate deployment strategies. (I also despair at how virtualenv and keeping local unmanaged copies of dependency modules (potentially leading to multiple copies of the same modules if more than one app lives on a system) is considered the done thing rather than solving the problem of properly integrating with the system-managed python, but I've accepted that I might possibly have lost that fight. I'll accept considering the dependencies as part of the app settle for integration of the app as a whole into the system as a whole.) Thanks!

  • #2 Jimmy Juno said :

    Coming from the C/C++ world, the pythonic debugging technique of modifying 3rd-party scripts to set breakpoints feels very risky. I would suggest using pudb instead of pdb for debugging anyway -- it has a much nicer UI to pdb yet still runs in the console. Another great tutorial, Miguel. I can't wait to read the one on deployment. :-)

  • #3 Miguel Grinberg said :

    @Jimmy: Why do you think modifying 3rd party code is risky? I also come from a C/C++ world, but I learned not to take 3rd party code for granted, getting familiar with the libraries that you use enable you to understand their pluses and minuses better, and in fact, this bug I was chasing ended up being in a 3rd party library after all! Now with that said, I was not aware of pudb, looks like a much nicer alternative to pdb and I'll definitely give it a try. Thanks!

  • #4 George Mabley said :

    I have been looking at the capabilities of the extensions we have added so far, and I thought I should try out WTForm's recaptcha. I modified my forms.py correctly and added a RecaptchaField as recaptcha. I added my keys to the config.py, and then called {{form.recaptcha}} in a template. After the code hit {{form.recaptcha}}, this traceback was given: http://pastebin.com/1LVBcmMc. The error was simply "KeyError: 'babel'." Any suggestions as to what my problem is?

  • #5 Miguel Grinberg said :

    @George: are you running your app without initializing Flask-Babel maybe? The recaptcha widget has internationalized texts in its template, it seems it's trying to get babel to translate a text and babel isn't there.

  • #6 George Mabley said :

    That was it. Thank you!

  • #7 Aaron said :

    Did you cover caching anywhere? I see mention of it here, but can't find it in your tutorial: http://blog.miguelgrinberg.com/post/about-this-blog

  • #8 Miguel Grinberg said :

    @Aaron: you are right, I never found a good excuse to talk about caching, since microblog is largely dynamic. The only place it would make sense to cache is at the post level, but posts are also pretty simple. I'll think about it and if I can come up with a good way to insert caching I'll write an article about it. Thanks!

  • #9 Aaron said :

    Oh, is the software referenced in the about post not the same as what you covered in this mega-tutorial? If the queries are so simple as to not need caching, it's a little curious that you covered slow query logging at the end ;-) I've been working on a project with some fairly slow queries here and there, and I'll have to get some caching in place soon. I just thought I ought to see if you'd covered it, as your tutorial is among the best I've seen for all the things you *do* cover... even though I'm not a novice, it's helpful.

  • #10 Miguel Grinberg said :

    @Aaron: the software mentioned in the About page is this blog, it is not the subject of the tutorial. You find it a "little curious" that I don't cache my database queries because you are confused about the reasons for caching and how caching works. Caching is not a solution to fix slow queries, for that problem you improve your database design so that your queries run more efficiently. Caching only works for data that rarely changes, it would not work for an application like microblog, which as I said above, is dynamic and needs to show updated content constantly. On the other side, caching works great for a blog since articles rarely change and they are rendered exactly the same for all clients. My recommendation for your problem is that you learn about database optimization more than caching.

Leave a Comment

Note: all comments are screened before they are published. Thank you for your patience!