The Flask Mega-Tutorial, Part XVI: Debugging, Testing and Profiling (2012)

(Great news! There is a new version of this tutorial!)

This is the sixteenth article in the series in which I document my experience writing web applications in Python using the Flask microframework.

The goal of the tutorial series is to develop a decently featured microblogging application that, demonstrating my total lack of originality, I have decided to call microblog.

NOTE: This article was revised in September 2014 to be in sync with current versions of Python and Flask.

Our little microblog application is starting to feel decent enough for a release, so it is time to start thinking about cleaning things up as much as we can. Recently, a reader of this blog (Hi, George!) reported a strange database bug that we are going to debug today. This should help us realize that no matter how careful we are and how much we test our application there is a good chance we will miss some bugs. Unfortunately users are very good at finding them!

Instead of just fixing this bug and forgetting about it until we come across another one, we are going to take some proactive measures to be better prepared for the next one.

In the first part of this article we'll cover debugging, and I'll show you some techniques I use to debug tough problems.

Later we will see how we can have an idea of how effective our testing strategy is. We will specifically look at measuring how much of our application our unit tests are exercising, something called test coverage.

And finally, we will think about another type of problem that many applications suffer from: bad performance. We will look at profiling techniques to find the parts of our application that are slow.

Sounds good? Let's get started then.

The Bug

This problem was found by a reader of this blog, after he implemented a new function that allows users to delete their own posts. The official version of microblog does not have that, so we'll quickly put together this feature so that we can work on the bug.

The view function that we will use to delete a post is below (file app/views.py):

@app.route('/delete/<int:id>')
@login_required
def delete(id):
    post = Post.query.get(id)
    if post is None:
        flash('Post not found.')
        return redirect(url_for('index'))
    if post.author.id != g.user.id:
        flash('You cannot delete this post.')
        return redirect(url_for('index'))
    db.session.delete(post)
    db.session.commit()
    flash('Your post has been deleted.')
    return redirect(url_for('index'))

To invoke this function we will add a delete link to all posts authored by the logged-in user (file app/templates/post.html):

{% if post.author.id == g.user.id %}
<div><a href="{{ url_for('delete', id=post.id) }}">{{ _('Delete') }}</a></div>
{% endif %}

There is nothing earth-shattering here; we've done this many times already.

NOTE: The bug discussed in this section only existed when using an old version of Flask-WhooshAlchemy. If you would like to experience the bug then uninstall the current version and then install version 0.54a of this extension.

Now let's go ahead and start our application in production mode, to experience it like a regular user would. Linux and Mac users do it like this:

$ ./runp.py

Windows users do it like this instead:

flask/Scripts/python runp.py

Now as a user, write a post, then change your mind and delete it. And right when you click the delete link... Boom!

We get a terse message saying that the application has encountered an error and that the administrator has been notified. This message is actually our 500.html template. When an exception occurs while handling a request in production mode, Flask sends this template back to the client with a status code 500. Since we are running in production mode we will not see any error messages or stack traces.

Debugging issues from the field

If you recall the unit testing article, we enabled a couple of debugging services to run on the production version of our application. Back then we created a logger that writes to a log file, so that the application can record debugging or diagnostic information at run time. Flask itself writes the stack trace of any exception that isn't handled before ending the request with a code 500 error. As an extra option, we also enabled an email based logger that sends the people in the admin list an email when an error is written to the log.
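If you don't remember the details, the setup from back then looked roughly like this (a sketch only; the MAIL_SERVER, MAIL_PORT and ADMINS values come from our config.py):

if not app.debug:
    import logging
    from logging.handlers import RotatingFileHandler, SMTPHandler

    # file logger: keeps up to ten rotating 1MB log files under tmp/
    file_handler = RotatingFileHandler('tmp/microblog.log', 'a',
                                       1 * 1024 * 1024, 10)
    file_handler.setLevel(logging.INFO)
    app.logger.addHandler(file_handler)

    # email logger: errors written to the log also go out to the admins
    mail_handler = SMTPHandler((MAIL_SERVER, MAIL_PORT),
                               'no-reply@' + MAIL_SERVER, ADMINS,
                               'microblog failure')
    mail_handler.setLevel(logging.ERROR)
    app.logger.addHandler(mail_handler)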

So for a bug like the one above we would get some debugging information captured in two places, the log file and an email sent to us.

A stack trace isn't much to go on, but it's far better than nothing. Assuming we know nothing about the problem, we now need to figure out what happened from the stack trace alone. Here is a copy of this particular stack trace:

127.0.0.1 - - [03/Mar/2013 23:57:39] "GET /delete/12 HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/home/microblog/flask/lib/python2.7/site-packages/flask/app.py", line 1701, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/microblog/flask/lib/python2.7/site-packages/flask/app.py", line 1689, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/home/microblog/flask/lib/python2.7/site-packages/flask/app.py", line 1687, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/microblog/flask/lib/python2.7/site-packages/flask/app.py", line 1360, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/microblog/flask/lib/python2.7/site-packages/flask/app.py", line 1358, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/microblog/flask/lib/python2.7/site-packages/flask/app.py", line 1344, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/microblog/flask/lib/python2.7/site-packages/flask_login.py", line 496, in decorated_view
    return fn(*args, **kwargs)
  File "/home/microblog/app/views.py", line 195, in delete
    db.session.delete(post)
  File "/home/microblog/flask/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 114, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/home/microblog/flask/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1400, in delete
    self._attach(state)
  File "/home/microblog/flask/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1656, in _attach
    state.session_id, self.hash_key))
InvalidRequestError: Object '<Post at 0xff35e7ac>' is already attached to session '1' (this is '3')

If you are used to reading stack traces in other languages, then be aware that Python shows the stack traces in reverse order, with the bottom frame being the one that caused the exception.

Now how do we make sense of this?

From the above stack trace we can see that the exception was triggered from SQLAlchemy's session handling code, which is safe to assume from the sqlalchemy/orm/session.py filename in the bottom most frame.

With stack traces it is always useful to find the last statement in our own code that executed. If we start at the bottom and go up one frame at a time, we find that the fourth frame is in our app/views.py, more specifically in the db.session.delete(post) statement of our delete() view function.

Now we know that SQLAlchemy is unable to register the deletion of this post in the database session. But we still don't know why.

If you look at the text of the exception in the bottom most line of the stack trace the problem appears to be that the Post object is attached to a session '1' from before, and now we are trying to attach the same object to another session '3'.

If you Google the exception message you will find that most people having issues with this are using a multithreaded web server, and they get two requests trying to add the same object to different sessions at the same time. But we are using Python's development web server which is single threaded, so that could not be our case. There is a different problem here that causes two sessions to be active at the same time, while there should only be one.

To see if we can learn more about the problem we should try to reproduce the bug in a more controlled environment. Luckily, the bug does reproduce in the development version of the application, and this is a bit better because when the exception occurs in development mode we get Flask's web based stack trace instead of the 500.html template.

The web stack trace is nice because it allows you to inspect code and evaluate expressions all from the browser. Without really understanding much of what's going on in this code we know there is a session '1' (which we can assume is the first session ever created) that for some reason didn't get deleted like normal sessions do when the request that created them ends. So a good approach to make progress would be to figure out who is creating this first session that never goes away.

Using the Python Debugger

The easiest way to find out who is creating an object is to put a breakpoint in the object's constructor. A breakpoint is a command that interrupts the program when a certain condition is met. At that point it is possible to inspect the program, for example by obtaining the stack trace at the moment of the interruption, or by checking or even changing variable values. Breakpoints are one of the basic features found in debuggers. For this task we will use the debugger that comes with Python, appropriately called pdb.
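If you have never worked with pdb before, here is a toy example of the idea, completely unrelated to microblog: by wrapping the breakpoint in an if statement we get a conditional breakpoint that only fires in the case we care about:

import pdb

def process(items):
    for i, item in enumerate(items):
        if item is None:     # the condition we want to investigate
            pdb.set_trace()  # execution stops here, at the (Pdb) prompt
        # ... normal processing would continue here ...

process(['a', 'b', None, 'c'])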

But what class are we looking for? Let's go back to the web based stack trace and look some more there. In the bottom most stack frame we can use the code viewer and the Python console (note the icons to enable these on the right side when you hover over the frame) to find out the class that is used for sessions. In the code pane we see that we are inside a Session class, which is likely the base class for database sessions in SQLAlchemy. Since the context in the bottom stack frame is inside the session object we can just get the actual class of the session in the console, by running:

>>> print self
<flask_sqlalchemy._SignallingSession object at 0xff34914c>

And now we know that the sessions we are using are defined by Flask-SQLAlchemy, likely because this extension defines its own session class as a subclass of SQLAlchemy's Session.

Now we can go and inspect the source code for the Flask-SQLAlchemy extension in flask/lib/python2.7/site-packages/flask_sqlalchemy.py, and locate the _SignallingSession class and its __init__() constructor. We are ready to play with the debugger.

There is more than one way to insert a breakpoint in a Python application. The simplest one is to just write the following at the place we want the program to stop:

import pdb; pdb.set_trace()
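Another option that requires no source changes is to run the entire script under the debugger's control from the start:

$ flask/bin/python -m pdb run.py

When started this way pdb stops before the first line of the script executes; the c command then continues execution normally. For this bug the in-source breakpoint is more convenient, so that is what we will use.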

So we'll go ahead and temporarily insert this breakpoint in the _SignallingSession class constructor (file flask/lib/python2.7/site-packages/flask_sqlalchemy.py):

class _SignallingSession(Session):

    def __init__(self, db, autocommit=False, autoflush=False, **options):
        import pdb; pdb.set_trace() # <-- this is temporary!
        self.app = db.get_app()
        self._model_changes = {}
        Session.__init__(self, autocommit=autocommit, autoflush=autoflush,
                         extension=db.session_extensions,
                         bind=db.engine,
                         binds=db.get_binds(self.app), **options)

    # ...

So let's run the application again to see what happens:

$ ./run.py
> /home/microblog/flask/lib/python2.7/site-packages/flask_sqlalchemy.py(198)__init__()
-> self.app = db.get_app()
(Pdb)

Since there is no "Running on..." message printed we know the server didn't actually complete its start up procedures. The interruption that brought us to the pdb prompt occurred because in some part of the application someone requested the creation of our mysterious session!

The most important question that we need to answer right now is where are we in the application, because that will tell us who is requesting the creation of this session '1' we can't get rid of later. We will use the bt command (short for backtrace) to get a stack trace:

(Pdb) bt
  /home/microblog/run.py(2)<module>()
-> from app import app
  /home/microblog/app/__init__.py(44)<module>()
-> from app import views, models
  /home/microblog/app/views.py(6)<module>()
-> from forms import LoginForm, EditForm, PostForm, SearchForm
  /home/microblog/app/forms.py(4)<module>()
-> from app.models import User
  /home/microblog/app/models.py(92)<module>()
-> whooshalchemy.whoosh_index(app, Post)
  /home/microblog/flask/lib/python2.6/site-packages/flask_whooshalchemy.py(168)whoosh_index()
-> _create_index(app, model))
  /home/microblog/flask/lib/python2.6/site-packages/flask_whooshalchemy.py(199)_create_index()
-> model.query = _QueryProxy(model.query, primary_key,
  /home/microblog/flask/lib/python2.6/site-packages/flask_sqlalchemy.py(397)__get__()
-> return type.query_class(mapper, session=self.sa.session())
  /home/microblog/flask/lib/python2.6/site-packages/sqlalchemy/orm/scoping.py(54)__call__()
-> return self.registry()
  /home/microblog/flask/lib/python2.6/site-packages/sqlalchemy/util/_collections.py(852)__call__()
-> return self.registry.setdefault(key, self.createfunc())
> /home/microblog/flask/lib/python2.6/site-packages/flask_sqlalchemy.py(198)__init__()
-> self.app = db.get_app()
(Pdb)
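By the way, bt is not the only command that is useful at this prompt. Here is a quick reference of a few others (see the pdb documentation for the complete list):

(Pdb) u          # move up one stack frame
(Pdb) d          # move down one stack frame
(Pdb) p expr     # evaluate and print an expression
(Pdb) l          # list the source code around the current line
(Pdb) c          # continue execution
(Pdb) q          # quit the debugger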

As we did before, we start from the bottom and locate the first stack frame that has code that we recognize as ours. That turns out to be our models.py file at line 92, which is the initialization of our full text search engine:

whooshalchemy.whoosh_index(app, Post)

Hmm. We are not doing anything intentional that would trigger the creation of a database session this early in the application's life, but it appears that the act of initializing Flask-WhooshAlchemy does create a database session.

It feels like this isn't our bug after all, but some sort of interaction between the two Flask extensions that wrap SQLAlchemy and Whoosh. We could stop here and ask for help from the developers of these two fine extensions or the communities around them. Or we could continue debugging to see if we can figure this out and have the problem solved today. I'll keep going; if you are bored already, feel free to jump to the next section.

Let's have another look at the stack trace. We call whoosh_index(), which in turn calls _create_index(). The line of code in _create_index() is this:

model.query = _QueryProxy(model.query, primary_key,
            searcher, model)

The model variable in this context is set to our Post class, we passed it as an argument in our call to whoosh_index(). With that in mind, it appears Flask-WhooshAlchemy is creating a Post.query wrapper that takes the original Post.query as an argument, plus some other Whoosh specific stuff. And here is the interesting part. According to the stack trace above, the next function in the call stack is __get__(), one of Python's descriptor methods.

The __get__() method is used to implement descriptors, which are attributes that have behavior associated with them instead of just a value. Each time the descriptor is referenced its __get__() function is called, and that function is expected to return the value of the attribute. The only attribute mentioned in this line of code is query, so now we know that this seemingly simple attribute that we have used in the past to generate our database queries isn't really an attribute but a descriptor. The remainder of the stack trace is then dealing with computing the value of the model.query expression that appears on the right side, in preparation for calling the _QueryProxy constructor.
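If descriptors are new to you, here is a minimal standalone illustration of the protocol (a sketch, nothing microblog specific):

class LoudAttribute(object):
    # a descriptor: accessing the attribute runs __get__() instead of
    # returning a stored value
    def __get__(self, obj, objtype=None):
        print("computing the value!")
        return 42

class Example(object):
    value = LoudAttribute()   # looks like a plain class attribute...

e = Example()
print(e.value)  # ...but prints "computing the value!" and then 42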

Let's continue down the stack trace to see what happens next. The instruction in __get__() is this one:

return type.query_class(mapper, session=self.sa.session())

And this is a pretty revealing piece of code. When we say, for example, User.query.get(id) we are indirectly calling this __get__() method to provide the query object, and here we can see that this query object implicitly brings a database session along with it!

When Flask-WhooshAlchemy says model.query, that also triggers a session to be created and associated with the query object. But the query object that Flask-WhooshAlchemy requests isn't short lived like the normal queries we run inside our view functions. Flask-WhooshAlchemy is wrapping this query object into its own query object, which it then stores back into model.query. Since the descriptor has no __set__() counterpart, the new object is stored as a plain attribute. For our Post class that means that after Flask-WhooshAlchemy completes its initialization we have a descriptor and an attribute with the same name. According to the precedence rules the attribute wins, which is expected; if it didn't, our Whoosh searches would not work.
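We can see this precedence rule in action by extending the LoudAttribute sketch from above (again an illustration, not the actual extension code):

class Model(object):
    query = LoudAttribute()   # the descriptor lives on the base class

class Post(Model):
    pass

Post.query              # runs __get__(), like Flask-SQLAlchemy's query
Post.query = 'a proxy'  # stores a plain attribute on the Post class...
print(Post.query)       # ...which now shadows the inherited descriptor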

The important aspect of all this is that this code is setting a persistent attribute that inside has our session '1'. Even though the first request handled by the application will use this session and then forget about it, the session does not go away because it is still referenced by the Post.query attribute. This is our bug!

The bug is caused by the confusing (in my opinion) nature of descriptors. They look like regular attributes, so people tend to use them as such. The Flask-WhooshAlchemy developer just wanted to create an enhanced query object that can store some state useful for Whoosh queries, but didn't realize that referencing the query attribute of a model does much more than it seems, because there is hidden behavior associated with this attribute that starts a database session.

Regression Testing

For many, the most logical thing to do at this point would be to fix the Flask-WhooshAlchemy code and move on. But if we just do that, then what protects us from this or a similar bug happening in the future? For example, what happens if a year from now we decide to update Flask-WhooshAlchemy to a new version and forget that we had applied a custom fix to it?

The best option every time a bug is discovered is to create a unit test for it, so that we can make sure we don't have a regression in the future.

There is some trickiness in creating a test for this bug, as we need to simulate two requests inside a single test. The first request will query a Post object, simulating the query that we make to request data for displaying in the web page. Since this is the first query it will use the session '1'. Then we need to forget that session and make a new one, exactly like Flask-SQLAlchemy does. Trying to delete the Post object on the second session should trigger this bug, because the first session would not have gone away as expected.

After taking another peek at the source code for Flask-SQLAlchemy we can see that new sessions are created using the db.create_scoped_session() function, and when a request ends a session is destroyed by calling db.session.remove(). Knowing this makes it easy to write a test for this bug:

def test_delete_post(self):
    # create a user and a post
    u = User(nickname='john', email='john@example.com')
    p = Post(body='test post', author=u, timestamp=datetime.utcnow())
    db.session.add(u)
    db.session.add(p)
    db.session.commit()
    # query the post and destroy the session
    p = Post.query.get(1)
    db.session.remove()
    # delete the post using a new session
    db.session = db.create_scoped_session()
    db.session.delete(p)
    db.session.commit()

And sure enough, now when we run our test suite the failure appears:

$ ./tests.py
.E....
======================================================================
ERROR: test_delete_post (__main__.TestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests.py", line 133, in test_delete_post
    db.session.delete(p)
  File "/home/microblog/flask/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 114, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/home/microblog/flask/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1400, in delete
    self._attach(state)
  File "/home/microblog/flask/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1656, in _attach
    state.session_id, self.hash_key))
InvalidRequestError: Object '<Post at 0xff09b7ac>' is already attached to session '1' (this is '3')

----------------------------------------------------------------------
Ran 6 tests in 3.852s

FAILED (errors=1)

The Fix

To address this problem we need to find an alternative way of attaching Flask-WhooshAlchemy's query object to the model.

The documentation for Flask-SQLAlchemy mentions there is a model.query_class attribute that contains the class to use for queries. This is actually a much cleaner way to make Flask-SQLAlchemy use a custom query class than what Flask-WhooshAlchemy is doing. If we configure Flask-SQLAlchemy to create queries using the Whoosh enabled query class (which is already a subclass of Flask-SQLAlchemy's BaseQuery), then we should have the same result as before, but without the bug.
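In sketch form the idea looks like this (hypothetical class and method names, not the actual code of the fixed extension):

from flask_sqlalchemy import BaseQuery

class WhooshEnabledQuery(BaseQuery):  # hypothetical name
    def whoosh_search(self, query, limit=None):
        # the Whoosh specific search behavior would go here
        pass

class Post(db.Model):
    query_class = WhooshEnabledQuery  # Flask-SQLAlchemy will now build
                                      # Post.query objects from this class
    # ... columns as before ...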

I have created a fork of the Flask-WhooshAlchemy project on GitHub where I have implemented these changes. If you want to see the changes you can look at the GitHub diff for my commit, or you can also download the fixed extension and install it in place of your original flask_whooshalchemy.py file.

I have submitted my fix to the developer of Flask-WhooshAlchemy, so I hope at some point it will get included in an official version (Update: my fix has been included in version 0.56).

Test Coverage

One way we can greatly reduce the chances of having bugs after we deploy our application is to have a comprehensive test suite. We already have a unit testing framework, but how do we know how much of our application we are currently testing with it?

A test coverage tool can observe a running application and take note of which lines of code execute and which do not. After execution ends it can produce a report showing which lines did not execute. If we had such a report for our test suite we would know right away what parts of our code need tests to exercise them.

Python has a coverage tool we can use, called simply coverage, which we installed way back when we started this tutorial. This tool can be used from the command line or can be started from inside a script. To make sure we don't forget to run it, we will go with the latter.
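For reference, the command line flavor would look more or less like this (shown only for completeness; the flags mirror the options we will use in the script below):

$ flask/bin/coverage run --branch --omit='flask/*,tests.py' tests.py
$ flask/bin/coverage report -m
$ flask/bin/coverage html -d tmp/coverage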

These are the changes that we need to make to add a coverage report to our test suite (file tests.py):

from coverage import coverage
cov = coverage(branch=True, omit=['flask/*', 'tests.py'])
cov.start()

# ...

if __name__ == '__main__':
    try:
        unittest.main()
    except:
        pass
    cov.stop()
    cov.save()
    print("\n\nCoverage Report:\n")
    cov.report()
    print("HTML version: " + os.path.join(basedir, "tmp/coverage/index.html"))
    cov.html_report(directory='tmp/coverage')
    cov.erase()

We begin by initializing the coverage module at the very top of the script. The branch=True argument requests that branch analysis be done in addition to regular line based coverage. The omit argument makes sure we do not get a coverage report for the modules installed in our virtual environment or for the unit testing framework itself; we just want coverage for our application code.

To gather coverage statistics we just call cov.start(), then run our unit tests. We have to catch and pass exceptions from the unit testing framework, because otherwise the script would end without giving us a chance to produce a coverage report. After we are back from the testing we stop coverage with cov.stop() and write the results with cov.save(). Finally, cov.report() dumps the data to the console, cov.html_report() generates a nicer HTML report with the same data, and cov.erase() deletes the data file.

Here is an example test run with coverage report included (note I left an intentional failure in there):

$ ./tests.py
.....F
======================================================================
FAIL: test_translation (__main__.TestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./tests.py", line 143, in test_translation
    assert microsoft_translate(u'English', 'en', 'es') == u'Inglés'
AssertionError

----------------------------------------------------------------------
Ran 6 tests in 3.981s

FAILED (failures=1)


Coverage Report:

Name             Stmts   Miss Branch BrMiss  Cover   Missing
------------------------------------------------------------
app/__init__        39      0      6      3    93%
app/decorators       6      2      0      0    67%   5-6
app/emails          14      6      0      0    57%   9, 12-15, 21
app/forms           30     14      8      8    42%   15-16, 19-30
app/models          63      8     10      1    88%   32, 37, 47, 50, 53, 56, 78, 90
app/momentjs        12      5      0      0    58%   5, 8, 11, 14, 17
app/translate       33     24      4      3    27%   10-36, 39-56
app/views          169    124     46     46    21%   16, 20, 24-30, 34-37, 41, 45-46, 53-67, 75-81, 88-109, 113-114, 120-125, 132-143, 149-164, 169-183, 188-198, 203-205, 210-211, 218
config              22      0      0      0   100%
------------------------------------------------------------
TOTAL              388    183     74     61    47%

HTML version: /home/microblog/tmp/coverage/index.html

With this report we know that we are testing 47% of our application. And we are given the list of lines that didn't execute during the test run, so we just need to review those lines and think about what tests we can write that use them.

We can see that our coverage for app/models.py is pretty high (88%), since we have mostly focused on testing our models. The app/views.py coverage is pretty low (21%) because we aren't executing view function code in our tests yet.

In addition to lines of code that were missed, this tool gives us branch coverage information in the Branch and BrMiss columns. Consider the following example script:

def f(x):
    if x >= 0:
        x = x + 1
    return x

f(1)

If we run coverage on this simple function we will get 100% coverage, because when the function gets 1 as an argument all three of its statements execute. But we have never executed the function with an argument of 0 or less, which takes a different path through the code. In a more complex case that could hide a bug.

Branch coverage tells us how many branches of code we have missed, and this is another pointer to the kinds of tests we might be missing in our suite.

The coverage tool also generates a very nice HTML based report that displays the source code annotated with color coded markers for lines and branches covered and missed.

Continuing with our testing strategy centered in testing the models we can look at the parts of our app/models.py file that have no test coverage. This is easy to visualize in the HTML report, from where we can obtain the following list:

  • User.make_valid_nickname()
  • User.is_authenticated
  • User.is_active
  • User.is_anonymous
  • User.get_id()
  • User.__repr__()
  • Post.__repr__()
  • User.make_unique_nickname() (only for the branch case where the given name is already unique)

We can put the first five in the list in a new test:

def test_user(self):
    # make valid nicknames
    n = User.make_valid_nickname('John_123')
    assert n == 'John_123'
    n = User.make_valid_nickname('John_[123]\n')
    assert n == 'John_123'
    # create a user
    u = User(nickname='john', email='john@example.com')
    db.session.add(u)
    db.session.commit()
    assert u.is_authenticated is True
    assert u.is_active is True
    assert u.is_anonymous is False
    assert u.id == int(u.get_id())

The __repr__() functions are really only used internally, so we don't need to test them. We can mark them as not interesting to the coverage tool as follows:

def __repr__(self):  # pragma: no cover
    return '<User %r>' % (self.nickname)
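If many functions need this treatment, an alternative to sprinkling pragma comments everywhere is to list exclusion patterns in a .coveragerc configuration file (a sketch; note that providing this setting replaces the default, so the pragma regex must be repeated):

# .coveragerc
[report]
exclude_lines =
    pragma: no cover
    def __repr__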

Finally, back when we wrote the test for make_unique_nickname() we focused only on how the function resolves a name collision, but we forgot to provide a test case for a name that is already unique and requires no handling. We can expand our existing test to cover that case as well:

def test_make_unique_nickname(self):
    # create a user and write it to the database
    u = User(nickname='john', email='john@example.com')
    db.session.add(u)
    db.session.commit()
    nickname = User.make_unique_nickname('susan')
    assert nickname == 'susan'
    nickname = User.make_unique_nickname('john')
    assert nickname != 'john'
    #...

And with these simple changes we get to almost 100% coverage on our models.py source file. When running on Python 2.7 there are only two Python 3 specific lines of code that are missed.

For now we'll call it good. Some other day we may decide to pick this up again and figure out a good way to test view functions, but we should feel good that we have full coverage on the code that talks to the database.

Profiling for performance

The next topic of the day is performance. There is nothing more frustrating for users than having to wait a long time for pages to load. We want to make sure our application is as fast as it can be so we need to take some measures to be prepared to analyze performance problems.

The technique that we are going to use is called profiling. A code profiler watches a running program, pretty much like coverage tools do, but instead of noting which lines execute and which don't it notes how much time is spent on each function. At the end of the profiling period a report is generated that lists all the functions that executed and how long was spent in each. Sorting this list from the largest to the smallest time will give us a pretty good idea of where we should spend time optimizing code.

Python comes with a nice source code profiler called cProfile. We could embed this profiler into our application directly, but before we do any work it is a good idea to search if someone has done the integration work already. A quick search for "Flask profiler" tells us that the Werkzeug module used by Flask comes with a profiler plugin that is all ready to go, so we'll just use that.
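Just so you know what the manual route would look like, embedding cProfile directly goes roughly like this (a sketch, not what we will use):

import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()
# ... run the code to measure here ...
profiler.disable()

stats = pstats.Stats(profiler)
stats.sort_stats('tottime')  # sort by time spent inside each function
stats.print_stats(30)        # print the top 30 entries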

To enable the Werkzeug profiler we can create another starter script like run.py. Let's call it profile.py:

#!flask/bin/python
from werkzeug.contrib.profiler import ProfilerMiddleware
from app import app

app.config['PROFILE'] = True
app.wsgi_app = ProfilerMiddleware(app.wsgi_app, restrictions=[30])
app.run(debug=True)

Starting the application with the above script will enable the profiler to show the 30 most expensive functions for each request (the syntax for the restrictions argument is documented here).

Once the application starts, each request will show a profiler summary. Here is an example:

--------------------------------------------------------------------------------
PATH: '/'
         95477 function calls (89364 primitive calls) in 0.202 seconds

   Ordered by: internal time, call count
   List reduced from 1587 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.061    0.061    0.061    0.061 {method 'commit' of 'sqlite3.Connection' objects}
        1    0.013    0.013    0.018    0.018 flask/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/pysqlite.py:278(dbapi)
    16807    0.006    0.000    0.006    0.000 {isinstance}
     5053    0.006    0.000    0.012    0.000 flask/lib/python2.7/site-packages/jinja2/nodes.py:163(iter_child_nodes)
8746/8733    0.005    0.000    0.005    0.000 {getattr}
      817    0.004    0.000    0.011    0.000 flask/lib/python2.7/site-packages/jinja2/lexer.py:548(tokeniter)
        1    0.004    0.004    0.004    0.004 /usr/lib/python2.7/sqlite3/dbapi2.py:24(<module>)
        4    0.004    0.001    0.015    0.004 {__import__}
        1    0.004    0.004    0.009    0.009 flask/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/__init__.py:7(<module>)
   1808/8    0.003    0.000    0.033    0.004 flask/lib/python2.7/site-packages/jinja2/visitor.py:34(visit)
     9013    0.003    0.000    0.005    0.000 flask/lib/python2.7/site-packages/jinja2/nodes.py:147(iter_fields)
     2822    0.003    0.000    0.003    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
      738    0.003    0.000    0.003    0.000 {method 'split' of 'str' objects}
     1808    0.003    0.000    0.006    0.000 flask/lib/python2.7/site-packages/jinja2/visitor.py:26(get_visitor)
     2862    0.003    0.000    0.003    0.000 {method 'append' of 'list' objects}
  110/106    0.002    0.000    0.008    0.000 flask/lib/python2.7/site-packages/jinja2/parser.py:544(parse_primary)
       11    0.002    0.000    0.002    0.000 {posix.stat}
        5    0.002    0.000    0.010    0.002 flask/lib/python2.7/site-packages/sqlalchemy/engine/base.py:1549(_execute_clauseelement)
        1    0.002    0.002    0.004    0.004 flask/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/base.py:124(<module>)
  1229/36    0.002    0.000    0.008    0.000 flask/lib/python2.7/site-packages/jinja2/nodes.py:183(find_all)
    416/4    0.002    0.000    0.006    0.002 flask/lib/python2.7/site-packages/jinja2/visitor.py:58(generic_visit)
   101/10    0.002    0.000    0.003    0.000 flask/lib/python2.7/sre_compile.py:32(_compile)
       15    0.002    0.000    0.003    0.000 flask/lib/python2.7/site-packages/sqlalchemy/schema.py:1094(_make_proxy)
        8    0.002    0.000    0.002    0.000 {method 'execute' of 'sqlite3.Cursor' objects}
        1    0.002    0.002    0.002    0.002 flask/lib/python2.7/encodings/base64_codec.py:8(<module>)
        2    0.002    0.001    0.002    0.001 {method 'close' of 'sqlite3.Connection' objects}
        1    0.001    0.001    0.001    0.001 flask/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/pysqlite.py:215(<module>)
        2    0.001    0.001    0.002    0.001 flask/lib/python2.7/site-packages/wtforms/form.py:162(__call__)
      980    0.001    0.000    0.001    0.000 {id}
  936/127    0.001    0.000    0.008    0.000 flask/lib/python2.7/site-packages/jinja2/visitor.py:41(generic_visit)


--------------------------------------------------------------------------------

127.0.0.1 - - [09/Mar/2013 19:35:49] "GET / HTTP/1.1" 200 -

The columns in this report are as follows:

  • ncalls: number of times this function was called.
  • tottime: total time spent inside this function.
  • percall: this is tottime divided by ncalls.
  • cumtime: total time spent inside this function and any functions called from it.
  • percall: cumtime divided by ncalls.
  • filename:lineno(function): the function name and location.

It is interesting to note that our templates will also appear here as functions. This is because Jinja2 templates are compiled to Python code. That means that the profiler will not only point us to slow code we have written but also to slow templates!

We really don't have a performance problem at this point, at least not in this one request. We can see that the most time consuming functions are related to the sqlite3 database and Jinja2 template rendering, which is totally expected. Note in the header at the top that this request took just 0.2 seconds to complete, so the scale of these per-function times is so small that it is mostly noise.

As the application grows, it is useful to run new requests we add through the profiler, to make sure we are not doing anything that is slow.

Database Performance

To end this article, we are going to look at database performance. We've seen above that our database handling is at the top of the profiler report, so it would be good to have a system in place to get an alert when and if our database becomes too slow during production use.

The Flask-SQLAlchemy documentation mentions a get_debug_queries function, which returns a list of all the queries issued during the request with their durations.

This is very useful information. The intended use appears to be to time queries during debugging or testing, but being able to send an alert when a query takes too long is useful as a feature for the production version, even if it adds a little bit of overhead.

To use this feature during production we need to enable it in the configuration (file config.py):

SQLALCHEMY_RECORD_QUERIES = True

We are also going to setup a threshold for what we will consider a slow query (file config.py):

# slow database query threshold (in seconds)
DATABASE_QUERY_TIMEOUT = 0.5

To check if we need to send any alerts we'll add a hook after each request. In Flask this is easy: we just set up an after_request handler (file app/views.py):

from flask_sqlalchemy import get_debug_queries
from config import DATABASE_QUERY_TIMEOUT

@app.after_request
def after_request(response):
    for query in get_debug_queries():
        if query.duration >= DATABASE_QUERY_TIMEOUT:
            app.logger.warning(
                "SLOW QUERY: %s\nParameters: %s\nDuration: %fs\nContext: %s\n" %
                (query.statement, query.parameters, query.duration,
                 query.context))
    return response

This will write to the log any queries that took longer than half a second. The information in the log will include the SQL statement, the actual parameters used in that statement, the duration and the location in the sources from where the query was issued.

With a small database like ours it is unlikely that we will have slow queries, but as the database and the application grow we may find that some database queries need to be optimized, for example with the addition of indexes. By simply checking the log file from time to time we will learn if some of the queries we are issuing need optimization.
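As an example of the kind of optimization this might lead to, indexing a column with Flask-SQLAlchemy is a one-argument change to the model (a sketch only; the actual decision should follow from the slow queries found in the log):

class Post(db.Model):
    # ... other columns stay the same ...
    # index=True tells the database to create an index on this column,
    # which speeds up queries that filter or sort on the timestamp
    timestamp = db.Column(db.DateTime, index=True)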

Final words

Today we have done a number of mundane, yet important things all robust applications must have. The code for the updated application can be downloaded below:

Download microblog-0.16.zip.

Users comfortable with github can also get the code from here.

It feels like we are reaching the end of this tutorial series, as I'm running out of ideas for what to present. In the next, possibly last chapter we will look at deployment options, both traditional and cloud based.

If you have any topics that you feel have not been given coverage in this tutorial please let me know in the comments below.

I hope to see you next time!

Miguel

68 comments
  • #51 muatik said

    Hi Miguel,

    I have created a small tool to analyze flask endpoints. This may help those who wants to see which endpoints are the bottleneck etc. for details: https://github.com/fengsp/flask-profile I am writing here because maybe help others.

  • #52 muatik said

    the url mentioned in the #51th comment is wrong. the right one is https://github.com/muatik/flask-profiler/

  • #53 Neil said

    Hey Miguel,

    Great tutorial, as always! I don't think I've ever seen a tutorial focused on profiling/proper testing; this is really helpful. I noticed that if I put the lines importing and initializing coverage at the bottom of my imports (i.e., just above the class declaration for TestCase), the only module that gets tested is app.models. If I place it at the top (like in your tests.py), I get the same output as you. Why is that?

  • #54 Miguel Grinberg said

@Neil: the import statements do actually execute code, so coverage needs to be active before they run for those files to be seen.

  • #55 Jeff said

    I just found out that the debug mode is not working properly. Inside run.py, we have

    from app import app
    app.run(debug=True)

but that does not set app.debug to True. I tried printing app.debug inside init.py and I'm getting False. This results in the application logging to microblog.log, which should only happen during production. Why is this so? I'm thinking the one in run.py just sets Flask to debug mode but not our application. I did an app.debug = True inside init.py, and it stopped logging to microblog.log.

  • #56 Miguel Grinberg said

    @Jeff: isn't the init.py in which you are printing executing before app.run gets a chance to run? Try printing app.debug in one of your request route handlers.

  • #57 Jeff said

    @Miguel Yes, you're right, I totally missed that out. Sorry for the confusion.

  • #58 Peter Holliday said

    Hey Miguel. Excellent tutorial, it has helped me no end. However, I have adapted the database URI to work for SQL Server, and I have an issue when running a db_migrate. If I am just adding or removing a class, it works perfectly. However, if I add a new column to a class I get the following error:

    C:\path\to\file\microblog>python db_migrate.py
C:\path\to\file\microblog\env\lib\site-packages\flask_sqlalchemy\__init__.py:800: UserWarning: SQLALCHEMY_TRACK_MODIFICATIONS adds significant overhead and will be disabled by default in the future. Set it to True to suppress this warning.
    warnings.warn('SQLALCHEMY_TRACK_MODIFICATIONS adds significant overhead and will be disabled by default in the future. Set it to True to suppress this warning.')
    Traceback (most recent call last):
    File "db_migrate.py", line 14, in <module>
    api.upgrade(SQLALCHEMY_DATABASE_URI, SQLALCHEMY_MIGRATE_REPO)
    File "C:\path\to\file\microblog\env\lib\site-packages\migrate\versioning\api.py", line 186, in upgrade
return _migrate(url, repository, version, upgrade=True, err=err, **opts)
File "<decorator-gen-15>", line 2, in _migrate
File "C:\path\to\file\microblog\env\lib\site-packages\migrate\versioning\util\__init__.py", line 160, in with_engine
return f(*a, **kw)
    File "C:\path\to\file\microblog\env\lib\site-packages\migrate\versioning\api.py", line 366, in _migrate
    schema.runchange(ver, change, changeset.step)
    File "C:\path\to\file\microblog\env\lib\site-packages\migrate\versioning\schema.py", line 93, in runchange
    change.run(self.engine, step)
    File "C:\path\to\file\microblog\env\lib\site-packages\migrate\versioning\script\py.py", line 148, in run
    script_func(engine)
    File "C:\path\to\file\microblog\db_repository\versions\007_migration.py", line 21, in upgrade
    post_meta.tables['os_version'].columns['new_col'].create()
    File "C:\path\to\file\microblog\env\lib\site-packages\migrate\changeset\schema.py", line 570, in create
    visitorcallable = get_engine_visitor(engine, 'columngenerator')
    File "C:\path\to\file\microblog\env\lib\site-packages\migrate\changeset\databases\visitor.py", line 47, in get_engine_visitor
    return get_dialect_visitor(engine.dialect, name)
    File "C:\path\to\file\microblog\env\lib\site-packages\migrate\changeset\databases\visitor.py", line 62, in get_dialect_visitor
    migrate_dialect_cls = DIALECTS[sa_dialect_name]
    KeyError: 'mssql'

    Here is the URI I construct:

    SQLALCHEMY_DATABASE_URI = 'mssql+pymssql://' + uname + ':' + pword + '@' + svr + '/' + dbname

    Thanks in anticipation.

  • #59 Miguel Grinberg said

@Peter: Looking through the sqlalchemy-migrate project I do not see that they support mssql. You may have better luck with Alembic, which you can use via my Flask-Migrate extension.

  • #60 Chuck said

    First of all, thank you so much for this tutorial. Python is quite a bit to grasp. I started a project recently that utilizes a postgresql DB and the API was written in python. Ive had zero experience with both, and I thought I was going to lose my mind. After this tutorial, I am absolutely in love with Python.

    I do have an issue in this tutorial I would love your help with because its driving me insane. When I go to a User profile and use the pagination links, a javascript error is thrown "Error: 'moment' is undefined". This only occurs in the User profile, and not the index.html. Whats even stranger is that when the User profile 1st loads, the moment function executes just fine, as all the posts have dates and times.

    Perhaps even stranger than that...I called the alert() function in the header of base.html, and its set to alert(moment()); I get a message pop with the date time on every page load, except when I use the pagination of the User profile. Once the pagination is clicked, even the line of code for the alert throws the "Error: 'moment' is undefined" error.

    Its almost like the moment library is disappearing on that one page. Ive tried including the script library on the User.html & post.html pages just hoping that maybe inserting it at that level would bypass what ever is causing it to disappear, but had no luck. Im not sure what to try next.

    I understand its been awhile since you've wrote this tutorial, and I don't expect you to spend a ton of time helping me...However, I am hoping maybe youll have some suggestions, or perhaps even the answer, as soon as you've read the issue. Its driving me crazy!

    Thanks,

    Chuck

  • #61 Miguel Grinberg said

    @Chuck: Did you look at the HTML source code for the page that returns the error? It must have something different in the place where the moment library is imported.

  • #62 Chuck said

    @Miguel I ran the internet explorer debugger while navigating the site. The base.html remains the same as I visit every page, js script is included on each page, even the page throwing the error. The first 3 posts that load when the user.html is called loads just fine. The error is only thrown when using paginate. If I paginate to older posts, the error is thrown, if I paginate back to the 3 newest posts, which loaded fine during the User.html initial page load, I get an error...After I receive the errors, I can click the "Your Profile" link and the page loads just fine with the 3 newest posts.

    Considering I can use the pagination on the index page, Id think the error would narrow down to either the user.html or view function that handles the User page, but I've compared my code with yours and there is no difference.

    There are only 2 things I can think of that differ between our projects:
    1) I wrote my project using Visual Studio Community 2017
    2) I had to set the server to threaded=true once I got the login implemented. It was taking several minutes for pages to load before I did that. However, I have reproduced the same error even with this turned off. Also discovered, just now, that my project is running good with threaded=false, now. Not sure why the slow load times were occurring in the first place, or why the issue seems to be resolved now.

    Maybe its somehow tied to VS 2017? I am completely at a loss. There's got to be something I'm missing, but I think I have to let this one go for my own sanity. haha

    Thanks,
    Chuck

  • #63 Miguel Grinberg said

    @Chuck: it is not clear to me if you are using my source files, or you wrote your own. It might be worth downloading my code and trying that to see if it makes any difference. The way you describe the problem makes me think it is some sort of a race condition, maybe the reference to moment happens before the moment library is loaded?

  • #64 Nikhil said

    Hey miguel,
    I have scenario where I perform load testing using locust on a flask project. After some time testing I seem to get status as 0 in locust but the status returned by source code is 200. I'm thinking its because of connection timeout from the server. Could you give me some thoughts into this ?

  • #65 Miguel Grinberg said

    @Nikhil: I can't really comment on an issue without sufficient context. I can't think of any reason why the status would become 0.

  • #66 Dexter said

Newer versions of Werkzeug do not have this profiler. Thankfully the old version is still available. To install the older version, you first need to uninstall the newer one. To do that, just run the following commands in your project

    python3 -m pip uninstall werkzeug
    and then
    python3 -m pip install werkzeug==0.16.0

    I took these instructions from https://stackoverflow.com/questions/60624139/when-i-do-flask-run-it-shows-error-modulenotfounderror-no-module-named-werk.

  • #67 lysdexia said

    https://github.com/Azure-Samples/ms-identity-python-webapp/issues/16

    Looks like werkzeug.contrib has been removed from werkzeug 1.0

    $ ./profile.py
    Traceback (most recent call last):
      File "./profile.py", line 3, in <module>
        from werkzeug.contrib.profiler import ProfilerMiddleware
    ModuleNotFoundError: No module named 'werkzeug.contrib'
    
  • #68 Miguel Grinberg said

    @lysdexia: and you are reading a version of this tutorial that is several years old. It's still on the blog for reference only, but note the link at the top that sends you to the updated version.
