The Flask Mega-Tutorial, Part XIV: I18n and L10n (2012)


(Great news! There is a new version of this tutorial!)

This is the fourteenth article in the series in which I document my experience writing web applications in Python using the Flask microframework.

The goal of the tutorial series is to develop a decently featured microblogging application that, demonstrating my total lack of originality, I have decided to call microblog.

NOTE: This article was revised in September 2014 to be in sync with current versions of Python and Flask.


The topics of today's article are Internationalization and Localization, commonly abbreviated I18n and L10n. We want our microblog application to be used by as many people as possible, so we can't forget that there are a lot of people out there who can't speak English, or who can, but prefer to speak their native language.

To make our application accessible to foreign visitors we are going to use the Flask-Babel extension, which provides an extremely easy to use framework to translate the application into different languages.


To work with translations we are going to use a package called Babel, along with its Flask extension Flask-Babel. Flask-Babel is initialized simply by creating an instance of class Babel and passing the Flask application to it (file app/

from flask_babel import Babel
babel = Babel(app)

We also need to decide which languages we will offer translations for. For now we'll start with a Spanish version, since we have a translator at hand (yours truly), but it'll be easy to add more languages later. The list of supported languages will go to our configuration (file

# -*- coding: utf-8 -*-
# ...
# available languages
LANGUAGES = {
    'en': 'English',
    'es': 'Español'
}

The LANGUAGES dictionary has keys that are the language codes for the available languages, and values that are the printable name of the language. We are using the short language codes here, but when necessary the long codes that specify language and region can be used as well. For example if we wanted to support American and British English as separate languages we would have 'en-US' and 'en-GB' in our dictionary.

Note that because the word Español has a foreign character in it we have to add the coding comment at the top of the Python source file, to tell the Python interpreter that we are using the UTF-8 encoding and not ASCII, which lacks the ñ character.

The next piece of configuration that we need is a function that Babel will use to decide which language to use (file app/

from flask import request
from app import babel
from config import LANGUAGES

@babel.localeselector
def get_locale():
    return request.accept_languages.best_match(LANGUAGES.keys())

The function that is marked with the localeselector decorator will be called before each request to give us a chance to choose the language to use when producing its response. For now we will do something very simple: we'll just read the Accept-Language header sent by the browser in the HTTP request and find the best matching language from the list that we support. This is actually pretty easy, because the best_match method does all the work for us.

The Accept-Language header in most browsers is configured by default with the language selected at the operating system level, but all browsers give users the chance to select other languages. Users can provide a list of languages, each with a weight. As an example, here is a complex Accept-Language header:

Accept-Language: da, en-gb;q=0.8, en;q=0.7

This says that Danish is the preferred language (with default weight = 1.0), followed by British English (weight = 0.8) and as a last option generic English (weight = 0.7).
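
To make the weights more concrete, here is a rough sketch of how such a header could be parsed and matched using only the Python standard library. This is not how best_match is implemented; the function names parse_accept_language and pick_language are made up for this illustration, and the matching is a plain exact comparison of language codes.

```python
def parse_accept_language(header):
    """Return (language, weight) pairs sorted from most to least preferred."""
    choices = []
    for item in header.split(','):
        item = item.strip()
        if not item:
            continue
        parts = item.split(';')
        lang = parts[0].strip().lower()
        weight = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition('=')
            if name.strip() == 'q':
                try:
                    weight = float(value)
                except ValueError:
                    weight = 0.0  # treat a malformed weight as "not wanted"
        choices.append((lang, weight))
    # sorted() is stable, so equal weights keep the order used in the header
    return sorted(choices, key=lambda pair: pair[1], reverse=True)


def pick_language(header, supported):
    """Return the most preferred language that we support, or None."""
    supported = [code.lower() for code in supported]
    for lang, weight in parse_accept_language(header):
        if weight > 0 and lang in supported:
            return lang
    return None


header = 'da, en-gb;q=0.8, en;q=0.7'
print(parse_accept_language(header))
print(pick_language(header, ['en', 'es']))
```

With the header from the example, neither Danish nor British English is in our list of supported languages, so the sketch ends up choosing generic English.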

The final piece of configuration that we need is a Babel configuration file that tells Babel where to look for texts to translate in our code and templates (file babel.cfg):

[python: **.py]
[jinja2: **/templates/**.html]

The first two lines give Babel the filename patterns for our Python and template files respectively. The third line tells Babel to enable a couple of extensions that make it possible to find text to translate in Jinja2 templates.

Marking texts to translate

Now comes the most tedious aspect of this task. We need to review all our code and templates and mark all English texts that need translating so that Babel can find them. For example, take a look at this snippet from function after_login:

if is None or == "":
    flash('Invalid login. Please try again.')

Here we have a flash message that we want to translate. To expose this text to Babel we just wrap the string in Babel's gettext function:

from flask_babel import gettext
# ...
if is None or == "":
    flash(gettext('Invalid login. Please try again.'))

In a template we have to do something similar, but we have the option to use _() as a shorter alias to gettext(). For example, the word Home in this link from our base template:

  <li><a href="{{ url_for('index') }}">Home</a></li>

can be marked for translation as follows:

  <li><a href="{{ url_for('index') }}">{{ _('Home') }}</a></li>

Unfortunately not all texts that we want to translate are as simple as the above. As an example of a tricky one, consider the following snippet from our post.html subtemplate:

<p><a href="{{ url_for('user', }}">{{ }}</a> said {{ momentjs(post.timestamp).fromNow() }}:</p>

Here the sentence that we want to translate has this structure: "<nickname> said <when>:". One would be tempted to just mark the word "said" for translation, but we can't really be sure that the order of the name and the time components in this sentence will be the same in all languages. The correct thing to do here is to mark the entire sentence for translation using placeholders for the name and the time, so that a translator can change the order if necessary. To complicate matters more, the name component has a hyperlink embedded in it!

There isn't really a nice and easy way to handle cases like this. The gettext function supports placeholders using the syntax %(name)s and that's the best we can do. Here is a simple example of a placeholder in a much simpler situation:

gettext('Hello, %(name)s', name=user.nickname)

The translator will need to be aware that there are placeholders and that they should not be touched. Clearly the name of a placeholder (what's between the %( and )s) must not be translated or else the connection to the actual value would be lost.
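
The reason named placeholders matter can be seen with plain Python string formatting. The sketch below does not use Babel at all; the translate helper and the catalog_es dictionary are hypothetical stand-ins, and the Spanish wording is only there to show a translation that moves the placeholders around.

```python
# Hypothetical catalog that maps an English source text to its translation.
# Note that the translation is free to reorder the placeholders.
catalog_es = {
    '%(nickname)s said %(when)s:': '%(when)s, %(nickname)s dijo:',
}


def translate(text, catalog, **values):
    """Look up text in the catalog, then fill in the named placeholders."""
    template = catalog.get(text, text)  # fall back to the source text
    return template % values


print(translate('%(nickname)s said %(when)s:', {},
                nickname='susan', when='2 hours ago'))
print(translate('%(nickname)s said %(when)s:', catalog_es,
                nickname='susan', when='hace 2 horas'))
```

Because the values are attached by name and not by position, the second call still puts each value in the right place even though the sentence order changed.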

But back to our post template example, here is how it is marked for translation:

{% autoescape false %}
<p>{{ _('%(nickname)s said %(when)s:', nickname = '<a href="%s">%s</a>' % (url_for('user',,, when=momentjs(post.timestamp).fromNow()) }}</p>
{% endautoescape %}

The text that the translator will see for the above example is:

%(nickname)s said %(when)s:

Which is pretty decent. The values for nickname and when are what give this translatable sentence its complexity, but these are given as additional arguments to the _() wrapper function and are not seen by the translator.

The nickname and when placeholders contain a lot of stuff in them. In particular, for the nickname we had to build an entire HTML link because we want this nickname to be clickable.

Because we are putting HTML in the nickname placeholder we need to turn off autoescaping to render this portion of the template, otherwise Jinja2 would render our HTML elements as escaped text. But rendering a string without escaping is considered a security risk, because it is unsafe to render texts entered by users without escaping them.

The text assigned to the when placeholder is safe because it is text that is entirely generated by our momentjs() wrapper function. What goes in the nickname argument, however, is coming from the nickname field of our User model, which in turn comes from our database, which can be entered by the user in a web form. If someone registers into our application with a nickname that contains embedded HTML or Javascript and then we render that malicious nickname unescaped, then we are effectively opening the door to an attacker. We certainly do not want that, so we are going to take a quick detour and remove any risks.
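
To see what is at stake, here is a small illustration of what escaping does to a hostile nickname. It uses the html module from the Python 3 standard library and not Jinja2's own escaping, and the nickname_link helper is a hypothetical function written only for this example.

```python
import html

# A nickname that an attacker might try to register with.
nickname = '<script>alert("hi")</script>'

# Escaping turns the markup characters into harmless HTML entities.
escaped = html.escape(nickname)
print(escaped)


def nickname_link(url, nickname):
    """Build the link by hand, escaping every value that goes into it."""
    return '<a href="%s">%s</a>' % (html.escape(url), html.escape(nickname))


print(nickname_link('/user/attacker', nickname))
```

Rendered unescaped, the original nickname would run as a script in the visitor's browser, while the escaped version is displayed as ordinary text.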

The solution that makes most sense is to prevent any attacks by restricting the characters that can be used in a nickname. We'll start by creating a function that converts an invalid nickname into a valid one (file app/

import re

class User(db.Model):
    # ...
    @staticmethod
    def make_valid_nickname(nickname):
        # strip every character that is not a letter, digit, dot or underscore
        return re.sub(r'[^a-zA-Z0-9_.]', '', nickname)

Here we just take the nickname and remove any characters that are not letters, numbers, the dot or the underscore.
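
As a quick check of what this regular expression does, here is the same substitution as a standalone function, outside of the User model, applied to a few sample nicknames. The sample values are made up for this illustration.

```python
import re


def make_valid_nickname(nickname):
    # keep only ASCII letters, digits, the dot and the underscore
    return re.sub(r'[^a-zA-Z0-9_.]', '', nickname)


print(make_valid_nickname('john.doe_99'))      # already valid, left untouched
print(make_valid_nickname('<b>john doe</b>'))  # markup characters are removed
print(make_valid_nickname('Español'))          # accented letters are removed too
```

Note that the pattern only accepts ASCII letters, so a nickname such as Español loses its ñ. That is a trade-off of this simple approach.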

When a user registers with the site we receive his or her nickname from the OpenID provider, so we make sure we convert this nickname to something valid (file app/

def after_login(resp):
    nickname = User.make_valid_nickname(nickname)
    nickname = User.make_unique_nickname(nickname)
    user = User(nickname=nickname,

And then in the Edit Profile form, where the user can change the nickname, we can enhance our validation to not allow invalid characters (file app/

class EditForm(Form):
    def validate(self):
        if not Form.validate(self):
            return False
        if == self.original_nickname:
            return True
        if != User.make_valid_nickname(
            self.nickname.errors.append(gettext('This nickname has invalid characters. Please use letters, numbers, dots and underscores only.'))
            return False
        user = User.query.filter_by(
        if user is not None:
            self.nickname.errors.append(gettext('This nickname is already in use. Please choose another one.'))
            return False
        return True

With these simple measures we eliminate any possible attacks resulting from rendering the nickname to a page without escaping.

Extracting texts for translation

I'm not going to enumerate here all the changes required to mark all texts in the code and the templates. Interested readers can check the github diffs for this.

So let's assume we've found all the texts and wrapped them in gettext() or _() calls. What now?

Now we run pybabel to extract the texts into a separate file:

flask/bin/pybabel extract -F babel.cfg -o messages.pot app

Windows users should use this command instead:

flask\Scripts\pybabel extract -F babel.cfg -o messages.pot app

The pybabel extract command reads the given configuration file, then scans all the code and template files in the directories given as arguments (just app in our case) and when it finds a text marked for translation it copies it to the messages.pot file.

The messages.pot file is a template file that contains all the texts that need to be translated. This file is used as a model to generate language files.

Generating a language catalog

The next step in the process is to create a translation for a new language. We said we were going to do Spanish (language code es), so this is the command that adds Spanish to our application:

flask/bin/pybabel init -i messages.pot -d app/translations -l es

The pybabel init command takes the .pot file as input and writes a new language catalog to the directory given in the -d command line option for the language specified in the -l option. By default, Babel expects the translations to be in a translations folder at the same level as the templates, so that's where we'll put them.

After running the above command a directory app/translations/es is created. Inside it there is yet another directory called LC_MESSAGES and inside it there is a file called messages.po. The command can be executed multiple times with different language codes to add support for other languages.

The messages.po file that is created in each language folder uses a format that is the de facto standard for language translations, the format used by the venerable gettext utility. There are many translation applications that work with .po files. For our translation needs we will use poedit, because it is one of the most popular and because it runs on all the major operating systems.

If you want to put on your translator hat and give this task a try, go ahead and install poedit from this link. The usage of this application is straightforward. Below is a screenshot after all the texts have been translated to Spanish:


The top section shows the texts in their original and translated languages. The bottom left has a box where the translator writes the text.

Once the texts have been translated and saved back to the messages.po file there is yet another step to publish these texts:

flask/bin/pybabel compile -d app/translations

The pybabel compile step just reads the contents of the .po file and writes a compiled version as a .mo file in the same directory. This file contains the translated texts in an optimized format that can be efficiently used by our application.

The translations are now ready to be used. To check them you can modify the language settings in your browser so that Spanish is the preferred language, or if you don't feel like messing with your browser configuration you can also fake it by temporarily changing the localeselector function to always request Spanish (file app/

def get_locale():
    return 'es'  # request.accept_languages.best_match(LANGUAGES.keys())

Now when we run the server, each time the gettext() or _() functions are called we will get, instead of the English text, the translation defined for the language returned by the localeselector function.

Updating the translations

What happens if we leave the messages.po file incomplete, with some of the texts missing a translation? Nothing happens, the application will run just fine regardless, and those texts that don't have a translation will continue to appear in English.
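
The same fall-back idea can be observed with the gettext module from the Python standard library, which works with the same catalog format. The sketch below is only an illustration of the fall-back behavior, not of what Flask-Babel does internally: it asks for a Spanish catalog in an empty temporary directory, so no translation can be found.

```python
import gettext
import tempfile

# Look for a Spanish "messages" catalog in a directory that has none.
# With fallback=True we get a do-nothing translator instead of an error.
with tempfile.TemporaryDirectory() as empty_dir:
    translations = gettext.translation(
        'messages', localedir=empty_dir, languages=['es'], fallback=True)

print(type(translations).__name__)
print(translations.gettext('Home'))  # no translation found, text is unchanged
```

The text comes back exactly as it was written in the source, which is why untranslated texts simply show up in English.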

What happens if we missed some of the English texts in our code or templates? Any strings that were not wrapped with gettext() or _() will not be in the translation files, so they'll be unknown to Babel and remain in English. Once we spot a missing text we can add the gettext() wrapper to it and then run the following pybabel commands to update the translation files:

flask/bin/pybabel extract -F babel.cfg -o messages.pot app
flask/bin/pybabel update -i messages.pot -d app/translations

The extract command is identical to the one we issued earlier, it just generates an updated messages.pot file that adds the new texts. The update call takes the new messages.pot file and merges the new texts into all the translations found in the folder given by the -d argument.

Once the messages.po files in each language folder have been updated we can run poedit again to enter translations for the new texts, and then repeat the pybabel compile command to make those new texts available to our application.

Translating moment.js

So now that we have entered a Spanish translation for all the texts in code and templates we can run the application to see how it looks.

And right there we'll notice that all the timestamps are still in English. The moment.js library that we are using to render our dates and times hasn't been informed that we need a different language.

Reading the moment.js documentation we find that there is a large list of translations available, and that we simply need to load a second JavaScript file with the selected language to switch to that language. So we just download the Spanish translation from the moment.js website and put it in our static/js folder as moment-es.min.js. We will follow the convention that any language file for moment.js will be added with the format moment-<language>.min.js, so that we can then select the correct one dynamically.

To be able to load a JavaScript file that has the language code in its name we need to expose this code to the templates. The simplest way to do that is to add the language code to Flask's g global, in a similar way to how we expose the logged in user (file app/

def before_request():
    g.user = current_user
    if g.user.is_authenticated:
        g.user.last_seen = datetime.utcnow()
        g.search_form = SearchForm()
    g.locale = get_locale()

And now that we can see the language code in the templates we can load the moment.js language script in our base template (file app/templates/base.html):

{% if g.locale != 'en' %}
<script src="/static/js/moment-{{ g.locale }}.min.js"></script>
{% endif %}

Note that we make it conditional, because if we are showing the English version of the site we already have the correct texts from the first moment.js javascript file.

Lazy evaluation

While we continue playing with the Spanish version of our site we notice another problem. When we log out and try to log back in there is a flash message that reads "Please log in to access this page." in English. But where is this message? Unfortunately we aren't putting out this message, it is the Flask-Login extension that does it on its own.

Flask-Login allows this message to be configured by the user, so we are going to take advantage of that not to change the message but to make it translatable. So in our first try we do this (file app/

from flask_babel import gettext
lm.login_message = gettext('Please log in to access this page.')

But this really does not work. The gettext function needs to be used in the context of a request to be able to produce translated messages. If we call it outside of a request it will just give us the default text, which will be the English version.

For cases like this Flask-Babel provides another function called lazy_gettext, which doesn't look for a translation immediately like gettext() and _() do, but instead delays the search for a translation until the string is actually used. So here is how to properly set up the login message (file app/

from flask_babel import lazy_gettext
lm.login_message = lazy_gettext('Please log in to access this page.')
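
The difference between the eager and the lazy lookup can be shown with a toy example. Nothing below comes from Flask-Babel: the LazyText class, the lookup function, the CATALOG dictionary and the current_language variable (a stand-in for the locale of the current request) are all invented for this sketch.

```python
# -*- coding: utf-8 -*-
current_language = 'en'  # stand-in for the locale of the current request

# Toy catalog with a single Spanish translation.
CATALOG = {
    'es': {
        'Please log in to access this page.':
            'Por favor, inicie sesión para ver esta página.',
    },
}


def lookup(text):
    """Translate right now, using whatever language is active at this moment."""
    return CATALOG.get(current_language, {}).get(text, text)


class LazyText(object):
    """Remember the text and only translate it when it is turned into a string."""
    def __init__(self, text):
        self.text = text

    def __str__(self):
        return lookup(self.text)


# Both are created at "import time", before any language has been selected.
eager = lookup('Please log in to access this page.')
lazy = LazyText('Please log in to access this page.')

# Later on, a request from a Spanish speaking user comes in.
current_language = 'es'
print(eager)      # still English, the lookup already happened
print(str(lazy))  # Spanish, the lookup happens now
```

The eager version was translated too early and is stuck in English, while the lazy one waits until it is displayed.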

When using lazy_gettext the pybabel extract command needs to be told, with the -k option, that the lazy_gettext function also wraps translatable texts:

flask/bin/pybabel extract -F babel.cfg -k lazy_gettext -o messages.pot app

So after extracting yet another messages.pot template we update the language catalogs (pybabel update), translate the added text (poedit) and finally compile the translations one more time (pybabel compile).

With the advent of Flask 0.10 user sessions are serialized to JSON. This introduces a problem with lazily evaluated texts that are given as arguments to the flash() function. Flashed messages are written to the user session, but the object used to wrap the lazily evaluated texts is a complex object that does not have a direct conversion to a JSON type. Flask 0.9 did not serialize sessions to JSON so this was not a problem, but until Flask-Babel addresses this we have to provide a solution from our side, and this solution comes in the form of a custom JSON encoder (file app/

from flask.json import JSONEncoder

class CustomJSONEncoder(JSONEncoder):
    """This class adds support for lazy translation texts to Flask's
    JSON encoder. This is necessary when flashing translated texts."""
    def default(self, obj):
        from speaklater import is_lazy_string
        if is_lazy_string(obj):
            try:
                return unicode(obj)  # python 2
            except NameError:
                return str(obj)  # python 3
        return super(CustomJSONEncoder, self).default(obj)

app.json_encoder = CustomJSONEncoder

This installs a custom JSON encoder that forces the lazy texts to be evaluated into strings prior to being converted to JSON. Note the complexity in getting this done differently for Python 2 vs. Python 3.
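
The mechanism can be tried in isolation with the json module from the standard library. In this sketch LazyText is a hypothetical stand-in for the lazy string object and LazyAwareEncoder plays the role of the custom encoder. Neither name comes from Flask, Flask-Babel or speaklater.

```python
import json


class LazyText(object):
    """Hypothetical stand-in for a lazily translated string."""
    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


class LazyAwareEncoder(json.JSONEncoder):
    """Force lazy texts into real strings, defer everything else."""
    def default(self, obj):
        if isinstance(obj, LazyText):
            return str(obj)
        return super(LazyAwareEncoder, self).default(obj)


message = LazyText('Please log in to access this page.')

try:
    json.dumps({'msg': message})
except TypeError as exc:
    print('plain encoder failed: %s' % exc)

print(json.dumps({'msg': message}, cls=LazyAwareEncoder))
```

The default() method is only called for objects that the encoder does not know how to serialize, so regular strings, numbers and lists are not affected.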

And now we can say that we have a fully internationalized application!


Since the pybabel commands are long and hard to remember we are going to end this article with a few quick and dirty little scripts that simplify the most common tasks we've seen above.

First a script to add a language to the translation catalog (file

import os
import sys
if sys.platform == 'win32':
    pybabel = 'flask\\Scripts\\pybabel'
else:
    pybabel = 'flask/bin/pybabel'
if len(sys.argv) != 2:
    print("usage: tr_init <language-code>")
    sys.exit(1)
os.system(pybabel + ' extract -F babel.cfg -k lazy_gettext -o messages.pot app')
os.system(pybabel + ' init -i messages.pot -d app/translations -l ' + sys.argv[1])

Then a script to update the catalog with new texts from source and templates (file

import os
import sys
if sys.platform == 'win32':
    pybabel = 'flask\\Scripts\\pybabel'
else:
    pybabel = 'flask/bin/pybabel'
os.system(pybabel + ' extract -F babel.cfg -k lazy_gettext -o messages.pot app')
os.system(pybabel + ' update -i messages.pot -d app/translations')

And finally, a script to compile the catalog (file

import os
import sys
if sys.platform == 'win32':
    pybabel = 'flask\\Scripts\\pybabel'
else:
    pybabel = 'flask/bin/pybabel'
os.system(pybabel + ' compile -d app/translations')

These scripts should make working with translation files an easy task.
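
If you would rather keep the three tasks in a single place, one possible variation is to build the commands as argument lists, which can then be handed to subprocess.check_call one by one. This is only a sketch of that idea: the pybabel_path and build_commands functions are hypothetical names, and the code below prints the commands without running them.

```python
import sys


def pybabel_path(platform=sys.platform):
    """Location of pybabel inside the virtual environment."""
    if platform == 'win32':
        return 'flask\\Scripts\\pybabel'
    return 'flask/bin/pybabel'


def build_commands(task, language=None, platform=sys.platform):
    """Return the list of pybabel commands needed for the given task."""
    pybabel = pybabel_path(platform)
    extract = [pybabel, 'extract', '-F', 'babel.cfg', '-k', 'lazy_gettext',
               '-o', 'messages.pot', 'app']
    if task == 'init':
        if not language:
            raise ValueError('init needs a language code')
        return [extract, [pybabel, 'init', '-i', 'messages.pot',
                          '-d', 'app/translations', '-l', language]]
    if task == 'update':
        return [extract, [pybabel, 'update', '-i', 'messages.pot',
                          '-d', 'app/translations']]
    if task == 'compile':
        return [[pybabel, 'compile', '-d', 'app/translations']]
    raise ValueError('unknown task: %s' % task)


# Show what would be executed to add Spanish on a Unix-like system.
for command in build_commands('init', 'es', platform='linux'):
    print(' '.join(command))
```

Passing each command as a list avoids having to build shell strings by hand, which matters once a language code comes from the command line.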

Final words

Today we have implemented an often overlooked aspect of web applications. Users want to work in their native language, so being able to publish our application in as many languages as we can find translators for is a huge accomplishment.

In the next article we will look at what is probably the most complex aspect in the area of I18n and L10n, which is the real time automated translation of user generated content. And we will use this as an excuse to add some Ajax magic to our application.

Here is the download link for the latest version of microblog, including the complete Spanish translation:


If you prefer, you can also find the code on github here.

Thank you for being a loyal reader. See you next time!



  • #26 Miguel Grinberg said

    @Dougal: I think I need more context to understand the question. Jinja2 escapes all variables by default, you can disable the escaping with the safe filter.

  • #27 Morteza Ipo said

    Perfect tutorial. Thank you so much.

  • #28 Gigi Sayfan said

    if sys.platform == 'wn32' -> if sys.platform == 'win32'

  • #29 Joshua Grigonis said

    I am seeing the same error as Veronica.

    File "c:\python27\lib\site-packages\flask\", line 1836, in call
    return self.wsgi_app(environ, start_response)
    File "c:\python27\lib\site-packages\flask\", line 1820, in wsgi_app
    response = self.make_response(self.handle_exception(e))
    File "c:\python27\lib\site-packages\flask\", line 1403, in handle_exception
    reraise(exc_type, exc_value, tb)
    File "c:\python27\lib\site-packages\flask\", line 1817, in wsgi_app
    response = self.full_dispatch_request()
    File "c:\python27\lib\site-packages\flask\", line 1479, in full_dispatch_request
    response = self.process_response(response)
    File "c:\python27\lib\site-packages\flask\", line 1693, in process_response
    self.save_session(ctx.session, response)
    File "c:\python27\lib\site-packages\flask\", line 837, in save_session
    return self.session_interface.save_session(self, session, response)
    File "c:\python27\lib\site-packages\flask\", line 326, in save_session
    val = self.get_signing_serializer(app).dumps(dict(session))
    File "c:\python27\lib\site-packages\", line 537, in dumps
    payload = want_bytes(self.dump_payload(obj))
    File "c:\python27\lib\site-packages\", line 809, in dump_payload
    json = super(URLSafeSerializerMixin, self).dump_payload(obj)
    File "c:\python27\lib\site-packages\", line 522, in dump_payload
    return want_bytes(self.serializer.dumps(obj))
    File "c:\python27\lib\site-packages\flask\", line 85, in dumps
    return json.dumps(_tag(value), separators=(',', ':'))
    File "c:\python27\lib\site-packages\flask\", line 126, in dumps
    rv = _json.dumps(obj, kwargs)
    File "c:\python27\lib\", line 250, in dumps
    File "c:\python27\lib\json\", line 209, in encode
    chunks = list(chunks)
    File "c:\python27\lib\json\", line 434, in _iterencode
    for chunk in _iterencode_dict(o, _current_indent_level):
    File "c:\python27\lib\json\", line 408, in _iterencode_dict
    for chunk in chunks:
    File "c:\python27\lib\json\", line 332, in _iterencode_list
    for chunk in chunks:
    File "c:\python27\lib\json\", line 408, in _iterencode_dict
    for chunk in chunks:
    File "c:\python27\lib\json\", line 332, in _iterencode_list
    for chunk in chunks:
    File "c:\python27\lib\json\", line 442, in _iterencode
    o = _default(o)
    File "c:\python27\lib\site-packages\flask\", line 83, in default
    return _json.JSONEncoder.default(self, o)
    File "c:\python27\lib\json\", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
    TypeError: lu'Por favor, inicie sesi\xf3n para ver esta p\xe1gina.' is not JSON serializable

  • #30 Joshua Grigonis said

    This fixes the default, english version:

    lm.login_message = unicode(lazy_gettext('Please log in to access this page.'))

    According to this, it doesn't work the same any more:

    I couldn't however make a solution that works for the translated texts.

  • #31 Miguel Grinberg said

    @Joshua: you are correct, I believe there is currently no easy way to make Flask 0.10 and Flask-Babel work well together because of this. This is the only problem that prevents this application from running under Flask 0.10, even on Python 3. Your workaround works, but always shows the English text.

  • #32 homolibere said

    easy way founded :)

    starting from flask-login 0.2.8 there is a callback function so all you need to do:
    (file app/
    from flask.ext.babel import gettext

    def _gettext(msg):
        return gettext(msg)

    lm.localize_callback = _gettext

    instead of:
    from flask.ext.babel import lazy_gettext
    lm.login_message = lazy_gettext('Please log in to access this page.')

  • #33 Miguel Grinberg said

    @homolibere: Thanks, very nice solution.

  • #34 Marcel Hekking said

    Great tutorial!!! Thanx for sharing.

  • #35 Igor said

    Thanks for great tutorial Miguel!
    I have a Q:
    how to translate strings that I get from database? If for example I want to made a kind of CMS where all the static info will be in DB not in templates. I talking about something like this: gettext(db.user.description)

  • #36 Miguel Grinberg said

    @Igor: Automatic translations are covered in the next part of the series.

  • #37 alex leonhardt said

    Hi Miguel - I'm getting this when trying to run the translations :

    $ pybabel compile -d app/translations
    catalog 'app/translations/es/LC_MESSAGES/messages.po' is marked as fuzzy, skipping

    Any clues ?


  • #38 Miguel Grinberg said

  • #39 Tomás Mery said

    This is really old but maybe someone else has the same issue.

    Lazy gettext wraps the translation process for later so it isn't a real string but a different type of object. Some json libs checks the type of the expresion you are trying to jsonify so it gives you an exception. To fix this you can trigger the translation using the unicode builtin.

    a = lazy_gettext("wololoh")
    jsonify({'msg': a}) # this will throw an exception
    jsonify({'msg': unicode(a)}) # this should work correctly

    hope it's useful

  • #40 Jaime said

    so nice man! this really help me, thanks.

  • #41 Mark Jeghers said

    Very nice tutorials, helping me very much. However, with the latest code just downloaded in this last week for all the components (flask, babel, etc), the translation to Spanish fails for all the static text in the templates, e.g. everything wrapped in _(), I have used the code exactly as downloaded and running Python 2.7, any idea what is wrong? Or how to fix it? I am baffled at this point.

    thanx again!

  • #42 Miguel Grinberg said

    @Mark: Are you using the same versions of packages that I'm using? Check the requirements.txt file and use those to see if this is a problem with a recently updated package.

  • #43 Mark Jeghers said

    Sorry for being dense... but where is the requirements.txt file found? I don't see it. Also, what's the quickest way to check in Python the version of the packages I am using? Is there a command in Python to tell me this?

  • #44 Miguel Grinberg said

    @Mark: look in the github project, it's in the root folder:

  • #45 Tamby said

    Hi Miguel !!

    What about form labels ?

    I tried something like this

    from flask.ext.babel import gettext

    class LoginForm(Form):
        email = StringField(gettext('Password'), validators=[Required(), Length(1, 64), Email()])

    But it doesn't work !!

  • #46 Miguel Grinberg said

    @Tamby: probably better to invoke gettext in the template file, in the place where you insert these labels.

  • #47 Aleks Clapin said

    Excellent tutorial Miguel, thank you.

    I am trying to complete the I18n and L10n part of the tutorial. I get an error message in my command prompt when I try to run flask\Scripts\pybabel extract -F babel.cfg -o messages.pot app.

    flask\Scripts\pybabel extract -F babel.cfg -o messages.pot app
    failed to create process.

    I tried to search around on the web but no success. flask-babel is correctly installed in my virtualenv.

    Any ideas of where this would come from?

  • #48 Miguel Grinberg said

    @Aleks: I don't have a Windows machine at hand right now to test. I wonder if this is some sort of incompatibility. Do you have Cygwin installed? It could be useful to find out if the same problem occurs with the Cygwin version of Python.

  • #49 Josh said

    Google will translate .PO files now. Users don't need to download POedit

  • #50 Miguel Grinberg said

    @Josh: automatic translation is not a replacement for using a real translator.
