2013-04-24T06:24:58Z

The Flask Mega-Tutorial, Part XVIII: Deployment on the Heroku Cloud

This is the eighteenth article in the series in which I document my experience writing web applications in Python using the Flask microframework.

The goal of the tutorial series is to develop a decently featured microblogging application that, demonstrating a total lack of originality, I have decided to call microblog.

NOTE: This article was revised in September 2014 to be in sync with current versions of Python and Flask.

In the previous article we explored traditional hosting options. We looked at two actual examples of deployment to Linux servers, first to a CentOS system and later to the credit card sized Raspberry Pi computer. Those who are not used to administering a Linux system probably thought the amount of effort we had to put into the task was huge, and that surely there must be an easier way.

Today we will try to see if deploying to the cloud is the answer to the complexity problem.

But what does it mean to "deploy to the cloud"?

A cloud hosting provider offers a platform on which an application can run. All the developer needs to provide is the application, because the rest, which includes the hardware, operating system, scripting language interpreters and database, is managed by the service.

Sounds too good to be true, right?

We'll look at deploying to Heroku, one of the most popular cloud hosting services. I picked Heroku not only because it is popular, but also because it has a free service level, so we get to host our application without having to spend any money. If you want to find information about this type of service and what other providers are out there you can consult the Wikipedia page on platform as a service.

Hosting on Heroku

Heroku was one of the first platform as a service providers. It started as a hosting option for Ruby based applications, but then grew to support many other languages like Java, Node.js and our favorite, Python.

In essence, deploying a web application to Heroku requires just uploading the application using git (you'll see how that works in a moment). Heroku looks for a file called Procfile in the application's root directory for instructions on how to execute the application. For Python projects Heroku also expects a requirements.txt file that lists all the module dependencies that need to be installed.

After the application is uploaded you are essentially done. Heroku will do its magic and the application will be online within seconds. The amount of money you pay directly determines how much computing power you get for your application, so as your application gets more users you will need to buy more units of computing, which Heroku calls "dynos", and that is how you keep up with the load.

Ready to try Heroku? Let's get started!

Creating a Heroku account

Before we can deploy to Heroku we need to have an account with them. So head over to heroku.com and create an account.

Once you are logged in you have access to a dashboard, where all your apps can be managed. We will not be using the dashboard much, but it provides a nice view of your account.

Installing the Heroku client

Even though it is possible to manage applications from the Heroku web site to some extent, there are some things that can only be done from the command line, so we'll just do everything there.

Heroku offers a tool called the "Heroku client" that we'll use to create and manage our application. This tool is available for Windows, Mac OS X and Linux. If there is a Heroku toolbelt download for your platform then that's the easiest way to get the Heroku client tool installed.

The first thing we should do with the client tool is log in to our account:

$ heroku login

Heroku will prompt for your email address and your account password. The first time you log in it will send your public SSH key to the Heroku servers.

Your authenticated status will be remembered in subsequent commands.

Git setup

The git tool is core to the deployment of apps to Heroku, so it must also be available. If you installed the Heroku toolbelt then you already have it as part of that installation.

To deploy to Heroku the application must first be in a local git repository, so let's get one set up:

$ git clone -b version-0.18 git://github.com/miguelgrinberg/microblog.git
$ cd microblog

Note that we are choosing a specific branch to be checked out. This is the branch that has the Heroku integration.

Creating a Heroku app

To create a new Heroku app you just use the create command from the root directory of the application:

$ heroku apps:create flask-microblog
Creating flask-microblog... done, stack is cedar
http://flask-microblog.herokuapp.com/ | git@heroku.com:flask-microblog.git

In addition to setting up a URL this command adds a git remote to our git repository that we will soon use to upload the application.

Of course the name flask-microblog is now taken by me, so make sure you use a different app name if you are following along.

Eliminating local file storage

Several of the functions of our application rely on writing data to disk files.

Unfortunately we have a tricky problem with this. Applications that run on Heroku are not supposed to write permanent files to disk, because Heroku uses a virtualized platform that does not remember data files: the file system is reset to a clean state, containing just the application script files, each time a virtual worker is started. Essentially this means that the application can write temporary files to disk, but should be able to regenerate those files should they disappear. Also, when two or more workers (dynos) are in use each gets its own virtual file system, so it is not possible to share files among them.

This is really bad news for us. For starters, it means we cannot use sqlite as a database, and our Whoosh full text search database will also fail to work, since it writes all its data to files. We also have the compiled translation files for Flask-Babel, which are generated when running the tr_compile.py script. And yet another area where there is a problem is logging: we used to write our log file to the tmp folder, and that is also not going to work when running on Heroku.

We have identified four major problems for which we need to try to find solutions.

For our first problem, the database, we'll migrate to Heroku's own database offering, which is based on PostgreSQL.

For the full text search functionality we don't have a readily available alternative. We could re-implement full text searches using PostgreSQL functionality, but that would require several changes to our application. It is a pity, but solving this problem now would be a huge distraction, so for now we'll disable full text searches when running under Heroku.

To support translations we are going to include the compiled translation files in the git repository. That way these files will be persistent in the file system.

Finally, since we can't write our own log file, we'll add our logs to the logger that Heroku uses, which is actually simple, since Heroku will add to its log anything that goes to stdout.
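
To make the rule about temporary files more concrete, here is a minimal sketch of the "regenerate it if it disappeared" pattern. It is not part of microblog: the read_or_rebuild helper and the cache file name are made up for this illustration, and the os.remove() call only simulates a dyno coming back with a clean file system.

```python
import os
import tempfile


def read_or_rebuild(path, build):
    """Return the contents of path, regenerating the file first if it is missing."""
    if not os.path.exists(path):
        # The worker may have restarted with a clean file system, so rebuild.
        with open(path, 'w') as f:
            f.write(build())
    with open(path) as f:
        return f.read()


# Hypothetical cache file living in a temporary directory.
cache_path = os.path.join(tempfile.mkdtemp(), 'greeting.cache')
first = read_or_rebuild(cache_path, lambda: 'hello')
os.remove(cache_path)  # simulate the file system being reset
second = read_or_rebuild(cache_path, lambda: 'hello')
```

Both calls return the same text, because the second one notices the file is gone and builds it again. Data that cannot be rebuilt this way, like our database, has to live somewhere else.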

Creating a Heroku database

To create a database we use the Heroku client:

$ heroku addons:add heroku-postgresql:dev
Adding heroku-postgresql:dev on flask-microblog... done, v3 (free)
Attached as HEROKU_POSTGRESQL_ORANGE_URL
Database has been created and is available
 ! This database is empty. If upgrading, you can transfer
 ! data from another database with pgbackups:restore.
Use `heroku addons:docs heroku-postgresql:dev` to view documentation.
$ heroku pg:promote HEROKU_POSTGRESQL_ORANGE_URL
Promoting HEROKU_POSTGRESQL_ORANGE_URL to DATABASE_URL... done

Note that we are adding a development database, because that is the only database offering that is free. A production web server would need one of the production database options.

And how does our application know the details to connect to this database? Heroku publishes the URI to the database in the $DATABASE_URL environment variable. If you recall, we have modified our configuration to look for this variable in the previous hosting article, so the changes are already in place to connect with this database.
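
As a reminder of the idea, the lookup can be sketched roughly like this. The database_uri function and the app.db fallback file name are assumptions made for this illustration; the actual code in config.py may be organized differently.

```python
import os


def database_uri(environ, basedir):
    """Pick the database URI: DATABASE_URL if it is set, else a local sqlite file."""
    uri = environ.get('DATABASE_URL')
    if uri is None:
        # Local development: fall back to a sqlite file (file name is an assumption).
        return 'sqlite:///' + os.path.join(basedir, 'app.db')
    return uri


# On Heroku the environment provides DATABASE_URL; locally it is usually absent.
SQLALCHEMY_DATABASE_URI = database_uri(os.environ, os.getcwd())
```

Passing the environment in as an argument, instead of reading os.environ inside the function, makes it easy to try both cases from a Python console.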

Disabling full text searches

To disable full text searches we need our application to be able to know if it is running under Heroku or not. For this we will add a custom environment variable, again using the Heroku client tool:

$ heroku config:set HEROKU=1

The HEROKU environment variable will now be set to 1 when our application runs inside the Heroku virtual platform.

Now it is easy to disable the full text search index. First we add a configuration variable (file config.py):

# Whoosh does not work on Heroku
WHOOSH_ENABLED = os.environ.get('HEROKU') is None
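
One detail of this check is easy to miss: it tests whether the variable exists, not what it contains. The small sketch below mirrors the expression from config.py in a helper function (whoosh_enabled is a name made up for this illustration) and uses plain dictionaries in place of os.environ.

```python
def whoosh_enabled(environ):
    """Mirror of the config.py check: search is on only when HEROKU is absent."""
    return environ.get('HEROKU') is None


local = whoosh_enabled({})                    # variable absent: search enabled
on_heroku = whoosh_enabled({'HEROKU': '1'})   # variable set: search disabled
also_off = whoosh_enabled({'HEROKU': '0'})    # still set, so search stays disabled
```

So to turn search back on in some environment the variable has to be removed, setting it to 0 is not enough.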

Then we suppress the creation of the full text database instance (file app/models.py):

from config import WHOOSH_ENABLED

enable_search = WHOOSH_ENABLED
if enable_search:
    import flask_whooshalchemy as whooshalchemy

# ...
if enable_search:
    whooshalchemy.whoosh_index(app, Post)

Precompiled translations

This one is pretty easy. After running tr_compile.py we end up with a <language>.mo file for each <language>.po source file. All we need to do is add the mo files to the git repository, and then in the future we'll have to remember to update them. The mo file for Spanish is included in the branch of the git repository dedicated to this article.

Fixing the logging

Under Heroku, anything that is written to stdout is added to the Heroku application log. But logs written to a disk file will not be accessible. So on this platform we will suppress the file log and instead use a log that writes to stdout (file app/__init__.py):

if not app.debug and os.environ.get('HEROKU') is None:
    import logging
    from logging.handlers import RotatingFileHandler
    file_handler = RotatingFileHandler('tmp/microblog.log', 'a', 1 * 1024 * 1024, 10)
    file_handler.setLevel(logging.INFO)
    file_handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s: %(message)s [in %(pathname)s:%(lineno)d]'))
    app.logger.addHandler(file_handler)
    app.logger.setLevel(logging.INFO)
    app.logger.info('microblog startup')

if os.environ.get('HEROKU') is not None:
    import logging
    stream_handler = logging.StreamHandler()
    app.logger.addHandler(stream_handler)
    app.logger.setLevel(logging.INFO)
    app.logger.info('microblog startup')
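
A small note on the code above: logging.StreamHandler() called without arguments writes to stderr, not stdout. Heroku's documentation describes collecting both streams, so the code works as is. If you prefer to be explicit, or want a message format similar to the one we used for the file log, a variation could look like the sketch below. The make_stdout_handler function and the microblog-sketch logger name are made up for this illustration and stand in for app.logger.

```python
import logging
import sys


def make_stdout_handler(stream=None):
    """Build a handler that writes formatted records to stdout (or a given stream)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setLevel(logging.INFO)
    # No %(asctime)s here, on the assumption that the platform timestamps each line.
    handler.setFormatter(logging.Formatter(
        '%(levelname)s: %(message)s [in %(pathname)s:%(lineno)d]'))
    return handler


# 'microblog-sketch' is a hypothetical logger name standing in for app.logger.
logger = logging.getLogger('microblog-sketch')
logger.setLevel(logging.INFO)
logger.addHandler(make_stdout_handler())
logger.info('microblog startup')
```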

The web server

Heroku does not provide a web server. Instead, it expects the application to start its own server on the port number given in environment variable $PORT.
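
As a rough sketch, this is how a Python program could pick up that port number. The listen_port function and the fallback of 5000 for local runs are assumptions made for this illustration, they are not part of microblog.

```python
import os


def listen_port(environ, default=5000):
    """Return the port to bind to: $PORT when the platform sets it, else a default."""
    return int(environ.get('PORT', default))


# Environment variables are strings, so the value is converted to an integer.
port = listen_port(os.environ)
```

We will not need this code ourselves. The gunicorn documentation describes a default bind address that is derived from PORT when that variable is defined, which is presumably why no port has to be given on the command line when running under Heroku.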

We know the Flask web server is not good for production use because it is single process and single threaded, so we need a better server. The Heroku tutorial for Python suggests gunicorn, a pre-fork style web server written in Python, so that's the one we'll use.

For our local environment gunicorn installs as a regular python module into our virtual environment:

$ flask/bin/pip install gunicorn

To start this server we need to provide a single argument that names the Python module that defines the application and the application object, both separated by a colon. For example, if we wanted to start a local gunicorn server with this module we would issue the following command:

$ flask/bin/gunicorn --log-file - app:app
2013-04-24 08:42:34 [31296] [INFO] Starting gunicorn 19.1.1
2013-04-24 08:42:34 [31296] [INFO] Listening at: http://127.0.0.1:8000 (31296)
2013-04-24 08:42:34 [31296] [INFO] Using worker: sync
2013-04-24 08:42:34 [31301] [INFO] Booting worker with pid: 31301
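
If the app:app argument looks cryptic, the idea behind it can be sketched in a few lines of Python. This is a simplified illustration, not gunicorn's actual loader, and the resolve function is a name made up for it.

```python
import importlib


def resolve(target):
    """Split 'module:attribute' and return the named object from that module."""
    module_name, _, attr_name = target.partition(':')
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Demonstrated with a standard library module so the sketch runs anywhere;
# for microblog the argument would be 'app:app'.
dumps = resolve('json:dumps')
```

In our case the part before the colon is the app package and the part after it is the app object that the package creates.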

The requirements file

Soon we will be uploading our application to Heroku, but before we can do that we have to inform the server what dependencies the application needs to run. We created a requirements.txt file in the previous chapter, to simplify the installation of dependencies in a dedicated server, and the good news is that Heroku also imports dependencies from a requirements file.

The gunicorn web server needs to be added to the list, and so does the psycopg2 driver, which is required by SQLAlchemy to connect to PostgreSQL databases. The final requirements.txt file looks like this:

Babel==1.3
Flask==0.10.1
Flask-Babel==0.9
Flask-Login==0.2.11
Flask-Mail==0.9.0
Flask-OpenID==1.2.1
Flask-SQLAlchemy==2.0
Flask-WTF==0.10.2
Flask-WhooshAlchemy==0.56
Jinja2==2.7.3
MarkupSafe==0.23
SQLAlchemy==0.9.7
Tempita==0.5.2
WTForms==2.0.1
Werkzeug==0.9.6
Whoosh==2.6.0
blinker==1.3
coverage==3.7.1
decorator==3.4.0
flipflop==1.0
guess-language==0.2
gunicorn==19.1.1
itsdangerous==0.24
pbr==0.10.0
psycopg2==2.5.4
python-openid==2.2.5
pytz==2014.7
six==1.8.0
speaklater==1.3
sqlalchemy-migrate==0.9.2
sqlparse==0.1.11

Some of these modules will not be needed in the Heroku version of our application, but it really doesn't hurt to have extra stuff. To me it seems better to have a complete requirements list.

The Procfile

The last requirement is to tell Heroku how to run the application. For this Heroku requires a file called Procfile in the root folder of the application.

This file is extremely simple. It just defines process names and the commands associated with them (file Procfile):

web: gunicorn app:app
init: python db_create.py
upgrade: python db_upgrade.py

The web label is associated with the web server. Heroku expects this task and will use it to start our application.

The other two tasks, named init and upgrade are custom tasks that we will use to work with our application. The init task initializes our application by creating the database. The upgrade task is similar, but instead of creating the database from scratch it upgrades it to the latest migration.
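
To show how little there is to this format, here is a sketch that reads the three lines into a dictionary of process names and commands. It is only an illustration of the "name: command" layout. The parse_procfile function is made up for it and is not how Heroku itself processes the file.

```python
def parse_procfile(text):
    """Map each process name to its command, one 'name: command' entry per line."""
    processes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        # Split on the first colon only, so 'app:app' stays inside the command.
        name, _, command = line.partition(':')
        processes[name.strip()] = command.strip()
    return processes


PROCFILE = """\
web: gunicorn app:app
init: python db_create.py
upgrade: python db_upgrade.py
"""

processes = parse_procfile(PROCFILE)
print(', '.join(sorted(processes)))  # process names in alphabetical order
```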

Deploying the application

And now we have reached the most interesting part, where we push the application to our Heroku hosting account. This is actually pretty simple: we just use git to push the application.

$ git push heroku master
Counting objects: 307, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (168/168), done.
Writing objects: 100% (307/307), 165.57 KiB, done.
Total 307 (delta 142), reused 272 (delta 122)

-----> Python app detected
-----> No runtime.txt provided; assuming python-2.7.4.
-----> Preparing Python runtime (python-2.7.4)
-----> Installing Distribute (0.6.36)
-----> Installing Pip (1.3.1)
-----> Installing dependencies using Pip (1.3.1)
...
-----> Discovering process types
       Procfile declares types -> init, upgrade, web

-----> Compiled slug size: 29.6MB
-----> Launching... done, v6
       http://flask-microblog.herokuapp.com deployed to Heroku

To git@heroku.com:flask-microblog.git
 * [new branch]      master -> master

The label heroku that we used in the git push command was automatically registered with our git repository when we created our application with heroku apps:create. To see how this remote repository is configured you can run git remote -v in the application folder.

The first time we push the application to Heroku we need to initialize the database, and for that we can execute the init task that we included in our Procfile:

$ heroku run init
Running `init` attached to terminal... up, run.7671
/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/engine/url.py:105: SADeprecationWarning: The SQLAlchemy PostgreSQL dialect has been renamed from 'postgres' to 'postgresql'. The new URL format is postgresql[+driver]://<user>:<pass>@<host>/<dbname>
  module = __import__('sqlalchemy.dialects.%s' % (dialect, )).dialects

The deprecation warning comes from SQLAlchemy, because it does not like that the URI starts with postgres:// instead of postgresql://. This URI comes from Heroku via the $DATABASE_URL environment variable, so we really don't have any control over this. Let's hope this continues to work for a long time.
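
We can't change what Heroku publishes, but if the warning ever turns into an error the application could rewrite the prefix before handing the URI to SQLAlchemy. The sketch below is an optional workaround and is not part of the microblog code. The normalize_db_uri function is a made up name and the credentials in the example are invented.

```python
def normalize_db_uri(uri):
    """Rewrite a leading postgres:// scheme to postgresql://, leaving other URIs alone."""
    old_prefix = 'postgres://'
    if uri.startswith(old_prefix):
        return 'postgresql://' + uri[len(old_prefix):]
    return uri


# Made-up credentials, for illustration only.
fixed = normalize_db_uri('postgres://user:secret@host:5432/dbname')
```

Only the scheme at the start of the string is touched, so sqlite URIs and URIs that already use postgresql:// pass through unchanged.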

Believe it or not, now the application is online. In my case, the application can be accessed at http://flask-microblog.herokuapp.com. For example, you can become my follower from my profile page. I'm not sure how long I'll leave it there, but feel free to give it a try if you can connect to it!

Updating the application

The time will come when an update needs to be deployed. This works in a similar way to the initial deployment. First the application is pushed from git:

$ git push heroku master

Then the upgrade script is executed:

$ heroku run upgrade

Logging

If a problem occurs then it may be useful to see the logs. Recall that for the Heroku hosted version we are writing our logs to stdout, which Heroku collects into its own logs.

To see the logs we use the Heroku client:

$ heroku logs

The above command will show all the logs, including Heroku ones. To only see application logs we can use this command:

$ heroku logs --source app

Things like stack traces and other application errors will appear in these app logs.

Is it worth it?

We've now seen what it takes to deploy to a cloud hosting service, so we can compare it against traditional hosting.

The simplicity aspect is easily won by the cloud. At least for Heroku the deployment process was extremely simple. When deploying to a dedicated server or VPS there are a lot of administrative tasks that need to be done to prepare the system. Heroku takes care of all that and allows us to concentrate on our application.

The price is where it is harder to come to a conclusion. Cloud offerings are more expensive than dedicated servers, since you are not only paying for the server but also for the admin work. A pretty basic production service with Heroku that includes two dynos and the least expensive production database costs $85 per month at the time I'm writing this. On the other side, if you look hard you can find well provisioned VPS servers for about $40 per year.
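
Since one price is per month and the other per year, it helps to put both on the same scale. The quick calculation below only uses the two figures quoted above and says nothing about what each option includes.

```python
heroku_per_month = 85   # USD: two dynos plus the least expensive production database
vps_per_year = 40       # USD: the low-end VPS figure quoted above

heroku_per_year = heroku_per_month * 12
# float() keeps the division exact on both Python 2 and Python 3.
ratio = float(heroku_per_year) / vps_per_year
print(heroku_per_year, ratio)
```

That works out to $1020 per year for the Heroku setup, about 25 times the price of the cheap VPS.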

In the end, I think it all comes down to what is most important to you, time or money.

The End?

The updated application is available, as always, on my github page. Alternatively you can download it as a zip file below:

Download microblog 0.18.

With our application deployed in every possible way it feels like we are reaching the end of this journey.

I hope these articles were a useful introduction to the development of a real world web application project, and that the knowledge dump I've made over these eighteen articles motivates you to start your own project.

I'm not closing the door to more microblog articles. If and when an interesting topic comes to mind I will write more, but I expect the rate of updates from now on will slow down a bit. From time to time I may make small updates to the application that don't deserve a blog post, so you may want to watch the project on github to catch these.

I will continue blogging about topics related to web development and software in general, so I invite you to connect via Twitter or Facebook if you haven't done it yet, so that you find my future articles.

Thank you, again, for being a loyal reader.

Miguel

119 comments

  • #1 Jonathan Grahl said 2013-04-24T07:11:38Z

    Awesome content! Loving these tutorials that you are making, I can't wait to see what content you may release in the future!

  • #2 Anonymous said 2013-05-02T14:51:01Z

    The whole series is an excellent tutorial. Thank you, you saved me a lot of time figuring out all that stuff. You should ask for donations on your page. I have never tipped people on the web, but would definitely do it for this!

  • #3 mark lilly said 2013-05-07T19:11:36Z

    This whole series is indeed epic, and from one Portlander to another, thank you!! Are you using Flask for production systems at present?

  • #4 Miguel Grinberg said 2013-05-08T04:23:18Z

    @Mark: You are welcome. This blog is powered by Flask.

  • #5 Max said 2013-05-16T19:13:23Z

    Thank you, Miguel !

  • #6 Gonzalo said 2013-05-17T19:03:31Z

    Hi Miguel, excellent tutorial!! It's great! Have you ever try flask-social or any other social auth module with flask? Could you recommend me one? Thanks!

  • #7 Miguel Grinberg said 2013-05-18T00:42:19Z

    @Gonzalo: Unfortunately I haven't used any oauth based plugins, so I can't really recommend you one. The two that seem to be supported in Flask are Flask-Social and Python-Social-Auth. Just from looking at the documentation for the two the latter seems to be more feature rich.

  • #8 Anonymous said 2013-05-19T20:59:17Z

    Thank you for taking the time to write this up and share this excellent series. I'm an experienced developer, but new to the python web-app ecosystem. Thanks! Have a great day

  • #9 Ralph Caraveo said 2013-05-23T17:54:14Z

    Perhaps you can cover using Flask with Gevent. These tools together are a wicked combination in getting a Flask app to scale with high concurrency. I'm still learning about Gevent myself but I enjoy your writing style...and perhaps consider turning this into a book. I'd buy a copy! ;)

  • #10 zhangjingqiang said 2013-05-30T03:31:40Z

    The great tutorial! Could you write a tornado tutorial like this?

  • #11 Tri said 2013-06-10T06:59:11Z

    Awesome tutorial every time! Thanks again. You should do one where you want to incorporate maps into the app. Like a mini version of foursquare...with the ability to search for some predetermined locations from the database.

  • #12 anaheim said 2013-06-13T14:09:05Z

    Hi Miguel, great tutorial! Could you guide me in the right direction for letting users login with their facebook account? Cheers

  • #13 Miguel Grinberg said 2013-06-14T02:30:35Z

    @anaheim: to login with Facebook/Twitter etc. you need to implement an oauth consumer. See Flask-OAuth, an extension that is similar in functionality to Flask-OpenID.

  • #14 anaheim said 2013-06-14T08:19:00Z

    Thanks Miguel! any idea what is wrong with the flask_login.py ? I keep getting a TypeError from there: https://gist.github.com/anonymous/5780302

  • #15 Miguel Grinberg said 2013-06-14T15:43:54Z

    Looks like Flask 0.10 (or one of its dependencies) introduced this incompatibility with Flask-Login. You can avoid the issue by first uninstalling Flask, Werkzeug and Jinja2, then installing Flask==0.9, Werkzeug==0.8.3 and Jinja2==2.6.

  • #16 ebenpack said 2013-07-18T16:52:55Z

    Miguel, Am I wrong, or wouldn't updating with git push be a little more complicated? For example, you don't want to publish your CSRF key, email password, etc. on github, so you're going to have to update your config file after pulling. How do you deal with this when updating? Another example is that I want to include a "Fork me on github" banner, analytics code, etc. in my templates, but I don't feel like it's right to have those hard-coded in the template posted on github (not that they're secret, I just feel the project should be more general/easier to fork on github). Right now I have a bash deployment script that pulls from git, copies in some image assets, and uses sed to add the analytics code and github banner. It works fine, but it feels messy and fragile. Any thoughts? Thanks

  • #17 Miguel Grinberg said 2013-07-19T05:15:04Z

    @ebenpack: for a real project I would probably not host it on a public github, it would be on a private server under my own control. Another option would be to have predetermined environment variables that can override these configuration values. You would set these in your Heroku environment, similarly to how Heroku configures the database to use through the environment.

  • #18 Kelly Dresser said 2013-07-19T06:48:50Z

    Thank you again for a terrific series of tutorials! I've found much new & useful information here.

  • #19 jslopez said 2013-07-30T01:54:12Z

    Thanks a lot for the tutorial. How do you handle the compilation file for the translations? In my case, it's been discarded by heroku, due to the local storage issue that you described.

  • #20 Miguel Grinberg said 2013-07-30T02:54:45Z

    @jslopez: that's actually an oversight on my part. The compilation to .mo needs to happen each time a new node is started. So this should be incorporated into the runp-heroku script instead of executed manually during setup. Good catch, I'll review and update the article.

  • #21 Joe said 2013-08-03T06:28:21Z

    Dear Miguel, Thank you so much for this tutorial. And I can't wait for your book. I'm interested in implementing full text search using PostgreSQL, can you please give me some hints/ advices. Thanks in advance.

  • #22 Miguel Grinberg said 2013-08-03T06:35:31Z

    @Joe: Thanks. The following article may give you a starting point to support full text search with Postgres: http://lowmanio.co.uk/blog/entries/postgresql-full-text-search-and-sqlalchemy/.

  • #23 Joe Jean said 2013-08-04T02:35:46Z

    Hey Miguel, Today I came across SQLAlchemy-Searchable (http://sqlalchemy-searchable.readthedocs.org/en/latest/) which works pretty well with PostgreSQL and does not store data in file. The latest version does however require SQLAlchemy 0.8 or higher. Check their github repo for some examples. It works for me, that 's why I decided to share it with you!

  • #24 Miguel Grinberg said 2013-08-04T04:43:44Z

    @Joe: Thanks, looks promising!

  • #25 mick said 2013-08-14T17:01:30Z

    Dear Miguel, thank you for the tutorial. I'm using whoosh search as an integral part of my project. How do you suggest I solve this problem in order to deploy it on heroku. Thank you
