The goal of the tutorial series is to develop a decently featured microblogging application that, demonstrating a total lack of originality, I have decided to call microblog.
NOTE: This article was revised in September 2014 to be in sync with current versions of Python and Flask.
Here is an index of all the articles in the series that have been published to date:
- Part I: Hello, World!
- Part II: Templates
- Part III: Web Forms
- Part IV: Database
- Part V: User Logins
- Part VI: Profile Page And Avatars
- Part VII: Unit Testing
- Part VIII: Followers, Contacts And Friends
- Part IX: Pagination
- Part X: Full Text Search
- Part XI: Email Support
- Part XII: Facelift
- Part XIII: Dates and Times
- Part XIV: I18n, L10n
- Part XV: Ajax
- Part XVI: Debugging, Testing and Profiling
- Part XVII: Deployment on Linux (even on the Raspberry Pi!)
- Part XVIII: Deployment on the Heroku Cloud (this article)
In the previous article we explored traditional hosting options. We looked at two actual examples of deployment to Linux servers, first to a CentOS system and later to the Raspberry Pi credit card sized computer. Those who are not used to administering a Linux system probably thought the amount of effort we had to put into the task was huge, and that surely there must be an easier way.
Today we will try to see if deploying to the cloud is the answer to the complexity problem.
But what does it mean to "deploy to the cloud"?
A cloud hosting provider offers a platform on which an application can run. All the developer needs to provide is the application, because the rest, which includes the hardware, operating system, scripting language interpreters and database, is managed by the service.
Sounds too good to be true, right?
We'll look at deploying to Heroku, one of the most popular cloud hosting services. I picked Heroku not only because it is popular, but also because it has a free service level, so we get to host our application without having to spend any money. If you want to find information about this type of service and what other providers are out there you can consult the Wikipedia page on platform as a service.
Hosting on Heroku
Heroku was one of the first platform as a service providers. It started as a hosting option for Ruby based applications, but then grew to support many other languages like Java, Node.js and our favorite, Python.
In essence, deploying a web application to Heroku requires just uploading the application using git (you'll see how that works in a moment). Heroku looks for a file called Procfile in the application's root directory for instructions on how to execute the application. For Python projects Heroku also expects a requirements.txt file that lists all the module dependencies that need to be installed.
After the application is uploaded you are essentially done. Heroku will do its magic and the application will be online within seconds. The amount of money you pay directly determines how much computing power you get for your application, so as your application gets more users you will need to buy more units of computing, which Heroku calls "dynos", and that is how you keep up with the load.
Ready to try Heroku? Let's get started!
Creating a Heroku account
Before we can deploy to Heroku we need to have an account with them. So head over to heroku.com and create an account.
Once you are logged in you have access to a dashboard, where all your apps can be managed. We will not be using the dashboard much, but it provides a nice view of your account.
Installing the Heroku client
Even though it is possible to manage applications from the Heroku web site to some extent, there are some things that can only be done from the command line, so we'll just do everything there.
Heroku offers a tool called the "Heroku client" that we'll use to create and manage our application. This tool is available for Windows, Mac OS X and Linux. If there is a Heroku toolbelt download for your platform then that's the easiest way to get the Heroku client tool installed.
The first thing we should do with the client tool is log in to our account:
$ heroku login
Heroku will prompt for your email address and your account password. The first time you log in it will send your public SSH key to the Heroku servers.
Your authenticated status will be remembered in subsequent commands.
The git tool is core to the deployment of apps to Heroku, so it must also be available. If you installed the Heroku toolbelt then you already have it as part of that installation.
To deploy to Heroku the application must first be in a local git repository, so let's get one set up:
$ git clone -b version-0.18 git://github.com/miguelgrinberg/microblog.git
$ cd microblog
Note that we are choosing a specific branch to be checked out. This is the branch that has the Heroku integration.
Creating a Heroku app
To create a new Heroku app you just use the create command from the root directory of the application:
$ heroku apps:create flask-microblog
Creating flask-microblog... done, stack is cedar
http://flask-microblog.herokuapp.com/ | git@heroku.com:flask-microblog.git
In addition to setting up a URL, this command adds a git remote to our git repository that we will soon use to upload the application.
Of course the name flask-microblog is now taken by me, so make sure you use a different app name if you are following along.
Eliminating local file storage
Several of the functions of our application rely on writing data to disk files.
Unfortunately we have a tricky problem with this. Applications that run on Heroku are not supposed to write permanent files to disk, because Heroku uses a virtualized platform that does not remember data files. The file system is reset to a clean state that contains just the application script files each time a virtual worker is started. Essentially this means that the application can write temporary files to disk, but should be able to regenerate those files should they disappear. Also, when two or more workers (dynos) are in use each gets its own virtual file system, so it is not possible to share files among them.
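One way to live with this constraint is to treat every file the application writes as a cache that can be rebuilt on demand. The sketch below illustrates the idea only; `ensure_file`, `read_cached` and the `build` callback are hypothetical names and are not part of microblog.

```python
import os


def ensure_file(path, build):
    """Return path, regenerating the file if it has disappeared.

    `build` is a hypothetical callback that returns the file contents
    as a string. It is only called when the file is missing.
    """
    if not os.path.exists(path):
        with open(path, "w") as f:
            f.write(build())
    return path


def read_cached(path, build):
    """Read a file that may have been wiped since it was last written."""
    with open(ensure_file(path, build)) as f:
        return f.read()
```

With this pattern a restarted worker simply rebuilds whatever it needs the first time it is asked for it. It does not help with data that cannot be regenerated, such as a database, which is why those cases need a different solution.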
This is really bad news for us. For starters, it means we cannot use sqlite as a database, and our Whoosh full text search database will also fail to work, since it writes all its data to files. We also have the compiled translation files for Flask-Babel, which are generated when running the tr_compile.py script. Yet another problem area is logging: we used to write our log file to the tmp folder, and that is also not going to work when running on Heroku.
We have identified four major problems for which we need to try to find solutions.
For our first problem, the database, we'll migrate to Heroku's own database offering, which is based on PostgreSQL.
For the full text search functionality we don't have a readily available alternative. We could re-implement full text searches using PostgreSQL functionality, but that would require several changes to our application. It is a pity, but solving this problem now would be a huge distraction, so for now we'll disable full text searches when running under Heroku.
To support translations we are going to include the compiled translation files in the git repository. That way these files will be persistent in the file system.
Finally, since we can't write our own log file, we'll add our logs to the logger that Heroku uses, which is actually simple, since Heroku will add to its log anything that goes to stdout.
Creating a Heroku database
To create a database we use the Heroku client:
$ heroku addons:add heroku-postgresql:dev
Adding heroku-postgresql:dev on flask-microblog... done, v3 (free)
Attached as HEROKU_POSTGRESQL_ORANGE_URL
Database has been created and is available
 ! This database is empty. If upgrading, you can transfer
 ! data from another database with pgbackups:restore.
Use `heroku addons:docs heroku-postgresql:dev` to view documentation.
$ heroku pg:promote HEROKU_POSTGRESQL_ORANGE_URL
Promoting HEROKU_POSTGRESQL_ORANGE_URL to DATABASE_URL... done
Note that we are adding a development database, because that is the only database offering that is free. A production web server would need one of the production database options.
And how does our application know the details to connect to this database? Heroku publishes the URI to the database in the $DATABASE_URL environment variable. If you recall, we modified our configuration to look for this variable in the previous hosting article, so the changes are already in place to connect with this database.
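As a reminder of what such a lookup can look like, here is a minimal sketch. The function name `database_uri` and the SQLite fallback file `app.db` are assumptions made for illustration; the actual configuration from the previous article may differ in its details.

```python
import os


def database_uri(environ, basedir):
    """Prefer the URI published in DATABASE_URL, else use a local SQLite file."""
    uri = environ.get("DATABASE_URL")
    if uri is None:
        # No hosted database is configured, so fall back to a file that
        # lives next to the application (assumed name: app.db).
        uri = "sqlite:///" + os.path.join(basedir, "app.db")
    return uri


# Typical use in a configuration module.
SQLALCHEMY_DATABASE_URI = database_uri(os.environ, os.getcwd())
```

Passing the environment in as an argument keeps the function easy to exercise with a plain dictionary, without touching the real process environment.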
Disabling full text searches
To disable full text searches we need our application to be able to know if it is running under Heroku or not. For this we will add a custom environment variable, again using the Heroku client tool:
$ heroku config:set HEROKU=1
The HEROKU environment variable will now be set to 1 when our application runs inside the Heroku virtual platform.
Now it is easy to disable the full text search index. First we add a configuration variable (file config.py):
# Whoosh does not work on Heroku
WHOOSH_ENABLED = os.environ.get('HEROKU') is None
Then we suppress the creation of the full text database instance (file
from config import WHOOSH_ENABLED

enable_search = WHOOSH_ENABLED
if enable_search:
    import flask_whooshalchemy as whooshalchemy

# ...

if enable_search:
    whooshalchemy.whoosh_index(app, Post)
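One detail of this test is worth spelling out: it checks whether the variable exists, not what value it holds. The sketch below wraps the same expression in a hypothetical function, `whoosh_enabled`, so that its behavior can be tried with plain dictionaries.

```python
def whoosh_enabled(environ):
    """Mirror the configuration test: search is on only when HEROKU is absent."""
    return environ.get("HEROKU") is None


# The variable is missing, so full text search stays enabled.
local = whoosh_enabled({})

# Any value at all, even "0" or an empty string, disables search,
# because only the presence of the variable is checked.
on_heroku = whoosh_enabled({"HEROKU": "1"})
still_disabled = whoosh_enabled({"HEROKU": "0"})
```

In other words, to re-enable search the variable has to be removed from the environment, not set to a false-looking value.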
Including the compiled translations is pretty easy. After running tr_compile.py we end up with a <language>.mo file for each <language>.po source file. All we need to do is add the mo files to the git repository, and then in the future we'll have to remember to update them. The mo file for Spanish is included in the branch of the git repository dedicated to this article.
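Since the compiled files now have to be kept up to date by hand, a small check before committing can help. The following is a hypothetical helper, not part of microblog, that lists the po files without a compiled mo file next to them. It only checks that the file exists, not that it is current.

```python
from pathlib import Path


def missing_mo_files(translations_dir):
    """Return the .po files under translations_dir that lack a sibling .mo file."""
    missing = []
    for po in sorted(Path(translations_dir).rglob("*.po")):
        if not po.with_suffix(".mo").exists():
            missing.append(po)
    return missing
```

The directory to pass in depends on where the project keeps its translations, so the path is left to the caller.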
Fixing the logging
Under Heroku, anything that is written to stdout is added to the Heroku application log. But logs written to a disk file will not be accessible. So on this platform we will suppress the file log and instead use a log that writes to stdout:
if not app.debug and os.environ.get('HEROKU') is None:
    # Not on Heroku: keep the rotating log file
    import logging
    from logging.handlers import RotatingFileHandler
    file_handler = RotatingFileHandler('tmp/microblog.log', 'a', 1 * 1024 * 1024, 10)
    file_handler.setLevel(logging.INFO)
    file_handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s: %(message)s [in %(pathname)s:%(lineno)d]'))
    app.logger.addHandler(file_handler)
    app.logger.setLevel(logging.INFO)
    app.logger.info('microblog startup')

if os.environ.get('HEROKU') is not None:
    # On Heroku: log to stdout (StreamHandler() alone would default to stderr)
    import logging
    import sys
    stream_handler = logging.StreamHandler(sys.stdout)
    app.logger.addHandler(stream_handler)
    app.logger.setLevel(logging.INFO)
    app.logger.info('microblog startup')
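The same decision can be isolated in a small function, which makes it easier to try out without a Flask application. This is only a sketch: `make_log_handler` is a hypothetical name, and `delay=True` is added here so that the log file is not opened until the first record is written. Note that `logging.StreamHandler()` called without arguments writes to stderr, so stdout is passed explicitly.

```python
import logging
import sys
from logging.handlers import RotatingFileHandler


def make_log_handler(environ):
    """Pick a handler: stdout on Heroku, a rotating file everywhere else."""
    if environ.get("HEROKU") is not None:
        handler = logging.StreamHandler(sys.stdout)
    else:
        # delay=True postpones opening the file until a record is emitted.
        handler = RotatingFileHandler(
            "tmp/microblog.log", "a", 1 * 1024 * 1024, 10, delay=True)
    handler.setLevel(logging.INFO)
    return handler
```

An application would then call `app.logger.addHandler(make_log_handler(os.environ))` once at startup.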
The web server
Heroku does not provide a web server. Instead, it expects the application to start its own server on the port number given in the environment variable $PORT.
We know the Flask web server is not good for production use because it is single process and single threaded, so we need a better server. The Heroku tutorial for Python suggests gunicorn, a pre-fork style web server written in Python, so that's the one we'll use.
For our local environment gunicorn installs as a regular Python module into our virtual environment:
$ flask/bin/pip install gunicorn
To start this server we need to provide a single argument that names the Python module that defines the application and the application object, separated by a colon. For example, if we wanted to start a local gunicorn server with this module we would issue the following command:
$ flask/bin/gunicorn --log-file - app:app
2013-04-24 08:42:34  [INFO] Starting gunicorn 19.1.1
2013-04-24 08:42:34  [INFO] Listening at: http://127.0.0.1:8000 (31296)
2013-04-24 08:42:34  [INFO] Using worker: sync
2013-04-24 08:42:34  [INFO] Booting worker with pid: 31301
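To make the `app:app` notation less mysterious, here is a simplified sketch of how a string of the form `module:object` can be resolved in Python. It is not gunicorn's actual code, and `load_target` is a hypothetical name; it only illustrates the convention.

```python
import importlib


def load_target(spec):
    """Resolve a 'module:attribute' string to the object it names."""
    module_name, _, attribute = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# The part before the colon is imported as a module and the part after
# it is looked up inside that module.
dumps = load_target("json:dumps")
```

In our case the first `app` is the application package and the second `app` is the Flask application object defined inside it.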
The requirements file
Soon we will be uploading our application to Heroku, but before we can do that we have to inform the server what dependencies the application needs to run. We created a requirements.txt file in the previous chapter, to simplify the installation of dependencies in a dedicated server, and the good news is that Heroku also imports dependencies from a requirements file.
The gunicorn web server needs to be added to the list, and so does the psycopg2 driver, which is required by SQLAlchemy to connect to PostgreSQL databases. The final requirements.txt file looks like this:
Babel==1.3
Flask==0.10.1
Flask-Babel==0.9
Flask-Login==0.2.11
Flask-Mail==0.9.0
Flask-OpenID==1.2.1
Flask-SQLAlchemy==2.0
Flask-WTF==0.10.2
Flask-WhooshAlchemy==0.56
Jinja2==2.7.3
MarkupSafe==0.23
SQLAlchemy==0.9.7
Tempita==0.5.2
WTForms==2.0.1
Werkzeug==0.9.6
Whoosh==2.6.0
blinker==1.3
coverage==3.7.1
decorator==3.4.0
flipflop==1.0
guess-language==0.2
gunicorn==19.1.1
itsdangerous==0.24
pbr==0.10.0
psycopg2==2.5.4
python-openid==2.2.5
pytz==2014.7
six==1.8.0
speaklater==1.3
sqlalchemy-migrate==0.9.2
sqlparse==0.1.11
Some of these modules will not be needed in the Heroku version of our application, but it really doesn't hurt to have extra stuff. To me it seems better to have a complete requirements list.
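Because a missing entry only shows up as a failure after the push, it can be worth checking the file for the two new dependencies first. The sketch below is a hypothetical helper that only understands the simple `name==version` lines used in this file, not the full requirements file syntax.

```python
def parse_requirements(text):
    """Map package name (lowercased) to pinned version for 'name==version' lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, separator, version = line.partition("==")
        if separator:
            pins[name.lower()] = version
    return pins


sample = "Flask==0.10.1\ngunicorn==19.1.1\npsycopg2==2.5.4\n"
pins = parse_requirements(sample)
heroku_ready = "gunicorn" in pins and "psycopg2" in pins
```

Reading the real file is then a matter of passing `open('requirements.txt').read()` to the function.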
The last requirement is to tell Heroku how to run the application. For this Heroku requires a file called Procfile in the root folder of the application.
This file is extremely simple. It just defines process names and the commands associated with them (file Procfile):
web: gunicorn app:app
init: python db_create.py
upgrade: python db_upgrade.py
The web label is associated with the web server. Heroku expects this task and will use it to start our application.
The other two tasks, named init and upgrade, are custom tasks that we will use to work with our application. The init task initializes our application by creating the database. The upgrade task is similar, but instead of creating the database from scratch it upgrades it to the latest migration.
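The format is simple enough that it can be read with a few lines of Python. The sketch below is not how Heroku itself processes the file; `parse_procfile` is a hypothetical helper that just shows the name and command structure of each line.

```python
def parse_procfile(text):
    """Map process name to command for lines of the form 'name: command'."""
    processes = {}
    for line in text.splitlines():
        # Split on the first colon only, so 'app:app' stays in the command.
        name, separator, command = line.partition(":")
        if separator and name.strip():
            processes[name.strip()] = command.strip()
    return processes


PROCFILE = """\
web: gunicorn app:app
init: python db_create.py
upgrade: python db_upgrade.py
"""

processes = parse_procfile(PROCFILE)
```

Here `processes["web"]` holds the gunicorn command, while `processes["init"]` and `processes["upgrade"]` hold the two database scripts.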
Deploying the application
And now we have reached the most interesting part, where we push the application to our Heroku hosting account. This is actually pretty simple. We just use git to push the application:
$ git push heroku master
Counting objects: 307, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (168/168), done.
Writing objects: 100% (307/307), 165.57 KiB, done.
Total 307 (delta 142), reused 272 (delta 122)
-----> Python app detected
-----> No runtime.txt provided; assuming python-2.7.4.
-----> Preparing Python runtime (python-2.7.4)
-----> Installing Distribute (0.6.36)
-----> Installing Pip (1.3.1)
-----> Installing dependencies using Pip (1.3.1)
...
-----> Discovering process types
       Procfile declares types -> init, upgrade, web
-----> Compiled slug size: 29.6MB
-----> Launching... done, v6
       http://flask-microblog.herokuapp.com deployed to Heroku
To git@heroku.com:flask-microblog.git
 * [new branch] master -> master
The heroku remote that we used in the git push command was automatically registered with our git repository when we created our application with heroku create. To see how this remote repository is configured you can run git remote -v in the application folder.
The first time we push the application to Heroku we need to initialize the database and the translation files, and for that we can execute the init task that we included in our Procfile:
$ heroku run init
Running `init` attached to terminal... up, run.7671
/app/.heroku/python/lib/python2.7/site-packages/sqlalchemy/engine/url.py:105: SADeprecationWarning: The SQLAlchemy PostgreSQL dialect has been renamed from 'postgres' to 'postgresql'. The new URL format is postgresql[+driver]://<user>:<pass>@<host>/<dbname>
  module = __import__('sqlalchemy.dialects.%s' % (dialect, )).dialects
The deprecation warning comes from SQLAlchemy, because it does not like that the URI starts with postgres:// instead of postgresql://. This URI comes from Heroku via the $DATABASE_URL environment variable, so we really don't have any control over this. Let's hope this continues to work for a long time.
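If the old scheme ever stops being accepted, one possible workaround would be to rewrite it inside the application before the URI reaches SQLAlchemy. The application in this article does not do this; the sketch below, with the hypothetical name `normalize_db_uri`, only shows what such a rewrite could look like.

```python
def normalize_db_uri(uri):
    """Rewrite a leading postgres:// scheme to postgresql://, leaving others alone."""
    old_prefix = "postgres://"
    if uri.startswith(old_prefix):
        return "postgresql://" + uri[len(old_prefix):]
    return uri


fixed = normalize_db_uri("postgres://user:secret@host/dbname")
```

Only the scheme at the very start of the string is touched, so credentials, host and database name pass through unchanged.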
Believe it or not, now the application is online. In my case, the application can be accessed at http://flask-microblog.herokuapp.com. For example, you can become my follower from my profile page. I'm not sure how long I'll leave it there, but feel free to give it a try if you can connect to it!
Updating the application
The time will come when an update needs to be deployed. This works in a similar way to the initial deployment. First the application is pushed to Heroku with git:
$ git push heroku master
Then the upgrade script is executed:
$ heroku run upgrade
If a problem occurs then it may be useful to see the logs. Recall that for the Heroku hosted version we are writing our logs to stdout, which Heroku collects into its own logs.
To see the logs we use the Heroku client:
$ heroku logs
The above command will show all the logs, including Heroku ones. To only see application logs we can use this command:
$ heroku logs --source app
Things like stack traces and other application errors will appear in these app logs.
Is it worth it?
We've now seen what it takes to deploy to a cloud hosting service so we can now compare against the traditional hosting.
The simplicity aspect is easily won by cloud. At least for Heroku the deployment process was extremely simple. When deploying to a dedicated server or VPS there are a lot of administrative tasks that need to be done to prepare the system. Heroku takes care of all that and allows us to concentrate on our application.
The price is where it is harder to come to a conclusion. Cloud offerings are more expensive than dedicated servers, since you are not only paying for the server but also for the admin work. A pretty basic production service with Heroku that includes two dynos and the least expensive production database costs $85 per month at the time I'm writing this. On the other hand, if you look hard you can find well provisioned VPS servers for about $40 per year.
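Since one price is quoted per month and the other per year, it helps to put both on the same scale. The short calculation below uses only the two figures given above; the variable names are made up for the example.

```python
heroku_per_month = 85  # USD: two dynos plus the cheapest production database
vps_per_year = 40      # USD: a well provisioned VPS found by shopping around

# Convert the Heroku price to a yearly figure and compare.
heroku_per_year = heroku_per_month * 12
ratio = heroku_per_year / vps_per_year
```

That works out to $1020 per year for Heroku, or 25.5 times the cost of the VPS, before counting the value of the administration time saved.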
In the end, I think it all comes down to what is most important to you, time or money.
The updated application is available, as always, on my github page. Alternatively you can download it as a zip file below:
Download microblog 0.18.
With our application deployed in every possible way it feels like we are reaching the end of this journey.
I hope these articles were a useful introduction to the development of a real world web application project, and that the knowledge dump I've made over these eighteen articles motivates you to start your own project.
I'm not closing the door to more microblog articles. If and when an interesting topic comes to mind I will write more, but I expect the rate of updates from now on will slow down a bit. From time to time I may make small updates to the application that don't deserve a blog post, so you may want to watch the project on github to catch these.
I will continue blogging about topics related to web development and software in general, so I invite you to connect via Twitter or Facebook if you haven't done it yet, so that you find my future articles.
Thank you, again, for being a loyal reader.