RESTful Authentication with Flask

This article is the fourth in my series on RESTful APIs. Today I will be showing you a simple yet secure way to protect a Flask-based API with password- or token-based authentication.

This article stands on its own, but if you feel you need to catch up here are the links to the previous articles:

Example Code

The code discussed in the following sections is available for you to try and hack. You can find it on GitHub: REST-auth.

The User Database

To give this example some resemblance to a real life project I'm going to use a Flask-SQLAlchemy database to store users.

The user model will be very simple. For each user a username and a password_hash will be stored.

class User(db.Model):
    __tablename__ = 'users'
    id = db.Column(db.Integer, primary_key = True)
    username = db.Column(db.String(32), index = True)
    password_hash = db.Column(db.String(128))

For security reasons the original password will not be stored; once the hash is calculated during registration, the password is discarded. If this user database were to fall into malicious hands, it would be extremely hard for the attacker to recover the real passwords from the hashes.

Passwords should never be stored in the clear in a user database.

Password Hashing

To create the password hashes I'm going to use PassLib, a package dedicated to password hashing.

PassLib provides several hashing algorithms to choose from. The custom_app_context object is an easy to use option based on the sha256_crypt hashing algorithm.

To add password hashing and verification two new methods are added to the User model:

from passlib.apps import custom_app_context as pwd_context

class User(db.Model):
    # ...

    def hash_password(self, password):
        self.password_hash = pwd_context.hash(password)  # hash() replaces encrypt(), deprecated since PassLib 1.7

    def verify_password(self, password):
        return pwd_context.verify(password, self.password_hash)

The hash_password() method takes a plain password as argument and stores a hash of it with the user. This method is called when a new user is registering with the server, or when the user changes the password.

The verify_password() method takes a plain password as argument and returns True if the password is correct or False if not. This method is called whenever the user provides credentials and they need to be validated.

You may ask how the password can be verified if the original was thrown away and lost forever after it was hashed.

Hashing algorithms are one-way functions, meaning that they can be used to generate a hash from a password, but they cannot be used in the reverse direction. These algorithms are deterministic, however: given the same inputs they will always generate the same output. All PassLib needs to do to verify a password is hash it with the same function and the same parameters (including the salt, which is stored as part of the hash) that were used during registration, and then compare the resulting hash against the one stored in the database.
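To make the idea concrete, here is a minimal sketch of salted hashing and verification using only the Python standard library. It is not how PassLib is implemented internally; the PBKDF2 algorithm, the `salt$hash` storage format and the `ITERATIONS` value are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password: str) -> str:
    """Return 'salt$hash' (both hex encoded) for the given password."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()


def verify_password(password: str, stored: str) -> bool:
    """Re-hash the candidate password with the stored salt and compare."""
    salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    # compare_digest takes the same time regardless of where the values differ
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


stored = hash_password("python")
print(verify_password("python", stored))  # True
print(verify_password("ruby", stored))    # False
```

Note that the original password never needs to be recovered: verification only repeats the same one-way computation and compares the results.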

User Registration

In this example, a client can register a new user with a POST request to /api/users. The body of the request needs to be a JSON object that has username and password fields.

The implementation of the Flask route is shown below:

@app.route('/api/users', methods = ['POST'])
def new_user():
    username = request.json.get('username')
    password = request.json.get('password')
    if username is None or password is None:
        abort(400) # missing arguments
    if User.query.filter_by(username = username).first() is not None:
        abort(400) # existing user
    user = User(username = username)
    user.hash_password(password)
    db.session.add(user)
    db.session.commit()
    return jsonify({ 'username': user.username }), 201, {'Location': url_for('get_user', id = user.id, _external = True)}

This function is extremely simple. The username and password arguments are obtained from the JSON input coming with the request and then validated.

If the arguments are valid then a new User instance is created. The username is assigned to it, and the password is hashed using the hash_password() method. The user is finally written to the database.

The body of the response shows the user representation as a JSON object, with a status code of 201 and a Location header pointing to the URI of the newly created user.
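The validation step can also be pulled out into a plain function that is easy to exercise without a running server. The sketch below is only an illustration: `check_registration` is a hypothetical helper that is not part of the example project, and a set of usernames stands in for the database query.

```python
def check_registration(payload, existing_usernames):
    """Return None if the payload is acceptable, else the HTTP status to abort with."""
    if not isinstance(payload, dict):
        return 400  # body missing or not a JSON object
    username = payload.get("username")
    password = payload.get("password")
    if username is None or password is None:
        return 400  # missing arguments
    if username in existing_usernames:
        return 400  # existing user
    return None


existing = {"miguel"}
print(check_registration({"username": "susan", "password": "cat"}, existing))   # None
print(check_registration({"username": "miguel", "password": "cat"}, existing))  # 400
print(check_registration({"username": "susan"}, existing))                      # 400
```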

Note: the implementation of the get_user endpoint is not shown here; you can find it in the full example on GitHub.

Here is an example user registration request sent from curl:

$ curl -i -X POST -H "Content-Type: application/json" -d '{"username":"miguel","password":"python"}' http://127.0.0.1:5000/api/users
HTTP/1.0 201 CREATED
Content-Type: application/json
Content-Length: 27
Location: http://127.0.0.1:5000/api/users/1
Server: Werkzeug/0.9.4 Python/2.7.3
Date: Thu, 28 Nov 2013 19:56:39 GMT

{
  "username": "miguel"
}

Note that in a real application this would be done over secure HTTP. There is no point in going through the effort of protecting the API if the login credentials are going to travel through the network in clear text.

Password Based Authentication

Now let's assume there is a resource exposed by this API that needs to be available only to registered users. This resource is accessed at the /api/resource endpoint.

To protect this resource I'm going to use HTTP Basic Authentication, but instead of implementing this protocol by hand I'm going to let the Flask-HTTPAuth extension do it for me.

Using Flask-HTTPAuth an endpoint is protected by adding the login_required decorator to it:

from flask_httpauth import HTTPBasicAuth  # the flask.ext.* import path was removed in Flask 1.0
auth = HTTPBasicAuth()

@app.route('/api/resource')
@auth.login_required
def get_resource():
    return jsonify({ 'data': 'Hello, %s!' % g.user.username })

But of course Flask-HTTPAuth needs to be given some more information to know how to validate user credentials, and for this there are several options depending on the level of security implemented by the application.

The option that gives the maximum flexibility (and the only one that can accommodate PassLib hashes) is implemented through the verify_password callback, which is given the username and password and is supposed to return True if the combination is valid or False if not. Flask-HTTPAuth invokes this callback function whenever it needs to validate a username and password pair.

An implementation of the verify_password callback for the example API is shown below:

@auth.verify_password
def verify_password(username, password):
    user = User.query.filter_by(username = username).first()
    if not user or not user.verify_password(password):
        return False
    g.user = user
    return True

This function finds the user by the username, then verifies the password using the verify_password() method. If the credentials are valid then the user is stored in Flask's g object so that the view function can use it.

Here is an example curl request that gets the protected resource for the user registered above:

$ curl -u miguel:python -i -X GET http://127.0.0.1:5000/api/resource
HTTP/1.0 200 OK
Content-Type: application/json
Content-Length: 30
Server: Werkzeug/0.9.4 Python/2.7.3
Date: Thu, 28 Nov 2013 20:02:25 GMT

{
  "data": "Hello, miguel!"
}

If an incorrect login is used, then this is what happens:

$ curl -u miguel:ruby -i -X GET http://127.0.0.1:5000/api/resource
HTTP/1.0 401 UNAUTHORIZED
Content-Type: text/html; charset=utf-8
Content-Length: 19
WWW-Authenticate: Basic realm="Authentication Required"
Server: Werkzeug/0.9.4 Python/2.7.3
Date: Thu, 28 Nov 2013 20:03:18 GMT

Unauthorized Access

Once again I feel the need to reiterate that in a real application the API should be available on secure HTTP only.
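To see why, it helps to look at what the client actually sends. With HTTP Basic Authentication, the username and password travel in the Authorization header encoded as base64, which is an encoding and not encryption. The sketch below uses only the standard library; the two helper functions are hypothetical names written for this illustration.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value for HTTP Basic Authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def parse_basic_auth(header: str):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # only the first colon separates the fields; the password may contain colons
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("miguel", "python")
print(header)                    # Basic bWlndWVsOnB5dGhvbg==
print(parse_basic_auth(header))  # ('miguel', 'python')
```

Anyone who can observe the request can reverse the encoding just as easily as the server does, which is why the transport itself needs to be encrypted.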

Token Based Authentication

Having to send the username and the password with every request is inconvenient and can be seen as a security risk even if the transport is secure HTTP, since the client application must have those credentials stored without encryption to be able to send them with the requests.

An improvement over the previous solution is to use a token to authenticate requests.

The idea is that the client application exchanges authentication credentials for an authentication token, and in subsequent requests just sends this token.

Tokens are usually given out with an expiration time, after which they become invalid and a new token needs to be obtained. The potential damage that can be caused if a token is leaked is much smaller due to its short life span.

There are many ways to implement tokens. A straightforward implementation is to generate a random sequence of characters of a certain length that is stored with the user and the password in the database, possibly with an expiration date as well. The token then becomes a sort of plain text password, in that it can be easily verified with a string comparison, plus a check of its expiration date.
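A minimal sketch of this straightforward approach is shown below. It is an illustration only: an in-memory dictionary stands in for the extra database columns, and the `issue_token` and `check_token` names are hypothetical.

```python
import hmac
import secrets
import time

# username -> (token, expiry timestamp); stands in for columns in the users table
_tokens = {}


def issue_token(username, expires_in=600, now=None):
    """Generate a random token for the user and remember when it expires."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _tokens[username] = (token, now + expires_in)
    return token


def check_token(username, token, now=None):
    """Return True if the token matches the stored one and has not expired."""
    now = time.time() if now is None else now
    record = _tokens.get(username)
    if record is None:
        return False
    stored, expires_at = record
    if now >= expires_at:
        return False
    # constant-time string comparison
    return hmac.compare_digest(stored.encode("utf-8"), token.encode("utf-8"))


token = issue_token("miguel")
print(check_token("miguel", token))                     # True
print(check_token("miguel", token, now=time.time() + 601))  # False, expired
```

The cost of this design is the server side state: every token has to be stored, looked up and eventually cleaned up.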

A more elaborate implementation that requires no server side storage is to use a cryptographically signed message as a token. This has the advantage that the information related to the token, namely the user for which the token was generated, is encoded in the token itself and protected against tampering with a strong cryptographic signature.

Flask uses a similar approach to write secure cookies. This implementation is based on a package called itsdangerous, which I will also use here.

The token generation and verification can be implemented as additional methods in the User model:

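# Note: TimedJSONWebSignatureSerializer was removed in itsdangerous 2.1, so this needs an older release
from itsdangerous import (TimedJSONWebSignatureSerializer as Serializer,
                          BadSignature, SignatureExpired)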

class User(db.Model):
    # ...

    def generate_auth_token(self, expiration = 600):
        s = Serializer(app.config['SECRET_KEY'], expires_in = expiration)
        return s.dumps({ 'id': self.id })

    @staticmethod
    def verify_auth_token(token):
        s = Serializer(app.config['SECRET_KEY'])
        try:
            data = s.loads(token)
        except SignatureExpired:
            return None # valid token, but expired
        except BadSignature:
            return None # invalid token
        user = User.query.get(data['id'])
        return user

In the generate_auth_token() method the token is a signed (not encrypted) version of a dictionary that has the id of the user, so its contents can be read by anyone but cannot be altered without invalidating the signature. The token will also have an expiration time embedded in it, which by default is ten minutes (600 seconds).

The verification is implemented in a verify_auth_token() static method. A static method is used because the user will only be known once the token is decoded. If the token can be decoded then the id encoded in it is used to load the user, and that user is returned.

The API needs a new endpoint that the client can use to request a token:

@app.route('/api/token')
@auth.login_required
def get_auth_token():
    token = g.user.generate_auth_token()
    return jsonify({ 'token': token.decode('ascii') })

Note that this endpoint is protected with the auth.login_required decorator from Flask-HTTPAuth, which requires that username and password are provided.

What remains is to decide how the client is to include this token in a request.

The HTTP Basic Authentication protocol does not specifically require that usernames and passwords are used for authentication; these two fields in the HTTP header can be used to transport any kind of authentication information. For token-based authentication the token can be sent as the username, and the password field can be ignored.

This means that the server now receives some requests authenticated with username and password, and others authenticated with an authentication token. The verify_password callback needs to support both authentication styles:

@auth.verify_password
def verify_password(username_or_token, password):
    # first try to authenticate by token
    user = User.verify_auth_token(username_or_token)
    if not user:
        # try to authenticate with username/password
        user = User.query.filter_by(username = username_or_token).first()
        if not user or not user.verify_password(password):
            return False
    g.user = user
    return True

This new version of the verify_password callback attempts authentication twice. First it tries to use the username argument as a token. If that doesn't work, then username and password are verified as before.

The following curl request gets an authentication token:

$ curl -u miguel:python -i -X GET http://127.0.0.1:5000/api/token
HTTP/1.0 200 OK
Content-Type: application/json
Content-Length: 139
Server: Werkzeug/0.9.4 Python/2.7.3
Date: Thu, 28 Nov 2013 20:04:15 GMT

{
  "token": "eyJhbGciOiJIUzI1NiIsImV4cCI6MTM4NTY2OTY1NSwiaWF0IjoxMzg1NjY5MDU1fQ.eyJpZCI6MX0.XbOEFJkhjHJ5uRINh2JA1BPzXjSohKYDRT472wGOvjc"
}

Now the protected resource can be obtained by authenticating with the token:

$ curl -u eyJhbGciOiJIUzI1NiIsImV4cCI6MTM4NTY2OTY1NSwiaWF0IjoxMzg1NjY5MDU1fQ.eyJpZCI6MX0.XbOEFJkhjHJ5uRINh2JA1BPzXjSohKYDRT472wGOvjc:unused -i -X GET http://127.0.0.1:5000/api/resource
HTTP/1.0 200 OK
Content-Type: application/json
Content-Length: 30
Server: Werkzeug/0.9.4 Python/2.7.3
Date: Thu, 28 Nov 2013 20:05:08 GMT

{
  "data": "Hello, miguel!"
}

Note that in this last request the password is written as the word unused. The password in this request can be anything, since it isn't used.
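It is worth seeing what is inside the token returned above. Since the token is signed rather than encrypted, its first two dot-separated segments are just base64-encoded JSON that anyone can read. The short sketch below decodes them with the standard library; `decode_segment` is a helper written for this illustration, and the signature (the third segment) is not checked here.

```python
import base64
import json

token = ("eyJhbGciOiJIUzI1NiIsImV4cCI6MTM4NTY2OTY1NSwiaWF0IjoxMzg1NjY5MDU1fQ"
         ".eyJpZCI6MX0"
         ".XbOEFJkhjHJ5uRINh2JA1BPzXjSohKYDRT472wGOvjc")


def decode_segment(segment):
    """Decode one URL-safe base64 segment (padding restored) into a dict."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


header_b64, payload_b64, signature_b64 = token.split(".")
header = decode_segment(header_b64)
payload = decode_segment(payload_b64)

print(header)                         # {'alg': 'HS256', 'exp': 1385669655, 'iat': 1385669055}
print(payload)                        # {'id': 1}
print(header["exp"] - header["iat"])  # 600
```

The payload carries the user id, and the difference between the exp and iat fields is the 600 second lifetime set in generate_auth_token(). For this reason nothing secret should ever be placed in a token of this kind.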

OAuth Authentication

When talking about RESTful authentication the OAuth protocol is usually mentioned.

So what is OAuth?

OAuth can be many things. It is most commonly used to allow an application (the consumer) to access data or services that the user (the resource owner) has with another service (the provider), and this is done in a way that prevents the consumer from knowing the login credentials that the user has with the provider.

For example, consider a website or application that asks you for permission to access your Facebook account and post something to your timeline. In this example you are the resource owner (you own your Facebook timeline), the third party application is the consumer and Facebook is the provider. Even if you grant access and the consumer application writes to your timeline, it never sees your Facebook login information.

This usage of OAuth does not apply to a client/server RESTful API. Something like this would only make sense if your RESTful API can be accessed by third party applications (consumers).

In the case of direct client/server communication there is no need to hide login credentials: the client (curl in the examples above) receives the credentials from the user and uses them to authenticate requests with the server directly.

OAuth can do this as well, and then it becomes a more elaborate version of the example described in this article. This is commonly referred to as "two-legged OAuth", in contrast to the more common "three-legged OAuth".

If you decide to support OAuth, there are a few implementations available for Python listed on the OAuth website.

Conclusion

I hope this article helped you understand how to implement user authentication for your API.

Once again, you can download and play with a fully working implementation of the server described above. You can find the software on my GitHub site: REST-auth.

If you have any questions or found any flaws in the solution I presented please let me know below in the comments.

Miguel

54 comments

  • #1 paramonies said :

    Miguel, wonderful as always. Thanks to you, I became a fan of Flask

  • #2 Saša said :

    I am concerned about 2 queries to the database to check if credentials are good, and getting user object. If your API is used heavily it will affect performance. How would you work around this issue if you have restful API in flask and for example AngularJS and Android clients? Cash user object in Redis or somewhere and invalidate on each change to the user or set TTL on it? Also, how would you ensure idempotency (ensure that if same requests is sent a few times only one is executed) of requests? Have some decorator around this requests and in it also store/check in redis some flag that it is or it is not already done? Thanks!

  • #3 Miguel Grinberg said :

    @Saša: when you say "two queries", do you mean the two methods of verification in the verify_password callback, namely token and then username/password? If you assume that clients use tokens most of the time then the second query will never trigger, only during token renewal this would happen. And if that bothers you then you could use the password field in the authentication header to indicate the username field is a token, and that would address the issue completely. About idempotency, I'm not sure it is what you are saying. An idempotent request is one that returns the same response when it is called multiple times. They all obviously need to execute; the key is that repeating the request does not change anything. The POST request to create a user is not idempotent, the GET request that gets the protected resource is. I don't think adding a Redis cache layer is any more complex for this example, as long as you handle the caching logic things should work just fine.

  • #4 ck said :

    Great tutorial, as always, Miguel :) Just one question: How is salting handled in passlib? Does the library take care of that?

  • #5 Miguel Grinberg said :

    @ck: Yes, passlib handles salts transparently. Each hashed password gets its own randomly generated salt, which is appended to the hash, so there is no need to store it separately in the database. During verification passlib extracts the salt from the stored hash and uses it again on the password that needs to be verified.

  • #6 Marco Massenzio said :

    Very interesting article, thanks for sharing. A couple of things: I'm wondering whether the code changes significantly if using Digest auth (whilst not perfect, and not as ideal as using https:// it still alleviates the cleartext monstrosity of Basic HTTP). Also, just wanted to let you know that your blog renders poorly on an iPad: the right-hand side column essentially gets in the way of the text and makes it illegible (both in landscape or portrait). No big deal, but I thought you may want to know in case you have some control over the CSS of the page.

  • #7 Miguel Grinberg said :

    @Marco: Digest Auth is not a replacement for SSL. It is correct that at least you don't send clear text passwords, but with digest it is not possible to create user accounts or to change passwords, since for those the password needs to reach the server in clear text so that it is hashed. You could combine SSL for the account admin routes and digest auth for the regular routes, but if you are already doing SSL you might as well do everything encrypted. I'll look at the formatting of the page, thanks for letting me know.

  • #8 lukas said :

    the flask restfull is more problematic that I though. For: class Token(Resource): @auth.login_required def get(self): token = g.user.generate_auth_token(600, api.app.config["SECRET_KEY"]) return jsonify({"token": token.edcode("ascii"), "duration": 600 }) I got TypeError: verify_password() takes exactly 2 arguments (3 given) so I changed def verify_password(username_or_token, password): to def verify_password(username_or_token, password, bla): and now it's TypeError: verify_password() takes exactly 2 arguments (3 given)

  • #9 Miguel Grinberg said :

    @lukas: the problem is likely in your implementation of the verify_password callback. It would help me to see the stack trace of the failure, without that I can only guess.

  • #10 lukas said :

    Here is the source code https://github.com/lukasz-madon/crud/tree/master/server

  • #11 lukas said :

    Sure! The implemention is the same. Stacktrace doesn't help much. Traceback (most recent call last): File "c:\Users\lukas\Desktop\web\project-mgmt\server\venv\lib\site-packages\fl ask\app.py", line 1836, in __call__ return self.wsgi_app(environ, start_response) File "c:\Users\lukas\Desktop\web\project-mgmt\server\venv\lib\site-packages\fl ask\app.py", line 1820, in wsgi_app response = self.make_response(self.handle_exception(e)) File "c:\Users\lukas\Desktop\web\project-mgmt\server\venv\lib\site-packages\fl ask_restful\__init__.py", line 250, in error_router return self.handle_error(e) File "c:\Users\lukas\Desktop\web\project-mgmt\server\venv\lib\site-packages\fl ask_restful\__init__.py", line 268, in handle_error raise exc TypeError: verify_password() takes exactly 3 arguments (2 given)

  • #12 Miguel Grinberg said :

    @Lukas: You can't use a view function decorator on a method. Flask provides an alternative way to attach decorators, see http://flask.pocoo.org/docs/views/#decorating-views.

  • #13 lukas said :

    I posted 4 comments, only 2 are published. Traceback (most recent call last): File "c:\Users\lukas\Desktop\web\project-mgmt\server\venv\lib\site-packages\fl ask\app.py", line 1836, in __call__ return self.wsgi_app(environ, start_response) File "c:\Users\lukas\Desktop\web\project-mgmt\server\venv\lib\site-packages\fl ask\app.py", line 1820, in wsgi_app response = self.make_response(self.handle_exception(e)) File "c:\Users\lukas\Desktop\web\project-mgmt\server\venv\lib\site-packages\fl ask_restful\__init__.py", line 250, in error_router return self.handle_error(e) File "c:\Users\lukas\Desktop\web\project-mgmt\server\venv\lib\site-packages\fl ask_restful\__init__.py", line 268, in handle_error raise exc TypeError: verify_password() takes exactly 3 arguments (2 given)

  • #14 Miguel Grinberg said :

    @lukas: have you seen my message above? You are repeating the same stack trace as before.

  • #15 Gurom said :

    Dear Miguel, Perhaps I'am very "green" in Flask knowledge, but please make some comments. Please look at the method: @staticmethod def verify_auth_token(token): .......... user = User.query.get(data['id']) return user Each time we use db-request for reading "user" . Such request will be used for each "click" of webapp. I think that it is not correct for server algorithm. Do you have idea how to get "user" from the quick cache?

  • #16 Miguel Grinberg said :

    @Gurom: You are correct, each time a request arrives the verify_auth_token() function will load the user from the database. This example does not have a cache, so that's the only way to obtain the user. It may seem to you that this is expensive, but locating a user by its primary key id is one of the simplest database operations, so I would only optimize this if it is found to be causing performance problems. In a real application you would be doing way more work during the request, reading the user will be insignificant compared to the real work done by the request.

  • #17 Jacopo said :

    Thanks for the tutorial awesome as usual. I have a question that might slightly off-topic but how would you protect the registration endpoint from bots? Here are few ideas I can think of and why none of them are ideal: - Confirmation email with activation URL. This seems the best idea but I find it inconvenient for the user. - Client-side captcha written in Javascript and validated before the post request to the registration endpoint. This is easy to implement but obviously doesn't protect you at all from registration requests outside of the client. - Checking the referer in the request header against a whitelist of allowed clients. This is a fairly simple solution as well but doesn't work against spoofing. Which solution would you suggest?

  • #18 Miguel Grinberg said :

    @Jacopo: On a real API I would probably not have user registration as part of the API. Registration would be done on a web site, where you can have CSRF protection and if you like them, captchas. The method to prevent bogus registrations varies depending on the application and is a hard problem. You may need to be prepared to ban accounts after they are found spamming or doing other bad things.

  • #19 bluekirai said :

    Hi! Thanks for writing, Flask is definitely great for API's, I am going to use it from now on. I have two design level questions regarding the API that I encountered. 1) With this design, how to handle the expiration on mobile devices? By setting an expiry time of ten minutes for token, the developer would have to store username + password locally to mobile device, and re-authenticate if token expires? My first instinct would be setting the expiry time to something like 10 days but this would not really be a solution. 2) Why you prefer using the authorization instead of delivering the token in headers as its own field? ("X-Auth-Token" or such)

  • #20 Miguel Grinberg said :

    @bluekirai: 1) if you are going to use token that have expiration then the client needs to store credentials. You can opt to not make your tokens expire if you prefer, then once a token is obtained it can be used forever. Alternatively, with this particular implementation you can use a token as authorization in a request that gets a new token. That way you can pass from token to token without having to have client credentials stored. 2) I'm using the Flask-HTTPAuth extension (which I also wrote) to handle the server side authentication. It is perfectly fine to send the token in the X-Auth-Token and handle it yourself.

  • #21 Damien Mathieu said :

    Hi, Thanks for your interesting article. I created an flask app like you. But i have a problem. The authentication is ok when i run the flask app in dev (on localhost) But behind a web server (apache), the authentication is broken (error 500). i use mod_wsgi. Do you have an idea ? Regards, damien

  • #22 Miguel Grinberg said :

    @Damien: Enable logging of errors in your Flask server so that you can determine what the source of the code 500 error is.

  • #23 Sam M said :

    Miguel, again, an excellent article. I will be using some parts of it (the idea of token based authentication) to create an API authentication mechanism for my app. So thanks for that. But a question, why not simply use generate_password_hash and check_password_hash from werkzeug instead of using passlib?

  • #24 Miguel Grinberg said :

    @Sam: using Werkzeug's hashing should be fine, it's similar. This is just a personal preference. Werkzeug's hashing functions are unlikely to be updated, while a dedicated package is more likely to make new algorithms available.

  • #25 zimyand said :

    Hm.. The problem here is if almost expired token is stolen - you can get a new token with the old one. Did you consider accepting only username and password to get a new token?
