The New Way To Generate Secure Tokens in Python

Authentication Tokens

When working with web applications, it is often necessary to generate passwords, tokens or API keys that clients then use to authenticate. While there are many sophisticated ways to generate these, in many cases it is perfectly adequate to use sufficiently long and random sequences of characters. The problem is that if you are doing this in Python, there is more than one way to generate random strings, and it isn't always clear which way is the best and most secure.

You would think that adding yet one more method to generate random strings would confuse things even more, but unlike all the other options, the new secrets module introduced in Python 3.6 is actually designed for this specific use case, so for my part it is a welcome addition to the Python standard library. In this short article I'm going to give you an overview of this new module.

Generating Tokens

The secrets module is part of the Python standard library in Python 3.6 and newer. You can import this module into your application or into a Python shell as follows:

>>> import secrets

At the core of this module there are three functions that generate random tokens using the best random number generator provided by your system. The first function generates binary sequences of random bytes:

>>> secrets.token_bytes()
b'\xc6\xce\xc8+4\xfd7\x1b\xd3jb\x1eV\xe6\xc1\xdd\x05\x9d*\xe5Q\x17\x7fe\r\x1e\x93%j\xf9\xfen'
>>> secrets.token_bytes(20)
b'\x1b\x97\x8d\xf8oM\x07\x11i\x98>\x95\x9c\x0e\x14\xfc\xefK\xd8\xa9'

Invoking the token_bytes() function without any arguments returns a token with a default length that is considered sufficiently secure (currently 32 bytes, though the default may change in future Python versions). You can also pass the desired length in bytes as an argument, as you can see in the second example above.

The token_hex() function works in a similar way, but returns a string with the bytes rendered in hexadecimal notation instead of a raw binary string:

>>> secrets.token_hex()
'1476f4cfa96f20af2ca8cfdf9c5920f54d78f1b835318d729ceec2a72403cc29'
>>> secrets.token_hex(20)
'86f41a39c3a243fd22d96228eaeb23a60df36e76'

With this function, each byte in the sequence is rendered as two hexadecimal digits, so in the second example above, where I request a token of 20 bytes, the resulting string is 40 characters long.
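As a quick check of the relationship between bytes and hex digits, the following sketch requests the same number of bytes from both functions and compares the lengths. The variable names are only for illustration.

```python
import secrets

nbytes = 20  # the argument counts random bytes, not output characters

raw = secrets.token_bytes(nbytes)
hex_token = secrets.token_hex(nbytes)

# Each byte is rendered as two hexadecimal digits.
print(len(raw))        # 20
print(len(hex_token))  # 40

# The hex string decodes back to a byte sequence of the requested length.
print(len(bytes.fromhex(hex_token)))  # 20
```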

The third function in this group is token_urlsafe(), which returns the random bytes encoded as a URL-safe base64 string:

>>> secrets.token_urlsafe()
'bmP7C2E1TC0RGsMDsMRWvIW_pj4g4gxhdxTvL-43w7Q'
>>> secrets.token_urlsafe(20)
'PbnveAR7aAY-M6Fw1cIvyDZDvO8'

The base64 encoding is more compact than hexadecimal. In the example above you can see that when I requested a token of 20 bytes, the resulting base64 encoded string is 27 characters long, compared to 40 characters for the hexadecimal version.
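The length of a URL-safe token can be estimated ahead of time. Base64 turns every 3 bytes into 4 characters, and token_urlsafe() strips the trailing padding, so a token of n bytes comes out at roughly 4n/3 characters, rounded up. The sketch below checks this for a few sizes; the sizes themselves are arbitrary.

```python
import math
import secrets

for nbytes in (16, 20, 32):
    token = secrets.token_urlsafe(nbytes)
    # 4 output characters per 3 input bytes, rounded up, no "=" padding
    expected_length = math.ceil(nbytes * 4 / 3)
    print(nbytes, len(token), expected_length)
```

For 20 bytes this prints 27 for both lengths, which matches the example above.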

How do you know which of these functions to use? For most cases, the token_urlsafe() function is probably the best option, so start from that one. If you prefer random strings encoded in hexadecimal notation (which will give you only characters in the 0-9 and a-f ranges) then use token_hex(). Finally, if you prefer a raw binary string, without any encoding, then use token_bytes().

There are many use cases that benefit from having a simple and secure way to generate tokens. Here are a few examples:

  • API keys that are given to clients after they authenticate with username and password
  • Password reset tokens to be sent to the user by email
  • Initial passwords for new accounts (you will likely want users to change their password after the first login)
  • IDs for background tasks or other asynchronous operations
  • Passwords to assign to other services such as databases, message queues, etc.
  • Dynamically created unique URLs
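To make one of these use cases concrete, here is a minimal sketch of password reset tokens. It assumes an in-memory dictionary in place of a real database table, and the function names issue_reset_token() and redeem_reset_token() are hypothetical, not part of any library. Because the token itself is just a random string, any data associated with it, such as the user and an expiration time, has to be stored on the server.

```python
import secrets
import time

# Hypothetical in-memory store; a real application would use a database table.
_reset_tokens = {}


def issue_reset_token(user_id, ttl_seconds=3600):
    """Create a single-use token for user_id and record when it expires."""
    token = secrets.token_urlsafe()
    _reset_tokens[token] = (user_id, time.time() + ttl_seconds)
    return token


def redeem_reset_token(token):
    """Return the user id for a valid token, or None. A token works only once."""
    entry = _reset_tokens.pop(token, None)
    if entry is None:
        return None
    user_id, expires_at = entry
    if time.time() > expires_at:
        return None
    return user_id


token = issue_reset_token(42)
print(redeem_reset_token(token))  # 42
print(redeem_reset_token(token))  # None, the token was already used
```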

Generating Random Numbers

While the token generation functions I described in the previous section are the most useful, the secrets module also provides a few functions that deal with random numbers.

The choice() function returns a randomly selected item from the non-empty sequence provided as an argument:

>>> secrets.choice(['apple', 'banana', 'pear'])
'apple'
>>> secrets.choice(['apple', 'banana', 'pear'])
'banana'
>>> secrets.choice(['apple', 'banana', 'pear'])
'apple'

This function can be combined with a list comprehension to generate random strings that only use a specific set of characters. For example, if you want to generate a random string of 20 characters that only uses the letters abcd you can do so as follows:

>>> ''.join([secrets.choice('abcd') for i in range(20)])
'bcdcdbacaccdcaccacaa'
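The same idea extends to larger alphabets. The sketch below wraps the expression above in a small helper, here called random_string() purely for illustration, and uses the letters and digits defined in the standard string module as the default character set.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def random_string(length=20, chars=ALPHABET):
    """Build a string by picking each character independently with secrets.choice()."""
    return ''.join(secrets.choice(chars) for _ in range(length))


print(random_string())           # 20 letters and digits
print(random_string(8, 'abcd'))  # 8 characters drawn from 'abcd'
```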

The randbelow() function generates a random integer number between 0 and the number given as an argument (not including this number):

>>> secrets.randbelow(10)
4
>>> secrets.randbelow(10)
5
>>> secrets.randbelow(10)
0
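Since randbelow() always starts at 0, a range with a different lower bound can be obtained by adding an offset. The sketch below shows this, along with a zero-padded six-digit numeric code; the helper name rand_between() is made up for this example.

```python
import secrets


def rand_between(low, high):
    """Return a random integer N such that low <= N <= high."""
    return low + secrets.randbelow(high - low + 1)


# Simulate a die roll, a value from 1 to 6 inclusive.
print(rand_between(1, 6))

# A six-digit numeric code, padded with leading zeros when needed.
code = f'{secrets.randbelow(10**6):06d}'
print(code)
```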

Finally, the randbits() function returns a random non-negative integer built from the specified number of random bits, so for k bits the result is between 0 and 2**k - 1:

>>> secrets.randbits(8)
178
>>> secrets.randbits(24)
11823580
>>> secrets.randbits(173)
3046477408020019979024496462779526808974163116220575
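One detail worth noting is that the argument is an upper limit on the size of the result. The leading bits can come out as zero, so the returned integer may need fewer bits than requested. A short sketch to illustrate:

```python
import secrets

k = 8
n = secrets.randbits(k)

# The value always fits in k bits...
print(0 <= n < 2 ** k)      # True
# ...but it may use fewer, because leading bits can be zero.
print(n.bit_length() <= k)  # True
```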

Conclusion

I hope you found this little article useful. I find the token generation functions, and in particular token_urlsafe(), very convenient and keep discovering new uses for it. Are you using these functions for an original purpose I have not described in this article? Let me know below in the comments!

9 comments
  • #1 Eddy van den Aker said

    Hi Miguel,

    Thanks for the awesome content, I'm currently doing my uni graduation project in Python and Flask and your book and blog is great for reference.

    Anyway, my question is, would you replace the random token generation in the Microblog (for example the password reset token now generated with jwt) by this method? Why or why not?

    Thanks,

  • #2 Miguel Grinberg said

    @Eddy: The JWT tokens are not random, they store useful information. If you replace them with randomly generated tokens, then you need a database table where you can write all that information that the JWT stores in the token. I prefer to use JWT tokens for this type of problem, as it saves you from storing token data on the server.

  • #3 Chinmay Prabhudesai said

    This doesn't take care of a Timed token for sessions, how would you manage that ?

  • #4 Miguel Grinberg said

    @Chinmay: what is a "timed token for sessions"? A token in this context is a unique sequence of characters, it does not have expiration attached.

  • #5 Abhi said

    Great article. Really helpful.

  • #6 Fergus said

    Miguel,
    Truly inspiring teaching on Flask.
    I could add an implementation of something that was mentioned in comments on the User Login module of your Mega Flask Tutorial:
    To stop users logging in simultaneously on different devices/browsers:

    Alter User class in models.py, to add the following column:

    class User(Base, UserMixin):
        ...
        session_token = Column(String(40))
    

    Add to your login function (after verifying password, before Flask-login's login_user call):

            ...
            import secrets
            user.session_token = secrets.token_hex(20)
            cs.add(user)
            cs.commit()
            ...
    

    Change the LoginManager user_loader call as follows:

    @login_manager.user_loader
    def load_user(session_token):
        return User.query.filter_by(session_token=session_token).first()
    
  • #7 Miguel Grinberg said

    @Fergus: I believe you are also missing a new definition for the get_id() method in your User class. Something like this maybe?

    def get_id(self):
        return self.session_token
    
  • #8 Rafael Ribeiro said

    Nice article! I made a lib with this inspired by your article and the keepass software... pyrandomset

  • #9 Firas Fatnassi said

    Such a nice article thanks for the amazing content. This helped me in solving predictable tokens issues.
