OAuth Authentication with Flask in 2023

A long time ago I wrote a tutorial on how to add logins with a social network to your Flask application, using the OAuth protocol. It's been almost 9 years since I wrote that article, and believe it or not, the OAuth protocol continues to be well supported by all major players including Twitter, Facebook, Google, GitHub and many more.

But of course, 9 years is a very long time in tech. Even though not much has changed in terms of how this method of authentication works, some of the packages that I used back then have had major upgrades, while others have become unmaintained, so an update is due.

OAuth Refresher

I covered this in my previous article on this topic, but in case you aren't familiar with the OAuth protocol, here is a short summary of how it works.

Let's say you have a web application hosted on yoursite.com. The authentication process is initiated when the user goes to yoursite.com and clicks on a "Login with X" button. The "X" here is a placeholder for any authentication provider that supports the OAuth protocol. Possible choices include Twitter, Facebook, Google, Okta, GitHub, GitLab, BitBucket and many, many more.

After the user clicks the login button on yoursite.com, the browser redirects them to the provider's authentication page. If the user is not already logged in with the provider, then they are asked to log in. Once the provider knows who the user is, it presents a consent page, asking the user for permission to share their information with the application at yoursite.com. The consent page clearly tells the user what information is going to be shared. Normally, information shared includes the user's email address and username, but some web sites may ask for other things as well. Users should always review the consent page and make sure the data that is shared is reasonable.

After the user consents or denies the sharing of their details, the provider redirects back to yoursite.com. If the user accepted, then the provider passes an authorization code to the application. If the user did not consent, then an error message is passed instead, which the application can show to inform the user of the failure.

If consent was given, the browser returns to yoursite.com with the authorization code. The back end running on yoursite.com now uses the authorization code to request an access token from the provider. Then it uses the access token to make any necessary calls to the provider's API and obtain the user information it needs to authenticate the user, such as their email address and/or username.

Because the user is authenticated with the provider, and the application trusts that provider, this is all that is required, and now yoursite.com can log the user in. The process may sound complicated, but for the user it is extremely easy and just requires the following steps:

1. Click the "Login with X" button
2. Log in on the "X" website (only if not previously logged in)
3. Accept sharing of email and/or username (only the first time)
4. The user is now logged in to the application!

Since steps 2 and 3 above only happen once, subsequent login attempts by the user will just require clicking the "Login with X" button, and the rest of the authentication process is carried out silently and without user interaction.

What Has Changed Since 2014

The OAuth protocol has a robust design, and for that reason it has not needed new revisions. Back in 2014, there were two versions of the protocol actively in use, versions 1.0a and 2.0. These days, all the authentication providers I know use version 2.0. The few that were on 1.0a in the past are now also using 2.0. In this article I'm only going to discuss the 2.0 version, which by the way, is the more secure and also the easier to implement.

The dependencies that I used in the original article have also changed. Back in 2014, Python 2 was still the most used Python, and Flask hadn't even reached the 1.0 milestone. The rauth package that I used to handle the OAuth handshake has not seen any new releases since 2015, so I'm not using it anymore. The OAuth 2.0 protocol is simple enough that I'm now implementing it with just Flask and the requests library.

Another important thing that changed is that my understanding of the OAuth 2.0 protocol has improved since I wrote my original implementation and for that reason I've been able to simplify my solution.

You may have heard of an alternative authentication protocol called OpenID Connect (OIDC), which has gained some traction in recent years. OIDC uses OAuth 2.0 and adds a simple identity layer on top of it based on JWT tokens. Most providers that implement OIDC also support plain OAuth 2.0, but the reverse is sadly not always the case. For example, GitHub implements OAuth 2.0, but hasn't yet added OIDC support. In case you are interested in OIDC, I plan to discuss it in a follow-up article.

Ready to learn how to do OAuth with Flask in 2023? For your reference, the complete working code is in my flask-oauth-example repository on GitHub.

Let's get started!

Preparation

Before you can add social logins to your application you need to do some preparation work, which consists of registering your application with the provider(s) you intend to support, and including the provider details in the Flask configuration.

Registering with an OAuth Provider

A web application that wants to accept logins using a third-party authentication provider needs to register with the provider beforehand. Many providers call this task "creating an OAuth app" or similar names. As a result of this registration step, the provider will assign client_id and client_secret values to the application.

As part of the registration, most providers will ask for the application's website URL (it is fine to use http://localhost:5000 during development), and also for a redirect URL (also called callback URL by some providers) within that website, where users should be sent after they have made a selection in the consent screen. For the example application that I'm going to share in this article, the redirect URLs are going to have the format http://localhost:5000/callback/<provider>, where provider is a short name for the provider, for example github, google, etc.

For the example application I'm sharing with this article, I have integrated Google and GitHub. For Google, the registration page is at https://console.cloud.google.com/apis/credentials. If you've never used the Google Cloud Developer Console before, you will first need to create a project under which you'll register your application. To register your application, click "Create Credentials" and then select "OAuth client ID". Then enter the following required information:

Application type: select "Web application".
Name: enter the application name.
Authorized redirect URIs: enter the redirect URL, such as http://localhost:5000/callback/google.

In the case of GitHub, their registration page is at https://github.com/settings/developers. To register your web application you need to provide the following required information:

Application name: enter your application name.
Homepage URL: enter your application's top-level URL, such as http://localhost:5000.
Authorization callback URL: enter the redirect URL, such as http://localhost:5000/callback/github.

For other authentication providers you will need to find out what their process is for registering an OAuth application.

The authentication provider should advertise three endpoints to use during the authentication process:

Authorize endpoint: The URL where the user should be redirected to initiate a login.
Token endpoint: The URL from where the application can request an access token on behalf of the user.
User info endpoint: The URL used to request user information. This does not necessarily need to be a single endpoint; depending on the information the application wants to have about the user, more than one API call may be required.

Unfortunately finding out what these three endpoints are for a given provider requires some searching. Some providers show them directly in the page where you register your web application, while others have them in their documentation.

Each provider also defines a list of items that they can share about the user. In OAuth these are called scopes, and you will need to decide which ones you need to request. My example application only needs the user email, so for each provider I looked in their documentation for the scope that exposes emails. When the user is presented with a consent screen, the scopes that the application requested are shown, so that the user can decide if the information the application wants to access is reasonable.

The OpenID Connect protocol I briefly mentioned earlier has an optional, but widely adopted solution that allows an application to discover the endpoints and the scopes in a consistent way regardless of the provider. OAuth has tried to implement a similar discovery protocol, but for some reason the majority of providers have not implemented it yet.
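For providers that do support OpenID Connect discovery, the endpoints and scopes can be read from a JSON document published at a well-known path. Below is a minimal sketch of how that document maps to the values needed in this article. It is not part of the example application: the issuer https://provider.example and the sample document are made up for illustration, while the path and the field names are the ones defined by the OpenID Connect Discovery specification.

```python
import json

# Path defined by the OpenID Connect Discovery specification.
WELL_KNOWN_PATH = '/.well-known/openid-configuration'


def discovery_url(issuer):
    """Return the discovery document URL for an issuer."""
    return issuer.rstrip('/') + WELL_KNOWN_PATH


def endpoints_from_discovery(document):
    """Pick the values this article needs out of a discovery document."""
    return {
        'authorize_url': document['authorization_endpoint'],
        'token_url': document['token_endpoint'],
        # these two fields are recommended, not required, so they may be absent
        'userinfo_url': document.get('userinfo_endpoint'),
        'scopes': document.get('scopes_supported', []),
    }


# Hypothetical, abbreviated document, shaped like one a provider might publish.
SAMPLE_DOCUMENT = json.loads('''{
    "issuer": "https://provider.example",
    "authorization_endpoint": "https://provider.example/authorize",
    "token_endpoint": "https://provider.example/token",
    "userinfo_endpoint": "https://provider.example/userinfo",
    "scopes_supported": ["openid", "email", "profile"]
}''')

endpoints = endpoints_from_discovery(SAMPLE_DOCUMENT)
print(discovery_url('https://provider.example'))
print(endpoints)
```

In a real application the document would be downloaded from the URL returned by discovery_url() instead of being embedded in the code.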

Flask Configuration

I like to make all the information associated with authentication providers available in a dictionary that is stored in the Flask config object, so that it is readily available whenever needed. In my example application I have included support for Google and GitHub, and this is the configuration that I've used:

app.config['OAUTH2_PROVIDERS'] = {
    # Google OAuth 2.0 documentation:
    # https://developers.google.com/identity/protocols/oauth2/web-server#httprest
    'google': {
        'client_id': os.environ.get('GOOGLE_CLIENT_ID'),
        'client_secret': os.environ.get('GOOGLE_CLIENT_SECRET'),
        'authorize_url': 'https://accounts.google.com/o/oauth2/auth',
        'token_url': 'https://accounts.google.com/o/oauth2/token',
        'userinfo': {
            'url': 'https://www.googleapis.com/oauth2/v3/userinfo',
            'email': lambda json: json['email'],
        },
        'scopes': ['https://www.googleapis.com/auth/userinfo.email'],
    },

    # GitHub OAuth 2.0 documentation:
    # https://docs.github.com/en/apps/oauth-apps/building-oauth-apps/authorizing-oauth-apps
    'github': {
        'client_id': os.environ.get('GITHUB_CLIENT_ID'),
        'client_secret': os.environ.get('GITHUB_CLIENT_SECRET'),
        'authorize_url': 'https://github.com/login/oauth/authorize',
        'token_url': 'https://github.com/login/oauth/access_token',
        'userinfo': {
            'url': 'https://api.github.com/user/emails',
            'email': lambda json: json[0]['email'],
        },
        'scopes': ['user:email'],
    },
}

This may seem like a lot to take in, so let's go through the items one by one.

The OAUTH2_PROVIDERS key in the configuration is a dictionary. The keys are short names that I assigned to each supported provider. My example application works with Google and GitHub, so I have two keys in my dictionary named google and github. Each provider is given an inner dictionary with six options.

The client_id and client_secret keys configure the application identifiers assigned by the provider during registration. Since these values are of a sensitive nature, I do not include them in the code. Instead, I import them from environment variables.
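One thing to keep in mind is that os.environ.get() returns None when a variable is not set, so a forgotten variable only shows up later as a failed login. As an optional safeguard, a small check like the one sketched below could run at startup. The missing_credentials() helper is a hypothetical name of my choosing and is not part of the example application.

```python
def missing_credentials(providers):
    """Return 'provider.key' names whose credentials are unset or empty."""
    missing = []
    for name, data in providers.items():
        for key in ('client_id', 'client_secret'):
            if not data.get(key):
                missing.append(f'{name}.{key}')
    return missing


# A configuration shaped like OAUTH2_PROVIDERS, with placeholder values,
# where the GitHub secret was never exported in the environment.
example = {
    'google': {'client_id': 'abc', 'client_secret': 'def'},
    'github': {'client_id': 'ghi', 'client_secret': None},
}
print(missing_credentials(example))  # ['github.client_secret']
```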

The authorize_url and token_url keys configure the authorize and token endpoints, as defined by the provider. As mentioned above, each provider documents what these endpoints are in their own way, and sometimes it takes a bit of digging through their documentation to find them. I have included the OAuth documentation pages for Google and GitHub as comments for future reference.

For the user information endpoint I had to take a slightly more complicated approach, because the OAuth protocol does not specify the format in which the provider should return user information. The userinfo key is set to a dictionary with url and email keys. The url key defines the endpoint that returns user information, and the email key is set to a function that accepts the JSON response from the endpoint as an argument and returns the user's email address. You can see in the code above that Google returns the email directly as an email attribute, while GitHub returns a list of email addresses for the user, so in that case I pick the first address from the list. Depending on which providers you decide to work with, the logic to extract the user email will need to be adjusted.
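As an example of such an adjustment, each entry in GitHub's email list also carries primary and verified flags. If you would rather not depend on the order of the list, a variation along these lines could be used as the email function. This is only a sketch of an alternative, not what the example application does, and the addresses in the sample data are made up.

```python
def pick_github_email(emails):
    """Prefer the primary verified address, then any verified one."""
    for entry in emails:
        if entry.get('primary') and entry.get('verified'):
            return entry['email']
    for entry in emails:
        if entry.get('verified'):
            return entry['email']
    # no verified address available
    return None


# Made-up data in the shape returned by https://api.github.com/user/emails
sample = [
    {'email': 'old@example.com', 'primary': False, 'verified': True},
    {'email': 'me@example.com', 'primary': True, 'verified': True},
]
print(pick_github_email(sample))  # me@example.com
```

Note that this version can return None, so the calling code would have to treat that case as a failed login.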

The last key for each provider is scopes, which is a list of user scopes. Scope names are also not regulated by the OAuth 2.0 protocol, so each provider documents their scopes and you just need to look for them. Note how Google uses URLs as scopes, while GitHub implements a hierarchy with colon separators. Here it is important to make sure you do not request more information than what you need, so in both cases I'm just asking for the user's email address.

Authorization Redirect

All the preparations are now done, so we can look at the next piece of the solution, which is the "Login with X" buttons that kick off the login flow. Below you can see the two buttons I have defined in my example application:

<p>
  <a class="btn btn-primary" href="{{ url_for('oauth2_authorize', provider='google') }}">Login with Google</a>
  <a class="btn btn-primary" href="{{ url_for('oauth2_authorize', provider='github') }}">Login with GitHub</a>
</p>

These aren't really buttons. In this implementation I have used <a> links that are styled to look like buttons. When the user clicks one of them, they are sent to the oauth2_authorize route in the Flask application, which initiates the login. This endpoint maps to the /authorize/<provider> URL, so the provider name has to be passed as an argument, so that the route knows where the user wants to go to authenticate. The provider argument needs to match the provider names that are given as keys in the OAUTH2_PROVIDERS dictionary from the application's configuration. For an application that works with only one authentication provider this route does not need to use a provider argument.

The oauth2_authorize route is in charge of initiating the login process with the chosen provider. This has to be done according to the OAuth 2.0 protocol, by issuing a redirect to the provider's authorize URL, passing the following parameters in the query string:

client_id: the value that was assigned to the application during registration with the provider.
redirect_uri: the redirect or callback URL for the application. This URL must be an exact match with the URL that was provided during registration. In general providers do not allow users to authenticate if the redirect URL does not match the one that was registered, for security reasons.
response_type: the value code must be given, so that the provider knows the application expects an authorization code to be returned.
scope: the requested scopes, as a string with scopes separated by a space.
state: an application-specific value, often used to prevent CSRF attacks (you'll see how this works below).

Here is the complete implementation of the authorization endpoint:

@app.route('/authorize/<provider>')
def oauth2_authorize(provider):
    if not current_user.is_anonymous:
        return redirect(url_for('index'))

    provider_data = current_app.config['OAUTH2_PROVIDERS'].get(provider)
    if provider_data is None:
        abort(404)

    # generate a random string for the state parameter
    session['oauth2_state'] = secrets.token_urlsafe(16)

    # create a query string with all the OAuth2 parameters
    qs = urlencode({
        'client_id': provider_data['client_id'],
        'redirect_uri': url_for('oauth2_callback', provider=provider,
                                _external=True),
        'response_type': 'code',
        'scope': ' '.join(provider_data['scopes']),
        'state': session['oauth2_state'],
    })

    # redirect the user to the OAuth2 provider authorization URL
    return redirect(provider_data['authorize_url'] + '?' + qs)

The first if-statement ensures that the user is not logged in yet. If the user is already logged in, then they shouldn't be calling this route, so a redirect to the index page is returned.

The provider_data variable is then assigned the configuration dictionary for the chosen provider. If the requested provider is not in the configuration, then a 404 error is returned.

Next, the secrets module from the Python standard library is used to generate a random token that is stored in the user session. This is going to be the state argument that will be passed to the provider.

The qs variable is then assembled as the query string, containing all the arguments required by the OAuth protocol. The return statement at the very end returns a redirect response set to the provider's authorize URL concatenated with the query string with the OAuth parameters.
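To get a feel for what the resulting redirect looks like, the short sketch below builds an authorize URL outside of Flask and then parses it back. The client ID and state are placeholder values, and I'm requesting two GitHub scopes here only to show how the space separator gets encoded.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder values; a real client_id comes from the provider and the
# state is generated fresh for every login attempt.
params = {
    'client_id': 'my-client-id',
    'redirect_uri': 'http://localhost:5000/callback/github',
    'response_type': 'code',
    'scope': ' '.join(['read:user', 'user:email']),
    'state': 'random-state-value',
}
url = 'https://github.com/login/oauth/authorize?' + urlencode(params)
print(url)

# Parsing the URL back recovers the original values, which shows that
# urlencode() took care of escaping the colons, slashes and spaces.
decoded = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
print(decoded == params)
```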

Callback

When the browser receives the redirect response from the authorize endpoint, it will send the user to the provider's login page. Now our application loses control and the user has to work with the provider to authenticate and give consent for their information to be shared with us. When the user is done with this, the provider will issue a redirect back to us, at the URL that we set in the redirect_uri parameter of the query string, which in my example application is the oauth2_callback endpoint, also including the provider as an argument.

Below you can see the complete implementation of this endpoint. As you'll notice, this is a somewhat long endpoint that does several things. Hopefully the comments that I inserted in the code will help you identify the different tasks.

@app.route('/callback/<provider>')
def oauth2_callback(provider):
    if not current_user.is_anonymous:
        return redirect(url_for('index'))

    provider_data = current_app.config['OAUTH2_PROVIDERS'].get(provider)
    if provider_data is None:
        abort(404)

    # if there was an authentication error, flash the error messages and exit
    if 'error' in request.args:
        for k, v in request.args.items():
            if k.startswith('error'):
                flash(f'{k}: {v}')
        return redirect(url_for('index'))

    # make sure that the state parameter matches the one we created in the
    # authorization request
    if request.args['state'] != session.get('oauth2_state'):
        abort(401)

    # make sure that the authorization code is present
    if 'code' not in request.args:
        abort(401)

    # exchange the authorization code for an access token
    response = requests.post(provider_data['token_url'], data={
        'client_id': provider_data['client_id'],
        'client_secret': provider_data['client_secret'],
        'code': request.args['code'],
        'grant_type': 'authorization_code',
        'redirect_uri': url_for('oauth2_callback', provider=provider,
                                _external=True),
    }, headers={'Accept': 'application/json'})
    if response.status_code != 200:
        abort(401)
    oauth2_token = response.json().get('access_token')
    if not oauth2_token:
        abort(401)

    # use the access token to get the user's email address
    response = requests.get(provider_data['userinfo']['url'], headers={
        'Authorization': 'Bearer ' + oauth2_token,
        'Accept': 'application/json',
    })
    if response.status_code != 200:
        abort(401)
    email = provider_data['userinfo']['email'](response.json())

    # find or create the user in the database
    user = db.session.scalar(db.select(User).where(User.email == email))
    if user is None:
        user = User(email=email, username=email.split('@')[0])
        db.session.add(user)
        db.session.commit()

    # log the user in
    login_user(user)
    return redirect(url_for('index'))

This endpoint starts like the previous one, ensuring that the user is not already logged in. Next the provider_data variable is, once again, assigned the configuration for the selected provider.

If the user failed to authenticate with the provider, the redirect back to the application may not happen at all, and the user will see an error message from the provider's own website. If a redirect does happen, it is likely to include an error argument in the query string. If this argument is present, then we flash any query string arguments that start with error, so that the user can see all the information. I have noticed that some providers return query string arguments such as error, error_description, and similar ones, so all these are added to the flashed message for the user to have all the information about that error that the provider returned.
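The filtering of error arguments can be tried in isolation with the sketch below, which works on a raw query string instead of Flask's request.args. The query string is a hypothetical callback for a user who declined on the consent page; access_denied is one of the error codes defined by the OAuth 2.0 specification, while the description text is made up.

```python
from urllib.parse import parse_qs


def collect_oauth_errors(query_string):
    """Return the error-related arguments of a callback query string."""
    args = parse_qs(query_string)
    return {k: v[0] for k, v in args.items() if k.startswith('error')}


# Hypothetical callback after the user declined on the consent page.
query = 'error=access_denied&error_description=The+user+denied+access&state=abc'
errors = collect_oauth_errors(query)
for name, value in errors.items():
    print(f'{name}: {value}')
```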

Do you remember the state argument that we added to the query string of the authorize redirect? To make sure that this is a legitimate request, the provider adds the state that we provided in their callback. We stored a copy of this state in the user session, so we can now verify that the state sent by the provider matches the one we generated and saved earlier. If there is no match, that means that this callback request may have been forged by someone trying to fool us, so in that case we return a 401 response to indicate that the authentication process failed.
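The plain comparison used in the endpoint does the job. If you prefer to be extra careful, the check could also treat a missing value as a mismatch and compare in constant time, as in the sketch below. This is an optional hardening under my own assumptions, not something the example application does, and the new_state() and state_matches() names are hypothetical.

```python
import secrets


def new_state():
    """Generate an unguessable value for the state parameter."""
    return secrets.token_urlsafe(16)


def state_matches(received, expected):
    """Compare the state values, treating a missing value as a mismatch."""
    if not received or not expected:
        return False
    # compare_digest() is designed to take the same time wherever the
    # inputs differ; encoding to bytes avoids its ASCII-only rule for str
    return secrets.compare_digest(received.encode(), expected.encode())


state = new_state()
print(state_matches(state, state))         # True
print(state_matches('forged-value', state))  # False
print(state_matches(None, state))          # False
```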

A successful callback from the provider should include a code argument in the query string, with the OAuth 2.0 authorization code. If this argument isn't present, then we also fail with a 401 response, as we will not be able to authenticate the user.

At this point we have done all the checks and everything appears to be in order, so we are now ready to request an access token from the provider. This is done by sending a POST request to the provider's token endpoint, passing a number of arguments in the body of the request:

the client_id and client_secret assigned by the provider to the application. Note how the authorization redirect only included client_id, because it would be insecure to include client_secret in a redirect that is sent to the browser. Since this is a server-to-server request, it is safe to pass the secret, so that the provider knows we are who we say we are.
the code argument that the provider returned in the query string of the callback redirect.
grant_type, set to the authorization_code value, as dictated by the OAuth 2.0 protocol.
the same redirect_uri that was included in the authorization redirect.

The response from this request should have a status code of 200, and should include an access_token attribute in the body, which we requested to be given in JSON format. If a different status code is received, or if there is no access token in the body, we exit with a 401 status code.
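The handling of the token response can be isolated in a small function that is easy to test without calling a provider. The sketch below uses a hypothetical extract_access_token() helper and made-up response bodies. It also covers the case of a provider that reports a failure inside a 200 response, where the body has no access_token.

```python
import json


def extract_access_token(status_code, body):
    """Return the access token from a token response, or None on failure."""
    if status_code != 200:
        return None
    try:
        payload = json.loads(body)
    except ValueError:
        # the body was not valid JSON
        return None
    if not isinstance(payload, dict):
        return None
    return payload.get('access_token') or None


# Made-up response bodies for illustration.
ok_body = '{"access_token": "abc123", "token_type": "bearer"}'
error_body = '{"error": "bad_verification_code"}'

print(extract_access_token(200, ok_body))     # abc123
print(extract_access_token(200, error_body))  # None
print(extract_access_token(401, ok_body))     # None
```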

The access token that the provider has given us can be used to make calls into the provider's API on behalf of the user. So now we send a request to the user information endpoint, passing this access token in the Authorization header. OAuth does not include any rules regarding what the format of the response from this endpoint should be, so we take the JSON response and call the email function included in the configuration with it, expecting that the return value of this function is the user's email address. The email functions that I have included in the configuration for Google and GitHub were carefully designed after studying the responses from the respective user information endpoints. If you need to implement other OAuth providers you will need to adapt the function to work with your provider.

Now, finally and after so much work, the application knows who the user is, and can proceed with the login. The email address may already exist in the application's user database, or it may be an email that has not been seen before, meaning that this is the first time this user logs in to the application. If this is a new user, a record for them is added to the database and committed.
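The find-or-create logic is independent of the database, so it can be illustrated with a plain dictionary. In the sketch below the users dictionary is a hypothetical in-memory stand-in for the database, and the User class is a simplified dataclass rather than the model used in the example application.

```python
from dataclasses import dataclass


@dataclass
class User:
    email: str
    username: str


def find_or_create_user(users, email):
    """Return (user, created) for an email, adding the user if it is new."""
    user = users.get(email)
    if user is not None:
        return user, False
    # same rule as in the endpoint: the username is the part before the @
    user = User(email=email, username=email.split('@')[0])
    users[email] = user
    return user, True


users = {}
first, created_first = find_or_create_user(users, 'susan@example.com')
second, created_second = find_or_create_user(users, 'susan@example.com')
print(first.username, created_first, created_second)  # susan True False
```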

Now the application has a user object that is stored in the database, so it can log the user in using the login_user() function from the Flask-Login extension. And that's the end: now the user can be redirected back to the application as a logged-in user!

Running the Example Application

In the previous sections, you have seen snippets of an example application. A complete and fully working application that you can run does exist, and is available as a GitHub repository.

Follow these steps to get this application set up on your computer:

1. Register an OAuth 2.0 application with Google and GitHub. See the "Registering with an OAuth Provider" section above for guidance.
2. Clone the GitHub repository, create a Python virtualenv, and install all the dependencies:

git clone https://github.com/miguelgrinberg/flask-oauth-example
cd flask-oauth-example
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

3. Copy the .env.template file to .env and fill out your OAuth client ID and secrets from Google and GitHub.
4. Start the application:

flask run

5. On your browser, navigate to http://localhost:5000 and try the login buttons!

Conclusion

I hope this tutorial helps you with your social login implementation. I have used this solution both for Flask projects like the one I demonstrate in this article, and for Flask APIs that are driven by a JavaScript front end, though for this type of application a few minor changes were necessary. I plan to cover OAuth support for a React front end in a follow-up article, so stay tuned for that!

#1 Remy Zandwijk said 2023-07-10T16:25:19Z

Hi Miguel,

Thanks for writing this! In the “Running the Example Application” section, the link to the GitHub repository has an error: “flask-oauth-demo” should be “flask-oauth-example”.

-Remy
#2 Miguel Grinberg said 2023-07-11T18:57:35Z

@Remy: thanks, the link is now fixed.
#3 Tung Tran said 2023-09-22T15:21:16Z

I cannot say thank you enough for your writing. It will be a big help for me to understand the OAuth protocol.
#4 James Baker said 2023-10-18T10:43:21Z

Hi Miguel, Great post thank you! I've managed to get the above working with Microsoft OAuth with my Flask application. One question I have however is how to best deal with a mix of both OAuth logins and username and password logins in the database. I used your Flask Mega tutorial to build my application and currently have the ability for people to sign up using their email address and set a password. If a new user uses OAuth instead, I have no password to save for the user and when they come to login they could potentially sign in without a value. Although I could prevent this using code it seems a little messy. I had thought about removing the password login all together but then feel I may be excluding some users in the future. Maybe I am missing something simple but seems to be little online on the topic. Thanks in advance!
#5 Miguel Grinberg said 2023-10-18T11:59:56Z

@James: For a user that created their account through an OAuth provider the password field will be empty. This is actually not a problem, if they try to log via username/password they'll get a failure, no matter what they type. You can (if you want) show an error message indicating that they don't have a password and that they should try a social login instead. And if the user ever uses the reset password feature, they'll be able to set a password and login with it just fine.
#6 Raz vermont said 2023-10-22T19:35:56Z

Really nice blog as always! thanks for sharing your knowledge.
#7 Ben said 2023-12-09T17:30:43Z

Looking forward to the OIDC article.
Likewise, being able to unify registration onsite, via OIDC, and over OAuth2!
Lastly, being able to use this as an SSO for the microservice environment!
#8 Jean-Michel Tremblay said 2024-03-05T15:57:04Z

Thank you for the write up.

My setup is slightly different as I'm integrating with azure AD SSO. I was not able to commit to the session and get back my values in the call back after the redirect call. I suspect microsoft or my browser mangles with cookies in the redirect calls, or a user error on my end. My workaround was to maintain a dictionary in-lieu of automatically managed session and retrieve my data via the state key. Many ways to skin a cat :-)
#9 Khoa said 2024-03-28T07:20:58Z

Hi Miguel,
Thank you for the blog post. I'm currently implementing an OAuth flow for my application, which consists of a NextJS frontend client and a Python API server (the backend). Your way works fine for the most part, except for the part where you check for the state. I don't really know how to persist the state in my cases (NextJS - Flask as opposed to fullstack Flask). Do you have any guidance?
#10 Miguel Grinberg said 2024-03-30T00:10:58Z

@Khoa: the state is persisted in the Flask user session, which should work the same regardless of what frontend tech you use. You can see a React implementation of OAuth2 I did here: https://github.com/miguelgrinberg/react-microblog/commit/4546534caa6bcb9b966f4e427fb3270660cb4e85