What's New in SQLAlchemy 2.0?
Posted byon under
You may have heard that a major version of SQLAlchemy, version 2.0, has been released in January 2023. Or maybe you missed the announcement and this is news to you. Either way, I thought you'd be curious to know what's new in it, if it is worth upgrading and how difficult the upgrade is.
As with previous software reviews, this is going to be an opinionated overview. I have been using the SQLAlchemy ORM in web projects for a long time, so in this article I will discuss the features that affect my own work, both in positive or negative ways. If instead you are interested to see a list of every change that went into this new release, then the official change log is the place to go.
For The Impatient
If you just want to see how a real-world project written for SQLAlchemy 2.0 looks like and don't want to read through the analysis that follows, you can head over to my microblog-api project, which I just upgraded to use the latest SQLAlchemy features.
No More Query Object
The first change I'm going to discuss is the new query interface. To be exact, this feature has been introduced in SQLAlchemy 1.4 releases as a way to help developers transition to 2.0, so you may have already seen it.
The now "legacy" way to issue queries in the SQLAlchemy ORM consisted in using a
Query object, available from the
Session.query() method, or for those using the Flask-SQLAlchemy extension for Flask, also as
Model.query. Here are some examples:
# using native SQLAlchemy user = session.query(User).filter(User.username == 'susan').first() # using Flask-SQLAlchemy user = User.query.filter(User.username == 'susan').first()
In release 2.0 this is considered the "old" way of executing queries. You can still run queries in this way, but the documentation refers to this interface as the "1.x Query API" or the "legacy Query API".
The new Query API has a very clear separation between the queries themselves and the execution environment in which they run. The above query to look for a user by the
username attribute can now be written as follows:
query = select(User).where(User.username == 'susan')
This example stores the query in the
query variable. At this point the query has not executed yet, and is not even associated with a session.
To execute the query, it can be passed to the
execute() method of a session object:
results = session.execute(query)
The returned value from the
execute() call is a
Result object, which functions as an iterable that returns
Row objects with a similar interface to a named tuple. If you prefer to get the results without iterating over them, there are a few methods that can be called on this object, some of which are:
all()to return a list with a row object for each result row.
first()to return the first result row.
one()to return the first result row, and raise an exception if there are no results or more than one result.
one_or_none()to return the first result row,
Noneif there are no results, or raise an exception if there are more than one result.
Working with results as tuples makes sense when each result row can contain multiple values, but when there is a single value per row it can be tedious to extract the values out of single-element tuples. The session has two additional execution methods that make working with single-value rows more convenient to use:
ScalarResultobject with the first value of each result row. The methods of
Resultlisted above are also available on this new result object.
scalar()return the first value of the first result row.
The example legacy query shown above can be executed as follows using the new interface:
user = session.scalar(query)
If you are used to the old query object it can take some time to become familiar with the new way to issue queries, but I personally find the separation between creating and executing queries a great improvement. Once again keep in mind that the legacy queries can still be used, so you can transition gradually to the new style.
Session Context Managers
Something I'm very excited to see is the introduction of context managers for sessions, which were also included in release 1.4 to help developers with a migration towards 2.0. I have always implemented my own session context managers and even explained how to do it in a video from a few years ago.
In 1.3 and older releases, the scoped session was the main pattern for working with sessions. The Flask-SQLAlchemy extension, for example, embraced it with the
db.session variable, which is its signature. To be honest, I have never liked scoped sessions because they tie a session to a thread, which is completely arbitrary. In many cases the life of a session is much shorter than that of a thread, so you have to resort to manual management to get things to work in the intended way.
Now a session can be instantiated with a context manager, so there is a clear start and end. Here is an example:
with Session() as session: session.add(user) session.commit()
Here the session is closed when the context manager block ends. And if an error occurs inside it, the session is rolled back.
A variant of this pattern can be used to have a session that automatically commits at the end, while still rolling back on errors:
with Session() as session: with session.begin(): session.add(user)
Another interesting change introduced in release 2.0 is the option to use typing hints to declare columns and relationships in models. Consider the following definition for a
class User(Model): __tablename__ = 'users' id = Column(Integer, primary_key=True) username = Column(String(64), index=True, unique=True, nullable=False) email = Column(String(120), index=True, unique=True, nullable=False) password_hash = Column(String(128)) about_me = Column(String(140))
In 2.0, the column type can be defined with the
Mapped typing hint. If there are any additional options, they can be given in a
import sqlalchemy as sa import sqlalchemy.orm as so class User(Model): __tablename__ = 'users' id: so.Mapped[int] = so.mapped_column(primary_key=True) username: so.Mapped[str] = so.mapped_column(sa.String(64), index=True, unique=True) email: so.Mapped[str] = so.mapped_column(sa.String(120), index=True, unique=True) password_hash: so.Mapped[Optional[str]] = so.mapped_column(sa.String(128)) about_me: so.Mapped[Optional[str]] = so.mapped_column(sa.String(140))
Relationships are typed in the same way. Here is an example:
class Post(Model): # ... author: so.Mapped['User'] = so.relationship(back_populates='posts')
Note how the type can be given as a string, which is sometimes necessary to avoid forward references.
Overall this isn't very different, but using typing hints can provide some benefits:
- If you use an IDE that runs static analysis on your code and provides suggestions as you type, having a strongly typed model will help your IDE understand your code better.
- A related feature introduced in 2.0 is an integration between models and dataclasses, which also relies on the typing hints.
- One little benefit that I noticed is that there are less symbols that need to be imported from SQLAlchemy. For columns that are numbers, dates, times, or even UUIDs, you can now use the typing hint to define them, without having to import the corresponding type classes from SQLAlchemy (sadly the
String()class is still needed because for many databases a maximum length must be given).
As with the queries, the old way of defining columns and relationships continues to be supported.
If you followed my Flask tutorials, you know that I have always recommended the
lazy='dynamic' option for relationships that can be large, as this makes it possible to add pagination, sorting and filtering before SQLAlchemy goes and retrieves the related objects.
Unfortunately dynamic relationships are also considered legacy in SQLAlchemy 2.0, as they are incompatible with the new query interface. Instead, the recommended solution is a new relationship type called "Write-Only". Here is how to define a write-only relationship:
class User(Model): # ... tokens: so.WriteOnlyMapped['Token'] = so.relationship(back_populates='user')
Why the name "write-only"? The difference with the old dynamic relationship is that the write-only relationship does not load (or read) the related objects, but provides
remove() methods to make changes (or write) to it.
How do you get the related objects then? The relationship exposes a
select() method that returns a query that you can execute on the session, possibly after adding filters, sorting or pagination. This is less automatic and definitely less magical than the dynamic relationship, but it supports the same use cases.
Here is an example of how to get the
tokens objects associated with a given user, sorted by their expiration date:
tokens = session.scalars(user.tokens.select().order_by(Token.expiration)).all()
If I have to be honest, it took me a while to accept that the dynamic relationships are a thing of the past, because I really really like them. But I do understand that their design is incompatible with the new query interface. I struggled to build an application with SQLAlchemy 1.4 that did not use dynamic relationships, and complains from myself and others on the SQLAlchemy discussions board is what resulted in the addition of the write-only relationship in 2.0.
Version 1.4 introduced a beta version of the asyncio extension, which provides async versions of the
Session objects with awaitable methods. In release 2.0 this extension is not considered beta anymore.
The most important thing to keep in mind with regards to the asyncio support in SQLAlchemy 2.0 is that many of the best features of SQLAlchemy are possible because often database instructions are issued implicitly, for example as a result of the application accessing an attribute of a model instance. When using the asynchronous paradigm, implicit I/O as a result of an attribute being accessed is not possible because all database activity must happen while the application issues an
await function call, as this is what makes concurrency possible.
A big part of setting up an asynchronous solution with SQLAlchemy involves preventing all the ways in which normally there would be implicit database actions, so you will need to have at least a basic understanding of how SQLAlchemy works under the hood, and be prepared to receive obscure error messages if you missed anything. The asyncio extension documentation explains the issues and provides guidance on what needs to be done, so getting a working solution is definitely possible, but it will not be as flexible as what you get under regular Python.
I hope this was a useful review of what I consider the most important changes in SQLAlchemy 2.0. The microblog-api repository contains a complete, non-trivial API project based on Flask, with database support provided by SQLAlchemy 2.0 and my Alchemical extension. Feel free to try it out if you'd like to see the new SQLAlchemy features in action!
Learn SQLAlchemy 2 with my New Book!
That's right, I am working on a new book that will teach you SQLAlchemy 2 by building an actual database project from the very beginning and in small incremental steps. The expected release date of the book is April 30th, 2023. If you would like to secure your copy, I have made it available to pre-order from the Kindle store with a 30% introductory discount.
Become a Patron!
Hello, and thank you for visiting my blog! If you enjoyed this article, please consider supporting my work on this blog on Patreon!
#1 Fips said 2023-02-12T18:45:23Z
Brilliant overview, thank you Miguel! I bet I'll be referring back to this page if/when I decide to upgrade.
#2 Abdur-Rahmaan Janhangeer said 2023-02-17T04:30:14Z
SQLAlchemy 2.0 is the SQLAlchemy i always wanted. Result class and decoupling of queries is just really awesome as these are production gems.
#3 ernstl said 2023-02-18T20:59:13Z
thx for the nice and quick overview + tipps. Yesterday was 2.0.4 release.
Lazy feature back :)
The Session.refresh() method will now immediately load a relationship-bound attribute that is explicitly named within the Session.refresh.attribute_names collection even if it is currently linked to the “select” loader, which normally is a “lazy” loader that does not fire off during a refresh. The “lazy loader” strategy will now detect that the operation is specifically a user-initiated Session.refresh()...
#4 Miguel Grinberg said 2023-02-19T17:07:12Z
@ernstl: I think you have a misunderstanding. The "lazy" feature has never been removed. The 2.0.4 release has a minor improvement in the Session.refresh() method, that's all there is to it. There is nothing important that this changes.
#5 Andrew said 2023-03-01T16:50:26Z
What is the "so" in this line?
email: so.Mapped[str] = so.mapped_column(String(120), index=True, unique=True)
#6 Miguel Grinberg said 2023-03-01T18:53:20Z
@Andrew: Sorry, I should have included those in the example. I have updated it now, so is an abbreviation for sqlalchemy.orm and sa is an abbreviation for sqlalchemy:
import sqlalchemy as sa import sqlalchemy.orm as so
#7 CJ said 2023-03-03T18:53:53Z
Thank you for this post explaining the SQLAlchemy 2.0. I was having a lot of difficulty reading through the documentation while also trying to follow along to your amazing tutorial.
I have some questions that I would really appreciate your help with:
I noticed that in the models.py, you imported sqlalchemy as sa, but you don't seem to be using this. Is there a reason for importing sqlalchey in this module?
I tried using "results = session.execute(query)" in the Flask Shell, but I get a "NameError: name 'session' is not defined". It works if I use db.session.execute(query) instead. How were you able to just use session instead of db.session?
Where did you import the sqlalchemy's select module? I tried to import it in the init, but even then, when I start a flask session from CMD and try to query, I still get a NameError if I don't import the select module from sqlalchemy.
Thanks so much!
#8 Miguel Grinberg said 2023-03-04T00:50:30Z
@CJ: I think you are assuming the snippets of code I'm showing in this blog post are complete examples that you can run. You are also assuming I'm using Flask-SQLAlchemy, but this post discusses SQLAlchemy in general, not in a Flask context. Answers:
- There is no models.py file shown in this article, just a snippet of one. I'm not sure which file you are looking at, but I'm almost certain it is not related to this article.
db.sessiononly makes sense if you are using Flask-SQLAlchemy.
selectis a function that you import from SQLAlchemy. Or with Flask-SQLAlchemy you have it as
#9 Salvatore Fusto said 2023-03-10T08:25:26Z
Hi Miguel, thanks and congrats for your brilliant post. i' d know if your book on sqlachemy is, or wil be, available in PDF format other than Kindle
#10 Miguel Grinberg said 2023-03-10T10:13:24Z
@Salvatore: It will be available both as an e-book and paper book on Amazon. I'm not currently planning to sell it through other channels.
#11 Salvatore Fusto said 2023-03-10T10:13:41Z
hello, i just descovered this interesting blog so thanks for it. i've a little question: in your opinion qhat is the best in developing a Flask app, Flask + SQLalchemy, or Flask + Flask-SQLalchemy?
thanks and regards
#12 Miguel Grinberg said 2023-03-10T10:16:29Z
@Salvatore: The difference is not very significant. Flask-SQLAlchemy makes somethings a bit more convenient, but not in a major way.
#13 Sjoerd said 2023-03-23T06:31:51Z
I'm working on a application that is loosely based (the boilerplate) on the mega-tutorial. I'm developing it at home and use just "flask run" to fire-up the application. At work I'm use gunicorn. Works great but recently the application does not "see" database mutation - I need to restart gunicorn. I added some dirty print statement and changing and orm object though a form works fine.. After the db.session.commit() I query...first() that object and seems oke but when I move away from the page and get back to it old (eg. cached?) data is shown.. Again.. when I fire it up with just "flask run" there are no issues.
I have the feeling is has something to do with the gunicorn starting workers.. Would moving to 2.0 fix this issue (which I do not want at this point to be fair) or do I need to change something.
#14 Miguel Grinberg said 2023-03-23T10:54:31Z
@Sjoerd: I doubt a change to SQLAlchemy 2 will change anything for you. My guess is that this is an issue with your deployment, not with SQLAlchemy.
#15 George Bill said 2023-04-04T12:55:20Z
This is a wonderful post. Thanks for the update.
#16 Hendy Xu said 2023-05-16T11:22:36Z
That's great. This tutorial has been very helpful to me.