Run Your Flask Regularly Scheduled Jobs with Cron
A common need of web applications is to have a periodically running task in the background. This could be a task that imports new data from third party sources, or maybe one that removes revoked tokens from your database once they have expired. In this and many other situations you are faced with the challenge of implementing a task that runs in the background at regular intervals.
This is a pattern that many people ask me about. I've seen implementations that are based on the APScheduler package, on Celery, and even homegrown solutions built inside a background thread. Sadly none of these options are very good. In this article I'm going to show you what I believe is a very robust implementation that is based on the Flask CLI and the cron service.
Implementing the Job Logic
I adhere to the "divide and conquer" principle, so when I'm implementing a scheduled job I prefer to separate the job itself from the scheduling and also from the web application. So I really view a job that runs at regular intervals as a standalone short-lived job that runs once, configured to run over and over again at the desired frequency.
When working with a Flask application, I find that the best option to implement a short-lived job is to do it as a command attached to the flask command, not only because I can consolidate all my jobs under a single command but also because a Flask command runs inside an application context, so I can use many of the same facilities I have access to in the Flask routes, the most important of all being the database.
Below you can see an example of how I would implement a job. In this case I'm using the flasky application featured in my Flask Web Development book. This application already has a few custom commands, so I added one more at the end of the flasky.py module:
import time

@app.cli.command()
def scheduled():
    """Run scheduled job."""
    print('Importing feeds...')
    time.sleep(5)
    print('Users:', str(User.query.all()))
    print('Done!')
Because this is just a demonstration, I'm not doing anything specific in this job, just a five second sleep to simulate some work being done. I have added a few print statements which would be used for logging, and I have also included a simple database query, to confirm that the Flask configured database works great inside the custom command.
Now I can see my custom command when I run flask --help:
(venv) $ flask --help
Usage: flask [OPTIONS] COMMAND [ARGS]...

  This shell command acts as general utility script for Flask applications.

  It loads the application configured (through the FLASK_APP environment
  variable) and then provides commands either provided by the application or
  Flask itself.

  The most useful commands are the "run" and "shell" command.

  Example usage:

    $ export FLASK_APP=hello.py
    $ export FLASK_DEBUG=1
    $ flask run

Options:
  --version  Show the flask version
  --help     Show this message and exit.

Commands:
  db         Perform database migrations.
  deploy     Run deployment tasks.
  profile    Start the application under the code...
  run        Runs a development server.
  scheduled  Run scheduled job.
  shell      Runs a shell in the app context.
  test       Run the unit tests.
This way of writing the job makes it easy to do testing, since I can simply run this job from the command-line as many times as I need to get it right:
(venv) $ flask scheduled
Importing feeds...
Users: [<User 'miguel'>]
Done!
Defining a Cron Job
Once the job is written and tested, it is time to implement the scheduling part. For this I find the cron service available in all Unix-based distributions more than adequate.
Each user in a Unix system has the option to set up scheduled commands that are executed by the system in a "crontab" (cron table) file. The crontab command is used to open a text editor on the user's crontab file:
$ crontab -e
It is important to run the crontab command under the user that is intended to run the scheduled job, which typically is the same user that runs the web application. This ensures the job will run with the correct permissions. I recommend you do not put your scheduled jobs on the root user, in the same way you shouldn't be running your web application as root.
The crontab -e command will start a text editor on the user's crontab file, which will initially be empty, aside from some explanatory comments.
A scheduled job is given in the crontab file as a line with six fields. The first five fields are used to set up the run schedule for the job. The sixth and last field is the command to run. You can configure multiple jobs, each with its own schedule, by writing multiple lines in the crontab file.
I find the easiest way to set up my scheduled job is to start with a default configuration that runs the command once per minute, as this allows me to test that the command runs correctly without having to wait a lot of time between runs.
To run a job once a minute put five stars separated by spaces, followed by the command to run:
* * * * * command
In my example the command I want to run is flask scheduled, but in general when you write a command in a crontab file you have to adapt the command to compensate for the differences between running the command from the terminal versus having cron run it as a service. I can think of four aspects that need to be considered:
- Current directory: If the command needs to run from a specific directory, you have to add a cd in the cron job. It may also be necessary to specify an absolute path to the command.
- Environment variables: If the command needs environment variables set, they need to be set as part of the command. My recommendation is that you use .env and/or .flaskenv files to store your variables, so that they are automatically imported by Flask when the command starts (this requires the python-dotenv package to be installed).
- Virtual environment: This one is specific to Python applications. You have to either activate the virtual environment as part of the cron command, or else execute the Python executable located inside the virtualenv directory.
- Logging: The cron service collects the output of the command and sends it to the Unix user as an email. This is almost always inconvenient, so it is best to ensure that the command generates no output by redirecting stdout and stderr to a logfile.
Here is how my flask scheduled command can be configured to run once a minute as a cron job:
* * * * * cd /home/ubuntu/flasky && venv/bin/flask scheduled >>scheduled.log 2>&1
The && is used to chain multiple commands in a single line, with the second one running only if the first succeeds. With it I can cd to the directory of my project and then execute the command. To make sure the virtual environment is activated I fish the flask command directly out of the virtualenv's bin directory. This achieves the same effect as activating the environment. For environment variables this application uses a .env file, so that works the same under cron. In terms of logging I first redirect stdout to a file with >>scheduled.log, which will cause new runs of the job to append at the end of the file. For stderr I used 2>&1, which means that I want to apply the same redirection for stderr that I configured for stdout (the "2" and the "1" reference the file handle numbers for stderr and stdout respectively).
As soon as you save and exit the text editor the scheduled job will start to run at the top of every minute, and you should see the output of each run added to the end of the scheduled.log file. If the command ends with a crash, the stack trace will be written to stderr, which we are also writing to the logfile, so you'll see the error in the log.
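For readers who find shell redirection easier to follow in Python terms, here is a rough standard-library equivalent of that cron line. It is only a sketch for illustration, not something cron needs: the run_like_cron helper and the throwaway command in the usage part are assumptions made for this example. The cwd argument plays the role of cd, opening the logfile in append mode corresponds to >>, and stderr=subprocess.STDOUT corresponds to 2>&1.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_like_cron(command, project_dir, logfile="scheduled.log"):
    """Run `command` roughly the way the cron line above does.

    Hypothetical helper, for illustration only.
    """
    log_path = Path(project_dir) / logfile
    # Mode "a" appends to the file, like `>>scheduled.log`.
    with open(log_path, "a") as log:
        result = subprocess.run(
            command,
            cwd=project_dir,           # like `cd /home/ubuntu/flasky &&`
            stdout=log,                # stdout goes to the logfile
            stderr=subprocess.STDOUT,  # like `2>&1`: stderr follows stdout
        )
    return result.returncode


if __name__ == "__main__":
    # Stand-in for `venv/bin/flask scheduled`: writes one line to each stream.
    fake_job = [
        sys.executable,
        "-c",
        "import sys; print('Done!'); print('a warning', file=sys.stderr)",
    ]
    with tempfile.TemporaryDirectory() as project_dir:
        run_like_cron(fake_job, project_dir)
        run_like_cron(fake_job, project_dir)  # the second run appends
        print((Path(project_dir) / "scheduled.log").read_text())
```

Both the regular output and the warning end up in the same file, and the second run adds to it instead of replacing it.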
Once you have the command running successfully once a minute, you can start thinking about creating a final schedule for it. The five stars represent the following time specifications in order:
- The minute, from 0 to 59 or * for every minute
- The hour, from 0 to 23 or * for every hour
- The day of the month, from 1 to 31 or * for every day
- The month, from 1 to 12 or * for every month
- The day of the week from 0 (Sunday) to 6 (Saturday) or * for every day of the week
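One way to read these five fields is as a pattern that gets compared against the current time once a minute. The small sketch below, which uses only Python's standard library, pulls the five values out of a datetime in crontab order. The cron_fields name is made up for this example. Note that Python's weekday() counts from Monday as 0, so the sketch uses isoweekday() % 7 to get the Sunday-as-0 numbering listed above.

```python
from datetime import datetime


def cron_fields(moment):
    """Return (minute, hour, day of month, month, day of week) for `moment`.

    Hypothetical helper: the values are numbered as in the list above.
    """
    # isoweekday() is 1 (Monday) .. 7 (Sunday); % 7 maps Sunday to 0.
    day_of_week = moment.isoweekday() % 7
    return (moment.minute, moment.hour, moment.day, moment.month, day_of_week)


if __name__ == "__main__":
    # 2020-06-28 was a Sunday.
    print(cron_fields(datetime(2020, 6, 28, 23, 3)))  # (3, 23, 28, 6, 0)
```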
Using stars for all fields means that we want to run the job on every minute of every hour of every day of every month, and on every day of the week. If I wanted to run the job once per hour instead of once per minute, all I need to do is set a specific minute. For example, to run at the 0th minute of every hour:
0 * * * * cd /home/ubuntu/flasky && venv/bin/flask scheduled >>scheduled.log 2>&1
If instead I wanted to run once per hour, but at the 5th minute (i.e. at 0:05, 1:05, 2:05 and so on):
5 * * * * cd /home/ubuntu/flasky && venv/bin/flask scheduled >>scheduled.log 2>&1
To run the job daily at 4:05am:
5 4 * * * cd /home/ubuntu/flasky && venv/bin/flask scheduled >>scheduled.log 2>&1
If I want to run at 4:05pm:
5 16 * * * cd /home/ubuntu/flasky && venv/bin/flask scheduled >>scheduled.log 2>&1
To run the job at 4:05am, but only on Tuesdays:
5 4 * * 2 cd /home/ubuntu/flasky && venv/bin/flask scheduled >>scheduled.log 2>&1
Instead of using a single number for each field, you can specify multiple ones separated by commas. To run the job at 4:05am on Tuesdays and Fridays:
5 4 * * 2,5 cd /home/ubuntu/flasky && venv/bin/flask scheduled >>scheduled.log 2>&1
Ranges of consecutive numbers can be given with a dash. To run the job at 4:05am only on weekdays:
5 4 * * 1-5 cd /home/ubuntu/flasky && venv/bin/flask scheduled >>scheduled.log 2>&1
When you specify a range of numbers, you can also include a step argument. The following example runs the job every 2 minutes, on the even minutes:
0-59/2 * * * * cd /home/ubuntu/flasky && venv/bin/flask scheduled >>scheduled.log 2>&1
And if I wanted to run every two minutes on the odd ones:
1-59/2 * * * * cd /home/ubuntu/flasky && venv/bin/flask scheduled >>scheduled.log 2>&1
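The list, range and step syntax from these examples can be checked with a few lines of Python. This is a simplified sketch and not cron's real parser: expand_field is a hypothetical helper that handles only *, comma lists, dash ranges and /step, with no support for names such as MON or JAN and no input validation.

```python
def expand_field(field, low, high):
    """Expand one crontab field into the sorted list of values it matches.

    `low` and `high` are the limits of the field, e.g. 0 and 59 for minutes.
    """
    values = set()
    for part in field.split(","):            # "2,5" -> "2" and "5"
        base, _, step = part.partition("/")  # "1-59/2" -> "1-59" and "2"
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start, end = (int(n) for n in base.split("-"))
        else:
            start = end = int(base)
        values.update(range(start, end + 1, int(step) if step else 1))
    return sorted(values)


if __name__ == "__main__":
    print(expand_field("2,5", 0, 6))          # [2, 5]
    print(expand_field("1-5", 0, 6))          # [1, 2, 3, 4, 5]
    print(expand_field("0-59/2", 0, 59)[:5])  # [0, 2, 4, 6, 8]
    print(expand_field("1-59/2", 0, 59)[:5])  # [1, 3, 5, 7, 9]
```

A job is due when the current minute, hour, day, month and weekday each appear in the expansion of the corresponding field.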
I hope by now you get how this works. If you want to practice different cron schedules, the crontab.guru site is great, as it translates a given specification into words to make it more clear.
Once you configure your desired interval your cron job will run at the scheduled times. To review how it is working, you may want to check the logfile. With print statements like the ones in the example command, it will look something like this:
Importing feeds...
Users: [<User 'miguel'>]
Done!
Importing feeds...
Users: [<User 'miguel'>]
Done!
Importing feeds...
Users: [<User 'miguel'>]
Done!
The problem is that the output of the command is always the same, so you get this repetitive stream. If at any time there was an error, or output that was unexpected, you will not know when that happened. If you wanted to know how long your runs are taking, you cannot know either.
To add a little bit more context into this logfile, timestamps can be added to each line.
from datetime import datetime
import time

@app.cli.command()
def scheduled():
    """Run scheduled job."""
    print(str(datetime.utcnow()), 'Importing feeds...')
    time.sleep(5)
    print(str(datetime.utcnow()), 'Users:', str(User.query.all()))
    print(str(datetime.utcnow()), 'Done!')
With this change, now your logfile will show the time each line was printed:
2020-06-28 23:03:25.597371 Importing feeds...
2020-06-28 23:03:30.599382 Users: [<User 'miguel'>]
2020-06-28 23:03:30.621601 Done!
If your job outputs more than a handful of lines in each run, you should use the logging module from Python to create a more robust logfile.
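As a rough idea of what that could look like, the sketch below swaps the print() calls for Python's logging module. The logger name, the format string and the run_job function are assumptions made for this example, and the job body is a stand-in with no Flask or database code. The handler writes to stdout, so the redirection in the cron line still decides where the output ends up.

```python
import logging
import sys
import time


def configure_logging(stream=sys.stdout):
    """Send timestamped log lines to `stream` (stdout by default)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger("scheduled")  # hypothetical logger name
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def run_job(logger, work=lambda: time.sleep(0.1)):
    """Stand-in for the body of the scheduled command."""
    logger.info("Importing feeds...")
    try:
        work()
    except Exception:
        # logger.exception() records the message plus the stack trace.
        logger.exception("Job failed")
        raise
    logger.info("Done!")


if __name__ == "__main__":
    run_job(configure_logging())
```

Each line now carries a timestamp and a level without any changes to the individual calls, and a failed run leaves its stack trace next to the time it happened.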
I hope this tutorial gave you a clear idea of how to implement regularly scheduled background jobs in your Flask application. In the introduction I mentioned that using Python-based solutions is a bad idea. In case you want to know why, here are some problems:
- If your background job runs in the context of your Python process with APScheduler or a similar package, when you scale your Flask application to more than one worker you'll have multiple background jobs as well.
- If you run your background job in a homegrown thread-based solution, you'll have to have very robust error handling in place. If not, whenever the background thread crashes your jobs will stop running.
Unlike most of these Python implementations, using cron requires no additional dependencies. If you deploy on a Linux machine, you always have cron available to you.
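The thread problem described above is easy to reproduce. The sketch below is a deliberately naive homegrown scheduler, not code from any of the packages mentioned, and all of its names are made up for the example. The background thread stops for good the first time the job raises an exception, while the rest of the program keeps going without noticing.

```python
import threading
import time

runs = []


def job():
    """Pretend job that fails on its third run."""
    runs.append(len(runs) + 1)
    if len(runs) == 3:
        raise RuntimeError("feed server unreachable")


def naive_scheduler(interval, max_runs):
    """Call job() every `interval` seconds, with no error handling."""
    for _ in range(max_runs):
        job()  # an exception here ends the loop and the thread
        time.sleep(interval)


if __name__ == "__main__":
    thread = threading.Thread(target=naive_scheduler, args=(0.01, 10))
    thread.start()
    thread.join()
    # The traceback goes to stderr and the remaining runs never happen.
    print("runs started before the thread died:", len(runs))
```

With cron, a crashed run only affects that run: the next one starts from scratch at its scheduled time.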
I hope I convinced you, but if you have a method of running background jobs that you like better than cron and would like to tell me about it let me know below in the comments!
#26 Miguel Grinberg said 2020-12-01T10:20:43Z
@Roark: No. This is not a topic covered in any of my books.
#27 drdd said 2021-01-14T15:44:10Z
Dude, you're amazing. That's so much easier than setting up python advanced scheduler. Thank you!
#28 Mike said 2021-01-16T13:01:27Z
@Miguel: Thank you for the tutorial, it's so informative. Could you please comment on any options/workarounds for a Cloud Foundry container? Thus, I could manually run my task/job in SSH by "tmp/lifecycle/shell", then "export FLASK_APP=my_app", finally "flask my-task". But crontab file wasn't run as far as there is no cron service in CF. Thank you!
#29 Miguel Grinberg said 2021-01-16T23:54:10Z
@Mike: I'm not familiar enough with CF to comment on this. I suppose you can use a base image that has cron, I know that works for Docker. Or else use any scheduling options provided to you by the CF platform.
#30 Kolade said 2021-01-31T10:06:00Z
Hi Miguel, Thank you so much for your Mega Tutorial, it's been so helpful although I am wondering why I can't use Celery's crontab option instead of a linux cronjob. What are the complications as my current Flask API is using celery. Thank you
#31 Miguel Grinberg said 2021-01-31T23:10:02Z
@Kolade: You can use any method that you like. Because I show how to use A you shouldn't assume that I'm telling you not to use B. I prefer cron jobs. If you already have Celery running and you are happy with it I don't see why you shouldn't use it for your scheduled jobs.
#32 Abhishek Kumar said 2021-05-10T19:58:40Z
I am not using a virtual environment. How would I execute the flask CLI command from crontab?
#33 Miguel Grinberg said 2021-05-10T22:53:17Z
@Abhishek: if you prefer not to use a virtualenv, then you need to figure out where your flask command is installed, and use the full path in the crontab command.
#34 Dannel said 2021-08-24T23:48:43Z
@Miguel I am using Windows for development and Heroku for deployment. What do you recommend in my case?
#35 Miguel Grinberg said 2021-08-25T13:21:44Z
@Dannel: Heroku has a scheduler extension. On Windows I would use cron under the WSL.
#36 Adel said 2022-03-14T15:24:58Z
@Miguel, thank you very much for such a great tutorial! Did I understand right that every invocation of a flask CLI command (in this case, our custom command) will create the app context and then destroy it after the command finishes? So, if we invoke a background task every minute, and we have a connection to a database in that command (or in the app context), we will be establishing and destroying connections to the database every minute. Is not it more efficient to simply have a daemon running in the background (say, with a simple infinite loop) that will keep 1 open connection to the database and just fire commands when needed?
#37 Miguel Grinberg said 2022-03-14T17:12:28Z
@Adel: It really depends. A connection per minute is really small, so I don't see it as something that needs to be optimized. But if you feel the need to do it, then sure, make it a daemon, and let SQLAlchemy pool the connection. You won't be able to use cron though, you'll have to use your own scheduler or use a third party one from the Python ecosystem.
#38 Adel said 2022-03-16T17:46:57Z
@Miguel, I got you, thank you very much once again!
#39 Adel said 2022-04-01T20:10:20Z
Dear @Miguel, I have another question to you - is it possible to access an active flask socket.io connection inside the croned background task? So that, the updates can be pushed immediately to the ones who are connected at the moment.
#40 Miguel Grinberg said 2022-04-01T22:36:10Z
@Adel: Yes. See the "emitting from an external process" section of the documentation.
#41 Jonas Hansen said 2022-04-30T10:02:25Z
If you do a custom command that updates data for your webapp how do you then tell the flask app to refresh the website to include the new data?
#42 Miguel Grinberg said 2022-04-30T12:25:37Z
@Jonas: Flask does not have the ability to update the page, unless you implement something like WebSocket. In general it is the client code running in the browser that requests periodic updates by refreshing the page.
#43 Filipe Galo said 2022-08-18T16:50:11Z
I've been reading your blog since the start of this crazy pandemic and 100% you taught me a lot. Every time I want to do something with Python or Flask I come here.
This post about how to run scheduled jobs is so clean and simple. Instead of installing APScheduler and worrying about the application context.
Thanks a lot for everything.
#44 Osama Abbas said 2023-04-26T01:29:56Z
First, thanks a lot for the tutorial.
I was wondering if I am able to use your method in updating a token. Let me briefly explain what I mean.
I have a Flask Application that uses a token from a provider with client_id and client_secret (obtained from the provider) to be able to use some API endpoints. I am using Request-OAuth for this. The bearer access token I receive is then used with subsequent requests. But this token expires in 1 hour (3600 seconds).
My Flask application doesn't have any login system to log the user out when token is expired. Thus, my workaround is that when the webserver is started with home page visited a token is obtained. A get_token function runs, and the token is then saved in Flask's session
session["access_token"] = access_token
session.modified = True
However, after the first hour, the application needs to obtain a new access token. Currently, I redirect the user to the home page to get the access token renewed where the get_token function runs. But I feel like it is silly that the user has to be redirected home to get the token renewed.
Does your method fit my scenario?
#45 Miguel Grinberg said 2023-04-26T09:55:50Z
@Osama: what do you mean here by "my method"? Nothing I show in this article is really my invention, I'm showing how to configure cron jobs, which are a feature of UNIX operating systems. I don't understand the relationship you made between cron jobs, which run at given times, and renewing tokens when they expire. Wouldn't it make more sense to renew the token directly in the app when it expires?
#46 Anthony Udeagbala said 2023-05-25T15:59:34Z
This works perfectly.
I have been trying to automate the process using Docker but I can't seem to do this. I don't want to manually edit the cron file because I would deploy the application to Digital Ocean.
How can Docker handle this?
#47 Miguel Grinberg said 2023-05-26T09:18:40Z
@Anthony: I would imagine you need to install the cron daemon in your image if you want to run cron jobs inside your container.