The Python Celery Cookbook: Small Tool, Big Possibilities
Everyone in the Python community has heard about Celery at least once, and maybe even already worked with it. Basically, it's a handy tool that helps run postponed or dedicated code in a separate process or even on a separate computer or server. This saves time and effort on many levels.
This tutorial draws on many years of development experience, so everything it says about Celery comes from real projects.
An Introduction to the Celery Python Guide
Celery reduces the load on the main application by running part of the functionality as postponed tasks, either on the same server as other tasks or on a different server. Most commonly, developers use it for sending emails. However, Celery has a lot more to offer. In this article, I'll show you some Celery basics, as well as a couple of Python-Celery best practices.
Celery Basics
If you have worked with Celery before, feel free to skip this chapter. But if Celery is new to you, here you will learn how to enable Celery in your project; there is also a separate tutorial on using Celery with Django. Basically, you need to create a Celery instance and use it to mark Python functions as tasks.
It's better to create the instance in a separate file, because Celery is started from its own entry point, much like the WSGI module in Django. For example, if you create both a Flask instance and a Celery instance in one file of a Flask application and run it, you'll have two instances but use only one. The same happens when you run the Celery worker.
Primary Python Celery Examples
As I mentioned before, the go-to use case for Celery is sending email. I will use this example to show you the basics of using Celery. Here's a quick Celery Python tutorial:
This code uses Django, as it's our main framework for web applications. By using Celery, we reduce the response time for the customer, because sending is separated from the main code responsible for returning the response.
The simplest way to execute this task is to call the delay method of the function, which is provided by the app.task decorator.
Celery offers more than this. For example, we could set up retries on failure.
Now the task will be restarted after ten minutes if sending fails. You can also set the number of retries.
Some of you may wonder why I moved the template rendering outside of the send_mail call. It's because we wrap the call of send_mail into try/except, and it's better to have as little code in try/except as possible.
Celery for Advanced Users
Celery Django Scheduled Tasks
Celery can run tasks on a schedule, much like crontab in Linux.
First of all, if you want to use periodic tasks, you have to run the Celery worker with the --beat flag (or run a separate celery beat process), otherwise Celery will ignore the scheduler. Your next step would be to create a config that says what task should be executed and when. Here's an example:
*If you don't use Django, use celery_app.conf.beat_schedule instead of CELERY_BEAT_SCHEDULE.
This configuration has only one task, which will be executed every Monday at 7 a.m. The root key is the name of the cron job, not of the task.
You can add arguments to tasks, and schedule the same task to run at different times with different arguments. The crontab method supports the syntax of the system crontab, such as crontab(minute='*/15') to run the task every 15 minutes.
Postponed Task Execution In Celery
You can also put a task in a Python Celery queue with a delay before execution (for example, when you need to send a notification some time after an action). To do this, use the apply_async method with an eta or countdown argument.
eta | execute the task at an exact time
countdown | execute the task in N seconds
In the first example, the email will be sent in 15 minutes, while in the second it will be sent at 7 a.m. on May 20. Let's look at what it might look like in code:
Setting Up Python Celery Queues
Celery can be distributed: several workers on different servers can share one message queue for task planning. You can also configure an additional queue for a task or worker. For example, if sending emails is a critical part of your system and you don't want any other tasks to affect it, you can add a new queue, let's call it mail, and use it for sending emails.
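A routing entry for this could be sketched as follows; the dotted task path is a hypothetical example, and only the queue name mail comes from the text above.

```python
# Django settings fragment: send every call of this task to the
# "mail" queue. The task path is a placeholder.
CELERY_TASK_ROUTES = {
    "myproject.mail.tasks.send_mail_task": {"queue": "mail"},
}
```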
*If you don't use Django, use celery_app.conf.task_routes instead of CELERY_TASK_ROUTES.
Run two separate Celery workers, one for the default queue and one for the new queue, by passing the queue name in the -Q argument of the celery worker command. The worker started with -Q celery serves the default queue, which is called celery, and the worker started with -Q mail serves the mail queue. You can also start the first worker without the -Q argument; it will then consume from every queue defined in the task_queues setting.
Python Celery Long Running Tasks
Sometimes, I have to deal with tasks written to go through database records and perform some operations. Quite often, developers forget about data growth, which can lead to a very long task running time. It's always better to write tasks like these in a way that allows working with data chunks. The easiest way is to add offset and limit parameters to a task. This lets you set the size of the chunk and a cursor for fetching the next chunk of data.
This is a very simple example of how a task like this can be implemented. At the end of the task, we check how many users we found in the database. If the number equals the limit, then we've probably got more users to process, so we run the task again with a new offset. If the user count is less than the limit, it means it's the last chunk and we don't have to continue. Beware, though: this implementation needs the records to come back in the same order every time.
Celery: Getting Task Results
Most developers don't record the results they get after running a task. Imagine that you can take a part of your code, assign it to a task, and execute this task independently as soon as you receive a user request. When we need the result of the task, we either get it right away (if the task is completed) or wait for it to complete. Then we include the result in the general response. Using this approach, you can decrease response time, which is very good for your users and site rank.
We use this feature to run simultaneous operations. In one of our projects, we have a lot of user data and a lot of service providers. To find the best service provider, we do heavy calculations and checks. To do it faster, we create one task per service provider for the user, run them, and collect the results to show to the user. It's very easy to do with Celery task groups.
First, why do we even run two tasks? We use the second task to form the group of calculation tasks, launch it, and return it. On top of that, the second task is where you can apply project-specific filtering, such as selecting the service providers that need to be calculated for a given user. All this can be done while Celery is doing other work. When the task group returns, the results of the first task are the calculations we are interested in.
Here's an example of how to use this approach in code:
Here, we run calculations as soon as possible, wait for the results at the end of the method, then prepare the response and send it to the user.
Python Celery Best Practices
Tiny Data
I've probably already mentioned that I use database record IDs as task arguments instead of full objects. This is a good way to decrease the message queue size. But what's more important is that by the time a task is executed, the data in the database may have changed. When you pass only IDs, the task loads fresh data, as opposed to the outdated data you get when passing objects.
Transactions
Another Celery best practice concerns transactions. Sometimes, issues may arise when an executed task can't find an object in a database. Why does this happen? In Django, for instance, you want to run tasks after a user is registered, like sending a greeting email, and your Django settings wrap all requests into a transaction. In Celery, however, tasks are executed fast, before the transaction is even finished. So if you use Celery when working in Django, you might see that the user doesn't exist in the database (yet).
To deal with this, you can Google "task transaction implementation". In general, it's a task class with an overridden apply_async method that schedules the task in the transaction.on_commit hook instead of sending it immediately.
Conclusion
As you see, Celery has a lot more uses than just sending emails. You can run different tasks simultaneously using the main process, and while you do your job, Celery will complete the smaller tasks at hand. You can set up queues, work with data chunks on long-running tasks, and set up times for your tasks to be executed. This will allow you to better plan your work progress, plan development time more efficiently, and spend your precious time working on the bigger things while Celery task groups work their magic. Contact Django Stars to discuss with the experienced Python-Django development team how to add some Celery to your project.
- Do I need Celery Python?
- Using Celery is optional and depends on your project. But it can be a handy tool for managing task queues, for example, when you work with Django and want to add background processing of long-running tasks.
- Why use Celery Python?
- Celery decreases performance load by running part of the functionality as postponed tasks. Basically, it's a handy tool that helps run postponed or dedicated code in a separate process or even on a separate computer or server. This saves time and effort on many levels.
- When to use Celery Python?
- Most commonly, developers use it for sending emails. However, Celery has a lot more to offer. It makes it possible to run periodic tasks by schedulers like crontab in Linux. You can also set tasks in a Python Celery queue with timeouts before execution. Celery helps run different tasks simultaneously using the main process, and while you do your job, Celery will complete the smaller tasks at hand. You can set up queues, work with data chunks on long-running tasks, and set up times for your tasks to be executed. This will allow you to better plan your work progress, plan development time more efficiently, and focus on the bigger things.
- Can I Use Celery Without Django?
- Yes. The use of Celery is not limited to Django projects. You can create the Celery object and use it as a generic asynchronous task queue with or without web frameworks like Django and Flask.
- What are some alternatives to Celery Python?
- Celery has long been the most popular asynchronous task queue for Python. It provides excellent opportunities for working with tasks, launching them within a certain period, and fine-tuning. However, it is not the only such tool. For Python alternatives, you can also look into the pros and cons of arq, Huey, Dramatiq, APScheduler, or RQ (Redis Queue).
- How does Celery integrate with Django?
- Since version 3.1, Celery supports Django out of the box and no longer requires a separate library to work with Django. To learn more about how to integrate Celery and Django, check out Celery's documentation.