Running tasks with Celery on Heroku guide

An example project and a basic guide showing how to run Django/Celery on Heroku.

Basic requirements

First of all, let's set up a typical Django project. I use virtualenvwrapper for this; any other way of creating a virtual environment works just as well.
$ cd dev
$ mkvirtualenv dch
(dch) $ pip install django
(dch) $ django-admin startproject djheroku
(dch) $ cd djheroku
# Make sure it is working:
(dch) $ ./manage.py runserver
From now on I will assume we are working in a terminal with this (dch) environment activated.

Heroku hosting setup

Our project needs to be set up for Heroku's Python hosting. The docs live HERE, at the time of writing this guide. Follow them to set up a basic Heroku project. I will not rewrite the official guide here, as it is good enough.

Installing celery

From here on we assume a basic Django dyno is already running on Heroku.
Now let's install Celery and add it to our requirements list (since we have only just started, we can simply overwrite requirements.txt):
$ pip install 'celery[redis]'
$ pip freeze > requirements.txt
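Let's edit settings.py to add the Celery configuration (the settings themselves are listed in the broker section below).
Also include "djcelery" (provided by the django-celery package) in the INSTALLED_APPS tuple.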

Redis broker 

Another option would be a Redis-based broker. AMQP is great, but three connections are barely enough; it's a really tight limitation. The RedisToGo addon allows for 10 connections, so we may consider using a Redis addon instead. Both the RabbitMQ and Redis brokers are considered stable and fully featured. Let's install the addon (Redis Cloud in this example) and the Python module for Redis:
$ heroku addons:add rediscloud
Adding rediscloud on happy-holliday-1467... done, v10 (free)
Use `heroku addons:docs rediscloud` to view documentation.

$ echo 'redis==2.10.3' >> requirements.txt
$ pip install redis==2.10.3
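Now we need to add the following settings to the Django project's settings.py:
import os  # skip if settings.py already imports it

# Use the Redis addon when its URL is available, otherwise fall back
# to the Django database transport (handy for local development).
BROKER_URL = os.environ.get("REDISCLOUD_URL", "django://")
# Retry forever if the broker connection is lost.
BROKER_CONNECTION_MAX_RETRIES = None

CELERY_TASK_SERIALIZER = "json"
CELERY_ACCEPT_CONTENT = ["json", "msgpack"]
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'

if BROKER_URL == "django://":
    INSTALLED_APPS += ("kombu.transport.django",)

# Keep the number of Redis connections low to fit the addon's limit.
BROKER_TRANSPORT_OPTIONS = {
    "max_connections": 2,
}
# Disable the connection pool: connections are opened and closed per use.
BROKER_POOL_LIMIT = None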
Once this is done, REDISCLOUD_URL must be set in the Heroku app settings (in the hosting control panel).
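The fallback behind BROKER_URL can be checked in isolation. The sketch below uses two hypothetical helpers (broker_url_from_env and broker_transport are not part of Celery or Django) that mirror the os.environ.get call from the settings and report which transport the resulting URL selects. The Redis URL is a made-up placeholder, not a real addon value.

```python
from urllib.parse import urlparse


def broker_url_from_env(environ, default="django://"):
    """Return the broker URL the settings would end up with."""
    return environ.get("REDISCLOUD_URL", default)


def broker_transport(url):
    """Return the URL scheme, which names the broker transport (e.g. 'redis')."""
    return urlparse(url).scheme


# Locally there is no REDISCLOUD_URL, so the Django database transport is used.
local = broker_url_from_env({})

# On Heroku the variable is present (placeholder value, for illustration only).
hosted = broker_url_from_env(
    {"REDISCLOUD_URL": "redis://:secret@redis.example.com:6379"}
)

print(broker_transport(local), broker_transport(hosted))
```

Passing a plain dict instead of os.environ keeps the check independent of the machine it runs on.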

Continue with broker setup 

Let's save our progress with a commit. Since the djcelery app has some models, we also need to apply migrations:
$ git add djheroku/settings.py requirements.txt
$ git commit -m 'Add Celery support'
[master 43afd41] Add Celery support
 2 files changed, 46 insertions(+)
$ git push heroku master
...
-----> Installing dependencies with pip
       Installing collected packages: amqp, anyjson, billiard, celery, django-celery, kombu, pytz
...
$ heroku run python manage.py migrate
...

Staying in a free tier with a single dyno

To save money at the start, we can avoid using a second dyno at all. From the Procfile we'll start a process manager that runs multiple processes for us. This setup can't scale at all (any attempt to scale would give unpredictable results), but we could easily revise it at a later time. The only issue is that, since this will be the web dyno, it will be killed ("sleeping" in Heroku terms) if no requests arrive within one hour. Since we have a scheduler, we could probably work around this limitation by sending an HTTP request to ourselves. Any process manager able to read a Procfile would do for running the Celery worker; in this tutorial I'll stick to a Python-only one, Honcho.
$ echo 'honcho==1.0.1' >> requirements.txt
$ pip install honcho==1.0.1
We'll need the workers declared in a Procfile. Then we'll swap that file for a "proxy" one:
$ git mv Procfile Procfile.real
And change the contents of Procfile.real to:
web: gunicorn djheroku.wsgi --log-file -
worker: python manage.py celery worker --loglevel=info
beat: python manage.py celery beat --loglevel=info
The original Procfile (the one Heroku actually executes) should now look like this:
web: env > .env; env PYTHONUNBUFFERED=true honcho start -f Procfile.real 2>&1
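To make it clearer what the process manager gets from Procfile.real, here is a rough sketch of reading such a file into process names and commands. It is only an illustration written for this guide, not Honcho's actual implementation; parse_procfile is a hypothetical helper, and the web line assumes the project package is called djheroku.

```python
def parse_procfile(text):
    """Map each process name in a Procfile to the command that starts it."""
    processes = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        name, separator, command = line.partition(":")
        if not separator:
            continue  # not a "name: command" line
        processes[name.strip()] = command.strip()
    return processes


PROCFILE_REAL = """\
web: gunicorn djheroku.wsgi --log-file -
worker: python manage.py celery worker --loglevel=info
beat: python manage.py celery beat --loglevel=info
"""

for name, command in parse_procfile(PROCFILE_REAL).items():
    print(name, "->", command)
```

All three entries end up running inside the single web dyno, which is why this setup cannot be scaled per process type.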
Now we should commit, push to Heroku, and tail the Heroku logs to check that everything went well:
$ heroku logs -t | cut -c34-
Another downside of this hack is messy logging. But that's the price of a "free" compromise.
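The self-request workaround mentioned earlier in this section could be wired up through the beat schedule. This is a minimal sketch of a settings fragment, assuming a task registered under the hypothetical name djheroku.tasks.ping_self that requests the app's own URL. The entry name and the 30-minute interval are also assumptions, chosen only to stay under the one-hour idle limit.

```python
from datetime import timedelta

# Settings fragment for settings.py (old-style Celery 3.x setting name).
CELERYBEAT_SCHEDULE = {
    "keep-web-dyno-awake": {
        # Hypothetical task path; point it at a task that requests your own URL.
        "task": "djheroku.tasks.ping_self",
        # Any interval shorter than the idle timeout should do.
        "schedule": timedelta(minutes=30),
    },
}
```

With the database scheduler configured in the settings, an equivalent periodic task could probably also be created from the Django admin instead.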

Celery essentials

Now that we're done with the setup, let's actually write some tasks and their management code. First of all, let's create celery.py in the project package (next to settings.py). It defines the Celery app for the project and looks as follows:
import os
from celery import Celery
from django.conf import settings


# Lets the celery command line program know where project settings are.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'djheroku.settings')

# Creates the instance of the Celery app.
app = Celery('djheroku')

# Read the Celery configuration (BROKER_URL and friends) from Django settings.
app.config_from_object('django.conf:settings')

# Set up autodiscovery of tasks in the INSTALLED_APPS.
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

if __name__ == '__main__':
    app.start()
Now we have the Celery app module needed to run Celery on the Heroku dyno. The next step is to add a sample task in a file called tasks.py inside one of the installed apps. It will be auto-collected by Celery:
from celery import shared_task


@shared_task
def echoe():
    """
    A simple task that echoes a Hello World! text to celery console.
    """
    print('Hello World!')
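A task that does real work, such as fetching a URL and returning its status code, might look like the sketch below. The name fetch_status and the injectable opener parameter are assumptions made for illustration. The function is left undecorated so it runs without Celery installed; in tasks.py it would carry the same task decorator as echoe.

```python
from urllib.request import urlopen


def fetch_status(url, opener=urlopen, timeout=10):
    """Fetch `url` and return the HTTP status code of the response.

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    function can be exercised without network access.
    """
    response = opener(url, timeout=timeout)
    try:
        return response.getcode()
    finally:
        response.close()
```

Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses instead of returning them, so a real task may want to catch that exception and return its code attribute.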

Testing stuff

We can now test this locally by running our server in one terminal:
$ ./manage.py runserver 0.0.0.0:8000
And, in a second terminal, a Celery worker with an embedded beat process for debugging purposes:
$ celery -A djheroku worker --loglevel=info --beat
Together these two terminals emulate the Heroku environment that we have just created.
Now we can trigger our sample task:
$ celery -A djheroku call echoe
This sends the task to the broker queue, and shortly afterwards we can observe its execution on the Celery worker console. Celery registers a task under its module path by default, so if plain echoe is not found, use the full name shown in the task list the worker prints at startup.
That's basically it.
Time to commit our changes and push to Heroku.

Here is a git repository: https://github.com/garmoncheg/djheroku

