On occasion, you’ll need to run a piece of code on each and every request that Django handles. This code might need to modify the request before the view handles it, or maybe log information about the request for debugging purposes, etc.
Django’s middleware framework is essentially a set of hooks into Django’s request/response processing. It’s a light, low-level “plugin” system for globally altering Django’s input and/or output.
Each middleware component is responsible for doing some specific function. If you’re reading this book linearly (sorry, postmodernists), you’ll have already seen middleware a number of times.
This chapter dives deeper into exactly what middleware is and how it works, and explains how you can write your own middleware.
Middleware is actually incredibly simple. A middleware component is simply a Python class that conforms to a certain API (duck typing strikes again!). Before diving into the formal aspects of what that API is, let’s look at a very simple example.
High-traffic sites often need to deploy Django behind a load-balancing proxy (see Chapter 21). This can cause a few small complications, one of which is that every request’s remote IP (request.META["REMOTE_ADDR"]) will be that of the load balancer, not the actual IP making the request. Load balancers deal with this by setting a special header, X-Forwarded-For, to the actual requesting IP address.
So here’s a small bit of middleware that lets sites running behind a proxy still see the correct IP address in request.META["REMOTE_ADDR"]:
    class SetRemoteAddrFromForwardedFor(object):
        def process_request(self, request):
            try:
                real_ip = request.META['HTTP_X_FORWARDED_FOR']
            except KeyError:
                pass
            else:
                # HTTP_X_FORWARDED_FOR can be a comma-separated list of IPs.
                # Take just the first one.
                real_ip = real_ip.split(",")[0]
                request.META['REMOTE_ADDR'] = real_ip
If this is installed (see below), every request’s X-Forwarded-For value will be automatically inserted into request.META['REMOTE_ADDR']. Simple, isn’t it?
In fact, this is a common enough need that this piece of middleware is a built-in part of Django; it lives in django.middleware.http, and you can read a bit more about it below.
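To see the effect without running a server, the sketch below repeats the class and calls its hook by hand. FakeRequest is a hypothetical stand-in that carries only a META dictionary; it is not Django’s HttpRequest, and the IP addresses are made-up documentation addresses.

```python
class SetRemoteAddrFromForwardedFor(object):
    def process_request(self, request):
        try:
            real_ip = request.META['HTTP_X_FORWARDED_FOR']
        except KeyError:
            pass
        else:
            # Take the first address in the comma-separated list.
            real_ip = real_ip.split(",")[0]
            request.META['REMOTE_ADDR'] = real_ip


class FakeRequest(object):
    """Hypothetical stand-in for HttpRequest: only META is modeled."""
    def __init__(self, meta):
        self.META = dict(meta)


# A request that arrived through the load balancer...
proxied = FakeRequest({
    'REMOTE_ADDR': '10.0.0.1',
    'HTTP_X_FORWARDED_FOR': '203.0.113.7, 10.0.0.1',
})
# ...and one that carries no X-Forwarded-For header at all.
direct = FakeRequest({'REMOTE_ADDR': '198.51.100.4'})

middleware = SetRemoteAddrFromForwardedFor()
middleware.process_request(proxied)
middleware.process_request(direct)

print(proxied.META['REMOTE_ADDR'])  # 203.0.113.7
print(direct.META['REMOTE_ADDR'])   # 198.51.100.4
```

The request without the header is left untouched, which is exactly what the except KeyError branch is for.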
The linear readers in the crowd are probably old hands at this already; many of the examples in the previous few chapters will only work if you’ve already figured out how to enable middleware. However, for completeness — and for the benefit of Julio Cortázar fans who’ve torn all the pages out of this book, shuffled them, and are now reading them in random order — let’s break it down.
To activate a middleware component, add it to the MIDDLEWARE_CLASSES list in your settings module. In MIDDLEWARE_CLASSES, each middleware component is represented by a string: the full Python path to the middleware’s class name. For example, here’s the default MIDDLEWARE_CLASSES created by django-admin.py startproject:
    MIDDLEWARE_CLASSES = (
        'django.middleware.common.CommonMiddleware',
        'django.contrib.sessions.middleware.SessionMiddleware',
        'django.contrib.auth.middleware.AuthenticationMiddleware',
        'django.middleware.doc.XViewMiddleware',
    )
A Django installation doesn’t require any middleware (MIDDLEWARE_CLASSES can be empty, if you’d like), but it’s strongly suggested that you use CommonMiddleware.
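As an illustration, enabling the proxy middleware from the opening example would mean adding its dotted path to the setting. Putting it first is an assumption made for this sketch, so that everything after it sees the corrected address; your own stack may call for a different position.

```python
# settings.py (fragment)
MIDDLEWARE_CLASSES = (
    # Assumed placement: first, so later middleware and views see the real IP.
    'django.middleware.http.SetRemoteAddrFromForwardedFor',
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.middleware.doc.XViewMiddleware',
)
```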
The order is significant. On the request and view phases, Django applies middleware in the order given in MIDDLEWARE_CLASSES, and on the response and exception phases, Django applies middleware in reverse order. That is, Django treats MIDDLEWARE_CLASSES as a sort of “wrapper” around the view function: on the request, it walks down the list to the view, and on the response it walks back up.
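The “wrapper” behavior is easy to see in a toy model. The handle() function below is a hypothetical dispatcher written for this sketch, not Django’s actual handler, and the Recorder classes exist only to note when each hook runs.

```python
LOG = []


class Recorder(object):
    """Hypothetical base class: notes each hook call in LOG."""
    name = 'Recorder'

    def process_request(self, request):
        LOG.append(self.name + '.process_request')
        return None

    def process_response(self, request, response):
        LOG.append(self.name + '.process_response')
        return response


class First(Recorder):
    name = 'First'


class Second(Recorder):
    name = 'Second'


def view(request):
    LOG.append('view')
    return 'response'


def handle(middleware_classes, view, request):
    """Toy dispatcher: walk down the list to the view, then back up."""
    stack = [cls() for cls in middleware_classes]
    for middleware in stack:
        middleware.process_request(request)
    response = view(request)
    for middleware in reversed(stack):
        response = middleware.process_response(request, response)
    return response


handle((First, Second), view, request=None)
for entry in LOG:
    print(entry)
# First.process_request
# Second.process_request
# view
# Second.process_response
# First.process_response
```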
Now that we know what middleware is and how to install it, let’s take a look at all the available methods that middleware classes may define.
If a middleware class defines an initializer (that is, an __init__ method), it should take no arguments beyond the standard self.
For performance reasons, middleware classes are only instantiated once in long-running server processes; this means that you can’t count on __init__ getting called every time a request runs, only once at server startup.
Middleware classes may also use initialization time to remove themselves from being installed. If an initializer raises django.core.exceptions.MiddlewareNotUsed, Django will remove that piece of middleware from the middleware stack. You might use this to check for some piece of software that the middleware class depends on, or whether the server is running in debug mode, or any other sort of environmental situation that might make you want to disable the middleware.
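Here is a minimal sketch of a middleware class that opts out when a dependency is missing. The module name hypothetical_profiler is invented for the example, the fallback exception class exists only so the sketch runs without Django installed, and load() is a toy loader rather than Django’s own.

```python
try:
    from django.core.exceptions import MiddlewareNotUsed
except ImportError:
    # Stand-in so the sketch runs without Django installed.
    class MiddlewareNotUsed(Exception):
        pass

try:
    import hypothetical_profiler  # made-up optional dependency
except ImportError:
    hypothetical_profiler = None


class ProfilingMiddleware(object):
    def __init__(self):
        # Runs once, when the server process sets up its middleware.
        if hypothetical_profiler is None:
            raise MiddlewareNotUsed('profiler is not installed')

    def process_request(self, request):
        return None


class AlwaysOn(object):
    def process_request(self, request):
        return None


def load(middleware_classes):
    """Toy loader: instantiate each class once, dropping any that opt out."""
    active = []
    for cls in middleware_classes:
        try:
            active.append(cls())
        except MiddlewareNotUsed:
            continue
    return active


active = load([ProfilingMiddleware, AlwaysOn])
print([type(m).__name__ for m in active])
```

Because the check lives in __init__, it costs nothing per request: the class is either in the stack for the life of the process or it is not.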
process_request(self, request) is called as soon as the request has been received, and before the URL has been resolved to determine which view to run. It’s passed the HttpRequest object, which you may modify at will.
process_request() should return either None or an HttpResponse object. If it returns None, Django will continue processing this request, executing any other middleware and then the appropriate view.
If a request middleware returns an HttpResponse object, Django won’t bother calling any other middleware (of any type) or the appropriate view; it’ll return that HttpResponse.
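As a sketch of the short-circuit behavior, the middleware below turns away requests from a block list. The BANNED set and its address are hypothetical, and the tiny HttpResponse and FakeRequest classes are stand-ins so the example runs on its own; in a real project you would import HttpResponse from django.http.

```python
class HttpResponse(object):
    """Minimal stand-in for django.http.HttpResponse."""
    def __init__(self, content='', status=200):
        self.content = content
        self.status_code = status


class FakeRequest(object):
    """Hypothetical stand-in for HttpRequest: only META is modeled."""
    def __init__(self, remote_addr):
        self.META = {'REMOTE_ADDR': remote_addr}


class BlockBannedIPs(object):
    # Hypothetical block list; a real project might read it from settings.
    BANNED = frozenset(['192.0.2.66'])

    def process_request(self, request):
        if request.META.get('REMOTE_ADDR') in self.BANNED:
            # Returning a response ends processing: the view never runs.
            return HttpResponse('Forbidden', status=403)
        # Returning None lets Django carry on as usual.
        return None


middleware = BlockBannedIPs()
blocked = middleware.process_request(FakeRequest('192.0.2.66'))
allowed = middleware.process_request(FakeRequest('198.51.100.4'))

print(blocked.status_code)  # 403
print(allowed)              # None
```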
process_view(self, request, view, args, kwargs) is called after the request middleware has run, and after the URL has been resolved into a view, but before that view has actually been called.
The arguments passed to process_view() are:
| Argument | Explanation |
|---|---|
| request | The HttpRequest object. |
| view | The Python function that Django will call to handle this request. This is the actual function object itself, not the name of the function as a string. |
| args | The list of positional arguments that will be passed to the view, not including the request argument (which is always the first argument to a view). |
| kwargs | The dictionary of keyword arguments that will be passed to the view. |
Just like process_request(), process_view() should return either None or an HttpResponse object. If it returns None, Django will continue processing this request, executing any other view middleware and then the appropriate view.
If any view middleware returns an HttpResponse object, Django won’t bother calling any other middleware or the appropriate view; it’ll return that response.
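For instance, view middleware can note which view is about to run. In the sketch below, the attribute names view_name, view_args, and view_kwargs are invented for the example, and FakeRequest and article_detail are hypothetical stand-ins.

```python
class ViewNameMiddleware(object):
    def process_view(self, request, view, args, kwargs):
        # view is the function object itself, so its name is available.
        request.view_name = '%s.%s' % (view.__module__, view.__name__)
        request.view_args = tuple(args)
        request.view_kwargs = dict(kwargs)
        # None means: go ahead and call the view.
        return None


class FakeRequest(object):
    """Hypothetical stand-in for HttpRequest."""
    pass


def article_detail(request, year, slug):
    return 'article body'


request = FakeRequest()
result = ViewNameMiddleware().process_view(
    request, article_detail, ('2008',), {'slug': 'middleware'})

print(request.view_name)
print(request.view_args, request.view_kwargs)
```

Note that args does not include the request itself, matching the table above.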
process_response(self, request, response) is called after the view function has been called and the response has been generated. This is where middleware can modify the output of a response; output compression (see below) is one obvious use for response middleware.
The parameters should be pretty self-explanatory — request is the request object, and response is the response object returned from the view.
Unlike the request and view middleware methods which may return None, process_response() must return an HttpResponse object. That response could be the original one passed into the function (possibly modified), or a brand new one.
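A small sketch of response middleware: it stamps each response with the time spent handling the request. The header name X-Elapsed-Seconds and the _start_time attribute are made up for the example, and FakeRequest and FakeResponse are stand-ins that model only what the sketch needs.

```python
import time


class TimingMiddleware(object):
    def process_request(self, request):
        # Store per-request state on the request, not on self: the
        # middleware instance is shared by every request in the process.
        request._start_time = time.time()
        return None

    def process_response(self, request, response):
        start = getattr(request, '_start_time', None)
        if start is not None:
            response['X-Elapsed-Seconds'] = '%.4f' % (time.time() - start)
        # Always hand back a response object, never None.
        return response


class FakeRequest(object):
    """Hypothetical stand-in for HttpRequest."""
    pass


class FakeResponse(object):
    """Hypothetical stand-in for HttpResponse with dict-style headers."""
    def __init__(self, content=''):
        self.content = content
        self.headers = {}

    def __setitem__(self, name, value):
        self.headers[name] = value

    def __getitem__(self, name):
        return self.headers[name]


middleware = TimingMiddleware()
request = FakeRequest()
response = FakeResponse('hello')

middleware.process_request(request)
returned = middleware.process_response(request, response)
print(returned['X-Elapsed-Seconds'])
```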
process_exception(self, request, exception) is called only if something goes wrong and a view raises an uncaught exception, not including Http404 exceptions. You can use this hook to send error notifications, dump post-mortem information to a log, or even try to recover from the error automatically.
The parameters to this function are the same request object we’ve been dealing with all along, and exception, which is the actual Exception object raised by the view function.
process_exception() may return an HttpResponse which will be used as the response shown to the browser, or it may return None to continue with Django’s built-in exception handling.
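A sketch of the logging case: the middleware records the exception and returns None so that Django’s built-in handling still runs. The logger name example.middleware is hypothetical, and FakeRequest and ListHandler exist only so the example runs and shows its output without a server.

```python
import logging

logger = logging.getLogger('example.middleware')  # hypothetical name


class LogExceptionsMiddleware(object):
    def process_exception(self, request, exception):
        logger.error('Unhandled %s while serving %s: %s',
                     type(exception).__name__,
                     getattr(request, 'path', '?'),
                     exception)
        # None hands control back to the built-in exception handling.
        return None


class FakeRequest(object):
    """Hypothetical stand-in for HttpRequest: only path is modeled."""
    def __init__(self, path):
        self.path = path


class ListHandler(logging.Handler):
    """Collects formatted messages in a list so they can be inspected."""
    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)

request = FakeRequest('/articles/2008/')
try:
    raise ValueError('bad year')
except ValueError as exc:
    result = LogExceptionsMiddleware().process_exception(request, exc)

print(handler.messages[0])
# Unhandled ValueError while serving /articles/2008/: bad year
```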
Django ships with a number of middleware classes — discussed below — that make good examples; reading the code for them should give you a good feel for the power of middleware.
You can also find a number of community-contributed examples on Django’s wiki: http://code.djangoproject.com/wiki/ContributedMiddleware.
Django comes with some built-in middleware to deal with common problems.
Middleware class: django.contrib.auth.middleware.AuthenticationMiddleware
Enables authentication support. Technically, this middleware adds the request.user attribute, representing the currently-logged-in user, to every incoming HttpRequest object.
See Chapter 15 for the complete details.
Middleware class: django.middleware.common.CommonMiddleware
Adds a few conveniences for perfectionists:
Forbids access to user agents in the DISALLOWED_USER_AGENTS setting, which should be a list of strings.
Performs URL rewriting based on the APPEND_SLASH and PREPEND_WWW settings. If APPEND_SLASH is True, URLs that lack a trailing slash will be redirected to the same URL with a trailing slash, unless the last component in the path contains a period. So foo.com/bar is redirected to foo.com/bar/, but foo.com/bar/file.txt is passed through unchanged.
If PREPEND_WWW is True, URLs that lack a leading “www.” will be redirected to the same URL with a leading “www.”
Both of these options are meant to normalize URLs. The philosophy is that each URL should exist in one, and only one, place. Technically a URL foo.com/bar is distinct from foo.com/bar/ — a search-engine indexer would treat them as separate URLs — so it’s best practice to normalize URLs.
Handles ETags based on the USE_ETAGS setting. If USE_ETAGS is set to True, Django will calculate an ETag for each request by MD5-hashing the page content, and it’ll take care of sending Not Modified responses, if appropriate.
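The APPEND_SLASH rule in the list above can be written as a small predicate. This is a sketch of the rule as the text states it, not CommonMiddleware’s actual code, and the function name is invented for the example.

```python
def needs_trailing_slash(path):
    """Return True if the APPEND_SLASH rule, as described, would redirect."""
    if path.endswith('/'):
        return False
    # Only the last path component matters: a period there suggests a file.
    last_component = path.split('/')[-1]
    return '.' not in last_component


print(needs_trailing_slash('/bar'))           # True
print(needs_trailing_slash('/bar/'))          # False
print(needs_trailing_slash('/bar/file.txt'))  # False
```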
Middleware class: django.middleware.gzip.GZipMiddleware
If enabled, this middleware will automatically compress content for browsers that understand gzip compression (all modern browsers).
This can greatly reduce the amount of bandwidth a web server consumes at the expense of processing time. We usually prefer speed over bandwidth, but if you’d like to take the opposite side of this trade-off, just enable this middleware.
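To get a feel for the bandwidth side of the trade-off, you can compress some repetitive markup with Python’s standard gzip module. The sample page is invented, and the sketch calls the gzip module directly rather than going through the middleware.

```python
import gzip

# Repetitive markup, typical of HTML pages, tends to compress well.
page = ('<li class="entry">Hello, middleware!</li>\n' * 200).encode('utf-8')
compressed = gzip.compress(page)

print(len(page), len(compressed))

# Decompressing gives back exactly the original bytes.
restored = gzip.decompress(compressed)
```

The exact sizes depend on the content; already-compressed data such as images gains little or nothing.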
Middleware class: django.middleware.http.ConditionalGetMiddleware
If enabled, provides support for conditional GET operations. If the response has an ETag or Last-Modified header, and the request has If-None-Match or If-Modified-Since, the response is replaced by a 304 (“Not modified”) response.
Also removes the content from any response to a HEAD request and sets the Date and Content-Length response headers for all requests.
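The conditional GET decision can be sketched with plain dictionaries standing in for the request and response headers. This is a simplification that assumes exact string matches; HTTP itself allows lists of ETags and date comparisons, which the sketch ignores. The function name is invented for the example.

```python
def is_not_modified(request_headers, response_headers):
    """Return True if a 304 response could replace the full response."""
    etag = response_headers.get('ETag')
    if etag is not None and request_headers.get('If-None-Match') == etag:
        return True
    last_modified = response_headers.get('Last-Modified')
    if (last_modified is not None
            and request_headers.get('If-Modified-Since') == last_modified):
        return True
    return False


response_headers = {
    'ETag': '"abc123"',
    'Last-Modified': 'Sat, 01 Mar 2008 12:00:00 GMT',
}

print(is_not_modified({'If-None-Match': '"abc123"'}, response_headers))  # True
print(is_not_modified({'If-None-Match': '"stale"'}, response_headers))   # False
print(is_not_modified({}, response_headers))                             # False
```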
Middleware class: django.middleware.http.SetRemoteAddrFromForwardedFor
This is the example we looked at above. It sets request.META['REMOTE_ADDR'] based on request.META['HTTP_X_FORWARDED_FOR'], if the latter is set. This is useful if you’re sitting behind a reverse proxy that causes each request’s REMOTE_ADDR to be set to 127.0.0.1.
Danger, Will Robinson!
This does not validate HTTP_X_FORWARDED_FOR.
If you’re not behind a reverse proxy that sets HTTP_X_FORWARDED_FOR automatically, do not use this middleware. Anybody can spoof the value of HTTP_X_FORWARDED_FOR, and because this sets REMOTE_ADDR based on HTTP_X_FORWARDED_FOR, that means anybody can fake their IP address.
Only use this middleware when you can absolutely trust the value of HTTP_X_FORWARDED_FOR.
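One way to narrow the risk, sketched below, is to honor the header only when the connection itself comes from a proxy you control. This is not the built-in middleware: TRUSTED_PROXIES is a hypothetical allow list rather than a Django setting, FakeRequest is a stand-in, and the sketch assumes your proxy overwrites any X-Forwarded-For value supplied by the client instead of appending to it.

```python
# Hypothetical allow list of proxy addresses; not a Django setting.
TRUSTED_PROXIES = frozenset(['127.0.0.1'])


class SetRemoteAddrFromTrustedProxy(object):
    def process_request(self, request):
        if request.META.get('REMOTE_ADDR') not in TRUSTED_PROXIES:
            # Direct connection: ignore the header entirely.
            return None
        forwarded = request.META.get('HTTP_X_FORWARDED_FOR')
        if forwarded:
            request.META['REMOTE_ADDR'] = forwarded.split(',')[0].strip()
        return None


class FakeRequest(object):
    """Hypothetical stand-in for HttpRequest: only META is modeled."""
    def __init__(self, meta):
        self.META = dict(meta)


via_proxy = FakeRequest({
    'REMOTE_ADDR': '127.0.0.1',
    'HTTP_X_FORWARDED_FOR': '203.0.113.7',
})
spoofed = FakeRequest({
    'REMOTE_ADDR': '198.51.100.4',
    'HTTP_X_FORWARDED_FOR': '192.0.2.1',
})

middleware = SetRemoteAddrFromTrustedProxy()
middleware.process_request(via_proxy)
middleware.process_request(spoofed)

print(via_proxy.META['REMOTE_ADDR'])  # 203.0.113.7
print(spoofed.META['REMOTE_ADDR'])    # 198.51.100.4
```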
Middleware class: django.contrib.sessions.middleware.SessionMiddleware
Enables session support; see Chapter 15 for details.
Middleware class: django.middleware.cache.CacheMiddleware
If this is enabled, each Django-powered page will be cached. This is discussed in detail in Chapter 14.
Middleware class: django.middleware.transaction.TransactionMiddleware
Binds a database COMMIT or ROLLBACK to the request/response phase. If a view function runs successfully, a COMMIT is done. If it fails with an exception, a ROLLBACK is done.
The order of this middleware in the stack is important: middleware modules running outside of it run with commit-on-save, the default Django behavior. Middleware modules running inside it (coming later in the stack) will be under the same transaction control as the view functions.
See XXX for more information about database transactions.
Middleware class: django.middleware.doc.XViewMiddleware
Sends custom X-View HTTP headers to HEAD requests that come from IP addresses defined in the INTERNAL_IPS setting. This is used by Django’s automatic documentation system.