In what is probably my biggest WTF with Django to date, it doesn’t
validate your models before saving them to the database. All of the
necessary code exists, and when a dev sets up her models she usually adds
the relevant validations using EmailField, URLField, blank, null,
unique, …, but unless you explicitly add code the constraints won’t be
enforced (adequately). Some things will be caught with IntegrityErrors,
but not everything and not consistently.
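To make that concrete, here’s a minimal sketch. The Employee model here is a hypothetical plain model, before any of the fixes described below:

from django.db import models

class Employee(models.Model):
    name = models.CharField(max_length=128)
    email = models.EmailField(unique=True)

e = Employee(name='Bob', email='this.is.not.an.email')
e.save()        # succeeds; EmailField is just a varchar to the database
e.full_clean()  # only an explicit call raises ValidationError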
Since the validation code is sitting there waiting to be hooked up, the
only reason I can imagine for not having it on by default is backwards
compatibility. That seems to be the reason given elsewhere. It’s a big
enough problem in my mind to deserve a breaking change in a 1.x release,
with a configuration variable to disable it and a large-print warning in
the release notes. Failing that, it at least needs to be featured very
prominently in the getting started and general documentation. If it’s
there, it’s not obvious enough that I’ve run across it, and Google
doesn’t seem to point there when searching for relevant terms either.
Oh well.
So now that I’ve told you how I feel about it, let’s get to what to do
about it. You have two basic options: a signal or a base class. Both
have advantages and disadvantages, and I’ll quickly list the ones that
come to mind as we look at the necessary code.
Pre-Save Signal
from django.db.models.signals import pre_save

def validate_model(sender, instance, raw=False, **kwargs):
    # raw is True when loading fixtures; skip validation then
    if not raw:
        instance.full_clean()

pre_save.connect(validate_model, dispatch_uid='validate_model')
Ignoring the fact that the method is called full_clean, which seems
better suited to ModelForm checking than Model enforcement, the above
code will check all models used by your app. We connect a handler to the
model pre_save signal, and on each call it will invoke full_clean unless
we’re saving in raw mode (from fixtures).
The pre_save signal will be sent for every object being saved, whether
it’s one of ours or an upstream dependency’s. That’s both the advantage
and the disadvantage of this method. If you use it from the start, all
of your code will handle ValidationErrors, and as you bring in
3rd-party apps/code you’ll be able to quickly see if it causes problems
for them. But you can run into problems.
You also shouldn’t use this method if you’re developing a shared app, as
it would cause anyone who uses that app to unexpectedly start seeing
ValidationErrors, even if it’s for their own good. You could add senders
to the connect calls for each of your models (as sketched below), but at
that point you’re better off going with the mixin below.
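If you do want the per-model variant, it might look something like this. Employee is the model from the mixin section below; Department is a hypothetical second model:

from django.db.models.signals import pre_save

# Only validate the models we own; third-party models keep their
# default, non-validating save behavior.
for model in (Employee, Department):
    pre_save.connect(validate_model, sender=model,
                     dispatch_uid='validate_%s' % model.__name__)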
In my use of the signal approach I’ve run into a problem with custom
Celery task states. Celery’s docs give examples of arbitrary task
states, but when full_clean is called on those states on their way to
their backing store, validation complains about the non-standard values.
The easiest way I could find to deal with it was to keep a list of
opted-out models. It’s not the cleanest thing in the world, but it gets
the job done.
dont_validate = {'TaskMeta'}

def validate_model(sender, instance, raw=False, **kwargs):
    cls = instance.__class__.__name__
    # Skip fixture loads (raw) and any models that have opted out
    if not raw and cls not in dont_validate:
        instance.full_clean()
Mixin With An Overridden save
from django.db import models

class ValidateOnSaveMixin(object):
    def save(self, force_insert=False, force_update=False, **kwargs):
        if not (force_insert or force_update):
            self.full_clean()
        super(ValidateOnSaveMixin, self).save(force_insert, force_update,
                                              **kwargs)

class Employee(ValidateOnSaveMixin, models.Model):
    name = models.CharField(max_length=128)
    # need to specify the max_length here or else it'll be too short for
    # some RFC-valid emails
    email = models.EmailField(max_length=254, unique=True)
Basically the same logic, but here it’s explicit which models are going
to be validated. This is essentially the opposite of the signal
approach: you don’t have to worry about other models validating
correctly, or about code working with them handling ValidationErrors,
but you do have to explicitly include ValidateOnSaveMixin in each
model’s hierarchy. You’ll also have to take a bit of care if you
override the save method in any of the classes that use the mixin, to
make sure you do things in an appropriate order and that the mixin’s
save method is called.
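As a sketch of what that care looks like, here’s a hypothetical Department model that fills in a derived field before handing off to the mixin:

from django.template.defaultfilters import slugify

class Department(ValidateOnSaveMixin, models.Model):
    name = models.CharField(max_length=128)
    slug = models.SlugField(unique=True)

    def save(self, *args, **kwargs):
        # Derive the slug *before* delegating, because the mixin's
        # save (and therefore full_clean) runs inside super().save()
        if not self.slug:
            self.slug = slugify(self.name)
        super(Department, self).save(*args, **kwargs)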
Things to Watch For
One thing to consider with either of these approaches is that you cannot
rely on pre_save signal handlers or field save methods to make objects
valid; both happen too late. In the case of the mixin, they run after
we’ve called full_clean and passed things up to super. With the pre_save
signal, fields’ save methods are called at a later point, and there’s no
assurance about the order of signal handlers, so you can’t rely on the
fixers being called before validate_model.
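For example, a hypothetical “fixer” handler like this can’t be counted on to rescue an invalid object:

from django.db.models.signals import pre_save

def fill_missing_name(sender, instance, **kwargs):
    # With the mixin, full_clean has already run by the time this
    # fires; with the signal approach there's no guarantee this
    # handler runs before validate_model.
    if not instance.name:
        instance.name = 'Unknown'

pre_save.connect(fill_missing_name, sender=Employee,
                 dispatch_uid='fill_missing_name')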
Unit Testing
I’m a fan of thorough unit testing, and this is a place where it can
come in extra handy and the tests are trivial to write. You don’t have
to test the actual validation unless you’re doing something custom; you
can hope/assume that the Django unit tests have that covered. What you
can and should check is that the validations are being invoked.
from django.test import TestCase
from django.core.exceptions import ValidationError

# assuming the Employee model above lives in myapp/models.py
from myapp.models import Employee

class EmployeeTest(TestCase):
    def test_validation(self):
        with self.assertRaises(ValidationError):
            Employee(name='Bob', email='this.is.not.an.email').save()
That’s enough of a smoke test to tell you whether or not the validation
mixin or signal is getting called. If, six months down the road, you
tweak the signal handler or change the inheritance hierarchy, you’ll
have tests in place to make sure that things are still being validated.