Using Postgres JSONB Fields in Django


I remember the first time I built user preferences into an app. At first, users just needed to be able to opt in or out of our weekly emails. "No big deal," I thought, "I'll just add a new field on the Users table." For a while, that was fine. A few weeks later, my boss asked me if we could let users opt into push notifications. Fine, that's just one more column on the database. Can't hurt, right?

You probably see where this is going.

Within months, my user table had 40 columns, and while Postgres can handle it, it gets pretty tricky for new devs to keep up with all of them. You can imagine it looked pretty similar to this settings screen of Quora.

Quora's notification preferences contain dozens of options

Fortunately, Postgres provides rich support for JSON fields, which can be very handy in situations like mine. Both JSON data types (json and jsonb) allow you to store entire objects or lists directly in your database. This means that you can store any number of user preferences in one column.

JSON support in Postgres gives you the flexibility of a document store database like Mongo with the speed and structure of a relational database. JSON support is powerful, but because it comes in two types (json and jsonb), it's helpful to understand which is the right choice for your application. The json data type was added in Postgres 9.2 and enhanced in 9.3. This new data type allowed you to store JSON directly in your database and even query it. The problem was that json data was stored as a special kind of text field, so it was slow to query.

Postgres introduced jsonb in 9.4 to combat this issue. Unlike json fields, jsonb fields are stored in a binary structure rather than text strings. While this means that writes are slightly slower, querying from jsonb fields is significantly faster. It also allows you to index jsonb fields. This makes jsonb the preferred format for most JSON data stored in Postgres, and the typical choice for Django applications.

Querying JSONB data in Postgres

The query syntax for accessing JSON in Postgres is not typical SQL. You have to use the specific JSON operators and functions, so queries on JSON data look different from other Postgres queries.

For example, if you stored the following data in a Postgres table called profiles:

idnamepreferences
1Mike{"sms": false, "daily_email": true, "weekly_email": true}
2Lucy{"sms": true, "daily_email": false, "weekly_email": false}
3Harriet{"sms": true, "daily_email": true, "weekly_email": true}

And you wanted to query all users who have opted into your daily_email, you'd write a query like this:

select * from profiles
where (preferences->>'daily_email')::boolean = true;

Which would give you back the rows for Mike and Harriet. I'm pretty good with SQL, but using JSON operators always slows me down. Fortunately, Django offers support for JSONB fields, so you don't have to become an expert at querying JSON in Postgres.

Django support for JSONB

Since Django 1.9, the popular Python framework has supported jsonb and several other Postgres-specific fields. Native Django support means that creating jsonb fields, using them in your models, inserting data into them, and querying from them are all possible with Django's ORM. Let's take a look at how you can get started using jsonb with Django.

Creating JSONB fields using migrations

Django's Postgres module comes with several field classes that you can import and add to your models. If you want to use a JSON field, import the JSONField class and use it for your model's property. In this example, we'll call the field preferences:

from django.db import models
from django.contrib.postgres.fields import JSONField class Profile(models.Model): name = models.CharField(max_length=200) preferences = JSONField() def __str__(self): return self.name

Django only supports the jsonb column type, so when you run your migrations, Django will create a table definition like this:

create table app_profile
( id serial not null constraint app_profile_pkey primary key, name varchar(200) not null, preferences jsonb not null
);

Adding data to JSONB fields

Because JSON fields don't enforce a particular schema, Django will convert any valid Python data type (dictionary, list, string, number, boolean) into the appropriate JSON. For example, if you want to add a new row to the app_profile table created above, you can run the following in your Django application:

from app.models import Profile p = Profile(name="Tanner", preferences={'sms': False, 'daily_email': True, 'weekly_email': True})
p.save()

This will create a new user named Tanner who will receive our daily_email and weekly_email, but no sms messages.

Querying JSONB fields

Django uses the double underscore pattern from field lookups to query JSON object keys. For example, if you want to get all the Profiles for users who have opted into our daily_email, you'd use the following code:

results = Profile.objects.filter(preferences__daily_email=True)

If you want to check the SQL query that Django runs, you can print it from the query property on the results object:

print(results.query) SELECT "app_profile"."id", "app_profile"."name", "app_profile"."preferences" FROM "app_profile" WHERE ("app_profile"."preferences" -> daily_email) = 'true'

As you can see, the query is slightly different from the one I manually wrote above (I cast daily_email to a boolean), but it accomplishes the same thing. You can also filter records based on the keys they contain. For example, if some user accounts were created before you added the sms option, you might want to find them and let the users know about the new option. You can use the isnull field lookup on the sms key in your JSON data:

results = Profile.objects.filter(preferences__sms__isnull=True)

There are many other ways to filter queries using JSON fields, so be sure to check out the official Django docs for more.

Limitations of JSONB fields

It's worth noting that jsonb fields come with some drawbacks. I've already mentioned that it takes slightly longer to write data to jsonb fields than json because the JSON string must be converted to binary, but there are other reasons to avoid jsonb fields.

First, if your data needs to enforce a strict schema, JSON may not be an ideal choice. While you can use check constraints to enforce the use of specific fields, this isn't natively supported in Django, so you'll need to write your own migrations to accomplish this.

A better way to address this shortcoming is by writing Django validation rules to enforce the structure you want. If you don't want to write the validation rules yourself, there's a popular package called jsonschema that I'd recommend.

Another drawback to using JSON fields is handling changes to the shape of your data. If you want to add a new column to a database table in Postgres using Django, you simply update your model and run a migration. If you want to add a new field to a JSON column, it isn't quite as straightforward.

A pattern I've used before is to create a custom migration that loops through the affected records and updates each one individually. This naive method works for relatively small datasets, but it might not be a good idea if you need to update 1 million profiles in a production database. In that case, it might be better to write your code to handle the existence or absence of the key or run a batch update on the JSON object.

Conclusion

While JSON data types come with some drawbacks, they are useful in situations where you need more flexibility in your data structure. Thanks to Django's native support for jsonb, you can get started using JSON data in your web applications without learning all the native Postgres query operators.

Next time you need more flexibility in your data model and want to benefit from the strengths of Postgres give jsonb fields a try.

Share this article: If you liked this article you might want to tweet it to your peers.

Karl Hughes is a technology team leader and software engineer. He is currently the founder of Draft, where he helps create technical content for engineering blogs.