I started a Django project that lets other services interact with it over an API. Of course, one of the best solutions for building an API in Python is Django REST framework: a great project with a large community that was even funded on Kickstarter. How cool is that?
Project
Among other things, my project/service offers access to and creation of companies and subscriptions. Each company can have multiple subscriptions, so we have a one-to-many relation. I quickly created the models Company and Subscription.
    import uuid

    from django.db import models


    class Company(models.Model):
        # The id is supplied by the service that creates the company.
        id = models.UUIDField(primary_key=True, editable=True)
        name = models.CharField(max_length=255)
        created_at = models.DateTimeField(auto_now_add=True)

        def __unicode__(self):
            return self.name


    class Subscription(models.Model):
        # The id is generated randomly when the subscription is created.
        id = models.UUIDField(primary_key=True, editable=False, default=uuid.uuid4)
        company = models.ForeignKey(Company)
        price = models.DecimalField(decimal_places=2, max_digits=10)
One thing to notice here is that I use UUIDs. The reason lies in the fact that some other services also hold company data. Those services create the companies, since they already have all the required data (id, name). This way I’m able to resolve sync problems.
For the Subscription model, the UUID is generated randomly (uuid.uuid4).
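The two ways of obtaining an id described above can be illustrated with Python's standard uuid module alone. This is only a minimal sketch and not part of the project; the helper names parse_client_id and new_subscription_id are made up for illustration.

```python
import uuid


def parse_client_id(raw):
    """Validate an id supplied by another service (hypothetical helper).

    uuid.UUID raises ValueError when the string is not a well-formed UUID.
    """
    return uuid.UUID(raw)


def new_subscription_id():
    """Generate a random (version 4) id, like default=uuid.uuid4 does."""
    return uuid.uuid4()


# A company id arrives from outside; a subscription id is created locally.
company_id = parse_client_id("6230fbeb-bffd-4e37-b0e8-c545f4a83a61")
subscription_id = new_subscription_id()
```

Because the company id comes from the caller, every service refers to the same company by the same value, while subscription ids never need to be coordinated.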
Django REST framework
Django REST framework has great docs. If you follow the quickstart, you can set up a working API in a few minutes.
In my case, I created the serializers:
    from rest_framework import serializers

    from models import Company, Subscription


    class CompanySerializer(serializers.ModelSerializer):
        id = serializers.UUIDField(required=True, read_only=False)

        class Meta:
            model = Company
            fields = ('id', 'name', 'created_at')


    class SubscriptionSerializer(serializers.ModelSerializer):
        company = CompanySerializer()

        class Meta:
            model = Subscription
            fields = ('id', 'company', 'price')
I had to define an explicit id field on the company serializer. By default, ids are read-only (normally they are generated at the database level), but in my case I pass the id when creating the company.
I also created viewsets.py.
    from rest_framework import viewsets

    from models import Company, Subscription
    # Assumes the serializers shown earlier live in serializers.py.
    from serializers import CompanySerializer, SubscriptionSerializer


    class SubscriptionViewSet(viewsets.ModelViewSet):
        serializer_class = SubscriptionSerializer
        queryset = Subscription.objects.all()


    class CompanyViewSet(viewsets.ModelViewSet):
        serializer_class = CompanySerializer
        queryset = Company.objects.all()
As the last step, you have to add the viewsets to the API router.
    from django.conf.urls import include, url
    from django.contrib import admin
    from rest_framework import routers

    from viewsets import CompanyViewSet, SubscriptionViewSet

    router = routers.DefaultRouter()
    router.register(r'companies', CompanyViewSet)
    router.register(r'subscriptions', SubscriptionViewSet)

    urlpatterns = [
        url(r'^admin/', include(admin.site.urls)),
        url(r'^api/', include(router.urls)),
    ]
Now when you access /api/companies/ or /api/subscriptions/ you should get a response (for now, probably just an empty array).
This part is very simple, and there are tons of examples of how to do it.
Problem
To create a company, I execute a POST JSON request (I’m using Postman) to /api/companies/ with the following payload.
    {
        "id": "6230fbeb-bffd-4e37-b0e8-c545f4a83a61",
        "name": "My Test Company"
    }
and I get back
    {
        "id": "6230fbeb-bffd-4e37-b0e8-c545f4a83a61",
        "name": "My Test Company",
        "created_at": "2015-09-22T10:56:58.876908Z"
    }
Now I have a company in the database. Let’s create a subscription. Again, I execute a POST JSON request to /api/subscriptions/ with the payload
    {
        "price": 80.0,
        "company": {
            "id": "6230fbeb-bffd-4e37-b0e8-c545f4a83a61"
        }
    }
and I get an error that company name is required. What?
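For anyone without Postman, the two requests above can also be built with Python's standard library. This is a rough sketch: BASE_URL (a local development server) and the build_post helper are assumptions for illustration, and the requests are only constructed here, not sent.

```python
import json
import urllib.request

# Assumption: the API runs on a local development server.
BASE_URL = "http://localhost:8000"

COMPANY_PAYLOAD = {
    "id": "6230fbeb-bffd-4e37-b0e8-c545f4a83a61",
    "name": "My Test Company",
}
SUBSCRIPTION_PAYLOAD = {
    "price": 80.0,
    "company": {"id": "6230fbeb-bffd-4e37-b0e8-c545f4a83a61"},
}


def build_post(path, payload):
    """Build (but do not send) a JSON POST request (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


company_request = build_post("/api/companies/", COMPANY_PAYLOAD)
subscription_request = build_post("/api/subscriptions/", SUBSCRIPTION_PAYLOAD)

# To actually send one of them against a running server:
# urllib.request.urlopen(subscription_request)
```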
Same request and response
Before I explain what the previous error means and how I solved it, I first have to explain what I want.
Other services that talk to my service use different HTTP clients. One of them is Netflix Feign. With it you can easily create HTTP clients that map requests and responses to DTOs. For example, they have a SubscriptionDTO defined as
    import java.util.Date;
    import java.util.UUID;

    public class SubscriptionDTO {
        public CompanyDTO company;
        public Double price;
    }
and CompanyDTO
    import java.util.UUID;

    public class CompanyDTO {
        public UUID id;
        public String name;
    }
So the same DTO is used for request and response. I want to pass the same DTO with all the required data when creating the subscription. When the response is returned, it populates the existing SubscriptionDTO. This is important, because I want to eliminate the confusion that comes from using multiple DTOs for the same entity (Subscription).
Process of identifying the problem
Back to the previous error. When I retrieve subscriptions, I also want the company information to be included in the subscription list.
    {
        "id": "1ffc2a43-6ca6-4ba7-a551-adfccee86427",
        "price": 0.0,
        "company": {
            "id": "6230fbeb-bffd-4e37-b0e8-c545f4a83a61",
            "name": "My Test Company"
        }
    }
I accomplished this by defining
    company = CompanySerializer()
in my SubscriptionSerializer. If I didn’t use this, the response would have the format
    {
        "id": "1ffc2a43-6ca6-4ba7-a551-adfccee86427",
        "price": 0.0,
        "company": "6230fbeb-bffd-4e37-b0e8-c545f4a83a61"
    }
But I don’t want this; I want the full output. When I defined the company field, I didn’t pass any arguments. By default this means that when I execute the POST, it tries to create the subscription and all its relations (the company). That is why I got the error that the company name is required: it wanted to create a new company, but the name was missing. I don’t want this either.
I checked online and asked a few people. Most of them suggested that I pass the read_only=True argument when I define the company field: company = CompanySerializer(read_only=True). Now when I executed the POST, I got an error that the subscription.company_id database field should not be null. Why? Once you define read_only for a field, its data is not passed to the method that creates the model (subscription).
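Both failures can be reproduced with a toy model in plain Python. The sketch below is not DRF's implementation; toy_validate is a hypothetical stand-in that only mimics the two behaviours described above: a writable nested field validates every required company field, while a read-only field is dropped from the input entirely.

```python
def toy_validate(payload, company_read_only):
    """Toy stand-in for nested validation; NOT DRF's real implementation."""
    validated = {"price": payload["price"]}

    if company_read_only:
        # Read-only fields are ignored on input, so "company" never
        # reaches the data that is used to create the subscription.
        return validated

    # A writable nested field validates the company as if creating it.
    company = payload.get("company", {})
    missing = [name for name in ("id", "name") if name not in company]
    if missing:
        raise ValueError(
            {"company": {name: "This field is required." for name in missing}}
        )

    validated["company"] = company
    return validated


payload = {
    "price": 80.0,
    "company": {"id": "6230fbeb-bffd-4e37-b0e8-c545f4a83a61"},
}

# read_only=True: the company silently disappears from the validated data.
read_only_result = toy_validate(payload, company_read_only=True)

# read_only=False: validation fails because the name is missing.
try:
    toy_validate(payload, company_read_only=False)
    writable_error = None
except ValueError as exc:
    writable_error = exc.args[0]
```

Neither mode gives what I am after: one rejects the payload, the other throws the company reference away.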
There are many discussions around how to solve this.
a) https://groups.google.com/forum/#!topic/django-rest-framework/5twgbh427uQ
b) http://stackoverflow.com/questions/29950956/drf-simple-foreign-key-assignment-with-nested-serializers
c) http://stackoverflow.com/questions/22616973/django-rest-framework-use-different-serializers-in-the-same-modelviewset
Some suggest different serializers, others suggest using two fields (one for reads and another for create/update). But all of these seem hackish and add a lot of extra code. Tom Christie, the author of DRF, suggested that I define the CompanySerializer fields (except id) as read-only. This kind of solved the problem. However, if the company gets additional fields, I need to override them as well, which means extra code. At the same time, I want to keep the /api/companies/ endpoint for creating/updating companies. If I set the fields as read-only, I wouldn’t be able to create companies without an additional CompanySerializer.
I tried to override the subscription create methods, but without success. If I defined read_only=True when creating the company field, no company information was passed to validated_data (the data that is later used to create the subscription). If I defined read_only=False, I always got the “name is required” error.
I wanted a simple and working solution.
Solution
I started to look for a solution that was simple and let me make the requests I wanted. Digging through the code, I noticed many methods for field creation that I could override. In the end, I had to modify the validation method.
    from rest_framework import serializers
    from rest_framework.fields import empty


    class RelationModelSerializer(serializers.ModelSerializer):
        def __init__(self, instance=None, data=empty, **kwargs):
            # Remove our custom flag before the parent sees the kwargs.
            self.is_relation = kwargs.pop('is_relation', False)
            super(RelationModelSerializer, self).__init__(instance, data, **kwargs)

        def validate_empty_values(self, data):
            if self.is_relation:
                model = self.Meta.model
                model_pk = model._meta.pk.name

                if not isinstance(data, dict):
                    error_message = self.default_error_messages['invalid'].format(
                        datatype=type(data).__name__)
                    raise serializers.ValidationError(error_message)

                if model_pk not in data:
                    raise serializers.ValidationError({model_pk: model_pk + ' is required'})

                try:
                    instance = model.objects.get(pk=data[model_pk])
                except Exception:
                    # Covers a missing record as well as a malformed key.
                    raise serializers.ValidationError({model_pk: model_pk + ' is not valid'})

                # Returning True makes DRF skip further validation and use
                # the existing instance as the validated value.
                return True, instance

            return super(RelationModelSerializer, self).validate_empty_values(data)
I overrode validate_empty_values, which is where I check the relation. The idea is to inspect the posted data: if the id (or primary key) of the related model is present, I check that a record exists for that id and return it. If it doesn’t exist or the data is invalid, I raise an error.
There is also an is_relation argument that you have to pass when creating the serializer. It is only used when the serializer is created as a nested serializer. The updated code is
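The lookup described above can be followed step by step in a framework-free sketch. Everything here is hypothetical: COMPANIES is an in-memory dictionary standing in for the Company table, RelationError stands in for serializers.ValidationError, and resolve_relation only mirrors the order of the checks, not DRF's internals.

```python
class RelationError(Exception):
    """Hypothetical stand-in for serializers.ValidationError."""


COMPANY_ID = "6230fbeb-bffd-4e37-b0e8-c545f4a83a61"

# Hypothetical in-memory "table" standing in for Company.objects.
COMPANIES = {
    COMPANY_ID: {"id": COMPANY_ID, "name": "My Test Company"},
}


def resolve_relation(data, store, pk_name="id"):
    """Return the stored record referenced by data[pk_name] or raise."""
    # 1. The nested value must be a dictionary.
    if not isinstance(data, dict):
        raise RelationError(
            "Expected a dictionary, but got %s." % type(data).__name__
        )

    # 2. Only the primary key is required; other fields are ignored.
    if pk_name not in data:
        raise RelationError({pk_name: pk_name + " is required"})

    # 3. The referenced record must already exist.
    try:
        return store[data[pk_name]]
    except (KeyError, TypeError):
        raise RelationError({pk_name: pk_name + " is not valid"})


company = resolve_relation({"id": COMPANY_ID}, COMPANIES)
```

The point of the design is that the nested payload is never treated as "data for a new company"; it is only a reference that gets swapped for the existing record.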
    from rest_framework import serializers

    from models import Company, Subscription
    # RelationModelSerializer is the class defined above; import it here
    # if it lives in another module.


    class CompanySerializer(RelationModelSerializer):
        id = serializers.UUIDField(required=True, read_only=False)

        class Meta:
            model = Company
            fields = ('id', 'name', 'created_at')


    class SubscriptionSerializer(serializers.ModelSerializer):
        company = CompanySerializer(read_only=False, is_relation=True)

        class Meta:
            model = Subscription
            fields = ('id', 'company', 'price')
With this in place, I can execute POST JSON requests with the payload
    {
        "price": 80.0,
        "company": {
            "id": "6230fbeb-bffd-4e37-b0e8-c545f4a83a61"
        }
    }
and get the response
    {
        "id": "1ffc2a43-6ca6-4ba7-a551-adfccee86427",
        "price": 0.0,
        "company": {
            "id": "6230fbeb-bffd-4e37-b0e8-c545f4a83a61",
            "name": "My Test Company"
        }
    }
Same DTO for request and response. At the same time, I didn’t modify the /api/companies/ endpoint. Companies get created/updated normally with all the required validation working as it should.