Python and multiple constructors

One thing I missed when switching from Java to Python was multiple constructors. Python does not support them directly, but there are many other approaches that work very similarly (maybe even better).

Problem

Let’s say we are building a client to query a remote service (some aggregation service). We want to pass an aggregator along with the query.

To make the code more fluent and more robust when it is integrated into other solutions, we offer multiple ways to create an aggregator.

The query.aggregator will create a new instance of Aggregator and pass it to the request.

(Possible) solution

Python lets a function accept arbitrary positional and keyword arguments through *args and **kwargs. We can create a constructor that accepts both, and then check and parse args and kwargs inside it. This solution works, but it has many problems:

  1. No indication of what is required and what is not
    This matters most for autocompletion. When I want to create a new instance of the class Aggregator, I want to know what is required. With the current constructor, that is really hard.
  2. Complexity and combinations
    There are many combinations of arguments that can initialize a new instance.

    The code that handles all of them is absolutely weird and hard to read.
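To make both problems concrete, here is a minimal sketch of what such a catch-all constructor might look like. The fields name, value and unit (and the "minutes" default) are assumptions made up for the illustration, not the real Aggregator.

```python
class Aggregator:
    """Hypothetical aggregator; the fields name, value and unit are assumed."""

    def __init__(self, *args, **kwargs):
        # Accepts Aggregator("sum", 5), Aggregator(name="sum", value=5)
        # or Aggregator({"name": "sum", "value": 5}). None of this is
        # visible in the signature.
        if len(args) == 1 and isinstance(args[0], dict):
            kwargs = {**args[0], **kwargs}
            args = ()
        names = ("name", "value", "unit")
        if len(args) > len(names):
            raise TypeError("too many positional arguments")
        params = dict(zip(names, args))
        for key, val in kwargs.items():
            if key not in names:
                raise TypeError(f"unexpected argument: {key}")
            if key in params:
                raise TypeError(f"duplicate argument: {key}")
            params[key] = val
        if "name" not in params or "value" not in params:
            raise TypeError("name and value are required")
        self.name = params["name"]
        self.value = params["value"]
        self.unit = params.get("unit", "minutes")
```

An editor can only show `Aggregator(*args, **kwargs)` here, so the caller has to read the body to learn that name and value are mandatory.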

Better solution

Python has an option to decorate a method with @classmethod. We can define custom methods that work as multiple constructors. For example, we can create a method from_arguments.

We use it as Aggregator.from_arguments(args). The validation of the parameters (for example, whether value is an int) is done in the constructor.

The from_arguments method just parses the arguments and creates a new instance of the Aggregator. We could also add validation here (whether a list has at least 2 items, whether a str is in the correct format, whether a dict has all the required elements, …).
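A minimal sketch of this approach, under the same assumptions as before: the fields name, value and unit are hypothetical, and so is the "name:value:unit" string format.

```python
class Aggregator:
    """Hypothetical aggregator; name, value and unit are assumed fields."""

    def __init__(self, name, value, unit="minutes"):
        # The explicit signature shows what is required; validation lives here.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("value must be an int")
        self.name = name
        self.value = value
        self.unit = unit

    @classmethod
    def from_arguments(cls, args):
        """Build an Aggregator from a list/tuple, a dict or a 'name:value[:unit]' string."""
        if isinstance(args, cls):
            return args
        if isinstance(args, str):
            parts = args.split(":")
            if len(parts) not in (2, 3):
                raise ValueError("expected 'name:value' or 'name:value:unit'")
            return cls(parts[0], int(parts[1]), *parts[2:])
        if isinstance(args, dict):
            missing = {"name", "value"} - args.keys()
            if missing:
                raise ValueError(f"missing keys: {sorted(missing)}")
            return cls(**args)
        if isinstance(args, (list, tuple)):
            if len(args) < 2:
                raise ValueError("expected at least name and value")
            return cls(*args)
        raise TypeError(f"unsupported argument type: {type(args).__name__}")
```

Because from_arguments receives cls rather than a hard-coded class, a subclass of Aggregator gets the same alternative constructor for free.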

Django Rest Framework, NestedSerializer with relation and CRUD

I started a Django project that enables other services to interact with it over an API. Of course, one of the best solutions for building an API in Python is Django Rest Framework. It is a great project with a large community, and its development was supported through Kickstarter. How cool is that?

Project

My project/service offers, among other things, access to and creation of companies and subscriptions. Each company can have multiple subscriptions, so we have a one-to-many relation. I quickly created the models Company and Subscription.

One thing to notice here is that I use UUIDs. The reason lies in the fact that some other services also contain company data. Those services will create companies, as they have all the required data (id, name). With this I’m able to resolve sync problems.

For the subscription model I generate the UUID randomly.
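The id strategy can be illustrated without Django. The dataclasses below are only stand-ins for the real models, and every field other than the company id and name is an assumption. In a Django model the same idea is usually expressed with a UUIDField whose default is uuid.uuid4.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Company:
    # The id is supplied by the service that already owns the company data.
    id: uuid.UUID
    name: str


@dataclass
class Subscription:
    company: Company
    # No caller-supplied id: fall back to a random (version 4) UUID.
    id: uuid.UUID = field(default_factory=uuid.uuid4)


company = Company(id=uuid.UUID("12345678-1234-5678-1234-567812345678"), name="Acme")
subscription = Subscription(company=company)
```

Passing the function uuid.uuid4 itself (not its result) as the factory matters: each new subscription then gets its own id instead of sharing one generated at import time.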

Django Rest Framework

Django Rest Framework has great docs. If you follow the quickstart, you can set up a working API in a few minutes.

In my case, I created serializers

I had to define an additional id field for the company serializer. By default ids are read-only (normally ids are generated at the database level), but in my case I pass the id while creating the company.

I also created viewsets.py.

For the last step you have to add the viewsets to the API router.

Now when you access /api/companies/ or /api/subscriptions you should get a response (for now probably only an empty array).

This part is very simple and there are tons of examples of how to do this.

Problem

To create a company, I execute a POST JSON request (I’m using Postman) to /api/companies/ with the following payload.

and I get returned

Now I have a company in the database. Let’s create a subscription. Again, I execute a POST JSON request to /api/subscriptions with the payload

and I get an error that company name is required. What?

Same request and response

Before I go into explaining what the previous error means and how I solved it, I first have to explain what I want.

Other services that talk with my service use different HTTP clients. One of them is Netflix Feign. With it you can simply create HTTP clients that map the request or response to DTOs. For example, they have a SubscriptionDTO defined as

and CompanyDTO

So the same DTO is used for request and response. I want to pass the same DTO with all the required data when creating the subscription. When the response is returned, it populates the existing SubscriptionDTO. This is important, because I want to eliminate the confusion of using multiple DTOs for the same entity (Subscription).

Process of identifying the problem

To return to the previous error: when I retrieve subscriptions, I also want to include the company information in the subscription list.

I accomplished this by defining company = CompanySerializer() in my SubscriptionSerializer. Without it, the response would contain only the company id instead of the nested company object.

But I don’t want this, I want the full output. When I defined the company field, I didn’t pass any arguments. By default that makes the nested serializer writable, so when I execute the POST, the company data is validated as if the subscription and all its relations (the company) were being created. That is why I got an error that the company name is required: it wanted to create a new company, but the name was missing. I don’t want this either.

I checked online and asked a few people. Most of them suggested that I pass the read_only=True argument when I define the company field: company = CompanySerializer(read_only=True). Now when I executed the POST, I got an error that the subscription.company_id database field must not be null. Once you define read_only for a field, its data is not passed to the method that creates the model (subscription). Why?
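The reason is that read-only fields are rendered in responses but skipped when the incoming payload is turned into validated data. Here is a simplified, framework-free model of that behaviour. This is not DRF code: Field and to_validated_data are names made up for the illustration, and the payload values are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    read_only: bool = False


def to_validated_data(fields, payload):
    """Keep only values for writable fields; input for read-only fields is ignored."""
    return {
        f.name: payload[f.name]
        for f in fields
        if not f.read_only and f.name in payload
    }


fields = [Field("id"), Field("company", read_only=True)]
payload = {"id": "sub-1", "company": {"id": "comp-1"}}
validated = to_validated_data(fields, payload)
# 'company' is absent from the result, so nothing can set company_id on save.
```

With the company dropped before the subscription is created, the database sees a NULL company_id and rejects the row.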

There are many discussions around how to solve this.

a) https://groups.google.com/forum/#!topic/django-rest-framework/5twgbh427uQ
b) http://stackoverflow.com/questions/29950956/drf-simple-foreign-key-assignment-with-nested-serializers
c) http://stackoverflow.com/questions/22616973/django-rest-framework-use-different-serializers-in-the-same-modelviewset

Some suggest different serializers, others suggest using 2 fields (one for read and the other for create/update). But all of them seem hackish and impose a lot of extra code. Tom Christie, the author of DRF, suggested that I define the CompanySerializer fields (except id) as read-only. This kind of solved the problem, but if the company has additional fields, I need to override them too, which means extra code. At the same time, I want to preserve the /api/companies/ endpoint for creating/updating companies. If I set the fields as read-only, I wouldn’t be able to create companies without an additional CompanySerializer.

I tried to override the subscription create methods, but without success. If I defined read_only=True when creating the company field, then no company information was passed to validated_data (the data that is later used to create a subscription). If I defined read_only=False, then I always got the “name is required” error.

I wanted a simple and working solution.

Solution

I started to look for a solution that was simple and enabled me to make the requests that I want. Digging through the code, I noticed many methods for field creation that I could override. In the end, I had to modify the validation method.

I overrode validate_empty_values, where I check the relation. The idea is that I check the posted data. If an id (or primary key) of the related model is present, I validate that a record exists for that id and return it. If it doesn’t exist or the data is invalid, I raise an error.
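The check itself can be sketched in plain Python, independent of the framework. The names below (resolve_relation, the local ValidationError, the dict used as a record store) are hypothetical stand-ins. In the real serializer the lookup goes through the model and the error is the framework's own validation error.

```python
class ValidationError(Exception):
    """Stand-in for the framework's validation error."""


def resolve_relation(data, records, pk_name="id"):
    """Return the existing record referenced by data[pk_name], or raise ValidationError."""
    if not isinstance(data, dict):
        raise ValidationError(f"expected an object, got {type(data).__name__}")
    if pk_name not in data:
        raise ValidationError(f"{pk_name} is required")
    try:
        return records[data[pk_name]]
    except (KeyError, TypeError):
        # Unknown id, or a value that cannot be used as a key at all.
        raise ValidationError(f"{pk_name} is not valid") from None


# Only the id is posted; the name is looked up, not required from the client.
companies = {"comp-1": {"id": "comp-1", "name": "Acme"}}
company = resolve_relation({"id": "comp-1"}, companies)
```

The important property is that the existing record is returned as the validated value, so no other company field has to be present in the request.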

There is also an is_relation argument that you have to pass when creating the serializer. It is only used when the serializer is created as a nested serializer. The updated code is

With this in place, I can now execute POST JSON requests with the payload

and get a response

Same DTO for request and response. At the same time, I didn’t modify the /api/companies/ endpoint. Companies get created/updated normally with all the required validation working as it should.