Java | Erol.si

You want to use Akka? Better learn Scala

April 28, 2015 - Last update: April 28, 2015

I have been using Play Framework from the 1.2 version. Lately, I do most of the work with the 2.2/2.3 version. It supports both Scala and Java (you can literally mix the code files). Because I know Java much better compared to Scala (well I don’t know Scala at all), I do all my coding in Java.

Play Framework comes with Akka that supports actors for processing data. I have about 20 different actors that handle different scenarios. Actors talk with each other, so there’s a lot of different messages. Each message represents specific action.

One thing that really frustrates me is that Akka and Java don’t play nicely. I mean, everything works, but the authors of Akka don’t put a lot of effort into Java. I know, Scala is the great new language and once you know it, why the hell would you still use Java. Problem is that examples and tutorials are mostly written in Scala, coding actors in Scala is much easier and testing is just damn short and sweet.

Examples and Tutorials

When I face a problem and I want to see how others solved it or learn a new thing, I notice that most (about 95%) of Akka examples are in Scala. That means I need to somehow decode examples and convert to Java. This is not always possible. Sometimes certain implementations in Scala cannot be directly converted to Java and different approach has to be used.

Official docs have Scala nad Java version, but Scala version has much more content. There are many very useful blogs, like http://letitcrash.com/, but it’s all in Scala. Most of the books for Akka are also in Scala. Similar is with opensource projects on github.com.

For somebody coming from Java world, this can be frustrating.

Code

Java is not a language for writing short programs. So part of the fault is on Java, but again there could be better ways to write actors. For example, when I create an actor and want to process 3 different messages.

public class ABCActor extends UntypedActor
{
    private final LoggingAdapter log = Logging.getLogger(getContext().system(), this);

    @Override
    public void onReceive(Object message) throws Exception 
    {
        if(message instanceof MessageA)
        {
            MessageA messageA = (MessageA) message;
            this.process(messageA.foo);
        }
        else if(message instanceof BMessage)
        {
            MessageB messageB = (MessageB) message;
            this.process(messageB.foo);
        }
        else if(message instanceof CMessage)
        {
            MessageC messageC = (MessageC) message;
            this.process(messageC.foo)
        }
        else
        {
            unhandled(message)
        }
    }

    public void process(String foo)
    {
        String reply = generateReply()
        sender().tell(reply, self());
    }

    public String generateReply()
    { 
        return "bar";
    }
   
    public static class MessageA implements Serializable 
    {
        public final String foo;
        public MessageA(String foo)
        {
            this.foo = foo;
        }
    }

    // similar for MessageB and MessageC
}

public class ABCActor extends UntypedActor

{

private final LoggingAdapter log = Logging.getLogger(getContext().system(), this);

@Override

public void onReceive(Object message) throws Exception

{

if(message instanceof MessageA)

{

MessageA messageA = (MessageA) message;

this.process(messageA.foo);

}

else if(message instanceof BMessage)

{

MessageB messageB = (MessageB) message;

this.process(messageB.foo);

}

else if(message instanceof CMessage)

{

MessageC messageC = (MessageC) message;

this.process(messageC.foo)

}

else

{

unhandled(message)

}

public void process(String foo)

{

String reply = generateReply()

sender().tell(reply, self());

}

public String generateReply()

{

return "bar";

}

public static class MessageA implements Serializable

{

public final String foo;

public MessageA(String foo)

{

this.foo = foo;

}

// similar for MessageB and MessageC

}

The code quickly become long (and I have very simple example), because it’s hard to filter messages and forward them to appropriate methods. While in Scala, things are much shorter.

object ABCActor
{
  case class MessageA(foo:String)
  case class MessageB(foo:String)
  case class MessageC(foo:String)
}

class ABCActor extends Actor with ActorLogging
{
  import ABCActor._

  def receive = 
  {
    case MessageA(foo:String) => process(foo)
    case MessageB(foo:String) => process(foo)
    case MessageC(foo:String) => process(foo)
    case _ => unhandled()
  }  

  def generateReply():String = "bar"

  def process(foo:String) = 
  { 
    val reply = generateReply()
    sender ! reply
  }
}

object ABCActor

{

case class MessageA(foo:String)

case class MessageB(foo:String)

case class MessageC(foo:String)

}

class ABCActor extends Actor with ActorLogging

{

import ABCActor._

def receive =

{

case MessageA(foo:String) => process(foo)

case MessageB(foo:String) => process(foo)

case MessageC(foo:String) => process(foo)

case _ => unhandled()

}

def generateReply():String = "bar"

def process(foo:String) =

{

val reply = generateReply()

sender ! reply

}

For me it’s important that you can fit the whole logic into screen. So I can see the whole code without scrolling it. It’s easier to put the logic into my brand. This is not possible with Java code (of course, I could split it and add extra files, but again the instead of scrolling I would be clicking).

Sending a message

// Java
actor.tell(message, self())

// Scala
actor ! message

// Java

actor.tell(message, self())

// Scala

actor ! message

Scheduling a message

// Java
system.scheduler().scheduleOnce(Duration.create(5, TimeUnit.SECONDS),
                                actor, 
                                message, 
                                system.dispatcher(), 
                                null);

// Scala
import context.system._
scheduler.scheduleOnce(5 seconds, actor, message)

// Java

system.scheduler().scheduleOnce(Duration.create(5, TimeUnit.SECONDS),

actor,

message,

system.dispatcher(),

null);

// Scala

import context.system._

scheduler.scheduleOnce(5 seconds, actor, message)

There are tons of other examples. What I’m trying to say it that it’s much easier to write actor logic with Scala compared to Java. Much easier.

Testing

When writing unit tests, you should always test small parts of the code. That means that tests should also be small and short. Because Scala is very descriptive language, you can easily define a test. For example, let’s test our ABC actor and see if it replies.

class ABCActorTest(_system: ActorSystem) extends TestKit(_system) 
  with FlatSpecLike
  with MockitoSugar
  with ImplicitSender
  with Matchers
  with BeforeAndAfterAll
{
  trait Fixture {
    var message = new MessageA("foo")
  }

  def this() = this(ActorSystem("ABCActorTest"))

  override def afterAll()
  {
    TestKit.shutdownActorSystem(system)
  }

  "Actor" should "send a message and get a reply" in new Fixture
  {
    val actor = TestActorRef(Props(new ABCActor()))
    actor ! message
    expectMsg("bar")
  }
}

class ABCActorTest(_system: ActorSystem) extends TestKit(_system)

with FlatSpecLike

with MockitoSugar

with ImplicitSender

with Matchers

with BeforeAndAfterAll

{

trait Fixture {

var message = new MessageA("foo")

}

def this() = this(ActorSystem("ABCActorTest"))

override def afterAll()

{

TestKit.shutdownActorSystem(system)

}

"Actor" should "send a message and get a reply" in new Fixture

{

val actor = TestActorRef(Props(new ABCActor()))

actor ! message

expectMsg("bar")

}

Again, much shorted, much easier to understand and mostly less possibility for mistakes. We can also test specific method of actor.

"Actor" should "generate reply" in 
{
  val actorRef = TestActorRef(Props(new ABCActor()))
  val actor = actorRef.underlyingActor
  actor.generateReply() should be == ("bar")
}

"Actor" should "generate reply" in

{

val actorRef = TestActorRef(Props(new ABCActor()))

val actor = actorRef.underlyingActor

actor.generateReply() should be == ("bar")

}

Or we can mock certain actor methods. First we need to create a trait (similar to Java interfaces), so we can mock methods.

trait ABCActorBase
{
  def generateReply():String
}

class ABCActor extends Actor with ActorLogging with ABCActorBase
{
  // same code
}

trait ABCActorBase

{

def generateReply():String

}

class ABCActor extends Actor with ActorLogging with ABCActorBase

{

// same code

}

Now when we create a test, we can inject different ABCActorBase.

trait ABCActorTestBase extends ABCActorBase
{
  def generateReply():String = "mocked bar"
}

"Actor" should "mock method" in new Fixture
{
  val actor = TestActorRef(Props(new ABCActor with ABCActorTestBase))
  actor ! message
  expectMsg("mocked bar")
}

trait ABCActorTestBase extends ABCActorBase

{

def generateReply():String = "mocked bar"

}

"Actor" should "mock method" in new Fixture

{

val actor = TestActorRef(Props(new ABCActor with ABCActorTestBase))

actor ! message

expectMsg("mocked bar")

}

Scala has ScalaTest, which in my opinion is one of the best testing libraries. It offers very descriptive test results and error reporting. It supports multiple styles of testing, some of them look really awesome.

Conclusion

Even though I didn’t like Scala, I had to realize that it’s actually a great language. Not my favorite, but it’s OK. Since I’m developing an application that runs on JVM, it’s a shame not to use it if possible to solve certain tasks. It has many many more great features that I didn’t showcase, so feel free to check them out.

Final result it that I rewrote all my actors with Scala. They are much shorter and it’s easier to understand what they are doing. At the same time, great part of tests are now in Scala and I have a feeling that code is much stable and more error prone. It’s been few weeks since I have new actors in production and there were zero problems. Happy coding.

How to install KairosDB time series database?

February 5, 2015 - Last update: January 26, 2016

In my previous post, I described why I switched from OpenTSDB (other time series database) to KairosDB. In this post, I will show how to install and run KairosDB.

Requirements

To run KairosDB we actually just need KairosDB (if we ignore Ubuntu/Debian/something similar and Java). How is that possible? Well, KairosDB supports two datastores: H2 and Cassandra. H2 is actually an in memory H2 database. It’s easy to setup and cleanup, and it’s mostly used for development. Don’t use it in the production; it will work, but it will be very very slow.

For our tutorial we will use Cassandra as datastore. To install Cassandra, you can follow the official tutorial at http://wiki.apache.org/cassandra/GettingStarted. We will install it via apt-get.

deb http://www.apache.org/dist/cassandra/debian 21x main
deb-src http://www.apache.org/dist/cassandra/debian 21x main

1 2	deb http://www.apache.org/dist/cassandra/debian 21x main deb-src http://www.apache.org/dist/cassandra/debian 21x main

You will want to replace 21x by the series you want to use: 20x for the 2.0.x series, 12x for the 1.2.x series, etc… You will not automatically get major version updates unless you change the series, but that is a feature.

We also need to add public keys to be able to access debian packages.

gpg --keyserver pgp.mit.edu --recv-keys F758CE318D77295D
gpg --export --armor F758CE318D77295D | sudo apt-key add -

gpg --keyserver pgp.mit.edu --recv-keys 2B5C1B00
gpg --export --armor 2B5C1B00 | sudo apt-key add -

gpg --keyserver pgp.mit.edu --recv-keys 0353B12C
gpg --export --armor 0353B12C | sudo apt-key add -

gpg --keyserver pgp.mit.edu --recv-keys F758CE318D77295D

gpg --export --armor F758CE318D77295D | sudo apt-key add -

gpg --keyserver pgp.mit.edu --recv-keys 2B5C1B00

gpg --export --armor 2B5C1B00 | sudo apt-key add -

gpg --keyserver pgp.mit.edu --recv-keys 0353B12C

gpg --export --armor 0353B12C | sudo apt-key add -

Now we are ready to install it.

sudo apt-get update
sudo apt-get install cassandra

1 2	sudo apt-get update sudo apt-get install cassandra

This will install the Cassandra database. Few things you must know is that the configuration files are located in /etc/cassandra, and the start-up options (heap size, etc) can be configured in /etc/default/cassandra. Now that Cassandra is install, run it.

sudo service cassandra start

1	sudo service cassandra start

Another requirement is that you have Oracle Java JDK instead of OpenJDK. You must install version 7 or 8 (8 is recommended, I’m using 7). Again, we will install it with apt-get.

sudo apt-get install python-software-properties
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java7-installer
sudo apt-get install oracle-java7-set-default

sudo apt-get install python-software-properties

sudo add-apt-repository ppa:webupd8team/java

sudo apt-get update

sudo apt-get install oracle-java7-installer

sudo apt-get install oracle-java7-set-default

Source: http://stackoverflow.com/a/16263651/73010

KairosDB uses Thrift for communicating with Cassandra. When I installed Cassandra, it wasn’t enabled by default. So I had to enable it first. There are many ways and if you hate to fiddle with config files, you can install OpsCenter. It’s a really great tool for monitoring your cluster. It has a simple interface where you can access your nodes and change their configuration to enable Thrift. To change it the in the config file, update start_rpc setting to true in /etc/cassandra/cassandra.yaml.

Installing KairosDB

We can again install KairosDB in few ways.

a) Building from the source

a) Clone the git repository https://github.com/kairosdb/kairosdb.git
b) Make sure that JAVA_HOME is set to your java install.
c) Compile the code

export CLASSPATH=tools/tablesaw-1.2.2.jar
java make

1 2	export CLASSPATH=tools/tablesaw-1.2.2.jar java make

b) Installing via .deb package (recommended)

Current stable version is ~~0.9.4~~ 1.1.1. Make sure you download the latest version at https://github.com/kairosdb/kairosdb/releases.

wget https://github.com/kairosdb/kairosdb/releases/download/v1.1.1/kairosdb_1.1.1-1_all.deb
sudo dpkg -i kairosdb_1.1.1-1_all.deb

1 2	wget https://github.com/kairosdb/kairosdb/releases/download/v1.1.1/kairosdb_1.1.1-1_all.deb sudo dpkg -i kairosdb_1.1.1-1_all.deb

Setting Cassandra as a datastore

As mentioned before, KairosDB by default uses H2 database for datastore. We need to change it to Cassandra.

a) If you are running from source, then copy kairosdb.properties to KairosDB root folder from src/main/resources/ folder to change it.
b) If you installed it, then change the file /opt/kairosdb/conf/kairosdb.properties.

In the file comment the line where H2 is set as datastore and uncomment Cassandra module. So the file should look like this.

#kairosdb.service.datastore=org.kairosdb.datastore.h2.H2Module
kairosdb.service.datastore=org.kairosdb.datastore.cassandra.CassandraModule

1 2	#kairosdb.service.datastore=org.kairosdb.datastore.h2.H2Module kairosdb.service.datastore=org.kairosdb.datastore.cassandra.CassandraModule

You can also change some other setting to tune it, but for now just save it and you are ready to go.

Test if everything works

Make sure your Cassandra service is running. Now lets run KairosDB.

a) Running from source

java make run

1	java make run

b) Or if installed

sudo service kairosdb start

1	sudo service kairosdb start

Go to http://localhost:8080 to check if everything works OK. If you can see KairosDB dashboard, then congratulations, you can now use KairosDB.

What’s next

In the next tutorial we will see how to save, query and delete datapoints with web interface, HTTP and Telnet API.

Passing collections between Akka actors

January 25, 2015 - Last update: January 28, 2015

Akka actors are great when we are looking for a scalable real-time transaction processing (yes, this is the actual definition using some big words). Actually, it’s really great for some background processing because you can create many instances without actually worrying about concurrency and parallelism.

The code

We have a simple application for processing the uploaded file. We accept the file, parse it (simple txt file), calculate the values and save them in some database. We could have everything in one actor, but it’s much better to split it in multiple actors and create a pipeline. Each actor does exactly one thing. We have much more clean code and at the same time, testing it is much easier.

We have (for this demonstration) 2 actors. One reads the files to a List and sends it to another actor.

public class ReadFileActor extends UntypedActor
{
    @Override
    public void onReceive(Object m) throws Exception
    {
        if(m instanceof ReadFileMessage)
        {
            ReadFileMessage message = (ReadFileMessage) m;
            List<Double> numbers = this.readNumbersFromFile(message.filename); // some simple method to read file

            final ActorSelection actor = context().system().actorSelection("user/calculator-actor");
            actor.tell(new NumbersMessage(numbers));
        }
        else
        {
            unhandled(m);
        }
    }

    public static class ReadFileMessage
    {
        public final String filename;
        public ReadFileMessage(String filename)
        {
            this.filename = filename;
        }
    }
}

public class ReadFileActor extends UntypedActor

{

@Override

public void onReceive(Object m) throws Exception

{

if(m instanceof ReadFileMessage)

{

ReadFileMessage message = (ReadFileMessage) m;

List<Double> numbers = this.readNumbersFromFile(message.filename); // some simple method to read file

final ActorSelection actor = context().system().actorSelection("user/calculator-actor");

actor.tell(new NumbersMessage(numbers));

}

else

{

unhandled(m);

}

public static class ReadFileMessage

{

public final String filename;

public ReadFileMessage(String filename)

{

this.filename = filename;

}

The second actor gets a the List of numbers and calculates the sum of them.

public class CalculatorActor extends UntypedActor
{
    @Override
    public void onReceive(Object m) throws Exception
    {
        if(m instanceof NumbersMessage)
        {
            NumbersMessagemessage = (NumbersMessage) m;
            Double sum = this.sumNumbers(message.numbers);

            // send to another actor...
        }
        else
        {
            unhandled(m);
        }
    }

    public static class NumbersMessage
    {
        public final List<Double> numbers;
        public ReadFileMessage(List<Double> numbers)
        {
            this.numbers= numbers;
        }
    }
}

public class CalculatorActor extends UntypedActor

{

@Override

public void onReceive(Object m) throws Exception

{

if(m instanceof NumbersMessage)

{

NumbersMessagemessage = (NumbersMessage) m;

Double sum = this.sumNumbers(message.numbers);

// send to another actor...

}

else

{

unhandled(m);

}

public static class NumbersMessage

{

public final List<Double> numbers;

public ReadFileMessage(List<Double> numbers)

{

this.numbers= numbers;

}

If we used this code, we would quickly discover problems. When I tested it with VisualVM for memory leaks, I quickly discovered a memory leak with List numbers. How to solve it?

Immutable collections

When passing object between actors we need to follow few guidelines. If we brake them, we can face memory leaks and consequentially app crashes. One of the guidelines is to use Immutable collections. If we pass them between actors, they have to be Immutable. What are the advantages of Immutable objects?

Thread-safe – so they can be used by many threads with no risk of race conditions.
Doesn’t need to support mutation, and can make time and space savings with that assumption.
All immutable collection implementations are more memory-efficient than their mutable siblings (analysis)
Can be used as a constant, with the expectation that it will remain fixed.

There are many implementations of Immutable collections and one of the best ones is in Guava.

Improved code

We have to use ImmutableList to create a list of numbers for passing between actors.

...
ReadFileMessage message = (ReadFileMessage) m;
ImmutableList<Double> numbers = this.readNumbersFromFile(message.filename); 
...

...

ReadFileMessage message = (ReadFileMessage) m;

ImmutableList<Double> numbers = this.readNumbersFromFile(message.filename);

...

...
    NumbersMessagemessage = (NumbersMessage) m;
    // numbers is now ImmutableList<Double>
    Double sum = this.sumNumbers(message.numbers); 
....

public static class NumbersMessage
{
    public final ImmutableList<Double> numbers;
    public ReadFileMessage(ImmutableList<Double> numbers)
    {
        this.numbers= numbers;
    }
}

...

NumbersMessagemessage = (NumbersMessage) m;

// numbers is now ImmutableList<Double>

Double sum = this.sumNumbers(message.numbers);

....

public static class NumbersMessage

{

public final ImmutableList<Double> numbers;

public ReadFileMessage(ImmutableList<Double> numbers)

{

this.numbers= numbers;

}

Rerunning VisualVM confirmed that memory leak was resolved. Great.

Scripts to start and stop Play Framework application

December 23, 2014 - Last update: January 25, 2015

Play Framework (I’m talking about 2.X version) has multiple deployment ways. If you just check the docs, you will notice it has few pages of instructions just how to deploy your Play Framework application. For me, using stage seems to be the best and most stable way.

Stage

When deploying an application, I always run clean and stage commands. The first one will remove compiled and cached files. Second one will compile the application and create an executable. Everything will be located in ${application-home}/target/universal/stage/. There you have a bin folder and inside simple script to run the app. We will create start/stop scripts to make our work a little bit easier.

Scripts start/stop

To run the application, we use nohup. nohup enables us that the application will run even when we close the terminal or logout from our development machine. As everyone says, it’s not perfect solution but it works. We run the command in stage folder and add additional parameters.

nohup target/universal/stage/bin/myapp -J-server -J-Xms256M -J-Xmx256M -Dconfig.resource=application-prod.conf -Dhttp.port=9000 > /dev/null 2>&1 & echo $! > RUNNING_PID

1	nohup target/universal/stage/bin/myapp -J-server -J-Xms256M -J-Xmx256M -Dconfig.resource=application-prod.conf -Dhttp.port=9000 > /dev/null 2>&1 & echo $! > RUNNING_PID

-J-Server enables us to pass additional JVM related settings. In our case, we define Xms and Xmx for better memory management.
We have application.conf for development and application-prod.conf for the production with different settings (database, secret key, API logins, etc).
With -Dhttp.port we define the port of our application. We use apache to map the 80 port to 9000. It’s much safer and easier like this, because later we can have load balancer in the middle to divide the load to multiple application instances.

When running nohup, it will create a nohup.out file in which it logs everything (basically what application returns). Don’t confuse it with application logs. Application will still log everything based on logger.xml configuration independently of nohup. To prevent nohup.out file, we have to redirect everything to /dev/null and basically just ignore it.

On the end, we output the pid into RUNNING_PID file. Be careful. Play Framework automatically creates an additional file RUNNING_PID in stage folder. We add this as extra information and the file is removed after stopping our application.

test -f target/universal/stage/RUNNING_PID && kill `cat target/universal/stage/RUNNING_PID` && sleep 5;
rm RUNNING_PID;

1 2	test -f target/universal/stage/RUNNING_PID && kill `cat target/universal/stage/RUNNING_PID` && sleep 5; rm RUNNING_PID;

When we want to stop our application, we need to get the pid. We read it from stage folder RUNNING_PID file and pass it to kill command. For safety reasons we wait 5 seconds just to be sure that application is stopped. We could have a running job which needs few more seconds to complete or save the state.

Extra tip

We can also pass additional parameters to our stage command. One of them is javaagent. If we are using some remote monitoring solution like newrelic, we can include jar to send application data.

To do so, we need to pass -J-javaagent:/path/to/newrelic.jar with all the other parameters. Be sure that you include the correct path, because otherwise it will fail to start the application.

Handle file uploads in Play Framework 2.x [Java]

October 25, 2014 - Last update: October 25, 2014

Most applications have the ability to upload something. Handling uploaded files should not be hard. We need to check if user uploaded the file, if it’s the right type and store it. Mather of fact, this is really easy with Play Framework.

An example

We have a form to upload a file.

<form action="/upload" method="POST" enctype="multipart/form-data">
    <input type="file" name="file" />
</form>

</form>

This form will take one file and post it to /upload path. To be able to upload a file, we need to define enctype to multipart/form-data. This just means how the POST will be constructed and how the file will be send.

Next this is to create a controller and a method. We will only enable uploading of PDF files.

package controllers;

import play.mvc.Result;
import static play.mvc.Results.badRequest;
import static play.mvc.Results.ok;
import play.mvc.Controller;
import play.mvc.Http;

public class Uploads extends Controller
{
    public static Result upload()
    {
        Http.MultipartFormData body = request().body().asMultipartFormData();
        if(body == null)
        {
            return badRequest("Invalid request, required is POST with enctype=multipart/form-data.");
        }

        Http.MultipartFormData.FilePart filePart = body.getFile("file");
        if(filePart == null)
        {
             return badRequest("Invalid request, no file has been sent.");
        }

        // getContentType can return null, so we check the other way around to prevent null exception
        if(!"application/pdf".equalsIgnoreCase(filePart.getContentType())
        {
             return badRequest("Invalid request, only PDFs are allowed.");
        }
        
        File file = filePart.getFile();

        // handle file

        return ok();    
    }
}

package controllers;

import play.mvc.Result;

import static play.mvc.Results.badRequest;

import static play.mvc.Results.ok;

import play.mvc.Controller;

import play.mvc.Http;

public class Uploads extends Controller

{

public static Result upload()

{

Http.MultipartFormData body = request().body().asMultipartFormData();

if(body == null)

{

return badRequest("Invalid request, required is POST with enctype=multipart/form-data.");

}

Http.MultipartFormData.FilePart filePart = body.getFile("file");

if(filePart == null)

{

return badRequest("Invalid request, no file has been sent.");

}

// getContentType can return null, so we check the other way around to prevent null exception

if(!"application/pdf".equalsIgnoreCase(filePart.getContentType())

{

return badRequest("Invalid request, only PDFs are allowed.");

}

File file = filePart.getFile();

// handle file

return ok();

}

Very simple, right? I highly recommend you move this code to somewhere else (for example to some service). Good practice is to keep controllers slim.

First we need to check type of the request and check if it’s multipart/form-data. If body is null, then something is wrong. Same thing is for a file. If there is no file present, we need to report an error. Beware, it’s easy to modify the content type. Checking if the file is really PDF can sometimes be more difficult. Best ways are to use some additional libraries – one of the is Apache Tika.

Handling multiple files

We can also handle multiple file at once. All we have to do is loop through all posted files.

package controllers;

import play.mvc.Result;
import static play.mvc.Results.badRequest;
import static play.mvc.Results.ok;
import play.mvc.Controller;
import play.mvc.Http;
import java.util.List;

public class Uploads extends Controller
{
    public static Result upload()
    {
        Http.MultipartFormData body = request().body().asMultipartFormData();
        if(body == null)
        {
            return badRequest("Invalid request, required is POST with enctype=multipart/form-data.");
        }

        List<Http.MultipartFormData.FilePart> fileParts = body.getFiles();
        if(fileParts.isEmpty())
        {
            return badRequest("Invalid request, no files have been included in the request.");
        }

        for(Http.MultipartFormData.FilePart filePart: fileParts)
        {
            if(!"application/pdf".equalsIgnoreCase(filePart.getContentType())
            {
                 return badRequest("Invalid request, only PDFs are allowed.");
            }
        
            File file = filePart.getFile();
            // handle file
        }

        return ok();    
    }
}

package controllers;

import play.mvc.Result;

import static play.mvc.Results.badRequest;

import static play.mvc.Results.ok;

import play.mvc.Controller;

import play.mvc.Http;

import java.util.List;

public class Uploads extends Controller

{

public static Result upload()

{

Http.MultipartFormData body = request().body().asMultipartFormData();

if(body == null)

{

return badRequest("Invalid request, required is POST with enctype=multipart/form-data.");

}

List<Http.MultipartFormData.FilePart> fileParts = body.getFiles();

if(fileParts.isEmpty())

{

return badRequest("Invalid request, no files have been included in the request.");

}

for(Http.MultipartFormData.FilePart filePart: fileParts)

{

if(!"application/pdf".equalsIgnoreCase(filePart.getContentType())

{

return badRequest("Invalid request, only PDFs are allowed.");

}

File file = filePart.getFile();

// handle file

}

return ok();

}

Extra tip

When we upload the file, it will be stored into /tmp folder (of course if we use Linux server). Then we just need to move or copy the file to the right folder.

Recommended way is to use highly tested library Apache Commons IO and methods FileUtils.copyFile(source, destination) or FileUtils.moveFile(source, destination).

try
{
    File file = filePart.getFile();
    File destination = new File("/home/app/uploads/", file.getName());
    FileUtils.moveFile(file, destination);
}
catch(IOException ex)
{
    // something went wrong, handle it
}

try

{

File file = filePart.getFile();

File destination = new File("/home/app/uploads/", file.getName());

FileUtils.moveFile(file, destination);

}

catch(IOException ex)

{

// something went wrong, handle it

}

Why I think Spring Data repositories are awesome – Part 2

August 17, 2014 - Last update: August 31, 2014

In the first part, we covered some very basic things we can do with Spring Data repositories. In this part, we will learn how to make more complex queries. By that I mean how to find data by entity field or make a count. You will be amazed how easy it is with Spring Data.

Entities

We will use Post entity from our previous part, update it and add an entity called User.

package models;
 
import java.util.Date;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.persistence.Table;

@Entity
@Table(name = "posts")
public class Post
{
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    public Long id;
    
    @Column(nullable = false)
    public String title;    

    @Column(nullable = false, unique = true)
    public String url;  
 
    @Column(nullable = false, name = "created_at")
    public Date created_at;
 
    @Column(name = "is_active")
    public boolean isActive;

    @ManyToOne(optional = false, fetch = FetchType.LAZY)
    public User user;
}

package models;

import java.util.Date;

import javax.persistence.Column;

import javax.persistence.Entity;

import javax.persistence.GeneratedValue;

import javax.persistence.GenerationType;

import javax.persistence.Id;

import javax.persistence.ManyToOne;

import javax.persistence.Table;

@Entity

@Table(name = "posts")

public class Post

{

@Id

@GeneratedValue(strategy = GenerationType.AUTO)

public Long id;

@Column(nullable = false)

public String title;

@Column(nullable = false, unique = true)

public String url;

@Column(nullable = false, name = "created_at")

public Date created_at;

@Column(name = "is_active")

public boolean isActive;

@ManyToOne(optional = false, fetch = FetchType.LAZY)

public User user;

}

package models;
 
import java.util.Date;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.OneToMany;
import javax.persistence.Table;

@Entity
@Table(name = "users")
public class User
{
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    public Long id;
    
    @Column(nullable = false)
    public String username;

    @OneToMany(mappedBy = "user", fetch = FetchType.LAZY)
    public Set<Post> posts;
}

package models;

import java.util.Date;

import javax.persistence.Column;

import javax.persistence.Entity;

import javax.persistence.GeneratedValue;

import javax.persistence.GenerationType;

import javax.persistence.Id;

import javax.persistence.OneToMany;

import javax.persistence.Table;

@Entity

@Table(name = "users")

public class User

{

@Id

@GeneratedValue(strategy = GenerationType.AUTO)

public Long id;

@Column(nullable = false)

public String username;

@OneToMany(mappedBy = "user", fetch = FetchType.LAZY)

public Set<Post> posts;

}

As you can see, we have defined relation between our entities. Every user can have multiple posts and each post has only 1 user.

Tasks

Let’s imagine we have a situation where we want to create a simple blogging system. We have to be able to:

1. find all active posts
2. find a post by an url
3. find all posts by a user
4. count all active posts

1. Find all active posts

We will use the PostRepository we have defined in our first part. By simply adding a method to the repository, it will generate the right code and map everything to a SQL. Because CrudRepository already has few basic methods prebuilt, we don’t need to add a method to find all posts. Instead, we can use findAll method.

But to find all active posts we have to define our custom method. Actually, it’s very simple.

package repositories;
 
import java.util.Set;
import org.springframework.data.repository.CrudRepository;
import org.springframework.stereotype.Repository;
import models.Post;
 
@Repository
public interface PostRepository extends CrudRepository<Post, Long>
{
    public Set<Post> findByIsActiveTrue();
}

package repositories;

import java.util.Set;

import org.springframework.data.repository.CrudRepository;

import org.springframework.stereotype.Repository;

import models.Post;

@Repository

public interface PostRepository extends CrudRepository<Post, Long>

{

public Set<Post> findByIsActiveTrue();

}

This is it. This is the whole magic. One short line of code. But how it actually works? Spring Data will build the query based on method name and method return type. It will split the method name. In our case into find, by, isactive, true. First part defines to make a select query, the second indicates we want to filter, the third field name and fourth the value of the field. But be careful, defining field value after specifying the field name it will only work for booleans. For other field types, you need to pass the value as method argument. One great this is also that we can add multiple fields.

2. Find post by an url

Continuing the though from the previous section, we can have method build from different fields. For example, let’s load a post by it’s url. We have to update our repository.

@Repository
public interface PostRepository extends CrudRepository<Post, Long>
{
    public Set<Post> findByIsActiveTrue();
    public Post findByUrl(String url);
}

@Repository

public interface PostRepository extends CrudRepository<Post, Long>

{

public Set<Post> findByIsActiveTrue();

public Post findByUrl(String url);

}

Again, if we look at the method name, we will see that we are finding a post by it’s url. Because the url is a String, we have to pass the value as a method argument. Because return type is a Post, it will return one post. In case if the query returns multiple rows, then the exception will be thrown. Be careful that when you expect only 1 record, that you query by some unique field.

But actually, our blog system has to return a post by url and be active. We could have a code where we load the post by url and then check if isActive or not. Instead, we can do this in one query.

@Repository
public interface PostRepository extends CrudRepository<Post, Long>
{
    public Set<Post> findByIsActiveTrue();
    public Post findByUrlAndIsActiveTrue(String url);
}

@Repository

public interface PostRepository extends CrudRepository<Post, Long>

{

public Set<Post> findByIsActiveTrue();

public Post findByUrlAndIsActiveTrue(String url);

}

We are now querying the database by 2 field: url and isActive. When we use different fields in a method name, all of them are joined by AND. We cannot use OR. For that, we have to use some other approach (we will explain it in another tutorial).

3. Find all posts by a user

Every user has a username. Our task is to find all posts by a user or more specific, find all posts by a username. Writing a method name is actually the same, we just need to include the relation name. Again, we update our PostRepository.

@Repository
public interface PostRepository extends CrudRepository<Post, Long>
{
    public Set<Post> findByIsActiveTrue();
    public Post findByUrlAndIsActiveTrue(String url);
    public Set<Post> findByUserUsername(String username);
}

@Repository

public interface PostRepository extends CrudRepository<Post, Long>

{

public Set<Post> findByIsActiveTrue();

public Post findByUrlAndIsActiveTrue(String url);

public Set<Post> findByUserUsername(String username);

}

Method name has to include the relation name. When we defined it in the entity, we have to use the same name in the method. If we change it in the entity, we also have to change the method name. It may look complicated and no really robust to changes, but there is no other way for Spring Data to know how to correctly build the query. Of course, as we mentioned it few times already, in the next part we will learn how we can actually use custom queries to help out Spring Data with building the native SQL query.

Once we define the relation in the method name, everything else is the same. We again filter by field name, we can or use multiple fields. But remember, for each relation field, we have to prepend the name with the relation name.

4. Count all active posts

For the last task, we have to count all active posts. CrudRepository already has a method called count(), but it will count all posts. We could use findByIsActiveTrue() method to find all posts and get a populated Set. All we have to do then is to call .size() and there, we have the count of all active posts.

Don’t do that. Sure, it works and it might even work in the production for a small number of posts, but in case of a larger dataset, it’s not a good practice. We have to fetch all the records, populate Set and then call .size() to just get one number. It’s a too big overhead.

Instead, we will use count which maps to SQL count. It’s much much faster and consumes much less resources. Before, we were finding records, so we prepended every method name with find. If we want to count, we have to do what? You are right, prepend every method name with count. Let’s for the last time update our PostRepository.

@Repository
public interface PostRepository extends CrudRepository<Post, Long>
{
    public Set<Post> findByIsActiveTrue();
    public Post findByUrlAndIsActiveTrue(String url);
    public Set<Post> findByUserUsername(String username);
    public Long countByIsActiveTrue();
}

@Repository

public interface PostRepository extends CrudRepository<Post, Long>

{

public Set<Post> findByIsActiveTrue();

public Post findByUrlAndIsActiveTrue(String url);

public Set<Post> findByUserUsername(String username);

public Long countByIsActiveTrue();

}

There are few differences compared to other method names. First is return type. It has to be a Long, so it will bind a row count to it (Integer can be too small). As we mentioned before, we start method with count and then define the field filters. It’s that simple.

Part 3 – What more will we learn?

In the next part, we will how we can make even more complex queries by using @Query annotation. @Query annotation enables us to write HQL, which is very similar to SQL but has a compile time checking. Another thing we will learn is how to extend Repository and use PersistanceManager to build super complex queries. We will create custom methods and insert them into repositories. It’s a really cool and advanced feature, so stay tuned.

Why I think Spring Data repositories are awesome – Part 1

May 26, 2014 - Last update: August 27, 2014

I have been using Play Framework from version 1.2. Currently I’m using 2.2, which comes with a simple Ebean ORM but after reading few comments I realized it won’t be good enough for more complex projects. There is nothing worse than to realize in the middle of the project that some bug is causing your app not to work.

I looked around and noticed Spring Data. At first I didn’t put much effort in it but after checking the docs I realized it’s awesome solution. Only problem is that I doesn’t work out of the box with the Play Framework. There are few example projects on github, for me the only working was https://github.com/jamesward/play-java-spring.

I strongly advise you to check the code, because it’s show how to combine Play Framework with Spring Data. At the same time it shows how Dependency Injection works, how you define repositories and how to use them in controllers.

How it works

The basic logic is that Spring Data offers basic repositories for basic operations like saving, finding, deleting etc. Let’s imagine we have an entity Post.

package models;

import java.util.Date;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name = "posts")
public class Post
{
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    public Long id;
    
    @Column(nullable = false)
    public String title;    

    @Column(nullable = false, name = "created_at")
    public Date created_at;

    @Column(name = "is_active")
    public boolean isActive;
}

package models;

import java.util.Date;

import javax.persistence.Column;

import javax.persistence.Entity;

import javax.persistence.GeneratedValue;

import javax.persistence.GenerationType;

import javax.persistence.Id;

import javax.persistence.Table;

@Entity

@Table(name = "posts")

public class Post

{

@Id

@GeneratedValue(strategy = GenerationType.AUTO)

public Long id;

@Column(nullable = false)

public String title;

@Column(nullable = false, name = "created_at")

public Date created_at;

@Column(name = "is_active")

public boolean isActive;

}

I’m not going into details how to create an entity. There are many great tutorials around. For the entity we create a repository

package repositories;

import org.springframework.data.repository.CrudRepository;
import org.springframework.stereotype.Repository;
import models.Post;

@Repository
public interface PostRepository extends CrudRepository<Post, Long>
{

}

package repositories;

import org.springframework.data.repository.CrudRepository;

import org.springframework.stereotype.Repository;

import models.Post;

@Repository

public interface PostRepository extends CrudRepository<Post, Long>

{

}

1. We extended CrudRepository. CRUD stands for Create, Read, Update and Delete. It’s means that CrudRepository probably has some methods for creating, reading, updating and deleting posts. If we check the source of CrudRepository, it enables us to that.

public interface CrudRepository<T, ID extends Serializable> extends Repository<T, ID>;
{
    S save(S entity);
    T findOne(ID primaryKey);
    Iterable findAll();
    Long count();
    void delete(T entity);
    boolean exists(ID primaryKey);
}

public interface CrudRepository<T, ID extends Serializable> extends Repository<T, ID>;

{

S save(S entity);

T findOne(ID primaryKey);

Iterable findAll();

Long count();

void delete(T entity);

boolean exists(ID primaryKey);

}

2. We added annotation @Repository so Spring knows to find it correctly and inject it.

3. We extended CrudRepository where Post is out entity and Long is type of the primary key. Based on there 2 attributes, Spring knows how to correctly build queries.

Let’s take it for a spin

...
import repositories.PostRepository;

@org.springframework.stereotype.Controller("users")
public class Users
{
    @AutoWired
    private PostRepository postRepository;   
    
    public Result index()
    {
         // retrieve all posts
         Iterable<Post> posts = this.postRepository.findAll();

         // retrieve an existing post by primary key (in our case Long:id)
         Post post = this.postRepository.findOne(1L);

         // let's update it and save back to batabase
         post.title = "hey, it works";
         this.postRepository.save(post);

         // and for last step delete it
         this.postRepository.delete(post);
    }
}

...

import repositories.PostRepository;

@org.springframework.stereotype.Controller("users")

public class Users

{

@AutoWired

private PostRepository postRepository;

public Result index()

{

// retrieve all posts

Iterable<Post> posts = this.postRepository.findAll();

// retrieve an existing post by primary key (in our case Long:id)

Post post = this.postRepository.findOne(1L);

// let's update it and save back to batabase

post.title = "hey, it works";

this.postRepository.save(post);

// and for last step delete it

this.postRepository.delete(post);

}

We put the example in 1 method for the sake of simplicity. Point is that we can simply retrieve, insert, update and delete Post entity. Spring handles building the queries, converting to objects and all other things. Again, check the example on github I mentined before so you know how to correctly define Controller, what @AutoWired does and why method index() is not static any more.

I have been using Spring Data for some time and there was not a single situation I couldn’t solve. I think the guys at Spring did a really good job. The simplicity on one side and ability to solve even most complex scenarios makes it really awesome.

More to come

In the next part we are going to check how to make a more complex queries by just defining methods name. We will also check how to include paging, sorting and how to run really really really complex queries.

Examples and Tutorials

Code

Testing

Conclusion

Requirements

Installing KairosDB

Setting Cassandra as a datastore

Test if everything works

What’s next

The code

Immutable collections

Improved code

Stage

Scripts start/stop

Extra tip

An example

Handling multiple files

Extra tip

Entities

Tasks

Further reading

Part 3 – What more will we learn?

How it works

Let’s take it for a spin

More to come