Finding the right person to work with you on your project or product is really hard. One way is to ask around, your friends and family, and see if they can recommend someone. Mostly this brings bad results. Another, in some cases much better, way is to use different websites to find your business partner.
I have created a short list of websites, each with a short description taken from its own site.
Founders Nation is a platform for entrepreneurs and founders. We built it to connect dreamers that wish to make the world a better place through innovation and technology.
FounderDating is a network of talented entrepreneurs helping one another start and grow companies. With FounderDating you find world-class people with complementary skill sets, areas of expertise and knowledge.
Find cofounders, collaborators and other makers to help build your project. CollabFinder Groups give your community members a place to team up and launch projects.
FoundersHookup’s goal is to connect you with GREAT people to seed high quality DNA into your venture from Day One. We screen applicants and only present the highest caliber Internet founder prospects.
Even though reddit is used for many things, it also offers a section for finding a co-founder. It’s not as great as the previous websites, but maybe someone will find a dream match there.
If you think I skipped an important website to find a co-founder, send me an email and I will update the list.
Yesterday I wanted to renew my .net domain. I logged into GoDaddy, found the domain, followed the checkout steps and, when I was about to pay, I noticed the price. It was around 13.25 EUR. It looked really expensive. So I googled GoDaddy coupons (I always do that). I found one for 35% off and decided to try it. GoDaddy responded that it had accepted the coupon and that the price was updated so I was getting the biggest saving. But the price did not change.
Paying 13.25 EUR (almost $18) is simply too much. I have noticed many times that GoDaddy (I have been using GoDaddy for more than 5 years) gives different prices for the same domain. I presumed they wanted to use the situation (the domain would expire the next day) and make the most of it (read: make the most money out of it). I said no.
I decided to use another registrar. I remembered that a long time ago somebody recommended namecheap.com. I decided to give it a go and transfer my domain to them. The price was around 7.5 EUR (around $8), so almost half the price. I made an account, filled in the transfer request, unlocked the domain at GoDaddy and followed the steps in the email. I completed everything in a few hours and saved some money.
Of course I won’t become a millionaire by saving almost 6 EUR, but paying that much for a domain is simply crazy. I still have a few domains registered at GoDaddy. If the competition keeps offering better deals, I will slowly but surely ditch GoDaddy completely.
I got a call the other day with a question: how can we store a huge amount of sensor data? They are measuring air temperature in different rooms every 5 seconds. That means 17,280 data points per room per day, 6,307,200 data points per room per year and, for 15 rooms, 94,608,000 data points per year.
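Just to make the scale concrete, here is a quick back-of-the-envelope check of those numbers (my addition, not from the original post):

Python
# one reading every 5 seconds
per_day = 24 * 60 * 60 / 5      # 17280 data points per room per day
per_year = per_day * 365        # 6307200 per room per year
all_rooms = per_year * 15       # 94608000 per year for all 15 rooms
print per_day, per_year, all_rooms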
Because I had never been in a situation where I needed to store a huge amount of sensor data, I didn’t know the answer. But I started digging. There are many questions online about which database to use to store this kind of data. Some recommend old-school databases like MySQL or Oracle. Some recommend Redis, Riak or MongoDB. But one recommendation beat them all: OpenTSDB.
OpenTSDB – The Scalable Time Series Database
Store and serve massive amounts of time series data without losing granularity.
Currently in version 2.0, OpenTSDB is a tested solution built on top of HBase. It was designed especially for time series data and can handle
– up to 100+ billion data points and
– 2000 new data points per second (tested by the OpenTSDB developers on an old dual-core Intel Xeon CPU from 2006; I tested on a newer machine and could easily insert 20000 points in a few seconds).
Long story short: it’s a perfect database for huge amounts of sensor data. It has great options for querying data (I will explain them below), has additional features to annotate data and it’s under active development.
Installation and running it for the first time
To run OpenTSDB, you first need to install HBase. The procedure is pretty straightforward: download HBase, unpack it, define the configuration and run it with
./bin/start-hbase.sh
If everything was configured correctly, you should get a message that HBase has started.
The next step is installing OpenTSDB. There is a great tutorial on how to install OpenTSDB. In short, download and unpack it or clone the git repository, and run the build.
git clone git://github.com/OpenTSDB/opentsdb.git
cd opentsdb
./build.sh
It should take a few minutes to compile everything. The next step is to create the tables.
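The exact command isn’t shown here, but OpenTSDB ships with a table-creation helper script, so it is presumably something along these lines (the HBASE_HOME path is a placeholder):
env COMPRESSION=NONE HBASE_HOME=path/to/hbase ./src/create_table.sh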
You can see the created tables with a few open-source HBase viewers like hrider. Currently the compression is set to none. It’s highly recommended to use LZO compression, because there is no real performance impact and it can greatly reduce the size of your data.
Because we will store temperatures in a metric called temperatures, we need to create it first. OpenTSDB has a configuration option to enable auto-creation of metrics, but it’s not recommended, so we will do it manually.
./build/tsdb mkmetric temperatures
The last step is to run everything.
tsdtmp=${TMPDIR-'/tmp'}/tsd    # For best performance, make sure
mkdir -p "$tsdtmp"             # your temporary directory uses tmpfs
./build/tsdb tsd --port=4242 --staticroot=build/staticroot --cachedir="$tsdtmp"
If everything went well, you should see the OpenTSDB page at localhost:4242. It’s that simple.
How data is stored
How OpenTSDB stores the data is, in my opinion, the biggest difference compared to other databases. It does support tables, but they are actually called metrics. In each metric we can store data points. Each data point is structured as
1356998400 23.5 room=bedroom floor=1
The timestamp (Unix time or ISO 8601 format) is the time of the data point. The value is a number (integer or float). Then we have tags. With tags we separate data points. In our example, we are storing a value for the bedroom on the first floor. This structure enables us to separate the data and later make advanced queries; for example, the average temperature on the first floor or the sum of all rooms.
Storing data
With version 2.0, OpenTSDB has two ways to store and access data (plus one additional way to store it by importing from a file): the Telnet API, the HTTP API and batch import. Make sure you have OpenTSDB running before you try the examples below.
Storing with Telnet API
With the Telnet API, we execute the put command with the metric name, timestamp, value and tags.
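The post doesn’t include a code sample here, so here is a minimal sketch of one way to send a single data point over the Telnet-style API from Python, assuming a TSD listening on localhost:4242:

Python
import socket

# open a plain TCP connection to the TSD and send one put command
sock = socket.create_connection(("localhost", 4242))
sock.sendall("put temperatures 1356998400 23.5 room=bedroom floor=1\n")
sock.close()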
Storing with HTTP API
When working with the HTTP API, we make a POST request to the endpoint localhost:4242/api/put with JSON data.
JavaScript
{
    "metric": "temperatures",
    "timestamp": 1356998400,
    "value": 23.5,
    "tags": {
        "room": "bedroom",
        "floor": "1"
    }
}
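As a rough illustration (my addition, not from the original post), this is how that request could be made using only the Python standard library, again assuming OpenTSDB is running on localhost:4242:

Python
import json
import urllib2

data_point = {
    "metric": "temperatures",
    "timestamp": 1356998400,
    "value": 23.5,
    "tags": {"room": "bedroom", "floor": "1"}
}
# POST the JSON body to the /api/put endpoint
request = urllib2.Request("http://localhost:4242/api/put",
                          json.dumps(data_point),
                          {"Content-Type": "application/json"})
urllib2.urlopen(request)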
There is also a possibility to make a batch insert. Just wrap all metrics in an array.
JavaScript
[
    {..metric1..},
    {..metric2..},
    {..metric3..}
]
Personally, I had a few problems inserting a large amount of data with the HTTP API. I ended up using the Telnet API and it seems to work really well.
Querying the data
The whole beauty of OpenTSDB is its ability not only to store huge amounts of data, but also to query it fast. I will show how to query data with the HTTP API, but the same query parameters can be used with the Telnet API.
For the examples, we will first insert some data. Of course we could insert a much larger dataset, but for this tutorial let’s keep it simple.
// imaginary data
put temperatures 1356998400 23.5 room=bedroom floor=1
put temperatures 1356998405 23.2 room=bedroom floor=1
put temperatures 1356998410 23.7 room=bedroom floor=1
put temperatures 1356998415 23.1 room=bedroom floor=1
put temperatures 1356998400 24.2 room=childroom floor=1
put temperatures 1356998405 23.9 room=childroom floor=1
put temperatures 1356998410 23.4 room=childroom floor=1
put temperatures 1356998415 23.6 room=childroom floor=1
put temperatures 1356998400 23.5 room=livingroom floor=0
put temperatures 1356998405 23.5 room=livingroom floor=0
put temperatures 1356998410 23.5 room=livingroom floor=0
put temperatures 1356998415 23.5 room=livingroom floor=0
put temperatures 1356998400 18.2 room=basement floor=-1
put temperatures 1356998405 18.1 room=basement floor=-1
put temperatures 1356998410 18.1 room=basement floor=-1
put temperatures 1356998415 18.0 room=basement floor=-1
Getting all temperatures
GET http://localhost:4242/api/query?start=1356998400&m=sum:temperatures{room=*,floor=*}
Let’s break down the request:
1. We can make GET or POST requests
2. The HTTP API URL is http://localhost:4242/api/query
3. We must define start; end is optional. It can be a Unix timestamp, or you can use the nx-ago format, where n is a number and x is a time unit; for example, 1day-ago or 1h-ago. OpenTSDB will automatically convert it to a timestamp based on the current time.
4. m is the metric parameter, where we are using aggregation = sum and metric = temperatures.
5. The last part is the grouping operator (inside {}), which is used to group the data. If we define a tag with *, each of its values is kept as a separate series instead of being aggregated together. We can also use it to filter; for example, room=bedroom will only fetch data from the bedroom.
GET http://localhost:4242/api/query?start=1356998400&m=avg:temperatures{floor=1}
produces
JavaScript
[
    {
        "metric": "temperatures",
        "tags": {
            "floor": "1"
        },
        "aggregateTags": [
            "room"
        ],
        "dps": {
            "1356998400": 23.850000381469727,
            "1356998405": 23.550000190734863,
            "1356998410": 23.550000190734863,
            "1356998415": 23.350000381469727
        }
    }
]
We can see the tag room in aggregateTags. It means this tag was used to aggregate (or, if you are familiar with other databases, GROUP BY) the data.
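For completeness, here is a rough sketch (my addition, not from the post) of how the same query could be run and read from Python, again assuming a local TSD on port 4242:

Python
import json
import urllib2

url = ("http://localhost:4242/api/query"
       "?start=1356998400&m=avg:temperatures{floor=1}")
# the response is a JSON array with one object per returned series
for series in json.load(urllib2.urlopen(url)):
    print series["tags"], series["aggregateTags"]
    for timestamp, value in sorted(series["dps"].items()):
        print timestamp, value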
Getting average temperatures per day
Let’s imagine a situation where we want to create reports of the temperatures on a daily basis. We could load all the data and then manually calculate the averages, but for larger datasets that could take some time. OpenTSDB has an answer: downsampling. Downsampling will automatically calculate the values based on our downsampling aggregation function and timeframe.
GET http://localhost:4242/api/query?start=1356998400&m=avg:1d-avg:temperatures{floor=1}
Notice the different m parameter? We added 1d-avg (be careful to separate everything correctly with ":"), which will downsample by 1 day and calculate the average. Compared to the manual way, it’s much faster and it gives us results we can use directly in graphs.
Other awesome features
OpenTSDB has a few additional features to cover real-life situations, and of course we can easily add more with plugins. Two of them worth mentioning are Annotations and CLI Tools.
Annotations
Annotations enable us to add additional metadata to data points. For example, we could store information about when we opened and closed the window in each room, or when we changed the heating level.
JavaScript
{
    "tsuid": "000001000001000001",
    "description": "Window Opened",
    "notes": "Jane has opened the window and Mike has closed it"
}
CLI Tools
CLI tools are simple tools to perform additional tasks like fixing the data storage (in case something breaks down), querying and deleting data, and creating metrics. One of the tools I use most is scan, because it can also delete data. It’s useful when we are doing different tests.
To delete all the temperatures for the basement, we execute the command
./build/tsdb scan --delete 1356998400 sum temperatures room=basement floor=-1
Again, we can filter what to delete with start and end parameters, metric and tags.
Wrap up
OpenTSDB has proved to be an excellent solution. It’s scalable, fast and has really neat features. Most importantly, it’s under active development and has many people contributing. With the era of IoT and Big Data upon us, it has a bright future ahead.
I have been using Play Framework since version 1.2. Currently I’m using 2.2, which comes with the simple Ebean ORM, but after reading a few comments I realized it won’t be good enough for more complex projects. There is nothing worse than realizing in the middle of a project that some bug is causing your app not to work.
I looked around and noticed Spring Data. At first I didn’t put much effort into it, but after checking the docs I realized it’s an awesome solution. The only problem is that it doesn’t work out of the box with the Play Framework. There are a few example projects on GitHub; the only one working for me was https://github.com/jamesward/play-java-spring.
I strongly advise you to check the code, because it shows how to combine the Play Framework with Spring Data. At the same time it shows how dependency injection works, how you define repositories and how to use them in controllers.
How it works
The basic logic is that Spring Data offers base repositories for the basic operations like saving, finding, deleting etc. Let’s imagine we have an entity Post.
Java
package models;

import java.util.Date;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name = "posts")
public class Post
{
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    public Long id;

    @Column(nullable = false)
    public String title;

    @Column(nullable = false, name = "created_at")
    public Date created_at;

    @Column(name = "is_active")
    public boolean isActive;
}
I’m not going into the details of how to create an entity; there are many great tutorials around. For the entity we create a repository.
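The repository code itself isn’t reproduced here, but based on the points below it is essentially just this (the package and interface names are my assumption):

Java
package models;

import org.springframework.data.repository.CrudRepository;
import org.springframework.stereotype.Repository;

// a CRUD repository for the Post entity, keyed by its Long id
@Repository
public interface PostRepository extends CrudRepository<Post, Long> {
}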
1. We extended CrudRepository. CRUD stands for Create, Read, Update and Delete. That means CrudRepository probably has methods for creating, reading, updating and deleting posts. If we check the source of CrudRepository, it indeed enables us to do that.
2. We added the @Repository annotation so Spring knows how to find it and inject it.
3. We extended CrudRepository<Post, Long>, where Post is our entity and Long is the type of its primary key. Based on these two type parameters, Spring knows how to correctly build the queries.
public void example()
{
    // retrieve an existing post by primary key (in our case Long id)
    Post post = this.postRepository.findOne(1L);

    // let's update it and save it back to the database
    post.title = "hey, it works";
    this.postRepository.save(post);

    // and as the last step, delete it
    this.postRepository.delete(post);
}
We put the example in one method for the sake of simplicity. The point is that we can easily retrieve, insert, update and delete the Post entity. Spring handles building the queries, converting to objects and everything else. Again, check the example on GitHub I mentioned before so you know how to correctly define the controller, what @Autowired does and why the method index() is not static any more.
I have been using Spring Data for some time and there has not been a single situation I couldn’t solve. I think the guys at Spring did a really good job. The simplicity on one side and the ability to solve even the most complex scenarios make it really awesome.
More to come
In the next part we are going to check how to make more complex queries just by defining method names. We will also check how to include paging and sorting, and how to run really, really complex queries.
I was developing a Java app to fetch some JSON data and process it. Because the app will be running on GAE, I had to use the GAE URL Fetch Java API. The google-http-java-client library supports this with UrlFetchTransport, so it was a clear signal to use it.
I wrote a simple script, tested it locally and uploaded it to GAE. I tried it for the first time and it worked. Tried a second time, it failed. Tried a few more times and it was failing at random intervals. Full stack trace:
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:430)
at java.lang.Long.valueOf(Long.java:540)
at com.google.api.client.util.Data.parsePrimitiveValue(Data.java:421)
at com.google.api.client.http.HttpHeaders.parseValue(HttpHeaders.java:1178)
at com.google.api.client.http.HttpHeaders.parseHeader(HttpHeaders.java:1159)
at com.google.api.client.http.HttpHeaders.fromHttpResponse(HttpHeaders.java:989)
at com.google.api.client.http.HttpResponse.<init>(HttpResponse.java:148)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:969)
.....
I had no idea what was going on, so I tried to investigate:
1. Checked if I’m using the latest versions -> Yes
2. Checked online if anyone else had the same problem -> No
3. Tried to use the same examples as in the docs -> Same error
4. Made a request to some other servers -> It worked
The last one gave me a signal that something was wrong with the server I was making requests to. So I made a test:
1. Make a request to some other server (which works)
2. Make a request to the primary server (which does not work)
I compared the response headers. At first I didn’t notice it, but the response from the primary server was not returning a Cache-Control header. Why is that a problem? I knew that the primary server was using Apache and mod_proxy; the main app that returns the JSON sits behind a proxy server. So I updated the settings and updated the app which returns the JSON to also send Cache-Control: private.
Private indicates that all or part of the response message is intended for a single user and MUST NOT be cached by a shared cache, such as a proxy server.
So the solution is actually pretty simple: just make sure Cache-Control: private is included in the response headers.
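The post doesn’t show where the header was added, but as one possible example, if the upstream app sits behind Apache with mod_headers enabled, a single directive is enough:

Header set Cache-Control "private"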
There are many ways to run scheduled jobs. More complicated solutions can be built with Celery, Advanced Python Scheduler, python-crontab or many other tools. But if you are looking for a quick and simple way to run scheduled jobs, then cron is the right tool (if you are using Windows, then services are a similar alternative).
Running a simple Python script using cron is easy. First we need to edit the crontab:
$ crontab -e
Every line defines one cron task, and every task has to have the correct syntax:
0 */2 * * * /usr/bin/python /home/erol/script.py
and our script.py
Python
# script.py
print "it works"
We will not go into details; if you want to read more about it, there are many articles. When we save, we are notified (if we defined everything correctly) that the task has been installed. In our case, it will run our script.py every 2 hours.
Mixing it with Django
Running cron with Django can be a little bit more complicated. The reason lies in the fact that Django needs its settings to be loaded to run correctly. This can sometimes be a pain in the ass because of the paths.
Kronos makes it really easy to schedule tasks with cron.
With Kronos we need to register tasks.
Python
# app/cron.py
import kronos
import random

@kronos.register('0 */2 * * *')
def talk():
    print "it works"
What will this do? It will register a task and run it every 2 hours. It will automatically include everything and define the correct paths, so we can also use our app code; for example, models.
Python
# app/cron.py
import kronos
import random
from models import User

@kronos.register('0 */2 * * *')
def talk():
    word = User.objects.next_word()  # custom query to load the next word each time
    print word
To test everything, we run
$ python manage.py runtask talk
it works
and to register it
$ python manage.py installtasks
Installed 1 task.
If we check the crontab, we will see the entry that Kronos has installed for our task. If we did everything correctly, it should work and run our talk method every 2 hours.
Extra advice
While running tasks on my Linux machine, I was always getting an email from the cron daemon. This can be annoying, so we can simply “disable” it. Actually, we can redirect the cron output to /dev/null. Just add the KRONOS_POSTFIX setting.
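The value isn’t shown in the post; a minimal sketch of what the setting could look like, assuming we just want to throw the output away, is:

Python
# settings.py
# appended to every cron line Kronos installs, silencing stdout and stderr
KRONOS_POSTFIX = '> /dev/null 2>&1'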
The other day I noticed a really cool solution for sending email from MailChimp called Mandrill. It offers a free package of up to 12000 emails per month and 250 emails per hour. I decided to include it in our project (Django) and give it a go.
Mandrill is a scalable and affordable email infrastructure service, with all the marketing-friendly analytics tools you’ve come to expect from MailChimp.
Because it’s built by the famous MailChimp, it has to be really good. I found a nice library for Python called Djrill which nicely wraps Django’s email sending capabilities.
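The post skips the setup step; based on Djrill’s documentation at the time, wiring it into a Django project is roughly a matter of two settings (the API key below is a placeholder):

Python
# settings.py
MANDRILL_API_KEY = "your-mandrill-api-key"
EMAIL_BACKEND = "djrill.mail.backends.djrill.DjrillBackend"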
The beauty of Djrill is that you don’t have to change your code, because it works with the existing code for sending emails. There are two ways to send an email.
First:
Python
from django.core.mail import send_mail

send_mail("It works!", "This is my email content",
          "Sender <sender@example.com>", ["to@example.com"])
Or the second, with EmailMultiAlternatives:
Python
from django.core.mail import EmailMultiAlternatives

msg = EmailMultiAlternatives(
    subject="It works!",
    body="This is my email content",
    from_email="Sender <sender@example.com>",
    to=["Recipient One <someone@example.com>", "another.person@example.com"]
)
msg.attach_alternative("<p>This is my email content</p>", "text/html")
# Send it:
msg.send()
There is also a cool feature to add tags and metadata to emails. Later in Mandrill you can filter emails by them. They can be useful for filtering different types of emails (registration, newsletter) or, for example, by whom you have sent an email to (client a, client b, ...).
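Djrill exposes these as attributes on the message object; reusing the msg object from the example above, a small sketch (the tag and metadata values are just illustrative) could look like this:

Python
# set Mandrill-specific options on the message before sending
msg.tags = ["registration"]
msg.metadata = {"client": "client-a"}
msg.send()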
The installation completed successfully, but I was also getting the locale warning message
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = (unset),
	LC_PAPER = "en_GB.UTF-8",
	LC_ADDRESS = "en_GB.UTF-8",
	LC_MONETARY = "en_GB.UTF-8",
	LC_NUMERIC = "en_GB.UTF-8",
	LC_TELEPHONE = "en_GB.UTF-8",
	LC_IDENTIFICATION = "en_GB.UTF-8",
	LC_MEASUREMENT = "en_GB.UTF-8",
	LC_TIME = "en_GB.UTF-8",
	LC_NAME = "en_GB.UTF-8",
	LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
and as I remembered from before, the fix for this is to add
LC_ALL="en_US.utf-8"
at the bottom of the file /etc/environment. All looked OK, but when I started the database with
sudo /etc/init.d/postgresql start
I didn’t get any message. Then when I logged into postgres with
su - postgres
and wanted to create a database, I got the warning
psql: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
I checked around and a lot of answers mentioned that I had wrong or incomplete settings. Because I didn’t know what to do, I asked around (the #postgresql channel on freenode) and got a really simple solution. I just had to run the command
pg_createcluster 9.1 main --start
where 9.1 is the version of my PostgreSQL database. This recreated the settings files and now everything works great.
Working on MEDinar, we are writing unit tests to test our app. One of the scenarios is when users upload their slides; we need to check that everything works. The problem is that I could not find a working example in Java of how to test a file upload. So after digging and reading a few examples of how to do it in Scala
While working on my Phonegap project I wanted to load initial data into SQLite. One option is to download it from the internet, but it’s always annoying to ask the user to turn on the internet and wait for all the files to download (especially if you have to download 50 images).
The other option is to load local data. The problem is that identifying the right path to a local file can be tricky. But the solution is actually easy.