February 2012 – Data Science, Data Analytics and Machine Learning Consulting in Koblenz Germany

February 24, 2012 / Rene

Wishlist of features for a distributed graph data base technology

I am just dreaming; this does not exist yet and needs to be refined at a later stage. Fast traversals: jumping from one vertex of the graph to another should be possible in O(1). Online processing: “standard queries” (<– whatever this means) should compute within milliseconds. As an example: local recommendations, e.g. similar users in a bipartite …
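
To make the O(1) traversal wish a bit more concrete, here is a minimal single-machine sketch. It assumes an in-memory adjacency list backed by a hash map; the class name `AdjacencyGraph` and the toy user/song data are hypothetical and not taken from any existing system. It only illustrates the lookup cost on one machine, not how to keep that cost in a distributed setting.

```python
from collections import defaultdict


class AdjacencyGraph:
    """Hypothetical in-memory graph; all names are illustrative only."""

    def __init__(self):
        # Maps each vertex to the list of vertices it points to.
        self._out = defaultdict(list)

    def add_edge(self, src, dst):
        self._out[src].append(dst)

    def neighbors(self, vertex):
        # One hash lookup (average-case O(1)) reaches the neighbor list.
        # .get() avoids creating empty entries for unknown vertices.
        return self._out.get(vertex, [])


# Toy bipartite data (users -> songs), purely an assumption for illustration.
graph = AdjacencyGraph()
graph.add_edge("alice", "song_1")
graph.add_edge("bob", "song_1")
graph.add_edge("alice", "song_2")
```

Reaching the neighbor list is a single hash lookup; iterating over the neighbors is of course still linear in the vertex degree.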

February 23, 2012 / Rene

From Graph (batch) processing towards a distributed graph data base

Yesterday's meeting of the reading club was quite nice. We all agreed that the papers were of good quality and we gained some nice insights. The only drawback of the papers was that they did not directly tell us how to achieve our goal of a real-time distributed graph data base technology. In the …

February 19, 2012 / Rene

Question of the Day: How the hell do we reach more people?

Recently I received an email from a musician who wishes to stay unnamed, telling me that many people out there love his music but it just hasn’t spread too far. His basic question is: how can his band reach more people on the web, especially with regard to an upcoming video? His promoter suggested something …

February 19, 2012 / Rene

Google Pregel vs Signal Collect for distributed Graph Processing – pros and cons

One of the reading club assignments was to read the papers about Google Pregel and Signal Collect, compare them and point out the pros and cons of both approaches. So after reading both papers as well as Claudio's overview of Pregel clones and taking some notes, here are my thoughts. But first a short summary …

February 16, 2012 / Rene

President Obama on Google+ talking to people

Not really news since it happened about 20 days ago, but here is a nice YouTube summary of President Obama's public Hangout with the American folk. Kind of amazing that he actually did this. I am really looking forward to the time when these kinds of events are not amazing anymore but rather standard …

February 15, 2012 / Rene

Some thoughts on Google MapReduce and Google Pregel after our discussions in the Reading Club

The first meeting of our reading club was quite a success. Everyone was well prepared and we discussed some issues about Google’s Map Reduce framework. I had the feeling that everyone now better understands what is going on there. I will now post a summary of what has been discussed and will also post some …

February 8, 2012 / Rene

Reading club on Graph databases and distributed systems

Update: find a summary of the last meeting and the current reading list for next week’s meeting here. Teaching classes is over for this term, so for the next couple of weeks I want to spend a lot of time working on some research topics that are on my mind. My goal is to finally write …

February 5, 2012 / Rene

Birds of a feather: Graph processing future trends in Graph Devroom

Since one of the talks got canceled, the organisers of the Graph Devroom at FOSDEM used the opportunity to hold a public discussion with all the developers about some future trends in graph processing. I really liked the idea but unfortunately the discussion wasn’t really kicking off well. I guess for a discussion …

February 5, 2012 / Rene

Claudio Martella talks @ FOSDEM about Apache Giraph: Distributed Graph Processing in the Cloud

Claudio Martella introduces Apache Giraph, which according to him is a loose implementation of Google Pregel, which was introduced at SIGMOD in 2010. He points out that Map Reduce cannot be used to do graph processing. He then gave an example of how MapReduce can be used to do PageRank calculation. He points out that PageRank can be calculated …
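
As a rough illustration of the idea of computing PageRank in MapReduce style, here is a minimal single-process sketch. It is not the code from the talk and not Giraph's or Hadoop's API: the function names `map_phase`, `reduce_phase` and `pagerank`, the damping factor of 0.85 and the toy graph are all assumptions. It also assumes every vertex has at least one out-link and that every link target is itself a vertex in the graph.

```python
from collections import defaultdict

DAMPING = 0.85  # common choice; an assumption, not taken from the talk


def map_phase(graph, ranks):
    """Emit (target, contribution) pairs: each vertex splits its
    current rank evenly over its out-links."""
    for vertex, targets in graph.items():
        for target in targets:
            yield target, ranks[vertex] / len(targets)


def reduce_phase(graph, contributions):
    """Sum the contributions per vertex and apply the damping factor."""
    sums = defaultdict(float)
    for target, value in contributions:
        sums[target] += value
    n = len(graph)
    return {v: (1 - DAMPING) / n + DAMPING * sums[v] for v in graph}


def pagerank(graph, iterations=20):
    """Run one map/reduce round per iteration, starting from uniform ranks."""
    ranks = {v: 1 / len(graph) for v in graph}
    for _ in range(iterations):
        ranks = reduce_phase(graph, map_phase(graph, ranks))
    return ranks


# Hypothetical toy graph: vertex -> list of vertices it links to.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_graph)
```

Each loop iteration corresponds to one full map/reduce round. In a real MapReduce job, every such round would re-read and re-write the whole graph, which is one commonly cited reason for preferring Pregel-style systems for iterative graph algorithms.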