Davy really nicely introduced the problem of looking at a snapshot of a data base. This problem obviously exists for any data base technology. You have a lot of timestamped records, but running a query as if you had fired it a couple of months ago is always a difficult challenge. With FluxGraph a solution to …
Tag: graph processing
PhD proposal on distributed graph data bases
Over the last week we had our off-campus meeting with a lot of communication training (very good and fruitful) as well as a special treatment for some PhD students called “massage your diss”. I was one of the lucky students who were able to discuss our research ideas with a postdoc and other …
Wishlist of features for a distributed graph data base technology
I am just dreaming; this does not exist and needs to be refined at a later stage. Fast traversals: Jumping from one vertex of the graph to another should be possible in O(1). Online processing: “Standard queries” (<– whatever this means) should compute within milliseconds. As an example: local recommendations, e.g. similar users in a bipartite …
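To make the "fast traversals" wish a bit more concrete, here is a minimal single-machine sketch in Python. It assumes the graph fits in memory and is stored as a hash map from each vertex to its set of neighbours, so reaching the neighbours of a vertex is one average-case O(1) lookup. The class `AdjacencyGraph`, the function `similar_users` and the sample data are all hypothetical names made up for illustration; a distributed store would additionally need partitioning and network hops, which this sketch ignores.

```python
from collections import Counter, defaultdict


class AdjacencyGraph:
    """Toy in-memory undirected graph (hypothetical, for illustration only)."""

    def __init__(self):
        # vertex id -> set of neighbouring vertex ids
        self._adj = defaultdict(set)

    def add_edge(self, a, b):
        # Store the edge on both endpoints so it can be walked in either direction.
        self._adj[a].add(b)
        self._adj[b].add(a)

    def neighbours(self, v):
        # One hash lookup (average O(1)); .get avoids creating entries on reads.
        return self._adj.get(v, set())


def similar_users(graph, user):
    """Rank other users by shared items via a two-hop walk: user -> item -> user."""
    shared = Counter()
    for item in graph.neighbours(user):
        for other in graph.neighbours(item):
            if other != user:
                shared[other] += 1
    return shared.most_common()


# Assumed sample data: a small bipartite user/item graph.
g = AdjacencyGraph()
for u, item in [("alice", "i1"), ("alice", "i2"),
                ("bob", "i1"), ("bob", "i2"),
                ("carol", "i2")]:
    g.add_edge(u, item)

print(similar_users(g, "alice"))  # [('bob', 2), ('carol', 1)]
```

The two-hop walk only touches the neighbourhood of the start vertex, which is why such "local" queries are plausible candidates for millisecond response times.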
From Graph (batch) processing towards a distributed graph data base
Yesterday's meeting of the reading club was quite nice. We all agreed that the papers were of good quality and we gained some nice insights. The only drawback of the papers was that they did not directly tell us how to achieve our goal of a real-time distributed graph data base technology. In the …
Google Pregel vs Signal Collect for distributed Graph Processing – pros and cons
One of the reading club assignments was to read the papers about Google Pregel and Signal Collect, compare them, and point out pros and cons of both approaches. So after I read both papers as well as Claudio's overview on Pregel clones and took some notes, here are my thoughts, but first a short summary …
Nils Grunwald from Linkfluence talks at FOSDEM about Cascalog for graph processing
Nils Grunwald works at the French startup Linkfluence. Their product is more or less social network analysis and graph processing. They crawl the web and blogs or get other social network data and provide solutions with statistics and insights for their customers. In this scenario obviously big data is involved and the data carries a …
Claudio Martella talks @ FOSDEM about Apache Giraph: Distributed Graph Processing in the Cloud
Claudio Martella introduces Apache Giraph, which according to him is a loose implementation of Google Pregel, which was introduced at SIGMOD in 2010. He points out that MapReduce cannot be used to do graph processing. He then gives an example of how MapReduce can be used to do PageRank calculation. He points out that PageRank can be calculated …
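As a rough illustration of the vertex-centric, superstep-based model that Pregel-style systems follow, here is a small PageRank sketch in plain Python. It is not Giraph's API and not code from the talk: the function name `pagerank_supersteps`, the damping factor of 0.85 and the sample graph are assumptions chosen for the example. It also assumes every vertex appears as a key and has at least one outgoing edge, so dangling vertices are not handled.

```python
def pagerank_supersteps(out_edges, damping=0.85, supersteps=30):
    """Emulate Pregel-style PageRank on one machine (hypothetical sketch).

    out_edges maps each vertex to the list of vertices it links to.
    """
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}

    for _ in range(supersteps):
        # Message phase: every vertex sends an equal share of its rank
        # along each of its outgoing edges.
        inbox = {v: 0.0 for v in out_edges}
        for v, targets in out_edges.items():
            if targets:
                share = rank[v] / len(targets)
                for t in targets:
                    inbox[t] += share

        # Compute phase: every vertex combines the messages it received.
        rank = {v: (1.0 - damping) / n + damping * inbox[v] for v in out_edges}

    return rank


# Assumed sample graph: a -> b, a -> c, b -> c, c -> a
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank_supersteps(graph, supersteps=100)
for vertex, value in sorted(ranks.items()):
    print(vertex, round(value, 4))
```

Each loop iteration corresponds to one superstep: messages sent in one step are only read in the next, which is what lets a real system run the per-vertex work in parallel between synchronization barriers.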