Project #06
Title:
Partitioning in a distributed graph database suitable for social networks
Leader's
Name: Fredrik Hallberg
Member2 Name:
Micke Söderqvist Member3 Name:
Joakim Candefors
Related
paper: Nicoara, Daniel; Kamali, Shahin; Daudjee, Khuzaima; Chen, Lei; (2015): Hermes: Dynamic
Partitioning for Distributed Social Network Graph Databases; OpenProceedings.org.
http://dx.doi.org/10.5441/002/edbt.2015.04
Published in Proc. 18th International Conference on Extending
Database
Technology (EDBT), March 23-27,
2015, Brussels
Presentation
Day: May 20
Model:
ES
Abstract:
Graphs are a popular way of representing data when focus is on identifying relationships. This is e.g why the big social networks utilize graph databases to store much of their data. There are several database systems that have adapted to handle the huge amount of data contained in these databases by distributing it over multiple servers. The distribution brings with it the issue of partitioning, the question of which server a certain vertex should be placed on to optimize the querying.
This project will be implementing the partitioning algorithm described in the paper “Hermes: Dynamic Partitioning for Distributed Social Network Graph Databases” on a distributed graph database. This will be done in Apache Giraph rather than neo4j with the custom built Hermes-plugin, which is proposed in the paper. Giraph, which is based on the MapReduce database system Apache Hadoop, is used by some major social networks and is therefore relevant for our project. This is done to further our understanding of the underlying of distributed graph databases. As of our current knowledge the default partitioning in Giraph is to randomly partition the vertices (Hash-partition) instead of minimizing the number of edge-cuts between partitions which the algorithms in Hermes intends. The customized (re-)partitioning used in Hermes will be tested with an acquired data set to determine how it performs against the default partitioning in Giraph.
Hermes Dynamic Partitioning for Distributed Social Network Graph Databases.pdf