
Project #07

7.pdf

Title: Performance comparison of different languages over Hadoop and parallel DBMSs
Leader's Name: Julius Blad
Member2 Name: Andreas Pålsson

Related paper: http://openproceedings.org/2013/conf/edbt/ChenH13.pdf
Presentation Day: May 25
Model: BS


Fei Chen, Meichun Hsu. A performance comparison of parallel DBMSs and MapReduce on large-scale text analytics. EDBT 2013, pp. 613-624.
http://openproceedings.org/html/pages/2013_edbt.html

Abstract

Information extraction has recently received significant attention due to the rapid growth of unstructured text data. Because information extraction is computationally intensive, both MapReduce and parallel database management systems (DBMSs) have been used to analyze large amounts of data. The paper "A performance comparison of parallel DBMSs and MapReduce on large-scale text analytics" compares the performance of a Hadoop implementation of MapReduce with that of one of the more popular parallel DBMSs. However, the authors measured performance using only one specific high-level language over Hadoop.

The aim of this project is to compare the performance of the Hadoop/Pig implementation of MapReduce with that of Hadoop/Hive and of a parallel DBMS. We will use some of the benchmark methods described in the paper for this comparison.
