Changes between two versions

Showing changes to "Project #07" between 2015-04-09 23:33 by Michael Minock and 2015-04-10 16:59 by Michael Minock.

Project #07

Title: Performance comparison of different languages over Hadoop and parallel DBMSs
Leader's Name: Julius Blad
Member2 Name: Andreas Pålsson

Related paper: http://openproceedings.org/2013/conf/edbt/ChenH13.pdf
Presentation Day: May 25
Model: ES

Fei Chen, Meichun Hsu. A performance comparison of parallel DBMSs and MapReduce on large-scale text analytics. pp. 613-624.
http://openproceedings.org/html/pages/2013_edbt.html

Abstract

Information extraction has recently received significant attention due to the rapid growth of unstructured text data. However, it is computationally intensive, so MapReduce and parallel database management systems (DBMSs) have been used to analyze large amounts of data. In the paper "A performance comparison of parallel DBMSs and MapReduce on large-scale text analytics", the performance of a Hadoop implementation of MapReduce was compared with that of one of the more popular parallel DBMSs. However, the authors only measured performance using one specific high-level language over Hadoop.

The aim of this project is to compare the performance of the Hadoop/Pig implementation of MapReduce with that of Hadoop/Hive and of a parallel DBMS. We will use some of the benchmark methods described in the paper for this comparison.

Hermes Dynamic Partitioning for Distributed Social Network Graph Databases.pdf