Possible papers for review
- MapReduce Online, 2010 [pdf]
- Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks, 2007 [pdf]
- CIEL: a universal execution engine for distributed data-flow computing, 2011 [pdf]
- Pregel: A System for Large-Scale Graph Processing, 2010 [pdf]
- The Google File System, 2003 [pdf]
- Megastore: Providing Scalable, Highly Available Storage for Interactive Services, 2011 [pdf]
- The Chubby lock service for loosely-coupled distributed systems, 2006 [pdf]
- ZooKeeper: Wait-free coordination for Internet-scale systems, 2010 [pdf]
- PNUTS: Yahoo!’s Hosted Data Serving Platform, 2008 [pdf]
- Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS, 2011 [pdf]
- Dynamo: Amazon’s Highly Available Key-value Store, 2007 [pdf]
- Dremel: Interactive Analysis of Web-Scale Datasets, 2010 [pdf]
- Transactional storage for geo-replicated systems, 2011 [pdf]
- Bigtable: A Distributed Storage System for Structured Data, 2008 [pdf]
- Apache Hadoop Goes Realtime at Facebook/HBase, 2011 [pdf]
- Hive – A Petabyte Scale Data Warehouse Using Hadoop, 2010 [pdf]
- SCADS: Scale Independent Storage for Social Computing Applications, 2009 [pdf]
- Pig Latin: A Not-So-Foreign Language for Data Processing, 2008 [pdf]
- DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language, 2009 [pdf]
- FlumeJava: Easy, Efficient Data-Parallel Pipelines, 2010 [pdf]
- Relational Cloud: A Database as a Service for the Cloud, 2011 [pdf]
- GraphLab: A New Framework For Parallel Machine Learning, 2010 [pdf]
- Piccolo: Building Fast, Distributed Programs with Partitioned Tables, 2010 [pdf]
- Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing, 2012 [pdf]
- Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center, 2011 [pdf]
- Dominant Resource Fairness: Fair Allocation of Multiple Resource Types, 2011 [pdf]
- Multi-Resource Fair Queueing for Packet Processing, 2012 [pdf]
- Quincy: Fair Scheduling for Distributed Computing Clusters, 2009 [pdf]
- Sharing the Data Center Network/Seawall, 2011 [pdf]
- Modeling and Synthesizing Task Placement Constraints in Google Compute Clusters, 2011 [pdf]
- BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data, 2013 [pdf]
- Shark: SQL and Rich Analytics at Scale, 2013 [pdf]
- Spanner: Google’s Globally-Distributed Database, 2012 [pdf]
- A Comparison of Approaches to Large-Scale Data Analysis, 2009 [pdf]
- GraphChi: Large-Scale Graph Computation on Just a PC, 2012 [pdf]
- PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs, 2012 [pdf]