Analyzing the Performance of Data Replication and Data Partitioning in the Cloud: the Beowulf Approach

Stiemer, Alexander; Fetai, Ilir; Schuldt, Heiko

Authors

Alexander Stiemer, Ilir Fetai, Heiko Schuldt

Type

In Proceedings

Date

2016/12

Appears in

Proceedings of the 4th International Workshop on Scalable Cloud Data Management (SCDM 2016) - co-located with IEEE Big Data 2016

Location

Washington, D.C., USA

Publisher

IEEE Computer Society

Pages

2837 – 2846

Abstract

Applications deployed in the Cloud usually come with dedicated performance and availability requirements. These requirements can be met by replicating data across several sites and/or by partitioning data. Data replication allows read requests to be parallelized and thus decreases data access latency, but it induces significant overhead for the synchronization of updates. Partitioning, in contrast, is highly beneficial if all the data accessed by an application is located at the same site, but it also necessitates coordination if distributed transactions are needed to serve applications. In this paper, we analyze three protocols for distributed data management in the Cloud, namely Read-One Write-All-Available (ROWAA), Majority Quorum (MQ), and Data Partitioning (DP), all in a configuration that guarantees strong consistency. We introduce Beowulf, a meta protocol based on a comprehensive cost model that integrates the three protocols and dynamically selects the protocol with the lowest latency for a given workload. In the evaluation, we compare the predictions of the Beowulf cost model with a baseline evaluation. The results show the effectiveness of the analytical model and its precision in selecting the best-suited protocol for a given workload.
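
The selection step described in the abstract, choosing whichever of ROWAA, MQ, and DP has the lowest predicted latency for a workload, can be illustrated with a minimal Python sketch. The latency formulas below are toy assumptions made for illustration only and are not the cost model from the paper; the names `Workload`, `Cluster`, `remote_ms`, and `select_protocol` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    read_fraction: float         # share of operations that are reads, in [0, 1]
    distributed_fraction: float  # share of transactions spanning partitions under DP


@dataclass(frozen=True)
class Cluster:
    sites: int        # number of sites holding replicas or partitions
    local_ms: float   # latency of an operation served at the local site
    remote_ms: float  # extra latency per remote site contacted (toy assumption)


def rowaa_latency(w: Workload, c: Cluster) -> float:
    # Toy assumption: reads are served locally, writes contact every other site.
    read = c.local_ms
    write = c.local_ms + (c.sites - 1) * c.remote_ms
    return w.read_fraction * read + (1 - w.read_fraction) * write


def mq_latency(w: Workload, c: Cluster) -> float:
    # Toy assumption: reads and writes both need a majority of the sites;
    # a majority is sites // 2 + 1, one member of which is the local site.
    remote_sites = c.sites // 2
    return c.local_ms + remote_sites * c.remote_ms


def dp_latency(w: Workload, c: Cluster) -> float:
    # Toy assumption: single-partition transactions are local; distributed
    # ones pay coordination with every other site.
    return c.local_ms + w.distributed_fraction * (c.sites - 1) * c.remote_ms


def select_protocol(w: Workload, c: Cluster) -> tuple[str, dict[str, float]]:
    """Return the protocol with the lowest predicted latency and all predictions."""
    predictions = {
        "ROWAA": rowaa_latency(w, c),
        "MQ": mq_latency(w, c),
        "DP": dp_latency(w, c),
    }
    best = min(predictions, key=predictions.get)
    return best, predictions


if __name__ == "__main__":
    cluster = Cluster(sites=5, local_ms=1.0, remote_ms=10.0)
    for name, workload in [
        ("read-heavy", Workload(read_fraction=0.95, distributed_fraction=1.0)),
        ("write-heavy, well partitioned", Workload(0.1, 0.05)),
        ("write-heavy, poorly partitioned", Workload(0.1, 1.0)),
    ]:
        best, predictions = select_protocol(workload, cluster)
        print(name, "->", best, predictions)
```

Under these toy formulas the choice shifts with the read share and with how well the data partitions, which is the kind of workload dependence that a meta protocol is meant to exploit.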

Download

https://doi.org/10.1109/BigData.2016.7840932

Research Projects

ClouDMan: Cost-based Data Management in Cloud Environments