Beowulf: SLA-based Integration of Data Replication and Data Partitioning (Master Thesis, Finished)

Author

Description

A typical cloud application architecture consists of the web / application server tier which runs the business logic and the data tier storing the data.

Scaling out the application server tier for load balancing purposes is not enough. More importantly, the data tier needs to be scaled out as well since it easily becomes the bottleneck. However, switching from a single copy database to a distributed database leads to new difficulties. The PACELC theorem says that there are fundamental trade-offs in a distributed system. To overcome those, many data management protocols were developed for different purposes. Some serve better under high read load while others are optimized for write operations. Some guarantee strong consistency while enhance the latency.

This thesis focuses on protocols guaranteeing strong consistency, namely Read‑One‑Write‑All‑Available (ROWAA), Majority Quorum and Data Partitioning provided by Cumulus.

We introduce Beowulf, a meta protocol which integrates data management protocols based on Service Level Agreements (SLAs). It finds the most suitable data management protocol for the current workload while fulfilling the SLA's requirements. In this thesis, the most suitable protocol is the one having the lowest latency between the client's request and the system's response. We propose a generic cost model which is the core of Beowulf. Finally, Beowulf is designed to adapt at runtime to dynamically changing workloads or requirements and it's cost model is implemented in the context of ClouDMan and evaluated using the TPC-C benchmark.

Beowulf shows good results for SLA's which do not require high availability and thus do not need replication. The provided model is a good step towards a fully workload- and SLA-aware distributed system.

Start / End Dates

2016/01/04 - 2016/07/03

Supervisors

Research Topics

Cloud Data Management