Efficient and Reliable Data Stream Management
PhD Thesis, Computer Science Department
University of Basel, Switzerland
The proliferation of sensor technology, especially in embedded systems, and the progress of ubiquitous computing strongly support new types of applications that make use of streams of continuously generated sensor data. Applications such as telemonitoring in healthcare or roadside traffic management systems urgently require reliable data stream management (DSM) in a failure-prone distributed setting that includes resource-limited mobile and embedded devices. To motivate and illustrate our considerations, we investigate in detail an application in the field of telemonitoring for e-health. Telemonitoring applications in healthcare demand the key issue of this thesis, namely reliability. Because of their importance for applicability, effectiveness and flexibility are also considered in this work. The main contribution of this thesis is threefold. First, in analogy to the SQL isolation levels, we define a model for reliable DSM based on levels of reliability and describe the consistency constraints necessary for distributed DSM. Second, based on this model, we present and analyze a novel algorithm for reliable distributed DSM, namely efficient coordinated operator checkpointing (ECOC). We show that ECOC provides lossless and delay-limited reliable data stream management and can thus be used in critical application domains such as healthcare, where the loss of data stream elements cannot be tolerated. The ECOC approach performs fine-grained backups at the operator level, which allows flexible and efficient use of the resources available in a network. Moreover, ECOC is optimized to reduce the overhead of checkpointing and to support complex stream process execution graphs, which include joins, splits and even cycles within data stream flows. Third, we present detailed performance evaluations of the ECOC algorithm running in a network of both stationary server nodes and mobile, resource-limited devices.
Finally, the applicability of our approach is demonstrated by an e-health telemonitoring demo prototype that was developed as part of this thesis using real-world sensors. All evaluations and the demo application are based on the distributed DSM infrastructure prototype OSIRIS-SE. Its Java implementation allows the same software to run on both mobile and stationary devices.