What is consistency maintenance and why is it not scalable?
We consider a replication-based dynamic system in which every participant can potentially update the shared, replicated objects. Here, consistency maintenance refers to enforcing consistency through communication among all the participants. The maintenance cost grows with the number of participants and, in a truly large system such as an online game, it can be formidable. A number of systems are available to support consistency maintenance. Note that a traditional client/server architecture cannot be applied here, because there are essentially no fixed clients: all participants can issue updates and hence can be considered servers in some sense.
Consistency retrieval -- one way out
A straightforward way to reduce the maintenance cost is to reduce the number of participants that a consistency maintenance module needs to include. We believe that this is both doable and preferable.
First, this is doable because not all participants in a collaboration application are equally active or engaged. In a digital white board scenario where students listen to a lecture, for example, the lecturers are more likely to issue updates, while the majority of the students are observers: they monitor the white board and rarely issue updates. From a consistency maintenance point of view, the lecturers are more important than the passive students, so most of the time there is no real need to include the passive students in consistency maintenance. The rationale is that, if a participant issues no updates, it is far more cost-effective to satisfy his or her needs on demand. Second, this is preferable because it does not change the way most current consistency control protocols work, and hence is easier to adopt.
We refer to this on-demand-based mechanism as consistency retrieval.
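The cost argument above can be illustrated with a rough message count. The following is a minimal sketch under assumptions that are not taken from the CVRetrieval analysis: every update is pushed to every other member of the maintenance group, and each on-demand retrieval costs one request plus one reply. The function names and the lecture numbers are hypothetical.

```python
def full_maintenance_messages(participants: int, updates: int) -> int:
    """Messages sent if every update is pushed to every other participant."""
    return updates * (participants - 1)


def retrieval_messages(writers: int, updates: int, retrievals: int) -> int:
    """Messages sent if updates are pushed only among writers and
    observers pull the view on demand (one request + one reply each)."""
    return updates * (writers - 1) + 2 * retrievals


# Hypothetical lecture: 2 lecturers, 98 students, 500 updates in total,
# and each student refreshes the white board 10 times.
full = full_maintenance_messages(participants=100, updates=500)
on_demand = retrieval_messages(writers=2, updates=500, retrievals=98 * 10)
print(full, on_demand)
```

Under these assumptions the pushed traffic depends only on the number of writers, while the cost of serving observers is proportional to how often they actually ask for the view.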
CVRetrieval -- our design to support the functions of consistency retrieval
As illustrated in the figure below, CVRetrieval differentiates three types of participants: active writers, passive writers, and observers. Active and passive writers are currently updating the shared file/object, albeit at different rates, whereas observers only view the up-to-date state of the shared file/object. Note that the role of a participant can change among the three types dynamically.
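One way to picture the dynamic role change is to derive a participant's role from its recent update activity. The sketch below is only an illustration: the sliding window, the `active_threshold` value, and the `Participant` class are assumptions made for this example, not the classification rule used by CVRetrieval.

```python
from collections import deque
from enum import Enum


class Role(Enum):
    ACTIVE_WRITER = "active writer"
    PASSIVE_WRITER = "passive writer"
    OBSERVER = "observer"


class Participant:
    """Tracks update timestamps and derives a role from recent activity."""

    def __init__(self, name: str, window: float = 60.0, active_threshold: int = 5):
        self.name = name
        self.window = window                      # seconds of history considered (assumed)
        self.active_threshold = active_threshold  # updates per window (assumed)
        self._update_times = deque()

    def record_update(self, now: float) -> None:
        self._update_times.append(now)

    def role(self, now: float) -> Role:
        # Forget updates that fell out of the sliding window.
        while self._update_times and now - self._update_times[0] > self.window:
            self._update_times.popleft()
        recent = len(self._update_times)
        if recent == 0:
            return Role.OBSERVER
        if recent >= self.active_threshold:
            return Role.ACTIVE_WRITER
        return Role.PASSIVE_WRITER


student = Participant("student-17")
print(student.role(now=0.0))      # no updates yet
student.record_update(now=1.0)
print(student.role(now=2.0))      # one recent update
```

A participant that stops updating drifts back to the observer role once its last update leaves the window, which mirrors the dynamic role change described above.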
To support consistency maintenance among active and passive writers, CVRetrieval works with our previously developed Inconsistency Detection Framework, particularly the IDEA protocol. As shown in the figure below, CVRetrieval sits between the application layer and a general distributed operating system. When the application needs to guarantee consistency, it interacts with CVRetrieval. CVRetrieval depends on a consistency maintenance module (in this case, IDEA) to maintain consistency among writers and to guarantee the consistency level of the retrieved view. Finally, applications interact with the distributed operating system directly when no consistency issue is involved.
To support the retrieval functions for observers (i.e., the passive participants), CVRetrieval deploys publishers and subscribers in the system to serve as rendezvous points, similar to publish-subscribe schemes. CVRetrieval chooses publishers and subscribers based on the application's semantics, so that they capture the participants' common interest in the consistent view of a particular application.
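The rendezvous idea can be sketched as a small in-memory store that writers publish consistent views to and observers pull from on demand. This is a simplified illustration, not CVRetrieval's protocol: the `RendezvousPoint` and `View` names, the topic strings, and the use of a version number to order views are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class View:
    version: int   # assumed monotonically increasing per shared object
    content: str


class RendezvousPoint:
    """Holds the latest consistent view per topic (e.g. one shared object)."""

    def __init__(self) -> None:
        self._latest: Dict[str, View] = {}

    def publish(self, topic: str, view: View) -> None:
        # Keep only the newest view; stale publications are ignored.
        current = self._latest.get(topic)
        if current is None or view.version > current.version:
            self._latest[topic] = view

    def retrieve(self, topic: str) -> Optional[View]:
        # Observers call this on demand instead of joining the writers' group.
        return self._latest.get(topic)


board = RendezvousPoint()
board.publish("lecture-1/whiteboard", View(version=1, content="title slide"))
board.publish("lecture-1/whiteboard", View(version=2, content="title slide + diagram"))
print(board.retrieve("lecture-1/whiteboard"))
```

In this sketch an observer never receives pushed updates; it pays the cost of a lookup only when it actually asks for the view.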
Scalability and performance
The evaluation of CVRetrieval is done in two parts. First, we theoretically analyze the scalability of CVRetrieval and compare it to other consistency maintenance protocols. The analytical result shows that CVRetrieval can greatly reduce communication cost and hence make consistency control more scalable.
Second, a prototype of CVRetrieval was developed and deployed on the PlanetLab test-bed to evaluate its performance. The results show that the active participants experience a short response time, at some expense to the passive participants, who may encounter a longer response time depending on the system setting. Overall, the retrieval performance is still reasonably high.
We are currently investigating ways to improve its performance further.