Result: A Testbed for Highly-Scalable Mission Critical Information Systems

Title:
A Testbed for Highly-Scalable Mission Critical Information Systems
Contributors:
CORNELL UNIV ITHACA NY OFFICE OF SPONSORED PROGRAMS
Source:
DTIC AND NTIS
Publication Year:
2005
Collection:
Defense Technical Information Center: DTIC Technical Reports database
Document Type:
Journal text
File Description:
text/html
Language:
English
Rights:
Approved for public release; distribution is unlimited.
Accession Number:
edsbas.505182B2
Database:
BASE

Further Information

This effort is building a new system for scalable distributed computing. The basic problem is common in Global Information Grid (GIG) and Net-Centric Enterprise Services (NCES) systems, where an acute need has arisen for simple tools to assist the developer of a distributed service that will be shared by huge numbers of client systems in a networked environment. Headed by Professor Ken Birman, the project is exploring a novel fusion of classical protocols for reliable multicast communication with a new style of peer-to-peer protocol called scalable "gossip". The basic idea is to implement a communication platform using these new protocols, and then integrate the platform with standard Web Services tools and technologies to achieve a uniquely easy-to-use, scalable, and robust solution. The DURIP cluster has rapidly become a mainstay of the author's QuickSilver research, which includes scalable services architecture, time-critical services, and scalable reliable event delivery. Usage of the cluster is expected to grow over the next several years to include more members of the systems groups at Cornell University. QuickSilver currently has three sub-efforts that rely heavily on the cluster. The first project focuses on what is called a "scalable services architecture." This work explores a novel approach to building high-performance, scalable, self-managed distributed services that can be dragged and dropped onto the cluster. A second project adopts a similar approach but with a focus on time-critical services. Using a new form of forward error correction, this activity seeks to support a new kind of time-critical or real-time replication technology that includes support for deadline-driven communication, periodic communication, and guaranteed low-latency responsiveness even in the face of load bursts or failures. A third project focuses on scalable reliable event delivery, messaging, and notification.