Information Unbound #2 - by Erick Von Schweber

Computing Fabrics compared with distributed and parallel software technologies
October 13, 1998

Executive Summary: A number of current software technologies aim to hide, soften, or span the boundaries between systems and thereby facilitate the creation of distributed applications. Compared with Computing Fabrics these technologies often require the use of specific languages, interfaces, or libraries and frequently mandate that the programmer take an explicitly parallel approach. Additionally, these technologies do not generally exploit low-latency interconnects between systems, thereby limiting distributed performance. Altogether, such technologies represent a bottom-up strategy for handling system boundaries in current distributed systems whereas Computing Fabrics represent a top-down approach to eliminating the distinction itself between local and remote systems.

We've received a great many responses to our Computing Fabrics Special Report in the October 5th issue of PC Week. One of the most prevalent questions raised is how Computing Fabrics compare with existing software technologies for creating distributed systems. These extant technologies include Beowulf clusters, the Linda language and tuplespace systems based on David Gelernter's Linda (IBM's T Spaces and Sun's Javaspaces), as well as parallel programming environments such as MPI and PVM. For completeness we'll also add to this list the distributed object infrastructures of CORBA and DCOM.

Before we proceed with specific comparisons, a couple of general remarks are in order. First and of greatest importance, a goal of Computing Fabrics is to provide fluidly scalable and reconfigurable platforms for running common existing software. Whether one looks at the approach of SGI using distributed shared memory over modularly scalable interconnects or the approach of Microsoft Research's Millennium employing distributed shared objects, one thing is very clear: existing software is made to exploit potentially massive parallelism without having been explicitly programmed to that end. A corollary is that the programmer should not need any out-of-the-ordinary programming environments, interfaces, or libraries to exploit the power of a Computing Fabric. As we shall see, this is in stark contrast with several existing approaches.

Commodity Clusters - Beowulf

Beowulf is aimed at creating clusters of inexpensive PCs and workstations that, for certain categories of applications, realize the cumulative power of many smaller machines. Focused on Linux, Beowulf is a great project but, like Linux, is not (yet) extensively backed by commercial offerings, limiting its applicability in the enterprise. There's even a project now building a 1,000-processor Beowulf system using Alphas. Importantly, however, Beowulf systems do not utilize a low-latency link between systems, because Beowulf is intended to exploit technology that is commodity "today". This means that communication between systems must (slowly) negotiate network protocol stacks, seriously increasing latency, and must settle for Fast Ethernet at best (due to the cost concern again), restricting the ultimate scalability of the cluster.

These limitations mean Beowulf clusters are appropriate for workloads that are challenging due to the sheer number of fully independent program execution steps, such as the off-line rendering of 3D graphics, where thousands of frames can be rendered independently. Beowulf is inappropriate for applications that are throughput bound and involve dependencies between execution steps, which is the norm in most problem domains, not the exception. Computing Fabrics of both the distributed shared memory variety (SGI) and the distributed shared objects variety (Microsoft) are applicable to both kinds of problems and are therefore far more general than a Beowulf cluster approach.
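The difference between the two kinds of workload can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: render_frame and simulate are hypothetical stand-ins invented here, and a local thread pool stands in for the nodes of a cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(frame_id):
    # Hypothetical stand-in for rendering one frame. The result depends
    # only on the frame's own index, never on any other frame.
    return sum((frame_id * k) % 7 for k in range(1000))


def render_all(frame_ids, workers=4):
    # Independent steps: frames can be handed to any worker in any order,
    # and no worker ever has to talk to another one.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_frame, frame_ids))


def simulate(steps, state=1):
    # Dependent steps: each step consumes the state produced by the
    # previous step, so the steps cannot simply be farmed out.
    for _ in range(steps):
        state = (state * 3 + 1) % 1000
    return state
```

In the first case the workers never exchange data, so a slow link between them costs little. In the second, any attempt to spread the steps across machines would pay the interconnect latency on every step.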

Additionally, software tends to follow hardware, often by significantly long stretches. Today we’re seeing the beginnings of the hardware for Computing Fabrics. It will motivate the development of the software. Beowulf is a software solution to provide some measure of distributed processing on today’s commodity hardware, not a revolution in systems architecture. Other clustering solutions, used for breaking cryptographic codes or load balancing HTTP requests, are susceptible to this same analysis and compare to Computing Fabrics similarly.

Distributed Shared Object Spaces - Linda, T Spaces, and Javaspaces

The common theme of distributed shared object spaces is to execute a small daemon program on all participating machines to provide a consistent, unified logical view across all machines - an object space. Programmers can then utilize a very small set of instructions to read objects from, write objects to, and extract objects from this space, all without knowing the details of where objects are actually located. Programs "injected" into the space can replicate themselves, with copies across many machines working in parallel. With Javaspaces Sun adds limited transactional support to the operations in this space while IBM's T Spaces goes even further adding database capabilities such as persistence and recovery services.
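To give a feel for the programming model, here is a toy, single-process Python sketch of the three core operations (write, read, and take). It is not the API of Linda, T Spaces, or Javaspaces; the class name, method names, and the use of None as a wildcard are assumptions made for illustration, and threads stand in for participating machines.

```python
import threading


class TupleSpace:
    """Toy in-process space; a hypothetical stand-in, not a vendor API."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def write(self, tup):
        # Place an object into the space and wake any waiting readers.
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    def _find(self, template):
        # A template matches a tuple of equal length; None is a wildcard.
        for tup in self._tuples:
            if len(tup) == len(template) and all(
                t is None or t == v for t, v in zip(template, tup)
            ):
                return tup
        return None

    def read(self, template, timeout=None):
        # Return a matching object without removing it (None on timeout).
        with self._cond:
            if not self._cond.wait_for(
                lambda: self._find(template) is not None, timeout
            ):
                return None
            return self._find(template)

    def take(self, template, timeout=None):
        # Return a matching object and remove it (None on timeout).
        with self._cond:
            if not self._cond.wait_for(
                lambda: self._find(template) is not None, timeout
            ):
                return None
            tup = self._find(template)
            self._tuples.remove(tup)
            return tup


def square_tasks(numbers, workers=3):
    # Workers pull tasks from the space and write results back; none of
    # them knows or cares which worker handles which task.
    numbers = list(numbers)
    space = TupleSpace()
    for n in numbers:
        space.write(("task", n))

    def worker():
        while True:
            task = space.take(("task", None), timeout=0.1)
            if task is None:
                return
            space.write(("result", task[1], task[1] ** 2))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(space.take(("result", None, None))[1:] for _ in numbers)
```

Note that the application itself has to be written in terms of write, read, and take, which is precisely the explicit coding requirement that separates this style from running unmodified software.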

The limitations of current object space implementations compared with Computing Fabrics are twofold. First, applications must be explicitly coded to utilize the services and logical view of a particular object space implementation, and different object space implementations are not interoperable. By comparison, a Computing Fabric constructed with Microsoft's Millennium will leverage the implicit parallelism inherent in a mass market of COM-based programs. Sun's Javaspaces could develop into a real competitor to Millennium in time, however, providing for Java objects what Millennium does for COM+ objects.

Second, no object space implementation (to the author's knowledge, and with the exception of Microsoft's Millennium) will exploit a low-latency interconnect except within a homogeneous environment, such as a large MPP (Massively Parallel Processor). Thus, unlike a heterogeneous Computing Fabric, which will exploit low-latency interconnects, the object space appears to be a single space only from a functional view, an appearance that rapidly breaks down when performance enters the picture.

In future columns I will discuss Javaspaces and T Spaces in greater detail.

Portable Parallel Libraries - MPI and PVM

MPI (Message Passing Interface) and PVM (Parallel Virtual Machine) are used by programmers of massively parallel machines (and on networks of workstations where supported by a utility such as Platform Computing's LSF) to obtain portability across parallel architectures and implementations. Without using one of these, a parallel program written to execute on a Connection Machine cannot easily be ported to a cluster of workstations. MPI and PVM support explicit parallel programming using distributed "Non-Shared" memory computing, where each processor has its own memory with its own address space and message passing is used to coordinate function invocations, reads, writes, etc. amongst the ensemble. PVM was developed at Oak Ridge National Lab in ’89 to run across a network of UNIX boxes. MPI began at a workshop on message passing in ’92 and the first version was published a little over a year later, making its big debut at Supercomputing ’93 in November of that year.
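The explicit style this implies can be sketched in a few lines of Python. This is not MPI or PVM code: threads and standard-library queues stand in for processors and the message layer, and the function names (rank_main, parallel_sum_of_squares) are invented for illustration. The discipline it imitates is that each "rank" works only on data it was sent and communicates only by sending and receiving messages.

```python
import queue
import threading


def rank_main(rank, inbox, to_root):
    # A rank starts with no data of its own; it blocks until the root
    # sends it a chunk (an explicit receive).
    chunk = inbox.get()
    partial = sum(x * x for x in chunk)
    # The partial result goes back to the root as an explicit send.
    to_root.put((rank, partial))


def parallel_sum_of_squares(data, nranks=4):
    inboxes = [queue.Queue() for _ in range(nranks)]
    to_root = queue.Queue()
    ranks = [
        threading.Thread(target=rank_main, args=(r, inboxes[r], to_root))
        for r in range(nranks)
    ]
    for t in ranks:
        t.start()
    # Scatter: the programmer decides how the data is partitioned.
    for r in range(nranks):
        inboxes[r].put(data[r::nranks])
    # Gather: the programmer also decides how partial results combine.
    total = 0
    for _ in range(nranks):
        _, partial = to_root.get()
        total += partial
    for t in ranks:
        t.join()
    return total
```

Even for a computation this small, the partitioning, the sends, and the receives are all the programmer's responsibility. That is the burden referred to here as explicit parallel programming.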

Although phenomenal results can be achieved using either of these packages, they both require explicit parallel programming and provide no assistance to the user or customer looking to leverage parallelism in the execution of off-the-shelf or near off-the-shelf software. In many cases a database vendor would have to literally re-architect their product to use a portable parallel library, whereas that same vendor could simply modify their SMP version for a Computing Fabric.

Distributed Object Infrastructures - CORBA and DCOM

CORBA and DCOM provide the programmer who's partitioned their code into modules (using a large grain approach) a relatively easy way to distribute these modules across multiple processors as they see fit. Thus, to exploit the parallelism found in a cluster of machines, the programmer must determine the partitioning and sequencing of code modules and then use CORBA or DCOM to provide the "appearance" that all objects are local. While this is a boon to those implementing 3-, 4-, and 5-tier distributed applications, it does not help at all when the number of tiers is large or unknown in advance, a situation that Computing Fabrics will be well suited to handle.
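The "appearance" of locality is typically achieved with a proxy on the caller's side and a dispatcher on the object's side. The Python sketch below illustrates only that general pattern; it uses neither CORBA nor DCOM, and every name in it (InventoryService, Skeleton, Proxy) as well as the JSON wire format is a hypothetical choice. A direct function call stands in for the network.

```python
import json


class InventoryService:
    # The "remote" object. The programmer chose to place this module
    # on another machine.
    def __init__(self):
        self._stock = {"widget": 3}

    def count(self, item):
        return self._stock.get(item, 0)


class Skeleton:
    # Object side: unpack the request, invoke the method, pack the reply.
    def __init__(self, target):
        self._target = target

    def handle(self, request_bytes):
        request = json.loads(request_bytes)
        method = getattr(self._target, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode()


class Proxy:
    # Caller side: looks like a local object, but every method call is
    # marshalled into a message and handed to the transport.
    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, name):
        def call(*args):
            request = json.dumps({"method": name, "args": list(args)})
            reply = self._transport(request.encode())
            return json.loads(reply)["result"]

        return call
```

A caller would write proxy = Proxy(Skeleton(InventoryService()).handle) and then proxy.count("widget") as though the object were local. What the proxy cannot do is decide where the object should live or how the work should be divided; that partitioning remains the programmer's job.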

The Ultimate Comparison

In drawing comparisons between Computing Fabrics and existing software, what we ultimately come to is the difference between tackling a problem head-on in next-generation technology and working around it in current technology.

Computing Fabrics will emerge as the very distinction between "multiprocessor machines" and "clusters of machines" vanishes. This is already happening today at the very high end, as system architecture and interconnects within supercomputers converge with the network architectures between them. The software technologies examined here attempt, to a greater or lesser extent, to reduce the complexities of exploiting multiple processors today.

So while the line between the two approaches is not hard and fast, distributed software technologies attempt to hide, soften, or span existing system boundaries whereas Computing Fabrics aim to eliminate these boundaries altogether in the next generation of computing.

Erick Von Schweber

Did this column answer your questions about the distinction between Computing Fabrics and related technologies? Let us know at unbound@infomaniacs.com.