Beowulf cluster
Beowulf
is not a particular product. It is a concept for clustering
varying number of small, relatively inexpensive computers running the
Linux operating system and using MPI or PVM for message passing. The
goal of Beowulf clustering is to create a parallel processing
supercomputer environment at a price well below that of conventional
supercomputers.
In the summer of 1994 the first Beowulf was built by
Thomas Sterling and Don Becker to address problems associated with
large data sets. The idea was an instant success and their idea of
providing COTS (Commodity off the shelf) base systems to
satisfy specific computational requirements quickly spread into the
academic and research communities. The High Performance Computer
community are now referring to such
machines as "Beowulf Class Cluster Computers."
Beowulf is one approach to clustering commodity hardware components to
form a parallel virtual supercomputer. It is a system that usually
consists of one system-management node and one or more compute nodes
connected via Ethernet or some other network or system area network
interconnect.
It is ideal for tackling very complex problems that can be split up and
run simultaneously in separate computers. And that's a key point: not
every problem can be approached in parallel so not every problem will
benefit from the Beowulf approach.
In the taxonomy of parallel computers, Beowulf clusters fall somewhere
between MPP
(Massively Parallel Processors, like the nCube, CM5, Convex SPP, Cray
T3D, Cray T3E, etc.)
and NOWs (Networks of Workstations).
Beowulf clusters benefit from developments in both
these classes of architecture. MPPs are typically larger and have a
lower latency interconnect
network than an Beowulf cluster. Programmers are still required to worry
about locality, load
balancing, granularity, and communication overheads in order to obtain
the best performance.
Even on shared memory machines, many programmers develop their programs
in a message
passing style. Programs that do not require fine-grain computation and
communication can
usually be ported and run effectively on Beowulf clusters.
Programming a NOW is usually an
attempt to harvest unused cycles on an already installed base of
workstations in a lab or on a
campus. Programming in this environment requires algorithms that are
extremely tolerant of
load balancing problems and large communication latency. Any program
that runs on a NOW
will run at least as well on a cluster.
A Beowulf class cluster computer is distinguished from a Network of
Workstations by several
subtle but significant characteristics. First, the nodes in the cluster
are dedicated to the cluster.
This helps ease load balancing problems, because the performance of
individual nodes are not
subject to external factors. Also, since the interconnection network is
isolated from the external
network, the network load is determined only by the application being
run on the cluster.
This
eases the problems associated with unpredictable latency in NOWs. All
the nodes in the cluster
are within the administrative jurisdiction of the cluster. For examples,
the interconnection
network for the cluster is not visible from the outside world so the
only authentication needed
between processors is for system integrity. On a NOW, one must be
concerned about network
security.
Another feature is the Beowulf software that provides a global process
ID. This
enables a mechanism for a process on one node to send signals to a
process on another node of
the system, all within the user domain. This is not allowed on a
NOW.
Finally, operating system
parameters can be tuned to improve performance. For example, a
workstation should be tuned
to provide the best interactive feel (instantaneous responses, short
buffers, etc), but in cluster
the nodes can be tuned to provide better throughput for coarser-grain
jobs because they are
not interacting directly with users.
Contact information:
Kees
Vuik
Back to
home page
or
educational page of Kees Vuik