Center for Observations and Prediction at Scripps (COMPAS)
Facility
Introduction
The original COMPAS cluster (funded by the NSF MRI grant) has been leveraged to build a larger computing facility. Two other large clusters have been purchased with ONR and NOAA funding, and all three are housed and managed together at SIO. The resources of the COMPAS clusters are shared among the PIs and students in the COMPAS project. These resources are coordinated by the COMPAS Director (originally Detlef Stammer and later Bruce Cornuelle) and the COMPAS System Manager (Caroline Papadopolous).
Hardware
The COMPAS compute facility has expanded and currently houses over 450 compute nodes, listed below by CPU count, and the associated storage servers:
| 310 1.0 GHz Pentium III | 512MB/CPU (310 GFlops Peak) | 100 Mbit + Myrinet 2000 |
| 202 3.06 GHz Xeon | 1GB/CPU (1236 GFlops Peak) | 1 Gbit Ethernet + Myrinet 2000 |
| 256 2.8 GHz Xeon | 1GB/CPU (1433 GFlops Peak) | 1 Gbit Ethernet + Myrinet 2000 |
| 128 1.0 GHz Pentium III | 512MB/CPU | 100 Mbit + 1 Gbit Ethernet. Connected to OptIPuter 10Gigabit Campus network |
| 16 CPU 733MHz Pentium III | 512MB/CPU | 100 Mbit Ethernet + Myrinet (Test Cluster) |
| 14 Storage Servers | 1 Gigabit Ethernet | 12 TB total storage, RAID-5 |
| 4 Storage Servers | 1 Gigabit Ethernet | 2.9 TB total storage, software RAID-1 |
General Description
The largest cluster is 128 nodes (256 CPUs), a size limited by the existing 128-port Myrinet switches. Three independent switch fabrics define the three main clusters. Because of the large performance difference between Pentium III and Xeon processors, applications target either PIII or Xeon configurations (no mixing), even though these processors coexist in the same Ethernet and Myrinet fabrics. This heterogeneous collection represents several major acquisitions over the last 5.5 years. The total facility (453 Xeon + 459 PIII) has a theoretical peak speed of 3 TeraFlops (TF). The configuration of machines is defined by our targeted workload, which lets us make memory vs. network vs. compute power trade-offs more easily than more general-purpose installations can. Our 18 storage servers all have hardware RAID-5 or software RAID-1 with 0.75 TB to 1.4 TB each (depending on configuration) and run the standard NFS (Network File System) protocol, which gives adequate performance. The models used by COMPAS researchers have been coded to take advantage of node-local disks, which dramatically improves performance. Storage performance is an acknowledged weakness for clusters, but by load balancing the NFS servers we are able to work around this weak link.
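The peak figures quoted above can be roughly reproduced from the hardware table. The following is a minimal sketch, assuming peak = CPUs x clock rate x floating-point operations per cycle. The per-cycle values (1 for Pentium III, 2 for Xeon) are an assumption inferred from the listed peaks, not something the text states, and the names `CLUSTERS`, `peak_gflops`, and `total_peak_tflops` are hypothetical.

```python
# Sketch: reproduce the peak-performance figures from the hardware table.
# ASSUMPTION: flops per clock cycle (1 for Pentium III, 2 for Xeon) is
# inferred from the listed peak figures; it is not stated in the text.

CLUSTERS = [
    # (label, cpu_count, clock_ghz, assumed_flops_per_cycle)
    ("Pentium III 1.0 GHz (Myrinet)", 310, 1.0, 1),
    ("Xeon 3.06 GHz", 202, 3.06, 2),
    ("Xeon 2.8 GHz", 256, 2.8, 2),
    ("Pentium III 1.0 GHz (OptIPuter)", 128, 1.0, 1),
    ("Pentium III 733 MHz (test cluster)", 16, 0.733, 1),
]


def peak_gflops(cpu_count, clock_ghz, flops_per_cycle):
    """Theoretical peak in GFlops for one group of identical CPUs."""
    return cpu_count * clock_ghz * flops_per_cycle


def total_peak_tflops(clusters):
    """Sum the per-group peaks and convert GFlops to TFlops."""
    return sum(peak_gflops(n, ghz, f) for _, n, ghz, f in clusters) / 1000.0


if __name__ == "__main__":
    for label, n, ghz, f in CLUSTERS:
        print(f"{label}: {peak_gflops(n, ghz, f):.0f} GFlops")
    print(f"Total: {total_peak_tflops(CLUSTERS):.2f} TFlops")
```

Under these assumptions the three Myrinet clusters alone account for just under 3 TF, and adding the OptIPuter and test clusters brings the total slightly above it, which is consistent with the rounded 3 TF figure.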
General Network
The COMPAS facility has a single 1 Gbit/s network connection to the campus backbone. In addition, a single 10 Gbit/s connection is available to the OptIPuter network (a campus and national scale research network funded by NSF).
Job Characteristics
Computing jobs that run on the COMPAS facility are generally long-running, mid-sized parallel applications with processor counts of 32 to 128 CPUs. Jobs using 64 and 76 CPUs are common, as these mark where parallel efficiency begins to drop off in this configuration. Runs are often long-lived, typically several days (3-5+). The COMPAS computing facility assigns these long-lived runs to dedicated processors, so jobs run with little intervention for weeks at a time without queue waits. Jobs are distributed with one process per processor, and each process must fit into main memory to attain acceptable performance. We have found that it is quite cost-effective to distribute individual user accounts across medium-sized, gigabit-connected I/O servers. Multiple I/O "pipes" mitigate interference between users running different codes. We use hardware RAID-5 with hot-swap spares or software RAID-1 on these servers to help minimize data loss. Smaller jobs are usually assigned to the older test clusters in a development environment that sometimes uses queues.
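The text does not say how user accounts are spread across the I/O servers. As an illustration only, one simple way to balance them is a greedy least-loaded placement: take accounts from heaviest to lightest and put each on the server with the smallest load so far. The function name `assign_accounts`, the account and server names, and the idea of a per-account "demand" estimate are all hypothetical.

```python
import heapq


def assign_accounts(accounts, servers):
    """Greedy least-loaded placement of accounts onto I/O servers.

    accounts: dict mapping account name -> estimated I/O demand
              (ASSUMPTION: such an estimate is available; units are arbitrary).
    servers:  list of server names.
    Returns a dict mapping account name -> server name.
    """
    # Min-heap of (current_load, server_name); every server starts empty.
    heap = [(0.0, name) for name in servers]
    heapq.heapify(heap)

    placement = {}
    # Place the heaviest accounts first so they end up on different servers.
    for user, demand in sorted(accounts.items(), key=lambda kv: kv[1], reverse=True):
        load, server = heapq.heappop(heap)
        placement[user] = server
        heapq.heappush(heap, (load + demand, server))
    return placement


if __name__ == "__main__":
    # Hypothetical accounts and demands, for illustration only.
    demo_accounts = {"user_a": 5, "user_b": 4, "user_c": 3, "user_d": 2}
    print(assign_accounts(demo_accounts, ["io1", "io2"]))
```

Keeping each account on a single server, rather than striping data across all of them, matches the "multiple pipes" idea above: a heavy user saturates one server instead of slowing everyone down.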
Mid-range facility
COMPAS fits within the hierarchy of computer centers as a mid-range facility. Larger supercomputing centers are important resources, but our usage patterns often conflict with their stated mission. For example, SDSC's web site states: "SDSC's machines are a national resource, allocations are assigned on the basis of scientific merit and on the inability of other, less-capable computing sites to perform the work". In this context, the COMPAS compute facility is a "less-capable computing site". Yet, in aggregate, the current facility will be able to deliver nearly 8 million CPU-hours/year. If COMPAS computational science were shifted to a national resource such as SDSC, it would detract from larger jobs that require the massive resources available at national centers. The codes that run on the COMPAS compute facility are all somewhat similar in their use of computing resources. This has allowed us to choose machine configurations (a specific balance of Flops/disk/memory) that achieve high performance for these codes. It has taken considerable human investment to port, tune, and develop the codes to run efficiently on the COMPAS clusters.
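The figure of nearly 8 million CPU-hours/year follows from the CPU count given under General Description. A minimal sketch of the arithmetic, assuming every CPU is available around the clock for a 365-day year (the function name `cpu_hours_per_year` and the `utilization` parameter are hypothetical additions for illustration):

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def cpu_hours_per_year(cpu_count, utilization=1.0):
    """CPU-hours delivered in one year.

    utilization is the fraction of time the CPUs are usable
    (ASSUMPTION: 1.0, i.e. no downtime, gives the upper bound).
    """
    return cpu_count * HOURS_PER_YEAR * utilization


if __name__ == "__main__":
    total_cpus = 453 + 459  # Xeon + PIII counts quoted in the text
    print(f"{cpu_hours_per_year(total_cpus):,.0f} CPU-hours/year")
```

With 912 CPUs this upper bound comes to just under 8 million CPU-hours; any downtime or idle time lowers the delivered total proportionally.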
These pages are maintained by webmaster, last update April 4, 2007