High Performance

By Simon Bisson

Welcome to a world where supercomputers are part of everyday computing environments. Simon Bisson investigates.

HardCopy Issue: 50 | Found In: Systems | Published: 01/11/2010 | Last Revision: 26/11/2010

HPC systems
Two modern HPC systems side by side: the x86-based Kraken supercomputer, and a somewhat smaller system based on NVIDIA’s Fermi GPGPU platform.

Big problems need big computers, and that used to mean going to a big university or a government research laboratory. For a long time they were the sole home for what we now call High Performance Computing (HPC): computer systems designed to solve big problems without having to wait too long for an answer. The supercomputer league tables still focus on those massive systems, but the technologies used to build them have changed, moving away from proprietary hardware (like the hand-built Cray computers of old) to systems built on and around clusters of commodity servers.

HPC as a commodity

Walk into a big supercomputing laboratory today and you’ll see what looks like any modern machine room, with racks upon racks of 1U and 2U servers. They’re nothing special; in fact they’re often the familiar Dell, IBM or HP machines you’re likely to be using in your office. In some cases, like CERN’s old supercomputing array, they’re even educational tower servers or desktop PCs. The real key to today’s big iron is how the servers are connected together, and how tasks are broken up and shared between the different processing nodes.

It’s hard to define high performance computing these days. It used to be a synonym for supercomputing: those enormous academic and industry-sponsored systems that regularly broke benchmark records. Now high performance computing is nearly everywhere. We’re working with immensely capable hardware in our data centres and even on our desktops, and we’re using software that’s been designed to take advantage of these capabilities. Buy a current generation two- or four-socket server and you’re getting anything between four and twelve cores per socket, often multithreaded, with access to tens of gigabytes of RAM and terabytes of disk storage. It’s a new world, where HPC-capable hardware can be anywhere – and where companies like Oracle are selling HPC appliances in the shape of their Exalogic Elastic Cloud.

Setting up your own HPC system is relatively easy. Clustering servers into high-performance computing environments is now a mainstream technology; one that’s changing the ways we look at science and at many business problems. Instead of throwing time at a problem, we can now throw CPU and memory, taking advantage of clustering software to handle scheduling and using parallel programming techniques to deliver software that can run on any number of processing nodes.

It’s not just our own software that can take advantage of HPC techniques: off-the-shelf tools like Excel can work on clustered servers, allowing ordinary business users to run such things as large-scale Monte Carlo simulations.
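To see why Monte Carlo simulations suit clusters so well, consider the minimal Python sketch below. It is an illustration only, not how Excel or any HPC scheduler distributes work: the function names are hypothetical, the problem (estimating pi from random points) is a stand-in for a real business model, and threads on one machine stand in for compute nodes. The point is that each batch of trials is independent, so batches can run anywhere and only the small totals need to be combined.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_chunk(seed, trials):
    """Run one independent batch of trials and return the number of hits."""
    rng = random.Random(seed)  # private generator, so batches share no state
    hits = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point falls inside the quarter circle
            hits += 1
    return hits


def estimate_pi(total_trials=200_000, workers=4):
    """Split the trials into equal batches, run them concurrently, combine."""
    per_worker = total_trials // workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Assumption: threads stand in for cluster nodes; each gets its own seed.
        hits = pool.map(simulate_chunk, range(workers), [per_worker] * workers)
        total_hits = sum(hits)
    return 4.0 * total_hits / (per_worker * workers)
```

Calling estimate_pi() returns a value close to 3.14; raising total_trials tightens the estimate, and on a real cluster the extra trials would simply be spread across more nodes.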

HPC on Windows

Microsoft’s clustering offer is Windows HPC Server 2008 R2, licensed purely for HPC applications and giving you the tools you need to build and run a Windows-based HPC cluster. Available in Express and Enterprise editions, HPC Server 2008 lets you build large-scale clusters with over 1,000 nodes. There are two DVDs in the installation: one a copy of Windows Server 2008 R2 HPC Edition and the other a set of tools for managing and deploying nodes and handling workloads.

HPC on the GPU

There’s a new set of tools and technologies that are bringing HPC to, if not the masses, then certainly to the desktop. General Purpose GPU development (GPGPU) is a technique that takes advantage of the massive arrays of processors in the current generation of Graphics Processing Units (GPUs), turning them into parallel programming accelerators akin to the vector processors used by early Cray supercomputers. Using GPGPU techniques, tasks that can be parallelised are passed from traditional CPUs to graphics cards (or to GPU-based processing architectures) using APIs such as OpenCL and DirectCompute, or programming libraries like Nvidia’s CUDA.

CUDA is a powerful tool as it simplifies a lot of parallel programming tasks, handling the marshalling of code across an array of hundreds of processors. It’s particularly suitable for data-parallel tasks, such as linear algebra problems – and is being used for complex tasks like machine vision and computational fluid dynamics. Nvidia is selling GPU arrays as processing add-ons for high performance workstations or server farms, giving HPC performance in a much smaller volume. With data centre space at a premium, there’s a lot of scope for GPGPU-based hardware to give a significant processing boost to a small HPC cluster.
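A data-parallel task is one where the same operation is applied independently to every element of a data set. The Python sketch below models the idea with the classic linear algebra operation a * x + y. It is not CUDA code and uses no GPU: the function names are hypothetical, and a handful of threads stand in for the hundreds of processors on a graphics card. What it does show is the decomposition, with each worker handling its own block of indices and never touching another worker’s output.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_block(a, x, y, out, start, stop):
    """Compute out[i] = a * x[i] + y[i] for one contiguous block of indices."""
    for i in range(start, stop):
        out[i] = a * x[i] + y[i]


def saxpy(a, x, y, blocks=4):
    """Split the index range into blocks and process the blocks concurrently."""
    n = len(x)
    out = [0.0] * n
    # Ceiling division so no element is missed; at least 1 to allow empty input.
    size = max(1, (n + blocks - 1) // blocks)
    with ThreadPoolExecutor(max_workers=blocks) as pool:
        futures = [
            pool.submit(saxpy_block, a, x, y, out, start, min(start + size, n))
            for start in range(0, n, size)
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker
    return out
```

Because no two blocks write to the same element, no locking is needed, which is what makes this style of problem such a good fit for massively parallel hardware.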

Nvidia is also working to convert well-known mathematical and statistical libraries to work with CUDA. These include a set of Fast Fourier Transform tools, ideal for large scale signal processing (for example processing seismic survey data for oil fields), and CUBLAS, a port of the BLAS (Basic Linear Algebra Subprograms) libraries.

CUBLAS has been used as the basis of parallel processing ports of the widely used LAPACK and MAGMA mathematical libraries. Nvidia’s work with mathematical libraries means that commonly used tools, like MATLAB, Mathematica and the ANSYS simulation package, can run on GPU platforms, gaining HPC capabilities without requiring significant redevelopment work. It also means end users get a significant processing boost without having to develop custom software for each problem.

Of course there are tools to help develop your own GPGPU-powered HPC software, and Nvidia’s Parallel Nsight toolkit plugs into Visual Studio, adding debugging, profiling and analysis tools for GPGPU programming. Parallel Nsight isn’t purely focused on C programming for CUDA; it also supports OpenCL and DirectCompute.

While much HPC software is now written in C and C++ (as well as managed code languages like C# and Java), the old scientific computing favourite FORTRAN remains popular. The Portland Group’s PGI 2010 FORTRAN compiler now supports CUDA on Linux, Windows and OS X. Using PGI 2010 and CUDA FORTRAN you can define FORTRAN subroutines that run on parallel hardware – and in an upcoming release not just on GPGPU hardware, but on any clustered CPU system. Simplifying the writing of parallel code remains key to delivering effective HPC applications, and the techniques developed for working with the massively parallel GPU arrays in a GPGPU system can deliver some (if not all) of the benefits of large scale parallelisation on commodity hardware.

A Windows HPC Server installation uses a single head node to control resources (and you can deploy head nodes in a high-availability failover cluster). The head node also handles the complex task of scheduling operations across the HPC cluster. Setup is straightforward: just install the host operating system, and then the management tools. You’ll need a SQL Server database, which can be either the bundled SQL Server Express or, more likely, an existing SQL Server 2008 installation (which Microsoft recommends if you have more than 100 nodes). You can also install the management tools on desktop PCs running Windows Vista or Windows 7, letting you control an HPC cluster from anywhere in a network.

Compute nodes can be running on a private network or an enterprise network, with support for high speed application networks. There’s support for diskless compute nodes, but in practice most nodes need some storage (at the very least for swap). Nodes also need at least one network card, and all nodes must use the same type of network card. You can deploy nodes manually, or automatically using Windows Deployment Services (essential when deploying hundreds or thousands of nodes), and there are graphical deployment tools in the administration GUI.

It’s important to get management right for an HPC cluster, and Windows HPC Server 2008 R2 integrates with existing System Center implementations, as well as offering its own management tools. You can use the built-in Job Manager Console as a GUI for the cluster’s scheduler, or write your own PowerShell management scripts. There’s also a heat map report that gives an at-a-glance view of how a cluster is performing, helping you quickly pinpoint issues even in the largest clusters. There’s a default set of reports, and you can create your own using the familiar tools in SQL Server Analysis Services if you need more.

Deploying nodes is a key function of the HPC Server management tools, and you’re able to work with nodes based on Server 2008 and Server 2008 R2. Being able to mix node operating systems is important as it lets you manage license costs and upgrade controllers without having to upgrade nodes in an existing HPC cluster. The new version also adds support for new networking options, including 40 Gbps InfiniBand connections. When you have hundreds or thousands of nodes working together in a large scale HPC deployment, fast interconnections are vital for moving large amounts of data around quickly, and for passing messages between nodes.

Clusters aren’t the only route to HPC: grids are an alternative that lets autonomous compute resources work together. Probably most familiar from tools like Folding@Home, grids work best with easily decomposable problems, where a standard executable can work with a subset of a large data model. Execution instructions are bundled with data and delivered to the execution engines, which return results to a host server. There’s no need to balance grid hosts, and results can be delivered to the dispatcher out of order. That does require extra processing power to reassemble the results, but it’s a lot more efficient than running the task on a single host!

Grids also mean that servers aren’t the only machines that can be used in HPC environments. Computing grids can use any available CPU resource, letting HPC systems scavenge unused cycles on desktop PCs and on non-clustered computing resources. There’s even an option in Windows HPC Server 2008 R2 to use Windows 7 desktop PCs as nodes, with user control over what resources can be used by the cluster.

You don’t have to use Windows HPC Server 2008 to build an HPC system: there’s an open source alternative in the shape of Beowulf. Beowulf clusters work using open source UNIX-derived operating systems, including Linux, and have tools for handling messaging between cluster nodes and managing how files are passed between the various parallel virtual machines that run on the cluster. There’s no need to use specialised hardware, and automated provisioning tools like OSCAR (Open Source Cluster Application Resources) handle deployment of a Beowulf cluster and management of its message passing features.
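The grid model described above, where work units go out with their data and results come back in any order, can be sketched in a few lines of Python. This is a toy under stated assumptions rather than the protocol used by Folding@Home or Windows HPC Server: work_unit and run_grid are hypothetical names, threads stand in for grid hosts, and a random delay imitates hosts of differing speed. Each unit carries an identifier, and that identifier is all the dispatcher needs to reassemble the results.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def work_unit(unit_id, data):
    """Hypothetical work unit: sum the squares of one slice of the data set."""
    time.sleep(random.uniform(0, 0.01))  # simulate hosts of differing speed
    return unit_id, sum(v * v for v in data)


def run_grid(data, unit_size=3, hosts=4):
    """Dispatch work units, collect results in any order, then reassemble."""
    units = [data[i:i + unit_size] for i in range(0, len(data), unit_size)]
    results = [None] * len(units)
    with ThreadPoolExecutor(max_workers=hosts) as pool:
        pending = [pool.submit(work_unit, i, u) for i, u in enumerate(units)]
        for done in as_completed(pending):  # completion order, not submit order
            unit_id, value = done.result()
            results[unit_id] = value  # the unit id puts each result in its slot
    return results
```

However the hosts happen to finish, run_grid returns the partial results in the original order of the data, because each one is slotted back by its unit identifier rather than by arrival time.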

Writing HPC software

Microsoft cluster management tools screenshot
Microsoft’s cluster management tools are an important part of its Windows HPC Server 2008 R2 environment.

The key to developing applications for a modern HPC environment is in creating parallel computing code, which is simpler now than it’s ever been before. If you’re developing .NET code for Windows HPC Server 2008 R2, then it’s easy enough to use the tools built into Visual Studio. The latest versions of the .NET Framework include a parallel version of the LINQ query language, along with a Task Parallel Library that simplifies common parallel programming tasks. There are also parallel debugging tools that help deal with the complex issues of ensuring there are no parallel-specific bugs, like race conditions, in your code.
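A race condition arises when several threads read, modify and write the same value without coordination, so that updates are silently lost. The Python sketch below illustrates the general remedy of guarding shared state with a lock. It is not the .NET Task Parallel Library or PLINQ API; SafeCounter and count_matches are hypothetical names chosen for illustration, with Python’s standard thread pool standing in for a parallel query.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SafeCounter:
    """A counter whose read-modify-write update is protected by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def add(self, amount):
        with self._lock:  # only one thread at a time may update the total
            self._value += amount

    @property
    def value(self):
        with self._lock:
            return self._value


def count_matches(items, predicate, workers=8):
    """Count the items that satisfy predicate, using one shared counter."""
    counter = SafeCounter()

    def check(item):
        if predicate(item):
            counter.add(1)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(check, items))  # drain the iterator so errors surface
    return counter.value
```

A lock is the simplest fix but it serialises the update; where the work allows, having each task return its own partial count and summing those afterwards avoids shared state altogether.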

Intel’s Parallel Studio is also suitable for HPC application development, with tools for building parallel C and C++ applications. It includes tools to help identify just how an application can take advantage of parallelism, as well as offering code libraries to simplify application development. There are also debugging and profiling tools to help deliver high quality code that can perform well in a distributed, parallel computing environment.

HPC for all

In the past, high performance computing was the stuff of sky-high budgets and blue sky research. Now, however, it’s an affordable and powerful approach to dealing with the big problems we’re seeing in business – and with HPC support in tools like Excel and Mathematica, it’s something that doesn’t need an army of programmers. It’s also something that can be managed using familiar tools and techniques, with automated deployment tools and powerful reporting features. Technologies like Microsoft’s Windows HPC Server 2008 R2 simplify deployment of HPC clusters, with Visual Studio providing a programming environment for both traditional parallel programming and GPGPU approaches. You don’t have to ask if you can afford HPC resources now; you just have to ask if you need them.
