Supercharged
By Mary Branscombe
Mary Branscombe finds out how you can use Windows HPC Server 2008 R2 to speed up everyday applications.
HardCopy Issue: 52 | Found In: Systems | Published: 01/05/2011 | Last Revision: 28/06/2011
Client software such as Microsoft Excel and Mathematica lets you do powerful calculations, but when your models get so large and complex that even a fast new PC can’t keep up, you don’t have to abandon familiar software. Instead you can shift the calculations from your local PC and run them in parallel on a Windows HPC Server 2008 R2 cluster. Furthermore, if you need to run intensive calculations only occasionally, you can save money on building and maintaining your cluster by bursting to the cloud for peak loads, all without asking your users to change the way they work.
Excel displays the progress of cluster calculations while working on a financial market simulation.
Microsoft Excel 2010 lets you run workbooks directly on the HPC cluster, either as independent calculations or as iterations of the same workbook with different data sets or parameters. This is especially useful for iterative algorithms like Monte Carlo analysis and for complex VBA functions where it’s not easy to break calculations into components. To do this you need a copy of Excel installed on the client PC and on each cluster server, and it may take a developer to find iterative loops in the Excel model and convert them into units of work that run on the cluster. You also have to work around the fact that Excel expects a user at the screen; however, HPC Services for Excel 2010 will handle dialog boxes and notifications that could otherwise leave Excel hung on an unattended node.
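To give a feel for what converting an iterative loop into units of work involves, here is a minimal Python sketch. It is not the HPC Services for Excel API: the price model, the function names and the parameters are all hypothetical, and a local thread pool stands in for the cluster scheduler. The point is only that each chunk of a Monte Carlo run carries its own seed and trial count, so chunks can be calculated anywhere and merged afterwards.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_chunk(seed, trials, start_price=100.0, drift=0.0005,
                   volatility=0.01, days=20):
    """One unit of work: simulate `trials` independent price paths.

    Returns (sum of final prices, number of trials) so that partial
    results from different workers can be merged exactly.
    All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)  # private generator: chunks never share state
    total = 0.0
    for _trial in range(trials):
        price = start_price
        for _day in range(days):
            price *= 1.0 + rng.gauss(drift, volatility)
        total += price
    return total, trials


def run_simulation(total_trials, chunks, max_workers=4):
    """Split the run into `chunks` units, dispatch them, merge the results."""
    per_chunk = total_trials // chunks
    sizes = [per_chunk] * chunks
    sizes[-1] += total_trials - per_chunk * chunks  # remainder goes to last chunk
    seeds = range(chunks)
    # The thread pool is a stand-in; on a cluster each call would go to a node.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(simulate_chunk, seeds, sizes))
    total = sum(r[0] for r in results)
    count = sum(r[1] for r in results)
    return total / count


mean_final_price = run_simulation(total_trials=1000, chunks=4)
```

Because every chunk is seeded independently and the results are merged in a fixed order, the answer does not depend on how many workers happen to be available, which is the property you want before handing the loop to a scheduler.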
Anything you do in Mathematica can be distributed, either over local compute kernels on a multicore system or onto extra compute kernels on the HPC system using gridMathematica Server. As with Excel, it takes time to move data over the network, but the performance gains are useful for financial models like Monte Carlo, data simulation for drug discovery, animation and so on. The new Mathematica 8 includes features for statistics, wavelets, control theory and image processing, many of which parallelise well.
Using HPC with desktop software has licence implications. You need the usual licence for desktop copies of Excel, but each compute node running Excel also requires a licence. With gridMathematica you require a network server and increment seat for the compute nodes, as well as a licence for Mathematica for each concurrent desktop user.
Expanding your cluster
Windows HPC Server 2008 R2 supports Workstation Nodes, which let you add Windows 7 desktops that might otherwise be idle as nodes in your cluster, while Service Pack 1 lets you add nodes running on Microsoft’s Azure to build a hybrid cluster that lives both on your premises and in the cloud.
If your average utilisation is only 40 per cent of the hardware you need for peak computation then you can save a significant amount on servers (as well as power and floor space) by building for the average load in-house and bursting out to the cloud when you need to. You pay a premium for cloud usage but it’s still less than paying for the other 60 per cent all the time.
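The trade-off can be checked with some back-of-envelope arithmetic. The Python sketch below uses an invented load profile whose average is 40 per cent of its peak, a notional cost of 1.0 per in-house node per period, and a cloud premium of three times that. None of these figures comes from Microsoft's pricing; they are assumptions chosen only to make the comparison concrete.

```python
def in_house_cost(capacity_nodes, periods, node_cost=1.0):
    """Owned nodes cost the same every period, whether busy or idle."""
    return capacity_nodes * periods * node_cost


def hybrid_cost(demand, in_house_nodes, node_cost=1.0, cloud_premium=3.0):
    """Own `in_house_nodes`; rent anything above that from the cloud.

    `demand` is a list of nodes needed in each period. Cloud nodes are
    billed only for the periods in which they are used, at
    `node_cost * cloud_premium` (a hypothetical premium).
    """
    owned = in_house_cost(in_house_nodes, len(demand), node_cost)
    burst_node_periods = sum(max(0, d - in_house_nodes) for d in demand)
    return owned + burst_node_periods * node_cost * cloud_premium


# Hypothetical profile: two peak periods of 100 nodes, eight quiet ones of 25.
demand = [100, 100, 25, 25, 25, 25, 25, 25, 25, 25]
peak = max(demand)                       # 100 nodes
average = sum(demand) // len(demand)     # 40 nodes, i.e. 40 per cent of peak

build_for_peak = in_house_cost(peak, len(demand))   # 1000.0
build_for_average = hybrid_cost(demand, average)    # 400.0 owned + 360.0 cloud
```

With these made-up numbers the hybrid cluster costs 760 against 1,000 for building to peak, even at a threefold premium. The saving shrinks as peaks become longer or the premium rises, so it is worth running the sums with your own load profile and prices.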
The Windows HPC Server management interface is easy to work with: it’s a graphical user interface rather than the command line of traditional supercomputing. The SharePoint integration toolkit for HPC Server includes Web parts for submitting and managing jobs.
HPC isn’t the way to solve all your performance problems. If you’re trying to load hundreds of megabytes of data into Excel to work with a pivot table you’ll get better results building a cube in an OLAP database such as SQL Server and using Reporting Services or PowerPivot. But if you need to accelerate calculation-intensive worksheets then HPC gives you a flexible centralised compute resource.
The performance increase doesn’t just save you time; it lets you experiment. Reducing calculation time from three years on a laptop running Mathematica to two weeks on an 8-node cluster with gridMathematica and Windows HPC Server means getting a product to market far sooner. An actuary who models a thousand scenarios for 30,000 insurance policies over 30 years will be used to that taking ten days in Excel. Repeating calculations for compliance every 30 days doesn’t leave much time for experimentation. Reduce that to two hours on a cluster and you can change scenarios a dozen times to compare results, or run 10,000 scenarios and still get the answer back in a day. And you can do it all using tools that you already know how to use.
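The figures quoted above are easy to sanity-check. The short Python sketch below works out the implied speed-ups and the run time for 10,000 scenarios; the one assumption it adds is that run time grows in direct proportion to the number of scenarios, which real workloads will only approximate.

```python
HOURS_PER_DAY = 24
WEEKS_PER_YEAR = 52


def speedup(before, after):
    """How many times faster the cluster run is (same units for both)."""
    return before / after


# Actuarial model: ten days in Excel on a PC, two hours on the cluster.
excel_speedup = speedup(10 * HOURS_PER_DAY, 2)        # 120.0

# Mathematica model: three years on a laptop, two weeks on the cluster.
mathematica_speedup = speedup(3 * WEEKS_PER_YEAR, 2)  # 78.0

# 1,000 scenarios take two hours; assume time scales linearly with scenarios.
hours_for_10000 = 2 * (10_000 / 1_000)                # 20.0, inside one day

# A dozen two-hour experiments fit in a single day.
hours_for_a_dozen_runs = 12 * 2                       # 24
```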