Homegrown high-performance computing
Uptime counts for Merlin
Whereas most HPC implementations are the province of scientists and engineers hidden away in R&D departments, Merlin Securities' HPC solution interfaces directly with its hedge fund customers. That's why 24/7 uptime and security were key HPC design requirements for Merlin, right alongside performance.
"We had to be extremely risk-averse in designing our cluster and choosing its components," says Mike Mettke, senior database administrator at Merlin.
A small prime brokerage firm serving the hedge fund industry, Merlin must contend with several larger competitors that benefit significantly from economies of scale. Morgan Stanley, Merrill Lynch, and Bear Stearns, for example, run large mainframes that analyze millions of trades at the end of the day and return reports via batch processing the next morning. Merlin stakes its competitive edge on using its HPC cluster to deliver trading information in real time and on letting customers slice and dice data multiple ways to uncover valuable insights, such as how an analyst's daily trading performance compares with that of other analysts, other market securities, and numerous market benchmarks. "We focus on helping clients explain not only what happened but why it happened," says CTO Amr Mohamed.
To do this, Merlin built its own highly parallelized analysis tools, which it runs on a high-performance Oracle RAC (Real Application Clusters) installation on a rack of Dell PowerEdge 1850 and 2850 dual-core Xeon servers. Data storage is provided by EMC CLARiiON 2Gbps and 4Gbps FC storage towers. Sitting on top of Oracle is Merlin's HPC task-scheduling software, also created in-house, and an Oracle data mart that serves as a temporary holding ground for frequently used data subsets, much like a cache. Most of the high-speed calculations run directly on the Oracle RAC, which is fronted by a series of BEA WebLogic app servers that take in requests from a set of redundant load balancers sitting behind the company's customer-facing Apache Web servers. Sitting in front of each of the three layers are sets of redundant firewalls.
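The data mart's cache-like role can be illustrated with a minimal Python sketch. The article does not say how Merlin decides which subsets to hold or when to drop them, so the least-recently-used eviction policy, the `SubsetCache` class, and the `load_subset` callback below are all assumptions made for illustration, not a description of Merlin's Oracle setup.

```python
from collections import OrderedDict


class SubsetCache:
    """Tiny least-recently-used cache for data subsets (illustrative only)."""

    def __init__(self, load_subset, capacity=2):
        # load_subset is a hypothetical function that fetches a subset
        # from the main data store when it is not already held here.
        self._load = load_subset
        self._capacity = capacity
        self._entries = OrderedDict()
        self.misses = 0

    def get(self, key):
        if key in self._entries:
            # Hit: mark the subset as most recently used and return it.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Miss: fetch from the main store and remember the result.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            # Assumed policy: evict the least recently used subset.
            self._entries.popitem(last=False)
        return value
```

Repeated requests for the same subset are answered from the holding area, and only a first request, or a request for an evicted subset, goes back to the main store.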
Cluster performance is key to running complex calculations in real time, but for Merlin, performance could never come at the expense of enterprise-level reliability, scalability, and 24/7 uptime, requirements that led to several crucial design decisions.
First, tightly coupled parallel processing via message passing was simply out of the question. Instead, Merlin's architects and programmers put tremendous effort into dividing processes in an "embarrassingly parallel" fashion without any interdependencies at all. This benefits both scalability and reliability, because the high-speed, low-latency links that tightly coupled interprocess communication requires create scalability bottlenecks. They also require cutting-edge interconnects such as Myrinet and InfiniBand, which don't have the reliability track record of Gigabit Ethernet.
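As a rough illustration of what "embarrassingly parallel" means in practice, the Python sketch below splits a workload into partitions that share no state and runs each one as an independent task. The trade records, the per-account partitioning, and the function names are hypothetical and are not taken from Merlin's tools. A thread pool stands in for the cluster here; on real hardware the same structure would map onto separate processes or nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical trade records: (account, profit_or_loss).
# This is not Merlin's data model.
TRADES = [
    ("fund_a", 120.0),
    ("fund_b", -40.0),
    ("fund_a", -20.0),
    ("fund_c", 75.5),
    ("fund_b", 10.0),
]


def partition_by_account(trades):
    """Split trades into per-account partitions that share no state."""
    partitions = {}
    for account, pnl in trades:
        partitions.setdefault(account, []).append(pnl)
    return partitions


def summarize(item):
    """Compute one account's summary using only its own partition."""
    account, pnls = item
    return account, {"trades": len(pnls), "net_pnl": sum(pnls)}


def analyze(trades, max_workers=4):
    """Run every partition as an independent task; workers never communicate."""
    partitions = partition_by_account(trades)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(summarize, partitions.items()))


if __name__ == "__main__":
    for account, summary in sorted(analyze(TRADES).items()):
        print(account, summary)
```

Because no task waits on another, adding workers requires no change to the tasks themselves, and a failed task can be rerun on its own without disturbing the rest.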
"We didn't want some new interconnect driver crashing the system," Mohamed says, adding that straight Gigabit has also helped Merlin achieve considerable cost savings.
Reliability and enterprise-grade support fueled Merlin's decision to stick with an Oracle RAC, which has high-quality fault-tolerant fail-over features; dual-processor Dell PowerEdge servers; high-end EMC CLARiiON FC storage; and F5 load balancers.
"There are lots of funky platforms for HPC out there and high-bandwidth data storage solutions that can pump data at amazing rates," Mettke says. "The problem is that you end up dealing with lots of different vendors, some of whom can't deliver the 24/7 enterprise-level support you need. That adds another element of risk."
Finally, all code was written using Java, C++, and SQL.
"I've been on the other end running code written in Assembler on thousands of nodes," Mettke says. "We want the speed, but not at the expense of system crashes in the middle of a trading day. You can claim you have the best cluster out there, but it doesn't matter if there's no show when it's showtime."
Mettke adds that the architecture of Merlin's HPC infrastructure is constantly evolving to accommodate new data and applications.



