A Reaction to Elden Christensen’s MSDN blog post, “Evaluating High-Availability (HA) vs. Fault Tolerant (FT) Solutions.”

Published on 10/07/2010 By phil.riccio

This morning, I read Elden Christensen’s MSDN blog post, “Evaluating High-Availability (HA) vs. Fault Tolerant (FT) Solutions.” I found it an interesting post, but I am uncomfortable with some of his facts.

First, it was unclear to me whether he was talking about software FT solutions or hardware FT. I also found the post a bit misleading. For example, he stated, “In the event that there is a software fault (such as a hang or crash), both machines are affected and the entire solution goes down. There is no protection from software fault scenarios and at the same time you are doubling your hardware and maintenance costs. At the end of the day while a FT solution may promise zero downtime, it is in reality only to a small set of failure conditions.” For a Stratus ftServer, this is totally incorrect. Stratus works with OS vendors such as Microsoft to harden the OS, which allows our servers to ride through these types of software faults and transient errors. We also eliminate the possibility of these errors propagating across the server, which is something that occurs quite often in HA cluster solutions.

Stratus also provides root cause analysis of the fault to find out what caused it, so that it does not recur. With Stratus ftServers there is no “doubling” of hardware or maintenance: each unit is a single entity with total redundancy built in, licensed once and managed as a single x86/x64 server, with none of the complexities, additional skills, or planning required to manage a cluster. There is no scripting, and the applications do not need to be “cluster aware.” With ftServer, it's drop-in hardware fault tolerance.

A key differentiator between Stratus’ full-function hardware fault tolerant servers and software FT solutions is performance. Stratus engineers our servers from the ground up to eliminate downtime while maintaining 100% of the machine's performance, even during and after a failure. Stratus’ full-function fault tolerant servers differ from HA cluster solutions in many ways, but probably the most important difference is that we eliminate failover rather than recover from it. There is no data loss and no restart or reboot, which could take several minutes or worse.

Microsoft and Stratus have been OEM partners for well over a decade now, and there are thousands of Stratus ftServers running Windows and supporting some of the world’s most mission-critical applications. In fact, we just announced support for Hyper-V on the ftServer. Elden might be interested in the quote by Mike Neil, general manager, Windows Server & Virtualization at Microsoft, in our press release this week: “Customers can now experience Microsoft Hyper-V, its tools and features such Live Migration in the easy-to-use and familiar Microsoft User Interface with 99.999+ percent mission-critical availability running on Stratus ftServer systems. ftServer and critical application support are synonymous and Stratus now includes mission-critical support for Microsoft Hyper-V on ftServer systems. Giving customers the option of choosing the ftServer platform with Windows Server 2008 R2 and Hyper-V adds an availability dimension that hardens the entire solution against downtime and data loss.” The entire press release can be found here: Microsoft Hyper-V on Stratus ftServer Systems.

Also, there is a quote by Claude Lorenson, director of SQL Server marketing at Microsoft Corp., from an April joint press release: “For SQL Server users that also need the highest degree of hardware availability to complete their solution, the ftServer system from Stratus Technologies has a decade-long record of uptime performance that only a fault-tolerant server architecture can deliver.” The complete press release can be found here: Microsoft SQL Server 2008 R2 with Stratus ultra-high server uptime.
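
To put the “99.999+ percent” figure in the quote above into perspective, here is a back-of-the-envelope sketch of the arithmetic. It is only an illustration, not a Stratus or Microsoft tool: the function name `downtime_minutes_per_year` and the 365.25-day year are assumptions chosen for this example.

```python
# Illustrative arithmetic only: converts an availability percentage into
# the downtime it allows per year. The function name and the use of a
# 365.25-day average year are assumptions made for this sketch.

MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes in an average year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare three, four, and five nines.
    for pct in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% availability allows {minutes:.2f} minutes of downtime per year")
```

At five nines the budget works out to a little over five minutes a year, so a single restart or reboot lasting several minutes would use up most or all of it.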