An Always-On World Requires Always-On Solutions

“The server is down” is no longer an acceptable excuse when your applications stop working. There are just too many ways to prevent downtime today. With Stratus everRun, you can achieve nearly 100% availability for your applications quickly, easily and cost-effectively.

Stratus offers the best availability solutions on the market: everRun Enterprise, downtime-prevention software that offers both continuous availability (also referred to as fault tolerance) and high availability, and everRun Express, downtime-prevention software that offers high availability.

Both software-defined availability solutions offer the same comprehensive features and services and share the same Availability Engine architecture. They differ in two ways: the level of availability they deliver and how it’s achieved, and the state of the application’s data and how it’s replicated and synchronized.

Availability

Availability is a term associated with the measurement of an application’s uptime or downtime and is defined as the percentage of time in a given time span (for instance, a year) that your applications are operational and accessible to users.

  • everRun Enterprise provides 99.999+% uptime when run in continuous availability mode (roughly 31 seconds to 5 minutes of downtime per year) and 99.99% uptime when run in high availability mode (52 minutes of downtime per year).
  • everRun Express provides 99.99% of uptime (52 minutes of downtime per year).
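
The downtime figures above follow directly from the availability percentages. As a minimal sketch (the function name is illustrative, and a 365-day year is assumed), the conversion looks like this:

```python
# Convert an availability percentage into the downtime per year it implies.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in seconds, implied by an availability percentage."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


for label, pct in [("four nines", 99.99), ("five nines", 99.999), ("six nines", 99.9999)]:
    seconds = downtime_seconds_per_year(pct)
    print(f"{label} ({pct}%): {seconds / 60:.1f} minutes ({seconds:.0f} seconds) per year")
```

Four nines works out to about 52 minutes per year, five nines to about 5 minutes, and six nines to about 31 seconds, which is where the ranges quoted for each mode come from.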

Data Replication & Failure Protection

How data is replicated also differs between the two solutions.

In everRun Express, data in storage is replicated, but in-memory data is not. If an outage occurs, the data is rolled back to its last known state. This is called checkpointing. If the outage occurs on the standby server, everything continues to run uninterrupted. But if an outage occurs on the primary server, a restart is needed, and the restart time depends on the application.

In everRun Enterprise, data in both storage and memory is replicated and synchronized, even un-cached data. This is called statepointing. The application state is always current in real time, even if there’s an outage.

Statepointing

Statepoint operations ensure that either server can continue operation should one fail.

To do this, no I/O can leave the active server until any modified memory data has been mirrored to the standby server. Statepoint operations occur constantly and in real time.

Because a node failure could occur during a statepoint, all I/O operations are queued until the statepoint has been completed.

When a failure occurs, the standby server continues operation from the last statepoint. Since the I/O was held on the active side, it is released only from the standby side, at the state of the last statepoint. This provides continuous availability with no data loss after the failure.
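
To make the ordering rule concrete, here is a toy model of the behavior described above. It is only an illustrative sketch, not Stratus code: the `Server` and `ProtectedPair` classes and all of their methods are hypothetical names, and real statepointing operates on memory pages and device I/O rather than Python dictionaries.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Server:
    """Hypothetical stand-in for one physical server's in-memory state."""
    memory: dict = field(default_factory=dict)


@dataclass
class ProtectedPair:
    """Toy model: outbound I/O is held until modified memory has been mirrored."""
    active: Server = field(default_factory=Server)
    standby: Optional[Server] = field(default_factory=Server)
    dirty: dict = field(default_factory=dict)         # memory changes not yet mirrored
    held_io: list = field(default_factory=list)       # I/O queued until the next statepoint
    released_io: list = field(default_factory=list)   # I/O the outside world has seen

    def write(self, key, value, outbound_io=None):
        """Modify memory on the active server; any resulting I/O is queued, not sent."""
        self.active.memory[key] = value
        self.dirty[key] = value
        if outbound_io is not None:
            self.held_io.append(outbound_io)

    def statepoint(self):
        """Mirror modified memory to the standby, then release the queued I/O."""
        if self.standby is not None:
            self.standby.memory.update(self.dirty)
        self.dirty.clear()
        self.released_io.extend(self.held_io)
        self.held_io.clear()

    def fail_active(self):
        """The active server fails: the standby resumes from the last statepoint."""
        self.active = self.standby
        self.standby = None    # no partner remains until a replacement is added
        self.dirty.clear()     # unmirrored changes are lost with the failed server
        self.held_io.clear()   # but their I/O was never released, so nothing external saw them


pair = ProtectedPair()
pair.write("balance", 100, outbound_io="ack: balance=100")
pair.statepoint()                      # state is mirrored, then the ack is released
pair.write("balance", 250, outbound_io="ack: balance=250")
pair.fail_active()                     # failure strikes before the next statepoint
print(pair.active.memory)              # {'balance': 100}
print(pair.released_io)                # ['ack: balance=100']
```

In this sketch, everything the outside world has seen corresponds to state that the surviving server actually holds, which is the sense in which no data is lost. The second update was never acknowledged, so the survivor can redo it from the last statepoint.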

Failure Protection

Another difference between everRun Enterprise and everRun Express is how each solution reacts if an outage should occur.

In the fault-tolerant everRun Enterprise solution, one application is mirrored on two physical servers. If one server fails, the application continues to run on the other server with no interruptions or lost data.

In the high availability everRun Express solution, one application runs on one physical server. If that server fails, the application is restarted on the second server.

Matching the Right Solution to Your Applications

The right solution for you depends on the applications you are running. When comparing everRun Enterprise and everRun Express, first and foremost, think about all your applications’ availability needs and the importance of real-time data to your organization.

If even a short server outage could sink your business, then you need a fault-tolerant solution. Emergency services, critical infrastructure, building access control and video surveillance, online stock trading, air traffic control, credit card validation, manufacturing production lines, e-commerce, and voice communications scream out for continuous, real-time, uninterrupted computing.

HA systems are a good fit for business applications that can endure minor disruptions and minimal data loss — but not much more.  Typical applications that are a good fit for an HA solution include web applications, heating and cooling control systems, logistics and resource management applications.

How much data loss matters depends on the criticality of the application. For example, in financial services, if a $10m transaction is lost, the cost of that data loss is $10m. But it’s probably even more because you likely just lost the account. In public safety and health care, it could mean the loss of lives. These are clearly examples of situations that require a fault-tolerant solution.

If data loss does not affect safety or compliance and does not have a big financial impact, then chances are HA is a better way to go.

Do you have some applications that need fault-tolerant availability and some that need high availability?  Then everRun Enterprise is for you since it offers both types of availability.

If you determine that you only need high availability, then choose everRun Express.

Availability and data loss considerations are the two biggies.  But performance, and of course cost, need to be taken into consideration as well.

Performance is relative to the application, but the same application running in the same environment on everRun Express will be a bit faster than if it were run on everRun Enterprise.  And continuous availability does cost more than high availability.

As with any technology decision, there are tradeoffs among availability, data loss, performance and cost.

CPU, Memory & Power

The amount of CPU, memory and power consumed by everRun Enterprise and everRun Express also differs. Enterprise consumes more of each than Express.

For realistic enterprise workloads, fault-tolerant (FT) protected VMs (PVMs) run at about 50% of the performance of a non-protected VM, while high availability (HA) PVMs run at about 95%. This is all very application dependent, though. For example, if we run a non-I/O-intensive benchmark like SPECint, the FT PVM runs at about 90% of a non-protected VM.
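
As a rough planning aid, the relative figures above can be applied to a measured baseline. This is a back-of-the-envelope sketch only: the dictionary keys, the function name and the baseline of 1,000 transactions per second are hypothetical, and actual results depend heavily on the application.

```python
# Approximate fractions of non-protected VM performance, taken from the figures above.
RELATIVE_PERFORMANCE = {
    "FT, realistic enterprise workload": 0.50,
    "HA, realistic enterprise workload": 0.95,
    "FT, non-I/O-intensive benchmark": 0.90,
}


def estimated_throughput(baseline: float, scenario: str) -> float:
    """Scale a baseline measured on a non-protected VM by the approximate factor."""
    return baseline * RELATIVE_PERFORMANCE[scenario]


baseline_tps = 1000.0  # hypothetical baseline: transactions per second on a non-protected VM
for scenario in RELATIVE_PERFORMANCE:
    print(f"{scenario}: ~{estimated_throughput(baseline_tps, scenario):.0f} transactions/s")
```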

Need more help making a product determination?  Call 1.800.Stratus and we’ll be happy to assist you.