When someone in the technology industry throws around the term “carrier grade,” it suggests the highest bar for reliability, availability and resiliency. The bar for a carrier-grade network is so high that some countries actually enforce it by law. So it’s notable that telcos are looking to provide network functions virtualization (NFV) in OpenStack-based clouds. It means these telcos envision a future where the cloud itself is carrier grade and ready to do the job that physical equipment does today. That vision is still at an early stage, but progress continues on many fronts.

ETSI (the European Telecommunications Standards Institute) is a telecommunications industry standards body at the forefront of defining and standardizing this vision of carrier-grade clouds. Members of this body nominate various Proof of Concept (PoC) demonstrations to investigate, conceptualize and ultimately provide standards for achieving that vision. In May, Stratus and our partners embarked upon a PoC to address what may be the single biggest barrier to making a carrier-grade cloud a reality. Today we are happy to announce the results of our PoC.

Our PoC, titled “Availability Management with Stateful Fault Tolerance,” demonstrates how virtualized network functions (VNFs) from multiple vendors can be easily deployed in a highly resilient software infrastructure environment. That environment provides complete and seamless fault management, so that in the event of a system fault or failure the VNFs stay highly available, keep running, and keep state (that is, they remember the preceding events in a given sequence of interactions with a user).

The results were compelling. For the first time, we have been able to prove a number of things:

  • OpenStack-based VIM (virtualized infrastructure manager) mechanisms alone are insufficient for supporting carrier-grade availability objectives. Baseline functionality is adequate only for development scenarios and non-resilient workloads.
  • All phases of the fault management cycle (fault detection, fault localization, fault isolation, fault recovery and fault repair) can be provided as infrastructure services, using a combination of NFVI-level and MANO-level mechanisms to deploy VNFs with varying availability and latency requirements, all without any application-level (i.e., VNF-level) support mechanisms.
  • NFVI services can offer a sophisticated VM-based state replication mechanism (CheckPointing and I/O StateStepping) that ensures globally consistent state for stateful applications, maintaining both high service accessibility and service availability without application awareness.
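
To make the idea of keeping state without application awareness a little more concrete, here is a minimal Python sketch of a generic checkpoint-and-release pattern. It is an illustration only and not the mechanism used in the PoC: the names `Replica`, `Primary`, `handle`, `checkpoint` and `failover` are hypothetical, and the sketch assumes that output is held back until the standby has acknowledged the state that produced it.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical standby that holds the last acknowledged checkpoint."""
    state: dict = field(default_factory=dict)

    def apply_checkpoint(self, snapshot: dict) -> bool:
        # Store an independent copy so later changes on the primary
        # cannot leak into the standby between checkpoints.
        self.state = copy.deepcopy(snapshot)
        return True  # acknowledgement back to the primary


@dataclass
class Primary:
    """Hypothetical active instance; the workload itself is unaware of replication."""
    standby: Replica
    state: dict = field(default_factory=dict)
    pending_output: list = field(default_factory=list)
    released_output: list = field(default_factory=list)

    def handle(self, user: str, event: str) -> None:
        # Remember the sequence of interactions per user (the "state").
        history = self.state.setdefault(user, [])
        history.append(event)
        # Hold the response until the state behind it is replicated.
        self.pending_output.append(f"{user}:{event}:seen={len(history)}")

    def checkpoint(self) -> None:
        # Release held output only after the standby acknowledges the snapshot,
        # so nothing visible to the outside depends on unreplicated state.
        if self.standby.apply_checkpoint(self.state):
            self.released_output.extend(self.pending_output)
            self.pending_output.clear()


def failover(standby: Replica) -> Primary:
    """Promote the standby: resume from the last acknowledged checkpoint."""
    return Primary(standby=Replica(), state=copy.deepcopy(standby.state))
```

In this sketch, if the primary fails after handling an event but before the next checkpoint, the promoted standby resumes from the last acknowledged state, and no response based on the lost event was ever released. That property is what lets the replication stay invisible to the application.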

We believe this is a major step forward in proving that the vision of a carrier-grade cloud is viable, and that a software infrastructure solution benefits both VNF providers and network operators/service providers.

  • For network operators/service providers, it enables the deployment of any KVM/OpenStack application with transparent and instantaneous fault tolerance for service accessibility and service continuity, without requiring code changes in the VNFs.
  • For VNF providers, it reduces the time, complexity and risk associated with adding high availability and resiliency to every VNF.

While there is still much more progress to be made, the very possibility that reliable carrier-grade workloads can be maintained will help accelerate the adoption of NFV worldwide. If you’d like to see the details of our PoC, click here. Those who are not ETSI NFV members can download PDF versions of the PoC Proposal, which describes the testing we performed, and the PoC Report, which describes the findings and results of that testing. If you’d like to know more about the technology Stratus provides to enable these results, see our Cloud Solution Brief and contact Ali_Kafel@Stratus.com for a white paper with more details.

Lastly, we did not do all of this work ourselves. Many partners were involved, as well as our industry sponsors, and we’d like to thank them for helping us achieve this great result.