Earlier this month (May 5 to 8), I was invited to speak at the NFV World Congress in San Jose on “Ensuring Availability and Resiliency in NFV.” This is Part 2 of my blog series covering the five key takeaways from the event. This time I will discuss how:

  • Carriers are looking for a lot more than just cost savings from NFV
  • Resiliency and reliability in NFV is a hot topic that every telco cares about

 4 | Carriers are looking for a lot more than just cost savings from NFV

There is no doubt that the telcos are looking to adopt virtualization in order to make their networks more efficient and ultimately save money. According to Toby Ford, AT&T’s area vice president for cloud technology strategy, since the debut of the iPhone in 2007, data traffic over AT&T’s network has increased 100,000 percent, straining the company’s sprawling infrastructure to the limits. As a result, AT&T needs to provide 300-400 percent more capacity every three years, with the same or less spend each year.

Beyond cost savings, the telcos are looking at three other key benefits from NFV.

Common Ground: As stated in Part 1 of this blog series, this is a culture change. Rather than developing their own solutions or getting their vendors to provide them with unique technologies, telcos now believe it is better to leverage the community, not just to reduce costs, but also to reduce risks. “The more we converge, the more we don’t fail,” said Margaret Chiosi, distinguished network architect at AT&T Labs and OPNFV’s president.

An End to Vendor Lock-In: The days of just sticking with the traditional vendors are long gone as telcos have learned that innovation often comes from others.

Agility: “The No. 1 opportunity is really agility. To turn up some circuit across the country, it should be minutes instead of months,” said Bryan Sullivan, director of service standards for AT&T. “NFV is going to make it much more possible to gain new partners and bring in new services from new suppliers very quickly.”

5 | Resiliency & reliability in NFV is a hot topic for every telco

In almost every session at NFV World Congress, there was something to be said about Availability, Reliability or Resiliency. Although there is a working group in OPNFV on availability, it is still an open area of concern. During the sessions, several people used these terms interchangeably as if they were synonymous with one another. The fact is, although related, they have different meanings.

Availability is the percentage of time a piece of equipment (or a network) is in an operable state (i.e. users can access its information or resources). It is calculated as Availability (%) = (uptime / total time) × 100.

Reliability is how long a system performs its intended function before it fails. It is typically measured as Mean Time Between Failures (MTBF) = total time in service / number of failures.

Resiliency is the ability to recover quickly from failures and return to the state the system was in just before the failure.

Therefore, a Highly Available (HA) system may not be Highly Reliable (HRel) or Highly Resilient (HRes). For example, five nines (99.999%) availability means the equipment (or network) is inaccessible for no more than about 5.26 minutes in total over a one-year period, which is pretty darn good. But what happens if every week it fails for just a few seconds and in the process causes every session to be reinitialized? You now have a highly available system but with poor reliability and resiliency, and that is bad. Telcos that understand the meaning of carrier-grade appreciate stateful Fault Tolerance, which provides HA, HRel and HRes. This means that when a fault happens, the applications and “system” continue to run (in a secondary entity) with the same states. This is what it means to be “carrier-grade.”
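To put rough numbers on the weekly-failure example, here is a minimal Python sketch that applies the availability and MTBF formulas given earlier. The specific figures (one 5-second outage per week, a 365-day year) and all function names are assumptions chosen for illustration, not measurements from any real network.

```python
# Illustrative sketch only: the outage figures below are hypothetical.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 s in a non-leap year


def availability_pct(uptime_s, total_s):
    """Availability (%) = uptime / total time x 100."""
    return 100.0 * uptime_s / total_s


def mtbf_s(time_in_service_s, failures):
    """MTBF = total time in service / number of failures."""
    return time_in_service_s / failures


def downtime_budget_s(target_pct, total_s=SECONDS_PER_YEAR):
    """Maximum downtime allowed per period for a given availability target."""
    return total_s * (1 - target_pct / 100.0)


# Assumed scenario: one outage per week, each lasting 5 seconds.
failures_per_year = 52
outage_s = 5
downtime_s = failures_per_year * outage_s
uptime_s = SECONDS_PER_YEAR - downtime_s

avail = availability_pct(uptime_s, SECONDS_PER_YEAR)
budget = downtime_budget_s(99.999)  # five-nines budget, about 5.26 minutes
mtbf_days = mtbf_s(uptime_s, failures_per_year) / 86400

print(f"Downtime per year: {downtime_s} s (five-nines budget: {budget:.2f} s)")
print(f"Availability:      {avail:.5f} %")
print(f"MTBF:              {mtbf_days:.2f} days")
```

Under these assumptions the system stays inside the five-nines downtime budget, yet it fails roughly once a week. The availability figure alone says nothing about how often sessions are dropped, which is why reliability and resiliency have to be considered separately.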

