Stratus Blog

Showing archives for category Telco

Will ExxonMobil revolutionize the automation industry?

5.18.2016 | Industrial Automation, Manufacturing, NFV/SDN, Oil and Gas

ExxonMobil's initiative to modernize its automation systems, and its selection of Lockheed Martin to lead that effort, has generated considerable interest and discussion among automation vendors and the industry at large. Now that vendors have had time to digest the impact of the proposal (a standards-based approach across all levels of an automation system, and in particular the adoption of a standard such as VPX, driven by the FACE initiatives in aerospace), a variety of reactions are appearing in the media.

Vendor responses have ranged from embracing the concept to outright skepticism that mixing and matching elements from different vendors in a shelf could ever become a reality. Much of this is driven by each vendor's market position and what it believes it stands to win or lose as this initiative evolves.

Yet history has important lessons for the skeptics. The enterprise computing industry of the 20th century was rooted in proprietary, closed systems, where a customer was predominantly owned by a single manufacturer. Today it is very much a horizontal, layered industry in which companies compete at their specific layers, and components of a solution can easily be mixed and matched depending on customer preference and perceived value. Think plug and play with disk drives, network cards, operating systems and applications of all kinds. Similarly, the telecom industry has gone through its own evolution from completely proprietary systems to an open environment, with equipment that enables vendor mixing and matching within a shelf. The current initiatives around NFV promise even greater changes.

In both cases the results have been the same: an explosion of innovation, new applications and a competitive environment that has enabled broad adoption at effective price points. Certainly, the vendor landscapes are dramatically different: some older vendors survive and adapt, others fade away, and new players emerge with a fresh approach. Did these evolutions happen overnight?

No, they took years to evolve, sometimes with dead-ends (does anyone remember token-ring as an alternative to Ethernet?) but the long-term results were the same.

In some ways the industrial automation industry is already on this journey. One example is the increasing adoption of industry-standard servers and systems to run automation workloads, whether in small deployments or in place of large DCS implementations. Innovative products exist in this space to provide the redundancy and continuous availability, with no data loss, that are the cornerstone of efficient automation implementations. And applications run very effectively on industry-standard operating systems, many in virtualized environments.

ExxonMobil has taken the first step toward revolutionizing the automation industry. This is a journey that will take time, and the layers of an overall solution at Level 3 and above are already well down that path. It will be exciting to see how the automation industry evolves as it embarks on changes that have re-shaped other industries.


Achieving Instantaneous Fault Tolerance for Any Application on Commodity Hardware

3.8.2016 | High Availability, SLA, Telco

A few weeks ago, Stratus hosted a webinar with Light Reading titled “Achieving Instantaneous Fault Tolerance for Any Application on Commodity Hardware,” aimed at telcos and communications application providers. I was pleasantly surprised at the turnout: hundreds of people were interested in this topic. Here is a brief overview of what we discussed.

Communications networks have always needed high availability and resiliency. As more networking applications, such as SDN controllers and virtualized functions, are deployed on commodity servers rather than proprietary, purpose-built hardware, the need for software-based resiliency and fault tolerance has never been greater. A reliable network depends on its ability to quickly and reliably connect end-points, transfer data and maintain quality of service (QoS). If the network goes down, even for a few seconds, many people can be affected. System failure may not only cause loss of revenue for the provider; it can seriously damage the provider's reputation and trigger penalty payments.

Unplanned server and data center outages are expensive, and the cost of downtime is rising. The average cost of unplanned downtime is $7,900 to $11,000 per minute, depending on which study you believe. With average data center downtime of 90 minutes per year, this translates to a cost of about $1M per year, per data center.
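The arithmetic behind that $1M figure is easy to check; here is a quick sketch using the per-minute figures cited above (the exact rates vary by study):

```python
# Back-of-the-envelope downtime cost, using the per-minute figures
# cited above (the exact rates vary by study).
COST_PER_MINUTE_LOW = 7_900     # USD per minute, lower-bound estimate
COST_PER_MINUTE_HIGH = 11_000   # USD per minute, upper-bound estimate
DOWNTIME_MINUTES_PER_YEAR = 90  # average data center downtime

low = COST_PER_MINUTE_LOW * DOWNTIME_MINUTES_PER_YEAR
high = COST_PER_MINUTE_HIGH * DOWNTIME_MINUTES_PER_YEAR
print(f"Annual cost per data center: ${low:,} to ${high:,}")
# → Annual cost per data center: $711,000 to $990,000
```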

A Highly Available (HA) network ensures that the network and its services are always on and always accessible (service accessibility), and that active sessions are maintained without disruption (service continuity). Five nines (99.999%) availability is the minimum benchmark, meaning that on average the service is never down for more than about five minutes in a one-year period. Yet even a typical HA figure of five nines (99.999%) or six nines (99.9999%), impressive as it sounds, may not be good enough to maintain QoS. Let's look at an example. Consider an application with six nines (99.9999%) of availability: it will not be down for more than 31.5 seconds a year, which may seem impressive. However, if that application failed once a week for just a second and could not return to its original state after each failure, active sessions would likely be disrupted or degraded. Technically, the service may still be up and maintaining its HA metrics, but if active customer sessions experience disruption or degradation in the form of reconnections, lower throughput, higher latency or reduced functionality, this will likely violate the Service Level Agreement (SLA) and result in significant customer dissatisfaction and penalty consequences for the service provider.
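The downtime budgets behind these “nines” figures follow directly from the definition of availability; a minimal sketch:

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000

def downtime_budget(availability: float) -> float:
    """Maximum seconds of downtime per year at a given availability."""
    return (1.0 - availability) * SECONDS_PER_YEAR

print(f"Five nines: {downtime_budget(0.99999):.1f} s/year")   # ~315.4 s (~5.3 min)
print(f"Six nines:  {downtime_budget(0.999999):.1f} s/year")  # ~31.5 s
# A one-second failure every week adds up to 52 seconds per year --
# and the session disruption matters far more than the raw seconds.
```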

So what telcos and communications providers need is more than five or even six nines of availability: they need resilient platforms that can manage faults intelligently and continue service without disruption or degradation in performance, functionality and latency, maintaining the minimum acceptable levels of service defined in the SLA. And since not all applications require the same levels of resiliency, it is important to manage the resiliency SLA based on the type of application and its requirements. This is the difference between traditional HA solutions and resilient, fault-tolerant solutions like everRun from Stratus Technologies.

everRun is a Software Defined Availability (SDA) infrastructure that moves fault management and automatic failover from the applications into the software infrastructure. It provides fully automated and complete fault tolerance for all applications, including fault detection, localization, isolation, service restoration, redundancy restoration and, if desired, state replication, all without requiring application code changes and with dynamic levels of resiliency. This means any application can be instantaneously deployed with high resiliency, multiple levels of state protection and ultra-fast service restoration, on commercial off-the-shelf (COTS) hardware in any network, without the complexity, time-consuming effort and risk of modifying and testing every application. This is why everRun is ideal for communications applications such as video monitoring, network management, signaling gateways, firewalls, network controllers and more.

In the webinar, we discussed the differences between standard HA systems and resilient platforms like everRun, the options for deploying resiliency (in the applications versus in the software infrastructure), a brief overview of everRun, and customer use cases and examples of how everRun is used in the communications space for telco networks and converging industries. To watch the webinar and learn more, please click here.

AT&T Just Gets It

10.15.2015 | NFV/SDN, Telco

I’d promised myself that I’d take a break from discussing cloud and NFV for a while, but against my better judgement, here’s one more…

For the past year or so we have been working on our new cloud product designed to solve the problem of resiliency for NFV based networks. NFV is a daunting task, but the potential is great and it could literally transform the daily lives of everyone.

Without NFV, operators really can't move forward quickly with next-generation networks. Without next-generation networks, innovations in IoT, and even far-out ideas like driverless cars, will be at best stymied and at worst impossible. So everyone agrees that we need NFV; the hard part is getting there:

  1. The technology isn’t there yet – Today’s VNFs are not carrier grade and neither is the underlying NFV infrastructure (something Stratus and others are trying to fix)
  2. The vendor incentives aren’t there – If you are Cisco or Alcatel and see what just happened to EMC, you have to consider how fast you want to change your business model from appliances to software
  3. The operators are still getting ready for this change – There is a lot of legacy and history there and changing people’s mindsets is probably harder than addressing 1 and 2 above

This is why I’m glad to see AT&T acting like an industry titan and making change happen. They have been getting a lot of press lately, but this transformation doesn’t happen overnight; it has been underway for a while, and if you have been keeping an eye on OpenStack, OPNFV and other NFV communities, it should come as no surprise. So, what makes AT&T different?

  1. AT&T are working with those who have incentive to move slowly – They have a vision and a supporting program called Domain 2.0 which is an ecosystem of traditional and disruptive technology companies. They are running PoCs and testing new technologies now to see what works and what doesn’t. And in some instances, they are directly involved in the design of these technologies. Of course the incentive for the vendors involved is a crack at being part of the future vs the past.
  2. AT&T are defining the requirements in an open source way – The biggest roadblock to NFV adoption will be standards. The natural tendency for a VNF provider is to build everything into the VNF, just like they did in the old days. Although that maximizes the vendor’s flexibility, it doesn’t help the operators, who want to build as much into the infrastructure as possible to maximize their flexibility. This is why communities such as OPNFV allow operators to define the infrastructure as a template that everyone can work towards. Who founded OPNFV? Well, it was AT&T.
  3. Lastly, AT&T are getting out there and being up front about what’s working and what’s not. All of the interviews, keynotes and public-facing activity are a means to communicate and demonstrate to other operators what is working. AT&T know NFV will not reach its full potential if other operators don’t buy into it. It’s not about competitive advantage; it’s about creating a next-generation platform upon which competitive advantage will be built.

When I think about the enterprise and my experience in open source, it was always the industry titans that drove everyone forward. Open source middleware would not be what it is today without companies like Geico and NYSE. Linux would not be what it is today without the U.S. Government and a host of others, like Apple, who just announced they were adopting KVM in a big way. All were game changers, and someday I think AT&T will be a game changer for NFV, to the benefit of everyone.

NFV PoC#35 at SDN & OpenFlow World Congress in Dusseldorf, Germany

10.13.2015 | Cloud, Fault Tolerance, High Availability, NFV/SDN, Virtualization

This week I am in Dusseldorf, Germany, showing our ETSI PoC#35, titled Availability Management with Stateful Fault Tolerance. This Proof of Concept demonstrates how virtualized network functions (VNFs) from multiple vendors can be easily deployed in a highly resilient software infrastructure environment that provides complete and seamless fault management to achieve fault tolerance: continuous availability with state protection (remembering the preceding events in a given sequence of interactions) in the event of a system fault or failure.

The results were compelling in that for the first time we have been able to prove a number of things:

  • OpenStack based VIM mechanisms alone are insufficient for supporting carrier grade availability objectives. Baseline functionality is only adequate for supporting development scenarios and non-resilient workloads.
  • All phases of the fault management cycle (fault detection, fault localization, fault isolation, fault recovery and fault repair) can be provided as infrastructure services, using a combination of NFVI- and MANO-level mechanisms to deploy VNFs with varying availability and latency requirements – all without any application (i.e. VNF) level support mechanisms.
  • We also demonstrated that NFVI services can offer a sophisticated VM based state replication mechanism (CheckPointing and I/O StateStepping) to ensure globally consistent state for stateful applications in maintaining both high service accessibility and service availability, without application awareness.

We believe that this is a major step forward in proving that the vision of a carrier grade cloud is viable and a software infrastructure solution is beneficial to both VNF providers and network operators/service providers.

  • For network operators/service providers, it enables the deployment of any KVM/OpenStack application with transparent and instantaneous fault tolerance for service accessibility and service continuity, without requiring code changes in the VNFs.
  • For VNF providers, it reduces the time, complexity and risk associated with adding high availability and resiliency to every VNF.

While there is still much more progress to be made, the very possibility that reliable carrier grade workloads can be maintained will help accelerate the adoption of NFV worldwide. If you’d like to see the details of our PoC, click here. Non-ETSI NFV members can download PDF versions of the PoC Proposal, which describes the testing we performed, and the PoC Report, which describes the findings and results of the testing. If you’d like to know more about the technology Stratus provides to enable these results, check our Cloud Solution Brief and contact Ali_Kafel@Stratus.com for a white paper with more details.

Brief overview of the Stratus Fault Tolerant Cloud Infrastructure

The Stratus Fault Tolerant Cloud Infrastructure provides seamless fault management and automatic failover for all applications, without requiring code changes. Applications do not need to be modified to become redundant and resilient, because the software infrastructure runs every virtual machine (including its application) on two VMs simultaneously — generally on two physical servers. If one VM fails, the application continues to run on the other, and processing switches over automatically, with no interruption or data loss.

Two Key Benefits

Reduce the time, complexity and risk of achieving instantaneous resiliency

  • Seamless and instantaneous fault management and continuous availability for any application, without code changes – includes fault detection, localization, isolation, recovery and repair

Flexibility to deploy multiple levels of availability to suit the application

  • Dynamically specify the availability level at deployment time based on application type – for example, some applications may require globally consistent state at all times, while others may only require an immediate, automatic restart
  • Enables mixed deployments of decomposed control plane elements (CE), which may be state-protected, with forwarding plane elements (FE), which may be stateless, leveraging DPDK and SR-IOV for higher-performance, lower-latency processing

What and how we tested

  • The Stratus Fault Tolerant Cloud Infrastructure conforms to the blue elements in the ETSI NFV reference architecture below

[Figure: ETSI NFV reference architecture (NFV-MGT)]

We showed three configurations:

  1. Unprotected server – shows that upon a system failure, the applications will go down until manually restarted
  2. Highly Available (HA) servers – stateless protection – upon a system failure, the service will go down for a short period but will automatically and immediately be restarted by the software infrastructure
  3. Fault Tolerant (FT) server – stateful protection – upon a system failure, the applications will continue to run without any interruption or loss of state, because the software infrastructure will perform all fault management, state protection (on another server) and automatic failover

One of the VNFs deployed was the Cobham Wireless TeraVM virtualized IP tester, which generated and measured traffic. In this case the traffic was a streaming video, because a failure is easy to see.

The TeraVM is a fully virtualized IP test and measurement solution that can emulate and measure millions of unique application flows. TeraVM provides comprehensive measurement and performance analysis on each and every application flow, with the ability to easily pinpoint and isolate problem flows.

[Figure: PoC test configuration with external OpenStack]

While video traffic was streaming through the system (which includes the firewall and QoS servers) and was visible on each of the three laptops, we simulated a failure in each of the three sets of systems. As expected, the video stream from the unprotected server stopped and never recovered. The HA system's stream stopped and restarted after a few seconds. The FT system's stream continued without any loss of traffic!

Calculating Your Cloud ROI

8.27.2015 | Cloud, NFV/SDN, Virtualization

So you’re sold on the advantages of cloud services—the flexibility, agility and “always on” business models they enable. But what’s the return on investment?

The fact is, calculating ROI on cloud services is challenging. The abstract nature of the cloud doesn’t lend itself to a simple matter of addition or subtraction. It’s not like the business case for virtualization, where you simply add up all the servers you didn’t have to buy, operate and maintain.

On the investment side of the ledger, there are costs associated with moving to the cloud, especially if you decide to build your own. The good news is that you can mitigate these costs by using open source technologies like OpenStack and KVM (kernel-based virtual machine) to eliminate the expense of software licenses. Or you can dramatically reduce up-front costs by going with a subscription cloud model, paying as you go.

Adding Up the Returns

So what are the potential returns from moving to the cloud?

For starters, productivity in the cloud is tremendous. It’s much easier and faster to build and deploy apps in the cloud. Deploying more apps in less time means you can either clear your development backlog faster or reduce your overall development costs.

The cloud also improves automation, enabling even greater density in your virtualized environment. Typical server virtualization density today is around 9x or 10x. The additional automation delivered by cloud services could take that density to 12x, 13x or even more. For larger environments, the financial impact could be significant.
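To get a feel for that financial impact, here is a tiny sketch with purely hypothetical numbers (the 1,000-VM fleet is an assumption, not a figure from this post):

```python
import math

# Hypothetical fleet: how density changes the physical server count
# for a fixed VM population (numbers are illustrative only).
vms = 1000
for density in (10, 13):  # VMs per physical server
    servers = math.ceil(vms / density)
    print(f"{density}x density -> {servers} servers")
# → 10x density -> 100 servers
# → 13x density -> 77 servers
```

Twenty-three fewer servers to buy, power and maintain for the same workload is where the savings come from.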

The advantages go beyond computing. The next big cloud opportunities are in networking and storage—two areas dominated by systems built on proprietary hardware. Technologies like software-defined networking (SDN) and network function virtualization (NFV) let you run enterprise-class (and telco-grade) workloads on low-cost commodity hardware. That is a real game changer. While SDN and NFV are not, strictly speaking, “cloud” technologies, they are often employed as part of a cloud migration strategy—leading to some compelling financial benefits.

The Value of Competitive Advantage

But reducing costs is just part of the equation. Perhaps the greatest potential return lies in what the cloud enables your business to do that it couldn’t do before.

The cloud’s agility and productivity allows you to respond faster to market opportunities. You can create new services and business models and extend your reach to attract and retain customers more effectively—and more cost-effectively, because of the cloud’s ability to deliver services at scale.

Say your enterprise delivers a service at different levels—Bronze, Silver and Gold—reflecting the added cost of delivering the higher-level service. What if the cloud allowed you to offer all of your customers a higher level of service at a lower cost? What impact could that have on your business?

Cloud’s “Killer App”

Perhaps the final entry in the “return” column has to do with the cloud’s potential as an enabling technology. Every IT paradigm shift has had its “killer app”—for the cloud, I believe it is Big Data. Imagine you’re a retailer and you want to make sure the right product is on the right shelf at the right time. The ability to deploy Big Data analytics on cloud platforms is the key to solving those kinds of complicated problems, driving real business advantage and ROI.

Moving to the cloud requires a different approach to calculating return on investment. For enterprises focusing only on short-term costs and traditional metrics, deploying cloud apps may or may not add up. But for organizations that value things like business agility, development productivity, customer retention, and market leadership, the business case becomes far more compelling.

Carrier Grade Clouds

8.5.2015 | Cloud, Fault Tolerance, NFV/SDN, Telco

When someone in the technology industry throws around the term “carrier grade,” it suggests the highest bar for reliability, availability and resiliency. A carrier grade network is such a high bar that it is actually enforced by law in some countries. So it is more than interesting that telcos are looking to provide network functions virtualization (NFV) in OpenStack-based clouds. These telcos envision a future where the cloud itself will be carrier grade and ready for the job done by physical equipment today. This is still at an early, semi-visionary stage, but there is continued progress on many fronts.

ETSI (the European Telecommunications Standards Institute) is a telecommunication industry standards body at the forefront of defining and standardizing this vision of carrier grade clouds. Members of this body nominate various Proof of Concept (PoC) demonstrations to investigate, conceptualize and ultimately provide standards for achieving that vision. In May, Stratus, along with our partners, embarked on a PoC to address what may be the single biggest barrier to making a carrier grade cloud a reality. Today we are happy to announce the results of that PoC.

Our PoC, titled Availability Management with Stateful Fault Tolerance, demonstrates how virtualized network functions (VNFs) from multiple vendors can be easily deployed in a highly resilient software infrastructure environment that provides complete and seamless fault management to achieve high availability: the VNFs keep running and keep state (remembering the preceding events in a given sequence of interactions with a user) in the event of a system fault or failure.

The results were compelling in that for the first time we have been able to prove a number of things:

  • OpenStack based VIM mechanisms alone are insufficient for supporting carrier grade availability objectives. Baseline functionality is only adequate for supporting development scenarios and non-resilient workloads.
  • All phases of the fault management cycle (fault detection, fault localization, fault isolation, fault recovery and fault repair) can be provided as infrastructure services, using a combination of NFVI- and MANO-level mechanisms to deploy VNFs with varying availability and latency requirements – all without any application (i.e. VNF) level support mechanisms.
  • We also demonstrated that NFVI services can offer a sophisticated VM based state replication mechanism (CheckPointing and I/O StateStepping) to ensure globally consistent state for stateful applications in maintaining both high service accessibility and service availability, without application awareness.

We believe that this is a major step forward in proving that the vision of a carrier grade cloud is viable and a software infrastructure solution is beneficial to both VNF providers and network operators/service providers.

  • For network operators/service providers, it enables the deployment of any KVM/OpenStack application with transparent and instantaneous fault tolerance for service accessibility and service continuity, without requiring code changes in the VNFs.
  • For VNF providers, it reduces the time, complexity and risk associated with adding high availability and resiliency to every VNF.

While there is still much more progress to be made, the very possibility that reliable carrier grade workloads can be maintained will help accelerate the adoption of NFV worldwide. If you’d like to see the details of our PoC, click here. Non-ETSI NFV members can download PDF versions of the PoC Proposal, which describes the testing we performed, and the PoC Report, which describes the findings and results of the testing. If you’d like to know more about the technology Stratus provides to enable these results, check our Cloud Solution Brief and contact Ali_Kafel@Stratus.com for a white paper with more details.

Lastly, we did not do all of this work ourselves; many partners were involved, as well as our industry sponsors. We’d like to extend our thanks to everyone who helped us achieve this great result.

Key Takeaways from NFV World Congress: Part 2

5.27.2015 | NFV/SDN, Telco

Earlier this month (May 5 to 8), I was invited to speak at the NFV World Congress in San Jose on “Ensuring Availability and Resiliency in NFV.” This is Part 2 of my blog series covering the five key takeaways from the event. This time I will discuss how:

  • Carriers are looking for a lot more than just cost savings from NFV
  • Resiliency and reliability in NFV is a hot topic that every telco cares about

 4 | Carriers are looking for a lot more than just cost savings from NFV

There is no doubt that the telcos are looking to adopt virtualization in order to make their networks more efficient and ultimately save money. According to Toby Ford, AT&T’s area vice president for cloud technology strategy, since the debut of the iPhone in 2007, data traffic over AT&T’s network has increased 100,000 percent, straining the company’s sprawling infrastructure to the limits. As a result, AT&T needs to provide 300-400 percent more capacity every three years, with the same or less spend each year.

Beyond cost savings, the telcos are looking at three other key benefits from NFV.

Common Ground: As stated in Part 1 of this blog – this is a culture change. Rather than developing their own solutions or getting their vendors to provide them with unique technologies, telcos now believe it is better to leverage the community, not just to reduce costs, but also to reduce risks. “The more we converge, the more we don’t fail,” said Margaret Chiosi, distinguished network architect at AT&T Labs and OPNFV’s president.

An End to Vendor Lock-In: The days of just sticking with the traditional vendors are long gone as telcos have learned that innovation often comes from others.

Agility: “The No. 1 opportunity is really agility. To turn up some circuit across the country, it should be minutes instead of months,” said Bryan Sullivan, director of service standards for AT&T. “NFV is going to make it much more possible to gain new partners and bring in new services from new suppliers very quickly.”

5 | Resiliency & reliability in NFV is a hot topic for every telco

In almost every session at NFV World Congress, there was something to be said about Availability, Reliability or Resiliency. Although there is a working group in OPNFV on availability, it is still an open area of concern. During the sessions, several people used these terms interchangeably as if they were synonymous with one another. The fact is, although related, they have different meanings.

Availability is the percentage of time a piece of equipment (or a network) is in an operable state (i.e. information or resources can be accessed). Calculated as % Availability = Uptime / Total time.

Reliability is how long a system performs its intended function before it fails. It is typically measured as Mean Time Between Failures (MTBF) = total time in service / number of failures.

Resiliency is the ability to recover quickly from failures and return to the original form or state from just before the failure.

Therefore, a Highly Available (HA) system may not be Highly Reliable (HRel) or Highly Resilient (HRes). For example, five nines (99.999%) availability means the equipment (or network) is never inaccessible for more than about 5 minutes in a one-year period, which is pretty darn good. But what happens if every week it fails for just a few seconds and in the process forces every session to be re-initialized? You now have a highly available system with poor reliability and resiliency – and that is bad. Telcos that understand the meaning of carrier-grade appreciate stateful Fault Tolerance, which provides HA, HRel and HRes. When a fault happens, the applications and “system” continue to run (in a secondary entity) with the same state. That is what it means to be “carrier-grade.”
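The distinction between the three metrics can be made concrete with the formulas above; here is a minimal sketch, using hypothetical failure numbers in the spirit of the weekly-failure example:

```python
def availability_pct(uptime_hours: float, total_hours: float) -> float:
    """% Availability = Uptime / Total time."""
    return 100.0 * uptime_hours / total_hours

def mtbf_hours(total_service_hours: float, failures: int) -> float:
    """MTBF = total time in service / number of failures."""
    return total_service_hours / failures

# Hypothetical system: in service one year (8,760 h), failing once a
# week (52 failures), each outage lasting about 5 seconds.
total = 8760.0
downtime = 52 * 5 / 3600  # ~0.072 hours of downtime in the year

print(f"Availability: {availability_pct(total - downtime, total):.4f}%")
print(f"MTBF: {mtbf_hours(total, 52):.0f} hours")
# → Availability: 99.9992%  (better than five nines)
# → MTBF: 168 hours         (it fails every week)
```

Five nines on paper, yet every active session is dropped 52 times a year: high availability with poor reliability and resiliency.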

That ends my brief summary of my key take-aways from NFV World Congress. More detail can be found on this blog posted on my LinkedIn page.

 

Key Takeaways from NFV World Congress: Part 1

5.21.2015 | NFV/SDN, Telco

Earlier this month (May 5 to 8), I was invited to speak at the NFV World Congress in San Jose, attended by about 1,000 people. Although this is a relatively small group compared to the likes of Mobile World Congress, it was a very well-targeted group of thought leaders and industry influencers who are working specifically on SDN and NFV in the carrier space. Thought leaders from leading service providers, vendors, industry groups and analysts shared their opinions and research on this market. My talk – “Ensuring Availability and Resiliency in NFV” – focused on how to seamlessly bring resiliency and reliability to NFV in a KVM / OpenStack environment.

There was a significant amount of debate and discussion at the event. Rather than covering all of the topics that were discussed, I will summarize my top takeaways in a two part blog series. In this blog, I will talk about how:

  1. The first release of Open Source NFV, OPNFV, is just around the corner.
  2. The culture of carriers is starting to change.
  3. Approved industry PoCs continue to be important.

 1.  The first release of Open Source NFV, OPNFV, is just around the corner

Seven months ago, the Linux Foundation founded OPNFV (Open Platform for NFV Project Inc.) with the goal of delivering a carrier-grade, open source, hardware-agnostic platform architecture, with software that can run in different environments, to speed up NFV deployments. We learned that Release 1 of the OPNFV platform, dubbed “Arno,” includes OpenStack, the KVM open source hypervisor, OpenDaylight and Open vSwitch, and will be released “later in the spring,” according to OPNFV director Heather Kirksey. I asked how the organization makes sure commitments are kept by volunteers who generally have other full-time jobs. The answer is essentially peer pressure, respect and the need to build one’s reputation and earn the trust of peers. In addition to Arno, OPNFV has more than ten projects in the pipeline, with names such as “Doctor” for fault management and “Availability” for high availability.

2.  The culture of the carriers is starting to change

It used to be that carriers such as AT&T would only do business with large vendors such as Cisco and Alcatel-Lucent and would only deploy technologies that were “proven”. While carrier-grade reliability and resiliency are still important, carriers are starting to become more innovative and therefore more willing to work with non-traditional technologies and vendors.

Margaret Chiosi, distinguished network architect at AT&T Labs, shared her insights during her talk on Wednesday.

She laid out a network architecture model based on Linux, OPNFV, ETSI ISG NFV, OpenStack, KVM, OVS (Open vSwitch), DPDK (Data Plane Development Kit) and multiple SDN controllers. Although AT&T is using the OpenDaylight SDN controller as a framework for their own global controller, they are also looking at other SDN controllers, such as OpenContrail for local networks and ONOS for nodal network controllers. The objective of this model is to be able to move fast. This differs from the past, when it might have taken years for an idea to emerge from the architecture team, and longer still to implement it. The new AT&T model is to produce code immediately by leveraging the community and adding their own extensions, find what’s not working, and make revisions quickly.

This approach was echoed by the other telcos, including Luigi Licciardi, a Telecom Italia vice president in charge of technology planning and standards. He said, “Our world is a world that is based on old issues, so we have to renew our culture, in terms of developing, understanding, and using software in the proper way.”

3.  Approved industry PoCs continue to be important

At the end of 2014, there were 26 approved ETSI PoCs. By May 5, 2015 (the first day of the NFV World Congress), there were a total of 35 approved PoCs (including one led by Stratus Technologies on “Availability Management with Fault Tolerance”). During the event, 23 of these PoCs were demonstrated. These PoCs gave attendees a unique opportunity to gain first-hand knowledge and insight about this critical technology – and the current reality of that technology – in order to strengthen a telco’s strategic planning and decision-making and help them identify which NFV solutions may be viable in their networks. They were also a great way to “show and tell” and collaborate with other NFV players. Stratus’ PoC was designed to alleviate one of the main service provider concerns as they virtualize their networks. The availability management and stateful fault tolerance ETSI NFV PoC, sponsored by AT&T and NTT, demonstrated how to protect VNFs in a variety of availability modes, including Fault Tolerant (FT), High Availability (HA), and General Availability (GA). We showed that VNFs can be seamlessly protected, without code change, by running in a VM on fault-tolerant software infrastructure on commodity hardware.

That’s it for Part 1. Keep an eye out for my final two takeaways in Part 2, coming soon. If you want to hear more about resiliency and reliability in NFV, sign up for an informational webinar with Stratus and TMCnet on May 26, 2015 at 2pm EDT.

Why ETSI and OP-NFV Impact Everyone in Cloud (Not Just Telcos)

5.6.2015Cloud, TelcoBy:  

Yesterday we announced that Stratus is in the process of delivering a proof of concept (PoC) for the European Telecommunication Standards Institute (ETSI) – you can read the press release here. This is a pretty big deal in the telco and NFV world, but I think ETSI PoCs like this have a much more horizontal impact than you might think.

Naturally, we and our sponsors are excited and see a major opportunity for our solutions among telco operators. But in this post, I’m going to address the big picture: why ETSI (and, in some ways by extension, OP-NFV) is very important to anyone interested in clouds.

  1. ETSI and OP-NFV are user-led – This is a very important distinction in ecosystems. And before people misconstrue what I am saying, let me clear this up. I believe most ecosystems in IT have a user element and often even user members and contributors. But to be user-led is different. Let me give an example – the OpenStack Summit is about to kick off in the next couple of weeks, and while the effort, progress and enthusiasm are all great, frankly, it’s taking too long to get to maturity. I believe there is a direct correlation between the progress of OpenStack and the fact that the foundation that governs it is almost 100% vendor-led. Vendors are incented to monetize the output of OpenStack, and that leads to a broadly scoped solution and inevitably lower standards for enterprise readiness (otherwise, what would the vendors sell to the users?). A user-led community is better equipped to drive interoperability standards and prioritize real-world adoption over who claims what revenues.
  2. Telcos care less about compute and more about networking – This one is BIG. Technically, the computing aspects of cloud are pretty well worked out, but outside of the mega public cloud guys, the networking parts are not as mature yet. This is coming along really quickly, but ETSI and OP-NFV are going to push very hard on the industry to get it done well, and to support the most demanding use cases. By extension, these innovations and learnings will trickle down into industries where networking may not have to be carrier grade.
  3. Solid standards are the key to adoption – One of the myths of clouds is that any one organization will build one cloud and/or manage multiple clouds with one orchestrator. That’s just fiction. There never has been one tool to solve any problem, and the notion that one tool can do everything this time around is silly. At the end of the day, the management of however many clouds you have will be a more layered approach. There may be one master console, but different underlying services or users will automate and drive the cloud in different ways. The layering of management services is interesting and different because it enables more flexibility. In fact, Cisco’s Intercloud offering has a good view on this. But for that approach to work, good interoperability standards are mandatory. ETSI gets this and sees it as a gap. I’m hoping to see more ETSI PoCs focus on this area so at least one or two verticals can standardize and everyone can move forward.

Truthfully, until you have good interoperability standards and a set of users being open about what they really need, no technology crosses over from early adoption to mainstream use. That’s not to belittle what’s been done so far. In fact, it’s a reflection that good things have been done, and now cloud technology is viable enough for users and user groups to invest in helping take it to the next level. Which is good for everyone.

Exciting times ahead.

The “Cloudification” of Telco: Part Two

3.27.2015Cloud, NFV/SDN, TelcoBy:  

In my previous blog post, I discussed why mobile and broadband communications service providers (CSPs) will move toward “cloudification”—and the first two steps down that road: network functions virtualization (NFV) and Virtualized Resilience Layer. In this post, I’ll examine the advanced steps that could transform the telco space in even more fundamental ways.

Step 3: Contextual Network Analysis

In the course of serving subscribers, CSPs accumulate a lot of data about those subscribers. This includes information about their devices, their usage patterns, service plans, geographic locations, contacts, purchase histories, and more. In addition to this “internal” data are subscriber insights available from social media and other online sources. Contextual Network Analysis is all about combining this data to create massive repositories of information, then analyzing this Big Data to leverage even more value from the CSP’s network.

This added value could be in the form of highly personalized opt-in ads or offers, or service recommendations—all delivered within the context of each subscriber’s individual patterns. This capability could also open the door to third-party partnerships to deliver value-added services that generate new revenue streams while keeping subscribers stuck to the CSP like glue.

This kind of data analysis is already practiced in other business sectors. What’s new and exciting is the idea of integrating network-derived intelligence to give CSPs a powerful, new arrow in their quiver.

Step 4: Thinking Networks

What’s the endgame of this march to cloudification? I believe the final step will be taking telco networks to an even higher level of automated intelligence. Such a “thinking network” will have a high degree of software-defined intelligence across all of the CSP’s central offices. The result is a comprehensive, 360-degree view of the entire network and the CSP’s subscribers. This intelligent network will process all this information in real time, adapting dynamically to changing activity. The thinking network is a learning network, analyzing a variety of network activity data to predict what’s needed, precisely where and when it’s needed.

The result is an optimized subscriber experience where the network gets to “know” what subscribers want. Crucially, by allocating network resources in a “just in time” manner, the thinking network also optimizes utilization of bandwidth, maximizing operational efficiency and service provider profitability.

Meeting the Availability Standard

This is all pretty exciting stuff for CSPs plotting their strategy for future profitability. But there are technical hurdles that must be overcome. First and foremost is the need to ensure extreme availability. In the telco world, “five nines” availability is the standard. Rapid recovery from faults isn’t enough; subscriber applications must be able to maintain their state, no matter what. That means they must be able to “remember” the preceding events in a given sequence of user interactions and pick up immediately where they left off in the event of a fault. Failing to maintain stateful availability results in dropped calls and interrupted access to services. And that leads to subscriber churn and lost revenue.
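
To put “five nines” in concrete terms, the downtime budget falls directly out of the availability percentage. A quick sketch of the arithmetic (the labels are just for illustration):

```python
# Annual downtime allowed at a given availability level.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # ~525,960 minutes

def downtime_minutes(availability: float) -> float:
    """Minutes of downtime per year permitted at the given availability (0-1)."""
    return (1 - availability) * MINUTES_PER_YEAR

for label, a in [("three nines", 0.999),
                 ("four nines", 0.9999),
                 ("five nines", 0.99999)]:
    print(f"{label}: {downtime_minutes(a):.1f} minutes/year")
# five nines works out to roughly 5.3 minutes of downtime per year
```

At five nines, the entire annual outage budget is about five minutes, which is why rapid recovery alone is not enough.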

The good news is that achieving stateful availability in low-cost cloud environments is now possible. A new generation of software-defined availability (SDA) technologies captures the state of the primary system at regular intervals and applies it to a secondary standby host. In the event of a primary host fault, the secondary can pick up execution starting from the most recent statepoint without losing any data. All of this is completely transparent to the subscriber.
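
For illustration only, here is a toy sketch of that checkpoint-and-replicate pattern in Python. The class names and checkpoint interval are invented, and real SDA implementations work at the hypervisor level rather than inside the application, but the flow is the same: the primary periodically ships a statepoint to the standby, and on failure the standby resumes from the last one received.

```python
import copy

class Standby:
    """Standby host holding the most recently replicated statepoint."""
    def __init__(self):
        self.statepoint = None

    def apply_statepoint(self, state):
        self.statepoint = state

    def take_over(self):
        # On primary failure, resume from the last replicated statepoint.
        return self.statepoint or {"counter": 0}

class Primary:
    """Toy primary that mutates state and checkpoints it every N events."""
    def __init__(self, standby, checkpoint_every=3):
        self.state = {"counter": 0}
        self.standby = standby
        self.checkpoint_every = checkpoint_every
        self.events = 0

    def handle_event(self):
        self.state["counter"] += 1
        self.events += 1
        if self.events % self.checkpoint_every == 0:
            # Ship a consistent snapshot of the state to the standby.
            self.standby.apply_statepoint(copy.deepcopy(self.state))

standby = Standby()
primary = Primary(standby)
for _ in range(7):           # 7 events; statepoints taken after events 3 and 6
    primary.handle_event()
# Primary "fails" here; the standby resumes from the last statepoint
# (counter == 6), losing only the single event since that checkpoint.
resumed = standby.take_over()
print(resumed)               # {'counter': 6}
```

The gap between statepoints is exactly the window of work at risk, which is why production fault-tolerant systems checkpoint frequently, or replicate synchronously, to shrink that window toward zero.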

The key to this breakthrough is taking availability out of the application layer, enabling any application to receive its required availability level in the cloud, with application transparency. Stratus is leading the way in making this a reality in cloudified telco networks.

The road to telco cloudification represents an exciting opportunity for forward-looking CSPs. And Stratus is working to pave the way to the cloud for forward-looking telcos ready to seize the first-mover advantage.

