Tuning for Zero Packet Loss in Red Hat OpenStack Platform – Part 1

For Telcos considering OpenStack, one of the major areas of focus is network performance. While the performance discussion often begins with throughput numbers expressed in millions of packets per second (Mpps) across Gigabit-per-second (Gbps) hardware, that is really only the tip of the performance iceberg. The most common requirement is stable, deterministic network performance (Mpps and latency) rather than the fastest possible raw throughput. With that in mind, many applications in the Telco space require low latency and can tolerate zero packet loss.
In this “Operationalizing OpenStack” blog post, Federico Iezzi, EMEA Cloud Architect with Red Hat, discusses some of the real-world deep tuning and processes required to make zero packet loss a reality!

Packet loss is bad for business …
Packet loss occurs “when one or more packets of data travelling across a computer network fail to reach their destination [1].” Packet loss adds protocol latency, because a lost TCP packet must be retransmitted, and retransmission takes time. Worse, protocol latency manifests externally as application delay, and application delay is just a fancy term for something every Telco wants to avoid: a fault. So, as network performance degrades and packets drop, retransmission occurs at higher and higher rates. The more retransmission, the more latency, and the slower the system gets. The extra packets generated by retransmission also increase congestion, slowing the system even further.

Tune in now for better performance …
So how do we prepare OpenStack for Telco? 
It’s easy! Tuning!

Red Hat OpenStack Platform is backed by a detailed Network Functions Virtualization (NFV) Reference Architecture that offers deep tuning guidance across multiple technologies, from Red Hat Enterprise Linux to Intel’s Data Plane Development Kit (DPDK). A great place to start is the Red Hat Network Functions Virtualization (NFV) Product Guide. It covers tuning for the following components:

Red Hat Enterprise Linux 7.3
Red Hat OpenStack Platform 10 or greater
Data plane tuning:
    Open vSwitch with DPDK (version 2.6 or later)
    SR-IOV (VF or PF)
System partitioning through Tuned, using the cpu-partitioning profile (version 2.8 or later)
Non-uniform memory access (NUMA) and virtual non-uniform memory access (vNUMA)
General OpenStack configuration

Hardware notes and prep …
It’s worth mentioning that achieving zero packet loss often requires latest-generation hardware. Decisions around network interface cards and vendors can affect packet loss and tuning success, so be sure to consult your vendor’s documentation before purchase to ensure the best possible outcome. Ultimately, regardless of hardware, some setup should be done in the BIOS/UEFI to keep the CPU frequency stable and to remove power-saving features.

| Setting | Value |
| --- | --- |
| MLC Streamer | Enabled |
| MLC Spatial Prefetcher | Enabled |
| Memory RAS and Performance Config | Maximum Performance |
| NUMA Optimized | Enabled |
| DCU Data Prefetcher | Enabled |
| DCA | Enabled |
| CPU Power and Performance | Performance |
| C6 Power State | Disabled |
| C3 Power State | Disabled |
| CPU C-State | Disabled |
| C1E Autopromote | Disabled |
| Cluster-on-Die | Disabled |
| Patrol Scrub | Disabled |
| Demand Scrub | Disabled |
| Correctable Error | 10 |
| Intel(R) Hyper-Threading | Disabled or Enabled |
| Active Processor Cores | All |
| Execute Disable Bit | Enabled |
| Intel(R) Virtualization Technology | Enabled |
| Intel(R) TXT | Disabled |
| Enhanced Error Containment Mode | Disabled |
| USB Controller | Enabled |
| USB 3.0 Controller | Auto |
| Legacy USB Support | Disabled |
| Port 60/64 Emulation | Disabled |

BIOS Settings from:

Open vSwitch with DPDK
KVM4NFV Test Environment
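
Once the system boots, it’s worth confirming from Linux that the power-management settings actually took hold. Here is a minimal sketch using standard cpufreq/cpuidle tooling; the exact sysfs paths and available tools depend on your drivers and kernel version:

```
# Spot-check that BIOS/UEFI power saving is really off (run as root).
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor  # expect "performance"
cpupower idle-info                  # which C-states the kernel can still enter
cpupower frequency-info             # current governor and frequency limits
turbostat sleep 5                   # observed core frequencies and C-state residency
```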

Divide and Conquer …

Properly enforcing resource partitioning is essential to achieving zero packet loss: resources must be divided correctly between the host and the guests. System partitioning ensures that software running on the host always has access to dedicated hardware. However, partitioning goes further than access to hardware, as it can also ensure that workloads use the closest possible memory addresses across all the processors.

When a CPU retrieves data from a memory address, it first looks in the cache of the local processor core. Proper partitioning, via tuning, ensures that requests are answered from the closest cache (L1, L2 or L3) as well as from local memory, minimizing transaction times and traffic over point-to-point processor interconnects such as Intel QuickPath Interconnect (QPI). This model of dividing and accessing memory is known as NUMA (non-uniform memory access).
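
Before partitioning anything, you need to know the host’s NUMA layout. A quick sketch using standard Linux tools (numactl must be installed; the output naturally varies by machine):

```
# Show NUMA nodes, which CPUs and how much memory belong to each node,
# and the relative access cost between nodes.
numactl --hardware

# Summarize the NUMA node to CPU mapping.
lscpu | grep -i numa

# Relative distances from node 0 to every node (lower is closer).
cat /sys/devices/system/node/node0/distance
```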
Tuned in …
System partitioning involves a lot of complex, low-level tuning. So how does one do this easily?
You’ll need the tuned daemon along with the accompanying cpu-partitioning profile. Tuned is a daemon that monitors the use of system components and dynamically tunes system settings based on that monitoring information. It ships with a number of predefined profiles for common use cases. For all of this to work, you’ll need the newest tuned features: the latest version of tuned (2.8 or later) as well as the latest cpu-partitioning profile (2.8 or later). Both are publicly available via the Red Hat Enterprise Linux 7.4 beta release, or you can grab the daemon and profiles directly from their upstream projects.
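
As a concrete sketch, here is roughly what enabling the profile looks like on Red Hat Enterprise Linux 7.4. The isolated core list is an example only; choose cores that match your NUMA layout and your PMD/VNF plan:

```
# Install tuned and the cpu-partitioning profile (tuned >= 2.8).
yum install -y tuned tuned-profiles-cpu-partitioning

# Example: keep cores 0-1 for the host OS, isolate cores 2-19 for PMDs/VNFs.
cat > /etc/tuned/cpu-partitioning-variables.conf <<'EOF'
isolated_cores=2-19
EOF

# Activate the profile; a reboot is needed so boot-time kernel
# parameters set by the profile take effect.
tuned-adm profile cpu-partitioning
reboot
```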
Interested in the latest generation of Red Hat Enterprise Linux? Be the first to know when it is released by following the official Red Hat Enterprise Linux Blog!

However, before any tuning can begin, you must first decide how the system should be partitioned.
Based on Red Hat’s experience with customer deployments, we usually find it necessary to define how the system should be partitioned for each specific compute model. In one customer example, the total number of PMD cores – one CPU core provides two CPU threads – had to be carefully calculated from the overall required Mpps as well as the total number of DPDK interfaces, both physical and vPort. An unbalanced ratio of PMD threads to DPDK ports results in lower performance and in interrupts that generate packet loss. The remaining cores were dedicated to the VNF threads, reserving at least one core per NUMA node for the operating system.
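
For illustration, this is the kind of knob that PMD calculation feeds into. A hedged sketch assuming Open vSwitch 2.6 or later with DPDK; the masks are example values for a hypothetical core plan, not a recommendation:

```
# Enable DPDK support in OVS (takes effect when ovs-vswitchd restarts).
ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true

# pmd-cpu-mask: hex bitmask of host CPUs that run PMD threads.
# Here: CPUs 2,3 (NUMA 0) and 22,23 (NUMA 1) -> bits 2,3,22,23 = 0xC0000C.
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xC0000C

# dpdk-lcore-mask: CPUs for non-PMD DPDK housekeeping threads (CPU 0 here).
ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x1
```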

Looking for more great ways to ensure your Red Hat OpenStack Platform deployment is rock solid? Check out the Red Hat Services Webinar Don’t fail at scale: How to plan, build, and operate a successful OpenStack cloud today! 

Looking at the upstream templates as well as the tuned cpu-partitioning profile itself, there is a lot to understand about the specific settings applied to each core on each NUMA node.
So, just what needs to be tuned? Find out more in Part 2 where you’ll get a thorough and detailed breakdown of many specific tuning parameters to help achieve zero packet loss!

The “Operationalizing OpenStack” series features real-world tips, advice and experiences from experts running and deploying OpenStack.
Source: RedHat Stack

OpenShift Online Pro tier is here

Today we announce the general availability of the next generation of the Red Hat OpenShift Online Pro tier (OpenShift Online 3). OpenShift Online 3 has been re-engineered on the same powerful and agile technology as Red Hat OpenShift Container Platform and is one of the first multi-tenant cloud application platforms powered by docker-format containers and Kubernetes-based container orchestration. The OpenShift Online 3 Pro tier provides additional resources, enhanced features, and global availability for professional projects.
Source: OpenShift

Mobile app helps visitors navigate a 1000-year-old Swedish castle

Wenngarn is a small village community with about 500 residents, located 30 minutes outside of Stockholm, Sweden. The homes are built around the Wenngarns slott (Wenngarn Castle), and the community is centered on the baroque castle just like it was during the Middle Ages.
Visitors to Wenngarn can enjoy the beautiful grounds and rich history of the area, as well as use the facilities for meetings or events. On the grounds are restaurants, nurseries, a hotel, a recreation center and even a micro-brewery. The castle and a nearby café are open to the public every day.
How do thousands of monthly visitors discover all that this welcoming landmark has to offer?
That is the question the Wenngarn Group, the organization responsible for the complex’s extensive development, had to answer. The group wants to ensure that it delivers the best possible guest experience.
Theme-park origin
When I visited a large theme park in the United States with my family, finding our way around was a challenge. That is, until one park representative showed us a mobile app that could help us find restaurants, locate other attractions and see the length of lines for rides in real time.
The app transformed what might otherwise have been a stressful visit into a great experience. It became clear that Wenngarn Group could use the same concept at the Wenngarn Castle and grounds.
Wenngarn Group set a goal to enable visitors and residents to use their mobile phones to do everything from opening their hotel room doors to making restaurant reservations.
To help turn its vision into a reality, Wenngarn Group engaged an expert team from IBM Business Partner Sisyfos Digital.
Powered by Bluemix
Sisyfos Digital recommended IBM Bluemix technology as the foundation for the new app. Wenngarn Group was confident that Bluemix was the right fit because it met the group’s needs for speed and scalability. The group wanted to deliver the app on both iOS and Android, and with Bluemix it got a system of pre-built components that could be combined quickly to build, test and deploy working applications with minimal complexity.
Wenngarn Group launched the mobile app in just eight months. Visitors, residents and staff alike can use the app for different tasks.
Visitors can learn more about the castle and its culture and history with the app’s virtual map of the Wenngarn complex. They can also find parking, cafés and museums.
Residents and hotel guests can use the app to open the door to the gym or conference center.
Service workers can see which hotel rooms are ready to be cleaned, as well as any maintenance requests from other app users who noticed something not working properly.
The information in the app can be updated using the back-office management feature, so changes can be made without the need to involve a developer.
Better service with technology
Wenngarn Group has optimized hospitality at the castle and estate, because the app has enabled better service. It makes the experience richer, more personal and more customized for users, and helps them make the most out of the complex. The app improves user satisfaction and inspires repeat business.
One next step for the app is adding payment functionality, enabling people to make purchases with their phones, further streamlining visitor experiences at the Wenngarn Castle and complex.
Read the case study for more about this story.
Learn how IBM Cloud and Bluemix help you deliver engaging digital experiences on any platform.
The post Mobile app helps visitors navigate a 1000-year-old Swedish castle appeared first on Cloud computing news.
Source: Thoughts on Cloud

Introducing IBM Cloud Automation Starter Library

Some IT operators have trouble keeping up with increasing demand for more capabilities and higher efficiency. Many lack access to the skilled resources needed to automate all the tasks necessary to keep up with the faster pace.
There is no standardized set of automation instructions that fits every cloud platform. Creating end-to-end automation that can deploy and manage workloads can be difficult to do with existing tools. It can get complicated and expensive—and you might face the risk of losing control of workloads. How can IT operators handle these challenges?
Introducing the IBM Cloud Automation Starter Library. The starter library contains automation building blocks that help IT operators deploy a select set of cloud-native payloads. The library is delivered as a service at no cost through IBM Cloud Automation Manager.
The automation building blocks deploy components on any cloud supported by IBM Cloud Automation Manager. They can be further customized using IBM-recommended tooling and guidelines. Users can receive library updates by synchronizing their repositories with the Cloud Automation Manager Hub.
Some of the automated deployments you can perform include:

MEAN stack
LAMP stack
Kubernetes cluster with NGINX
Three-tier Strongloop

You can take back control of your multicloud environment, save time, and always stay up-to-date. Our starter library will help you accelerate your cloud-native workload deployments. Get started at no cost today. Learn more about IBM Cloud Automation Libraries on our website.
If you’d like to go behind-the-scenes for a deeper look at what Cloud Automation Manager can do for your business, check out this blog post.
The post Introducing IBM Cloud Automation Starter Library appeared first on Cloud computing news.
Source: Thoughts on Cloud

eZanga outsmarts the bots with IBM Cloud

According to a recent CNBC study, in 2016 alone, 20 percent of total spending on digital advertising was wasted. The study predicts brands will lose about $16 billion globally to online advertising fraud in 2017.
Online marketing firm eZanga was founded in 2003 to combat fraud traffic so advertisers and publishers could thrive without cutting into their budgets. The company focuses on simplifying solutions to identify real or fake users—fake users being bots or human ad fraud—to anticipate attacks before they happen.
Today, eZanga is dedicated to helping thwart advertising fraud and increasing the number of real humans interacting with ads and content, by validating each user and the source it comes from and by anticipating attacks with accuracy. One of the most pernicious forms of ad fraud involves computer programs, or “bots,” that impersonate publishers and reap profit from advertisers.
eZanga started as a small-scale company looking to bring fraud-combating technology to life. As the company grew, its leaders realized that advertisers must investigate the analytics behind their traffic to stop bots before they attack.
Between click fraud, bot traffic and consumer privacy, eZanga is looking to challenge fraudulent traffic and protect advertisers’ budgets. A longstanding IBM client, eZanga is finding new ways to use IBM Cloud technologies to push innovations forward to produce nimble technology to help reduce fraud. Today, eZanga is exploring new ways IBM Watson APIs can bring value to customers and help them stay one step ahead of fraud.
eZanga has a wide variety of innovations that focus on generating ROI for publishers and advertisers. Its offerings include AdPad, which enables customers to create, manage and enhance ad campaigns on a centralized, self-service platform on the IBM Cloud.
The company’s newest technology, Anura, is a dashboard offering that optimizes traffic sources by validating that the user is a human. It also shows real-time potential threats to lessen the chance of attack. eZanga hosts Anura on IBM Bluemix to give customers the benefits of a flexible, agile server infrastructure and enable the success of the Anura technology.
IBM and eZanga are ensuring that people, not bots, are seeing the ads companies serve up. With the help of IBM, eZanga is ready to take the leap and expand its cloud footprint to gain even deeper insights from data it collects, making its services future-proof for customers.
The company’s cloud journey is just beginning as it continues to partner with IBM to take small ideas and turn them into big technologies. Interested in how your company might benefit? Learn how to get started quickly with Bluemix.
The post eZanga outsmarts the bots with IBM Cloud appeared first on Cloud computing news.
Source: Thoughts on Cloud

Introducing Virtlet: VMs and Containers on one OpenContrail network in Kubernetes — a new direction for NFV?

Some time ago I had a meeting with a potential enterprise (non-telco) customer. The company had just announced an RFP to replace its existing OpenStack distribution. The discussion began around finding the #1 contributor to OpenStack in Stackalytics, but as we dug in we found the root cause of the problem and realized that they didn’t need to pick an OpenStack distribution based on top vendor commits.
They needed to run a single application workload in large-scale production.
In other words, they didn’t need multi-tenancy, self-service, Murano, Trove, and so on. In fact, they didn’t even want OpenStack, because it made shipping an immutable VM image with their app too complex.
On the other hand, running Kubernetes instead of OpenStack wasn’t the right answer either, because their app wasn’t ready to take its place in the microservices world, and it would have taken at least six months to rewrite, re-test and certify all the tooling around it.
That was the day I realized how powerful it would be to enable standard VMs in Kubernetes, along with the same SDN we have today in OpenStack. By including the best of both platforms, imagine how we could simplify the control plane stack for use cases such as Edge Computing, Video streaming, and so on, where functions are currently deployed as virtual machines. It might even give us a new direction for NFV.
That’s the idea behind Virtlet.
What is Virtlet? An overview
The real-world example above demonstrates that our customers are not yet ready for the pure microservices world, as I described in my previous blog. To solve this problem, we’re adding a new feature to Mirantis Cloud Platform called Virtlet. Virtlet is a Kubernetes runtime server that enables you to run VM workloads based on QCOW2 images.
Virtlet was started by the Mirantis Kubernetes team almost a year ago, with the first implementation done with Flannel. In other words, Virtlet is a Kubernetes CRI (Container Runtime Interface) implementation for running VM-based pods on Kubernetes clusters. (CRI is what enables Kubernetes to run non-Docker flavors of containers, such as rkt.)
For simplicity of deployment, Virtlet itself runs as a DaemonSet, essentially acting as a hypervisor and making the CRI proxy available to run the actual VMs. This way, it’s possible to have both Docker and non-Docker pods run on the same node.
The following figure shows the Virtlet architecture:

Virtlet consists of the following components:

Virtlet manager: implements the CRI interface for virtualization and image handling
Libvirt: the standard instance of libvirt for KVM
vmwrapper: responsible for preparing the environment for the emulator
Emulator: currently qemu with KVM support (with the possibility of disabling KVM for nested virtualization tests)
CRI proxy: provides the ability to mix docker-shim and VM-based workloads on the same k8s node

You can find more detail in the github docs, but in its latest release, Virtlet supports the following features:

Volumes: Virtlet uses a custom FlexVolume driver (virtlet/flexvolume_driver) to specify block devices for the VMs. It supports:

qcow2 ephemeral volumes
raw devices
Ceph RBD
files stored in secrets or config maps

Environment variables: You can define environment variables for your pods; Virtlet then uses cloud-init to write those values into the /etc/cloud/environment file when the VM starts up.

Demo Lab Architecture
To demonstrate how all of this works, we created a lab with:

3 OpenContrail 3.1.1.x controllers running in HA
3 Kubernetes master/minion nodes
2 Kubernetes minion nodes

The k8s nodes are running Kubernetes 1.6 with the OpenContrail Container Network Interface (CNI) plugin. We spun up an Ubuntu VM pod via Virtlet alongside a standard deployment of Nginx container pods.

So what we wind up with is an installation where we’re running containers and virtual machines on the same Kubernetes cluster, running on the same OpenContrail virtual network.
In general, the process looks like this:

Set up the general infrastructure, including the k8s masters and minions, as well as the OpenContrail controllers. Nodes running the Virtlet DaemonSet should have a label key set to a specific value; in our case, we’re using extraRuntime=virtlet. (We’ll need this later.)
Create a pod for the VM, specifying the extraRuntime key in the nodeAffinity parameter so that it runs on a node that has the Virtlet DaemonSet, and specifying the VM image for the volume (see the sketch after this list).
That’s it; there is no number 3.

Of course there’s much more to see than just those two steps, as you can see in this video:

Conclusion
So now that we’ve got the basics down, we have a few ideas for future work on Virtlet and the OpenContrail Kubernetes integration, such as:

Performance validation of VMs in Kubernetes, such as comparing containerized VMs with standard VMs on OpenStack
iSCSI Support for storage volumes
Enabling OpenContrail vRouter DPDK and SR-IOV, extending the OpenContrail CNI to make it possible to create advanced NFV integrations
CPU pinning and NUMA for Virtlet
Resource handling improvements, such as hard limits for memory, and qemu thread limits
Calico Support

As you can see, rather than pushing random commits, Mirantis is focused on solving real problems and pushing those solutions back to the community. Special thanks to Ivan Shvedunov, Dmitry Shulyak, and the whole Mirantis Kubernetes team, who did an amazing job on this integration. You can reach us in the Kubernetes Slack channel #virtlet, or, for network-related issues, on the OpenContrail Slack.
The post Introducing Virtlet: VMs and Containers on one OpenContrail network in Kubernetes — a new direction for NFV? appeared first on Mirantis | Pure Play Open Cloud.
Source: Mirantis