Self-hosted Load Balancer for OpenShift: an Operator Based Approach

Introduction
Some time ago, I published an article about the idea of self-hosting a load balancer within OpenShift to meet the various requirements for ingress traffic (master, routers, load balancer services). Since then, not much has changed with regard to the load balancing requirements for OpenShift. However, in the meantime, the concept of operators, as an approach to capturing automated behavior within a cluster, has emerged. The release of OpenShift 4 fully embraces this new operator-first mentality.
Prompted by the needs of a customer, I did additional research into the viability of deploying a self-hosted load balancer via an operator.
The requirement is relatively simple: an operator watches for the creation of services of type LoadBalancer and provides load balancing capabilities by allocating a load balancer in the same cluster in which the service is defined.

In the diagram above, an application is deployed with a LoadBalancer type of service. The hypothetical self-hosted load balancer operator is watching for those kinds of services and will react by instructing a set of daemons to expose the needed IP in an HA manner (effectively creating a Virtual IP, or VIP). Inbound connections to that VIP will be load balanced to the pods of our application.
In OpenShift 4, by default, the router instances are fronted by a LoadBalancer type of service, so this approach would also be applicable to the routers.
In Kubernetes, a cloud provider plugin is normally in charge of implementing the load balancing capability of LoadBalancer services, by allocating a cloud-based load balancing solution. Such an operator as described previously would enable the ability to use LoadBalancer services in those deployments where a cloud provider is not available (e.g. bare metal).
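To make the target concrete, here is roughly what such a service looks like; this is a minimal sketch in which the service name, selector, and ports are placeholders. Without a cloud provider (or an operator playing that role), the service simply stays pending.

# A plain LoadBalancer service; a self-hosted load balancer operator would
# watch for objects like this and allocate a VIP for it.
cat <<'EOF' | oc apply -f -
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
EOF

# Until something provides the load balancer, EXTERNAL-IP shows <pending>.
oc get service my-app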
Metallb
Metallb is a fantastic bare metal-targeted operator for powering LoadBalancer types of services. 
It can work in two modes: Layer 2 and Border Gateway Protocol (BGP) mode.
In layer 2 mode, one of the nodes advertises the load balanced IP (VIP) via either the ARP (IPv4) or NDP (IPv6) protocol. This mode has several limitations: first, given a VIP, all the traffic for that VIP goes through a single node, potentially limiting the bandwidth. The second limitation is a potentially very slow failover. In fact, Metallb relies on the Kubernetes control plane to detect that a node is down before moving the VIPs that were allocated to that node to other healthy nodes. Detecting unhealthy nodes is a notoriously slow operation in Kubernetes, which can take several minutes (5-10 minutes by default, although this can be decreased with the node-problem-detector DaemonSet).
In BGP mode, Metallb advertises the VIP to BGP-compliant network routers providing potentially multiple paths to route packets destined to that VIP. This greatly increases the bandwidth available for each VIP, but requires the ability to integrate Metallb with the router of the network in which it is deployed. 
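For illustration, here is roughly what the two modes look like in Metallb’s configuration. This is a sketch based on the ConfigMap format Metallb used at the time of writing (newer releases have moved to CRDs), and the address ranges, ASNs, and peer address are placeholders for values a network administrator would supply.

# Metallb reads its pools and peers from a ConfigMap in the metallb-system namespace.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    # Routers Metallb peers with in BGP mode (placeholder address/ASNs)
    peers:
    - peer-address: 10.0.0.1
      peer-asn: 64501
      my-asn: 64500
    address-pools:
    # Layer 2 mode: one node at a time answers ARP/NDP for VIPs in this range
    - name: layer2-pool
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250
    # BGP mode: VIPs in this range are advertised to the peers above
    - name: bgp-pool
      protocol: bgp
      addresses:
      - 192.168.10.0/24
EOF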
Based on my tests and conversations with the author, I found that the layer 2 mode of Metallb is not a practical solution for production scenarios, as it is typically not acceptable to have failover-induced downtimes on the order of minutes. At the same time, I found that the BGP mode is much better suited to production scenarios, especially those that require very large throughput.
Back to the customer use case that spurred this research: they were not allowed to integrate with the network routers at the BGP level, and it was not acceptable to have a failover downtime on the order of minutes.
What we needed was a VIP managed with the VRRP protocol, so that it could fail over in a matter of milliseconds. This approach can easily be accomplished by configuring the keepalived service on a normal RHEL machine. For OpenShift, Red Hat has provided a supported container called ose-keepalived-ipfailover with keepalived functionality. Given all of these considerations, I decided to write an operator to orchestrate the creation of ipfailover pods.
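For reference, this is roughly what the equivalent VRRP setup looks like with keepalived on a plain RHEL host; the NIC name, virtual router ID, and VIP below are placeholders, and the ose-keepalived-ipfailover container drives the same mechanism, just packaged for OpenShift.

# Install keepalived and let it hold a VIP via VRRP. Run on each candidate host,
# lowering "priority" (and setting state BACKUP) on the standby hosts.
sudo dnf install -y keepalived

sudo tee /etc/keepalived/keepalived.conf <<'EOF'
vrrp_instance VI_1 {
    state MASTER            # BACKUP on the other hosts
    interface eth0          # placeholder NIC name
    virtual_router_id 51    # must match across the VRRP group
    priority 100            # highest priority holds the VIP
    advert_int 1
    virtual_ipaddress {
        192.168.1.200/24    # the VIP
    }
}
EOF

sudo systemctl enable --now keepalived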
Keepalived Operator
The keepalived operator works closely with OpenShift to enable self-servicing of two features: LoadBalancer and ExternalIP services.
It is possible to configure OpenShift to serve IPs for LoadBalancer services from a given CIDR in the absence of a cloud provider. As a prerequisite, OpenShift expects a network administrator to manage how traffic destined to those IPs reaches one of the nodes. Once reaching a node, OpenShift will make sure traffic is load balanced to one of the pods selected by that given service.
Similarly for ExternalIPs, additional configuration must be provided to specify the CIDR ranges users are allowed to pick ExternalIPs from. Once again, a network administrator must configure the network to send traffic destined to those IPs to one of the OpenShift nodes.
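As a hedged sketch, both settings live in the cluster Network configuration on OpenShift 4 and can be patched as shown below; the CIDRs are placeholders, and the exact field names should be checked against the documentation for your OpenShift version.

# Auto-assign LoadBalancer IPs from one range and allow users to request
# ExternalIPs from another (CIDRs are examples only).
oc patch network.config.openshift.io cluster --type=merge -p '
{
  "spec": {
    "externalIP": {
      "autoAssignCIDRs": ["192.168.130.0/27"],
      "policy": {
        "allowedCIDRs": ["192.168.130.32/27"]
      }
    }
  }
}'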
The keepalived operator plays the role of the network administrator by automating the network configuration prerequisites.

When LoadBalancer services or services with ExternalIPs are created, the Keepalived operator will allocate the needed VIPs on a portion of the nodes by adding additional IPs to the nodes’ NICs. This will draw the traffic for those VIPs to the selected nodes.
VIPs are managed by a cluster of ipfailover pods via the VRRP protocol, so in case of a node failure, the failover of the VIP is relatively quick (on the order of hundreds of milliseconds).
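In practice, the operator exposes this through a KeepalivedGroup custom resource plus an annotation that ties a service to it. The sketch below is from my recollection of the project: the CRD group/version, the annotation key, and the interface and node selector values are all assumptions to verify against the operator’s documentation.

# A KeepalivedGroup selecting the nodes (and NIC) that will host the ipfailover pods
# and hold the VIPs (apiVersion and field names are assumptions).
cat <<'EOF' | oc apply -f -
apiVersion: redhatcop.redhat.io/v1alpha1
kind: KeepalivedGroup
metadata:
  name: keepalivedgroup-workers
  namespace: keepalived-operator
spec:
  interface: ens3
  nodeSelector:
    node-role.kubernetes.io/worker: ""
EOF

# Tie a LoadBalancer service to that group via an annotation (key is an assumption).
oc annotate service my-app \
  keepalived-operator.redhat-cop.io/keepalivedgroup=keepalived-operator/keepalivedgroup-workers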
Installation
To install the Keepalived operator in your own environment, consult the documentation within the GitHub repository.
Conclusions
The objective of this article was to provide an overview of options for self-hosted load balancers that can be implemented within OpenShift. This functionality may be required in those scenarios where a cloud provider is not available and there is a desire to enable self-servicing capability for inbound load balancers.
Neither of the examined approaches allows for the definition of a self-hosted load balancer for the master API endpoint. This remains an open challenge especially with the new OpenShift 4 installer. I would be interested in seeing potential solutions in this space.
The post Self-hosted Load Balancer for OpenShift: an Operator Based Approach appeared first on Red Hat OpenShift Blog.
Quelle: OpenShift

Docker Desktop for Windows Home is here!

Last year we announced that Docker had released a preview of Docker Desktop with WSL 2 integration. We are now pleased to announce that we have completed the work to enable experimental support for Windows Home WSL 2 integration. This means that Windows Insider users on 19040 or higher can now install and use Docker Desktop!

Feedback on this first version of Docker Desktop for Windows Home is welcomed! To get started, you will need to be on Windows Insider Preview build 19040 or higher and install Docker Desktop Edge 2.2.2.0.

What’s in Docker Desktop for Windows Home?

Docker Desktop for Windows Home with WSL 2 is a full version of Docker Desktop for Linux container development. It comes with the same feature set as our existing Docker Desktop WSL 2 backend. This gives you: 

- Latest version of Docker on your Windows machine
- Install Kubernetes in one click on Windows Home
- Integrated UI to view/manage your running containers
- Start Docker Desktop in <5 seconds
- Use Linux Workspaces
- Dynamic resource/memory allocation
- Networking stack, support for http proxy settings, and trusted CA synchronization

How do I get started developing with Docker Desktop? 

For the best experience of developing with Docker and WSL 2, we suggest having your code inside a Linux distribution. This improves the file system performance and, thanks to products like VSCode, means you can still do all of your work inside the Windows UI and in an IDE you know and love. 

First, make sure you are on the Windows Insider program, are on build 19040 or higher, and have installed Docker Desktop Edge.

Next, install a WSL distribution of Linux (for this example I will assume something like Ubuntu from the Microsoft Store).

You may want to check that your distro is set to WSL 2. To check, run the following in PowerShell:

wsl -l -v 

If you see that your distro is version 1, you will need to run: 

wsl --set-version DistroName 2

Once you have a V2 WSL distro, Docker Desktop will automatically set this up with Docker.
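A quick way to confirm the integration is working is to open a shell in that distro and talk to the engine directly; the commands below are a minimal sanity check.

# From inside the Ubuntu (WSL 2) shell: the docker CLI should already be on the PATH
# and wired up to Docker Desktop's engine.
docker version
docker run --rm hello-world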

The next step is to start working with your code inside this Ubuntu distro and ideally with your IDE still in Windows. In VSCode this is pretty straightforward.

You will want to open up VSCode and install the Remote WSL extension; this will allow you to work with a remote server in the Linux distro while your IDE client stays on Windows. 

Now we need to get started working in VSCode remotely. The easiest way to do this is to open up your terminal and type:

wsl code .

This will open a new VSCode connected remotely to your default distro which you can check in the bottom corner of the screen. 

(Or you can just look for Ubuntu in your Start menu, open it, and then run code .)

Once in VSCode, I use the integrated terminal to pull my code and start working natively in Linux with Docker from my Windows Home machine!

Other tips and tricks:

- If you want to get the best out of the file system performance, avoid mounting from the Windows file system, even from a WSL distro (e.g., avoid docker run -v /mnt/c/users:/users).
- If you are worried about the size of the docker-desktop-data VHDX, or need to change it, you can do this through the WSL tooling built into Windows: https://docs.microsoft.com/en-us/windows/wsl/wsl2-ux-changes#understanding-wsl-2-uses-a-vhd-and-what-to-do-if-you-reach-its-max-size
- If you are worried about CPU/memory usage, you can put limits on memory/CPU/swap size for the WSL 2 utility VM (see the sketch below): https://docs.microsoft.com/en-us/windows/wsl/release-notes#build-18945
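For the CPU/memory point, here is a minimal sketch of a .wslconfig that caps the WSL 2 utility VM; the values are just examples, and the available settings are described in the Microsoft docs linked above.

# In PowerShell on Windows: write %UserProfile%\.wslconfig with some example limits,
# then restart WSL so they take effect.
@"
[wsl2]
memory=4GB
processors=2
swap=1GB
"@ | Set-Content "$env:UserProfile\.wslconfig"

wsl --shutdown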

Your feedback needed!

We are excited to get your feedback on the first version of Docker Desktop for Windows Home and for you to tell us how we can make it even better.

To get started with WSL 2 Docker Desktop on Windows Home today, you will need to be on Windows Insider Preview build 19040 or higher and install Docker Desktop Edge 2.2.2.0.
The post Docker Desktop for Windows Home is here! appeared first on Docker Blog.
Quelle: https://blog.docker.com/feed/

How to build a simple edge cloud: Q&A

Last week we held a webinar explaining the basics behind creating edge clouds, but we didn’t have enough time for all of the questions. So as is our tradition, here are the Q&As, including those we didn’t get to on the call.
Does Docker Platform support GPUs?
Docker Engine 19.03 added support for GPUs. Also, we are working on adding features that make GPU availability visible for orchestration.
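As a quick illustration, the 19.03 engine exposes this via the --gpus flag; the sketch below assumes a host with NVIDIA drivers and the NVIDIA container toolkit installed, and the CUDA image tag is just an example.

# Run nvidia-smi inside a CUDA base image with all host GPUs attached
docker run --rm --gpus all nvidia/cuda:10.2-base nvidia-smi

# Or attach just one GPU
docker run --rm --gpus 1 nvidia/cuda:10.2-base nvidia-smi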
What’s the difference between Docker and Docker Enterprise?
Docker CE or Community Edition is a free containerization platform managed by Docker, Inc.
Docker EE or Enterprise Engine is an integrated, fully supported and certified container platform, owned by Mirantis.
Docker EE is part of the Docker Enterprise platform, a suite of solutions to help manage and deploy applications securely which includes: Docker Trusted Registry, Docker Universal Control Plane, Docker Content Trust, and Docker Enterprise Engine.
So how would an edge device know where to contact the edge cloud? Would it be hard-coded?
It certainly COULD be hard-coded, though I personally would have the device call home just to find out where it should go for its actual configuration.
Is it passing the video data to the Google container, or is it passing a container to the Google platform?
The idea is for an edge node to pass only the important data to the next step (in this case the Kubernetes cluster running on GKE). So in this case, the container running in the simulated camera is passing only individual video frames to the container that’s running in the regional cloud.
To access an application that is present on an Edge Cloud, for example, through a Mobile Network, a GATEWAY (e.g., PGW, UPF) is required. The question is: Are these Network functions ready to run in Containers? Do containers offer the necessary security for such Network functions?
In fact, many of these virtual network functions, or VNFs, have not yet been containerized, and need to run in virtual machines. One way to do this is to use a project like Virtlet, which enables you to run VMs as first-class citizens in a Kubernetes cluster; this way you can use VMs and containers together.
That said, there are some cloud native network functions, or CNFs, that are already available, such as Magma.
In the demo, you used AWS and GCP, where did Mirantis come into the picture?
In this particular example, the AWS piece was where we were running Docker Enterprise, which is now part of Mirantis. However, the discussion is purely on the technology; Mirantis isn’t REQUIRED (though of course we would like you to consider it :)). Mirantis also has an Edge offering, Mirantis Cloud Platform Edge, that uses a different architecture.
In your demo, may I say the edge cloud is running at your NB?
I don’t quite understand the question, but the edge cloud in this case was a container running on my laptop, yes.
Can you please explain again about the mirror registry and cache?
A mirror registry is a registry that holds images you expect to need in your environment. The idea is to bring images closer to your nodes so they can fetch them more quickly and efficiently than having each engine fetch a copy over the external network.
A cache registry is set up to keep a copy of each image requested by engines in your environment, so that duplicate requests can retrieve the image locally instead of downloading it again and again.
Read the docs on configuring a registry as a cache or mirror: https://docs.docker.com/registry/recipes/mirror/
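As a rough sketch of the pull-through cache case, using the open source registry:2 image; the host name and port are placeholders, and you should merge the daemon.json change with any settings you already have.

# Run a pull-through cache of Docker Hub on a host reachable by your nodes
docker run -d --name registry-cache -p 5000:5000 \
  -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
  registry:2

# On each engine, point pulls at the cache first, then restart the engine
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "registry-mirrors": ["http://registry-cache.example.com:5000"]
}
EOF
sudo systemctl restart docker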
You showed us the UCP, but for the edge it seems very expensive to have a worker node with Docker EE. Could you do this with a Basic or Community engine at the edge, and still put your image on Docker Trusted Registry?
Today, only Docker EE can pull from DTR. With Community Edition, you are missing key security features that are required to secure the last mile. If you have a need for a more efficient (reduced size) engine due to constraints in your device, please contact us for assistance and we will review your use case and offer advice.  That said, if you’re not going to use DTR, you could use Docker CE.
So, video data isn’t passed to the central Google location, it is the central location that pushes approved images to the edge and the edge does all the facial comparison?
No, it’s just the opposite. The edge recognizes that there is a face, but doesn’t identify it. It just passes the image on to the “regional” cloud. The regional cloud, which happens to be running on GKE in this case, does the facial comparison, and if it finds a stranger, passes that image on to the “central” cloud, which happens to be running on Docker Enterprise on AWS in this case. So the flow of data is from the edge to the center.
Can you take an example of configuration pushed to a camera device or a CPE device connecting a branch location? Like Walmart with CPE or Camera monitoring cash registers?
I’m not sure I completely understand the question, but I’ll take a stab at it. Devices can be updated via either push, or pull. In a push situation, the central cloud sends instructions to the edge device. These instructions may include the new configuration, or they may simply include instructions to phone home for that new configuration. In a pull situation, the device will “phone home” to check for any instructions or configuration changes it needs to complete.
There is another situation: we could have an ARM processor at the edge, and I believe there is no Docker EE engine in that case, but you could have development for an ARM processor in the cloud.
Docker EE has been tested on Arm architecture and is ready for POC with customers. This applies to both cloud and Edge instances of Arm. Please contact us if you need more information on this use case.
My question is around multi-tenancy. How can multiple enterprises push configurations to the end CPE or Camera devices with a multi-tenant central controller hosted in Cloud? How can we build a hierarchy to address latency aspects?
This one might require more questions about the use case. In general, multi-tenancy starts with the ingress of applications in the registry. Here, multiple images are received, scanned, and staged before they are deemed acceptable and made available to the end device. These images can come from multiple sources, including Independent Software Vendors. From the controller interface, the operator can securely deploy these apps on the endpoints running a container runtime such as Docker Enterprise. The hierarchy that addresses latency is built by moving the applications that process the data closer to the source of data creation and minimizing how much of this data is moved over the network. Then, if you can react to the output without having to go through a central hub, you can keep the latency as low as possible.
At the edge device, do you actually install Docker and run a container in the device memory itself? If yes, what would be the memory footprint in the device’s memory for docker?
Yes, the edge device is expected to have a container runtime engine such as Docker Enterprise Engine. We have done work with customers to reduce the impact on memory while maintaining the security features required to ensure an end-to-end secure environment. If you want more details, we’d need to review your specific use case.
Can you compare running Docker and containers on the edge device versus an application written in a microcontroller language with the same functionality as the app running in the docker container alternative?
There is no impact on system performance from running an app in a container versus directly on the OS. There is some overhead due to memory requirements. The question to ask is whether the use of containers in an end device, with all the cloud native advantages, outweighs the need for additional memory. As you can imagine, the answer depends on the use case.
Today there are several frameworks being proposed to provide HW management for Edge Clouds such as: OpenStack DCN, Mobile Edgex, Akraino etc. Does this heterogeneous environment tend to converge to a single solution? Which in your opinion?
Like most similar decisions, this is going to depend on too many factors to give a single answer, including use case, staff familiarity, and integration with the rest of your infrastructure.
If the application’s data is not ready yet at the edge how will the edge app behave?
That’s the job of the edge application: to deal with these issues so that the processing closer to the center doesn’t have to.
Is the Python code available?
Yes, the Python code will be available as part of the “Build a Basic Edge Cloud” blog series. You can find part one of that series, which covers the actual surveillance system, here. Part 2, which is due out in the next few days, covers containerization, and part 3 will deploy the containers to their respective clusters.
What does the demo app use to pass the vid images and approved pics back and forth between edge and central? Is mirroring the best or are you using like a restAPI push to where it needs to go?
In the case of this demo, it’s using a mounted drive to pass images from the camera to the first container, and then it’s using a REST API to pass images between clusters. Not that this is the ONLY way to do it, by any stretch of the imagination; it’s just the simplest way to get the concept across.
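Purely to illustrate the pattern (this is not the demo’s actual code), the edge container could hand a frame to the regional cluster with a single HTTP POST; the endpoint and form field below are hypothetical.

# Hypothetical endpoint and field name, just to show the shape of the hand-off
curl -X POST \
  -F "frame=@/data/frames/frame-0001.jpg" \
  https://regional-cloud.example.com/api/v1/frames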
Will this example work with open hardware such as Arduino ?
I’m assuming that your question is whether you need an Arduino to do this, or whether it will work with open hardware. In fact I’ve never used an Arduino (though I really, really want to), so I can confidently say that edge doesn’t depend on it. That said, your actual Edge Devices may have specific requirements, such as durability or low power requirements, but that’s based on your specific use case and not Edge per se.
What kind of edge nodes are you recommending if any? Thinking of high latency links where sending data to any cloud or out of the edge DC would be an issue.
It all depends on your use case. In some cases you may need specific hardware that fulfills a particular requirement, such as low power use or remote accessibility, or it may be a simple laptop or even a mobile phone. Remember, anything that’s outside of a datacenter is technically “edge”.
As far as latency, that’s one of the things you’re trying to mitigate with Edge in the first place; so while it might be too much to send an entire video stream, you might send just specific frames. Or perhaps you might send just specific measurements instead of complete telemetry, doing the initial analysis on the Edge node.
How do you write that security system for kubernetes just mentioned?
Check out the white paper about Trusted Docker Containers.
Don’t forget, you can view the whole webinar here!
The post How to build a simple edge cloud: Q&A appeared first on Mirantis | Pure Play Open Cloud.
Quelle: Mirantis