OpenShift Scale-CI: Part 2 – Deep Dive

In part one of the series, we saw how the Red Hat OpenShift Scale-CI evolved. In this post, we will look at the various components of Scale-CI. OpenShift Scale-CI is not a single tool; it’s a framework that orchestrates a set of tools to help analyze and improve the scalability and performance of OpenShift. It does this by:

Loading thousands of objects onto a large-scale production cluster to stress the control plane (API server, controllers, etcd), the kubelet and other system components.
Running various benchmarks, gathering performance data during the run and visualizing the data to identify bottlenecks and tuning opportunities.
Repeating the scale tests on OpenShift deployed on various clouds, including AWS, Azure, OpenStack and GCP, to monitor performance and scalability regressions.

The motivation behind building Scale-CI is also to onboard and enable other teams to take advantage of the automation, tooling and hardware to see how well their application or component performs at scale, instead of each team building and maintaining its own clusters, infrastructure and tools.
Architecture

Scale-CI comprises the following components:

Scale-CI pipeline: Acts as the orchestrator for all tools to deploy, configure, monitor and diagnose OpenShift. This is the entrypoint for onboarding workloads which will be run automatically at scale.
Workloads: Sets up tooling on an OpenShift cluster and runs the OpenShift performance and scale workloads.
Scale-CI deploy: Collection of playbooks and scripts to provision and install OpenShift on various cloud platforms including AWS, Azure, GCP and OpenStack. It also supports scaling and upgrading the cluster to the desired payload.
Images: Hosts the container image source files for Scale-CI. Builds are triggered by commits to this repo. In addition, we periodically trigger rebuilds when the tools in dependent containers are built and published.
Scale-CI graphshift: Deploys mutable Grafana with Performance Analysis Dashboards for OpenShift.

Scale-CI diagnosis: Running OpenShift at high scale is expensive. There is a chance that a particular config, component logs or metrics need to be looked at after the cluster has been terminated to find an issue that occurred during a particular scale test run. This motivated us to create this tool. It helps in debugging issues by capturing the Prometheus database from the running Prometheus pods to the local file system; the metrics can then be examined later by running Prometheus locally with the backed-up DB. It also captures OpenShift cluster information, including all the operator-managed components, using must-gather.
The Performance and Scalability team at Red Hat has built a number of other tools to help with our work:

Cluster Loader: Deploys large numbers of various user-defined objects to a cluster. Build, configure, and run Cluster Loader to measure the performance metrics of your OpenShift Container Platform deployment in various cluster states. It is part of both OKD and upstream Kubernetes.
Pbench: This is a benchmarking and performance analysis framework which runs benchmarks across one or more systems, while properly collecting the configuration of those systems, their logs and specified telemetry from various tools (sar, vmstat, perf, etc.). The collected data is shipped to the Pbench server, which is responsible for archiving the resulting tarballs, indexing them and unpacking them for display.

A typical Scale-CI run installs OpenShift on a chosen cloud provider, sets up tooling to run a pbench-agent DaemonSet, runs Conformance (the e2e test suite) to check the sanity of the cluster, scales up the cluster to the desired node count, and runs various scale tests focusing on control plane density, kubelet density, HTTP/router, SDN, storage, logging, monitoring and cluster limits. It also runs a baseline workload which collects configuration and performance data on an idle cluster, so we know how the product is moving across OpenShift releases. The results are shipped to the Pbench server after processing for analysis and long-term storage. The results are then scraped to generate a machine-readable output (JSON) of the metrics, which is compared with previous runs to pass/fail the job and send a green/red signal.
For large and long-running clusters, components like Prometheus need more disk and resources, including CPU and memory. Instead of using bigger worker nodes, we create infrastructure nodes with large amounts of disk, CPU and memory using custom MachineSets, and modify the node selector to ensure that components including Prometheus, Logging, Router and Registry run on the infrastructure nodes. This is part of day-two operations and is needed for large-scale clusters.
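As a rough sketch of this step (the label value and config below are illustrative assumptions, not the exact manifests we use), the cluster monitoring stack can be pinned to such infrastructure nodes with a node selector in its config map:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    # Illustrative: run Prometheus only on nodes carrying the infra role label
    prometheusK8s:
      nodeSelector:
        node-role.kubernetes.io/infra: ""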
Adding a new workload to the framework or making changes to the existing jobs is as simple as creating a PR using the sample templates provided in the repositories. The Scale-CI watcher picks up the change after the PR gets merged and updates the respective jobs.
 

We spoke about Automated OpenShift/Kubernetes Scalability testing at KubeCon + CloudNativeCon North America 2018. The slides are here; you can also watch the presentation online at https://youtu.be/37naDDcmDo4.
We recently scale tested OpenShift 4.1 before general availability. Keep an eye out for our next blog, OpenShift Scale-CI: Part 3, which will have the highlights of the OpenShift 4.1 scalability run. As always, any feedback or contributions are most welcome.
The post OpenShift Scale-CI: Part 2 – Deep Dive appeared first on Red Hat OpenShift Blog.
Source: OpenShift

PaaS vs KaaS: What’s the difference, and when does it matter?

The post PaaS vs KaaS: What’s the difference, and when does it matter? appeared first on Mirantis | Pure Play Open Cloud.
Earlier this month I had the pleasure of addressing the issue of Platform as a Service vs Kubernetes as a Service. We talked about the differences between the two modes, and their relative strengths and weaknesses. If you’d like to see the full webinar, you can click this link, but there were several questions we didn’t get to, and I promised to recap all of them in a blog.
Is it possible to have a PaaS that’s also a KaaS?
Yes.  While a PaaS is designed to provide applications for developers to use without having to worry about deploying them, a KaaS is designed to deploy Kubernetes clusters for developers.  There’s no reason that a PaaS can’t offer a Kubernetes cluster as an “application” to be deployed, though not all do.
Can you deploy a PaaS on Kubernetes?
Yes. At the end of the day, a PaaS is just an application, and it needs to be deployed somewhere.  If it’s already containerized, of course it can be deployed on Kubernetes. If not, it might take some additional work, but yes, it’s possible.
If the applications are written for AKS or EKS or PKS will they be locked into that respective provider APIs?
Any time you’re writing to a specific API, you’re locked into that API.  If it’s the Kubernetes API, or the OpenStack API, or any other open source API, your application can then be used anywhere that API is available.  If, however, you write to an API that’s only available from a particular provider (such as AKS or EKS or PKS) then you’re locked into that provider.
Are there open source KaaS solutions out there? Or do most people resort to Ansible/kubeadm automation to standup the clusters? This may only be relevant to organizations that run K8s on prem.
In general, most people do resort to a single-cluster tool such as Kubeadm, but once you get past a couple of clusters, a full-blown KaaS solution is generally more convenient.  There are some open source solutions, such as Kubespray, KQueen and Gardener, but so far, none that have really captured the market.
Can you talk more about why you consider OpenShift to be more PaaS than KaaS?
Most of my experience with OpenShift has been with OpenShift 3, which is definitely a PaaS; it’s essentially a single Kubernetes cluster with an application catalog and a wrapper, oc, for OpenShift-specific commands.  A single tenant can deploy a “project” which is architecturally just a namespace. (Which leads to the interesting side effect that every project has to have a globally unique name, but that’s just a side issue.) Into that project, OpenShift uses Operators to deploy applications of the user’s choice.
The important thing to note here is that the user has NOT been provisioned a Kubernetes cluster; they’re just squatting on the main cluster that is OpenShift.  So OpenShift 3 is definitely not a KaaS.
As another attendee of the webinar pointed out, OpenShift’s original motivations were to provide easy access to CI/CD and Software Defined Networking; Kubernetes was an afterthought.  (In fact early versions of OpenShift didn’t use it at all.)
I’ve been told that OpenShift 4 is more KaaS-like, which I assume means that it can deploy an independent Kubernetes cluster for you to use, but I’ve been unable to verify that through the documentation, and OpenShift Online still uses version 3.  (If someone has more information on this issue, I’d love to hear it.)
In terms of adoption, do we see more KaaS compared to PaaS?
That all depends on how you’re defining each. KaaS is definitely going to take off in the next few years, particularly as Edge Computing becomes more important and the need for deploying multiple clusters becomes impossible to ignore.  That said, however, many KaaSes also include application catalogs, so while the function of PaaS will continue to be important, it’s possible that stand-alone PaaSes themselves might begin to fall by the wayside.
Does KaaS provision nodes across a cluster or even multiple pods in a node?
Let’s get straight where KaaS fits in in the “provisioning” world. KaaS provisions the actual Kubernetes cluster, and not individual pods.  For example, if I were using Mirantis KaaS, I might define 5 servers to be used, and then specify that I want 3 control nodes and 2 worker nodes, which would then be spread across those 5 machines.
Once the cluster itself had been provisioned, I could then deploy my pods on those Kubernetes nodes, but that’s independent of the KaaS.
In the PDF you list Pivotal’s PKS but not PCF. I thought PKS was still beta. Can you speak where PCF fits in? It is clearly (non-k8s) PaaS, but is there anything more to add?
By PCF I assume that you’re referring to Pivotal Platform, which is not so much PaaS as an umbrella project for multiple things, including Pivotal Container Service (the KaaS), Pivotal Application Service, and Pivotal Function Service. Like OpenShift, it appears to be focused more on the CI/CD process.
If you know, what do you think about the EIRINI CloudFoundry project? Integrating Application Runtime & K8s. Is it a real convergence between PaaS and KaaS?
Yes, it does give CF KaaS capabilities. It allows CF orchestration to be applied to Kubernetes containers as well as VMs. They created a “plugin” to their CF engine to call the same things, for example, that might be called in a KaaS.
Does OpenShift support Kubespray?
It doesn’t appear to, and there’s no mention of it in the documentation.
In your opinion, what is more cost-effective, KaaS or PaaS?
Like most questions that involve cost and technology, the answer is “it depends”.  There are a number of different factors that matter, such as what you’re trying to accomplish, your infrastructure, and your use case.  (Contact us and we’ll be happy to help you take a look at your situation.)
 
The post PaaS vs KaaS: What’s the difference, and when does it matter? appeared first on Mirantis | Pure Play Open Cloud.
Source: Mirantis

How to Handle OpenShift Worker Nodes Resources in Overcommitted State

One of the benefits of adopting a system like OpenShift is facilitating burstable and scalable workloads. Horizontal application scaling involves adding or removing instances of an application to match demand. When OpenShift schedules a Pod, it’s important that the nodes have enough resources to actually run it. If a user schedules a large application (in the form of a Pod) on a node with limited resources, it is possible for the node to run out of memory or CPU resources and for things to stop working!
It’s also possible for applications to take up more resources than they should. This could be caused by a team spinning up more replicas than they need to artificially decrease latency or simply because of a configuration change that causes a program to go out of control and try to use 100% of the available CPU resources. Regardless of whether the issue is caused by a bad developer, bad code, or bad luck, what’s important is how a cluster administrator can manage and maintain control of the resources.
In this blog, let’s take a look at how you can solve these problems using best practices.
What does “overcommitment” mean in OpenShift?
In an overcommitted state, the sum of the container compute resource requests and limits exceeds the resources available on the system. 
Overcommitment might be desirable in development environments where a tradeoff of guaranteed performance for capacity is acceptable. Therefore, in an overcommitted environment, it is important to properly configure your worker nodes to provide the best system behavior. With that in mind, let’s find out what needs to be enabled on the worker nodes in an overcommitted environment.
Prerequisites for the overcommitted worker nodes: 
The following prerequisites flow chart describes all the checks that should be performed on the worker nodes. Let’s go into the details one by one.

 1. Is the worker node ready for overcommitment? 
In OpenShift Container Platform, overcommitment is enabled by default, but it is always advisable to cross-check. When the node starts, it ensures that the kernel tunable flags for memory management are set properly. The kernel should never fail memory allocations unless it runs out of physical memory.
To ensure this behavior, OpenShift Container Platform configures the kernel to always overcommit memory by setting the vm.overcommit_memory parameter to 1, overriding the default operating system setting.
OpenShift Container Platform also configures the kernel not to panic when it runs out of memory by setting vm.panic_on_oom parameter to 0. A setting of 0 instructs the kernel to call oom_killer in an Out of Memory (OOM) condition, which kills processes based on priority.
You can view the current setting by running the following commands on your nodes:
$ oc debug node/<worker node>
Starting pod/<worker node>-debug …
If you don’t see a command prompt, try pressing enter.
sh-4.2# sysctl -a |grep commit
vm.overcommit_memory = 1
sh-4.2# sysctl -a |grep panic
vm.panic_on_oom = 0

If your worker node settings differ from the expected values, you can easily set them via the Machine Config Operator for RHCOS, or for RHEL via the command below.
$ sysctl -w vm.overcommit_memory=1
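For RHCOS nodes, a minimal sketch of the Machine Config Operator approach could look like the following (the object name is an illustrative assumption; the data URL simply encodes the line vm.overcommit_memory = 1):

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  # Illustrative name; the 99- prefix keeps it late in the merge order
  name: 99-worker-vm-overcommit
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - path: /etc/sysctl.d/99-vm-overcommit.conf
        filesystem: root
        mode: 0644
        contents:
          source: data:,vm.overcommit_memory%20%3D%201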
 2. Is the worker node enforcing CPU limits using CPU CFS quotas?
The Completely Fair Scheduler (CFS) is a process scheduler which was merged into the Linux Kernel 2.6.23 release (October 2007) and is the default scheduler. It handles CPU resource allocation for executing processes, and aims to maximize overall CPU utilization while also maximizing interactive performance.
By default, the kubelet uses CFS quota to enforce pod CPU limits. For example, when a user sets a CPU limit of 100 millicores for a pod, Kubernetes (via the kubelet on the node) specifies a CFS quota for CPU on the pod’s processes. The pod’s processes get throttled if they try to use more than the CPU limit.
When the node runs many CPU-bound pods, the workload can move to different CPU cores depending on whether the pod is throttled and which CPU cores are available at scheduling time. Many workloads are not sensitive to this migration and thus work fine without any intervention. CFS quota enforcement is controlled by the cpu-cfs-quota kubelet argument, which is enabled by default:
kubeletArguments:
  cpu-cfs-quota:
    - "true"
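The kubeletArguments snippet above uses the OpenShift 3 node configuration syntax. On OpenShift 4, a rough equivalent would be a KubeletConfig custom resource, sketched below (it reuses the custom-kubelet label introduced later in this post, and the field name follows the upstream kubelet configuration):

apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: enforce-cfs-quota   # illustrative name
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: small-pods
  kubeletConfig:
    cpuCFSQuota: true   # enforce pod CPU limits via CFS quota (the default)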
 3. Are enough resources reserved for system and kube processes per node?
To provide more reliable scheduling and minimize node resource overcommitment, each node can reserve a portion of its resources for use by all underlying node components (such as kubelet, kube-proxy) and the remaining system components (such as sshd, NetworkManager) on the host.
CPU and memory resources reserved for node components in OpenShift Container Platform are based on two node settings:

kube-reserved
Resources reserved for node components. Default is none.

system-reserved
Resources reserved for the remaining system components. Default is none.

 
If a flag is not set, it defaults to 0. If none of the flags are set, the allocated resource is set to the node’s capacity as it was before the introduction of allocatable resources.
The table below summarizes the recommended resources to be reserved per worker node. This is based on OpenShift version 4.1. Also note that this does not include the resources required to run any third-party CNI plugin, its operator, etc.

You can set the reserved resources with the help of a MachineConfigPool and a KubeletConfig custom resource (CR), as shown in the example below.
Find out the correct machineconfigpool for your worker node and label it if not done already.
$ oc describe machineconfigpool worker

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
 creationTimestamp: 2019-02-08T14:52:39Z
 generation: 1
 labels:
   custom-kubelet: small-pods

$ oc label machineconfigpool worker custom-kubelet=small-pods
Create a KubeletConfig as shown below and set the desired resources for system and kube processes.
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
 name: set-allocatable 
spec:
 machineConfigPoolSelector:
   matchLabels:
     custom-kubelet: small-pods 
 kubeletConfig:
   systemReserved:
     cpu: 500m
     memory: 512Mi
   kubeReserved:
     cpu: 500m
     memory: 512Mi
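Once the change has rolled out, a quick sanity check is to compare a node's capacity and allocatable values; allocatable should be roughly capacity minus the kube-reserved, system-reserved and eviction-threshold amounts (output abridged, values illustrative):

$ oc describe node <worker node>
...
Capacity:
  cpu:     4
  memory:  16419036Ki
Allocatable:
  cpu:     3
  memory:  15262556Ki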

 4. Is swap memory disabled on the worker node?
By default, OpenShift disables swap partitions on the node. A good practice in Kubernetes clusters is to disable swap on the cluster nodes in order to preserve quality of service (QoS) guarantees. Otherwise, physical resources on a node can be oversubscribed, affecting the resource guarantees the Kubernetes scheduler makes during pod placement.
For example, if two guaranteed pods have reached their memory limit, each container could start using swap memory. Eventually, if there is not enough swap space, processes in the pods can be terminated due to the system being oversubscribed.
Failing to disable swap results in nodes not recognizing that they are experiencing MemoryPressure, and in pods not receiving the memory they asked for in their scheduling request. As a result, additional pods are placed on the node to further increase memory pressure, ultimately increasing your risk of experiencing a system out of memory (OOM) event.
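To verify that a node really has no active swap, a quick check is to run swapon on the host; no output means no swap devices are enabled:

$ oc debug node/<worker node> -- chroot /host swapon --show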
 5. Is QoS defined?
In an overcommitted environment, it is possible that the pods on the node will attempt to use more compute resources than are available at any given point in time. When this occurs, the node must give priority to one pod over another. The facility used to make this decision is referred to as a Quality of Service (QoS) class.
For each compute resource, a container is classified into one of three QoS classes, in decreasing order of priority:

Priority 1 (Guaranteed): If limits and optionally requests are set (not equal to 0) for all resources and they are equal, then the container is classified as Guaranteed.
Priority 2 (Burstable): If requests and optionally limits are set (not equal to 0) for all resources, and they are not equal, then the container is classified as Burstable.
Priority 3 (BestEffort): If requests and limits are not set for any of the resources, then the container is classified as BestEffort.

 
A priority class object can take any 32-bit integer value smaller than or equal to 1000000000 (one billion). Numbers larger than one billion are reserved for critical pods that should not be preempted or evicted. For the critical pods, two classes are defined. For example:

system-node-critical – This priority class has a value of 2000001000 and is used for all pods that should never be evicted from a node.
system-cluster-critical – This priority class has a value of 2000000000 (two billion) and is used with pods that are important for the cluster. Pods with this priority class can be evicted from a node in certain circumstances.

You can also use the qos-reserved parameter to specify a percentage of memory to be reserved by a pod in a particular QoS level. This feature attempts to reserve requested resources to exclude pods from lower QoS classes from using resources requested by pods in higher QoS classes. For example, a value of qos-reserved=memory=100% will prevent the Burstable and BestEffort QoS classes from consuming memory that was requested by a higher QoS class, i.e. Guaranteed. Similarly, a value of qos-reserved=memory=0% will allow the Burstable and BestEffort QoS classes to consume up to the full node allocatable amount if available, but increases the risk that a Guaranteed workload will not have access to the requested memory.
 
Mechanisms to control the resources on overcommitted worker nodes:
After executing the prerequisites on the worker nodes and cluster, it’s time to see which mechanisms Kubernetes provides to control resources like CPU, memory, ephemeral storage, and ingress and egress traffic.

Limit Ranges: 

A limit range, defined by a LimitRange object, enumerates compute resource constraints in a project at the pod, container, image, image stream, and persistent volume claim level, and specifies the amount of resources that a pod, container, image, image stream, or persistent volume claim can consume. All resource creation and modification requests are evaluated against each LimitRange object in the project. If the resource violates any of the enumerated constraints, then the resource is rejected. If the resource does not set an explicit value, and if the constraint supports a default value, then the default value is applied to the resource.
Below is an example of a limit range definition.
apiVersion: "v1"
kind: "LimitRange"
metadata:
  name: "core-resource-limits"
spec:
  limits:
    - type: "Pod"
      max:
        cpu: "2"
        memory: "1Gi"
      min:
        cpu: "200m"
        memory: "6Mi"
    - type: "Container"
      max:
        cpu: "2"
        memory: "1Gi"
      min:
        cpu: "100m"
        memory: "4Mi"
      default:
        cpu: "300m"
        memory: "200Mi"
      defaultRequest:
        cpu: "200m"
        memory: "100Mi"
      maxLimitRequestRatio:
        cpu: "10"
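As a usage sketch (the project name is an illustrative assumption), the limit range is created per project, and violating requests are rejected at admission time; with the definition above, a pod requesting 4 CPU cores would be refused because the pod-level max is 2:

$ oc create -f core-resource-limits.yaml -n demo-project
$ oc describe limitrange core-resource-limits -n demo-project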

 2. CPU Requests:
Each container in a pod can specify the amount of CPU it requests on a node. The scheduler uses CPU requests to find a node with an appropriate fit for a container. The CPU request represents a minimum amount of CPU that your container may consume, but if there is no contention for CPU, it can use all available CPU on the node. If there is CPU contention on the node, CPU requests provide a relative weight across all containers on the system for how much CPU time the container may use. On the node, CPU requests map to kernel CFS shares to enforce this behavior.
 3. CPU Limits:
Each container in a pod can specify the amount of CPU it is limited to use on a node. CPU limits control the maximum amount of CPU that your container may use independent of contention on the node. If a container attempts to exceed the specified limit, the system will throttle the container. This allows the container to have a consistent level of service independent of the number of pods scheduled to the node.
 4. Memory Requests:
By default, a container is able to consume as much memory on the node as possible. In order to improve placement of pods in the cluster, specify the amount of memory required for a container to run. The scheduler will then take available node memory capacity into account prior to binding your pod to a node. A container is still able to consume as much memory on the node as possible even when specifying a request.
 5. Memory Limits:
If you specify a memory limit, you can constrain the amount of memory the container can use. For example, if you specify a limit of 200Mi, a container will be limited to using that amount of memory on the node. If the container exceeds the specified memory limit, it will be terminated and potentially restarted dependent upon the container restart policy.
 6. Ephemeral Storage Requests:
By default, a container is able to consume as much local ephemeral storage on the node as is available. In order to improve placement of pods in the cluster, specify the amount of required local ephemeral storage for a container to run. The scheduler will then take available node local storage capacity into account prior to binding your pod to a node. A container is still able to consume as much local ephemeral storage on the node as possible even when specifying a request.
 7. Ephemeral Storage Limits:
If you specify an ephemeral storage limit, you can constrain the amount of ephemeral storage the container can use. For example, if you specify a limit of 2Gi, a container will be limited to using that amount of ephemeral storage on the node. If the container exceeds the specified ephemeral storage limit, the pod will be evicted from the node.
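Putting requests and limits together, here is a minimal container spec sketch (the pod name, image and values are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: app
    image: openshift/hello-openshift
    resources:
      requests:
        cpu: 200m               # used for scheduling; maps to CFS shares under contention
        memory: 100Mi           # considered at scheduling time, not enforced
        ephemeral-storage: 1Gi  # local scratch space considered at scheduling
      limits:
        cpu: 500m               # throttled above this via CFS quota
        memory: 200Mi           # container is OOM-killed above this
        ephemeral-storage: 2Gi  # pod is evicted above this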
 8. Pods per Core:
The podsPerCore parameter limits the number of pods the node can run based on the number of processor cores on the node. For example, if podsPerCore is set to 10 on a node with 4 processor cores, the maximum number of pods allowed on the node is 40.
 9. Max Pods per node:
The maxPods parameter limits the number of pods the node can run to a fixed value, regardless of the properties of the node. Two parameters control the maximum number of pods that can be scheduled to a node: podsPerCore and maxPods. If you use both options, the lower of the two limits the number of pods on a node. In order to configure these parameters, label the machineconfigpool.
$ oc label machineconfigpool worker custom-kubelet=small-pods

$ oc describe machineconfigpool worker

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
 creationTimestamp: 2019-02-08T14:52:39Z
 generation: 1
 labels:
   custom-kubelet: small-pods
Create the KubeletConfig (CR) as shown below.
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
 name: set-max-pods 
spec:
 machineConfigPoolSelector:
   matchLabels:
     custom-kubelet: small-pods 
 kubeletConfig:
   podsPerCore: 10 
   maxPods: 250

 10. Limiting the bandwidth available to the Pods:
You can apply quality-of-service traffic shaping to a pod and effectively limit its available bandwidth. Egress traffic (from the pod) is handled by policing, which simply drops packets in excess of the configured rate. Ingress traffic (to the pod) is handled by shaping queued packets to effectively handle data. The limits you place on a pod do not affect the bandwidth of other pods. To limit the bandwidth on a pod, you can specify the data traffic speed using the kubernetes.io/ingress-bandwidth and kubernetes.io/egress-bandwidth annotations, as shown below.
{
    "kind": "Pod",
    "spec": {
        "containers": [
            {
                "image": "openshift/hello-openshift",
                "name": "hello-openshift"
            }
        ]
    },
    "apiVersion": "v1",
    "metadata": {
        "name": "iperf-slow",
        "annotations": {
            "kubernetes.io/ingress-bandwidth": "10M",
            "kubernetes.io/egress-bandwidth": "10M"
        }
    }
}
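A short usage sketch: save the manifest, create the pod, and read the annotations back to confirm they were applied:

$ oc create -f iperf-slow.json
$ oc get pod iperf-slow -o jsonpath='{.metadata.annotations}'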
Conclusion:
As you can see from this post, there are about 10 mechanisms available on the Kubernetes side which can be used very effectively to control the resources on worker nodes in an overcommitted state, provided the prerequisites are applied in the first place. Which mechanism to use ultimately depends on the end user and the use case he or she is trying to solve.
The post How to Handle OpenShift Worker Nodes Resources in Overcommitted State appeared first on Red Hat OpenShift Blog.
Source: OpenShift

Startup helps food companies reduce risk and maintenance costs with IBM Cloud solution

EcoPlant is helping food and beverage companies significantly improve energy use, optimize maintenance and save money. Our software as a service (SaaS) solution continually monitors and optimizes compressed air systems in near real time to help food and beverage makers, as well as companies in other industries, maintain and manage air compression systems.
Air compression systems are vital to the food and beverage industry. The systems are used every day to manufacture, shape, package and process food and beverage products. Air compression systems are also used to help clean manufacturing equipment.
The challenge for food and beverage makers is keeping air compression systems and their multiple sub-systems well maintained and running at optimal efficiency. Systems that run inefficiently can cost businesses millions in wasted energy and emit tons of carbon dioxide (CO2) into the atmosphere. They can also put food safety at risk. A single filter leak, for instance, can introduce a host of contaminants and microorganisms into food containers.
Our EcoPlant platform, a smart monitoring and control system solution built and powered by IBM Cloud technologies, offers a solution.
Accelerating platform deployment with IBM
As a young startup, we wanted to develop our platform quickly, but it was important to keep infrastructure costs down. We didn’t want to have to set up, configure and maintain our own servers. We wanted to focus on writing code and building our proactive engine logic and AI algorithms. We also needed advanced capabilities to aggregate and analyze the data we were capturing from compressors, along with security features and scalability.
We found all of this, and more, in the IBM Alpha Zone accelerator. During the 20-week program, we talked to IBM experts and received technical training and support. We also had access to IBM infrastructure, like IBM Cloud Functions, a functions as a service (FaaS) programming platform and a service of IBM Cloud. Using the built-in platform capabilities, like events and periodic execution, we built our advanced analytics engine. Best of all, we only paid for the time we used, not a penny more.
Some very talented software architects helped us develop the platform the right way. For instance, we chose the IBM Watson IoT Platform to process, secure and analyze our customers’ air compression systems data. And because the Watson IoT Platform is a service of the IBM Cloud, we also get the scalability and security capabilities we need as our business grows. Plus, telling customers we use Watson technology gives us credibility.
Bringing predictive maintenance to air compression systems
Our platform collects data from air compression systems in near real time using strategically placed sensors and smart devices called EcoBoxes. The EcoBoxes send the data to the Watson IoT Platform where it’s analyzed by the predictive, AI-powered algorithms of our advanced analytics engine. If it detects a problem with the air compression system, like a leak in a filter, it sends an alert to the operations manager so he or she can address the problem proactively.
But what’s unique about our predictive maintenance solution is that it can also dynamically control the air compression systems. So, when it detects a leak in a filter, for instance, it sends the operations manager a suggested plan to fix it, such as closing a problematic valve or compressor. If the manager agrees, the platform sends the plan to the EcoBox, which then runs it and closes the valve.
Improving facility maintenance is win-win for business and environment
Today, we have customers throughout Europe and we’re rapidly expanding into the US market from our Minnesota office.
Through predictive maintenance and by optimizing the efficiency of air compression systems, we’re helping the food and beverage industry prevent contamination. We’re also helping companies reduce energy consumption, energy waste and costs.
For instance, a global food and beverages provider in Israel cut its energy consumption by roughly 25 percent. By reducing energy use it saved a total of USD 85,000 in less than five months, and USD 170,000 annually. The plant also reduced its annual CO2 emissions by nearly 700 tons by using our platform.
Even hospitals and commercial buildings can realize these benefits by applying the technology to pumps and chillers. In fact, on average, industrial plants can realize up to 50 percent in energy savings.
It’s a win-win for businesses and the environment alike.
Learn more about the EcoPlant solution.
 
The post Startup helps food companies reduce risk and maintenance costs with IBM Cloud solution appeared first on Cloud computing news.
Source: Thoughts on Cloud

OpenShift Commons Gathering in Milan 2019 – Recap [Slides]

The first Italian OpenShift Commons Gathering brought over 300 participants to Milan!
 
On September 18th, 2019, the first OpenShift Commons Gathering Milan brought together over 300 experts to discuss container technologies, operators, the operator framework and the open source software projects that support the OpenShift ecosystem. This was the first OpenShift Commons Gathering to take place in Italy.
The standing room only event hosted 11 talks in a whirlwind day of discussions. Of particular interest to the community was Christian Glombek’s presentation updating the status and roadmap for OKD4 and CoreOS.
Highlights from the Gathering included an OpenShift 4 Roadmap Update, customer stories from Amadeus, the leading travel technology company, and local stories from Poste Italiane and SIA S.p.A. In addition to the technical updates and customer talks, there was plenty of time to network during the breaks and enjoy the famous Italian coffee.
Here are the slides from the event:
{please note: edited videos will be uploaded to YouTube soon}

9:30 a.m.
Welcome to the Commons: Collaboration in Action
Diane Mueller (Red Hat)
Slides
Video

9:50 a.m.
Red Hat’s Unified Hybrid Cloud Vision
Brian Gracely (Red Hat)
Slides
Video

10:30 a.m.
OpenShift 4.1 Release Update and Road Map
William Markito Oliveira (Red Hat)  |  Christopher Blum (Red Hat)
Slides
Video

11:30 a.m.
Customer Keynote: OpenShift @ Amadeus
Salvatore Dario Minonne (Amadeus)
Slides
Video

12:00 p.m.
State of the Operators: Framework, SDKs, Hubs and beyond
Guil Barros (Red Hat)
Slides
Video

12:30 p.m.
Update on OKD4 and Fedora CoreOS
Christian Glombek (Red Hat)
Slides
Video

2:00 p.m.
OpenShift Managed on Azure
Marco D’Angelo (Microsoft)
Slides
Video

2:30 p.m.
Open Banking with Microservices Architectures and Apache Kafka on OpenShift
Paolo Gigante (Poste Italiane) | Pierluigi Sforza (Poste Italiane) | Paolo Patierno (Red Hat)
Slides
Video

3:00 p.m.
State of Serverless/Service Mesh
Giuseppe Bonocore (Red Hat) | William Markito Oliveira (Red Hat)
Slides
Video

4:15 p.m.
Case Study: OpenShift @ SIA
Nicola Nicolotti (SIA S.p.A.) | Matteo Combi (SIA S.p.A.)
Slides
Video

4:45 p.m.
State of Cloud Native Storage
Christopher Blum (Red Hat)
Slides
Video

5:10 p.m.
AMA panel
Engineers & Product Managers (Red Hat OpenShift) + customer
 N/A
Video

5:30 p.m.
Road Ahead at OpenShift Wrap-Up
Diane Mueller & Tanja Repo (Red Hat)
Slides
Video

 
To stay updated on all the latest releases and events, please join OpenShift Commons and our mailing lists & Slack channel.
 
What is OpenShift Commons?
Commons builds connections and collaboration across OpenShift communities, projects, and stakeholders. In doing so we enable the success of customers, users, partners and contributors as we deepen our knowledge and experiences together.
Our goals go beyond code contributions. Commons is a place for companies using OpenShift to accelerate its success and adoption. To do this we’ll act as resources for each other, share best practices and provide a forum for peer-to-peer communication.
Join OpenShift Commons today!
 
Join us in the upcoming Commons Gatherings!
The OpenShift Commons Gatherings continue – please join us next time at:

October 28, 2019 in San Francisco, California – event is co-located with ODSC/West
November 18, 2019 in San Diego, California –  event is co-located with Kubecon/NA

 
The post OpenShift Commons Gathering in Milan 2019 – Recap [Slides] appeared first on Red Hat OpenShift Blog.
Source: OpenShift

International airline Etihad Airways delivers upscale flight booking with solution built on IBM Cloud

The airline industry is going through huge transformation both in terms of providing customer service and finding new ways to provide a better travel experience.
Etihad Airways is the national airline of the United Arab Emirates and hospitality is a key part of the culture. Excelling in customer service is important and having a technology platform that enables a friendly user experience is what we want to achieve and what our travelers have come to expect.
Creating improved flight booking to help travelers Choose Well
The first touch point we chose to address is traveler check-in. How can we create a technology solution that provides the same fast and consistent experience across all touch points, digital and physical? What this means is that whether a traveler starts the flight booking on an iPad, then moves to a mobile phone, then goes to the airport, we want them to have a consistent, easy and intuitive travel experience.
We were looking for a partner that would enable us to achieve this objective as fast as possible. Speed to market is important because the travel industry is very competitive.
We have a strategy, which was launched last year, branded “Choose Well.” Through Choose Well, we aim to empower our travelers to choose the right product at the right price point and with the option of the right ancillaries and services to meet their needs. The marketing campaign began in late November and we had to very quickly transform our technology to match that branding experience.
Moving past established technologies and silos to land a seamless solution
One of our challenges was the fact that the airline industry uses a lot of established technologies and silo-based solutions. To provide a transparent and seamless travel experience, we needed to have the platforms in place that could deliver that flight booking functionality across all devices and all use cases.
Originally, the strategy was to build the microservices to deliver this flight booking functionality from scratch. However, doing that has some risk and time penalty. In connecting with other airline teams and IBM representatives, we learned that using the IBM Cloud, which has industry-specific solutions to the travel sector, would significantly reduce our development time by using some of the existing assets that the technology offers.
The ability to work within an IBM Garage to further speed innovation was also fundamental to our decision to work with IBM. This way, we could engage various stakeholders from both companies in a constructive way and drive an outcome as quickly as possible.
We used IBM Garage methodologies at a nearby IBM site to co-create a flight booking solution with IBM. Our companies worked together as a single team to deliver what we were looking to achieve using existing assets, microservices and APIs to connect in an easier and significantly more efficient way. With the pre-built microservices architecture, our service orchestration platform and IBM API Connect, we’re able to connect very quickly and easily to existing systems like Sabre and also to new technologies like WhatsApp.
Gaining speed, improving efficiencies and enhancing customer experience
A very quick and successful proof-of-concept (POC) project gave us confidence in the solution path. Following that proof of concept, we launched the first release of the minimum viable product (MVP) in just 15 weeks.
With this project, the team started deploying the solution in the United States to be close to our host system. It was subsequently moved to the IBM Cloud data centers in the United Kingdom and then to Germany to optimize cost and performance. Moving workload between data centers was something that in the past would have taken weeks. With IBM Cloud, it took just hours.
By working with the IBM team and using IBM Garage methodologies, we achieved our goal. We are now offering a consistent and a more innovative flight booking and travel experience to our customers across touch points.
Learn more about IBM Garage and schedule a no-charge visit with the IBM Garage to get started.
The post International airline Etihad Airways delivers upscale flight booking with solution built on IBM Cloud appeared first on Cloud computing news.
Source: Thoughts on Cloud

9 steps to awesome with Kubernetes/OpenShift presented by Burr Sutter

Burr Sutter gave a terrific talk in India in July, where he laid out the terms, systems and processes needed to set up Kubernetes for developers. This is an introductory presentation, which may be useful for your larger community of Kubernetes users once you’ve already set up User Provisioned Infrastructure (UPI) in Red Hat OpenShift for them, though it does go into the deeper details of actually running a cluster. To follow along, Burr created an accompanying GitHub repository, so you too can learn how to set up an awesome Kubernetes cluster in just 9 steps.

The post 9 steps to awesome with Kubernetes/OpenShift presented by Burr Sutter appeared first on Red Hat OpenShift Blog.
Source: OpenShift

A Look into the Technical Details of Kubernetes 1.16

Kubernetes 1.16 is expected this week, and we want to highlight the technical features that enterprise Kubernetes users should know about. With Custom Resource Definitions (CRDs) moving into official general availability, storage improvements, and more, this release hardens the project and celebrates the main extension points for building cloud native applications on Kubernetes.
CRDs to GA
Custom Resource Definitions (CRDs) were introduced into upstream Kubernetes by Red Hat engineers in version 1.7. From the beginning, they were designed as a future-proof implementation of what was previously prototyped as ThirdPartyResources. The road of CRDs has focused on the original goal of making custom resources production-ready, culminating in CRDs becoming a generally available feature in Kubernetes, highlighted by the promotion of the API to v1 in 1.16.
CRDs have become a cornerstone of API extensions in the Kubernetes ecosystem, and are the basis of innovation and a core building block of OpenShift 4. Red Hat has continued pushing CRDs forward ever since, as one of the main drivers in the community behind the feature and stability improvements which finally led to the v1 API. This progress made OpenShift 4 possible.
Let’s take a deeper look at what will change in the v1 API of Custom Resource Definitions (in the apiextensions.k8s.io/v1 API group). The main theme is consistency of the data stored in CustomResources:
The goal is that consumers of data stored in CustomResources can rely on it having been validated on creation and on every update, such that the data:

follows a well-known schema
is strictly typed
and only contains values that were intended by the developers to be stored in the CRD.

 
We know all of these properties from native resources, for example, pods:
Pods have a well-known structure for metadata, spec, spec.containers, spec.volumes, etc.

Every field in a pod is strictly typed, e.g. every field is either a string, a number, an array, an object or a map. Wrong types are rejected by the kube-apiserver: one cannot put a string where a number is expected.
Unknown fields are stripped when creating a pod: a user or a controller cannot store arbitrary custom fields, e.g. pod.spec.myCustomField. For API compatibility reasons the Kubernetes API conventions say to drop and not reject those fields.

In order to fulfill these 3 properties, CRDs in the v1 version of apiextensions.k8s.io require:

1. That a schema is defined (in `CRD.spec.versions[n].schema`) – example:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.example.openshift.io
spec:
  group: example.openshift.io
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              cronSpec:
                type: string
              image:
                type: string
              replicas:
                type: integer
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
2. That the schema is structural (https://kubernetes.io/blog/2019/06/20/crd-structural-schema/) — KEP: https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/20190425-structural-openapi.md – the example above is structural.
3. That pruning of unknown fields (those which are not specified in the schema of (1)) is enabled (pruning used to be opt-in in v1beta1 via `CRD.spec.preserveUnknownFields: false`) — KEP: https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/20180731-crd-pruning.md – pruning is enabled for the example as it is a v1 manifest where this is the default.

These are all formal requirements about CRDs and their CustomResources, checked by the kube-apiserver automatically. But there is an additional dimension of high quality APIs in the Kubernetes ecosystem: API review and approval.
API Review and Approval
Getting APIs right does not only mean to be a good fit for the described business logic. APIs must be

compatible with Kubernetes API Machinery of today and tomorrow,
future-proof in their own domain, i.e. certain API patterns are good and some are knowingly bad for later extensions.

The core Kubernetes developers had to learn painful lessons in the first releases of the Kubernetes platform, and eventually introduced a process called “API Review”. There is a set of people in the community who very carefully and with a lot of experience review every change against the APIs that are considered part of Kubernetes. Concretely, these are the APIs under the *.k8s.io domain.
To make it clear to the API consumer that APIs in *.k8s.io are following all quality standards of core Kubernetes, CRDs under this domain must also go through the API Review process (this is not a new requirement, but has been in place for a long time) and – and this is new – must link the API review approval PR in an annotation:
metadata:
  annotations:
    "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/78458"
Without this annotation, a CRD under the *.k8s.io domain is rejected by the API server.
There are discussions about introducing other reserved domains for the wider Kubernetes community, e.g. *.x-k8s.io, with different, lower requirements than for core resources under *.k8s.io.
CRD Defaulting to Beta
Next to the presented theme of CRD data consistency, another important feature in 1.16 is the promotion of defaulting to beta. Defaulting is known to everybody for native resources, i.e. unspecified fields in a manifest are automatically set to default values on creation by the kube-apiserver.
For example, pod.spec.restartPolicy defaults to Always. Hence, if the user does not set that field, the API server will set and persist Always as the value.
Also old objects already persisted in etcd can get new fields when read from etcd using the defaulting mechanism. This is an important difference to mutating admission webhooks, which are not called on read from etcd, and hence cannot simulate real defaulting.
Defaults are an important API feature which heavily drives API design. Defaults are now definable in CRD OpenAPI schemas. Here is an example from an OpenShift 4 CRD:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
 name: kubeapiservers.operator.openshift.io
spec:
 scope: Cluster
 group: operator.openshift.io
 names:
   kind: KubeAPIServer
   plural: kubeapiservers
   singular: kubeapiserver
 subresources:
   status: {}
 versions:
  - name: v1
   served: true
   storage: true
   schema:
     openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              logLevel:
                type: string
                default: Normal
              managementState:
                pattern: ^(Managed|Force)$
                default: Managed
                type: string

When such an object is created without explicitly setting logLevel and managementState, the log level will be Normal and the managementState will be Managed.
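On a running OpenShift 4 cluster you can observe this by reading back a field you never set; assuming the cluster's defaults have not been overridden, something like the following should print Normal:

$ oc get kubeapiserver cluster -o jsonpath='{.spec.logLevel}'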
Kubectl independence
Kubectl came to life almost five years ago as a replacement for the initial CLI for Kubernetes: kubecfg. Its main goals were:

improved user experience 
and modularity. 

Initially, these goals were met, but over time kubectl flourished in some places and not in others. Red Hat engineers have worked on the extensibility and stability of kubectl since the beginning, because this was required to make pieces of OpenShift, an enterprise distribution of Kubernetes, possible.
The initial discussions about the possibility of splitting kubectl out of the main Kubernetes repository to allow faster iteration and shorter release cycles were started almost two years ago. Unfortunately, the years the kubectl code lived in the main Kubernetes repository caused it to have a tight coupling with some of the internals of Kubernetes. 
Several Red Hat engineers were involved in this effort from the start, refactoring the existing code to make it less coupled with internals, exposing libraries such as k8s.io/api (https://github.com/kubernetes/api/) and k8s.io/client-go (https://github.com/kubernetes/client-go/), to name a few, which are the foundation for many of the existing integrations. 
One of the biggest offenders in that internals fight was the fact that the entire kubectl code base relied on the internal API versions (in other words, the internal representation of all the resources exposed by the kube-apiserver). Changing this required a lot of manual and mundane work to rewrite every piece of code to properly work with the external, official APIs (in other words, the ones you work with on a regular basis when interacting with a cluster).
Many long hours of sometimes hard, other times dull work were put into this effort, which resulted in the recent initial brave step that moved (almost all) kubectl code to a staging directory. In short, a staging repository is one that is treated as an external one, having its own distinct import path (in this case k8s.io/kubectl).
Reaching this first visible goal brings us several important implications. Kubectl is currently being published (through the publishing-bot) into its own repository that can be easily consumed by the external actors as k8s.io/kubectl. Even though there are a few commands left in the main kubernetes tree, we are working hard on closing this gap, while trying to figure out how the final extraction piece will work, mostly from the testing and release point of view.
Storage improvements
For this release, SIG-storage focused on bringing feature parity between Container Storage Interface (CSI) and in-tree drivers, as well as improving the stability of CSI sidecars and filling in functionality gaps.
We are working on migrating in-tree drivers and replacing them with their CSI equivalent. This is an effort across releases, with more work to follow, but we made steady progress. 
Some of the features the Red Hat storage team designed and implemented include:

Volume cloning (beta) to allow users to create volumes from existing sources (see the sketch after this list).
CSI volume expansion as default, which brings feature parity between in-tree and CSI drivers. 
Raw block support improvements and bug fixes, especially when using raw block on iSCSI volumes.
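As a sketch of the volume cloning API mentioned above (the names and storage class are illustrative assumptions, and the underlying CSI driver must support cloning), a clone is requested by pointing a new PVC's dataSource at an existing claim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cloned-pvc
spec:
  storageClassName: csi-sc        # illustrative; must match the source PVC's class
  dataSource:
    kind: PersistentVolumeClaim
    name: source-pvc              # existing PVC in the same namespace
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi               # must be at least the source PVC's size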

Learn more
Kubernetes 1.16 brings enhancements to CRDs and storage. Check out the Kubernetes 1.16 release notes to learn more.
The post A Look into the Technical Details of Kubernetes 1.16 appeared first on Red Hat OpenShift Blog.
Source: OpenShift

Talium, Irene Energy remove barriers to accessing electricity in Africa

Approximately 600 million people do not have electricity in Africa according to reports on World Bank data. Even though progress has been made to get more people in Africa on the grid, the absolute number of people without power remains the same due to population growth.
In rural areas, cell phones are vital for people with no access to banks to send and receive money, access medical care and stay in contact with family and friends, according to The African Gourmet. Some people have to walk miles to the nearest town to drop off their cell phone for charging, and the wait could be three days. Furthermore, they may have to allocate as much as 24 percent of their daily living allowance per charge.
Talium has brought its expertise in blockchain projects and experience as an energy sector systems integrator to Irene Energy. Together, the companies have architected a solution to improve access to electricity in Africa and lower costs.
Reducing the costs of accessing electricity in Africa
Universal electrification is hard and expensive, according to Quartz Africa. Grid connections cost anywhere from $250 to more than $2,500 depending on proximity to the grid. Mini-grids that offer a grid-like service still cost between $500 and $1,500 to connect each household. These are steep costs for both providers and consumers.
Irene Energy wanted to create a flexible and cost-effective back-office infrastructure for energy service providers built on blockchain technologies. It chose the Stellar payment network to enable low-cost micropayments and needed a secure way to manage user credentials so that smaller companies and individuals could participate in the energy market. Typically, the way to address this requirement is through hardware with built-in encryption. But that would be an expensive proposition and contrary to our project goal of reducing the costs of accessing electricity in Africa. So, we instead looked to lower the costs of the back-office technologies being used by energy service providers.
We chose IBM Cloud Data Shield, which runs containerized applications in a secure enclave on an IBM Cloud Kubernetes Service host. This solution simplifies data-in-use protection by a huge margin, while at the same time addressing the huge scalability concerns of the Irene Energy project. With the Irene Energy platform, there are no up-front costs. This is not only because the platform is on the cloud, but also because companies do not have to design their applications to be compatible with security requirements. Instead, IBM Cloud Data Shield automates that security process.
More affordable and accessible electricity
It was a great experience to partner with IBM because we could see that there was mutual interest in building something together. IBM really cares about what’s going on in the field; and, in this case, the lack of electricity in Africa. We felt that IBM wanted to address this very real concern with a constructive solution.
Reducing the cost of back-office technology for electric service providers with IBM Cloud means electricity can be available to more people in rural areas.
The Irene Energy platform is making electricity more affordable and accessible for millions of people in Africa. It also facilitates electricity roaming and shared ownership of electricity assets.
Watch the video or read the case study to learn more.
The post Talium, Irene Energy remove barriers to accessing electricity in Africa appeared first on Cloud computing news.
Source: Thoughts on Cloud

4 steps to modernize and cloud-enable applications

Customers today are no longer satisfied by the traditional consumer-business relationship. Instead, they expect engaging and informative digital experiences. In order to match these expectations and stay ahead of the curve, organizations must lean into digital transformation. Businesses need to modernize both customer-facing and enterprise applications to support a customer-centric approach to business. Developing a strategy to cloud-enable applications is crucial in gaining and maintaining a competitive advantage, especially when, according to a Forrester study, by 2023, 90 percent of current applications will still be in use, but most won’t have received sufficient modernization investment. This means there’s an opportunity for businesses who invest now in application modernization.
Application modernization: Taking a phased approach
Cloud-enabling applications doesn’t have to be an all-or-nothing proposition. Application modernization is best achieved by taking a phased approach, one that’s tailored to business goals and application architecture.
Companies can simplify and extend functionality while still meeting business and IT requirements by carefully choosing which applications to prioritize when modernizing for a hybrid cloud environment. This allows organizations to capitalize on the benefits of the cloud without disrupting existing workloads in on-premises environments.
Prime applications for digital transformation through the following four steps to gain a competitive edge.
1. Simplify with containers
Putting existing applications into containers is the first step to simplifying application deployment and management. Containers encapsulate the application with minimal or no changes to the application itself, enabling consistent testing and deployment that reduces costs and simplifies operations.
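As a minimal sketch (the base image and artifact path are illustrative assumptions), containerizing an existing Java service without touching its code can be as simple as:

# Illustrative Dockerfile: package an existing build artifact, no code changes
FROM registry.access.redhat.com/ubi8/openjdk-11
COPY target/app.jar /deployments/app.jar
EXPOSE 8080
CMD ["java", "-jar", "/deployments/app.jar"]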
2. Extend with APIs
Extend existing applications with APIs that securely expose their full capabilities to developers. The applications become reusable across clouds to easily access and build new capabilities. Beyond APIs, this approach relies on an agile integration strategy that supports the volume of connections and variety of architectures required.
3. Decompose with microservices
Use microservices to break down monolithic applications into deployable components, where each component performs a single function. Businesses can then further enhance development agility and efficiency by putting each microservice in its own container. Using Kubernetes, companies can then manage and deliver the microservices of existing applications.
4. Refactor with new microservices
Refactoring involves building new microservices. In some instances, it may be easier to develop new applications utilizing cloud-native development practices instead of working with a current monolith. This provides teams with the ability to deliver innovation to users, encourage creative thinking and allow developers to experiment in a low-risk fashion.
Find your next step to modernize applications
Application modernization is a critical aspect of business modernization. Leading organizations that prioritize cloud-enabling their applications are breaking away from the competition by enhancing customer experiences and accelerating development and delivery.
Read the smart paper “Simplify and extend apps with an open, hybrid cloud” to learn more about application modernization and the unique approach, tools and solutions offered by IBM for application modernization.
The post 4 steps to modernize and cloud-enable applications appeared first on Cloud computing news.
Source: Thoughts on Cloud