OpenShift 4.1 UPI environment deployment on Microsoft Azure Cloud

Red Hat released Red Hat OpenShift Container Platform 4.1 (OCP4) earlier this year, introducing installer provisioned infrastructure and user provisioned infrastructure approaches on Amazon Web Services (AWS). The installer provisioned infrastructure method is quick and only requires AWS credentials, access to Red Hat telemetry, and a domain name. Red Hat also released a user provisioned infrastructure approach, where OCP4 can be deployed by leveraging CloudFormation templates and using the same installer to generate Ignition configuration files.
Since Microsoft Azure is getting more and more business attention, a natural question is: when will OCP4 be released on Azure Cloud? At the time of writing, OCP4 on Azure using the installer is in developer preview.
One of the main challenges with running OCP4 on Azure with the installer provisioned infrastructure method is setting up custom Ingress infrastructure (e.g. custom Network Security Groups or a custom Load Balancer for routers), because the Cluster Ingress Operator creates a Public facing Azure Load Balancer to serve routers by default, and once the cluster is deployed, the Ingress Controller type cannot be changed.
If it is deleted, or the OpenShift router Service type is changed, the Cluster Ingress Operator will reconcile and recreate the default controller object.
Trying to alter Network Security Groups by whitelisting allowed IP ranges will cause Kubernetes to reconcile the configuration back to its desired state.
One way around this is to deploy OCP4 on Azure Cloud by creating the objects manually with the user provisioned infrastructure approach, and then recreate the default Ingress Controller object just after the control plane is deployed.
OpenShift Container Platform 4.1 components
Our cluster consists of 3 master nodes and 2 compute nodes. Master nodes are fronted with 2 Load Balancers: 1 Public facing for external API calls and 1 Private for internal cluster communication. Compute nodes use the same Public facing Load Balancer as the masters, but if needed they can each have their own Load Balancer.

Figure 1. OCP 4.1 design diagram with user provisioned infrastructure on Azure Cloud
Instance sizes
The OpenShift Container Platform 4.1 environment has some minimum hardware requirements.

Instance type   Bootstrap   Control plane   Compute nodes
D2s_v3          -           -               X
D4s_v3          X           X               X

The VM sizes above might change once OpenShift Container Platform 4.1 is officially released for Azure.
Azure Cloud preparation for OCP 4.1 installation
The preparation steps here are the same as for Installer Provisioned Infrastructure. You need to complete these steps:

DNS Zone.
Credentials.
Cluster Installation (follow the guide until the cluster deployment section).

NOTE: A free Trial account is not sufficient; a Pay-As-You-Go subscription with an increased vCPU quota is recommended.
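If you are unsure whether your subscription has enough headroom, you can check the vCPU usage and limits for your target region with the Azure CLI (assuming az is installed and you are logged in); for example:
az vm list-usage --location uksouth --output table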
 
User Provisioned Infrastructure based OCP 4.1 installation
When using this method, you can:

Specify the number of masters and workers you want to provision
Change Network Security Group rules in order to lock down the ingress access to the cluster 
Change Infrastructure component names
Add tags

This Terraform-based approach splits the VMs across 3 Azure Availability Zones and uses 2 Zone Redundant Load Balancers (1 Public facing to serve OCP routers and the api endpoint, and 1 Private to serve api-int).
Deployment can be split into 4 steps:

Create the Control Plane (masters) and Surrounding Infrastructure (LB, DNS, VNET, etc.)
Set the default Ingress controller to type “HostNetwork”
Destroy Bootstrap VM
Create Compute (worker) nodes

This method uses the following tools:

terraform >= 0.12
openshift-cli
git
jq (optional)

Prerequisites
We will deploy Red Hat OpenShift Container Platform v4.1 on Microsoft Azure Cloud using Terraform, since it is one of the most popular Infrastructure-as-Code tools.
Download Git repository content containing terraform scripts:
git clone https://github.com/JuozasA/ocp4-azure-upi.git
cd ocp4-azure-upi
Download the openshift-install binary and the pull secret; both can be obtained by following this link.
 
Copy openshift-install binary to /usr/local/bin directory:
 cp openshift-install /usr/local/bin/
 
Generate install config files:
./openshift-install create install-config --dir=ignition-files
? SSH Public Key /home/user_id/.ssh/id_rsa.pub
? Platform azure
? azure subscription id xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
? azure tenant id xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
? azure service principal client id xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
? azure service principal client secret xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
? Region <Azure region>
? Base Domain example.com
? Cluster Name <cluster name. this will be used to create subdomain, e.g. test.example.com>
? Pull Secret [? for help]
Edit the install-config.yaml file to set the number of compute, or worker, replicas to 0:
compute:
- hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 0
Generate the Kubernetes manifests which define the objects the bootstrap node will create initially:
openshift-install create manifests --dir=ignition-files
Remove the files that define the control plane machines and worker machinesets:
rm -f ignition-files/openshift/99_openshift-cluster-api_master-machines-*
rm -f ignition-files/openshift/99_openshift-cluster-api_worker-machineset-*
Because you create and manage the worker machines yourself, you do not need to initialize these machines.
Obtain the Ignition config files. You can read more about the Ignition utility here.
openshift-install create ignition-configs --dir=ignition-files
Extract the infrastructure name from the Ignition config file metadata:
 jq -r .infraID ignition-files/metadata.json
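For convenience, you can store the infrastructure name in a shell variable (the variable name here is just an example) and reuse it when filling in terraform.tfvars:
INFRA_ID=$(jq -r .infraID ignition-files/metadata.json)
echo $INFRA_ID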

Open terraform.tfvars file and fill in the variables:
azure_subscription_id = "" 
azure_client_id = ""
azure_client_secret = ""
azure_tenant_id = ""
azure_bootstrap_vm_type = "Standard_D4s_v3" <- Size of the bootstrap VM
azure_master_vm_type = "Standard_D4s_v3" <- Size of the Master VMs
azure_master_root_volume_size = 64 <- Disk size for Master VMs
azure_image_id = "/resourceGroups/rhcos_images/providers/Microsoft.Compute/images/rhcostestimage" <- Location of coreos image
azure_region = "uksouth" <- Azure region (the one you've selected when creating install-config)
azure_base_domain_resource_group_name = "ocp-cluster" <- Resource group for base domain and rhcos vhd blob
cluster_id = "openshift-lnkh2" <- infraID parameter extracted from metadata.json
base_domain = "example.com"
machine_cidr = "10.0.0.0/16" <- Address range which will be used for VMs
master_count = 3 <- number of masters
Open worker/terraform.tfvars and fill in the information there as well, along the lines of the sketch below.
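As a rough illustration, the worker variables mirror the ones above; a minimal sketch (exact variable names may differ from the repository defaults, so check the file itself) might look like:
azure_worker_vm_type = "Standard_D2s_v3"
azure_worker_root_volume_size = 64
worker_count = 2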
Start OCP v4.1 Deployment
Initialize Terraform directory:
terraform init
Run Terraform Plan and check what resources will be provisioned:
terraform plan
Once ready, run Terraform apply to provision Control plane resources:
terraform apply
Once the Terraform job is finished, run openshift-install, which waits until bootstrapping is complete:
openshift-install wait-for bootstrap-complete --dir=ignition-files
Once the bootstrapping is finished, export the kubeconfig environment variable and replace the default Ingress Controller object with one whose endpointPublishingStrategy is of type HostNetwork. This disables the creation of the Public facing Azure Load Balancer and allows you to keep custom Network Security Rules which won't be overwritten by Kubernetes.
export KUBECONFIG=$(pwd)/ignition-files/auth/kubeconfig

oc delete ingresscontroller default -n openshift-ingress-operator

oc create -f ingresscontroller-default.yaml
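The ingresscontroller-default.yaml file is provided in the repository; conceptually it defines an IngressController with the HostNetwork publishing strategy, roughly along these lines (a minimal sketch, not necessarily the exact file contents):
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  endpointPublishingStrategy:
    type: HostNetwork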
Since we don’t need the bootstrap VM anymore, we can remove it:
terraform destroy -target=module.bootstrap
Now we can continue with provisioning the Compute nodes:
cd worker
terraform init
terraform plan
terraform apply
cd ../
Since we are provisioning Compute nodes manually, we need to approve kubelet CSRs:
worker_count=`cat worker/terraform.tfvars | grep worker_count | awk '{print $3}'`
while [ $(oc get csr | grep worker | grep Approved | wc -l) != $worker_count ]; do
  oc get csr -o json | jq -r '.items[] | select(.status == {}) | .metadata.name' | xargs oc adm certificate approve
  sleep 3
done
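Once the CSRs have been approved, the new Compute nodes should join the cluster; you can confirm with:
oc get nodes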
Check openshift-ingress service type (it should be type: ClusterIP):
oc get svc -n openshift-ingress
NAME                      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                   AGE
router-internal-default   ClusterIP   172.30.72.53   <none>        80/TCP,443/TCP,1936/TCP   37m
Wait for the installation to complete by running the openshift-install command:
openshift-install wait-for install-complete --dir=ignition-files
The last command will output the cluster console URL and the kubeadmin username/password.
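The kubeadmin password is also written next to the kubeconfig, so you can retrieve it later if needed:
cat ignition-files/auth/kubeadmin-password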
Scale Up
To add additional worker nodes, use the Terraform scripts in the scaleup directory. Fill in the variables in its terraform.tfvars:
azure_subscription_id = ""
azure_client_id = ""
azure_client_secret = ""
azure_tenant_id = ""
azure_worker_vm_type = "Standard_D2s_v3"
azure_worker_root_volume_size = 64
azure_image_id = "/resourceGroups/rhcos_images/providers/Microsoft.Compute/images/rhcostestimage"
azure_region = "uksouth"
cluster_id = "openshift-lnkh2"
Run terraform init and then apply the script:
cd scaleup
terraform init
terraform apply
It will ask you to provide the Azure Availability Zone number where you would like to deploy the new node, and the worker node number (if it is the 4th node, the number is 3, since indexing starts from 0 rather than 1).
Approving server certificates for nodes
To allow the API server to communicate with the kubelet running on each node, you need to approve the CSRs generated by the kubelets.
You can approve all Pending CSR requests using:
oc get csr -o json | jq -r '.items[] | select(.status == {}) | .metadata.name' | xargs oc adm certificate approve
Conclusion
OpenShift Container Platform 4.1 Internet ingress access can be restricted by changing Network Security Groups on Azure Cloud if we inform the Ingress controller not to create a public facing Load balancer. Since we are using Terraform to provision infrastructure, multiple infrastructure elements are changeable and the whole OpenShift Container Platform 4.1 infrastructure provisioning can be added to the wider infrastructure provisioning pipeline, e.g. Azure DevOps.
It is worth mentioning that at the time of writing, Red Hat OpenShift Container Platform 4.1 deployed on user provisioned infrastructure is not yet supported on Microsoft Azure Cloud, and some features might not work as you expect; e.g. the internal image registry is ephemeral, and all images will be lost if the image registry pod gets restarted.
 