Manage and Auto-scale your IoT solution with a predictable IoT Cloud

As companies continue to fully roll out their IoT projects, managing the various components of the solution becomes a critical part of their operations. The flexibility of Azure IoT Hub lets customers start small and pay only for the IoT Hub capacity they need at any point along the device deployment curve, which helps keep the cost of an IoT solution predictable.

However, the potentially irregular rate of device and message growth in an IoT solution does add a unique challenge for operations. When the number of messages ingested from devices in a given day exceeds the limit of the chosen IoT Hub capacity, the IoT Hub will begin to reject messages until either the IoT Hub is scaled up or the daily quota resets at the start of the next day (UTC). Wouldn’t it be nice to have IoT Hub automatically scale up to a higher capacity when a certain message threshold is met, before this limit is reached?

While IoT Hub does not currently have this capability built into the service, we have published a sample solution for monitoring your IoT Hub and automatically scaling it when a specific message threshold is reached. The sample, published on the Azure-Samples site, leverages the Azure Durable Functions framework and the IoT Hub Management Client to continually monitor the consumption of your IoT Hub message quota and, when needed, programmatically scale up your IoT Hub capacity.

Azure Durable Functions

To orchestrate our IoT Hub scaling solution, we leverage the Singleton Orchestrator pattern of the Azure Durable Functions framework. The key benefit of this pattern is the ability to ensure that exactly one instance of the scaling solution for a given IoT Hub is running at a time, which frees us from worrying about possible race conditions between multiple instances of our scaling function running concurrently. The pattern consists of three functions that operate our solution:

IotHubScaleInit – this function is executed on a regular timer (by default, once per hour). It checks whether an instance of the orchestrator function is running and, if not, starts one.
IotHubScaleOrchestrator – this function implements the orchestrator for the solution. Its role in the pattern is to manage the execution of the worker function.
IotHubScaleWorker – this function performs the work of checking whether the IoT Hub needs to be scaled and, if so, scaling it.

We start with the timer-triggered IotHubScaleInit function, which runs occasionally (in the sample, once an hour), checks whether an instance of the orchestrator is already running and, if not, starts one. The relevant code from the IotHubScaleInit function is shown below, with some code removed for brevity.

// A fixed, well-known instance ID so only one orchestration can exist for this IoT Hub
const string IotHubScaleOrchestratorInstanceId = "iothubscaleorchestrator_1";

// Check whether an orchestration with this instance ID is already running
var existingInstance = await starter.GetStatusAsync(IotHubScaleOrchestratorInstanceId);

if (existingInstance == null)
{
    // No instance found, so start the orchestrator using our well-known instance ID
    await starter.StartNewAsync(IotHubScaleOrchestratorName, IotHubScaleOrchestratorInstanceId, input: null);
}

The key to this function is the constant instance ID. By default, when you launch an orchestrator, the framework generates a unique instance ID. By specifying our own well-known ID, we can use the GetStatusAsync function to check whether that instance is already running.

The IotHubScaleOrchestrator function, as the name implies, orchestrates the execution of the solution. It recovers from failures during execution and allows the code to be dehydrated while waiting for the next execution. Most importantly, it lets us kick off another instance of the scaling function only after the existing one finishes, which is the critical part of making sure we never have more than one instance executing at a given time. The key parts of this function are:

// Run the worker function and wait for it to complete
await context.CallActivityAsync(IotHubScaleWorkerName);

// Sleep (durably) until the next scheduled check
DateTime wakeupTime = context.CurrentUtcDateTime.Add(TimeSpan.FromMinutes(JobFrequencyMinutes));
await context.CreateTimer(wakeupTime, CancellationToken.None);

// End this instance and schedule a new one under the same instance ID
context.ContinueAsNew(null);

After calling and waiting on the worker function, we create a durable timer via the Durable Functions framework. The ContinueAsNew method of the context object then tells the framework to end this instance and schedule another one to start when the timer expires. The framework takes care of the rest.
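Tying these pieces together, the orchestrator is declared as a normal Durable Functions orchestration. The following is a minimal sketch of that wiring, assuming the Durable Functions 1.x API; the class name and the placeholder values for IotHubScaleWorkerName and JobFrequencyMinutes are illustrative, and the sample's actual function includes additional logging and error handling.

using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

public static class IotHubScaleOrchestration   // illustrative class name, not the sample's
{
    // Placeholder values; the sample defines its own constants/configuration
    const string IotHubScaleWorkerName = "IotHubScaleWorker";
    const int JobFrequencyMinutes = 10;

    [FunctionName("IotHubScaleOrchestrator")]
    public static async Task RunOrchestrator(
        [OrchestrationTrigger] DurableOrchestrationContext context)
    {
        // Run one pass of the scaling check and wait for it to finish
        await context.CallActivityAsync(IotHubScaleWorkerName, null);

        // Wait until the next scheduled check using a durable timer
        DateTime wakeupTime = context.CurrentUtcDateTime.Add(TimeSpan.FromMinutes(JobFrequencyMinutes));
        await context.CreateTimer(wakeupTime, CancellationToken.None);

        // End this instance and schedule a fresh one under the same instance ID
        context.ContinueAsNew(null);
    }
}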

The remainder of the solution is the IotHubScaleWorker function, which performs the actual work of checking the status of the IoT Hub usage and, if necessary, scaling it.

IoT Hub Management Client

The IoT Hub Management Client enables you to interact with the control plane of the IoT Hub service, including creating, deleting, and managing the configuration of your IoT Hubs. Within the worker function, the client does all of the heavy lifting of interacting with the IoT Hub service.
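Before the worker can make these calls, it needs an authenticated management client. The snippet below is a minimal sketch of how that client might be constructed, assuming a service principal with rights to the IoT Hub's resource group and the Microsoft.Azure.Management.IotHub package; the angle-bracket values are placeholders, and the sample itself reads its configuration from the function app's settings.

using Microsoft.Azure.Management.IotHub;
using Microsoft.Rest.Azure.Authentication;

// Sketch: authenticate as a service principal and create the IoT Hub management client.
var credentials = await ApplicationTokenProvider.LoginSilentAsync("<tenant-id>", "<client-id>", "<client-secret>");
var client = new IotHubClient(credentials)
{
    SubscriptionId = "<subscription-id>"
};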

For example, the following two lines from the code retrieve the current configuration details of the IoT Hub and its current operational metrics. From the configuration, the most important details for our purposes are the current SKU (S1, S2, or S3) and the current number of units. From the metrics, the one of interest is TotalMessages, which gives the number of messages the IoT Hub has ingested so far that day.

// Current configuration of the hub, including its SKU and unit count
IotHubDescription desc = client.IotHubResource.Get(ResourceGroupName, IotHubName);

// Current quota metrics, including the TotalMessages count for the current day
IPage<IotHubQuotaMetricInfo> mi = client.IotHubResource.GetQuotaMetrics(ResourceGroupName, IotHubName);

Once we have that information, we determine, via a couple of helper functions included in the sample, whether we need to scale the IoT Hub by comparing the current message count with a defined threshold for that SKU/unit combination. If we do, we simply update the SKU and units on the IotHubDescription object we obtained above and leverage the CreateOrUpdate management function to update the configuration of our IoT Hub. This scales up the IoT Hub with no interruption to existing devices or clients.

// Apply the new SKU and unit count, then push the updated configuration to the service
desc.Sku.Name = newSkuName;
desc.Sku.Capacity = newSkuUnits;
client.IotHubResource.CreateOrUpdate(ResourceGroupName, IotHubName, desc);
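The scaling decision itself boils down to comparing the day's message count against a threshold derived from the daily quota of the current SKU/unit combination. The following is a simplified, hypothetical version of that check; ShouldScaleUp, dailyQuotaPerUnit, and thresholdPercent are illustrative names, and the sample's own helper functions differ in their details.

// Hypothetical helper: scale up once the day's messages reach a percentage of the daily quota.
static bool ShouldScaleUp(long totalMessagesToday, long dailyQuotaPerUnit, int units, double thresholdPercent = 0.9)
{
    long messageThreshold = (long)(dailyQuotaPerUnit * units * thresholdPercent);
    return totalMessagesToday >= messageThreshold;
}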

Scaling Down

With the trajectory of most IoT projects being growth, and for simplicity, we focused this sample on scaling up IoT Hubs. However, there are certainly valid scenarios where an IoT Hub may need to be automatically scaled down to lower costs when message volumes drop. In the sample documentation, we offer some suggestions for modifying the solution to scale down IoT Hubs when necessary.

Give the sample a try and sleep better tonight knowing you have one fewer operational task on your plate!

A few notes about the sample

The sample only works for the Standard tiers of IoT Hub. The Free tier of IoT Hub can’t be scaled, so it’s not applicable. Also note that you cannot convert directly from the Free tier of IoT Hub to a Standard tier.
The sample provides one straightforward implementation of a scaling algorithm, but with the supplied source code, you can customize it to meet your unique scaling needs.
For the sake of your IoT budget, give due consideration to automatically scaling IoT Hub as you reach the higher service levels, such as S3, since each additional unit adds significant cost as well as significant capacity.

Source: Azure

Introducing Container Storage Interface (CSI) Alpha for Kubernetes

One of the key differentiators for Kubernetes has been a powerful volume plugin system that enables many different types of storage systems to:

Automatically create storage when required.
Make storage available to containers wherever they’re scheduled.
Automatically delete the storage when no longer needed.

Adding support for new storage systems to Kubernetes, however, has been challenging. Kubernetes 1.9 introduces an alpha implementation of the Container Storage Interface (CSI), which makes installing new volume plugins as easy as deploying a pod. It also enables third-party storage providers to develop solutions without the need to add to the core Kubernetes codebase.

Because the feature is alpha in 1.9, it must be explicitly enabled. Alpha features are not recommended for production usage, but are a good indication of the direction the project is headed (in this case, toward a more extensible and standards-based Kubernetes storage ecosystem).

Why Kubernetes CSI?

Kubernetes volume plugins are currently “in-tree”, meaning they’re linked, compiled, built, and shipped with the core Kubernetes binaries. Adding support for a new storage system to Kubernetes (a volume plugin) requires checking code into the core Kubernetes repository. But aligning with the Kubernetes release process is painful for many plugin developers.

The existing Flex Volume plugin attempted to address this pain by exposing an exec-based API for external volume plugins. Although it enables third-party storage vendors to write drivers out-of-tree, deploying the third-party driver files requires access to the root filesystem of node and master machines.

In addition to being difficult to deploy, Flex did not address the pain of plugin dependencies: volume plugins tend to have many external requirements (on mount and filesystem tools, for example). These dependencies are assumed to be available on the underlying host OS, which is often not the case (and installing them requires access to the root filesystem of node machines).

CSI addresses all of these issues by enabling storage plugins to be developed out-of-tree, containerized, deployed via standard Kubernetes primitives, and consumed through the Kubernetes storage primitives users know and love (PersistentVolumeClaims, PersistentVolumes, StorageClasses).

What is CSI?

The goal of CSI is to establish a standardized mechanism for Container Orchestration Systems (COs) to expose arbitrary storage systems to their containerized workloads. The CSI specification emerged from cooperation between community members from various COs, including Kubernetes, Mesos, Docker, and Cloud Foundry. The specification is developed independently of Kubernetes and is maintained at https://github.com/container-storage-interface/spec/blob/master/spec.md.

Kubernetes v1.9 exposes an alpha implementation of the CSI specification, enabling CSI-compatible volume drivers to be deployed on Kubernetes and consumed by Kubernetes workloads.

How do I deploy a CSI driver on a Kubernetes Cluster?

CSI plugin authors will provide their own instructions for deploying their plugin on Kubernetes.

How do I use a CSI Volume?

Assuming a CSI storage plugin is already deployed on your cluster, you can use it through the familiar Kubernetes storage primitives: PersistentVolumeClaims, PersistentVolumes, and StorageClasses.

CSI is an alpha feature in Kubernetes v1.9. To enable it, set the following flags:

API server binary:
--feature-gates=CSIPersistentVolume=true
--runtime-config=storage.k8s.io/v1alpha1=true

API server binary and kubelet binaries:
--feature-gates=MountPropagation=true
--allow-privileged=true

Dynamic Provisioning

You can enable automatic creation/deletion of volumes for CSI storage plugins that support dynamic provisioning by creating a StorageClass pointing to the CSI plugin. The following StorageClass, for example, enables dynamic creation of “fast-storage” volumes by a CSI volume plugin called “com.example.team/csi-driver”.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast-storage
provisioner: com.example.team/csi-driver
parameters:
  type: pd-ssd

To trigger dynamic provisioning, create a PersistentVolumeClaim object. The following PersistentVolumeClaim, for example, triggers dynamic provisioning using the StorageClass above.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-request-for-storage
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: fast-storage

When volume provisioning is invoked, the parameter "type: pd-ssd" is passed to the CSI plugin "com.example.team/csi-driver" via a CreateVolume call. In response, the external volume plugin provisions a new volume and then automatically creates a PersistentVolume object to represent the new volume. Kubernetes then binds the new PersistentVolume object to the PersistentVolumeClaim, making it ready to use.

If the “fast-storage” StorageClass is marked default, there is no need to include the storageClassName in the PersistentVolumeClaim; it will be used by default.

Pre-Provisioned Volumes

You can always expose a pre-existing volume in Kubernetes by manually creating a PersistentVolume object to represent the existing volume. The following PersistentVolume, for example, exposes a volume with the name “existingVolumeName” belonging to a CSI storage plugin called “com.example.team/csi-driver”.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-manually-created-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: com.example.team/csi-driver
    volumeHandle: existingVolumeName
    readOnly: false

Attaching and Mounting

You can reference a PersistentVolumeClaim that is bound to a CSI volume in any pod or pod template.

kind: Pod
apiVersion: v1
metadata:
  name: my-pod
spec:
  containers:
    - name: my-frontend
      image: dockerfile/nginx
      volumeMounts:
      - mountPath: "/var/www/html"
        name: my-csi-volume
  volumes:
    - name: my-csi-volume
      persistentVolumeClaim:
        claimName: my-request-for-storage

When the pod referencing a CSI volume is scheduled, Kubernetes will trigger the appropriate operations against the external CSI plugin (ControllerPublishVolume, NodePublishVolume, etc.) to ensure the specified volume is attached, mounted, and ready to use by the containers in the pod.

For more details, please see the CSI implementation design doc and documentation.

How do I create a CSI driver?

Kubernetes is as minimally prescriptive on the packaging and deployment of a CSI volume driver as possible. The minimum requirements for deploying a CSI volume driver on Kubernetes are documented here.

The minimum requirements document also contains a section outlining the suggested mechanism for deploying an arbitrary containerized CSI driver on Kubernetes. This mechanism can be used by a storage provider to simplify deployment of containerized CSI-compatible volume drivers on Kubernetes. As part of this recommended deployment process, the Kubernetes team provides the following sidecar (helper) containers:

external-attacher – sidecar container that watches Kubernetes VolumeAttachment objects and triggers ControllerPublish and ControllerUnpublish operations against a CSI endpoint.
external-provisioner – sidecar container that watches Kubernetes PersistentVolumeClaim objects and triggers CreateVolume and DeleteVolume operations against a CSI endpoint.
driver-registrar – sidecar container that registers the CSI driver with kubelet (in the future) and adds the driver’s custom NodeId (retrieved via a GetNodeID call against the CSI endpoint) to an annotation on the Kubernetes Node API object.

Storage vendors can build Kubernetes deployments for their plugins using these components, while leaving their CSI driver completely unaware of Kubernetes.

Where can I find CSI drivers?

CSI drivers are developed and maintained by third parties. You can find example CSI drivers here, but these are provided purely for illustrative purposes and are not intended to be used for production workloads.

What about Flex?

The Flex Volume plugin exists as an exec-based mechanism to create “out-of-tree” volume plugins. Although it has some drawbacks (mentioned above), the Flex volume plugin coexists with the new CSI volume plugin. SIG Storage will continue to maintain the Flex API so that existing third-party Flex drivers (already deployed in production clusters) continue to work. In the future, new volume features will only be added to CSI, not Flex.

What will happen to the in-tree volume plugins?

Once CSI reaches stability, we plan to migrate most of the in-tree volume plugins to CSI. Stay tuned for more details as the Kubernetes CSI implementation approaches stable.

What are the limitations of alpha?

The alpha implementation of CSI has the following limitations:

The credential fields in CreateVolume, NodePublishVolume, and ControllerPublishVolume calls are not supported.
Block volumes are not supported; only file.
Specifying filesystems is not supported, and defaults to ext4.
CSI drivers must be deployed with the provided “external-attacher”, even if they don’t implement ControllerPublishVolume.
Kubernetes scheduler topology awareness is not supported for CSI volumes: in short, the scheduler does not receive information about where a volume is provisioned (zones, regions, etc.) that would allow it to make smarter scheduling decisions.

What’s next?

Depending on feedback and adoption, the Kubernetes team plans to push the CSI implementation to beta in either 1.10 or 1.11.

How Do I Get Involved?

This project, like all of Kubernetes, is the result of hard work by many contributors from diverse backgrounds working together. A huge thank you to Vladimir Vivien (vladimirvivien), Jan Šafránek (jsafrane), Chakravarthy Nelluri (chakri-nelluri), Bradley Childs (childsb), Luis Pabón (lpabon), and Saad Ali (saad-ali) for their tireless efforts in bringing CSI to life in Kubernetes.

If you’re interested in getting involved with the design and development of CSI or any part of the Kubernetes Storage system, join the Kubernetes Storage Special Interest Group (SIG). We’re rapidly growing and always welcome new contributors.
Source: kubernetes