Review and Future Directions of CloudForms State-Machines

This article explains how State Machines are used in Red Hat CloudForms to control the flow of automation.
State Machines are sometimes perceived as rocket science: often taught but barely used. The first thing to dispel is the idea that state machines are complex; then we can compare how a state machine differs from other process automation such as Workflows.
Finally, the article dispels two myths: that State Machines are Ruby, and that if you use Ansible Automation Inside you do not need state machines. Neither statement is true.

Why State Machine?
Many automation flows turn out to be bigger than first envisaged once they are taken into an enterprise.
Example:
You may wish to provision a Database cluster, so the primary task is the installation and configuration of the database cluster.

Your compliance officer may instruct that the corporate CMDB must be updated with any artifacts that have been provisioned.
The IT department may also wish to trace the provisioning activity in the corporate help desk system, by opening a ticket at the start of the job and closing it when the job has completed.
Lastly, you may have a new requirement to use IP addresses from a corporate IP Address Management (IPAM) system when provisioning your Database Cluster.

Enterprises bring with them corporate standards, regulatory compliance requirements and operational patterns to follow. State Machines are a great way to combine varying automation requirements.
Handling each requirement as a separate state provides the following value:

You can decide how the next state will behave based on how the current state has exited.
You can re-use states in other state machines.

The last benefit is important. Each automation group in the enterprise tends to create its automation in a silo; CloudForms provides a way to re-use the corporate states outside of the primary automation play.
Example:

The Amazon team create an automation play that deploys instances into Amazon EC2.
The VMware team create an automation play that deploys VMs into vSphere Clusters.

Both teams need to update the corporate CMDB with the asset details. With CloudForms you can write the automation play that updates the CMDB once and share it with both teams. This allows both the Amazon and VMware teams to leverage the same automation play, which saves time and also ensures adherence to corporate standards.
 
Why not Workflows?
“What is the difference between a State Machine and a Workflow?” is a common question.
Answer:
A state machine is a series of states with transitions between them, and it allows for loops. A sequential workflow, by contrast, proceeds down different branches until done.
The key here is that a state machine has re-entrancy. A state can run, and by itself decide if it should run again. A state can jump states and even go back to a previous state.
Example:

State – Update the CMDB with asset detail.

This state should succeed, but it can also fail.
A failure could be a hard failure, whereby the response from the CMDB tells the state that an authentication failure occurred.
A softer failure is simply that the CMDB service is unreachable. These two failures could be dealt with in different ways, such as:

Authentication Failure EQUALS Fail the State.
Unreachable CMDB EQUALS Retry the state (predefined retry count and interval).

As a result, the state that updates the corporate CMDB retries when the service is unavailable or busy, fails the job if it cannot authenticate, and continues on to the next state if successful.
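The retry, fail and continue behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not CloudForms code: the exception classes, `update_cmdb_state`, `run_state` and `make_flaky_cmdb` are all hypothetical names, and a real engine would also wait for the configured interval between retries.

```python
from enum import Enum


class Result(Enum):
    OK = "ok"
    ERROR = "error"
    RETRY = "retry"


class AuthenticationFailure(Exception):
    """Hypothetical error: the CMDB rejected our credentials."""


class ServiceUnreachable(Exception):
    """Hypothetical error: the CMDB could not be contacted."""


def update_cmdb_state(update_call):
    """Run one attempt of the state and translate the outcome into a result."""
    try:
        update_call()
    except AuthenticationFailure:
        return Result.ERROR  # hard failure: fail the state
    except ServiceUnreachable:
        return Result.RETRY  # soft failure: ask to be run again
    return Result.OK


def run_state(state, update_call, max_retries=3):
    """Re-enter the state while it asks for a retry, up to max_retries retries."""
    attempts = 0
    while True:
        attempts += 1
        result = state(update_call)
        if result is not Result.RETRY:
            return result, attempts
        if attempts > max_retries:
            return Result.ERROR, attempts  # retries exhausted
        # a real engine would sleep for the retry interval here


def make_flaky_cmdb(failures):
    """Build a fake CMDB call that is unreachable `failures` times, then succeeds."""
    remaining = [failures]

    def call():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise ServiceUnreachable()

    return call


result, attempts = run_state(update_cmdb_state, make_flaky_cmdb(2))
print(result.value, attempts)
```

The state itself only reports how it exited; the decision to run it again lives in the engine, which is why the author of the state does not have to code any loop.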
This is the state machine for our example;
 

 
So what’s the benefit again? To do this in a workflow, the workflow author would have to write the logic that controls the gates using decision processes, and the re-entrancy would need to be coded into the workflow too. It would look something like:
 

 
Another benefit of using states in a State Machine is the ability to do pre- and post-processing of a state. Before entering a state we can run some logic, then we execute the state itself, and finally upon exiting the state we can run more logic in an exit state. A state for failure is supported as well. This means that any one state can do the following:

Pre State – Is updating the CMDB enabled for this job?
State – Run the CMDB Update.
Post State – Did the CMDB update successfully?
Error State – Send an email to admin to say CMDB is misconfigured.
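The four parts of a state listed above can be pictured as optional hooks around one unit of work. The Python below is a minimal sketch under that assumption; the `State` class, `execute` and the four step functions are hypothetical, and treating a disabled CMDB update as an error is a simplification made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Step = Callable[[dict], None]


@dataclass
class State:
    name: str
    run: Step
    pre: Optional[Step] = None       # logic before entering the state
    post: Optional[Step] = None      # logic after the state exits cleanly
    on_error: Optional[Step] = None  # logic when any step of the state fails


def execute(state: State, context: dict) -> str:
    """Run pre, state and post in order; divert to on_error on failure."""
    try:
        for step in (state.pre, state.run, state.post):
            if step is not None:
                step(context)
    except Exception as exc:
        context["error"] = str(exc)
        if state.on_error is not None:
            state.on_error(context)
        return "error"
    return "ok"


def check_enabled(ctx):
    if not ctx.get("cmdb_enabled", False):
        raise RuntimeError("CMDB update is not enabled for this job")


def update_cmdb(ctx):
    ctx["cmdb_updated"] = True  # stand-in for the real update


def verify_update(ctx):
    if not ctx.get("cmdb_updated"):
        raise RuntimeError("CMDB update did not take effect")


def notify_admin(ctx):
    ctx["email"] = f"CMDB problem: {ctx['error']}"  # stand-in for sending mail


cmdb_state = State("Update CMDB", update_cmdb,
                   pre=check_enabled, post=verify_update, on_error=notify_admin)
```

Calling `execute(cmdb_state, {"cmdb_enabled": True})` walks through all three steps, while an empty context stops at the pre step and hands over to the error hook.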

A flowchart diagram for a pre state with a state would look like:
 

 
Summary – State Machines are a table of states. Each state has entry, exit and retry logic, so any state can succeed, fail or retry. This gives any state the ability to traverse the state machine in any order.
 
State Machine and Method Types
A state machine transitions through a series of states. The states will call/connect to instances. It is the job of the instance to define the state. What does this mean?
 

 
The instance could define some attributes, such as:

CMDB Server URL.
CMDB Username.
CMDB Password.

But these are not much use unless you feed them into something that can use them. This brings us to METHODS.
A method is something that runs. You define the method and call it from an instance, such as:
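Outside CloudForms, the relationship between an instance and a method can be approximated in Python as a small object that holds attributes and points at the function that consumes them. Everything here is hypothetical: the `Instance` class, the `update_cmdb` function, the URL and the credentials are placeholders, and the method only builds a description of the request rather than sending anything.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Instance:
    """Holds attributes and names the method that consumes them."""
    attributes: dict
    method: Callable[[dict], str]

    def call(self) -> str:
        # The instance feeds its own attributes into the method.
        return self.method(self.attributes)


def update_cmdb(attrs: dict) -> str:
    """Hypothetical method: describes the request it would send."""
    return f"POST {attrs['url']}/assets as {attrs['username']}"


cmdb_instance = Instance(
    attributes={
        "url": "https://cmdb.example.com",  # placeholder URL
        "username": "automation",           # placeholder credentials
        "password": "changeme",
    },
    method=update_cmdb,
)

print(cmdb_instance.call())
```

Keeping the attributes on the instance and the logic in the method means the same method can be called from several instances with different settings.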
 

 
The implementation of State Machines in CloudForms is limited by the method types it supports. Here are the supported method types:

Built-in – A number of built-in methods exist for placement, quota and email use cases, etc.

 

 

In-line – Ruby scripting language support. Write a Ruby script and it will be executed.

 

 

URI – Point to a Ruby script on a URI resource. It will be executed from that location.

 

 
As you can see, two of the three options use the Ruby scripting language natively, but the first, built-in, demonstrates that Ruby is NOT the only method choice.
 
State Machine using Built-In Method
Some time ago, I wrote a blog on running built-in methods in CloudForms.
It demonstrates how a state machine can call a built-in method, passing parameters to send an email. This requires no Ruby coding; it simply uses an instance that calls a method.
 
State Machine using In-Line Method
This is the most common method type. Most if not all states in any of the provisioning state machines use it.
 
State Machine using URI Method
This method type is not used out of the box today, but you certainly can use it in a custom route. The main issue with a URI-based location for the method is availability: you need to ensure that ALL CloudForms servers running the Automate role can access the resource. That said, the concept of using an external location for methods is very cool. Maybe you could point the method location to a Git URL; then you have versioning, branching and availability all from Git. An interesting blog for the future.
 
State Machine and Ansible Automation Inside
The near-future direction for CloudForms is to add Ansible as a method type for State Machines. This would allow states to use instances that execute Ansible methods. It combines the simplicity of the Ansible language and the power of its module integrations with the power of state machines to control process flows. A single state example would look like:
 

 
A more complete example would look like:
 

 

You can see in this example how the first state calls a playbook to open a ticket in the corporate help-desk system.
State 2 performs a quota check. In CloudForms this is a Ruby method that takes some input parameters and either fails the state if no quota is available or continues on.
The third state calls a built-in method for provisioning.
The last state is another playbook that closes the ticket, again taking parameters from the instance such as the connection parameters and the ticket number to close.
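To make the mix of method types concrete, the four states can be modelled as a table in which each row names a state, the kind of method behind it, and the callable to run. This Python sketch is an assumption-laden stand-in: the functions do not launch playbooks or provision anything, and the ticket number and quota fields are made-up placeholders.

```python
def open_ticket(ctx):
    """Stand-in for an Ansible playbook that opens a help-desk ticket."""
    ctx["ticket"] = "TICKET-1"  # placeholder ticket number
    return "ok"


def check_quota(ctx):
    """Stand-in for the in-line quota method."""
    return "ok" if ctx["requested_vms"] <= ctx["quota_vms"] else "error"


def provision(ctx):
    """Stand-in for the built-in provisioning method."""
    ctx["provisioned"] = True
    return "ok"


def close_ticket(ctx):
    """Stand-in for an Ansible playbook that closes the ticket opened earlier."""
    ctx["closed"] = ctx["ticket"]
    return "ok"


STATES = [
    # (state name, method type, method)
    ("OpenTicket", "playbook", open_ticket),
    ("CheckQuota", "inline", check_quota),
    ("Provision", "builtin", provision),
    ("CloseTicket", "playbook", close_ticket),
]


def run(states, ctx):
    """Process states in order and stop at the first one that does not exit ok."""
    for name, method_type, method in states:
        result = method(ctx)
        ctx.setdefault("log", []).append((name, method_type, result))
        if result != "ok":
            return result
    return "ok"


job = {"requested_vms": 2, "quota_vms": 5}
print(run(STATES, job), job["log"])
```

The engine does not care which technology sits behind a row, which is the point: a playbook, a script and a built-in method all look the same to the state machine.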

 
State Control
You can control a state in a state machine using re-entrancy and exit codes. For example, you can set the exit code of your method as follows:

Ok – The placement of the VM was successful.
Error – The placement of the VM failed.
Retry – We want to run this placement logic again.

Therefore what determines how the next state will be processed is simply the exit from the previous state.

If the state exits with Retry, it is retried up to the number of retries the state machine is configured for, and the interval between retries can also be controlled.
If the state exits OK, the next state is processed.
If the state exits with Error, the next step is the error state of that same state, which can clean up any failure or back out what was partially done. A good example of this would be:

State 1 – Create VM.
State 2 – Configure Firewall.
State 3 – Install Apache.

If State 3 fails, then the error state for State 3 might undo the firewall config and remove the VM.
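The back-out behaviour can be sketched by pairing every state with its own error handler. The Python below is an illustrative model only; the function names and the `resources` list are hypothetical, and the failure in the third state is forced so that the clean-up path runs.

```python
def create_vm(ctx):
    ctx["resources"].append("vm")


def configure_firewall(ctx):
    ctx["resources"].append("firewall")


def install_apache(ctx):
    raise RuntimeError("package repository unavailable")  # forced failure


def nothing_to_undo(ctx):
    pass


def undo_firewall_and_vm(ctx):
    """Error state for State 3: back out what the earlier states created."""
    for resource in ("firewall", "vm"):
        if resource in ctx["resources"]:
            ctx["resources"].remove(resource)


STATES = [
    # (state name, method, error state)
    ("Create VM", create_vm, nothing_to_undo),
    ("Configure Firewall", configure_firewall, nothing_to_undo),
    ("Install Apache", install_apache, undo_firewall_and_vm),
]


def run_with_error_states(states, ctx):
    """Return the name of the state that failed, or None if all succeeded."""
    for name, method, on_error in states:
        try:
            method(ctx)
        except Exception:
            on_error(ctx)  # the error state of the same state runs next
            return name
    return None


job = {"resources": []}
failed = run_with_error_states(STATES, job)
print(failed, job["resources"])
```

After the run, the list of resources is empty again, mirroring the idea that a failed state is responsible for leaving the environment clean.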
 
Appendix A
Definition from Google/Wikipedia
 

 
 
Assertions, Relationships and Schema
Relationships
A state machine contains states, but it can also contain other types. For example, you may wish to connect one state machine to another as follows:

State Machine “Create VM in VMware”.
State Machine “Install Apache”.

These two state machines may have many states doing various tasks. The advantage of separating the two state machines is that you can re-use either of them with others, such as:

State Machine “Create VM in RHV”.
State Machine “Install Apache”.

This combination uses a new state machine to create a VM in RHV, but the same state machine for installing Apache.
To allow state machines to connect to each other we have “Relationships”, which let you bind from one place to another in state machines.
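One way to picture relationships is as named pointers that the engine follows and expands. In this Python sketch the state machine names come from the text, while the individual states inside each machine and the `resolve` function are invented for illustration.

```python
STATE_MACHINES = {
    # The inner state names are hypothetical placeholders.
    "Create VM in VMware": ["Clone template in vSphere", "Power on VM"],
    "Create VM in RHV": ["Clone template in RHV", "Power on VM"],
    "Install Apache": ["Install package", "Start service"],
}


def resolve(relationships):
    """Follow each relationship in order and flatten the states it points to."""
    plan = []
    for target in relationships:
        plan.extend(STATE_MACHINES[target])
    return plan


vmware_web_server = resolve(["Create VM in VMware", "Install Apache"])
rhv_web_server = resolve(["Create VM in RHV", "Install Apache"])

print(vmware_web_server)
print(rhv_web_server)
```

Both plans end with the same Apache states, so a fix to "Install Apache" is picked up by every state machine that relates to it.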
 
Assertions, and more on Relationships
These are very cool and allow you to stop a state machine mid-flow. The first example to look at is where you do not wish to continue with something based on a condition. Using the following as an example:

State 1 – Create VM.
State 2 – Configure Firewall.
State 3 – Install Apache.

In state 2, you can do the following:

Assertion = “Continue only if the VM was created”.
Method = Configure Firewall.

This would result in the assertion being resolved first. If it is True, processing continues to the next line, which is the method that actually configures the firewall. If the condition returns False, the state ends processing there and does NOT run the method.
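In code terms an assertion behaves like a guard that is evaluated before the method. The following Python is a minimal sketch of that ordering; `run_guarded_state` and the context keys are hypothetical names chosen for the example.

```python
def run_guarded_state(assertion, method, ctx):
    """Resolve the assertion first; run the method only if it holds."""
    if not assertion(ctx):
        return "stopped"  # processing ends here, the method never runs
    method(ctx)
    return "ok"


def vm_was_created(ctx):
    return ctx.get("vm_created", False)


def configure_firewall(ctx):
    ctx["firewall_configured"] = True  # stand-in for the real work


with_vm = {"vm_created": True}
without_vm = {}
print(run_guarded_state(vm_was_created, configure_firewall, with_vm))
print(run_guarded_state(vm_was_created, configure_firewall, without_vm))
```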
You can mix Assertions with Relationships too. For example, say you wish to install all packages onto a VM and have created a method for each package install. Based on what we have discussed so far, you could do the following:

State 1 – Create VM.
State 2 – Install Apache.
State 3 – Install PHP.
State 4 – Install CSS.
State 5 – Install WebSite.

Or, more easily, you can do this:

State 1 – Create VM.
State 2 – Install Web Components.

The Install Web Components state would be a wildcard connection from the state to the methods, for example:

State 1 – Create VM.
State 2 – Install Web Components.

Relationship – WebComponents*

The reason you may wish to use Assertions here is to stop the “resolution” of the wildcard from picking up instances and methods that should be excluded. For example, if you wish to pick up only the Linux version of the web components because the VM template was Linux, you could configure, on all the Linux instances heading the methods, an assertion that evaluates to true or false based on whether the template OS = Linux. Example:
You have many instances for webcomponents, some Windows and some Linux:

Linux – Install Apache.
Linux – Install PHP.
Linux – Install CSS.

And

Windows – Install Apache.
Windows – Install PHP.
Windows – Install CSS.

And

Common – Install WebSite.

Therefore, when the state machine resolves the relationship, it will take the webcomponents that match the condition but always take the “Common – Install WebSite”.
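The wildcard resolution with assertions can be approximated in Python using shell-style pattern matching from the standard library. The instance names, the `INSTANCES` table layout and the `resolve` function are assumptions made for this sketch; CloudForms stores and resolves instances differently.

```python
from fnmatch import fnmatchcase

INSTANCES = [
    # (hypothetical instance name, OS required by its assertion, method)
    ("WebComponents_Linux_Apache", "linux", "Install Apache"),
    ("WebComponents_Linux_PHP", "linux", "Install PHP"),
    ("WebComponents_Linux_CSS", "linux", "Install CSS"),
    ("WebComponents_Windows_Apache", "windows", "Install Apache"),
    ("WebComponents_Windows_PHP", "windows", "Install PHP"),
    ("WebComponents_Windows_CSS", "windows", "Install CSS"),
    ("WebComponents_Common_WebSite", None, "Install WebSite"),  # no assertion
]


def resolve(pattern, template_os):
    """Return (instance, method) pairs whose name matches and whose assertion holds."""
    selected = []
    for name, required_os, method in INSTANCES:
        if not fnmatchcase(name, pattern):
            continue  # the wildcard does not cover this instance
        if required_os is not None and required_os != template_os:
            continue  # assertion is false: skip this instance
        selected.append((name, method))
    return selected


for name, method in resolve("WebComponents*", "linux"):
    print(name, "->", method)
```

With a Linux template the three Linux instances and the common one are selected; the Windows instances match the wildcard but are dropped because their assertion evaluates to false.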
Source: CloudForms
