Host-HA for KVM Hosts in CloudStack



What is HA?

“High availability is a characteristic of a system, which aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period.” (Wikipedia)

HA in CloudStack is currently supported for VMs only. To have it enabled, the service offering of the VM must be HA-enabled; otherwise the VM is not taken into consideration. There is no HA activity around hosts at this stage, so we don’t have a defense mechanism if a host goes down. All investigations are VM-centric, and we’re unable to determine the health of the host or whether it is actually still running the VM. This may result in the VM-HA mechanism starting the same VM on a different host while the faulty host is still running it, which would corrupt the VM and its disks. Such issues have been seen in large-scale KVM deployments.




The Solution

Such issues motivated us to find a long-term solution to this problem, and we identified that the root of all evil is the lack of a reliable fencing and recovery mechanism. A new investigation model had to be introduced to achieve this, simply because the VM-centric one wasn’t going to be sufficient. Of course, it also needed to be easy for administrators to maintain.

Setting this as our destination point, we started defining our route to get there. The first thing that became obvious to us was that CloudStack was missing an OOBM (out-of-band management) tool to fence and recover hosts. OOBM is the ability to execute power cycle operations on a given host. So we developed the CloudStack OOBM Plugin, which implements an industry-standard IPMI 2.0 provider supported by most vendors. When it is enabled for a host, users are able to issue power commands such as On, Off, Reset, etc.
OOBM Feature Specification
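
To make the idea of a power command concrete, here is a minimal sketch of what an IPMI 2.0 power operation can look like when driven through the ipmitool command line. This is only an illustration under assumptions: the post does not describe how the plugin talks to the BMC, the helper name build_ipmi_command is hypothetical, and the address and credentials are placeholders. The function only builds the command and never runs it.

```python
from typing import List

# Power actions mentioned in the text, plus "status" for querying the host.
# They match the subcommands of "ipmitool chassis power".
SUPPORTED_ACTIONS = {"on", "off", "reset", "status"}


def build_ipmi_command(host: str, user: str, password: str, action: str) -> List[str]:
    """Build (but do not execute) an ipmitool command for one power action.

    Hypothetical helper for illustration; not CloudStack code.
    """
    if action not in SUPPORTED_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return [
        "ipmitool",
        "-I", "lanplus",   # IPMI v2.0 RMCP+ LAN interface
        "-H", host,        # BMC address (placeholder)
        "-U", user,        # BMC user (placeholder)
        "-P", password,    # BMC password (placeholder)
        "chassis", "power", action,
    ]


if __name__ == "__main__":
    # Print the command that would reset a host; nothing is executed.
    print(" ".join(build_ipmi_command("192.0.2.10", "admin", "secret", "reset")))
```

Keeping command construction separate from execution makes it easy to review exactly what would be sent to a host before any power operation is attempted.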

Host-HA Granular configuration: offers admins an ability to set explicit configuration on host/cluster/zone level. This way in a large environment some hosts from a cluster can be HA-enabled and some not, depending on the setup and specific hardware that is running.

Threshold based investigator: the admin can set a threshold of failed investigations, and only when it is exceeded does the host transition to a different state.

More accurate investigating: Host-HA uses both health checks and activity checks to take decisions on recovery and fencing actions. Once it has determined that the resource (host) is in a faulty state (health checks failed), it runs activity checks to figure out whether there is any disk activity on the VMs running on that host.
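
The threshold idea can be sketched in a few lines. The example below is a simplified illustration, not the framework's code: the class and field names are hypothetical, and resetting the counter on a successful health check is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class ThresholdInvestigator:
    """Counts consecutive failed health checks for one host (hypothetical)."""

    max_failed_checks: int   # admin-defined threshold (name is an assumption)
    failed_checks: int = 0

    def record(self, health_ok: bool) -> bool:
        """Record one health check result.

        Returns True once the number of consecutive failures exceeds the
        threshold, which is the point where activity checks would start.
        """
        if health_ok:
            self.failed_checks = 0   # assumption: a success clears the count
        else:
            self.failed_checks += 1
        return self.failed_checks > self.max_failed_checks


if __name__ == "__main__":
    investigator = ThresholdInvestigator(max_failed_checks=2)
    for result in (False, False, False):
        exceeded = investigator.record(result)
    print("threshold exceeded:", exceeded)
```

With a threshold of 2, the first two failures leave the host where it is and only the third one reports that the threshold has been exceeded.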

Host-HA Design

The Host-HA design aims to separate policy from mechanism, so that individuals are free to use different sets of pluggable tools (HA providers and OOBM tools) while having the same policy applied. Administrators can set the thresholds in global settings and not worry about the mechanism that is going to enforce them. With the resource management service, CloudStack admins can manage lifecycle operations per resource and use a kill switch at zone/cluster/host level to disable HA policy enforcement. The framework itself is resource-type agnostic and can be extended to other resources within CloudStack, such as load balancers.

HA Providers are resource specific and are responsible for executing the HA framework and enforcing the applied policy. For example, the KVM HA Provider, as part of this feature, works with KVM hosts and carries out the HA related activities.
A State-Machine implements the event triggers and transitions of a specific HA resource, based on which the framework takes the actions required to bring it to the right physical state. For example, if a host passes the threshold for being in the degraded state, the framework will try to recover it by issuing an OOBM Restart task, which resets the host power so that the host eventually comes back up. Here’s a list of the States:

Available – the feature is Enabled and Host-HA is available
Suspect – there are health checks failing with the Host
Checking – activity checks are being performed
Degraded – host is passing activity check ratio and still providing service to the end user, but cannot be managed from CloudStack Management
Recovering – the Host-HA framework is trying to Recover the host by issuing OOBM job
Recovered – the Host-HA framework has recovered the Host successfully
Fencing – the Host-HA framework is trying to Fence the host by issuing OOBM job
Fenced – the Host-HA framework has fenced the Host successfully
Disabled – the feature is Disabled for the Host
Ineligible – the feature is Enabled, but the Host cannot be managed successfully by the Host-HA framework (possibly because OOBM is not configured properly)
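
As a rough illustration of how such a state machine can be modelled, here is a small sketch. It covers only the transitions that are described in the prose of this post, and the event names are hypothetical; the FSM diagrams remain the authoritative definition of all transitions and their conditions.

```python
from enum import Enum


class HAState(Enum):
    AVAILABLE = "Available"
    SUSPECT = "Suspect"
    CHECKING = "Checking"
    DEGRADED = "Degraded"
    RECOVERING = "Recovering"
    RECOVERED = "Recovered"
    FENCING = "Fencing"
    FENCED = "Fenced"
    DISABLED = "Disabled"
    INELIGIBLE = "Ineligible"


# Simplified subset of transitions taken from the text; event names are
# assumptions made for this sketch, not identifiers from CloudStack.
TRANSITIONS = {
    (HAState.AVAILABLE, "health_check_failed"): HAState.SUSPECT,
    (HAState.SUSPECT, "health_threshold_exceeded"): HAState.CHECKING,
    (HAState.CHECKING, "activity_detected"): HAState.DEGRADED,
    (HAState.CHECKING, "no_activity"): HAState.RECOVERING,
    (HAState.DEGRADED, "degraded_threshold_exceeded"): HAState.RECOVERING,
    (HAState.RECOVERING, "recovery_succeeded"): HAState.RECOVERED,
    (HAState.RECOVERING, "recovery_threshold_exceeded"): HAState.FENCING,
    (HAState.FENCING, "fence_succeeded"): HAState.FENCED,
}


def next_state(state: HAState, event: str) -> HAState:
    """Return the state after an event; unknown pairs leave the state as is."""
    return TRANSITIONS.get((state, event), state)


if __name__ == "__main__":
    state = HAState.AVAILABLE
    for event in ("health_check_failed", "health_threshold_exceeded",
                  "no_activity", "recovery_threshold_exceeded",
                  "fence_succeeded"):
        state = next_state(state, event)
        print(f"{event} -> {state.value}")
```

Keeping the transitions in a single table makes it straightforward to compare the code against a diagram and to spot a missing or unexpected transition.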

Please refer to the images of the FSM-Transitions, where all possible transitions are defined along with the conditions required to move on to the next state.

Host-HA on KVM host

Host-HA on KVM hosts is provided by the KVM HA Provider. It uses the STONITH (Shoot The Other Node In The Head) fencing model. It also provides a mechanism for activity checks on disks on the shared NFS storage. How does it work? Within a cluster, neighboring hosts are able to perform activity checks on the disks of VMs running on a faulty host (one whose health checks have failed). The activity check verifies whether there is any actual activity on the VM disk while the host where it is running has been reported in bad health. If there is activity, the host stays in the Degraded state; if there is none, the HA framework transitions it to the Recovering state and tries to bring it back up. If recovery fails past the threshold, the framework fences the host by powering off the machine.
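
To give a feel for what a disk activity check from a neighboring host might involve, here is a minimal sketch. It assumes that "activity" can be approximated by the disk file's last modification time on the shared storage; this post does not say which signal the KVM HA Provider actually uses, so treat the approach, the function names and the time window as assumptions.

```python
import os
import time
from typing import Iterable, Optional


def disk_recently_active(path: str, window_seconds: float,
                         now: Optional[float] = None) -> bool:
    """Return True if the file at path was modified within the time window.

    Hypothetical helper: uses the file modification time as a stand-in
    for VM disk activity.
    """
    current = time.time() if now is None else now
    return (current - os.stat(path).st_mtime) <= window_seconds


def any_disk_active(paths: Iterable[str], window_seconds: float,
                    now: Optional[float] = None) -> bool:
    """Return True if at least one of the given disks shows recent activity."""
    return any(disk_recently_active(p, window_seconds, now) for p in paths)
```

In terms of the flow described above, a True result would keep the host in the Degraded state, while a False result would let the framework move on to Recovering.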

Please check out the FS for more technical details.

Find the pull request on the Apache CloudStack Public Repo

HOST-HA and VM-HA coordination

For KVM Host-HA to work effectively, it has to work in tandem with the existing VM-HA framework. The current CloudStack implementation focuses on VM-HA, as VMs are the first-class entities, while a host is considered to be a resource. CloudStack manages host states, and a rough mapping of the CloudStack states to the KVM Host-HA states is shown below:

VM-HA host states | KVM Host HA host states
Up | Available
Up (Investigating) | Suspect/Checking
Alert | Degraded
Disconnected | Recovering/Recovered/Fencing
Down | Fenced

Host-HA improves on investigation by providing a new way of investigating VMs using VM disk activity. It also adds to the fencing capabilities by integrating with the OOBM feature.

For VM-HA to work correctly and in sync with Host-HA, it is important that the host state seen by the two is the same, as per the above table. The VM-HA model has been modified to query the Host-HA states to get the actual host state when the feature is enabled. It also makes sure VM-HA related activities are not started unless the host has been properly fenced.
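
The mapping and the fencing rule can be expressed as a small lookup, sketched below. The state names come from the table above; the dictionary and function names are hypothetical and only illustrate the coordination rule, not the actual CloudStack implementation.

```python
# Host-HA state -> host state as seen by VM-HA, following the rough mapping
# in the table above.
HOST_STATE_BY_HA_STATE = {
    "Available": "Up",
    "Suspect": "Up (Investigating)",
    "Checking": "Up (Investigating)",
    "Degraded": "Alert",
    "Recovering": "Disconnected",
    "Recovered": "Disconnected",
    "Fencing": "Disconnected",
    "Fenced": "Down",
}


def vm_ha_may_start(ha_state: str) -> bool:
    """VM-HA activities may start only once the host is properly fenced."""
    return ha_state == "Fenced"


if __name__ == "__main__":
    for ha_state, host_state in HOST_STATE_BY_HA_STATE.items():
        print(f"{ha_state:<11} {host_state:<19} "
              f"VM-HA may start: {vm_ha_may_start(ha_state)}")
```

Only the Fenced state allows VM-HA to proceed, which is what prevents the same VM from being started on a second host while the original host may still be running it.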

About the author

Boris Stoyanov is a Software Engineer in Testing at ShapeBlue, The Cloud Specialists. Bobby spends his time testing features for the Apache CloudStack Community and for our ShapeBlue clients.