Introduction

In a previous post, ShapeBlue’s Boris Stoyanov introduced a feature for KVM which allows administrators to register templates and ISOs without needing secondary storage as an intermediate cache. This feature is known as Direct Download in CloudStack.

Without an intermediate cache, registering templates and ISOs through Direct Download differs from the usual process, which is:

  • The template and ISO registration process uses secondary storage as an intermediate cache
  • The zone SSVM handles the template or ISO download and stores it on the intermediate cache in secondary storage
  • On VM creation, the template is copied into primary storage from secondary storage.

With Direct Download, secondary storage is not involved in the process:

  • When a user registers a template or ISO, CloudStack retains the information about the template including its URL for later download
  • CloudStack selects a random host in the zone to check if the host can reach the template URL. If not, then registration fails
  • On VM deployments, the template is downloaded from the URL directly into primary storage or copied from another primary storage if it has been previously downloaded

Storage provider agnostic

The Direct Download feature was initially designed and implemented only for NFS primary storage on the KVM hypervisor. As of CloudStack version 4.14, the feature is no longer restricted to NFS: it provides an abstraction for any storage type, as the KVM storage processor allows conversion between different storage types.

The extension of the Direct Download support was done to support the most used protocols / storage types:

  • NFS – already supported before version 4.14
  • Local storage
  • SharedMountpoint

Support for other storage providers may be extended in the future.

Furthermore, prior to CloudStack version 4.14, it was not possible to register system VM templates using Direct Download.

Direct Downloads as of version 4.14

These are the main changes with this feature:

  • A new check of available free space on the host is added before attempting to download a template. On registration, a CheckUrlCommand is sent to a random KVM host, and this host performs a HEAD request to the template URL. If the server provides the remote file size of the template, this size is passed as part of the DirectDownloadCommand and checked against the free space available in the scratch space. If the template size cannot be retrieved from the server, the check is skipped
  • The template file is downloaded to a scratch space in the storage system on the host. The administrator can set this location explicitly through the ‘direct.download.temporary.download.location’ property in the agent.properties file. If it is not set, the default location ‘/var/lib/libvirt/images’ is used
  • After the template file is downloaded to the scratch space, it is extracted (or copied) to the primary storage pool selected by the CloudStack planner
  • The implementation of the extract / copy method is extended to invoke the different StorageAdaptors implementations for different storage providers. These implementations include local storage, shared mount point and NFS in the scope of this feature
  • After the template is extracted / copied to the destination storage pool, the downloaded file is removed from the scratch space on the host
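
The free space check described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not CloudStack's agent code: the function names (`parse_content_length`, `free_bytes`, `can_download`) are hypothetical, and reading the size from a `Content-Length` header is an assumption about how the HEAD response reports it.

```python
import shutil
from typing import Mapping, Optional


def parse_content_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return the remote file size from HEAD response headers, or None if absent."""
    value = headers.get("Content-Length")
    if value is None or not value.strip().isdigit():
        return None
    return int(value)


def free_bytes(path: str) -> int:
    """Free space, in bytes, on the filesystem that holds the scratch location."""
    return shutil.disk_usage(path).free


def can_download(remote_size: Optional[int], available: int) -> bool:
    """Decide whether the download may proceed."""
    if remote_size is None:
        # The server did not report a size, so the check is skipped.
        return True
    return remote_size <= available


if __name__ == "__main__":
    size = parse_content_length({"Content-Length": "1048576"})
    print(can_download(size, free_bytes(".")))
```

In this sketch a missing size never blocks the download, which mirrors the "check is skipped" behaviour described in the list.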

Configuration

Changes have been made to make different timeouts for direct download functionality configurable using global settings. The following global configurations have been added:

  • download.connect.timeout – Connection establishment timeout in milliseconds for direct download. Default value: 5000 milliseconds.
  • download.socket.timeout – Socket timeout (SO_TIMEOUT) in milliseconds for direct download. Default value: 5000 milliseconds.
  • download.connection.request.timeout – Requesting a connection from connection manager timeout in milliseconds for direct download. Default value: 5000 milliseconds. This setting is hidden and not visible in UI.

System VM templates

It is also possible to register new system VM templates using the direct download option.

A separate feature, additional VM configuration, allows arbitrary additional VM configurations to be sent to user VMs on CloudStack. It is supported by the KVM, XenServer and VMware hypervisors.

The administrator enables or disables this feature through the global setting ‘enable.additional.vm.configuration’, which is disabled by default. To add a second layer of security, the administrator must explicitly set, per hypervisor, a comma-separated list of the additional VM configurations that users are allowed to send. This is achieved with the following global settings:

  • ‘allow.additional.vm.configuration.list.kvm’
  • ‘allow.additional.vm.configuration.list.vmware’
  • ‘allow.additional.vm.configuration.list.xen’

This means that users can send additional configuration to VMs on start or update, only if:

  • The administrator has set the feature on, and
  • The administrator has set the list of allowed additional configurations, and the configurations that the user wants to send to their VMs are a subset of that list

A user can send additional configurations to their VMs by setting the parameter ‘extraconfig’ on the deployVirtualMachine and updateVirtualMachine APIs. There is currently no support in the UI for this feature.

KVM hypervisor

Additional VM configurations are added as parts of XML which are appended to the XML domain of the VM. However, CloudStack needs the XML to be URL UTF-8 encoded to be accepted as a valid ‘extraconfig’ parameter. Each XML tag must be part of the comma-separated list in the global configuration: ‘allow.additional.vm.configuration.list.kvm’

Example:

If a user would like to pass this XML configuration to their VM:

<memoryBacking>
  <hugepages/>
</memoryBacking>

Then the following steps are needed:

  • The user must encode the string above, resulting in the string:
    “%3CmemoryBacking%3E%0D%0A++%3Chugepages%2F%3E%0D%0A%3C%2FmemoryBacking%3E”
  • Set the ‘extraconfig’ parameter on deployVirtualMachine or updateVirtualMachine API to the encoded string
  • The administrator must have previously allowed the configurations ‘memoryBacking’ and ‘hugepages’ through the global setting ‘allow.additional.vm.configuration.list.kvm’
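
As an illustration of the steps above, the following Python sketch produces the encoded string with the standard library's `quote_plus` and runs a client-side pre-check of the XML tags against an allowed list. The helper name `tags_allowed` is hypothetical, and the pre-check is only an assumption about what a careful user might do before calling the API; it does not show how CloudStack itself validates the parameter.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote_plus

# The XML from the example, with Windows-style line endings and a
# two-space indent, which is what the encoded string in the post implies.
xml_config = "<memoryBacking>\r\n  <hugepages/>\r\n</memoryBacking>"

# Hypothetical value of 'allow.additional.vm.configuration.list.kvm'.
allowed_csv = "memoryBacking,hugepages"


def tags_allowed(xml_text: str, allowed: str) -> bool:
    """Return True if every XML tag in xml_text appears in the allowed list."""
    allowed_tags = {tag.strip() for tag in allowed.split(",") if tag.strip()}
    used_tags = {element.tag for element in ET.fromstring(xml_text).iter()}
    return used_tags <= allowed_tags


extraconfig = quote_plus(xml_config)

if __name__ == "__main__":
    print(tags_allowed(xml_config, allowed_csv))
    print(extraconfig)
```

The resulting `extraconfig` value is what would be passed to deployVirtualMachine or updateVirtualMachine.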

VMware hypervisor

CloudStack expects a set of URL UTF-8 encoded pairs of keys and values, in the format key=value. These key-value pairs are appended to the VM configuration .vmx file.

Example:

If a user would like to pass the following key-value configuration to their VM:

hypervisor.cpuid.v0 = “FALSE”

Then the following steps are needed:

  • The user must encode the string above, resulting in the string: “hypervisor.cpuid.v0%3DFALSE”
  • Set the ‘extraconfig’ parameter on deployVirtualMachine or updateVirtualMachine API to the encoded string
  • The administrator must have previously allowed the configurations: ‘hypervisor.cpuid.v0’ by the global setting ‘allow.additional.vm.configuration.list.vmware’
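
The encoding step for VMware can be sketched the same way. The dictionary below holds the single pair from the example; joining several pairs with newlines is an assumption borrowed from the XenServer example in this post, not something stated for VMware.

```python
from urllib.parse import quote_plus, unquote_plus

# Key-value pairs to append to the VM's .vmx configuration.
pairs = {"hypervisor.cpuid.v0": "FALSE"}

# One key=value pair per line (assumed separator when there are several pairs).
raw = "\n".join(f"{key}={value}" for key, value in pairs.items())

extraconfig = quote_plus(raw)

if __name__ == "__main__":
    print(extraconfig)
    # Decoding restores the original key=value text.
    print(unquote_plus(extraconfig) == raw)
```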

XenServer hypervisor

CloudStack also expects a set of URL UTF-8 encoded key-value pairs, which are applied through the XAPI vm-param-set command.

Example:

If a user would like to pass the following key-value configurations to their VM:

HVM-boot-policy=

PV-bootloader=pygrub

PV-args=hvc0

Then the following steps are needed:

  • The user must encode the string above, resulting in the string: ‘HVM-boot-policy%3D%0APV-bootloader%3Dpygrub%0APV-args%3Dhvc0’
  • Set the ‘extraconfig’ parameter on deployVirtualMachine or updateVirtualMachine API to the encoded string
  • The administrator must have previously allowed the configurations: ‘HVM-boot-policy’, ‘PV-bootloader’ and ‘PV-args’ by the global setting ‘allow.additional.vm.configuration.list.xen’
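
For the XenServer case, a short Python sketch can both encode the pairs and list any keys that the administrator has not allowed. The helper names `keys_of` and `disallowed_keys` are hypothetical, and the check is a client-side illustration only, not CloudStack's own validation.

```python
from typing import List
from urllib.parse import quote_plus

# The three pairs from the example, one per line.
raw = "HVM-boot-policy=\nPV-bootloader=pygrub\nPV-args=hvc0"


def keys_of(text: str) -> List[str]:
    """Extract the key from each non-empty key=value line."""
    return [line.split("=", 1)[0].strip() for line in text.splitlines() if line.strip()]


def disallowed_keys(text: str, allowed_csv: str) -> List[str]:
    """Return the keys in text that are missing from the comma-separated allowed list."""
    allowed = {key.strip() for key in allowed_csv.split(",") if key.strip()}
    return [key for key in keys_of(text) if key not in allowed]


extraconfig = quote_plus(raw)

if __name__ == "__main__":
    print(extraconfig)
    print(disallowed_keys(raw, "HVM-boot-policy,PV-bootloader,PV-args"))
```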

Putting a host into maintenance in CloudStack requires an administrator to request ‘prepare for maintenance’ on the host, either via the API or through the UI. When CloudStack receives the request, the host state is set to ‘PrepareForMaintenance’ and any VMs running on the host start being migrated away. Ideally, the process lasts until there are no VMs left running on the host and it can safely enter Maintenance mode.

However, if these VM migrations fail, the host can stay in the ‘PrepareForMaintenance’ state indefinitely. This gives administrators no useful information, as it could mean either that CloudStack is still trying to migrate VMs away or that the process has simply failed. In the latter case, the administrator needs to cancel maintenance, fix any problems and retry preparing the host for maintenance.

This feature tackles the infinite state problem, by giving more control to administrators when preparing a host for maintenance with the following changes:

  • Set the maximum number of attempts to migrate VMs away from hosts preparing to enter maintenance mode by the global setting ‘vm.ha.migration.max.retries’.
  • If there are errors during the migration of VMs, the host is marked with a new state, ‘ErrorInPrepareForMaintenance’. While the host stays in this state, admins can correct the errors, and the host state will be updated on the next iteration of checks by the management server.
  • If the maximum number of attempts is reached for every VM on a host preparing to enter maintenance mode, and the migrations still could not be completed, the host is marked as being in the ‘ErrorInMaintenance’ state.

This means that the new behavior for preparing a host into maintenance is the following:

  • To enter maintenance mode, every VM must have been migrated away from a host
  • When a host is preparing to enter maintenance mode, the following must be met:
    • Once the number of attempts to migrate a VM on a ‘PrepareForMaintenance’ host reaches its limit, no further migration attempts are scheduled for that VM.
    • If the migration attempts, including all subsequent retries, have failed for any VM on a ‘PrepareForMaintenance’ host, then the host must transition to the ‘ErrorInMaintenance’ state.
  • A host must transition to ‘ErrorInMaintenance’ only when it is preparing to enter maintenance mode and one or more VMs could not be migrated away from it after the migration attempts for each VM have been used up.
  • Running VMs on a host preparing to enter maintenance must not be stopped if any migration attempt fails
  • A host must still be able to enter maintenance mode when there are no failures on migrations as before
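
One simplified reading of these rules can be sketched as a small Python function. This is an assumption-laden illustration, not the management server's logic: the function name `next_host_state` and the idea of tracking a failed-attempt counter per VM are hypothetical, `max_retries` stands in for ‘vm.ha.migration.max.retries’ and is assumed to be at least 1, and successfully migrated VMs are assumed to be removed from the mapping.

```python
from typing import Dict


def next_host_state(failed_attempts: Dict[str, int], max_retries: int) -> str:
    """Pick the host state from the VMs still on the host.

    failed_attempts maps each VM still running on the host to the number
    of migration attempts that have failed for it so far.
    """
    if not failed_attempts:
        # Every VM has been migrated away: the host can enter maintenance.
        return "Maintenance"
    if all(count >= max_retries for count in failed_attempts.values()):
        # Every remaining VM has used up its attempts: nothing left to retry.
        return "ErrorInMaintenance"
    if any(count > 0 for count in failed_attempts.values()):
        # Some migrations failed, but retries remain for at least one VM.
        return "ErrorInPrepareForMaintenance"
    return "PrepareForMaintenance"


if __name__ == "__main__":
    print(next_host_state({"vm-1": 1, "vm-2": 0}, max_retries=3))
```

Note that the function never stops a VM: a failed migration only changes the reported host state, in line with the rule that running VMs must not be stopped.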

This feature introduces an easy and integrated way to check the health of virtual routers (VRs) within CloudStack. With the help of these checks, administrators can monitor VRs and take any necessary action when a failure is reported. These health checks can be basic or advanced.

Basic health checks include:

  • Connectivity from the management server to the virtual router
  • Connectivity from a virtual router to its interfaces’ gateways
  • Free disk space on virtual router
  • CPU and memory usage
  • VR Sanity checks: SSH/dnsmasq/haproxy/httpd services running

Advanced health checks include:

  • DHCP / DNS configuration matches management server DB
  • IPtables port forwarding rules match management server records
  • HAproxy configuration matches management server DB records
  • VR Version against the current version: this check is done by comparing the contents of the ‘/etc/cloudstack-release’ and ‘/var/cache/cloud/cloud-scripts-signature’ files with the data given by the management server

These health checks are run on each virtual router using the information that the management server periodically sends. After the virtual router completes the health checks, it stores the results in a dedicated JSON file for basic and advanced checks. The management server retrieves these results and stores them in the database.

The administrator can easily retrieve the health check results for virtual routers via the ‘getRouterHealthCheckResults’ API or through the UI, in the new Health Checks tab for each VR. This tab displays individual test results.

The health check files are located in the ‘/root/healthchecks/’ directory on each virtual router. Administrators who want to add more health checks can add them to this directory by creating a new system VM template or updating the existing system VM ISO.

Administrators can control VR health checks using global settings, some of which can be overridden at zone level. These are as follows:

VR health checks are enabled by default. However, the administrator can disable them using this global setting:

  • health.checks.enabled

Advanced and basic health checks are controlled using these global settings:

  • health.checks.advanced.interval
  • health.checks.basic.interval
  • health.checks.config.refresh.interval
  • health.checks.results.fetch.interval

A set of health checks can be ignored if the administrator sets them as a comma-separated list in this global setting:

  • health.checks.to.exclude

Failures of the health checks listed in this setting cause the router to be recreated:

  • health.checks.failures.to.recreate.vr

Thresholds that determine when a test fails (free disk space falling below its threshold, or CPU or memory usage rising above its maximum):

  • health.checks.free.disk.space.threshold
  • health.checks.max.cpu.usage.threshold
  • health.checks.max.memory.usage.threshold
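
How these three thresholds might be applied can be sketched as follows. The function name `failed_checks`, the check labels and the units (megabytes for disk, percentages for CPU and memory) are all assumptions made for illustration; the actual check scripts live on the virtual router and are not reproduced here.

```python
from typing import List


def failed_checks(
    free_disk_mb: float,
    cpu_usage_pct: float,
    memory_usage_pct: float,
    min_free_disk_mb: float,
    max_cpu_usage_pct: float,
    max_memory_usage_pct: float,
) -> List[str]:
    """Return the names of the threshold checks that fail for the measured values."""
    failures = []
    if free_disk_mb < min_free_disk_mb:
        # Free space is a minimum: too little space is a failure.
        failures.append("free_disk_space")
    if cpu_usage_pct > max_cpu_usage_pct:
        # Usage thresholds are maximums: too much usage is a failure.
        failures.append("cpu_usage")
    if memory_usage_pct > max_memory_usage_pct:
        failures.append("memory_usage")
    return failures


if __name__ == "__main__":
    print(failed_checks(50, 95, 40, min_free_disk_mb=100,
                        max_cpu_usage_pct=90, max_memory_usage_pct=90))
```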

For some time now, CloudStack has supported the direct download of user templates for the KVM hypervisor. This allows a user to register a template (whilst bypassing secondary storage) and have it downloaded and installed directly to primary storage, where it can be used for deployment.

With CloudStack 4.14, this has been improved by adding direct download support for system VM templates. Previously, administrators could register new system VM templates with the direct download flag, but this flag was not honoured and there would be a failure during system VM deployment. Now, an administrator can register a new system VM template as ROUTING or USER type with the direct download flag, and it can be changed to SYSTEM type during the upgrade or by out-of-band database changes. The type of the newly registered template can be changed to SYSTEM in the database using a SQL query similar to:

UPDATE cloud.vm_template SET type='SYSTEM' WHERE uuid='UUID_OF_NEW_TEMPLATE';

Recreation of the system VM will now result in using the newly registered direct download template, and the new system VM will work as normal once deployed. With this feature, an administrator could run their datacenter using CloudStack without requiring secondary storage.

The new release also adds the ability to configure different timeout values for the direct downloading of templates. Three new global settings have been added:

  • download.connect.timeout – Connection establishment timeout in milliseconds for direct download. Default value: 5000 milliseconds.
  • download.socket.timeout – Socket timeout (SO_TIMEOUT) in milliseconds for direct download. Default value: 5000 milliseconds.
  • download.connection.request.timeout – Requesting a connection from connection manager timeout in milliseconds for direct download. Default value: 5000 milliseconds. This setting is hidden and not visible in UI.

Apache CloudStack has proved itself to be a great cloud orchestrator and is probably one of the best open-source platforms available for deploying new IaaS environments. However, until now there has been little or no support for importing an existing cloud environment, with all its resources and entities, into CloudStack. This changes with the new CloudStack 4.14 release, which includes a feature allowing the import of existing virtual machines (VMs) into a cloud environment.

With CloudStack 4.14, new APIs have been added to list and import unmanaged VMs. Currently, this support is only available for vSphere clusters (i.e. only for the VMware hypervisor), but given the vibrant community of the Apache CloudStack project, it will probably not be long before this support is extended to other hypervisors as well.

The following new APIs have been added to facilitate listing and importing of an unmanaged instance (VM):

  • listUnmanagedInstances – to list unmanaged VM instances that are present on the hypervisor end but could not be accessed from CloudStack
  • importUnmanagedInstance – to import unmanaged VM instances

While the system auto-configures several VM parameters during import (such as network, IP address, host and storage), it relies on a few pre-requisites for successful listing and import. The system requires a vSphere cluster to have been added to CloudStack with the desired hosts and storage. It also requires CloudStack networks to be created for the existing networks from vCenter. There are some additional requirements; more detail can be found in the CloudStack documentation: http://docs.cloudstack.apache.org/en/latest/adminguide/virtual_machines.html#importing-virtual-machines

Once these pre-requisites have been fulfilled, the administrator can successfully import virtual machines into CloudStack using importUnmanagedInstance which takes a number of parameters to configure template, compute, storage, network, etc. for the VM. On successful import, the imported VM operates as if it had been deployed from within CloudStack, and all normal VM operations can now be carried out.
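
To give a rough idea of what calling the listing API involves, here is a minimal Python sketch that only builds the query string for a request. The helper name `build_api_query` is hypothetical, the `clusterid` parameter and its placeholder value are assumptions for illustration (check the API reference for the exact parameter set), and authentication (API key and request signature) is deliberately left out.

```python
from typing import Optional
from urllib.parse import urlencode


def build_api_query(command: str, **params: Optional[str]) -> str:
    """Build a sorted, URL-encoded query string for a CloudStack API command."""
    query = {"command": command, "response": "json"}
    # Skip parameters that were not supplied.
    query.update({key: value for key, value in params.items() if value is not None})
    return urlencode(sorted(query.items()))


if __name__ == "__main__":
    # CLUSTER-UUID is a placeholder, not a real identifier.
    print(build_api_query("listUnmanagedInstances", clusterid="CLUSTER-UUID"))
```

The same helper could build an importUnmanagedInstance request once the parameters for template, compute, storage and network have been chosen.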

Since the network will be the most dynamic aspect of the virtual machine, the new import API tries to automatically assign networks to its NICs. The user of the API can also manually assign the network(s) and IP addresses with the appropriate parameters. To ease the creation of CloudStack network(s) for existing networks from vCenter, the new release also includes a network discovery Python script, discover_networks.py, in the cloudstack-common packages. This script lists all networks for a vCenter host or cluster that have at least one VM attached to them.

The addition of this new import functionality should aid the migration of existing virtual machines into a CloudStack environment, and will help cloud operators who wanted to move to CloudStack but could not because of their existing setup.