
Introduction

CloudStack vSphere integration has not kept up with the evolution of vSphere itself, and several functions can be performed natively by vSphere much more efficiently than by CloudStack. vSphere also has additional features which would be beneficial to the operators of vSphere based CloudStack clouds.

This feature introduces support in CloudStack for VMFS6, vSAN, vVols and datastore clusters. vSphere storage policies are now tied to compute and disk offerings, improving the way offerings are linked to storage, and CloudStack allows inter-cluster VM and volume migrations, meaning that running VMs can now be migrated along with all of their volumes across clusters. Furthermore, storage operations (create and attach volume; create snapshot / template from volume) are improved in CloudStack by using the native APIs of vSphere.

Storage types and management concepts

CloudStack supports NFS and VMFS5 storage for primary storage, but vSphere has supported other storage technologies for some time now (VMFS6, vSAN, vVols and datastore clusters). vSphere also has ‘vStorage API for Array Integration’ (VAAI), which enables vSphere integration with other vendors’ storage arrays on different storage technologies. Each storage technology is designed to serve a slightly different use case, but ultimately, they are all designed to improve the flexibility, efficiency, speed, and availability of storage to vSphere hosts. In addition to the storage types, there are storage management concepts in vSphere such as vSphere Storage Policies, which are not available in CloudStack.

Let us briefly go through these new technologies and concepts that are supported in CloudStack and vSphere.

VMFS6

VMFS6 is the latest VMware File System version (introduced with vSphere 6.5) and brings a few enhancements over VMFS5. The major differences are:

  • SESparse disks which provide improved space efficiency are now default disk types in VMFS6
  • Automatic space reclamation allows vSphere to reclaim dead or stranded space on thinly provisioned VMFS volumes in storage arrays

vSAN

vSAN was introduced with vSphere 5.5 and is a software-defined, enterprise storage solution that supports hyper-converged infrastructure (HCI) systems. vSAN is fully integrated with VMware vSphere, as a distributed layer of software within the ESXi hypervisor.

vVols

Virtual volumes (vVols), introduced with vSphere 6.5, is an integration and management framework for external storage providers, enabling a more efficient operational model optimized for virtualized environments and centred on the application instead of the infrastructure. vVols uniquely shares a common storage operational model with vSAN. Both solutions use storage policy-based management (SPBM) to eliminate storage provisioning, and use descriptive policies at the VM or VMDK level.

Datastore clusters

A datastore cluster is a collection of datastores with shared resources and a shared management interface. After a datastore cluster is created, vSphere Storage DRS can be used to manage storage resources. When a datastore is added to a datastore cluster, the datastore’s resources become part of the datastore cluster’s resources.

Storage policies

Storage policies have become vSphere’s preferred method of determining the best placement of a disk image when differing ‘qualities’ of storage are available; they are simply a set of filters. For instance, a storage policy may state that the underlying disk must be encrypted; when a user specifies that storage policy, they are only offered a list of datastores with encrypted disks on which to place the VM’s disk. Storage policies are effectively a prerequisite for the use of vSAN and vVols.

GUI or API support

CloudStack has new APIs, modified existing APIs and UI support so that vSphere’s advanced capabilities can be used. To support different storage types for primary storages, the UI has changed.

Storage types

Previously the only options for storage protocol type in CloudStack (while adding primary storage) were NFS, VMFS or custom. A new generic type called “presetup” has been added (for VMware) to add storage types VMFS5, VMFS6, vSAN or vVols. When a presetup datastore is added to CloudStack, the management server automatically identifies the storage pool type and saves it to the database.

To add a datastore cluster (which must have already been created on vCenter) as a primary storage, there is another new storage protocol type called “Datastore Clusters”.

To add one of the new Primary Storage types:

  1. Infrastructure tab -> Primary Storage -> Click “Add Primary Storage”
  2. Under “Protocol” the following options are available:
    • nfs
    • presetup
    • datastore cluster
    • custom

 

  3. When “PreSetup” is selected as the storage protocol type, specify the vCenter server, datacenter and datastore details as shown below:

Storage policies

  • New APIs are introduced to import and list already imported storage policies from vCenter.
  • Storage policies are imported automatically from vCenter when a VMware zone is added in CloudStack.
  • Storage policies are re-imported and synchronised with the CloudStack database whenever the “updateVmwareDc” API or the “importVsphereStoragePolicies” API is called. During re-import, any new storage policies added in vCenter are imported into CloudStack, and any storage policy deleted in vCenter is marked as removed in the CloudStack database.
  • Another new API, “listVsphereStoragePolicyCompatiblePools”, is added to list the compatible storage pools for an imported storage policy.
The new APIs and their parameters are:

  • importVsphereStoragePolicies – zoneid: the ID of the zone whose storage policies are to be imported from the corresponding vSphere datacenter
  • listVsphereStoragePolicies – zoneid: the ID of the zone whose storage policies should be listed
  • listVsphereStoragePolicyCompatiblePools – zoneid: the ID of the zone in which to list compatible storage pools; policyid: the UUID of the storage policy
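For example, with CloudMonkey the imported policies and, for a given policy, the compatible storage pools can be inspected roughly as follows (a sketch; the zone and policy UUIDs are placeholders):

(cloudmonkey)> list vspherestoragepolicies zoneid=<zone-uuid>
(cloudmonkey)> list vspherestoragepolicycompatiblepools zoneid=<zone-uuid> policyid=<policy-uuid>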

 

Existing APIs “createDiskOffering” and “createServiceOffering” are modified to bind a vSphere storage policy to the offerings using a new parameter “storagepolicy”, which takes the policy UUID as input. In the GUI, while creating a service or disk offering, after selecting a specific VMware zone, the storage policies already imported in that zone are listed as below:

 

  • When VMs are deployed on VMware hosts, a primary storage pool is selected which is compliant with the storage policy defined in the offerings. For data disks, the storage policy defined in the disk offering will be used, and for root disks, the storage policy defined in the service offering will be used.
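As an illustration, an offering bound to an imported policy could be created through CloudMonkey roughly as follows (a sketch; the UUIDs are placeholders and the sizes are arbitrary):

(cloudmonkey)> create diskoffering name=policy-data-disk displaytext="Data disk with storage policy" disksize=20 storagepolicy=<policy-uuid>
(cloudmonkey)> create serviceoffering name=policy-compute displaytext="Compute offering with storage policy" cpunumber=2 cpuspeed=1000 memory=4096 storagepolicy=<policy-uuid>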

Implementation

As mentioned above, CloudStack now supports adding the new storage types and datastore clusters under one protocol category called “presetup”. The following are the steps that the management server takes while adding a primary storage for the various storage protocols:

NFS storage

The management server mounts the NFS storage on the ESXi hosts by sending a ‘create NAS datastore’ API call to vCenter.

PreSetup

  • The management server assumes that the provided datastore has already been created on vCenter.
  • The management server checks access to the datastore using the name and vCenter details provided.
  • Once a datastore with the provided name is found, the management server fetches the type of the datastore and adds the protocol type to the “storage_pool_details” table in the database.

 Datastore cluster

Since a datastore cluster on vCenter is a collection of datastores, let us call the datastore cluster itself the parent datastore and the datastores inside the cluster the child datastores. CloudStack handles a datastore cluster by adding it as a single primary storage. The pools inside the cluster are hidden and won’t be available individually for any operation.

There were some implementation challenges in adding it directly as a primary storage. On vCenter, a datastore cluster looks similar to the other datastore types, as shown below.

In the underlying vSphere implementation, the type of every datastore other than a datastore cluster is “Datastore”, whereas the type of a datastore cluster is “StoragePod”. The vSphere native APIs related to storage operations apply only to the “Datastore” type, not to “StoragePod”. Because of this, the existing design of adding a datastore as a primary storage in CloudStack did not work for datastore clusters. The challenge was therefore how CloudStack could abstract a datastore cluster directly as a single primary storage entity. This is achieved as follows:

  • When a datastore cluster is added as a primary storage in CloudStack, it auto-imports the child datastores inside the cluster as primary storages in CloudStack; e.g. when datastore cluster DS1 with 2 child datastores is added into CloudStack, the management server will create 3 primary storages (1 parent datastore and 2 child datastores) and record each child datastore’s parent in the database.
  • A new column “parent” is introduced in “storage_pool” table in database.
  • “parent” column of child datastores is pointed to the parent datastore.
  • Only the parent datastore is made visible to admins; the child datastores are hidden, making the datastore cluster act like a black box.
  • Whenever a storage operation is performed on a datastore cluster, management server chooses one of the child datastores for that operation.
  • Any operation on a datastore cluster is in fact performed on all its child datastores. For example, if a datastore cluster is put into maintenance mode, then all the child datastores are put into maintenance mode; upon any failure, the change is reverted to the original state and an error is thrown for the original operation.

Following are the APIs where datastore cluster implementation is involved (storageid is passed as a parameter):

  • updateConfiguration – propagates the value of the global setting passed for the datastore cluster to all its child datastores
  • listSystemVms – lists all system VMs located in all child datastores
  • prepareTemplate – prepares templates in one of the available child datastores
  • listVirtualMachines – lists all virtual machines located in all child datastores
  • migrateVirtualMachine – migrates a VM to one of the available child datastores
  • migrateVolume – migrates a volume to one of the available child datastores
  • If any datastore needs to be added to or removed from a datastore cluster that has already been added as a primary storage, the primary storage must first be removed from CloudStack and re-added after the required modifications have been made on the storage.

Storage Policies

On vCenter, storage policies act like a filter and control which type of storage is provided for the virtual machine, and how the virtual machine is placed within storage. So the best fit for storage policies in CloudStack is in disk offering and compute offering, since these offerings are also used to find the suitable storage and resources during virtual machine deployment.

 An admin can select an imported storage policy while creating a disk or service offering. Based on the storage policy, the corresponding disk is placed in the relevant storage pool which is in compliance with the storage policy, and the VM and disk are configured to enforce the required level of service based on the policy.

For example:

  • If a compute offering is created with “VVol No Requirement Policy” (the default storage policy for vVols), CloudStack tries to keep the root disk of the virtual machine in a vVols primary storage, and the VM is also configured with that policy. Upon any other storage operation (i.e. volume migration), this storage policy is taken into consideration for the best placement of the VM and root disk.
  • If a disk offering is created with any storage policy, the same applies to the data disk.

vSphere related changes

“fcd” named folder in the root directory of storage pool

  • Previously any data disk was placed in the root folder of the primary storage pool. This is possible for NFS or VMFS5 storage types, but there is a limitation with the vSAN storage type as it does not support storage of user files directly in the root of the directory structure. Therefore, a separate folder is now created on all primary storage pools with the name “fcd”.
  • Since the storage operations are made independent of storage type, the “fcd” folder is created on all storage types.
  • The folder name is “fcd” because when the vSphere API is used to create a first class disk, vCenter automatically creates a folder called ‘fcd’ (unless it already exists) and creates the disk in that folder.

vVols template or VM creation with UUID as name

  • When deploying a VM from an OVF template on vCenter, a UUID cannot be used as the name of the VM or template. CloudStack seeds templates from secondary to primary storage using the template UUID, and uses a newly generated UUID when creating the worker VM. So, for VM or template creation operations on vVols datastores, a “cloud.uuid-” prefix is added to the UUID whenever it is used.

vVols disk movement

vVols does not allow a disk to be moved from where it was created or placed using the vSphere native APIs. If a disk is moved from its intended location, the pointer to the underlying vVols storage is lost and the disk becomes inaccessible. Therefore, the following changes were made with respect to vVols storage pools to avoid disk movement:

VM creation:

The VM will be cloned from the template on the vVols datastore directly with CloudStack’s VM internal name (e.g. i-2-43-VM). Previously CloudStack used to clone the VM from the template with the root disk name (e.g. ROOT-43) and then move the volumes from the root-disk-name folder to the VM internal name folder.

Volume creation:

  • When a volume is first created and placed in a folder on the storage, it will not be moved from that folder whether it is attached to a VM or detached from a VM.

Conclusion

As of LTS version 4.15, CloudStack supports vSAN, vVols, VMFS5, VMFS6, NFS, datastore clusters and storage policies, and also operates more like native vSphere in order to manage them better.

Vendors of virtual appliances (vApps) for VMware often produce ‘templates’ of their appliances in an OVA format. An OVA file will contain disk images, configuration data of the virtual appliance, and sometimes a EULA which must be acknowledged.

The purpose of this feature is to enable CloudStack to mimic the end-user experience of importing such an OVA directly into vCenter, the end result being a virtual appliance deployed with the same configuration data in the virtual machine’s descriptor (VMX) file as would be there if the appliance had been deployed directly through vCenter.

The OVA will contain configuration data regarding both hardware parameters required to start a virtual appliance, and software parameters which the virtual appliance will be able to read during instantiation. Generally, the software parameters take the form of questions posed to the end-user, the answers to which are passed to the virtual appliance. Hardware parameters may either be set as a result of a question to the end-user (e.g. “Would you like a small, medium, or large appliance?”), or they may be passed directly into the virtual machine’s descriptor (VMX) file.

CloudStack version 4.15 includes full support for vApp OVA templates. Users will be able to deploy vApps in CloudStack, resulting in an appliance deployed with the same configuration data as if the VM had been deployed directly through vCenter.

Glossary:

The following terms will be used throughout this blog:

  • Workflow: The VM deployment cycle procedure / tasks on CloudStack for VMware environments.
  • ‘Deploy-as-is’: A new workflow / paradigm in which the deployed VMs inherit all the pre-set configurations and information from the template. In other words, ‘deploy-as-is VMs’ are clones of the templates from which they are deployed.
  • ‘Non-deploy-as-is’: The usual workflow on CloudStack for VMware environments prior to version 4.15.

High-level CloudStack VMware workflow refactor:

In this section, we will deep-dive into the improvements and refactoring in CloudStack to support vApps – from the usual VMware workflow to the ‘deploy-as-is’ workflow.

The default behavior for templates registered from CloudStack 4.15 and onwards will be the ‘deploy-as-is’ workflow. The main difference between this and the existing workflow is that ‘deploy-as-is’ lets CloudStack simplify the VM deployment process on VMware by using the template information about guest OS, hardware version, disks, network adapters and disk controllers to generate new VMs, without much user intervention.

As a vApp template can have multiple ‘configurations’, the copy from secondary storage to primary storage must be extended:

  • The ‘configuration’ ID selected by the user must be considered when copying a template from secondary to primary storage
  • The same template can now have multiple “versions” in the same primary storage, as the users can select from multiple OVF ‘configurations’ from the same template.
  • This is reflected in a new column ‘deployment_option’ in the table ‘template_spool_ref’

Prior to version 4.15, these were the steps involved in VMware VM deployment:

  • Deploy the OVF template from Secondary to Primary storage into a ‘template VM’
  • Create the VM ROOT disk:
    • Clone the template VM on Primary storage into a temporary VM
    • Detach the disk
    • Remove the temporary VM
  • Deploy a VM from the template:
    • Create a blank VM
    • Attach the ROOT disk created previously and any data disk.

As mentioned earlier, all templates registered from CloudStack 4.15 will use the improved workflow (‘deploy-as-is’). This extends but simplifies the previous workflow:

  • VMs can have multiple ROOT disks (as appliances may need multiple disks to work)
  • All the information set in the template is honored, and user-specified settings such as guest OS type, ROOT disk controller, network adapter type, boot type and boot mode are ignored.
  • The OVF template is deployed from Secondary to Primary storage matching the configuration ID selected by the user, creating a ‘template VM’ of a specific OVF ‘configuration’ on the Primary Storage
  • Create the VM ROOT disks:
    • Clone the template VM on Primary storage into the final user VM
  • Deploy a VM from the template (the user VM exists by now):
    • Use the cloned VM to get its disk info, and then reconcile the disk information in the database
    • The VM’s disk information is obtained by the SSVM when CloudStack allocates volumes for the virtual machine in the database. The SSVM uses the template ID and the selected ‘configuration’ to read the OVF file in secondary storage and retrieve the disk information.
  • Attach any data disks.

When it comes to the original workflow (for templates registered before 4.15), note that the use of a blank VM means dropping all the information from the source template. As the resulting VM was not a clone of the template (except for the ROOT disk), all the information that the template contained was not considered: guest OS type, hardware version, controllers, configurations, vApp properties, etc. With the new workflow we copy all the information available from the template.

As the information is now obtained from the template itself, CloudStack no longer requires some information at the template registration time. Instead, it obtains this information directly from the OVF descriptor file, meaning there is no need to select a ROOT disk controller, network adapter type or guest OS.

Initially, the template is registered with a default guest OS ‘OVF Configured OS’ until the template is successfully installed. Once installed, the guest OS is displayed from the information set in the OVF file.

  • To provide a complete list of supported guest OS, the tables ‘guest_os’ and ‘guest_os_hypervisor’ have been populated with the information from: https://code.vmware.com/apis/704/vsphere/vim.vm.GuestOsDescriptor.GuestOsIdentifier.html
  • There is also no need to select BIOS or UEFI (or Legacy vs. Secure boot mode) at VM deployment time, and no ‘hardcoding’ of the VM HW / VMX version to the ‘latest supported’ by the destination ESXi host (the old workflow behaviour).
  • This information is obtained from the OVF file at template registration – i.e. after the template is downloaded to the secondary storage, it is extracted and the OVF file is parsed.
  • All the parsed information is sent to the management server and is persisted in database (in a new table ‘template_deploy_as_is_details’)

Disclaimer

Some of this functionality was introduced in version 4.14. However, as the only supported sections were the user-configurable properties, this support was extremely limited and should not be used with vApp templates. Templates registered prior to upgrading to or installing CloudStack version 4.15 will not support the vApp functionality. These templates will continue working as they did before the upgrade, following the usual workflow for VMware deployments on CloudStack.

The default behavior for templates registered from version 4.15 and onwards is the “deploy-as-is” workflow.

vApp templates format

An appliance OVA contains a descriptor file (OVF file) containing information about the appliance, organized in different sections in the OVF descriptor file. Most sections used by the appliances are not set on ‘non-deploy-as-is’ templates.

The most common sections set on appliances are:

  • Virtual hardware and configurations. The appliance can provide different deployment options (configurations) where each one of them has different hardware requirements such as CPU number, CPU speed, memory, storage, networking.
  • User-configurable parameters. The appliance can provide a certain number of user-configurable properties (also known as vApp properties) which are often required for the initial configuration of an appliance. It is possible to define required parameters which must be set by the user to continue with the appliance deployment.
  • The license agreements. The appliance can define a certain number of license agreements which must be accepted by the user to continue with the appliance deployment.

For further information on the full OVF format and its syntax, please visit the OVF Specification Document: https://www.dmtf.org/sites/default/files/standards/documents/DSP0243_2.1.1.pdf

vCenter – Deploy a vApp from OVF

Before this feature, the only feasible way to deploy an appliance was directly through vCenter, using the ‘Deploy from OVF’ operation. However, CloudStack would not be aware of such a VM, and it would therefore not be managed by CloudStack. As this feature enables users to mimic the deployment experience of appliances through CloudStack, we will briefly describe the vCenter deployment experience with an example appliance containing all the sections described in the previous section (vApp templates format). Before starting the deployment of a virtual appliance, vCenter first displays the end-user license agreements, which the user must accept before continuing with the deployment wizard:

The next step displays the different configurations for the appliance and their respective hardware requirements. The user must select only one of the configurations available to proceed to the next step:

The appliances are preset with a certain number of network interfaces, which must be connected to networks (either to different networks or to the same network). Each network interface shows a name to help the user connect the interfaces to the appropriate networks.

The last step involves setting some properties. The property input fields can be of different types: text inputs, checkboxes or dropdown menus. vCenter displays the properties and allows the user to enter the desired values:

vCenter is now ready to deploy the appliance.

CloudStack – Deploy a vApp

The VM deployment wizard in CloudStack is enhanced to support deploying vApps in version 4.15. This is achieved by examining the OVF descriptor file of the ‘deploy-as-is’ templates and presenting the information that needs user input as new sections in the VM deployment wizard.

New VM deployment sections

With the new ‘deploy-as-is’ workflow, when CloudStack detects that a template contains the special sections described above, it presents them to the user in a similar way to vCenter, but in a different order, extending the existing VM deployment wizard steps.

The VM deployment wizard requires the user to select a template from which the VM must be deployed. If the selected template provides different OVF ‘configurations’, then the existing ‘Compute Offering’ step is extended. Instead of displaying the existing compute offerings to the user, CloudStack now displays a new dropdown menu showing all available configurations. The user must select one and a compatible compute offering:

When the user selects a configuration from the Configuration menu, then the list of compute offerings is filtered, displaying only the service offerings (fixed or custom) matching the minimum hardware requirements defined by the selected configuration.

In the case of custom offerings, CloudStack automatically populates the required values with all the information available from the configuration (for number of CPUs, CPU speed and memory). If CloudStack does not find information for some of these fields, then the user must provide a value.

The ‘Networks’ step is also extended, displaying all the network interfaces required by the appliance.
If the template contains user-configurable properties, then a new section ‘vApp properties’ is displayed:

If the template contains end-user license agreements, then a new section ‘License agreements’ is displayed, and the user must accept the license agreements to finish the deployment.

Conclusion

CloudStack version 4.15 introduces the deploy-as-is workflow for VMware environments, making it the default workflow for every new template. The previous workflow is still preserved, but only for templates registered prior to version 4.15.

In CloudStack, secondary storage pools (image stores) house resources such as volumes, snapshots and templates. Over time these storage pools may have to be decommissioned or data moved from one storage pool to another, but until now CloudStack has offered little functionality for managing secondary storage pools.

This feature improves CloudStack’s management of secondary storage by introducing the following functionality:

  • Balanced / Complete migration of data objects among secondary storage pools
  • Enable setting image stores to read-only (making further operations such as download of templates or storage of snapshots and volumes impossible)
  • Algorithm to automatically balance image stores
  • View download progress of templates across datastores using the ‘listTemplates’ API

Balanced / Complete migration of data objects among secondary storage pools

To enable admins to migrate data objects (i.e. snapshots, (private) templates or volumes) between secondary storage pools, an API has been exposed which supports two types of migration:

  • Balanced migration – achieved by setting the ‘migrationtype’ field of the API to “Balance”
  • Complete migration – achieved by setting the ‘migrationtype’ field of the API to “Complete”

If the migration type isn’t provided by the user, it will default to “Complete”.

Usage:

migrate secondarystoragedata srcpool=<src image store uuid> destpools=<array of destination image store uuids> migrationtype=<balance/complete>

Balanced migration:

The idea here is to evenly distribute data objects among the specified secondary storage pools. For example, if a new secondary storage is added and we want data to be placed in it from another image store, the “Balanced” migration policy would be most suitable.

As part of this policy there is a Global setting “image.store.imbalance.threshold” which helps in deciding when the stores in question have been balanced. This threshold (by default, set to 0.3) basically indicates the ideal mean standard deviation of the image stores. Therefore, if the mean standard deviation is above this set threshold, migration of the selected data object will proceed. However, if the mean standard deviation of the image stores (destination(s) and source) is less than or equal to the threshold, then the image stores have reached a balanced point and migration can stop. As part of the balancing algorithm, we also check the mean standard deviation of the system before and after the migration of a specific file and if the standard deviation increases then we omit the particular file and proceed further as its migration will not provide any benefit.

Complete migration:

Complete migration migrates a file if the destination image store has sufficient free capacity to accommodate the data object (its used capacity is below 90% and its free space is larger than the size of the chosen file). Also, during complete migration the source image store is set to “read-only”, in order to ensure that the store is no longer selected for any other operation involving storage of data in image stores.

Before a migration starts, the following validations are performed:

  • Source and destination image stores are valid (ie., are NFS based stores in the same datacenter)
  • Validity of the migration type / policy passed
  • Role of the secondary storage(s) is “Image”
  • Destination image stores don’t include the source image store
  • None of the destination image stores should be set to read-only
  • There can be only one migration job running in the system at any given time
  • If choice of migration is “Complete” then there should not be any files that are in Creating, Copying or Migrating states

Furthermore, care has been taken to ensure that snapshots belonging to a chain are migrated to the same image store. If snapshots are created during migration, then:

  • If the migration policy is “complete” and the snapshot has no parent, then it will be migrated to one of the destination image stores
  • If the snapshot has a parent then the snapshot will be moved to the same image store as the parent

Another aspect of the migration feature is scaling of Secondary storage VMs (SSVMs) to prevent all migrate jobs being handled by one SSVM, which may hamper the performance of other activities that are scheduled to take place on it. The relevant global settings are:

  • max.migrate.sessions. New. Indicates the number of concurrent file transfer operations that can take place on an SSVM (defaults to 2)
  • ssvm.count. New. Maximum number of additional SSVMs that can be spawned to handle the load (defaults to 5). However, if the number of hosts in the datacenter is less than the configured maximum, then the number of hosts takes precedence
  • vm.auto.reserve.capacity. Existing. Should be set to true (the default) if we want SSVMs to scale when the load increases

Additional SSVMs will be created when half of the total number of jobs have been running for more than the duration defined by the global setting max.data.migration.wait.time (default 15 minutes). Therefore, if a migration job has been running for more than 15 minutes, a new SSVM is spawned and jobs can be scheduled on it.

These additional SSVMs will be automatically destroyed when:

  • The migration job has reached completion
  • The total number of commands in the pipeline (as determined by the cloud.cmd_exec_log table) is below the defined threshold
  • There are no jobs running on the SSVM in question

UI Support:

As well as using the API (cloudmonkey / cmk), support has been added in the new Primate UI.

Navigate to Infrastructure → Secondary Storages. At the top right corner, click on the migrate button:


Enable setting image stores to read-only

A secondary storage pool may need to be set to read-only mode, to prevent downloading objects onto it. This could prove useful when decommissioning a storage pool. An API has been defined to enable setting a Secondary storage to read only:

update imagestore id=<image_store_id> readonly=<true/false>

It is possible to filter image stores based on read-only / read-write permission using the ‘listImageStores’ API.

Algorithm to automatically balance image stores

Currently the default behaviour of CloudStack is to choose an image store with the highest free capacity. There is a new global setting: “image.store.allocation.algorithm”, which by default is set to “firstfitleastconsumed”, meaning that it returns the image stores in decreasing order of their free capacity. Another allocation option is ‘random’, which returns image stores in a random order.
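For example, the allocation algorithm can be switched to random ordering with the standard updateConfiguration API, e.g. via CloudMonkey (a minimal sketch):

(cloudmonkey)> update configuration name=image.store.allocation.algorithm value=random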

View download progress of templates across datastores using the ‘listTemplates’ API

The “listTemplates” API has been extended to support viewing download details: progress, download status, and image store. For example:
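Using CloudMonkey, such a query might look like the following (a sketch; the template filter and ID are placeholders), with the response carrying the download progress, download state and image store for each copy of the template:

(cloudmonkey)> list templates templatefilter=self id=<template-uuid>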

This feature will be available as of Apache CloudStack 4.15, which will be an LTS release.

For a while, the CloudStack community has been working on adding support for containers. ShapeBlue successfully implemented the CloudStack Container Service and donated it to the project in 2016, but it was not completely integrated into the codebase. However, with the recent CloudStack 4.14 LTS release, the CloudStack Kubernetes Service (CKS) plugin adds full Kubernetes integration to CloudStack – allowing users to run containerized services using Kubernetes clusters through CloudStack.

CKS adds several new APIs (and updates to the UI) to provision Kubernetes clusters with minimal configuration by the user. It also provides the ability to add and manage different Kubernetes versions, meaning not only deploying clusters with a chosen version, but also providing the option to upgrade an existing cluster to a new version.

The integration

CKS leverages CloudStack’s plugin framework and is disabled by default (for a fresh install or upgrade) – enabled using a global setting. It also adds global settings to set the template for a Kubernetes cluster node virtual machine for different hypervisors; to set the default network offering for a new network for a Kubernetes cluster; and to set different timeout values for the lifecycle operations of a Kubernetes cluster:

cloud.kubernetes.service.enabled Indicates whether the CKS plugin is enabled or not. Management server restart needed
cloud.kubernetes.cluster.template.name.hyperv Name of the template to be used for creating Kubernetes cluster nodes on HyperV
cloud.kubernetes.cluster.template.name.kvm Name of the template to be used for creating Kubernetes cluster nodes on KVM
cloud.kubernetes.cluster.template.name.vmware Name of the template to be used for creating Kubernetes cluster nodes on VMware
cloud.kubernetes.cluster.template.name.xenserver Name of the template to be used for creating Kubernetes cluster nodes on Xenserver
cloud.kubernetes.cluster.network.offering Name of the network offering that will be used to create isolated network in which Kubernetes cluster VMs will be launched
cloud.kubernetes.cluster.start.timeout Timeout interval (in seconds) in which start operation for a Kubernetes cluster should be completed
cloud.kubernetes.cluster.scale.timeout Timeout interval (in seconds) in which scale operation for a Kubernetes cluster should be completed
cloud.kubernetes.cluster.upgrade.timeout Timeout interval (in seconds) in which upgrade operation for a Kubernetes cluster should be completed*
cloud.kubernetes.cluster.experimental.features.enabled Indicates whether experimental feature for Kubernetes cluster such as Docker private registry are enabled or not

* There can be some variation while obeying cloud.kubernetes.cluster.upgrade.timeout as the upgrade on a cluster node must be finished either successfully or as failure for CloudStack to report status of the cluster upgrade.
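For instance, the plugin can be enabled and a node template configured via the standard updateConfiguration API, e.g. with CloudMonkey (a sketch; the template name value is illustrative, and a management server restart is needed after enabling the plugin):

(cloudmonkey)> update configuration name=cloud.kubernetes.service.enabled value=true
(cloudmonkey)> update configuration name=cloud.kubernetes.cluster.template.name.kvm value=<registered-coreos-template-name>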

Once the initial configuration is complete and the plugin is enabled, the UI starts showing a new tab ‘Kubernetes Service’ and different APIs become accessible:

Under the hood

Provisioning a Kubernetes cluster can in itself be a complex process, depending on the tool used (minikube, kubeadm, kubespray, etc.). CKS simplifies and automates the complete process, using the kubeadm tool for provisioning clusters and performing lifecycle operations. As mentioned in the kubeadm documentation:

Kubeadm performs the actions necessary to get a minimum viable cluster up and running. By design, it cares only about bootstrapping, not about provisioning machines. Likewise, installing various nice-to-have addons, like the Kubernetes Dashboard, monitoring solutions, and cloud-specific addons, is not in scope.

Therefore, all orchestration for cluster node virtual machines is taken care of by CloudStack, and it is only CloudStack that decides the host or storage for the node virtual machines. CKS uses the kubectl tool for communicating with the Kubernetes cluster to query its state, active nodes, version, etc. Kubectl is a command-line tool for controlling Kubernetes clusters.

For node virtual machines, CKS requires a CoreOS based template. CoreOS has been chosen as it provides docker installation and the networking rules needed for Kubernetes. Considering the current CoreOS situation, support for a different host OS could be added in the future.

Networking for the Kubernetes cluster is provisioned using Weave Net CNI provider plugin.

The prerequisites

To successfully provision a Kubernetes cluster using CKS there are few pre-requisites and conditions that must be met:

  1. The template registered for a node virtual machine must be a public template.
  2. Currently supported Kubernetes versions are 1.11.x to 1.16.x. At present, v1.17 and above might not work due to their incompatibility with weave-net plugin.
  3. A multi-master, HA cluster can be created using Kubernetes versions 1.16.x only.
  4. While creating a multi-master, HA cluster over a shared network, an external load-balancer must be set up manually. This load-balancer should have port-forwarding rules for SSH and Kubernetes API server access. CKS assumes SSH access to cluster nodes is available on ports 2222 to (2222 + cluster node count - 1). Similarly, for API access, port 6443 must be forwarded to the master nodes. On a CloudStack isolated network, these rules are provisioned automatically.
  5. Currently only a CloudStack isolated or shared network can be used for deployment of a Kubernetes cluster. The network must have the Userdata service enabled.
  6. For CoreOS, a minimum of 2 CPU cores and 2GB of RAM is required for deployment of a virtual machine. Therefore, a suitable service offering must be created and used while deploying a Kubernetes cluster.
  7. Node virtual machines must have Internet access at the time of cluster provisioning, scale and upgrade operations, as kubeadm cannot perform certain cluster provisioning steps without it.

The flow

After completing the initial configuration and confirming the requirements, an administrator can proceed with adding the supported Kubernetes versions and deploying a Kubernetes cluster. The addition and management of Kubernetes versions can only be done by an administrator; other users only have permission to list the supported versions. Each Kubernetes version in CKS can only be added as an ISO – a binaries ISO which contains all the Kubernetes binaries and Docker images for a given Kubernetes release. Using an ISO with the required binaries allows faster installation of Kubernetes on the node virtual machines: kubeadm otherwise needs active Internet access on the master nodes during cluster provisioning, and using an ISO with the binaries and Docker images avoids downloading them from the Internet. To facilitate the creation of an ISO for a given Kubernetes release, a new script named create-kubernetes-binaries-iso.sh has been added to the cloudstack-common packages. More about this script can be found in the CloudStack documentation.
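As a rough example, once such an ISO has been built and made available over HTTP, an administrator might register a version via CloudMonkey as sketched below (the version number, URL and resource minimums are placeholders and should be adjusted to the actual release):

(cloudmonkey)> add kubernetessupportedversion semanticversion=1.16.3 name=v1.16.3 zoneid=<zone-uuid> url=http://example.com/setup-v1.16.3.iso mincpunumber=2 minmemory=2048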

Add Kubernetes cluster form in CloudStack UI:

Once there is at least one enabled and ready Kubernetes version and the node VM template in place, CKS will be ready to deploy Kubernetes clusters, which can be created using either the UI or API. Several parameters such as Kubernetes version, compute offering, network, size, HA support, node VM root disk size, etc. can be configured while creating the cluster.
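A cluster creation call might look roughly like this (a sketch; the UUIDs are placeholders and size refers to the number of cluster worker nodes):

(cloudmonkey)> create kubernetescluster name=demo-cluster description="Demo cluster" zoneid=<zone-uuid> kubernetesversionid=<version-uuid> serviceofferingid=<offering-uuid> size=2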

Different operations can be performed on a successfully created Kubernetes cluster, such as start-stop, retrieval of cluster kubeconfig, scale, upgrade or destroy. Both UI and API provide the means to do that.

Kubernetes cluster details tab in CloudStack UI:

Once a Kubernetes cluster has been successfully provisioned, CKS deploys the Kubernetes Dashboard UI. A user can download the cluster’s kubeconfig and use it to access the cluster locally, or to deploy services on the cluster. Alternatively, the kubectl tool can be used along with the kubeconfig file to access the Kubernetes cluster via the command line. Instructions for both kubectl and Kubernetes Dashboard access are available on the Kubernetes cluster details page in CloudStack.
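For example, the kubeconfig can be fetched through the API and then used with a locally installed kubectl (a sketch; saving the returned configuration to a local file is left implicit):

(cloudmonkey)> get kubernetesclusterconfig id=<cluster-uuid>
$ kubectl --kubeconfig ./kube.conf get nodes
$ kubectl --kubeconfig ./kube.conf get pods --all-namespaces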

Kubernetes Dashboard UI accessible for Kubernetes clusters deployed with CKS:

The new APIs

CKS adds a number of new APIs for performing different operations on Kubernetes supported versions and Kubernetes clusters.

Kubernetes version related APIs:

addKubernetesSupportedVersion Available only to Admin, this API allows adding a new supported Kubernetes version
deleteKubernetesSupportedVersion Available only to Admin, this API allows deletion of an existing supported Kubernetes version
updateKubernetesSupportedVersion Available only to Admin, this API allows update of an existing supported Kubernetes version
listKubernetesSupportedVersions Lists Kubernetes supported versions

Kubernetes cluster related APIs:

createKubernetesCluster For creating a Kubernetes cluster
startKubernetesCluster For starting a stopped Kubernetes cluster
stopKubernetesCluster For stopping a running Kubernetes cluster
deleteKubernetesCluster For deleting a Kubernetes cluster
getKubernetesClusterConfig For retrieving Kubernetes cluster config
scaleKubernetesCluster For scaling a created, running or stopped Kubernetes cluster
upgradeKubernetesCluster For upgrading a running Kubernetes cluster
listKubernetesClusters For listing Kubernetes clusters
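For instance, scaling out and then upgrading an existing cluster could be done as follows (a sketch; the UUID values are placeholders):

(cloudmonkey)> scale kubernetescluster id=<cluster-uuid> size=5
(cloudmonkey)> upgrade kubernetescluster id=<cluster-uuid> kubernetesversionid=<new-version-uuid>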

The CloudStack Kubernetes Service adds a new dimension to CloudStack, allowing cloud operators to provide their users with Kubernetes offerings, but this is just the beginning! There are already ideas for improvements within the community such as support for different CloudStack zone types, support for VPC network, and use of a Debian based or a user-defined host OS template for node virtual machine. If you have an improvement to suggest, please log it in the CloudStack Github project.

More details about CloudStack Kubernetes Service can be found in the CloudStack documentation.

About the author

Abhishek Kumar is a Software Engineer at ShapeBlue, the Cloud Specialists. Apart from spending most of his time implementing new features and fixing bugs in Apache CloudStack, he likes reading about technology and politics. Outside work he spends most of his time with family and tries to work out regularly.

There is currently significant effort going on in the Apache CloudStack community to develop a new, modern UI (user interface) for CloudStack: Project Primate. In this article, I discuss why this new UI is required, the history of this project and how it will be included in future CloudStack releases.

There are a number of key dates that current users of CloudStack should take note of and plan for, which are listed towards the end of this article.

 

We also recently held a webinar on this subject:

 

The current CloudStack UI

The current UI for Apache CloudStack was developed in 2012/13 as a single browser page UI “handcrafted” in JavaScript. Despite becoming the familiar face of CloudStack, the UI has always had limitations, such as no browser history, poor rendering on tablets / phones and loss of context on refresh. Its look and feel, although good when it was created, has become dated. However, by far the biggest issue with the existing UI is that its 90,000 lines of code have become very difficult to maintain and extend for new CloudStack functionality. This has resulted in some new CloudStack functionality being developed as API only, and a disproportionate amount of effort being required to develop new UI functionality.

How to build a new UI for CloudStack?

A UI R&D project was undertaken by Rohit Yadav in early 2019. Rohit is the creator and maintainer of CloudMonkey (CloudStack CLI tool) and he set off to use the lessons he’d learnt creating CloudMonkey to evaluate the different options for creating a new UI for CloudStack.

Rohit’s initial R&D work identified a set of overall UI requirements and also a set of design principles.

UI Requirements:

  • Clean enterprise admin & user UI
  • Intuitive to use
  • To match existing CloudStack UI functionality and features
  • Separate UI code from core Management server code so the UI becomes a client to the CloudStack API
  • API auto-discovery of new CloudStack functionality
  • Config and role-based rendering of buttons, actions, views etc.
  • Dashboard, list and detail views
  • URL router and browser history driven
  • Local-storage based notification and polling
  • Dynamic language translations
  • Support desktop, tablet and mobile screen form factors

Design principles:

  • Declarative programming and web-component based
  • API discovery and param-completion like CloudMonkey
  • Auto-generated UI widgets, views, behaviour
  • Data-driven behaviour and views, buttons, actions etc. based on role-based permissions
  • Easy to learn, develop, customise, extend and maintain
  • Use modern development methodologies, frameworks and tooling
  • No DIY frameworks, reuse opensource project(s)

A number of different JavaScript frameworks were evaluated for implementation, with Vue.JS being chosen due to the speed and ease that it could be harnessed to create a modern UI. Ant Design was also chosen as it gave off-the-shelf, enterprise-class, UI building blocks and components.

Project Primate

VM Instance details in Primate

Out of these initial principles came the first iteration of Project Primate, a new Vue-based UI for Apache CloudStack. Rohit presented his first cut of Primate at the CloudStack Collaboration Conference in Las Vegas in September 2019, to much excitement and enthusiasm from the community.

Unlike the old UI, Primate is not part of the core CloudStack Management server code, giving a much more modular and flexible approach. This allows Primate to be “pointed” at any CloudStack API endpoint, or even multiple versions of the UI to be used concurrently. The API auto-discovery allows Primate to recognise new functionality in the CloudStack API, much like CloudMonkey currently does.

Primate is designed to work across all browsers, tablets and phones. From a developer perspective, the codebase should be about a quarter that of the old UI and, most importantly, the Vue.JS framework is far easier for developers to work with.

Adoption of Project Primate by Apache Cloudstack

Primate is now being developed by CloudStack community members in a Special Interest Group (SIG). Members of that group include developers from EWERK, PCExtreme, IndiQus, SwissTXT and ShapeBlue.

In late October, the CloudStack community voted to adopt Project Primate as the new UI for Apache CloudStack and deprecate the old UI. The code was donated to the Apache Software Foundation and the following plan for replacement of the old UI was agreed:

Technical preview – Winter 2019 LTS release

A technical preview of the new UI will be included with the Winter 2019 LTS release of CloudStack (targeted for Q1 2020 and based on the 4.14 release of CloudStack). The technical preview will have feature parity with the existing UI. The release will still ship with the existing UI for production use, but CloudStack users will be able to deploy the new UI in parallel for testing and familiarisation purposes. The release will also include a formal advance deprecation notice of the existing UI.

At this stage, the CloudStack community will also stop taking feature requests for new functionality in the existing UI. Any new feature development in CloudStack will be based on the new UI. In parallel to this, work will be done on the UI upgrade path and documentation.

General Availability – Summer 2020 LTS release

The summer 2020 LTS release of CloudStack will ship with the production release of the new UI. It will also be the last version of CloudStack to ship with the old UI. This release will also have the final deprecation notice for the old UI.

Old UI deprecated – Winter 2020 LTS release

The old UI code base will be removed from the Winter 2020 LTS release of CloudStack, and will not be available in releases from then onwards.

It is worth noting that, as the new Primate UI is a discrete client for CloudStack that uses API discovery, the UI will no longer be bound to the core CloudStack code. This may mean that, in the long term, the UI may adopt its own release cycle, independent of core CloudStack releases. This long-term release strategy is yet to be decided by the CloudStack community.

What CloudStack users need to do

As the old UI is being deprecated, organisations need to plan to migrate to the new CloudStack UI.

What actions specific organisations need to take depends on their use of the current UI. Many organisations only use the CloudStack UI for admin purposes, choosing other solutions to present to their end-users. It is expected that the amount of training required for admins to use the new UI will be minimal and therefore such organisations will not need to extensively plan the deployment of the new UI.

For organisations that do use the CloudStack UI to present to their users, more considered planning is suggested. Although the new UI gives a much enhanced and more intuitive experience, it is anticipated that users may need documentation updates, etc. The new UI will need to be extensively tested with any 3rd party integrations and UI customisations at users’ sites. As the technology stack is completely new, it is likely that such integrations and customisations may need to be re-factored.

A summary of support for the old / new UIs is below:

CloudStack version | Likely release date | Ships with old UI | Ships with new UI | LTS support until*
Winter 2019 LTS | Q1 2020 | Yes | Technical Preview | c. Sept 2021
Summer 2020 LTS | Q2/3 2020 | Yes (although will contain no new features from previous version) | Yes | c. Feb 2022
Winter 2020 LTS | Q1 2021 | No | Yes | c. Sept 2022

*LTS support cycle from the Apache CloudStack community. Providers of commercial support services (such as ShapeBlue) may have different cycles.

Anybody actively developing new functionality for CloudStack needs to be aware that changes to the old UI code will not be accepted after the Winter 2019 LTS release.

Get involved

Primate on an iPhone

As development of Project Primate is still ongoing, I encourage CloudStack users to download and run the Primate UI before release – it is not recommended to use the new UI in production environments until it is at GA. The code and install documentation can be found at https://github.com/apache/cloudstack-primate. This provides a unique opportunity to view the work to date, contribute ideas and test in your environment before the release date. Anybody wishing to join the SIG can do so on the dev@cloudstack.apache.org mailing list.

 

 

Introduction

In my previous post, I described the new ‘Open vSwitch with DPDK support’ on CloudStack for KVM hosts. There, I focused on describing the feature, as it was new to CloudStack, and also explained the necessary configuration on the KVM agents’ side to enable DPDK support.

DPDK (Data Plane Development Kit) (https://www.dpdk.org/) is a set of libraries and NIC drivers for fast packet processing in userspace. Using DPDK along with OVS brings benefits to networking performance for VMs and networking appliances. DPDK support in CloudStack requires that the KVM hypervisor is running on DPDK-compatible hardware.

In this post, I will describe the new functions which ShapeBlue has introduced for the CloudStack 4.13 LTS. With these new features, DPDK support is extended, allowing administrators to:

  • Create service offerings with additional configurations. In particular, the additional configurations required for DPDK can be included in service offerings
  • Select the DPDK vHost User mode to use on each VM deployment, from these service offerings
  • Perform live migrations of DPDK-enabled VMs between DPDK-enabled hosts

CloudStack new additions for DPDK support

In the first place, it is necessary to mention that DPDK support works along with additional VM configurations. Please ensure that the global setting ‘enable.additional.vm.configuration’ is turned on.

As a reminder from the previous post, DPDK support is enabled on VMs with additional configuration details/keys:

  • ‘extraconfig-dpdk-numa’
  • ‘extraconfig-dpdk-hugepages’

One of the new additions for DPDK support is the ability for the administrator to select the vHost user mode to use for DPDK via service offerings. The vHost user mode describes a client / server model between Open vSwitch (with DPDK) and QEMU, in which one acts as the client while the other acts as the server. The server creates and manages the vHost user sockets, and the client connects to the sockets created by the server.

Additional configurations on service offerings

CloudStack allows VM XML additional configurations and the DPDK vHost user mode to be stored on service offerings as details and used on VM deployments from that service offering. Additional configurations and the DPDK vHost user mode for VM deployments must be passed as service offering details to the ‘createServiceOffering’ API by the administrator.

For example, the following format is valid:

(cloudmonkey)> create serviceoffering name=NAME displaytext=TEXT domainid=DOMAIN hosttags=TAGS
serviceofferingdetails[0].key=DPDK-VHOSTUSER serviceofferingdetails[0].value=server
serviceofferingdetails[1].key=extraconfig-dpdk-numa serviceofferingdetails[1].value=NUMACONF
serviceofferingdetails[2].key=extraconfig-dpdk-hugepages serviceofferingdetails[2].value=HUGEPAGESCONF

Please note:

  • Each additional configuration value must be URL UTF-8 encoded (NUMACONF and HUGEPAGESCONF in the example above).
  • The DPDK vHost user mode key must be: “DPDK-VHOSTUSER”, and its possible values are “client” and “server”. Its value is passed to the KVM hypervisors. If it is not passed, then “server” mode is assumed. Please note this value must not be encoded.
  • Additional configurations on VMs are additive to the additional configurations on service offerings.
  • If one or more additional configurations have the same name (or key), the additional configurations on the VM take precedence over those on the service offering.

On VM deployment, the DPDK vHost user mode is passed to the KVM host. Based on its value (see the illustrative commands after this list):

  • When DPDK vHost user mode = “server”:
    • OVS with DPDK acts as the server, while QEMU acts as the client. This means that VM’s interfaces are created in ‘client’ mode.
    • The DPDK ports are created with type: ‘dpdkvhostuser’
  • When DPDK vHost user mode = “client”:
    • OVS with DPDK acts as the client, and QEMU acts as the server.
    • If Open vSwitch is restarted, it can reconnect to the existing sockets on the server (QEMU), and standard connectivity is resumed.
    • The DPDK ports are created with type: ‘dpdkvhostuserclient’
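The two port types above correspond roughly to the following Open vSwitch commands, shown here only as an under-the-hood sketch – the bridge name, port names and socket path are illustrative, and CloudStack generates its own when wiring up the VM NICs:

# server mode: OVS owns the vhost-user socket, QEMU connects to it
ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuser
# client mode: QEMU owns the socket, OVS connects to it
ovs-vsctl add-port br0 vhost-client-1 -- set Interface vhost-client-1 type=dpdkvhostuserclient options:vhost-server-path=/var/run/openvswitch/vhost-client-1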

Live migrations of DPDK-enabled VMs

Another useful function of DPDK support is live migration between DPDK-enabled hosts. This is made possible by introducing a new host capability on DPDK-enabled hosts (enablement was described in the previous post). CloudStack uses the DPDK host capability to determine which hosts are DPDK-enabled.

However, the management server also needs a mechanism to decide whether a VM is DPDK-enabled before allowing live migration to DPDK-enabled hosts. The decision is based on the following criteria:

  • A VM is running on a DPDK-enabled host.
  • The VM possesses the required DPDK configuration in its VM details or service offering details.

This allows administrators to live migrate these VMs to suitable hosts.

Conclusion

As the previous post describes, DPDK support was initially introduced in CloudStack 4.12. This blog post covers the DPDK support extension for CloudStack 4.13 LTS, introducing more flexibility and improving its usage. As CloudStack recently started supporting DPDK, more additions to its support are expected to be added in future versions.

Future work may involve UI support for the previously described features. Please note that it is currently not possible to pass additional configuration to VMs or service offerings using the CloudStack UI; it is only available through the API.

For references, please check PRs:

About the author

Nicolas Vazquez is a Senior Software Engineer at ShapeBlue, the Cloud Specialists, and is a committer in the Apache CloudStack project. Nicolas spends his time designing and implementing features in Apache CloudStack.

Background

The original CloudMonkey was contributed to the Apache CloudStack project on 31 Oct 2012 under the Apache License 2.0. It is written in Python and shipped via the Python CheeseShop (PyPI), and since its inception has gone through several refactors and rewrites. While this has worked well over the years, installation and usage have been limited to just a few modern platforms due to the dependency on Python 2.7, meaning it is hard to install on older distributions such as CentOS 6.

Over the past two years, several attempts have been made to make the code compatible across Python 2.6, 2.7 and 3.x. However, it proved to be a maintenance and release challenge – making the code compatible across all the platforms, all the Python versions and the varied dependency versions, whilst also keeping it easy to install and use. During late 2017, an experimental CloudMonkey rewrite called cmk was written in Go, a modern, statically typed and compiled programming language which can produce cross-platform standalone binaries. Finally, in early 2018, after reaching a promising state, the results of the experiment were shared with the community to build support and gather feedback for moving the CloudMonkey codebase to Go and deprecating the Python version.

During 2018, two Go-based ports were written using two different readline and prompt libraries. The alpha / beta builds were shared with the community, who tested them, reported bugs and provided valuable feedback (especially around tab-completion) which drove the final implementation. With the new rewrite, CloudMonkey for the first time ships as a single executable file for Windows, which can be easily installed and used with mostly the same user experience one would get on Linux or Mac OSX. The rewrite aims to maintain command-line backward compatibility as a drop-in replacement for the legacy Python-based CloudMonkey (i.e. shell scripts using legacy CloudMonkey can also use the modern CloudMonkey cmk). Legacy Python-based CloudMonkey will continue to be available for installation via pip but it will not be maintained moving forward.

CloudMonkey 6.0 requires a final round of testing and bug-fixing before the release process commences. The beta binaries are available for testing here: https://github.com/apache/cloudstack-cloudmonkey/releases

Major changes in CloudMonkey 6.0

  • Ships as standalone 32-bit and 64-bit binaries targeting Windows, Linux and Mac including ARM support (for example, to run on Raspberry Pi)
  • Drop-in replacement for legacy Python-based CloudMonkey as a command line tool
  • Interactive selection of API commands, arguments, and argument options
  • JSON is the default API response output format
  • Improved help docs output when ‘-h’ is passed to an API command
  • Added new output format ‘column’ that outputs API response in a new columnar way like modern CLIs such as kubectl and docker
  • Added new set option ‘debug’ to enable debug mode, set option ‘display’ renamed as ‘output’
  • New CloudMonkey configuration file locking mechanism to avoid file corruption when multiple cmk instances run
  • New configuration folder ~/.cmk to avoid conflict with legacy Python-based version

Features removed in CloudMonkey 6.0:

  • Removed XML output format.
  • Removed CloudMonkey logging API requests and responses to a file.
  • Coloured output removed.
  • Removed set options: color (for coloured output), signatureversion and expires (no longer acceptable API parameters), paramcompletion (API parameter completion is now enabled by default), cache_file (the default cache file, now at ~/.cmk/cache), history_file (the history file), log_file (API log file).
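
To give a flavour of the new CLI, here is a brief illustrative session (a sketch which assumes a management server profile has already been configured with the usual set url / set apikey / set secretkey options):

$ cmk sync
$ cmk set output column
$ cmk list zones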

About the author

Rohit Yadav is a Software Architect at ShapeBlue, the Cloud Specialists, and is a committer and PMC member of Apache CloudStack. Rohit spends most of his time designing and implementing features in Apache CloudStack.

Introduction

This blog describes a new feature to be introduced in the CloudStack 4.12 release (already in the current master branch of the CloudStack repository). This feature will provide support for the Data Plane Development Kit (DPDK) in conjunction with Open vSwitch (OVS) for guest VMs and is targeted at the KVM hypervisor.

The Data Plane Development Kit (https://www.dpdk.org/) is a set of libraries and NIC drivers for fast packet processing in userspace. Using DPDK along with OVS brings networking performance benefits to VMs and networking appliances. In this blog, we will introduce how DPDK can be used on guest VMs once the feature is released.

Please note – DPDK support in CloudStack requires that the KVM hypervisor is running on DPDK compatible hardware.

Enable DPDK support

This feature extends the Open vSwitch feature in CloudStack with DPDK integration. As a prerequisite, Open vSwitch needs to be installed on KVM hosts and enabled in CloudStack. In addition, administrators need to install the DPDK libraries on KVM hosts before configuring the CloudStack agents. We will go into the configuration in detail below.

KVM Agent Configuration

An administrator can follow this guide to enable DPDK on a KVM host:

Prerequisites

  • Install OVS on the target KVM host
  • Configure CloudStack agent by editing the /etc/cloudstack/agent/agent.properties file:
    • network.bridge.type=openvswitch
      libvirt.vif.driver=com.cloud.hypervisor.kvm.resource.OvsVifDriver
      
  • Install DPDK. The installation guide can be found at: http://docs.openvswitch.org/en/latest/intro/install/dpdk/.

Configuration

Edit the /etc/cloudstack/agent/agent.properties file, where <OVS_PATH> is the path in which your OVS ports are created (typically /var/run/openvswitch/):

  • openvswitch.dpdk.enable=true
    openvswitch.dpdk.ovs.path=<OVS_PATH>
    

Restart CloudStack agent so that changes take effect:

# systemctl restart cloudstack-agent

DPDK inside guest VMs

Now that CloudStack agents have been configured, users are able to deploy their guest VMs using DPDK. In order to achieve this, they will need to pass extra configurations to enable DPDK:

  • Enable “HugePages” on the VM
  • NUMA node configuration

As of 4.12, passing extra configurations to VM deployments will be allowed. In the case of KVM, the extra configurations are added to the VM XML domain. The CloudStack API methods deployVirtualMachine and updateVirtualMachine will support the new optional parameter extraconfig and will work in the following way:

 
# deploy virtualmachine ... extraconfig=<URL_UTF-8_ENCODED_CONFIGS>

CloudStack will expect a URL UTF-8 encoded string which can contain multiple extra configurations. For example, if a user wants to enable DPDK, they will need to pass the two extra configurations mentioned above. An example of each configuration is the following:

 
dpdk-hugepages:
<memoryBacking> 
   <hugePages/> 
</memoryBacking> 

dpdk-numa: 
<cpu mode='host-passthrough'>
   <numa>
      <cell id='0' cpus='0' memory='9437184' unit='KiB' memAccess='shared'/>
   </numa> 
</cpu>

…which becomes the following URL UTF-8 encoded string, which is what CloudStack expects on VM deployments:

 
dpdk-hugepages%3A%20%3CmemoryBacking%3E%20%3ChugePages%2F%3E%20%3C%2FmemoryBacking%3E%20dpdk-numa%3A%20%3Ccpu%20mode%3D%22host-passthrough%22%3E%20%3Cnuma%3E%20%3Ccell%20id%3D%220%22%20cpus%3D%220%22%20memory%3D%229437184%22%20unit%3D%22KiB%22%20memAccess%3D%22shared%22%2F%3E%20%3C%2Fnuma%3E%20%3C%2Fcpu%3E
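
One convenient way to produce this encoding (a sketch using a python3 one-liner; any URL / percent-encoding tool will do) is:

# printf 'dpdk-hugepages: <memoryBacking> <hugePages/> </memoryBacking>' | \
    python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.stdin.read(), safe=""))'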

KVM networking verification

Administrators can verify how OVS ports are created with DPDK support on DPDK-enabled hosts where users have deployed DPDK-enabled guest VMs. These port names start with “csdpdk”:

 

# ovs-vsctl show
....
Port "csdpdk-1"
   tag: 30
   Interface "csdpdk-1"
      type: dpdkvhostuser
Port "csdpdk-4"
   tag: 30
   Interface "csdpdk-4"
      type: dpdkvhostuser

About the author

Nicolas Vazquez is a Senior Software Engineer at ShapeBlue, the Cloud Specialists, and is a committer in the Apache CloudStack project. Nicolas spends his time designing and implementing features in Apache CloudStack.

Introduction

We published the original blog post on KVM networking in 2016 – but in the meantime we have moved on a generation in CentOS and Ubuntu operating systems, and some of the original information is therefore out of date. In this revisit of the original blog post we cover new configuration options for CentOS 7.x as well as Ubuntu 18.04, both of which are now supported hypervisor operating systems in CloudStack 4.11. Ubuntu 18.04 has replaced the legacy networking model with the new Netplan implementation, and this does mean different configuration both for linux bridge setups as well as OpenvSwitch.

KVM hypervisor networking for CloudStack can sometimes be a challenge, considering KVM doesn’t quite have the same mature guest networking model found in the likes of VMware vSphere and Citrix XenServer. In this blog post we’re looking at the options for networking KVM hosts using bridges and VLANs, and dive a bit deeper into the configuration for these options. Installation of the hypervisor and CloudStack agent is pretty well covered in the CloudStack installation guide, so we’ll not spend too much time on this.

Network bridges

On a linux KVM host guest networking is accomplished using network bridges. These are similar to vSwitches on a VMware ESXi host or networks on a XenServer host (in fact networking on a XenServer host is also accomplished using bridges).

A KVM network bridge is a Layer-2 software device which allows traffic to be forwarded between ports internally on the bridge and the physical network uplinks. The traffic flow is controlled by MAC address tables maintained by the bridge itself, which determine which hosts are connected to which bridge port. The bridges allow for traffic segregation using traditional Layer-2 VLANs as well as SDN Layer-3 overlay networks.
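
As a quick illustration (assuming a bridge named cloudbr0, as configured later in this post), the MAC address table maintained by a bridge can be inspected with:

# brctl showmacs cloudbr0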

[Image: KVMnetworking41]

Linux bridges vs OpenVswitch

The bridging on a KVM host can be accomplished using traditional linux bridge networking or by adopting the OpenVswitch back end. Traditional linux bridges have been implemented in the linux kernel since version 2.2, and have been maintained through the 2.x and 3.x kernels. Linux bridges provide all the basic Layer-2 networking required for a KVM hypervisor back end, but they lack some automation options and are configured on a per-host basis.

OpenVswitch was developed to address this, and provides additional automation in addition to new networking capabilities like Software Defined Networking (SDN). OpenVswitch allows for centralised control and distribution across physical hypervisor hosts, similar to distributed vSwitches in VMware vSphere. Distributed switch control does require additional controller infrastructure like OpenDaylight, Nicira, VMware NSX, etc. – which we won’t cover in this article as it’s not a requirement for CloudStack.

It is also worth noting Citrix started using the OpenVswitch backend in XenServer 6.0.

Network configuration overview

For this example we will configure the following networking model, assuming a linux host with four network interfaces which are bonded for resilience. We also assume all switch ports are trunk ports:

  • Network interfaces eth0 + eth1 are bonded as bond0.
  • Network interfaces eth2 + eth3 are bonded as bond1.
  • Bond0 provides the physical uplink for the bridge “cloudbr0”. This bridge carries the untagged host network interface / IP address, and will also be used for the VLAN tagged guest networks.
  • Bond1 provides the physical uplink for the bridge “cloudbr1”. This bridge handles the VLAN tagged public traffic.

The CloudStack zone networks will then be configured as follows:

  • Management and guest traffic is configured to use KVM traffic label “cloudbr0”.
  • Public traffic is configured to use KVM traffic label “cloudbr1”.

In addition to the above it’s important to remember CloudStack itself requires internal connectivity from the hypervisor host to system VMs (Virtual Routers, SSVM and CPVM) over the link local 169.254.0.0/16 subnet. This is done over a host-only bridge “cloud0”, which is created by CloudStack when the host is added to a CloudStack zone.
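
Once a host has been added to a zone, this bridge can be checked with something like the following (on a typical KVM host cloud0 carries an address in the 169.254.0.0/16 range):

# brctl show cloud0
# ip addr show dev cloud0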

 

[Image: KVMnetworking42]

Linux bridge configuration – CentOS

In the following CentOS example we have changed the NIC naming convention back to the legacy “eth0” format rather than the new “eno16777728” format. This is a personal preference – and is generally done to make automation of configuration settings easier. The configuration suggested throughout this blog post can also be implemented using the new NIC naming format.

Across all CentOS versions the “NetworkManager” service is also generally disabled, since this has been found to complicate KVM network configuration and cause unwanted behaviour:

 
# systemctl stop NetworkManager
# systemctl disable NetworkManager

To enable bonding and bridging, CentOS 7.x requires the corresponding modules to be installed and loaded:

 
# modprobe --first-time bonding
# yum -y install bridge-utils

If IPv6 isn’t required we also add the following lines to /etc/sysctl.conf:

net.ipv6.conf.all.disable_ipv6 = 1 
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
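
To apply the sysctl changes without a reboot, reload the configuration:

# sysctl -p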

In CentOS the linux bridge configuration is done with configuration files in /etc/sysconfig/network-scripts/. Each of the four individual NIC interfaces are configured as follows (eth0 / eth1 / eth2 / eth3 are all configured the same way). Note there is no IP configuration against the NICs themselves – these purely point to the respective bonds:

# vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
NAME=eth0
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
HWADDR=00:0C:12:xx:xx:xx
NM_CONTROLLED=no

The bond configurations are specified in the equivalent ifcfg-bond scripts and specify bonding options as well as the upstream bridge name. In this case we’re just setting a basic active-passive bond (mode=1) with up/down delays of zero and status monitoring every 100ms (miimon=100). Note there are a multitude of bonding options – please refer to the CentOS / RedHat official documentation to tune these to your specific use case.

# vi /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
NAME=bond0
TYPE=Bond
BRIDGE=cloudbr0
ONBOOT=yes
NM_CONTROLLED=no
BONDING_OPTS="mode=active-backup miimon=100 updelay=0 downdelay=0"

The same goes for bond1:

# vi /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
NAME=bond1
TYPE=Bond
BRIDGE=cloudbr1
ONBOOT=yes
NM_CONTROLLED=no
BONDING_OPTS="mode=active-backup miimon=100 updelay=0 downdelay=0"

Cloudbr0 is configured in the ifcfg-cloudbr0 script. In addition to the bridge configuration we also specify the host IP address, which is tied directly to the bridge since it is on an untagged VLAN:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr0
DEVICE=cloudbr0
ONBOOT=yes
TYPE=Bridge
IPADDR=192.168.100.20
NETMASK=255.255.255.0
GATEWAY=192.168.100.1
NM_CONTROLLED=no
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
DELAY=0

Cloudbr1 does not have an IP address configured hence the configuration is simpler:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr1
DEVICE=cloudbr1
ONBOOT=yes
TYPE=Bridge
BOOTPROTO=none
NM_CONTROLLED=no
DELAY=0
DEFROUTE=no
IPV4_FAILURE_FATAL=no
IPV6INIT=no

Optional tagged interface for storage traffic

If a dedicated VLAN tagged IP interface is required for e.g. storage traffic, this can be accomplished by creating a VLAN on top of the bond and tying it to a dedicated bridge. In this case we create a new bridge on bond0 using VLAN 100:

# vi /etc/sysconfig/network-scripts/ifcfg-bond0.100
DEVICE=bond0.100
VLAN=yes
BOOTPROTO=none
ONBOOT=yes
TYPE=Unknown
BRIDGE=cloudbr100

The bridge can now be configured with the desired IP address for storage connectivity:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr100
DEVICE=cloudbr100
ONBOOT=yes
TYPE=Bridge
VLAN=yes
IPADDR=10.0.100.20
NETMASK=255.255.255.0
NM_CONTROLLED=no
DELAY=0

Internal bridge cloud0

When using linux bridge networking there is no requirement to configure the internal “cloud0” bridge; this is all handled by CloudStack.

Network startup

Note – once all network startup scripts are in place and the network service is restarted you may lose connectivity to the host if there are any configuration errors in the files, hence make sure you have console access to rectify any issues.

To make the configuration live restart the network service:

# systemctl restart network

To check the bridges use the brctl command:

# brctl show
bridge name bridge id STP enabled interfaces
cloudbr0 8000.000c29b55932 no bond0
cloudbr1 8000.000c29b45956 no bond1

The bonds can be checked with:

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:xx:xx:xx:xx
Slave queue ID: 0

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:xx:xx:xx:xx
Slave queue ID: 0

Linux bridge configuration – Ubuntu

With the 18.04 “Bionic Beaver” release Ubuntu have retired the legacy way of configuring networking through /etc/network/interfaces in favour of Netplan – https://netplan.io/reference. This changes how networking is configured – although the principles around bridge configuration are the same as in previous Ubuntu versions.

First of all ensure correct hostname and FQDN are set in /etc/hostname and /etc/hosts respectively.

To stop network bridge traffic from traversing IPtables / ARPtables on the host, add the following lines to /etc/sysctl.conf:

# vi /etc/sysctl.conf
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

Ubuntu 18.04 installs the “bridge-utils” and bridge/bonding kernel options by default, and the corresponding modules are also loaded by default, hence there are no requirements to add anything to /etc/modules.

In Ubuntu 18.04 all interface, bond and bridge configuration is done using cloud-init and the Netplan configuration in /etc/netplan/XX-cloud-init.yaml. As for CentOS we are configuring basic active-passive bonds (mode=1) with status monitoring every 100ms (miimon=100), and configuring bridges on top of these. As before the host IP address is tied to cloudbr0:

# vi /etc/netplan/50-cloud-init.yaml
network:
    ethernets:
        eth0:
            dhcp4: no
        eth1:
            dhcp4: no
        eth2:
            dhcp4: no
        eth3:
            dhcp4: no
    bonds:
        bond0:
            dhcp4: no
            interfaces:
                - eth0
                - eth1
            parameters:
                mode: active-backup
                primary: eth0
        bond1:
            dhcp4: no
            interfaces:
                - eth2
                - eth3
            parameters:
                mode: active-backup
                primary: eth2
    bridges:
        cloudbr0:
            addresses:
                - 192.168.100.20/24
            gateway4: 192.168.100.1
            nameservers:
                search: [mycloud.local]
                addresses: [192.168.100.5,192.168.100.6]
            interfaces:
                - bond0
        cloudbr1:
            dhcp4: no
            interfaces:
                - bond1
    version: 2

Optional tagged interface for storage traffic

To add an optional VLAN tagged interface for storage traffic, add a VLAN and a new bridge to the above configuration:

# vi /etc/netplan/50-cloud-init.yaml
    vlans:
        bond100:
            id: 100
            link: bond0
            dhcp4: no
    bridges:
        cloudbr100:
            addresses:
               - 10.0.100.20/24
            interfaces:
               - bond100

Internal bridge cloud0

When using linux bridge networking the internal “cloud0” bridge is again handled by CloudStack, i.e. there’s no need for specific configuration to be specified for this.

Network startup

Note – once all network startup scripts are in place and the network service is restarted you may lose connectivity to the host if there are any configuration errors in the files, hence make sure you have console access to rectify any issues.

To make the configuration live, reload Netplan with:

# netplan apply
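
On newer Netplan versions, ‘netplan try’ can be a safer alternative, since it applies the configuration and rolls it back automatically unless confirmed within a timeout, which is a useful safeguard given the connectivity warning above:

# netplan try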

To check the bridges use the brctl command:

# brctl show
bridge name	bridge id		STP enabled	interfaces
cloud0		8000.000000000000	no
cloudbr0	8000.52664b74c6a7	no		bond0
cloudbr1	8000.2e13dfd92f96	no		bond1
cloudbr100	8000.02684d6541db	no		bond100

To check the VLANs and bonds:

# cat /proc/net/vlan/config
VLAN Dev name | VLAN ID
Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
bond100 | 100 | bond0
# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 10
Permanent HW addr: 00:0c:xx:xx:xx:xx
Slave queue ID: 0

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 10
Permanent HW addr: 00:0c:xx:xx:xx:xx
Slave queue ID: 0

 

OpenVswitch bridge configuration – CentOS

The OpenVswitch version in the standard CentOS repositories is relatively old (version 2.0). To install a newer version either locate and install this from a third party CentOS/Fedora/RedHat repository, alternatively download and compile the packages from the OVS website http://www.openvswitch.org/download/ (notes on how to compile the packages can be found in http://docs.openvswitch.org/en/latest/intro/install/fedora/).

Once packages are available, install and enable OVS with:

# yum localinstall openvswitch-<version>.rpm
# systemctl start openvswitch
# systemctl enable openvswitch

In addition to this, the bridge module should be blacklisted. Experience has shown that even blacklisting this module does not always prevent it from being loaded; to force this, map the module install command to /bin/false. Please note the CloudStack agent install depends on the bridge module being in place, hence this step should be carried out after the agent install.

echo "install bridge /bin/false" > /etc/modprobe.d/bridge-blacklist.conf

As with linux bridging above, the following examples assume IPv6 has been disabled and legacy ethX network interface names are used. In addition the hostname has been set in /etc/sysconfig/network and /etc/hosts.

Add the initial OVS bridges using the ovs-vsctl toolset:

# ovs-vsctl add-br cloudbr0
# ovs-vsctl add-br cloudbr1
# ovs-vsctl add-bond cloudbr0 bond0 eth0 eth1
# ovs-vsctl add-bond cloudbr1 bond1 eth2 eth3

This will configure the bridges in the OVS database, but the settings will not be persistent. To make the settings persistent we need to configure the network configuration scripts in /etc/sysconfig/network-scripts/, similar to when using linux bridges.

Each individual network interface has a generic configuration – note there is no reference to bonds at this stage. The following ifcfg-eth script applies to all interfaces:

# vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
TYPE=Ethernet
BOOTPROTO=none
NAME=eth0
ONBOOT=yes
NM_CONTROLLED=no
HOTPLUG=no
HWADDR=00:0C:xx:xx:xx:xx

The bonds reference the interfaces as well as the upstream bridge. In addition the bond configuration specifies the OVS specific settings for the bond (active-backup, no LACP, 100ms status monitoring):

# vi /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBond
OVS_BRIDGE=cloudbr0
BOOTPROTO=none
BOND_IFACES="eth0 eth1"
OVS_OPTIONS="bond_mode=active-backup lacp=off other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100"
HOTPLUG=no
# vi /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBond
OVS_BRIDGE=cloudbr1
BOOTPROTO=none
BOND_IFACES="eth2 eth3"
OVS_OPTIONS="bond_mode=active-backup lacp=off other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100"
HOTPLUG=no

The bridges are now configured as follows. The host IP address is specified on the untagged cloudbr0 bridge:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr0
DEVICE=cloudbr0
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=static
IPADDR=192.168.100.20
NETMASK=255.255.255.0
GATEWAY=192.168.100.1
HOTPLUG=no

Cloudbr1 is configured without an IP address:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr1
DEVICE=cloudbr1
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=none
HOTPLUG=no

Internal bridge cloud0

Under CentOS7.x and CloudStack 4.11 the cloud0 bridge is automatically configured, hence no additional configuration steps required.

Optional tagged interface for storage traffic

If a dedicated VLAN tagged IP interface is required for e.g. storage traffic this is accomplished by creating a VLAN tagged fake bridge on top of one of the cloud bridges. In this case we add it to cloudbr0 with VLAN 100:

# ovs-vsctl add-br cloudbr100 cloudbr0 100
# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr100
DEVICE=cloudbr100
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=static
IPADDR=10.0.100.20
NETMASK=255.255.255.0
OVS_OPTIONS="cloudbr0 100"
HOTPLUG=no

Additional OVS network settings

To finish off the OVS network configuration specify the hostname, gateway and IPv6 settings:

# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=kvmhost1.mylab.local
GATEWAY=192.168.100.1
NETWORKING_IPV6=no
IPV6INIT=no
IPV6_AUTOCONF=no

VLAN problems when using OVS

Kernel versions older than 3.3 had some issues with VLAN traffic propagating between KVM hosts. This has not been observed in CentOS 7.5 (kernel version 3.10) – however if this issue is encountered look up the OVS VLAN splinter workaround.

Network startup

Note – as mentioned for linux bridge networking – once all network startup scripts are in place and the network service is restarted you may lose connectivity to the host if there are any configuration errors in the files, hence make sure you have console access to rectify any issues.

To make the configuration live restart the network service:

# systemctl restart network

To check the bridges use the ovs-vsctl command. The following shows the optional cloudbr100 on VLAN 100:

# ovs-vsctl show
49cba0db-a529-48e3-9f23-4999e27a7f72
    Bridge "cloudbr0";
        Port "cloudbr0";
            Interface "cloudbr0"
                type: internal
        Port "cloudbr100"
            tag: 100
            Interface "cloudbr100"
                type: internal
        Port "bond0"
            Interface "veth0";
            Interface "eth0"
    Bridge "cloudbr1"
        Port "bond1"
            Interface "eth1"
            Interface "veth1"
        Port "cloudbr1"
            Interface "cloudbr1"
                type: internal
    Bridge "cloud0"
        Port "cloud0"
            Interface "cloud0"
                type: internal
    ovs_version: "2.9.2"

The bond status can be checked with the ovs-appctl command:

# ovs-appctl bond/show bond0
---- bond0 ----
bond_mode: active-backup
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
updelay: 0 ms
downdelay: 0 ms
lacp_status: off
active slave mac: 00:0c:xx:xx:xx:xx(eth0)

slave eth0: enabled
active slave
may_enable: true

slave eth1: enabled
may_enable: true

To ensure that only OVS bridges are used also check that linux bridge control returns no bridges:

# brctl show
bridge name	bridge id		STP enabled	interfaces

As a final note – the CloudStack agent also requires the following two lines added to /etc/cloudstack/agent/agent.properties after install:

network.bridge.type=openvswitch
libvirt.vif.driver=com.cloud.hypervisor.kvm.resource.OvsVifDriver
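
As with any agent.properties change, restart the CloudStack agent afterwards so that the new settings take effect:

# systemctl restart cloudstack-agent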

OpenVswitch bridge configuration – Ubuntu

As discussed earlier in this blog post Ubuntu 18.04 introduced Netplan as a replacement to the legacy “/etc/network/interfaces” network configuration. Unfortunately Netplan does not support OVS, hence the first challenge is to revert Ubuntu to the legacy configuration method.

To disable Netplan first of all add “netcfg/do_not_use_netplan=true” to the GRUB_CMDLINE_LINUX option in /etc/default/grub. The following example also shows the use of legacy interface names as well as IPv6 being disabled:

GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0 ipv6.disable=1 netcfg/do_not_use_netplan=true"

Then rebuild GRUB and reboot the server:

grub-mkconfig -o /boot/grub/grub.cfg

To set the hostname first of all edit “/etc/cloud/cloud.cfg” and set this to preserve the system hostname:

preserve_hostname: true

Thereafter set the hostname with hostnamectl:

hostnamectl set-hostname --static --transient --pretty <hostname>

Now remove Netplan, install OVS from the Ubuntu repositories as well as the “ifupdown” package to get standard network functionality back:

apt-get purge nplan netplan.io
apt-get install openvswitch-switch
apt-get install ifupdown

As for CentOS we need to blacklist the bridge module to prevent standard bridges being created. Please note the CloudStack agent install depends on the bridge module being in place, hence this step should be carried out after agent install.

echo "install bridge /bin/false" > /etc/modprobe.d/bridge-blacklist.conf

To stop network bridge traffic from traversing IPtables / ARPtables also add the following lines to /etc/sysctl.conf:

# vi /etc/sysctl.conf
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

Same as for CentOS we first of all add the OVS bridges and bonds from command line using the ovs-vsctl command line tools. In this case we also add the additional tagged fake bridge cloudbr100 on VLAN 100:

# ovs-vsctl add-br cloudbr0
# ovs-vsctl add-br cloudbr1
# ovs-vsctl add-bond cloudbr0 bond0 eth0 eth1 bond_mode=active-backup other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100
# ovs-vsctl add-bond cloudbr1 bond1 eth2 eth3 bond_mode=active-backup other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100
# ovs-vsctl add-br cloudbr100 cloudbr0 100

As for linux bridge all network configuration is applied in “/etc/network/interfaces”:

# vi /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
iface eth0 inet manual
iface eth1 inet manual
iface eth2 inet manual
iface eth3 inet manual

auto cloudbr0
allow-ovs cloudbr0
iface cloudbr0 inet static
  address 192.168.100.20
  netmask 255.255.255.0
  gateway 192.168.100.1
  dns-nameserver 192.168.100.5
  ovs_type OVSBridge
  ovs_ports bond0

allow-cloudbr0 bond0 
iface bond0 inet manual 
  ovs_bridge cloudbr0 
  ovs_type OVSBond 
  ovs_bonds eth0 eth1 
  ovs_option bond_mode=active-backup other_config:miimon=100

auto cloudbr1
allow-ovs cloudbr1
iface cloudbr1 inet manual

allow-cloudbr1 bond1 
iface bond1 inet manual 
  ovs_bridge cloudbr1 
  ovs_type OVSBond 
  ovs_bonds eth2 eth3 
  ovs_option bond_mode=active-backup other_config:miimon=100

Network startup

Since Ubuntu 14.04 the bridges have started automatically without any requirement for additional startup scripts. Since OVS uses the same toolset across both CentOS and Ubuntu the same processes as described earlier in this blog post can be utilised.

# ovs-appctl bond/show bond0
# ovs-vsctl show

To ensure that only OVS bridges are used also check that linux bridge control returns no bridges:

# brctl show
bridge name	bridge id		STP enabled	interfaces

As mentioned earlier the following also needs added to the /etc/cloudstack/agent/agent.properties file:

network.bridge.type=openvswitch
libvirt.vif.driver=com.cloud.hypervisor.kvm.resource.OvsVifDriver

Internal bridge cloud0

In Ubuntu there is no requirement to add additional configuration for the internal cloud0 bridge, CloudStack manages this.

Optional tagged interface for storage traffic

Additional VLAN tagged interfaces are again accomplished by creating a VLAN tagged fake bridge on top of one of the cloud bridges. In this case we add it to cloudbr0 with VLAN 100 at the end of the interfaces file:

# ovs-vsctl add-br cloudbr100 cloudbr0 100
# vi /etc/network/interfaces
auto cloudbr100
allow-cloudbr0 cloudbr100
iface cloudbr100 inet static
  address 10.0.100.20
  netmask 255.255.255.0
  ovs_type OVSIntPort
  ovs_bridge cloudbr0
  ovs_options tag=100

Conclusion

As KVM is becoming more stable and mature, more people are going to start looking at using it rather than the more traditional XenServer or vSphere solutions, and we hope this article will assist in configuring host networking. As always we’re happy to receive feedback, so please get in touch with any comments, questions or suggestions.

About The Author

Dag Sonstebo is a Cloud Architect at ShapeBlue, The Cloud Specialists. Dag spends most of his time designing, implementing and automating IaaS solutions based on Apache CloudStack.

Introduction

CloudStack 4.11.1 introduces a new security enhancement on top of the new CA framework to secure live KVM VM migrations. This feature allows live migration of guest VMs across KVM hosts using secured TLS enabled libvirtd process. Without this feature, the live migration of guest VMs across KVM hosts would use an unsecured TCP connection, which is prone to man-in-the-middle attacks leading to leakage of critical VM data (the VM state and memory). This feature brings stability and security enhancements for CloudStack and KVM users.

Overview

The initial implementation of the CA framework was limited to the provisioning of X509 certificates to secure the KVM/CPVM/SSVM agent(s) and the CloudStack management server(s). With the new enhancement, the X509 certificates are now also used by the libvirtd process on the KVM host to secure live VM migration to another secured KVM host.

The migration URI used by two secured KVM hosts is qemu+tls:// as opposed to qemu+tcp:// that is used by an unsecured host. We’ve also enforced that live VM migration is allowed only between either two secured KVM hosts or two unsecured hosts, but not between KVM hosts with a different security configuration. Between two secured KVM hosts, the web of trust is established by the common root CA certificate that can validate the server certificate chain when live VM migration is initiated.

As part of the process of securing a KVM host, the CA framework issues X509 certificates and provisions them to the host, and libvirtd is reconfigured to listen on the default TLS port of 16514 and to use the same X509 certificates as the cloudstack-agent. In an existing environment, the admin will need to ensure that the default TLS port 16514 is not blocked; in a fresh environment, however, suitable iptables rules and other configuration are applied via cloudstack-setup-agent using a new '-s' flag.
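
As a rough way to verify the result on a secured host (a sketch assuming a standard libvirt installation), check that libvirtd has been reconfigured for TLS and is listening on the default TLS port:

# grep listen_tls /etc/libvirt/libvirtd.conf
# ss -tlnp | grep 16514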

Starting CloudStack 4.11.1, hosts that don’t have both cloudstack-agent and libvirtd processes secured and in Up state will show up in ‘Unsecure’ state in the UI (and in host details as part of listHosts API response):

This will allow admins to easily identify and secure hosts using a new ‘provision certificate’ button that can be used from the host’s details tab in the UI:

After a KVM host is successfully secured it will show up in the Up state:

As part of the onboarding and securing process, after securing all the KVM hosts the admin can also enforce authentication strictness of client X509 certificates by the CA framework, by setting the global setting ‘ca.plugin.root.auth.strictness’ to true (this does not require a restart of the management server(s)).
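
For example, the setting can be updated through CloudMonkey (a sketch; the same change can also be made from the global settings section of the UI):

(cloudmonkey)> update configuration name=ca.plugin.root.auth.strictness value=true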

About the author

Rohit Yadav is a Software Architect at ShapeBlue, the Cloud Specialists, and is a committer and PMC member of Apache CloudStack. Rohit spends most of his time designing and implementing features in Apache CloudStack.