Balancing workload distribution, ensuring optimal resource allocation, minimising power consumption, and maintaining consistent application performance are complex tasks, especially in large-scale cloud environments. Manually managing and rebalancing workloads is error-prone and inefficient, which highlights the need for an automated, intelligent solution that balances workloads and ensures efficient resource utilisation across all instances and hypervisor clusters.
Introduced in Apache CloudStack 4.19, the CloudStack DRS feature is designed to address these challenges. In this article, we’ll explore this new feature in detail and consider scenarios where CloudStack DRS can significantly improve efficiency and operational performance.
What is CloudStack DRS?
CloudStack DRS (Distributed Resource Scheduler) simplifies the balancing of Instances between Hosts in a Cluster, using a configurable algorithm tailored to specific needs. It lets administrators distribute resource usage evenly across hypervisor clusters so that each Host operates at optimal capacity, avoiding both underutilisation and overloading, which are common challenges in managing cloud infrastructure.
Currently, CloudStack DRS offers two distinct algorithms:
The balanced algorithm distributes Instances evenly across Hosts in a Cluster. This is the default algorithm. Spreading Instances across Hosts helps maintain reliability, which makes it a good choice for production environments: because each Host runs relatively few Instances, contention issues are less likely, and if a Host fails, the impact on running Instances is minimised. The trade-off is higher power consumption, because all Hosts are in use.
The condensed algorithm is designed to pack Instances into as few Hosts as possible, which is useful for reducing power and cooling costs. However, it may cause contention issues due to the high number of Instances running on a single Host. Additionally, if a Host fails, the impact on running Instances can be more significant. As a result, this algorithm is not recommended for production environments.
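The difference between the two algorithms can be illustrated with a small sketch. This is not CloudStack's implementation: the function names spread and pack, the Instance sizes, and the 16 GB Host capacity are all assumptions made up for the example, and the code only contrasts the end state each algorithm aims for.

```python
def spread(instances_gb, host_count):
    """Balanced-style target: put each Instance on the currently least-loaded Host."""
    hosts = [[] for _ in range(host_count)]
    for size in sorted(instances_gb, reverse=True):
        min(hosts, key=sum).append(size)
    return hosts


def pack(instances_gb, host_count, host_capacity_gb):
    """Condensed-style target: first-fit, filling the earliest Host that has room."""
    hosts = [[] for _ in range(host_count)]
    for size in sorted(instances_gb, reverse=True):
        for host in hosts:
            if sum(host) + size <= host_capacity_gb:
                host.append(size)
                break
        else:
            raise ValueError("Instance does not fit on any Host")
    return hosts


# Hypothetical allocated memory per Instance, in GB.
instances = [8, 8, 4, 4, 4, 2, 2]

balanced = spread(instances, host_count=4)
condensed = pack(instances, host_count=4, host_capacity_gb=16)

print("balanced :", [sum(h) for h in balanced])
print("condensed:", [sum(h) for h in condensed])
```

With these example values, the balanced-style layout uses all four Hosts at 8 GB each, while the condensed-style layout fills two Hosts to 16 GB and leaves the other two empty.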
How does it work?
CloudStack DRS enables live migration of Instances within a Cluster to optimise their distribution across Hosts. This feature utilises the existing live migration functionality in CloudStack, making it independent of any specific hypervisor. The configured algorithm runs periodically and only migrates Instances if it can improve their distribution. The algorithm operates on a per-cluster basis and only migrates Instances within the Cluster.
One thing to note is that CloudStack DRS doesn’t consider the configured allocation algorithm and deployment planner. It only considers the current distribution of Instances within the Cluster and the configured affinity groups.
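The "only migrate if it improves the distribution" behaviour can be sketched as a greedy loop over a single Cluster. This is a simplified illustration rather than the actual CloudStack code: the imbalance measure (standard deviation of per-Host allocated memory), the function names, and the sample data are assumptions, and affinity groups are not modelled.

```python
from statistics import pstdev


def host_loads(placement, sizes, hosts):
    """Sum the allocated memory of the Instances placed on each Host."""
    loads = {host: 0 for host in hosts}
    for vm, host in placement.items():
        loads[host] += sizes[vm]
    return loads


def imbalance(loads):
    """Illustrative imbalance measure: spread of per-Host load."""
    return pstdev(list(loads.values()))


def plan_migrations(placement, sizes, hosts, max_migrations):
    """Pick migrations one at a time, stopping when none lowers the imbalance."""
    placement = dict(placement)  # work on a copy
    plan = []
    for _ in range(max_migrations):
        current = imbalance(host_loads(placement, sizes, hosts))
        best = None  # (resulting imbalance, instance, destination host)
        for vm, source in placement.items():
            for dest in hosts:
                if dest == source:
                    continue
                trial = {**placement, vm: dest}
                score = imbalance(host_loads(trial, sizes, hosts))
                if score < current and (best is None or score < best[0]):
                    best = (score, vm, dest)
        if best is None:
            break  # no single migration improves the distribution
        _, vm, dest = best
        plan.append((vm, placement[vm], dest))
        placement[vm] = dest
    return plan


# Hypothetical Cluster: three Hosts, four Instances (allocated memory in GB).
hosts = ["h1", "h2", "h3"]
sizes = {"vm-a": 8, "vm-b": 4, "vm-c": 4, "vm-d": 2}
placement = {"vm-a": "h1", "vm-b": "h1", "vm-c": "h1", "vm-d": "h2"}

plan = plan_migrations(placement, sizes, hosts, max_migrations=5)
for vm, source, dest in plan:
    print(f"migrate {vm}: {source} -> {dest}")
```

In this example the loop stops after two migrations even though up to five are allowed, because no further single move would reduce the imbalance.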
Configuring CloudStack DRS
By default, CloudStack DRS is disabled. To enable and configure it, follow these steps:
1. Navigate to “Configuration / Global Settings” in the side menu, set the configuration parameter drs.enabled to true, and set drs.automatic.interval to the required interval between automatic DRS runs.
2. To configure the algorithm, set the configuration parameter drs.algorithm to balanced or condensed.
3. Set drs.metric to choose the metric used when calculating the imbalance in the Cluster. Possible values are memory (default) and cpu. Note that this uses the allocated values defined in the service offering, not real-time usage.
4. Set drs.imbalance to a value between 0.0 and 1.0 to configure the level of imbalance allowed in the Cluster. This determines how aggressively DRS acts on the Cluster. For the condensed algorithm, a value of 1.0 indicates that Instances are running on the minimum number of Hosts possible. For the balanced algorithm, a value of 1.0 means that Instances are evenly distributed across Hosts. Setting this parameter to 0 is equivalent to disabling DRS.
5. To limit the maximum number of migrations in a single DRS execution, set drs.max.migrations.
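The point in step 3 about allocated versus real-time metrics is easy to miss, so here is a small illustration. The Host names, Instance names, and all numbers are made up for the example; the code only shows how the two views of the same Cluster can disagree.

```python
# Each Instance: (host, memory allocated by its service offering in MB,
#                 memory actually in use in MB). All values are hypothetical.
instances = {
    "web-1": ("host-a", 8192, 1024),
    "web-2": ("host-a", 8192, 2048),
    "db-1": ("host-b", 4096, 3900),
    "db-2": ("host-b", 4096, 3800),
}


def per_host(instances, field):
    """Total memory per Host, using either the 'allocated' or the 'used' figure."""
    totals = {}
    for host, allocated_mb, used_mb in instances.values():
        value = allocated_mb if field == "allocated" else used_mb
        totals[host] = totals.get(host, 0) + value
    return totals


allocated = per_host(instances, "allocated")  # the view described in step 3
used = per_host(instances, "used")            # real-time usage, not considered

print("allocated:", allocated)
print("used     :", used)
```

By allocation, host-a carries twice as much memory as host-b, even though host-b is the busier Host in terms of actual usage. Going by the description in step 3, an imbalance calculation would see only the first view.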
Per-Cluster Basis DRS Configuration
It is also possible to configure CloudStack DRS on a per-cluster basis, allowing different algorithms to be used in different Clusters. To do this, open the Infrastructure/Cluster view from the side menu, select the Cluster, and on its Settings tab, search for “drs” and set the values as explained earlier.
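The same settings can also be scripted. The sketch below only builds the request parameters and does not contact a management server. It assumes CloudStack's updateConfiguration API call with its name, value, and optional clusterid parameters; the helper drs_update_calls, the interval value, and the Cluster ID are hypothetical, and request signing and transport are left out.

```python
ALLOWED_VALUES = {
    "drs.algorithm": {"balanced", "condensed"},
    "drs.metric": {"memory", "cpu"},
}


def drs_update_calls(settings, cluster_id=None):
    """Build one updateConfiguration parameter set per DRS setting.

    settings:   mapping of setting name -> value (as a string)
    cluster_id: optional Cluster ID for a per-cluster override
    """
    calls = []
    for name, value in settings.items():
        if not name.startswith("drs."):
            raise ValueError(f"not a DRS setting: {name}")
        if name in ALLOWED_VALUES and value not in ALLOWED_VALUES[name]:
            raise ValueError(f"invalid value for {name}: {value}")
        if name == "drs.imbalance" and not 0.0 <= float(value) <= 1.0:
            raise ValueError("drs.imbalance must be between 0.0 and 1.0")
        params = {"command": "updateConfiguration", "name": name, "value": value}
        if cluster_id is not None:
            params["clusterid"] = cluster_id
        calls.append(params)
    return calls


# Global defaults (the interval value here is only an example).
global_calls = drs_update_calls({
    "drs.enabled": "true",
    "drs.automatic.interval": "60",
    "drs.algorithm": "balanced",
    "drs.metric": "memory",
})

# Per-cluster override; "example-cluster-id" is a placeholder, not a real ID.
cluster_calls = drs_update_calls(
    {"drs.algorithm": "condensed"}, cluster_id="example-cluster-id"
)

for call in global_calls + cluster_calls:
    print(call)
```

Validating the values before sending anything keeps typos such as an unknown algorithm name from reaching the management server.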
Running CloudStack DRS Manually
CloudStack DRS runs automatically at the interval set by the drs.automatic.interval parameter in Global Settings. However, you can also run it manually from the DRS tab in the Cluster details pane.
Clicking “Generate DRS plan” generates a DRS migration plan for the target Cluster. Clicking the “Execute” button then rebalances the Instances between the Cluster’s Hosts as suggested by the plan.
After that, the DRS execution details are shown in the Cluster’s DRS tab.
In the Events tab in the side menu, you can view the events that were executed by CloudStack DRS.
CloudStack DRS is a valuable tool for organisations seeking to improve the distribution of workloads across their virtualised cloud infrastructure. It builds on the live migration feature built into Apache CloudStack, which makes it agnostic to the underlying hypervisor. The functionality is available from Apache CloudStack 4.19 LTS.
Vishesh Jindal is a software engineer at ShapeBlue. He has experience in developing and managing cloud infrastructure. He has a particular interest in databases and has worked extensively on them.
When Vishesh is not working, he enjoys watching anime, playing DOTA, or working on an open-source project.