The 4.21 release adds native GPU support on KVM, making GPU-enabled Instance deployment straightforward.
• Setup: Enable IOMMU, disable Nouveau, install tools, and optionally NVIDIA vGPU Manager.
• CloudStack: Auto-discovers GPUs, supports passthrough and vGPU, and lets you attach GPUs through Service Offerings.
• Extras: Works with Kubernetes GPU plugins, supports display (with video.hardware=none), and enforces GPU quotas via global settings.
• Tested: Validated with NVIDIA A10; other GPUs may vary.
Apache CloudStack 4.21 introduces native GPU support for KVM, enabling GPU-enabled Instances to be launched through Service Offerings without manual workarounds. The feature supports both passthrough and, where available, vGPU.
This is particularly valuable for workloads like AI/ML, VDI, or any application requiring GPU acceleration. CloudStack automatically discovers GPU devices on the Host, allows operators to select a GPU profile in the Service Offering, and attach it to an Instance directly from the UI or API. Quotas can be applied at the Account, Project, or Domain level.
This article covers:
• Preparing the host (IOMMU, Nouveau driver, dependencies, and optional vGPU Manager)
• Configuring CloudStack (discovery, Service Offerings, and GPU attachment)
• Optional Kubernetes integration (vendor plugins)
• Using vGPU for display and key configuration tips
• Setting quotas and limits
Host Prerequisites
Before attaching GPUs to Instances, the KVM Host must be properly configured. This involves enabling IOMMU, disabling the Nouveau driver, installing required tools, and optionally setting up vGPU.
Enabling IOMMU
IOMMU must be enabled both in the BIOS and in the kernel.
• In the BIOS, enable Intel VT-d (Intel) or AMD-Vi (AMD).
• In the kernel, set intel_iommu=on or amd_iommu=on depending on the CPU vendor. For best performance, also add iommu=pt (pass-through) to the kernel command line when using SR-IOV.
Ubuntu
Edit /etc/default/grub as follows:
• For Intel:
GRUB_CMDLINE_LINUX_DEFAULT="<existing options> intel_iommu=on iommu=pt"
• For AMD:
GRUB_CMDLINE_LINUX_DEFAULT="<existing options> amd_iommu=on iommu=pt"
Update the grub configuration and reboot the system:
sudo grub-mkconfig -o /boot/grub/grub.cfg
sudo reboot
Enterprise Linux
Run the following command to set the parameters:
• For Intel:
grubby --args="intel_iommu=on iommu=pt" --update-kernel DEFAULT
• For AMD:
grubby --args="amd_iommu=on iommu=pt" --update-kernel DEFAULT
Reboot the system:
sudo reboot
Verify IOMMU
After rebooting the Host, verify that IOMMU is enabled by checking the dmesg logs:
sudo dmesg | grep IOMMU
If IOMMU is active, the output should include a line similar to:
[ 0.057175] DMAR: IOMMU enabled
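As an additional check, you can confirm that the kernel has populated IOMMU groups under sysfs. This is a hedged example: the group layout and device count depend entirely on your hardware.

```shell
# List IOMMU groups; a populated directory confirms IOMMU is active.
ls /sys/kernel/iommu_groups/

# Count the PCI devices assigned to IOMMU groups.
find /sys/kernel/iommu_groups/ -type l | wc -l
```

If the directory is empty, IOMMU is not active and the kernel parameters or BIOS settings should be revisited.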
Disable Nouveau Driver
The open-source Nouveau driver must be disabled to allow passthrough.
To proceed, create a modprobe configuration file:
echo "blacklist nouveau" | sudo tee /etc/modprobe.d/disable-nouveau.conf
echo "options nouveau modeset=0" | sudo tee -a /etc/modprobe.d/disable-nouveau.conf
Regenerate the initramfs:
• Ubuntu
sudo update-initramfs -u
• Enterprise Linux
sudo dracut --force
Reboot the Host:
sudo reboot
Verify the driver status:
lspci -v | grep -A 10 " 3D "
If the Nouveau driver is disabled, the kernel driver in use should not list Nouveau.
If the Nouveau driver is still active, double-check the configuration file and initramfs regeneration.
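A quick way to confirm from the shell (hedged: lsmod simply lists loaded kernel modules, so no output from the first command means Nouveau is not loaded):

```shell
# If this prints nothing, the nouveau module is not loaded.
lsmod | grep nouveau

# Alternatively, check which kernel driver is bound to the GPU.
lspci -nnk | grep -A 3 -i nvidia
```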
Install Required Tools
CloudStack uses a helper script (/var/lib/cloudstack-common/vms/scripts/hypervisors/kvm/gpudiscovery.sh) to discover GPU devices on the Host. This script depends on a few additional packages that need to be installed.
• Ubuntu
sudo apt install pciutils xmlstarlet
• Enterprise Linux
sudo dnf install -y pciutils xmlstarlet
Once installed, the Host will be able to report available GPU devices back to CloudStack during the discovery process.
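At its core, discovery means walking the PCI bus for GPU-class devices. As an illustrative sketch only (the sample lspci lines below are hypothetical, not output from a real host, and the real script does considerably more), this is the kind of filtering involved:

```shell
# Sketch only: filter GPU-class devices (VGA and 3D controllers) from
# lspci-style output. On a real host you would pipe `lspci -nn` instead
# of the hard-coded sample below.
sample='00:02.0 VGA compatible controller [0300]: Intel Corporation UHD Graphics [8086:9bc8]
3b:00.0 3D controller [0302]: NVIDIA Corporation GA102GL [A10] [10de:2236]
00:1f.3 Audio device [0403]: Intel Corporation Audio Controller [8086:06c8]'

echo "$sample" | grep -E 'VGA compatible controller|3D controller'
```

Only the two GPU-class lines pass the filter; the audio device is ignored.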
vGPU Support
Some GPU devices support vGPU, allowing a single physical GPU to be split into multiple virtual GPUs that can be attached to different Instances.
To enable vGPU on the host, you must install the NVIDIA vGPU Manager.
Steps:
• Download the vGPU Manager package for your OS from the NVIDIA Licensing Portal.
• Follow the installation instructions included in the package.
• Configure vGPU profiles according to the hardware and licensing model.
• Reboot the Host if required by the installation.
Once installed, vGPU profiles will be discovered by CloudStack alongside physical GPU devices, allowing you to select specific profiles in Service Offerings.
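After the vGPU Manager is installed, the kernel exposes the available profiles through the standard mediated-device (mdev) framework. A hedged example of inspecting them from the Host (the sysfs paths depend on your GPU's PCI address and vendor driver):

```shell
# List the mdev (vGPU) types each physical GPU advertises.
ls /sys/class/mdev_bus/*/mdev_supported_types 2>/dev/null

# Show the human-readable name of each advertised profile.
cat /sys/class/mdev_bus/*/mdev_supported_types/*/name 2>/dev/null
```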
CloudStack Configuration
Once the host is properly configured, GPU devices can be discovered and managed directly through Apache CloudStack.
You can view and allocate GPU resources from the UI or via API.
Discover GPU Devices
When the Host connects to the CloudStack Management Server, GPU devices are automatically discovered.
You can view them under the GPU tab on the Host’s details page.
Create a GPU-enabled Service Offering
To launch an Instance with a GPU:
- Go to Service Offerings -> Compute Offerings.
- Create a new offering and select the desired GPU card and profile (passthrough or vGPU).
- Save the offering.
This Service Offering can now be used like any other when deploying Instances.
Launch an Instance with GPU
When launching an Instance, select the GPU-enabled Service Offering.
Once deployed, the GPU is attached and can be verified from the Instance details page or using guest OS tools (e.g., nvidia-smi for NVIDIA cards).
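Inside a Linux guest with the NVIDIA driver installed, a minimal verification might look like this (hedged: the exact output depends on the driver version and GPU model):

```shell
# Confirm the GPU is visible on the guest's PCI bus.
lspci | grep -i nvidia

# Query the driver for GPU name, driver version and utilisation.
nvidia-smi
```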
GPU Detachment on Stop
By default, stopping an Instance does not detach the GPU device.
To release GPUs automatically when an Instance is stopped, set the following in Global Settings:
gpu.detach.on.stop = true
This can be configured globally or at the Domain level.
Changing this setting does not affect already running Instances; it applies to future stop events.
This option helps maximise GPU utilisation by making resources available to other Instances when idle.
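The setting can also be changed via the API, for example with CloudMonkey (this assumes a configured cmk profile with admin credentials):

```shell
# Enable automatic GPU detachment on stop via the updateConfiguration API.
cmk update configuration name=gpu.detach.on.stop value=true
```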
Kubernetes Integration
GPU-enabled Instances can also be used as Kubernetes worker nodes.
Once a node with a GPU is deployed, you can enable GPU discovery in Kubernetes by installing the GPU tool provided by the vendor.
Install the GPU Tool
Depending on your hardware vendor, install the appropriate integration:
- NVIDIA: NVIDIA GPU Operator
- AMD: AMD GPU Device Plugin
These components handle GPU discovery and expose GPU resources to Kubernetes workloads.
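For NVIDIA hardware, the GPU Operator is typically deployed with Helm. A sketch of the standard installation, using the chart names NVIDIA publishes at the time of writing:

```shell
# Add NVIDIA's Helm repository and install the GPU Operator
# into its own namespace.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator --create-namespace
```

Once the operator's pods are running, GPU nodes advertise resources such as nvidia.com/gpu that workloads can request.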
vGPU for Display
When creating a Service Offering, you can configure a GPU device to be used for display in addition to passthrough or vGPU compute.
However, when a GPU is used for display, the Instance may experience issues with the console proxy — for example, a disappearing cursor or display rendering problems.
Adjusting the Display Settings
To address these issues, set the following parameter on the Instance or Template:
video.hardware = none
This setting can be applied either on the Template or on an individual Instance, for example from its Settings tab in the UI or via the API.
This disables the default virtual display and allows the physical GPU display to be used correctly.
Driver Requirements
For proper operation:
- The required GPU drivers must be installed on the Host.
- The guest OS must also have compatible drivers for display rendering.
With these settings, Instances using vGPU for display can function correctly without console rendering problems.
Limits
The GPU support feature also includes the ability to define quotas and limits at the Account, Project and Domain levels. This allows Administrators to control how GPU resources are consumed and ensure fair usage across different tenants.
The default maximum number of GPU devices can be configured through the following Global Settings:
| Setting | Description |
|---|---|
| max.account.gpus | The default maximum number of GPU devices that can be used for an account. |
| max.domain.gpus | The default maximum number of GPU devices that can be used for a domain. |
| max.project.gpus | The default maximum number of GPU devices that can be used for a project. |
These defaults can be overridden for specific accounts, domains, or projects as needed.
Both passthrough and vGPU are counted as one GPU device each. If multiple vGPU profiles are created on a single physical GPU, only the vGPU profiles themselves are considered in the quota calculation.
This approach gives operators flexibility to distribute GPU resources efficiently while maintaining control and visibility at different levels of the cloud environment.
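The global defaults above can likewise be adjusted through the API, for example with CloudMonkey (the value 8 is an arbitrary illustration):

```shell
# Raise the default per-account GPU limit via the updateConfiguration API.
cmk update configuration name=max.account.gpus value=8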
Conclusion
Native GPU support in KVM significantly expands what Apache CloudStack can offer for modern workloads. By making GPU passthrough and vGPU directly available through Service Offerings, operators can deploy GPU-enabled Instances without manual configuration or custom scripting.
This is especially relevant for environments running AI/ML training and inference, VDI solutions, or other GPU-accelerated applications. During validation, this feature was successfully tested on NVIDIA A10 GPUs with both passthrough and vGPU modes.
While the implementation is based on open standards and should work with a range of GPU models, behaviour may vary with different vendors. Always verify compatibility and driver support in your environment before deploying at scale.
Vishesh Jindal is a software engineer at ShapeBlue. He has experience in developing and managing cloud infrastructure. He has a particular interest in databases and has worked extensively on them.
When Vishesh is not working, he enjoys watching anime, playing DOTA, or working on an open-source project.