The 4.21 release adds native GPU support on KVM, making GPU-enabled Instance deployment straightforward.
• Setup: Enable IOMMU, disable Nouveau, install tools, and optionally NVIDIA vGPU Manager.
• CloudStack: Auto-discovers GPUs, supports passthrough and vGPU, and lets you attach GPUs through Service Offerings.
• Extras: Works with Kubernetes GPU plugins, supports display (with video.hardware=none), and enforces GPU quotas via global settings.
• Tested: Validated with NVIDIA A10; other GPUs may vary.
Apache CloudStack 4.21 introduces native GPU support for KVM, enabling GPU-enabled Instances to be launched through Service Offerings without manual workarounds. The feature supports both passthrough and, where available, vGPU.
This is particularly valuable for workloads like AI/ML, VDI, or any application requiring GPU acceleration. CloudStack automatically discovers GPU devices on the Host, allows operators to select a GPU profile in the Service Offering, and attach it to an Instance directly from the UI or API. Quotas can be applied at the Account, Project, or Domain level.
This article covers:
• Preparing the host (IOMMU, Nouveau driver, dependencies, and optional vGPU Manager)
• Configuring CloudStack (discovery, Service Offerings, and GPU attachment)
• Optional Kubernetes integration (vendor plugins)
• Using vGPU for display and key configuration tips
• Setting quotas and limits
Host Prerequisites
Before attaching GPUs to Instances, the KVM Host must be properly configured. This involves enabling IOMMU, disabling the Nouveau driver, installing required tools, and optionally setting up vGPU.
Enabling IOMMU
IOMMU must be enabled both in the BIOS and in the kernel.
• In the BIOS, enable Intel VT-d (Intel) or AMD-Vi (AMD).
• In the kernel, set intel_iommu=on or amd_iommu=on depending on the CPU vendor. For best performance, also add iommu=pt (pass-through) to the kernel command line when using SR-IOV.
Ubuntu
Edit /etc/default/grub as follows:
• For Intel:
GRUB_CMDLINE_LINUX_DEFAULT="<existing options> intel_iommu=on iommu=pt"
• For AMD:
GRUB_CMDLINE_LINUX_DEFAULT="<existing options> amd_iommu=on iommu=pt"
Update the grub configuration and reboot the system:
sudo grub-mkconfig -o /boot/grub/grub.cfg
sudo reboot
Enterprise Linux
Run the following command to set the parameters:
• For Intel:
grubby --args="intel_iommu=on iommu=pt" --update-kernel DEFAULT
• For AMD:
grubby --args="amd_iommu=on iommu=pt" --update-kernel DEFAULT
Reboot the system:
sudo reboot
Verify IOMMU
After rebooting the Host, verify that IOMMU is enabled by checking the dmesg logs:
sudo dmesg | grep IOMMU
If IOMMU is active, the output should include a line similar to:
[ 0.057175] DMAR: IOMMU enabled
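As an additional check, you can confirm that the kernel has populated IOMMU groups under sysfs. This is a hedged example: the group layout and device count depend entirely on your hardware.

```shell
# List IOMMU groups; a populated directory confirms IOMMU is active.
ls /sys/kernel/iommu_groups/

# Count the PCI devices assigned to IOMMU groups.
find /sys/kernel/iommu_groups/ -type l | wc -l
```

If the directory is empty, IOMMU is not active and the kernel parameters or BIOS settings should be revisited.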
Disable Nouveau Driver
The open-source Nouveau driver must be disabled to allow passthrough.
To proceed, create a modprobe configuration file:
echo "blacklist nouveau" | sudo tee /etc/modprobe.d/disable-nouveau.conf
echo "options nouveau modeset=0" | sudo tee -a /etc/modprobe.d/disable-nouveau.conf
Regenerate the initramfs:
• Ubuntu
sudo update-initramfs -u
• Enterprise Linux
sudo dracut --force
Reboot the Host:
sudo reboot
Verify the driver status:
lspci -v | grep -A 10 " 3D "
If the Nouveau driver is disabled, the kernel driver in use should not list Nouveau.
If the Nouveau driver is still active, double-check the configuration file and initramfs regeneration.
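A quick way to confirm from the shell (hedged: lsmod simply lists loaded kernel modules, so no output from the first command means Nouveau is not loaded):

```shell
# If this prints nothing, the nouveau module is not loaded.
lsmod | grep nouveau

# Alternatively, check which kernel driver is bound to the GPU.
lspci -nnk | grep -A 3 -i nvidia
```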
Install Required Tools
CloudStack uses a helper script (/var/lib/cloudstack-common/vms/scripts/hypervisors/kvm/gpudiscovery.sh) to discover GPU devices on the Host. This script depends on a few additional packages that need to be installed.
• Ubuntu
sudo apt install pciutils xmlstarlet
• Enterprise Linux
sudo dnf install -y pciutils xmlstarlet
Once installed, the Host will be able to report available GPU devices back to CloudStack during the discovery process.
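At its core, discovery means walking the PCI bus for GPU-class devices. As an illustrative sketch only (the sample lspci lines below are hypothetical, not output from a real host, and the real script does considerably more), this is the kind of filtering involved:

```shell
# Sketch only: filter GPU-class devices (VGA and 3D controllers) from
# lspci-style output. On a real host you would pipe `lspci -nn` instead
# of the hard-coded sample below.
sample='00:02.0 VGA compatible controller [0300]: Intel Corporation UHD Graphics [8086:9bc8]
3b:00.0 3D controller [0302]: NVIDIA Corporation GA102GL [A10] [10de:2236]
00:1f.3 Audio device [0403]: Intel Corporation Audio Controller [8086:06c8]'

echo "$sample" | grep -E 'VGA compatible controller|3D controller'
```

Only the two GPU-class lines pass the filter; the audio device is ignored.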
vGPU Support
Some GPU devices support vGPU, allowing a single physical GPU to be split into multiple virtual GPUs that can be attached to different Instances.
To enable vGPU on the host, you must install the NVIDIA vGPU Manager.
Steps:
• Download the vGPU Manager package for your OS from the NVIDIA Licensing Portal.
• Follow the installation instructions included in the package.
• Configure vGPU profiles according to the hardware and licensing model.
• Reboot the Host if required by the installation.
Once installed, vGPU profiles will be discovered by CloudStack alongside physical GPU devices, allowing you to select specific profiles in Service Offerings.
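After the vGPU Manager is installed, the kernel exposes the available profiles through the standard mediated-device (mdev) framework. A hedged example of inspecting them from the Host (the sysfs paths depend on your GPU's PCI address and vendor driver):

```shell
# List the mdev (vGPU) types each physical GPU advertises.
ls /sys/class/mdev_bus/*/mdev_supported_types 2>/dev/null

# Show the human-readable name of each advertised profile.
cat /sys/class/mdev_bus/*/mdev_supported_types/*/name 2>/dev/null
```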
CloudStack Configuration
Once the host is properly configured, GPU devices can be discovered and managed directly through Apache CloudStack.
You can view and allocate GPU resources from the UI or via API.
Discover GPU Devices
When the Host connects to the CloudStack Management Server, GPU devices are automatically discovered.
You can view them under the GPU tab on the Host’s details page.
Create a GPU-enabled Service Offering
To launch an Instance with a GPU:
- Go to Service Offerings -> Compute Offerings.
- Create a new offering and select the desired GPU card and profile (passthrough or vGPU).
- Save the offering.
This Service Offering can now be used like any other when deploying Instances.
Launch an Instance with GPU
When launching an Instance, select the GPU-enabled Service Offering.
Once deployed, the GPU is attached and can be verified from the Instance details page or using guest OS tools (e.g., nvidia-smi for NVIDIA cards).
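Inside a Linux guest with the NVIDIA driver installed, a minimal verification might look like this (hedged: the exact output depends on the driver version and GPU model):

```shell
# Confirm the GPU is visible on the guest's PCI bus.
lspci | grep -i nvidia

# Query the driver for GPU name, driver version and utilisation.
nvidia-smi
```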
GPU Detachment on Stop
By default, stopping an Instance does not detach the GPU device.
To release GPUs automatically when an Instance is stopped, set the following in Global Settings:
gpu.detach.on.stop = true
This can be configured globally or at the Domain level.
Changing this setting does not affect already running Instances; it applies to future stop events.
This option helps maximise GPU utilisation by making resources available to other Instances when idle.
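The setting can also be changed via the API, for example with CloudMonkey (this assumes a configured cmk profile with admin credentials):

```shell
# Enable automatic GPU detachment on stop via the updateConfiguration API.
cmk update configuration name=gpu.detach.on.stop value=true
```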
Kubernetes Integration
GPU-enabled Instances can also be used as Kubernetes worker nodes.
Once a node with a GPU is deployed, you can enable GPU discovery in Kubernetes by installing the GPU tool provided by the vendor.
Install the GPU Tool
Depending on your hardware vendor, install the appropriate integration:
- NVIDIA: NVIDIA GPU Operator
- AMD: AMD GPU Device Plugin
These components handle GPU discovery and expose GPU resources to Kubernetes workloads.
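For NVIDIA hardware, the GPU Operator is typically deployed with Helm. A sketch of the standard installation, using the chart names NVIDIA publishes at the time of writing:

```shell
# Add NVIDIA's Helm repository and install the GPU Operator
# into its own namespace.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator --create-namespace
```

Once the operator's pods are running, GPU nodes advertise resources such as nvidia.com/gpu that workloads can request.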
vGPU for Display
When creating a Service Offering, you can configure a GPU device to be used for display in addition to passthrough or vGPU compute.
However, when a GPU is used for display, the Instance may experience issues with the console proxy — for example, a disappearing cursor or display rendering problems.
Adjusting the Display Settings
To address these issues, set the following parameter on the Instance or Template:
video.hardware = none
This setting can be applied either on the Template or on an individual Instance, for example from its Settings tab in the UI or via the API.
This disables the default virtual display and allows the physical GPU display to be used correctly.
Driver Requirements
For proper operation:
- The required GPU drivers must be installed on the Host.
- The guest OS must also have compatible drivers for display rendering.
With these settings, Instances using vGPU for display can function correctly without console rendering problems.
Limits
The GPU support feature also includes the ability to define quotas and limits at the Account, Project and Domain levels. This allows Administrators to control how GPU resources are consumed and ensure fair usage across different tenants.
The default maximum number of GPU devices can be configured through the following Global Settings:
| Setting | Description |
|---|---|
| max.account.gpus | The default maximum number of GPU devices that can be used for an account. |
| max.domain.gpus | The default maximum number of GPU devices that can be used for a domain. |
| max.project.gpus | The default maximum number of GPU devices that can be used for a project. |
These defaults can be overridden for specific accounts, domains, or projects as needed.
Both passthrough and vGPU are counted as one GPU device each. If multiple vGPU profiles are created on a single physical GPU, only the vGPU profiles themselves are considered in the quota calculation.
This approach gives operators flexibility to distribute GPU resources efficiently while maintaining control and visibility at different levels of the cloud environment.
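The global defaults above can likewise be adjusted through the API, for example with CloudMonkey (the value 8 is an arbitrary illustration):

```shell
# Raise the default per-account GPU limit via the updateConfiguration API.
cmk update configuration name=max.account.gpus value=8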
Conclusion
Native GPU support in KVM significantly expands what Apache CloudStack can offer for modern workloads. By making GPU passthrough and vGPU directly available through Service Offerings, operators can deploy GPU-enabled Instances without manual configuration or custom scripting.
This is especially relevant for environments running AI/ML training and inference, VDI solutions, or other GPU-accelerated applications. During validation, this feature was successfully tested on NVIDIA A10 GPUs with both passthrough and vGPU modes.
While the implementation is based on open standards and should work with a range of GPU models, behaviour may vary with different vendors. Always verify compatibility and driver support in your environment before deploying at scale.
Vishesh Jindal is a software engineer at ShapeBlue. He has experience in developing and managing cloud infrastructure. He has a particular interest in databases and has worked extensively on them.
When Vishesh is not working, he enjoys watching anime, playing DOTA, or working on an open-source project.