Apache CloudStack 4.21: GPU Power Meets Seamless Orchestration

Apache CloudStack 4.21.0 is the latest community release and the result of approximately eight months of active development and collaboration. With contributions from across the community, this version delivers 15 new features, around 40 improvements, and more than 150 bug fixes since 4.20, continuing the project’s focus on stability, extensibility, and operator-centric design.

The release adds extensibility support, GPU workload orchestration, Kubernetes improvements, and infrastructure governance. Highlights include a modular XaaS Extensions Framework for third-party system orchestration, native GPU resource support with passthrough and vGPU capabilities, and SDN integration via Netris. Users can now recover Instances directly from backups, apply lease policies with expiration actions, and enforce backup and object storage quotas. Additional improvements include deploy-form enhancements, vTPM support, and greater flexibility in Kubernetes cluster composition with per-node offerings and custom CNI plugins.

Highlight Features

CloudStack XaaS Extensions Framework – Orchestrate Anything

The CloudStack XaaS Extensions Framework introduces a first-class mechanism to integrate CloudStack with external orchestration and provisioning systems. This is accomplished via a new resource type — the External Orchestrator — which can be registered and managed through the CloudStack UI and API.

Each orchestrator is linked to a script or binary on the management server. CloudStack invokes these executables with JSON input and expects structured JSON output. Custom actions can be defined with input parameters, role-based access controls, and localized success/failure messages. This model enables integration with systems such as:

  • Legacy or non-native hypervisors (e.g., Hyper-V, Proxmox)
  • Bare-metal provisioning platforms
  • Hyperscaler orchestration (e.g. launching resources on AWS/GCP/Azure)
  • Firewall and network appliance management
  • Integration with ITSM or CMDB systems
  • Edge infrastructure coordination

The framework supports action-based workflows such as create, delete, start, stop, and reboot. It is hypervisor-agnostic and operates independently of CloudStack’s native compute orchestration.
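
The invocation model described above (an executable that receives JSON and returns JSON) can be sketched as a small dispatcher. This is an illustration only: the field names (`action`, `instance`, `status`, `message`) and the handler behaviour are assumptions made for the example, not the framework's actual contract, which is defined in the CloudStack documentation.

```python
"""Hypothetical external-orchestrator logic: JSON request in, JSON response out."""
import json


# Hypothetical handlers; a real script would call the external system here.
def start(instance):
    return {"status": "success", "message": f"started {instance}"}


def stop(instance):
    return {"status": "success", "message": f"stopped {instance}"}


HANDLERS = {"start": start, "stop": stop}


def handle(request_text):
    """Parse a JSON request string and return a JSON response string."""
    try:
        request = json.loads(request_text)
        action = request["action"]      # assumed field name
        instance = request["instance"]  # assumed field name
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        return json.dumps({"status": "error", "message": f"bad request: {exc!r}"})

    handler = HANDLERS.get(action)
    if handler is None:
        return json.dumps({"status": "error", "message": f"unsupported action: {action}"})
    return json.dumps(handler(instance))
```

A script built on this sketch would pass whatever input CloudStack supplies to `handle` and print the result. Returning a structured error rather than raising keeps the output parseable for every request.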

In 4.21, orchestrators for Proxmox and Hyper-V are built in. Storage and network orchestration are outside the scope of this release, and console access is not included by default.

This feature enables CloudStack to coordinate infrastructure resources beyond its native stack, using external tools while maintaining governance and control via the core platform.


GPU Integration with KVM in CloudStack (Technical Preview)

Apache CloudStack 4.21 adds native support for managing GPU devices as allocatable resources in KVM-based environments. This enables GPU-accelerated workloads—such as machine learning (ML), artificial intelligence (AI), and virtual desktop infrastructure (VDI)—to be deployed and managed consistently within the platform.

Key capabilities include:

  • Automatic discovery of GPU devices on supported KVM Hosts
  • Grouping and classification of GPUs by vendor or type
  • Support for passthrough and vGPU assignment models
  • GPU-backed Service Offerings for Instances and Kubernetes clusters
  • Limits enforcement at the Account, Domain, and Project levels

Once discovered, GPU devices can be linked to Service Offerings and presented to end users when deploying Instances or provisioning clusters via CloudStack Kubernetes Service (CKS). GPU usage is tracked across the full allocation lifecycle, ensuring consistency with CloudStack’s native resource accounting, metering, and quota mechanisms.

Depending on host configuration, GPUs can be:

  • Assigned exclusively to a single Instance (passthrough), or
  • Partitioned for use by multiple Instances (vGPU).

This native integration reduces reliance on manual scripting and provides a uniform operational model for managing GPU access across multi-tenant environments.

Netris SDN Integration

Apache CloudStack 4.21 adds native support for Netris, a controller-driven SDN platform designed to simplify Layer 3 network operations. This integration allows administrators to register Netris controllers during zone creation and use them as network service providers in advanced KVM zones.

Once integrated, CloudStack can orchestrate VPC-based network topologies with automated provisioning of virtual routers, ACLs, source NAT, static NAT, port forwarding, load balancing, and site-to-site and remote access VPNs. Netris handles the underlying fabric configuration, using VXLAN encapsulation and its own management plane, while CloudStack retains full control over the user-facing abstraction and workflows.

The integration supports:

  • VPC and tier orchestration with policy-driven routing and NAT
  • ACL configuration and VPN provisioning through standard CloudStack flows
  • Kubernetes support via CKS, with Netris-backed networking for pods and services
  • Full API and UI management without requiring manual configuration of physical switches

Ideal for service providers and telcos operating modern Layer 3 datacenters, this plugin eliminates the need for static switch configuration or overlay automation tooling—enabling CloudStack to act as a control plane for Netris-powered SDN fabrics.

CloudStack Kubernetes Service (CKS) Enhancements

Apache CloudStack 4.21 delivers a major update to its CloudStack Kubernetes Service (CKS), making it more adaptable to complex infrastructure and operational workflows. These enhancements improve how clusters are deployed, scaled, and maintained—whether in lab setups or production-grade environments.

Key Capabilities introduced:

  • Per-node-type Offerings: Users can now define different compute offerings for each node type—master, worker, and etcd—at cluster creation time. This allows fine-grained control over performance, cost, and placement.
  • Custom Templates Support: CKS now accepts both “CKS-ready” and generic templates. CKS-ready templates include preinstalled components and are marked for reuse, while generic ones require user preparation but offer greater flexibility.
  • Pre-created Worker Nodes: Administrators can attach an existing instance—prepared with CKS-compatible software—to a running cluster. This simplifies hybrid setups where some nodes are provisioned externally.
  • etcd Separation: Optionally, etcd nodes can be deployed separately from master nodes, improving fault isolation and aligning with best practices in Kubernetes architecture.
  • Manual Upgrade Control: Templates and offerings can now be marked to skip automated upgrades, giving teams tighter control over version changes and system stability.
  • Dedicated Resource Deployment: Using standard APIs like dedicateHost or dedicateCluster, operators can restrict cluster deployments to predefined hosts or clusters within a specific domain—enabling stronger isolation or resource reservation.
  • Custom CNI Plugin Support (e.g., Calico, Cilium): Users can now choose the CNI plugin to be used when deploying a cluster, with plugin-specific settings—such as ASN and BGP peer configuration—handled via structured userdata. This gives operators the flexibility to align network overlays with their existing infrastructure policies.

Together, these enhancements position CKS as a more flexible and infrastructure-aware Kubernetes solution, enabling service providers and enterprises to integrate CloudStack-based clusters into complex environments with diverse compute, network, and upgrade requirements.

Create Instance from Backup

Apache CloudStack 4.21 introduces the ability to create new Instances directly from existing backups, even in cases where the original instance has been permanently deleted or expunged. This functionality enhances the platform’s disaster recovery and workload mobility capabilities, enabling users to recover or redeploy workloads using available backup snapshots.

Key Capabilities:

  • Instance recovery without original Instance – Instances can be restored from backup data even if the source Instance no longer exists in the environment.
  • Configurable recovery flow – Users can customize parameters such as compute offering, root disk size, network selection, SSH key, affinity groups, and zone placement during recovery.
  • Metadata-driven restoration – CloudStack automatically extracts and displays relevant metadata from the backup, including instance name, OS type, and disk layout.
  • Multi-disk support – Backups of multi-disk instances can be used to restore the full disk set with proper ordering.
  • ISO attachment – If the original Instance used an ISO, it can be reattached during recovery.
  • UI Integration – The “Restore Instance” option is exposed directly from the backup list in the UI, presenting a simplified deployment form with prefilled data.

Additional Improvements to Backup Handling:

  • Support for naming and describing individual backups
  • Visual differentiation of scheduled vs. manual backups
  • Retention of restore metadata, even after source Instance deletion
  • Enhanced backup view with type, size, and disk metadata

This feature builds on the Backup and Recovery framework and is compatible with all officially supported backup providers, including Dummy, NAS, and Veeam. It enables seamless cross-zone workload restoration, enforces backup policies per account/domain/project, and strengthens the operational resilience of cloud environments.

Instance Lease (Automatic Stop/Deletion)

The Instance Lease feature in Apache CloudStack 4.21 allows Administrators and Users to assign a lease duration to Instances, specifying how long an Instance should remain active before being automatically stopped or deleted. This helps maintain infrastructure hygiene, reduce resource waste, and enforce internal policies for temporary workloads.

How it works:

A lease can be configured:

  • Globally via settings like instance.lease.default and instance.lease.action.
  • Per-service-offering, where specific durations and actions (stop/delete) are predefined.
  • At deployment time, when users can manually define the lease period if allowed.

Once the lease time elapses:

  • The system performs the configured action: either Stop or Expunge the Instance.
  • A grace period (instance.lease.expiry.action.grace.period) and a polling interval (instance.lease.poll.interval) control timing and behaviour.
  • The lease does not override safeguards like DeletionProtection. If an Instance is protected, delete actions will fail safely.
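
The expiry behaviour described above can be sketched as a pure function. This is an illustrative model rather than CloudStack's implementation: the function name, arguments, and return values are hypothetical, and the grace period and polling interval are left out for brevity.

```python
from datetime import datetime, timedelta


def lease_outcome(created, lease_days, action, deletion_protected, now):
    """Return what a lease check would do to an Instance at time `now`.

    Hypothetical model: `action` is "stop" or "expunge"; the result is
    "none", "stop", "expunge", or "blocked".
    """
    if action not in ("stop", "expunge"):
        raise ValueError(f"unknown lease action: {action}")
    if now < created + timedelta(days=lease_days):
        return "none"      # lease has not elapsed yet
    if action == "expunge" and deletion_protected:
        return "blocked"   # deletion protection takes precedence over the lease
    return action
```

Modelling the protected case as a distinct "blocked" result mirrors the statement that delete actions fail safely instead of overriding the safeguard.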

Example use cases:

  • Training labs and sandbox environments: Automatically clean up student or test deployments after a session ends
  • Ephemeral workloads: Ensure that short-lived demo or benchmark environments don’t consume resources beyond their intended lifespan
  • MSP and DevOps environments: Enforce contractual or SLA-based duration limits on customer-deployed Instances
  • Shared infrastructure (e.g., community clouds): Prevent long-term hoarding of compute by enforcing maximum runtime
  • Hackathons and internal R&D projects: Spin up disposable resources that automatically terminate after a few hours or days
  • QA and CI pipelines: Ensure test Instances are removed after the test window completes, without manual tracking
  • Cost control scenarios: Avoid budget leakage by automatically retiring idle or forgotten Instances

UI and API visibility:

  • Lease status and expiry date are shown in the Instance view.
  • Admins can search, filter, or audit lease-enabled Instances for capacity planning.
  • Fully integrated with CloudStack’s cleanup workflows—no need for external scripts.

This feature complements existing cleanup workflows and aligns with resource governance strategies.

Backup and Object Storage Limits

The CloudStack Backup and Object Storage Limits feature introduces native quota enforcement for backups at the tenant level, empowering administrators to control how much backup data each account, domain, or project can create and store. Leveraging CloudStack’s existing limits and resource tracking framework, this feature allows operators to set fine-grained policies for the maximum number of backups and maximum storage (in GB) consumed, with default values and override capabilities at every scope.

Fully integrated into the UI and API, users gain visibility into their backup usage and are automatically prevented from exceeding allocated quotas. Lifecycle operations—like backup creation, retention for recurring jobs, and deletion—respect these limits, helping avoid excessive or unplanned storage consumption. Additionally, the feature provides tools for identifying and handling inaccessible or missing restore points, enabling admins to remove orphaned entries without affecting valid backup chains.
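
The two limits described here (a maximum number of backups and a maximum storage size in GB) amount to a check before each backup is created. The sketch below is a hypothetical model of that check: the class, function, and field names are invented for illustration, and treating -1 as "unlimited" is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass
class BackupLimits:
    max_backups: int       # maximum number of backups; -1 means unlimited (assumption)
    max_storage_gb: float  # maximum backup storage in GB; -1 means unlimited (assumption)


def can_create_backup(limits, current_count, current_gb, new_backup_gb):
    """Return (allowed, reason) for a proposed backup of `new_backup_gb` GB."""
    if limits.max_backups >= 0 and current_count + 1 > limits.max_backups:
        return False, "backup count limit reached"
    if limits.max_storage_gb >= 0 and current_gb + new_backup_gb > limits.max_storage_gb:
        return False, "backup storage limit exceeded"
    return True, "ok"
```

The same check would apply at each scope (account, domain, or project) using that scope's limits and current usage.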

This enhancement improves operational predictability and cost control, especially in multi-tenant environments with shared or metered backup infrastructure. It’s compatible with any backup provider supported by CloudStack, and includes updated logic for modern backup platforms to align with best practices for data protection and lifecycle management.

Deploy Instance Form Improvements

Apache CloudStack 4.21 brings major enhancements to the instance deployment UI, focused on improving clarity, accuracy, and user efficiency—especially in environments with heterogeneous architectures and template variants.

Key Improvements:

  • Architecture-Aware Template Mapping: The template selector now considers both the selected zone and the system.vm.preferred.architecture global setting. It prioritizes templates compatible with the system VM architecture of the zone (e.g., x86_64 or aarch64), helping prevent boot failures and invalid deployments.
  • Image Categorization by OS Type: Templates and ISOs are now grouped by operating system category (e.g., Ubuntu, CentOS, Windows), each with a descriptive label and icon. This makes it easier to visually browse and identify the appropriate image, even when many exist.
  • Improved Visual Hierarchy and Filtering: The image selector UI highlights featured templates, groups community/shared/private images, and de-emphasizes those that don’t match the current context. Fields such as template, ISO, and OS category are dynamically updated based on prior selections like zone or compute offering.
  • Consistency with Reinstall Flow: The same improvements apply to the Reinstall Instance workflow, including architecture-based filtering and categorised image display, ensuring consistent behaviour across the Instance lifecycle.
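
Architecture-aware prioritization can be pictured as a stable sort that moves matching templates to the front. The sketch below is only a conceptual illustration: the function name and the `name` and `arch` keys are hypothetical and do not reflect the UI's actual code.

```python
def order_templates(templates, preferred_arch):
    """Sort templates so those matching `preferred_arch` come first.

    `templates` is a list of dicts with hypothetical "name" and "arch" keys.
    sorted() is stable, so the original order is kept within each group.
    """
    return sorted(templates, key=lambda t: t["arch"] != preferred_arch)
```

Non-matching templates are moved back rather than removed, which corresponds to de-emphasizing them instead of hiding them.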

This set of improvements enhances both the first-time deployment and reinstallation experience, aligning the UI with the backend architecture-awareness introduced in recent CloudStack releases.

Virtual Trusted Platform Module (vTPM) Support

Apache CloudStack 4.21 introduces support for virtual Trusted Platform Modules (vTPM) across both KVM and VMware hypervisors. This allows operators to provision Instances with TPM-backed encryption and attestation capabilities, meeting the requirements of secure workloads and compliance-sensitive environments.

Key Capabilities

1. KVM Support

  • vTPM is enabled via a flag in the Compute Offering (vtpm).
  • The underlying host must support tpm-emulator and UEFI-based booting with OVMF (/usr/share/qemu/OVMF.fd).
  • Compatible only with templates using UEFI firmware and without password-based instantiation.

2. VMware Support

  • VMware vSphere environments must be version 6.7 or later.
  • CloudStack provisions a virtual TPM device by using the EnableVTPM configuration.
  • Requires that the Instance be deployed with a compatible guest OS and firmware (UEFI).
  • vTPM is automatically attached when supported by the template and Compute Offering.

Security Use Cases

  • Enables disk encryption solutions (BitLocker, LUKS/Clevis).
  • Supports measured boot and attestation in cloud environments.
  • Useful for compliance frameworks (e.g., CJIS, HIPAA, PCI-DSS) that require TPM-backed secrets.

This enhancement aligns CloudStack with enterprise-grade virtualization platforms by supporting secure boot and trusted computing features on both major hypervisor families.

Other Features

Support for KVM on IBM Z (s390x)

CloudStack 4.21 enables deployment on IBM Z and LinuxONE hardware by extending support for s390x-based KVM hosts and guests—including the CloudStack agent, management server, and a new sysVM template image tailored to the s390x architecture.

Management Server Maintenance Mode

Introduces a maintenance mode for management servers through new APIs (prepareForMaintenance and cancelMaintenance). A management server in this mode enters a read-only state, drains pending jobs, and migrates both indirect and direct agent connections to other active servers, enabling controlled rolling restarts with minimal service disruption.

Incremental KVM Volume Snapshots

Introduces support for differential (incremental) volume snapshots on KVM-backed storage volumes, enhancing backup efficiency and reducing snapshot storage overhead by saving only changed data between snapshots.
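
The idea of saving only changed data between snapshots can be shown with a toy block-level model. This is a conceptual sketch, not how CloudStack or KVM track changes internally: volumes are modelled as lists of byte blocks, both function names are hypothetical, and shrinking volumes are not handled.

```python
def changed_blocks(previous, current):
    """Return {index: data} for blocks that differ from the previous snapshot."""
    return {
        i: block
        for i, block in enumerate(current)
        if i >= len(previous) or previous[i] != block
    }


def apply_delta(base, delta):
    """Rebuild a full block list from a base snapshot plus one delta."""
    size = max(len(base), max(delta, default=-1) + 1)
    blocks = list(base) + [b""] * (size - len(base))
    for i, block in delta.items():
        blocks[i] = block
    return blocks
```

An unchanged volume produces an empty delta, which is where the storage savings come from. Restoring means applying each delta in order on top of the last full snapshot.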

Cloudian HyperStore Object Store Plugin

Adds native integration with Cloudian HyperStore, enabling administrators to offer S3-compatible object storage directly within the platform. Users can create and manage buckets with support for versioning, object lock, encryption, and IAM policies, all accessible through the CloudStack UI and API.

Reconcile Copy/Migrate Commands

Introduces a new reconcile API (and scheduler) that audits the internal command log and handles stuck or orphaned CopyCommand, MigrateCommand, and MigrateVolumeCommand operations—automatically retrying or cleaning up those failed or in‑progress jobs to restore progress and enhance storage/Instance migration stability.

Storage Access Groups

This feature introduces fine-grained control over Host-to-Primary Storage connectivity by allowing Administrators to define Storage Access Groups and assign them to Hosts, Clusters, Pods, Zones, or Primary Storage Pools. A Storage Pool with an Access Group connects only to Hosts sharing the same group, improving isolation and flexibility in large-scale environments.
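
The matching rule can be sketched as a set intersection. The function below is a hypothetical illustration: `host_groups` stands for the groups that apply to a Host (however they were assigned), and treating a pool with no access groups as open to every host is an assumption of this sketch.

```python
def host_can_connect(host_groups, pool_groups):
    """Return True if a Host may connect to a Primary Storage Pool.

    Assumption for this sketch: a pool with no access groups is open to
    every host; otherwise the host must share at least one group.
    """
    pool = set(pool_groups)
    if not pool:
        return True
    return bool(pool & set(host_groups))
```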

Volume Allocation Algorithm Support

CloudStack now allows separate configuration of the Volume allocation algorithm, decoupling it from the Instance allocation logic. Operators can independently define strategies for Volume placement using the new volume.allocation.algorithm Global Setting, enabling more flexible and optimized resource scheduling.

Dell EMC PowerFlex Storage Connection Improvements

Enhancements to the PowerFlex (ScaleIO) integration include configurable timeouts for MDM updates, better handling of SDC client restarts, and validation of MDM configurations during Primary Storage registration. These changes improve reliability when managing PowerFlex-backed Storage Pools across dynamic Host environments.

Support for Snapshot Copy to Primary Storage in Different Zones

This feature enables cross-Zone Volume Snapshot copies directly between Primary Storage Pools, bypassing Secondary Storage. Currently supported by the StorPool plugin, it allows Users to replicate Snapshots across Zones using the usestoragereplication API parameter and specific destination Storage settings.

File-Based Disk-Only Instance Snapshots with KVM

A new merge mechanism was introduced for file-based disk-only Instance Snapshots on KVM, with support for progress tracking via libvirt events and a configurable timeout (snapshot.merge.timeout). Volumes attached to Instances with such Snapshots can now be resized, and Snapshot operations have improved consistency and reliability.

KVM Incremental Snapshot

Support was added for incremental Snapshots on KVM, reducing storage and time requirements by capturing only changed data blocks since the last Snapshot. This optimization is especially beneficial for backup systems and frequent Snapshot operations.

Support for Ceph RBD Erasure-Coded Pools

CloudStack now supports provisioning Volumes from Ceph erasure-coded RBD Pools, enabling Users to take advantage of lower storage overhead and high fault tolerance. The feature maintains compatibility with operations such as resize, migration, Template creation, and Snapshot management.

Deploy Instance from Existing Volume or Snapshot (KVM)

Users can now deploy a new Instance directly from an existing Volume or Snapshot, streamlining provisioning workflows without requiring Template creation. This feature is currently available for KVM hypervisors and Zone-wide Primary Storage Pools.
