Apache CloudStack for Beginners – Part 2: CloudStack Architecture and Key Components

In our first post in this series, we introduced Apache CloudStack and explained why it’s a great cloud platform for building and managing IaaS clouds. We looked at its main capabilities—like launching virtual machine instances, managing networking and storage, using the web-based UI and REST-like API for automation, and deploying on top of existing infrastructure with minimal overhead.

Now, let’s dive a little deeper.

This post will explore how CloudStack is structured—its architecture and components. Don’t worry if these terms are new. We’ll explain everything in plain language and link to useful videos so you can see these concepts in action.

Why Architecture Matters

CloudStack may look simple from the user interface, but under the hood, it follows a layered architecture with clearly defined components and responsibilities. While it is delivered as a monolithic application, its internal structure separates concerns like compute orchestration, storage, and networking, making it scalable and easy to operate.

Understanding how it’s built will help you:

- Set up your own CloudStack test environment with confidence
- See how resources like virtual machine instances, networks, and storage fit together
- Troubleshoot issues more effectively by knowing where each function takes place

Let’s break it down, piece by piece.

The CloudStack Hierarchical Architecture

CloudStack organises all the physical and virtual resources of your cloud into a clear hierarchy. At the top of this structure is the Region, which can represent a large geographic area. At the bottom is the Host, a physical server that runs your virtual machine instances.

This layered structure helps CloudStack manage everything efficiently—whether you’re running just a few servers or scaling up to thousands spread across multiple geographically distributed data centres.

We’ll now walk through each part of the CloudStack architecture—from the top-level Region to the physical Host. Understanding how these layers work together will help you see how CloudStack organises and controls your entire cloud infrastructure.

Region

The highest level of organisation—you can think of a Region as a logical grouping of cloud infrastructure, often representing a geographic location or an isolated administrative domain.
Every CloudStack deployment starts with a default Region (ID 1), and in most setups, a single Region is all you need.
A Region includes one or more Zones and shares a single Management Server database, which means all Zones in that Region are centrally managed.
Additional Regions are only used in advanced scenarios—such as when you want to operate completely independent cloud environments across different countries or data sovereignty zones.

Availability Zone

A Zone usually represents a single data centre—a physical location where your servers, storage, and networking equipment are hosted.
Each Zone contains at least one Pod, where compute resources (Hosts and Clusters) are located.
It also includes a shared storage component used to hold reusable assets like virtual machine templates and ISO images.
Zones are completely isolated from each other in terms of networking. This means that virtual machine instances in different Zones can’t communicate directly.
Most CloudStack environments start with a single Zone, but you can add more to spread workloads across different locations, improve availability, or separate environments for specific use cases.

Pod

A Pod usually represents a group of servers placed together in the same part of a data centre, often in the same rack or connected to the same network switch.
Each Pod contains one or more Clusters, where your compute resources (Hosts) live.
All Hosts in a Pod must be connected to the same local network, so they can talk to each other directly without going through a router.

Cluster

A Cluster is a group of physical servers (Hosts) that all run the same hypervisor type—like KVM, VMware, or XCP-ng.
These Hosts share access to the same Primary Storage, where the virtual machine instances’ disks are stored.
Because the Hosts are connected and compatible, CloudStack can move virtual machine instances between them automatically—for example, to balance resource usage, recover from a failure, or when you place a Host into maintenance mode for upgrades or repairs.
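
To make the hierarchy concrete, here is a minimal Python sketch that models a Zone, Pod, Cluster, and Host as nested records. The class and field names are illustrative assumptions, not CloudStack's internal data model. The only rule it encodes is the one stated above: every Host in a Cluster runs the same hypervisor type.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str
    hypervisor: str  # e.g. "KVM", "VMware", "XCP-ng"


@dataclass
class Cluster:
    name: str
    hypervisor: str
    hosts: List[Host] = field(default_factory=list)

    def add_host(self, host: Host) -> None:
        # A Cluster groups Hosts of a single hypervisor type.
        if host.hypervisor != self.hypervisor:
            raise ValueError(
                f"{host.name} runs {host.hypervisor}, "
                f"but cluster {self.name} is {self.hypervisor}"
            )
        self.hosts.append(host)


@dataclass
class Pod:
    name: str
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    name: str
    pods: List[Pod] = field(default_factory=list)

    def all_hosts(self) -> List[Host]:
        # Walk Zone -> Pod -> Cluster -> Host.
        return [h for p in self.pods for c in p.clusters for h in c.hosts]
```

Adding a VMware Host to a KVM Cluster raises an error in this sketch, which mirrors why a mixed-hypervisor cloud needs more than one Cluster.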

Host

A Host is a physical server running a hypervisor like KVM, XenServer, or VMware.
This is where virtual machine instances are actually created, started, and executed.
How CloudStack communicates with the Host depends on the hypervisor:
- For KVM, an Agent runs directly on the Host to receive instructions.
- For XCP-ng/XenServer, CloudStack uses the built-in XenAPI.
- For VMware, CloudStack talks to the vCenter server, which manages the ESXi Hosts.
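
The list above amounts to a small lookup table. The Python sketch below expresses it that way; the dictionary and function names are made up for illustration and do not correspond to anything in the CloudStack code base.

```python
# Control channel per hypervisor, as described in the list above.
CONTROL_CHANNEL = {
    "kvm": "CloudStack Agent on the Host",
    "xcp-ng": "XenAPI",
    "xenserver": "XenAPI",
    "vmware": "vCenter",
}


def control_channel(hypervisor: str) -> str:
    """Return how the Management Server reaches a Host of this type."""
    try:
        return CONTROL_CHANNEL[hypervisor.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown hypervisor: {hypervisor!r}") from None
```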

Primary Storage

This is the storage used for running virtual machine instances. It stores the root volumes (the virtual disks that contain the OS) and any additional data volumes attached to the instances.
Primary Storage is usually associated with a Cluster, meaning each Cluster can have its own storage system. However, depending on the hypervisor and setup, storage can also be shared across multiple Clusters or configured at the Zone level.
CloudStack also supports local storage, where each Host uses its own disks to store virtual machine volumes.
Supported storage backends include NFS, iSCSI, shared block storage, and software-defined storage (SDS) platforms—essentially, any storage supported by the underlying hypervisor.

Secondary Storage

Secondary Storage is used to store reusable and backup data, such as virtual machine templates, ISO images, and snapshots of virtual machine instance volumes.
It is shared across the entire Zone, so all Clusters and Pods within that Zone can access the same images and backups.
The standard and supported backend for Secondary Storage is NFS (Network File System).
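
The scoping rules for Primary Storage can be sketched in a few lines of Python. This is an illustrative model only: the `PrimaryPool` record, its `scope` values, and the `pools_visible_to` helper are assumptions made for this example, and the sketch assumes a pool is never used outside its own Zone.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PrimaryPool:
    name: str
    scope: str  # "zone", "cluster", or "host" (local storage)
    zone: str
    cluster: Optional[str] = None
    host: Optional[str] = None


def pools_visible_to(host: str, cluster: str, zone: str,
                     pools: List[PrimaryPool]) -> List[str]:
    """Names of the Primary Storage pools a given Host could use."""
    visible = []
    for p in pools:
        if p.zone != zone:
            continue  # assumption of this sketch: pools stay within a Zone
        if (p.scope == "zone"
                or (p.scope == "cluster" and p.cluster == cluster)
                or (p.scope == "host" and p.host == host)):
            visible.append(p.name)
    return visible
```

A Host therefore sees zone-wide pools, the pools of its own Cluster, and its own local disks, but not the local disks of a neighbouring Host.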

CloudStack Key Components

Here are some of the main software components that make CloudStack work behind the scenes:

Management Server

This is the central brain of CloudStack.
It handles all orchestration tasks: provisioning virtual machine instances, configuring networks, managing storage, applying user permissions, tracking resource usage, and much more.
Most CloudStack environments only require a single Management Server, but it can be scaled horizontally for high availability and performance by running several Management Servers behind a load balancer, all using a shared database.

Database

Apache CloudStack uses a relational database (typically MySQL) to store all configuration data, user information, resource states, and system logs.
The database is a critical component—it holds the “memory” of your cloud. The Management Server interacts with it constantly to track the state of virtual machine instances, networks, storage, and more.
In production environments, the database can be replicated and backed up regularly to ensure high availability and disaster recovery.
For beginners or lab setups, a single-node installation usually comes with a local database preconfigured.

CloudStack Agent

Used in KVM-based environments, the Agent runs directly on each Host.
It receives instructions from the Management Server, such as “start this VM instance” or “attach this volume”, and executes them on the Host.
For other hypervisors, such as VMware and XCP-ng, CloudStack uses their native APIs (vCenter and XenAPI, respectively) instead of an on-host agent.

API & User Interface

Every feature in CloudStack is accessible through a REST-like API, which allows developers and tools to automate tasks such as deploying virtual machine instances, configuring networks, and managing storage.
The web-based UI is built on top of the same API, offering an intuitive, graphical interface to perform these tasks without writing code.
Because the API is open and well-documented, CloudStack can be integrated with popular external tools such as Terraform, ClusterAPI, Ansible, and Packer—allowing users to provision and manage infrastructure using their preferred automation workflows.
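
To give a feel for what “REST-like” means in practice, the Python sketch below builds a signed request URL using only the standard library. It follows the signing scheme as commonly described for CloudStack: sort the parameters, URL-encode the values, lower-case the query string, sign it with HMAC-SHA1 using the user's secret key, and Base64-encode the result. Treat it as a sketch: the `signed_url` helper is hypothetical, the keys and endpoint shown are placeholders, and URL-encoding edge cases may differ from what the server expects, so check the API documentation before relying on it.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_url(endpoint: str, params: dict, api_key: str, secret_key: str) -> str:
    """Build a signed CloudStack API URL (no network call is made)."""
    query = dict(params, apikey=api_key, response="json")
    # Sort by parameter name and URL-encode each value.
    pairs = sorted(
        ((k, quote(str(v), safe="")) for k, v in query.items()),
        key=lambda kv: kv[0].lower(),
    )
    query_string = "&".join(f"{k}={v}" for k, v in pairs)
    # Sign the lower-cased query string with HMAC-SHA1, then Base64-encode.
    digest = hmac.new(
        secret_key.encode(), query_string.lower().encode(), hashlib.sha1
    ).digest()
    signature = quote(base64.b64encode(digest).decode(), safe="")
    return f"{endpoint}?{query_string}&signature={signature}"


# Placeholder endpoint and credentials, for illustration only.
url = signed_url(
    "http://localhost:8080/client/api",
    {"command": "listZones"},
    api_key="YOUR_API_KEY",
    secret_key="YOUR_SECRET_KEY",
)
```

The resulting URL can be fetched with any HTTP client. Tools such as Terraform and Ansible perform an equivalent step internally when they talk to CloudStack.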

Bringing It All Together

Once you understand CloudStack’s basic structure, the rest starts to make sense. When a user launches a virtual machine instance:

- Within the Zone chosen for the deployment, CloudStack automatically selects an appropriate Pod, Cluster, and Host based on available resources and policies.
- It provisions the required compute and storage, such as CPUs, memory, and volumes.
- It attaches the instance to the correct network, including firewalls and load balancers if needed.

All of this happens behind the scenes, managed by CloudStack’s powerful orchestration engine, so users don’t have to worry about the underlying complexity.
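
The Host-selection step can be illustrated with a deliberately simple Python sketch. This is not CloudStack's actual allocation logic, which is more sophisticated and configurable. The `HostCapacity` record and `pick_host` function are hypothetical, and the policy used here (pick the fitting Host with the most free memory) is just one plausible choice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HostCapacity:
    name: str
    free_cpus: int
    free_memory_mb: int


def pick_host(hosts: List[HostCapacity], cpus: int,
              memory_mb: int) -> Optional[HostCapacity]:
    """Return the fitting Host with the most free memory, or None."""
    candidates = [
        h for h in hosts
        if h.free_cpus >= cpus and h.free_memory_mb >= memory_mb
    ]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda h: h.free_memory_mb)
    # Reserve the resources so the next placement sees updated capacity.
    chosen.free_cpus -= cpus
    chosen.free_memory_mb -= memory_mb
    return chosen
```

If no Host has enough free capacity, the function returns `None`; a real orchestrator would report that the deployment cannot be placed.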

Pretty neat, right?

What’s Next?

Now that you understand the architecture and terminology, you’re well on your way to becoming a CloudStack pro. In the next part of our series, we’ll walk you through how to get started with Apache CloudStack—including where to find downloads, how to spin up a lab environment, and how to connect with the community.

Antonia Shehova

Antonia is a dedicated Marketing Assistant who genuinely enjoys her work. Since joining the team, she has been actively involved in the marketing activities.

In her spare time, Antonia enjoys spending time with her family and discovering new places.