What Is Apache CloudStack™
Apache CloudStack™ is an open source software platform that pools computing resources to build Public, Private, and Hybrid Infrastructure as a Service (IaaS) Clouds. Apache CloudStack manages the Network, Storage, and Compute nodes that make up a Cloud infrastructure.
The Story So Far.
CloudStack started life as VMOps, a company founded in 2008 with product development spearheaded by Sheng Liang, who had led development of the Java Virtual Machine at Sun. Whilst early versions were very much focused on the Xen Hypervisor, the team realised the benefits of multi-hypervisor support. In early 2010, the company achieved a massive marketing win when it acquired the domain name cloud.com and formally launched CloudStack, which was 98% open source. In July 2011, the company, by then trading as Cloud.com, was acquired by Citrix Systems, who released the remaining code as open source under GPLv3.
The big news came in April 2012, when Citrix donated CloudStack to the Apache Software Foundation, where it was accepted into the Apache Incubator. At the same time, Citrix also ceased its involvement in the OpenStack Initiative. Apache CloudStack has now been promoted to a Top-Level Project of the Apache Software Foundation, a measure of the maturity of the code and its community.
CloudStack works within multiple enterprise strategies and mandates, as well as supporting multiple cloud strategies from a service provider perspective. As an initial step beyond traditional server virtualization, many organizations are looking to private cloud implementations as a means to satisfy flexibility while still retaining control over service delivery. The private cloud may be hosted by the IT organization itself, or sourced from a managed service provider, but the net goals of total control and security without compromising SLAs are achieved.
For some organizations, the managed service model is stepped up one level, with all resources sourced from a hosted solution. SLA guarantees and security concerns often dictate the types of providers an enterprise will look towards. At the far end of the spectrum are public cloud providers with pay-as-you-go pricing structures and elastic scaling. Since public clouds often abstract details such as network topology, a hybrid cloud strategy allows IT to retain control over key aspects of its operations, such as data, while leveraging the benefits of elastic public cloud capacity.
Multiple Hypervisor Support
CloudStack works with a variety of hypervisors, and a single cloud deployment can contain multiple hypervisor implementations. The current release of CloudStack supports pre-packaged enterprise solutions like Citrix XenServer and VMware vSphere, as well as OVM and KVM or Xen running on Ubuntu or CentOS. Support for Hyper-V is currently being developed and should be available in a future release.
Massively Scalable Infrastructure Management
CloudStack can manage tens of thousands of host servers installed in multiple geographically distributed datacentres. The centralized management server scales linearly, eliminating the need for intermediate cluster-level management servers. No single component failure can cause a cloud-wide outage. Periodic maintenance of the management server can be performed without affecting the functioning of virtual machines running in the cloud.
Automatic Configuration Management
CloudStack automatically configures each guest virtual machine’s networking and storage settings. CloudStack internally manages a pool of virtual appliances to support the cloud itself. These appliances offer services such as firewalling, routing, DHCP, VPN, console access, storage access, and storage replication. The extensive use of virtual appliances simplifies the installation, configuration, and ongoing management of a CloudStack deployment.
Graphical User Interface
CloudStack offers an administrator’s Web interface, used for provisioning and managing the cloud, as well as an end-user’s Web interface, used for running VMs and managing VM templates. The UI can be customized to reflect the desired service provider or enterprise look and feel.
API and Extensibility
CloudStack provides an API that gives programmatic access to all the management features available in the UI. This API enables the creation of command line tools and new user interfaces to suit particular needs. The CloudStack pluggable allocation architecture allows the creation of new types of allocators for the selection of storage and Hosts.
CloudStack can translate Amazon Web Services (AWS) EC2 & S3 API calls to native CloudStack API calls so that users can continue using existing AWS-compatible tools. CloudMonkey is a Command Line Interface (CLI) for CloudStack written in Python. CloudMonkey brings the ability to easily create scripts to automate complex or repetitive admin and management tasks from simply adding multiple users, to deploying a complete CloudStack architecture.
More information on CloudMonkey can be found at http://goo.gl/ESp8ha
Access to the API, either directly or via CloudMonkey, is protected by a combination of API and Secret Keys and a Signature Hash. Users can regenerate random API and Secret Keys (as well as their UI Password) at any time, providing maximum security and peace of mind.
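The key-and-signature scheme described above can be illustrated with a short Python sketch. It follows the commonly documented CloudStack approach (URL-encode the values, sort the parameters, lower-case the string, HMAC-SHA1 it with the Secret Key, then Base64-encode the result), but it should be read as an illustration rather than a reference implementation. The function name `sign_request` and the sample keys are assumptions made up for this example.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_request(params, api_key, secret_key):
    """Return a signed query string for a CloudStack-style API call (sketch)."""
    # Add the API key, then URL-encode every value.
    all_params = dict(params, apiKey=api_key)
    encoded = {k: quote(str(v), safe="") for k, v in all_params.items()}

    # Sort by lower-cased field name and lower-case the whole string before hashing.
    to_sign = "&".join(
        f"{k}={v}"
        for k, v in sorted(encoded.items(), key=lambda kv: kv[0].lower())
    ).lower()

    # HMAC-SHA1 with the Secret Key, Base64-encoded, then URL-encoded.
    digest = hmac.new(secret_key.encode(), to_sign.encode(), hashlib.sha1).digest()
    signature = quote(base64.b64encode(digest).decode(), safe="")

    query = "&".join(f"{k}={v}" for k, v in encoded.items())
    return f"{query}&signature={signature}"


# Hypothetical keys, for illustration only.
signed = sign_request(
    {"command": "listUsers", "response": "json"}, "demo-api-key", "demo-secret"
)
```

Because the signature depends on the Secret Key, regenerating the keys immediately invalidates any request signed with the old pair.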
CloudStack has a number of features to increase the availability of the system. The Management Server itself may be deployed in a multi-node installation where the servers are load balanced across data centres. MySQL may be configured to use replication to provide for a failover in the event of database loss. For the hosts, CloudStack supports NIC bonding and the use of separate networks for storage as well as iSCSI Multipath.
CloudStack Deployment Architecture
CloudStack has 6 key Building Blocks:
Regions are very similar to an AWS Region, and are the 1st and largest unit of scale for a CloudStack Cloud. A Region consists of multiple Availability Zones, which are the 2nd largest unit of scale. Typically there is one Zone per Data Centre, and each Zone contains PODs, Clusters, Hosts and Storage. A Cloud can contain multiple Regions, and even if one region should go offline, VMs in other Regions are still accessible as each Region has dedicated Management Servers, located in one or more of its Zones.
PODs are the 3rd unit of scale, and are often a single rack which houses Networking, Compute and Storage. PODs also have logical as well as physical properties, with components such as IP Addressing and VM allocations being influenced by the PODs within a Zone.
Clusters are the 4th unit of scale, and are simply groups of homogeneous Compute hardware combined with Primary Storage. Each Cluster will run a common Hypervisor, but a Zone can consist of combinations of all of the supported Hypervisors.
Hosts are the 5th unit of scale and provide the actual compute layer on which Virtual Machines will run.
Storage is the final building block and there are two key types within CloudStack, Primary and Secondary. Primary Storage is where Virtual Machines reside, and can be Local Storage within a Compute Host or Shared File/Block Storage using NFS, iSCSI, Fibre Channel etc.
Secondary Storage is where Virtual Machine Templates, ISO images, and Snapshots reside and is currently always presented over NFS. Swift can also be used to replicate Secondary Storage between Zones, ensuring users always have access to their Snapshots even if a Zone is offline. There is a lot of development work currently underway with regards to Storage and some great new features coming in the next release of CloudStack thanks to a new Storage Subsystem.
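The nesting of the building blocks described above can be pictured as a simple containment hierarchy. The Python sketch below is only a mental model: the class and field names are assumptions chosen for illustration and do not mirror CloudStack's internal object model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str


@dataclass
class Cluster:
    name: str
    hypervisor: str        # every Host in a Cluster runs the same hypervisor
    primary_storage: str   # e.g. "NFS", "iSCSI" or "Local"
    hosts: List[Host] = field(default_factory=list)


@dataclass
class Pod:
    name: str              # often a single rack
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    name: str              # typically one per Data Centre
    secondary_storage: str
    pods: List[Pod] = field(default_factory=list)


@dataclass
class Region:
    name: str
    zones: List[Zone] = field(default_factory=list)

    def host_count(self) -> int:
        """Count every Host in every Cluster, POD and Zone of this Region."""
        return sum(
            len(cluster.hosts)
            for zone in self.zones
            for pod in zone.pods
            for cluster in pod.clusters
        )


# Example: one Zone with a mixed-hypervisor POD (names are made up).
region = Region("eu-west", zones=[
    Zone("london-dc1", "NFS", pods=[
        Pod("rack-1", clusters=[
            Cluster("kvm-a", "KVM", "NFS", hosts=[Host("h1"), Host("h2")]),
            Cluster("xen-a", "XenServer", "iSCSI", hosts=[Host("h3")]),
        ]),
    ]),
])
```

Note how the hypervisor is a property of the Cluster, while a single Zone can hold Clusters of different hypervisor types.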
The ‘Glue’ that brings all of the building blocks together is the Network layer. CloudStack has two principal models for Networking, referred to as Basic and Advanced. Basic Networking is very similar to the model used by AWS, and can be deployed in 3 slightly different ways, with each adding to the features of the previous.
A true ‘Flat’ network where all VMs share a common Network Range with no form of Isolation.
Using Security Groups which utilise Layer-3 IP Address Filtering to isolate VMs from one another.
Elastic IP and Elastic Load Balancing – A Citrix NetScaler provides Public IP and Load Balancing functionality, and is completely orchestrated by CloudStack.
All three of these Basic Network models allow massive scale as the IP range used by VMs is contained within a POD. The Zone can be scaled horizontally by simply adding more PODs, consisting of Clusters of Hosts and their associated Top of Rack Switching and Primary Storage.
The Advanced Networking model brings a raft of features which place a massive amount of power right into the hands of the end users. VLANs are the standard method of isolation but Software Defined Networking (SDN) offerings from Nicira, BigSwitch and soon Midokura bring the possibility of massive scale by overcoming any VLAN limitations.
CloudStack makes excellent use of System Virtual Machines to provide control and automation of Storage and Networking. One such System VM is the CloudStack Virtual Router. The key difference from a Basic Network is that, in Advanced mode, users can create CloudStack Guest Networks, with each Network having a dedicated Virtual Router.
This innocuous sounding VM provides the following features: DNS & DHCP, Firewall, Client IPSEC VPN, Load Balancing, Source / Static NAT, Port Forwarding, and all of them are configurable by end users from either the GUI or the CloudStack API.
When a user creates a new Guest Network, and then deploys Guest VMs onto that Network, the VMs are attached to a dedicated L2 Broadcast Domain, isolated by VLANs and fronted by a Virtual Router. Users have full control of all traffic entering and leaving the network, with a direct connection to the Public Internet.
Firewall and Port Forwarding rules enable the mapping of Live IPs to any number of Internal VMs. Load Balancing functionality with Round-Robin, Least Connections and Source Based Algorithms along with Source Based, App Cookie or LB Cookie Stickiness Policies is available straight out of the box.
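To make the three balancing algorithms named above concrete, here is a minimal Python sketch of how each one picks a backend VM. It illustrates the general techniques only; it is not how the Virtual Router implements them, and the `Balancer` class and its method names are assumptions invented for this example.

```python
import hashlib
from itertools import count


class Balancer:
    """Toy load balancer showing three backend-selection strategies."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.active = {b: 0 for b in self.backends}  # open connections per backend
        self._ticket = count()

    def round_robin(self):
        # Hand out backends in a fixed rotation.
        return self.backends[next(self._ticket) % len(self.backends)]

    def least_connections(self):
        # Pick the backend currently serving the fewest connections.
        return min(self.backends, key=lambda b: self.active[b])

    def source_based(self, client_ip):
        # Hash the client address so the same client always lands on the same backend.
        digest = hashlib.sha256(client_ip.encode()).hexdigest()
        return self.backends[int(digest, 16) % len(self.backends)]


lb = Balancer(["vm1", "vm2", "vm3"])
first_four = [lb.round_robin() for _ in range(4)]  # rotation wraps back to vm1
```

Source-based selection is itself a simple form of stickiness, whereas cookie-based stickiness policies keep a session on one VM regardless of the client's address.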
Another powerful feature of the Advanced Network model is the Virtual Private Cloud (VPC). A VPC enables the user to create a multi-tiered network configuration, placing VMs within their own VLAN. ACLs enable the users to control the flow of traffic between each Network Tier and also the Internet. A typical VPC may contain 3 Network Tiers, Web, App and DB, with only the Web Tier having Internet Access.
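The three-tier example above can be expressed as a tiny default-deny rule table. This Python sketch is deliberately simplified: real ACL rules are written in terms of CIDRs, protocols and port ranges, whereas here tier names stand in for address ranges. The `AclRule` type, the port numbers and the `is_allowed` helper are assumptions made for illustration.

```python
from typing import NamedTuple, Sequence


class AclRule(NamedTuple):
    source: str   # tier name, or "internet"
    dest: str     # tier name
    port: int


# Only the Web Tier is reachable from the Internet; each tier talks to the next.
ACL = [
    AclRule("internet", "web", 443),
    AclRule("web", "app", 8080),
    AclRule("app", "db", 3306),
]


def is_allowed(source: str, dest: str, port: int, rules: Sequence[AclRule] = ACL) -> bool:
    """Default-deny check: traffic passes only if an explicit rule matches."""
    return AclRule(source, dest, port) in rules
```

With this table, `is_allowed("internet", "db", 3306)` is false, which is the property that keeps the DB Tier off the Internet.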
VPCs also bring additional features such as Site-2-Site VPN, enabling a persistent connection with infrastructure running in alternate locations such as other Data Centres or even alternate Clouds. A VPC Private Gateway is a feature the Cloud Admins can leverage to provide a 2nd Gateway out of the VPC Virtual Router. The connection can be used to connect the VMs running within the VPC to other infrastructure via, for example, an MPLS Network rather than over the Public Internet.
CloudStack optimises the use of the underlying network architecture within a DC by enabling the Cloud Admins to split up the various types of Network Traffic and map them to different sets of Bonded NICs within each Compute Host.
There are four types of Physical Network which can be configured, and they can be set up to all use a single NIC, or multiple bonds, depending on how many NICs are available in the Host Server. The four networks are:
Management: Used by the CloudStack Management Servers and various other components within the system, sometimes referred to as the Orchestration Network.
Guest: Used by all Guest VMs when communicating with other Guest VMs or Gateway Devices such as Virtual Routers, Juniper SRX Firewalls, F5 Load Balancers etc. In an Advanced configuration, multiple Guest Networks can be created, allowing certain NICs to be dedicated to a particular user or function.
Public: In an Advanced Network configuration the Public Network connects the Virtual Routers to the Public Internet. It only exists in a Basic Network when a Citrix NetScaler is used to provide Elastic IP and Elastic LB services.
Storage: Used by the special Secondary Storage System VM and Host Servers when connecting to Secondary Storage devices. It enables the optimisation of traffic used for deploying new VMs from Templates, and in particular for handling Snapshot traffic which can get network intensive, without negatively impacting the Guest & Management Traffic.
The traffic associated with Primary Storage, where the actual VMs reside, can also be split out onto dedicated NICs or HBAs etc, again allowing for optimal performance and High Availability.
Network Service Providers
In addition to the Virtual Router and VPC Virtual Router, CloudStack can also leverage the power of real hardware, bringing even more functionality and greater scale. Currently supported devices are Citrix NetScaler, F5 BIG-IP, and Juniper SRX, with many more on the way. Once a device has been integrated by the Cloud Admins, the users have control of the features via the standard GUI or API. For example, if a Juniper SRX is deployed, when a user configures a Firewall Rule within the CloudStack UI, CloudStack uses the Juniper API to apply that configuration on the physical SRX.
When a Citrix NetScaler is deployed, in addition to Load Balancing, NAT & Port Forwarding it also enables AutoScaling. AutoScaling is a method of monitoring the performance of your existing Guest VMs, and then automatically deploying new VMs as the load increases. After the load has dropped off the extra VMs can be destroyed, bringing your usage, and costs back down to a base level. This level of flexibility and scalability is a key driving force in the adoption of cloud computing.
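The AutoScaling behaviour described above boils down to a threshold rule. The Python sketch below shows one plausible form of that decision, assuming average CPU load is the monitored metric; the function name, thresholds and limits are illustrative assumptions, not CloudStack or NetScaler defaults.

```python
def desired_vm_count(current, avg_cpu, *, scale_up_at=80.0, scale_down_at=30.0,
                     min_vms=2, max_vms=10):
    """Return how many VMs should be running given the current average CPU load (%)."""
    if avg_cpu > scale_up_at:
        target = current + 1      # load is high: deploy one more VM
    elif avg_cpu < scale_down_at:
        target = current - 1      # load has dropped off: destroy one VM
    else:
        target = current          # within the comfortable band: no change
    # Never shrink below the base level or grow past the cap.
    return max(min_vms, min(max_vms, target))
```

The `min_vms` floor corresponds to the "base level" mentioned above, and the `max_vms` cap bounds the cost of a runaway load spike.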
CloudStack is actually quite easy to set up and administer thanks to its great Graphical User Interface, API and CLI tools such as CloudMonkey. A Wizard takes you through the configuration and deployment of your first Zone, Networking, POD, Cluster, Host and Storage, meaning you can be up and running within a matter of hours.
A simple Role Based Access Control (RBAC) system presents the different levels of users with the features they are entitled to, and the standard allocations can be fine-tuned as required. Authentication can also be passed off to LDAP, enabling integration with Enterprise systems including OpenLDAP and MS Active Directory.
Admins set up new User Accounts which are grouped together into Domains, allowing a hierarchical structure to be built up. By grouping users into Domains, Admins can make certain sub-sets of the infrastructure available to a particular group of users.
A set of system parameters called Global Settings allows the Admins to control all of the features and set up controls like limits and thresholds, SMTP alerts and a whole host of other settings, again all from an easy to use GUI.
Service Offerings enable Admins to set up the parameters which control the end user environment, such as number of vCPUs, RAM, Network Bandwidth and Features, Preferred Hardware based on VM Operating System, Tiered Storage and much more.
Admins have full control over the infrastructure, and can initiate the live migration of any VM, between Hosts in the same Cluster. Stopped VMs can be migrated across different Clusters by moving their associated Volumes to different storage. Storage devices and Hosts can be taken offline for Maintenance and upgrades, and admins can steer VMs to a particular set of Hosts using either the API or Tags.
A big selling point of CloudStack is its well thought out Graphical User Interface. The majority of the features available to end users are available via the GUI, with just a few of the more advanced, newer features available only via the API. Because of this easy to learn GUI, new users can get their first VMs up and running within a matter of minutes of their first login.
The process for creating a new VM is handled by a very intuitive graphical wizard which steps you through the process in 6 easy steps:
- Choose the Availability Zone
- Choose a pre-built Template or mount an ISO for a full custom install
- Choose the Compute Offering which controls the amount of CPU, RAM, Network Bandwidth, & Storage Tier
- Add an additional Data Volume and set its size
- Add to an existing Network or a VPC, or if none are available create a new Network automatically
- Allocate a name which will also be used as the VM's Hostname, then launch the VM
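The same six choices map naturally onto a single API request. The Python sketch below assembles the parameters for a `deployVirtualMachine` call; the parameter names follow the CloudStack API as commonly documented, but treat them as assumptions to check against your release. The helper `build_deploy_params` and all the IDs are hypothetical.

```python
def build_deploy_params(zone_id, template_id, compute_offering_id, name,
                        disk_offering_id=None, disk_size_gb=None, network_ids=()):
    """Collect the wizard's choices into one API parameter dictionary."""
    params = {
        "command": "deployVirtualMachine",
        "zoneid": zone_id,                          # Availability Zone
        "templateid": template_id,                  # Template or ISO
        "serviceofferingid": compute_offering_id,   # Compute Offering
        "name": name,                               # also used as the hostname
    }
    if disk_offering_id is not None:
        params["diskofferingid"] = disk_offering_id  # optional Data Volume
        if disk_size_gb is not None:
            params["size"] = disk_size_gb
    if network_ids:
        params["networkids"] = ",".join(network_ids)  # existing Network(s) or VPC tier
    return params


# Placeholder IDs, for illustration only.
request = build_deploy_params("zone-1", "tmpl-centos", "offer-small", "web01",
                              disk_offering_id="disk-custom", disk_size_gb=20,
                              network_ids=["net-a"])
```

A dictionary like this would then be signed and sent to the Management Server, or passed to a CLI tool such as CloudMonkey.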
Once the users have their VMs up and running they can then start to explore the other features available to them. Snapshots provide a simple and effective way for a user to protect their VMs by taking instant Snapshots of any Disk Volume, or setting up an automatic schedule, such as Hourly, Daily, Weekly etc.
Custom private Templates can be created from any Root Volume or its associated Snapshot, enabling quick and easy replication of a particular VM should multiple instances be required. Data volumes can easily be un-mounted from one VM, and mounted to another VM in a matter of seconds.
Volumes, Snapshots and Templates can all be exported from the Cloud, and then used to re-create the user environment within another Cloud, alleviating concerns of getting locked in to a particular provider.
Why Choose CloudStack?
CloudStack has a proven track record in both the Enterprise and Service Provider space, with some of the world’s largest Clouds built on its technology. I have personally been involved in a large number of implementations on 3 different continents, and whilst any large IT Project will hit a few bumps along the road, all the implementations came in on time. This is because of the mature nature of the product, and a set of well-developed design and deployment methodologies.
Unlike other open source Cloud technologies, CloudStack is truly a single Project, with a common set of objectives and goals, being driven by a very active and passionate community. The list of new features being developed is truly staggering, a few examples are:
- A new Storage Framework – bringing better control over storage, allowing Primary Storage to stretch across a whole DC, and IOPs to be controlled at VM level.
- XenServer XenMotion enabling live migration of VM Volumes.
- Dedicated Resources – Allows a sub set of the infrastructure to be dedicated to a particular user, removing all the anti-cloud arguments referring to Shared Compute/Network/Storage etc.
- Support for Cisco Virtual Network Management Center (VNMC).
- Multiple IPs per Virtual NIC – ideal for Web Server VMs with multiple SSL Certificates.
- S3 Backed Secondary Storage – Enables Secondary Storage to stretch across a whole Region.
- Dynamic Scaling of CPU & RAM – Enables a user to dynamically increase or decrease the amount of CPU & RAM available to a VM.
- Support for Midokura Software Defined Networking.
- Additional Isolation within a VLAN – Using either PVLANs (VMware) or Security Groups (Xen and KVM) VMs on a common VLAN can be isolated enabling multi-tiered Guest Networks to be built on a single VLAN.
Strengths of CloudStack
- Proven Massive Scalability – Real Clouds with > 50,000 Hosts already in production
- Production deployment up and running in a matter of days, not months
- Excellent Documentation
- Fully supported upgrade path from all previous versions
- Polished web based Graphical User Interface
- Console Access for VMs
- Single coherent project with common vision to build the best IaaS Platform
- Support for multiple SDNs
- No need for large teams of DevOps staff to deploy and manage
- Backed by Apache Software Foundation
- AWS Compatibility
About the Author
Geoff Higginbottom is an Apache CloudStack Committer and CTO of ShapeBlue, the strategic cloud consultancy. Geoff spends most of his time designing private & public cloud infrastructures for telcos, ISPs and enterprises based on Apache CloudStack and Citrix CloudPlatform.