Tag Archive for: Apache CloudStack

Anurag Awasthi shares his experiences after one month in the ShapeBlue Engineering team

Let me start by briefly introducing myself. I hold a bachelors and master’s in computer science and engineering from the Indian Institute of Technology, Kanpur. I have been working in technology for more than 5 years now and have had the privilege of working with some world class scientists and engineers at various organizations. Some of my past employers are Microsoft Research Redmond, Pocketgems Inc, and Twitter Inc. During this time, I have strived to be a generalist and explore full stack development in a consumer-focused product development. In my free time I enjoy going on hikes, camping, painting and I experiment cooking different cuisines.

In late 2017, after spending 5 years in a heavy consumer focused product development, I wanted to expand my horizons, so I worked on my own startup to explore the world of cloud computing. It worked with moderate success for some time before I ran into an old friend of mine who had now become a PMC member in Apache CloudStack. That’s when I first heard of ShapeBlue – a company which was working very actively and contributing directly to the open source community of Apache CloudStack. The company works in distributed teams and is working with some big names in the industry as its customers. It sounded a little too good to be true. I had an interview with the company, and it became obvious that the subtitle of the company “the Cloud Specialists” wasn’t just a sales pitch but accurately reflected the true horsepower of expertise that the company carried. I saw incredible opportunities to grow technically and peers to learn from, and I joined within 4 days of the interview!

Having spent a little more than a month at the company I think I have reached a place where I can provide a comparison between a company focusing on opensource community development vs. a big name private company focusing on proprietary software. I will also try to highlight some of the pros and cons of working full time in a distributed company such as ShapeBlue. Two primary things that any engineering focused person would be concerned about are the development process (challenges, rewards, etc) and team environment (interactions with peers, support and expectations, etc).

Development Process

Overall, there are many similarities between the development process in an open source project and a traditional organization. A feature begins with a feature request that needs to be presented in a specific format, followed by discussions on proposed solutions, implementation, code reviewer approval and merge. The difference lies in the challenges that arise due to challenges with communication between collaborating parties, and time consumed because of a globally distributed team.

Technologies involved in an open source project are truly diverse, and multiple choices are available at each stage of a feature’s development. Naturally, each choice comes with its own challenges as one is responsible for the due-diligence work. ShapeBlue has its own CI test environment and test engineers, but the community as a whole does not have a fixed approach to testing. As one is directly contributing to the community project, the necessity of one’s own due diligence and initiative cannot be emphasized enough. Even for a simple task such as machine setup, one needs to explore options and wisely set up a preferred system for development. This is unlike a larger organization, which mostly would have a pre-tested and often pre-configured system ready for development, and managed by a separate IT department. This becomes a fun experience on its own when one jumps into development of an open source project that touches a multitude of system components like CloudStack. I started on a MacBook, but very soon switched to ThinkPad / Ubuntu because of the better community support available on the latter platform and the limit of RAM on Macbooks.

The Apache Way

Each Apache project community has its own set of principles and are determined by the Apache way! Since it’s not just managers controlling the flow of a project, feature development is usually slower than private development too. Each discussion requires genuine intentions and effort to be heard, and the necessary feedback from the community. This can be frustrating for the impatient ones, but following the discussions in turn brings in new learnings. It also means that any feature gets reviewed by a broad set of people. So, it has its advantages.

From Consumer to Infra

The true challenge lies when one transitions from a consumer product to an infra product, such as CloudStack. As one can imagine, orchestrating a truly scalable cloud deployment is a massive task that CloudStack does. Clearly, it is expected to be more challenging to be a true cloud specialist as compared to being a mobile engineering specialist, or a web specialist, and so on. In just my first month I have had to revisit most of my university learning and learn many new network concepts, linux administration and refer to some new design patterns in code architecture. Such large exposure is rare to most developers in larger, private organizations because they are abstracted in modern, product focused development. Now whether this is good or bad is a difficult call to make as it narrows down to one’s preferences. Personally, it’s been a rewarding experience, but I’ve had mentorship support from within ShapeBlue. It would definitely be a slightly harder and much slower transition otherwise, and would require patience.

Team Environment

An open source project is determined by its contributors and does not have a specific team that regularly goes on outings or hangs out at office parties. Most contributors don’t share an office location and often work in different time zones. Given most of us actively participate socially as well as professionally with our team members in larger organizations, I suppose that’s a disadvantage as it means social life with colleagues after work is negligible. ShapeBlue similar in structure and we have team members distributed across many time zones. The communication happens primarily over Slack, but is not a hurdle as much as in wider community as team members are one ping away. So that solves most of the problems of working in a distributed team. There is also a unique culture in the company that leaves the responsibility for being productive on the employee and provides good support to do so. If someone has worked in a controlled environment with a vertical hierarchy, it’s a little difficult to get accustomed to a flat structure in the beginning. But new perspectives open up in this environment which include thinking more creatively, independently, and having a better work life balance.

Hopefully this would help anyone thinking of transitioning from a monotonous iOS / Android / Web / Backend development in a larger company to a more technically enriching project such as CloudStack. On personal level, the move has been full of several new delightful experiences with some new challenges that don’t really pay back directly in skills in the short run but pile up benefits in the long run.

Apache CloudStack is generally considered secure out of the box, however it does have the capability of protecting both system VM traffic as well as management traffic with TLS certificates. Prior to version 4.11 CloudStack used Tomcat as the web server and servlet container. With 4.11 this has been changed to embedded Jetty web server and servlet container that makes CloudStack more secure and independent of distribution provided Tomcat version which could be prone to security issues. This changes the management server’s TLS configuration.

In this blog post we’ll go through how to implement native TLS to protect both the system VMs – CPVM and SSVM – as well as the management server with public TLS certificates.

Please note since SSL is now being deprecated we now refer to the replacement – TLS – throughout this post.

Overview

The TLS configuration in CloudStack provides different functionality depending on the system role:

  • TLS protecting the management server is a Jetty configuration which protects the main web GUI and API endpoint only. The configuration of this is handled in the underlying embedded Jetty configuration files.
  • System VMs:
    • The Console Proxy VM TLS configuration provides secure HTTPS connection between the main console screen (your browser) and the CPVM itself.
    • The Secondary Storage VM SSL configuration provides a secure HTTPS connection for uploads and downloads of templates / ISOs / volumes to secondary storage as well as between zones.
    • System VM SSL configuration is carried out through global settings as well as the CloudStack GUI.

System VM HTTPS configuration

Global settings

The following global setting are required configured to allow system VM TLS configuration:

Global settingValue
consoleproxy.url.domaindomain used for CPVM (see below)
consoleproxy.sslEnabledSwitches SSL configuration of the CPVM on / off
Values true | false
secstorage.ssl.cert.domaindomain used for SSVM (see below)
secstorage.encrypt.copySwitches SSL configuration of the SSVM on / off
Values true | false

The URL configurations can take three formats – and these also determine what kind of TLS certificate is required.

  • Blank: if left blank / unconfigured the URLs used for CPVM and SSVM will simply be passed as the actual public IP addresses of the system VMs.
  • Static URL: e.g. console.mydomain.com or ssvm.mydomain.com. In these cases CloudStack rely on external URL load balancing / redirection and/or DNS resolution of the URL to the IP address of the CPVM or SSVM. This can be achieved in a number of different ways through load balancing appliances or scripted DNS updates.
    This configuration relies on:

    • The same URL used for both CPVM and SSVM, or
    • a multi-domain certificate provided to cover both URLs if different ones are used for CPVM and SSVM.
  • Dynamic URL: e.g. *.mydomain.com. In this case CloudStack will redirect the connections to the CPVM / SSVM to the URL “a-b-c-d.mydomain.com” where a/b/c/d represent the IP address, i.e. a real world URL would be 192-168-34-145.mydomain.com.
    This relies on two things:

    • DNS name resolution configured for the full public system VM IP range, such that all combinations of “a-b-c-d.mydomain.com” can be resolved. Please note in CloudStack version 4.11 the public IP range used purely by system VMs can be limited by reserving a subrange of public IP addresses just for system use.
    • An TLS wildcard certificate covering the full “mydomain.com” domain name.

Configuration process – GUI

The first step is to configure the four global settings above, then restart the CloudStack management service to make these settings live.

Next upload the TLS root certificate chain, the actual TLS certificate as well as the PKCS8 formatted private key using the “TLS certificate” button in the zone configuration. In this example we use a wildcard certificate.

Configuration process – API / CloudMonkey

The uploadCustomCertificate API call can be used to upload the TLS certificates. Please note the upload process does require at least two API calls – more depending on how many intermediary certificates are used. If you use CloudMonkey the certificates can be uploaded in cleartext – otherwise they have to be URLencoded when passed as part of a normal HTTP GET API call.

  • In the first API call the combined root and intermediary certificates are uploaded. In this API call the following parameters are passed – note we don’t pass the private key:
    • id=1
    • name: Give the certificate a name.
    • certificate: the root / intermediary certificate in cleartext, with all formatting / line breaks in place. In the example below we pass a chain of root and intermediary certificates.
    • domainsuffix: provide the suffix used, e.g. *.mydomain.com.
  • The second API call in our example uploads the site certificate. In this case we do not give the certificate a name (CloudStack automatically names this):
    • id=2
    • certificate: the issued site certificate, again in cleartext.
    • privatekey: the private key, same cleartext format as certificate.
    • domainsuffix: as above.
cloudmonkey upload customcertificate id=1 name=RootCertificate certificate='-----BEGIN CERTIFICATE-----
MIIE0DCCsdf8HqjeIHgkqhkiG9w0BAQsFADCBgzELMAk
...8V3Idv7kaWKK3245lsoynJuh87/BKONvPi8BDAB
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIEfTCCA2WgAwIBAgIDG+cVMA8djwj1ldggKd9d9s
.....mw4TRfZHcYQFHfjDCmrw==
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIEADCCAuigAwIBAgIBADANBgkqhkiG9w0BAQUFA
.....csbkqWletNw+vHX/bvZ8=
-----END CERTIFICATE-----'
domainsuffix='*.mydomain.com'

cloudmonkey upload customcertificate id=2 certificate='-----BEGIN CERTIFICATE-----
MIIGrjCCBZagAwIBAgIJAJ....
...xKjPTkOLfwMVWXc8Ul25t7lkyi0+a9jZxFAuDXFRgkQnbw==
-----END CERTIFICATE-----'
privatekey='-----BEGIN PRIVATE KEY-----
MIIEvEbidvik1gkqhkiG9w0BAQEFAASCBK
.....rEF5Qyuyserre87d234jj/Uddf
-----END PRIVATE KEY-----'
domainsuffix='*.mydomain.com'

 

System VM restart

Once uploaded the CPVM and SSVM will automatically restart to pick up the new certificates. If the system VMs do not restart cleanly they can be destroyed and will come back online with the TLS configuration in place.

Testing the TLS protected system VMs

To test the Console Proxy VM simply open the console to any user VM or system VM. The popup browser window will at this point show up as a non-secure website. The reason for this is the actual console session is provided in an inline frame – and the popup page itself is presented from the unsecured management servers. If you however look at the page source for the page the HTTPS link which presents the inline frame shows up in the expected format:

Once the management server is also TLS protected the CPVM console popup window will also show as secured:

The SSVM will also now utilise the HTTPS links for all browser based uploads as well as downloads from secondary storage:

Securing the CloudStack management server GUI with HTTPS

Protecting the management servers requires creating a PCKS12 certificate and updating the underlying Jetty configuration to take this into account. This is described in http://wiki.eclipse.org/Jetty/Howto/Configure_SSL#Configuring_Jetty

First of all combine the TLS private key file, certificate and root certificate chain into one file (in the specified order), then convert this into PKCS12 format and write it to a new CloudStack keystore. Enter a new keystore password when prompted:

# cat myprivatekey.key mycertificate.crt gd_bundle-g2-g1.crt > mycombinedcert.crt

# openssl pkcs12 -in mycombinedcert.crt -export -out mycombinedcert.pkcs12
Enter Export Password: ************
Verifying - Enter Export Password:************

# keytool -importkeystore -srckeystore mycombinedcert.pkcs12 -srcstoretype PKCS12 -destkeystore /etc/cloudstack/management/keystore.pkcs12 -deststoretype pkcs12
Importing keystore mycombinedcert.pkcs12 to /etc/cloudstack/management/keystore.pkcs12...
Enter destination keystore password:************
Re-enter new password:************
Enter source keystore password:************
Entry for alias 1 successfully imported.
Import command completed:  1 entries successfully imported, 0 entries failed or cancelled

 

Next edit /etc/cloudstack/management/server.properties and update the following:

https.enable=true
https.keystore=/etc/cloudstack/management/keystore.pkcs12
https.keystore.password=<enter the same password as used for conversion>

 

In addition automatic redirect from HTTP/port 8080 to HTTPS/port 8443 can also be configured in /usr/share/cloudstack-management/webapp/WEB-INF/web.xml. Add the following section before the section around line 22:

  <security-constraint>
    <web-resource-collection>
      <web-resource-name>Everything</web-resource-name>;
      <url-pattern>*</url-pattern>
    </web-resource-collection>
    <user-data-constraint>
      <transport-guarantee>CONFIDENTIAL></transport-guarantee>
    </user-data-constraint>
  </security-constraint>

 

Lastly restart the management service:

# systemctl restart cloudstack-management

Conclusion

The TLS capabilities of CloudStack has been extended over the last few releases – and we hope this article helps to explain the new global settings and configuration procedure.

About The Author

Dag Sonstebo is  a Cloud Architect at ShapeBlue, The Cloud Specialists. Dag spends most of his time designing, implementing and automating IaaS solutions based on Apache CloudStack.


Background

The original CloudMonkey was contributed to the Apache CloudStack project on 31 Oct 2012 under the Apache License 2.0. It is written in Python and shipped using the Python CheeseShop, and since its inception has gone through several refactors and rewrites. While this has worked well over the years, the installation and usage have been limited to just a few modern platforms due to the dependency on Python 2.7, meaning it is hard to install on older distributions such as CentOS6.

Over the past two years, several attempts have been made to make the code compatible across Python 2.6, 2.7 and 3.x. However, it proved to be a maintenance and release challenge – making it code compatible across all the platforms, all the Python versions and the varied dependency versions; whilst also keeping it easy to install and use. During late 2017, an experimental CloudMonkey rewrite called cmk was written in Go, a modern, statically typed and compiled programming language which could produce cross-platforms standalone binaries. Finally, in early 2018, after reaching a promising state the results of the experiment were shared with the community, to build support and gather feedback for moving the CloudMonkey codebase to Go and deprecate the Python version.

During 2018, two Go-based ports were written using two different readline and prompt libraries. The alpha / beta builds were shared with the community who tested them, reported bugs and provided valuable feedback (especially around tab-completion) which drove the final implementation. With the new rewrite CloudMonkey (for the first time) ships as a single executable file for Windows which can be easily installed and used having mostly the same user experience one would get on Linux or Mac OSX. The rewrite aims to maintain command-line tool backward compatibility as a drop-in replacement for the legacy Python-based CloudMonkey (i.e. shell scripts using legacy CloudMonkey can also use the modern CloudMonkey cmk). Legacy Python-based CloudMonkey will continue to be available for installation via pip but it will not be maintained moving forward.

CloudMonkey 6.0 requires a final round of testing and bug-fixing before the release process will commence. The beta binaries are available for testing here: https://github.com/apache/cloudstack-cloudmonkey/releases 

Major changes in CloudMonkey 6.0

  • Ships as standalone 32-bit and 64-bit binaries targeting Windows, Linux and Mac including ARM support (for example, to run on Raspberry Pi)
  • Drop-in replacement for legacy Python-based CloudMonkey as a command line tool
  • Interactive selection of API commands, arguments, and argument options
  • JSON is the default API response output format
  • Improved help docs output when ‘-h’ is passed to an API command
  • Added new output format ‘column’ that outputs API response in a new columnar way like modern CLIs such as kubectl and docker
  • Added new set option ‘debug’ to enable debug mode, set option ‘display’ renamed as ‘output’
  • New CloudMonkey configuration file locking mechanism to avoid file corruption when multiple cmk instances run
  • New configuration folder ~/.cmk to avoid conflict with legacy Python-based version

Features removed in CloudMonkey 6.0:

  • Removed XML output format.
  • Removed CloudMonkey logging API requests and responses to a file.
  • Coloured output removed.
  • Removed set options: color (for coloured output), signatureversion and expires (no longer acceptable API parameters), paramcompletion (API parameter completion is not enabled by default), cache_file (the default cache file, now at ~/.cmk/cache ), history_file (the history file), log_file (API log file).

About the author

Rohit Yadav is a Software Architect at ShapeBlue, the Cloud Specialists, and is a committer and PMC member of Apache CloudStack. Rohit spends most of his time designing and implementing features in Apache CloudStack.

Andrija Panic shares some thoughts on joining the ShapeBlue team

Hi there, this is Andrija from… well, ShapeBlue! I’ve been working here for a month now and I thought that I’d share my views of working for the company.

Before I move to the actual topic, let me share just a little bit of background about myself.

Before joining ShapeBlue, I was working as a Cloud System Engineer for two different Swiss-based Public Cloud providers, both utilizing CloudStack to provide IaaS services for local (Swiss) and international customers – many of which (as you can probably guess) were serious financial institutions (Switzerland being considered a big privacy and security center). We even had customers connecting all the way from South America to their infrastructure for daily business, all managed by CloudStack – and it just worked flawlessly!

During my time with the Swiss guys, I had the pleasure (with my colleagues) to lead and build their CloudStack infrastructure from scratch. Here I gained some serious knowledge and experience on this topic. I also had the opportunity to work with some nice storage solutions, from NetApp SolidFire distributed All-Flash Storage (providing block-level storage to CloudStack VMs), to Cloudian Hyperstore S3 Object Storage solution providing (you can guess by its name…) S3 object storage with 100% Native S3 API compatibility. Both solutions had their challenges of integration into existing environment and I was lucky enough to pull the strings here and lead the thing myself. Really fun time! Did I mention CloudStack? Yes, we did quite a decent job here, we made a lot of tweaks and improvements, migrations and decent customer support.

But after 5 years with CloudStack in a service provider environment , it was time for me to move on and improve my cloud building skills even more, so my next logical step was to pull Giles Sirett, ShapeBlue CEO, for a quick coffee on the last CloudStack Conference (I even didn’t have to pay for the coffee – it was a free one!). The rest is pretty much history – I’m now paving my way into consultancy as a  Cloud Architect at ShapeBlue.

After spending a month here at ShapeBlue, I can honestly say that I’m nothing short of being impressed with both the people (colleagues) and the processes inside ShapeBlue. I was already used to Swiss guys being strict and very well organized, but my feeling is that ShapeBlue has moved this to a whole new level. When I joined the company, besides having a dedicated colleague as a mentor (hi there Dag – thanks for all your help!) helping me to find my way around the company, I also got proper training on many different tools and processes used in company, from some internal infrastructure stuff, to customer support tools, processes and SLAs, to many different things in general. In fact , this was a revelation when compared to the  old RTFM-it-yourself way (stands for Read The [Insert asterisks ***] Manual), in case you were wondering) that I’d experienced at previous companies. The people at ShapeBlue are supportive, the working atmosphere is just great, with tons of seriousness across the board but with a healthy dose of (mainly) British humor in the middle of hard work – to make you wake up and warm up during these cold winter days. From time to time we even get cats jumping from our Slack channel.

After being mostly in a technical leadership position in my previous jobs, I’m now, for the first time in my professional carrier, part of the team with a more experienced guys than me – and I’m really happy about that – it’s always nice to be able to get some help in case you need advice – but individual initiative and engagement is something that is strongly respected in ShapeBlue. One of the interesting things is, that the guys in the ShapeBlue Leadership Team do actually listen to engineers and take their advice / opinion – something you don’t necessarily find in every company. It’s a very collaborative and not authoritative environment – a thing that everybody respects here.

So far, I have been tasked with quite a few interesting things to work on: from  delivering the famous ShapeBlue Bootcamp to one of our new colleagues, playing around with some more interesting CloudStack setups (with different hypervisors) and been included in some customer projects and support stuff – all in all a good start!

In case you are still following me, here come a few personal things about me:

I’m based in Belgrade, Serbia (for all you techies, that is 44.0165° N, 21.0059° E ) – a country known for good cuisine, but mostly for ćevapi and šljivovica (national drink). Serbia is also home to Novak Djokovic, the world No. 1 in men’s singles tennis (this is the guy who regularly beats Roger Federer, for the record!).

In my free time I’m hanging around with my 3 princesses and sometimes I manage to squeeze some time for gym, music or very light electronic projects.

Talk to you later, Andrija.

There was a definite feel of Christmas in the air in London as we made our way to last Thursday’s (December 13) winter meetup of the Cloudstack European User Group (CSEUG), and that only increased as we arrived at the BT Centre near St. Paul’s and saw the big Christmas tree in reception!

A great turnout for this, the last meetup of 2018, and a great representation of the CloudStack community in Europe with people travelling from Germany, Serbia, Glasgow, Switzerland and Latvia to name but a few. After a quick lunch we took our seats, and Giles Sirett (chairman of the user group) welcomed everyone and got the event started with introductions and CloudStack news.

Firstly, Giles spoke about software updates and new releases. CloudStack 4.11 is an LTS (long term support) release and included more than 250 new capabilities and a big step towards zero downtime upgrades, 4.11.2 has just been released (including 71 fixes), 4.11.3 is coming soon and 4.12 is in planning. Giles then mentioned CloudStack events starting with the recent CloudStack Collaboration Conference in September (Montreal), and events for 2019 – the next CSEUG in March (London), and the next Collaboration Conference in September (Las Vegas). During Giles’ presentation, Maurice Nettisheim (Head of Cloud Compute for BT) took to the stage to say a few words about BT’s ongoing use of CloudStack in their IaaS platform and their continued support and involvement in the CloudStack community.

Giles slides contain much more information:

After Giles, Paul Angus gave us an update on ShapeBlue’s CloudStack Container Service (CCS), giving us a walkthrough of the recently released update.This update brings CCS bang up-to-date by running the latest version of Kubernetes (v1.11.3) on the latest version of Container Linux. CCS also now makes use of CloudStack’s new CA framework to automatically secure the Kubernetes environments it creates. Paul’s talks and slides are always packed with detail:

Olivier Lambert of XCP-ng & Xen Orchestra took the floor next to tell us about the current state of the project. For those that are not familiar, XCP-ng is an opensource, community powered hypervisor based on Xen. It is easy to upgrade from XenServer (keeping all VMs, settings etc.), 100% API compatible, requires no license and has no feature restrictions.

Please take a look through Olivier’s slides for much more on this fascinating subject:

After a short break, we welcomed Ingo Jochim and Andre Walter (itelligence) with their talk entitled ‘How our cloud works’. They talked through full automation with Ansible for all infrastructure components of the cloud with CloudStack, check_mk, LDAP and more, with all functionality available through a customer portal, also covering how the setup is fully scalable for larger landscapes.
Ingo and Andre’s slides right here:

Next up was Adam Dagnall (Cloudian) with ‘Advanced S3 compatible storage integration in CloudStack’. To provide tighter integration between the S3 compatible object store and CloudStack, Cloudian has developed a connector to allow users and their applications to utilize the object store directly from within the CloudStack platform in a single sign-on manner with self-service provisioning. Additionally, CloudStack templates and snapshots are centrally stored within the object store and managed through the CloudStack service. The object store offers protection of these templates and snapshots across data centres using replication or erasure coding. Adam went into the feature-set in great detail, and his slides provide much more information:

Last talk of the day, and the honours fell to Andrija Panic (Hiag Data) with ‘CloudStack – 5 years in production’. Andrija shared real world experience of designing, deploying and managing a CloudStack public cloud, explaining how high availability for the CloudStack management components was implemented and discussing different storage technologies and networking models used, as well as the challenges faced. Andrija also presented alternate methods for deploying CloudStack as regards to regions / zones / pods, and also touched on physical networking, finally looking at the different CloudStack guest networking models available (from Basic Zone / Shared Networks to all the Advanced Zone’s networking models) and when to use each of them.
Andrija went into a lot of detail and I encourage you to look through his slides:

After Andrija had finished answering questions, Giles wrapped things up and we moved to a local pub, where I am pleased to say that conversation and collaboration continued into the night, with what rapidly became the unofficial ‘CloudStack Christmas Party’! Huge thanks to BT for providing a first-rate venue and lunch, and to all our speakers, who make these events so interesting and such a success.

The next CloudStack User Group meetup will be on Thursday, March 14, and will be hosted by our friends at Ticketmaster here in London. Please register here!

All the talks were recorded and will be made available shortly on the ShapeBlue YouTube channel.

Introduction

This blog describes a new feature to be introduced in the CloudStack 4.12 release (already in the current master branch of the CloudStack repository). This feature will provide support for the Data Plane Development Kit (DPDK) in conjunction with Open vSwitch (OVS) for guest VMs and is targeted at the KVM hypervisor.

The Data Plane Development Kit (https://www.dpdk.org/) is a set of libraries and NIC drivers for fast package processing in userspace. Using DPDK along with OVS brings benefits to networking performance on VMs and networking appliances. In this blog, we will introduce how DPDK can be used on on guest VMs once the feature is released.

Please note – DPDK support in CloudStack requires that the KVM hypervisor is running on DPDK compatible hardware.

Enable DPDK support

This feature extends the Open vSwitch feature in CloudStack with DPDK integration. As a prerequisite, Open vSwitch needs to be installed on KVM hosts and enabled in CloudStack. In addition, administrators need to install DPDK libraries on KVM hosts before configuring the CloudStack agents, and I will go into the configuration in detail.

KVM Agent Configuration

An administrator can follow this guide to enable DPDK on a KVM host:

Prerequisites

  • Install OVS on the target KVM host
  • Configure CloudStack agent by editing the /etc/cloudstack/agent/agent.properties file:
    • # network.bridge.type=openvswitch
      # libvirt.vif.driver=com.cloud.hypervisor.kvm.resource.OvsVifDriver
      
  • Install DPDK. Installation guide can be found on this link: http://docs.openvswitch.org/en/latest/intro/install/dpdk/.

Configuration

Edit /etc/cloudstack/agent/agent.properties file, in which <OVS_PATH> is the path in which your OVS ports are created, typically /var/run/openvswitch/:

  • # openvswitch.dpdk.enable=true
    # openvswitch.dpdk.ovs.path=<OVS_PATH>
    

Restart CloudStack agent so that changes take effect:

# systemctl restart cloudstack-agent

DPDK inside guest VMs

Now that CloudStack agents have been configured, users are able to deploy their guest VMs using DPDK. In order to achieve this, they will need to pass extra configurations to enable DPDK:

  • Enable “HugePages” on the VM
  • NUMA node configuration

As of 4.12, passing extra configurations to VM deployments will be allowed. In the case of KVM, the extra configurations are added to the VM XML domain. The CloudStack API methods deployVirtualMachine and updateVirtualMachinewill support the new optional parameter extraconfigand will work in the following way:

 
# deploy virtualmachine ... extraconfig=<URL_UTF-8_ENCODED_CONFIGS>

CloudStack will expect a URL UTF-8 encoded string which can support multiple extra configurations. For example, if a user wants to enable DPDK, they will need to pass two extra configurations as we have mentioned above. An example of both configurations are the following:

 
dpdk-hugepages:
<memoryBacking> 
   <hugePages/> 
</memoryBacking> 

dpdk-numa: 
<cpu mode='host-passthrough'>
   <numa>
      <cell id='0' cpus='0' memory='9437184' unit='KiB' memAccess='shared'/>
   </numa> 
</cpu>

…which becomes this URL UTF-8 encoded string, and is the one that CloudStack will expect on VM deployments:

 
dpdk-hugepages%3A%20%3CmemoryBacking%3E%20%3ChugePages%2F%3E%20%3C%2FmemoryBacking%3E%20dpdk-numa%3A%20%3Ccpu%20mode%3D%22host-passthrough%22%3E%20%3Cnuma%3E%20%3Ccell%20id%3D%220%22%20cpus%3D%220%22%20memory%3D%229437184%22%20unit%3D%22KiB%22%20memAccess%3D%22shared%22%2F%3E%20%3C%2Fnuma%3E%20%3C%2Fcpu%3E

KVM networking verification

Administrators can verify how OVS ports are created with DPDK support on DPDK enabled hosts, in which users have deployed DPDK enabled guest VMs. These port names start with “csdpdk”:

 

# ovs-vsctl show
....
Port "csdpdk-1"
   tag: 30
   Interface "csdpdk-1"
      type: dpdkvhostuser
Port "csdpdk-4"
   tag: 30
   Interface "csdpdk-4"
      type: dpdkvhostuser

About the author

Nicolas Vazquez is a Senior Software Engineer at ShapeBlue, the Cloud Specialists, and is a committer in the Apache CloudStack project. Nicolas spends his time designing and implementing features in Apache CloudStack.

Integration testing – What it is and why SDLC needs it.

What is Integration testing? This is a type of testing where multiple components are combined and tested working together. There are different aspects of integration testing depending on the project and component scale, but usually it comes down to validating that different modules can work together and / or independently. This type of testing drives one out of the tunnel vision one could develop while working on a complex task and gives feedback how the work integrates with rest of the system.

Integration testing in CloudStack

Integration testing in CloudStack is done using a python-based testing framework called Marvin. Marvin offers an API client and a structured test class model to execute different scenarios. Written in python, each CloudStack test class focuses on different functionality and contains multiple test cases to cover its features. Separated by the product severity within the /test/integration directory there are two separate sub-directories: smoke and component (https://github.com/apache/cloudstack). Smoke tests are focused only on the main features and most severe functionalities they offer, while component tests go deep into each feature and executes more detailed tests covering more corner cases.

What is the benefit of these tests?

Over the years, our so called “Marvin tests” have proven to be really valuable for validation of pull requests, release testing and other testing scenarios, saving hours of manual validation and testing. It’s also mostly agnostic to hypervisor, storage and networking, meaning it can be executed against different types with relatively the same success rate. The Marvin test pack comes with wide range of coverage for different hypervisor / plugin / network / storage, and other specifics.

Downside

Tests need maintenance – and lots of it. As the code base changes, the Marvin tests also need attention. Execution time is also worth mentioning here. Usually it takes on average about a day to complete a single component test run, while the best performing KVM tests can take about 8 hours. Marvin tests are usually very complex and rely on multiple components working together. They normally create a network and deploy a VM in it, within which they can work out the scenario. This is time consuming and different hypervisors perform differently.

Marvin

The Marvin test library comes out of the project and can be installed as a python package. When installed, it will require a running management server and a config file. The management server will be the API endpoint or test subject where all test scenarios will be executed, and the config file will contain all environment related details that are required (more info here: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Marvin+-+Testing+with+Python#Marvin-TestingwithPython-Installation). Marvin comes with several utilities that can be used while writing a test (eg., utilities for deploying a VM or creating a network), plus a large amount of test data to use and more. It also uses API documents to auto-generate its API references, so whenever you create a new API when building the Marvin package, it will automatically create an API reference, and the new API will be usable.

What’s new with Marvin

It’s fair to say that not much has gone on in the /marvin directory over the last couple of releases, but there’s a lot being done in terms of maintenance and new tests. Most new features in the latest releases of CloudStack come with a few Marvin tests to cover them. There were also great initiatives around 4.9 and 4.11 releases to fix the smoketests and make them healthier for the future. There are 300+ commits in the /test directory since the start of 4.9.

It’s always been time consuming to gather results for a certain code change quickly enough, and that’s why a new test attribute was introduced called ‘quick-test’. It aims to deliver quick results to the developer and help determine if their code is good enough to continue, or if further testing is required. Code changes can be found here: https://github.com/apache/cloudstack/pull/2209. Within the same PR, there’s further segmentation that goes through all the files under /test/integration/ and adds categories in each different file. For example, if a you want to test deployment of VMs, you can just execute label ‘deploy-vm’ and it will go through each file and search for test with the same attribute. This allows users to do further regression testing in combination with other components being tested at the same time.

About the author

Boris Stoyanov is Software Engineer in testing at ShapeBlue, the Cloud Specialists. Bobby spends his time testing features for the Apache CloudStack Community and for ShapeBlue clients.

Thursday, September 13 saw us back at the Early Excellence Centre, Canada Water, for the (late) Summer meetup of the CloudStack European User Group. As usual, a great turnout and representation of the community and Europe – with attendees traveling from Germany, Switzerland, Bulgaria, Latvia, Poland, and further afield from Ukraine. There were even a few of us there from the UK!

After we’d caught up with old friends and greeted new ones, we had a bite to eat and took our seats for the talks. Giles Sirett (ShapeBlue CEO and chairman of the CloudStack European User Group) was first up, starting with introductions, a run through the day’s agenda, and CloudStack news – and this past few months has seen lots of activity and development, including the release of the latest LTS branch of CloudStack (4.11), with 4.11.2 due soon. CloudStack 4.11 included more than 250 new capabilities, such as new host HA framework and Prometheus integration, whilst the 4.11.1 release brought us a step closer to ‘near zero downtime upgrades’ with a major refactor of the virtual router. Speaking of activity – approximately 800 downloads of CloudStack per month (in the last 6 months) shows continued strong adoption of the technology.

Giles then looked to the future, talking through upcoming events… and we were in Montreal for the CloudStack Collaboration Conference just last week! It was a fabulous event in a great city, and please see my blog for a roundup and some more information. Of course Giles also mentioned our next user group meetup – London, December 13, hosted by our friends at BT (London). Giles finished up with a call for users of CloudStack to talk more about it. For more information on that, and everything Giles talked about, here are his slides:

Giles then introduced our first featured speaker of the day – Paul Angus (VP Technology at ShapeBlue), with his talk: Backup & Recovery in CloudStack. As Paul explained – CloudStack users currently only have snapshots as a form of VM backup. With the Backup and Recovery Framework, end users will be presented with the features and functions that they have come to expect outside of ‘the cloud’, while cloud providers will be able to leverage the advantages of using enterprise backup and recovery products. In this talk, Paul explained some features of the forthcoming backup and recovery feature, the user experience and demonstrated the Veeam plugin working with the backup and recovery framework. This is a highly anticipated feature, and Paul’s slides are a treasure trove of information and detail:

Following Paul, Dag Sonstebo took control of the laser pointer. Dag is a Cloud Architect here at ShapeBlue and had chosen as his topic the CloudStack usage service. Dag started by explaining that the usage service is used to track consumption of resources in Apache CloudStack for reporting and billing purposes, before giving an overview of how the service is installed and configured. Dag then dived deeper into how data is processed from the database into the different usage types (VMs, network usage, storage, etc.), before being aggregated into billable units or time slices in the usage database.

The talk included several examples on how to query and report on this usage data, and looked at general maintenance and troubleshooting of the service. This really was a deep dive, as evidenced by Dag’s extensive slides:

After a brief interlude to grab coffee and some fresh air, next up was Olivier Lambert, the creator of Xen Orchestra and XCP-ng. Starting by talking about Citrix XenServer, Olivier explained why he developed an alternative that is truly open-source. He talked us through Xen Orchestra, before moving onto XCP-ng – a fork of XenServer removing all restrictions that were put in place with the free Citrix version. This is an exciting project, already proven and widely adopted. Olivier and his team continue to develop new functionality with a fast-growing community and have an exciting roadmap in place for future development. Olivier’s slides from his presentation are right here:

After Olivier we welcomed Vladimir Melnik to the podium (all the way from Ukraine, and I think the person who traveled the furthest). Vladimir is a co-founder of the first IaaS provider in Ukraine – Tucha, and his talk was ‘Building a redundant CloudStack management cluster’. Starting with a brief history of Tucha, Vlad covered building and maintaining an open-source-driven clustered environment for the Apache CloudStack management server with GNU Linux, HAProxy, HeartBeat, Bind, OpenLDAP and other tools. Vladimir’s slides are both entertaining and very interesting:

The honour of the last talk of the day fell to Boyan Ivanov of Storpool, providing advice on building software-defined clouds. Boyan posed the question ‘why software defined?’ and went on to answer the question quite comprehensively! Infrastructure is becoming more and more ‘software defined’, and Boyan illustrated how this should mean increased profitability, putting forward the business case for a software defined stack. Boyan was then good enough to provide several tactical tips and a free reference design!

Take a look through Boyan’s slides:

Once Boyan had finished taking questions, we all headed out to the nearest hostelry, and conversation continued into the night. A nice touch (indicative of the great CloudStack community) was that when it was time to say goodbye most people said ‘see you in Montreal’!

Thanks to Early Excellence for providing a first-class venue and refreshments, and huge thanks to the day’s speakers – Paul, Dag, Olivier, Vlad and Boyan, all of whom were good enough to donate their time, and in most cases travel great distances to share their expertise.

The next meetup of the CloudStack European User Group will be in London, on Thursday, December 13 and you can register here. We are always looking out for speakers with interesting and relevant subjects, and if you are interested in talking, please contact us.

All talks were recorded in full and can be found on our ShapeBlue YouTube channel:

Giles Sirett: https://youtu.be/Ls_HakbyxUU

Paul Angus: https://youtu.be/ZVThUKPeC_w

Dag Sonstebo: https://youtu.be/I5I7eduWHRQ

Olivier Lambert: https://youtu.be/KWBCKvwvnUc

Vladimir Melnik: https://youtu.be/aBNMysDoi5w

Boyan Ivanov: https://youtu.be/wt4pqTZ57OY

Thanks for reading, and I hope to see you at the next event!


We’re here in Montreal for the CloudStack Collaboration Conference, and it’s been a fantastic event with more to come! We’ve had two full days of back to back talks over two tracks, with subjects ranging from storage, billing and diagnostics through to containers, automation and monitoring… and everything in between. Mike Tutkowski (CloudStack VP) set the tone with his keynote at the beginning of the first day, asking the question ‘why are we here?’ The answer? To learn, work together​, share ideas​ and share problems. These fundamentals are what makes for a great community, and what makes Apache CloudStack such a great product. We have never really known just how widely adopted CloudStack is, so we have (for the first time) undertaken some in-depth analysis which Mike shared. In the last 12 months CloudStack management server packages were downloaded 116,796 times from 21,202 different IP addresses. We think this means that worldwide there are about 20,000 CloudStack clouds in production! Mike also mentioned several organisations that have recently adopted CloudStack, including Ticketmaster, from whom we saw a talk illustrating how they deployed their global cloud environment using Apache CloudStack.

The CloudStack community is full of smart, committed, talented people passionate about what they do, and this is clear from the quality and delivery of the talks, and the collaboration before and after. They aren’t just repeating facts or reading what has been written for them – they are talking from first hand experience, often about features and functionality they have personally developed and committed to the project. Thanks to the community, CloudStack is constantly being improved and developed by these real-world users and operators.

So we’re into day three, which means no more CloudStack talks. However – as I said, the event is far from finished. Today (Wednesday) we have an all-day hackathon – a room full of people working together on shared goals and ideas, the sole purpose to talk and share new ideas, and make CloudStack even better!

Every time I attend a CloudStack conference, I am privileged to spend time with a community who genuinely enjoy what they do, and I come away having made new friends, and having learnt something new. I am already excited about next year’s event, and seeing some of our new friends in London at our next CloudStack meetup (December 13).

Sincere thanks to the Apache Software Foundation (our conference co-locates every year with Apachecon). It’s always a well organised and well attended event, and we are delighted to be associated with it. Thanks also to the city of Montreal – a beautiful city which I hope to visit again soon.

All the CloudStack talks were recorded and will be published to Apache.org and our YouTube channel very soon.

    

  

Introduction

We published the original blog post on KVM networking in 2016 – but in the meantime we have moved on a generation in CentOS and Ubuntu operating systems, and some of the original information is therefore out of date. In this revisit of the original blog post we cover new configuration options for CentOS 7.x as well as Ubuntu 18.04, both of which are now supported hypervisor operating systems in CloudStack 4.11. Ubuntu 18.04 has replaced the legacy networking model with the new Netplan implementation, and this does mean different configuration both for linux bridge setups as well as OpenvSwitch.

KVM hypervisor networking for CloudStack can sometimes be a challenge, considering KVM doesn’t quite have the same mature guest networking model found in the likes of VMware vSphere and Citrix XenServer. In this blog post we’re looking at the options for networking KVM hosts using bridges and VLANs, and dive a bit deeper into the configuration for these options. Installation of the hypervisor and CloudStack agent is pretty well covered in the CloudStack installation guide, so we’ll not spend too much time on this.

Network bridges

On a linux KVM host guest networking is accomplished using network bridges. These are similar to vSwitches on a VMware ESXi host or networks on a XenServer host (in fact networking on a XenServer host is also accomplished using bridges).

A KVM network bridge is a Layer-2 software device which allows traffic to be forwarded between ports internally on the bridge and the physical network uplinks. The traffic flow is controlled by MAC address tables maintained by the bridge itself, which determine which hosts are connected to which bridge port. The bridges allow for traffic segregation using traditional Layer-2 VLANs as well as SDN Layer-3 overlay networks.

KVMnetworking41

Linux bridges vs OpenVswitch

The bridging on a KVM host can be accomplished using traditional linux bridge networking or by adopting the OpenVswitch back end. Traditional linux bridges have been implemented in the linux kernel since version 2.2, and have been maintained through the 2.x and 3.x kernels. Linux bridges provide all the basic Layer-2 networking required for a KVM hypervisor back end, but it lacks some automation options and is configured on a per host basis.

OpenVswitch was developed to address this, and provides additional automation in addition to new networking capabilities like Software Defined Networking (SDN). OpenVswitch allows for centralised control and distribution across physical hypervisor hosts, similar to distributed vSwitches in VMware vSphere. Distributed switch control does require additional controller infrastructure like OpenDaylight, Nicira, VMware NSX, etc. – which we won’t cover in this article as it’s not a requirement for CloudStack.

It is also worth noting Citrix started using the OpenVswitch backend in XenServer 6.0.

Network configuration overview

For this example we will configure the following networking model, assuming a linux host with four network interfaces which are bonded for resilience. We also assume all switch ports are trunk ports:

  • Network interfaces eth0 + eth1 are bonded as bond0.
  • Network interfaces eth2 + eth3 are bonded as bond1.
  • Bond0 provides the physical uplink for the bridge “cloudbr0”. This bridge carries the untagged host network interface / IP address, and will also be used for the VLAN tagged guest networks.
  • Bond1 provides the physical uplink for the bridge “cloudbr1”. This bridge handles the VLAN tagged public traffic.

The CloudStack zone networks will then be configured as follows:

  • Management and guest traffic is configured to use KVM traffic label “cloudbr0”.
  • Public traffic is configured to use KVM traffic label “cloudbr1”.

In addition to the above it’s important to remember CloudStack itself requires internal connectivity from the hypervisor host to system VMs (Virtual Routers, SSVM and CPVM) over the link local 169.254.0.0/16 subnet. This is done over a host-only bridge “cloud0”, which is created by CloudStack when the host is added to a CloudStack zone.

 

KVMnetworking42

Linux bridge configuration – CentOS

In the following CentOS example we have changed the NIC naming convention back to the legacy “eth0” format rather than the new “eno16777728” format. This is a personal preference – and is generally done to make automation of configuration settings easier. The configuration suggested throughout this blog post can also be implemented using the new NIC naming format.

Across all CentOS versions the “NetworkManager” service is also generally disabled, since this has been found to complicate KVM network configuration and cause unwanted behaviour:

 
# systemctl stop NetworkManager
# systemctl disable NetworkManager

To enable bonding and bridging CentOS 7.x requires the modules installed / loaded:

 
# modprobe --first-time bonding
# yum -y install bridge-utils

If IPv6 isn’t required we also add the following lines to /etc/sysctl.conf:

net.ipv6.conf.all.disable_ipv6 = 1 
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

In CentOS the linux bridge configuration is done with configuration files in /etc/sysconfig/network-scripts/. Each of the four individual NIC interfaces are configured as follows (eth0 / eth1 / eth2 / eth3 are all configured the same way). Note there is no IP configuration against the NICs themselves – these purely point to the respective bonds:

# vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
NAME=eth0
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
HWADDR=00:0C:12:xx:xx:xx
NM_CONTROLLED=no

The bond configurations are specified in the equivalent ifcfg-bond scripts and specify bonding options as well as the upstream bridge name. In this case we’re just setting a basic active-passive bond (mode=1) with up/down delays of zero and status monitoring every 100ms (miimon=100). Note there are a multitude of bonding options – please refer to the CentOS / RedHat official documentation to tune these to your specific use case.

# vi /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
NAME=bond0
TYPE=Bond
BRIDGE=cloudbr0
ONBOOT=yes
NM_CONTROLLED=no
BONDING_OPTS="mode=active-backup miimon=100 updelay=0 downdelay=0"

The same goes for bond1:

# vi /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
NAME=bond1
TYPE=Bond
BRIDGE=cloudbr1
ONBOOT=yes
NM_CONTROLLED=no
BONDING_OPTS="mode=active-backup miimon=100 updelay=0 downdelay=0"

Cloudbr0 is configured in the ifcfg-cloudbr0 script. In addition to the bridge configuration we also specify the host IP address, which is tied directly to the bridge since it is on an untagged VLAN:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr0
DEVICE=cloudbr0
ONBOOT=yes
TYPE=Bridge
IPADDR=192.168.100.20
NETMASK=255.255.255.0
GATEWAY=192.168.100.1
NM_CONTROLLED=no
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
DELAY=0

Cloudbr1 does not have an IP address configured hence the configuration is simpler:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr1
DEVICE=cloudbr1
ONBOOT=yes
TYPE=Bridge
BOOTPROTO=none
NM_CONTROLLED=no
DELAY=0
DEFROUTE=no
IPV4_FAILURE_FATAL=no
IPV6INIT=no

Optional tagged interface for storage traffic

If a dedicated VLAN tagged IP interface is required for e.g. storage traffic this can be accomplished by created a VLAN on top of the bond and tying this to a dedicated bridge. In this case we create a new bridge on bond0 using VLAN 100:

# vi /etc/sysconfig/network-scripts/ifcfg-bond.100
DEVICE=bond0.100
VLAN=yes
BOOTPROTO=none
ONBOOT=yes
TYPE=Unknown
BRIDGE=cloudbr100

The bridge can now be configured with the desired IP address for storage connectivity:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr100
DEVICE=cloudbr100
ONBOOT=yes
TYPE=Bridge
VLAN=yes
IPADDR=10.0.100.20
NETMASK=255.255.255.0
NM_CONTROLLED=no
DELAY=0

Internal bridge cloud0

When using linux bridge networking there is no requirement to configure the internal “cloud0” bridge, this is all handled by CloudStack.

Network startup

Note – once all network startup scripts are in place and the network service is restarted you may lose connectivity to the host if there are any configuration errors in the files, hence make sure you have console access to rectify any issues.

To make the configuration live restart the network service:

# systemctl restart network

To check the bridges use the brctl command:

# brctl show
bridge name bridge id STP enabled interfaces
cloudbr0 8000.000c29b55932 no bond0
cloudbr1 8000.000c29b45956 no bond1

The bonds can be checked with:

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:xx:xx:xx:xx
Slave queue ID: 0

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:xx:xx:xx:xx
Slave queue ID: 0

Linux bridge configuration – Ubuntu

With the 18.04 “Bionic Beaver” release Ubuntu have retired the legacy way of configuring networking through /etc/network/interfaces in favour of Netplan – https://netplan.io/reference. This changes how networking is configured – although the principles around bridge configuration are the same as in previous Ubuntu versions.

First of all ensure correct hostname and FQDN are set in /etc/hostname and /etc/hosts respectively.

To stop network bridge traffic from traversing IPtables / ARPtables also add the following lines to /etc/sysctl.conf, this prevents bridge traffic from traversing IPtables / ARPtables on the host.

# vi /etc/sysctl.conf
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

Ubuntu 18.04 installs the “bridge-utils” and bridge/bonding kernel options by default, and the corresponding modules are also loaded by default, hence there are no requirements to add anything to /etc/modules.

In Ubuntu 18.04 all interface, bond and bridge configuration are configured using cloud-init and the Netplan configuration in /etc/netplan/XX-cloud-init.yaml. Same as for CentOS we are configuring basic active-passive bonds (mode=1) with status monitoring every 100ms (miimon=100), and configuring bridges on top of these. As before the host IP address is tied to cloudbr0:

# vi /etc/netplan/50-cloud-init.yaml
network:
    ethernets:
        eth0:
            dhcp4: no
        eth1:
            dhcp4: no
        eth2:
            dhcp4: no
        eth3:
            dhcp4: no
    bonds:
        bond0:
            dhcp4: no
            interfaces:
                - eth0
                - eth1
            parameters:
                mode: active-backup
                primary: eth0
        bond1:
            dhcp4: no
            interfaces:
                - eth2
                - eth3
            parameters:
                mode: active-backup
                primary: eth2
    bridges:
        cloudbr0:
            addresses:
                - 192.168.100.20/24
            gateway4: 192.168.100.1
            nameservers:
                search: [mycloud.local]
                addresses: [192.168.100.5,192.168.100.6]
            interfaces:
                - bond0
        cloudbr1:
            dhcp4: no
            interfaces:
                - bond1
    version: 2

Optional tagged interface for storage traffic

To add an options VLAN tagged interface for storage traffic add a VLAN and a new bridge to the above configuration:

# vi /etc/netplan/50-cloud-init.yaml
    vlans:
        bond100:
            id: 100
            link: bond0
            dhcp4: no
    bridges:
        cloudbr100:
            addresses:
               - 10.0.100.20/24
            interfaces:
               - bond100

Internal bridge cloud0

When using linux bridge networking the internal “cloud0” bridge is again handled by CloudStack, i.e. there’s no need for specific configuration to be specified for this.

Network startup

Note – once all network startup scripts are in place and the network service is restarted you may lose connectivity to the host if there are any configuration errors in the files, hence make sure you have console access to rectify any issues.

To make the configuration reload Netplan with

# netplan apply

To check the bridges use the brctl command:

# brctl show
bridge name	bridge id		STP enabled	interfaces
cloud0		8000.000000000000	no
cloudbr0	8000.52664b74c6a7	no		bond0
cloudbr1	8000.2e13dfd92f96	no		bond1
cloudbr100	8000.02684d6541db	no		bond100

To check the VLANs and bonds:

# cat /proc/net/vlan/config
VLAN Dev name | VLAN ID
Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
bond100 | 100 | bond0
# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 10
Permanent HW addr: 00:0c:xx:xx:xx:xx
Slave queue ID: 0

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 10
Permanent HW addr: 00:0c:xx:xx:xx:xx
Slave queue ID: 0

 

OpenVswitch bridge configuration – CentOS

The OpenVswitch version in the standard CentOS repositories is relatively old (version 2.0). To install a newer version either locate and install this from a third party CentOS/Fedora/RedHat repository, alternatively download and compile the packages from the OVS website http://www.openvswitch.org/download/ (notes on how to compile the packages can be found in http://docs.openvswitch.org/en/latest/intro/install/fedora/).

Once packages are available install and enable OVS with

# yum localinstall openvswitch-<version>.rpm
# systemctl start openvswitch
# systemctl enable openvswitch

In addition to this the bridge module should be blacklisted. Experience has shown that even blacklisting this module does not prevent it from being loaded. To force this set the module install to /bin/false. Please note the CloudStack agent install depends on the bridge module being in place, hence this step should be carried out after agent install.

echo "install bridge /bin/false" > /etc/modprobe.d/bridge-blacklist.conf

As with linux bridging above the following examples assumes IPv6 has been disabled and legacy ethX network interface names are used. In addition the hostname has been set in /etc/sysconfig/network and /etc/hosts.

Add the initial OVS bridges using the ovs-vsctl toolset:

# ovs-vsctl add-br cloudbr0
# ovs-vsctl add-br cloudbr1
# ovs-vsctl add-bond cloudbr0 bond0 eth0 eth1
# ovs-vsctl add-bond cloudbr1 bond1 eth2 eth3

This will configure the bridges in the OVS database, but the settings will not be persistent. To make the settings persistent we need to configure the network configuration scripts in /etc/sysconfig/network-scripts/, similar to when using linux bridges.

Each individual network interface has a generic configuration – note there is no reference to bonds at this stage. The following ifcfg-eth script applies to all interfaces:

# vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
TYPE=Ethernet
BOOTPROTO=none
NAME=eth0
ONBOOT=yes
NM_CONTROLLED=no
HOTPLUG=no
HWADDR=00:0C:xx:xx:xx:xx

The bonds reference the interfaces as well as the upstream bridge. In addition the bond configuration specifies the OVS specific settings for the bond (active-backup, no LACP, 100ms status monitoring):

# vi /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBond
OVS_BRIDGE=cloudbr0
BOOTPROTO=none
BOND_IFACES="eth0 eth1"
OVS_OPTIONS="bond_mode=active-backup lacp=off other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100"
HOTPLUG=no
# vi /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBond
OVS_BRIDGE=cloudbr1
BOOTPROTO=none
BOND_IFACES="eth2 eth3"
OVS_OPTIONS="bond_mode=active-backup lacp=off other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100"
HOTPLUG=no

The bridges are now configured as follows. The host IP address is specified on the untagged cloudbr0 bridge:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr0
DEVICE=cloudbr0
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=static
IPADDR=192.168.100.20
NETMASK=255.255.255.0
GATEWAY=192.168.100.1
HOTPLUG=no

Cloudbr1 is configured without an IP address:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr1
DEVICE=cloudbr1
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=none
HOTPLUG=no

Internal bridge cloud0

Under CentOS7.x and CloudStack 4.11 the cloud0 bridge is automatically configured, hence no additional configuration steps required.

Optional tagged interface for storage traffic

If a dedicated VLAN tagged IP interface is required for e.g. storage traffic this is accomplished by creating a VLAN tagged fake bridge on top of one of the cloud bridges. In this case we add it to cloudbr0 with VLAN 100:

# ovs-vsctl add-br cloudbr100 cloudbr0 100
# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr100
DEVICE=cloudbr100
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=static
IPADDR=10.0.100.20
NETMASK=255.255.255.0
OVS_OPTIONS="cloudbr0 100"
HOTPLUG=no

Additional OVS network settings

To finish off the OVS network configuration specify the hostname, gateway and IPv6 settings:

vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=kvmhost1.mylab.local
GATEWAY=192.168.100.1
NETWORKING_IPV6=no
IPV6INIT=no
IPV6_AUTOCONF=no

VLAN problems when using OVS

Kernel versions older than 3.3 had some issues with VLAN traffic propagating between KVM hosts. This has not been observed in CentOS 7.5 (kernel version 3.10) – however if this issue is encountered look up the OVS VLAN splinter workaround.

Network startup

Note – as mentioned for linux bridge networking – once all network startup scripts are in place and the network service is restarted you may lose connectivity to the host if there are any configuration errors in the files, hence make sure you have console access to rectify any issues.

To make the configuration live restart the network service:

# systemctl restart network

To check the bridges use the ovs-vsctl command. The following shows the optional cloudbr100 on VLAN 100:

# ovs-vsctl show
49cba0db-a529-48e3-9f23-4999e27a7f72
    Bridge "cloudbr0";
        Port "cloudbr0";
            Interface "cloudbr0"
                type: internal
        Port "cloudbr100"
            tag: 100
            Interface "cloudbr100"
                type: internal
        Port "bond0"
            Interface "veth0";
            Interface "eth0"
    Bridge "cloudbr1"
        Port "bond1"
            Interface "eth1"
            Interface "veth1"
        Port "cloudbr1"
            Interface "cloudbr1"
                type: internal
    Bridge "cloud0"
        Port "cloud0"
            Interface "cloud0"
                type: internal
    ovs_version: "2.9.2"

The bond status can be checked with the ovs-appctl command:

ovs-appctl bond/show bond0
---- bond0 ----
bond_mode: active-backup
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
updelay: 0 ms
downdelay: 0 ms
lacp_status: off
active slave mac: 00:0c:xx:xx:xx:xx(eth0)

slave eth0: enabled
active slave
may_enable: true

slave eth1: enabled
may_enable: true

To ensure that only OVS bridges are used also check that linux bridge control returns no bridges:

# brctl show
bridge name	bridge id		STP enabled	interfaces

As a final note – the CloudStack agent also requires the following two lines added to /etc/cloudstack/agent/agent.properties after install:

network.bridge.type=openvswitch
libvirt.vif.driver=com.cloud.hypervisor.kvm.resource.OvsVifDriver

OpenVswitch bridge configuration – Ubuntu

As discussed earlier in this blog post Ubuntu 18.04 introduced Netplan as a replacement to the legacy “/etc/network/interfaces” network configuration. Unfortunately Netplan does not support OVS, hence the first challenge is to revert Ubuntu to the legacy configuration method.

To disable Netplan first of all add “netcfg/do_not_use_netplan=true” to the GRUB_CMDLINE_LINUX option in /etc/default/grub. The following example also shows the use of legacy interface names as well as IPv6 being disabled:

GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0 ipv6.disable=1 netcfg/do_not_use_netplan=true"

Then rebuild GRUB and reboot the server:

grub-mkconfig -o /boot/grub/grub.cfg

To set the hostname first of all edit “/etc/cloud/cloud.cfg” and set this to preserve the system hostname:

preserve_hostname: true

Thereafter set the hostname with hostnamectl:

hostnamectl set-hostname --static --transient --pretty <hostname>

Now remove Netplan, install OVS from the Ubuntu repositories as well the “ifupdown” package to get standard network functionality back:

apt-get purge nplan netplan.io
apt-get install openvswitch-switch
apt-get install ifupdown

As for CentOS we need to blacklist the bridge module to prevent standard bridges being created. Please note the CloudStack agent install depends on the bridge module being in place, hence this step should be carried out after agent install.

echo "install bridge /bin/false" > /etc/modprobe.d/bridge-blacklist.conf

To stop network bridge traffic from traversing IPtables / ARPtables also add the following lines to /etc/sysctl.conf:

# vi /etc/sysctl.conf
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

Same as for CentOS we first of all add the OVS bridges and bonds from command line using the ovs-vsctl command line tools. In this case we also add the additional tagged fake bridge cloudbr100 on VLAN 100:

# ovs-vsctl add-br cloudbr0
# ovs-vsctl add-br cloudbr1
# ovs-vsctl add-bond cloudbr0 bond0 eth0 eth1 bond_mode=active-backup other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100
# ovs-vsctl add-bond cloudbr1 bond1 eth2 eth3 bond_mode=active-backup other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100
# ovs-vsctl add-br cloudbr100 cloudbr0 100

As for linux bridge all network configuration is applied in “/etc/network/interfaces”:

# vi /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
iface eth0 inet manual
iface eth1 inet manual
iface eth2 inet manual
iface eth3 inet manual

auto cloudbr0
allow-ovs cloudbr0
iface cloudbr0 inet static
  address 192.168.100.20
  netmask 255.255.155.0
  gateway 192.168.100.100
  dns-nameserver 192.168.100.5
  ovs_type OVSBridge
  ovs_ports bond0

allow-cloudbr0 bond0 
iface bond0 inet manual 
  ovs_bridge cloudbr0 
  ovs_type OVSBond 
  ovs_bonds eth0 eth1 
  ovs_option bond_mode=active-backup other_config:miimon=100

auto cloudbr1
allow-ovs cloudbr1
iface cloudbr1 inet manual

allow-cloudbr1 bond1 
iface bond1 inet manual 
  ovs_bridge cloudbr1 
  ovs_type OVSBond 
  ovs_bonds eth2 eth3 
  ovs_option bond_mode=active-backup other_config:miimon=100

Network startup

Since Ubuntu 14.04 the bridges have started automatically without any requirement for additional startup scripts. Since OVS uses the same toolset across both CentOS and Ubuntu the same processes as described earlier in this blog post can be utilised.

# ovs-appctl bond/show bond0
# ovs-vsctl show

To ensure that only OVS bridges are used also check that linux bridge control returns no bridges:

# brctl show
bridge name	bridge id		STP enabled	interfaces

As mentioned earlier the following also needs added to the /etc/cloudstack/agent/agent.properties file:

network.bridge.type=openvswitch
libvirt.vif.driver=com.cloud.hypervisor.kvm.resource.OvsVifDriver

Internal bridge cloud0

In Ubuntu there is no requirement to add additional configuration for the internal cloud0 bridge, CloudStack manages this.

Optional tagged interface for storage traffic

Additional VLAN tagged interfaces are again accomplished by creating a VLAN tagged fake bridge on top of one of the cloud bridges. In this case we add it to cloudbr0 with VLAN 100 at the end of the interfaces file:

# ovs-vsctl add-br cloudbr100 cloudbr0 100
# vi /etc/network/interfaces
auto cloudbr100
allow-cloudbr0 cloudbr100
iface cloudbr100 inet static
  address 10.0.100.20
  netmask 255.255.255.0
  ovs_type OVSIntPort
  ovs_bridge cloudbr0
  ovs_options tag=100

Conclusion

As KVM is becoming more stable and mature, more people are going to start looking at using it rather that the more traditional XenServer or vSphere solutions, and we hope this article will assist in configuring host networking. As always we’re happy to receive feedback , so please get in touch with any comments, questions or suggestions.

About The Author

Dag Sonstebo is  a Cloud Architect at ShapeBlue, The Cloud Specialists. Dag spends most of his time designing, implementing and automating IaaS solutions based on Apache CloudStack.