Problem Statement

When using OpenStack Heat, you want a way to test your templates, especially if you expect them to be used hundreds or thousands of times. Things on the internet change, and if you are using Heat to help orchestrate software deployment, you are going to want a way to validate that what you built today still works tomorrow. For templates you need to rely on time and time again, you will likely want to trigger regular builds to make sure they keep working as you’d expect.

Wait Conditions

First, wait conditions aren’t tests. They are useful for signaling when something is “done”. If you’re using the user_data options within Heat, you can include wait conditions to signal back to the Heat service whether things succeeded or failed. Here’s an example template:

resources:
  wait_condition:
    type: OS::Heat::SwiftSignal
    properties:
      handle: { get_resource: wait_condition_handle }
      timeout: 1800

  wait_condition_handle:
    type: OS::Heat::SwiftSignalHandle

  ssh_key:
    type: "OS::Nova::KeyPair"
    properties:
      name: { get_param: "OS::stack_id" }
      save_private_key: true

  example_server:
    type: "OS::Nova::Server"
    properties:
      name: example_server
      flavor: general1-1
      image: Ubuntu 14.04 LTS (Trusty Tahr) (PVHVM)
      key_name: { get_resource: ssh_key }
      config_drive: "true"
      user_data_format: RAW
      user_data:
        str_replace:
          template: |
            #cloud-config
            package_update: true
            package_upgrade: true
            packages:
              - curl
              - vim
            write_files:
              # Install Script
              - path: /tmp/example-server.sh
                permissions: '0544'
                content: |
                  #!/bin/bash -v
                  cd /tmp
                  # Signal failure if the download fails so the stack fails
                  # fast instead of waiting out the timeout
                  curl -fo wordpress.tar.gz https://wordpress.org/latest.tar.gz || \
                    { wc_notify --data-binary '{"status": "FAILURE"}'; exit 1; }
                  # Notify success
                  wc_notify --data-binary '{"status": "SUCCESS"}'
            runcmd:
              - /tmp/example-server.sh
          params:
            wc_notify: { get_attr: ['wait_condition_handle', 'curl_cli'] }

There are some tradeoffs with this approach. If you want to be 100% sure your application is installed and running properly, you will need to install all testing requirements and run the tests as part of the stack launch. Including all of that testing adds time to your deployment, as well as additional potential failures. The internet is not perfect. Things go down, get DoS’d, IPs get blacklisted, or things like package names change or packages get pulled entirely. The more external dependencies you have (repositories, PyPI, GitHub, etc.), the more likely your stack is to have issues. In summary, testing as a part of your deploy will lead to:

  1. A greater chance of failure.
  2. Slower deployment times.

These things do not mean that wait conditions are bad; I would say the opposite. If you don’t use a wait condition, the second a server comes up it is considered “complete”, so in this single-server example, your stack will reach the state CREATE_COMPLETE before the installation scripts finish. This would mislead your users into believing that your deploy is complete, and if you have additional automation that triggers off a stack coming up, it may have issues.
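
For example, any automation that watches stack status through python-heatclient will happily move on the moment the stack reports CREATE_COMPLETE. Here is a minimal sketch of such a poller; the endpoint and token values are placeholders, and in practice they come from Keystone authentication:

import time

from heatclient.client import Client

HEAT_ENDPOINT = 'https://heat.example.com/v1/<tenant-id>'  # placeholder
AUTH_TOKEN = '<keystone-token>'                            # placeholder

heat = Client('1', endpoint=HEAT_ENDPOINT, token=AUTH_TOKEN)


def wait_for_create(stack_id, poll_interval=15):
    """Block until the stack leaves CREATE_IN_PROGRESS, then return its status."""
    while True:
        stack = heat.stacks.get(stack_id)
        if stack.stack_status != 'CREATE_IN_PROGRESS':
            return stack.stack_status  # e.g. CREATE_COMPLETE or CREATE_FAILED
        time.sleep(poll_interval)

Without a wait condition, this poller returns CREATE_COMPLETE while your install script may still be running; with one, CREATE_COMPLETE only happens after the software has signaled success.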

The ‘hot’ package

I created hot a few years ago to solve a testing problem. Just because your stack says CREATE_COMPLETE doesn’t mean the application successfully deployed.

The hot utility leverages python-heatclient for launching stacks, and then provides a few basic functions that allow you to define your tests (a sketch of the launch call follows the list below). hot does not require you to make any changes to your Heat templates in order to use the tool. It’s about as flexible as your imagination. There are two types of tests it can run:

  1. fabric
  2. script
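
Since hot drives stack launches through python-heatclient, the create step boils down to a call along these lines. This is a rough sketch for orientation, not hot’s actual internals; the endpoint, token, and file name are placeholders:

from heatclient.client import Client

heat = Client('1', endpoint='https://heat.example.com/v1/<tenant-id>',  # placeholder
              token='<keystone-token>')                                 # placeholder

with open('memcached.yaml') as f:
    template = f.read()

# Launch the stack under test, mirroring the create: section of tests.yaml
heat.stacks.create(stack_name='template-under-test',
                   template=template,
                   parameters={'flavor': '2GB Standard Instance'},
                   timeout_mins=30)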

Fabric

Why not just script? We liked fabric for some of the things it could do, especially when it came to testing multiple servers, so we built in fabric support. Fabric can be a little tricky to set up for testing if you haven’t worked with it before, so the tests we created were designed to make it easy for a user who hasn’t used fabric before to look at an example, plug in their values, and get testing.

Why fabric at all? There were two main reasons:

  1. It’s written in Python, just like Heat
  2. Fabfiles can be run independently

Being able to run the scripts independently was a big deal for us at the time. We were doing a lot of work on templates, primarily using configuration management, and we wanted to make changes and test without completely re-deploying our stack. Once we had everything where we wanted it, we’d re-deploy and make sure all the tests passed. This saved a significant amount of time. One time this came in very handy was when we were getting a false pass for one of our tests. By standing up the stack and invoking our fabfile directly, we were able to diagnose the issue much more quickly, without having to do anything wonky with our local environment.
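
As a sketch of what invoking a fabfile directly can look like, assuming the stack is already up: fabric 1.x will run the same tasks against a live server without going through hot at all. The IP address and import path below are placeholders:

import sys

from fabric.api import env, execute

# Make the fabfile importable; the path is a placeholder
sys.path.insert(0, 'test/fabric')
import memcached_server

env.user = 'root'
env.key_filename = 'tmp/private_key'
env.hosts = ['203.0.113.10']  # a server from the standing stack
env.disable_known_hosts = True

execute(memcached_server.check)

The fab command line can do the same thing with its -f, -H, -u, and -i options, which is handy for quick iteration.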

Here’s an example of the test cases for the Rackspace memcached template. You will see that the second test actually passes different parameters into the Heat template itself:

test-cases:
- name: Default Build Test # Deploy using all default options
  create:
    timeout: 30 # Deployment should complete in under 30 minutes
  resource_tests: # Tests to run on the resources themselves
    ssh_private_key: { get_output: private_key } # Fetch from output-list of stack
    ssh_key_file: tmp/private_key # File to write with ssh_private_key
    tests:
    - memcached_server:
        fabric:
          # Fabric environment settings to use while running envassert script
          # http://docs.fabfile.org/en/latest/usage/env.html
          env:
            user: root
            key_filename: tmp/private_key
            hosts: { get_output: server_ip } # Fetch from output-list of stack
            tasks:
              - artifacts
              - check
            abort_on_prompts: True
            connection_attempts: 3
            disable_known_hosts: True
            use_ssh_config: True
            fabfile: test/fabric/memcached_server.py # Path to envassert test

- name: Standard Instance, non-standard Port
  create:
    parameters:
      flavor: 2GB Standard Instance
      memcached_port: 11212
    timeout: 30
  resource_tests:
    ssh_private_key: { get_output: private_key }
    ssh_key_file: tmp/private_key
    tests:
    - memcached_port:
        fabric:
          env:
            user: root
            key_filename: tmp/private_key
            hosts: { get_output: server_ip }
            tasks:
              - artifacts
              - check
            abort_on_prompts: True
            connection_attempts: 3
            disable_known_hosts: True
            use_ssh_config: True
            fabfile: test/fabric/memcached_port_11212.py

Here’s one of the fabfiles used for the tests:

from fabric.api import env, task
from envassert import detect, file, package, port, process, service
from hot.utils.test import get_artifacts


@task
def check():
    # envassert needs the platform family (debian, redhat, etc.) to pick
    # the right commands for each assertion
    env.platform_family = detect.detect()

    assert package.installed("memcached")
    assert file.exists("/etc/memcached.conf")
    assert port.is_listening(11211)
    assert process.is_up("memcached")
    assert service.is_enabled("memcached")


@task
def artifacts():
    env.platform_family = detect.detect()
    get_artifacts()  # pull logs and other artifacts off the server for debugging

If you work with Python, you’re probably starting to feel more comfortable with what you’re seeing at this point. hot will handle setting up your fabric environment as defined in your tests.yaml file.

Script

This is the “do whatever you want” option. It could be as simple as a bash script that curls a URL and checks the status code, or as involved as using something like Selenium to emulate a browser and test your site. You could even go as far as making sure you’re running the latest version of a given application, if your template is meant to represent the latest and greatest. As the examples below show, a script test is just an executable: exit zero for success, nonzero for failure.
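
As a minimal sketch of the curl-and-check idea, written in Python for consistency with the other tests here (the URL comes in through command_args, and a nonzero exit marks the test failed):

#!/usr/bin/env python
# Minimal script test sketch: fetch the URL passed as the first argument
# and exit 0 only on HTTP 200.
import sys

import requests

if __name__ == "__main__":
    url = sys.argv[1]
    response = requests.get(url)
    print("{} returned {}".format(url, response.status_code))
    sys.exit(0 if response.status_code == 200 else 1)

Here’s an example from the wordpress-small template that someone else wrote. Their tests combine script and fabric: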

test-cases:
- name: One-Click Build Test # Test 1-Click URL version
  create:
    timeout: 30 # Deployment should complete in under 30 minutes
    parameters:
      flavor: 1 GB General Purpose v1
  resource_tests: # Tests to run on the resources themselves
    ssh_private_key: { get_output: ssh_private_key } # Fetch from output-list of stack
    ssh_key_file: tmp/private_key # File to write with ssh_private_key
    tests:
    - check_lb_login:
        script:
          commands:
            - command: "./test/script/wp_login.py"
              command_args:
                - { get_output: wordpress_public_ip }
                - { get_output: wordpress_public_url }
                - { get_output: wordpress_login_user }
                - { get_output: wordpress_login_password }
    - check_master_login:
        script:
          commands:
            - command: "./test/script/wp_login.py"
              command_args:
                - { get_output: server_ip }
                - { get_output: wordpress_public_url }
                - { get_output: wordpress_login_user }
                - { get_output: wordpress_login_password }
    - wordpress_server:
        fabric:
          # Fabric environment settings to use while running envassert script
          # http://docs.fabfile.org/en/latest/usage/env.html
          env:
            user: root
            key_filename: tmp/private_key
            hosts: { get_output: server_ip } # Fetch from output-list of stack
            tasks:
              - artifacts
              - check
            abort_on_prompts: True
            connection_attempts: 3
            disable_known_hosts: True
            use_ssh_config: True
            fabfile: test/fabric/wordpress.py # Path to envassert test

The above example makes sure they can log in to WordPress both through the load balancer and directly on a server. Here’s what the login test script looks like:

#! /usr/bin/env python

import json
import re
import requests
import sys


class WPInteraction(object):
    def __init__(self, ip, domain="example.com",
                 login_id="wp_user", password=""):
        self.ip = ip
        self.domain = domain
        self.login_id = login_id
        self.password = password
        self.session = requests.Session()
        # domain is expected to be a full URL like "http://example.com/";
        # strip the leading "http://" and trailing slash to use it as the
        # Host header while connecting by IP
        self.session.headers.update({"Host": self.domain[7:-1]})

    def get_login_page(self):
        url = "http://{}".format(self.ip)
        r = self.session.get(url)
        return r.text

    def wp_post_login(self):
        url = "http://{}/wp-login.php".format(self.ip)
        print "url is {}".format(url)
        data = {"log": self.login_id,
                "pwd": self.password}
        print json.dumps(data, indent=4)
        r = self.session.post(url, data=data, allow_redirects=False)
        if r.is_redirect:
            # WordPress redirects to its configured public URL; rewrite the
            # redirect back to the IP so the request still hits the server
            # under test
            redirected_url = r.headers.get('location')
            r = self.session.get(re.sub(self.domain, "http://" + self.ip + "/", redirected_url))
        print "status code is {}".format(r.status_code)
        return r.text

    def login_successful(self):
        # a successful login lands on the wp-admin Dashboard
        content = self.wp_post_login()
        return "Dashboard" in content


if __name__ == "__main__":
    print json.dumps(sys.argv)
    ip = sys.argv[1]
    domain = sys.argv[2]
    login_id = sys.argv[3]
    password = sys.argv[4]
    wp = WPInteraction(ip, domain=domain, login_id=login_id, password=password)

    if wp.login_successful():
        print "Wordpress login successful."
        sys.exit(0)
    else:
        print "login failed :("
        sys.exit(1)

I had nothing to do with writing the above test, but it’s a fantastic example of how you can test your stack through synthetic transactions.

Summary

hot is a nice utility that allows you to more fully test your Heat templates and deployments. There are additional options available for hot, so check out the docs to see what else can be done. hot is not meant to replace tools like python-heatclient; in fact, it’s meant to supplement them.