Ansible Tower for Continuous Infrastructure

As infrastructure and teams scale, effective and robust configuration management requires growing beyond manual processes and local conventions. Fortunately, Ansible Tower (or the upstream Open Source project Ansible AWX) provides a perfect platform for configuration management at scale.

The Ansible Tower/AWX documentation and tutorials provide comprehensive information about the individual components.  However, assembling all the moving pieces into a whole working solution can involve some trial and error and reverse engineering in order to understand how the components relate to one another.  Ansible Tower, like the core Ansible solution, offers flexibility in how features assembled to support different typed of workflows. The types of workflows can include once-off initial configurations, ad-hoc system maintenance, or continuous convergence.

Continuous convergence, also referred to as desired state, regularly re-applies the defined configuration to infrastructure. This tends to ‘correct the drift’ often encountered when only applying the configuration on infrastructure setup. For example, a continuous convergence approach to configuration management could apply the desired configuration on a recurring schedule of every 30 minutes.  

Some continuous convergence workflow characteristics can include:

  • Idempotent Ansible roles. If there are no required configuration deviations, run will report 0 changes.
  • A source code repository per Ansible role, similar to the Ansible Galaxy approach,
  • A source code repository for Ansible playbooks that include the individual Ansible roles,
  • A host configured to provide one unique service function only,
  • An Ansible playbook defined for each unique service function that gets applied to the host,
  • Playbooks applied to each host on a repeating schedule.

One way to achieve a continuous convergence workflow combines the Ansible Tower components according to the following conceptual model.

The Workflow Components

Playbook and Role Source Code

Ansible roles contain the individual tasks, handlers, and content with a role responsible for the installation and configuration of a particular software service.

Ansible playbooks configure a host for a particular service function in the environment acting as a wrapper for the individual role based configurations.  All the roles expected to be applied to a host must be defined in the playbook.

Source Code Repositories

Role git repositories contain the versioned definition of a role, e.g. one git repository per individual role.  The roles are pulled into the playbooks using the git reference and tags, which pegs the role version used within a playbook.

Project git repositories group the individual playbooks into single collection, e.g. one git repository per set of playbooks.  As with roles, specific versions of project repositories are also identified by version tags. 

Ansible Tower Server

Two foundational concepts in Ansible Tower are projects and inventories. Projects provide access to playbooks and roles. Inventories provide the connection to “real” infrastructure.  Inventories and projects also provide authorisation scope for activities in Ansible Tower. For example, a given group can use the playbooks in Project X and apply jobs to hosts in Inventory Y.

Each Ansible Tower Project is backed by a project git repository.  Each repository contains the playbooks and included roles that can be applied by a given job.  The Project is the glue between the Ansible configuration tasks and the plays that apply the configuration.

Ansible Tower Inventories are sets of hosts grouped for administration, similar to inventory sets used when applying playbooks manually.  One option is to group hosts into Inventories by environment.  For example, the hosts for development may be in one Inventory while the hosts for production may be in another Inventory.  User authorisation controls are applied at the Inventory level.

Ansible Tower Inventory Groups define sub-sets of hosts within the larger Inventory.  These subsets can then be used to limit the scope of a playbook job.  One option is to group hosts within an Inventory by function.  For example, the hosts for web servers may be in one Inventory Group and the hosts for databases may be in another Inventory Group.  This enables one playbook to target one inventory group.  Inventory groups effectively provide metadata labels for hosts in the Inventory.

An Ansible Job Template determines the configuration to be applied to hosts.  The Job Template links a playbook from a project to an inventory.   The inventory scope can be optionally further limited by specifying inventory group limits.  A Job Template can be invoked either on an ad-hoc basis or via a recurring schedule.

Ansible Job Schedules define the time and frequency at which the configuration specified in the Job Template is applied.  Each Job Template can be associated with one or more Job Schedules.  A schedule supports either once-off execution, for example during a defined change window, or regularly recurring execution.  A job schedule that applies the desired state configuration with a frequency of 30 minutes provides an example of a job schedule used for a continuous convergence workflow.

“Real” Infrastructure

An Ansible Job Instance defines a single invocation of an Ansible Job Template, both for scheduled and ad-hoc invocations of the job template.  Outside of Ansible Tower, the Job Instance is the equivalent of executing the ansible-playbook command using an inventory file.

Host is the actual target infrastructure resources configured by the job instance, applying an ansible playbook of included roles.

A note on Ansible Variables

As with other features of Ansible and Ansible Tower, variables also offer flexibility in defining parameters and context when applying a configuration.  In addition to declaring and defining variables in roles and playbooks, variable definitions can also be defined in Ansible Tower job templates, inventory and inventory groups, and individual hosts.  Given the plethora of options for variable definition locations, without a set of conventions for managing variable values, debugging runtime issues with roles and playbooks can become difficult.  E.g. which value defined at which location was used when applying the role?

One example of variable definitions conventions could include:

  • Variables shall be given default values in the role, .e.g. in the ../defaults/main.yml file.
  • If the variable must have a ‘real’ value supplied when applying the playbook, the variable shall be defined with an obvious placeholder value which will fail if not overridden.
  • Variables shall be described in the role README.md documentation
  • Do not apply variables at the host inventory level as host inventory can be transient.
  • Variables that select specific capabilities within a role shall be defined at the Ansible Tower Inventory Group.  For example, a role contains the configuration definition for both master and work nodes.  The Inventory Group variables are used to indicate which hosts must have the master configuration and applied and which must have the worker configuration applied.
  • Variables that define the environment context for configuration shall be defined in the Ansible Tower Job Template.

Following these conventions, each of the possible variable definition options serves a particular purpose.  When an issue with variable definition does arise, the source is easily identified.

Test Driven Infrastructure and Test Automation with Ansible, Molecule and Azure

A few years back, before the rise of the hyper-scalers, I had my first infracode ‘aha moment’ with OpenStack. The second came with Kitchen.

I had already been using test driven development for application code and configuration automation for infrastructure but Kitchen brought the two together. Kitchen made it possible to write tests, spin up infrastructure, and then tear everything down again – the Red/Green/Refactor cycle for infrastructure. What made this even better was that it wasn’t a facsimile of a target environment, it was the same – same VM’s, same OS, same network.

Coming from a Chef background for configuration automation, Kitchen is a great fit to the Ruby ecosystem. Kitchen works with Ansible and Azure, but a Ruby environment and at least a smattering of Ruby coding skills are required.

Molecule provides a similar red-green development cycle to Kitchen, but without the need to step outside of the familiar Python environment.

Out of the box, Molecule supports development of Ansible roles using either a Docker or Virtual Box infrastructure provider. Molecule also leverages the Ansible drivers for private and public cloud platforms.

Molecule can be configured to test an individual role or collections of roles in Ansible playbooks.

This tutorial demonstrates how to use Molecule with Azure to develop and test an individual Ansible role following the red/green/refactor infracode workflow, which can be generalised as:

  • Red– write a failing infrastructure test
  • Green – write the Ansible tasks needed to pass the test
  • Refactor – repeat the process

The steps required for this tutorial are as follows:

Azure setup

Ensure there is an existing Azure Resource Group that will be used for infracode development and testing. Within the resource group, ensure there is a single virtual network (vnet) with a single subnet. Ansible will use these for the default network setup.

Setup a working environment

There are a number of options for setting up a Python environment for Ansible and Molecule, including Python virtualenv or a Docker container environment.

Create a Docker image for Ansible+Molecule+Azure

This tutorial uses a Docker container environment. A Dockerfile for the image can be found in ./molecule-azure-image/Dockerfile. The image sets up a sane Python3 environment with Ansible, Ansible[azure], and Molecule pip modules installed.

Create a Docker workspace

Setup a working environment using the Docker image with Ansible, Molecule, and the azure-cli installed.

This example assumes the following:

  • a resource group already exists with access rights to create virtual machines; and
  • the resource group contains a single vnet with a single subnet

Log into an Azure subcription

Ansible supports a number of different methods for authenticating with Azure. This example uses the azure-cli to login interactively.

Create an empty Ansible role with Molecule

Molecule provides an init function with defaults for various providers. The molecule-azure-role-template creates an empty role with scaffolding for Azure.

Check that the environment is working by running the following code:

The output should look be similar to…

Spin up an Azure VM

Spin up a fresh VM to be used for infra-code development.

Molecule provides a handy option for logging into the new VM:

There is now a fresh Ubuntu 18.04 virtual machine ready for infra-code development. For this example, a basic Nginx server will be installed and verified.

Write a failing test

Testinfra provides a pytest based framework for verifying server and infrastructure configuration. Molecule then manages the execution of those testinfra tests. The Molecule template provides a starting point for crafting tests of your own. For this tutorial, installation of the nginx service is verified. Modify the tests file using vi molecule/default/tests/test_default.py

Execute the failing test

The Ansible task needed to install and enable nginx has not yet been written, so the test should fail:

If the initial sample tests in test_default.py are kept, then 3 tests should fail and 2 tests should pass.

Write a task to install nginx

Add a task to install the nginx service using vi tasks/main.yml:

Apply the role

Apply the role to the instance created using Molecule.

The nginx package should now be installed, both enabled and started, and listening on port 80. Note that the nginx instance will not be accessible from the Internet due to the Azure network security rules. The nginx instance can be confirmed manually by logging into the instance and using curl to make a request to the nginx service.

Execute the passing test

After applying the Ansible task to the instance, the testinfra tests should now pass.

Cleanup

Now that the Ansible role works as defined in the test specification, the development environment can be cleaned up.

Molecule removes the Azure resources created to develop and test the configuration role. Note that deletion may take a few minutes.

Finally, once you are done, exit the container environment. If the container was started with the --rm switch, the container will also be removed, leaving you with a clean workspace and newly minted Ansible role with automated test cases.

Full source code can be found at: https://github.com/gamma-data/json-wrangling-with-golang