I still remember the first time I had to roll out a basic Nginx config to a handful of VPS instances. I SSH'd into each node, ran the same apt install, copied over a vhost file with scp, and reloaded the service. It worked. The next month, when the devs asked for a new vhost on all six nodes, I did it again. And the month after that, I added a third vhost. That's when I snapped: I realised I was spending my afternoons on a glorified copy-paste loop, and if I missed a single step, production would silently serve stale content until someone complained. I needed the exact same config to land on every box the same way, every time, and I needed the services to pick up changes automatically without me tripping over a reload. So I leaned hard into Ansible roles for multi-node web server automation and never looked back.
Quick Summary
- Build a modular and clean role structure with site.yml at the root.
- Use host-level inventory variables to keep host-specific values out of the tasks.
- Use the template module for Nginx/Apache vhost creation instead of ad-hoc shell commands.
- Configure notify handlers to only reload Nginx/Apache services if a virtual host changed.
- Validate every deployment with a dry run (--check --diff) before running it against live servers.
- Write tasks that are inherently idempotent, so there's no need for shell: echo "something".
- Safely handle mid-play failures with --force-handlers during playbook execution.
Prerequisites and Environment Assumptions
Before creating any Ansible task, you need two things: a control node (where you run ansible-playbook) and at least two target nodes. I assume all nodes run Ubuntu 22.04 and that Ansible 8.x is installed via pip on the control node. The target nodes are fresh installs with nothing on them but an SSH daemon and a user with sudo access.
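If you need to reproduce the control-node setup, a pip install along these lines does it (pin the version range however you prefer; this is just a sketch of my assumption about the install method):

$ python3 -m pip install --user 'ansible>=8,<9'
$ ansible --version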
In my setup, I have 1 Control Node and 3 Web Nodes. I want to avoid hardcoding the IP addresses (or FQDNs) into the playbook, so I’m going to use inventory variables from the beginning.
Configuring Inventory Variables
The inventory file defines server groups and which hosts belong to each group. Rather than scattering ansible_host and custom variables across playbooks, I keep all of them in one centralised file, inventory/hosts.yml. Below is the basic structure I use for multi-node web server automation.
---
all:
  children:
    webservers:
      hosts:
        web01.example.com:
          ansible_host: 192.168.10.11
          site_domain: app1.mydomain.com
          site_port: 80
        web02.example.com:
          ansible_host: 192.168.10.12
          site_domain: app1.mydomain.com
          site_port: 80
        web03.example.com:
          ansible_host: 192.168.10.13
          site_domain: app2.mydomain.com
          site_port: 80
This structure groups all three web nodes under webservers. Each node gets its own site_domain variable; in the example above, web01 and web02 share a site_domain. In production, those two nodes might sit behind a load balancer, while web03 serves a different application. Because these values are host-level variables, the role can render the proper virtual host for each node.
According to the Ansible inventory documentation, host-level variables override group-level variables. That means you can set defaults under vars: for the webservers group and still override them per host, as sketched below.
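A sketch of what that looks like, with port 80 as the group default purely for illustration:

webservers:
  vars:
    site_port: 80              # group-level default shared by every host
  hosts:
    web01.example.com:
      ansible_host: 192.168.10.11
      site_domain: app1.mydomain.com
    web03.example.com:
      ansible_host: 192.168.10.13
      site_domain: app2.mydomain.com
      site_port: 8080          # host-level value wins over the group default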
Establishing SSH Authentication
Ansible needs to reach the target servers over SSH without prompting for a password. On the control node, I created a key pair and pushed the public key to each web server.
$ ssh-keygen -t ed25519 -f ~/.ssh/ansible_web -C "ansible-control"
$ ssh-copy-id -i ~/.ssh/ansible_web.pub ansible@192.168.10.11
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s)...
Number of key(s) added: 1
Now try logging into the machine, with: "ssh -i ~/.ssh/ansible_web ansible@192.168.10.11"
and check to make sure that only the key(s) you wanted were added.
Next, I ran ssh-copy-id against every target IP, then verified that Ansible can reach each node with ansible webservers -i inventory/hosts.yml -m ping -u ansible --private-key ~/.ssh/ansible_web. All three nodes returned pong, confirming the control node can reach them over SSH.
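The ansible.cfg in the project root (you'll see it in the directory layout below) can carry these connection defaults so the command-line flags become optional. The exact contents are up to you; a minimal sketch along the lines of my setup:

[defaults]
# default inventory so -i is no longer required
inventory = inventory/hosts.yml
# connection user and key, matching the ssh-copy-id steps above
remote_user = ansible
private_key_file = ~/.ssh/ansible_web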
Executing Ansible Roles Multi-Node Web Server Automation
The real benefit of roles over flat playbooks is reuse: the same role can drop into different scenarios without rewriting anything. A role is a container that groups related components (tasks, handlers, templates, and so on) into a standard file structure. That makes onboarding new team members easier, and it also lets you pull in community roles later because every role follows the same directory layout.
Why I Ultimately Chose This Route
Initially, I tried a few other options. I built one large playbook with include statements; it quickly became unmanageable. I also tried ansible-pull on each node, but that meant installing Ansible on every web server and wrestling with cron jobs. Roles won out because they let me split tasks into separate files, which keeps site.yml, the main entry point, readable. And when I later wanted an Apache role alongside Nginx, I could drop it into the same roles directory without touching the existing code.
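If you would rather not build the role skeleton by hand, ansible-galaxy can scaffold it; it creates a few extra directories (defaults, files, meta, tests) that you can keep or prune:

$ ansible-galaxy role init --init-path roles nginx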
Designing the YAML Structure and site.yml
Here’s the directory structure for the web server role that I’ve created.
.
├── ansible.cfg
├── inventory
│   └── hosts.yml
├── roles
│   └── nginx
│       ├── handlers
│       │   └── main.yml
│       ├── tasks
│       │   └── main.yml
│       ├── templates
│       │   └── vhost.conf.j2
│       └── vars
│           └── main.yml
└── site.yml
I wanted to keep things simple with the playbook, which is why the site.yml playbook looks almost boring.
---
- name: apply nginx web server role to all nodes
  hosts: webservers
  become: yes
  roles:
    - nginx
This flat, simple YAML is all it takes to connect the webservers group to the nginx role. You don't need to tell Ansible which tasks to load or when to run them; it automatically imports the role's tasks/main.yml, handlers/main.yml, and templates directory for you.
Generating Virtual Hosts with the Template Module
The template module lets you create virtual hosts without copying files over manually and editing text in each one. Place a Jinja2 template at templates/vhost.conf.j2 inside the role, and the role renders a virtual host for each server using the site_domain variable from your inventory.
server {
    listen {{ site_port }};
    server_name {{ site_domain }};
    root /var/www/{{ site_domain }}/html;
    index index.html;

    location / {
        try_files $uri $uri/ =404;
    }
}
The task that deploys the configuration file is just as short.
- name: deploy nginx virtual host configuration
  ansible.builtin.template:
    src: vhost.conf.j2
    dest: "/etc/nginx/sites-available/{{ site_domain }}.conf"
    owner: root
    group: root
    mode: '0644'
  notify: reload nginx
The virtual host files are named after the site_domain variable, so the third web server ends up with app2.mydomain.com.conf. Naming files this way avoids collisions, so two different virtual hosts can coexist on the same server. The Jinja2 engine fills in the {{ site_port }} and {{ site_domain }} placeholders automatically, with no sed commands required.
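One assumption baked into that task is that Nginx is already on the box. Since my targets start out as bare Ubuntu installs, tasks/main.yml opens with a package task before the template is deployed; a minimal sketch of how I would handle that on Ubuntu 22.04:

- name: ensure nginx is installed
  ansible.builtin.apt:
    name: nginx
    state: present
    update_cache: yes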
Managing Service States with Notify Handlers
The notify: reload nginx line triggers a handler, but only when the template task actually reports a change. Handlers trip up some people because they are defined separately from the tasks that notify them. Mine lives in handlers/main.yml.
---
- name: reload nginx
  ansible.builtin.service:
    name: nginx
    state: reloaded
A handler runs at the end of the play, and only on hosts where it was notified. If the play touches three servers and the vhost configuration changed on only one of them, Nginx is reloaded on that one server and left alone on the other two. That is exactly the behaviour you want from an idempotent deployment.
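If you want extra safety, handlers can also be chained with listen so a syntax check runs before the reload; handlers sharing a listen topic run in the order they are defined in handlers/main.yml. This is a variation I have experimented with, not part of the minimal setup above, and it fails the play before Nginx ever reloads a broken config:

---
# handlers/main.yml -- variation: test the config, then reload
- name: validate nginx configuration
  ansible.builtin.command: nginx -t
  changed_when: false        # a syntax check never changes state
  listen: reload nginx

- name: reload nginx service
  ansible.builtin.service:
    name: nginx
    state: reloaded
  listen: reload nginx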
Optimizing Playbooks for Idempotency and Safety
The goal is not just to run successfully once. The goal is to be able to re-run the playbook as often as you like and end up with the same state every time, with no spurious changes or errors; that is what idempotency means. Ansible's built-in modules give you most of this behaviour for free, which is why I reach for them before anything else.
Validating Changes Using a Dry Run
Before rolling a playbook out to production, I run it in check mode with diffs.
$ ansible-playbook -i inventory/hosts.yml site.yml --check --diff -u ansible --private-key ~/.ssh/ansible_web
TASK [nginx : deploy nginx virtual host configuration] *************************
--- before: /etc/nginx/sites-available/app1.mydomain.com.conf
+++ after: /etc/nginx/sites-available/app1.mydomain.com.conf
@@ -1,11 +1,11 @@
 server {
-    listen 80;
+    listen 443 ssl http2;
     server_name app1.mydomain.com;
     root /var/www/app1.mydomain.com/html;
...
changed: [web01.example.com] <-- would change if run without --check
ok: [web02.example.com]
The --diff output shows exactly which lines would change. Missing variables, whitespace drift, and outright broken templates all show up in the diff without touching the live systems. The dry run is a required step in my workflow: if a node reports changed and the diff is not what I expected, I stop and debug before running for real. The official Ansible documentation covers dry runs and task validation in detail (official Ansible validate task docs).
Writing Tasks for Idempotency
Idempotency is not magic; idempotent tasks use modules that can tell when the current state already matches the desired state. One unsafe practice I see frequently in blogs and forums is dropping something like shell: echo 'something' into task blocks. The shell module always reports changed, because Ansible has no way to know whether the command actually changed anything. I stick to modules such as template, apt, file, and lineinfile, which inspect the current state and act only when it differs from the desired state; they are idempotent by construction.
- name: ensure the site document root exists
  ansible.builtin.file:
    path: "/var/www/{{ site_domain }}/html"
    state: directory
    owner: www-data
    group: www-data
    mode: '0755'
If the directory already exists with the correct ownership and permissions, the task simply returns ok; no shell commands are needed. I use the same principle when enabling the site: the file module with src, path, and state: link creates the symlink in the sites-enabled directory. The module only acts when the symlink is missing or points somewhere else, so idempotency is built in with no extra effort.
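For reference, the enable-site task I am describing looks roughly like this:

- name: enable the virtual host
  ansible.builtin.file:
    src: "/etc/nginx/sites-available/{{ site_domain }}.conf"
    path: "/etc/nginx/sites-enabled/{{ site_domain }}.conf"
    state: link
  notify: reload nginx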
Handling Edge Cases: Orphaned Configurations and Skipped Handlers
Production always throws a curveball. Sometimes the playbook fails halfway through a run; sometimes I change a host's domain and the old vhost file is left behind with a valid symlink. I started handling these edge cases so I do not end up awake at 3:00 am chasing a ghost.
Bypassing Default Behavior to Force Handlers
Normally, handlers run only after all tasks have finished. If a task after the template task fails, the handler never executes on that host: the new vhost configuration sits on disk but is not active because Nginx never reloads. The --force-handlers option exists for exactly this case.
$ ansible-playbook -i inventory/hosts.yml site.yml --force-handlers -u ansible --private-key ~/.ssh/ansible_web
RUNNING HANDLER [nginx : reload nginx] *****************************************
changed: [web02.example.com]

PLAY RECAP *********************************************************************
web01.example.com : ok=5 changed=2 unreachable=0 failed=1 skipped=0
web02.example.com : ok=7 changed=2 unreachable=0 failed=0 skipped=0
With --force-handlers, a host that fails later in the play still runs any handlers it was notified for, so a vhost that has already been written to disk gets its reload instead of sitting there inactive until the next run. Here web02 reloads as expected, and the failure on web01 does not leave me guessing about whether a pending vhost change will be picked up. I do not use this flag daily, but it has come in handy when a non-critical task failed and I still needed the web server to load a new vhost.
Removing Stale Node Configurations
When a server's role or domain changes, old virtual host files are left sitting around. The first time this bit me, Nginx loaded both the old and the new configuration, and the two fought over the same IP and port. Now I explicitly set any orphaned file to absent.
- name: remove stale vhost config for previous domain
  ansible.builtin.file:
    path: "/etc/nginx/sites-available/{{ old_domain }}.conf"
    state: absent
  notify: reload nginx
The old_domain value is a host variable I set in the inventory when a host is migrated to a new domain. I also delete the corresponding symlink in sites-enabled with the same pattern (see below). Once the playbook finishes, Nginx reloads with only the desired virtual hosts and no old or default files left behind.
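The symlink cleanup task is the same pattern applied to the sites-enabled path:

- name: remove stale vhost symlink for previous domain
  ansible.builtin.file:
    path: "/etc/nginx/sites-enabled/{{ old_domain }}.conf"
    state: absent
  notify: reload nginx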
Frequently Asked Questions
How do I override default variables within a specific role?
Add a variable with the same name to the host or group in your inventory. In Ansible's variable precedence, inventory host variables rank higher than role defaults (note that values in a role's vars/main.yml rank higher than the inventory, so keep overridable values in defaults/main.yml). For example, say roles/nginx/defaults/main.yml sets:
site_port: 80
To set site_port to 8080 on a specific host, just add the variable to that host's inventory entry, as shown below.
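Extending the hosts.yml from earlier, the override is a single extra line on the host:

web03.example.com:
  ansible_host: 192.168.10.13
  site_domain: app2.mydomain.com
  site_port: 8080    # overrides the role default of 80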
To view the complete list of Ansible’s variable precedence, see the Ansible variable precedence documentation.
Why do my notify handlers only restart the web service once?
Handlers run only once, at the end of the play, even when several tasks notify the same handler. This is by design. If you need a restart after the first notifying task and a reload after the second, either use differently named handlers (restart nginx for the first, reload nginx for the second) or insert a meta: flush_handlers task between them, as shown below.
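The flush itself is a one-line task dropped between the two notifying tasks:

- name: run any pending handlers right now
  ansible.builtin.meta: flush_handlers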
Can I target a specific node within a multi-node inventory group?
Yes. Pass the --limit option to ansible-playbook to run the play against a single host, like so:
ansible-playbook -i inventory/hosts.yml site.yml --limit web03.example.com
I use this all the time when developing new templates, testing them on a single node before rolling them out to the whole group.
Idempotency, dry-run validation, and disciplined, consistent YAML structure are what keep a multi-node deployment from turning into a 3 a.m. call-out spent frantically working out what went wrong. Once inventory variables, the template module, and notify handlers are wired together, the playbook does the same boring, repeatable thing run after run. That is the goal.