
Playbooks

First published by Atif Alam

A playbook is a YAML file containing one or more plays. Each play maps a group of hosts to a set of tasks.

# site.yml
---
- name: Configure web servers
  hosts: webservers
  become: true
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present
    - name: Start nginx
      service:
        name: nginx
        state: started
        enabled: true

- name: Configure database servers
  hosts: dbservers
  become: true
  tasks:
    - name: Install PostgreSQL
      apt:
        name: postgresql
        state: present

A play has:

  • name — Human-readable description.
  • hosts — Which inventory hosts/groups to target (supports patterns).
  • become — Whether to use privilege escalation (sudo).
  • tasks — Ordered list of actions to perform.
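
The hosts field accepts inventory patterns as well as plain group names. A minimal sketch of the three most common forms, assuming hypothetical groups named staging and production exist in the inventory alongside webservers and dbservers:

```yaml
# staging and production are assumed group names, used only for illustration.
- name: Web servers that are not in staging
  hosts: "webservers:!staging"      # exclusion
  tasks:
    - name: Check connectivity
      ping:

- name: Web servers that are also in production
  hosts: "webservers:&production"   # intersection
  tasks:
    - name: Check connectivity
      ping:

- name: Web and database servers together
  hosts: "webservers:dbservers"     # union
  tasks:
    - name: Check connectivity
      ping:
```

Quoting the pattern keeps characters such as ! and & from being read as YAML syntax.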

Each task calls a module with arguments:

tasks:
  - name: Install packages
    apt:
      name:
        - nginx
        - curl
        - git
      state: present
      update_cache: true
  - name: Copy config file
    copy:
      src: files/nginx.conf
      dest: /etc/nginx/nginx.conf
      owner: root
      mode: "0644"
  - name: Create application user
    user:
      name: deploy
      shell: /bin/bash
      groups: www-data
      append: true

Tasks run in order, top to bottom. If a task fails, Ansible stops on that host (but continues on other hosts by default).
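
The per-host behaviour described above is only the default. As a hedged sketch, the play-level any_errors_fatal keyword changes it so that a failure on one host ends the play for every host. The script path is an assumption:

```yaml
- name: Migrate schema on all database servers
  hosts: dbservers
  any_errors_fatal: true   # a failure on any host stops the play for all hosts
  tasks:
    - name: Apply migration
      command: /opt/scripts/migrate.sh   # assumed script path
```

This can suit steps where a partially applied change across the group would be worse than stopping everywhere.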

Handlers are tasks that only run when notified. Use them for actions that should only happen when something changes (e.g. restart a service after a config file changes):

tasks:
  - name: Copy nginx config
    copy:
      src: files/nginx.conf
      dest: /etc/nginx/nginx.conf
    notify: Restart nginx
  - name: Copy app config
    template:
      src: templates/app.conf.j2
      dest: /etc/app/config.yml
    notify:
      - Restart app
      - Restart nginx

handlers:
  - name: Restart nginx
    service:
      name: nginx
      state: restarted
  - name: Restart app
    service:
      name: myapp
      state: restarted

Key points:

  • Handlers run once at the end of the play, even if notified multiple times.
  • Handlers run in the order they are defined (not the order they were notified).
  • If a task reports changed, the notify fires. If the task reports ok (no change), it doesn’t.
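
When a later task depends on the restart having already happened, notified handlers can be run earlier. A minimal sketch using meta: flush_handlers, with the verification URL as an assumption:

```yaml
tasks:
  - name: Copy nginx config
    copy:
      src: files/nginx.conf
      dest: /etc/nginx/nginx.conf
    notify: Restart nginx

  - name: Run any notified handlers now instead of at the end of the play
    meta: flush_handlers

  - name: Verify nginx answers after the restart
    uri:
      url: http://localhost/   # assumed URL
      status_code: 200

handlers:
  - name: Restart nginx
    service:
      name: nginx
      state: restarted
```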

Run a playbook with the ansible-playbook command:

ansible-playbook site.yml                            # run with default inventory
ansible-playbook -i inventory.yml site.yml           # specify inventory
ansible-playbook site.yml --limit webservers         # run on a subset of hosts
ansible-playbook site.yml --limit web1.example.com   # single host
ansible-playbook site.yml --tags deploy              # only tagged tasks
ansible-playbook site.yml --skip-tags debug          # skip tagged tasks
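
The --tags and --skip-tags options select tasks by their tags keyword. A short sketch of how tasks might be tagged to match the deploy and debug tags used above; the task contents are illustrative:

```yaml
tasks:
  - name: Copy application files
    copy:
      src: app/
      dest: /opt/myapp/
    tags: deploy        # selected by --tags deploy

  - name: Print the detected distribution
    debug:
      var: ansible_distribution
    tags:
      - debug           # excluded by --skip-tags debug
```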

Preview what would change without actually changing anything:

ansible-playbook site.yml --check

Modules that support check mode report what they would change; tasks whose modules do not support it are skipped and change nothing. Add --diff to see file content differences:

ansible-playbook site.yml --check --diff
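
Individual tasks can opt in or out of check mode. A hedged sketch: check_mode: false forces a read-only command to run even under --check, and the ansible_check_mode variable lets a task skip itself during a dry run. The myapp command and the endpoint are assumptions:

```yaml
tasks:
  - name: Read current version (safe to run even with --check)
    command: myapp --version
    register: version
    check_mode: false      # always execute, even in a dry run
    changed_when: false    # reading a version never changes anything

  - name: Call deploy webhook only on a real run
    uri:
      url: http://localhost:8080/deploy   # assumed endpoint
      method: POST
    when: not ansible_check_mode
```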

Increase output detail for debugging:

ansible-playbook site.yml -v      # verbose
ansible-playbook site.yml -vv     # more verbose
ansible-playbook site.yml -vvv    # even more detail; a reasonable starting level for debugging
ansible-playbook site.yml -vvvv   # adds connection debugging

Run a task only when a condition is true:

tasks:
  - name: Install nginx (Debian/Ubuntu)
    apt:
      name: nginx
      state: present
    when: ansible_os_family == "Debian"
  - name: Install nginx (RHEL/CentOS)
    yum:
      name: nginx
      state: present
    when: ansible_os_family == "RedHat"
  - name: Enable debug logging
    copy:
      content: "DEBUG=true"
      dest: /etc/app/debug.conf
    when: app_debug | default(false)

when takes a raw Jinja2 expression, so the condition is written without {{ }} delimiters. Common conditions:

when: ansible_distribution == "Ubuntu"
when: ansible_distribution_version is version('22.04', '>=')
when: result.rc == 0
when: my_var is defined
when: my_var | length > 0
when: inventory_hostname in groups['webservers']
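
Conditions can also be combined. A brief sketch: a YAML list under when is treated as a logical AND, while or is written out in a single expression:

```yaml
tasks:
  - name: Install nginx on Ubuntu 22.04 or newer
    apt:
      name: nginx
      state: present
    when:                                   # every list entry must be true
      - ansible_distribution == "Ubuntu"
      - ansible_distribution_version is version('22.04', '>=')

  - name: Install nginx on RHEL-family or Debian-family hosts
    package:
      name: nginx
      state: present
    when: ansible_os_family == "RedHat" or ansible_os_family == "Debian"
```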

Run a task multiple times with different values:

tasks:
  - name: Install packages
    apt:
      name: "{{ item }}"
      state: present
    loop:
      - nginx
      - curl
      - git
      - htop
  - name: Create users
    user:
      name: "{{ item.name }}"
      groups: "{{ item.groups }}"
    loop:
      - { name: alice, groups: admin }
      - { name: bob, groups: developers }
      - { name: carol, groups: developers }
  - name: Create numbered config files
    copy:
      content: "Server {{ ansible_loop.index }}"
      dest: "/etc/app/server-{{ ansible_loop.index }}.conf"
    loop: "{{ server_list }}"
    loop_control:
      extended: true
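
When looping over dictionaries, the task output prints each whole item by default. A small sketch using loop_control.label to show only the user name; the user list repeats part of the one above:

```yaml
tasks:
  - name: Create users
    user:
      name: "{{ item.name }}"
      groups: "{{ item.groups }}"
    loop:
      - { name: alice, groups: admin }
      - { name: bob, groups: developers }
    loop_control:
      label: "{{ item.name }}"   # shown in task output instead of the full dict
```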

Capture the output of a task for use in later tasks:

tasks:
  - name: Check if app is running
    command: systemctl is-active myapp
    register: app_status
    ignore_errors: true
  - name: Start app if not running
    service:
      name: myapp
      state: started
    when: app_status.rc != 0
  - name: Show app status
    debug:
      msg: "App status: {{ app_status.stdout }}"

Continue even if a task fails:

- name: Check optional service
  command: systemctl status optional-service
  register: result
  ignore_errors: true

Define custom failure conditions:

- name: Run health check
  command: curl -s http://localhost:8080/health
  register: health
  failed_when: "'healthy' not in health.stdout"

Control when a task reports “changed”:

- name: Check current version
  command: myapp --version
  register: version
  changed_when: false   # this command never changes anything
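
changed_when and failed_when also accept expressions over the registered result. A sketch, assuming a hypothetical migrate.sh that prints "applied" when it alters the schema and exits with code 2 when there is nothing to do:

```yaml
- name: Run migrations
  command: /opt/myapp/migrate.sh
  register: migrate
  changed_when: "'applied' in migrate.stdout"   # output format is an assumption
  failed_when: migrate.rc not in [0, 2]         # exit code 2 meaning "nothing to do" is an assumption
```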

Try-catch-finally pattern:

tasks:
  - block:
      - name: Deploy new version
        command: deploy.sh
      - name: Run smoke tests
        command: test.sh
    rescue:
      - name: Rollback on failure
        command: rollback.sh
    always:
      - name: Send notification
        slack:
          token: "{{ slack_token }}"   # the slack module requires a token
          msg: "Deploy finished ({{ ansible_failed_task | default('success') }})"
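
Inside rescue, Ansible exposes details of the failure. A minimal sketch that logs them through the ansible_failed_task and ansible_failed_result variables before rolling back; the script names are reused from the example above:

```yaml
tasks:
  - block:
      - name: Deploy new version
        command: deploy.sh
    rescue:
      - name: Report what failed
        debug:
          msg: "Task '{{ ansible_failed_task.name }}' failed: {{ ansible_failed_result.msg | default('no message') }}"

      - name: Rollback on failure
        command: rollback.sh
```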

By default, Ansible runs each task on all hosts before moving to the next task. With serial, you deploy to hosts in batches — essential for zero-downtime rolling updates.

- name: Rolling deploy
  hosts: webservers
  serial: 2   # 2 hosts at a time
  become: true
  tasks:
    - name: Pull latest code
      git:
        repo: https://github.com/myorg/app.git
        dest: /opt/myapp
        version: main
    - name: Restart app
      service:
        name: myapp
        state: restarted

If you have 10 web servers, Ansible processes them in batches of 2. Each batch completes all tasks before the next batch starts.

serial: "25%" # 25% of hosts per batch

Start small, then increase batch size:

serial:
  - 1        # first deploy to 1 host (canary)
  - 3        # then 3 at a time
  - "50%"    # then 50% of the play's hosts per batch
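
As a worked example, assuming a play that matches 10 hosts, the list above would produce the batches noted below. Percentages are taken from the total number of hosts in the play, and the last entry repeats until no hosts remain:

```yaml
# Assumes the play targets 10 hosts in total.
serial:
  - 1        # batch 1: 1 host
  - 3        # batch 2: 3 hosts
  - "50%"    # batch 3: 5 hosts (50% of 10); batch 4: the 1 remaining host
```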

Set max_fail_percentage to abort the play when too many hosts fail:

- name: Rolling deploy
  hosts: webservers
  serial: 2
  max_fail_percentage: 25   # stop if >25% of hosts fail
  tasks:
    - name: Deploy
      # ...

Without max_fail_percentage, Ansible drops the hosts that failed and moves on to the next batch, as long as at least one host in the current batch succeeded. With it set, Ansible aborts the entire play once the share of failed hosts in a batch exceeds the threshold.

- name: Rolling deploy
  hosts: webservers
  serial: 1
  pre_tasks:
    - name: Remove from load balancer
      uri:
        url: "http://lb.example.com/api/deregister/{{ inventory_hostname }}"
        method: POST
      delegate_to: localhost
  tasks:
    - name: Deploy new version
      copy:
        src: app/
        dest: /opt/myapp/
    - name: Restart app
      service:
        name: myapp
        state: restarted
  post_tasks:
    - name: Add back to load balancer
      uri:
        url: "http://lb.example.com/api/register/{{ inventory_hostname }}"
        method: POST
      delegate_to: localhost

This takes each host out of rotation, deploys, then re-registers — one at a time.


Run a task on a different host than the one being targeted. Common for load balancer operations, database tasks, or API calls from the control node.

- name: Remove host from load balancer
  uri:
    url: "http://lb.example.com/api/deregister/{{ inventory_hostname }}"
    method: POST
  delegate_to: lb.example.com
- name: Run database migration (from bastion)
  command: /opt/scripts/migrate.sh
  delegate_to: bastion.example.com
  run_once: true   # only run once, not per host

The task runs on lb.example.com or bastion.example.com, but inventory_hostname and other host variables still refer to the current target host.

Run a task on the control node (your laptop / CI server):

- name: Send Slack notification
  slack:
    token: "{{ slack_token }}"
    msg: "Deployed to {{ inventory_hostname }}"
  delegate_to: localhost
- name: Wait for host to come back
  wait_for:
    host: "{{ ansible_host }}"
    port: 22
    delay: 10
    timeout: 300
  delegate_to: localhost

Combine run_once with delegate_to to run a task exactly once across the entire play:

- name: Run database migration
  command: /opt/myapp/migrate.sh
  delegate_to: "{{ groups['dbservers'][0] }}"
  run_once: true

Even if the play targets 20 web servers, the migration runs once on the first database server.


Use async for long-running tasks (backups, large downloads, database restores) that might otherwise exceed the SSH connection timeout.

Start the task, poll for completion:

- name: Run long database backup
  command: /opt/scripts/backup-db.sh
  async: 3600   # allow up to 1 hour
  poll: 30      # check every 30 seconds

Ansible starts the task in the background on the host and checks back every 30 seconds until it finishes or the one-hour limit expires, so no single SSH connection has to stay open for the whole run.

Start the task and move on immediately:

- name: Start background reindex
  command: /opt/scripts/reindex.sh
  async: 3600
  poll: 0                    # don't wait
  register: reindex_result   # keeps the job ID for async_status below

# ... other tasks run while reindex is in progress ...

- name: Check reindex status
  async_status:
    jid: "{{ reindex_result.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 60
  delay: 30

poll: 0 means fire-and-forget. Use async_status later to check if it finished.

| Situation | Use |
|---|---|
| Task takes > 30 seconds | async with poll |
| Task takes minutes, and you need to do other work meanwhile | async with poll: 0 |
| Normal short tasks | Don't use async |

Strategies control how Ansible executes tasks across hosts.

With the linear strategy, each task runs on all hosts before the next task starts, so all hosts stay in sync:

- name: Deploy app
  hosts: webservers
  strategy: linear   # default, no need to specify

With the free strategy, each host proceeds through tasks independently, so fast hosts don't wait for slow hosts:

- name: Update all servers
  hosts: all
  strategy: free
  tasks:
    - name: Update packages
      apt:
        upgrade: dist

Useful when tasks are independent and hosts have different speeds. Output can be harder to read since hosts interleave.

The host_pinned strategy is like free, but Ansible only works on as many hosts at once as it has forks, and it starts a new host only when an active one has finished the whole play.
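
A minimal sketch of selecting this strategy, assuming it is the host_pinned strategy plugin that ships with Ansible; the play mirrors the free example above:

```yaml
- name: Update all servers
  hosts: all
  strategy: host_pinned   # each active host runs the play to completion
  tasks:
    - name: Update packages
      apt:
        upgrade: dist
```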

The debug strategy runs tasks like linear but opens an interactive debugger when a task fails:

- name: Debug a play
  hosts: webservers
  strategy: debug

The debugger lets you inspect and change variables, edit task arguments, and re-run the failed task. Useful for development only.
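
The debugger can also be enabled without switching strategy. A hedged sketch using the debugger keyword on a single task, so that only a failure of that task opens the prompt; the command is an assumption:

```yaml
- name: Debug a single task
  hosts: webservers
  tasks:
    - name: Run smoke tests
      command: test.sh        # assumed script
      debugger: on_failed     # open the debugger only if this task fails
```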

Set the default strategy for every play in ansible.cfg:

# ansible.cfg
[defaults]
strategy = linear

  • A playbook is a list of plays; each play targets hosts with ordered tasks.
  • Handlers run only when notified and only once at the end of the play.
  • Use --check --diff to preview changes before applying.
  • when for conditionals, loop for iteration, register to capture output.
  • Use block/rescue/always for error handling and rollback patterns.
  • Control task reporting with changed_when and failed_when.
  • Use serial for rolling updates — deploy in batches with max_fail_percentage for safety.
  • Use delegate_to to run tasks on a different host (load balancer, bastion, localhost).
  • Use async for long-running tasks; poll: 0 for fire-and-forget.
  • linear strategy keeps hosts in sync; free lets fast hosts proceed independently.