Ansible for Config Backup & Compliance: Catching Network Drift Before It Bites

Someone makes a midnight CLI change on a core switch, tells no one, and three weeks later an outage has no clean rollback point. This lesson turns Ansible into a nightly camera + auditor: it pulls and versions every device config, then fails the host that drifts from your approved baseline — before the auditor or the outage finds it first.

📅 2026-06-11 · ⏱ 13 min · 3 live demos · 4 infographics · 🏷 10-Q assessment + AI Tutor inline

⚡ Quick Answer

Ansible config backup & compliance for L1/L2 engineers and RHCE: nightly ios_config backups to Git, diff_against drift detection, assert-based policy checks (no telnet, NTP, SNMPv3, AAA), and a full nightly pipeline.

Pick where you want to start

1. The drift problem: midnight change, no record, no rollback point.
2. Backup playbooks: ios_config backup + Git = full change history.
3. Compliance checks: read state, assert policy, fail the drifter.
4. A real pipeline: backup → commit → assert → report → remediate.

🧠 Warm-up — 3 questions, no score

Just notice which ones make you pause. We answer all three inside the lesson.

1. A switch was changed by hand at 2 a.m. with no ticket. Three weeks later it fails. What does the team most lack?

Answered in The drift problem.

2. Your nightly backup commits to Git every morning even though nothing changed on the device. What is the usual culprit?

Answered in Backup playbooks.

3. A compliance play finds telnet still enabled on one router. In report-only mode, what should the play do?

Answered in Compliance checks.

Most engineers think…

Most engineers think "config backup" is a solved, boring problem — just copy run start on each box, or let the device save to a TFTP server, and you are done. So they treat compliance as a once-a-year spreadsheet the auditor emails around.

Wrong — and that gap is where outages and audit findings live. A local copy run start gives you no history, no diff, and no central record; when the Pune router breaks you cannot see what changed and when. The real move is a nightly Ansible job that pulls every config, versions it in Git (so every change is a dated, attributable diff), and then asserts the running state against a written policy so the play fails the host that drifted. Backup answers "what changed"; compliance answers "is it still allowed" — and both run unattended every night, not once a year.

① The drift + backup problem — the midnight change nobody recorded

Picture Rahul, an L1 network engineer at Infosys, on a Monday morning. A core switch at the Pune campus stopped passing a VLAN over the weekend. Someone — nobody is owning up — logged in around 2 a.m. on Saturday and "fixed" something by hand on the CLI. There is no ticket, no email, no record of what they typed. Rahul opens the switch and sees the current config, but he has nothing to compare it against. He cannot answer the only question that matters: what changed, and what did it look like before?

This is configuration drift, and it is the quiet killer of network ops. Every undocumented hand-edit — a quick no shutdown, a temporary ACL line "just for testing", an SNMP community string added for a vendor — pushes the live running-config a little further from what your design documents say it should be. Each one is harmless alone. Together they mean that on the day of an outage, nobody can roll back to a known-good state, because nobody recorded one.

The second half of the same problem is compliance. Your security policy says: no telnet, SNMPv3 only (no clear-text SNMPv1/v2c community strings), an approved NTP server set, and AAA configured for login. But who checks that on all 200 devices, every night? In most shops the honest answer is "nobody, until the auditor asks" — and by then the drift has been live for months.

👉 So far: drift = un-recorded changes, and the result is no rollback point + silent policy violations. Next: how Ansible turns this into a nightly camera that records every config and flags every drift.

Here is the shift. Ansible can log into every device on a schedule, pull the full running-config, and save it to a timestamped file — that is the backup. Commit each night's pull into Git and you suddenly have a full, dated change history with diffs. Then, in the same nightly run, Ansible reads the running state and asserts it against your written policy — and the play fails the host that drifted. Backup answers "what changed"; compliance answers "is it still allowed". Both, unattended, every night.

Figure 1 — Drift world vs versioned-and-audited world
Same switch, two worlds: blind drift vs versioned + audited.
Drift world (hand-edits, no record): a 2 a.m. CLI change with no ticket and no email; the only copy is the live running-config; no diff, so nobody can see what changed or when; outage day has no known-good rollback point; telnet/SNMPv2 sneak back in unnoticed; compliance is a once-a-year spreadsheet. Outage: "what was it before?" → nobody knows.
Ansible world (nightly backup + audit): a scheduled job pulls the config every night; each pull is committed to Git as a dated diff; git diff shows exactly what changed; any backup is a known-good rollback point; an assert play fails the host that breaks policy; a compliance report arrives every morning, automatically. Outage: git diff → "this line, added Saturday" → revert.
Read both columns for the same device. Left (red) = hand-edits with no record, no rollback, no audit. Right (green) = nightly Ansible pull → Git diff history → assert against policy that fails the drifter.

The four faces of the drift problem

These are the four pains every "we need config backup and compliance" project starts from.

🕵️ No record of change: A hand-edit with no ticket means the only record is the live config. So: you can never prove what changed or when.

No rollback point: Without a saved known-good config, an outage has nothing to revert to. So: recovery becomes guesswork at 2 a.m.

🚪 Policy creeps back: Telnet, SNMPv2 community strings and weak logins sneak in over time. So: your hardened baseline quietly rots.

📋 Audit by spreadsheet: Compliance checked once a year, by hand, on a sample. So: most drift is live for months before anyone notices.

Daily-life analogy — the society gate-pass register

A backup playbook is your apartment society's gate-pass register. Every visitor who enters is written down with a timestamp and who let them in. Months later, when something goes missing, the secretary opens the register and sees exactly who came, when, and on whose authority. A device with no config history is a society with no register — anyone walked in, nobody wrote it down, and now you are arguing about what happened with zero evidence. Ansible + Git is the register, written automatically every night.

Quick check · Q1 of 10

Sneha at TCS says: "We run copy run start on every router after changes, so we already have backups." Why is that NOT enough for drift detection and rollback?

Correct: b. copy run start just overwrites the on-box startup-config — there is no dated history kept off the device and no way to diff today against last week, so you still cannot answer "what changed and when" or roll back to a specific earlier state. Speed is irrelevant to the gap; Ansible can absolutely read both configs; and the limitation is about history, not vendor.

Pause & Predict

Predict: if you only run config backups but never run a compliance check, what class of problem stays completely invisible to you? Type your guess.

Answer: Backups alone tell you the config changed, but not whether the change broke policy. So a device could be faithfully backed up every night while still running telnet, a clear-text SNMPv2 community, or no AAA — all perfectly recorded, none flagged. Backup answers "what is it / what changed"; only a compliance assert answers "is this allowed". You need both: the camera AND the auditor.

② Backup playbooks — ios_config backup, timestamped files, committed to Git

The fastest backup uses the cisco.ios.ios_config module with one switch: backup: true. When set, Ansible logs into each device, grabs the running-config, and writes it to a file on the control node. By default that file lands in a backup/ folder next to your playbook, named <hostname>_config.<date>@<time>. That default naming is already timestamped — but in real life you control it with backup_options.

backup_options takes two sub-options: dir_path (where the file goes) and filename (what it is called). This is how you organise backups by hostname and date so the folder stays sane across 200 devices. A common pattern: one directory per device, file named with a date stamp — so backups/BR-Mumbai-rtr01/2026-06-11.cfg sits next to yesterday's file and git diff between them is meaningful.

backup.yml — pull every device's running-config into a per-host, dated file
---
- name: Nightly config backup
  hosts: ios_devices
  gather_facts: false
  vars:
    backup_root: "/home/netauto/backups"
  tasks:
    - name: Pull running-config and save a timestamped copy
      cisco.ios.ios_config:
        backup: true
        backup_options:
          dir_path: "{{ backup_root }}/{{ inventory_hostname }}"
          filename: "{{ lookup('pipe','date +%Y-%m-%d') }}.cfg"
      register: backup_result

    - name: Show where each backup landed
      ansible.builtin.debug:
        var: backup_result.backup_path
Expected output
PLAY [Nightly config backup] ***************************************
TASK [Pull running-config and save a timestamped copy] *************
ok: [BR-Mumbai-rtr01]
ok: [BR-Pune-rtr01]
TASK [Show where each backup landed] *******************************
ok: [BR-Mumbai-rtr01] => { "backup_result.backup_path": "/home/netauto/backups/BR-Mumbai-rtr01/2026-06-11.cfg" }
PLAY RECAP *********************************************************
BR-Mumbai-rtr01 : ok=2  changed=0  unreachable=0  failed=0

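The changed=0 in the recap reflects that pulling a backup never modifies the device, so a read-only backup run is safe to schedule. Depending on the collection version, the task may instead report changed when it writes a new backup file on the control node; either way, nothing is pushed to the device. If you do not use a Cisco IOS device, the same idea works with *_command modules: run show running-config, then write the captured output to a file with ansible.builtin.copy and its content option, delegated to the control node. The *_config backup option is just the tidy shortcut where it exists.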

👉 So far: ios_config backup + backup_options gives a per-host, dated file. Next: why committing those files to Git is what turns "a pile of files" into a real change history.

A folder of dated files is useful; a Git repository of them is powerful. After the backup task, run a few shell tasks or a small script to git add, git commit with a dated message, and push. (The ansible.builtin.git module clones and checks out repositories; it does not commit, so it does not help here.) Now every night's configs are a commit. When the Pune router breaks, you run git log on its folder, find the first commit after Saturday's 2 a.m. change, run git diff, and see the exact line someone added, dated to a single night. The commit author is the backup job, not the engineer, so for "who typed it" pair the diff with AAA accounting or syslog. That is the difference between "something changed" and "this line changed, on this night".

commit-backups.sh — version the night's pull (run after backup.yml)
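#!/usr/bin/env bash
# Commit tonight's pull; carry on quietly when nothing changed.
set -euo pipefail
cd /home/netauto/backups
git add -A
git commit -m "Nightly config backup $(date +%F)" || echo "No changes to commit"
git push origin main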
Expected output
[main 7c4e1a9] Nightly config backup 2026-06-11
 2 files changed, 11 insertions(+), 3 deletions(-)
 rewrite BR-Pune-rtr01/2026-06-11.cfg (78%)
To github.com:infosys-netauto/network-backups.git
   3b21f0c..7c4e1a9  main -> main
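
With one file per day, as backup.yml writes them, yesterday and today are two different paths. One way to compare them is git diff --no-index, which diffs any two files even outside a repository. The sketch below assumes the per-host, date-named layout from this lesson; the function name diff_days and the BACKUP_ROOT variable are made up for illustration.

```shell
# Compare two dated backups of one host.
# BACKUP_ROOT mirrors the playbook's backup_root; both names are illustrative.
diff_days() {
  host=$1 day1=$2 day2=$3
  root=${BACKUP_ROOT:-/home/netauto/backups}
  # --no-index compares two plain files; exit status is 1 when they differ.
  git diff --no-index -- "$root/$host/$day1.cfg" "$root/$host/$day2.cfg"
}

# Example:
#   diff_days BR-Pune-rtr01 2026-06-10 2026-06-11
```

A non-zero exit status here simply means "the two days differ", so it can double as a cheap drift signal in a script.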
Figure 2 — How a backup becomes a diffable history
From device → dated file → Git commit → a diff you can act on. (1) The control node runs ansible-playbook backup.yml and opens SSH to BR-Mumbai-rtr01 and BR-Pune-rtr01. (2) ios_config backup writes the dated files BR-Mumbai-rtr01/2026-06-11.cfg and BR-Pune-rtr01/2026-06-11.cfg. (3) git add + commit "Backup 2026-06-11", so each night is one dated commit. (4) git diff 2026-06-10 → 2026-06-11 shows the drift, caught: "- transport input ssh" became "+ transport input telnet ssh". Key insight: the diff, not the file, is what makes a backup actionable.
Follow the arrows: control node connects to each device (1), ios_config pulls the running-config to a dated file (2), git commit versions it (3), and git diff later shows the one line that drifted (4).
Common mistake — "every nightly run commits a change, even when nothing changed"

Symptom: git log shows a fresh commit every single morning for a device nobody touched, and your "what changed" diffs are full of noise. Cause: the running-config contains a line that changes on its own — a timestamp, an uptime counter, an ntp clock-period value, or certificate/cron metadata. Each backup looks "different", so Git commits it. Fix: strip the volatile lines before diffing/committing (filter out timestamp/clock lines), or back up with a normalised view. This is the #1 false-drift source in real pipelines — a stray line makes everything look changed.
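
A minimal sketch of the "strip the volatile lines" fix, as a filter you could run on each pulled file before git add. The patterns are assumptions based on lines commonly seen at the top of IOS show running-config output; check them against your own platforms before relying on them. The helper name strip_volatile is hypothetical.

```shell
# Drop lines that change on their own, so unchanged devices produce no diff.
# Reads stdin, or the files given as arguments, and writes to stdout.
strip_volatile() {
  grep -v -E \
    -e '^Building configuration' \
    -e '^Current configuration :' \
    -e '^! (Last configuration change|NVRAM config last updated)' \
    -e '^ntp clock-period ' \
    "$@"
}

# Example (paths are illustrative):
#   strip_volatile raw/BR-Pune-rtr01.cfg > backups/BR-Pune-rtr01/2026-06-11.cfg
```

Keep the pattern list short and reviewed: every line you filter out is a line whose changes you will never see in a diff.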

Pause & Predict

Predict: you back up with backup: true but your saved .cfg files contain SNMP community strings and TACACS keys in clear text, and you push them to a shared Git repo. What new risk did you just create? Type your guess.

Answer: You turned a config backup into a secrets leak. Running-configs embed sensitive material — SNMP communities, shared keys, sometimes weakly-reversible passwords — and pushing them to a shared (or worse, public) repo exposes them to anyone with read access, permanently in history. Fix: keep the backup repo private + access-controlled, encrypt at rest, scrub or mask secret lines before commit, and never let device-pulled configs land in a repo that is broader than the secrets inside them.
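
One way to "scrub or mask secret lines before commit" is a small sed filter between the pull and the git add. This is a sketch under assumptions: the three patterns cover only a few common IOS secret-bearing lines, not every one, and the helper name mask_secrets is hypothetical.

```shell
# Replace secret values with a placeholder, keeping the rest of the line.
# Reads stdin, or the files given as arguments, and writes to stdout.
mask_secrets() {
  sed -E \
    -e 's/^(snmp-server community )[^ ]+/\1<redacted>/' \
    -e 's/^(enable secret [0-9]+ ).+/\1<redacted>/' \
    -e 's/^( +key [0-9]+ ).+/\1<redacted>/' \
    "$@"
}

# Example (paths are illustrative):
#   mask_secrets raw/BR-Pune-rtr01.cfg > backups/BR-Pune-rtr01/2026-06-11.cfg
```

The trade-off is real: a masked file can no longer be pasted back as a full restore, and a changed secret no longer shows up in the diff. Some teams keep an unmasked, encrypted copy for restores and commit only the masked one.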
Quick check · Q2 of 10

Aditya wants each device's backups in their own folder, named by date, so git diff between days is clean. Which ios_config setting does that?

Correct: c. backup_options lets you set dir_path (a per-host directory) and filename (a date-stamped name), which is exactly how you organise per-device, per-day backups for clean diffs. diff_against is for drift comparison, not file layout; gather_facts does not save configs; running_config is an input you supply, not a backup target.

③ Compliance checking — read the running state, assert the policy, fail the drifter

Backup is the camera. Compliance is the auditor. The pattern is always the same three moves: (1) read the running state, (2) assert it against a written policy, (3) fail the host that violates it. To read state you use either ios_facts (structured facts under ansible_net_* keys) or ios_command (raw show output you grep with filters).

The enforcer is the assert module. You hand it a list of conditions under that: and a fail_msg. If every condition is true, the host passes; if any is false, that host fails the play and your message names the violation. So your policy — no telnet, SNMPv3 only, NTP set, AAA configured — becomes a list of assert conditions, and the play recap turns red for exactly the devices that drifted.

compliance.yml — read running-config, assert the security policy, fail the drifter
---
- name: Network security compliance check
  hosts: ios_devices
  gather_facts: false
  tasks:
    - name: Grab the full running-config
      cisco.ios.ios_command:
        commands: ["show running-config"]
      register: rc

    - name: Assert the device meets policy
      ansible.builtin.assert:
        that:
          - "'transport input telnet' not in rc.stdout[0]"     # no telnet
          - "'snmp-server community' not in rc.stdout[0]"      # SNMPv3 only, no v1/v2c
          - "'ntp server 10.10.0.10' in rc.stdout[0]"          # approved NTP set
          - "'aaa new-model' in rc.stdout[0]"                  # AAA configured
        fail_msg: "NON-COMPLIANT: telnet/SNMPv2/NTP/AAA policy violated"
        success_msg: "Compliant"
Expected output
TASK [Assert the device meets policy] ******************************
ok: [BR-Mumbai-rtr01] => { "msg": "Compliant" }
fatal: [BR-Pune-rtr01]: FAILED! => {"assertion": "'transport input telnet' not in rc.stdout[0]",
        "evaluated_to": false, "msg": "NON-COMPLIANT: telnet/SNMPv2/NTP/AAA policy violated"}
PLAY RECAP *********************************************************
BR-Mumbai-rtr01 : ok=2  changed=0  failed=0
BR-Pune-rtr01   : ok=1  changed=0  failed=1

Read the recap: Mumbai passed, Pune failed=1 — and the failing assertion text tells you which condition broke. That is the whole point: a single play run, across every device, produces a precise red/green list. No human eyeballing 200 configs.

👉 So far: read with ios_command/ios_facts, assert against policy, fail the drifter. Next: the big choice that decides whether this play belongs in a nightly cron or a change window — report-only vs remediate.

Now the design decision that trips up every beginner: report-or-remediate. A report-only play asserts and fails but touches nothing — safe to run nightly, unattended, against production. A remediate play goes further: when it finds telnet, it removes it with ios_config. Remediation is powerful but it changes production, so you gate it behind a change window, --check dry-runs first, and human approval. The classic newbie error is wiring auto-remediation into the nightly cron and discovering, at 3 a.m., that the play "fixed" a temporary change an engineer needed.

Figure 3 — Report-only vs remediate — the decision that keeps you employed
Detect drift → then choose: report, or remediate? Read state + assert (ios_command / ios_facts) → drift found? Report-only mode ("just tell me"): the assert fails the host and writes a report; it changes NOTHING on the device; it is safe to schedule nightly on production. This is the camera + auditor, no surgery. Remediate mode ("fix it for me"): pushes ios_config to fix the drift; it CHANGES production, so it belongs in a change window with a --check dry-run and approval first; never wire this into the nightly cron. Nightly = report-only. Remediation is a deliberate, gated act.
The fork in the middle is the whole lesson: detect → report (safe, nightly, changes nothing) vs detect → push fix (changes production, change-window only). Pick deliberately.

Watch one compliance run decide pass vs fail

A nightly compliance play hits two devices. Follow how it reads state, tests each policy line, and turns the recap red for exactly the drifter.

① Read: the play runs show running-config on both hosts; output is stored in rc.stdout[0]
② Assert host A (BR-Mumbai): no telnet, no community, NTP 10.10.0.10 set, AAA on → all true
③ Assert host B (BR-Pune): finds transport input telnet → condition false
④ Recap: Mumbai ok, Pune failed=1 with the exact violating assertion named

This is the terminal you will actually watch, a compliance run at ram@netauto:~/playbooks (recreated for clarity):
$ ansible-playbook -i inventory.ini compliance.yml
TASK [Assert the device meets policy]
BR-Mumbai-rtr01: ok → "Compliant"
BR-Pune-rtr01: FAILED → "NON-COMPLIANT: telnet ..."
PLAY RECAP: Pune failed=1, Mumbai failed=0
Prove the check actually tests what you think

A compliance play that always passes is worse than none — it gives false confidence. Before trusting it, test it against a known-bad device: deliberately add transport input telnet to a lab router and run the play. If the host does not turn red, your assert condition is wrong (loose match, wrong index, or testing the saved file instead of the live state). A good check must fail when it should — verify the failure path, not just the happy path.
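
The "verify the failure path" habit can be scripted as a wrapper that succeeds only when the wrapped command fails. A minimal sketch; the helper name expect_failure and the lab host lab-rtr01 are hypothetical.

```shell
# Succeeds only when the wrapped command FAILS, so a check that cannot
# go red is itself reported as broken.
expect_failure() {
  if "$@" >/dev/null 2>&1; then
    echo "BROKEN CHECK: '$*' passed against a known-bad target" >&2
    return 1
  fi
  echo "check fails as expected"
}

# Example (lab-rtr01 is a lab router with telnet deliberately enabled):
#   expect_failure ansible-playbook -i inventory.ini compliance.yml --limit lab-rtr01
```

A play can also fail for unrelated reasons, such as an unreachable host, so on the first run read the output once and confirm the failure names the telnet assertion.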

Priya at ICICI faces this

Priya, an L1 analyst, gets the morning compliance report: BR-Pune-rtr01 is flagged NON-COMPLIANT, assertion "'transport input telnet' not in rc.stdout[0]" evaluated to false.

Likely cause

Someone re-enabled telnet on the Pune router's VTY lines during a weekend troubleshooting session and never backed it out. The running-config now has "transport input telnet ssh", so the no-telnet assert fails for that host only.

Diagnosis

She confirms it is real drift (not a false match) by reading the actual line, and cross-checks the Git backup to see when it appeared.

git log -p backups/BR-Pune-rtr01/ → finds the line added in the 2026-06-08 commit; ansible-playbook compliance.yml --limit BR-Pune-rtr01 to reproduce
Fix

In a scheduled change window she runs the remediate play (or hand-edits): ios_config sets "transport input ssh" on the VTY lines, removing telnet; she does a --check dry-run first.

Verify

Re-run compliance.yml --limit BR-Pune-rtr01 → host now ok="Compliant", failed=0; the next nightly Git backup shows the telnet line removed.

Quick check · Q3 of 10

Karthik at HCL writes a nightly play that detects drift AND auto-pushes ios_config fixes, then schedules it in cron. What is the dangerous flaw?

Correct: c. Remediation changes the device. Running it unattended every night means the play can silently undo a legitimate temporary change (or "fix" something mid-incident) with nobody watching — so nightly jobs report, and remediation goes in a gated change window with dry-runs and approval. ios_config runs fine from cron, can coexist with assert, and Ansible runs from cron normally; the issue is unattended change, not mechanics.

Pause & Predict

Predict: ios_facts can return parsed facts under ansible_net_* keys. Why might you still prefer reading raw text with ios_command for a "no telnet" check? Type your guess.

Answer: Because ios_facts exposes a fixed set of parsed facts (version, interfaces, neighbors, and resource-module sections) — it may not surface every arbitrary line you care about, like the exact VTY transport input statement. ios_command with show running-config returns the raw config text, so you can assert on any literal line. Facts are cleaner and structured when the data you need is in them; raw command output is the catch-all when your policy targets a specific config line that facts do not model.

④ A real pipeline — backup → commit → assert → report, end to end

Now we stitch the pieces into one nightly pipeline. The shape every shop converges on: (1) backup every device → (2) git commit the pull → (3) compliance assert against policy → (4) report the red/green result to the team (email, Slack, or an artifact). Steps 1–2 give you history; steps 3–4 give you the audit. Run it from cron tonight, and tomorrow you have both — for every device, automatically.
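
The "run it from cron" step can be sketched as a small wrapper that runs each stage and prints one status line per stage for the morning report. File names follow this lesson; PLAYBOOK_DIR, the function names, the nightly.sh file and the log path in the crontab comment are assumptions.

```shell
# Run one pipeline stage and print a one-line status.
run_stage() {
  name=$1; shift
  if "$@"; then
    echo "$name: ok"
  else
    status=$?
    echo "$name: FAILED (exit $status)"
    return "$status"
  fi
}

# Hypothetical nightly wrapper: backup, commit, then the compliance assert.
nightly() {
  cd "${PLAYBOOK_DIR:-$HOME/playbooks}" || return 1
  run_stage backup     ansible-playbook -i inventory.ini backup.yml || return 1
  run_stage commit     bash commit-backups.sh                       || return 1
  run_stage compliance ansible-playbook -i inventory.ini compliance.yml
}

# Example crontab entry (01:30 every night), for a nightly.sh that defines
# the functions above and ends by calling nightly:
#   30 1 * * * /home/netauto/playbooks/nightly.sh >> /var/log/netauto-nightly.log 2>&1
```

Stopping at the first failed stage is one design choice among several. Some teams prefer to commit whatever was pulled and still run the compliance stage, so that one unreachable device does not hide the report for the rest.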

A worked example — "detect a device still running telnet and remediate it". The nightly report flags BR-Pune-rtr01 (telnet found). That is the detect. The next morning, in the approved change window, an engineer runs the remediate play with a --check dry-run, eyeballs the diff, then applies it. The play sets transport input ssh, telnet is gone, and that evening's backup + compliance run both come back green. Detect at night, remediate by day, prove it with the next backup.

remediate-telnet.yml — close telnet on the VTY lines (run in a change window, --check first)
---
- name: Remediate telnet on flagged hosts
  hosts: "{{ target | default('ios_devices') }}"
  gather_facts: false
  tasks:
    - name: Force SSH-only on VTY lines (no telnet)
      cisco.ios.ios_config:
        lines:
          - transport input ssh
        parents: line vty 0 15
      register: fix

    - name: Report what changed
      ansible.builtin.debug:
        msg: "{{ 'remediated' if fix.changed else 'already compliant' }}"
Expected output
# dry run first:  ansible-playbook remediate-telnet.yml -e target=BR-Pune-rtr01 --check --diff
--- before
+++ after
@@ line vty 0 15 @@
- transport input telnet ssh
+ transport input ssh
changed: [BR-Pune-rtr01]
TASK [Report what changed] => { "msg": "remediated" }

There is a slicker, declarative way to detect drift too: ios_config can take diff_against: intended with an intended_config baseline, and Ansible reports the difference between the device and your golden config. Its sibling choices are diff_against: running (before/after of your change) and diff_against: startup (running vs startup). For a "does this match my approved baseline" check, intended is the one that maps to drift detection.

👉 So far: the four-stage pipeline plus a worked detect-and-remediate. Next: the three real gotchas that bite production pipelines, then your cheat-sheet.

Three gotchas that bite real pipelines.

One: secrets in saved configs. Running-configs carry SNMP communities and TACACS keys; if your Vault password or those configs leak, you have a breach. This is not theoretical: CVE-2024-8775 covers Ansible Vault secrets appearing in plaintext in playbook output when vaulted variables are loaded without no_log, and CVE-2024-0690 covers ansible-core failing to respect the ANSIBLE_NO_LOG setting in some scenarios such as loop items. So put no_log: true on any task touching secrets and keep backup repos private.

Two: huge diffs from timestamps. Strip volatile clock/timestamp lines or every backup looks changed.

Three: false drift from loose matches. Assert on the real config line, not a substring, or harmless text triggers a red.
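
To make the loose-match gotcha concrete, here is a sketch of a line-anchored test you could run against a saved or captured running-config. It is an illustration, not the lesson's assert: the regular expression is an assumption about how VTY transport lines are written, and the helper name allows_telnet is hypothetical. A plain substring test for "transport input telnet" would miss a line such as transport input ssh telnet, and would fire on a description that merely quotes the phrase.

```shell
# Returns 0 (true) when a real "transport input" line permits telnet.
# Anchored to the start of the line, so text inside descriptions or
# banners does not count. "all" is treated as permitting telnet.
allows_telnet() {
  grep -E -q '^[[:space:]]*transport input( .*)? (telnet|all)( .*)?$' "$@"
}

# Example (path is illustrative):
#   if allows_telnet backups/BR-Pune-rtr01/2026-06-11.cfg; then
#     echo "NON-COMPLIANT: telnet permitted"
#   fi
```

The same idea carries over to the playbook: test for the whole line with a multiline regular expression rather than a bare substring.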

Terminal recreation — a full nightly run: backup → commit → compliance
ram@netauto:~/playbooks$ ansible-playbook -i inventory.ini backup.yml && \
  bash commit-backups.sh && \
  ansible-playbook -i inventory.ini compliance.yml
Expected output
PLAY RECAP (backup.yml) *** BR-Mumbai-rtr01: ok=2 changed=0   BR-Pune-rtr01: ok=2 changed=0
[main 7c4e1a9] Nightly config backup 2026-06-11 | 2 files changed, 11 insertions(+), 3 deletions(-)
TASK [Assert the device meets policy] ***
ok: [BR-Mumbai-rtr01] => "Compliant"
fatal: [BR-Pune-rtr01]: FAILED! => "NON-COMPLIANT: telnet/SNMPv2/NTP/AAA policy violated"
PLAY RECAP (compliance.yml) *** BR-Mumbai-rtr01: failed=0   BR-Pune-rtr01: failed=1

On the RHCE EX294 blueprint, this lesson lives where it counts: writing playbooks with tasks, variables and conditionals, using modules correctly, and — directly relevant here — protecting credentials with Ansible Vault. The 2026 exam runs on ansible-navigator with execution environments. The backup/assert pattern you just learned is the same task-and-module muscle the exam tests; the Vault + no_log discipline is the security objective. Career-wise, "we automate config backup and nightly compliance" is one of the most common first real jobs handed to a junior on a network automation desk.

Figure 4 — Ansible config backup & compliance — the cheat-sheet
Config backup & compliance: your one-glance card
Backup module: ios_config with backup: true; backup_options: dir_path / filename; default file: <hostname>_config.<date>@<time>
Read + assert: ios_command "show running-config"; ios_facts → ansible_net_*; assert that: + fail_msg → fails the host
Nightly pipeline: 1 backup → 2 git commit → 3 assert → 4 report; history + audit, every night
diff_against: running · startup · intended; intended + intended_config = drift vs your golden baseline
Assert policy lines: no telnet · SNMPv3 only · NTP set · aaa new-model; match the real line, not a substring
Report vs remediate: report-only = safe nightly; remediate = change window only; --check dry-run + approval first
3 gotchas: secrets in saved configs; timestamp lines → false diffs; loose match → false drift
Secrets / CVEs: Vault + no_log: true on secret tasks; CVE-2024-8775 vault plaintext leak; CVE-2024-0690 no_log skipped in loops
First commands: ansible-playbook backup.yml; git diff <day1> <day2>; ansible-playbook compliance.yml
Your one-card map of this lesson — the modules, the nightly pipeline, the diff_against choices, the assert policy lines, the gotchas and the first commands you will run. Keep it open in week one.
Daily-life analogy — the dabbawala daily check + the bank passbook

The whole pipeline is two familiar habits stacked. The Git backup is your bank passbook: every entry is dated and you can flip back to any day and see the exact balance (config) and what moved (the diff). The nightly compliance assert is the dabbawala's end-of-day tally: every tiffin (device) is checked against the manifest, and the one that does not match is flagged loudly — not all 200, just the drifter. Passbook for history, daily tally for "is everything still right". Together they are config backup + compliance.

Next: run this pipeline from AWX / Automation Controller
Prove you own this lesson

Cold, in 30 seconds: name the four pipeline stages (backup → git commit → assert → report); say which ios_config option pulls the config (backup: true + backup_options) and which detects drift against a baseline (diff_against: intended); state the four assert policy lines (no telnet, SNMPv3 only, NTP set, AAA on); and explain why the nightly job is report-only while remediation needs a change window. If you can do that without notes, you are ready for AWX and for the Vault objectives on EX294.

Quick check · Q4 of 10

An interviewer asks Meera: "Walk me through your nightly network-config pipeline and the single biggest safety rule in it." Best answer?

Correct: a. The pipeline is backup → commit → assert → report, and the load-bearing safety rule is that the unattended nightly job only reports — it never changes production. Remediation is deliberate, gated and dry-run-first. copy run start gives no history; hourly auto-remediation is exactly the unattended-change danger to avoid; and "guess which changed" defeats the point of a complete nightly record.

📝 Wrap-up assessment — six more

You've answered 4 inline. Six left. 70% (7 of 10) marks the lesson complete on your profile. Tap Submit all answers at the end.

Q5 · Remember

In the cisco.ios.ios_config module, which option pulls the device's running-config to a file on the control node?

Correct: c. backup: true tells ios_config to fetch the running-config and write it to a file (default backup/hostname_config.date@time).
Q6 · Apply

An Airtel branch router must be checked nightly for "no telnet" without changing anything on the device. Which approach fits?

Correct: b. Reading with ios_command and testing with assert detects and reports drift while changing nothing (changed=0), which is exactly what an unattended nightly check needs. Pushing ios_config or rebooting changes production. diff_against: startup only compares configs: it does not "push" anything, and it is not a no-telnet test.
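
The read-then-assert pattern can be sketched for all four policy lines at once. This is a hedged sketch: the branch_routers group is hypothetical, and the regular expressions are illustrative assumptions about how the policy appears in your configs (for example, a platform that allows telnet by default with no explicit transport input line would need a different test).

```yaml
---
# compliance.yml: read-only; ios_command and assert push nothing to the device.
- name: Nightly policy check (report-only)
  hosts: branch_routers         # hypothetical inventory group
  gather_facts: false

  tasks:
    - name: Read the running-config
      cisco.ios.ios_command:
        commands:
          - show running-config
      register: run_cfg

    - name: Assert the policy lines
      ansible.builtin.assert:
        that:
          # no vty line may accept telnet
          - "run_cfg.stdout[0] is not search('^ *transport input (all|.*telnet)', multiline=true)"
          # AAA must be enabled
          - "run_cfg.stdout[0] is search('^aaa new-model', multiline=true)"
          # at least one NTP server must be configured
          - "run_cfg.stdout[0] is search('^ntp server ', multiline=true)"
          # SNMPv3 only: no v1/v2c community strings
          - "run_cfg.stdout[0] is not search('^snmp-server community ', multiline=true)"
        fail_msg: "{{ inventory_hostname }} is out of policy"
        success_msg: "{{ inventory_hostname }} meets all four policy lines"
```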
Q7 · Apply

You want each device's backups in their own dated file so git diff between two days is clean. Which ios_config configuration achieves it?

Correct: d. backup_options.dir_path (per-host directory) plus a date-stamped filename gives one tidy file per device per day, so diffs are clean. The bare default dumps everything into one backup/ folder; diff_against and ios_facts are about comparison and fact-gathering, not file organisation.
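
A per-host directory with a date-stamped filename might look like the sketch below. The ios_devices group, the backups/ directory, and the .cfg naming are assumptions for illustration; the date is generated on the control node with Ansible's now() function, so no fact gathering is needed.

```yaml
---
# backup.yml: one dated file per device under backups/<hostname>/.
- name: Back up running-config per host, per day
  hosts: ios_devices            # hypothetical inventory group
  gather_facts: false

  tasks:
    - name: Pull running-config to the control node
      cisco.ios.ios_config:
        backup: true
        backup_options:
          dir_path: "backups/{{ inventory_hostname }}"
          filename: "{{ inventory_hostname }}_{{ now(fmt='%Y-%m-%d') }}.cfg"
```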
Q8 · Analyze

A device that nobody touched produces a fresh Git commit every single night, and the diffs are noisy. Most likely root cause?

Correct: a. A self-changing line — a timestamp, clock-period or uptime counter — makes every pulled config technically different, so Git commits it and the diff is full of noise. The fix is to strip volatile lines before committing. assert failing would show in compliance output, not cause backup commits; Git is not corrupting anything; and a backup play does not reboot devices.
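
Stripping volatile lines before the commit can be sketched as follows. The line patterns are examples of self-changing lines, not a complete list, and the ios_devices group is hypothetical. The sketch also assumes the playbook directory is a Git repository and uses one stable filename per device so that Git can see "no change"; with dated filenames every night adds a new file regardless.

```yaml
---
# backup_commit.yml: clean each backup, then commit only real changes.
- name: Back up, strip volatile lines, commit
  hosts: ios_devices            # hypothetical inventory group
  gather_facts: false

  tasks:
    - name: Pull running-config to the control node
      cisco.ios.ios_config:
        backup: true
        backup_options:
          dir_path: "{{ playbook_dir }}/backups"
          filename: "{{ inventory_hostname }}.cfg"   # stable name (assumption)
      register: backup

    - name: Remove lines that change on their own
      ansible.builtin.lineinfile:
        path: "{{ backup.backup_path }}"
        regexp: '^(! Last configuration change|! NVRAM config last updated|ntp clock-period)'
        state: absent
      delegate_to: localhost

    - name: Stage the cleaned backups
      ansible.builtin.command:
        cmd: git add backups
        chdir: "{{ playbook_dir }}"
      delegate_to: localhost
      run_once: true
      changed_when: false

    - name: Check whether anything is staged (rc 1 means yes)
      ansible.builtin.command:
        cmd: git diff --cached --quiet
        chdir: "{{ playbook_dir }}"
      delegate_to: localhost
      run_once: true
      register: staged
      changed_when: false
      failed_when: staged.rc not in [0, 1]

    - name: Commit only when a config really changed
      ansible.builtin.command:
        cmd: git commit -m "Nightly config backup"
        chdir: "{{ playbook_dir }}"
      delegate_to: localhost
      run_once: true
      when: staged.rc == 1
```

With the volatile lines gone, an untouched device produces an identical file, nothing is staged, and the commit task is skipped.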
Q9 · Analyze

A "no telnet" compliance play wrongly fails a device whose config has the comment "! telnet disabled for audit". What is the flaw and the fix?

Correct: c. The assert is matching the bare word "telnet", which also appears in a harmless comment, so it false-fails — a classic loose-match false drift. Asserting on the actual line ("transport input telnet") fixes it. The device is compliant (not running telnet), ios_command reads the whole config including comments, and the backup freshness is irrelevant to a live compliance read.
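
The difference between a loose and an anchored match can be tried on localhost without touching a device. This is an illustrative sketch: sample_config is made-up text that mirrors the question, and the anchored pattern is one assumed way to express "a vty line accepts telnet".

```yaml
---
# loose_vs_anchored.yml: runs on the control node only.
- name: Loose versus anchored match on a sample config
  hosts: localhost
  gather_facts: false
  vars:
    sample_config: |
      ! telnet disabled for audit
      line vty 0 4
       transport input ssh

  tasks:
    - name: Show what a bare substring test sees
      ansible.builtin.debug:
        msg: "loose match finds the word telnet: {{ 'telnet' in sample_config }}"

    - name: Anchored test checks the real transport line
      ansible.builtin.assert:
        that:
          - "sample_config is not search('^ *transport input (all|.*telnet)', multiline=true)"
        success_msg: "compliant: no vty line accepts telnet"
        fail_msg: "a vty line still accepts telnet"
```

The substring test trips on the comment, while the anchored test passes because the only transport line is SSH.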
Q10 · Evaluate

Two designs for the nightly job: (A) detect drift AND auto-remediate (push ios_config fixes) every night via cron; (B) detect + report only at night, and remediate separately in a gated change window with a --check dry-run. Which is stronger and why?

Correct: b. B is safer and is the industry norm: the nightly job records and audits without touching production, while remediation — which changes devices — is gated, dry-run-first and approved so it cannot silently revert a legitimate temporary change or act mid-incident. A risks unattended outages; "change windows slow us down" ignores blast radius; and the two designs are not identical because one can cause harm with no human in the loop.
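
Design B's remediation side can be sketched as a separate playbook that is never scheduled. The ios_devices group and the vty range 0 4 are assumptions, and the play presumes SSH access already works, since restricting vty lines to SSH on a device without it would lock you out.

```yaml
---
# remediate.yml: run by hand, inside an approved change window only.
# Dry run first:
#   ansible-playbook remediate.yml --check --diff --limit <host>
- name: Remediate telnet on vty lines
  hosts: ios_devices            # hypothetical inventory group
  gather_facts: false

  tasks:
    - name: Restrict vty lines to SSH
      cisco.ios.ios_config:
        parents: line vty 0 4
        lines:
          - transport input ssh
```

In check mode the task reports what it would change without sending it, which gives the approver something concrete to review before the real run.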

🧠 In your own words

In one line, what is the difference between what a backup playbook proves and what a compliance playbook proves? Type your answer, then compare it to the expert version.

Expert version: A backup playbook proves what the config is and how it changed over time (a versioned record + diff in Git), while a compliance playbook proves whether that running state still satisfies a written policy (telnet off, SNMPv3 only, NTP set, AAA on) by asserting against it and failing the host that drifted — one is the camera, the other is the auditor.

📖 Glossary

Configuration drift
When a device's live config slowly diverges from the approved baseline because of un-tracked manual changes.
running-config
The config a device is running right now, in memory; on IOS you read it with show running-config.
Backup (config)
A point-in-time copy of a device's config saved off-box so you can compare, audit and restore it.
ios_config backup
cisco.ios.ios_config option (backup: true) that pulls the running-config to a file; backup_options sets dir_path and filename.
diff_against
ios_config option to compare the device config against running, startup, or intended (a golden baseline) — drift detection.
intended_config
The golden/approved baseline config you supply so diff_against: intended can show how the device differs from it.
ios_facts
Module that collects structured facts (version, interfaces, parsed sections) from an IOS device under ansible_net_* keys.
ios_command
Module that runs show commands and returns the raw text output for you to search/assert on.
assert module
ansible.builtin.assert — checks a list of conditions; if any is false the host fails the play, with your fail_msg.
Git
Version-control system; each commit is a dated, attributable snapshot, so git diff shows exactly what changed between backups.
Report vs remediate
Report-only detects + flags drift but changes nothing (safe nightly); remediate pushes the fix (change-window only).
Ansible Vault / no_log
Vault encrypts secrets at rest; no_log: true hides them from task output at run time — use both for credentials.

📚 Sources

  1. Ansible Community Documentation — cisco.ios.ios_config module (backup defaults to false; backup_options.dir_path / filename, with default filename hostname_config.date@time).
  2. Ansible Community Documentation — cisco.ios.ios_facts and cisco.ios.ios_command modules (ansible_net_* fact keys; raw show-command output for compliance assertions). docs.ansible.com/ansible/latest/collections/cisco/ios/ios_facts_module.html · docs.ansible.com/ansible/latest/collections/cisco/ios/ios_command_module.html
  3. CellStream — "Two Ansible Network Compliance Examples" + PacketCoders — "Automating Network Config Backups with Ansible and Git" (real ios_command + assert compliance pattern; git commit of timestamped backups; reporting). cellstream.com/2025/06/19/two-ansible-network-compliance-examples · packetcoders.io/automating-network-config-backups-with-ansible-and-git
  4. jwkenney — "Using Ansible to audit configuration drift in a brownfield environment" + ansiblebyexample — "Managing Compliance Drift with Ansible" (false-drift from timestamp/volatile lines, md5sum noise, detect-vs-remediate modes). jwkenney.github.io/auditing-configuration-drift · ansiblebyexample.com/articles/managing-compliance-drift-with-ansible
  5. Red Hat / NIST — Ansible Vault + no_log guidance and CVE-2024-8775 (vaulted secrets exposed in plaintext in playbook output) + CVE-2024-0690 (ANSIBLE_NO_LOG not respected in some loop scenarios). docs.ansible.com/ansible/latest/vault_guide/index.html · nvd.nist.gov/vuln/detail/CVE-2024-8775 · nvd.nist.gov/vuln/detail/CVE-2024-0690
  6. Red Hat EX294 — Red Hat Certified Engineer (RHCE) exam objectives: create/use playbooks with tasks, variables, conditionals and modules; protect sensitive data with Ansible Vault; 2026 toolset ansible-navigator + execution environments. redhat.com/en/services/training/ex294-red-hat-certified-engineer-rhce-exam-red-hat-enterprise-linux

What's next?

You can pull and audit configs from the CLI now — but who runs this nightly, stores the Vault password safely, and shows the whole team a red/green dashboard? That is the control room. Next we move the pipeline into AWX / Automation Controller.