What does it mean that Ansible is 'idempotent'?

Correct: b. Idempotency means a task changes the system only if it is not already in the desired state, so re-running a playbook is safe and predictable — the defining property of Ansible and the most-tested interview concept.

An Indian IT-services team must patch 200 servers but cannot risk an outage. What is the right pre-flight?

Correct: a. --check is a dry run (no changes), --diff shows exactly what would change, --limit canaries one batch, and serial rolls the real run out in controlled waves so a failure stops early — the textbook way to de-risk a large change.

You need a DB password inside a Git-tracked playbook repo. Which is correct?

Correct: c. Ansible Vault encrypts secrets at rest (AES256); you commit only ciphertext and decrypt at runtime via --ask-vault-pass / a vault-id / an AAP Vault credential. Plaintext or base64 in Git is an instant fail.

On the second run a playbook still reports changed=4 on every host. The most likely root cause is…

Correct: d. A truly idempotent play reports changed=0 on the second run. Persistent changes almost always come from command/shell tasks (which can't detect prior state) or missing creates:/changed_when: — switch to a real module or add the guards.

A play fails with 'unreachable=1' on one host while others succeed. Where do you look FIRST?

Correct: b. 'unreachable' is a transport problem, not a task problem — Ansible couldn't establish the SSH/WinRM session. Check ansible_user, the private key, the inventory entry, host-key trust and Python on the target. 'failed' (not 'unreachable') would point at the task.

An interviewer says 'a playbook is basically just a shell script.' The best correction is…

Correct: a. A shell script runs the same imperative commands on every execution; an Ansible playbook declares the desired end state and its modules check current state, changing only what's needed (idempotency). That declarative + idempotent distinction — not the YAML syntax — is the real answer.

Most engineers think…

Most candidates say "Ansible needs an agent" or "a playbook is basically a shell script" — and the interview quietly ends there.

Both fail you. Ansible is agentless — it pushes Python modules over SSH/WinRM with nothing pre-installed on the target — and a playbook is declarative and idempotent, not imperative: running it twice converges to the same state instead of blindly re-running commands. That idempotency is the whole point and the #1 interview theme. This lesson trains the framing that gets you hired.

① Core concepts — agentless, push, idempotent

Ansible interviews open on the model, and the model is the whole exam. Ansible runs from a single control node and configures many managed nodes over SSH (WinRM for Windows). It is agentless — nothing is pre-installed on the targets — and it uses a push model, not a pull/poll model.

Figure 1 — Ansible architecture — one control node pushes to many

Nothing is installed on the targets to run a task — Ansible copies the module, executes it under Python, and cleans up. The green box is the whole point: re-running converges to the same declared state.

The Ansible vocabulary every interview opens with

Know these four cold before anything else.

Control vs managed node: Control node = where Ansible is installed and runs from. Managed nodes = the targets it configures over SSH/WinRM. Only the control node needs Ansible.

Agentless: No software is pre-installed on targets. Ansible copies a Python module, runs it over SSH/WinRM, then deletes it. The target just needs SSH + Python.

Idempotency: Running the same play twice converges to the same state without redoing work. A task reports changed only when it truly alters the system. This is the core of Ansible.

Playbook vs ad-hoc: A playbook is a reusable YAML file of plays/tasks. An ad-hoc command (ansible all -m ping) is a one-off, great for quick checks, not for repeatable config.

The single most-tested idea is idempotency: running the same playbook twice converges to the same state without re-doing changes. The inventory (static or dynamic) tells Ansible which hosts; modules do the real work and report ok/changed/failed.

Figure 3 — Ansible vs Puppet / Chef — why agentless + push wins

The one-liner that wins: Ansible is agentless and push-based — you manage a target the moment SSH works, with no agent to install, certificate to sign, or master server to keep alive.

Interviewers often probe the comparison with Puppet and Chef to test whether you really understand the model — so be ready to contrast agentless/push/YAML against agent/pull/DSL in one breath.

Quick check · Q1 of 10 · Apply

An interviewer asks: "How does Ansible run a task on 50 Linux servers without anything installed on them?" Best answer?

a) It copies a Python module over SSH, runs it, and removes it (agentless)
b) It installs a lightweight agent on each server first
c) Each server polls a central Ansible master every 30 minutes
d) It uses a kernel driver pushed via DHCP

Correct: a. Ansible is agentless and push-based: it opens an SSH connection, copies the relevant Python module to the target, executes it, captures ok/changed/failed, then deletes it. No agent, no daemon, no master to poll.

👉 So far: Ansible = agentless + push + declarative + idempotent. Control node runs it; managed nodes need only SSH/WinRM + Python. Inventory says WHICH hosts; modules do the work; idempotency means re-running converges to the same state.

The 'Ansible needs an agent' trap

Answer firmly: no. Puppet and Chef install an agent and pull from a master; Ansible installs nothing on the target and PUSHES over SSH/WinRM. The only requirements on a Linux target are SSH access and Python; on Windows it is WinRM. Saying 'Ansible needs an agent' is an instant fail.

Interview Q&A — core model questions they actually ask

❓ Q · What is the difference between a module and a plugin?

Modules are the units of work that get copied to and executed ON the managed node — ansible.builtin.dnf, copy, service, etc. Each one runs on the target, does the action, and returns JSON (ok/changed/failed). Plugins are Python pieces that run on the control node and extend Ansible's own behaviour — they never touch the target. The families interviewers expect you to name: lookup plugins (pull data in at templating time, e.g. lookup('file', …), lookup('env', …)), filter plugins (the | default, | to_json, | regex_replace transforms in Jinja2), connection plugins (ssh, winrm, local, docker — how Ansible reaches the host), callback plugins (control on-screen output and logging, e.g. the recap or a Slack notifier), plus inventory, test and become plugins. Trap: "everything is a module" is wrong — say "modules execute on the target, plugins execute on the controller."
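A minimal sketch of that split, assuming an environment variable named APP_ENV and a marker path that are purely illustrative: the env lookup and the default filter run on the controller, and only the copy module executes on the target.

```yaml
---
- name: Modules run on the target, plugins run on the controller
  hosts: all
  gather_facts: false
  vars:
    # Lookup plugin: reads APP_ENV on the CONTROL node (variable name is an assumption).
    # Filter plugin: default('dev', true) falls back to 'dev' when the lookup is empty.
    app_env: "{{ lookup('ansible.builtin.env', 'APP_ENV') | default('dev', true) }}"
  tasks:
    - name: Write a marker file on the managed node (the module executes there)
      ansible.builtin.copy:
        content: "environment={{ app_env }}\n"
        dest: /tmp/app_env.marker      # hypothetical path
        mode: "0644"
```

The template expression is resolved before the module is shipped, so the target only ever sees the final string.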

❓ Q · Static vs dynamic inventory — what is dynamic inventory and when do you use it?

A static inventory is a hand-written INI or YAML file listing hosts and groups — fine when the fleet is small and stable. A dynamic inventory is generated at runtime by an inventory plugin that queries a live source so the host list is never stale. In 2026 the right answer is plugins, not the old executable scripts: amazon.aws.aws_ec2 for AWS, azure.azcollection.azure_rm for Azure, google.cloud.gcp_compute for GCP, plus VMware, OpenStack and constructed. You enable it with a *.aws_ec2.yml config in the inventory path (and enable_plugins in ansible.cfg), then run ansible-inventory -i inventory.aws_ec2.yml --graph to verify. The big win is keyed_groups / compose: hosts auto-group by tag, region or instance state, so a new EC2 box appears in tag_role_web automatically. Use dynamic for cloud/auto-scaling fleets; use static for a fixed lab or bootstrap. Trap: don't say "a Python script that prints JSON" — that's the deprecated style; lead with inventory plugins.
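As a rough sketch of what such a config might look like, assuming a single region and a tag called role (both illustrative choices):

```yaml
# inventory.aws_ec2.yml (the file name must end in aws_ec2.yml or aws_ec2.yaml)
plugin: amazon.aws.aws_ec2
regions:
  - ap-south-1                     # assumed region, for illustration only
filters:
  instance-state-name: running     # list only running instances
keyed_groups:
  # An instance tagged role=web lands in the group tag_role_web
  - key: tags.role
    prefix: tag_role
compose:
  # Connect over the private IP (assumes the control node can reach the VPC)
  ansible_host: private_ip_address
```

Running ansible-inventory -i inventory.aws_ec2.yml --graph then shows the generated groups. The plugin needs AWS credentials and the boto3 library on the control node.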

② Playbooks — YAML, tasks, handlers, variables & facts

A playbook is a YAML file of one or more plays; each play maps a host group to an ordered list of tasks (module calls). Ansible gathers facts via the setup module unless you set gather_facts: false. Tasks that report changed can notify handlers, which run once at the end of the play, after its tasks have finished.
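A minimal sketch of such a playbook, assuming a webservers group and a template file named nginx.conf.j2 (both hypothetical), could look like this:

```yaml
---
- name: Install and configure nginx
  hosts: webservers
  become: true
  gather_facts: true            # runs the setup module first (this is the default)
  tasks:
    - name: Install nginx
      ansible.builtin.package:
        name: nginx
        state: present          # changed only if the package was absent

    - name: Deploy nginx.conf from a Jinja2 template
      ansible.builtin.template:
        src: nginx.conf.j2      # hypothetical template next to the playbook
        dest: /etc/nginx/nginx.conf
        mode: "0644"
      notify: Restart nginx     # must match the handler name exactly

    - name: Ensure nginx is enabled and running
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true

  handlers:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
```

On a second run with an unchanged template, every task should report ok and the handler should not fire.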

Watch a play converge, and why running it twice is safe

How Ansible installs and starts nginx idempotently, then what changes on the second run.

① Gather facts: The setup module collects ansible_facts (OS family, IP addresses, package manager) so tasks can branch on them.

② Task: install nginx. The dnf/apt module checks if nginx is present. First run: absent → installs it → reports changed.

③ Task: deploy config (template). Jinja2 renders nginx.conf.j2. The file differs from the target → written → changed → notifies a handler.

④ Handler: restart nginx. Because the config changed, the queued handler fires ONCE at the end and restarts nginx. State now matches.

Figure 2 — How a playbook actually runs — recite this order

Two facts interviewers love: handlers run ONCE at the very end (not inline), and a task reports changed=true only when it actually altered state — that distinction is idempotency on screen.

Colour key: unreachable / failed · task running / inspected · changed (state altered) · ok (already in desired state)

Variable precedence decides which value wins when the same variable is set in many places. Jinja2 powers {{ }} templates and when: conditionals; register captures a task's output for later steps.

Quick check · Q2 of 10 · Analyze

A playbook deploys nginx.conf from a Jinja2 template and a separate task does "notify: restart nginx". On the SECOND run nothing changed in the template. Does nginx restart?

a) Yes, handlers always run every time
b) Yes, gather_facts forces a restart
c) No, the template task reports 'ok' (unchanged), so the handler is never notified
d) Only if you pass --check

Correct: c. Handlers fire only when a notifying task reports changed. On the second run the rendered config matches the target, the template task is 'ok', the handler is not queued, and nginx is left running — that is idempotency protecting you from a needless restart.

Pause & Predict

Where in a role do you put a variable you want users to easily override, versus one that should be hard to override?

Answer: Put easily-overridden values in defaults/main.yml (lowest precedence — almost anything beats it). Put values you want to win in vars/main.yml (much higher precedence). The classic interview point: role defaults are the weakest source, role vars are strong — mixing them up is why 'my override isn't working'.
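To make the split concrete, here is a sketch of the two files for a hypothetical apache role; every variable name and value below is an illustrative assumption.

```yaml
# roles/apache/defaults/main.yml
# Lowest precedence: tunables you expect users to override.
apache_listen_port: 80
apache_server_admin: webmaster@example.com
---
# roles/apache/vars/main.yml
# Much higher precedence: internals that should not change casually.
apache_package_name: httpd
apache_config_path: /etc/httpd/conf/httpd.conf
```

A group_vars entry for apache_listen_port would win over the default, whereas the same entry for apache_package_name would lose to the role var.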

Sneha at Infosys faces this

A play installs a package and a handler should restart the service, but the service never restarts even though the package was just installed.

Likely cause

The package task reported 'ok' (already installed from a prior run), so it never sent the notify; OR the handler name in notify does not exactly match the handler's name.

Diagnosis

Run with -v and read the per-task output (the recap only shows per-host totals): is the install task 'changed' or 'ok'? Compare the notify string to the handler's name character-for-character.

ansible-playbook site.yml -v ▸ read changed/ok per task

Fix

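Make the notify string match the handler name exactly. If the restart should follow configuration changes, notify the handler from the config task as well. Use meta: flush_handlers to run already-notified handlers immediately instead of at the end, and force_handlers so notified handlers still run when a later task fails.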

Verify

Re-run: when the config changes, the task reports changed, the handler is notified, and the service restarts exactly once.

👉 So far: Playbook = YAML plays → tasks (module calls). Facts come from the setup module. Handlers run once at the end, only when notified by a changed task. Variable precedence: -e (extra-vars) wins, role defaults lose. Jinja2 powers templates and when:.

Interview Q&A — playbook logic, errors & precedence

❓ Q · What are block / rescue / always, and how do you do error handling and retries?

A block groups tasks so you can apply shared directives (when, become, tags) once — and, crucially, it gives Ansible a try/catch/finally. rescue runs only if a task in the block fails; always runs no matter what (success or failure) — perfect for cleanup, releasing a lock, or re-enabling monitoring. Inside a rescue, ansible_failed_task and ansible_failed_result tell you what blew up. For retries on flaky steps, use until with retries and delay: until: result.rc == 0, retries: 5, delay: 10, with register: result — Ansible re-runs the task until the condition is true or attempts run out. Related controls interviewers pair with this: ignore_errors: true (keep going past a failure), failed_when: (define your own failure condition, e.g. a string in stderr), and any_errors_fatal: true (abort the whole play across all hosts on the first failure). Trap: rescue does NOT catch unreachable hosts — those are transport failures, not task failures; only failed tasks trigger a rescue.
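The pieces above can be sketched together as follows. The package name, health URL and lock path are hypothetical stand-ins, and the retry condition checks the HTTP status rather than rc because the uri module returns a status field.

```yaml
---
- name: Deploy with error handling and retries
  hosts: webservers
  become: true
  tasks:
    - name: Install the app, report on failure, always clean up
      block:
        - name: Install the application package
          ansible.builtin.package:
            name: myapp                           # hypothetical package
            state: present

        - name: Wait until the health endpoint answers
          ansible.builtin.uri:
            url: http://localhost:8080/health     # hypothetical endpoint
            status_code: 200
          register: result
          until: result.status == 200
          retries: 5
          delay: 10
      rescue:
        - name: Report which task failed
          ansible.builtin.debug:
            msg: "Failed task: {{ ansible_failed_task.name }}"
      always:
        - name: Release the deploy lock on success or failure
          ansible.builtin.file:
            path: /var/lock/myapp-deploy.lock     # hypothetical path
            state: absent
```

If the host itself becomes unreachable mid-block, the rescue section is not triggered.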

❓ Q · How do loops work (loop vs with_items) and how do you use when: and register together?

loop: is the modern, recommended way to iterate; with_items (and the other with_* styles) is the older syntax now superseded — say "use loop; with_items still works but is legacy." Inside the loop the current element is item. Tune behaviour with loop_control: loop_var (rename item to avoid clashes in nested loops), label (clean up noisy output, e.g. show just item.name), index_var, and pause. Conditionals use when: — a raw Jinja2 expression (no {{ }} needed), e.g. when: ansible_facts['os_family'] == 'RedHat'; multiple items in a when list are AND-ed. register captures a task's result into a variable so a later task can branch: register: svc then when: svc.rc != 0. Two classic gotchas: (1) when you loop AND register, the result holds a .results list, so you iterate svc.results, not svc directly; (2) when is evaluated per item, so combining loop + when filters elements rather than skipping the whole task. Trap: don't reach for with_items in a 2026 interview — and never put {{ }} around the whole when expression.
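A short sketch of loop, register and when working together, assuming systemd hosts and two example service names chosen only for illustration:

```yaml
---
- name: Loop, register and when together
  hosts: all
  gather_facts: true
  tasks:
    - name: Check whether each service unit is active
      ansible.builtin.command: "systemctl is-active {{ item }}"
      loop:
        - nginx
        - chronyd
      register: svc
      changed_when: false       # a read-only check never changes state
      failed_when: false        # a non-zero rc is data here, not a failure

    - name: Report only the services that are not active
      ansible.builtin.debug:
        msg: "{{ item.item }} is not active (rc={{ item.rc }})"
      loop: "{{ svc.results }}"             # looped register exposes a .results list
      loop_control:
        label: "{{ item.item }}"            # keep the output short
      when:                                 # list entries are AND-ed, no {{ }} needed
        - ansible_facts['os_family'] == 'RedHat'
        - item.rc != 0
```

Each entry in svc.results carries the original loop element as item.item alongside its own rc and stdout.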

❓ Q · Explain Ansible variable precedence in full — where do extra-vars, role vars, defaults, host_vars and set_fact sit?

Ansible merges variables from about 22 sources; when the same name is set in several, the highest-precedence one wins. You don't have to recite all 22, but you must know the anchors and their order. Lowest to highest, the ones interviewers test: role defaults (defaults/main.yml, the weakest — built to be overridden) → inventory group_vars → playbook group_vars → inventory host_vars → playbook host_vars → host facts / cached set_facts → play vars / vars_files → role vars (vars/main.yml, much stronger than defaults) → block vars → task vars → include_vars → set_fact / registered vars → role & include params → extra-vars (-e / --extra-vars, the absolute winner). The two facts that catch people: group_vars beats role defaults but loses to role vars; and -e overrides everything — even set_fact. Trap: "my override doesn't work" is almost always because the value was put in role vars/ (high precedence) instead of defaults/ (low), or because something passed -e upstream. Use ansible-playbook --extra-vars only as a deliberate override, and put tunables in defaults/.
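One way to see the ordering for yourself is a tiny local play. The variable name app_port and its values are made up for the demo.

```yaml
---
- name: Precedence demo
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    app_port: 8080              # play vars
  tasks:
    - name: Set the same variable with set_fact (outranks play vars)
      ansible.builtin.set_fact:
        app_port: 8081

    - name: Show the winning value
      ansible.builtin.debug:
        var: app_port
```

Run as-is, the debug task should report 8081. Adding -e app_port=9090 on the command line should make it report 9090, since extra-vars outrank set_fact.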

Figure 6 — Variable precedence — weakest (bottom) to strongest (top)

Two lines that win the question: role defaults are the weakest (override-me) source, and -e extra-vars beat everything — including set_fact and role vars.

③ Roles, reuse & secrets — Galaxy, collections, Vault

Roles are how you stop copy-pasting tasks. A role is a directory with a fixed layout (tasks/, handlers/, templates/, files/, vars/, defaults/, meta/) that a play includes by name. Share and reuse them via Ansible Galaxy, and bundle modules/roles/plugins into collections addressed by FQCN.
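A minimal sketch of a play calling a role, plus the role's entry-point task file. The role name apache and the httpd package are assumptions chosen for illustration.

```yaml
# site.yml: the playbook references the role by name
- name: Configure web servers
  hosts: webservers
  become: true
  roles:
    - role: apache                  # hypothetical role under roles/apache/
---
# roles/apache/tasks/main.yml: the role's entry point
- name: Install Apache
  ansible.builtin.package:
    name: httpd
    state: present

- name: Ensure Apache is enabled and running
  ansible.builtin.service:
    name: httpd
    state: started
    enabled: true
```

Modules are written by FQCN (ansible.builtin.package rather than bare package) so it is unambiguous which collection provides them.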

🖥️ This is the screen you run automation from in production — Automation Execution ▸ Templates ▸ Create job template in the AAP / AWX controller. Fields ①②③ decide WHAT runs, WHERE, and AS WHOM.

controller.aap.lab · Automation Execution ▸ Templates ▸ Create job template

Name *: Patch-RHEL-Web-Tier
Job Type: Run
Inventory *: Prod-WebServers
Project *: infra-playbooks (Git)
① Playbook *: site.yml
② Credentials *: rhel-ssh-key + vault-pass
③ Limit: webservers:&mumbai
Verbosity: 1 (Verbose)

① Playbook must be a file inside the linked Project (Git repo synced into AAP) — usually site.yml. ② Credentials pin BOTH the machine credential (SSH key) and the Vault credential, or encrypted vars fail to decrypt. ③ Limit narrows the run to a host pattern (e.g. one batch) without editing the inventory.

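include vs import controls when reused content is loaded: import_* (import_tasks, import_role, import_playbook) is static and resolved when the playbook is parsed, while include_* (include_tasks, include_role) is dynamic and resolved at runtime when execution reaches the task. Secrets are handled by Ansible Vault, and you can encrypt a single value inline with ansible-vault encrypt_string.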
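The timing difference between the two can be sketched like this; both task files are hypothetical names.

```yaml
---
- name: Static import vs dynamic include
  hosts: all
  gather_facts: true
  tasks:
    - name: Static, resolved when the playbook is parsed
      ansible.builtin.import_tasks: common.yml          # hypothetical file

    - name: Dynamic, resolved at runtime so the file name can depend on facts
      # Resolves to e.g. setup-redhat.yml or setup-debian.yml (hypothetical files)
      ansible.builtin.include_tasks: "setup-{{ ansible_facts['os_family'] | lower }}.yml"
```

One practical consequence: tasks pulled in by import_tasks appear in --list-tasks, while tasks behind include_tasks are only known once the play reaches them.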

Pause & Predict

You have a database password that must live in a Git repo with the playbook. How do you store it safely?

Answer: Encrypt it with Ansible Vault — either put it in a vault-encrypted vars file (ansible-vault encrypt group_vars/prod/vault.yml) or encrypt just that value with ansible-vault encrypt_string and paste the ciphertext into a normal vars file. At runtime you supply the password via --ask-vault-pass or a vault credential (in AAP). Plaintext in Git is the instant-fail answer.
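A common layout for the vault-file approach is sketched below. The variable names and the vault_ prefix are a convention assumed here, and the password is a placeholder rather than a real value.

```yaml
# group_vars/prod/vars.yml (plaintext, committed as-is)
# Points at the encrypted value so a search still shows where the variable is defined.
db_user: appuser                          # hypothetical
db_password: "{{ vault_db_password }}"
---
# group_vars/prod/vault.yml (shown BEFORE encryption; never commit it in this form)
# Encrypt with: ansible-vault encrypt group_vars/prod/vault.yml
vault_db_password: "replace-me"           # placeholder, not a real secret
```

Tasks keep using db_password, and only the vault file needs the vault password to be read.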

Quick check · Q3 of 10 · Analyze

A teammate writes the same 30 lines of "install + configure + restart Apache" tasks in five different playbooks. What is the correct Ansible fix?

a) Copy them into a sixth playbook for safety
b) Move the tasks into a role and have each playbook just call that role
c) Put them all in one giant task with shell
d) Disable gather_facts to save time

Correct: b. Extracting the repeated tasks into a roles/apache/ role makes them reusable, testable and shareable (via Galaxy/collections). Each playbook then references the role in one line — DRY, the entire reason roles exist.

Rahul at TCS faces this

After moving secrets into a Vault-encrypted vars file, the playbook fails on every host with 'Attempting to decrypt but no vault secrets found'.

Likely cause

The run wasn't given the Vault password — no --ask-vault-pass, no --vault-id, or (in AAP) no Vault credential attached to the job template.

Diagnosis

Re-run locally with --ask-vault-pass; if it works there, the gap is the missing Vault credential on the AAP job template.

ansible-playbook site.yml --ask-vault-pass ▸ then check AAP Credentials

Fix

Supply the Vault password: --ask-vault-pass / --vault-id prod@prompt locally, or attach the Vault credential alongside the SSH credential on the AAP job template.

Verify

Re-run: the encrypted vars decrypt, tasks proceed, and the recap shows failed=0.

'Roles are just folders' and 'Vault is optional'

Two killers. Roles aren't cosmetic — defaults/ vs vars/ have very different precedence, and meta/main.yml declares dependencies. And never hand-roll secret hiding: use Vault (or an external secrets manager like HashiCorp Vault) and commit only ciphertext. Plaintext passwords in a repo end interviews.

④ Scale & ops + troubleshooting

At scale you stop running from a laptop and move to AAP (the controller, formerly Tower; AWX is the open-source upstream). You define a job template that pins the playbook, inventory and credentials. Dynamic inventory keeps the host list current.

Figure 4 — The play failed — why? Read the recap, then climb

Start at the recap line, not the traceback: unreachable means SSH/auth, failed means the task, changed-every-time means you broke idempotency, and undefined variable means templating/precedence.

Pause & Predict

Before patching 200 production servers, how do you prove the playbook is safe WITHOUT changing anything?

Answer: Run it in check mode with diff: ansible-playbook patch.yml --check --diff. --check is a dry run: modules that support check mode report what they would change but make no changes, while tasks that cannot predict their result (plain command/shell) are skipped. --diff shows the exact file/line differences. Combine with --limit and serial to roll out in batches. This is the answer interviewers want for 'how do you de-risk a big change'.

Arjun at HCL faces this

A patching play over 200 servers hammers them all at once and a few time out, leaving the fleet half-patched.

Likely cause

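No batching. Without serial, Ansible runs each task across all 200 hosts (forks at a time, 5 by default) before moving to the next task, so a mid-play failure leaves the whole fleet half-patched. There is no serial setting to roll out gradually, and no check before the real run.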

Diagnosis

Dry-run first with --check --diff on a --limit subset; then set serial to patch in waves so a failure stops the rollout early.

ansible-playbook patch.yml --check --diff --limit canary ▸ then serial: 10

Fix

Add serial: 10 (or a percentage) to the play, raise forks sensibly, and gate prod behind --check; use --limit to canary a small subset of hosts first.

Verify

Re-run: hosts patch in controlled waves, a failing wave halts the play (max_fail_percentage), and the recap shows unreachable=0 failed=0.
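The batching fix can be sketched as a play header like the one below. It assumes dnf-based hosts and a webservers group, and the wave size of 10 is only an example.

```yaml
---
- name: Patch servers in controlled waves
  hosts: webservers
  become: true
  serial: 10                    # 10 hosts per wave; a percentage such as "10%" also works
  max_fail_percentage: 0        # any failed host in a wave stops the rollout
  tasks:
    - name: Apply all pending package updates
      ansible.builtin.dnf:
        name: "*"
        state: latest
```

With serial set, the whole task list completes on one wave before the next wave starts, so a bad patch is caught after 10 hosts rather than 200.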

Dry-run, then verify connectivity and idempotency

ansible all -i inventory.ini -m ping              # is every host reachable over SSH?
ansible-playbook patch.yml --check --diff --limit 10.20.30.41   # dry run, show diffs
ansible-playbook patch.yml --become --limit webservers          # real run, sudo
ansible-playbook patch.yml --become --limit webservers          # same command AGAIN: must be all 'ok'

Expected output

PLAY RECAP *********************************************************
10.20.30.41   : ok=5    changed=0    unreachable=0    failed=0
10.20.30.42   : ok=5    changed=0    unreachable=0    failed=0

Quick check · Q4 of 10 · Apply

On the SECOND consecutive run of a working playbook, you still see "changed=4" on every host. What does that tell a senior engineer?

a) Everything is perfect; changes always happen
b) The hosts rebooted themselves
c) gather_facts must be disabled
d) The playbook is NOT idempotent: likely command/shell tasks with no creates/changed_when, re-doing work each run

Correct: d. A correct playbook should report changed=0 on the second run because the state already matches. Persistent 'changed' usually means raw command/shell tasks (which always report changed) instead of proper modules, or missing creates:/changed_when: guards. That is a non-idempotency red flag.
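A sketch of the two guards in practice. The script, marker file and binary paths are hypothetical, and the marker is assumed to be written by the script itself.

```yaml
---
- name: Make command tasks idempotent
  hosts: all
  become: true
  tasks:
    # Anti-pattern (not run here): a bare command task reports changed on every run.
    #   - ansible.builtin.command: /opt/app/bin/init-db

    - name: Initialise the database only once
      ansible.builtin.command:
        cmd: /opt/app/bin/init-db                  # hypothetical script
        creates: /var/lib/app/.db-initialised      # skipped once this file exists

    - name: Read the app version without ever reporting changed
      ansible.builtin.command:
        cmd: /opt/app/bin/app --version            # hypothetical binary
      register: app_version
      changed_when: false
```

Where a purpose-built module exists (package, service, file, template), prefer it over command with guards.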

Priya at Wipro faces this

A play fails immediately with 'unreachable=1' on a brand-new host that the team swears is online.

Likely cause

It is a connectivity/auth problem, not a task problem: wrong SSH user or key, host not in the inventory group being targeted, host key not accepted, or Python missing on the target.

Diagnosis

Test the layer below Ansible: ssh user@host, then ansible -m ping. The error message (Permission denied / No route / host key) names the cause.

ansible 10.40.50.61 -m ping -vvv ▸ read the SSH error

Fix

Fix the inventory entry (ansible_user, ansible_ssh_private_key_file), add the host key to known_hosts, ensure become for privileged tasks, and confirm Python on the target.

Verify

ansible -m ping returns pong; the play recap shows unreachable=0.
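For reference, a YAML inventory entry carrying those connection settings might look like this. The user name, key path and group are assumptions for illustration.

```yaml
# inventory.yml
all:
  children:
    webservers:
      hosts:
        10.40.50.61:
          ansible_user: deploy                               # hypothetical login user
          ansible_ssh_private_key_file: ~/.ssh/id_ed25519    # hypothetical key path
          ansible_python_interpreter: /usr/bin/python3       # where Python lives on the target
```

ansible -i inventory.yml 10.40.50.61 -m ping then exercises exactly these settings.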

Interview Q&A — 2026 ecosystem, testing & advanced controls

❓ Q · What is Event-Driven Ansible (EDA) and how do rulebooks differ from playbooks?

Event-Driven Ansible (EDA) is the part of Ansible that reacts to events automatically instead of waiting for a human to press run; it's how you do auto-remediation and self-healing in 2026. The unit of work is a rulebook (run by ansible-rulebook), and it has three parts: sources (plugins that listen for events, such as a webhook, Kafka, Prometheus/Alertmanager alerts, a Git change, AWX/AAP job events), rules (condition: expressions in EDA's condition syntax, evaluated by a Drools-based rule engine, e.g. event.payload.alertname == "DiskFull"), and actions (most often run_playbook or run_module). The contrast interviewers want is on-demand versus reactive. A playbook runs on demand: you launch it and it converges the targeted hosts to the declared state, task by task. A rulebook is reactive: it sits running, watches a stream of events, and fires a playbook only when a condition matches. In AAP this runs on the EDA Controller alongside the Automation Controller. Trap: don't say "EDA replaces playbooks". It triggers them; the playbook is still where the actual change happens.
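The three parts can be sketched as a small rulebook. It assumes the ansible.eda collection's webhook source, an arbitrary port, and a hypothetical remediation playbook path.

```yaml
---
- name: Remediate disk alerts
  hosts: all
  sources:
    - name: Listen for webhook events
      ansible.eda.webhook:
        host: 0.0.0.0
        port: 5000                          # assumed port
  rules:
    - name: Clean up when a DiskFull alert arrives
      condition: event.payload.alertname == "DiskFull"
      action:
        run_playbook:
          name: playbooks/free_disk.yml     # hypothetical remediation playbook
```

With the webhook source, the posted JSON body arrives under event.payload, which is why the condition is written against that path.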

❓ Q · How do you test Ansible code with ansible-lint and Molecule, and why use FQCN?

ansible-lint is static analysis: it parses your playbooks/roles without running them and flags style problems, deprecations and risky patterns (using command where a module exists, missing name:, bare variables, non-FQCN module names). It ships profiles (min → production) you tighten over time and runs in CI on every PR. Molecule is the functional test harness for roles: it spins up a throwaway target (commonly a Docker or Podman container, also Vagrant/cloud), runs the role, and then runs a converge + idempotence check. The idempotence step re-runs the role and fails the build if anything reports changed, which is how you prove idempotency automatically. Its main phases are dependency → create → converge → idempotence → verify → destroy, where verify asserts the end state (often with Ansible asserts or testinfra). FQCN (Fully Qualified Collection Name, e.g. ansible.builtin.copy instead of bare copy) removes ambiguity about which collection a module comes from, is required by the production lint profile, and future-proofs you as modules move between collections. Trap: lint is static (no hosts touched); Molecule actually executes the role on a real ephemeral host. Name both and say which is which.
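As an illustration of the verify step, here is a sketch of a Molecule verify playbook for a hypothetical nginx role. The file path follows Molecule's default scenario layout, and the assertion is an assumed example of an end-state check.

```yaml
---
# molecule/default/verify.yml (sketch for a hypothetical nginx role)
- name: Verify
  hosts: all
  gather_facts: false
  tasks:
    - name: Collect installed packages
      ansible.builtin.package_facts:
        manager: auto

    - name: Assert that nginx ended up installed
      ansible.builtin.assert:
        that:
          - "'nginx' in ansible_facts.packages"
        fail_msg: nginx is not installed after converge
```

molecule test runs the full sequence, including the idempotence re-run, before this verify play.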

❓ Q · What do delegate_to, run_once, async/poll and tags do, and when would you use each?

delegate_to runs a task on a different host than the current inventory host. The classic uses are talking to a load balancer or DNS API to drain a node, or gathering something from a central box; pair it with delegate_facts: true if you want the facts stored against the delegate. run_once: true runs a task a single time for the whole batch instead of once per host (e.g. run one DB migration, send one notification), often combined with delegate_to: localhost. async + poll handle long-running or fire-and-forget tasks: async: 600 sets a max runtime and poll: 0 means "start it and don't wait" (kick off a long job, check it later with the async_status module), while poll: 5 backgrounds it but checks every 5s so SSH doesn't time out. tags let you run or skip slices of a play: label tasks/roles with tags:, then --tags deploy runs only those and --skip-tags slow excludes them; the special always tag (runs unless explicitly skipped) and never tag (runs only when explicitly requested) give you always-on / opt-in tasks. Trap: run_once picks the first host in the current batch, so under serial it runs once per wave, not truly once. Delegating to localhost does not change that, so guard the task with when: inventory_hostname == ansible_play_hosts_all[0] if you need exactly one execution.
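The four controls can be seen side by side in one sketch. The rebuild-cache script and the tag names are hypothetical.

```yaml
---
- name: Rolling update with delegation, run_once, async and tags
  hosts: webservers
  become: true
  serial: 5
  tasks:
    - name: Announce the rollout once per batch, from the control node
      ansible.builtin.debug:
        msg: "Updating {{ ansible_play_batch | join(', ') }}"
      run_once: true
      delegate_to: localhost
      tags: [notify]

    - name: Start a long job without waiting for it
      ansible.builtin.command:
        cmd: /usr/local/bin/rebuild-cache     # hypothetical long-running script
      async: 600                # allow up to 10 minutes
      poll: 0                   # fire and forget
      register: cache_job
      changed_when: true        # starting the job always does work
      tags: [deploy]

    - name: Check on the job until it finishes
      ansible.builtin.async_status:
        jid: "{{ cache_job.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 60
      delay: 10
      tags: [deploy]
```

ansible-playbook site.yml --tags deploy would then run only the two deploy tasks and skip the announcement.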

Karthik at Tech Mahindra faces this

The team built an Event-Driven Ansible rulebook to auto-restart a service when Alertmanager fires 'ServiceDown', but the remediation playbook never runs even though alerts are arriving.

Likely cause

The rule's condition doesn't match the real event payload (wrong field path or label name), or the source plugin isn't actually receiving events — and an unmatched event is silently dropped, so nothing fires.

Diagnosis

Run the rulebook with verbose/print-events to see the exact payload, then compare every field in the condition to what's really arriving.

ansible-rulebook -r restart.yml -i inv --print-events -vv ▸ compare payload to condition

Fix

Correct the condition path to match the payload (e.g. event.alert.labels.alertname == "ServiceDown"), confirm the source (webhook/Kafka) is reachable, and ensure the EDA Controller has credentials to launch the remediation playbook.

Verify

Fire a test alert: the rule matches, the action launches the playbook, the service restarts, and the EDA activation log shows the rule hit.

'ansible-lint passing means the role works' and 'EDA replaces playbooks'

Two 2026 gotchas. ansible-lint is static — a green lint only proves style and syntax; it never runs the role, so it can't catch a broken task or a non-idempotent step. Only Molecule's idempotence phase (re-run, fail on any change) proves the role actually converges. And EDA doesn't replace playbooks — a rulebook only watches events and triggers a playbook; the real change still lives in the playbook it calls. Mixing these up signals you've read the buzzwords but not used the tools.

Figure 5 — Ansible interview cheat-sheet

Prove it with --check, ping, and a second run

Don't close an Ansible ticket on 'should be fine'. ansible all -m ping proves connectivity; --check --diff proves what a change WOULD do; and running the same play a second time should report changed=0 — that final check proves idempotency. These three answer the vast majority of Ansible problems.

👉 So far: Scale with AAP/AWX job templates + dynamic inventory; de-risk with --check --diff, --limit and serial; escalate with become. Read the RECAP first: unreachable=SSH/auth, failed=the task, changed-every-run=broken idempotency, undefined var=precedence.

🧠 In your own words

Type one line: why is Ansible idempotency such a big deal? Then compare to the expert version.

Expert version: Because idempotency makes automation safe to run repeatedly. A task checks current state and changes the system only if it is not already in the desired state, so re-running a playbook never re-does work or causes drift — you get the same end state every time, whether it is the first run or the hundredth. That predictability is what lets teams run the same play across hundreds of production servers, schedule it, and trust it. The proof in an interview: a correct play reports changed=0 on its second consecutive run.

📖 Glossary

Control node: The machine where Ansible is installed and run from; reads inventory and pushes modules. The only node needing Ansible.
Managed node: A target host Ansible configures over SSH (Linux) or WinRM (Windows); needs no agent — just SSH + Python or WinRM.
Agentless: No persistent agent on targets — Ansible copies a module, runs it, then removes it each run.
Idempotency: Re-running a playbook converges to the same state; a task reports changed only when it truly alters the system.
Inventory: The host list — static (INI/YAML) or dynamic (cloud plugin) — grouped, with host_vars/group_vars.
Playbook vs role: Playbook = a YAML file of plays/tasks; role = a reusable standard directory (tasks/handlers/templates/defaults/…).
Handlers: Tasks that run once at the end of a play, only if a changed task notified them — usually a service restart.
Variable precedence: -e extra-vars wins; role defaults lose; the merge order decides which value applies per host.
Ansible Vault: Encrypts secrets at rest (AES256); encrypt whole files or single values (encrypt_string) — commit only ciphertext.
AAP / AWX: Enterprise (AAP, ex-Tower) / open-source (AWX) controller: job templates, RBAC, scheduling, logging, dynamic inventory.

Ansible Interview Questions — Playbooks, Roles, Vault & Cheat-Sheet
