Most engineers think…
Most people picture Splunk as 'one server you send logs to'. That model breaks the moment you have more than a handful of data sources or more than one analyst searching simultaneously.
A production Splunk environment is a distributed system with three tiers: a collection tier (forwarders), an indexing tier (clustered peer nodes managed by a cluster manager), and a search tier (a search-head cluster). The deployment server acts as the config-management plane for the collection tier. Understanding each role — what it stores, what it scales, and what fails when it goes down — is what separates a junior Splunk admin from an architect.
① Universal vs heavy forwarder — choosing the right collector
Splunk ships two forwarder types and the choice matters for CPU, bandwidth and flexibility. The universal forwarder (UF) is a lightweight, dedicated agent: it tails files, monitors network ports and WMI, and ships raw or minimally-parsed data to an indexer. It runs with a tiny footprint — far less CPU and memory than a full Splunk instance — making it the default choice for endpoints, servers and network devices. It does not index, it does not search, and it runs no web interface.
The heavy forwarder (HF) is a full Splunk Enterprise installation with indexing, searching and parsing capabilities — but deployed specifically to forward. You use a heavy forwarder when you need to parse, filter or mask data before it reaches the indexer (e.g. dropping noisy events at the edge, anonymising PII in transit, or routing specific sourcetypes to different indexer pools). Heavy forwarders consume more resources so you only add them where the pre-processing justifies the overhead.
The interview rule: UF by default, HF only when you need edge processing. Both can load-balance across multiple indexers and both receive configuration from the deployment server.
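The rule of thumb above can be sketched as a tiny decision helper. This is only an illustration: the function name and its three flags are hypothetical and are not part of any Splunk tooling.

```python
def choose_forwarder(needs_filtering: bool = False,
                     needs_masking: bool = False,
                     needs_routing: bool = False) -> str:
    """Return "UF" unless some form of edge processing is required.

    Hypothetical helper: the flags mirror the reasons given in the text
    for deploying a heavy forwarder (filtering, masking, routing).
    """
    needs_edge_processing = needs_filtering or needs_masking or needs_routing
    return "HF" if needs_edge_processing else "UF"


# Hosts that only tail log files need no edge processing.
print(choose_forwarder())                    # UF
# A collector that must anonymise PII before data reaches the indexer.
print(choose_forwarder(needs_masking=True))  # HF
```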
You need to deploy a Splunk collector on 500 Linux servers to tail log files. Which forwarder type is best?
② The deployment server — central config push at scale
When you have tens or hundreds of universal forwarders spread across an estate, logging into each one to update inputs.conf is not an option. The deployment server (DS) solves this: it is a Splunk instance whose only job is to push deployment apps — bundles of configuration files — to server classes (groups of forwarder clients that match a filter, e.g. all Windows hosts, all Linux DMZ boxes).
Forwarder clients poll the deployment server at a configurable interval (default every 60 seconds), download any new or changed apps, and restart if the app is configured to require it. The deployment server itself holds no data and does no indexing; it is purely a configuration management plane. In very large environments (thousands of forwarders) you may run multiple deployment servers or use Splunk's Monitoring Console to observe polling health.
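To make the server-class idea concrete, here is a minimal Python model of how a client's hostname could be matched against server-class filters to decide which deployment apps it receives. It illustrates the concept only: in a real deployment the mapping lives in serverclass.conf on the deployment server, and every class name, hostname pattern and app name below is made up.

```python
from fnmatch import fnmatch

# Hypothetical server classes: name -> (hostname patterns, deployment apps).
SERVER_CLASSES = {
    "all_windows": (["win-*"], ["windows_inputs"]),
    "linux_dmz": (["dmz-lnx-*"], ["linux_inputs", "dmz_outputs"]),
}


def apps_for_client(hostname: str) -> list[str]:
    """Return the deployment apps a client would receive, sorted by name."""
    apps = set()
    for patterns, class_apps in SERVER_CLASSES.values():
        # A client belongs to a class if any of the class's patterns match.
        if any(fnmatch(hostname, pattern) for pattern in patterns):
            apps.update(class_apps)
    return sorted(apps)


print(apps_for_client("dmz-lnx-07"))  # ['dmz_outputs', 'linux_inputs']
print(apps_for_client("win-app-01"))  # ['windows_inputs']
print(apps_for_client("mac-01"))      # []
```

A client that matches no server class simply receives no apps, which is why a filter that is too narrow tends to show up as forwarders that never pick up new inputs.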
What the DS does NOT do
In a clustered deployment the deployment server manages forwarder config only: it does not push configuration directly to indexer cluster peers or search-head cluster members. For indexer peers you use the cluster manager; for search-head cluster members you use the deployer (a separate instance that sits outside the cluster). Mixing them up is a classic interview mistake.
Universal forwarder (UF): Lightweight agent that collects data (files, ports, WMI) and ships raw or lightly-parsed events to indexers. No search, no indexing, tiny footprint. Default collector for every host.
Heavy forwarder (HF): Full Splunk Enterprise used as a forwarder. Use it when you need edge parsing, filtering, masking or complex routing before data reaches the indexer.
Deployment server (DS): Splunk instance that pushes configuration apps to server classes of forwarder clients. Config-management only: it does not index, search or manage indexer/SHC peers.
Cluster manager: Orchestrates the indexer cluster by directing replication, monitoring peer health and coordinating rolling upgrades. Does NOT index or search data itself.
In an interview always name three different config-management roles: the deployment server (for forwarders), the deployer (for search-head cluster members) and the cluster manager (for indexer peers). Confusing them is one of the most common mistakes in Splunk architecture questions.
Which Splunk component pushes configuration updates to forwarder clients automatically?
③ Indexer clustering — replication factor, search factor and the cluster manager
An indexer cluster is a group of Splunk indexer peer nodes managed by a single cluster manager (formerly called master node). The cluster manager does not index or search — it orchestrates replication, monitors peer health, and coordinates rolling upgrades.
Two numbers define the cluster's durability and search availability. The replication factor (RF) is how many complete copies of each bucket the cluster keeps across peers; with the default of 3, the cluster can lose 2 peers and still hold a copy of every bucket. The search factor (SF) is how many of those copies are searchable, meaning they contain the index (tsidx) files as well as the rawdata; non-searchable copies hold rawdata only. With the default of 2, a second searchable copy is already in place if a peer holding one goes down, so searches can continue without waiting for index files to be rebuilt. The rule: RF >= SF >= 1, and you must have at least as many peers as the RF.
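The constraints on RF, SF and peer count can be checked mechanically. The sketch below is a hypothetical pre-flight check, not a Splunk utility; it only encodes the rule stated above and the RF-1 failure tolerance.

```python
def validate_cluster(rf: int, sf: int, peers: int) -> list[str]:
    """Return a list of rule violations; an empty list means the plan is valid."""
    problems = []
    if sf < 1:
        problems.append("search factor must be at least 1")
    if rf < sf:
        problems.append("replication factor must be >= search factor")
    if peers < rf:
        problems.append("need at least as many peers as the replication factor")
    return problems


def tolerated_peer_failures(rf: int) -> int:
    """Peers that can fail while at least one copy of every bucket survives."""
    return rf - 1


print(validate_cluster(rf=3, sf=2, peers=3))  # []
print(validate_cluster(rf=3, sf=2, peers=2))
print(tolerated_peer_failures(3))             # 2
```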
Sizing with EPS
Start with your ingestion rate in Events Per Second (EPS). Multiply by average event size and by 86,400 seconds per day to get the daily volume in GB, add the RF multiplier for storage across peers, and budget a hot-bucket overhead on top. A conservative first-pass way to plan per-peer storage is: (daily GB × retention days × RF) ÷ peer count. Because Splunk compresses rawdata and only the searchable copies carry index files, real disk usage is usually lower, so treat the result as an upper bound. Keep each peer's disk utilisation well below 80% to leave headroom for replication bursts.
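A worked version of that arithmetic, as a rough sketch. All input numbers are invented for illustration, gigabytes are decimal (1e9 bytes), and the estimate deliberately ignores compression and hot-bucket overhead, so it should be read as a coarse planning figure rather than a final sizing.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_GB = 1e9  # decimal gigabytes


def daily_gb(eps: float, avg_event_bytes: float) -> float:
    """Convert an events-per-second rate into raw GB ingested per day."""
    return eps * avg_event_bytes * SECONDS_PER_DAY / BYTES_PER_GB


def storage_per_peer_gb(daily: float, retention_days: int,
                        rf: int, peer_count: int) -> float:
    """First-pass estimate: (daily GB x retention days x RF) / peer count."""
    return daily * retention_days * rf / peer_count


def min_disk_per_peer_gb(storage_gb: float, max_utilisation: float = 0.8) -> float:
    """Disk size that keeps the estimate at or below the utilisation ceiling."""
    return storage_gb / max_utilisation


# Made-up inputs: 5,000 EPS, 500-byte events, 90 days retention, RF 3, 6 peers.
daily = daily_gb(eps=5_000, avg_event_bytes=500)
per_peer = storage_per_peer_gb(daily, retention_days=90, rf=3, peer_count=6)
print(f"{daily:.0f} GB/day, {per_peer:.0f} GB per peer, "
      f"{min_disk_per_peer_gb(per_peer):.0f} GB disk per peer at the 80% ceiling")
```

With these inputs the estimate works out to 216 GB/day and 9,720 GB per peer, which implies roughly 12,150 GB of disk per peer to stay under the 80% ceiling.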
Setting replication factor to 1 means the cluster keeps only a single copy of each bucket, with no redundant copies: if the peer holding it goes down that data is unavailable, and if its disk is lost the bucket is gone permanently. Always set RF >= 2 in production (default 3). The disk cost of replication is far cheaper than data loss. Never let a 'save storage' argument override RF.
Your indexer cluster has replication factor=3 and search factor=2. How many peer nodes can fail before data becomes unsearchable?
④ Search-head clustering — captain, members, deployer and the full picture
A search-head cluster (SHC) is a group of search heads that share search workloads, knowledge objects (saved searches, dashboards, lookups) and scheduled jobs. One member is dynamically elected captain — it dispatches scheduled searches and coordinates artifact replication. If the captain fails, the cluster elects a new one automatically.
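Apps and configuration bundles, including the dashboards, saved searches and lookups they contain, are pushed to all SHC members by a separate instance called the deployer, never copied directly onto individual members. Knowledge objects that users create at runtime through a member's UI are replicated to the other members by the cluster itself. The SHC requires a minimum of 3 members so that a majority can still elect a captain if one member fails. An external load balancer sits in front of the SHC so users hit a single hostname regardless of which member serves them.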
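The three-member minimum follows from majority arithmetic. The sketch below assumes captain election needs a simple majority of all configured members; the function names are illustrative only.

```python
def captain_quorum(members: int) -> int:
    """Smallest number of members that forms a majority of the cluster."""
    return members // 2 + 1


def tolerated_member_failures(members: int) -> int:
    """Members that can be down while a majority can still elect a captain."""
    return members - captain_quorum(members)


for size in (2, 3, 5):
    print(size, captain_quorum(size), tolerated_member_failures(size))
# 2 2 0
# 3 2 1
# 5 3 2
```

A two-member cluster cannot lose anyone and still reach a majority, which is why three is the practical floor.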
The full distributed picture
Put it together: UFs on every host ship logs to HFs (where edge processing is needed) which forward to indexer peers managed by the cluster manager. The deployment server keeps forwarder config current. The SHC (captain + members, with the deployer distributing their apps) answers analyst queries and runs scheduled alerts. The Monitoring Console watches the health of the whole estate from a single pane. This is the architecture that appears in Splunk certifications and enterprise interviews.
Vikram at a Mumbai fintech firm faces this
After adding two new indexer peers to the cluster, searches return incomplete results — some events from the past week are missing entirely in certain time ranges.
The cluster has not finished replicating data to the new peers; buckets are still in the process of meeting the replication factor after the expansion, and some buckets lack the minimum search factor copies.
Open the Monitoring Console ▸ Indexing ▸ Indexer Clustering — the dashboard shows buckets flagged as 'not yet searchable' and replication in progress. The cluster manager log confirms peers are catching up.
Monitoring Console ▸ Indexing ▸ Indexer Clustering ▸ Bucket Status
Allow the cluster manager time to complete replication. Do not add more peers mid-replication. Once all buckets meet the search factor, the cluster manager marks them searchable and results return fully.
Re-check the Monitoring Console — all buckets show green for replication and search factor compliance. Re-run the impacted search and confirm full results across the time range.
Before declaring a distributed deployment production-ready, check the Monitoring Console ▸ Indexer Clustering page. All peer nodes should be green, all buckets should meet RF and SF, and no replication is pending. Only a clean Monitoring Console reading means the cluster is actually protecting your data.
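That checklist can be written down as a simple gate. The snapshot class below is a hypothetical summary of what you read off the dashboard, filled in by hand or by your own tooling; it is not a Splunk API object and the field names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ClusterSnapshot:
    """Hypothetical summary of the clustering dashboard; not a Splunk API object."""
    peers_total: int
    peers_up: int
    buckets_below_rf: int
    buckets_below_sf: int
    replication_pending: int


def production_blockers(snap: ClusterSnapshot) -> list[str]:
    """Return the reasons the cluster is not production-ready (empty = ready)."""
    blockers = []
    if snap.peers_up < snap.peers_total:
        blockers.append(f"{snap.peers_total - snap.peers_up} peer(s) not up")
    if snap.buckets_below_rf:
        blockers.append(f"{snap.buckets_below_rf} bucket(s) below replication factor")
    if snap.buckets_below_sf:
        blockers.append(f"{snap.buckets_below_sf} bucket(s) below search factor")
    if snap.replication_pending:
        blockers.append(f"{snap.replication_pending} replication task(s) pending")
    return blockers


healthy = ClusterSnapshot(peers_total=5, peers_up=5, buckets_below_rf=0,
                          buckets_below_sf=0, replication_pending=0)
catching_up = ClusterSnapshot(peers_total=5, peers_up=5, buckets_below_rf=12,
                              buckets_below_sf=4, replication_pending=16)
print(production_blockers(healthy))      # []
print(production_blockers(catching_up))
```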
In a search-head cluster, where do you push shared dashboards and saved searches?
🧠 In your own words
Type one line: what are the three configuration-management roles in a full distributed Splunk deployment, and what does each one manage?
📖 Glossary
- Universal Forwarder (UF): A lightweight Splunk agent that collects data (files, ports, WMI) and ships it to indexers with minimal CPU/RAM. No search, no indexing, no web UI.
- Heavy Forwarder (HF): A full Splunk Enterprise instance used as a forwarder. Capable of parsing, filtering, masking and routing events before they reach the indexer.
- Deployment Server (DS): A Splunk instance that pushes configuration app bundles to server classes of forwarder clients. Config-management only; does not index or search.
- Cluster Manager: Orchestrates the indexer cluster by directing replication, monitoring peer health, enforcing RF/SF, and coordinating rolling upgrades. Does not index or search.
- Replication Factor (RF): The number of complete copies of each bucket the indexer cluster maintains. Default 3; the cluster tolerates (RF-1) simultaneous peer failures.
- Search Factor (SF): The number of immediately searchable bucket copies (rawdata plus tsidx index files) the cluster maintains. Default 2; keeps search available if one peer fails.
- Search-Head Cluster (SHC): A group of search heads that share workload, knowledge objects and scheduled searches. Requires a minimum of 3 members; one is elected captain dynamically.
- Deployer: A separate Splunk instance that pushes apps and configuration bundles (including the dashboards, saved searches and lookups they contain) to all search-head cluster members.
- EPS (Events Per Second): The ingestion rate metric used to size Splunk peer nodes, storage and processing capacity in a distributed deployment.
- Server Class: A named group of forwarder clients on the deployment server, matched by filter criteria, that all receive the same deployment apps.
📚 Sources
- Splunk Docs — Types of forwarders (Universal Forwarder vs Heavy Forwarder). help.splunk.com/en/splunk-enterprise/forward-and-process-data/forwarding-and-receiving-data/10.0/introduction-to-forwarding/types-of-forwarders
- Splunk Docs — Forwarder deployment topologies. help.splunk.com/en/splunk-enterprise/forward-and-process-data/forwarding-and-receiving-data/9.4/plan-your-deployment/forwarder-deployment-topologies
- Splunk Docs — The basics of indexer cluster architecture. help.splunk.com/en/data-management/manage-splunk-enterprise-indexers/9.4/overview-of-indexer-clusters-and-index-replication/the-basics-of-indexer-cluster-architecture
- Splunk Docs — Replication factor. help.splunk.com/en/data-management/manage-splunk-enterprise-indexers/10.0/how-indexer-clusters-work/replication-factor
- Splunk Docs — Search factor. help.splunk.com/en/data-management/manage-splunk-enterprise-indexers/9.1/how-indexer-clusters-work/search-factor
- Splunk Docs — System requirements and deployment considerations for search head clusters. help.splunk.com/en/splunk-enterprise/administer/distributed-search/9.4/deploy-search-head-clustering/system-requirements-and-other-deployment-considerations-for-search-head-clusters
What's next?
Got the deployment topology? Next, go deep on Splunk data pipelines — parsing, event-breaking, field extraction, and the transform chain — so you can control exactly what lands in the index.