What does the pipe character (|) do in SPL?

Correct: c. SPL is a pipeline read left to right. The pipe takes the output of everything before it and feeds it as input to the next command, so each command builds on the previous one's results.

Which command turns events into a results table by counting or grouping?

Correct: a. stats is the core transforming command — count, sum, avg, values, optionally 'by' a field — collapsing events into rows. eval and rex build fields per event; where filters; none of those aggregate.

You want to extract a username from raw text into a field without re-indexing. Which command?

Correct: b. rex extracts a field at search time using a regex with named capture groups, e.g. rex "user=(?<user>\w+)". Because Splunk is schema-on-read, this needs no re-indexing. lookup enriches from a table; dedup and timechart don't extract fields.

Which group of commands can run in parallel on the indexers?

Correct: c. Distributable streaming commands (eval, rex, where, fields, rename) don't depend on event order, so each indexer runs them on its own data in parallel. Transforming commands gather events on the search head instead.

An interviewer asks how to speed up a slow SPL search. Best answer?

Correct: b. Filtering early limits the data read from disk and lets distributable streaming run on the indexers. Transforming late keeps work parallel for as long as possible. Wide all-time searches and early transforms are the classic causes of slow queries.

Why can you add a new field with rex and use it immediately, without re-indexing?

Correct: d. Splunk applies most of its schema when you read (search), not when you write (index). rex runs against the already-stored raw events at search time, so a new extraction takes effect on the next search with no re-indexing.

Splunk SPL — the Search Processing Language: Pipes, stats, eval, lookups & Fields (2026)

Most engineers think…

Most people think 'searching Splunk' means typing a few keywords in a box, like Google. That gets you raw events — but it falls apart the moment an interviewer asks you to count, group, or chart something, or to explain why a colleague's search is crawling.

SPL is a pipeline language. You begin with search terms over a time range to pull events, then pipe them through a chain of commands — each command takes the previous one's output as its input. Two ideas unlock everything: the difference between filtering commands (search, where) and building commands (eval, rex), versus transforming commands (stats, timechart, top) that turn events into a results table; and the fact that Splunk extracts most fields at search time. Understand the pipe model and that split, and you can read and write real SPL — and explain why filtering early is what makes a search fast.

① The pipe model — search terms, a time range, then a pipeline

The single most important idea: SPL is a pipeline, read left to right. Every search starts with search terms over a time range — typically an index, a sourcetype, and a few keywords — which pulls the matching events out of the indexers.

You then add a pipe (the | character) and a command. Each command takes the results of everything before it as its input and hands its output to whatever comes next. So index=web status=500 | stats count by uri reads as: get error events, then count them by URL. The base search (before the first pipe) decides how much data you pull; the commands after the pipes shape it.

The interview line: SPL is search-terms-then-pipeline. Keywords and the time range filter at the source; pipes pass a stream of results from one command to the next. Get that picture and the rest of SPL is just learning what each command does.
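To try the pipe model without a real web index, here is a minimal sketch that fabricates eight events with makeresults and then applies the same filter-then-count shape. The field values (status, uri) are invented for illustration, and the explicit | search status=500 stands in for what would normally sit in the base search.

```spl
| makeresults count=8
| streamstats count as n
| eval status=if(n%4==0, 200, 500), uri=case(n<=4, "/login", n<=6, "/cart", true(), "/home")
| search status=500
| stats count by uri
```

Reading it left to right: makeresults and streamstats create numbered events, eval attaches the made-up fields, search keeps only the 500s, and stats collapses what is left into one row per uri. With these values that should be /cart 2, /home 1 and /login 3.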

Figure 1 — An SPL search, read left to right

Every SPL search is search terms over a time range, then a pipeline of commands joined by the pipe (|).

Quick check · Q1 of 10 · Understand

In index=web status=500 | stats count by uri, what does the pipe do?

a) Searches a second index called stats
b) Passes the matching 500 events into stats, which then counts them by URL
c) Runs stats first, then the search
d) Comments out the stats command

Correct: b. SPL reads left to right. The base search pulls the status=500 events, and the pipe hands those results to stats, which aggregates them into a count per uri. Commands always consume the output of everything to their left.

👉 So far: SPL is a pipeline read left to right: search terms over a time range pull events, then the pipe (|) passes results from each command to the next.

② The everyday commands — filter, build a field, choose columns

A handful of commands cover most day-to-day searching. search filters events on raw keywords and field values (it is the implicit first command, and you can use it again later with another | search). eval creates or changes a field from an expression, e.g. | eval is_slow=if(response_ms>1000,"yes","no"). Note the double quotes: in eval, string literals take double quotes, while single quotes refer to a field name. where filters using an expression and functions, so it can compare two fields or use calculations that plain search cannot.

Shaping the output

To control what you see, table picks the exact columns in order, fields keeps or drops fields (trimming early also speeds the search), sort orders results (e.g. | sort -count), and dedup removes duplicate rows for a field. A typical chain: index=auth action=failure | eval hour=strftime(_time,"%H") | table user, src_ip, hour | sort user. The mental split to keep: search and where filter; eval and rex build fields; table, fields, sort and dedup arrange the output.
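The everyday commands above can be chained in one self-contained search. This is a sketch on fabricated data: the user, response_ms and limit_ms fields are assumptions made up with makeresults, not fields from any real sourcetype. String literals are double-quoted because eval treats single-quoted text as a field name.

```spl
| makeresults count=5
| streamstats count as n
| eval user=if(n%2==1, "alice", "bob"), response_ms=n*400, limit_ms=1000
| eval is_slow=if(response_ms>limit_ms, "yes", "no")
| where response_ms > limit_ms
| table user, response_ms, is_slow
| sort -response_ms
```

Here eval builds fields, where filters by comparing two fields (something plain search cannot do), table picks the columns, and sort orders the rows. With these values, three rows should survive: 2000, 1600 and 1200 ms, all flagged as slow.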

Figure 2 — Four jobs SPL commands do

Most commands fall into one of four roles — keep the roles straight and any search reads cleanly.

The pipe (|): Passes the results of everything to its left into the next command as input. SPL is read left to right as a pipeline.
stats: The workhorse transforming command. Aggregates events into a table with count, sum, avg, values or list, optionally 'by' one or more fields.
eval: Creates or changes a field per event using an expression and functions like if(), case() and strftime(). It is streaming, so it keeps the events.
rex: Extracts a field at search time with a regex and named capture groups: rex "user=(?<user>\w+)". No re-indexing is needed, because Splunk is schema-on-read.

Sort commands into four jobs

When you read a search, label each command: filter (search, where), build a field (eval, rex), enrich (lookup), or transform (stats, timechart, top). table, fields, sort and dedup just arrange output. Naming the job of each piped command is how you read any SPL search out loud in an interview.

Quick check · Q2 of 10 · Remember

Which command creates a new field from an expression, per event?

a) stats
b) dedup
c) eval
d) sort

Correct: c. eval evaluates an expression for each event and writes the result into a field, e.g. eval is_slow=if(response_ms>1000,"yes","no"). stats aggregates; dedup removes duplicate rows; sort just orders results.

👉 So far: Everyday commands: search and where filter; eval and rex build fields; table, fields, sort and dedup arrange output. eval makes a field from an expression per event.

③ stats, transforms and fields — turning events into answers

Counting and grouping is where SPL earns its keep, and that is the job of transforming commands. stats aggregates events into a table: | stats count, avg(response_ms) as avg_ms by host gives one row per host. timechart is stats split over time for trend charts (| timechart count by status), and top is a shortcut for the most common values (| top limit=10 user). These do not return events — they return a results table.

Fields and enrichment

Splunk is schema-on-read: most fields are extracted at search time, so you can add new fields without re-indexing. rex extracts a field on the fly with a regex and named capture groups, as in | rex "user=(?<user>\w+)", where the name inside the angle brackets becomes the field name. lookup enriches events from a table, e.g. mapping an error_code to a friendly description or a host to its owner. Together: filter to the right events, build or enrich the fields you need, then stats them into an answer.
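A quick way to see search-time extraction is to hand rex a raw event you wrote yourself. The sketch below fabricates a single event with makeresults; the log line is invented, and the only real requirement is that the regex sits in double quotes and names its capture group.

```spl
| makeresults
| eval _raw="2026-01-01T10:00:00Z action=failure user=alice src_ip=10.0.0.5"
| rex "user=(?<user>\w+)"
| table _raw, user
```

rex reads _raw by default, so no field= argument is needed here. The user field should come back as alice, and nothing was re-indexed to get it.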

Figure 3 — eval vs stats — the two people confuse

eval works per event to make a field; stats aggregates many events into a results table.

Figure 4 — The everyday SPL command set

Learn this handful and you can read and write most real searches; everything else builds on them.

Don't confuse eval with stats

A classic slip is reaching for eval to count or group — eval only works per event and writes a field, it never aggregates. Counting, summing, averaging and grouping are stats (or timechart over time, top for most-common). If your answer is 'one row per group', it's stats; if it's 'a value on each event', it's eval.
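The difference shows up clearly when both commands run over the same events. In this sketch the hosts and response times are made-up values from makeresults: eval adds is_slow to each of the four events, and only stats reduces them to one row per host.

```spl
| makeresults count=4
| streamstats count as n
| eval host=if(n<=2, "web01", "web02"), response_ms=n*500
| eval is_slow=if(response_ms>1000, "yes", "no")
| stats count as event_count, avg(response_ms) as avg_ms, values(is_slow) as slow_flags by host
```

Stop the search after the second eval and you still have four events, each with an extra field. Add the stats line and you should get two rows: web01 with avg_ms 750 and web02 with avg_ms 1750.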

Quick check · Q3 of 10 · Apply

You need the average response time per host as a table. Which command and shape?

a) | stats avg(response_ms) by host
b) | eval avg(response_ms) by host
c) | table avg(response_ms)
d) | sort response_ms

Correct: a. Averaging and grouping is aggregation, which is stats: '| stats avg(response_ms) as avg_ms by host' returns one row per host. eval works per event and can't group; table only chooses columns; sort only orders.

👉 So far: stats, timechart and top are transforming — they turn events into a results table. Fields are extracted at search time, so rex and lookup add or enrich fields without re-indexing.

④ Streaming vs transforming — and why filtering early is fast

Commands fall into two big families, and this is a favourite interview question. Streaming commands handle one event at a time — one event in, one (or zero) out. The distributable streaming ones — eval, rex, where, fields, rename — can run in parallel on the indexers, close to the data. Transforming commands — stats, timechart, top, chart, rare — gather all the events on the search head and build a results table; they are not streaming.

That is exactly why filtering early makes a search fast. The base search (index, sourcetype, time range, keywords) limits the data pulled from disk; distributable streaming commands then run on the indexers in parallel. The moment a transforming command runs, the work moves to the search head. So the rule is: filter as early as possible, transform as late as possible — narrow the index and time range, keep terms specific, and only then stats or timechart. A wide all-time search with a transform up front is the classic slow query.
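The ordering rule can be sketched on generated data. Everything here is an assumption for illustration: makeresults produces the events on the search head, so this shows only the order of commands, not the parallel work on the indexers, and in a real search the filter would sit in the base search rather than in where.

```spl
| makeresults count=1000
| streamstats count as n
| eval uri=if(n%10==0, "/login", "/home"), status=if(n%2==0, 500, 200), noise="unused"
| where uri="/login" AND status>=400
| fields uri, status
| stats count by uri
```

The streaming steps (eval, where, fields) cut 1000 events down to the 100 that matter and drop the unused field before stats runs, so the transform has far less to gather. Moving the filter after stats would give the same count for /login but force stats to process every event first.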

Figure 5 — Streaming vs transforming commands

Distributable streaming runs on the indexers in parallel; transforming gathers everything on the search head.

Priya at a Hyderabad SOC faces this

Her dashboard panel 'top failing URLs' takes over a minute and sometimes times out, even though the data volume is modest.

Likely cause

The search runs a transform across all time first — '| stats count by uri | search uri=*login*' over an all-time window — so it pulls every event before filtering.

Diagnosis

Open the Job Inspector: nearly all the time is spent reading events because there is no index/sourcetype, the time range is All time, and the login filter is applied after stats.

Search ▸ Job Inspector ▸ execution costs + the search bar

Fix

Rewrite to filter early: 'index=web sourcetype=access_combined uri=*login* status>=400 earliest=-24h | stats count by uri | sort -count'. Now the base search narrows the data on the indexers, and stats only sees the failing login requests from the last 24 hours.

Verify

Re-run: the Job Inspector shows far fewer events scanned, the panel returns in a couple of seconds, and the result is the same top list.

Prove it in the Job Inspector

Never guess why a search is slow. The Job Inspector shows events scanned and where time was spent. If most time is in reading events, your base search is too wide — add index, sourcetype, a tighter time range and specific terms, and watch the scanned count drop. That single read settles most 'my search is slow' tickets.

▶ Watch an SPL search run through the pipeline

How one search becomes a results table, command by command.

① Base search: index=web sourcetype=access_combined status>=500 earliest=-1h pulls only the recent server-error events from the indexers.

② Build a field: | eval slow=if(response_ms>1000,"yes","no") runs per event on the indexers, tagging each error as slow or not.

③ Transform: | stats count by uri, slow gathers the events on the search head and collapses them into one row per URL and slow flag.

④ Arrange: | sort -count puts the worst URLs on top, and the panel shows a clean ranked table of failing pages.

Quick check · Q4 of 10 · Analyze

Why does putting index=, a tight time range and specific terms first make a search faster?

a) It changes the data into search-time fields
b) It tells stats to run on the indexers
c) It limits how much data is pulled from disk, and distributable streaming commands then run in parallel on the indexers
d) It disables transforming commands

Correct: c. The base search bounds the data read from disk; filtering early means less to process. Distributable streaming commands (eval, rex, where) then run on the indexers in parallel. A transforming command later moves work to the search head, so filter early and transform late.

👉 So far: Streaming commands (eval, rex, where) process one event at a time, and the distributable ones run on the indexers in parallel; transforming commands (stats, timechart, top) run on the search head. Filter early and transform late to keep searches fast.

🧠 In your own words

Type one line: what makes SPL a 'pipeline', and why does filtering early make a search fast? Then compare with the expert version.

Expert version: SPL is a pipeline because a search reads left to right: it starts with search terms over a time range to pull events, then passes results through a chain of commands joined by the pipe (|), where each command takes the previous one's output as its input. Filtering early makes it fast because the base search — index, sourcetype, a tight time range and specific keywords — bounds how much data is read from disk, and distributable streaming commands (eval, rex, where) then run on the indexers in parallel. The moment a transforming command (stats, timechart, top) runs, work moves to the search head, so the rule is filter as early as possible and transform as late as possible.

📖 Glossary

SPL (Search Processing Language): Splunk's query language: a pipeline of search terms plus commands, joined by the pipe (|), that retrieves and processes events.
Pipe (|): The operator that passes the results of everything to its left into the next command as input. SPL reads left to right.
Base search: The part before the first pipe — index, sourcetype, keywords and time range — that decides how much data is pulled from disk.
search / where: Filtering commands. search filters on keywords and field values; where filters using an expression and functions, so it can compare fields.
eval: Creates or changes a field per event from an expression and functions like if(), case() and strftime(). A streaming command.
stats: The core transforming command. Aggregates events into a results table with count, sum, avg, values, optionally grouped 'by' fields.
rex: Extracts a field at search time using a regex with named capture groups. No re-indexing needed because Splunk is schema-on-read.
lookup: Enriches events with extra data from a lookup table — e.g. mapping a code to a description or a host to its owner.
timechart / top: timechart is stats split over time for trend charts; top returns the most common values of a field. Both are transforming.
Streaming vs transforming: Streaming = one event in, one out (distributable ones run on indexers in parallel); transforming = gather all events on the search head into a table.

What's next?

Comfortable with SPL? Next, see how the Splunk architecture under it works — forwarders, indexers and search heads — and then how Enterprise Security turns SPL into a full SIEM with correlation searches and notable events.
