
Splunk SPL — Pipes, stats, eval, lookups & Fields

SPL is how you actually pull answers out of Splunk. It is a pipeline language: you start with search terms over a time range, then pass results through a chain of commands joined by the pipe (|). This lesson teaches you to read and write a basic search, explain stats and eval in an interview, and know why filtering early is what makes a search fast.

📅 2026-06-19 · ⏱ 16 min · 5 infographics · live SPL pipe demo · 🏷 10-Q assessment + AI Tutor inline

⚡ Quick Answer

A clear, interactive guide to Splunk SPL (2026): the search pipeline and the pipe model, search terms and time range, the key commands (search, stats, eval, where, rex, lookup, table, sort, dedup, timechart, top), streaming vs transforming commands, fields and search-time extraction, and why filtering early makes searches fast.

🎯 By the end you will be able to


Pick where you want to start

1. The pipe model: search terms, time range, and the | pipeline.
2. Everyday commands: search, eval, where, table, sort, dedup.
3. stats, transforms & fields: stats, timechart, top, rex, lookup, fields.
4. Streaming vs fast: streaming vs transforming; filter early.

🧠 Warm-up — 3 questions, no score

Just notice which ones make you pause. We answer all three inside the lesson.

1. What does the pipe character | do in an SPL search?

Answered in The pipe model.

2. Which command creates a new field from an expression?

Answered in Everyday commands.

3. When are most fields extracted in Splunk?

Answered in stats, transforms & fields.

Most engineers think…

Most people think 'searching Splunk' means typing a few keywords in a box, like Google. That gets you raw events — but it falls apart the moment an interviewer asks you to count, group, or chart something, or to explain why a colleague's search is crawling.

SPL is a pipeline language. You begin with search terms over a time range to pull events, then pipe them through a chain of commands, each of which takes the previous one's output as its input. Two ideas unlock everything. First, commands split into those that work per event, either filtering (search, where) or building fields (eval, rex), and transforming commands (stats, timechart, top) that turn events into a results table. Second, Splunk extracts most fields at search time. Understand the pipe model and that split, and you can read and write real SPL, and explain why filtering early is what makes a search fast.

① The pipe model — search terms, a time range, then a pipeline

The single most important idea: SPL is a pipeline, read left to right. Every search starts with search terms over a time range — typically an index, a sourcetype, and a few keywords — which pulls the matching events out of the indexers.

You then add a pipe (the | character) and a command. Each command takes the results of everything before it as its input and hands its output to whatever comes next. So index=web status=500 | stats count by uri reads as: get error events, then count them by URL. The base search (before the first pipe) decides how much data you pull; the commands after the pipes shape it.
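To make the pipe model concrete, here is a rough Python model of index=web status=500 | stats count by uri. It is only an analogy, not how Splunk is implemented: the event dictionaries, the sample data, and the helper names base_search and stats_count_by are all made up for illustration.

```python
from collections import Counter

# Hypothetical sample events; in Splunk these would come from the indexers.
EVENTS = [
    {"index": "web", "status": 500, "uri": "/login"},
    {"index": "web", "status": 200, "uri": "/home"},
    {"index": "web", "status": 500, "uri": "/login"},
    {"index": "web", "status": 500, "uri": "/cart"},
    {"index": "auth", "status": 500, "uri": "/login"},
]


def base_search(events, **terms):
    """Model of the base search: keep events whose fields match every term."""
    for event in events:
        if all(event.get(field) == value for field, value in terms.items()):
            yield event


def stats_count_by(events, field):
    """Model of `stats count by <field>`: one output row per distinct value."""
    counts = Counter(event[field] for event in events)
    return [{field: value, "count": n} for value, n in counts.items()]


# index=web status=500 | stats count by uri
matched = base_search(EVENTS, index="web", status=500)  # left of the pipe
rows = stats_count_by(matched, "uri")                   # right of the pipe
print(rows)
```

The point of the sketch is the hand-off: stats_count_by never sees the full event list, only whatever base_search lets through, which is the sense in which the base search decides how much data the rest of the pipeline has to handle.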

The interview line: SPL is search-terms-then-pipeline. Keywords and the time range filter at the source; pipes pass a stream of results from one command to the next. Get that picture and the rest of SPL is just learning what each command does.

Figure 1 — An SPL search, read left to right
Every SPL search is search terms over a time range, then a pipeline of commands joined by the pipe (|).
Stages: Search terms (index, sourcetype, keywords) → Time range (earliest / latest) → Pipe | (pass results on) → Commands (eval, where, stats) → Results (events or a table).
Quick check · Q1 of 10 · Understand

In index=web status=500 | stats count by uri, what does the pipe do?

Correct: b. SPL reads left to right. The base search pulls the status=500 events, and the pipe hands those results to stats, which aggregates them into a count per uri. Commands always consume the output of everything to their left.
👉 So far: SPL is a pipeline read left to right: search terms over a time range pull events, then the pipe (|) passes results from each command to the next.

② The everyday commands — filter, build a field, choose columns

A handful of commands cover most day-to-day searching. search filters events on raw keywords and field values (it is the implicit first command, and you can use it again later with another | search). eval creates or changes a field from an expression, e.g. | eval is_slow=if(response_ms>1000,"yes","no"); string literals in eval take double quotes, because single quotes refer to a field name. where filters using an expression and functions, so it can compare two fields or use calculations that plain search cannot.

Shaping the output

To control what you see, table picks the exact columns in order, fields keeps or drops fields (and trimming early speeds the search), sort orders results (e.g. | sort -count), and dedup keeps only the first event for each value of the fields you name. A typical chain: index=auth action=failure | eval hour=strftime(_time,"%H") | table user, src_ip, hour | sort user. The mental split to keep: search and where filter; eval and rex build fields; table, fields, sort and dedup arrange the output.
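The typical chain above (filter, build a field, choose columns, sort) can be sketched in Python as follows. This is an illustrative model under stated assumptions: the events, user names and IP addresses are invented, the helper names eval_hour and table are hypothetical, and the hour is formatted in UTC to keep the example deterministic, whereas Splunk formats _time in the searching user's time zone.

```python
from datetime import datetime, timezone

# Hypothetical auth events; _time is epoch seconds, as in Splunk.
EVENTS = [
    {"_time": 1700000000, "user": "meera", "src_ip": "10.0.0.7", "action": "failure"},
    {"_time": 1700003600, "user": "arjun", "src_ip": "10.0.0.9", "action": "failure"},
    {"_time": 1700007200, "user": "arjun", "src_ip": "10.0.0.9", "action": "success"},
]


def eval_hour(event):
    """Model of `eval hour=strftime(_time, "%H")`: add one field to one event."""
    stamp = datetime.fromtimestamp(event["_time"], tz=timezone.utc)
    return {**event, "hour": stamp.strftime("%H")}


def table(event, *columns):
    """Model of `table`: keep only the named columns, in the given order."""
    return {name: event.get(name) for name in columns}


# index=auth action=failure | eval hour=... | table user, src_ip, hour | sort user
failures = [e for e in EVENTS if e["action"] == "failure"]          # filter
with_hour = [eval_hour(e) for e in failures]                        # build a field
columns = [table(e, "user", "src_ip", "hour") for e in with_hour]   # choose columns
result = sorted(columns, key=lambda row: row["user"])               # arrange

for row in result:
    print(row)
```

Every step here returns as many rows as it was given, or fewer; none of them combines rows, which is what separates these everyday commands from stats.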

Figure 2 — Four jobs SPL commands do
Most commands fall into one of four roles; keep the roles straight and any search reads cleanly.
Roles: Filter (search, where: keep matching events); Build fields (eval, rex: create or extract a field); Enrich (lookup: add data from a table); Transform (stats, timechart, top: events into a table).
➡️ The pipe |
Passes the results of everything to its left into the next command as input. SPL is read left to right as a pipeline.

🧮 stats
The workhorse transforming command. Aggregates events into a table: count, sum, avg, values, list, optionally 'by' one or more fields.

🛠️ eval
Creates or changes a field per event using an expression and functions like if(), case(), strftime(). It is a streaming command, so it keeps the events.

🔎 rex
Extracts a field at search time with a regex and named capture groups: rex "user=(?<user>\w+)". No re-indexing is needed, because Splunk is schema-on-read.

Sort commands into four jobs

When you read a search, label each command: filter (search, where), build a field (eval, rex), enrich (lookup), or transform (stats, timechart, top). table, fields, sort and dedup just arrange output. Naming the job of each piped command is how you read any SPL search out loud in an interview.

Quick check · Q2 of 10 · Remember

Which command creates a new field from an expression, per event?

Correct: c. eval evaluates an expression for each event and writes the result into a field, e.g. eval is_slow=if(response_ms>1000,"yes","no"). stats aggregates; dedup removes duplicate rows; sort just orders results.
👉 So far: Everyday commands: search and where filter; eval and rex build fields; table, fields, sort and dedup arrange output. eval makes a field from an expression per event.

③ stats, transforms and fields — turning events into answers

Counting and grouping is where SPL earns its keep, and that is the job of transforming commands. stats aggregates events into a table: | stats count, avg(response_ms) as avg_ms by host gives one row per host. timechart is stats split over time for trend charts (| timechart count by status), and top is a shortcut for the most common values (| top limit=10 user). These do not return events — they return a results table.
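As a sketch of what | stats count, avg(response_ms) as avg_ms by host does to the data, the Python below groups events by host and emits one row per group. The sample events and the helper name stats_count_avg_by are hypothetical; this shows the shape of the result, not Splunk's internals.

```python
from collections import defaultdict

# Hypothetical web events with a response time in milliseconds.
EVENTS = [
    {"host": "web01", "response_ms": 120},
    {"host": "web01", "response_ms": 480},
    {"host": "web02", "response_ms": 900},
]


def stats_count_avg_by(events, value_field, by_field):
    """Model of `stats count, avg(<value_field>) as avg_ms by <by_field>`."""
    groups = defaultdict(list)
    for event in events:
        groups[event[by_field]].append(event[value_field])
    # One output row per group; the original events are gone after this step.
    return [
        {by_field: key, "count": len(values), "avg_ms": sum(values) / len(values)}
        for key, values in sorted(groups.items())
    ]


rows = stats_count_avg_by(EVENTS, "response_ms", "host")
print(rows)
```

Three events go in and two rows come out, one per host. That collapse is what "returns a results table, not events" means in practice.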

Fields and enrichment

Splunk is schema-on-read: most fields are extracted at search time, so you can add new fields without re-indexing. rex extracts a field on the fly with a regex and named capture groups, e.g. | rex "user=(?<user>\w+)", and lookup enriches events from a table, e.g. mapping an error_code to a friendly description or a host to its owner. Together: filter to the right events, build or enrich the fields you need, then stats them into an answer.
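Search-time extraction and enrichment can be modelled in a few lines of Python. Treat this as a hedged illustration: the raw log lines, the error codes and their descriptions, and the helper names rex and lookup are all invented for the example. Note one syntax difference: Python's re module writes a named group as (?P<name>...), while SPL's rex uses (?<name>...).

```python
import re

# Hypothetical raw events, stored exactly as they arrived (no fields yet).
RAW_EVENTS = [
    "2026-06-19T10:00:01 login failed user=meera error_code=E42",
    "2026-06-19T10:00:05 login failed user=arjun error_code=E07",
]

# Hypothetical lookup table: error_code -> friendly description.
ERROR_LOOKUP = {"E42": "Password expired", "E07": "Account locked"}

# Python spells a named group (?P<name>...); SPL's rex uses (?<name>...).
PATTERN = re.compile(r"user=(?P<user>\w+).*?error_code=(?P<error_code>\w+)")


def rex(raw):
    """Model of rex: turn named capture groups in the raw text into fields."""
    match = PATTERN.search(raw)
    fields = match.groupdict() if match else {}
    return {"_raw": raw, **fields}


def lookup(event):
    """Model of lookup: add a description by matching on error_code."""
    description = ERROR_LOOKUP.get(event.get("error_code"))
    return {**event, "description": description}


enriched = [lookup(rex(raw)) for raw in RAW_EVENTS]
for event in enriched:
    print(event["user"], event["error_code"], event["description"])
```

The stored text in RAW_EVENTS is never modified; the fields exist only in the results of this run. That mirrors why a new extraction needs no re-indexing.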

Figure 3 — eval vs stats — the two people confuse
eval works per event to make a field; stats aggregates many events into a results table.
eval (per event): creates or changes a field; runs on every single event; uses functions like if(); streaming, so it keeps the events.
stats (aggregate): counts, sums, averages, groups; collapses events into rows; count, avg, sum, values by …; transforming, so it returns a table.
Figure 4 — The everyday SPL command set
Learn this handful and you can read and write most real searches; everything else builds on them.
Commands shown: base search (index + terms + time), eval / where, rex (extract), lookup (enrich), stats / timechart, top / dedup, table / sort.
Don't confuse eval with stats

A classic slip is reaching for eval to count or group — eval only works per event and writes a field, it never aggregates. Counting, summing, averaging and grouping are stats (or timechart over time, top for most-common). If your answer is 'one row per group', it's stats; if it's 'a value on each event', it's eval.

Quick check · Q3 of 10 · Apply

You need the average response time per host as a table. Which command and shape?

Correct: a. Averaging and grouping is aggregation, which is stats: '| stats avg(response_ms) as avg_ms by host' returns one row per host. eval works per event and can't group; table only chooses columns; sort only orders.
👉 So far: stats, timechart and top are transforming — they turn events into a results table. Fields are extracted at search time, so rex and lookup add or enrich fields without re-indexing.

④ Streaming vs transforming — and why filtering early is fast

Commands fall into two big families, and this is a favourite interview question. Streaming commands handle one event at a time — one event in, one (or zero) out. The distributable streaming ones — eval, rex, where, fields, rename — can run in parallel on the indexers, close to the data. Transforming commands — stats, timechart, top, chart, rare — gather all the events on the search head and build a results table; they are not streaming.

That is exactly why filtering early makes a search fast. The base search (index, sourcetype, time range, keywords) limits the data pulled from disk; distributable streaming commands then run on the indexers in parallel. The moment a transforming command runs, the work moves to the search head. So the rule is: filter as early as possible, transform as late as possible — narrow the index and time range, keep terms specific, and only then stats or timechart. A wide all-time search with a transform up front is the classic slow query.
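A toy Python model can show why the placement of a filter matters. It is a deliberate simplification and not a description of Splunk's scheduler: each inner list stands for the events held by one indexer, and the data and the helper names is_error and search_head_count_by_uri are hypothetical.

```python
from collections import Counter

# Hypothetical data: each inner list is the events held by one indexer.
INDEXERS = [
    [{"uri": "/login", "status": 500}, {"uri": "/home", "status": 200}],
    [{"uri": "/login", "status": 503}, {"uri": "/cart", "status": 200},
     {"uri": "/cart", "status": 500}],
]


def is_error(event):
    """Streaming-style filter: decides on one event without seeing the others."""
    return event["status"] >= 500


def search_head_count_by_uri(events):
    """Transforming step: needs the combined events to build the results table."""
    return dict(Counter(event["uri"] for event in events))


# Filter early: each indexer filters its own slice, then ships only the matches.
shipped_early = [e for indexer in INDEXERS for e in indexer if is_error(e)]
early = search_head_count_by_uri(shipped_early)

# Filter late: every event is shipped, and the filter runs after collection.
shipped_late = [e for indexer in INDEXERS for e in indexer]
late = search_head_count_by_uri(e for e in shipped_late if is_error(e))

print(len(shipped_early), len(shipped_late))  # events sent to the search head
print(early == late)                          # same answer either way
```

Both orderings produce the same table, but the early filter sends fewer events to the central step. On real data volumes that gap is the difference between a fast search and a slow one.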

Figure 5 — Streaming vs transforming commands
Distributable streaming runs on the indexers in parallel; transforming gathers everything on the search head.
Streaming (per event): eval, rex, where, fields, rename; one event in, one out; distributable ones run on the indexers; keeps parallelism.
Transforming (aggregate): stats, timechart, top, chart, rare; needs all events together; runs on the search head; returns a results table.

Priya at a Hyderabad SOC faces this

Her dashboard panel 'top failing URLs' takes over a minute and sometimes times out, even though the data volume is modest.

Likely cause

The search runs a transform across all time first — '| stats count by uri | search uri=*login*' over an all-time window — so it pulls every event before filtering.

Diagnosis

Open the Job Inspector: nearly all the time is spent reading events because there is no index/sourcetype, the time range is All time, and the login filter is applied after stats.

Search ▸ Job Inspector ▸ execution costs + the search bar
Fix

Rewrite to filter early: 'index=web sourcetype=access_combined uri=*login* status>=400 earliest=-24h | stats count by uri | sort -count'. Now the base search narrows the data and runs distributably on the indexers.

Verify

Re-run: the Job Inspector shows far fewer events scanned, the panel returns in a couple of seconds, and the result is the same top list.

Prove it in the Job Inspector

Never guess why a search is slow. The Job Inspector shows events scanned and where time was spent. If most time is in reading events, your base search is too wide — add index, sourcetype, a tighter time range and specific terms, and watch the scanned count drop. That single read settles most 'my search is slow' tickets.

▶ Watch an SPL search run through the pipeline

How one search becomes a results table, command by command.

① Base search: index=web sourcetype=access_combined status>=500 earliest=-1h pulls only the recent server-error events from the indexers.
② Build a field: | eval slow=if(response_ms>1000,"yes","no") runs per event on the indexers, tagging each error as slow or not.
③ Transform: | stats count by uri, slow gathers the events on the search head and collapses them into one row per URL and slow flag.
④ Arrange: | sort -count puts the worst URLs on top, and the panel shows a clean ranked table of failing pages.
Quick check · Q4 of 10 · Analyze

Why does putting index=, a tight time range and specific terms first make a search faster?

Correct: c. The base search bounds the data read from disk; filtering early means less to process. Distributable streaming commands (eval, rex, where) then run on the indexers in parallel. A transforming command later moves work to the search head, so filter early and transform late.
👉 So far: Streaming commands (eval/rex/where) work one event in, one out, and the distributable ones run on the indexers; transforming commands (stats/timechart/top) run on the search head. Filter early and transform late to keep searches fast.


📝 Wrap-up assessment — six more

You've answered 4 inline. Six left. 70% (7 of 10) marks the lesson complete on your profile. Tap Submit all answers at the end.

Q5 · Remember

What does the pipe character (|) do in SPL?

Correct: c. SPL is a pipeline read left to right. The pipe takes the output of everything before it and feeds it as input to the next command, so each command builds on the previous one's results.
Q6 · Understand

Which command turns events into a results table by counting or grouping?

Correct: a. stats is the core transforming command — count, sum, avg, values, optionally 'by' a field — collapsing events into rows. eval and rex build fields per event; where filters; none of those aggregate.
Q7 · Apply

You want to extract a username from raw text into a field without re-indexing. Which command?

Correct: b. rex extracts a field at search time using a regex with named capture groups, e.g. rex "user=(?<user>\w+)". Because Splunk is schema-on-read, this needs no re-indexing. lookup enriches from a table; dedup and timechart don't extract fields.
Q8 · Analyze

Which group of commands can run in parallel on the indexers?

Correct: c. Distributable streaming commands (eval, rex, where, fields, rename) don't depend on event order, so each indexer runs them on its own data in parallel. Transforming commands gather events on the search head instead.
Q9 · Evaluate

An interviewer asks how to speed up a slow SPL search. Best answer?

Correct: b. Filtering early limits the data read from disk and lets distributable streaming run on the indexers. Transforming late keeps work parallel for as long as possible. Wide all-time searches and early transforms are the classic causes of slow queries.
Q10 · Evaluate

Why can you add a new field with rex and use it immediately, without re-indexing?

Correct: d. Splunk applies most of its schema when you read (search), not when you write (index). rex runs against the already-stored raw events at search time, so a new extraction takes effect on the next search with no re-indexing.

🧠 In your own words

Type one line: what makes SPL a 'pipeline', and why does filtering early make a search fast? Then compare with the expert version.

Expert version: SPL is a pipeline because a search reads left to right: it starts with search terms over a time range to pull events, then passes results through a chain of commands joined by the pipe (|), where each command takes the previous one's output as its input. Filtering early makes it fast because the base search — index, sourcetype, a tight time range and specific keywords — bounds how much data is read from disk, and distributable streaming commands (eval, rex, where) then run on the indexers in parallel. The moment a transforming command (stats, timechart, top) runs, work moves to the search head, so the rule is filter as early as possible and transform as late as possible.


📖 Glossary

SPL (Search Processing Language)
Splunk's query language: a pipeline of search terms plus commands, joined by the pipe (|), that retrieves and processes events.
Pipe (|)
The operator that passes the results of everything to its left into the next command as input. SPL reads left to right.
Base search
The part before the first pipe — index, sourcetype, keywords and time range — that decides how much data is pulled from disk.
search / where
Filtering commands. search filters on keywords and field values; where filters using an expression and functions, so it can compare fields.
eval
Creates or changes a field per event from an expression and functions like if(), case() and strftime(). A streaming command.
stats
The core transforming command. Aggregates events into a results table with count, sum, avg, values, optionally grouped 'by' fields.
rex
Extracts a field at search time using a regex with named capture groups. No re-indexing needed because Splunk is schema-on-read.
lookup
Enriches events with extra data from a lookup table — e.g. mapping a code to a description or a host to its owner.
timechart / top
timechart is stats split over time for trend charts; top returns the most common values of a field. Both are transforming.
Streaming vs transforming
Streaming = one event in, one out (distributable ones run on indexers in parallel); transforming = gather all events on the search head into a table.

📚 Sources

  1. Splunk Docs — About the search language (SPL overview, pipes, commands). help.splunk.com/en/splunk-cloud-platform/search/search-manual/10.4.2604/search-overview/about-the-search-language
  2. Splunk Docs — Types of commands: streaming, distributable streaming, centralized streaming and transforming. help.splunk.com (Splunk_Search_Typesofcommands)
  3. Splunk Docs — Use the eval command and functions. docs.splunk.com/Documentation/Splunk/latest/Search/Usetheevalcommandandfunctions
  4. Splunk Docs — rex command: search-time field extraction with named capture groups. help.splunk.com/en/splunk-cloud-platform/search/search-reference/10.4.2604/search-commands/rex
  5. Splunk Docs — Quick tips for optimization: filter early, narrow the time range, use index and sourcetype. help.splunk.com/en/splunk-enterprise/search/search-manual/9.4/optimizing-searches/quick-tips-for-optimization
  6. Splunk — Splunk Cheat Sheet: Query, SPL, RegEx & Commands. splunk.com/en_us/blog/learn/splunk-cheat-sheet-query-spl-regex-commands.html

What's next?

Comfortable with SPL? Next, see how the Splunk architecture under it works — forwarders, indexers and search heads — and then how Enterprise Security turns SPL into a full SIEM with correlation searches and notable events.