ZIA inspects a file's true type before applying DLP. Which three checks does File Type Control use, in order?

Correct: b. ZIA checks Magic Bytes (file signature), then MIME type, then File Extension — so a .docx renamed to .txt is still classified correctly and caught by file-type criteria.

Sneha must block uploads only when a real customer's exact name + account + card appear together, with near-zero false positives. Which technique?

Correct: a. EDM fingerprints exact field values from structured data, so it fires only on genuine records and stays quiet on test/fake data — the lowest false-positive technique. IDM (c) is for whole documents; a PCI dictionary (b) would flag any card-like number.

You're rolling out a brand-new DLP block rule across 5,000 users. What's the safest first move?

Correct: c. Monitor mode ("Allow and log only") surfaces the legitimate workflows your rule would break, so you tune before you block. Going straight to Block (a) is how DLP gets switched off by an angry business unit on day two.

After a 2025 upgrade, OCR stopped working on a specific rule even though that rule's ocrEnabled was set. Why?

Correct: b. Zscaler moved OCR to an organization-level setting under Administration → DLP Advanced Settings, and deprecated the per-rule ocrEnabled attribute. Setting it on a single rule now does nothing — enable OCR once for the org instead.

An IDM rule for a contract template is throwing too many false positives on unrelated documents. The team wants to keep IDM. Best tuning move?

Correct: d. IDM's partial-match flexibility is also its false-positive source. Raising the match-accuracy threshold so that it demands higher similarity means only genuine full or near-full copies trigger. Lowering severity (c) hides noise but doesn't fix it.

For a healthcare client, an engineer proposes sending full content of every DLP incident to a low-security shared ICAP receiver, "so auditors see everything". Sound design?

Correct: c. Full-content forwarding copies the leaked sensitive data to the receiver — a low-security shared box becomes a new breach target holding PHI. Either harden and access-control the receiver, or send MD5-only where the investigator doesn't need the payload. "More data" (a) without controls is a liability, not a win.

Zscaler ZIA Data Loss Prevention - Engines, Dictionaries

Start here · understand the lesson before the detail

What you are learning

This lesson shows the complete inline DLP decision. ZIA must see the content, understand the file and destination, match sensitive information, apply the ordered rule, and create evidence that an analyst can investigate.

In plain English

A dictionary describes what sensitive data looks like. A DLP engine combines dictionaries and thresholds. EDM matches exact structured records, IDM fingerprints sensitive documents, and OCR can extract text from images. Policy adds who, where, destination, file, action, and exceptions.

Real example

An employee uploads a spreadsheet containing customer account numbers to personal cloud storage. After permitted SSL inspection, ZIA identifies the destination and file, an EDM-based engine matches real customer records, and the rule blocks or coaches the upload while recording an incident for review.

Follow this flow

Define the data owner, sensitive data, approved destinations, and permitted business action.
Confirm traffic and file types can be inspected, including required SSL and OCR coverage.
Build and test dictionaries, engines, EDM/IDM templates, thresholds, and confidence.
Create a narrow pilot rule with user, destination, protocol, size, file, and action criteria.
Review incidents and false positives, tune safely, then expand with change control.

Evidence to collect

User, source, destination, file type, size, and inspection status
Matched rule, engine, dictionary, threshold, and confidence
Allow, coach, block, or quarantine action
Incident owner, business context, false-positive decision, and closure

Common mistake to avoid

Do not enable a broad block rule before validating content visibility and false positives. A regular expression alone is rarely enough for high-value data. Never place real sensitive records in a test screenshot or ticket; use approved synthetic test data.

Current official source checkpoint

DLP dictionaries and engines: current official reference used for this beginner explanation
Custom DLP dictionaries: current official reference used for this beginner explanation

Infographic: an employee upload is decrypted, classified, matched against dictionary, EDM, IDM, or OCR methods, evaluated by DLP policy, and logged.

Key terms before you continue

Dictionary: Pattern, phrase, label, or indexed data definition.
DLP Engine: Logical expression of dictionaries and thresholds.
EDM: Exact Data Match for structured records.
IDM: Indexed Document Match for sensitive documents.

Most engineers think…

"DLP means I tick the PCI dictionary and now no card numbers can leave."

Wrong — and that wrong instinct floods the SOC with false positives. A raw card-number pattern fires on every 16-digit string: order IDs, tracking numbers, a developer's test data. Real DLP is two decisions: what counts as sensitive (the dictionary, tuned with thresholds and proximity) and how strict the logic is (the engine's AND / OR). And none of it works at all until SSL inspection decrypts the traffic first. This lesson builds that instinct: match the right data, on the right path, with the right strictness.

① Where DLP sits — and why SSL inspection comes first

Think of ZIA as airport security for outbound traffic. DLP is the bag scanner. But a scanner can only see what's opened. If a request stays inside a sealed TLS tunnel, ZIA sees a destination and nothing else — like a locked suitcase passing the X-ray belt.

That is the single most important fact about ZIA DLP: it is inline web DLP on the SSL-inspected egress path. SSL inspection decrypts the upload, DLP reads the now-visible payload, and the policy decides. No inspection, no payload, no DLP. This is the #1 "DLP isn't working" ticket on the Zscaler community.
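
The dependency can be sketched in a few lines of Python. This is a toy model, not Zscaler code: `Upload`, `ssl_inspected`, and `inline_dlp_verdict` are hypothetical names, and the "scanner" is only a substring check standing in for real engines.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Upload:
    destination: str
    ssl_inspected: bool        # False when the destination is bypassed (Do-Not-Inspect)
    payload: Optional[str]     # readable body; only available after decryption


def inline_dlp_verdict(upload: Upload, sensitive_marker: str) -> str:
    """Toy model of the egress path: DLP can only scan a decrypted payload."""
    if not upload.ssl_inspected or upload.payload is None:
        # Sealed TLS tunnel: only the destination is visible, so no rule can match.
        return "allowed (not inspected)"
    if sensitive_marker in upload.payload:
        return "blocked by DLP"
    return "allowed (inspected, no match)"


inspected = Upload("drive.example.com", True, "card 4111 1111 1111 1111")
bypassed = Upload("drive.example.com", False, None)

print(inline_dlp_verdict(inspected, "4111"))   # blocked by DLP
print(inline_dlp_verdict(bypassed, "4111"))    # allowed (not inspected)
```

The second call is the "locked suitcase": the same upload passes because the scanner never received anything to read.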

👉 So far: DLP scans outbound content, but only what SSL inspection has decrypted. Next, the exact stage where DLP runs on the egress path.

Diagram legend: ZIA edge stage (SSL inspection, DLP) · DLP engines & dictionaries match · Allow / Block / Confirm action · ICAP incident receiver · DLP blind (no SSL inspection)

Sneha at Infosys faces this

Sneha enabled a PCI DLP rule, then tested by uploading a card list to a personal cloud-storage site. Nothing blocked. The rule "doesn't work".

Likely cause

The destination domain is in the SSL Inspection Do-Not-Inspect list (or inspection isn't enabled for her location). DLP got no decrypted payload, so it had nothing to match.

Diagnosis

Check that the test site is actually being decrypted for her.

Policy → SSL Inspection → confirm the destination isn't bypassed

Fix

Remove the bypass (or add an inspect rule) for that destination so the upload is decrypted before DLP runs.

Verify

Re-test the upload. Analytics → Web Insights now shows a DLP block with the rule name and the triggering engine.

▶ Watch an upload get inspected — then watch DLP go blind

Rahul at TCS uploads a spreadsheet of customer records. The healthy inspect-and-block path runs in four steps:

① UPLOAD: 10.20.30.45 POSTs customers.xlsx to a personal drive
② SSL INSPECT: ZIA decrypts the upload → the payload becomes readable
③ DLP MATCH: Content hits the PCI engine (4,000 card numbers, well over threshold)
④ ACTION: Rule action = Block → upload stopped, incident sent to ICAP receiver

With an SSL bypass in place, step ② never happens, so steps ③ and ④ have no payload to act on and the upload goes through.

Quick check · Q1 of 10

A new DLP rule never triggers, even on obvious test data, for traffic to one HTTPS site. What do you check first?

a) The dictionary's hit-count threshold is too low
b) Whether SSL inspection is actually decrypting that destination (DLP can't read payload it didn't decrypt)
c) The user's password expired
d) The bandwidth class is throttling the upload

Correct: b. Inline web DLP runs after SSL inspection. If the destination is bypassed (Do-Not-Inspect), DLP never receives a decrypted payload, so no rule can match. A low threshold (a) would cause over-triggering, not silence.

Pause & Predict

You want to block uploads only when a document has both a credit-card number AND a customer name nearby — not either one alone. What feature lets you express that "both, and close together" logic? Type your guess.

Answer: A DLP engine combines dictionaries with AND, and a custom dictionary can use a proximity value so a keyword must sit within N characters of a pattern. "Both" = engine AND logic; "close together" = dictionary proximity. That's exactly Path 2.

② Engines vs dictionaries — the two-decision model

This is the part most people blur together. Keep them separate and DLP suddenly makes sense.

A DLP dictionary answers "what does sensitive data look like?" — a credit-card pattern, a list of project code-names, an SSN regex. A DLP engine answers "how strict is a match?" — it combines dictionaries with Boolean AND / OR / NOT. The rule then calls the engine.

Analogy: airport security. A dictionary is one rule ("no liquids over 100 ml"). An engine is the combined checkpoint logic ("valid boarding pass AND matching ID AND no banned items"). One liquid alone might be fine; it's the combination the checkpoint enforces.
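
To make the two decisions concrete, here is a minimal Python sketch. It assumes each dictionary can be reduced to a yes/no function over the content; the dictionary names and the two engines are hypothetical, and real dictionaries match far more than a fixed string.

```python
from typing import Callable, Dict

# Each "dictionary" is modelled as a function that says whether the content matches it.
Dictionary = Callable[[str], bool]

dictionaries: Dict[str, Dictionary] = {
    "card_pattern": lambda text: "4111 1111 1111 1111" in text,
    "customer_keyword": lambda text: "customer" in text.lower(),
    "order_keyword": lambda text: "order id" in text.lower(),
}


def engine_strict(text: str) -> bool:
    """card_pattern AND customer_keyword AND NOT order_keyword"""
    return (
        dictionaries["card_pattern"](text)
        and dictionaries["customer_keyword"](text)
        and not dictionaries["order_keyword"](text)
    )


def engine_loose(text: str) -> bool:
    """card_pattern OR customer_keyword"""
    return dictionaries["card_pattern"](text) or dictionaries["customer_keyword"](text)


doc = "Customer record: 4111 1111 1111 1111"
partial = "Customer satisfaction survey"

print(engine_strict(doc), engine_strict(partial))  # True False
print(engine_loose(partial))                       # True
```

The dictionaries stay the same in both engines; only the strictness of the combination changes, which is the point of keeping the two decisions separate.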

Predefined vs custom dictionaries

Predefined — Zscaler ships hundreds: PCI (credit cards), PII, SSN, HIPAA / health terms, and newer ones rolled out in 2025 such as CCPA, DPDPA (India's data-protection act) and Credentials and Secrets. Fast to switch on, but blunt.
Custom — you define the match. Three flavours: Phrases (exact words, up to 256 per dictionary), Patterns (regular expressions for structured strings like an employee ID format), and a dictionary of words with a hit-count threshold (fire only after N matches) and an optional proximity (a high-confidence keyword must sit within a set character distance of the pattern, range 0–10000).

Pro tip — threshold + proximity kill false positives

A bare credit-card pattern matches every 16-digit number. Add a hit-count threshold (e.g. fire only at ≥ 10 numbers) so a single order ID is ignored, and a proximity requirement (the word "card"/"CVV" within 30 characters) so random digits don't trigger. This single tuning step is the difference between a SOC that trusts DLP and one that mutes it.
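
A rough Python sketch of that tuning step follows. It is an illustration only: the naive 16-digit regex, the keyword list, and the function names are assumptions, and ZIA's own matching is configured in the portal rather than written as code.

```python
import re

CARD_LIKE = re.compile(r"\b\d{16}\b")            # deliberately naive 16-digit pattern
KEYWORDS = re.compile(r"card|cvv", re.IGNORECASE)


def count_hits(text: str, proximity: int) -> int:
    """Count 16-digit numbers that have a keyword within `proximity` characters."""
    hits = 0
    for match in CARD_LIKE.finditer(text):
        window_start = max(0, match.start() - proximity)
        window_end = match.end() + proximity
        if KEYWORDS.search(text[window_start:window_end]):
            hits += 1
    return hits


def dictionary_fires(text: str, threshold: int = 10, proximity: int = 30) -> bool:
    """Fire only when enough keyword-backed numbers are present."""
    return count_hits(text, proximity) >= threshold


order_email = "Your order 1234567890123456 has shipped."
card_dump = "\n".join(f"card: {4000000000000000 + i}" for i in range(12))

print(dictionary_fires(order_email))  # False
print(dictionary_fires(card_dump))    # True
```

The order email contains a 16-digit number but no nearby keyword and only one candidate, so it fails both tests; the twelve-line dump passes both.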

👉 So far: dictionary = what to look for, engine = how to combine. Custom dictionaries get sharp with thresholds + proximity. Next, see the AND/OR logic decide.

Priya at Flipkart faces this

Priya's PCI rule blocks legitimate order-confirmation emails because every order has a 16-digit order ID. The SOC is drowning in false positives.

Likely cause

The dictionary fires on any 16-digit string, with a low threshold and no proximity. Order IDs look like card numbers to a raw pattern.

Diagnosis

Administration → DLP Dictionaries & Engines → open the dictionary → check threshold & proximity

Fix

Raise the hit-count threshold, add a proximity requirement (keyword "card"/"CVV" near the digits), and build an engine as card-pattern AND NOT order-keyword to exclude the order format.

Verify

Re-send a real order email — allowed. Send a genuine card list — still blocked. Web Insights shows the false-positive volume drop.

Quick check · Q2 of 10

An engine is defined as Dictionary-A AND Dictionary-B. A document matches only Dictionary-A. What happens?

a) The engine fires; any dictionary match counts
b) The engine fires, but only at severity Low
c) The engine does NOT fire; AND requires both dictionaries to match the same content
d) The rule errors out

Correct: c. AND means both dictionaries must match for the engine to fire. One match alone isn't enough. Use OR when either match should trigger; use AND/NOT to tighten and exclude. Severity (b) is set on the rule, not derived from partial matches.

Pause & Predict

EDM is coming next. You want DLP to fire only on your actual customer records — never on a developer's fake test data. How can ZIA know a card number is a real one of yours without storing the readable card in the cloud? Type your guess.

Answer: An on-prem Index Tool hashes the sensitive fields and uploads only the hashes to the Zscaler cloud — the readable PII never leaves your building. DLP compares incoming content against those hashes, so it fires on your real records and ignores synthetic data. That's EDM, coming up now.

③ EDM vs IDM — fingerprinting your actual data (+ OCR, MIP)

Patterns and keywords are generic — they describe a shape of data. Sometimes you need to protect your specific records: this customer, that contract. That's where fingerprinting comes in.

EDM — Exact Data Match (structured data)

EDM protects structured data — a database export or a CSV of customers. An on-prem Index Tool hashes the sensitive fields (name, account number, card) and uploads only the hashes to the Zscaler cloud — never the readable PII. DLP then fires only when a real record's fields appear together. A random test card number won't trigger; your actual customer's exact combo will.

Analogy: EDM is an Aadhaar exact-match. It's not "looks like an Aadhaar number" — it's "this specific person's exact details". No match on a made-up number.
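
The hash-only idea can be sketched in Python as below. Treat it as a teaching model under stated assumptions: SHA-256, the lower-casing normalisation, and the function names are illustrative choices, not the Index Tool's actual scheme, and the customer row is synthetic.

```python
import hashlib
from typing import Iterable, List, Set, Tuple


def fingerprint(value: str) -> str:
    """Hash a normalised field value; SHA-256 is an assumption for illustration."""
    return hashlib.sha256(value.strip().lower().encode("utf-8")).hexdigest()


def build_index(rows: Iterable[Tuple[str, ...]]) -> List[Set[str]]:
    """On-prem step: turn each record into a set of field hashes (no readable PII)."""
    return [{fingerprint(field) for field in row} for row in rows]


def edm_match(tokens: Iterable[str], index: List[Set[str]]) -> bool:
    """Cloud step: fire only if every field of one indexed record appears together."""
    seen = {fingerprint(token) for token in tokens}
    return any(record <= seen for record in index)


customers = [("Asha Rao", "ACC-778812", "4111111111111111")]   # synthetic record
index = build_index(customers)          # only these hashes would leave the building

real_upload = ["Asha Rao", "ACC-778812", "4111111111111111", "notes"]
test_upload = ["Test User", "ACC-000000", "4000000000000002"]

print(edm_match(real_upload, index))    # True
print(edm_match(test_upload, index))    # False
```

Because the comparison is hash against hash, the matching side never needs the readable values, and a card-shaped test number has no hash in the index to collide with.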

IDM — Indexed Document Match (unstructured data)

IDM protects unstructured documents — a confidential contract, a design doc, a board deck. You index the source files; ZIA detects full or partial copies, even reworded or trimmed. You set a match-accuracy threshold (e.g. flag at ~75% similarity). More flexible than EDM, but partial-matching means a higher false-positive risk — so tune the accuracy.

Analogy: IDM is the plagiarism checker. Even if a student reorders sentences from the original answer sheet, it still flags the overlap.
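
One common way to build such a "plagiarism checker" is word shingling, sketched below in Python. This is a stand-in technique chosen for illustration; the lesson does not describe how ZIA computes match accuracy, and the function names and the 0.75 default are assumptions that merely echo the ~75% example above.

```python
from typing import Set


def shingles(text: str, size: int = 3) -> Set[str]:
    """Split text into overlapping word n-grams ("shingles")."""
    words = text.lower().split()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def containment(candidate: str, source: str, size: int = 3) -> float:
    """Fraction of the source document's shingles that reappear in the candidate."""
    source_shingles = shingles(source, size)
    if not source_shingles:
        return 0.0
    return len(source_shingles & shingles(candidate, size)) / len(source_shingles)


def idm_fires(candidate: str, source: str, threshold: float = 0.75) -> bool:
    """Flag the candidate when enough of the indexed source shows up in it."""
    return containment(candidate, source) >= threshold


contract = (
    "the supplier shall keep all pricing confidential "
    "the buyer may terminate with thirty days notice"
)
reordered = (
    "the buyer may terminate with thirty days notice "
    "the supplier shall keep all pricing confidential"
)
unrelated = "lunch menu for friday includes dal rice and salad"

print(idm_fires(reordered, contract))   # True
print(idm_fires(unrelated, contract))   # False
```

Swapping the two clauses only breaks the shingles that straddle the join, so most of the source still overlaps. Raising `threshold` is the sketch's equivalent of tightening match accuracy.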

OCR and MIP labels — two more inputs

OCR — Optical Character Recognition reads text inside images (PNG/JPG, screenshots, pictures embedded in a Word doc), then runs normal DLP classification on the extracted text. As of a 2025 update, OCR is configured once at the org level under Administration → DLP Advanced Settings — the old per-rule ocrEnabled toggle was deprecated. Image quality affects accuracy, so expect more false positives here.
MIP / Purview labels — if your org uses Microsoft Information Protection, ZIA can match on the document's sensitivity label (e.g. "Confidential") instead of re-deriving sensitivity from content. You retrieve the labels from Microsoft into the ZIA MIP account, then use them as match criteria.

👉 So far: EDM = exact structured records, IDM = whole-document copies, OCR = text in images, MIP = match on the Microsoft label. Next, a side-by-side so you pick the right one fast.

▶ Watch an EDM index get built — then misused

Karthik at Wipro indexes a customer database for EDM. The correct hash-only flow runs in four steps:

① SOURCE: Customer CSV sits on-prem: name, account, card (2M rows)
② INDEX TOOL: On-prem Index Tool hashes each field; readable PII never leaves the building
③ UPLOAD HASHES: Only the hash template is sent to the Zscaler cloud
④ MATCH: An upload with a real customer's exact fields → EDM fires. A fake number → silence

The four detection techniques

Each entry names the technique, then gives the "so what": when to reach for it.

Dictionary: Patterns + keywords for generic shapes (PCI, SSN, PII). Tune with threshold + proximity. So what: your broad, fast first pass, but the noisiest.
EDM: Hashes exact fields from structured data; fires only on your real records. So what: near-zero false positives, so use it for known customer/employee datasets.
IDM: Indexes whole documents; catches full or partial copies even reworded. So what: protects contracts/designs, but set the match accuracy or it gets noisy.
OCR: Reads text inside images, then runs normal DLP on it. Org-level since 2025. So what: stops screenshot exfiltration, but image quality means more noise.

Quick check · Q3 of 10

Legal wants to stop a specific confidential contract template from leaking — even if someone reorders paragraphs. Which technique fits best?

a) IDM: it indexes the document and detects full or partial (reworded) copies
b) A predefined PCI dictionary
c) EDM: it's a document, after all
d) OCR alone

Correct: a. IDM is for unstructured documents and detects partial/reworded copies via a document index + match-accuracy threshold. EDM (c) is for structured field data, not whole documents. A PCI dictionary (b) wouldn't recognise this specific contract; OCR (d) only handles images.

Pause & Predict

You're about to flip a brand-new DLP rule live for 5,000 users with action = Block. Before you do — what's the one workflow it might silently break, and what action should you use first instead? Type your guess.

Answer: Legitimate flows you forgot about — HR mailing real SSNs to payroll, support attaching customer data. Start in Allow and log only (monitor mode), watch Web Insights for a week, tune thresholds, then switch to Block. Going straight to Block is how DLP gets switched off by an angry business unit.

④ Build a DLP rule — order, action, severity, ICAP, validate

You've got the detection pieces. Now wire them into a rule. A DLP rule has criteria, calls one or more engines, and applies an action. Rules evaluate top-down, first match wins — order matters, just like firewall rules.
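
Top-down, first-match-wins evaluation is easy to model in Python. The sketch below is an assumption-laden simplification: criteria and engine are collapsed into one predicate per rule, `Rule` and `evaluate` are hypothetical names, and "Allow-log-Engineering" is an invented rule used only to show the ordering effect.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    order: int
    name: str
    action: str
    matches: Callable[[dict], bool]   # criteria + engine collapsed into one predicate


def evaluate(rules: List[Rule], upload: dict) -> Optional[Rule]:
    """Return the first rule, by ascending order number, whose predicate matches."""
    for rule in sorted(rules, key=lambda r: r.order):
        if rule.matches(upload):
            return rule          # first match wins; later rules are never consulted
    return None


def is_engineering(upload: dict) -> bool:
    return upload["group"] == "Engineering"


def is_kavach_leak(upload: dict) -> bool:
    return upload["group"] == "Engineering" and upload["engine_hit"]


misordered = [
    Rule(1, "Allow-log-Engineering", "Allow and log only", is_engineering),
    Rule(2, "Block-Kavach-Exfil", "Block", is_kavach_leak),
]
fixed = [
    Rule(1, "Block-Kavach-Exfil", "Block", is_kavach_leak),
    Rule(2, "Allow-log-Engineering", "Allow and log only", is_engineering),
]

upload = {"group": "Engineering", "engine_hit": True}

print(evaluate(misordered, upload).name)   # Allow-log-Engineering
print(evaluate(fixed, upload).name)        # Block-Kavach-Exfil
```

Nothing about the block rule changed between the two lists; only its position did, which is why a broad rule placed higher can silently neutralise a specific one.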

The pieces of a rule

Criteria — who and where: users / groups, locations, URL categories, cloud apps, and file types (ZIA inspects file type by Magic Bytes → MIME type → File Extension, so a renamed .txt that's really a .docx is still caught).
Engine(s) — the AND/OR logic from Path 2, or an EDM/IDM template.
Action — Allow, Block, Confirm (warn the user, let them proceed with justification), or Allow and log only (monitor mode — start here to measure before you block).
Severity — Low / Medium / High / Critical, for incident triage and reporting.
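
The file-type check order in the Criteria bullet can be sketched as follows. The three-step order comes from the lesson; everything else here is an assumption for illustration: the signature table is a tiny subset of well-known magic bytes, and `classify` is a hypothetical helper, not a ZIA function.

```python
from typing import Optional

# Small illustrative subset of well-known file signatures.
MAGIC_SIGNATURES = {
    b"%PDF-": "pdf",
    b"\x89PNG\r\n\x1a\n": "png",
    b"PK\x03\x04": "zip-container",   # .docx/.xlsx/.pptx are ZIP containers
}
MIME_TYPES = {"application/pdf": "pdf", "image/png": "png", "text/plain": "text"}
EXTENSIONS = {".pdf": "pdf", ".png": "png", ".txt": "text", ".docx": "zip-container"}


def classify(filename: str, declared_mime: Optional[str], head: bytes) -> str:
    """Check magic bytes first, then the declared MIME type, then the extension."""
    for signature, file_type in MAGIC_SIGNATURES.items():
        if head.startswith(signature):
            return file_type
    if declared_mime in MIME_TYPES:
        return MIME_TYPES[declared_mime]
    dot = filename.rfind(".")
    if dot != -1:
        return EXTENSIONS.get(filename[dot:].lower(), "unknown")
    return "unknown"


# A .docx renamed to .txt still starts with the ZIP signature.
print(classify("notes.txt", "text/plain", b"PK\x03\x04rest-of-file"))  # zip-container
print(classify("notes.txt", "text/plain", b"plain words"))             # text
```

Because the signature is checked before the name, renaming the file changes nothing about how it is classified.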

ICAP incident receiver — where the evidence goes

When a rule fires, ZIA can forward the incident to an on-prem ICAP DLP incident receiver (or a third-party DLP). You choose how much it sends:

MD5 only — just a hash of the offending content. Lightweight, privacy-preserving, but the auditor can't see what leaked.
Full content — the actual payload, so an investigator can review the leaked data. Heavier, and the receiver must be secured because it now holds sensitive data.

Analogy: MD5-only is a CCTV logbook entry ("something happened at 2pm"). Full content is the actual footage. You keep the footage only where you can lock it down.
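
The trade-off between the two modes shows up clearly in a small Python sketch. This is not the ICAP wire format: `build_incident`, the `mode` values, and the dictionary keys are hypothetical, chosen only to show what each mode does and does not hand to the receiver.

```python
import hashlib
from typing import Dict


def build_incident(payload: bytes, mode: str) -> Dict[str, object]:
    """Build the body forwarded to a receiver; `mode` is "md5" or "full"."""
    incident: Dict[str, object] = {"md5": hashlib.md5(payload).hexdigest()}
    if mode == "full":
        incident["content"] = payload      # receiver now stores sensitive data
    elif mode != "md5":
        raise ValueError(f"unknown mode: {mode}")
    return incident


leak = b"PRJ-KAV-004217 confidential design notes"

md5_only = build_incident(leak, "md5")
full = build_incident(leak, "full")

print(sorted(md5_only))   # ['md5']
print(sorted(full))       # ['content', 'md5']
```

The MD5-only incident lets an investigator confirm that two incidents involved the same content, but not read it; the full-content incident carries the leak itself and needs the same protection as the original data.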

Admin portal — build a custom dictionary (pattern) for a partner project code

Administration → DLP Dictionaries & Engines → Add DLP Dictionary
  Name:        Custom-ProjectKavach
  Type:        Patterns
  Pattern:     PRJ-KAV-[0-9]{6}        # e.g. PRJ-KAV-004217
  Threshold:   Hit Count >= 3          # ignore a single stray mention
  Proximity:   30                      # keyword must sit within 30 chars (optional)

Expected output

Dictionary "Custom-ProjectKavach" saved.
Match preview: 3 of 3 sample lines matched, 0 false hits on order-ID test set.
# If preview shows hits on your safe sample → tighten the regex / raise the threshold.
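
Before saving a pattern in the portal, it can help to rehearse it offline. The Python sketch below checks the same regex and hit-count threshold against synthetic samples; the helper names and sample strings are made up for illustration, and Python's `re` may differ in edge cases from the regex dialect ZIA accepts.

```python
import re

PATTERN = re.compile(r"PRJ-KAV-[0-9]{6}")
HIT_COUNT_THRESHOLD = 3


def hit_count(text: str) -> int:
    """Number of non-overlapping pattern matches in the text."""
    return len(PATTERN.findall(text))


def would_fire(text: str) -> bool:
    """Mirror the dictionary setting: fire at three or more hits."""
    return hit_count(text) >= HIT_COUNT_THRESHOLD


positive_sample = "PRJ-KAV-004217 PRJ-KAV-004218 PRJ-KAV-004219"
order_id_sample = "Order 1234567890123456, ref PRJ-KAV-12"   # too few digits to match

print(hit_count(positive_sample), would_fire(positive_sample))   # 3 True
print(hit_count(order_id_sample), would_fire(order_id_sample))   # 0 False
```

A single stray mention produces one hit and stays under the threshold, which is the behaviour the `Hit Count >= 3` setting is meant to give.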

Admin portal — wrap it in an engine and a rule

DLP Dictionaries & Engines → Add DLP Engine
  Name:   Engine-Kavach-Strict
  Logic:  ((Custom-ProjectKavach)) AND ((Confidential-Keyword))

Policy → Data Loss Prevention → Add Web DLP Rule
  Name:        Block-Kavach-Exfil
  Order:       1
  Criteria:    Groups = Engineering | URL Category = Personal Storage, Webmail
               File Types = Documents, Spreadsheets, Archives
  Engine:      Engine-Kavach-Strict
  Action:      Block
  Severity:    High
  Notification: end-user block page + ICAP incident (Full content)

Expected output

Rule "Block-Kavach-Exfil" active at order 1.
Test upload (PRJ-KAV-004217 in a .docx to a webmail draft):
  Action: BLOCKED   Engine: Engine-Kavach-Strict   Severity: High
  Incident #88213 forwarded to ICAP receiver (full payload).

The trap behind a "working" rule

Don't start in Block. New DLP rules surprise you — they catch legitimate workflows you forgot about (HR sending real SSNs to payroll, support attaching customer data). Start in Allow and log only, watch Web Insights for a week, tune thresholds, then switch to Block. Going straight to Block is how DLP gets disabled by an angry business unit on day two.

👉 So far: a rule = criteria + engine + action + severity, evaluated top-down, with incidents to ICAP. Next, prove it works.

▶ Watch rule evaluation — first match wins

An upload hits the DLP policy. Healthy top-down evaluation runs in four steps:

① RULE 1: Allow + log: "HR group → payroll portal". Criteria don't match this user → skip
② RULE 2: Block: "Engineering → personal storage, Engine-Kavach-Strict". Criteria match
③ ENGINE: Engine fires (code AND confidential keyword present) → condition met
④ ACTION: Block applied, evaluation stops; first match wins

Aditya at HCL faces this

Aditya's High-severity block rule never fires, even on real matches. A broad "Allow and log" rule sits above it for the same group.

Likely cause

Rule order. DLP is top-down, first match wins. The Allow-and-log rule at a lower order number matches first and stops evaluation, so the block never runs.

Diagnosis

Policy → Data Loss Prevention → read the rule order; check Web Insights "Reason" = which rule decided

Fix

Move the specific Block rule above the broad Allow-and-log rule, or scope the Allow rule's criteria so it doesn't swallow the same traffic.

Verify

Re-test; Web Insights "Reason" now names the Block rule, action = Blocked, severity High.

Verify it worked — the log that ends the argument

Never claim a DLP rule works from the config screen. Do a safe test upload (synthetic data that matches), then open Analytics → Web Insights, filter on your user + last hour, and read the DLP Engine and Reason columns. They name the exact rule and engine that decided. If the log disagrees with what you expected, your rule order or SSL-inspection scope is wrong — not the cloud.

Web Insights filter — confirm the rule and engine that fired

Analytics → Web Insights → Logs
  Filter:  User = aditya@org.in   AND   Action = Blocked   AND   last 1 hour
  Columns: Show "DLP Dictionary", "DLP Engine", "Rule", "Reason"

Expected output

2026-05-31 14:22  aditya@org.in  Action: Blocked
  Rule: Block-Kavach-Exfil   Engine: Engine-Kavach-Strict
  Dictionary: Custom-ProjectKavach (5 hits)  Severity: High
  Incident: #88213 → ICAP receiver (full content)
# If "Reason" names a different rule → fix rule order. If no row at all → SSL inspection isn't decrypting it.
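
The two checks in that last comment can be written as a tiny triage helper. This Python sketch assumes you have already turned the filtered log rows into dictionaries yourself; the `rule` and `action` keys and the `triage` function are hypothetical, not a Web Insights export format.

```python
from typing import Dict, List


def triage(rows: List[Dict[str, str]], expected_rule: str) -> str:
    """Turn the two checks from the log comment into a diagnosis."""
    if not rows:
        return "no log row: check SSL inspection for this destination"
    deciding_rule = rows[0]["rule"]
    if deciding_rule != expected_rule:
        return f"rule order: '{deciding_rule}' decided before '{expected_rule}'"
    return f"ok: '{expected_rule}' decided with action {rows[0]['action']}"


healthy = [{"rule": "Block-Kavach-Exfil", "action": "Blocked"}]
swallowed = [{"rule": "Allow-log-Engineering", "action": "Allowed"}]

print(triage(healthy, "Block-Kavach-Exfil"))
print(triage(swallowed, "Block-Kavach-Exfil"))
print(triage([], "Block-Kavach-Exfil"))
```

The value of writing it down is the order of the questions: check that a row exists first, then check which rule decided, and only then look at engines and thresholds.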

Quick check · Q4 of 10

A High-severity DLP block rule never fires, yet a real match clearly happened. Web Insights "Reason" names a broad "Allow and log" rule above it. Root cause?

a) SSL inspection is off
b) The engine logic is wrong
c) The dictionary threshold is too high
d) Rule order: DLP is top-down, first match wins; the Allow rule matched first and stopped evaluation

Correct: d. The "Reason" column already told you which rule decided — a broad Allow-and-log rule above the Block. Reorder so the specific block sits higher, or scope the Allow rule. SSL off (a) would mean no DLP row at all, not an Allow.

🧠 In your own words

Type one line: why must SSL inspection be enabled before inline web DLP can do anything? Then compare to the expert version.

Expert version: Inline web DLP inspects the content of an upload. HTTPS hides that content inside a TLS tunnel. SSL inspection decrypts the traffic so DLP receives a readable payload to match against engines, dictionaries, EDM and IDM. If the destination is bypassed (Do-Not-Inspect), DLP gets no payload, so no rule can match — the upload sails through. If your answer mentioned "DLP can't read encrypted content it didn't decrypt", you've got it.

📖 Glossary

DLP (Data Loss Prevention): Inspects outbound content for sensitive data and blocks or logs it before it leaves the org. In ZIA this runs inline on the SSL-inspected egress path.
DLP dictionary: A definition of what sensitive data looks like — predefined (PCI, SSN, HIPAA) or custom (phrases, patterns/regex, words with thresholds + proximity).
DLP engine: A logical container that combines one or more dictionaries with AND/OR/NOT to decide when a match counts. A rule calls the engine.
EDM (Exact Data Match): Fingerprints exact field values from a structured source via an on-prem Index Tool (hashes only). Fires only on your real records — very low false positives.
IDM (Indexed Document Match): Fingerprints whole documents and detects full or partial (reworded) copies via a match-accuracy threshold. For unstructured data.
OCR: Optical Character Recognition — extracts text from images so DLP can inspect screenshots and embedded pictures. Configured org-wide since 2025.
MIP / Purview label: Microsoft Information Protection sensitivity label; ZIA can match on the label instead of re-scanning content.
SSL inspection: ZIA decrypts HTTPS so content controls (including DLP) can read the payload. A hard prerequisite for inline web DLP.
ICAP incident receiver: An on-prem endpoint that receives DLP incidents over ICAP — either MD5-only (a hash) or full content (the payload).
DLP rule: An ordered policy entry: criteria (users, location, URL category, file type) → engine → action (Allow / Block / Confirm / Allow and log) + severity. Top-down, first match wins.

📚 Sources

Zscaler Help — About DLP Engines · Understanding DLP Engines · About DLP Dictionaries · Adding Custom DLP Dictionaries (phrases/patterns/threshold/proximity) · Configuring DLP Policy Rules. help.zscaler.com
Zscaler Help — About Exact Data Match (EDM) · Understanding EDM Index Templates · About Indexed Document Match (IDM) · Defining IDM Match Accuracy. help.zscaler.com
Zscaler Help — About ICAP Receivers for DLP · DLP Incident Receiver · Configuring OCR for DLP · About Microsoft Information Protection Labels (OCR now org-level in DLP Advanced Settings). help.zscaler.com
Zscaler Community (Zenith) — "DLP without SSL inspection" and "DLP Policy Best practices" threads (SSL inspection prerequisite, false-positive tuning). community.zscaler.com
Practitioner write-up — "Stopping Data Leaks: A High-Level Overview of Zscaler ZIA's DLP" (engines vs policies, incidents with content snippets). dontblamethenetwork.com; and "EDM vs IDM vs OCR", hackfaqs.com
Zscaler — Enhanced DLP Capabilities: OCR; Release Upgrade Summary 2025 (2025 engines: CCPA, DPDPA, Credentials & Secrets; ocrEnabled API deprecated). zscaler.com / help.zscaler.com
Zscaler ZDTA Certification — Data Protection Services domain (file-type inspection by Magic Bytes / MIME / File Extension; EDM = structured match; out-of-band = data at rest). customer.zscaler.com

What's next?

You can now place DLP on the ZIA egress path, pick the right detection technique, and build a rule that fires on the right data. Next, see how DLP chains with the rest of the ZIA policy stack — Cloud Firewall, URL Filtering, File Type Control and IPS — inside one SSE hop.

Zscaler ZIA Data Loss Prevention — Stop the Leak Before It Leaves
