
Zscaler ZIA Data Loss Prevention — Stop the Leak Before It Leaves

A finance intern pastes 4,000 customer card numbers into a personal Gmail draft. Endpoint security never saw it. ZIA DLP sits on the egress path, reads the upload after SSL inspection, and blocks it mid-flight. This lesson shows you how — engines, dictionaries, EDM, IDM, ICAP, OCR — and how to build a rule that fires on the right data and stays quiet on the rest.

📅 2026-05-31 · ⏱ 11 min · 3 live demos · 4 infographics · 🏷 10-Q assessment + AI Tutor inline

⚡ Quick Answer

Learn Zscaler ZIA DLP the AI-era way — where it sits on the SSL-inspected egress path, engines vs dictionaries vs EDM vs IDM, ICAP incident receiver, OCR and MIP, plus a real rule build with expected output. 11 min.


Pick where you want to start

  1. Where DLP Sits: the egress path, SSL inspection as the prerequisite, inline web vs endpoint DLP.
  2. Engines & Dictionaries: predefined vs custom, phrase / pattern / regex, thresholds, proximity, AND/OR logic.
  3. EDM vs IDM (+ OCR/MIP): fingerprint exact records vs whole documents. Images via OCR. Microsoft labels.
  4. Build & Validate: rule order, actions, severity, ICAP incident receiver, and the Web Insights test.

🧠 Warm-up — 3 questions, no score

Just notice which ones make you pause. We answer all three inside the lesson.

1. SSL inspection is turned off for a user. Will ZIA DLP catch a credit-card upload to an HTTPS site?

Answered in Path 1 — Where DLP Sits. (Spoiler: inspection is a hard prerequisite.)

2. You must block uploads only when a real customer's exact record (name + their card) leaves. Which technique?

Answered in Path 3 — EDM vs IDM.

3. A DLP engine combines two dictionaries with AND. When does the rule fire?

Answered in Path 2 — Engines & Dictionaries. AND tightens; OR widens.

Most engineers think…

"DLP means I tick the PCI dictionary and now no card numbers can leave."

Wrong — and that wrong instinct floods the SOC with false positives. A raw card-number pattern fires on every 16-digit string: order IDs, tracking numbers, a developer's test data. Real DLP is two decisions: what counts as sensitive (the dictionary, tuned with thresholds and proximity) and how strict the logic is (the engine's AND / OR). And none of it works at all until SSL inspection decrypts the traffic first. This lesson builds that instinct: match the right data, on the right path, with the right strictness.

① Where DLP sits — and why SSL inspection comes first

Think of ZIA as airport security for outbound traffic. DLP is the bag scanner. But a scanner can only see what's opened. If a request stays inside a sealed TLS tunnel, ZIA sees a destination and nothing else — like a locked suitcase passing the X-ray belt.

That is the single most important fact about ZIA DLP: it is inline web DLP on the SSL-inspected egress path. SSL inspection decrypts the upload, DLP reads the now-visible payload, and the policy decides. No inspection, no payload, no DLP. It is also a recurring "DLP isn't working" thread on the Zscaler community.

👉 So far: DLP scans outbound content, but only what SSL inspection has decrypted. Next, the exact stage where DLP runs on the egress path.
Figure 1 — Where inline web DLP lives on the ZIA egress path
A left-to-right architecture diagram. A user device forwards an upload to a ZIA Service Edge. The edge runs authentication, then SSL inspection, which decrypts the payload, then the content-aware controls including DLP content inspection, which compares the payload against DLP engines and dictionaries, EDM and IDM, before allowing or blocking the upload and sending an incident to an ICAP incident receiver. A red note marks that DLP is blind without SSL inspection.
Diagram title: Upload an attachment, follow it to the moment DLP decides
  1. User upload: 10.20.30.45 → cloud
  2. Auth + Forwarding: identity attached
  3. SSL inspection: decrypt the payload (✗ off here = DLP is blind)
  4. DLP content inspection: reads the decrypted upload. Match against: DLP Engines (AND/OR of dictionaries) · Dictionaries (predefined + custom) · EDM · IDM · OCR · MIP labels
  5. DLP rule: sets criteria (users/groups, URL category, cloud app, file type) → action + severity. File Type Control inspects by Magic Bytes · MIME type · File Extension
  6. Action: Allow / Block / Confirm + Allow and log only
  7. ICAP incident receiver: MD5 only, or full payload
Endpoint DLP runs on the device for things ZIA can't see (USB, local apps). Inline web DLP runs here, in the cloud, on the egress path. Out-of-band (SaaS Security API) scans data already at rest in sanctioned SaaS, a separate lane from this inline path.
The order matters: decrypt → read → match → act. If SSL inspection isn't on for that traffic, DLP never gets a payload to read.
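The decrypt → read → match → act order can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Zscaler code: `Upload`, `ssl_inspect`, `dlp_decide`, `egress` and the bypass set are hypothetical names, and the "DLP" step is a bare substring check standing in for real engines.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Upload:
    destination: str   # hypothetical: hostname the user is uploading to
    body: bytes        # the content inside the TLS tunnel


def ssl_inspect(upload: Upload, do_not_inspect: set) -> Optional[bytes]:
    """Return the readable payload, or None if the destination is bypassed."""
    if upload.destination in do_not_inspect:
        return None          # tunnel stays sealed: only the destination is visible
    return upload.body


def dlp_decide(payload: Optional[bytes], marker: bytes) -> str:
    """Stand-in for DLP: a plain substring check instead of real engines."""
    if payload is None:
        return "ALLOW (not inspected)"   # DLP never ran
    return "BLOCK" if marker in payload else "ALLOW"


def egress(upload: Upload, do_not_inspect: set,
           marker: bytes = b"4111111111111111") -> str:
    """Decrypt first, then let DLP read and decide."""
    return dlp_decide(ssl_inspect(upload, do_not_inspect), marker)
```

Calling `egress` with the same upload and a bypass set that contains the destination returns the "not inspected" result, which mirrors Sneha's symptom: the rule is fine, it just never saw a payload.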

Sneha at Infosys faces this

Sneha enabled a PCI DLP rule, then tested by uploading a card list to a personal cloud-storage site. Nothing blocked. The rule "doesn't work".

Likely cause

The destination domain is in the SSL Inspection Do-Not-Inspect list (or inspection isn't enabled for her location). DLP got no decrypted payload, so it had nothing to match.

Diagnosis

Check that the test site is actually being decrypted for her.

Policy → SSL Inspection → confirm the destination isn't bypassed
Fix

Remove the bypass (or add an inspect rule) for that destination so the upload is decrypted before DLP runs.

Verify

Re-test the upload. Analytics → Web Insights now shows a DLP block with the rule name and the triggering engine.

▶ Watch an upload get inspected — then watch DLP go blind

Rahul at TCS uploads a spreadsheet of customer records. The healthy inspect-and-block path runs in four steps; with an SSL bypass in place, step ② never happens and DLP has no payload to read.

① UPLOAD: 10.20.30.45 POSTs customers.xlsx to a personal drive
② SSL INSPECT: ZIA decrypts the upload → the payload becomes readable
③ DLP MATCH: Content hits the PCI engine (4,000 card numbers, well over threshold)
④ ACTION: Rule action = Block → upload stopped, incident sent to ICAP receiver
Quick check · Q1 of 10

A new DLP rule never triggers, even on obvious test data, for traffic to one HTTPS site. What do you check first?

Correct: b (check whether the destination is SSL-inspected). Inline web DLP runs after SSL inspection. If the destination is bypassed (Do-Not-Inspect), DLP never receives a decrypted payload, so no rule can match. A low threshold (option a) would cause over-triggering, not silence.

Pause & Predict

You want to block uploads only when a document has both a credit-card number AND a customer name nearby — not either one alone. What feature lets you express that "both, and close together" logic? Type your guess.

Answer: A DLP engine combines dictionaries with AND, and a custom dictionary can use a proximity value so a keyword must sit within N characters of a pattern. "Both" = engine AND logic; "close together" = dictionary proximity. That's exactly Path 2.

② Engines vs dictionaries — the two-decision model

This is the part most people blur together. Keep them separate and DLP suddenly makes sense.

A DLP dictionary answers "what does sensitive data look like?" — a credit-card pattern, a list of project code-names, an SSN regex. A DLP engine answers "how strict is a match?" — it combines dictionaries with Boolean AND / OR / NOT. The rule then calls the engine.

Analogy: airport security. A dictionary is one rule ("no liquids over 100 ml"). An engine is the combined checkpoint logic ("valid boarding pass AND matching ID AND no banned items"). One liquid alone might be fine; it's the combination the checkpoint enforces.

Predefined vs custom dictionaries

Pro tip — threshold + proximity kill false positives

A bare credit-card pattern matches every 16-digit number. Add a hit-count threshold (e.g. fire only at ≥ 10 numbers) so a single order ID is ignored, and a proximity requirement (the word "card"/"CVV" within 30 characters) so random digits don't trigger. This single tuning step is the difference between a SOC that trusts DLP and one that mutes it.
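To make the effect of threshold and proximity concrete, here is a minimal Python sketch. The 16-digit regex, the keyword list and the function names are illustrative assumptions; real ZIA dictionaries are configured in the admin portal, not written as code.

```python
import re

CARD_RE = re.compile(r"\b\d{16}\b")                   # naive "card-like" pattern
KEYWORD_RE = re.compile(r"card|cvv", re.IGNORECASE)   # assumed proximity keywords


def count_card_hits(text: str, proximity: int = 30) -> int:
    """Count 16-digit numbers that have a keyword within `proximity` chars."""
    hits = 0
    for m in CARD_RE.finditer(text):
        # Look at a window of `proximity` characters on either side of the number.
        window = text[max(0, m.start() - proximity): m.end() + proximity]
        if KEYWORD_RE.search(window):
            hits += 1
    return hits


def dictionary_matches(text: str, threshold: int = 10, proximity: int = 30) -> bool:
    """The dictionary only counts as matched at or above the hit-count threshold."""
    return count_card_hits(text, proximity) >= threshold
```

With these defaults, an order email containing one 16-digit order ID and no nearby keyword scores zero hits, while a list of a dozen lines that each pair the word "card" with a number clears the threshold.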

👉 So far: dictionary = what to look for, engine = how to combine. Custom dictionaries get sharp with thresholds + proximity. Next, see the AND/OR logic decide.
Figure 2 — How a document flows through dictionaries into an engine into a rule
A left-to-right flow. A decrypted document is checked against two dictionaries: a credit card pattern dictionary and a customer-name keyword dictionary with proximity. A DLP engine combines them with AND, so both must match. If the engine matches, the DLP rule applies its action; if only one dictionary matches, the engine does not fire and the upload is allowed.
Diagram title: Dictionaries detect · the engine decides strictness · the rule acts
  1. Decrypted document: the upload payload
  2. Dictionary A (card pattern): regex · threshold ≥ 10 · ✓ 4,000 numbers found
  3. Dictionary B (name keyword): phrases · proximity ≤ 30 · ✓ "Cardholder" nearby
  4. Engine: A AND B
  5. Rule fires: action + severity
If the engine were "A OR B" → either match alone fires (wider, noisier). "A AND NOT B" excludes a known-safe pattern. The engine is where you dial strictness.
Same two dictionaries, three engines. AND tightens, OR widens, NOT excludes. Choose the operator to your false-positive tolerance.
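The three engines can be compared side by side with a small truth-table sketch. The `ENGINES` mapping and `evaluate` helper are hypothetical; they only model the Boolean logic, not how ZIA stores an engine.

```python
from typing import Callable, Dict

Results = Dict[str, bool]   # dictionary name -> did it match?

# Same two dictionaries (A and B), three different engines.
ENGINES: Dict[str, Callable[[Results], bool]] = {
    "A AND B": lambda r: r["A"] and r["B"],          # tightens: both required
    "A OR B": lambda r: r["A"] or r["B"],            # widens: either is enough
    "A AND NOT B": lambda r: r["A"] and not r["B"],  # excludes a known-safe pattern
}


def evaluate(results: Results) -> Dict[str, bool]:
    """Return, for each engine, whether it fires on these dictionary results."""
    return {name: fires(results) for name, fires in ENGINES.items()}
```

For a document that matches only Dictionary A, `evaluate({"A": True, "B": False})` shows the AND engine staying quiet while the OR and AND NOT engines fire.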

Priya at Flipkart faces this

Priya's PCI rule blocks legitimate order-confirmation emails because every order has a 16-digit order ID. The SOC is drowning in false positives.

Likely cause

The dictionary fires on any 16-digit string, with a low threshold and no proximity. Order IDs look like card numbers to a raw pattern.

Diagnosis

Administration → DLP Dictionaries & Engines → open the dictionary → check threshold & proximity
Fix

Raise the hit-count threshold, add a proximity requirement (keyword "card"/"CVV" near the digits), and build an engine as card-pattern AND NOT order-keyword to exclude the order format.

Verify

Re-send a real order email — allowed. Send a genuine card list — still blocked. Web Insights shows the false-positive volume drop.

Quick check · Q2 of 10

An engine is defined as Dictionary-A AND Dictionary-B. A document matches only Dictionary-A. What happens?

Correct: c (the engine does not fire). AND means both dictionaries must match for the engine to fire. One match alone isn't enough. Use OR when either match should trigger; use AND/NOT to tighten and exclude. Severity (option b) is set on the rule, not derived from partial matches.

Pause & Predict

EDM is coming next. You want DLP to fire only on your actual customer records — never on a developer's fake test data. How can ZIA know a card number is a real one of yours without storing the readable card in the cloud? Type your guess.

Answer: An on-prem Index Tool hashes the sensitive fields and uploads only the hashes to the Zscaler cloud — the readable PII never leaves your building. DLP compares incoming content against those hashes, so it fires on your real records and ignores synthetic data. That's EDM, coming up now.

③ EDM vs IDM — fingerprinting your actual data (+ OCR, MIP)

Patterns and keywords are generic — they describe a shape of data. Sometimes you need to protect your specific records: this customer, that contract. That's where fingerprinting comes in.

EDM — Exact Data Match (structured data)

EDM protects structured data — a database export or a CSV of customers. An on-prem Index Tool hashes the sensitive fields (name, account number, card) and uploads only the hashes to the Zscaler cloud — never the readable PII. DLP then fires only when a real record's fields appear together. A random test card number won't trigger; your actual customer's exact combo will.

Analogy: EDM is an Aadhaar exact-match. It's not "looks like an Aadhaar number" — it's "this specific person's exact details". No match on a made-up number.
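The hash-only idea can be sketched as follows. The lesson does not say which hash function or normalisation the Index Tool uses, so SHA-256, the salt, the `FIELDS` tuple and every function name below are assumptions for illustration. The sketch also assumes the candidate fields have already been extracted from the upload.

```python
import hashlib

FIELDS = ("name", "account", "card")   # assumed column names


def fingerprint(value: str, salt: bytes = b"demo-salt") -> str:
    """Hash one normalised field value; the salt is a placeholder."""
    normalised = value.strip().lower().encode("utf-8")
    return hashlib.sha256(salt + normalised).hexdigest()


def build_index(rows: list) -> set:
    """Runs on-prem: turn each record into a tuple of hashes."""
    return {tuple(fingerprint(row[f]) for f in FIELDS) for row in rows}


def edm_match(candidate: dict, index: set) -> bool:
    """Runs against the uploaded index: compare hashes only, never plaintext."""
    return tuple(fingerprint(candidate[f]) for f in FIELDS) in index
```

Only the output of `build_index` would leave the building in this model. A record with a real name but a made-up card number produces a different tuple of hashes, so it does not match.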

IDM — Indexed Document Match (unstructured data)

IDM protects unstructured documents — a confidential contract, a design doc, a board deck. You index the source files; ZIA detects full or partial copies, even reworded or trimmed. You set a match-accuracy threshold (e.g. flag at ~75% similarity). More flexible than EDM, but partial-matching means a higher false-positive risk — so tune the accuracy.

Analogy: IDM is the plagiarism checker. Even if a student reorders sentences from the original answer sheet, it still flags the overlap.
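One simple way to picture a match-accuracy threshold is word-shingle overlap. This is a swapped-in textbook technique, not Zscaler's IDM algorithm (which the lesson does not describe); `shingles`, `similarity`, `idm_match` and the 0.75 default are assumptions that echo the ~75% example above.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Set of overlapping k-word sequences, lower-cased, punctuation ignored."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(source: str, candidate: str, k: int = 3) -> float:
    """Fraction of the source document's shingles that reappear in the candidate."""
    src = shingles(source, k)
    if not src:
        return 0.0
    return len(src & shingles(candidate, k)) / len(src)


def idm_match(source: str, candidate: str, accuracy: float = 0.75) -> bool:
    """Flag the candidate when enough of the indexed source shows up in it."""
    return similarity(source, candidate) >= accuracy
```

Reordering whole sentences only breaks the shingles that straddle sentence boundaries, so the score stays high; raising `accuracy` trades recall for fewer false positives, which is the tuning knob the text describes.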

OCR and MIP labels — two more inputs

👉 So far: EDM = exact structured records, IDM = whole-document copies, OCR = text in images, MIP = match on the Microsoft label. Next, a side-by-side so you pick the right one fast.
Figure 3 — Dictionary vs EDM vs IDM — when to use which
Three columns compare a DLP dictionary, Exact Data Match, and Indexed Document Match across what they detect, the data type, how they match, false-positive risk, and a typical use. The dictionary detects generic shapes via patterns and keywords with medium risk; EDM detects exact records from structured data via hashed fields with very low risk; IDM detects full or partial document copies from unstructured files via a document index with higher risk.
Diagram title: Pick the technique by data type and false-positive tolerance
  Detects: Dictionary = generic shapes · EDM = exact records · IDM = full/partial doc copies
  Data type: Dictionary = any · EDM = structured (DB/CSV) · IDM = unstructured (docs)
  How it matches: Dictionary = pattern + keyword · EDM = hashed field values · IDM = document index + %
  False-positive risk: Dictionary = medium · EDM = very low · IDM = higher
  Typical use: Dictionary = "any card-like number" (broad coverage, fast) · EDM = "our 2M customer records" (precise, low noise) · IDM = "this contract template" (catches reworded copies)
Rule of thumb: structured + need precision → EDM · whole documents → IDM · broad first pass → dictionary. Images → add OCR on top.
EDM is your scalpel (exact records, near-zero noise). IDM is your net for documents. The dictionary is your wide first pass. They combine.

▶ Watch an EDM index get built — then misused

Karthik at Wipro indexes a customer database for EDM. The correct hash-only flow runs in four steps (the interactive demo also shows the classic mistake):

① SOURCE: Customer CSV sits on-prem: name, account, card, 2M rows
② INDEX TOOL: On-prem Index Tool hashes each field; readable PII never leaves the building
③ UPLOAD HASHES: Only the hash template is sent to the Zscaler cloud
④ MATCH: An upload with a real customer's exact fields → EDM fires. A fake number → silence

The four detection techniques

Each entry names the technique, then gives you the "so what": when to reach for it.

🔤 Dictionary
Patterns + keywords for generic shapes (PCI, SSN, PII). Tune with threshold + proximity. So what: your broad, fast first pass, but the noisiest.

🎯 EDM
Hashes exact fields from structured data; fires only on your real records. So what: near-zero false positives; use it for known customer/employee datasets.

📄 IDM
Indexes whole documents; catches full or partial copies even reworded. So what: protects contracts/designs, but set the match accuracy or it gets noisy.

🖼 OCR
Reads text inside images, then runs normal DLP on it. Org-level since 2025. So what: stops screenshot exfiltration, but image quality means more noise.

Quick check · Q3 of 10

Legal wants to stop a specific confidential contract template from leaking — even if someone reorders paragraphs. Which technique fits best?

Correct: a (IDM). IDM is for unstructured documents and detects partial/reworded copies via a document index + match-accuracy threshold. EDM (option c) is for structured field data, not whole documents. A PCI dictionary (option b) wouldn't recognise this specific contract; OCR (option d) only handles images.

Pause & Predict

You're about to flip a brand-new DLP rule live for 5,000 users with action = Block. Before you do — what's the one workflow it might silently break, and what action should you use first instead? Type your guess.

Answer: Legitimate flows you forgot about — HR mailing real SSNs to payroll, support attaching customer data. Start in Allow and log only (monitor mode), watch Web Insights for a week, tune thresholds, then switch to Block. Going straight to Block is how DLP gets switched off by an angry business unit.

④ Build a DLP rule — order, action, severity, ICAP, validate

You've got the detection pieces. Now wire them into a rule. A DLP rule has criteria, calls one or more engines, and applies an action. Rules evaluate top-down, first match wins — order matters, just like firewall rules.
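Top-down, first-match-wins evaluation is easy to get wrong in your head, so here is a minimal sketch of it. The `Rule` dataclass, its fields and the upload dictionary keys are hypothetical; they model the ordering behaviour only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    order: int                        # lower number = evaluated first
    name: str
    action: str                       # e.g. "Block" or "Allow and log"
    matches: Callable[[dict], bool]   # criteria + engine, collapsed into one check


def evaluate(rules: List[Rule], upload: dict) -> Optional[Rule]:
    """Walk the rules in order and stop at the first one that matches."""
    for rule in sorted(rules, key=lambda r: r.order):
        if rule.matches(upload):
            return rule               # later rules are never consulted
    return None                       # no DLP rule applied
```

If a broad "Allow and log" rule for the same group sits at order 1 and the specific Block rule at order 2, `evaluate` returns the allow rule for every upload from that group; swapping the order numbers is enough to let the block decide.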

The pieces of a rule

ICAP incident receiver — where the evidence goes

When a rule fires, ZIA can forward the incident to an on-prem ICAP DLP incident receiver (or a third-party DLP). You choose how much it sends: MD5 only (a hash of the content) or full content (the payload itself).

Analogy: MD5-only is a CCTV logbook entry ("something happened at 2pm"). Full content is the actual footage. You keep the footage only where you can lock it down.
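The difference between the two modes can be sketched as the shape of the record that gets forwarded. This does not implement the ICAP protocol; `build_incident`, the mode names and the dictionary keys are hypothetical, and only `hashlib.md5` is a real library call.

```python
import hashlib


def build_incident(rule: str, payload: bytes, mode: str = "md5_only") -> dict:
    """Build the record that would be forwarded to the incident receiver."""
    if mode not in ("md5_only", "full_content"):
        raise ValueError(f"unknown mode: {mode}")
    incident = {
        "rule": rule,
        "md5": hashlib.md5(payload).hexdigest(),   # fingerprint of what was sent
    }
    if mode == "full_content":
        incident["content"] = payload              # the sensitive data itself travels too
    return incident
```

In the MD5-only record there is nothing to leak beyond the hash, which is why the full-content mode calls for a locked-down receiver.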

Admin portal — build a custom dictionary (pattern) for a partner project code
Administration → DLP Dictionaries & Engines → Add DLP Dictionary
  Name:        Custom-ProjectKavach
  Type:        Patterns
  Pattern:     PRJ-KAV-[0-9]{6}        # e.g. PRJ-KAV-004217
  Threshold:   Hit Count >= 3          # ignore a single stray mention
  Proximity:   30                      # keyword must sit within 30 chars (optional)
Expected output
Dictionary "Custom-ProjectKavach" saved.
Match preview: 3 of 3 sample lines matched, 0 false hits on order-ID test set.
# If preview shows hits on your safe sample → tighten the regex / raise the threshold.
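Before saving a pattern like this, it can help to try it offline against a safe sample. The sketch below uses Python's `re` module with the same pattern and hit-count threshold; the helper names are made up, and Python's regex dialect may differ from the one the portal accepts.

```python
import re

PATTERN = re.compile(r"PRJ-KAV-[0-9]{6}")   # same pattern as the dictionary above


def hit_count(text: str) -> int:
    """Number of project-code matches in the text."""
    return len(PATTERN.findall(text))


def dictionary_fires(text: str, threshold: int = 3) -> bool:
    """Mirror of 'Hit Count >= 3': a single stray mention is ignored."""
    return hit_count(text) >= threshold
```

Note that the pattern is unanchored, so it also matches the first six digits of a longer run such as PRJ-KAV-0042171; adding a trailing `\b` is one way to tighten it if the safe sample shows such hits.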
Admin portal — wrap it in an engine and a rule
DLP Dictionaries & Engines → Add DLP Engine
  Name:   Engine-Kavach-Strict
  Logic:  ((Custom-ProjectKavach)) AND ((Confidential-Keyword))

Policy → Data Loss Prevention → Add Web DLP Rule
  Name:        Block-Kavach-Exfil
  Order:       1
  Criteria:    Groups = Engineering | URL Category = Personal Storage, Webmail
               File Types = Documents, Spreadsheets, Archives
  Engine:      Engine-Kavach-Strict
  Action:      Block
  Severity:    High
  Notification: end-user block page + ICAP incident (Full content)
Expected output
Rule "Block-Kavach-Exfil" active at order 1.
Test upload (PRJ-KAV-004217 in a .docx to a webmail draft):
  Action: BLOCKED   Engine: Engine-Kavach-Strict   Severity: High
  Incident #88213 forwarded to ICAP receiver (full payload).
The trap behind a "working" rule

Don't start in Block. New DLP rules surprise you — they catch legitimate workflows you forgot about (HR sending real SSNs to payroll, support attaching customer data). Start in Allow and log only, watch Web Insights for a week, tune thresholds, then switch to Block. Going straight to Block is how DLP gets disabled by an angry business unit on day two.

👉 So far: a rule = criteria + engine + action + severity, evaluated top-down, with incidents to ICAP. Next, prove it works.

▶ Watch rule evaluation — first match wins

An upload hits the DLP policy. Healthy top-down evaluation runs in four steps; a mis-ordered allow rule placed above the block would match first and swallow it.

① RULE 1: Allow + log: "HR group → payroll portal". Criteria don't match this user → skip
② RULE 2: Block: "Engineering → personal storage, Engine-Kavach-Strict". Criteria match
③ ENGINE: Engine fires (code AND confidential keyword present) → condition met
④ ACTION: Block applied, evaluation stops. First match wins

Aditya at HCL faces this

Aditya's High-severity block rule never fires, even on real matches. A broad "Allow and log" rule sits above it for the same group.

Likely cause

Rule order. DLP is top-down, first match wins. The Allow-and-log rule at a lower order number matches first and stops evaluation, so the block never runs.

Diagnosis

Policy → Data Loss Prevention → read the rule order; check Web Insights "Reason" = which rule decided
Fix

Move the specific Block rule above the broad Allow-and-log rule, or scope the Allow rule's criteria so it doesn't swallow the same traffic.

Verify

Re-test; Web Insights "Reason" now names the Block rule, action = Blocked, severity High.

Verify it worked — the log that ends the argument

Never claim a DLP rule works from the config screen. Do a safe test upload (synthetic data that matches), then open Analytics → Web Insights, filter on your user + last hour, and read the DLP Engine and Reason columns. They name the exact rule and engine that decided. If the log disagrees with what you expected, your rule order or SSL-inspection scope is wrong — not the cloud.

Web Insights filter — confirm the rule and engine that fired
Analytics → Web Insights → Logs
  Filter:  User = aditya@org.in   AND   Action = Blocked   AND   last 1 hour
  Columns: Show "DLP Dictionary", "DLP Engine", "Rule", "Reason"
Expected output
2026-05-31 14:22  aditya@org.in  Action: Blocked
  Rule: Block-Kavach-Exfil   Engine: Engine-Kavach-Strict
  Dictionary: Custom-ProjectKavach (5 hits)  Severity: High
  Incident: #88213 → ICAP receiver (full content)
# If "Reason" names a different rule → fix rule order. If no row at all → SSL inspection isn't decrypting it.
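The two troubleshooting branches in that last comment can be written down as a tiny triage helper. It assumes the Web Insights rows have already been exported into a list of dictionaries with a `rule` key; that shape and the `triage` function are hypothetical, not a Zscaler export format.

```python
def triage(rows: list, expected_rule: str) -> str:
    """Turn filtered log rows for one test upload into a next step."""
    if not rows:
        # No DLP row at all: the payload was probably never decrypted.
        return "no row: check SSL inspection scope for this destination"
    decided = rows[0]["rule"]
    if decided != expected_rule:
        # A different rule matched first and stopped evaluation.
        return f"decided by {decided}: check rule order"
    return "ok: expected rule decided"
```

It encodes the same reading order as the text: first ask whether a row exists, then ask which rule decided.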
Quick check · Q4 of 10

A High-severity DLP block rule never fires, yet a real match clearly happened. Web Insights "Reason" names a broad "Allow and log" rule above it. Root cause?

Correct: d (rule order). The "Reason" column already told you which rule decided: a broad Allow-and-log rule above the Block. Reorder so the specific block sits higher, or scope the Allow rule. SSL inspection off (option a) would mean no DLP row at all, not an Allow.
Figure 4 — ZIA DLP build cheat-sheet (screenshot this)
A grid of six cheat-sheet tiles covering the DLP build order: enable SSL inspection first, define dictionaries with threshold and proximity, combine them in an engine with AND/OR/NOT, pick EDM or IDM for fingerprinting, build the rule with action and severity in monitor mode first, and route incidents to an ICAP receiver with MD5 or full content. A golden-rule footer states: no SSL inspection, no DLP; validate in Web Insights.
Diagram title: Build order → the one decision that matters at each step
  ① SSL inspection: ON → DLP reads only decrypted traffic → don't bypass the test destination · no inspection = no DLP
  ② Dictionary: predefined OR custom · phrase / pattern / words · threshold + proximity
  ③ Engine: combine with AND/OR/NOT · AND tightens, OR widens · NOT excludes safe data
  ④ EDM / IDM: EDM = exact structured · IDM = whole documents · OCR for images, org-level
  ⑤ Rule: criteria → engine → action · top-down, first match wins · start "Allow + log", then Block
  ⑥ ICAP incident: MD5 only = hash, light · full content = evidence · secure the receiver
GOLDEN RULE: No SSL inspection, no DLP. Tune before you Block. Validate every rule in Web Insights, never from the config screen.
File type checked by: Magic Bytes · MIME type · File Extension | log of truth: Analytics → Web Insights → "DLP Engine" / "Reason"
Six steps, six decisions, one golden rule. Screenshot this — it's the card you'll glance at while building a rule.


📝 Wrap-up assessment — six more

You've answered 4 inline. Six left. 70% (7 of 10) marks the lesson complete on your profile. Tap Submit all answers at the end.

Q5 · Remember

ZIA inspects a file's true type before applying DLP. Which three checks does File Type Control use, in order?

Correct: b (Magic Bytes, then MIME type, then File Extension). ZIA checks the file signature first, so a .docx renamed to .txt is still classified correctly and caught by file-type criteria.
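The signature-first order can be illustrated with a small classifier. The lookup tables and `classify` are a hypothetical sketch, not ZIA's File Type Control; the only facts relied on are that PDF files start with `%PDF` and that ZIP-based formats such as .docx start with `PK\x03\x04`.

```python
import os

# Assumed, deliberately tiny lookup tables for illustration.
MAGIC = {b"%PDF": "pdf", b"PK\x03\x04": "zip-based (docx/xlsx/zip)"}
MIME = {"application/pdf": "pdf", "text/plain": "text"}
EXT = {".pdf": "pdf", ".txt": "text", ".docx": "zip-based (docx/xlsx/zip)"}


def classify(filename: str, head: bytes, mime: str = None) -> str:
    """Check the file signature first, then MIME type, then the extension."""
    for signature, kind in MAGIC.items():
        if head.startswith(signature):
            return kind                      # the content wins over the name
    if mime in MIME:
        return MIME[mime]
    ext = os.path.splitext(filename)[1].lower()
    return EXT.get(ext, "unknown")
```

A file named notes.txt whose first bytes are the ZIP signature is classified as ZIP-based even though both its declared MIME type and its extension say plain text.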
Q6 · Apply

Sneha must block uploads only when a real customer's exact name + account + card appear together, with near-zero false positives. Which technique?

Correct: a (EDM). EDM fingerprints exact field values from structured data, so it fires only on genuine records and stays quiet on test/fake data, which makes it the lowest false-positive technique. IDM (option c) is for whole documents; a PCI dictionary (option b) would flag any card-like number.
Q7 · Apply

You're rolling out a brand-new DLP block rule across 5,000 users. What's the safest first move?

Correct: c (start in "Allow and log only"). Monitor mode surfaces the legitimate workflows your rule would break, so you tune before you block. Going straight to Block (option a) is how DLP gets switched off by an angry business unit on day two.
Q8 · Analyze

After a 2025 upgrade, OCR stopped working on a specific rule even though that rule's ocrEnabled was set. Why?

Correct: b (OCR is now an organization-level setting). Zscaler moved OCR to an organization-level setting under Administration → DLP Advanced Settings, and deprecated the per-rule ocrEnabled attribute. Setting it on a single rule now does nothing; enable OCR once for the org instead.
Q9 · Analyze

An IDM rule for a contract template is throwing too many false positives on unrelated documents. The team wants to keep IDM. Best tuning move?

Correct: d (raise the match-accuracy threshold). IDM's partial-match flexibility is also its false-positive source. Raising the match-accuracy threshold (e.g. from a loose value toward higher similarity) means only genuine full/near-full copies trigger. Lowering severity (option c) hides noise but doesn't fix it.
Q10 · Evaluate

For a healthcare client, an engineer proposes sending full content of every DLP incident to a low-security shared ICAP receiver, "so auditors see everything". Sound design?

Correct: c (no, not as proposed). Full-content forwarding copies the leaked sensitive data to the receiver, so a low-security shared box becomes a new breach target holding PHI. Either harden and access-control the receiver, or send MD5-only where the investigator doesn't need the payload. "More data" (option a) without controls is a liability, not a win.

🧠 In your own words

Type one line: why must SSL inspection be enabled before inline web DLP can do anything? Then compare to the expert version.

Expert version: Inline web DLP inspects the content of an upload. HTTPS hides that content inside a TLS tunnel. SSL inspection decrypts the traffic so DLP receives a readable payload to match against engines, dictionaries, EDM and IDM. If the destination is bypassed (Do-Not-Inspect), DLP gets no payload, so no rule can match — the upload sails through. If your answer mentioned "DLP can't read encrypted content it didn't decrypt", you've got it.


📖 Glossary

DLP (Data Loss Prevention)
Inspects outbound content for sensitive data and blocks or logs it before it leaves the org. In ZIA this runs inline on the SSL-inspected egress path.
DLP dictionary
A definition of what sensitive data looks like — predefined (PCI, SSN, HIPAA) or custom (phrases, patterns/regex, words with thresholds + proximity).
DLP engine
A logical container that combines one or more dictionaries with AND/OR/NOT to decide when a match counts. A rule calls the engine.
EDM (Exact Data Match)
Fingerprints exact field values from a structured source via an on-prem Index Tool (hashes only). Fires only on your real records — very low false positives.
IDM (Indexed Document Match)
Fingerprints whole documents and detects full or partial (reworded) copies via a match-accuracy threshold. For unstructured data.
OCR
Optical Character Recognition — extracts text from images so DLP can inspect screenshots and embedded pictures. Configured org-wide since 2025.
MIP / Purview label
Microsoft Information Protection sensitivity label; ZIA can match on the label instead of re-scanning content.
SSL inspection
ZIA decrypts HTTPS so content controls (including DLP) can read the payload. A hard prerequisite for inline web DLP.
ICAP incident receiver
An on-prem endpoint that receives DLP incidents over ICAP — either MD5-only (a hash) or full content (the payload).
DLP rule
An ordered policy entry: criteria (users, location, URL category, file type) → engine → action (Allow / Block / Confirm / Allow and log) + severity. Top-down, first match wins.

📚 Sources

  1. Zscaler Help — About DLP Engines · Understanding DLP Engines · About DLP Dictionaries · Adding Custom DLP Dictionaries (phrases/patterns/threshold/proximity) · Configuring DLP Policy Rules. help.zscaler.com
  2. Zscaler Help — About Exact Data Match (EDM) · Understanding EDM Index Templates · About Indexed Document Match (IDM) · Defining IDM Match Accuracy. help.zscaler.com
  3. Zscaler Help — About ICAP Receivers for DLP · DLP Incident Receiver · Configuring OCR for DLP · About Microsoft Information Protection Labels (OCR now org-level in DLP Advanced Settings). help.zscaler.com
  4. Zscaler Community (Zenith) — "DLP without SSL inspection" and "DLP Policy Best practices" threads (SSL inspection prerequisite, false-positive tuning). community.zscaler.com
  5. Practitioner write-up — "Stopping Data Leaks: A High-Level Overview of Zscaler ZIA's DLP" (engines vs policies, incidents with content snippets). dontblamethenetwork.com; and "EDM vs IDM vs OCR", hackfaqs.com
  6. Zscaler — Enhanced DLP Capabilities: OCR; Release Upgrade Summary 2025 (2025 engines: CCPA, DPDPA, Credentials & Secrets; ocrEnabled API deprecated). zscaler.com / help.zscaler.com
  7. Zscaler ZDTA Certification — Data Protection Services domain (file-type inspection by Magic Bytes / MIME / File Extension; EDM = structured match; out-of-band = data at rest). customer.zscaler.com

What's next?

You can now place DLP on the ZIA egress path, pick the right detection technique, and build a rule that fires on the right data. Next, see how DLP chains with the rest of the ZIA policy stack — Cloud Firewall, URL Filtering, File Type Control and IPS — inside one SSE hop.