
ZPA Troubleshooting Playbook — Find the Layer, Find the Fix

A power cut: you check the bulb, the switch, the fuse, the meter, the street line — in that order — and stop the moment one fails. ZPA triage is the same 5 checks. Most "private app is down" tickets are solved not by knowing every subsystem, but by isolating WHICH of five layers — Client, Edge/Broker, Connector, App/Network, Policy — actually broke, then jumping straight to the right lesson. This is the master hub that routes you there.

📅 2026-06-05 · ⏱ 12 min · 5 layers · 1 router table · 5 diagrams · 🏷 10-Q assessment + AI Tutor inline · links 11 ZPA lessons

⚡ Quick Answer

Most "ZPA private app is down" tickets are solved by finding WHICH of five layers failed — L1 Client (ZCC), L2 Edge/Broker (Service Edge), L3 Connector (App Connector), L4 App/Network (DNS + last-mile), L5 Policy (Access Policy + SAML/SCIM) — then jumping to the right deep-dive lesson. This is the master triage hub: learn top-down vs bottom-up isolation, the 10-minute runbook, and a symptom → layer → lesson router that links all 11 ZPA lessons.


⏳ Before you scroll — 3 questions (no score, just notice what you don't know yet)

Answer these in your head. As you hit the matching section below, you'll see if you were right.

  1. A user's app is down. The App Connector shows Connected (green). Which layers could still be the cause?
  2. The same app is down for everyone in the office. Do you isolate top-down (from the laptop) or bottom-up (from policy/app)?
  3. You have no lead at all — which single Portal page tells you which layer to look at first?

The playbook in 4 moves — jump straight to one

  1. Isolate the layer: the 5-layer model plus top-down vs bottom-up. One signal per layer.
  2. The Diagnostics tour: read the one field that names the layer (Policy, Connector, setup time).
  3. Symptom → lesson router: a table mapping symptom → likely layer → the exact deep-dive lesson to open.
  4. 10-minute runbook: a timed, repeatable triage you can run on any ZPA ticket, start to close.

Stop fixing ZPA. Start isolating layers.

Most engineers attack a "ZPA app is down" ticket by guessing a subsystem ("must be the connector," "must be policy") and burn an hour in the wrong console. Wrong move. A ZPA request crosses five layers in order: L1 Client (ZCC) → L2 Edge/Broker (Service Edge) → L3 Connector (App Connector) → L4 App/Network (DNS + last-mile) → L5 Policy (Access Policy + identity). The app is down because exactly one of those layers failed. Find the layer first; the fix is then a short walk to the right lesson.

Think of a power cut at home. You don't rewire the whole house — you check the bulb, then the switch, then the fuse, then the meter, then the street line, in order, and stop at the first thing that's dead. ZPA triage is the same disciplined five checks. This blog is a triage hub, not another deep-dive: it teaches the isolation method and then hands you off to the lesson that owns each layer. That's the aha-moment: "connector healthy" is data about ONE layer (L3), not the whole chain — so a green connector actually narrows the fault to L1, L4, or L5.

👉 So far: a ZPA request crosses 5 layers; the app is down because exactly one failed. Find the layer, then open the lesson for it. Next: the layered map with the diagnostic tool labelled at each layer.
[Figure: the 5-layer ZPA request path with the diagnostic tool at each layer. A request flows L1 Client → L2 Edge/Broker → L3 Connector → L4 App/Network → L5 Policy.]
L1 · Client: ZCC enrolled? Z-Tunnel up? Tool: ZCC > More > Connection Status / Export Logs.
L2 · Edge: Service Edge / broker reached? Tool: Portal Diagnostics + ZCC ZPA status.
L3 · Connector: healthy connector serving this app? Tool: journalctl -u zpa-connector + Portal health.
L4 · App/Net: resolve + TCP-reach the server? Tool: dig / curl / nc from the connector host.
L5 · Policy: does a rule ALLOW it? Tool: Diagnostics policy result + Access Policy simulation.
Top-down (one user): walk L1 → L5. Bottom-up (everyone): start at L5 / L4 and work back to L1. Connector green = L3 proven, so the fault is L1, L4, or L5; skip the layer you already proved.
Power-cut order: bulb (L1) → switch (L2) → fuse (L3) → meter (L4) → street line (L5).
The five layers, left to right, with the one diagnostic tool you reach for at each. Find the layer first — the fix is then just opening that layer's lesson.

The five layers, in 30 seconds each

L1 · Client (ZCC)

Is ZCC enrolled, the ZPA service ON, and the tunnel up? If not, traffic never enters ZPA. Tool: ZCC > More > Connection Status / Export Logs. Lesson: ZCC Troubleshooting.

L2 · Edge / Broker

Did the device reach a healthy, near Public/Private Service Edge (the broker)? Tool: Portal Diagnostics + ZCC ZPA status. Lesson: ZPA vs VPN / Private Service Edge.

L3 · Connector

Is a healthy connector serving THIS app (right version, not at capacity)? Tool: journalctl -u zpa-connector + Portal health. Lesson: App Connector Troubleshooting / Deploy.

L4/L5 · App + Policy

L4: can the connector resolve + TCP-reach the server (dig/nc)? L5: does a rule ALLOW it (Diagnostics policy result)? Lessons: DNS & App Discovery · Access Policy.

① Isolate the layer — the 5-layer method

This is the whole game. Instead of asking "what's broken?", ask "which layer is broken?" and check each one with a single yes/no signal. The instant a layer answers "no", you've found your bucket — and the lesson that fixes it. Two ways to walk the layers:

Top-down (L1 → L5) when one user or one device is affected — start at their laptop and move outward. Bottom-up (L5 / L4 → L1) when the same app is down for everyone — a shared fault almost always lives in policy or the app/network layer, so don't waste time on individual clients. And remember the shortcut: a green connector proves L3, so it tells you to look at L1, L4, or L5.

The 5-layer isolation walk

Sneha at Infosys reports SAP is down. Each layer below is listed with its check; ⚠ marks the layer where this ticket actually fails.

L1 · CLIENT (ZCC) Is ZCC enrolled, ZPA service ON, Z-Tunnel up?
✅ Tool: ZCC > More > Connection Status. Sneha's tunnel is up — L1 passes.
L2 · EDGE / BROKER Did she reach a healthy, near Service Edge (the broker)?
✅ Tool: Portal Diagnostics + ZCC ZPA status. Brokered fine — L2 passes.
L3 · CONNECTOR Is a healthy App Connector serving SAP (right version, not at capacity)?
✅ Tool: journalctl -u zpa-connector + Portal health. Connector green — L3 passes.
L4 · APP / NETWORK From the connector host, can it dig + nc the SAP server?
✅ Tool: dig sap.corp.local → 10.50.7.40, nc -vz 10.50.7.40 443 open. L4 passes.
L5 · POLICY Does an Access Policy rule actually ALLOW Sneha to SAP?
⚠ HERE. Diagnostics shows empty Policy + Close — the ALLOW rule references a SCIM group that was renamed.
FIX → LESSON Layer found = L5 → open the Access Policy lesson, repoint the rule to the new group name. ✅
The first "no" is your bucket, and it names the lesson that fixes it.
Worked triage · top-down · one user

"Sneha at Infosys says SAP is down" — walk L1 → L5

Symptom One user, one app: Sneha can't open sap.corp.local. Connector looks green.

L1 Client ZCC enrolled, ZPA ON, tunnel up? Yes — Connection Status is green. Pass.

L2 Edge Reached a healthy near Service Edge? Yes — Portal Diagnostics shows a brokered session. Pass.

L3 Connector Healthy connector serving SAP? Yes — green, current version, low load. Pass (so the fault is NOT the connector).

L4 App/Net From the connector host, dig sap.corp.local → 10.50.7.40, nc -vz 10.50.7.40 443 open. Pass.

L5 Policy FAIL. Diagnostics shows empty Policy + Close: the ALLOW rule references SCIM group SAP-Users, but identity renamed it to SAP_Users. Fix = repoint the rule → app back. Lesson: Access Policy.

L5 signal — Diagnostics → User Activity (the layer names itself)
Username        : sneha.r@infosys.com
Host            : sap.corp.local
ConnectionStatus: Close
Policy          : (empty)          <- no rule matched = L5 miss
Connector       : connector-mumbai-01   <- named => L3/L4 were fine
After repointing the rule to SAP_Users
ConnectionStatus: Open
Policy          : "Allow-SAP-Users"
ServerIP        : 10.50.7.40
ConnectionSetupTime: 38 ms
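
If you paste rows like the ones above into a ticket, a few lines of Python can turn them into a dictionary you can reason about. This is a minimal sketch that assumes the "Field : value" layout shown here, with "<-" introducing a human annotation; the function name parse_activity_block is hypothetical and not part of any Zscaler tooling.

```python
def parse_activity_block(text):
    """Parse 'Field : value' lines into a dict, dropping '<-' annotations."""
    row = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip headings and blank lines
        key, _, value = line.partition(":")
        value = value.split("<-", 1)[0].strip()  # drop the trailing note
        value = value.strip('"')                 # tolerate quoted policy names
        if value == "(empty)":
            value = ""                           # normalise the empty marker
        row[key.strip()] = value
    return row


sample = """\
Username        : sneha.r@infosys.com
Host            : sap.corp.local
ConnectionStatus: Close
Policy          : (empty)          <- no rule matched
Connector       : connector-mumbai-01   <- named
"""

row = parse_activity_block(sample)
print(row["ConnectionStatus"], repr(row["Policy"]), row["Connector"])
```

An empty Policy string alongside ConnectionStatus "Close" is the L5 tell described above.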
Pause & Predict

Same SAP outage, but now the whole office reports it at 9am. Do you still walk top-down from Sneha's laptop?

No — flip to bottom-up. When everyone is hit at once, the fault is almost never on every individual laptop (L1). Start at L5 Policy and L4 App/Network, where a single shared change (a renamed group, a server group rebinding, a DNS outage) breaks all users simultaneously. Top-down is for the one-off; bottom-up is for the blast radius.
Method · top-down vs bottom-up

Pick the isolation direction from the blast radius

One user Walk top-down L1 → L5: their ZCC, their broker session, the connector, the app reachability, then policy. The fault is usually personal (tunnel down, attribute missing).

One office Walk bottom-up but start at L2 Edge — a single sick Service Edge or a site-wide forwarding change hits everyone in one location.

Everyone Walk bottom-up L5 → L4: a policy edit, a renamed identity group, a server-group rebinding, or DNS — the shared objects. Never start at 5,000 laptops.

Shortcut Connector green = L3 proven. Cross it off and look at L1 / L4 / L5 only.

Lesson Once you know the layer, open its deep-dive (see the router table below) instead of re-deriving the fix here.
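
The direction rule above can be sketched as a small lookup. The scope labels ("user", "office", "everyone") are hypothetical names chosen for illustration, and the order used after L2 for the one-office case is an assumption: the lesson only says to walk bottom-up but start at L2.

```python
LAYERS = ["L1", "L2", "L3", "L4", "L5"]


def isolation_order(scope):
    """Return the order in which to check layers for a given blast radius."""
    if scope == "user":
        return list(LAYERS)  # top-down: L1 -> L5
    if scope == "office":
        # Start at L2 Edge, then continue bottom-up through the rest
        # (the order after L2 is an assumption, not stated in the lesson).
        return ["L2"] + [layer for layer in reversed(LAYERS) if layer != "L2"]
    if scope == "everyone":
        return list(reversed(LAYERS))  # bottom-up: L5 -> L1
    raise ValueError(f"unknown scope: {scope!r}")


for scope in ("user", "office", "everyone"):
    print(scope, isolation_order(scope))
```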

Method · the connector-green rule

"Connector is green but the app is still down"

Symptom Portal connector health = Connected, yet users can't open the app.

What it proves ONLY that L3 is healthy — the connector reached the broker and is alive. It says nothing about L1, L4, or L5.

So check L1 (is the user's tunnel actually up?), L4 (can the connector resolve + reach this server, or did DNS/last-mile break?), L5 (does a rule allow it?).

Trap A green connector in the wrong server/connector group still can't serve the app — that's an L3/L4 binding fault, not a connector-health fault. The light is green; the wiring is wrong.

Lesson App Connector Troubleshooting (L3) for health; DNS & App Discovery (L4) for reachability.

Quick check · Q1 of 10 · Remember

In the 5-layer method, what does a green (Connected) App Connector prove?

Answer: only that L3 is healthy. "Connected" is a fact about ONE layer (L3): the connector reached the broker and is alive. It says nothing about the client tunnel (L1), the connector-to-server resolve/reach (L4), or whether policy allows it (L5). A green connector narrows the fault; it never closes the ticket.
👉 So far: name the layer with one signal each, choosing top-down (one user) or bottom-up (everyone). Next: the Diagnostics page that names the layer for you.

② The Diagnostics page tour — let the field name the layer

You don't have to guess the layer — ZPA's own Diagnostics page tells you. Open Analytics → Diagnostics → User Activity for the failing request and read which field is wrong. Each field maps to a layer. This is the single most powerful skill in the whole playbook: the field that's wrong is the layer.

https://admin.private.zscaler.com  ·  Analytics ▸ Diagnostics ▸ User Activity
Diagnostics ▸ User Activity ▸ sneha.r@infosys.com → sap.corp.local
One row per request. Read down the five layers; the first field that is not green is the layer that failed.
L1 · Client (ZCC) status: Authenticated ✓
L2 · Service Edge / broker: Brokered ✓ (Mumbai ZEN)
L3 · App Connector: Healthy ✓ (172.16.8.21)
L4 · App reachability / DNS: Resolved + reachable ✓
⚠ L5 · Policy result: BLOCKED (no matching ALLOW rule)
L1–L4 are all green, so the connector is fine — the fault is L5 Policy. A SCIM group was renamed, so Sneha's ALLOW rule no longer matches. Go straight to the Access Policy lesson; never touch the connector.
🖥️ Recreated for clarity — your ZPA console matches this. Path: Analytics ▸ Diagnostics ▸ User Activity. Read the layers top-down; the first non-green field names the broken layer.
Diagnostics field → layer

Read User Activity top-down: four fields, four layers

Empty Policy + Close No rule matched → L5 Policy (missing ALLOW, or an identity/SCIM/SAML miss). Open the Access Policy lesson.

Empty Connector ZPA never picked a connector → L3/L4 binding (server group → connector group eligibility rejected all). Open Groups & Service Edge / App Connector lessons.

Named + slow A named connector with a high ConnectionSetupTime → L4 reachability (the connector can't promptly reach the server: DNS or last-mile). Open DNS & App Discovery / Performance & MTU.

Open + app-side fail ZPA is clean but the app errors → L4/clientless (cert pinning, Browser Access TLS). Open Browser Access / PRA.

Where to look User Activity = per-request (status, policy, connector, timing). User Status / Authentication = the SAML/SCIM the IdP actually sent (L5 identity).
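
The field-to-layer reading above can be written down as a tiny classifier. This is a sketch under stated assumptions: the row is a plain dict keyed by the field names used in this lesson, ConnectionSetupTime is an integer in milliseconds, and the 1000 ms "slow" threshold is a hypothetical cut-off (the lesson only shows roughly 38 ms as healthy and about 3 s as slow). Checking Policy before Connector mirrors the order of the list above.

```python
def classify_row(row, slow_ms=1000):
    """Map one User Activity row (a dict, or None if no row exists) to a layer."""
    if row is None:
        return "L1"  # no row at all: the request never reached ZPA
    closed = row.get("ConnectionStatus") == "Close"
    if closed and not row.get("Policy"):
        return "L5"  # empty Policy + Close: no rule matched
    if not row.get("Connector"):
        return "L3/L4 binding"  # ZPA never picked a connector
    if row.get("ConnectionSetupTime", 0) > slow_ms:
        return "L4"  # named connector, but slow to reach the server
    return "ZPA clean"  # look app-side (for example clientless TLS / cert)


print(classify_row({"ConnectionStatus": "Close", "Policy": "",
                    "Connector": "connector-mumbai-01"}))
print(classify_row({"ConnectionStatus": "Open", "Policy": "Allow-SAP-Users",
                    "Connector": "connector-pune-02",
                    "ConnectionSetupTime": 2900}))
```

A "ZPA clean" result with a failing app corresponds to the "Open + app-side fail" row above.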

Diagnostics · L1 to L3 signals

Three more diagnostics you'll reach for first

L1 Client On the laptop: ZCC > More > Connection Status (ZPA service = ON? tunnel up?) and Export Logs to attach to the ticket. If the tunnel is down, no Diagnostics row will ever appear — that itself is the L1 tell.

L2 Edge Portal Administration → Diagnostics shows the broker/Service Edge the session used; ZCC's ZPA status confirms a near, healthy edge was reached.

L3 Connector Portal connector health for "Connected", AND on the connector host journalctl -u zpa-connector -f. A healthy connector runs a parent + child process — parent-only means it's unhealthy.

Pro tip journalctl -u zpa-connector -n 10000 | grep fohh_connection_connected proves the connector↔Service Edge channel (L2/L3) is up, with rx/tx byte counters.

Retention The admin console keeps roughly 14 days of activity; for longer history stream LSS to a SIEM (Splunk, etc.).

Diagnostics · L4 from the connector host

Prove (or disprove) app reachability in three commands

Symptom Diagnostics names a connector but the app still won't open, or setup time is high.

Resolve dig app.corp.local on the connector host — does it return the right server IP? A DNS failure here is often a permissions issue on /etc/resolv.conf for the zscaler user.

Reach nc -vz 10.50.7.40 443 (or curl -vk https://10.50.7.40) — can the connector open TCP to the server's port?

Read Resolve fails → L4 DNS (DNS & App Discovery lesson). Reach fails → L4 last-mile/firewall. Both pass but app still down → step back to L5 Policy.

Lesson DNS & App Discovery (resolution) and Performance & MTU (transport) own this layer.
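
If dig or nc is not installed on the connector host, the same resolve-then-reach logic can be approximated with Python's standard socket module. This is a rough sketch, not a replacement for the commands above: getaddrinfo uses the system resolver (including /etc/hosts), whereas dig queries DNS directly, and the function names and verdict strings are hypothetical.

```python
import socket


def resolve(host):
    """Return the IPv4 addresses the system resolver gives for host, or []."""
    try:
        infos = socket.getaddrinfo(host, None, family=socket.AF_INET,
                                   type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def tcp_reachable(ip, port, timeout=3.0):
    """Return True if a TCP connection to ip:port opens within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, ...
        return False


def l4_verdict(host, port):
    """Apply the 'Read' rule: resolve first, then reach."""
    ips = resolve(host)
    if not ips:
        return "L4 DNS"
    if not any(tcp_reachable(ip, port) for ip in ips):
        return "L4 last-mile/firewall"
    return "L4 passes"
```

On the connector host you would call, for example, l4_verdict("app.corp.local", 443); a result of "L4 passes" with the app still down sends you back to L5 Policy.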

[Figure: symptom-to-layer decision tree. Starting from "App won't open", five yes/no checks each isolate one layer; the first No names the failed layer. Yes on all five means the app opens.]
L1: ZCC enrolled + tunnel up? Check: ZCC > Connection Status. No → ZCC lesson.
L2: healthy near Service Edge? Check: Portal Diagnostics + ZCC status. No → Private Service Edge lesson.
L3: connector serving this app? Check: journalctl + Portal health. No → App Connector Troubleshooting / Deploy.
L4: resolves + TCP-reaches server? Check from the connector: dig / curl / nc. No → DNS & App Discovery / MTU lesson.
L5: a rule ALLOWs it? Check: Diagnostics Policy field / simulation. No → Access Policy / App Segments lesson.
Run top-to-bottom. The first "No" names the failed layer — and the red label tells you exactly which lesson to open. You rarely reach the bottom.
Practice the layers in the lab: 🔗 App Connector Simulator (L3) 🛠 ZPA Troubleshooting Sim
Quick check · Q2 of 10 · Analyze

User Activity log shows a named Connector but a high ConnectionSetupTime and the app is slow/timing out. Which layer is the fault, and which lesson do you open?

Answer: L4, so open DNS & App Discovery / Performance & MTU. A named connector means ZPA selected one (L3 binding was fine), so the remaining suspect is L4: that connector resolving and reaching the server. High ConnectionSetupTime is the L4 reachability signal. Empty Connector would be the L3/L4 binding case instead.
Diagnostics · worked read

One User Activity row, read field by field

Row ConnectionStatus=Open · Connector=connector-pune-02 · ConnectionSetupTime=2900 ms · ServerIP=10.50.9.12

L5? Policy isn't empty and status is Open → a rule allowed it. L5 is fine.

L3? A connector is named → ZPA picked one. The binding (L3) is fine.

L4! Setup time of 2,900 ms is the smell — the named connector is struggling to resolve or reach 10.50.9.12. SSH to connector-pune-02 and run dig + nc. Lesson: DNS & App Discovery / Performance & MTU.

Takeaway You never opened the policy console or the segment config — the fields told you it was L4 in ten seconds.

Pause & Predict

A ticket says "app down". You open Diagnostics → User Activity and there is no row at all for that user and host. Which layer does the absence of a row point to?

L1 Client. If ZPA never logged the request, the traffic never reached ZPA — the ZCC tunnel is down, the ZPA service is off, or the FQDN isn't being steered to ZPA at all. Check ZCC > More > Connection Status first. A missing row is itself a signal: the request died before L2. Open the ZCC Troubleshooting lesson.

③ The symptom → layer → lesson router

This is the heart of the hub. Match the symptom on the left to its likely layer, then open the deep-dive lesson that owns it. Don't re-learn the subsystem here — this table is the map; the lessons are the territory.

L1 · Client (ZCC)

"No ZPA at all" / nothing is logged

Symptom No Diagnostics row; ZCC shows ZPA off / tunnel down / re-auth loop; only this user.

Likely layer L1 — the request never entered ZPA.

Go to lesson ZCC Troubleshooting (enrollment, tunnel, service state, log export).

L2 · Edge / Broker

Everyone in one office is slow or down

Symptom A whole site degrades; sessions route to a distant or unhealthy Service Edge.

Likely layer L2 — the broker/Service Edge the device reached.

Go to lesson ZPA vs VPN & Private Service Edge (broker model, on-prem Private Service Edge).

L3 · Connector

Connector unhealthy, wrong version, or none serving the app

Symptom Portal health red/flapping; empty Connector field; journalctl shows enrollment/handshake errors; parent-only process.

Likely layer L3 — connector health, or the server-group → connector-group binding.

Go to lesson App Connector Troubleshooting · App Connector Deploy · Groups & Service Edge (the L3/L4 binding).

L4 · App / Network

Named connector but slow / can't reach the server

Symptom High ConnectionSetupTime; dig fails on the connector host; works on 443 but not 8443; pinned-cert / clientless TLS fail with green ZPA.

Likely layer L4 — DNS resolution, last-mile reachability, transport/MTU, or the app-segment object chain.

Go to lesson DNS & App Discovery · Performance & MTU · App Segments · Browser Access & PRA (clientless / cert).

L5 · Policy

Brokered fine, reachable — but denied or unmatched

Symptom Empty Policy + Close; a renamed/un-synced SCIM group; a SAML attribute the IdP isn't sending; posture/timeout blocking.

Likely layer L5 — Access Policy order, or the identity (SAML/SCIM) feeding it.

Go to lesson Access Policy (rule order, criteria, identity) · App Segments (the object the rule targets).

The router in one line

No log → L1. Whole office → L2. Empty Connector / red health → L3. Named-but-slow / dig fails / clientless TLS → L4. Empty Policy + Close → L5. Pick the row, open the lesson — that's the whole hub.
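
The router can also be kept as a lookup table in a script or a wiki macro. The sketch below encodes only the rows listed in this section; the symptom keys (such as "no_log") are hypothetical labels, while the layer and lesson names are the ones used in this hub.

```python
# symptom key -> (likely layer, lessons to open)
ROUTER = {
    "no_log": ("L1", ["ZCC Troubleshooting"]),
    "whole_office": ("L2", ["ZPA vs VPN & Private Service Edge"]),
    "empty_connector": ("L3", ["App Connector Troubleshooting",
                               "App Connector Deploy",
                               "Groups & Service Edge"]),
    "named_but_slow": ("L4", ["DNS & App Discovery",
                              "Performance & MTU",
                              "App Segments"]),
    "clientless_tls": ("L4", ["Browser Access & PRA"]),
    "empty_policy_close": ("L5", ["Access Policy", "App Segments"]),
}


def route(symptom):
    """Return (layer, lessons) for a symptom key, or raise ValueError."""
    try:
        return ROUTER[symptom]
    except KeyError:
        raise ValueError(f"no router row for symptom: {symptom!r}") from None


layer, lessons = route("empty_policy_close")
print(layer, lessons)
```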

Pause & Predict

A clientless (Browser Access) user gets a TLS error, but ZPA Diagnostics shows a clean Open with normal setup time. Which layer, and which lesson?

L4 / clientless — not L5, not L3. "ZPA looks fine but the app fails on TLS" is the classic signature of a cert/pinning problem on the app-or-clientless path. ZPA succeeded (Open), so policy (L5) and the connector (L3) are exonerated. Open the Browser Access & PRA lesson for the cert chain, and disable pinning or bypass ZPA for a pinned native app.
Router · worked

Use the table on a live ticket

Ticket A "App down, no Diagnostics row." → No log = L1 → open ZCC Troubleshooting. (Found it: ZPA service was toggled off on the laptop.)

Ticket B "Empty Connector field." → L3 binding → open Groups & Service Edge. (Server group pointed at a disabled connector group.)

Ticket C "Open + 3s setup, dig fails on host." → L4 DNS → open DNS & App Discovery. (resolv.conf perms for the zscaler user.)

Ticket D "Empty Policy + Close." → L5 → open Access Policy. (ALLOW rule referenced a renamed SCIM group.)

Takeaway Four tickets, four layers, four lessons — zero guessing.

Quick check · Q3 of 10 · Apply

A ticket shows a clean Open session, a named connector, but ConnectionSetupTime = 3,100 ms and dig fails on the connector host. Which router row do you follow?

Answer: the L4 row. Open + named connector clears L1, L2, L3 and L5. A high setup time plus a failed dig on the connector host is the L4 DNS signature, frequently a file-permission issue on /etc/resolv.conf for the zscaler service account. Router row L4 → DNS & App Discovery.
👉 So far: every symptom now routes to a layer and a specific lesson. Next: turn the method into a timed 10-minute runbook you can run on any ticket.

④ The 10-minute triage runbook

Put it all together as a timed, repeatable runbook. Run it top-down for one user, bottom-up for a crowd. Every box has a tool and a clock; the moment a check fails, you have your layer and your lesson — stop and fix.
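
The stop-at-first-failure rule is simple enough to sketch in code. In this illustration each check is a hypothetical callable returning True (pass) or False (fail); in practice a human runs the tool for that layer and records the result. The lambdas below replay the Sneha ticket from earlier in this lesson.

```python
def run_triage(checks):
    """Run (layer, check) pairs in order; stop at the first failing check.

    Returns (failed_layer, layers_checked); failed_layer is None if all pass.
    """
    checked = []
    for layer, check in checks:
        checked.append(layer)
        if not check():
            return layer, checked  # stop rule: first failure ends the run
    return None, checked


# Replay of the worked example: L1 to L4 pass, L5 fails.
sneha_ticket = [
    ("L1", lambda: True),   # ZCC tunnel up
    ("L2", lambda: True),   # brokered session
    ("L3", lambda: True),   # connector green
    ("L4", lambda: True),   # dig + nc pass
    ("L5", lambda: False),  # empty Policy + Close
]

failed, checked = run_triage(sneha_ticket)
print(failed, checked)
```

For a fleet-wide outage you would pass the same pairs in bottom-up order instead.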

[Figure: the 10-minute ZPA triage runbook timeline. 10 minutes, 5 checks; stop at the first failure.]
0–2 min   Scope + row: one user or all? Pull the Diagnostics User Activity row.
2–3 min   L1 Client: ZCC Connection Status; tunnel up? No row ⇒ L1.
3–5 min   L2 + L3: near edge? Connector health? Empty Connector ⇒ L3 binding.
5–8 min   L4 App/Net: dig / nc on the connector host. Named + slow ⇒ L4.
8–10 min  L5 Policy: empty Policy + Close ⇒ L5; check SAML/SCIM.
Stop at the FIRST failed check. That layer = your bucket = your lesson. One user → run left to right (top-down). Whole fleet → start at L5/L4 and work back (bottom-up). Then verify: the same Diagnostics row reads Open + named connector + the right Policy.
The runbook on a clock. Most tickets resolve before minute 6 — you stop at the first failed check and open that layer's lesson.
Runbook · minute 0–3

Scope, then check the client (L1)

0–2 Is it one user or many? Pull the failing Diagnostics → User Activity row. The scope chooses your direction; the row chooses your layer.

2–3 (L1) On the laptop: ZCC > More > Connection Status — ZPA ON, tunnel up? No Diagnostics row at all = L1 → ZCC Troubleshooting; export logs.

Stop rule If L1 fails, you're done — fix the client, don't open the policy console.

Runbook · minute 3–8

Edge + connector (L2/L3), then reachability (L4)

3–5 (L2/L3) Did the session reach a healthy near Service Edge (Portal Diagnostics)? Is the connector Connected (journalctl -u zpa-connector, parent+child)? Empty Connector = L3 binding → Groups & Service Edge.

5–8 (L4) SSH to the named connector: dig app.corp.local then nc -vz <ip> 443. Named-but-slow / dig fails = L4 → DNS & App Discovery or Performance & MTU.

Stop rule First failure ends the run — note the layer and open its lesson.

Runbook · minute 8–10 + verify

Policy + identity (L5), then prove the fix

8–10 (L5) If every earlier check passed, the fault is policy: empty Policy + Close = L5. Run Access Policy simulation; check User Status / Authentication for the SAML/SCIM the IdP actually sent. → Access Policy lesson.

Verify Re-run the request. The SAME Diagnostics row must now read Open, a named connector, low ConnectionSetupTime, and the correct Policy.

Close Record which layer it was — over a month, your layer histogram tells you where to invest (a flood of L1s means a ZCC rollout problem, not a connector problem).
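
The layer histogram mentioned above needs nothing more than a counter. This sketch assumes each closed ticket is recorded as a dict with a "layer" key; the ticket data is invented purely for illustration.

```python
from collections import Counter


def layer_histogram(closed_tickets):
    """Count how many closed tickets were attributed to each layer."""
    return Counter(ticket["layer"] for ticket in closed_tickets)


# Hypothetical month of closed tickets.
tickets = [
    {"id": "A", "layer": "L1"},
    {"id": "B", "layer": "L3"},
    {"id": "C", "layer": "L4"},
    {"id": "D", "layer": "L5"},
    {"id": "E", "layer": "L1"},
]

hist = layer_histogram(tickets)
top_layer, count = hist.most_common(1)[0]
print(top_layer, count)  # L1 2
```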

LSS User Activity query (Splunk) — the one-shot triage row
index=zpa sourcetype=zpa_user_activity
  Username="sneha.r@infosys.com" Host="sap.corp.local"
| table LogTimestamp, ConnectionStatus, Policy,
        Connector, ServerIP, ConnectionSetupTime, DoubleEncryption
Verify — a healthy row after the fix
ConnectionStatus  : Open
Policy            : Allow-SAP-Users
Connector         : connector-mumbai-01
ServerIP          : 10.50.7.40
ConnectionSetupTime: 38 ms
DoubleEncryption  : false
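
The verify step can be expressed as a checklist over the re-read row. This is a minimal sketch assuming the row is a dict with the field names shown above and ConnectionSetupTime as an integer in milliseconds; the 500 ms ceiling is a hypothetical threshold, since the lesson only says the setup time should be low.

```python
def verify_fix(row, expected_policy, max_setup_ms=500):
    """Return a list of problems; an empty list means the row looks healthy."""
    problems = []
    if row.get("ConnectionStatus") != "Open":
        problems.append("status is not Open")
    if not row.get("Connector"):
        problems.append("no connector named")
    if row.get("Policy") != expected_policy:
        problems.append("unexpected policy")
    if row.get("ConnectionSetupTime", float("inf")) > max_setup_ms:
        problems.append("setup time too high")
    return problems


healthy = {
    "ConnectionStatus": "Open",
    "Policy": "Allow-SAP-Users",
    "Connector": "connector-mumbai-01",
    "ServerIP": "10.50.7.40",
    "ConnectionSetupTime": 38,
}
print(verify_fix(healthy, expected_policy="Allow-SAP-Users"))  # []
```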
Field → layer cheat (memorise this)

No row → L1 Client. Empty Connector + Close → L3 connector / binding. Open + high ConnectionSetupTime → L4 reachability (DNS / last-mile). Open + app-side TLS fail → L4 clientless / cert. Empty Policy + Close → L5 policy / identity.

Quick check · Q4 of 10 · Analyze

During the runbook for ONE user, L1 passes (tunnel up), L2/L3 pass (connector named + green), but dig on the connector host fails to resolve the app. Where do you stop, and what's the fix path?

Answer: stop at L4 and follow the DNS & App Discovery path. The stop rule says: halt at the first failed check. L1, L2, L3 all passed, so the broken dig on the connector host is the L4 failure, and you don't continue to L5. DNS resolution from the connector is the L4 signal, often a permissions issue on /etc/resolv.conf for the zscaler account.

Symptom → layer → lesson (the router in flip form)

No log at all

Layer: L1 Client. The request never entered ZPA (tunnel down / service off). Lesson: ZCC Troubleshooting. Check ZCC > Connection Status first.

Empty Connector

Layer: L3 binding. Server group → connector group eligibility rejected all. Lesson: Groups & Service Edge / App Connector Troubleshooting.

Named but slow

Layer: L4 App/Net. High setup time / dig fails on the host. Lesson: DNS & App Discovery · Performance & MTU. Clientless TLS = Browser Access / PRA.

Empty Policy + Close

Layer: L5 Policy. No rule matched, or identity (SCIM/SAML) miss. Lesson: Access Policy · App Segments. Run policy simulation.

Layer → its tool → its lesson

LAYER               DIAGNOSTIC TOOL              DEEP-DIVE LESSON
L1 · Client (ZCC)   ZCC > Connection Status      ZCC Troubleshooting
L2 · Edge / Broker  Portal Diagnostics + ZCC     Private Service Edge
L3 · Connector      journalctl -u zpa-connector  App Connector T/S
L4 · App / Network  dig / curl / nc on host      DNS & App Discovery
L5 · Policy         Diagnostics Policy field     Access Policy
The whole hub on one strip: pick the layer you isolated, see the exact tool that tested it, and the deep-dive lesson that fixes it. Find the layer, find the fix.

One-glance cheat-sheet — symptom → first check → lesson

App down? Name the layer from one signal, then open its lesson.

No Diagnostics row → L1 · ZCC Connection Status (tunnel up? service ON?) → ZCC Troubleshooting
Whole office hit → L2 · Portal Diagnostics (near, healthy edge?) → Private Service Edge
Empty Connector → L3 · Portal health + journalctl (binding rejected all) → App Connector / Groups
Named but slow → L4 · dig / nc on connector (resolve + reach server?) → DNS & App Discovery
Empty Policy + Close → L5 · Policy field / simulation (no rule / SCIM / SAML) → Access Policy
Clientless TLS fail → L4 · green ZPA, app errors (cert chain / pinning) → Browser Access & PRA

Golden rule: a healthy connector proves ONE layer (L3), not the chain. Name the layer from one signal, open its lesson, then verify the row reads Open + named connector + right Policy.
Screenshot this. Six symptoms, one signal each, and the exact lesson to open — the fastest path from "app down" to the right layer.


📝 Wrap-up — six more


Q5 · Apply

One user reports an app is down. You open Diagnostics → User Activity and there is no row at all for that user and host. Which layer, and what do you do first?

Answer: L1 Client; check ZCC > More > Connection Status first. If ZPA logged nothing, the traffic never entered ZPA, and that is the L1 Client tell. The tunnel is down, the ZPA service is off, or the FQDN isn't being steered to ZPA. Don't open the policy or connector consoles for a request that never arrived.
Q6 · Apply

The same app goes down for the whole company at 9am after a config change. Which isolation direction do you run, and where do you start?

Answer: bottom-up, starting at L5 Policy and L4 App/Network. A fleet-wide outage is almost never on every individual laptop (L1). The blast radius points at a shared object, policy (L5) or app/network (L4), so you isolate bottom-up. Top-down is for the one-off user; bottom-up is for the crowd.
Q7 · Analyze

Diagnostics shows Open, a named connector, no errors — but ConnectionSetupTime = 3,200 ms and the app is sluggish/timing out. Which layer, and which lesson?

Answer: L4; open DNS & App Discovery / Performance & MTU. Open + named connector clears L1, L2, L3 and L5 (a rule allowed it, a connector was picked). A high ConnectionSetupTime is the L4 signal: that connector is struggling to resolve or reach the server. Confirm with dig / nc on the connector host.
Q8 · Analyze

A ticket shows an empty Connector field with Close. Policy isn't the issue (a rule matched on a working app for the same user). Which layer, and which lesson?

Answer: the L3/L4 binding; open Groups & Service Edge / App Connector Troubleshooting. An empty Connector means ZPA never picked one: eligibility (group / location / health) filtered all candidates. That is a binding fault between the server group and connector group, not a connector-to-app reachability problem (which would show a named connector with high setup time).
Q9 · Evaluate

Two engineers debate triage style: (X) "memorise every subsystem and intuit the cause," or (Y) "isolate which of the 5 layers failed with one signal each, then open that layer's lesson." Which scales better across a SOC, and why?

Answer: (Y). Intuition doesn't transfer across a team and fails under pressure. A layer-isolation method is teachable, consistent, and evidence-led: anyone can run it, handoffs are clean, and deep knowledge is pulled in only for the one layer that failed. That is exactly what a triage hub plus per-layer lessons provides.
Q10 · Evaluate

A teammate says: "the App Connector is green, so the connector is definitely fine — stop looking at it and just reboot the user's laptop." Sound or not?

Answer: only half sound. "Connector green" is genuine evidence that L3 is healthy, so don't waste time on connector health. But it says nothing about the client tunnel (L1), connector-to-server reachability (L4), or policy (L5). Rebooting the laptop is a blind guess; the disciplined move is to read the Diagnostics field and isolate the actual layer.

✍️ Explain it back (2 lines)

In your own words: what is the 5-layer method, and why does a green App Connector narrow the fault instead of closing the ticket? Type it, then reveal the expert version.

Expert version: Don't guess the cause — isolate which of five layers failed: L1 Client (ZCC), L2 Edge/Broker, L3 Connector, L4 App/Network, L5 Policy. Each has one signal and one diagnostic tool. A green connector proves only L3, so it narrows the fault to L1, L4, or L5 — you read the Diagnostics field to name the layer, then open that layer's deep-dive lesson and apply its fix.

🧑‍🏫 Teach a friend

A one-liner you can paste to a teammate who's stuck:

"ZPA triage = find the LAYER, not the cause. Check L1 Client (ZCC tunnel), L2 Edge (broker), L3 Connector (journalctl + Portal health), L4 App/Net (dig/nc on the connector), L5 Policy (Diagnostics Policy field). No log = L1. Empty Connector = L3. Named-but-slow = L4. Empty Policy + Close = L5. Green connector only proves L3 — so look at L1/L4/L5. Then open that layer's lesson."



📚 Sources

  1. Zscaler Help — Troubleshooting App Connectors & App Connector Managing & Troubleshooting (process health, journalctl -u zpa-connector, fohh_connection_connected). help.zscaler.com/zpa
  2. Zscaler Help — Accessing App Connector Status Diagnostics, About App Connector Status Log Fields, Accessing User Activity / User Status Diagnostics. help.zscaler.com/zpa
  3. Zscaler Help — Troubleshooting Private Service Edges & Understanding Private Access Session Status Codes (the L2 Edge/Broker layer). help.zscaler.com/zpa
  4. Zscaler Help — ZPA App Connector Troubleshooting Runbook (DNS resolution, /etc/resolv.conf permissions for the zscaler user). help.zscaler.com/troubleshooting-runbooks
  5. Zscaler Help — Configuring Access Policies & Enabling SCIM for Identity Management (the L5 Policy + identity layer). help.zscaler.com/zpa
  6. Zscaler Academy — ZDTA (Digital Transformation Administrator) blueprint — Identity, Connectivity, Access Control service layers.

What's next?

This hub taught the method. Go deeper per layer using the router above, or read the sister lesson that organises the same fixes by symptom bucket (policy, mapping, certs, steering) — both reach the same lessons from different angles.

— Techclick Team