The same app goes down for the whole company at 9am after a config change. Which isolation direction do you run, and where do you start?

Correct: b. A fleet-wide outage is almost never on every individual laptop (L1). The blast radius points at a shared object — policy (L5) or app/network (L4) — so you isolate bottom-up. Top-down is for the one-off user; bottom-up is for the crowd.

Diagnostics shows Open, a named connector, and no errors, but ConnectionSetupTime = 3,200 ms and the app is sluggish or timing out. Which layer, and which lesson?

Correct: a. Open + named connector clears L1, L2, L3 and L5 (a rule allowed it, a connector was picked). A high ConnectionSetupTime is the L4 signal: that connector is struggling to resolve or reach the server. Confirm with dig / nc on the connector host.

A ticket shows an empty Connector field with Close. Policy isn't the issue (a rule matched on a working app for the same user). Which layer, and which lesson?

Correct: c. An empty Connector means ZPA never picked one — eligibility (group / location / health) filtered all candidates. That is an L3/L4 binding fault between the server group and connector group, not a connector-to-app reachability problem (which would show a named connector with high setup time).

Two engineers debate triage style: (X) "memorise every subsystem and intuit the cause," or (Y) "isolate which of the 5 layers failed with one signal each, then open that layer's lesson." Which scales better across a SOC, and why?

Correct: d. Intuition doesn't transfer across a team and fails under pressure. A layer-isolation method is teachable, consistent, and evidence-led — anyone can run it, handoffs are clean, and deep knowledge is pulled in only for the one layer that failed. That is exactly what a triage hub plus per-layer lessons provides.

A teammate says: "the App Connector is green, so the connector is definitely fine — stop looking at it and just reboot the user's laptop." Sound or not?

Correct: b. "Connector green" is genuine evidence that L3 is healthy — so don't waste time on connector health. But it says nothing about the client tunnel (L1), connector-to-server reachability (L4), or policy (L5). Rebooting the laptop is a blind guess; the disciplined move is to read the Diagnostics field and isolate the actual layer.

ZPA Troubleshooting Playbook - Find the Layer, Find the Fix

Q: One user reports an app is down. You open Diagnostics → User Activity and there is no row at all for that user and host. Which layer, and what do you do first?

Correct: a. If ZPA logged nothing, the traffic never entered ZPA — that is the L1 Client tell. The tunnel is down, the ZPA service is off, or the FQDN isn't being steered to ZPA. Start at ZCC Connection Status; don't open the policy or connector consoles for a request that never arrived.

Start here · understand the lesson before the detail

What you are learning

This lesson is a decision playbook for real ZPA incidents. Begin with the exact session evidence, choose the matching branch, and avoid wasting time on healthy layers.

In plain English

Different symptoms leave different evidence. No discovered app points toward name, segment, and port matching. A policy block points toward identity, posture, and rule order. No connector points toward eligibility and health. An unreachable app points toward the connector-to-app path.

Real example

A user reports 'timeout,' but diagnostics show POLICY_BLOCK. The browser description is vague; the ZPA action is precise. Check the matched user, posture, application, and rule rather than restarting the connector or changing DNS.

Follow this flow

Find the exact request in diagnostics and copy its status and matched objects.
Choose the discovery, policy, connector, app-reachability, or performance branch.
Check only the evidence named by that branch.
Compare with one working case and identify the first difference.
Fix one cause, repeat the original request, and document closure.
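
The "compare with one working case" step can be sketched in a few lines of Python. This is a minimal illustration, assuming each User Activity row has already been exported into a plain dict; the function name `differing_fields`, the field order, and the sample rows are assumptions made for this example, not part of ZPA.

```python
from typing import Dict, List

# Assumed field order: it mirrors the User Activity fields shown in this lesson.
FIELDS = ("ConnectionStatus", "Policy", "Connector", "ServerIP")


def differing_fields(failing: Dict[str, str], working: Dict[str, str]) -> List[str]:
    """Return the names of fields whose values differ, in FIELDS order."""
    return [
        name
        for name in FIELDS
        if failing.get(name, "") != working.get(name, "")
    ]


# Hypothetical rows: one failing request and one working request to compare.
failing_row = {"ConnectionStatus": "Close", "Policy": "", "Connector": "connector-mumbai-01"}
working_row = {"ConnectionStatus": "Open", "Policy": "Allow-SAP-Users", "Connector": "connector-mumbai-01"}

print(differing_fields(failing_row, working_row))  # ['ConnectionStatus', 'Policy']
```

The first name in the returned list is the first difference to investigate; identical connectors in both rows mean the connector is not where the two cases diverge.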

Evidence to collect

No event, discovered app, or matched application segment
Identity, posture, policy rule, and block reason
Eligible connector group, connector ID, and health
Private DNS, route, port, latency, MTU, and app response

Common mistake to avoid

Do not let the user’s phrase 'not working' choose the technical branch. The session event, or the absence of one, is stronger evidence. Preserve it before making a change.

Current official source checkpoint

App Connector troubleshooting runbook: current official reference used for this beginner explanation
ZPA performance runbook: current official reference used for this beginner explanation

Infographic (ChatGPT-generated, beginner level): a ZPA troubleshooting decision tree uses session evidence to choose checks for discovery, policy, connector availability, application reachability, or slowness.

Key terms before you continue

Decision Branch: Next checks selected from the observed evidence.

No Event: Often points before policy, such as discovery or client path.

Block Reason: Recorded explanation for a denied request.

Closure: Successful retest plus saved evidence and documented fix.

Stop fixing ZPA. Start isolating layers.

Most engineers attack a "ZPA app is down" ticket by guessing a subsystem — "must be the connector," "must be policy" — and burn an hour in the wrong console. Wrong move. A ZPA request crosses five layers in order: L1 Client (ZCC) → L2 Edge/Broker (Service Edge) → L3 Connector (App Connector) → L4 App/Network (DNS + last-mile) → L5 Policy (Access Policy + identity). The app is down because exactly one of those layers failed. Find the layer first; the fix is then a short walk to the right lesson.

Think of a power cut at home. You don't rewire the whole house — you check the bulb, then the switch, then the fuse, then the meter, then the street line, in order, and stop at the first thing that's dead. ZPA triage is the same disciplined five checks. This blog is a triage hub, not another deep-dive: it teaches the isolation method and then hands you off to the lesson that owns each layer. That's the aha-moment: "connector healthy" is data about ONE layer (L3), not the whole chain — so a green connector actually narrows the fault to L1, L4, or L5.

👉 So far: a ZPA request crosses 5 layers; the app is down because exactly one failed. Find the layer, then open the lesson for it. Next: the layered map with the diagnostic tool labelled at each layer.

[Diagram: the five layers (L1–L5), each labelled with its diagnostic tool and check detail. Exactly one layer failed; the layers are checked in power-cut order.]

The five layers, in 30 seconds each

L1 · Client (ZCC)

Is ZCC enrolled, the ZPA service ON, and the tunnel up? If not, traffic never enters ZPA. Tool: ZCC > More > Connection Status / Export Logs. Lesson: ZCC Troubleshooting.

L2 · Edge / Broker

Did the device reach a healthy, nearby Public/Private Service Edge (the broker)? Tool: Portal Diagnostics + ZCC ZPA status. Lesson: ZPA vs VPN / Private Service Edge.

L3 · Connector

Is a healthy connector serving THIS app (right version, not at capacity)? Tool: journalctl -u zpa-connector + Portal health. Lesson: App Connector Troubleshooting / Deploy.

L4/L5 · App + Policy

L4: can the connector resolve + TCP-reach the server (dig/nc)? L5: does a rule ALLOW it (Diagnostics policy result)? Lessons: DNS & App Discovery · Access Policy.

① Isolate the layer — the 5-layer method

This is the whole game. Instead of asking "what's broken?", ask "which layer is broken?" and check each one with a single yes/no signal. The instant a layer answers "no", you've found your bucket — and the lesson that fixes it. Two ways to walk the layers:

Top-down (L1 → L5) when one user or one device is affected — start at their laptop and move outward. Bottom-up (L5 / L4 → L1) when the same app is down for everyone — a shared fault almost always lives in policy or the app/network layer, so don't waste time on individual clients. And remember the shortcut: a green connector proves L3, so it tells you to look at L1, L4, or L5.
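
The "one yes/no signal per layer, stop at the first no" idea can be sketched as a small Python loop. This is only an illustration under stated assumptions: the check functions here are hypothetical stand-ins (in practice each would wrap a real observation such as ZCC status or a dig result), and `first_failed_layer` is a name invented for this example.

```python
from typing import Callable, Dict, Optional, Sequence

TOP_DOWN = ("L1", "L2", "L3", "L4", "L5")   # one user or one device
BOTTOM_UP = ("L5", "L4", "L3", "L2", "L1")  # same app down for everyone


def first_failed_layer(
    checks: Dict[str, Callable[[], bool]],
    order: Sequence[str],
) -> Optional[str]:
    """Run each layer's yes/no check in order and stop at the first 'no'."""
    for layer in order:
        if not checks[layer]():
            return layer  # later checks are never executed
    return None


# Hypothetical results for a single-user ticket: every layer passes except policy.
example_checks = {
    "L1": lambda: True,
    "L2": lambda: True,
    "L3": lambda: True,
    "L4": lambda: True,
    "L5": lambda: False,
}

print(first_failed_layer(example_checks, TOP_DOWN))   # L5
print(first_failed_layer(example_checks, BOTTOM_UP))  # L5
```

Both directions find the same layer; the direction only changes how many checks run before the failing one is reached.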

The 5-layer isolation walk

Sneha at Infosys reports SAP is down. Each layer is listed with its check; ⚠ marks the layer where this ticket actually fails.

L1 · CLIENT (ZCC): Is ZCC enrolled, ZPA service ON, Z-Tunnel up?
✅ Tool: ZCC > More > Connection Status. Sneha's tunnel is up, so L1 passes.

L2 · EDGE / BROKER: Did she reach a healthy, nearby Service Edge (the broker)?
✅ Tool: Portal Diagnostics + ZCC ZPA status. Brokered fine, so L2 passes.

L3 · CONNECTOR: Is a healthy App Connector serving SAP (right version, not at capacity)?
✅ Tool: journalctl -u zpa-connector + Portal health. Connector green, so L3 passes.

L4 · APP / NETWORK: From the connector host, can it dig + nc the SAP server?
✅ Tool: dig sap.corp.local → 10.50.7.40, nc -vz 10.50.7.40 443 open. L4 passes.

L5 · POLICY: Does an Access Policy rule actually ALLOW Sneha to SAP?
⚠ HERE. Diagnostics shows empty Policy + Close: the ALLOW rule references a SCIM group that was renamed.

FIX → LESSON: Layer found = L5 → open the Access Policy lesson, repoint the rule to the new group name. ✅

The first "no" is your bucket, and the lesson that fixes it.

Worked triage · top-down · one user

"Sneha at Infosys says SAP is down" — walk L1 → L5

Symptom: One user, one app. Sneha can't open sap.corp.local. Connector looks green.

L1 Client: ZCC enrolled, ZPA ON, tunnel up? Yes, Connection Status is green. Pass.

L2 Edge: Reached a healthy nearby Service Edge? Yes, Portal Diagnostics shows a brokered session. Pass.

L3 Connector: Healthy connector serving SAP? Yes: green, current version, low load. Pass (so the fault is NOT the connector).

L4 App/Net: From the connector host, dig sap.corp.local → 10.50.7.40, nc -vz 10.50.7.40 443 open. Pass.

L5 Policy: FAIL. Diagnostics shows empty Policy + Close: the ALLOW rule references SCIM group SAP-Users, but identity renamed it to SAP_Users. Fix = repoint the rule → app back. Lesson: Access Policy.

L5 signal — Diagnostics → User Activity (the layer names itself)

Username        : sneha.r@infosys.com
Host            : sap.corp.local
ConnectionStatus: Close
Policy          : (empty)          <- no rule matched = L5 miss
Connector       : connector-mumbai-01   <- named => L3/L4 were fine

After repointing the rule to SAP_Users

ConnectionStatus: Open
Policy          : "Allow-SAP-Users"
ServerIP        : 10.50.7.40
ConnectionSetupTime: 38 ms

Pause & Predict

Same SAP outage, but now the whole office reports it at 9am. Do you still walk top-down from Sneha's laptop?

No — flip to bottom-up. When everyone is hit at once, the fault is almost never on every individual laptop (L1). Start at L5 Policy and L4 App/Network, where a single shared change (a renamed group, a server group rebinding, a DNS outage) breaks all users simultaneously. Top-down is for the one-off; bottom-up is for the blast radius.

Method · top-down vs bottom-up

Pick the isolation direction from the blast radius

One user: Walk top-down L1 → L5: their ZCC, their broker session, the connector, the app reachability, then policy. The fault is usually personal (tunnel down, attribute missing).

One office: Walk bottom-up but start at L2 Edge. A single sick Service Edge or a site-wide forwarding change hits everyone in one location.

Everyone: Walk bottom-up L5 → L4: a policy edit, a renamed identity group, a server-group rebinding, or DNS (the shared objects). Never start at 5,000 laptops.

Shortcut: Connector green = L3 proven. Cross it off and look at L1 / L4 / L5 only.

Lesson: Once you know the layer, open its deep-dive (see the router table below) instead of re-deriving the fix here.
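
The scope-to-starting-layer choice above is simple enough to write down as a lookup. The sketch below is an assumption-laden illustration: the scope keys (`one_user`, `one_office`, `everyone`) and the function name `triage_plan` are invented for this example, and only the starting layers named in the card are encoded.

```python
from typing import Dict, List

# Starting layers per blast radius, as listed in the card above.
START_LAYERS: Dict[str, List[str]] = {
    "one_user": ["L1"],
    "one_office": ["L2"],
    "everyone": ["L5", "L4"],
}


def triage_plan(scope: str, connector_green: bool = False) -> Dict[str, List[str]]:
    """Return where to start, plus layers a green connector lets you cross off."""
    if scope not in START_LAYERS:
        raise ValueError(f"unknown scope: {scope!r}")
    return {
        "start_at": list(START_LAYERS[scope]),
        "skip": ["L3"] if connector_green else [],
    }


print(triage_plan("everyone", connector_green=True))
# {'start_at': ['L5', 'L4'], 'skip': ['L3']}
```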

Method · the connector-green rule

"Connector is green but the app is still down"

Symptom: Portal connector health = Connected, yet users can't open the app.

What it proves: ONLY that L3 is healthy: the connector reached the broker and is alive. It says nothing about L1, L4, or L5.

So check: L1 (is the user's tunnel actually up?), L4 (can the connector resolve + reach this server, or did DNS/last-mile break?), L5 (does a rule allow it?).

Trap: A green connector in the wrong server/connector group still can't serve the app. That's an L3/L4 binding fault, not a connector-health fault. The light is green; the wiring is wrong.

Lesson: App Connector Troubleshooting (L3) for health; DNS & App Discovery (L4) for reachability.

Quick check · Q1 of 10 · Remember

In the 5-layer method, what does a green (Connected) App Connector prove?

a) Only that Layer 3 is healthy; the fault, if any, is at L1, L4, or L5
b) That the app will definitely open
c) That the access policy allows the user
d) That DNS on the server is resolving

Correct: a. "Connected" is a fact about ONE layer (L3) — the connector reached the broker and is alive. It says nothing about the client tunnel (L1), the connector-to-server resolve/reach (L4), or whether policy allows it (L5). A green connector narrows the fault; it never closes the ticket.

👉 So far: name the layer with one signal each, choosing top-down (one user) or bottom-up (everyone). Next: the Diagnostics page that names the layer for you.

② The Diagnostics page tour — let the field name the layer

You don't have to guess the layer — ZPA's own Diagnostics page tells you. Open Analytics → Diagnostics → User Activity for the failing request and read which field is wrong. Each field maps to a layer. This is the single most powerful skill in the whole playbook: the field that's wrong is the layer.

https://admin.private.zscaler.com · Analytics ▸ Diagnostics ▸ User Activity

Diagnostics ▸ User Activity ▸ sneha.r@infosys.com → sap.corp.local

One row per request. Read down the five layers: the first field that is not green is the layer that failed.

L1 · Client (ZCC) status	Authenticated ✓
L2 · Service Edge / broker	Brokered ✓ (Mumbai ZEN)
L3 · App Connector	Healthy ✓ (172.16.8.21)
L4 · App reachability / DNS	Resolved + reachable ✓
L5 · Policy result	BLOCKED ✗ (no matching ALLOW rule)

L1–L4 are all green, so the connector is fine — the fault is L5 Policy. A SCIM group was renamed, so Sneha's ALLOW rule no longer matches. Go straight to the Access Policy lesson; never touch the connector.

🖥️ Recreated for clarity — your ZPA console matches this. Path: Analytics ▸ Diagnostics ▸ User Activity. Read the layers top-down; the first non-green field names the broken layer.

Diagnostics field → layer

Read User Activity top-down: four fields, four layers

Empty Policy + Close: No rule matched → L5 Policy (missing ALLOW, or an identity/SCIM/SAML miss). Open the Access Policy lesson.

Empty Connector: ZPA never picked a connector → L3/L4 binding (server group → connector group eligibility rejected all). Open Groups & Service Edge / App Connector lessons.

Named + slow: A named connector with a high ConnectionSetupTime → L4 reachability (connector can't promptly reach the server: DNS or last-mile). Open DNS & App Discovery / Performance & MTU.

Open + app-side fail: ZPA is clean but the app errors → L4/clientless (cert pinning, Browser Access TLS). Open Browser Access / PRA.

Where to look: User Activity = per-request (status, policy, connector, timing). User Status / Authentication = the SAML/SCIM the IdP actually sent (L5 identity).
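
The field-to-layer mapping above, together with the "no row at all means L1" tell described elsewhere in this lesson, can be sketched as one small function. Treat it as a rough illustration, assuming the row has been parsed into a dict with `ConnectionSetupTime` as an integer in milliseconds. The 1,000 ms threshold, the function name `layer_from_row`, and the policy name `Allow-App` are assumptions for this example; the lesson itself only shows 38 ms as healthy and roughly 3,000 ms as slow.

```python
from typing import Optional

SLOW_SETUP_MS = 1000  # assumed threshold; tune it to your own baseline


def layer_from_row(row: Optional[dict], app_side_error: bool = False) -> str:
    """Map one User Activity row (or its absence) to the suspect layer."""
    if row is None:
        return "L1"  # nothing logged: the request never entered ZPA
    status = row.get("ConnectionStatus", "")
    if status == "Close" and not row.get("Policy"):
        return "L5"  # empty Policy + Close: no rule matched
    if not row.get("Connector"):
        return "L3/L4 binding"  # ZPA never picked a connector
    if row.get("ConnectionSetupTime", 0) > SLOW_SETUP_MS:
        return "L4"  # named connector, slow to resolve or reach the server
    if status == "Open" and app_side_error:
        return "L4/clientless"  # ZPA is clean but the app itself errors
    return "no fault visible in this row"


sneha = {"ConnectionStatus": "Close", "Policy": "", "Connector": "connector-mumbai-01"}
pune = {
    "ConnectionStatus": "Open",
    "Policy": "Allow-App",  # hypothetical policy name
    "Connector": "connector-pune-02",
    "ConnectionSetupTime": 2900,
}

print(layer_from_row(sneha))  # L5
print(layer_from_row(pune))   # L4
print(layer_from_row(None))   # L1
```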

Diagnostics · L1 + L3 signals

Two more diagnostics you'll reach for first

L1 Client: On the laptop: ZCC > More > Connection Status (ZPA service = ON? tunnel up?) and Export Logs to attach to the ticket. If the tunnel is down, no Diagnostics row will ever appear, and that itself is the L1 tell.

L2 Edge: Portal Administration → Diagnostics shows the broker/Service Edge the session used; ZCC's ZPA status confirms a nearby, healthy edge was reached.

L3 Connector: Portal connector health for "Connected", AND on the connector host journalctl -u zpa-connector -f. A healthy connector runs a parent + child process; parent-only means it's unhealthy.

Pro tip: journalctl -u zpa-connector -n 10000 | grep fohh_connection_connected proves the connector↔Service Edge channel (L2/L3) is up, with rx/tx byte counters.

Retention: The admin console keeps roughly 14 days of activity; for longer history stream LSS to a SIEM (Splunk, etc.).

Diagnostics · L4 from the connector host

Prove (or disprove) app reachability in three commands

Symptom: Diagnostics names a connector but the app still won't open, or setup time is high.

Resolve: dig app.corp.local on the connector host. Does it return the right server IP? A DNS failure here is often a permissions issue on /etc/resolv.conf for the zscaler user.

Reach: nc -vz 10.50.7.40 443 (or curl -vk https://10.50.7.40). Can the connector open TCP to the server's port?

Read: Resolve fails → L4 DNS (DNS & App Discovery lesson). Reach fails → L4 last-mile/firewall. Both pass but app still down → step back to L5 Policy.

Lesson: DNS & App Discovery (resolution) and Performance & MTU (transport) own this layer.
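
If you would rather script the resolve-then-reach check than type dig and nc by hand, the same logic can be approximated with Python's standard socket module. This is a sketch, not a replacement for the commands above: `socket.getaddrinfo` goes through the system resolver (so /etc/hosts and nsswitch settings apply, unlike dig), and the function names `resolve`, `can_connect`, and `l4_verdict` are invented for this example.

```python
import socket
from typing import List


def resolve(hostname: str) -> List[str]:
    """Return IPv4 addresses from the system resolver, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(
            hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(ip: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to ip:port opens within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def l4_verdict(hostname: str, port: int) -> str:
    """Apply the Resolve / Reach / Read steps and name the outcome."""
    ips = resolve(hostname)
    if not ips:
        return "resolve failed: L4 DNS"
    if not any(can_connect(ip, port) for ip in ips):
        return "reach failed: L4 last-mile/firewall"
    return "L4 passes: step back to L5 Policy"


# Example call, to be run on the connector host itself:
#   print(l4_verdict("app.corp.local", 443))
```

Run it on the connector host, not on your own workstation, because the point of the check is what the connector can resolve and reach.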

Practice the layers in the lab: 🔗 App Connector Simulator (L3) 🛠 ZPA Troubleshooting Sim

Quick check · Q2 of 10 · Analyze

User Activity log shows a named Connector but a high ConnectionSetupTime and the app is slow/timing out. Which layer is the fault, and which lesson do you open?

a) L1 Client: re-enroll ZCC
b) L4 App/Network: the connector was chosen but is slow to resolve/reach the server; open DNS & App Discovery / Performance & MTU
c) L5 Policy: the rule is wrong
d) L2 Edge: the broker is down

Correct: b. A named connector means ZPA selected one (L3 binding was fine) — so the remaining suspect is L4: that connector resolving and reaching the server. High ConnectionSetupTime is the L4 reachability signal. Empty Connector would be the L3/L4 binding case instead.

Diagnostics · worked read

One User Activity row, read field by field

Row: ConnectionStatus=Open · Connector=connector-pune-02 · ConnectionSetupTime=2900 ms · ServerIP=10.50.9.12

L5? Policy isn't empty and status is Open → a rule allowed it. L5 is fine.

L3? A connector is named → ZPA picked one. The binding (L3) is fine.

L4! Setup time of 2,900 ms is the smell: the named connector is struggling to resolve or reach 10.50.9.12. SSH to connector-pune-02 and run dig + nc. Lesson: DNS & App Discovery / Performance & MTU.

Takeaway: You never opened the policy console or the segment config. The fields told you it was L4 in ten seconds.

Pause & Predict

A ticket says "app down". You open Diagnostics → User Activity and there is no row at all for that user and host. Which layer does the absence of a row point to?

L1 Client. If ZPA never logged the request, the traffic never reached ZPA — the ZCC tunnel is down, the ZPA service is off, or the FQDN isn't being steered to ZPA at all. Check ZCC > More > Connection Status first. A missing row is itself a signal: the request died before L2. Open the ZCC Troubleshooting lesson.

③ The symptom → layer → lesson router

This is the heart of the hub. Match the symptom on the left to its likely layer, then open the deep-dive lesson that owns it. Don't re-learn the subsystem here — this table is the map; the lessons are the territory.

L1 · Client (ZCC)

"No ZPA at all" / nothing is logged

Symptom: No Diagnostics row; ZCC shows ZPA off / tunnel down / re-auth loop; only this user.

Likely layer: L1. The request never entered ZPA.

Go to lesson → ZCC Troubleshooting (enrollment, tunnel, service state, log export).

L2 · Edge / Broker

Everyone in one office is slow or down

Symptom: A whole site degrades; sessions route to a distant or unhealthy Service Edge.

Likely layer: L2. The broker/Service Edge the device reached.

Go to lesson → ZPA vs VPN & Private Service Edge (broker model, on-prem Private Service Edge).

L3 · Connector

Connector unhealthy, wrong version, or none serving the app

Symptom: Portal health red/flapping; empty Connector field; journalctl shows enrollment/handshake errors; parent-only process.

Likely layer: L3. Connector health, or the server-group → connector-group binding.

Go to lesson → App Connector Troubleshooting · App Connector Deploy · Groups & Service Edge (the L3/L4 binding).

L4 · App / Network

Named connector but slow / can't reach the server

Symptom: High ConnectionSetupTime; dig fails on the connector host; works on 443 but not 8443; pinned-cert / clientless TLS fail with green ZPA.

Likely layer: L4. DNS resolution, last-mile reachability, transport/MTU, or the app-segment object chain.

Go to lesson → DNS & App Discovery · Performance & MTU · App Segments · Browser Access & PRA (clientless / cert).

L5 · Policy

Brokered fine, reachable — but denied or unmatched

Symptom: Empty Policy + Close; a renamed/un-synced SCIM group; a SAML attribute the IdP isn't sending; posture/timeout blocking.

Likely layer: L5. Access Policy order, or the identity (SAML/SCIM) feeding it.

Go to lesson → Access Policy (rule order, criteria, identity) · App Segments (the object the rule targets).

The router in one line

No log → L1. Whole office → L2. Empty Connector / red health → L3. Named-but-slow / dig fails / clientless TLS → L4. Empty Policy + Close → L5. Pick the row, open the lesson — that's the whole hub.
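
The router can also be kept as a lookup table, for example in a ticketing helper. The sketch below is illustrative only: the symptom keys (`no_log`, `whole_office`, and so on) are hypothetical labels chosen for this example, while the layers and lesson names are the ones listed in the router rows above.

```python
from typing import Dict, List, Tuple

# Hypothetical symptom keys mapped to the router's layers.
SYMPTOM_TO_LAYER: Dict[str, str] = {
    "no_log": "L1",
    "whole_office": "L2",
    "empty_connector": "L3",
    "red_health": "L3",
    "named_but_slow": "L4",
    "dig_fails": "L4",
    "clientless_tls": "L4",
    "empty_policy_close": "L5",
}

# Lessons owned by each layer, copied from the router rows.
LESSONS_BY_LAYER: Dict[str, List[str]] = {
    "L1": ["ZCC Troubleshooting"],
    "L2": ["ZPA vs VPN & Private Service Edge"],
    "L3": ["App Connector Troubleshooting", "App Connector Deploy", "Groups & Service Edge"],
    "L4": ["DNS & App Discovery", "Performance & MTU", "App Segments", "Browser Access & PRA"],
    "L5": ["Access Policy", "App Segments"],
}


def route(symptom: str) -> Tuple[str, List[str]]:
    """Return (layer, lessons) for a symptom key, or raise KeyError if unknown."""
    layer = SYMPTOM_TO_LAYER[symptom]
    return layer, LESSONS_BY_LAYER[layer]


print(route("empty_policy_close"))  # ('L5', ['Access Policy', 'App Segments'])
```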

Pause & Predict

A clientless (Browser Access) user gets a TLS error, but ZPA Diagnostics shows a clean Open with normal setup time. Which layer, and which lesson?

L4 / clientless — not L5, not L3. "ZPA looks fine but the app fails on TLS" is the classic signature of a cert/pinning problem on the app-or-clientless path. ZPA succeeded (Open), so policy (L5) and the connector (L3) are exonerated. Open the Browser Access & PRA lesson for the cert chain, and disable pinning or bypass ZPA for a pinned native app.

Router · worked

Use the table on a live ticket

Ticket A: "App down, no Diagnostics row." → No log = L1 → open ZCC Troubleshooting. (Found it: ZPA service was toggled off on the laptop.)

Ticket B: "Empty Connector field." → L3 binding → open Groups & Service Edge. (Server group pointed at a disabled connector group.)

Ticket C: "Open + 3s setup, dig fails on host." → L4 DNS → open DNS & App Discovery. (resolv.conf perms for the zscaler user.)

Ticket D: "Empty Policy + Close." → L5 → open Access Policy. (ALLOW rule referenced a renamed SCIM group.)

Takeaway: Four tickets, four layers, four lessons, zero guessing.

Quick check · Q3 of 10 · Apply

A ticket shows a clean Open session, a named connector, but ConnectionSetupTime = 3,100 ms and dig fails on the connector host. Which router row do you follow?

a) L4 App/Network: DNS resolution from the connector; open DNS & App Discovery (check /etc/resolv.conf perms for the zscaler user)
b) L1 Client: re-enroll ZCC
c) L5 Policy: rewrite the access rule
d) L2 Edge: fail over the Service Edge

Correct: a. Open + named connector clears L1, L2, L3 and L5. A high setup time plus a failed dig on the connector host is the L4 DNS signature — frequently a file-permission issue on /etc/resolv.conf for the zscaler service account. Router row L4 → DNS & App Discovery.

👉 So far: every symptom now routes to a layer and a specific lesson. Next: turn the method into a timed 10-minute runbook you can run on any ticket.

④ The 10-minute triage runbook

Put it all together as a timed, repeatable runbook. Run it top-down for one user, bottom-up for a crowd. Every box has a tool and a clock; the moment a check fails, you have your layer and your lesson — stop and fix.

Runbook · minute 0–3

Scope, then check the client (L1)

0–2: Is it one user or many? Pull the failing Diagnostics → User Activity row. The scope chooses your direction; the row chooses your layer.

2–3 (L1): On the laptop: ZCC > More > Connection Status. ZPA ON, tunnel up? No Diagnostics row at all = L1 → ZCC Troubleshooting; export logs.

Stop rule: If L1 fails, you're done. Fix the client; don't open the policy console.

Runbook · minute 3–8

Edge + connector (L2/L3), then reachability (L4)

3–5 (L2/L3): Did the session reach a healthy nearby Service Edge (Portal Diagnostics)? Is the connector Connected (journalctl -u zpa-connector, parent+child)? Empty Connector = L3 binding → Groups & Service Edge.

5–8 (L4): SSH to the named connector: dig app.corp.local then nc -vz <ip> 443. Named-but-slow / dig fails = L4 → DNS & App Discovery or Performance & MTU.

Stop rule: First failure ends the run. Note the layer and open its lesson.

Runbook · minute 8–10 + verify

Policy + identity (L5), then prove the fix

8–10 (L5): If every earlier check passed, the fault is policy: empty Policy + Close = L5. Run Access Policy simulation; check User Status / Authentication for the SAML/SCIM the IdP actually sent. → Access Policy lesson.

Verify: Re-run the request. The SAME Diagnostics row must now read Open, a named connector, low ConnectionSetupTime, and the correct Policy.

Close: Record which layer it was. Over a month, your layer histogram tells you where to invest (a flood of L1s means a ZCC rollout problem, not a connector problem).
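
The layer histogram mentioned in the Close step needs nothing more than a counter. A minimal sketch, assuming each closed ticket is recorded as a dict with a `layer` key; the ticket IDs and the record shape are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def layer_histogram(closed_tickets: Iterable[Dict[str, str]]) -> Counter:
    """Count how many closed tickets were attributed to each layer."""
    return Counter(ticket["layer"] for ticket in closed_tickets)


# Hypothetical month of closed tickets.
tickets = [
    {"id": "T-101", "layer": "L1"},
    {"id": "T-102", "layer": "L5"},
    {"id": "T-103", "layer": "L1"},
    {"id": "T-104", "layer": "L4"},
    {"id": "T-105", "layer": "L1"},
]

histogram = layer_histogram(tickets)
for layer, count in histogram.most_common():
    print(f"{layer}: {count}")
```

With this sample data L1 leads with three tickets, which is the "flood of L1s" pattern that points at the ZCC rollout instead of the connectors.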

LSS User Activity query (Splunk) — the one-shot triage row

index=zpa sourcetype=zpa_user_activity
  Username="sneha.r@infosys.com" Host="sap.corp.local"
| table LogTimestamp, ConnectionStatus, Policy,
        Connector, ServerIP, ConnectionSetupTime, DoubleEncryption

Verify — a healthy row after the fix

ConnectionStatus  : Open
Policy            : Allow-SAP-Users
Connector         : connector-mumbai-01
ServerIP          : 10.50.7.40
ConnectionSetupTime: 38 ms
DoubleEncryption  : false
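
The Verify step can be turned into an explicit check so that "healthy" means the same thing on every ticket. The sketch below assumes the row is a dict with `ConnectionSetupTime` parsed to integer milliseconds; the function name `verify_fix` and the 1,000 ms ceiling are assumptions for this example, not values from Zscaler.

```python
from typing import List


def verify_fix(row: dict, expected_policy: str, max_setup_ms: int = 1000) -> List[str]:
    """Return a list of problems; an empty list means the row looks healthy."""
    problems = []
    if row.get("ConnectionStatus") != "Open":
        problems.append("ConnectionStatus is not Open")
    if not row.get("Connector"):
        problems.append("no connector named")
    if row.get("Policy") != expected_policy:
        problems.append("unexpected Policy")
    setup = row.get("ConnectionSetupTime")
    if setup is None or setup > max_setup_ms:
        problems.append("ConnectionSetupTime missing or too high")
    return problems


# The healthy row shown above, as a dict.
healthy_row = {
    "ConnectionStatus": "Open",
    "Policy": "Allow-SAP-Users",
    "Connector": "connector-mumbai-01",
    "ServerIP": "10.50.7.40",
    "ConnectionSetupTime": 38,
}

print(verify_fix(healthy_row, "Allow-SAP-Users"))  # []
```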

Field → layer cheat (memorise this)

No row → L1 Client. Empty Connector + Close → L3 connector / binding. Open + high ConnectionSetupTime → L4 reachability (DNS / last-mile). Open + app-side TLS fail → L4 clientless / cert. Empty Policy + Close → L5 policy / identity.

Quick check · Q4 of 10 · Analyze

During the runbook for ONE user, L1 passes (tunnel up), L2/L3 pass (connector named + green), but dig on the connector host fails to resolve the app. Where do you stop, and what's the fix path?

a) Keep going to L5 and rewrite the access policy
b) Re-enroll ZCC (L1) again
c) Stop at L4 App/Network: DNS from the connector is broken (check /etc/resolv.conf perms for the zscaler user); open DNS & App Discovery
d) Fail over to a different Service Edge (L2)

Correct: c. The stop rule says: halt at the first failed check. L1, L2, L3 all passed, so the broken dig on the connector host is the L4 failure — you don't continue to L5. DNS resolution from the connector is the L4 signal, often a permissions issue on /etc/resolv.conf for the zscaler account.

Symptom → layer → lesson (the router in flip form)

No log at all

Layer: L1 Client. The request never entered ZPA (tunnel down / service off). Lesson: ZCC Troubleshooting. Check ZCC > Connection Status first.

Empty Connector

Layer: L3 binding. Server group → connector group eligibility rejected all. Lesson: Groups & Service Edge / App Connector Troubleshooting.

Named but slow

Layer: L4 App/Net. High setup time / dig fails on the host. Lesson: DNS & App Discovery · Performance & MTU. Clientless TLS = Browser Access / PRA.

Empty Policy + Close

Layer: L5 Policy. No rule matched, or identity (SCIM/SAML) miss. Lesson: Access Policy · App Segments. Run policy simulation.

✍️ Explain it back (2 lines)

In your own words: what is the 5-layer method, and why does a green App Connector narrow the fault instead of closing the ticket? Type it, then reveal the expert version.

Expert version: Don't guess the cause — isolate which of five layers failed: L1 Client (ZCC), L2 Edge/Broker, L3 Connector, L4 App/Network, L5 Policy. Each has one signal and one diagnostic tool. A green connector proves only L3, so it narrows the fault to L1, L4, or L5 — you read the Diagnostics field to name the layer, then open that layer's deep-dive lesson and apply its fix.

🧑‍🏫 Teach a friend

A one-liner you can paste to a teammate who's stuck:

"ZPA triage = find the LAYER, not the cause. Check L1 Client (ZCC tunnel), L2 Edge (broker), L3 Connector (journalctl + Portal health), L4 App/Net (dig/nc on the connector), L5 Policy (Diagnostics Policy field). No log = L1. Empty Connector = L3. Named-but-slow = L4. Empty Policy + Close = L5. Green connector only proves L3 — so look at L1/L4/L5. Then open that layer's lesson."

Glossary — the words this playbook used

5-layer method — isolate which of L1 Client, L2 Edge/Broker, L3 Connector, L4 App/Network, L5 Policy failed, then fix that layer.
L1 · Client (ZCC) — the Zscaler Client Connector on the device; must be enrolled with the ZPA service ON and the tunnel up.
L2 · Edge / Broker — the Public/Private Service Edge that brokers the session; the device must reach a healthy, near one.
L3 · Connector — the App Connector that actually opens the connection to the app; check health with journalctl -u zpa-connector.
L4 · App / Network — DNS resolution + last-mile reachability from the connector host to the server (dig / curl / nc).
L5 · Policy — the Access Policy rule (and the SAML/SCIM identity feeding it) that must ALLOW the request.
Top-down vs bottom-up — walk L1→L5 for one user; walk L5/L4→L1 when everyone is hit at once.
Diagnostics → User Activity — the per-request log whose wrong field names the failed layer.
ConnectionSetupTime — time for the connector to set up to the server; a high value is the L4 reachability signal.
Connector-green rule — a green connector proves only L3, so the fault (if any) is L1, L4, or L5.

📚 Sources

Zscaler Help — Troubleshooting App Connectors & App Connector Managing & Troubleshooting (process health, journalctl -u zpa-connector, fohh_connection_connected). help.zscaler.com/zpa
Zscaler Help — Accessing App Connector Status Diagnostics, About App Connector Status Log Fields, Accessing User Activity / User Status Diagnostics. help.zscaler.com/zpa
Zscaler Help — Troubleshooting Private Service Edges & Understanding Private Access Session Status Codes (the L2 Edge/Broker layer). help.zscaler.com/zpa
Zscaler Help — ZPA App Connector Troubleshooting Runbook (DNS resolution, /etc/resolv.conf permissions for the zscaler user). help.zscaler.com/troubleshooting-runbooks
Zscaler Help — Configuring Access Policies & Enabling SCIM for Identity Management (the L5 Policy + identity layer). help.zscaler.com/zpa
Zscaler Academy — ZDTA (Digital Transformation Administrator) blueprint — Identity, Connectivity, Access Control service layers.

What's next?

This hub taught the method. Go deeper per layer using the router above, or read the sister lesson that organises the same fixes by symptom bucket (policy, mapping, certs, steering) — both reach the same lessons from different angles.

Sister lesson · ZPA Troubleshooting (by symptom) → L5 deep-dive · Access Policy → Practice on exam.techclick.in →

— Techclick Team

ZPA Troubleshooting Playbook — Find the Layer, Find the Fix
