Zscaler · ZPA · App Connector Troubleshooting · Interactive · L1 → L3

Troubleshooting the ZPA App Connector — Every Failure, Diagnosed

Deploy the connector. Paste the key. And the portal still says Disconnected. This lesson walks every way the App Connector breaks between deployment and a user reaching a private app — 16 real scenarios, each with the exact symptom, the command that proves it, the expected output, the fix, and the check that confirms you're done.

📅 30 May 2026 · ⏱ 14 min · 16 scenarios · 4 SVGs · 🏷 10-Q assessment + AI Tutor inline
🎯 By the end of this lesson, you'll be able to

⚡ Quick Answer

Every way the Zscaler ZPA App Connector fails between deployment and a user reaching a private app — Disconnected/not enrolled, blocked 443 to the broker, expired provisioning key, time skew, DNS failures, no healthy connector, app-segment misconfig — each with the exact symptom, diagnosis command, expected output, fix and verify step.

Where does it hurt? Jump to the failure bucket

  1. Enrollment: Connector won't enroll, shows Disconnected, key expired, fingerprint changed.
  2. Tunnel to Cloud: 443 to the broker blocked, SSL inspection breaks TLS, behind a proxy.
  3. Connector Health: Time skew, CPU/memory overload, version drift, flapping connection.
  4. Last Mile to App: DNS can't resolve the app, segment misconfig, "no healthy connector".

The belief that costs you four hours

Most engineers see "Disconnected" in the ZPA portal and assume the network is down. Wrong — and that wrong instinct is exactly why a 5-minute fix turns into a half-day outage. A connector can show Disconnected because its clock drifted by 6 minutes, because a VM was cloned and its hardware fingerprint changed, or because someone enabled SSL inspection on the firewall last night. The network is fine. The identity or the trust broke.

So the senior move is never "is the network up?". It's "which hop in the path broke, and what does that specific hop's log say?" This blog teaches you the path first, then walks all 16 failure scenarios that live along it.

Before you read — 3 questions to sit with

No scoring. Just notice which ones you can't answer yet — those are the sections to slow down on. We answer all three as you scroll.

  1. The connector shows Disconnected but you can SSH into the box and ping the internet. What single thing do you check first?
  2. Which one firewall change, done with good intentions, silently kills every App Connector tunnel?
  3. The connector is green and healthy, but users get "app not reachable". Whose problem is it — the connector's, or the config's?

The big picture — where the App Connector sits

The ZPA App Connector is a lightweight Linux daemon you run next to your private apps — in your data centre, your VPC, your branch. It dials outbound only to the Zscaler cloud — specifically a broker — and stitches together two halves of a connection: the user's side (handled by ZCC) and the app's side. The user never routes to your network; the connector reaches out to meet the cloud. That's the whole trick — and the whole troubleshooting map.

Figure 1 — Where the App Connector sits in the user → app path
A horizontal map: a user laptop running Zscaler Client Connector (ZCC) reaches the Zscaler cloud Service Edge / broker, which handles policy and brokering and stitches both halves together. The App Connector (Linux daemon zpa-connector, 10.20.5.11) dials that broker outbound on TCP 443 and then reaches the private app (hr.tcs.local, 10.20.9.40) on the internal network. Outbound only: the connector dials the cloud, the cloud never dials in. Markers: ① tunnel, ② user side, ③ last mile.
Three break-points: ① the connector can't reach the broker on 443, ② the user side fails (out of scope here — that's ZCC), ③ the connector is healthy but can't reach the app. Most "Disconnected" tickets are ①. Most "app not reachable" tickets are ③.
👉 So far: the connector dials out on 443, the cloud brokers user↔app, and the user never touches your network. Next: the four buckets every failure falls into.

Your diagnostic toolbox

These six are the commands and files you'll reach for in every scenario below. Memorise each one; the note under it tells you what it proves.

🩺
Service status
systemctl status zpa-connector

Is the daemon even running? Shows active (running) or a crash reason. Your first command on every ticket — before you touch the portal.

📜
Live logs
journalctl -u zpa-connector

The truth source. Shows connector ID, version, cert expiry, "Control connection state", and the broker it's talking to. Add -f to follow live.

🔑
Provisioning key
/opt/zscaler/var/provision_key

The one-time enrollment token. If empty, malformed, or wrapped in smart-quotes, enrollment never starts. Plain double-quotes only.

🆔
Instance data
/opt/zscaler/var/

Holds instance_id.crypt — the fingerprint-encrypted identity. Wipe this dir to force a clean re-enroll when the hardware ID changed.

🕐
Time sync
chronyc tracking

TLS dies on clock drift. /etc/chrony.conf sets the NTP source. More than a few minutes of skew and cert validation fails — connector goes Disconnected.

🌐
Name resolution
/etc/resolv.conf

The connector resolves both the broker and your apps. If the zscaler user can't read this file, enrollment and app lookups fail. So-what: a permission bit can break everything.

Watch the connector come online — and see where it breaks

Before the scenarios, walk the six stages a connector goes through from a cold boot to brokering a live session. Each scenario below maps to exactly one of these stages failing.

App Connector bring-up: 6 stages

Stage ④ is the one to watch: that's where a blocked 443 or time skew stops you cold.

① SERVICE zpa-connector daemon starts on the Linux host (10.20.5.11).
② RESOLVE Reads /etc/resolv.conf, resolves the Zscaler broker FQDNs via DNS.
③ KEY Reads /opt/zscaler/var/provision_key and presents it to enroll.
④ TLS 443 Dials the broker outbound on TCP 443, validates the pinned cert. This is the hop that fails most.
⑤ ENROLL Fingerprint computed (MAC+VM+disk), identity sealed into instance_id.crypt. Portal flips to Connected.
⑥ BROKER Control connection up. Cloud now brokers user↔app sessions through this connector.

Bucket 1 — Enrollment & onboarding failures

Symptoms here all share one trait: the connector never reaches Connected, or it was Connected and dropped to Disconnected and won't come back. The portal status lives at Configuration & Control → Private Infrastructure → App Connector Management → App Connectors. Your on-box truth source is always journalctl -u zpa-connector.

SCN-01 · Connector shows "Disconnected" / never enrolled
⚠ Problem / Symptom

Aditya at Wipro deployed the VM, but the portal App Connectors page shows the connector grey/absent, or status Disconnected. It never turned green.

◆ Likely cause(s)
  • Daemon not running (crashed or never started)
  • No outbound 443 to the broker (Bucket 2)
  • Provisioning key missing/empty
  • DNS can't resolve the broker FQDNs
🔍 Diagnosis

Start on the box, not the portal. Confirm the daemon, then read the log.

On the connector host
sudo systemctl status zpa-connector
sudo journalctl -u zpa-connector --since "15 min ago"
Expected output (healthy)
● zpa-connector.service - Zscaler App Connector
   Active: active (running) since Fri 2026-05-30 09:12:04 IST
   ...
zpa-connector[2041]: Connector ID 88121, version 24.x
zpa-connector[2041]: Control connection state: CONNECTED to broker
zpa-connector[2041]: Certificate valid until 2027-05-29
🛠 Fix

If the status shows Active: inactive (dead), run sudo systemctl start zpa-connector and enable it for boot with sudo systemctl enable zpa-connector. If the log shows TLS/connect errors, jump to Bucket 2. If it shows a key error, jump to SCN-02.

✓ Verify

Log shows Control connection state: CONNECTED and the portal flips to a green Connected within ~60 seconds.

SCN-02 · Invalid or expired provisioning key: enrollment refused
⚠ Problem / Symptom

Brand-new connector, but the log loops on an enrollment/authorization rejection and the portal never shows the connector. The key may have hit its max-enrollment count or passed its expiry date.

◆ Likely cause(s)
  • Key expired (keys carry an expiry)
  • Key already used up to its max number of connectors
  • Key copied with smart-quotes / trailing newline
  • Key belongs to a different connector group
🔍 Diagnosis

In the portal, open the key under App Connector Management → Provisioning Keys and check enrolled count < max and the expiry. On the box, eyeball the key file.

On the connector host
sudo cat /opt/zscaler/var/provision_key
sudo journalctl -u zpa-connector | grep -i -E "enroll|key|auth"
Expected output (bad key)
3|api.private.zscaler.com|hex... (one clean line, no curly quotes)
zpa-connector[2102]: enrollment rejected: provisioning key invalid/expired
zpa-connector[2102]: retrying enrollment in 30s
🛠 Fix

Generate or edit a valid key in the portal (raise max-count or extend expiry), then re-paste with plain double-quotes (U+0022):

Re-seed the key
sudo systemctl stop zpa-connector
echo "3|api.private.zscaler.com|VALID-KEY-HEX" | sudo tee /opt/zscaler/var/provision_key
sudo systemctl start zpa-connector
✓ Verify

Log moves past enrollment to Control connection state: CONNECTED; the portal's key "enrolled" counter increments by one.
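
The on-box key checks above can be scripted. This is a minimal sketch, not a Zscaler tool: the helper name check_provision_key is hypothetical, and the "three pipe-separated fields" rule is inferred only from the sample key shown in this scenario. It inspects the shape of the file; expiry and max-enrollment count still have to be checked in the portal.

```shell
# Hypothetical helper: sanity-check the shape of a provisioning key file.
# Prints one verdict word; returns non-zero on anything suspicious.
check_provision_key() {
    key_file=$1

    # Missing or zero-length file: enrollment can never start
    [ -s "$key_file" ] || { echo "empty-or-missing"; return 1; }

    # UTF-8 bytes for the curly quotes U+201C and U+201D
    lq=$(printf '\342\200\234')
    rq=$(printf '\342\200\235')
    if LC_ALL=C grep -q -e "$lq" -e "$rq" "$key_file"; then
        echo "smart-quotes"; return 1
    fi

    # Exactly one non-empty line is expected
    lines=$(awk 'NF { n++ } END { print n + 0 }' "$key_file")
    [ "$lines" -eq 1 ] || { echo "not-one-line"; return 1; }

    # Assumption: the key has three '|'-separated fields, as in the sample above
    fields=$(awk -F'|' 'NF { print NF; exit }' "$key_file")
    [ "$fields" -eq 3 ] || { echo "unexpected-field-count"; return 1; }

    echo "looks-ok"
}
```

Call it as check_provision_key /opt/zscaler/var/provision_key from a shell that is allowed to read the file.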

SCN-03 · "Cannot decrypt data from instance_id.crypt": fingerprint changed
⚠ Problem / Symptom

A previously-working connector goes Disconnected after a VM clone, MAC change, disk resize, or migration. The log shows "Cannot decrypt data from instance_id.crypt".

◆ Likely cause(s)

The fingerprint (hardware ID) is computed from MAC + VM info + disk info. If any of those change, the connector can no longer decrypt its sealed identity and can't prove who it is to the cloud.

🔍 Diagnosis
Confirm the error
sudo journalctl -u zpa-connector | grep -i "instance_id.crypt"
Expected output
zpa-connector[1980]: ERROR Cannot decrypt data from instance_id.crypt
zpa-connector[1980]: fingerprint mismatch — host identity changed
zpa-connector[1980]: enrollment required
🛠 Fix

Wipe the stale instance data and re-enroll with a fresh/reusable key. The old connector entry in the portal can be deleted.

Clean re-enroll
sudo systemctl stop zpa-connector
# Expand the glob inside a root shell; the invoking user may not be able to list this directory
sudo sh -c 'rm -f /opt/zscaler/var/*.crypt'
echo "3|api.private.zscaler.com|REUSABLE-KEY" | sudo tee /opt/zscaler/var/provision_key
sudo systemctl start zpa-connector
✓ Verify

A new connector object appears in the portal as Connected. Pin the VM's MAC address so this never recurs after a reboot or vMotion.

SCN-04 · Enrollment fails because DNS won't resolve the broker
⚠ Problem / Symptom

The connector starts but never enrolls. The log shows name-resolution failures for the Zscaler broker FQDNs — even though the host itself seems to have internet.

◆ Likely cause(s)
  • The zscaler service account can't read /etc/resolv.conf (wrong permission bit)
  • No nameserver configured, or it's an internal-only resolver that can't reach Zscaler FQDNs
🔍 Diagnosis
Check resolution + perms
ls -l /etc/resolv.conf
sudo -u zscaler nslookup api.private.zscaler.com
Expected output (broken)
-rw------- 1 root root 88 May 30 /etc/resolv.conf   # zscaler user can't read
;; connection timed out; no servers could be reached
🛠 Fix

Grant the zscaler user read access and ensure a working resolver is set.

Repair
sudo chmod 644 /etc/resolv.conf
sudo systemctl restart zpa-connector
✓ Verify

sudo -u zscaler nslookup api.private.zscaler.com returns an answer, and the log proceeds to enroll.
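
The permission half of this check can be reduced to one question: does "other" have the read bit on the file? A minimal sketch, assuming the file is owned by root so the zscaler account falls under "other" (as in the ls -l output above). The helper name other_can_read is hypothetical, and it ignores ACLs and SELinux.

```shell
# Hypothetical helper: succeeds when "other" has the read bit on a file.
# Returns 2 if the file cannot be inspected at all.
other_can_read() {
    # -L follows symlinks (resolv.conf is often one); -d lists a directory as itself
    mode=$(ls -ldL "$1" 2>/dev/null | cut -c1-10)
    [ -n "$mode" ] || return 2
    # Character 8 of the mode string is the "other: read" position
    [ "$(printf '%s' "$mode" | cut -c8)" = "r" ]
}
```

For example: other_can_read /etc/resolv.conf || echo "zscaler user probably cannot read resolv.conf".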

Predict: the connector is Disconnected, daemon is running, and journalctl shows TLS handshake timeouts to the broker. The host can browse the internet fine. What's the single most likely cause?

A firewall / proxy blocking outbound 443 to the Zscaler broker, or an SSL-inspection box breaking the pinned-cert TLS. "Browses the internet fine" only proves general egress — ZPA needs specific Zscaler FQDNs on 443, un-inspected. That's all of Bucket 2.
Quick check · Q1 of 10

Priya at Infosys clones a working App Connector VM to spin up a second one. The clone shows Disconnected and the log says "Cannot decrypt data from instance_id.crypt". What happened?

Correct: a. The fingerprint is MAC+VM+disk. A clone keeps the original's instance_id.crypt but has a different hardware identity, so decryption fails. Never clone an enrolled connector — wipe /opt/zscaler/var/ and re-enroll, or build from a clean image.

Bucket 2 — The tunnel to the cloud

Enrollment can be perfect and the connector still won't connect, because the control connection rides outbound TCP 443 to the Zscaler broker. Anything that blocks, intercepts, or mis-routes that 443 stream lands here. This is the bucket that answers pre-quiz Q2 — and it's the one engineers misdiagnose most, because the host "has internet".

Figure 2 — Request flow with failure points marked
A left-to-right flow of the App Connector dialing out: daemon (zpa-connector) → DNS resolve (broker FQDN) → TCP 443 out to broker → pinned-cert TLS handshake → control connection CONNECTED. Red markers show where the 443 stream dies: A) firewall blocks 443, B) SSL inspection, C) proxy needed.
The control connection only goes green at the far right. A) a firewall dropping 443, B) an SSL-inspection box breaking the pinned cert, or C) an unconfigured explicit proxy each stop it before it gets there.
SCN-05 · Outbound 443 to the broker blocked by firewall
⚠ Problem / Symptom

Connector Disconnected; log shows the TLS dial to the broker timing out or being refused. General web browsing from the host works, which throws people off.

◆ Likely cause(s)

The corporate firewall/egress filter doesn't allow outbound 443 to the Zscaler broker FQDNs. The connector needs only outbound 443 — no inbound ports — but it must reach the specific Zscaler destinations.

🔍 Diagnosis

Test the exact path the connector uses. The broker list is published at ips.zscaler.net/zpa.

Prove the 443 path
curl -sv -m 8 https://api.private.zscaler.com 2>&1 | head
sudo journalctl -u zpa-connector | grep -i -E "broker|connect|timeout"
Expected output (blocked)
*   Trying 165.225.x.x:443...
* connect to 165.225.x.x port 443 failed: Connection timed out
zpa-connector[2210]: broker connect failed: no route / timeout
🛠 Fix

Allowlist outbound TCP 443 from the connector's IP (e.g. 10.20.5.11) to the Zscaler ZPA destination ranges/FQDNs from ips.zscaler.net/zpa. No inbound rules needed.

✓ Verify

curl -v to the broker completes the TCP+TLS, and the connector log reports Control connection state: CONNECTED.

SCN-06 · SSL inspection breaks the pinned-cert TLS
⚠ Problem / Symptom

443 is open, but the connector still won't connect — and the log shows a certificate/TLS validation failure. Often appears right after the firewall team enables decryption on the egress segment.

◆ Likely cause(s)

ZPA uses TLS with certificate pinning. An SSL-inspection device decrypts and re-signs with its own CA — which the connector rejects, because the pinned cert no longer matches.

🔍 Diagnosis
Inspect the presented cert
echo | openssl s_client -connect api.private.zscaler.com:443 2>/dev/null \
  | openssl x509 -noout -issuer
Expected output (intercepted)
issuer=CN = Corporate-SSL-Inspection-CA, O = TCS-Proxy
# Should be a Zscaler issuer. A corporate CA here = you're being decrypted.
🛠 Fix

Add the Zscaler ZPA domains/IPs to the SSL-inspection bypass/do-not-decrypt list on the firewall or proxy. ZPA traffic must pass through un-inspected.

✓ Verify

The openssl issuer now shows a Zscaler CA, the TLS validation error clears, and the connector goes Connected.
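
The issuer check can be turned into a rough classifier. This is only a heuristic sketch: the helper name classify_issuer is hypothetical, and matching on the substring "zscaler" is an assumption, since this lesson says the issuer should be a Zscaler one but does not give its exact string.

```shell
# Hypothetical helper: classify the "issuer=..." line printed by openssl x509 -noout -issuer.
classify_issuer() {
    issuer_line=$1
    case "$issuer_line" in
        *[Zz]scaler*) echo "zscaler-issuer" ;;       # assumption: issuer names Zscaler
        "")           echo "no-certificate" ;;       # handshake never produced a cert
        *)            echo "possibly-intercepted" ;; # some other CA re-signed the session
    esac
}
```

Feed it the output of the openssl pipeline above, for example classify_issuer "$(echo | openssl s_client -connect api.private.zscaler.com:443 2>/dev/null | openssl x509 -noout -issuer 2>/dev/null)".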

SCN-07 · Connector sits behind an explicit proxy (no direct egress)
⚠ Problem / Symptom

The data centre forces all outbound through an explicit proxy. The connector can't dial the broker directly, so it never connects — even with 443 "allowed" on the proxy.

◆ Likely cause(s)

The connector has no proxy configured, so it tries a direct outbound that the network drops. It needs to be told to use the proxy with HTTP CONNECT.

🔍 Diagnosis

Confirm there's no direct path and that a proxy is mandatory on this segment, then check for the proxy file.

Check
ls -l /opt/zscaler/var/proxy 2>/dev/null || echo "no proxy file set"
curl -sv -m 8 https://api.private.zscaler.com 2>&1 | head -3
Expected output (proxy required)
no proxy file set
*   Trying ...:443...
* connect ... failed: Connection timed out   # direct egress is blocked
🛠 Fix

Create /opt/zscaler/var/proxy with <proxy-host>:<port>. This only proxies connector↔broker traffic — not connector↔app traffic.

Configure the proxy
echo "10.20.1.8:8080" | sudo tee /opt/zscaler/var/proxy
sudo systemctl restart zpa-connector
✓ Verify

The connector now reaches the broker via CONNECT through 10.20.1.8:8080; log shows CONNECTED. Proxy access logs show the connector's CONNECT requests.
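
Before writing the proxy file, it can help to validate the entry so a typo doesn't cost another restart cycle. A minimal sketch, assuming the <proxy-host>:<port> form given above; the helper name valid_proxy_entry is hypothetical and it does not handle IPv6 literals.

```shell
# Hypothetical helper: succeeds when the argument looks like host:port
# with a numeric port in the range 1..65535.
valid_proxy_entry() {
    entry=$1
    # Must contain at least one colon
    case "$entry" in *:*) ;; *) return 1 ;; esac
    host=${entry%:*}
    port=${entry##*:}
    [ -n "$host" ] || return 1
    # Port must be non-empty and digits only
    case "$port" in ''|*[!0-9]*) return 1 ;; esac
    [ "$port" -ge 1 ] && [ "$port" -le 65535 ]
}
```

For example: valid_proxy_entry "10.20.1.8:8080" && echo "10.20.1.8:8080" | sudo tee /opt/zscaler/var/proxy.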

SCN-08 · Wrong region / broker: high latency or repeated re-broker
⚠ Problem / Symptom

The connector is Connected but app sessions feel slow, or the connector keeps re-selecting brokers. A Mumbai-hosted app is being brokered through a far-away Service Edge.

◆ Likely cause(s)

The connector reaches a sub-optimal Service Edge because closer broker FQDNs/ranges are blocked by the firewall, forcing it onto distant ones — adding latency on every brokered session.

🔍 Diagnosis
See which broker it picked
sudo journalctl -u zpa-connector | grep -i -E "broker|service edge|latency"
Expected output
zpa-connector[2301]: connected to broker (region: us-west)  # app is in ap-south
zpa-connector[2301]: control latency 210 ms
🛠 Fix

Allowlist the full ZPA destination set from ips.zscaler.net/zpa (not just one region) so the connector can pick the nearest Service Edge. Confirm the connector's deployment region is correct.

✓ Verify

Log shows a nearby region and control latency drops (e.g. 210 ms → < 40 ms). Users report apps feel snappy again.

Predict: the firewall team enabled "decrypt all outbound TLS for visibility" last night. This morning every App Connector in that data centre is Disconnected. Why — and what's the fix that keeps their visibility goal mostly intact?

SSL inspection breaks ZPA's certificate pinning — the connector rejects the proxy's re-signed cert. Fix: add the Zscaler ZPA FQDNs/IPs to the do-not-decrypt bypass list. The firewall still inspects everything else; only the pinned ZPA tunnel is exempt (which it can't read anyway).
Quick check · Q2 of 10

A connector in a locked-down DC has no direct internet — everything goes via proxy 10.20.1.8:8080. Enrollment fails with connection timeouts. What's the correct fix?

Correct: b. The connector reads its proxy from /opt/zscaler/var/proxy (not system env vars). This proxies only connector↔broker traffic. The connector needs no inbound ports (a is wrong), and SSL-inspection on app servers is unrelated (d).

Bucket 3 — Connector health

Now the connector enrolls and reaches the broker — but it's unhealthy: it flaps, it's slow, or it drifts. These are the "it was fine yesterday" tickets. The decision tree below is your triage map for the whole blog; print it.

Figure 3 — Troubleshooting decision tree: symptom → branch → root cause
A decision tree that starts from one question: does the connector show Connected in the portal?
  • NO: run systemctl status + journalctl -u zpa-connector. Daemon dead → start it (SCN-01). Key / fingerprint / DNS → re-enroll (SCN-02/03/04). 443 / TLS / proxy → fix egress (SCN-05..08).
  • YES, healthy but slow or app unreachable: CPU / mem / time → scale / sync (SCN-09..12). DNS / segment → fix config (SCN-13..15). No healthy connector → add HA pair (SCN-16).
The one rule that splits the whole tree: "Disconnected" = connector ↔ CLOUD problem (left branch). "App not reachable" with a green connector = connector ↔ APP problem (right branch). Decide the branch first. It saves you three hours of looking in the wrong place.
Triage every ticket here first: is the connector Connected in the portal? That single answer routes you to the left half (cloud) or right half (app) of every scenario in this blog.
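
The first split of the tree is simple enough to write down as a lookup. A minimal sketch; the function name triage is hypothetical and the scenario ranges are the ones from Figure 3.

```shell
# Hypothetical helper: map the portal status to the branch of the decision tree.
triage() {
    portal_status=$1
    case "$portal_status" in
        [Cc]onnected)    echo "right branch (connector <-> app): SCN-09..16" ;;
        [Dd]isconnected) echo "left branch (connector <-> cloud): SCN-01..08" ;;
        *)               echo "unknown status: check the portal first"; return 1 ;;
    esac
}
```

For example, triage Disconnected points you at SCN-01..08 before you run a single diagnostic command.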
SCN-09 · NTP / time skew breaks cert validation
⚠ Problem / Symptom

Connector flaps between Connected and Disconnected, or won't connect at all, with TLS/certificate errors — despite open 443 and no inspection. Classic after a VM was suspended/resumed and its clock drifted.

◆ Likely cause(s)

TLS rejects certificates whose validity window doesn't match the local clock. A few minutes of skew is enough to fail the handshake. The usual culprit is an /etc/chrony.conf with no reachable NTP source.

🔍 Diagnosis
Check the clock
chronyc tracking
timedatectl status
Expected output (skewed)
Reference ID    : 00000000 (unsynchronised)
System time     : 312.4 seconds slow of NTP time
Leap status     : Not synchronised
🛠 Fix

Point /etc/chrony.conf at a reachable NTP server (internal preferred, Zscaler pool as fallback), restart chrony, force a sync.

Repair time sync
sudo systemctl restart chronyd
sudo chronyc makestep
sudo systemctl restart zpa-connector
✓ Verify

chronyc tracking shows Leap status: Normal and sub-second offset; the cert error clears and the connector stays Connected.
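
If you want this check in a health script, the two fields that matter can be pulled out of the chronyc tracking output. A minimal sketch: the function name check_clock_skew is hypothetical, and the 60-second default is an assumed threshold, since the lesson only says "a few minutes" of skew is too much.

```shell
# Hypothetical helper: read `chronyc tracking` output on stdin and print a verdict.
# $1 = maximum tolerated offset in seconds (assumed default: 60).
check_clock_skew() {
    max_skew=${1:-60}
    awk -v max="$max_skew" -F': *' '
        /^System time/ { split($2, part, " "); offset = part[1] + 0; seen = 1 }
        /^Leap status/ { leap = $2 }
        END {
            if (!seen) { print "no-data"; exit }
            if (leap != "Normal" || offset > max)
                print "skewed offset=" offset "s leap=" leap
            else
                print "ok offset=" offset "s"
        }'
}
```

Use it as chronyc tracking | check_clock_skew 60.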

SCN-10 · CPU / memory overload → app slowness, drops
⚠ Problem / Symptom

Karthik at TCS reports private apps are slow or sessions drop at peak hours. The connector is Connected, but it's pinned at high CPU/RAM, or shows a memory leak over days.

◆ Likely cause(s)

One connector is carrying too much load (under-provisioned, or it's the only one for a busy segment). Minimum spec is 2 vCPU / 4 GB RAM — busy segments need more, or more connectors.

🔍 Diagnosis
Check load
top -b -n1 | head -12
free -m
Expected output (overloaded)
%Cpu(s): 96.3 us,  2.1 sy   load average: 7.8, 7.2, 6.9
Mem:  total 3902   used 3760   free 142   # 96% RAM used
2041 zscaler  ... zpa-connector  98.0 %CPU
🛠 Fix

Add a second connector to the same connector group (the cloud load-balances across healthy members), or resize the VM up. For suspected leaks, collect periodic memory reports for Zscaler Support.

✓ Verify

Per-connector CPU drops below ~70%, app latency recovers, and the portal shows load spread across both connectors in the group.
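
The "96% RAM used" figure above is just used divided by total. A minimal sketch that computes it from real free -m output, where the Mem: row carries total in column 2 and used in column 3; the function name mem_used_percent is hypothetical.

```shell
# Hypothetical helper: read `free -m` output on stdin, print used RAM as a whole percent.
mem_used_percent() {
    # Mem: row layout is "Mem: total used free ..."; %d truncates the fraction
    awk '/^Mem:/ { if ($2 > 0) printf "%d\n", ($3 * 100) / $2; exit }'
}
```

Use it as free -m | mem_used_percent. With 3902 MB total and 3760 MB used it prints 96.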

SCN-11 · Version outdated / auto-upgrade (SARGE) failed
⚠ Problem / Symptom

A connector drops to Disconnected right after a scheduled auto-upgrade window, or the portal flags it as running an old version that won't re-register.

◆ Likely cause(s)

The in-place upgrade (SARGE) failed mid-flight — interrupted download, disk full, or a stale package state — leaving the service unable to start cleanly.

🔍 Diagnosis
Check version + upgrade log
sudo journalctl -u zpa-connector | grep -i -E "version|upgrade|sarge"
df -h /opt
Expected output (failed upgrade)
zpa-connector[2280]: upgrade (SARGE) failed: package verification error
zpa-connector[2280]: service start aborted
/dev/sda1   8.0G  8.0G   0    100%  /opt   # disk full
🛠 Fix

Free disk, update the package, reboot. If it still won't register cleanly, the supported path is to rebuild from the latest image and re-enroll (deploy in pairs so this never causes an outage).

Recover
sudo yum update -y
sudo reboot
# verify after boot:
sudo systemctl status zpa-connector
✓ Verify

Portal shows the connector on a current version and Connected. The connector group still served traffic throughout because its HA peer stayed up.

SCN-12 · Frequent flapping + health-reporting mode confusion
⚠ Problem / Symptom

The connector toggles Connected/Disconnected repeatedly, or an app shows healthy in continuous mode but goes "unreachable" in on-access mode (or vice-versa) — and the team argues about whether the connector is actually broken.

◆ Likely cause(s)
  • Routing/MTU instability or periodic restarts cause real flapping
  • Health-reporting mode mismatch: Continuous probes apps on a schedule; On Access only checks when a user connects — so "unhealthy" can simply mean "nobody has accessed it yet".
🔍 Diagnosis
Look for restart loops
sudo journalctl -u zpa-connector | grep -i -E "restart|disconnect|reconnect" | tail
Expected output
09:31:02 Control connection state: DISCONNECTED
09:31:40 Control connection state: CONNECTED
09:34:11 Control connection state: DISCONNECTED   # flapping ~every 3 min
🛠 Fix

For real flapping, stabilise the network path (see SCN-15 MTU) and check for a watchdog restarting the service. For false "unhealthy", confirm the app segment's health-reporting mode matches expectations — set Continuous if you want proactive status, accept that On Access reports only on demand.

✓ Verify

The Connected state holds steady for > 30 min with no restart log lines; app health reflects the chosen mode's expected behaviour.
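
To put a number on "flapping", count the drops in a log window. A minimal sketch, assuming the log wording matches the sample output above ("Control connection state: DISCONNECTED"); real connector logs may phrase it differently, so adjust the pattern. The function name count_disconnects is hypothetical.

```shell
# Hypothetical helper: read log lines on stdin, print how many times the
# control connection dropped. The match string is taken from the sample log above.
count_disconnects() {
    # grep -c prints 0 but exits non-zero when nothing matches; "|| true" hides that
    grep -c 'Control connection state: DISCONNECTED' || true
}
```

For example: sudo journalctl -u zpa-connector --since "30 min ago" | count_disconnects. Anything above zero in that window means the steady-state check has not passed yet.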

Quick check · Q3 of 10

443 is open, no SSL inspection, DNS works — yet a connector keeps flapping with TLS certificate errors. chronyc tracking shows "312 seconds slow / unsynchronised". Root cause?

Correct: b. "Unsynchronised / 312s slow" is the smoking gun. TLS validates certs against the local clock; minutes of skew fails the handshake and the connector flaps. Time is the most-missed root cause for cert errors when 443/inspection/DNS all check out.

Bucket 4 — The last mile to the app

The connector is green and healthy. Users still get "app not reachable". This is pre-quiz Q3 — and the answer is almost always config, not connector. The connector is fine; what's between it and the app (DNS, the app-segment definition, MTU, or simply no connector covering that app) is not.

SCN-13 · Connector can't resolve the app's FQDN
⚠ Problem / Symptom

Healthy connector, but users can't reach hr.tcs.local. The connector resolves app FQDNs itself — if it can't, the session has nowhere to go.

◆ Likely cause(s)

The connector's resolver (/etc/resolv.conf) points at a DNS server that doesn't know the internal zone, or the app's record doesn't exist / points to the wrong IP.

🔍 Diagnosis

Resolve the app's FQDN from the connector itself.

On the connector host
nslookup hr.tcs.local
dig +short hr.tcs.local
Expected output (broken)
** server can't find hr.tcs.local: NXDOMAIN
# vs healthy:  hr.tcs.local  ->  10.20.9.40
🛠 Fix

Point the connector at an internal DNS server that can resolve the app zone (or add the missing record). For split-horizon apps, ensure it uses the internal view.

✓ Verify

dig +short hr.tcs.local returns the correct internal IP (10.20.9.40) from the connector; users reach the app.

SCN-14 · App segment misconfig: wrong domain, port, or no SNAT
⚠ Problem / Symptom

Connector healthy, DNS resolves, but a specific app times out — or works for some users and not others. The app may also see the user's real IP and reply asymmetrically (double-NAT / source-IP issue).

◆ Likely cause(s)
  • Application Segment domain/port set doesn't actually match what the app uses (e.g. app on 8443, segment says 443)
  • The segment's domain isn't mapped to the right connector group / server group
  • Source-IP anchoring (SNAT) disabled, so the app replies directly and the return path is asymmetric behind a NAT
  • App expects a specific cert/SNI the segment doesn't pass cleanly
🔍 Diagnosis

In the portal, open Application Segments for the app and confirm domain + TCP/UDP port range + connector/server group mapping. From the connector, test the actual app socket.

Test the real app port from the connector
nc -zv hr.tcs.local 8443
sudo journalctl -u zpa-connector | grep -i -E "segment|reachable|reset"
Expected output (port mismatch)
Connection to hr.tcs.local 8443 port [tcp] succeeded!
# but the App Segment only lists 443 -> users never hit 8443
🛠 Fix

Correct the segment's domain + port range to match the app, map it to the connector group covering that subnet, and enable source IP anchoring if the app sits behind a NAT so return traffic stays symmetric.

✓ Verify

nc -zv on the exact app port succeeds from the connector, and the user's access log shows the session brokered and allowed end-to-end.

SCN-15 · MTU / fragmentation hangs + split-DNS overlap with ZIA
⚠ Problem / Symptom

App connects but large transfers or file shares hang/stall; small requests work. Or a domain that should go via ZPA is being sent to ZIA (or the public internet) because of overlapping DNS/forwarding.

◆ Likely cause(s)
  • MTU mismatch on the connector's path (often a tunnel/overlay) causing large packets to be dropped silently when DF is set
  • Split-DNS overlap: the same FQDN is claimed by a ZPA app segment and a ZIA forwarding/PAC rule — traffic takes the wrong lane
🔍 Diagnosis
Find the MTU ceiling
ping -M do -s 1472 hr.tcs.local
ip link show eth0 | grep -i mtu
Expected output (MTU too high)
ping: local error: message too long, mtu=1400
# 1472+28 = 1500 fails; path MTU is actually 1400
🛠 Fix

Lower the connector NIC MTU to the path value (e.g. 1400) or enable MSS clamping on the gateway. For split-DNS overlap, make the FQDN owned by exactly one lane — keep the internal app in the ZPA app segment and remove it from ZIA forwarding/PAC.

✓ Verify

ping -M do -s 1372 succeeds at the corrected size; large file transfers complete; the FQDN consistently resolves and routes via ZPA.
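
The ping sizes in this scenario come from one subtraction: the -s payload is the MTU minus 28 bytes (20-byte IPv4 header plus 8-byte ICMP header). A minimal sketch; the function name icmp_payload_for_mtu is hypothetical.

```shell
# Hypothetical helper: largest ping -s payload that fits in a given IPv4 MTU.
icmp_payload_for_mtu() {
    # 20-byte IPv4 header + 8-byte ICMP header = 28 bytes of overhead
    echo $(( $1 - 28 ))
}
```

So icmp_payload_for_mtu 1500 gives 1472 and icmp_payload_for_mtu 1400 gives 1372, which is why ping -M do -s "$(icmp_payload_for_mtu 1400)" hr.tcs.local is the right test after lowering the MTU to 1400.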

SCN-16 · "No healthy App Connector available for this app"
⚠ Problem / Symptom

Users get an explicit "no healthy connector" error for an app, even though a connector exists in the tenant. The app segment has no healthy connector that can actually reach it.

◆ Likely cause(s)
  • The only connector for that segment is Disconnected or unhealthy (single point of failure — no connector group with a healthy peer)
  • The app segment / server group isn't mapped to any connector group that can route to the app's subnet
  • All connectors in the group are over capacity and marked unhealthy
🔍 Diagnosis

In the portal, open the app's Server Group → Connector Group mapping and confirm at least one member is Connected and reaches the app subnet.

From a connector in the group
nc -zv hr.tcs.local 8443
sudo journalctl -u zpa-connector | grep -i "no healthy"
Expected output
zpa-connector[2400]: app request: no healthy connector for segment "HR-Portal"
# group has 1 member and it's Disconnected (or can't route to 10.20.9.0/24)
🛠 Fix

Bring the connector back to healthy (a Disconnected connector is a left-branch problem, so work SCN-01..08; an overloaded or flapping one is SCN-09..12), and deploy connectors in pairs per location so one failure never means "no healthy connector". Confirm the server/connector group mapping covers the app's subnet.

✓ Verify

The app segment shows ≥ 2 healthy connectors; the error stops; killing one connector no longer takes the app offline.

Figure 4 — One-glance cheat-sheet: top symptom → first command
A grid of six cheat-sheet tiles, each mapping a top symptom to the first command to run when the ticket lands:
  • "Disconnected": systemctl status, journalctl -u zpa-connector. Daemon alive? Log says why.
  • Won't enroll: cat /opt/zscaler/var/provision_key. Key valid? < max? Expiry?
  • TLS / cert error: chronyc tracking, openssl s_client … -issuer. Clock skew? SSL inspection?
  • 443 timeout: curl -v https://api.private.zscaler.com. FW block? Proxy needed?
  • App not reachable: dig +short app.fqdn, nc -zv app.fqdn PORT. DNS? Segment port match?
  • No healthy connector: Portal, Server Group → Connector Group mapping. ≥2 healthy? Covers subnet?
Golden order: status → logs → time → 443 → cert → DNS → segment → group. Connector ↔ cloud lives on the left. Connector ↔ app lives on the right. Pick the side before you pick a command.
Stick this on the wall. Six symptoms, the one command each, and the golden left-vs-right rule that decides every App Connector ticket.
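
If you'd rather keep the cheat-sheet in your shell than on the wall, the six tiles fit in one lookup function. This is a sketch only: the symptom keywords and the function name are made up for illustration, and the systemd unit name zpa-connector is inferred from the journalctl command used in this lesson.

```shell
#!/usr/bin/env bash
# Sketch: map a top symptom to the first command from the Figure 4 cheat-sheet.
# The symptom keywords are hypothetical labels, not Zscaler terms.

first_command() {
  case "$1" in
    disconnected)         echo 'systemctl status zpa-connector' ;;
    wont-enroll)          echo 'cat /opt/zscaler/var/provision_key' ;;
    tls-error)            echo 'chronyc tracking' ;;
    443-timeout)          echo 'curl -v https://api.private.zscaler.com' ;;
    app-not-reachable)    echo 'dig +short <app-fqdn>' ;;
    no-healthy-connector) echo 'portal: Server Group -> Connector Group mapping' ;;
    *)                    echo "unknown symptom: $1" >&2; return 1 ;;
  esac
}
```

For example, first_command tls-error prints chronyc tracking. The function only prints the command rather than running it, so you still read the output yourself and decide which side of the path you're on.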
👉 So far: all 16 scenarios sort into 4 buckets — enrollment, tunnel, health, last-mile. The aha: "Disconnected" is a cloud-side problem; "app not reachable" with a green connector is a config-side problem. Name the side first.

Predict: a single App Connector serves the HR app. It dies during a Sunday patch. Monday morning, every HR user sees "no healthy connector". The connector itself is the symptom — but what's the real design failure, and what stops this class of outage permanently?

The design failure is a single connector — no HA. One connector is a single point of failure for every app it serves. The permanent fix is to deploy connectors in pairs (a connector group of ≥2) per location. The cloud load-balances across healthy members, so losing one never means "no healthy connector".
Quick check · Q4 of 10

Connector is Connected and healthy, DNS resolves the app, but users still can't reach it. From the connector, nc -zv hr.tcs.local 8443 succeeds — but the app's Application Segment lists only port 443. What's the fix?

Correct: c. The connector reaching 8443 proves the last mile works. ZPA only brokers traffic for ports/domains defined in the App Segment. A port mismatch (app on 8443, segment says 443) means users never get brokered to the right socket. Fix the segment — the connector is innocent.
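
The port-mismatch check is easy to script once you've copied the segment's port list out of the portal. This is a rough sketch: the function name and the "443,8000-8100" list format are assumptions for illustration, not a ZPA export format.

```shell
#!/usr/bin/env bash
# Sketch: is the port the app listens on covered by the segment's port list?
# Assumption: you typed the list yourself as comma-separated ports or ranges,
# with no spaces, e.g. "443,8000-8100".

# port_in_segment PORT SPEC  -> exit 0 if covered, 1 if not
port_in_segment() {
  local port="$1" spec="$2" entry lo hi
  local IFS=','
  for entry in $spec; do
    lo="${entry%-*}"   # "8000-8100" -> 8000, "443" -> 443
    hi="${entry#*-}"   # "8000-8100" -> 8100, "443" -> 443
    if [ "$port" -ge "$lo" ] && [ "$port" -le "$hi" ]; then
      return 0
    fi
  done
  return 1
}
```

With the values from Q4, port_in_segment 8443 "443" fails while port_in_segment 8443 "443,8443" succeeds, which is the whole story: nc proves the connector side, this check covers the segment side.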


✍️ Explain it back (generation effect)

In two lines: why is "Disconnected" almost never a network-down problem? Type your answer first — then reveal the expert version and compare.

Expert answer: "Disconnected" means the connector lost its trusted control connection to the cloud — usually an identity or trust failure (clock skew breaking the cert, a changed fingerprint, SSL inspection on the pinned TLS, or blocked 443), not the underlying network being down. The host can ping the internet fine and still be Disconnected. So you read the connector log, not the routing table.
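
Clock skew is the quickest of those trust failures to rule out from the host. The sketch below reads chronyc tracking output and checks the "System time" offset against a limit. The function name and the 60-second default are illustrative assumptions, not a Zscaler-documented threshold.

```shell
#!/usr/bin/env bash
# Sketch: decide whether the host clock is close enough to NTP time.
# Reads `chronyc tracking` output on stdin and looks at the line
#   System time     : 0.000020390 seconds slow of NTP time
# Assumption: MAX_SKEW (default 60 s) is a placeholder; pick your own limit.

# clock_within_skew [MAX_SKEW]  -> exit 0 if offset <= MAX_SKEW, 1 otherwise
clock_within_skew() {
  local max="${1:-60}"
  awk -v max="$max" '
    /^System time/ { found = 1; offset = $4 + 0 }
    END { exit !(found && offset <= max) }
  '
}
```

Usage: chronyc tracking | clock_within_skew 60 || echo "clock skew too large, fix NTP first". The check also fails when no "System time" line is present, so a host without a working chronyd doesn't pass by accident.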
🎁 Teach a friend

A one-liner you can paste to a teammate who's stuck on a Disconnected connector:

"Quick tip: before you blame the network on a Disconnected ZPA App Connector, run journalctl -u zpa-connector + chronyc tracking. 90% of the time it's clock skew, a blocked outbound 443, SSL inspection breaking the pinned cert, or a changed fingerprint — not the link. 'Disconnected' = cloud-side; 'app not reachable' with a green connector = config-side. — learned this on ai.techclick.in"

📖 Glossary

App Connector
A lightweight Linux daemon (zpa-connector) deployed next to your private apps that dials outbound to the Zscaler cloud and brokers user→app sessions.
ZPA
Zscaler Private Access — connects users to private apps over the Zscaler cloud, no network-level VPN.
ZCC (Client Connector)
The agent on the user's device that forwards their traffic into the Zscaler cloud.
Broker / Service Edge (ZEN)
The Zscaler cloud node the App Connector dials out to on 443; it stitches the user side to the app side.
Provisioning key
The one-time enrollment token at /opt/zscaler/var/provision_key that authorises a connector to join the tenant.
Fingerprint (hardware ID)
The connector's identity, computed from MAC + VM + disk info. Changing the hardware breaks decryption of instance_id.crypt.
Application Segment
The portal object defining which domains/ports an app uses; ZPA only brokers traffic the segment declares.
Connector group
A set of App Connectors the cloud load-balances across — deploy ≥2 so one failure isn't "no healthy connector".
Certificate pinning
ZPA expects a specific cert; any re-signed cert (from SSL inspection) is rejected — which is why inspection must bypass ZPA traffic.

🎓 Prove it — 10-question assessment

Q1–Q4 above already count. Answer Q5–Q10 below, then Submit. Pass = 70% (7/10). Passing marks this lesson complete on your profile.

Q5 · Remember

On a Linux App Connector host, which file holds the one-time enrollment token you paste during provisioning?

Correct: c. The provisioning key lives at /opt/zscaler/var/provision_key. resolv.conf (b) is DNS; instance_id.crypt (d) is the sealed post-enrollment identity, not the key you paste; (a) doesn't exist.
Q6 · Analyze

Every App Connector in a DC dropped to Disconnected overnight. 443 is open, NTP is synced, DNS works. The firewall team enabled "decrypt all outbound TLS" yesterday. openssl s_client to the broker shows issuer = "Corporate-SSL-Inspection-CA". Root cause?

Correct: a. A corporate-CA issuer on the broker connection is the fingerprint of SSL inspection. ZPA pins its cert, so a re-signed cert fails validation and the tunnel drops. The timing (decrypt enabled yesterday) seals it. Fix = do-not-decrypt bypass for ZPA FQDNs/IPs.
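
The issuer check from Q6 can be turned into a before/after comparison. This is a sketch under assumptions: both function names are hypothetical, and the "expected" issuer text is something you record yourself from a connector you know is healthy. This lesson does not state what the genuine issuer string is.

```shell
#!/usr/bin/env bash
# Sketch: compare the issuer presented on the broker connection with one
# you recorded from a known-good connector.

# Network call: prints the issuer line of the certificate HOST presents on 443.
presented_issuer() {
  local host="$1"
  openssl s_client -connect "${host}:443" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer
}

# issuer_is_expected ISSUER_LINE EXPECTED_SUBSTRING -> exit 0 on match
issuer_is_expected() {
  case "$1" in
    *"$2"*) return 0 ;;
    *)      return 1 ;;
  esac
}
```

Usage: issuer_is_expected "$(presented_issuer <broker-host>)" "<issuer text from a healthy connector>" || echo "issuer changed, suspect SSL inspection". An issuer such as Corporate-SSL-Inspection-CA fails the match, which is the signal described in the answer above.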
Q7 · Analyze

A new connector won't enroll. The daemon is running, the key is valid, the clock is synced, DNS resolves the broker. The host can browse public websites. journalctl shows "broker connect failed: timeout". Most likely cause?

Correct: d. "Browses fine" only proves generic egress; ZPA needs the specific Zscaler destinations on outbound 443. A timeout straight to the broker = egress filtering. The connector never needs inbound 443 (c). Fingerprint (a) gives a decrypt error, not a connect timeout; app segments (b) are a last-mile concern.
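
When you reproduce the Q7 timeout with curl, the exit code already tells you which hop to suspect. A small sketch follows: the mapping uses curl's documented exit codes, but the function name and the wording of each verdict are mine, and curl validates against the system CA store rather than ZPA's pinned certificate, so treat the result as a hint rather than proof.

```shell
#!/usr/bin/env bash
# Sketch: turn a curl exit code into a first guess at the failing hop.
# Exit codes are curl's own (6, 7, 28, 35, 60); the verdict text is illustrative.

classify_curl_exit() {
  case "$1" in
    0)     echo 'reachable' ;;
    6)     echo 'dns: could not resolve host' ;;
    7)     echo 'connect failed: blocked or refused' ;;
    28)    echo 'timeout: suspect egress filtering on 443' ;;
    35|60) echo 'tls: handshake or certificate problem, suspect SSL inspection' ;;
    *)     echo "other: curl exit $1" ;;
  esac
}
```

Usage: curl -sS -o /dev/null --connect-timeout 10 https://api.private.zscaler.com; classify_curl_exit $?. A 28 from the connector while public sites load fine matches the egress-filtering diagnosis in the answer above.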
Q8 · Analyze

Connector is green/healthy in the portal. Users get "app not reachable" for hr.tcs.local. From the connector, dig hr.tcs.local returns status NXDOMAIN (and dig +short prints nothing). Where's the fault?

Correct: b. NXDOMAIN from the connector means the connector can't resolve the app — and ZPA resolves app FQDNs at the connector, not the client. So the user's DNS (c) is irrelevant. A green connector rules out broker/443 (a, d). Fix the connector's resolver or the DNS record.
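
One scripting detail: dig +short prints nothing at all for a name that doesn't exist, so a script is better off reading the status field from the full dig output. A minimal sketch, with a hypothetical function name:

```shell
#!/usr/bin/env bash
# Sketch: pull the response status (NOERROR, NXDOMAIN, SERVFAIL, ...) out of
# full `dig` output, which carries it on the ";; ->>HEADER<<-" line.

# Reads dig output on stdin; prints the status word of the first header found.
dig_status() {
  sed -n 's/.*->>HEADER<<-.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}
```

Usage from the connector: dig hr.tcs.local | dig_status. NXDOMAIN points at the record or the zone the connector's resolver is querying; SERVFAIL or no output at all points at the resolver itself.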
Q9 · Evaluate

A team runs one App Connector per data centre to save cost. After a patch reboot, an app showed "no healthy connector" for 20 minutes. They ask whether single connectors are an acceptable design. Your call?

Correct: c. A single connector is a single point of failure for every app it serves — exactly what caused the outage. HA pairs are the standard design; the cost of one more small VM is trivial against a 20-minute outage. Disabling upgrades (b) trades one risk for another; VPN (d) abandons the whole zero-trust model.
Q10 · Evaluate

A SOC complains that several rarely-used private apps intermittently show "unhealthy" in the portal, causing false alerts — yet users never report problems. The apps use On-Access health reporting. What's the best fix?

Correct: d. On-Access mode reports health only when someone connects, so idle apps look "unhealthy/unknown" — a false alert, not a real outage (which is why users see no problem). Continuous mode probes on a schedule for real-time status. Disabling reporting (b) blinds the SOC; adding connectors (a) and nightly restarts (c) don't touch the reporting-mode mismatch.

📚 Sources

  1. Zscaler Help — Troubleshooting App Connectors & ZPA App Connector Troubleshooting Runbook. help.zscaler.com/zpa
  2. Zscaler Help — About / Accessing App Connector Status Diagnostics & Managing Deployed App Connectors. help.zscaler.com/zpa
  3. Zscaler Help — App Connector Deployment Prerequisites & About App Connector Provisioning Keys. help.zscaler.com/zpa
  4. Zscaler Help Centre — ZPA App Connector: Cannot decrypt data from instance_id.crypt. zscaler.az-ap.com/article/15
  5. Zscaler Community (Zenith) — App Connector disconnected after a failed SARGE update & ZPA connector not starting. community.zscaler.com
  6. Zscaler Config — Zscaler Private Access Firewall Allowlist (broker destinations / ips.zscaler.net/zpa). config.zscaler.com/private.zscaler.com/zpa
  7. N. Catania — Deploying ZPA ZEN Connectors (CLI, file paths, NTP/DNS, proxy). nathancatania.com
  8. Zscaler Academy — ZDTA Certification blueprint (App Connectors, provisioning keys, deploying connectors). zscaler.com/zscaler-cyber-academy

What's next?

You can find a broken connector. Next: build the App Segments and access policies on top of it — domain sets, server groups, and the policy order that decides who reaches which private app.

— Techclick Team