On a Linux App Connector host, which file holds the one-time enrollment token you paste during provisioning?

Correct: c. The provisioning key lives at /opt/zscaler/var/provision_key. resolv.conf (b) is DNS; instance_id.crypt (d) is the sealed post-enrollment identity, not the key you paste; (a) doesn't exist.

A new connector won't enroll. The daemon is running, the key is valid, the clock is synced, DNS resolves the broker. The host can browse public websites. journalctl shows "broker connect failed: timeout". Most likely cause?

Correct: d. "Browses fine" only proves generic egress; ZPA needs the specific Zscaler destinations on outbound 443. A timeout straight to the broker = egress filtering. The connector never needs inbound 443 (c). Fingerprint (a) gives a decrypt error, not a connect timeout; app segments (b) are a last-mile concern.

Connector is green/healthy in the portal. Users get "app not reachable" for hr.tcs.local. From the connector, dig hr.tcs.local returns status NXDOMAIN (dig +short prints nothing at all). Where's the fault?

Correct: b. NXDOMAIN from the connector means the connector can't resolve the app — and ZPA resolves app FQDNs at the connector, not the client. So the user's DNS (c) is irrelevant. A green connector rules out broker/443 (a, d). Fix the connector's resolver or the DNS record.

A team runs one App Connector per data centre to save cost. After a patch reboot, an app showed "no healthy connector" for 20 minutes. They ask whether single connectors are an acceptable design. Your call?

Correct: c. A single connector is a single point of failure for every app it serves — exactly what caused the outage. HA pairs are the standard design; the cost of one more small VM is trivial against a 20-minute outage. Disabling upgrades (b) trades one risk for another; VPN (d) abandons the whole zero-trust model.

A SOC complains that several rarely-used private apps intermittently show "unhealthy" in the portal, causing false alerts — yet users never report problems. The apps use On-Access health reporting. What's the best fix?

Correct: d. On-Access mode reports health only when someone connects, so idle apps look "unhealthy/unknown" — a false alert, not a real outage (which is why users see no problem). Continuous mode probes on a schedule for real-time status. Disabling reporting (b) blinds the SOC; adding connectors (a) and nightly restarts (c) don't touch the reporting-mode mismatch.

My ZPA App Connector shows Disconnected but the host can ping the internet. What do I check first?

Disconnected is almost never network-down. Run systemctl status zpa-connector then journalctl -u zpa-connector — the log names the failure. Then check chronyc tracking (clock skew breaks the cert), curl the broker on 443, and the openssl issuer (SSL inspection breaks pinned certs). It is an identity/trust failure, not routing.

Why does the error 'Cannot decrypt data from instance_id.crypt' appear?

The connector's fingerprint (hardware ID) is computed from MAC, VM and disk info. If you clone the VM, change the MAC, or migrate it, the fingerprint changes and the connector can no longer decrypt its sealed identity. Fix: stop the service, wipe /opt/zscaler/var/ instance data, re-paste a valid provisioning key, restart, and pin the MAC.

Which firewall change silently kills every App Connector tunnel?

Enabling SSL inspection / TLS decryption on the connector's egress path. ZPA uses certificate pinning, so the connector rejects the inspection device's re-signed certificate and the control connection drops. Fix: add the Zscaler ZPA FQDNs/IPs to the do-not-decrypt bypass list.

What ports and destinations does the App Connector need?

Outbound TCP 443 only to the Zscaler broker/Service Edge — no inbound ports. Allowlist the full ZPA destination set from ips.zscaler.net/zpa so it can pick the nearest region. It also needs working DNS (broker + app FQDNs) and NTP (UDP 123) for clock sync, and the traffic must pass un-inspected.

The connector is healthy but users get 'app not reachable'. Whose problem is it?

Config, not connector. A green connector means cloud connectivity is fine. Check that the connector can resolve the app FQDN (dig from the connector), that the Application Segment lists the correct domain and port, and that a connector group covering the app's subnet is mapped. ZPA only brokers the ports/domains the segment defines.

Why does my connector keep flapping with TLS certificate errors when 443 and DNS are fine?

Clock skew. TLS validates certificates against the local clock, so a few minutes of drift fails the handshake and the connector flaps Connected/Disconnected. Run chronyc tracking; if it shows 'unsynchronised', fix the NTP source in /etc/chrony.conf, run chronyc makestep, and restart the service.

How do I run an App Connector behind an explicit proxy?

Create the file /opt/zscaler/var/proxy with the value proxy-host:port (e.g. 10.20.1.8:8080) and restart zpa-connector. The connector then reaches the broker via HTTP CONNECT. This proxies only connector-to-broker traffic, not connector-to-app traffic, and uses the file rather than system environment variables.

Why did I get 'no healthy App Connector available'?

The app segment has no healthy connector that can reach it — usually a single connector that went Disconnected, or a connector/server group that doesn't cover the app's subnet. Bring the connector healthy and deploy connectors in pairs (a group of two or more) per location so one failure never causes this error.

Troubleshooting the Zscaler ZPA App Connector

Q: Connector is Connected and healthy, DNS resolves the app, but users still can't reach it. From the connector, nc -zv hr.tcs.local 8443 succeeds — but the app's Application Segment lists only port 443. What's the fix?

Correct: c. The connector reaching 8443 proves the last mile works. ZPA only brokers traffic for ports/domains defined in the App Segment. A port mismatch (app on 8443, segment says 443) means users never get brokered to the right socket. Fix the segment; the connector is innocent.

Q: Every App Connector in a DC dropped to Disconnected overnight. 443 is open, NTP is synced, DNS works. The firewall team enabled "decrypt all outbound TLS" yesterday. openssl s_client to the broker shows issuer = "Corporate-SSL-Inspection-CA". Root cause?

Correct: a. A corporate-CA issuer on the broker connection is the fingerprint of SSL inspection. ZPA pins its cert, so a re-signed cert fails validation and the tunnel drops. The timing (decrypt enabled yesterday) seals it. Fix = do-not-decrypt bypass for ZPA FQDNs/IPs.

Start here · understand the lesson before the detail

What you are learning

This lesson helps you troubleshoot the App Connector from the connector outward. You will separate VM health, connector service, ZPA cloud connectivity, configuration binding, and private application reachability.

In plain English

A connector can appear green yet fail one application. Cloud health proves only part of the path. The connector must also resolve the private name, open the defined application port, belong to the correct group, and avoid TLS interception on its ZPA cloud connection.

Real example

Users receive APP_NOT_REACHABLE. The connector is online, but from its own network the payroll name resolves to an old IP and TCP 443 fails. Fixing Client Connector or access policy would not repair that server-side DNS and port failure.

Follow this flow

Check VM resources, clock, connector service, and local logs.
Resolve ZPA cloud endpoints and test outbound TLS 443 without interception.
Confirm connector ID, group, version, certificate, and health.
Trace connector-group to server-group to application-segment binding.
Resolve the private app and test its exact port from the connector network.
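
The flow above is an ordered walk that stops at the first failing layer. A minimal sketch of that idea, assuming you record each layer's result yourself as name=ok or name=fail (the function name and the layer labels are illustrative and not part of any Zscaler tooling):

```shell
# Print the first layer whose result is not "ok", in the order given.
# Arguments are hypothetical "layer=result" pairs recorded by the operator.
first_failing_layer() {
    for result in "$@"; do
        case $result in
            *=ok) ;;                              # layer passed, keep walking
            *) echo "${result%%=*}"; return 1 ;;  # stop at the first failure
        esac
    done
    echo "none"
}

# Example call (cloud TLS failed, so later layers are not yet worth chasing):
#   first_failing_layer vm=ok service=ok cloud=fail binding=ok app=fail
#   prints: cloud
```

The point of stopping early is that a failure at one layer usually makes results from later layers meaningless until it is fixed.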

Evidence to collect

CPU, memory, disk, service, restart, and log timestamps
ZPA endpoint DNS and TLS certificate result
Connector ID, status code, group, and Service Edge
Private DNS, route, TCP result, health probe, and packet capture

Common mistake to avoid

Do not test only from an administrator laptop. The App Connector’s DNS, route, firewall, and source network are what matter. Avoid deleting or re-enrolling a connector until the identity and recovery consequences are understood.

Current official source checkpoint

Troubleshooting App Connectors: current official reference used for this beginner explanation
App Connector runbook: current official reference used for this beginner explanation

A six-layer App Connector diagnostic path checks VM health, service, DNS/time, cloud TLS, group bindings, and application DNS and port reachability. — ChatGPT-generated beginner infographic for this lesson. Read the labelled flow once, then continue into the technical detail below.

Key terms before you continue

NO_CONNECTOR_AVAILABLE: No healthy eligible connector was selected.

APP_NOT_REACHABLE: Selected connector could not reach the application.

INVALID_DOMAIN: Application name was invalid or unresolved.

TLS Pinning: ZPA rejects intercepted connector cloud certificates.

The belief that costs you four hours

Most engineers see "Disconnected" in the ZPA portal and assume the network is down. Wrong — and that wrong instinct is exactly why a 5-minute fix turns into a half-day outage. A connector can show Disconnected because its clock drifted by 6 minutes, because a VM was cloned and its hardware fingerprint changed, or because someone enabled SSL inspection on the firewall last night. The network is fine. The identity or the trust broke.

So the senior move is never "is the network up?". It's "which hop in the path broke, and what does that specific hop's log say?" This blog teaches you the path first, then walks all 16 failure scenarios that live along it.

Before you read — 3 questions to sit with

No scoring. Just notice which ones you can't answer yet — those are the sections to slow down on. We answer all three as you scroll.

The connector shows Disconnected but you can SSH into the box and ping the internet. What single thing do you check first?
Which one firewall change, done with good intentions, silently kills every App Connector tunnel?
The connector is green and healthy, but users get "app not reachable". Whose problem is it — the connector's, or the config's?

The big picture — where the App Connector sits

The ZPA App Connector is a lightweight Linux daemon you run next to your private apps — in your data centre, your VPC, your branch. It dials outbound only to the Zscaler cloud — specifically a broker — and stitches together two halves of a connection: the user's side (handled by ZCC) and the app's side. The user never routes to your network; the connector reaches out to meet the cloud. That's the whole trick — and the whole troubleshooting map.

[Diagram: where the App Connector sits. Legend: Zscaler cloud / broker and endpoints; App Connector; healthy outbound 443 tunnel; break-point / failure.]

👉 So far: the connector dials out on 443, the cloud brokers user↔app, and the user never touches your network. Next: the four buckets every failure falls into.

Your diagnostic toolbox

These six are the commands and files you'll reach for in every scenario below. Memorise each one and what it proves.

🩺 Service status: systemctl status zpa-connector
Is the daemon even running? Shows active (running) or a crash reason. Your first command on every ticket, before you touch the portal.

📜 Live logs: journalctl -u zpa-connector
The truth source. Shows connector ID, version, cert expiry, "Control connection state", and the broker it's talking to. Add -f to follow live.

🔑 Provisioning key: /opt/zscaler/var/provision_key
The one-time enrollment token. If empty, malformed, or wrapped in smart-quotes, enrollment never starts. Plain double-quotes only.

🆔 Instance data: /opt/zscaler/var/
Holds instance_id.crypt, the fingerprint-encrypted identity. Wipe this dir to force a clean re-enroll when the hardware ID changed.

🕐 Time sync: chronyc tracking
TLS dies on clock drift. /etc/chrony.conf sets the NTP source. With more than a few minutes of skew, cert validation fails and the connector goes Disconnected.

🌐 Name resolution: /etc/resolv.conf
The connector resolves both the broker and your apps. If the zscaler user can't read this file, enrollment and app lookups fail. So-what: a permission bit can break everything.

Watch the connector come online, and see where it breaks

Before the scenarios, walk the six stages a connector goes through from a cold boot to brokering a live session. Each scenario below maps to exactly one of these stages failing.

App Connector bring-up: 6 stages

① SERVICE: zpa-connector daemon starts on the Linux host (10.20.5.11).
② RESOLVE: Reads /etc/resolv.conf, resolves the Zscaler broker FQDNs via DNS.
③ KEY: Reads /opt/zscaler/var/provision_key and presents it to enroll.
④ TLS 443: Dials the broker outbound on TCP 443, validates the pinned cert. This is the hop that fails most.
⑤ ENROLL: Fingerprint computed (MAC+VM+disk), identity sealed into instance_id.crypt. Portal flips to Connected.
⑥ BROKER: Control connection up. Cloud now brokers user↔app sessions through this connector.

Stage ④ is the most common failure point: a blocked 443 or time skew stops the connector there.

Bucket 1 — Enrollment & onboarding failures

Symptoms here all share one trait: the connector never reaches Connected, or it was Connected and dropped to Disconnected and won't come back. The portal status lives at Configuration & Control → Private Infrastructure → App Connector Management → App Connectors. Your on-box truth source is always journalctl -u zpa-connector.
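
Because the log names the failure, a first pass over it can be mechanical. The sketch below maps one log line to the scenario this lesson files it under. The matched strings are the sample log lines quoted in this lesson, so treat them as assumptions: real wording may differ by connector version, and classify_log_line is a hypothetical helper name.

```shell
# Map one connector log line to the scenario this lesson files it under.
# The patterns are the lesson's sample log strings, not an official list.
classify_log_line() {
    case $1 in
        *"Cannot decrypt data from instance_id.crypt"*) echo "SCN-03 fingerprint changed" ;;
        *"enrollment rejected"*)                        echo "SCN-02 provisioning key" ;;
        *"broker connect failed"*)                      echo "Bucket 2 tunnel to the cloud" ;;
        *"Control connection state: CONNECTED"*)        echo "healthy" ;;
        *)                                              echo "unclassified" ;;
    esac
}
```

On a host you could feed it the newest line with something like classify_log_line "$(sudo journalctl -u zpa-connector -n 1 -o cat)". Anything that comes back unclassified still needs a human read.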

SCN-01 · Connector shows "Disconnected" / never enrolled

⚠ Problem / Symptom

Aditya at Wipro deployed the VM, but the portal App Connectors page shows the connector grey/absent, or status Disconnected. It never turned green.

◆ Likely cause(s)

Daemon not running (crashed or never started)
No outbound 443 to the broker (Bucket 2)
Provisioning key missing/empty
DNS can't resolve the broker FQDNs

🔍 Diagnosis

Start on the box, not the portal. Confirm the daemon, then read the log.

On the connector host

sudo systemctl status zpa-connector
sudo journalctl -u zpa-connector --since "15 min ago"

Expected output (healthy)

● zpa-connector.service - Zscaler App Connector
   Active: active (running) since Fri 2026-05-30 09:12:04 IST
   ...
zpa-connector[2041]: Connector ID 88121, version 24.x
zpa-connector[2041]: Control connection state: CONNECTED to broker
zpa-connector[2041]: Certificate valid until 2027-05-29

🛠 Fix

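If the status shows Active: inactive (dead), start the service and enable it at boot with sudo systemctl enable --now zpa-connector. If the log shows TLS/connect errors, jump to Bucket 2. If it shows a provisioning-key error, jump to SCN-02; if it shows an instance_id.crypt decrypt error, jump to SCN-03.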

✓ Verify

Log shows Control connection state: CONNECTED and the portal flips to a green Connected within ~60 seconds.
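
To script the verify step rather than eyeball it, you can extract the most recent control state from the log. This is a small sketch that assumes log lines contain "Control connection state: <STATE>" as in the sample output above; last_control_state is a hypothetical helper name.

```shell
# Read connector log text on stdin and print the most recent control state.
# Prints UNKNOWN when no "Control connection state:" line is present.
last_control_state() {
    awk '
        /Control connection state: / {
            sub(/.*Control connection state: /, "")  # drop everything before the state
            split($0, words, " ")                     # the state is the first word left
            state = words[1]
        }
        END { print (state == "" ? "UNKNOWN" : state) }'
}
```

Usage on the host would look like sudo journalctl -u zpa-connector --since "15 min ago" | last_control_state. Only the last matching line counts, so an earlier DISCONNECTED that has since recovered does not mask a healthy state.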

SCN-02 · Invalid or expired provisioning key: enrollment refused

⚠ Problem / Symptom

Brand-new connector, but the log loops on an enrollment/authorization rejection and the portal never shows the connector. The key may have hit its max-enrollment count or passed its expiry date.

◆ Likely cause(s)

Key expired (keys carry an expiry)
Key already used up to its max number of connectors
Key copied with smart-quotes / trailing newline
Key belongs to a different connector group

🔍 Diagnosis

In the portal, open the key under App Connector Management → Provisioning Keys and check enrolled count < max and the expiry. On the box, eyeball the key file.

On the connector host

sudo cat /opt/zscaler/var/provision_key
sudo journalctl -u zpa-connector | grep -i -E "enroll|key|auth"

Expected output (bad key)

3|api.private.zscaler.com|hex... (one clean line, no curly quotes)
zpa-connector[2102]: enrollment rejected: provisioning key invalid/expired
zpa-connector[2102]: retrying enrollment in 30s

🛠 Fix

Generate or edit a valid key in the portal (raise max-count or extend expiry), then re-paste with plain double-quotes (U+0022):

Re-seed the key

sudo systemctl stop zpa-connector
echo "3|api.private.zscaler.com|VALID-KEY-HEX" | sudo tee /opt/zscaler/var/provision_key
sudo systemctl start zpa-connector

✓ Verify

Log moves past enrollment to Control connection state: CONNECTED; the portal's key "enrolled" counter increments by one.
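
Paste damage is easy to lint before the key ever reaches the file. The sketch below checks only the copy-and-paste problems listed in this scenario (empty value, curly quotes, stray whitespace or line breaks). It cannot tell whether the key is expired or exhausted, since that information lives in the portal. The function name is hypothetical.

```shell
# Lint a provisioning key string before writing it to provision_key.
# Prints a one-word verdict and returns non-zero when the paste looks damaged.
check_provision_key() {
    newline='
'
    case $1 in
        '')              echo "empty";        return 1 ;;
        *“*|*”*|*‘*|*’*) echo "smart-quotes"; return 1 ;;  # curly quotes from a rich-text paste
        *"$newline"*)    echo "multi-line";   return 1 ;;  # key must be one line
        *' '*)           echo "whitespace";   return 1 ;;  # embedded space
        *)               echo "ok" ;;
    esac
}
```

To lint the file already on disk, pass its content: check_provision_key "$(sudo cat /opt/zscaler/var/provision_key)". Command substitution strips the trailing newline that echo and tee add, so a normal single-line file reports ok.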

SCN-03 · "Cannot decrypt data from instance_id.crypt": fingerprint changed

⚠ Problem / Symptom

A previously-working connector goes Disconnected after a VM clone, MAC change, disk resize, or migration. The log shows "Cannot decrypt data from instance_id.crypt".

◆ Likely cause(s)

The fingerprint (hardware ID) is computed from MAC + VM info + disk info. If any of those change, the connector can no longer decrypt its sealed identity and can't prove who it is to the cloud.

🔍 Diagnosis

Confirm the error

sudo journalctl -u zpa-connector | grep -i "instance_id.crypt"

Expected output

zpa-connector[1980]: ERROR Cannot decrypt data from instance_id.crypt
zpa-connector[1980]: fingerprint mismatch — host identity changed
zpa-connector[1980]: enrollment required

🛠 Fix

Wipe the stale instance data and re-enroll with a fresh/reusable key. The old connector entry in the portal can be deleted.

Clean re-enroll

sudo systemctl stop zpa-connector
# Run the glob inside a root shell so it still expands if your own user cannot list the directory
sudo sh -c 'rm -f /opt/zscaler/var/*.crypt'
echo "3|api.private.zscaler.com|REUSABLE-KEY" | sudo tee /opt/zscaler/var/provision_key
sudo systemctl start zpa-connector

✓ Verify

A new connector object appears in the portal as Connected. Pin the VM's MAC address so this never recurs after a reboot or vMotion.

SCN-04 · Enrollment fails because DNS won't resolve the broker

⚠ Problem / Symptom

The connector starts but never enrolls. The log shows name-resolution failures for the Zscaler broker FQDNs — even though the host itself seems to have internet.

◆ Likely cause(s)

The zscaler service account can't read /etc/resolv.conf (wrong permission bit)
No nameserver configured, or it's an internal-only resolver that can't reach Zscaler FQDNs

🔍 Diagnosis

Check resolution + perms

ls -l /etc/resolv.conf
sudo -u zscaler nslookup api.private.zscaler.com

Expected output (broken)

-rw------- 1 root root 88 May 30 /etc/resolv.conf   # zscaler user can't read
;; connection timed out; no servers could be reached

🛠 Fix

Grant the zscaler user read access and ensure a working resolver is set.

Repair

sudo chmod 644 /etc/resolv.conf
sudo systemctl restart zpa-connector

✓ Verify

sudo -u zscaler nslookup api.private.zscaler.com returns an answer, and the log proceeds to enroll.
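
The permission column in the ls -l output is what decides this scenario. As a rough sketch, the helper below reads that column and reports whether the "other" read bit is set, which is the bit a service account relies on when the file is owned by root:root as in the broken example. The name others_can_read is hypothetical.

```shell
# Given the mode column from `ls -l` (for example "-rw-------"), succeed only
# if the "other" read bit is set. Layout: 1 type char, then user, group, other.
others_can_read() {
    case $1 in
        ???????r*) return 0 ;;   # 8th character is the other-read bit
        *)         return 1 ;;
    esac
}
```

On the host itself the more direct test is sudo -u zscaler test -r /etc/resolv.conf, assuming the service account is named zscaler as in this lesson. Since /etc/resolv.conf is often a symlink, ls -lL shows the permissions of the file it points to.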

Predict: the connector is Disconnected, daemon is running, and journalctl shows TLS handshake timeouts to the broker. The host can browse the internet fine. What's the single most likely cause?

A firewall / proxy blocking outbound 443 to the Zscaler broker, or an SSL-inspection box breaking the pinned-cert TLS. "Browses the internet fine" only proves general egress — ZPA needs specific Zscaler FQDNs on 443, un-inspected. That's all of Bucket 2.

Quick check · Q1 of 10

Priya at Infosys clones a working App Connector VM to spin up a second one. The clone shows Disconnected and the log says "Cannot decrypt data from instance_id.crypt". What happened?

a) The clone inherited the original's sealed identity, but its new MAC/VM/disk changed the fingerprint, so it can't decrypt that identity. Wipe instance data and re-enroll.
b) The provisioning key expired
c) Port 443 is blocked on the clone
d) The clone needs a reboot

Correct: a. The fingerprint is MAC+VM+disk. A clone keeps the original's instance_id.crypt but has a different hardware identity, so decryption fails. Never clone an enrolled connector — wipe /opt/zscaler/var/ and re-enroll, or build from a clean image.


Bucket 2 — The tunnel to the cloud

Enrollment can be perfect and the connector still won't connect, because the control connection rides outbound TCP 443 to the Zscaler broker. Anything that blocks, intercepts, or mis-routes that 443 stream lands here. This is the bucket that answers pre-quiz Q2 — and it's the one engineers misdiagnose most, because the host "has internet".

SCN-05 · Outbound 443 to the broker blocked by firewall

⚠ Problem / Symptom

Connector Disconnected; log shows the TLS dial to the broker timing out or being refused. General web browsing from the host works, which throws people off.

◆ Likely cause(s)

The corporate firewall/egress filter doesn't allow outbound 443 to the Zscaler broker FQDNs. The connector needs only outbound 443 — no inbound ports — but it must reach the specific Zscaler destinations.

🔍 Diagnosis

Test the exact path the connector uses. The broker list is published at ips.zscaler.net/zpa.

Prove the 443 path

curl -sv -m 8 https://api.private.zscaler.com 2>&1 | head
sudo journalctl -u zpa-connector | grep -i -E "broker|connect|timeout"

Expected output (blocked)

*   Trying 165.225.x.x:443...
* connect to 165.225.x.x port 443 failed: Connection timed out
zpa-connector[2210]: broker connect failed: no route / timeout

🛠 Fix

Allowlist outbound TCP 443 from the connector's IP (e.g. 10.20.5.11) to the Zscaler ZPA destination ranges/FQDNs from ips.zscaler.net/zpa. No inbound rules needed.

✓ Verify

curl -v to the broker completes the TCP+TLS, and the connector log reports Control connection state: CONNECTED.
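
curl's exit status already tells you which hop failed, so you don't have to read the verbose output every time. The sketch below uses curl's documented exit codes (6, 7, 28, 35, 60). The mapping onto scenario numbers is this lesson's reading of them and not an official Zscaler table, and explain_curl_exit is a hypothetical name.

```shell
# Translate a curl exit status into the scenario it most likely points at.
explain_curl_exit() {
    case $1 in
        0)     echo "TCP and TLS completed" ;;
        6)     echo "DNS failure, see SCN-04" ;;                         # could not resolve host
        7|28)  echo "connect refused or timed out, see SCN-05 / SCN-07" ;;
        35|60) echo "TLS or certificate failure, see SCN-06 / SCN-09" ;;
        *)     echo "other curl error $1" ;;
    esac
}
```

Typical use: curl -s -o /dev/null -m 8 https://api.private.zscaler.com; explain_curl_exit $?. A result of 0 does not rule out SSL inspection when the corporate CA is installed in the host's trust store, which is why SCN-06 checks the certificate issuer directly.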

SCN-06 · SSL inspection breaks the pinned-cert TLS

⚠ Problem / Symptom

443 is open, but the connector still won't connect — and the log shows a certificate/TLS validation failure. Often appears right after the firewall team enables decryption on the egress segment.

◆ Likely cause(s)

ZPA uses TLS with certificate pinning. An SSL-inspection device decrypts and re-signs with its own CA — which the connector rejects, because the pinned cert no longer matches.

🔍 Diagnosis

Inspect the presented cert

echo | openssl s_client -connect api.private.zscaler.com:443 2>/dev/null \
  | openssl x509 -noout -issuer

Expected output (intercepted)

issuer=CN = Corporate-SSL-Inspection-CA, O = TCS-Proxy
# Should be a Zscaler issuer. A corporate CA here = you're being decrypted.

🛠 Fix

Add the Zscaler ZPA domains/IPs to the SSL-inspection bypass/do-not-decrypt list on the firewall or proxy. ZPA traffic must pass through un-inspected.

✓ Verify

The openssl issuer now shows a Zscaler CA, the TLS validation error clears, and the connector goes Connected.
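
If you check many connectors, the issuer comparison can be wrapped in a helper. This sketch matches the issuer line from openssl x509 -noout -issuer against a substring the caller supplies. Using "Zscaler" as that substring follows this lesson's statement and is an assumption you should confirm against a known-good connector in your own tenant. issuer_is_expected is a hypothetical name.

```shell
# Succeed only if the issuer line contains the expected substring.
# $1 = issuer line, $2 = expected substring (must be non-empty).
issuer_is_expected() {
    [ -n "$2" ] || return 2      # refuse to match against nothing
    case $1 in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

With the intercepted sample above, issuer_is_expected 'issuer=CN = Corporate-SSL-Inspection-CA, O = TCS-Proxy' 'Zscaler' fails, which is the signal that traffic is being decrypted on the way out.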

SCN-07 · Connector sits behind an explicit proxy (no direct egress)

⚠ Problem / Symptom

The data centre forces all outbound through an explicit proxy. The connector can't dial the broker directly, so it never connects — even with 443 "allowed" on the proxy.

◆ Likely cause(s)

The connector has no proxy configured, so it tries a direct outbound that the network drops. It needs to be told to use the proxy with HTTP CONNECT.

🔍 Diagnosis

Confirm there's no direct path and that a proxy is mandatory on this segment, then check for the proxy file.

Check

ls -l /opt/zscaler/var/proxy 2>/dev/null || echo "no proxy file set"
curl -sv -m 8 https://api.private.zscaler.com 2>&1 | head -3

Expected output (proxy required)

no proxy file set
*   Trying ...:443...
* connect ... failed: Connection timed out   # direct egress is blocked

🛠 Fix

Create /opt/zscaler/var/proxy with <proxy-host>:<port>. This only proxies connector↔broker traffic — not connector↔app traffic.

Configure the proxy

echo "10.20.1.8:8080" | sudo tee /opt/zscaler/var/proxy
sudo systemctl restart zpa-connector

✓ Verify

The connector now reaches the broker via CONNECT through 10.20.1.8:8080; log shows CONNECTED. Proxy access logs show the connector's CONNECT requests.
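
A malformed proxy value fails in the same quiet way as a missing one, so it can help to validate the string before writing the file. The rules in this sketch (no scheme prefix, a numeric port between 1 and 65535) are assumptions drawn from the example value in this lesson rather than a documented grammar, and valid_proxy_entry is a hypothetical name.

```shell
# Validate a "host:port" string before writing it to the proxy file.
valid_proxy_entry() {
    entry=$1
    case $entry in
        *://*|*' '*|'') return 1 ;;   # scheme prefix, spaces, or empty
        *:*) ;;                        # must contain a host:port separator
        *)   return 1 ;;
    esac
    host=${entry%:*}
    port=${entry##*:}
    [ -n "$host" ] || return 1
    case $port in
        ''|*[!0-9]*|??????*) return 1 ;;   # non-numeric or longer than 5 digits
    esac
    [ "$port" -ge 1 ] && [ "$port" -le 65535 ]
}
```

For example, valid_proxy_entry "10.20.1.8:8080" succeeds, while a value pasted as http://10.20.1.8:8080 is rejected because of the scheme prefix.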

SCN-08 · Wrong region / broker: high latency or repeated re-broker

⚠ Problem / Symptom

The connector is Connected but app sessions feel slow, or the connector keeps re-selecting brokers. A Mumbai-hosted app is being brokered through a far-away Service Edge.

◆ Likely cause(s)

The connector reaches a sub-optimal Service Edge because closer broker FQDNs/ranges are blocked by the firewall, forcing it onto distant ones — adding latency on every brokered session.

🔍 Diagnosis

See which broker it picked

sudo journalctl -u zpa-connector | grep -i -E "broker|service edge|latency"

Expected output

zpa-connector[2301]: connected to broker (region: us-west)  # app is in ap-south
zpa-connector[2301]: control latency 210 ms

🛠 Fix

Allowlist the full ZPA destination set from ips.zscaler.net/zpa (not just one region) so the connector can pick the nearest Service Edge. Confirm the connector's deployment region is correct.

✓ Verify

Log shows a nearby region and control latency drops (e.g. 210 ms → < 40 ms). Users report apps feel snappy again.

Predict: the firewall team enabled "decrypt all outbound TLS for visibility" last night. This morning every App Connector in that data centre is Disconnected. Why — and what's the fix that keeps their visibility goal mostly intact?

SSL inspection breaks ZPA's certificate pinning — the connector rejects the proxy's re-signed cert. Fix: add the Zscaler ZPA FQDNs/IPs to the do-not-decrypt bypass list. The firewall still inspects everything else; only the pinned ZPA tunnel is exempt (which it can't read anyway).

Quick check · Q2 of 10

A connector in a locked-down DC has no direct internet — everything goes via proxy 10.20.1.8:8080. Enrollment fails with connection timeouts. What's the correct fix?

a) Open inbound 443 to the connector
b) Write the proxy host:port into /opt/zscaler/var/proxy and restart; the connector then reaches the broker via HTTP CONNECT.
c) Set http_proxy in /etc/environment
d) Disable SSL inspection on the app servers

Correct: b. The connector reads its proxy from /opt/zscaler/var/proxy (not system env vars). This proxies only connector↔broker traffic. The connector needs no inbound ports (a is wrong), and SSL-inspection on app servers is unrelated (d).

Bucket 3 — Connector health

Now the connector enrolls and reaches the broker — but it's unhealthy: it flaps, it's slow, or it drifts. These are the "it was fine yesterday" tickets. The decision tree below is your triage map for the whole blog; print it.

SCN-09 · NTP / time skew breaks cert validation

⚠ Problem / Symptom

Connector flaps between Connected and Disconnected, or won't connect at all, with TLS/certificate errors — despite open 443 and no inspection. Classic after a VM was suspended/resumed and its clock drifted.

◆ Likely cause(s)

TLS rejects certificates whose validity window doesn't match the local clock. A few minutes of skew is enough to fail the handshake. /etc/chrony.conf has no reachable NTP source.

🔍 Diagnosis

Check the clock

chronyc tracking
timedatectl status

Expected output (skewed)

Reference ID    : 00000000 (unsynchronised)
System time     : 312.4 seconds slow of NTP time
Leap status     : Not synchronised

🛠 Fix

Point /etc/chrony.conf at a reachable NTP server (internal preferred, Zscaler pool as fallback), restart chrony, force a sync.

Repair time sync

sudo systemctl restart chronyd
sudo chronyc makestep
sudo systemctl restart zpa-connector

✓ Verify

chronyc tracking shows Leap status: Normal and sub-second offset; the cert error clears and the connector stays Connected.
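
The skew check can be automated by reading the "System time" line of chronyc tracking. This is a minimal sketch that assumes the line layout shown in the sample output above. The limit is chosen by the caller, and the 60 seconds used in the example is an illustrative value, not a Zscaler threshold. clock_skew_exceeds is a hypothetical name.

```shell
# Read `chronyc tracking` output on stdin. Exit 0 if the reported offset is
# larger than the limit in seconds ($1), 1 if it is within the limit, and
# 2 if no "System time" line was found.
clock_skew_exceeds() {
    awk -v limit="$1" '
        /^System time/ {
            # Layout: "System time : <offset> seconds slow|fast of NTP time"
            for (i = 1; i <= NF; i++) if ($i == "seconds") offset = $(i - 1)
            found = 1
        }
        END {
            if (!found) exit 2                   # no "System time" line seen
            if (offset + 0 > limit + 0) exit 0   # skew exceeds the limit
            exit 1
        }'
}
```

Example: if chronyc tracking | clock_skew_exceeds 60; then echo "clock skew too large"; fi. With the skewed sample above (312.4 seconds slow), the check fires.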

SCN-10 · CPU / memory overload → app slowness, drops

⚠ Problem / Symptom

Karthik at TCS reports private apps are slow or sessions drop at peak hours. The connector is Connected, but it's pinned at high CPU/RAM, or shows a memory leak over days.

◆ Likely cause(s)

One connector is carrying too much load (under-provisioned, or it's the only one for a busy segment). Minimum spec is 2 vCPU / 4 GB RAM — busy segments need more, or more connectors.

🔍 Diagnosis

Check load

top -b -n1 | head -12
free -m

Expected output (overloaded)

%Cpu(s): 96.3 us,  2.1 sy   load average: 7.8, 7.2, 6.9
Mem:  total 3902   used 3760   free 142   # 96% RAM used
2041 zscaler  ... zpa-connector  98.0 %CPU

🛠 Fix

Add a second connector to the same connector group (the cloud load-balances across healthy members), or resize the VM up. For suspected leaks, collect periodic memory reports for Zscaler Support.

✓ Verify

Per-connector CPU drops below ~70%, app latency recovers, and the portal shows load spread across both connectors in the group.

SCN-11 · Version outdated / auto-upgrade (SARGE) failed

⚠ Problem / Symptom

A connector drops to Disconnected right after a scheduled auto-upgrade window, or the portal flags it as running an old version that won't re-register.

◆ Likely cause(s)

The in-place upgrade (SARGE) failed mid-flight — interrupted download, disk full, or a stale package state — leaving the service unable to start cleanly.

🔍 Diagnosis

Check version + upgrade log

sudo journalctl -u zpa-connector | grep -i -E "version|upgrade|sarge"
df -h /opt

Expected output (failed upgrade)

zpa-connector[2280]: upgrade (SARGE) failed: package verification error
zpa-connector[2280]: service start aborted
/dev/sda1   8.0G  8.0G   0    100%  /opt   # disk full

🛠 Fix

Free disk, update the package, reboot. If it still won't register cleanly, the supported path is to rebuild from the latest image and re-enroll (deploy in pairs so this never causes an outage).

Recover

sudo yum update -y
sudo reboot
# verify after boot:
sudo systemctl status zpa-connector

✓ Verify

Portal shows the connector on a current version and Connected. The connector group still served traffic throughout because its HA peer stayed up.
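
Since a full disk is one of the listed causes, a pre-upgrade disk check is cheap insurance. The sketch below reads df -P output and prints the use percentage as a bare number. The -P flag selects the POSIX layout with one line per filesystem, and disk_use_percent is a hypothetical helper name. What counts as "too full" is left to you.

```shell
# Read `df -P <path>` output on stdin and print the use percentage
# (column 5 of the data line) without the trailing percent sign.
disk_use_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

Usage: df -P /opt | disk_use_percent. Run it before the scheduled upgrade window so the disk can be freed while the connector is still serving traffic.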

SCN-12 · Frequent flapping + health-reporting mode confusion

⚠ Problem / Symptom

The connector toggles Connected/Disconnected repeatedly, or an app shows healthy in continuous mode but goes "unreachable" in on-access mode (or vice-versa) — and the team argues about whether the connector is actually broken.

◆ Likely cause(s)

Routing/MTU instability or periodic restarts cause real flapping
Health-reporting mode mismatch: Continuous probes apps on a schedule; On Access only checks when a user connects — so "unhealthy" can simply mean "nobody has accessed it yet".

🔍 Diagnosis

Look for restart loops

sudo journalctl -u zpa-connector | grep -i -E "restart|disconnect|reconnect" | tail

Expected output

09:31:02 Control connection state: DISCONNECTED
09:31:40 Control connection state: CONNECTED
09:34:11 Control connection state: DISCONNECTED   # flapping ~every 3 min

🛠 Fix

For real flapping, stabilise the network path (see SCN-15 MTU) and check for a watchdog restarting the service. For false "unhealthy", confirm the app segment's health-reporting mode matches expectations — set Continuous if you want proactive status, accept that On Access reports only on demand.

✓ Verify

The Connected state holds steady for > 30 min with no restart log lines; app health reflects the chosen mode's expected behaviour.

Quick check · Q3 of 10

443 is open, no SSL inspection, DNS works — yet a connector keeps flapping with TLS certificate errors. chronyc tracking shows "312 seconds slow / unsynchronised". Root cause?

a) Provisioning key expired
b) Clock skew: the local time is 5+ minutes off, so TLS rejects the cert's validity window. Fix NTP in /etc/chrony.conf, makestep, restart.
c) The connector is overloaded on CPU
d) The app segment is misconfigured

Correct: b. "Unsynchronised / 312s slow" is the smoking gun. TLS validates certs against the local clock; minutes of skew fail the handshake and the connector flaps. Time is the most-missed root cause for cert errors when 443/inspection/DNS all check out.
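
The clock check in this question can be scripted. This is a sketch under assumptions: it expects the `System time` and `Leap status` lines in the layout `chronyc tracking` normally prints, the function names are illustrative, and the skew limit is whatever you choose to alert on rather than a documented ZPA threshold.

```shell
# Print the absolute offset (seconds) from the "System time" line of
# `chronyc tracking` output supplied on stdin.
clock_offset_seconds() {
  awk -F': *' '/^System time/ { split($2, f, " "); print f[1] }'
}

# Exit 0 when chrony reports the clock as not synchronised.
is_unsynchronised() {
  grep -q 'Leap status *: *Not synchronised'
}

# Exit 0 when the offset exceeds a caller-chosen limit.
# $1 = offset in seconds, $2 = limit in seconds
skew_exceeds() {
  awk -v o="$1" -v l="$2" 'BEGIN { exit !(o + 0 > l + 0) }'
}

# Usage on the connector host:
#   off=$(chronyc tracking | clock_offset_seconds)
#   skew_exceeds "$off" 60 && echo "clock is ${off}s off: fix NTP, then: sudo chronyc makestep"
```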

Bucket 4 — The last mile to the app

The connector is green and healthy. Users still get "app not reachable". This is pre-quiz Q3 — and the answer is almost always config, not connector. The connector is fine; what's between it and the app (DNS, the app-segment definition, MTU, or simply no connector covering that app) is not.

SCN-13 Connector can't resolve the app's FQDN

⚠ Problem / Symptom

Healthy connector, but users can't reach hr.tcs.local. The connector resolves app FQDNs itself — if it can't, the session has nowhere to go.

◆ Likely cause(s)

The connector's resolver (/etc/resolv.conf) points at a DNS server that doesn't know the internal zone, or the app's record doesn't exist / points to the wrong IP.

🔍 Diagnosis

Resolve the app's FQDN from the connector itself.

On the connector host

nslookup hr.tcs.local
dig +short hr.tcs.local

Expected output (broken)

** server can't find hr.tcs.local: NXDOMAIN   # nslookup's message; dig +short prints nothing
# vs healthy:  hr.tcs.local  ->  10.20.9.40
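
The resolution check above has three outcomes worth telling apart: the right IP, a wrong IP, or no answer at all. A minimal sketch that classifies `dig +short` output against the IP you expect; `check_resolution` and its three labels are illustrative names, not part of any Zscaler tool.

```shell
# Classify `dig +short <fqdn>` output (stdin) against an expected IP.
# $1 = expected IP. Prints one of: ok, wrong-ip, no-answer.
check_resolution() {
  awk -v want="$1" '
    $0 == want { found = 1 }   # an answer line equals the expected IP
    NF         { any = 1 }     # at least one non-empty answer line
    END {
      if (found)    print "ok"
      else if (any) print "wrong-ip"
      else          print "no-answer"
    }'
}

# Usage on the connector host:
#   dig +short hr.tcs.local | check_resolution 10.20.9.40
```

"no-answer" maps to the resolver not knowing the zone or the record not existing; "wrong-ip" maps to a stale record or the wrong split-horizon view.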

🛠 Fix

Point the connector at an internal DNS server that can resolve the app zone (or add the missing record). For split-horizon apps, ensure it uses the internal view.

✓ Verify

dig +short hr.tcs.local returns the correct internal IP (10.20.9.40) from the connector; users reach the app.

SCN-14 App segment misconfig: wrong domain, port, or no SNAT

⚠ Problem / Symptom

Connector healthy, DNS resolves, but a specific app times out — or works for some users and not others. The app may also see the user's real IP and reply asymmetrically (double-NAT / source-IP issue).

◆ Likely cause(s)

Application Segment domain/port set doesn't actually match what the app uses (e.g. app on 8443, segment says 443)
The segment's domain isn't mapped to the right connector group / server group
Source-IP anchoring (SNAT) disabled, so the app replies directly and the return path is asymmetric behind a NAT
App expects a specific cert/SNI the segment doesn't pass cleanly

🔍 Diagnosis

In the portal, open Application Segments for the app and confirm domain + TCP/UDP port range + connector/server group mapping. From the connector, test the actual app socket.

Test the real app port from the connector

nc -zv hr.tcs.local 8443
sudo journalctl -u zpa-connector | grep -i -E "segment|reachable|reset"

Expected output (port mismatch)

Connection to hr.tcs.local 8443 port [tcp] succeeded!
# but the App Segment only lists 443 -> users never hit 8443
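
The mismatch above is a simple membership test: is the port the app listens on inside the ranges the segment declares? A sketch of that check follows. The comma-separated range string (for example `443,8000-8100`) is a hypothetical notation you would type in from the portal by hand; it is not a portal export format.

```shell
# Exit 0 when a port falls inside a comma-separated list of ports/ranges.
# $1 = port, $2 = ranges, e.g. "443" or "443,8000-8100" (hypothetical notation)
port_in_ranges() {
  awk -v p="$1" -v r="$2" 'BEGIN {
    n = split(r, parts, ",")
    for (i = 1; i <= n; i++) {
      m = split(parts[i], b, "-")
      lo = b[1] + 0
      hi = (m > 1 ? b[2] : b[1]) + 0   # a single port is a range of one
      if (p + 0 >= lo && p + 0 <= hi) exit 0
    }
    exit 1
  }'
}

# Usage: the app answers on 8443, the segment lists only 443.
#   port_in_ranges 8443 "443" || echo "8443 is not covered by the segment"
```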

🛠 Fix

Correct the segment's domain + port range to match the app, map it to the connector group covering that subnet, and enable source IP anchoring if the app sits behind a NAT so return traffic stays symmetric.

✓ Verify

nc -zv on the exact app port succeeds from the connector, and the user's access log shows the session brokered and allowed end-to-end.

SCN-15 MTU / fragmentation hangs + split-DNS overlap with ZIA

⚠ Problem / Symptom

App connects but large transfers or file shares hang/stall; small requests work. Or a domain that should go via ZPA is being sent to ZIA (or the public internet) because of overlapping DNS/forwarding.

◆ Likely cause(s)

MTU mismatch on the connector's path (often a tunnel/overlay) causing large packets to be dropped silently when DF is set
Split-DNS overlap: the same FQDN is claimed by a ZPA app segment and a ZIA forwarding/PAC rule — traffic takes the wrong lane

🔍 Diagnosis

Find the MTU ceiling

ping -M do -s 1472 hr.tcs.local
ip link show eth0 | grep -i mtu

Expected output (MTU too high)

ping: local error: message too long, mtu=1400
# 1472+28 = 1500 fails; path MTU is actually 1400
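
The arithmetic behind those numbers: an ICMP echo adds 28 bytes of headers (20 IPv4 + 8 ICMP), so the largest unfragmented ping payload is the MTU minus 28. The sketch below shows that calculation plus a binary search for the largest payload that passes. The function names are illustrative, and the probe is passed in as a command so you can decide exactly how to ping.

```shell
# Largest ICMP payload that fits a given MTU (IPv4: 20 IP + 8 ICMP bytes).
ping_payload_for_mtu() {
  echo $(( $1 - 28 ))
}

# Binary-search the largest payload in [lo, hi] for which a probe succeeds.
# $1 = lo, $2 = hi, remaining args = probe command; the payload size is
# appended as its last argument. Prints an empty line if nothing passes.
max_ok_payload() (
  lo=$1; hi=$2; shift 2
  best=
  while [ "$lo" -le "$hi" ]; do
    mid=$(( (lo + hi) / 2 ))
    if "$@" "$mid" >/dev/null 2>&1; then
      best=$mid; lo=$(( mid + 1 ))
    else
      hi=$(( mid - 1 ))
    fi
  done
  echo "$best"
)

# Usage on the connector host (probe is a hypothetical wrapper you define):
#   probe() { ping -M do -c 1 -W 1 -s "$1" hr.tcs.local; }
#   max_ok_payload 1200 1472 probe    # add 28 to the result to get the path MTU
```

The search assumes the path behaves monotonically (every size below the ceiling passes), which holds for a plain MTU limit but not for lossy links, so repeat the run if results look inconsistent.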

🛠 Fix

Lower the connector NIC MTU to the path value (e.g. 1400) or enable MSS clamping on the gateway. For split-DNS overlap, make the FQDN owned by exactly one lane — keep the internal app in the ZPA app segment and remove it from ZIA forwarding/PAC.

✓ Verify

ping -M do -s 1372 succeeds at the corrected size; large file transfers complete; the FQDN consistently resolves and routes via ZPA.

SCN-16 "No healthy App Connector available for this app"

⚠ Problem / Symptom

Users get an explicit "no healthy connector" error for an app, even though a connector exists in the tenant. The app segment has no healthy connector that can actually reach it.

◆ Likely cause(s)

The only connector for that segment is Disconnected or unhealthy (single point of failure — no connector group with a healthy peer)
The app segment / server group isn't mapped to any connector group that can route to the app's subnet
All connectors in the group are over capacity and marked unhealthy

🔍 Diagnosis

In the portal, open the app's Server Group → Connector Group mapping and confirm at least one member is Connected and reaches the app subnet.

From a connector in the group

nc -zv hr.tcs.local 8443
sudo journalctl -u zpa-connector | grep -i "no healthy"

Expected output

zpa-connector[2400]: app request: no healthy connector for segment "HR-Portal"
# group has 1 member and it's Disconnected (or can't route to 10.20.9.0/24)

🛠 Fix

Bring the connector back to a healthy state (it's a left-branch problem: work SCN-01..12), and deploy connectors in pairs per location so one failure never means "no healthy connector". Confirm the server/connector group mapping covers the app's subnet.

✓ Verify

The app segment shows ≥ 2 healthy connectors; the error stops; killing one connector no longer takes the app offline.
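
If you keep your own inventory of connector states, the "at least two healthy" rule is easy to assert. This sketch assumes a hypothetical hand-maintained list with one `name status` pair per line; it does not query the ZPA portal or any Zscaler API.

```shell
# Count connectors whose status is "Connected".
# stdin: hypothetical inventory, one "name status" pair per line.
healthy_count() {
  awk '$2 == "Connected" { n++ } END { print n + 0 }'
}

# Exit 0 when at least two connectors are Connected (an HA pair is intact).
has_ha() {
  [ "$(healthy_count)" -ge 2 ]
}

# Usage:
#   printf 'dc1-a Connected\ndc1-b Disconnected\n' | has_ha || echo "no HA: one failure away from an outage"
```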

👉 So far: all 16 scenarios sort into 4 buckets — enrollment, tunnel, health, last-mile. The aha: "Disconnected" is a cloud-side problem; "app not reachable" with a green connector is a config-side problem. Name the side first.

Predict: a single App Connector serves the HR app. It dies during a Sunday patch. Monday morning, every HR user sees "no healthy connector". The connector itself is the symptom — but what's the real design failure, and what stops this class of outage permanently?

The design failure is a single connector — no HA. One connector is a single point of failure for every app it serves. The permanent fix is to deploy connectors in pairs (a connector group of ≥2) per location. The cloud load-balances across healthy members, so losing one never means "no healthy connector".

Quick check · Q4 of 10

Connector is Connected and healthy, DNS resolves the app, but users still can't reach it. From the connector, nc -zv hr.tcs.local 8443 succeeds — but the app's Application Segment lists only port 443. What's the fix?

a) Re-enroll the connector
b) Restart the zpa-connector service
c) Add 8443 to the Application Segment's port range. The connector can reach the app, but ZPA only brokers the ports the segment defines.
d) Open inbound 8443 on the firewall

Correct: c. The connector reaching 8443 proves the last mile works. ZPA only brokers traffic for ports/domains defined in the App Segment. A port mismatch (app on 8443, segment says 443) means users never get brokered to the right socket. Fix the segment — the connector is innocent.


✍️ Explain it back (generation effect)

In two lines: why is "Disconnected" almost never a network-down problem? Type your answer first — then reveal the expert version and compare.

Expert answer: "Disconnected" means the connector lost its trusted control connection to the cloud — usually an identity or trust failure (clock skew breaking the cert, a changed fingerprint, SSL inspection on the pinned TLS, or blocked 443), not the underlying network being down. The host can ping the internet fine and still be Disconnected. So you read the connector log, not the routing table.

🎁 Teach a friend

Tap to generate a one-liner you can paste to a teammate who's stuck on a Disconnected connector.

"Quick tip: before you blame the network on a Disconnected ZPA App Connector, run journalctl -u zpa-connector + chronyc tracking. 90% of the time it's clock skew, a blocked outbound 443, SSL inspection breaking the pinned cert, or a changed fingerprint — not the link. 'Disconnected' = cloud-side; 'app not reachable' with a green connector = config-side. — learned this on ai.techclick.in"


📖 Glossary

App Connector: A lightweight Linux daemon (zpa-connector) deployed next to your private apps that dials outbound to the Zscaler cloud and brokers user→app sessions.
ZPA: Zscaler Private Access — connects users to private apps over the Zscaler cloud, no network-level VPN.
ZCC (Client Connector): The agent on the user's device that forwards their traffic into the Zscaler cloud.
Broker / Service Edge (ZEN): The Zscaler cloud node the App Connector dials out to on 443; it stitches the user side to the app side.
Provisioning key: The one-time enrollment token at /opt/zscaler/var/provision_key that authorises a connector to join the tenant.
Fingerprint (hardware ID): The connector's identity, computed from MAC + VM + disk info. Changing the hardware breaks decryption of instance_id.crypt.
Application Segment: The portal object defining which domains/ports an app uses; ZPA only brokers traffic the segment declares.
Connector group: A set of App Connectors the cloud load-balances across — deploy ≥2 so one failure isn't "no healthy connector".
Certificate pinning: ZPA expects a specific cert; any re-signed cert (from SSL inspection) is rejected — which is why inspection must bypass ZPA traffic.

📚 Sources

Zscaler Help — Troubleshooting App Connectors & ZPA App Connector Troubleshooting Runbook. help.zscaler.com/zpa
Zscaler Help — About / Accessing App Connector Status Diagnostics & Managing Deployed App Connectors. help.zscaler.com/zpa
Zscaler Help — App Connector Deployment Prerequisites & About App Connector Provisioning Keys. help.zscaler.com/zpa
Zscaler Help Centre — ZPA App Connector: Cannot decrypt data from instance_id.crypt. zscaler.az-ap.com/article/15
Zscaler Community (Zenith) — App Connector disconnected after a failed SARGE update & ZPA connector not starting. community.zscaler.com
Zscaler Config — Zscaler Private Access Firewall Allowlist (broker destinations / ips.zscaler.net/zpa). config.zscaler.com/private.zscaler.com/zpa
N. Catania — Deploying ZPA ZEN Connectors (CLI, file paths, NTP/DNS, proxy). nathancatania.com
Zscaler Academy — ZDTA Certification blueprint (App Connectors, provisioning keys, deploying connectors). zscaler.com/zscaler-cyber-academy

What's next?

You can find a broken connector. Next: build the App Segments and access policies on top of it — domain sets, server groups, and the policy order that decides who reaches which private app.


— Techclick Team

Troubleshooting the ZPA App Connector — Every Failure, Diagnosed
