Why this lesson matters: connectors are where ZPA becomes real
Lesson 9 left you with a clean architecture: four components, double inside-out tunnel, dark apps, no inbound ports. That diagram is honest, but it hides the fact that one of those four components, the App Connector, is literally a Linux VM you have to build, harden, register, monitor, patch, scale, and pair into HA groups, in every place an internal app lives. The architecture diagram does not run itself.
This is the lesson where ZPA stops being slideware and becomes a deployment plan. The choices you make here (where you place connectors, how many you run per failure domain, what size you give them, which group they join, what your egress firewall allows them to talk to) directly determine three things students will be measured on in interviews and in their first 90 days on the job:
- User experience. A user in Mumbai brokered to a connector in Virginia adds 220 ms RTT to every Jira page load. Wrong region = bad-user-experience ticket, every day, forever.
- HA posture. One connector per region is a single point of failure. The day it patches, reboots, or its hypervisor host has a bad day, every user in that region loses every app served by it. CISOs and ops leads will both hate you for that.
- Cost. Over-provisioning eight 8-vCPU connectors per region "to be safe" turns a small ZPA deployment into a six-figure cloud bill. Right-sizing matters.
The cousins, Branch Connector and Cloud Connector, solve adjacent problems (office user egress to ZIA, and cloud-workload egress to ZIA respectively). Most people new to ZPA confuse them. We will fix that explicitly, because in production you will be asked to recommend one over the other on day one.
The three connector flavours: when to use which
Zscaler ships three connector products. They look similar (lightweight Linux VM, outbound-only, registers to Zscaler Cloud), but they serve completely different traffic patterns. Get this table cold before week one; recruiters love asking about it.
| Connector | What it brokers | Where it sits | What it replaces |
|---|---|---|---|
| App Connector | Inbound user requests to your private apps (Jira, SAP, RDP host, internal API). Pure ZPA play. | Next to the app: in the DC subnet, the AWS VPC, the Azure VNet, the colo. | VPN concentrator + reverse proxy + WAF + ADC for internal apps. |
| Branch Connector | Outbound user traffic from a small office to ZIA cloud for inspection. Pure ZIA forwarding. | At the office edge: on a small VM, an x86 appliance, or the office router itself. | SD-WAN edge or branch firewall doing GRE/IPSec/PAC to ZIA. Also replaces "tunnel of last resort" config. |
| Cloud Connector | Outbound traffic from cloud workloads (EC2, GKE node, Azure VM) to ZIA for inspection, without installing Z-App on every instance. | Inside the cloud VPC/VNet, anchored to a NAT/route table that catches workload egress. | VPC-level NAT gateway + manual proxy config on every workload. Also Z-App on container hosts (which is awful). |
Plainer rule of thumb:
- User trying to reach a private app? App Connector.
- Office full of users trying to reach the internet (and you want ZIA in front)? Branch Connector.
- Cloud workloads trying to reach the internet (and you want ZIA in front, without per-VM agents)? Cloud Connector.
One more nuance students miss: Branch Connector does not serve ZPA. If you have a branch office that needs both internet inspection and access to private apps, you deploy Branch Connector at the edge (for ZIA forwarding) and install Z-App on the user devices (for ZPA). Branch Connector and App Connector are not substitutes for each other.
The whole picture on one page
Three connector flavours, one cloud. App Connector serves your apps (right→cloud), Branch Connector serves your office users (left→cloud via its own edge VM), Cloud Connector serves your cloud workloads. Every arrow is outbound; no one in this picture accepts new inbound TCP from outside.
App Connector: VMware / AWS / Azure deployment walkthrough
You will deploy more App Connectors than the other two flavours combined, because every place a private app lives needs one. Three deployment surfaces dominate: VMware in the DC, AWS in a VPC, Azure in a VNet. The prereqs are nearly identical; the install mechanics differ slightly.
Prereqs that bite in pilot
- Supported base OS: RHEL 8 / 9, CentOS Stream, Oracle Linux 8 / 9, Ubuntu 20.04 / 22.04 LTS. Amazon Linux 2 is supported on AWS specifically. Don't pick something exotic; the Zscaler-shipped package only knows about these.
- VM spec: 2 vCPU / 4 GB RAM / 16 GB disk is the official minimum and a fair starting point for ≤1 Gbps and a few hundred concurrent TCP sessions. Voice or heavy RDP loads should jump straight to 4 vCPU / 8 GB.
- Outbound network: TCP/443 and TCP/9000 to Zscaler's Zen IP ranges. The TCP/9000 control channel is the one people forget; without it the connector registers but cannot accept brokered streams. Pull the JSON CIDR feed from config.zscaler.com and allow-list it on your egress firewall before you boot the VM.
- NTP: the enrollment cert is time-sensitive. If the VM's clock drifts by more than a few minutes, registration fails silently with a misleading "auth error" in the log. Run chrony or ntpd on the VM and verify that chronyc tracking reports the clock in sync before enrolling.
- SELinux / AppArmor: the connector ships with a tested SELinux profile. If your hardening playbook strips it or sets enforcing without the Zscaler context, outbound TCP/9000 can be silently blocked. Run setenforce 0 during the very first boot to confirm it's not the culprit, then re-enable enforcing with the shipped policy.
- Provisioning key: generated in Administration → App Connectors → Provisioning Keys → Add. The key is one-time-use, bound to a Connector Group, and expires (default 14 days). Don't generate it a month before deployment; it will be dead when you finally need it.
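The NTP prerequisite can be checked mechanically before enrolling. The sketch below is one possible way to do it, assuming chrony is the time daemon on the VM. The function name clock_in_sync and the 60-second default threshold are illustrative choices, not Zscaler guidance.

```shell
# clock_in_sync [MAX_OFFSET_SECONDS]
# Reads `chronyc tracking` output on stdin. Succeeds only if chrony reports
# a normal leap status and the system clock offset is below the threshold
# (default 60 seconds, an assumed value chosen for illustration).
clock_in_sync() (
    max_offset="${1:-60}"
    awk -v max="$max_offset" '
        /^Leap status/ { leap_ok = ($NF == "Normal") }
        /^System time/ { offset = $4 + 0; seen = 1 }
        END { exit !(leap_ok && seen && offset < max) }
    '
)
```

Typical use on the connector VM would be: chronyc tracking | clock_in_sync && echo "clock OK, safe to enroll".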
Install script: AWS (Amazon Linux 2)
This is the path you'll use most often in modern deployments. Spin up an EC2 instance (t3.medium or larger, in a private subnet, with a NAT Gateway for outbound), then bake the connector in via cloud-init userdata:
    # 1. Download the App Connector RPM from the ZPA Admin Portal:
    #    Administration → App Connectors → Download → Linux (RHEL/CentOS)
    #    OR use the published cloud image (AMI / Azure VHD / GCP image)

    # 2. Install the RPM on a 2vCPU/4GB minimum RHEL 8/9 or CentOS 8 host
    sudo yum localinstall -y zpa-connector-VERSION.x86_64.rpm

    # 3. Drop the provision key (copied from the same Admin Portal page)
    sudo bash -c 'echo "YOUR_PROVISION_KEY" > /opt/zscaler/var/provision_key'
    sudo chown zscaler:zscaler /opt/zscaler/var/provision_key
    sudo chmod 600 /opt/zscaler/var/provision_key

    # 4. Enable + start the service. Registration happens automatically on first start.
    sudo systemctl enable --now zpa-connector

    # 5. Verify
    sudo journalctl -u zpa-connector -f
    # Look for "Successfully enrolled" and "Connected to broker"
Note: no zpa-connector enroll CLI verb exists. Registration is driven by the provision_key file at first systemd start; there is no separate enroll sub-command. If you see zpa-connector enroll in a guide, the guide is wrong. The presence of /opt/zscaler/var/provision_key when the service starts is what triggers the one-shot enrollment exchange; on success the connector swaps the key for a per-device cert and you'll never need the key again.
Within roughly 30 seconds the connector appears in the ZPA admin portal as Healthy, attached to whichever Connector Group the provision key was bound to. Repeat the whole exercise on a second EC2 in a different availability zone (same Connector Group) to satisfy N+1 HA from day one.
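For unattended builds it can help to turn the "watch the journal" step into something a script can branch on. The helper below is a minimal sketch: it only looks for the two log phrases quoted in the install walkthrough ("Successfully enrolled" and "Connected to broker"). The function name is hypothetical, and the exact log wording may differ between connector versions.

```shell
# enrollment_state
# Reads connector log text on stdin and prints one of:
#   connected  - the broker connection line was seen
#   enrolled   - enrollment succeeded but no broker connection yet
#   pending    - neither marker was found
enrollment_state() (
    log="$(cat)"
    case "$log" in
        *"Connected to broker"*)   echo connected ;;
        *"Successfully enrolled"*) echo enrolled ;;
        *)                         echo pending ;;
    esac
)
```

A possible invocation: sudo journalctl -u zpa-connector --no-pager | enrollment_state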
VMware (DC) and Azure (VNet): the differences
- VMware: Zscaler ships a signed OVA from the admin portal. Deploy via vCenter, set static IP via the VMware console (or DHCP if your DC tolerates it), paste the provisioning key into the first-boot wizard. Otherwise the post-install steps are identical to the YUM flow.
- Azure: Use the official Azure Marketplace image (search "Zscaler App Connector"). Deploy into a VNet subnet that has outbound NAT (Azure Firewall, NAT Gateway, or a UDR to a network virtual appliance). Provisioning key goes in via the cloud-init custom data field. Make sure the NSG does not have an inbound rule on TCP/9000: that port is outbound-only from the connector and there is no inbound listener.
App Connector auto-update
App Connectors auto-update from the cloud by default: Zscaler pushes new versions on its own cadence, with no admin action. For change-controlled environments (banking, healthcare), pin a version under Administration → App Connectors → Group → Version Pinning. Without pinning, a connector might silently upgrade on a Tuesday afternoon, so keep this in your change calendar.
Connector log paths
Log paths: /var/log/zpa/connector.log (active), journalctl -u zpa-connector -f (systemd stream). Set LOG_LEVEL=DEBUG in /etc/zpa-connector/config only for active troubleshooting; debug logs are massive.
Capacity sizing: concurrent users per connector by app type
The single biggest day-one sizing mistake is treating "concurrent users" as a single number. ZPA throughput depends massively on the protocol mix. A connector that comfortably brokers 1,500 Jira (HTTP) users will buckle at 200 RDP users. Use this as your starting calculator:
| App type | Per-user bandwidth (avg) | ~Concurrent users on 2 vCPU / 4 GB connector | ~Concurrent users on 4 vCPU / 8 GB connector |
|---|---|---|---|
| HTTP/HTTPS app (Jira, Confluence, internal portal) | 50-150 kbps idle, bursts to 2 Mbps | ~250-500 (rule of thumb: ~250 sessions/vCPU at minimum spec; scale to 4 vCPU / 8 GB or run 2-3 connectors in the group for 1,000+) | ~1,000-1,500 |
| SSH / Git over SSH (developer) | 30-80 kbps | 800-1,200 | 2,000-2,800 |
| RDP (Windows remote desktop) | 200 kbps to 2 Mbps depending on resolution | 150-250 | 400-600 |
| VDI / Citrix HDX | 200 kbps to 4 Mbps | 100-200 | 300-450 |
| SMB / CIFS (file share) | burst 50 Mbps during transfer | 40-80 active transfers | 120-200 active transfers |
| Voice / SIP over UDP | ~100 kbps steady, jitter-sensitive | 200-300 | 500-700 |
Two rules to take into every sizing conversation:
- Always N+1 per group. If your math says one connector handles the load, deploy two. If two handle it, deploy three. The +1 covers patching, hypervisor failure, and unexpected load spikes without a user-visible outage. A "perfectly sized" single connector is a deployment that will fail its first patching window.
- Scale horizontally, never vertically. The connector is single-process and gets diminishing returns past 8 vCPU. Five 4-vCPU connectors will out-throughput two 16-vCPU connectors and survive an AZ failure that the bigger pair cannot.
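The N+1 rule is easy to turn into arithmetic. The function below is a rough sketch, not an official sizing tool: it takes a peak user count and a per-connector capacity that you pick from the table for your protocol mix, applies a headroom percentage (defaulting to the "peak + 50%" figure from the quick reference), rounds up, and adds one spare. The function name and argument order are illustrative.

```shell
# connectors_needed PEAK_USERS PER_CONNECTOR_CAPACITY [HEADROOM_PCT]
# Prints the number of connectors to deploy in one group: enough to carry
# the peak plus headroom (rounded up), plus one spare for N+1.
connectors_needed() (
    peak="$1"
    capacity="$2"
    headroom="${3:-50}"
    load=$(( peak * (100 + headroom) ))     # peak scaled by headroom, x100
    denom=$(( capacity * 100 ))             # capacity in the same x100 units
    base=$(( (load + denom - 1) / denom ))  # integer ceiling division
    echo $(( base + 1 ))                    # the +1 spare
)
```

For example, connectors_needed 200 150 (200 RDP users against the conservative 150-per-connector figure, with the default headroom) prints 3. Passing 0 as the third argument drops the headroom and leaves only the N+1 spare.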
HA: Connector Groups and how ZPA Cloud picks a connector
A Connector Group is the unit of HA in ZPA. Group connectors by failure domain (usually one group per AWS AZ, one per Azure availability zone, one per DC row, one per branch DC building), not by application. An app segment gets attached to one or more Connector Groups; when a user request arrives, ZPA Cloud picks any healthy connector from any attached group, weighted by latency and load.
Geometry that works in production:
- Per region (e.g. Mumbai DC): two App Connectors, in two separate VMware clusters or two AZs, both in the same Connector Group "mumbai-dc". App segments served only by Mumbai apps are attached to this group only. Users in India naturally route to Mumbai connectors (lowest RTT).
- Multi-region apps: if the same app exists in two regions (active-active), create the same segment in both region groups. ZPA Cloud will pick the closer one for each user.
- Cross-region failover: Connector Groups can have a priority order. Set Mumbai as primary, Singapore as secondary. If both Mumbai connectors die, users gracefully shift to Singapore at the cost of extra latency for as long as the failover lasts, which is far better than a hard outage.
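The priority-order behaviour can be pictured with a toy model. The sketch below is only an illustration of "first group in priority order that still has a healthy connector"; it is not how ZPA Cloud is implemented, and it ignores the latency and load weighting described earlier. The function name, argument format, and group names are hypothetical.

```shell
# pick_group GROUP:HEALTHY_COUNT ...
# Arguments are given in priority order. Prints the first group with at
# least one healthy connector; returns non-zero if every group is empty.
pick_group() (
    for entry in "$@"; do
        name="${entry%%:*}"      # text before the first colon
        healthy="${entry##*:}"   # text after the last colon
        if [ "$healthy" -gt 0 ]; then
            echo "$name"
            exit 0
        fi
    done
    exit 1
)
```

With both Mumbai connectors down, pick_group mumbai-dc:0 singapore-dc:2 prints singapore-dc, which is the failover case the list describes.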
How a connector actually boots and joins the pool
Six steps from cold boot to "Healthy in pool". Steps 3 and 4 are where 90% of failed deployments hang: egress firewall blocking TCP/9000, an SSL-inspecting proxy intercepting the Zen TLS, or NTP drift invalidating the enrollment exchange.
Branch Connector: when to put it at the office edge
Branch Connector is built for one specific shape of customer pain: a small office, no full-featured firewall, users on guest-grade Wi-Fi, and the business wants their internet traffic inspected by ZIA without buying SD-WAN. Branch Connector ships as a hardware appliance OR as a VM (KVM / Hyper-V / ESXi). The smallest hardware SKU competes with SD-WAN CPE pricing; it is not "NUC in a closet" cheap. Plan for ~$1.5k-$5k per branch for hardware plus tenant license. All office user traffic is forwarded, via a lightweight TLS tunnel, to the nearest ZIA Public Service Edge, where full URL filtering, SSL inspection, malware scan, DLP, and sandbox kick in just as if every user were running Z-App.
Use Branch Connector when:
- The office has no firewall capable of GRE or IPSec tunnels to ZIA, i.e. nothing better than a consumer-grade router. Branch Connector replaces the missing firewall edge.
- You're doing a tactical / event deployment: pop-up event office, temporary site for a 3-month project, M&A acquisition where you don't yet own the network. Stand up a Branch Connector VM in an hour; instant ZIA coverage.
- You have a guest Wi-Fi only branch (kiosks, shop floors, lobbies) where you don't want to install Z-App on user devices (you don't manage them) but you do want internet hygiene at the SSID.
Do not use Branch Connector when:
- The branch already has a Palo Alto / Fortigate / Meraki / Cisco firewall capable of IPSec. Just configure a GRE or IPSec tunnel to ZIA; that means fewer moving parts.
- You need to give office users access to private apps. Branch Connector does not serve ZPA. For that, install Z-App on user devices, even at branches running Branch Connector.
Cloud Connector: when to anchor cloud-workload egress
Cloud workloads (EC2 instances, GKE pods, Azure VMs, container hosts) egress to the internet for package downloads, API calls, telemetry, and SaaS APIs. Without intervention this egress goes through your VPC's NAT Gateway directly to the internet, completely unfiltered. You don't see what's leaving, you don't apply DLP, you don't catch a compromised container talking to a C2 server.
The naive fix is "install Z-App on every workload". This is awful: containers are ephemeral, autoscaling groups bake new images, the agent install becomes a per-AMI baking concern, and a runtime crash takes a workload offline. Cloud Connector is the answer: a small VM (or a pair, for HA) that you deploy inside the VPC and point your workload subnet route tables at. All outbound traffic from those subnets is captured by Cloud Connector, tunnelled to ZIA Public Service Edges, inspected, and only then released to the internet. The workloads themselves are unmodified.
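On AWS, pointing a workload subnet at Cloud Connector comes down to changing that subnet's default route. As a rough sketch, the helper below prints (but deliberately does not run) the AWS CLI call for review. It assumes the route table already has a default route to replace and that the target is a single Cloud Connector network interface; the function name and the example IDs are hypothetical.

```shell
# default_route_cmd ROUTE_TABLE_ID ENI_ID
# Prints the AWS CLI command that repoints a route table's default route
# (0.0.0.0/0) at the given network interface. Nothing is executed.
default_route_cmd() (
    rtb="$1"
    eni="$2"
    printf 'aws ec2 replace-route --route-table-id %s --destination-cidr-block 0.0.0.0/0 --network-interface-id %s\n' "$rtb" "$eni"
)
```

For instance, default_route_cmd rtb-0abc eni-0def prints the command for a table rtb-0abc and interface eni-0def. If the table has no default route yet, aws ec2 create-route with the same options is the equivalent call.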
Use Cloud Connector when:
- You have ephemeral workloads (containers, autoscaling EC2, serverless behind a VPC) where per-instance agent management is not feasible.
- You want to anchor source IP for SaaS allow-listing. Many SaaS vendors only allow your tenant's API key from specific source IPs. Cloud Connector gives you a stable egress IP regardless of how many workloads are behind it.
- You need full URL filtering + DLP + sandbox on outbound HTTPS from cloud workloads, without inflating each container image.
- You have a dev VPC with 200 ephemeral workloads doing npm install from arbitrary registries; you want every POST they make inspected before it leaves.
Verification checklist
- ZPA Admin Portal: Administration → App Connectors → Connector Management. Every healthy connector shows a green dot, last-heartbeat under 30 s, version string visible. A connector stuck in Pending for >2 min means registration is failing.
- On the VM: systemctl status zpa-connector shows "active (running)". ss -tnp | grep -E ':443|:9000' shows two established (ESTAB) outbound sessions to Zen IPs. journalctl -u zpa-connector -n 200 shows "Connector registered" near the end.
- End-to-end: in ZPA Admin → Diagnostics → Trace User, run a simulated request for a test user against a segment served by this connector. The trace should show: User Z-App → Service Edge → ZPA Cloud → this connector → app reachable. Connector name visible at every hop.
- Boot test: from the connector VM itself, curl -sv https://<internal-app-fqdn> --resolve <internal-app-fqdn>:443:<ip> must succeed. If the connector can't reach the app over its own local network, no amount of ZPA config will help.
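The on-VM socket check can be scripted rather than eyeballed. The helper below is a small sketch that counts established sessions to a given remote port from ss -tn output; it relies on ss printing the state as ESTAB and the peer address in the fifth column. The function name is illustrative.

```shell
# established_to_port PORT
# Reads `ss -tn` output on stdin and prints how many established TCP
# sessions have PORT as the remote (peer) port.
established_to_port() (
    port="$1"
    awk -v p=":$port" '
        # Column 1 is the state, column 5 is Peer Address:Port.
        $1 == "ESTAB" && substr($5, length($5) - length(p) + 1) == p { n++ }
        END { print n + 0 }
    '
)
```

On a connector, ss -tn | established_to_port 9000 and ss -tn | established_to_port 443 should each print at least 1. A 0 for port 9000 points at the egress firewall or SELinux causes discussed in this lesson.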
Common pitfalls
- One App Connector per region: the SPOF that bites the first patch night. The team ships pilot with a single connector "to save cost". Two weeks later the connector reboots for a kernel update at 02:00. Every user in that region loses every internal app for the full reboot window. Always N+1, always from day one, even in pilot.
- SELinux blocking outbound on TCP/9000. Connector boots, registers over TCP/443, then fails silently on the control channel. ZPA admin shows it as Pending or Disconnected and the logs say "auth ok, control timeout", which is easy to misdiagnose as a Zscaler-side issue. Test with setenforce 0; if it then works, apply the Zscaler-shipped SELinux policy and re-enable enforcing.
- Branch Connector deployed with the assumption it handles private apps too. "We have Branch Connector at every office, why do we still need to give people Z-App?" Because Branch Connector is ZIA-only. Private apps need App Connector + Z-App on user devices. Same Zscaler tenant, different products.
- Cloud Connector in the same VPC as the app but in a different subnet, without a route table entry. Workloads in subnet A do not magically know to send traffic via Cloud Connector in subnet B. You must update subnet A's route table so the default route (0.0.0.0/0) points at the Cloud Connector ENI (or the load-balanced pair). Easy to miss when handing off to a separate cloud team.
- An outbound corporate proxy intercepting the connector's Zen-bound TLS with SSL inspection. Many enterprise egress firewalls re-sign outbound HTTPS with the corporate CA. ZPA Cloud rejects the re-signed cert as untrusted and the connector cannot establish its session. Allow-list the Zen CIDR ranges and explicitly bypass SSL inspection for connector traffic.
- OS auto-patcher upgrades a system library; the next connector update breaks. Unattended-upgrades on Ubuntu, or DNF automatic on RHEL, can bump openssl or libcurl to a version the current connector binary wasn't tested against. Run connector VMs with auto-patch disabled or scoped to security errata only, and patch them yourself during a controlled window, one of the HA pair at a time.
- Under-spec'd 2 vCPU / 2 GB connector for 500 concurrent RDP users. Math says "two connectors should cover it"; reality is RDP eats vCPU for breakfast. Sessions degrade (pixelation, click lag, periodic stalls) before users actually drop. Always re-size by protocol mix per the table above, and stress-test with the simulator before exposing to production users.
Best practices
- Connector Groups by failure domain, not by application. One group per AZ / DC row / rack. App segments come and go; the failure domain is stable. Multiple apps in the same domain naturally share the same HA pair, which means fewer connectors to run and an identical HA posture.
- Pre-stage a "canary" segment for every Connector Group. A throwaway HTTP endpoint like canary.acme.com served by the group, with a single test user entitled. Hit it from a script every minute; if it breaks, you know the group is unhealthy before real users open tickets. Wire the canary into your existing monitoring.
- Patch connector VMs like firewall pairs, never like web servers. Pull one connector out of the group (or just stop the service), wait for ZPA Cloud to confirm sessions have drained to the partner, patch, restart, wait for "Healthy", then do the other one. Never patch both at the same time. Treat the pair like an active-active firewall HA cluster.
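The canary probe can be as small as one curl call plus a rule for what counts as healthy. The following is a minimal sketch under a few assumptions: the probe runs from a machine whose test user reaches the canary through ZPA, any 2xx or 3xx response counts as healthy, and a 5-second timeout is acceptable. Both function names are hypothetical.

```shell
# classify_canary HTTP_CODE
# Maps a probe's HTTP status code to "healthy" or "unhealthy".
# curl reports 000 when no HTTP response was received at all.
classify_canary() (
    case "$1" in
        2??|3??) echo healthy ;;
        *)       echo unhealthy ;;
    esac
)

# probe_canary URL
# Fetches the canary URL (body discarded) and prints its classification.
probe_canary() (
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1")" || true
    classify_canary "$code"
)
```

Run from cron or your monitoring agent once a minute, for example probe_canary https://canary.acme.com/, and alert when the output is unhealthy.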
Real-world scenario: migrating off F5 APM across four DCs
Customer: a mid-size manufacturer running F5 APM as their VPN concentrator for engineering, sales, and contractor remote access. Four data centres (Mumbai, Singapore, Frankfurt, Virginia), each running its own Jira, Confluence, Bitbucket, SAP module, and internal toolchain. ~12,000 employees, ~3,000 concurrent VPN users on a typical weekday. F5 APM is two years from end-of-support; the customer wants ZPA before it dies.
Your job: design and deliver the connector layout, then a phased cutover that does not cause a production outage.
Step 1: Connector layout (the design that survives the runbook)
- Per DC: two App Connectors, deployed in two separate VMware clusters (different racks, different power feeds, different ToR switches). Both connectors join the same Connector Group named after the DC, e.g. cg-mumbai, cg-singapore, cg-frankfurt, cg-virginia. That gives you N+1 per region from day one.
- Sizing: Jira / Confluence / Bitbucket dominate the protocol mix (HTTP), with a smaller RDP workload for finance teams. From the sizing table, 2 vCPU / 4 GB per connector handles ~250-500 concurrent HTTP sessions; with ~750 expected per DC at peak, scale to 4 vCPU / 8 GB and deploy at least two per DC, or run 3 × 2-vCPU connectors in the group. The horizontal route (3 × small) gives better blast-radius isolation than one big VM, and survives a single-node patch reboot without user-visible impact.
- Cross-region failover: set group priorities so each region's primary is itself, with the geographically nearest region as secondary: Mumbai primary / Singapore secondary, Frankfurt primary / Virginia secondary. If both Mumbai connectors die, users transparently shift to Singapore at the cost of ~60 ms RTT, which is far better than a hard outage.
- App segments: one segment per app per region (not per /16): jira-mumbai.acme.com, jira-singapore.acme.com, etc. Each segment attaches to its region's Connector Group only (plus the secondary as failover).
Step 2: Cutover plan that avoids drama
- Deploy + register connectors in all four DCs. Verify all eight show Healthy. Run the canary segment test from a script for 24 hours; expect zero failures.
- Pilot group of 50 users across all four regions, mixed roles. Push the Z-App with ZPA profile enabled to these users only; leave F5 APM available as fallback. Run for 7 days. Watch the connector health graphs, the ZPA Trace User reports, and the user-feedback channel.
- Expand to 500 users per region, with a broader role mix, including a few notoriously fussy engineering tools. Another 7 days. Tune segment definitions for any app that surfaces as misconfigured.
- Cutover wave 1: engineering org (3,000 users). Flip Z-App ZPA profile via Intune for the whole engineering AD group on a Friday afternoon. Monitor through the weekend. F5 APM still up but no longer the default route.
- Wave 2 + 3: sales then back-office orgs over the next two weeks.
- Decommission F5 APM after 30 days of stable ZPA-only operation, with a 7-day "switchback window" where APM stays cold-spare in case of catastrophic ZPA regression.
What goes wrong if you skip the regional grouping
This is the war story junior engineers learn the hard way. Imagine you skip Step 1's regional grouping and instead create a single global Connector Group with all eight connectors. ZPA Cloud's broker is latency-aware but not infinitely smart; some Mumbai users will get brokered to Virginia connectors when Virginia is momentarily less loaded than Mumbai. The user's Jira-Mumbai request travels Mumbai user → Mumbai Service Edge → ZPA Cloud → Virginia App Connector → back across the Internet to Mumbai DC's Jira server → all the way back. Round-trip time goes from ~30 ms to ~430 ms. Every Jira page load adds half a second. Users complain. Tickets pile up. You eventually re-architect with regional groups, which is what you should have done on day one.
Lesson: Connector Groups are not just an HA boundary, they're a routing boundary. Use them to express "this connector should serve this region's users" as well as "this connector should serve this region's apps". Both meanings are correct simultaneously.
Quick reference: ZPA connector deployment
- App Connector: broker for inbound user requests to your private apps. Sits next to the app.
- Branch Connector: forwards office user traffic to ZIA. Sits at the office edge. ZIA-only, not ZPA.
- Cloud Connector: anchors cloud-workload egress to ZIA. Sits inside the VPC/VNet.
- Spec: 2 vCPU / 4 GB / 16 GB disk minimum. RHEL/Ubuntu/Amazon Linux. Outbound TCP/443 + TCP/9000 to Zen IP ranges.
- HA rule: ≥2 connectors per Connector Group per failure domain. Group by AZ / DC row, not by app.
- Sizing rule of thumb: 2 vCPU connector ≈ 250-500 concurrent HTTP sessions / 150-250 RDP / 200-300 voice. ~250 sessions per vCPU at minimum spec. For 1,000+ HTTP users, scale to 4 vCPU / 8 GB or run 2-3 connectors per group. Always size to peak + 50%.
- Install path: download RPM (or use cloud image) → yum localinstall zpa-connector-VERSION.rpm → drop key into /opt/zscaler/var/provision_key (chown zscaler:zscaler, chmod 600) → systemctl enable --now zpa-connector. Enrollment happens automatically on first start; no enroll sub-command exists.
- Log path: /var/log/zpa/connector.log for the connector itself, journalctl -u zpa-connector for systemd-side events.
- Stuck in Pending? Check (1) outbound TCP/9000 to Zen, (2) NTP drift, (3) SELinux blocking, (4) outbound proxy SSL-inspecting Zen traffic, (5) DNS resolving Zen FQDNs.
- Cutover discipline: deploy → canary → 50-user pilot → 500-user wave → org-level cutover → 30-day stability → decom legacy.
Deploy an App Connector from scratch:
- From ZPA Admin → Administration → App Connectors → Download. Get the latest RPM + provision key.
- On a 2vCPU/4GB RHEL VM: sudo yum localinstall -y zpa-connector-VERSION.rpm
- Write the provision key: echo "KEY" | sudo tee /opt/zscaler/var/provision_key, then chown zscaler:zscaler and chmod 600 on that file.
- Start: sudo systemctl enable --now zpa-connector
- Watch logs: sudo journalctl -u zpa-connector -f, and wait for "Successfully enrolled" + "Connected to broker".
- In ZPA Admin, confirm the connector appears in your App Connector Group with status Up.
What's next?
Module 11 wires policy on top of the connectors: App Segments, Segment Groups, Access Policies, Posture Profiles, and the timeout policies that decide when a session must reauthenticate.