The Prisma Access service infrastructure: three layers, one fabric
Before you click anything in SCM, fix the mental model of what is actually running in Palo Alto's cloud on your behalf. Prisma Access is delivered as three logical layers stitched together inside the service:
- The Portal layer. A globally anycast endpoint that every Mobile User agent reaches first. The Portal authenticates the user (via Cloud Identity Engine), evaluates HIP posture, returns the list of Gateways the user is allowed to use, and hands the client a signed configuration. There is exactly one Portal per tenant, replicated worldwide.
- The Gateway layer. Per-region cloud Gateways are where the user's IPSec / SSL-VPN tunnel actually terminates. Gateways run the App-ID + Threat Prevention engine and apply your Security policy. The Portal supplies the list of eligible Gateways, and the agent connects to the best one based on geographic proximity, latency probes and load.
- The Compute Location layer. The physical region where a Gateway runs (e.g. asia-south1 (Mumbai), us-east4 (Virginia)). Compute Locations are the units you reserve bandwidth against and allocate Service Connections to. The same compute location can serve Mobile Users and Remote Networks at the same time, sharing the regional bandwidth pool.
For Remote Networks, the same Gateway-and-Compute-Location concept applies, but the "user" is a branch CPE building an IPSec tunnel up to a region. For Service Connections, the same again, but now the on-prem firewall is the device terminating into the Compute Location to advertise private routes.
Mobile Users hit the Portal first, get a Gateway list, then build the tunnel directly to a regional Gateway running inside a Compute Location. Branches (Remote Networks) and on-prem firewalls (Service Connections) terminate directly at the regional Compute Location. Bandwidth is reserved per region, shared across all three onboarding modes that use it.
Compute location selection: how does Prisma pick one?
For Mobile Users, the Portal hands the agent a list of allowed Gateways. The agent then runs latency probes to each one and connects to the lowest-RTT Gateway that has spare capacity. If two Gateways have similar RTT, ties break on load. Re-evaluation happens on connect, on network change, and at a configurable interval (default ~120 s).
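The selection rule described above can be sketched in a few lines. This is a minimal illustration, not the GlobalProtect agent's actual algorithm: the `GatewayProbe` fields, the 5 ms tie window, and the `asia-south2` and `me-west1` labels are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayProbe:
    name: str           # gateway label (hypothetical values below)
    rtt_ms: float       # round-trip time measured by the agent
    load_pct: float     # reported utilisation, 0-100
    has_capacity: bool  # whether the gateway accepts new sessions


def pick_gateway(probes, rtt_tie_ms=5.0):
    """Return the lowest-RTT gateway with capacity; break near-ties on load."""
    candidates = [p for p in probes if p.has_capacity]
    if not candidates:
        return None
    best_rtt = min(p.rtt_ms for p in candidates)
    # Gateways within rtt_tie_ms of the best RTT count as "similar RTT".
    near_best = [p for p in candidates if p.rtt_ms - best_rtt <= rtt_tie_ms]
    return min(near_best, key=lambda p: (p.load_pct, p.rtt_ms))


probes = [
    GatewayProbe("asia-south1", rtt_ms=22.0, load_pct=80.0, has_capacity=True),
    GatewayProbe("asia-south2", rtt_ms=24.0, load_pct=35.0, has_capacity=True),
    GatewayProbe("asia-southeast1", rtt_ms=61.0, load_pct=10.0, has_capacity=True),
    GatewayProbe("me-west1", rtt_ms=15.0, load_pct=5.0, has_capacity=False),
]
chosen = pick_gateway(probes)
print(chosen.name)  # asia-south2: within 5 ms of the best RTT and less loaded
```

The gateway with the lowest raw RTT is skipped because it has no spare capacity, and the two near-equal candidates are separated by load rather than by a 2 ms RTT difference.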
For Remote Networks and Service Connections, you pick the compute location at onboarding time. Prisma will not move a branch tunnel between regions on its own; the operator chooses. The way to handle a regional outage is the same way you'd handle one with a pair of physical firewalls: two tunnels to two different compute locations, BGP advertising the same prefixes, ECMP load-balancing.
Bandwidth allocation: the most-missed slider
Every Prisma Access tenant has a total purchased bandwidth (e.g. 2 Gbps). That total has to be allocated to compute locations. An unallocated region cannot accept traffic. If you forget to allocate bandwidth to asia-south1 before onboarding a Mumbai branch, the IPSec tunnel will negotiate up to Phase 2 and then drop traffic with no obvious cause. Allocate before you onboard.
Over-allocate by ~20% in regions that host both Mobile Users and Remote Networks. The two share the same regional pool, so a Friday-afternoon WFH surge can starve a busy branch if you sized to the average instead of the peak.
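A pre-onboarding check along these lines can catch the unallocated-region case before a tunnel is built. This is a sketch under stated assumptions: the function names and the allocation dictionary are hypothetical and are not read from any SCM API; in practice you would fill them in from your own records.

```python
import math


def allocation_problems(purchased_mbps, allocated_mbps, onboarding_regions):
    """List allocation issues to fix before onboarding sites in the given regions."""
    problems = []
    total = sum(allocated_mbps.values())
    if total > purchased_mbps:
        problems.append(
            f"allocated {total} Mbps exceeds purchased {purchased_mbps} Mbps"
        )
    for region in onboarding_regions:
        if allocated_mbps.get(region, 0) <= 0:
            problems.append(f"{region}: no bandwidth allocated")
    return problems


def with_headroom(peak_mbps, headroom_pct=20):
    """Size a shared region to its peak plus a percentage of headroom."""
    return math.ceil(peak_mbps * (100 + headroom_pct) / 100)


allocated = {"us-east4": 1500, "asia-south1": 0}
print(allocation_problems(2000, allocated, ["asia-south1"]))
print(with_headroom(500))  # 600
```

Sizing from the observed peak rather than the average is the point of `with_headroom`; the 20% default mirrors the rule of thumb above.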
Mobile Users in production: beyond the demo
A demo Mobile User is one laptop with one rule. A production fleet of 4,000 endpoints needs deliberate choices on three fronts: authentication, posture, and split-tunnel.
Authentication via Cloud Identity Engine
Cloud Identity Engine (CIE) is the SAML / SCIM layer Prisma Access uses to bring user + group identity into policy. You point CIE at Entra ID, Okta, Ping or any SAML 2.0 IdP, configure the SAML assertion mapping (NameID = UPN, group claim = on), and Prisma now sees every authenticated session by username and group membership. Combine with MFA at the IdP (Entra Conditional Access, Okta Sign-On Policy) and there is no need for a separate Prisma-side OTP step.
HIP: Host Information Profile
HIP is the device-posture gate that runs after auth but before Security policy is applied. Common production checks:
- Disk encryption is on (BitLocker for Windows, FileVault for macOS).
- Endpoint AV (CrowdStrike, Defender, SentinelOne) is running and signatures are ≤ 7 days old.
- OS patch level is within N days of current.
- Required certificate is present in the user / machine store.
- Required process is running (a custom checker exe, EDR agent, etc.).
HIP objects compose into HIP profiles, which are then referenced in Security policy: "only users whose endpoint matches HIP-Profile-Compliant can reach Finance apps". A failed HIP check can drop the tunnel, restrict access to a quarantine subnet, or push a remediation message. Which one is your call.
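The posture checks listed above amount to a set of boolean and age tests on endpoint data. The sketch below shows that logic in isolation; the `posture` dictionary keys, the 30-day patch window, and the failure labels are assumptions for illustration and do not reflect the actual HIP report format.

```python
from datetime import date


def evaluate_hip(posture, today, max_sig_age_days=7, max_patch_age_days=30):
    """Return the list of failed posture checks; an empty list means compliant."""
    failures = []
    if not posture.get("disk_encrypted", False):
        failures.append("disk-encryption")
    if not posture.get("av_running", False):
        failures.append("av-running")
    sig_date = posture.get("av_signature_date")
    if sig_date is None or (today - sig_date).days > max_sig_age_days:
        failures.append("av-signature-age")
    patch_date = posture.get("last_os_patch_date")
    if patch_date is None or (today - patch_date).days > max_patch_age_days:
        failures.append("os-patch-age")
    return failures


laptop = {
    "disk_encrypted": True,
    "av_running": True,
    "av_signature_date": date(2024, 6, 10),
    "last_os_patch_date": date(2024, 4, 1),
}
print(evaluate_hip(laptop, today=date(2024, 6, 15)))  # ['os-patch-age']
```

Returning the list of failed checks, rather than a single pass/fail flag, makes it straightforward to drive a remediation message that names what the user has to fix.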
Split-tunnel design
Most production tenants run an include-domain / exclude-domain split-tunnel. Two patterns dominate:
- "Tunnel everything except media": default route through Prisma, but exclude Microsoft Teams / Zoom voice domains so RTP goes direct. Saves Prisma bandwidth and improves call quality.
- "Tunnel corporate apps only": pin specific corporate FQDNs to the tunnel and let everything else go direct. Useful for contractors and BYOD where you don't want to inspect personal traffic.
Both modes need accurate domain lists; the Microsoft 365 published JSON of optimize-required-default endpoints is the standard input for the first pattern.
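The two patterns differ only in what a domain-list match means. A minimal sketch of that decision, assuming simple exact and `*.` wildcard matching; the domain names are placeholders and the real client's matching rules may differ.

```python
def matches(domain, patterns):
    """True if domain equals a pattern or falls under a '*.suffix' wildcard."""
    domain = domain.lower().rstrip(".")
    for pattern in patterns:
        pattern = pattern.lower()
        if pattern.startswith("*."):
            # "*.example.net" matches subdomains only, not the bare apex.
            if domain.endswith(pattern[1:]):
                return True
        elif domain == pattern:
            return True
    return False


def route_for(domain, mode, domain_list):
    """mode 'exclude': tunnel all but listed; mode 'include': tunnel only listed."""
    listed = matches(domain, domain_list)
    if mode == "exclude":
        return "direct" if listed else "tunnel"
    if mode == "include":
        return "tunnel" if listed else "direct"
    raise ValueError(f"unknown split-tunnel mode: {mode}")


media = ["*.media.example.net", "calls.example.net"]   # hypothetical voice domains
corp = ["*.corp.example.com"]                          # hypothetical corporate FQDNs

print(route_for("relay1.media.example.net", "exclude", media))  # direct
print(route_for("intranet.example.com", "exclude", media))      # tunnel
print(route_for("jira.corp.example.com", "include", corp))      # tunnel
print(route_for("news.example.org", "include", corp))           # direct
```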
Remote Networks: branch onboarding with real config
The Remote Network onboarding flow in SCM walks you through naming, region selection, bandwidth and tunnel parameters. What it does not tell you is how to express the same parameters on a real branch CPE. Here is the production-grade pair: IKEv2 + ESP, with primary + secondary tunnels.
    ! ---- IKEv2 (Phase 1) ----
    encryption     : aes-256-cbc
    integrity      : sha384
    dh-group       : group20 (ECP-384)
    authentication : pre-shared-key
    lifetime       : 28800 seconds
    dpd            : on-demand, 30 s interval, 3 retries
    ! ---- ESP (Phase 2) ----
    encryption     : aes-256-gcm
    integrity      : null (implicit with GCM)
    pfs            : group20
    lifetime       : 3600 seconds, 4 GB data
    ! ---- Identifiers ----
    local-id       : fqdn / branch-mumbai-01.example.com
    remote-id      : fqdn / pa-asia-south1.prisma-access.com
    ! ---- Routing ----
    mode           : routed (NOT policy-based)
    local-net      : advertised via BGP
    remote-net     : advertised via BGP
BGP is the right answer for any branch with more than one prefix or more than one tunnel. Static routes work but become a maintenance burden the moment you add a new internal VLAN.
    router bgp 64512
     bgp router-id 10.20.0.1
     neighbor 169.254.10.1 remote-as 65000          ! Prisma Access ASN
     neighbor 169.254.10.1 ebgp-multihop 2
     neighbor 169.254.10.1 update-source Tunnel1
     address-family ipv4 unicast
      network 10.20.10.0/24                         ! branch user VLAN
      network 10.20.20.0/24                         ! branch printers VLAN
      neighbor 169.254.10.1 prefix-list FROM-PA in
      neighbor 169.254.10.1 prefix-list TO-PA out
     exit-address-family
Duplicate the entire block for the secondary tunnel pointing at a different compute location, then ECMP across both. When Prisma withdraws the prefix it learns from a Service Connection during maintenance, BGP will silently steer that traffic onto the surviving tunnel within seconds.
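To make the ECMP behaviour concrete, the sketch below hashes a flow's 5-tuple onto whichever tunnels are currently active. It is only an illustration of per-flow ECMP: real CPEs use their own hash functions and forwarding tables, and the tunnel names and the SHA-256 choice are assumptions made for the example.

```python
import hashlib


def pick_tunnel(flow, active_tunnels):
    """Map a flow 5-tuple onto one of the currently active tunnels."""
    if not active_tunnels:
        return None
    ordered = sorted(active_tunnels)
    key = "|".join(str(field) for field in flow).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Same flow always hashes to the same index while the tunnel set is stable.
    index = int.from_bytes(digest[:4], "big") % len(ordered)
    return ordered[index]


flow = ("10.20.10.25", 51514, "10.30.5.13", 443, "tcp")
both_up = pick_tunnel(flow, ["tunnel-1", "tunnel-2"])
after_withdraw = pick_tunnel(flow, ["tunnel-2"])
print(both_up, after_withdraw)
```

While both tunnels are up, a given flow stays on one tunnel; once a tunnel's routes are withdrawn, every flow lands on the survivor without any operator action.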
A common mistake is setting NAT-T off on the CPE because "Prisma is reachable directly". Most branches sit behind an ISP-provided NAT; if NAT-T is off, the tunnel will negotiate but ESP will be silently dropped. Always leave NAT-Traversal on; the per-packet overhead is negligible.
Service Connections: the pipe into your private estate
A Service Connection is structurally another IPSec tunnel into a Prisma compute location, but its role is one-directional reach: it lets Mobile Users and Remote Networks reach private apps that live behind your on-prem firewall, cloud transit gateway, or IaaS VPC. It is not a path for internet egress.
Three production rules:
- Two Service Connections from two DCs, into two compute locations. Active-active, both advertising the same private prefixes with BGP. ECMP load-balances; either side surviving a regional outage keeps private reach alive.
- Advertise summary routes, not host routes. A typical mistake is advertising 10.30.5.13/32 for an SAP server. Advertise the subnet (10.30.5.0/24) and let the on-prem fabric route within it.
- Don't run a default route over a Service Connection. If you do, every Mobile User on the planet will pull internet through your HQ and Prisma will dutifully hairpin it. That is exactly the design Prisma replaces.
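The last two rules are easy to lint before a prefix list goes live. A small sketch using Python's standard `ipaddress` module; the function names and the /24 summary length are illustrative choices, not a Prisma Access feature.

```python
import ipaddress


def lint_advertisements(prefixes):
    """Flag host routes and default routes in a list of advertised prefixes."""
    findings = []
    for text in prefixes:
        net = ipaddress.ip_network(text, strict=False)
        if net.prefixlen == 0:
            findings.append((text, "default route"))
        elif net.prefixlen == net.max_prefixlen:
            findings.append((text, "host route"))
    return findings


def suggest_summary(host_route, new_prefix=24):
    """Return the enclosing subnet to advertise instead of a host route."""
    net = ipaddress.ip_network(host_route, strict=False)
    return str(net.supernet(new_prefix=new_prefix))


advertised = ["10.30.5.13/32", "10.30.5.0/24", "0.0.0.0/0"]
print(lint_advertisements(advertised))
print(suggest_summary("10.30.5.13/32"))  # 10.30.5.0/24
```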
The request lifecycle: eight hops in detail
Steps 1-4 are control-plane (auth + posture + Gateway selection). Step 5 is the tunnel itself. Steps 6-8 are the data-plane NGFW pipeline. ADEM probes and log forwarding run in parallel and are what makes the service operable on day 2.
ZTNA on Prisma Access: App Gateway + Clientless Browser Access
The same Prisma Access service can publish private apps in two flavours: full-tunnel (Mobile User connects, Security policy decides what private subnets they can reach) and ZTNA App Gateway (per-app reverse-proxy publishing, no tunnel needed). The second is the closer match to a Zscaler ZPA / Cloudflare Access experience.
With App Gateway, you register an internal app (e.g. jira.internal), point Prisma at the on-prem reverse-proxy reachable through a Service Connection, and Prisma publishes a public URL like jira.access.example.com. Users authenticate to that URL via SAML, Prisma runs Security policy (App-ID, URL, Threat Prevention) on the reverse-proxied connection, and the user never sees the internal hostname or IP.
Clientless Browser Access is the same App Gateway with no agent on the endpoint, which is useful for contractors or unmanaged devices. Just a browser, SAML SSO, and the per-app reverse proxy.
| Mode | Agent needed | Best for | Catch |
|---|---|---|---|
| Full tunnel (Mobile User) | GlobalProtect | Managed corporate fleet | Tunnel = full network reach unless tightly policied |
| ZTNA App Gateway (agent) | GlobalProtect | Per-app access on managed devices | Per-app policy needs careful per-FQDN config |
| Clientless Browser Access | None | Contractors, BYOD, vendor portals | Limited to web apps; no thick clients |
ADEM: the day-2 observability you already paid for
Autonomous Digital Experience Management ships with every Prisma Access tenant. It runs synthetic probes from the agent and from the Gateway to surface per-segment performance:
- Endpoint segment: CPU, memory, Wi-Fi RSSI, local DNS resolution time.
- ISP / Wi-Fi → Prisma Gateway segment: RTT, jitter, packet loss to the tunnel endpoint.
- Prisma Gateway → SaaS segment: TCP connect time, HTTPS first-byte, full-page load.
- Application-level: synthetic transactions against M365, Salesforce, Workday, etc.
The output is a per-user experience score on a 1-10 scale. When a user complains "Teams is bad", ADEM tells you within 30 seconds whether the bad segment is their Wi-Fi, the ISP, the Prisma path or the destination SaaS, eliminating the 45-minute triage you used to do over chat.
    # 1) On the endpoint
    gpcli --show-status
    # Connected, Gateway = pa-asia-south1, Internal = false
    # 2) In SCM
    # Manage → Insights → Mobile Users → Connected
    # Filter on the user: expect Username from IdP, HIP = Match, Gateway name,
    # Source IP = endpoint NAT IP, Internal IP = a /32 from the Mobile User subnet.
    # 3) Trigger a known-blocked test
    curl -v https://web.telegram.org/
    # expect Prisma block page (TLS handshake to Prisma cert, then RST)
    # 4) Confirm the log
    # SCM → Activity → Log Viewer → Traffic
    # Filter: ( app eq telegram-base ) and ( action eq deny )
    # The Source User column must show the IdP identity, Rule = your contractor block rule.
    # 5) ADEM sanity
    # SCM → Insights → ADEM → User score for this username.
    # Score should be 8+ on a normal home network; drill down on any failing segment.
Logs & SIEM integration: where the data lives, how it gets out
Every log Prisma Access generates lands first in the Strata Logging Service, Palo Alto's hosted log store. Retention defaults are typically 30 days; you can extend with additional licence. You access them three ways:
- SCM Log Viewer: interactive filter, drill-down, save searches.
- Log forwarding to a SIEM: for long-term retention and cross-correlation. Forwarding is configured in Strata Logging Service, which streams logs over syslog (CEF or LEEF format) to Splunk, QRadar, Sentinel, Elastic, Chronicle.
- API access: XSOAR playbooks pull events for automated response.
Sensible log categories to forward day-1: Traffic, Threat, URL, WildFire, Decryption, GlobalProtect events, System. Skip Auth-Success unless you have a specific use case; the volume is enormous and the value is low.
A real-world day-2 scenario: branch tunnel flap
It's 14:00 IST. The Mumbai branch reports "internet is slow, sometimes nothing loads". The branch has primary + secondary Remote Network tunnels to asia-south1 and asia-southeast1 respectively.
The L3 triage flow:
- SCM → Insights → Remote Networks → Mumbai-Branch. Tunnel 1 status flapping (Up / Down / Up) every ~3 minutes. Tunnel 2 stable.
- BGP session table on the branch CPE: neighbour to Prisma on Tunnel 1 keeps re-establishing. Routes withdrawn during each flap → ECMP repaints to Tunnel 2 → users notice ~5 s of "frozen" connections per flap.
- System log filter on the CPE: IKEv2 SA expires at minute marks matching tunnel-flap timing → DPD timeout. ISP path between branch and Prisma is dropping outer ESP under load.
- Resolution. Lower DPD interval from 30 s to 10 s to make the failover faster. Open a ticket with the ISP for the underlying packet loss. Verify in ADEM that user score for that branch climbed from 6.2 back to 8.8.
This is the muscle memory L3 SASE engineers build: Insights → BGP → System log → fix the root cause → verify with ADEM. Never close a ticket without the verification step.
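The first triage step, recognising a flap pattern rather than a single outage, can be sketched as a count of up-to-down transitions in a trailing window. The event tuples and the timings below are made up for illustration; they are not an export format from SCM or from any CPE.

```python
def count_flaps(events, window_s):
    """Count up->down transitions within the trailing window.

    events: list of (timestamp_seconds, state) sorted by time,
            where state is "up" or "down".
    """
    if not events:
        return 0
    cutoff = events[-1][0] - window_s
    flaps = 0
    previous = None
    for ts, state in events:
        if previous == "up" and state == "down" and ts >= cutoff:
            flaps += 1
        previous = state
    return flaps


# Hypothetical Tunnel 1 timeline: drops roughly every 3 minutes, recovers in 5 s.
tunnel1 = [
    (0, "up"), (180, "down"), (185, "up"),
    (360, "down"), (365, "up"),
    (540, "down"), (545, "up"),
]
print(count_flaps(tunnel1, window_s=600))  # 3
print(count_flaps(tunnel1, window_s=200))  # 2
```

A tunnel that goes down once and stays down scores 1; a repeating count over a short window is the signature that points at DPD timeouts or path loss rather than a hard failure.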
Common Mistakes
Without allocation, the Phase 2 SA negotiates and traffic silently drops at the Gateway. The fix is one slider in SCM, but the symptom looks like a routing or policy bug for hours if you don't know to look. Allocate first, onboard second.
EDR detects compromise after the fact. HIP gates connection before the user reaches anything. They are complementary: turn HIP on with at least three checks (disk encryption, AV running, OS patch level) on day one. Friction is minimal and the security value is enormous.
A bad CA push has every browser on every endpoint throwing cert errors on every site. Stage it: pilot ring → 10% → 50% → 100% with explicit rollback gates. Only after the CA is universally trusted should you flip the Decryption policy to "decrypt".
The whole point of Remote Networks is local breakout to internet at the closest Prisma compute location. If you steer all branch internet through a Service Connection back to HQ, you re-create the centralised-egress latency you were paying Prisma to eliminate.
An "ADEM 6" is meaningless without context. Baseline normal score per region on a healthy day, then alert on a sustained drop, not on an absolute threshold. Otherwise SecOps will mute the alert by week two.
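Alerting on a sustained drop against a baseline, rather than on an absolute score, might look like the sketch below. The 1.0-point drop, the three-sample requirement, and the sample scores are assumptions chosen for the example, not ADEM defaults.

```python
from statistics import mean


def sustained_drop(baseline, recent, drop=1.0, min_samples=3):
    """True if the last min_samples scores all sit at least `drop` below baseline."""
    if not baseline or len(recent) < min_samples:
        return False
    threshold = mean(baseline) - drop
    return all(score <= threshold for score in recent[-min_samples:])


healthy_days = [8.8, 8.6, 8.9, 8.7]          # baseline for a well-connected region
print(sustained_drop(healthy_days, [8.5, 6.4, 6.2, 6.3]))  # True: three low samples in a row
print(sustained_drop(healthy_days, [8.5, 6.2, 8.6, 8.4]))  # False: a single dip

low_baseline = [6.1, 5.9, 6.0, 6.2]          # region that normally scores around 6
print(sustained_drop(low_baseline, [6.1, 5.9, 6.0]))       # False: 6 is normal here
```

The last call is the case the paragraph above warns about: a score of 6 is an incident in one region and business as usual in another.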
Pro Tips
Have your IdP send the group memberships in the SAML assertion (Entra: emit security groups; Okta: emit group attribute filtered to relevant groups). Then write Prisma Security rules in terms of those groups. Adding a new Finance user becomes "put them in the Finance AD group", with zero Prisma change required.
Always-On stops users from disabling the tunnel. Pre-Logon brings the tunnel up before the Windows login screen so domain auth, GPO, and Intune policy can reach corporate services from any network. Two settings, massive operational value.
Decryption logs are the canary that tells you a pinned app, an OCSP failure, or a CRL outage broke a critical SaaS. They are noisy but cheap to keep; give them their own SIEM index so SOC can search them without drowning in Traffic logs.
If a single compute location degrades, do you know which branches you have to manually pin to the secondary? Document the failover map: branch → primary region → secondary region → manual cutover steps. Pin the runbook to the same Confluence page as the design.
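Keeping the failover map as structured data makes it checkable. The sketch below is one possible shape; the `FailoverEntry` fields, the cutover step text, and the Pune entry are hypothetical, with only the Mumbai regions taken from the scenario above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailoverEntry:
    branch: str
    primary_region: str
    secondary_region: str
    cutover_steps: tuple  # ordered manual steps copied from the runbook


def map_problems(entries):
    """Flag entries that would not actually survive a regional outage."""
    problems = []
    for e in entries:
        if e.primary_region == e.secondary_region:
            problems.append(f"{e.branch}: primary and secondary are both {e.primary_region}")
        if not e.cutover_steps:
            problems.append(f"{e.branch}: no cutover steps documented")
    return problems


def branches_to_cut_over(entries, degraded_region):
    """Branches whose primary tunnel terminates in the degraded region."""
    return [e.branch for e in entries if e.primary_region == degraded_region]


failover_map = [
    FailoverEntry("Mumbai-Branch", "asia-south1", "asia-southeast1",
                  ("confirm Tunnel 2 BGP is established", "prefer Tunnel 2 on the CPE")),
    FailoverEntry("Pune-Branch", "asia-south1", "asia-south1", ()),
]
print(map_problems(failover_map))
print(branches_to_cut_over(failover_map, "asia-south1"))
```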
Quick Reference
The lesson on one screen
- Three layers: Portal (global anycast, authN + HIP), Gateway (regional NGFW pipeline), Compute Location (the region + bandwidth pool).
- Compute location selection: agent picks lowest RTT Gateway from the Portal-issued list; branches and Service Connections are operator-chosen.
- Allocate regional bandwidth first. Unallocated region cannot accept traffic.
- Mobile Users = CIE auth + HIP posture + split-tunnel design. Always-On + Pre-Logon on managed devices.
- Remote Networks = IKEv2 + ESP + BGP + ECMP across two tunnels to two compute locations.
- Service Connections = active-active from two DCs into two regions; advertise summaries; no default route.
- ZTNA App Gateway publishes a private app behind a SAML-protected public URL. Clientless Browser Access for agentless users.
- ADEM gives per-segment experience scores, making it your first stop for any "the app is slow" ticket.
- Logs live in Strata Logging Service; forward Traffic, Threat, URL, WildFire, Decryption, GP events, System to your SIEM via log forwarding.
- Triage flow: Insights → BGP / system log → root cause → verify with ADEM. Never skip the verification step.
What's next?
Next up in the Palo Alto track: Prisma Cloud Defender deployment on EKS and AKS, IaC scanning in CI/CD with Bridgecrew/Prisma Code Security, Prisma SD-WAN ION devices and CloudBlade integration, and a complete day-2 runbook for the L3 SASE on-call. Practice the Mobile User and branch flows on the simulators, then take the cert-style scenario set on exam.techclick.in.