Most people treat NetScaler HA as 'just tick the box and the backup node is ready'. That mental model breaks under exam pressure — and in production at 2 a.m.
HA is a carefully choreographed system: a heartbeat channel, an incarnation counter, a floating IP handoff via gratuitous ARP, and a sync-vs-propagation split that confuses almost every first-time candidate. Getting those mechanics right — and knowing when INC mode changes all the rules — is what separates a real deployment from a lucky one.
① HA pair basics — active-passive, heartbeat and failover
A standard NetScaler HA pair is active-passive: the primary node handles all client traffic while the secondary stays in standby, continuously monitoring the primary via heartbeat messages sent over a dedicated link (and optionally over the management network).
A failover is triggered when the secondary stops receiving heartbeats for the configured dead-interval (default 3 seconds), or when a route monitor, interface monitor, or VLAN monitor on the primary reports a critical failure. A manual failover can also be forced with the force ha failover command.
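The timing rule above can be sketched in a few lines of Python. This is a simplified model, not NetScaler's implementation: the function names are hypothetical, and the only values taken from the lesson are the 200 ms heartbeat interval and the 3 s dead-interval.

```python
HEARTBEAT_INTERVAL_S = 0.2  # default heartbeat interval quoted in this lesson (200 ms)
DEAD_INTERVAL_S = 3.0       # default dead-interval quoted in this lesson (3 s)


def heartbeats_per_dead_interval(heartbeat_interval_s: float = HEARTBEAT_INTERVAL_S,
                                 dead_interval_s: float = DEAD_INTERVAL_S) -> int:
    """How many heartbeat slots fit inside one dead-interval."""
    return round(dead_interval_s / heartbeat_interval_s)


def should_fail_over(last_heartbeat_at: float, now: float,
                     dead_interval_s: float = DEAD_INTERVAL_S,
                     critical_monitor_down: bool = False) -> bool:
    """Hypothetical decision rule: fail over after a full dead-interval of
    silence, or as soon as a critical monitor reports a failure."""
    silent_for = now - last_heartbeat_at
    return critical_monitor_down or silent_for >= dead_interval_s


if __name__ == "__main__":
    print(heartbeats_per_dead_interval())         # heartbeats missed before failover
    print(should_fail_over(0.0, 2.9))             # still inside the dead-interval
    print(should_fail_over(0.0, 3.0))             # dead-interval reached
```

With the defaults, the secondary tolerates roughly 15 consecutive missed heartbeats before it promotes itself.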
On failover the new primary sends a gratuitous ARP (GARP) for every floating IP (MIPs, VIPs, SNIPs) so upstream routers and switches update their ARP tables and route traffic to the new primary immediately.
What stays on the nodes vs what floats
Each node has its own NSIP (management IP — never moves). The MIP, SNIP and VIP are floating and are owned by whichever node is primary. This split is essential for understanding why the secondary can be managed independently even during normal operation.
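A small sketch can make the "stays vs floats" split concrete. It assumes a toy data model (the Node and HaPair classes are hypothetical, and the addresses come from the 192.0.2.0/24 documentation range): the NSIP is an attribute of the node, while the floating IPs belong to the pair and follow whichever node is primary.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    nsip: str  # per-node management IP; never moves


@dataclass
class HaPair:
    primary: Node
    secondary: Node
    floating_ips: list = field(default_factory=list)  # VIPs, SNIPs, MIPs

    def fail_over(self) -> list:
        """Swap roles and return the floating IPs the new primary
        would announce with gratuitous ARP."""
        self.primary, self.secondary = self.secondary, self.primary
        return list(self.floating_ips)


if __name__ == "__main__":
    pair = HaPair(Node("ns-a", "192.0.2.10"), Node("ns-b", "192.0.2.11"),
                  ["192.0.2.100", "192.0.2.101"])
    announced = pair.fail_over()
    print(pair.primary.name, pair.primary.nsip, announced)
```

After the failover each node still has the NSIP it started with; only the ownership of the floating list has changed.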
A NetScaler secondary node decides to fail over. What does it send immediately after promoting itself to primary?
② Sync & propagation — and when INC mode changes the rules
NetScaler keeps the secondary ready through two separate mechanisms. Synchronisation copies the primary's full ns.conf to the secondary. It is triggered on startup, after a manually forced sync (force ha sync), or when the incarnation number of the primary is higher than the secondary's. Propagation is different: it pushes individual CLI commands to the secondary in real time as they are entered on the primary, so the secondary's running config stays current without a full file copy.
Both sync and propagation are enabled by default. Propagation is paused temporarily while a sync is in progress, to prevent conflicting commands. The sync log lives in /var/nslog/ha.log.
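The difference between the two mechanisms is easier to see in a toy model. The sketch below is an assumption-laden simplification: HaNode and the three functions are hypothetical names, commands are opaque strings, and the incarnation counter is assumed to advance by one per command on whichever node applies it.

```python
from dataclasses import dataclass, field


@dataclass
class HaNode:
    incarnation: int = 0
    config: list = field(default_factory=list)


def apply_on_primary(primary: HaNode, secondary: HaNode,
                     command: str, propagate: bool = True) -> None:
    """Apply one command on the primary; propagation replays it on the secondary."""
    primary.config.append(command)
    primary.incarnation += 1
    if propagate:
        secondary.config.append(command)
        secondary.incarnation += 1


def needs_sync(primary: HaNode, secondary: HaNode) -> bool:
    """A full sync is due when the primary's incarnation number is ahead."""
    return primary.incarnation > secondary.incarnation


def synchronise(primary: HaNode, secondary: HaNode) -> None:
    """Full sync: replace the secondary's config with a copy of the primary's."""
    secondary.config = list(primary.config)
    secondary.incarnation = primary.incarnation


if __name__ == "__main__":
    p, s = HaNode(), HaNode()
    apply_on_primary(p, s, "command-1")                   # propagated in real time
    apply_on_primary(p, s, "command-2", propagate=False)  # propagation missed
    print(needs_sync(p, s))
    synchronise(p, s)
    print(s.config == p.config, needs_sync(p, s))
```

Propagation keeps the pair aligned command by command; the full copy is the fallback that repairs any drift.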
INC (Independent Network Configuration) mode
INC mode is needed when the two HA nodes are in different subnets — a common pattern for geographically separated data centres. In INC mode: the nodes do not share floating SNIPs (each node keeps its own subnet IPs), gratuitous ARPs are not sent by the new primary (because the L2 domain is not shared), and DNS or GSLB must handle the client redirect after failover. Certain configurations — notably SNIP addresses — are neither synced nor propagated in INC mode, which is the number-one exam trap.
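As a rough illustration of how INC mode changes the rules, the sketch below filters a settings dictionary the way the paragraph describes. It models only the SNIP exclusion and the GARP behaviour named above; the function names and the dictionary layout are hypothetical, and any other node-specific settings are deliberately left out.

```python
def settings_sent_to_secondary(settings: dict, inc_mode: bool) -> dict:
    """Return the settings that sync or propagation would carry across.
    In INC mode SNIPs stay local to each node (other exclusions not modelled)."""
    excluded = {"snip"} if inc_mode else set()
    return {kind: values for kind, values in settings.items() if kind not in excluded}


def sends_garp_on_failover(inc_mode: bool) -> bool:
    """No shared L2 domain in INC mode, so the new primary sends no GARP."""
    return not inc_mode


if __name__ == "__main__":
    settings = {"vip": ["192.0.2.100"], "snip": ["192.0.2.20"]}
    print(settings_sent_to_secondary(settings, inc_mode=False))
    print(settings_sent_to_secondary(settings, inc_mode=True))
    print(sends_garp_on_failover(inc_mode=True))
```

The practical consequence is that each node's SNIPs must be configured on that node, because nothing will copy them over.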
Heartbeat: Periodic health-check packets sent between HA nodes (default every 200 ms). The secondary initiates failover if the primary misses enough heartbeats within the dead-interval (default 3 s).
Sync vs propagation: Sync copies the full ns.conf (triggered by incarnation mismatch or restart). Propagation pushes individual CLI commands in real time. Both are on by default; sync pauses propagation temporarily.
INC mode: Independent Network Configuration, used when HA nodes are in different subnets. No shared floating SNIPs; no GARPs after failover. DNS or GSLB must redirect clients. SNIPs are NOT synced or propagated in INC mode.
Striped vs spotted: Striped config is active on all cluster nodes (traffic distributed). Spotted config is bound to a single-node node group (one node handles that workload). Partially striped sits between. Use spotted SNIPs to avoid ARP flux.
Any command you run on the secondary's NSIP directly (not via propagation from the primary) will be overwritten at the next sync. Worse, conflicting commands can break propagation entirely. Always configure via the primary — or via the CLIP in a cluster.
Which HA mechanism copies individual CLI commands to the secondary in real time?
③ NetScaler clustering — striped, partially striped and spotted
A NetScaler cluster replaces the active-passive HA model with an active-active one that scales to up to 32 nodes. A single CLIP (Cluster IP) is used to configure and manage the whole cluster, and one node acts as the configuration coordinator (CCO), propagating changes to all other nodes.
Configurations in a cluster are classified by how many nodes they are active on. A striped configuration is active on all nodes — traffic is distributed across the cluster using the striped SNIP or VIP. A spotted configuration is bound to a node group containing a single node, so only that one node handles that workload. A partially striped configuration is bound to a node group with more than one node — a subset of the cluster.
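The three categories reduce to one question: how many nodes is the configuration bound to? A minimal sketch, assuming a configuration with no node group is striped and using a hypothetical classify_config helper:

```python
from typing import Optional


def classify_config(node_group_size: Optional[int], cluster_size: int) -> str:
    """Classify a cluster configuration by the size of its node group.
    None means the config is not bound to any node group."""
    if node_group_size is None:
        return "striped"            # active on every node
    if not 1 <= node_group_size <= cluster_size:
        raise ValueError("node group size must be between 1 and the cluster size")
    if node_group_size == 1:
        return "spotted"            # exactly one node handles the workload
    return "partially striped"      # bound to a multi-node group


if __name__ == "__main__":
    for size in (None, 1, 2):
        print(size, "->", classify_config(size, cluster_size=4))
```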
When to use spotted vs striped
Use striped when traffic can be handled by any node (most load-balanced VIPs). Use spotted for workloads that require locality — for example, a specific application that has licensing tied to one node, or where ARP flux from striped SNIPs would be a problem. Citrix recommends spotted SNIP addresses when USIP is disabled to avoid ARP flux; striped SNIPs should be used only when IP addresses are scarce.
When USIP is disabled and you use striped SNIPs (active on all nodes), each node responds to ARP for the same IP — causing ARP flux and unpredictable routing. The fix is to use spotted SNIPs (one per node) for each subnet and only resort to striped SNIPs when IP address space is genuinely scarce.
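ARP flux is simply "more than one node answers for the same IP". The sketch below checks for that condition in a hypothetical mapping of SNIP address to owning nodes; the node names and the 192.0.2.0/24 addresses are illustrative only.

```python
def arp_responders(snip_owners: dict, ip: str) -> list:
    """Nodes that would answer an ARP request for the given SNIP."""
    return sorted(set(snip_owners.get(ip, [])))


def has_arp_flux(snip_owners: dict, ip: str) -> bool:
    """Flux occurs when several nodes answer for one IP."""
    return len(arp_responders(snip_owners, ip)) > 1


if __name__ == "__main__":
    striped = {"192.0.2.20": ["n1", "n2", "n3"]}          # one SNIP on all nodes
    spotted = {"192.0.2.21": ["n1"], "192.0.2.22": ["n2"],
               "192.0.2.23": ["n3"]}                      # one SNIP per node
    print(has_arp_flux(striped, "192.0.2.20"))
    print(any(has_arp_flux(spotted, ip) for ip in spotted))
```

The spotted layout costs one address per node per subnet, which is the trade-off behind the "only when IP addresses are scarce" advice.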
You have a NetScaler cluster where a specific application must run on one designated node only. Which configuration type do you use?
④ Troubleshooting HA & cluster failures
The most dangerous HA failure is split-brain: both nodes believe they are the primary and each sends GARPs for the floating IPs. This happens when the heartbeat link fails but the nodes are otherwise healthy. The fix is ensuring redundant heartbeat paths (management + dedicated HA VLAN) and enabling independent heartbeat monitoring on a separate interface.
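A monitoring script can flag split-brain by counting how many nodes claim the primary role. The sketch below assumes the roles have already been collected from each node (for example by reading show ha node on each NSIP); it does not parse real NetScaler output, and ha_state is a hypothetical helper.

```python
def ha_state(reported_roles: dict) -> str:
    """Summarise an HA pair from the role each node reports about itself."""
    primaries = [node for node, role in reported_roles.items()
                 if role.strip().upper() == "PRIMARY"]
    if len(primaries) == 1:
        return "healthy"
    if len(primaries) > 1:
        return "split-brain"   # both nodes believe they are primary
    return "no-primary"


if __name__ == "__main__":
    print(ha_state({"ns-a": "Primary", "ns-b": "Secondary"}))
    print(ha_state({"ns-a": "Primary", "ns-b": "Primary"}))
```

The check only works if each node is queried on its own NSIP, since a floating IP would reach just one of the two nodes.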
For sync failures, check /var/nslog/ha.log. Common causes: the secondary's disk is full (df -h), the NSIP is unreachable over the sync path, or a command was manually entered on the secondary that conflicts with the primary's config (never configure the secondary directly unless you know what you are doing).
Cluster node INACTIVE state
In a cluster, a node goes INACTIVE when the CCO cannot reach it or when it has a config sync failure. Check show cluster node on the CLIP for the health of each node, review /var/nslog/cl.log for cluster events, and verify the cluster backplane interface is up. If a node rejoins after an outage, the CCO automatically pushes a full config sync to bring it current.
Rohan at a Mumbai fintech firm faces this
Symptom: After a network maintenance window, clients report intermittent VIP unreachability. The NOC sees both NetScaler nodes showing as primary in SNMP alerts.
Root cause: The dedicated HA heartbeat VLAN was accidentally shut down during the maintenance, and no redundant heartbeat interface was configured. Both nodes declared themselves primary: classic split-brain.
Diagnose: Log into each node's NSIP independently. Run show ha node; both report 'Primary'. Check interface status: the HA heartbeat VLAN interface is down on both. Check /var/nslog/ha.log for the heartbeat events.
Fix: Bring the heartbeat VLAN back up. The node with the lower HA ID will revert to secondary. Add a second heartbeat path on the management interface (set ha node -haStatus ENABLED on the NSIP subnet) so a single VLAN outage cannot repeat this.
Verify: Run show ha node on both nodes; one shows PRIMARY, the other SECONDARY. VIP reachability restores; SNMP shows only one primary alert.
For HA issues, /var/nslog/ha.log records every heartbeat event, failover, sync start/end, and propagation error with timestamps. For cluster issues, /var/nslog/cl.log does the same for CCO elections, node state changes, and backplane events. Read the log before touching any config — most HA tickets are solved in the first ten lines.
Both NetScaler HA nodes believe they are primary and are each sending GARPs for the same VIPs. What is the most likely root cause?
📖 Glossary
- NSIP (NetScaler IP): The permanent management IP of each node; it never floats between nodes. Used to manage each node independently via SSH or the GUI.
- Floating IP: A VIP, SNIP or MIP that is owned by the current primary and claimed by the new primary via gratuitous ARP on failover.
- Heartbeat: Periodic health-check packets (default every 200 ms) sent between HA nodes. Missing heartbeats for the configured dead-interval triggers failover.
- INC Mode: Independent Network Configuration, an HA mode for nodes in different subnets. No shared floating SNIPs; no GARPs; DNS or GSLB redirects post-failover. SNIPs are not synced.
- CLIP (Cluster IP): A single management IP for an entire NetScaler cluster. All configuration is applied here and distributed by the CCO to every node.
- CCO (Config Coordinator): The elected node in a cluster that receives configuration from the CLIP and propagates it to all other nodes via the cluster backplane.
- Striped configuration: A cluster config active on all nodes; traffic is distributed across the entire cluster for that VIP or SNIP.
- Spotted configuration: A cluster config bound to a node group containing a single node, so only that node handles the workload. Used for locality-sensitive or licence-tied applications.
- Split-brain: The dangerous state where both HA nodes believe they are primary, usually due to heartbeat link failure. Both send GARPs, causing network instability.
- Incarnation number: A counter on each node that increments with every config change. A mismatch between primary and secondary triggers a full synchronisation.
📚 Sources
- Citrix / NetScaler Docs — High Availability Introduction (NetScaler 14.1 current release). docs.netscaler.com/en-us/citrix-adc/current-release/system/high-availability-introduction.html
- Citrix / NetScaler Docs — Configuring synchronisation in high availability (NetScaler 14.1). docs.netscaler.com/en-us/citrix-adc/current-release/system/high-availability-introduction/configuring-synchronization-high-availability.html
- Citrix / NetScaler Docs — Striped, partially striped and spotted configurations (clustering). docs.netscaler.com/en-us/citrix-adc/current-release/clustering/cluster-overview/cluster-config-type.html
- Citrix / NetScaler Docs — Troubleshooting NetScaler cluster. docs.netscaler.com/en-us/citrix-adc/current-release/clustering/cluster-troubleshooting.html
- Citrix Support — Troubleshooting Citrix ADC High Availability (HA) — cheat-sheet PDF. docs.citrix.com/en-us/tech-zone/learn/downloads/diagrams-posters_cheat-sheet-adc-troubleshooting-high-availability.pdf
- Citrix Support Article CTX116748 — How to Set Up a High Availability Pair on ADC. support.citrix.com/article/CTX116748
What's next?
Got HA and clustering solid? Next, go deep on NetScaler load-balancing algorithms, persistence methods and content switching — and how to pick the right combination for your application tier.