[Day 103] Cisco ISE Mastery Training: High Availability & Failover Testing
Introduction
In enterprise networks, downtime is unacceptable. Whether you are protecting a hospital’s patient data, a bank’s financial records, or a defense network’s classified systems, Cisco ISE (Identity Services Engine) must be available 24×7 with zero disruption. A single node failure in an ISE cluster without proper high availability (HA) can cripple authentication, break policy enforcement, and lock users/devices out of critical services.
That’s why High Availability (HA) and Failover Testing is not a “nice-to-have” – it is a mission-critical requirement in every Cisco ISE deployment. Designing HA ensures redundancy, fault tolerance, and continuous authentication services. But design is not enough — you must validate and test failover scenarios in controlled labs and production rollouts, simulating real-world node crashes, database sync issues, or PSN overloads.
This session is where we move from theory to battle-ready engineering. You’ll not only configure HA, but also:
- Validate session persistence during failover (wired, wireless, VPN).
- Test redundancy for Policy Service Nodes (PSNs), PANs, and MnT nodes.
- Simulate failures with CLI shutdowns, VM suspensions, and interface blocking.
- Use ISE GUI, CLI, and syslogs to confirm that the cluster seamlessly absorbs the failure.
By the end of this workbook, you won’t just “know” HA design—you’ll be able to prove your Cisco ISE deployment can survive failures before your production network ever sees them.
Problem Statement (What breaks without discipline)
- Outages during PSN patching → Wi-Fi login storms fail.
- PAN outage → no policy changes, join/leave failures, stale config.
- MnT outage → compliance gaps (lost logs).
- WAN flap → remote sites time out against distant PSNs.
- AD/DC hiccup → EAP-TLS OK, but PEAP/MSCHAPv2 fails; posture/guest flows degrade.
You need a repeatable failover test to prove: no auth loss, no portal loops, logs preserved, clean promotion, fast detection.
Solution Overview (What we will implement & test)
- PSN Tier: ≥2 PSNs behind LB with health probes + persistence (RADIUS & HTTPS).
- PAN Tier: 1× Primary PAN + 1× Secondary PAN (manual promotion).
- MnT Tier: 1× Active + 1× Standby (automatic role assumption).
- Shared Services: NTP/DNS/PKI/AD reachability verified; latency within guidance.
- Runbooks: Planned drain/patch, failover tests, rollback.
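The PSN tier above depends on two load-balancer behaviors: health probes that remove a failed PSN from rotation, and persistence that keeps an endpoint on the same PSN. The following is a minimal sketch of that selection logic, not how F5 or Citrix ADC implement it. The function name `pick_psn`, the use of Calling-Station-ID as the persistence key, and the hash-based choice are all assumptions made for illustration.

```python
import hashlib

def pick_psn(calling_station_id, pool, healthy, persistence):
    """Return the PSN that should receive this endpoint's RADIUS traffic.

    pool: ordered list of PSN names; healthy: set of PSNs passing the probe;
    persistence: dict mapping Calling-Station-ID to the PSN last chosen.
    All names here are hypothetical.
    """
    # Reuse the sticky entry while its PSN still passes the health probe.
    current = persistence.get(calling_station_id)
    if current in healthy:
        return current
    candidates = [psn for psn in pool if psn in healthy]
    if not candidates:
        raise RuntimeError("no healthy PSN in pool")
    # Deterministic hash so the same endpoint maps to the same candidate.
    digest = hashlib.sha256(calling_station_id.encode()).digest()
    chosen = candidates[int.from_bytes(digest[:4], "big") % len(candidates)]
    persistence[calling_station_id] = chosen
    return chosen
```

In a failover drill, removing the chosen PSN from `healthy` and calling `pick_psn` again should return a different, healthy PSN and update the persistence entry. That is the behavior the tests in Step 4 are meant to confirm on the real load balancer.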
Sample Lab Topology (VMware / EVE-NG)
Compute/Apps
- ISE nodes: PAN-PRI, PAN-SEC, MNT-ACT, MNT-STD, PSN-1..4
- Load Balancer: F5 BIG-IP or Citrix ADC (VIPs: RADIUS-Auth 1812, RADIUS-Acct 1813, HTTPS 443)
- Services: AD Domain Controller(s), CA/PKI, NTP/DNS, Syslog/SIEM
- NADs: Catalyst 9300 (wired 802.1X/MAB), WLC 9800 (CWA/BYOD), AnyConnect/ASA or FTD VPN
- Endpoints: Windows 11 (EAP-TLS), iOS/Android (Guest/BYOD), IoT (MAB)
Topology diagram:

Step-by-Step Guide (GUI + CLI Validation)
Step 1 – Verify Node Roles & Deployment State
Why: Before testing HA, you must confirm your nodes (PAN, MnT, PSN) are properly registered and synchronized.
GUI Validation
- Log in to the ISE Primary Admin Node.
- Navigate to:
Administration → System → Deployment
[Screenshot: ISE Deployment Nodes Page]

- Confirm:
- Primary PAN and Secondary PAN roles.
- Primary MnT and Secondary MnT roles.
- Multiple PSNs registered.
CLI Validation
On each node, run:
show application status ise
Ensure Application Server, Database Listener, and ISE Indexing Engine are running.
Note which node is Primary and which is Secondary; the Deployment page lists each node's role.
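When this check has to be repeated on every node, the output of show application status ise can be captured over SSH and scanned by a script. The sketch below assumes the output is a table with the process name first and the state as the next word. The sample text is illustrative only and was not captured from a real node; `stopped_processes` is a hypothetical helper name.

```python
REQUIRED = ("Application Server", "Database Listener", "ISE Indexing Engine")

def stopped_processes(cli_output, required=REQUIRED):
    """Return the required processes that are not reported as running."""
    states = {}
    for line in cli_output.splitlines():
        stripped = line.strip()
        for name in required:
            if stripped.startswith(name):
                # The state is taken as the first word after the process name.
                rest = stripped[len(name):].split()
                states[name] = rest[0] if rest else "unknown"
    # A process that never appears in the output is also reported.
    return [name for name in required if states.get(name) != "running"]

# Illustrative sample; the column layout is an assumption.
SAMPLE = """\
ISE PROCESS NAME                       STATE            PROCESS ID
--------------------------------------------------------------------
Database Listener                      running          4621
Database Server                        running          91 PROCESSES
Application Server                     running          12041
ISE Indexing Engine                    not running
"""

problems = stopped_processes(SAMPLE)
```

An empty list means all three services are up. Anything else names the services to investigate before starting a failover test.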
Step 2 – Test Primary PAN Failure (Admin Node)
Why: PAN availability is critical for management + config pushes.
Simulation
On the Primary PAN, shut down the ISE application:
application stop ise
Or simulate network isolation from the VM console (shutting the interface cuts SSH access to the node):
configure terminal
interface GigabitEthernet 0
shutdown
GUI Validation
- From another console, log into Secondary PAN GUI.
- Go to:
Administration → System → Deployment
[Screenshot: Deployment showing Secondary PAN as Active]

- Confirm role promotion: unless automatic PAN failover is enabled, click Promote to Primary on the Secondary PAN, then verify that it is listed as the Primary PAN.
CLI Validation
On the Secondary PAN:
show application status ise
Look for every ISE process in the running state. This output lists process states only, so confirm the promoted role on the Deployment page.
Rollback:
application start ise
or bring the interface back up.
Step 3 – Test MnT Failover (Monitoring Node)
Why: MnT collects logs, RADIUS live sessions, reports.
Simulation
Suspend the Primary MnT VM or stop the ISE application on it.
GUI Validation
- Navigate to:
Operations → Reports → Authentications
[Screenshot: Authentication Logs on Secondary MnT]

- Ensure reports/logs are still accessible.
CLI Validation
show logging system tail
Check that logs are being recorded on Secondary MnT.
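To turn "logs are being recorded" into a pass/fail result, one option is to compare the session IDs the NADs generated during the failover window with the session IDs found in the log store afterwards (MnT report export or SIEM). The sketch below assumes both lists have already been exported. The function name, the tuple layout, and the window boundaries are hypothetical.

```python
from datetime import datetime, timedelta

def lost_during_failover(sent, received_ids, start, end):
    """Return session IDs sent inside [start, end] that never reached the log store.

    sent: list of (timestamp, session_id) taken from the NAD side.
    received_ids: iterable of session IDs found in the MnT or SIEM export.
    """
    received = set(received_ids)
    return [sid for ts, sid in sent if start <= ts <= end and sid not in received]

# Example with made-up data: the MnT was suspended at t0 for two minutes.
t0 = datetime(2024, 1, 1, 10, 0, 0)
sent = [
    (t0, "s1"),
    (t0 + timedelta(seconds=30), "s2"),
    (t0 + timedelta(seconds=60), "s3"),
    (t0 + timedelta(seconds=600), "s4"),  # outside the test window
]
lost = lost_during_failover(sent, ["s1", "s3"], t0, t0 + timedelta(seconds=120))
```

An empty result for the failover window is the evidence that the MnT switch did not drop logs.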
Step 4 – Test PSN Failover (Policy Service Node)
Why: PSN failure impacts live authentication traffic (critical test).
Simulation
On one PSN:
application stop ise
GUI Validation
- On PAN GUI:
Administration → System → Deployment
[Screenshot: One PSN Down, Others Active]

- On a live WLC or switch, authenticate a client (wired or wireless).
CLI Validation
From ISE CLI:
show radius statistics
Verify that authentication requests are now handled by other PSNs.
On WLC/Switch CLI:
show aaa servers
Confirm failover from failed PSN to healthy PSN.
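Confirming that failover happened says nothing about how long authentication was unavailable. If a test client authenticates in a loop during the drill and each attempt is logged with a timestamp, the outage length can be computed afterwards. This is a minimal sketch; the probe list format and the name `longest_outage` are assumptions, and the probing itself (for example a scripted supplicant) is out of scope here.

```python
def longest_outage(probes):
    """Return the longest failure window in seconds.

    probes: list of (timestamp_seconds, ok) sorted by time, one entry per
    authentication attempt. A window runs from the first failed probe to
    the next successful one.
    """
    longest = 0.0
    fail_start = None
    for ts, ok in probes:
        if not ok and fail_start is None:
            fail_start = ts  # a failure window opens
        elif ok and fail_start is not None:
            longest = max(longest, ts - fail_start)  # the window closes
            fail_start = None
    if fail_start is not None:
        # Still failing at the end of the capture: count up to the last probe.
        longest = max(longest, probes[-1][0] - fail_start)
    return longest

# Made-up capture: one probe per second, PSN stopped around t=2.
gap = longest_outage([(0, True), (1, True), (2, False), (3, False),
                      (4, False), (5, True), (6, True)])
```

The measured gap can then be compared against the RADIUS timeout and dead-time settings on the NAD or the health-probe interval on the load balancer.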
Step 5 – Session Persistence Test
Why: During failover, ongoing sessions must remain intact.
Test Procedure
- Connect a test endpoint (laptop/VM) to wired switch with 802.1X enabled.
- Authenticate successfully (dot1x/MAB).
- Simulate PSN failure as above.
- Monitor:
- Endpoint should not get disconnected.
- Session continues until re-authentication is triggered.
CLI Validation on Switch
show authentication sessions interface Gi1/0/10
Session should still show as Authorized.
Step 6 – Database Synchronization Test
Why: The Primary and Secondary PAN must keep their configuration in sync.
GUI Validation
- On Secondary PAN GUI:
Administration → System → Deployment → Synchronization Status
[Screenshot: DB Sync Status Page]
- Confirm sync = Up to Date.
CLI Validation
On Secondary PAN:
show application status ise
Confirm that every ISE process is running. The replication state itself is reported on the Deployment page, not in this output.
Step 7 – Report & Document Results
Why: In real deployments, you must prove HA works.
- Record screenshots of:
- Deployment view during failover.
- Authentication logs during PSN failover.
- CLI outputs (show application status ise).
- Create a Failover Report:
- Test performed
- Expected outcome
- Actual outcome
- Pass/Fail
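The report fields above map naturally onto a small record type, which makes it easy to produce the same CSV after every drill. The following is a sketch under simple assumptions: Pass/Fail is derived by comparing the expected and actual text, and the class and function names are hypothetical.

```python
import csv
import io
from dataclasses import dataclass

@dataclass
class FailoverTest:
    test: str      # what was done, e.g. "Stop PSN-1"
    expected: str  # expected outcome
    actual: str    # observed outcome

    @property
    def result(self):
        # Simplifying assumption: a test passes when the texts match.
        same = self.actual.strip().lower() == self.expected.strip().lower()
        return "Pass" if same else "Fail"

def to_csv(tests):
    """Render the failover report as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Test", "Expected", "Actual", "Result"])
    for t in tests:
        writer.writerow([t.test, t.expected, t.actual, t.result])
    return buf.getvalue()

report = to_csv([
    FailoverTest("Stop PSN-1", "New auths succeed", "New auths succeed"),
    FailoverTest("Suspend Primary MnT", "Logs visible", "Logs missing"),
])
```

In practice the Actual column would be filled in by hand or from the CLI captures, and the screenshots listed above would be stored alongside the CSV.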
Expert-Level Use Cases
- Dual-Region ISE with GSLB
- Site-local PSNs + Global VIP DNS proximity.
- PAN/MnT anchored in primary DC; remote MnT for local log cache.
- Drill: cut Region-A WAN → Region-B clients unaffected.
- PSN Blue/Green + Staggered Cert Rotation
- Blue pool presents *.corpA.com, Green *.corpB.com during migration with dual-SANs.
- LB weights shift after root/intermediate swap; zero portal warnings.
- Auth Partitioning by NAD Class
- Separate VIPs/pools for WLC, Switch, VPN; tailored persistence timeouts and weights.
- Stops Wi-Fi surges from starving wired/VPN.
- IoT/MAB High-Stickiness Window
- Increase RADIUS persistence timeout for MAB to hours; reduce CoA storms for flappy devices.
- AD Fragility Shield
- Policy fallback to EAP-TLS only during AD outage; admin banner alerts; automatic restore rule with time window.
- MnT Split-Write with External SIEM
- Keep MnT logs minimal (short retention) and stream to SIEM for long-term; failover test validates no loss during MnT switch.
- Change-Window Guardrails
- Real-time success-rate SLO in Grafana (ISE syslog + LB stats). If success <98.5% for 2 min → auto-rollback (re-enable drained PSNs).
- CoA Path Assurance
- Pre-deploy ACLs from all PSN IPs to all NADs; nightly CoA synthetic probes validate UDP/3799 reachability.
- WAN-Aware Policy Sync Windows
- Freeze policy publishes during known WAN maintenance; schedule PSN “config pull verification” after link up.
- Portal Isolation Zone
- Dedicated HTTPS VIP/pool for Guest/BYOD portals with different TLS ciphers and WAF in front; RADIUS VIPs untouched.
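The Change-Window Guardrails item describes a rule: roll back if the authentication success rate stays below 98.5% for 2 minutes. The sketch below shows one way to evaluate that rule over a series of samples. It assumes the success rate has already been computed from ISE syslog and LB stats; the function name and the sample format are hypothetical, and the rollback action itself is left out.

```python
def should_rollback(samples, threshold=98.5, hold_seconds=120):
    """Return True when the rate stays below threshold for hold_seconds.

    samples: list of (timestamp_seconds, success_rate_percent), sorted by time.
    """
    below_since = None
    for ts, rate in samples:
        if rate < threshold:
            if below_since is None:
                below_since = ts  # start of the current breach
            if ts - below_since >= hold_seconds:
                return True
        else:
            below_since = None  # any healthy sample resets the timer
    return False

# Made-up samples, one every 30 seconds.
sustained = [(0, 99.9), (30, 97.0), (60, 96.5), (90, 98.0), (120, 97.9), (150, 98.2)]
brief_dip = [(0, 99.9), (30, 97.0), (60, 99.0), (90, 97.0), (120, 97.0)]
```

Resetting the timer on any healthy sample keeps a single short dip, such as one PSN being drained, from triggering a rollback.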
CLI Reference
ISE (any node)
show application status ise
show replication status
show logging application ise-radius.log tail
show logging application ise-psc.log tail
show cpu
show memory
show disk
Catalyst Switch
show authentication sessions interface Gi1/0/10 details
show radius statistics
test aaa group radius <user> <pass> legacy
WLC 9800
show radius summary
show client detail <mac>
show wlan <id>
F5 BIG-IP (if used)
tmsh show ltm pool
tmsh show ltm virtual
tmsh list ltm monitor radius
Citrix ADC (if used)
show lb vserver
show servicegrp
show ns runningconfig | grep -i persist
FAQs – Cisco ISE HA & Failover Testing
FAQ 1. What happens if the Primary PAN fails during business hours?
- Answer:
- The Secondary PAN takes over as the Primary Admin Node once it is promoted: manually by default, or automatically if PAN automatic failover is configured.
- All configuration changes must now be made on the Secondary.
- GUI Validation:
Administration → System → Deployment
→ Check Secondary shows Active.
- CLI Validation:
show application status ise
Confirm that every ISE process is running. The role itself is shown on the Deployment page.
FAQ 2. Do authentications stop if both PAN nodes fail?
- Answer:
- No. PANs are management-only nodes, not involved in live authentications.
- Authentications are handled by PSNs, so end-users won’t be impacted if PANs fail.
- Impact: You cannot push policy/config changes until PAN is restored.
FAQ 3. How does ISE decide which MnT node becomes active?
- Answer:
- ISE supports Primary MnT and Secondary MnT roles.
- If Primary MnT fails, Secondary automatically takes over logging/reporting.
- GUI Validation:
Operations → Reports → Authentications
→ Ensure logs still appear.
- CLI Validation:
show logging system tail
FAQ 4. What happens to live user sessions if a PSN fails?
- Answer:
- Active sessions remain authorized until re-authentication (e.g., reauth timer, port bounce).
- New authentication requests fail over to other PSNs.
- Switch CLI Validation:
show authentication sessions interface Gi1/0/10
Session remains Authorized.
FAQ 5. Can ISE provide Active/Active Admin Nodes?
- Answer:
- No. ISE supports one Active PAN and one Standby PAN (Active/Standby).
- Multi-admin concurrency is only for multiple admins connecting, not for both PANs managing at the same time.
FAQ 6. How do I test database replication between PANs?
- Answer:
- GUI Validation:
Administration → System → Deployment → Synchronization Status
- Check “Up to Date”.
- CLI Validation on Secondary PAN:
show application status ise
Confirm that every ISE process is running. The replication state itself is reported on the Deployment page, not in this output.
FAQ 7. Does load balancing apply to PSNs in HA?
- Answer:
- Yes. PSNs should be front-ended with Load Balancers (F5, Citrix ADC, DNS Round Robin).
- This ensures seamless failover and distribution of RADIUS/TACACS requests.
- Without LB, endpoints rely on RADIUS server lists configured on NADs (switches/WLC).
FAQ 8. How do I test TACACS+ failover in ISE?
- Answer:
- Configure multiple PSNs as TACACS servers on your device.
- Shut down one PSN:
application stop ise
- Attempt device login.
- Device CLI Validation:
show aaa servers
Device should fail over to a healthy PSN.
FAQ 9. What’s the difference between Node Failure and Network Failure testing?
- Answer:
- Node Failure: Stopping ISE services (application stop ise).
- Network Failure: Disconnecting/shutting VM NIC → Simulates loss of connectivity.
- Both should be tested to validate true resilience in production.
FAQ 10. How do I document HA testing for compliance audits?
- Answer:
- Capture:
- Screenshots of Deployment page before/during/after failover.
- CLI outputs (show application status ise, show authentication sessions).
- Authentication logs during PSN failover.
- Record each test in a Failover Test Matrix (Test, Expected, Actual, Result).
- Store reports for audit trails.
YouTube Link
For more in-depth Cisco ISE Mastery Training, subscribe to my YouTube channel Network Journey and join my instructor-led classes for hands-on, real-world ISE experience
Closing Notes (Key Takeaways)
- Build redundancy at every persona: PAN, MnT, PSN.
- Use LB VIPs for PSN tier with health + persistence.
- Practice repeatable failover runbooks; capture pass/fail evidence.
- Verify accounting integrity and CoA success—not just “auth OK”.
- Keep NTP/DNS/PKI/AD healthy; most “ISE HA” issues are dependencies.