[Day #45 Pyats Series] Multi-vendor VRRP/HSRP failover validation using pyATS for Cisco [Python for Network Engineer]

Introduction on the Key Points

Welcome back to Day 45 of our “101 Days of pyATS” series, a practical journey into vendor-agnostic automation using Cisco’s pyATS framework! Today, we’re diving deep into failover validation for two first-hop redundancy protocols (FHRPs), VRRP and HSRP, using pyATS and Genie parsers.

In real-world enterprise or campus network environments, FHRPs like VRRP (Virtual Router Redundancy Protocol) and HSRP (Hot Standby Router Protocol) are crucial for gateway availability and seamless user experience. However, manual validation of their states (active/standby/priority/timer consistency) across vendors and locations is both tedious and error-prone.

By the end of this article, you’ll know how to:

  • Automate failover testing across Cisco VRRP/HSRP devices
  • Validate roles (active/standby/virtual IP) using structured Genie output
  • Confirm consistency of configuration and failover behavior
  • Build a scalable, reusable automation workflow
  • Leverage this use case if you’re learning Python for Network Engineer roles in automation, ops, or SRE

Let’s simplify first-hop high availability testing!


Topology Overview

We’re using a multi-vendor, Cisco-centric topology with a focus on FHRP high availability:

  • R1 and R2 form a redundancy pair using HSRP or VRRP
  • Both routers share a virtual IP, serving as the default gateway
  • Clients are connected downstream through a switch (Cisco/Arista/Juniper)

You can easily adapt this topology for VRRP validation in a similar fashion, even across non-Cisco devices using customized pyATS parsers or REST API integrations.


Topology & Communications

VRRP Scenario:

  • R1: Priority 120
  • R2: Priority 100
  • Virtual IP: 192.168.1.1

HSRP Scenario:

  • R1: Active
  • R2: Standby
  • Group: 1
  • Virtual IP: 192.168.1.1

We want to validate:

  • Which device is active or standby?
  • What’s the priority value?
  • Who owns the virtual IP?
  • Are timers matching on both sides?

If the active router's interface is shut down manually, the backup device should take over and report the corresponding state change.
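
The consistency questions above can be sketched in plain Python once each router's state has been reduced to a small dictionary. This is a minimal sketch rather than part of the workflow script: the function name compare_peers and the flat record layout (state, priority, virtual_ip, hello, hold) are assumptions made for illustration, filled with the values from the scenario above.

```python
def compare_peers(a, b, must_match=("virtual_ip", "hello", "hold")):
    """Return the fields in must_match whose values differ between two peers."""
    return [field for field in must_match if a.get(field) != b.get(field)]


# Hypothetical flat records, one per router
r1 = {"state": "Active", "priority": 120, "virtual_ip": "192.168.1.1", "hello": 3, "hold": 10}
r2 = {"state": "Standby", "priority": 100, "virtual_ip": "192.168.1.1", "hello": 3, "hold": 10}

mismatches = compare_peers(r1, r2)
print(mismatches or "virtual IP and timers match")
```

Priority is deliberately left out of must_match, because the two peers are expected to differ there.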


Workflow Script (pyATS Job)

# File: validate_fhrp_states.py

from genie.testbed import load

testbed = load('testbed.yml')

devices = ['R1', 'R2']

for device_name in devices:
    device = testbed.devices[device_name]
    device.connect(log_stdout=False)

    print(f"\n--- Checking FHRP on {device_name} ---")

    try:
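        # 'show standby' is HSRP-only; VRRP needs a separate 'show vrrp' parse.
        # The key names used below should be checked against the parser schema
        # of your installed Genie release.
        output = device.parse('show standby')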

        for interface, fhrp_data in output['interfaces'].items():
            for group, details in fhrp_data['groups'].items():
                print(f"Interface: {interface} | Group: {group}")
                print(f"  State        : {details.get('state', 'N/A')}")
                print(f"  Priority     : {details.get('priority', 'N/A')}")
                print(f"  Virtual IP   : {details.get('virtual_ip_address', 'N/A')}")
                print(f"  Active Router: {details.get('active_router', 'N/A')}")
                print(f"  Standby Router: {details.get('standby_router', 'N/A')}")
                print(f"  Hello Timer  : {details.get('hello_time', 'N/A')}")
                print(f"  Hold Timer   : {details.get('hold_time', 'N/A')}")

    except Exception as e:
        print(f"{device_name}: Error parsing output: {e}")
    finally:
        # Always release the session, even if parsing failed
        device.disconnect()

Explanation by Line

  • testbed = load('testbed.yml') – Load device information
  • device.connect() – Initiate SSH/console connection
  • show standby – Genie parses the HSRP output into a dictionary (show standby covers HSRP only; VRRP state comes from show vrrp)
  • For each interface and group, print the key FHRP parameters:
    • Active/standby state
    • Virtual IP assigned
    • Priority value
    • Hello and hold timers

You can adapt this to VRRP, or to IOS-XR/NX-OS, by passing show vrrp to device.parse(), provided a Genie parser exists for that command on the target platform.
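
The nested interface/group loop in the script can also be factored into a small helper that returns flat records, which are easier to compare between R1 and R2. The sketch below is an assumption-laden illustration: flatten_fhrp is a hypothetical name, and the sample dictionary is hand-written to mirror the layout the script iterates over, not captured from a real parser run.

```python
def flatten_fhrp(parsed):
    """Yield one flat record per (interface, group) from a nested parser result."""
    for interface, intf_data in parsed.get("interfaces", {}).items():
        for group, details in intf_data.get("groups", {}).items():
            yield {"interface": interface, "group": group, **details}


# Hand-written sample mirroring the layout the script iterates over
sample = {
    "interfaces": {
        "GigabitEthernet0/0": {
            "groups": {
                1: {"state": "Active", "priority": 120, "virtual_ip_address": "192.168.1.1"}
            }
        }
    }
}

rows = list(flatten_fhrp(sample))
print(rows)
```

Using .get() with empty defaults means a device with no FHRP groups simply produces no rows instead of raising a KeyError.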


testbed.yml Example

testbed:
  name: fhrp-testbed
  credentials:
    default:
      username: cisco
      password: cisco

devices:
  R1:
    os: iosxe
    type: router
    connections:
      cli:
        protocol: ssh
        ip: 192.168.100.1

  R2:
    os: iosxe
    type: router
    connections:
      cli:
        protocol: ssh
        ip: 192.168.100.2

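Run the script with python validate_fhrp_states.py from the directory that contains testbed.yml; the script loads the testbed itself. To launch it with pyats run job validate_fhrp_states.py --testbed-file testbed.yml instead, it first needs to be wrapped as a job file that exposes a main(runtime) function.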


Post-validation CLI Screenshots (Expected Output)

On R1 (Active):

R1# show standby
GigabitEthernet0/0 - Group 1
  State is Active
    Virtual IP address is 192.168.1.1
    Active virtual MAC address is 0000.0c9f.f001
    Local virtual MAC address is 0000.0c9f.f001 (default)
    Hello time 3 sec, hold time 10 sec
    Priority 120 (configured 120)
    Active router is local
    Standby router is 192.168.100.2

On R2 (Standby):

R2# show standby
GigabitEthernet0/0 - Group 1
  State is Standby
    Virtual IP address is 192.168.1.1
    Hello time 3 sec, hold time 10 sec
    Priority 100 (configured 100)
    Active router is 192.168.100.1
    Standby router is local

Cross-check this with Genie-parsed output in your Python script.
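
One way to do that cross-check without relying on any parser is to pull the same fields out of the raw CLI text with regular expressions and compare them with the structured values. The sketch below is only an illustration: the patterns are written against the sample R1 output shown above, and extract_fields is a hypothetical helper, not a Genie API.

```python
import re

RAW_R1 = """\
GigabitEthernet0/0 - Group 1
  State is Active
    Virtual IP address is 192.168.1.1
    Hello time 3 sec, hold time 10 sec
    Priority 120 (configured 120)
"""

PATTERNS = {
    "state": re.compile(r"State is (\w+)"),
    "virtual_ip": re.compile(r"Virtual IP address is (\S+)"),
    "hello": re.compile(r"Hello time (\d+) sec"),
    "hold": re.compile(r"hold time (\d+) sec"),
    "priority": re.compile(r"Priority (\d+)"),
}


def extract_fields(raw):
    """Pull the key HSRP fields out of raw 'show standby' text."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(raw)
        if match:
            value = match.group(1)
            # Numeric captures become ints so they compare cleanly with parsed data
            fields[name] = int(value) if value.isdigit() else value
    return fields


cli_view = extract_fields(RAW_R1)
print(cli_view)
```

Each pattern only looks at the first match, so this sketch suits single-group output; multiple groups would need the text split per group first.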


FAQs

1. What is the purpose of validating VRRP/HSRP failover across multi-vendor environments?

In enterprise networks, VRRP (Virtual Router Redundancy Protocol) and HSRP (Hot Standby Router Protocol) ensure default gateway redundancy. However, in multi-vendor setups (Cisco, Arista, Juniper, FortiGate, etc.), mismatches in configuration or protocol behavior can:

  • Cause failover delays
  • Lead to traffic blackholing
  • Trigger flapping gateways

Using pyATS, we automate the detection of state inconsistencies and validate if failover occurs smoothly and predictably across different vendor devices.


2. Which key parameters should be validated in VRRP/HSRP failover using pyATS?

For VRRP (RFC-compliant and vendor-neutral) and HSRP (Cisco proprietary), the following must be checked:

  • Current role/state: Master (VRRP) / Active (HSRP)
  • Priority value: Should align with failover design
  • Virtual IP binding: Must match across peers
  • Preempt setting: Should be consistent
  • Failover time: Measured with timestamps and compared against the configured hello/hold timers

pyATS test cases should assert role transitions, validate correct priority-based leader election, and ensure the virtual IP is only active on one device at a time.
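
The priority-based election can be expressed as a small expectation function that a test then compares with the observed state. This is a sketch under stated assumptions: it models the steady state with preemption enabled, where the highest priority wins and a tie goes to the higher interface IP address, and the names expected_active and peers are hypothetical.

```python
from ipaddress import ip_address


def expected_active(peers):
    """Return the peer that should win: highest priority, then highest IP address."""
    return max(
        peers,
        key=lambda name: (peers[name]["priority"], ip_address(peers[name]["ip"])),
    )


peers = {
    "R1": {"priority": 120, "ip": "192.168.100.1"},
    "R2": {"priority": 100, "ip": "192.168.100.2"},
}
print(expected_active(peers))
```

Without preemption the current active router keeps its role even if a higher-priority peer returns, so in that design the expectation should be derived from the failover history rather than from priority alone.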


3. How can pyATS differentiate between VRRP and HSRP on Cisco and non-Cisco devices?

With Genie parsers, Cisco devices expose show standby (HSRP) and show vrrp outputs in structured format.

For non-Cisco platforms:

  • Use custom parsers or CLI output capture via Unicon (e.g., show vrrp, get system ha status, etc.)
  • Normalize the data into a common dictionary format

This allows you to create vendor-agnostic validation logic inside pyATS that abstracts away protocol differences while validating state transitions.
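
A common dictionary format could look like the following. Treat it as one possible shape rather than a pyATS convention: the field names, ROLE_MAP, and normalize are all assumptions introduced for this example.

```python
# Vendor wording mapped onto two common roles
ROLE_MAP = {
    "active": "active",
    "master": "active",
    "standby": "standby",
    "backup": "standby",
}


def normalize(vendor, protocol, raw_state, priority, virtual_ip):
    """Map vendor-specific state wording onto one common record."""
    role = ROLE_MAP.get(raw_state.strip().lower(), "unknown")
    return {
        "vendor": vendor,
        "protocol": protocol,
        "role": role,
        "priority": int(priority),
        "virtual_ip": virtual_ip,
    }


print(normalize("cisco", "hsrp", "Active", 120, "192.168.1.1"))
print(normalize("arista", "vrrp", "Backup", "100", "192.168.1.1"))
```

Transitional states such as Init or Listen fall through to "unknown", which a test can treat as "not converged yet" rather than as a failure.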


4. Can pyATS trigger an interface shutdown to simulate failover scenarios?

Yes! With pyATS and Unicon you can push a shutdown to an interface on the active router to trigger a failover:

device.configure("interface Gi0/1\nshutdown")

Then wait and verify:

  • The standby router becomes active/master
  • Ping success via virtual IP resumes after failover
  • Failover completes within the expected window (roughly the hold time: 10 seconds with the 3/10-second timers shown earlier, or 3-5 seconds with tuned timers)

This simulation allows you to validate high availability behavior proactively during change windows or pre-production testing.
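
The "wait and verify" step is usually a polling loop rather than a fixed sleep. Below is a minimal sketch of such a loop; wait_for_state is a hypothetical helper, and the get_state callable stands in for whatever function re-parses the peer (a lambda over canned values is used here so the example runs without a device).

```python
import time


def wait_for_state(get_state, expected, timeout=30.0, interval=1.0):
    """Poll get_state() until it returns expected or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if get_state() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for a function that re-parses the peer: Active on the third poll
_polls = iter(["Standby", "Standby", "Active"])
took_over = wait_for_state(lambda: next(_polls), "Active", timeout=5, interval=0.01)
print(took_over)
```

Passing the state getter in as a callable keeps the loop independent of the parser, so the same helper works for HSRP and VRRP checks.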


5. How do you verify if the virtual IP remains active on a single gateway device?

From pyATS scripts, after collecting VRRP/HSRP state from both peers:

  • Extract the current active/master state
  • Cross-check which device is holding the virtual MAC (via ARP or interface MAC inspection)
  • Ensure no split-brain scenario (both claiming master/active)

Sample logic:

if device1_state == 'Active' and device2_state == 'Active':
    self.failed("Both devices are active – split brain detected")
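
The same idea extends to any number of peers, and to the opposite failure where nobody is active. The helper below is a sketch with a hypothetical name (classify_roles); it takes a plain {device: state} mapping and treats "active" and "master" as equivalent, which is an assumption about how states have been collected.

```python
def classify_roles(states):
    """Classify a {device: state} mapping as 'ok', 'split-brain' or 'no-active'."""
    active = [
        name for name, state in states.items()
        if state.lower() in ("active", "master")
    ]
    if len(active) > 1:
        return "split-brain", active
    if not active:
        return "no-active", active
    return "ok", active


print(classify_roles({"R1": "Active", "R2": "Standby"}))
print(classify_roles({"R1": "Active", "R2": "Active"}))
```

Inside an aetest test case, the returned verdict can drive self.passed() or self.failed() with the list of offending devices in the message.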

6. What CLI commands are parsed for failover validation across vendors?

  • Cisco HSRP:
    show standby brief, show standby
  • Cisco VRRP:
    show vrrp, show vrrp brief
  • Arista EOS:
    show vrrp, show vrrp interfaces
  • Juniper:
    show vrrp, show interfaces vrrp
  • FortiGate (in HA mode):
    get system ha status, diagnose sys ha status

These outputs are parsed into test logic that checks roles, priorities, preempt flags, and tracks failover behavior.
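
In a script, this list typically becomes a lookup table so the collection step can stay generic. The sketch below takes one command per entry from the list above; the platform labels used as keys and the command_for helper are illustrative assumptions, not pyATS identifiers.

```python
# (platform, protocol) -> show command; the platform keys are illustrative labels
FHRP_COMMANDS = {
    ("cisco", "hsrp"): "show standby",
    ("cisco", "vrrp"): "show vrrp",
    ("arista", "vrrp"): "show vrrp",
    ("juniper", "vrrp"): "show vrrp",
    ("fortigate", "ha"): "get system ha status",
}


def command_for(platform, protocol):
    """Return the show command to collect, or raise if the pair is not mapped."""
    try:
        return FHRP_COMMANDS[(platform.lower(), protocol.lower())]
    except KeyError:
        raise ValueError(f"no FHRP command mapped for {platform}/{protocol}") from None


print(command_for("Cisco", "HSRP"))
```

Raising on an unmapped pair makes a missing vendor obvious at collection time instead of silently skipping the device.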


7. How can I measure and validate failover duration using pyATS?

You can insert timestamp checkpoints in your test script:

import time

start_time = time.time()
# Trigger failover (e.g., shut interface)
# Wait for peer to become active
end_time = time.time()

failover_duration = end_time - start_time
# self.passed() is available inside a pyATS aetest test case method
self.passed(f"Failover occurred in {failover_duration:.1f} seconds")

Set thresholds (e.g., failover_duration <= 5) to validate against SLA or HA design requirements. This quantifiable metric helps in validating failover efficiency and readiness.
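
The timestamp and threshold steps can be combined into one helper so the trigger sits inside the timed window. This is a sketch under assumptions: timed_failover, trigger, and peer_is_active are hypothetical names, the callables stand in for the real configure and parse steps, and time.monotonic() is used here instead of time.time() so a wall-clock adjustment cannot skew the measurement.

```python
import time


def timed_failover(trigger, peer_is_active, max_seconds, poll_interval=0.5):
    """Run trigger(), poll peer_is_active(), return (duration, within_threshold)."""
    start = time.monotonic()
    trigger()
    while not peer_is_active():
        if time.monotonic() - start > max_seconds:
            break  # give up; the duration below will exceed the threshold
        time.sleep(poll_interval)
    duration = time.monotonic() - start
    return duration, duration <= max_seconds


# Stand-ins: no real trigger, and the "peer" turns active on the third poll
_answers = iter([False, False, True])
duration, ok = timed_failover(
    lambda: None, lambda: next(_answers), max_seconds=5, poll_interval=0.01
)
print(f"{duration:.2f}s, within threshold: {ok}")
```

The measured duration includes the polling interval and the time each state check takes, so the poll interval should be small relative to the threshold being validated.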


YouTube Link

Watch the Complete Python for Network Engineer: Multi-vendor VRRP/HSRP failover validation using pyATS for Cisco [Python for Network Engineer] Lab Demo & Explanation on our channel:

Master Python Network Automation, Ansible, REST API & Cisco DevNet

Join Our Training

Liked how we automated VRRP/HSRP failover checks using pyATS?

This is just a glimpse of what you’ll learn in our hands-on, instructor-led training for Network Automation.

Trainer Sagar Dhawan is conducting a 3-month practical course covering:

  • Python for Network Engineer automation
  • Cisco DevNet + Ansible
  • pyATS + Genie for test automation
  • REST APIs, NETCONF, YANG modeling
  • CI/CD for network validation

Check Complete Course Outline Here

Whether you’re a beginner or a working professional looking to upskill, this program is designed to fast-track your DevNet journey.

Seats are filling fast. Make your move towards Python for Network Engineer success today!

Enroll Now & Future‑Proof Your Career
Email: info@networkjourney.com
WhatsApp / Call: +91 97395 21088