[Day #64 Pyats Series] Automating rollback validation after config failure using pyATS for Cisco [Python for Network Engineer]

Introduction on the Key Points

In large-scale enterprise networks, configuration changes are a daily reality—new VLANs, updated routing policies, modified QoS settings, and more. But sometimes, a change goes wrong: a wrong ACL blocks critical traffic, an incorrect BGP setting causes route loss, or a misconfigured interface breaks connectivity.

The good news? Most networks have rollback procedures — either via configure replace in Cisco IOS-XE, rollback in NX-OS, or similar vendor-specific commands. The challenge is validating that the rollback was successful.

In Day #64 of our 101 Days of pyATS (Vendor-Agnostic) series, we’ll automate rollback validation using pyATS.

You’ll learn how to:

  • Capture a pre-change baseline.
  • Roll back a failed change.
  • Automatically validate the post-rollback state against the baseline.
  • Produce a clean pass/fail report for compliance.

If you’re a Python for Network Engineer learner, mastering rollback validation automation will save you hours of troubleshooting and give your NOC and Change Management team peace of mind.


Topology Overview

Our example topology:

  • CSR1 – Core Router
  • CSR2 – Distribution Router
  • CSR3 – Edge Router

All three routers are reachable over a management network. The rollback scenario is simulated on CSR2.


Topology & Communications

  • Protocol for Access: SSH via pyATS
  • Pre-Change Baseline: Stored in baseline/ directory
  • Rollback Mechanism: Cisco IOS-XE configure replace flash:rollback_config
  • Validation Criteria:
    • Running config matches baseline.
    • Interfaces are up/up.
    • Routing protocols are in expected state.
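
As a rough illustration of the "interfaces are up/up" criterion, the sketch below checks a parsed interface table. It assumes a dictionary shaped like what Genie's show ip interface brief parser is expected to return (interface, then name, then status and protocol). The function name interfaces_not_up and the sample data are made up for this example.

```python
def interfaces_not_up(parsed, ignore=()):
    """Return the names of interfaces whose status or protocol is not 'up'."""
    failed = []
    for name, info in parsed.get("interface", {}).items():
        if name in ignore:
            # Interfaces that were already down in the baseline can be skipped
            continue
        if info.get("status") != "up" or info.get("protocol") != "up":
            failed.append(name)
    return sorted(failed)


# Hypothetical sample data, not captured from a real device
sample = {
    "interface": {
        "GigabitEthernet1": {"ip_address": "192.168.1.12", "status": "up", "protocol": "up"},
        "GigabitEthernet2": {"ip_address": "10.0.0.1", "status": "up", "protocol": "down"},
        "GigabitEthernet3": {"ip_address": "unassigned", "status": "administratively down", "protocol": "down"},
    }
}

print(interfaces_not_up(sample, ignore=("GigabitEthernet3",)))  # ['GigabitEthernet2']
```

On a live device the dictionary would typically come from device.parse("show ip interface brief") instead of the hard-coded sample.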

Workflow:

  1. Take pre-change baseline configs (and key operational outputs).
  2. Apply a simulated bad change.
  3. Execute rollback using CLI.
  4. Collect new configs and outputs.
  5. Compare with baseline using pyATS diff.

Workflow Script

Here’s a Python + pyATS script that captures the post-rollback state and compares it with the baseline stored in baseline/.

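#!/usr/bin/env python3
"""
Automating Rollback Validation after Config Failure
Author: Trainer Sagar Dhawan
"""

import os
import sys
from collections import Counter

from genie.testbed import load
from genie.utils.diff import Diff

TESTBED = "testbed.yml"
BASELINE_DIR = "baseline"
POST_ROLLBACK_DIR = "post_rollback"

COMMANDS = [
    "show running-config",
    "show ip interface brief",
    "show ip route summary"
]

# Lines in "show running-config" that change on every run and would
# otherwise always be reported as differences.
VOLATILE_PREFIXES = (
    "Building configuration",
    "Current configuration",
    "! Last configuration change",
    "! NVRAM config last updated",
)

def capture_outputs(output_dir):
    """Run COMMANDS on every testbed device and save each output to a file."""
    os.makedirs(output_dir, exist_ok=True)
    testbed = load(TESTBED)

    for device in testbed.devices.values():
        print(f"[INFO] Connecting to {device.name}")
        device.connect(log_stdout=False)
        try:
            for cmd in COMMANDS:
                output = device.execute(cmd)
                filename = f"{device.name}_{cmd.replace(' ', '_')}.txt"
                filepath = os.path.join(output_dir, filename)
                with open(filepath, "w") as f:
                    f.write(output)
        finally:
            # Always release the session, even if a command fails
            device.disconnect()
        print(f"[INFO] Collected outputs for {device.name}")

def load_lines(path):
    """Return {line: occurrence count}, skipping blank and volatile lines."""
    with open(path) as f:
        lines = [line.rstrip() for line in f.read().splitlines()]
    return dict(Counter(
        line for line in lines
        if line and not line.startswith(VOLATILE_PREFIXES)
    ))

def compare_outputs():
    """Diff each baseline file against its post-rollback copy; True if all match."""
    all_match = True
    for base_file in sorted(os.listdir(BASELINE_DIR)):
        base_path = os.path.join(BASELINE_DIR, base_file)
        post_path = os.path.join(POST_ROLLBACK_DIR, base_file)

        print(f"\n[DIFF] Comparing {base_file}...")
        if not os.path.exists(post_path):
            print("[WARN] No post-rollback file to compare against")
            all_match = False
            continue

        # Diff is documented for dictionaries, so compare line -> count maps
        diff = Diff(load_lines(base_path), load_lines(post_path))
        diff.findDiff()
        if str(diff):
            print(diff)
            all_match = False
        else:
            print("No Differences Found")
    return all_match

if __name__ == "__main__":
    print("[STEP 1] Capturing Post-Rollback Outputs...")
    capture_outputs(POST_ROLLBACK_DIR)

    print("[STEP 2] Comparing with Baseline...")
    ok = compare_outputs()
    print("\nResult: " + ("PASS" if ok else "FAIL"))
    # A non-zero exit code lets schedulers and pipelines detect a failed validation
    sys.exit(0 if ok else 1)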

Explanation by Line

  • TESTBED – Points to our YAML file describing devices.
  • BASELINE_DIR & POST_ROLLBACK_DIR – Folders where we store snapshots before and after rollback.
  • COMMANDS – The key operational commands to check state.
  • capture_outputs() – Connects to each device, runs the commands, and saves results.
  • compare_outputs() – Uses genie.utils.diff to compare baseline and post-rollback outputs.
  • Main – Captures post-rollback outputs and runs the comparison.

testbed.yml Example

testbed:
  name: rollback_validation_lab
  credentials:
    default:
      username: admin
      password: cisco123
devices:
  CSR1:
    os: iosxe
    type: router
    connections:
      cli:
        protocol: ssh
        ip: 192.168.1.11
  CSR2:
    os: iosxe
    type: router
    connections:
      cli:
        protocol: ssh
        ip: 192.168.1.12
  CSR3:
    os: iosxe
    type: router
    connections:
      cli:
        protocol: ssh
        ip: 192.168.1.13

Post-validation CLI (Real Expected Output)

Baseline Snapshot

$ python3 capture_baseline.py
[INFO] Connecting to CSR1
[INFO] Collected outputs for CSR1
...

Rollback Command Execution

CSR2# configure replace flash:rollback_config force
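
If you prefer to trigger the rollback from Python rather than typing it on the console, a minimal sketch could look like the one below. It assumes the device object exposes execute() the way a connected pyATS device does, and that IOS-XE confirms a successful replace with the text "Rollback Done". The names run_rollback and FakeDevice are hypothetical; FakeDevice only exists so the example runs without a lab.

```python
def run_rollback(device, config_path="flash:rollback_config"):
    """Run 'configure replace' and report whether the device confirmed it."""
    output = device.execute(f"configure replace {config_path} force")
    # Assumption: a successful replace ends with the text "Rollback Done"
    return "Rollback Done" in output, output


class FakeDevice:
    """Stand-in for a connected pyATS device so the sketch runs offline."""

    def __init__(self, reply):
        self.reply = reply
        self.commands = []

    def execute(self, command):
        self.commands.append(command)
        return self.reply


device = FakeDevice("Total number of passes: 1\nRollback Done")
ok, _ = run_rollback(device)
print(ok, device.commands)
```

Checking the command output first means the diff step only runs when the device itself reports that the replace completed.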

Post-Rollback Validation Output

[STEP 2] Comparing with Baseline...

[DIFF] Comparing CSR2_show_running-config.txt...
No Differences Found

[DIFF] Comparing CSR2_show_ip_interface_brief.txt...
No Differences Found

Result: Rollback successful — configuration and operational state match baseline.


FAQs

1. What is rollback validation in networking?
Rollback validation is the process of confirming that a network device has successfully returned to its last known good configuration after a failed change. It’s not enough to just roll back—you need to verify:

  • Interfaces are up and running
  • Routing tables are correct
  • Security policies are intact
  • Services are reachable

2. Why should rollback validation be automated?
Manual rollback checks can be slow and error-prone—especially during outages. Automation with pyATS ensures:

  • Consistent validation across all devices
  • Faster recovery times
  • Reduced dependency on human memory
  • Early detection if rollback didn’t actually fix the problem

3. How can pyATS help in rollback validation?
pyATS can:

  1. Take pre-change snapshots (interfaces, routes, ACLs, etc.).
  2. Apply a configuration change.
  3. If failure occurs, trigger rollback to a saved config.
  4. Compare post-rollback state with the pre-change snapshot using genie diff.

This ensures that the device is truly back to its original state.
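
To make the snapshot comparison concrete, here is a plain-Python stand-in for a structured diff. It is not Genie's implementation; dict_diff, the key names, and the sample values are invented for illustration. The exclude parameter mirrors the idea of ignoring counters that always change, which Genie's Diff also supports through its own exclude argument.

```python
def dict_diff(before, after, exclude=(), path=""):
    """Recursively compare two nested dicts; return a list of readable changes."""
    changes = []
    for key in sorted(set(before) | set(after), key=str):
        if key in exclude:
            continue
        here = f"{path}/{key}" if path else str(key)
        if key not in after:
            changes.append(f"- {here}: {before[key]!r}")
        elif key not in before:
            changes.append(f"+ {here}: {after[key]!r}")
        elif isinstance(before[key], dict) and isinstance(after[key], dict):
            changes.extend(dict_diff(before[key], after[key], exclude, here))
        elif before[key] != after[key]:
            changes.append(f"~ {here}: {before[key]!r} -> {after[key]!r}")
    return changes


# Hypothetical parsed snapshots taken before the change and after the rollback
baseline = {"GigabitEthernet2": {"oper_status": "up", "mtu": 1500, "in_pkts": 1200}}
post = {"GigabitEthernet2": {"oper_status": "down", "mtu": 1500, "in_pkts": 1375}}

for line in dict_diff(baseline, post, exclude=("in_pkts",)):
    print(line)  # ~ GigabitEthernet2/oper_status: 'up' -> 'down'
```

An empty list means the two snapshots match on everything you chose to compare.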

4. Can rollback validation be vendor-agnostic?
Largely, yes. pyATS works across vendors like Cisco, Arista, Juniper, Palo Alto, Fortinet, etc., as long as parsers or CLI patterns are defined. This means you can reuse the same validation logic (snapshot, diff, report) in a multi-vendor environment, while the rollback command itself remains vendor-specific.


5. What typical issues can automation catch after rollback?
Automated rollback validation can detect:

  • Missing routes in the RIB
  • Down or err-disabled interfaces
  • ACL or firewall rule mismatches
  • Incomplete BGP or OSPF adjacency restoration
  • Incorrect VLAN assignments

Catching these early prevents further downtime.
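
The "missing routes" check, for example, reduces to a set comparison once you have the prefixes from both snapshots. The sketch below assumes you have already extracted the prefixes into lists; route_changes and the sample prefixes are illustrative only.

```python
def route_changes(baseline_routes, post_routes):
    """Return (missing, unexpected) prefixes relative to the baseline."""
    before, after = set(baseline_routes), set(post_routes)
    return sorted(before - after), sorted(after - before)


# Hypothetical prefixes, not taken from a real routing table
baseline_routes = ["10.0.0.0/24", "10.0.1.0/24", "172.16.0.0/16"]
post_routes = ["10.0.0.0/24", "172.16.0.0/16", "192.168.50.0/24"]

missing, unexpected = route_changes(baseline_routes, post_routes)
print("Missing after rollback:", missing)
print("Not in baseline:", unexpected)
```

Reporting unexpected prefixes as well helps spot leftovers from the failed change that the rollback did not remove.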

6. What’s the ideal rollback validation workflow?

  1. Baseline Capture – Take pre-change snapshots using pyATS.
  2. Config Deployment – Push changes.
  3. Health Check – Run validation tests.
  4. Rollback (if needed) – Restore last good config.
  5. Post-Rollback Validation – Compare with baseline snapshot.
  6. Report – Generate a pass/fail report for documentation.
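
For the final reporting step, one possible shape is a small JSON document that lists each check and an overall verdict. This is only a sketch: build_report, the field names, and the check names are assumptions chosen for the example, not a pyATS report format.

```python
import json
from datetime import datetime, timezone


def build_report(device, checks):
    """Summarise named boolean checks into a pass/fail report dictionary."""
    failed = sorted(name for name, passed in checks.items() if not passed)
    return {
        "device": device,
        "generated_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "result": "PASS" if not failed else "FAIL",
        "failed_checks": failed,
        "checks": checks,
    }


# Hypothetical check results for one device
report = build_report(
    "CSR2",
    {"running_config": True, "interfaces": True, "routing": False},
)
print(json.dumps(report, indent=2))
```

Writing the same dictionary to a file with json.dump gives you an artifact you can attach to the change record.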

7. How can rollback validation integrate with change management?
Automation can push rollback reports directly into ITSM tools like ServiceNow or ticketing systems. This creates a paper trail proving that:

  • The change failed
  • Rollback was performed
  • Network returned to baseline

This satisfies compliance and audit requirements.

8. How do you notify teams automatically after rollback validation?
You can integrate pyATS scripts with:

  • Email alerts (SMTP)
  • Slack/Microsoft Teams bots
  • CI/CD pipeline status updates (Jenkins/GitLab)

This ensures everyone is informed in real-time without logging into devices manually.
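
As one way to wire up the email option, the sketch below builds the alert with Python's standard email and smtplib modules. The addresses, the relay host and port, and the function names build_alert and send_alert are placeholders; adapt them to your own mail setup.

```python
import smtplib
from email.message import EmailMessage


def build_alert(device, passed, sender, recipient):
    """Create the notification email for one rollback validation run."""
    status = "PASS" if passed else "FAIL"
    msg = EmailMessage()
    msg["Subject"] = f"[{status}] Rollback validation on {device}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"Rollback validation for {device} finished with result: {status}."
    )
    return msg


def send_alert(msg, host="localhost", port=25):
    """Hand the message to an SMTP relay (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)


# Placeholder addresses; nothing is sent until send_alert() is called
alert = build_alert("CSR2", False, "pyats@example.com", "noc@example.com")
print(alert["Subject"])  # [FAIL] Rollback validation on CSR2
```

Keeping message construction separate from sending makes the alert easy to test without a mail server.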

YouTube Link

Watch the Complete Python for Network Engineer: Automating rollback validation after config failure using pyATS for Cisco [Python for Network Engineer] Lab Demo & Explanation on our channel:

Master Python Network Automation, Ansible, REST API & Cisco DevNet

Join Our Training

You just saw how to automate rollback validation using pyATS — a task that would take a human several minutes (or hours) to check manually, completed in seconds with automation.

Imagine learning not just this, but 100+ other real-world automation workflows — from network health dashboards to multi-vendor compliance checks — all in a structured, instructor-led program.

Trainer Sagar Dhawan is running a 3-month hands-on course: Python, Ansible, APIs & Cisco DevNet for Network Engineers.

In this course, you’ll learn:

  • pyATS from zero to advanced.
  • Building automation pipelines for multi-vendor networks.
  • GitOps workflows for change control and validation.
  • API integrations with RESTCONF, NETCONF, and more.

Reserve your seat now: Course Details & Registration

The best time to become a Python for Network Engineer is today — and we’ll guide you every step of the way.

Enroll Now & Future‑Proof Your Career
Email: info@networkjourney.com
WhatsApp / Call: +91 97395 21088