[Day #79 PyATS Series] Validate ACI Fabric Health Score Using pyATS for Cisco [Python for Network Engineer]


Introduction: Key Points

Cisco ACI (Application Centric Infrastructure) is widely adopted in modern data centers for scalable, policy-driven network automation. One of the critical metrics in ACI environments is the Fabric Health Score, which provides a snapshot of the overall operational health of the ACI fabric, including controllers, leaf/spine switches, and endpoints.

In this article, we automate fetching and validating the ACI Fabric Health Score with pyATS and make the check part of a structured validation workflow for network engineers. As part of your professional development in Python for Network Engineer, automating this health check removes manual overhead and gives timely visibility into the state of the fabric.

Key learning objectives:

  • Retrieve fabric health score using ACI REST API.
  • Validate against predefined thresholds.
  • Capture GUI snapshots for visual evidence.
  • Generate structured reports and actionable insights.

This approach enables engineers to perform consistent health audits and automate troubleshooting procedures.


Topology Overview

  • Cisco APIC (Application Policy Infrastructure Controller): Centralized controller managing ACI fabric configurations and health monitoring.
  • Leaf and Spine Switches: Build the ACI fabric backbone.
  • Automation Server: Runs pyATS scripts to extract the health score and other fabric state metrics.

Objective:

  • Automatically fetch fabric health score.
  • Validate individual node health and overall score thresholds.
  • Provide a snapshot of fabric health over time.
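
The third objective, a view of fabric health over time, can be approximated by appending each reading to a history file. The sketch below is one possible way to do that, assuming a JSON Lines file on the automation server. The helper names `record_snapshot` and `load_history` and the file layout are illustrative assumptions, not part of pyATS or the APIC API.

```python
import json
import time
from pathlib import Path


def record_snapshot(path, score, timestamp=None):
    """Append one health reading to a JSON Lines history file (hypothetical helper)."""
    entry = {
        # Use the caller's timestamp if given, otherwise the current epoch time.
        "timestamp": time.time() if timestamp is None else timestamp,
        # APIC returns scores as strings, so normalize to int.
        "score": int(score),
    }
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry


def load_history(path):
    """Return all recorded readings, oldest first."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Calling `record_snapshot` at the end of each scheduled run builds up a file that `load_history` can read back for trend comparisons.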

Topology & Communications

Communication Flow:

  1. Automation server connects to the APIC using the REST API over HTTPS.
  2. Fetches health scores using the /api/node/class/fabricHealthTotal.json endpoint.
  3. Optionally connects to the APIC GUI over HTTPS to capture screenshots of the health dashboard.
  4. Parses JSON responses to structured Python objects.
  5. Validates that health score and individual node health status are above thresholds.
  6. Generates detailed reports.
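
Steps 1 and 2 of the flow can be sketched as two small helpers that only build the requests, so they can be checked without a live APIC. APIC's REST API normally issues a session token through a POST to `/api/aaaLogin.json`. The helper names `login_request` and `health_query_url` are hypothetical and exist only for this illustration.

```python
APIC_HEALTH_CLASS = "fabricHealthTotal"


def login_request(base_url, username, password):
    """Return (url, json_body) for an APIC aaaLogin POST (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/api/aaaLogin.json"
    body = {"aaaUser": {"attributes": {"name": username, "pwd": password}}}
    return url, body


def health_query_url(base_url):
    """Return the class-query URL for the fabric health objects."""
    return f"{base_url.rstrip('/')}/api/node/class/{APIC_HEALTH_CLASS}.json"


if __name__ == "__main__":
    url, body = login_request("https://10.1.1.100", "admin", "Cisco123")
    print(url)
    print(health_query_url("https://10.1.1.100"))
```

With the `requests` library, the pair would typically be used as `session.post(url, json=body)` followed by `session.get(health_query_url(base_url))`; the session object keeps the cookie returned by the login and sends it with the later query.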

Workflow Script

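import requests
import urllib3
from pyats.aetest import Testcase, test, main

# Lab only: certificate checks are disabled below, so silence the resulting warnings.
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

APIC_URL = 'https://10.1.1.100'
USERNAME = 'admin'
PASSWORD = 'Cisco123'  # Lab credentials; avoid hardcoding passwords outside a lab
HEALTH_THRESHOLD = 90  # Minimum acceptable health score

class ACIFabricHealthCheck(Testcase):

    @test
    def login_to_apic(self):
        # APIC's REST API authenticates through aaaLogin and a session cookie,
        # not HTTP basic auth, so post the credentials and let the session keep the token.
        self.session = requests.Session()
        self.session.verify = False  # Lab only: skips TLS certificate verification
        payload = {'aaaUser': {'attributes': {'name': USERNAME, 'pwd': PASSWORD}}}
        response = self.session.post(f"{APIC_URL}/api/aaaLogin.json", json=payload, timeout=30)
        assert response.status_code == 200, "Failed to log in to APIC"
        print("[INFO] Successfully established session to APIC.")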

    @test
    def get_fabric_health_score(self):
        url = f"{APIC_URL}/api/node/class/fabricHealthTotal.json"
        response = self.session.get(url, timeout=30)  # Fail fast if the APIC is unreachable
        assert response.status_code == 200, "Failed to fetch health data from APIC"
        self.health_data = response.json()

    @test
    def validate_overall_health_score(self):
        overall_health = int(self.health_data['imdata'][0]['fabricHealthTotal']['attributes']['cur'])
        print(f"[INFO] Current ACI Fabric Health Score: {overall_health}")
        assert overall_health >= HEALTH_THRESHOLD, \
            f"FAIL: ACI Fabric Health Score below threshold! Score: {overall_health}"
        print("PASS: ACI Fabric Health Score is within acceptable range.")

    @test
    def validate_each_node_health(self):
        nodes = self.health_data['imdata'][0]['fabricHealthTotal']['children']
        unhealthy_nodes = []

        for node in nodes:
            node_id = node['fabricHealthTotal']['attributes']['dn']
            node_health = int(node['fabricHealthTotal']['attributes']['cur'])
            print(f"[INFO] Node {node_id} health: {node_health}")

            if node_health < HEALTH_THRESHOLD:
                unhealthy_nodes.append((node_id, node_health))

        assert not unhealthy_nodes, f"Unhealthy nodes detected: {unhealthy_nodes}"
        print("PASS: All fabric nodes have health score above threshold.")

    @test
    def capture_dashboard_snapshot(self):
        # Optional: Use Selenium or other HTTP screenshot tools here
        snapshot_path = "/tmp/aci_dashboard_snapshot.png"
        # Example command: selenium_automation.capture_screenshot(APIC_URL + "/ui/#/fabric/health", snapshot_path)
        print(f"[INFO] Dashboard snapshot saved to {snapshot_path}")

if __name__ == '__main__':
    main()

Explanation by Line

  • login_to_apic():
    • Uses requests library to create an authenticated HTTPS session with APIC.
    • Disables SSL verification for lab environments.
  • get_fabric_health_score():
    • Queries the ACI REST API endpoint to retrieve the current fabric health snapshot as JSON.
  • validate_overall_health_score():
    • Extracts the cur attribute representing the current health score.
    • Asserts the health score is at or above the threshold (90).
  • validate_each_node_health():
    • Iterates through child health scores for individual nodes.
    • Tracks any node health scores below threshold and asserts zero unhealthy nodes.
  • capture_dashboard_snapshot():
    • Placeholder for GUI automation to capture a dashboard image snapshot for reporting.

testbed.yml Example

testbed:
  name: aci_fabric_health_testbed
  credentials:
    default:
      username: admin
      password: Cisco123

devices:
  apic:
    os: aci
    type: controller
    connections:
      https:
        protocol: https
        ip: 10.1.1.100

Post-validation Output (Expected)

API Output Snapshot of Fabric Health JSON

{
  "imdata": [
    {
      "fabricHealthTotal": {
        "attributes": {
          "cur": "95",
          "max": "100",
          "min": "0"
        },
        "children": [
          {
            "fabricHealthTotal": {
              "attributes": {
                "dn": "topology/pod-1/node-101",
                "cur": "97"
              }
            }
          },
          {
            "fabricHealthTotal": {
              "attributes": {
                "dn": "topology/pod-1/node-102",
                "cur": "92"
              }
            }
          }
        ]
      }
    }
  ]
}
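
To illustrate how a payload of this shape maps to Python objects, the sketch below parses a copy of the sample and extracts the overall score and the per-node scores. The function name `summarize` is an assumption for illustration, and the code assumes the response has exactly the structure shown in the sample.

```python
import json

# Compact copy of the sample response shown above.
SAMPLE = """
{"imdata": [{"fabricHealthTotal": {
    "attributes": {"cur": "95", "max": "100", "min": "0"},
    "children": [
        {"fabricHealthTotal": {"attributes": {"dn": "topology/pod-1/node-101", "cur": "97"}}},
        {"fabricHealthTotal": {"attributes": {"dn": "topology/pod-1/node-102", "cur": "92"}}}
    ]}}]}
"""


def summarize(payload):
    """Return (overall_score, {node_dn: score}) from a sample-shaped payload."""
    total = payload["imdata"][0]["fabricHealthTotal"]
    overall = int(total["attributes"]["cur"])
    nodes = {
        child["fabricHealthTotal"]["attributes"]["dn"]:
            int(child["fabricHealthTotal"]["attributes"]["cur"])
        # Treat a missing 'children' key as "no per-node entries".
        for child in total.get("children", [])
    }
    return overall, nodes


if __name__ == "__main__":
    overall, nodes = summarize(json.loads(SAMPLE))
    print(overall)  # 95
    print(nodes)    # {'topology/pod-1/node-101': 97, 'topology/pod-1/node-102': 92}
```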

Sample Automation Output

[INFO] Successfully established session to APIC.
[INFO] Current ACI Fabric Health Score: 95
PASS: ACI Fabric Health Score is within acceptable range.
[INFO] Node topology/pod-1/node-101 health: 97
[INFO] Node topology/pod-1/node-102 health: 92
PASS: All fabric nodes have health score above threshold.
[INFO] Dashboard snapshot saved to /tmp/aci_dashboard_snapshot.png
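
The console lines above are easy to read but hard to consume from other tools. As one possible way to produce the structured report mentioned in the learning objectives, the sketch below turns the same values into a JSON document. The function `build_report` and its field names are illustrative assumptions, not a pyATS report format.

```python
import json


def build_report(overall, nodes, threshold=90):
    """Build a JSON-serializable summary of one health check run (hypothetical helper)."""
    # Nodes whose score is strictly below the threshold, matching the script's check.
    failing = {dn: score for dn, score in nodes.items() if score < threshold}
    overall_pass = overall >= threshold
    return {
        "threshold": threshold,
        "overall_score": overall,
        "overall_pass": overall_pass,
        "nodes": nodes,
        "unhealthy_nodes": failing,
        "result": "PASS" if overall_pass and not failing else "FAIL",
    }


if __name__ == "__main__":
    report = build_report(
        95,
        {"topology/pod-1/node-101": 97, "topology/pod-1/node-102": 92},
    )
    print(json.dumps(report, indent=2))
```

A report in this form can be written to disk after each run or passed to another system without parsing log text.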

FAQs

Q1. Why is it important to validate the Fabric Health Score in an ACI deployment?
A1. The Fabric Health Score summarizes the operational health of the ACI fabric, including controllers, leaf/spine switches, and endpoints. Checking it against a threshold helps catch a degrading fabric early, before it affects application performance.


Q2. How does pyATS assist in automating the Fabric Health Score check?
A2. The pyATS testcase in this article opens a session to the APIC REST API, queries the fabricHealthTotal class, parses the JSON response, and asserts that the overall score and each node score meet the threshold, so every run ends with a clear pass or fail for each test section.


Q3. Which API endpoint and attributes are used for the validation?
A3.

  • /api/node/class/fabricHealthTotal.json – Returns the fabric health objects as JSON
  • cur – Holds the current health score, which is compared against HEALTH_THRESHOLD
  • dn – Identifies the node a score belongs to (for example, topology/pod-1/node-101)

Q4. Can pyATS provide historical health score snapshots for trending and analysis?
A4. Yes. By scheduling regular pyATS jobs, snapshots of the health score can be stored in structured formats (JSON, HTML, CSV) for historical comparison, helping engineers track changes, identify recurring issues, and perform trend analysis over time.


Q5. Can the same check run against more than one ACI fabric?
A5. Yes, with small changes. The APIC address, credentials, and threshold are plain variables in the script, so moving them into the testbed file or into script parameters lets the same testcase run against each fabric and keeps the health checks consistent.


Q6. How does pyATS report an unhealthy fabric?
A6. A failed assertion marks the test section as failed, and the assertion message names the overall score or the nodes that fell below the threshold. The results are recorded in the pyATS logs and reports, where engineers can review them or other automation workflows can consume them.


Q7. Can pyATS automation be integrated with alerting systems for proactive monitoring?
A7. Yes. Validation results from pyATS can be piped into monitoring platforms (e.g., Prometheus, Splunk) or alerting tools (e.g., PagerDuty, Slack) to trigger notifications when the fabric health score drops below the threshold, ensuring rapid response from network operations teams.


YouTube Link

Watch the complete lab demo and explanation for Validate ACI Fabric Health Score Using pyATS for Cisco [Python for Network Engineer] on our channel:

Master Python Network Automation, Ansible, REST API & Cisco DevNet

Join Our Training

By completing Day #79 of the pyATS Series, you have taken a major step towards mastering network automation by validating ACI Fabric Health Score using structured automation workflows.

But this is just the beginning of building scalable and resilient network automation solutions.

Join Trainer Sagar Dhawan’s 3-Month Instructor-Led Training Program where you’ll learn to build comprehensive automation solutions using Python, Ansible, and APIs.

Full course outline available here:
Python Ansible API Cisco DevNet for Network Engineers – 3-Month Training

Transform into a highly skilled Python for Network Engineer expert, ready to handle real-world automation challenges confidently.

Enroll Now & Future‑Proof Your Career
Email: info@networkjourney.com
WhatsApp / Call: +91 97395 21088