[Day #81 pyATS Series] Building pyATS test suites for 1000+ devices using pyATS for Cisco [Python for Network Engineer]
Introduction on the Key Points
When you manage a network of 1000+ Cisco devices, manual health checks, configuration validation, and performance monitoring become impossible to do efficiently. Logging into each router, switch, or firewall individually wastes countless hours and is prone to human error.
That’s where pyATS (Python Automated Test System) comes in. Designed by Cisco, pyATS lets you build scalable, repeatable test suites that can run against hundreds or even thousands of devices in parallel, all with a single command.
In this guide, we’ll see how to:
- Design pyATS test suites for large-scale networks.
- Efficiently handle 1000+ devices using parallelism and batching.
- Write modular test cases that can be reused across different environments.
- Manage testbed YAML files for bulk device inventory.
- Generate consolidated HTML or JSON reports for management.
If you are on the Python for Network Engineer learning path, this is one of the most career-boosting skills you can add — because every large enterprise or service provider needs automation frameworks for massive device fleets.
Topology Overview
For this lab example, we’ll scale down to a representative network so you can understand the concepts before applying them to a 1000+ device environment.

In a real scenario:
- R1–Rn can be spread across multiple sites and data centers.
- pyATS will connect via SSH (or API) to every device.
Topology & Communications
Components:
- pyATS Host — Your automation VM or server where pyATS and Genie are installed.
- Testbed YAML — The device inventory file that lists all 1000+ devices with their IPs, OS types, and login credentials.
- Network Devices — Cisco routers, switches, and firewalls.
- Communication Method — SSH for CLI access; optionally HTTPS/API for certain checks.
Workflow:
- Testbed YAML defines all devices.
- pyATS test suites execute in parallel to reduce runtime.
- Each test case runs the same health checks across all devices.
- Reports are aggregated into a single HTML or JSON summary.
Workflow Script
Here’s a sample scalable test suite:
    # File: tests/basic_checks.py
    from pyats import aetest
    from genie.testbed import load


    class CommonSetup(aetest.CommonSetup):
        """Runs once before all test cases."""

        @aetest.subsection
        def connect_to_devices(self, testbed):
            # Connect to every device; log_stdout=False keeps session output off the console
            for device in testbed.devices.values():
                device.connect(log_stdout=False)


    class VerifyHostname(aetest.Testcase):
        @aetest.test
        def check_hostname(self, testbed):
            # The configured hostname should match the device name in the testbed
            for device in testbed.devices.values():
                output = device.execute('show running-config | include hostname')
                assert device.name in output, f"{device.name} hostname mismatch"


    class VerifyUptime(aetest.Testcase):
        @aetest.test
        def check_uptime(self, testbed):
            # Print the uptime line so recent reboots stand out in the log
            for device in testbed.devices.values():
                output = device.execute('show version | include uptime')
                print(f"{device.name} uptime: {output}")


    class CommonCleanup(aetest.CommonCleanup):
        """Runs once after all test cases."""

        @aetest.subsection
        def disconnect_devices(self, testbed):
            for device in testbed.devices.values():
                device.disconnect()


    if __name__ == '__main__':
        # Standalone run: load the testbed file and pass it in as a script parameter
        testbed = load('testbed.yml')
        aetest.main(testbed=testbed)
Explanation by Line
- from pyats import aetest → Imports the pyATS AEtest framework for structured test cases.
- CommonSetup → Runs once before all tests; here we connect to every device.
- VerifyHostname → Ensures the device’s configured hostname matches its inventory name.
- VerifyUptime → Captures uptime to detect recent reboots or instability.
- CommonCleanup → Disconnects from all devices after tests are done.
- Parallel Execution → With pyATS job files, tests can run in parallel to save hours.
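The job-file route mentioned in the explanation could look roughly like the sketch below. It is a minimal example, not a tuned setup: the file name job.py and the task id are assumptions, and it launches only the one script from this section. Running several scripts at the same time would need Easypy's task objects, which are not shown here.

```python
# File: job.py (hypothetical name), a minimal Easypy job file sketch
import os

# Assumption: the job is launched from the directory that contains tests/
SCRIPT = os.path.join("tests", "basic_checks.py")


def main(runtime):
    """Entry point that Easypy calls when the job is launched."""
    # Imported here so the file can still be imported where pyATS is not installed.
    from pyats.easypy import run

    # The testbed supplied on the command line is handed to the script by Easypy.
    run(testscript=SCRIPT, runtime=runtime, taskid="basic_checks")
```

A job like this is typically started with `pyats run job job.py --testbed-file testbed.yml`.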
testbed.yml Example
Here’s a partial sample for thousands of devices:
    testbed:
      name: LargeScaleTestbed
      credentials:
        default:
          username: admin
          password: cisco123
    devices:
      R1:
        os: ios
        type: router
        connections:
          cli:
            protocol: ssh
            ip: 10.1.1.1
      R2:
        os: ios
        type: router
        connections:
          cli:
            protocol: ssh
            ip: 10.1.1.2
      # ...
      R1000:
        os: ios
        type: router
        connections:
          cli:
            protocol: ssh
            ip: 10.255.255.255
In reality, you’ll generate this YAML using Python or pull device lists from a CMDB.
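As a rough illustration of generating the file with Python, the sketch below renders the same fields as the sample for R1 through R1000. The consecutive addressing starting at 10.1.1.1 and the helper names `build_inventory` and `render_testbed` are assumptions made for this example; a real inventory would come from your CMDB export.

```python
import ipaddress


def build_inventory(count, first_ip="10.1.1.1"):
    """Return {hostname: ip} for R1..R<count> with consecutive addresses."""
    start = ipaddress.IPv4Address(first_ip)
    return {f"R{i + 1}": str(start + i) for i in range(count)}


def render_testbed(inventory, name="LargeScaleTestbed"):
    """Render testbed YAML text using only the fields from the sample."""
    lines = [
        "testbed:",
        f"  name: {name}",
        "  credentials:",
        "    default:",
        "      username: admin",
        "      password: cisco123",
        "devices:",
    ]
    for hostname, ip in inventory.items():
        lines += [
            f"  {hostname}:",
            "    os: ios",
            "    type: router",
            "    connections:",
            "      cli:",
            "        protocol: ssh",
            f"        ip: {ip}",
        ]
    return "\n".join(lines) + "\n"


inventory = build_inventory(1000)
yaml_text = render_testbed(inventory)
```

Writing `yaml_text` to testbed.yml gives a file in the same shape as the sample. Plain string rendering is used here only to keep the sketch free of third-party packages.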
Post-validation CLI (Real expected output)
All Hostnames Correct:
    R1 hostname: R1
    R2 hostname: R2
    ...
    R1000 hostname: R1000
Some Hostnames Mismatched:
    R45 hostname mismatch
    Expected: R45, Found: Core-Router-45
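To report every mismatch in one pass, the comparison can be separated from the device connection. The sketch below assumes the command output contains a line of the form `hostname <name>`; the helper names are hypothetical. It compares the full name, since a substring check such as `device.name in output` would also accept R10 when R1 is expected.

```python
import re


def parse_hostname(output):
    """Extract the name from 'show running-config | include hostname' output."""
    match = re.search(r"^hostname\s+(\S+)", output, re.MULTILINE)
    return match.group(1) if match else None


def find_mismatches(outputs):
    """Return {expected_name: found_name} for every device that differs."""
    mismatches = {}
    for expected, output in outputs.items():
        found = parse_hostname(output)
        if found != expected:
            mismatches[expected] = found
    return mismatches


# Sample outputs keyed by inventory name (illustrative values only)
sample = {"R1": "hostname R1", "R45": "hostname Core-Router-45"}
for name, found in find_mismatches(sample).items():
    print(f"{name} hostname mismatch\nExpected: {name}, Found: {found}")
```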
Uptime Check Example:
    R1 uptime: 5 weeks, 3 days
    R2 uptime: 1 year, 2 weeks
    R50 uptime: 3 hours (possible reboot)
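Flagging a possible reboot can be automated by turning the uptime text into a duration. This is a sketch under two assumptions: the text uses the units shown above (years, weeks, days, hours, minutes), and anything under one day counts as recent. A year is approximated as 365 days.

```python
import re
from datetime import timedelta

# Duration of one of each unit; a year is approximated as 365 days
UNITS = {
    "year": timedelta(days=365),
    "week": timedelta(weeks=1),
    "day": timedelta(days=1),
    "hour": timedelta(hours=1),
    "minute": timedelta(minutes=1),
}


def parse_uptime(text):
    """Sum every '<number> <unit>' pair found in the uptime text."""
    total = timedelta()
    for value, unit in re.findall(r"(\d+)\s+(year|week|day|hour|minute)s?", text):
        total += int(value) * UNITS[unit]
    return total


def recently_rebooted(text, threshold=timedelta(days=1)):
    """True when the parsed uptime is shorter than the threshold.

    Text with no recognizable units parses to zero and is also flagged.
    """
    return parse_uptime(text) < threshold
```

With the sample lines above, `recently_rebooted("3 hours")` is true while `recently_rebooted("5 weeks, 3 days")` is false.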
FAQs
1. Can pyATS handle large-scale testing for 1000+ devices?
Yes. pyATS is designed for scalable network testing. For large deployments, you can optimize by:
- Splitting devices into parallel test batches
- Using asynchronous connections
- Running on distributed workers or containers
- Leveraging testbed subsets instead of loading all devices at once
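Splitting the inventory into batches needs nothing more than the standard library. The sketch below is one way to do it; the batch size of 50 and the function name `device_batches` are arbitrary choices for illustration.

```python
from itertools import islice


def device_batches(names, size):
    """Yield consecutive lists of at most `size` device names."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(names)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


devices = [f"R{i}" for i in range(1, 1001)]
batches = list(device_batches(devices, 50))
```

Each batch can then be run as its own job, so only that subset of devices is connected at any one time.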
2. How do I structure the test suite for such a large environment?
Best practice is to use a modular test structure:
- Separate tests into categories (connectivity, routing, QoS, compliance, etc.)
- Create multiple test scripts instead of a single massive script
- Organize test cases in job files so you can selectively execute subsets without editing code
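One simple way to select subsets without editing test code is a category-to-script map that a job file consults. The script paths below, other than tests/basic_checks.py, are hypothetical placeholders.

```python
# Hypothetical layout: one or more scripts per test category
SUITES = {
    "connectivity": ["tests/connectivity_checks.py"],
    "routing": ["tests/routing_checks.py"],
    "compliance": ["tests/basic_checks.py", "tests/compliance_checks.py"],
}


def select_scripts(categories):
    """Return the scripts for the requested categories, without duplicates."""
    unknown = [c for c in categories if c not in SUITES]
    if unknown:
        raise KeyError(f"unknown categories: {unknown}")
    selected = []
    for category in categories:
        for script in SUITES[category]:
            if script not in selected:
                selected.append(script)
    return selected
```

The category list could come from a command-line argument or an environment variable, so the same job file serves every subset.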
3. What is the recommended way to store testbed information for 1000+ devices?
Use a centralized YAML testbed file stored in a version control system (Git). For large environments:
- Break it into smaller YAML chunks (per site, per vendor) and merge them dynamically
- Use Jinja2 templates to avoid repetitive config blocks
- Store sensitive credentials in encrypted vaults
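Merging per-site chunks might look like the sketch below. It assumes each chunk has already been parsed from YAML into a dictionary with a `devices` key, and it rejects duplicate device names rather than silently overwriting them. The site data is made up for illustration.

```python
def merge_testbeds(chunks, name="LargeScaleTestbed"):
    """Combine per-site testbed dictionaries into one testbed dictionary."""
    merged = {"testbed": {"name": name}, "devices": {}}
    for chunk in chunks:
        for hostname, device in chunk.get("devices", {}).items():
            if hostname in merged["devices"]:
                raise ValueError(f"duplicate device name: {hostname}")
            merged["devices"][hostname] = device
    return merged


# Illustrative chunks, as they would look after parsing two site files
site_a = {"devices": {"R1": {"os": "ios", "type": "router"}}}
site_b = {"devices": {"R2": {"os": "ios", "type": "router"}}}
merged = merge_testbeds([site_a, site_b])
```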
4. How can I speed up execution for so many devices?
Key optimizations include:
- Multiprocessing or thread pools in pyATS
- Limiting CLI command sets to only what’s needed
- Running tests in parallel across multiple pyATS workers
- Connecting once in the common setup and reusing those sessions, so test cases skip redundant setup
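The thread-pool idea can be sketched with the standard library alone. Here `fake_check` is a stand-in for connecting to a device and running one command, and the worker count is an arbitrary assumption. pyATS ships its own parallel helpers as well; this example deliberately avoids them so it runs anywhere.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_in_parallel(names, check, max_workers=20):
    """Run check(name) for every name; return {name: result or exception}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(check, name): name for name in names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # Keep the exception so one bad device does not stop the run
                results[name] = exc
    return results


def fake_check(name):
    """Stand-in for a real device check (no network access here)."""
    return f"hostname {name}"


results = run_in_parallel(["R1", "R2", "R3"], fake_check, max_workers=4)
```

Because device checks spend most of their time waiting on SSH, threads are usually enough; `max_workers` also caps how many sessions are open at once.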
5. How do I manage resource limits when testing 1000+ devices?
Large-scale testing can overwhelm CPU, memory, and network bandwidth. Use:
- Connection pooling and session re-use
- Rate limiting CLI commands
- Staggered test execution to avoid device overload
- Cloud-based runners (AWS/GCP) for distributed load
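Rate limiting CLI commands can be as simple as enforcing a minimum gap between calls. The class below is a single-threaded sketch; the name `RateLimiter` and the injectable `clock` and `sleep` arguments are choices made for this example so the behavior can be checked without real delays.

```python
import time


class RateLimiter:
    """Allow at most one call every `min_interval` seconds."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until the interval since the previous call has elapsed."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `limiter.wait()` before each command on a device spaces the commands out. Sharing one limiter across threads would need a lock, which this sketch omits.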
6. Can I run compliance and health checks together in a single suite?
Yes, but it’s better to decouple them into separate scripts and execute in parallel pipelines. This reduces runtime and makes debugging easier if one category of tests fails.
7. How can I monitor progress and results for a large test run?
Use pyATS Easypy reports and HTML logs for centralized visibility. For 1000+ devices, also consider:
- Sending progress updates to Slack/Teams
- Logging summaries to a database for historical tracking
- Using Grafana/ELK dashboards for trend analysis
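For chat updates, a database, or a dashboard, per-device results usually need to be reduced to a compact summary first. The sketch below assumes results arrive as a mapping of device name to one of `passed`, `failed`, or `errored`; that shape and the sample values are illustrative and not the native pyATS report format.

```python
import json
from collections import Counter


def summarize(results):
    """Build a summary dict from {device: 'passed' | 'failed' | 'errored'}."""
    counts = Counter(results.values())
    return {
        "total": len(results),
        "passed": counts.get("passed", 0),
        "failed": counts.get("failed", 0),
        "errored": counts.get("errored", 0),
        "problem_devices": sorted(
            device for device, result in results.items() if result != "passed"
        ),
    }


results = {"R1": "passed", "R2": "passed", "R45": "failed", "R50": "errored"}
summary = summarize(results)
summary_json = json.dumps(summary, indent=2)
```

The JSON text can be posted to a webhook or stored per run for historical tracking.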
8. What’s the best way to handle device failures mid-test?
Implement exception handling in test scripts:
- Skip to the next device on failure instead of aborting the entire run
- Log detailed error messages for troubleshooting
- Optionally retry failed devices at the end of the run
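The three points above can be combined in one small loop. In this sketch `check` is any callable that raises on failure; the function name `run_with_retry` and the single retry pass are assumptions for illustration.

```python
def run_with_retry(names, check, retries=1):
    """Run check(name) per device; retry the failures after the first pass.

    Returns (passed, errors), where errors maps each device that never
    succeeded to the message from its last attempt.
    """
    passed, errors = [], {}
    pending = list(names)
    for attempt in range(retries + 1):
        failed = []
        for name in pending:
            try:
                check(name)
            except Exception as exc:
                # Record the error and move on to the next device
                errors[name] = f"attempt {attempt + 1}: {exc}"
                failed.append(name)
            else:
                errors.pop(name, None)
                passed.append(name)
        pending = failed
        if not pending:
            break
    return passed, errors
```

Devices left in `errors` after the final pass are the ones worth investigating by hand.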