[Day #67 PyATS Series] Automating MPLS LSP Path Validation Using pyATS for Cisco [Python for Network Engineer]
Introduction — key points
MPLS LSPs (Label Switched Paths) are the backbone of many carrier and service provider networks, and they’re increasingly used in large enterprise backbones for traffic engineering and service chaining. Ensuring an LSP actually follows the expected sequence of nodes (and that the label forwarding plane matches the control plane) is essential after device upgrades, topology changes, or control-plane tweaks.
In this article we will:
- collect pre-change snapshots (LDP/RSVP neighbor state, forwarding tables, TE tunnel paths),
- define the expected LSP path(s) (from design/Network Intent),
- collect post-change snapshots and compute the actual LSP hop list (control & forwarding plane),
- run device-sourced traceroutes to validate the data-plane LSP,
- compute diffs and generate an audit-ready JSON report, and
- show how to integrate findings into a GUI (Kibana/ELK) or ticketing artifact.
Topology Overview
Use a small but realistic MPLS domain for labs and teaching:

Key notes:
- PE-1 and PE-2 are Provider Edge routers (ingress/egress for LSPs). They may originate LSPs (RSVP-TE or static).
- P-1 is a core transit node (LDP/RSVP neighbor relationships with PEs).
- The automation host (pyATS) can SSH to all devices and execute both control-plane commands and device-sourced traceroutes. For large-scale deployments use a jump host or orchestration cluster.
Sample LSPs we’ll validate:
- LSP `LSP-ETH-CE-A-CE-B` (ingress PE-1 → egress PE-2), expected path: PE-1 -> P-1 -> PE-2.
- For TE LSPs, we’ll inspect the explicit path (R1, R2, R3) and check label entries on each hop.
Topology & Communications — what to collect and why
To validate an MPLS LSP end-to-end you must look at both control-plane and forwarding/data-plane evidence:
Control-plane artifacts (what to collect)
- LDP neighbors: `show mpls ldp neighbor` (LDP-based LSPs)
- LDP bindings/labels / forwarding table: `show mpls forwarding-table` (shows label -> next-hop mapping)
- RSVP-TE / TE tunnels: `show mpls traffic-eng tunnels` (NX-OS/IOS-XE), `show rsvp session` / `show mpls lsp` / `show mpls lsp name <name>` (IOS-XR/IOS-XE variants)
- RIB / CEF / LFIB evidence: `show ip route <prefix>` and `show ip cef <prefix>` (CEF mapping), `show mpls forwarding-table` or `show cef` depending on platform
- BGP / IGP (for next-hop and reachability): `show ip bgp` / `show ip route` / `show ip ospf neighbor` to ensure the IGP and prefixes are stable.
Data-plane artifacts (what to collect)
- Device-sourced traceroute with MPLS labels: `traceroute mpls ipv4 <dest>` (IOS-XR style) or `traceroute <dest> mpls` (platform dependent) — many platforms have an option to show labels at each hop. When that is not available, use `trace mpls`, fall back to a normal traceroute from each hop, or use ping with TTL manipulation.
- Ping tests from ingress to egress, and from intermediate nodes if possible.
Logs & events
- `show logging | include MPLS|LDP|RSVP|TE` — capture logs for session drops or label errors.
- Syslog server correlation if centralized logging exists.
Why both planes?
- Control-plane can show LSP state and intended path (labels / tunnels), but the LFIB (forwarding) must actually contain the corresponding labels for correct forwarding. Also, data-plane traceroute confirms service reachability and whether MPLS label stacking is consistent.
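A minimal sketch of such a cross-check, working on the `{label, prefix, next_hop}` entry shape used by this article's LFIB parser. The sample data is hypothetical, and the check assumes the control-plane hop list starts at the first downstream hop:

```python
def crosscheck_planes(control_hops, lfib_entries, dest_prefix):
    """Return (ok, reason): does the LFIB next-hop for dest_prefix match the
    first hop of the control-plane path?
    control_hops: list of hop IPs from TE/LSP output (first downstream hop first).
    lfib_entries: list of {label, prefix, next_hop} dicts parsed from the LFIB."""
    if not control_hops:
        return False, "no control-plane path found"
    entry = next((e for e in lfib_entries if e["prefix"] == dest_prefix), None)
    if entry is None:
        return False, f"no LFIB entry for {dest_prefix}"
    if entry["next_hop"] != control_hops[0]:
        return False, (f"LFIB next-hop {entry['next_hop']} != "
                       f"control-plane first hop {control_hops[0]}")
    return True, "control plane and LFIB agree"

# Hypothetical sample data matching the lab topology
lfib = [{"label": "16", "prefix": "192.0.2.2/32", "next_hop": "10.0.2.1"}]
ok, why = crosscheck_planes(["10.0.2.1", "10.0.3.1"], lfib, "192.0.2.2/32")
```

Even when both checks pass, the data-plane traceroute in the next sections remains the final word on what packets actually do.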
Workflow Script — full pyATS workflow
Below is a comprehensive pyATS script, `mpls_lsp_validation.py`. It is intended to be run from your automation host, reads `testbed.yml`, and takes an input `expected_paths.yml` (which maps LSP names to expected device hop lists). The script:
- collects pre-check snapshots from each device;
- computes actual LSP hop sequences (control-plane and LFIB evidence);
- performs device-sourced traceroute(s) when possible;
- compares actual vs expected and generates diffs + JSON report;
- saves artifacts under `results/<run_id>/`.
Important: This script is read-only (no configuration changes). Adjust commands per platform & software version as needed. For large scale, convert synchronous loops to concurrent collectors (ThreadPoolExecutor).
```python
#!/usr/bin/env python3
"""
mpls_lsp_validation.py
Validate MPLS LSP paths (control plane + forwarding + data-plane) using pyATS.
Usage:
  python mpls_lsp_validation.py --testbed testbed.yml --expected expected_paths.yml --run-id run001
"""
import argparse, yaml, json, os, re, time, difflib
from pathlib import Path
from datetime import datetime
from genie.testbed import load

RESULTS = Path("results")
RESULTS.mkdir(exist_ok=True)

# Per-platform command map (extend as needed)
CMD_MAP = {
    "ios": {
        "ldp_neighbor": "show mpls ldp neighbor",
        "mpls_fwd": "show mpls forwarding-table",
        "rsvp_tunnels": "show mpls traffic-eng tunnels",
        "mpls_lsp": "show mpls lsp",  # not always present
        "traceroute_mpls": "traceroute {dest} mpls",
        "traceroute": "traceroute {dest}"
    },
    "iosxr": {
        "ldp_neighbor": "show mpls ldp neighbor",
        "mpls_fwd": "show mpls forwarding-table",
        "rsvp_tunnels": "show mpls traffic-eng tunnels",
        "mpls_lsp": "show mpls lsp",
        "traceroute_mpls": "traceroute mpls ipv4 {dest}",
        "traceroute": "traceroute {dest}"
    },
    "nxos": {
        "ldp_neighbor": "show mpls ldp neighbor",
        "mpls_fwd": "show mpls forwarding-table",
        "rsvp_tunnels": "show mpls traffic-eng tunnels",
        "mpls_lsp": "show mpls lsp",
        "traceroute_mpls": "traceroute {dest} mpls",
        "traceroute": "traceroute {dest}"
    },
    # default fallback
    "default": {
        "ldp_neighbor": "show mpls ldp neighbor",
        "mpls_fwd": "show mpls forwarding-table",
        "rsvp_tunnels": "show mpls traffic-eng tunnels",
        "mpls_lsp": "show mpls lsp",
        "traceroute_mpls": "traceroute {dest}",
        "traceroute": "traceroute {dest}"
    }
}

def ts():
    return datetime.utcnow().isoformat() + "Z"

def ensure_dir(p):
    Path(p).mkdir(parents=True, exist_ok=True)

def save_text(device, run_id, label, text):
    path = RESULTS / run_id / device
    ensure_dir(path)
    p = path / f"{label}.txt"
    with open(p, "w") as f:
        f.write(text or "")
    return str(p)

def save_json(device, run_id, label, obj):
    path = RESULTS / run_id / device
    ensure_dir(path)
    p = path / f"{label}.json"
    with open(p, "w") as f:
        json.dump(obj, f, indent=2)
    return str(p)

# --- Parsers (best-effort) ---
# parse ldp neighbor output: capture neighbor IP and state
LDP_NEIGH_RE = re.compile(r'(?P<peer>\d+\.\d+\.\d+\.\d+)\s+.*?(?P<state>oper|down|Init|UP|Down)', re.I)
# parse mpls forwarding table: label -> next-hop
FWD_RE = re.compile(r'^\s*(?P<label>\d+)\s+(?P<prefix>\S+)\s+.*?via\s+(?P<nh>\d+\.\d+\.\d+\.\d+)', re.I | re.M)
# parse rsvp/te tunnels: parse tunnel name and path hops (platform output varies)
# We'll use a heuristic: extract "Path: <A> <B> <C>" or hop lines showing IPs
TE_HOP_RE = re.compile(r'(?P<hop>\d+\.\d+\.\d+\.\d+)')

def parse_ldp_neighbors(raw):
    neighbors = []
    if not raw:
        return neighbors
    for line in raw.splitlines():
        m = LDP_NEIGH_RE.search(line)
        if m:
            neighbors.append({"peer": m.group("peer"), "state": m.group("state")})
    return neighbors

def parse_mpls_fwd(raw):
    """Return list of entries like: {label, prefix, next_hop}"""
    entries = []
    if not raw:
        return entries
    for line in raw.splitlines():
        m = FWD_RE.search(line)
        if m:
            entries.append({"label": m.group("label"),
                            "prefix": m.group("prefix"),
                            "next_hop": m.group("nh")})
    return entries

def parse_te_tunnels(raw):
    """Best-effort: find IPs in the tunnel path output and return as hop list."""
    hops = []
    if not raw:
        return hops
    # search for IPs in the whole output, sequentially
    for m in TE_HOP_RE.finditer(raw):
        ip = m.group("hop")
        if ip not in hops:
            hops.append(ip)
    return hops

# --- collection functions ---
def get_platform_key(device):
    # device.os often contains 'ios', 'iosxr', 'nxos' etc
    os_name = (device.os or "").lower()
    if "iosxr" in os_name or "ios-xr" in os_name:
        return "iosxr"
    if "nx-os" in os_name or "nxos" in os_name:
        return "nxos"
    if "ios" in os_name or "ios-xe" in os_name:
        return "ios"
    return "default"

def collect_device_state(device, run_id, extra_cmds=None):
    """Collect relevant MPLS state from device; returns dict with raw and parsed."""
    name = device.name
    print(f"[{ts()}] Collecting from {name} (os={device.os})")
    out = {"device": name, "collected_at": ts(), "raw": {}, "parsed": {}}
    platform_key = get_platform_key(device)
    cmds = CMD_MAP.get(platform_key, CMD_MAP['default']).copy()
    if extra_cmds:
        cmds.update(extra_cmds)
    try:
        device.connect(log_stdout=False)
        device.execute("terminal length 0")
        for label, cmd in cmds.items():
            if not cmd:
                continue
            formatted = cmd  # We don't know dest here; traceroute not run in bulk
            try:
                ret = device.execute(formatted)
            except Exception as e:
                ret = f"ERROR executing {formatted}: {e}"
            out['raw'][label] = save_text(name, run_id, label, ret)
            # parse key outputs
            if label == "ldp_neighbor":
                out['parsed']['ldp_neighbors'] = parse_ldp_neighbors(ret)
            if label == "mpls_fwd":
                out['parsed']['mpls_fwd'] = parse_mpls_fwd(ret)
            if label == "rsvp_tunnels":
                out['parsed']['te_hops'] = parse_te_tunnels(ret)
            if label == "mpls_lsp":
                # best-effort parse for LSP hop IPs
                out['parsed']['mpls_lsp'] = parse_te_tunnels(ret)
        device.disconnect()
    except Exception as e:
        out['error'] = str(e)
    save_json(name, run_id, "snapshot", out)
    return out

# --- path extraction utilities ---
def build_lsp_path_from_te(parsed_entry):
    """
    If TE hops are present (parsed from show mpls traffic-eng tunnels),
    use them as the canonical path.
    """
    return parsed_entry.get('te_hops') or parsed_entry.get('mpls_lsp') or []

def infer_path_from_forwarding(entries, ingress, egress):
    """
    Build an approximate path from mpls_fwd entries by following next-hop
    addresses starting from ingress toward egress. This is a heuristic and
    depends on label/prefix visibility.
    """
    # Map next_hop -> list of labels/prefixes (reverse mapping)
    nh_map = {}
    for e in entries:
        nh_map.setdefault(e['next_hop'], []).append(e)
    # Heuristic: do BFS following next-hop IPs — in practice you need topology
    # to map IP -> device. Here we return the next_hops chain uniquely discovered.
    path = []
    for e in entries:
        nh = e.get('next_hop')
        if nh and nh not in path:
            path.append(nh)
    # fallback: return path as sequence of next_hops
    return path

# --- device traceroute ---
def traceroute_from_device(device, run_id, dest_ip):
    """
    Run device-sourced traceroute; prefer MPLS traceroute if supported.
    Save output and return path of hops (IPs) if we can parse them.
    """
    name = device.name
    platform_key = get_platform_key(device)
    cmd_tpl = CMD_MAP.get(platform_key, CMD_MAP['default']).get('traceroute_mpls') \
        or CMD_MAP['default']['traceroute']
    cmd = cmd_tpl.format(dest=dest_ip)
    try:
        device.connect(log_stdout=False)
        device.execute("terminal length 0")
        out = device.execute(cmd)
        device.disconnect()
    except Exception as e:
        out = f"ERROR executing {cmd}: {e}"
    # save
    save_text(name, run_id, f"traceroute_{dest_ip}", out)
    # parse traceroute lines for IPs
    ips = []
    for line in out.splitlines():
        m = re.search(r'\b(\d{1,3}(?:\.\d{1,3}){3})\b', line)
        if m:
            ip = m.group(1)
            if ip not in ips:
                ips.append(ip)
    return {"raw": out, "hops": ips}

# --- diff helpers ---
def unified_diff(a_list, b_list, a_label="expected", b_label="actual"):
    a = [f"{i}\n" for i in a_list]
    b = [f"{i}\n" for i in b_list]
    return "".join(difflib.unified_diff(a, b, fromfile=a_label, tofile=b_label))

# --- orchestrator main ---
def main(testbed_file, expected_file, run_id):
    tb = load(testbed_file)
    # load expected LSPs (YAML):
    with open(expected_file) as f:
        expected = yaml.safe_load(f)

    # 1) Pre-collection
    pre = {}
    for name, device in tb.devices.items():
        pre[name] = collect_device_state(device, run_id + "_pre")

    # 2) Compute actual paths from pre (control-plane)
    actual_paths = {}
    for name, snap in pre.items():
        # attempt to extract TE paths or LSP lists per device
        parsed = snap.get('parsed', {})
        te_hops = parsed.get('te_hops') or parsed.get('mpls_lsp')
        if te_hops:
            actual_paths[name] = {"control_hops": te_hops}
        else:
            mpls_fwd = parsed.get('mpls_fwd', [])
            path = infer_path_from_forwarding(mpls_fwd, None, None)
            actual_paths[name] = {"control_hops": path}

    # 3) For each expected LSP, compute end-to-end control hops by querying
    #    the ingress device (or join per device)
    lsp_reports = {}
    for lsp_name, lsp_info in expected.items():
        ingress = lsp_info['ingress']
        egress = lsp_info['egress']
        dest = lsp_info.get('dest_ip')  # optional egress IP to traceroute to
        expected_hops = lsp_info.get('expected_hops', [])
        # Get ingress device snapshot
        ingress_snap = pre.get(ingress)
        control_path = []
        if ingress_snap:
            parsed = ingress_snap.get('parsed', {})
            # prefer TE tunnel hops on ingress snapshot
            control_path = parsed.get('te_hops') or parsed.get('mpls_lsp') \
                or infer_path_from_forwarding(parsed.get('mpls_fwd', []), ingress, egress)

        # 4) run traceroute from ingress toward egress (device-sourced)
        traceroute_result = None
        if dest:
            devobj = tb.devices[ingress]
            traceroute_result = traceroute_from_device(devobj, run_id, dest)

        # 5) compare expected vs actual
        actual_hops = traceroute_result['hops'] if traceroute_result else control_path
        diff = unified_diff(expected_hops, actual_hops, "expected", "actual")
        lsp_reports[lsp_name] = {
            "ingress": ingress,
            "egress": egress,
            "expected_hops": expected_hops,
            "control_hops": control_path,
            "traceroute_hops": traceroute_result['hops'] if traceroute_result else [],
            "traceroute_raw": traceroute_result['raw'] if traceroute_result else "",
            "diff": diff
        }

    # 6) Post-collection (optional)
    post = {}
    for name, device in tb.devices.items():
        post[name] = collect_device_state(device, run_id + "_post")

    # 7) Save final report
    report = {
        "run_id": run_id,
        "timestamp": ts(),
        "expected": expected,
        "lsp_reports": lsp_reports,
        "pre_snapshots": {k: pre[k].get('raw') for k in pre},
        "post_snapshots": {k: post[k].get('raw') for k in post}
    }
    save_json("summary", run_id, "report", report)
    print(f"[+] Report written to {RESULTS / run_id / 'summary' / 'report.json'}")
    return report

if __name__ == "__main__":
    ap = argparse.ArgumentParser()
    ap.add_argument("--testbed", required=True)
    ap.add_argument("--expected", required=True)
    ap.add_argument("--run-id", required=True)
    args = ap.parse_args()
    main(args.testbed, args.expected, args.run_id)
```
Artifacts produced
- Raw command outputs: `results/<run_id>/<device>/<command>.txt`
- Parsed JSON snapshots: `results/<run_id>/<device>/snapshot.json`
- A final `results/<run_id>/summary/report.json` containing expected vs actual hops, diffs, and raw traceroute output.
Explanation by Line — deep annotated walk-through
I’ll walk through the script’s most important parts and why each decision matters.
CMD_MAP
We centralize vendor/platform command mapping. Different NX-OS/IOS-XR/IOS-XE versions can differ in the exact traceroute or TE command; a central map lets you change commands without touching collection logic.
collect_device_state()
- Connects to the device and runs `terminal length 0` to avoid pager artifacts.
- Executes each command in the `CMD_MAP` and saves raw outputs using `save_text()`.
- For key outputs (`ldp_neighbor`, `mpls_fwd`, `rsvp_tunnels`, `mpls_lsp`) we perform lightweight parsing.
- Stores both the raw text and the minimally processed parsed entries, for audit trails and easier automated checks.
Parsers
- Parsers are deliberately conservative and heuristic-based. In a production rollout you’ll prefer Genie parsing (`device.parse('show mpls ldp neighbor')`) because Genie returns structured dictionaries. The regex-based parsing covers lab outputs and many variations.
- `parse_te_tunnels()` is intentionally generic: many TE tunnel outputs list hops as IPs; we extract the IPs in sequence.
traceroute_from_device()
- Device-sourced traceroutes are gold for data-plane validation, since they show how the device actually forwards packets. The script prefers MPLS-capable traceroute (`traceroute mpls`) where the platform supports it; otherwise it falls back to a normal traceroute.
- The function saves the raw traceroute output for forensic use and returns the extracted hop IPs.
LSP extraction & comparison
- The script builds `control_path` primarily from TE/LSP outputs if available (these are authoritative for path reservation). If not, it tries to infer the path from the forwarding table (`infer_path_from_forwarding`) — a heuristic, because forwarding-table entries may be per-prefix and not show an LSP hop sequence directly.
- The data-plane `traceroute_hops` are compared against `expected_hops`. The `diff` is generated with Python’s `difflib`, which produces unified diffs — easy to read in reports and tickets.
Report objectives
- Create a single JSON `report.json` that contains the expected LSP definitions, control-plane vs data-plane findings, and raw outputs. This artifact is suitable for attaching to change requests or indexing into an observability system.
Hardening & scaling notes (teach these)
- Replace regex parsers with `device.parse()` (Genie) for robust parsing across OS versions.
- Use concurrency for device collection (`ThreadPoolExecutor`) in real networks.
- Add timeouts and retries for devices that are slow or under load.
- Add health gating: fail the validation early if core LDP neighbors are not operational or if BGP/IGP is unstable.
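The concurrency point above can be sketched as a thin wrapper around the per-device collector. The collector is passed in as a parameter so the wrapper stays testable; in the real script you would pass `collect_device_state`:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def collect_all(devices, run_id, collector, max_workers=8):
    """Run collector(device, run_id) across devices in parallel.
    devices: mapping name -> device object. Returns name -> snapshot dict;
    one slow or failing device no longer blocks (or aborts) the whole run."""
    snapshots = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(collector, dev, run_id): name
                   for name, dev in devices.items()}
        for fut in as_completed(futures):
            name = futures[fut]
            try:
                snapshots[name] = fut.result()
            except Exception as exc:  # record the failure, keep collecting
                snapshots[name] = {"device": name, "error": str(exc)}
    return snapshots
```

Tune `max_workers` to what your jump host and device CPUs tolerate — SSH session setup is the dominant cost.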
`testbed.yml` Example
A simple testbed to get started — adjust IPs, credentials and OS strings for your lab.
```yaml
testbed:
  name: mpls_lab
  credentials:
    default:
      username: netops
      password: NetOps!23

devices:
  PE1:
    os: ios-xe
    type: router
    connections:
      cli:
        protocol: ssh
        ip: 10.0.100.11
  P1:
    os: ios-xr
    type: router
    connections:
      cli:
        protocol: ssh
        ip: 10.0.100.12
  PE2:
    os: nxos
    type: router
    connections:
      cli:
        protocol: ssh
        ip: 10.0.100.13
```
And an `expected_paths.yml` describing the LSPs you want to validate:
```yaml
LSP-CE-A-CE-B:
  ingress: PE1
  egress: PE2
  dest_ip: 192.0.2.2       # egress loopback or target to traceroute
  expected_hops:
    - 10.0.1.1             # PE1 forward next-hop
    - 10.0.2.1             # P1
    - 10.0.3.1             # PE2
```
Use this to teach how expected network intent maps to an automated test.
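Since the intent file is hand-maintained, it pays to validate it before a run so typos fail fast instead of mid-collection. A minimal sketch, operating on the dict that `yaml.safe_load` returns for `expected_paths.yml`:

```python
REQUIRED_KEYS = {"ingress", "egress", "expected_hops"}

def validate_expected_paths(data):
    """Check each LSP entry (as loaded from expected_paths.yml) for the
    required keys; raise ValueError listing every problem found."""
    problems = []
    for lsp, info in (data or {}).items():
        missing = REQUIRED_KEYS - set(info)
        if missing:
            problems.append(f"{lsp}: missing {sorted(missing)}")
        elif not isinstance(info["expected_hops"], list):
            problems.append(f"{lsp}: expected_hops must be a list")
    if problems:
        raise ValueError("; ".join(problems))
    return data
```

Call it right after `yaml.safe_load()` in the orchestrator so a malformed intent file aborts before any device is touched.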
Post-validation CLI
Below are fixed-width terminal examples you can paste into your slides or blog. They represent typical outputs you’ll parse.
A. `show mpls ldp neighbor`
```
PE1# show mpls ldp neighbor
    Neighbor LDP Ident: 10.0.2.1:0; Peer LDP Ident: 10.0.1.1:0
    Downstream LDP discovery sources:
        TCP: 10.0.2.1:646; connected
    Downstream Session: Up
    Holdtime: 15
```
B. `show mpls forwarding-table`
```
PE1# show mpls forwarding-table
Local Label  Outgoing Label  Prefix          Next Hop
16           18              192.0.2.2/32    10.0.2.1
17           19              10.0.1.0/24     10.0.3.1
```
C. `show mpls traffic-eng tunnels` (TE)
```
PE1# show mpls traffic-eng tunnels
Tunnel Name: TUN-100
  Source: 10.0.1.1   Destination: 192.0.2.2
  Path: 10.0.1.1 -> 10.0.2.1 -> 10.0.3.1
  State: UP, bandwidth: 100Mbps
```
D. Device-sourced MPLS traceroute (IOS-XR syntax)
```
PE1# traceroute mpls ipv4 192.0.2.2
Type escape sequence to abort.
Tracing MPLS route to 192.0.2.2

  1 10.0.1.1 1 ms
  2 10.0.2.1 4 ms [labels: 160 200]
  3 10.0.3.1 8 ms [labels: 200]
  4 192.0.2.2 10 ms
```
E. Example diff in the report (expected vs actual)
```
--- expected
+++ actual
@@ -1,3 +1,3 @@
 10.0.1.1
-10.0.2.1
+10.0.4.1      <-- unexpected transit on the actual path
 10.0.3.1
```
This indicates the actual path diverged — a clear alarm for the engineering team.
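A ticketing system or CI job usually wants a simple verdict rather than a raw diff. Since the script's `unified_diff()` returns an empty string when expected and actual match, a sketch of that reduction over the report's `lsp_reports` shape:

```python
def lsp_verdicts(lsp_reports):
    """Map each LSP name to 'PASS' or 'FAIL' based on its stored diff;
    an empty diff means expected and actual hop lists were identical."""
    return {name: ("PASS" if not rep.get("diff") else "FAIL")
            for name, rep in lsp_reports.items()}
```

A CI job can then exit non-zero when any verdict is FAIL, turning the diff above into a hard gate.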
FAQs
1. How do I define the canonical expected path in large networks?
Answer: Build expected paths from design documents (LSP explicit routes or TE policies) and store them as YAML or in a small database. For dynamic LSPs (LDP), expect the IGP shortest path; derive expected hops from the IGP topology (e.g., run an offline Dijkstra using the IGP adjacency graph). For TE LSPs, use the explicit-hop list from your traffic-engineering plan.
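The offline-Dijkstra idea above can be sketched in a few lines over an adjacency dict (`node -> {neighbor: metric}`); the topology and metrics below are hypothetical:

```python
import heapq

def igp_shortest_path(graph, src, dst):
    """Dijkstra over an IGP adjacency graph; return the lowest-metric node
    sequence from src to dst, or None if dst is unreachable."""
    dist, prev = {src: 0}, {}
    heap, visited = [(0, src)], set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == dst:
            break
        for nbr, metric in graph.get(node, {}).items():
            nd = d + metric
            if nd < dist.get(nbr, float("inf")):
                dist[nbr], prev[nbr] = nd, node
                heapq.heappush(heap, (nd, nbr))
    if dst not in dist:
        return None
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return path[::-1]

igp = {  # hypothetical IGP metrics for the lab topology
    "PE1": {"P1": 10},
    "P1": {"PE1": 10, "PE2": 10},
    "PE2": {"P1": 10},
}
```

Feed the resulting node sequence (mapped to hop IPs) into `expected_hops` for LDP LSPs, and regenerate it whenever IGP metrics change by design.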
2. What if `show mpls traffic-eng tunnels` is not available on my platform?
Answer: Use `show mpls lsp`, `show rsvp neighbors`, or `show rsvp session` depending on the OS. If TE output is unavailable, infer the path from forwarding tables (LFIB) by following label next-hops — but this is heuristic and less exact.
3. Why run device-sourced traceroute rather than only using control-plane info?
Answer: Control-plane can indicate intended path (RSVP/TE reservations) but the forwarding plane (LFIB) might be stale or mis-programmed. Device-sourced traceroute reveals actual forwarding hops and label stacks seen by the device, giving you the real picture of packet forwarding.
4. How do I handle ECMP or dynamic load-sharing across multiple paths?
Answer: ECMP complicates path validation because multiple next-hops are valid. For ECMP you should:
- Validate that each expected next-hop is present in the LFIB for the traffic class,
- Sample multiple traceroutes using varying source ports or packet sizes to exercise different ECMP buckets,
- Accept any path that matches the ECMP candidate set as PASS.
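The last point — accept any member of the ECMP candidate set — can be sketched as a simple set-membership check over hop lists (the candidate paths below are hypothetical):

```python
def ecmp_match(observed_hops, candidate_paths):
    """Return True if the observed traceroute hop list exactly matches
    any one of the acceptable ECMP candidate hop lists."""
    return any(observed_hops == path for path in candidate_paths)

candidates = [  # hypothetical ECMP paths from PE1 toward PE2
    ["10.0.2.1", "10.0.3.1"],
    ["10.0.4.1", "10.0.3.1"],
]
```

Store the candidate set in `expected_paths.yml` (e.g. a list of hop lists instead of a single one) so the validator can call this instead of a single-path comparison.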
5. Traceroute shows IPs, but I need node names in the report — how?
Answer: Maintain an ID map of IP → hostname (from your inventory/testbed, or via `show ip interface` and neighbor mappings). After extracting IP hops, map them to device names for human-friendly reports.
6. LSP path shows an unexpected intermediate device — is this always bad?
Answer: Not necessarily. Paths can change due to IGP weight changes, interface failures, or explicit policy. Investigate whether the change was intended (maintenance, traffic engineering) or a regression (misconfigured IGP metric, failed link). The report’s timestamped raw CLI artifacts help root-cause.
7. How often should I run LSP validation?
Answer: For change windows: run pre/post automation for each planned change. For continuous assurance: run periodic checks (daily for stable networks, or hourly for highly dynamic networks) and run immediate checks after any topology or control-plane change.
8. Can I integrate this script into CI/CD or a telemetry pipeline?
Answer: Yes. In CI/CD, run these checks as post-deploy smoke tests on a staging or canary path. For telemetry, ingest `report.json` into Elasticsearch and create Kibana alerts for diffs or unexpected hop insertions.
YouTube Link
Watch the Complete Python for Network Engineer: Automating MPLS LSP path validation Using pyATS for Cisco [Python for Network Engineer] Lab Demo & Explanation on our channel:
Join Our Training
If you want guided, instructor-led help turning these masterclass patterns into production automation — including deep dives on pyATS, Genie parsing, traceroute/LFIB nuances, CI/CD integration, and dashboards — Trainer Sagar Dhawan is running a 3-month instructor-led program on Python, Ansible, API & Cisco DevNet for Network Engineers. The course walks you from scripts to deployed pipelines, with real labs and code reviews so you graduate as a confident Python for Network Engineer.
Learn more & enroll:
https://course.networkjourney.com/python-ansible-api-cisco-devnet-for-network-engineers/
Join the program to master MPLS validation, build robust automation, and get hands-on mentoring from a working trainer — and continue your journey as a Python for Network Engineer.
Enroll Now & Future‑Proof Your Career
Email: info@networkjourney.com
WhatsApp / Call: +91 97395 21088